Turing Machine Approach#
The Turing machine
(Unlike the SKI combinator calculus and the lambda calculus, the Turing machine does not have a single canonical formulation. Turing’s original 1936 example was not as atomic and later contributors discovered simplifications.)
A Turing machine is a theoretical model of computation that consists of an infinite tape divided into discrete cells, a tape head that can read and write symbols on the tape, and a finite set of states that govern the machine’s behavior. At each step, the machine reads the symbol under the tape head, consults its transition function to determine what action to take (write a symbol, move the tape head left or right, and change state), and then performs that action.
Per Hopcroft1979, p. 148, a universal Turing machine has the following components:
A finite, non-empty set of tape alphabet symbols, including a blank symbol.
A finite, non-empty set of states, including a start state and one or more halting states.
A transition function that, given the current state and the symbol being read, specifies the next state, the symbol to write, and the direction to move (left or right).
(Since the tape is infinite, there’s a sense in which no Turing machine has ever been built in the real world. We elide this subtlety.)
As Interpreter#
One way to approach Nock is that a Nock expression is like a tape populated with values which signal to the Turing machine head what action it should take next and how it should update the overall system state. This results in the most primitive Nock interpreter: the tree-walking interpreter.
We have to make some allowances for how to represent the noun laid out on the tape, which can be done by mapping tree slots to serial indices on the tape. Our first attempt would simply define the set of Turing machine symbols as the natural numbers, and the unwrapping of a noun would look like this:
[[40 50] [60 70]]
1 2 3 4 5 6 7 8 9 10 11 * * *
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐
│ │ │ │ 40│ 50│ 60│ 70│ │ │ │ │ * * *
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┘
However, a naïve layout at indices is liable to fail since many indices contain cells; at a minimum you would want a placeholder system for cells to indicate that they are occupied. There is also an “empty” value for unoccupied cells.
[[40 50] [60 70]]
1 2 3 4 5 6 7 8 9 10 11 * * *
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐
│ C │ C │ C │ 40│ 50│ 60│ 70│ _ │ _ │ _ │ _ │ * * *
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┘
Storing an entire noun in each cell wastes much of the ease of a Turing machine layout. Thus we will adopt a modified noun system which can structurally distinguish values; every entry is a pair. If it is a natural number (atom), we write [0 value]. If it is a cell indicator, it is a pair of numbers with zero and the current index for bookkeeping convenience. If it is an empty cell, it is a [0 0] crash.
[[40 50] [60 70]]
┌───────┬───────┬───────┬────┬────┬────┬────┬───────┬───────┐
│ [0 1] │ [0 2] │ [0 3] │ 40 │ 50 │ 60 │ 70 │ [0 0] │ [0 0] │
└───────┴───────┴───────┴────┴────┴────┴────┴───────┴───────┘
The Turing machine head consumes a Nock expression by treating each pair on the tape as a token. The head starts at slot 1—the root of the noun—and navigates toward the formula. In the slot-addressing scheme, slot 2 is the subject (the head of slot 1), slot 3 is the formula (the tail of slot 1), slot 6 is the opcode (the head of slot 3), and slot 7 is the formula’s argument (the tail of slot 3).
As a concrete example, consider evaluating [42 [1 99]]: subject 42, formula [1 99], opcode 1 (constant), result 99.
[42 [1 99]]
slot: 1 2 3 4 5 6 7
┌────────┬────────┬────────┬────────┬────────┬────────┬────────┐
│ [0 1] │ [0 42] │ [0 3] │ [0 0] │ [0 0] │ [0 1] │ [0 99] │
└────────┴────────┴────────┴────────┴────────┴────────┴────────┘
(Slots 4 and 5 are empty because the subject at slot 2 is an atom, not a cell.)
The head steps through the following transitions:
EVAL at slot 1: reads
[0 1]→ cell, navigate to formula root at slot 3READ_FORMULA at slot 3: reads
[0 3]→ cell, descend to formula head at slot 6READ_OPCODE at slot 6: reads
[0 1]→ opcode1, enter stateOPCODE_1OPCODE_1 at slot 7: reads
[0 99]→ result is99RETURN: write
99to the output region; halt
A Turing machine interpreter for Nock would implement the Nock evaluation rules as state transitions. At a high level, the state set includes one entry state per Nock opcode, plus auxiliary states for descending into sub-expressions and returning results.
State |
Symbol Read |
Action |
Next State |
|---|---|---|---|
|
cell |
navigate to formula at slot |
|
|
|
slot addressing |
|
|
|
constant return |
|
|
|
recursive apply |
|
|
|
cell test |
|
|
|
increment |
|
|
|
equality |
|
|
|
conditional |
|
|
|
compose |
|
|
|
push |
|
|
|
call (arm) |
|
|
|
edit |
|
|
|
hint |
|
|
|
return atom |
|
|
— |
eval argument sub-expression |
|
|
|
write |
|
|
any |
halt with error |
— |
The critical structural challenge is recursion. A Turing machine’s transition function is flat—there is no native notion of a call stack—yet opcodes 2, 4, 6, 7, 8, 9, and 11 all require evaluating sub-expressions before continuing. The machine simulates a call stack by encoding pending continuations in a designated region of the tape: before descending into a sub-expression, the current position and expected continuation state are written as a frame; when the sub-expression returns, that frame is popped and computation resumes.
This is precisely what the tree-walking interpreter does. It is the most direct Turing machine analog for Nock evaluation: the interpreter walks the noun tree, descending recursively into sub-expressions, maintaining a stack of pending operations, and propagating results back toward the root. Actual Nock implementations begin here before applying optimizations such as jet registration (replacing known subtrees with native code), memoization (caching results for referentially transparent sub-expressions), and tail-call elimination (discarding stack frames for tail-recursive loops).
Generally Recursive Functions#
The minimum requirements for a system to be Turing complete are minimal (Boolos et al., 2002, “6.2 Minimization”, pp. 70–71). Because computational systems are equivalent, it can be helpful to identify programming language features using another approach, such as a combinator calculus or lambda calculus. In this case, we can also gesture to the notion of generally recursive functions as explicated by Gödel (1934) and Kleene (1936). A system need only provide the following operations:
A constant function (including zero). Opcode 1 does this.
An increment (or successor) function. Opcode 4 does this.
A way to access parameters (or variables). Opcodes 0 and 9 along with opcodes 7 and 8 do this.
A way to compose functions (or concatenate programs). Opcodes 7 and 8 do this.
A way to perform recursion (or looping). Opcode 0 with, e.g., opcodes 3, 5, and 6 does this.
This frame relates to Turing completeness, but without the mechanical/procedural emphasis.