U: Definition
1 Purpose
This document defines the U function and its data
model.
2 License
U is in the public domain.
3 Status
This text is a DRAFT (version 0.15).
4 Data
A value in U is called a "term." There are three
kinds of term: "number," "pair," and "foo."
A number is any natural number (ie, nonnegative
integer).
A pair is an ordered pair of any two terms.
There is only one foo.
5 Syntax
U is a computational model, not a programming
language.
But a trivial ASCII syntax for terms is useful.
5.1 Trivial syntax: briefly
Numbers are in decimal. Pairs are in parentheses
that nest to the right. Foo is "~".
Whitespace is space or newline. Line comments
use "#".
5.2 Trivial syntax: exactly
term : number
| 40 ?white pair ?white 41
| foo
number : 48
| [49-57] *[48-57]
pair : term white term
| term white pair
foo : 126
white : *(32 | 10 | (35 *[32-126] 10))
6 Semantics
U is a pure function from term to term.
This document completely defines U. There is no
compatible way to extend or revise U.
6.1 Rules
[name] [pattern] [definition]
(a) ($a 0 $b) $b
(b) ($a 1 $b $c) 1
(c) ($a 1 $b) 0
(d) ($a 2 0 $b $c) $b
(e) ($a 2 %n $b $c) $c
(f) ($a 3 $b $c) =($b $c)
(g) ($a 4 %n) +%n
(h) ($a 5 (~ ~ $b) $c) $b
(i) ($a 5 (~ $b $c) $d) *($a $b $c $d)
(j) ($a 5 (~ ~) $b) ~
(k) ($a 5 (~ $b) $c) *($a $b $c)
(l) ($a 5 ($b $c) $d)
(*($a $b $d) *($a $c $d))
(m) ($a 5 $b $c) $b
(n) ($a 6 $b $c) *($a *($a 5 $b $c))
(o) ($a 7 $b) *($a 5 $a $a $b)
(p) ($a 8 $b $c $d) >($b $c $d)
(q) ($a $b $c) *($a 5 *($a 7 $b) $c)
(r) ($a $b) *($a $b)
(s) $a *$a
The rule notation is a pseudocode, only used in
this file. Its definition follows.
6.2 Rule pseudocode: briefly
Each line is a pattern match. "%" means
"number." Match in order. See operators below.
6.3 Rule pseudocode: exactly
Both pattern and definition use the same
evaluation language, an extension of the trivial
syntax.
An evaluation is a tree in which each node is a
term, a term-valued variable, or a unary
operation.
Variables are symbols marked with a constraint.
A variable "$name" matches any term. "%name"
matches any number.
There are four unary prefix operators, each of
which is a pure function from term to term: "=",
"+", "*", and ">". Their semantics follow.
6.4 Evaluation semantics
For any term $term, to compute U($term):
- find the first pattern, in order, that
matches $term.
- substitute its variable matches into its
definition.
- compute the substituted definition.
Iff this sequence of steps terminates, U($term)
"completes." Otherwise it "chokes."
Evaluation is strict: incorrect completion is a
bug. Choking is U's only error or exception
mechanism.
6.5 Simple operators: equal, increment, evaluate
=($a $b) is 0 if $a and $b are equal; 1 if they
are not.
+%n is %n plus 1.
*$a is U($a).
6.6 The follow operator
>($a $b $c) is always 0. But it does not always
complete.
We say "$c follows $b in $a" iff, for every $term:
if *($a 5 $b $term) chokes:
*($a 5 $c $term) chokes.
if *($a 5 $b $term) completes:
either:
*($a 5 $c $term) completes, and
*($a 5 $c $term) equals
*($a 5 $b $term)
or:
*($a 5 $c $term) chokes.
If $c follows $b in $a, >($a $b $c) is 0.
If this statement cannot be shown (ie, if there
exists any $term that falsifies it, generates an
infinitely recursive series of follow tests, or is
inversely self-dependent, ie, exhibits Russell's
paradox), >($a $b $c) chokes.
7 Implementation issues
This section is not normative.
7.1 The follow operator
Of course, no algorithm can completely implement
the follow operator. So no program can completely
implement U.
But this does not stop us from stating the
correctness of a partial implementation - for
example, one that assumes a hardcoded set of
follow cases, and fails when it would otherwise
have to compute a follow case outside this set.
U calls this a "trust failure." One way to
standardize trust failures would be to standardize
a fixed set of follow cases as part of the
definition of U. However, this is equivalent to standardizing a fixed trusted code base. The
problems with this approach are well-known.
A better design for U implementations is to
depend on a voluntary, unstandardized failure
mechanism. Because all computers have bounded
memory, and it is impractical to standardize a
fixed memory size and allocation strategy, every
real computing environment has such a mechanism.
For example, packet loss in an unreliable packet
protocol, such as UDP, is a voluntary failure
mechanism.
If the packet transfer function of a stateful UDP
server is defined in terms of U, failure to
compute means dropping a packet. If the server
has no other I/O, its semantics are completely
defined by its initial state and packet function.
7.2 Other unstandardized implementation details
A practical implementation of U will detect and
log common cases of choking. It will also need a
timeout or some other unspecified mechanism to
abort undetected infinite loops.
(Although trust failure, allocation failure or
timeout, and choke detection all depend on what
is presumably a single voluntary failure
mechanism, they are orthogonal and should not be
confused.)
Also, because U is so abstract, differences in
implementation strategy can result in performance
disparities which are almost arbitrarily extreme.
The difficulty of standardizing performance is
well-known.
No magic bullet can stop these unstandardized
issues from becoming practical causes of lock-in
and incompatibility. Systems which depend on U
must manage them at every layer.