Stack Programming and Combinators
There's an interesting relationship between the basic operators of stack-based programming languages and combinatory calculus. I'm not sure if it's in any way useful, although perhaps it allows one to think about the operational semantics of stack-based programming from a new angle. But in any case, it's just a nice connection that I only made thanks to Jon Purdy's talk "Concatenative Programming: From Ivory to Metal"[1].
Part 1. What is Stack-based programming?
Stack-based programming languages are a bit of an esoterical category of programming languages, of which the only ones that is widely known are Forth and PostScript (plus JVM bytecode, I suppose). There are many others like Factor, Cat, Kitten[2], Uxntal[3], and Uiua[4] that each have their own unique spin on things, but one thing they all have in common is that the conceptual model by which computation is achieved is through the manipulation of a stack of data. This is separate from the common notion of the The Stack in imperative programming where the calling of subroutines form a stack of return memory addresses. The stack in stack-based programming is where local variables and arguments to the called functions are stored. As such, there are some basic operations that are required to be able to manipulate this stack; many are the standard operations that would be used to manipulate any stack data structure and some are specific to array-programming.
- Pop :: Does exactly what you'd expect.
- Duplicate :: Often abbreviated to DUP, this duplicates the value at the top of the stack.
- Swap :: Also known as flip, this moves the second value to the top of the stack.
- Rotate :: Often abbreviated to ROT, this moves the third value to the top of the stack.
There are a bunch more, plus some that vary from language to language, but these are the basics.
Here's some example code in a made-up stack language. The code is read from left to right, with each word performing an operation. Numerical literals push their value onto the stack, and the words are either one of the operations defined above or a piece of logic assumed to be defined elsewhere that performs the named operation; popping value(s) off the stack and pushing a result back on.
Computing the average of a list of numbers
DUP LENGTH_OF_LIST SWAP SUM_LIST DIVIDE
Which when executed evolves the stack like so. Starting with [1,2,3] on the stack,
0. : [ [1,2,3] ] 1. DUP : [ [1,2,3] , [1,2,3] ] 2. LENGTH_OF_LIST : [ 3 , [1,2,3] ] 3. SWAP : [ [1,2,3] , 3 ] 4. SUM_LIST : [ 6 , 2 ] 5. DIVIDE : [ 3 ]
And what's left on the stack is the average, 3.
And converting Fahrenheit to Celsius would be
5 9 DIVIDE MULTIPLY 32 ADD
If we start with 10 on the stack, that evaluates like so.
0. : [ 10 ] 1. 5 : [ 5 , 10 ] 2. 9 : [ 9 , 5 , 10 ] 3. DIVIDE : [ 1.8 , 10 ] 4. MULTIPLY : [ 18 ] 5. 32 : [ 32 , 18 ] 6. ADD : [ 50 ]
Therefore, 10C = 50F.
Part 2. What is combinatory calculus?
This is another one of those models of universal computation, like Turing machines and Lambda calculus. Invented a couple of decades later and little less well known, it is based on the idea of modelling computation as the evaluation of combinators, where a combinator can be thought of as a pure mathematical function that only references axioms passed in as arguments. So identity is a combinator, but a function that adds two to a value is not because add is defined elsewhere. The most common set of combinators that is used when discussing combinatory calculus as a universal computation logic system is S, K, and I, which are defined as follows:
- Sxyz = xz(yz) :: This is the complicated one. `x` is a binary function that takes as argument `z` and the result of calling `y` -- a unary function -- also with `z`.
- Kxy = x :: In other words, it is the constant function, always returning `x` no matter what `y` is.
- Ix = x :: This is the identity function. Not strictly necessary as it can be derived from S and K, but often included nonetheless.
More combinators
Whilst today the canonical definition of combinator calculus is S, K, and I the original definition used the combinators B, C, K, and W. They are defined as follows:
- Bxyz = x(yz) :: Which is to say that it composes the combinators `x` and `y`, passing the argument `z` first to `y` and then to `x`.
- Cxyz = xzy :: This one flips the order of the arguments `y` and `z`, passing them to the two-argument combinator `x` in the new order.
- Kxy = x :: This is the same as the `K` in SKI, where we simply drop the second argument.
- Wxy = xyy :: Here, `x` takes two arguments, and is passed `y` for both of them.
It can be shown that these two sets are as expressive as each other, and in turn as expressive as lambda calculus and Turning machines.
Part 3. A detour into Continuations
The third piece to understand is continuations, here's a hand-wavy explanation. Lets take some program that is in the middle of execution (any program written in any paradigm, doesn't matter). We can define that point in terms of the state of the system and the rest of the program to be executed. Given that the rest of the program is just some code we can wrap it in a function, and that function we call a continuation. They form the mathematical underpinnings of a wide array of programming language functionality from exception handling to coroutines and defer statements.
If we look at stack-based programming languages specifically, this model is an even closer match. Stack-based languages are a subgrouping of concatenative languages where the entire program is nothing but the composition (or concatenation) of functions. As such, some point in the middle of executing such a program can be modelled precisely as the state of the stack and the function that is the rest of the program. For example, take this mathematical expression in a made-up stack language which is putting three numbers onto the stack, popping off the 2 and the 3 and putting back 5, and then popping off 4 and 5, and putting back 20.
4 3 2 + *
At the point where we've put 4 and 3 on the stack, the state of the stack is exactly those two values and the continuation of the rest of the program is
2 + *
which is just a function that pops two values from the stack and push the result of the calculation.
4. So what does all of this have to do with each other?
Lets take the K combinator (reminder: `Kxy = x`), if we let `x` be the continuation of the rest of a program and `y` the value at the top of the stack then `Kxy` simply evaluates to the rest of the program. In other words we have popped the value off the top of the stack and discarded it.
Now let's take the C combinator (reminder: `Cxyz = xzy`), if again we let `x` be the continuation of the rest of the program, and again `y` being the value at the top of the stack, but now with `z` being the value below it, `Cxyz` evaluates to the rest of the program being executed with `z` at the top of the stack and `y` below it. In other words, the C combinator swaps the two values at the top of the stack.
And the W combinator (`Wxy = xyy`)? Why that's just duplicating the value `y` at the top of the stack and executing the continuation `x`.
Finally, the B combinator (`Bxyz = y(yz)`). It's composition and it doesn't have a keyword in these stack-based languages because composition is implied by the simple act of placing one function after another in the program. The pure concatenation of the code. It is just the white space between the keywords.
That's nice, right? As I say, I have no idea if it means anything, but it's just a really nice symmetry.
~~~
[1] :: Concatenative Programming: From Ivory to Metal[2] :: Kitten
[3] :: Tal is the programming language for the Uxn virtual machine.
[4] :: Uiua