Deriving the Y combinator

Satyadev Nandakumar
Sun Sep 13 19:21:14 2015

The Y combinator is a fixpoint combinator in the untyped λ calculus. That means the following: if M is any λ expression, then M(YM) = YM. Thus Y can compute the fixpoint of any λ term (including itself, if you think about it.). Y combinator is one of the most common ways to do recursion in the λ calculus.

The Combinator

The Y combinator is:

 {λ g . g  [(λ x. g (xx)) (λ x. g(xx))]  }

We can of course, verify that for any λ-term M,

YM = {λ g . g[(λ x. g (xx)) (λ x. g(xx))]} M 
   = M[(λ x. M (xx)) (λ x. M(xx))]
   = M[λ x. M ((λ x. M(xx)) (λ x. M(xx)))]
   = M[YM].
as claimed. It begs the question, how did anyone come up with such an expression? We give two stories, each rather unconvincing, but probably a decent solace.

Angle 1: Russell's Paradox

If you read the history of the untyped λ calculus, it is mentioned in passing that the Y combinator evolves out of the Russell's Paradox. The following is an elaboration, to the best of my understanding.

Russell's paradox concerns a self-referential definition involving negation ("impredicative" definition). Consider the set S defined as

S = { x | x ∉ x }
That is, S is the collection of sets X such that X is not an element of itself. What does it mean for a set to be an element of itself? That's not the point - if you say that a set is a collection of objects satisfying a property,
X ∉ X
is a property, and we are justified in considering the nature of S.

Russell's paradox arises from thinking about whether S ∈ S or not. Suppose S ∉ S. Then by definition, S should be included in S. This is a contradiction. Now, suppose S ∈ S. Then since S contains only those X for which X ∉ X, we must have that S ∉ S. Thus either case leads to a contradiction. The statement S ∈ S cannot be determined to be true or false. This is known as Russell's paradox.

Let us encode a λ term representing the set S.

S=(λ x . N(xx))
With a stretch of imagination, we may say that this expresses "x∉x", the N standing for "notin" We can consider this as a definition of S - it stands for the set of all such objects x.

Then we can write Russell's paradox, which asks whether S∈S, as

(λ x. N(xx)) S = (λ x. N(xx))(λ x. N(xx))
               = N ((λ x. N(xx))(λ x. N(xx)))

We are almost done. In the above λ term, N really was not all that special. Abstracting it out, we get

Y=(λ g. g((λ x.N(xx)) (λ x. N(xx)))) 
where Russell's paradox is YN.

This is how the Russell's Paradox leads us to the Y combinator. Logicians surmise that this is how Curry might have discovered the Y combinator.

Angle 2: From self-reproducing programs

The basic self-reproducing λ expression is:
(λ x. xx)(λ x. xx)
We want an expression that not only reproduces itself, but also expands -
N = g(N N)
Well,
  g((λx.xx) (λx.xx)) = g((λx.xx) (λx.xx)),
so that does not quite work. We want something like
g(SS) = g(g(SS)),
What if we just guess that the term is
(λ g. g((λ x. g(xx)) (λ x. g(xx))))