Is it a llama? Is it a lambada? No, it's the lambda! - by Jarle Stabell

By: Jarle Stabell

Abstract: A short introduction to lambda expressions, including some history and digressions

Lambda expressions is a hot topic these days. Lambda expressions play a prominent role in Microsoft's LINQ/DLINQ technology (at the time of writing in preview). Since this technology from MS is language independent, it might also turn up in Delphi in the future, as indicated in a blog posting by Danny Thorpe (until recently the Delphi compiler architect). Borland already has a product supporting lambda expressions; Borland's ECO product, including its predecessor Bold, has supported OCL (Object Constraint Language, a part of UML) for many years. OCL is a functional/declarative language supporting lambda expressions.
(Many "modern" programming language features like lambda expressions and garbage collection are actually quite old ideas, implemented in one of the oldest programming languages around, LISP.)

A lambda expression can be seen simply as a way of defining a function.

In Delphi, you can define a function which adds 1 to every number you pass to it like this:

Function Successor(const inputNumber: Integer): Integer;
Begin
  Result:=inputNumber + 1;
End;

Many are familiar with this mathematical notation for the same function:

Successor(x) = x + 1

If that looks a bit unfamiliar to you, try 'f(x)=x+1' instead. A mathematician wouldn't use such a long name as 'Successor'. When a mathematician is to name a function, (s)he names the function 'f', unless that name already has been taken earlier in the same theorem, lecture or article or whatever is the relevant context. If 'f' is already taken, (s)he considers the next one, 'g' (or adds some more molecules to the 'f', turning it into 'f' with a hat or bar above it, or a mark to one of the sides.) . When running out of available letters, a mathematician prefers to switch to another alphabet or another font, before considering using longer names! Obviously, non-mathematicians suspect that the common usage of single-letter (and typically case sensitive) names in mathematics is an attempt at making maths look more cryptic than it really is (also notice all the Greek letters!). ;-)

History

In their famous work 'Principia Mathematicia', Bertrand Russel and Alfred N. Whitehead used an even more compact notation for defining functions, making our Successor function look like this:

^
x + 1

(The ^ (circumflex) is supposed to be on top of the x, this expression is to be read as "x hat plus one".)

So using this notation, we can define a function without even bothering to give it a name! However, this notation never became popular.

Obviously, mathematicians have always been interested in calculations. But only relatively recently the process of calculations itself came under serious study by mathematicians. The mathematician Alonzo Church wanted to define a foundation for mathematics, and needed a way to define what a computable function (computation/algorithm) is. In the 1930's, he (with input from some of his students) invented the Lambda Calculus for this purpose. (Alan Turing invented the computationally "equivalent", but rather different Turing machine around the same time. It was actually a student of Church, Kleene (the inventor of regular expressions) that coined the term "Turing Machine" for denoting Turing's famous automata construct))

According to [Barendregt1997], Church wanted to use the "hat notation" used in Principia Mathematica to define our Successor function like this:

^
x | x + 1

(The "hat" is supposed to be above the first x)

However, according to [Barendregt1997], the typesetter Church used to publish the first book about the lambda calculus [Church1941] wasn't able to fulfill Church's wish of having the "hat" on top of the x (not very hard to believe!), so instead the typesetter wrote it as:

^x | x + 1

Then (again according to [Barendregt1997]), another typesetter misunderstood this notation, and changed it into this notation:

λ x | x + 1

So according to [Barendregt1997], the reason the Greek letter 'λ' (lambda) was used in Church's book (and used forever after) is due to a combination of a technical limitation and plain old misunderstanding!

This is a nice story, but it is probably a myth. According to the article "Types in Logic and Mathematics before 1940":

"J. P. Seldin informed us that he had asked Church about it in 1982, and that Church had answered that there was no particular reason for choosing λ, that some letter was needed and that λ happened to have been chosen."

Please note that I'm "cheating" a bit, I'm using '|' to separate the parameter from the body of the expression instead of the dot notation that Church originally used (and which is still in common use), because the original (hundred years old) dot-notation looks unnecessary confusing to today's OO programmers)

In C# 3, it seems that the syntax will be this:

(x) => x + 1

which, when there's only a single parameter, can be abbreviated to:

x => x + 1

The essence for us is that a lambda expression is a very compact way of defining a function.

Higher order functions

The term 'lambda expression' implies that we're working with something that is an expression, as opposed to a statement. An expression results in some kind of value when evaluated. Values are the kind of things we store in variables, in fields in classes, send as parameters etc. Notice that in Delphi, events are actually properties, what distinguish them from "ordinary" properties is that events have method references as values.

In Delphi's TList class, there's a powerful method called Sort, which takes a single parameter. This parameter is a function, which tells the Sort method which of any two given items are the "largest" (or whether they are equally "large"). This makes the Sort method very flexible/powerful, it doesn't care about what kind of items are in the list, still it is able to sort them by calling the provided compare function when it needs to compare any two of the items in the list.

So a function taking other functions as parameters can be very powerful. (A function taking functions as parameters is usually called a 'higher order function')

Imagine that you want to call a higher order function with a very simple function as an argument. Let's say you have a list of objects representing programming languages called ProgrammingLanguageList of type TProgrammingLanguageList, and that it contains objects of the type TProgrammingLanguage and assume this latter class has a Name and a Version property. Assume that the ProgrammingLanguages list has a higher order method called Find, which takes one parameter, a boolean function which tells the list whether a given item is an item it should "find". (This Find method is called 'Where' in LINQ and 'Select' in OCL.)

That is, we have something like this:

TProgrammingLanugage = class
  Property Name: String ...;
  Property Version: Integer ...;
End;

TRecognizerFunction = function(const p: TProgrammingLanugage): Boolean;

TProgrammingLanuageList = class(some kind of list class)
  Function Find(const i_recognizerFunction: TRecognizerFunction): TProgrammingLanuageList;
End;

Assume you want to get the TProgrammingLanguage objects in this list having Name='Delphi'.

You can use this to find the Delphi languages in this list:

l_delphiLanguages:=ProgrammingLanguages.Find(MyFunctionWhichRecognizesDelphi);

and define MyFunctionWhichRecognizesDelphi as:

Function MyFunctionWhichRecognizesDelphi(const p: TProgrammingLanguage): Boolean;
Begin
  if p.Name='Delphi' then
    Result:=True
  else
    Result:=False;
End;

or, equivalently, in a more compact notation:


Function MyFunctionWhichRecognizesDelphi(const p: TProgrammingLanguage): Boolean; 
Begin
  Result:= (p.Name='Delphi');
End;

Having to write this MyFunctionWhichRecognizesDelphi is quite boring just to find a given item. If you could define this function "inline" (lambda expression), you wouldn't need the "standalone" function, instead you could write (using the C# syntax for the lambda expression):

l_delphiLanguages:=ProgrammingLanguages.Find(p => p.Name='Delphi');

This is much more elegant.

But there's more!

Imagine that the above expression is within a larger procedure, and that you only want to find a given version of Delphi, and that the version you want to find is stored in a local variable or a parameter.

Procedure .... 
  Var l_myDelphiVersion: Integer; 
  l_specificDelphiVersion: TProgrammingLanguageList; 
Begin 
  l_myDelphiVersion:=...somehow decide which Delphi version to find...; 
  l_specificDelphiVersion:=ProgrammingLanguages.Find(p =>(( p.Name='Delphi') and (p.Version=l_myDelphiVersion))); 
End;

This is nice and easy.

But how would you translate the above into a version not having the lambda expression? Remember that the Find function expects a function of a single parameter, how would it get at l_myDelphiVersion?

One way would be to assign l_myDelphiVersion to a global variable g_myDelphiVersion just before the Find call, and let MyFunctionWhichRecognizesDelphi look at this global variable in addition to its provided p parameter. However, that's obviously ugly, and wouldn't work in a multithreaded situation.

With the lambda-expression, the compiler does that magic for us (in a safe way). This is called 'Closure', taking a "snapshot" of the local variables/environment needed within the lambda-expression and providing them to the anonymous function when it gets called. A Delphi programmer has seen something similar, in that a method pointer needs to remember which object the method belongs to.

So lambda expressions makes it very simple to call higher order functions, making such powerful functions much more common than they are today. Probably the main reason they haven't popped up earlier in C# (and similar languages) is that they aren't very practical/useful without generics in statically typed languages

(In non-OO languages supporting higher order functions and closures, one can simulate "object-based" programming by exploiting closures to store the state of "objects" and pass an "object" around by passing the higher order function (which "remembers" its "object state" via the closure))

An OO programmer is actually very used to higher order functions. A method/routine taking an object as a parameter generalizes higher order functions because an object can have methods. Such an object can easily store the necessary "extra parameters" (the closure) needed. (In fact, this is what the C# 2.0 compiler does behind the scenes for anonymous methods (lambdas), creating an anonymous class having this anonymous method, and having fields to store the "extra parameters".)

But writing such classes are very boring/cumbersome, so boring that Borland and others assume we wouldn't bother calling "higher order" functions much, so they didn't provide us with those "higher order" functions (using objects) in the first place. But with support for lambda expressions, the compiler writes those boring classes for us and with generics, library authors can write typesafe methods like 'Find' in a generic way.

Example with multiple lambda expressions

Assume that 'customers' is a list/collection of customers, here's a C# 3 statement using 3 lambda expressions:

NameOfOsloCustomersSortedByName = customers.Where(c => c.City == "Oslo").OrderBy(c => c.name).Select(c => c.Name)

Notice the similarity to the following SQL query:

select c.Name from customers c where c.City = 'Oslo' order by c.Name

In the 'Object Constraint Language' (OCL), used in Borland's ECO product, this expression would be:

customers->select(c | c.City = 'Oslo')->orderBy(c | c.Name)->collect(c | c.Name)

Imagine how boring it would be to explicitly create 3 standalone functions (or more likely 3 standalone classes) in order to do this without the lambda expressions:

customers.Where(MyOsloRecognizerFunction).OrderBy(MyFunctionWhichReturnsTheNameOfAGivenCustomer).Select(MyCityNameSelectorFunction)

And this is the simple case, when the "small" functions doesn't need any "extra" parameters, if a "manual snapshot/closure" of the relevant variables needs to be taken, there will be object creation and destruction and try-finallys inside that expression making it quite unreadable.

In addition, with the lambda-expressions, the compiler has more headroom for optimization (because what you are trying to do is more transparent/"obvious" to the compiler, the compiler doesn't need to analyze a lot of code in order to figure out the "essence" of what you are doing), including the possibility of sending the whole (or parts of or slightly modified version of) the expression directly to a database engine. For the latter, see Microsoft's DLINQ project.

Digression

I mentioned in the beginning of this article that mathematicians tend to use names like 'f' and 'g' for function names. When the function is so important that it deserves a "global" name this "rule" is broken. Our innocent Successor function is very important to number theorists. It is usually called 'S', and sometimes even the "extraordinary" long 'Succ'! (which it also is called in Pascal). Given the number 0, and the Successor function, we can build all the integers. (As an example, 3 is "implemented/built" by S(S(S(0)))). And given all the integers, we can easily build all the rational numbers. And given all the rational numbers, we can build all the real numbers (somewhat complicated though). So from two "objects", the symbol '0' and the Successor function, we can build/construct all the numbers used in mathematics.

The famous mathematician Gvdel showed the world how to encode statements into integers. For us programmers, encoding characters as integers and text as sequences of integers (bytes) is the most natural thing in the world. When the Delphi compiler generates an executable for us, we get a file. A file is a sequence of bytes (0's and 1's). A sequence of 0's and 1's is an integer. Which means that any program we build is simply one *huge* integer. "Real programmers" write their programs using only 0 and the Successor function! ;-) (The slight drawback is that only the most tiny assembler-coded 'Hello World' program will use less digits (when using unary encoding) than the numbers of atoms in the universe.)

References

"Principia Mathematica", Bertrand Russell and Alfred North Whitehead, 1910-1913.

"The Calculi of Lambda Conversion", Alonzo Church, 1941

"The impact of the lambda calculus in logic and computer science". Henk Barendregt, 1997.
(Can be found via http://citeseer.ist.psu.edu/ )

"Types in Logic and Mathematics before 1940", Kamareddine, Laan and Nederpelt. (Available via citeseer)

Microsoft's LINQ Project: http://msdn.microsoft.com/netframework/future/linq/

Danny Thorpe's Blog: http://blogs.borland.com/dcc

About the Author

Jarle Stabell lives in Oslo, Norway. He wrote his first computer programs on a Sinclair ZX-81. He doesn't know the lambada, but enjoys practicing Irish set and step dancing as well as Argentine tango when not sitting in front of a computer.


Server Response from: ETNASC03