After reading SICP, I got the impression that, at least from a semantic point of view, the only real difference between programming languages is whether or not they support lambda expressions. (Of course, there are many pragmatic differences between languages, too many to list; but the point is that lambdas let you do whatever you can do in non-lambda-supporting languages, but not vice-versa.)
For instance, the book points out that you can implement a basic "methods operating on private data" system, as found in object-oriented languages, in Scheme using lambda constructors to allocate the memory:
> (define make-account (lambda ()
(define balance 0)
(list (lambda (x) (set! balance x))
(lambda () balance))))
> (define a (make-account))
> ((cadr a)) ; get-balance method
0
> ((car a) 5) ; set-balance! method
> ((cadr a)) ; get-balance method
5
> ((car a) 7) ; set-balance! method
> ((cadr a)) ; get-balance method
7
The balance variable is hidden to outside code, there is no way to
modify it except through the provided methods.
Well, due to this idea that the presence or non-presence of lambdas is
what makes all the difference, I wondered for some time whether the
new C++0x lambdas would allow for an entirely new style of programming
in C++. The fact is that I got a little tired of class-oriented
programming and have lately been doing everything in a functional
style. (Especially in my Python code, which supports classes
perfectly well, but also when I code in C... unfortunately the latter
involves a lot of casting to void*, which makes things difficult
since you lose all type safety.)
When I heard about C++0x lambdas, I started to wonder if this might renew my interest in the language. For one thing, there is little doubt that the it compiles to fairly efficient code, and supports types, so it might be interesting to try it as a typed functional programming platform. Of course, without further changes to the language ecosystem like encouraging the use of persistent data structures, it will always suffer the problems induced by mutability, but at least a functional style might encourage statelessness, instead of the current object-oriented style which does the exact opposite. On the other hand, I realized that most of my fatigue with OOP came from dealing with abstraction via complicated class hierarchies, not from mutability, and that is something that function-oriented programming really does solve, especially if combined with something as powerful as the C++ template engine! Regardless, I was wondering if I couldn't pull off some Scheme-like tricks in C++0x.
Quick review of C++0x lambda syntax:
auto f = [=](..args..){..body};
auto f = [&](..args..){..body};
The [=] means to create a closure where referenced variables are
copied by value, and [&] means copy by reference. You can also
specify how specific closed-over variables are handled.
So, I was curious how variables are handled. (To C++'ers, ignore the
printf... as a C guy, I just happen to dislike std::cout.) By the
way, you need g++ version 4.5 to compile this, which is easy to
install on Ubuntu through the gcc-4.5 package.
lambda.cpp:
#include <stdio.h>
int main()
{
int balance=6;
printf("balance: %d\n", balance);
auto v = [&] () {
printf("balance: %d\n", balance);
balance=2;
};
v();
v();
balance=8;
v();
return 0;
}
Run:
$ g++-4.5 -Wall -std=c++0x -o lambda lambda.cpp && ./lambda
balance: 6
balance: 6
balance: 2
balance: 8
So, here we can see that balance is truly a reference. It's
possible for both the lambda and the outside scope to modify the same
variable. If we change [&] to [=], we get an error saying that
balance is read-only. In fact, copied variables become const, but
that's okay because it doesn't really make sense to modify variables
that won't exist after the function runs. If we remove the
balance=2 line, it compiles, and we get:
$ g++-4.5 -std=c++0x -o lambda lambda.cpp && ./lambda
balance: 6
balance: 6
balance: 6
balance: 6
Since balance was copied when the lambda was created, modifying it
later doesn't change its value in the closure.
Okay, is it possible to create an "object", with private state?
lambda2.cpp:
#include <stdio.h>
#include <tuple>
int main()
{
auto account = [] () {
int balance = 0;
printf("Initial balance: %d\n", balance);
auto set = [&] (int n) {
printf("Changing balance from %d to %d.\n", balance, n);
balance = n;
};
auto get = [&] () {
return balance;
};
return std::make_tuple(set,get);
}();
printf("Balance: %d\n", std::get<1>(account)());
printf("Set balance to 5.\n");
std::get<0>(account)(5);
printf("Balance: %d\n", std::get<1>(account)());
return 0;
}
Notice we have to use an std::tuple<> to return a pair of
functions. We don't know their types so it's certainly nice to have
auto. Run it:
$ g++-4.5 -Wall -std=c++0x -o lambda2 lambda2.cpp && ./lambda2
Initial balance: 0
Balance: 0
Set balance to 5.
Changing balance from -1217990668 to 5.
Balance: 5
Holy cow, what happened to the balance inside set? Well, balance
is a variable in the contructor lambda. Once the constructor lambda
is called, there is no longer a reference to it so it is automatically
deallocated, but the inner lambda for set that was returned still
contains a reference to that memory.
Notice that the compiler does not warn us about this!
Is there a solution? Clearly we need to ensure that the memory for
balance is not deallocated. However, changing set to [=]
doesn't help (read-only), so we have to explicitly make balance an
allocated value. Problem being that we don't have a destructor for
when the lambda gets deallocated, so how can we ensure that its memory
also gets deallocated. The answer is to use a smart pointer type,
shared_ptr, which performs reference counting for us. It goes out
of "scope" when the lambda is destructed, and our memory is freed.
Let's provide an explicit deleter function so we can track it and be
sure:
lambda3.cpp:
#include <stdio.h>
#include <tuple>
#include <memory>
void deleter(int* p)
{
printf("deleting %p\n", p);
delete p;
}
int main()
{
auto account = [] () {
std::shared_ptr<int> balance(new int(0), deleter);
printf("Initial balance: %d (%p)\n", *balance, balance.get());
printf("balance.use_count(): %ld\n", balance.use_count());
auto set = [=] (int n) {
printf("Changing balance from %d to %d.\n", *balance, n);
*balance = n;
};
auto get = [=] () {
return *balance;
};
printf("balance.use_count(): %ld\n", balance.use_count());
return std::make_tuple(set,get);
}();
printf("Balance: %d\n", std::get<1>(account)());
printf("Set balance to 5.\n");
std::get<0>(account)(5);
printf("Balance: %d\n", std::get<1>(account)());
return 0;
}
Something to notice here is that I had to use copy semantics ([=])
for the set and get lambdas, because the shared_ptr is a value
type that must be copied for its reference-tracking to work. Taking a
reference to it defeats the purposes. We're printing out the
reference count for balance so we can understand what the
shared_ptr is doing.
$ g++-4.5 -Wall -std=c++0x -o lambda3 lambda3.cpp && ./lambda3
Initial balance: 0 (0x850e008)
balance.use_count(): 1
balance.use_count(): 3
Balance: 0
Set balance to 5.
Changing balance from 0 to 5.
Balance: 5
deleting 0x850e008
Here we see that the behaviour is as expected. Notice that the
reference count is 3 after declaring the lambdas! Each closure has a
reference, as does the constructor lambda before it returns. You can
see that the balance memory is indeed freed at the end of the
program when both closures go out of final scope.
So, we now have an intantiated "object" with its own methods, that
manipulates private data, without using the class keyword at all!
Although there's no advantage, in this short example, to avoiding real
classes, this is interesting because it's such a departure from how
you normally do this kind of thing in C++. It shows that you really
could construct a whole complicated stateful system in C++ without
using classes at all, allocating all dynamic memory through closures.
I'm not going to try to say why you'd want to do that here, I just
think it's cool that it's now possible at all.
Let's compare directly the bare bones of the two implementations, without all the cruft:
C++:
auto account = [] () {
std::shared_ptr<int> balance(new int(0));
return std::make_tuple([=] (int n) { *balance = n; },
[=] () { return *balance; });
}();
Scheme:
(define make-account (lambda ()
(define balance 0)
(list (lambda (x) (set! balance x))
(lambda () balance))))
They are about the same length. The complexity of the C++ version
makes it slightly harder to read at first glance, but C++'ers should
be comfortable enough with it. Mainly the explicit handling of memory
and the special need to use tuple for returning multiple values is
what makes it slightly more complicated to look at; the lambdas
themselves are pretty easy to read.
One thing I like about the C++ syntax is the control it gives you over the copy/reference semantics. You can specify it for each variable if you want.
Syntax aside, it's really striking how similar the two program structures are. This makes me wonder how interesting it would be to directly translate a Lispish- or ML-/Haskell-like shell syntax down to C++0x using a very basic parser. It could provide a nice (typed!) functional language that produces native and efficient code, and is directly able to interoperate with the massive C and C++ code base out there.
Discuss this article on reddit.
A bit of further experimentation with abstraction over lambdas led to
finding one awkward oddity: it seems to be difficult to combine
lambdas with templates. There is no template syntax for the lambda
syntax, and you can't easily return a lambda from a templated regular
function because it must be auto, requiring decltype, but you
can't put the lambda in decltype, so the compiler can't figure out
the return type. (Not 100% sure why it can't just infer it, but..)
Apparently there was an
issue
with something called "polymorphic lambdas" that just didn't make it
into the C++0x standard, unfortunately. However, I was able to create
a templated regular function that returns an std::function<>, and
puts a lambda inside it. Not sure if that's really an ideal answer,
but it allowed me to return a polymorphic closure:
#include <stdio.h>
#include <functional>
class A {
public:
int value() { return 12; }
};
class B {
public:
int value() { return 9; }
};
template<typename T>
std::function<int()> getT() {
auto v = ([] () {T a; return a.value();});
return v;
}
int main() {
printf("%d\n", getT<A>()());
printf("%d\n", getT<B>()());
return 0;
}