Archive for the ‘Lisp’ Category.

Kent Pitman, again

And anyway, the subject line presupposes that Lisp has not caught on. This is like saying that astrophysics or calculus or brain surgery has not caught on because in relative numbers, there might be more people doing other things. The success of Lisp is not measured in the number of people using it, it’s measured in the utility to those people who do use it. Turning it into C (or C++ or C#) to make it more popular would not be success. In the world’s menu of computer language options, we don’t need them all to be Taco Bell.

Toy Scheme interpreter in Lua

As part of my Lisp studies, I have implemented a toy Scheme interpreter in roughly 1000 lines of Lua. It is here. It supports tail-call optimisation, lexical scope for closures, and first-class continuations via call/cc.

I have departed from the traditional approach of implementing a Scheme interpreter in Scheme itself because I wanted to avoid any confusion between the defined language and the defining language. This has made a lot of the concepts clearer. The evaluator is written in continuation-passing style, which is easy to do in Lua because Lua guarantees proper tail calls. Written this way, continuations are easier to reify as first-class values.
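To illustrate why continuation-passing style makes call/cc easy to reify, here is a minimal sketch of such an evaluator. It is not the Lua interpreter mentioned above, and it is written in Scheme only so that every listing on this page uses one language. The names evaluate, lookup and demo-env are made up for the example, and the mini-language is an assumption too: lambdas take exactly one parameter, and only if, lambda, call/cc and one-argument application are handled.

```scheme
;; Hypothetical mini-evaluator in continuation-passing style (CPS).
;; k is always a one-argument procedure: "the rest of the computation".

(define (lookup name env)
  (let ((binding (assq name env)))
    (if binding
        (cdr binding)
        (error "unbound variable" name))))

(define (evaluate expr env k)
  (cond ((symbol? expr) (k (lookup expr env)))
        ((not (pair? expr)) (k expr))            ; numbers, booleans: self-evaluating
        ((eq? (car expr) 'if)                    ; (if test then else)
         (evaluate (cadr expr) env
                   (lambda (test)
                     (evaluate (if test (caddr expr) (cadddr expr)) env k))))
        ((eq? (car expr) 'lambda)                ; (lambda (x) body)
         (k (lambda (arg k2)
              (evaluate (caddr expr)
                        (cons (cons (car (cadr expr)) arg) env)
                        k2))))
        ((eq? (car expr) 'call/cc)               ; (call/cc f)
         (evaluate (cadr expr) env
                   (lambda (f)
                     ;; Reify k as a procedure that discards the continuation
                     ;; it is called with and resumes k instead.
                     (f (lambda (value ignored-k) (k value)) k))))
        (else                                    ; (f arg)
         (evaluate (car expr) env
                   (lambda (f)
                     (evaluate (cadr expr) env
                               (lambda (arg) (f arg k))))))))

;; Procedures of the defined language take their argument plus a continuation.
(define demo-env
  (list (cons 'add1 (lambda (n k) (k (+ n 1))))))

(define demo-result
  (evaluate '(add1 (call/cc (lambda (c) (add1 (c 10)))))
            demo-env
            (lambda (v) v)))
;; demo-result is 11: calling c abandons the inner add1.
```

Because every call in evaluate is a tail call, the host language needs proper tail calls for this style to run in constant stack space, which is the property relied on in the paragraph above.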

I have also added a few primitives so that I can run some examples, such as the factorial and Fibonacci functions. More primitives can easily be added if desired.

Does the World need another Scheme system?

I am currently reading the third chapter of Lisp in Small Pieces. It is really a wonderful book. By teaching how to implement Lisp, it teaches a lot about using the language too. Moreover, while reading it I sometimes feel the urge that almost every Schemer has felt at one time or another: the urge to write his own Scheme system.

OK, I know the World is already sick of Schemes; it only takes a look at this list to know why. But I still think I have to do it, if only for my personal learning. There is no need to bore the World to death with another half-done Scheme. Sometimes I think about features that could make my Scheme useful, though. They are probably never going to be implemented, but here they are:

Full R5RS compliance

The R5RS is a piece of art (R6RS, on the other hand, is a stain, a mess). Of course it does not describe a terribly useful system, but it beautifully describes the core of one. Everything else can then come from there. For more compliance with other systems, the SRFIs can be used as the basis for a standard library.

Bytecode virtual machine

Although Scheme is a dynamic language, a lot can be known about a program before executing it: which variables are lexical, which are never assigned, where closures and continuations are created, and so on. This can be processed at compile time, allowing for a simple and fast virtual machine. Interpreting the source code directly is easier but very inefficient for any serious system.
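As an example of what can be computed before running anything, here is a minimal sketch of a free-variable analysis. It assumes a made-up mini-language with only symbols, constants, (lambda (params ...) body) with a single body expression, and applications. The names free-variables, list-union and list-difference are hypothetical, and a real compiler would also have to handle let, set!, quote and the other special forms.

```scheme
;; Sets are represented as lists of symbols without duplicates.
(define (list-union a b)
  (cond ((null? a) b)
        ((memq (car a) b) (list-union (cdr a) b))
        (else (cons (car a) (list-union (cdr a) b)))))

(define (list-difference a b)
  (cond ((null? a) '())
        ((memq (car a) b) (list-difference (cdr a) b))
        (else (cons (car a) (list-difference (cdr a) b)))))

;; Variables referenced in expr that are not bound by an enclosing lambda in expr.
(define (free-variables expr)
  (cond ((symbol? expr) (list expr))
        ((not (pair? expr)) '())                 ; constants have no variables
        ((eq? (car expr) 'lambda)                ; parameters are bound in the body
         (list-difference (free-variables (caddr expr)) (cadr expr)))
        (else                                    ; application: union over all parts
         (let loop ((parts expr) (acc '()))
           (if (null? parts)
               acc
               (loop (cdr parts)
                     (list-union (free-variables (car parts)) acc)))))))

(free-variables '(lambda (x) (f x y)))
;; => (y f)
```

Global names such as f count as free here. A compiler would typically separate them from the lexical variables that a closure actually has to capture.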

Incremental garbage collection

Stop-the-world garbage collectors are easier to implement, but in some applications the pause is unacceptable. For applications like interactive games, a smooth user experience can only be achieved with an incremental garbage collector. It would be nice if it were a moving collector too, to avoid too much fragmentation of the heap. (Some naive C programmers think that malloc and free can beat a garbage collector in any circumstances, until they are hit in the face by heap fragmentation. There is a reason the Apache web server uses memory pools.) If I remember correctly, the CLR garbage collector is very good and compacts memory without using from- and to-spaces, which is cool because it uses less heap memory. Hmm, while I am dreaming about my ideal garbage collector, make it concurrent too (see below).

Concurrency

These days the hot topic is concurrency. New personal computers with two and four cores sell more and more every day. The programming languages and environments that make it easy for the programmer to use these new cores are going to be big. Erlang is the hottest thing right now in this regard, although Haskell is a strong candidate too. In the mainstream, Java has a concurrent VM, but Java itself uses the old lock/unlock paradigm of other mainstream languages; Clojure is the new language on top of the JVM that leverages that power for the programmer. But although this is all so hot, the only thing that seems to be happening in Scheme is Termite, which as far as I know can only use green threads. My hypothetical Scheme would have a concurrent VM, and some Scheme compilation techniques already give a start in that direction. There might be a spawn-continuation form that would take a continuation captured with call/cc and resume it in parallel with the current one, in an M:N model (M continuations spread over N OS threads).

Just-in-time compilation

Well, once dreaming, why not go all the way? One of the nice things about Scheme is bottom-up, incremental development. Developing at the REPL gives instant feedback and eases debugging. What if it could give the speed of compiled-to-CPU languages as well? Another advantage is hot-swapping code in a running application. This can’t be done with Chicken or Gambit, which are fast but are Scheme-to-C compilers. Larceny, Chez and Ikarus can do this. By using a bytecode VM, the system is portable to wherever there is a C compiler, and by adding JIT support for some architectures it can be made fast too.

Escape continuations

Full, first-class continuations with indefinite extent are a powerful and mind-bending feature, but unfortunately they are usually hard to implement and cause pauses when invoked. Besides them, my hypothetical Scheme would have escape continuations. Instead of completely overwriting the execution stack, they simply unwind it to some recorded point, which can be made very fast. Exceptions would then be implemented as escape continuations rather than full ones.
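The usage pattern can be sketched in standard Scheme. Standard Scheme has no separate escape-continuation primitive, so this sketch uses call-with-current-continuation but only ever calls the continuation while its extent is still active, which is exactly the restricted case an escape continuation would cover. The names find-first and try-with are made up for this illustration.

```scheme
;; Early exit from a traversal: return the first element satisfying pred, or #f.
(define (find-first pred lst)
  (call-with-current-continuation
    (lambda (return)
      (for-each (lambda (x) (if (pred x) (return x))) lst)
      #f)))

;; A toy exception mechanism. body receives a throw procedure;
;; calling (throw c) abandons body and returns (handler c) from try-with.
(define (try-with body handler)
  (call-with-current-continuation
    (lambda (escape)
      (body (lambda (condition) (escape (handler condition)))))))

(find-first even? '(1 3 4 5))
;; => 4

(try-with (lambda (throw) (+ 1 (throw 'oops)))
          (lambda (c) (list 'caught c)))
;; => (caught oops)
```

In both procedures the continuation is only invoked from inside the call that created it, so an implementation could unwind the stack rather than reinstate a saved copy.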

Easy FFI

I am a long-time Lua user, and it is amazing how good they managed to make the C API. Not only is it very easy to add new C/C++ libraries to Lua, it is also trivial to embed the Lua interpreter itself in a legacy application, or in an application that needs to be mostly in C/C++, such as a game. As Scheme, unfortunately, is not one of the World’s favorite languages, the capacity to talk to the outside World is fundamental. And by being embeddable (which is the goal of Guile, but it sucks for Windows), it would be easier to take Scheme to other domains where it is not so popular today.

Well, I guess this is it. Thinking about it, there seems to be no Scheme system today with all these features. Will I have the skills to write something like this after LiSP? I doubt it; the garbage collector alone would take me ages. But it would be nice if such a thing existed…

Concurrent Scheme

While I wait for my one-hour mobile phone software compilation to finish, I decided to drop a quick post about something that has been on my mind lately. In his nice PhD thesis (which I had to read five times to finally understand), Dybvig (the implementor of Chez Scheme) shows a compiler that rewards purely functional use of the Scheme language. The compiler scans for free variables in lambda forms and, when creating the respective closures, puts the values of the variables directly in the closure activation frame. The references to the free variables in the closure are then changed to references to the stack locations. Accessing the variables is then pretty fast. Besides, when a first-class continuation is created, it suffices to blindly copy the execution stack, because all closures carry their free variables with them. Assignment is added in a later section of the thesis. To support it, the compiler scans the code for set! forms, gathers the variables that are ever assigned, and compiles them to indirections to heap memory. When a continuation is created, only the reference to the indirection box is copied, not the actual variable, which lives in the heap. So access to assigned variables is much slower.
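The effect of that transformation on assigned variables can be sketched by hand. This is only an illustration of the idea, not the output of Dybvig's compiler. The cell procedures (make-cell, cell-ref, cell-set!) are hypothetical names, with a one-slot vector standing in for the heap-allocated box.

```scheme
;; Source form: n is assigned with set!, so copying its value would be wrong.
(define (make-counter)
  (let ((n 0))
    (lambda ()
      (set! n (+ n 1))
      n)))

;; Hypothetical boxes, represented as one-slot vectors on the heap.
(define (make-cell v) (vector v))
(define (cell-ref c) (vector-ref c 0))
(define (cell-set! c v) (vector-set! c 0 v))

;; After the transformation: n itself is never assigned. It always names the
;; same cell, so the closure can copy the reference freely; only the cell's
;; contents change.
(define (make-counter/boxed)
  (let ((n (make-cell 0)))
    (lambda ()
      (cell-set! n (+ (cell-ref n) 1))
      (cell-ref n))))

(define counter (make-counter/boxed))
(counter) ; => 1
(counter) ; => 2
```

Every read of an assigned variable now costs an extra memory indirection, which is the slowdown described above. Variables that are never assigned keep the fast path.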

Although the focus of Dybvig’s thesis is not on concurrency, I cannot help but think that his compiler could work very well for this too, with just minor modifications. If variables (local or free) are never assigned, they could be replaced by their values, and several threads with separate execution stacks could run in parallel. If a variable is assigned, access to its indirection box could be protected by a mutex or a similar synchronisation primitive. Of course this would make access to assigned variables even slower, because even variables used by only a single thread would need mutex lock/unlock operations. On the other hand, isn’t assignment discouraged in Scheme? The overall execution time would compensate for this if the rest of the program were parallelised.

But surely I am missing something, because no one ever seems to think Scheme can be a contender in the massively parallel future. What bugs me is that I don’t know what it is that I am not taking into consideration.

DSLs are cool

One of the most promoted features of the Lisp dialects is how easy it is to create Domain Specific Languages with them. It is said that a Lisp programmer creates a language for the domain of his problem and then solves the problem as a particular case. That is why a lone developer or a small team can have so much power: they are soon working in a language customised for them. In my case, one of the domains of my problems is the stock market.

I am writing a very simple application in Scheme, with Gambit-C Scheme, to help me trade stocks (nothing fancy, I am still an amateur). This application is going to have a comprehensive set of technical indicators, which are calculated from the opening, closing, maximum and minimum stock prices as well as the volume traded, number of operations, number of stocks traded etc. Clearly these indicators must be first-class citizens of this small financial world. Once they are added to the system, I must know how many I have, what name the user should see for each of them, and which parameters they take. Besides, the syntax for defining them must allow users of the system to add indicators of their own without knowing the inner details of the system.

To accommodate those needs I have come up with a little language for my indicators. For instance, these are the Volume and Force Index indicators:

(indicator volume (x)
  "Volume"
  (volume x))

(indicator force-index (x)
  "Force Index"
  (if (> x 1)
      (* (- (closing (- x 1)) (closing x)) (volume x))
      0.0))

The indicator syntax allows me to give the indicator a user-friendly name and to automatically add it to my indicator list. But there is something more subtle and interesting about them: the helper procedures. The procedures closing and volume are examples of procedures available to indicator writers. They cannot be lexically scoped, because they depend on the current stock being analysed and the current sampling frequency (daily, weekly etc.). They must be bound to the appropriate procedures only in the dynamic scope where the indicators are being run. In Common Lisp this is easily done with special variables. In Scheme we have SRFI-39, Parameter objects. But since parameter objects act syntactically like closures (they must be called to obtain their current value), using them directly would demand that the indicator writer use a cumbersome syntax like:

(indicator force-index (x)
  "Force Index"
  (if (> x 1)
      (* (- ((closing) (- x 1)) ((closing) x)) ((volume) x))
      0.0))

This is very undesirable. The solution, in the face of the lack of special variables, is to use a code walker. Due to the unique syntax of Lisp, this whole process can be combined into a simple macro:

(define-macro (indicator name args desc . body)
  (letrec (;; helpers that must be looked up dynamically at run time
           (helpers '(opening closing maximum minimum
                      volume num-operations num-stocks
                      num-samples))
           ;; rewrite every call (helper arg ...) into ((helper) arg ...)
           (walker (lambda (sexp)
                     (if (pair? sexp)
                         (let ((opr (car sexp))
                               (args (cdr sexp)))
                           (if (memq opr helpers)
                               (cons `(,opr) (map walker args))
                               (cons opr (map walker args))))
                         sexp))))
    ;; build the indicator procedure and register it under its display name
    `(let ((,name (lambda ,args ,@(map walker body))))
       (set! *indicator-list* (cons (cons ,desc ,name)
                                    *indicator-list*)))))

And that’s it. From now on I need to think only about indicators, and not about where to put them, how to supply them with helpers, what the frequency of their samples is etc. They are high-level abstractions which are now part of my language.
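To make the rewrite visible, here is a minimal sketch that runs the same walking logic as an ordinary procedure on quoted code, then evaluates a hand-expanded indicator under parameterize. It assumes an implementation that provides SRFI-39 (make-parameter and parameterize), as Gambit does. The names walk, series and run-force-index, the default helper values and the sample data are all made up for the illustration; the real application's helpers may be defined differently.

```scheme
;; Standalone copy of the walker, so its output can be inspected.
(define helpers '(opening closing maximum minimum
                  volume num-operations num-stocks num-samples))

(define (walk sexp)
  (if (pair? sexp)
      (let ((opr (car sexp))
            (args (cdr sexp)))
        (if (memq opr helpers)
            (cons (list opr) (map walk args))   ; (closing x) -> ((closing) x)
            (cons opr (map walk args))))
      sexp))

(walk '(* (- (closing (- x 1)) (closing x)) (volume x)))
;; => (* (- ((closing) (- x 1)) ((closing) x)) ((volume) x))

;; Hypothetical helper parameters: each holds a procedure from sample index to value.
(define closing (make-parameter (lambda (x) (error "no stock selected"))))
(define volume  (make-parameter (lambda (x) (error "no stock selected"))))

;; The force-index body as it looks after the walker has rewritten it.
(define (force-index x)
  (if (> x 1)
      (* (- ((closing) (- x 1)) ((closing) x)) ((volume) x))
      0.0))

;; Made-up sample data, indexed by sample number.
(define (series . samples)
  (lambda (x) (list-ref samples x)))

;; Bind the helpers only for the dynamic extent of the indicator run.
(define (run-force-index)
  (parameterize ((closing (series 10.0 11.0 12.0 11.5))
                 (volume  (series 100 200 300 400)))
    (force-index 3)))
;; (run-force-index) computes (12.0 - 11.5) * 400 = 200.0
```

Note that the walker only rewrites helpers in operator position, and it also descends into quoted data, so an indicator body that passes closing as an argument or quotes a list containing a helper name would need a more careful walker.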