Wednesday, August 1, 2007

ML → Scheme

I wrote my first real Scheme macro today. It was a macro to emulate ML's pattern-matching.

Writing a type-checker in Scheme, I found myself writing all these list?, eq?, and let* expressions, making my code look way too complicated than necessary. All of that can be done simply and concisely with pattern-matching. For Lisp hackers out there that aren't familiar with pattern-matching, it's basically a destructuring-bind where you can have constants or symbols in your binding expression which are compared with the value you are binding to, sort of like a cond where the predicates are implicitly specified in the pattern.

The great thing about pattern-matching is that code to process data looks just like the data you are processing. So it makes it extremely easy to look at code that uses pattern-matching and see what it's doing, as opposed to equivalent code that uses explicit conditionals and lets.

Without pattern-matching:
(cond ((and (list? exp) (eq? (first exp) 'typ-arrow)) (let ((param-typ (second exp)) (ret-typ (third exp))) (do-stuff param-typ ret-typ)) ((and (list? exp) (eq? (first exp) 'typ-union)) (let ((branch-typs (second exp))) (do-stuff-to-branches branch-typs))))
With:
(match exp (('typ-arrow param-typ ret-typ) (do-stuff param-typ ret-typ) (('typ-union branch-typs) (do-stuff-to-branches branch-typs)))

Anyway, when I first started using Scheme, I didn't like the prefix notation and the paren-syntax — or lack of syntax. Compiler writers out there, you know what I mean. ;-) Frankly, I don't know anyone who does like it at first. I thought, I'd rather write code in a more familiar way, then switch to prefixed constructors when I needed to create code like I would in ML. For example, the SML/NJ compiler exposes the abstract syntax as a datatype, so you can do just this.

However, this means that there are 2 representations of code in the language. Two. ...And that doesn't seem right.

After working with macros (I finally had the conceptual break-through of figuring out how to use double backquotes — ick!), I realized, this makes sense. The idea that code and data have the exact same concrete representation makes total sense. So much so that I believe this is the right way to do it.

But one question that comes to mind is, why does that one representation have to be prefix notation with parentheses everywhere?

It's not clear whether being able to write data in the same form as the regular parsed grammar (infixes and all) would be a good thing or not.

One thing I've since learned, but had trouble with when I first switched from ML to Scheme, was that in Scheme, the meaning of parentheses is overloaded. In ML, parentheses are used for disambiguation of parsing, and that's all. This is almost strictly so in OCaml. But in Scheme, not only is it used for specifying the parse in lists and sub-lists in quoted expressions, but also for function application. This confused the hell out of me for a little while, and my first inclination was to avoid using quote (i.e. ') and use list instead. But I soon got over that.

Overall, my experience learning Scheme has been extremely rewarding, as a great way to recognize patterns in something is to experience something completely different from it. And many decisions that were made in designing ML, the opposite choices were taken for Scheme or Lisp.

No comments: