Tuesday, April 22, 2008

My Ideal Job

Somewhere in New York there is a metaprogramming job with my name on it.

Where exactly? I haven't found it yet, but I'm sure it's keeping an eye out for me. ;-)

Friday, April 18, 2008

PL What-Ifs

What if you compiled a source language to multiple target languages, gaining the benefit of more than one platform?

For example, what if you were creating a brand new language that you wanted to be type-safe with all the intricacies of Haskell's type system, but you also wanted to take advantage of libraries written in Ruby? Suppose you created a compiler that first compiled your program to Haskell, ran it through GHC's type-checker, and then, if it passed, compiled your program to Ruby. You'd get the benefit of Haskell's type-checker and Ruby's libraries.
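A minimal sketch of that two-stage shape, in Python for brevity. Everything here is hypothetical: the mini language (`Lit`, `Add`), the checker `type_of`, and the emitter `emit_ruby` are made up for illustration, and the checker is only a stand-in for handing the program to a real type-checker such as GHC's.

```python
import json
from dataclasses import dataclass
from typing import Union

# Hypothetical mini source language: int and string literals plus "+".
@dataclass(frozen=True)
class Lit:
    value: Union[int, str]

@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"

Expr = Union[Lit, Add]

class TypeCheckError(Exception):
    pass

def type_of(expr):
    """Stage 1: a stand-in for the strict checker (the post imagines GHC here)."""
    if isinstance(expr, Lit):
        return "Int" if isinstance(expr.value, int) else "String"
    left, right = type_of(expr.left), type_of(expr.right)
    if left != right:
        raise TypeCheckError(f"cannot add {left} and {right}")
    return left

def emit_ruby(expr):
    """Stage 2: emit Ruby source text for a program that already type-checked."""
    if isinstance(expr, Lit):
        # For ints and plain strings, a JSON literal is also a valid Ruby literal.
        return json.dumps(expr.value)
    return f"({emit_ruby(expr.left)} + {emit_ruby(expr.right)})"

def compile_program(expr):
    type_of(expr)            # reject ill-typed programs before emitting anything
    return emit_ruby(expr)
```

Under these assumptions, `compile_program(Add(Lit(1), Lit(2)))` yields the Ruby text `(1 + 2)`, while `Add(Lit(1), Lit("x"))` is rejected by the first stage even though Ruby itself would only complain at runtime.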

What if a language wasn't strictly statically typed or dynamically typed, but instead had a knob that could be tuned in one direction or the other depending on the situation?

For example, what if you wanted the benefits of static type-checking, but your code would be far simpler if you could just access the symbol table or use eval in one or two places, at the cost of a possible runtime error? And no, this is not the same as implementing everything yourself with some sort of variant type, as you could in any Turing-complete language. I'm thinking of something more like Haskell's IO monad, which allows you to execute impure code in an otherwise pure setting. In the same way that the IO monad infects everything it touches, so too would the dynamically-typed-code "monad". But that's just one way of doing it. Another way would be to specifically declare something to be a variant type whose properly typed value was implicitly projected out.
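To make the "infecting" idea a little more concrete, here is a rough sketch in Python. Python is already dynamically typed, so this only shows the shape of the boundary, not a real static checker. The names `Dyn`, `project`, `map`, and `lookup` are all hypothetical.

```python
class Dyn:
    """A dynamically typed value: type checks are deferred until projection."""

    def __init__(self, value):
        self.value = value

    def map(self, fn):
        # Anything computed from a Dyn stays a Dyn, the way IO "infects" results.
        return Dyn(fn(self.value))

    def project(self, expected_type):
        # The one place where a runtime type error is allowed to surface.
        if not isinstance(self.value, expected_type):
            raise TypeError(
                f"expected {expected_type.__name__}, "
                f"got {type(self.value).__name__}"
            )
        return self.value

def lookup(symbol_table, name):
    """Hypothetical dynamic operation: a symbol-table read of unknown type."""
    return Dyn(symbol_table[name])
```

With `table = {"width": 3, "title": "phase"}`, the expression `lookup(table, "width").map(lambda w: w * 2).project(int)` gives back an ordinary integer, while projecting `"title"` as an `int` raises a `TypeError` at that single, visible boundary.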

What if you could visualize the dependency graph of language objects like functions, modules, etc.?

For example, I've noticed that projects whose sub-projects have dependencies in a stack (i.e. more like a linear chain) are much easier to grok than those whose dependencies form an intricate cyclic graph. Would seeing these dependency graphs help in spotting possible complexity hot-spots, and thus possible bug hot-spots? Or would visualizing the dependencies alone help us to better understand them? I'd expect my compiler to generate these automatically, of course, because it's already doing the dependency analysis anyway.
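The stack-versus-tangle distinction is easy to check mechanically. A small sketch using Python's standard `graphlib` module (Python 3.9 or later); the helper `build_order` and the sample graphs are hypothetical.

```python
from graphlib import TopologicalSorter, CycleError

def build_order(deps):
    """Return (order, None) if deps is acyclic, else (None, cycle).

    deps maps each item to the set of items it depends on.
    """
    try:
        return list(TopologicalSorter(deps).static_order()), None
    except CycleError as err:
        # The second element of args is the list of nodes forming the cycle;
        # its first and last entries are the same node.
        return None, err.args[1]

# A linear "stack" of dependencies versus a cyclic tangle.
stack = {"app": {"lib"}, "lib": {"core"}, "core": set()}
tangled = {"a": {"b"}, "b": {"c"}, "c": {"a"}}
```

`build_order(stack)` returns an order that starts at `core` and ends at `app`. `build_order(tangled)` returns the offending cycle instead, which is exactly the kind of hot-spot a visualization would make obvious.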

What if you could inline and un-inline function calls at will as you were editing the code?

For example, some people are good at thinking very abstractly and like to factor out commonalities as much as possible to reduce code. After a point, though, returns diminish as the code becomes unintuitive or "unreadable" (deferring the simplest two-time-use definitions to a separate file, for example). Where that point lies differs from person to person. So what if a sufficiently capable code-editor (i.e. a viewer for data that happens to be code) allowed different users, in addition to skins, to adjust how many levels deep functions get inlined? Said another way, what if your editor allowed you to macroexpand and un-macroexpand the code you were editing (inline, not in an output buffer somewhere) at the push of a button, arbitrarily many levels deep?
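A rough sketch of the inlining half, with expressions as nested Python lists standing in for s-expressions. The functions `inline` and `substitute` and the sample definitions are hypothetical. The substitution is naive: it duplicates argument expressions and ignores name clashes between parameters and operators.

```python
def substitute(body, bindings):
    """Replace parameter names in body with argument expressions (one pass)."""
    if isinstance(body, list):
        return [substitute(part, bindings) for part in body]
    if isinstance(body, str):
        return bindings.get(body, body)
    return body

def inline(expr, defs, depth):
    """Unfold calls to user-defined functions, at most `depth` layers deep.

    defs maps a function name to (parameter_names, body).
    """
    if not isinstance(expr, list) or not expr:
        return expr
    head, *args = expr
    args = [inline(arg, defs, depth) for arg in args]
    if depth > 0 and isinstance(head, str) and head in defs:
        params, body = defs[head]
        body = inline(body, defs, depth - 1)   # one layer is used up by this call
        return substitute(body, dict(zip(params, args)))
    return [head, *args]

defs = {
    "inc": (["n"], ["+", "n", 1]),
    "twice": (["n"], ["inc", ["inc", "n"]]),
}
```

Here `inline(["twice", "x"], defs, 1)` shows `["inc", ["inc", "x"]]`, and depth 2 shows `["+", ["+", "x", 1], 1]`. "Un-inlining" then needs no separate algorithm: the editor keeps the original expression and simply re-renders it at a smaller depth.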

...Let us all keep asking questions. About programming and everything else.

Monday, April 14, 2008

The Phase Concept

Anyone who's been following my blog for a while may have seen a pattern by now. Everything I've written about programming languages has a theme which, when extrapolated, has one logical conclusion: to create a compiler for a programming language that is a good tool for creating other programming languages (possibly mini languages, otherwise known as APIs or DSLs), with a GUI editor that is aware of the language's semantics, and with a well-established target language that already has many tools.

The compiler project I mentioned in my last post is a start on that. However, it is by no means the final product. First of all, I conjecture that an s-expression-based source language will serve well as the target language for a graphical code-editor built later.

Secondly, the idea of having PHP as the compiler's target language was based on the desire to take advantage of the hordes of PHP code already out there for creating web apps. However, after creating an initial prototype, it became obvious that PHP's lack of support for closures is a huge obstacle to creating a compiler whose source language has closures. I can't imagine not having closures, so PHP is out. (It's not that it can't be done, but it would be significantly more work to compile away the closures.) This is good though, because it forces me to re-write (i.e. re-design) the compiler.
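For a sense of what "compiling away the closures" involves, here is a sketch of the usual approach, commonly called closure conversion, written in Python rather than PHP. That this is the technique meant above is an assumption, and the names `Closure`, `adder_code`, and `make_adder` are hypothetical.

```python
# Source-level idea, as the source language would allow it:
#   make_adder = lambda n: (lambda x: x + n)

class Closure:
    """An explicit closure: a top-level function plus its captured variables."""

    def __init__(self, code, env):
        self.code = code
        self.env = env

    def call(self, *args):
        return self.code(self.env, *args)

# After conversion, every function is top-level and reads captured
# variables from an explicit environment argument, not from its scope.
def adder_code(env, x):
    return x + env["n"]

def make_adder(n):
    return Closure(adder_code, {"n": n})
```

`make_adder(5).call(2)` evaluates to 7 without any nested function. The extra work is that the compiler has to find every captured variable, build these environments, and rewrite every call site to go through `call`.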

I also decided against compiling to Python. Even though I have good feelings towards it, I can't justify using it when it restricts anonymous functions (lambdas) to a single expression in an otherwise imperative language. I was also considering Common Lisp as a target language. The thing is, using it as a target language leaves this new language with all the same problems that Common Lisp has, and so in a way that would defeat the purpose of building on top of something supported by armies of coders. Put another way, CL's armies are significantly smaller than the armies of other languages.

As much as I don't want to admit this, Ruby is starting to look like the best option for a target language.

So for those of you wondering if I'm going to release my little prototype, I see no reason to. It was written in Haskell as a proof of concept. The s-expression parser was taken from the Lisp interpreter I wrote, and I simply added the translation to PHP.
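To give a flavor of how little such a proof of concept needs, here is a toy s-expression parser and PHP emitter in Python. It is not the Haskell prototype described above. The functions `parse` and `emit_php` and the translation rules (symbols become `$variables`, arithmetic becomes infix, everything else becomes a function call) are assumptions made for illustration, and there is no error handling for malformed input.

```python
import re

def parse(source):
    """Parse one s-expression into nested lists of atoms (ints or strings)."""
    tokens = re.findall(r"[()]|[^\s()]+", source)
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing ")"
            return items
        return int(token) if re.fullmatch(r"-?\d+", token) else token

    return read()

INFIX = {"+", "-", "*", "/"}

def emit_php(expr):
    """Translate a parsed expression into a PHP expression string."""
    if isinstance(expr, int):
        return str(expr)
    if isinstance(expr, str):
        return "$" + expr                     # bare symbols become PHP variables
    head, *args = expr
    parts = [emit_php(arg) for arg in args]
    if head in INFIX:
        return "(" + f" {head} ".join(parts) + ")"
    return f"{head}({', '.join(parts)})"      # anything else is a function call
```

For instance, `emit_php(parse("(max (+ x 1) 10)"))` produces the PHP text `max(($x + 1), 10)`.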

I have concerns though about certain features like eval. My first inclination was to include it, as I plan on having something like macros à la Lisp. That could slow down the development of a prototype, and thus feedback, so I may cut it from the first version. Including eval creates a bootstrapping problem. It requires me to either write the compiler in the target language or include enough language primitives to implement eval in the source language itself and re-write eval in my new language. This is a sad cut, but it's necessary to get a feel for the language quickly.

So what is this "language" I keep referring to? What's special about it? What will its purpose be? It's just an idea I've been toying with, and this prototyping is meant to try to figure out if it's a good idea or not.

Every language lends itself to writing code in a certain way. Java, for example, lends itself to writing code in an object-oriented way. You could, however, write Java code that looks more like garbage-collected C code with classes used only as namespaces. Or you could write functional code in Java, passing around "functors" built out of anonymous classes. But the reason people tend towards writing object-oriented code in Java is that Java lends itself to an OO design. It makes writing OO code cheap, so cheap that it changes the way you think about algorithms so that they fit an OO model.

But me, I already think of everything as a compiler. I see every program as a compilation from inputs to outputs. A giant function if you will. Of course, when a program is big, you break it up into multiple functions, each with its own inputs and outputs. On a larger scale, you break up groups of functions among modules, where each module's interface defines a mini DSL, and each module's implementation is the compiler for it.

In this way, every program is a composition of mini compilers between mini languages. Oftentimes data in a program will pass through many intermediate stages as it flows from input to output through various transformations. In the same way that C++ code gets compiled to C code, then to object code, and then finally to machine code, each stage that data flows through is a compilation phase.
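As a small illustration of that view, here is a sketch of a program written as a chain of phases in Python. The helper `compose` and the three phases of this toy word-count "compiler" are hypothetical.

```python
from functools import reduce

def compose(*phases):
    """Chain phases left to right: each phase's output feeds the next."""
    return lambda data: reduce(lambda value, phase: phase(value), phases, data)

# Phase 1: raw text -> list of lowercase tokens
def tokenize(text):
    return text.lower().split()

# Phase 2: tokens -> mapping from word to occurrence count
def count(tokens):
    counts = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    return counts

# Phase 3: counts -> human-readable report
def report(counts):
    return ", ".join(f"{word}={n}" for word, n in sorted(counts.items()))

word_count = compose(tokenize, count, report)
```

Calling `word_count("a b A")` passes the data through three intermediate forms (text, tokens, counts) before it comes out as the string `a=2, b=1`. Each intermediate form is a tiny language, and each function is the compiler into the next one.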

With a C++ compiler, the data happens to be C++ source code which gets translated into machine code. However, a clock program is a compiler from the OS's API for retrieving the system's time to a graphical readout of the time. A database engine is a compiler from SQL statements (select, update, delete, etc.) to result sets. (Order of execution is significant, as updates affect the results of compiling select statements in the future.)

A text editor is an advanced compiler with many phases of compilation. Ignoring the transformation (or compilation) of keystrokes to key codes at the hardware and OS levels, text editors transform key presses (and perhaps mouse input) into formatted text, formatted text into the graphical layout of the formatted text, formatted text into linear byte-streams for saving, and formatted text into PostScript or something else suitable for printing.

I already see everything as a compiler, so why not have a language that lends itself to writing programs in this paradigm? A language that makes it cheap to express computations as the composition of multiple phases of translations from one language to another.

It's all about dataflow and how that data changes form as it passes from one phase to the next. So for now, "phase" is its code name.

Wednesday, April 9, 2008

What I've Been Up To

The more I actually do, the less I write.

In November, I didn't blog a single post. The time I usually spent writing went to doing Project Euler problems and learning Haskell.

Last month, I gave a presentation on functional programming with Haskell for Philly Lambda. In order to better understand Haskell (and Lisp) for the presentation, I wrote a simple Lisp interpreter in Haskell. Now I've been spending my time on revising and re-working the presentation for a lunch-and-learn for my day-job. I have also been reading an amazing book called Gödel, Escher, Bach.

Although I was interested in Clojure for a while, I honestly don't think it has much of a future. It's hardly a hundred-year language, yet it's not all that great for today's problems either.[1] Specifically, web apps.

Thus, I've embarked on an adventure in my spare time. Although my former startup-team deserted me to pursue other things, I continued to do R&D.[2] My latest fancy is a compiler from an s-expression-based language to PHP. I thought I'd go with the whole standing-on-the-shoulders-of-giants idea, as Clojure does, but my plan is that users of the language will not need to know a thing about PHP. However, someone who does should be able to reach in and access PHP's immense code-base.

Is it really a good idea to compile to PHP? I don't know. Would another language like Ruby, Python, or Common Lisp be more suitable? Possibly. This is simply meant to be an experiment, and I'm trying not to be afraid to make mistakes. Since I have a good deal of experience writing PHP, I thought using it as a target language would be the easiest way to get a working prototype.

The more I actually do, the less I write. Let's see if the converse is also true. ...I'll try to write more once I have something tangible.
[1] I get the impression there are many quirks in Clojure due to its tight integration with Java. Its meta-data on code is great, but it begs for a code-editor.
[2] In my opinion, the rest of the team left to take jobs and pursue independent design work because of fluctuating states of mind and fear of risk. Unlike them, I have higher goals which reduce my wavering and prevent me from turning away.