Monday, September 17, 2007

Programming Is Writing

Programming is just like writing. Prose that is.

Specifically, I'm talking about essays — prose in which the purpose is to explain something. Which is exactly what a program does. It explains an algorithm.

Both programming and writing are constructive, as in the constructive logic sense. Writing a program is writing a proof, is writing an essay. In both, it is desirable to be clear and concise. What you write must be organized; the most straight-forward way is starting with a high-level overview, or a top-down organization. Both programs and prose must be read and understood sequentially. Sure you can jump around, but in both, each statement must follow from the previous one. And sub-points, or sub-goals, must flow from beginning to end to form the final point.

In prose, in addition to giving explanations, you give examples to get your point across. So why don't programming languages allow you to write examples as part of the source? In a sense, pattern-matching is exactly this but very limited. Is it not done simply because compilers are not good at inferring from examples? Even if this is the case, we shouldn't let it stop us. The purpose of a programming language is to allow an algorithm to be expressed, and secondarily to be executed. Now I know the economists out there will argue with me. And you're right. But I'm talking more from a design perspective.

Stories are also proofs — they are illustrations of {proof through action}. That's why theories often start from anecdotal evidence. Anecdotes are short story proofs. Everyone uses them all the time. Story is the universal proof-language. In other words, the universal programming language.

Now, I wouldn't want to write all my programs in this universal language for practical reasons. English for example, is not optimized for expressing algorithms. But it's universal in the sense that anyone can understand and relate to stories in the language — i.e. programs in the language — because they describe life experiences, which English is optimized for.

I was talking with Kyle the other day about how I and other programmers sometimes alienate people when we speak because we either have to clarify intuitively obvious ambiguity   or explain things by decomposing them into such detail that the explanation becomes incomprehensible to others. The other day, someone asked Kyle if the hat he was wearing was new. He replied by asking what the person meant by "new". From an intuitive perspective — the perspective of the person asking the question — Kyle should have answered simply "yes" or "no". But instead, the inquirer became slightly confused by the unexpected response, and they went back and forth going nowhere.

Kyle claims that programmers are better at recognizing puns because puns are instances of ambiguity, which programmers have been trained to spot. (I disagree though, because I find myself missing obvious puns, and I think this has to do with the seemingly under-developed right hemisphere of my brain, or at least the disconnectedness of the two hemispheres.) But that got me thinking... if programmers are good at spotting puns because puns are instances of ambiguity, wouldn't that make them bad at creating puns? Because I read recently in On Intelligence that part of what makes Shakespeare a genius in writing is how he can create a metaphor between two things and not actually be talking about either of them. I'm not an expert on Shakespeare, but I do know it's hard to use metaphors in writing. It's simply more abstract.

So if programming is just like writing, what would be the equivalent of a pun in programming?

A pun isn't merely ambiguity between two or more different meanings; it's actually double meaning. Triple meaning. Any or all of the meanings make sense.

So to have this in programming, for example, when a language has two namespaces: one for variables, and one for types; you would have to declare a name in both namespaces and find a way to state the name in such a way that both the variable and the type make sense.

Are there any languages out there that allow for such a thing? I'm not sure. But an obvious case of this would be for printing in a language that has type information available at runtime. You could write something like print foo where foo is defined as both a variable and a type. Which should it print? Obviously it would depend on the semantics of the language. But my point is, it could do both. Or it could look at you and smile coyly.

Like puns and the idea of including examples directly in the source, not separately in test cases, I believe there are more good ideas to be extracted from this analogy that programming is just like writing. Can anyone out there see any more? I'm working on a post about problem-solving methods that I hope to finish soon, but analogy seems to be an amazing problem-solving technique, if not a pattern of problem-solving techniques in itself. In other words, a meta-meta-solution.


mk said...

Common Lisp has separate namespaces for variables, classes, functions, among other things. So you can write something like the following:

(defclass horse () ())
(defvar horse "Shire")
(print-object horse nil)

That prints horse-the-variable in Common Lisp, but that's just the convention. In Common Lisp classes are objects too. To actually print the class you'd have to write:

(print-object (find-class 'horse) nil)

More interesting example is Ruby, where there's ambiguity between variable names and method calls. In Ruby parenthesis after method calls are optional, so "foo" may mean, depending on current context, a variable "foo" or a call to method "foo". Here's the pun:

def cook
puts "making a dish..."

cook = "James"

cook # order to make a dish or reference to a person?

A simple heuristic in Ruby parser makes that a variable reference, but the point is the same as you mentioned - both meanings make perfect sense.

The idea of embedding test cases inside source code reminds me of Python's doctest and the whole idea of literate programming. For example:

The answer is 42.

>>> the_answer
Traceback (most recent call last):
NameError: name 'the_answer' is not defined

...well, Python is not there yet.

You get the idea.

I stumbled upon your blog this evening and I really like it. More posts please. :-)

Jon T said...

mk, thanks for the intelligent input. If I had more comments like yours giving me interesting feedback (unfortunately I can't be an expert in all the languages out there), I'd be more likely to post more.