Thursday, September 27, 2007

Methods to the Aha

When I was a fresh CS major at CMU, I took the standard 15-251 Great Theoretical Ideas in Computer Science class. The professor, Steven Rudich, introduced us to some amazing topics. Even though I was never able to use it to its full potential, the most valuable thing he taught was what he called the "aha! method". In other words, he taught that there is a method to getting to the point where a light bulb goes off in your head and you get an insight that allows you to solve a hard problem, making you say "aha!". And moreover, that this method can be learned.

The list he gave was as follows.
  • Phrase Hygiene
  • Representation
  • Induction
  • Modularity
  • Exemplification
  • Refinement
  • Abstraction
  • Bracketing
Briefly, he meant: tag assertions with phrases that indicate your degree of conviction; actively choose your representation of the problem; use the various forms of induction; break the problem into smaller subproblems; use small examples; don't give up after you've found an answer; abstract away the non-essential details of the problem; try to give upper and lower bounds and make them converge. I'm not going to go into more detail on each of them, but the overall idea is that each is a trick to help you more easily see the pattern in the problem that will lead you to a solution.1

As Rudich pointed out, in any field — martial arts, violin, tennis, magic, programming — the novice makes a huge motion, the black belt makes a small motion, and the master makes a tiny motion. The more complete the pattern you see, the less work is required to accomplish a task.

Recently, I've found a new method. What made me see this was an "interview question" I heard way back when I was in Rudich's class. It goes something like this... You have to cross a gorge that's 50 feet wide and 50 feet deep. All you have is two 20-foot ladders and all the rope you need. How do you do it?

And no, you can't go around it. What now? (You might want to actually think about this before reading on.)

Select text here for the spoiler... The answer is — since you have all the rope you need — that you fill the gorge with rope, and walk across.

Now, the answer to these riddles is often obvious in hindsight. But what can we learn from it? Why couldn't we see the answer before? It wasn't until recently that I realized that if you abstract away the details of this solution, what you had to do was substitute an abundance of one resource for the lack of another. And this, I've discovered, is another method to the "aha!": Universal Substitution.

What I mean is that any resource can be substituted for any other resource. And this is simply amazing!

...As I mentioned in my last post, analogy seems to be a pattern of all of these. The Representation method above seems to hint at that. But how do you know which representation to choose? You have to already have made a connection — an analogy — with another problem.

Jeff Hawkins seems to think that all problems are solved through either memory or analogy. In other words, you either remember the solution to the problem because you've seen it before, or you relate the problem to another problem, remember the solution to that, and then apply that solution to the specific situation at hand, which is something your mind does all the time.2

If this is true (which seems to make sense), then analogy is a problem-solving method. But not only that, it is a method to finding problem-solving methods! It is a method of solving any problem, even meta-problems and meta-meta-problems.3

So this leads us to our next question. How do you become good at creating analogies?

Analogies are connections between two different things. They are the result of abstracting something away from two things that are different and being left with things that are the same. Analogies are patterns.

One thing I've noticed that increases the likelihood of seeing patterns is trying different things — all sorts of things. Oftentimes you suddenly realize that things you've become so used to don't have to be the way they are. That it is not necessarily the right way or the wrong way. But also, when it's different enough, you can't help but see what isn't different. In other words, what remains constant — which is a pattern.

You never know where a pattern might show up, or what lines it might cross.4 Plus, oftentimes the analogies within a field have already been exhausted by others, while cross-discipline analogies lie un-reaped.

Also, there is the trap of becoming too close to something. After working on a problem arduously for a while, it is often helpful to first take a break, and then step back and look at the problem to get a bigger perspective. In other words, disentangle yourself from the details and take a look at the whole picture at once, just as if you were putting together a jigsaw puzzle.

All this really is, is another example of the law of Balance.

I've actually attempted to devise an algorithm for finding patterns before. The thing is, there is a seemingly infinite number of ways to categorize things. So if you try to categorize things and then look for something re-occurring, you'll never know if you're simply using the wrong categorization. That's why it's hard to learn something new in general. According to Jeff Hawkins, you have to see a pattern before you can learn anything. The people who tend to be seen as smart are the ones who pick up new things quickly, and they can do this because they see the patterns more quickly than others. But it is a completely unconscious activity for them.

This raises the question though: is there another way of finding patterns? Can we find patterns without first categorizing things?

It's like if you have a hash function that abstracts away from things you're looking at, and you keep hashing different things hoping to find a collision. Once you do, you've found a pattern. For example, if your hash function took physical objects as input and gave a color as output, you could hash all the things on your desk and if any items resulted in the same color, you would have found a commonality between those two things, which is a simple yet significant pattern.
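That hashing method can be sketched concretely: pick a coarse hash function, hash everything, and look for collisions. Here is a toy sketch in Python; the desk objects and the color "hash" are made up for illustration:

```python
from collections import defaultdict

def find_collisions(items, hash_fn):
    """Group items by their hash; any bucket holding 2+ items is a pattern."""
    buckets = defaultdict(list)
    for item in items:
        buckets[hash_fn(item)].append(item)
    return {h: group for h, group in buckets.items() if len(group) > 1}

# Hypothetical objects on a desk, "hashed" to their color.
desk = [("stapler", "red"), ("mug", "blue"), ("notebook", "red"), ("pen", "black")]
color_of = lambda obj: obj[1]

print(find_collisions(desk, color_of))
# The stapler and the notebook collide on "red": a shared trait, i.e. a pattern.
```

The open question from above survives intact in the code: the interesting choice is `hash_fn`, and nothing here tells you which abstraction to hash by.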

But what if you're looking for another pattern? Which hash function should you use? Is there another way besides this hashing method?

See also: Side-Stepping Obstacles in Space and Time
1. Try these out on this logic puzzle; they really work. The method can take you from being completely stuck to a point where you can work through the problem.
2. What Jeff Hawkins describes is abstraction and application. Sounds like functional programming to me. Lambda abstraction and function application. Is there anything more fundamental?
3. This really makes you wonder about things like Numenta, Jeff Hawkins' implementation of the brain which he calls hierarchical temporal memory. What problems will be solved once the size of a brain becomes limited by the amount of money someone is willing to throw at it, instead of by the human skull?
4. I'm not at all interested in sports, but recently I went to a baseball game with my brother, and I realized that anything — once you start optimizing, it becomes something so deep that you could spend an entire lifetime mastering it. Baseball. Cooking. Driving. Conversing. ...What then, is truly worth mastering?

Monday, September 17, 2007

Programming Is Writing

Programming is just like writing. Prose that is.

Specifically, I'm talking about essays — prose in which the purpose is to explain something. Which is exactly what a program does. It explains an algorithm.

Both programming and writing are constructive, as in the constructive logic sense. Writing a program is writing a proof, is writing an essay. In both, it is desirable to be clear and concise. What you write must be organized; the most straightforward way is starting with a high-level overview, or a top-down organization. Both programs and prose must be read and understood sequentially. Sure, you can jump around, but in both, each statement must follow from the previous one. And sub-points, or sub-goals, must flow from beginning to end to form the final point.

In prose, in addition to giving explanations, you give examples to get your point across. So why don't programming languages allow you to write examples as part of the source? In a sense, pattern-matching is exactly this but very limited. Is it not done simply because compilers are not good at inferring from examples? Even if this is the case, we shouldn't let it stop us. The purpose of a programming language is to allow an algorithm to be expressed, and secondarily to be executed. Now I know the economists out there will argue with me. And you're right. But I'm talking more from a design perspective.
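The closest existing thing I know of is Python's doctest module, which lets you embed runnable examples directly in the source as part of a function's documentation, though the compiler still doesn't infer anything from them; they are only checked, not learned from. A minimal sketch:

```python
def gcd(a, b):
    """Greatest common divisor by Euclid's algorithm.

    The examples below live in the source itself and are verified by doctest:

    >>> gcd(12, 18)
    6
    >>> gcd(7, 5)
    1
    """
    while b:
        a, b = b, a % b
    return a

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # runs the embedded examples as tests
```

Pattern-matching definitions, as noted above, are the other half: there the examples *are* the program, but only for the limited shapes the matcher understands.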

Stories are also proofs — they are illustrations of proof through action. That's why theories often start from anecdotal evidence. Anecdotes are short story proofs. Everyone uses them all the time. Story is the universal proof-language. In other words, the universal programming language.

Now, I wouldn't want to write all my programs in this universal language for practical reasons. English for example, is not optimized for expressing algorithms. But it's universal in the sense that anyone can understand and relate to stories in the language — i.e. programs in the language — because they describe life experiences, which English is optimized for.

I was talking with Kyle the other day about how I and other programmers sometimes alienate people when we speak, because we either have to clarify intuitively obvious ambiguity, or explain things by decomposing them into such detail that the explanation becomes incomprehensible to others. The other day, someone asked Kyle if the hat he was wearing was new. He replied by asking what the person meant by "new". From an intuitive perspective — the perspective of the person asking the question — Kyle should have answered simply "yes" or "no". But instead, the inquirer became slightly confused by the unexpected response, and they went back and forth going nowhere.

Kyle claims that programmers are better at recognizing puns because puns are instances of ambiguity, which programmers have been trained to spot. (I disagree though, because I find myself missing obvious puns, and I think this has to do with the seemingly under-developed right hemisphere of my brain, or at least the disconnectedness of the two hemispheres.) But that got me thinking: if programmers are good at spotting puns because puns are instances of ambiguity, wouldn't that make them bad at creating puns? I read recently in On Intelligence that part of what makes Shakespeare a genius in writing is how he can create a metaphor between two things and not actually be talking about either of them. I'm not an expert on Shakespeare, but I do know it's hard to use metaphors in writing. It's simply more abstract.

So if programming is just like writing, what would be the equivalent of a pun in programming?

A pun isn't merely ambiguity between two or more different meanings; it's actually double meaning. Triple meaning. Any or all of the meanings make sense.

So to have this in programming, take a language that has two namespaces: one for variables and one for types. You would have to declare a name in both namespaces, and then find a way to state the name such that both the variable reading and the type reading make sense.

Are there any languages out there that allow for such a thing? I'm not sure. But an obvious case of this would be for printing in a language that has type information available at runtime. You could write something like print foo where foo is defined as both a variable and a type. Which should it print? Obviously it would depend on the semantics of the language. But my point is, it could do both. Or it could look at you and smile coyly.
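One language that does allow it is Haskell, which keeps separate type and value namespaces: `data Foo = Foo` declares a type Foo and a value (its constructor) Foo in one line. Python, by contrast, has a single namespace in which types are themselves values, so the pun collapses: a name is bound to exactly one thing at a time. A sketch of that collapse (the name foo is hypothetical):

```python
class foo:
    """A type, bound to the name foo like any ordinary value."""

print(foo)    # the class object itself, e.g. <class '__main__.foo'>
print(foo())  # an instance of that class

foo = 42      # rebinding the name replaces the type binding entirely
print(foo)    # 42
```

So a genuine `print foo` pun needs genuinely separate namespaces; in a one-namespace language, the runtime never has two readings to choose between.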

Like puns and the idea of including examples directly in the source, not separately in test cases, I believe there are more good ideas to be extracted from this analogy that programming is just like writing. Can anyone out there see any more? I'm working on a post about problem-solving methods that I hope to finish soon, but analogy seems to be an amazing problem-solving technique, if not a pattern of problem-solving techniques in itself. In other words, a meta-meta-solution.

Of Little Importance

I've been going crazy writing recently. I've got 7 other posts that I started this week that I haven't had a chance to finish. The reason I've been having so many ideas for posts is partly because I've been seeing so many patterns. So why have I been seeing so many patterns about programming?

I went to hang out with my crazy-cool cousins this weekend. The kind of cousins I only see a couple times a year, but every time it's a blast. And at some point I realized from a conversation that what I'm doing with programming — becoming a relative expert on it, focusing all my time on it, reading about it, writing about it — other people do all the time with sports, video games, movies, and other things. In other words, it's no different. No better. No worse. And the more I focus on programming, the more I will prosper at it. ...But at the same time, I won't be prospering at what's important in life.

This is the pattern of wasting your life, getting caught up in something with only relative value.

I recently saw a documentary called Flight from Death: The Quest for Immortality, and it made me realize that instead of trying to live a great life, if you live your life to make death acceptable, you will end up with a life greater than you can possibly imagine. And the phrase "it's a good day to live" will become synonymous with "it's a good day to die".

I don't believe it's chance that the times I've been on the edge of death were also the times I've felt most alive. It was a coincidence — a coinciding of events. In other words, a pattern. This also points to degrees of being alive — i.e. a life/death continuum — rather than the common conception of a binary switch.

So let's remember to focus on what's important.

Sunday, September 16, 2007

PL Feature Wish List

Here's my programming language feature wish list:
  • Parallelism using list comprehensions like nesl
  • Seamless parallelism over a network like erlang, with ability to extend with encryption/security
  • Turing-complete type-system like Qi, but with implicit runtime verification
  • Turing-complete macros like Lisp (i.e. extending the compiler), but maybe just the ability to have function parameters optionally evaluated which would give you the equivalent, assuming all computations that can be evaluated at compile-time are
  • Expose the interpreter/compiler for meta-circular evaluation
  • Arbitrary meta-data like Java annotations or .NET attributes on all source code nodes, not just functions.
  • Full source reflection at runtime, compile-time, anytime (although this might not be necessary if the IDE has scripting — see below)
  • In other words, blur the distinction between compile/run/etc. times
  • Bi-directional mappings for implicit parsing and printing (i.e. syntax skinning) with ability to use them inline (which would, among other things, give you the usefulness of reader macros)
  • Seamless FFI and importing of Java libraries — we have to build on top of something so it might as well be the thing with the most well-developed libraries to make the language a viable option for production code
  • Seamless interaction with the native platform when needed
  • Infinite data structures like Haskell — I guess this means lazy evaluation
  • Seamless language-integrated persistence
  • Type inference with useful error messages when my programs fail to type-check
  • Namespaces — I've never seen a language sustain large applications without these, and many languages have failed to catch on partly for lack of them
  • Modules and true functors (i.e. functions over modules)
  • Type-classes — they're really useful
  • User-definable monads
  • Examples/test cases in source
An IDE with:
  • Searching by function type-signature, which will also return functions that can be used with simple coercion functions on the parameters (e.g. using stringOfExp : exp -> string as an implicit coercion)
  • Graphing of module and function dependencies so I can see where the entry-points are when I am trying to look for something in a module I'm not familiar with
  • Profiling which feeds back into the editor (color-coded perhaps) so I can see where I should optimize when I'm editing
  • Scripting like Emacs but in the language that it edits, not a variant of Lisp or some other random language that the editor happens to be written in
  • Incremental background compilation with automatic dependency analysis
  • Knowledge of the abstract syntax tree to allow for (scripted) refactoring. In fact, why not define parts of the language (or skin) as mappings between UI events and semantics, taking the idea of bridging the gap between abstract syntax trees and the user even further.
  • Type inference that feeds back into the editor, as if you had typed it in
  • A graphical window designer like Visual Studio's Form Builder or Xcode's Interface Builder
  • Toggling of debug output which is structured, not just text, so that more information can be expanded without re-running. Toggling should be optionally automatic by tracking which pieces of code have changed recently and only showing dataflow affected by the changes.
A library with:
  • Good data-structures (e.g. hash tables, balanced trees, heaps, graphs, etc.)
  • Hash functions on common types and typed encryption algorithms (i.e. an 'a rsa_encrypted type) with an interface that makes them easy to use (i.e. no coding cost and little knowledge of encryption algorithms required)
  • A hardware/platform abstraction layer for every-day drawing using a box-model similar to CSS
Notice that just about everything here is agnostic to what the actual expressions of the language are.

Is this technically feasible? Absolutely. Just about all of it's been done before in one form or another.
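For instance, the infinite-data-structures item can be approximated even in a strict language: Python's generators give lazy, effectively infinite streams, consumed on demand. A minimal sketch:

```python
from itertools import count, islice

def fibs():
    """An 'infinite' Fibonacci sequence, evaluated lazily on demand."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

naturals = count(0)                # infinite stream of natural numbers
print(list(islice(fibs(), 8)))     # [0, 1, 1, 2, 3, 5, 8, 13]
print(list(islice(naturals, 5)))   # [0, 1, 2, 3, 4]
```

It's not Haskell's pervasive laziness — only the generator itself is lazy, not arbitrary data — but it covers the common uses.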

Is this realistic? No. But no one ever made it anywhere interesting without dreaming big.

I'll add to this post's list when I think of something else.

Friday, September 14, 2007

Why Tagging Doesn't Work

It requires refactoring.

For example, when I bookmark something on delicious, I tag it with all the things I think are relevant to that site at the time. But the way I categorize sites, and what things I believe to be important, change over time. So the only way I can search through my bookmarks to find all the things related to a certain topic that I wasn't even aware of before is to linearly scan through all my bookmarks and re-tag them.

Is there a solution to this?

Wednesday, September 12, 2007

The Mind Is Strict

Don't think of a red ball. □

Monday, September 10, 2007

Birthday Patterns

Here's a cool experiment to try. Don't read ahead without actually doing this though or you'll ruin the results.

Choose a few people and categorize them. In one group, those who you consider to have a "down-to-earth look", and in the other group, those who you consider to be "hot". That's right. You must choose one. No one can be in both categories.

Who should you choose? Anyone whose birthday or astrological sign you can easily find out. But it helps prevent bias if you don't already know their birthday. So use people whose birthday you don't know but can easily look up, like a friend on Facebook or a celebrity.

For example, Jennifer Aniston: down-to-earth; Britney Spears: hot.

Once you've categorized the people in your list, look up their birthdays. Notice anything? A pattern?

If you don't see anything, try categorizing people whose birthday is in the first half of the year versus the second half of the year. Now do you see a pattern?

I've found this is scary-accurate. Almost to the point where it's a little creepy. I was playing this game with my friend who saw the pattern originally where we would pick a celebrity, guess if they were born in the first half of the year or the second based on how they look, and then look up their birthday. We found it worked much better for females, but this is probably due to the fact that we've been training for years judging females on looks. I suspect though that it works just as well for everyone as long as you have a trained eye.

When you start to think about this, the implications are crazy. Behavior and beauty — and as my friend theorizes, how suitable someone is as a spouse — are determined with high probability by your date of birth. I'd like to see skeptics of astrology try to explain this away.

I'm not very knowledgeable about astrology at all. I'm just beginning to notice that there really is some truth to it. We stumbled upon this little game last night after I looked at the birthdays list of my friends on Facebook. I noticed there was a disproportionately large number of birthdays in certain months. So I decided to count them up and graph them in Excel.

Wow! The results were stunning. A single month held roughly double the median number of birthdays per month among my friends. The next highest month wasn't even close. What does that say about me? ...Apparently I've been inadvertently choosing friends based on birthday. Or your birthday really does correlate with your personality, for which I have a certain preference.

I know the skeptics still won't buy it. But that's okay. Those who see will prosper anyway. Because this is not just evidence of the pre-determination of your life, but also of the rule of Balance. There are rules to the Universe — patterns — and one of them is that you can't be both smart and beautiful, unless of course you're missing something else!