Wishful Coding

Didn't you ever wish your
computer understood you?

The Origin of Language

Dutch is an interesting language. It seems we’re at a crossroads of German, French and English influences.

During the Roman Empire, the river Rhine (Rijn) was the border between the Germanic lands and the Roman Empire, from whose languages German and Italian descend [1].

Dutch is very close to German, but it was and is influenced by the ‘world languages’ of the time (the world was a lot smaller back then). And so it happened that during 1800-1900 or so, French was spoken by the elite, and we were part of the French empire at times.

Map of French and Roman empire

If you look at a Dutch dictionary today, you can divide most words into three categories: words that are a lot like German, archaic and chic words from French, and modern words from English.

Dutch is a lot like Common Lisp or Scala [2]. If we made a cheesy map of Europe with languages and paradigms overlaid, you’d see that natural languages are complected and very multi-paradigm.

Programming languages are much more designed, and mostly unambiguous, like Esperanto or Lojban. Computers don’t like ambiguity. Some languages, like Scheme, are still designed for growth though.

[Embedded video]

Near the end Guy Steele argues that programming languages need to be more like natural ones, and the other way around.

As we have seen earlier, natural languages grow, and grow a lot. But I know from experience that adding words that are not native to your language feels forced at times, does not fit the grammar well, and leads to extra complexity.

On the other hand, natural languages should be simpler. But I argue that this has nothing to do with syllables, but with choosing simple words with but one meaning, concerning one thing.

But do our programming languages have this property? Rich Hickey argues this is often not the case, and explains the word ‘simple’ in more detail [3].

I can only wonder what future languages will look like. Since our applications will be limited by our understanding, programming will be an art of omission and simplicity.

Last video, I promise. ‘Uncle Bob’ Martin shows us what progress we have made in software development. Not much, compared to Moore’s law. We’re still programming Lisp, and doing assignment, branching and iteration.

What we did with structured programming, with object-oriented programming and with functional programming, was take stuff away. We converted conventions to rules.

Maybe future languages will have rules about simplicity?

  1. I’m not a historian, I never followed a single history lesson. Take with a spoon of salt. 

  2. More salt please; at least we’re back at programming. 

  3. I can’t seem to embed an InfoQ video. 

My Bookshelf 5/5: Making Ideas Happen

Great book, read it. And if you don’t, scan the index, it reads like a list of do’s and don’ts.

Making Ideas Happen

The book describes the Action Method, which revolves around persisting, following up, managing action steps (todos that start with a verb) and relentless execution. It’s not easy, but it makes sense.

I actually read this book during summer holiday. I’ve been, uh… practicing since then. One thing that stuck with me in particular is that connectivity is the inverse of productivity.

I found it much easier to focus on the action step at hand without all the distractions from email, IM, Twitter, etc. But when you are offline, you need to prepare yourself with all the docs and libs you need. Put the whole internet on a floppy if you can [1].

Internet on a Floppy

Behance developed a web application specifically for following the Action Method, but as you can imagine, having your task manager ‘in the cloud’ while working offline is not ideal.

I need something that syncs between my laptop and my desktop, but works great offline. CouchDB seemed to fit the bill perfectly. After I started a homebrew solution and one based on the Backbone MVC framework, I found this.

Those who don’t understand UNIX are condemned to reinvent it, poorly. – Henry Spencer

Why not use mighty tools such as text files and rsync (or Dropbox)? Googling for todo.txt shows I’m not the first to have that idea. There is a more or less agreed-upon format, and even a GUI.
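To give an idea of how little machinery that format needs, here is a minimal sketch of a line parser. It assumes the commonly used todo.txt conventions (a leading `x ` for done items, an optional `(A)` style priority, `+project` and `@context` tags); the function name and the returned dictionary are my own invention, not part of any todo.txt tool.

```python
import re

def parse_todo_line(line):
    """Parse one todo.txt-style line into a small dict (illustrative only)."""
    line = line.strip()

    # Completed tasks start with a lowercase "x" followed by a space.
    done = line.startswith("x ")
    if done:
        line = line[2:]

    # An optional priority looks like "(A) " at the start of the task.
    priority = None
    match = re.match(r"\(([A-Z])\) ", line)
    if match:
        priority = match.group(1)
        line = line[match.end():]

    return {
        "done": done,
        "priority": priority,
        "projects": re.findall(r"(?:^|\s)\+(\S+)", line),  # +project tags
        "contexts": re.findall(r"(?:^|\s)@(\S+)", line),   # @context tags
        "text": line,
    }

task = parse_todo_line("(A) Download CouchDB docs +offline @laptop")
finished = parse_todo_line("x Clean the kitchen @home")
```

Because every task is one line of text, rsync or Dropbox can sync the file without knowing anything about its contents.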

Good luck with that, I’m going back to my task list, which features cleaning the kitchen.

  1. Half of my action steps at the time consisted of “Download X” 

Understand SQL, learn NoSQL

I learned SQL when I started PHP. I found a website named Tizag, where they had SQL tutorials. I installed phpMyAdmin, created tables, ran queries like SELECT * FROM pages WHERE foo IS bar LEFT JOIN ON comments or whatever. It was magic.

No one ever explained to me how it stored the information, or how it was so fast (or slow). They did say that indexes made stuff faster, sometimes.

On the other hand, when you read the CouchDB guide, they do not only teach you their query language, but also a lot about how stuff works. A lot of this also applies to SQL databases.

Storage

CouchDB uses a B-tree to store documents. This provides O(log n) lookup, update, etc. rather than O(n) scanning of all documents. It seems most SQL databases use a B-tree as well, but not always.

Indexes

When you add a WHERE clause to your query and there is no index on the field, the database has to look at all rows (or documents) for a match.

If you add an index to the field, you get a sorted representation of that field. This way you can get single items or ranges (time > 123456) in logarithmic time, using binary search.
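A minimal sketch of that idea, using Python's `bisect` module on a sorted list instead of a real B-tree. The rows, the `time` field and the function names are made up for illustration; an actual database keeps the index on disk and updates it on every write.

```python
import bisect

# Hypothetical table rows as (row id, time), stored in insertion order.
rows = [(1, 500), (2, 123500), (3, 90), (4, 123456), (5, 200000)]

# The "index": (time, row id) pairs kept sorted by time.
index = sorted((t, row_id) for row_id, t in rows)

def find_equal(index, t):
    """Row ids with time == t, located by binary search."""
    lo = bisect.bisect_left(index, (t,))
    hi = bisect.bisect_right(index, (t, float("inf")))
    return [row_id for _, row_id in index[lo:hi]]

def find_greater(index, t):
    """Row ids with time > t: everything to the right of t in the index."""
    start = bisect.bisect_right(index, (t, float("inf")))
    return [row_id for _, row_id in index[start:]]

exact = find_equal(index, 123456)
later = find_greater(index, 123456)
```

Finding the starting position takes O(log n) comparisons; a range query then just reads neighbouring entries, which is why sorted indexes are good at ranges.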

CouchDB gives you a ‘view’ of the _id of a document, but you have to create other views yourself to get the equivalent of a WHERE clause. (What _id is in CouchDB is your primary key in SQL.)
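Real CouchDB views are map functions written in JavaScript that emit key/value pairs. The sketch below only mimics the idea in Python: the documents, `by_author`, `build_view` and `query_view` are all hypothetical names, and the sorted lists stand in for the B-tree CouchDB would build.

```python
import bisect

# Hypothetical documents, keyed by _id.
docs = {
    "a1": {"_id": "a1", "type": "comment", "author": "alice"},
    "b2": {"_id": "b2", "type": "post", "author": "bob"},
    "c3": {"_id": "c3", "type": "comment", "author": "bob"},
}

def by_author(doc):
    """Map function: yield (key, value) pairs for one document."""
    yield doc["author"], doc["_id"]

def build_view(docs, map_fn):
    """Run the map function over every document and sort the pairs by key."""
    pairs = sorted(pair for doc in docs.values() for pair in map_fn(doc))
    keys = [key for key, _ in pairs]
    values = [value for _, value in pairs]
    return keys, values

def query_view(view, key):
    """Roughly WHERE author = key, answered by binary search on the view."""
    keys, values = view
    lo = bisect.bisect_left(keys, key)
    hi = bisect.bisect_right(keys, key)
    return values[lo:hi]

author_view = build_view(docs, by_author)
bobs_docs = query_view(author_view, "bob")
```

The view plays the same role as an SQL index on `author`: it is built once, kept sorted, and queried instead of scanning every document.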

Locking & Transactions

NoSQL databases are infamous for their lack of locking and transactions. Why? For the sake of scalability.

Let’s ignore for a moment that 90% of all apps can run on a single server. The idea is that creating a transaction synchronously on a whole cluster is nearly impossible, let alone fast. So you just don’t do it at all, in NoSQL land.

The flip side is that not locking at all allows reads to be faster. More on that in the next section.

It is interesting to note that both CouchDB and PostgreSQL use MVCC, allowing for reads without locking. So this is not unique to NoSQL databases.

History & Recovery

CouchDB stores its data in append-only B-trees, meaning that existing data is never changed in place; updates are written as new data at the end of the file.

Because old data is still there and immutable, readers can access it without waiting for a write to complete. It is even possible to read old revisions of the data.

What is maybe even more interesting is that, if the server crashes in the middle of an update, the old data is still there.

InnoDB also applies a similar technique, unlike MyISAM, which needs to scan and repair the whole database.
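To make the append-only idea concrete, here is a toy sketch. It is not how CouchDB is implemented (it uses a plain list and a linear scan rather than a B-tree on disk), and the class and method names are invented; it only shows why old revisions stay readable and why a half-finished write does no harm.

```python
class AppendOnlyStore:
    """Toy append-only store: writes add records, nothing is overwritten."""

    def __init__(self):
        self.log = []       # every revision ever written, oldest first
        self.committed = 0  # number of log entries visible to readers

    def write(self, key, value, crash_before_commit=False):
        self.log.append((key, value))
        if crash_before_commit:
            return  # simulated crash: record appended, commit marker never moved
        self.committed = len(self.log)

    def recover(self):
        """After a crash, drop whatever was appended after the last commit."""
        del self.log[self.committed:]

    def read(self, key, at=None):
        """Latest committed value for key, or its value as of an older commit."""
        end = self.committed if at is None else min(at, self.committed)
        for k, v in reversed(self.log[:end]):
            if k == key:
                return v
        return None

store = AppendOnlyStore()
store.write("title", "v1")
snapshot = store.committed           # a reader remembers this point in time
store.write("title", "v2")
old_value = store.read("title", at=snapshot)
store.write("title", "v3", crash_before_commit=True)
value_after_crash = store.read("title")
store.recover()
```

Readers never wait for writers because they only look at entries before the commit point, and recovery is just forgetting the uncommitted tail.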

Joins

Basically the same story as with transactions: you don’t want to scour your whole cluster looking for all comments referencing a blog post.

The high-performance way to do joins is to not do joins. Instead, unlearn everything you learned about normalization, and denormalize.
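A small sketch of what denormalizing means in practice, with made-up posts and comments: the comments are copied into the post document, so reading a post never needs a second lookup.

```python
# Normalized: posts and comments live in separate collections (hypothetical data).
posts = [{"id": "p1", "title": "Hello"}]
comments = [
    {"post_id": "p1", "author": "alice", "text": "First!"},
    {"post_id": "p1", "author": "bob", "text": "Thanks"},
]

def denormalize(posts, comments):
    """Embed each post's comments in the post document itself."""
    documents = {}
    for post in posts:
        documents[post["id"]] = {"title": post["title"], "comments": []}
    for comment in comments:
        documents[comment["post_id"]]["comments"].append(
            {"author": comment["author"], "text": comment["text"]}
        )
    return documents

documents = denormalize(posts, comments)

# Reading a post together with its comments is now a single lookup by id.
post = documents["p1"]
```

The price is duplication: if a comment changes, every document that embeds it has to be rewritten.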

The other way to do it teaches us more about SQL joins though. Basically you create an index or view on the ‘foreign key’, and run a separate query to get the correct documents. Here is an elaborate example.
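A minimal sketch of that second approach, again with invented data and names. The ‘index’ here is a plain dictionary keyed on the foreign key rather than a sorted view, which is enough to show the two separate lookups.

```python
from collections import defaultdict

# Hypothetical data: posts keyed by id, comments pointing back via post_id.
posts = {"p1": {"title": "Hello"}, "p2": {"title": "Again"}}
comments = [
    {"id": "c1", "post_id": "p1", "text": "First!"},
    {"id": "c2", "post_id": "p2", "text": "Nice"},
    {"id": "c3", "post_id": "p1", "text": "Thanks"},
]

def build_foreign_key_index(comments):
    """Group comments by the post they reference (the 'foreign key')."""
    index = defaultdict(list)
    for comment in comments:
        index[comment["post_id"]].append(comment)
    return index

def post_with_comments(post_id, posts, index):
    post = posts[post_id]             # query 1: lookup by primary key
    related = index.get(post_id, [])  # query 2: lookup in the foreign key index
    return {"title": post["title"], "comments": [c["text"] for c in related]}

comment_index = build_foreign_key_index(comments)
joined = post_with_comments("p1", posts, comment_index)
```

An SQL database does much the same thing behind a JOIN when the foreign key column is indexed; without that index it falls back to scanning the comments.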

Conclusion

Neither SQL nor NoSQL databases are magic, and they are even pretty similar in most ways. Don’t follow the hype, choose wisely.