or
Mian versus Clomian
or
How Clojure sat in a corner converting and boxing while Python did the work
or
A plot about Minecraft (pun intended)
Update: The Clojure version is now a lot faster, thanks to the people on the Clojure mailing list. I also uploaded the map I used, in case you want to compare the results.
Okay, back to business. While writing the original Python hack I had to use a few tricks that I thought would be easy to do in Clojure. So I started to wonder what the Clojure code would look like and how fast it’d be.
My original hack was kind of slow, but it’s greatly improved and now renders a whole map in under 10 seconds.
- 4s for reading all files
- 3s for calculating the graph
- 8s total
The code to read all the files:
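from glob import glob
from os.path import join

# Concatenate the raw block bytes of every chunk file in the world.
paths = glob(join(world_dir, '*/*/*.dat'))
raw_blocks = ''
for path in paths:
    nbtfile = NBTFile(path, 'rb')
    raw_blocks += nbtfile['Level']['Blocks'].value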
The code to calculate the graph:
layers = [raw_blocks[i::128] for i in xrange(127)]
counts = [[] for i in xrange(len(bt_hexes))]
for bt_index in range(len(bt_hexes)):
    bt_hex = bt_hexes[bt_index]
    for layer in layers:
        counts[bt_index].append(layer.count(bt_hex))
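To make the slicing trick concrete, here is a small sketch on made-up data. It assumes Python 3 (the code above is Python 2) and columns of 4 blocks instead of 128. The names `HEIGHT` and `layer_counts` and the sample bytes are hypothetical, chosen only for illustration.

```python
# Toy version of the layer counting above, written for Python 3.
# HEIGHT, layer_counts and the sample data are hypothetical; the real
# data uses columns of 128 blocks.
HEIGHT = 4


def layer_counts(raw_blocks, block_types, height=HEIGHT):
    """Return counts[type_index][layer] for column-by-column block data."""
    # Every height-th byte sits at the same altitude, so a strided
    # slice pulls out one whole layer at a time.
    layers = [raw_blocks[i::height] for i in range(height)]
    # bytes.count accepts an int in Python 3, so the counting itself
    # never touches individual Python objects.
    return [[layer.count(bt) for layer in layers] for bt in block_types]


# Three columns of four blocks each, using made-up ids:
# 1 = stone, 3 = dirt, 0 = air.
raw = bytes([1, 1, 3, 0,
             1, 3, 3, 0,
             1, 1, 0, 0])
counts = layer_counts(raw, block_types=[0, 1, 3])
print(counts)  # [[0, 0, 1, 3], [3, 2, 0, 0], [0, 1, 2, 0]]
```

Each inner list is one block type, with one count per layer from bottom to top.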
Nice, eh? Now the Clojure version. Clojure doesn’t have a nice glob module, so I’ll spare you the code that gets me the data. Suffice it to say that it also runs in about 4 seconds.
My initial version of the calculation was short and sweet and looked like this:
(defn freqs [blocks]
  (->> blocks
       (partition 128)
       (apply map vector)
       (pmap frequencies)))
Now, this is twice as fast as what I currently have, but it has a problem. While Python operates on bytes the whole time, these lines of Clojure operate on a sequence of objects. These objects are just a tad bigger than the bytes in a string, so keeping 99844096 of those in memory is impossible.
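A rough back-of-the-envelope calculation shows why this hurts. The block count comes from the text. The per-element sizes are assumptions picked for illustration, not measurements of what Clojure actually allocates.

```python
# Rough memory estimate for the block data. The per-element sizes are
# assumptions for illustration, not measurements.
BLOCKS = 99844096          # number of blocks, from the text
COLUMN_HEIGHT = 128        # blocks per column

ASSUMED_OBJECT_BYTES = 16  # assumed size of one boxed value on a 64-bit JVM
ASSUMED_REF_BYTES = 8      # assumed size of the reference pointing to it

MIB = 1024 * 1024

columns = BLOCKS // COLUMN_HEIGHT
as_bytes_mib = BLOCKS / MIB
as_objects_mib = BLOCKS * (ASSUMED_OBJECT_BYTES + ASSUMED_REF_BYTES) / MIB

print(f"{columns} columns")                         # 780032 columns
print(f"{as_bytes_mib:.0f} MiB as raw bytes")       # 95 MiB as raw bytes
print(f"{as_objects_mib:.0f} MiB as objects, before any sequence overhead")
```

Under these assumptions the same data grows from roughly 95 MiB to over 2 GiB, and that is before counting whatever the sequences themselves cost.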
So I either had to find a way to make Clojure throw away the objects it had already processed, or make it use more compact storage for them. I tried both. I ended up with a function to concatenate Java arrays, but working with raw arrays is a real pain, so my function uses them wrapped in Clojure goodness and makes sure the Java GC can throw the objects out as soon as I am done with them.
(defn freqs [blocks]
  (->> blocks
       (partition 128)
       (reduce (fn [counts col]
                 (doall (map #(assoc! %1 %2 (inc (get %1 %2 0))) counts col)))
               (repeatedly 128 #(transient {})))
       (map persistent!)))
This is not threaded like the previous example, but it works. Everything I tried to make it use all my cores either started to eat more and more memory, or was slower than the single-threaded one. Most attempts were both.
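For readers more at home in Python, the idea behind the fold in the Clojure function can be sketched as follows: keep one counter per layer and update them one column at a time, so no more than one column has to be alive at once. This is a Python 3 translation of the idea, not the code I actually ran. The names `iter_columns` and `streaming_freqs` are hypothetical, and the toy data uses columns of 4 blocks instead of 128.

```python
# Python 3 sketch of the fold in the Clojure function: one counter per
# layer, updated one column at a time. iter_columns and streaming_freqs
# are hypothetical names; the toy data uses columns of 4 blocks.
from collections import Counter


def iter_columns(raw_blocks, height):
    """Yield one column of `height` blocks at a time."""
    for start in range(0, len(raw_blocks), height):
        yield raw_blocks[start:start + height]


def streaming_freqs(columns, height):
    """Return a list with one Counter of block types per layer."""
    counts = [Counter() for _ in range(height)]
    for col in columns:
        # Pair each layer's counter with the block at that height.
        for layer_counter, block in zip(counts, col):
            layer_counter[block] += 1
    return counts


# Three columns of four blocks each, using made-up block ids.
raw = bytes([1, 1, 3, 0,
             1, 3, 3, 0,
             1, 1, 0, 0])
freqs = streaming_freqs(iter_columns(raw, 4), 4)
for layer, counter in enumerate(freqs):
    print(layer, dict(counter))
```

The per-layer counters play the role of the 128 transient maps, and the loop over columns is the `reduce`.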
So, how fast is it?
- 5s file reading
- Over a minute of processing
- Over a minute + 5s total
Wait, what? Python did this in 3 seconds, right? Yeah… So even if I had used the faster function and had 10 GB of RAM, it’d still be 10 times slower.
Why? I don’t know. All I can come up with is that Python just acts on a string, while Clojure does boxing and converting 99844096 times. If you happen to know what’s wrong, or how to make it faster, be sure to tell me!