Wishful Coding

Didn't you ever wish your
computer understood you?

Sorting obsession

It all started when, one day, I was thinking about the little I knew about sorting. What I knew was that sorting was usually done by comparing all items and switching them around, hitting the magical O(nlogn) barrier.

Thinking about how I would sort a deck of cards, sorting seemed more like a O(n) operation. With that in mind, I started implementing what turned out to be called Pigeonhole Sort. I also learned that sorting algorithms can be classified as being a Comparison sort or not.

(ns pigeonhole-sort)

(defn int-sort [s]
  (let [listmap (reduce #(update-in
                           (update-in %1 [%2] (fnil inc 0))
                           [:max] max %2) {:max 0} s)]
    (mapcat #(repeat (get listmap % 0) %)
            (range (inc (:max listmap))))))

With that out of the way, I started to compare my sort to the sort used by Java and Clojure, which turned out to be using Timsort, originating in Python. In my quest to understand Timsort - and why it was faster than my Pigeonhole Sort - I found this great website that graphs how different sort algorithms work.

Simply put, I was intrigued by the pattern displayed by Heap Sort. So next I found myself reading the Wikipedia pages for Binary Heaps, and shortly after that, I had started implementing that in Clojure.

This is, ladies and gentleman, a persistent heap, in all its heapyness, with a heapsort implemented on top of it.

(ns persistent-heap)

(defn swap [heap idx idy]
  (assoc heap idx (get heap idy) idy (get heap idx)))

(defn children [idx]
  (let [idx (inc (* idx 2))
        idy (inc idx)]
    [idx idy]))

(defn parent [idx]
  (if (not= 0 idx)
    (/ (- idx (if (odd? idx) 1 2)) 2)
    nil))

(defn tree 
  ([heap] (tree heap 0))
  ([heap idx]
   (let [[left right] (children idx)
         node (get heap idx nil)]
     (when node
       [node (tree heap left) (tree heap right)]))))

(defn heap-up 
  ([heap] (heap-up heap >= (dec (count heap))))
  ([heap compfn] (heap-up heap compfn (dec (count heap))))
  ([heap compfn idx]
   (if-let [par (parent idx)]
     (if (compfn (get heap idx) (get heap par))
       (recur (swap heap idx par) compfn par)
       heap)
     heap)))

(defn heap-down
  ([heap] (heap-down (pop (swap heap 0 (dec (count heap)))) >= 0))
  ([heap compfn] (heap-down (pop (swap heap 0 (dec (count heap)))) compfn 0))
  ([heap compfn idx]
   (let [[left right] (children idx)
         leftv (get heap left nil)
         rightv (get heap right nil)
         node (get heap idx nil)]
     (if (and node leftv rightv)
       (cond
         (compfn leftv (max rightv node))
           (recur (swap heap idx left) compfn left)
         (compfn rightv (max leftv node))
           (recur (swap heap idx right) compfn right)
         :else heap)
       heap))))

(deftype PersistentHeap [heap]
         clojure.lang.ISeq
         (first [this] (first heap))
         (next [this] (PersistentHeap. (heap-down heap)))
         (more [this] (.next this))
         (cons [this obj] (PersistentHeap. (heap-up (conj heap obj))))
         (seq [this] (seq heap)))

(defn persistent-heap [coll]
  (into (PersistentHeap. []) coll))

(defn heapsort [coll]
  (->> (persistent-heap coll)
       (iterate rest)
       (take (count coll))
       (map first)))

Some 'benchmark' results:

user=> (time (dorun (heapsort (shuffle (range 1e5)))))
"Elapsed time: 3148.041 msecs"
nil
user=> (time (dorun (int-sort (shuffle (range 1e5)))))
"Elapsed time: 477.781 msecs"
nil
user=> (time (dorun (sort (shuffle (range 1e5)))))
"Elapsed time: 105.354 msecs"

Note that heapsort is lazy, insofar that it does realize but does not sort the sequence untill requested. There is also a bug that allows for the last 2 items to be out of order.

I'm tired of the NoSQL buzz, and bring you SlouchDB

It has all the features your favorite NoSQL database has.

  • Fast
  • Scalable
  • Concurrent
  • Key/value store
  • Document orientated
  • MapReduce
  • In-memory
  • Persistent *
  • NoSQL
  • REST API *
  • Written in X, so...
* Requires a few extra lines.

The implementation looks like this.

(ns slouchdb)

; Fast scalable in-memory concurrent NoSQL database
(def db (ref {:bob  {:age 59 :sex :male}
              :bill {:age 17 :sex :male}
              :mary {:age 28 :sex :female}}))

;; Views ;;

(defn total-age []
  (reduce + (pmap (comp :age val) @db))) ; Parallel MapReduce!

;; Examples ;;

(println (apply str (interpose \newline
  ["Get an item"
  (:bob @db)
  "MapReduce"
  (total-age)
  "Add/update an item"
  (dosync (alter db assoc-in [:bill :age] 25))
  "Delete an item"
  (dosync (alter db dissoc :mary))])))

; For scaling, press CMD or CTRL and + or -

; For a persistent version, see java.io.Serializable

; For a REST API, check out Moustache and Aleph.
Published on

Cooperative concurrency in Clojure

Today I had an interesting discussion about cheap preemptive threads in Clojure. Someone claimed Haskell can easily start 10.000 threads without getting slow, while Clojure - and native threads in general - do get slow after a couple of thousands.

After a discussion about Agents and thread pools, I proposed cooperative concurrency as a cheap solution. Cooperative concurrency isn't exactly preemptive, but it works. This is a for-fun implementation of trampoline using a hidden gem in Clojure called PersistentQueue.

Like trampoline, recursive functions return another function, which get called in a loop. Unlike trampoline, mine takes a seq of functions, puts them into a queue and executes the frontmost function, putting the result on the back of the queue.

Below is an example of the classic mutual recursive odd and even functions running in parallel, or alternating actually. Like pcalls, but on a single thread.

(ns cooperative)

(defn cooperative-tasks
  "Trampoline over a queue for
  cooperative multitasking"
  [fs]
  (if (not-any? fn? fs)
    fs
    (recur
      (doall (for [f fs]
        (if (fn? f)
          (f)
          f))))))


(comment
  (declare my-odd?)

  (defn my-even? [n]
    (if (zero? n)
      true
      #(my-odd? (dec (Math/abs n)))))

  (defn my-odd? [n]
    (if (zero? n)
      false
      #(my-even? (dec (Math/abs n)))))

  (cooperative-tasks [#(my-odd? 50009) #(my-odd? 50008) #(my-even? 50008) #(my-even? 50009)])
)