Wishful Coding

Didn't you ever wish your computer understood you?

Lazy sorting

I was inspired by Edmund's post to try some other lazy sorts. These are far from competitive to Java's timsort, but they are an interesting concept.

It is not possible to make timsort lazy because it uses insertion sort which does not work that way.

Insertion sort is very similar to selection sort. As in selection sort, after k passes through the array, the first k elements are in sorted order. For selection sort these are the k smallest elements, while in insertion sort they are whatever the first k elements were in the unsorted array.

A chunked lazy timsort might be attempted though. Another option might be to use selection sort instead, or use another compounded sort like introsort.

Update: I added an insertion sort and a chunked lazy timsort, without all the frills that make it actually run fast.

all    #'clojure.core/sort "Elapsed time: 67.987 msecs"
sorted #'clojure.core/sort "Elapsed time: 22.214 msecs"
first  #'clojure.core/sort "Elapsed time: 56.249 msecs"

all    #'test/qsort "Elapsed time: 717.93 msecs"
sorted #'test/qsort "Elapsed time: 12388.316 msecs"
first  #'test/qsort "Elapsed time: 30.976 msecs"

all    #'test/msort "Elapsed time: 1317.334 msecs"
sorted #'test/msort "Elapsed time: 945.077 msecs"
first  #'test/msort "Elapsed time: 708.803 msecs"

all    #'test/hsort "Elapsed time: 83.909 msecs"
sorted #'test/hsort "Elapsed time: 75.235 msecs"
first  #'test/hsort "Elapsed time: 28.91 msecs"

all    #'test/isort "Elapsed time: 140.147 msecs"
sorted #'test/isort "Elapsed time: 23.097 msecs"
first  #'test/isort "Elapsed time: 110.453 msecs"

all    #'test/tsort "Elapsed time: 310.026 msecs"
sorted #'test/tsort "Elapsed time: 187.124 msecs"
first  #'test/tsort "Elapsed time: 72.164 msecs"
(ns lazy-sort)

(defn qsort [[head & tail]]
  (when head
    (lazy-cat (qsort (filter #(< % head) tail))
              [head]
              (qsort (remove #(< % head) tail)))))

(defn merge*
  [[f1 & r1 :as l1] [f2 & r2 :as l2]]
  (cond
    (nil? f1) l2
    (nil? f2) l1
    (<= f1 f2) (lazy-seq (cons f1 (merge* r1 l2)))
    (> f1 f2) (lazy-seq (cons f2 (merge* l1 r2)))))

(defn msort [s]
  (if (next s)
    (let [[left right] (split-at (/ (count s) 2) s)]
      (merge* (msort left) (msort right)))
    s))

(defn hsort [s]
  (let [heap (java.util.concurrent.PriorityBlockingQueue. s)]
    (repeatedly (count s) #(.take heap))))

(defn insert [^java.util.LinkedList s x]
  (let [i (.listIterator s (.size s))]
    (while (and (.hasPrevious i) (> (.previous i) x)))
    (.add i x)
    s))

(defn isort [s] ; not lazy
  (seq (reduce insert (java.util.LinkedList.) s)))

(defn treeduce [f s]
  (loop [s (partition-all 2 s)]
    (let [s (map (fn [[x y]] (f x y)) s)]
      (if (< 1 (count s))
        (recur (partition-all 2 s))
        (first s)))))

(defn tsort [s] ; naive
  (treeduce merge* (map isort (partition-all 128 s))))

(defn speed []
  (doseq [s [#'sort #'qsort #'msort #'hsort #'isort #'tsort]
          :let [sort (var-get s)]]
    (println)
    (print "all   " s "")
    (time (dotimes [_ 100] (dorun (sort (shuffle (range 1000))))))
    (print "sorted" s "")
    (time (dotimes [_ 100] (dorun (sort (range 1000)))))
    (print "first " s "")
    (time (dotimes [_ 100] (first (sort (shuffle (range 1000))))))))

Pepijn de Vos