Wishful Coding

Didn't you ever wish your
computer understood you?

Lazy sorting

I was inspired by Edmund's post to try some other lazy sorts. These are far from competitive to Java's timsort, but they are an interesting concept.

It is not possible to make timsort lazy because it uses insertion sort which does not work that way.

Insertion sort is very similar to selection sort. As in selection sort, after k passes through the array, the first k elements are in sorted order. For selection sort these are the k smallest elements, while in insertion sort they are whatever the first k elements were in the unsorted array.

A chunked lazy timsort might be attempted though. Another option might be to use selection sort instead, or use another compounded sort like introsort.

Update: I added an insertion sort and a chunked lazy timsort, without all the frills that make it actually run fast.

all    #'clojure.core/sort "Elapsed time: 67.987 msecs"
sorted #'clojure.core/sort "Elapsed time: 22.214 msecs"
first  #'clojure.core/sort "Elapsed time: 56.249 msecs"

all    #'test/qsort "Elapsed time: 717.93 msecs"
sorted #'test/qsort "Elapsed time: 12388.316 msecs"
first  #'test/qsort "Elapsed time: 30.976 msecs"

all    #'test/msort "Elapsed time: 1317.334 msecs"
sorted #'test/msort "Elapsed time: 945.077 msecs"
first  #'test/msort "Elapsed time: 708.803 msecs"

all    #'test/hsort "Elapsed time: 83.909 msecs"
sorted #'test/hsort "Elapsed time: 75.235 msecs"
first  #'test/hsort "Elapsed time: 28.91 msecs"

all    #'test/isort "Elapsed time: 140.147 msecs"
sorted #'test/isort "Elapsed time: 23.097 msecs"
first  #'test/isort "Elapsed time: 110.453 msecs"

all    #'test/tsort "Elapsed time: 310.026 msecs"
sorted #'test/tsort "Elapsed time: 187.124 msecs"
first  #'test/tsort "Elapsed time: 72.164 msecs"
(ns lazy-sort)

(defn qsort [[head & tail]]
  (when head
    (lazy-cat (qsort (filter #(< % head) tail))
              [head]
              (qsort (remove #(< % head) tail)))))

(defn merge*
  [[f1 & r1 :as l1] [f2 & r2 :as l2]]
  (cond
    (nil? f1) l2
    (nil? f2) l1
    (<= f1 f2) (lazy-seq (cons f1 (merge* r1 l2)))
    (> f1 f2) (lazy-seq (cons f2 (merge* l1 r2)))))

(defn msort [s]
  (if (next s)
    (let [[left right] (split-at (/ (count s) 2) s)]
      (merge* (msort left) (msort right)))
    s))

(defn hsort [s]
  (let [heap (java.util.concurrent.PriorityBlockingQueue. s)]
    (repeatedly (count s) #(.take heap))))

(defn insert [^java.util.LinkedList s x]
  (let [i (.listIterator s (.size s))]
    (while (and (.hasPrevious i) (> (.previous i) x)))
    (.add i x)
    s))

(defn isort [s] ; not lazy
  (seq (reduce insert (java.util.LinkedList.) s)))

(defn treeduce [f s]
  (loop [s (partition-all 2 s)]
    (let [s (map (fn [[x y]] (f x y)) s)]
      (if (< 1 (count s))
        (recur (partition-all 2 s))
        (first s)))))

(defn tsort [s] ; naive
  (treeduce merge* (map isort (partition-all 128 s))))

(defn speed []
  (doseq [s [#'sort #'qsort #'msort #'hsort #'isort #'tsort]
          :let [sort (var-get s)]]
    (println)
    (print "all   " s "")
    (time (dotimes [_ 100] (dorun (sort (shuffle (range 1000))))))
    (print "sorted" s "")
    (time (dotimes [_ 100] (dorun (sort (range 1000)))))
    (print "first " s "")
    (time (dotimes [_ 100] (first (sort (shuffle (range 1000))))))))

Published on

Leiningen project-in-a-gist

Since I wrote a pattern matcher in under 50 lines of code, I do not feel like setting up an entire project for it, but I do want to use it in other projects and share it with you.

Those great folks at Github added git access to Gists a while back, and it turns out you can make Leiningen run in a flat structure with a few extra lines of project.clj.

This allows you to make cute mini projects with git access that can be installed and published and are embeddable and sharable.

Here is the project.clj from my pattern matcher, I also pushed the jar to clojars.

(defproject seqex "1.0.0-SNAPSHOT"
  :description "a tiny pattern matcher"
  :dependencies [[org.clojure/clojure "1.2.1"]]
  :source-path ""
  :aot [seqex]
  :omit-source true)

First we tell Leiningen that our :source-path is the root dir. The other 2 lines are a bit of a hack to make sure only seqex.clj gets into the jar, Otherwise you'll get conflicts and broken jars.

Clojure micro pattern matcher

Today I had some ideas for my game related to cellular automata that involved pattern matching. I googled, and found Matchure.

It might be a great pattern matcher, but the syntax and the amount of macro trickery makes me feel uncomfortable.

In an attempt to write one that relies only on functions, I made this.

Update: This is now a cute project-in-a-gist that can be used as a checkout dependency in Leiningen and Cake.

(ns seqex
  (:import [clojure.lang
            Fn
            IPersistentVector
            IPersistentMap]))

(defn cps [funct cont]
  (fn [[f & r]]
    (when-let [f (funct f)]
      (when-let [r (cont r)]
        (if (true? r)
          (if (true? f) [] [f])
          (if (true? f) r (cons f r)))))))
;    (and (funct f) (cont r))))

(def _ (constantly true))

(def end nil?)

(defmulti matcher type)

(defn match [pattern items]
  ((reduce #(cps (matcher %2) %1)
           (rseq pattern))
     items))

(defmethod matcher :default [l]
  (partial = l))

(defmethod matcher Fn [f] f)

(defmethod matcher IPersistentVector [pattern]
  (partial match pattern))

(defmethod matcher IPersistentMap [pattern]
  (fn [m]
    (every? identity
      (for [[k v] pattern]
        (when-let [mv (get m k)]
          ((matcher v) mv))))))

(defmethod matcher java.util.regex.Pattern [pattern]
  (partial re-find pattern)) ; returns match

(defn store [pattern]
  (fn [s]
    (when ((matcher pattern) s)
      s)))

(defmacro cond-let
  [& clauses]
    (when clauses
      (list `if-let (first clauses)
            (if (next clauses)
                (second clauses)
                (throw (IllegalArgumentException.
                         "cond-let requires an even number of forms")))
            (cons `cond-let (next (next clauses))))))
(match
 [1
  [3 _ end]
  {:a even?, :b #(= 2 (count %))}
  end]
 [1 [3 5] {:a 4, :b [1 2], :d :a}])
; [] (is truthy)

(match [1 2 3 _] [1 2 3 4 5 6 7 8])
; []

(let [[x y] (match
              [1 #"\w.+?\w" 3 (store _) end]
              [1 "foo bar"  3        4])]
  (dotimes [_ y] (println x)))
; foo
; foo
; foo
; foo
; nil
(defproject seqex "1.0.0-SNAPSHOT"
  :description "a tiny pattern matcher"
  :dependencies [[org.clojure/clojure "1.2.1"]]
  :source-path ""
  :aot [seqex]
  :omit-source true)