Wishful Coding

Didn't you ever wish your
computer understood you?

My bookshelf 2/5: The Reasoned Schemer

This is the second part of my bookshelf, a series of posts about a couple of books I'm reading.

Unlike the CouchDB book, The Reasoned Schemer(TRS henceforth) is a book that you don't simply read on the couch, but actually study. This is why it took weeks rather than days to come to this point, where I understand how to use(but not implement) miniKanren.

The book has a gradient on the cover, which gives it the impression of a technical book(it's from MIT press), but it actually contains drawings and the typesetting is brilliant(it is written with some sort of LaTeX I believe). It is also very suitable for vegetarians, since it uses vegetarian food rather than foo/bar/baz.
My bookshelf 2/5: The Reasoned Schemer
The book is written in questions and answers, both in separate columns. You could overlay the answers, and try them yourself first.

TRS describes miniKanren, a logic programming language. It is the bare essence of Prolog and Kanren, written in a few hundred lines of Scheme.

I used Logos, an implementation in Clojure by David Nolen, to follow the examples in the book. This worked out very well, but there are a couple of minor differences.
  • fresh is called exist in Logos.
  • Logos uses vectors for binding variables.
  • Everything is interleaving, so there are no cond-i and all-i.
  • Clojure does hot have pairs, and only proper conses are allowed, it uses lcons instead.
  • Clojure has first and rest, so car-o and cdr-o are called after their clojure.core counterparts.
Later in the book, an arithmetic system is implemented in miniKanren. I attempted to port it to Logos, though it is not complete and bug free yet.

I do not yet fully understand the last few chapters of the book, one of which describes unification. The last chapter is unique in the book in that it does not use the question-answer style, but rather explains the inner workings of miniKanren in tightly packed chunks of information.

I really enjoyed the book, and I hope to find some good uses for my newly acquired powers.

My bookshelf 1/5: CouchDB

I recently ordered four programming books from Amazon, and one from Manning. I'm planning to share some comments, insights, thoughts and something akin to a review, as I 'resurface' on the web in between my reading spurts.

My bookshelf 1/5: CouchDB
I started with CouchDB: A definitive Guide, because I expected it to be the least mind-taxing of the five, and because I am considering CouchDB for two projects of mine.

The book contains a lot of good information, but is also very messy in places. It is baginer-paced, but omits a lot of details.

I found the first chapters, the chapter about CouchApp and the appendices to be the most informative.

The first chapters take you on a high-level tour of CouchDB, with a great deal of fine information. I was pleasantly reminded of the nature of MVCC and B-trees.

Multi Version Concurrency Control means there is absolutely no locking involved, so reads are really fast. Even more so, because views are generated on writing.

Because both databases and views are stored as B-trees, a surprising number of actions are completed in logarithmic time. Surprising ones include updating a (re)reduced view, and taking a range from a large number of items.

Another thing to remember is that everything is sorted by keys, and that keys can be complex types, such as lists.

After about half of the book, I started to skip sections, and I completely skipped the chapters about scaling, which seem nearly inappropriate for a beginner-level book.

One chapter I did read thoroughly is the chapter about CouchApp. CouchApp makes it really easy to maintain and deploy complex web applications that run sans lingua(I made that up; it means "nothing but CouchDB").

That chapter was really helpful, but comes with a flip-side. The information about CouchApp on the web is really sparse:
This blog post is in response to a lot of well-deserved confusion in the community around CouchApps.
The appendices mainly go into detail about things like installation and the details about the use of B-trees.

So, that's it for now. I'd call this book a "good mess". Keep an eye on this blog or my Twitter feed for the next books, and my first CouchDB apps.

Take n distinct random items from a vector

Another IRC hacking session.

plot

plot

(ns test
  (:use (incanter core charts)))

; naive, O(n+m)
(defn take-rand1 [n coll] (take n (shuffle coll)))

; lazy, O(n!@#$%m^&)
(defn take-rand2 [n coll]
  (take n (distinct (repeatedly #(rand-nth coll)))))

; reduce, reorder, subvec, O(m)
(defn take-rand3 [nr coll]
  (let [len (count coll)
        ; why doesn't rand-int take a start?
        rand-int (fn [lo hi] (+ lo (rand-int (- hi lo))))]
    (subvec (->> (range nr)
                 (reduce #(conj %1 [%2 (rand-int %2 len)]) [])
                 (reduce
                   (fn swap [a [i b]]
                      (assoc a b (get a i) i (get a b)))
                   coll))
            0 nr)))

; amalloy, O(m)
(defn take-rand4 [num coll]
  (first
   (nth
    (iterate (fn [[ret candidates]]
               (let [idx (rand-int (count candidates))]
                 [(conj ret (candidates idx))
                  (subvec (assoc candidates idx (candidates 0))
                          1)]))
             [[]
              coll])
    num)))

(defn t [f]
  (let [start (. System (nanoTime))]
    (f)
     (- (. System (nanoTime)) start)))

(defn plot-len [f n]
  (let [coll (vec (range n))]
    (t #(doall (f 1000 coll)))))

(defn plot-take [f n]
  (let [coll (vec (range 100000))]
    (t #(doall (f n coll)))))

(def x (range 1000 100000 1000))

(-> (scatter-plot x (map (partial plot-len  take-rand1) x) :series-label "shuffle" :legend true)
    (add-points   x (map (partial plot-len  take-rand2) x) :series-label "filtered")
    (add-points   x (map (partial plot-len  take-rand3) x) :series-label "reduce")
    (add-points   x (map (partial plot-len  take-rand4) x) :series-label "iterate")
    (save "len.png")) ; http://yfrog.com/5r80779474p

(-> (scatter-plot x (map (partial plot-take take-rand1) x) :series-label "shuffle" :legend true)
    (add-points   x (map (partial plot-take take-rand2) x) :series-label "filtered")
    (add-points   x (map (partial plot-take take-rand3) x) :series-label "reduce")
    (add-points   x (map (partial plot-take take-rand4) x) :series-label "iterate")
    (save "take.png")) ; http://yfrog.com/mr42647628p
Published on