Wishful Coding

Didn't you ever wish your computer understood you?

Allocation-agnostic programming

If you define a variable, you use code to generate that data. This is the same concept that underlies atoms in Clojure: Your change is expressed as a function that can be repeated until it succeeds. So what if you maintain a relation between the data and the code that generated it?

This is kind of what happens in Haskell, with its lazy evaluation. You don’t really know if the thing you’re using is evaluated. But you do know it’s only evaluated once, and kept around for as long as referenced.

What if you embrace functional purity, and allow deallocation and repeated evaluation? Below is a silly Clojure function that implements a future that may throw away and recompute your result. Variations on this theme are possible that compute in the current thread for less overhead, or additionally wrap the future in java.lang.ThreadLocal to avoid thrashing the cache. (This seems similar to durable-ref, in a way.)

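(defn soft-future-call [f]
  ;; Hold the future through a SoftReference so the GC may reclaim it under memory pressure.
  (let [sref (atom (java.lang.ref.SoftReference. (future-call f)))]
    (reify clojure.lang.IDeref
      (deref [this]
        (if-some [fut (.get @sref)]
          ;; Still reachable: block on the future and return its value.
          (deref fut)
          (do
            ;; Collected: start the computation again and retry.
            (reset! sref (java.lang.ref.SoftReference. (future-call f)))
            (deref this)))))))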

I can’t remember where I read it, but it is said that for single-core machines the fastest code reuses as much data as possible, while on multi-core machines it is often faster to avoid sharing data, and recompute it locally. It may also be interesting for larger-than-memory problems. For example, compiling a huge codebase can use a lot of RAM. Maybe it turns out that recomputing parts of the data is faster than relying on swap space. So it may be an interesting paradigm to write code that abstracts away the distinction between a function and its result.

Well, it’s just a rough idea. Maybe it turns out to be really mundane, boring, and annoying. Maybe it turns out to be really powerful, with the right tools and abstractions.

Pepijn de Vos

All studl.es Lego Mindstorms building instructions added to this blog

The downside of having a Spanish domain is that you get Spanish email about it. Somehow an email slipped through, which meant that studl.es suddenly got canceled.

I decided to merge studl.es into this blog, so that all the building instructions and videos I made will remain available. There are probably some broken links and images in there, sorry about that.

A complete list of all the articles is available here, and all the ones that have building instructions are available here.

Pepijn de Vos

Make all the Star Wars memes

Normal people watch movies or play games to relax. Me on the other hand, I write code about movies or games to relax.

I came across some memes that asked whether every line from the Star Wars prequels is meme material, accompanied by a shot from said prequel with a character answering said question in the affirmative. Easily nerd-sniped as I am, I thought, surely there are lines in the movie that are completely boring?

I figured it should be doable to extract the subtitles from the movie, and use those to generate every possible Star Wars meme. Well, at least all the ones adhering to the format described above.

Extracting the subtitles and timestamps

It turns out srt subtitles are a pretty easy format to grep, but in case the subtitles are embedded inside the video file, or in some other binary format, ffmpeg has got you covered. Once you have the srt file, a simple grep command can be used to extract the timestamps.

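Note the leading space and the lack of a decimal point in the grep pattern. The leading space makes it match only the end time (the one after the "-->"), and dropping the fraction rounds that down to a whole second. I’m lazy, and this seemed the easiest way to get a timestamp that refers to a frame when the subtitle is displayed. This sometimes fails, for example when a subtitle starts and ends within the same second, so the midpoint between the start and end time would obviously be better.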

ffmpeg -y -txt_format text -i sw.mkv out.srt;
timestamps=$(grep -oE " [0-9]{2}:[0-9]{2}:[0-9]{2}" out.srt);
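The midpoint idea mentioned above could look roughly like the sketch below. It is not part of the original script; the function names midpoints and to_ms are made up for illustration, and it assumes the usual srt timing line of the form "HH:MM:SS,mmm --> HH:MM:SS,mmm".

```python
import re

# Matches an srt timing line such as "00:01:02,500 --> 00:01:05,000".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(h, m, s, ms):
    """Convert the four captured fields of one timestamp to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def midpoints(srt_text):
    """Return one "HH:MM:SS.mmm" timestamp per subtitle, halfway through it."""
    result = []
    for match in TIMING.finditer(srt_text):
        fields = match.groups()
        mid = (to_ms(*fields[:4]) + to_ms(*fields[4:])) // 2
        s, ms = divmod(mid, 1000)
        m, s = divmod(s, 60)
        h, m = divmod(m, 60)
        result.append(f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}")
    return result
```

The output uses a decimal point instead of the srt comma, which is the form ffmpeg accepts for -ss.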

Generating images for every subtitle line

Again, ffmpeg can do the job, but it requires a bit more than a straight copy-paste from the manual. Let’s go over the options after the full command, which I run inside a for loop.

ffmpeg -y -ss $ts -copyts -i sw.mkv -vf subtitles=sw.mkv -frames:v 1 frames/out$i.jpg;
  • -y answer yes to all prompts.
  • -ss timestamp seek to the specified timestamp. It is important this comes before the input file, otherwise it’ll render the whole movie up to that point. However, doing so changes the timestamps, which we need for the subtitles.
  • -copyts preserve the timestamp. This was a life-saver, thanks to the ffmpeg IRC channel.
  • -i the input file…
  • -vf subtitles=file specifies a filter that “burns” the subtitle into the movie.
  • -frames:v 1 save a single frame to the specified output file
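The loop itself is not shown here, so the following is only a guess at its shape, written in Python rather than shell. The helper names frame_command and render_frames are hypothetical, and numbering the frames from 0 is an assumption; the ffmpeg options are exactly the ones listed above.

```python
import subprocess


def frame_command(ts, i, movie="sw.mkv"):
    """Build the ffmpeg argument list for frame number i at timestamp ts."""
    return [
        "ffmpeg", "-y",
        "-ss", ts, "-copyts",          # seek before the input, keep timestamps
        "-i", movie,
        "-vf", f"subtitles={movie}",   # burn the subtitle into the frame
        "-frames:v", "1",
        f"frames/out{i}.jpg",
    ]


def render_frames(timestamps, run=subprocess.run):
    """Run ffmpeg once per timestamp; `run` can be swapped out for a dry run."""
    for i, ts in enumerate(timestamps):
        # grep leaves a leading space on each timestamp, so strip it.
        run(frame_command(ts.strip(), i), check=True)
```

Passing the arguments as a list avoids shell quoting issues, and the frames directory has to exist before the loop starts.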

Add the meme caption

With the hard part behind us, now we’re back to straight copy-pasting from the ImageMagick manual. The only interesting bit I added is the pointsize.

convert frames/out$i.jpg -background White -pointsize 32 label:'Insert funny caption' +swap  -gravity Center -append memes/meme$i.jpg;

Putting it all together

The whole script can be found here. I did not expect much, but it was already at frame 6 that I was pleasantly surprised.

Yes, of course

But then I thought

Surely you can do better

So I created a Twitter bot and a search page.

The Twitter bot is just an IFTTT applet that posts an image, served up by a mind-numbingly simple PHP script that I copied from somewhere.

For the search page, I wrote a little Python script that converts the srt file to a csv file, which I import into a MySQL database and then query using SELECT * FROM subtitles WHERE MATCH(sub) AGAINST ('words').
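The conversion script is not included in this post, so here is a minimal sketch of what such a converter might look like. The function names and the column order (index, start, end, text) are assumptions; the last column is the one that would end up as sub in the query above, and MATCH ... AGAINST needs a FULLTEXT index on it.

```python
import csv
import io
import re


def srt_to_rows(srt_text):
    """Yield (index, start, end, text) for each subtitle block."""
    # Blocks are separated by blank lines; normalise Windows line endings first.
    text = srt_text.replace("\r\n", "\n").lstrip("\ufeff").strip()
    for block in re.split(r"\n\s*\n", text):
        lines = block.split("\n")
        index = lines[0].strip()
        if len(lines) < 3 or not index.isdigit() or " --> " not in lines[1]:
            continue  # skip malformed blocks
        start, end = lines[1].split(" --> ")
        # Multi-line subtitles are joined into a single searchable string.
        yield int(index), start.strip(), end.strip(), " ".join(lines[2:])


def srt_to_csv(srt_text):
    """Return the subtitles as csv text, one row per subtitle."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(srt_to_rows(srt_text))
    return out.getvalue()
```

The csv module quotes any field that contains a comma, which matters here because srt timestamps use a comma before the milliseconds.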

That’s all for now.

May the force be with you

Pepijn de Vos