Wishful Coding

Didn't you ever wish your
computer understood you?

Learning F# as a lesson in feminism

Some of my Hacker School friends tweet a lot about feminism, and sometimes I fail to see why things like this are such a big deal.

But since I started learning F#, it suddenly makes sense. Here’s why: F# is default Windows.

You can use F# on Mac or Linux, but at every step along the way you see screenshots, commands, programs and people that implicitly assume you’re using Visual Studio. You can ask questions in ##fsharp on Freenode, but you’ll need to explain you’re using Mono every single time. You can compile cross-platform programs, but with dll and exe extensions.

With this experience, I started looking at some other things.

Why do I use Linux anyway? Is it superior to Windows? In some ways, but it is also worse in others. The real reason is that outside of the .NET community, everything is thoroughly default Linux. Half of the tools and libraries I use simply would not work on Windows. On some bigger projects, Windows support might be added as an afterthought, possibly through Cygwin.

Why do I work in English? I’m a Dutch guy. Same answer. Programming is default English, and only the biggest of the biggest projects have multilingual documentation and community. It would be incredibly frustrating to program without all the English resources.

Realising that I fit the default programmer pretty accurately, and having experienced the annoyance of not being the default, I find it much easier to see why it might be frustrating to be an afterthought as a woman.

If you talk about a random guy on the internet, you probably talk about him like this. Read that again, then install this extension.

I just assumed you were using Chrome implicitly. How rude. If you’re not, you can install it with apt-get install chromium.

Free beer meetups seem like a great idea at first, but why not make it free drinks instead? Let me sip my lemonade while you all get drunk.

I hope this post will get you thinking about how to alienate fewer people of any kind. It’s not about Linux or about women, it’s about being aware of your assumptions. You can’t always cater to everyone, but you can at least acknowledge the people you’re not catering to.


Redis Pipelining

I’d like to announce Pypredis, a Python client for Redis that tries to answer the question

How fast can I pump data into Redis?

There are many answers to that question, depending on what your goal and constraints are. The answer that Pypredis is exploring is pipelining and sharding. Let me explain.

The best use case is a few slow and independent commands. For example, a couple of big SINTER commands.

The naive way to do it using redis-py is to just execute the commands one after the other and wait for a reply.

import redis

r = redis.StrictRedis()
r.sinter('set1', 'set2')
r.sinter('set3', 'set4')
r.sinter('set5', 'set6')
r.sinter('set7', 'set8')

In addition to the CPU time, you add a lot of latency by waiting for the response every time, so a better solution would be to use a pipeline.

r = redis.StrictRedis()
pl = r.pipeline()
pl.sinter('set1', 'set2')
pl.sinter('set3', 'set4')
pl.sinter('set5', 'set6')
pl.sinter('set7', 'set8')
pl.execute()

That is pretty good, but we can do better in two ways.

First of all, redis-py does not start sending commands until you call execute, so valuable time is wasted while the pipeline is being built up, especially if other work is done in between Redis commands.

Secondly, Redis is — for better or worse — single-threaded. So while the above pipeline might use 100% CPU on one core, the remaining cores might not be doing very much.

To utilise a multicore machine, you can shard your data across multiple Redis servers. However, sequentially executing pipelines on multiple Redis servers using redis-py actually performs worse, because each execute blocks until that server has replied.

pl1.execute() #blocks
pl2.execute() #blocks

The approach that Pypredis takes is to return a Future and send the command in another thread using an event loop.

Thus, pipelining commands in parallel to multiple Redis servers is a matter of not waiting for the result.

eventloop.send_command(conn1, "SINTER", "set1", "set2")
eventloop.send_command(conn2, "SINTER", "set3", "set4")
eventloop.send_command(conn1, "SINTER", "set5", "set6")
eventloop.send_command(conn2, "SINTER", "set7", "set8")
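
Since send_command returns a Future, you only have to block once, after everything has been sent. A rough sketch, assuming the Futures behave like Python’s concurrent.futures objects (the exact Pypredis API may differ):

futures = [
    eventloop.send_command(conn1, "SINTER", "set1", "set2"),
    eventloop.send_command(conn2, "SINTER", "set3", "set4"),
    eventloop.send_command(conn1, "SINTER", "set5", "set6"),
    eventloop.send_command(conn2, "SINTER", "set7", "set8"),
]
# All commands are already on the wire; block only once for the replies.
results = [f.result() for f in futures]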

A very simple benchmark shows that indeed Pypredis is a lot faster on a few big and slow commands, but the extra overhead makes it slower for many small and fast commands.

pypredis ping      1.083333
redis-py ping      0.933333
pypredis sunion    0.42
redis-py sunion   11.736665

Writing a web server

A colleague asked what would be an interesting exercise to learn more about Perl. I think an HTTP server is a good thing to build, because it’s a small project that helps you understand web development a lot better.

This post serves as a broad outline of how an HTTP server works, and as a collection of resources to get started.

There is of course the HTTP specification itself. It’s good for looking up specific things, but otherwise not very easy reading.

HTTP is a relatively simple text-based protocol on top of TCP. It consists of a request and a response, both of which are made up of a start line (the request line or status line), a number of headers, a blank line, and the request/response body.

What I recommend doing is playing with a simple working server to see what happens.

Let’s create a file and start a simple server.

$ echo 'Hello, world!' > test
$ python -m SimpleHTTPServer

This will serve the current directory at port 8000. We can now use curl to request the file we created. Use the -v flag to see the HTTP request and response.

$ curl -v http://localhost:8000/test
> GET /test HTTP/1.1
> User-Agent: curl/7.30.0
> Host: localhost:8000
> Accept: */*
> 
< HTTP/1.0 200 OK
< Server: SimpleHTTP/0.6 Python/2.7.6
< Date: Wed, 12 Mar 2014 17:51:26 GMT
< Content-type: application/octet-stream
< Content-Length: 14
< Last-Modified: Wed, 12 Mar 2014 17:51:06 GMT
< 
Hello, world!

Take a while to look up all the headers to see what each one does. Explain what happens to a friend, cat or plant.

Now you can take the role of the client or the server yourself. Can you get Python to return you the file using netcat?

$ nc localhost 8000
<enter request here>
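
If you get stuck: the smallest request that will work here is a single request line, finished with an empty line.

GET /test HTTP/1.0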

Now can you get curl to talk to you? Start listening with

$ nc -l 1234

Now in another terminal run

$ curl http://localhost:1234/test

You’ll see the request in the netcat window. Try writing a response. Remember to set Content-Length correctly.
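
For example, a minimal response that curl will happily print looks like this. The blank line separates the headers from the body, and Content-Length counts the 13 bytes of the body.

HTTP/1.0 200 OK
Content-Type: text/plain
Content-Length: 13

Hello, world!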

Now it is time to actually write the server in the language of your choice. Whichever one you use, its socket API is probably loosely based on the Unix C API. To find out more about that, run

man socket

You’re looking for a PF_INET (IPv4) socket of the SOCK_STREAM (TCP) type, but other types exist.

Be sure to check out the SEE ALSO section for functions for working with the socket.

The basic flow for the web server is as follows (a minimal Python sketch follows the list).

  1. Create the socket.
  2. bind it to a port.
  3. Start to listen.
  4. accept an incoming connection. (will block)
  5. read the request.
  6. write the response.
  7. close the connection.
  8. Go back to accept.
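
As a reference point, here is a minimal sketch of that flow in Python; the exercise is language-agnostic, so treat this as one possible translation rather than the recommended implementation.

import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # 1. create the socket
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('', 8080))                                       # 2. bind it to a port
server.listen(5)                                              # 3. start to listen

while True:
    conn, addr = server.accept()                              # 4. accept (blocks)
    request = conn.recv(4096)                                 # 5. read the request
    body = b'Hello, world!'
    response = (b'HTTP/1.0 200 OK\r\n'
                b'Content-Type: text/plain\r\n'
                b'Content-Length: ' + str(len(body)).encode() + b'\r\n'
                b'\r\n' + body)
    conn.sendall(response)                                    # 6. write the response
    conn.close()                                              # 7. close, then go back to accept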

Note that what you do after accept is subject to much debate. The simple case outlined above handles only one request at a time. A few other options:

  • Start a new thread to handle each request (see the sketch after this list).
  • Use a queue and a fixed pool of threads or processes to handle the requests. Apache does this.
  • Handle many requests asynchronously with select, epoll (Linux) or kqueue (BSD). Node.js does this.
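
For the first option, the change to the sketch above is small. A rough example, building on the server socket from the previous sketch, with error handling omitted:

import threading

def handle(conn):
    request = conn.recv(4096)                     # read the request
    conn.sendall(b'HTTP/1.0 200 OK\r\n'
                 b'Content-Length: 2\r\n\r\nok')  # write the response
    conn.close()                                  # close the connection

while True:
    conn, addr = server.accept()                  # still blocks, but only briefly
    threading.Thread(target=handle, args=(conn,)).start()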

After you have a basic request/response working, there are many things you could explore.

  • Serve static files.
  • Add compression with gzip.
  • Support streaming requests and responses.
  • Run a CGI script.
  • Implement most of HTTP 1.0.
  • Implement some HTTP 1.1 parts.
  • Look into pipelining and Keep-Alive.
  • Look into caching.