I've been meaning to cover GenStage for quite a while and finally managed to.
The format for this one is pretty different though. Read on.
This is a session where Theron Boerner agreed to pair with me to prototype
switching the rabbit-ci build queue system from RabbitMQ to GenStage. There
were a few reasons that he wanted to do this:
- He can eventually get rid of the RabbitMQ dependency.
- Once a job is enqueued right now, it can't be changed in any way.
- Making the current RabbitMQ-based queueing system support specific machine
types for specific build machines/jobs would be complicated.
Given this, he decided to try implementing it in GenStage, using Postgres to
store the build queues. It didn't hurt that José had recently given a talk at
the Elixir London meetup in which he walked through building a background job
queue system using GenStage and taking advantage of a new feature in
PostgreSQL 9.5, FOR UPDATE SKIP LOCKED, which lets a query skip rows that are
already locked.
The code we ended up with is available on the
hunterboerner/gen_stage_playground repo on GitHub.
It's a very lengthy pairing session - 2 hours and 20 minutes. With that in
mind, here's a rough outline of various timings in the video:
- 00:00:24 - We discuss the GitHub Issue where he wrote up a couple of candidates for us to work on in our pairing session. He discusses how the system currently works and what he hopes to gain from the GenStage implementation.
- 00:02:30 - We start to look through GenStage's documentation to talk about what our routing should end up looking like. Ultimately we implemented something comically simple: just a few producers and a few consumers that subscribe to them directly.
- 00:04:50 - Theron shows me where José's London talk covers the use of FOR UPDATE SKIP LOCKED in PostgreSQL.
- 00:05:37 - We kick off the new project and install GenStage as a dependency.
- 00:13:43 - I realize once again that I want to play with spacemacs.
- 00:17:15 - We introduce Ecto.
- 00:27:12 - We write our first test.
- 00:46:20 - We start making our Producer produce builds rather than integers.
- 00:54:10 - José sends Theron the code for his talk, which makes our cribbing from it substantially simpler.
- 01:01:00 - We realize that we have to upgrade our version of PostgreSQL on Theron's machine in order to move forward. This consumes the next 10 minutes.
- 01:11:00 - We can finally run the tests with our code that should start marking jobs as running when the producer produces them.
- 01:15:00 - I completely derail work briefly as my swiping to another desktop screws with his emacs setup. Wooooops.
- 01:20:00 - Now that our test shows the producer marks builds as running after producing them, we start implementing the actual BuildConsumer that will run the builds and consequently mark them as complete. This leads to us talking through some GenStage design realizations, reading the docs, and in general trying to figure out how to tweak our demand so we don't consume a ton of builds at once on a single worker. We also gradually figure out how max_demand and min_demand work by watching the logs and being confused for a long time until we are no longer confused.
- 01:49:00 - 29 minutes later, we finally start to have a decent internal model for how GenStage and demand work, and can finally begin to predict how a given tweak to the max_demand and min_demand options will affect the actual results. Once we get past the rest of this eureka moment, we start talking through design considerations for the BuildConsumer.
- 01:54:00 - We notice that our returning statement isn't working exactly like we expected in our update. This ultimately is due to the default value in our schema, but it takes us a few minutes to work that out.
- 01:59:30 - We start actually prototyping the 'work' that the BuildConsumer does, updating the status of the builds appropriately. This leads to a brief discussion of crash semantics inside our consumer.
- 02:01:00 - Our BuildConsumer is doing what we want! Now we start adding more tests to simulate increasingly appropriate examples of an actual CI queue.
- 02:10:00 - We model a producer for each build type (think mac builds, linux builds, etc) and consumers that can handle those build types, and verify that our system can model this successfully.
- 02:12:00 - We introduce Tasks with await to run builds concurrently.
- 02:16:40 - Wrapping up
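For anyone who wants a feel for the last step before watching, here is a minimal sketch of running several builds concurrently with Task.async and Task.await. It is not the code from the session: the BuildRunner module, the run_build function, the map shape, and the :running and :complete status atoms are all hypothetical stand-ins, and the "build" is just a short sleep.

```elixir
defmodule BuildRunner do
  # Hypothetical stand-in for running one CI build: sleeps briefly,
  # then returns the build map with its status updated.
  def run_build(build) do
    Process.sleep(50)
    %{build | status: :complete}
  end

  # Starts one task per build, then waits for each task in turn.
  # Results come back in the same order as the input list.
  def run_all(builds, timeout \\ 5_000) do
    builds
    |> Enum.map(fn build -> Task.async(fn -> run_build(build) end) end)
    |> Enum.map(fn task -> Task.await(task, timeout) end)
  end
end

builds = [
  %{id: 1, status: :running},
  %{id: 2, status: :running},
  %{id: 3, status: :running}
]

completed = BuildRunner.run_all(builds)
IO.inspect(completed)
```

One thing worth knowing, given the crash semantics discussion in the video: Task.async links the task to the calling process, so a build that raises will also take down the caller unless you handle that some other way.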
So yeah... This is different from anything else I've released, but I think it's
useful. Please email me and let me know if you liked this episode or if you
hated it. I know it's a massive departure from the normal format, and I don't
intend to switch to doing 2 of these a week or anything, but if there's value
and people like them I'm completely willing to do them occasionally because it's
a fantastic experience for me. Also, I think there needed to be some public
discussion of building something slightly non-trivial with GenStage, and there
hadn't really been enough of that yet, so it seemed like a useful thing to get
out there.
A hybrid approach might be doing this sort of thing and then releasing the
pairing session as one episode in a week, and a succinct version of the
important meat we learned from it as the second episode. I'm all ears, send me
an email or leave a comment and let me know what you want to see!
See you soon!