[005.2] Sidekiq Pro: Speedy API Extensions

Sidekiq Pro's speedy API extensions: deleting jobs by `jid` and class, finding jobs by `jid`, and polling batch status.

Sidekiq Pro: Speedy API Extensions [07.11.2016]

Sidekiq Pro adds a few features that allow you to manipulate the Redis-based queues by way of Lua scripts that run inside of Redis. This yields far better performance than you can achieve otherwise. Let's look at what's provided.

Project

We're going to use the sidekiq_batches project just to have some workers defined to play with. The first Pro feature we want to see is the ability to delete a job from a queue.

Deleting Jobs by jid

Assume you know the jid of a job, and you want to delete it for some reason.

# We'll make a new job:
jid = NotificationWorker.perform_async("josh@dailydrip.com", "enemy@example.com")
# Now we can just delete that job from its queue:
queue = Sidekiq::Queue.new
queue.size # => 1
queue.delete_job(jid) # returns the job JSON
queue.size # => 0

That's great if you want to delete a particular job, but maybe you really have a ton of jobs of a certain class and you want none of them to run - perhaps you know they're all going to fail because of a bug that was introduced, and you'd like to just pre-empt them running in the first place.

Deleting Jobs by Class

Sidekiq Pro offers the ability to delete jobs by class:

# Let's make a few jobs:
100.times do
  NotificationWorker.perform_async("josh@dailydrip.com", "enemy@example.com")
  UploadWorker.perform_async(1)
end
queue.size # => 200
queue.delete_by_class(NotificationWorker)
queue.size # => 100
# Now we'll clear the queue just to clean up after ourselves
queue.clear
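
To make the two delete operations concrete, here's a toy, in-memory sketch of what they do to a queue of serialized jobs. `ToyQueue` is a hypothetical stand-in written for this illustration, not Sidekiq code: the real Pro versions run as Lua scripts inside Redis, and the return values below are this sketch's own choice.

```ruby
require "json"
require "securerandom"

# Hypothetical in-memory stand-in for the Redis list behind one queue.
class ToyQueue
  def initialize
    @payloads = []
  end

  # Store the job as a JSON string with "class", "jid" and "args" keys.
  def push(klass, *args)
    jid = SecureRandom.hex(12)
    @payloads << JSON.generate("class" => klass, "jid" => jid, "args" => args)
    jid
  end

  def size
    @payloads.size
  end

  # Remove the job with the given jid; return its JSON, or nil if absent.
  def delete_job(jid)
    index = @payloads.index { |raw| JSON.parse(raw)["jid"] == jid }
    index && @payloads.delete_at(index)
  end

  # Remove every job of the given class; return how many were removed.
  def delete_by_class(klass)
    before = @payloads.size
    @payloads.reject! { |raw| JSON.parse(raw)["class"] == klass.to_s }
    before - @payloads.size
  end
end

toy = ToyQueue.new
jid = toy.push("NotificationWorker", "josh@dailydrip.com", "enemy@example.com")
100.times { toy.push("UploadWorker", 1) }
toy.delete_job(jid)                 # removes just that one job
toy.delete_by_class("UploadWorker") # removes the other 100
```

The point of the sketch is only the shape of the work: both operations have to look inside each payload, which is why doing it next to the data, inside Redis, pays off.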

Finding Jobs by jid in a JobSet

From time to time you will want to find a job by jid. If you have a lot of jobs in Redis - and in production, you likely will - this is slow, as it has to scan through them all. Sidekiq Pro adds a Lua-based Redis extension that provides a much faster means of finding a job.

Let's start off by getting a lot of jobs into the RetrySet. First we'll make a worker that just raises.

vim app/workers/dead_job_worker.rb
class DeadJobWorker
  include Sidekiq::Worker

  def perform
    raise "lol no"
  end
end

Then we'll create a bunch of them:

100_000.times do
  DeadJobWorker.perform_async
end

So now we should end up with 100,000 jobs in our RetrySet once we start sidekiq. Let's start it:

sidekiq
# remember to stop it after, for the setup for the next part

Let's remove sidekiq-pro from our Gemfile to look at the performance of finding a job in a JobSet by jid without the Pro extensions.

((( remove from Gemfile, update config/routes.rb to no longer use pro )))

Now we can find it:

jid = "ff12267de573d487bcaa17d4"
retryset = Sidekiq::RetrySet.new
retryset.find_job(jid)

So you might say, "Well, that was fast enough. What's Josh on about?" And that seems fair. But you won't actually have Redis on localhost in production! So to simulate Redis over a network, we'll introduce some latency.

We'll do this with a tool called toxiproxy.

We start off by running the toxiproxy service:

toxiproxy-server

And we'll create a proxy for our redis instance that adds some latency:

# Add the proxy
toxiproxy-cli create redis -l localhost:26379 -u localhost:6379
# Make it toxic with 20ms latency...seems reasonable eh?
toxiproxy-cli toxic add redis -t latency -a latency=20

Next we'll point our app at the proxy's port via the REDIS_URL environment variable when starting our rails console:

REDIS_URL=redis://localhost:26379 rails c

So now, with just 20ms latency to Redis, what does our speed look like?

jid = "ff12267de573d487bcaa17d4"
retryset = Sidekiq::RetrySet.new
retryset.find_job(jid)

Whoa...that's a lot slower, and that's not horrible latency. In fact, even with 2ms latency this operation took a few seconds for me, but at 20ms it's a lot more pronounced. Let's switch to Sidekiq Pro and see how the speedier API does with this.

((( turn it back on in the Gemfile )))

REDIS_URL=redis://localhost:26379 rails c
jid = "ff12267de573d487bcaa17d4"
retryset = Sidekiq::RetrySet.new
retryset.find_job(jid)

This is with the same latency to our Redis server. This is a huge win.
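
A rough way to see why latency hurts the plain lookup so much more is to count network round trips. The sketch below is a hypothetical cost model, not Sidekiq's code: it assumes the client-side scan pages through the set in fixed-size chunks (the page size of 50 is an assumption), that the server-side script needs a single call, and that each call costs roughly one latency interval.

```ruby
# Hypothetical cost model: every Redis call pays one network round trip.
PAGE_SIZE = 50 # assumed page size for the client-side scan

# Client-side scan: fetch the set page by page until the jid turns up.
def round_trips_for_scan(jids, target, page_size: PAGE_SIZE)
  trips = 0
  jids.each_slice(page_size) do |page|
    trips += 1
    return trips if page.include?(target)
  end
  trips
end

# Server-side script: the search runs inside Redis, so one call suffices.
def round_trips_for_script
  1
end

# Convert a number of round trips into seconds at a given latency.
def seconds(trips, latency_ms)
  trips * latency_ms / 1000.0
end

jids = Array.new(100_000) { |i| format("jid-%06d", i) }
worst_case = round_trips_for_scan(jids, jids.last)

puts "scan:   #{worst_case} round trips, ~#{seconds(worst_case, 20)}s at 20ms"
puts "script: #{round_trips_for_script} round trip, ~#{seconds(round_trips_for_script, 20)}s at 20ms"
```

Under these assumptions the worst case for the scan is 2,000 round trips for 100,000 jobs, so the latency is paid thousands of times instead of once.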

Polling for status of a batch

A need that comes up frequently is knowing the status of a long-running batch so you can show it in your UI. Pro makes this easy with the batch_status rack app. You can add it to your project like so:

vim config/application.rb
# ...
require 'sidekiq/rack/batch_status'
#...
module SidekiqBatches
  class Application < Rails::Application
    # ...
    config.middleware.use Sidekiq::Rack::BatchStatus
  end
end

Let's add a batch so we can see its status:

csv = File.read("./wiki_articles.csv")
upload = Upload.create(body: csv)
upload.import!

Now you can visit http://localhost:3000/batch_status/#{batch_id}.json to see the JSON for a batch's status:

{
  "is_complete": false,
  "bid": "pIvxHDUlw7mvQw",
  "total": 56,
  "pending": 56,
  "description": "",
  "failures": 0,
  "created_at": 1468102260.4971678,
  "fail_info": []
}

We haven't started sidekiq yet. We have to start it for our upload worker to create the batch.

sidekiq

Now we can start a terminal and curl that URL every second:

while curl http://localhost:3000/batch_status/pIvxHDUlw7mvQw.json; do echo ""; sleep 1; done

We can see jobs being completed bit by bit. You can imagine having a JavaScript application making this request and updating a progress bar with it trivially.
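
As a minimal sketch of what a client would do with that response, here's a helper that turns the status JSON into a completion percentage. `batch_progress` is a hypothetical name for this illustration; the field names come from the response shown earlier, and the `pending` value in the sample is made up to represent a batch that is partway done.

```ruby
require "json"

# Turn a batch-status JSON document into a completion percentage.
def batch_progress(json)
  status = JSON.parse(json)
  total = status.fetch("total")
  return 100.0 if status["is_complete"] || total.zero?

  done = total - status.fetch("pending")
  (done * 100.0 / total).round(1)
end

# Sample payload: "pending" is an invented value for illustration.
sample = '{"is_complete": false, "bid": "pIvxHDUlw7mvQw", "total": 56, "pending": 42, "failures": 0}'
puts batch_progress(sample) # 14 of 56 jobs done => 25.0
```

A progress bar would call something like this on each poll and stop polling once `is_complete` is true.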

Summary

So that's it. In today's episode, we saw Sidekiq Pro's speedy API extensions, deleting jobs by jid and class, and finding jobs by jid. We also saw how you can track the status of a batch as it is completed, for providing feedback to your users. I hope you enjoyed learning about Sidekiq Pro. See you soon, with details on Sidekiq Enterprise!