Workers are dumb

August 8, 2013

When doing heavy lifting in a webapp (anything from video transcoding to calling an API), a common practice is to take this heavy lifting out of the request and process it in the background. Doing so means the page can finish loading nice and quickly for the user. A common way to accomplish this is to use a message queue and workers. And workers are dumb.
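To make the idea concrete before bringing in Rails or Sidekiq, here is a minimal sketch of the pattern in plain Ruby. It is only a toy: the in-process Queue, the handle_request method and the "video-42" payload are all assumptions made up for illustration, and a real message queue keeps its jobs outside the web process.

```ruby
# A toy, in-process stand-in for a message queue and a single worker.
# Real systems (Sidekiq, Resque, ...) store jobs outside the web process.
JOBS = Queue.new     # thread-safe FIFO from Ruby's core library
RESULTS = Queue.new  # collects whatever the worker produced

worker = Thread.new do
  # Keep popping jobs until the :stop sentinel arrives.
  while (job = JOBS.pop) != :stop
    RESULTS << "processed #{job}"
  end
end

# The "request" only enqueues the work and returns straight away.
def handle_request(payload)
  JOBS << payload
  :accepted
end

status = handle_request("video-42")
JOBS << :stop  # tell the worker to finish
worker.join    # wait for the background work to complete
```

The request returns as soon as the job is on the queue; the slow part happens in the worker thread.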

There are many message queues and many more client libraries. I like Rails and Sidekiq so I’ll be using them for my examples, but I hope the principles can be applied to any queue/worker system.

Say we want to generate a PDF representation of a model. We might have some code that looks like this:

# app/models/model.rb
class Model < ActiveRecord::Base
  def generate
    PDFGenerator.generate_from_model(self) if self.can_be_printed?
  end
end

Here’s the situation: we can call @model.generate to get our PDF (if it can be generated), but PDF generation takes ages and happens when a user presses the generate button, so we decide to move it out of the request and into a worker.

Our initial attempt might look like this:


# app/workers/pdf_generation_worker.rb
class PdfGenerationWorker
  include Sidekiq::Worker

  def perform(model_id)
    model = Model.find(model_id)
    # Guard logic ends up duplicated inside the worker
    PDFGenerator.generate_from_model(model) if model.can_be_printed?
  end
end

# app/models/model.rb
class Model < ActiveRecord::Base
  def generate
    # Enqueue a job; only the id is passed to the worker
    PdfGenerationWorker.perform_async(self.id)
  end
end

Mission accomplished. Sort of. Yes, the heavy lifting of generating the PDF is now being done in another process, but I don’t like this code for a few reasons.

Firstly, there is logic in the worker to safeguard against generating a PDF that should not be generated. In my opinion, this logic belongs in the model and should remain there where it can be unit-tested and also easily found by the next developer.

Secondly, it’s really hard to generate a PDF from a test, a console or anywhere else. To do so means having to add a message to the queue – that’s the only way to reach the PDFGenerator.generate_from_model method.

Thirdly, and perhaps (or perhaps not) least importantly, we can’t use Sidekiq’s awesome #delay syntax.

Let’s refactor this to fix all of the above. First we’ll start with the model:

# app/models/model.rb
class Model < ActiveRecord::Base
  # Enqueue a background job
  def generate
    PdfGenerationWorker.perform_async(self.id)
  end

  # Generate the PDF right away, in the current process
  def generate!
    PDFGenerator.generate_from_model(self) if self.can_be_printed?
  end
end

And then quickly, the worker:


# app/workers/pdf_generation_worker.rb
class PdfGenerationWorker
  include Sidekiq::Worker

  def perform(model_id)
    Model.find(model_id).generate!
  end
end

In the model, we’ve added a bang version of our #generate method. I like this because I feel like it follows the convention of bang methods doing stuff – call generate without a bang and a job is enqueued, but call it with the bang and the PDF is generated right away.

Secondly, it means our worker can be dumb. All the worker does now is find the correct model by the id it is given and call the bang version of the method on it. Any special logic required to generate the PDF remains in the model, and the worker becomes that much more transparent.
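One payoff is that the guard can be exercised from a test or a console without touching a queue. The sketch below shows this in plain Ruby; FakeModel and DumbPdfWorker are hypothetical stand-ins for the ActiveRecord model and the Sidekiq worker, and setting a flag stands in for the actual PDF generation.

```ruby
# Stand-ins so the sketch runs without Rails or Sidekiq (all hypothetical).
class FakeModel
  STORE = {} # plays the role of the database table

  attr_reader :id, :generated

  def initialize(id, printable)
    @id = id
    @printable = printable
    @generated = false
    STORE[id] = self
  end

  def self.find(id)
    STORE.fetch(id)
  end

  def can_be_printed?
    @printable
  end

  # The guard lives in the model, next to the data it depends on.
  def generate!
    @generated = true if can_be_printed?
  end
end

# The worker stays dumb: look up the record, delegate, done.
class DumbPdfWorker
  def perform(model_id)
    FakeModel.find(model_id).generate!
  end
end

printable = FakeModel.new(1, true)
locked = FakeModel.new(2, false)

printable.generate!           # direct call, no queue involved
DumbPdfWorker.new.perform(2)  # via the worker; the model's guard still applies
```

Either route ends up in the same method, so there is only one place where the "can this be printed?" rule can go wrong.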

It also means we can do away with an explicit worker class altogether and use Sidekiq’s #delay syntax as I mentioned earlier. Here’s how that might look in the model.


# app/models/model.rb
class Model < ActiveRecord::Base
  def generate
    self.delay.generate!
  end

  def generate!
    PDFGenerator.generate_from_model(self) if self.can_be_printed?
  end
end

Using the #delay method like this relies on something like an anonymous, generic worker provided by Sidekiq: that worker knows nothing about the model at all and simply replays the method call it was handed. You can read a bit more about delay here.
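If it helps to picture what a delay-style proxy does, here is a rough sketch in plain Ruby. This is not Sidekiq's implementation: DelayProxy, Delayable, Report, ENQUEUED and run_next_job are all names I made up, and the "queue" is just an array in memory. As far as I know, later Sidekiq releases turned the delay extensions off by default and eventually removed them, so check the documentation for your version before relying on #delay.

```ruby
# Hypothetical in-memory job list standing in for the real queue.
ENQUEUED = []

# Records a method call instead of running it.
class DelayProxy
  def initialize(target)
    @target = target
  end

  def method_missing(name, *args)
    return super unless @target.respond_to?(name)

    ENQUEUED << [@target, name, args] # remember the call for later
    nil
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

# Mixin that gives any object a #delay method.
module Delayable
  def delay
    DelayProxy.new(self)
  end
end

class Report
  include Delayable
  attr_reader :generated

  def generate!
    @generated = true
  end
end

# The generic "worker": it knows nothing about Report, it just replays the call.
def run_next_job
  target, name, args = ENQUEUED.shift
  target.public_send(name, *args)
end

report = Report.new
report.delay.generate!                    # recorded, not executed
generated_before_run = report.generated   # still nil at this point
run_next_job                              # now the call actually happens
```

The generic worker never mentions Report or generate!, which is the same property the hand-written dumb worker was aiming for.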

So there you have it. I hope you feel that is a compelling argument for keeping workers dumb. It keeps logic where it should be, and in my opinion helps to simplify the whole messy business of message queues and background processing.

Update (27/07/2013)
After some good feedback on Twitter, I want to add another example. The point is not that a worker has to call back into a model, but that the worker should call something else rather than executing complex logic itself. In this case, we’re going to call from the worker into a service object. The worker will remain dumb, and the class will remain usable. Here are the model, the worker and the service:


# app/models/model.rb
class Model < ActiveRecord::Base
  def generate
    PdfGenerationWorker.perform_async(self.id)
  end
end

# app/workers/pdf_generation_worker.rb
class PdfGenerationWorker
  include Sidekiq::Worker

  def perform(model_id)
    PdfGenerationService.generate_from_model(Model.find(model_id))
  end
end

# app/services/pdf_generation_service.rb
class PdfGenerationService
  # The only place that decides whether a PDF may be generated
  def self.generate_from_model(model)
    PDFGenerator.generate_from_model(model) if model.can_be_printed?
  end
end

In this (obviously contrived) example, we are not calling back into the model, but instead into a service object. Doing this allows us to trim down our model, keep our worker dumb and encapsulate the logic pertaining to the generating of a PDF in a class whose only worry in the whole wide world is whether it should allow the model to be generated as a PDF. Any updates to this logic happen in one place, etc. etc.

And we can still use the #delay syntax:


# app/models/model.rb
class Model < ActiveRecord::Base
  def generate
    PdfGenerationService.delay.generate_from_model(self)
  end
end
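Because the guard now sits in a service object, it can be checked in isolation. Below is a small sketch of what that might look like in plain Ruby. Printable, RecordingGenerator and GuardedPdfService are hypothetical names, and passing the generator in as a keyword argument is my own assumption to keep the sketch free of a real PDF library; the service in the post calls PDFGenerator directly.

```ruby
# Hypothetical stand-in for the model: just an id and a printable flag.
Printable = Struct.new(:id, :printable) do
  def can_be_printed?
    printable
  end
end

# Hypothetical stand-in for the PDF library; it records what it was asked to do.
class RecordingGenerator
  attr_reader :seen

  def initialize
    @seen = []
  end

  def generate_from_model(model)
    @seen << model.id
    "pdf-for-#{model.id}"
  end
end

# Same shape as the service object, with the generator injected (an assumption).
class GuardedPdfService
  def self.generate_from_model(model, generator:)
    generator.generate_from_model(model) if model.can_be_printed?
  end
end

generator = RecordingGenerator.new
ok = GuardedPdfService.generate_from_model(Printable.new(1, true), generator: generator)
skipped = GuardedPdfService.generate_from_model(Printable.new(2, false), generator: generator)
```

No worker and no queue are involved here, yet the one rule that matters (only printable models get a PDF) is fully covered.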

This was originally posted on .


You are currently reading Workers are dumb at Logical Friday.
