Ruby on AWS Lambda: Planning & Architecting

Nate Shoemaker - July 21, 2020

This article is part of our Ruby, AWS Lambda, and OCR blog series. A recent project had us migrating an existing PDF document processing system from Rails and Sidekiq to AWS Lambda. The processing includes OCR, creating preview images, splicing the PDF, and more. Moving to Lambda reduced processing time by 300% in some cases; parallelization for the win!

This series will serve less as a step-by-step guide to getting serverless OCR infrastructure up and running and more as a highlight reel of our "Aha!" moments. In part one, we talked about creating an AWS Lambda Layer with Docker. In part two, we'll chat about architecting a serverless app.

The Problem

Rails devs, tell me if you've been here before: you have small units of work that should be processed concurrently, so you reach for Sidekiq. Solved! Well, usually solved. Sidekiq is an amazing tool that we use at Hint, and it solves most concurrency problems.

However, a bottleneck is always a possibility. In our case, the document processor we engineered could process a PDF with one page, or 500. But since each page has to be processed and there is no page limit when uploading, it was common to process hundreds of pages. This work took far too much time with Sidekiq, even with tens of workers. We wanted to see the same performance whether the PDF a user uploads has 50 pages, or 500.

The Solution

AWS Lambda was a good solution for our performance problem. It allows us to run our workers in parallel independent of the number of pages we are processing. However, Lambda doesn't give you any guidance on how to design a serverless application. You are provided the basic building blocks: a function that calls other functions, and the ability to use any other AWS service. Quite open-ended! Luckily, there are some standard architecture practices that have emerged in the serverless community. We'll be focusing on the most popular strategy: fan-out, fan-in.

"What do you mean, software architecture?"

Glad you asked! Martin Fowler said it best:

When people in the software industry talk about “architecture”, they refer to a hazily defined notion of the most important aspects of the internal design of a software system.

When we talk about architecture in the context of Lambda, we'll be touching on function composition, utilizing other AWS services, and how the application communicates with the real world. If you haven't already, check out Martin Fowler's thoughts on software architecture.

Fan-out, Fan-in

The fan-out, fan-in architecture is simple: one function starts n number of functions, and when those finish, another function does something with the results. The first function is referred to as the ventilator (or vent). The vent function calls worker functions. Once the worker functions finish, the sink function is called. So, vent -> workers -> sink.
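To make the shape of the pattern concrete, here is a minimal local sketch in plain Ruby. It is a simulation, not Lambda code: threads stand in for worker invocations, a Mutex-guarded counter stands in for the external store, and the class name FanOutFanIn and the placeholder "work" (upcasing a string) are made up for illustration.

```ruby
# Local, in-process simulation of vent -> workers -> sink.
# Threads stand in for Lambda invocations; a Mutex-guarded counter
# stands in for the external store that tracks remaining work.
class FanOutFanIn
  attr_reader :sink_result

  def initialize(items)
    @items = items
    @results = {}
    @remaining = items.size
    @lock = Mutex.new
    @sink_result = nil
  end

  # Vent: start one worker per item. The join is only needed locally,
  # so the script does not exit before the workers finish.
  def vent
    threads = @items.each_with_index.map do |item, index|
      Thread.new { worker(item, index) }
    end
    threads.each(&:join)
    self
  end

  private

  # Worker: process one item, store its result, decrement the counter.
  # Whichever worker brings the counter to zero triggers the sink.
  def worker(item, index)
    output = item.to_s.upcase # placeholder for real page processing
    last_worker = @lock.synchronize do
      @results[index] = output
      @remaining -= 1
      @remaining.zero?
    end
    sink if last_worker
  end

  # Sink: runs once, after the final worker has stored its result.
  def sink
    @sink_result = @results.sort.map { |_index, output| output }
  end
end

job = FanOutFanIn.new(%w[page1 page2 page3]).vent
puts job.sink_result.inspect
```

The important detail is that the decrement and the zero check happen as one step, so exactly one worker is responsible for calling the sink no matter which order the workers finish in.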

For example, let's use a hypothetical 100-page PDF. The vent function is triggered from an outside application and 100 worker functions are called. Each worker processes a single page. The sink function is then invoked when all 100 are done, and we do things with the result. In theory, 1,000,000 pages should take as long as 100, since all the processing is happening in parallel. Let's go into a little bit more detail about the whole process:

  • User uploads PDF from Rails app
  • PDF is uploaded to S3 via Active Storage
  • Vent function is triggered with the file key as an argument
  • Vent function calls n number of worker functions, passing file key as an argument so workers can find the file on S3
  • Worker function processes page, stores preview image, collects information about the page and stores that in DynamoDB, and decreases by one a DynamoDB record that keeps track of the pages left
  • If the worker is the final worker in the queue, the sink function is called
  • Sink function collects all page information from DynamoDB and sends results as JSON back to the Rails app
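The fan-out step in the list above could be sketched roughly like this. It is not our production vent (which ended up in JavaScript, as described later); the worker function name, the payload fields, and the page_count argument are assumptions for illustration. The lambda_client argument is expected to respond to invoke the way Aws::Lambda::Client does, and a small stand-in client is used here so the sketch runs without AWS credentials.

```ruby
require 'json'

# Hypothetical worker function name; the real name is not shown in this post.
WORKER_FUNCTION = 'document-worker'

# Builds one payload per page and invokes the worker asynchronously.
# `lambda_client` is expected to respond to #invoke like Aws::Lambda::Client
# (function_name:, invocation_type:, payload:).
def fan_out(lambda_client:, job_id:, file_key:, page_count:)
  (1..page_count).map do |page|
    # Assumed payload shape: enough for a worker to find its page on S3.
    payload = { job_id: job_id, file_key: file_key, page: page }
    lambda_client.invoke(
      function_name:   WORKER_FUNCTION,
      invocation_type: 'Event', # asynchronous: don't wait for the worker
      payload:         JSON.generate(payload)
    )
    payload
  end
end

# Stand-in client that records calls instead of talking to AWS.
class RecordingClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def invoke(**args)
    @calls << args
  end
end

client = RecordingClient.new
fan_out(
  lambda_client: client,
  job_id:        'job-1',
  file_key:      'uploads/example.pdf',
  page_count:    3
)
puts client.calls.size
```

In a real function you would pass Aws::Lambda::Client.new (from the aws-sdk-lambda gem) instead of the recording client. Note that the counter of pages left has to be written before the workers are started, otherwise an early worker has nothing to decrement.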

DynamoDB is a NoSQL database offered by AWS. When using the fan-out, fan-in pattern you must keep track of progress so the sink function can be called when all workers are completed. Lambda functions hold no state between invocations, so an external, stateful service must be used. In our case, we are using it as a glorified counter. When a worker function finishes, we decrease the record's counter attribute (pages_left) by one:

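  require 'aws-sdk-dynamodb'

  # Decrements the job's pages_left counter and invokes the sink function
  # (defined elsewhere) once the counter reaches zero.
  def decrement_pages_left(job_id)
    db = Aws::DynamoDB::Client.new
    resp = db.update_item(
      table_name:                  "documents-#{ENV['RUBY_ENV']}",
      key:                         { job_id: job_id },
      update_expression:           'set pages_left = pages_left - :val',
      expression_attribute_values: { ':val' => 1 },
      return_values:               'UPDATED_NEW' # return the new counter value
    )

    # Each update is applied atomically, so each worker sees its own new value.
    invoke_sink_function(job_id) if resp[:attributes]['pages_left'] == 0
  end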

Note that the sink function is then called if there are no pages left to process. Easy!
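The counter has to exist before the first worker decrements it. Here is a hedged sketch of that setup step, reusing the table name, key (job_id), and attribute (pages_left) from decrement_pages_left above. The helper names and the stand-in client are made up for illustration; db is expected to respond to put_item the way Aws::DynamoDB::Client does.

```ruby
# Table name mirrors the one used in decrement_pages_left.
def documents_table
  "documents-#{ENV['RUBY_ENV']}"
end

# Creates the counter record before any worker starts.
# `db` is expected to respond to #put_item like Aws::DynamoDB::Client
# (table_name:, item:).
def initialize_pages_left(db:, job_id:, page_count:)
  raise ArgumentError, 'page_count must be positive' unless page_count.positive?

  db.put_item(
    table_name: documents_table,
    item:       { 'job_id' => job_id, 'pages_left' => page_count }
  )
end

# In-memory stand-in so the sketch runs without AWS credentials.
class FakeDynamo
  attr_reader :items

  def initialize
    @items = {}
  end

  def put_item(table_name:, item:)
    @items[[table_name, item['job_id']]] = item
  end
end

db = FakeDynamo.new
initialize_pages_left(db: db, job_id: 'job-1', page_count: 100)
puts db.items.values.first['pages_left']
```

Rejecting a page count of zero is a deliberate choice in this sketch: with no workers, nothing would ever decrement the counter, so the sink would never be called.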

For a more in-depth look into the fan-out, fan-in pattern take a look at this fantastic blog post by Yan Cui.

Knowing what language is better suited for a certain task

We initially used Ruby for our vent, worker, and sink functions. However, we hit a bottleneck in the vent function. Invoking hundreds of concurrent network requests (which are IO bound) is not one of Ruby's strong suits. Libraries that spread work across CPU cores don't help much either, because of Lambda's CPU limitations (more on that in the next section). So, what language does Lambda support that handles asynchronous, IO-bound operations out of the box? JavaScript!

Now, this isn't a bash on Ruby. All programming languages have their strengths and weaknesses, and a rewrite in a different language should be well researched and thought out beforehand. Luckily for us, we have lots of Ruby and JS experience. Also, the vent function encapsulates very little business logic, so a rewrite would be a good fit if an initial spike proved fruitful. And it did! When processing a 186-page document, the vent function took 30 seconds in Ruby, and 2 seconds in JS. Nailed it!

If you experience performance issues on Lambda, make sure to research the problem thoroughly. The cool thing about Lambda is that you can use different languages throughout the application process. If part of the process would perform much better with a different language, try it out!

CPU/RAM limitations

When assigning resources to your Lambda functions, memory is the only configurable option. Why? Well, there is no obvious answer in the UI, and even worse, it's hidden deep in a FAQ. When you choose the amount of memory you want to allocate for a function, you are given proportional CPU power and other resources. What the docs don't tell you, however, is that if you allocate enough memory, you'll be given two cores instead of one.

This Stack Overflow comment has more info. TL;DR: if you are using 1.8 GB or more of memory, you get to use two CPU cores instead of one. Even two cores is not a lot, and that's why approaches that rely on multiple cores for concurrency don't perform well on Lambda. When architecting a serverless application, it's better to split larger tasks into smaller subtasks, when possible.
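If you want to check what a function actually gets, one option is to log the configured memory next to the number of cores Ruby can see. This is a minimal sketch: AWS_LAMBDA_FUNCTION_MEMORY_SIZE is an environment variable set by the Lambda runtime, Etc.nprocessors is from Ruby's standard library, and the helper name runtime_resources is made up. Treat the core count as a hint only, since the number the OS reports may not match the CPU share Lambda actually grants.

```ruby
require 'etc'
require 'json'

# Reports the configured memory (in MB) and the CPU cores visible to Ruby.
# memory_mb is nil when the Lambda environment variable is not set,
# for example when running locally.
def runtime_resources(env = ENV)
  {
    memory_mb: env['AWS_LAMBDA_FUNCTION_MEMORY_SIZE']&.to_i,
    cpu_cores: Etc.nprocessors
  }
end

# Lambda-style handler that logs the resources and returns them.
def handler(event:, context:)
  resources = runtime_resources
  puts JSON.generate(resources)
  resources
end

handler(event: {}, context: nil)
```

Running this at a few different memory settings is a cheap way to see where the extra core shows up for your own functions.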

I hope you learned something new today! Architecting serverless applications comes with its own unique set of challenges, but the community has great solutions that have been production-tested for quite some time.

Nate Shoemaker

Nate is our resident JavaScript nerd. He loves learning about and exploring the latest and greatest front-end technologies. Outside of work, he can be found spending time with his wife, studying Elixir, and looking for any excuse to buy music gear.
