System Design

Rate Limiting

A technique for controlling the number of requests a client can make to an API within a given timeframe.

Native Rails 8 Rate Limiting

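This example uses the native rate_limit method in Rails 8 to limit requests to the create action. The limit is 4 requests per minute, keyed by request.domain, which is the domain of the host the request was sent to, so every client of that domain shares one counter.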

class RateLimitingController < ApplicationController
  rate_limit to: 4,
             within: 1.minute,
             by: -> { request.domain },
             with: -> {
               respond_to do |format|
                 format.json { render json: { error: "Rate limited" }, status: :too_many_requests }
                 format.html { head :too_many_requests }
               end
             },
             only: :create
  
  def create
    # API endpoint logic here
  end
end

Live Demo

The demo sends API requests to the controller above. After 4 requests in a minute, further requests are rate limited.

Cache key: rate-limit:rate_limiting:softwere.xyz

How Rails 8 Rate Limiting Works

Rails 8 provides a simple DSL for rate limiting with the following parameters:

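to: The maximum number of requests allowed within the window (4 in our example)
within: The length of the time window (1 minute)
by: A lambda that returns the key requests are counted under (request.domain in our example; defaults to request.remote_ip)
with: A lambda that runs when the limit is exceeded (defaults to head :too_many_requests)
store: The cache store that holds the counters (defaults to cache_store)
name: An optional name that becomes part of the cache key
only/except: Action filters, passed through to before_action, that apply the limit selectively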

Rails implements this using a cache-based counter that tracks requests within the specified window:

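# Class-level macro: registers a before_action that runs the check for the selected actions.
def rate_limit(to:, within:, by: -> { request.remote_ip }, with: -> { head :too_many_requests }, store: cache_store, name: nil, **options)
  before_action -> { rate_limiting(to: to, within: within, by: by, with: with, store: store, name: name) }, **options
end

# Instance-level check that runs before each rate-limited action.
def rate_limiting(to:, within:, by:, with:, store:, name:)
  # Key parts: prefix, controller path, optional name, and the key returned by the by: lambda.
  cache_key = ["rate-limit", controller_path, name, instance_exec(&by)].compact.join(":")
  # Increment the counter; expires_in bounds the length of the window.
  count = store.increment(cache_key, 1, expires_in: within)
  # A nil count skips the check; otherwise the with: handler runs once the limit is exceeded.
  if count && count > to
    ActiveSupport::Notifications.instrument("rate_limit.action_controller", request: request) do
      instance_exec(&with)
    end
  end
end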
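To make the key format concrete, here is a small plain-Ruby sketch of the key-building line above. The helper name rate_limit_cache_key and the limit name "burst" are hypothetical and used only for illustration; the first result matches the cache key shown for the demo.

```ruby
# Hypothetical helper mirroring the key-building line in rate_limiting above.
def rate_limit_cache_key(controller_path, name, client_key)
  # compact drops a nil name, so unnamed limits get no empty segment.
  ["rate-limit", controller_path, name, client_key].compact.join(":")
end

unnamed = rate_limit_cache_key("rate_limiting", nil, "softwere.xyz")
named   = rate_limit_cache_key("rate_limiting", "burst", "softwere.xyz")

puts unnamed # rate-limit:rate_limiting:softwere.xyz
puts named   # rate-limit:rate_limiting:burst:softwere.xyz
```

Because the name is part of the key, two differently named limits on the same controller would keep separate counters.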

This implementation uses a simple fixed window approach where:

Each request increments a counter stored in the cache
The counter has an expiration time equal to the rate limit window
When the window expires, the counter resets automatically
If the counter exceeds the limit, the rate limit is triggered
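The four steps above can be sketched in plain Ruby without Rails. This is a minimal illustration, not the Rails implementation: ExpiringCounterStore is a hypothetical in-memory stand-in for a cache store, FixedWindowLimiter is an illustrative name, and the clock is injected so the window expiry can be simulated.

```ruby
# Hypothetical in-memory stand-in for a cache store whose counters expire.
class ExpiringCounterStore
  def initialize(clock)
    @clock = clock   # callable returning the current time in seconds
    @entries = {}    # key => { count:, expires_at: }
  end

  # Increments the counter for key, starting a fresh counter if the old one expired.
  def increment(key, amount, expires_in:)
    now = @clock.call
    entry = @entries[key]
    if entry.nil? || now >= entry[:expires_at]
      entry = { count: 0, expires_at: now + expires_in }
      @entries[key] = entry
    end
    entry[:count] += amount
  end
end

class FixedWindowLimiter
  def initialize(limit:, window:, store:)
    @limit = limit
    @window = window
    @store = store
  end

  # Returns true while the counter for this client is within the limit.
  def allow?(client_key)
    count = @store.increment("rate-limit:#{client_key}", 1, expires_in: @window)
    count <= @limit
  end
end

now = 0
store = ExpiringCounterStore.new(-> { now })
limiter = FixedWindowLimiter.new(limit: 4, window: 60, store: store)

first_window = Array.new(5) { limiter.allow?("softwere.xyz") }
now = 60 # the counter has expired, so the next request starts a new window
after_reset = limiter.allow?("softwere.xyz")

p first_window # [true, true, true, true, false]
p after_reset  # true
```

In this sketch the window starts when the first request creates the counter, which is one plausible reading of an expiring counter; a real cache store may handle expiry differently.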

While simple to implement, this approach can lead to uneven request distribution at window boundaries. For instance, a user could make the maximum requests at the end of one window and again at the beginning of the next window, effectively doubling the allowed rate for a short period.

Fixed window rate limiting flaw: A user could send 4 requests at the end of Window 1 and 4 more at the beginning of Window 2, resulting in 8 requests in a short time span.
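The boundary burst can be reproduced with a short simulation. The sketch below assumes clock-aligned windows (each window starts at a multiple of 60 seconds), which is a simplification chosen for illustration; accepted_requests is a hypothetical helper name.

```ruby
# Returns the timestamps that a fixed window limiter would accept.
# Assumes clock-aligned windows, purely for illustration.
def accepted_requests(timestamps, limit: 4, window: 60)
  counts = Hash.new(0) # window start time => requests seen in that window
  timestamps.select do |t|
    window_start = t - (t % window)
    counts[window_start] += 1
    counts[window_start] <= limit
  end
end

# Four requests in the last two seconds of window 1, four in the first two of window 2.
burst = [58, 58, 59, 59, 60, 60, 61, 61]
accepted = accepted_requests(burst)

puts accepted.size                  # 8
puts accepted.last - accepted.first # 3
```

All 8 requests are accepted even though they arrive within about 3 seconds, twice the intended limit of 4 per minute.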

Sliding Window Counter Algorithm

The sliding window counter algorithm improves upon the fixed window approach by creating a rolling window that smoothly transitions between time periods, preventing traffic spikes at window boundaries.

Sliding window: counts from the current and previous windows are combined, with the previous window weighted by its overlap with the rolling window.

The algorithm works as follows:

Track request counts in discrete time windows (e.g., per minute)
Calculate a weighted sum of the current window and previous window
The weight for the previous window is based on how much of the rolling window overlaps with it
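Formula: current_window_count + previous_window_count * overlap_percentage, where overlap_percentage is a fraction between 0 and 1 equal to 1 minus the elapsed fraction of the current window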
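The steps above can be sketched as a small in-memory limiter. This is one possible implementation under stated assumptions, not the code behind the demo: SlidingWindowLimiter is an illustrative name, counts live in a Hash rather than a cache, rejected requests are not counted, and a request is accepted only if the weighted count including it stays within the limit.

```ruby
class SlidingWindowLimiter
  def initialize(limit:, window:, clock: -> { Time.now.to_f })
    @limit = limit
    @window = window
    @clock = clock
    # client key => { window index => request count }
    @counts = Hash.new { |hash, key| hash[key] = Hash.new(0) }
  end

  # Current window count plus the overlapping share of the previous window.
  def weighted_count(client_key, now = @clock.call)
    index = (now / @window).floor
    position = (now % @window) / @window.to_f # elapsed fraction of the current window
    counts = @counts[client_key]
    counts[index] + counts[index - 1] * (1 - position)
  end

  # Accepts the request only if the weighted count, including it, stays within the limit.
  def allow?(client_key)
    now = @clock.call
    return false if weighted_count(client_key, now) + 1 > @limit

    index = (now / @window).floor
    counts = @counts[client_key]
    counts[index] += 1
    # Only the current and previous windows are needed; drop older ones.
    counts.delete_if { |window_index, _| window_index < index - 1 }
    true
  end
end

now = 59
limiter = SlidingWindowLimiter.new(limit: 5, window: 60, clock: -> { now })

first = Array.new(6) { limiter.allow?("softwere.xyz") } # end of the first window
now = 60
at_boundary = limiter.allow?("softwere.xyz") # previous window still weighs 100%
now = 90
later = limiter.allow?("softwere.xyz")       # previous window now weighs 50%

p first       # [true, true, true, true, true, false]
p at_boundary # false
p later       # true
```

Unlike the fixed window counter, the request at the window boundary is rejected because the previous window's 5 requests still count in full.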

Sliding Window Demo

The demo applies the sliding window counter algorithm with a limit of 5 requests per minute and a smooth rolling window.

Sliding Window Rate Limit Status

Current window count: 0
Previous window count: 0
Position in window: 20.0%
Weighted count: 0.0 / 5
Rate limit status: 5.0 requests remaining. Resets in 48 seconds.

Formula: current_count + previous_count × (1 - position) = 0 + 0 × 0.8 = 0.0

Cache key: rate-limit:sliding-window:rate_limiting:1752912120:softwere.xyz
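The demo state above has no requests recorded, so the formula evaluates to zero. To see it with non-zero counts, the sketch below evaluates the same formula for a few made-up inputs; the counts and positions other than the demo state are illustrative assumptions.

```ruby
# Weighted count as in the formula: current + previous * (1 - position).
def weighted_count(current_count, previous_count, position)
  current_count + previous_count * (1 - position)
end

idle     = weighted_count(0, 0, 0.2)  # the demo state: 0 + 0 * 0.8
middle   = weighted_count(2, 4, 0.5)  # 2 + 4 * 0.5
at_limit = weighted_count(2, 4, 0.25) # 2 + 4 * 0.75

puts idle     # 0.0
puts middle   # 4.0
puts at_limit # 5.0
```

With a limit of 5, the last case has already reached the limit even though only 2 requests fall in the current window, because three quarters of the previous window still overlaps the rolling window.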

Key Takeaways

Protection Against Abuse: Rate limiting protects APIs from abuse, denial-of-service attacks, and excessive usage that could impact system performance.
Native Rails 8 Support: Rails 8 offers a built-in rate_limit method whose limit (to:), window (within:), client key (by:), and over-limit response (with:) are configurable per controller.
Fixed Window Limitations: The simple fixed window approach can lead to request spikes at window boundaries, potentially allowing double the intended rate for short periods.
Sliding Window Improvement: The sliding window counter algorithm provides a more even distribution of requests by smoothly transitioning between time periods, preventing traffic spikes.
Implementation Choices: The choice of rate limiting algorithm depends on your application's needs. Fixed window is simpler to implement, while sliding window provides more consistent traffic control.