System Design

Rate Limiting

A technique used to control the number of requests a client can make to an API within a given time frame.

Native Rails 8 Rate Limiting

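This example uses the native rate_limit method in Rails 8 to limit API requests. We're limiting to 4 requests per minute, keyed by request.domain. Note that request.domain is the domain of the host being requested, so every client hitting the same domain shares one counter; the default key, request.remote_ip, limits each client separately.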

class RateLimitingController < ApplicationController
  rate_limit to: 4, 
             within: 1.minute, 
             by: -> { request.domain },
             with: -> {
               respond_to do |format|
                 format.json { render json: { error: "Rate limited" }, status: :too_many_requests }
                 format.html { head :too_many_requests }
               end
             },
             only: :create
  
  def create
    # API endpoint logic here
  end
end

Live Demo

The live demo issues API requests against this controller so you can observe the rate limiting in action. After 4 requests in a minute, further requests are rate limited.

Current Rate Limit Status

Requests made: 0 / 4
Rate limit status: 4 requests remaining.
Cache key: rate-limit:rate_limiting:softwere.xyz

How Rails 8 Rate Limiting Works

Rails 8 provides a simple DSL for rate limiting with the following parameters:

  • to: The maximum number of requests allowed (4 in our example)
  • within: The time window for the limit (1 minute)
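  • by: A lambda that returns the key used to group requests (request.domain in our example; defaults to request.remote_ip)
  • with: A lambda that renders the response when the limit is exceeded (defaults to head :too_many_requests)
  • store: The cache store that holds the counters (defaults to the controller's cache_store)
  • name: An optional label included in the cache key, which keeps multiple limits on the same controller separate
  • only/except: Action filters passed through to before_action to apply the limit selectively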

Rails implements this using a cache-based counter that tracks requests within the specified window:

def rate_limit(to:, within:, by: -> { request.remote_ip }, with: -> { head :too_many_requests }, store: cache_store, name: nil, **options)
  before_action -> { rate_limiting(to: to, within: within, by: by, with: with, store: store, name: name) }, **options
end

def rate_limiting(to:, within:, by:, with:, store:, name:)
  cache_key = ["rate-limit", controller_path, name, instance_exec(&by)].compact.join(":")
  count = store.increment(cache_key, 1, expires_in: within)
  if count && count > to
    ActiveSupport::Notifications.instrument("rate_limit.action_controller", request: request) do
      instance_exec(&with)
    end
  end
end
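
The cache key built in rate_limiting can be reproduced outside Rails. The following is a minimal sketch, assuming a hypothetical helper named rate_limit_cache_key (it is not part of Rails) that mirrors the array-compact-join expression in the code above. It rebuilds the demo's key rate-limit:rate_limiting:softwere.xyz and shows how an optional name adds a segment.

```ruby
# Hypothetical helper (not a Rails API) mirroring the cache_key expression
# used inside rate_limiting.
def rate_limit_cache_key(controller_path:, by_value:, name: nil)
  # compact drops a nil name, so an unnamed limit has no empty segment.
  ["rate-limit", controller_path, name, by_value].compact.join(":")
end

demo_key = rate_limit_cache_key(controller_path: "rate_limiting", by_value: "softwere.xyz")
named_key = rate_limit_cache_key(controller_path: "rate_limiting", by_value: "softwere.xyz", name: "burst")

puts demo_key   # rate-limit:rate_limiting:softwere.xyz
puts named_key  # rate-limit:rate_limiting:burst:softwere.xyz
```

Because the key contains the value returned by the by lambda, every request that maps to the same value increments the same counter.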

This implementation uses a simple fixed window approach where:

  • Each request increments a counter stored in the cache
  • The counter has an expiration time equal to the rate limit window
  • When the window expires, the counter resets automatically
  • If the counter exceeds the limit, the rate limit is triggered

While simple to implement, this approach can lead to uneven request distribution at window boundaries. For instance, a user could make the maximum requests at the end of one window and again at the beginning of the next window, effectively doubling the allowed rate for a short period.

[Figure: two adjacent 1-minute windows, each limited to 4 requests, with a traffic spike of 8 requests straddling the window boundary]

Fixed window rate limiting flaw: A user could send 4 requests at the end of Window 1 and 4 more at the beginning of Window 2, resulting in 8 requests in a short time span.
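
The boundary problem can be reproduced with a small simulation. This is a minimal in-memory sketch, not the Rails implementation: the FixedWindowLimiter class is hypothetical, it aligns windows to the clock (as in the figure) instead of relying on cache expiry, and it takes the current time as an argument so the timing can be controlled.

```ruby
# Illustrative fixed window limiter; the class name and API are assumptions.
class FixedWindowLimiter
  def initialize(limit:, window:)
    @limit = limit
    @window = window        # window length in seconds
    @counts = Hash.new(0)   # window index => number of requests seen
  end

  # Returns true if a request arriving at time `now` (in seconds) is allowed.
  def allow?(now)
    index = (now / @window).floor  # clock-aligned window number
    @counts[index] += 1            # count every request, then compare
    @counts[index] <= @limit
  end
end

fixed_limiter = FixedWindowLimiter.new(limit: 4, window: 60)

# Four requests in the last second of window 1, four in the first second of window 2.
timestamps = [59.0, 59.2, 59.4, 59.6, 60.0, 60.2, 60.4, 60.6]
burst_results = timestamps.map { |t| fixed_limiter.allow?(t) }

puts burst_results.count(true)      # 8
puts fixed_limiter.allow?(60.8)     # false
```

All 8 requests are accepted within 1.6 seconds even though the limit is 4 per minute; only the fifth request inside window 2 is rejected.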

Sliding Window Counter Algorithm

The sliding window counter algorithm improves upon the fixed window approach by creating a rolling window that smoothly transitions between time periods, preventing traffic spikes at window boundaries.

[Figure 4-11: number of requests over time, with a rate limit of 5 requests/min. The rolling minute ending at the current time covers 70% of the previous minute and 30% of the current minute.]

Sliding window combines counts from the current and previous windows with appropriate weighting.

The algorithm works as follows:

  • Track request counts in discrete time windows (e.g., per minute)
  • Calculate a weighted sum of the current window and previous window
  • The weight for the previous window is based on how much of the rolling window overlaps with it
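  • Formula: weighted_count = current_window_count + previous_window_count * overlap_percentage, where overlap_percentage is the fraction of the rolling window that still falls in the previous window (1 - position in the current window)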
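
The steps above can be sketched in plain Ruby. This is an illustrative in-memory version under stated assumptions, not the code behind the demo: the SlidingWindowCounter class is hypothetical, time is passed in as seconds, and only accepted requests are counted (a design choice; a limiter could also count rejected requests, as the fixed window code in Rails does).

```ruby
# Illustrative sliding window counter; the class name and API are assumptions.
class SlidingWindowCounter
  def initialize(limit:, window:)
    @limit = limit
    @window = window        # window length in seconds
    @counts = Hash.new(0)   # window index => accepted requests
  end

  # current_count + previous_count * (1 - position in current window)
  def weighted_count(now)
    index = (now / @window).floor
    position = (now % @window) / @window.to_f
    @counts[index] + @counts[index - 1] * (1.0 - position)
  end

  # Accepts the request only if it would not push the weighted count over the limit.
  def allow?(now)
    return false if weighted_count(now) + 1 > @limit

    @counts[(now / @window).floor] += 1
    true
  end
end

sliding = SlidingWindowCounter.new(limit: 5, window: 60)

first_burst = Array.new(5) { sliding.allow?(59) }  # fill window 1 just before the boundary
at_boundary = sliding.allow?(60)                   # previous window still weighs 100%
weight_at_78 = sliding.weighted_count(78)          # 30% into window 2: 0 + 5 * 0.7
later = sliding.allow?(78)

p first_burst            # [true, true, true, true, true]
p at_boundary            # false
p weight_at_78.round(2)  # 3.5
p later                  # true
```

Unlike the fixed window, a burst placed right after the boundary is rejected, and capacity returns gradually as the previous window's weight decays.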

Sliding Window Demo

The sliding window demo allows 5 requests per minute, measured over a smooth rolling window.

Sliding Window Rate Limit Status

Current window count: 0
Previous window count: 0
Position in window: 38.0%
Weighted count: 0.0 / 5
Rate limit status: 5.0 requests remaining. Resets in 37 seconds.
[Figure: current sliding window visualization. The rolling window covers 62% of the previous window t-1 (count 0) and 38% of the current window t (count 0); weighted count 0.0 / 5]
Formula: current_count + previous_count × (1 - position) = 0 + 0 × 0.62 = 0.0
Cache key: rate-limit:sliding-window:rate_limiting:1748928360:softwere.xyz
Expires at: 05:27:00 (37 seconds from now)

Key Takeaways

  • Protection Against Abuse: Rate limiting protects APIs from abuse, denial-of-service attacks, and excessive usage that could impact system performance.
  • Native Rails 8 Support: Rails 8 offers a built-in DSL for implementing rate limiting with a simple rate_limit method that can be configured based on various parameters.
  • Fixed Window Limitations: The simple fixed window approach can lead to request spikes at window boundaries, potentially allowing double the intended rate for short periods.
  • Sliding Window Improvement: The sliding window counter algorithm provides a more even distribution of requests by smoothly transitioning between time periods, preventing traffic spikes.
  • Implementation Choices: The choice of rate limiting algorithm depends on your application's needs: fixed window is simpler to implement, while sliding window provides more consistent traffic control.