How the Ratelimit Demo Works

A deep dive into the architecture and implementation of our global ratelimit performance comparison

Overview

This demo compares the latency of two ratelimit services, Unkey and Upstash Redis (a popular Redis-based approach), across six global regions.

The comparison runs on Vercel Edge Runtime, which automatically routes requests to the nearest edge location, providing a realistic test of how each service performs from different geographic locations.

Important: We provisioned the Upstash Redis database in AWS region us-east-1. This is also where Unkey is hosted (although Unkey itself is globally distributed), which keeps the infrastructure consistent and the comparison fair.

Architecture

Frontend (React/Next.js)

  • Test Configuration: Users can set the rate limit (requests per window) and duration (10s, 60s, or 5m)
  • Parallel Testing: Simultaneously tests all 6 regions when you click "Test" (a client-side sketch follows this list)
  • Real-time Visualization: Displays latency data in line charts and bar charts using Recharts
  • Data Persistence: Uses browser localStorage to maintain test history across sessions
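
As a rough illustration of the parallel testing step, the sketch below fires one POST per regional endpoint from the browser and collects the results. The REGIONS list matches the routes described later; the request body shape and the testAllRegions name are assumptions for illustration, not the demo's actual source.

// Illustrative sketch: one POST per regional endpoint, all in flight at once
const REGIONS = ["bom1", "fra1", "iad1", "kix1", "lhr1", "sfo1"] as const;

async function testAllRegions(limit: number, duration: number) {
  const results = await Promise.all(
    REGIONS.map(async (region) => {
      const res = await fetch(`/${region}`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ limit, duration }),
      });
      return { region, ...(await res.json()) };
    }),
  );
  return results; // one result object per region
}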

Backend API Routes

Each region has its own API endpoint that runs on Vercel's Edge Runtime:

  • /bom1: Mumbai, India
  • /fra1: Frankfurt, Germany
  • /iad1: Washington, DC
  • /kix1: Osaka, Japan
  • /lhr1: London, UK
  • /sfo1: San Francisco, CA

Each endpoint is configured with preferredRegion to ensure the code runs in that specific geographic location.
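
For example, the Frankfurt route might declare its runtime and region at the top of the file like this (the file path is illustrative):

// app/fra1/route.ts (illustrative path): pin this route to Frankfurt on the Edge Runtime
export const runtime = "edge";
export const preferredRegion = ["fra1"];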

How the Test Works

1. User Identity

Each user gets a unique identifier stored in a cookie. This ensures consistent ratelimiting across test runs and prevents interference between different users.
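
One way such an identity could be derived in an edge handler is sketched below; the cookie name ratelimit-demo-id and the use of crypto.randomUUID() are assumptions for illustration, not necessarily what the demo does.

// Hypothetical helper: read the user id from the request cookie, or mint a new one
const ID_COOKIE = "ratelimit-demo-id";

function getOrCreateUserId(request: Request): { id: string; setCookie?: string } {
  const cookieHeader = request.headers.get("cookie") ?? "";
  const match = cookieHeader.match(new RegExp(`${ID_COOKIE}=([^;]+)`));
  if (match) return { id: match[1] };

  // First visit: generate an identifier and ask the browser to store it
  const id = crypto.randomUUID();
  return { id, setCookie: `${ID_COOKIE}=${id}; Path=/; SameSite=Lax` };
}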

2. Parallel Execution

When you click "Test", the frontend makes simultaneous POST requests to all 6 regional endpoints. Each endpoint:

  • Configures both Unkey and Upstash ratelimiters with your specified settings
  • Uses performance.now() server-side, within the regional function, to capture precise timing and accurate latency metrics
  • Runs both ratelimit checks in parallel using Promise.all()
  • Returns the results, including latency measurements (a simplified handler sketch follows this list)
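
A simplified sketch of what one regional handler could look like. The request body shape is assumed; the unkey and upstash clients are the ones constructed under Implementation Details below, and timedLimit is sketched under Latency Measurement.

// Illustrative handler for one region; other regions differ only in the region name
export async function POST(request: Request): Promise<Response> {
  const { limit, duration } = await request.json(); // user's test configuration
  const { id } = getOrCreateUserId(request);        // cookie identity from step 1

  // Run both ratelimit checks in parallel; timedLimit wraps each call
  // with performance.now() timing
  const [unkeyResult, upstashResult] = await Promise.all([
    timedLimit(() => unkey.limit(`${id}-unkey-fra1`)),
    timedLimit(() => upstash.limit(`${id}-upstash-fra1`)),
  ]);

  return Response.json({ time: Date.now(), unkey: unkeyResult, upstash: upstashResult });
}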

3. Latency Measurement

Latency is measured from the moment the ratelimit request starts until the response is received (see the timing sketch after this list). This includes:

  • Network round-trip time to the ratelimit service
  • Processing time within the ratelimit service
  • Any connection establishment overhead
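
A minimal sketch of such a measurement wrapper; the timedLimit name is ours, used in the handler sketch above, and is not necessarily what the demo calls it.

// Hypothetical wrapper: measure a single ratelimit call in milliseconds
async function timedLimit<T extends object>(check: () => Promise<T>): Promise<T & { latency: number }> {
  const start = performance.now();  // high-resolution timestamp before the call
  const result = await check();     // network round-trip + service processing
  const latency = performance.now() - start;
  return Object.assign({}, result, { latency });
}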

Implementation Details

Unkey Ratelimiting

// The package exports Ratelimit; it is aliased here to tell the two clients apart
import { Ratelimit as UnkeyRatelimit } from "@unkey/ratelimit";

// limit and duration come from the user's test configuration
const unkey = new UnkeyRatelimit({
  namespace: "ratelimit-demo",
  rootKey: env().RATELIMIT_DEMO_ROOT_KEY,
  limit,
  duration,
});

// Key is scoped to this user, service, and region
const result = await unkey.limit(`${id}-unkey-${region}`);

  • Uses Unkey's ratelimit service
  • Each user gets a unique key per region to prevent cross-region interference
  • Supports sliding window ratelimiting

Upstash Redis Ratelimiting

import { Ratelimit as UpstashRatelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

// Redis.fromEnv() reads the Upstash connection details from environment variables
const upstash = new UpstashRatelimit({
  redis: Redis.fromEnv(),
  limiter: UpstashRatelimit.slidingWindow(limit, duration),
});

const result = await upstash.limit(`${id}-upstash-${region}`);

  • Uses Upstash's global, Redis-compatible database with its primary in us-east-1
  • Implements sliding window algorithm
  • Separate keys per user and region for fair comparison

Response Format

Each regional endpoint returns:

{
  "time": 1703123456789,
  "unkey": {
    "success": true,
    "limit": 10,
    "remaining": 9,
    "reset": 1703123466789,
    "latency": 45.2
  },
  "upstash": {
    "success": true,
    "limit": 10,
    "remaining": 9,
    "reset": 1703123466789,
    "latency": 127.8
  }
}
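
In TypeScript terms, the shape above could be described like this (the type names are ours):

// Result of a single ratelimit check, plus the measured latency
type RatelimitResult = {
  success: boolean;   // whether the request was allowed
  limit: number;      // configured requests per window
  remaining: number;  // requests left in the current window
  reset: number;      // unix timestamp (ms) when the window resets
  latency: number;    // measured latency in milliseconds
};

// Payload returned by each regional endpoint
type RegionResponse = {
  time: number; // unix timestamp (ms) when the test ran
  unkey: RatelimitResult;
  upstash: RatelimitResult;
};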

Data Visualization

Charts

  • Bar Chart: Compares latest latency across all regions side-by-side
  • Line Charts: Show latency trends over time for each service (see the Recharts sketch after this list)
  • Region Cards: Display detailed metrics for the most recent test
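
For illustration, a line chart like the ones described above might be wired up with Recharts roughly like this; the component name, data shape, and exact color values are assumptions.

import { LineChart, Line, XAxis, YAxis, Tooltip, Legend } from "recharts";

// One latency value per region for a single service, per test run (shape assumed)
type Row = { time: number; bom1: number; fra1: number; iad1: number; kix1: number; lhr1: number; sfo1: number };

// Region colors follow the Color Coding section below (CSS color names as placeholders)
const REGION_COLORS: Record<string, string> = {
  bom1: "purple", fra1: "teal", iad1: "pink", kix1: "brown", lhr1: "green", sfo1: "yellow",
};

export function ServiceLatencyChart({ data }: { data: Row[] }) {
  return (
    <LineChart width={700} height={300} data={data}>
      <XAxis dataKey="time" tickFormatter={(t) => new Date(t).toLocaleTimeString()} />
      <YAxis unit="ms" />
      <Tooltip />
      <Legend />
      {Object.entries(REGION_COLORS).map(([region, color]) => (
        <Line key={region} type="monotone" dataKey={region} stroke={color} dot={false} />
      ))}
    </LineChart>
  );
}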

Color Coding

Each region has a consistent color across all visualizations, making it easy to track performance patterns:

  • Mumbai: Purple
  • Frankfurt: Teal
  • Washington DC: Pink
  • Osaka: Brown
  • London: Green
  • San Francisco: Yellow

Why This Matters

Real-World Performance

This demo provides realistic performance data because:

  • Tests run from actual edge locations where your users would be
  • Includes real network latency and geographic distribution effects
  • Uses the same deployment architecture (Vercel Edge) many applications use

Key Performance Factors

  • Geographic Distribution: How close the ratelimit service is to your users
  • Network Optimization: How well the service handles global connectivity
  • Infrastructure Design: Whether the service is built for edge computing

What the Results Show

Typically, you'll observe that Unkey demonstrates consistently lower latency across regions due to its edge-native architecture, while traditional Redis-based solutions may show higher latency, especially in regions far from the Redis primary.

Technical Notes

Edge Runtime

All API routes use export const runtime = "edge" to ensure they run on Vercel's Edge Runtime rather than Node.js, providing faster cold starts and better geographic distribution.

Measurement Precision

We use performance.now() for sub-millisecond timing precision, ensuring accurate latency measurements even for very fast operations. The timing runs server-side, inside each regional function, so measurements stay consistent and are not affected by the client's own connection.

Reset Functionality

The "Reset" button clears the user cookie, effectively giving you a fresh identity for testing. This is useful when you want to test different scenarios without being affected by previous rate limit state.

Data Persistence

Test results are stored in browser localStorage, so you can refresh the page or come back later and still see your test history. Data is scoped to the specific demo instance.
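
A minimal sketch of that persistence, using an assumed storage key and the RegionResponse type sketched under Response Format:

// Storage key is illustrative; the demo scopes data to its own instance
const STORAGE_KEY = "ratelimit-demo-history";

function loadHistory(): RegionResponse[] {
  const raw = localStorage.getItem(STORAGE_KEY);
  return raw ? (JSON.parse(raw) as RegionResponse[]) : [];
}

function saveResult(result: RegionResponse) {
  const history = loadHistory();
  history.push(result);
  localStorage.setItem(STORAGE_KEY, JSON.stringify(history));
}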