Rate Limiting
Quick Definition
A technique to control the rate of requests a user can make to an application, protecting against abuse and brute force attacks.
What is Rate Limiting?
Rate limiting is a security and resource management technique that controls how many requests a user, IP address, or API key can make within a specific time window. It's essential for preventing abuse, protecting against brute force attacks, and ensuring fair resource allocation.
Common rate limiting strategies include:
- Fixed window: Simple count reset at fixed intervals (e.g., 100 requests per minute)
- Sliding window: Rolling time window that smooths out the burst a fixed window can permit around its boundaries
- Token bucket: Allows bursting while maintaining average rate
- Leaky bucket: Processes requests at a fixed rate, queuing excess
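As an illustration of one of these strategies, here is a minimal token bucket sketch (class and parameter names are my own, not from any particular library): tokens refill at a steady rate, and each request spends one, so short bursts are allowed while the long-run average stays capped.

```python
import time

class TokenBucket:
    """Token bucket: tokens refill at a steady rate; each request spends one.
    A full bucket lets short bursts through while capping the average rate."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity          # maximum burst size
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=5, refill_rate=1.0)  # 5-request bursts, 1 req/s average
results = [bucket.allow() for _ in range(6)]
print(results)  # first 5 allowed, 6th denied
```

A leaky bucket differs in that requests drain at a fixed rate regardless of arrival pattern, with excess queued or dropped rather than served in a burst.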
Rate limiting can be applied based on:
- IP address
- User account or API key
- Endpoint or URL path
- Request attributes (headers, cookies)
Examples
A login endpoint might have a rate limit of 5 attempts per minute per IP address. After 5 failed login attempts, further requests are blocked for the remainder of the minute, preventing brute force password attacks.
Frequently Asked Questions
What HTTP status code should be returned when a request is rate limited?
The standard HTTP status code for rate limiting is 429 (Too Many Requests), defined in RFC 6585. The response should include a Retry-After header indicating when the client can retry. Many WAFs also support custom response pages or redirects.
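A minimal sketch of building such a response, using only Python's standard library (the helper function is hypothetical; a real framework would set these fields on its own response object):

```python
from http import HTTPStatus

def rate_limit_response(retry_after_seconds: int) -> tuple[int, dict, str]:
    """Build a minimal 429 response with a Retry-After header (illustrative)."""
    status = HTTPStatus.TOO_MANY_REQUESTS  # 429, per RFC 6585
    headers = {
        "Retry-After": str(retry_after_seconds),  # seconds until the client may retry
        "Content-Type": "text/plain",
    }
    body = f"{status.value} {status.phrase}: retry after {retry_after_seconds}s\n"
    return status.value, headers, body

code, headers, body = rate_limit_response(30)
print(code, headers["Retry-After"])  # 429 30
```

Retry-After may carry either a delay in seconds, as here, or an HTTP-date; well-behaved clients should back off accordingly.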