What Is Rate Limiting?

Rate Limiting: A technique used to control how many requests a user or client can make to an API or web application within a specified time window. By rejecting requests that exceed the defined threshold, rate limiting helps prevent abuse, brute-force attacks, resource exhaustion, and excessive API usage.

Why It Matters for AI-Coded Apps

AI-generated APIs almost never include rate limiting. Login endpoints without it are vulnerable to credential stuffing and brute-force attacks, and AI feature endpoints, which call expensive LLM APIs, can be abused to run up massive API bills. Rate limiting is therefore critical for both security and cost control.

Real-World Example

A login endpoint allows unlimited attempts. An attacker uses a credential stuffing tool to try 10,000 username/password combinations per minute from a leaked database. Without rate limiting, the application processes every attempt, and the attacker gains access to every account whose owner reused a leaked password or chose a weak one.

How to Detect and Prevent It

Implement rate limiting on all public endpoints, especially authentication. Prefer sliding window algorithms, which avoid the bursts a fixed window permits at window boundaries. Apply both per-IP and per-user limits. When a client exceeds a limit, return 429 Too Many Requests with a Retry-After header. Use libraries such as express-rate-limit (Node.js) or slowapi (FastAPI), or edge-level limiting (Cloudflare, Vercel).
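
To make the sliding window and the 429 response concrete, here is a minimal in-memory sketch in Python. It is an illustration under stated assumptions, not the implementation used by any of the libraries named above: the class name SlidingWindowLimiter, the respond helper, and the key format are all hypothetical, and a production limiter would normally keep its state in a shared store.

```python
import math
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Sliding-window log: at most `limit` requests per key in any `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so behaviour can be checked without sleeping
        self.hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def check(self, key):
        """Return (allowed, seconds_until_next_slot)."""
        now = self.clock()
        q = self.hits[key]
        # Drop timestamps that have slid out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) < self.limit:
            q.append(now)
            return True, 0.0
        # The oldest accepted request leaves the window at q[0] + window.
        return False, q[0] + self.window - now


def respond(limiter, key):
    """Map a limiter decision to a hypothetical (status, headers) pair."""
    allowed, retry_after = limiter.check(key)
    if allowed:
        return 200, {}
    # Retry-After is expressed in whole seconds; round up so clients never retry early.
    return 429, {"Retry-After": str(math.ceil(retry_after))}
```

Per-IP and per-user limits can share one limiter by using different keys, for example "ip:203.0.113.7" and "user:alice", and rejecting the request if either key is over its limit.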

Frequently Asked Questions

What rate limits should I set for a login endpoint?

A common starting point is 5 failed attempts per account per 15 minutes and 20 requests per IP per minute. Once a limit is exceeded, require a CAPTCHA or impose a temporary lockout. Adjust the numbers based on your user base and threat model.
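
The two starting-point limits above could be combined roughly as follows. This is a sketch only: LoginGuard, its method names, and the returned status strings are hypothetical, and the counters live in process memory purely for illustration.

```python
import time
from collections import defaultdict, deque

FAILED_PER_ACCOUNT = (5, 15 * 60)  # 5 failed attempts per 15 minutes
REQUESTS_PER_IP = (20, 60)         # 20 requests per minute


class LoginGuard:
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.events = defaultdict(deque)  # key -> recent event timestamps

    def _count(self, key, window):
        """Number of events recorded for `key` in the last `window` seconds."""
        now = self.clock()
        q = self.events[key]
        while q and now - q[0] >= window:
            q.popleft()
        return len(q)

    def allow_attempt(self, ip, account):
        """Call before verifying the password. Returns 'ok', 'ip_limited' or 'challenge'."""
        limit, window = REQUESTS_PER_IP
        if self._count(f"ip:{ip}", window) >= limit:
            return "ip_limited"  # e.g. answer with 429
        self.events[f"ip:{ip}"].append(self.clock())
        limit, window = FAILED_PER_ACCOUNT
        if self._count(f"fail:{account}", window) >= limit:
            return "challenge"  # e.g. require a CAPTCHA or lock out temporarily
        return "ok"

    def record_failure(self, account):
        """Call after a wrong password so the per-account counter advances."""
        self.events[f"fail:{account}"].append(self.clock())
```

Counting failures per account rather than per IP means an attacker who rotates IP addresses still hits the account limit, while the per-IP limit caps how many different accounts one address can probe.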

What is the difference between rate limiting and throttling?

Rate limiting hard-blocks requests once a threshold is reached (returning a 429 error). Throttling slows requests down by adding delays or queuing them. Rate limiting is better for security, such as blocking brute-force attempts; throttling is better for graceful degradation under load.
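
The difference can be sketched with a single token bucket used in two ways. The names below (TokenBucket, try_acquire, reserve, rate_limited, throttled) are hypothetical and chosen for illustration; the token bucket is one common choice of algorithm here, not the only one.

```python
import time


class TokenBucket:
    """Refills `rate` tokens per second, holding at most `capacity` tokens."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def _refill(self):
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self):
        """Rate limiting: take a token if one is available, otherwise refuse."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def reserve(self):
        """Throttling: always take a token and return how long the caller must wait.

        The balance may go negative; the debt is paid off by waiting.
        """
        self._refill()
        self.tokens -= 1
        return max(0.0, -self.tokens / self.rate)


def rate_limited(bucket):
    """Hard block: an over-limit request is rejected immediately with 429."""
    return 200 if bucket.try_acquire() else 429


def throttled(bucket, sleep=time.sleep):
    """Soft slow-down: an over-limit request is delayed, then served."""
    wait = bucket.reserve()
    if wait > 0:
        sleep(wait)
    return 200
```

With a rate of 1 token per second and a capacity of 2, a burst of requests sees the third one rejected under rate_limited, whereas throttled serves it after a delay of about one second.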

How do I rate limit serverless functions?

Serverless functions (Vercel, AWS Lambda) don’t have persistent memory for counters: instances are short-lived and do not share state, so in-process counters are unreliable. Use an external store such as Redis (Upstash for serverless) or DynamoDB, or use edge-level rate limiting from your CDN or platform (Vercel’s built-in rate limiting, Cloudflare Rate Limiting rules).
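
As a rough sketch of the external-store approach, the function below keeps one counter per client per time window in a shared store. The CounterStore interface is hypothetical, modelled on the Redis INCR and EXPIRE commands, and InMemoryStore is only a stand-in so the example runs on its own. Note that this is a fixed-window counter, which is simpler but less precise than a sliding window.

```python
import time
from typing import Protocol


class CounterStore(Protocol):
    """Hypothetical minimal interface, modelled on Redis INCR and EXPIRE."""

    def incr(self, key: str) -> int: ...
    def expire(self, key: str, seconds: int) -> None: ...


class InMemoryStore:
    """Stand-in for local runs only; real deployments need a store shared by all instances."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.data = {}  # key -> [count, expires_at or None]

    def incr(self, key):
        entry = self.data.get(key)
        if entry is None or (entry[1] is not None and self.clock() >= entry[1]):
            entry = self.data[key] = [0, None]
        entry[0] += 1
        return entry[0]

    def expire(self, key, seconds):
        if key in self.data:
            self.data[key][1] = self.clock() + seconds


def allow(store: CounterStore, client: str, limit: int, window: int, now: float) -> bool:
    """Fixed-window counter: every function instance increments the same shared key."""
    bucket = int(now // window)  # index of the current window
    key = f"rl:{client}:{bucket}"
    count = store.incr(key)
    if count == 1:
        # First hit in this window: set a TTL so old window keys clean themselves up.
        store.expire(key, window)
    return count <= limit
```

Because the window index is part of the key, a missed expiry only leaves a stale key behind; it does not carry counts over into the next window.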
