Rate Limiting

Controls that restrict how many API requests a client can make within a given time period. AI API providers use rate limits to ensure fair usage, prevent abuse, and manage capacity. Limits are typically expressed in requests per minute (RPM) or tokens per minute (TPM).
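
As an illustration of a requests-per-minute limit, here is a minimal sketch of a sliding-window limiter in Python. The class name `SlidingWindowLimiter`, its parameters, and the choice of a sliding window are assumptions made for this example; actual providers may enforce limits with a different algorithm (token bucket, fixed window, and so on).

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical limiter: at most `max_requests` in any rolling window."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.timestamps = deque()  # times of accepted requests, oldest first

    def allow(self):
        """Return True and record the request if it fits within the limit."""
        now = self.clock()
        # Drop accepted requests that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_requests:
            self.timestamps.append(now)
            return True
        return False


# Usage: a limit of 3 requests per minute.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60.0)
for i in range(4):
    print(i, "allowed" if limiter.allow() else "rejected")
```

A tokens-per-minute limit could follow the same pattern by tracking the token count of each accepted request instead of counting requests.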

Related terms

API (Application Programming Interface)
Inference
Throughput