Rate Limits
Understanding API rate limits and best practices for Taam Cloud LLM services
Understanding Rate Limits
Each account has a default rate limit of 300 requests per minute (RPM) per model.
Rate limits cap how many requests you can make to our LLM API within a given time window. These limits help ensure fair resource distribution and maintain service stability.
Prevent Abuse
Protect against API misuse and abuse
Fair Usage
Ensure fair resource distribution
Stability
Maintain consistent API performance
Rate Limit Details
Best Practices
Implement Request Throttling
Add rate limiting in your application code to stay within limits:
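As a minimal sketch, a sliding-window throttle in your client code can block before each call so you never exceed the 300 RPM limit. The class name and structure below are illustrative, not part of any official SDK:

```python
import time
from collections import deque

class RequestThrottler:
    """Client-side throttle that caps requests in a sliding one-minute window."""

    def __init__(self, max_per_minute: int = 300):
        self.max_per_minute = max_per_minute
        self.timestamps = deque()  # monotonic times of recent requests

    def acquire(self) -> None:
        """Block until a request slot is free, then record the request."""
        now = time.monotonic()
        # Drop timestamps that have aged out of the 60-second window.
        while self.timestamps and now - self.timestamps[0] >= 60:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_per_minute:
            # Sleep until the oldest request leaves the window.
            time.sleep(60 - (now - self.timestamps[0]))
        self.timestamps.append(time.monotonic())

throttler = RequestThrottler(max_per_minute=300)
# Call throttler.acquire() immediately before each API request.
```

Calling `acquire()` before every request keeps your application under the limit without tracking state elsewhere.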
Add Exponential Backoff
Implement retry logic with increasing delays:
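One way to sketch this is a wrapper that doubles the delay after each failed attempt and adds jitter so concurrent clients don't retry in lockstep. `RateLimitError` here is a placeholder for whatever exception your HTTP client raises on a 429 response:

```python
import random
import time

class RateLimitError(Exception):
    """Placeholder for the error your client raises on HTTP 429."""

def with_exponential_backoff(call, max_retries=5, base_delay=1.0, max_delay=60.0):
    """Run `call`, retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            # Delay doubles each attempt, capped at max_delay; jitter
            # spreads out retries from multiple clients.
            delay = min(base_delay * 2 ** attempt, max_delay)
            time.sleep(delay + random.uniform(0, delay * 0.1))
```

For example, `with_exponential_backoff(lambda: client.chat(...))` would wait roughly 1 s, 2 s, 4 s, then 8 s between successive retries before giving up.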
Monitor Usage
Track your API usage through our dashboard: View Usage Stats
Handling Rate Limits
When you receive a 429 error, implement these handling strategies:
- Retry Later: Wait for the specified cooldown period
- Optimize Requests: Batch operations when possible
- Monitor Usage: Track your consumption patterns
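The "retry later" strategy can be sketched as a small helper that decides how long to wait after a response. Whether the API includes a `Retry-After` header on 429 responses is an assumption here; the fallback delay covers the case where it is absent:

```python
def retry_after_seconds(status_code: int, headers: dict, default: float = 5.0) -> float:
    """Return how long to wait before retrying, based on the response.

    Honors a Retry-After header (in seconds) on 429 responses if present;
    otherwise falls back to `default`. Returns 0 for non-429 responses.
    """
    if status_code != 429:
        return 0.0
    value = headers.get("Retry-After")
    if value is None:
        return default
    try:
        return float(value)
    except ValueError:
        # Retry-After may also be an HTTP date; fall back to the default here.
        return default
```

Your request loop can then sleep for `retry_after_seconds(resp.status_code, resp.headers)` before resending.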
Increasing Rate Limits
Contact Support
Join our Discord community for immediate assistance
Enterprise Needs
Book a call with our sales team for custom limits
Repeatedly exceeding rate limits may result in temporary account restrictions. Please monitor your usage and request limit increases if needed.