Rate limits + quotas
Every accepted ingest request flows through two gates: a per-second rate limit (token bucket) and a monthly event quota. Both are per-workspace. They are independent — a workspace can be rate-limited without exhausting its quota, and vice versa.
Per-second rate limit
A token bucket with refill rate RATE_LIMIT_RPS and bucket capacity RATE_LIMIT_BURST. Defaults: 100 rps, 200 burst.
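The gate described above can be sketched roughly as follows. This is a minimal illustration, assuming a continuously refilled bucket that starts full. The class name `TokenBucket`, its `allow` method, and the injectable `clock` are hypothetical and not taken from the service's code.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token bucket (hypothetical; not the service's implementation)."""

    def __init__(self, rate: float = 100.0, capacity: float = 200.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate          # tokens added per second (RATE_LIMIT_RPS)
        self.capacity = capacity  # maximum stored tokens (RATE_LIMIT_BURST)
        self.clock = clock        # injectable so the bucket can be tested without sleeping
        self.tokens = capacity    # assumption: start full so an idle workspace can burst
        self.last = clock()

    def allow(self) -> bool:
        """Refill for the elapsed time, then try to spend one token."""
        now = self.clock()
        refill = (now - self.last) * self.rate
        self.tokens = min(self.capacity, self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # bucket empty: the caller would answer 429
```

With the defaults, an idle workspace can send 200 requests at once, after which it is held to about 100 requests per second until the bucket refills.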
When the bucket is empty, the request returns:
    HTTP/1.1 429 Too Many Requests
    Retry-After: 1
    X-Ratelimit-Reason: per_second_rate_limit
    Content-Type: application/json

    { "error": { "code": "rate_limit_exceeded", "message": "per-second rate limit exceeded" } }

The SDK retry orchestrator treats 429 as transient and backs off (see Retry). The Retry-After header is always 1 for this gate, because the bucket regains tokens within one second at the configured refill rate.
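As an illustration of how a client might honor this response, here is a minimal sketch. It assumes exponential backoff with full jitter, which is a common policy but not necessarily the one the SDK uses (the Retry page is the authority). The function name `retry_delay` and the `base` and `cap` parameters are hypothetical.

```python
import random
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str], attempt: int,
                base: float = 0.5, cap: float = 30.0) -> Optional[float]:
    """Return seconds to wait before retrying a 429, or None for any other status."""
    if status != 429:
        return None  # this sketch only covers the per-second rate-limit case
    # Assumed policy: exponential backoff with full jitter, capped at `cap` seconds.
    backoff = random.uniform(0.0, min(cap, base * (2 ** attempt)))
    try:
        # Assumes the caller passes headers keyed exactly as "Retry-After".
        server_hint = float(headers.get("Retry-After", "0"))
    except ValueError:
        server_hint = 0.0  # the HTTP-date form of Retry-After is not handled here
    # Never retry sooner than the server asked.
    return max(server_hint, backoff)
```

Taking the maximum of the two values keeps the client from retrying before the one-second hint while still spreading later retries out.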
Monthly quota
A per-workspace monthly event ceiling. Default: 10_000_000 events per calendar month. Two thresholds:
- Soft ceiling (QUOTA_SOFT_PCT, default 80%): the request still succeeds, but a warning header is attached so operators can set up alerts. Header: X-Ratelimit-Reason: monthly_quota_soft.
- Hard ceiling (100%): subsequent requests return:
    HTTP/1.1 402 Payment Required
    X-Ratelimit-Reason: monthly_quota_exceeded
    Content-Type: application/json

    { "error": { "code": "monthly_quota_exceeded", "message": "workspace monthly event quota exhausted" } }

The SDK retry orchestrator treats 402 as permanent for the rest of the month: there is no point retrying when the cause is a quota that won’t reset until midnight on the 1st. This is a deliberate choice to prevent retry storms from sites whose plan is over-limit.
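The two thresholds can be pictured as a single decision on the month's running count. The sketch below is illustrative only: the function name `quota_decision` and its return shape are hypothetical, and it assumes the soft ceiling applies once usage is at or above the percentage and the hard ceiling once usage has reached the quota.

```python
from typing import Optional, Tuple


def quota_decision(used: int, quota: int = 10_000_000,
                   soft_pct: int = 80) -> Tuple[bool, Optional[str]]:
    """Return (accepted, X-Ratelimit-Reason value or None).

    `used` is the number of events already counted this calendar month.
    """
    if used >= quota:
        # Hard ceiling: the request would be rejected with 402.
        return False, "monthly_quota_exceeded"
    # Integer arithmetic avoids float rounding at the boundary.
    if used * 100 >= quota * soft_pct:
        # Soft ceiling: still accepted, but flagged via the warning header.
        return True, "monthly_quota_soft"
    return True, None
```

With the defaults, the warning header would first appear at 8,000,000 counted events and rejections would begin at 10,000,000.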
Per-workspace overrides
Defaults set via env are platform-wide. Per-workspace overrides live in the workspaces row:
| Column | Purpose |
|---|---|
| rate_limit_rps | Per-second refill rate. NULL falls back to env default. |
| rate_limit_burst | Bucket capacity. NULL falls back to env default. |
| quota_monthly_events | Monthly hard ceiling. NULL falls back to env default. |
Operators set these via direct SQL in v1.0 (a workspaces dashboard endpoint that exposes these fields lands in v1.1).
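The NULL fallback can be sketched as below. This is a hedged illustration: the `Limits` type, `ENV_DEFAULTS`, and `effective_limits` are hypothetical names, the row is modeled as a plain mapping with NULL represented as `None`, and the defaults are hard-coded to the values quoted on this page rather than read from the environment.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class Limits:
    rps: float
    burst: int
    monthly_events: int


# Platform-wide defaults as documented above (hard-coded for this sketch).
ENV_DEFAULTS = Limits(rps=100, burst=200, monthly_events=10_000_000)


def effective_limits(row: Mapping[str, Any],
                     defaults: Limits = ENV_DEFAULTS) -> Limits:
    """Resolve a workspace's limits, using the default wherever a column is NULL."""

    def pick(column: str, fallback: Any) -> Any:
        value = row.get(column)
        # Compare against None explicitly so an override of 0 is not discarded.
        return fallback if value is None else value

    return Limits(
        rps=pick("rate_limit_rps", defaults.rps),
        burst=pick("rate_limit_burst", defaults.burst),
        monthly_events=pick("quota_monthly_events", defaults.monthly_events),
    )
```

For example, a row that sets only rate_limit_burst to 500 would resolve to 100 rps, 500 burst, and 10,000,000 monthly events.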
What gets counted
Every accepted event counts against the monthly quota, including those
that are subsequently dropped by the bot or internal-traffic filter (in
Drop mode they were “accepted” by the rate limiter and then dropped by
the filter, so they do count). DLQ rows count.
What does NOT count:
- Requests rejected at auth / consent (4xx never reaches the counter).
- Health checks (/healthz and /metrics bypass the limiter entirely).
Performance
The token-bucket check is benchmarked at p99 < 1 µs, well inside the 5 ms ingest budget. The quota counter is incremented once per accepted event inside the same Postgres transaction as the storage write, so it adds no extra round trip.
Monitoring
Alerts to consider:
- Rate-limit drops sustained for more than 5 min for a single workspace: usually a rogue client or an under-tuned default. The syntarie_events_dropped_total{reason="rate_limit"} metric carries no per-workspace label by design (cardinality), so you must cross-check against operator logs to identify the workspace.
- Soft ceiling exceeded: alert per-workspace and contact the customer. Catching this before the hard ceiling avoids end-of-month surprises.
- Hard ceiling exceeded: alert per-workspace; the customer’s events are not flowing.
The structured-log line that fires on each rate-limit decision carries the workspace id at debug level for forensic analysis.