Releases: germanoeich/nirn-proxy
v1.3.3
v1.3.2
- Introduced the DISABLE_GLOBAL_RATELIMIT_DETECTION env var, which disables the optimistic REST global ratelimit calculation
- Fixed a case where, if the calls to /bot/gateway (made for ratelimit detection) encountered a 429, the proxy would reply with a 500
In a future (breaking) release, the automatic detection of REST limits will be permanently disabled. As it stands, it is too unreliable and relies on an endpoint with an aggressive ratelimit, which hinders the proxy's ability to scale up effectively.
Both changes stem from issues we faced in production: with ratelimit detection enabled, the proxy would call /bot/gateway when creating a new queue. That endpoint has a fairly aggressive ratelimit which, given enough nodes in the cluster, caused queue startup to throw 500s. Libraries in general will not retry a 500, so this caused the loss of requests.
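As a sketch, the new toggle is set through the environment. Only the variable name DISABLE_GLOBAL_RATELIMIT_DETECTION comes from these notes; the `true` value and the start command are assumptions about a typical deployment:

```shell
# Disable the optimistic global ratelimit detection (v1.3.2+).
# The variable name is from the release notes; the "true" value and the
# binary invocation below are assumptions -- check the project README.
export DISABLE_GLOBAL_RATELIMIT_DETECTION=true
# ./nirn-proxy   # hypothetical start command with detection disabled
echo "DISABLE_GLOBAL_RATELIMIT_DETECTION=$DISABLE_GLOBAL_RATELIMIT_DETECTION"
```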
Full Changelog: v1.3.1...v1.3.2
v1.3.1
- Removed request aborting. The idea was good, but predicting how and when ratelimits will clear is fragile, and the proxy was not built with this use case in mind; supporting it would require a full change to the internal data structures, and clients should be able to reason about when and how to abort a request anyway.
- Added a special case for some paths with @me parameters and no ratelimits, so that they are spread across the cluster, preventing a single node from handling all requests for these endpoints.
Full Changelog: v1.3.0...v1.3.1
v1.3.0
- Implemented the ability to abort requests based on how long they will wait for ratelimits [#5] - Thanks bluenix :)
- Client IPs are displayed on log entries for easier debugging
- Open connections metric now exports the method and route label for debugging route-level issues
- Fixed the proxy retrying too early for scope:shared ratelimits
- Context canceled errors were downgraded from error to warning level
- Fixed a very rare crash
v1.2.3
Fixes:
- Fix a bug introduced in v1.2.0 that caused global ratelimits to not work when running in standalone mode
- Fix a panic regarding non-utf8 characters in request paths
- Fix startup panics when in cluster mode, regarding memberlist not being ready yet
v1.2.2
Added:
- BOT_RATELIMIT_OVERRIDES env var, which allows you to override a bot's global REST limits
- /nirn/healthz endpoint for healthchecking
Fixed:
- Fixed a deadlock that happened on occasion at very high throughputs, causing a node to be unresponsive for a period of time
- Fixed nodes joining a cluster before the proxy was ready, resulting in increased error rates during deployments
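The new endpoint makes external liveness checks straightforward. A minimal sketch follows; only the /nirn/healthz path comes from these notes, while the host and port are assumptions (adjust to your deployment):

```shell
# Hypothetical healthcheck against a locally running proxy.
# The /nirn/healthz path is from the release notes; localhost:8080 is
# an assumed bind address -- substitute your own PORT setting.
HEALTH_URL="http://localhost:8080/nirn/healthz"
# curl -sf "$HEALTH_URL" && echo "proxy healthy"   # exits non-zero on failure
echo "checking $HEALTH_URL"
```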
v1.2.1
Changes:
- Adds the DISABLE_HTTP_2 env var for disabling HTTP/2 support
- Adds a hook to sanitize webhook and interaction tokens from logs
- Makes context deadline exceeded errors output to the "warn" level
Fixes:
- Unversioned API paths not being properly destructured into buckets
- getOrCreateQueue errors generating panics down the code path
v1.2.0
This release:
- Adds Bearer token support, using an LRU for bearer queues and optimizing the routing to distribute bearer requests across the cluster evenly
- Optimizes the cluster routing path to be lighter on the node responsible for the first hop
v1.1.3
This release:
- Fixes some instances where a queue could be locked on a 401 coming from interactions, which can happen during normal operation.
- Makes sure the NoAuth bucket (requests without an Auth header and no bot attached) is not lockable on 401s.
v1.1.2
This release adds:
- An error level log on the 401 queue locking mechanism in order to detect which routes are returning 401s
- The DISABLE_401_LOCK unstable env var to disable the mechanism entirely (but not the log).
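For workloads where the 401 lock is problematic, the unstable toggle can be set like so. The variable name is from the notes above; the truthy `true` value is an assumption:

```shell
# Disable the 401 queue-locking mechanism (unstable; the error log remains).
# DISABLE_401_LOCK is from the release notes; "true" as its value is assumed.
export DISABLE_401_LOCK=true
echo "DISABLE_401_LOCK=$DISABLE_401_LOCK"
```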