#Conditional Ratelimiting
Similar to something like this:
```python
from math import ceil

import redis.asyncio as redis
from fastapi import FastAPI, Request, Depends, HTTPException, Body
from fastapi.responses import JSONResponse
from fastapi_limiter import FastAPILimiter
from fastapi_limiter.depends import RateLimiter
from pydantic import BaseModel
from starlette.responses import Response
from starlette.status import HTTP_429_TOO_MANY_REQUESTS

app = FastAPI()

# Replace 'redis://localhost:6379' with your Redis server URL
REDIS_URL = "redis://localhost:6379"


async def callback(request: Request, response: Response, pexpire: int):
    """
    default callback when too many requests
    :param request:
    :param pexpire: The remaining milliseconds
    :param response:
    :return:
    """
    expire = ceil(pexpire / 1000)
    raise HTTPException(
        HTTP_429_TOO_MANY_REQUESTS, "Too Many Requests.", headers={"Retry-After": str(expire)}
    )


@app.on_event("startup")
async def startup():
    await FastAPILimiter.init(redis.from_url(REDIS_URL), http_callback=callback)


async def rate_limit_dependency(request: Request, response: Response, data: dict = Body(...)):
    if data.get("model") != "test":
        # Apply rate limit
        await RateLimiter(times=3, seconds=1)(request, response)
        # await RateLimiter(times=10, minutes=1)(request, response) This does not work, only the first one works
    # If the model is "test", no rate limit is applied


@app.post("/data")
async def read_data(request: Request, response: Response, data: dict = Body(...), _=Depends(rate_limit_dependency)):
    # The rate limit is applied conditionally in the dependency
    return {"message": "Data received", "data": data}


if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8000)
```
I'm no expert on our middleware, but could you do something along the lines of supplying your own check_throttle_handler?
```python
from math import ceil

from litestar import Litestar, Request, Response
from litestar.exceptions import HTTPException
from litestar.middleware.rate_limit import RateLimitConfig


async def rate_limit_exceeded_handler(request: Request, response: Response, pexpire: int):
    expire = ceil(pexpire / 1000)
    raise HTTPException(status_code=429, detail="Too Many Requests.", headers={"Retry-After": str(expire)})


async def custom_throttle_handler(request: Request) -> bool:
    payload = await request.json()
    if payload.get("model") != "test":
        # some kinda redis magic here
        return True  # check this request against the rate limit
    return False  # skip rate limiting for the "test" model


rate_limit_config = RateLimitConfig(
    rate_limit=("second", 3),  # arbitrary number sry
    check_throttle_handler=custom_throttle_handler,
)

app = Litestar(..., middleware=[rate_limit_config.middleware])
```
one of the others will need to chime in 🙂
thank you
@drifting lava I actually did it in another way already
I just created another endpoint
With another ratelimit
And then I have a main endpoint
That redirects to the correct endpoints
With the correct ratelimits
Could I possibly do something like this without Redis?
@unkempt spire
Could you help me?
already did it
```python
async def throttle_handler(request: Request):
    payload = await request.json()
    client = request.client.host
    try:
        if payload.get("model").startswith("gpt-4"):
            with RateLimit(resource='users_list', client=client, max_requests=3, expire=60):
                return True
        elif payload.get("model").startswith("gpt-3.5-turbo"):
            with RateLimit(resource='users_list', client=client, max_requests=4, expire=60):
                return True
        else:
            with RateLimit(resource='users_list', client=client, max_requests=5, expire=60):
                return True
    except TooManyRequests:
        return False
```
Using this library:
Hey @toxic sparrow - someone will give this attention when they have time available. Individually pinging us to get our attention isn't helpful.
This code:
```python
async def custom_throttle_handler(request: Request):
    try:
        data = await request.json()
        print(data)
        client = request.client.host
        try:
            if data.get("model").startswith("gpt-4"):
                with RateLimiter(limit=3, period=timedelta(minutes=1)):
                    return False
            elif data.get("model").startswith("gpt-3.5-turbo"):
                with RateLimiter(limit=4, period=timedelta(minutes=1)):
                    return False
            else:
                with RateLimiter(limit=5, period=timedelta(minutes=1)):
                    return False
        except RateLimitExceeded:
            return True
    except Exception as e:
        print(e)
        return True
```
is literally unable to get the JSON payload. It raises an error when it tries to read it.
@static junco Could you help me here? I think enough time has passed for me to ping someone
It suddenly stopped working.
Can you provide an MCVE for Litestar?
It's impossibly hard to help you when you don't include the error or a way that I can reproduce it
thanks
```python
async def rate_limit_exceeded_handler(request: Request, response: Response, pexpire: int):
    expire = ceil(pexpire / 1000)
    raise HTTPException(status_code=429, detail="Too Many Requests.", headers={"Retry-After": str(expire)})


async def custom_throttle_handler(request: Request):
    data = await request.json()
    try:
        if data.get("model").startswith("gpt-4"):
            with RateLimiter(limit=3, period=timedelta(minutes=1)):
                return False
        elif data.get("model").startswith("gpt-3.5-turbo"):
            with RateLimiter(limit=4, period=timedelta(minutes=1)):
                return False
        else:
            with RateLimiter(limit=5, period=timedelta(minutes=1)):
                return False
    except RateLimitExceeded:
        return True


rate_limit_config = RateLimitConfig(
    rate_limit=("minute", 1),
    check_throttle_handler=custom_throttle_handler,
    exclude_opt_key=["/", "/v1", "/v1/models", "/v1/models/*", "/rp/*", "/rp"]
)
```
this is how I'm doing it
```python
app = litestar.Litestar(
    route_handlers=[root, v1, v1_models, v1_model_info, admin_add_key, admin_revoke_key, admin_check_key,
                    v1_chat_completions, rp_chat_completions, rp_models, rp],
    middleware=[rate_limit_config.middleware],
    cors_config=cors_config,
)
```
the ratelimiter class is from this library
How are you able to give a nice example for FastAPI here #1175212071147819100 message
But not give one for Litestar? I'm happy to help, but please try to be considerate.
I'm looking for a complete copy+paste example
with the error message
no error messages
C:\Users\ThatLukinhasGuy\miniconda3\envs\pythonProject1\python.exe C:\Users\ThatLukinhasGuy\PycharmProjects\pythonProject1\api.py
INFO: Started server process [3832]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://localhost:1337 (Press CTRL+C to quit)
INFO: 127.0.0.1:51127 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
just gives this
the rest of the code is just the routes
what about when you run with debug mode?
let me try
I forgot how to do that in Uvicorn
# Runs the app
uvicorn.run(host="localhost", port=1337, app=app, log_level="debug")
oh
add debug=True
The same thing happens with this code
Are you able to inspect payload with a debugger or print the contents?
```python
async def throttle_handler(request: Request):
    payload = await request.json()
    client = request.client.host
    try:
        if payload.get("model").startswith("gpt-4"):
            with RateLimit(resource='users_list', client=client, max_requests=3, expire=60):
                return True
        elif payload.get("model").startswith("gpt-3.5-turbo"):
            with RateLimit(resource='users_list', client=client, max_requests=4, expire=60):
                return True
        else:
            with RateLimit(resource='users_list', client=client, max_requests=5, expire=60):
                return True
    except TooManyRequests:
        return False
```
I don't see a print statement though?
I tried it now
```python
async def custom_throttle_handler(request: Request):
    payload = await request.json()
    print(payload)
    client = request.client.host
    try:
        if payload.get("model").startswith("gpt-4"):
            with RateLimit(resource='users_list', client=client, max_requests=3, expire=60):
                return True
        elif payload.get("model").startswith("gpt-3.5-turbo"):
            with RateLimit(resource='users_list', client=client, max_requests=4, expire=60):
                return True
        else:
            with RateLimit(resource='users_list', client=client, max_requests=5, expire=60):
                return True
    except TooManyRequests:
        return False
```
If you'll send me a full app that includes a route handler that is similar to what you have for fastapi, I can step through a debugger for you
I'm using Litestar
let me send the full app
I'm sorry to be difficult about this, but please review this: https://stackoverflow.com/help/minimal-reproducible-example
You have a lot of logic that's not relevant here
I get that it's easier for you to paste the whole code, but it's not easier for those who are trying to help you.
I'll be back later this afternoon and will try to review again if there's a simpler example
ok
```python
from math import ceil

import litestar
import uvicorn
from litestar import Request, Response
from litestar.exceptions import HTTPException
from litestar.middleware.rate_limit import RateLimitConfig
from redis_rate_limit import RateLimit, TooManyRequests


async def rate_limit_exceeded_handler(request: Request, response: Response, pexpire: int):
    expire = ceil(pexpire / 1000)
    raise HTTPException(status_code=429, detail="Too Many Requests.", headers={"Retry-After": str(expire)})


async def custom_throttle_handler(request: Request):
    payload = await request.json()
    client = request.client.host
    try:
        if payload.get("model") == "test":
            with RateLimit(resource='users_list', client=client, max_requests=3, expire=60):
                return True
        else:
            with RateLimit(resource='users_list', client=client, max_requests=5, expire=60):
                return True
    except TooManyRequests:
        return False


rate_limit_config = RateLimitConfig(
    rate_limit=("minute", 1),
    check_throttle_handler=custom_throttle_handler
)


@litestar.post(sync_to_thread=True, path="/")
def root() -> dict[str, str]:
    return {"Hello": "World!"}


app = litestar.Litestar(route_handlers=[root], middleware=[rate_limit_config.middleware])

uvicorn.run(host="localhost", port=1337, app=app)
```
@static junco
here is a simpler example
For anyone who wants to look further, just having await request.json() or await request.body() inside custom_throttle_handler seems to cause this issue
```python
from litestar import Litestar, Request, get
from litestar.middleware.rate_limit import RateLimitConfig


async def custom_throttle_handler(request: Request) -> bool:
    await request.json()
    return True


@get()
async def root() -> dict[str, str]:
    return {"Hello": "World!"}


app = Litestar(
    route_handlers=[root],
    middleware=[
        RateLimitConfig(
            rate_limit=("minute", 1), check_throttle_handler=custom_throttle_handler
        ).middleware
    ],
    debug=True,
)
```
So, is this solved or no?
no, they want to use the request body inside the handler, but it is not possible
#1175212071147819100 message they get this
So, are there any alternatives?
@static junco So, what can I do?
not that I know, you will have to wait and see what others are suggesting, you can raise an issue on github
FYI - I just started looking at this now
the issue with accessing Request.json() in the rate limit middleware is that the connection object that is passed through is not given access to the correct asgi receive() coroutine: https://github.com/litestar-org/litestar/blob/a2e5b7854edce5b40d23bc3493d47f579bdcddef/litestar/middleware/rate_limit.py#L73
litestar/middleware/rate_limit.py line 73
request: Request[Any, Any, Any] = app.request_class(scope)
Because it is only given scope, it defaults to empty_receive() and empty_send(), which raise exceptions when they are called.
accessing Request.json() calls Request.receive() which is actually empty_receive()
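A rough, framework-free sketch of the failure mode described above. `SketchRequest`, `empty_receive`, and `real_receive` here are toy stand-ins written for illustration, not Litestar's actual classes: a connection object built from scope alone can still answer scope questions, but any attempt to read the body hits the placeholder receive and raises.

```python
import asyncio


async def empty_receive():
    # Toy stand-in for a default receive: there is no channel to read from.
    raise RuntimeError("receive() called on a connection built from scope only")


class SketchRequest:
    """Toy request object for illustration; NOT Litestar's Request class."""

    def __init__(self, scope, receive=empty_receive):
        self.scope = scope
        self.receive = receive

    @property
    def path(self):
        # Scope data is always available, even without a receive channel.
        return self.scope["path"]

    async def body(self):
        # Body data has to be awaited from the receive channel.
        message = await self.receive()
        return message.get("body", b"")


async def real_receive():
    return {"type": "http.request", "body": b'{"model": "test"}', "more_body": False}


async def demo():
    scope = {"type": "http", "path": "/v1/chat/completions"}
    scope_only = SketchRequest(scope)           # what the middleware builds
    full = SketchRequest(scope, real_receive)   # what a route handler sees
    results = {"path": scope_only.path, "full_body": await full.body()}
    try:
        await scope_only.body()
    except RuntimeError as exc:
        results["scope_only_error"] = str(exc)
    return results


RESULT = asyncio.run(demo())
```

In this model, `RESULT` holds the path and the body read through the real channel, plus the error raised by the scope-only object, which would correspond to the 500 response seen in the log earlier in the thread.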
this will be an easy enough thing to fix - but I've noticed there's quite a bit of inconsistency in the arguments that we pass to our connection objects in different places, so I'd like to fix all of that in one go if possible
Not by me 😬
well, it worked now
🤷♂️
nvm, still happening
@unkempt spire do you have an ETA of when you are going to fix it?
no
the way you keep mentioning us in here is not helpful at all - I'm going to mute this thread now
you can follow along with https://github.com/litestar-org/litestar/issues/2744 and https://github.com/litestar-org/litestar/pull/2718 to keep yourself updated
Here are all of the places in library code that we instantiate a request object: litestar/litestar/contrib/prometheus/middleware.py Line 137 in 7f898ea request = Request[Any, Any, Any](scope, recei...
After exploring this, I don't think we actually have an issue with the behavior of the middleware, the only issue that I think we have is that we should have typed the should_check_request() method to receive an ASGIConnection instance instead of Request. This is because ASGIConnection only has methods that access the connection scope, and no methods that will await receive().
I don't think that we should encourage accessing the connection data (e.g., Request.json()) within middleware, because there might be other middleware in the stack that wants to wrap the receive() coroutine. If any earlier middleware accesses the connection data before that middleware has been called, that later middleware will be cryptically ineffective.
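To make the "cryptically ineffective" point concrete, here is a small ASGI-style sketch using only asyncio. Everything in it (`eager_reader`, `byte_counter`, the `cached_body` scope key) is hypothetical, and it assumes the body is cached once it has been read: when an outer layer reads the body early, an inner layer that wraps `receive()` never sees a single byte.

```python
import asyncio


def make_receive(body: bytes):
    """Build a one-shot receive channel that yields the body once."""
    sent = False

    async def receive():
        nonlocal sent
        if sent:
            return {"type": "http.disconnect"}
        sent = True
        return {"type": "http.request", "body": body, "more_body": False}

    return receive


async def app(scope, receive):
    # Prefer a body that an earlier layer already read and cached (assumption).
    if "cached_body" in scope:
        return scope["cached_body"]
    return (await receive()).get("body", b"")


def eager_reader(next_app):
    """Outer middleware that reads the body before passing the request on."""

    async def middleware(scope, receive):
        scope["cached_body"] = (await receive()).get("body", b"")
        return await next_app(scope, receive)

    return middleware


def byte_counter(next_app, seen):
    """Inner middleware that wraps receive() to observe body sizes."""

    async def middleware(scope, receive):
        async def counting_receive():
            message = await receive()
            seen.append(len(message.get("body", b"")))
            return message

        return await next_app(scope, counting_receive)

    return middleware


async def run(with_eager_reader: bool):
    seen = []
    stack = byte_counter(app, seen)
    if with_eager_reader:
        stack = eager_reader(stack)
    body = await stack({"type": "http"}, make_receive(b"hello"))
    return body, seen


WITHOUT_EAGER = asyncio.run(run(False))
WITH_EAGER = asyncio.run(run(True))
```

Both runs return the same body to the application, but only the run without the eager reader records anything in `seen`; with it, the wrapping middleware is silently bypassed.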
For header/client addr/user/session based rate limiting, middleware is good, b/c those are all resolved from connection scope data and don't require awaiting the connection data. For rate limiting that requires accessing the connection data, a Guard would be a better approach b/c the request that is received by the guard has already passed through the entire middleware stack.
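A minimal sketch of what the guard approach could look like for the per-model limits from earlier in the thread. It is written against a duck-typed connection (anything with `client.host` and an async `json()`) so it runs without any framework; the `(connection, route_handler)` argument order follows the guard shape, while `RateLimitExceeded`, `LIMITS`, `make_model_rate_limit_guard`, and `FakeConnection` are all hypothetical names. The counter is an in-memory fixed window, so it only works within a single process and never evicts old windows.

```python
import asyncio
import time
from collections import defaultdict
from types import SimpleNamespace


class RateLimitExceeded(Exception):
    """Stand-in for whatever HTTP 429 exception the framework provides."""


# Hypothetical per-model limits (requests per window), mirroring the thread's tiers.
LIMITS = {"gpt-4": 3, "gpt-3.5-turbo": 4}
DEFAULT_LIMIT = 5
WINDOW_SECONDS = 60


def make_model_rate_limit_guard(clock=time.monotonic):
    # (client, tier, window index) -> number of requests seen in that window.
    counters = defaultdict(int)

    async def guard(connection, _route_handler) -> None:
        # By the time a guard runs, the body can be read safely.
        payload = await connection.json()
        model = str(payload.get("model", ""))
        tier = next((prefix for prefix in LIMITS if model.startswith(prefix)), "default")
        limit = LIMITS.get(tier, DEFAULT_LIMIT)
        window = int(clock() // WINDOW_SECONDS)
        key = (connection.client.host, tier, window)
        counters[key] += 1
        if counters[key] > limit:
            raise RateLimitExceeded(f"{tier}: more than {limit} requests per window")

    return guard


class FakeConnection:
    """Test double with the two attributes the guard relies on."""

    client = SimpleNamespace(host="127.0.0.1")

    def __init__(self, model: str):
        self._model = model

    async def json(self):
        return {"model": self._model}


async def demo():
    guard = make_model_rate_limit_guard(clock=lambda: 0.0)  # frozen clock
    outcomes = []
    for _ in range(4):
        try:
            await guard(FakeConnection("gpt-4-turbo"), None)
            outcomes.append("ok")
        except RateLimitExceeded:
            outcomes.append("limited")
    return outcomes


OUTCOMES = asyncio.run(demo())
```

With the frozen clock, the first three "gpt-4" requests pass and the fourth is rejected. In a real app the stand-in exception would presumably be replaced by the framework's own 429 exception, and the guard attached to the route handler rather than called by hand.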
I couldn't find any useful way to do rate limiting (I'm assuming it's a "you can try to log in 5 times in a 5-minute window" type of thing).
Not sure if Redis could work for it though, because you need 3 bits of info for it (unless the Redis key is the client ID and the payload is a JSON list of datetimes):
- The client ID
- The datetime list, so that you can do len([x for x in attempts if x > cutoff]) or something
- The "type" of rate limit
Thinking about it... you could have a dict on the app where you build an MD5 hash or something of the client (IP, proxy IP) as the key and a list as the value, and append now() to it. Have middleware that checks the rate limiter "hive" (since the client key/ID will give the same value for that client), and then do something like a cleanup of it on every run:
For all client_keys, if the latest entry < now - x minutes, remove the client_key.
For the current key, add an attempt (append(now)) and count how many attempt items there are since now minus x. If there are too many, raise an exception.
If you want to limit a page so that, say, you can't hit this endpoint more than once every 2 minutes, you could use the same logic, I think, but instead of a list you store now() as the value and compare.
You can add that as a middleware and assign it to each route you want. You might want to generate the client key from something like md5(ip, proxy_ip, request.get_the_url_and_method_of_the_route); the middleware gets the request handler in it.
I would still implement it as a "utility" type class though, and attach it to the app so that you can also use it manually elsewhere: .attempt(client_key, type_key), and just have the middleware populate and check it and raise if need be.
```python
attempts = AttemptsClass(limit=5, minutes=5)
if attempts.check(client_key, type_key):
    raise LimitException()
attempts.add(client_key, type_key)
```
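One way the idea above might be fleshed out without Redis, as a hedged sketch: an in-memory sliding-window log keyed by (client_key, type_key). `AttemptsClass`, `check`, `add`, and `LimitException` take their names from the pseudocode; the `clock` parameter and the `attempt` helper are additions for testability and are purely illustrative. State lives in one process only, so it would not be shared across multiple workers.

```python
import time
from collections import defaultdict


class LimitException(Exception):
    """Raised when a client has used up its attempts for the window."""


class AttemptsClass:
    def __init__(self, limit: int, minutes: float, clock=time.monotonic):
        self.limit = limit
        self.window = minutes * 60  # window length in seconds
        self.clock = clock
        # The "hive": (client_key, type_key) -> list of attempt timestamps.
        self._hive = defaultdict(list)

    def _cleanup(self, now: float) -> None:
        # Drop expired timestamps, and drop keys with nothing recent left.
        cutoff = now - self.window
        for key in list(self._hive):
            recent = [stamp for stamp in self._hive[key] if stamp > cutoff]
            if recent:
                self._hive[key] = recent
            else:
                del self._hive[key]

    def check(self, client_key: str, type_key: str) -> bool:
        """Return True when the client is already at its limit."""
        self._cleanup(self.clock())
        return len(self._hive.get((client_key, type_key), [])) >= self.limit

    def add(self, client_key: str, type_key: str) -> None:
        self._hive[(client_key, type_key)].append(self.clock())


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
attempts = AttemptsClass(limit=5, minutes=5, clock=lambda: fake_now[0])


def attempt(client_key: str, type_key: str) -> None:
    if attempts.check(client_key, type_key):
        raise LimitException()
    attempts.add(client_key, type_key)


for _ in range(5):
    attempt("client-a", "login")  # the first five attempts pass

try:
    attempt("client-a", "login")
    sixth_blocked = False
except LimitException:
    sixth_blocked = True

fake_now[0] = 301.0  # just past the 5-minute window
attempt("client-a", "login")  # allowed again once old attempts expire
allowed_after_window = True
```

Cleaning up on every check keeps the dict from growing without bound, at the cost of an O(number of keys) scan per request; a periodic cleanup would be the obvious alternative if that becomes a problem.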