Limit max request size #2155

Open
Kludex opened this issue May 26, 2023 Discussed in #1516 · 13 comments · May be fixed by #2328
Labels
feature New feature or request
Milestone
Version 1.x

Comments

@Kludex
Member

Kludex commented May 26, 2023

Discussed in #1516

Originally posted by aviramha April 5, 2020
As discussed on Gitter, my opinion is that Starlette should provide a default limit for request size.
The main reason is that without it, any Starlette application is vulnerable to a very easy DoS attack.
For example, a newbie like me can write a program as follows:

from starlette.requests import Request
from starlette.responses import Response


async def app(scope, receive, send):
    assert scope['type'] == 'http'
    request = Request(scope, receive)
    data = await request.json()  # parses the entire request body in memory
    response = Response(str(data), media_type='text/plain')
    await response(scope, receive, send)

As a malicious user, I could send a 30 GB JSON body and cause the process to run out of memory.
Other frameworks, such as Django and Quart, also support this.
My proposal is to add a default limit that can be overridden in the app configuration.

@adriangb
Member

If we're going to make it a configurable middleware, it might also make sense to have some sort of timeout for the connection and for each chunk: maybe infinite by default, but definitely tunable.

Another thing to keep in mind is that this is likely something users want to control on a per-endpoint basis. If I have an app with an upload feature that expects 1 GB files, it's likely a single endpoint that expects those files, so I'd want to bump up the limit just for that endpoint. That makes me think the best strategy may be a per-endpoint middleware with a companion middleware that just tweaks the config by changing it in the scope. That would allow layering and overriding of these settings. This would be similar to #2026.
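
For illustration, the companion middleware that only tweaks the config could be as small as this (hypothetical sketch; the class name and the "max_body_size" scope key are made up for the example):

from starlette.types import ASGIApp, Receive, Scope, Send


class BodySizeLimitConfig:
    """Per-endpoint companion middleware: only overrides the limit in the ASGI scope."""

    def __init__(self, app: ASGIApp, max_body_size: int) -> None:
        self.app = app
        self.max_body_size = max_body_size

    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
        if scope["type"] == "http":
            # An enforcing middleware elsewhere in the stack would consult
            # scope.get("max_body_size", default) when counting received bytes.
            scope["max_body_size"] = self.max_body_size
        await self.app(scope, receive, send)

Wrapping only the upload endpoint's ASGI app with BodySizeLimitConfig(endpoint, max_body_size=1024 ** 3) would then raise the limit for that endpoint alone.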

@alex-oleshkevich
Member

This is a good one! I also agree that we need a global setting and per-route settings (Route + Mount).
Use case: a global limit of 1 MB, with a 10 MB limit on the photo upload endpoint.

We could add a max_body_size parameter to the request.form(), request.json(), and request.stream() functions.

@Kludex Kludex added the feature New feature or request label May 31, 2023
@Kludex Kludex added this to the Version 1.x milestone May 31, 2023
@Kludex
Member Author

Kludex commented Jun 24, 2023

Why should the ASGI application be the one to set this instead of the server?

@alex-oleshkevich
Copy link
Member

alex-oleshkevich commented Jun 24, 2023

Why should the ASGI application be the one to set this instead of the server?

Example: the global POST limit is 1 MB, but selected endpoints that upload files need 100 MB.
Setting this at the server level is global and leaves no way to override it per endpoint.

@abersheeran
Member

Adding a LimitRequestSizeMiddleware is the simplest and most forward-compatible way.
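
Roughly, such a middleware might look like the sketch below (illustration only, not a settled design; the "max_body_size" scope key for per-endpoint overrides and the error behaviour are assumptions):

from starlette.types import ASGIApp, Message, Receive, Scope, Send


class LimitRequestSizeMiddleware:
    def __init__(self, app: ASGIApp, max_body_size: int = 1024 * 1024) -> None:
        self.app = app
        self.max_body_size = max_body_size

    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        received = 0

        async def limited_receive() -> Message:
            nonlocal received
            message = await receive()
            if message["type"] == "http.request":
                received += len(message.get("body", b""))
                # Read the limit lazily so a per-endpoint override written into the
                # scope by an inner middleware is honoured.
                if received > scope.get("max_body_size", self.max_body_size):
                    # How this should surface (exception type, 413 response) is one
                    # of the details to settle in the PR.
                    raise RuntimeError("Request body exceeds the configured limit")
            return message

        await self.app(scope, limited_receive, send)

It could then be installed globally with app.add_middleware(LimitRequestSizeMiddleware, max_body_size=1024 * 1024).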

@Kludex
Member Author

Kludex commented Nov 6, 2023

Adding a LimitRequestSizeMiddleware is the simplest and most forward-compatible way.

Yeah. Shall we follow this path?

@adriangb
Member

adriangb commented Nov 6, 2023

Yes, I think someone should make a PR and we can discuss the details (override vs. min/max, whether there should be a default, etc.) there.

@adriangb
Member

adriangb commented Nov 8, 2023

Yes, I think someone should make a PR

I am someone, I made a PR 😆 : #2328

@defnull

defnull commented Nov 12, 2024

The PR was closed, but the idea is still on the table, so here are my two cents:

  • Global request size limits do not work for applications that actually want to accept file uploads on some routes. Those uploads are usually spooled to temporary files and are not limited by available memory. JSON, though, is parsed into in-memory structures. Enforcing the same limit on both types of data is not sensible.
  • Not having reasonable default limits is an invitation for developers to forget about this aspect and write vulnerable applications.
  • How do others do it? That should not matter much, but: Bottle, Django and probably many others do have size limits. Not for the request body, but for what is loaded into memory by functions like Request.json(). Werkzeug/Flask says that calling Request.get_data() without checking the request size first is a bad idea, but does it anyway when parsing JSON. Not the best role model, perhaps.
  • Frameworks that parse the request body before calling the request handler function (e.g. FastAPI) make it extra hard to be safe. You cannot check the request body size before the parsing step is triggered.
  • Request.json() is a function; adding a size limit parameter would be backwards compatible (a rough sketch follows below).
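
As an illustration of that last point, a size-limited json() could look like the sketch below (shown as a subclass for the example only; the max_size parameter is hypothetical, and None keeps today's unlimited behaviour):

import json
import typing

from starlette.requests import Request


class LimitedJSONRequest(Request):
    async def json(self, max_size: int | None = None) -> typing.Any:
        if not hasattr(self, "_json"):
            if max_size is None:
                body = await self.body()  # unchanged default behaviour
            else:
                # Enforce the limit on bytes actually read, not on Content-Length.
                received = 0
                chunks: list[bytes] = []
                async for chunk in self.stream():
                    received += len(chunk)
                    if received > max_size:
                        raise ValueError("JSON body exceeds max_size")
                    chunks.append(chunk)
                body = b"".join(chunks)
            self._json = json.loads(body)
        return self._json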

@raceychan

raceychan commented Mar 27, 2025

Hi guys,

are you still interested in this? I just scanned through your discussion and came up with something like this:

class Request(HTTPConnection):
    _form: FormData | None

    def __init__(
        self,
        scope: Scope,
        receive: Receive = empty_receive,
        send: Send = empty_send,
        max_content_length: int | None = None,
    ):
        super().__init__(scope)
        assert scope["type"] == "http"
        self._receive = receive
        self._send = send
        self._stream_consumed = False
        self._is_disconnected = False
        self._form = None
        if max_content_length is not None:
            assert max_content_length > 0
            content_length = self.headers.get("content-length")
            # Compare as an int; the raw header value is a string.
            if content_length is not None and int(content_length) > max_content_length:
                raise ValueError("Body too large")
        self._max_content_length = max_content_length

    @property
    def method(self) -> str:
        return typing.cast(str, self.scope["method"])

    @property
    def receive(self) -> Receive:
        return self._receive

    async def stream(self, chunk_size: int | None = None) -> typing.AsyncGenerator[bytes, None]:
        if hasattr(self, "_body"):
            yield self._body
            yield b""
            return
        if self._stream_consumed:
            raise RuntimeError("Stream consumed")
        buffer = bytearray()
        while not self._stream_consumed:
            message = await self._receive()
            if message["type"] == "http.request":
                body = message.get("body", b"")
                buffer.extend(body)  # Append new data to buffer
                if chunk_size:
                    while len(buffer) >= chunk_size:
                        yield buffer[:chunk_size]  # Yield chunk
                        del buffer[:chunk_size]  # Remove yielded data
                if not message.get("more_body", False):
                    self._stream_consumed = True
                    if buffer:
                        yield bytes(buffer)  # Yield remaining buffer data
            elif message["type"] == "http.disconnect":  # pragma: no branch
                self._is_disconnected = True
                raise ClientDisconnect()
        yield b""

    async def body(self, chunk_size: int | None = None) -> bytes:
        if not hasattr(self, "_body"):
            chunks: list[bytes] = []
            async for chunk in self.stream(chunk_size):
                chunks.append(chunk)
            self._body = b"".join(chunks)
        return self._body

    async def json(self) -> typing.Any:
        if not hasattr(self, "_json"):  # pragma: no branch
            body = await self.body()
            self._json = json.loads(body)
        return self._json

    async def _get_form(
        self,
        *,
        max_files: int | float = 1000,
        max_fields: int | float = 1000,
        max_part_size: int = 1024 * 1024,
        chunk_size: int | None = None
    ) -> FormData:
        if self._form is None:  # pragma: no branch
            assert (
                parse_options_header is not None
            ), "The `python-multipart` library must be installed to use form parsing."
            content_type_header = self.headers.get("Content-Type")
            content_type: bytes
            content_type, _ = parse_options_header(content_type_header)
            if content_type == b"multipart/form-data":
                try:
                    multipart_parser = MultiPartParser(
                        self.headers,
                        self.stream(chunk_size),
                        max_files=max_files,
                        max_fields=max_fields,
                        max_part_size=max_part_size,
                    )
                    self._form = await multipart_parser.parse()
                except MultiPartException as exc:
                    if "app" in self.scope:
                        raise HTTPException(status_code=400, detail=exc.message)
                    raise exc
            elif content_type == b"application/x-www-form-urlencoded":
                form_parser = FormParser(self.headers, self.stream(chunk_size))
                self._form = await form_parser.parse()
            else:
                self._form = FormData()
        return self._form

    def form(
        self,
        *,
        max_files: int | float = 1000,
        max_fields: int | float = 1000,
        max_part_size: int = 1024 * 1024,
        chunk_size: int | None = None
    ) -> AwaitableOrContextManager[FormData]:
        return AwaitableOrContextManagerWrapper(
            self._get_form(
                max_files=max_files, max_fields=max_fields, max_part_size=max_part_size, chunk_size=chunk_size
            )
        )

    async def close(self) -> None:
        if self._form is not None:  # pragma: no branch
            await self._form.close()

    async def is_disconnected(self) -> bool:
        if not self._is_disconnected:
            message: Message = {}

            # If message isn't immediately available, move on
            with anyio.CancelScope() as cs:
                cs.cancel()
                message = await self._receive()

            if message.get("type") == "http.disconnect":
                self._is_disconnected = True

        return self._is_disconnected

    async def send_push_promise(self, path: str) -> None:
        if "http.response.push" in self.scope.get("extensions", {}):
            raw_headers: list[tuple[bytes, bytes]] = []
            for name in SERVER_PUSH_HEADERS_TO_COPY:
                for value in self.headers.getlist(name):
                    raw_headers.append(
                        (name.encode("latin-1"), value.encode("latin-1"))
                    )
            await self._send(
                {"type": "http.response.push", "path": path, "headers": raw_headers}
            )
  • This assumes that a request won't have a body larger than what is claimed in Content-Length; last time I checked, either httptools or uvicorn verifies this.

  • Checking Content-Length is very cheap and easy to do, so if the goal is to defend against malicious users, it can be quite effective. We might send a fancier error response within request_response:

def request_response(
    func: typing.Callable[[Request], typing.Awaitable[Response] | Response],
) -> ASGIApp:
    """
    Takes a function or coroutine `func(request) -> response`,
    and returns an ASGI application.
    """
    f: typing.Callable[[Request], typing.Awaitable[Response]] = (
        func if is_async_callable(func) else functools.partial(run_in_threadpool, func)  # type:ignore
    )

    async def app(scope: Scope, receive: Receive, send: Send) -> None:
        try:
            request = Request(scope, receive, send)
        except ValueError:  # we might want something more specific, like RequestBodyOverSizedError
            await app_that_send_error_message(scope, receive, send)
            return

        async def inner_app(scope: Scope, receive: Receive, send: Send) -> None:
            response = await f(request)
            await response(scope, receive, send)

        await wrap_app_handling_exceptions(inner_app, request)(scope, receive, send)

    return app
  • This should be backward compatible

@alex-oleshkevich
Member

are you still interested in this? I just scanned through your discussion and came up with something like this: […]

A user can put any value into that header and make the server unreliable, so the proposed variant is not optimal.

@raceychan

raceychan commented Mar 27, 2025

@alex-oleshkevich

Yeah, then we might:

  1. maintain a received_bytes_num counter inside Request.stream and compare it to self.max_content_length
  2. raise an error if it exceeds the limit (a rough sketch follows below)

The thing is, though: if we assume the server is not reliable and receive might return an arbitrarily large body in a single message, then there is nothing we can do, since once we await receive() it is already in our memory, unless we implement a server that is reliable, right?
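
A rough sketch of that byte-counting idea (subclassing Request just for illustration; max_content_length is assumed to be set on the instance, e.g. by the __init__ shown earlier):

from starlette.requests import ClientDisconnect, Request


class SizeLimitedRequest(Request):
    max_content_length: int | None = None  # assumed to be set in __init__, as above

    async def stream(self):
        if hasattr(self, "_body"):
            yield self._body
            yield b""
            return
        if self._stream_consumed:
            raise RuntimeError("Stream consumed")
        self._stream_consumed = True
        received_bytes_num = 0
        while True:
            message = await self._receive()
            if message["type"] == "http.request":
                body = message.get("body", b"")
                received_bytes_num += len(body)
                if self.max_content_length is not None and received_bytes_num > self.max_content_length:
                    # Enforced on bytes actually received, not on the client-sent header.
                    raise ValueError("Request body exceeds max_content_length")
                if body:
                    yield body
                if not message.get("more_body", False):
                    break
            elif message["type"] == "http.disconnect":
                self._is_disconnected = True
                raise ClientDisconnect()
        yield b""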

@defnull

defnull commented Mar 27, 2025

if we assume the server is not reliable

WSGI defines that servers "should not" pass more bytes to the application than specified in the Content-Length header, if present. I skimmed the ASGI spec and it seems to be totally silent about this detail. Which means IMHO that servers cannot be trusted to actually enforce Content-Length. They should, but not doing so is not a bug.

But does that really matter for this discussion? Content-Length is optional anyway. If it's missing, then the ASGI app cannot know the content size in advance. The request may be Transfer-Encoding: chunked or terminated HTTP/1.0-style by closing half the socket. HTTP allows arbitrarily large uploads of unknown size.

A safeguard to prevent OOMs or other resource exhaustion attacks should never depend on a client specified header or undefined server behavior. The only reliable way to enforce such a limit is to count received bytes.
