
execute future in tokio::spawn causes more memory consumption. #7064

Closed
dream-1ab opened this issue Jan 2, 2025 · 8 comments
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug.

Comments

@dream-1ab

dream-1ab commented Jan 2, 2025

Version
Rust: rustc 1.83.0 (90b35a623 2024-11-26)

PS /media/dreamlab/Development/Project/meshel_customer_service/backend/new/customer_service/target/release> cargo tree | grep tokio
│   │   └── tokio v1.42.0
│   │       └── tokio-macros v2.4.0 (proc-macro)
│   │   ├── tokio v1.42.0 (*)
│   ├── tokio v1.42.0 (*)
│   ├── tokio-tungstenite v0.26.1
│   │   ├── tokio v1.42.0 (*)
│   │   ├── tokio v1.42.0 (*)
│   │   ├── tokio-util v0.7.13
│   │   │   └── tokio v1.42.0 (*)
│   │   └── tokio v1.42.0 (*)
│   │   └── tokio v1.42.0 (*)
│   ├── tokio v1.42.0 (*)
│   ├── tokio-rustls v0.26.1
│   │   └── tokio v1.42.0 (*)
├── tokio v1.42.0 (*)
    │   ├── tokio v1.42.0 (*)
    ├── tokio v1.42.0 (*)
    ├── tokio-util v0.7.13 (*)

Platform

Linux dreamlab-xiaomibookpro162022 6.5.0-1024-oem #25-Ubuntu SMP PREEMPT_DYNAMIC Mon May 20 14:47:48 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Description

I'm working on a server-side project that uses Rust + tokio + tower + axum.
I suddenly noticed that my hello-world axum HTTP server takes almost 100 MB of RAM when I run a load test with ab -c 1000 -n 500000 http://0.0.0.0:7999/.
Eventually I found out why this simple hello-world program takes 100 MB of RAM:
it happens when I run the server initialization inside tokio::spawn.
It consumes 10x more memory than running without tokio::spawn.

I tried this code:

main.rs

use std::{future::Future, time::Duration};

use axum::{http::{HeaderName, HeaderValue}, routing::get, Router};
// use axum_extensions::request_counter::RequestCounter;
use tower_http::cors::Any;

mod axum_extensions;

#[tokio::main]
async fn main() {
    let (host, port) = (
        std::env::var("SERVER_HOST").unwrap_or("0.0.0.0".to_string()),
        std::env::var("SERVER_PORT").unwrap_or("7999".to_string()).parse().unwrap_or(7999),
    );

    let app = Router::new()
        .route("/", get(|| async {
            "hello world"
        }))
        // .layer(RequestCounter::new())
        .layer(tower_http::cors::CorsLayer::new().allow_headers(Any).allow_methods(Any).allow_origin(Any))
        .layer(tower_http::set_header::SetResponseHeaderLayer::appending(HeaderName::from_static("developer"), HeaderValue::from_static("Meshel DreamLab software technologies")))
        .layer(tower_http::set_header::SetResponseHeaderLayer::appending(HeaderName::from_static("server"), HeaderValue::from_static("Rust + Tokio + Hyper + Axum")))
    ;

    let task = async move {
        let tcp_server = tokio::net::TcpListener::bind((host, port)).await.unwrap();
        axum::serve(tcp_server, app).await.unwrap();
    };
    
//****************OVER HERE, Switch comment/comment out those two function to try.*******************
    // run_without_spawn(task).await;
    run_with_spawn(task).await;
//***************************************************
}

async fn run_without_spawn<T>(future: impl Future<Output = T>) {
    future.await;
}

async fn run_with_spawn<T: Send + 'static>(future: impl Future<Output = T> + Send + 'static) {
    tokio::spawn(future).await.unwrap();
}

Cargo.toml

[package]
name = "customer_service"
version = "0.1.0"
edition = "2021"

[dependencies]
axum = { version = "0.8.1", features = ["ws"] }
neo4rs = { version = "0.8.0", features = ["json", "serde_json"] }
serde = { version = "1.0.217", features = ["derive"] }
serde_json = "1.0.134"
tokio = { version = "1.42.0", features = ["full"] }
tower = { version = "0.5.2", features = ["full"] }
tower-http = { version = "0.6.2", features = ["full"] }

[code sample that causes the bug]
You can reproduce the same problem by swapping which of the two calls is commented out: comment `run_without_spawn(task).await;` and uncomment `run_with_spawn(task).await;` (or vice versa).

    //run_without_spawn(task).await;
    run_with_spawn(task).await;

Here is the expected result (without using tokio::spawn):

(screenshot: Screenshot_20250103_021748)

Here is a screenshot after the 500,000-request load test with ab -c 1000 -n 500000 http://0.0.0.0:7999/ when using tokio::spawn:

(screenshot: Screenshot_20250103_021701)

The memory never goes back to normal (meaning around 10 MB).

@dream-1ab dream-1ab added A-tokio Area: The main tokio crate C-bug Category: This is a bug. labels Jan 2, 2025
@Darksonn
Contributor

Darksonn commented Jan 3, 2025

Please try to measure the memory using this utility:

use core::sync::atomic::{AtomicUsize, Ordering::Relaxed};
use std::alloc::{GlobalAlloc, Layout, System};

struct TrackedAlloc {}

#[global_allocator]
static ALLOC: TrackedAlloc = TrackedAlloc;

static TOTAL_MEM: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for TrackedAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ret = System.alloc(layout);
        if !ret.is_null() {
            TOTAL_MEM.fetch_add(layout.size(), Relaxed);
        }
        ret
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        TOTAL_MEM.fetch_sub(layout.size(), Relaxed);
        System.dealloc(ptr, layout);
    }

    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {
        let ret = System.alloc_zeroed(layout);
        if !ret.is_null() {
            TOTAL_MEM.fetch_add(layout.size(), Relaxed);
        }
        ret
    }
    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        let ret = System.realloc(ptr, layout, new_size);
        if !ret.is_null() {
            TOTAL_MEM.fetch_add(new_size.wrapping_sub(layout.size()), Relaxed);
        }
        ret
    }
}
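For reference, here is a self-contained sketch of how a counter like this behaves in practice: a minimal tracking allocator (only `alloc`/`dealloc`; the other `GlobalAlloc` methods fall back to their defaults) plus a small helper, `measure_one_mib`, that reports how many tracked bytes holding a 1 MiB buffer adds. The helper name is illustrative, not part of the snippet above.

```rust
use core::sync::atomic::{AtomicUsize, Ordering::Relaxed};
use std::alloc::{GlobalAlloc, Layout, System};

// Running total of live heap bytes, maintained by the wrapper below.
static TOTAL_MEM: AtomicUsize = AtomicUsize::new(0);

struct TrackedAlloc;

#[global_allocator]
static ALLOC: TrackedAlloc = TrackedAlloc;

unsafe impl GlobalAlloc for TrackedAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ret = System.alloc(layout);
        if !ret.is_null() {
            TOTAL_MEM.fetch_add(layout.size(), Relaxed);
        }
        ret
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        TOTAL_MEM.fetch_sub(layout.size(), Relaxed);
        System.dealloc(ptr, layout);
    }
}

// Tracked growth caused by holding a 1 MiB buffer: read the counter,
// allocate, read again, then drop the buffer.
fn measure_one_mib() -> usize {
    let before = TOTAL_MEM.load(Relaxed);
    let buf = vec![0u8; 1 << 20];
    let during = TOTAL_MEM.load(Relaxed);
    drop(buf);
    during.saturating_sub(before)
}

fn main() {
    println!("1 MiB allocation adds {} tracked bytes", measure_one_mib());
}
```

Note that this counts bytes the program has allocated and not yet freed, which is a different quantity from the RSS a system monitor shows: the allocator may keep freed pages cached, so RSS can stay high while this counter drops.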

@dream-1ab
Author

> Please try to measure the memory using this utility: […]

without tokio::spawn:

242440 bytes
236 kilobytes
0 megabytes

with tokio::spawn:

242696 bytes
237 kilobytes
0 megabytes

The results are identical, so why does tokio::spawn consume more memory in the system monitor? Is this because of memory fragmentation, or the memory allocator's cache?

@Darksonn
Contributor

Darksonn commented Jan 3, 2025

Memory allocators often hold on to memory you are not using so that future allocations are faster. That's most likely what is happening. Of course, fragmentation could also be a factor. Have you tried with jemalloc?
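To make the "allocator holds on to memory" point concrete: on Linux with glibc (the default allocator here, since nothing else is configured), freed memory sits in malloc's arenas and the non-standard glibc function `malloc_trim` asks it to return what it can to the kernel. This is a Linux/glibc-specific sketch, separate from the jemalloc suggestion; the function is declared directly so no external crate is needed.

```rust
// glibc extension: release free heap memory back to the OS.
// Returns 1 if any memory was released, 0 otherwise.
extern "C" {
    fn malloc_trim(pad: usize) -> i32;
}

// Safe wrapper used below; illustrative name, not a std API.
fn trim() -> i32 {
    unsafe { malloc_trim(0) }
}

fn main() {
    // Simulate a load spike: allocate and drop ~64 MiB.
    let spike: Vec<Vec<u8>> = (0..64).map(|_| vec![0u8; 1 << 20]).collect();
    drop(spike);

    // Without this call, glibc may keep those pages cached and RSS
    // stays high even though the program freed everything.
    let released = trim();
    println!("malloc_trim returned {released}");
}
```

Whether RSS actually drops depends on fragmentation and on which arena the pages landed in, which is why multi-threaded runtimes (each worker thread gets its own arena) can show higher steady-state RSS than a single-threaded run of the same program.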

@dream-1ab
Author

It's okay if a small amount of memory is cached by the allocator for future use, but 100 MB is not a small amount, I think.
In my case system memory is sufficient, so this may not be a problem, but what happens if system memory is not enough?
Does the system send a low-memory signal to all running applications so that they release memory they are holding but not currently using?

I will try the same thing with jemalloc.

@dream-1ab
Author

> Memory allocators often hold on to memory you are not using so that future allocations are faster. That's most likely what is happening. Of course, fragmentation could also be a factor. Have you tried with jemalloc?

with the following modification:

use std::{future::Future, time::Duration};

use axum::{http::{HeaderName, HeaderValue}, routing::get, Router};
use axum_extensions::request_counter::RequestCounter;
use routes::accounts::account_management_route;
use tower_http::cors::Any;

mod axum_extensions;
mod routes;


use core::sync::atomic::{AtomicUsize, Ordering::Relaxed};
use std::alloc::{GlobalAlloc, Layout, System};

static _JEMALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc{};

struct TrackedAlloc;

#[global_allocator]
static ALLOC: TrackedAlloc = TrackedAlloc;

static TOTAL_MEM: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for TrackedAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ret = _JEMALLOC.alloc(layout);
        if !ret.is_null() {
            TOTAL_MEM.fetch_add(layout.size(), Relaxed);
        }
        ret
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        TOTAL_MEM.fetch_sub(layout.size(), Relaxed);
        _JEMALLOC.dealloc(ptr, layout);
    }

    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {
        let ret = _JEMALLOC.alloc_zeroed(layout);
        if !ret.is_null() {
            TOTAL_MEM.fetch_add(layout.size(), Relaxed);
        }
        ret
    }
    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        let ret = _JEMALLOC.realloc(ptr, layout, new_size);
        if !ret.is_null() {
            TOTAL_MEM.fetch_add(new_size.wrapping_sub(layout.size()), Relaxed);
        }
        ret
    }
}


#[tokio::main]
async fn main() {
    let (host, port) = (
        std::env::var("SERVER_HOST").unwrap_or("0.0.0.0".to_string()),
        std::env::var("SERVER_PORT").unwrap_or("7999".to_string()).parse().unwrap_or(7999),
    );

    let app = Router::new()
        .route("/", get(|| async {
            "Welcome to our customer service."
        }))
        .route("/memory", get(|| async {
            let bytes_of_mem = TOTAL_MEM.load(Relaxed);
            format!("{} bytes = {} kilobytes = {} megabytes", bytes_of_mem, bytes_of_mem / 1024, bytes_of_mem / 1024 / 1024)
        }))
        .nest("/api/v1", Router::new()
            .nest("/account", account_management_route().await)
        )
        .layer(RequestCounter::new())
        .layer(tower_http::cors::CorsLayer::new().allow_headers(Any).allow_methods(Any).allow_origin(Any))
        .layer(tower_http::set_header::SetResponseHeaderLayer::appending(HeaderName::from_static("developer"), HeaderValue::from_static("Meshel DreamLab software technologies")))
        .layer(tower_http::set_header::SetResponseHeaderLayer::appending(HeaderName::from_static("server"), HeaderValue::from_static("Rust + Tokio + Hyper + Axum")))
    ;

    let task = async move {
        let tcp_server = tokio::net::TcpListener::bind((host, port)).await.unwrap();
        axum::serve(tcp_server, app).await.unwrap();
    };
    
    tokio::spawn(task).await.unwrap();
}

jemalloc with tokio::spawn:

250928 bytes = 245 kilobytes = 0 megabytes

Plasma system monitor:

(screenshot: Screenshot_20250103_185429)

jemalloc without tokio::spawn:

250352 bytes = 244 kilobytes = 0 megabytes

Plasma system monitor:

(screenshot: Screenshot_20250103_185631)

@Darksonn
Contributor

Darksonn commented Jan 3, 2025

Jemalloc does give cached memory back to the OS, but only after a delay. And it doesn't happen if the application is completely idle. You can try configuring jemalloc with background_thread:true,tcache_max:4096 to let freeing of cached memory happen in the background even if the application isn't calling into the allocator. See jemalloc's tuning page for more info.
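A sketch of how that configuration might be applied, assuming the tikv-jemallocator crate is used as in the snippet above: jemalloc reads its options from the `MALLOC_CONF` environment variable, and builds that prefix jemalloc's symbols (as tikv-jemalloc-sys does on some platforms) read `_RJEM_MALLOC_CONF` instead. The binary name is the one from this project's Cargo.toml.

```
# Option string per jemalloc's TUNING docs; try the prefixed variable
# if the unprefixed one has no effect with your build.
MALLOC_CONF="background_thread:true,tcache_max:4096" ./customer_service
```

`background_thread:true` lets jemalloc decay and return cached pages on its own schedule even when the application is idle, and `tcache_max` caps the size class served from per-thread caches.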

@dream-1ab
Author

> Jemalloc does give cached memory back to the OS, but only after a delay. And it doesn't happen if the application is completely idle. You can try configuring jemalloc with background_thread:true,tcache_max:4096 to let freeing of cached memory happen in the background even if the application isn't calling into the allocator. See jemalloc's tuning page for more info.

After enabling the background_thread cargo feature of tikv-jemallocator, memory consumption during load testing goes up to 200+ MB and drops back to ~15 MB within about 10 seconds after the load test ends.

In my case this is totally acceptable, and it doesn't happen during normal execution.
Thank you.

@Darksonn
Contributor

Darksonn commented Jan 3, 2025

You're welcome.

@Darksonn Darksonn closed this as completed Jan 3, 2025