
Use Gunicorn + Uvicorn to manage workers in Llama Stack on Unix systems #3883

@iamemilio

Description


🚀 Describe the new functionality needed

Reference Blog: https://medium.com/@iklobato/mastering-gunicorn-and-uvicorn-the-right-way-to-deploy-fastapi-applications-aaa06849841e

In llamastack/llama_stack/cli/stack/run.py:

uvicorn.run("llama_stack.core.server.server:create_app", **uvicorn_config)

This section of code should be able to initialize the Llama Stack server with Gunicorn as the process manager in Unix operating system environments. A sketch of what that could look like follows below.
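
For illustration only, here is a minimal sketch of how run.py could embed Gunicorn instead of calling uvicorn.run directly. It assumes Gunicorn's documented custom-application pattern (gunicorn.app.base.BaseApplication), the uvicorn.workers.UvicornWorker worker class, and Gunicorn >= 20.1 for the "module:factory()" import syntax; the bind address, port, and worker count are placeholder assumptions, not Llama Stack's actual configuration:

import multiprocessing

from gunicorn import util
import gunicorn.app.base


class GunicornRunner(gunicorn.app.base.BaseApplication):
    """Embed Gunicorn so the CLI can launch it without shelling out."""

    def __init__(self, app_uri, options=None):
        self.app_uri = app_uri
        self.options = options or {}
        super().__init__()

    def load_config(self):
        # Copy only the settings Gunicorn actually recognizes into its config.
        for key, value in self.options.items():
            if key in self.cfg.settings and value is not None:
                self.cfg.set(key.lower(), value)

    def load(self):
        # Resolve "module:create_app()" to the ASGI app. With preload_app
        # left off (the default), this runs in each worker after fork.
        return util.import_app(self.app_uri)


if __name__ == "__main__":
    GunicornRunner(
        "llama_stack.core.server.server:create_app()",
        {
            "bind": "0.0.0.0:8321",  # placeholder bind address and port
            "workers": multiprocessing.cpu_count() * 2 + 1,
            # Newer Uvicorn releases move this class to the separate
            # uvicorn-worker package (uvicorn_worker.UvicornWorker).
            "worker_class": "uvicorn.workers.UvicornWorker",
        },
    ).run()

The equivalent deployment is reachable from the shell, e.g. gunicorn 'llama_stack.core.server.server:create_app()' --worker-class uvicorn.workers.UvicornWorker --workers 4 --bind 0.0.0.0:8321. This also shows why the proposal is Unix-only: Gunicorn's pre-fork master process does not run on Windows.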

💡 Why is this needed? What if we don't build it?

FastAPI can handle more concurrent workloads when Gunicorn is used as the process manager with Uvicorn workers. This will not change any functional behavior of Llama Stack, only its production performance. This deployment pattern is also known to integrate more smoothly with OpenTelemetry auto-instrumentation, although that is no longer a blocker for Llama Stack at the moment.

Other thoughts

None


Labels

enhancement (New feature or request)
