
@th3w1zard1 (Contributor) commented Jun 13, 2025

  • Combines the Next.js frontend app (exposing port 3000) and the Python uvicorn backend (exposing port 8000).
  • Allows the environment variables HOST, PORT, and NEXTJS_PORT to configure internals.
  • Small strategic changes to improve build reliability, e.g. adding retries and increasing timeouts.
  • Comments to clarify each phase/step of the build process.

Build and run

docker buildx build -f Dockerfile.fullstack . --tag assafelovic/gpt-researcher-fullstack-test
docker run -p 3000:3000 -p 8000:8000 --env-file .env assafelovic/gpt-researcher-fullstack-test
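
For example, to move the services onto different ports (a sketch based on the variable names listed above; exact behavior depends on how Dockerfile.fullstack wires these through):

docker run -p 8080:8080 -p 9000:9000 \
  -e HOST=0.0.0.0 -e PORT=9000 -e NEXTJS_PORT=8080 \
  --env-file .env assafelovic/gpt-researcher-fullstack-test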

EDIT: Thank you for merging #1411 -- I'll see if I can submit another PR adding this to its workflow (and removing the MCP build) this weekend

AI-written explanations of the changes:


PR: Add Dockerfile.fullstack for Unified Frontend and Backend Deployment

Overview

This PR introduces a new Dockerfile.fullstack that builds and serves both the Next.js frontend and the FastAPI (uvicorn) backend in a single Docker container. This approach streamlines deployment and simplifies development or production environments where maintaining separate containers is unnecessary or impractical.

Key Components and Workflow

1. Multi-Stage Build

  • Frontend Build: Uses a Node.js image to build the Next.js frontend.
  • Browser/Backend Tools: Installs Chromium, Firefox, Chromedriver, Geckodriver, and required build tools for browser-based features or testing.
  • Backend Build: Installs all backend Python dependencies.
  • Final Stage: Combines frontend and backend artifacts, and adds Node.js, supervisord, and nginx.
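
A condensed sketch of this stage layout (stage names, base images, and paths are illustrative assumptions, not the literal contents of Dockerfile.fullstack):

# --- Stage 1: build the Next.js frontend ---
FROM node:20-alpine AS frontend-build
WORKDIR /app
COPY frontend/nextjs/ ./
RUN npm ci && npm run build

# --- Stage 2: install Python dependencies for the FastAPI backend ---
FROM python:3.11-slim AS backend-deps
WORKDIR /usr/src/app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# --- Final stage: combine artifacts; add Node.js, supervisord, and nginx ---
FROM python:3.11-slim
ARG PORT=8000
ARG NEXT_PORT=3000
ARG NEXT_INTERNAL_PORT=3001
ENV PORT=${PORT} NEXT_PORT=${NEXT_PORT} NEXT_INTERNAL_PORT=${NEXT_INTERNAL_PORT}
RUN apt-get update && apt-get install -y --no-install-recommends nginx supervisor nodejs npm \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /usr/src/app
COPY --from=backend-deps /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
COPY --from=frontend-build /app /usr/src/app/frontend/nextjs
COPY . .
EXPOSE ${NEXT_PORT} ${PORT}
CMD ["supervisord", "-n", "-c", "/etc/supervisor/supervisord.conf"]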

2. Supervisord: Unified Process Management

  • Supervisord acts as a process manager, launching and monitoring three key services within the container:
    • Uvicorn (FastAPI backend): serves the API on a configurable port (default 8000). This port is exposed by the container (ARG/ENV PORT).
    • Next.js frontend: runs the production server on an internal port (default 3001) that is not exposed outside the container (ARG/ENV NEXT_INTERNAL_PORT).
    • Nginx: acts as a reverse proxy, listening on the public-facing port (default 3000) (ARG/ENV NEXT_PORT).

Both the uvicorn backend and the Next.js app are accessible through NEXT_PORT (3000). This was done so that reverse proxies outside the container work properly. Example:

  gptr:
    build:
      context: https://github.com/assafelovic/gpt-researcher.git#main
      dockerfile: Dockerfile.fullstack
    image: assafelovic/gpt-researcher-aio:latest
    stdin_open: true
    container_name: gptr
    hostname: gptr
    expose:
      - 3000  # nginx unifying route
      - 3001  # nextjs internal port
      - 8000  # uvicorn external port
    ports:
      - 8000:8000  # unsure if this is needed.
    volumes:
      - ${CONFIG_PATH:-./configs}/gptr/logs:/usr/src/app/logs
      - ${CONFIG_PATH:-./configs}/gptr/outputs:/usr/src/app/outputs
      - ${CONFIG_PATH:-./configs}/gptr/reports:/usr/src/app/reports
    environment:
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - BRAVE_API_KEY=${BRAVE_API_KEY}
      - DEEPSEEK_API_KEY=${DEEPSEEK_API_KEY}
      - EXA_API_KEY=${EXA_API_KEY}
      - FIRECRAWL_API_KEY=${FIRECRAWL_API_KEY}
      - FIRE_CRAWL_API_KEY=${FIRE_CRAWL_API_KEY}
      - GEMINI_API_KEY=${GEMINI_API_KEY}
      - GLAMA_API_KEY=${GLAMA_API_KEY}
      - GROQ_API_KEY=${GROQ_API_KEY}
      - HF_TOKEN=${HF_TOKEN}
      - HUGGINGFACE_ACCESS_TOKEN=${HUGGINGFACE_ACCESS_TOKEN}
      - HUGGINGFACE_API_TOKEN=${HUGGINGFACE_API_TOKEN}
      - LANGCHAIN_API_KEY=${LANGCHAIN_API_KEY}
      - MISTRAL_API_KEY=${MISTRAL_API_KEY}
      - MISTRALAI_API_KEY=${MISTRALAI_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
      - PERPLEXITY_API_KEY=${PERPLEXITY_API_KEY}
      - PERPLEXITYAI_API_KEY=${PERPLEXITYAI_API_KEY}
      - REPLICATE_API_KEY=${REPLICATE_API_KEY}
      - REVID_API_KEY=${REVID_API_KEY}
      - SAMBANOVA_API_KEY=${SAMBANOVA_API_KEY}
      - SEARCH1API_KEY=${SEARCH1API_KEY}
      - SERPAPI_API_KEY=${SERPAPI_API_KEY}
      - TAVILY_API_KEY=${TAVILY_API_KEY}
      - TOGETHERAI_API_KEY=${TOGETHERAI_API_KEY}
      - UNIFY_API_KEY=${UNIFY_API_KEY}
      - UPSTAGE_API_KEY=${UPSTAGE_API_KEY}
      - UPSTAGEAI_API_KEY=${UPSTAGEAI_API_KEY}
      - YOU_API_KEY=${YOU_API_KEY}
      - CHOKIDAR_USEPOLLING=true
      - LOGGING_LEVEL=DEBUG
      - NEXT_PUBLIC_GA_MEASUREMENT_ID=${NEXT_PUBLIC_GA_MEASUREMENT_ID}
    labels:
      traefik.enable: "true"
      traefik.http.routers.gptr.service: gptr
      traefik.http.routers.gptr.rule: Host(`gptr.${DOMAIN}`)
      traefik.http.services.gptr.loadbalancer.server.port: 3000
      traefik.http.routers.gptr-legacy.service: gptr-legacy
      traefik.http.routers.gptr-legacy.rule: Host(`gptr-legacy.${DOMAIN}`)
      traefik.http.services.gptr-legacy.loadbalancer.server.port: 8000

Without the nginx solution, Traefik would need the following lines to actually work properly:

    labels:
      # ...other labels above here omitted for brevity...
      traefik.http.routers.gptr-backend.service: gptr-legacy
      traefik.http.routers.gptr-backend.rule: Host(`gptr.${DOMAIN}`) && (PathPrefix(`/ws`) || PathPrefix(`/outputs`) || PathPrefix(`/reports`))

This setup ensures all three processes start together and are automatically restarted if any of them crashes, which matters in Docker, where a container normally runs only a single foreground process.
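
For illustration, a minimal sketch of what the supervisord configuration looks like conceptually (program names, paths, and the exact commands are assumptions, not the literal file shipped in this PR):

[supervisord]
nodaemon=true

[program:backend]
; uvicorn serving the FastAPI app on the configurable backend port
command=uvicorn main:app --host 0.0.0.0 --port %(ENV_PORT)s
directory=/usr/src/app
autorestart=true

[program:frontend]
; Next.js production server on the internal-only port
command=npm run start -- -p %(ENV_NEXT_INTERNAL_PORT)s
directory=/usr/src/app/frontend/nextjs
autorestart=true

[program:nginx]
; nginx kept in the foreground so supervisord can monitor it
command=nginx -g "daemon off;"
autorestart=true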

3. Nginx: Reverse Proxy and Routing

  • Nginx serves several roles:

    • Reverse Proxy: Forwards API routes (/outputs, /reports, /ws) to the FastAPI backend.
    • Static & Frontend Routing: Proxies all other requests to the Next.js frontend server.
    • WebSocket Support: Handles WebSocket upgrades for real-time features.
    • Compression & logging: enables gzip and request logging for performance and observability (gzip on, gzip_types, access_log, error_log). Mount a volume at /var/log/nginx to persist the logs across container restarts.

    This is needed because neither Next.js nor FastAPI can efficiently serve both frontend and API routes from a shared public port, nor handle WebSockets and static assets as robustly as nginx.
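
Conceptually, the nginx configuration boils down to something like the following sketch (ports and routes follow the description above; the actual file in the PR may differ in details):

server {
    listen 3000;  # public-facing NEXT_PORT

    gzip on;
    gzip_types text/plain text/css application/json application/javascript;
    access_log /var/log/nginx/access.log;
    error_log  /var/log/nginx/error.log;

    # API and WebSocket routes are forwarded to the FastAPI backend
    location ~ ^/(ws|outputs|reports) {
        proxy_pass http://127.0.0.1:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }

    # Everything else is proxied to the Next.js production server
    location / {
        proxy_pass http://127.0.0.1:3001;
        proxy_set_header Host $host;
    }
}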

Usage

  • Build:
    docker build -f Dockerfile.fullstack -t gptr:fullstack .
  • Run:
    docker run -p 3000:3000 -p 8000:8000 gptr:fullstack
    • The application will be available on port 3000.
    • API endpoints and WebSocket connections are seamlessly proxied to the backend.

Why This Approach?

  • Simplicity: One container, one deployment, easier local testing and smaller orchestration footprint.
  • Robustness: supervisord ensures all core processes are running, or restarts them if they fail.
  • Production-Ready: nginx provides a production-grade reverse proxy, static asset delivery, and WebSocket support/routing.

@th3w1zard1 (Contributor, Author) commented Jun 13, 2025

The workflows that are failing (from #1411) can be fixed pretty easily:

  • add the GITHUB_PERSONAL_ACCESS_TOKEN so it can push the image to ghcr.io
  • add DOCKER_TOKEN and DOCKER_USERNAME so it can push to docker.io

These should be set in the repository secrets.
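
For reference, a sketch of how those secrets would typically be consumed in the workflow, using the standard docker/login-action (step names here are illustrative):

      - name: Log in to ghcr.io
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_PERSONAL_ACCESS_TOKEN }}

      - name: Log in to docker.io
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_TOKEN }}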

Apologies if I did not specify this in that PR.

Never ended up using it.
@assafelovic (Owner):

@th3w1zard1 this is great! Can you maybe include a few words about it in the docs?

@th3w1zard1 (Contributor, Author):

OK this was somewhat of a pain to figure out. So apparently I learned two things:

  • The WebSocket URLs used by the Next.js app (e.g. wss:// or ws://, such as ws://localhost:8000/ws) are resolved by the browser/client, not by the server. For longer than I care to admit I assumed the hostname:port were resolved server-side.
  • NEXT_PUBLIC_GPTR_API_URL and REACT_APP_GPTR_API_URL do absolutely nothing, and I have no idea why. For whatever reason, when the Next.js app builds and runs, it ignores these variables completely when making WebSocket calls to the backend. I checked the obvious things: the .env file, making sure the variables were actually read and prioritized over values like apiUrlInLocalStorage, and so on. Eventually I gave up and wrote a solution based on observed behavior (a rough sketch of the idea follows). Perhaps CORS is off somewhere, but I couldn't figure out where; it wasn't disabled anywhere obvious in the backend.
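
Something along these lines, a minimal sketch of deriving the WebSocket URL from the browser's location at runtime rather than from a build-time variable (an illustration of the general idea, not necessarily the exact code in this PR):

// Sketch: build the backend WebSocket URL from the page's own origin, so it
// works wherever the container is exposed (nginx proxies /ws on the same host/port).
function getWebSocketUrl(path: string = "/ws"): string {
  if (typeof window === "undefined") {
    // During server-side rendering there is no browser location; the URL is only needed client-side.
    return "";
  }
  const protocol = window.location.protocol === "https:" ? "wss:" : "ws:";
  return `${protocol}//${window.location.host}${path}`;
}

// Usage: const socket = new WebSocket(getWebSocketUrl());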

@th3w1zard1 (Contributor, Author):

Hello @assafelovic,

This PR is completely ready. Would you prefer that I update the GitHub workflow in a separate pull request, or is it acceptable to do so within this one?

Additionally, I am drafting numerous pull requests that have greatly enhanced my experience. However, navigating through git diffs can be rather tedious. As you might expect, there have been several occasions over the past few months where I was close to fully implementing a feature and preparing a clean pull request, only to be interrupted by other commitments and return to a dozen new commits, putting me effectively back where I started.

@assafelovic (Owner) left a comment:

thank you!!

@assafelovic assafelovic merged commit c368d61 into assafelovic:master Jun 26, 2025
0 of 8 checks passed
@th3w1zard1 th3w1zard1 deleted the patch-1 branch June 29, 2025 12:50