The Standalone Email Assistant is a LangGraph-powered workflow that triages incoming email, coordinates tools, and drafts high-quality replies using Google's Gemini models. The project showcases how to layer human-in-the-loop checkpoints, durable memory, and Gmail automation on top of a modular graph so you can evolve an assistant from a simple responder into a production-ready agent.
- LangGraph 1.0 graph lineup with sync durability baked in: baseline responder, HITL, persistent memory, and Gmail-native automation.
- Multi-mode streaming (`updates`/`messages`/`custom`) wired through `get_stream_writer()` plus the `stream_progress` helper for live progress events.
- Gemini 2.5 tool orchestration (`tool_choice="any"`) with guardrails for spam handling, no-reply detection, and deterministic evaluation modes.
- Rich tool belt including calendar scheduling, email drafting, spam labeling, and reminder automation.
- Durable execution via SQLite-backed checkpoints/stores so interrupts and memory survive process restarts.
- Optional S-Class reminder worker that escalates follow-ups via Gmail labels and background polling.
| Name | What it adds | Typical use |
|---|---|---|
| `email_assistant` | Core triage → respond loop with Gemini Pro and basic tools. | Playground runs, unit demos. |
| `email_assistant_hitl` | Human-in-the-loop approvals before tools execute. | Agent Inbox, supervised pilots. |
| `email_assistant_hitl_memory` | Persistent memory with `SqliteSaver` plus response preference routing. | Long-running or personalized assistants. |
| `email_assistant_hitl_memory_gmail` | Gmail API integration, spam tooling, and structured outputs for evaluators. | Default production/test target. |
See AGENTS.md for a deep dive into routing, tool behavior, and the latest feature upgrades.
- Triage – Classifies each email (respond, ignore/notify, spam). Fallback logic defaults to respond so flows never stall.
- Respond – Drafts replies, schedules meetings, or escalates reminders using LangGraph tools.
- Tools & Integrations – `send_email_tool`, `schedule_meeting_tool`, Gmail spam helper, and more. HITL wraps sensitive actions.
- Memory & Checkpoints – SQLite checkpoints and stores persist agent state and preference history across runs.
- Human Interrupts – HITL agents pause on tool calls using `HumanInterrupt`. Auto-accept can be enabled for demos/tests.
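The triage fallback can be illustrated with a small sketch. This is not the project's router: `normalise_triage_label` and `VALID_LABELS` are hypothetical names chosen for the example, and only the respond/ignore/notify/spam categories and the default-to-respond behaviour come from the description above.

```python
from typing import Optional

# Hypothetical helper for illustration; the real triage_router is not shown here.
VALID_LABELS = {"respond", "ignore", "notify", "spam"}


def normalise_triage_label(raw: Optional[str]) -> str:
    """Map a raw classifier output to a known label, defaulting to 'respond'."""
    label = (raw or "").strip().lower()
    # Anything unrecognised (empty output, typos, novel labels) falls back to
    # 'respond' so the graph always has a next step to take.
    return label if label in VALID_LABELS else "respond"
```

Defaulting to `respond` errs on the side of handling an email rather than silently dropping it, which matches the "flows never stall" goal.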
flowchart TD
START --> TRIAGE[triage_router]
TRIAGE --> APPLY[apply_reminder_actions_node]
TRIAGE -->|notify/HITL| HITL[triage_interrupt_handler]
APPLY --> RESPONSE[response_agent]
RESPONSE --> MARK[mark_as_read_node]
MARK --> END
HITL -->|respond| APPLY
HITL -->|ignore / accept| END
This diagram reflects the production email_assistant_hitl_memory_gmail graph: every run starts in triage, optionally pauses for HITL before reminder actions are applied, and exits after finalising the Gmail thread. Additional diagrams for the other agents live in DIAGRAMS.md.
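To make the routing easier to trace, the edges of the diagram can be written down as a plain lookup table. This is only a sketch: the node names are copied from the flowchart, but the routing keys (`"respond"`, `"notify"`, `"ignore"`, `"accept"`) and the `walk` helper are assumptions made for the example, not the project's state values or API.

```python
# Edges copied from the flowchart. The routing keys are labels invented for
# this sketch; the unlabelled TRIAGE --> APPLY edge is assumed to mean "respond".
EDGES = {
    "START": {"default": "triage_router"},
    "triage_router": {
        "respond": "apply_reminder_actions_node",
        "notify": "triage_interrupt_handler",
    },
    "triage_interrupt_handler": {
        "respond": "apply_reminder_actions_node",
        "ignore": "END",
        "accept": "END",
    },
    "apply_reminder_actions_node": {"default": "response_agent"},
    "response_agent": {"default": "mark_as_read_node"},
    "mark_as_read_node": {"default": "END"},
}


def walk(triage_decision: str, reviewer_choice: str = "respond") -> list:
    """Return the nodes visited for a given triage decision and HITL choice."""
    path, node = [], "START"
    while node != "END":
        options = EDGES[node]
        if "default" in options:
            node = options["default"]
        elif node == "triage_router":
            node = options[triage_decision]
        else:  # triage_interrupt_handler: the reviewer decides
            node = options[reviewer_choice]
        path.append(node)
    return path
```

For example, `walk("notify", "ignore")` visits the interrupt handler and then ends without drafting a reply, while `walk("respond")` runs straight through to `mark_as_read_node`.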
- Python 3.11+
- `uv` (recommended) or `pip`
- Google Gemini API access via `GOOGLE_API_KEY`
- Optional: Gmail API credentials for the Gmail agent (`setup_gmail.py`)
uv venv
source .venv/bin/activate
uv pip install -e .
# or: pip install -e .

Create a `.env` file (referenced by `langgraph.json`) with at least:
GOOGLE_API_KEY=...
GEMINI_MODEL=gemini-2.5-pro
Model helper defaults: `email_assistant.configuration.get_llm` now normalises provider/model pairs via `init_chat_model`, defaulting to `google_genai:gemini-2.5-pro`. Override the provider by setting `EMAIL_ASSISTANT_MODEL_PROVIDER` or prefixing the model value (e.g., `google_genai:gemini-2.0-pro-exp`).
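The provider/model normalisation might look roughly like the sketch below. It is not the code in `get_llm`: `resolve_model_spec` is a hypothetical helper, it assumes the model value is read from `GEMINI_MODEL`, and it assumes an explicit `provider:model` prefix takes precedence over `EMAIL_ASSISTANT_MODEL_PROVIDER`. The README does not state the actual precedence.

```python
import os
from typing import Mapping, Optional, Tuple

DEFAULT_PROVIDER = "google_genai"   # default provider stated in this README
DEFAULT_MODEL = "gemini-2.5-pro"    # default model stated in this README


def resolve_model_spec(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (provider, model) from environment-style settings."""
    env = os.environ if env is None else env
    raw = env.get("GEMINI_MODEL", "").strip() or DEFAULT_MODEL
    provider = env.get("EMAIL_ASSISTANT_MODEL_PROVIDER", "").strip() or DEFAULT_PROVIDER
    if ":" in raw:
        # Assumption: a "provider:model" prefix wins over the env-level provider.
        prefix, _, model = raw.partition(":")
        return prefix, model
    return provider, raw


# The pair could then be handed to LangChain, for example:
#   init_chat_model(model, model_provider=provider)
```

Passing a mapping instead of reading `os.environ` directly keeps the helper easy to exercise in tests.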
Helpful toggles (leave unset for live runs):
- `HITL_AUTO_ACCEPT=1` – auto-accept tool interrupts.
- `EMAIL_ASSISTANT_SKIP_MARK_AS_READ=1` – skip Gmail `mark_as_read` call.
- `EMAIL_ASSISTANT_EVAL_MODE=1` – deterministic, offline tool calls.
- `EMAIL_ASSISTANT_RECIPIENT_IN_EMAIL_ADDRESS=1` – evaluator compatibility mode.
- `EMAIL_ASSISTANT_MODEL_PROVIDER=google_genai` – explicit provider override for `init_chat_model` (defaults to `google_genai`).
- `EMAIL_ASSISTANT_SQLITE_TIMEOUT=60` – optional override (seconds) for SQLite busy timeouts when running LangSmith traces or parallel judges; defaults to 30.
- `EMAIL_ASSISTANT_TRACE_TIMEZONE=Australia/Sydney` – override the timezone used when auto-grouping LangSmith projects (`email-assistant-AGENT-YYYYMMDD`). Defaults to `Australia/Sydney`.
- `EMAIL_ASSISTANT_TRACE_DEBUG=1` – log LangGraph stream events and tracing metadata to stdout (useful when validating custom streaming progress).
- `EMAIL_ASSISTANT_TRACE_STAGE` / `EMAIL_ASSISTANT_TRACE_TAGS` – append rollout metadata to LangSmith runs for multi-stage deploys.
- `EMAIL_ASSISTANT_TIMEZONE=Australia/Melbourne` – default runtime timezone used by scripts and reminders when no explicit timezone is provided.
- `EMAIL_ASSISTANT_JUDGE_PROJECT=email-assistant:judge` – base LangSmith project name for Gemini judge evaluations (date suffix appended automatically).
- `EMAIL_ASSISTANT_LOG_PATH=...` / `EMAIL_ASSISTANT_LOG_LEVEL=DEBUG` – configure the shared log file (`logs/email_assistant.log` by default) that captures both frontend and backend logger output when the package is imported.
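As a rough illustration of how such toggles can be read in one place, here is a minimal sketch. `Toggles` and `load_toggles` are hypothetical names, and the rule that only the literal value `1` enables a flag is an assumption; the environment variable names and the 30-second SQLite default are taken from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Toggles:
    hitl_auto_accept: bool
    eval_mode: bool
    skip_mark_as_read: bool
    sqlite_timeout: float  # seconds


def _flag(env: Mapping[str, str], name: str) -> bool:
    # Assumption: only "1" switches a flag on; unset or anything else is off.
    return env.get(name, "").strip() == "1"


def load_toggles(env: Optional[Mapping[str, str]] = None) -> Toggles:
    """Collect the boolean toggles and the SQLite timeout from the environment."""
    env = os.environ if env is None else env
    return Toggles(
        hitl_auto_accept=_flag(env, "HITL_AUTO_ACCEPT"),
        eval_mode=_flag(env, "EMAIL_ASSISTANT_EVAL_MODE"),
        skip_mark_as_read=_flag(env, "EMAIL_ASSISTANT_SKIP_MARK_AS_READ"),
        # Falls back to the documented default of 30 seconds when unset or empty.
        sqlite_timeout=float(env.get("EMAIL_ASSISTANT_SQLITE_TIMEOUT") or 30),
    )
```

Leaving every variable unset yields the live-run configuration: all flags off and a 30-second busy timeout.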
- All notebooks, scripts, and tests request `stream_mode=["updates","messages","custom"]`; the `custom` channel surfaces `stream_progress` events for live demos and automated logs.
- `scripts/run_real_outputs.py --stream` mirrors the production agent stream and prints each channel as it arrives. Combine with `EMAIL_ASSISTANT_TRACE_DEBUG=1` when you need verbose instrumentation.
- LangSmith project helpers respect `EMAIL_ASSISTANT_TRACE_PROJECT`, `EMAIL_ASSISTANT_TRACE_STAGE`, and `EMAIL_ASSISTANT_TRACE_TAGS`, keeping traces grouped when replaying multi-mode streaming runs (`EMAIL_ASSISTANT_TRACE_PROJECT` changes the base name; the daily suffix still applies).
- See `dev_tickets/LangChain-LangGraph-v1-implementation-ticket.md` for the upgrade log covering the LangGraph 1.0 migration and streaming instrumentation decisions.
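When several stream modes are requested at once, LangGraph yields `(mode, chunk)` pairs, so a consumer has to route each chunk by its mode. The sketch below shows that routing on stand-in data. `split_by_mode` is a hypothetical helper and the payloads in `sample` are invented for illustration; real chunks would come from the graph's `stream(...)` call.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple


def split_by_mode(events: Iterable[Tuple[str, Any]]) -> Dict[str, List[Any]]:
    """Group (mode, chunk) pairs by stream mode, preserving arrival order."""
    buckets: Dict[str, List[Any]] = defaultdict(list)
    for mode, chunk in events:
        buckets[mode].append(chunk)
    return dict(buckets)


# Stand-in events with made-up payloads. In "messages" mode each chunk is a
# (message_chunk, metadata) pair; a plain string stands in for the message here.
sample = [
    ("updates", {"triage_router": {"classification": "respond"}}),
    ("custom", {"progress": "drafting reply"}),
    ("messages", ("Hi", {"langgraph_node": "response_agent"})),
]

grouped = split_by_mode(sample)
```

A live consumer would more likely print or log each chunk as it arrives; grouping is used here only to keep the example easy to inspect.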
langgraph up # Start LangGraph Studio
# or run a graph directly
langgraph run email_assistant_hitl_memory_gmail

Select the graph that matches your use case (`langgraph.json` lists all available graphs). Increase `recursion_limit` (e.g., 100) in Studio for long tool sequences.
- HITL agents surface approvals through LangGraph Agent Inbox (`HumanInterrupt`).
- Resume payloads can accept, edit, or provide feedback (see README_LOCAL.md for samples).
- Set `HITL_AUTO_ACCEPT=1` for automated acceptance during tests or demos.
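The three kinds of resume payload can be sketched as plain dictionaries. This assumes the payloads follow the Agent Inbox response shape of a `type` plus `args`; the helper names are hypothetical and the `subject` argument in the usage note is invented, so treat README_LOCAL.md as the authoritative source for real samples.

```python
from typing import Any, Dict


def accept() -> Dict[str, Any]:
    """Approve the pending tool call unchanged."""
    return {"type": "accept", "args": None}


def edit(action: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Approve the tool call with modified arguments."""
    return {"type": "edit", "args": {"action": action, "args": args}}


def feedback(text: str) -> Dict[str, Any]:
    """Send free-text feedback back to the agent instead of running the tool."""
    return {"type": "response", "args": text}
```

For instance, `edit("send_email_tool", {"subject": "Re: Friday sync"})` would describe an approval that rewrites the subject before the email is sent, assuming the tool accepts a `subject` argument.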
- Live Gemini suites:
  - `pytest tests/test_live_smoke.py --agent-module=email_assistant_hitl_memory_gmail`
  - `pytest tests/test_live_hitl_spam.py --agent-module=email_assistant_hitl_memory_gmail`
  - `pytest tests/test_response.py --agent-module=email_assistant_hitl_memory_gmail -k tool_calls`
- Offline/deterministic runs: set `EMAIL_ASSISTANT_EVAL_MODE=1` (and optionally `EMAIL_ASSISTANT_UPDATE_SNAPSHOTS=1`).
- SQLite lock avoidance: tests now auto-configure unique `EMAIL_ASSISTANT_CHECKPOINT_PATH`/`EMAIL_ASSISTANT_STORE_PATH` values, but when running scripts manually, set them to fresh locations (e.g. `/tmp/checkpoints.sqlite`) to avoid `OperationalError: database is locked` from earlier runs.
- `python scripts/run_tests_langsmith.py` mirrors the tool-call suite and records traces when LangSmith is configured.
- Enable Gemini 2.5 Flash judging (`EMAIL_ASSISTANT_LLM_JUDGE=1`) to score correctness/tool usage; add `EMAIL_ASSISTANT_JUDGE_STRICT=1` to fail on judge verdicts.
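For manual script runs, one way to get fresh SQLite locations is to create a new temporary directory per run and point both path variables at it. The sketch below is only an illustration: `fresh_sqlite_env` is a hypothetical helper, while the two environment variable names come from the list above.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def fresh_sqlite_env(prefix: str = "email_assistant_") -> dict:
    """Return env overrides pointing at SQLite files in a brand-new directory."""
    run_dir = Path(tempfile.mkdtemp(prefix=prefix))
    return {
        "EMAIL_ASSISTANT_CHECKPOINT_PATH": str(run_dir / "checkpoints.sqlite"),
        "EMAIL_ASSISTANT_STORE_PATH": str(run_dir / "store.sqlite"),
    }


overrides = fresh_sqlite_env()
os.environ.update(overrides)  # set before the script imports the agent

# sqlite3's `timeout` is how long (in seconds) a connection waits on a locked
# database before raising OperationalError.
conn = sqlite3.connect(overrides["EMAIL_ASSISTANT_CHECKPOINT_PATH"], timeout=30)
conn.close()
```

Because each call creates a new directory, two runs never share a database file and so cannot contend for the same lock.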
The reminder stack now includes a LangGraph dispatcher plus a background worker:
- `triage_router` batches cancel/create intents and defers HITL-created reminders by persisting create actions to the reminder store's pending queue until the reviewer approves them.
- `apply_reminder_actions_node` executes batched operations atomically through `ReminderStore.apply_actions()` and exposes the outcome as `reminder_dispatch_outcome`.
- `triage_interrupt_handler` replays pending reminder actions only after a reviewer chooses to respond, keeping notify flows HITL-first.
- `scripts/reminder_worker.py` promotes due reminders:
  - Configure labels/timers with `REMINDER_*` environment variables.
  - Run once: `python scripts/reminder_worker.py --once`
  - Continuous loop: `python scripts/reminder_worker.py --loop` (tmux/cron examples live in README_LOCAL.md)
- See `notebooks/reminder_flow.ipynb` for a diagram + code sample showing the dispatcher in action.
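The batch-then-apply pattern can be illustrated with a toy store. `ToyReminderStore`, its schema, and the action dictionaries are all invented for this sketch and are not the project's `ReminderStore`; the point is only to show deferred create actions and an all-or-nothing batch using a SQLite transaction.

```python
import sqlite3


class ToyReminderStore:
    """Toy stand-in for illustration only; not the project's ReminderStore."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE reminders (thread_id TEXT PRIMARY KEY, due TEXT NOT NULL)"
        )
        self.pending = []  # create actions waiting for reviewer approval

    def defer(self, action: dict) -> None:
        """Queue an action without touching the reminders table."""
        self.pending.append(action)

    def apply_actions(self, actions: list) -> int:
        """Apply every action in the batch, or none if any of them fails."""
        with self.conn:  # commits on success, rolls back if a statement raises
            for a in actions:
                if a["op"] == "cancel":
                    self.conn.execute(
                        "DELETE FROM reminders WHERE thread_id = ?", (a["thread_id"],)
                    )
                elif a["op"] == "create":
                    self.conn.execute(
                        "INSERT INTO reminders VALUES (?, ?)", (a["thread_id"], a["due"])
                    )
                else:
                    raise ValueError(f"unknown op: {a['op']}")
        return len(actions)

    def replay_pending(self) -> int:
        """Apply deferred actions once a reviewer has approved them."""
        applied = self.apply_actions(self.pending)
        self.pending = []  # cleared only after the batch committed
        return applied

    def count(self) -> int:
        return self.conn.execute("SELECT COUNT(*) FROM reminders").fetchone()[0]
```

If a batch cancels one reminder and then fails while creating another, the rollback restores the cancelled reminder, so a thread is never left without the reminder it had before the batch.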
| Path | Description |
|---|---|
| `src/email_assistant/` | Agent graphs, tools, prompts, checkpointing, and server helpers. |
| `scripts/` | Utilities including dataset registration and reminder worker. |
| `tests/` | Live/eval pytest suites, spam flow coverage, notebooks smoke tests. |
| `datasets/` | LangSmith experiment datasets (`register_dataset_from_jsonl.py`). |
| `docs/` | Placeholder for extended documentation. |
| `README_LOCAL.md` | Local development recipes, dependency pin notes, testing nuances. |
- `README_LOCAL.md` – local dev/test details, tmux/cron recipes, notebook tips.
- `AGENTS.md` – agent evolution, feature changelog, evaluation modes.
- `dev_tickets/LangChain-LangGraph-v1-implementation-ticket.md` – end-to-end implementation log with acceptance checklist and rollout notes.
- `dev_tickets/LangChain-LangGraph-v1-follow-up-ticket.md` – Phase 5 validation/demo follow-ups, risks, and merge coordination tasks.
- `notebooks/UPDATES.md` – notebook refresh log, live-first checklists, reminder/HITL env toggles.
- `CONTRIBUTING.md` – branching, review, and testing expectations.
- `system_prompt.md` – canonical assistant instructions.
- CodeRabbit CI still gates PRs automatically, but contributors must run a local review after finishing edits: `coderabbit review --plain` from the repo root (or `--prompt-only` for a lighter summary).
- Share the exact command you ran in handoff notes so others can reproduce or compare results (`coderabbit auth status` is handy when double-checking login).
- Need a refresher? `coderabbit --help` lists every subcommand, `coderabbit <subcommand> --help` dives into its flags, and `coderabbit watch --help` covers auto-review workflows.
Pull requests are welcome; follow the workflow in CONTRIBUTING.md. Expect to run the live Gemini suites unless credentials are unavailable.
Distributed under the MIT License. See LICENSE for details.