bug fixes for motia-content-creation #186
Walkthrough

Replaces OpenAI with Ollama in the LinkedIn and Twitter generation steps, updates the Firecrawl scraping API and emits a new generate-content event, adjusts dependencies in package.json, adds .gitignore entries for TypeScript artifacts and the lockfile, and updates the README to include "npx motia install".

Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant U as Step Handler
    participant Ollama as Ollama Chat API
    Note over U: Content generation (LinkedIn/Twitter)
    U->>Ollama: chat(model=OLLAMA_MODEL, messages, options{temperature, num_predict})
    Ollama-->>U: { message: { content } }
    U->>U: Parse JSON or fallback to text
    U-->>U: Emit final content (unchanged emission interface)
```

```mermaid
sequenceDiagram
    autonumber
    participant S as scrape.step.py
    participant FC as Firecrawl
    participant Bus as Event Bus
    S->>FC: scrape(url, formats=["markdown"])
    FC-->>S: { markdown, metadata? }
    alt Has markdown
        S->>S: Extract content and title
        S->>Bus: emit(topic="generate-content", data{requestId,url,title,content,timestamp})
    else Missing markdown
        S->>S: Raise error
    end
```
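The "Parse JSON or fallback to text" step in the first diagram is pure string handling and can be sketched independently of the model call. This is a minimal sketch, not the actual handler code; `parse_model_reply` is a hypothetical name:

```python
import json

def parse_model_reply(raw: str) -> dict:
    """Try to parse a model reply as a JSON object; otherwise wrap it as plain text."""
    try:
        parsed = json.loads(raw)
        # Only accept JSON objects; a bare string/number still counts as "text"
        if isinstance(parsed, dict):
            return parsed
    except json.JSONDecodeError:
        pass
    return {'text': raw}

# A JSON object passes through; anything else is wrapped
print(parse_model_reply('{"text": "hello", "hashtags": ["#ai"]}'))
print(parse_model_reply('just a plain sentence'))
```

Either way, the emission interface downstream stays unchanged: consumers always receive a dict.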
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
motia-content-creation/README.md (1)
Lines 119-122: Fix typo: "deb ug" → "debug". Minor, but user-facing docs should be typo-free.

Apply this diff:

```diff
-The Motia workbench provides an interactive UI where you can easily deb ug and monitor your flows as interactive diagrams. It runs automatically with the development server.
+The Motia workbench provides an interactive UI where you can easily debug and monitor your flows as interactive diagrams. It runs automatically with the development server.
```
🧹 Nitpick comments (13)
motia-content-creation/README.md (1)
Line 49: Good call adding "npx motia install" after the dependency install. This will help first-time users get Motia's artifacts in place. Consider also documenting the optional OLLAMA_MODEL override (it defaults to deepseek-r1 in code) so users know how to switch models via .env.

Proposed addition to the "Configure environment" block:

```bash
# Optional: override default model (defaults to deepseek-r1)
OLLAMA_MODEL=deepseek-r1
```
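On the code side, such an override is typically read with a plain environment lookup. A minimal sketch, assuming the steps use the `OLLAMA_MODEL` variable name as described above:

```python
import os

# Falls back to deepseek-r1 when OLLAMA_MODEL is not set in .env / the environment
OLLAMA_MODEL = os.getenv('OLLAMA_MODEL', 'deepseek-r1')
print(OLLAMA_MODEL)
```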
motia-content-creation/.gitignore (1)
Lines 111-113: Lockfile policy: align across package managers. You're ignoring package-lock.json but not pnpm-lock.yaml or yarn.lock. Decide whether to commit lockfiles (recommended for reproducible builds) or ignore them consistently across all managers.

If you intend to ignore all lockfiles, add:

```diff
+# Package manager lockfiles (if you choose not to commit them)
+pnpm-lock.yaml
+yarn.lock
```

If you intend to commit them, remove package-lock.json from .gitignore instead.
motia-content-creation/steps/generate-twitter.step.py (4)
Lines 38-45: Avoid blocking the event loop; wrap the Ollama call with asyncio.to_thread. `ollama.chat` is synchronous; calling it inside an async handler blocks Motia's event loop. Use `to_thread`, and also avoid reusing the same name for both the response and the parsed payload.

Apply this diff:

```diff
-    twitter_content = ollama.chat(
-        model=OLLAMA_MODEL,
-        messages=[{'role': 'user', 'content': twitterPrompt}],
-        options={
-            'temperature': 0.7,
-            'num_predict': 2000
-        }
-    )
+    response = await asyncio.to_thread(
+        ollama.chat,
+        model=OLLAMA_MODEL,
+        messages=[{'role': 'user', 'content': twitterPrompt}],
+        options={
+            'temperature': 0.7,
+            'num_predict': 2000
+        }
+    )
```

Also ensure this import exists at the top:

```python
import asyncio
```
Lines 48-51: Harden JSON parsing against code fences and non-JSON outputs. DeepSeek-R1 can emit fenced blocks or preambles. Strip fences and parse; otherwise fall back to text.

Apply this diff:

```diff
-    try:
-        twitter_content = json.loads(twitter_content['message']['content'])
-    except Exception:
-        twitter_content = {'text': twitter_content['message']['content']}
+    def _parse_json(s: str):
+        s = s.strip()
+        if s.startswith("```"):
+            s = s.strip("`")
+            # Handle ```json\n ... \n``` or ```\n ... \n```
+            s = "\n".join(line for line in s.splitlines() if not line.lower().startswith("json"))
+        return json.loads(s)
+    try:
+        twitter_content = _parse_json(response['message']['content'])
+    except Exception:
+        twitter_content = {'text': response['message']['content']}
```
Lines 61-62: Emit timestamps in UTC with timezone info. Use timezone-aware UTC to make downstream ordering reliable.

Apply this diff:

```diff
-        'generatedAt': datetime.now().isoformat(),
+        'generatedAt': datetime.utcnow().replace(tzinfo=None).isoformat() + 'Z',
```

Alternatively:

```python
from datetime import datetime, timezone
...
generatedAt=datetime.now(timezone.utc).isoformat()
```
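For reference, the timezone-aware variant behaves like this (a minimal standalone sketch, independent of the step code):

```python
from datetime import datetime, timezone

# Timezone-aware UTC timestamp; isoformat() carries the +00:00 offset,
# so downstream consumers can order events unambiguously
ts = datetime.now(timezone.utc)
print(ts.isoformat())
assert ts.tzinfo is not None  # aware, unlike naive datetime.now()
```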
Lines 1-18: Validate input against the Pydantic model before use. You declare GenerateInput but never validate input. This guards against malformed events early.

Add right after the handler starts:

```python
validated = GenerateInput(**input)
# Then use validated.* instead of input['...']
```

If you prefer dicts downstream:

```python
payload = validated.model_dump()
```

motia-content-creation/steps/generate-linkedin.step.py (5)
Lines 38-45: Use asyncio.to_thread to prevent blocking during the Ollama call. Same concern as the Twitter step. This also makes the existing asyncio import purposeful and resolves the Ruff F401 warning.

Apply this diff:

```diff
-    response = ollama.chat(
-        model=OLLAMA_MODEL,
-        messages=[{'role': 'user', 'content': linkedinPrompt}],
-        options={
-            'temperature': 0.7,
-            'num_predict': 2000
-        }
-    )
+    response = await asyncio.to_thread(
+        ollama.chat,
+        model=OLLAMA_MODEL,
+        messages=[{'role': 'user', 'content': linkedinPrompt}],
+        options={
+            'temperature': 0.7,
+            'num_predict': 2000
+        }
+    )
```
Lines 51-54: Make JSON parsing resilient to non-JSON responses. Mirror the hardened parsing used in the Twitter step to handle fenced blocks or plain text gracefully.

Apply this diff:

```diff
-    try:
-        linkedin_content = json.loads(response['message']['content'])
-    except Exception:
-        linkedin_content = {'text': response['message']['content']}
+    def _parse_json(s: str):
+        s = s.strip()
+        if s.startswith("```"):
+            s = s.strip("`")
+            s = "\n".join(line for line in s.splitlines() if not line.lower().startswith("json"))
+        return json.loads(s)
+    try:
+        linkedin_content = _parse_json(response['message']['content'])
+    except Exception:
+        linkedin_content = {'text': response['message']['content']}
```
Line 64: Prefer UTC for generatedAt. Align timestamp behavior across steps and downstream services.

Apply this diff:

```diff
-        'generatedAt': datetime.now().isoformat(),
+        'generatedAt': datetime.utcnow().replace(tzinfo=None).isoformat() + 'Z',
```

Or use timezone-aware `datetime.now(timezone.utc).isoformat()`.
Lines 3-4: If you choose not to adopt to_thread, remove the unused asyncio import. In case you don't adopt the non-blocking change, delete the unused import to satisfy Ruff (F401).

```diff
-import asyncio
```
Lines 30-36: Validate input with Pydantic before templating. Early validation helps catch bad events and prevents template injection from unexpected types.

Add:

```python
validated = GenerateInput(**input)
linkedinPrompt = linkedinPromptTemplate.replace('{{title}}', validated.title).replace('{{content}}', validated.content)
```

motia-content-creation/steps/scrape.step.py (2)
Lines 33-34: Broaden the success check and remove the extraneous f-string (Ruff F541)

- Firecrawl SDKs sometimes return dict-like payloads. Relying solely on `hasattr(scrapeResult, 'markdown')` can miss valid dict results.
- The exception string is static; drop the `f` prefix to satisfy Ruff F541.

```diff
-    if not hasattr(scrapeResult, 'markdown'):
-        raise Exception(f"Firecrawl scraping failed: No content returned")
+    has_md_attr = getattr(scrapeResult, 'markdown', None)
+    has_md_key = isinstance(scrapeResult, dict) and scrapeResult.get('markdown')
+    if not (has_md_attr or has_md_key):
+        raise Exception("Firecrawl scraping failed: No content returned")
```
Lines 36-37: Make content/title extraction resilient to both object and dict payloads

Support both attribute- and key-based access for `markdown` and `metadata.title`. Also ensure the title is a string and trimmed.

```diff
-    content = scrapeResult.markdown or ''
-    title = getattr(scrapeResult.metadata, 'title', 'Untitled Article') if hasattr(scrapeResult, 'metadata') else 'Untitled Article'
+    # Content
+    content = getattr(scrapeResult, 'markdown', None)
+    if content is None and isinstance(scrapeResult, dict):
+        content = scrapeResult.get('markdown')
+    content = content or ''
+
+    # Title
+    metadata = getattr(scrapeResult, 'metadata', None)
+    if metadata is None and isinstance(scrapeResult, dict):
+        metadata = scrapeResult.get('metadata')
+    if isinstance(metadata, dict):
+        title = metadata.get('title') or 'Untitled Article'
+    else:
+        title = getattr(metadata, 'title', None) or 'Untitled Article'
+    if not isinstance(title, str):
+        title = str(title)
+    title = title.strip() or 'Untitled Article'
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (6)
- motia-content-creation/.gitignore (1 hunks)
- motia-content-creation/README.md (1 hunks)
- motia-content-creation/package.json (1 hunks)
- motia-content-creation/steps/generate-linkedin.step.py (2 hunks)
- motia-content-creation/steps/generate-twitter.step.py (2 hunks)
- motia-content-creation/steps/scrape.step.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
motia-content-creation/steps/scrape.step.py (2)
- paralegal-agent-crew/src/tools/firecrawl_search_tool.py (1): FirecrawlSearchTool (13-46)
- brand-monitoring/brand_monitoring_flow/src/brand_monitoring_flow/main.py (1): scrape_data_and_analyse (68-260)
🪛 Ruff (0.12.2)
motia-content-creation/steps/generate-linkedin.step.py
- 4-4: `asyncio` imported but unused. Remove unused import: `asyncio` (F401)
motia-content-creation/steps/scrape.step.py
- 34-34: f-string without any placeholders. Remove the extraneous `f` prefix (F541)
🔇 Additional comments (2)
motia-content-creation/steps/generate-twitter.step.py (1)
Line 6: 'ollama' dependency confirmed in requirements. The file motia-content-creation/requirements.txt declares `ollama>=0.5.0`, so the import in generate-twitter.step.py is satisfied. No changes needed.

motia-content-creation/steps/scrape.step.py (1)
Line 3: Confirm and align Firecrawl SDK usage, and pin the version. We're seeing mixed imports and method calls for the Firecrawl SDK across the repo, but no pinned dependency to tie it all together. To avoid runtime errors, please:

- Pin the Firecrawl SDK in your project's dependency manifest: add or update requirements.txt, pyproject.toml, or Pipfile to include an explicit `firecrawl==<version>` line.
- Verify which classes and methods the installed SDK actually exports. Run locally (adjust for your environment):

```bash
# Check installed version
python3 -m pip show firecrawl

# Introspect available exports
python3 - << 'EOF'
import importlib
try:
    pkg = importlib.import_module('firecrawl')
    print("Exports:", [n for n in dir(pkg) if not n.startswith('_')])
except Exception as e:
    print("Import error:", e)
EOF
```

Confirm whether the package provides:
- `Firecrawl` or `FirecrawlApp` (or both)
- `.scrape(...)`, `.scrape_url(...)`, `.search(...)`, etc.

- Align all import sites and method calls to the verified API shape (or add a shim if you need to support both). Key locations to update:
  - motia-content-creation/steps/scrape.step.py
  - mcp-voice-agent/agent.py
  - paralegal-agent-crew/src/tools/firecrawl_search_tool.py
  - content_planner_flow/notebook.ipynb
  - Website-to-API-with-FireCrawl/notebook.ipynb & app.py
  - Colivara-deepseek-website-RAG/app.py

Without these steps, you're at risk of import or attribute errors if the SDK has changed. Let me know once you've confirmed the correct API shape or if you'd like help adding a compatibility layer.
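If a compatibility layer is desired, one possible shape is a thin dispatcher that probes for whichever scrape method the installed SDK exposes. This is a sketch under assumptions: `scrape_compat` and the stand-in clients below are hypothetical, not part of the Firecrawl SDK:

```python
def scrape_compat(client, url, **kwargs):
    """Call .scrape(...) if the client has it, else fall back to .scrape_url(...)."""
    if hasattr(client, 'scrape'):
        return client.scrape(url, **kwargs)
    if hasattr(client, 'scrape_url'):
        return client.scrape_url(url, **kwargs)
    raise AttributeError("client exposes neither .scrape nor .scrape_url")

# Stand-in clients to illustrate the dispatch (real code would pass a Firecrawl instance)
class NewClient:
    def scrape(self, url, **kwargs):
        return f"scrape:{url}"

class OldClient:
    def scrape_url(self, url, **kwargs):
        return f"scrape_url:{url}"

print(scrape_compat(NewClient(), "https://example.com"))
print(scrape_compat(OldClient(), "https://example.com"))
```

The same pattern extends to other method pairs (e.g. search variants) if the verification step above finds further divergence.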
```json
    "axios": "^1.10.0",
    "dotenv": "^16.5.0",
    "install": "^0.13.0",
    "motia": "^0.4.0-beta.90",
    "openai": "^4.90.0",
    "or": "^0.2.0",
    "pnpm": "^10.15.0",
    "zod": "^3.25.67"
  },
```
💡 Verification agent
🧩 Analysis chain
Prune likely-unused/supply-chain-risk dependencies from runtime deps.
Given the migration to Ollama in Python steps, these runtime deps look unnecessary in the Node package and can enlarge the attack surface:
- openai (Node) — Python steps use ollama; no Node usage shown
- pnpm — package manager should not be an app runtime dependency
- install, or — uncommon packages; please justify usage
- axios — only keep if the TS scheduling steps actually import it
Move truly-needed tools to devDependencies or remove entirely.
Run this script to confirm actual usage before removal:
If unused, apply this diff:
```diff
   "dependencies": {
     "@mendable/firecrawl-js": "^1.0.0",
-    "axios": "^1.10.0",
     "dotenv": "^16.5.0",
-    "install": "^0.13.0",
     "motia": "^0.4.0-beta.90",
-    "openai": "^4.90.0",
-    "or": "^0.2.0",
-    "pnpm": "^10.15.0",
     "zod": "^3.25.67"
   },
```
🏁 Script executed:
```bash
#!/bin/bash
# Verify usage of suspicious deps across the repo
set -euo pipefail
echo "Searching for imports/usages..."
rg -n -C2 -g '!**/node_modules/**' -g '!**/dist/**' -g '!**/build/**' \
  -e '\bfrom\s+["'\'']axios["'\'']' \
  -e '\brequire\(["'\'']axios["'\'']\)' \
  -e '\bfrom\s+["'\'']openai["'\'']' \
  -e '\brequire\(["'\'']openai["'\'']\)' \
  -e '\bfrom\s+["'\'']pnpm["'\'']' \
  -e '\brequire\(["'\'']pnpm["'\'']\)' \
  -e '\bfrom\s+["'\'']install["'\'']' \
  -e '\brequire\(["'\'']install["'\'']\)' \
  -e '\bfrom\s+["'\'']or["'\'']' \
  -e '\brequire\(["'\'']or["'\'']\)' \
  motia-content-creation
```
Length of output: 1385
Remove unused dependencies and retain only truly required runtime packages
The usage scan confirms:
- axios is imported in:
  - motia-content-creation/steps/schedule-twitter.step.ts (line 3)
  - motia-content-creation/steps/schedule-linkedin.step.ts (line 3)

  so it must remain in `dependencies`.
- openai, pnpm, install, and or have no import/require occurrences and can be safely removed.
Apply this diff to motia-content-creation/package.json:
```diff
 "dependencies": {
   "@mendable/firecrawl-js": "^1.0.0",
   "axios": "^1.10.0",
   "dotenv": "^16.5.0",
   "motia": "^0.4.0-beta.90",
-  "openai": "^4.90.0",
-  "install": "^0.13.0",
-  "or": "^0.2.0",
-  "pnpm": "^10.15.0",
   "zod": "^3.25.67"
 },
```
This pruning reduces supply-chain risk without impacting any runtime code.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```json
  "dependencies": {
    "@mendable/firecrawl-js": "^1.0.0",
    "axios": "^1.10.0",
    "dotenv": "^16.5.0",
    "motia": "^0.4.0-beta.90",
    "zod": "^3.25.67"
  },
```
🤖 Prompt for AI Agents
In motia-content-creation/package.json around lines 22 to 30, the dependencies
list contains unused packages; keep "axios" (used in
steps/schedule-twitter.step.ts and steps/schedule-linkedin.step.ts) and remove
"openai", "pnpm", "install", and "or" from the dependencies section; update
package.json by deleting those four entries (and any trailing commas adjusted)
so only required runtime packages remain, then run a quick install (pnpm/npm)
and a codebase search to confirm no runtime imports were missed.
```diff
-    scrapeResult = app.scrape_url(input['url'])
+    firecrawl = Firecrawl(api_key=FIRECRAWL_API_KEY)
+
+    scrapeResult = firecrawl.scrape(input['url'], formats=["markdown"])
```
🛠️ Refactor suggestion
Guard missing FIRECRAWL_API_KEY and avoid blocking the event loop; also prefer a neutral client name.

- If `FIRECRAWL_API_KEY` is unset, `Firecrawl(api_key=None)` may fail later; fail fast with a clear message.
- `firecrawl.scrape(...)` is synchronous; calling it in an `async def` may block the event loop during network I/O. Offload it to a thread.
- Minor: naming the variable `firecrawl` shadows the import symbol; `client` is clearer.

Apply:

```diff
-    firecrawl = Firecrawl(api_key=FIRECRAWL_API_KEY)
-
-    scrapeResult = firecrawl.scrape(input['url'], formats=["markdown"])
+    if not FIRECRAWL_API_KEY:
+        raise RuntimeError("FIRECRAWL_API_KEY is not set. Configure it in your environment.")
+
+    client = Firecrawl(api_key=FIRECRAWL_API_KEY)
+
+    # Offload sync HTTP call to a worker thread to avoid blocking the event loop
+    scrapeResult = await asyncio.to_thread(client.scrape, str(input['url']), formats=["markdown"])
```

Add outside this range:

```python
import asyncio  # at the top with other imports
```
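As a standalone illustration of the offloading pattern (a generic sketch; `slow_sync_call` is a stand-in for the blocking SDK call, not Firecrawl code):

```python
import asyncio
import time

def slow_sync_call(url: str) -> str:
    # Stand-in for a blocking HTTP call such as client.scrape(...)
    time.sleep(0.2)
    return f"scraped:{url}"

async def handler() -> str:
    # The event loop stays free while the sync call runs in a worker thread
    return await asyncio.to_thread(slow_sync_call, "https://example.com")

print(asyncio.run(handler()))
```

`asyncio.to_thread` is available from Python 3.9; on older versions, `loop.run_in_executor(None, ...)` achieves the same effect.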
🤖 Prompt for AI Agents
In motia-content-creation/steps/scrape.step.py around lines 29 to 31, guard
against a missing FIRECRAWL_API_KEY, avoid blocking the event loop, and rename
the variable: check that FIRECRAWL_API_KEY is set and raise a clear exception
immediately if not; instantiate the client into a variable named client (not
firecrawl) and call client.scrape off the event loop by wrapping the synchronous
call in asyncio.to_thread (or run_in_executor) from inside the async function;
also ensure import asyncio is added at the top with other imports.