
Conversation


shubham-tomar commented Aug 24, 2025

  • Replaced the OpenAI integration with DeepSeek-R1
  • Fixed the Firecrawl implementation

Summary by CodeRabbit

  • New Features
    • LinkedIn and Twitter content generation now powered by Ollama, with model selectable via the OLLAMA_MODEL environment variable (default provided).
  • Improvements
    • Web scraping reliability improved and automatically triggers downstream content generation after a successful scrape.
  • Documentation
    • Updated installation instructions to include running “npx motia install” after installing dependencies.
  • Chores
    • Updated dependency manifest to include additional runtime packages and reorganize entries.
    • Expanded .gitignore to exclude extra generated files.


coderabbitai bot commented Aug 24, 2025

Walkthrough

Replaces OpenAI with Ollama in LinkedIn and Twitter generation steps, updates Firecrawl scraping API and emits a new generate-content event, adjusts dependencies in package.json, adds .gitignore entries for TypeScript and lockfile, and updates README to include “npx motia install”.

Changes

Cohort / File(s) Summary
AI provider switch to Ollama
motia-content-creation/steps/generate-linkedin.step.py, motia-content-creation/steps/generate-twitter.step.py
Replaced AsyncOpenAI usage with Ollama chat API; added OLLAMA_MODEL env var (default deepseek-r1); updated message/response handling and JSON parsing; removed OpenAI API key/client.
Scraper API update and event emission
motia-content-creation/steps/scrape.step.py
Migrated from FirecrawlApp.scrape_url to Firecrawl.scrape(..., formats=["markdown"]); revised success checks and metadata access; now emits a generate-content event with scraped data.
Dependency manifest adjustments
motia-content-creation/package.json
Reorganized and expanded dependencies (added axios, install, or, pnpm; reintroduced motia, openai, zod). No script/devDependency changes.
Docs and project hygiene
motia-content-creation/README.md, motia-content-creation/.gitignore
README: adds “npx motia install” after install step. .gitignore: ignores types.d.ts and package-lock.json; normalizes EOF newline.
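
For reference, a minimal sketch of the call pattern both generation steps now share, per the summaries above (the prompt text and surrounding handler are illustrative, not the repo's exact code):

import json
import os

import ollama

# Model is overridable via the environment; the steps default to deepseek-r1
OLLAMA_MODEL = os.getenv('OLLAMA_MODEL', 'deepseek-r1')

def generate_content(prompt: str) -> dict:
    response = ollama.chat(
        model=OLLAMA_MODEL,
        messages=[{'role': 'user', 'content': prompt}],
        options={'temperature': 0.7, 'num_predict': 2000},
    )
    raw = response['message']['content']
    # Prefer structured output; fall back to plain text when the model
    # does not return valid JSON
    try:
        return json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return {'text': raw}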

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as Step Handler
  participant Ollama as Ollama Chat API
  Note over U: Content generation (LinkedIn/Twitter)
  U->>Ollama: chat(model=OLLAMA_MODEL, messages, options{temperature, num_predict})
  Ollama-->>U: { message: { content } }
  U->>U: Parse JSON or fallback to text
  U-->>U: Emit final content (unchanged emission interface)
sequenceDiagram
  autonumber
  participant S as scrape.step.py
  participant FC as Firecrawl
  participant Bus as Event Bus
  S->>FC: scrape(url, formats=["markdown"])
  FC-->>S: { markdown, metadata? }
  alt Has markdown
    S->>S: Extract content and title
    S->>Bus: emit(topic="generate-content", data{requestId,url,title,content,timestamp})
  else Missing markdown
    S->>S: Raise error
  end
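
The scrape flow above, condensed into a self-contained sketch (the Motia handler and emit plumbing are omitted; the payload fields match the diagram, and the Firecrawl SDK shape is the one used in this PR):

import os
from datetime import datetime, timezone

from firecrawl import Firecrawl

def scrape_to_event(url: str, request_id: str) -> dict:
    # Build the generate-content event payload from a markdown scrape
    client = Firecrawl(api_key=os.environ['FIRECRAWL_API_KEY'])
    result = client.scrape(url, formats=["markdown"])
    markdown = getattr(result, 'markdown', None)
    if not markdown:
        raise Exception("Firecrawl scraping failed: No content returned")
    metadata = getattr(result, 'metadata', None)
    title = getattr(metadata, 'title', None) or 'Untitled Article'
    return {
        'topic': 'generate-content',
        'data': {
            'requestId': request_id,
            'url': url,
            'title': title,
            'content': markdown,
            'timestamp': datetime.now(timezone.utc).isoformat(),
        },
    }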

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

I tap my paws on keys so bright,
Swapped clouds for llamas in the night.
I scrape, I squeak, emit on cue—
Fresh links and tweets, both crisp and new.
In burrows deep, my deps align;
Ship it now—content’s divine! 🐇✨



coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
motia-content-creation/README.md (1)

119-122: Fix typo: “deb ug” → “debug”.

Minor but user-facing docs should be typo-free.

Apply this diff:

-The Motia workbench provides an interactive UI where you can easily deb ug and monitor your flows as interactive diagrams. It runs automatically with the development server.
+The Motia workbench provides an interactive UI where you can easily debug and monitor your flows as interactive diagrams. It runs automatically with the development server.
🧹 Nitpick comments (13)
motia-content-creation/README.md (1)

49-49: Good call adding “npx motia install” after dependency install.

This will help first-time users get Motia’s artifacts in place. Consider also documenting the optional OLLAMA_MODEL override (defaults to deepseek-r1 in code) so users know how to switch models via .env.

Proposed addition to the “Configure environment” block:

# Optional: override default model (defaults to deepseek-r1)
OLLAMA_MODEL=deepseek-r1
motia-content-creation/.gitignore (1)

111-113: Lockfile policy: align across package managers.

You’re ignoring package-lock.json but not pnpm-lock.yaml or yarn.lock. Decide whether to commit lockfiles (recommended for reproducible builds) or ignore them consistently across all managers.

If you intend to ignore all lockfiles, add:

+# Package manager lockfiles (if you choose not to commit them)
+pnpm-lock.yaml
+yarn.lock

If you intend to commit them, remove package-lock.json from .gitignore instead.

motia-content-creation/steps/generate-twitter.step.py (4)

38-45: Avoid blocking the event loop; wrap Ollama call with asyncio.to_thread.

ollama.chat is synchronous; calling it inside an async handler blocks Motia’s event loop. Use to_thread and also avoid reusing the same name for both response and parsed payload.

Apply this diff:

-        twitter_content = ollama.chat(
-            model=OLLAMA_MODEL,
-            messages=[{'role': 'user', 'content': twitterPrompt}],
-            options={
-                'temperature': 0.7,
-                'num_predict': 2000
-            }
-        )    
+        response = await asyncio.to_thread(
+            ollama.chat,
+            model=OLLAMA_MODEL,
+            messages=[{'role': 'user', 'content': twitterPrompt}],
+            options={
+                'temperature': 0.7,
+                'num_predict': 2000
+            }
+        )

Also ensure this import exists at the top:

import asyncio

48-51: Harden JSON parsing against code fences and non-JSON outputs.

DeepSeek-R1 can emit fenced blocks or preambles. Strip fences and parse; otherwise fall back to text.

Apply this diff:

-        try:
-            twitter_content = json.loads(twitter_content['message']['content'])
-        except Exception:
-            twitter_content = {'text': twitter_content['message']['content']}
+        def _parse_json(s: str):
+            s = s.strip()
+            if s.startswith("```"):
+                s = s.strip("`")
+                # Handle ```json\n ... \n``` or ```\n ... \n```
+                s = "\n".join(line for line in s.splitlines() if not line.lower().startswith("json"))
+            return json.loads(s)
+        try:
+            twitter_content = _parse_json(response['message']['content'])
+        except Exception:
+            twitter_content = {'text': response['message']['content']}
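
A quick sanity check of what the helper tolerates (illustrative inputs, assuming the _parse_json above):

fenced = '```json\n{"text": "hello"}\n```'
plain = '{"text": "hello"}'
assert _parse_json(fenced) == {'text': 'hello'}
assert _parse_json(plain) == {'text': 'hello'}
# Anything that still fails json.loads falls through to the
# {'text': ...} fallback in the except branch above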

61-62: Emit timestamps in UTC with timezone info.

Use timezone-aware UTC to make downstream ordering reliable.

Apply this diff:

-                'generatedAt': datetime.now().isoformat(),
+                'generatedAt': datetime.utcnow().isoformat() + 'Z',

Alternatively:

from datetime import datetime, timezone
...
generatedAt=datetime.now(timezone.utc).isoformat()

1-18: Validate input against the Pydantic model before use.

You declare GenerateInput but never validate input. This guards against malformed events early.

Add right after handler starts:

validated = GenerateInput(**input)
# Then use validated.* instead of input['...']

If you prefer dicts downstream:

payload = validated.model_dump()
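
A self-contained sketch of that guard (field names mirror the generate-content payload; adjust to the repo's actual GenerateInput definition):

from pydantic import BaseModel, ValidationError

class GenerateInput(BaseModel):
    requestId: str
    url: str
    title: str
    content: str

def validate_event(input: dict) -> GenerateInput:
    try:
        return GenerateInput(**input)
    except ValidationError as e:
        # Fail fast on malformed events instead of erroring mid-generation
        raise ValueError(f'Invalid generate-content payload: {e}') from e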
motia-content-creation/steps/generate-linkedin.step.py (5)

38-45: Use asyncio.to_thread to prevent blocking during Ollama call.

Same concern as Twitter step. This also makes the existing asyncio import purposeful and resolves the Ruff F401 warning.

Apply this diff:

-        response = ollama.chat(
-            model=OLLAMA_MODEL,
-            messages=[{'role': 'user', 'content': linkedinPrompt}],
-            options={
-                'temperature': 0.7,
-                'num_predict': 2000
-            }
-        )  
+        response = await asyncio.to_thread(
+            ollama.chat,
+            model=OLLAMA_MODEL,
+            messages=[{'role': 'user', 'content': linkedinPrompt}],
+            options={
+                'temperature': 0.7,
+                'num_predict': 2000
+            }
+        )

51-54: Make JSON parsing resilient to non-JSON responses.

Mirror the hardened parsing used in the Twitter step to handle fenced blocks or plain text gracefully.

Apply this diff:

-        try:
-            linkedin_content = json.loads(response['message']['content'])
-        except Exception:
-            linkedin_content = {'text': response['message']['content']}
+        def _parse_json(s: str):
+            s = s.strip()
+            if s.startswith("```"):
+                s = s.strip("`")
+                s = "\n".join(line for line in s.splitlines() if not line.lower().startswith("json"))
+            return json.loads(s)
+        try:
+            linkedin_content = _parse_json(response['message']['content'])
+        except Exception:
+            linkedin_content = {'text': response['message']['content']}

64-64: Prefer UTC for generatedAt.

Align timestamp behavior across steps and downstream services.

Apply this diff:

-                'generatedAt': datetime.now().isoformat(),
+                'generatedAt': datetime.utcnow().isoformat() + 'Z',

Or use timezone-aware datetime.now(timezone.utc).isoformat().


3-4: If you choose not to adopt to_thread, remove unused asyncio import.

In case you don’t adopt the non-blocking change, delete the unused import to satisfy Ruff (F401).

-import asyncio

30-36: Validate input with Pydantic before templating.

Early validation helps catch bad events and prevents template injection from unexpected types.

Add:

validated = GenerateInput(**input)
linkedinPrompt = linkedinPromptTemplate.replace('{{title}}', validated.title).replace('{{content}}', validated.content)
motia-content-creation/steps/scrape.step.py (2)

33-34: Broaden the success check and remove the extraneous f-string (Ruff F541)

  • Firecrawl SDKs sometimes return dict-like payloads. Relying solely on hasattr(scrapeResult, 'markdown') can miss valid dict results.
  • The exception string is static; drop the f prefix to satisfy Ruff F541.
-    if not hasattr(scrapeResult, 'markdown'):
-        raise Exception(f"Firecrawl scraping failed: No content returned")
+    has_md_attr = getattr(scrapeResult, 'markdown', None)
+    has_md_key = isinstance(scrapeResult, dict) and scrapeResult.get('markdown')
+    if not (has_md_attr or has_md_key):
+        raise Exception("Firecrawl scraping failed: No content returned")

36-37: Make content/title extraction resilient to both object and dict payloads

Support both attribute- and key-based access for markdown and metadata.title. Also ensure the title is a string and trimmed.

-    content = scrapeResult.markdown or ''
-    title = getattr(scrapeResult.metadata, 'title', 'Untitled Article') if hasattr(scrapeResult, 'metadata') else 'Untitled Article'
+    # Content
+    content = getattr(scrapeResult, 'markdown', None)
+    if content is None and isinstance(scrapeResult, dict):
+        content = scrapeResult.get('markdown')
+    content = content or ''
+
+    # Title
+    metadata = getattr(scrapeResult, 'metadata', None)
+    if metadata is None and isinstance(scrapeResult, dict):
+        metadata = scrapeResult.get('metadata')
+    if isinstance(metadata, dict):
+        title = metadata.get('title') or 'Untitled Article'
+    else:
+        title = getattr(metadata, 'title', None) or 'Untitled Article'
+    if not isinstance(title, str):
+        title = str(title)
+    title = title.strip() or 'Untitled Article'
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 10226c9 and 655024f.

📒 Files selected for processing (6)
  • motia-content-creation/.gitignore (1 hunks)
  • motia-content-creation/README.md (1 hunks)
  • motia-content-creation/package.json (1 hunks)
  • motia-content-creation/steps/generate-linkedin.step.py (2 hunks)
  • motia-content-creation/steps/generate-twitter.step.py (2 hunks)
  • motia-content-creation/steps/scrape.step.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
motia-content-creation/steps/scrape.step.py (2)
paralegal-agent-crew/src/tools/firecrawl_search_tool.py (1)
  • FirecrawlSearchTool (13-46)
brand-monitoring/brand_monitoring_flow/src/brand_monitoring_flow/main.py (1)
  • scrape_data_and_analyse (68-260)
🪛 Ruff (0.12.2)
motia-content-creation/steps/generate-linkedin.step.py

4-4: asyncio imported but unused

Remove unused import: asyncio

(F401)

motia-content-creation/steps/scrape.step.py

34-34: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (2)
motia-content-creation/steps/generate-twitter.step.py (1)

6-6: ‘ollama’ dependency confirmed in requirements
The file motia-content-creation/requirements.txt declares ollama>=0.5.0, so the import in generate-twitter.step.py is satisfied. No changes needed.

motia-content-creation/steps/scrape.step.py (1)

3-3: Confirm & Align Firecrawl SDK Usage and Pin Version

We’re seeing mixed imports and method calls for the Firecrawl SDK across the repo, but no pinned dependency to tie it all together. To avoid runtime errors, please:

• Pin the Firecrawl SDK in your project’s dependency manifest
– Add or update requirements.txt, pyproject.toml, or Pipfile to include an explicit firecrawl==<version> line.

• Verify which classes and methods the installed SDK actually exports
Run locally (adjust for your environment):

# Check installed version
python3 -m pip show firecrawl

# Introspect available exports
python3 - << 'EOF'
import importlib
try:
    pkg = importlib.import_module('firecrawl')
    print("Exports:", [n for n in dir(pkg) if not n.startswith('_')])
except Exception as e:
    print("Import error:", e)
EOF

Confirm whether the package provides:
  • Firecrawl or FirecrawlApp (or both)
  • .scrape(...), .scrape_url(...), .search(...), etc.

• Align all import sites and method calls to the verified API shape (or add a shim if you need to support both)
Key locations to update:

  • motia-content-creation/steps/scrape.step.py
  • mcp-voice-agent/agent.py
  • paralegal-agent-crew/src/tools/firecrawl_search_tool.py
  • content_planner_flow/notebook.ipynb
  • Website-to-API-with-FireCrawl/notebook.ipynb & app.py
  • Colivara-deepseek-website-RAG/app.py

Without these steps, you’re at risk of import or attribute errors if the SDK has changed. Let me know once you’ve confirmed the correct API shape or if you’d like help adding a compatibility layer.
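
If a shim is preferred over pinning a single version, a minimal compatibility layer could look like this (treat it as a sketch, since which names the installed SDK actually exports is exactly what needs verifying first):

# Hypothetical shim over differing Firecrawl SDK versions
try:
    from firecrawl import Firecrawl as _Client
except ImportError:
    from firecrawl import FirecrawlApp as _Client

class FirecrawlShim:
    def __init__(self, api_key: str):
        self._client = _Client(api_key=api_key)

    def scrape_markdown(self, url: str):
        # Newer SDKs expose .scrape(...); older ones .scrape_url(...)
        if hasattr(self._client, 'scrape'):
            return self._client.scrape(url, formats=["markdown"])
        return self._client.scrape_url(url)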

Comment on lines +22 to 30
"axios": "^1.10.0",
"dotenv": "^16.5.0",
"zod": "^3.25.67",
"axios": "^1.10.0"
"install": "^0.13.0",
"motia": "^0.4.0-beta.90",
"openai": "^4.90.0",
"or": "^0.2.0",
"pnpm": "^10.15.0",
"zod": "^3.25.67"
},

💡 Verification agent

🧩 Analysis chain

Prune likely-unused/supply-chain-risk dependencies from runtime deps.

Given the migration to Ollama in Python steps, these runtime deps look unnecessary in the Node package and can enlarge the attack surface:

  • openai (Node) — Python steps use ollama; no Node usage shown
  • pnpm — package manager should not be an app runtime dependency
  • install, or — uncommon packages; please justify usage
  • axios — only keep if the TS scheduling steps actually import it

Move truly-needed tools to devDependencies or remove entirely.

Run this script to confirm actual usage before removal:

If unused, apply this diff:

   "dependencies": {
     "@mendable/firecrawl-js": "^1.0.0",
-    "axios": "^1.10.0",
     "dotenv": "^16.5.0",
-    "install": "^0.13.0",
     "motia": "^0.4.0-beta.90",
-    "openai": "^4.90.0",
-    "or": "^0.2.0",
-    "pnpm": "^10.15.0",
     "zod": "^3.25.67"
   },

🏁 Script executed:

#!/bin/bash
# Verify usage of suspicious deps across the repo
set -euo pipefail
echo "Searching for imports/usages..."
rg -n -C2 -g '!**/node_modules/**' -g '!**/dist/**' -g '!**/build/**' \
  -e '\bfrom\s+["'\'']axios["'\'']' \
  -e '\brequire\(["'\'']axios["'\'']\)' \
  -e '\bfrom\s+["'\'']openai["'\'']' \
  -e '\brequire\(["'\'']openai["'\'']\)' \
  -e '\bfrom\s+["'\'']pnpm["'\'']' \
  -e '\brequire\(["'\'']pnpm["'\'']\)' \
  -e '\bfrom\s+["'\'']install["'\'']' \
  -e '\brequire\(["'\'']install["'\'']\)' \
  -e '\bfrom\s+["'\'']or["'\'']' \
  -e '\brequire\(["'\'']or["'\'']\)' \
  motia-content-creation

Length of output: 1385


Remove unused dependencies and retain only truly required runtime packages

The usage scan confirms:

  • axios is imported in
    • motia-content-creation/steps/schedule-twitter.step.ts (line 3)
    • motia-content-creation/steps/schedule-linkedin.step.ts (line 3)
      so it must remain in dependencies.
  • openai, pnpm, install, and or have no import/require occurrences and can be safely removed.

Apply this diff to motia-content-creation/package.json:

   "dependencies": {
     "@mendable/firecrawl-js": "^1.0.0",
     "axios": "^1.10.0",
     "dotenv": "^16.5.0",
     "motia": "^0.4.0-beta.90",
-    "openai": "^4.90.0",
-    "install": "^0.13.0",
-    "or": "^0.2.0",
-    "pnpm": "^10.15.0",
     "zod": "^3.25.67"
   },

This pruning reduces supply-chain risk without impacting any runtime code.

📝 Committable suggestion


Suggested change
-    "axios": "^1.10.0",
-    "dotenv": "^16.5.0",
-    "zod": "^3.25.67",
-    "axios": "^1.10.0"
-    "install": "^0.13.0",
-    "motia": "^0.4.0-beta.90",
-    "openai": "^4.90.0",
-    "or": "^0.2.0",
-    "pnpm": "^10.15.0",
-    "zod": "^3.25.67"
-  },
+  "dependencies": {
+    "@mendable/firecrawl-js": "^1.0.0",
+    "axios": "^1.10.0",
+    "dotenv": "^16.5.0",
+    "motia": "^0.4.0-beta.90",
+    "zod": "^3.25.67"
+  },
🤖 Prompt for AI Agents
In motia-content-creation/package.json around lines 22 to 30, the dependencies
list contains unused packages; keep "axios" (used in
steps/schedule-twitter.step.ts and steps/schedule-linkedin.step.ts) and remove
"openai", "pnpm", "install", and "or" from the dependencies section; update
package.json by deleting those four entries (and any trailing commas adjusted)
so only required runtime packages remain, then run a quick install (pnpm/npm)
and a codebase search to confirm no runtime imports were missed.

Comment on lines +29 to +31
+    firecrawl = Firecrawl(api_key=FIRECRAWL_API_KEY)

-    scrapeResult = app.scrape_url(input['url'])
+    scrapeResult = firecrawl.scrape(input['url'], formats=["markdown"])

🛠️ Refactor suggestion

Guard missing FIRECRAWL_API_KEY and avoid blocking the event loop; also prefer a neutral client name

  • If FIRECRAWL_API_KEY is unset, Firecrawl(api_key=None) may fail later; fail fast with a clear message.
  • firecrawl.scrape(...) is synchronous; calling it in an async def may block the event loop during network I/O. Offload to a thread.
  • Minor: naming the variable firecrawl shadows the import symbol; client is clearer.

Apply:

-    firecrawl = Firecrawl(api_key=FIRECRAWL_API_KEY)
-
-    scrapeResult = firecrawl.scrape(input['url'], formats=["markdown"])
+    if not FIRECRAWL_API_KEY:
+        raise RuntimeError("FIRECRAWL_API_KEY is not set. Configure it in your environment.")
+
+    client = Firecrawl(api_key=FIRECRAWL_API_KEY)
+
+    # Offload sync HTTP call to a worker thread to avoid blocking the event loop
+    scrapeResult = await asyncio.to_thread(client.scrape, str(input['url']), formats=["markdown"])

Add outside this range:

import asyncio  # at the top with other imports
🤖 Prompt for AI Agents
In motia-content-creation/steps/scrape.step.py around lines 29 to 31, guard
against a missing FIRECRAWL_API_KEY, avoid blocking the event loop, and rename
the variable: check that FIRECRAWL_API_KEY is set and raise a clear exception
immediately if not; instantiate the client into a variable named client (not
firecrawl) and call client.scrape off the event loop by wrapping the synchronous
call in asyncio.to_thread (or run_in_executor) from inside the async function;
also ensure import asyncio is added at the top with other imports.
