
Commit d5eca91

The base commit
1 parent 5eb9b55 commit d5eca91

20 files changed: +328 -181 lines changed


docs/README.md

Lines changed: 132 additions & 0 deletions
@@ -1,3 +1,135 @@
## Project Documentation

Add the documentation here.

---

Of course. Building a tool like "Octo" is a fantastic project. Based on my analysis of its architecture and my own experience, here are some ideas for building a similar tool, broken down into core concepts, advanced features, and product-level thinking.
### 1. Core Architecture & Foundational Ideas

This is the minimum viable product (MVP) you'd need to get a functioning agent.

* **The Agent Loop:** The heart of any agent is its operational cycle. The classic model is a variation of a REPL (`Read-Eval-Print Loop`):

  $$[Input] \rightarrow [Think] \rightarrow [Act] \rightarrow [Observe]$$

  * **Input:** The user provides a prompt.
  * **Think:** The agent (your code) sends the prompt and conversation history to an LLM. The LLM's "thinking" process might involve generating a plan or deciding which tool to use.
  * **Act:** Your code parses the LLM's response. If it's a tool call, you execute it. If it's a text response, you display it.
  * **Observe:** The result of the action (tool output or error) is formatted and added to the history. The loop then repeats with this new context.

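The steps above can be sketched as a single loop. In this sketch, `think` and `runTool` are hypothetical stand-ins for the LLM call and your tool registry, so the control flow is visible without any provider SDK:

```typescript
type Turn =
  | { kind: "text"; text: string }          // final answer for the user
  | { kind: "tool"; name: string; args: unknown }; // a tool request

// think() stands in for the LLM call; runTool() for your tool dispatch.
async function agentLoop(
  input: string,
  think: (history: string[]) => Promise<Turn>,
  runTool: (name: string, args: unknown) => Promise<string>,
): Promise<string> {
  const history: string[] = [`user: ${input}`]; // [Input]
  for (;;) {
    const turn = await think(history);          // [Think]
    if (turn.kind === "text") return turn.text; // done
    const result = await runTool(turn.name, turn.args); // [Act]
    history.push(`tool ${turn.name}: ${result}`);       // [Observe]
  }
}
```

The loop only terminates when the model returns plain text; every tool result is appended to the history so the next `think` call sees it.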
* **A Modular Tool System:** This is non-negotiable. Don't hard-code your tools. Create a `Tool` interface and a registry. "Octo" does this very well. A simple version could look like this:

  ```typescript
  interface Tool {
    name: string;
    description: string; // Crucial for the LLM to know when to use it
    argumentsSchema: t.Type<any>; // Using 'structural' or 'zod' for schemas
    execute(args: any): Promise<string>;
  }

  const toolRegistry: Map<string, Tool> = new Map();
  ```

  This allows you to add new tools like `git_diff` or `run_tests` just by defining a new object that fits the interface.

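As a usage sketch of the registry pattern: here the schema field is simplified to a plain validator function so the snippet stands alone, and the `echo` tool is invented for illustration:

```typescript
// A simplified variant of the interface above: the schema field is replaced
// by a plain validator function so this snippet has no schema-library dependency.
interface SimpleTool {
  name: string;
  description: string;
  validateArgs(args: unknown): boolean;
  execute(args: any): Promise<string>;
}

const registry: Map<string, SimpleTool> = new Map();

// A hypothetical echo tool, registered like any other.
const echoTool: SimpleTool = {
  name: "echo",
  description: "Echoes its input back. Useful for testing the loop end to end.",
  validateArgs: (args) => typeof args === "object" && args !== null && "text" in args,
  execute: async (args: { text: string }) => args.text,
};

registry.set(echoTool.name, echoTool);
```

Adding `git_diff` or `run_tests` is then just another `registry.set(...)` call; nothing in the loop changes.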
* **Rich History Management:** Your history isn't just a list of strings. It's a structured log of events. "Octo's" `HistoryItem` type is a good example. You should explicitly differentiate between:

  * `UserMessage`
  * `AssistantMessage` (the LLM's text response)
  * `AssistantToolRequest` (the LLM's decision to call a tool)
  * `ToolResult` (the output from your code running the tool)
  * `SystemNotification` (e.g., "File `x.ts` was modified externally.")

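One way to model these variants is a discriminated union, sketched here with illustrative names rather than Octo's actual types:

```typescript
type HistoryItem =
  | { kind: "user-message"; text: string }
  | { kind: "assistant-message"; text: string }
  | { kind: "assistant-tool-request"; toolName: string; args: unknown }
  | { kind: "tool-result"; toolName: string; output: string; isError: boolean }
  | { kind: "system-notification"; text: string };

// Narrowing on `kind` keeps rendering (and serialization) exhaustive:
// the compiler flags this switch if a new variant is added.
function render(item: HistoryItem): string {
  switch (item.kind) {
    case "user-message": return `> ${item.text}`;
    case "assistant-message": return item.text;
    case "assistant-tool-request": return `[calling ${item.toolName}]`;
    case "tool-result": return `[${item.toolName}${item.isError ? " failed" : ""}] ${item.output}`;
    case "system-notification": return `(${item.text})`;
  }
}
```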
### 2. Enhancing the Core - "Leveling Up"

These are features that move from a simple proof-of-concept to a robust and reliable tool.

* **LLM Abstraction Layer:** "Octo" uses an IR for this. Your goal is to write code against your own generic `LLMProvider` interface, not directly against the OpenAI or Anthropic SDKs.

  ```typescript
  interface LLMProvider {
    generateResponse(history: LlmIR[], tools: Tool[]): AsyncGenerator<ResponseChunk>;
  }
  ```

  This lets you swap models mid-conversation, test new providers, or even integrate local models running via Ollama or llama.cpp with minimal friction.

* **Context Window Management:** This is a critical, practical problem. A long conversation will exceed the LLM's context limit.

  * **Simple:** Use a "sliding window" approach like "Octo" does in `windowing.ts`. Keep only the last N tokens of the conversation.
  * **Advanced:** Implement a summarization strategy. For older parts of the conversation, use a cheaper/faster LLM to create a summary and replace the original messages with it.
  * **RAG (Retrieval-Augmented Generation):** For providing context about a large codebase, don't stuff entire files into the prompt. Use vector embeddings (e.g., with `pgvector` or a library like `llamaindex`) to find the most relevant code snippets for the user's current query and inject only those into the prompt.

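A minimal sliding-window sketch of the "Simple" option, assuming a rough four-characters-per-token heuristic in place of a real tokenizer:

```typescript
type Msg = { role: string; content: string };

// Crude token estimate; swap in a real tokenizer (e.g. tiktoken) in practice.
const approxTokens = (m: Msg): number => Math.ceil(m.content.length / 4);

// Keep the most recent messages that fit the budget,
// never dropping the newest message even if it alone exceeds it.
function windowMessages(history: Msg[], maxTokens: number): Msg[] {
  const kept: Msg[] = [];
  let used = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = approxTokens(history[i]);
    if (kept.length > 0 && used + cost > maxTokens) break;
    kept.unshift(history[i]);
    used += cost;
  }
  return kept;
}
```

A refinement worth adding early: always pin the system prompt and the first user message outside the window, since they anchor the task.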
* **Self-Correction and Autofix:** "Octo's" use of a separate model to fix malformed JSON is brilliant. Expand on this:

  * **JSON Repair:** This is the most common use case. LLMs often produce JSON with trailing commas or missing brackets.
  * **Code Syntax Repair:** If a tool generates code (`edit` or `create`), you can have a "linter" step that uses an LLM to fix basic syntax errors before writing to disk.
  * **Search String Repair:** "Octo" does this for its `diff` edits. This is a great feature to prevent frustrating "search text not found" errors.

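A first repair pass can be purely mechanical before you escalate to a model. A sketch handling the trailing-comma case (the repair regex is my assumption, not Octo's implementation):

```typescript
// Try to parse; on failure, strip trailing commas before ] or } and retry.
// Returns undefined if the text is still invalid JSON, which is the signal
// to escalate to an LLM-based repair step.
function parseWithRepair(raw: string): unknown | undefined {
  try {
    return JSON.parse(raw);
  } catch {
    const repaired = raw.replace(/,\s*([\]}])/g, "$1");
    try {
      return JSON.parse(repaired);
    } catch {
      return undefined;
    }
  }
}
```

Cheap deterministic fixes like this keep the expensive autofix model for the genuinely broken cases.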
### 3. Advanced Concepts & "Next Frontier" Ideas

These are more speculative ideas that could give your tool a unique edge.

* **Multi-Step Planning:** Instead of having the LLM emit one tool call at a time, prompt it to produce a full plan of action as a JSON object (e.g., a list of steps with dependencies). Your agent then becomes an executor for this plan, running the tools in sequence and feeding the results back for the next step. This dramatically increases autonomy.

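The plan object might be shaped like this (field names are illustrative); the executor then only starts steps whose dependencies have completed:

```typescript
// One step of a plan: which tool to run, with what arguments,
// and which earlier steps must finish first.
type PlanStep = {
  id: number;
  tool: string;
  args: Record<string, unknown>;
  dependsOn: number[];
};

type Plan = { goal: string; steps: PlanStep[] };

// The executor repeatedly picks runnable steps, runs them,
// adds their ids to `done`, and loops until the plan is exhausted.
function runnableSteps(plan: Plan, done: Set<number>): PlanStep[] {
  return plan.steps.filter(
    (s) => !done.has(s.id) && s.dependsOn.every((d) => done.has(d)),
  );
}
```

Because dependencies are explicit, independent steps can even run concurrently, and a failed step only blocks its dependents.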
* **Sandboxed Execution Environment:** Running `bash` commands from an LLM directly on your machine is a massive security risk.

  * Use Docker to spin up a container for each session or command. The agent can only modify files inside the container's volume mount.
  * Explore WebAssembly (Wasm) as a secure, lightweight sandboxing target for running code or tools.

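One way to keep the blast radius small is to construct a locked-down `docker run` invocation and route every agent command through it. The flags below are standard Docker options, but treat the exact set as a starting point, not a vetted security boundary:

```typescript
// Build the argv for a locked-down `docker run`,
// suitable for passing to child_process.execFile("docker", args).
function sandboxArgs(workdir: string, command: string): string[] {
  return [
    "run", "--rm",
    "--network", "none",           // no network access
    "--memory", "512m",            // cap memory
    "--pids-limit", "128",         // cap process count (fork bombs)
    "--read-only",                 // read-only root filesystem
    "-v", `${workdir}:/workspace`, // only the project dir is writable
    "-w", "/workspace",
    "node:20-slim",                // illustrative base image
    "bash", "-lc", command,
  ];
}
```

Building the argv as an array (rather than a shell string) also sidesteps a whole class of quoting and injection bugs in the command the LLM produced.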
* **GUI / Rich Interface:** While "Octo" is a great CLI app, a simple web UI or a VS Code extension could provide huge value.

  * Visualize the agent's plan as a graph.
  * Provide rich diff viewers for proposed changes.
  * Allow the user to directly edit the agent's proposed tool arguments before execution.

### 4. Technical Stack & Library Choices

* **Language:** **TypeScript**. For a project of this complexity, type safety is not optional.
* **CLI Framework:** **Ink** (like Octo) is great for rich, interactive UIs. For a more traditional CLI, **Commander.js** or **Yargs** are standard.
* **Schema & Validation:** **Zod** is the current industry standard and is excellent for parsing and validating unpredictable LLM outputs. `structural` is also a fine choice.
* **LLM Interaction:** The **Vercel AI SDK (`ai`)** is a strong starting point. It has built-in helpers for streaming and tool usage, and it supports multiple providers.

### 5. Product & SaaS Ideas

If you're thinking of this as more than a personal project:

* **The "Bring-Your-Own-Key" (BYOK) Model:** This is the easiest way to start. Users provide their own API keys, and your tool is just the client-side orchestrator. You can sell the tool itself as a one-time purchase or a subscription.
* **The Full SaaS Model:** You manage the API keys and bill users for usage (with a markup). This is more complex but offers more value. You could provide premium features:
  * **Hosted Sandboxes:** Users run their code in your secure, cloud-based environments.
  * **Team Collaboration:** Shared sessions, toolsets, and prompts.
  * **Specialized Fine-Tuned Models:** Offer your own fine-tuned "autofix" or planning models as a premium feature.

Start with the core loop and a solid, modular tool system. The `FileTracker` and `autofix` ideas from "Octo" are high-impact features I'd prioritize next. Good luck.

env.example

Lines changed: 2 additions & 3 deletions
@@ -1,3 +1,2 @@
-# Backend Configuration
-ENVIRONMENT=development
-PORT=3000
+OPENAI_API_KEY="OPENAI_API_KEY"
+GOOGLE_API_KEY="GOOGLE_API_KEY"

nodemon.json

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
{
  "watch": ["src"],
  "ext": "ts",
  "ignore": ["src/**/*.test.ts", "dist", "node_modules", ".git", ".vscode"],
  "exec": "tsx src/cli.ts"
}

package.json

Lines changed: 16 additions & 18 deletions
@@ -1,48 +1,46 @@
 {
-  "name": "template-typescript-project",
+  "name": "abi-coding-assistant",
   "version": "0.1.0",
-  "description": "A template for strict TypeScript projects.",
-  "main": "dist/backend/server.js",
+  "description": "An AI coding assistant.",
+  "main": "dist/cli.js",
   "type": "module",
   "scripts": {
-    "start": "node dist/backend/server.js",
-    "predev": "npm run build",
-    "dev": "concurrently \"tsc -w\" \"cpx \\\"src/frontend/public/**\\\" \\\"dist/frontend/public\\\" -w\" \"tsx watch src/backend/server.ts\" \"serve dist/frontend/public -l 5000\"",
-    "build": "rm -rf ./dist && tsc && cpx \"src/frontend/public/**\" \"dist/frontend/public\"",
+    "start": "node dist/cli.js",
+    "dev": "nodemon",
+    "build": "rm -rf ./dist && tsc",
     "test": "vitest run",
     "coverage": "vitest run --coverage",
     "test:watch": "vitest",
     "lint": "eslint . --ext .ts",
     "format": "prettier . --write",
     "typecheck": "tsc --noEmit"
   },
-  "keywords": [],
+  "keywords": [
+    "ai",
+    "typescript",
+    "cli"
+  ],
   "author": "",
   "license": "MIT",
   "dependencies": {
+    "@ai-sdk/openai": "^0.0.33",
+    "ai": "^3.2.16",
     "dotenv": "^16.4.5",
-    "express": "^4.19.2",
-    "express-async-errors": "^3.1.1"
+    "zod": "^3.23.8"
   },
   "devDependencies": {
-    "@types/express": "^4.17.21",
     "@types/node": "^20.14.9",
-    "@types/supertest": "^6.0.2",
     "@typescript-eslint/eslint-plugin": "^7.15.0",
     "@typescript-eslint/parser": "^7.15.0",
     "@vitest/coverage-v8": "^3.2.4",
-    "concurrently": "^9.2.0",
-    "cpx": "^1.5.0",
     "eslint": "^8.57.0",
     "eslint-config-prettier": "^9.1.0",
+    "nodemon": "^3.1.10",
     "prettier": "^3.3.2",
-    "serve": "^14.2.4",
-    "supertest": "^7.0.0",
     "tsx": "^4.16.2",
     "typescript": "^5.5.3",
     "vite-tsconfig-paths": "^4.3.2",
-    "vitest": "^3.2.4",
-    "zod": "^3.23.8"
+    "vitest": "^3.2.4"
   },
   "engines": {
     "node": ">=20.0.0"

src/agent/history.ts

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
// src/agent/history.ts
import { z } from "zod";
import { toolSchemas } from "./tools/index.js";

// A Zod schema for any possible tool call
export const toolCallSchema = z.discriminatedUnion("name", toolSchemas);
export type ToolCall = z.infer<typeof toolCallSchema>;

// Types for our conversation history
export type UserMessage = {
  role: "user";
  content: string;
};

export type AssistantMessage = {
  role: "assistant";
  content: string;
  toolCalls?: ToolCall[];
};

export type ToolResult = {
  role: "tool";
  toolCallId: string;
  toolName: ToolCall["name"];
  result: any;
};

export type Message = UserMessage | AssistantMessage | ToolResult;

src/agent/llm.ts

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
// src/agent/llm.ts
import { openai } from "@ai-sdk/openai";
import { generateText } from "ai";
import "dotenv/config"; // Load .env file
import { toolSchemas } from "./tools/index.js";

export async function getNextAssistantResponse(history: any[]) {
  // Dynamically build the tools object for the API call
  const tools: Record<string, any> = {};
  for (const schema of toolSchemas) {
    const toolName = schema.shape.name.value;
    tools[toolName] = {
      description: `A tool for ${toolName}`, // Generic description
      parameters: schema.shape.arguments,
    };
  }

  // A more specific description for our readFile tool
  if (tools.readFile) {
    tools.readFile.description = "Reads the content of a file at a given path.";
  }

  return generateText({
    model: openai("gpt-4-turbo"),
    system: `You are a helpful AI assistant named Abi. You can use tools to help the user.`,
    messages: history,
    tools,
  });
}

src/agent/main.ts

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
// src/agent/main.ts
import type { Message, AssistantMessage, ToolCall } from "./history.js";
import { getNextAssistantResponse } from "./llm.js";
import { toolImplementations } from "./tools/index.js";

export async function run(history: Message[]) {
  console.log("🤖 Abi is thinking...");

  const response = await getNextAssistantResponse(history);

  // If we have tool calls, execute them
  if (response.toolCalls && response.toolCalls.length > 0) {
    const assistantMessage: AssistantMessage = {
      role: "assistant",
      content: "",
      toolCalls: [], // We will populate this below
    };
    history.push(assistantMessage);

    for (const toolCall of response.toolCalls) {
      // Re-structure the tool call to match our internal schema
      const toolCallForHistory: ToolCall = {
        name: toolCall.toolName as any,
        arguments: toolCall.args,
      };
      assistantMessage.toolCalls?.push(toolCallForHistory);

      const tool = toolImplementations[toolCall.toolName as keyof typeof toolImplementations];

      if (!tool) {
        console.error(`Unknown tool: ${toolCall.toolName}`);
        continue;
      }

      console.log(`› Calling tool: ${toolCall.toolName} with args:`, toolCall.args);
      const result = await tool(toolCall.args as any);

      // Add tool result to history
      history.push({
        role: "tool",
        toolCallId: toolCall.toolCallId,
        toolName: toolCall.toolName as any,
        result,
      });
    }

    // Run the loop again with the tool results to get the final text response
    await run(history);
  } else if (response.text) {
    // We have a final text response
    console.log("🤖 Abi:", response.text);
    history.push({ role: "assistant", content: response.text });
  }
}

src/agent/tools/index.ts

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
// src/agent/tools/index.ts
import { readFile, readFileSchema } from "./readFile.js";

// An array of Zod schemas for all our tools.
// When you add a new tool, add its schema here.
export const toolSchemas = [readFileSchema] as const;

// A map of tool names to their actual functions.
// When you add a new tool, add its implementation here.
export const toolImplementations = {
  readFile: readFile,
};

src/agent/tools/readFile.ts

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
// src/agent/tools/readFile.ts
import { z } from "zod";
import fs from "fs/promises";

export const readFileSchema = z.object({
  name: z.literal("readFile"),
  arguments: z.object({
    path: z.string().describe("The path to the file to read."),
  }),
});

export async function readFile(args: z.infer<typeof readFileSchema>["arguments"]): Promise<string> {
  try {
    const fileContent = await fs.readFile(args.path, "utf-8");
    return fileContent;
  } catch (error: any) {
    return `Error reading file: ${error.message}`;
  }
}

src/backend/app.ts

Lines changed: 0 additions & 30 deletions
This file was deleted.
