Merge pull request #50 from meysamhadeli/refactor/refactor-and-bug-fixes
refactor: refactor and bug fixes
meysamhadeli authored Nov 9, 2024
2 parents 07839e7 + fbb3a12 commit 2811098
Showing 17 changed files with 505 additions and 179 deletions.
11 changes: 6 additions & 5 deletions README.md
@@ -7,7 +7,7 @@

> 💡 **codai is an AI code assistant designed to help developers efficiently manage their daily tasks through a session-based CLI, such as adding new features, refactoring,
and performing detailed code reviews. What makes codai stand out is its deep understanding of the entire context of your project, enabling it to analyze your code base
and suggest improvements or new code based on your context. This AI-powered tool supports multiple LLM models, including GPT-4, GPT-4o, GPT-4o mini, Ollama, and more.**
and suggest improvements or new code based on your context. This AI-powered tool supports multiple LLM models, including GPT-4o, GPT-4, GPT-4o mini, Ollama, and more.**

We use **two** main methods to manage context: **RAG** (Retrieval-Augmented Generation) and **Summarize Full Context of Code**.
Each method has its own benefits and is chosen depending on the specific needs of the request. Below is a description of each method.
@@ -53,8 +53,9 @@ ai_provider_config:
embedding_url: "http://localhost:11434/v1/embeddings" (Optional, if you want to use RAG.)
embedding_model: "text-embedding-3-small" (Optional, if you want to use RAG.)
temperature: 0.2
threshold: 0.3
theme: "dracula"
RAG: true (Optional, if you want, can disable RAG.)
rag: true (Optional, if you want to use RAG.)
```
> Note: We used the standard integration of [OpenAI APIs](https://platform.openai.com/docs/api-reference/introduction) and [Ollama APIs](https://github.com/ollama/ollama/blob/main/docs/api.md); you can find more details in the documentation of each API.
@@ -74,10 +75,10 @@ This flexibility allows you to customize config of codai on the fly.

## 🔮 LLM Models
### ⚡ Best Models
The codai works well with advanced LLM models specifically designed for code generation, including `GPT-4`, `GPT-4o`, and `GPT-4o mini`. These models leverage the latest in AI technology, providing powerful capabilities for understanding and generating code, making them ideal for enhancing your development workflow.
The codai works well with advanced LLM models specifically designed for code generation, including `GPT-4o` and `GPT-4`. These models leverage the latest in AI technology, providing powerful capabilities for understanding and generating code, making them ideal for enhancing your development workflow.

### 💻 Local Models
In addition to cloud-based models, codai is compatible with local models such as `Ollama`. To achieve the best results, it is recommended to utilize models like `DeepSeek-Coder-v2`, `CodeLlama`, and `Mistral`. These models have been optimized for coding tasks, ensuring that you can maximize the efficiency and effectiveness of your coding projects.
In addition to cloud-based models, codai is compatible with local models such as `Ollama`. To achieve the best results, it is recommended to utilize models like [Phi-3-medium instruct (128k)](https://github.com/marketplace/models/azureml/Phi-3-medium-128k-instruct), [Mistral Large (2407)](https://github.com/marketplace/models/azureml-mistral/Mistral-large-2407) and [Meta-Llama-3.1-70B-Instruct](https://github.com/marketplace/models/azureml-meta/Meta-Llama-3-1-70B-Instruct). These models have been optimized for coding tasks, ensuring that you can maximize the efficiency and effectiveness of your coding projects.

### 🌐 OpenAI Embedding Models
The codai platform uses `OpenAI embedding models` to retrieve `relevant content` with high efficiency. Recommended models include **text-embedding-3-large**, **text-embedding-3-small**, and **text-embedding-ada-002**, all known for their `cost-effectiveness` and `accuracy` in `capturing semantic relationships`. These models are ideal for applications needing high-quality performance in `code context retrieval`.
@@ -132,7 +133,7 @@ Summarize the full context of your codebase using Tree-sitter for accurate and e
Implement a Retrieval-Augmented Generation system to improve the relevance and accuracy of code suggestions by retrieving relevant context from the project.

⚡ **Support variety of LLM models:**
Work with advanced LLM models like `GPT-4, GPT-4o, GPT-4o mini and Ollama` to get high-quality code suggestions and interactions.
Work with advanced LLM models like `GPT-4o, GPT-4, GPT-4o mini and Ollama` to get high-quality code suggestions and interactions.

🗂️ **Edit Multiple Files at Once:**
Enable the AI to modify several files at the same time, making it easier to handle complex requests that need changes in different areas of the code.
11 changes: 6 additions & 5 deletions cmd/code.go
@@ -138,7 +138,7 @@ func handleCodeCommand(rootDependencies *RootDependencies) {
topN := -1

// Step 6: Find relevant code chunks based on the user query embedding
fullContextCodes = rootDependencies.Store.FindRelevantChunks(queryEmbedding[0], topN, rootDependencies.Config.AIProviderConfig.EmbeddingModel, rootDependencies.Config.AIProviderConfig.Threshold)
fullContextCodes = rootDependencies.Store.FindRelevantChunks(queryEmbedding[0], topN, rootDependencies.Config.AIProviderConfig.Threshold)
return nil
}

@@ -151,7 +151,6 @@ func handleCodeCommand(rootDependencies *RootDependencies) {
continue startLoop
}

fmt.Println()
spinnerLoadContextEmbedding.Stop()
}

@@ -160,6 +159,8 @@ func handleCodeCommand(rootDependencies *RootDependencies) {
chatRequestOperation := func() error {
finalPrompt, userInputPrompt := rootDependencies.Analyzer.GeneratePrompt(fullContextCodes, rootDependencies.ChatHistory.GetHistory(), userInput, requestedContext)

var b = finalPrompt + userInputPrompt
fmt.Println(b)
// Step 7: Send the relevant code and user input to the AI API
responseChan := rootDependencies.CurrentProvider.ChatCompletionRequest(ctx, userInputPrompt, finalPrompt)

@@ -200,7 +201,7 @@ func handleCodeCommand(rootDependencies *RootDependencies) {
if requestedContext != "" && err == nil {
aiResponseBuilder.Reset()

fmt.Println(lipgloss_color.BlueSky.Render("Trying to send above context files for getting code suggestion fromm AI...\n"))
fmt.Println(lipgloss_color.BlueSky.Render("\nThese files need changes...\n"))

err = chatRequestOperation()

@@ -212,9 +213,9 @@ func handleCodeCommand(rootDependencies *RootDependencies) {
}

// Extract code from AI response and structure this code to apply to git
changes, err := rootDependencies.Analyzer.ExtractCodeChanges(aiResponseBuilder.String())
changes := rootDependencies.Analyzer.ExtractCodeChanges(aiResponseBuilder.String())

if err != nil || changes == nil {
if changes == nil {
fmt.Println(lipgloss_color.BlueSky.Render("\nno code blocks with a valid path detected to apply."))
continue
}
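
The hunk above passes a similarity threshold into `FindRelevantChunks`, but the store's implementation is not part of this file's diff. As a rough sketch of what threshold-based retrieval could look like, the example below assumes cosine similarity over stored chunk embeddings; the `chunk` type and the `filterByThreshold` function are hypothetical names introduced for illustration, not codai's actual API.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// chunk is a hypothetical stand-in for a stored code chunk and its embedding.
type chunk struct {
	Code      string
	Embedding []float64
}

// cosineSimilarity returns the cosine of the angle between a and b,
// or 0 when the lengths differ or either vector has zero magnitude.
func cosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// filterByThreshold keeps chunks whose similarity to query is at least
// threshold, ordered from most to least similar. A negative topN is
// treated as "no limit" (an assumption about what topN := -1 means above).
func filterByThreshold(query []float64, chunks []chunk, topN int, threshold float64) []string {
	type scored struct {
		code  string
		score float64
	}
	var hits []scored
	for _, c := range chunks {
		if s := cosineSimilarity(query, c.Embedding); s >= threshold {
			hits = append(hits, scored{c.Code, s})
		}
	}
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].score > hits[j].score })
	if topN >= 0 && topN < len(hits) {
		hits = hits[:topN]
	}
	out := make([]string, 0, len(hits))
	for _, h := range hits {
		out = append(out, h.code)
	}
	return out
}

func main() {
	chunks := []chunk{
		{"func A() {}", []float64{1, 0}},
		{"func B() {}", []float64{0, 1}},
		{"func C() {}", []float64{1, 1}},
	}
	// With threshold 0.3 (the README's example value), the orthogonal chunk B is dropped.
	fmt.Println(filterByThreshold([]float64{1, 0}, chunks, -1, 0.3))
}
```

A lower threshold admits more loosely related chunks into the prompt; a higher one trims context at the risk of missing relevant code.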
2 changes: 1 addition & 1 deletion cmd/root.go
@@ -62,7 +62,7 @@ func handleRootCommand(cmd *cobra.Command) *RootDependencies {

rootDependencies.ChatHistory = providers.NewChatHistory()

rootDependencies.Analyzer = code_analyzer.NewCodeAnalyzer(rootDependencies.Cwd)
rootDependencies.Analyzer = code_analyzer.NewCodeAnalyzer(rootDependencies.Cwd, rootDependencies.Config.RAG)

rootDependencies.Store = embedding_store.NewEmbeddingStoreModel()

185 changes: 117 additions & 68 deletions code_analyzer/analyzer.go
@@ -3,7 +3,6 @@ package code_analyzer
import (
"encoding/json"
"fmt"
"github.com/charmbracelet/lipgloss"
"github.com/meysamhadeli/codai/code_analyzer/contracts"
"github.com/meysamhadeli/codai/code_analyzer/models"
"github.com/meysamhadeli/codai/embed_data"
@@ -26,56 +25,37 @@ import (

// CodeAnalyzer handles the analysis of project files.
type CodeAnalyzer struct {
Cwd string
Cwd string
IsRAG bool
}

// Define styles for the box
var (
boxStyle = lipgloss.NewStyle().Border(lipgloss.NormalBorder()).Align(lipgloss.Center)
)

func (analyzer *CodeAnalyzer) GeneratePrompt(codes []string, history []string, userInput string, requestedContext string) (string, string) {
var promptTemplate string
if analyzer.IsRAG {
promptTemplate = string(embed_data.RagContextPrompt)
} else {
promptTemplate = string(embed_data.SummarizeFullContextPrompt)
}

// Combine the relevant code into a single string
code := strings.Join(codes, "\n---------\n\n")

prompt := fmt.Sprintf("%s\n\n______\n%s\n\n", fmt.Sprintf(boxStyle.Render("Here is the summary of context of project")+"\n\n%s", code), fmt.Sprintf(boxStyle.Render("Here is the general template prompt for using AI")+"\n\n%s", string(embed_data.CodeBlockTemplate)))
prompt := fmt.Sprintf("%s\n\n______\n%s\n\n______\n", fmt.Sprintf("## Here is the summary of context of project\n\n%s", code), fmt.Sprintf("## Here is the general template prompt for using AI\n\n%s", promptTemplate))
userInputPrompt := fmt.Sprintf("## Here is user request\n%s", userInput)

if requestedContext != "" {
prompt = prompt + fmt.Sprintf(boxStyle.Render("Here are the details of the context of the project that was requested for use in your task")+"\n\n%s", requestedContext)
prompt = prompt + fmt.Sprintf("## Here are the requested full context files for using in your task\n\n%s______\n", requestedContext)
}

userInputPrompt := fmt.Sprintf(boxStyle.Render("Here is user request")+"\n%s", userInput)
historyPrompt := boxStyle.Render("Here is the history of chats") + "\n\n" + strings.Join(history, "\n---------\n\n")
historyPrompt := "## Here is the history of chats\n\n" + strings.Join(history, "\n---------\n\n")
finalPrompt := fmt.Sprintf("%s\n\n______\n\n%s", historyPrompt, prompt)

return finalPrompt, userInputPrompt
}

// NewCodeAnalyzer initializes a new CodeAnalyzer.
func NewCodeAnalyzer(cwd string) contracts.ICodeAnalyzer {
return &CodeAnalyzer{Cwd: cwd}
}

// ApplyChanges updates or creates a file at the given relativePath with the specified code.
func (analyzer *CodeAnalyzer) ApplyChanges(relativePath, code string) error {
// Ensure the directory structure exists
dir := filepath.Dir(relativePath)
if err := os.MkdirAll(dir, os.ModePerm); err != nil {
return fmt.Errorf("failed to create directory: %w", err)
}

// Check if file exists
if _, err := os.Stat(relativePath); os.IsNotExist(err) {
// File does not exist, create and write code
if err := ioutil.WriteFile(relativePath, []byte(code), 0644); err != nil {
return fmt.Errorf("failed to create file: %w", err)
}
} else {
// File exists, update the content
if err := ioutil.WriteFile(relativePath, []byte(code), 0644); err != nil {
return fmt.Errorf("failed to update file: %w", err)
}
}
return nil
func NewCodeAnalyzer(cwd string, isRAG bool) contracts.ICodeAnalyzer {
return &CodeAnalyzer{Cwd: cwd, IsRAG: isRAG}
}

func (analyzer *CodeAnalyzer) GetProjectFiles(rootDir string) ([]models.FileData, []string, error) {
@@ -227,18 +207,54 @@ func (analyzer *CodeAnalyzer) ProcessFile(filePath string, sourceCode []byte) []
return elements
}

func (analyzer *CodeAnalyzer) TryGetInCompletedCodeBlocK(relativePaths string) (string, error) {
var codes []string

// Simplified regex to capture only the array of files
re := regexp.MustCompile(`\[.*?\]`)
match := re.FindString(relativePaths)

if match == "" {
return "", fmt.Errorf("no file paths found in input")
}

// Parse the match into a slice of strings
var filePaths []string
err := json.Unmarshal([]byte(match), &filePaths)
if err != nil {
return "", fmt.Errorf("failed to unmarshal JSON: %v", err)
}

// Loop through each relative path and read the file content
for _, relativePath := range filePaths {
content, err := os.ReadFile(relativePath)
if err != nil {
continue
}

codes = append(codes, fmt.Sprintf("File: %s\n\n%s", relativePath, content))
}

if len(codes) == 0 {
return "", fmt.Errorf("no valid files read")
}

requestedContext := strings.Join(codes, "\n---------\n\n")

return requestedContext, nil
}

// ExtractCodeChanges extracts code changes from the given text.
func (analyzer *CodeAnalyzer) ExtractCodeChanges(text string) ([]models.CodeChange, error) {
func (analyzer *CodeAnalyzer) ExtractCodeChanges(text string) []models.CodeChange {
if text == "" {
return nil, nil
return nil
}

// Regex patterns
filePathPattern := regexp.MustCompile(`(?i)(?:.*?File:\s*)([^\s*]+?\.[a-zA-Z0-9]+)\b`)
// Regex patterns for file paths and code blocks
filePathPattern := regexp.MustCompile(`(?i)(?:\d+\.\s*|File:\s*)([^\s*]+?\.[a-zA-Z0-9]+)\b`)
// Capture entire diff blocks, assuming they are enclosed in ```diff ... ```
codeBlockPattern := regexp.MustCompile("(?s)```[a-zA-Z0-9]*\\s*(.*?)\\s*```")

var codeChanges []models.CodeChange

// Find all file path matches and code block matches
filePathMatches := filePathPattern.FindAllStringSubmatch(text, -1)
codeMatches := codeBlockPattern.FindAllStringSubmatch(text, -1)
@@ -249,54 +265,87 @@ func (analyzer *CodeAnalyzer) ExtractCodeChanges(text string) ([]models.CodeChan
minLength = len(codeMatches)
}

// Create code changes up to the minimum length
// Initialize code changes
var codeChanges []models.CodeChange
for i := 0; i < minLength; i++ {
relativePath := strings.TrimSpace(filePathMatches[i][1])
code := strings.TrimSpace(codeMatches[i][1])

// Capture the relative path and associated diff content
codeChange := models.CodeChange{
RelativePath: relativePath,
Code: code,
}
codeChanges = append(codeChanges, codeChange)
}

return codeChanges, nil
return codeChanges
}

func (analyzer *CodeAnalyzer) TryGetInCompletedCodeBlocK(relativePaths string) (string, error) {
var codes []string
func (analyzer *CodeAnalyzer) ApplyChanges(relativePath, diff string) error {
// Ensure the directory structure exists
dir := filepath.Dir(relativePath)
if err := os.MkdirAll(dir, os.ModePerm); err != nil {
return fmt.Errorf("failed to create directory: %w", err)
}

// Simplified regex to capture only the array of files
re := regexp.MustCompile(`\[.*?\]`)
match := re.FindString(relativePaths)
// Process the diff content: handle additions and deletions
diffLines := strings.Split(diff, "\n")
var updatedContent []string

if match == "" {
return "", fmt.Errorf("no file paths found in input")
}
for _, line := range diffLines {
trimmedLine := strings.TrimSpace(line)
if strings.HasPrefix(trimmedLine, "-") {
// Ignore lines that start with "-", effectively deleting them
continue
} else if strings.HasPrefix(trimmedLine, "+") {
// Add lines that start with "+", but remove the "+" symbol
updatedContent = append(updatedContent, strings.ReplaceAll(trimmedLine, "+", " "))

// Parse the match into a slice of strings
var filePaths []string
err := json.Unmarshal([]byte(match), &filePaths)
if err != nil {
return "", fmt.Errorf("failed to unmarshal JSON: %v", err)
} else {
// Keep all other lines as they are
updatedContent = append(updatedContent, line)
}
}

// Loop through each relative path and read the file content
for _, relativePath := range filePaths {
content, err := os.ReadFile(relativePath)
if err != nil {
continue
// Handle deletion if code is empty
if diff == "" {
// Check if file exists, then delete if it does
if err := os.Remove(relativePath); err != nil {
if os.IsNotExist(err) {
fmt.Printf("File %s does not exist, so no deletion necessary.\n", relativePath)
} else {
return fmt.Errorf("failed to delete file: %w", err)
}
}

codes = append(codes, fmt.Sprintf("File: %s\n\n%s", relativePath, content))
// After file deletion, check if the directory is empty and delete it if so
if err := removeEmptyDirectoryIfNeeded(dir); err != nil {
return err
}
} else {
// Write the updated content to the file
if err := ioutil.WriteFile(relativePath, []byte(strings.Join(updatedContent, "\n")), 0644); err != nil {
return fmt.Errorf("failed to write to file: %w", err)
}
}

if len(codes) == 0 {
return "", fmt.Errorf("no valid files read")
}
return nil
}

requestedContext := strings.Join(codes, "\n---------\n\n")
// removeEmptyDirectoryIfNeeded checks if a directory is empty, and if so, deletes it
func removeEmptyDirectoryIfNeeded(dir string) error {
// Check if the directory is empty
entries, err := os.ReadDir(dir)
if err != nil {
return fmt.Errorf("failed to read directory %s: %w", dir, err)
}

return requestedContext, nil
// If the directory is empty, remove it
if len(entries) == 0 {
if err := os.Remove(dir); err != nil {
return fmt.Errorf("failed to delete empty directory %s: %w", dir, err)
}
}
return nil
}
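
One detail in the new `ApplyChanges` loop is worth a closer look: added lines are cleaned with `strings.ReplaceAll(trimmedLine, "+", " ")`, which replaces every `+` in the line (so `i++` or `a + b` would be altered) and works on the whitespace-trimmed line, so indentation is lost. The sketch below shows one possible alternative that removes only the leading marker. It assumes markers sit in column zero, as in a unified diff, and does not handle `---`/`+++` file headers; `applyDiffLines` is a hypothetical helper name, not part of the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// applyDiffLines is a hypothetical helper: it drops lines marked "-",
// keeps lines marked "+" with only that first marker removed, and
// passes every other line through unchanged.
func applyDiffLines(diff string) string {
	var out []string
	for _, line := range strings.Split(diff, "\n") {
		switch {
		case strings.HasPrefix(line, "-"):
			// Deleted line: leave it out of the result.
			continue
		case strings.HasPrefix(line, "+"):
			// Added line: strip the single leading marker, keep the rest verbatim.
			out = append(out, strings.TrimPrefix(line, "+"))
		default:
			out = append(out, line)
		}
	}
	return strings.Join(out, "\n")
}

func main() {
	diff := "func inc(i int) int {\n-\treturn i\n+\ti++\n+\treturn i\n}"
	fmt.Println(applyDiffLines(diff))
}
```

Because only the first character is removed, operators such as `++` and the original indentation of added lines survive the rewrite.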