Merged
Original file line number Diff line number Diff line change
@@ -5,16 +5,16 @@ import org.apache.pdfbox.pdmodel.PDDocument
import org.apache.pdfbox.rendering.PDFRenderer
import org.apache.pdfbox.text.PDFTextStripper
import java.awt.image.BufferedImage
import java.io.File
import javax.imageio.ImageIO
import java.io.File
import java.util.*
import javax.imageio.spi.IIORegistry
import javax.imageio.spi.ImageReaderSpi
import javax.imageio.spi.ImageWriterSpi
import java.util.ServiceLoader

class PDFReader(pdfFile: File) : PaginatedDocumentReader, RenderableDocumentReader {
class PDFReader(pdfFile: File) : PaginatedDocumentReader, RenderableDocumentReader {
private val document: PDDocument = Loader.loadPDF(pdfFile)
private val renderer: PDFRenderer = PDFRenderer(document)

companion object {
init {
val registry = IIORegistry.getDefaultInstance()
116 changes: 113 additions & 3 deletions docs/cognitive_modes.md
Original file line number Diff line number Diff line change
@@ -110,9 +110,9 @@ complex internal "state of mind" to solve problems iteratively.

* **How It Works (Internal Logic):**

1. **Initialization:** Upon receiving a user message, it calls `initThinking` to create an initial `ReasoningState`.
This data structure contains the agent's goals (short and long-term), knowledge (facts, hypotheses, open questions),
and execution context (next steps).
1. **Initialization:** Upon receiving a user message, it initializes its internal state using a specific **Cognitive Strategy**.
The default strategy (`ProjectManagerStrategy`) creates a `ReasoningState` containing goals, knowledge, and execution context.
However, other strategies can be used to define different mental models (e.g., Scientific Method, Agile Development).
2. **The Main Loop (Think-Act-Reflect):**
* **Think (`getNextTask`):** At the start of each iteration, the agent analyzes its current `ReasoningState` and
the history of past actions to decide on a small batch of tasks to execute next.
@@ -128,6 +128,12 @@ complex internal "state of mind" to solve problems iteratively.
* **Iterative:** Refines its understanding and plan over time.
* **Stateful & Reflective:** The `ReasoningState` acts as its memory and consciousness, allowing it to learn from its
actions.
* **Cognitive Strategies:** The mode's behavior is defined by its strategy. Available strategies include:
* **Project Manager:** Standard goal-oriented planning.
* **Scientific Researcher:** Hypothesis-driven investigation.
* **Agile Developer:** Iterative Test-Driven Development.
* **Critical Auditor:** Security and logic validation.
* **Creative Writer:** Narrative and content generation.

* **When to Use It:**
* Complex, ambiguous, or poorly defined problems that require research, experimentation, and adaptation.
@@ -188,3 +194,107 @@ of smaller, manageable sub-goals and tasks, and then orchestrates their executio
* **Weaknesses:**
* Incurs significant overhead from the constant planning, decomposition, and status updates.
* The success of the entire plan is highly dependent on the quality of the AI's decomposition logic.

### 5. Parallel Mode

The `ParallelMode` is a batch-processing engine designed to execute a specific task across multiple inputs simultaneously.
* **High-Level Concept:** Analyze the user's request to identify a template task and a set of variables (e.g., a list of files). Combine the variable values according to the selected mode (every combination, or positional pairs), render the template for each combination, and execute the resulting tasks in parallel.
* **How It Works (Internal Logic):**
    1. **Configuration Parsing:** The user's message is analyzed by a `ParsedAgent` to extract a `Config` object. This includes:
        * **Variables:** Lists of items to process (e.g., file paths, input strings). Supports glob patterns (e.g., `src/**/*.kt`).
        * **Template:** A string with placeholders (e.g., "Review the code in {{file}}").
        * **Concurrency:** How many tasks to run at once.
        * **Mode:** How to combine variables (`CrossJoin` for all combinations, `Zip` for pairing).
    2. **Expansion & Combination:** Variable values are expanded (e.g., resolving file globs). The system then generates a list of task configurations based on the selected mode.
    3. **Parallel Execution:** A `FixedConcurrencyProcessor` manages the execution. For each combination:
        * The template is rendered with the specific values.
        * The system determines the appropriate task implementation (using logic similar to Conversational Mode).
        * The task is executed, and results are displayed in a tabbed interface.
* **Key Characteristics:**
    * **High Throughput:** Optimized for running many independent tasks at once.
    * **Template-Driven:** Uses a single instruction template applied to many contexts.
    * **Flexible Inputs:** Supports file globs and variable lists.
* **When to Use It:**
    * Batch operations on files (e.g., "Refactor all Java files in src/").
    * Running the same analysis on multiple datasets.
    * Testing a prompt against a variety of inputs.
* **Strengths:**
    * Cuts wall-clock time for repetitive tasks by running them concurrently.
    * Automates the creation of many similar tasks.
    * Visualizes progress across multiple streams via tabs.
* **Weaknesses:**
    * Not suitable for tasks with dependencies between steps.
    * Can consume API quota quickly, because many requests run at once.
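
The combination and rendering steps above can be sketched in a few lines of Kotlin. This is an illustration only, not the Cognotik implementation: `CombineMode`, `combine`, `render`, and `runAll` are hypothetical names, a plain fixed thread pool stands in for `FixedConcurrencyProcessor`, and truncating `Zip` to the shortest list is an assumption.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical names for illustration; not the Cognotik API.
enum class CombineMode { CrossJoin, Zip }

// Build one name -> value map per task from the variable lists.
fun combine(variables: Map<String, List<String>>, mode: CombineMode): List<Map<String, String>> =
    when (mode) {
        // Every combination: extend each partial assignment with each value of the next variable.
        CombineMode.CrossJoin -> variables.entries.fold(listOf(emptyMap<String, String>())) { acc, (name, values) ->
            acc.flatMap { partial -> values.map { value -> partial + (name to value) } }
        }
        // Positional pairing, truncated to the shortest list (an assumption).
        CombineMode.Zip -> {
            val size = variables.values.minOfOrNull { it.size } ?: 0
            (0 until size).map { i -> variables.mapValues { (_, values) -> values[i] } }
        }
    }

// Replace each {{name}} placeholder with its value.
fun render(template: String, values: Map<String, String>): String =
    values.entries.fold(template) { text, (name, value) -> text.replace("{{$name}}", value) }

// Run the tasks with at most `concurrency` threads; results keep the input order.
fun <T> runAll(tasks: List<() -> T>, concurrency: Int): List<T> {
    val pool = Executors.newFixedThreadPool(concurrency)
    try {
        return pool.invokeAll(tasks.map { task -> Callable<T> { task() } }).map { it.get() }
    } finally {
        pool.shutdown()
    }
}

fun main() {
    val vars = mapOf("file" to listOf("A.kt", "B.kt"), "check" to listOf("style", "bugs"))
    val prompts = combine(vars, CombineMode.CrossJoin).map { render("Review {{file}} for {{check}}", it) }
    // Stand-in for real task execution: each task just echoes its prompt.
    runAll(prompts.map { prompt -> { "done: $prompt" } }, concurrency = 2).forEach(::println)
}
```

With two variables of two values each, `CrossJoin` produces four tasks and `Zip` produces two, which is why the mode choice directly affects how much API quota a run consumes.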

### 6. Protocol Mode (Experimental)

The `ProtocolMode` is a rigorous, state-machine-driven strategy designed to enforce specific methodologies and ensure high-quality output through validation.
* **High-Level Concept:** Define a strict protocol (a set of states with instructions and validation criteria) to achieve the user's request. The system moves through these states, executing actions and validating them with a "Referee" agent before proceeding.
* **How It Works (Internal Logic):**
    1. **Protocol Definition:** The agent analyzes the request and defines a `ProtocolDefinition`. This is a state machine containing a list of states (e.g., "Red", "Green", "Refactor" for TDD), an initial state, and transitions. Each state has specific instructions and validation criteria.
    2. **State Execution Loop:**
        * **Action:** The system enters the current state and uses a "StateExecutor" agent to perform the required task based on the state's instructions.
        * **Validation:** A "Referee" agent reviews the result of the action against the state's `validationCriteria`.
        * **Retry/Transition:** If the validation passes, the system transitions to the defined `nextState`. If it fails, the system retries the action (up to a limit) with feedback from the Referee.
    3. **Termination:** The process continues until a terminal state (no next state) is reached or a safety limit is hit.
* **Key Characteristics:**
    * **Methodical:** Enforces structured workflows like TDD or Read-Draft-Verify.
    * **Self-Correcting:** The Referee loop ensures that each step meets quality standards before moving on.
    * **Transparent:** The protocol and state transitions are clearly visible.
* **When to Use It:**
    * Tasks requiring strict adherence to a process (e.g., Test-Driven Development).
    * Generating high-stakes documentation or code where verification is crucial.
    * Complex workflows that can be modeled as a state machine.
* **Strengths:**
    * High reliability due to the validation step.
    * Enforces best practices (like writing tests before code).
    * Clear separation of concerns between execution and validation.
* **Weaknesses:**
    * Can be slow due to the overhead of validation and potential retries.
    * Rigid compared to conversational modes.
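
The execute, validate, retry, transition loop described above might look roughly like the following. It is a minimal sketch under stated assumptions: `ProtocolState`, `Verdict`, and `runProtocol` are hypothetical names, the executor and referee are plain functions rather than LLM-backed agents, and stopping the run when retries are exhausted is an assumption the description does not spell out.

```kotlin
// Hypothetical types for illustration; the real ProtocolDefinition may differ.
data class ProtocolState(
    val name: String,
    val instructions: String,
    val validationCriteria: String,
    val nextState: String? // null marks a terminal state
)

data class Verdict(val passed: Boolean, val feedback: String = "")

// Returns the names of the states that passed validation, in order.
fun runProtocol(
    states: Map<String, ProtocolState>,
    initialState: String,
    execute: (ProtocolState, String?) -> String, // (state, referee feedback) -> result
    referee: (ProtocolState, String) -> Verdict, // (state, result) -> verdict
    maxRetries: Int = 3,
    maxSteps: Int = 20 // safety limit on state executions
): List<String> {
    val visited = mutableListOf<String>()
    var current: String? = initialState
    var steps = 0
    while (steps++ < maxSteps) {
        val state = states.getValue(current ?: break)
        var feedback: String? = null
        var passed = false
        var attempts = 0
        // One initial attempt plus up to maxRetries retries, each fed the referee's feedback.
        while (!passed && attempts++ <= maxRetries) {
            val verdict = referee(state, execute(state, feedback))
            passed = verdict.passed
            feedback = verdict.feedback
        }
        if (!passed) break // assumption: give up once retries are exhausted
        visited += state.name
        current = state.nextState
    }
    return visited
}

fun main() {
    val tdd = listOf(
        ProtocolState("Red", "Write a failing test", "A test exists and fails", "Green"),
        ProtocolState("Green", "Make the test pass", "All tests pass", "Refactor"),
        ProtocolState("Refactor", "Clean up the code", "Tests still pass", null)
    ).associateBy { it.name }
    // Stub agents: the executor only succeeds once it has received referee feedback.
    val visited = runProtocol(
        states = tdd,
        initialState = "Red",
        execute = { state, feedback -> if (feedback == null) "draft of ${state.name}" else "fixed ${state.name}" },
        referee = { _, result -> Verdict(result.startsWith("fixed"), "Does not meet the criteria yet") }
    )
    println(visited) // [Red, Green, Refactor]
}
```

Keeping the referee separate from the executor is what gives the mode its separation of concerns: the executor never decides for itself whether a state is finished.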

### 7. Session Mode (Experimental)

The `SessionMode` focuses on deep interaction with a single tool. It assigns an AI "Operator" to drive a specific tool continuously until a goal is achieved.
* **High-Level Concept:** Select the most appropriate tool for the user's request, then enter a loop where an AI operator issues commands to that tool, interprets the output, and issues new commands until the task is done.
* **How It Works (Internal Logic):**
    1. **Tool Selection:** The system analyzes the user's message to select a single, persistent tool (e.g., a specific CLI wrapper or coding agent).
    2. **Session Loop:**
        * **Plan:** A "SessionOperator" agent reviews the conversation history and the current goal. It decides whether the goal is complete or what the next command should be.
        * **Execute:** The command is executed by the selected tool.
        * **Update:** The command and its result are added to the session history.
    3. **Termination:** The loop ends when the Operator deems the goal complete or a limit is reached.
* **Key Characteristics:**
    * **Tool-Centric:** Locks onto one tool and uses it extensively.
    * **Autonomous Operator:** The AI acts as a user of the tool, navigating its interface or command set.
    * **Stateful:** Maintains the context of the tool's session.
* **When to Use It:**
    * Tasks that require multiple interactions with the same utility (e.g., "Debug this issue using the terminal").
    * Exploratory tasks where the AI needs to "poke around" using a specific instrument.
* **Strengths:**
    * Allows for complex, multi-step operations within a specific domain.
    * Reduces context switching by focusing on one tool.
* **Weaknesses:**
    * Limited to the capabilities of the selected tool.
    * Can get stuck in loops if the tool provides confusing feedback.
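
As a rough illustration of the plan, execute, update loop described above, the sketch below drives a fake tool with a scripted operator. All names (`Decision`, `Run`, `Done`, `runSession`) are hypothetical, and the operator is an ordinary function standing in for the "SessionOperator" agent.

```kotlin
// Hypothetical types for illustration; not the Cognotik API.
sealed class Decision
data class Run(val command: String) : Decision()
object Done : Decision()

// Returns the session history as (command, output) pairs.
fun runSession(
    goal: String,
    operator: (String, List<Pair<String, String>>) -> Decision, // (goal, history) -> next step
    tool: (String) -> String,                                   // command -> output
    maxTurns: Int = 10                                          // guards against endless loops
): List<Pair<String, String>> {
    val history = mutableListOf<Pair<String, String>>()
    repeat(maxTurns) {
        when (val decision = operator(goal, history)) {
            is Done -> return history
            is Run -> history.add(decision.command to tool(decision.command))
        }
    }
    return history
}

fun main() {
    // A fake shell with canned answers stands in for the selected tool.
    val fakeShell = mapOf("ls" to "notes.txt", "cat notes.txt" to "TODO: fix the build")
    val session = runSession(
        goal = "Find out what notes.txt says",
        operator = { _, past ->
            when (past.size) {
                0 -> Run("ls")
                1 -> Run("cat " + past.last().second) // use the previous output to pick the next command
                else -> Done
            }
        },
        tool = { command -> fakeShell[command] ?: "unknown command" }
    )
    session.forEach { (command, output) -> println("$ $command\n$output") }
}
```

The `maxTurns` cap corresponds to the "limit" in the termination step and is the only protection here against the stuck-in-a-loop weakness noted above.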

### 8. Council Mode

The `CouncilMode` implements a democratic, multi-agent decision-making process. Instead of a single agent driving the process, a "council" of distinct personas collaborates to nominate and vote on tasks.
* **High-Level Concept:** A group of specialized agents (e.g., CEO, CTO, QA) independently analyze the situation and nominate tasks. They then vote on the best course of action. The winning tasks are executed, and all agents update their internal states based on the results.
* **How It Works (Internal Logic):**
    1. **Council Initialization:** The mode initializes a list of `CognitiveSchemaStrategy` instances, representing the council members (default: CEO, CTO, QA). Each member maintains its own private state.
    2. **The Main Loop:**
        * **Nomination:** Each council member analyzes the current situation and nominates tasks.
        * **Voting:** If there are conflicting nominations, the council members vote on the proposed tasks.
        * **Execution:** The tasks with the most votes are executed.
        * **State Update:** Every council member observes the results of the executed tasks and updates their own internal state/perspective accordingly.
* **Key Characteristics:**
    * **Multi-Perspective:** Balances different viewpoints (e.g., business value vs. technical feasibility vs. quality).
    * **Democratic:** Decisions are made via voting, preventing one narrow perspective from dominating.
* **When to Use It:**
    * High-stakes projects requiring balanced decision-making.
    * Complex architectural design where trade-offs need to be weighed.
    * Situations where a single agent might be prone to bias or tunnel vision.
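
One round of nominate, vote, execute, and update can be sketched as below. This is a toy model under loud assumptions: `CouncilMember` and `councilRound` are hypothetical names, each member is reduced to a fixed priority list instead of a `CognitiveSchemaStrategy`, one task wins per round, and ties go to the earliest nomination.

```kotlin
// Hypothetical model for illustration; real council members are LLM-backed strategies.
class CouncilMember(val name: String, private val priorities: List<String>) {
    // Private state: the tasks this member has seen executed.
    private val completed = mutableSetOf<String>()

    // Nominate the highest-priority task that has not been executed yet.
    fun nominate(): String? = priorities.firstOrNull { it !in completed }

    // Vote for the candidate that ranks highest in this member's own priorities.
    fun vote(candidates: List<String>): String = candidates.sortedBy { rank(it) }.first()

    fun observe(executedTask: String) { completed.add(executedTask) }

    private fun rank(task: String): Int =
        priorities.indexOf(task).let { if (it < 0) Int.MAX_VALUE else it }
}

// Runs one round and returns the executed task, or null when nobody nominates anything.
fun councilRound(council: List<CouncilMember>): String? {
    val nominations = council.mapNotNull { it.nominate() }.distinct()
    if (nominations.isEmpty()) return null
    val winner = if (nominations.size == 1) {
        nominations.first() // no conflict, so no vote is needed
    } else {
        val tally = council.groupingBy { it.vote(nominations) }.eachCount()
        // sortedByDescending is stable, so ties go to the earliest nomination.
        nominations.sortedByDescending { tally[it] ?: 0 }.first()
    }
    council.forEach { it.observe(winner) } // every member updates its own state
    return winner
}

fun main() {
    val council = listOf(
        CouncilMember("CEO", listOf("ship feature", "write tests", "fix build")),
        CouncilMember("CTO", listOf("fix build", "ship feature", "write tests")),
        CouncilMember("QA", listOf("fix build", "write tests", "ship feature"))
    )
    while (true) {
        val task = councilRound(council) ?: break
        println(task)
    }
}
```

In this example the CEO's preferred task loses the first vote two to one, which is the "preventing one narrow perspective from dominating" property in miniature.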


2 changes: 1 addition & 1 deletion gradle.properties
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
pluginName=Cognotik - Open Source Agentic Power Tools
pluginRepositoryUrl=https://github.com/SimiaCryptus/Cognotik
libraryGroup=com.cognotik
libraryVersion=2.0.36
libraryVersion=2.0.37
gradleVersion=8.13
org.gradle.caching=true
org.gradle.configureondemand=false
17 changes: 3 additions & 14 deletions intellij/src/main/kotlin/cognotik/actions/plan/PlanConfigDialog.kt
Original file line number Diff line number Diff line change
@@ -16,6 +16,7 @@ import com.simiacryptus.cognotik.plan.TaskType
import com.simiacryptus.cognotik.plan.TaskTypeConfig
import com.simiacryptus.cognotik.plan.cognitive.CognitiveModeStrategies
import com.simiacryptus.cognotik.plan.newSettings
import com.simiacryptus.cognotik.plan.tools.toApiChatModel
import com.simiacryptus.cognotik.platform.ApplicationServices
import com.simiacryptus.cognotik.platform.model.ApiChatModel
import com.simiacryptus.cognotik.platform.model.ApiData
@@ -202,7 +203,7 @@ class PlanConfigDialog(
}

private fun editTaskConfig(entry: TaskConfigEntry) {
val dialog = TaskConfigEditDialog(null, entry.taskType, entry.config, visibleModelsCache)
val dialog = TaskConfigDialog(null, entry.taskType, entry.config, visibleModelsCache)
if (dialog.showAndGet()) {
val updatedConfig = dialog.getConfig()
val oldKey =
@@ -249,7 +250,7 @@ class PlanConfigDialog(
)
return
}
val editDialog = TaskConfigEditDialog(null, taskType, newConfig, visibleModelsCache)
val editDialog = TaskConfigDialog(null, taskType, newConfig, visibleModelsCache)
if (editDialog.showAndGet()) {
val config = editDialog.getConfig()
val key = if (config.name != null) "${taskType.name}_${config.name}" else taskType.name
@@ -721,15 +722,3 @@ class PlanConfigDialog(
}
}


fun ChatModel.toApiChatModel(): ApiChatModel {
val apis = ApplicationServices.fileApplicationServices().userSettingsManager.getUserSettings().apis
return ApiChatModel(
model = this, provider = ApiData(
key = apis.find { it.provider == this.provider }?.key
?: throw IllegalArgumentException("No API Key for ${this.provider?.name}"),
baseUrl = apis.find { it.provider == this.provider }?.baseUrl ?: this.provider?.base ?: "",
provider = this.provider,
).validate()
)
}
Original file line number Diff line number Diff line change
@@ -25,11 +25,13 @@ import com.simiacryptus.cognotik.plan.tools.online.fetch.FetchMethod
import com.simiacryptus.cognotik.plan.tools.online.processing.ProcessingStrategyType
import com.simiacryptus.cognotik.plan.tools.online.seed.SeedMethod
import com.simiacryptus.cognotik.plan.tools.social.PersuasiveEssayTask
import com.simiacryptus.cognotik.plan.tools.file.PdfFormTask
import com.simiacryptus.cognotik.plan.tools.toApiChatModel
import java.awt.Component
import java.awt.Dimension
import javax.swing.*

class TaskConfigEditDialog(
class TaskConfigDialog(
project: Project?,
private val taskType: TaskType<*, *>,
private val config: TaskTypeConfig,
@@ -105,6 +107,7 @@ class TaskConfigEditDialog(
is MCPToolTask.MCPToolTaskTypeConfig -> createMCPToolFields(config)
is SubPlanTask.SubPlanTaskTypeConfig -> createSubPlanningFields(config)
is PersuasiveEssayTask.PersuasiveEssayTaskTypeConfig -> createPersuasiveEssayFields(config)
is PdfFormTask.PdfFormTypeConfig -> createPdfFormFields(config)
// Add more task types as needed
}
}
@@ -126,6 +129,19 @@ class TaskConfigEditDialog(
group("Self-Healing Settings") {
}
}
    private fun com.intellij.ui.dsl.builder.Panel.createPdfFormFields(config: PdfFormTask.PdfFormTypeConfig) {
        group("PDF Form Settings") {
            row("Template File:") {
                val field = JBTextField(config.template_file ?: "")
                field.toolTipText = "Path to the PDF template file relative to project root"
                cell(field)
                    .align(Align.FILL)
                    .comment("Path to the PDF template file (e.g., templates/form.pdf)")
                configFields["template_file"] = field
            }
        }
    }


private fun com.intellij.ui.dsl.builder.Panel.createMCPToolFields(config: MCPToolTask.MCPToolTaskTypeConfig) {
group("MCP Tool Settings") {
@@ -284,7 +300,7 @@ class TaskConfigEditDialog(
val config = if (dialog.isQuickSelect) {
newConfig
} else {
val configDialog = TaskConfigEditDialog(null, taskType, newConfig, availableModels)
val configDialog = TaskConfigDialog(null, taskType, newConfig, availableModels)
if (configDialog.showAndGet()) configDialog.getConfig() else return
}

@@ -295,7 +311,7 @@ class TaskConfigEditDialog(
}

private fun editSubTaskConfig(entry: SubTaskConfigEntry, parentConfig: SubPlanTask.SubPlanTaskTypeConfig) {
val dialog = TaskConfigEditDialog(null, entry.taskType, entry.config, availableModels)
val dialog = TaskConfigDialog(null, entry.taskType, entry.config, availableModels)
if (dialog.showAndGet()) {
val updatedConfig = dialog.getConfig()
val newKey =
@@ -613,6 +629,18 @@ class TaskConfigEditDialog(
}
}
}
        // Validate PdfFormTask fields
        if (config is PdfFormTask.PdfFormTypeConfig) {
            val templateFile = (configFields["template_file"] as? JBTextField)?.text?.trim()
            if (templateFile.isNullOrEmpty()) {
                Messages.showWarningDialog(
                    "Template file path cannot be empty",
                    "Invalid Value"
                )
                configFields["template_file"]?.requestFocusInWindow()
                return false
            }
        }


// Validate numeric fields
@@ -747,6 +775,15 @@ class TaskConfigEditDialog(
generate_cover_image = (configFields["generate_cover_image"] as? JCheckBox)?.isSelected ?: true,
)
}
is PdfFormTask.PdfFormTypeConfig -> {
PdfFormTask.PdfFormTypeConfig(
template_file = (configFields["template_file"] as? JBTextField)?.text?.trim()
?.takeIf { it.isNotEmpty() },
task_type = baseConfig.task_type,
name = baseConfig.name
)
}


else -> baseConfig
}
Original file line number Diff line number Diff line change
@@ -3,7 +3,6 @@ package cognotik.actions.task
import cognotik.actions.BaseAction
import cognotik.actions.agent.toFile
import cognotik.actions.plan.PlanConfigDialog
import cognotik.actions.plan.toApiChatModel
import com.intellij.openapi.actionSystem.ActionUpdateThread
import com.intellij.openapi.actionSystem.AnActionEvent
import com.intellij.openapi.progress.ProgressIndicator
@@ -21,6 +20,7 @@ import com.simiacryptus.cognotik.config.AppSettingsState
import com.simiacryptus.cognotik.config.instance
import com.simiacryptus.cognotik.plan.AbstractTask.TaskState
import com.simiacryptus.cognotik.plan.OrchestrationConfig
import com.simiacryptus.cognotik.plan.tools.toApiChatModel
import com.simiacryptus.cognotik.plan.tools.writing.BusinessProposalTask
import com.simiacryptus.cognotik.platform.ApplicationServices
import com.simiacryptus.cognotik.platform.Session
Original file line number Diff line number Diff line change
@@ -3,7 +3,6 @@ package cognotik.actions.task
import cognotik.actions.BaseAction
import cognotik.actions.agent.toFile
import cognotik.actions.plan.PlanConfigDialog
import cognotik.actions.plan.toApiChatModel
import com.intellij.openapi.actionSystem.ActionUpdateThread
import com.intellij.openapi.actionSystem.AnActionEvent
import com.intellij.openapi.progress.ProgressIndicator
@@ -21,6 +20,7 @@ import com.simiacryptus.cognotik.config.instance
import com.simiacryptus.cognotik.plan.AbstractTask.TaskState
import com.simiacryptus.cognotik.plan.OrchestrationConfig
import com.simiacryptus.cognotik.plan.tools.data.DataIngestTask
import com.simiacryptus.cognotik.plan.tools.toApiChatModel
import com.simiacryptus.cognotik.platform.ApplicationServices
import com.simiacryptus.cognotik.platform.Session
import com.simiacryptus.cognotik.platform.file.DataStorage
Original file line number Diff line number Diff line change
@@ -3,7 +3,6 @@ package cognotik.actions.task
import cognotik.actions.BaseAction
import cognotik.actions.agent.toFile
import cognotik.actions.plan.PlanConfigDialog
import cognotik.actions.plan.toApiChatModel
import com.intellij.openapi.actionSystem.ActionUpdateThread
import com.intellij.openapi.actionSystem.AnActionEvent
import com.intellij.openapi.progress.ProgressIndicator
@@ -24,6 +23,7 @@ import com.simiacryptus.cognotik.plan.OrchestrationConfig
import com.simiacryptus.cognotik.plan.TaskTypeConfig
import com.simiacryptus.cognotik.plan.tools.file.FileModificationTask
import com.simiacryptus.cognotik.plan.tools.file.FileModificationTask.Companion.FileModification
import com.simiacryptus.cognotik.plan.tools.toApiChatModel
import com.simiacryptus.cognotik.platform.ApplicationServices
import com.simiacryptus.cognotik.platform.Session
import com.simiacryptus.cognotik.platform.file.DataStorage