Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Run exec agent #11

Open
wants to merge 19 commits into
base: main
Choose a base branch
from
Open

fix: Run exec agent #11

wants to merge 19 commits into from

Conversation

zacharyblasczyk
Copy link
Member

@zacharyblasczyk zacharyblasczyk commented Mar 22, 2025

Summary by CodeRabbit

  • New Features

    • Improved job execution commands now require additional parameters for enhanced validation and OS-specific behavior.
    • Job management has been enhanced with detailed tracking, robust cancellation, and graceful shutdown support.
    • A new JobWithDetails structure allows for more comprehensive job information handling.
  • Documentation

    • Command usage examples have been updated for clearer guidance.
  • Chores

    • Added new targets in the Makefile for code formatting and linting.

Copy link

coderabbitai bot commented Mar 22, 2025

Walkthrough

This pull request introduces several enhancements and refactorings for job execution and management. The command for executing jobs now requires additional flags such as job agent name and workspace, with conditional logic based on the operating system. The runner implementations have been updated to use context, support graceful shutdown via signal handling, and manage job status updates using new helper functions. Furthermore, a new file defines a JobWithDetails structure along with its associated methods, while minor formatting changes have been applied to auxiliary files and the Makefile.

Changes

File(s) Change Summary
cmd/ctrlc/.../exec/exec.go Updated NewRunExecCmd to introduce a new name flag, enforce required flags, conditionally assign jobAgentType based on OS, and add a shutdown handler.
cmd/ctrlc/.../exec/runner.go, pkg/.../runner.go Revised runner implementations: updated Start method signature to use context and detailed job info, removed the Status method, and added a RunningJob struct, ExitAll method, and helper functions (killProcess, updateJobStatus) with signal-based shutdown handling.
Makefile Added a new fmt target running gofmt to format code and consolidated .PHONY targets (build, install, test, clean, fmt).
cmd/ctrlc/.../sync/clickhouse/clickhouse.go Adjusted struct field spacing for improved readability without altering functionality.
cmd/ctrlc/.../sync/sync.go, cmd/ctrlc/.../version/version.go Added a trailing newline; no functional changes.
internal/api/job.go Introduced the JobWithDetails struct with methods for fetching job details, templating scripts, updating job status, and setting an external ID.

Sequence Diagram(s)

sequenceDiagram
    participant U as User
    participant C as RunExecCmd
    participant R as ExecRunner
    participant JA as Job Agent
    participant OS as OS Signal Handler

    U->>C: Execute command with required flags (--name, --workspace)
    C->>R: Call Start(ctx, jobWithDetails)
    R->>JA: Initialize job (create temp script, start execution)
    JA-->>R: Return job execution result/status
    OS-->>R: Send termination signal
    R->>JA: Invoke ExitAll() to terminate job processes
    R->>C: Return updated job status
Loading

Possibly related PRs

Suggested reviewers

  • jsbroks

Poem

In a meadow of code I hop with glee,
Adding flags and runners so spry and free.
With signals graceful as a twilight breeze,
I nibble bugs with effortless ease.
Each change, a carrot in my codeyard fine,
Hop along with joy—this pull is divine!
🐇✨

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (12)
pkg/jobagent/runner.go (1)

32-35: Add optional nil-check or usage note for runner field.
The new runner reference looks correct and meets the JobAgent needs. However, consider adding a precondition check or a usage note to ensure the caller doesn't pass a nil runner, avoiding null pointer scenarios during execution.

 func NewJobAgent(
   client *api.ClientWithResponses,
   config api.UpsertJobAgentJSONRequestBody,
   runner Runner,
 ) (*JobAgent, error) {
     if runner == nil {
+        return nil, fmt.Errorf("runner cannot be nil")
     }
     ...
 }
cmd/ctrlc/root/run/exec/exec.go (2)

14-19: Use strong typing checks for JobAgentType.
Defining a custom type and constants is a good approach. However, consider introducing a validation method on JobAgentType to avoid repeating validity checks.

 type JobAgentType string

 const (
     JobAgentTypeLinux   JobAgentType = "exec-linux"
     JobAgentTypeWindows JobAgentType = "exec-windows"
 )

+func (t JobAgentType) IsValid() bool {
+    return t == JobAgentTypeLinux || t == JobAgentTypeWindows
+}

22-23: Consider using JobAgentType instead of string.
Storing jobAgentType in a variable of type JobAgentType can prevent mismatched string assignments and unify usage with your constants.

-var jobAgentType string
+var jobAgentType JobAgentType
cmd/ctrlc/root/run/exec/runner.go (9)

29-38: Resource type duplication risk.
This Resource struct appears to overlap with internal/api/client.gen.go. Consider referencing or embedding the generated struct to avoid maintenance overhead.


40-45: Release struct duplication risk.
This follows the same logic as the Resource struct. Consider referencing the existing release type in your API client code if possible.


47-50: Environment struct duplication risk.
Same duplication concern. Consolidating environment data structures can simplify updates and reduce code drift.


52-58: Approval struct usage.
If you already have an approval-related struct in the API client, unify them to reduce parallel definitions.


59-64: Deployment struct duplication risk.
Parallel definitions can cause drift. Reuse the generated Deployment struct if possible.


66-70: Runbook struct duplication risk.
Reusing the runbook definition from the client code can help ensure consistency across the codebase.


72-82: JobData struct partially overlaps the existing Job definition.
You may consider referencing or embedding relevant fields from the existing generated model to avoid potential divergence.


131-137: Creating API client on each run.
Re-creating a new API client for every job might lead to performance overhead. Consider reusing a shared client instance to optimize resource usage.


158-158: Validate script content to prevent injection.
While templating user-provided data is flexible, consider sanitizing or validating inputs if the jobDetails could contain malicious data.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a3586da and 8223bd0.

📒 Files selected for processing (3)
  • cmd/ctrlc/root/run/exec/exec.go (2 hunks)
  • cmd/ctrlc/root/run/exec/runner.go (4 hunks)
  • pkg/jobagent/runner.go (1 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
cmd/ctrlc/root/run/exec/runner.go (1)
internal/api/client.gen.go (23) (23)
  • Resource (244-255)
  • Release (232-241)
  • Environment (112-125)
  • Deployment (99-109)
  • Runbook (258-263)
  • r (3511-3516)
  • r (3519-3524)
  • r (3540-3545)
  • r (3548-3553)
  • r (3568-3573)
  • r (3576-3581)
  • r (3593-3598)
  • r (3601-3606)
  • r (3621-3626)
  • r (3629-3634)
  • r (3654-3659)
  • r (3662-3667)
  • r (3683-3688)
  • r (3691-3696)
  • r (3704-3709)
  • r (3712-3717)
  • Job (128-144)
  • JobStatus (147-147)
🔇 Additional comments (11)
cmd/ctrlc/root/run/exec/exec.go (7)

6-6: No concerns regarding heredoc import.
This import is commonly used for multi-line string literals and looks correct.


25-25: Cobra command declaration looks good.
No major issues: well-structured command and helpful usage.


35-35: Workspace ID retrieval.
Correctly fetching workspaceId from viper. No concerns raised here.


40-45: Mandatory flag checks are handled correctly.
Returning errors when name or workspaceId is empty aligns with best practices for user-friendly CLI.


54-54: Job agent creation call.
Instantiating the new job agent with proper arguments. Looks consistent with the rest of the logic.


57-59: Populating request body fields.
Good usage of Name, Type, and WorkspaceId. The code matches the new CLI flags.


76-79: Flags for --name and --type are properly declared.
Declaring a required --name is helpful. The default for --type is exec-linux, consistent with one of the constants.

cmd/ctrlc/root/run/exec/runner.go (4)

5-5: No issues with the context import.


17-17: No issues with the viper import.


85-87: Nil check for ExternalId.
This check addresses a common edge case for external run correlators. Looks good.


139-146: Handling empty response from job details API.
The error message and checks are correct. Good job preventing nil dereferences.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
cmd/ctrlc/root/run/exec/runner.go (2)

84-88: Handle non-200 status codes.
This code retrieves job details with GetJobWithResponse. If the server returns a status other than 200, it may still be worthwhile to handle specific cases (e.g., 404 job not found). Consider adding explicit handling if needed.


89-91: Check for partial or error responses.
Simply returning an error if resp.JSON200 is nil is correct, but also consider whether the API might return other structured error responses (e.g., JSON404) that could provide more informative messaging.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8223bd0 and 01bc449.

📒 Files selected for processing (1)
  • cmd/ctrlc/root/run/exec/runner.go (4 hunks)
🔇 Additional comments (5)
cmd/ctrlc/root/run/exec/runner.go (5)

5-5: Appropriate use of context.
This import is helpful for passing cancellation and timeout information to external calls, improving reliability.


17-17: Integration of viper for configuration.
Using viper to handle configuration values (url and api-key) is a convenient approach. Ensure defaults or error handling is in place for missing config keys where needed.


30-32: Validate external ID usage.
Returning JobStatusExternalRunNotFound when ExternalId is nil ensures a clear error status. Verify if this case also covers newly created jobs that haven't been assigned an external ID yet.

Would you like to confirm that newly created jobs have ExternalId assigned before a status check?


76-83: Confirm url and api-key availability.
Creating an API client via viper.GetString("url") and viper.GetString("api-key") is valid, but verify that these configurations are always set to avoid runtime errors if empty.

Please run a check in your environment to ensure url and api-key are properly configured in all relevant execution contexts.


93-96: Uncontrolled json.Unmarshal without error handling.
The error return from json.Unmarshal is swallowed, risking silent failures. Strongly recommend capturing and handling the error.

Below is a possible fix:

 var jobDetails map[string]interface{}
 detailsBytes, _ := json.Marshal(resp.JSON200)
-if err := json.Unmarshal(detailsBytes, &jobDetails); err != nil {
-    // error is discarded
+err = json.Unmarshal(detailsBytes, &jobDetails)
+if err != nil {
+    return "", fmt.Errorf("failed to parse job details: %w", err)
 }

templatedScript, err := template.New("script").Parse(config.Script)
if err != nil {
return "", fmt.Errorf("failed to parse script template: %w", err)
}

buf := new(bytes.Buffer)
if err := templatedScript.Execute(buf, job); err != nil {
if err := templatedScript.Execute(buf, jobDetails); err != nil {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Avoid potential code injection in the script.
Passing jobDetails directly into the template and executing as a script can result in code injection if jobDetails includes malicious input. Consider sanitizing or validating fields in jobDetails and config.Script to reduce security risks.

@@ -25,6 +27,10 @@ type ExecConfig struct {
}

func (r *ExecRunner) Status(job api.Job) (api.JobStatus, string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the pid is missing is that considered sucessful? what if the pid is now a different process?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not clear how to keep track of these processes, maybe we assum they are not deamon so if the command closes so does the children. This means we won't need this status check and instead create a goroutine that updates the job when it exists

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
cmd/ctrlc/root/run/exec/runner.go (1)

80-80: ⚠️ Potential issue

Potential security risk: Untrusted data in templates.

Passing jobDetails directly into template execution could lead to template injection if jobDetails contains malicious input. Consider sanitizing or validating fields before using them in the template.

-if err := templatedScript.Execute(buf, jobDetails); err != nil {
+// Create a sanitized data context with only the required fields
+sanitizedDetails := map[string]interface{}{}
+// Copy over only the expected fields or validate inputs
+for k, v := range jobDetails {
+    // Sanitize or validate each field as needed
+    sanitizedDetails[k] = v
+}
+if err := templatedScript.Execute(buf, sanitizedDetails); err != nil {
🧹 Nitpick comments (2)
pkg/jobagent/job_details.go (1)

12-35: Well-structured job details fetching function with comprehensive error handling.

This new function cleanly retrieves job details with proper error handling at each step. The use of descriptive error messages will help with debugging.

There's one improvement you could make to the error handling in the API response check:

if resp.JSON200 == nil {
-    return nil, fmt.Errorf("received empty response from job details API")
+    return nil, fmt.Errorf("received empty response from job details API (status code: %d)", resp.StatusCode())
}
pkg/jobagent/runner.go (1)

137-144: Missing error handling for UpdateJobWithResponse.

The error return value from a.client.UpdateJobWithResponse is checked, but there's no recovery mechanism if the update fails beyond logging the error.

Consider implementing a retry mechanism or a way to handle these failed updates, as they could lead to job status inconsistencies:

-, err := a.client.UpdateJobWithResponse(
+updateResp, err := a.client.UpdateJobWithResponse(
    context.Background(),
    job.Id.String(),
    body,
)
if err != nil {
    log.Error("Failed to update job", "error", err, "jobId", job.Id.String())
+    // Consider implementing a retry mechanism or queuing failed updates
+    // for later processing to ensure job status consistency
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 01bc449 and e1a6568.

📒 Files selected for processing (4)
  • cmd/ctrlc/root/run/exec/exec.go (3 hunks)
  • cmd/ctrlc/root/run/exec/runner.go (3 hunks)
  • pkg/jobagent/job_details.go (1 hunks)
  • pkg/jobagent/runner.go (3 hunks)
🧰 Additional context used
🪛 golangci-lint (1.64.8)
pkg/jobagent/runner.go

76-76: Error return value of a.client.UpdateJobWithResponse is not checked

(errcheck)

🔇 Additional comments (10)
cmd/ctrlc/root/run/exec/runner.go (2)

28-30: Great addition of nil check for job.ExternalId.

This defensive check prevents potential nil pointer dereferences when trying to convert the external ID to an integer, improving error handling and providing a clear error message.


52-52: Method signature updated to accept job details.

The Start method now accepts additional job details which will be used for template rendering. This change aligns with the interface updates in the jobagent package.

cmd/ctrlc/root/run/exec/exec.go (5)

16-20: Good improvement: Automatic job agent type detection based on OS.

Determining the job agent type based on the operating system simplifies usage and prevents errors. This addresses previous review comments about automatically determining the type.


25-27: Example usage updated to reflect new requirements.

The example now correctly shows the required --name parameter and no longer includes the deprecated --type parameter.


36-41: Proper parameter validation added.

Adding validation for both name and workspace parameters ensures the command won't execute with missing required information, which is a good defensive practice.


46-48: Job agent configuration updated to use dynamic type and user-provided name.

The configuration now correctly uses the automatically determined job agent type and the user-provided name, making the interface more consistent and user-friendly.


65-66: Required name flag properly implemented.

The name flag is correctly defined and marked as required, ensuring the user provides this critical information.

pkg/jobagent/runner.go (3)

13-13: Runner interface updated to accept job details.

The Start method's signature has been consistently updated to include job details across the codebase.


69-73: Job details fetching added with appropriate error handling.

The job details are now fetched before starting a job, with proper error handling that logs issues but allows the process to continue with other jobs.


76-76: Updated Runner.Start call with job details.

The function call has been updated to match the new interface, passing both the job and job details.

🧰 Tools
🪛 golangci-lint (1.64.8)

76-76: Error return value of a.client.UpdateJobWithResponse is not checked

(errcheck)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
cmd/ctrlc/root/run/exec/runner.go (1)

100-100: ⚠️ Potential issue

Security risk with unchecked template execution.

Passing jobDetails directly into the template execution without validation could lead to code injection if jobDetails contains malicious input.

Consider validating or sanitizing the jobDetails before using it in template execution:

-	if err := templatedScript.Execute(buf, jobDetails); err != nil {
+	// Create a sanitized/validated copy of jobDetails
+	safeJobDetails := sanitizeJobDetails(jobDetails)
+	if err := templatedScript.Execute(buf, safeJobDetails); err != nil {

You would need to implement the sanitizeJobDetails function to remove or escape potentially dangerous inputs.

🧹 Nitpick comments (2)
cmd/ctrlc/root/run/exec/runner.go (2)

138-156: Async process management with proper cleanup.

The goroutine for tracking process completion is a good approach, but consider the following improvements:

  1. The defensive nil check in lines 143-145 suggests ExecRunner might be used without proper initialization. Since you've now added NewExecRunner(), consider making ExecRunner's fields unexported and enforcing initialization through the constructor.

  2. Consider using a context for better goroutine management, especially for cancelation.

Here's a suggested improvement:

-go func(pid int, scriptPath string) {
+ctx, cancel := context.WithCancel(context.Background())
+go func(ctx context.Context, pid int, scriptPath string) {
+	defer cancel()
+	defer os.Remove(scriptPath) // Ensure cleanup happens in all cases
+	
 	err := cmd.Wait()
 	// Ensure the map is not nil; if there's any chance ExecRunner is used as a zero-value, initialize it.
 	r.mu.Lock()
-	if r.finished == nil {
-		r.finished = make(map[int]error)
-	}
 	r.finished[pid] = err
 	r.mu.Unlock()
 	
 	if err != nil {
 		log.Error("Process execution failed", "pid", pid, "error", err)
 	} else {
 		log.Info("Process execution succeeded", "pid", pid)
 	}
-
-	os.Remove(scriptPath)
-}(pid, tmpFile.Name())
+}(ctx, pid, tmpFile.Name())

This approach uses a deferred cleanup to ensure the file is always removed, and provides a context that could be used for cancellation if needed.


143-145: Unnecessary defensive programming.

Since you've added the NewExecRunner() constructor, these nil checks are redundant and suggest improper initialization. Consider removing them and ensuring all ExecRunner instances are created with the constructor.

-	if r.finished == nil {
-		r.finished = make(map[int]error)
-	}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e1a6568 and 2d98dd4.

📒 Files selected for processing (2)
  • cmd/ctrlc/root/run/exec/exec.go (3 hunks)
  • cmd/ctrlc/root/run/exec/runner.go (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/ctrlc/root/run/exec/exec.go
🔇 Additional comments (6)
cmd/ctrlc/root/run/exec/runner.go (6)

22-25: Good addition of mutex for thread safety.

The addition of a mutex and a map to track completed processes is a good improvement for thread safety. This change properly addresses the need to track process completion status.


27-31: LGTM: Proper constructor initialization.

The constructor pattern ensures the finished map is properly initialized, preventing nil map panics later.


39-41: Defensive programming for null value handling.

Good defensive check for nil ExternalId. This addresses the previous comment about tracking processes and handles edge cases properly.


48-57: Thread-safe implementation for process status tracking.

The mutex-protected map lookup is correctly implemented. Using a map to track process completion status is more reliable than trying to check external processes directly.


59-70: Improved process status verification.

The fallback logic to check if a process is still running using signals is a good addition. The comments help explain the purpose of different code sections.


131-134: Proper cleanup on process start failure.

The code now properly removes the temporary script file if the process fails to start. This prevents resource leaks.

@zacharyblasczyk
Copy link
Member Author

@coderabbitai full review

Copy link

coderabbitai bot commented Mar 22, 2025

✅ Actions performed

Full review triggered.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
cmd/ctrlc/root/run/exec/exec.go (2)

16-20: Validate user overrides for job agent type (optional).
Currently, the code always auto-determines jobAgentType based on the OS without offering a user override. This is fine if you intend to enforce OS-based defaults. If user overrides are desired in the future, consider adding a --type or similar flag.


36-41: Validate “name” and “workspace” more thoroughly.
The simple checks for empty values are acceptable. If you want to disallow spaces or other disallowed characters, consider trimming or validating for disallowed values.

pkg/jobagent/job_details.go (1)

12-35: Minor optimization: Avoid double marshal/unmarshal.
Marshalling resp.JSON200 to bytes and then unmarshalling those bytes into details is correct, but you can directly unmarshal resp.Body (if accessible) into details, saving a step. Otherwise, the existing approach works fine.

cmd/ctrlc/root/run/exec/runner.go (1)

101-101: Risk of unsanitized templating.
Passing jobDetails directly into the script template can be risky if jobDetails originates from untrusted sources. Validate or sanitize the template variables if security is a concern.

Do you want to explore input-validation or sanitization options?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a3586da and 2d395fb.

📒 Files selected for processing (4)
  • cmd/ctrlc/root/run/exec/exec.go (3 hunks)
  • cmd/ctrlc/root/run/exec/runner.go (3 hunks)
  • pkg/jobagent/job_details.go (1 hunks)
  • pkg/jobagent/runner.go (3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
cmd/ctrlc/root/run/exec/runner.go (2)
pkg/jobagent/runner.go (1) (1)
  • Runner (12-15)
internal/api/client.gen.go (38) (38)
  • r (3511-3516)
  • r (3519-3524)
  • r (3540-3545)
  • r (3548-3553)
  • r (3568-3573)
  • r (3576-3581)
  • r (3593-3598)
  • r (3601-3606)
  • r (3621-3626)
  • r (3629-3634)
  • r (3654-3659)
  • r (3662-3667)
  • r (3683-3688)
  • r (3691-3696)
  • r (3704-3709)
  • r (3712-3717)
  • Job (128-144)
  • JobStatus (147-147)
  • JobStatusExternalRunNotFound (29-29)
  • err (686-686)
  • err (1738-1738)
  • err (1783-1783)
  • err (1812-1812)
  • err (1846-1846)
  • err (1891-1891)
  • err (1927-1927)
  • err (1979-1979)
  • err (2008-2008)
  • err (2042-2042)
  • err (2087-2087)
  • err (2116-2116)
  • err (2150-2150)
  • err (2184-2184)
  • err (2218-2218)
  • err (2263-2263)
  • JobStatusFailure (30-30)
  • JobStatusSuccessful (36-36)
  • JobStatusInProgress (31-31)
🔇 Additional comments (3)
cmd/ctrlc/root/run/exec/exec.go (1)

65-66: Appreciation: Marking the flag as required ensures clarity.
Using cmd.MarkFlagRequired("name") helps the user detect incorrect usage early.

pkg/jobagent/runner.go (1)

69-72: Skipping jobs that lack details.
Currently, if fetchJobDetails fails, you skip the job. This can be intentional. Just confirm that ignoring the job (and not marking its status) is the correct behavior for your workflow. If partial execution is allowed, consider an alternative fallback.

cmd/ctrlc/root/run/exec/runner.go (1)

132-157: Good use of asynchronous process handling and cleanup.
Storing process completion in r.finished is thread-safe thanks to the mutex, and removing the temp file in a deferred manner ensures we clean up on all paths.

@zacharyblasczyk
Copy link
Member Author

@coderabbitai summary

Copy link

coderabbitai bot commented Mar 22, 2025

✅ Actions performed

Summary regeneration triggered.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
cmd/ctrlc/root/run/exec/runner.go (2)

38-51: Suggestion to unify process context with runner context.
Currently, each process has its own background context, which is independent of the runner’s r.ctx. If you want to terminate or clean up processes when Shutdown() is called, consider deriving the process context from r.ctx instead of context.Background(). This helps gracefully cancel long-running processes.


53-58: Graceful shutdown only waits for housekeeping.
The Shutdown() method cancels r.ctx and waits on the housekeeping goroutine, but any running commands spawned by cmd.Start() are not signaled to exit. Depending on your requirements, you might want to explicitly stop or interrupt processes.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d395fb and f01a216.

📒 Files selected for processing (2)
  • cmd/ctrlc/root/run/exec/exec.go (3 hunks)
  • cmd/ctrlc/root/run/exec/runner.go (3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
cmd/ctrlc/root/run/exec/exec.go (2)
cmd/ctrlc/root/run/exec/runner.go (2) (2)
  • cmd (175-175)
  • NewExecRunner (38-51)
pkg/jobagent/runner.go (1) (1)
  • NewJobAgent (17-39)
🔇 Additional comments (13)
cmd/ctrlc/root/run/exec/exec.go (6)

16-20: Automatically selecting the job agent type for Windows vs. non-Windows OS is a neat approach.
No issues found with this logic; it simplifies usage by removing the need for an explicit “type” flag.


25-27: Example usage is properly aligned with the new flags.
Good job ensuring the example demonstrates the required --name and --workspace usage.


36-41: Excellent validation of required fields.
Rejecting empty name or workspace early ensures consistent usage and helps avoid potential runtime errors.


43-44: Graceful runner shutdown is a solid design choice.
Deferring runner.Shutdown() prevents resource leaks and ensures housekeeping routines are properly concluded.


49-51: Dynamic assignment of job agent config helps maintain DRY principles.
Using the previously defined variables (name, jobAgentType, workspaceId) for UpsertJobAgentJSONRequestBody is clear and maintainable.


68-69: Good practice to mark critical flags as required.
Marking the --name flag as mandatory underscores its importance for identifying the agent.

cmd/ctrlc/root/run/exec/runner.go (7)

23-27: Struct definition is straightforward and readable.
Using a dedicated ProcessInfo struct clarifies the fields and encapsulates process-related data.


28-36: Good concurrency-safe design.
Including a mutex and a map for tracking processes is a clean, thread-safe approach.


60-75: Effective periodic cleanup routine.
The hourly ticker and final ctx.Done() check are straightforward, ensuring old processes are eventually removed.


77-93: Retention-based process cleanup is clear and maintainable.
This logic is simple and prevents the processes map from growing indefinitely.


157-159: Security concern: Potential code injection via templated script.
Passing jobDetails directly to an html/template for shell execution can be risky if user input is not trusted or sanitized.


186-188: Proper error handling for command start.
Removing the script file on error and returning a wrapped error message is a good approach to maintain cleanliness and clarity.


191-220: Clear asynchronous execution flow with process tracking.
The combination of generating a unique ID, storing the process info, and launching a goroutine is a clean solution. Just note that processes will keep running even if the runner context is canceled, given separate contexts are used.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
cmd/ctrlc/root/run/exec/runner.go (1)

156-156: ⚠️ Potential issue

Security risk: Potential code injection in template execution.

Passing jobDetails directly into the template without sanitization could lead to code injection if jobDetails contains malicious content.

Consider sanitizing or validating fields in jobDetails before passing them to the template, or use a more restricted templating approach that prevents code execution.

🧹 Nitpick comments (3)
cmd/ctrlc/root/run/exec/runner.go (3)

59-74: Consider more frequent housekeeping for resource efficiency.

The housekeeping goroutine runs only once per hour, which might lead to unnecessary resource retention.

Consider decreasing the ticker interval to a more frequent time period (e.g., 5-15 minutes) to release resources more promptly:

- ticker := time.NewTicker(1 * time.Hour)
+ ticker := time.NewTicker(5 * time.Minute)

204-205: Consider utilizing parent context for child goroutines.

The code creates a new context that isn't connected to the runner's main context.

Consider deriving the goroutine's context from the runner's context to ensure proper cancellation during shutdown:

- ctx, cancel := context.WithCancel(context.Background())
+ ctx, cancel := context.WithCancel(r.ctx)

197-198: Consider more reliable unique identifiers.

Using the pointer address as a handle works but has limitations if used in distributed systems or across process restarts.

Consider using a more reliable unique identifier like UUID:

- handle := fmt.Sprintf("%p", procInfo)
+ handle := uuid.New().String()

This would require adding "github.com/google/uuid" to your imports.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f01a216 and eae19f1.

📒 Files selected for processing (1)
  • cmd/ctrlc/root/run/exec/runner.go (3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
cmd/ctrlc/root/run/exec/runner.go (1)
pkg/jobagent/runner.go (1) (1)
  • Runner (12-15)
🔇 Additional comments (10)
cmd/ctrlc/root/run/exec/runner.go (10)

22-26: Good implementation of process tracking with ProcessInfo struct.

The new ProcessInfo struct provides a clean way to encapsulate process state information, making it easier to track process lifecycle and status.


27-35: Well-designed concurrency management in ExecRunner.

The updated ExecRunner struct now includes proper synchronization primitives (mutex, WaitGroup) and context management for graceful shutdown. This is a significant improvement for handling concurrent process execution and cleanup.


37-50: Good initialization pattern with housekeeping goroutine.

The NewExecRunner function properly initializes the runner with context and starts a background housekeeping goroutine. This approach ensures proper resource cleanup even for long-running processes.


52-57: Graceful shutdown implementation is correct.

The Shutdown method properly cancels the context and waits for the housekeeping goroutine to complete, ensuring clean termination without resource leaks.


76-92: Good implementation of process cleanup.

The cleanupOldProcesses method correctly uses mutex locking to safely remove old completed processes. The 24-hour retention period seems reasonable for preserving process history.


99-105: Improved error handling for nil ExternalId.

The updated Status method now correctly checks for nil ExternalId before dereferencing, preventing potential null pointer exceptions.


106-125: Well-implemented process status checking.

The status checking logic has been improved to use the process map and properly determine job status based on process state.


127-129: Updated Start method signature correctly implements interface.

The Start method now accepts a jobDetails parameter as required by the updated Runner interface in jobagent package.


185-188: Improved process execution with asynchronous handling.

Using cmd.Start() instead of cmd.Run() allows for non-blocking execution, which is a better approach for long-running processes.


190-224: Well-implemented asynchronous process management.

The code now correctly:

  1. Creates and stores process information
  2. Uses a unique handle to track processes
  3. Executes the command asynchronously
  4. Updates process status on completion
  5. Cleans up temporary files
  6. Logs process outcomes

This significantly improves the robustness of the execution system.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
pkg/jobagent/runner.go (3)

13-13: Consider using a structured type instead of map[string]interface{}

Using map[string]interface{} for jobDetails provides flexibility but sacrifices type safety. This can lead to runtime errors if expected keys are missing or values have unexpected types. Consider defining a dedicated struct with proper field types to represent job details.

- Start(job api.Job, jobDetails map[string]interface{}) (string, error)
+ Start(job api.Job, jobDetails JobDetails) (string, error)

Where JobDetails would be a well-defined struct with appropriate fields.


1-203: Consider implementing a periodic cleanup mechanism

While the code now cleans up jobs that are no longer in progress, there's still a risk of orphaned entries in the handleMap if jobs complete but are not included in the API response. Consider implementing a periodic cleanup mechanism or a timeout-based approach to remove stale entries.

You could add a method like:

func (a *JobAgent) cleanupStaleJobs() {
    a.mu.Lock()
    defer a.mu.Unlock()
    
    // Get all current job IDs from API
    currentJobs, err := a.client.GetAllJobsWithResponse(context.Background(), a.id)
    if err != nil {
        log.Error("Failed to get all jobs for cleanup", "error", err)
        return
    }
    
    // Create a map of current job IDs for O(1) lookup
    currentJobMap := make(map[string]bool)
    for _, job := range *currentJobs.JSON200 {
        if job.Status == api.JobStatusInProgress {
            currentJobMap[job.Id.String()] = true
        }
    }
    
    // Remove entries from handleMap that don't exist in current jobs
    for jobId := range a.handleMap {
        if !currentJobMap[jobId] {
            delete(a.handleMap, jobId)
            log.Debug("Removed stale job from handleMap", "jobId", jobId)
        }
    }
}

And call this method periodically or as part of your job management cycle.

🧰 Tools
🪛 golangci-lint (1.64.8)

106-106: Error return value of a.client.UpdateJobWithResponse is not checked

(errcheck)


77-81:

❓ Verification inconclusive

Verify fetchJobDetails implementation and consider retries

The code skips a job if fetchJobDetails fails, which could lead to jobs never being processed if fetching details consistently fails. Consider implementing a retry mechanism or a fallback strategy.

jobDetails, err := fetchJobDetails(context.Background(), job.Id.String())
if err != nil {
-	log.Error("Failed to fetch job details", "error", err, "jobId", job.Id.String())
-	continue
+	log.Error("Failed to fetch job details, continuing with empty details", "error", err, "jobId", job.Id.String())
+	jobDetails = make(map[string]interface{})
}

🏁 Script executed:

#!/bin/bash
# Check implementation of fetchJobDetails function
grep -r "func fetchJobDetails" --include="*.go" .

Length of output: 171


Action: Revisit error-handling for job details retrieval

  • We verified that fetchJobDetails is implemented in pkg/jobagent/job_details.go and correctly returns a map[string]interface{}, error.
  • The updated diff now logs the error and assigns an empty map to jobDetails instead of skipping the job. Please ensure that processing with empty details is acceptable for downstream job handling.
  • If the job details are critical to execution, consider implementing a retry mechanism or other fallback strategy to better handle persistent errors.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eae19f1 and df80fe2.

📒 Files selected for processing (2)
  • cmd/ctrlc/root/run/exec/runner.go (3 hunks)
  • pkg/jobagent/runner.go (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/ctrlc/root/run/exec/runner.go
🔇 Additional comments (9)
pkg/jobagent/runner.go (9)

31-37: Initialization looks good

The initialization of the JobAgent struct with the new fields is correctly implemented. The handleMap is properly initialized as an empty map.


48-50: Proper synchronization for concurrent access

The addition of the handleMap with a corresponding mutex for synchronization is a good pattern for thread-safe access to the map.


60-64: Good logging practices

The addition of debug logging with the handleMap size is helpful for monitoring and debugging. The mutex is properly locked and unlocked when accessing the map.


85-93: Good job management with mutex protection

The check for already running jobs with proper mutex locking is well implemented. This prevents duplicate processing of the same job.


117-124: Consistent status update

The code correctly sets the job status to InProgress when updating with the external ID.


131-135: Good debugging practice

Logging the handleMap size at the end of the method provides valuable debugging information.


145-149: Good debugging practice

Logging the handleMap size at the start of UpdateRunningJobs is helpful for monitoring and debugging.


183-191: Good cleanup strategy

Removing completed jobs from the handleMap is a good practice to prevent memory leaks. The mutex is correctly used for thread safety.


196-200: Good debugging practice

Logging the handleMap size at the end of UpdateRunningJobs provides useful metrics for monitoring.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
cmd/ctrlc/root/run/exec/runner.go (1)

26-107: Consider adding context for timeout handling.

The current implementation doesn't handle timeouts for long-running processes. This could potentially lead to orphaned processes if the parent application terminates.

Consider adding context support to allow for timeout management:

-func (r *ExecRunner) Start(job api.Job, jobDetails map[string]interface{}, statusUpdateFunc func(jobID string, status api.JobStatus, message string)) (string, api.JobStatus, error) {
+func (r *ExecRunner) Start(ctx context.Context, job api.Job, jobDetails map[string]interface{}, statusUpdateFunc func(jobID string, status api.JobStatus, message string)) (string, api.JobStatus, error) {
   // ... existing code ...

   go func(handle, scriptPath string) {
     defer os.Remove(scriptPath)

+    // Create a done channel for signaling completion
+    done := make(chan error, 1)
+    go func() {
+      done <- cmd.Wait()
+    }()
+
+    // Wait for either completion or context cancellation
+    select {
+    case err := <-done:
+      if err != nil {
+        log.Error("Process execution failed", "handle", handle, "error", err)
+        statusUpdateFunc(job.Id.String(), api.JobStatusFailure, err.Error())
+      } else {
+        log.Info("Process execution succeeded", "handle", handle)
+        statusUpdateFunc(job.Id.String(), api.JobStatusSuccessful, "")
+      }
+    case <-ctx.Done():
+      // Attempt to gracefully terminate the process
+      if cmd.Process != nil {
+        cmd.Process.Signal(os.Interrupt)
+        // Give it a chance to clean up
+        select {
+        case <-done:
+          // Process exited after signal
+        case <-time.After(5 * time.Second):
+          // Force kill if still running after timeout
+          cmd.Process.Kill()
+        }
+      }
+      statusUpdateFunc(job.Id.String(), api.JobStatusFailure, "Process terminated due to timeout or cancellation")
+    }
-    err := cmd.Wait()
-
-    if err != nil {
-      log.Error("Process execution failed", "handle", handle, "error", err)
-      statusUpdateFunc(job.Id.String(), api.JobStatusFailure, err.Error())
-    } else {
-      log.Info("Process execution succeeded", "handle", handle)
-      statusUpdateFunc(job.Id.String(), api.JobStatusSuccessful, "")
-    }
   }(handle, tmpFile.Name())
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between df80fe2 and ab92d45.

📒 Files selected for processing (3)
  • cmd/ctrlc/root/run/exec/exec.go (3 hunks)
  • cmd/ctrlc/root/run/exec/runner.go (3 hunks)
  • pkg/jobagent/runner.go (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/jobagent/runner.go
🧰 Additional context used
🧬 Code Definitions (1)
cmd/ctrlc/root/run/exec/runner.go (1)
internal/api/client.gen.go (21) (21)
  • r (3511-3516)
  • r (3519-3524)
  • r (3540-3545)
  • r (3548-3553)
  • r (3568-3573)
  • r (3576-3581)
  • r (3593-3598)
  • r (3601-3606)
  • r (3621-3626)
  • r (3629-3634)
  • r (3654-3659)
  • r (3662-3667)
  • r (3683-3688)
  • r (3691-3696)
  • r (3704-3709)
  • r (3712-3717)
  • Job (128-144)
  • JobStatus (147-147)
  • JobStatusFailure (30-30)
  • JobStatusSuccessful (36-36)
  • JobStatusInProgress (31-31)
🔇 Additional comments (10)
cmd/ctrlc/root/run/exec/exec.go (5)

16-20: Automatic OS-based type detection is a good enhancement.

This change improves usability by automatically determining the appropriate job agent type based on the OS rather than requiring the user to specify it. This reduces the chances of misconfiguration.


25-27: Usage example improves clarity.

Adding an example in the command documentation helps users understand how to use the command correctly. The example now clearly shows the required parameters.


36-41: Proper parameter validation.

The code now checks for required parameters before proceeding with execution, providing clear error messages when they're missing.


45-47: Properly configured job agent with dynamic values.

The code now uses the provided name and dynamically determined job agent type instead of hard-coded values, making the command more flexible and user-friendly.


61-62: Required flag properly marked.

Making the name flag required ensures users provide this essential parameter, improving command usability and preventing errors.

cmd/ctrlc/root/run/exec/runner.go (5)

26-27: Improved method signature with better flow control.

The updated signature now includes jobDetails and a callback function for status updates, allowing for asynchronous job execution with proper status reporting. The additional return value for job status provides more information to callers.


54-55: Potential security risk with template execution.

Passing jobDetails directly into the template and executing as a script can result in code injection if jobDetails includes malicious input.

Consider sanitizing or validating fields in jobDetails and config.Script to reduce security risks.


83-86: Improved process handling with cleanup.

The code now properly starts the process asynchronously and includes cleanup of the temporary file in case of failure.


92-105: Asynchronous process management with proper status updates.

The implementation of a goroutine to monitor the process execution is a good approach for asynchronous job handling. It properly updates the job status through the callback function and cleans up temporary files when the process completes.


89-90:

✅ Verification successful

Pointer address used as process handle.

Using the pointer address as a handle works but could lead to issues if the pointer is garbage collected. Consider using a more robust identifier.


🏁 Script executed:

#!/bin/bash
# Check if there's any place in the codebase where the handle is used other than logging
rg -A 2 -B 2 "handle :=" --type go

Length of output: 366


Clarification on Process Handle Usage

Based on the search results, the pointer address (formatted via fmt.Sprintf("%p", cmd)) is only used within cmd/ctrlc/root/run/exec/runner.go for logging purposes. While using a pointer address as a handle might be risky if relied upon for long-term or critical process identification—due to potential garbage collection—it’s currently serving solely as a log identifier. If the role of this handle expands beyond logging in the future, consider switching to a more robust identifier.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (7)
Makefile (1)

21-23: Consider adding 'fmt' to the .PHONY list.
Currently, the .PHONY line does not include fmt, which could lead to conflicts if a file or directory named “fmt” existed.

-.PHONY: build install test clean
+.PHONY: build install test clean fmt
internal/api/job.go (1)

19-34: Consider passing context from the caller.
Currently, NewJobWithDetails uses context.Background() to fetch job details, which may mask caller cancellations or timeouts.

-func NewJobWithDetails(client *ClientWithResponses, job Job) (*JobWithDetails, error) {
+func NewJobWithDetails(ctx context.Context, client *ClientWithResponses, job Job) (*JobWithDetails, error) {
    j := &JobWithDetails{
        Job:    job,
        Client: client,
    }
    var err error
-   j.Details, err = j.GetJobDetails(context.Background())
+   j.Details, err = j.GetJobDetails(ctx)
    if err != nil {
        return nil, fmt.Errorf("failed to get job details: %w", err)
    }
    return j, nil
}
pkg/jobagent/runner.go (2)

63-66: Consider implementing the commented job deduplication logic

The TODO comment suggests a useful feature to prevent starting the same job twice, but it's commented out.

If job deduplication is important, consider implementing this logic or creating a follow-up issue to track this enhancement.


74-94: Consider limiting the number of concurrent jobs

The code creates a goroutine for each job without limiting concurrency, which could exhaust system resources with many jobs.

Consider implementing a worker pool or semaphore to limit concurrent job execution:

+ // Create a semaphore to limit concurrent jobs
+ var sem = make(chan struct{}, 10) // Adjust the limit as needed

  for _, apiJob := range *jobs.JSON200.Jobs {
    // ...job setup...
    
    wg.Add(1)
+   // Acquire semaphore
+   sem <- struct{}{}
    go func(job *api.JobWithDetails) {
      defer wg.Done()
+     defer func() { <-sem }() // Release semaphore
      
      // job execution...
    }(job)
  }
cmd/ctrlc/root/run/exec/runner.go (3)

95-103: Possible resource leak in error handling

The error handling properly removes the temporary file in failure cases, but there's inconsistent defer usage which could lead to resource leaks.

  tmpFile, err := os.CreateTemp("", "script*"+ext)
  if err != nil {
    return api.JobStatusFailure, fmt.Errorf("failed to create temp script file: %w", err)
  }
+ defer os.Remove(tmpFile.Name()) // Always clean up the file, regardless of success/failure path

  // Write the script to the temporary file
  if _, err := tmpFile.WriteString(script); err != nil {
-   os.Remove(tmpFile.Name())
    return api.JobStatusFailure, fmt.Errorf("failed to write script file: %w", err)
  }
  if err := tmpFile.Close(); err != nil {
-   os.Remove(tmpFile.Name())
    return api.JobStatusFailure, fmt.Errorf("failed to close script file: %w", err)
  }

113-119: Potential leak when using CommandContext

When using exec.CommandContext, make sure the context passed has an appropriate timeout or cancellation mechanism.

Consider deriving a timeout context from the parent context:

+ // Create a timeout context for the command execution
+ cmdCtx, cancel := context.WithTimeout(ctx, 24*time.Hour) // Adjust timeout as needed
+ defer cancel()
- cmd := exec.CommandContext(ctx, "bash", "-c", tmpFile.Name())
+ cmd := exec.CommandContext(cmdCtx, "bash", "-c", tmpFile.Name())

216-241: Improve process termination robustness

The killProcess function could be more robust with additional error checking and a more graceful termination sequence.

Consider enhancing with these improvements:

  1. Check if the process is still running before attempting to kill it
  2. Use a more structured approach to graceful termination:
  func killProcess(cmd *exec.Cmd, jobID string) {
+   // Check if process exists before attempting to terminate
+   if cmd.Process == nil {
+     log.Debug("Process already exited", "jobId", jobID)
+     return
+   }
+   
    if runtime.GOOS == "windows" {
      // Windows termination...
    } else {
      // Unix termination...
      
      go func() {
-       time.Sleep(2 * time.Second)
+       // Use a more structured approach with context cancellation
+       ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+       defer cancel()
+       
+       select {
+       case <-ctx.Done():
          if err := cmd.Process.Kill(); err != nil {
            log.Error("Failed to kill process", "error", err, "jobId", jobID)
          }
+       case <-waitForProcessExit(cmd):
+         log.Debug("Process exited gracefully", "jobId", jobID)
+       }
      }()
    }
  }
+
+ // Helper function to monitor process exit
+ func waitForProcessExit(cmd *exec.Cmd) <-chan struct{} {
+   ch := make(chan struct{})
+   go func() {
+     cmd.Process.Wait()
+     close(ch)
+   }()
+   return ch
+ }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ab92d45 and 5160555.

📒 Files selected for processing (8)
  • Makefile (1 hunks)
  • cmd/ctrlc/root/run/exec/exec.go (2 hunks)
  • cmd/ctrlc/root/run/exec/runner.go (1 hunks)
  • cmd/ctrlc/root/sync/clickhouse/clickhouse.go (4 hunks)
  • cmd/ctrlc/root/sync/sync.go (1 hunks)
  • cmd/ctrlc/root/version/version.go (1 hunks)
  • internal/api/job.go (1 hunks)
  • pkg/jobagent/runner.go (3 hunks)
✅ Files skipped from review due to trivial changes (3)
  • cmd/ctrlc/root/sync/sync.go
  • cmd/ctrlc/root/version/version.go
  • cmd/ctrlc/root/sync/clickhouse/clickhouse.go
🧰 Additional context used
🧬 Code Definitions (1)
pkg/jobagent/runner.go (2)
internal/api/job.go (2)
  • JobWithDetails (12-17)
  • NewJobWithDetails (20-34)
internal/api/client.gen.go (4)
  • JobStatus (147-147)
  • ClientWithResponses (3319-3321)
  • JobStatusFailure (30-30)
  • JobStatusInProgress (31-31)
🔇 Additional comments (23)
Makefile (1)

17-18: Looks good to have a dedicated format target.
These lines add a fmt target to invoke gofmt -w -s -e . on the entire codebase. This aligns well with common Go formatting practices.

cmd/ctrlc/root/run/exec/exec.go (11)

5-8: Imports for shutdown handling and heredoc look proper.
The added imports (os, os/signal, runtime, syscall, and heredoc) are used consistently and appropriately for managing signals and formatting example strings.

Also applies to: 10-10


19-22: Variables naming is clear and initialization is concise.
Defining both name and jobAgentType together with sensible defaults is tidy.


24-26: Runtime check for Windows is handled correctly.
Overriding jobAgentType with "exec-powershell" on Windows is consistent with the new design.


31-33: Example usage is consistent with the new approach.
No type parameter is used, which aligns with automatic OS-based selection.


37-37: Retrieving workspace from config.
Using viper.GetString("workspace") is straightforward.


42-47: Validation steps for required flags are appropriate.
The checks ensure name and workspace are present instead of relying solely on flagged errors.


49-54: Constructing jobAgentConfig is clean and readable.
Passing the necessary data (name, type, workspaceId) is consistent and well-organized.


57-58: Creating JobAgent with the runner is straightforward.
Leverages jobagent.NewJobAgent effectively without extraneous parameters.


63-72: Signal-based graceful shutdown is well-implemented.
The channel uses buffered capacity, ensuring no missed signals, and runner.ExitAll(true) is invoked to clean up.


76-76: Clear error propagation for queued jobs.
Returning an error with context (fmt.Errorf) aids in debugging.


83-85: Flag usage is unambiguous and ensures required inputs.
Marking name and workspace as required matches the earlier validation checks.

internal/api/job.go (6)

1-9: Imports and package declaration look correct.
All required standard libraries and dependencies are present for JSON handling, templating, and context usage.


11-17: Structure JobWithDetails is clearly organized.
Storing both base Job data and extra fields (Details, Client, and ExternalID) provides a convenient extension for advanced features.


36-55: Fetching and unmarshaling job details is properly handled.
Error checks for failed responses or empty data add resilience.


80-100: Status update logic and error handling are robust.
Your checks for both network/response errors and missing response payload are thorough.


102-105: ExternalID setter is concise and straightforward.
This helps keep external references decoupled from the main job logic.


57-78:

✅ Verification successful

Script templating approach works, but be mindful of potential injection risks.
If the template content can come from untrusted sources, consider how placeholders might be sanitized.

Please confirm whether you anticipate arbitrary user input for the script field:


🏁 Script executed:

#!/bin/bash
# Searching for code references that store or retrieve the script from external inputs.
rg -A 5 'script'

Length of output: 18640


Script Templating Security: Confirmed No Arbitrary Input Source
The templating mechanism extracts the script solely from JobAgentConfig (as seen in internal/api/job.go), and our search did not reveal any evidence that this value is sourced from arbitrary user input. This minimizes the current injection risk. However, if the origin of the template ever shifts to include untrusted input, you should consider adding appropriate sanitization for any placeholders.

  • Verified that the only occurrences of script in the repository are within internal configurations and usage.
  • No indications that script can be manipulated by arbitrary or external user inputs were detected.
pkg/jobagent/runner.go (2)

12-15: Interface redesign improves job execution flow

The revised Runner interface with the new Start method signature enhances job management by accepting a context and returning a status directly. This better supports cancellation and structured error handling.


47-49: Documentation accurately reflects the implementation changes

The updated comments clearly explain the new responsibility distribution where the Start method handles job status updates.

cmd/ctrlc/root/run/exec/runner.go (3)

22-27: Good encapsulation of running job details

The RunningJob struct effectively encapsulates all necessary details for tracking and managing job execution state.


36-40: Helper function improves code organization

Extracting the status update logic into a helper function reduces duplication and centralizes error handling.


79-82:

❓ Verification inconclusive

Verify template safety against potential injection

The TemplateJobDetails method could be vulnerable to template injection if job details contain user-submitted content.


🏁 Script executed:

#!/bin/bash
# Look for template usage in the TemplateJobDetails method
grep -rn "TemplateJobDetails" --include="*.go" .
rg -A 20 "func \(\w+ \*JobWithDetails\) TemplateJobDetails\(\)"

Length of output: 1445


Potential Template Injection Risk

  • The TemplateJobDetails method in internal/api/job.go directly parses and executes the script template using the job agent config’s "script" and j.Details.
  • If either the script or the details can include user-submitted content, this could expose a template injection vulnerability.
  • Please verify that both the source of the script template and the contents of j.Details are either from trusted sources or properly validated/sanitized before being used for template execution.

Comment on lines 63 to 66
// TODO: If we can start jobs in pending, then this is valid and will prevent us from starting the same job twice.
// if apiJob.Status == api.JobStatusInProgress {
// continue
// }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually pretty critical to not restarting a job that takes longer than the interval, and for not starting jobs already handled by other agents.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We start jobs in a in progress state which doesn't make sense to me. It should only be in progress after it is picked up?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
cmd/ctrlc/root/run/exec/exec.go (2)

21-25: Variable overshadowing concern.

Defining maxJobTimeoutMin here and then re-declaring it in line 52 with := can create confusion, because it shadows this variable and doesn’t update its value. Consider updating line 52 to an assignment (e.g., maxJobTimeoutMin = viper.GetInt("max-job-timeout")) or removing this initial declaration if it’s unnecessary.

 var (
     name             string
     jobAgentType     = "exec-bash"
-    maxJobTimeoutMin int
+    maxJobTimeoutMin int // Keep declared, but ensure we don't overshadow it later
 )

52-55: Ensure single variable assignment.

As noted, either remove the earlier declaration of maxJobTimeoutMin or avoid redeclaring it here to reduce confusion. Otherwise, this logic for defaulting to 30 minutes is sound.

cmd/ctrlc/root/run/exec/runner.go (1)

99-201: Process starting and asynchronous cleanup.

  1. Pointer address as job handle: Using fmt.Sprintf("%p", cmd) might collide if the same memory address is reused. While unlikely in typical usage, consider a unique ID generator (e.g., UUID) for clarity.
  2. Script creation: Generating a temp file, making it executable, and removing it after the run is a safe approach.
  3. Concurrency: The defer r.cleanupJob(handle) approach is appropriate for ensuring consistent cleanup.
-handle := fmt.Sprintf("%p", cmd)
+handle := uuid.NewString()
 // Remainder of code can stay the same
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5160555 and 43f68ca.

📒 Files selected for processing (4)
  • cmd/ctrlc/root/run/exec/exec.go (2 hunks)
  • cmd/ctrlc/root/run/exec/runner.go (1 hunks)
  • internal/api/job.go (1 hunks)
  • pkg/jobagent/runner.go (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/jobagent/runner.go
🧰 Additional context used
🧬 Code Definitions (1)
cmd/ctrlc/root/run/exec/exec.go (3)
internal/api/client.go (1)
  • NewAPIKeyClientWithResponses (10-19)
cmd/ctrlc/root/run/exec/runner.go (1)
  • NewExecRunner (42-87)
pkg/jobagent/runner.go (1)
  • NewJobAgent (18-39)
🔇 Additional comments (25)
cmd/ctrlc/root/run/exec/exec.go (10)

4-12: Imports look good.

The additional imports (context, os/signal, runtime, syscall, time, heredoc) correctly support the new timeout and signal handling features.


34-36: Example usage is clear.

The updated example nicely illustrates how to specify the agent’s name and workspace. Keep it in sync whenever new flags or options are introduced.


40-40: No issues with retrieving the API key from config.


45-47: Validation for agent name is correct.

The user experience will be more robust with this check in place.


48-50: Validation for workspace is correct.

Similarly, requiring a workspace keeps the command usage clear.


57-60: Timeout-based context creation looks good.

This properly enforces a maximum processing time.


75-85: Signal handling for graceful shutdown is well-implemented.

Exiting all jobs upon interrupt or termination ensures a clean shutdown and consistent state.


87-89: Queued job execution invocation is straightforward.

No issues found with handling errors from RunQueuedJobs.


96-99: Required flags solidify correct usage.

Enforcing --name and --workspace ensures the command is never run with missing crucial parameters.


27-29:

✅ Verification successful

Confirm type naming consistency.

You’ve set jobAgentType = "exec-powershell" on Windows and "exec-bash" otherwise. Ensure any related validation or error messages elsewhere in the codebase also use these values, rather than older strings like "exec-windows" or "exec-linux".


🏁 Script executed:

#!/bin/bash
# Description: Search for references to "exec-windows" or "exec-linux" in the repository 
# to confirm they aren't mistakenly used in validations, help text, or error messages.

rg "exec-windows|exec-linux" -A 3

Length of output: 35


I'll run another verification script to search for both deprecated and current agent type strings to ensure consistency across the codebase. This will help us confirm that no references to "exec-windows" or "exec-linux" are lingering and that the new values ("exec-powershell" and "exec-bash") are used consistently:

Once this script completes, please share the outputs so we can finalize the review comment.


🏁 Script executed:

#!/bin/bash
echo "Searching for deprecated types: 'exec-windows' or 'exec-linux'"
rg "exec-windows|exec-linux" -n -A 3

echo "Searching for updated types: 'exec-powershell' or 'exec-bash'"
rg "exec-powershell|exec-bash" -n -A 3

Length of output: 777


Agent type naming consistent across the codebase

The verification confirms that only the updated types "exec-powershell" (for Windows) and "exec-bash" (for non-Windows) are used, and there are no lingering references to the deprecated "exec-windows" or "exec-linux".

cmd/ctrlc/root/run/exec/runner.go (8)

4-15: Imports and dependencies are appropriate.

They adequately support command execution, signal handling, and logging.


22-27: RunningJob struct is concise and clear.

This captures all essential details (cmd, job ID, job reference, cancellation status).


29-32: ExecRunner struct design.

Using a map plus a mutex for tracking concurrent jobs is a standard and effective approach.


35-40: Job status updates.

Centralizing status updates in a helper function improves code clarity and fosters a single source of truth for job statuses.


42-84: Graceful shutdown logic.

The multi-signal handling with a short timeout is a robust approach to ensure the application can do cleanup while also allowing forced termination.


89-97: Job cleanup function.

Acquiring a lock, deleting the job from the map, and logging debug info is well-structured.


203-234: ExitAll function is well-structured.

You lock the map, iterate through running jobs, and optionally kill them. The approach for updating their statuses before killing them is consistent with the rest of the code.


236-261: killProcess function.

Terminating the process tree for Windows with taskkill /F /T /PID and using pgid + SIGTERM on Unix-based systems, plus a final kill fallback, ensures cross-platform coverage.

internal/api/job.go (7)

1-18: Struct extension approach.

JobWithDetails & context fields allow for more detailed and context-driven job states.


20-23: Factory function usage.

NewJobWithDetails calling NewJobWithDetailsContext is a clear pattern for default vs. custom contexts.


25-41: Context-aware constructor logic.

Retrieving Details immediately in the constructor ensures the job data is ready for subsequent steps.


43-62: GetJobDetails handles API fetch robustly.

Error checks for the response object and JSON marshalling appear thorough.


64-71: WithContext method.

Cloning with a shallow copy is a solid approach to embedding a fresh context without losing job fields.


95-115: UpdateStatus method is straightforward.

Handling optional parameters for message and externalID is well done, and errors are properly surfaced.


117-120: Setting ExternalID.

This simple setter effectively externalizes the job reference ID for further updates.

Comment on lines +72 to +78
// TemplateJobDetails applies the job details template to the script
func (j *JobWithDetails) TemplateJobDetails() (string, error) {
// Extract script from JobAgentConfig
script, ok := j.JobAgentConfig["script"].(string)
if !ok {
return "", fmt.Errorf("script not found in job agent config")
}

// Parse the script template
templatedScript, err := template.New("script").Parse(script)
if err != nil {
return "", fmt.Errorf("failed to parse script template: %w", err)
}

// Execute the template with job details
buf := new(bytes.Buffer)
if err := templatedScript.Execute(buf, j.Details); err != nil {
return "", fmt.Errorf("failed to execute script template: %w", err)
}

return buf.String(), nil
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Script templating can be risky without sanitization.

User-provided details are injected into a script template, which can lead to code injection if the content is untrusted. Consider validating or sanitizing fields in j.Details to mitigate security risks.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 43f68ca and c6fdca4.

📒 Files selected for processing (2)
  • cmd/ctrlc/root/run/exec/exec.go (2 hunks)
  • cmd/ctrlc/root/run/run.go (2 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
cmd/ctrlc/root/run/run.go (2)
internal/cliutil/interval.go (1)
  • AddIntervalSupport (10-51)
cmd/ctrlc/root/run/exec/exec.go (1)
  • NewRunExecCmd (18-91)
🪛 golangci-lint (1.64.8)
cmd/ctrlc/root/run/exec/exec.go

50-50: ineffectual assignment to interval

(ineffassign)

🔇 Additional comments (10)
cmd/ctrlc/root/run/run.go (3)

7-7: LGTM: Added viper import for config support.

Adding the viper import is appropriate for reading configuration settings.


19-23: Good implementation of configurable interval.

The code now reads the interval from configuration with a sensible default of "10s". This makes the command behavior more flexible without requiring code changes.


24-24: LGTM: Updated command addition with interval support.

The exec command is now properly wrapped with the interval support function, passing the configured interval value.

cmd/ctrlc/root/run/exec/exec.go (7)

5-8: LGTM: Added necessary imports for signal handling and OS detection.

These imports support the new graceful shutdown logic and OS-based agent type selection.


19-26: LGTM: OS-aware job agent type selection.

Automatically selecting the appropriate job agent type based on the OS is a good improvement. This removes the need for users to manually specify the correct type.


31-33: LGTM: Updated example with correct usage pattern.

The example now correctly shows the required flags without the unnecessary --type parameter since it's automatically determined.


42-47: LGTM: Added validation for required parameters.

Good defensive programming by checking for required parameters early in the execution.


53-58: LGTM: Improved job agent configuration.

The job agent is now configured with dynamic values from command flags and OS detection.


68-76: LGTM: Added graceful shutdown handling.

The signal handling for graceful shutdown is a great addition that improves the user experience when terminating the process.


87-89: LGTM: Added required flag markers.

Marking the flags as required ensures users provide the necessary parameters.

Copy link
Collaborator

@j7m4 j7m4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code should be thoroughly tested with numerous combinations of long and short running processes as part of a test suite. At this stage of the project that may not be needed, but bugs will creep in and we might end up with deadlocks.

client, err := api.NewAPIKeyClientWithResponses(apiURL, apiKey)
if err != nil {
return fmt.Errorf("failed to create API client: %w", err)
}
if name == "" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could move input validation above the client creation?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
cmd/ctrlc/root/run/exec/exec.go (2)

83-84: Handle errors from MarkFlagRequired calls

The MarkFlagRequired method returns an error that should be checked. While these errors are rare (typically only occurring if the flag doesn't exist), it's a good practice to handle them.

-	cmd.MarkFlagRequired("name")
-	cmd.MarkFlagRequired("workspace")
+	if err := cmd.MarkFlagRequired("name"); err != nil {
+		log.Fatal("Failed to mark name flag as required", "error", err)
+	}
+	if err := cmd.MarkFlagRequired("workspace"); err != nil {
+		log.Fatal("Failed to mark workspace flag as required", "error", err)
+	}
🧰 Tools
🪛 golangci-lint (1.64.8)

83-83: Error return value of cmd.MarkFlagRequired is not checked

(errcheck)


18-18: Consider adding GoDoc comment for exported function

The NewRunExecCmd function is exported but lacks a GoDoc comment. Adding documentation would improve code readability and help users of this package understand its purpose and usage.

+// NewRunExecCmd creates a new command for executing jobs directly when received
 func NewRunExecCmd() *cobra.Command {
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c6fdca4 and 33ea1ad.

📒 Files selected for processing (3)
  • Makefile (1 hunks)
  • cmd/ctrlc/root/run/exec/exec.go (2 hunks)
  • pkg/jobagent/runner.go (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/jobagent/runner.go
🧰 Additional context used
🧬 Code Definitions (1)
cmd/ctrlc/root/run/exec/exec.go (3)
internal/api/client.go (1)
  • NewAPIKeyClientWithResponses (10-19)
cmd/ctrlc/root/run/exec/runner.go (1)
  • NewExecRunner (42-74)
pkg/jobagent/runner.go (1)
  • NewJobAgent (18-39)
🪛 golangci-lint (1.64.8)
cmd/ctrlc/root/run/exec/exec.go

83-83: Error return value of cmd.MarkFlagRequired is not checked

(errcheck)

🔇 Additional comments (7)
Makefile (2)

17-22: Great addition of fmt and lint targets!

These new targets enhance code quality processes by providing standardized formatting and static code analysis capabilities. The fmt target will help maintain consistent code style, while the lint target will help catch potential issues early.


26-26: Good consolidation of .PHONY declarations

Consolidating all .PHONY declarations into a single line is a clean approach and follows common Makefile conventions. This change properly includes both the existing and new targets.

cmd/ctrlc/root/run/exec/exec.go (5)

19-26: Good OS-specific adaptation for job agent type

The code now intelligently selects the appropriate executor based on the operating system, which improves cross-platform compatibility. Using PowerShell for Windows and Bash for other platforms is a sensible default.


31-33: Example command is more useful now

The updated example clearly demonstrates how to use the command with the required flags, which improves usability.


38-43: Good validation of required parameters

Adding explicit validation for both the name and workspace parameters ensures that the command won't proceed without the necessary information, preventing potential errors later.


48-53: Appropriate job agent configuration

The job agent configuration now uses the provided name and OS-specific job agent type, which aligns with the overall design of the system.


63-71:

❓ Verification inconclusive

Check for potential signal handler conflict

There appears to be two signal handlers in the codebase - one here and one in the NewExecRunner function (based on the relevant code snippets). This could potentially lead to race conditions or unexpected behavior when handling shutdown signals.


🏁 Script executed:

#!/bin/bash
# Check for other signal handlers in the codebase that might conflict
grep -r "signal.Notify" --include="*.go" .

Length of output: 212


Action Required: Review and Confirm Signal Handler Behavior

After verification, we found that the codebase contains two invocations of signal.Notify:

  • File: cmd/ctrlc/root/run/exec/exec.go
    (Lines 63-71) starts a goroutine to listen for os.Interrupt and syscall.SIGTERM and call runner.ExitAll(true).

  • File: cmd/ctrlc/root/run/exec/runner.go
    (Signal handling is similarly set up, as identified by the grep output.)

There is a potential risk of conflicting or redundant signal handling, which might lead to race conditions or unexpected shutdown behavior. Please review both signal handlers to verify if they are intended to coexist or if merger/refactoring is recommended to ensure a single, coherent shutdown process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants