-
Notifications
You must be signed in to change notification settings - Fork 1
Resolve "Alpamon Agent Refactoring" #164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add a Pool struct that implements the Goroutine Worker Pool pattern to solve the problem of potentially having an unlimited number of goroutines.
There was a potential for goroutines to not shut down gracefully upon alpamon termination because their contexts were not propagated. Furthermore, the unrestrained use of contexts resulted in poor context management. To address these problems, add a ContextManager to globally manage contexts within alpamon.
As the Pool was added, add test cases for the functionalities it provides.
As the ContextManager was added, add test cases for the functionalities it provides.
As the management of goroutines and contexts was changed to be handled globally by the Pool and ContextManager, respectively, refactor the following components: - Collector(pkg/collector/collector.go) - WebsocketClient(pkg/runner/client.go) - CommandRunner(pkg/runner/command.go) - CommitAsync()(pkg/runner/commit.go) - Reporter(pkg/scheduler/reporter.go) - LogServer(pkg/logger/server.go)
If multiple PtyClients attempt to use the terminals variable simultaneously, a race condition could occur. To solve this, use sync.RWMutex to prevent the race condition.
Add a pool section to alpamon.conf to allow users to dynamically configure parameters such as the maximum number of workers and the queue size for the Pool.
Refactore the newly added pool section in alpamon.conf to be stored and used in Settings, similar to the other existing sections in alpamon.conf.
Delete unused NewContextManagerWithRoot().
Adjust golint to pool_test.go.
Add a ReporterManager to manage the goroutine lifecycle of the Reporters.
Fix a context leak by modifying the code to use the ReporterManager for managing the goroutine and context lifecycle of the Reporters.
As mentioned in the code review, fix the code to use a cancel function to prevent context leaks and memory leaks in the restart, quit, reboot, and shutdown operation cases.
As mentioned in the code review, fix the code to use a cancel function to prevent context leaks and memory leaks in LogServer and WebsocketClient.
Replace unconventional context timeout as sleep mechanism with standard time.After + select pattern for better readability and Go idioms.
Use pointer type for DefaultTimeout to distinguish between "not configured" and "explicitly set to 0", enabling users to disable timeout via config.
…ase-1-goroutine-pool-and-context-management-implementation Resolve "[Alpamon Agent Refactoring] Phase 1: Goroutine Pool and Context Management Implementation"
Add executor components to resolve the God Object Anti-pattern in pkg/runner/command.go Add Handler, CommandExecutor, and WSClient interfaces considering polymorphism and extensibility. Executor is the component that executes the actual command, responsible for receiving the command data from Handler, managing privilege demotion, and executing the command.
Add Registry, which is responsible for registering and unregistering handlers, and managing the registered handlers.
Add CommandDispatcher component, which is responsible for routing the Command received from the CommandRunner to the correct handler.
Add BaseHandler to provide functionality commonly used by all handlers.
The previous use of maps for passing CommandData between the Dispatcher, Handler, and Executor resulted in reduced type safety. This was resolved by adding a separate CommandArgs struct to manage the data used by these components.
Update the CommandDispatcher following the addition of CommandArgs.
Update Interfaces following the addition of CommandArgs at pkg/executor/handlers/common/interfaces.go
Integrate NewBaseHandlerWithTypes() into NewBaseHandler().
Refactor pkg/runner/command.go and pkg/runner/command\_types.go following the addition of executor-related components to resolve the God Object Anti-pattern in pkg/runner/command.go. Refactor the CommandRunner so that its sole responsibility is to parse the incoming CommandData into CommandArgs, route the command to the Dispatcher for execution by the Executor, and send the execution result back to the WebsocketClient. Firewall logic was kept for compatibility and is scheduled for refactoring in Phase 3.
Add dispatcher field to WebsocketClient to handle command with CommandRunner and CommandDispatcher. Separate the logic for the command cases into a method named handleCommand()
Minor fix.
Add a SystemHandler to process system-related commands. This handler takes over the following cases previously handled by CommandRunner: upgrade, restart, quit, reboot, shutdown, update, byebye
Update pty.go to import and use protocol.CommandData directly. Update terminal.go to use protocol.CommandData instead of runner.CommandData. Update client.go, command.go to use types in internal/protocol instead of types in pkg/runner.
…-phase4-transport-layer-separation Resolve "[Alpamon Agent Refactoring] Phase4: Transport Layer Separation"
Add InfoHandler tests (ping, help, commit, sync) Add ShellHandler tests (operators, user/group, env, timeout) Add SystemHandler tests (upgrade, restart, quit, reboot, shutdown)
Add pool leak tests (normal ops, panic, context cancel, queue full). Add context manager leak tests (create/cancel, child cleanup). Verify zero goroutine leaks after operations.
Add pool benchmarks (worker scaling, queue throughput, concurrent submit). Add context manager benchmarks (create, timeout, concurrent). Add dispatcher/registry benchmarks (get, list, concurrent). Fix benchmark deadlock by increasing queue size.
Add integration tests for registry and handler execution. Add regression tests for all handler/command types. Add resource usage tests (memory, startup time, overhead).
Apply golint.
Apply golint.
Apply golint.
…ase-5-testing-and-validation Resolve "[Alpamon Agent Refactoring] Phase 5: Testing and Validation"
…actoring-integrate-main-branch-changes-into-refactored-architecture
Add TunnelHandler to handle the following commands: opentunnel, closetunnel Define types and structs related to TunnelHandler.
Add fields related to tunneling functionality in conjunction with TunnelHandler.
…ation Update RegisterAll() to include TunnelHandler registration.
…tegrate-main-branch-changes-into-refactored-architecture Resolve "[Alpamon Agent Refactoring] Integrate Main Branch Changes into Refactored Architecture"
|
@junho226 , @jisung-02 First of all, I apologize for requesting a review on such a large amount of code.
During the refactoring process, I have also incorporated the latest changes merged into the main branch.
|
Inline suppressions added for intentional behaviors (9 locations): - executor.go: CWE-78 command injection (core agent functionality) - file.go: CWE-22 path traversal, CWE-918 SSRF (admin file operations) - user.go: CWE-22 path traversal (home directory management) - fs.go: CWE-22 path traversal (file existence check) - http_client.go: CWE-918 SSRF (admin-specified URL requests) Alpamon is a server management agent equivalent to SSH server. Commands are received only from authenticated Alpacon administrators.
Add CodeQL security scanning for Go codebase with custom configuration. New files: - .github/workflows/codeql.yml: Weekly CodeQL analysis on main branch - .github/codeql/codeql-config.yml: Security model documentation
Remove custom CodeQL configuration to resolve conflict with GitHub's Default Setup which is already enabled on this repository. Deleted files: - .github/workflows/codeql.yml - .github/codeql/codeql-config.yml Inline suppression comments in source files (executor.go, file.go, user.go, fs.go, http_client.go) remain intact and will work with GitHub's Default Setup.
Add // lgtm[...] inline comments to suppress CodeQL security alerts that are false positives for this server management agent. Alpamon is designed to execute admin commands from authenticated Alpacon console, similar to an SSH server. These operations are intentional features, not vulnerabilities: - go/command-injection: executor.go (admin command execution) - go/path-injection: file.go, user.go, fs.go (file operations) - go/request-forgery: file.go, http_client.go (URL fetching) Existing // codeql[...] comments retained for documentation.
Summary
This PR merges the complete agent refactoring work from
131-alpamon-agent-refactoringbranch intomain. The refactoring addresses critical architectural issues, including the God Object anti-pattern, uncontrolled goroutine spawning, and circular dependencies.Key Achievements
Architecture Changes
1. Handler Pattern Implementation
Decomposed the monolithic
command.gointo specialized handlers:2. Goroutine Pool & Context Management
Worker Pool (
internal/pool/pool.go):Context Manager (
pkg/agent/context.go):3. Command Dispatcher System
4. Main Branch Features Integrated
Design Patterns Applied
pkg/executor/handlers/pkg/executor/factory.gopkg/executor/registry.gohandlers/firewall/backend.gointernal/pool/pool.gopkg/agent/context.goBreaking Changes
API Changes
NewCommandRunner()now requiresdispatcherparameterNewWebsocketClient()now requiresctxManagerandworkerPoolparametersConfiguration
New
[pool]section inalpamon.conf:Dependencies
github.com/xtaci/smux(tunnel multiplexing)gopkg.in/natefinch/lumberjack.v2(log rotation)Test Results
Test Coverage
Files Changed (Key Files)
Core Refactoring
pkg/runner/command.go- Reduced from 2,363 to 121 linespkg/runner/client.go- Added pool integration and ping/pongcmd/alpamon/command/root.go- Updated initialization sequenceNew Components
internal/pool/pool.go- Worker pool implementationpkg/agent/context.go- Context managerpkg/executor/dispatcher.go- Command dispatcherpkg/executor/factory.go- Handler factorypkg/executor/handlers/*- 9 specialized handlersIntegrated from Main
pkg/runner/tunnel_client.go- smux tunnel clientpkg/runner/auth_manager.go- Sudo approval managerpkg/runner/control_client.go- Control WebSocket clientVerification Checklist
Deployment Notes
[pool]section to existing configsRelated Issues/PRs