
poc/kagenti-integration#2354

Draft
jezekra1 wants to merge 1 commit into main from poc/kagenti-integration

Conversation

@jezekra1
Collaborator

@jezekra1 jezekra1 commented Mar 9, 2026

Signed-off-by: Radek Ježek <radek.jezek@ibm.com>

Summary

Refs #2304

Linked Issues

Documentation

  • No Docs Needed:

  "skopeo",
  "copy",
- *(["--src-username", "x-access-token", "--src-password", github_token] if github_token else []),
+ *(["--src-username", "x-access-token", "--src-password", github_token] if github_token and image.startswith("ghcr.io/") else []),

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization (High)

The string ghcr.io/ may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI about 21 hours ago

In general, to fix incomplete URL substring sanitization, you should parse the value into structured components (scheme, host/registry, path) and then compare the host/registry field to an allowlist or exact values, rather than relying on substring or prefix matching on the entire string.

Here, the goal is to detect when a container image is hosted on GitHub Container Registry (ghcr.io) so that the code can attach --src-username/--src-password and the GITHUB_TOKEN environment variable. Instead of image.startswith("ghcr.io/"), we should derive the registry portion of the image reference and compare that registry to "ghcr.io" (or to a small allowlist containing it). This keeps behavior the same for normal values like ghcr.io/org/repo:tag, but avoids misclassifying strings with extra prefixes such as docker://ghcr.io/org/repo or oci://ghcr.io/....

A minimal, self-contained way to do this without changing external behavior is:

  1. Add a small helper function in this module (near other utilities) that safely extracts the registry from an image reference:
    • Strip any docker:// prefix if present (since the code itself prepends docker:// elsewhere).
    • Split the remaining string on / and treat the first component as the registry if it contains a dot (.) or colon (:), or is exactly localhost (standard Docker rule).
  2. Replace the two uses of image.startswith("ghcr.io/") in this function with a call to that helper (e.g., is_ghcr_image(image) which internally compares the parsed registry to "ghcr.io").
  3. Keep the rest of the logic unchanged, including the way --src-username/--src-password and env are constructed.

This change stays entirely within platform.py, doesn’t add new dependencies, and only affects the detection of GHCR images.
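
The parsing rule described above can be sketched independently of the autofix patch. This is a minimal illustration, not the PR's code: the names `image_registry` and `is_ghcr_image` are hypothetical, and the sketch assumes one extra condition that the patch does not have, namely that a reference without any `/` (such as `ubuntu:22.04`, where the colon introduces a tag) has no registry component at all.

```python
from typing import Optional


def image_registry(image: str) -> Optional[str]:
    """Return the registry host of an image reference, or None if it has none.

    Hypothetical helper for illustration only; not part of the PR.
    """
    # Drop a transport prefix such as 'docker://' if one is present.
    _, sep, rest = image.partition("://")
    ref = rest if sep else image
    first, slash, _ = ref.partition("/")
    # Assumption: without a '/', the string is a bare repository name
    # (possibly with a tag), so there is no registry to report.
    if not slash:
        return None
    if "." in first or ":" in first or first == "localhost":
        return first
    return None


def is_ghcr_image(image: str) -> bool:
    """Compare the parsed registry exactly, instead of prefix-matching the string."""
    return image_registry(image) == "ghcr.io"
```

With this shape, `example.com/ghcr.io/repo` is not classified as a GHCR image, because only the first path component is compared.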


Suggested changeset 1
apps/agentstack-cli/src/agentstack_cli/commands/platform.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/apps/agentstack-cli/src/agentstack_cli/commands/platform.py b/apps/agentstack-cli/src/agentstack_cli/commands/platform.py
--- a/apps/agentstack-cli/src/agentstack_cli/commands/platform.py
+++ b/apps/agentstack-cli/src/agentstack_cli/commands/platform.py
@@ -44,6 +44,31 @@
 configuration = Configuration()
 
 
+def _get_image_registry(image: str) -> str | None:
+    """
+    Extract the registry (host) component from a container image reference.
+
+    This follows Docker's heuristic: if the first path component contains a dot
+    or a colon, or is exactly 'localhost', it is treated as the registry.
+    """
+    if not image:
+        return None
+    # Strip a leading transport prefix if present (for example 'docker://').
+    if "://" in image:
+        # Only split once to avoid discarding the remainder.
+        _, image = image.split("://", 1)
+    first_component = image.split("/", 1)[0]
+    if "." in first_component or ":" in first_component or first_component == "localhost":
+        return first_component
+    return None
+
+
+def _is_ghcr_image(image: str) -> bool:
+    """Return True if the image is hosted on GitHub Container Registry."""
+    registry = _get_image_registry(image)
+    return registry == "ghcr.io"
+
+
 @functools.cache
 def detect_driver() -> typing.Literal["lima", "wsl"]:
     has_lima = (importlib.resources.files("agentstack_cli") / "data" / "bin" / "limactl").is_file() or shutil.which("limactl")
@@ -811,12 +836,16 @@
                     else [
                         "skopeo",
                         "copy",
-                        *(["--src-username", "x-access-token", "--src-password", github_token] if github_token and image.startswith("ghcr.io/") else []),
+                        *(
+                            ["--src-username", "x-access-token", "--src-password", github_token]
+                            if github_token and _is_ghcr_image(image)
+                            else []
+                        ),
                         f"docker://{image}",
                         f"containers-storage:{image}",
                     ],
                     f"Pulling image {image}",
-                    env={"GITHUB_TOKEN": github_token} if github_token and image.startswith("ghcr.io/") else None,
+                    env={"GITHUB_TOKEN": github_token} if github_token and _is_ghcr_image(image) else None,
                 )
 
         # --- Kagenti platform installation ---
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
  ],
  f"Pulling image {image}",
- env={"GITHUB_TOKEN": github_token} if github_token else None,
+ env={"GITHUB_TOKEN": github_token} if github_token and image.startswith("ghcr.io/") else None,

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization (High)

The string ghcr.io/ may be at an arbitrary position in the sanitized URL.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant architectural shift by integrating Kagenti into the Agent Stack CLI. This integration simplifies agent management, enhances scalability, and provides a more streamlined local development experience. The changes involve removing custom Kubernetes management components and adopting Kagenti's deployment and discovery mechanisms.

Highlights

  • Kagenti Integration: This PR integrates Kagenti for agent scaling, deployment, and discovery, replacing the custom Kubernetes provider management.
  • Architecture Refactoring: The architecture is refactored to delegate agent lifecycle management to Kagenti, streamlining the local developer experience and enabling optional enterprise features.
  • Dependency Updates: Removes dependencies related to the previous Kubernetes provider management and updates configurations to align with Kagenti's architecture.
  • UI and API Changes: Updates the UI and API endpoints to reflect the new architecture, including changes to agent URLs and authentication methods.


Changelog
  • apps/agentstack-cli/src/agentstack_cli/__init__.py
    • Removed the 'build' command and related UI elements.
    • Updated UI URL to agentstack.localtest.me:8080.
  • apps/agentstack-cli/src/agentstack_cli/api.py
    • Modified the OpenAI client to remove the Authorization header from default headers.
  • apps/agentstack-cli/src/agentstack_cli/auth_manager.py
    • Added 'login_with_password' method for authentication using resource owner password grant.
  • apps/agentstack-cli/src/agentstack_cli/commands/agent.py
    • Removed GitHub repository related logic and Dockerfile options from the 'add' and 'update' commands.
    • Modified 'add' and 'update' commands to accept network URLs instead of Docker images or GitHub URLs.
    • Removed discovery timeout constants.
    • Removed environment variable management commands.
    • Modified agent listing to remove state sorting and missing environment variable display.
  • apps/agentstack-cli/src/agentstack_cli/commands/build.py
    • Removed the entire file, as build functionality is now handled by Kagenti.
  • apps/agentstack-cli/src/agentstack_cli/commands/platform.py
    • Modified the 'start' command to include Kagenti dependencies and configuration.
    • Added logic to parse chart-scoped values from YAML files and command-line flags.
    • Modified image pulling and Helm installation processes to accommodate Kagenti.
  • apps/agentstack-cli/src/agentstack_cli/commands/self.py
    • Updated the UI URL to agentstack.localtest.me:8080.
  • apps/agentstack-cli/src/agentstack_cli/commands/server.py
    • Added a shortcut for local development login using resource owner password grant.
  • apps/agentstack-cli/src/agentstack_cli/configuration.py
    • Removed 'agent_registry' and added auto-recovery for local dev authentication.
    • Removed HttpUrl import.
  • apps/agentstack-cli/src/agentstack_cli/data/vm/common/etc/systemd/system/kubectl-port-forward@.service
    • Modified the kubectl port-forward service to support namespace specification.
  • apps/agentstack-cli/src/agentstack_cli/utils.py
    • Removed 'print_log' function and GitHub URL related logic.
  • apps/agentstack-cli/uv.lock
    • Updated the required Python version to 3.14.
  • apps/agentstack-sdk-py/src/agentstack_sdk/a2a/extensions/services/platform.py
    • Updated the default platform URL to agentstack-api.localtest.me:8080.
  • apps/agentstack-sdk-py/src/agentstack_sdk/platform/__init__.py
    • Removed imports related to provider builds and discovery.
  • apps/agentstack-sdk-py/src/agentstack_sdk/platform/client.py
    • Updated the default platform URL to agentstack-api.localtest.me:8080.
  • apps/agentstack-sdk-py/src/agentstack_sdk/platform/common.py
    • Removed GithubVersionType, ResolvedGithubUrl, and ResolvedDockerImageID classes.
  • apps/agentstack-sdk-py/src/agentstack_sdk/platform/provider.py
    • Removed auto_stop_timeout, version_info, registry, and related logic.
    • Added source_type and simplified the Provider model.
    • Removed environment variable management methods.
  • apps/agentstack-sdk-py/src/agentstack_sdk/server/server.py
    • Removed AgentExtension import and environment variable reloading logic.
    • Modified the serve method to handle self-registration without a client factory.
  • apps/agentstack-sdk-ts/src/experimental/server/core/config/schemas.ts
    • Updated the default platform URL to agentstack-api.localtest.me:8080.
  • apps/agentstack-server/src/agentstack_server/api/auth/auth.py
    • Removed provider_builds and provider_variables permissions.
  • apps/agentstack-server/src/agentstack_server/api/dependencies.py
    • Removed ProviderBuildServiceDependency and ProviderDiscoveryServiceDependency.
  • apps/agentstack-server/src/agentstack_server/api/routes/provider_builds.py
    • Removed the entire file, as build functionality is now handled by Kagenti.
  • apps/agentstack-server/src/agentstack_server/api/routes/provider_discovery.py
    • Removed the entire file, as discovery functionality is now handled by Kagenti.
  • apps/agentstack-server/src/agentstack_server/api/routes/providers.py
    • Removed auto_stop_timeout, version_info, registry, and related logic.
    • Modified the create_provider and patch_provider methods to align with the new architecture.
  • apps/agentstack-server/src/agentstack_server/application.py
    • Removed imports and routes related to provider builds and discovery.
  • apps/agentstack-server/src/agentstack_server/bootstrap.py
    • Removed KubernetesProviderBuildManager and KubernetesProviderDeploymentManager injection.
  • apps/agentstack-server/src/agentstack_server/configuration.py
    • Removed AgentRegistryConfiguration and added KagentiConfiguration.
    • Updated the default issuer URLs for Keycloak.
  • apps/agentstack-server/src/agentstack_server/domain/constants.py
    • Removed DOCKER_MANIFEST_LABEL_NAME and DEFAULT_AUTO_STOP_TIMEOUT.
  • apps/agentstack-server/src/agentstack_server/domain/models/permissions.py
    • Removed provider_variables and provider_builds permissions.
  • apps/agentstack-server/src/agentstack_server/domain/models/provider.py
    • Removed auto_stop_timeout, version_info, registry, and related logic.
    • Added source_type and simplified the Provider model.
  • apps/agentstack-server/src/agentstack_server/domain/repositories/provider.py
    • Removed type and unmanaged_state parameters from the list method.
    • Removed update_unmanaged_state method and replaced it with update_state.
  • apps/agentstack-server/src/agentstack_server/exceptions.py
    • Removed BuildAlreadyFinishedError and MissingConfigurationError.
  • apps/agentstack-server/src/agentstack_server/infrastructure/kagenti/__init__.py
    • Added an empty __init__ file.
  • apps/agentstack-server/src/agentstack_server/infrastructure/kagenti/client.py
    • Added a KagentiClient class for interacting with the Kagenti API.
  • apps/agentstack-server/src/agentstack_server/infrastructure/persistence/migrations/alembic/versions/c0095389475b_.py
    • Removed managed provider features and added Kagenti sync support.
  • apps/agentstack-server/src/agentstack_server/infrastructure/persistence/repositories/provider.py
    • Removed type and unmanaged_state parameters from the list method.
    • Removed update_unmanaged_state method and replaced it with update_state.
  • apps/agentstack-server/src/agentstack_server/service_layer/build_manager.py
    • Removed the entire file, as build functionality is now handled by Kagenti.
  • apps/agentstack-server/src/agentstack_server/service_layer/deployment_manager.py
    • Removed the entire file, as deployment functionality is now handled by Kagenti.
  • apps/agentstack-server/src/agentstack_server/service_layer/services/a2a.py
    • Removed deployment manager and related logic.
    • Simplified the ensure_agent method.
  • apps/agentstack-server/src/agentstack_server/service_layer/services/provider_build.py
    • Removed the entire file, as build functionality is now handled by Kagenti.
  • apps/agentstack-server/src/agentstack_server/service_layer/services/provider_discovery.py
    • Removed the entire file, as discovery functionality is now handled by Kagenti.
  • apps/agentstack-server/tasks.toml
    • Removed provider_build and provider_discovery queues.
  • apps/agentstack-ui/src/utils/constants.ts
    • Updated the default API URL to agentstack-api.localtest.me:8080.
  • apps/agentstack-ui/tasks.toml
    • Updated the schema generation script to use the new API URL.
  • apps/agentstack-ui/template.env
    • Updated the default API URL to agentstack-api.localtest.me:8080 and NEXTAUTH_URL to agentstack.localtest.me:8080.
Activity
  • Removed KubernetesProviderDeploymentManager and KubernetesProviderBuildManager.
  • Integrated Kagenti for agent scaling, deployment, and discovery.
  • Updated configurations to align with Kagenti's architecture.
  • Modified UI and API endpoints to reflect the new architecture.
  • Removed environment variable management commands.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a significant refactoring to integrate kagenti for agent lifecycle management, replacing the custom Kubernetes provider deployment and build system. The changes are extensive, touching many parts of the codebase from the CLI to the server-side services and database models. Key changes include removing the build command, simplifying the add and update agent commands, and introducing a new kagenti client for agent discovery. The provider model has been greatly simplified, and a new database migration reflects these changes. Overall, the changes are consistent with the goal of delegating agent management to kagenti. I've identified a few areas where maintainability could be improved by reducing hardcoded values in the platform setup scripts.

Note: Security Review did not run due to the size of the PR.

Comment on lines +771 to +782
await run_in_vm(
    vm_name,
    [
        "bash",
        "-c",
        textwrap.dedent("""\
            kubectl --kubeconfig=/kubeconfig apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.0/standard-install.yaml
            kubectl --kubeconfig=/kubeconfig apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.17.2/cert-manager.yaml
            kubectl --kubeconfig=/kubeconfig wait --for=condition=Available deployment -n cert-manager cert-manager-webhook --timeout=120s
            """),
    ],
    "Installing kagenti prerequisites (Gateway API CRDs, cert-manager)",

medium

The URLs for installing Gateway API CRDs and cert-manager are hardcoded with specific versions (v1.4.0 and v1.17.2 respectively). While this ensures reproducibility, it makes updates require code changes. To improve maintainability, consider defining these versions as constants at the top of the file.
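
One way the suggestion could look is sketched below. The constant names and the `prerequisites_script` helper are assumptions made for illustration; only the version numbers, URLs, and kubectl commands come from the snippet under review.

```python
import textwrap

# Assumed constant names; the values are the versions currently hardcoded in the PR.
GATEWAY_API_VERSION = "v1.4.0"
CERT_MANAGER_VERSION = "v1.17.2"

GATEWAY_API_MANIFEST = (
    "https://github.com/kubernetes-sigs/gateway-api/releases/download/"
    f"{GATEWAY_API_VERSION}/standard-install.yaml"
)
CERT_MANAGER_MANIFEST = (
    "https://github.com/cert-manager/cert-manager/releases/download/"
    f"{CERT_MANAGER_VERSION}/cert-manager.yaml"
)


def prerequisites_script() -> str:
    """Build the shell script that installs the kagenti prerequisites."""
    # The f-string is interpolated first, then dedent strips the common indent.
    return textwrap.dedent(f"""\
        kubectl --kubeconfig=/kubeconfig apply -f {GATEWAY_API_MANIFEST}
        kubectl --kubeconfig=/kubeconfig apply -f {CERT_MANAGER_MANIFEST}
        kubectl --kubeconfig=/kubeconfig wait --for=condition=Available deployment -n cert-manager cert-manager-webhook --timeout=120s
        """)
```

The returned string could then be passed as the `bash -c` argument in place of the inline literal, so a version bump touches only the two constants.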

Comment on lines +785 to +802
await run_in_vm(
    vm_name,
    [
        "bash",
        "-c",
        textwrap.dedent("""\
            ISTIO_VERSION=1.28.0
            ISTIO_REPO=https://istio-release.storage.googleapis.com/charts/
            helm repo add istio "$ISTIO_REPO" 2>/dev/null || true
            helm repo update istio
            kubectl --kubeconfig=/kubeconfig create namespace istio-system --dry-run=client -o yaml | kubectl --kubeconfig=/kubeconfig apply -f -
            helm upgrade --install istio-base istio/base --version=$ISTIO_VERSION --namespace=istio-system --kubeconfig=/kubeconfig --wait --force-conflicts
            helm upgrade --install istiod istio/istiod --version=$ISTIO_VERSION --namespace=istio-system --kubeconfig=/kubeconfig --wait --force-conflicts \
                --set pilot.resources.requests.cpu=50m \
                --set pilot.resources.requests.memory=256Mi
            """),
    ],
    "Installing Istio (Gateway API controller)",

medium

The Istio version (1.28.0) is hardcoded within this shell script block. This could lead to maintenance issues if kagenti's Istio dependency changes in the future. It would be more maintainable to define this version as a constant at the top of the file, making it easier to update.
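
A possible shape for this, sketched under assumptions: the Python constants and the `istio_preamble` helper are hypothetical names, and only the first lines of the script are generated so that the remaining helm commands can keep referring to `$ISTIO_VERSION` unchanged.

```python
import shlex
import textwrap

# Assumed constant names; the values are the ones hardcoded in the PR's script.
ISTIO_VERSION = "1.28.0"
ISTIO_REPO = "https://istio-release.storage.googleapis.com/charts/"


def istio_preamble(version: str = ISTIO_VERSION, repo: str = ISTIO_REPO) -> str:
    """Return the script preamble that pins the Istio version and chart repo."""
    # shlex.quote keeps the assignment safe if a value ever contains
    # whitespace or shell metacharacters.
    return textwrap.dedent(f"""\
        ISTIO_VERSION={shlex.quote(version)}
        ISTIO_REPO={shlex.quote(repo)}
        helm repo add istio "$ISTIO_REPO" 2>/dev/null || true
        helm repo update istio
        """)
```

Passing the version through a shell variable rather than interpolating it into every helm command keeps the rest of the script identical to what the PR already runs.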

agent_namespace = agent.get("namespace", namespace)

# Construct service URL from k8s naming convention (service port 8080)
url = f"http://{name}.{agent_namespace}.svc.cluster.local:8080"

medium

The agent service URL is constructed with a hardcoded port 8080. If kagenti allows agents to run on different ports, this could cause connection issues. If this is a fixed convention from kagenti, adding a comment to clarify this would be helpful. For more robustness, consider if the port can be discovered from the service definition rather than being hardcoded.
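
As a rough sketch of port discovery, the helper below reads the first declared port from a Kubernetes Service manifest represented as a plain dict (the `spec.ports[].port` field of the Service API) and falls back to 8080. The function names, the fallback constant, and the idea that a Service dict is available at this point are all assumptions; how the manifest would be fetched, and what kagenti's own response contains, is not shown in the PR excerpt.

```python
from typing import Optional

DEFAULT_AGENT_PORT = 8080  # assumed fallback, matching the port hardcoded in the PR


def service_port(service: dict, default: int = DEFAULT_AGENT_PORT) -> int:
    """Pick the first declared port from a Kubernetes Service manifest dict."""
    ports = service.get("spec", {}).get("ports") or []
    for entry in ports:
        port = entry.get("port")
        if isinstance(port, int):
            return port
    return default


def agent_url(name: str, namespace: str, service: Optional[dict] = None) -> str:
    """Build the in-cluster URL, using the discovered port when one is declared."""
    port = service_port(service or {})
    return f"http://{name}.{namespace}.svc.cluster.local:{port}"
```

If 8080 is in fact a fixed kagenti convention, the fallback alone reproduces the current behavior and a clarifying comment may be enough.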

- Replace Docker/registry-based providers with network-only providers
- Add kagenti agent sync cron and provider health check refresh
- Expose otel-collector via HTTPRoute for local agent telemetry
- Upgrade Phoenix image to 12.31.2 for GraphQL API compatibility
- Fix server traces reaching otel-collector (port 4318→8335)
- Set default OTEL endpoint in Python SDK for local deployments
- Simplify provider model: ProviderState (online/offline) replaces
  ProviderType/ProviderStatus/ProviderUnmanagedStatus
- Remove Docker image labels, GitHub version resolving, provider builds
- Fix checkbox selection UX in agent remove command

Signed-off-by: Radek Ježek <radek.jezek@ibm.com>
@jezekra1 jezekra1 force-pushed the poc/kagenti-integration branch from a95917a to 898c888 Compare March 12, 2026 07:26