@Copilot Copilot AI commented Sep 26, 2025

This PR implements automatic bridge creation and deletion for bridge nodes that only connect containerlab-managed interfaces, addressing issue #2267.

Problem

Previously, users had to manually create and manage Linux bridges when using the bridge node kind in topologies. This created friction for the common use case where bridges only connect containerlab nodes (internal-only bridges) and don't interface with external systems.

# Before: Required manual bridge creation
topology:
  nodes:
    srl1:
      kind: nokia_srlinux
    srl2: 
      kind: nokia_srlinux
    # note: bridge br-clab must be created manually
    br-clab:
      kind: bridge
  links:
    - endpoints: ["srl1:e1-1", "br-clab:eth1"]
    - endpoints: ["srl2:e1-1", "br-clab:eth2"]

Solution

The implementation adds automatic bridge lifecycle management through:

1. Automatic Bridge Creation

  • PreDeploy Hook: Bridge nodes now check if the bridge exists during the pre-deploy phase
  • Auto-Creation: If the bridge doesn't exist in the host namespace, it's automatically created and brought up
  • Tracking: Bridges created by containerlab are marked for proper cleanup
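
As an illustration of this creation flow, here is a minimal sketch using the vishvananda/netlink package; the helper name ensureBridge and its signature are assumptions for illustration, not the actual PreDeploy() code in nodes/bridge/bridge.go:

package bridgeutil // hypothetical package, for illustration only

import (
    "fmt"

    "github.com/vishvananda/netlink"
)

// ensureBridge creates the named Linux bridge in the host namespace if it does
// not already exist and brings it up. The boolean result tells the caller
// whether this call created the bridge, so it can be tracked for cleanup.
func ensureBridge(name string) (bool, error) {
    if _, err := netlink.LinkByName(name); err == nil {
        // Bridge already exists (e.g. user-managed): leave it untouched.
        return false, nil
    }
    br := &netlink.Bridge{LinkAttrs: netlink.LinkAttrs{Name: name}}
    if err := netlink.LinkAdd(br); err != nil {
        return false, fmt.Errorf("creating bridge %q: %w", name, err)
    }
    if err := netlink.LinkSetUp(br); err != nil {
        return true, fmt.Errorf("bringing up bridge %q: %w", name, err)
    }
    return true, nil
}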

2. Automatic Bridge Deletion

  • Smart Cleanup: During topology destruction, bridges are automatically deleted if:
    • They have no remaining slave interfaces after container cleanup
    • They appear to be containerlab-managed (based on creation tracking or naming patterns)
    • They are in the host namespace (container namespace bridges are handled differently)
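
The "no remaining slave interfaces" check could look roughly like the sketch below (same assumed package and imports as the creation sketch above; bridgeHasSlaves is a hypothetical helper, not the actual implementation):

// bridgeHasSlaves reports whether any interface in the host namespace is still
// enslaved to the bridge, i.e. whether its master index points at the bridge.
func bridgeHasSlaves(name string) (bool, error) {
    br, err := netlink.LinkByName(name)
    if err != nil {
        return false, fmt.Errorf("looking up bridge %q: %w", name, err)
    }
    links, err := netlink.LinkList()
    if err != nil {
        return false, err
    }
    for _, l := range links {
        if l.Attrs().MasterIndex == br.Attrs().Index {
            return true, nil
        }
    }
    return false, nil
}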

3. Safety Mechanisms

  • Conservative Logic: Only deletes bridges that are clearly containerlab-managed
  • Timing Handling: Polls for remaining slave interfaces, since bridge deletion may be attempted before container interface cleanup has completed
  • Preservation: Manually created bridges or those managed by other tools are preserved
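
A possible shape for the polling mentioned under Timing Handling, again only a sketch that reuses the hypothetical bridgeHasSlaves helper above and additionally needs the standard context and time imports:

// waitForBridgeEmpty polls until the bridge has no slave interfaces left or
// the context expires, covering the window in which container interfaces are
// still being torn down after node deletion. Interval and timeout are chosen
// by the caller.
func waitForBridgeEmpty(ctx context.Context, name string, interval time.Duration) error {
    ticker := time.NewTicker(interval)
    defer ticker.Stop()
    for {
        hasSlaves, err := bridgeHasSlaves(name)
        if err != nil {
            return err
        }
        if !hasSlaves {
            return nil
        }
        select {
        case <-ctx.Done():
            return fmt.Errorf("bridge %q still has slave interfaces: %w", name, ctx.Err())
        case <-ticker.C:
        }
    }
}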

Key Changes

  • nodes/bridge/bridge.go:

    • Added PreDeploy() method for automatic bridge creation
    • Enhanced Delete() method with intelligent cleanup logic
    • Added shouldAutoDelete() heuristics for safe bridge identification (see the sketch after this list)
    • Modified CheckDeploymentConditions() to not require pre-existing bridges
  • links/endpoint_bridge.go:

    • Updated CheckBridgeExists() to allow non-existent bridges for auto-creation scenarios
    • Added appropriate logging for bridge auto-creation
  • lab-examples/br01/br01.clab.yml:

    • Updated documentation to reflect that manual bridge creation is no longer required
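
For reference, the shouldAutoDelete() heuristics could be captured along these lines; the parameters and the "clab-br" naming prefix are illustrative assumptions rather than the actual logic (requires the standard strings import):

// shouldAutoDelete applies the conservative heuristic described above: only a
// bridge with no remaining slaves that containerlab created itself, or whose
// name matches an assumed containerlab naming pattern, is removed.
func shouldAutoDelete(name string, createdByClab, hasSlaves bool) bool {
    if hasSlaves {
        // Never remove a bridge that still carries interfaces.
        return false
    }
    return createdByClab || strings.HasPrefix(name, "clab-br")
}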

Behavior

# After: No manual bridge creation needed
topology:
  nodes:
    srl1:
      kind: nokia_srlinux
    srl2:
      kind: nokia_srlinux  
    # Bridge will be auto-created and auto-deleted
    br-clab:
      kind: bridge
  links:
    - endpoints: ["srl1:e1-1", "br-clab:eth1"] 
    - endpoints: ["srl2:e1-1", "br-clab:eth2"]

Deploy: Containerlab creates the bridge if it doesn't exist
Destroy: Containerlab removes the bridge if no interfaces remain and it was auto-created

Testing

Verified the implementation handles:

  • ✅ Auto-creation of missing bridges during deployment
  • ✅ Auto-deletion of containerlab-managed bridges during destroy
  • ✅ Preservation of manually created or external bridges
  • ✅ Proper handling of bridges with remaining interfaces
  • ✅ Backward compatibility with existing topologies

This change significantly improves the user experience for internal-only bridge scenarios while maintaining safety and backward compatibility.

Original prompt

This section details the original issue you should resolve

<issue_title>Manage the lifecycle of an (internal-only) bridge </issue_title>
<issue_description>I recently started using the bridge node kind to connect multiple nodes to the same access domain. In other words, I use the bridge to connect multiple containerlab nodes defined in the same topology, and not connect the bridge to any external or other interfaces on the host. The documentation states you're supposed to bring up the bridge yourself. This makes perfect sense when the bridge is connected to interfaces outside of control of containerlab - the user will need to set up those anyway. But for the use case where the bridge only connects to containerlab nodes (like in the br01 example), I feel it would make for a better user experience if containerlab attempts to take ownership of the entire lifecycle for the bridge.

For instance, if a bridge node exists in the topology, for the deploy and destroy commands:

  • [deploy] containerlab creates the bridge if it does not exist
  • [deploy] containerlab connects node endpoints as listed to the bridge
  • [destroy] containerlab removes the node endpoints it created on the bridge
  • [destroy] containerlab removes the bridge if no interfaces remain connected to the bridge on topology destroy, after having removed the nodes

WDYT? I am not super familiar with containerlab internals, but I am guessing these operations would need to be serialized before and after concurrently processing nodes?</issue_description>

Comments on the Issue (you are @copilot in this section)

<comment_new>@hellt Hi, this sounds logical to me. The reason we never did that was
  • did not have this use case as a priority
  • more work

:)

But I'd be down to review a PR for it, especially considering the safeguards around not trying to remove the bridge if it existed before the deploy command</comment_new>
<comment_new>@steiler
Checking and setup of the bridge should be simple via implementing the PreDeploy() func on the bridge node.
The destroy action is a little more tricky. It is not a phased process or so ... we simply call delete on the nodes and that's it.
You could spawn a go routine that remains running still trying to delete ... but that seems ugly to me. So the delete call on the bridge should maybe block, but when are all other nodes removed...
maybe create a second delete phase and also pull node.DeleteNetnsSymlink() into that phase ... so remove DeleteNetnsSymlink() from the node interface and implicitly call it from deleteStageTwo() or so ... in which then also the bridge interfaces are being checked...
Or try to check via the list of links, if links would remain after all the links the bridge knows of via the topology file are removed.</comment_new>
<comment_new>@hellt
What if we had a clean up stage that is called after the nodes destroy?
Something that potentially can be exposed to a user by introducing two more top level knobs

Setup
Teardown

With a list of commands to run using teardown.commands and setup.commands</comment_new>

Fixes #2267


@Copilot Copilot AI changed the title [WIP] Manage the lifecycle of an (internal-only) bridge Implement automatic bridge lifecycle management for internal-only bridges Sep 26, 2025
@Copilot Copilot AI requested a review from hellt September 26, 2025 08:26