@catherinetcai

Adds support for peer configuration that's in YAML format according to: #1393

Peer configuration from both the consolidated annotation and the individual annotations now populates a single config struct.

Tested this by spinning up a cluster using https://github.com/aauren/kube-router-automation, adding the following annotations to the aws-controller and aws-worker nodes, and then running kubectl exec into the pods to validate that the BGP configurations were picked up.

apiVersion: v1
kind: Node
metadata:
  name: aws-controller
  annotations:
    kube-router.io/peer.ips: "10.95.0.254"
    kube-router.io/peer.asns: "4200000001"
    kube-router.io/peer.localips: "192.168.1.0"
---
apiVersion: v1
kind: Node
metadata:
  name: aws-worker
  annotations:
    kube-router.io/peers: |
      - remoteip: 10.0.0.1
        remoteasn: 64640
        password: cGFzc3dvcmQ=
        localip: 192.168.0.1

I'm going to keep this PR in draft until I can run a more thorough testing cycle with kubetest2.
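For readers following along, here is a minimal sketch of how the individual annotations could be folded into one struct per peer. It is not the code from this PR: `PeerConfig`, `peersFromLegacyAnnotations`, and `splitCSV` are hypothetical names, and the sketch assumes the individual annotations are comma-separated lists that line up by position.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// PeerConfig is a hypothetical stand-in for the PR's config struct.
type PeerConfig struct {
	RemoteIP  string
	RemoteASN uint32
	LocalIP   string
}

// splitCSV splits a comma-separated annotation value and trims each item.
// An empty or whitespace-only value yields nil.
func splitCSV(s string) []string {
	if strings.TrimSpace(s) == "" {
		return nil
	}
	parts := strings.Split(s, ",")
	for i := range parts {
		parts[i] = strings.TrimSpace(parts[i])
	}
	return parts
}

// peersFromLegacyAnnotations builds one PeerConfig per entry in peer.ips,
// pairing each IP with the ASN (and optional local IP) at the same index.
func peersFromLegacyAnnotations(ann map[string]string) ([]PeerConfig, error) {
	ips := splitCSV(ann["kube-router.io/peer.ips"])
	asns := splitCSV(ann["kube-router.io/peer.asns"])
	localIPs := splitCSV(ann["kube-router.io/peer.localips"])

	if len(ips) != len(asns) {
		return nil, fmt.Errorf("got %d peer IPs but %d ASNs", len(ips), len(asns))
	}
	if len(localIPs) != 0 && len(localIPs) != len(ips) {
		return nil, fmt.Errorf("got %d peer IPs but %d local IPs", len(ips), len(localIPs))
	}

	peers := make([]PeerConfig, 0, len(ips))
	for i, ip := range ips {
		asn, err := strconv.ParseUint(asns[i], 10, 32)
		if err != nil {
			return nil, fmt.Errorf("invalid ASN %q: %w", asns[i], err)
		}
		p := PeerConfig{RemoteIP: ip, RemoteASN: uint32(asn)}
		if len(localIPs) != 0 {
			p.LocalIP = localIPs[i]
		}
		peers = append(peers, p)
	}
	return peers, nil
}

func main() {
	peers, err := peersFromLegacyAnnotations(map[string]string{
		"kube-router.io/peer.ips":      "10.95.0.254",
		"kube-router.io/peer.asns":     "4200000001",
		"kube-router.io/peer.localips": "192.168.1.0",
	})
	fmt.Println(peers, err)
}
```

The consolidated kube-router.io/peers annotation could then be unmarshalled into the same slice type, so that everything downstream only ever sees one representation.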

@aauren

Collaborator

aauren left a comment

@catherinetcai Thanks for starting work on this!

Most of my comments are nits; there are no major problems that I can see with this work. I like the fact that you pulled all of the BGP parsing into functions on a struct. I think that already goes a long way towards making the code more readable.

The other main thing I was looking for was keeping backwards compatibility, but it looks like you've done that pretty well.

The configuration structure itself looks pretty good, although I'm still mulling over whether there is some way to reduce duplication between entries. For instance, people are likely to have similar Password, RemoteASN, and RemoteIP values for many of the peers. Maybe it would be possible to define a global default that people could specify once, and then allow individual entries to override that default when necessary?

Or maybe we could introduce a concept of peer groups, which would follow more of the GoBGP standard? See https://github.com/osrg/gobgp/blob/master/docs/sources/configuration.md?plain=1#L175-L187

FRR has the same concept: https://github.com/aauren/kube-router-automation/blob/main/ansible/playbooks/roles/bgp_router/templates/frr.conf.j2#L36-L42
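To make the "global default with per-entry override" idea concrete, here is a rough sketch. Nothing in it comes from the PR: `PeerConfig` is a simplified stand-in and `applyDefaults` is a hypothetical helper. It assumes an empty string or a zero ASN means "not set", which would need revisiting if zero values ever become meaningful.

```go
package main

import "fmt"

// PeerConfig is a simplified, hypothetical stand-in for the PR's struct.
type PeerConfig struct {
	RemoteIP  string
	RemoteASN uint32
	Password  string
}

// applyDefaults returns a copy of peers in which every unset field
// (empty string or zero ASN) is filled in from defaults. Fields that a
// peer sets explicitly always win over the default.
func applyDefaults(defaults PeerConfig, peers []PeerConfig) []PeerConfig {
	out := make([]PeerConfig, len(peers))
	for i, p := range peers {
		if p.RemoteIP == "" {
			p.RemoteIP = defaults.RemoteIP
		}
		if p.RemoteASN == 0 {
			p.RemoteASN = defaults.RemoteASN
		}
		if p.Password == "" {
			p.Password = defaults.Password
		}
		out[i] = p
	}
	return out
}

func main() {
	defaults := PeerConfig{RemoteASN: 65000, Password: "cGFzc3dvcmQ="}
	peers := []PeerConfig{
		{RemoteIP: "10.0.0.1"},                   // inherits ASN and password
		{RemoteIP: "10.0.0.2", RemoteASN: 64640}, // overrides only the ASN
	}
	for _, p := range applyDefaults(defaults, peers) {
		fmt.Printf("%s asn=%d\n", p.RemoteIP, p.RemoteASN)
	}
}
```

A peer-group design would look similar, except that each peer would name the group it inherits from instead of sharing one global default.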

@catherinetcai
Author

I really like the idea of following more of the GoBGP standard. Do you have any thoughts on how those should be passed in?

@aauren
Collaborator

aauren commented Nov 2, 2025

Heh... I think it felt more obvious to me before I had thought it through all the way. I was thinking that you could just add a peerGroup to the annotation YAML and then reference it. But the annotation is per-node, not cluster-wide. I suppose we could add a config file or let users define a group via a parameter to kube-router, but those all feel a bit like a hack.

I guess for now we leave it as it is. But maybe in the future we add a CRD or something? This is similar to what Cilium does: https://docs.cilium.io/en/stable/network/bgp-control-plane/bgp-control-plane-v2 Theirs is a bit different, though, because they have nodeSelectors in their BGP config so that users can control which nodes a configuration applies to, which I suppose is more idiomatic for k8s.

if err := yaml.Unmarshal([]byte(nodeBgpPeersAnnotation), &peerConfigs); err != nil {
	return nil, fmt.Errorf("failed to parse %s annotation: %w", peersAnnotation, err)
}
klog.Infof("Peer config from %s annotation: %+v", peersAnnotation, peerConfigs)
Collaborator

Just realized that this has the potential to leak BGP secrets into logs. I think this should be fairly easy to avoid if we do two things. First, don't log the peersAnnotation, here or in the line above. That makes the line less helpful, so maybe we could just direct the user to look at the node's annotations? Second, add a .String() function to bgp.PeerConfigs that avoids printing the password attribute.

Author

Great suggestion! Updated the PeerConfig struct with a String() function, as you suggested, that omits the password when printing.


for _, tc := range testCases {
	t.Run(tc.name, func(t *testing.T) {
		bgpPeerCfgs, err := bgpPeerConfigsFromAnnotations(tc.nodeAnnotations)
Collaborator

The tests are currently failing because this call is missing the local address parameter.

Author

🤦🏼‍♀️ Super embarrassing, should have caught that. Sorry. This was sloppy.

if len(remoteIPs) != len(ports) && len(ports) != 0 {
	return fmt.Errorf("invalid peer router config. The number of ports should either be zero, or "+
		"one per peer router. If blank items are used, it will default to standard BGP port, %s. "+
		"Example: \"port,,port\" OR [\"port\",\"\",\"port\"]", strconv.Itoa(options.DefaultBgpPort))
Collaborator

If we return an error here, then we wouldn't actually use the default option, right? I believe that returning an error here would mean that the BGP server fails to start in startBGPServer().

It's possible that I misunderstood the code path here, but if I'm right, then we may want to change the helpers above to have an else branch so that we always have a default. We could then remove the default blurb from this message, since by this point the default will already have been applied.

Alternatively we could turn these into a warn or an info log rather than an error.
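A rough sketch of the first option, where the helper always fills in a default and the error is reserved for a genuine count mismatch. `portsWithDefaults` and `defaultBGPPort` are hypothetical names; the constant stands in for options.DefaultBgpPort and is assumed here to be 179, the standard BGP port.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// defaultBGPPort stands in for options.DefaultBgpPort (assumed to be 179).
const defaultBGPPort = 179

// portsWithDefaults returns exactly one port per peer. A missing list or
// a blank item falls back to defaultBGPPort, so callers never see an
// unset port. An error is returned only when the list is non-empty and
// its length differs from numPeers, or when an item is not a valid port.
func portsWithDefaults(raw []string, numPeers int) ([]uint32, error) {
	if len(raw) != 0 && len(raw) != numPeers {
		return nil, fmt.Errorf("invalid peer router config: got %d ports for %d peers, "+
			"expected zero or one per peer", len(raw), numPeers)
	}
	ports := make([]uint32, numPeers)
	for i := range ports {
		ports[i] = defaultBGPPort
		if i >= len(raw) {
			continue
		}
		item := strings.TrimSpace(raw[i])
		if item == "" {
			continue // blank item keeps the default
		}
		p, err := strconv.ParseUint(item, 10, 16)
		if err != nil {
			return nil, fmt.Errorf("invalid port %q: %w", raw[i], err)
		}
		ports[i] = uint32(p)
	}
	return ports, nil
}

func main() {
	fmt.Println(portsWithDefaults([]string{"1790", "", "1791"}, 3))
	fmt.Println(portsWithDefaults(nil, 2))
	fmt.Println(portsWithDefaults([]string{"1790"}, 2))
}
```

With the default applied inside the helper, the validation message no longer has to promise a fallback that the error path never delivers.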

docs/bgp.md Outdated
remoteasn: 65000
password: U2VjdXJlUGFzc3dvcmQK,
- remoteip: 192.168.1.100
remoteasn: 65000'
Collaborator

We have a trailing apostrophe here, and trailing commas on the base64 passwords I think.

Author

Ah, fail, thank you.
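As a small illustration of why the stray comma matters: assuming the password value is decoded with standard base64, a comma that ends up inside the value makes decoding fail, because `,` is not part of the base64 alphabet. `validatePassword` below is a hypothetical helper, not something from kube-router.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// validatePassword reports whether encoded is well-formed standard base64.
// It is a hypothetical helper used only to illustrate the failure mode.
func validatePassword(encoded string) error {
	if _, err := base64.StdEncoding.DecodeString(encoded); err != nil {
		return fmt.Errorf("password is not valid base64: %w", err)
	}
	return nil
}

func main() {
	// The value as intended in the docs example.
	fmt.Println(validatePassword("U2VjdXJlUGFzc3dvcmQK"))
	// The same value with the trailing comma from the docs snippet.
	fmt.Println(validatePassword("U2VjdXJlUGFzc3dvcmQK,"))
}
```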

…ace for PeerConfig structs. Break out ValToPtr functions into a testutils package.