Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added autodetect of interfaces on private networks #893

Closed
wants to merge 169 commits into from

Conversation

aldernero
Copy link
Contributor

What this PR does:
This PR sets the interface lists in cfg.InfNames to contain all network interfaces that have an address on a private network, conformant to RFC1918. This replaces the static []string{"en0", "eth0"} that is currently used.

If the new function, PrivateNetworkInterfaces doesn't find any interfaces matching the RFC1918 criteria, it defaults to [en0, eth0].

Which issue(s) this PR fixes:

Fixes #881

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but the change in dskit needs to happen in github.com/grafana/dskit repository first.

@@ -76,7 +77,7 @@ func (cfg *LifecyclerConfig) RegisterFlagsWithPrefix(prefix string, f *flag.Flag
panic(fmt.Errorf("failed to get hostname %s", err))
}

cfg.InfNames = []string{"eth0", "en0"}
cfg.InfNames = util.PrivateNetworkInterfaces()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Files under vendor/ directory are library code, updated using go mod vendor. This one lives at github.com/grafana/dskit, and needs to be updated there. Note that dskit cannot use code from Mimir, the dependency is from Mimir to dskit instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This implies that util.PrivateNetworkInterfaces() may need to live in dskit).

pstibrany and others added 2 commits January 26, 2022 10:41
* Remove sharding strategy from store-gateway.

Signed-off-by: Peter Štibraný <[email protected]>
Copy link
Contributor

@simonswine simonswine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for providing the draft PR. I think it is going into the right direction. I agree with all that @pstibrany said.

I also do think we need to look at the order of the interfaces that are returned as that list is gonna be used later to determine an IP address with https://github.com/grafana/dskit/blob/0bff37e60ff46c6f7e634abd42366aeebe0784bf/ring/util.go#L176

For my local machine the list is:

[wlp0s20f3 (my wifi) br-240fb45644c4 (another docker brdige) docker0 enp0s20f0u4u4 (my ethernet that has the default route)]

So it would always choose the IP address of the wifi adapter rather than the ethernet.

I think compared to what we have currently (hardcoded list of interfaces) this is only a minor problem. But maybe there is a way listing the route table and take the interface names from there. (Might be not that os independent, I don't think it exists in net)

* Remove -ruler.sharding-strategy from ruler.

Signed-off-by: Peter Štibraný <[email protected]>
@simonswine
Copy link
Contributor

As a small addition, this is my routing table:

$ ip route
default via 172.20.74.254 dev enp0s20f0u4u4 proto dhcp metric 100
default via 172.20.74.254 dev wlp0s20f3 proto dhcp metric 600
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.18.0.0/16 dev br-240fb45644c4 proto kernel scope link src 172.18.0.1 linkdown
172.20.74.0/24 dev enp0s20f0u4u4 proto kernel scope link src 172.20.74.176 metric 100
172.20.74.0/24 dev wlp0s20f3 proto kernel scope link src 172.20.74.200 metric 600

With those rules, we should be able to determine the interfaces:

  • Select interface routes with destination network in the private space
  • Failing that use the default route
  • The interface with the smallest metric for destination is preferred

Not too sure if that is a over engineered, but I think that approach could lead to quite good results

* Remove distributor.sharding-strategy option.

Signed-off-by: Peter Štibraný <[email protected]>
Copy link
Collaborator

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! I've some comments here, even if final code will move to dskit.

I see @simonswine feedback as a nice addition. I would not block this PR on that, and I would try to work on it as a follow up PR.

level.Warn(util_log.Logger).Log("msg", "error getting interfaces", "err", err)
}
for _, i := range ints {
fmt.Println(i.Index, i.Name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be removed.

// Parses network interfaces and returns those having an address conformant to RFC1918
func PrivateNetworkInterfaces() []string {
var privInts []string
ints, err := net.Interfaces()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move the logic to a private function that takes the list of interfaces in input, so that you can test it. My comment is about adding unit tests to this function :)

}
}
if len(privInts) == 0 {
level.Warn(util_log.Logger).Log("msg", "couldn't find any interfaces on private networks, defaulting to [eth0 en0]")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit dubious about this log because:

  1. It gets logged each time we call PrivateNetworkInterfaces()
  2. It gets logged even if the user overrides the config, so the result of PrivateNetworkInterfaces() doesn't really matter for them (this comment applies to other logs too)

@simonswine
Copy link
Contributor

I see @simonswine feedback as a nice addition. I would not block this PR on that, and I would try to work on it as a follow up PR.

Totally agree with this. What this PR contains right now, is already a significant improvement, let's get in first.

pracucci and others added 19 commits January 26, 2022 12:16
Signed-off-by: Marco Pracucci <[email protected]>
* set initial list of mimir maintainers

Signed-off-by: Mauro Stettler <[email protected]>

* formatting

Signed-off-by: Mauro Stettler <[email protected]>

* link to maintainer accounts

* formatting

Signed-off-by: Mauro Stettler <[email protected]>

* sort

Signed-off-by: Mauro Stettler <[email protected]>

* remove whitespace

Signed-off-by: Mauro Stettler <[email protected]>

* add company names beside maintainers

Signed-off-by: Mauro Stettler <[email protected]>

* newline at end of file

Signed-off-by: Mauro Stettler <[email protected]>

* formatting

Signed-off-by: Mauro Stettler <[email protected]>

* remove people not actively working on Mimir

Signed-off-by: Mauro Stettler <[email protected]>

* update team members and add maintainers per sub-pkgs

Signed-off-by: Mauro Stettler <[email protected]>

* add newline at end of file

Signed-off-by: Mauro Stettler <[email protected]>

* remove krajo

Signed-off-by: Mauro Stettler <[email protected]>

* removing steve

Signed-off-by: Mauro Stettler <[email protected]>

* add cyril as maintainer for frontend and fix formatting

Signed-off-by: Mauro Stettler <[email protected]>

* remove christian simon

Signed-off-by: Mauro Stettler <[email protected]>

* add newline at the end

Signed-off-by: Mauro Stettler <[email protected]>

* include sub-packages in pkg paths

Signed-off-by: Mauro Stettler <[email protected]>

* add comment for Cyril

Signed-off-by: Mauro Stettler <[email protected]>

* add former team members sections

Signed-off-by: Mauro Stettler <[email protected]>

* remove company from previous team members

Signed-off-by: Mauro Stettler <[email protected]>

* use em-dash

Signed-off-by: Mauro Stettler <[email protected]>

* Add krajorama & stevesg to team members

Signed-off-by: Richard Hartmann <[email protected]>

* Update MAINTAINERS.md

Signed-off-by: Richard Hartmann <[email protected]>

Co-authored-by: Richard Hartmann <[email protected]>
* Change default -server.http-listen-port=8080

Signed-off-by: Dimitar Dimitrov <[email protected]>

* Update default httpPort used in e2e tests

Signed-off-by: Dimitar Dimitrov <[email protected]>

* Update config-file-reference.md

Signed-off-by: Dimitar Dimitrov <[email protected]>

* Update backward compatibility tests

Signed-off-by: Dimitar Dimitrov <[email protected]>

* Update CHANGELOG.md

Signed-off-by: Dimitar Dimitrov <[email protected]>

* Replace usages of port 80 with port 8080

Signed-off-by: Dimitar Dimitrov <[email protected]>

* Update docs

Signed-off-by: Dimitar Dimitrov <[email protected]>

* Add container_http_port as config to jsonnet

Signed-off-by: Dimitar Dimitrov <[email protected]>

* Rename container_http_port to server_http_port

Signed-off-by: Dimitar Dimitrov <[email protected]>

* Use server_http_port to set `-server.http-listen-port` flag

Signed-off-by: Dimitar Dimitrov <[email protected]>
* Upstream jsonnet config used at Grafana Labs

Signed-off-by: Marco Pracucci <[email protected]>

* Rebuilt tests

Signed-off-by: Marco Pracucci <[email protected]>
* Change the default of ignore_deletion_mark_delay to 1h

* Remove reference to actual default value in docs
* cmd/mimir: Detect flags' categories

Signed-off-by: Arve Knudsen <[email protected]>

* compactor: Mark BlockRanges as advanced

Signed-off-by: Arve Knudsen <[email protected]>

* Implement help flag explicitly

Signed-off-by: Arve Knudsen <[email protected]>

* Add advanced and experimental help flags

Signed-off-by: Arve Knudsen <[email protected]>

* Fix printing of non-configuration param flags

Signed-off-by: Arve Knudsen <[email protected]>

* Fix quoting of default string values

Signed-off-by: Arve Knudsen <[email protected]>

* Add to changelog

Signed-off-by: Arve Knudsen <[email protected]>

* Introduce -help-all flag instead of -help-advanced etc

Signed-off-by: Arve Knudsen <[email protected]>

* Print [experimental] before experimental flags

Signed-off-by: Arve Knudsen <[email protected]>

* Mark querier.at_modifier_enabled experimental

Signed-off-by: Arve Knudsen <[email protected]>

* Move experimental prefix before usage string instead

Signed-off-by: Arve Knudsen <[email protected]>

* cmd/mimir: Detect flags' categories

Signed-off-by: Arve Knudsen <[email protected]>

* Implement help flag explicitly

Signed-off-by: Arve Knudsen <[email protected]>

* Add advanced and experimental help flags

Signed-off-by: Arve Knudsen <[email protected]>

* Fix printing of non-configuration param flags

Signed-off-by: Arve Knudsen <[email protected]>

* Fix quoting of default string values

Signed-off-by: Arve Knudsen <[email protected]>

* Add to changelog

Signed-off-by: Arve Knudsen <[email protected]>

* Introduce -help-all flag instead of -help-advanced etc

Signed-off-by: Arve Knudsen <[email protected]>

* Print [experimental] before experimental flags

Signed-off-by: Arve Knudsen <[email protected]>

* Update cmd/mimir/usage.go

Co-authored-by: Mauro Stettler <[email protected]>

* Move experimental prefix before usage string instead

Signed-off-by: Arve Knudsen <[email protected]>

* Remove special formatting of one-character flags

Signed-off-by: Arve Knudsen <[email protected]>

* Document that the basic flag category is the default

Signed-off-by: Arve Knudsen <[email protected]>

* Add tests

Signed-off-by: Arve Knudsen <[email protected]>

* Add Make target reference-help

Signed-off-by: Arve Knudsen <[email protected]>

* Add hint to -help output

Signed-off-by: Arve Knudsen <[email protected]>

Co-authored-by: Mauro Stettler <[email protected]>
* Fix jsonnet after PRs 892 and 907

Signed-off-by: Marco Pracucci <[email protected]>

* Fix reference help too

Signed-off-by: Marco Pracucci <[email protected]>
This complements the change by #871 in backward_compatibility.go

Signed-off-by: Dimitar Dimitrov <[email protected]>
Signed-off-by: Richard Hartmann <[email protected]>
Tweaking -help-all flag description based on tech writer feedback.

Signed-off-by: Arve Knudsen <[email protected]>
Tweak help output when an unrecognized flag is provided, based on tech
writer feedback.

Signed-off-by: Arve Knudsen <[email protected]>
* Added 'mimir' to default job names in the mixin

Signed-off-by: Marco Pracucci <[email protected]>

* Fixed jobMatcher() and jobSelector()

Signed-off-by: Marco Pracucci <[email protected]>

* Fix job names selector in alerts too

Signed-off-by: Marco Pracucci <[email protected]>
andyasp and others added 27 commits February 9, 2022 09:25
* Remove mentions of obsolete sharding-enabled flags.

Signed-off-by: Peter Štibraný <[email protected]>

* Don't make it sound like ruler sharding is optional.

Signed-off-by: Peter Štibraný <[email protected]>
* Fill in version information

BuildUser and BuildDate are now filled. Swapped the Makefile to also
set the values directly rather than through main, although this was
not a necessary change.

* Format makefiles

* Add defaults and base date off commit

* Transition to using internal version

* Default to unknown directly

Co-authored-by: Peter Štibraný <[email protected]>
* Rename -auth.enabled to -auth.multitenancy-enabled, updated description.

Signed-off-by: Peter Štibraný <[email protected]>

* Review feedback.

Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
* Removed old website

Signed-off-by: Marco Pracucci <[email protected]>

* Rebuilt build image

Signed-off-by: Marco Pracucci <[email protected]>

* Re-introduce prettier to build image

Signed-off-by: Marco Pracucci <[email protected]>

* Updated build image

Signed-off-by: Marco Pracucci <[email protected]>
* Removed test-exporter

Signed-off-by: Marco Pracucci <[email protected]>

* Reintroduce gitignore

Signed-off-by: Marco Pracucci <[email protected]>
* Categorize tenant federation flags

* Categorize server flags

* Use flag category overrides in docgenerator

* Mark tenant federation as basic

* Revise comment to say field instead of flag
* Add activity tracking to queries in ingester.

Signed-off-by: Peter Štibraný <[email protected]>

* Update CHANGELOG.md

Signed-off-by: Peter Štibraný <[email protected]>

* Rename type to make linter happy.

Signed-off-by: Peter Štibraný <[email protected]>

* Add license header.

Signed-off-by: Peter Štibraný <[email protected]>

* Update CHANGELOG.md

Co-authored-by: Mauro Stettler <[email protected]>

* Enable tracking of flush/shutdown and add examples.

Signed-off-by: Peter Štibraný <[email protected]>

* Add license header.

Signed-off-by: Peter Štibraný <[email protected]>

Co-authored-by: Mauro Stettler <[email protected]>
* Remove cortexmetrics references

Signed-off-by: Marco Pracucci <[email protected]>

* Replace 'Cortex' with 'Grafana Mimir' in touched files

Signed-off-by: Marco Pracucci <[email protected]>

* More cleanup

Signed-off-by: Marco Pracucci <[email protected]>
Additionally replace Cortex with Grafana Mimir.

Relates to #1109.

Signed-off-by: Jack Baldry <[email protected]>
* Re-order "Mirror requests to a second cluster"

Additionally replace references to Cortex with Grafana Mimir.

Relates to #1095.

Signed-off-by: Jack Baldry <[email protected]>

* Fix grammar for distributors belonging to clusters plural

Signed-off-by: Jack Baldry <[email protected]>
Co-authored-by: replay <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
* Removed deprecated config

Signed-off-by: Marco Pracucci <[email protected]>

* Remove all references to max_samples_per_query

Signed-off-by: Marco Pracucci <[email protected]>

* Rebuilt help

Signed-off-by: Marco Pracucci <[email protected]>
* Remove sharded ruler documentation

Useful content has been moved to the "(Optional) Ruler" documentation.

Signed-off-by: Jack Baldry <[email protected]>

* Correct that a ruler requires object storage and not a database

Signed-off-by: Jack Baldry <[email protected]>
Co-authored-by: Marco Pracucci <[email protected]>

* Remove comparison to ingesters that is no longer true

Signed-off-by: Jack Baldry <[email protected]>
Co-authored-by: Marco Pracucci <[email protected]>

* Correct count of storages backends and note that filesystem is for development only

Signed-off-by: Jack Baldry <[email protected]>
Co-authored-by: Marco Pracucci <[email protected]>
…e) (#1159)

* Update mimir-prometheus to bring back old chunks mapper (without queue) and use old chunks mapper by default.

Signed-off-by: Peter Štibraný <[email protected]>

* Mark -blocks-storage.tsdb.head-chunks-write-queue-size as experimental.

Signed-off-by: Peter Štibraný <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
* Alertmanager: Remove upstream clustering (gossip) operation.

This commit removes the ability for alertmanager to be run with the upstream
clustering (gossip) functionality. Alertmanager must now be run with sharding
enabled. In essence, `-alertmanager.sharding-enabled=true` is now the default.

This removes the following configuration options which are no longer needed:
- `-alertmanager.sharding-enabled`
- `-alertmanager.cluster.advertise-address`
- `-alertmanager.cluster.gossip-interval`
- `-alertmanager.cluster.listen-address`
- `-alertmanager.cluster.peers`
- `-alertmanager.cluster.push-pull-interval`

Some points of note:
- The `state` field of the `Alertmanager` struct is now always of the type
  `alertmanager.state`, allowing simplification of certain code paths.
- The "status" endpoint for Alertmanager which would display the list of
  peers, now simply shows the service status (Starting, Running, etc).
  The equivalent information for sharding is found in the ring page.
- All remaining unit tests and integration tests which did not run with
  sharding enabled, have now been ported to work with sharding.

* Alertmanager: Rename -alertmanager.cluster.peer-timeout to -alertmanager.peer-timeout
@aldernero
Copy link
Contributor Author

I'm going to close this PR and open a new one that covers integrating the recent dskit change (that replaced much of this).

@aldernero aldernero closed this Feb 10, 2022
@aldernero aldernero deleted the autodetect-rfc1918-interfaces branch February 10, 2022 23:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change default ingester.lifecycler.interface_names[0] to auto detect the interface with default route