Remote snapshotters fails to pull an image #797

Closed
andre-j3sus opened this issue Nov 26, 2024 · 1 comment


Hello everyone,

I followed the Getting Started with Remote Snapshotters guide, but the example command from the documentation fails to pull the image.

I managed to start the three host daemons, but when I run the following command, described in the Remote Snapshotter Example, I get this error:

$ sudo ./remote-snapshotter ghcr.io/firecracker-microvm/firecracker-containerd/amazonlinux:latest-esgz
Docker username: andre-j3sus
Docker password: ...
Creating VM
Setting docker credential metadata
Pulling the image
failed to extract layer sha256:44bffd90bc4b6651d3f09985df4ca649ce6f055b7bc840859c39052c0a02e9e2: failed to mount /var/lib/firecracker-containerd/containerd/tmpmounts/containerd-mount4028773215: no such file or directory: unknown

My setup details and the logs are below:

Setup

  • OS: Ubuntu 20.04 (Focal Fossa)

      $ uname -a
      Linux node-000.ajesus-232873.ntu-cloud-pg0.utah.cloudlab.us 5.4.0-196-generic #216-Ubuntu SMP Thu Aug 29 13:26:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
      $ cat /etc/os-release
      NAME="Ubuntu"
      VERSION="20.04 LTS (Focal Fossa)"
  • Go Version: go version go1.23.3 linux/amd64

  • Firecracker Version: firecracker --version reports Firecracker v1.1.0 (could the issue be related to this version? I tried upgrading to 1.7.0, but I did not try downgrading).

  • SDK version: ~/go/pkg/mod/github.com/firecracker-microvm/[email protected]/

  • firecracker-containerd/config.toml:

    version = 2
    disabled_plugins = ["io.containerd.grpc.v1.cri"]
    root = "/var/lib/firecracker-containerd/containerd"
    state = "/run/firecracker-containerd"
    [grpc]
      address = "/run/firecracker-containerd/containerd.sock"
    [plugins]
      [plugins."io.containerd.snapshotter.v1.devmapper"]
        pool_name = "fc-dev-thinpool"
        base_image_size = "10GB"
        root_path = "/var/lib/firecracker-containerd/snapshotter/devmapper"
    [proxy_plugins]
      [proxy_plugins.proxy]
        type = "snapshot"
        address = "/var/lib/demux-snapshotter/snapshotter.sock"
    [debug]
      level = "debug"
  • /etc/containerd/firecracker-runtime.json:
    {
      "firecracker_binary_path": "/usr/local/bin/firecracker",
      "kernel_image_path": "/var/lib/firecracker-containerd/runtime/default-vmlinux.bin",
      "kernel_args": "console=ttyS0 pnp.debug=1 noapic reboot=k panic=1 pci=off nomodules ro systemd.unified_cgroup_hierarchy=0 systemd.journald.forward_to_console systemd.unit=firecracker.target init=sbin/overlay-init",
      "root_drive": "/var/lib/firecracker-containerd/runtime/rootfs-stargz.img",
      "log_fifo": "fc-logs.fifo",
      "log_levels": ["debug"],
      "metrics_fifo": "fc-metrics.fifo",
      "default_network_interfaces": [
      {
        "AllowMMDS": true,
        "CNIConfig": {
          "NetworkName": "fcnet",
          "InterfaceName": "veth0"
        }
      }
    ]
    }

Logs

  1. Logs from demux-snapshotter:
 $ sudo snapshotter/demux-snapshotter
 ERRO[0018] Function called without namespaced context    error="namespace is required: failed precondition" function=Walk
 DEBU[0018] no namespace found, proxying walk function to all cached snapshotters  function=Walk
 ERRO[0018] Function called without namespaced context    error="namespace is required: failed precondition" function=Remove
 ERRO[0018] Function called without namespaced context    error="namespace is required: failed precondition" function=Cleanup
 INFO[0018] stopping server                              
 INFO[0018] done        
  2. Logs from http-address-resolver:
 $ sudo snapshotter/http-address-resolver
 INFO[0000] http resolver serving at port 10001 
  3. firecracker-containerd logs are available here: Gist.

Additional Notes

  • Using the devmapper getting started guide, everything works as expected.
  • Pulling the same image using docker or containerd succeeds.
  • I also tried other images, including those from the pre-converted stargz images list, but the issue persists.
  • I noticed that the stargz-snapshotter submodule was pointing to a 2022 commit. I updated it to the current version, but the issue persists.

Please let me know if you need any additional information or testing. I appreciate any help, as I've been struggling to get this tutorial working for a couple of weeks.

andre-j3sus commented Nov 30, 2024

The issue was identified in the firecracker-containerd logs:

DEBU[2024-11-22T16:07:34.311745334-07:00] [    2.956402] containerd-stargz-grpc[783]: {"error":"Get \"https://ghcr.io/v2/firecracker-microvm/firecracker-containerd/amazonlinux/blobs/sha256:4212523da282d68d5e938cc87081bf8e582cd48a4dcbcf9f4b08cf6acc91cae2\": dial tcp: lookup ghcr.io on [::1]:53: read udp [::1]:40854-\u003e[::1]:53: read: connection refused","key":"vm1/4/extract-155055181-C2nq sha256:44bffd90bc4b6651d3f09985df4ca649ce6f055b7bc840859c39052c0a02e9e2","level":"debug","mountpoint":"/var/lib/containerd-stargz-grpc/snapshotter/snapshots/1/fs","msg":"Retrying request","parent":"","src":"ghcr.io/firecracker-microvm/firecracker-containerd/amazonlinux:latest-esgz/sha256:4212523da282d68d5e938cc87081bf8e582cd48a4dcbcf9f4b08cf6acc91cae2","time":"2024-11-22T23:07:34.231190624Z"}  jailer=noop runtime=aws.firecracker vmID=vm1 vmm_stream=stdout

The issue appeared to be related to DNS resolution, and after reviewing this section from the getting started guide, I discovered that Firecracker VMs inherit DNS settings from the host. This caused problems when the host was using systemd-resolved or a similar local resolver that directed DNS queries to localhost.
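The symptom in the log line above (a lookup sent to a resolver at `[::1]:53`) can be spotted mechanically. The following is a minimal sketch, assuming the error text follows the `lookup <host> on <resolver>:<port>` shape seen in that log; the function name `loopback_dns_failures` is hypothetical and not part of firecracker-containerd.

```python
import ipaddress
import re

# Matches resolver error text such as "lookup ghcr.io on [::1]:53".
_LOOKUP = re.compile(r"lookup (\S+) on (\[[^\]]+\]|[0-9.]+):(\d+)")


def loopback_dns_failures(log_lines):
    """Yield (hostname, resolver) pairs for lookups sent to a loopback resolver."""
    for line in log_lines:
        for host, resolver, _port in _LOOKUP.findall(line):
            # IPv6 resolvers are written in brackets, e.g. "[::1]".
            address = resolver.strip("[]")
            try:
                if ipaddress.ip_address(address).is_loopback:
                    yield host, address
            except ValueError:
                # Skip anything that is not a literal IP address.
                continue
```

Feeding it the lines of a saved firecracker-containerd log would list each hostname the VM tried to resolve through its own localhost.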

Note that, by default, the nameserver configuration within your host's /etc/resolv.conf will be parsed and provided to VMs as their nameserver configuration. This can cause problems if your host is using a systemd resolver or other resolver that operates on localhost (which results in the VM using its own localhost as the nameserver instead of your host's). This situation may require manual tweaking of the default CNI configuration, such as specifying static DNS configuration as part of the ptp plugin.
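To check whether a host is likely to hit the situation described in that note, one option is to look for loopback nameservers in the host's resolv.conf. This is a rough sketch under that assumption; `loopback_nameservers` is a hypothetical helper name, and it only inspects `nameserver` lines.

```python
import ipaddress


def loopback_nameservers(resolv_conf_text):
    """Return the nameserver entries in resolv.conf text that are loopback addresses."""
    found = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # resolv.conf comments start with '#' or ';'.
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if len(fields) < 2 or fields[0] != "nameserver":
            continue
        # Drop an IPv6 zone suffix such as "%eth0" before parsing.
        address = fields[1].split("%", 1)[0]
        try:
            if ipaddress.ip_address(address).is_loopback:
                found.append(fields[1])
        except ValueError:
            # Ignore entries that are not valid IP addresses.
            continue
    return found
```

Calling it on the contents of the host's /etc/resolv.conf returns a non-empty list when a local resolver such as systemd-resolved (typically 127.0.0.53) is in use, which suggests the VMs will need static DNS settings.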

To resolve the issue, I updated the CNI configuration (./tools/demo/fcnet.conflist) to manually specify DNS settings for the VMs. The updated configuration is as follows:

{
  "cniVersion": "1.0.0",
  "name": "fcnet",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "fc-br0",
      "isDefaultGateway": true,
      "forceAddress": false,
      "ipMasq": true,
      "hairpinMode": true,
      "mtu": 1500,
      "ipam": {
        "type": "host-local",
        "subnet": "192.168.1.0/24",
        "resolvConf": "/etc/resolv.conf"
      },
      "dns": {
        "nameservers": ["128.110.156.4", "1.1.1.1", "8.8.8.8"]
      }
    },
    {
      "type": "firewall"
    },
    {
      "type": "tc-redirect-tap"
    },
    {
      "type": "loopback"
    }
  ]
}
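As a sanity check on a conflist like the one above, the sketch below lists the static nameservers each plugin declares. It assumes only the `plugins`, `type`, and `dns.nameservers` keys visible in that configuration; `static_nameservers` is a hypothetical helper name, not a CNI or firecracker-containerd API.

```python
import json


def static_nameservers(conflist_text):
    """Map each plugin type to the nameservers declared under its "dns" key."""
    conflist = json.loads(conflist_text)
    result = {}
    for plugin in conflist.get("plugins", []):
        servers = plugin.get("dns", {}).get("nameservers", [])
        if servers:
            result[plugin.get("type", "<unknown>")] = list(servers)
    return result
```

An empty result means no plugin sets static DNS, in which case the VMs fall back to the nameservers parsed from the host's resolv.conf.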

This manual DNS configuration resolved the issue, and the system is functioning as expected now.
Therefore, I am closing this issue.

Additionally, I will now look at #761 and try to reproduce it. It seems related to this issue, but also to the Docker credentials setup.
