dragonfly cache folder is empty after successful preheat #3699

Open

dusagu opened this issue Dec 11, 2024 · 3 comments

dusagu commented Dec 11, 2024

Problem

The Dragonfly cache folder is empty after a successful preheat. Data only appears when an image is actually pulled through the peer.
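
For context, a rough sketch of how the symptom shows up on the host (the data path comes from the volume mounts in the docker-compose.yml further down; the image reference is purely illustrative):

# After a preheat reports success, the host directory backing the peer's
# dataDir is still empty (host path taken from the compose file below):
ls -la /mnt/cache/dragonfly          # -> empty

# Only after pulling an image through the peer (e.g. via the configured
# registry mirror/proxy) does task data appear:
docker pull harbor.example.com/library/app:latest   # illustrative image reference
ls -la /mnt/cache/dragonfly          # -> task data directories show up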

Environment

OS: Debian 12
Version: latest
Deployed with Docker Compose

89b9e6667f48   dragonflyoss/dfdaemon:latest   "/opt/dragonfly/bin/…"   36 minutes ago   Up 36 minutes (healthy)   0.0.0.0:65000-65002->65000-65002/tcp, 0.0.0.0:65006->65006/tcp, 0.0.0.0:65008->65008/tcp   peer


Config

/etc/dragonfly/seed-peer.yml

# daemon alive time; when set to 0s, the daemon will not exit automatically
# it is useful for long-running deployments
aliveTime: 0s

# daemon gc task running interval
gcInterval: 1h

# WorkHome is working directory.
# In linux, default value is /usr/local/dragonfly.
# In macos(just for testing), default value is /Users/$USER/.dragonfly.
workHome: ''

# logDir is the log directory.
# In linux, default value is /var/log/dragonfly.
# In macos(just for testing), default value is /Users/$USER/.dragonfly/logs.
logDir: ''

# cacheDir is dynconfig cache directory.
# In linux, default value is /var/cache/dragonfly.
# In macos(just for testing), default value is /Users/$USER/.dragonfly/cache.
cacheDir: ''

# pluginDir is the plugin directory.
# In linux, default value is /usr/local/dragonfly/plugins.
# In macos(just for testing), default value is /Users/$USER/.dragonfly/plugins.
pluginDir: ''

# dataDir is the download data directory.
# In linux, default value is /var/lib/dragonfly.
# In macos(just for testing), default value is /Users/$USER/.dragonfly/data.
dataDir: ''

# when the daemon exits, keep peer task data or not
# it is useful when upgrading the daemon service, since all local cache will be kept
# default is false
keepStorage: true

# console shows log on console
console: false

# whether to enable debug level logger and enable pprof
verbose: false

# listen port for pprof, only valid when the verbose option is true
# default is -1. If it is 0, pprof will use a random port.
pprof-port: -1

# jaeger endpoint url, like: http://jaeger.dragonfly.svc:14268/api/traces
jaeger: ""

# all addresses of all schedulers
# the schedulers of all daemons should be the same in one region or zone.
# the daemon will send tasks to a fixed scheduler by hashing the task url and metadata
# caution: only tcp is supported
scheduler:
  manager:
    # get scheduler list dynamically from manager
    enable: true
    # manager service addresses
    netAddrs:
      - type: tcp
        addr: 100.**.**.**:65003
    # scheduler list refresh interval
    refreshInterval: 10m
    seedPeer:
      # Dfdaemon enabled seed peer mode.
      enable: true
      # Seed peer type includes super, strong and weak.
      type: weak
      # Seed peer cluster id.
      clusterID: 1
      keepAlive:
        # Keep alive interval.
        internal: 30s
  # schedule timeout
  scheduleTimeout: 30s
  # when true, the daemon only backs to source when the scheduler tells it to
  disableAutoBackSource: false

# Current host info used for scheduler.
host:
  # Access ip for other peers;
  # when the local ip is different from the access ip, advertiseIP should be set.
  advertiseIP: 82.**.**.**
  # Geographical location, separated by "|" characters.
  location: ''
  # IDC deployed by daemon.
  idc: ''
  # Daemon hostname.
  # hostname: ""

# Download service option.
download:
  # Calculate digest when transferring files; set false to save memory.
  calculateDigest: true
  # Total download limit per second.
  totalRateLimit: 2048Mi
  # Per peer task download limit per second.
  perPeerRateLimit: 1024Mi
  # Download piece timeout.
  pieceDownloadTimeout: 30s
  # When requesting data with a range header, prefetch the data that is not in the range.
  prefetch: true
  # Golang transport option.
  transportOption:
    # Dial timeout.
    dialTimeout: 30s
    # Keep alive.
    keepAlive: 30s
    # Same with http.Transport.MaxIdleConns.
    maxIdleConns: 100
    # Same with http.Transport.IdleConnTimeout.
    idleConnTimeout: 90s
    # Same with http.Transport.ResponseHeaderTimeout.
    responseHeaderTimeout: 30s
    # Same with http.Transport.TLSHandshakeTimeout.
    tlsHandshakeTimeout: 30s
    # Same with http.Transport.ExpectContinueTimeout.
    expectContinueTimeout: 30s
  # Concurrent option for back source, default: empty.
  # To enable the concurrent option, setting thresholdSize and goroutineCount is enough; the other options can be left empty.
  concurrent:
    # thresholdSize indicates the threshold to download pieces concurrently.
    thresholdSize: 10M
    # thresholdSpeed indicates the threshold download speed to download pieces concurrently.
    thresholdSpeed: 2M
    # goroutineCount indicates the concurrent goroutine count for every task.
    goroutineCount: 4
    # initBackoff second for every piece failed, default: 0.5.
    initBackoff: 0.5
    # maxBackoff second for every piece failed, default: 3.
    maxBackoff: 3
    # maxAttempts for every piece failed, default: 3.
    maxAttempts: 3
  # Download grpc option.
  downloadGRPC:
    # Security option.
    security:
      insecure: true
      cacert: ''
      cert: ''
      key: ''
      tlsVerify: true
      tlsConfig: null
    # Download service listen address.
    # Currently, only unix domain sockets are supported.
    unixListen:
      # In linux, default value is /var/run/dfdaemon.sock.
      # In macos(just for testing), default value is /tmp/dfdaemon.sock.
      socket: ''
  # Peer grpc option.
  # The peer grpc service sends piece info to other peers.
  peerGRPC:
    security:
      insecure: true
      cacert: ''
      cert: ''
      key: ''
      tlsVerify: true
    tcpListen:
      # # Listen address.
      # listen: 0.0.0.0
      # Listen port, daemon will try to listen,
      # when this port is not available, daemon will try next port.
      port: 65006
      # If you want to limit the upper port range, use the format below.
#     port:
#       start: 65000
#       end: 65009

# Upload service option.
upload:
  # Upload limit per second.
  rateLimit: 2048Mi
  security:
    insecure: true
    cacert: ''
    cert: ''
    key: ''
    tlsVerify: false
  tcpListen:
    # # Listen address.
    # listen: 0.0.0.0
    # Listen port, daemon will try to listen,
    # when this port is not available, daemon will try next port.
    port: 65008
    # If you want to limit the upper port range, use the format below.
#   port:
#     start: 65020
#     end: 65029

# Object storage service.
objectStorage:
  # Enable object storage service.
  enable: false
  # Filter is used to generate a unique Task ID by
  # filtering unnecessary query params in the URL,
  # it is separated by & character.
  # When filter: "Expires&Signature&ns", for example:
  #  http://localhost/xyz?Expires=111&Signature=222&ns=docker.io and http://localhost/xyz?Expires=333&Signature=999&ns=docker.io
  # are the same task.
  filter: 'Expires&Signature&ns'
  # maxReplicas is the maximum number of replicas of an object cache in seed peers.
  maxReplicas: 3
  # Object storage service security option.
  security:
    insecure: true
    tlsVerify: true
  tcpListen:
    # # Listen address.
    # listen: 0.0.0.0
    # Listen port.
    port: 65004

# Peer task storage option.
storage:
  # Task data expiry time;
  # when a task's data has not been accessed for this long, the task will be garbage collected.
  taskExpireTime: 2160h
  # Storage strategy when processing task data.
  # io.d7y.storage.v2.simple : download the file to the data directory first, then copy it to the output path; this is the default action.
  #                            The downloaded file in the data directory is the peer data used for uploading to other peers.
  # io.d7y.storage.v2.advance: download the file directly to the output path with a postfix and hard link it to the final output,
  #                            avoiding the copy to the output path. Faster than the simple strategy, but:
  #                            the output file with the postfix is the peer data used for uploading to other peers;
  #                            if the user deletes or changes this file, the peer data will be corrupted.
  # default is io.d7y.storage.v2.simple.
  strategy: io.d7y.storage.v2.simple
  # Disk quota gc threshold, when the quota of all tasks exceeds the gc threshold, the oldest tasks will be reclaimed.
  diskGCThreshold: 100Gi
  # Disk used percent gc threshold, when the disk used percent exceeds, the oldest tasks will be reclaimed.
  # eg, diskGCThresholdPercent=80, when the disk usage is above 80%, start to gc the oldest tasks.
  diskGCThresholdPercent: 80
  # Set to true for reusing underlying storage for the same task id.
  multiplex: true

# Health service option.
health:
  security:
    insecure: true
    cacert: ''
    cert: ''
    key: ''
    tlsVerify: false
  tcpListen:
    # # Listen address.
    # listen: 0.0.0.0
    # Listen port, daemon will try to listen,
    # when this port is not available, daemon will try next port.
    port: 40902
    # If you want to limit the upper port range, use the format below.
#   port:
#     start: 40901
#     end: 40901

# Proxy service detail option.
proxy:
  # Filter for hash url.
  # when defaultFilter: "Expires&Signature&ns", for example:
  #  http://localhost/xyz?Expires=111&Signature=222&ns=docker.io and http://localhost/xyz?Expires=333&Signature=999&ns=docker.io
  # are the same task; it is also possible to override the default filter by adding
  # the X-Dragonfly-Filter header through the proxy.
  defaultFilter: 'Expires&Signature&ns'
  # Tag the task.
  # when the value of the default tag is different,
  # the same download url can be divided into different tasks according to the tag,
  # it is also possible to override the default tag by adding
  # the X-Dragonfly-Tag header through the proxy.
  defaultTag: ''
  security:
    insecure: true
    cacert: ''
    cert: ''
    key: ''
    tlsVerify: false
  tcpListen:
    # namespace stands for the linux net namespace, like /proc/1/ns/net.
    # It's useful for running the daemon in a pod with its own ip allocated while listening on a specific port in the host net namespace.
    # Linux only.
    namespace: ''
    # # Listen address.
    # listen: 0.0.0.0
    # Listen port, daemon will try to listen,
    # when this port is not available, daemon will try next port.
    port: 65001
    # If you want to limit the upper port range, use the format below.
  #   port:
  #     start: 65020
  #     end: 65029
  registryMirror:
    # When enabled, the "X-Dragonfly-Registry" header is used for the remote registry instead of the url.
    dynamic: true
    # URL for the registry mirror.
    url: https://registry.dcim.co
    # Whether to ignore https certificate errors.
    insecure: true
    # Optional certificates if the remote server uses self-signed certificates.
    certs: []
    # Whether to request the remote registry directly.
    direct: false
    # Whether to use proxies to decide if dragonfly should be used.
    useProxies: false

  proxies:
    # Proxy all http image layer download requests with dfget.
    - regx: blobs/sha256.*
    # Proxy all http image layer download requests with dfget.
    - regx: file-server.*
    # Change http requests to some-registry to https and proxy them with dfget.
    - regx: some-registry/
      useHTTPS: true
    # Proxy requests directly, without dfget.
    - regx: no-proxy-reg
      direct: true
    # Proxy requests with redirect.
    - regx: some-registry
      redirect: another-registry
    # The same as url rewriting, like the apache ProxyPass directive.
    - regx: ^http://some-registry/(.*)
      redirect: http://another-registry/$1

  hijackHTTPS:
    # key pair used to hijack https requests
    cert: ""
    key: ""
    hosts:
      - regx: mirror.aliyuncs.com:443 # regexp to match request hosts
        # whether to ignore https certificate errors
        insecure: true
        # optional certificates if the host uses self-signed certificates
        certs: []
  # max tasks to download at the same time, 0 means no limit
  maxConcurrency: 0
  whiteList:
    # the host of the whitelist
    - host: ""
      # match whitelist hosts
      regx: ".*"
      # port that need to be added to the whitelist
      ports:



security:
  # autoIssueCert indicates whether to issue client certificates for all grpc calls.
  # If autoIssueCert is false, any other option in security will be ignored.
  autoIssueCert: false
  # caCert is the root CA certificate for all grpc tls handshake, it can be path or PEM format string.
  caCert: ''
  # tlsVerify indicates to verify certificates.
  tlsVerify: false
  # tlsPolicy controls the grpc handshake behaviors:
  #   force: both ClientHandshake and ServerHandshake only support tls
  #   prefer: ServerHandshake supports tls and insecure (non-tls), ClientHandshake only supports tls
  #   default: ServerHandshake supports tls and insecure (non-tls), ClientHandshake only supports insecure (non-tls)
  # Notice: If the dragonfly services have already been deployed, a two-step upgrade is required.
  # The first step is to set tlsPolicy to default, and then upgrade the dragonfly services.
  # The second step is to set tlsPolicy to prefer, and then completely upgrade the dragonfly services.
  tlsPolicy: 'prefer'
  certSpec:
    # validityPeriod is the validity period of the certificate.
    validityPeriod: 4320h
# Prometheus metrics address.
# metrics: ':8000'

network:
  # Enable ipv6.
  enableIPv6: false
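
Note: the compose file further down mounts the seed-peer config into the container as /etc/dragonfly/dfget.yaml, so a quick sanity check (a rough sketch, using only standard docker commands) is to confirm the peer container actually sees the values above:

docker exec peer cat /etc/dragonfly/dfget.yaml     # should match the seed-peer config above
docker inspect -f '{{ json .Mounts }}' peer        # double-check the bind mounts the container started with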

scheduler.yml

# server scheduler instance configuration
server:
  # # Advertise ip.
  advertiseIP: 100.**.**.**
  # # Listen ip.
  # listenIP: 0.0.0.0
  # Port is the port the scheduler server listens on.
  port: 8002
  # # Server host.
  # host: localhost
  # WorkHome is working directory.
  # In linux, default value is /usr/local/dragonfly.
  # In macos(just for testing), default value is /Users/$USER/.dragonfly.
  workHome: ''
  # logDir is the log directory.
  # In linux, default value is /var/log/dragonfly.
  # In macos(just for testing), default value is /Users/$USER/.dragonfly/logs.
  logDir: ''
  # cacheDir is dynconfig cache directory.
  # In linux, default value is /var/cache/dragonfly.
  # In macos(just for testing), default value is /Users/$USER/.dragonfly/cache.
  cacheDir: ''
  # pluginDir is the plugin directory.
  # In linux, default value is /usr/local/dragonfly/plugins.
  # In macos(just for testing), default value is /Users/$USER/.dragonfly/plugins.
  pluginDir: ''
  # dataDir is the data directory.
  # In linux, default value is /var/lib/dragonfly.
  # In macos(just for testing), default value is /Users/$USER/.dragonfly/data.
  dataDir: ''

# scheduler policy configuration
scheduler:
  # Algorithm configuration to use different scheduling algorithms,
  # default configuration supports "default" and "ml"
  # "default" is the rule-based scheduling algorithm,
  # "ml" is the machine learning scheduling algorithm
  # It also supports user plugin extension, the algorithm value is "plugin",
  # and the compiled `d7y-scheduler-plugin-evaluator.so` file is added to
  # the dragonfly working directory plugins.
  algorithm: default
  # backSourceCount is the number of backsource clients
  # when the seed peer is unavailable.
  backSourceCount: 3
  # Retry scheduling back-to-source limit times.
  retryBackSourceLimit: 5
  # Retry scheduling limit times.
  retryLimit: 10
  # Retry scheduling interval.
  retryInterval: 50ms
  # GC metadata configuration.
  gc:
    # pieceDownloadTimeout is the timeout of downloading piece.
    pieceDownloadTimeout: 30m
    # peerGCInterval is the interval of peer gc.
    peerGCInterval: 10s
    # peerTTL is the ttl of a peer. If the peer's data has been downloaded by other peers,
    # then peerTTL will be reset.
    peerTTL: 24h
    # taskGCInterval is the interval of task gc. If all the peers have been reclaimed in the task,
    # then the task will also be reclaimed.
    taskGCInterval: 30m
    # hostGCInterval is the interval of host gc.
    hostGCInterval: 6h
    # hostTTL is the time to live of a host. If the host announces a message to the scheduler,
    # then hostTTL will be reset.
    hostTTL: 1h

# Database info used for server.
database:
  # Redis configuration.
  redis:
    # Redis addresses.
    addrs:
      - "redis:6379"
    # Redis username.
    username: ''
    # Redis password.
    password: dragonfly
    # Redis brokerDB name.
    brokerDB: 1
    # Redis backendDB name.
    backendDB: 2

# Dynamic data configuration.
dynConfig:
  # Dynamic config refresh interval.
  refreshInterval: 1m

# Scheduler host configuration.
host:
  # idc is the idc of scheduler instance.
  idc: ''
  # location is the location of scheduler instance.
  location: ''

# Manager configuration.
manager:
  # addr is manager access address.
  addr: "100.**.**.**.**:65003"
  # schedulerClusterID cluster id to which scheduler instance belongs.
  schedulerClusterID: "1"
  # keepAlive keep alive configuration.
  keepAlive:
    # KeepAlive interval.
    interval: 5s

# Seed peer configuration.
seedPeer:
  # The scheduler enables the seed peer as a P2P peer;
  # if the value is false, the P2P network will not back-source through the
  # seed peer but through normal peers, and the preheat feature does not work.
  enable: true

# Machinery async job configuration,
# see https://github.com/RichardKnop/machinery.
job:
  # Scheduler enable job service.
  enable: true
  # Number of workers in global queue.
  globalWorkerNum: 1
  # Number of workers in scheduler queue.
  schedulerWorkerNum: 1
  # Number of workers in local queue.
  localWorkerNum: 5

# Store task download information.
storage:
  # maxSize sets the maximum size in megabytes of storage file.
  maxSize: 100
  # maxBackups sets the maximum number of storage files to retain.
  maxBackups: 10
  # bufferSize sets the size of the buffer container;
  # when the buffer is full, all records in the buffer are written to the file.
  bufferSize: 100

# Enable prometheus metrics.
metrics:
  # Scheduler enable metrics service.
  enable: false
  # Metrics service address.
  addr: ':8000'
  # Enable host metrics.
  enableHost: false

security:
  # autoIssueCert indicates whether to issue client certificates for all grpc calls.
  # If autoIssueCert is false, any other option in security will be ignored.
  autoIssueCert: false
  # caCert is the root CA certificate for all grpc tls handshake, it can be path or PEM format string.
  caCert: ''
  # tlsVerify indicates to verify certificates.
  tlsVerify: false
  # tlsPolicy controls the grpc handshake behaviors:
  #   force: both ClientHandshake and ServerHandshake only support tls
  #   prefer: ServerHandshake supports tls and insecure (non-tls), ClientHandshake only supports tls
  #   default: ServerHandshake supports tls and insecure (non-tls), ClientHandshake only supports insecure (non-tls)
  # Notice: If the dragonfly services have already been deployed, a two-step upgrade is required.
  # The first step is to set tlsPolicy to default, and then upgrade the dragonfly services.
  # The second step is to set tlsPolicy to prefer, and then completely upgrade the dragonfly services.
  tlsPolicy: 'prefer'
  certSpec:
    # validityPeriod is the validity period of the certificate.
    validityPeriod: 4320h

network:
  # Enable ipv6.
  enableIPv6: false

# console shows log on console
console: false

# whether to enable debug level logger and enable pprof
verbose: true

# listen port for pprof, only valid when the verbose option is true
# default is -1. If it is 0, pprof will use a random port.
pprof-port: -1

# jaeger endpoint url, like: http://jaeger.dragonfly.svc:14268/api/traces
jaeger: ""
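
Additional context: the job section in this scheduler configuration is what processes preheat requests dispatched by the manager. A rough, non-authoritative way to confirm a preheat actually reached the scheduler and this peer is to search the logs (the scheduler container name and the grep keyword are assumptions; the peer log path comes from the compose volume below):

docker logs scheduler 2>&1 | grep -i preheat    # scheduler container name is an assumption
grep -ri preheat /var/log/peer/ || true         # daemon logs are mounted to the host in the compose file below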

docker-compose.yml

services:
  dfdaemon:
    image: dragonflyoss/dfdaemon:latest
    container_name: peer
    restart: always
    healthcheck:
      test: ["CMD-SHELL", "/bin/grpc_health_probe -addr=:65006 || exit 1"]
      interval: 1s
      timeout: 2s
      retries: 30
    volumes:
      - /var/log/peer:/var/log/dragonfly/daemon
      - /mnt/cache/dragonfly:/var/lib/dragonfly
      - /etc/dragonfly/seed-peer.yaml:/etc/dragonfly/dfget.yaml:ro
    ports:
      - 65000:65000
      - 65001:65001
      - 65002:65002
      - 65006:65006
      - 65008:65008
    extra_hosts:
      - "harbor.**.**:100.**.**.**"

dusagu added the bug label Dec 11, 2024
gaius-qi (Member) commented Dec 11, 2024

@dusagu Please pull and merge the latest version of the main branch. The Docker Compose setup was updated maybe two months ago.
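
For anyone following along, a hedged sketch of what that update could look like, assuming the compose and config files were originally copied from a checkout of the Dragonfly repository (the directory name and layout here are illustrative, not the official ones):

git -C dragonfly pull origin main    # refresh the checkout the compose files were taken from
docker compose down                  # then redeploy with the refreshed compose config and peer configuration
docker compose up -d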

dusagu (Author) commented Dec 11, 2024

@gaius-qi (screenshot attached)
But here it is 5 hours ago.

gaius-qi (Member) commented

Update the config of the docker compose.
