feat: continue ingest after errors #103

Merged
merged 9 commits into from
Mar 14, 2024

@Frando Frando commented Mar 12, 2024

This refactors the error handling in the ingest process. We now store failures during ingest in the database, and continue if the error is non-fatal.

  • If the call to fetchUpdates fails, we wait 30s before trying again, as this is likely a network or other external issue.
  • If the mapping of a record fails, we log the error to the database and continue.
  • If an error occurs that we do not handle specifically, we treat it as fatal and abort with an error message.

This means that for all errors that can happen due to external circumstances (datasource offline, datasource emits bad data), we do not abort but log the errors and try to continue. We only abort right away for other errors, which should not happen (to be treated as bugs in repco).
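
The three cases above can be sketched roughly as follows. This is only an illustration of the decision logic, not the code in this PR: `IngestError`, `handleIngestError`, `ErrorLogEntry`, the `map_record` kind, and the in-memory array standing in for the database table are all assumed names. Only the `fetch_updates` kind and the 30s delay come from the description.

```typescript
// Hypothetical error kinds. Only "fetch_updates" appears in the PR's example
// output; "map_record" is an assumed name for record-mapping failures.
type IngestErrorKind = "fetch_updates" | "map_record";

// Errors caused by external circumstances are wrapped with a kind.
class IngestError extends Error {
  kind: IngestErrorKind;
  constructor(kind: IngestErrorKind, message: string) {
    super(message);
    this.name = "IngestError";
    this.kind = kind;
    // Keeps `instanceof` working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, IngestError.prototype);
  }
}

// Stand-in for a row in the ingest error table (assumed shape).
interface ErrorLogEntry {
  kind: IngestErrorKind;
  message: string;
  timestamp: string;
}

// What the ingest loop should do next.
type Action =
  | { type: "retry"; delayMs: number }
  | { type: "continue" }
  | { type: "abort" };

const FETCH_RETRY_DELAY_MS = 30_000;

function handleIngestError(err: unknown, log: ErrorLogEntry[]): Action {
  if (err instanceof IngestError) {
    // Known, externally caused failure: record it instead of aborting.
    log.push({
      kind: err.kind,
      message: err.message,
      timestamp: new Date().toISOString(),
    });
    return err.kind === "fetch_updates"
      ? { type: "retry", delayMs: FETCH_RETRY_DELAY_MS }
      : { type: "continue" };
  }
  // Anything else is unexpected and treated as a bug: abort.
  return { type: "abort" };
}
```

Returning an action rather than sleeping inside the handler keeps the classification easy to test; the surrounding loop would decide how to wait and when to re-fetch.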

The PR also includes a new CLI command, ds errors, to show the errors:

$ yarn repco ds errors -r default --count 1
repo:         did:key:z6MkizBdcAEfBz5LGvC3V5bgnMScLuGUt7b76ysdA8kA6v7Z
datasource:   repco:default:datasource:cba:https://cba.fro.at/wp-json/wp/v2
timestamp:    2024-03-12T16:58:07.825Z
kind:         fetch_updates
error:        Failed to fetch updates.
  caused by: Failed to fetch: Fetch failed (URL: https://cba.fro.at/wp-json/wp/v2/posts?page=1&per_page=30&_embed=&orderby=modified&order=asc&modified_after=2008-03-31T20%3A37%3A55&api_key=k8WHfNbal0rjIs2f)
  caused by: TypeError: fetch failed
  caused by: Error [ERR_TLS_CERT_ALTNAME_INVALID]: Hostname/IP does not match certificate's altnames: Host: cba.fro.at. is not in the cert's altnames: DNS:cloudflare-dns.com, DNS:*.cloudflare-dns.com, DNS:one.one.one.one, IP Address:1.0.0.1, IP Address:1.1.1.1, IP Address:162.159.36.1, IP Address:162.159.46.1, IP Address:2606:4700:4700:0:0:0:0:1001, IP Address:2606:4700:4700:0:0:0:0:1111, IP Address:2606:4700:4700:0:0:0:0:64, IP Address:2606:4700:4700:0:0:0:0:6400
cursor:       {"posts":"2008-03-31T20:37:55"}
sourcerecord: null
---
Done in 1.52s.
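
The nested "caused by" lines in the output above suggest that the error's cause chain is walked when printing. A minimal sketch of how such a chain could be rendered, assuming each error exposes its cause via the standard `cause` property (`formatErrorChain` and `maxDepth` are hypothetical names, and the actual formatting in repco may differ):

```typescript
// Renders an error and its chain of causes, one line per level.
// `maxDepth` guards against accidental cycles in the cause chain.
function formatErrorChain(err: unknown, maxDepth = 10): string {
  const lines: string[] = [];
  let current: unknown = err;
  for (
    let depth = 0;
    current !== undefined && current !== null && depth < maxDepth;
    depth++
  ) {
    const message = current instanceof Error ? current.message : String(current);
    lines.push(depth === 0 ? message : `  caused by: ${message}`);
    // Follow the `cause` property if the value is an object that has one.
    current =
      typeof current === "object"
        ? (current as { cause?: unknown }).cause
        : undefined;
  }
  return lines.join("\n");
}
```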



$ yarn repco ds errors --help
Show ingest error log

USAGE: repco ds errors [opts]

OPTIONS:
  --repo, -r    Repo name or DID
  --ds, -d      Datasource UID (optional)
  --offset, -o  Offset
  --count, -o   Offset
  --stack, -s   Show stack trace
  --json, -j    Print as JSON

@Frando Frando force-pushed the feat/ingesterrors branch from 4b2fb80 to 3284fc4 Compare March 12, 2024 17:46
@Frando Frando force-pushed the feat/ingesterrors branch from 3284fc4 to 27060b6 Compare March 12, 2024 17:47
@Frando Frando merged commit 35b4892 into main Mar 14, 2024
2 checks passed