Update README
jonavellecuerdo committed Jan 9, 2024
1 parent 755d07c commit af58897
Showing 2 changed files with 75 additions and 2 deletions.
73 changes: 73 additions & 0 deletions README.md
@@ -73,4 +73,77 @@ STATUS_UPDATE_INTERVAL = 1000
SENTRY_DSN = <sentry-dsn-for-oai-pmh-harvester>
```

## CLI commands

All CLI commands can be run with `pipenv run <COMMAND>`.

### `oai`

```text
Usage: -c [OPTIONS] COMMAND [ARGS]...
Options:
-h, --host TEXT Hostname of OAI-PMH server to harvest from, e.g.
https://dspace.mit.edu/oai/request. [required]
-o, --output-file TEXT Filepath to write output to. Can be a local filepath
or an S3 URI, e.g. S3://bucketname/filename.xml.
[required]
-v, --verbose Optional: enable debug output.
--help Show this message and exit.
Commands:
harvest Harvest command to retrieve records from an OAI-PMH compliant source.
setlist Create a JSON file describing the set structure of an OAI-PMH compliant source.
```
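As the usage line above suggests, the group options (`--host`, `--output-file`, `--verbose`) come before the subcommand name. The following is a minimal sketch of assembling such an invocation from Python. The helper `build_oai_command` is hypothetical and not part of the harvester, and it assumes the command is exposed as `oai` under `pipenv run`.

```python
import shlex
from typing import Sequence


def build_oai_command(
    host: str,
    output_file: str,
    subcommand: str,
    verbose: bool = False,
    extra_args: Sequence[str] = (),
) -> list:
    """Assemble an argv list for the `oai` CLI (hypothetical helper)."""
    if subcommand not in ("harvest", "setlist"):
        raise ValueError(f"Unknown subcommand: {subcommand!r}")
    # Assumption: the CLI is invoked as `pipenv run oai`.
    argv = ["pipenv", "run", "oai", "--host", host, "--output-file", output_file]
    if verbose:
        argv.append("--verbose")
    # Group options go first; the subcommand and its own options follow.
    argv.append(subcommand)
    argv.extend(extra_args)
    return argv


command = build_oai_command(
    "https://dspace.mit.edu/oai/request", "sets.json", "setlist"
)
# Print a shell-safe rendering of the command for copy and paste.
print(shlex.join(command))
```

The returned list can be passed to `subprocess.run` as-is. The output filename `sets.json` is only an illustration.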

### `oai harvest`

```text
Usage: -c harvest [OPTIONS]
Harvest command to retrieve records from an OAI-PMH compliant source.
Options:
--method [get|list] Record retrieval method to use. Default 'list'
method is faster and should be used in most
cases; 'get' method should be used for
ArchivesSpace due to errors retrieving a full
record set with the 'list' method. [default:
list]
-m, --metadata-format TEXT Optional: specify alternate metadata format for
harvested records (e.g. mods, mets, oai_dc, qdc,
ore). [default: oai_dc]
-f, --from-date TEXT Optional: starting date to harvest records from,
in format YYYY-MM-DD. Limits harvest to records
added/updated on or after the provided date.
-u, --until-date TEXT Optional: ending date to harvest records from,
in format YYYY-MM-DD. Limits harvest to records
added/updated on or before the provided date.
-s, --set-spec TEXT Optional: SetSpec of set to be harvested. Limits
harvest to records in the provided set.
-sr, --skip-record TEXT Optional: OAI-PMH identifier of record to skip
during harvest. Only works if the harvest method
used is 'get'. Can be repeated to skip multiple
records, e.g. '-sr oai:12345 -sr oai:67890'. Can
also be set via ENV variable, see README for
details.
--exclude-deleted Optional: exclude deleted records from harvest.
--help Show this message and exit.
```
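The option rules above (dates in `YYYY-MM-DD`, repeatable `--skip-record`, and skipping only working with the `get` method) can be sketched as a small argument builder. This is an illustrative sketch, not the harvester's own code: `build_harvest_args` is a hypothetical helper, and rejecting `--skip-record` with the `list` method up front is a design choice made here, not documented CLI behavior.

```python
import re
from datetime import datetime
from typing import Optional, Sequence


def build_harvest_args(
    method: str = "list",
    metadata_format: str = "oai_dc",
    from_date: Optional[str] = None,
    until_date: Optional[str] = None,
    set_spec: Optional[str] = None,
    skip_records: Sequence[str] = (),
    exclude_deleted: bool = False,
) -> list:
    """Build the argument list for `oai harvest` (hypothetical helper)."""
    if method not in ("get", "list"):
        raise ValueError("method must be 'get' or 'list'")
    if skip_records and method != "get":
        # The help text says skipping only works with the 'get' method.
        raise ValueError("--skip-record requires --method get")

    args = ["harvest", "--method", method, "--metadata-format", metadata_format]
    for flag, value in (("--from-date", from_date), ("--until-date", until_date)):
        if value is None:
            continue
        # Require the zero-padded YYYY-MM-DD shape, then check it is a real date.
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
            raise ValueError(f"{flag} must be in format YYYY-MM-DD")
        datetime.strptime(value, "%Y-%m-%d")
        args += [flag, value]
    if set_spec is not None:
        args += ["--set-spec", set_spec]
    for identifier in skip_records:
        # The option is repeated once per record to skip.
        args += ["--skip-record", identifier]
    if exclude_deleted:
        args.append("--exclude-deleted")
    return args
```

For example, `build_harvest_args(method="get", skip_records=["oai:12345", "oai:67890"])` yields one `--skip-record` pair per identifier, mirroring the `-sr oai:12345 -sr oai:67890` form in the help text.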

### `oai setlist`

```text
Usage: -c setlist [OPTIONS]
Create a JSON file describing the set structure of an OAI-PMH compliant
source.
Uses the OAI-PMH ListSets verb to retrieve all sets from a repository, and
writes the set names and specs to a JSON output file.
Options:
--help Show this message and exit.
```
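To illustrate what a set listing involves, here is a minimal sketch that extracts set names and specs from a single OAI-PMH `ListSets` response using only the standard library. It is not the harvester's implementation: the function name, the JSON key names, and the sample XML are assumptions, and paging via `resumptionToken` is left out.

```python
import json
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical sample response, trimmed to the elements used below.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>col_1</setSpec><setName>Example Collection</setName></set>
    <set><setSpec>col_2</setSpec><setName>Another Collection</setName></set>
  </ListSets>
</OAI-PMH>
"""


def sets_from_listsets_response(xml_text: str) -> list:
    """Return one dict per <set> element with its name and spec."""
    root = ET.fromstring(xml_text)
    sets = []
    for set_element in root.findall("./oai:ListSets/oai:set", OAI_NS):
        sets.append(
            {
                # Key names are an assumption, not the harvester's schema.
                "setName": set_element.findtext("oai:setName", "", OAI_NS),
                "setSpec": set_element.findtext("oai:setSpec", "", OAI_NS),
            }
        )
    return sets


# Serialize the extracted sets as JSON text.
print(json.dumps(sets_from_listsets_response(SAMPLE_RESPONSE), indent=2))
```

A full implementation would also follow any `resumptionToken` in the response to fetch the remaining pages before writing the output file.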



4 changes: 2 additions & 2 deletions harvester/cli.py
@@ -111,7 +111,7 @@ def harvest(
skip_record: tuple[str] | None,
exclude_deleted: bool,
) -> None:
- """Harvest records from an OAI-PMH compliant source and write to an output file."""
+ """Harvest command to retrieve records from an OAI-PMH compliant source."""
logger.info(
"OAI-PMH harvesting from source %s with parameters: method=%s, "
"metadata_format=%s, from_date=%s, until_date=%s, set=%s, skip_record=%s, "
@@ -162,7 +162,7 @@ def harvest(
@main.command()
@click.pass_context
def setlist(ctx: click.Context) -> None:
- """Get set info from an OAI-PMH compliant source and write to an output file.
+ """Create a JSON file describing the set structure of an OAI-PMH compliant source.
Uses the OAI-PMH ListSets verb to retrieve all sets from a repository, and writes
the set names and specs to a JSON output file.