The current implementation of the DOI ingestion function reads the input file line by line, regardless of its format or structure. As a result, any file type is accepted. Current code:
with open(args.list_of_dois, "r") as csv_file:
    for line in csv_file:
        list_of_dois.append(line.strip())
Problem: This code does not validate the file type, so it will try to process any input file (e.g., .txt, .csv, .json, .yaml). This works for line-based formats, but an input with a different format or structure is read as if each line were a DOI, which can cause problems downstream.
Also, if an invalid .csv file is passed, the pipeline gives no failure feedback: it still reports a Success message.
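One possible way to address both points is sketched below. It is only an illustration, not the project's code: the names `ALLOWED_SUFFIXES`, `InvalidDoiFileError`, and `read_doi_rows` are hypothetical, and the set of accepted extensions is an assumption.

```python
from pathlib import Path

# Assumption: only these extensions are treated as line-based DOI lists.
ALLOWED_SUFFIXES = {".csv", ".txt"}


class InvalidDoiFileError(ValueError):
    """Hypothetical error raised when the input file cannot be used."""


def read_doi_rows(path):
    """Return the non-empty, stripped rows of a headerless one-column file."""
    suffix = Path(path).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise InvalidDoiFileError(
            f"Unsupported file type {suffix!r}; expected one of {sorted(ALLOWED_SUFFIXES)}"
        )

    with open(path, "r", encoding="utf-8") as handle:
        rows = [line.strip() for line in handle if line.strip()]

    if not rows:
        # Fail loudly instead of letting the pipeline report success.
        raise InvalidDoiFileError(f"No rows found in {path}")
    return rows
```

The caller could catch `InvalidDoiFileError`, print the message, and exit with a non-zero status instead of printing Success.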
A headerless, one-column CSV or text file is what is expected at the moment. DOIs can be in any format; legal DOIs are extracted from each row using a regular expression.
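For illustration, the extraction step might look roughly like the sketch below. The pattern is a simplified assumption, not the regular expression the project actually uses, and `extract_dois` is a hypothetical helper. It also returns the rows in which no DOI was found, so they can be reported rather than silently dropped.

```python
import re

# Assumption: simplified DOI pattern ("10.", a 4-9 digit registrant code, "/", suffix).
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+')


def extract_dois(rows):
    """Split rows into (dois, rejected_rows) using DOI_PATTERN."""
    dois, rejected = [], []
    for row in rows:
        match = DOI_PATTERN.search(row)
        if match:
            # Keep only the DOI itself, dropping prefixes such as "https://doi.org/".
            dois.append(match.group(0))
        else:
            rejected.append(row)
    return dois, rejected
```

With this shape, a bare DOI, a `doi:` prefixed value, and a `https://doi.org/` URL all reduce to the same form, while rows without a DOI end up in the second list.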