Special number characters are not recognized in python. #183

Open
Yueqiao12Zhang opened this issue Sep 13, 2024 · 7 comments · May be fixed by #159
@Yueqiao12Zhang
Contributor

In my csv2rdf script, I use isdigit() to detect numeric values automatically. However, it also accepts superscript characters such as 0³ and circled characters such as ⑦ as digits, while RDFLib cannot convert those to an integer type in an RDF graph. Is there a way to recognize only pure integers in Python?
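
One possible way to accept only plain integers is sketched below, assuming the values arrive as strings and that an optional leading sign should be allowed. The helper name is_plain_integer is hypothetical and is not part of csv2rdf or RDFLib. The character class [0-9] is used instead of \d because \d also matches non-ASCII Unicode decimal digits in Python 3 str patterns.

```python
import re

# Optional sign followed by one or more ASCII digits only.
_PLAIN_INT = re.compile(r"[+-]?[0-9]+")


def is_plain_integer(text: str) -> bool:
    """Return True only if the whole string is an ASCII integer literal."""
    return _PLAIN_INT.fullmatch(text) is not None


print(is_plain_integer("42"))   # True
print(is_plain_integer("0³"))   # False, although "0³".isdigit() is True
print(is_plain_integer("⑦"))    # False, although "⑦".isdigit() is True
```

str.isdecimal() is narrower than isdigit() and also rejects ³ and ⑦, but it still accepts decimal digits from other scripts, so an explicit ASCII check may be the safer choice here.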

@Yueqiao12Zhang Yueqiao12Zhang self-assigned this Sep 13, 2024
@Yueqiao12Zhang Yueqiao12Zhang linked a pull request Sep 13, 2024 that will close this issue
@fujinaga
Member

Don't try to automatically determine whether a column is a number or a string. Do this manually for each column. Maybe OpenRefine can help?

@Yueqiao12Zhang
Contributor Author

OpenRefine does not help since we need to specify each column. I will add a feature to specify certain special columns.
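
A minimal sketch of what declaring column types by hand could look like, using only the standard library. The names COLUMN_TYPES and convert_rows and the column names id and title are hypothetical and do not come from the csv2rdf script. Columns that are not listed stay as strings, and a value such as ⑦ in a column declared as int raises a ValueError instead of being silently accepted.

```python
import csv
import io

# Hypothetical manual declaration: column name -> converter.
COLUMN_TYPES = {"id": int, "title": str}


def convert_rows(csv_text, column_types):
    """Parse CSV text and convert each cell with its declared converter."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        # Undeclared columns fall back to str, so nothing is guessed.
        rows.append(
            {name: column_types.get(name, str)(value) for name, value in row.items()}
        )
    return rows


rows = convert_rows("id,title\n7,Symphony ⑦\n", COLUMN_TYPES)
print(rows)
```

Here the circled digit survives untouched in the title column because that column was declared as a string.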

@fujinaga
Member

Have you started using the OpenRefine API yet? Once you have figured out how to do things manually, you may direct certain tasks via the API.

@Yueqiao12Zhang
Contributor Author

Using the OpenRefine API is not realistic since 99% of the reconciliation work should involve human inspection.

@Yueqiao12Zhang
Contributor Author

Other graph editing work that is done in OpenRefine can be done with Pandas instead.

@fujinaga
Member

I'm talking about using the OpenRefine API after the initial conversion.
The next time we import from a database, there should be very little interaction, except for entries that are new or have changed since the last import.
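
As a rough illustration of limiting attention to changed or new entries, the sketch below compares two CSV snapshots using only the standard library. It assumes each row carries a stable identifier column. The function name changed_or_new and the column name id are hypothetical, and this does not use or represent the OpenRefine API.

```python
import csv
import io


def changed_or_new(previous_csv, current_csv, key="id"):
    """Return rows of current_csv that are absent from or differ in previous_csv."""
    previous = {
        row[key]: row for row in csv.DictReader(io.StringIO(previous_csv))
    }
    return [
        row
        for row in csv.DictReader(io.StringIO(current_csv))
        if previous.get(row[key]) != row
    ]


old = "id,name\n1,Bach\n2,Mozart\n"
new = "id,name\n1,Bach\n2,W. A. Mozart\n3,Haydn\n"
for row in changed_or_new(old, new):
    print(row["id"], row["name"])
```

Only the rows returned here would need human inspection on a re-import. Unchanged rows could reuse earlier decisions.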

@fujinaga
Member

We can use Pandas or any other custom editing of CSV files, but I want to keep it as simple as possible.

2 participants