PwC integration causes integration tests to fail #5473

@mbollmann

A recent PR from the Papers with Code integration causes the integration tests of the Python library to fail.

This is because the PwC integration recently introduced a string with double whitespaces:

<pwcdataset url="https://paperswithcode.com/dataset/how-to-fix-quickbooks-error-30159-causes">How to fix QuickBooks Error 30159 – Causes &amp; Fixes</pwcdataset>

Our XML indentation function automatically "cleans" such double whitespaces to single ones, but is never called by ingest_pwc.py. When the integration tests load & re-save the files, the whitespaces are cleaned, causing an error.
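The mismatch can be reproduced in isolation. The following is only a minimal sketch, not the library's actual code: `collapse_whitespace` is a hypothetical stand-in for the cleanup done by the XML indentation function, and the element content and URL are made-up placeholders containing a double space.

```python
import re
import xml.etree.ElementTree as ET


def collapse_whitespace(text):
    """Hypothetical stand-in for the library's cleanup step:
    replace any run of two or more whitespace characters with one space."""
    return re.sub(r"\s{2,}", " ", text)


# Placeholder element as an ingestion script might write it,
# with a double space that was never cleaned.
raw = '<pwcdataset url="https://example.org/dataset/demo">Some  dataset &amp; name</pwcdataset>'

# Simulate "load & re-save": parse, clean the text, serialize again.
elem = ET.fromstring(raw)
elem.text = collapse_whitespace(elem.text)
resaved = ET.tostring(elem, encoding="unicode")

# The re-saved string no longer matches what is on disk.
print(resaved == raw)
```

Under these assumptions the comparison prints `False`: any check that expects loading and re-saving to leave the file unchanged will trip over input that bypassed the cleanup.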

IMHO this is an example of the drawbacks of directly manipulating XML without encapsulation, and my suggested fix is to refactor ingest_pwc.py to use the Python library instead.

Labels

bug, python-library (Concerning the acl-anthology-py library)
