Accept custom dialects #7

@RomainTT

First of all, thanks for this package, it’s great!

My problem is that the validator only accepts “classic” CSV: a header row, a comma as separator, \n as line terminator, etc.

It would be really nice to support other formats, and even nicer to make the format part of the schema.

To manage formats, here are some ideas:

  • Use the Dialects of the native csv library.
  • In this line which reads the CSV file, give a custom dialect to DictReader.
  • The custom dialect could either be given as an argument of ValidateCsv.__init__, or be generated from the schema itself. In the latter case, a new method generate_dialect could parse the schema and build the dialect.
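To make the last idea concrete, here is a minimal sketch of what generate_dialect could look like. It is written as a standalone function, not against the package's real code: the function body, the "SchemaDialect" class name and the sample data are assumptions for illustration. Only csv.excel, csv.DictReader and the csv.QUOTE_* constants come from the standard library.

```python
import csv
import io


def generate_dialect(schema):
    """Hypothetical helper: build a csv.Dialect subclass from schema["dialect"]."""
    options = dict(schema.get("dialect") or {})
    quoting = options.get("quoting")
    if isinstance(quoting, str):
        # Translate a name such as "QUOTE_NONE" into the csv module constant.
        options["quoting"] = getattr(csv, quoting)
    # Inherit from csv.excel so unspecified options keep the usual CSV defaults.
    return type("SchemaDialect", (csv.excel,), options)


# Same dialect options as in the schema example of this issue.
schema = {
    "dialect": {
        "delimiter": "\t",
        "skipinitialspace": True,
        "lineterminator": "\n",
        "quoting": "QUOTE_NONE",
        "quotechar": "'",
        "escapechar": "\\",
        "doublequote": False,
    },
    "columns": [],
}

dialect = generate_dialect(schema)
data = io.StringIO("id\tname\n1\t Alice\n")  # made-up sample content
rows = list(csv.DictReader(data, dialect=dialect))
```

One caveat: according to the csv documentation, the reader ignores lineterminator and always treats \r or \n as end of line, so that option would only matter if the package also writes files.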

Example of schema with maximum complexity for the dialect:

{
  "name": null,
  "description": null,
  "filename": {
    "regex": null
  },
  "dialect": {
    "delimiter": "\t",
    "skipinitialspace": true,
    "lineterminator": "\n",
    "quoting": "QUOTE_NONE",
    "quotechar": "'",
    "escapechar": "\\",
    "doublequote": false
  },
  "columns": []
}
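As a quick check of how such a "dialect" section would arrive in Python, the sketch below parses a trimmed-down version of the schema with the standard json module. The SCHEMA_TEXT constant is only an illustration; the point is that the JSON escapes "\t" and "\\" decode to single characters, which is what the csv module requires for delimiter and escapechar.

```python
import json

# Raw string, so the backslashes reach the JSON parser untouched.
SCHEMA_TEXT = r'''
{
  "dialect": {
    "delimiter": "\t",
    "quoting": "QUOTE_NONE",
    "escapechar": "\\",
    "doublequote": false
  },
  "columns": []
}
'''

schema = json.loads(SCHEMA_TEXT)
dialect_options = schema["dialect"]

# After decoding, delimiter is a real tab and escapechar a single backslash.
print(repr(dialect_options["delimiter"]), repr(dialect_options["escapechar"]))
```

The quoting value stays a plain string after parsing, so whatever builds the dialect still has to map it to the matching csv.QUOTE_* constant.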

I’ll try to do a PR if I can spare some time.
