Introduction

Data Standardizer provides implementations of various internationally recognised standards in data processing, covering topics ranging from languages to currencies and geographical entities. With strongly-typed enumerations for each standard (where applicable) or other targeted data types, you can represent these elements in your code such that errors with invalid values are minimised.

Supported target platforms include (modern) .NET and .NET Standard. Data Standardizer can be used in modern application software, and is also an option for older codebases that are being upgraded gradually or that may remain on older frameworks indefinitely.

Supporting the project

If you derive a commercial benefit from Data Standardizer, or feel it otherwise adds value to your project, please consider supporting the project by becoming a GitHub sponsor. Data Standardizer is maintained and enhanced by @matthew25187 in his personal time and is made available for free for all to use.

Getting Started

Installation

Data Standardizer is available as a series of packages from NuGet.org that can be linked to your existing projects. Available packages include:

Package	Description
DataStandardizer.Chronology	Supports the following standards: TZ Database; Unix time; DOS date & time
DataStandardizer.Core	Common types used to implement standards in the other packages. You should not need to link to this package directly.
DataStandardizer.File.CSV	Supports the following standards: RFC 4180, Common Format and MIME Type for Comma-Separated Values (CSV) Files
DataStandardizer.Geography	Supports the following standards: ISO 3166-1, Codes for the representation of names of countries and their subdivisions – Part 1: Country code; ISO 3166-2, Codes for the representation of names of countries and their subdivisions – Part 2: Country subdivision code; UN M49, Standard Country or Area Codes for Statistical Use (Series M, No. 49)
DataStandardizer.Language	Supports the following standards: ISO 639, Code for the representation of names of languages (Part 1: Alpha-2 code; Part 2: Alpha-3 code; Part 3: Alpha-3 code for comprehensive coverage of languages; Part 5: Alpha-3 code for language families and groups); ISO 15924, Codes for the representation of names of scripts
DataStandardizer.LanguageTag	Supports the following standards: Best Current Practice (BCP) 47 for IETF language tags
DataStandardizer.Money	Supports the following standards: ISO 4217, Codes for the representation of currencies and funds (Table A.1 – Current currency & funds code list; Table A.2 – Current funds codes; Table A.3 – List of codes for historic denominations of currencies & funds); Money type, as described in Patterns of Enterprise Application Architecture by Martin Fowler

To use a particular standard in your application, find the corresponding package from the above list and add it as a dependency to your project. Instructions for doing so will depend on what development tooling you are using.

Visual Studio: see Install and manage packages in Visual Studio using the NuGet Package Manager
.NET CLI: see Install and manage NuGet packages with the dotnet CLI
Visual Studio Code: see NuGet in Visual Studio Code
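
With the .NET CLI, adding a package comes down to one `dotnet add package` command per package. The sketch below adds two packages from the table above to the project in the current directory. The choice of packages is only illustrative, and the guard around the command is an addition made here so that the script is harmless when run outside a project folder.

```shell
# Package IDs are taken from the table above; pick the ones you need.
packages="DataStandardizer.Geography DataStandardizer.Money"

added=0
for pkg in $packages; do
    # 'dotnet add package' needs the .NET SDK and a project file in the current directory.
    if command -v dotnet >/dev/null 2>&1 && ls ./*.csproj >/dev/null 2>&1; then
        dotnet add package "$pkg" && added=$((added + 1))
    else
        echo "Skipped: would run 'dotnet add package $pkg' in a project folder"
    fi
done
echo "Packages added: $added"
```

DataStandardizer.Core is left out on purpose, since it should be pulled in automatically as a dependency of the other packages.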

Software dependencies

Depending on which .NET platform you are targeting, the above packages also depend on various other system and third-party packages. These are included as static dependencies where required and should be resolved automatically, but if you use a proxy for your package server, you may need to make sure these other packages are also available through it.

The repository includes a number of PowerShell scripts whose names start with Generate. These scripts re-generate the enums that implement each corresponding standard. To run them you need a PowerShell prompt and access to the official flat-file data sources provided by the relevant standards body or designated maintainer. Some scripts may also require a minimum version of PowerShell.

Other scripts and YAML files are included to support the infrastructure as code (IaC) used by the Data Standardizer project for functions such as pipelines and package hosting. These files are not intended to be used by the end user.

Latest releases

Package	Release version	Release status
DataStandardizer.Chronology
DataStandardizer.Core
DataStandardizer.File.CSV
DataStandardizer.Geography
DataStandardizer.Language
DataStandardizer.LanguageTag
DataStandardizer.Money

The most recently produced release version (shown above) does not necessarily correspond to the latest package version published to NuGet or any other publicly available source.

Build and Test

Branching strategy

The Data Standardizer repository makes use of two "main" branches. They are:

Name	Description
`master`	Top-level branch from which all package release builds are produced. The `develop` branch will be merged into this branch when a new release is done.
`develop`	Default branch and the branch from which preview package builds are produced. Changes are marshalled on this branch before being included in a release build.

Other branches that may be created from time to time are not relevant to non-contributors.

Build source code

To compile the source code, first you will need to clone the repository to your local machine. You can find instructions for doing so here.

With the source code, you can then open a command prompt, change the current directory to the repository root folder, and use the following command to compile the entire solution:

    dotnet build DataStandardizer.sln

You can also work with the source code in IDEs such as Visual Studio or Visual Studio Code. In these cases, open the DataStandardizer.sln solution file to access the source code.

There are also solution filter files (*.slnf) for each of the projects (packages) in the repository root folder alongside the main solution file. These files narrow the scope of projects included to only those needed to build and test a single package. You can also build these solution filters if so desired, and even open them in your IDE if you only want to work with the code for one package. They are included mainly because they are used by the CI pipelines to enable the building and testing of each package individually.
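
Building a single package through its solution filter could look like the sketch below. The filter file name is assumed to follow the package name and to sit in the repository root; the guard is an addition here so that the script only reports what is missing when run elsewhere.

```shell
# Solution filter for the CSV package (file name assumed from the package naming).
slnf="DataStandardizer.File.CSV.slnf"

if command -v dotnet >/dev/null 2>&1 && [ -f "$slnf" ]; then
    # Builds only the projects that the filter includes.
    dotnet build "$slnf"
else
    echo "Run from the repository root with the .NET SDK installed ($slnf not found or dotnet missing)"
fi
```

Swapping in another filter file name narrows the build to that package instead.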

Running tests

The included tests are based on the xUnit test framework. To run the tests, you will need a test runner able to work with xUnit. The test projects include a default test runner dependency, which enables you to run the tests from the command line. With a command prompt open (as described above), you can run all tests in the solution:

    dotnet test DataStandardizer.sln
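
A command-line run can also be narrowed with the `--filter` option of `dotnet test`. The sketch below selects tests whose fully qualified name contains a given substring; the substring `Csv` is a hypothetical value, since the naming of the test classes is not described here, and the guard is an addition so that the script is safe to run outside the repository.

```shell
# Substring matched against fully qualified test names (hypothetical value).
name_filter="Csv"

if command -v dotnet >/dev/null 2>&1 && [ -f DataStandardizer.sln ]; then
    # --filter limits the run to tests whose fully qualified name contains the substring.
    dotnet test DataStandardizer.sln --filter "FullyQualifiedName~$name_filter"
else
    echo "Run from the repository root with the .NET SDK installed"
fi
```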

Visual Studio includes the Test Explorer that enables you to discover available tests and execute those tests by various categorizations. Find out more about Test Explorer here. Testing is also supported in Visual Studio Code with use of the C# Dev Kit (learn more here).

Usage

Though each package contains many types, there will typically be only a few that you use directly in your application. Please refer to the project documentation for more information; it includes articles on how to use the packages for specific tasks.

📄 CSV File Support

The DataStandardizer.File.CSV package includes built-in support for working with CSV (Comma-Separated Values) files, a common format for structured data exchange. This functionality is designed to be lightweight, flexible, and compatible with legacy .NET applications via support for .NET Standard 1.x and 2.0.

✅ What It Offers

Read and write CSV files with customizable delimiters
Normalize inconsistent CSV structures for downstream processing
Handle headers, quoted fields, and edge cases gracefully
Designed for extensibility and integration into broader data workflows

💡 Example: Reading and Normalizing a CSV File

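    // Requires System.IO (for File) and the namespace(s) that expose the CSV types
    // from the DataStandardizer.File.CSV package.
    var inputPath = "data.csv";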
    var outputPath = "normalized.csv";

    var csvInputOptions = new CsvFileOptions
    {
        TerminatorLineBreak = "\n"  // source file has non-standard line breaks
    };
    var csvOutputOptions = csvInputOptions with
    {
        TerminatorLineBreak = "\r\n", // write lines to the output file with standard line breaks
        QuoteHandling = CsvFieldQuoteHandling.Required  // quote field values only when needed
    };

    using (var input = File.OpenRead(inputPath))
    using (var csvReader = new CsvFileReader<CsvFileRecordLine>(input, csvInputOptions))
    using (var output = File.Create(outputPath))
    using (var csvWriter = new CsvFileWriter<CsvFileRecordLine>(output, csvOutputOptions))
    {
        var line = csvReader.ReadLine();
        while (line is not null)
        {
            csvWriter.WriteLine(line);

            line = csvReader.ReadLine();
        }
    }

📦 Where to Find It

CSV support is provided by the DataStandardizer.File.CSV NuGet package. You can install it via:

    dotnet add package DataStandardizer.File.CSV

For more advanced usage and configuration options, see the project documentation.
