Python Copier Template for Data Science

License: MIT Ruff Copier pre-commit

This is a template built with Copier that generates a data-science-focused Python project.

Get started with the following command:

copier copy gh:felixgwilliams/python-copier-template-ds path/to/destination

Features

Project structure

It is assumed that most of the work will be done in Jupyter notebooks. However, the template also includes a Python project, in which you can put functions and classes shared across notebooks. The repository is set up to use pytest for unit testing this module code.

The template also includes a data directory whose contents are ignored by git. You can use this folder to store data that you do not want to commit. You may also add a readme file there to document the source datasets you use and how to acquire them.

just is a command runner that lets you easily run project-specific commands. In fact, you can use just to run all the setup commands listed below:

just setup

The repository is set up to use uv for package and project management. You can set up your Python environment with

uv sync --all-groups --all-extras

The repository is configured to use Ruff for linting and formatting. By default, all lints are enabled except:

  • COM Enforces trailing commas
  • ERA Disallows commented-out code
  • ISC001 (conflicts with the formatter).

In addition, the following rules are only enforced for module code as they are inappropriate or too strict for unit tests and notebooks:

  • D Requires docstrings on functions, classes and modules
  • ANN Requires type annotations on functions and methods
  • S101 Disallows use of assert
  • PLR2004 Disallows "magic" values in comparisons
  • T20 Disallows print statements

The target line length is 120 and the docstring convention is Google.
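As a rough illustration of how these rules play out, the sketch below contrasts module code with a unit test. The function `normalise`, its behaviour and the idea that the test would live in a separate tests directory are assumptions made for this example; they are not part of the template.

```python
"""Hypothetical shared module, e.g. a file inside the project's package."""


def normalise(values: list[float]) -> list[float]:
    """Scale values linearly so that they lie between 0 and 1.

    Args:
        values: Numbers to rescale. Must contain at least two distinct values.

    Returns:
        The rescaled numbers, in the original order.

    Raises:
        ValueError: If all values are equal, so no scale can be defined.
    """
    low, high = min(values), max(values)
    if low == high:
        msg = "values must contain at least two distinct numbers"
        raise ValueError(msg)
    return [(value - low) / (high - low) for value in values]


# Hypothetical unit test, shown in the same block for brevity.
# The D, ANN, S101 and PLR2004 rules are not enforced for tests, so a bare
# assert against literal expected values needs no docstring or annotations.
def test_normalise():
    result = normalise([2.0, 4.0, 6.0])
    assert result == [0.0, 0.5, 1.0]
```

The module function carries a Google-style docstring and full type annotations, which is what the D and ANN rules ask for in module code, while the test stays short and informal.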

pre-commit is a tool that runs checks on your files before you commit them with git, thereby helping ensure code quality. Enable it with the following command:

pre-commit install --install-hooks

The configuration is stored in .pre-commit-config.yaml.

nbwipers is a tool written in Rust that ensures Jupyter notebooks are clean. Committing notebooks that are not clean makes diffs more confusing, can degrade performance and increases the risk of leaking sensitive information. You can set it up as a git filter with the following command:

nbwipers install local
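To give a sense of what "clean" means for a notebook, here is a minimal sketch that uses only the Python standard library. It is not how nbwipers is implemented and does not reproduce its options; the function name `strip_notebook` and the sample notebook are hypothetical, and the sketch only removes code-cell outputs and execution counts.

```python
import json


def strip_notebook(raw: str) -> str:
    """Return notebook JSON with code-cell outputs and execution counts removed."""
    notebook = json.loads(raw)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []  # drop rendered results, which may hold sensitive data
            cell["execution_count"] = None  # drop run counters, which only add diff noise
    return json.dumps(notebook, indent=1)


# A made-up notebook with one executed code cell whose output leaks a value.
dirty = json.dumps(
    {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": 3,
                "metadata": {},
                "outputs": [{"output_type": "stream", "name": "stdout", "text": ["secret\n"]}],
                "source": ["print(token)"],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }
)
clean = json.loads(strip_notebook(dirty))
```

After cleaning, the cell source is unchanged but the leaked output is gone, so the committed file changes only when the code itself changes.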

The repository comes configured to use pytest for unit testing the module code. Feel free to ignore it if you do not write module code.

GitHub Actions

You may optionally add a GitHub workflow file which runs the following checks:

  • Uses Ruff to check that files are formatted and linted
  • Runs unit tests and checks coverage
  • Checks that any Markdown files are formatted with markdownlint-cli2
  • Checks that all Jupyter notebooks are clean

Typos checks for common typos in code, aiming for a low false positive rate. The repository is configured not to use it for Jupyter notebook files, as it tends to find errors in cell outputs.

To test the template itself, use Copier and copier-template-tester.
