Merged
59 changes: 59 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,59 @@
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  workflow_dispatch:

permissions:
  contents: read

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  lint:
    name: Lint and format checks - Python ${{ matrix.python-version }}
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.12", "3.13", "3.14"]
    steps:
      - name: Checkout py-snappy
        uses: actions/checkout@v4

      - name: Install uv and Python ${{ matrix.python-version }}
        uses: astral-sh/setup-uv@v4
        with:
          enable-cache: true
          cache-dependency-glob: "pyproject.toml"
          python-version: ${{ matrix.python-version }}

      - name: Run all quality checks via tox
        run: uvx tox -e all-checks

  test:
    name: Tests - Python ${{ matrix.python-version }} on ${{ matrix.os }}
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest]
        python-version: ["3.12", "3.13", "3.14"]
    steps:
      - name: Checkout py-snappy
        uses: actions/checkout@v4

      - name: Install uv and Python ${{ matrix.python-version }}
        uses: astral-sh/setup-uv@v4
        with:
          enable-cache: true
          cache-dependency-glob: "pyproject.toml"
          python-version: ${{ matrix.python-version }}

      - name: Run tests via tox
        run: uvx tox -e pytest
152 changes: 152 additions & 0 deletions .gitignore
@@ -0,0 +1,152 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
Pipfile.lock

# poetry
poetry.lock

# pdm
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# IDEs
.idea/
.vscode/
*.swp
*.swo
*~
.DS_Store

# UV
.uv/

# Ruff
.ruff_cache/
81 changes: 81 additions & 0 deletions README.md
@@ -0,0 +1,81 @@
# py-snappy

Pure Python implementation of Google's Snappy compression algorithm.

## Features

- **Pure Python**: No external dependencies or C extensions required
- **Full compatibility**: Produces output compatible with Google's Snappy format
- **Well documented**: Extensive inline documentation explaining the algorithm
- **Thoroughly tested**: Comprehensive test suite using C++ Snappy test data

## Installation

```bash
uv sync
```

## Usage

```python
from src import compress, decompress

# Compress data
data = b"Hello, World!" * 100
compressed = compress(data)

# Decompress data
original = decompress(compressed)
assert original == data
```

## API

### Core Functions

- `compress(data: bytes) -> bytes`: Compress data using Snappy
- `decompress(data: bytes) -> bytes`: Decompress Snappy-compressed data

### Utilities

- `max_compressed_length(size: int) -> int`: Maximum possible compressed size
- `get_uncompressed_length(data: bytes) -> int`: Read uncompressed length from header
- `is_valid_compressed_data(data: bytes) -> bool`: Quick validation check
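
As a rough illustration of what the two length utilities compute, the sketch below assumes the standard Snappy conventions: a stream begins with the uncompressed length encoded as a little-endian base-128 varint, and the C++ reference implementation bounds the compressed size by `32 + n + n/6`. The helper names `read_varint_length` and `worst_case_compressed_length` are hypothetical and are not part of this package's API; the package's own functions may differ in detail.

```python
def read_varint_length(data: bytes) -> int:
    """Decode the little-endian base-128 varint at the start of ``data``.

    Hypothetical helper for illustration; not this package's API.
    """
    result = 0
    shift = 0
    for byte in data[:5]:  # a 32-bit length needs at most 5 varint bytes
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if byte & 0x80 == 0:  # a clear high bit marks the final byte
            return result
        shift += 7
    raise ValueError("truncated or overlong varint length prefix")


def worst_case_compressed_length(size: int) -> int:
    """Upper bound from the C++ reference implementation: 32 + n + n/6."""
    return 32 + size + size // 6


# The 1300-byte input from the Usage section would carry the prefix 0x94 0x0A.
print(read_varint_length(bytes([0x94, 0x0A])))
print(worst_case_compressed_length(1300))
```

Reading only the prefix is why a length query can be answered without decompressing anything.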

### Exceptions

- `SnappyDecompressionError`: Raised when decompression fails

## Development

```bash
# Install dependencies
uv sync

# Run tests
uv run pytest

# Run linter
uv run ruff check src/ tests/

# Format code
uv run ruff format src/ tests/
```

## Algorithm

Snappy is an LZ77-variant compression algorithm that prioritizes speed over compression ratio. Key characteristics:

- **Block-based**: Data is processed in 64KB blocks
- **Hash table**: O(1) match lookup using a simple hash function
- **Greedy matching**: No optimal parsing or lazy evaluation
- **Wire format**: Varint length prefix followed by literals and copy references
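
The hash-table and greedy-matching points above can be sketched as a toy tokenizer. This is a simplified illustration, not the code in this repository: it uses a `dict` keyed on the exact 4-byte sequence instead of a fixed-size hash table, ignores the 64KB block boundary and the per-element copy length limits, and emits Python tuples rather than the Snappy wire format. The names `greedy_tokens` and `expand_tokens` are hypothetical.

```python
MIN_MATCH = 4  # matches shorter than 4 bytes are not worth a copy element


def greedy_tokens(data: bytes) -> list[tuple]:
    """Split data into ("literal", bytes) and ("copy", offset, length) tokens."""
    table: dict[bytes, int] = {}  # 4-byte sequence -> most recent position
    tokens: list[tuple] = []
    literal_start = 0
    pos = 0
    while pos + MIN_MATCH <= len(data):
        key = data[pos:pos + MIN_MATCH]
        candidate = table.get(key)
        table[key] = pos
        if candidate is None:
            pos += 1  # no earlier occurrence; this byte stays a literal
            continue
        # Greedy: take the first candidate and extend it as far as it goes.
        length = MIN_MATCH
        while pos + length < len(data) and data[candidate + length] == data[pos + length]:
            length += 1
        if literal_start < pos:
            tokens.append(("literal", data[literal_start:pos]))
        tokens.append(("copy", pos - candidate, length))
        pos += length
        literal_start = pos
    if literal_start < len(data):
        tokens.append(("literal", data[literal_start:]))
    return tokens


def expand_tokens(tokens: list[tuple]) -> bytes:
    """Rebuild the original bytes from the token list."""
    out = bytearray()
    for token in tokens:
        if token[0] == "literal":
            out += token[1]
        else:
            _, offset, length = token
            for _ in range(length):  # byte by byte so overlapping copies work
                out.append(out[-offset])
    return bytes(out)


sample = b"abcdabcdabcd"
print(greedy_tokens(sample))
assert expand_tokens(greedy_tokens(sample)) == sample
```

Note that the copy in the sample is longer than its offset: the source and destination overlap, which is how a short repeating pattern turns into a single reference.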

## References

- [Google Snappy](https://github.com/google/snappy)
- [Format Description](https://github.com/google/snappy/blob/main/format_description.txt)

## License

MIT License