The philosophy of this project is focused on creating a high-performance solution for distributed data storage and database interactions. — Tsunami43
A distributed, replicated log storage system built on the Erlang VM (BEAM) using Elixir.
Shanghai provides:
- Durable Write-Ahead Log (WAL) with batched writes for high throughput
- Multi-master replication with credit-based flow control
- Cluster membership with heartbeat-based failure detection
- Built-in observability via telemetry and structured logging
- Production-ready operations with comprehensive tooling
Performance: 250,000+ writes/sec, <2ms P99 latency, eventual consistency.
The project prioritizes simplicity, observability, and operational excellence over complexity.
# Install dependencies
mix deps.get
mix compile
# Run tests
mix test
# Start Shanghai
iex -S mix
# Write to WAL
iex> {:ok, lsn} = Storage.WAL.Writer.append("Hello, Shanghai!")
{:ok, 1}
# Check cluster status
iex> Cluster.Membership.all_nodes()
[%Cluster.Entities.Node{...}]
Shanghai consists of four main subsystems:
Storage Layer: durable, sequential write-ahead log with batching support.
- Segment-based file layout (64 MB per segment)
- Batched writes for 60x throughput improvement
- CRC32 checksums for corruption detection
- Automatic compaction of old segments
Throughput: 250,000+ writes/sec (batched)
Latency: P99 < 2ms
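To illustrate how the CRC32 checksums listed above can catch corruption, here is a minimal sketch of record framing. The module name `WalFrameSketch` and the layout (4-byte length, 4-byte CRC32, payload) are assumptions made for this example; the actual on-disk format is defined in the WAL Protocol document and may differ.

```elixir
defmodule WalFrameSketch do
  @moduledoc """
  Hypothetical record framing: 4-byte length, 4-byte CRC32, then payload.
  An illustration only, not Shanghai's on-disk format.
  """

  # Wrap a payload with its length and CRC32 checksum.
  def encode(payload) when is_binary(payload) do
    <<byte_size(payload)::unsigned-32, :erlang.crc32(payload)::unsigned-32, payload::binary>>
  end

  # Decode one record from the front of a binary, verifying the checksum.
  def decode(<<len::unsigned-32, crc::unsigned-32, payload::binary-size(len), rest::binary>>) do
    if :erlang.crc32(payload) == crc do
      {:ok, payload, rest}
    else
      {:error, :corrupt}
    end
  end

  # Too few bytes for a full record, e.g. a write cut short by a crash.
  def decode(_incomplete), do: {:error, :truncated}
end
```

A reader that walks a segment with `decode/1` can stop at the first `{:error, _}` result and treat everything before it as the valid prefix of the log.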
Cluster Management: distributed membership management with failure detection.
- Heartbeat protocol (5-second intervals)
- Failure detection (suspect at 10s, down at 15s)
- Gossip dissemination for state propagation
- Event notifications for membership changes
Detection time: ~10-15 seconds
Scales to: 100+ nodes
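The thresholds above imply a simple rule: with heartbeats every 5 seconds, a node becomes suspect after two missed intervals (10s) and is marked down after three (15s). The sketch below expresses that rule as a pure function. The module name `FailureDetectorSketch` and the state atoms `:up`, `:suspect`, and `:down` are assumptions for illustration, not Shanghai's actual API.

```elixir
defmodule FailureDetectorSketch do
  # Thresholds taken from the text above, in milliseconds.
  @suspect_after_ms 10_000
  @down_after_ms 15_000

  # Classify a node by how long ago its last heartbeat arrived.
  def status(last_heartbeat_ms, now_ms) do
    elapsed = now_ms - last_heartbeat_ms

    cond do
      elapsed >= @down_after_ms -> :down
      elapsed >= @suspect_after_ms -> :suspect
      true -> :up
    end
  end
end
```

In a running system, both timestamps would likely come from a monotonic clock such as `System.monotonic_time(:millisecond)`, so that wall-clock adjustments do not trigger false suspicions.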
Replication: asynchronous multi-master replication with backpressure.
- Credit-based flow control prevents memory exhaustion
- Batch transmission for efficiency
- Automatic recovery from failures
- Lag monitoring via telemetry
Throughput: 50,000+ entries/sec (LAN)
Lag: <100ms under normal load
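Credit-based flow control can be sketched as simple sender-side bookkeeping: the sender may transmit only as many entries as it holds credits, and the receiver returns credits as it finishes processing. The module `CreditFlowSketch` and its function names are hypothetical and exist only for this example; Shanghai's replication internals may be structured differently.

```elixir
defmodule CreditFlowSketch do
  # Sender-side state: remaining credits and entries waiting to be sent.
  defstruct credits: 0, pending: []

  def new(initial_credits), do: %__MODULE__{credits: initial_credits}

  # Queue an entry for replication.
  def enqueue(%__MODULE__{} = s, entry), do: %{s | pending: s.pending ++ [entry]}

  # Take as many pending entries as credits allow; the rest keep waiting.
  def next_batch(%__MODULE__{credits: credits, pending: pending} = s) do
    {batch, rest} = Enum.split(pending, credits)
    {batch, %{s | credits: credits - length(batch), pending: rest}}
  end

  # The receiver returns credits once it has processed entries.
  def grant(%__MODULE__{} = s, n) when n > 0, do: %{s | credits: s.credits + n}
end
```

Because the sender never has more entries in flight than the receiver has granted, the receiver's buffer stays bounded. This sketch keeps waiting entries in an in-memory list for brevity; a log-backed system could instead re-read them from the WAL when credits arrive.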
Observability: built-in metrics, logging, and monitoring.
- Telemetry events for all operations
- Structured logging with correlation IDs
- Prometheus metrics export
- Admin HTTP API for status
See: Observability Guide
- Getting Started Guide - Installation, first app, cluster setup
- Examples - Event sourcing, counters, queues, and more
- Integration Guide - Embed Shanghai in your application
- Architecture Overview - System design and components
- WAL Protocol - File format specification
- Replication Protocol - Replication mechanics
- Cluster Protocol - Membership and failure detection
- Operations Guide - Production deployment and maintenance
- Performance Guide - Benchmarks and optimization
- Tuning Guide - Configuration recommendations
- Observability Guide - Monitoring and debugging
- API Reference - Complete API documentation
- Deprecations - Deprecated features and migration
- ADRs - Architecture decision records
- Simplicity over complexity - Choose simple, understandable designs
- Fail-fast philosophy - Crash and restart rather than inconsistent state
- Observable by default - Everything emits telemetry
- Location transparency - Distributed operations look like local ones
- Elixir 1.19 or later
- Erlang/OTP 27 or later
# Clone the repository
git clone <repository-url>
cd shanghai
# Get dependencies
mix deps.get
# Compile all apps
mix compile
# Run tests
mix test
# Format code
mix format
# Run quality checks
mix quality
# Start an interactive shell
iex -S mix
# Basic operations
iex> Query.write("user:1", %{name: "Alice", email: "alice@example.com"})
{:ok, :written}
iex> Query.read("user:1")
{:ok, %{name: "Alice", email: "alice@example.com"}}
iex> Query.delete("user:1")
{:ok, :deleted}
✅ Storage Layer
- Write-Ahead Log with segment management
- Batched writes (60x throughput improvement)
- Crash recovery with torn write detection
- Segment compaction
✅ Cluster Management
- Heartbeat-based failure detection
- Membership state management
- Event notification system
- Erlang distribution integration
✅ Replication
- Leader-follower replication
- Credit-based flow control
- Automatic backpressure
- Lag monitoring
✅ Observability
- Telemetry integration throughout
- Prometheus metrics export
- Structured logging
- Admin HTTP API
- CLI tools (shanghaictl)
✅ Production Ready
- Comprehensive documentation
- Performance benchmarks
- Operations guides
- Monitoring setup
This is currently a learning/research project. Contributions welcome as the project matures.