This project creates Go bindings for the Polars data manipulation library!
Polars is an open-source library for data manipulation, known for being one of the fastest data processing solutions on a single machine. It features a well-structured, typed API that is both expressive and easy to use.
https://github.com/pola-rs/polars
Note
Build Process & Security Considerations
The GitHub Actions runners cannot compile the Polars Rust bindings due to resource constraints, so binaries are currently compiled on a local development machine. While this isn't ideal from a security perspective, we've implemented several measures to ensure transparency and verifiability:
- Reproducible builds: All build scripts are available in ./scripts for review
- Checksum verification: Each binary release includes SHA256 and MD5 checksums
- Build transparency: Release notes include build environment details and dependency versions
- Self-compilation option: You can always build from source using ./build.sh
To verify a binary download:
# Download the checksum file and verify
sha256sum -c libpolars_go-linux-amd64-v0.1.0.so.sha256
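On systems without sha256sum, the same check can be done in Go. The following is a minimal sketch, not part of go-polars: the helper names (sha256Hex, parseChecksumLine, verify) are hypothetical, and it assumes the checksum file uses the usual sha256sum layout of a hex digest followed by the file name on its first line.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"strings"
)

// sha256Hex returns the lowercase hex SHA-256 digest of data.
func sha256Hex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// parseChecksumLine splits one "<digest>  <file>" line as written by sha256sum.
// A leading "*" on the file name (binary mode) is dropped.
// File names containing spaces are not handled in this sketch.
func parseChecksumLine(line string) (digest, name string, err error) {
	fields := strings.Fields(line)
	if len(fields) != 2 {
		return "", "", fmt.Errorf("malformed checksum line: %q", line)
	}
	return strings.ToLower(fields[0]), strings.TrimPrefix(fields[1], "*"), nil
}

// verify returns an error if data does not hash to the expected digest.
func verify(data []byte, want string) error {
	if got := sha256Hex(data); got != want {
		return fmt.Errorf("checksum mismatch: got %s, want %s", got, want)
	}
	return nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: verify <file.sha256>")
		return
	}
	raw, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Only the first line of the checksum file is checked.
	first := strings.SplitN(strings.TrimSpace(string(raw)), "\n", 2)[0]
	want, name, err := parseChecksumLine(first)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// The binary is looked up relative to the current directory.
	data, err := os.ReadFile(name)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if err := verify(data, want); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%s: OK\n", name)
}
```

Saved under a file name of your choice (verify.go, say), it could be run as go run verify.go libpolars_go-linux-amd64-v0.1.0.so.sha256 from the directory that holds the downloaded binary.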
For the easiest setup experience, use our setup script that downloads both the package and precompiled binary:
curl -sSL https://raw.githubusercontent.com/jordandelbar/go-polars/main/scripts/setup.sh | sh
This script will:
- Download and set up the polars package in your project
- Download the precompiled binary for your platform
- Configure your Go module with the necessary replace directives
- Create an example file to test your installation
package main
import (
	"fmt"
	"github.com/jordandelbar/go-polars/polars"
)
func main() {
	df, err := polars.ReadCSV("data.csv")
	if err != nil {
		panic(err)
	}
	fmt.Println(df.String())
}
Available for:
- Linux x86_64
Coming soon:
- macOS x86_64 and ARM64
- Windows x86_64
If pre-compiled binaries aren't available for your platform:
Prerequisites:
- Rust: Install from rustup.rs
- Build tools: build-essential (Ubuntu) or equivalent
git clone https://github.com/jordandelbar/go-polars
cd go-polars
./build.sh
go-polars supports comparison, mathematical, and logical expression operations for data manipulation:
Comparison operations:
- Gt(value) - Greater than
- Lt(value) - Less than
- Eq(value) - Equal to
- Ne(value) - Not equal to
- Ge(value) - Greater than or equal to
- Le(value) - Less than or equal to
Mathematical operations:
- Add(expr) / AddValue(value) - Addition
- Sub(expr) / SubValue(value) - Subtraction
- Mul(expr) / MulValue(value) - Multiplication
- Div(expr) / DivValue(value) - Division
Logical operations:
- And(expr) - Logical AND
- Or(expr) - Logical OR
- Not() - Logical NOT
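To make the composition rules concrete, here is a toy model in plain Go of how comparison, mathematical, and logical expressions can be chained and then evaluated row by row. It is only an illustration under stated assumptions: the Row, Expr, Pred, and Filter definitions are hypothetical, handle float64 columns only, and say nothing about how go-polars implements its expressions.

```go
package main

import "fmt"

// Row is one record in this toy model: column name -> numeric value.
type Row map[string]float64

// Expr is a hypothetical value expression: it yields one number per row.
type Expr func(Row) float64

// Pred is a hypothetical boolean expression: it yields true or false per row.
type Pred func(Row) bool

// Col builds an expression that reads the named column.
func Col(name string) Expr {
	return func(r Row) float64 { return r[name] }
}

// Add combines two expressions; MulValue scales an expression by a constant.
func (e Expr) Add(o Expr) Expr {
	return func(r Row) float64 { return e(r) + o(r) }
}
func (e Expr) MulValue(v float64) Expr {
	return func(r Row) float64 { return e(r) * v }
}

// Gt and Ge turn a value expression into a boolean one.
func (e Expr) Gt(v float64) Pred {
	return func(r Row) bool { return e(r) > v }
}
func (e Expr) Ge(v float64) Pred {
	return func(r Row) bool { return e(r) >= v }
}

// And, Or, and Not combine boolean expressions.
func (p Pred) And(o Pred) Pred {
	return func(r Row) bool { return p(r) && o(r) }
}
func (p Pred) Or(o Pred) Pred {
	return func(r Row) bool { return p(r) || o(r) }
}
func (p Pred) Not() Pred {
	return func(r Row) bool { return !p(r) }
}

// Filter keeps the rows for which the predicate is true.
func Filter(rows []Row, p Pred) []Row {
	var kept []Row
	for _, r := range rows {
		if p(r) {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	rows := []Row{
		{"age": 17, "score": 95},
		{"age": 30, "score": 82},
		{"age": 45, "score": 60},
	}
	// Only the second row is older than 18 and scores at least 80.
	adults := Filter(rows, Col("age").Gt(18).And(Col("score").Ge(80)))
	fmt.Println(len(adults)) // prints 1
}
```

Nothing is computed while the expression is being built; the chain only describes the work, and evaluation happens when Filter applies it to each row.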
go-polars provides GroupBy operations for aggregating data by one or more columns:
- GroupBy(columns...) - Group data by one or more columns
- Count() - Count rows per group
- Sum(column) - Sum values per group
- Mean(column) - Calculate mean per group
- Min(column) - Find minimum per group
- Max(column) - Find maximum per group
- Std(column) - Calculate standard deviation per group
- Agg(expressions...) - Custom aggregations with multiple expressions
Aggregation expressions:
- Col("column").Sum() - Sum aggregation expression
- Col("column").Mean() - Mean aggregation expression
- Col("column").Min() - Minimum aggregation expression
- Col("column").Max() - Maximum aggregation expression
- Col("column").Std() - Standard deviation aggregation expression
- Count() - Count aggregation expression
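For readers new to group-by semantics, the sketch below shows in plain Go what grouping by one column and computing count, sum, mean, min, and max per group amounts to. It does not use go-polars: the Employee and Stats types and the groupByDepartment helper are hypothetical names chosen for illustration, and standard deviation is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// Employee is a hypothetical input row.
type Employee struct {
	Department string
	Salary     float64
}

// Stats accumulates the aggregates for one group.
type Stats struct {
	Count int
	Sum   float64
	Min   float64
	Max   float64
}

// Mean is derived from the running sum and count.
func (s Stats) Mean() float64 { return s.Sum / float64(s.Count) }

// groupByDepartment makes a single pass over rows and updates the
// aggregates of the group each row belongs to.
func groupByDepartment(rows []Employee) map[string]Stats {
	groups := make(map[string]Stats)
	for _, r := range rows {
		s, seen := groups[r.Department]
		if !seen {
			// The first row of a group seeds its min and max.
			s.Min, s.Max = r.Salary, r.Salary
		}
		s.Count++
		s.Sum += r.Salary
		if r.Salary < s.Min {
			s.Min = r.Salary
		}
		if r.Salary > s.Max {
			s.Max = r.Salary
		}
		groups[r.Department] = s
	}
	return groups
}

func main() {
	rows := []Employee{
		{"eng", 100},
		{"eng", 140},
		{"ops", 90},
	}
	groups := groupByDepartment(rows)

	// Map iteration order is random in Go, so sort the keys for stable output.
	names := make([]string, 0, len(groups))
	for name := range groups {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		s := groups[name]
		fmt.Printf("%s count=%d mean=%.1f min=%.0f max=%.0f\n",
			name, s.Count, s.Mean(), s.Min, s.Max)
	}
}
```

Each group yields one output row, which is why an aggregation over GroupBy("department") returns as many rows as there are distinct departments.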
import "github.com/jordandelbar/go-polars/polars"
// Load data
df, err := polars.ReadCSV("data.csv")
if err != nil {
	panic(err)
}
// Comparison operations
filtered := df.Filter(polars.Col("age").Gt(25))
equals := df.Filter(polars.Col("score").Eq(100))
// Mathematical operations
df = df.WithColumns(
	polars.Col("price").MulValue(1.1).Alias("price_with_tax"),
	polars.Col("length").Add(polars.Col("width")).Alias("perimeter"),
)
// Logical operations
eligible := df.Filter(
	polars.Col("age").Gt(18).And(polars.Col("score").Ge(80)),
)
// Chaining operations
result := df.
	Filter(polars.Col("age").Gt(18).And(polars.Col("score").Ge(80))).
	WithColumns(polars.Col("salary").MulValue(1.05).Alias("new_salary")).
	Select(polars.Col("name"), polars.Col("new_salary"))
// GroupBy operations
groupedData := df.GroupBy("department")
countResult := groupedData.Count()
avgSalary := groupedData.Mean("salary")
// Complex aggregations
stats := df.GroupBy("department").Agg(
	polars.Col("salary").Mean().Alias("avg_salary"),
	polars.Col("salary").Max().Alias("max_salary"),
	polars.Col("salary").Min().Alias("min_salary"),
	polars.Count().Alias("employee_count"),
)
Get started with simple DataFrame operations:
make run-basic-example
Run the full-featured example with complex operations:
make run-expressions-example
Run the GroupBy and aggregation operations demo:
make run-groupby-example
- make local-build - Build the library from source (smart build)
- make force-build - Force rebuild even if up to date
- make quick-build - Smart build (only rebuilds if needed)
- make run-basic-example - Run basic DataFrame demo
- make run-expressions-example - Run expression operations demo
- make run-groupby-example - Run GroupBy and aggregation demo
- make run-all-examples - Run all examples
# Run all tests
make test
# Quick test run
make test-short
# Test specific functionality
make test-groupby
# Performance benchmarks
make test-bench
# Generate coverage report
make test-coverage
# View coverage in browser
make view-coverage
# Development cycle (quick build + short tests)
make dev
- Join operations
- Data type conversions: Cast()
- Schema inspection
- Null handling: IsNull(), IsNotNull(), FillNull()
- Advanced Aggregations: Median(), ...
- Window functions
- Pivot & Reshape options
- Additional I/O Formats: ReadJSON(), WriteJSON(), ...
- When/Otherwise logic
- Data Quality & Validation: IsEmpty(), ...
- Fork the repository
- Build locally: ./build.sh
- Test your changes: make test
- Submit a pull request
This project is licensed under the MIT License. See the LICENSE file for details.