mchav/dataframe

User guide | Discord

DataFrame

A fast, safe, and intuitive DataFrame library.

Why use this DataFrame library?

  • Encourages concise, declarative, and composable data pipelines.
  • Static typing makes code easier to reason about and catches many bugs at compile time—before your code ever runs.
  • Delivers high performance thanks to Haskell’s optimizing compiler and efficient memory model.
  • Designed for interactivity: expressive syntax, helpful error messages, and sensible defaults.
  • Works seamlessly in both command-line and notebook environments—great for exploration and scripting alike.

Features

  • Type-safe column operations with compile-time guarantees.
  • Familiar, approachable API designed to feel natural to users coming from other languages.
  • Interactive REPL for data exploration and plotting.

Quick start

Browse through some examples in Binder or in our playground.

Install

Cabal

To use the CLI tool:

$ cabal update
$ cabal install dataframe
$ dataframe

To use it as a project dependency, add dataframe to the build-depends field of your .cabal file.

Stack

Add to your package.yaml dependencies:

dependencies:
  - dataframe

If the package is not in your resolver's snapshot, also add it to extra-deps in stack.yaml.

Example

dataframe> df = D.fromNamedColumns [("product_id", D.fromList [1,1,2,2,3,3]), ("sales", D.fromList [100,120,50,20,40,30])]
dataframe> df
------------------
product_id | sales
-----------|------
   Int     |  Int 
-----------|------
1          | 100  
1          | 120  
2          | 50   
2          | 20   
3          | 40   
3          | 30   

dataframe> :exposeColumns df
"product_id :: Expr Int"
"sales :: Expr Int"
dataframe> df |> D.groupBy [F.name product_id] |> D.aggregate [F.sum sales `as` "total_sales"]
------------------------
product_id | total_sales
-----------|------------
   Int     |     Int    
-----------|------------
1          | 220        
2          | 70         
3          | 70         
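
To make the group-and-sum step above concrete, here is a minimal sketch in plain Haskell that reproduces the same totals without the dataframe library. It is not how the library implements groupBy or aggregate. It only illustrates the semantics, assuming the containers package is available. The names rows and totalSales are hypothetical and chosen for this illustration.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for the two columns in the example above:
-- each row is a (product_id, sales) pair.
rows :: [(Int, Int)]
rows = zip [1, 1, 2, 2, 3, 3] [100, 120, 50, 20, 40, 30]

-- Group rows by product_id and sum the sales within each group.
-- fromListWith combines values that share a key using (+);
-- toAscList returns the groups ordered by product_id.
totalSales :: [(Int, Int)] -> [(Int, Int)]
totalSales = Map.toAscList . Map.fromListWith (+)

main :: IO ()
main = mapM_ print (totalSales rows)
```

Running main prints one (product_id, total_sales) pair per group, matching the three rows in the table above.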

Documentation