28 changes: 5 additions & 23 deletions README.md
@@ -10,7 +10,7 @@

## Why `cosinor-lite`?

- **Unified workflow for live cell data**: Perform detrending and fit cosinor models with a fixed 24 h period, a free period, or a damped oscillation, using consistent APIs across data modalities (e.g. bioluminescence, cytokines).
- **Unified workflow for live cell data**: Perform detrending and fit cosinor models with a fixed 24 h period, a free period, or a damped oscillation, using consistent APIs across data modalities (e.g. bioluminescence, cytokines, qPCR).
- **Differential rhythmicity across any omics type**: Compare two conditions for differential rhythmicity in any omics data type using BIC model-selection strategies that are widely used in the circadian literature.
- **Create publication-ready analysis with a simple UI**: Launch a browser-based analysis console for visual quality control, plots, and parameter export.

@@ -42,6 +42,8 @@
Here is an adaptation of their figure explaining the methodology:
<img src="./images/model_selection.png" alt="Model selection" style="width:60%;"/>
</div>

The tool is very similar to the dryR package in R, but implemented in Python for easier integration with other Python-based data analysis pipelines. See the dryR repository (https://github.com/naef-lab/dryR/tree/master) for more details, or if you prefer working in R.

For condition 1 (e.g. alpha cells) and condition 2 (e.g. beta cells), we fit 5 different models (an illustrative sketch of the comparison follows the list below):

- Model 1) Arrhythmic in alpha and beta cells
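
To make the comparison concrete, here is an illustrative sketch on synthetic data, using ordinary least squares and the standard Gaussian BIC. It is not the package's internal implementation, and the second model shown (rhythmic in condition 1 only) is just one of the remaining combinations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: one gene sampled every 4 h over 24 h, two replicates per condition.
t = np.tile(np.arange(0, 24, 4), 2)
y_alpha = 5 + 2 * np.cos(2 * np.pi * (t - 8) / 24) + rng.normal(0, 0.3, t.size)
y_beta = 5 + rng.normal(0, 0.3, t.size)


def bic(t: np.ndarray, y: np.ndarray, rhythmic: bool) -> float:
    """BIC of a mean-only or mean + 24 h cosinor model fit by least squares."""
    X = np.ones((t.size, 1))
    if rhythmic:
        X = np.column_stack([X, np.cos(2 * np.pi * t / 24), np.sin(2 * np.pi * t / 24)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    n, k = t.size, X.shape[1]
    return n * np.log(rss / n) + k * np.log(n)


# Two of the five model combinations; the others pair the per-condition fits
# analogously (a shared-parameter rhythmic model is fit jointly on both conditions).
bic_model1 = bic(t, y_alpha, rhythmic=False) + bic(t, y_beta, rhythmic=False)  # arrhythmic in both
bic_model2 = bic(t, y_alpha, rhythmic=True) + bic(t, y_beta, rhythmic=False)   # rhythmic in condition 1 only
print({"model 1": bic_model1, "model 2": bic_model2})  # the smallest BIC wins
```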
@@ -68,8 +70,6 @@
uv sync
source .venv/bin/activate
```

> **Tip:** The project targets Python 3.11+. Using `uv` keeps package resolution reproducible via `uv.lock`.

## Quick Start

### Launch the interactive app
@@ -102,32 +102,14 @@
fit = analysis.fit(group="treated", model="damped")
print(fit.parameters)
```
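
For live-cell data, the detrending helper from the tutorial notebook fits into the same workflow. A minimal sketch, assuming `dataset` is a `LiveCellDataset` that has already been built from one of the example CSVs (its construction is omitted here):

```python
# Detrend group 1 with a moving-average window and export the results.
# plot_group_data returns the figure plus temporary PDF and CSV paths;
# the CSV holds the detrended traces in an input-like layout.
fig, pdf_path, csv_path = dataset.plot_group_data(
    "group1",
    method="moving_average",
    window=240,
    plot_style="line",
)
print(pdf_path)
print(csv_path)
```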

## Repository Layout

```
src/cosinor_lite/ # Core library modules and models
app.py # Gradio application entry point
tests/ # Pytest suites (unit, integration, fixtures)
data/ # Example datasets for the app
images/ # Illustration assets used in the UI
```

## Development Workflow

```bash
# run style, type, and security hooks
pre-commit run --all-files

# execute unit tests
uv run pytest
```

## Sample Datasets

The `data/` directory includes curated examples to help you get started:

- `bioluminescence_example.csv`: Bioluminescence time series for live-cell analysis.
- `cytokine_example.csv`: An alternative CSV with the same schema for cosinor fitting using cytokine data.
- `cytokine_example.csv`: Cytokine time series for live-cell analysis.
- `qpcr_example.csv`: qPCR time series for live-cell analysis.
- `GSE95156_Alpha_Beta.txt`: RNA-seq data used in the omics differential rhythmicity workflow.

Each file is formatted for drag & drop into the `app.py` interface as well as for loading through the library APIs.
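
As a quick sanity check of that layout, the example CSVs can also be read directly with pandas. This is only a sketch (the library loaders are the intended entry point), and the path assumes you run it from the repository root:

```python
import pandas as pd

# The example CSVs store metadata in the first three rows (Participant ID,
# Replicate, Group); each row below holds one time point (hours) per sample column.
raw = pd.read_csv("data/qpcr_example.csv", header=None, index_col=0)
metadata = raw.iloc[:3]                    # participant / replicate / group labels per column
measurements = raw.iloc[3:].astype(float)  # time-by-sample matrix, NaN for missing values
print(metadata)
print(measurements.head())
```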
376 changes: 275 additions & 101 deletions app.py

Large diffs are not rendered by default.

14,028 changes: 14,028 additions & 0 deletions data/GSE95156_Alpha_Beta.csv

Large diffs are not rendered by default.

10 changes: 10 additions & 0 deletions data/qpcr_example.csv
@@ -0,0 +1,10 @@
Participant ID,1,1,2,2,3,3,4,5,5,6,6,7,7,8,8,9,10,11
Replicate,1,2,1,2,1,2,1,1,2,1,2,1,2,1,2,1,1,1
Group,Group 1,Group 1,Group 1,Group 1,Group 1,Group 1,Group 1,Group 1,Group 1,Group 2,Group 2,Group 2,Group 2,Group 2,Group 2,Group 2,Group 1,Group 2
12,0.1521,0.1157,0.2137,0.2373,,0.1287,0.2434,0.3844,0.5509,0.1456,0.1192,0.2279,0.2544,0.1478,0.165,0.2433,0.0757,0.169
16,0.2682,0.235,0.522,0.5362,0.2034,0.1663,0.5834,0.7298,0.5026,0.1357,0.1508,0.357,0.3393,0.2616,0.2432,0.2816,0.1228,0.2285
20,0.2328,0.2364,0.7227,0.6,0.2074,0.1887,0.6815,0.6507,0.7113,0.1493,0.1641,0.4117,,0.3647,0.2158,0.2677,0.1582,0.2619
24,0.1691,0.1394,0.4334,0.4078,0.1752,0.2141,0.4122,0.5008,0.5088,0.1237,0.1432,0.3894,,0.2779,0.2997,0.2772,0.1534,0.3343
28,0.078,0.065,0.1708,0.1662,0.1165,0.145,0.1788,0.2429,0.2548,0.0888,0.0764,0.1327,0.1392,0.1387,0.103,0.1846,0.0791,
32,0.0546,0.062,0.1118,0.111,0.1157,0.088,0.1201,0.1236,0.115,0.0656,0.0867,0.1163,0.1133,0.0948,0.1096,0.1236,0.0582,0.1021
36 ,0.0784,0.0719,0.1551,0.1334,0.1091,0.1146,0.1724,0.1765,0.2084,0.1214,,0.1048,0.1442,0.2728,0.2212,0.2097,0.072,0.1603
10 changes: 8 additions & 2 deletions notebooks/live_cell_tutorial.ipynb
@@ -113,7 +113,8 @@
"- poly2 (linear + quadratic terms)\n",
"- moving_average (need to provide number of window steps)\n",
"\n",
"As measurements are every 10 mins, we'll use a moving_average window of 240 (24 hour moving average window)."
"As measurements are every 10 mins, we'll use a moving_average window of 240 (24 hour moving average window).\n",
"The helper returns a matplotlib figure along with temporary PDF and CSV paths so you can download both the plot and the detrended traces."
]
},
{
@@ -123,7 +124,12 @@
"metadata": {},
"outputs": [],
"source": [
"fig, tmp_path = dataset.plot_group_data(\"group1\", method=\"moving_average\", window=240, plot_style=\"line\")"
"fig, tmp_path, csv_path = dataset.plot_group_data(\n",
" \"group1\",\n",
" method=\"moving_average\",\n",
" window=240,\n",
" plot_style=\"line\",\n",
")"
]
},
{
30 changes: 18 additions & 12 deletions notebooks/omics_tutorial.ipynb
@@ -72,8 +72,9 @@
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import pandas as pd\n",
"from cosinor_lite.datasets import OmicsDataset\n",
"from cosinor_lite.differential_rhytmicity_omics import DifferentialRhythmicity, OmicsHeatmap, TimeSeriesExample\n",
"\n",
"from cosinor_lite.omics_dataset import OmicsDataset\n",
"from cosinor_lite.omics_differential_rhytmicity import DifferentialRhythmicity, OmicsHeatmap, TimeSeriesExample\n",
"\n",
"plt.rcParams.update(\n",
" {\n",
@@ -210,8 +211,9 @@
" t_cond1=t_cond1,\n",
" t_cond2=t_cond2,\n",
" deduplicate_on_init=True,\n",
" log2_transform=False,\n",
")\n",
"rna_data.expression_histogram()"
"fig = rna_data.expression_histogram()"
]
},
{
@@ -231,15 +233,15 @@
"metadata": {},
"outputs": [],
"source": [
"rna_data.replicate_scatterplot(sample1=\"ZT_0_a_1\", sample2=\"ZT_0_a_2\")"
"fig = rna_data.replicate_scatterplot(sample1=\"ZT_0_a_1\", sample2=\"ZT_0_a_2\")"
]
},
{
"cell_type": "markdown",
"id": "13",
"metadata": {},
"source": [
"In some cases this may reveal that measurements below a certain threshold look visually less correlated between samples. Here we will defines genes as non-expressed using the threshold of <0, as used in the original 2017 article."
"In some cases this may reveal that measurements below a certain threshold look visually less correlated between samples. Here we will defines genes as non-expressed using a mean expression cut-off of 0 and requiring at least two detected measurements per condition, matching the original 2017 article."
]
},
{
@@ -249,7 +251,7 @@
"metadata": {},
"outputs": [],
"source": [
"rna_data.add_is_expressed(mean_min=0)"
"rna_data.add_is_expressed(mean_min=0, num_detected_min=2)"
]
},
{
@@ -302,7 +304,7 @@
" cond2_label=\"Beta cells\",\n",
" show_unexpressed=False,\n",
")\n",
"heatmap.plot_heatmap()"
"fig = heatmap.plot_heatmap()"
]
},
{
@@ -332,12 +334,16 @@
" cond2_label=\"Beta cells\",\n",
")\n",
"\n",
"# example.plot_time_series(\"Scg2\", xticks=np.arange(0, 25, 4))\n",
"# example.plot_time_series(\"Upk3a\", xticks=np.arange(0, 25, 4))\n",
"# example.plot_time_series(\"Kcnk3\", xticks=np.arange(0, 25, 4))\n",
"\n",
"example.plot_time_series(\"Ntrk1\", xticks=np.arange(0, 25, 4))"
"example.plot_time_series(\"Arntl\", xticks=np.arange(0, 25, 4))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "21",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
4 changes: 2 additions & 2 deletions run.sh
@@ -26,7 +26,7 @@ function test:quick {

# execute tests against the installed package; assumes the wheel is already installed
function test:ci {
INSTALLED_PKG_DIR="$(python -c 'import cosinor_lite; print(cosinor_lite.__path__[0])')"
INSTALLED_PKG_DIR="$(uv run python -c 'import cosinor_lite; print(cosinor_lite.__path__[0])')"
# in CI, we must calculate the coverage for the installed package, not the src/ folder
COVERAGE_DIR="$INSTALLED_PKG_DIR" run-tests
}
@@ -47,7 +47,7 @@ function run-tests {
}

function serve-coverage-report {
python -m http.server 8000 \
uv run python -m http.server 8000 \
--bind 127.0.0.1 \
--directory "$THIS_DIR/test-reports/htmlcov/"
}
58 changes: 46 additions & 12 deletions src/cosinor_lite/livecell_dataset.py
@@ -50,7 +50,7 @@ class LiveCellDataset:
poly2_trend(x, y): Fits a second-order polynomial trend to the data.
moving_average_trend(x, y, window): Computes a moving average trend.
get_trend(x, y, method, window): Applies the specified detrending method.
plot_group_data(group, method, window, m, plot_style): Plots time series data for the specified group.
plot_group_data(group, method, window, m, plot_style): Plots time series data for the specified group and exports detrended values.

"""

@@ -247,7 +247,7 @@ def moving_average_trend(
y_series = pd.Series(y)
ma_fit = y_series.rolling(window=window, center=True).mean().to_numpy()
good = np.isfinite(x) & np.isfinite(ma_fit)
y_detrended = y[good] - ma_fit[good] + np.mean(y[good])
y_detrended = y[good] - ma_fit[good] # + np.mean(y[good])
return x[good], ma_fit[good], y_detrended

def get_trend(
@@ -301,14 +301,14 @@ def get_trend(
msg = f"Unknown detrending method: {method}"
raise ValueError(msg)

def plot_group_data(
def plot_group_data( # noqa: PLR0915
self,
group: str,
method: str = "linear",
window: int = 5,
m: int = 5,
plot_style: str = "scatter",
) -> tuple[plt.Figure, str]:
) -> tuple[plt.Figure, str, str]:
"""
Plots data for a specified group, with options for trend fitting and plot style.

@@ -332,6 +332,8 @@ def plot_group_data(
The matplotlib Figure object containing the plots.
tmp_path : str
The path to the saved temporary PDF file of the figure.
csv_path : str
The path to a temporary CSV containing detrended data in input-like layout.

Raises
------
@@ -340,32 +342,39 @@

"""
if group == "group1":
ids, _replicates, data = self.get_group1_ids_replicates_data()
ids, replicates, data = self.get_group1_ids_replicates_data()
color = self.color_group1
group_label = self.group1_label
elif group == "group2":
ids, _replicates, data = self.get_group2_ids_replicates_data()
ids, replicates, data = self.get_group2_ids_replicates_data()
color = self.color_group2
group_label = self.group2_label
else:
msg = "group must be 'group1' or 'group2'"
raise ValueError(msg)

n_group = len(np.unique(ids))
ids_array = np.array(ids)
replicates_array = np.array(replicates)
n_group = len(np.unique(ids_array))
n_cols = m
n_rows = int(np.ceil(n_group / n_cols))

study_list = np.unique(ids).tolist()
study_list = np.unique(ids_array).tolist()
fig = plt.figure(figsize=(5 * n_cols / 2.54, 5 * n_rows / 2.54))

detrended_matrix = np.full_like(data, np.nan, dtype=float)

for i, id_curr in enumerate(study_list):
mask = np.array(ids) == id_curr
mask = ids_array == id_curr
n_reps = int(np.sum(mask))
ax = fig.add_subplot(n_rows, n_cols, i + 1)

col_indices = np.where(mask)[0]

for j in range(n_reps):
x = self.time
y = data[:, mask][:, j]
col_idx = col_indices[j]
y = data[:, col_idx]

if plot_style == "scatter":
ax.scatter(x, y, s=4, alpha=0.8, color=color)
Expand All @@ -376,12 +385,18 @@ def plot_group_data(
x_fit = x[valid_mask]
y_fit = y[valid_mask]

x_processed, trend, _y_detrended = self.get_trend(
x_processed, trend, y_detrended = self.get_trend(
x_fit,
y_fit,
method=method,
window=window,
)
detrended_full = np.full_like(x, np.nan, dtype=float)
for x_val, y_val in zip(x_processed, y_detrended, strict=False):
idx_candidates = np.where(np.isclose(x, x_val))[0]
if idx_candidates.size:
detrended_full[idx_candidates[0]] = y_val
detrended_matrix[:, col_idx] = detrended_full
if method != "none":
ax.plot(
x_processed,
Expand All @@ -403,4 +418,23 @@ def plot_group_data(

fig.savefig(tmp_path)

return fig, tmp_path
csv_fd, csv_path = tempfile.mkstemp(suffix=".csv")
os.close(csv_fd)

column_labels = [f"col_{i + 1}" for i in range(detrended_matrix.shape[1])]
table_rows = [
np.array(ids_array, dtype=object),
np.array(replicates_array, dtype=object),
np.array([group_label] * detrended_matrix.shape[1], dtype=object),
]
table_rows.extend(detrended_matrix.astype(object))
index_labels = [
"participant_id",
"replicate",
"group",
*[float(t) for t in self.time],
]
export_df = pd.DataFrame(table_rows, index=index_labels, columns=column_labels)
export_df.to_csv(csv_path, index=True, header=False, na_rep="")

return fig, tmp_path, csv_path
18 changes: 15 additions & 3 deletions src/cosinor_lite/omics_dataset.py
@@ -47,6 +47,8 @@ class OmicsDataset:
Display label for condition 2, by default ``"cond2"``.
deduplicate_on_init : bool, optional
Whether to deduplicate genes immediately after initialization, by default ``False``.
log2_transform : bool, optional
Whether to apply a log2(x + 1) transform to expression columns on init, by default ``False``.

"""

@@ -60,6 +62,7 @@
cond2_label: str = "cond2"

deduplicate_on_init: bool = False
log2_transform: bool = False

@field_validator("t_cond1", "t_cond2", mode="before")
@classmethod
@@ -115,12 +118,21 @@ def _check_columns(self) -> Self:

def __post_init__(self) -> None:
"""Populate derived columns and optionally deduplicate entries."""
if self.log2_transform:
self._apply_log2_transform()
self.add_detected_timepoint_counts()
self.add_mean_expression()
self.add_number_detected()
if self.deduplicate_on_init:
self.deduplicate_genes()

def _apply_log2_transform(self) -> None:
"""Apply a log2(x + 1) transform to measurement columns."""
measurement_cols = self.columns_cond1 + self.columns_cond2
numeric = self.df[measurement_cols].apply(pd.to_numeric, errors="coerce")
transformed = np.log2(numeric + 1.0)
self.df.loc[:, measurement_cols] = transformed

def detected_timepoint_counts(self, cond: str) -> list[int]:
"""
Count detected time points per gene for the requested condition.
@@ -165,8 +177,8 @@ def add_detected_timepoint_counts(self) -> None:

def add_mean_expression(self) -> None:
"""Compute mean expression for both conditions and store in the DataFrame."""
self.df["mean_cond1"] = self.df[self.columns_cond1].mean(axis=1)
self.df["mean_cond2"] = self.df[self.columns_cond2].mean(axis=1)
self.df["mean_cond1"] = self.df[self.columns_cond1].mean(axis=1, skipna=True)
self.df["mean_cond2"] = self.df[self.columns_cond2].mean(axis=1, skipna=True)

def add_number_detected(self) -> None:
"""Store the count of non-null measurements for each condition."""
@@ -247,7 +259,7 @@ def expression_histogram(self, bins: int = 20) -> plt.Figure:

"""
print(plt.rcParams["font.size"])
fig = plt.figure(figsize=(6 / 2.54, 12 / 2.54))
fig = plt.figure(figsize=(8 / 2.54, 12 / 2.54))
plt.subplot(2, 1, 1)
plt.hist(self.df["mean_cond1"].to_numpy().flatten(), bins=bins)
plt.xlabel("Mean Expression")