Bug fix - to handle "u: list of array-like, shape (n_samples, n_co… #605

YaadR · 2025-02-23T15:57:31Z

Bug fix - to handle "u: list of array-like, shape (n_samples, n_control_features)" input - I'll elaborate more in the PR comment section

As documented, model.fit() accepts a 'list of array-like, shape (n_samples, n_control_features)'. When X_train is passed this way, all the associated data must be passed as sequences (lists) as well, e.g. [t_train], [x_dot] and [u_train]. Two problems arise with u_train: the code reshapes it incorrectly, and another section calls X_train.shape, a numpy.array attribute that does not exist on the Python 'list' type. Both fixes allow the code to run correctly and do not affect other library features.

In the code, x_train_1 and x_train_2 have different lengths, to demonstrate the use of a Python 'list' specifically rather than a numpy.array(), which is constrained to a rectangular (non-ragged) shape.
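To make the two failure points concrete, here is a minimal sketch using only NumPy. It is not the PySINDy implementation: `as_2d_controls` is a hypothetical helper, written under the assumption that each flat control array should become a single-feature column so that it matches the documented (n_samples, n_control_features) shape.

```python
import numpy as np


def as_2d_controls(u_list):
    """Hypothetical helper (not part of PySINDy): give every trajectory's
    control array an explicit (n_samples, n_control_features) shape."""
    fixed = []
    for u in u_list:
        u = np.asarray(u)
        if u.ndim == 1:
            # A flat array is treated as a single control feature.
            u = u.reshape(-1, 1)
        fixed.append(u)
    return fixed


# Two trajectories of different lengths, as in the reproduction script.
u_train = [np.zeros(101), np.zeros(100)]

# A Python list has no .shape attribute, unlike a numpy array.
list_has_shape = hasattr(u_train, "shape")

# Reshape per trajectory instead of reshaping the list as a whole.
u_fixed = as_2d_controls(u_train)
shapes = [u.shape for u in u_fixed]
print(list_has_shape, shapes)
```

Reshaping each trajectory separately avoids ever stacking the ragged list into a single array, which is the step that cannot work when the trajectories differ in length.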

The code that reproduces the problem

#!/usr/bin/env python3
import numpy as np # numpy==1.26.4
import pysindy as ps # pysindy==1.7.5

def main():
    # 1. Create sample data to mimic your shape conditions.
    #    Two trajectories: (2, 101) and (2, 100).
    t_train_1 = np.linspace(0, 1, 101)
    t_train_2 = np.linspace(0, 1, 100)

    x_train_1 = np.vstack([
        np.sin(2*np.pi*t_train_1),
        np.cos(2*np.pi*t_train_1)
    ])  # shape (2, 101)

    x_train_2 = np.vstack([
        np.sin(2*np.pi*t_train_2),
        np.cos(2*np.pi*t_train_2)
    ])  # shape (2, 100)

    # Create x_dot data for each trajectory, one state variable at a time.
    # The differentiator is created once and reused for both trajectories.
    sfd = ps.SmoothedFiniteDifference(smoother_kws={'window_length': 25})

    x_dot_1 = np.zeros_like(x_train_1)
    for i in range(x_train_1.shape[0]):
        x_dot_1[i, :] = sfd._differentiate(x_train_1[i, :], t_train_1[1] - t_train_1[0])

    x_dot_2 = np.zeros_like(x_train_2)
    for i in range(x_train_2.shape[0]):
        x_dot_2[i, :] = sfd._differentiate(x_train_2[i, :], t_train_2[1] - t_train_2[0])

    x_dot = [x_dot_1.T, x_dot_2.T]


    # Optional: Control input (u_train) for each trajectory
    #           shape matches time dimension
    u_train_1 = np.zeros_like(t_train_1)
    u_train_2 = np.zeros_like(t_train_2)

    # Combine into lists to represent multiple trajectories
    x_train = [x_train_1, x_train_2]
    u_train = [u_train_1, u_train_2]

    # Simple time step (dt) taken from the first trajectory
    dt = t_train_1[1] - t_train_1[0]

    # Example feature library (you can choose any)
    feature_library = ps.PolynomialLibrary(degree=2)

    # For demonstration, define a single optimizer:
    from pysindy.optimizers import STLSQ
    selected_optimizers = {
        "STLSQ_example": {
            "class": STLSQ,
            "params": {
                "alpha": 0.1,
                "threshold": 0.1,
                "fit_intercept": True
            }
        }
    }

    # Check if x_train is a list => multiple trajectories
    xu_list = isinstance(x_train, list)

    def run_selected_optimizers(selected_opts):
        if not selected_opts:
            print("Please select at least one optimizer.")
            return

        models_scores = {}
        models_errors = {}

        # Example function to compute "prediction error" (stub)
        # pred_state, state_data shapes must match in time dimension.
        def compute_prediction_error(pred_state, state_data):
            # Just a demo for RMS error
            state_data = state_data[:, :pred_state.shape[1]]
            return [
                np.sqrt(np.mean((pred - true) ** 2))
                for (pred, true) in zip(pred_state, state_data)
            ]

        # 3. Loop over each optimizer
        for name, opt_data in selected_opts.items():
            optimizer_class = opt_data["class"]
            optimizer_params = opt_data["params"]

            # 4. Initialize and fit the SINDy model
            model = ps.SINDy(
                optimizer=optimizer_class(**optimizer_params),
                feature_library=feature_library
            )
            model.fit(
                x=x_train,
                t=dt,
                x_dot=x_dot,               # Pre-computed derivatives
                u=u_train,                 # Control inputs
                multiple_trajectories=xu_list,
            )

            # 5. Print model to console
            print(f"\n===== Trained Model: {name} =====")
            model.print()

    # 6. Finally, run the optimizers
    run_selected_optimizers(selected_optimizers)


if __name__ == "__main__":
    main()

The problem:


giopapanas · 2025-03-19T14:32:06Z

Thank you, @YaadR, for your fix here. I raised issue #611; do you think it relates to your bug fix? In brief, when I run a toy experiment and call model.fit() with a 1D X, the fit runs fine. However, as I explain in the discussion linked above, the model raises an error when I pass a multi-dimensional X.

Btw, do you know if I need to supply the [x_dot] and [u_train] data myself? I think PySINDy loads [x_dot] and [u_train] by default if you specify the differentiation method and the library to use. Thank you in advance.

YaadR · 2025-03-25T13:27:02Z

Hi @giopapanas, to my understanding #611 does not stem from the same bug.

Jacob-Stevens-Haas · 2025-04-04T18:25:42Z

Hey, thanks for your PR @YaadR - sorry for the delay. I've verified that this bug still exists on the master branch.

Let's talk about your code. I've shrunk it down to an alternative MWE (minimal working example):

import numpy as np
import pysindy as ps

x1 = np.arange(10).reshape((-1, 1))
x2 = np.arange(11).reshape((-1, 1))
x = [x1, x2]
u1 = np.arange(10)
u2 = np.arange(11)
u = [u1, u2]


model = ps.SINDy()
# No error
model.fit(x=x[0], t=1.0, u=u[0])
# No error
model.fit(x=[x1, x1], t=1.0, u=[u1, u1])
# Error
model.fit(x=x, t=1.0, u=u)

This is the form we prefer to receive examples in, as the process of reducing the example is likely to show you the problem. Here it is obvious: u arrays are allowed to be flat when passed as a single trajectory, or when every trajectory is the same length, but not when using multiple trajectories of different lengths.

But you've noticed the docstring: u: list of array-like, shape (n_samples, n_control_features). So I see the problem differently: by accepting the first two calls even though they don't obey the API, the library promotes the expectation that the third form would work.

If you're still interested in the PR, and for that I'd be grateful, you're welcome to find a way to fix what I described as the bug. I'd recommend starting by writing a test.
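As a possible starting point for such a test, here is a minimal sketch of a strict shape check that rejects flat u arrays in every case, including the two calls that currently succeed. It is an illustration only: `validate_control_shape` and `test_flat_control_rejected` are hypothetical names, not existing PySINDy functions, and the real fix may belong elsewhere in the validation code.

```python
import numpy as np


def validate_control_shape(x_list, u_list):
    """Hypothetical check (not PySINDy code): every control array must be
    2D with shape (n_samples, n_control_features), and must have the same
    number of samples as its trajectory."""
    if len(x_list) != len(u_list):
        raise ValueError("x and u must contain the same number of trajectories")
    for i, (x, u) in enumerate(zip(x_list, u_list)):
        x = np.asarray(x)
        u = np.asarray(u)
        if u.ndim != 2:
            raise ValueError(
                f"u[{i}] must have shape (n_samples, n_control_features), "
                f"got shape {u.shape}"
            )
        if u.shape[0] != x.shape[0]:
            raise ValueError(
                f"u[{i}] has {u.shape[0]} samples but x[{i}] has {x.shape[0]}"
            )


def test_flat_control_rejected():
    x = [np.arange(10).reshape((-1, 1)), np.arange(11).reshape((-1, 1))]
    u_flat = [np.arange(10), np.arange(11)]
    try:
        validate_control_shape(x, u_flat)
    except ValueError:
        rejected = True
    else:
        rejected = False
    assert rejected

    # The same data with explicit column shape passes the check.
    u_2d = [u.reshape((-1, 1)) for u in u_flat]
    validate_control_shape(x, u_2d)


test_flat_control_rejected()
```

With a check like this, all three calls in the MWE would behave the same way when given flat u arrays, so the error no longer depends on whether the trajectories happen to share a length.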

(BTW: Code formatting in github allows you to specify the language in order to get syntax highlighting, by typing "```python". I've added that to your comment.)

Commit 9b24004: Bug fix - to handle "u: list of array-like, shape (n_samples, n_control_features)" input - I'll elaborate more in the PR comment section
