How to calculate the bootstrap error and its confidence interval of a time series data #780

geophysics91 · 2020-09-05T07:37:16Z


Dear experts, I need to calculate the bootstrap error of five time series that are appended in a single file. Inside the time_series file, the five series are separated by > > symbols: https://i.fluffy.cc/12NLsqHhTTcvR67btNjRzXZCkbpkfw9c.html Can anybody suggest a good way to do this? I tried http://rasbt.github.io/mlxtend/user_guide/evaluate/bootstrap/#example-1-bootstrapping-the-mean but that example only covers a single time series.

pkaf · 2020-09-15T08:19:20Z


Looking at the implementation of

bootstrap(x, func, num_rounds=1000, ci=0.95, ddof=1, seed=None)

it says x can have shape (n_samples, [n_columns]). Perhaps you need to reshape your data to that shape?
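As a rough sketch of what that reshaping could look like: the snippet below splits a text blob into separate series and stacks them as columns. The file layout is an assumption on my part (one value per line, with lines starting with ">" between series), the sample values are made up, and split_series is a hypothetical helper, not part of mlxtend.

```python
import numpy as np

# Made-up stand-in for the file contents. The real layout is an assumption:
# one numeric value per line, with lines starting with ">" between series.
raw_text = """\
1.0
2.0
3.0
>
4.0
5.0
6.0
>
7.0
8.0
9.0
"""


def split_series(text, sep=">"):
    """Split text into a list of 1D float arrays, one per series."""
    series, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(sep):
            # A separator line closes the series collected so far.
            if current:
                series.append(np.array(current, dtype=float))
                current = []
        else:
            current.append(float(line))
    if current:
        series.append(np.array(current, dtype=float))
    return series


series = split_series(raw_text)

# Stacking as columns only works if all series have the same length.
x = np.column_stack(series)
print(x.shape)  # (3, 3): 3 samples per series, 3 series
```

If the series have different lengths, it is probably simpler to keep them as a list of 1D arrays and process them one at a time.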


rasbt (Maintainer) · 2020-09-15T18:18:20Z

> it says x can be (n_samples, [n_columns]), perhaps you need to reshape your data to have this dimension?

Yes, @pkaf is correct: x can be either a 1D or a 2D array. Reshaping may not be necessary, though; it depends on what you pass as func. E.g., the NumPy mean function can compute the mean of both 1D and 2D arrays, so both

import numpy as np
from mlxtend.evaluate import bootstrap


rng = np.random.RandomState(123)
x = rng.normal(loc=5., size=100)
original, std_err, ci_bounds = bootstrap(x, num_rounds=1000, func=np.mean, ci=0.95, seed=123)
print('Mean: %.2f, SE: +/- %.2f, CI95: [%.2f, %.2f]' % (original, 
                                                        std_err, 
                                                        ci_bounds[0],
                                                        ci_bounds[1]))

and

rng = np.random.RandomState(123)
x = rng.normal(loc=5., size=(100, 2))
original, std_err, ci_bounds = bootstrap(x, num_rounds=1000, func=np.mean, ci=0.95, seed=123)
print('Mean: %.2f, SE: +/- %.2f, CI95: [%.2f, %.2f]' % (original, 
                                                        std_err, 
                                                        ci_bounds[0],
                                                        ci_bounds[1]))

would work.

You could also handle the reshaping yourself inside func if your function needs it, as in this example:

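from mlxtend.data import autompg_data
from mlxtend.evaluate import bootstrap

from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

X, y = autompg_data()


lr = LinearRegression()

def r2_fit(X, model=lr):
    # Use the first column as the (2D) feature and the second as the target
    x, y = X[:, 0].reshape(-1, 1), X[:, 1]
    # Fit the model passed in via `model` and score it on the same sample
    pred = model.fit(x, y).predict(x)
    return r2_score(y, pred)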


original, std_err, ci_bounds = bootstrap(X, num_rounds=1000,
                                         func=r2_fit,
                                         ci=0.95,
                                         seed=123)
print('R2: %.2f, SE: +/- %.2f, CI95: [%.2f, %.2f]' % (original,
                                                      std_err,
                                                      ci_bounds[0],
                                                      ci_bounds[1]))
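To make the three returned values a bit more concrete, here is a NumPy-only sketch of how a bootstrap standard error and a percentile confidence interval are commonly computed. It is an illustration only, not mlxtend's actual code; percentile_bootstrap is a hypothetical helper, and its numbers need not match those from bootstrap exactly.

```python
import numpy as np


def percentile_bootstrap(x, func, num_rounds=1000, ci=0.95, seed=None):
    """Return (statistic on x, bootstrap standard error, (lower, upper))."""
    rng = np.random.RandomState(seed)
    x = np.asarray(x)
    n = x.shape[0]

    replicates = np.empty(num_rounds)
    for i in range(num_rounds):
        # Resample rows with replacement, keeping the sample size fixed.
        idx = rng.randint(0, n, size=n)
        replicates[i] = func(x[idx])

    # The standard error is the spread of the bootstrap replicates.
    std_err = replicates.std(ddof=1)

    # Percentile interval: cut off (1 - ci) / 2 in each tail.
    alpha = (1.0 - ci) / 2.0
    lower, upper = np.percentile(replicates, [100 * alpha, 100 * (1 - alpha)])
    return func(x), std_err, (lower, upper)


rng = np.random.RandomState(123)
sample = rng.normal(loc=5., size=100)
original, std_err, ci_bounds = percentile_bootstrap(sample, np.mean, seed=123)
print('Mean: %.2f, SE: +/- %.2f, CI95: [%.2f, %.2f]'
      % (original, std_err, ci_bounds[0], ci_bounds[1]))
```

Note that func has to return a single number here, since each round fills one slot of the replicates array.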
