Open
189 commits
87d6510
Decoupled cv (#243)
kegl Sep 7, 2020
7fbae01
Update README.rst
kegl Sep 7, 2020
b7cdedd
Update .travis.yml
kegl Sep 8, 2020
864906d
adding --data-label (#245)
kegl Sep 8, 2020
f9b1939
Fixing ramp-leaderboard (#247)
kegl Oct 16, 2020
3eadb4d
Merge branch 'master' into advanced
albertcthomas Oct 26, 2020
96f6efb
add condition in circle push_doc (#253)
albertcthomas Oct 26, 2020
45cce46
add install instruction for advanved
albertcthomas Nov 3, 2020
1676d9d
Merge branch 'master' into advanced
albertcthomas Nov 27, 2020
ec8835e
hyperopt fix (#257)
gabriel-hurtado Nov 27, 2020
85a533f
Modefied ramp for generative wf
gabriel-hurtado Jul 12, 2019
23a68c8
Added tests for new prediction type, generative regression
gabriel-hurtado Jul 16, 2019
19e0122
Cleaned a bit
gabriel-hurtado Jul 17, 2019
7c20ac5
Renamed log likelihood for generation
gabriel-hurtado Jul 18, 2019
174ba01
Probabilities are normalized and bins are checked, test is included
gabriel-hurtado Jul 18, 2019
64c6bf6
convert all the nb to n
gabriel-hurtado Jul 18, 2019
53865c1
Started ensembling code
gabriel-hurtado Jul 29, 2019
534af70
Loss could be negative
gabriel-hurtado Aug 1, 2019
b45630b
Now different generative regressor wf can have their ow different num…
gabriel-hurtado Aug 5, 2019
ad5c5d2
Added an optio for multi target generative regressors
gabriel-hurtado Aug 5, 2019
f32248e
Solved problem for time series
gabriel-hurtado Aug 7, 2019
c6d6347
Updated to work with RL loop
gabriel-hurtado Aug 19, 2019
8644271
Cleaned generative regression, now works with numpy and pandas
gabriel-hurtado Aug 20, 2019
6ad07d8
Bugfix
gabriel-hurtado Aug 21, 2019
c11756b
Now accept arbitrary parameters
gabriel-hurtado Aug 21, 2019
c9f19b9
Added likelihood ratio
gabriel-hurtado Aug 22, 2019
e067a5d
Bugfix on ramp
gabriel-hurtado Aug 23, 2019
0f56708
Deleted an usefull line
gabriel-hurtado Aug 23, 2019
955ebbc
Added seed
gabriel-hurtado Aug 28, 2019
bffb3c7
Added the ability to indicate a broken time continuity in ts_feature_…
gabriel-hurtado Sep 12, 2019
e969a5f
Now able to use restart info on the generative regressor
gabriel-hurtado Sep 16, 2019
b94d32b
Solved a bug for classical generative regressors
gabriel-hurtado Sep 16, 2019
cdd654b
Removed useless copy
gabriel-hurtado Sep 24, 2019
1604e78
Minor change, log LK is bounded to avoid nan loss
gabriel-hurtado Sep 26, 2019
573cc0e
bugfix, forgot one removed variable
gabriel-hurtado Sep 27, 2019
e11866f
Added gaussian generative regressor
gabriel-hurtado Oct 3, 2019
ae0ea66
Changed representation of distributions
gabriel-hurtado Oct 16, 2019
9895b7f
bugfix
gabriel-hurtado Oct 17, 2019
6abf6b6
Adding other distributions, todo: implement uniform, and test it
gabriel-hurtado Oct 21, 2019
385872c
Added uniform
gabriel-hurtado Oct 22, 2019
48e44d8
Cleaned a bit
gabriel-hurtado Oct 22, 2019
ff730a8
pep8
gabriel-hurtado Oct 22, 2019
f3e66aa
Added minimal docstring
gabriel-hurtado Oct 22, 2019
b8fe552
Solve masking bug
gabriel-hurtado Oct 22, 2019
b9fbe7b
Sampling working on dists
gabriel-hurtado Oct 28, 2019
09b42fa
passing current index to model
gabriel-hurtado Oct 29, 2019
230224f
Remove discrepencie
gabriel-hurtado Oct 30, 2019
b48afe9
Added combine
gabriel-hurtado Nov 5, 2019
81aaac1
new score file
kegl Nov 4, 2019
fa067c4
rename
kegl Nov 4, 2019
baf9ba5
rename
kegl Nov 4, 2019
038c0e7
fixed merging bug
gabriel-hurtado Nov 5, 2019
c7fad6b
Reformating and added beta
gabriel-hurtado Nov 5, 2019
5ff8fac
Reformating, code more flexible now
gabriel-hurtado Nov 7, 2019
5bcc5ed
Added cheat checker
gabriel-hurtado Nov 7, 2019
b782b45
Bugfix on beta distrib
gabriel-hurtado Nov 7, 2019
d5ab0d1
Updated ramp
gabriel-hurtado Nov 7, 2019
bd41fdb
Exposed distribution dict for clearer interface, added cv based on sy…
gabriel-hurtado Nov 8, 2019
85ad46b
Updated test for gen reg
gabriel-hurtado Nov 8, 2019
6bff1ad
Code refactoring
gabriel-hurtado Nov 8, 2019
0356318
Removed unused line
gabriel-hurtado Nov 8, 2019
c28fe15
updated testing kit
gabriel-hurtado Nov 8, 2019
8461c0c
acrobot workflow
kegl Nov 8, 2019
6b5c44a
dropping restart warning
kegl Nov 11, 2019
74548b5
dropping restart warning
kegl Nov 11, 2019
9c9bd09
dropping future deprecation warning
kegl Nov 11, 2019
10bf041
cosmetics
kegl Nov 11, 2019
2befe1f
no extra actions
kegl Nov 12, 2019
cc01476
more meaningful print message
kegl Nov 12, 2019
794ad00
Added vonmisses, folded gaussian and truncated gaussian distributions
gabriel-hurtado Nov 18, 2019
87979e6
more info in error trace
kegl Nov 18, 2019
f16c67a
Return zero when x is outside support of pdf
gabriel-hurtado Nov 19, 2019
a1876a6
Bugfix
gabriel-hurtado Nov 19, 2019
76f0e37
Added pert distribution
gabriel-hurtado Nov 20, 2019
3601732
bugfix
gabriel-hurtado Nov 21, 2019
95b5f3b
local changes, ready to merge
gabriel-hurtado Dec 11, 2019
5599b95
fixing combine
kegl Nov 21, 2019
a241daa
fixing blending
kegl Nov 21, 2019
2de8ce4
fixing blending
kegl Nov 22, 2019
bc41939
make blending more readable
kegl Dec 2, 2019
545d60f
creating training_output before using it
kegl Dec 2, 2019
f252e5b
fixing one test
kegl Dec 2, 2019
fa4c9b9
fixing blending
kegl Nov 21, 2019
ef57935
fixing blending
kegl Nov 21, 2019
afa66e3
make blending more readable
kegl Dec 2, 2019
e40b5bd
creating training_output before using it
kegl Dec 2, 2019
4296df8
patching testing
kegl Dec 3, 2019
8892a87
Bugfix
gabriel-hurtado Apr 7, 2020
03d9b96
Bugfix
gabriel-hurtado Apr 9, 2020
dba11d4
Minor improvements
gabriel-hurtado Apr 9, 2020
7c1dd99
making time series cv more robust to restart name
kegl Feb 18, 2020
634747c
docstring for step and train + pep 8
albertcthomas Dec 26, 2019
7c9d477
allow passing RandomState instance in step
albertcthomas Jan 13, 2020
ff3c832
Bugfix
gabriel-hurtado Apr 7, 2020
d206a84
force pd.DataFrame
albertcthomas Apr 7, 2020
911fe8e
Now submissions define their own order
gabriel-hurtado Apr 10, 2020
88dc0ed
Added the ability to do multidim mdn
gabriel-hurtado Apr 21, 2020
de89f12
make generative_regressor use new import function
albertcthomas Apr 22, 2020
5432b62
use sanitize=False because we can have open in the submissions
albertcthomas Apr 22, 2020
bec143a
pep8, mu -> mean
kegl Apr 26, 2020
b0f6ef4
pep8, verbose=False
kegl Apr 26, 2020
443d7e4
beautification, naming convention, comments
kegl Apr 27, 2020
8425795
beautification, naming convention, comments
kegl Apr 27, 2020
272a2a5
beautification, naming convention, comments
kegl Apr 27, 2020
61a5c64
restructuring, normalizing weights
kegl Apr 28, 2020
ead3cad
adding full pipeline, renaming auto to autoregression
gabriel-hurtado Apr 28, 2020
c525fff
making time series future check optional
kegl Apr 28, 2020
bc27154
clean unwrapping of mixture components
kegl Apr 28, 2020
c6f74bd
dimension-wise LR score
kegl Apr 28, 2020
4d35f29
dimension-wise scores
kegl Apr 28, 2020
73b5406
rmse for generative regression
kegl Apr 28, 2020
afe2493
R2 score for mixture models
kegl Apr 28, 2020
e604c69
Kolmogorov Smirnov stats to measure calibratedness
kegl Apr 29, 2020
d241bd5
Kolmogorov Smirnov lower the better
kegl Apr 29, 2020
186c90b
adding lookahead check back
kegl Apr 30, 2020
1d91bd6
nb -> n
kegl May 5, 2020
8f26584
n_params in step
albertcthomas Apr 30, 2020
e55fdff
fixing bug in switching check off
kegl May 5, 2020
0a22977
fixing bug in switching check off
kegl May 5, 2020
9b7f84f
missed points error
kegl May 5, 2020
c256897
test refactoring
albertcthomas Apr 16, 2020
f9203cd
deleting old bin-based likelihoods
kegl May 5, 2020
2ce715c
fix in step because of safety removal
albertcthomas May 5, 2020
cc89416
renaming
kegl May 5, 2020
1c8f7a5
adding order to model
kegl May 11, 2020
798e1a0
Fixed step
gabriel-hurtado May 11, 2020
72bde10
adding order to model
kegl May 12, 2020
07f0fb1
fixing some tests
kegl May 15, 2020
f925d35
Added MDN compatibility
gabriel-hurtado May 15, 2020
1e69cc8
ddebug
kegl May 16, 2020
9dccd9a
sample from multivariate gaussian in step for full gen reg
albertcthomas May 19, 2020
deb07fb
fix step generative regressor full
albertcthomas May 21, 2020
fb35cd2
make step work for non autoregressive setup
albertcthomas May 22, 2020
846afad
start adding docstrings for my future self
albertcthomas May 22, 2020
eb9b493
verbose scores
kegl Jul 1, 2020
62f03b4
Unify workflows, decomposition submission attributes and cleaning
gabriel-hurtado Jul 3, 2020
611562a
possibility to pass restart=None
albertcthomas Jul 3, 2020
9b09abd
restart_name no longer a list
albertcthomas Jul 8, 2020
10a5bee
Removed double __init__
gabriel-hurtado Jul 8, 2020
761ede3
docstring
gabriel-hurtado Jul 9, 2020
e70dba1
Discrtibutions as scipy
gabriel-hurtado Jul 15, 2020
e9dd1d3
Fixed message
gabriel-hurtado Jul 16, 2020
e22c180
Added evey scipy dist
gabriel-hurtado Jul 21, 2020
01b5132
typofix
gabriel-hurtado Jul 21, 2020
0848ffc
Added a 'to y_pred' functionnality
gabriel-hurtado Jul 2, 2020
42824e3
Cosmetics
gabriel-hurtado Jul 3, 2020
f748c28
Commented get_component
gabriel-hurtado Jul 7, 2020
c7c938b
Updated docstring
gabriel-hurtado Jul 7, 2020
bc1b496
WIP: fix tests
gabriel-hurtado Jul 21, 2020
d7d1bc9
Updated tests
gabriel-hurtado Jul 21, 2020
55e9576
Cosmetics
gabriel-hurtado Jul 21, 2020
515ded3
Comments and fixed step
gabriel-hurtado Jul 28, 2020
98cd531
comment
kegl Jul 27, 2020
f44ed06
hotfix, the number of parameters was wrong for some distribs
gabriel-hurtado Aug 7, 2020
27dabed
verbose
kegl Sep 3, 2020
cf1aef0
allow passing dataframe + n_burn_in + n_lookahead
albertcthomas Jun 20, 2020
eac5bab
extend train_is with burn_in and restart + shuffle CV
albertcthomas Jun 20, 2020
a3912a8
update prediction type of generative regressors after decoupled cv
albertcthomas Sep 9, 2020
49b0385
fix in step when decomposition is None to pass numpy array: this is t…
albertcthomas Sep 14, 2020
b382041
fix merge with advanced
albertcthomas Sep 17, 2020
e9cf8bd
temporary fix for Series.nonzeros
albertcthomas Sep 22, 2020
35cefff
some cleaning in time_series cv
albertcthomas Nov 11, 2020
282028d
rm diff with master in nll score_types
albertcthomas Nov 11, 2020
74176f2
sc in test
albertcthomas Nov 11, 2020
b9fc307
Code cleaning and comments update
gabriel-hurtado Nov 19, 2020
71e428f
Updated docstrings
gabriel-hurtado Nov 24, 2020
49e9f2d
Revreted naming order
gabriel-hurtado Nov 25, 2020
b8e1405
Removed plotting, moved to separate branch
gabriel-hurtado Nov 25, 2020
7cac295
sc in time_series cvs
albertcthomas Nov 25, 2020
fba4ecf
missing MAX_MDN_PARAMS
albertcthomas Nov 25, 2020
9f4c644
n_dists -> n_components
albertcthomas Nov 25, 2020
cbb8a2f
fix warning residuals == 0
albertcthomas Nov 25, 2020
23ec2df
rm remaining plots in scores
albertcthomas Dec 3, 2020
e086202
fix test after rebase
albertcthomas Dec 4, 2020
ae98322
Stochastic regression
gabriel-hurtado Feb 11, 2021
4425043
Merge branch 'stochastic_regression' into 'generative_regression'
albertcthomas Feb 11, 2021
74686f7
Merge branch 'master' into advanced
albertcthomas Mar 5, 2021
b757bda
Merge branch 'advanced' into generative_regression_clean
albertcthomas Mar 5, 2021
f7e047c
flake8
albertcthomas Mar 5, 2021
c00e178
flake8
albertcthomas Mar 5, 2021
ea0853c
add xarray in requirements
albertcthomas Mar 5, 2021
2580c2b
refactoring time series cv
kegl Apr 6, 2021
8acfdee
vectorization
albertcthomas Apr 9, 2021
3d853f1
Merge branch 'master' into generative_regression
albertcthomas Jul 5, 2021
b0c2091
fix duplicate entry points after merge
albertcthomas Jul 5, 2021
fbe9243
Merge branch 'vectorized_env' into generative_regression
albertcthomas Jul 9, 2021
f71c380
temp fix for sampling when n_burn_in > 1
albertcthomas Jul 19, 2021
f6fc8fa
revert "temp fix for sampling when n_burn_in > 1"
albertcthomas Aug 3, 2021
2b9e240
add error message when using rw sample method with more than 1 sample
albertcthomas Aug 26, 2021
1 change: 1 addition & 0 deletions .travis.yml
@@ -4,6 +4,7 @@ cache: pip
branches:
only:
- master
- advanced
env:
- PYTHON_VERSION=3.6
- PYTHON_VERSION=3.7
2 changes: 2 additions & 0 deletions README.rst
@@ -1,6 +1,8 @@
RAMP workflow
=============

The advanced branch contains advanced features and may be less stable than the master. It works with the advanced branch of ramp-board_.

RAMP workflow allows to define and run machine learning pipeline, documentations available here_.

.. _here: https://paris-saclay-cds.github.io/ramp-docs/ramp-workflow/stable/
9 changes: 9 additions & 0 deletions ci_tools/circle/push_doc.sh
@@ -6,6 +6,8 @@
# We have three possibily workflows:
# If the git branch is 'master' then we want to commit and merge the dev/
# docs on gh-pages
# If the git branch is 'advanced' then we want to commit and merge the
# advanced docs on gh-pages
# If the git branch is [0-9].[0.9].X (i.e. 0.9.X, 1.0.X, 1.2.X, 41.21.X) then
# we want to commit and merge the major.minor/ docs on gh-pages
# If the git branch is anything else then we just want to test that committing
@@ -45,6 +47,13 @@ then
doc_clone_commit
git push origin $DOC_BRANCH
echo "Push complete"
elif [ "$CIRCLE_BRANCH" = "advanced" ]
then
# Changes are made to advanced/ directory
DIR="ramp-workflow/advanced"
doc_clone_commit
git push origin $DOC_BRANCH
echo "Push complete"
elif [[ "$CIRCLE_BRANCH" =~ ^[0-9]+\.[0-9]+\.X$ ]]
then
# Strip off .X from branch name, so changes will go to 0.1/, 91.235/, etc
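
Reviewer note: the branch dispatch this hunk extends can be summarized with a small Python sketch. It is only an illustration of the rules stated in the script comments, not code from the PR. The helper name `docs_dir_for_branch` and the `root` parameter are hypothetical; only the `ramp-workflow/advanced` directory appears verbatim in the hunk, while the `dev` and `major.minor` targets are inferred from the comments.

```python
import re
from typing import Optional

# Same pattern as the bash test in push_doc.sh: branches such as 0.9.X or 41.21.X.
RELEASE_BRANCH = re.compile(r"^[0-9]+\.[0-9]+\.X$")


def docs_dir_for_branch(branch: str, root: str = "ramp-workflow") -> Optional[str]:
    """Return the gh-pages directory a branch would publish to, or None.

    Hypothetical helper: only the 'advanced' directory is spelled out in the
    hunk; the 'dev' and 'major.minor' targets are assumptions drawn from the
    comments at the top of the script.
    """
    if branch == "master":
        return f"{root}/dev"  # assumption, based on the "dev/ docs" comment
    if branch == "advanced":
        return f"{root}/advanced"  # matches DIR="ramp-workflow/advanced"
    if RELEASE_BRANCH.fullmatch(branch):
        return f"{root}/{branch[:-2]}"  # strip the trailing ".X"
    return None  # any other branch has no publishing target


for name in ("master", "advanced", "0.9.X", "my-feature"):
    print(name, "->", docs_dir_for_branch(name))
```

Keeping the `advanced` case ahead of the release-branch pattern mirrors the order of the `elif` chain in the script.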
4 changes: 2 additions & 2 deletions doc/index.rst
@@ -19,9 +19,9 @@ and `prediction types <https://github.com/paris-saclay-cds/ramp-workflow/tree/ma
Installation
************

-Ramp-workflow is available on PyPI and can be installed via `pip`::
+The advanced branch of ramp-workflow can be installed via `pip`::

-    $ pip install ramp-workflow
+    $ pip install git+https://github.com/paris-saclay-cds/ramp-workflow.git@advanced

.. toctree::
:maxdepth: 2
8 changes: 7 additions & 1 deletion rampwf/cvs/__init__.py
@@ -1,7 +1,13 @@
from .clustering import Clustering
-from .time_series import TimeSeries
+from .time_series import (
+    InsideEpisode, KFoldPerEpisode, RollingPerEpisode, ShufflePerEpisode,
+    TimeSeries)

__all__ = [
    'Clustering',
+    'InsideEpisode',
+    'KFoldPerEpisode',
+    'RollingPerEpisode',
+    'ShufflePerEpisode',
    'TimeSeries',
]
68 changes: 68 additions & 0 deletions rampwf/cvs/tests/test_time_series.py
@@ -0,0 +1,68 @@
import numpy as np
from numpy.testing import assert_allclose
import pandas as pd

from rampwf.cvs.time_series import KFoldPerEpisode
from rampwf.cvs.time_series import RollingPerEpisode
from rampwf.cvs.time_series import ShufflePerEpisode


def test_get_episode_starts():
    restart_name = 'restart'
    X_data = np.random.randn(10, 2)
    data = np.concatenate(
        (X_data,
         np.array([[1], [0], [0], [0], [1], [0], [0], [1], [0], [0]])), axis=1)
    X_df = pd.DataFrame(
        columns=['X_1', 'X_2', 'restart'], data=data)

    cv = KFoldPerEpisode(restart_name, 0)
    episode_starts = cv._get_episode_starts(X_df)
    assert_allclose(episode_starts, np.array([0, 4, 7]))

    cv = KFoldPerEpisode(restart_name, 2)
    episode_starts = cv._get_episode_starts(X_df)
    assert_allclose(episode_starts, np.array([0, 2, 3]))


def test_per_restart():
    restart_name = 'restart'
    X_data = np.random.randn(10, 2)
    y = np.array([[1], [0], [0], [0], [1], [0], [0], [1], [0], [0]])
    data = np.concatenate(
        (X_data,
         y), axis=1)
    X_df = pd.DataFrame(
        columns=['X_1', 'X_2', 'restart'], data=data)

    cv = KFoldPerEpisode(restart_name, 0)
    # adding the virtual next episode start index
    gen = cv.get_cv(X_df, y)
    train_is, test_is = next(gen)
    assert train_is == list(range(4, 10))
    assert test_is == list(range(4))

    cv = KFoldPerEpisode(restart_name, 2)
    gen = cv.get_cv(X_df, np.arange(3))
    train_is, test_is = next(gen)
    assert train_is == [2]
    assert test_is == list(range(2))

    cv = ShufflePerEpisode(restart_name, random_state=2)
    gen = cv.get_cv(X_df, y)
    train_is, test_is = next(gen)
    assert train_is == [0, 1, 2, 3, 4, 5, 6]
    assert test_is == [7, 8, 9]
    train_is, test_is = next(gen)
    train_is, test_is = next(gen)
    assert train_is == [4, 5, 6, 7, 8, 9]
    assert test_is == [0, 1, 2, 3]

    cv = RollingPerEpisode(restart_name)
    gen = cv.get_cv(X_df, y)
    train_is, test_is = next(gen)
    assert train_is == [0, 1, 2, 3]
    assert test_is == [4, 5, 6]
    train_is, test_is = next(gen)
    assert train_is == [0, 1, 2, 3, 4, 5, 6]
    assert test_is == [7, 8, 9]
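
Reviewer note: the expectations in these tests can be reproduced with a short stand-alone sketch, which may help when reading the assertions. This is not the implementation in `rampwf/cvs/time_series.py` (that file is not shown in this excerpt). The functions `episode_starts` and `first_fold` are hypothetical, and the sketch assumes that the first row always starts an episode and that the second constructor argument is a per-episode burn-in whose rows are dropped before re-indexing. That reading is consistent with the `[0, 4, 7]` and `[0, 2, 3]` assertions but is not confirmed by the diff.

```python
from itertools import accumulate


def episode_starts(restart, n_burn_in=0):
    """Start index of each episode after dropping n_burn_in rows per episode.

    Hypothetical re-implementation for illustration only. Assumes restart[0]
    is 1, i.e. the first row opens an episode.
    """
    raw_starts = [i for i, flag in enumerate(restart) if flag == 1]
    # Append a virtual next-episode start so the last episode has an end.
    bounds = raw_starts + [len(restart)]
    lengths = [end - start for start, end in zip(bounds, bounds[1:])]
    usable = [length - n_burn_in for length in lengths]
    # Each episode begins where the usable rows of the previous ones end.
    return [0] + list(accumulate(usable))[:-1]


def first_fold(starts, n_rows):
    """Hold out the first episode and train on every row after it."""
    bounds = list(starts) + [n_rows]
    test_is = list(range(bounds[0], bounds[1]))
    train_is = list(range(bounds[1], n_rows))
    return train_is, test_is


restart = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
print(episode_starts(restart))      # [0, 4, 7]
print(episode_starts(restart, 2))   # [0, 2, 3]
print(first_fold(episode_starts(restart), len(restart)))
```

With no burn-in, the first fold tests on rows 0 to 3 and trains on rows 4 to 9, which matches the first `KFoldPerEpisode` assertion in `test_per_restart`.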