Is your feature request related to a problem? Please describe.
Currently, the reward distribution of each arm is fixed, so the library can't simulate non-stationary (restless) bandit problems.
Describe the solution you'd like
A custom `Arm` should have an `update()` method that can be overridden to change the parameters of its reward distribution. On each step, the `ArmSet` should then call `update()` on every arm to advance its reward distribution.
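To make the request concrete, here is a minimal sketch of what this could look like. It is not the library's current API: the `pull()` and `step()` methods, the `DriftingGaussianArm` class, and its constructor parameters are all hypothetical names chosen for illustration. Only `Arm`, `ArmSet`, and `update()` come from the proposal above.

```python
import random


class Arm:
    """Hypothetical base class; the library's real Arm may look different."""

    def pull(self) -> float:
        """Sample one reward from the arm's current distribution."""
        raise NotImplementedError

    def update(self) -> None:
        """Advance the reward distribution by one time step.

        No-op by default, so arms that do not override it stay stationary.
        """


class DriftingGaussianArm(Arm):
    """Example restless arm: the mean follows a Gaussian random walk."""

    def __init__(self, mean, std=1.0, drift_std=0.1, rng=None):
        self.mean = mean
        self.std = std
        self.drift_std = drift_std
        self.rng = rng if rng is not None else random.Random()

    def pull(self) -> float:
        return self.rng.gauss(self.mean, self.std)

    def update(self) -> None:
        # Perturb the mean a little on every step.
        self.mean += self.rng.gauss(0.0, self.drift_std)


class ArmSet:
    """Hypothetical container that advances every arm once per step."""

    def __init__(self, arms):
        self.arms = list(arms)

    def step(self, index: int) -> float:
        reward = self.arms[index].pull()
        # Restless setting: all arms evolve, not only the one that was pulled.
        for arm in self.arms:
            arm.update()
        return reward
```

Making the default `update()` a no-op would keep existing stationary arms working unchanged, since only arms that override it would drift.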