Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issues with SARTSATraces #72

Merged
merged 10 commits into from
May 10, 2024

Conversation

dharux
Copy link

@dharux dharux commented May 7, 2024

This fixes #70 and fixes #71.

Issue #70 : SARTSA Traces cannot be sampled correctly.

The sampleable_inds field of the episodes buffer containing a SARTSATrace (CircularArraySARTSATraces or ElasticArraySARTSATraces) was not correctly keeping track of which indices are sampleable. In a SARTSATraces, information from 3 steps are required to complete a single trace. The initial step of the episode in which only the state is pushed. The next step in which the action and next_state are pushed. The following step in which the next_action is pushed. Now, there is one sampleable index, but two unsampleable ones. Thus, the last two indices in the trace are typically unsampleable during the episode. This change was made to the function Base.push!(eb::EpisodesBuffer, xs::NamedTuple) in episodes.jl.

Issue #71 : Action and state go out of sync in CircularArraySARTSATraces

When more traces are pushed into the Traces than its capacity, the state does not match the appropriate action. To fix this, the capacity of the state trace should be one more than that of the action trace. The capacity of all traces are incremented by 1 so that the Traces can hold capacity amount of full traces.

The tests were also modified to check that it works correctly with the following usage:

  1. Push first state at the start of the episode (PreEpisodeStage in RL.jl)
  2. Push state, action, reward, terminal during the episode (PostActStage)
  3. Push action after the episode ends in a PartialNamedTuple (PostEpisodeStage)

This is the behaviour of the agent from RLCore.jl and all trajectories should work well with that. The CircularArraySARTSATraces are a bit restrictive with regards to usage and cannot be used very differently than the typical usage above. I have not been able to make it work while represented next_action and working with all general usage.

The CircularArrarySARTSATraces do not currently work with CircularPrioritizedTraces as these add additional keys like keys and priorities which are updated outside of the Traces. Some changes need to be made with CircularPrioritizedTraces to make it work with CircularArrarySARTSATraces also. Thus, there are 2 errors during the testing involving CircularPrioritizedTraces.

Dharanish and others added 3 commits May 7, 2024 12:26
Bug 1 - When pushing more traces into CircularArraySARTSATraces than its capacity, the state and action traces are not in line anymore
Bug 2 - sampleable_inds were not correct for CircularArraySARTSATraces
Bug 3 - CircularArraySARTSATraces were not sampleable by a EpisodesSampler
Bug 1 - When pushing more traces into CircularArraySARTSATraces than its capacity, the state and action traces are not in line anymore
Bug 2 - sampleable_inds were not correct for CircularArraySARTSATraces
Bug 3 - CircularArraySARTSATraces were not sampleable by a EpisodesSampler
@jeremiahpslewis
Copy link
Member

@dharux Thank you for the PR. I've updated the main branch to eliminate unrelated test failures. Would you able to resolve the remaining test errors (in CircularPrioritizedTraces)?

@dharux
Copy link
Author

dharux commented May 8, 2024

I could take a shot at it. It might require completely rewriting how tuples are pushed into a CircularPrioritizedTraces and how it is sampled for the case when it contains a CircularArraySARTSATraces.

@jeremiahpslewis
Copy link
Member

@HenriDeh any tips or pointers for @dharux

Copy link

codecov bot commented May 9, 2024

Codecov Report

Attention: Patch coverage is 79.48718% with 8 lines in your changes are missing coverage. Please review.

Project coverage is 74.23%. Comparing base (b09e2ed) to head (29a6a3e).
Report is 21 commits behind head on main.

Files Patch % Lines
src/episodes.jl 70.00% 6 Missing ⚠️
src/common/CircularPrioritizedTraces.jl 92.85% 1 Missing ⚠️
src/samplers.jl 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #72      +/-   ##
==========================================
- Coverage   74.65%   74.23%   -0.43%     
==========================================
  Files          15       18       +3     
  Lines         801      850      +49     
==========================================
+ Hits          598      631      +33     
- Misses        203      219      +16     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jeremiahpslewis jeremiahpslewis requested a review from HenriDeh May 9, 2024 09:34
@jeremiahpslewis
Copy link
Member

@dharux Thanks, looks good to me, I’ll let @HenriDeh do a final check.

@dharux dharux marked this pull request as draft May 9, 2024 09:48
@dharux dharux marked this pull request as ready for review May 9, 2024 09:54
@dharux
Copy link
Author

dharux commented May 9, 2024

Needed some final fixes. Everything should be working now!

Copy link
Member

@HenriDeh HenriDeh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time. Sampleable indices are not very intuitive, I hope this didn't give you headaches.

@jeremiahpslewis jeremiahpslewis merged commit de01fb3 into JuliaReinforcementLearning:main May 10, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Action and state go out of sync in CircularArraySARTSATraces SARTSA Traces cannot be sampled correctly
3 participants