Technically, you only need to use Softmax once to get the portfolio weights.
However, during training we found that the PnL fluctuated too much for the agent to converge, because of the high stochasticity of the market. Applying Softmax a second time makes the weights more even, which dampens the PnL fluctuations and makes it easier for RL agents to converge.
In short, it is a compromise for earlier methods' inability to handle a highly stochastic environment. You can remove the second Softmax if your algorithm can handle the fluctuations.
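To illustrate the flattening effect, here is a minimal NumPy sketch. It is not TradeMaster code: the `softmax` helper and the `raw_action` values are made up for illustration. The output of the first Softmax lies in [0, 1], so after the second Softmax the ratio between the largest and smallest weight can be at most e (about 2.72), no matter how extreme the raw action was.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    x = np.asarray(x, dtype=float)
    shifted = x - x.max()  # subtract the max to avoid overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical raw action for four assets (illustrative values only).
raw_action = np.array([3.0, 1.0, 0.0, -2.0])

once = softmax(raw_action)  # a single Softmax: already valid portfolio weights
twice = softmax(once)       # a second Softmax: weights pulled toward uniform

print("softmax once :", np.round(once, 3))
print("softmax twice:", np.round(twice, 3))
print("max/min ratio after second softmax:", twice.max() / twice.min())
```

Both vectors sum to 1, so either one is a valid set of portfolio weights. The second one is simply much less concentrated, which limits how much a single asset can move the PnL.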
Softmax is applied to the action in
TradeMaster/trademaster/trainers/portfolio_management/trainer.py
Line 149 in bc5a30a
then in,
TradeMaster/trademaster/environments/portfolio_management/environment.py
Line 125 in bc5a30a
softmax is applied again to convert the action into portfolio weights. Is there a specific reason why this is done? Thanks for your time.