
why is softmax applied twice when actions are transferred to portfolio weights? #196

Closed
tengyaolong2000 opened this issue Jan 24, 2024 · 1 comment

tengyaolong2000 commented Jan 24, 2024

Softmax is applied to the action,

action = np.exp(action)/np.sum(np.exp(action))

and then softmax is applied again when the action is converted into portfolio weights. Is there a specific reason why this is done? Thanks for your time.
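A minimal sketch of the two-step transformation being asked about (the variable names and the example action vector are illustrative, not taken from the repository's code):

```python
import numpy as np

def softmax(x):
    # subtract the max before exponentiating for numerical stability
    e = np.exp(x - np.max(x))
    return e / e.sum()

# hypothetical raw action emitted by the policy network
action = np.array([2.0, 0.5, -1.0])

step1 = softmax(action)    # first softmax, applied to the raw action
weights = softmax(step1)   # second softmax, when converting to portfolio weights

print(weights)             # positive entries that sum to 1
```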

@qinmoelei
Contributor

Technically, you only need to apply Softmax once to get valid portfolio weights.

However, during training we found that the PnL fluctuations are too large, and the agent finds it very hard to converge. This is due to the high stochasticity of the market. Applying Softmax twice makes the weights somewhat more even, so the PnL fluctuates less, making it easier for RL agents to converge.

In short, it is a compromise for methods that cannot handle a highly stochastic environment. You can remove the second Softmax if your algorithm can handle the fluctuations.
