Verify the baseline figures output and code #166
Just from looking at the new figures (no code review yet, will do next), things look mostly fine, but a few aesthetic comments:
I will get to other questions after also looking at the code.
Running …
Don't run the code; it won't work without updated runs. I would just do a code review for now.
Well, that time I checked the fix and ran it, and it does indeed appear to be an error. It looks like you reorganized things and renamed them when splitting from … Anyway, I committed the fix. For the other ones I will try to just check the code, since I guess you may have MCMC output that differs from what I downloaded from the AWS bucket, so I can't rely on results being identical.
Sounds good. But my suggestion is just not to run the code until we have the final version of the chains; there are things that changed that you wouldn't have on your computer. When you have modified all the constants and done the code review, I will rerun absolutely everything (with Julia 1.9), then do the full execution of the scripts from the command-line shell script and fix any bugs.
Strictly, for the RBC models, while we list the pseudotrue as 0.2, we actually simulate based on beta = 0.998, which gives beta_draw = 0.2004008. In the Dynare code, in contrast, we use the beta_draw parameterization, so it is exactly 0.2. Since for these models, unlike SGU, we use the exact same generated data sets, this difference never affects any calculations. However, it does mean that the value reported in the tables is rounded. I think this is something we can just ignore, since the rounding error is small relative to sampling/estimation error and goes in the wrong direction to explain why our mean estimates have a slight bias visible in the robustness plots, which is why I was investigating it. It does, however, explain as much as a third of the (small) positive bias in the T=200 frequentist stats. In that case it's potentially meaningful, so I propose replacing the pseudotrues with
pseudotrues = Dict(:α => 0.3, :β_draw => 0.2004008, :ρ => 0.9)
and I guess we can do the same in the robustness figures, though it will make them look a bit worse instead of better, by making the exact same replacement in:
Update: made the changes in commit 3446ba3.
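For reference, a minimal sketch of the conversion being discussed, assuming the β_draw parameterization is the usual β_draw = 100(1/β − 1) transformation (that formula is my inference from the numbers above, not something stated in the thread):

```julia
# Sketch only: assumes β_draw = 100 * (1/β - 1), which reproduces the 0.2004008 figure above.
β = 0.998
β_draw = 100 * (1 / β - 1)      # ≈ 0.2004008
rounding_gap = β_draw - 0.2     # ≈ 0.0004, the amount hidden by reporting the pseudotrue as 0.2

# Proposed pseudotrue dictionary with the more precise value
pseudotrues = Dict(:α => 0.3, :β_draw => β_draw, :ρ => 0.9)
```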
Thanks @donskerclass, all looks good. I think it is totally fine for you to change all of those files as you see fit. Better if the pseudo is correct to more digits. If you want to make file changes to regenerate the data as well, that is fine with me; you just need to tell me what to rerun. Otherwise I suggest you just make whatever changes you wish as you go, and then I will run things at the end.
No need to regenerate the data so far: the change was to make the evaluation consistent with the existing data rather than the data consistent with the evaluation. But I will continue to make changes, and if anything requires regeneration I'll let you know.
The code for … Everything else in the figures looks fine: I deleted or moved around a few excess legends to improve visibility, but that's an aesthetic choice we can modify later if needed.
@donskerclass let's hold off on changing the figures too much if possible until you can run them yourself. You don't have the modified runs, so I don't think it is possible yet. After #168 you should be able to change whatever you want.
No problem, I wasn't planning on further figure changes. Also, the Python files to generate the tables look fine to me, but they don't run on my machine because of whatever Python/Pandas installation I have, which doesn't particularly bother me.

In terms of results, I checked the times, and the 10-40x speedup (in ESS/sec) we claim for SGU2 for NUTS over Particle MH has decreased to more like a 3-5x speedup. Weirdly, this is a mix of our latest particle run being faster and our latest HMC run being less efficient (even though faster in total time, it has a smaller number of effective samples). Of course, the first-order reason to use HMC is quality, since those particle runs aren't producing trustworthy results, but this is a pretty big speed change. My guess is a mix of different machines, plus what was in my experience the major source of random variability across runs: the choice of step size in the adaptation period. This could be just the luck of the seed, or it could be due to regenerating the data and having, as it happens, a data realization that calls for a slightly smaller step size. Either way, I don't think this is too bad, but we may want to shift to emphasizing quality a little more than we emphasize speed.

For the RBC second order, we get similar speed ratios of NUTS vs particle MH as in the past version, but this comes from both being faster, maybe due to faster machines (or luck of latency with cloud compute, etc.). I think ESS% got slightly worse for NUTS, so it isn't from better efficiency in this run. Quality for everything else looks comparable, and the above performance issues seem within normal computational variation rather than implementation or code issues to be addressed.

I think you can go ahead and run the experiments now if you need to. Anything we want to do with writeups and aesthetics can be modified after the draws. I am still not entirely sure on the RBC_SV settings, but that's the only one that could possibly need to change: if it can be separated out, we can keep updating it after the other runs are fully finalized. Its quality is good enough that a run at current settings isn't a problem; we just may want to show different things about the model.
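For clarity, the speedups quoted above are ratios of effective samples per second, not of raw wall-clock time; a minimal sketch of the arithmetic with purely hypothetical numbers (not the measured values):

```julia
# Hypothetical numbers for illustration only — not the paper's actual timings or ESS.
ess_nuts, time_nuts = 1_000.0,   500.0   # effective samples, wall-clock seconds (NUTS)
ess_pmh,  time_pmh  =   150.0, 3_000.0   # effective samples, wall-clock seconds (Particle MH)

speedup = (ess_nuts / time_nuts) / (ess_pmh / time_pmh)   # ESS/sec ratio; 40.0 here
```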
Code review to look for errors, mismatched filenames, labels in wrong order, using the wrong experiments, etc.
- convert_frequentist_output.jl: Converts the output from the frequentist experiments into a format that can be used by the plotting scripts.
- convert_dynare_output.jl: Converts the output from the dynare experiments into a format that can be used by the plotting scripts, consistent with the Julia chains.
- baseline_figures.jl: Generates all figures except for the RBC robustness examples.
- rbc_robustness_figures.jl: Generates the RBC robustness figures.
- baseline_tables.py: Generates all tables except for the RBC frequentist tables.
- rbc_frequentist_tables.py: Generates the RBC frequentist tables.
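For orientation, a hypothetical driver showing one way the scripts above could be run in order (the real pipeline is the project's command-line shell script; this ordering and the exact invocation flags are my assumption):

```julia
# Hypothetical sketch of the execution order: conversions first, then figures, then tables.
for script in ["convert_frequentist_output.jl", "convert_dynare_output.jl",
               "baseline_figures.jl", "rbc_robustness_figures.jl"]
    run(`julia --project=. $script`)
end
for script in ["baseline_tables.py", "rbc_frequentist_tables.py"]
    run(`python $script`)
end
```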
- Go through all figures to make sure we don't have major regressions in quality.
- The original code used rbc2_joint_200_long in a few places. For example https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/main/scripts/generate_paper_results/baseline_figures.jl#L205-L207 and https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/main/scripts/generate_paper_results/baseline_figures.jl#L217
- Check SGU pseudotrues between julia and dynare: https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/main/scripts/generate_paper_results/baseline_figures.jl#L255
- James said that he thought something might have an error reordering at some point in the dynare vs. the julia code. Not sure if this is true or not, but created Verify SGU sampling of the last 4 parameters #167 to review the ideas. The density plots look bad, but it might be because the sampling is bad (or it starts at a particular initial condition away from the pseudotrue, or the priors are wrong, etc.).
- In https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/main/scripts/generate_paper_results/rbc_robustness_figures.jl review the pseudos and the yrange and xrange for display. Mess around with labels/figures/captions to your heart's content.
- In https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/main/scripts/generate_paper_results/baseline_tables.py#L83-L84 make sure the number of particles is up to date. Otherwise everything should come from metadata.
- Verify pseudos/etc. in https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/main/scripts/generate_paper_results/baseline_tables.py#L71-L74
- Change the footnotes as you see fit in https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/main/scripts/generate_paper_results/baseline_tables.py#L77-L84
- In https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/main/scripts/generate_paper_results/rbc_frequentist_tables.py just verify the code, look for errors, etc.
- You will see that in the paper itself I added the inferred shocks of the SGU and the RBC SV. Remove those if you don't like them.
- The new stuff with T=500 is in the Scaling with Sample Length appendix. Check if it is correct, then move it around as you see fit. If you want other results, we can add them, but I think this proves the point on how performance scales with N.
- Similarly, there is only a subset of new material in the RBC with Stochastic Volatility section, so add things in as you see fit.
- Figures 11 and 12 were moved to the appendix, as discussed. Feel free to change anything on those figures (e.g. in https://github.com/HighDimensionalEconLab/HMCExamples.jl/blob/main/scripts/generate_paper_results/rbc_robustness_figures.jl you can zoom in, change titles, resize, etc.); see the sketch below for the kind of tweaks meant.
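A minimal sketch of the kind of display tweaks referenced in the last two items, assuming the figures come from Plots.jl; the data, axis limits, title, and size here are placeholders rather than the scripts' actual settings:

```julia
using Plots

# Placeholder example: zoom in around a pseudotrue, retitle, and resize a density-style plot.
draws = 0.2004 .+ 0.01 .* randn(1_000)          # fake posterior draws, for illustration only
p = histogram(draws; label = "posterior draws")
plot!(p; xlims = (0.17, 0.23), title = "β_draw (zoomed)", size = (600, 400))
vline!(p, [0.2004008]; label = "pseudotrue")
```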