
Improve the OSACA macros #4152

Merged (5 commits) on Jan 1, 2025

@eggrobin commented Jan 1, 2025

  • Don’t inline the functions.
  • Fix the garbled argument reduction in UNDER_OSACA_HYPOTHESES.
  • Carry the loop through registers by default, now that ordering the graph is dealt with by “Graph improvements” (OSACA#21).
  • Use an outer loop to pin the markers where the loop-carried dependencies are still valid.
  • Make the OSACA_loop_terminator volatile. Reading it early did not seem to allow any unwanted optimizations, but it made the code less readable, with a load outside the loop and a mysterious test at the end. The extra movzx is irrelevant to latency analysis.
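
As an illustration of the last bullet, here is a minimal sketch of a latency loop whose exit flag is volatile. It is not the Principia macro: `loop_terminator`, `FunctionUnderAnalysis`, and `RunLatencyLoop` are hypothetical names, and `std::cos` merely stands in for the function being analysed. The flag is read at the bottom of every iteration, which is consistent with the `movzx`/`test`/`je` sequence at the end of the OSACA listing in this description.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for OSACA_loop_terminator.  Because it is volatile,
// the compiler must reload it on every iteration: it can neither hoist the
// load out of the loop nor assume anything about when the loop exits.
volatile bool loop_terminator = false;

// Hypothetical function under analysis.
double FunctionUnderAnalysis(double const x) {
  return std::cos(x);
}

// Calls the function repeatedly, feeding each result back in as the next
// argument so that successive iterations form a loop-carried dependency.
double RunLatencyLoop(double const input, int const iterations) {
  assert(iterations > 0);
  loop_terminator = false;
  double carried = input;
  int remaining = iterations;
  do {
    carried = FunctionUnderAnalysis(carried);
    // Assumption of this sketch only: the flag is set after a fixed number
    // of iterations so that the example terminates when actually run.
    if (--remaining == 0) {
      loop_terminator = true;
    }
  } while (!loop_terminator);  // Volatile read at the end of the loop body.
  return carried;
}
```

With the read placed in the loop condition there is no load before the loop and no separate test to explain, at the cost of one extra load per iteration that does not sit on the dependency chain of `carried`.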

Now works as expected for other quadrants, e.g., θ = 3 (CSX; mind the missing roundsd latency; something else seems to be missing too, compare other *CAs on the same code):
[Image: osaca_dg-cropped (1)]

Open Source Architecture Code Analyzer (OSACA) - 0.6.1
Analyzed file:      ..\Principia\numerics\Release\x64\sin_cos.asm
Architecture:       CSX
Timestamp:          2025-01-01 04:28:52


 P - Throughput of LOAD operation can be hidden behind a past or future STORE instruction
 * - Instruction micro-ops not bound to a port
 X - No throughput/latency information for this instruction in data file


Combined Analysis Report
------------------------
                                      Port pressure in cycles

     |  0   - 0DV  |  1   |  2   -  2D  |  3   -  3D  |  4   |  5   |  6   |  7   ||  CP  | LCD  |
--------------------------------------------------------------------------------------------------
2718 |             |      |             |             |      |      |      |      ||      |      | X npad 7
2719 |             |      |             |             |      |      |      |      ||      |      |   $OSACA_loop$154:
2720 |             |      |             |             |      |      |      |      ||      |      |   ; Line 333
2721 | 1.00        |      |             |             |      |      |      |      ||      |      |   comisd xmm15, xmm4
2722 |             |      |             |             |      |      |      |      ||      |      | * jbe SHORT $LN16@Cos
2723 | 1.00        |      |             |             |      |      |      |      ||      |      |   comisd xmm4, xmm13
2724 |             |      |             |             |      |      |      |      ||      |      | * jbe SHORT $LN16@Cos
2725 | 0.50        | 0.00 |             |             |      | 0.00 | 0.50 |      ||      |      |   mov al, 1
2726 | 0.50        |      | 0.50   0.50 | 0.50   0.50 |      |      | 0.50 |      ||      |      |   jmp SHORT $LN17@Cos
2727 |             |      |             |             |      |      |      |      ||      |      |   $LN16@Cos:
2728 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   xor eax, eax
2729 |             |      |             |             |      |      |      |      ||      |      |   $LN17@Cos:
2730 |             |      |             |             |      |      |      |      ||      |      |   ; Line 338
2731 |             |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   movsd xmm0, QWORD PTR __real@411921fb54442d18
2732 | 1.00        |      |             |             |      |      |      |      ||      |      |   comisd xmm0, xmm4
2733 |             |      | 0.00        | 0.00        | 1.00 |      |      | 1.00 ||      |      |   mov BYTE PTR OSACA_computed_condition$13[rsp], al
2734 |             |      |             |             |      |      |      |      ||      |      | * jb SHORT $LN18@Cos
2735 | 1.00        |      |             |             |      |      |      |      ||      |      |   comisd xmm4, xmm14
2736 |             |      |             |             |      |      |      |      ||      |      | * jb SHORT $LN18@Cos
2737 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   mov al, 1
2738 | 0.00        |      | 0.50   0.50 | 0.50   0.50 |      |      | 1.00 |      ||      |      |   jmp SHORT $LN19@Cos
2739 |             |      |             |             |      |      |      |      ||      |      |   $LN18@Cos:
2740 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   xor eax, eax
2741 |             |      |             |             |      |      |      |      ||      |      |   $LN19@Cos:
2742 |             |      |             |             |      |      |      |      ||      |      |   ; Line 343
2743 |             |      |             |             |      |      |      |      ||  0.0 |  0.0 | * movaps xmm0, xmm4
2744 |             |      | 0.00        | 0.00        | 1.00 |      |      | 1.00 ||      |      |   mov BYTE PTR OSACA_computed_condition$12[rsp], al
2745 | 0.33        | 0.33 | 0.50   0.50 | 0.50   0.50 |      | 0.33 |      |      ||  4.0 |  4.0 |   mulsd xmm0, QWORD PTR __real@3fe45f306dc9c883
2746 | 0.33        | 0.33 |             |             |      | 0.33 |      |      ||      |      |   xorps xmm1, xmm1
2747 |             |      |             |             |      | 1.00 |      |      ||  1.0 |  1.0 |   movsd xmm1, xmm0
2748 |             |      |             |             |      |      |      |      ||  0.0 |  0.0 | X roundsd xmm2, xmm1, 4
2749 |             |      |             |             |      |      |      |      ||      |      |   ; Line 347
2750 |             |      |             |             |      |      |      |      ||  0.0 |  0.0 | * movaps xmm0, xmm2
2751 | 0.33        | 0.33 | 0.50   0.50 | 0.50   0.50 |      | 0.33 |      |      ||  4.0 |  4.0 |   mulsd xmm0, QWORD PTR __real@3ff921fb54440000
2752 | 1.33        | 0.33 |             |             |      | 0.33 |      |      ||      |      |   cvtsd2si rdx, xmm2
2753 |             |      |             |             |      |      |      |      ||      |      |   ; Line 348
2754 | 0.33        | 0.33 | 0.50   0.50 | 0.50   0.50 |      | 0.33 |      |      ||      |      |   mulsd xmm2, QWORD PTR __real@3d868c234c4c6629
2755 | 0.33        | 0.33 |             |             |      | 0.33 |      |      ||  4.0 |  4.0 |   subsd xmm4, xmm0
2756 |             |      |             |             |      |      |      |      ||      |      |   ; Line 351
2757 |             |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   movsd xmm0, QWORD PTR __real@be10000000000000
2758 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\double_precision_body.hpp
2759 |             |      |             |             |      |      |      |      ||      |      |   ; Line 370
2760 |             |      |             |             |      |      |      |      ||  0.0 |  0.0 | * movaps xmm11, xmm4
2761 | 0.33        | 0.33 |             |             |      | 0.33 |      |      ||  4.0 |  4.0 |   subsd xmm11, xmm2
2762 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2763 |             |      |             |             |      |      |      |      ||      |      |   ; Line 351
2764 | 1.00        |      |             |             |      |      |      |      ||      |      |   comisd xmm0, xmm11
2765 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\double_precision_body.hpp
2766 |             |      |             |             |      |      |      |      ||      |      |   ; Line 371
2767 | 0.33        | 0.33 |             |             |      | 0.33 |      |      ||      |      |   subsd xmm4, xmm11
2768 | 0.67        | 0.16 |             |             |      | 0.16 |      |      ||      |      |   subsd xmm4, xmm2
2769 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2770 |             |      |             |             |      |      |      |      ||      |      |   ; Line 351
2771 |             |      |             |             |      |      |      |      ||      |      | * ja SHORT $LN20@Cos
2772 | 1.00        |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   comisd xmm11, QWORD PTR __real@3e10000000000000
2773 |             |      |             |             |      |      |      |      ||      |      | * ja SHORT $LN20@Cos
2774 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   xor eax, eax
2775 | 0.00        |      | 0.50   0.50 | 0.50   0.50 |      |      | 1.00 |      ||      |      |   jmp SHORT $LN21@Cos
2776 |             |      |             |             |      |      |      |      ||      |      |   $LN20@Cos:
2777 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   mov al, 1
2778 |             |      |             |             |      |      |      |      ||      |      |   $LN21@Cos:
2779 |             |      |             |             |      |      |      |      ||      |      |   ; Line 431
2780 |             |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   movaps xmm12, XMMWORD PTR ?sign_bit@masks@internal@_sin_cos@numerics@principia@@3U__m128d@@B
2781 | 0.33        | 0.33 |             |             |      | 0.33 |      |      ||      |      |   xorps xmm3, xmm3
2782 |             |      |             |             |      |      |      |      ||      |      |   ; Line 351
2783 |             |      | 0.00        | 0.00        | 1.00 |      |      | 1.00 ||      |      |   mov BYTE PTR OSACA_computed_condition$11[rsp], al
2784 |             |      |             |             |      |      |      |      ||      |      |   ; Line 431
2785 |             |      |             |             |      | 1.00 |      |      ||      |      |   movsd xmm3, xmm11
2786 |             |      |             |             |      |      |      |      ||      |      |   ; Line 499
2787 |             |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   movzx eax, BYTE PTR ?UseHardwareFMA@internal@_fma@numerics@principia@@3_NB ; principia::numerics::_fma::internal::UseHardwareFMA
2788 |             |      |             |             |      |      |      |      ||      |      |   ; Line 431
2789 | 0.33        | 0.33 |             |             |      | 0.33 |      |      ||      |      |   andps xmm3, xmm12
2790 |             |      |             |             |      |      |      |      ||      |      |   ; Line 499
2791 |             |      | 0.00        | 0.00        | 1.00 |      |      | 1.00 ||      |      |   mov BYTE PTR OSACA_computed_condition$18[rsp], al
2792 | 0.33        | 0.33 |             |             |      | 0.33 |      |      ||      |      |   xorps xmm10, xmm10
2793 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include\cstdlib
2794 |             |      |             |             |      |      |      |      ||      |      |   ; Line 23
2795 |             |      |             |             |      |      |      |      ||  0.0 |  0.0 | * movaps xmm0, xmm11
2796 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2797 |             |      |             |             |      |      |      |      ||      |      |   ; Line 352
2798 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   mov rcx, rdx
2799 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include\cstdlib
2800 |             |      |             |             |      |      |      |      ||      |      |   ; Line 23
2801 | 0.33        | 0.33 |             |             |      | 0.33 |      |      ||  1.0 |  1.0 |   andps xmm0, xmm6
2802 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2803 |             |      |             |             |      |      |      |      ||      |      |   ; Line 500
2804 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   and dl, 1
2805 |             |      |             |             |      |      |      |      ||      |      |   ; Line 275
2806 | 0.33        | 0.33 | 0.50   0.50 | 0.50   0.50 |      | 0.33 |      |      ||  4.0 |  4.0 |   addsd xmm0, QWORD PTR __real@42a0000000000000
2807 |             |      |             |             |      |      |      |      ||      |      |   ; Line 448
2808 |             |      |             |             |      |      |      |      ||      |      | * movaps xmm6, xmm4
2809 |             |      |             |             |      |      |      |      ||      |      |   ; Line 500
2810 |             |      | 0.00        | 0.00        | 1.00 |      |      | 1.00 ||      |      |   mov BYTE PTR OSACA_computed_condition$17[rsp], dl
2811 |             |      |             |             |      |      |      |      ||      |      |   ; Line 448
2812 | 0.50        | 0.50 |             |             |      |      |      |      ||      |      |   addsd xmm6, xmm4
2813 |             |      |             |             |      |      |      |      ||      |      |   ; Line 352
2814 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   and ecx, 3
2815 |             |      |             |             |      |      |      |      ||      |      |   ; Line 275
2816 | 0.67        | 0.16 | 0.50   0.50 | 0.50   0.50 |      | 0.16 |      |      ||  1.0 |  1.0 |   andps xmm0, XMMWORD PTR ?mantissa_index_bits@masks@internal@_sin_cos@numerics@principia@@3U__m128d@@B
2817 | 1.00        |      |             |             |      |      |      |      ||  1.0 |  1.0 |   movq rax, xmm0
2818 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include\array
2819 |             |      |             |             |      |      |      |      ||      |      |   ; Line 545
2820 | 0.00        |      |             |             |      |      | 1.00 |      ||  1.0 |  1.0 |   shl rax, 5
2821 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\fma_body.hpp
2822 |             |      |             |             |      |      |      |      ||      |      |   ; Line 35
2823 |             |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   movsd xmm8, QWORD PTR [rax+r8+16]
2824 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2825 |             |      |             |             |      |      |      |      ||      |      |   ; Line 436
2826 |             |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   movsd xmm0, QWORD PTR [rax+r8+8]
2827 |             |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||  4.0 |  4.0 |   movsd xmm1, QWORD PTR [rax+r8]
2828 |             |      |             |             |      |      |      |      ||      |      | * movaps xmm2, xmm0
2829 | 0.33        | 0.33 |             |             |      | 0.33 |      |      ||  1.0 |  1.0 |   xorps xmm1, xmm3
2830 |             |      |             |             |      |      |      |      ||      |      |   ; Line 283
2831 |             |      | 0.00        | 0.00        | 1.00 |      |      | 1.00 ||      |      |   mov BYTE PTR OSACA_computed_condition$10[rsp], 1
2832 |             |      |             |             |      |      |      |      ||      |      |   ; Line 436
2833 | 0.33        | 0.33 |             |             |      | 0.33 |      |      ||      |      |   xorps xmm2, xmm3
2834 |             |      |             |             |      |      |      |      ||      |      |   ; Line 294
2835 |             |      | 0.00        | 0.00        | 1.00 |      |      | 1.00 ||      |      |   mov BYTE PTR OSACA_computed_condition$9[rsp], 1
2836 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\fma_body.hpp
2837 |             |      |             |             |      |      |      |      ||      |      |   ; Line 35
2838 |             |      |             |             |      | 1.00 |      |      ||      |      |   movsd xmm10, xmm2
2839 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2840 |             |      |             |             |      |      |      |      ||      |      |   ; Line 444
2841 | 0.33        | 0.33 |             |             |      | 0.33 |      |      ||  4.0 |  4.0 |   subsd xmm11, xmm1
2842 | 0.33        | 0.33 |             |             |      | 0.33 |      |      ||      |      |   xorps xmm1, xmm1
2843 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\fma_body.hpp
2844 |             |      |             |             |      |      |      |      ||      |      |   ; Line 35
2845 |             |      |             |             |      |      |      |      ||      |      | * movaps xmm0, xmm8
2846 |             |      |             |             |      |      |      |      ||      |      | * movaps xmm9, xmm10
2847 |             |      |             |             |      |      |      |      ||      |      |   ; Line 46
2848 |             |      |             |             |      |      |      |      ||      |      | * movaps xmm7, xmm10
2849 | 0.11        | 0.13 |             |             |      | 0.75 |      |      ||      |      |   xorps xmm3, xmm3
2850 |             |      |             |             |      |      |      |      ||      |      |   ; Line 35
2851 |             |      |             |             |      | 1.00 |      |      ||      |      |   movsd xmm3, xmm11
2852 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2853 |             |      |             |             |      |      |      |      ||      |      |   ; Line 448
2854 | 0.50        | 0.50 |             |             |      |      |      |      ||  4.0 |  4.0 |   addsd xmm6, xmm11
2855 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\fma_body.hpp
2856 |             |      |             |             |      |      |      |      ||      |      |   ; Line 35
2857 | 0.50        | 0.50 |             |             |      |      |      |      ||      |      |   vfnmadd213sd xmm9, xmm3, xmm0
2858 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\double_precision_body.hpp
2859 |             |      |             |             |      |      |      |      ||      |      |   ; Line 310
2860 |             |      |             |             |      |      |      |      ||      |      | * movaps xmm0, xmm9
2861 | 0.00        | 0.00 |             |             |      | 1.00 |      |      ||      |      |   subsd xmm0, xmm8
2862 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2863 |             |      |             |             |      |      |      |      ||      |      |   ; Line 448
2864 | 0.50        | 0.50 |             |             |      |      |      |      ||  4.0 |  4.0 |   mulsd xmm6, xmm11
2865 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\fma_body.hpp
2866 |             |      |             |             |      |      |      |      ||      |      |   ; Line 46
2867 | 0.50        | 0.50 |             |             |      |      |      |      ||      |      |   vfnmsub213sd xmm7, xmm3, xmm0
2868 |             |      |             |             |      |      |      |      ||      |      |   ; Line 14
2869 |             |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   movaps xmm0, XMMWORD PTR tv946[rsp]
2870 |             |      |             |             |      |      |      |      ||      |      | * movaps xmm2, xmm0
2871 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2872 |             |      |             |             |      |      |      |      ||      |      |   ; Line 450
2873 | 0.38        | 0.62 |             |             |      |      |      |      ||      |      |   mulsd xmm8, xmm6
2874 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\fma_body.hpp
2875 |             |      |             |             |      |      |      |      ||      |      |   ; Line 14
2876 |             |      |             |             |      | 1.00 |      |      ||  1.0 |  1.0 |   movsd xmm1, xmm6
2877 |             |      |             |             |      |      |      |      ||      |      | * movaps xmm0, xmm4
2878 | 0.00        | 1.00 | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||  4.0 |  4.0 |   vfmadd213sd xmm2, xmm1, XMMWORD PTR tv947[rsp]
2879 | 0.00        | 1.00 | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   vfmadd213sd xmm5, xmm1, XMMWORD PTR tv949[rsp]
2880 |             |      |             |             |      |      |      |      ||  0.0 |  0.0 | * movaps xmm1, xmm2
2881 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2882 |             |      |             |             |      |      |      |      ||      |      |   ; Line 450
2883 | 0.00        | 1.00 |             |             |      |      |      |      ||      |      |   mulsd xmm8, xmm5
2884 | 0.00        | 0.25 |             |             |      | 0.75 |      |      ||      |      |   xorps xmm2, xmm2
2885 | 0.00        | 0.00 |             |             |      | 1.00 |      |      ||      |      |   xorps xmm4, xmm4
2886 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\fma_body.hpp
2887 |             |      |             |             |      |      |      |      ||      |      |   ; Line 14
2888 |             |      |             |             |      | 1.00 |      |      ||      |      |   movsd xmm4, xmm0
2889 |             |      |             |             |      | 1.00 |      |      ||  1.0 |  1.0 |   movsd xmm2, xmm1
2890 | 0.00        | 0.00 |             |             |      | 1.00 |      |      ||      |      |   xorps xmm1, xmm1
2891 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2892 |             |      |             |             |      |      |      |      ||      |      |   ; Line 449
2893 |             |      |             |             |      |      |      |      ||      |      | * movaps xmm3, xmm6
2894 | 0.00        | 1.00 |             |             |      |      |      |      ||      |      |   mulsd xmm3, xmm11
2895 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\fma_body.hpp
2896 |             |      |             |             |      |      |      |      ||      |      |   ; Line 14
2897 |             |      |             |             |      | 1.00 |      |      ||      |      |   movsd xmm1, xmm3
2898 | 0.00        | 1.00 |             |             |      |      |      |      ||  4.0 |  4.0 |   vfmadd213sd xmm1, xmm2, xmm4
2899 |             |      |             |             |      |      |      |      ||      |      |   ; Line 35
2900 |             |      |             |             |      |      |      |      ||  0.0 |  0.0 | * movaps xmm0, xmm1
2901 | 0.00        | 1.00 |             |             |      |      |      |      ||  4.0 |  4.0 |   vfnmadd213sd xmm10, xmm0, xmm8
2902 | 0.00        | 1.00 |             |             |      |      |      |      ||  4.0 |  4.0 |   addsd xmm10, xmm7
2903 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\double_precision_body.hpp
2904 |             |      |             |             |      |      |      |      ||      |      |   ; Line 352
2905 |             |      |             |             |      |      |      |      ||  0.0 |  0.0 | * movaps xmm4, xmm10
2906 | 0.00        | 1.00 |             |             |      |      |      |      ||  4.0 |  4.0 |   addsd xmm4, xmm9
2907 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2908 |             |      |             |             |      |      |      |      ||      |      |   ; Line 310
2909 |             |      |             |             |      |      |      |      ||      |      | * movaps xmm0, xmm4
2910 |             |      |             |             |      |      |      |      ||      |      | * movaps xmm2, xmm0
2911 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\double_precision_body.hpp
2912 |             |      |             |             |      |      |      |      ||      |      |   ; Line 353
2913 |             |      |             |             |      |      |      |      ||  0.0 |      | * movaps xmm0, xmm4
2914 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2915 |             |      |             |             |      |      |      |      ||      |      |   ; Line 310
2916 | 0.00        | 0.50 | 0.50   0.50 | 0.50   0.50 |      | 0.50 |      |      ||      |      |   andps xmm2, XMMWORD PTR ?exponent_bits@masks@internal@_sin_cos@numerics@principia@@3U__m128d@@B
2917 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\double_precision_body.hpp
2918 |             |      |             |             |      |      |      |      ||      |      |   ; Line 353
2919 | 0.00        | 0.50 |             |             |      | 0.50 |      |      ||  4.0 |      |   subsd xmm0, xmm9
2920 | 0.00        | 0.50 |             |             |      | 0.50 |      |      ||  4.0 |      |   subsd xmm10, xmm0
2921 |             |      |             |             |      |      |      |      ||      |      |   ; File C:\Users\robin\Projects\mockingbirdnest\Principia\numerics\sin_cos.cpp
2922 |             |      |             |             |      |      |      |      ||      |      |   ; Line 318
2923 |             |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   movsd xmm0, QWORD PTR __real@fcaffff000000000
2924 | 0.00        | 0.49 |             |             |      | 0.51 |      |      ||  1.0 |      |   andnps xmm12, xmm10
2925 | 0.00        | 0.51 |             |             |      | 0.49 |      |      ||  1.0 |      |   psubq xmm12, xmm2
2926 | 1.00        |      |             |             |      |      |      |      ||  3.0 |      |   comisd xmm0, xmm12
2927 |             |      |             |             |      |      |      |      ||      |      | * jbe SHORT $LN128@Cos
2928 | 1.00        |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   comisd xmm12, QWORD PTR __real@fcb0000800000000
2929 |             |      |             |             |      |      |      |      ||      |      | * jbe SHORT $LN128@Cos
2930 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   mov al, 1
2931 | 0.00        |      | 0.50   0.50 | 0.50   0.50 |      |      | 1.00 |      ||      |      |   jmp SHORT $LN129@Cos
2932 |             |      |             |             |      |      |      |      ||      |      |   $LN128@Cos:
2933 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   xor eax, eax
2934 |             |      |             |             |      |      |      |      ||      |      |   $LN129@Cos:
2935 |             |      |             |             |      |      |      |      ||      |      |   ; Line 512
2936 | 1.00        |      |             |             |      |      |      |      ||      |      |   ucomisd xmm4, xmm4
2937 |             |      |             |             |      |      |      |      ||      |      |   ; Line 318
2938 |             |      | 0.00        | 0.00        | 1.00 |      |      | 1.00 ||      |      |   mov BYTE PTR OSACA_computed_condition$8[rsp], al
2939 |             |      |             |             |      |      |      |      ||      |      |   ; Line 512
2940 |             |      |             |             |      |      |      |      ||      |      | * jnp SHORT $LN8@Cos
2941 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   mov al, 1
2942 | 0.00        |      | 0.50   0.50 | 0.50   0.50 |      |      | 1.00 |      ||      |      |   jmp SHORT $LN9@Cos
2943 |             |      |             |             |      |      |      |      ||      |      |   $LN8@Cos:
2944 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   xor eax, eax
2945 |             |      |             |             |      |      |      |      ||      |      |   $LN9@Cos:
2946 |             |      |             |             |      |      |      |      ||      |      |   ; Line 515
2947 | 0.00        | 0.50 | 0.50   0.50 | 0.50   0.50 |      | 0.50 |      |      ||      |  1.0 |   xorps xmm4, QWORD PTR __xmm@80000000000000008000000000000000
2948 | 0.00        | 0.50 |             |             |      | 0.50 |      |      ||      |      |   xorps xmm2, xmm2
2949 |             |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   movaps xmm5, XMMWORD PTR tv948[rsp]
2950 |             |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   movsd xmm6, QWORD PTR __xmm@7fffffffffffffff7fffffffffffffff
2951 |             |      | 0.00        | 0.00        | 1.00 |      |      | 1.00 ||      |      |   mov BYTE PTR OSACA_computed_condition$16[rsp], al
2952 |             | 1.00 |             |             |      | 0.00 |      |      ||      |      |   lea rax, QWORD PTR [rcx-1]
2953 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   cmp rax, 1
2954 | 0.00        |      | 0.00        | 0.00        | 1.00 |      | 2.00 | 1.00 ||      |      |   setbe BYTE PTR OSACA_computed_condition$15[rsp]
2955 |             |      | 0.50   0.50 | 0.50   0.50 |      |      |      |      ||      |      |   movzx eax, BYTE PTR ?OSACA_loop_terminator@@3_NC
2956 | 0.00        | 0.00 |             |             |      | 0.00 | 1.00 |      ||      |      |   test al, al
2957 |             |      |             |             |      |      |      |      ||      |      | * je $OSACA_loop$154
2958 |             |      | 0.00        | 0.00        | 1.00 |      |      | 1.00 ||      |      |   movsd QWORD PTR OSACA_result$14[rsp], xmm4

       22.0          22.0   14.0   14.0   14.0   14.0   11.0   22.0   22.0   11.0    77.0   65.0
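
Source lines 343 to 348 of the listing, together with lines 370 and 371 of double_precision_body.hpp, read like a two-constant argument reduction: multiply by 2/π, round to an integer n, subtract n times a high part and a low part of π/2, and keep the rounding error of the last subtraction. A minimal sketch of that reading follows. The constants are the bit patterns of the `__real@` operands in the listing, written as hexadecimal floating-point literals; the names `Reduced` and `ReduceSketch` are hypothetical, the comparisons that the listing performs around the reduction are omitted, and the actual Principia source may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical result type: the reduced argument as an unevaluated sum
// value + error, and the quadrant n mod 4.
struct Reduced {
  double value;
  double error;
  std::int64_t quadrant;
};

Reduced ReduceSketch(double const x) {
  // Bit patterns 3fe45f306dc9c883, 3ff921fb54440000, 3d868c234c4c6629.
  constexpr double two_over_pi = 0x1.45f306dc9c883p-1;
  constexpr double pi_over_2_high = 0x1.921fb5444p0;
  constexpr double pi_over_2_low = 0x1.68c234c4c6629p-39;

  // roundsd with immediate 4 rounds in the current rounding mode, which is
  // what std::nearbyint does.
  double const n = std::nearbyint(x * two_over_pi);
  // The high constant has trailing zero bits in its significand, which keeps
  // n * pi_over_2_high exact for small n.
  double const y = x - n * pi_over_2_high;
  double const dy = n * pi_over_2_low;
  double const value = y - dy;
  // Rounding error of the subtraction above, computed as in the listing
  // (two subsd instructions).
  double const error = (y - value) - dy;
  return {value, error, static_cast<std::int64_t>(n) & 3};
}
```

For θ = 3 this gives n = 2, hence quadrant 2 and a reduced argument close to 3 − π.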

@@ -7,20 +7,7 @@ namespace numerics {
namespace _sin_cos {
namespace internal {

#define PRINCIPIA_INLINE_SIN_COS 0

The references to this symbol in sin_cos.cpp must be removed too.

@pleroy added the LGTM label Jan 1, 2025
@eggrobin merged commit 68853a6 into mockingbirdnest:master Jan 1, 2025
7 checks passed