Bool writer like reader #118

Open
wants to merge 2 commits into
base: main

Conversation

Melirius
Collaborator

By making use of known bit lengths, we can reduce the number of low_value flush tests and improve performance.
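To illustrate the general idea only, here is a minimal sketch of batching the flush test when the length of a bit group is known up front. It is a toy MSB-first bit packer, not the arithmetic bool writer this PR changes, and every name in it (`BitSink`, `put_bits_checked_each`, `put_bits_checked_once`, `flush_tests`) is hypothetical rather than taken from lepton_jpeg.

```rust
/// Toy MSB-first bit packer. Hypothetical illustration only: this is NOT
/// the arithmetic bool writer from lepton_jpeg.
struct BitSink {
    acc: u64,         // pending bits, right-aligned, oldest bit highest
    pending: u32,     // number of valid bits in `acc`
    out: Vec<u8>,     // bytes flushed so far
    flush_tests: u64, // how many "do we need to flush?" tests were executed
}

impl BitSink {
    fn new() -> Self {
        BitSink { acc: 0, pending: 0, out: Vec::new(), flush_tests: 0 }
    }

    /// Move every whole byte from the accumulator to the output.
    fn drain(&mut self) {
        while self.pending >= 8 {
            self.pending -= 8;
            self.out.push((self.acc >> self.pending) as u8);
        }
        // Keep only the remaining (fewer than 8) low bits.
        self.acc &= (1u64 << self.pending) - 1;
    }

    fn push_bit(&mut self, bit: u32) {
        self.acc = (self.acc << 1) | u64::from(bit & 1);
        self.pending += 1;
    }

    /// Baseline: test for a flush after every single bit.
    fn put_bits_checked_each(&mut self, value: u32, n: u32) {
        assert!(n <= 32);
        for i in (0..n).rev() {
            self.push_bit(value >> i);
            self.flush_tests += 1;
            if self.pending >= 8 {
                self.drain();
            }
        }
    }

    /// Batched: `n` is known in advance, so one test per group is enough.
    /// After a drain at most 7 bits remain, and 7 + 32 fits in 64 bits.
    fn put_bits_checked_once(&mut self, value: u32, n: u32) {
        assert!(n <= 32);
        self.flush_tests += 1;
        if self.pending + n > 64 {
            self.drain();
        }
        for i in (0..n).rev() {
            self.push_bit(value >> i);
        }
    }

    /// Flush everything, padding the last partial byte with zero bits.
    fn finish(mut self) -> Vec<u8> {
        self.drain();
        if self.pending > 0 {
            self.out.push((self.acc << (8 - self.pending)) as u8);
        }
        self.out
    }
}

fn main() {
    // (value, bit length) pairs; the lengths are known before writing.
    let groups = [(0b101u32, 3u32), (0b11111, 5), (0, 1), (0xABCD, 16)];
    let mut each = BitSink::new();
    let mut once = BitSink::new();
    for (v, n) in groups {
        each.put_bits_checked_each(v, n);
        once.put_bits_checked_once(v, n);
    }
    println!("flush tests: per-bit = {}, per-group = {}", each.flush_tests, once.flush_tests);
    println!("same output: {}", each.finish() == once.finish());
}
```

Both variants produce the same byte stream; the batched one executes one test per group (4 here) instead of one per bit (25 here). In the real coder the bookkeeping differs, so treat this only as a picture of why a known length lets the test move out of the inner loop.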

This PR:

2024-11-21T17:57:59.491Z INFO  [lepton_jpeg::structs::lepton_file_writer] compressing to Lepton format
2024-11-21T17:57:59.957Z INFO  [lepton_jpeg::structs::lepton_file_writer] Number of threads: 8
2024-11-21T17:58:01.172Z INFO  [lepton_jpeg::structs::lepton_file_writer] worker threads 8267ms of CPU time in 1213ms of wall time
2024-11-21T17:58:01.172Z INFO  [lepton_jpeg::structs::lepton_file_writer] decompressing to verify contents
2024-11-21T17:58:02.767Z INFO  [lepton_jpeg_util] compressed input 22171278, output 17324074 bytes (compression = 28.0%)
2024-11-21T17:58:02.767Z INFO  [lepton_jpeg_util] Main thread CPU: 3275ms, Worker thread CPU: 19316 ms, walltime: 3275 ms

 Performance counter stats for 'taskset -c 10 nice -n -20 target/release/lepton_jpeg_util images/img_52MP_7k.jpg images/img_52MP_7k2.lep':

       846 331 521      cache-references                                                        (41,79%)
        79 315 801      cache-misses                     #    9,37% of all cache refs           (41,85%)
    14 764 047 286      cycles                                                                  (41,94%)
       932 581 537      ic_fetch_stall.ic_stall_back_pressure                                        (41,93%)
     1 011 815 567      stalled-cycles-frontend          #    6,85% frontend cycles idle        (42,02%)
    36 784 577 922      instructions                     #    2,49  insn per cycle            
                                                  #    0,03  stalled cycles per insn     (42,22%)
     4 312 566 577      branch-instructions                                                     (42,19%)
       158 545 838      branch-misses                    #    3,68% of all branches             (42,14%)
     5 065 300 912      ic_fetch_stall.ic_stall_any                                             (41,99%)
        40 037 609      ic_fetch_stall.ic_stall_dq_empty                                        (41,67%)
        66 558 200      l2_cache_misses_from_ic_miss                                            (41,57%)
     2 040 840 392      l2_latency.l2_cycles_waiting_on_fills                                        (41,56%)
           182 998      faults                                                                
                 1      migrations                                                            

       3,309236918 seconds time elapsed

       3,009823000 seconds user
       0,294591000 seconds sys

main:

2024-11-21T17:56:45.765Z INFO  [lepton_jpeg::structs::lepton_file_writer] compressing to Lepton format
2024-11-21T17:56:46.236Z INFO  [lepton_jpeg::structs::lepton_file_writer] Number of threads: 8
2024-11-21T17:56:47.456Z INFO  [lepton_jpeg::structs::lepton_file_writer] worker threads 8295ms of CPU time in 1219ms of wall time
2024-11-21T17:56:47.456Z INFO  [lepton_jpeg::structs::lepton_file_writer] decompressing to verify contents
2024-11-21T17:56:49.053Z INFO  [lepton_jpeg_util] compressed input 22171278, output 17324074 bytes (compression = 28.0%)
2024-11-21T17:56:49.053Z INFO  [lepton_jpeg_util] Main thread CPU: 3288ms, Worker thread CPU: 19384 ms, walltime: 3288 ms

 Performance counter stats for 'taskset -c 10 nice -n -20 target/release/lepton_jpeg_util images/img_52MP_7k.jpg images/img_52MP_7k2.lep':

       862 881 575      cache-references                                                        (41,81%)
        75 491 793      cache-misses                     #    8,75% of all cache refs           (41,82%)
    14 923 123 147      cycles                                                                  (41,78%)
       768 707 879      ic_fetch_stall.ic_stall_back_pressure                                        (41,89%)
       979 668 407      stalled-cycles-frontend          #    6,56% frontend cycles idle        (41,89%)
    37 907 138 544      instructions                     #    2,54  insn per cycle            
                                                  #    0,03  stalled cycles per insn     (41,86%)
     4 237 299 015      branch-instructions                                                     (41,93%)
       150 601 797      branch-misses                    #    3,55% of all branches             (41,89%)
     5 258 215 965      ic_fetch_stall.ic_stall_any                                             (41,84%)
        35 478 851      ic_fetch_stall.ic_stall_dq_empty                                        (41,97%)
        61 239 992      l2_cache_misses_from_ic_miss                                            (41,97%)
     1 945 145 590      l2_latency.l2_cycles_waiting_on_fills                                        (41,83%)
           182 871      faults                                                                
                 1      migrations                                                            

       3,331041940 seconds time elapsed

       2,981193000 seconds user
       0,346673000 seconds sys
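
For a quick side-by-side reading of the two runs, the relative change of a few counters can be computed as below. The numbers are copied from the perf output above; the helper `pct_change` is a hypothetical name introduced only for this sketch.

```rust
/// Relative change from `base` to `new`, in percent (negative = reduction).
fn pct_change(base: f64, new: f64) -> f64 {
    (new - base) / base * 100.0
}

fn main() {
    // (counter, main, this PR), copied from the perf stat output.
    let rows = [
        ("instructions", 37_907_138_544.0, 36_784_577_922.0),
        ("cycles", 14_923_123_147.0, 14_764_047_286.0),
        ("branch-instructions", 4_237_299_015.0, 4_312_566_577.0),
        ("branch-misses", 150_601_797.0, 158_545_838.0),
        ("seconds elapsed", 3.331041940, 3.309236918),
    ];
    for (name, base, new) in rows {
        println!("{name:<22} {:+.2}%", pct_change(base, new));
    }
}
```

On these two runs this works out to roughly 3% fewer instructions and about 1% fewer cycles for the PR, with slightly more branch instructions and branch misses. The counters were multiplexed (about 42% coverage each), so small differences should be read with caution.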

Like reader it reduces the number of value tests
@Melirius
Collaborator Author

Rebased
