Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate large .expect files #2233

Open
rachitnigam opened this issue Jul 29, 2024 · 11 comments
Open

Eliminate large .expect files #2233

rachitnigam opened this issue Jul 29, 2024 · 11 comments

Comments

@rachitnigam
Copy link
Contributor

rachitnigam commented Jul 29, 2024

The general philosophy around checking in golden files requires them to be inspected when changes occur. It is not clear to me that files larger than a reasonably small bound are ever realistically checked. This leads to a false sense of security because we technically have "tests" even though we're not really making sure they are correct.

I ran the following command to find all the .expect files that are longer than 300 files:

fd .expect | xargs -I _ wc -l _ | awk '{ if ( $1 > 300 ) print $2,$1 }' | sort -nk2

A couple of recommendations:

  1. If a particular test does not need to be large, let's bring it under the 300 lines threshold
  2. If a particular test does need to be longer, let's figure out why
    1. If it the case that we're just tracking golden outputs (like the output of a computation) and we would never want it to change, let's add a prefix like .freeze.expect to indicate that this output will never change after being committed to the repo.
    2. If the output format is particularly verbose, let's build some tooling to compress and track only the relevant part of the output. For example, if the Queues testing infrastructure can use a tool to generate a shorter representation of the output, let's build that tool.
  3. Let's write a GH bot to enforce this limit on any new .expect files

Backend

tests/backend/verilog/memory-with-external-attribute.expect 628

Frontends

tests/frontend/ntt-pipeline/ntt-4.expect 306
tests/frontend/ntt-pipeline/ntt-4-reduced-2.expect 312
tests/frontend/relay/batch_matmul.expect 321
tests/frontend/relay/max_pool2d.expect 341
tests/frontend/relay/conv2d.expect 423
tests/frontend/relay/softmax.expect 696

YXI and Xilinx

(@nathanielnrn)

tests/xilinx/language-tutorial-iterate.expect 687
tests/xilinx/gen-verilog/language-tutorial-iterate.expect 686
tests/xilinx/gen-verilog/dot-product.expect 1426
tests/xilinx/gen-verilog/vectorized-add.expect 1426
yxi/tests/axi/read-compute-write/seq-vec-add.expect 395
yxi/tests/axi/read-compute-write/seq-mem-vec-add.expect 475
yxi/tests/axi/dynamic/dyn-vec-add.expect 772
yxi/tests/axi/dynamic/dyn-mem-vec-add.expect 848
yxi/tests/axi/read-compute-write/seq-mem-vec-add-axi-wrapped.expect 9119
yxi/tests/axi/dynamic/dyn-mem-vec-add-axi-wrapped.expect 11012

Interpreter

(@EclecticGriffin)

interp/tests/benchmarks/polybench/linear-algebra-symm.expect 318
interp/tests/benchmarks/polybench/linear-algebra-gramschmidt.expect 400
interp/tests/benchmarks/polybench/linear-algebra-2mm.expect 418
interp/tests/benchmarks/polybench/linear-algebra-syr2k.expect 422
interp/tests/benchmarks/polybench/linear-algebra-3mm.expect 576
interp/tests/benchmarks/polybench/linear-algebra-doitgen.expect 752

Queues

(@anshumanmohan)

calyx-py/test/correctness/queues/binheap/fifo.expect 60008
calyx-py/test/correctness/queues/binheap/pifo.expect 60008
calyx-py/test/correctness/queues/fifo.expect 60008
calyx-py/test/correctness/queues/pifo.expect 60008
calyx-py/test/correctness/queues/pifo_tree.expect 60008
calyx-py/test/correctness/queues/sdn.expect 60008
calyx-py/test/correctness/queues/strict_and_rr_queues/rr_queues/rr_queue_2flows.expect 60008
calyx-py/test/correctness/queues/strict_and_rr_queues/rr_queues/rr_queue_3flows.expect 60008
calyx-py/test/correctness/queues/strict_and_rr_queues/rr_queues/rr_queue_4flows.expect 60008
calyx-py/test/correctness/queues/strict_and_rr_queues/rr_queues/rr_queue_5flows.expect 60008
calyx-py/test/correctness/queues/strict_and_rr_queues/rr_queues/rr_queue_6flows.expect 60008
calyx-py/test/correctness/queues/strict_and_rr_queues/rr_queues/rr_queue_7flows.expect 60008
calyx-py/test/correctness/queues/strict_and_rr_queues/strict_queues/strict_2flows.expect 60008
calyx-py/test/correctness/queues/strict_and_rr_queues/strict_queues/strict_3flows.expect 60008
calyx-py/test/correctness/queues/strict_and_rr_queues/strict_queues/strict_4flows.expect 60008
calyx-py/test/correctness/queues/strict_and_rr_queues/strict_queues/strict_5flows.expect 60008
calyx-py/test/correctness/queues/strict_and_rr_queues/strict_queues/strict_6flows.expect 60008
calyx-py/test/correctness/queues/binheap/stable_binheap.expect 80010
calyx-py/test/correctness/queues/nwc_simple.expect 100012
calyx-py/test/correctness/queues/pcq.expect 100012
calyx-py/test/correctness/queues/pieo.expect 100012
@anshumanmohan
Copy link
Contributor

anshumanmohan commented Jul 29, 2024

Thanks for flagging this! The queue team is aware of our embarrassingly large files, and we hope to clean this up as part of #2191! Hopefully we'll massage this away in a few weeks :)

@ethanuppal
Copy link
Member

@EclecticGriffin any update on the usability of cider2 as a component interpreter programmatically?

@EclecticGriffin
Copy link
Collaborator

@EclecticGriffin any update on the usability of cider2 as a component interpreter programmatically?

I don't see what that has to do with this issue? Unless I'm missing something?

@EclecticGriffin
Copy link
Collaborator

wrt to the .expect files all of these are from the polybench tests and so I don't see those changing, meaning we can hit them with the .freeze

@ethanuppal
Copy link
Member

@EclecticGriffin any update on the usability of cider2 as a component interpreter programmatically?

I don't see what that has to do with this issue? Unless I'm missing something?

The testbench I'm writing depends on cycle-level interpretation of calyx components, and I think it eliminates the need for expect files when testing, for example, a data structure (and it's more important to test behavioral aspects of the code then external memory state after completion).

@rachitnigam
Copy link
Contributor Author

wrt to the .expect files all of these are from the polybench tests and so I don't see those changing, meaning we can hit them with the .freeze

Perfect! Let's do that @EclecticGriffin!

The testbench I'm writing depends on cycle-level interpretation of calyx components, and I think it eliminates the need for expect files when testing, for example, a data structure (and it's more important to test behavioral aspects of the code then external memory state after completion).

@ethanuppal this sounds cool and potentially useful for the queues team. However, I think it is super important to get buy-in from folks and making sure it actually serves the purpose that they want it to do.

@rachitnigam
Copy link
Contributor Author

@nathanielnrn do you have any thoughts on how we can remove the large AXI expect tests?

@ethanuppal
Copy link
Member

Update:

  • Queues team said rust test interface would be useful
  • Still having trouble using cider2 as a library

@nathanielnrn
Copy link
Contributor

I think the best solution for yxi/AXI tests relies on implementing rachitnigam/runt#20. After that I think it's enough to test only for .yxi output and functional correctness with cocotb. The problem is that currently fud2 doesn't quite handle the path finding correctly to go from calyx -> axi-wrapped-calyx -> ... cocotb. Without the correct path finding/runt functionality nothing is enforcing the axi-wrapped-calyx to stay in sync with the generator without the tests that produce big .expect files.

@anshumanmohan
Copy link
Contributor

I'd like to say a little more about the queues team wrt the rust test interface. Basically I think the rust test interface sounds neat, but I don't want to devote lots of resources to porting our testing over right now. Porting over to a rust test interface is a medium-term goal for us.

At present I have asked my team to generate the enormous .expect and .data files that we presently do, test against those, and then discard them. So it all goes down as usual, it just doesn't get checked in.

@ethanuppal
Copy link
Member

@anshumanmohan sounds like a good plan!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants