CLI to compare 2 kastore files #135

Open
hyanwong opened this issue Jul 28, 2020 · 2 comments

@hyanwong
Member

A few times now I have wanted to work out why two almost identical tree sequences have differing file sizes. The most comprehensive way to do this is to run python3 -m kastore ls -l file.kastore on both files and then compare the output. It would be lovely to have a command-line option that took two files and showed where the differences lie, perhaps in a three-column layout of file1, file2 and difference: arrays with the same name in file1 and file2 would have the difference shown, and arrays present in only one of the files would be listed as "only in file1" or "only in file2". Colouring the largest value in any row in red would be the cherry on the cake. None of this is essential, though.
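
As a rough illustration of the three-column layout described above, here is a minimal sketch in plain Python. It is not part of kastore: the function names compare_sizes and format_rows are hypothetical, and the array names and byte counts in the demo are made up. It assumes the per-array sizes have already been collected into a dict for each file (for example from the arrays returned by kastore.load, which is an assumption about how one might gather them rather than something tested here).

```python
def compare_sizes(sizes1, sizes2):
    """Compare two {array_name: size_in_bytes} mappings.

    Returns one (name, size1, size2, note) row per array name, sorted by name.
    A size is None when the array is missing from that file.
    """
    rows = []
    for name in sorted(set(sizes1) | set(sizes2)):
        a = sizes1.get(name)
        b = sizes2.get(name)
        if a is None:
            note = "only in file2"
        elif b is None:
            note = "only in file1"
        else:
            # Signed difference, file2 minus file1.
            note = f"{b - a:+d}"
        rows.append((name, a, b, note))
    return rows


def format_rows(rows):
    """Render the rows as a fixed-width text table with a header line."""
    lines = [f"{'array':<24}{'file1':>12}{'file2':>12}  difference"]
    for name, a, b, note in rows:
        col1 = "-" if a is None else str(a)
        col2 = "-" if b is None else str(b)
        lines.append(f"{name:<24}{col1:>12}{col2:>12}  {note}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sizes, for illustration only.
    file1 = {"edges/left": 800, "nodes/time": 400, "provenances/record": 1200}
    file2 = {"edges/left": 800, "nodes/time": 480, "metadata": 64}
    print(format_rows(compare_sizes(file1, file2)))
```

Highlighting the largest value in each row is left out of the sketch; it would only need to touch format_rows.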

@jeromekelleher
Member

Writing a general diff is not a trivial thing to do @hyanwong - do you have some tools from the standard library in mind for implementing this?

@grahamgower
Member

Looking at what unittest does in this case might be useful.
https://github.com/python/cpython/blob/3c4fc864ce931db90214c54d742062a80dbef7c4/Lib/unittest/case.py#L922
It essentially amounts to truncating this output:

            '\n'.join(
                difflib.ndiff(pprint.pformat(seq1).splitlines(),
                              pprint.pformat(seq2).splitlines()))
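
To make the fragment above concrete, here is a small self-contained sketch of the same idea using only the standard library. The helper name ndiff_lines is hypothetical and the two dicts are made-up stand-ins for per-array summaries of two files; width=1 is passed to pprint.pformat purely so that each dict entry lands on its own line and the diff is line-oriented.

```python
import difflib
import pprint


def ndiff_lines(obj1, obj2):
    """Pretty-print both objects and return the ndiff of the resulting lines.

    Lines starting with "- " appear only in obj1, "+ " only in obj2,
    "  " in both, and "? " lines are ndiff's intraline hints.
    """
    lines1 = pprint.pformat(obj1, width=1).splitlines()
    lines2 = pprint.pformat(obj2, width=1).splitlines()
    return list(difflib.ndiff(lines1, lines2))


if __name__ == "__main__":
    # Hypothetical summaries, for illustration only.
    summary1 = {"edges/left": 800, "nodes/time": 400}
    summary2 = {"edges/left": 800, "nodes/time": 480}
    print("\n".join(ndiff_lines(summary1, summary2)))
```

This gives a readable diff with very little code, but it compares text rather than numbers, so it would not directly produce the per-array size difference asked for in the original request.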
