Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for #191 #197

Merged
merged 40 commits into from
Jun 28, 2023
Merged

Conversation

nck-mlcnv
Copy link
Contributor

This pull request also does the following:

  • Fix the same issue with the countLines method, although it's a quite simple fix for that method.
    • For this I've added the getLineEnding method, which returns the used line separator of a file.
  • Add better tests.

@bigerl
Copy link
Member

bigerl commented Apr 11, 2023

please resolve conflicts

Copy link
Member

@bigerl bigerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I think the code that was used here before is way too over complicated and mostly beyond fixing. It needs more or less a complete rewrite of the respective methods. I suggested some directions in my comments.

@nck-mlcnv nck-mlcnv requested a review from bigerl April 14, 2023 10:01
@bigerl
Copy link
Member

bigerl commented Apr 19, 2023

Please request an review from @Clueliss when ready.

@nck-mlcnv nck-mlcnv requested a review from Clueliss April 20, 2023 17:31
Copy link

@Clueliss Clueliss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, so I'm not sure why I'm supposed to review this, because while I do know how to write Java I wouldn't know what constitutes good Java or what is available in the stdlib.

Regarding the exceptions being rethrown or not: I don't know what the goal is here so I can't comment on what's best there.

So in that spirit, this looks fine to me and the tests seem to be passing.

Copy link
Member

@bigerl bigerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One little thing that came to my mind. Thx @Clueliss . I should have further specified, but I think you just did what I amtes you to: have a look for logical errors in the code.

Copy link
Member

@bigerl bigerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This turns out to be a much bigger change than anticipated. The general structure looks good, now. I've pointed out some bugs, smaller naming improvements and an suggestion for further testing below.

One general comment that is also reflected in some single comments: Please use character-wise reading instead of line-wise reading wherever we read and seek files manually.

@nck-mlcnv nck-mlcnv linked an issue May 26, 2023 that may be closed by this pull request
@nck-mlcnv nck-mlcnv requested a review from bigerl May 26, 2023 14:20
Copy link
Member

@bigerl bigerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's getting into shape. I think we missed one case when parsing the separators. See my comments below.

@nck-mlcnv nck-mlcnv requested a review from bigerl June 20, 2023 12:50
* the separator does not necessarily need to be surrounded by new line chars

* Enforce UTF-8 when figuring out the line ending

* Implement Knuth-Morris-Pratt on byte-level to index file
@nck-mlcnv
Copy link
Contributor Author

nck-mlcnv commented Jun 22, 2023

Conflicts have to be resolved and the documentation now needs checked and maybe updated now, because the behaviour of the FileSeparatorQuerySource has been changed.

  • update documentation
  • resolve conflicts

Copy link
Member

@bigerl bigerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a short look at some of the failing tests. At least some of them seem to be caused by leading/trailing whitespaces/new line characters in tests. The expected strings do not yet have them because they were removed before by the QuerySource.

@nck-mlcnv
Copy link
Contributor Author

So apperently, when the files are uploaded to GitHub they may have different line endings than the original ones. That's why I had to add the behaviour, that the expected values for tests are constructed with the proper line endings.

@nck-mlcnv nck-mlcnv requested a review from bigerl June 27, 2023 14:29
Updated the FileUtilsTest class to create and use temporary files, instead of hard-coded file paths. This change ensures more reliable and portable tests as it no longer relies on specific files being present in the filesystem. Also, removed the '@ignore' annotation to enable the test.
Added a command to ensure that temporary files created during testing are deleted once the process has completed. This is to prevent accumulation of unnecessary test files that could potentially consume excessive memory space.
Changed the "readTest" method in FileUtilsTest.java to use a temporary file, instead of a static file, to improve test isolation and avoid possible interference between tests.
In the `FileUtilsTest` class, the logic for the parameterized test data was refactored. We moved from using hardcoded file paths to dynamically creating temporary files for testing purposes. The test content is now randomly generated UUID strings, allowing for a more diverse dataset for testing. The modified test now checks the hash of the generated string content instead of testing if the hash is simply over 0, ensuring a more accurate analysis.
Copy link
Member

@bigerl bigerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove the textfiles for the tests at some point. See #214 for that. For now, this looks fine and we can move on.

@bigerl bigerl merged commit 922b103 into develop Jun 28, 2023
2 checks passed
@bigerl bigerl deleted the fix/issue-191/readline-with-different-line-separators branch June 28, 2023 08:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fix github actions FileUtils method readLineAt doesn't properly read windows-style line endings
3 participants