[SYSTEMDS-3729] Add roll reorg operations in CP, python script test #2103
Looks very good, just missing a few things.
src/main/java/org/apache/sysds/runtime/functionobjects/RollIndex.java
src/main/java/org/apache/sysds/runtime/instructions/cp/ReorgCPInstruction.java
src/main/java/org/apache/sysds/runtime/instructions/cp/ReorgCPInstruction.java
src/main/java/org/apache/sysds/runtime/matrix/data/LibMatrixReorg.java
src/test/java/org/apache/sysds/test/functions/reorg/FullRollTest.java
src/test/java/org/apache/sysds/test/functions/reorg/FullRollTest.java
…se matrix), refine comments, initialize ReorgCPInst object variables to null.
Hello Sebastian. Additionally, the rollTest Python script works fine locally, but it runs into environment issues on GitHub Actions. I'll resolve this soon by updating javaTests.yml.
# Conflicts: # .github/workflows/javaTests.yml
The Java tests pass locally but continue to fail in GitHub Actions because the I have updated the
It is a bit unfortunate that Python is not installed in our test image (see the specific comment for more details), so it is hard for you to call Python and test the way you want to.
Alternatively, I would recommend:
- Write the test in Java, where you know what the result should be without comparing to NumPy.
- Then, in our Python API, write the test again and compare to NumPy (we have many examples of doing this in recent commits).
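The first suggestion can be sketched as follows: a test whose expected result is written out by hand, so no NumPy comparison is needed. This is plain Python for illustration only. `roll_rows` is a hypothetical helper, not part of SystemDS, and it assumes that roll shifts rows downward with wrap-around (the same convention as numpy.roll along axis 0).

```python
def roll_rows(matrix, shift):
    """Cyclically shift the rows of a list-of-lists matrix down by `shift`."""
    n = len(matrix)
    if n == 0:
        return []
    shift %= n  # normalizes negative and oversized shifts
    # Row i of the input ends up at row (i + shift) % n of the output.
    return [list(matrix[(i - shift) % n]) for i in range(n)]


# Input and expected output are both written out by hand.
X = [[1, 2],
     [3, 4],
     [5, 6]]
expected = [[5, 6],
            [1, 2],
            [3, 4]]

result = roll_rows(X, 1)
assert result == expected
assert roll_rows(X, -2) == expected  # shifting up by 2 equals shifting down by 1
assert roll_rows(X, 3) == X          # a full cycle is the identity
```

The same hand-written input and output could then be reused in the Python API test, with NumPy acting only as a second check.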
src/main/java/org/apache/sysds/runtime/instructions/cp/ReorgCPInstruction.java
src/main/java/org/apache/sysds/runtime/matrix/data/LibMatrixReorg.java
…a test, Python install script in the Dockerfile, and MTX file reader/writer.
I have created a Python API and added Python test code for the roll function. I have rolled back the code related to the MTX file reader/writer, which was separated into another PR, as well as the previously modified Java test code and Dockerfile script.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@             Coverage Diff              @@
##               main    #2103      +/-   ##
============================================
+ Coverage     70.50%   70.58%    +0.08%
- Complexity    42071    42214     +143
============================================
  Files          1441     1442       +1
  Lines        162297   162526     +229
  Branches      31626    31672      +46
============================================
+ Hits         114425   114719     +294
+ Misses        38832    38817      -15
+ Partials       9040     8990      -50
Previously, the roll function handled sparse matrix indices incorrectly, which caused a test failure. I have now fixed this.
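The kind of index handling involved can be illustrated with a minimal sketch. It is not the LibMatrixReorg code: the dict-based sparse layout and the name `roll_sparse_rows` are assumptions made for illustration, as is the convention that roll shifts rows with wrap-around. The point is that only the row index changes, while column indices and values inside each row are carried over unchanged and empty rows stay empty.

```python
def roll_sparse_rows(rows, num_rows, shift):
    """Roll a sparse matrix stored as {row_index: [(col_index, value), ...]}.

    Rows without non-zeros are simply absent from the dict.
    """
    if num_rows == 0:
        return {}
    shift %= num_rows
    rolled = {}
    for r, entries in rows.items():
        # Only the row index moves; (col, value) pairs are copied unchanged.
        rolled[(r + shift) % num_rows] = list(entries)
    return rolled


# 4x3 matrix with non-zeros at (0,1)=7, (2,0)=5, (2,2)=9; rows 1 and 3 are empty.
sparse = {0: [(1, 7.0)], 2: [(0, 5.0), (2, 9.0)]}
rolled = roll_sparse_rows(sparse, num_rows=4, shift=3)

assert rolled == {3: [(1, 7.0)], 1: [(0, 5.0), (2, 9.0)]}
# The number of non-zeros is preserved by the roll.
assert sum(len(e) for e in rolled.values()) == 3
```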
LGTM, the implementation looks solid. I would suggest adding a component test that just covers the call to LibMatrixReorg,
Thank you for continuously providing valuable feedback. As I'm encountering code coverage for the first time, I have some questions. I've summarized the current situation and requests below as I understand them. Could you please confirm if this is correct?
Yes, this sounds correct. Therefore:
Thanks.
I have added the JavaTest as mentioned earlier. I would appreciate it if you could review it. Thank you!
I fixed an issue where the roll function was not properly invoked because of an incorrect output file setting and a missing call to recomputeNonZeros in the Java test.
LGTM - @min-guk, could you please rebase this PR (there are quite a number of merge conflicts across these many commits)? I would then merge it in.
Thank you @mboehm7. I have proceeded with the rebase as requested. Once the PR is merged, I will immediately create a PR for the Spark version as well. Additionally, I have been studying how to implement it using FED, but I’m finding it a bit difficult to grasp. Could you perhaps provide a brief explanation or suggest some reference materials I could look into?
OK, this was not a Here are some papers for the federated backend:
@mboehm7 I apologize for the inconvenience. It seems I made a mistake due to my inexperience with Git and GitHub. I will make sure to study more to prevent this from happening again in the future. Also, thank you for recommending the papers. I will take the time to read and study them. |
This PR implements a new roll reorg function in CP and provides test scripts using Python.
- As with rev, dense column vectors are rolled column-wise, while the rest are rolled row-wise.
- For the roll function, baseline test scripts were written using the roll function of Python's numpy.
- For the mtx format between R and Python, I made the following modifications to the mtx file input/output code:
  - TestUtils::writeTestMatrix(): The last number in the mtx header should represent the number of non-zero elements in the matrix, but the previous input test matrix wrote the total number of elements, which caused an error in Python. (https://networkrepository.com/mtx-matrix-market-format.html)
  - TestUtils::readRMatrixFromFS(): While R writes the mtx header in 2 lines, Python's scipy writes it in 3 lines, including an extra line with just a %. I modified the code to ignore this extra line.
Please feel free to share any feedback or suggestions regarding the implementation direction.
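The two mtx header details described above can be sketched with a small writer and reader. This is a minimal illustration in plain Python, not the TestUtils code: `write_mtx` and `read_mtx` are hypothetical names, and only the general real coordinate format is handled.

```python
import io


def write_mtx(matrix, out):
    """Write a dense list-of-lists matrix in Matrix Market coordinate format."""
    entries = [(i + 1, j + 1, v)  # mtx indices are 1-based
               for i, row in enumerate(matrix)
               for j, v in enumerate(row) if v != 0]
    rows = len(matrix)
    cols = len(matrix[0]) if matrix else 0
    out.write("%%MatrixMarket matrix coordinate real general\n")
    # The third number is the count of non-zero entries, not rows * cols.
    out.write(f"{rows} {cols} {len(entries)}\n")
    for i, j, v in entries:
        out.write(f"{i} {j} {v}\n")


def read_mtx(src):
    """Read the format above, skipping every line that starts with '%'."""
    lines = (line.strip() for line in src)
    # Dropping all '%' lines also covers a bare '%' line after the banner.
    data = [line for line in lines if line and not line.startswith("%")]
    rows, cols, nnz = (int(t) for t in data[0].split())
    matrix = [[0.0] * cols for _ in range(rows)]
    for line in data[1:1 + nnz]:
        i, j, v = line.split()
        matrix[int(i) - 1][int(j) - 1] = float(v)
    return matrix


buf = io.StringIO()
write_mtx([[0.0, 2.5], [0.0, 0.0], [1.0, 0.0]], buf)
header = buf.getvalue().splitlines()[1]
assert header == "3 2 2"  # 3 rows, 2 columns, 2 non-zeros

# A file with an extra bare '%' line parses the same way.
text = "%%MatrixMarket matrix coordinate real general\n%\n3 2 2\n1 2 2.5\n3 1 1.0\n"
assert read_mtx(io.StringIO(text)) == [[0.0, 2.5], [0.0, 0.0], [1.0, 0.0]]
```

Filtering on the leading % rather than counting header lines keeps the reader independent of how many comment lines a given writer emits.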