fix: for issue-144 page hanging on large files #172
base: master
pls merge this, most wanted for me
This code should not be used. It is faster, but it is incorrect. Because the new approach "chunks" the diff processing into sets of 100, it gives incorrect results on inputs over 100 lines long. For example, you can use the following inputs to simulate a pair of input files 302 lines long where the only difference is that a single line has been removed from the new value.
When I try this in the original version of the code, it works fine and reports only the line containing "50" as removed, with no other changes. When I try it with the suggested change above, it correctly reports the line containing "50" as removed on the left, but then also reports the lines containing "99", "149", "199", "249", and "299" as removed on the left and added on the right. This problem gets worse as the difference in line counts between the files grows. Once the difference exceeds 100 lines on a given side, every line will be reported as a difference, even when all the lines exist unchanged elsewhere in each file.
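To make the failure mode concrete, here is a minimal sketch that does not call the diff library at all. It assumes 302 unique lines labelled "0" to "301", a chunk size of 100, and a plain set comparison per chunk pair as a stand-in for `diffLines`. The helper names `chunk` and `oneSided` are hypothetical, and the exact lines flagged will differ from the ones listed above, since that depends on how the real code tokenizes its input.

```javascript
// Hypothetical helper: cut an array into consecutive chunks of `size` items.
function chunk(lines, size) {
  const out = [];
  for (let i = 0; i < lines.length; i += size) {
    out.push(lines.slice(i, i + size));
  }
  return out;
}

// Hypothetical stand-in for a line diff: lines present on one side only.
// Assumes every line is unique, which holds for the inputs below.
function oneSided(oldPart, newPart) {
  const oldSet = new Set(oldPart);
  const newSet = new Set(newPart);
  return {
    removed: oldPart.filter((line) => !newSet.has(line)),
    added: newPart.filter((line) => !oldSet.has(line)),
  };
}

// Lines "0" .. "301"; the new version drops only the line "50".
const oldLines = Array.from({ length: 302 }, (_, i) => String(i));
const newLines = oldLines.filter((line) => line !== "50");

// Comparing the whole inputs at once finds the single real change.
const wholeFile = oneSided(oldLines, newLines);
console.log(wholeFile); // { removed: [ '50' ], added: [] }

// Comparing chunk by chunk reports boundary lines as changed on both sides.
const oldChunks = chunk(oldLines, 100);
const newChunks = chunk(newLines, 100);
const report = oldChunks.map((oc, i) => oneSided(oc, newChunks[i] || []));
console.log(report);
// [ { removed: [ '50' ],  added: [ '100' ] },
//   { removed: [ '100' ], added: [ '200' ] },
//   { removed: [ '200' ], added: [ '300' ] },
//   { removed: [ '300' ], added: [] } ]
```

In this sketch the lines "100", "200", and "300" are unchanged, yet each shows up as added in one chunk and removed in the next, because the single deletion shifted them across a chunk boundary.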
let oa = oldArr.splice(0, num);
let na = newArr.splice(0, num);

while (oa.length > 0 || na.length > 0) {
  const o = oa.join("");
  const n = na.join("");
  diffArray = diffArray.concat(
    diff.diffLines(o, n, {
      newlineIsToken: true,
      ignoreWhitespace: false,
      ignoreCase: false,
    })
  );
  oa = oldArr.splice(0, num);
  na = newArr.splice(0, num);
}
This approach causes the result to be incorrect. Because you're chunking the call to diffLines into units of 100 lines, additions or deletions will cause equal lines to be pushed into different chunks, where the underlying diff algorithm can no longer match them up.
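One rough way to quantify this is to count how many lines shared by both inputs end up in different chunk indices. The sketch below is only an illustration under stated assumptions: `countSplitLines` is a hypothetical helper, the lines are assumed unique, and the chunk size of 100 is taken from the comment above.

```javascript
// Hypothetical helper: count lines present in both inputs whose chunk index
// differs between the old and new side when each is cut into `size`-line chunks.
// Assumes every line is unique so a Map from line text to index is unambiguous.
function countSplitLines(oldLines, newLines, size) {
  const newIndex = new Map(newLines.map((line, i) => [line, i]));
  let split = 0;
  oldLines.forEach((line, i) => {
    const j = newIndex.get(line);
    if (j !== undefined && Math.floor(i / size) !== Math.floor(j / size)) {
      split += 1;
    }
  });
  return split;
}

const lines = Array.from({ length: 302 }, (_, i) => String(i));

// One deleted line: only the lines sitting on a chunk boundary are split.
const dropOne = lines.filter((line) => line !== "50");
const splitAfterOne = countSplitLines(lines, dropOne, 100);
console.log(splitAfterOne); // 3

// 100 deleted lines: every surviving line lands in a different chunk.
const dropHundred = lines.slice(100);
const splitAfterHundred = countSplitLines(lines, dropHundred, 100);
console.log(splitAfterHundred); // 202
```

A shift of a full chunk size leaves no shared line in a matching chunk pair, so a per-chunk diff has nothing left to match up.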
fix for #144