fix: for issue-144 page hanging on large files #172

TristanSpeakEasy · 2022-09-30T10:39:06Z

fix for #144

kamaldlk · 2022-10-21T11:15:29Z

Please merge this; I really need this fix.

erothmayer · 2022-10-23T14:55:19Z

This code should not be used. It is faster, but it is incorrect.

Because the new approach chunks the diff processing into sets of 100, it gives incorrect results on inputs over 100 lines long. For example, you can use the following inputs to simulate a pair of input files where the old value is 302 lines long and the only difference is that a single line has been removed from the new value.

const range = (start, end) => [...Array(end).keys()].slice(start, end);
const oldValue = `{\n${range(0, 300).join('\n')}\n}`;
const newValue = `{\n${[...range(0, 50), ...range(51, 300)].join('\n')}\n}`;

When I try this with the original version of the code, it works fine: it reports only the line containing "50" as removed and no other changes. When I try it with the suggested change above, it correctly reports the line containing "50" as removed on the left, but then also reports the lines containing "99", "149", "199", "249", and "299" as removed on the left and added on the right.

This problem gets worse as the difference in line counts between the files grows. Once the difference exceeds 100 lines on a given side, every line will be reported as a difference, even when all of those lines exist unchanged elsewhere in the other file.
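As a rough illustration of that last point, the sketch below counts how many old lines can still find an identical line inside the chunk they are paired with. It does not use the diff library at all; `chunk` and `matchableLines` are hypothetical helpers, the chunk size is assumed to be 100 lines, and every line is assumed to be unique so that a set lookup is enough.

```javascript
// Split an array into consecutive chunks of `size` elements (non-mutating).
const chunk = (arr, size) => {
  const out = [];
  for (let i = 0; i < arr.length; i += size) out.push(arr.slice(i, i + size));
  return out;
};

// For each pair of aligned chunks, count the old lines that also occur in the
// new chunk. A per-chunk diff can never match more lines than this.
// Assumes all lines are unique (true for the numbered test input below).
function matchableLines(oldLines, newLines, size) {
  const oldChunks = chunk(oldLines, size);
  const newChunks = chunk(newLines, size);
  let total = 0;
  oldChunks.forEach((oldChunk, k) => {
    const seen = new Set(newChunks[k] || []);
    total += oldChunk.filter((line) => seen.has(line)).length;
  });
  return total;
}

// 300 unique lines; the "new" file drops the first `removedCount` of them.
const lines = Array.from({ length: 300 }, (_, i) => String(i));
for (const removedCount of [1, 50, 100]) {
  const shorter = lines.slice(removedCount);
  console.log(removedCount, matchableLines(lines, shorter, 100));
}
// prints: 1 297, then 50 150, then 100 0
```

With one line removed, 297 of the 299 surviving lines can still be matched; with 100 lines removed, no chunk shares a single line with its counterpart, so a per-chunk diff has nothing left to report as equal.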

erothmayer · 2022-10-23T14:59:48Z

src/compute-lines.ts

+		let oa = oldArr.splice(0, num);
+		let na = newArr.splice(0, num);
+
+		while (oa.length > 0 || na.length > 0) {
+			const o = oa.join("");
+			const n = na.join("");
+			diffArray = diffArray.concat(
+				diff.diffLines(o, n, {
+					newlineIsToken: true,
+					ignoreWhitespace: false,
+					ignoreCase: false,
+				})
+			);
+			oa = oldArr.splice(0, num);
+			na = newArr.splice(0, num);
+		}


This approach causes the result to be incorrect. Because you're chunking the call to diffLines into units of 100 lines, additions or deletions will push equal lines into different chunks, where the underlying diff algorithm can no longer match them up.
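The effect can be reproduced without the diff library. The sketch below is not the PR's code: `lineDiff` is a small hypothetical LCS-based stand-in for `diffLines`, and `chunkedLineDiff` mimics the splice loop with a chunk size assumed to be 100 array elements, one element per line. Which lines get flagged depends on where the chunk boundaries fall, so the exact numbers can differ from those seen with the real change.

```javascript
// Minimal LCS-based line diff (hypothetical helper, not the `diff` package).
// Returns the lines only in `a` (removed) and the lines only in `b` (added).
function lineDiff(a, b) {
  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs = Array.from({ length: a.length + 1 }, () => new Array(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  const removed = [];
  const added = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) { i++; j++; }
    else if (lcs[i + 1][j] >= lcs[i][j + 1]) removed.push(a[i++]);
    else added.push(b[j++]);
  }
  while (i < a.length) removed.push(a[i++]);
  while (j < b.length) added.push(b[j++]);
  return { removed, added };
}

// Diff fixed-size chunks independently and concatenate the results,
// mirroring the structure of the loop under review.
function chunkedLineDiff(a, b, size) {
  const removed = [];
  const added = [];
  for (let start = 0; start < Math.max(a.length, b.length); start += size) {
    const part = lineDiff(a.slice(start, start + size), b.slice(start, start + size));
    removed.push(...part.removed);
    added.push(...part.added);
  }
  return { removed, added };
}

// Same inputs as in the comment above, already split into lines.
const range = (start, end) => [...Array(end).keys()].slice(start, end);
const oldLines = ["{", ...range(0, 300).map(String), "}"];
const newLines = ["{", ...range(0, 50).map(String), ...range(51, 300).map(String), "}"];

const whole = lineDiff(oldLines, newLines);
const chunked = chunkedLineDiff(oldLines, newLines, 100);
console.log(whole);   // { removed: [ '50' ], added: [] }
console.log(chunked); // removed: 50, 99, 199, 299; added: 99, 199, 299
```

Diffing the whole input reports only "50". Diffing per chunk reports the same removal, then a spurious removed/added pair at every later chunk boundary, because the one-line shift leaves the last line of each new chunk without a partner in the old chunk it is compared against.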

fix: for issue-144 page hanging on large files

adbce86

TristanSpeakEasy mentioned this pull request Sep 30, 2022

big data cause Page jammed #144

Open

kamaldlk approved these changes Oct 21, 2022

erothmayer reviewed Oct 23, 2022
