Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix textLinesMutator when dealing with insertions/deletions at the end of pad #5253

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 18 additions & 7 deletions src/static/js/Changeset.js
Original file line number Diff line number Diff line change
Expand Up @@ -908,20 +908,24 @@ class TextLinesMutator {
let removed = '';
if (this._isCurLineInSplice()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if this._curCol is greater than or equal to the current line's length? In that case, shouldn't we set this._curCol to 0 and increment this._curLine before this check? Or is that case not possible due to invariants?

If invariants make that case impossible, I think it would be worthwhile to add an assert statement.

Copy link
Member Author

@webzwo0i webzwo0i Dec 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skip:

  • if L present, curCol is set to 0 in all branches (in skipLines)
  • if L not present, curCol will be +=N (number of skipped chars) [0]

insert:

  • if L present, curCol is set to 0 after applying L, because of the "multiline end with newline" constraint
  • if L is present, but curLine was not in splice, curCol isn't touched
  • if L is not present and we are processing the last line of lines array, curCol is increased by N chars [1]
  • if L is not present and we are not processing the last line, curCol is increased by N chars [2]

remove:

  • no changes to curCol

In cases [1] and [2] curCol calculation is safe, as a text string is given to the respective function (the term that's added to curCol is derived from from text).

Also after calling close or skipping over multiple L that don't have attributes, the splice is not active anymore and it's possible that curCol is > 0. So there are four possible "start" states:

  • splice not active, curCol is 0 [X1]
  • splice not active, curCol > 0 [X2]
  • splice active, curCol is 0 [X3]
  • splice active, curCol > 0 [X4]

So for all 4 "start" states, I'll add coverage for [0] exceeding the number of chars in the line (ie a text of "test\n" and skipping over 6 chars so we end in the middle of the next line), followed by a call to removeLines.

If we want to assert I think it's better to do it where curCol is changed.

if (this._curCol === 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about line markers? If the first character is a line marker and this._curCol === 1, shouldn't it be treated the same as if this._curCol === 0? Or is that not possible due to invariants? Unfortunately, I don't think it's possible to add an assert here because TextLinesMutator doesn't have access to the attributes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I don't understand, but the ops that are applied to lines shouldn't care about lineMarker's special meaning in the editor. ie lineMarker is special in the editor, because it's not rendered and it changes the "meaning" of the whole line, but when it comes to Changeset.js whatever attributes apply to the lineMarker are only applied to this single, "real" character. So we should not treat this._curCol === 1 as this._curCol === 0

// First line to be removed is in splice.
removed = this._curSplice[this._curSplice.length - 1];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line makes the following assumptions:

  • this._curSplice.length > 2
  • this._curLine === this._curSplice[0] + (this._curSplice.length - 3) (in other words, the current line is the last entry in this._curSplice)

It's not immediately obvious to me that either assumption is guaranteed to be true. Examples:

  • If this._curLine < this._curSplice[0] then this._isCurLineInSplice() will return true even if this._curSplice.length === 2.
  • If this._curLine === this._curSplice[0] and this._curSplice.length === 4, then the first removed line should be this._curSplice[this._curSplice.length - 2].

Maybe those assumptions are guaranteed to be true due to (undocumented?) invariants. If so, it would be good to assert them here.

this._curSplice.length--;
// Next lines to be removed are not in splice.
removed += nextKLinesText(L - 1);
this._curSplice[1] += L - 1;
} else {
removed = nextKLinesText(L - 1);
this._curSplice[1] += L - 1;
const sline = this._curSplice.length - 1;
removed = this._curSplice[sline].substring(this._curCol) + removed;
this._curSplice[sline] = this._curSplice[sline].substring(0, this._curCol) +
this._linesGet(this._curSplice[0] + this._curSplice[1]);
this._curSplice[1] += 1;
// Is there a line left?
const remaining = this._linesGet(this._curSplice[0] + this._curSplice[1]) || '';
this._curSplice[sline] = this._curSplice[sline].substring(0, this._curCol) + remaining;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If remaining is the empty string, then this._curSplice[sline] will not end with a newline. Don't all entries in this._lines need to end with a newline character?

diff --git a/src/static/js/Changeset.js b/src/static/js/Changeset.js
index 5e0bc34fc..defc08852 100644
--- a/src/static/js/Changeset.js
+++ b/src/static/js/Changeset.js
@@ -918,11 +918,23 @@ class TextLinesMutator {
         removed = nextKLinesText(L - 1);
         this._curSplice[1] += L - 1;
         const sline = this._curSplice.length - 1;
-        removed = this._curSplice[sline].substring(this._curCol) + removed;
-        // Is there a line left?
-        const remaining = this._linesGet(this._curSplice[0] + this._curSplice[1]) || '';
-        this._curSplice[sline] = this._curSplice[sline].substring(0, this._curCol) + remaining;
-        this._curSplice[1] += remaining ? 1 : 0;
+        removed = this._curSplice[sline].slice(this._curCol) + removed;
+        this._curSplice[sline] = this._curSplice[sline].slice(0, this._curCol);
+        // The current line's newline was deleted, which means that the line after the splice (if
+        // there is one) should be joined with what remains of the current line.
+        const n = this._curSplice[0] + this._curSplice[1];
+        if (n < this._linesLength()) {
+          // There is a line that can be joined with the current line.
+          // TODO: What if the joined line starts with a line marker?
+          // TODO: What if the joined line has line attributes that differ from the current line's
+          // line attributes?
+          this._curSplice[sline] += this._linesGet(n);
+          this._curSplice[1] += 1;
+        } else {
+          // There is no line that can be joined with the current line. The current line's newline
+          // was removed, but all lines must end with a newline so add it back.
+          this._curSplice[sline] += '\n';
+        }
       }
     } else {
       // Nothing that is removed is in splice. Implies curCol === 0.

this._curSplice[1] += remaining ? 1 : 0;
}
} else {
// Nothing that is removed is in splice. Implies curCol === 0.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to assert that this._curCol === 0.

removed = nextKLinesText(L);
this._curSplice[1] += L;
}
Expand Down Expand Up @@ -973,17 +977,24 @@ class TextLinesMutator {
this._curLine += newLines.length;
// insert the remaining chars from the "old" line (e.g. the line we were in
// when we started to insert new lines)
this._curSplice.push(theLine.substring(lineCol));
this._curCol = 0; // TODO(doc) why is this not set to the length of last line?
const remaining = theLine.substring(lineCol);
if (remaining !== '') this._curSplice.push(remaining);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If remaining === '', then that means that this._curCol points just past the line's newline, correct? If so, I think the better fix would be to set this._curCol to 0 and increment this._curLine before doing anything else.

Also, this._curCol pointing just past the newline feels like a violation of an invariant. Maybe instead of checking if remaining !== '', it should assert that and we should fix whatever logic is causing this._curCol to point past the newline.

Copy link
Member Author

@webzwo0i webzwo0i Dec 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is not necessarly a newline at every line. Etherpad uses it with lines that always end with newlines (e.g. rep.lines), except for mutateAttributionLines, but this has other logic.
Also take a look at the test case, this is valid imo.
Update: I do think now, that it should end with a newline now. So when it comes to "valid" ignore the newline for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do understand what you wrote in the first paragraph, though. If there is a newline, we would never go "beyond" that. Let me test

this._curCol = 0;
} else {
this._curSplice.push(...newLines);
this._curLine += newLines.length;
}
} else {
} else if (!this.hasMore()) {
// There are no additional lines. Although the line is put into splice, curLine is not
// increased because there may be more chars in the line (newline is not reached).
// increased because there may be more chars in the line (newline is not reached). We are
// inserting at the end of lines. curCol is 0 as curLine is not in splice.
this._curSplice.push(text);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is right. If my understanding is correct, each entry in this._lines represents a line of the document and must end with a newline character. By pushing to this._curSplice, this code creates a new line at the end of the document, one that line doesn't end with a newline. This code should instead insert the text after column this._curCol (which should be before the line's newline) in line this._curLine.

I think the actual bug is inside this._putCurLineInSplice(): That method should not attempt to read past the end of this._lines or delete lines that don't exist in this._lines. Something like this (untested):

diff --git a/src/static/js/Changeset.js b/src/static/js/Changeset.js
index 5e0bc34fc..930a5e187 100644
--- a/src/static/js/Changeset.js
+++ b/src/static/js/Changeset.js
@@ -823,11 +823,18 @@ class TextLinesMutator {
    * @returns {number} the index of the added line in curSplice
    */
   _putCurLineInSplice() {
-    if (!this._isCurLineInSplice()) {
-      this._curSplice.push(this._linesGet(this._curSplice[0] + this._curSplice[1]));
-      this._curSplice[1]++;
+    assert(this._inSplice, 'not in splice');
+    assert(this._curLine >= this._curSplice[0], 'curLine is before splice start');
+    while (this._curSplice[0] + this._curSplice.length - 2 < this._curLine + 1) {
+      const n = this._curSplice[0] + this._curSplice[1];
+      if (n < this._linesLength()) {
+        this._curSplice.push(this._linesGet(n));
+        this._curSplice[1]++;
+      } else {
+        // No more lines available to add to the splice. Create a new line from scratch.
+        this._curSplice.push('\n');
+      }
     }
-    // TODO should be the same as this._curSplice.length - 1
     return 2 + this._curLine - this._curSplice[0];
   }
 

Copy link
Member Author

@webzwo0i webzwo0i Dec 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing to _putCurLineInSplice. Will take a look.

Let's make sure newlines are present at all times

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should happen if an op removes the last line's newline but that op is immediately followed by another op that restores the newline? For example:

const text = 'x\n';
const lines = [text];
const cs = '=1|1-1|1+1$\n'; // Keep a char, then remove a newline, then add a newline.
Changeset.mutateTextLines(cs, lines);
assert.equal(lines.join(''), 'x\n');
assert.equal(Changeset.applyToText(cs, text), 'x\n');

Similarly, what if an op adds text to the end of the document without a trailing newline, then the next op adds the newline:

const text = 'x\n';
const lines = [text];
const cs = '|1=2+1|1+1$y\n'; // Keep existing text, add a char, then add a newline.
Changeset.mutateTextLines(cs, lines);
assert.equal(lines.join(''), 'x\ny\n');
assert.equal(Changeset.applyToText(cs, text), 'x\ny\n');

If we force newlines to always be present in each entry in this._lines then we're going to end up inserting extra newlines when we don't want them. I think the way to fix this is to strip the newlines from this._lines and add them back before returning.

this._curCol += text.length;
} else {
// insert text after curCol
const sline = this._putCurLineInSplice();
if (!this._curSplice[sline]) {
// TODO should never happen now
const err = new Error(
'curSplice[sline] not populated, actual curSplice contents is ' +
`${JSON.stringify(this._curSplice)}. Possibly related to ` +
Expand Down
44 changes: 44 additions & 0 deletions src/tests/frontend/specs/easysync-assembler.js
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,50 @@ describe('easysync-assembler', function () {
expect(assem.toString()).to.equal('-k');
});

it('smartOpAssembler append - op with text', async function () {
const assem = Changeset.smartOpAssembler();
const pool = poolOrArray([
'attr1,1',
'attr2,2',
'attr3,3',
'attr4,4',
'attr5,5',
]);

padutils.warnDeprecated.disabledForTestingOnly = true;
try {
assem.appendOpWithText('-', 'test', '*3*4*5', pool);
assem.appendOpWithText('-', 'test', '*3*4*5', pool);
assem.appendOpWithText('-', 'test', '*1*4*5', pool);
} finally {
delete padutils.warnDeprecated.disabledForTestingOnly;
}
assem.endDocument();
expect(assem.toString()).to.equal('*3*4*5-8*1*4*5-4');
webzwo0i marked this conversation as resolved.
Show resolved Hide resolved
});

it('smartOpAssembler append - op with multiline text', async function () {
const assem = Changeset.smartOpAssembler();
const pool = poolOrArray([
'attr1,1',
'attr2,2',
'attr3,3',
'attr4,4',
'attr5,5',
]);

padutils.warnDeprecated.disabledForTestingOnly = true;
try {
assem.appendOpWithText('-', 'test\ntest', '*3*4*5', pool);
assem.appendOpWithText('-', '\ntest\n', '*3*4*5', pool);
assem.appendOpWithText('-', '\ntest', '*1*4*5', pool);
} finally {
delete padutils.warnDeprecated.disabledForTestingOnly;
}
assem.endDocument();
expect(assem.toString()).to.equal('*3*4*5|3-f*1*4*5|1-1*1*4*5-4');
webzwo0i marked this conversation as resolved.
Show resolved Hide resolved
});

it('smartOpAssembler append + op with text', async function () {
const assem = Changeset.smartOpAssembler();
const pool = poolOrArray([
Expand Down
48 changes: 48 additions & 0 deletions src/tests/frontend/specs/easysync-mutations.js
Original file line number Diff line number Diff line change
Expand Up @@ -137,6 +137,54 @@ describe('easysync-mutations', function () {
['skip', 1, 1, true],
], ['banana\n', 'cabbage\n', 'duffle\n']);

runMutationTest(8, ['\n', 'fun\n', '\n'], [
['remove', 1, 1, '\n'],
['skip', 3, 0, false],
['remove', 2, 2, '\n\n'],
['insert', 'c'],
], ['func']);

runMutationTest(9, ['\n', 'fun\n', '\n'], [
['remove', 1, 1, '\n'],
['skip', 3, 0, false],
['remove', 2, 2, '\n\n'],
['insert', 'c'],
['insert', 'a\n', 1],
['insert', 'c'],
], ['funca\n', 'c']);

runMutationTest(10, ['\n', 'fun\n', '\n'], [
['remove', 1, 1, '\n'],
['skip', 2, 0, false],
['remove', 3, 2, 'n\n\n'],
['insert', 'z'],
], ['fuz']);

// #2836, #5214, #3560 regressions
runMutationTest(11, ['\n'], [
['remove', 1, 1, '\n'],
['insert', 'c', 0],
], ['c']);

runMutationTest(12, ['\n'], [
['remove', 1, 1, '\n'],
['insert', 'a\n', 1],
['insert', 'c'],
], ['a\n', 'c']);

runMutationTest(13, ['\n', 'fun\n', '\n'], [
['remove', 1, 1, '\n'],
['skip', 4, 1, false],
['remove', 1, 1, '\n'],
['insert', 'c'],
], ['fun\n', 'c']);

runMutationTest(14, ['\n'], [
['remove', 1, 1, '\n'],
['insert', 'a'],
['insert', 'c\n', 1],
], ['ac\n']);

it('mutatorHasMore', async function () {
const lines = ['1\n', '2\n', '3\n', '4\n'];
let mu;
Expand Down
Loading