-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix textLinesMutator when dealing with insertions/deletions at the end of pad #5253
base: develop
Are you sure you want to change the base?
Conversation
As a side note: The commits that fixed #2802 should be re-evaluated. |
3220691
to
bb99568
Compare
This should be ready now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The "chore Changeset.js: use template strings" commit missed one: Line 1383 (in
outputMutOp()
). - The "easysync tests: move to separate files" commit is difficult to review, mostly because stuff was reordered. Please split this commit into two: One that splits
easysync.js
into multiple files but preserves relative order, and one that reorders stuff. - The "easysync tests: move to separate files" commit introduces a
//TODO: why isn't this |2+2*x|1+4 ?
comment tosrc/tests/frontend/specs/easysync-assembler.js
. The comment doesn't belong in that commit. - Please keep the first paragraph of each commit message short—ideally less than 50 chars but definitely less than 70. If you need to say more, put it in a separate paragraph (blank line between the paragraphs).
- Please add the regression tests in the commit that fixes the bug.
Thanks for the input @rhansen! Will clean up. Regarding putting the test into the same commit as the fix, I thought we also put the test in a separate commit (before all other commits) as stated somewhere in CONTRIBUTING.md. But in this case, as it's fixing multiple bugs in one PR, I'll definitely include them into the correct commit. |
bb99568
to
c579c86
Compare
Rebased onto latest |
It's OK to add an
|
e635a91
to
f4dd9cd
Compare
@webzwo0i I rebased onto latest |
76197a1
to
148d0a2
Compare
148d0a2
to
fda022d
Compare
fda022d
to
21a8912
Compare
Thanks! I added the tests to the appropriate commits, Rebased onto latest develop |
96a398b
to
6d96eed
Compare
@@ -903,20 +907,24 @@ class TextLinesMutator { | |||
let removed = ''; | |||
if (this._isCurLineInSplice()) { | |||
if (this._curCol === 0) { | |||
// First line to be removed is in splice. | |||
removed = this._curSplice[this._curSplice.length - 1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line makes the following assumptions:
this._curSplice.length > 2
this._curLine === this._curSplice[0] + (this._curSplice.length - 3)
(in other words, the current line is the last entry inthis._curSplice
)
It's not immediately obvious to me that either assumption is guaranteed to be true. Examples:
- If
this._curLine < this._curSplice[0]
thenthis._isCurLineInSplice()
will return true even ifthis._curSplice.length === 2
. - If
this._curLine === this._curSplice[0]
andthis._curSplice.length === 4
, then the first removed line should bethis._curSplice[this._curSplice.length - 2]
.
Maybe those assumptions are guaranteed to be true due to (undocumented?) invariants. If so, it would be good to assert them here.
} | ||
} else { | ||
// Nothing that is removed is in splice. Implies curCol === 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to assert that this._curCol === 0
.
@@ -903,20 +907,24 @@ class TextLinesMutator { | |||
let removed = ''; | |||
if (this._isCurLineInSplice()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if this._curCol
is greater than or equal to the current line's length? In that case, shouldn't we set this._curCol
to 0 and increment this._curLine
before this check? Or is that case not possible due to invariants?
If invariants make that case impossible, I think it would be worthwhile to add an assert statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
skip:
- if L present, curCol is set to 0 in all branches (in skipLines)
- if L not present, curCol will be +=N (number of skipped chars) [0]
insert:
- if L present, curCol is set to 0 after applying L, because of the "multiline end with newline" constraint
- if L is present, but curLine was not in splice, curCol isn't touched
- if L is not present and we are processing the last line of lines array, curCol is increased by N chars [1]
- if L is not present and we are not processing the last line, curCol is increased by N chars [2]
remove:
- no changes to curCol
In cases [1] and [2] curCol calculation is safe, as a text string is given to the respective function (the term that's added to curCol is derived from from text).
Also after calling close
or skipping over multiple L that don't have attributes, the splice is not active anymore and it's possible that curCol is > 0. So there are four possible "start" states:
- splice not active, curCol is 0 [X1]
- splice not active, curCol > 0 [X2]
- splice active, curCol is 0 [X3]
- splice active, curCol > 0 [X4]
So for all 4 "start" states, I'll add coverage for [0] exceeding the number of chars in the line (ie a text of "test\n" and skipping over 6 chars so we end in the middle of the next line), followed by a call to removeLines.
If we want to assert I think it's better to do it where curCol is changed.
@@ -903,20 +907,24 @@ class TextLinesMutator { | |||
let removed = ''; | |||
if (this._isCurLineInSplice()) { | |||
if (this._curCol === 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about line markers? If the first character is a line marker and this._curCol === 1
, shouldn't it be treated the same as if this._curCol === 0
? Or is that not possible due to invariants? Unfortunately, I don't think it's possible to add an assert here because TextLinesMutator doesn't have access to the attributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I don't understand, but the ops that are applied to lines shouldn't care about lineMarker's special meaning in the editor. ie lineMarker is special in the editor, because it's not rendered and it changes the "meaning" of the whole line, but when it comes to Changeset.js whatever attributes apply to the lineMarker are only applied to this single, "real" character. So we should not treat this._curCol === 1
as this._curCol === 0
this._curSplice[1] += 1; | ||
// Is there a line left? | ||
const remaining = this._linesGet(this._curSplice[0] + this._curSplice[1]) || ''; | ||
this._curSplice[sline] = this._curSplice[sline].substring(0, this._curCol) + remaining; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If remaining
is the empty string, then this._curSplice[sline]
will not end with a newline. Don't all entries in this._lines
need to end with a newline character?
diff --git a/src/static/js/Changeset.js b/src/static/js/Changeset.js
index 5e0bc34fc..defc08852 100644
--- a/src/static/js/Changeset.js
+++ b/src/static/js/Changeset.js
@@ -918,11 +918,23 @@ class TextLinesMutator {
removed = nextKLinesText(L - 1);
this._curSplice[1] += L - 1;
const sline = this._curSplice.length - 1;
- removed = this._curSplice[sline].substring(this._curCol) + removed;
- // Is there a line left?
- const remaining = this._linesGet(this._curSplice[0] + this._curSplice[1]) || '';
- this._curSplice[sline] = this._curSplice[sline].substring(0, this._curCol) + remaining;
- this._curSplice[1] += remaining ? 1 : 0;
+ removed = this._curSplice[sline].slice(this._curCol) + removed;
+ this._curSplice[sline] = this._curSplice[sline].slice(0, this._curCol);
+ // The current line's newline was deleted, which means that the line after the splice (if
+ // there is one) should be joined with what remains of the current line.
+ const n = this._curSplice[0] + this._curSplice[1];
+ if (n < this._linesLength()) {
+ // There is a line that can be joined with the current line.
+ // TODO: What if the joined line starts with a line marker?
+ // TODO: What if the joined line has line attributes that differ from the current line's
+ // line attributes?
+ this._curSplice[sline] += this._linesGet(n);
+ this._curSplice[1] += 1;
+ } else {
+ // There is no line that can be joined with the current line. The current line's newline
+ // was removed, but all lines must end with a newline so add it back.
+ this._curSplice[sline] += '\n';
+ }
}
} else {
// Nothing that is removed is in splice. Implies curCol === 0.
// There are no additional lines. Although the line is put into splice, curLine is not | ||
// increased because there may be more chars in the line (newline is not reached). We are | ||
// inserting at the end of lines. curCol is 0 as curLine is not in splice. | ||
this._curSplice.push(text); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is right. If my understanding is correct, each entry in this._lines
represents a line of the document and must end with a newline character. By pushing to this._curSplice
, this code creates a new line at the end of the document, one that line doesn't end with a newline. This code should instead insert the text after column this._curCol
(which should be before the line's newline) in line this._curLine
.
I think the actual bug is inside this._putCurLineInSplice()
: That method should not attempt to read past the end of this._lines
or delete lines that don't exist in this._lines
. Something like this (untested):
diff --git a/src/static/js/Changeset.js b/src/static/js/Changeset.js
index 5e0bc34fc..930a5e187 100644
--- a/src/static/js/Changeset.js
+++ b/src/static/js/Changeset.js
@@ -823,11 +823,18 @@ class TextLinesMutator {
* @returns {number} the index of the added line in curSplice
*/
_putCurLineInSplice() {
- if (!this._isCurLineInSplice()) {
- this._curSplice.push(this._linesGet(this._curSplice[0] + this._curSplice[1]));
- this._curSplice[1]++;
+ assert(this._inSplice, 'not in splice');
+ assert(this._curLine >= this._curSplice[0], 'curLine is before splice start');
+ while (this._curSplice[0] + this._curSplice.length - 2 < this._curLine + 1) {
+ const n = this._curSplice[0] + this._curSplice[1];
+ if (n < this._linesLength()) {
+ this._curSplice.push(this._linesGet(n));
+ this._curSplice[1]++;
+ } else {
+ // No more lines available to add to the splice. Create a new line from scratch.
+ this._curSplice.push('\n');
+ }
}
- // TODO should be the same as this._curSplice.length - 1
return 2 + this._curLine - this._curSplice[0];
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing to _putCurLineInSplice
. Will take a look.
Let's make sure newlines are present at all times
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should happen if an op removes the last line's newline but that op is immediately followed by another op that restores the newline? For example:
const text = 'x\n';
const lines = [text];
const cs = '=1|1-1|1+1$\n'; // Keep a char, then remove a newline, then add a newline.
Changeset.mutateTextLines(cs, lines);
assert.equal(lines.join(''), 'x\n');
assert.equal(Changeset.applyToText(cs, text), 'x\n');
Similarly, what if an op adds text to the end of the document without a trailing newline, then the next op adds the newline:
const text = 'x\n';
const lines = [text];
const cs = '|1=2+1|1+1$y\n'; // Keep existing text, add a char, then add a newline.
Changeset.mutateTextLines(cs, lines);
assert.equal(lines.join(''), 'x\ny\n');
assert.equal(Changeset.applyToText(cs, text), 'x\ny\n');
If we force newlines to always be present in each entry in this._lines
then we're going to end up inserting extra newlines when we don't want them. I think the way to fix this is to strip the newlines from this._lines
and add them back before returning.
this._curSplice.push(theLine.substring(lineCol)); | ||
this._curCol = 0; // TODO(doc) why is this not set to the length of last line? | ||
const remaining = theLine.substring(lineCol); | ||
if (remaining !== '') this._curSplice.push(remaining); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If remaining === ''
, then that means that this._curCol
points just past the line's newline, correct? If so, I think the better fix would be to set this._curCol
to 0 and increment this._curLine
before doing anything else.
Also, this._curCol
pointing just past the newline feels like a violation of an invariant. Maybe instead of checking if remaining !== ''
, it should assert that and we should fix whatever logic is causing this._curCol
to point past the newline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is not necessarly a newline at every line. Etherpad uses it with lines that always end with newlines (e.g. rep.lines), except for mutateAttributionLines
, but this has other logic.
Also take a look at the test case, this is valid imo.
Update: I do think now, that it should end with a newline now. So when it comes to "valid" ignore the newline for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do understand what you wrote in the first paragraph, though. If there is a newline, we would never go "beyond" that. Let me test
OK, I think I ensure that newlines are everywhere first. It's possible that I didn't saw missing newlines because it's dealing with edits at the end of the pad. |
Any update @webzwo0i? |
Will have time in the next few days. I'll probably tackle curCol/curLine behavior first, including asserting the invariants. For newlines behavior I need to take a look at |
6d96eed
to
e36ffb3
Compare
I rebased onto latest |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Any update on thoses fixup commits ? |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Bump @webzwo0i -- any progress? |
@dependabot rebase |
...in case we are in the middle of a line (curCol !== 0) and we remove everything up to the end of the line. Before this commit a string 'undefined' could make it into the splice.
The new insertions are directly pushed to curSplice now instead of trying to incorporate an undefined line into the splice, which would result in an error: TypeError: Cannot read property 'substring' of undefined
In such cases the remaining part of the old line is directly pushed to the splice but we need to ensure it is not an empty string. Before this commit an empty string was pushed to the splice.
e36ffb3
to
f8b41ba
Compare
@webzwo0i Do you have some time to fix this pr? |
Will clean this up tomorrow (fixing eslint etc). This should fix #5214 #5213 and replace #5215 and #2911
Also adds some comments and additional coverage.