Helix's motions - mental models #14468
Why I think Helix motions should be selection first
Note: Forgive the time it took to post this. I coincidentally fell ill the very next day after writing the parent post.
I believe Helix should lean into the selection-first model of motions.
First, while Helix's vision bullet point on the topic starts with "selection -> action," it does use the term "selection first". Admittedly, the only thing these terms promise is that text must be selected before actions are performed on it. However, we don't usually set out to design inefficiencies, and certainly not these devs, so I interpret "selection first" a little more broadly: I read it to mean that selecting is to be a first-class workflow. As the linked threads have shown, Helix's current mixed model doesn't reach that standard for some workflows; for instance, I deal primarily with prose, where tree-sitter objects come up less often. Of course, Helix isn't meant to be everything for everyone, but we can improve the experience for a not insignificant current and potential userbase without sacrificing too much current functionality.
Second, examined closely, most motions already follow the selection-first mental model in some way, whether through being category II or category III in my analysis. In fact, if as one footnote suggests, we consider
What it would look like
How would Helix become more selection-first? Here I sketch some ideas. Some are easier than others, and some are more or less disruptive for people with movement-first models. These aren't demands or anything of the sort, just ideas that might be worth considering if we choose this direction. There are already open PRs for some of these, so naturally many of these ideas aren't mine. The particular implementation of these ideas is of course up to review and scrutiny to keep Helix's code nice and maintainable.
I'm also aware talk is cheap, so I am diligently studying the codebase to be able to contribute to this project I love.
Some questions or criticisms
PS: I had to get this out sooner rather than later. I may update it with other details as I get better :P.
This post DOES NOT argue for adopting vim-like keybindings; please don't remove it under that understanding.
TL;DR: Helix's motions have 3 distinct behaviours which mix-and-match between 2 editing mental models. Choosing and leaning on one would likely reduce confusion and complaints.
Introduction
Since as far back as 2021 (#165), there has been discussion and disagreement on Helix's keymaps and how they should work. It is also not uncommon for beginners to be confused about how to do things "the Helix way", whether or not they have previous experience with modal editors.
Helix aims to be a selection-first modal text editor, which in the strictest sense means that the editing workflow works by first selecting the text we want to operate on, and then performing the operation on it.
Selections/ranges
Properly understanding these confusions requires being clear about how Helix's model works in the code. I will simplify the overview to only include single-cursor selections (and below, motions that stay on the same buffer).
Selections are the primary editing construct. In a Helix buffer, there is always something selected: a single cursor is a selection that spans one character (which may be a single whitespace character). A selection has an anchor, a head, and a direction. The anchor and head roughly correspond to the "start" and "end" of the selection once direction (forwards or backwards) is taken into account; in fact, the positions of the anchor and head relative to each other are what determine the selection's direction. The main cursor, when kept as the default box, is positioned where the head is.
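To make the anchor/head/direction relationship concrete, here is a minimal sketch in Rust. It is a simplified illustration, not Helix's actual type: the names `Range`, `Direction`, `point`, `cursor`, and `text` are assumptions made for this example, offsets are treated as positions between characters, and the sample text is assumed to be ASCII.

```rust
/// Direction of a selection, derived from where the head sits relative to the anchor.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Direction {
    Forward,
    Backward,
}

/// Hypothetical, simplified single-range selection (an assumption for
/// illustration, not Helix's real struct). `anchor` and `head` are offsets
/// *between* characters, so a one-character cursor on index `i` is
/// `anchor = i, head = i + 1`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Range {
    anchor: usize,
    head: usize,
}

impl Range {
    /// A plain cursor: a selection exactly one character wide.
    fn point(index: usize) -> Self {
        Range { anchor: index, head: index + 1 }
    }

    /// Leftmost offset, regardless of direction.
    fn from(&self) -> usize {
        self.anchor.min(self.head)
    }

    /// Rightmost offset, regardless of direction.
    fn to(&self) -> usize {
        self.anchor.max(self.head)
    }

    /// Direction is not stored; it falls out of comparing anchor and head.
    fn direction(&self) -> Direction {
        if self.head < self.anchor {
            Direction::Backward
        } else {
            Direction::Forward
        }
    }

    /// Index of the character the block cursor is drawn on.
    fn cursor(&self) -> usize {
        if self.head > self.anchor { self.head - 1 } else { self.head }
    }

    /// Selected text. Byte slicing is only valid here because the sample is ASCII.
    fn text<'a>(&self, s: &'a str) -> &'a str {
        &s[self.from()..self.to()]
    }
}
```

In this sketch, `Range::point(10)` on the text "Selections are the primary editing construct." is a forward selection holding only the space after "Selections", while `Range { anchor: 11, head: 0 }` is a backward selection holding "Selections " with the cursor drawn on the "S".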
Motions
A lot of the confusion centres around commands called "motions" or "movements", namely, among others:
and how these commands do/should interact with Helix's "selection first" model.[1]
Some users have expressed that these commands are inconsistent with each other (#14074, #1630, #5127, #6604, #3114), sometimes leading to confusion and frustration. In particular, they feel that some of these commands "select" something (that is, result in nonzero-width visible ranges) while others don't. There is also another related confusion, about whether some or any of these commands "extend" the selection; this particular confusion also ties into how these commands do/should interact with the "count" system (numbers you can type before executing an operation as a shorthand for performing the operation multiple times).
To be on the same page, based on how Helix treats selections, all the aforementioned commands result in selections, even if the resulting selections are a single character wide. The difference between the commands lies in what selection they generate.
In the current system:
- `h, j, k, l, gh, gl, gs, gg, ge` all result in a selection containing the single character at the target position (let's call them Category I).[2]
- `gw` and `[`/`]` `f, t, a, c, e, T` (respectively) all result in a selection containing the entirety of the target object (comment, function, etc) (let's call them Category II).
- `w, e, b, W, E, B, f, t, F, T` and `[`/`]` `p` all result in a selection containing all the text between the current cursor position and the target: they set the anchor to where the cursor/head was and place the head on the destination (let's call them Category III).
For all of these categories, it's important to note they don't care what you had selected before (except as a way of finding your position); they replace whatever you had selected with whatever the new selection is, as described above.
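The three categories can be sketched as three functions that each build a new range from the current one. This is an illustration under stated assumptions, not Helix's implementation: `Range`, `category_one`, `category_two`, and `category_three` are hypothetical names, offsets sit between characters, the caller is assumed to have already located the target, and backward Category III motions are simplified.

```rust
/// Hypothetical, simplified selection range (an assumption for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Range {
    anchor: usize,
    head: usize,
}

impl Range {
    /// Index of the character the block cursor is drawn on.
    fn cursor(&self) -> usize {
        if self.head > self.anchor { self.head - 1 } else { self.head }
    }
}

/// Category I: a one-character selection at the target position.
/// The previous selection is ignored entirely.
fn category_one(_current: Range, target: usize) -> Range {
    Range { anchor: target, head: target + 1 }
}

/// Category II: a selection covering the whole target object
/// (for example a function or comment spanning `object_start..object_end`).
fn category_two(_current: Range, object_start: usize, object_end: usize) -> Range {
    Range { anchor: object_start, head: object_end }
}

/// Category III: the anchor goes where the cursor was and the head goes to
/// the destination. The previous selection matters only as a way of finding
/// the cursor position.
fn category_three(current: Range, destination: usize) -> Range {
    Range { anchor: current.cursor(), head: destination }
}
```

Note that all three functions return a fresh range and none of them preserves the old anchor, which is the "replace whatever you had selected" behaviour described above.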
The confusion and disagreement over their functionality seems to stem from different understandings, or mental models, of what these commands are for in Normal mode.
Selection vs movement
This article was prompted by a pull request addressing a long discussion about the motions `w, e, b, W, E, B` and their interaction with counts in particular, so I'll use those to illustrate, but the illustrations apply to the other motions as well.
The thesis is this: there are two (2) dominant mental models about what these commands are for in Normal mode, movement-first and selection-first.[3]
For the movement-first camp, though I can't read minds, we could say the primary Normal mode frame of mind is moving: these commands are the means to navigate the text. The fact some of these select (like `w`) is, effectively, a convenience: Helix requires us to select text before we act on it, so these small selections make it easier to change or delete one word. In other words, in Normal mode, selecting is secondary to moving (I'll touch on Select mode later).
For the selection-first camp, the primary Normal mode frame of mind is selecting; these commands are the means to select the text we want to operate on. The fact that they result in moving the cursor is, effectively, a side effect of Helix's selection-first architecture: Helix selections always have a head which is represented by the cursor. In other words, moving is secondary to selecting.
For small and simple motions in Normal mode (e.g. a single `w` press), both concepts result in identical outcomes. Outcomes diverge, and unexpected behaviour occurs for one or the other camp, when we attempt anything more complex, such as using counts[4] or Select mode.
Select mode
Select mode is one of Helix's three major modes. It behaves similarly to Normal mode, with pretty much all the same motions available. The difference is that Select fixes the anchor of our current selection and allows us to use motions to position the head without destroying what has already been selected.
Select mode's differences from Normal mode merit a little further specification, since many are confused about what they are (#1570 (comment), #12533, #8761, #14344), especially when it comes to small motions like `w`. Say we have the following text (from `helix-core/src/selection.rs`), and we represent our selection as `[]`, where `[` is the anchor and `]` is the head (ex. 1):
If, in either Normal mode or Select mode, we press `b`, this is the result:
What happens from here varies depending on whether we are in Normal mode or Select mode. If in Normal mode, we proceed to press `w`, we get this:
In Select mode, however (whether we did the previous `b` in Normal mode or activated Select mode after `b`), this happens:
On the other hand, say we're back to ex. 1, and instead of `b`, we want to use `w`. In Normal mode, this happens:
In Select mode, something subtly different happens. Since we already had the space before "are" selected (i.e. contained in our cursor), the resulting selection also includes the space before "are":
The differences become starker if counts are taken into account. From the position in ex. 1, if in Normal mode, we input `b3w`, this is the result:
While inputting either `vb3w` or `bv3w` results in
For the sake of argument, if someone was in position ex. 1 and wanted to select "Selections are the " using simple motions, one correct chain is `b;v3w`, collapsing the selection created by `b` before entering Select mode.
This makes clear the difference between Normal mode and Select mode in terms of how Category II and Category III motions behave: in Normal mode, they create a brand-new selection that discards whatever we had selected, but in Select mode, they add the selection that the Normal mode motion would have resulted in to our existing selection, after accounting for direction. This adding is what's called "extending" the selection.[5]
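The replace-versus-extend distinction can be sketched as a single function that combines the current selection with the range a motion would produce in Normal mode. This is a deliberately simplified illustration, not Helix's code: `Mode`, `Range`, `apply_motion`, and `text` are hypothetical names, and the Select branch simply keeps the old anchor, ignoring the direction adjustments mentioned above.

```rust
/// The two modes compared in this section (Insert mode is left out).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Normal,
    Select,
}

/// Hypothetical, simplified selection range (an assumption for illustration).
/// Offsets sit between characters.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Range {
    anchor: usize,
    head: usize,
}

/// Combine the current selection with the range a motion would produce in
/// Normal mode.
/// - Normal: the motion's range replaces the current selection outright.
/// - Select: the current anchor stays fixed and only the head moves
///   (simplification: no adjustment when the direction flips).
fn apply_motion(mode: Mode, current: Range, motion_result: Range) -> Range {
    match mode {
        Mode::Normal => motion_result,
        Mode::Select => Range {
            anchor: current.anchor,
            head: motion_result.head,
        },
    }
}

/// Selected text. Byte slicing is only valid here because the sample is ASCII.
fn text<'a>(range: Range, s: &'a str) -> &'a str {
    let (from, to) = (range.anchor.min(range.head), range.anchor.max(range.head));
    &s[from..to]
}
```

As a usage example with made-up offsets: starting from a selection covering "Selections " (`anchor: 0, head: 11`) and a motion whose Normal mode result covers "are " (`anchor: 11, head: 15`), the Normal branch yields "are " alone, while the Select branch yields "Selections are ".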
Why this matters in the context of the two mental models will be dealt with in a later section.
The mental model in Helix's code
The function names in Helix's code—which incorporate words such as "move, goto, jump"—heavily suggest the mental model used is the "movement-first" model, i.e., their intended function is navigation, and selection is secondary.
However, we saw above that despite this uniformity of name, there are at least three (3) ways motions result in selections in Normal mode. The most compatible with the code's model is category I, where the cursor lands at the target with a 1-character selection. Category II almost entirely follows the selection-first model, if only for convenience, and category III seems to conflate both mental models, resulting in the expressed inconsistency.
It's notable, however, that not all motions follow this pattern. In particular, some functions around navigating tree-sitter syntax trees are named—and behave—like Category II, according to the selection-first model (with quirks: #1999):
Consequences of the mental models
Having explored how Helix's motions behave, how the motions are named in the code, and how people might interpret the motions differently, the question remains: why does this all matter?
I think this analysis matters because it's likely the frustrations users have with (to them) unexpected behaviours in Helix arise from a mismatch between their mental model and Helix's current behaviour. Furthermore, it's likely this mismatch is caused, facilitated, or exacerbated by either community miscommunication, or Helix's motions' above-mentioned divergent behaviours, or both.
Certain users adopt the "selection-first" mental model, perhaps cued by Helix's self-description as "selection first" and by the mnemonics of some motions, which as we have seen create selections containing the expected text (in simple cases or, for some motions, always). They are then surprised, and perhaps frustrated, by some motions' "movement-first" behaviours. Other users adopt the "movement-first" mental model; I dare suggest some of them may have experience with other text editors, or an engineer's approach to systematicity. Most probably enjoy the convenience of many motions also selecting, but they expect selecting to be its own frame of mind. That is not to say there haven't been reports of unexpected behaviour from this camp too.
Whatever camp you fall in, over four years of discussion about Helix's basic keymap suggests the current hybrid behaviours are not serving many people and will likely remain a point of confusion indefinitely. For community sustainability, choosing a single mental model—either selection-first or movement-first—leaning on it consistently and everywhere, and communicating it clearly to the community is probably for the best.
Thank you for reading this far. I will soon post my own opinion as to which direction I think is more productive for Helix to take, with arguments. If I've made a factual mistake, I'd be thankful for a correction. I would also, of course, love to hear which of "movement-first" or "selection-first" you think would be better, and why.
The PR discussion that prompted this: #1570
PS: Why I think this deserves to be an open discussion.
#2477 was—appropriately—closed because it had degenerated into an (often rude) catch-all discussion about Helix's keymaps with no actionable steps that could plausibly close it as an issue. I believe this discussion is different because it's circumscribed to one concept: my proposed categorization of editing mental models and resulting expectations. Also, unlike that discussion, this post sheds light on specific directions to solve the perceived problems, even if the proposed directions aren't followed; namely, making an unambiguous decision on the editing mental model we want for Helix, taking steps to lean into it to maximize its benefits, and communicating it to users.
Edit: Added references to relevant issues, PRs, discussions, and comments.
Edit 2: Fixed style issues remaining from draft stage. Also added section justifying its open status.
Footnotes
1. Note: `hjkl` don't have a `helix-core` command to move by characters. Instead, `helix-term` uses functions by those names which call the `move_impl()` function with the appropriate parameters. Furthermore, all the `g`-prefixed commands, as well as `f, t, F, T`, are implemented directly in `helix-term`.
2. Since `hjkl` all have for a target a specific character, it could even be said they are in category II: they generate a selection that fully contains the target.
3. The distinction was elegantly (if through a mistake) demonstrated in https://github.com/helix-editor/helix/issues/536#issuecomment-892083147
4. For other issues with counts, see also https://github.com/helix-editor/helix/issues/2477
5. In reality, these motions all generate a new range each time, with the Normal mode motions ignoring the previous selection and the Select mode motions taking it into account in their different ways.