Clarify initial XPath expression context of bind expressions (and similar) #137

Open
eyelidlessness opened this issue Jun 20, 2024 · 3 comments

@eyelidlessness
Member

This issue is intended to capture the outcome of some discussion with @lognaturel about this test. Specifically, given a form structure...

<h:html xmlns="http://www.w3.org/2002/xforms" xmlns:h="http://www.w3.org/1999/xhtml">
  <h:head>
    <h:title>Some form</h:title>
    <model>
      <instance>
        <data id="some-form">
          <node><value>1</value></node>
          <node><value>2</value></node>
          <node><value>3</value></node>
          <node><value>4</value></node>
          <node><value>5</value></node>
          <node-values/>
        </data>
      </instance>
      <bind nodeset="/data/node" relevant="position() > 2"/>
      <bind nodeset="/data/node/value" type="int"/>
      <bind nodeset="/data/node-values" calculate="concat(/data/node/value)"/>
    </model>
  </h:head>
  <h:body>
    <group ref="/data/node">
      <repeat nodeset="/data/node">
        <input ref="/data/node/value"/>
      </repeat>
    </group>
  </h:body>
</h:html>
  • In JavaRosa, the value of /data/node-values will be "345"
  • Currently in Web Forms, it will be ""

This is because:

  1. Web Forms evaluates all XPath expressions with a single node as its expression context.
  2. For each /data/node, the expression position() > 2 will always return false because position() in a single node context will always return 1.
  3. As a result, none of the node repeat instances will ever be relevant.
  4. That non-relevance is inherited by each value, causing their values to be blank.
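The four steps above can be modeled with a minimal Python sketch. This is not Web Forms or JavaRosa code: `relevant`, `concat_relevant`, and the two position lists are hypothetical names that only illustrate how the choice of expression context changes what `position()` returns for the first fixture.

```python
# Hypothetical model of the first fixture: five /data/node instances,
# each holding one value.
values = ["1", "2", "3", "4", "5"]


def relevant(position: int) -> bool:
    # Models the bind expression `position() > 2`.
    return position > 2


def concat_relevant(values, positions):
    # Non-relevant values are treated as blank, so they add nothing to concat().
    return "".join(v for v, p in zip(values, positions) if relevant(p))


# Single-node context: each node is evaluated alone, so position() is always 1.
single_node_positions = [1 for _ in values]

# Node-set context: each node is evaluated within the matched nodeset.
nodeset_positions = list(range(1, len(values) + 1))

single_node_result = concat_relevant(values, single_node_positions)
nodeset_result = concat_relevant(values, nodeset_positions)

print(repr(single_node_result))  # ''
print(repr(nodeset_result))      # '345'
```

The empty string corresponds to the current Web Forms result, and "345" to the JavaRosa result.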

The expectation is clearly that the relevant expression is evaluated against a multi-node context. Although this form fixture does not make it obvious, there is still some ambiguity about what that context should be:

  • The complete set of all nodes matching the nodeset
  • The contiguous subset of all nodes matching the nodeset

The first option seems obvious. But consider a variation of the above form fixture:

<h:html xmlns="http://www.w3.org/2002/xforms" xmlns:h="http://www.w3.org/1999/xhtml">
  <h:head>
    <h:title>Position, context, nested repeats</h:title>
    <model>
      <instance>
        <data id="position-context-nested-repeats">
          <outer>
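            <inner><val>1</val></inner>
            <inner><val>2</val></inner>
            <inner><val>3</val></inner>
            <inner><val>4</val></inner>
            <inner><val>5</val></inner>
            <inner-vals />
          </outer>
          <outer>
            <inner><val>1</val></inner>
            <inner><val>2</val></inner>
            <inner><val>3</val></inner>
            <inner><val>4</val></inner>
            <inner><val>5</val></inner>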
            <inner-vals />
          </outer>
        </data>
      </instance>
      <bind nodeset="/data/outer/inner" relevant="position() > 2"/>
      <bind nodeset="/data/outer/inner/val" type="int"/>
      <bind nodeset="/data/outer/inner-vals" calculate="concat(/data/outer/inner/val)"/>
    </model>
  </h:head>
  <h:body>
    <group ref="/data/outer">
      <repeat nodeset="/data/outer">
        <group ref="/data/outer/inner">
          <repeat nodeset="/data/outer/inner">
            <input ref="/data/outer/inner/val"/>
          </repeat>
        </group>
      </repeat>
    </group>
  </h:body>
</h:html>
  • With the complete set, the respective values of /data/outer/inner-vals would be:

    • "345"
    • "12345"
  • With the contiguous subset, they'd be:

    • "345"
    • "345"
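The difference between the two candidate contexts can be checked with a small Python sketch. It assumes each outer instance holds five inner instances with one val each, which is what the listed results imply. The function names are hypothetical and do not come from either engine.

```python
# Hypothetical model: two /data/outer instances, each with five
# /data/outer/inner instances holding a single val ("1" through "5").
outers = [["1", "2", "3", "4", "5"], ["1", "2", "3", "4", "5"]]


def relevant(position: int) -> bool:
    # Models the bind expression `position() > 2`.
    return position > 2


def inner_vals_complete_set(outers):
    # position() counts across every inner node matching the nodeset,
    # regardless of which outer instance contains it.
    results = []
    position = 0
    for inners in outers:
        kept = []
        for val in inners:
            position += 1
            if relevant(position):
                kept.append(val)
        results.append("".join(kept))
    return results


def inner_vals_contiguous_subset(outers):
    # position() restarts at 1 within each outer instance.
    return [
        "".join(val for pos, val in enumerate(inners, start=1) if relevant(pos))
        for inners in outers
    ]


complete = inner_vals_complete_set(outers)
contiguous = inner_vals_contiguous_subset(outers)

print(complete)    # ['345', '12345']
print(contiguous)  # ['345', '345']
```

In the complete-set model, the second outer's inner nodes occupy positions 6 through 10, so all of them pass `position() > 2`.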

The latter is what I would intuitively expect. Describing a structure like this in discussion with @lognaturel, we agreed on that intuition. Notably, for the affected test linked above, the implicit outcome would be that position() in nodeset context will behave as if it had been expressed as position(.) (i.e. it would be consistent with the ODK XForms specification's 1-arity position extension).

Having a clear set of expectations, we also agreed on these next steps:

  1. Time box a spike to prove out the hypothesis that both form structures will behave as expected, with a general change to use the contiguous subset context for:

    • relevant expressions
    • other bind expressions with the same nodeset context
    • broadly, any other form expression which would be expected to have the same nodeset context

  2. Validate that this generalization does not break other things. (Editorial: it may very well fix other things.)

  3. If all of the above holds, open an xforms-spec issue to clarify initial expression context accordingly.

@lognaturel
Member

lognaturel commented Jun 20, 2024

The primary use for this I'm aware of involves first capturing a roster in one repeat and then asking follow-up questions about some or all of the items in the roster in a second repeat. For example, you could capture the names and ages of all household members and then ask questions only about the children under 5.

Here's an XLSForm that illustrates this: https://docs.google.com/spreadsheets/d/1Ca8fbhpGpdaxLAnSSebrGRnbey5U0FUbJM_pVxjReKI/edit#gid=1068911091

> If all of the above holds, open an xforms-spec issue to clarify initial expression context accordingly.

An alternative would be to consider implementing it for compatibility with existing forms without making it an explicit part of the spec and maybe communicating that it's deprecated in some way (warning?). I don't believe we document any examples like this and the second structure in my sample form seems less ambiguous.

I believe this intersects in interesting ways with edits (#36): if certain repeat instances are entirely non-relevant, their position will change on edit, because the spec doesn't have a stable ordinal concept. That's why Enketo has an additional ordinal concept (https://enketo.github.io/enketo-express/tutorial-12-ordinals.html). This is also at least tangentially related to enketo/enketo#49, I think.
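To make the edit concern concrete, here is a minimal Python sketch. It assumes that non-relevant repeat instances are omitted from the submission and that the form is later reloaded from that submission for editing. The instance ids ("a" through "e") and helper names are hypothetical.

```python
def positions(instance_ids):
    # Map each repeat instance to its 1-based position in document order.
    return {
        instance_id: position
        for position, instance_id in enumerate(instance_ids, start=1)
    }


def relevant(position: int) -> bool:
    # Models the bind expression `position() > 2`.
    return position > 2


original_ids = ["a", "b", "c", "d", "e"]
before = positions(original_ids)

# Assumption: only relevant instances survive into the submission.
submitted_ids = [i for i in original_ids if relevant(before[i])]

# On edit, positions are recomputed from what the submission contains.
after = positions(submitted_ids)

# Re-evaluating the same expression now yields a different answer.
still_relevant = [i for i in submitted_ids if relevant(after[i])]

print(before["c"], after["c"])  # 3 1
print(still_relevant)           # ['e']
```

Without a stable ordinal, instance "c" moves from position 3 to position 1, so re-evaluating `position() > 2` on edit would drop instances that were relevant when first submitted.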

@lognaturel
Member

The more this is kicking around my brain, the more I feel like we might want to defer even the spike until we have a vat of edge cases that we can consider together and triage! I think the concept is likely only useful with jr:count and I think we have a nice alternative form definition with the relevance on an inner group.

This comment from @eyelidlessness also resonates and may be an alternative to addressing this specific issue:

> it might be worth revisiting the group > repeat > group aspect of spec and recommendations. it feels like those structures are doing a lot of "if you hold it right" work that could benefit from something more dedicated to the use cases that we already know

@eyelidlessness
Member Author

I'm not in a huge rush to dive further into this, but I do want to make it clear that this issue isn't specifically about position() or use cases specific to the test.

It really is about the broader question: when an XPath expression's context is a bind nodeset, what exactly does that mean?

That has broader implications than the particular test that raised the question this time, and the kinds of use cases we can extrapolate from it. There have been other cases like this which have raised the same question in my mind. I would need to spend some time going through our now-expanded test suite and probably through Enketo's fixtures to say for certain what the other cases look like. This one just happened to jump out at me as a fairly well distilled example where the implications of the question are relatively simple to grok.

My gut instinct is that the time we'd invest in this proposed spike would be relatively minimal (even without time boxing it). Almost certainly less than the time we'd spend doubting it, and collecting cases to justify it. I also think that the intuitive answer we agreed on just a little while ago very probably reflects the expectation people would tend to have when authoring nodeset-contextual expressions, even if they haven't thought it through in such detail.

> This comment from @eyelidlessness also resonates and may be an alternative to addressing this specific issue:
>
> > it might be worth revisiting the group > repeat > group aspect of spec and recommendations. it feels like those structures are doing a lot of "if you hold it right" work that could benefit from something more dedicated to the use cases that we already know

It's also at least plausible that rethinking the conventions and recommendations around these structures could make answering the broader question more compelling.


> I believe this intersects in interesting ways with edits (#36): if certain repeat instances are entirely non-relevant, because the spec doesn't have a stable ordinal concept, their position will change on edit. That's why Enketo has an additional ordinal concept: https://enketo.github.io/enketo-express/tutorial-12-ordinals.html This is also at least tangentially related to enketo/enketo#49 I think

I am eager to dive into these kinds of questions about edit-specific considerations.

The absence of non-relevant repeat instances certainly could imply a positional change upon edit, but it's not necessarily what I'd expect. Depending on the particular relevant expression, it's also conceivable that we could infer where repeat instances have been elided and repopulate them without an explicit ordinal artifact in the original submission. That gets harder to imagine where they're elided at the end of a range, but that's probably moot.
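One way to picture that inference is the Python sketch below. It is purely illustrative and rests on a strong assumption: the relevant expression depends only on position. The `repopulate` function and its `max_position` guard are hypothetical and are not a proposed implementation.

```python
from typing import Callable, List, Optional


def repopulate(
    submitted: List[str],
    relevant: Callable[[int], bool],
    max_position: int = 1000,
) -> List[Optional[str]]:
    # Walk positions from 1 upward. Where the expression says a position is
    # non-relevant, insert a placeholder (None) for the elided instance;
    # otherwise place the next submitted instance there.
    restored: List[Optional[str]] = []
    remaining = list(submitted)
    position = 0
    while remaining:
        position += 1
        if position > max_position:
            raise ValueError("could not place every submitted instance")
        if relevant(position):
            restored.append(remaining.pop(0))
        else:
            restored.append(None)
    return restored


# Leading instances elided by `position() > 2` can be inferred.
leading = repopulate(["3", "4", "5"], lambda p: p > 2)
print(leading)  # [None, None, '3', '4', '5']

# Instances elided at the end of a range leave no trace to infer from.
trailing = repopulate(["1", "2", "3"], lambda p: p < 4)
print(trailing)  # ['1', '2', '3']
```

The second call shows the limit: once every submitted instance has been placed, nothing indicates how many trailing instances were elided.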
