Supporting different schemas around getting child nodes #18

@LeaVerou

Problem

In #16 we converted our hardcoded { node, property, index } parent pointer structure to a more flexible { node, path[] } schema. However, the internals did not change enough to actually support arbitrary paths. We still assume children are found in one of two ways:

  • If no getChildProperties() is specified, we follow all properties on a node, and call isNode() on them to see if they are child nodes. This only goes 1-2 levels deep: values that are obtained by following one property, or array values if that one property is an array.
  • If getChildProperties() is specified, we follow only these properties. We do not call isNode() on the results; we just filter out values that do not exist.
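
To make the two modes above concrete, here is a minimal sketch of the lookup as described. It is not Treecle's actual implementation: getChildren() and its options object are hypothetical, and isNode and getChildProperties only stand in for the settings of the same name.

```javascript
// Hypothetical sketch of the lookup described above; not Treecle's actual code.
function getChildren(node, { isNode, getChildProperties } = {}) {
	const children = [];

	if (getChildProperties) {
		// Explicit properties: trust them, only filter out missing values
		for (const property of getChildProperties(node)) {
			const value = node[property];
			for (const child of Array.isArray(value) ? value : [value]) {
				if (child !== undefined && child !== null) {
					children.push(child);
				}
			}
		}
		return children;
	}

	// No explicit properties: inspect every property value,
	// plus the items of any property value that is an array
	for (const value of Object.values(node)) {
		for (const candidate of Array.isArray(value) ? value : [value]) {
			if (isNode(candidate)) {
				children.push(candidate);
			}
		}
	}
	return children;
}

// Assumption for this example only: a node is anything with a string `type`
const isNode = (value) => typeof value?.type === "string";
const ast = {
	type: "CallExpression",
	callee: { type: "Identifier", name: "f" },
	arguments: [{ type: "Literal", value: 1 }, { type: "Literal", value: 2 }],
};

const inferred = getChildren(ast, { isNode });
const explicit = getChildren(ast, { isNode, getChildProperties: () => ["arguments", "callee"] });
```

Both calls find the same three children here; they differ only in order and in whether isNode() is consulted.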

Currently, this assumes a specific structure that may be overfit to Treecle’s AST beginnings.
I suspect in the wild there are two common ways to represent tree data structures:

  1. Follow specific, named properties that each point to a child or a list of children (the AST case)
  2. Follow a single property that holds all of the children (children for Mavo Nodes, childNodes for DOM Nodes)

Currently, the API only supports providing a function that takes a node and returns a list of properties. As an example, this is how this setting is specified in vastly:

export const properties = {
	CallExpression: ["arguments", "callee"],
	BinaryExpression: ["left", "right"],
	UnaryExpression: ["argument"],
	ArrayExpression: ["elements"],
	ConditionalExpression: ["test", "consequent", "alternate"],
	MemberExpression: ["object", "property"],
	Compound: ["body"],
};

defaults.getChildProperties = (node) => {
	return properties[node.type] ?? [];
};

This appears to be overfit to 1, yet I suspect 2 may be even more common.
How do we allow both to be specified without making either more complicated due to the existence of the other?

Ideas

Rename getChildProperties() to getChildPaths() and handle both string[] and string[][]?

We don't want to complicate 1 to cater to 2, but what if we could do both? If the function returns an array of strings, they are single properties. If it returns an array of arrays, they are paths.
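
A rough sketch of how both return shapes could be accepted, assuming a hypothetical getChildrenByPaths() helper. None of these function names exist in Treecle, and array expansion is left out to keep the normalization step visible.

```javascript
// Hypothetical: normalize each entry to a path, so "left" becomes ["left"]
function normalizePaths(entries) {
	return entries.map((entry) => (Array.isArray(entry) ? entry : [entry]));
}

// Follow a path of property names from a node; undefined if any step is missing
function followPath(node, path) {
	return path.reduce((value, key) => value?.[key], node);
}

// Returns { node, path } records, mirroring the { node, path[] } schema from #16
function getChildrenByPaths(node, getChildPaths) {
	return normalizePaths(getChildPaths(node))
		.map((path) => ({ node: followPath(node, path), path }))
		.filter((child) => child.node !== undefined && child.node !== null);
}

// Case 1: plain property names keep working unchanged
const binary = {
	type: "BinaryExpression",
	left: { type: "Literal", value: 1 },
	right: { type: "Literal", value: 2 },
};
const flat = getChildrenByPaths(binary, () => ["left", "right"]);

// Nested paths: each inner array is one path; missing targets are dropped
const wrapper = { content: { first: { id: "a" } } };
const nested = getChildrenByPaths(wrapper, () => [["content", "first"], ["content", "missing"]]);
```

Note that every path still has to be enumerated up front, so there is no way to express "everything under this property".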

The problem is that in 2 we don’t necessarily have specific child properties. Often, once you get from the node to its children, everything in that data structure is a child.

Wildcards? JSON Paths?

Basically, we want a way to say children/* for these cases. What if we handle / and * in properties specially?

But then we’re basically creating a path microsyntax, and restricting the potential syntax of actual real properties accordingly.
OTOH, that's close to JSONPath, which is quite well established (though JSONPath itself would spell this children.* or children[*] rather than children/*).

The advantage of something like this is that we can still handle properties like Vastly’s in exactly the same way.
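
As a sketch of what the wildcard idea could look like, the hypothetical resolvePath() below treats "/" as a step separator and "*" as "every key at this step". The microsyntax and the function name are assumptions made for illustration, not an existing Treecle or JSONPath API.

```javascript
// Hypothetical microsyntax: "/" separates steps, "*" matches every own key at that step
function resolvePath(node, path) {
	let matches = [{ node, path: [] }];

	for (const step of path.split("/")) {
		matches = matches.flatMap(({ node: current, path: soFar }) => {
			if (current === null || typeof current !== "object") {
				return []; // Cannot descend into primitives
			}
			const keys = step === "*" ? Object.keys(current) : [step];
			return keys
				.filter((key) => current[key] !== undefined && current[key] !== null)
				.map((key) => ({ node: current[key], path: [...soFar, key] }));
		});
	}
	return matches;
}

// Case 2: a single property holds all children
const tree = { name: "root", children: [{ name: "a" }, { name: "b" }] };
const viaWildcard = resolvePath(tree, "children/*");

// Case 1: a plain property name is just a one-step path
const expr = { type: "UnaryExpression", argument: { type: "Literal", value: 1 } };
const viaProperty = resolvePath(expr, "argument");
```

Each match records the full path that was followed (array indices show up as string keys such as "0"), which fits the { node, path[] } schema. The cost is the one noted above: a real property whose name contains "/" or equals "*" can no longer be addressed without some escaping rule.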


Not a huge fan of any of these ideas, so I’ll continue brainstorming.
