
Incorrect Allowed @name values for @prop in documentation #1972

Open · iMichaela opened this issue Jan 9, 2024 · 19 comments

@iMichaela (Contributor)

Describe the bug

CROSS-REFERENCE BUG, INITIALLY: OSCAL-Reference/issue#6

There are many places where the OSCAL documentation lists incorrect values for the @name attribute on the prop field.

For example, prop[@name='marking'] is only supposed to be valid in the //metadata of each model; however, it is also incorrectly listed in the documentation as valid in many other places.

A search of the Catalog documentation shows eight additional occurrences of prop[@name='marking'] in places such as:

//metadata/revisions/prop
//metadata/role/prop
//metadata/location/prop
//metadata/party/prop
//metadata/responsible-party/prop
//param/prop (root, group, and control levels)
//control/prop (root, group, and control levels)
//part/prop (group and control levels)
//group/prop
//back-matter/resource/prop
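
For orientation, here is a minimal, hypothetical catalog fragment showing the two kinds of locations at issue (the UUID, dates, titles, and values are invented):

    <catalog xmlns="http://csrc.nist.gov/ns/oscal/1.0"
             uuid="00000000-0000-4000-8000-000000000000">
        <metadata>
            <title>Example Catalog</title>
            <last-modified>2024-01-09T00:00:00Z</last-modified>
            <version>1.0</version>
            <oscal-version>1.1.1</oscal-version>
            <!-- Undisputed: both the metaschema and the docs allow 'marking' here. -->
            <prop name="marking" value="CUI"/>
        </metadata>
        <group id="g1">
            <title>Example Group</title>
            <!-- Disputed: the documentation also lists 'marking' as allowed here. -->
            <prop name="marking" value="CUI"/>
            <control id="c1">
                <title>Example Control</title>
            </control>
        </group>
    </catalog>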

Who is the bug affecting

Developers trying to properly implement OSCAL properties.

What is affected by this bug

Documentation, Metaschema

How do we replicate this issue

  1. Visit the documentation page for any model.
  2. Search the page for "marking" (three occurrences of "marking" per entry).
  3. Observe prop[@name='marking'] listed as valid in places other than //metadata/prop.

Expected behavior (i.e. solution)

Documentation for prop in each context should include only the actual accepted values for @name.

Other comments

This issue has existed since the pre-1.0.0 release candidates. To my knowledge, no issue was created for it; I could not find one among the open issues.

Revisions

No response

iMichaela added the bug label Jan 9, 2024
@iMichaela (Contributor, Author) commented Jan 9, 2024

Important comments are available under the original issue OSCAL-Reference/issue#6

Moving the earlier-assigned devs here.

@wendellpiez (Contributor) commented Jan 9, 2024

In my assessment (as I have remarked repeatedly, whenever consulted), the current design for documentation as a 'spill out' or rendering of a document tree simply does not accommodate the semantics of 'allowed-values' listings in particular, and of constraints in general, especially given the way Metaschema constraints definitions are evolving.

That is, I can corroborate the seriousness of this problem: it has been with us since the start, or at least since we settled on a 'good enough for now' docs design. That was fine at the time; it is no longer 'good enough' now. (Indeed, on closer examination this usability problem is related to others I won't go into here.)

Accordingly the only way forward I see is this:

  • Deal with the problem in the current design by reduction and simplification, removing all constraints docs from the embedded (per-definition) docs.
  • Then, supplement the docs set with a new generated HTML file listing all constraints defined everywhere and indexing them, not only by their contexts of definition but, more importantly, by the sets of nodes (node types) to which they are applicable.
  • Iterate this design with users working test samples.

This is not a trivial work item, and the second item in itself will require significant effort, starting with research spikes.

More importantly, none of this is a one-person job. It was already too ambitious as a two-person job (when it was not done in the current design). The second item especially (the design of an index to constraints by node type and context) is not trivial and may ultimately require more than a static index, i.e., Metapath-aware processing. The third item explicitly requires a team.

Am I wrong?

By an 'index' I mean being able to see all the constraints applicable to any OSCAL element by looking it up in some sensible way. But even if you can find a metaschema definition for an element (in itself currently a problem, due to element name reuse in the current metaschema sources), you can't from this query alone find all the constraints applicable to any given case, for two reasons:

  1. Such constraints might be defined anywhere in a metaschema, not only in the definitions of the nodes to which they are applicable or of their parents.
  2. Constraints definitions also include arbitrary conditionals, such as co-occurrence tests, which mean that which constraints apply to a node cannot be known until runtime, over the entire tree and not just a local fragment (object).

Thus an index would show not only the constraints known to apply to a given node in an actual or abstracted case (element, object, or property), but also those that might apply depending on other conditions.
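
To make the first point concrete, here is a hypothetical Metaschema fragment (not taken from the OSCAL sources): a constraint declared on one definition can target nodes anywhere in the document, so an index keyed only on where constraints are defined would never surface it:

    <define-assembly name="catalog">
        <!-- Declared on the root assembly, yet applicable to every matching
             part in the whole document tree; an index of constraints under
             'part' would miss it unless it resolves the target Metapath. -->
        <constraint>
            <expect id="example-label-check"
                    target="//part[@name='statement']"
                    test="exists(prop[@name='label'])"/>
        </constraint>
    </define-assembly>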

If none of the above made sense to you, that only shows how far we have to go.

A more radical approach would be to rip the constraints definitions out from OSCAL entirely and no longer require them for conformance. But I don't know how to do this while also acknowledging that obviously users need them (in some form), and guiding them or showing them how to implement such constraints themselves (i.e. exploring the same mountain range from the other direction). More fundamentally, it would mean giving up this particular tool to exert any leverage at all, meaning 'anything goes' would effectively be the rule.

@wendellpiez (Contributor)

Just noting in passing that there are almost certainly bugs among the current constraints definitions, and possibly in the tools as well, but these need to be exposed with test cases before they can be corrected.

This is another reason to have a second implementation of the constraints validation available for testing.

@iMichaela (Contributor, Author)

Discussion #1968 is also analyzing a prop/@name constraint error in the assessment layer (discovered in the AP and reported in the OSCAL/Lobby).

@david-waltermire (Contributor)

The property named marking was intended to be allowed anywhere a property is allowed. That is how the constraint is defined. This is a feature, not a bug; the bug is that the documentation is incorrect.

Please keep in mind that changing this would be a backwards-compatibility-breaking change.

@iMichaela (Contributor, Author) commented Feb 22, 2024

I think the documentation lists prop/@name="marking" as a constraint that does not allow other values (there is no allow-other="yes"; see oscal_metadata_metaschema.xml, line 720):

    <constraint>
        <allowed-values target=".[has-oscal-namespace('http://csrc.nist.gov/ns/oscal')]/@name">
            <enum value="marking">A label or descriptor that is tied to a sensitivity or classification marking system. An optional class can be used to define the specific marking system used for the associated value.</enum>
        </allowed-values>
    </constraint>

This issue is a clone of the issue @brian-ruf opened in OSCAL-Reference. He argues that marking was envisioned to be constrained to metadata only. I do not think there is any harm in allowing "marking" everywhere and preserving backwards compatibility, as long as other values are allowed in the metadata. Per my review, only keywords and marking are defined as allowed values; the catalog introduces two more allowed values in the metadata (see oscal_catalog_metaschema.xml). It might be important to investigate whether allowing other values is beneficial to OSCAL adopters. The change I have in mind would look like this:

    <constraint>
        <allowed-values allow-other="yes" target=".[has-oscal-namespace('http://csrc.nist.gov/ns/oscal')]/@name">
            <enum value="marking">A label or descriptor that is tied to a sensitivity or classification marking system. An optional class can be used to define the specific marking system used for the associated value.</enum>
        </allowed-values>
    </constraint>


@david-waltermire (Contributor)

@brian-ruf is not remembering this correctly. Please see this commit log and #600. The decision that was implemented was to "allow for a marking value to be provided anywhere that prop is included."

allow-other="no" does not work as you are indicating. See allow-other in the Metaschema specification.

allow-other="no" applies to the the expected value set, which is compiled from all matching allowed values statements. To avoid this aggregation, you need to use both allow-other="no" and @extension="none", which is not used in this case.

The OSCAL model documentation is not accurate around constraints, since it doesn't factor in the aggregation of allowed values across multiple allowed-value statements. This has been a long-standing known problem with the documentation generator.
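
A sketch of the aggregation being described, simplifying the actual OSCAL sources: one allowed-values statement contributes 'marking' globally while another contributes 'keywords' in a particular context, and both match the same target:

    <!-- Statement 1: applies to prop anywhere (simplified from the shared module). -->
    <allowed-values target=".[has-oscal-namespace('http://csrc.nist.gov/ns/oscal')]/@name">
        <enum value="marking">...</enum>
    </allowed-values>

    <!-- Statement 2: applies to metadata/prop in particular (simplified). -->
    <allowed-values target=".[has-oscal-namespace('http://csrc.nist.gov/ns/oscal')]/@name">
        <enum value="keywords">...</enum>
    </allowed-values>

    <!-- For a prop matched by both, the expected value set is the union
         {marking, keywords}; allow-other="no" rejects names outside that
         union, not names outside each list taken alone. -->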

@iMichaela (Contributor, Author) commented Feb 22, 2024

Thank you for providing the decision record.
I was proposing allow-other="yes", not "no".
On a brief review, the documentation link does not explain how allow-other should be processed when:

  • not set
  • set to "no"
  • set to "yes"

@iMichaela (Contributor, Author)

The allow-other="no" with and without @extension is well summarized. Thanks.

@david-waltermire (Contributor) commented Feb 22, 2024

Keeping allow-other="no" allows OSCAL to control the property names that are within its namespace. If others want to add a new property, this forces them to use their own namespace or to work with the OSCAL project to get it added to the OSCAL namespace. Changing to allow-other="yes" would mean any name can be used in the OSCAL namespace, which sets up a case where others may squat on the OSCAL namespace instead of using their own. The consequence is a future name clash, where OSCAL defines a name (in the OSCAL namespace) that others are already using for a different purpose. allow-other="no" is used to ensure that this doesn't happen.
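
A small illustration of that convention (the example.org namespace and the review-status name are invented):

    <!-- No @ns: the prop is in the OSCAL namespace, so its @name must come
         from the expected value set compiled for this context. -->
    <prop name="marking" value="CUI"/>

    <!-- A third party's property belongs in its own namespace, where the
         OSCAL allow-other="no" constraint does not apply. -->
    <prop ns="https://example.org/ns/oscal-extensions"
          name="review-status" value="approved"/>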

@david-waltermire (Contributor)

Quoting the earlier question:

    Thank you for providing the decision record. I was proposing allow-other="yes", not "no". On a brief review, the documentation link does not explain how allow-other should be processed when:

      • not set
      • set to "no"
      • set to "yes"

See https://github.com/usnistgov/metaschema/blob/develop/website/content/specification/syntax/constraints.md#allow-other

@wendellpiez (Contributor)

More / better examples of correct and incorrect usage would help to bridge the gap here.

A 'Gordian knot' solution to the documentation problems could be to remove the constraints descriptions from the current docs and pull them out into separate pages (per model), isolating the design problem there.

Also, concur 💯 that a big problem is that users don't know they can extend freely (not only prop but also part names and classes) into their own namespaces.

@iMichaela (Contributor, Author) commented Feb 26, 2024

I concur with @wendellpiez's proposal of defining the constraints in separate pages. The way they are currently defined and then pulled into the Reference is confusing for many.

Regarding the extensions under a specific namespace: that is a good mechanism, BUT it can be used (understood) by tools ONLY if a registry exists from which GRC tools can learn, with a simple query, how to interpret the extension. Once a registry like this exists (and that would imply an OSCAL extension model is defined), then RMF- and FedRAMP-specific constraints can be pulled out of core OSCAL as well. Until then, extensions of parts under namespaces do not work. As a matter of fact, oscal-cli is still reporting validation errors related to constraints even when a specific namespace is defined to bypass an existing RMF/FedRAMP constraint.

@wendellpiez (Contributor)

An external registry would be useful, and arguably essential for supporting shared value spaces, but would not be necessary for tools, which can use locally-registered extensions, to some definition of "local". The reserved values become noise in their data to other consumers, but that can also be mitigated. Saying ns='urn:my_org' would be okay too...

Besides, the question is not whether applications need to be able to extend (which they do), but whether they need to be able to use the OSCAL namespace to do so (i.e. the one taken to be implicit if none other is given) -- which they don't.

I guess I can understand people not wanting to coin a namespace with a URL without putting something at the address given. This too is a problem but not a problem we can solve. In XML, the URIs assigned to namespaces are fictitious as often as not, and the idea that something predictable would be at the other end of such a URI was long ago abandoned.

In any case it's a problem of user expectations vs 'education' as much as design. Guidance might be offered to reassure devs that while providing a live link would be nice (especially linking to something useful), it isn't absolutely required when coining a URL for this use.

@iMichaela (Contributor, Author)

@brian-ruf has a vision and has researched an extension model. My understanding (from Chris) is that in FHIR anyone can establish an extension server. Maybe the ns is used as a 'key' to determine what to do with a part/@name="my-specific-name" ns="my_org" that is not defined in core OSCAL. Learning from FHIR might be useful when we get to addressing this issue.

@brian-ruf (Contributor)

@david-waltermire / @iMichaela / @wendellpiez there is a lot going on in this thread, and it's difficult to respond to everything.

@david-waltermire it's not so much that I don't remember correctly regarding the implementation of marking. It's that I was proposing it be implemented only at the document level (via Metaschema) until the OSCAL equivalent of document paragraph-level marking could be better understood and defined. I think my position and recommendation on this are clear in the comments on the issue you cited.

The decision to implement everywhere there are props was apparently made just as I departed the FedRAMP PMO, and I missed it. Had I been aware of it, I would have cautioned against it, and suggested that it be applied more strategically instead of everywhere.

For example, it makes no sense to have marking in document revisions or roles. There are no real-world use cases for this.

Another thing to consider is that adding the marking property everywhere AND preventing other allowed properties should have been two different and unrelated decisions. This is because, prior to this implementation, any place that didn't explicitly have property names defined implicitly allowed other properties. I'm not saying the decision is wrong. I'm only saying it should be separate from the "marking" conversation.

More importantly: I only partially agree with your statement in the above comment about limiting property values to just NIST OSCAL-defined properties.

@iMichaela / @david-waltermire: I agree there are some places in the OSCAL syntax where property values within the OSCAL namespace should be limited to those defined by NIST; however, there are notable exceptions.

For example, within components (implementation layer), we had made some conscious decisions to allow other component types and allow additional property values because we couldn't predict every possible data need for the existing component types, much less for any additional types added. (Again, perhaps decisions were made after my departure.)

My Recommendations

  • Close this issue.
  • Create a new issue that revisits the application of "marking" content beyond document-level markings. It should:
    • evaluate the requirements and guidance for paragraph-level marking from entities that require it;
    • be selective/surgical about where more granular marking should occur;
    • adjust the availability of marking within OSCAL based on the above analysis;
    • consider implementing an approach similar to document-id, with a first-class marking field that includes an @schema (or similar) flag to differentiate which marking system is being applied (see the sketch after this list).
  • Create a new issue related to providing a mechanism that allows organizations to express extensions and allowed values in a machine-readable format.
  • Create a new issue that speaks to a specific need being blocked by the current allow-other settings.
    • This allows the community to weigh in on a specific use case and the best way to accomplish it with OSCAL. It may help settle the question of whether allow-other is appropriately applied at a particular place in the syntax.
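
As a rough illustration of the last sub-bullet under the second recommendation, a first-class marking field might look something like the following; the element, attribute, and values are entirely hypothetical and not part of any OSCAL model (document-id itself uses @scheme for the analogous flag):

    <!-- Hypothetical first-class field; 'scheme' identifies the marking system. -->
    <marking scheme="urn:example:marking-system:cui">CUI//SP-PRVCY</marking>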

@wendellpiez (Contributor)

Very much 💯 on the idea of helping orgs define and govern their own extensions.

As the Metaschema technology gets closer to supporting layered constraints -- applying constraints defined at the local level to documents also conformant to the standard schemas -- we can also formalize and operationalize such "mini-specifications" for controlled vocabularies within OSCAL.

I.e., an organization could write a metaschema module to govern just prop/@name usage over its own namespace, and use that in combination with the more generic standard module for the model being enhanced....
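
A sketch of what such an overlay might look like. Since the external-constraints syntax was still being designed at the time of this thread, the element names below are illustrative only, loosely patterned on Metaschema's in-line constraint syntax; the namespace and enum value are invented:

    <!-- Illustrative only: a standalone constraint set an organization might
         publish to govern prop/@name within its own namespace. -->
    <constraint-set>
        <context target="//prop[@ns='https://example.org/ns/oscal-extensions']">
            <allowed-values target="@name">
                <enum value="review-status">Tracks an internal review workflow state.</enum>
            </allowed-values>
        </context>
    </constraint-set>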

@iMichaela (Contributor, Author) commented Mar 12, 2024

Quoting @wendellpiez: "Very much 💯 on the idea of helping orgs define and govern their own extensions."

@wendellpiez - I feel I need to clarify that I totally agree organizations need to define and govern their extensions. But imagine the NIST CSF v2 extension(s): how would all the GRC tools that entities around the world are using learn how to interpret those extensions (e.g., what a control/part/@name="example" is, or what to do with it) without a human reading it, interpreting it, and coding the tool's behavior per his or her understanding?

If an extension model allowed an instruction saying that for ns="my_ns" and control/part/@name="example", action="display-information" (where a set of actions is pre-defined in the extension model), then a tool could query a registry of extensions in real time or as needed. It could also build its own local copy of the namespaces it needs to support.
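
Sketching that concretely (everything here is hypothetical; no such extension model or registry exists today):

    <!-- A registry entry a GRC tool could fetch and cache: it binds a
         namespace-qualified part name to one of a pre-defined set of actions. -->
    <extension-binding ns="my_ns">
        <part-binding name="example" action="display-information"/>
    </extension-binding>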

@wendellpiez (Contributor) commented Mar 13, 2024

That is a difficult problem because it requires coordination among entities that by definition are assumed to be uncoordinated.

In other words, the support for 'discoverable extensions' (not just extensions) you describe is certainly possible on a technical level. It only needs someone (some group of someones) to do it. There may be both top-down and bottom-up initiatives with this goal in mind. Some of these might compete with each other.

Additionally, if there is a requirement for 'discoverable extensions', there might also be a requirement (somewhere) for 'undiscoverable' (or at any rate unregistered) extensions, as well as for other approaches (aiming for efficiency rather than discoverability, etc.).

There is plenty of analogous work in the XML and metadata informatics fields suggesting ways to do it technically. (A door has been left half-open here by defining @ns as a URI, and that is only one way in.) Additionally, the standoff constraints model soon being offered by Metaschema (if developers can finalize the design, implementation, and testing of this feature) will allow anyone to design, deploy, and publish their own 'bespoke' constraint set to apply as an 'overlay' to a standard model such as OSCAL, which amounts to extensibility, albeit leaving developers (properly and necessarily) with the cross-boundary support problem, which entails validation of applications using the extensions, not just exchange formats.

Indeed, reading again your requirement:

    If an extension model would allow an instruction saying for ns="my_ns" & control/part@name="example", action="display-information" (where a set of actions are pre-defined in the Extension model) then a tool could query a registry with extensions in real time or when needed. It can also build its own copy of the ns it needs to support.

This basically defines a feature in a Metaschema-based validator such as oscal-cli or the OSCAL InspectorXSLT (which I'd like to get back to): namely, it would know how to use "hot namespaces", or whatever you'd like to call them, for dynamic lookup, smart caching of rules, and all that good stuff. This assumes the semantics of @ns are extended for this purpose, though there are also other ways such a tool could support back-end dynamic bindings like the ones you describe.

It is also something that could be supported by an application without a standard model, since it shows the problem is really about resource management (schemas, constraint sets, test suites and documentation) across organizations, not only 'discoverability' at that point.
