Introduce vocabulary parameter to wikitext content type #345

Open
Jermolene opened this issue Jan 14, 2014 · 21 comments

Comments

@Jermolene
Member

To aid in the interchange of content, it is proposed that we introduce a parameter to the TiddlyWiki5 content type (text/vnd.tiddlywiki) to specify the vocabulary with which the text is intended to be parsed. For example:

A vocabulary specified via a URL:

text/vnd.tiddlywiki; vocab=http://vocabs.tiddlywiki.com/standard.json

A vocabulary defined in a local tiddler:

text/vnd.tiddlywiki; vocab=myBirdwatchingVocabulary

In either case, the vocabulary contains information specifying:

  • The parse rules to be applied to the text
  • Optional configuration parameters for those parse rules

It might be useful for vocabularies to be able to cascade, so that one could describe a custom vocab as, say, "standard TW markup plus these three plugins".
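As a rough illustration of the proposal, the type field could be split into a base media type and its parameters along these lines. This is only a sketch: `parseContentType` is a hypothetical helper, not part of the TiddlyWiki5 core, and it assumes that parameter values never contain a semicolon.

```javascript
// Hypothetical helper: split a tiddler "type" value into its base media
// type and a map of parameters. Not part of the TiddlyWiki5 core.
function parseContentType(typeField) {
    var parts = typeField.split(";"),
        result = {type: parts[0].trim().toLowerCase(), params: {}};
    parts.slice(1).forEach(function(part) {
        // Split at the first "=" only, so values may contain "="
        var pos = part.indexOf("=");
        if(pos !== -1) {
            var name = part.slice(0, pos).trim().toLowerCase(),
                value = part.slice(pos + 1).trim();
            // Strip optional surrounding double quotes from the value
            if(value.length >= 2 && value.charAt(0) === "\"" && value.charAt(value.length - 1) === "\"") {
                value = value.slice(1, -1);
            }
            result.params[name] = value;
        }
    });
    return result;
}

var parsed = parseContentType("text/vnd.tiddlywiki; vocab=myBirdwatchingVocabulary");
console.log(parsed.type, parsed.params.vocab);
```

Whether the value names a URL or a local tiddler would then be decided by whatever code looks the vocabulary up; the sketch does not attempt that step.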

@davidjade
Contributor

I just want to comment on the "cascade" part. I think I would find this useful for rules in general (maybe it is already possible?). I've actually been wondering how feasible it would be to extend the existing rules rather than replace them through plug-ins. For instance, I have one wiki where I want all fully capitalized words to be auto links in addition to the standard rules. I've also thought about having a list of "magic" words that would always get auto links, etc.

@Jermolene
Member Author

@davidjade yup, this proposal would meet your needs. The regexp for wikilink matching could be one of the vocab configuration parameters.
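To make the idea concrete, here is a minimal sketch of a vocabulary that carries the wikilink regexp as a configuration parameter. The object layout, the `wikilinkPattern` name and the `findWikiLinks` function are all assumptions made for illustration; no vocabulary format has been agreed in this thread.

```javascript
// Hypothetical vocabulary: the rule names and the "wikilinkPattern"
// parameter are illustrative, not an agreed format.
var shoutingVocab = {
    rules: ["wikilink"],
    config: {
        // Treat any word of two or more capital letters as a link
        wikilinkPattern: "\\b[A-Z]{2,}\\b"
    }
};

// Return the link targets that the configured pattern finds in the text
function findWikiLinks(text, vocab) {
    var regExp = new RegExp(vocab.config.wikilinkPattern, "g");
    return text.match(regExp) || [];
}

console.log(findWikiLinks("See TODO and the FAQ page", shoutingVocab));
```

The sketch only applies the custom pattern; combining it with the standard CamelCase rule would be the job of the cascade discussed above.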

@davidjade
Contributor

So this would be set per-tiddler though? Seems like maybe I'd still need a plug-in approach if I wanted to extend the wikilink rules for all tiddlers (new and existing)?

@Jermolene
Member Author

The vocabulary setting would be part of the "type" field of each tiddler. For example:

text/vnd.tiddlywiki; vocab=http://vocabs.tiddlywiki.com/standard

There would also be a way of setting the default vocabulary for new tiddlers.

If you wanted to change the wikitext rules applied to a tiddler overriding the vocabulary specified within the type, then, yes, maybe that would need a plugin.
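A small sketch of how a default vocabulary might be stamped onto new tiddlers and read back later. Both function names are hypothetical, and the regular expression assumes an unquoted vocab value with no spaces or semicolons.

```javascript
var WIKITEXT_TYPE = "text/vnd.tiddlywiki";

// Build the type field for a newly created tiddler
function typeForNewTiddler(defaultVocab) {
    return defaultVocab ? WIKITEXT_TYPE + "; vocab=" + defaultVocab : WIKITEXT_TYPE;
}

// Read the vocabulary back out of a type field; null if none is given
function vocabOfType(typeField) {
    var match = /;\s*vocab=([^;\s]+)/i.exec(typeField || "");
    return match ? match[1] : null;
}

var newType = typeForNewTiddler("http://vocabs.tiddlywiki.com/standard");
console.log(newType, vocabOfType(newType));
```

Existing tiddlers keep whatever type they already have, which is why overriding the rules for all tiddlers at once would still need a different mechanism.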

The primary motivation for all of this is to enable interoperability of content.

@buggyj
Contributor

buggyj commented Jan 29, 2014

I have written a simple framework based on cascading settings (from our discussion). Only the overall structure is defined, along with the methods to combine a cascade of these structures; the result is passed to the parser.

A collection of settings is defined in a JSON tiddler (an Stid), which contains the structure "parserrules":{...}, within which lists (as 1-d arrays) of strings and atomic types are allowed. The Stid also contains the item "baseparser":"...", which references the previous Stid (if any) in the cascade, so the cascade is defined recursively in reverse. An Stid is only referenced as part of a block (or tiddler) type, e.g. "baseparser":"text/vnd.tiddlywiki<fullTW5"; here fullTW5 is the Stid.

The outcome of the cascade is that lists with the same name that appear within different "parserrules" are merged, atomic items of the same type are overridden, and the resulting parserrules is passed to the last named parser. For example, if a tiddler is of type "text/vnd.tiddlywiki<fullTW5", then the result of the cascade is passed to the text/vnd.tiddlywiki parser. Note that it is up to the author of the parser to decide what the items within "parserrules" are called and what they mean.
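The cascade described here could be sketched roughly as follows. This is not the code from the linked branch: the store layout, the rule names inside "parserrules", the function names and the choice to skip duplicate list entries are all assumptions made for the example.

```javascript
// Hypothetical store of Stids, keyed by title. The field names
// "baseparser" and "parserrules" follow the description above; the rule
// names inside "parserrules" are made up for illustration.
var stids = {
    fullTW5: {
        parserrules: {inline: ["bold", "italic", "wikilink"], camelCase: true}
    },
    birdwatching: {
        baseparser: "text/vnd.tiddlywiki<fullTW5",
        parserrules: {inline: ["birdname"], camelCase: false}
    }
};

// Split "text/vnd.tiddlywiki<fullTW5" into parser type and Stid title
function splitType(type) {
    var pos = type.indexOf("<");
    return pos === -1 ? {parser: type, stid: null} : {parser: type.slice(0, pos), stid: type.slice(pos + 1)};
}

// Merge one "parserrules" object into another: lists are concatenated
// (skipping duplicates, an assumption), atomic values are overridden.
function mergeRules(target, source) {
    Object.keys(source).forEach(function(name) {
        var value = source[name];
        if(Array.isArray(value) && Array.isArray(target[name])) {
            value.forEach(function(item) {
                if(target[name].indexOf(item) === -1) {
                    target[name].push(item);
                }
            });
        } else {
            // Copy lists so that the stored Stid is never modified
            target[name] = Array.isArray(value) ? value.slice() : value;
        }
    });
    return target;
}

// Resolve a type such as "text/vnd.tiddlywiki<birdwatching" into the
// parser to call and the combined rules to pass to it.
function resolveCascade(type, store) {
    var info = splitType(type),
        chain = [],
        title = info.stid;
    // Follow "baseparser" links backwards; stop on unknown or repeated Stids
    while(title && store[title] && chain.indexOf(title) === -1) {
        chain.unshift(title); // base Stids end up first
        title = store[title].baseparser ? splitType(store[title].baseparser).stid : null;
    }
    var rules = chain.reduce(function(acc, name) {
        return mergeRules(acc, store[name].parserrules || {});
    }, {});
    return {parser: info.parser, parserrules: rules};
}

console.log(JSON.stringify(resolveCascade("text/vnd.tiddlywiki<birdwatching", stids)));
```

Resolving from the base upwards means that a later Stid always wins for atomic values, while lists only ever grow; removing a rule inherited from a base Stid would need some extra convention.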

In addition, I have extended the framework to include preparsers, through the inclusion of "preparser":"..." within the Stid. It has the form "preparser":"text/x<B>text/y", where B is an (optional) Stid that modifies the (pre)parser text/x, and text/y is the form of the preparser's output, which is passed through to the next parser in the cascade.

Some examples are demoed here:
http://tw5vocab.tiddlyspot.com/
code is here:
https://github.com/buggyj/TiddlyWiki5/tree/remotes/origin/vocabs

@pmario
Member

pmario commented Jan 29, 2014

I have written a simple framework, based on cascading settings (from our discussion). Only the overall structure is defined, along with the methods to combine a cascade of these structures, the result being passed to the parser.

very interesting

@pmario
Member

pmario commented Jan 29, 2014

@buggyj
I think there is a problem with your type definitions. eg: text/vnd.tiddlywiki<fullTW5
They don't match the rules defined at: http://tools.ietf.org/html/rfc6838#section-4.2

@Jermolene
I think also the semicolon ; and spaces are not allowed
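For reference, the restricted-name rule from RFC 6838 section 4.2 can be checked mechanically. The sketch below encodes that grammar as a regular expression; `isValidMediaType` is a hypothetical helper and only looks at the type/subtype pair, not at any parameters that follow a semicolon.

```javascript
// restricted-name from RFC 6838 section 4.2: the first character is a
// letter or digit, followed by up to 126 characters from a limited set.
var RESTRICTED_NAME = /^[A-Za-z0-9][A-Za-z0-9!#$&\-^_.+]{0,126}$/;

// Check that a "type/subtype" pair uses only restricted-name characters
function isValidMediaType(name) {
    var parts = name.split("/");
    return parts.length === 2 && RESTRICTED_NAME.test(parts[0]) && RESTRICTED_NAME.test(parts[1]);
}

console.log(isValidMediaType("text/vnd.tiddlywiki"));         // true
console.log(isValidMediaType("text/vnd.tiddlywiki<fullTW5")); // false, "<" is not allowed
```

The same pattern applies to parameter names such as "vocab", since section 4.3 defines parameter-name as a restricted-name.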

@Jermolene
Member Author

@buggyj many thanks, looks very interesting. I'll study it and provide feedback.

@pmario the parameter syntax I'm suggesting is discussed here:

http://tools.ietf.org/html/rfc6838#section-4.3

I found this StackOverflow article which gives some examples:

http://stackoverflow.com/questions/3051048/mime-rfc-content-type-parameter-confusion-unclear-rfc-specification

@Jermolene
Member Author

@buggyj I wonder if it's worth making this into a pull request so that we can use the line commenting feature to discuss the implementation?

@Jermolene
Member Author

@buggyj there are some issues with coding styles (eg TW5 always uses braces with the if statement). I've started trying to record the house style here:

http://tiddlywiki.com/static/TiddlyWiki%2520Coding%2520Style%2520Guidelines.html

@pmario
Member

pmario commented Jan 29, 2014

the parameter syntax I'm suggesting is discussed here:
http://tools.ietf.org/html/rfc6838#section-4.3

yes it says:

Parameter names have the syntax as media type names and values:

       parameter-name = restricted-name

and restricted-name is defined under Naming Requirements: http://tools.ietf.org/html/rfc6838#section-4.2, where there is no "space" ... but the semicolon seems to be ok.

http://www.iana.org/assignments/media-types/media-types.xhtml which links to some mime types with parameters http://www.iana.org/assignments/media-types-parameters/media-types-parameters.xhtml

so imo this will be ok.

text/vnd.tiddlywiki;parameter=value

but not recommended, as seen here: http://tools.ietf.org/html/rfc6838#section-4.3

New parameters SHOULD NOT be defined as a way to introduce new
functionality in types registered in the standards tree, although new
parameters MAY be added to convey additional information that does
not otherwise change existing functionality. An example of this
would be a "revision" parameter to indicate a revision level of an
external specification such as JPEG. Similar behavior is encouraged
for media types registered in the vendor or personal trees, but is
not required.


The StackOverflow article refers to http://tools.ietf.org/html/rfc2045, which is the MIME specification that defines the Content-Type header syntax. ...

Whereas http://tools.ietf.org/html/rfc6838#section-4.3 says:

Note that this syntax is somewhat more restrictive than what is
allowed by the ABNF in [RFC2045] and amended by [RFC2231].


So maybe it would be best to have a MIME subtype that tells a user to have a look at special tiddler fields, e.g. vocab=....

text/vnd.tiddlywiki
text/vnd.tiddlywiki.vocab
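Under this alternative, the lookup might work along these lines. It is a sketch only: the subtype and the "vocab" field name come from the suggestion above, while `vocabularyFromFields` and the shape of the fields object are assumptions.

```javascript
// Hypothetical lookup: the "text/vnd.tiddlywiki.vocab" subtype signals
// that the vocabulary is named in a separate "vocab" tiddler field.
function vocabularyFromFields(fields) {
    if(fields.type === "text/vnd.tiddlywiki.vocab" && fields.vocab) {
        return fields.vocab;
    }
    return null; // plain wikitext: use the standard rules
}

console.log(vocabularyFromFields({
    title: "Sightings",
    type: "text/vnd.tiddlywiki.vocab",
    vocab: "myBirdwatchingVocabulary"
}));
```

This keeps the type field free of parameters, at the cost of the vocabulary no longer travelling with the content type when the text is exchanged on its own.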

@pmario
Member

pmario commented Jan 29, 2014

I did find a text vnd format: vnd.fmi.flexstor that uses optional parameters: http://www.iana.org/assignments/media-types/text/vnd.fmi.flexstor

so for me it seems:

text/vnd.tiddlywiki;vocab=http://vocabs.tiddlywiki.com/standard

would be possible, but the syntax of the parameter has to be specified, since http://tools.ietf.org/html/rfc6838#section-4.3 uses a MUST for the parameter specification. ... So we need to restrict the tiddler names that can contain the vocab definitions. Otherwise it may be complicated.

There is no defined syntax for parameter values. Therefore,
registrations MUST specify parameter value syntax. Additionally,
some transports impose restrictions on parameter value syntax, so
care needs be taken to limit the use of potentially problematic
syntaxes; e.g., pure binary valued parameters, while permitted in
some protocols, are best avoided.

@buggyj
Contributor

buggyj commented Jan 29, 2014

@Jermolene
The code is only of 'demo' quality. I was expecting some feedback about the general logic: are there issues with, e.g., the format of the vocab parameter? The answer to this one seems to be yes :-)
If you're happy with the scope, I will tidy up the code and see if there are corner cases to be dealt with before submitting a pull request. (Then you can tell me what's wrong with it.)
Alternatively, if you would like to use the code to talk about issues I can submit a pull request with the code as it is.

@Jermolene
Member Author

Hi @buggyj no problem about coding standards, just wanted to make sure you were aware.

I'd like to be able to discuss the code before you get stuck into lots of work on it, so I'd be happy for you to make a pull request in its present form, and then we can bash it into shape.

Many thanks!

@buggyj
Contributor

buggyj commented Jan 30, 2014

@Jermolene
fair enough!

@buggyj
Contributor

buggyj commented Feb 11, 2014

@Jermolene
Hello, I did the pull request some time ago; I hope it got to you?
#382

@Jermolene
Member Author

My apologies @buggyj I've been fighting a ridiculous backlog for a few days now. I've made a few comments on the pull request; I'm not sure that I understand all of the changes properly.

My conclusion from a brief review is that in order to sort this issue out we need to do a more thorough refactoring of the parser architecture. There are a few problems I'd like to sort out in one go:

  • Making the parsers and parse rules re-entrant
  • Changing the way that transcluding images works so that we can more naturally apply dimensions to the <img> tag (currently the <img> tag is generated by the parser)
  • Reusing more of the common parser infrastructure
  • Fixing up the generation of <p> tags and treatment of line breaks after HTML elements

Anyhow, there's some other stuff on the roadmap I'd like to get done first (eg the current work on autosave and translation), and then return to this in a few weeks.

@buggyj
Contributor

buggyj commented May 30, 2014

simplified and resubmitted as pull request #629
new demo is at http://tw5vocab.tiddlyspot.com/

@pmario
Member

pmario commented May 31, 2014

@buggyj
great stuff

Jermolene added a commit that referenced this issue Jul 30, 2015
This is to address #1875 more quickly than via #345.

This commit only includes a partial implementation of the UI - see the
ticket for details.
Jermolene referenced this issue Dec 23, 2015
* Ignore parser rule configuration in safe mode
* Made text translatable
* Added new setting for camelcase links
* Added warning on parser rule configuration tab

See #1875

@Telumire
Contributor

Telumire commented Jan 6, 2023

I like that idea a lot. Maybe instead of "vocab", this could be called "xmlns" or namespace, to follow the HTML spec?

@Jermolene
Member Author

I like that idea a lot. Maybe instead of "vocab", this could be called "xmlns" or namespace, to follow the HTML spec?

"namespace" might work, but seems a little technical. I think of these vocabularies as being very visible for end users. The dream would be that we'd evolve vocabularies for specific niches (eg a vocabulary for dance choreographers that incorporates a notation for dance moves, or a special vocabulary for writing D&D games).


5 participants