diff --git a/README.md b/README.md
index a179efd..129fcf8 100644
--- a/README.md
+++ b/README.md
@@ -4,12 +4,25 @@ many (simpler) pattern-based parsers that exist ([such as mine][1]), properly
 supporting code like ``, CDATA nodes, comments, namespaces,
 and processing instructions.
 
-It is currently not a truly valid XML parser, however, as it allows some invalid
-XML such as `` to be parsed (and reported) as such.
-See the "Limitations / TODO" section below for more details.
+It is currently not a truly valid XML parser, however, as it allows certain XML that
+is syntactically-invalid (not well-formed) to be parsed without reporting an error.
 
 [1]: http://phrogz.net/lua/AKLOMParser.lua
 
+## Features
+
+* Pure Lua in a single file (two files if you use the DOM parser).
+* Streaming parser does a single pass through the input and reports what it sees along the way.
+* Supports processing instructions (``).
+* Supports comments (``).
+* Supports CDATA sections (` & other content as text ]]>`).
+* Supports namespaces, resolving prefixes to the proper namespace URI (`` and ``).
+* Supports unescaped greater-than symbols in attribute content (a common failing for simpler pattern-based parsers).
+* Unescapes named XML entities (`&lt; &gt; &amp; &quot; &apos;`) and numeric entities (e.g. ` `) in attributes and text nodes (but, properly, not in comments or CDATA). Properly handles edge cases like `&amp;`.
+* Optionally ignore whitespace-only text nodes (as appear when indenting XML markup).
+* Includes a DOM parser that is both a convenient way to pull in XML to use as well as a nice example of using the streaming parser.
+* Adds only a single `SLAXML` key to the environment; there is no spam of utility functions polluting the global namespace.
+
 ## Usage
 
     require 'slaxml'