Staxel is a library that greatly simplifies StAX XML parsing. It also supports permissive XML parsing with StAX without a significant performance sacrifice. See the example usage below for more details.
Available from Maven Central:
<dependency>
    <groupId>uk.elementarysoftware</groupId>
    <artifactId>staxel</artifactId>
    <version>0.1.0</version>
</dependency>
Let's assume we want to parse an XML response from the OpenWeatherMap public API.
A simplified response to a weather forecast request looks like this:
<weatherdata>
    <location>
        <name>London</name>
        <country>GB</country>
    </location>
    <sun rise="2016-08-04T04:30:14" set="2016-08-04T19:41:42" />
    <forecast>
        <time from="2016-08-04T12:00:00" to="2016-08-04T15:00:00">
            <symbol>overcast clouds</symbol>
            <temperature unit="celsius" value="21.68" min="19.71" max="21.68" />
        </time>
        <time from="2016-08-04T15:00:00" to="2016-08-04T18:00:00">
            <symbol>overcast clouds</symbol>
            <temperature unit="celsius" value="20.67" min="19.19" max="20.67" />
        </time>
    </forecast>
</weatherdata>
Staxel is most useful when the XML to be parsed is more complex than this, but we want to keep things simple here.
Suppose we model this in Java as below, with getters/setters/constructors/builders omitted. In real code you would most likely want to define builder classes to simplify construction of the model objects.
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

class WeatherData {
    Location location = new Location();
    List<Forecast> forecasts = new ArrayList<>();
}
class Location {
    String city;
    String country;
}
class Forecast {
    LocalDateTime from;
    LocalDateTime to;
    double temperature;
}
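The builder classes mentioned above could look roughly like the following sketch. The names ForecastBuilderSketch and ForecastBuilder, and the builder's method names, are hypothetical and not part of Staxel; the Forecast here is an immutable variant of the model class, written only to show how values collected in any order can be assembled into a finished object.

```java
import java.time.LocalDateTime;

public class ForecastBuilderSketch {

    // Immutable variant of the Forecast model class (hypothetical, for illustration only).
    static final class Forecast {
        final LocalDateTime from;
        final LocalDateTime to;
        final double temperature;

        Forecast(LocalDateTime from, LocalDateTime to, double temperature) {
            this.from = from;
            this.to = to;
            this.temperature = temperature;
        }
    }

    // Hypothetical builder: accepts raw attribute strings as the parser encounters them.
    static final class ForecastBuilder {
        private LocalDateTime from;
        private LocalDateTime to;
        private double temperature;

        ForecastBuilder from(String isoDateTime) {
            this.from = LocalDateTime.parse(isoDateTime);
            return this;
        }

        ForecastBuilder to(String isoDateTime) {
            this.to = LocalDateTime.parse(isoDateTime);
            return this;
        }

        ForecastBuilder temperature(String value) {
            this.temperature = Double.parseDouble(value);
            return this;
        }

        // Fails fast if the time range was never supplied.
        Forecast build() {
            if (from == null || to == null) {
                throw new IllegalStateException("Both 'from' and 'to' must be set");
            }
            return new Forecast(from, to, temperature);
        }
    }

    public static void main(String[] args) {
        Forecast f = new ForecastBuilder()
                .from("2016-08-04T12:00:00")
                .to("2016-08-04T15:00:00")
                .temperature("21.68")
                .build();
        System.out.println(f.from + " to " + f.to + ": " + f.temperature);
    }
}
```

With a builder like this, the parsing code can pass attribute strings straight through and leave conversion and validation in one place.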
First we create a StaxelReaderFactory by calling its no-argument constructor, then obtain a StaxelReader by supplying the factory with some input, in this case an InputStream:
StaxelReaderFactory f = new StaxelReaderFactory();
try (StaxelReader r = f.fromStream(getClass().getResourceAsStream(resource))) {
    // ... parsing happens here ...
}
StaxelReader is an extension of the StAX API's XMLEventReader and adds the concept of cursors. Cursors iterate over part or all of the XML and let you structure code efficiently. To create a cursor you specify the element name (or, more generally, a path suffix) from which the cursor should start. Child cursors can be created at any point and can be nested as required. Let's take a look at the main parsing loop:
WeatherData wd = new WeatherData();
// The cursor starts at <weatherdata> and finishes at </weatherdata>.
Cursor cur = r.getCursor("weatherdata");
for (XMLElement e : cur) { // iterate over all XML elements inside <weatherdata>
    if (e.pathEndsWith("location")) {
        // Create a child cursor to parse the location data.
        wd.location = e.parseWithChildCursor(this::parseLocation);
    } else if (e.pathEndsWith("forecast", "time")) {
        // Matches <time> only when its parent is <forecast>: the parent element
        // name can be checked together with the element's own name.
        wd.forecasts.add(e.parseWithChildCursor(this::parseForecast));
    }
}
The rest of the parsing code:
private Location parseLocation(Cursor cur) {
    Location loc = new Location();
    for (XMLElement e : cur) { // the child cursor iterates over <location> and all of its child elements
        switch (e.getName()) {
            case "name":
                loc.city = e.getText(); // get the inner text of the current element from XMLElement
                break;
            case "country":
                loc.country = e.getText();
                break;
        }
    }
    return loc;
}
private Forecast parseForecast(Cursor cur) {
    Forecast f = new Forecast();
    for (XMLElement e : cur) { // the child cursor iterates over the matched <time> element and all of its child elements
        switch (e.getName()) {
            case "time":
                f.from = LocalDateTime.parse(e.getAttribute("from")); // get an attribute value from XMLElement
                f.to = LocalDateTime.parse(e.getAttribute("to"));
                break;
            case "temperature":
                f.temperature = Double.parseDouble(e.getAttribute("value"));
                break;
        }
    }
    return f;
}
Note how easy it is with Staxel to structure the code so that XML fragments are parsed into particular model classes. The code that parses a fragment is unaware of the fragment's absolute position in the XML tree. Also note that checking the parent element name is straightforward with the XMLElement.pathEndsWith(String... suffix) method, as was done for the forecast with pathEndsWith("forecast", "time").
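To make the suffix semantics concrete, here is a minimal sketch of how such a check could work. It is not Staxel's implementation: the PathSuffix class and its static pathEndsWith method are hypothetical, and they simply compare the tail of a list of element names (from the root down to the current element) with the requested suffix.

```java
import java.util.Arrays;
import java.util.List;

public class PathSuffix {

    // Hypothetical helper: true when the last suffix.length names of path equal suffix, in order.
    static boolean pathEndsWith(List<String> path, String... suffix) {
        if (suffix.length > path.size()) {
            return false; // the suffix cannot be longer than the path itself
        }
        int offset = path.size() - suffix.length;
        for (int i = 0; i < suffix.length; i++) {
            if (!path.get(offset + i).equals(suffix[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Path of the <time> element in the sample document, from the root down.
        List<String> path = Arrays.asList("weatherdata", "forecast", "time");
        System.out.println(pathEndsWith(path, "time"));             // true
        System.out.println(pathEndsWith(path, "forecast", "time")); // true
        System.out.println(pathEndsWith(path, "location", "time")); // false
    }
}
```

Because only the tail of the path is compared, the same check keeps working if the fragment is later nested deeper in a larger document.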
For the full source, see WeatherDataParsingIntegrationTest.
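For comparison, the sketch below parses the same location fragment with the plain StAX XMLEventReader from the JDK (javax.xml.stream), without Staxel. The class name PlainStaxLocation and the inline XML string are made up for this illustration. The parser has to track the enclosing element by hand, which is the kind of bookkeeping that cursors are meant to remove.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class PlainStaxLocation {

    // Returns {city, country} read from the <location> element using plain StAX only.
    static String[] parseLocation(String xml) {
        String city = null;
        String country = null;
        boolean inLocation = false; // manual state that a cursor would otherwise scope for us
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    XMLEvent event = reader.nextEvent();
                    if (event.isStartElement()) {
                        String name = event.asStartElement().getName().getLocalPart();
                        if (name.equals("location")) {
                            inLocation = true;
                        } else if (inLocation && name.equals("name")) {
                            city = reader.getElementText(); // also consumes </name>
                        } else if (inLocation && name.equals("country")) {
                            country = reader.getElementText(); // also consumes </country>
                        }
                    } else if (event.isEndElement()
                            && event.asEndElement().getName().getLocalPart().equals("location")) {
                        inLocation = false;
                    }
                }
            } finally {
                reader.close(); // XMLEventReader is not AutoCloseable, so close it explicitly
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return new String[] {city, country};
    }

    public static void main(String[] args) {
        String xml = "<weatherdata><location><name>London</name>"
                + "<country>GB</country></location></weatherdata>";
        String[] loc = parseLocation(xml);
        System.out.println(loc[0] + ", " + loc[1]); // London, GB
    }
}
```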
Staxel requires Java 8 and has no other dependencies.
The library is licensed under the terms of the Apache License 2.0.