Batch Analyst
The OpenTripPlanner Analyst web services produce map tiles or rasters derived from a single shortest path tree, which is to say that they reveal information (often travel time) about the geographic area covered by a graph from the perspective of a single (but freely movable) origin point.
The OpenTripPlanner Analyst batch framework covers a larger range of use cases, which may involve accumulating or aggregating results derived from many independently built single-source shortest path trees. Rather than a web service that serves map tiles, the batch framework includes a command-line program that is configured via Spring dependency injection XML. The locations and other attributes of path search endpoints (origins and destinations) may be loaded from Shapefile, CSV, or raster formats. Results may be saved back to a raster or a CSV file for further manipulation or analysis in a desktop GIS package like QGIS or a statistics package like R.
Batch Analyst is not a web servlet, but a plain Java main class: org.opentripplanner.analyst.batch.BatchProcessor. The simplest way to work with it is to configure your IDE to run it (e.g. using the "run configurations" dialog in Eclipse). It would be equally possible to invoke it from the command line, but you'd have to include all the Maven dependencies on the classpath. As with the OTP web services, be sure to give the Java virtual machine plenty of heap space, especially when using large graphs. Maximum heap size is controlled with the -Xmx parameter, and typically needs to be on the order of a few gigabytes for smooth operation with medium-sized graphs.
The BatchProcessor loads its Spring application context from the XML configuration in src/main/resources/batch-context.xml, or alternatively from a configuration file specified on the command line. The XML in this file declaratively describes the configuration and instantiation of Spring "beans", which both provide the processing logic and represent the data sets you are working with. The processing logic components will generally remain the same from one execution to another, but you will need to edit and/or rearrange the beans that furnish the origin set, destination set, and aggregate function to fit your specific use case.
The origin and destination set objects must implement the Population interface, which is to say that they are iterable collections of Individuals that retain some information about their structure (grid or scattered), source format, and coordinate reference system (CRS). Each Individual has a location (latitude and longitude) and a user-defined input value, all of which are double-precision floating point values. Population implementations are provided that load their individuals from a flat comma-separated file, an ESRI Shapefile, or a georeferenced raster (image) file. It is also possible to build up a Population manually, element by element, via Spring properties, or to generate one by specifying the width, height, CRS, and envelope of a regular grid.
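As a rough illustration of the Population/Individual model, the following Python sketch (not OTP code) loads individuals from comma-separated text using zero-based column indices analogous to the labelCol, latCol, lonCol and inputCol properties in the example configuration further down. The names `Individual` and `load_csv_population`, the header-skipping behaviour, and the sample input values are assumptions made for this example only.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class Individual:
    """One member of a population: a labelled location plus an input value."""
    label: str
    lat: float
    lon: float
    input: float


def load_csv_population(text: str, label_col: int = 0, lat_col: int = 1,
                        lon_col: int = 2, input_col: int = 3,
                        skip_header: bool = True) -> List[Individual]:
    """Parse CSV text into a list of Individuals.

    Column indices are zero-based. Skipping the first row as a header is an
    assumption of this sketch, not documented OTP behaviour.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if skip_header:
        rows = rows[1:]
    return [
        Individual(row[label_col], float(row[lat_col]),
                   float(row[lon_col]), float(row[input_col]))
        for row in rows
        if row  # ignore blank lines
    ]


# Made-up sample data; only the UMich coordinates come from the example config.
SAMPLE = (
    "label,lat,lon,input\n"
    "UMich,42.27490,-83.73820,1200\n"
    "Downtown,42.2808,-83.7430,850\n"
)

population = load_csv_population(SAMPLE)
for person in population:
    print(person.label, person.lat, person.lon, person.input)
```

The point of the sketch is only that every individual reduces to a label, two coordinates, and one numeric input value, regardless of the file format it was loaded from.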
Batch Analyst execution always follows the same general pattern: the program iterates over each Individual in the source population, building a shortest path tree for each individual. At each iteration, it examines the shortest path tree at each destination individual's location, recording some information such as travel time or number of transfers in a result set.
The associations between destination individuals and vertices in the graph are cached to speed up the process.
This is in effect finding a shortest path for every pair in the Cartesian product of the origin and destination sets and recording some information about each of those paths, but exploiting the fact that a single shortest path tree can be reused for all paths with the same origin. This greatly reduces run time relative to a naïve iteration over all origin/destination (O/D) pairs, since we need only build |O| shortest path trees instead of running |O|·|D| separate point-to-point searches.
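The reuse of one shortest path tree per origin can be sketched in a few lines of Python. This is a toy illustration using plain Dijkstra on a small hand-made graph, not the search OTP actually performs; the graph, the vertex names, and the function names are all hypothetical.

```python
import heapq


def shortest_path_tree(graph, origin):
    """Plain Dijkstra: map each reachable vertex to its travel time from origin."""
    dist = {origin: 0}
    queue = [(0, origin)]
    while queue:
        d, v = heapq.heappop(queue)
        if d > dist.get(v, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for w, cost in graph.get(v, []):
            nd = d + cost
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(queue, (nd, w))
    return dist


def batch_travel_times(graph, origins, destinations):
    """Build one tree per origin, then sample it at every destination."""
    results = {}
    searches = 0
    for o in origins:
        tree = shortest_path_tree(graph, o)  # one search per origin
        searches += 1
        for d in destinations:
            results[(o, d)] = tree.get(d)    # None if unreachable
    return results, searches


# Hypothetical toy graph: vertex -> list of (neighbour, travel time in seconds).
GRAPH = {
    "A": [("B", 300), ("C", 900)],
    "B": [("C", 400), ("D", 1200)],
    "C": [("D", 500)],
    "D": [],
}

times, searches = batch_travel_times(GRAPH, ["A", "B"], ["C", "D"])
print(searches, "searches for", len(times), "O/D pairs")
```

With two origins and two destinations, the loop answers four O/D queries from only two searches; the saving grows with the size of the destination set.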
Properties of the BatchProcessor itself allow the user to customize the search process, and a PrototypeRoutingRequest may be provided to set OTP routing parameters such as mode of transport or maximum walk distance (which would be specified in the OTP query string when using the REST routing service).
To cover several kinds of batch requests, there are two modes: aggregate and non-aggregate. The batch processor chooses a mode based on whether its aggregator property has been set.
In either mode, both the source and target population properties must be set. The batch analysis is always carried out as a loop over the source set. In aggregate mode, the supplied aggregate function is evaluated over the target set for every element of the source set. The resulting aggregate value is associated with the origin individual that produced it, and the entire set of aggregates is saved together in a format appropriate for that population type. Thus, aggregate mode produces a single output object/stream/buffer, containing one unit of output (tuple/line/pixel) per individual in the source set. In non-aggregate mode, one output object/stream/buffer is produced per source location. Thus, for S sources and D destinations, S output objects will be produced, each containing D data items.
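The difference in output shape between the two modes can be illustrated with a small Python sketch. It is a simplified model rather than the BatchProcessor's actual code: the function names, the travel times, and the input values are invented, and treating the threshold as a strict "less than" comparison is an assumption.

```python
from functools import partial


def threshold_sum(travel_times, inputs, threshold):
    """Sum the input values of destinations reached in under `threshold` seconds.

    Unreachable destinations are represented by None and contribute nothing.
    The strict comparison is an assumption of this sketch.
    """
    return sum(value for t, value in zip(travel_times, inputs)
               if t is not None and t < threshold)


def run_batch(times_by_origin, dest_inputs, aggregator=None):
    """Loop over sources; aggregate if an aggregator is set, else keep raw results."""
    if aggregator is not None:
        # Aggregate mode: one output object holding one value per source.
        return {o: aggregator(times, dest_inputs)
                for o, times in times_by_origin.items()}
    # Non-aggregate mode: one output object per source, each with D items.
    return {o: list(times) for o, times in times_by_origin.items()}


# Invented travel times (seconds) from 2 sources to 3 destinations.
TIMES_BY_ORIGIN = {
    "o1": [600, 4000, 3000],
    "o2": [None, 1800, 3700],
}
DEST_INPUTS = [100, 250, 40]  # e.g. residents at each destination

aggregated = run_batch(TIMES_BY_ORIGIN, DEST_INPUTS,
                       aggregator=partial(threshold_sum, threshold=3600))
raw = run_batch(TIMES_BY_ORIGIN, DEST_INPUTS)

print(aggregated)  # one number per source
print(raw)         # one list of D values per source
```

In aggregate mode the S sources collapse to S numbers in a single result; in non-aggregate mode each of the S results keeps all D per-destination values.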
Here is an example batch-context.xml:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:context="http://www.springframework.org/schema/context" xmlns:aop="http://www.springframework.org/schema/aop"
    xmlns:tx="http://www.springframework.org/schema/tx" xmlns:sec="http://www.springframework.org/schema/security"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
        http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-2.5.xsd
        http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx-2.0.xsd
        http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop-2.0.xsd
        http://www.springframework.org/schema/security http://www.springframework.org/schema/security/spring-security-2.0.xsd">

    <context:annotation-config />

    <bean class="org.opentripplanner.analyst.request.SampleFactory" />
    <bean class="org.opentripplanner.routing.impl.DefaultRemainingWeightHeuristicFactoryImpl"/>
    <bean class="org.opentripplanner.routing.algorithm.GenericAStar"/>
    <bean class="org.opentripplanner.analyst.batch.IndividualFactory" />
    <bean class="org.opentripplanner.analyst.core.GeometryIndex" />

    <!-- specify a GraphService, configuring the path to the serialized Graphs -->
    <bean id="graphService" class="org.opentripplanner.routing.impl.GraphServiceImpl">
        <property name="path" value="/var/otp/graphs/{}/" />
        <property name="defaultRouterId" value="arbor" />
    </bean>

    <!-- this creates a population directly from a list of individuals -->
    <!--
    <bean id="origins" class="org.opentripplanner.analyst.batch.BasicPopulation">
        <property name="individuals">
            <list>
                <bean class="org.opentripplanner.analyst.batch.Individual">
                    <property name="label" value="UMich" />
                    <property name="lon" value="-83.73820" />
                    <property name="lat" value="42.27490" />
                </bean>
            </list>
        </property>
    </bean>
    -->

    <!-- this creates a population arranged on a regular grid that can later be saved as an image -->
    <bean id="destinations" class="org.opentripplanner.analyst.batch.SyntheticRasterPopulation">
        <property name="left" value="-84.14" />
        <property name="right" value="-83.41" />
        <property name="bottom" value="42.07" />
        <property name="top" value="42.45" />
        <property name="crsCode" value="epsg:4326" />
        <property name="cols" value="1280" />
        <property name="rows" value="1024" />
    </bean>

    <!-- this loads a population from a comma-separated flat text file -->
    <bean id="origins" class="org.opentripplanner.analyst.batch.CSVPopulation">
        <property name="sourceFilename" value="/home/abyrd/access/annarbor.csv" />
        <property name="latCol" value="1" />
        <property name="lonCol" value="2" />
        <property name="labelCol" value="0" />
        <property name="inputCol" value="3" />
    </bean>

    <!-- aggregate results are no longer stored in individuals so populations can be reused -->
    <!-- <alias name="destinations" alias="origins"/> -->

    <!-- define the main batch processor, which will build one shortest path tree from each origin to all destinations -->
    <bean id="batchProcessor" class="org.opentripplanner.analyst.batch.BatchProcessor">
        <property name="outputPath" value="/home/abyrd/access/out1834_clamp.tiff" />
        <property name="routerId" value="arbor" />
        <property name="date" value="2012-07-12" />
        <property name="time" value="08:00 AM" />
        <property name="timeZone" value="America/New_York" />
        <property name="prototypeRoutingRequest">
            <bean class="org.opentripplanner.routing.core.PrototypeRoutingRequest">
                <!-- Set default routing parameters here -->
                <property name="maxWalkDistance" value="400000" />
                <property name="clampInitialWait" value="1800" />
                <property name="arriveBy" value="false" />
            </bean>
        </property>
        <!--
        <property name="aggregator">
            <bean class="org.opentripplanner.analyst.batch.aggregator.ThresholdSumAggregator">
                <property name="threshold" value="3600" />
            </bean>
        </property>
        -->
        <property name="accumulator">
            <bean class="org.opentripplanner.analyst.batch.ThresholdAccumulator">
                <property name="threshold" value="3600" />
            </bean>
        </property>
    </bean>
</beans>
The number of 18-35 year old residents in the Ann Arbor, Michigan area that can reach each raster cell in less than 1 hour of transit + walking (blue=5000, yellow=100000, green=200000).