Proposal for a More Compact GeoJSON Encoding
Background
In many geospatial datasets, the structure of properties is highly regular, meaning that most features share the same set of properties. This proposal aims to introduce a more compact encoding for GeoJSON that optimizes data transfer and storage while maintaining compatibility with existing tools and workflows.
Current Approach Using GeoJSON (Within Specification)
Currently, GeoJSON does not provide a standardized method for property key deduplication. However, a common approach to achieving more compact representations while staying within the existing specification is as follows:
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": { "type": "Point", "coordinates": [139.6917, 35.6895] },
      "properties": { "values": ["Tokyo", 37400068] }
    },
    {
      "type": "Feature",
      "geometry": { "type": "Point", "coordinates": [-74.006, 40.7128] },
      "properties": { "values": ["New York", 8419600] }
    }
  ],
  "propertyKeys": ["city", "population"]
}
This method introduces a propertyKeys array at the FeatureCollection level to define the property names, while each feature stores only the corresponding values array. This reduces redundancy, but it has two limitations:
- Extra nesting: The values array introduces an unnecessary hierarchical level under properties.
- Non-standard propertyKeys: The propertyKeys field is not officially defined in the GeoJSON specification, making its usage unclear in standard-compliant software.
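As a rough illustration of the current approach, the following Python sketch hoists the property names of a standard FeatureCollection into a propertyKeys array and leaves only a values array in each feature. The function name to_values_form is hypothetical, and the sketch assumes that every feature has an object (not null) for properties with exactly the same keys, and that the key order is taken from the first feature.

```python
import json


def to_values_form(collection):
    """Return a copy of a FeatureCollection with keys hoisted into propertyKeys."""
    features = collection["features"]
    # Assumption: the key order is taken from the first feature.
    keys = list(features[0]["properties"]) if features else []
    compact_features = []
    for feature in features:
        props = feature["properties"]
        # Assumption: the schema is strictly uniform; anything else is rejected.
        if set(props) != set(keys):
            raise ValueError("feature properties do not match propertyKeys")
        compact_features.append(
            {**feature, "properties": {"values": [props[k] for k in keys]}}
        )
    return {**collection, "features": compact_features, "propertyKeys": keys}


standard = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [139.6917, 35.6895]},
            "properties": {"city": "Tokyo", "population": 37400068},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-74.006, 40.7128]},
            "properties": {"city": "New York", "population": 8419600},
        },
    ],
}

compact = to_values_form(standard)
print(json.dumps(compact["propertyKeys"]))
print(json.dumps(compact["features"][0]["properties"]))
```

The input is not modified; the function builds new feature dictionaries so the original key-value properties remain available.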
Proposed More Compact GeoJSON Encoding
To further optimize property storage, we propose a direct mapping of properties as an array at the properties level, eliminating the need for the values wrapper:
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": { "type": "Point", "coordinates": [139.6917, 35.6895] },
      "properties": ["Tokyo", 37400068]
    },
    {
      "type": "Feature",
      "geometry": { "type": "Point", "coordinates": [-74.006, 40.7128] },
      "properties": ["New York", 8419600]
    }
  ],
  "propertyKeys": ["city", "population"]
}
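To make the difference between the two encodings concrete, here is a small Python sketch that turns the values-wrapped form into the proposed array form and measures the size difference for the two-feature example. The helper names flatten_values and compact_size are hypothetical, and sizes are measured on whitespace-free JSON, which is an assumption about how the files would be serialized.

```python
import json


def flatten_values(collection):
    """Replace each {"values": [...]} wrapper with the bare array."""
    features = [
        {**feature, "properties": feature["properties"]["values"]}
        for feature in collection["features"]
    ]
    return {**collection, "features": features}


def compact_size(obj):
    """Size in bytes of the whitespace-free UTF-8 JSON serialization."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


current = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [139.6917, 35.6895]},
            "properties": {"values": ["Tokyo", 37400068]},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-74.006, 40.7128]},
            "properties": {"values": ["New York", 8419600]},
        },
    ],
    "propertyKeys": ["city", "population"],
}

proposed = flatten_values(current)
saved = compact_size(current) - compact_size(proposed)
print(saved)  # bytes saved across the two features
```

In this serialization the wrapper costs 11 bytes per feature (the characters `{"values":` plus the closing brace), so the saving grows linearly with the number of features.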
Advantages of This Proposal
- Eliminates Unnecessary Nesting: The values wrapper is removed, reducing structural overhead.
- Preserves Semantic Meaning: The propertyKeys array still defines the order of properties, ensuring that property interpretation remains clear.
- More Efficient Encoding: By flattening the structure, we reduce unnecessary object syntax, leading to reduced file size.
Compression Efficiency Demonstration
Using actual data from the Japanese Land Price Public Announcement dataset (L01-24), we observed the following size reductions:
- Original GeoJSON: 151,799 KB
- Current Method (values under properties): ~39,716 KB (approx. 74% reduction)
- Proposed Method (properties as array): Further reduced with additional optimizations
- Compressed (ZIP) Comparison:
  - Original GeoJSON (ZIP): 7,371 KB
  - Proposed Method (ZIP): 3,904 KB (approx. 47% reduction)
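For reference, the reduction percentages can be recomputed from the sizes listed above. The short Python sketch below does only that arithmetic; the function name percent_reduction is hypothetical. Note that the first figure compares the uncompressed original with the current method, while the second compares the ZIP sizes of the original and the proposed method.

```python
def percent_reduction(before_kb, after_kb):
    """Size reduction as a percentage of the original size."""
    return (before_kb - after_kb) / before_kb * 100


# Sizes in KB as listed above.
raw = percent_reduction(151_799, 39_716)   # original vs. values under properties
zipped = percent_reduction(7_371, 3_904)   # original ZIP vs. proposed ZIP

print(f"{raw:.1f}% {zipped:.1f}%")  # prints "73.8% 47.0%"
```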
Implementation and Compatibility
A conversion tool has been developed to facilitate encoding in this proposed format:
Existing GeoJSON consumers that expect properties to be a key-value object, as RFC 7946 specifies, will need adaptation or a decoding step before they can read this format. However, software designed to handle structured datasets can easily incorporate this encoding.
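One possible form of adaptation is a small decoding step that restores standard key-value properties before the data reaches an existing consumer. The Python sketch below is an illustration only, not the conversion tool mentioned above; the function name to_standard_geojson is hypothetical, and it assumes every feature carries an array with exactly one value per entry in propertyKeys.

```python
def to_standard_geojson(collection):
    """Rebuild key-value properties from propertyKeys and per-feature arrays."""
    keys = collection["propertyKeys"]
    features = []
    for feature in collection["features"]:
        values = feature["properties"]
        # Assumption: each array has exactly one value per key.
        if len(values) != len(keys):
            raise ValueError("properties array length does not match propertyKeys")
        features.append({**feature, "properties": dict(zip(keys, values))})
    # Drop the non-standard propertyKeys member from the decoded output.
    standard = {k: v for k, v in collection.items() if k != "propertyKeys"}
    standard["features"] = features
    return standard


encoded = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [139.6917, 35.6895]},
            "properties": ["Tokyo", 37400068],
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-74.006, 40.7128]},
            "properties": ["New York", 8419600],
        },
    ],
    "propertyKeys": ["city", "population"],
}

decoded = to_standard_geojson(encoded)
print(decoded["features"][0]["properties"])
```

Because the mapping is positional, the decoder is only as reliable as the ordering of propertyKeys; a length check like the one above catches truncated arrays but not reordered values.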
Conclusion
This proposal presents a structured yet efficient way to encode GeoJSON properties for datasets where the property schema is uniform across features. Given the prevalence of such structured datasets, this approach can offer significant efficiency gains without compromising clarity or compatibility.
Feedback and discussion are welcome to refine this approach further.