Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What format of the geometry is faster for T-Rex? #272

Open
dmitrykinakh opened this issue Nov 19, 2021 · 5 comments
Open

What format of the geometry is faster for T-Rex? #272

dmitrykinakh opened this issue Nov 19, 2021 · 5 comments
Labels

Comments

@dmitrykinakh
Copy link

dmitrykinakh commented Nov 19, 2021

Hi there,

in the project we are building at the moment the biggest problem is tiles generation speed.

So, these are input parameters:

T-Rex version 0.14.1

Config in toml.

# t-rex configuration

[service.mvt]
viewer = true

[[datasource]]
dbconn = "postgresql://{{USER}}:{{PASS}}@{{DBHOST}}.c4epkxdwfidz.us-east-1.rds.amazonaws.com/{{DBNAME}}"
name = "shared"

[grid]
# Predefined grids: web_mercator, wgs84
predefined = "web_mercator"

# ************** Global Settings **************

[[tileset]]
name = "municipal_zoning_code"
extent = [-198.808594,-16.499992,-61.699219,70.958956]
[[tileset.layer]]
datasource = "shared"
name = "municipal_zoning_code"
table_name = "land_use_city"
geometry_field = "geom"
geometry_type = "MULTIPOLYGON"
srid = 3857
#fid_field = "id"
#tile_size = "4096"
buffer_size = 3
simplify = false
minzoom = 10
maxzoom = 18
#query_limit = 1000
[[tileset.layer.query]]
sql = """SELECT Sieve(geom, ZRes(!zoom!::int - 1) * 3) as geom,id,title,land_use_type,place,jurisdiction,uniqueid,color FROM land_use_city WHERE geom && !bbox! and land_use_type='Municipal Zoning Code'"""

[cache.s3]
endpoint = "https://s3.us-east-1.amazonaws.com"
bucket = "{{BUCKET}}"
access_key = "{{AKEY}}"
secret_key = "{{SKEY}}"
region = "us-east-1"
baseurl = "https://st1-tiles.{{DOMAIN}}.com"

[webserver]
# Bind address. Use 0.0.0.0 to listen on all adresses.
bind = "0.0.0.0"
port = 6767
threads = 4
cache_control_max_age = 2592000

Number of records in DB - 13231
Number of records that match query in selected bbox - 60

Generation logs for:

time docker-compose run --name TL_01 --rm trex-master sh -c 't_rex generate --tileset municipal_zoning_code --overwrite true --minzoom 10 --maxzoom 18 --extent -81.658619,28.346742,-80.863155,28.786278 --config /storage/trex/config.toml;'

Timing

Z10 - 0m 20.023s
Z11 - 0m 14.679s
Z12 - 0m 31.691s
Z13 - 1m 55.904s
Z14 - 6m 17.114s
Z15 - 19m 56.263s
Z16 - 73m 25.188s
Z17 - 284m 10.441s
Z18 - 1130m 6.336s

Generation was performed on ec2 r5.4xlarge (16 vCPU 128 GB RAM) and with DB db.r6g.4xlarge (16 vCPU 128 GB RAM)

With bigger zoom levels, like 17 or 18 this takes hours. We are wondering if the issue is in data that we store as multipolygons. Those multipolygons contain from 1 to 2-3K individual polygons.
image

From the performance perspective, would it be better to have these data represented in DB as polygons, so instead of 60 records, in our case, we will have several thousand? Looking forward to recommendations.

Note: this is a part of the query that takes the most of the time:

SELECT ST_Multi(ST_Buffer(ST_Intersection(geom,ST_MakeEnvelope($1-?*$6::FLOAT8,$2-?*$6::FLOAT8,$3+?*$6::FLOAT8,$4+?*$6::FLOAT8,?)), ?)) AS geom,"id","title","land_use_type","place","jurisdiction","uniqueid","color" FROM (SELECT Sieve(geom, ZRes($5::int - ?) * ?) as geom,id,title,land_use_type,place,jurisdiction,uniqueid,color FROM land_use_city WHERE geom && ST_MakeEnvelope($1-?*$6::FLOAT8,$2-?*$6::FLOAT8,$3+?*$6::FLOAT8,$4+?*$6::FLOAT8,?) and land_use_type=?) AS _q

image

@pka
Copy link
Member

pka commented Nov 24, 2021

This is really extremely slow. Is there an index on the geometry field (CREATE INDEX ON land_use_city USING GIST(geom);) or could the Sieve function on the geometry be slow? I would recommend using EXPLAIN to find out which part of the query is slow.

@pka pka added the question label Nov 24, 2021
@dmitrykinakh
Copy link
Author

dmitrykinakh commented Nov 25, 2021

Is there an index on the geometry field (CREATE INDEX ON land_use_city USING GIST(geom);)

There is an index for the geometry column. I've performed tests with it and without it for zoom levels 12 and 13 using the smaller machine. Generation time was the same.
The blue color and first execution is zoom level 12 and no index.
The red color and first execution is zoom level 13 and no index.
The second execution for both zoom levels was with the index field added.
image

or could the Sieve function on the geometry be slow?

I don't think that this is the case since the same performance was before we started using that function.

I would recommend using EXPLAIN to find out which part of the query is slow

I'll try to get this information for you and post it here.

@dmitrykinakh
Copy link
Author

Here is a log from DB for all of the queries that are longer than 4 seconds
postgresql.log.2021-11-25-11(more than 4 seconds).txt
.

@pka
Copy link
Member

pka commented Nov 25, 2021

Geometry index usage looks fine. How's the timing without using the Sieve function?

@dmitrykinakh
Copy link
Author

This is a log without using the Sieve function
postgresql.log.2021-11-26-13.txt

Also, this is a generation timing using ec2 r5.4xlarge (16 vCPU 128 GB RAM) and with DB db.r6g.4xlarge (16 vCPU 128 GB RAM) without the Sieve function.

Zoom With Sieve Without Sieve
Z10 0m 20.023s 0m 22.549s
Z11 0m 14.679s 0m 8.360s
Z12 0m 31.691s 0m 15.165s
Z13 1m 55.904s 0m 20.980s
Z14 6m 17.114s 0m 37.817s
Z15 19m 56.263s 2m 13.246s
Z16 73m 25.188s 5m 19.229s
Z17 284m 10.441s 11m 45.406s
Z18 1130m 6.336s 34m 49.984s

Looks like I know the answer to why generation is so slow. However, still, wondering what it's better - polygon or multipolygon?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants