Closed
Description
I'm running out of vespene gas or something
$ wget https://files.pushshift.io/reddit/submissions/RS_2022-08.zst
$ unzstd --memory=2048MB --stdout RS_2022-08.zst | octosql "SELECT count(*) FROM stdin.json" -o csv
...
Error: couldn't run query: couldn't run source: couldn't run source: bufio.Scanner: token too long
sad :'(
The great octopus god is able to work with this other, smaller file in 110.6s:
$ unzstd --memory=2048MB --stdout RS_2021-08.zst | octosql "SELECT count(*) FROM stdin.json" -o csv
count
28384220
It does not use much RAM with either file, so I'm not sure what's up :? Both files are similar-ish in size: 7.8 GB vs 10 GB compressed, maybe 200 GB uncompressed.