0.118.0
Highlights:
- With the Polars upgrade to 0.34.2, the sqlp and joinp commands enjoy expanded capabilities and a noticeable performance boost.
- We now publish the 500, 1000, 5000 and 15000 Geonames cities indices for the geocode command, and users can easily switch indices with the index-load subcommand. As the names imply, the 500 index contains cities with populations of 500 or more, the 1000 index contains cities with populations of 1000 or more, and so on.
  The 15000 index (default) is the smallest (13 MB) and fastest, with ~26k cities. The 500 index is the largest (56 MB) and slowest, with ~200k cities. The 5000 index is 21 MB with ~53k cities. The 1000 index is 44 MB with ~140k cities.
- The geocode command now returns US Census FIPS codes for US places with the %json and %pretty-json formats, returning both US State and US County FIPS codes, with upcoming support for Cities and other US Census geographies (School Districts, Voting Districts, Congressional Districts, etc.).
- Improved performance for the stats, schema and tojsonl commands with the stats cache bincode refactor. This is especially noticeable for large CSV files, as stats previously created large bincode cache files by default.
  The bincode cache allows other commands (currently, only schema and tojsonl) to skip recomputing statistics and deserialize the saved stats data structures directly into memory. Now, stats will only create a bincode file if the --stats-binout option is specified (typically, before using the schema and tojsonl commands). stats will still create a stats CSV cache file by default, but it is much smaller than the bincode file and, unlike the bincode cache, is universally applicable.
- self-update will now verify updates by checking the zipsign signature of the release zip archive before applying it. This should make it harder for malicious actors to compromise the self-update process. Version 0.118.0 has the verification code, and future releases will use this new verification process.
  Regardless, we will zipsign all zip archives starting with this release.
  Users can manually verify the signatures by downloading the zipsign public key and running the zipsign command line tool. See Verifying the Integrity of the Prebuilt Binaries Zip Archive for more info.
- The frequency command now supports the --ignore-case option for case-insensitive frequency counts.
- The schema command can now compile case-insensitive enum constraints.
- Improved performance for the apply and applydp commands with faster compile-time perfect hash functions for operations lookups.
- Several minor performance improvements and bug fixes for the snappy, sniff & cat commands.
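To make the index naming concrete, here is a minimal Python sketch of the population-threshold idea. The `INDEX_THRESHOLDS` table and the `cities_in_index` helper are hypothetical names introduced for illustration, and the sample rows are placeholders rather than Geonames data. It does not show how qsv actually builds or loads its indices.

```python
# Hypothetical illustration: each index name is the minimum population
# a city needs in order to be included in that index.
INDEX_THRESHOLDS = {"500": 500, "1000": 1000, "5000": 5000, "15000": 15000}


def cities_in_index(cities, index_name):
    """Return the names of cities whose population meets the index threshold."""
    threshold = INDEX_THRESHOLDS[index_name]
    return [name for name, population in cities if population >= threshold]


# Placeholder rows (not real Geonames data).
sample = [("Town A", 700), ("Town B", 4200), ("City C", 18000)]

print(cities_in_index(sample, "500"))    # every placeholder row qualifies
print(cities_in_index(sample, "15000"))  # only the largest placeholder row
```

A lower threshold admits more cities, which is consistent with the 500 index being the largest and the 15000 index the smallest.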
Added
- frequency: added --ignore-case option #1386
- geocode: added 500, 1000, 5000, 15000 Geonames cities convenience shortcuts to index subcommands bd9f4c3
- schema: added --ignore-case option when compiling enum constraints; replaced HashSet with faster AHashSet a16a1ca
- snappy: added buf_size parameter to compress helper fn e0c0d1f
- sniff: added --just-mime option #1372
- added zipsign signature verification to self-update #1389
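As a rough illustration of what case-insensitive frequency counts mean, the sketch below counts values with and without case folding. The `frequency` function here is a hypothetical Python helper, not qsv code, and it assumes simple lowercasing as the folding rule. qsv's own normalization may differ.

```python
from collections import Counter


def frequency(values, ignore_case=False):
    """Count occurrences of each value, optionally folding case first."""
    if ignore_case:
        # Assumption: lowercasing stands in for whatever folding qsv applies.
        values = (v.lower() for v in values)
    return Counter(values)


column = ["Yes", "yes", "YES", "no", "No"]

print(frequency(column))                    # five distinct values, one each
print(frequency(column, ignore_case=True))  # two distinct values after folding
```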
Changed
- apply & applydp: replaced binary_search with faster compile-time perfect hash functions for operations lookups #1371
- stats, schema and tojsonl: stats cache bincode refactor #1377
- luau: replaced sanitise-file-name with more popular sanitize-filename crate 8927cb7
- cat: minor optimization by preallocating with capacity c13c341
- sqlp & joinp: expanded speed/functionality with upgrade to Polars 0.34.2 #1385
- tojsonl: improved boolean inferencing. Now correctly infers booleans even when the enum domain has more than 2 values, as long as its case-insensitive cardinality is 2 6345f2d
- build(deps): bump strum_macros from 0.25.2 to 0.25.3 by @dependabot in #1368
- build(deps): bump regex from 1.10.1 to 1.10.2 by @dependabot in #1369
- build(deps): bump uuid from 1.4.1 to 1.5.0 by @dependabot in #1373
- build(deps): bump hashbrown from 0.14.1 to 0.14.2 by @dependabot in #1376
- build(deps): bump self_update from 0.38.0 to 0.39.0 by @dependabot in #1378
- build(deps): bump ahash from 0.8.5 to 0.8.6 by @dependabot in #1383
- build(deps): bump serde from 1.0.189 to 1.0.190 by @dependabot in #1388
- build(deps): bump futures from 0.3.28 to 0.3.29 by @dependabot in #1390
- build(deps): bump futures-util from 0.3.28 to 0.3.29 by @dependabot in #1391
- build(deps): bump tempfile from 3.8.0 to 3.8.1 by @dependabot in 4f6200c
- apply select clippy suggestions
- update several indirect dependencies
- pin Rust nightly to 2023-10-26
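The tojsonl boolean inferencing change can be pictured with a small Python sketch. It covers only the cardinality check described above. `looks_boolean` is a hypothetical name, lowercasing is an assumed folding rule, and qsv's actual inference may apply further checks on the values themselves.

```python
def looks_boolean(domain):
    """True if the distinct values collapse to exactly 2 once case is ignored."""
    folded = {value.lower() for value in domain}
    return len(folded) == 2


# Four distinct raw values, but only two after case folding.
print(looks_boolean(["True", "true", "FALSE", "false"]))

# Three distinct values regardless of case.
print(looks_boolean(["yes", "no", "maybe"]))
```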
Fixed
- dedup: fixed --ignore-case option not being honored during internal sort #1387
- applydp: fixed wrong usage text using apply and not applydp c47ba86
- geocode: fixed index-update not honoring --timeout parameter 3272a9e
- geocode: fixed index-load to work properly with convenience shortcuts 5097326
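The dedup fix is easier to see with a sketch of sort-then-compare deduplication. This is one reading of why the internal sort has to honor --ignore-case, written in Python with a hypothetical `dedup_sorted` helper. It is not qsv's implementation.

```python
def dedup_sorted(rows, key):
    """Drop each row whose key equals the key of the previously kept row."""
    result = []
    for row in rows:
        if not result or key(result[-1]) != key(row):
            result.append(row)
    return result


rows = ["b", "A", "B", "a"]
fold = str.lower  # assumed case-folding rule

# Case-sensitive sort with case-insensitive comparison:
# case variants are not adjacent, so nothing is removed.
mismatched = dedup_sorted(sorted(rows), key=fold)

# Sort and comparison both ignore case:
# case variants end up adjacent and are removed.
consistent = dedup_sorted(sorted(rows, key=fold), key=fold)

print(mismatched)
print(consistent)
```

Because this style of dedup only compares neighbouring rows, the sort and the comparison need to agree on what counts as equal.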
Full Changelog: 0.117.0...0.118.0