changelog: keep wordsmithing highlights section
jqnatividad committed Oct 26, 2023
1 parent e091a20 commit e7077df
Showing 1 changed file with 7 additions and 5 deletions.
12 changes: 7 additions & 5 deletions CHANGELOG.md
@@ -9,14 +9,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [0.118.0] - 2023-10-26

 ## Highlights:
-* Polars has been upgraded to [0.34.2](https://github.com/pola-rs/polars/releases/tag/rs-0.34.0), which enables expanded capabilities and a performance boost for both `sqlp` and `joinp` commands.
-* We now publish the 500, 1000, 5000 and 15000 Geonames cities indices for convenience for the `geocode` command, and users can easily switch indices with the `index-load` subcommand. As the name implies, the 500 index contains cities with populations of 500 or more, the 1000 index contains cities with populations of 1000 or more, and so on.
+* With the Polars upgrade to [0.34.2](https://github.com/pola-rs/polars/releases/tag/rs-0.34.0), the `sqlp` and `joinp` enjoy [expanded](https://github.com/pola-rs/polars/blob/rs-0.34.0/crates/polars-sql/src/functions.rs
+) [capabilities](https://github.com/pola-rs/polars/blob/rs-0.34.0/crates/polars-sql/src/keywords.rs) and a noticeable performance boost.
+* We now publish the 500, 1000, 5000 and 15000 Geonames cities indices for the `geocode` command, with users able to easily switch indices with the `index-load` subcommand. As the name implies, the 500 index contains cities with populations of 500 or more, the 1000 index contains cities with populations of 1000 or more, and so on.
 The 15000 index (default) is the smallest (13mb) and fastest with ~26k cities. The 500 index is the largest(56mb) and slowest, with ~200k cities. The 5000 index is 21mb with ~53k cities. The 1000 index is 44mb with ~140k cities.
-* The `geocode` command now returns US Census FIPS codes for US places for the `%json` and `%pretty-json` formats, returning both US State and US County FIPS codes, with upcoming support for Cities and other US Census geographies (School Districts, Voting Districts, Congressional Districts, etc.)
-* Improved performance for `stats`, `schema` and `tojsonl` commands with a stats cache bincode refactor. This is especially noticeable for large CSV files as `stats` would previously create large bincode cache file by default. The bincode cache allows other commands (currently, only `schema` and `tojsonl`) to skip recomputing statistics and deserialize the saved stats data structures directly into memory. Now, it will only create a bincode file if the `--stats-binout` option is specified (typically, before using the `schema` an `tojsonl` commands). `stats` will still continue to create a stats cache file by default, but it will be much smaller than the bincode file.
+* The `geocode` command now returns US Census FIPS codes for US places with the `%json` and `%pretty-json` formats, returning both US State and US County FIPS codes, with upcoming support for Cities and other US Census geographies (School Districts, Voting Districts, Congressional Districts, etc.)
+* Improved performance for `stats`, `schema` and `tojsonl` commands with the stats cache bincode refactor. This is especially noticeable for large CSV files as `stats` previously created large bincode cache files by default.
+The bincode cache allows other commands (currently, only `schema` and `tojsonl`) to skip recomputing statistics and deserialize the saved stats data structures directly into memory. Now, it will only create a bincode file if the `--stats-binout` option is specified (typically, before using the `schema` an `tojsonl` commands). `stats` will still continue to create a stats CSV cache file by default, but it will be much smaller than the bincode file, and is universally applicable, unlike the bincode cache.
 * self-update will now verify updates. This is done by verifying the [zipsign](https://crates.io/crates/zipsign) signature of the release zip archive before applying it. This should make it harder for malicious actors to compromise the self-update process. Version 0.118.0 has the verification code, and future releases will use this new verification process.
 Regardless, we will zipsign all zip archives starting with this release.
-Users can manually verify the signatures by downloading the zipsign public key and running the `zipsign` command line tool. See Verifying Releases for more info.
+Users can manually verify the signatures by downloading the zipsign public key and running the `zipsign` command line tool. See [Verifying Releases](https://todo-add-link.com) for more info.
 * The `frequency` command now supports the `--ignore-case` option for case-insensitive frequency counts.
 * Improved performance for `apply` and `applydp` commands with faster compile-time perfect hash functions for operations lookups.
 * Several minor performance improvements and bug fixes with `snappy`, `sniff` & `cat` commands.
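
For readers unfamiliar with the idea behind the `--ignore-case` highlight, here is a minimal Python sketch of a case-insensitive frequency count. It is an illustration only, not qsv's implementation (qsv is written in Rust). The function name `frequency`, the `ignore_case` parameter, and the choice to fold values to lowercase are all assumptions made for this example, since the changelog does not say how the tool normalizes case.

```python
from collections import Counter


def frequency(values, ignore_case=False):
    """Count how often each value occurs.

    Hypothetical helper for illustration. When ignore_case is True,
    values are lowercased first so that "Boston" and "BOSTON" are
    tallied together (lowercasing is an assumption of this sketch).
    """
    if ignore_case:
        values = (v.lower() for v in values)
    return Counter(values)


cities = ["Boston", "boston", "BOSTON", "Chicago"]

# Case-sensitive: every spelling is a distinct value.
case_sensitive = frequency(cities)

# Case-insensitive: the three spellings of Boston collapse into one key.
case_insensitive = frequency(cities, ignore_case=True)

print(case_sensitive)
print(case_insensitive)
```

In the case-sensitive run each spelling of Boston is counted once, while the case-insensitive run reports a single `boston` key with a count of three.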