Conversation

@Tmonster (Owner)

Needed for reordering left joins

taniabogatsch and others added 30 commits January 20, 2025 18:19
# Conflicts:
#	src/execution/operator/persistent/physical_insert.cpp
* For small probe cardinalities, plan a nested loop join + aggregate.
* Add an `asof_loop_join_threshold` setting to control its use.
* Loop over `asof_loop_join_threshold` values in tests.
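A hedged sketch of how the new setting might be exercised; the table names and the threshold value are illustrative, not defaults from this PR:

```sql
-- Illustrative only: steer small probe sides towards the nested loop plan.
SET asof_loop_join_threshold = 64;  -- hypothetical value

SELECT t.ts, t.symbol, p.price
FROM trades t                       -- hypothetical tables
ASOF JOIN prices p
  ON t.symbol = p.symbol AND t.ts >= p.ts;
```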
Mytherin and others added 30 commits February 17, 2025 22:11
…or verification (duckdb#16262)

This PR fixes duckdblabs/duckdb-internal#4233

`Execute` complained because the validity mask was dirty; we now use a vector cache to reset the `Vector` before every `Execute` call.
…er (duckdb#16263)

Currently we keep the compressed buffer around per column, even though we really only need it for a short time (while reading data to then decompress it). These resizable buffers can grow to a rather substantial amount of memory, which can lead to us using 50-100% extra memory when reading Parquet files. Destroying and re-allocating the buffer does not seem to greatly affect performance - below are ClickBench timings run on my laptop on a single big file (which is when caching matters the most, given that caches are re-instantiated between files anyway):

We can perhaps do something more clever where we keep around a single
cache that we share across readers, but not for the bug-fix release.

| Query |  Old  |  New  | Ratio |
|-------|------:|------:|------:|
| Q00   | 0.165 | 0.156 | 0.95  |
| Q01   | 0.156 | 0.155 | 0.99  |
| Q02   | 0.192 | 0.191 | 0.99  |
| Q03   | 0.179 | 0.188 | 1.05  |
| Q04   | 0.292 | 0.287 | 0.98  |
| Q05   | 0.364 | 0.349 | 0.96  |
| Q06   | 0.154 | 0.157 | 1.02  |
| Q07   | 0.158 | 0.159 | 1.01  |
| Q08   | 0.361 | 0.348 | 0.96  |
| Q09   | 0.449 | 0.440 | 0.98  |
| Q10   | 0.225 | 0.220 | 0.98  |
| Q11   | 0.251 | 0.249 | 0.99  |
| Q12   | 0.379 | 0.355 | 0.94  |
| Q13   | 0.552 | 0.539 | 0.98  |
| Q14   | 0.426 | 0.412 | 0.97  |
| Q15   | 0.351 | 0.325 | 0.93  |
| Q16   | 0.665 | 0.629 | 0.95  |
| Q17   | 0.617 | 0.596 | 0.97  |
| Q18   | 0.971 | 0.947 | 0.98  |
| Q19   | 0.215 | 0.196 | 0.91  |
| Q20   | 0.929 | 0.892 | 0.96  |
| Q21   | 0.730 | 0.719 | 0.98  |
| Q22   | 1.213 | 1.226 | 1.01  |
| Q23   | 3.062 | 2.905 | 0.95  |
| Q24   | 0.189 | 0.189 | 1.00  |
| Q25   | 0.178 | 0.172 | 0.97  |
| Q26   | 0.201 | 0.186 | 0.93  |
| Q27   | 0.771 | 0.729 | 0.95  |
| Q28   | 7.475 | 7.246 | 0.97  |
| Q29   | 0.254 | 0.236 | 0.93  |
| Q30   | 0.483 | 0.477 | 0.99  |
| Q31   | 0.508 | 0.549 | 1.08  |
| Q32   | 1.168 | 1.254 | 1.07  |
| Q33   | 1.431 | 1.493 | 1.04  |
| Q34   | 1.453 | 1.511 | 1.04  |
| Q35   | 0.359 | 0.369 | 1.03  |
| Q36   | 0.165 | 0.184 | 1.12  |
| Q37   | 0.161 | 0.166 | 1.03  |
| Q38   | 0.173 | 0.172 | 0.99  |
| Q39   | 0.177 | 0.182 | 1.03  |
| Q40   | 0.165 | 0.169 | 1.02  |
| Q41   | 0.162 | 0.172 | 1.06  |
…n the compiler (MSVC or not) (duckdb#16267)

This PR addresses duckdb#16237

My local builds started failing because `-std=c++11` was missing; this should fix the issue for both compiler flavors.
Please verify that this doesn't break the MSVC environment @cfis
Output from the `duckdb_extensions` table function would show `NULL` as the install mode instead of `NOT_INSTALLED`.
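For reference, the column in question can be inspected with a plain query against the built-in table function:

```sql
-- Inspect the install mode reported for each extension.
SELECT extension_name, installed, install_mode
FROM duckdb_extensions();
```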
…nemap (duckdb#16269)

During row group checks, we keep track of which filters are always true (and hence do not need to be checked during the actual scan). Due to a missing `else` statement, we could call `SetFilterAlwaysTrue` twice for the same filter.

At first glance this doesn't look problematic, but `SetFilterAlwaysTrue` *also* keeps track of how many filters have been marked as always true - and if all filters are marked as always true, the filters are skipped entirely.

This can cause problems when optional filters (e.g., constructed due to a Top-N) are combined with non-optional filters (e.g., constructed through a `WHERE` clause). The filter would not be correctly executed - leading to incorrect rows sneaking into the result.
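A hedged sketch of the query shape involved (table and column names are hypothetical): the Top-N contributes an optional filter to the scan, while the `WHERE` clause contributes a non-optional one.

```sql
-- Hypothetical repro shape: an optional Top-N filter combined with a
-- non-optional WHERE filter on the same scan.
SELECT *
FROM measurements          -- hypothetical table
WHERE sensor_id = 42       -- non-optional filter from the WHERE clause
ORDER BY reading
LIMIT 5;                   -- Top-N can push an optional filter into the scan
```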
* Check hints for equality and skip search.

fixes: duckdb#16250
fixes: duckdblabs/duckdb-internal#4229
Fix duckdb#16257

I've attached the diff that needs to happen to `main` when we eventually
need to merge this branch into it, as the Parquet writer has been
refactored significantly in the meantime.


[page_fix.txt](https://github.com/user-attachments/files/18827335/page_fix.txt)
Basically cherry-picking the commits of
duckdb#16154 to the `v1.2` branch, and
reducing the number of tests to a minimal set.
1. Slightly improve the performance of hashing long strings
2. Implement bit-packing for our `RleBpEncoder` (it only did RLE until
now)
3. Implement a custom hash map to build Parquet dictionaries (instead of
using `std::unordered_map`)
4. Add many fast paths, e.g., for handling `NULL` values when we know
there are no `NULL`s

This brings the time it takes to write `lineitem` from TPC-H SF10 to
Parquet from ~4.5s to ~2.5s on my laptop, as well as reducing the size
from 2.4 GB to 2.0 GB (with the default `PARQUET_VERSION V1`).

When enabling `PARQUET_VERSION V2`, time reduces from ~3.8s to ~2.4s on
my laptop, and size reduces from 1.7 GB to 1.6 GB.

It also adds the parameter `string_dictionary_page_size_limit` to the
Parquet writer, which makes the page size of dictionary pages
configurable. It defaults to 1 MB.
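A hedged usage sketch; the option spellings follow the names in this PR, and the byte value is arbitrary:

```sql
-- Illustrative only: write with Parquet V2 data pages and a larger
-- dictionary page size limit (the default is 1 MB per this PR).
COPY lineitem TO 'lineitem.parquet' (
    FORMAT PARQUET,
    PARQUET_VERSION V2,
    STRING_DICTIONARY_PAGE_SIZE_LIMIT 2097152  -- 2 MiB; value format assumed
);
```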
* Use two cursors for range searches
* Reduces benchmark time from 35s to 25s
These changes fix bugs related to optimistically writing blocks via the partial block manager.

- `LocalTableStorage::Rollback` has to `row_groups->CommitDropTable()`;
otherwise, we'll have no way to figure out which blocks have to be
marked modified on rollback if the partial block manager is already
destroyed.
- `PartialBlockManager::Rollback(const bool mark_modified);`: The
top-level optimistic writer now directly rolls back its row groups.
Child writers (and their partial block managers) need to mark their
written blocks as modified.
- `LocalTableStorage` + `ALTER TYPE`: `row_groups->CommitDropColumn` the
(old) altered column to add its optimistically written blocks to the
modified list.
- `LocalTableStorage` + `DROP COLUMN`: `row_groups->CommitDropColumn`
the dropped column to add its optimistically written blocks to the
modified list.
- `PartialBlockManager::FlushPartialBlocks` needs to add its partial
flushed blocks to `written_blocks`. Otherwise, in some rare cases
(successful insert, constraint violation on other data, then rollback in
a transaction), we cannot know that a written block must be marked as
modified. That situation is also why we keep the written blocks alive
(and thus, the entire partial block manager) in
`OptimisticDataWriter::FinalFlush`.

The tests cover all these cases, and often a combination of these fixes is necessary to make a test pass.

`insert_order_preserving_odd_sized_batches.test_slow` has a `FIXME`, as I am unsure whether we expect the different block counts.

Fixes duckdblabs/duckdb-internal#2453.
Fixes duckdblabs/duckdb-internal#3931.
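A hedged sketch of the transaction shape from the `FlushPartialBlocks` bullet above; the schema is hypothetical, and the large insert stands in for data that gets optimistically written:

```sql
-- Illustrative only: successful insert, constraint violation on other
-- data, then rollback within one transaction.
CREATE TABLE t (i INTEGER PRIMARY KEY);
INSERT INTO t VALUES (1);

BEGIN TRANSACTION;
INSERT INTO t SELECT range + 10 FROM range(1000000);  -- succeeds, may be written optimistically
INSERT INTO t VALUES (1);                             -- constraint violation on other data
ROLLBACK;                                             -- written blocks must be marked modified
```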
…ckdb#16246)

Instead of allocating a `std::vector` per group, we now use arena
allocations to allocate fixed-size arrays. This greatly reduces
contention on the system allocator when there are many threads/groups.
It also fixes the scaling issues encountered here:
https://duckdb.org/2024/10/25/topn.html

I also removed an error that was thrown when a `NULL` was encountered while creating an `ENUM`. This was unnecessary, as we can simply skip such values.
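A hedged example of the `ENUM` change (hypothetical table; this assumes DuckDB's `CREATE TYPE ... AS ENUM (SELECT ...)` form):

```sql
-- NULL values in the source are now skipped instead of raising an error.
CREATE TABLE colors (name VARCHAR);
INSERT INTO colors VALUES ('red'), (NULL), ('blue');
CREATE TYPE color AS ENUM (SELECT name FROM colors);
```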
Follow-up of duckdb#16243. Scraping the bottom of the barrel here, as the
previous PR got many of the biggest performance gains already.

This PR adds some more fast paths for when there are no `NULL`s, and
implements a branchless hash function for `string_t`'s that are inlined.
This required some extra care to make sure that the hash function
returns the same value whether the string is inlined or not.

Overall, the changes reduce the time it takes to write TPC-H SF10
`lineitem` to Parquet from ~2.6s to ~2.4s (with the default
`PARQUET_VERSION V1`, ~2.5s to ~2.3s with `V2`).
…nd align behavior (duckdb#15944)

This PR fixes
<duckdblabs/duckdb-internal#3945>

The following behavior applies to casting from `VARCHAR` ->
`LIST`|`STRUCT`|`MAP`

- Backslashes outside of quotes are treated as literals, unless they are
followed by a single/double quote, in which case they act as an escape
character and are removed from the result.
- Backslashes inside of quotes are always treated as escape characters;
to use a literal backslash inside quotes, the backslash has to be
escaped (`\\`). Unescaped backslashes are removed from the result.

Quotes that are used to start and end a quoted value are removed from
the result.
`[""""]` -> `[]`
`['''']` -> `[]`
`['"']` -> `["]`
`["'"]` -> `[']`

Whenever your string contains a character that is considered special by
the casting logic, it should be wrapped in quotes.
To preserve leading or trailing spaces, the value should also be wrapped
in quotes.
The list of characters that are *always*[1] special: `(whitespace)`,
`{`, `}`, `(`, `)`, `[`, `]`, `"`, `'`, `\`
Other characters that are special in certain contexts:
- STRUCT key: `:`
- MAP key: `=`
- LIST, MAP, STRUCT value: `,`

[1] STRUCT key permits the use of `{`, `}`, `(`, `)`, `[`, `]`
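A few hedged examples consistent with the rules above; the expected results assume the behavior described in this PR:

```sql
-- Comma inside quotes is literal; the surrounding quotes are removed:
-- yields the two elements a and b,c.
SELECT CAST('[a, "b,c"]' AS VARCHAR[]);

-- Backslash before a quote acts as an escape outside of quotes:
-- yields a single element containing only ".
SELECT CAST('[\"]' AS VARCHAR[]);

-- MAP keys use = as the separator.
SELECT CAST('{a=1, b=2}' AS MAP(VARCHAR, INTEGER));
```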

------------------------------------------------------------

This PR also adds support for casting from `VARCHAR` -> unnamed `STRUCT`, in the form of `( <value>, <value2>, ..., <value n> )`.
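A hedged guess at the usage, assuming the values are matched positionally against the fields of the target type:

```sql
-- Illustrative only: the input string uses the unnamed (key-less) form;
-- positional matching against the target struct is assumed.
SELECT CAST('(1, two, 3.0)' AS STRUCT(a INTEGER, b VARCHAR, c DOUBLE));
```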
Although I haven't run benchmarks, the release binary size has been
reduced by ~35KB, which should result in some performance improvements.