@devavret devavret commented Sep 3, 2025

A copy of facebookincubator#14294
Depends on #49

mhaseeb123 and others added 30 commits July 17, 2025 03:16
… And add special handling for single value that's out of column type's range
…lso make use of subfield filters to make AST now that it's available to our datasource
tanjialiang and others added 19 commits September 8, 2025 10:33
…incubator#14735)

Summary:
Pull Request resolved: facebookincubator#14735

SortingWriter::outputBatchRows() returns 0 if maxOutputBytesConfig_ is less than the estimated row size, which makes the subsequent check fail. Returning 0 is never correct for this method, so floor the result at 1 instead.
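
A minimal sketch of the flooring described above. This is not the Velox implementation: the function name, its parameters, and the zero-row-size fallback are hypothetical and chosen only to illustrate why integer division yields 0 and how clamping to 1 avoids it.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the batch-size computation; not the Velox source.
// Returns how many rows to emit per output batch, never less than 1.
inline uint32_t outputBatchRowsSketch(
    uint64_t maxOutputBytes,
    uint64_t estimatedRowSize,
    uint32_t maxOutputRows) {
  if (estimatedRowSize == 0) {
    // No size estimate available (assumption): fall back to the row limit,
    // still floored at 1.
    return std::max<uint32_t>(1, maxOutputRows);
  }
  // Integer division yields 0 when a single row exceeds the byte budget.
  const uint64_t rowsByBytes = maxOutputBytes / estimatedRowSize;
  const uint64_t rows = std::min<uint64_t>(maxOutputRows, rowsByBytes);
  // Floor at 1 so the caller always makes progress.
  return static_cast<uint32_t>(std::max<uint64_t>(1, rows));
}
```

With a 100-byte budget and a 1000-byte estimated row, the unclamped division gives 0, while the sketch returns 1.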

Reviewed By: xiaoxmeng, amitkdutta

Differential Revision: D81729964

fbshipit-source-id: 5f7ae20e3618e1f3ea5738c4ace565863b4ea511
…4779)

Summary:
Pull Request resolved: facebookincubator#14779

X-link: facebookexperimental/verax#371

Reviewed By: Yuhta

Differential Revision: D81923560

fbshipit-source-id: bf400ae9d2002f03bd350cd21bb8e0c0b36ab2aa
…ebookincubator#14772)

Summary:
Pull Request resolved: facebookincubator#14772

Index lookup join doesn't fill the match column properly: it sets the fill end bit to the number of bits to fill, which is wrong. This was discovered through an extension to a Meta-internal use case. The existing index join unit test can't catch this because (1) the test always generates the hit probe rows first; and (2) the match verification logic only checks that the lookup value is null when the match value is false. This PR fixes both issues: (1) randomize the probe hit rows; (2) check the lookup value in both the null and not-null cases. Verified that with this change, the improved index join unit test catches the issue.
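
A minimal sketch of the end-versus-count mix-up described above. The helpers below are hypothetical and self-contained; they only assume a fill routine that takes a half-open bit range `[begin, end)`, which is the shape of bug the summary describes.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: sets or clears bits in the half-open range
// [begin, end). `end` is an exclusive bit position, not a count.
inline void fillBitsSketch(
    std::vector<uint64_t>& bits, int32_t begin, int32_t end, bool value) {
  for (int32_t i = begin; i < end; ++i) {
    const uint64_t mask = uint64_t{1} << (i % 64);
    if (value) {
      bits[i / 64] |= mask;
    } else {
      bits[i / 64] &= ~mask;
    }
  }
}

// Hypothetical helper: reads a single bit.
inline bool isBitSetSketch(const std::vector<uint64_t>& bits, int32_t i) {
  return ((bits[i / 64] >> (i % 64)) & 1) != 0;
}

// Marks `count` rows starting at `offset` as matched.
inline void markMatched(
    std::vector<uint64_t>& bits, int32_t offset, int32_t count) {
  // Passing `count` as the end position would fill [offset, count), which
  // is too short (or empty) whenever offset > 0. The end must be
  // offset + count.
  fillBitsSketch(bits, offset, offset + count, true);
}
```

When the matched rows start at offset 0 the two forms coincide, which is consistent with the note that a test generating hit rows first could not expose the bug.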

Reviewed By: zacw7, mbasmanova

Differential Revision: D80915369

fbshipit-source-id: f3788694542c5d66777ba40c04a492eb544c5432
…bator#14783)

Summary:
Pull Request resolved: facebookincubator#14783

misc: Clean up QDigest registration in FunctionBaseTest

Reviewed By: duxiao1212

Differential Revision: D81950600

fbshipit-source-id: 2c04ab6cdcf5003f258c8fa4017927aa4b0234fc
…acebookincubator#14796)

Summary:
Pull Request resolved: facebookincubator#14796

Using utility casts.h function

Reviewed By: xiaoxmeng

Differential Revision: D81986594

fbshipit-source-id: c7303f50ad86c83f7495a1885f2b0170dcd44058
…kincubator#14784)

Summary:
Pull Request resolved: facebookincubator#14784

Connector factories are used only by Prestissimo to create multiple instances of the same kind of connector for different catalogs. These factories are not needed for other use cases.

A follow-up would be to move connector factories out of Velox into Prestissimo.

Reviewed By: xiaoxmeng

Differential Revision: D81960874

fbshipit-source-id: 0688a4c2b24f9cf41f12bbf081ac5b0426a7db3e
Summary:
We have been testing it in CI for a long time now and it has big benefits (binary size, etc.), so we should make it the default.

Pull Request resolved: facebookincubator#14663

Reviewed By: xiaoxmeng

Differential Revision: D81973981

Pulled By: Yuhta

fbshipit-source-id: 350238d0c0da4263d6eae095fb0a4d7da7754856
Summary:
This PR adds the Spark `timestamp_seconds` function. A key difference between
`timestamp_seconds` and other variations of this function like `timestamp_millis`
and `timestamp_micros` is that the `seconds` input parameter can be fractional
(whereas the `milliseconds` and `microseconds` input parameters are integers).
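
A minimal sketch of the distinction, assuming timestamps are represented as microseconds since the epoch. The function names are hypothetical, and the handling of NaN and infinity (returning no value) and the truncation toward zero are assumptions of this sketch rather than behavior confirmed by the PR.

```cpp
#include <cmath>
#include <cstdint>
#include <optional>

constexpr int64_t kMicrosPerSecond = 1'000'000;
constexpr int64_t kMicrosPerMilli = 1'000;

// Integer input: an exact multiplication, no fractional part to consider.
inline int64_t millisToMicrosSketch(int64_t millis) {
  return millis * kMicrosPerMilli;
}

// Fractional input: the sub-second part survives the conversion.
// Assumption: NaN and infinite inputs produce no value; out-of-range
// inputs are not handled in this sketch.
inline std::optional<int64_t> fractionalSecondsToMicrosSketch(double seconds) {
  if (std::isnan(seconds) || std::isinf(seconds)) {
    return std::nullopt;
  }
  // Truncates toward zero when converting to an integer microsecond count.
  return static_cast<int64_t>(seconds * kMicrosPerSecond);
}
```

For example, an input of 1.5 seconds maps to 1,500,000 microseconds, whereas the integer variants can only express whole milliseconds or microseconds.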

Spark doc: https://spark.apache.org/docs/latest/api/sql/index.html#timestamp_seconds
Spark code: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala#L596

Pull Request resolved: facebookincubator#14222

Reviewed By: mbasmanova

Differential Revision: D81921635

Pulled By: Yuhta

fbshipit-source-id: c01a3f77260443a385133e57cacac1ab7e0d58ef
Summary:
Fixes facebookincubator#14275
Fixes facebookincubator#14756

This is not a complete CMake package support. This works only when:

* `VELOX_BUILD_SHARED=ON`
* `VELOX_BUILD_MINIMAL_WITH_DWIO=ON`
* All dependencies are resolved from system (No bundled dependencies)

FYI: I want to use this for Nimble: facebookincubator/nimble#215
Nimble uses `VELOX_BUILD_MINIMAL_WITH_DWIO=ON`.

This is disabled by default. We can enable it by specifying `VELOX_BUILD_CMAKE_PACKAGE=ON`.

Users can find Velox by `find_package(Velox)`.

We can expand supported cases step by step. How about this as the first step?

Pull Request resolved: facebookincubator#14738

Reviewed By: mbasmanova

Differential Revision: D81923615

Pulled By: Yuhta

fbshipit-source-id: 8a7cb55c5e57c3b87b696fa7298cbd171e80d40d
Summary:
This is follow-up work for facebookincubator#14375 (comment).

Pull Request resolved: facebookincubator#14698

Reviewed By: mbasmanova

Differential Revision: D81962221

Pulled By: Yuhta

fbshipit-source-id: 572ba75309b75a101410b569528037245f0d9178
Summary:
This fixes two categories of errors:
- hidden declaration of a virtual function
- initializer list would use explicit constructor

The latter was previously addressed in tests but new tests were added that caused a re-occurrence of this problem.
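
A minimal sketch of one common shape of each error category, shown in its fixed form. The types below are hypothetical and unrelated to the Velox classes touched by the PR; the exact diagnostics depend on the compiler and warning flags.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Base {
  virtual ~Base() = default;
  virtual int size(int scale) const {
    return scale;
  }
};

struct Derived : Base {
  // Without this using-declaration, the size() overload below hides
  // Base::size(int), which compilers can report as a hidden virtual
  // function.
  using Base::size;
  int size() const {
    return 1;
  }
};

struct Name {
  explicit Name(std::string v) : value(std::move(v)) {}
  std::string value;
};

inline std::vector<Name> makeNames() {
  // `std::vector<Name> names = {{"a"}, {"b"}};` would be rejected: the
  // inner braces copy-list-initialize Name and select its explicit
  // constructor. Spelling out the type at each element avoids that.
  return {Name{"a"}, Name{"b"}};
}
```

In tests, the second category typically appears when a brace-enclosed list is passed where an element type with an explicit constructor is expected, so naming the type at the call site is usually the smallest fix.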

Pull Request resolved: facebookincubator#14781

Reviewed By: mbasmanova

Differential Revision: D81962301

Pulled By: Yuhta

fbshipit-source-id: 22eb32f6fc4b503cd71c6077f1ac38a565e4bf5e
Summary:
Pull Request resolved: facebookincubator#14626

Add the P4HyperLogLog data type.
Add casts both to and from varbinary.
https://prestodb.io/docs/current/language/types.html#hyperloglog (check second subsection)
https://prestodb.io/docs/current/language/types.html#khyperloglog

Reviewed By: kagamiori

Differential Revision: D81148399

fbshipit-source-id: de6b8c9491ebbd9f7380f846eee0a2b92eb7cece
…bookincubator#14706)

Summary:
Previously the actor (the workflow initiator) was used to try and determine the merge base commit sha. However, a user pushing to a different repo becomes the owner and $OWNER:$HEAD_REF does not exist in that scenario. Instead, try to use the PR creator name. We’ve seen HTTP 404 from GH CLI when the actor does not match the repo name.

Pull Request resolved: facebookincubator#14706

Reviewed By: mbasmanova

Differential Revision: D81922077

Pulled By: Yuhta

fbshipit-source-id: d2c81f38e0b37b62288a2b797b8954950a7e9fd0
…kincubator#14791)

Summary:
Pull Request resolved: facebookincubator#14791

There are cases where a RowVector may be missing subfields (particularly in the dwio field reader usage when a field is not projected). The previous setType() implementation did not handle the missing-subfield case.

Reviewed By: Yuhta

Differential Revision: D81987618

fbshipit-source-id: 8e8350120d598953b7a547fe7778cac847a84ef2
Summary:
Pull Request resolved: facebookincubator#14806

Part of facebookincubator#14802

Reviewed By: pansatadru

Differential Revision: D82057681

fbshipit-source-id: eb6afb55c52f8cdd324c316e944fe07ed0eae5c1
copy-pr-bot bot pushed a commit that referenced this pull request Sep 10, 2025
wecharyu and others added 10 commits September 9, 2025 20:46
…14545)

Summary:
Fix facebookincubator#14530

Pull Request resolved: facebookincubator#14545

Test Plan:
Imported from GitHub, without a `Test Plan:` line.

Rollback Plan:

Reviewed By: kagamiori

Differential Revision: D81503154

Pulled By: peterenescu

fbshipit-source-id: 953440f510e839c9a0627beba7964e2ab88f0374
Summary:
X-link: facebookexperimental/verax#382

Pull Request resolved: facebookincubator#14809

Fixes facebookincubator#14802

Reviewed By: pansatadru, xiaoxmeng

Differential Revision: D82071237

fbshipit-source-id: c78e3b634e0c08575a3bdf50bb5b0256472c1cff
…14813)

Summary:
Pull Request resolved: facebookincubator#14813

Add IndexLookupJoinBuilder to PlanBuilder.cc so that index lookup joins use the same builder style as other node builders, such as table scan, and the code is more consistent.

Reviewed By: xiaoxmeng

Differential Revision: D82078011

fbshipit-source-id: c542c57b4f4de6a1d83cf7571ba9b1d1fcf778af
…large projections (facebookincubator#14403)

Summary:
…and keep it similar for small.

The memory overhead is only about 16.125 additional bytes per type in the row.

I also don't think lazy initialization is a good fit here, but I implemented it this way because it was a TODO from the previous attempt.

It would be nice to have some benchmarks for this.

Context: facebookexperimental/verax#118 (comment)

Pull Request resolved: facebookincubator#14403

Reviewed By: mbasmanova

Differential Revision: D81515241

Pulled By: bikramSingh91

fbshipit-source-id: bc127521757550161fe6703a373b68a44a57df14
…incubator#14818)

Summary:
Pull Request resolved: facebookincubator#14818

Continuation of facebookincubator#14784

bypass-github-export-checks

Reviewed By: amitkdutta

Differential Revision: D82104883

fbshipit-source-id: dccb98143c27c1c8f5183de522d5a9e6025eeb84
Summary:
Pull Request resolved: facebookincubator#14253

Adds the `geometry_to_bing_tiles` UDF to Velox. Also uses the namespace `functions::geospatial` in `BingTileType.cpp` to reduce verbosity from frequent use of geospatial constants.

Reviewed By: jagill, Yuhta

Differential Revision: D78950042

fbshipit-source-id: 07c362d4595e6d976b98c726afa29302656c8f9a