Closed
83 commits
406ec2e
let the warehouse and scope be provided through the catalog parameter…
Tishj Mar 25, 2025
981e88e
revert changes to api_result_to_doc, I misunderstood the intent there…
Tishj Mar 26, 2025
5c423a2
add support for additional secrets
Tishj Mar 27, 2025
d47dbce
now the file read actually succeeds
Tishj Mar 27, 2025
ba8c4db
check if the path ends in 'gz.metadata.json', if it does - it's gzip-…
Tishj Mar 27, 2025
9dea357
Merge branch 'main' into lakekeeper_compatibility
Tishj Mar 27, 2025
6b8717e
remove the prefix from GetBaseUrl
Tishj Mar 27, 2025
a838504
remove the ability to provide 'warehouse' as an option, the warehouse…
Tishj Mar 27, 2025
b8cc79b
opt for the more specific secret name. restore previous behavior of o…
Tishj Mar 28, 2025
9da8a89
remove dead code (GetTablesInSchema)
Tishj Mar 28, 2025
2b1c760
distinguish between the config secret and the storage-credential secr…
Tishj Mar 28, 2025
742d9c7
verify that the 'token_type' is 'bearer'
Tishj Mar 28, 2025
f24d619
add warning for deprecated oath2 endpoint
Tishj Mar 28, 2025
e1cb59e
remove warehouse from endpoint builder + other clean up there
Tishj Mar 28, 2025
733af4c
only create a secret out of the 'config' if there are no 'storage-cre…
Tishj Mar 28, 2025
d5f1ba3
rename 'scope' to 'oauth2_scope'
Tishj Mar 28, 2025
49ff59c
run make-format
Tishj Mar 29, 2025
0c7b44f
Merge remote-tracking branch 'upstream/main' into lakekeeper_compatib…
Tishj Mar 29, 2025
54204d5
Replace IOExceptions
Flogex Mar 28, 2025
ccf9507
accidentally lost this change it seems??
Tishj Mar 31, 2025
0932428
formatting on cmakelists (with extension-ci-tools checked out at the …
Tishj Apr 1, 2025
41eec36
fix Format Check CI
Tishj Apr 1, 2025
beb486c
remove generated check
Tishj Apr 1, 2025
1d86365
Merge pull request #139 from Tishj/main
Tishj Apr 1, 2025
8d117a3
install ninja-build before running the extension-ci-tools workflow
Tishj Apr 1, 2025
5b83991
Merge remote-tracking branch 'upstream/main' into lakekeeper_compatib…
Tishj Apr 1, 2025
e13aa5d
downgrade exception types
Tishj Apr 1, 2025
cb5185b
address feedback
Tishj Apr 1, 2025
73b2fb0
add a step to verify that ninja is on the path
Tishj Apr 1, 2025
0930bf5
clean up veeeery long install line, add cmake=3.22 and cmake-data=3.22
Tishj Apr 1, 2025
e4a3617
any 3.* version then?
Tishj Apr 1, 2025
0581fcb
check cmake version
Tishj Apr 1, 2025
8fe54be
move the cmake install to a separate line
Tishj Apr 1, 2025
94bccdc
Merge pull request #140 from Tishj/ci_missing_ninja_build
Tishj Apr 1, 2025
34e53db
Merge remote-tracking branch 'upstream/main' into lakekeeper_compatib…
Tishj Apr 1, 2025
4a47dff
move the install of cmake 3.* to a separate step, to enforce that it …
Tishj Apr 1, 2025
27543ef
Merge pull request #132 from Tishj/lakekeeper_compatibility
Tishj Apr 1, 2025
e44759d
initial commit. want to test on AWS
Tmonster Apr 2, 2025
d78f435
add Makefile
Tmonster Apr 2, 2025
72bda43
mods to get some things working
Tmonster Apr 2, 2025
68b47d1
copy all the options from the referenced storage secret to the scoped…
Tishj Apr 2, 2025
f2a5e42
test polaris in its own workflow (for now)
Tmonster Apr 2, 2025
fb80c40
put polaris CI in existing local test
Tmonster Apr 2, 2025
854d43a
fix compilation for linux arm64
Tishj Apr 2, 2025
3d3997a
add environment variable to force minimum cmake requirement to 3.5, t…
Tishj Apr 2, 2025
4017d0a
Merge pull request #149 from Tishj/bandaid_fix_for_cmake4
Tishj Apr 2, 2025
33b681f
Merge remote-tracking branch 'upstream/main' into fix_ci_local_rest_c…
Tishj Apr 2, 2025
76f5345
Merge pull request #144 from Tishj/fix_ci_local_rest_catalog
Tishj Apr 2, 2025
6091da3
Merge remote-tracking branch 'upstream/main' into linux_arm64_fixes
Tishj Apr 2, 2025
3286b51
Merge remote-tracking branch 'upstream/main' into add_polaris_ci
Tishj Apr 2, 2025
64a01fe
Merge pull request #148 from Tishj/linux_arm64_fixes
Tmonster Apr 3, 2025
4dac4c4
remove steps
Tmonster Apr 2, 2025
333e210
Bump cmake_minimum_required to range (tracking duckdb/duckdb)
carlopi Apr 2, 2025
2f04f32
Move avro to subfolder
carlopi Apr 2, 2025
cb6c40a
Add vcpkg-cmake
carlopi Apr 3, 2025
aaed76e
Add CMAKE_POLICY_VERSION_MINIMUM=3.5
carlopi Apr 3, 2025
0e1974a
Revert previous CMAKE_POLICY_VERSION_MINIMUM env variable
carlopi Apr 3, 2025
ddccfbf
Use InvalidConfigurationExceptions instead
Flogex Apr 3, 2025
4f93ec2
Add also vcpkg-cmake
carlopi Apr 3, 2025
7baa22a
thijs comments
Tmonster Apr 3, 2025
a1f7aa2
typo
Tmonster Apr 3, 2025
0287c74
dont forget to source bashrc
Tmonster Apr 3, 2025
5982037
better naming for workflow jobs
Tmonster Apr 3, 2025
2b830c7
better workflow for polaris
Tmonster Apr 3, 2025
1e6fe7d
Merge pull request #141 from carlopi/fix_cmake
Tmonster Apr 3, 2025
fd65be1
source bashrc not bash_profile
Tmonster Apr 3, 2025
6b2f328
Merge remote-tracking branch 'upstream/main' into add_polaris_ci
Tmonster Apr 3, 2025
6a2b923
add tmate check
Tmonster Apr 3, 2025
2ec564e
add polaris to CI
Tmonster Apr 3, 2025
7511616
run polaris testing in it's own CI
Tmonster Apr 3, 2025
836a3bd
run make format-fix
Tmonster Apr 3, 2025
02b32a7
we can test polaris against tpch
Tmonster Apr 3, 2025
1ac04c6
remove unused table directories, all polaris generation happens in da…
Tmonster Apr 3, 2025
c433ec4
remove pdb, also include changes to MakeFile
Tmonster Apr 3, 2025
8337417
print what data is being generated
Tmonster Apr 3, 2025
a2399e1
need to initialize rest data first, then local data
Tmonster Apr 4, 2025
b729e66
split test into two because it keeps failing randomly. we will eventu…
Tmonster Apr 4, 2025
44e56ac
Fix HTTPException usage
Flogex Apr 4, 2025
2dcea92
Merge pull request #150 from Tmonster/add_polaris_to_ci
Tishj Apr 4, 2025
98d6952
Use Exception::ConstructMessage with StringUtil::Format
Flogex Apr 4, 2025
6bfb30f
Merge remote-tracking branch 'upstream/main' into replace-io-exceptions
Flogex Apr 4, 2025
1c8de5c
Replace expected exception type in tests
Flogex Apr 7, 2025
5b33522
"Failed to query" is actually HTTP error
Flogex Apr 7, 2025
2 changes: 1 addition & 1 deletion .github/workflows/CloudTesting.yml
@@ -57,4 +57,4 @@ jobs:
AWS_DEFAULT_REGION: ${{secrets.S3_ICEBERG_TEST_USER_REGION}}
ICEBERG_AWS_REMOTE_AVAILABLE: 1
run: |
make test_release
make test_release
6 changes: 1 addition & 5 deletions .github/workflows/CodeQuality.yml
@@ -44,6 +44,7 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
submodules: 'true'

- name: Install
shell: bash
@@ -61,8 +62,3 @@ jobs:
black --version
make format-check-silent

- name: Generated Check
shell: bash
run: |
make generate-files
git diff --exit-code
34 changes: 33 additions & 1 deletion .github/workflows/LocalTesting.yml
@@ -27,16 +27,48 @@ jobs:
sudo apt-get install -y -qq software-properties-common
sudo add-apt-repository ppa:git-core/ppa
sudo apt-get update -y -qq
sudo apt-get install -y -qq ninja-build make gcc-multilib g++-multilib libssl-dev wget openjdk-8-jdk zip maven unixodbc-dev libc6-dev-i386 lib32readline6-dev libssl-dev libcurl4-gnutls-dev libexpat1-dev gettext unzip build-essential checkinstall libffi-dev curl libz-dev openssh-client
sudo apt-get install -y -qq \
ninja-build \
make gcc-multilib \
g++-multilib \
libssl-dev \
wget \
openjdk-8-jdk \
zip \
maven \
unixodbc-dev \
libc6-dev-i386 \
lib32readline6-dev \
libssl-dev \
libcurl4-gnutls-dev \
libexpat1-dev \
gettext \
unzip \
build-essential \
checkinstall \
libffi-dev \
curl \
libz-dev \
openssh-client
sudo apt-get install -y -qq tar pkg-config
sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose

- name: Install CMake 3.x
run: |
sudo apt-get remove -y cmake cmake-data
sudo apt-get install --allow-downgrades -y -qq 'cmake=3.*' 'cmake-data=3.*'

- uses: actions/checkout@v4
with:
fetch-depth: 0
submodules: 'true'

- name: Check installed versions
run: |
ninja --version
cmake --version

- name: Setup vcpkg
uses: lukka/[email protected]
with:
135 changes: 135 additions & 0 deletions .github/workflows/PolarisTesting.yml
@@ -0,0 +1,135 @@
name: Local Polaris Testing
on: [push, pull_request,repository_dispatch]
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref || '' }}-${{ github.base_ref || '' }}-${{ github.ref != 'refs/heads/main' || github.sha }}
  cancel-in-progress: true
defaults:
  run:
    shell: bash

env:
  BASE_BRANCH: ${{ github.base_ref || (endsWith(github.ref, '_feature') && 'feature' || 'main') }}
  CMAKE_POLICY_VERSION_MINIMUM: 3.5

jobs:
  rest:
    name: Test against Polaris Catalog
    runs-on: ubuntu-latest
    env:
      VCPKG_TARGET_TRIPLET: 'x64-linux'
      GEN: ninja
      VCPKG_TOOLCHAIN_PATH: ${{ github.workspace }}/vcpkg/scripts/buildsystems/vcpkg.cmake
      PIP_BREAK_SYSTEM_PACKAGES: 1

    steps:
      - name: Install required ubuntu packages
        run: |
          sudo apt-get update -y -qq
          sudo apt-get install -y -qq software-properties-common
          sudo add-apt-repository ppa:git-core/ppa
          sudo apt-get update -y -qq
          sudo apt-get install -y -qq \
            ninja-build \
            make gcc-multilib \
            g++-multilib \
            libssl-dev \
            wget \
            openjdk-8-jdk \
            zip \
            maven \
            unixodbc-dev \
            libc6-dev-i386 \
            lib32readline6-dev \
            libssl-dev \
            libcurl4-gnutls-dev \
            libexpat1-dev \
            gettext \
            unzip \
            build-essential \
            checkinstall \
            libffi-dev \
            curl \
            libz-dev \
            openssh-client
          sudo apt-get install -y -qq tar pkg-config
          sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
          sudo chmod +x /usr/local/bin/docker-compose

      - name: Install CMake 3.x
        run: |
          sudo apt-get remove -y cmake cmake-data
          sudo apt-get install --allow-downgrades -y -qq 'cmake=3.*' 'cmake-data=3.*'

      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          submodules: 'true'

      - name: Setup vcpkg
        uses: lukka/[email protected]
        with:
          vcpkgGitCommitId: 5e5d0e1cd7785623065e77eff011afdeec1a3574

      - name: Setup Ccache
        uses: hendrikmuhs/ccache-action@main
        continue-on-error: true

      - name: Build extension
        env:
          GEN: ninja
          STATIC_LIBCPP: 1
        run: |
          make release

      - name: Set up for Polaris
        run: |
          # install java
          sudo apt install -y -qq openjdk-21-jre-headless
          sudo apt install -y -qq openjdk-21-jdk-headless
          # install python virtual environment (is this needed?)
          sudo apt-get install -y -qq python3-venv

      - name: Wait for polaris initialization
        env:
          JAVA_HOME: /usr/lib/jvm/java-21-openjdk-amd64
        run: |
          make setup_polaris_ci
          # let polaris initialize
          max_attempts=50
          attempt=1
          while ! (curl -sf http://localhost:8182/healthcheck || curl -sf http://localhost:8182/q/health); do
            if [ $attempt -gt $max_attempts ]; then
              echo "Polaris failed to initialize after $max_attempts attempts"
              exit 1
            fi
            echo "Waiting for Polaris to initialize (attempt $attempt/$max_attempts)..."
            sleep 5
            attempt=$((attempt + 1))
          done
          echo "Polaris is healthy"

      - name: Generate Polaris Data
        run: |
          python3 -m venv .
          source ./bin/activate
          python3 -m pip install poetry
          python3 -m pip install pyspark==3.5.0
          python3 -m pip install duckdb
          python3 scripts/polaris/get_polaris_root_creds.py
          # needed for setup_polaris_catalog.sh
          export POLARIS_ROOT_ID=$(cat polaris_root_id.txt)
          export POLARIS_ROOT_SECRET=$(cat polaris_root_password.txt)
          cd polaris_catalog && ../scripts/polaris/setup_polaris_catalog.sh > user_credentials.json
          cd ..
          python3 scripts/polaris/get_polaris_client_creds.py
          export POLARIS_CLIENT_ID=$(cat polaris_client_id.txt)
          export POLARIS_CLIENT_SECRET=$(cat polaris_client_secret.txt)
          python3 scripts/data_generators/generate_data.py polaris

      - name: Test with rest catalog
        env:
          POLARIS_SERVER_AVAILABLE: 1
        run: |
          export POLARIS_CLIENT_ID=$(cat polaris_client_id.txt)
          export POLARIS_CLIENT_SECRET=$(cat polaris_client_secret.txt)
          make test_release
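
Note on the "Wait for polaris initialization" step: the shell loop polls two health endpoints until either one answers, and gives up after a bounded number of attempts. For readers who want to reuse that pattern outside the workflow, here is a minimal Python sketch of the same idea. It is not part of this PR. The names `is_healthy` and `wait_for_service`, and the injectable `probe` and `sleep` parameters, are assumptions made for illustration. It also differs slightly from the shell version, which probes once more before failing and does not follow redirects.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Sequence


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on `url` succeeds with a 2xx status (roughly `curl -sf`)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready yet.
        return False


def wait_for_service(
    urls: Sequence[str],
    max_attempts: int = 50,
    delay_seconds: float = 5.0,
    probe: Callable[[str], bool] = is_healthy,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe the URLs once per attempt; succeed as soon as any URL is healthy."""
    for attempt in range(1, max_attempts + 1):
        if any(probe(url) for url in urls):
            return True
        print(f"Waiting for service to initialize (attempt {attempt}/{max_attempts})...")
        # Hypothetical refinement: do not sleep after the final failed attempt.
        if attempt < max_attempts:
            sleep(delay_seconds)
    return False
```

A caller would pass the two endpoints from the workflow, for example `wait_for_service(["http://localhost:8182/healthcheck", "http://localhost:8182/q/health"])`, and exit non-zero when it returns `False`. Injecting `probe` and `sleep` keeps the loop testable without a running server.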
21 changes: 13 additions & 8 deletions CMakeLists.txt
@@ -1,4 +1,4 @@
cmake_minimum_required(VERSION 2.8.12)
cmake_minimum_required(VERSION 3.5...3.29)

# Set extension name here
set(TARGET_NAME iceberg)
@@ -16,7 +16,7 @@ set(EXTENSION_SOURCES
src/iceberg_manifest.cpp
src/manifest_reader.cpp
src/catalog_api.cpp
src/catalog_utils.cpp
src/catalog_utils.cpp
src/common/utils.cpp
src/common/url_utils.cpp
src/common/schema.cpp
@@ -34,8 +34,7 @@ set(EXTENSION_SOURCES
src/storage/irc_table_entry.cpp
src/storage/irc_table_set.cpp
src/storage/irc_transaction.cpp
src/storage/irc_transaction_manager.cpp
)
src/storage/irc_transaction_manager.cpp)

add_library(${EXTENSION_NAME} STATIC ${EXTENSION_SOURCES})

@@ -46,14 +45,20 @@ find_package(CURL REQUIRED)
find_package(AWSSDK REQUIRED COMPONENTS core sso sts)
include_directories(${CURL_INCLUDE_DIRS})

# Reset the TARGET_NAME, the AWS find_package build could bleed into our build - overriding `TARGET_NAME`
set(TARGET_NAME iceberg)

# AWS SDK FROM vcpkg
target_include_directories(${EXTENSION_NAME} PUBLIC $<BUILD_INTERFACE:${AWSSDK_INCLUDE_DIRS}>)
target_include_directories(${EXTENSION_NAME}
PUBLIC $<BUILD_INTERFACE:${AWSSDK_INCLUDE_DIRS}>)
target_link_libraries(${EXTENSION_NAME} PUBLIC ${AWSSDK_LINK_LIBRARIES})
target_include_directories(${TARGET_NAME}_loadable_extension PRIVATE $<BUILD_INTERFACE:${AWSSDK_INCLUDE_DIRS}>)
target_link_libraries(${TARGET_NAME}_loadable_extension ${AWSSDK_LINK_LIBRARIES})
target_include_directories(${TARGET_NAME}_loadable_extension
PRIVATE $<BUILD_INTERFACE:${AWSSDK_INCLUDE_DIRS}>)
target_link_libraries(${TARGET_NAME}_loadable_extension
${AWSSDK_LINK_LIBRARIES})

# Link dependencies into extension
target_link_libraries(${EXTENSION_NAME} PUBLIC ${CURL_LIBRARIES})
target_link_libraries(${EXTENSION_NAME} PUBLIC ${CURL_LIBRARIES})
target_link_libraries(${TARGET_NAME}_loadable_extension ${CURL_LIBRARIES})

install(
12 changes: 10 additions & 2 deletions Makefile
@@ -18,10 +18,18 @@ install_requirements:

# Custom makefile targets
data: data_clean start-rest-catalog
python3 scripts/data_generators/generate_data.py
python3 scripts/data_generators/generate_data.py spark-rest local

data_large: data data_clean
python3 scripts/data_generators/generate_data.py
python3 scripts/data_generators/generate_data.py spark-rest local

# setup polaris server. See PolarisTesting.yml to see instructions for a specific machine.
setup_polaris_ci:
mkdir polaris_catalog
git clone https://github.com/apache/polaris.git polaris_catalog
cd polaris_catalog && ./gradlew clean :polaris-quarkus-server:assemble -Dquarkus.container-image.build=true --no-build-cache
cd polaris_catalog && ./gradlew --stop
cd polaris_catalog && nohup ./gradlew run > polaris-server.log 2> polaris-error.log &

data_clean:
rm -rf data/generated
30 changes: 28 additions & 2 deletions scripts/data_generators/generate_data.py
@@ -1,17 +1,43 @@
from generate_spark_local.generate_iceberg_spark_local import IcebergSparkLocal
from generate_spark_rest.generate_iceberg_spark_rest import IcebergSparkRest
from generate_polaris_rest.generate_iceberg_polaris_rest import IcebergPolarisRest
import sys

# Example usage:
if __name__ == "__main__":
def GenerateSparkRest():
    db2 = IcebergSparkRest()
    conn2 = db2.GetConnection()
    db2.GenerateTables(conn2)
    db2.CloseConnection(conn2)
    del db2
    del conn2

def GenerateSparkLocal():
    db = IcebergSparkLocal()
    conn = db.GetConnection()
    db.GenerateTables(conn)
    db.CloseConnection(conn)
    del db
    del conn

def GeneratePolarisData():
    db = IcebergPolarisRest()
    conn = db.GetConnection()
    db.GenerateTables(conn)
    db.CloseConnection(conn)
    del db
    del conn

if __name__ == "__main__":
    argv = sys.argv
    for i in range(1, len(argv)):
        if argv[i] == "polaris":
            print("generating polaris data")
            GeneratePolarisData()
        elif argv[i] == "local":
            print("generating local iceberg data")
            GenerateSparkLocal()
        elif argv[i] == "spark-rest":
            print("generating local iceberg REST data")
            GenerateSparkRest()
        else:
            print(f"{argv[i]} not recognized, skipping")
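
Note on the argument handling above: `generate_data.py` now runs one generator per command-line argument, in the order given, and skips names it does not recognize (the Makefile passes `spark-rest local`, the Polaris workflow passes `polaris`). One possible alternative to the `if`/`elif` chain is a lookup table, sketched below. This is an illustration, not the code in this PR. `run_generators`, `GENERATORS` and the lambda stand-ins are hypothetical; the real script would register `GeneratePolarisData`, `GenerateSparkLocal` and `GenerateSparkRest` instead.

```python
from typing import Callable, Dict, Iterable, List


def run_generators(
    targets: Iterable[str],
    generators: Dict[str, Callable[[], None]],
) -> List[str]:
    """Run the generator registered for each target, in order; return the names that ran."""
    ran: List[str] = []
    for target in targets:
        generator = generators.get(target)
        if generator is None:
            # Same behaviour as the script: report and move on.
            print(f"{target} not recognized, skipping")
            continue
        print(f"generating {target} data")
        generator()
        ran.append(target)
    return ran


# Hypothetical stand-ins that only record which generator was called.
log: List[str] = []
GENERATORS: Dict[str, Callable[[], None]] = {
    "polaris": lambda: log.append("polaris"),
    "local": lambda: log.append("local"),
    "spark-rest": lambda: log.append("spark-rest"),
}
```

With a table like this, adding another catalog target is a one-line change, and the script's entry point reduces to `run_generators(sys.argv[1:], GENERATORS)`.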
1 change: 1 addition & 0 deletions scripts/data_generators/generate_polaris_rest/__init__.py
@@ -0,0 +1 @@
