ci: add test workflow for docker-based replication #254

Merged: 7 commits, Dec 5, 2024
140 changes: 140 additions & 0 deletions .github/workflows/replication-test.yml
@@ -0,0 +1,140 @@
name: Docker Replica Mode Test

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

jobs:
  test-replication:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        source: ['mysql', 'postgres']
    steps:
      - uses: actions/checkout@v4

      - name: Install dependencies
        run: |
          # Only install DuckDB for data comparison
          curl -LJO https://github.com/duckdb/duckdb/releases/latest/download/duckdb_cli-linux-amd64.zip
          unzip duckdb_cli-linux-amd64.zip
          chmod +x duckdb
          sudo mv duckdb /usr/local/bin

      - name: Start source ${{ matrix.source }} database
        run: |
          if [ "${{ matrix.source }}" = "mysql" ]; then
            docker run -d --name source-db -p 3306:3306 \
              -e MYSQL_ROOT_PASSWORD=root \
              -e MYSQL_DATABASE=test \
              mysql:lts

            # Wait for MySQL to be ready
            until docker exec source-db mysql -uroot -proot -e "SELECT 1"; do
              sleep 1
            done

            # Create test data
            docker exec source-db mysql -uroot -proot test -e "
              CREATE TABLE items (id INT PRIMARY KEY, name VARCHAR(50));
              INSERT INTO items VALUES (1, 'test1'), (2, 'test2');"

          else
            docker run -d --name source-db -p 5432:5432 \
              -e POSTGRES_PASSWORD=postgres \
              -e POSTGRES_DB=test \
              postgres:latest \
              -c wal_level=logical

            # Wait for PostgreSQL to be ready
            until docker exec source-db pg_isready; do
              sleep 1
            done

            # Create test data
            docker exec source-db psql -U postgres test -c "
              CREATE TABLE items (id INT PRIMARY KEY, name VARCHAR(50));
              INSERT INTO items VALUES (1, 'test1'), (2, 'test2');"
          fi

      - name: Start MyDuck Server in replica mode
        run: |
          if [ "${{ matrix.source }}" = "mysql" ]; then
            SOURCE_DSN="mysql://root:root@host.docker.internal:3306"
          else
            SOURCE_DSN="postgres://postgres:postgres@host.docker.internal:5432/test"
          fi

          docker run -d --name myduck \
            --add-host=host.docker.internal:host-gateway \
            -p 13306:3306 \
            -p 15432:5432 \
            --env=SETUP_MODE=REPLICA \
            --env=SOURCE_DSN="$SOURCE_DSN" \
            apecloud/myduckserver:latest

          # Wait for MyDuck to be ready
          sleep 10

      - name: Verify initial replication
        run: |
          # Query source data
          if [ "${{ matrix.source }}" = "mysql" ]; then
            docker exec source-db mysql -uroot -proot test \
              -e "SELECT * FROM items ORDER BY id;" > source_data.csv
          else
            docker exec source-db psql -U postgres -h 127.0.0.1 test \
              -c "\COPY (SELECT * FROM items ORDER BY id) TO STDOUT WITH CSV;" | tee source_data.csv
          fi

          # Query MyDuck data through Postgres interface
          docker exec myduck psql -U postgres -h 127.0.0.1 \
            -c "\COPY (SELECT * FROM items ORDER BY id) TO STDOUT WITH CSV;" | tee myduck_data.csv

          # Compare data using DuckDB
          duckdb --csv -c "
            CREATE TABLE source AS FROM 'source_data.csv';
            CREATE TABLE myduck AS FROM 'myduck_data.csv';
            SELECT COUNT(*) FROM (
              SELECT * FROM source EXCEPT SELECT * FROM myduck
            ) diff;" | tail -n 1 | tee diff_count.txt

          # Verify no differences
          if grep -q '^0$' diff_count.txt; then
            echo 'Initial replication verification successful'
          else
            echo 'Initial replication verification failed'
            exit 1
          fi

      - name: Test replication of new data
        run: |
          # Insert new data in source
          if [ "${{ matrix.source }}" = "mysql" ]; then
            docker exec source-db mysql -uroot -proot test \
              -e "INSERT INTO items VALUES (3, 'test3');"
          else
            docker exec source-db psql -U postgres test \
              -c "INSERT INTO items VALUES (3, 'test3');"
          fi

          # Wait for replication
          sleep 5

          # Verify new data was replicated
          docker exec myduck psql -t -U postgres -h 127.0.0.1 -c \
            "SELECT COUNT(*) FROM items WHERE id = 3;" | tr -d ' ' | tee count.txt

          if grep -q '^1$' count.txt; then
            echo 'Replication of new data verified successfully'
          else
            echo 'Replication of new data verification failed'
            exit 1
          fi

      - name: Cleanup
        if: always()
        run: |
          docker rm -f source-db myduck || true
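
The workflow above waits for MyDuck Server and for replication with fixed `sleep 10` and `sleep 5` delays. One possible alternative, sketched here as an assumption and not part of this PR, is to poll with a timeout. `wait_for` is a hypothetical helper name; the `psql` probe in the usage comment reuses the connection options the workflow already passes to the `myduck` container.

```bash
# Hypothetical helper (not part of this PR): retry a command once per second
# until it succeeds or the given timeout (in seconds) has elapsed.
wait_for() {
  local timeout="$1"
  shift
  local elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Possible use inside a workflow step (assumes the myduck container is running):
#   wait_for 60 docker exec myduck psql -U postgres -h 127.0.0.1 -c "SELECT 1"
```

Polling like this returns as soon as the probe succeeds and fails the step with a clear message otherwise, whereas a fixed sleep can be too short on a slow runner.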
13 changes: 9 additions & 4 deletions README.md
@@ -122,15 +122,17 @@ psql -h 127.0.0.1 -p 15432 -U postgres

We have integrated a setup tool in the Docker image that helps replicate data from your primary (MySQL|Postgres) server to MyDuck Server. The tool is available via the `SETUP_MODE` environment variable. In `REPLICA` mode, the container will start MyDuck Server, dump a snapshot of your primary (MySQL|Postgres) server, and start replicating data in real time.

+> [!NOTE]
+> Supported primary database versions: MySQL>=8.0 and PostgreSQL>=13. In addition to the default settings,
+> logical replication must be enabled for PostgreSQL by setting `wal_level=logical`.
+> For MySQL, GTID-based replication (`gtid_mode=ON` and `enforce_gtid_consistency=ON`) is recommended but not required.

```bash
-docker run \
+docker run -d --name myduck \
     -p 13306:3306 \
     -p 15432:5432 \
-    --privileged \
-    --workdir=/home/admin \
     --env=SETUP_MODE=REPLICA \
     --env=SOURCE_DSN="<postgresql|mysql>://<user>:<password>@<host>:<port>/<dbname>" \
-    --detach=true \
     apecloud/myduckserver:latest
```
`SOURCE_DSN` specifies the connection string to the primary database server, which can be either MySQL or PostgreSQL.
@@ -141,6 +143,9 @@ docker run \
- **PostgreSQL Primary:** Use the `postgres` URI scheme, e.g.,
`--env=SOURCE_DSN=postgres://postgres:<password>@<host>:5432`

+> [!NOTE]
+> To replicate from a server running on the host machine, use `host.docker.internal` as the hostname instead of `localhost` or `127.0.0.1`. On Linux, you must also add `--add-host=host.docker.internal:host-gateway` to the `docker run` command.
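
As an illustration of the note above, the following sketch rewrites a loopback host in a DSN to `host.docker.internal` before the DSN is passed to the container. `to_container_dsn` is a hypothetical helper, not something the image provides, and it assumes a DSN of the form `scheme://user:password@host:port[/dbname]` whose password contains no `@`.

```bash
# Hypothetical helper: replace a loopback host after the '@' with
# host.docker.internal and leave every other host untouched.
to_container_dsn() {
  printf '%s\n' "$1" |
    sed -E 's#@(localhost|127\.0\.0\.1)([:/]|$)#@host.docker.internal\2#'
}

SOURCE_DSN="$(to_container_dsn 'postgres://postgres:secret@127.0.0.1:5432/test')"
echo "$SOURCE_DSN"

# On Linux the container also needs the extra host mapping, for example:
#   docker run -d --name myduck \
#     --add-host=host.docker.internal:host-gateway \
#     -p 13306:3306 -p 15432:5432 \
#     --env=SETUP_MODE=REPLICA \
#     --env=SOURCE_DSN="$SOURCE_DSN" \
#     apecloud/myduckserver:latest
```

The example credentials (`postgres:secret`) are placeholders chosen for the sketch.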

### Connecting to Cloud MySQL & Postgres

MyDuck Server supports setting up replicas from common cloud-based MySQL & Postgres offerings. For more information, please refer to the [replica setup guide](docs/tutorial/replica-setup-rds.md).
6 changes: 3 additions & 3 deletions devtools/replica-setup-mysql/checker.sh
@@ -13,7 +13,7 @@ check_server_params() {
echo "Checking MySQL server parameters..."

# Retrieve the required MySQL server variables using mysqlsh
-result=$(mysqlsh --uri="$SOURCE_DSN" $NO_PASSWORD_OPTION --sql -e "
+result=$(mysqlsh --uri="$SOURCE_DSN" $SOURCE_NO_PASSWORD_OPTION --sql -e "
SHOW VARIABLES WHERE variable_name IN ('binlog_format', 'enforce_gtid_consistency', 'gtid_mode', 'gtid_strict_mode', 'log_bin');
")

@@ -65,7 +65,7 @@ check_user_privileges() {
echo "Checking privileges for the current user '$SOURCE_USER'..."

# Check the user grants for the currently authenticated user using mysqlsh
-result=$(mysqlsh --uri "$SOURCE_DSN" $NO_PASSWORD_OPTION --sql -e "
+result=$(mysqlsh --uri "$SOURCE_DSN" $SOURCE_NO_PASSWORD_OPTION --sql -e "
SHOW GRANTS FOR CURRENT_USER();
")

@@ -98,7 +98,7 @@ check_mysql_config() {
# Function to check if source MySQL server is empty
check_if_source_mysql_is_empty() {
# Run the query using mysqlsh and capture the output
-OUTPUT=$(mysqlsh --uri "$SOURCE_DSN" $NO_PASSWORD_OPTION --sql -e "SHOW DATABASES;" 2>/dev/null)
+OUTPUT=$(mysqlsh --uri "$SOURCE_DSN" $SOURCE_NO_PASSWORD_OPTION --sql -e "SHOW DATABASES;" 2>/dev/null)

check_command "retrieving database list"

15 changes: 5 additions & 10 deletions devtools/replica-setup-mysql/replica_setup.sh
@@ -1,7 +1,7 @@
#!/bin/bash

usage() {
-echo "Usage: $0 --mysql_host <host> --mysql_port <port> --mysql_user <user> --mysql_password <password> [--myduck_host <host>] [--myduck_port <port>] [--myduck_user <user>] [--myduck_password <password>] [--myduck_in_docker <true|false>]"
+echo "Usage: $0 --mysql_host <host> --mysql_port <port> --mysql_user <user> --mysql_password <password> [--myduck_host <host>] [--myduck_port <port>] [--myduck_user <user>] [--myduck_password <password>]"
exit 1
}

@@ -10,7 +10,6 @@ MYDUCK_PORT=${MYDUCK_PORT:-3306}
MYDUCK_USER=${MYDUCK_USER:-root}
MYDUCK_PASSWORD=${MYDUCK_PASSWORD:-}
MYDUCK_SERVER_ID=${MYDUCK_SERVER_ID:-2}
-MYDUCK_IN_DOCKER=${MYDUCK_IN_DOCKER:-false}
GTID_MODE="ON"

while [[ $# -gt 0 ]]; do
@@ -51,22 +50,18 @@ while [[ $# -gt 0 ]]; do
MYDUCK_SERVER_ID="$2"
shift 2
;;
-    --myduck_in_docker)
-        MYDUCK_IN_DOCKER="$2"
-        shift 2
-        ;;
*)
echo "Unknown parameter: $1"
usage
;;
esac
done

-# if MYDUCK_PASSWORD is empty, set NO_PASSWORD_OPTION to "--no-password"
-if [[ -z "$MYDUCK_PASSWORD" ]]; then
-NO_PASSWORD_OPTION="--no-password"
+# if SOURCE_PASSWORD is empty, set SOURCE_NO_PASSWORD_OPTION to "--no-password"
+if [[ -z "$SOURCE_PASSWORD" ]]; then
+SOURCE_NO_PASSWORD_OPTION="--no-password"
else
-NO_PASSWORD_OPTION=""
+SOURCE_NO_PASSWORD_OPTION=""
fi

# Check if all parameters are set
2 changes: 1 addition & 1 deletion devtools/replica-setup-mysql/snapshot.sh
@@ -36,7 +36,7 @@ echo "Thread count set to: $THREAD_COUNT"

echo "Copying data from MySQL to MyDuck..."
# Run mysqlsh command and capture the output
-output=$(mysqlsh --uri "$SOURCE_DSN" $NO_PASSWORD_OPTION -- util copy-instance "mysql://${MYDUCK_USER}:${MYDUCK_PASSWORD}@${MYDUCK_HOST}:${MYDUCK_PORT}" --users false --consistent false --ignore-existing-objects true --handle-grant-errors ignore --threads $THREAD_COUNT --bytesPerChunk 256M --ignore-version true)
+output=$(mysqlsh --uri "$SOURCE_DSN" $SOURCE_NO_PASSWORD_OPTION -- util copy-instance "mysql://${MYDUCK_USER}:${MYDUCK_PASSWORD}@${MYDUCK_HOST}:${MYDUCK_PORT}" --users false --consistent false --ignore-existing-objects true --handle-grant-errors ignore --threads $THREAD_COUNT --bytesPerChunk 256M --ignore-version true)

if [[ $GTID_MODE == "ON" ]]; then
# Extract the EXECUTED_GTID_SET from this output:
7 changes: 0 additions & 7 deletions devtools/replica-setup-mysql/start_replication.sh
@@ -11,13 +11,6 @@ OS=$(uname -s)
# fi
# fi

-if [[ "${MYDUCK_IN_DOCKER}" == "true" && "$OS" == "Darwin" &&
-("${SOURCE_HOST}" == "127.0.0.1" || "${SOURCE_HOST}" == "localhost" || "${SOURCE_HOST}" == "0.0.0.0") ]]; then
-SOURCE_HOST_FOR_REPLICA="host.docker.internal"
-else
-SOURCE_HOST_FOR_REPLICA="${SOURCE_HOST}"
-fi

# Use the EXECUTED_GTID_SET variable from the previous steps
if [ $GTID_MODE == "ON" ] && [ ! -z "$EXECUTED_GTID_SET" ]; then
mysqlsh --sql --host=${MYDUCK_HOST} --port=${MYDUCK_PORT} --user=root --no-password <<EOF
17 changes: 9 additions & 8 deletions docker/Dockerfile
@@ -63,15 +63,16 @@ RUN pip install --no-cache-dir "sqlglot[rs]" --break-system-packages

# Install mysql-shell
RUN if [ "$TARGETARCH" = "arm64" ]; then \
-ARCH="arm"; \
+curl -LJO https://dev.mysql.com/get/Downloads/MySQL-Shell/mysql-shell-9.1.0-linux-glibc2.28-arm-64bit.tar.gz \
+&& tar -zxf mysql-shell-9.1.0-linux-glibc2.28-arm-64bit.tar.gz \
+&& rm mysql-shell-9.1.0-linux-glibc2.28-arm-64bit.tar.gz \
+&& mv mysql-shell-9.1.0-linux-glibc2.28-arm-64bit /usr/local/mysqlsh \
+&& ln -s /usr/local/mysqlsh/bin/mysqlsh /usr/local/bin/mysqlsh; \
else \
-ARCH="x86"; \
-fi && \
-curl -LJO https://dev.mysql.com/get/Downloads/MySQL-Shell/mysql-shell-9.1.0-linux-glibc2.28-${ARCH}-64bit.tar.gz \
-&& tar -zxvf mysql-shell-9.1.0-linux-glibc2.28-${ARCH}-64bit.tar.gz \
-&& rm mysql-shell-9.1.0-linux-glibc2.28-${ARCH}-64bit.tar.gz \
-&& mv mysql-shell-9.1.0-linux-glibc2.28-${ARCH}-64bit /usr/local/mysqlsh \
-&& ln -s /usr/local/mysqlsh/bin/mysqlsh /usr/local/bin/mysqlsh \
+curl -LJO https://dev.mysql.com/get/Downloads/MySQL-Shell/mysql-shell_9.1.0-1debian12_amd64.deb \
+&& apt install -y ./mysql-shell_9.1.0-1debian12_amd64.deb \
+&& rm mysql-shell_9.1.0-1debian12_amd64.deb; \
+fi \
&& mysqlsh --version

# Dynamic DuckDB CLI download based on architecture