From 813a2eac4a7c9412a79222342beaeb58d411f313 Mon Sep 17 00:00:00 2001
From: Julien Richard-Foy
Date: Sun, 11 Aug 2024 15:59:29 +0200
Subject: [PATCH] Add documentation for resuming migrations

---
 docs/source/configuration.rst                |  6 +++---
 docs/source/getting-started/index.rst        |  1 +
 docs/source/index.rst                        |  1 +
 docs/source/resume-interrupted-migration.rst | 15 +++++++++++++++
 4 files changed, 20 insertions(+), 3 deletions(-)
 create mode 100644 docs/source/resume-interrupted-migration.rst

diff --git a/docs/source/configuration.rst b/docs/source/configuration.rst
index bf1370de..38a3aeae 100644
--- a/docs/source/configuration.rst
+++ b/docs/source/configuration.rst
@@ -334,15 +334,15 @@ The optional ``renames`` property lists the item columns to rename along the mig
 Savepoints
 ----------
 
-When migrating data from Apache Cassandra or DynamoDB, the migrator is able to resume an interrupted migration. To achieve this, it stores so-called “savepoints” along the process to remember which data items have already been migrated and should be skipped when the migration is restarted.
+When migrating data from Apache Cassandra or DynamoDB, the migrator is able to :doc:`resume an interrupted migration `. To achieve this, it stores so-called “savepoints” along the process to remember which data items have already been migrated and should be skipped when the migration is restarted.
 
 .. code-block:: yaml
 
   savepoints:
-    # Whe should savepoint configurations be stored? This is a path on the host running
+    # Where should savepoint configurations be stored? This is a path on the host running
     # the Spark driver - usually the Spark master.
     path: /app/savepoints
-    # Interval in which savepoints will be created
+    # Interval at which savepoints will be created
     intervalSeconds: 300
 
 ----------
diff --git a/docs/source/getting-started/index.rst b/docs/source/getting-started/index.rst
index 875d8209..ba91edb8 100644
--- a/docs/source/getting-started/index.rst
+++ b/docs/source/getting-started/index.rst
@@ -56,6 +56,7 @@ You might also be interested in the following extra features:
 
 * :doc:`rename columns along the migration `,
 * :doc:`replicate changes applied to the source table after the initial snapshot transfer has completed `,
+* :doc:`resume an interrupted migration where it left off `,
 * :doc:`validate that the migration was complete and correct `.
 
 .. toctree::
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 4e26d901..06405734 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -31,5 +31,6 @@ Migrator Spark Scala
    migrate-from-dynamodb
    stream-changes
    rename-columns
+   resume-interrupted-migration
    validate
    configuration
diff --git a/docs/source/resume-interrupted-migration.rst b/docs/source/resume-interrupted-migration.rst
new file mode 100644
index 00000000..4803c50f
--- /dev/null
+++ b/docs/source/resume-interrupted-migration.rst
@@ -0,0 +1,15 @@
+=================================================
+Resume an Interrupted Migration Where it Left Off
+=================================================
+
+.. note:: This feature is currently supported only when migrating from Apache Cassandra or DynamoDB.
+
+If the migration is interrupted (e.g., because of a networking issue, or because you need to stop it manually), the migrator is able to resume it from a “savepoint”.
+
+Savepoints are configuration files that contain information about the already migrated items, which can be skipped when the migration is resumed. The savepoint files are automatically generated during the migration. To use a savepoint, start a migration using it as the configuration file.
+
+You can control the savepoint location and the interval at which savepoints are generated in the configuration file under the top-level property ``savepoints``. See `the corresponding section of the configuration reference `_.
+
+During the migration, the savepoints are generated with file names like ``savepoint_xxx.yaml``, where ``xxx`` is a timestamp such as ``1234567890``. To resume a migration, start a new migration with the latest savepoint as the configuration file.
+
+.. caution:: When migrating from DynamoDB, this feature works only if no items are added to or deleted from the table during the migration (i.e., it works in “cold migration” scenarios only).
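
The new page tells the reader to restart with "the latest savepoint" but leaves the selection to them. A minimal sketch of that step follows, assuming the files really are named ``savepoint_<digits>.yaml`` as in the example above and that a larger timestamp means a more recent savepoint. The helper name ``latest_savepoint`` is hypothetical and is not part of the migrator.

```python
import re
from pathlib import Path

# Assumed naming scheme, taken from the documentation text:
# savepoint_<timestamp>.yaml, where <timestamp> is made of digits only.
SAVEPOINT_NAME = re.compile(r"^savepoint_(\d+)\.yaml$")


def latest_savepoint(directory):
    """Return the savepoint file with the highest timestamp, or None if there is none.

    Hypothetical helper: it only inspects file names and never reads the files.
    """
    candidates = []
    for entry in Path(directory).iterdir():
        match = SAVEPOINT_NAME.match(entry.name)
        if match and entry.is_file():
            # Compare timestamps as integers so that values of different
            # lengths are still ordered correctly.
            candidates.append((int(match.group(1)), entry))
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]
```

For example, ``latest_savepoint("/app/savepoints")`` would return the path to pass as the configuration file of the resumed migration, where ``/app/savepoints`` is the ``path`` value from the configuration sample in this patch.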