Add documentation for resuming migrations
julienrf committed Aug 22, 2024
1 parent 2a7715d commit b1c4589
Showing 4 changed files with 18 additions and 3 deletions.
6 changes: 3 additions & 3 deletions docs/source/configuration.rst
@@ -334,15 +334,15 @@ The optional ``renames`` property lists the item columns to rename along the mig
Savepoints
----------

When migrating data from Apache Cassandra or DynamoDB, the migrator is able to resume an interrupted migration. To achieve this, it stores so-called “savepoints” along the process to remember which data items have already been migrated and should be skipped when the migration is restarted.
When migrating data from Apache Cassandra or DynamoDB, the migrator is able to :doc:`resume an interrupted migration </resume-interrupted-migration>`. To achieve this, it stores so-called “savepoints” along the process to remember which data items have already been migrated and should be skipped when the migration is restarted.

.. code-block:: yaml
savepoints:
# Whe should savepoint configurations be stored? This is a path on the host running
# Where should savepoint configurations be stored? This is a path on the host running
# the Spark driver - usually the Spark master.
path: /app/savepoints
# Interval in which savepoints will be created
# Interval at which savepoints will be created
intervalSeconds: 300
----------
1 change: 1 addition & 0 deletions docs/source/getting-started/index.rst
@@ -60,6 +60,7 @@ You might also be interested in the following extra features:

* :doc:`rename columns along the migration </rename-columns>`,
* :doc:`replicate changes applied to the source table after the initial snapshot transfer has completed </stream-changes>`,
* :doc:`resume an interrupted migration where it left off </resume-interrupted-migration>`,
* :doc:`validate that the migration was complete and correct </validate>`.

.. toctree::
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -31,6 +31,7 @@ Migrator Spark Scala
migrate-from-dynamodb
stream-changes
rename-columns
resume-interrupted-migration
validate
configuration
tutorials/index
13 changes: 13 additions & 0 deletions docs/source/resume-interrupted-migration.rst
@@ -0,0 +1,13 @@
=================================================
Resume an Interrupted Migration Where it Left Off
=================================================

.. note:: This feature is currently supported only when migrating from Apache Cassandra or DynamoDB.

If the migration is interrupted (e.g., because of a networking issue, or because you had to stop it manually), the migrator is able to resume it from a “savepoint”.

Savepoints are configuration files that record which items have already been migrated, so that those items can be skipped when the migration is resumed. The migrator generates the savepoint files automatically during the migration. To use a savepoint, start a migration with it as the configuration file.

You can control where the savepoints are stored and the interval at which they are generated in the configuration file, under the top-level property ``savepoints``. See `the corresponding section of the configuration reference </configuration#savepoints>`_.

During the migration, the savepoints are written with file names like ``savepoint_xxx.yaml``, where ``xxx`` is a timestamp such as ``1234567890``. To resume a migration, start a new migration with the latest savepoint as the configuration file.
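
Picking the latest savepoint by hand can be error-prone when the directory holds many files. The following is a minimal sketch of a helper script, not part of the migrator itself, that selects the savepoint with the highest timestamp. The function name ``latest_savepoint`` is hypothetical, the directory ``/app/savepoints`` is only the example value from the configuration reference, and the script assumes that the ``xxx`` part of the file name is a purely numeric timestamp.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: savepoint files are named savepoint_<numeric timestamp>.yaml,
# as described in the text above.
SAVEPOINT_PATTERN = re.compile(r"^savepoint_(\d+)\.yaml$")


def latest_savepoint(directory: str) -> Optional[Path]:
    """Return the savepoint file with the highest timestamp, or None.

    Hypothetical helper for illustration; it is not provided by the migrator.
    """
    root = Path(directory)
    if not root.is_dir():
        return None

    best_timestamp = -1
    best_path: Optional[Path] = None
    for entry in root.iterdir():
        match = SAVEPOINT_PATTERN.match(entry.name)
        if match is None or not entry.is_file():
            continue  # ignore anything that is not a savepoint file
        # Compare timestamps as integers rather than as strings, so that
        # values with different digit counts are ordered correctly.
        timestamp = int(match.group(1))
        if timestamp > best_timestamp:
            best_timestamp = timestamp
            best_path = entry
    return best_path


if __name__ == "__main__":
    # Assumption: this matches the ``path`` set under ``savepoints`` in your
    # configuration file.
    savepoint = latest_savepoint("/app/savepoints")
    if savepoint is None:
        print("No savepoint found; start the migration from the original configuration.")
    else:
        print(f"Resume the migration using {savepoint} as the configuration file.")
```

The script only reports which file to use. Passing that file to the migrator works the same way as passing any other configuration file.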
