cloudflare · agents-git-bot · Jan 14, 2026
@@ -10,10 +10,17 @@ import { FileTree } from "~/components"
 import { Tabs, TabItem } from "~/components"
 import { InlineBadge } from "~/components";
 
-## Deleting data in R2 Data Catalog
-
 Deleting data from R2 Data Catalog or any Apache Iceberg catalog requires that operations are done in a transaction through the catalog itself. Manually deleting metadata or data files directly can lead to data catalog corruption.
 
+## Automatic table maintenance
+R2 Data Catalog can automatically manage table maintenance operations such as snapshot expiration and compaction. These continuous operations help keep latency and storage costs down.
+ - **Snapshot expiration**: Automatically removes old snapshots. This reduces metadata overhead. Data files are not removed until orphan file removal is run.
+ - **Compaction**: Merges small data files into larger ones. This optimizes read performance and reduces the number of files read during queries.
+
+ If you do not enable automatic maintenance, you must handle these operations manually.
+
+ Learn more in the [table maintenance](/r2/data-catalog/table-maintenance/) documentation.
+
 ## Examples of enabling automatic table maintenance in R2 Data Catalog
 ```bash
 # Enable automatic snapshot expiration for entire catalog
@@ -25,9 +32,13 @@ npx wrangler r2 bucket catalog snapshot-expiration enable my-bucket \
 npx wrangler r2 bucket catalog compaction enable my-bucket \
 	--target-size 256
 ```
-More information can be found in the [table maintenance](/r2/data-catalog/table-maintenance/) and [manage catalogs](/r2/data-catalog/manage-catalogs/) documentation.
+Refer to additional examples in the [manage catalogs](/r2/data-catalog/manage-catalogs/) documentation.
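If the same catalog-level settings need to be applied to several buckets, the commands can be assembled in a small script. The following is a minimal sketch rather than an official tool: the bucket names are hypothetical, `--target-size 256` is taken from the example above, and using `--retain-last` at the catalog level is an assumption based on the table-level example elsewhere in these docs. It only builds and prints the commands.

```python
import shlex

# Common prefix shared by the catalog maintenance commands shown above.
WRANGLER = ["npx", "wrangler", "r2", "bucket", "catalog"]


def compaction_cmd(bucket: str, target_size: int) -> list[str]:
    """Build the argv for enabling catalog-level compaction on one bucket."""
    return WRANGLER + ["compaction", "enable", bucket, "--target-size", str(target_size)]


def snapshot_expiration_cmd(bucket: str, retain_last: int) -> list[str]:
    """Build the argv for enabling snapshot expiration on one bucket.

    Assumption: --retain-last is accepted at the catalog level.
    """
    return WRANGLER + ["snapshot-expiration", "enable", bucket, "--retain-last", str(retain_last)]


# Hypothetical bucket names, for illustration only.
buckets = ["my-bucket", "my-other-bucket"]

commands = []
for bucket in buckets:
    commands.append(compaction_cmd(bucket, target_size=256))
    commands.append(snapshot_expiration_cmd(bucket, retain_last=5))

# Print the commands for review instead of executing them.
for cmd in commands:
    print(shlex.join(cmd))
```

To execute the commands instead of printing them, each list could be passed to `subprocess.run(cmd, check=True)` once the flags have been checked against the Wrangler version in use.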
 
-## Examples of deleting data from R2 Data Catalog using PySpark
+## Manually deleting and removing data
+You may need to delete data manually to:
+ - Comply with data retention requirements under regulations such as GDPR or CCPA.
+ - Delete rows selectively using conditional logic.
+ - Remove stale or unreferenced files that R2 Data Catalog does not manage.
 
 The following are basic examples using PySpark, but similar operations can be performed using other Iceberg-compatible engines. To configure PySpark, refer to our [example](/r2/data-catalog/config-examples/spark-python/) or the official [PySpark documentation](https://spark.apache.org/docs/latest/api/python/getting_started/index.html).
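As a rough illustration of the three manual cases listed above, the sketch below builds the SQL statements that a PySpark session could run through `spark.sql(...)`. The catalog, namespace, table, and column names are hypothetical. It assumes a Spark session already configured with an Iceberg catalog and the Iceberg SQL extensions. `expire_snapshots` and `remove_orphan_files` are Iceberg's standard Spark procedures; confirm that your engine and catalog support them before relying on this.

```python
from datetime import datetime

# Hypothetical identifiers, for illustration only.
CATALOG = "my_catalog"
TABLE = "my_namespace.events"


def delete_rows_sql(table: str, predicate: str) -> str:
    """Row-level delete, committed as a transaction through the catalog."""
    return f"DELETE FROM {table} WHERE {predicate}"


def expire_snapshots_sql(catalog: str, table: str, older_than: datetime, retain_last: int) -> str:
    """Call Iceberg's expire_snapshots procedure with named arguments."""
    ts = older_than.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', "
        f"older_than => TIMESTAMP '{ts}', "
        f"retain_last => {retain_last})"
    )


def remove_orphan_files_sql(catalog: str, table: str) -> str:
    """Call Iceberg's remove_orphan_files procedure for unreferenced files."""
    return f"CALL {catalog}.system.remove_orphan_files(table => '{table}')"


cutoff = datetime(2026, 1, 1)  # interpreted in the Spark session time zone
statements = [
    delete_rows_sql(f"{CATALOG}.{TABLE}", "user_id = 'abc123'"),
    expire_snapshots_sql(CATALOG, TABLE, cutoff, retain_last=5),
    remove_orphan_files_sql(CATALOG, TABLE),
]

for stmt in statements:
    print(stmt)

# With a configured session, each statement would be run as: spark.sql(stmt)
```

Building the statements separately from running them makes it easy to review exactly what will be deleted before anything is committed to the table.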
 

@@ -82,7 +82,11 @@ npx wrangler r2 bucket catalog disable <BUCKET_NAME>
 ## Enable compaction
 
 Compaction improves query performance by combining the many small files created during data ingestion into fewer, larger files according to the set `target file size`. For more information about compaction and why it's valuable, refer to [About compaction](/r2/data-catalog/table-maintenance/).
+:::note[API token permission requirements]
+Table maintenance operations such as compaction and snapshot expiration require a Cloudflare API token with both R2 storage and R2 Data Catalog read/write permissions to act as a service credential.
 
+Refer to [Authenticate your Iceberg engine](#authenticate-your-iceberg-engine) for details on creating a token with the required permissions.
+:::
 <Tabs syncKey='CLIvDash'>
 <TabItem label='Dashboard'>
 
@@ -120,12 +124,6 @@ npx wrangler r2 bucket catalog compaction enable <BUCKET_NAME> <NAMESPACE> <TABL
 </TabItem>
 </Tabs>
 
-:::note[API token permission requirements]
-Compaction requires a Cloudflare API token with both R2 storage and R2 Data Catalog read/write permissions to act as a service credential. The compaction process uses this token to read files, combine them, and update table metadata.
-
-Refer to [Authenticate your Iceberg engine](#authenticate-your-iceberg-engine) for details on creating a token with the required permissions.
-:::
-
 Once enabled, compaction applies retroactively to all existing tables (for catalog-level compaction) or the specified table (for table-level compaction). During open beta, we currently compact up to 2 GB worth of files once per hour for each table.
 
 ## Disable compaction
@@ -165,6 +163,10 @@ npx wrangler r2 bucket catalog compaction disable <BUCKET_NAME> <NAMESPACE> <TAB
 
 Snapshot expiration automatically removes old table snapshots to reduce metadata bloat and storage costs. For more information about snapshot expiration and why it is valuable, refer to [Table maintenance](/r2/data-catalog/table-maintenance/).
 
+:::note
+Snapshot expiration commands are available as of Wrangler version 4.56.0.
+:::
+
 To enable snapshot expiration on your catalog, run the [`r2 bucket catalog snapshot-expiration enable` command](/workers/wrangler/commands/#r2-bucket-catalog-snapshot-expiration-enable):
 
 ```bash
@@ -180,12 +182,6 @@ npx wrangler r2 bucket catalog snapshot-expiration enable <BUCKET_NAME> <NAMESPA
   --retain-last 5
 ```
 
-:::note[API token permission requirements]
-Catalog-level snapshot expiration requires a Cloudflare API token with both R2 storage and R2 Data Catalog read/write permissions to act as a service credential. The snapshot expiration process uses this token to update table metadata and remove old snapshots.
-
-Refer to [Authenticate your Iceberg engine](#authenticate-your-iceberg-engine) for details on creating a token with the required permissions.
-:::
-
 ## Disable snapshot expiration
 
 Disabling snapshot expiration prevents the process from running for all tables (catalog level) or a specific table (table level). You can re-enable snapshot expiration at any time.