--- a/website/docs/docs/build/snapshots.md
+++ b/website/docs/docs/build/snapshots.md
@@ -390,29 +390,6 @@ snapshots:
 
 </VersionBlock>
 
-## Snapshot query best practices
-
-This section outlines some best practices for writing snapshot queries:
-
-- #### Snapshot source data
-Your models should then select from these snapshots, treating them like regular data sources. As much as possible, snapshot your source data in its raw form and use downstream models to clean up the data
-
-- #### Use the `source` function in your query
-This helps when understanding <Term id="data-lineage">data lineage</Term> in your project.
-
-- #### Include as many columns as possible
-In fact, go for `select *` if performance permits! Even if a column doesn't feel useful at the moment, it might be better to snapshot it in case it becomes useful – after all, you won't be able to recreate the column later.
-
-- #### Avoid joins in your snapshot query
-Joins can make it difficult to build a reliable `updated_at` timestamp. Instead, snapshot the two tables separately, and join them in downstream models.
-
-- #### Limit the amount of transformation in your query
-If you apply business logic in a snapshot query, and this logic changes in the future, it can be impossible (or, at least, very difficult) to apply the change in logic to your snapshots.
-
-Basically – keep your query as simple as possible! Some reasonable exceptions to these recommendations include:
-* Selecting specific columns if the table is wide.
-* Doing light transformation to get data into a reasonable shape, for example, unpacking a <Term id="json" /> blob to flatten your source data into columns.
-
 ## Snapshot meta-fields
 
 Snapshot <Term id="table">tables</Term> will be created as a clone of your source dataset, plus some additional meta-fields*.