Enable writing new delta tables #54

Open
wants to merge 3 commits into
base: main

Conversation

SSchotten
Contributor

This PR should address the current error:

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/dq_suite/common.py:198, in merge_df_with_unity_table(df, catalog_name, table_name, table_merge_id, df_merge_id, merge_dict, spark_session)
    194 full_table_name = get_full_table_name(
    195     catalog_name=catalog_name, table_name=table_name
    196 )
    197 df_alias = f'{table_name}_df'
--> 198 regelTabel = DeltaTable.forName(spark_session, full_table_name)

....

AnalysisException: [DELTA_MISSING_DELTA_TABLE] `data_quality`.`brondataset` is not a Delta table.

df.write.mode(mode).option("overwriteSchema", "true").saveAsTable(
    full_table_name
)  # TODO: write as delta-table? .format("delta")
df.write.format("delta").mode(mode).option(
Contributor

I understand the need, but this will become a tricky point once we want to centralize the storage of results. I'm curious to hear your thoughts on this. Shall we meet sometime?

Contributor Author

Yes, let's discuss this next week. I'm curious where you see difficulties.

Contributor Author

I'll look into whether we can capture the delta-spark functionality in the PySpark API. Then we could do without the bit of delta-spark code, which makes the whole thing simpler.
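
As a rough illustration of that idea, the merge could be expressed as a SQL `MERGE INTO` statement run through `spark_session.sql(...)` instead of `DeltaTable.forName(...)`. This is only a sketch under assumptions: `build_merge_sql`, `merge_df_with_table_sql` and the view name `dq_merge_source` are hypothetical and not part of dq_suite, `merge_dict` is assumed to map target column names to source column names, and only an update-on-match is shown. The target table would still have to be a Delta table for `MERGE INTO` to work.

```python
from typing import Dict


def build_merge_sql(
    full_table_name: str,
    source_view: str,
    table_merge_id: str,
    df_merge_id: str,
    merge_dict: Dict[str, str],
) -> str:
    """Build a MERGE INTO statement (hypothetical helper, not in dq_suite)."""
    # Assumption: merge_dict maps target column name -> source column name.
    set_clause = ", ".join(
        f"target.`{target_col}` = source.`{source_col}`"
        for target_col, source_col in merge_dict.items()
    )
    return (
        f"MERGE INTO {full_table_name} AS target "
        f"USING {source_view} AS source "
        f"ON target.`{table_merge_id}` = source.`{df_merge_id}` "
        f"WHEN MATCHED THEN UPDATE SET {set_clause}"
    )


def merge_df_with_table_sql(
    df,
    full_table_name: str,
    table_merge_id: str,
    df_merge_id: str,
    merge_dict: Dict[str, str],
    spark_session,
    source_view: str = "dq_merge_source",  # hypothetical temp view name
) -> None:
    """Run the merge through plain PySpark, without importing delta-spark."""
    # Expose the DataFrame to SQL under a temporary view name.
    df.createOrReplaceTempView(source_view)
    spark_session.sql(
        build_merge_sql(
            full_table_name=full_table_name,
            source_view=source_view,
            table_merge_id=table_merge_id,
            df_merge_id=df_merge_id,
            merge_dict=merge_dict,
        )
    )
```

Keeping the SQL construction in its own function means the statement can be unit-tested without a Spark session; only `merge_df_with_table_sql` needs a running cluster.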


2 participants