-
Notifications
You must be signed in to change notification settings - Fork 2k
fix: fail gracefully when subscription exports take too long #39132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
pytestmark = [pytest.mark.asyncio, pytest.mark.django_db(transaction=True)] | ||
|
||
|
||
@pytest_asyncio.fixture(autouse=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mocking this because export_asset_direct
makes the tests run for 2 min (vs 14s)
============================= test session starts ==============================
PASSED [ 12%]
PASSED [ 25%]
ee/tasks/test/subscriptions/test_generate_assets_async.py::test_generate_assets_async_excludes_deleted_insights PASSED [ 37%]
ee/tasks/test/subscriptions/test_generate_assets_async.py::test_generate_assets_async_raises_if_missing_resource PASSED [ 50%]
PASSED [ 62%]
ee/tasks/test/subscriptions/test_generate_assets_async.py::test_generate_assets_async_handles_empty_dashboard PASSED [ 75%]
PASSED [ 87%]
PASSED [100%]
=============================== inline snapshot ================================
======================== 8 passed in 160.56s (0:02:40) =========================
we already have tests for export_asset
which calls export_asset_direct
in test_exporter.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (2)
-
ee/tasks/test/subscriptions/test_generate_assets_async.py
, line 193-194 (link)style: Consider using
list()
directly instead of lambda wrapper for better readability -
posthog/temporal/subscriptions/subscription_scheduling_workflow.py
, line 197 (link)logic: Inconsistent timeout configuration. This workflow still uses the old calculation pattern while ScheduleAllSubscriptionsWorkflow was updated to use TEMPORAL_TASK_TIMEOUT_MINUTES. Should use the same timeout setting for consistency.
6 files reviewed, 3 comments
@pytest.mark.django_db(transaction=True) | ||
@patch("ee.tasks.subscriptions.subscription_utils.exporter.export_asset_direct") | ||
async def test_async_generate_assets_basic(mock_export: MagicMock, team, user) -> None: | ||
from asgiref.sync import sync_to_async |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Consider moving the sync_to_async
import to the top of the file since it's used in all three async tests
Prompt To Fix With AI
This is a comment left during a code review.
Path: ee/tasks/test/subscriptions/test_subscriptions_utils.py
Line: 100:100
Comment:
**style:** Consider moving the `sync_to_async` import to the top of the file since it's used in all three async tests
How can I resolve this? If you propose a fix, please make it concise.
Problem
Currently, some subscriptions fail to send because the asset export takes too long and Temporal times out. We have configured the Temporal
start_to_close_timeout
to be 15 min, but the asset export can take longer if the underlying queries are slow.For example, in attempt=3 for this workflow run: https://grafana.prod-us.posthog.dev/goto/95mPwU3HR?orgId=1
5 of the 6 the
exporting_asset
logs (emitted when the export is started) have a correspondingasset_exported
log. Asset 2401337 does not have aasset_exported
log. 2401337's attempt started at 2025-10-03 15:25:36.305979 and the lastasset_exported
log was 2025-10-03 15:38:00.759231 for the 5th asset. The timeout was at 2025-10-03 UTC 15:40:36.28 (Temporal logs).My assumption here is that exporting 2401337 took longer than 2min and did not finish before 15:40 thus the Temporal start_to_close_timeout killed the activity.
#39055
Changes
start_to_close_timeout
.How did you test this code?
Unit tests
👉 Stay up-to-date with PostHog coding conventions for a smoother review.
Changelog: (features only) Is this feature complete?