[434] Allow to filter jobs in ZyteJobsComparisonMonitor by close_reason #440
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff             @@
##           master     #440      +/-   ##
==========================================
+ Coverage   79.51%   79.54%   +0.03%
==========================================
  Files          76       76
  Lines        3237     3242       +5
  Branches      537      539       +2
==========================================
+ Hits         2574     2579       +5
  Misses        593      593
  Partials       70       70

View full report in Codecov by Sentry.
Changes look good to me!
We should add some unit tests to cover the new addition, and it would be nice to run a test job in Scrapy Cloud to verify these changes (installing Spidermon in requirements.txt from git+https://github.com/shafiq-muhammad/spidermon@shafiq-434 should get you these changes). After that's addressed we should be good to go 👍
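For reference, the corresponding requirements.txt line (using the branch mentioned above) would be:

```
git+https://github.com/shafiq-muhammad/spidermon@shafiq-434
```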
Just a quick FYI @shafiq-muhammad @curita, I believe the pipeline is failing due to a regression from pytest 8.1.1 to 8.2.0. My validation was:
Hope this helps (and is correct regardless of environment).
Yes, it looks like the pytest 8.2.0 release has caused a lot of trouble, according to their issues page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good to me, though we might want to change how we fetch the jobs, depending on whether we want to make fewer API calls. I'm inclined toward fewer API calls, but let's wait for Victor's opinion.
Other than that, I suggested a small documentation addition, which should be quick to include.
if len(current_jobs) < MAX_API_COUNT or len(total_jobs) == number_of_jobs:
    for job in current_jobs:
thought: This makes sense since it seems we can't filter by close_reason in client.spider.jobs.list().
I wonder if it makes sense to retrieve MAX_API_COUNT jobs each time from the API and append to total_jobs the jobs that meet the criteria until we reach number_of_jobs. This should result in fewer API calls, but those calls will be larger. What do you think, @VMRuiz?
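The strategy suggested above can be sketched roughly as follows. This is a hypothetical illustration, not Spidermon's actual implementation: the names `client`, `MAX_API_COUNT`, `fetch_jobs_by_close_reason`, and the dict-shaped job records are all assumptions made for the example.

```python
MAX_API_COUNT = 1000  # assumed per-call limit of the jobs API


def fetch_jobs_by_close_reason(client, number_of_jobs, close_reasons):
    """Collect up to number_of_jobs jobs whose close_reason matches.

    Since the jobs listing endpoint seemingly can't filter by
    close_reason server-side, each page is filtered client-side.
    """
    total_jobs = []
    start = 0
    while len(total_jobs) < number_of_jobs:
        # Fetch one full page of jobs per API call.
        current_jobs = client.spider.jobs.list(start=start, count=MAX_API_COUNT)
        for job in current_jobs:
            if job.get("close_reason") in close_reasons:
                total_jobs.append(job)
                if len(total_jobs) == number_of_jobs:
                    break
        if len(current_jobs) < MAX_API_COUNT:
            # The API returned a short page: no more jobs to paginate over.
            break
        start += len(current_jobs)
    return total_jobs
```

This trades fewer, larger API calls for some over-fetching when matching jobs are dense, which is the trade-off discussed below.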
If the default close reason is finished, I would keep it as it is, as the most common scenario will be fetching the last few jobs, and those will have the finished status. Even in the case where a second call is needed (due to a few unexpected failed jobs in between), it could still be faster for lower volumes than making one very big call.
On the contrary, if it's expected that uncommon close reasons like timeout are used, then it makes sense to fetch as many jobs as possible per call, as matching jobs will be few or even none.
Looks ready to me!
Added a new setting, SPIDERMON_JOBS_COMPARISON_CLOSE_REASONS, to allow ZyteJobsComparisonMonitor to filter Scrapy Cloud jobs by their close_reason stat.
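As a sketch, enabling the setting in a project's settings.py might look like this; the tuple value shown is an illustrative example, not a documented default:

```python
# settings.py: only compare against jobs that closed with these reasons.
# The value here is illustrative; "finished" is Scrapy's usual close
# reason for a successful run.
SPIDERMON_JOBS_COMPARISON_CLOSE_REASONS = ("finished",)
```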