Implications of merging Advanced Crawling Add-on to core? #774
-
It would make sense to merge in the parts you mentioned. But the add-on as a whole is unstable and needs more refinement and testing. There are a lot of performance trade-offs, as an optimization for one site tends to slow down other sites that don't benefit from the optimization. It's still very experimental.
-
I'm about ready to MergeAllTheThings back together, including add-ons. There hasn't been much interest from people in building 3rd-party add-ons, so it's just extra overhead for me at the moment. As long as we're not blocking the extensible functions in core, the impact for users should be net beneficial. Regarding the core functions of Detect, Crawl, Post Process, and Deploy, what are the variables we need to allow configuration for? I'll start thinking:
Detect
Crawling
Post processing
Deploy
What other variables do you think we need to make configurable?
-
The config values in Advanced Crawling are:
Detect
Crawl
Post Process
-
Some of the functionality in Advanced Crawling just isn't ready to get locked in. The main thing is that we're adding a crawled_time column to the URLs table, which causes extra writes, since crawl_cache also has that column. The problem is that there's no cheap way to cross-reference the URLs table with the crawl cache. The most efficient solution that I can think of is to combine the two tables, and I think we should give that serious consideration since it would eliminate a lot of DB load. Failing that, I'd still like to test approaches other than the current monkey-patching.
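To make the combined-table idea concrete, here is a rough sketch in Python with SQLite. It is only an illustration, not the plugin's actual schema or code: apart from crawled_time, every column name and helper function is hypothetical. The point is that once detection and crawl state live in one row, recording a crawl is a single write and finding stale URLs needs no cross-reference between tables.

```python
import sqlite3


def make_db():
    """Create an in-memory DB with one combined table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE urls (
            url TEXT PRIMARY KEY,
            crawled_time TEXT,  -- NULL until the URL has been crawled
            page_hash TEXT      -- hypothetical stand-in for crawl cache data
        )
        """
    )
    return conn


def add_detected_url(conn, url):
    # Detection only inserts the row; re-detecting a known URL is a no-op.
    conn.execute("INSERT OR IGNORE INTO urls (url) VALUES (?)", (url,))


def record_crawl(conn, url, page_hash, crawled_time):
    # One write updates both the crawl timestamp and the cached hash.
    conn.execute(
        "UPDATE urls SET crawled_time = ?, page_hash = ? WHERE url = ?",
        (crawled_time, page_hash, url),
    )


def urls_needing_crawl(conn, cutoff):
    # Never-crawled or stale URLs come from a single-table query, no join.
    rows = conn.execute(
        "SELECT url FROM urls "
        "WHERE crawled_time IS NULL OR crawled_time < ? "
        "ORDER BY url",
        (cutoff,),
    )
    return [row[0] for row in rows]
```

Timestamps here are ISO 8601 strings so that plain string comparison orders them correctly. A real implementation would use whatever date type the existing tables already use.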
-
@john-shaffer should make life easier.
Scanning through code, looks like it will replace the Sitemap parsing in core? That would be nice. HTML parser looks elegant, too.
This all makes more sense to you since you wrote it. Do you feel keen to do the merging? I'll go work on something else for a bit to avoid conflicts.