Excludables functionality causes nothing to be extracted except the main page. #281

Closed
linuxlurak opened this issue Nov 26, 2024 · 3 comments
Labels: bug (Something isn't working)

Comments

@linuxlurak

Hi

This part of the code of src/tasks/class-ss-fetch-urls-task.php

			$excludable = apply_filters( 'ss_find_excludable', $this->find_excludable( $static_page ), $static_page );
			if ( $excludable !== false ) {
				$save_file   = false;
				$follow_urls = false;
				Util::debug_log( "Excludable found: URL: " . $static_page->url );

results in no pages being crawled apart from the main page. None of the pages is marked as no-follow or no-save. I deactivated all other plugins to check whether a third-party plugin causes the problem; none of them does.

The logs show this for example:
[2024-11-26 09:03:45] [class-ss-fetch-urls-task.php:79] Excludable found: URL: https://[CUT]sitemap.xml

For now I have patched it by setting $save_file and $follow_urls to true in that branch. I haven't had time to investigate further.
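
A less invasive variant of this workaround might go through the `ss_find_excludable` filter visible in the snippet above, rather than editing the plugin file. The following is only a minimal sketch under assumptions: the callback name `my_ss_never_exclude` is hypothetical, and it relies solely on what the snippet shows, namely that a filtered value of `false` skips the "Excludable found" branch. Whether other code paths in the plugin also exclude pages is not known from this thread.

```php
<?php
/**
 * Hypothetical callback (the name is an assumption, not part of the plugin).
 *
 * @param mixed  $excludable  Result of the plugin's own find_excludable() check.
 * @param object $static_page Page object; only its ->url property is known from the snippet.
 * @return bool Always false, so the `$excludable !== false` check does not match.
 */
function my_ss_never_exclude( $excludable, $static_page ) {
	// Returning false tells the calling code that no excludable was found.
	return false;
}

// Register only when WordPress is loaded: priority 10, callback accepts 2 arguments.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'ss_find_excludable', 'my_ss_never_exclude', 10, 2 );
}
```

Note that this switches off every exclusion, including ones that were configured on purpose, so it is a stopgap for debugging rather than a fix for the underlying matching problem.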

@linuxlurak linuxlurak changed the title Excludables functionality leads to not extraxcting anything but main page Excludables functionality causes nothing to be extracted except the main page. Nov 26, 2024
@patrickposner patrickposner added the bug Something isn't working label Nov 26, 2024
@igorbenic
Collaborator

Hi @linuxlurak, could you save the main page HTML and add the code here or upload it as a file, so I can run the code through it and see what could go wrong?

@linuxlurak
Author

Hi @ibenic, there's too much personal information on the site for me to share it. Editing that out would be time-consuming and might leave nothing useful for your analysis, I guess. Is there any other way I can help with the investigation?

@patrickposner
Collaborator

Hey @linuxlurak, if you like, you can send it to us via [email protected].
We keep your information private and won't expose any personal data.
