Ability to manually stop crawling #15

Open
rushi opened this issue Jul 11, 2021 · 3 comments

Comments


rushi commented Jul 11, 2021

I'd like to be able to stop the crawling process when certain conditions have been met. It would be useful to have:

const crawler = new Crawler('example.com');
crawler.crawl();

setTimeout(() => {
  crawler.stop(); // Emits 'end' event
}, 3000);

rushi commented Jul 11, 2021

Thinking of a way to implement this: maybe just have a status property on the crawler with three possible values ('new', 'started', 'complete'), and only crawl the next page if status != 'complete'.

I can submit a PR if you like.
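
A minimal sketch of the status idea, under loud assumptions: `SketchCrawler`, its `links` map, and the 'page' event are hypothetical stand-ins invented for illustration, not the library's real code. An in-memory link map replaces real HTTP so the example runs synchronously. Only the three status values, `stop()`, and the 'end' event come from this thread.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical stand-in for the library's crawler, NOT its real implementation.
// `links` is an assumed in-memory map of page -> outgoing links (no HTTP).
class SketchCrawler extends EventEmitter {
  constructor(startUrl, links) {
    super();
    this.links = links;
    this.queue = [startUrl];
    this.visited = new Set();
    this.status = 'new'; // 'new' | 'started' | 'complete'
  }

  crawl() {
    this.status = 'started';
    // The status is re-checked before every page, so stop() takes effect
    // as soon as the current page has been handled.
    while (this.status !== 'complete' && this.queue.length > 0) {
      const url = this.queue.shift();
      if (this.visited.has(url)) continue;
      this.visited.add(url);
      this.emit('page', url); // hypothetical event, used here to trigger stop()
      this.queue.push(...(this.links[url] || []));
    }
    this.status = 'complete';
    this.emit('end', [...this.visited]);
  }

  stop() {
    // Only flips the flag; the crawl loop notices it and emits 'end'.
    this.status = 'complete';
  }
}

const links = { '/': ['/a', '/b'], '/a': ['/c'], '/b': [], '/c': [] };
const crawler = new SketchCrawler('/', links);
let pagesAtEnd = null;

crawler.on('page', () => {
  if (crawler.visited.size >= 2) crawler.stop(); // example stop condition
});
crawler.on('end', (pages) => { pagesAtEnd = pages; });
crawler.crawl();

console.log(pagesAtEnd);
```

In this sketch `stop()` does not interrupt a page that is already being processed; it only prevents the next one from starting. A real implementation with async requests would probably need the same check after each request resolves.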


rushi commented Jul 11, 2021

After a little debugging, I found that instead of setting a status to stop the crawl, one can replace the URL filter so that it rejects every URL: this.config.urlFilter = () => false;
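
To show why this workaround can halt a crawl, here is a rough sketch. `FilteredCrawler`, its `links` map, and the `onPage` callback are hypothetical and not the library's API; only the `config.urlFilter` name is taken from the comment above. It assumes the filter is consulted when newly found links are queued, which may not match how the library actually applies it.

```javascript
// Hypothetical stand-in crawler; only `config.urlFilter` comes from the thread.
class FilteredCrawler {
  constructor(startUrl, links, config = {}) {
    this.links = links; // assumed in-memory map of page -> outgoing links
    this.config = { urlFilter: () => true, ...config };
    this.queue = [startUrl];
    this.visited = [];
  }

  crawl(onPage = () => {}) {
    while (this.queue.length > 0) {
      const url = this.queue.shift();
      if (this.visited.includes(url)) continue;
      this.visited.push(url);
      onPage(url, this);
      // Newly discovered links are queued only if the filter accepts them.
      const found = (this.links[url] || []).filter((u) => this.config.urlFilter(u));
      this.queue.push(...found);
    }
    return this.visited;
  }
}

const siteLinks = { '/': ['/a', '/b'], '/a': ['/c'], '/b': ['/d'], '/c': [], '/d': [] };
const filtered = new FilteredCrawler('/', siteLinks);

const crawled = filtered.crawl((url, c) => {
  // The workaround: once a condition is met, reject every further URL.
  if (url === '/a') c.config.urlFilter = () => false;
});

console.log(crawled);
```

Under these assumptions the crawl winds down rather than stopping at once: '/b' was queued before the filter changed, so it is still visited, while '/c' and '/d' never are. That delay is one reason a dedicated `stop()` may still be worth having.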

safonovpro (Owner) commented

Hey, @rushi. Sorry for the late reply. Thank you for the good idea! It should be easy to do, and I will do it within a week.
P.S. If you want, send your PR.
