Cloud run deployment doesn't start crawling #506

Open
infctr opened this issue Oct 31, 2023 · 1 comment

infctr commented Oct 31, 2023

Hey! First of all, many thanks for keeping this project updated and alive!

I'm having trouble running a Google Cloud Run job on the latest main. The job starts with the following config:

Settings from config: {"captcha_enabled": false, "captcha_driver_arguments": 
["--no-sandbox", "--headless", "--disable-gpu", "--remote-debugging-port=9222", 
"--disable-dev-shm-usage", "--window-size=1024,768"], "captcha_solver": "NoneType", 
"imagetyperz_token": null, "twocaptcha_key": null, "mattermost_webhook_url": null, 
"notifiers": ["telegram"], "slack_webhook_url": "", "telegram_receiver_ids": [****], 
"telegram_bot_token": "580xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxzwU", 
"target_urls": ["https://kleinanzeigen.de/s-wohnung-mieten/****"], "use_proxy": false}

and then immediately exits with Container called exit(0).

I have added FLATHUNTER_VERBOSE_LOG=1 to the environment variables, but there are no additional log messages. What am I missing in my setup?


codders commented Nov 1, 2023

Hi @infctr,

No worries - happy to keep it ticking along :) I don't know exactly how your docker image is configured, but the line after configure_logging (which prints Settings from config:) is init_searchers which initialises the crawlers. My guess would be that the initialisation of the Immobilienscout crawler triggers Chrome-related code (downloads the undetected-chromedriver, tries to connect to the browser), and that causes a crash.

In your config, you have "captcha_enabled" set to false, but you supply "captcha_driver_arguments" anyway. If you're not crawling Immobilienscout, maybe drop the "captcha_driver_arguments" entirely. And if you comment out the Immobilienscout initialisation, i.e. the line

Immobilienscout(self),

you might find that it just starts normally. That would be a good hint for further debugging.
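If commenting the line out is awkward in a container build, the same isolation can be done by wrapping each crawler's construction in its own try/except, so the logs name the crawler that fails. What follows is only a minimal sketch of that idea, not flathunter's actual code: the helper init_crawlers_safely and the two stub classes are hypothetical stand-ins, and the real constructors take a config object rather than no arguments.

```python
import logging
import traceback

logger = logging.getLogger("init_debug")


class StubKleinanzeigen:
    """Hypothetical stand-in for a crawler that initialises cleanly."""


class StubImmobilienscout:
    """Hypothetical stand-in for a crawler whose setup needs a browser."""

    def __init__(self):
        # Simulates a Chrome-related failure during initialisation.
        raise RuntimeError("could not connect to browser")


def init_crawlers_safely(factories):
    """Call each factory in turn; log and skip any that raises.

    Returns (crawlers, failures), where failures maps the crawler
    name to the exception raised during its initialisation.
    """
    crawlers, failures = [], {}
    for name, factory in factories.items():
        try:
            crawlers.append(factory())
        except Exception as exc:  # broad on purpose: this is a debugging aid
            # Note: SystemExit is not an Exception subclass and is not caught.
            logger.error("Initialising %s failed:\n%s", name, traceback.format_exc())
            failures[name] = exc
    return crawlers, failures


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    crawlers, failures = init_crawlers_safely(
        {
            "Kleinanzeigen": StubKleinanzeigen,
            "Immobilienscout": StubImmobilienscout,
        }
    )
    logger.info("Started %d crawler(s); failed: %s", len(crawlers), list(failures))
```

If the job then runs with only the remaining crawlers, that points at the skipped crawler's setup as the cause. If it still exits immediately, the problem lies elsewhere.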
