woob.browser.exceptions.ClientError: 403 Client Error: Forbidden #296

Open
GregAce opened this issue Nov 30, 2021 · 23 comments

Comments

@GregAce

GregAce commented Nov 30, 2021

I get this new error every time I start the script.
Maybe an issue with 2FA?

@rbignon
Owner

rbignon commented Nov 30, 2021

Hi,

What is the content of ~/.local/share/doctoshotgun/state.json? (you can obfuscate cookies)

Try to remove the file and run doctoshotgun again.
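
As a minimal sketch of that step, the helper below moves the state file aside instead of deleting it, so the old cookies can still be inspected afterwards. The path is the one quoted above; `backup_state` is a hypothetical helper name, not part of doctoshotgun, and the file may live elsewhere on other platforms such as Windows.

```python
from pathlib import Path
from typing import Optional

# Path quoted in the comment above; assumed to be where doctoshotgun keeps its state.
STATE_FILE = Path.home() / ".local" / "share" / "doctoshotgun" / "state.json"


def backup_state(path: Path = STATE_FILE) -> Optional[Path]:
    """Move the state file aside (state.json -> state.json.bak).

    Returns the backup path, or None when there was nothing to move.
    """
    if not path.exists():
        return None
    backup = path.with_name(path.name + ".bak")
    path.replace(backup)  # overwrites an older backup, if any
    return backup

# Usage: call backup_state() once, then run doctoshotgun again.
```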

@GregAce
Author

GregAce commented Nov 30, 2021

Hi,
I removed the file, but it did not help.

state.json seems to be populated with the cookies:

{"cookies": "eJwlj1tT...bQvGTjTqWPoLk+9k6N/0dp+IJ2BsH+w5s/urajLmiomtcgbZ03KG0qS8Y6JpY15IfxEQIfTnokXnPX4wvjcEoa7i/gv1RYK+Pn8BcrBToI="}

@rbignon
Owner

rbignon commented Nov 30, 2021

Can you run doctoshotgun with -d and paste all the output?

@GregAce
Author

GregAce commented Nov 30, 2021

Here is the output:

2021-11-30 11:16:05,776:DEBUG:browser::browsers.py:1056:_load_cookies Reloaded cookies from storage
2021-11-30 11:16:05,786:DEBUG:urllib3.connectionpool::connectionpool.py:971:_new_conn Starting new HTTPS connection (1): www.doctolib.fr:443
2021-11-30 11:16:05,973:DEBUG:urllib3.connectionpool::connectionpool.py:452:_make_request https://www.doctolib.fr:443 "GET /sessions/new HTTP/1.1" 403 None
2021-11-30 11:16:06,021:INFO:browser::browsers.py:369:save_response Response saved to dc0ba7839d3146fea77171cef3631eaf
2021-11-30 11:16:06,023:DEBUG:browser::browsers.py:1095:dump_state Stored cookies into storage
Traceback (most recent call last):
  File "G:\Users\Greg\Desktop\Script\doctoshotgun.py", line 908, in <module>
    sys.exit(Application().main())
  File "G:\Users\Greg\Desktop\Script\doctoshotgun.py", line 726, in main
    if not docto.do_login(args.code):
  File "G:\Users\Greg\Desktop\Script\doctoshotgun.py", line 278, in do_login
    self.open(self.BASEURL + '/sessions/new')
  File "G:\Users\Greg\AppData\Local\Programs\Python\Python39\lib\site-packages\woob\browser\browsers.py", line 898, in open
    return super(PagesBrowser, self).open(callback=internal_callback, *args, **kwargs)
  File "G:\Users\Greg\AppData\Local\Programs\Python\Python39\lib\site-packages\woob\browser\browsers.py", line 790, in open
    return super(DomainBrowser, self).open(req, *args, **kwargs)
  File "G:\Users\Greg\AppData\Local\Programs\Python\Python39\lib\site-packages\woob\browser\browsers.py", line 531, in open
    response = self.session.send(preq,
  File "G:\Users\Greg\Desktop\Script\doctoshotgun.py", line 78, in send
    return callback(self, resp)
  File "G:\Users\Greg\AppData\Local\Programs\Python\Python39\lib\site-packages\woob\browser\browsers.py", line 527, in inner_callback
    self.raise_for_status(response)
  File "G:\Users\Greg\AppData\Local\Programs\Python\Python39\lib\site-packages\woob\browser\browsers.py", line 560, in raise_for_status
    raise ClientError(http_error_msg, response=response)
woob.browser.exceptions.ClientError: 403 Client Error: Forbidden
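
For longer `-d` outputs, the failing request can be picked out mechanically. The following is only a sketch: it assumes the urllib3 connectionpool line format shown in the output above, and `failed_requests` is a hypothetical helper, not something doctoshotgun or woob provides.

```python
import re
from typing import Iterator, Tuple

# Matches the request part of urllib3 connectionpool debug lines, e.g.
#   ... https://www.doctolib.fr:443 "GET /sessions/new HTTP/1.1" 403 None
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def failed_requests(log_text: str) -> Iterator[Tuple[str, str, int]]:
    """Yield (method, path, status) for every logged response with status >= 400."""
    for line in log_text.splitlines():
        match = REQUEST_RE.search(line)
        if match and int(match["status"]) >= 400:
            yield match["method"], match["path"], int(match["status"])
```

Applied to the output above, this would point at the `GET /sessions/new` request that returned 403, i.e. the very first request of the login.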

@GregAce
Author

GregAce commented Nov 30, 2021

When I open the doctolib URL in a new browser, there is a puzzle captcha to solve before I can access the site.

@rbignon
Owner

rbignon commented Nov 30, 2021

wtf

@samuelguesnier

samuelguesnier commented Nov 30, 2021

I get the same error

├╴ Spécifiez votre situation: 1
│ 1 dose de rappel
│ 2 complétion du schéma vaccinal (personnes immunodéprimées uniquement)
├╴ Spécifiez votre situation: 1
An unexpected exception of type ClientError occurred. Arguments:
('403 Client Error: Forbidden',)

@samuelguesnier

samuelguesnier commented Nov 30, 2021

It just worked; maybe the cookie was no longer valid.

@GregAce
Author

GregAce commented Nov 30, 2021

After changing my IP address with a VPN, the captcha is no longer requested and the script works again. So I guess a captcha protection is activated when too many requests are sent to doctolib.

@GregAce GregAce closed this as completed Nov 30, 2021
@seranpion
Contributor

Hello. Commenting even after issue closure.

I am running into the same issue, except that I don't have a VPN at hand.
Removing the state.json cache file does not help.

Strangely, I can successfully log into the same account and use the doctolib web portal with a browser.
I only ran into a captcha verification once (I was using both the script and the browser at the time).

I hope this is just a temporary ban, and that the cooldown is not too long.

@rbignon
Owner

rbignon commented Dec 1, 2021

Unfortunately I can't reproduce it. I guess that once you have solved the captcha, a cookie is set to prove you are not a bot.

What kind of captcha is it? If possible, the best thing would be to redirect you to the browser when it occurs so you can solve the captcha by hand, then enter a callback URI or something like that in doctoshotgun.

@GregAce
Author

GregAce commented Dec 1, 2021

I still get the captcha when opening doctolib in a new browser (even in the phone app). The protection seems to be activated for my IP address.
I agree with the workaround you propose.
Here is the captcha:
[two attached screenshots: "humain", "image"]

@GregAce
Author

GregAce commented Dec 1, 2021

Strangely, I can successfully log into the same account and use the doctolib web portal with a browser.
I only ran into a captcha verification once (was both using the script and the browser).

If you open doctolib in a new private browsing window, the captcha will be required each time.

@seranpion
Contributor

I had the same captcha challenge.

if you open doctolib in a new private browsing window, the captcha will be required each time.

I suppose once an IP is suspicious, any connection without a "human" cookie is challenged.
That would make sense.

If possible the best thing would be to redirect you in the browser when it occurs to let you resolve the captcha by hand, and enter a callback uri or something like that in doctoshotgun.

Another workaround would be to transfer whatever cookies attest the captcha-challenge success to the script store.
But that's much more complicated.

@seranpion
Contributor

I just looked at the captcha wall, no apparent redirection.
It's a frame on the portal, with captcha-clearing cookies at the end.

This feature seems specifically designed to prevent what this project is doing.

I analysed the challenge and, judging by details from other people's experience here, it is going to be a tricky issue (emphasis mine):

The new method is POST to ?__cf_chl_captcha_tk__=GENERATED_TOKEN. It hands a cf_clearance cookie, allowing the user to bypass captcha, to the accepted device and, as usual, a __cfuid cookie stating the CloudFlare visitor id. The cf_clearance expires 1 day after the cookie was given and is valid for over 1k requests or until CloudFlare forces you to captcha again.

There is indeed a request at the end that yields a cf_clearance token ("cf" as in CloudFlare, of course), which could be retrieved by a tech-savvy user.
It comes from the post-challenge POST /sessions/new?__cf_chl_captcha_tk__=<some_68_char_code> request in the browser.

I don't know if the 1k request limit is going to pose an issue.
The script seems to make only one persistent connection per run, though.
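
To let the script tell a CloudFlare challenge apart from an ordinary 403, a heuristic along these lines might help. It is a sketch under assumptions: it relies on the `Server: cloudflare` and `CF-RAY` response headers and on the `cf_chl` marker from the token parameter named above appearing in the page body, none of which is guaranteed to stay stable. `looks_like_cloudflare_challenge` is a hypothetical name.

```python
from typing import Mapping


def looks_like_cloudflare_challenge(
    status: int, headers: Mapping[str, str], body: str
) -> bool:
    """Heuristic check for a CloudFlare challenge page.

    Assumptions (not confirmed by the thread): the response went through
    CloudFlare (Server / CF-RAY headers) and the body mentions the
    "cf_chl" marker used by the challenge form.
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    via_cloudflare = (
        lowered.get("server", "").lower() == "cloudflare" or "cf-ray" in lowered
    )
    return status in (403, 503) and via_cloudflare and "cf_chl" in body
```

A positive result only suggests a challenge; a plain 403 from the site itself can also pass through CloudFlare, which is why the body is checked as well.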

@rbignon as you suggested, the script could do the following:

  1. Request the user to log into and out of the doctolib web portal in his browser (same IP) if not already done.
  2. Prompt the user for the cf_clearance cookie from his browser storage.
  3. Set that cookie on the session before attempting logins.
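
Steps 2 and 3 could be sketched as follows. The helper pulls one cookie out of a pasted `Cookie:` header or `name=value; ...` string; `extract_cookie` is a hypothetical name, and the last comment assumes the script's session behaves like a `requests` session. Note that a later comment in this thread reports that reusing cf_clearance alone was not enough.

```python
from typing import Optional


def extract_cookie(raw_header: str, name: str = "cf_clearance") -> Optional[str]:
    """Return the value of cookie `name` from a pasted Cookie header, or None."""
    raw = raw_header.strip()
    if raw.lower().startswith("cookie:"):
        raw = raw[len("cookie:"):]
    for part in raw.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and key == name:
            return value
    return None

# Assumed usage with a requests-style session (not verified against doctoshotgun):
#   value = extract_cookie(input("Paste the Cookie header from your browser: "))
#   if value:
#       session.cookies.set("cf_clearance", value, domain=".doctolib.fr")
```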

Maybe that issue should be re-opened until this is implemented?

@GregAce GregAce reopened this Dec 1, 2021
@GregAce
Author

GregAce commented Dec 1, 2021

I have reopened the issue.

@rbignon
Owner

rbignon commented Dec 1, 2021

In woob we support anticaptcha. I'll try to use it in doctoshotgun, but it requires subscribing to the service.

What you suggest seems fine, though, but I don't get the captcha here. Can you try to open a PR?

@seranpion
Contributor

However, what you suggest seems fine, but I don't get captcha here, can you try to do a PR?

I got a captcha challenge by running the script while already logged in and active in my browser.
Maybe you can trigger it too.

Otherwise, I'll try to implement this soon and send a PR, unless someone beats me to it.

Side note: The conversation about CloudFlare challenges I linked to refers to a "Privacy Pass" feature.

Privacy Pass is a Chrome and Firefox browser extension that provides a better visitor experience for Cloudflare-protected websites. For instance, a visitor IP address with poor reputation may receive a Cloudflare captcha page before gaining access to a Cloudflare-protected website. After a single captcha page is solved, Privacy Pass generates tokens for use with Cloudflare websites to prevent frequent captcha. Privacy Pass generates 30 tokens for each solved captcha.

Privacy Pass allows a user to bypass CAPTCHAs.

I don't know if that can be of any use in that matter.

@seranpion
Contributor

OMG, I just took a look at that anti-captcha website…
This screams of black-hat business.
I'd recommend never relying on that kind of service.

[…] by using our service you are helping thousands of people to feed themselves and their families.

-__-

@seranpion
Contributor

It turns out the CF challenge bypass is much trickier than I hoped (after more research).

I have committed some work on my fork branch.
But as it is now, it doesn't work.

I can reproduce the issue by using the tor network (quite some bad reputation IPs there).
But sometimes the challenge would only appear in browser, sometimes only with the script.
And recycling the cf_clearance token is not enough when it does (more cookies required?).

So I don't see any solution for the moment.
The workaround is to use a proxy or wait to be un-blacklisted by CloudFlare.

@seranpion
Contributor

FYI @rbignon I have noticed one weird thing:

When using the https_proxy env variable, the login & 2FA codes seem not to go through the proxy.
I noticed this when I left the variable set but the proxy was down.
The auth worked but right after that I had a NewConnectionError.

I suppose it's an issue of its own, but I'm unsure of my analysis.
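
One way to start narrowing this down is to confirm which proxy settings the Python process actually sees. The sketch below only uses the standard library; `env_proxy_for` is a hypothetical helper, and whether every request made by the script honours these settings is exactly the open question, so this checks the environment, not the script's behaviour.

```python
import urllib.request
from typing import Optional


def env_proxy_for(scheme: str = "https") -> Optional[str]:
    """Return the proxy URL the standard library picks up for `scheme`, if any.

    urllib.request.getproxies() reads variables such as https_proxy from the
    environment (and falls back to system settings on some platforms).
    """
    return urllib.request.getproxies().get(scheme)

# Example: with https_proxy set in the environment, env_proxy_for("https")
# returns that value.
```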

@rbignon
Owner

rbignon commented Dec 7, 2021

Hm, we use the cloudscraper library, perhaps there is a link.

@sly-net

sly-net commented Dec 12, 2021

Maybe Cloudflare performs TLS fingerprinting?
There's a very interesting article about this: https://httptoolkit.tech/blog/tls-fingerprinting-node-js/
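
For context, TLS fingerprinting typically looks at what the client offers in its handshake, such as the list of cipher suites, and that list differs between a browser and a Python client. The sketch below merely lists what a default Python TLS context has enabled; it is an illustration, not evidence that Cloudflare does this here, and the HTTP library used by the script may configure its own context differently. `offered_cipher_names` is a hypothetical helper.

```python
import ssl
from typing import List


def offered_cipher_names() -> List[str]:
    """Names of the cipher suites enabled in a default Python TLS client context."""
    context = ssl.create_default_context()
    return [cipher["name"] for cipher in context.get_ciphers()]

# Comparing this list with what a browser sends (visible in a packet capture)
# shows why the two can be told apart at the TLS level.
```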
