Added request.cookies command to fix file downloads #37

Merged
merged 2 commits into from
Dec 7, 2020

Conversation

JBou
Contributor

@JBou JBou commented Nov 21, 2020

Added request.cookies command to only fetch the cookies without response/content.

This fixes file downloads using external libraries (for example CloudProxySharp).

It's working fine as of now.

The only thing missing is captcha handling: I haven't tested it yet and don't know how it behaves.
I'd like this PR to get merged; maybe someone else could try it out with captcha-harvester, since I'm not using it.

@JBou
Contributor Author

JBou commented Nov 21, 2020

This is also related to #20 and should make life easier for @lululombard.
It doesn't return the file in the response or serialized as JSON (as described here). Instead, it returns valid cookies for the other client or library to use, without blocking the response as before, so the client/library can handle the download itself in a native way (without workarounds).
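
A minimal sketch of the client side, assuming the cookies come back as objects with `name` and `value` fields and that the solver's user agent is also available (both are assumptions; `SolvedCookie` and `buildDownloadHeaders` are hypothetical names, not part of this PR). It only builds the request headers, so the actual download can be done by whatever HTTP client the consumer already uses:

```typescript
// Hypothetical cookie shape; the exact fields returned by request.cookies are an assumption.
interface SolvedCookie {
  name: string;
  value: string;
}

// Build the headers a client would send when downloading the file itself.
// The user agent is included because clearance cookies are generally tied to it.
function buildDownloadHeaders(
  cookies: SolvedCookie[],
  userAgent: string
): Record<string, string> {
  // A Cookie request header is a list of name=value pairs separated by "; ".
  const cookieHeader = cookies.map((c) => `${c.name}=${c.value}`).join('; ');
  return {
    Cookie: cookieHeader,
    'User-Agent': userAgent,
  };
}
```

The returned object can be passed as the headers of a normal GET request for the file URL, so the download is streamed by the client rather than relayed through the proxy.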

We are intercepting the response here, using the Chrome DevTools Fetch API, before the response content is sent to the browser but after the cookies and response headers have been received.
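
For readers unfamiliar with the Fetch domain, here is a rough sketch of the idea, not the code in this PR. It assumes a CDP session object with `send` and `on` methods (a puppeteer `CDPSession` should be adaptable to this shape, possibly with a cast); `CdpSessionLike`, `InterceptedResult`, `interceptResponseSketch` and the `isChallengeStatus` predicate are hypothetical names. Whether every cookie is already visible at this point may depend on the browser, so treat the cookie read as an assumption too:

```typescript
// Minimal structural view of a CDP session (assumption, see note above).
interface CdpSessionLike {
  send(method: string, params?: object): Promise<any>;
  on(event: string, handler: (event: any) => void): unknown;
}

// Hypothetical result type; not the project's ChallengeResolutionT.
interface InterceptedResult {
  status: number;
  headers: Record<string, string>;
  cookies: unknown[];
}

async function interceptResponseSketch(
  client: CdpSessionLike,
  isChallengeStatus: (status: number) => boolean,
  onResult: (result: InterceptedResult) => void
): Promise<void> {
  // Pause document requests at the Response stage: status and headers
  // are known, but the body has not been handed to the page yet.
  await client.send('Fetch.enable', {
    patterns: [{ urlPattern: '*', resourceType: 'Document', requestStage: 'Response' }],
  });

  client.on('Fetch.requestPaused', async (event: any) => {
    const status: number = event.responseStatusCode ?? 0;
    if (isChallengeStatus(status)) {
      // Still the challenge page: let it load so the browser can solve it.
      await client.send('Fetch.continueRequest', { requestId: event.requestId });
      return;
    }
    // Flatten the CDP header list ({ name, value } entries) into an object.
    const headers: Record<string, string> = {};
    for (const h of event.responseHeaders ?? []) {
      headers[h.name] = h.value;
    }
    const { cookies } = await client.send('Network.getAllCookies');
    // Abort the paused request so the body is never transferred to the page.
    await client.send('Fetch.failRequest', {
      requestId: event.requestId,
      errorReason: 'Aborted',
    });
    onResult({ status, headers, cookies });
  });
}
```

Aborting instead of continuing is what keeps a large file from being downloaded inside the headless browser; the caller only gets status, headers and cookies.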

Contributor

@lululombard lululombard left a comment

Thank you for this PR! While I found a workaround for my use case (using external links instead of file downloads to escalate my tokens), this would make more websites compatible with bypassing CloudFlare challenges.

@NoahCardoza
Owner

I'll take a look at this on Monday, it's currently Saturday for me and I'm adding the finishing touches to a hackathon project 💀

…ponse/content, fixes file downloads using external libraries
// TODO: find out why these pages hang sometimes
while (Date.now() - ctx.startTimestamp < maxTimeout) {
await page.waitFor(1000)
try {
// catch exception timeout in waitForNavigation
- await page.waitForNavigation({ waitUntil: 'domcontentloaded', timeout: 5000 })
+ response = await page.waitForNavigation({ waitUntil: 'domcontentloaded', timeout: 5000 })
Contributor Author

Without this assignment the response status and the response headers when using request.get are not correct. They remain 503 (assigned here). The page content, by contrast, is returned correctly because it doesn't depend on the response variable; it is retrieved from the page directly.
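
To illustrate the point, a simplified sketch of a wait loop that keeps the latest navigation response instead of the initial one. This is not the loop in src/routes.ts: `PageLike`, `ResponseLike`, `waitForFinalResponse` and the `isChallenge` predicate are hypothetical, and the real loop has more conditions than shown here:

```typescript
// Minimal structural types; puppeteer's Page and HTTPResponse have these members.
interface ResponseLike {
  status(): number;
}
interface PageLike {
  waitForNavigation(options: { waitUntil: string; timeout: number }): Promise<ResponseLike | null>;
}

async function waitForFinalResponse(
  page: PageLike,
  initial: ResponseLike,
  isChallenge: (r: ResponseLike) => boolean,
  maxTimeoutMs: number
): Promise<ResponseLike> {
  const start = Date.now();
  let response = initial;
  while (isChallenge(response) && Date.now() - start < maxTimeoutMs) {
    try {
      const next = await page.waitForNavigation({ waitUntil: 'domcontentloaded', timeout: 5000 });
      // Reassign, otherwise status and headers would still describe the initial 503.
      if (next) response = next;
    } catch {
      // waitForNavigation timed out; retry until the overall deadline.
    }
  }
  return response;
}
```

If the return value of `waitForNavigation` is dropped, the function would hand back `initial` even after the challenge has been solved, which is the stale 503 described above.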

@@ -11,7 +11,8 @@
"es2015", "dom"
],
"module": "commonjs",
- "outDir": "dist"
+ "outDir": "dist",
+ "sourceMap": true
Contributor Author

I've added this line to enable breakpoints in the source files (*.ts), as described here, so that debugging works.

Comment on lines +255 to +269
let interceptingResult: ChallengeResolutionT;
if (returnOnlyCookies) { //If we just want to get the cookies, intercept the response before we get the content/body (just cookies and headers)
await interceptResponse(page, async function(payload){
interceptingResult = payload;
});
}

// submit captcha response
challengeForm.evaluate((e: HTMLFormElement) => e.submit())
response = await page.waitForNavigation({ waitUntil: 'domcontentloaded' })

if (returnOnlyCookies && interceptingResult) {
await page.close();
return interceptingResult;
}
Contributor Author

I've added the same logic as above to the captcha response. It should work here too, but I wasn't able to test it because I'm not using captcha-harvester. It needs to be confirmed to work.

@NoahCardoza
Owner

Re: #31 (comment)

Since I don't have much time but don't want to be the bottleneck, I'm just going to merge this. I just went over it and it all seems good, so let's hope for the best.

I'll take a more in-depth look this weekend hopefully.

@NoahCardoza NoahCardoza merged commit efc8b4a into NoahCardoza:master Dec 7, 2020
@JBou
Contributor Author

JBou commented Dec 7, 2020

Thanks a lot!

The only thing left is to add documentation about this endpoint in the readme.

@JBou JBou deleted the feature-justcookies branch July 21, 2021 10:22
3 participants