Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIxing HTTP 406 error while runing the test suite (npm test) #96

Closed
wants to merge 7 commits into from
Closed

FIxing HTTP 406 error while runing the test suite (npm test) #96

wants to merge 7 commits into from

Conversation

Jacobojijo
Copy link
Contributor

This is to fix issue #95

Solution

I modified the scraping.js file to use more browser-like User-Agent and Accept headers. Here are the key changes:

  1. Added constants for user agent and accept header:

    const userAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36';
    const acceptHeader = 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8';
  • Created a getWithHeaders function to make requests with these headers:

    function getWithHeaders(url) {
        return preq.get({
            uri: url,
            headers: {
                'User-Agent': userAgent,
                'Accept': acceptHeader
            }
        });
    }
  • Updated all preq.get() calls to use getWithHeaders() instead.

  • Modified the meta() function calls to include the headers:

    return meta({
        uri: url,
        headers: {
            'User-Agent': userAgent,
            'Accept': acceptHeader
        }
    })

This should improve the reliability of the test suite, especially when dealing with websites that have stricter requirements for incoming requests.

package.json Outdated
@@ -13,7 +13,8 @@
"istanbul": "^0.4.5",
"mocha": "^10.3.0",
"mocha-jshint": "^2.3.1",
"mocha-lcov-reporter": "^1.3.0"
"mocha-lcov-reporter": "^1.3.0",
"nock": "^13.3.0"
Copy link
Collaborator

@mvolz mvolz Jul 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's an extra space here

});
});
});
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stick to tabs please. (Probably this repo needs some linting added!)

Copy link
Collaborator

@mvolz mvolz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some style stuff (we should add linting so that's automated, sorry!)

@Jacobojijo
Copy link
Contributor Author

Hello @mvolz, I have made the requested changes to the pull request. You can check and if there is still anything I should do, please inform me

@Jacobojijo
Copy link
Contributor Author

Hello @mvolz would you please do the PR review for the changes made ?

package.json Outdated
@@ -51,4 +52,4 @@
"url": "https://github.com/wikimedia/html-metadata/issues"
},
"homepage": "https://github.com/wikimedia/html-metadata"
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The red circle with the line through it means this file is missing a new line character.
https://stackoverflow.com/questions/729692/why-should-text-files-end-with-a-newline

@Jacobojijo
Copy link
Contributor Author

@mvolz, can you review my PR? I also think a styling lint should be added to help in other future changes. I'm thinking of introducing one to this project. What do you think?

@Jacobojijo Jacobojijo mentioned this pull request Jul 18, 2024
@Jacobojijo
Copy link
Contributor Author

@mvolz, I guess this PR is done

@Jacobojijo Jacobojijo closed this by deleting the head repository Jul 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants