Scraping in google returns no results (chinese queries) #81

@richardzhu64

Hello,

I'm working on a project where I am trying to search Google with a few keywords in Chinese. No error messages are shown in the terminal, and the return value shows "no_results": false. However, the results array is empty and num_results is an empty string.

My scraping script is here:

```js
const se_scraper = require('se-scraper');

(async () => {
    /*
    // Earlier attempt using the callback-style API, kept for reference.
    let browser_config = {
        test_evasion: true,
        output_file: 'scraped_urls_manual_google.json',
        search_engine: "baidu",
        sleep_range: "", // [70,150],
        random_user_agent: true,
        debug: true,
        keywords: clean_terms
    }

    se_scraper.scrape(browser_config, (err, response) => {
        if (err) { console.error(err) }

        // response object has the following properties:
        //   response.results    - json object with the scraping results
        //   response.metadata   - json object with metadata information
        //   response.statusCode - status code of the scraping process

        console.dir(response.results, {depth: null, colors: true});
    });
    */
    let browser_config = {
        test_evasion: true,
        debug_level: 1,
        output_file: 'scraped_urls_manual_google.json',
        log_http_headers: false,
        log_ip_address: true,
        // whether to prevent images, css, fonts and media from being loaded
        block_assets: true,
        sleep_range: "", // [70,150],
        random_user_agent: true,
        apply_evasion_techniques: true,
        debug: true,
    };

    let scrape_job = {
        search_engine: 'google',
        // clean_terms holds the Chinese keyword strings; it is defined elsewhere (not shown here)
        keywords: clean_terms,
        num_pages: 1,
        // add some cool google search settings
        google_settings: {
            gl: 'hk', // The gl parameter determines the Google country to use for the query.
            hl: 'en', // The hl parameter determines the Google UI language to return results.
            start: 0, // Determines the results offset to use, defaults to 0.
            num: 10, // Determines the number of results to show, defaults to 10. Maximum is 100.
        },
    };

    var scraper = new se_scraper.ScrapeManager(browser_config);

    await scraper.start();

    var results = await scraper.scrape(scrape_job);

    console.dir(results, {depth: null, colors: true});

    await scraper.quit();
})();
```

I get the following output when running the script:

{ "\"阿尔及利亚TAMANRASSET至In Salah饮用水供给工程项下水井项目 水井项目子项目3:2眼水井及35眼观测井的成井\"": { "1": { "num_results": "", "no_results": false, "effective_query": "", "right_info": {}, "results": [], "top_products": [], "right_products": [], "top_ads": [], "bottom_ads": [], "places": [], "time": "Tue, 06 Oct 2020 04:28:36 GMT" } }, "\"突尼斯斯菲西发水坝\"": { "1": { "num_results": "", "no_results": false, "effective_query": "", "right_info": {}, "results": [], "top_products": [], "right_products": [], "top_ads": [], "bottom_ads": [], "places": [], "time": "Tue, 06 Oct 2020 04:30:13 GMT" } }, "\"苏丹石油1/2/4区CANNER地面设施及长输管道项目\"": { "1": { "num_results": "", "no_results": false, "effective_query": "", "right_info": {}, "results": [], "top_products": [], "right_products": [], "top_ads": [], "bottom_ads": [], "places": [], "time": "Tue, 06 Oct 2020 04:32:35 GMT" } }, "\"阿尔及尔BAB EZZOUAR 1号地1577套住宅商业服务层部分的设计与施工\"": { "1": { "num_results": "", "no_results": false, "effective_query": "", "right_info": {}, "results": [], "top_products": [], "right_products": [], "top_ads": [], "bottom_ads": [], "places": [], "time": "Tue, 06 Oct 2020 04:34:47 GMT" } } }
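
To make this silent failure easier to spot, here is a minimal sketch of a check that flags every keyword/page pair where "no_results" is false but the results array is empty. `findEmptyPages` is a hypothetical helper, not part of se-scraper, and it assumes the keyword → page → data shape shown in the output above:

```javascript
// Hypothetical helper (not part of se-scraper): lists keyword/page pairs whose
// page reports no_results === false yet contains an empty "results" array.
function findEmptyPages(resultsByKeyword) {
  const empty = [];
  for (const [keyword, pages] of Object.entries(resultsByKeyword)) {
    for (const [page, data] of Object.entries(pages)) {
      const hasResults = Array.isArray(data.results) && data.results.length > 0;
      if (!hasResults && data.no_results === false) {
        empty.push({ keyword, page: Number(page) });
      }
    }
  }
  return empty;
}

// Sample shaped like the output above, abbreviated to the fields that matter.
// The second entry is made-up data showing what a populated page might look like.
const sample = {
  '"突尼斯斯菲西发水坝"': {
    '1': { num_results: '', no_results: false, results: [] },
  },
  'example query': {
    '1': {
      num_results: 'About 10 results',
      no_results: false,
      results: [{ link: 'https://example.com' }],
    },
  },
};

// Logs only the first keyword, since its page has an empty results array.
console.log(findEmptyPages(sample));
```

Running a check like this after each scrape would at least turn the empty pages into an explicit warning instead of a seemingly successful run.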

I've also attached an image of the console while my script is running. Does anyone know how to resolve this issue, where there are no apparent errors but no results or links are returned when scraping Google? Previously I was able to do this successfully, but something seems to have broken when I ran it again.

Thanks!
[Image: console scraping]
