Extract Tripadvisor reviews from a specific page with Google Colab #17

biagioscalingipsy · 2023-07-10T09:06:58Z

Hi Giuseppe!
I should say up front that I am very new to Python, and for now I am using Google Colab to run my code. In particular, I am trying to extract the TripAdvisor reviews from this page: https://www.tripadvisor.it/Attraction_Review-g2173026-d8059630-Reviews-Bungee_Jumping_Asiago_Enego_Foza_175_metri-Foza_Province_of_Vicenza_Veneto.html

I made several attempts using BeautifulSoup:

    import requests
    from bs4 import BeautifulSoup as soup

    # URL of the TripAdvisor page
    url = 'https://www.tripadvisor.it/Attraction_Review-g2173026-d8059630-Reviews-Bungee_Jumping_Asiago_Enego_Foza_175_metri-Foza_Province_of_Vicenza_Veneto.html'

    # Make the HTTP request to get the page content
    html = requests.get(url)
    bsobj = soup(html.content, 'html.parser')

    # Find all 'q' tags that contain the reviews
    reviews = []
    for r in bsobj.find_all('q'):
        if r.span is not None:  # skip 'q' tags without a nested span
            reviews.append(r.span.text.strip())

    # Print the extracted reviews
    for review in reviews:
        print(review)

The code seems to work, but it takes too long to run, and the Colab session eventually disconnects because of the idle timeout (I even tried adding an automatic click to prevent the timeout, but that did not work).

After that, I tried following your script, but when I run:
driver = webdriver.Safari()
I get this error:
"Exception: SafariDriver was not found; are you using Safari 10 or later? You can download Safari from https://developer.apple.com/safari/download/".

However, I have the latest version of Safari (16.5.1), and I have also enabled "Allow Remote Automation" in Safari's Develop menu. How can I save the reviews to a txt file or load them into a DataFrame?

Thank you in advance.
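
For the saving step only, here is a minimal sketch. It assumes the reviews have already been collected in a list of strings, like `reviews` in the code above. The function names `save_reviews_txt` and `save_reviews_csv` and the file names are hypothetical, and only the Python standard library is used.

```python
import csv
from pathlib import Path


def save_reviews_txt(reviews, path):
    """Write one review per line to a UTF-8 text file (hypothetical helper)."""
    # Collapse internal line breaks and repeated spaces so that each review
    # occupies exactly one line in the output file.
    lines = [" ".join(text.split()) for text in reviews]
    Path(path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")


def save_reviews_csv(reviews, path):
    """Write the reviews to a one-column CSV file (hypothetical helper)."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["review"])  # header row
        writer.writerows([text] for text in reviews)
```

Calling `save_reviews_txt(reviews, "reviews.txt")` would produce the text file, and after `save_reviews_csv(reviews, "reviews.csv")` the CSV could be loaded as a DataFrame in Colab with `pandas.read_csv("reviews.csv")`.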
