
A proof of concept (PoC) for scraping website contents and converting them into vector representations using Vertex AI


0x32e/scraper


Scraper

What

This tool scrapes a webpage, extracts its contents, and converts them into a vector representation using Google's Vertex AI embeddings API.
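
As a rough illustration of that pipeline, the sketch below fetches a page, strips the markup with the Python standard library, and passes the text to an embedding model. It is not taken from this repository: the helper names (`fetch_html`, `extract_text`, `embed`), the User-Agent string, and the default model name are all assumptions made for the example. `embed` additionally assumes the `google-cloud-aiplatform` package is installed and that a Google Cloud project and credentials are already configured (for instance via `vertexai.init`).

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping script/style/noscript bodies."""

    _SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self):
        return " ".join(self._parts)


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and decode it using the charset the server reports."""
    req = Request(url, headers={"User-Agent": "scraper-poc/0.1"})  # hypothetical UA
    with urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def extract_text(html: str) -> str:
    """Reduce an HTML document to its visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.text()


def embed(text: str, model_name: str = "text-embedding-004") -> list:
    """Return one embedding vector for `text` (model name is an assumption)."""
    # Imported lazily so the rest of the sketch runs without the SDK installed.
    from vertexai.language_models import TextEmbeddingModel

    model = TextEmbeddingModel.from_pretrained(model_name)
    return model.get_embeddings([text])[0].values
```

Chained together, this would look like `embed(extract_text(fetch_html(url)))`. The standard-library parser keeps the sketch free of extra dependencies; a dedicated HTML library would probably handle messy real-world pages better.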

Why

I wanted to add web scraping to my GPT-4-based Slack bot. The embeddings API doesn't have to be Vertex AI's; I chose it because I hadn't tried it before.

TODO

  • Split the text into chunks before converting them into vector data
  • Upsert the vector data into a vector db (e.g., Qdrant, Chroma)
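
The first item could be approached along the lines of the sketch below, which splits text into overlapping word-based chunks so that each one can be embedded separately. The function name `chunk_words` and the default `size` and `overlap` values are assumptions for illustration, not something this repository defines; counting words is only a stand-in for the token limits a real embedding model enforces. The upsert step is left out because it depends on which vector db is chosen.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split `text` into chunks of up to `size` words, sharing `overlap` words."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    words = text.split()
    step = size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of embedding some words twice.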
