Talking Youtube

An AI-powered chat system that allows users to interact with YouTube video content through natural language queries. Built using LangChain and Google's Gemini model.

Features

Extract transcripts from YouTube videos
Process and chunk video transcripts
Generate embeddings using Google's Gemini model
Create semantic search capabilities using FAISS vector store
Interactive Q&A with video content
Multi-query retrieval for better context understanding
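
The multi-query idea in the last item can be illustrated without any external services. The sketch below shows the general technique under stated assumptions and is not this project's code: `retrieve` is a hypothetical word-overlap retriever standing in for the FAISS search, and the query variants are hard-coded where a real system would typically ask the language model to rephrase the question.

```python
from typing import List


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Hypothetical stand-in retriever: rank chunks by words shared with the query."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(c.lower().split())), c) for c in chunks]
    # Keep only chunks that share at least one word, best matches first.
    # sorted() is stable, so ties keep their original chunk order.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [c for _, c in ranked[:k]]


def multi_query_retrieve(variants: List[str], chunks: List[str], k: int = 2) -> List[str]:
    """Run every query variant and merge the results, dropping duplicates."""
    merged: List[str] = []
    for variant in variants:
        for chunk in retrieve(variant, chunks, k):
            if chunk not in merged:
                merged.append(chunk)
    return merged


chunks = [
    "vectorization replaces explicit loops with array operations",
    "numpy arrays store elements in contiguous memory",
    "the speaker thanks the sponsors of the channel",
]
# Two phrasings of the same underlying question (hard-coded for illustration).
variants = [
    "why is vectorization fast",
    "what makes numpy arrays quick",
]
results = multi_query_retrieve(variants, chunks)
```

Each variant on its own matches only one chunk here; merging the two result lists is what brings both relevant chunks into the context.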

Prerequisites

Python 3.8+
Google API Key (for Gemini model)
Virtual Environment (recommended)

Installation

Clone the repository:

git clone <repository-url>
cd TalkingYoutube

Create and activate a virtual environment:

python -m venv venv
# On Windows
.\venv\Scripts\activate
# On Unix or macOS
source venv/bin/activate

Install required packages:

pip install -r requirements.txt

Create a .env file in the root directory and add your Google API key:

GOOGLE_API_KEY=your_api_key_here
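
One way the key could be read at startup is sketched below. This is an assumption, not the project's code: projects like this often rely on the `python-dotenv` package, while the hypothetical `parse_env` and `load_env_file` helpers here handle only plain `KEY=value` lines with the standard library so the example has no dependencies.

```python
import os
from pathlib import Path
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments.

    Hypothetical helper: no quoting or variable expansion is supported.
    """
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Read a .env file and export its variables to os.environ."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    values = parse_env(env_path.read_text(encoding="utf-8"))
    # Variables already set in the environment take precedence over the file.
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


def require_api_key() -> str:
    """Return GOOGLE_API_KEY or fail early with a clear message."""
    key = os.environ.get("GOOGLE_API_KEY", "")
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to .env")
    return key
```

Failing early in `require_api_key` gives a clearer error than letting the first model call fail with an authentication message.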

Project Structure

TalkingYoutube/
├── assistant/
│   └── prompt.py           # AI assistant prompt templates
├── indexing/
│   ├── document_load.py    # YouTube transcript fetching
│   ├── embedder.py         # Document embedding generation
│   └── splitter.py         # Text splitting utilities
├── utils/
│   ├── formatter.py        # Context formatting
│   └── parser.py           # Output parsing
├── .env                    # Environment variables
├── main.py                 # Main application
└── requirements.txt        # Project dependencies

Usage

Run the main script with a YouTube video ID:

python main.py
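
Because the script works with a video ID rather than a full link, a small helper can pull the ID out of common YouTube URL shapes. This is a sketch only: `extract_video_id` is a hypothetical name that does not come from the repository, and it covers just `watch?v=`, `youtu.be/`, `/embed/`, and `/shorts/` links that include a scheme such as `https://`.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Return the video ID from a YouTube URL; pass bare IDs through unchanged."""
    text = url_or_id.strip()
    parsed = urlparse(text)
    host = (parsed.hostname or "").lower()
    if not host:
        # No hostname (for example, no https:// prefix): treat input as a bare ID.
        return text or None
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        parts = [p for p in parsed.path.split("/") if p]
        if len(parts) >= 2 and parts[0] in ("embed", "shorts"):
            return parts[1]
    # Unknown host or unsupported URL shape.
    return None
```

Returning `None` for unrecognized links lets the caller decide whether to prompt again or exit.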

The system will:
- Fetch the video transcript
- Split it into manageable chunks
- Generate embeddings
- Create a vector store
- Allow you to ask questions about the video content
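
The steps above can be sketched end to end with the standard library. Everything here is a stand-in chosen for illustration, not the project's implementation: `split_text` is a simple fixed-size word splitter with overlap rather than a LangChain splitter, `embed` is a toy word-count vector instead of Gemini embeddings, and a plain Python list replaces the FAISS vector store.

```python
import math
from collections import Counter
from typing import List, Tuple


def split_text(text: str, chunk_size: int = 8, overlap: int = 2) -> List[str]:
    """Split text into chunks of `chunk_size` words; neighbors share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: a sparse word-count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks: List[str]) -> List[Tuple[str, Counter]]:
    """Stand-in vector store: a list of (chunk, embedding) pairs."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index: List[Tuple[str, Counter]], question: str, k: int = 1) -> List[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


transcript = (
    "vectorization replaces python loops with array operations numpy runs "
    "those operations in compiled code the video ends with a short quiz"
)
index = build_index(split_text(transcript))
top = search(index, "what is vectorization")
```

The overlap between neighboring chunks is a common way to avoid cutting a sentence's context in half at a chunk boundary; the right chunk size and overlap depend on the splitter and model actually used.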

Example

# `ask` is assumed to be the question-answering chain built in main.py
question = "What is vectorization?"
answer = ask.invoke(question)
print(answer)
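
What a call like `ask.invoke(question)` does internally is not shown in this README. As a rough, assumed picture of the usual retrieval-augmented pattern, the sketch below formats retrieved transcript chunks into a context block and fills a prompt template. The names `format_context` and `build_prompt` and the template wording are hypothetical and not taken from the repository.

```python
from typing import List

# Hypothetical prompt wording, for illustration only.
PROMPT_TEMPLATE = (
    "Answer the question using only the transcript excerpts below.\n"
    "If the answer is not in the excerpts, say you do not know.\n\n"
    "Excerpts:\n{context}\n\n"
    "Question: {question}\n"
)


def format_context(chunks: List[str]) -> str:
    """Number each retrieved chunk so the model can tell excerpts apart."""
    return "\n".join(f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1))


def build_prompt(question: str, chunks: List[str]) -> str:
    """Fill the template with the formatted context and the user's question."""
    return PROMPT_TEMPLATE.format(context=format_context(chunks), question=question)


prompt = build_prompt(
    "What is vectorization?",
    ["Vectorization replaces explicit loops with array operations."],
)
```

The resulting string is what would be sent to the chat model; the model's reply, after output parsing, is the kind of value the example above prints.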

Contributing

Fork the repository
Create your feature branch
Commit your changes
Push to the branch
Open a Pull Request

Acknowledgments

LangChain for the framework
Google's Gemini model for embeddings and chat
FAISS for vector storage
YouTube Transcript API for caption extraction
