x-tweet-url-scraper

A Node.js script to scrape tweets from a list of URLs and save the content to a Markdown file. This script uses Puppeteer for web scraping and displays a progress bar during the scraping process.

Features

Reads URLs from a file
Validates URLs to ensure they start with https://x.com/
Scrapes tweet content and user information
Saves the scraped content to a Markdown file
Displays a progress bar during the scraping process
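
The URL-handling features above can be sketched roughly as follows. This is a minimal illustration, not the code in scraper.js: the helper names `isValidTweetUrl` and `parseUrlList` are hypothetical, and trimming whitespace and skipping blank lines are assumptions about how the input file is treated.

```javascript
// Hypothetical helpers; the names are illustrative and not taken from scraper.js.
const REQUIRED_PREFIX = 'https://x.com/';

// Returns true when the URL starts with the prefix the README requires.
function isValidTweetUrl(url) {
  return typeof url === 'string' && url.startsWith(REQUIRED_PREFIX);
}

// Splits file contents into trimmed, non-empty lines (one URL per line),
// then separates accepted URLs from rejected ones.
function parseUrlList(text) {
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return {
    valid: lines.filter(isValidTweetUrl),
    invalid: lines.filter((line) => !isValidTweetUrl(line)),
  };
}
```

Returning the rejected lines alongside the accepted ones would let a script report which entries were skipped instead of dropping them silently.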

Prerequisites

Node.js (v12 or higher)
npm (Node Package Manager)

Installation

Clone the repository:

```
git clone https://github.com/your-username/x-tweet-scraper.git
cd x-tweet-scraper
```

Install the dependencies:
```
npm install
```

Usage

Prepare a text file containing the list of tweet URLs (one URL per line). Ensure that the URLs start with https://x.com/.
Run the script with the input file and output file as arguments:
```
node scraper.js input.txt output.md
```
- input.txt: Path to the file containing tweet URLs.
- output.md: Path to the output Markdown file where the scraped content will be saved.

Dependencies

puppeteer: For web scraping.
fs: Node.js built-in module for file system operations (no separate installation needed).
progress: For displaying a progress bar.
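
To show how these three modules might fit together, here is a rough sketch under stated assumptions. It is not the code in scraper.js: the function names `toMarkdown` and `scrapeAll`, the Markdown layout, and both `data-testid` selectors are assumptions, and X's page markup may differ or change.

```javascript
const fs = require('fs');

// Formats one scraped tweet as a Markdown section (the layout is an assumption).
function toMarkdown({ url, user, text }) {
  return `## ${user}\n\n${text}\n\nSource: ${url}\n`;
}

// Visits each URL in turn and writes the collected Markdown to outputPath.
async function scrapeAll(urls, outputPath) {
  // Loaded lazily so the formatter above can be used without these packages.
  const puppeteer = require('puppeteer');
  const ProgressBar = require('progress');

  const bar = new ProgressBar('[:bar] :current/:total', { total: urls.length });
  const browser = await puppeteer.launch();
  const sections = [];
  try {
    const page = await browser.newPage();
    for (const url of urls) {
      await page.goto(url, { waitUntil: 'networkidle2' });
      // Assumed selectors; $eval throws if no matching element is found.
      const text = await page.$eval('[data-testid="tweetText"]', (el) => el.textContent);
      const user = await page.$eval('[data-testid="User-Name"]', (el) => el.textContent);
      sections.push(toMarkdown({ url, user, text }));
      bar.tick(); // advance the progress bar by one URL
    }
  } finally {
    // Close the browser even if a page fails to load or a selector is missing.
    await browser.close();
  }
  fs.writeFileSync(outputPath, sections.join('\n'));
}
```

Keeping the formatting step separate from the browser code makes it possible to check the output layout without launching a browser.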

Example

```
node scraper.js urls.txt tweets.md
```
