Stars: tool (7 repositories)
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20 hours on one machine.
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/mEkkMXFG
Toolkit for linearizing PDFs for LLM datasets/training