git clone https://github.com/Strykez/fastscrape.git
cd fastscrape
./main.py
cp ./main.py fastscrape
chmod +x fastscrape
sudo mv fastscrape /bin
fastscrape
- Make a bin folder inside your user folder
- Copy the main.py script into it and remove its extension
- Rename the file to the name you want the command to have (in this case, fastscrape)
- Type "path" in the Windows search bar and press Enter
- Add the folder to the Path variable, as per this gif:
███████╗ █████╗ ███████╗████████╗███████╗ ██████╗██████╗ █████╗ ██████╗ ███████╗
██╔════╝██╔══██╗██╔════╝╚══██╔══╝██╔════╝██╔════╝██╔══██╗██╔══██╗██╔══██╗██╔════╝
█████╗ ███████║███████╗ ██║ ███████╗██║ ██████╔╝███████║██████╔╝█████╗
██╔══╝ ██╔══██║╚════██║ ██║ ╚════██║██║ ██╔══██╗██╔══██║██╔═══╝ ██╔══╝
██║ ██║ ██║███████║ ██║ ███████║╚██████╗██║ ██║██║ ██║██║ ███████╗
╚═╝ ╚═╝ ╚═╝╚══════╝ ╚═╝ ╚══════╝ ╚═════╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚══════╝
V 0.7
Made by Strykez
options:
-h, --help, show this help message and exits
-m, -man, --manual
-u, --url sets the script's URL
-s, --selector the selector string used in the script
selector format --> Column_name:selector.class/another_selector.another_class
If column name is empty, it will append to the current column, else it will create a new column and append
the data to it
Examples: Titles:div.card/div.first_half/p.title --> Gets all the instances of p.title in the specified path
Titles:p.title --> Gets all the instances of p.title in the page
p.title --> If you do not want a column name
-o, --output the path where you want the results to be saved in .csv format (creates the directory/ies if necessary)
if left blank it will print the selected elements to the terminal
-v, --verbose displays more information about the steps performed in the script
NOTE: Put the verbose argument as the last argument because putting it ahead can make the script crash
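The selector format described in the help text above can be sketched in Python as follows. This is only an illustration of the documented `Column_name:selector.class/another_selector.another_class` syntax, not the code in main.py: the function name `parse_selector` is hypothetical, and the sketch assumes the column name is whatever precedes the first `:` and that each path step is either `tag` or `tag.class`.

```python
from typing import List, Optional, Tuple

Step = Tuple[str, Optional[str]]


def parse_selector(spec: str) -> Tuple[Optional[str], List[Step]]:
    """Split 'Column:tag.class/tag.class' into a column name and path steps."""
    # Assumption: everything before the first ':' is the column name.
    column, separator, path = spec.partition(":")
    if not separator:
        # No ':' present, so the whole string is the selector path.
        column, path = "", spec
    steps: List[Step] = []
    for part in path.split("/"):
        # Each step is 'tag' or 'tag.class'; the class is None when absent.
        tag, _, css_class = part.partition(".")
        steps.append((tag, css_class or None))
    return (column or None), steps
```

For instance, `parse_selector("Titles:div.card/div.first_half/p.title")` yields the column name `"Titles"` and three steps, while `parse_selector("p.title")` yields no column name and a single step.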
- The script requires a valid URL and a valid selector to work.
- The verbose argument must come last for the command to work.
- If no output argument is given, the selected elements are printed to the console.
- You can give a specific path as an output argument, such as: Desktop/myfolder/results.csv
- You can give a path of nested selectors as the selector argument. For example: -s div.product_container/div.desc/p
- You can add column names to the .csv file to make it easier to read in Excel. Example: -s Price:div.product_info/p.price
- If the output path does not exist, the program will create it.
- If the selector has no column name, the results are appended to the last column created.
- If the selector has a column name and the .csv file already contains other columns, the results go into a new column.
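The column and output-path rules in the list above can be sketched roughly like this. It is a minimal illustration under stated assumptions, not the script's actual implementation: `add_results` and `write_csv` are hypothetical names, columns are assumed to be held in an ordered dictionary, and shorter columns are assumed to be padded with empty cells.

```python
import csv
import os
from itertools import zip_longest
from typing import Dict, List, Optional


def add_results(columns: Dict[str, List[str]], name: Optional[str], values: List[str]) -> None:
    """Add values to a named column, or to the last column if no name is given."""
    if name:
        # A column name was given: create the column if needed, then append.
        columns.setdefault(name, []).extend(values)
    elif columns:
        # No column name: append to the most recently created column.
        last = next(reversed(columns))
        columns[last].extend(values)
    else:
        # Assumption: with no columns yet, use a column with an empty header.
        columns[""] = list(values)


def write_csv(path: str, columns: Dict[str, List[str]]) -> None:
    """Write the columns to path, creating missing directories first."""
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns.keys())
        # Pad shorter columns with empty cells so every row has the same width.
        writer.writerows(zip_longest(*columns.values(), fillvalue=""))
```

Padding with `zip_longest` keeps the file rectangular, which is what spreadsheet tools such as Excel expect when columns have different lengths.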
These examples use the Quotes to Scrape website (https://quotes.toscrape.com/) as a dummy target.
Extracting all the elements with a specific selector and class (in this example, all quotes) and saving them to a .csv file inside a folder:
./main.py --url https://quotes.toscrape.com/ --selector span.text -o Desktop/some_folder/quotes.csv
./main.py --url https://quotes.toscrape.com/ --selector div.col-md-8/div.quote/span.text -o Desktop/some_folder/quotes.csv
./main.py --url https://quotes.toscrape.com/ --selector Quotes:div.col-md-8/div.quote/span.text -o Desktop/some_folder/quotes.csv
./main.py --url https://quotes.toscrape.com/page/2/ --selector div.col-md-8/div.quote/span.text -o Desktop/some_folder/quotes.csv
./main.py --url https://quotes.toscrape.com/page/3/ --selector Other_Quotes:div.col-md-8/div.quote/span.text -o Desktop/some_folder/quotes.csv
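The multi-page commands above can also be generated from a small Python wrapper. This is a sketch under assumptions: `build_command` and `scrape_pages` are hypothetical helpers, the script is assumed to be reachable as `./main.py` from the current directory, and only flags shown in the help text are used. The verbose flag is appended last, as the notes require.

```python
import subprocess
from typing import List, Optional

BASE_URL = "https://quotes.toscrape.com/"
SELECTOR = "div.col-md-8/div.quote/span.text"


def build_command(page: int, selector: str, output: Optional[str], verbose: bool = False) -> List[str]:
    """Build the argument list for one run of the script (hypothetical helper)."""
    # Page 1 is the site root; later pages live under page/<n>/.
    url = BASE_URL if page == 1 else f"{BASE_URL}page/{page}/"
    command = ["./main.py", "--url", url, "--selector", selector]
    if output:
        command += ["-o", output]
    if verbose:
        # The verbose flag must be the last argument.
        command.append("--verbose")
    return command


def scrape_pages(pages: List[int], output: str) -> None:
    """Run the script once per page, naming the column only on the first run."""
    for index, page in enumerate(pages):
        # Later runs omit the column name so results append to the same column.
        selector = f"Quotes:{SELECTOR}" if index == 0 else SELECTOR
        subprocess.run(build_command(page, selector, output), check=True)
```

Calling `scrape_pages([1, 2, 3], "Desktop/some_folder/quotes.csv")` from the repository folder would issue one command per page, in the same style as the examples above.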
Feel free to submit issues with bugs that need fixing or with new features that you wish to be added.
- Email: [email protected]
- Discord: Roshy#5849
- Twitter: @strykez_dev