parallel_aria2 is a small command-line helper that enables aria2c to download every file under a browsable HTTP/HTTPS URL tree without you having to manually build a URL list first.
It works by:
- Crawling the remote directory with wget in spider mode.
- Automatically generating an aria2c input file from the discovered URLs.
- Preserving the remote folder structure under a local output directory.
- Feeding that list into aria2c for fast, parallel downloads.

- Turn a directory-style HTTP/HTTPS URL into a full aria2c download job.
- Automatically discover all files reachable via wget recursion.
- Preserve the remote subdirectory structure under a local download root.
- Supports both HTTP Basic Authentication and anonymous (no-auth) access.
- Any extra CLI arguments after the URL are passed directly to aria2c.
- Simple, self-contained Bash script with built-in --help.

Check that both tools are installed and available in your PATH:
wget --version
aria2c --version

Clone the repository and make the script executable:
git clone https://github.com/<your-username>/parallel_aria2.git
cd parallel_aria2
chmod +x parallel_aria2

Optionally, move it somewhere on your PATH:
sudo mv parallel_aria2 /usr/local/bin/

Basic syntax:
parallel_aria2 <username> <password> <URL> [aria2c-extra-args...]

- username – HTTP basic auth username, or - for anonymous mode.
- password – HTTP basic auth password, or - for anonymous mode.
- URL – Root HTTP/HTTPS URL to crawl (directory/folder listing).
- aria2c-extra-args – Any additional options you want to pass directly to aria2c.

To see help:
parallel_aria2 --help

If your server requires HTTP Basic Authentication, pass your credentials as usual:
parallel_aria2 myuser mypass "https://secure.example.com/data/"

The script will:
- Provide --user / --password to wget, and
- Provide --http-user / --http-passwd to aria2c.

If the URL is anonymously accessible and you do not want to send any Authorization headers at all, use - for both username and password:
parallel_aria2 "-" "-" "https://example.com/public/"

In this mode, the script omits all auth options for both wget and aria2c. The server must allow public access to the content under the given URL.
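
The two modes can be sketched as a small Bash function. This is a minimal illustration, not the script's actual code: the function name build_auth_args and the array names WGET_AUTH and ARIA2_AUTH are hypothetical, and only the option names are taken from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: fill two arrays with auth options for wget and aria2c,
# or leave both empty when "-" "-" (anonymous mode) is given.
build_auth_args() {
  local user="$1" pass="$2"
  WGET_AUTH=()
  ARIA2_AUTH=()
  if [[ "$user" != "-" || "$pass" != "-" ]]; then
    WGET_AUTH=(--user="$user" --password="$pass")
    ARIA2_AUTH=(--http-user="$user" --http-passwd="$pass")
  fi
}
```

Expanding an empty array as "${WGET_AUTH[@]}" adds no arguments at all, which is one way to make sure no Authorization header is sent in anonymous mode.
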
You can still pass additional options to aria2c:
parallel_aria2 "-" "-" "https://example.com/public/" \
--max-tries=3 \
--max-overall-download-limit=5M

parallel_aria2 myuser mypass "https://example.com/data/"

This will:
- Crawl https://example.com/data/ recursively with wget.
- Generate an aria2c input file (default: downloadlist.txt) that includes directory hints.
- Download all discovered files in parallel, recreating the remote folder structure locally.

parallel_aria2 "-" "-" "https://example.com/public/data/"

No authentication headers will be sent; all files under https://example.com/public/data/ that wget can see will be downloaded with aria2c using parallel transfers.

parallel_aria2 myuser mypass "https://example.com/data/" \
--max-tries=3 \
--max-overall-download-limit=2M

All arguments after the URL are passed straight to aria2c.

DOWNLOAD_LIST_FILE=myfiles.txt \
DOWNLOAD_ROOT_DIR="downloads" \
parallel_aria2 myuser mypass "https://example.com/data/"

- Discovered URLs will be written to myfiles.txt.
- All files will be downloaded under the downloads/ directory, keeping the remote subdirectory layout.

parallel_aria2 uses wget in spider mode to crawl the URL tree:
- -r – recursive
- -np – no parent
- -nH – disable host-prefixed directories
- --cut-dirs=1 – drop one leading directory component (tune as needed)
- --reject "index.html*" – skip index pages
- --spider – check only, do not download files

The script parses wget's output to extract the discovered URLs.
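
As a rough illustration of that parsing step, the sketch below pulls URLs out of a wget log. It is not the script's own code: the function name extract_urls is hypothetical, and it assumes wget's default English log format, where each request is announced by a line of the form "--DATE TIME--  URL". Filtering out directory URLs and URLs with query strings is likewise an assumption about what counts as a file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a wget --spider log on stdin, print unique file URLs.
extract_urls() {
  # Keep only request lines such as "--2025-01-01 12:00:00--  https://host/path".
  grep -E '^--[0-9]{4}-[0-9]{2}-[0-9]{2} ' \
    | awk '{ print $3 }' \
    | grep -v '/$' \
    | grep -v '[?]' \
    | sort -u
}

# Example use (wget writes its log to stderr, hence 2>&1):
#   wget -r -np -nH --cut-dirs=1 --reject "index.html*" --spider "$url" 2>&1 | extract_urls
```
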
The script feeds an aria2-compatible input file to aria2c and uses these default options:
- -x8 – up to 8 connections per server
- -j12 – up to 12 parallel downloads
- -c – continue partial downloads
- -m0 – infinite retries
- -V – check file integrity (short form of aria2c's --check-integrity)
- --http-user, --http-passwd – only when you provide real credentials (not - / -)
- -i <download list> – input file containing URLs and per-URL options

You can override or extend aria2c behavior by passing extra command-line arguments after the URL.
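
A minimal sketch of how the defaults and the pass-through arguments might be combined is shown below. The function name aria2_command is hypothetical, auth options are left out for brevity, and the function only prints the command line instead of running it. Placing the extra arguments last is an assumption, made so that a repeated option can take precedence over the default.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: assemble the aria2c command line from the defaults
# listed above plus any extra arguments, and print it rather than run it.
aria2_command() {
  local list_file="$1"
  shift
  # Extra arguments come last so they can extend or override the defaults.
  local cmd=(aria2c -x8 -j12 -c -m0 -V -i "$list_file" "$@")
  printf '%s\n' "${cmd[*]}"
}

aria2_command downloadlist.txt --max-tries=3
```
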
A key goal of parallel_aria2 is to keep the same folder layout locally as you have under the starting URL.
Internally, the script:
- Normalizes the path portion of the starting URL.
- Normalizes the path of each discovered URL.
- Computes a relative path for each file (remote path minus the base path).
- Splits that relative path into:
  - a directory component, and
  - a file name.
- Writes each URL to the input file along with a dir=... directive for aria2c, so that:
  - Subdirectories are recreated under the chosen download root.
  - Files appear in the correct relative folders.

If you specify DOWNLOAD_ROOT_DIR, that directory becomes the top-level location under which the entire mirrored tree is created.
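
The steps above can be sketched in a few lines of Bash. This is an illustration under simplifying assumptions, not the script's implementation: the function name emit_entry is hypothetical, it strips the base URL as a plain string prefix instead of normalizing paths, and it does not decode percent-encoded characters. It relies on aria2c's input-file format, where indented lines after a URI set options for that URI.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print one aria2c input-file entry for a discovered URL.
# Arguments: base URL, file URL, download root (defaults to ".").
emit_entry() {
  local base_url="${1%/}/" file_url="$2" root="${3:-.}"
  local rel="${file_url#"$base_url"}"   # remote path minus the base path
  local rel_dir=""
  if [[ "$rel" == */* ]]; then
    rel_dir="${rel%/*}"                 # directory component, if there is one
  fi
  printf '%s\n' "$file_url"
  # Indented option line: tells aria2c which directory this file goes into.
  printf '  dir=%s\n' "${root%/}${rel_dir:+/$rel_dir}"
}

emit_entry "https://example.com/data/" "https://example.com/data/sub/deep/b.bin" "downloads"
```

For the call shown, the entry consists of the URL followed by an indented dir=downloads/sub/deep line, so b.bin ends up in downloads/sub/deep/.
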

- DOWNLOAD_LIST_FILE
  Name/path of the generated aria2c input file.
  Default: downloadlist.txt
- DOWNLOAD_ROOT_DIR
  Root directory under which the remote folder structure is recreated.
  Default: current working directory (.)

Example:
DOWNLOAD_LIST_FILE="aria_downloads.txt" \
DOWNLOAD_ROOT_DIR="/data/mirror" \
parallel_aria2 "-" "-" "https://example.com/public/data/"

- This script assumes:
  - Either HTTP Basic Authentication (username/password) or completely anonymous access.
  - Directory-style listings that wget can parse and crawl.
- It does not:
  - Handle complex web apps that require JavaScript or dynamic navigation.
  - Manage cookies or more advanced authentication flows out of the box.
- URL discovery is controlled by the wget options inside the script. If your server layout is different, you may need to adjust:
  - --cut-dirs
  - --reject patterns
  - Recursion depth (-l)

MIT License
Copyright (c) 2025 Farshad Farshidfar [email protected]
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.