
Commit 3618981

tanq committed:
fix channels, fix default to simple; fix arg and params and output name
1 parent dc16aa7 commit 3618981

File tree: 6 files changed (+158 / -99 lines)


README.md

Lines changed: 83 additions & 40 deletions
@@ -1,36 +1,47 @@
 <div align="center">
-<img src=".github/assets/logo.png" alt="Danzo Logo" width="250">
+<img src=".github/assets/logo.png" alt="Danzo Logo" width="300">

-<a href="https://github.com/tanq16/danzo/actions/workflows/binary.yml"><img alt="Build Binary" src="https://github.com/tanq16/danzo/actions/workflows/binary.yml/badge.svg"></a> <a href="https://github.com/Tanq16/danzo/releases"><img alt="GitHub Release" src="https://img.shields.io/github/v/release/tanq16/danzo"></a><br><br>
-<a href="#features">Features</a> &bull; <a href="#installation-and-usage">Install & Use</a> &bull; <a href="#tips-and-notes">Tips & Notes</a>
+<a href="https://github.com/tanq16/danzo/actions/workflows/binary.yml"><img alt="Build" src="https://github.com/tanq16/danzo/actions/workflows/binary.yml/badge.svg"></a> <a href="https://github.com/Tanq16/danzo/releases"><img alt="GitHub Release" src="https://img.shields.io/github/v/release/tanq16/danzo"></a><br><br>
+<a href="#features">Features</a> &bull; <a href="#install-and-usage">Install & Use</a> &bull; <a href="#tips-and-notes">Tips & Notes</a>
 </div>

 ---

-***Danzo*** is a cross-platform and cross-architecture CLI downloader utility designed for fast parallel connections, progress tracking, and an easy to use binary. The tool aims to maximize download speeds by utilizing optimized buffer sizes and parallel processing.
+***Danzo*** is a cross-platform and cross-architecture CLI downloader utility designed for multi-threaded downloads, progress tracking, and an intuitive command structure. Danzo maximizes download speeds by using a large number of goroutines.

-Yes, the name is the same as a Naruto character who has a hobby of collecting many things, reprentative of parallel connections used in this tool.
+*Side note - yes, the name is the same as a Naruto character with a hobby of collecting and using multiple "items", representative of the parallel connections used in this tool.*

 ## Features

-- Multi-connection downloads to improve speed
-- Automatic chunk size optimization
-- Real-time progress display with speed and ETA
-- Batch downloading with YAML configuration
-- Parallel downloading of multiple files
-- Customizable user agent and timeout settings
+- Multiple connection threads for high-speed downloads and assembly
+- Temporary directory for chunk downloads
+- Automatic cleanup of temporary files
+- Manual cleanup of temporary files in case of failures
+- Automatic optimization of chunk size vs. threads
+- Direct single-threaded download preference for small chunk sizes
+- Fallback to single-thread operation when byte-range requests are unsupported
+- Automatic configuration of the TCP dialer's high-thread mode (>6 connection threads)
+- Real-time rotating progress display with average speed and ETA
+- Multi-worker (second threading layer) batch file downloads with a YAML config
+- Customizable download parameters
+- Custom or randomized user agent strings
+- Custom timeout settings
+- Configurable worker and connection threads (capped at 64 total)
 - Support for HTTP or HTTPS proxies
+- Configurable (optional, except for batch YAML config) output filenames
+- Automatic numbering of existing names for single URL downloads
+- Automatic output name inference from URL path

 ## Install and Use

-### Using Binary
+### Release Binary (Recommended)

 1. Download the appropriate binary for your system from the [latest release](https://github.com/tanq16/danzo/releases/latest)
-2. Make the binary executable (Linux/macOS) with `chmod +x danzo-*`
-3. Run the binary with:
+2. Make the binary executable (Linux/macOS) with `chmod +x danzo-*` and optionally rename it to just `danzo`
+3. Run the binary:

 ```bash
-./danzo --url "https://example.com/largefile.zip" --output "./downloaded-file.zip"
+danzo "https://example.com/largefile.zip"
 ```

 ### Using Go
@@ -44,10 +55,8 @@ go install github.com/tanq16/danzo@latest
 Or, you can build from source like so:

 ```bash
-git clone https://github.com/tanq16/danzo.git
-cd danzo
+git clone https://github.com/tanq16/danzo.git && cd danzo
 go build .
-./danzo --url "https://example.com/largefile.zip" --output "./downloaded-file.zip"
 ```

 ### Command Options
@@ -65,63 +74,97 @@ Available Commands:
   help        Help about any command

 Flags:
-  -c, --connections int               Number of connections per download (default: CPU cores) (default 16)
+  -c, --connections int               Number of connections per download (default 4)
       --debug                         Enable debug logging
   -h, --help                          help for danzo
-  -k, --keep-alive-timeout duration   Keep-alive timeout for client (eg./ 10s, 1m, 80s; default: 90s) (default 1m30s)
+  -k, --keep-alive-timeout duration   Keep-alive timeout for client (eg. 10s, 1m, 80s) (default 1m30s)
   -o, --output string                 Output file path (required with --url/-u)
   -p, --proxy string                  HTTP/HTTPS proxy URL (e.g., proxy.example.com:8080)
-  -t, --timeout duration              Connection timeout (eg., 5s, 10m; default: 3m) (default 3m0s)
-  -u, --url string                    URL to download
+  -t, --timeout duration              Connection timeout (eg. 5s, 10m) (default 3m0s)
   -l, --urllist string                Path to YAML file containing URLs and output paths
-  -a, --user-agent string             User agent (default "Danzo/1337")
-  -w, --workers int                   Number of links to download in parallel (default: 1) (default 1)
+  -a, --user-agent string             User agent (default "danzo/1337")
+  -w, --workers int                   Number of links to download in parallel (default 1)

 Use "danzo [command] --help" for more information about a command.
 ```

+### Basic Usage
+
+The simplest way to download a file is to provide a URL directly:
+
+```bash
+danzo https://example.com/largefile.zip
+```
+
+The output filename will be inferred from the URL, and Danzo will use 4 connection threads by default. You can also specify an output filename manually with:
+
+```bash
+danzo https://example.com/largefile.zip -o ./path/to/file.zip
+```
+
+> [!NOTE]
+> The value for `-c` can go up to `64` for a single URL. Danzo creates chunks equal to the number of connections requested. Once all chunks are downloaded, they are combined into a single file. If the resulting chunks would be smaller than 20 MB, Danzo falls back to a single-threaded download for that file. This number was **arbitrarily** chosen based on heuristics.
+
+You can customize the number of connections to use like so:
+
+```bash
+danzo "https://example.com/largefile.zip" -c 16
+```
+
+> [!WARNING]
+> You should be careful of disk IO as well. Multi-connection downloads add disk IO, which can add to the overall time before the file is ready.
+>
+> For example, a 1 GB file takes 54 seconds when using 50 connections vs. 62 seconds when using 64 connections. This is because combining 64 files takes longer than combining 50 files.
+>
+> Therefore, you need to find a balance where the number of connections maximizes your network throughput without putting extra strain on disk IO. This effect is especially observable on HDDs.
+
+Lastly, if a URL does not support byte-range requests (i.e., the server doesn't support partial content downloads), Danzo automatically switches to a simple, single-threaded, direct download.
+
 ### Batch Download

-For downloading multiple files, create a YAML file with the following format:
+Danzo can be provided a YAML config to allow simultaneous downloads of several URLs. Each URL in turn uses multi-threaded connection mode by default to maximize throughput. The YAML file requires the following format:

 ```yaml
 - op: "./output1.zip"
   link: "https://example.com/file1.zip"
 - op: "./output2.zip"
   link: "https://example.com/file2.zip"
+# more entries with output paths and URLs...
 ```

-Then run as:
+Then run Danzo as:

 ```bash
-./danzo --urllist "./downloads.yaml"
+danzo -l config.yaml
 ```

-Number of workers and connections per worker can be specified as follows:
+The number of files downloaded in parallel can be configured as workers (default: 1), and the number of connections is applied per worker. Define these parameters as follows:

 ```bash
-./danzo -l downloads.yaml -w 3 -c 16
+danzo -l downloads.yaml -w 3 -c 16
 ```

 > [!NOTE]
-> Danzo caps the total number of parallel workers at 64. Specifically `# workers * # connections <= 64`. This is a sensible default to prevent overwhelming the system.
+> Danzo caps the total number of parallel workers at 64. Specifically, `#workers * #connections <= 64`. This is a generous default to prevent overwhelming the system.

 ### Cleaning Temporary Files

-Danzo stores partial downloads on disk in the `.danzo-temp` directory (situated in the same path as the associated output path). If a download event is interrupted or failed, the temporary files can be cleared by specifying the output path like so:
+Danzo stores partial downloads on disk in the `.danzo-temp` directory (situated in the same path as the associated output path). If a download is interrupted or fails, the temporary files can be cleared by using the `clean` command:

 ```bash
-./danzo clean --output "./downloaded-file.zip"
+danzo clean -o "./path/for/download.zip"
 ```

+For batch downloads, you may need to run the clean command for each output path individually if they don't share the same parent directory.
+
 ## Tips and Notes

-- For optimal download speeds, the number of connections is automatically set to match your CPU cores, but you can adjust this with the -c flag
-- Large files benefit the most from multiple connections
-- If a download fails, Danzo will retry individual chunks up to 5 times
+- Large files benefit the most from multiple connections, but they also add to disk IO. Be mindful of the balance between network and disk IO.
+- If a chunk download fails, Danzo will retry individual chunks up to 5 times.
 - For downloading through a proxy, use the `--proxy` or `-p` flag with your proxy URL (you needn't provide the HTTP scheme, Danzo matches it to that of the URL)
-- *Not all servers support multi-connection downloads (range requests)*
-- For servers with rate limiting, reducing the number of connections might help
-- Debug mode (`--debug`) provides detailed information about the download process
-- Temporary files are stored in a .danzo-temp directory and automatically cleaned up after download
-- Use `-a randomize` to randomize the user agent for every HTTP client
+- Not all servers support multi-connection downloads (range requests), in which case Danzo auto-switches to simple downloads.
+- For servers with rate limiting, reducing the number of connections might help.
+- Debug mode (`--debug`) provides detailed information about the download process.
+- Temporary files are automatically cleaned up after successful downloads.
+- Use `-a randomize` to randomly assign a user agent for every HTTP client. The full list of user agents considered is stored in the [helpers.go](https://github.com/Tanq16/danzo/blob/main/internal/helpers.go) file.
+- The tool automatically activates "high-thread-mode" when using more than 6 connections, which optimizes socket buffer sizes for better performance.
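To make the 20 MB fallback rule from the note above concrete, here is a minimal Go sketch of the decision. It is illustrative only; `chooseMode` is not part of Danzo's code, although the same chunk-size check appears in the `internal/downloader.go` changes further down.

```go
package main

import "fmt"

// chooseMode sketches the fallback rule described in the README note: if
// splitting the file across the requested connections would yield chunks
// smaller than 20 MB, or only one connection is requested, a simple
// single-threaded download is used instead of a multi-connection one.
func chooseMode(fileSize int64, connections int) string {
	const minChunk = 20 * 1024 * 1024 // 20 MB threshold, chosen heuristically per the README
	if connections <= 1 || fileSize/int64(connections) < minChunk {
		return "simple"
	}
	return "multi-connection"
}

func main() {
	fmt.Println(chooseMode(50*1024*1024, 16))     // ~3 MB chunks   -> simple
	fmt.Println(chooseMode(2*1024*1024*1024, 16)) // ~128 MB chunks -> multi-connection
}
```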

cmd/root.go

Lines changed: 43 additions & 43 deletions
@@ -42,66 +42,66 @@ var rootCmd = &cobra.Command{
             log.Fatal().Msg("Cannot specify url argument and --urllist together, choose one")
         }
         url := ""
-        if len(args) > 1 {
+        if len(args) > 0 {
+            // Handle single URL download
             url = args[0]
             if _, err := u.Parse(url); err != nil {
                 log.Fatal().Err(err).Msg("Invalid URL format")
             }
-        }
-
-        // Handle single URL download
-        if url != "" {
             if output == "" {
                 parsedURL, _ := u.Parse(url)
                 output = strings.Split(parsedURL.Path, "/")[len(strings.Split(parsedURL.Path, "/"))-1]
                 log.Debug().Str("output", output).Msg("Output file path not specified, using URL path")
             }
             entries := []internal.DownloadEntry{{URL: url, OutputPath: output}}
+            if _, err := os.Stat(output); err == nil {
+                entries[0].OutputPath = internal.RenewOutputPath(output)
+            }
             err := internal.BatchDownload(entries, 1, connections, timeout, kaTimeout, userAgent, proxyURL)
             if err != nil {
                 log.Fatal().Err(err).Msg("Download failed")
             }
             return
-        }
-
-        // Handle batch download from URL list file
-        entries, err := internal.ReadDownloadList(urlListFile)
-        if err != nil {
-            log.Fatal().Err(err).Msg("Failed to read URL list file")
-        }
-        connectionsPerLink := connections
-        maxConnections := 64
-        if numLinks*connectionsPerLink > maxConnections {
-            connectionsPerLink = max(maxConnections/numLinks, 1)
-            log.Warn().Int("connections", connectionsPerLink).Int("numLinks", numLinks).Msg("adjusted connections to below max limit")
-        }
-        err = internal.BatchDownload(entries, numLinks, connectionsPerLink, timeout, kaTimeout, userAgent, proxyURL)
-        if err != nil {
-            log.Fatal().Err(err).Msg("Batch download completed with errors")
+        } else {
+            // Handle batch download from URL list file
+            entries, err := internal.ReadDownloadList(urlListFile)
+            if err != nil {
+                log.Fatal().Err(err).Msg("Failed to read URL list file")
+            }
+            connectionsPerLink := connections
+            maxConnections := 64
+            if numLinks*connectionsPerLink > maxConnections {
+                connectionsPerLink = max(maxConnections/numLinks, 1)
+                log.Warn().Int("connections", connectionsPerLink).Int("numLinks", numLinks).Msg("adjusted connections to below max limit")
+            }
+            err = internal.BatchDownload(entries, numLinks, connectionsPerLink, timeout, kaTimeout, userAgent, proxyURL)
+            if err != nil {
+                log.Fatal().Err(err).Msg("Batch download completed with errors")
+            }
         }
     },
 }

-var simpleCmd = &cobra.Command{
-    Use: "simple",
-    Short: "Simple mode for single threaded direct download",
-    Args: cobra.ExactArgs(1),
-    Run: func(cmd *cobra.Command, args []string) {
-        url := args[0]
-        if _, err := u.Parse(url); err != nil {
-            log.Fatal().Err(err).Msg("Invalid URL format")
-        }
-        if output == "" {
-            parsedURL, _ := u.Parse(url)
-            output = strings.Split(parsedURL.Path, "/")[len(strings.Split(parsedURL.Path, "/"))-1]
-            log.Debug().Str("output", output).Msg("Output file path not specified, using URL path")
-        }
-        err := internal.SimpleDownload(url, output)
-        if err != nil {
-            log.Fatal().Err(err).Msg("Download failed")
-        }
-    },
-}
+// var simpleCmd = &cobra.Command{
+//     Use: "simple",
+//     Short: "Simple mode for single threaded direct download",
+//     Args: cobra.ExactArgs(1),
+//     Run: func(cmd *cobra.Command, args []string) {
+//         url := args[0]
+//         if _, err := u.Parse(url); err != nil {
+//             log.Fatal().Err(err).Msg("Invalid URL format")
+//         }
+//         if output == "" {
+//             parsedURL, _ := u.Parse(url)
+//             output = strings.Split(parsedURL.Path, "/")[len(strings.Split(parsedURL.Path, "/"))-1]
+//             log.Debug().Str("output", output).Msg("Output file path not specified, using URL path")
+//         }
+//         err := internal.SimpleDownload(url, output)
+//         if err != nil {
+//             log.Fatal().Err(err).Msg("Download failed")
+//         }
+//     },
+// }

 var cleanCmd = &cobra.Command{
     Use: "clean",
@@ -136,6 +136,6 @@ func init() {
     rootCmd.AddCommand(cleanCmd)
     cleanCmd.Flags().StringVarP(&cleanOutput, "output", "o", "", "Output file path")

-    rootCmd.AddCommand(simpleCmd)
-    simpleCmd.Flags().StringVarP(&output, "output", "o", "", "Output file path")
+    // rootCmd.AddCommand(simpleCmd)
+    // simpleCmd.Flags().StringVarP(&output, "output", "o", "", "Output file path")
 }
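The 64-connection cap mentioned in the README corresponds to the `maxConnections` adjustment in the batch branch above. A small self-contained sketch of that arithmetic (illustrative only; the repository does this inline in cmd/root.go, and `max` is the Go 1.21+ builtin the diff also uses):

```go
package main

import "fmt"

// capConnections reproduces the cap adjustment: if workers * connections
// would exceed 64 total, the per-worker connection count is reduced.
func capConnections(workers, connections int) int {
	const maxConnections = 64
	if workers*connections > maxConnections {
		return max(maxConnections/workers, 1)
	}
	return connections
}

func main() {
	fmt.Println(capConnections(3, 16)) // 48 total -> 16 per worker, unchanged
	fmt.Println(capConnections(5, 16)) // 80 total -> reduced to 12 per worker
}
```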

internal/downloader.go

Lines changed: 16 additions & 11 deletions
@@ -11,7 +11,7 @@ import (

 func BatchDownload(entries []DownloadEntry, numLinks int, connectionsPerLink int, timeout time.Duration, kaTimeout time.Duration, userAgent string, proxyURL string) error {
     log := GetLogger("downloader")
-    log.Info().Int("totalFiles", len(entries)).Int("numLinks", numLinks).Int("connections", connectionsPerLink).Msg("Initiating download")
+    log.Info().Int("totalFiles", len(entries)).Int("workers", numLinks).Int("connections", connectionsPerLink).Msg("Initiating download")

     progressManager := NewProgressManager()
     progressManager.StartDisplay()
@@ -80,8 +80,16 @@ func BatchDownload(entries []DownloadEntry, numLinks int, connectionsPerLink int
             progressManager.Complete(outputPath, totalDownloaded)
         }(entry.OutputPath, progressCh)

-        if err == ErrRangeRequestsNotSupported {
-            err = performSimpleDownload(entry.URL, entry.OutputPath, client, config.UserAgent, progressCh)
+        if err == ErrRangeRequestsNotSupported || config.Connections == 1 {
+            logger.Debug().Str("output", entry.OutputPath).Msg("SIMPLE DOWNLOAD with 1 connection")
+            simpleClient := createHTTPClient(config.Timeout, config.KATimeout, config.ProxyURL, false)
+            err = performSimpleDownload(entry.URL, entry.OutputPath, simpleClient, config.UserAgent, progressCh)
+            close(progressCh)
+        } else if fileSize/int64(config.Connections) < 20*1024*1024 {
+            logger.Debug().Str("output", entry.OutputPath).Msg("SIMPLE DOWNLOAD bcz low file size")
+            simpleClient := createHTTPClient(config.Timeout, config.KATimeout, config.ProxyURL, false)
+            err = performSimpleDownload(entry.URL, entry.OutputPath, simpleClient, config.UserAgent, progressCh)
+            close(progressCh)
         } else {
             err = downloadWithProgress(config, client, fileSize, progressCh)
         }
@@ -112,24 +120,21 @@ func BatchDownload(entries []DownloadEntry, numLinks int, connectionsPerLink int

 func downloadWithProgress(config DownloadConfig, client *http.Client, fileSize int64, progressCh chan<- int64) error {
     log := GetLogger("download-worker")
-    // client := createHTTPClient(config.Timeout, config.KATimeout, config.ProxyURL)
     log.Debug().Str("size", formatBytes(uint64(fileSize))).Msg("File size determined")
     job := DownloadJob{
         Config:    config,
         FileSize:  fileSize,
         StartTime: time.Now(),
     }
+    tempDir := filepath.Join(filepath.Dir(config.OutputPath), ".danzo-temp")
+    if err := os.MkdirAll(tempDir, 0755); err != nil {
+        log.Error().Err(err).Str("dir", tempDir).Msg("Error creating temp directory")
+        return fmt.Errorf("error creating temp directory: %v", err)
+    }

     // Setup chunks
     mutex := &sync.Mutex{}
     chunkSize := fileSize / int64(config.Connections)
-    minChunkSize := int64(2 * 1024 * 1024) // 2MB minimum
-    if chunkSize < minChunkSize && fileSize > minChunkSize {
-        newConnections := max(int(fileSize/minChunkSize), 1)
-        log.Debug().Int("oldConnections", config.Connections).Int("newConnections", newConnections).Msg("Adjust connections for min. chunk size")
-        config.Connections = newConnections
-        chunkSize = fileSize / int64(config.Connections)
-    }
     log.Debug().Int("connections", config.Connections).Str("chunkSize", formatBytes(uint64(chunkSize))).Msg("Download configuration")
     var currentPosition int64 = 0
     for i := range config.Connections {
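`downloadWithProgress` goes on to assign one byte range per connection using the `chunkSize` and `currentPosition` values above. The loop body is not part of this hunk, so the following is only a rough sketch of how such a split typically looks, with the last chunk absorbing the remainder; Danzo's actual chunk bookkeeping may differ.

```go
package main

import "fmt"

// byteRange holds inclusive bounds suitable for a "Range: bytes=start-end" header.
type byteRange struct {
	start, end int64
}

// splitRanges divides a file into one byte range per connection.
func splitRanges(fileSize int64, connections int) []byteRange {
	chunkSize := fileSize / int64(connections)
	ranges := make([]byteRange, 0, connections)
	var pos int64
	for i := 0; i < connections; i++ {
		end := pos + chunkSize - 1
		if i == connections-1 {
			end = fileSize - 1 // last chunk takes any remainder
		}
		ranges = append(ranges, byteRange{start: pos, end: end})
		pos = end + 1
	}
	return ranges
}

func main() {
	for _, r := range splitRanges(100, 3) {
		fmt.Printf("bytes=%d-%d\n", r.start, r.end) // bytes=0-32, 33-65, 66-99
	}
}
```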

internal/helpers.go

Lines changed: 15 additions & 0 deletions
@@ -108,6 +108,21 @@ func ReadDownloadList(filePath string) ([]DownloadEntry, error) {
     return entries, nil
 }

+func RenewOutputPath(outputPath string) string {
+    dir := filepath.Dir(outputPath)
+    base := filepath.Base(outputPath)
+    ext := filepath.Ext(base)
+    name := base[:len(base)-len(ext)]
+    index := 1
+    for {
+        outputPath = filepath.Join(dir, fmt.Sprintf("%s-(%d)%s", name, index, ext))
+        if _, err := os.Stat(outputPath); os.IsNotExist(err) {
+            return outputPath
+        }
+        index++
+    }
+}
+
 func createHTTPClient(timeout time.Duration, keepAliveTO time.Duration, proxyURL string, highThreadMode bool) *http.Client {
     transport := &http.Transport{
         MaxIdleConns: 100,
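To illustrate the naming scheme of the new helper, here is a self-contained copy of the same logic (the real function is the `RenewOutputPath` added above; `renew` and the `report.zip` path are only example names). With `report.zip` already on disk, the single-URL path in cmd/root.go would end up writing `report-(1).zip`, then `report-(2).zip`, and so on.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// renew mirrors RenewOutputPath: append "-(n)" before the extension,
// incrementing n until the candidate path does not exist yet.
func renew(outputPath string) string {
	dir := filepath.Dir(outputPath)
	base := filepath.Base(outputPath)
	ext := filepath.Ext(base)
	name := base[:len(base)-len(ext)]
	for index := 1; ; index++ {
		candidate := filepath.Join(dir, fmt.Sprintf("%s-(%d)%s", name, index, ext))
		if _, err := os.Stat(candidate); os.IsNotExist(err) {
			return candidate
		}
	}
}

func main() {
	// Prints "report-(1).zip" when no numbered variants exist yet; each
	// existing numbered file pushes the result to the next index.
	fmt.Println(renew("report.zip"))
}
```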

0 commit comments