Commit

Improved benchmarks
Fully automated benchmarks with the run_benchmarks.py script.
Also updated the README to add benchmark information and other nice
things.
JacobCallahan committed Mar 28, 2024
1 parent 8941f5e commit a4105a8
Showing 9 changed files with 171 additions and 101 deletions.
24 changes: 24 additions & 0 deletions README.md
@@ -1,4 +1,8 @@
# Hussh: SSH for humans.
[![image](https://img.shields.io/pypi/v/hussh.svg)](https://pypi.python.org/pypi/hussh)
[![image](https://img.shields.io/pypi/pyversions/hussh.svg)](https://pypi.python.org/pypi/hussh)
[![Actions status](https://github.com/jacobcallahan/hussh/actions/workflows/build_and_test.yml/badge.svg)](https://github.com/jacobcallahan/hussh/actions)

Hussh (pronounced "hush") is a client-side SSH library that offers low-level performance through a high-level interface.

Hussh uses [pyo3](https://docs.rs/pyo3/latest/pyo3/) to create Python bindings around the [ssh2](https://docs.rs/ssh2/latest/ssh2/) library for Rust.
@@ -24,6 +28,25 @@ That's it! One import and class instantiation is all you need to:
- Perform SFTP actions
- Get an interactive shell

# Why Hussh?
- 🔥 Blazingly fast!
- 🪶 Incredibly lightweight!
- 🧠 Super easy to use!

## Benchmarks
Hussh delivers the performance you'd expect from a low-level SSH library.
It is also much lighter weight, in both total memory use and number of memory allocations.

Local Server
![Local Server Benchmarks](benchmarks/local_server_bench.png)

Remote Server
![Remote Server Benchmarks](benchmarks/remote_server_bench.png)

### Try it for yourself!
Hussh's benchmark scripts are also open source, in the `benchmarks` directory of this repository.
Clone the repo, follow the setup instructions, then let us know how it did!
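Every script in `benchmarks` uses the same measurement pattern: take a `timeit.default_timer()` reading, perform one operation, and store the elapsed milliseconds in a results dict. A minimal standalone sketch of that pattern (the summed range is just a stand-in for a real SSH operation):

```python
import timeit

results_dict = {}

start = timeit.default_timer()
_ = sum(range(100_000))  # stand-in for the SSH operation being timed
# Store the elapsed wall-clock time, formatted in ms as the bench scripts do.
results_dict["cmd_time"] = f"{(timeit.default_timer() - start) * 1000:.2f} ms"

print(results_dict)
```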

# Authentication
You've already seen password-based authentication, but here it is again.
```python
@@ -138,6 +161,7 @@ With that said, try it out and let me know your thoughts!

# Future Features
- Proper exception handling
- Concurrent actions class
- Async Connection class
- Low level bindings
- Misc codebase improvements
28 changes: 11 additions & 17 deletions benchmarks/README.md
@@ -1,3 +1,8 @@
# Benchmark Testing
Benchmarking SSH libraries is difficult due to variations in network conditions.
Results are likely to vary from run to run. However, there are some things we can do to reduce this uncertainty.
The first is to remove as much network variability as we reasonably can by running a local test server (see below).
It is also a good idea to run the benchmarks a few times and take the average.
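The averaging advice above can be sketched as a small helper. This is not part of the benchmark suite; the function name is illustrative:

```python
import statistics
import timeit

def average_runtime_ms(fn, runs=5):
    """Call fn `runs` times and return the mean wall-clock duration in ms."""
    durations = []
    for _ in range(runs):
        start = timeit.default_timer()
        fn()
        durations.append((timeit.default_timer() - start) * 1000)
    return statistics.mean(durations)

# Example: average the cost of a cheap pure-Python operation.
mean_ms = average_runtime_ms(lambda: sum(range(10_000)))
print(f"{mean_ms:.2f} ms")
```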

## Install benchmarking requirements
First, you will either need your own test target, putting that information in the `target.json` file in this directory.
@@ -12,25 +17,14 @@ pip install -r requirements.txt
```

## Running all benchmark scripts
-To run the test scripts, you can either manually run each one, or use this bash loop.
+To run the test scripts, you just need to execute the run_benchmarks.py script.
 ```bash
-for file in bench_*.py; do echo "$file"; python "$file"; done
+python run_benchmarks.py
 ```
-This will also create a memray output file for each script ran.
-We'll use these in the next step.
-
-## Getting the total memory consumption for all benchmarks
-This loop will get summaries for each benchmark's memory consumption and pull out the total memory and allocations.
-```bash
-for file in memray-bench_*; do echo "$file"; memray summary -r 1 "$file" | grep " at " | tr -d '[:space:]' | awk -F '' '{print "Memory: " $3 ", Allocations: " $7}'; done
-```
+This will ultimately collect all the benchmark and memray information into a table.
 
-## Cleanup
-You likely don't want the memray files to hang around, so you can easily delete them by running this.
+Alternatively, if you'd prefer to run individual benchmarks, you can do that.
 ```bash
-rm -f memray-bench_*
+python test_hussh.py
 ```
-
-# ToDo
-- Improve reporting by putting it all in a nice table.
-- Remove the need to execute commands manually by scripting it all.
+This will also create a memray output file for each script run.
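One way such a results table could be rendered from the collected timings; the helper and the sample values below are illustrative, not taken from run_benchmarks.py:

```python
# Illustrative timings only; real numbers come from running the suite.
results = {
    "hussh": {"import_time": "5.05 ms", "cmd_time": "10.21 ms"},
    "paramiko": {"import_time": "55.35 ms", "cmd_time": "25.74 ms"},
}

metrics = list(next(iter(results.values())))
header = ["library", *metrics]
rows = [[lib, *(timings[m] for m in metrics)] for lib, timings in results.items()]
# Pad each column to its widest cell so the table lines up.
widths = [max(len(str(row[i])) for row in [header, *rows]) for i in range(len(header))]
for row in [header, *rows]:
    print("  ".join(str(cell).ljust(width) for cell, width in zip(row, widths)))
```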
47 changes: 25 additions & 22 deletions benchmarks/bench_fabric.py
@@ -1,14 +1,19 @@
 import json
 import memray
 import timeit
+from pprint import pprint
 from pathlib import Path
 
-with memray.Tracker("memray-bench_fabric.bin"):
+results_dict = {}
+
+if (mem_path := Path("memray-bench_fabric.bin")).exists():
+    mem_path.unlink()
+with memray.Tracker("memray-bench_fabric.bin", native_traces=True, follow_fork=True):
     start_time = timeit.default_timer()
 
     from fabric import Connection
 
-    import_time = timeit.default_timer() - start_time
+    results_dict["import_time"] = f"{(timeit.default_timer() - start_time) * 1000:.2f} ms"
     host_info = json.loads(Path("target.json").read_text())
 
     temp_time = timeit.default_timer()
@@ -23,53 +28,51 @@
         },
     )
     conn.open()
-    connect_time = timeit.default_timer() - temp_time
+    results_dict["connect_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     temp_time = timeit.default_timer()
     result = conn.run("echo test")
-    run_time = timeit.default_timer() - temp_time
+    results_dict["cmd_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     # small file (1kb)
     temp_time = timeit.default_timer()
     conn.put("1kb.txt", "/root/1kb.txt")
-    s_put_time = timeit.default_timer() - temp_time
+    results_dict["s_put_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     temp_time = timeit.default_timer()
     conn.get("/root/1kb.txt", "small.txt")
-    s_get_time = timeit.default_timer() - temp_time
+    results_dict["s_get_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
     Path("small.txt").unlink()
 
     # medium file (14kb)
     temp_time = timeit.default_timer()
     conn.put("14kb.txt", "/root/14kb.txt")
-    m_put_time = timeit.default_timer() - temp_time
+    results_dict["m_put_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     temp_time = timeit.default_timer()
     conn.get("/root/14kb.txt", "medium.txt")
-    m_get_time = timeit.default_timer() - temp_time
+    results_dict["m_get_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
     Path("medium.txt").unlink()
 
     # large file (64kb)
     temp_time = timeit.default_timer()
     conn.put("64kb.txt", "/root/64kb.txt")
-    l_put_time = timeit.default_timer() - temp_time
+    results_dict["l_put_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     temp_time = timeit.default_timer()
     conn.get("/root/64kb.txt", "large.txt")
-    l_get_time = timeit.default_timer() - temp_time
+    results_dict["l_get_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
     Path("large.txt").unlink()
 
     conn.close()
 
-    total_time = timeit.default_timer() - start_time
-
-    print(f"import_time: {import_time * 1000:.2f} ms")
-    print(f"connect_time: {connect_time * 1000:.2f} ms")
-    print(f"run_time: {run_time * 1000:.2f} ms")
-    print(f"s_put_time: {s_put_time * 1000:.2f} ms")
-    print(f"s_get_time: {s_get_time * 1000:.2f} ms")
-    print(f"m_put_time: {m_put_time * 1000:.2f} ms")
-    print(f"m_get_time: {m_get_time * 1000:.2f} ms")
-    print(f"l_put_time: {l_put_time * 1000:.2f} ms")
-    print(f"l_get_time: {l_get_time * 1000:.2f} ms")
-    print(f"total_time: {total_time * 1000:.2f} ms")
+    results_dict["total_time"] = f"{(timeit.default_timer() - start_time) * 1000:.2f} ms"
+
+pprint(results_dict, sort_dicts=False)
+
+if Path("bench_results.json").exists():
+    results = json.loads(Path("bench_results.json").read_text())
+else:
+    results = {}
+results.update({"fabric": results_dict})
+Path("bench_results.json").write_text(json.dumps(results, indent=2))
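All three benchmark scripts end with the same read-merge-write of `bench_results.json`, so each library's run adds its timings without clobbering the others. The same pattern in isolation, pointed at a temporary directory so it is safe to run anywhere:

```python
import json
import tempfile
from pathlib import Path

def merge_results(path, library, results_dict):
    # Read the shared results file if present, merge in this library's
    # timings, and write it back, preserving earlier entries.
    results = json.loads(path.read_text()) if path.exists() else {}
    results.update({library: results_dict})
    path.write_text(json.dumps(results, indent=2))
    return results

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "bench_results.json"
    merge_results(path, "fabric", {"cmd_time": "12.34 ms"})
    merged = merge_results(path, "hussh", {"cmd_time": "1.23 ms"})
    print(sorted(merged))  # → ['fabric', 'hussh']
```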
42 changes: 22 additions & 20 deletions benchmarks/bench_hussh.py
@@ -1,13 +1,17 @@
 import json
 import memray
 import timeit
+from pprint import pprint
 from pathlib import Path
 
+results_dict = {}
+
+if (mem_path := Path("memray-bench_hussh.bin")).exists():
+    mem_path.unlink()
 with memray.Tracker("memray-bench_hussh.bin"):
     start_time = timeit.default_timer()
     from hussh import Connection
-    import_time = timeit.default_timer() - start_time
+    results_dict["import_time"] = f"{(timeit.default_timer() - start_time) * 1000:.2f} ms"
 
     host_info = json.loads(Path("target.json").read_text())
 
@@ -17,51 +21,49 @@
         port=host_info["port"],
         password=host_info["password"],
     )
-    connect_time = timeit.default_timer() - temp_time
+    results_dict["connect_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     temp_time = timeit.default_timer()
     result = conn.execute("echo test")
-    run_time = timeit.default_timer() - temp_time
+    results_dict["cmd_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     # small file (1kb)
     temp_time = timeit.default_timer()
     conn.sftp_write("1kb.txt", "/root/1kb.txt")
-    s_put_time = timeit.default_timer() - temp_time
+    results_dict["s_put_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     temp_time = timeit.default_timer()
     conn.sftp_read("/root/1kb.txt", "small.txt")
-    s_get_time = timeit.default_timer() - temp_time
+    results_dict["s_get_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
     Path("small.txt").unlink()
 
     # medium file (14kb)
     temp_time = timeit.default_timer()
     conn.sftp_write("14kb.txt", "/root/14kb.txt")
-    m_put_time = timeit.default_timer() - temp_time
+    results_dict["m_put_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     temp_time = timeit.default_timer()
     conn.sftp_read("/root/14kb.txt", "medium.txt")
-    m_get_time = timeit.default_timer() - temp_time
+    results_dict["m_get_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
     Path("medium.txt").unlink()
 
     # large file (64kb)
     temp_time = timeit.default_timer()
     conn.sftp_write("64kb.txt", "/root/64kb.txt")
-    l_put_time = timeit.default_timer() - temp_time
+    results_dict["l_put_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     temp_time = timeit.default_timer()
     conn.sftp_read("/root/64kb.txt", "large.txt")
-    l_get_time = timeit.default_timer() - temp_time
+    results_dict["l_get_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
     Path("large.txt").unlink()
 
-    total_time = timeit.default_timer() - start_time
-
-    print(f"import_time: {import_time * 1000:.2f} ms")
-    print(f"connect_time: {connect_time * 1000:.2f} ms")
-    print(f"run_time: {run_time * 1000:.2f} ms")
-    print(f"s_put_time: {s_put_time * 1000:.2f} ms")
-    print(f"s_get_time: {s_get_time * 1000:.2f} ms")
-    print(f"m_put_time: {m_put_time * 1000:.2f} ms")
-    print(f"m_get_time: {m_get_time * 1000:.2f} ms")
-    print(f"l_put_time: {l_put_time * 1000:.2f} ms")
-    print(f"l_get_time: {l_get_time * 1000:.2f} ms")
-    print(f"total_time: {total_time * 1000:.2f} ms")
+    results_dict["total_time"] = f"{(timeit.default_timer() - start_time) * 1000:.2f} ms"
+
+pprint(results_dict, sort_dicts=False)
+
+if Path("bench_results.json").exists():
+    results = json.loads(Path("bench_results.json").read_text())
+else:
+    results = {}
+results.update({"hussh": results_dict})
+Path("bench_results.json").write_text(json.dumps(results, indent=2))
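The scripts also now delete any stale memray capture before opening a new tracker, binding and testing the path in one step with an assignment expression. The same guard in isolation (file names here are illustrative, and a temporary directory stands in for the benchmarks directory):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Simulate a capture file left behind by a previous benchmark run.
    (Path(tmp) / "memray-bench_demo.bin").write_text("stale capture")

    # Bind the path and test for it in one step, as the benchmark scripts do,
    # so a previous run's capture never pollutes the new trace.
    if (mem_path := Path(tmp) / "memray-bench_demo.bin").exists():
        mem_path.unlink()

    remaining = list(Path(tmp).iterdir())

print(remaining)  # → []
```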
44 changes: 23 additions & 21 deletions benchmarks/bench_paramiko.py
@@ -1,13 +1,17 @@
 import json
 import memray
 import timeit
+from pprint import pprint
 from pathlib import Path
 
+results_dict = {}
+
+if (mem_path := Path("memray-bench_paramiko.bin")).exists():
+    mem_path.unlink()
 with memray.Tracker("memray-bench_paramiko.bin"):
     start_time = timeit.default_timer()
     import paramiko
-    import_time = timeit.default_timer() - start_time
+    results_dict["import_time"] = f"{(timeit.default_timer() - start_time) * 1000:.2f} ms"
 
     host_info = json.loads(Path("target.json").read_text())
 
@@ -22,57 +26,55 @@
         look_for_keys=False,
         allow_agent=False,
     )
-    connect_time = timeit.default_timer() - temp_time
+    results_dict["connect_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     temp_time = timeit.default_timer()
     stdin, stdout, stderr = ssh.exec_command("echo test")
     result = stdout.read()
-    run_time = timeit.default_timer() - temp_time
+    results_dict["cmd_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
 
     # small file (1kb)
     temp_time = timeit.default_timer()
     sftp = ssh.open_sftp()
     sftp.put("1kb.txt", "/root/1kb.txt")
-    s_put_time = timeit.default_timer() - temp_time
+    results_dict["s_put_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     temp_time = timeit.default_timer()
     sftp.get("/root/1kb.txt", "small.txt")
-    s_get_time = timeit.default_timer() - temp_time
+    results_dict["s_get_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
     Path("small.txt").unlink()
 
     # medium file (14kb)
     temp_time = timeit.default_timer()
     sftp.put("14kb.txt", "/root/14kb.txt")
-    m_put_time = timeit.default_timer() - temp_time
+    results_dict["m_put_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     temp_time = timeit.default_timer()
     sftp.get("/root/14kb.txt", "medium.txt")
-    m_get_time = timeit.default_timer() - temp_time
+    results_dict["m_get_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
     Path("medium.txt").unlink()
 
     # large file (64kb)
     temp_time = timeit.default_timer()
     sftp.put("64kb.txt", "/root/64kb.txt")
-    l_put_time = timeit.default_timer() - temp_time
+    results_dict["l_put_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
 
     temp_time = timeit.default_timer()
     sftp.get("/root/64kb.txt", "large.txt")
-    l_get_time = timeit.default_timer() - temp_time
+    results_dict["l_get_time"] = f"{(timeit.default_timer() - temp_time) * 1000:.2f} ms"
     Path("large.txt").unlink()
 
     sftp.close()
     ssh.close()
 
-    total_time = timeit.default_timer() - start_time
-
-    print(f"import_time: {import_time * 1000:.2f} ms")
-    print(f"connect_time: {connect_time * 1000:.2f} ms")
-    print(f"run_time: {run_time * 1000:.2f} ms")
-    print(f"s_put_time: {s_put_time * 1000:.2f} ms")
-    print(f"s_get_time: {s_get_time * 1000:.2f} ms")
-    print(f"m_put_time: {m_put_time * 1000:.2f} ms")
-    print(f"m_get_time: {m_get_time * 1000:.2f} ms")
-    print(f"l_put_time: {l_put_time * 1000:.2f} ms")
-    print(f"l_get_time: {l_get_time * 1000:.2f} ms")
-    print(f"total_time: {total_time * 1000:.2f} ms")
+    results_dict["total_time"] = f"{(timeit.default_timer() - start_time) * 1000:.2f} ms"
+
+pprint(results_dict, sort_dicts=False)
+
+if Path("bench_results.json").exists():
+    results = json.loads(Path("bench_results.json").read_text())
+else:
+    results = {}
+results.update({"paramiko": results_dict})
+Path("bench_results.json").write_text(json.dumps(results, indent=2))
