benchmarks: update to diffusers==0.22.0
isidentical committed Nov 6, 2023
1 parent cfa719a commit 4e2c429
Showing 5 changed files with 26 additions and 17 deletions.
26 changes: 13 additions & 13 deletions README.md
@@ -17,25 +17,25 @@ Running on an A100 80G SXM hosted at [fal.ai](https://fal.ai).
### SD1.5 (End-to-end) Benchmarks
| | mean (s) | median (s) | min (s) | max (s) | speed (it/s) |
|------------------|----------|------------|---------|---------|--------------|
| Diffusers (torch 2.1, xformers) | 1.758s | 1.759s | 1.746s | 1.772s | 28.43 it/s |
| Diffusers (torch 2.1, SDPA) | 1.591s | 1.590s | 1.581s | 1.601s | 31.44 it/s |
| Diffusers (torch 2.1, SDPA, [tiny VAE](https://github.com/madebyollin/taesd))\* | 1.562s | 1.556s | 1.544s | 1.591s | 32.14 it/s |
| Diffusers (torch 2.1, SDPA, compiled) | 1.352s | 1.351s | 1.348s | 1.356s | 37.01 it/s |
| Diffusers (torch 2.1, SDPA, compiled, NCHW channels last) | 1.066s | 1.065s | 1.062s | 1.076s | 46.95 it/s |
| Diffusers (torch 2.1, xformers) | 1.729s | 1.728s | 1.720s | 1.747s | 28.94 it/s |
| Diffusers (torch 2.1, SDPA) | 1.604s | 1.603s | 1.589s | 1.618s | 31.19 it/s |
| Diffusers (torch 2.1, SDPA, [tiny VAE](https://github.com/madebyollin/taesd))\* | 1.567s | 1.562s | 1.547s | 1.602s | 32.02 it/s |
| Diffusers (torch 2.1, SDPA, compiled) | 1.354s | 1.354s | 1.351s | 1.356s | 36.93 it/s |
| Diffusers (torch 2.1, SDPA, compiled, NCHW channels last) | 1.058s | 1.057s | 1.056s | 1.060s | 47.29 it/s |
| OneFlow | 0.951s | 0.953s | 0.941s | 0.957s | 52.48 it/s |
| TensorRT 9.0 (cuda graphs, static shapes) | 0.819s | 0.818s | 0.817s | 0.821s | 61.14 it/s |

### SDXL (End-to-end) Benchmarks
| | mean (s) | median (s) | min (s) | max (s) | speed (it/s) |
|------------------|----------|------------|---------|---------|--------------|
| [minSDXL](https://github.com/cloneofsimo/minSDXL) (torch 2.1) | 8.131s | 8.133s | 8.116s | 8.145s | 6.15 it/s |
| Diffusers (torch 2.1, SDPA) | 5.933s | 5.933s | 5.924s | 5.943s | 8.43 it/s |
| [minSDXL+](https://github.com/isidentical/minSDXL) (torch 2.1, SDPA) | 5.881s | 5.881s | 5.872s | 5.891s | 8.50 it/s |
| Diffusers (torch 2.1, SDPA, [tiny VAE](https://github.com/madebyollin/taesd))\* | 5.748s | 5.746s | 5.734s | 5.776s | 8.70 it/s |
| Diffusers (torch 2.1, xformers) | 5.724s | 5.724s | 5.714s | 5.731s | 8.74 it/s |
| [minSDXL+](https://github.com/isidentical/minSDXL) (torch 2.1, flash-attention v2) | 5.306s | 5.304s | 5.288s | 5.333s | 9.43 it/s |
| Diffusers (torch 2.1, SDPA, compiled) | 5.246s | 5.247s | 5.233s | 5.259s | 9.53 it/s |
| Diffusers (torch 2.1, SDPA, compiled, NCHW channels last) | 5.132s | 5.132s | 5.121s | 5.142s | 9.74 it/s |
| [minSDXL](https://github.com/cloneofsimo/minSDXL) (torch 2.1) | 8.146s | 8.146s | 8.137s | 8.155s | 6.14 it/s |
| Diffusers (torch 2.1, SDPA) | 5.932s | 5.932s | 5.924s | 5.940s | 8.43 it/s |
| [minSDXL+](https://github.com/isidentical/minSDXL) (torch 2.1, SDPA) | 5.887s | 5.887s | 5.872s | 5.897s | 8.49 it/s |
| Diffusers (torch 2.1, SDPA, [tiny VAE](https://github.com/madebyollin/taesd))\* | 5.739s | 5.738s | 5.722s | 5.767s | 8.71 it/s |
| Diffusers (torch 2.1, xformers) | 5.719s | 5.717s | 5.710s | 5.732s | 8.75 it/s |
| [minSDXL+](https://github.com/isidentical/minSDXL) (torch 2.1, flash-attention v2) | 5.323s | 5.322s | 5.313s | 5.340s | 9.39 it/s |
| Diffusers (torch 2.1, SDPA, compiled) | 5.217s | 5.216s | 5.213s | 5.220s | 9.59 it/s |
| Diffusers (torch 2.1, SDPA, compiled, NCHW channels last) | 5.136s | 5.137s | 5.125s | 5.147s | 9.73 it/s |
| OneFlow | 4.605s | 4.607s | 4.581s | 4.625s | 10.85 it/s |
| TensorRT 9.0 (cuda graphs, static shapes) | 4.102s | 4.104s | 4.091s | 4.107s | 12.18 it/s |

2 changes: 1 addition & 1 deletion artifacts/latest.json
@@ -1 +1 @@
{"settings": {"warmup_iterations": 3, "benchmark_iterations": 10}, "parameters": {"prompt": "A photo of a cat", "steps": 50}, "timings": [{"name": "Diffusers (torch 2.1, SDPA)", "category": "SD1.5 (End-to-end)", "timings": [1.5917212970089167, 1.5975631090113893, 1.5821007050108165, 1.5864128279790748, 1.5813008210097905, 1.588955162995262, 1.583035584015306, 1.5979954930080567, 1.6009252599906176, 1.5956080609757919]}, {"name": "Diffusers (torch 2.1, SDPA, [tiny VAE](https://github.com/madebyollin/taesd))\\*", "category": "SD1.5 (End-to-end)", "timings": [1.562345665995963, 1.5535877529764548, 1.5727124799741432, 1.5913520029862411, 1.584301869967021, 1.5461461240192875, 1.557934220007155, 1.5496932759997435, 1.553419985983055, 1.5442492840229534]}, {"name": "Diffusers (torch 2.1, xformers)", "category": "SD1.5 (End-to-end)", "timings": [1.7560910189931747, 1.7572659730212763, 1.7597715989977587, 1.7469689899880905, 1.763645778002683, 1.748716948000947, 1.7602629070170224, 1.7721076029993128, 1.7460152900021058, 1.7701677379955072]}, {"name": "Diffusers (torch 2.1, SDPA, compiled)", "category": "SD1.5 (End-to-end)", "timings": [1.356168844999047, 1.354804383998271, 1.3516721340129152, 1.3500280909938738, 1.3562533959920984, 1.3556265980005264, 1.3505920349853113, 1.3477569509996101, 1.3498703970108181, 1.3481854719866533]}, {"name": "Diffusers (torch 2.1, SDPA, compiled, NCHW channels last)", "category": "SD1.5 (End-to-end)", "timings": [1.0672315989795607, 1.0727007249952294, 1.0632865040097386, 1.0763663580000866, 1.06514667099691, 1.065665372996591, 1.0638107580016367, 1.0616009290097281, 1.0649084030010272, 1.063036303006811]}, {"name": "Diffusers (torch 2.1, SDPA)", "category": "SDXL (End-to-end)", "timings": [5.940763157996116, 5.926704184006667, 5.932992869988084, 5.940833892993396, 5.923987179005053, 5.938259807007853, 5.923574882996036, 5.930732762994012, 5.942996845988091, 5.932096109987469]}, {"name": "Diffusers (torch 2.1, SDPA, [tiny 
VAE](https://github.com/madebyollin/taesd))\\*", "category": "SDXL (End-to-end)", "timings": [5.733747632999439, 5.7373922020196915, 5.739806508005131, 5.7549302889965475, 5.754676963028032, 5.745761024008971, 5.751321510004345, 5.776052331959363, 5.744757027016021, 5.746225669980049]}, {"name": "Diffusers (torch 2.1, xformers)", "category": "SDXL (End-to-end)", "timings": [5.728389803000027, 5.7223248230002355, 5.713896728004329, 5.7198221340077, 5.716055455995956, 5.730836973001715, 5.725524671986932, 5.730034602980595, 5.726219657983165, 5.722418188001029]}, {"name": "Diffusers (torch 2.1, SDPA, compiled)", "category": "SDXL (End-to-end)", "timings": [5.233289741008775, 5.24713467201218, 5.235783365002135, 5.239803472999483, 5.251854731992353, 5.242411447019549, 5.250333832023898, 5.259196978004184, 5.247713554999791, 5.255097048007883]}, {"name": "Diffusers (torch 2.1, SDPA, compiled, NCHW channels last)", "category": "SDXL (End-to-end)", "timings": [5.12098954099929, 5.122773736016825, 5.130043459008448, 5.131235945009394, 5.132947302015964, 5.1301643929909915, 5.13384268200025, 5.141838512994582, 5.139718485996127, 5.141162898013135]}, {"name": "TensorRT 9.0 (cuda graphs, static shapes)", "category": "SD1.5 (End-to-end)", "timings": [0.819957683008397, 0.8171751589979976, 0.8198997500003316, 0.8168765410082415, 0.8175504659884609, 0.817866342025809, 0.8211427440110128, 0.8207452670030762, 0.8174457829736639, 0.8177875310066156]}, {"name": "TensorRT 9.0 (cuda graphs, static shapes)", "category": "SDXL (End-to-end)", "timings": [4.099050192977302, 4.091173734981567, 4.09869981801603, 4.100261182000395, 4.1056046999874525, 4.1030455399886705, 4.104289636015892, 4.105645445990376, 4.1050181849859655, 4.106528664997313]}, {"name": "OneFlow", "category": "SD1.5 (End-to-end)", "timings": [0.9568120219919365, 0.9468847009993624, 0.9545126229932066, 0.9472718389879446, 0.9552929110068362, 0.9412291230109986, 0.9544001989997923, 0.9557115529896691, 0.9509655799774919, 
0.9482630330021493]}, {"name": "OneFlow", "category": "SDXL (End-to-end)", "timings": [4.586631373997079, 4.61347366499831, 4.600411992985755, 4.6092570440087, 4.611457958992105, 4.604861573025119, 4.602566407003906, 4.624956093000947, 4.615925558988238, 4.580915530998027]}, {"name": "[minSDXL](https://github.com/cloneofsimo/minSDXL) (torch 2.1)", "category": "SDXL (End-to-end)", "timings": [8.118194965005387, 8.118167286971584, 8.115796227997635, 8.130944961973, 8.135477469011676, 8.122692861012183, 8.14406747900648, 8.144401906989515, 8.139942242007237, 8.14477308903588]}, {"name": "[minSDXL+](https://github.com/isidentical/minSDXL) (torch 2.1, SDPA)", "category": "SDXL (End-to-end)", "timings": [5.888596308999695, 5.891138902050443, 5.881036774022505, 5.887884529947769, 5.881468039995525, 5.873392584035173, 5.8823586810030974, 5.872284794982988, 5.874876776011661, 5.874297180038411]}, {"name": "[minSDXL+](https://github.com/isidentical/minSDXL) (torch 2.1, flash-attention v2)", "category": "SDXL (End-to-end)", "timings": [5.2938573459978215, 5.333262387022842, 5.287751997995656, 5.295834582997486, 5.314443435985595, 5.31391279597301, 5.2958174420055, 5.300047887023538, 5.307358076039236, 5.318470707978122]}]}
{"settings": {"warmup_iterations": 3, "benchmark_iterations": 10}, "parameters": {"prompt": "A photo of a cat", "steps": 50}, "timings": [{"name": "Diffusers (torch 2.1, SDPA)", "category": "SD1.5 (End-to-end)", "timings": [1.6092358700116165, 1.590405477967579, 1.6014833319932222, 1.6045241150422953, 1.6173307650024071, 1.588649354991503, 1.6177432839758694, 1.599749773973599, 1.5973809910356067, 1.611054973967839]}, {"name": "Diffusers (torch 2.1, SDPA, [tiny VAE](https://github.com/madebyollin/taesd))\\*", "category": "SD1.5 (End-to-end)", "timings": [1.553542829991784, 1.5469060649629682, 1.5682512729545124, 1.5926210439647548, 1.5490420999703929, 1.5811396269709803, 1.601516699010972, 1.573098658991512, 1.5553199910209514, 1.5495691270334646]}, {"name": "Diffusers (torch 2.1, xformers)", "category": "SD1.5 (End-to-end)", "timings": [1.7270463349996135, 1.7257355590118095, 1.72448783100117, 1.7195176769746467, 1.724910780962091, 1.733797338034492, 1.7283085110248066, 1.7298935379949398, 1.7296830140403472, 1.7474360400228761]}, {"name": "Diffusers (torch 2.1, SDPA, compiled)", "category": "SD1.5 (End-to-end)", "timings": [1.3513619299628772, 1.3534803700167686, 1.3520958359586075, 1.3536381669691764, 1.354587804991752, 1.3556970959762111, 1.3540321679902263, 1.3534756689914502, 1.3540198029950261, 1.3547918749973178]}, {"name": "Diffusers (torch 2.1, SDPA, compiled, NCHW channels last)", "category": "SD1.5 (End-to-end)", "timings": [1.0565082289977, 1.0567370590288192, 1.0561835389817134, 1.0582475429982878, 1.0596825950196944, 1.0571462360094301, 1.0588196199969389, 1.0595528869889677, 1.05731826700503, 1.0564522079657763]}, {"name": "Diffusers (torch 2.1, SDPA)", "category": "SDXL (End-to-end)", "timings": [5.926582371990662, 5.934434743016027, 5.9270851469482295, 5.935426631011069, 5.924764123046771, 5.940283620962873, 5.923806683975272, 5.939957841997966, 5.936083366977982, 5.9296819130540825]}, {"name": "Diffusers (torch 2.1, SDPA, [tiny 
VAE](https://github.com/madebyollin/taesd))\\*", "category": "SDXL (End-to-end)", "timings": [5.721943153010216, 5.728673742036335, 5.741363879002165, 5.76699190097861, 5.737180910015013, 5.739464172977023, 5.734640704002231, 5.739124519051984, 5.736957271001302, 5.743521926051471]}, {"name": "Diffusers (torch 2.1, xformers)", "category": "SDXL (End-to-end)", "timings": [5.710114244022407, 5.713956555002369, 5.712215353967622, 5.711912807018962, 5.717077467998024, 5.717427038005553, 5.716518344997894, 5.723207506001927, 5.732120550004765, 5.730621722002979]}, {"name": "Diffusers (torch 2.1, SDPA, compiled)", "category": "SDXL (End-to-end)", "timings": [5.215489073016215, 5.213854452013038, 5.219272127957083, 5.21321740699932, 5.216327044996433, 5.215428333031014, 5.218647814006545, 5.215836882998701, 5.220495002984535, 5.2171645070193335]}, {"name": "Diffusers (torch 2.1, SDPA, compiled, NCHW channels last)", "category": "SDXL (End-to-end)", "timings": [5.125404424034059, 5.128606810001656, 5.129265585972462, 5.131429913977627, 5.139510503038764, 5.135477636009455, 5.141903425042983, 5.1426030559814535, 5.1420622110017575, 5.1471164600225165]}, {"name": "TensorRT 9.0 (cuda graphs, static shapes)", "category": "SD1.5 (End-to-end)", "timings": [0.819957683008397, 0.8171751589979976, 0.8198997500003316, 0.8168765410082415, 0.8175504659884609, 0.817866342025809, 0.8211427440110128, 0.8207452670030762, 0.8174457829736639, 0.8177875310066156]}, {"name": "TensorRT 9.0 (cuda graphs, static shapes)", "category": "SDXL (End-to-end)", "timings": [4.099050192977302, 4.091173734981567, 4.09869981801603, 4.100261182000395, 4.1056046999874525, 4.1030455399886705, 4.104289636015892, 4.105645445990376, 4.1050181849859655, 4.106528664997313]}, {"name": "OneFlow", "category": "SD1.5 (End-to-end)", "timings": [0.9568120219919365, 0.9468847009993624, 0.9545126229932066, 0.9472718389879446, 0.9552929110068362, 0.9412291230109986, 0.9544001989997923, 0.9557115529896691, 
0.9509655799774919, 0.9482630330021493]}, {"name": "OneFlow", "category": "SDXL (End-to-end)", "timings": [4.586631373997079, 4.61347366499831, 4.600411992985755, 4.6092570440087, 4.611457958992105, 4.604861573025119, 4.602566407003906, 4.624956093000947, 4.615925558988238, 4.580915530998027]}, {"name": "[minSDXL](https://github.com/cloneofsimo/minSDXL) (torch 2.1)", "category": "SDXL (End-to-end)", "timings": [8.153573560994118, 8.144518585992046, 8.136577832978219, 8.14440743502928, 8.146547965996433, 8.137827199010644, 8.150413497991394, 8.143599029979669, 8.154678368009627, 8.15259703004267]}, {"name": "[minSDXL+](https://github.com/isidentical/minSDXL) (torch 2.1, SDPA)", "category": "SDXL (End-to-end)", "timings": [5.87210746697383, 5.879055427969433, 5.893418683030177, 5.887948323041201, 5.883382624015212, 5.88199090200942, 5.886507772025652, 5.893981233006343, 5.895993906015065, 5.897189933981281]}, {"name": "[minSDXL+](https://github.com/isidentical/minSDXL) (torch 2.1, flash-attention v2)", "category": "SDXL (End-to-end)", "timings": [5.314938348019496, 5.328400561993476, 5.314847628993448, 5.321663878043182, 5.31307160895085, 5.323098871042021, 5.315845976991113, 5.323869657004252, 5.333241019980051, 5.340266301005613]}]}
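The summary columns in the README tables can be re-derived from the raw `timings` arrays stored in `artifacts/latest.json`. What follows is a minimal sketch, not the repository's own reporting code: the `summarize` helper is a hypothetical name, and the assumption that speed equals `steps / median` is inferred from the published figures rather than confirmed by the source.

```python
import statistics


def summarize(timings, steps):
    """Hypothetical helper: reduce raw per-run timings (seconds) to table columns."""
    median = statistics.median(timings)
    return {
        "mean": statistics.mean(timings),
        "median": median,
        "min": min(timings),
        "max": max(timings),
        # Assumption: it/s is derived from the median run, not the mean.
        "speed": steps / median,
    }


# Timings for "TensorRT 9.0 (cuda graphs, static shapes)", SD1.5 (End-to-end),
# copied from artifacts/latest.json; "steps" is 50 in the same file.
tensorrt_sd15 = [
    0.819957683008397, 0.8171751589979976, 0.8198997500003316,
    0.8168765410082415, 0.8175504659884609, 0.817866342025809,
    0.8211427440110128, 0.8207452670030762, 0.8174457829736639,
    0.8177875310066156,
]

summary = summarize(tensorrt_sd15, steps=50)
print(
    f"| {summary['mean']:.3f}s | {summary['median']:.3f}s "
    f"| {summary['min']:.3f}s | {summary['max']:.3f}s "
    f"| {summary['speed']:.2f} it/s |"
)
```

Under that assumption the sketch reproduces the TensorRT SD1.5 row of the README table; using the mean instead would give a slightly lower speed figure.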
11 changes: 10 additions & 1 deletion benchmarks/__main__.py
@@ -64,6 +64,12 @@ def main() -> None:
action="store_true",
help="Force running all benchmarks, even if they have already been run.",
)
parser.add_argument(
"--force-run-only",
type=str.lower,
help="Force running only the specified benchmarks, even if they have already been run.",
choices=["diffusers", "tensorrt", "minsdxl", "oneflow"],
)

# For ensuring consistency among results, make sure to compare the numbers
# within the same node. So the driver, cuda version, power supply, CPU compute
@@ -84,7 +90,10 @@ def main() -> None:
for benchmark in track(ALL_BENCHMARKS, description="Running benchmarks..."):
benchmark_key = (benchmark["category"], benchmark["name"])
should_skip = benchmark.get("skip_if", False)
if benchmark_key in previous_timings and (not options.force_run or should_skip):
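should_force_run = options.force_run or (
# argparse leaves force_run_only as None when the flag is omitted, and
# `None in "..."` raises TypeError, so check for None first.
options.force_run_only is not None
and options.force_run_only in benchmark["name"].lower()
)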
if benchmark_key in previous_timings and (not should_force_run or should_skip):
print(f"Skipping {benchmark_key} (already run)")
timings.append(
{
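The skip decision in this hunk can be isolated into a small function so that its cases are easy to check. This is a sketch under stated assumptions, not the repository's code: `should_skip_benchmark` and its parameters are hypothetical names, and the explicit `None` check reflects that argparse stores `None` for `--force-run-only` when the flag is not passed.

```python
def should_skip_benchmark(benchmark, previous_timings, force_run=False, force_run_only=None):
    """Hypothetical helper mirroring the skip condition in the hunk above."""
    key = (benchmark["category"], benchmark["name"])
    # A benchmark is forced if --force-run is set, or if --force-run-only
    # names a backend that appears in the benchmark's (lowercased) name.
    should_force_run = force_run or (
        force_run_only is not None
        and force_run_only in benchmark["name"].lower()
    )
    # Only previously recorded benchmarks can be skipped; "skip_if" keeps a
    # recorded benchmark skipped even when a rerun is forced.
    return key in previous_timings and (
        not should_force_run or benchmark.get("skip_if", False)
    )


previous = {("SD1.5 (End-to-end)", "OneFlow"): [0.951]}
oneflow = {"category": "SD1.5 (End-to-end)", "name": "OneFlow"}

skipped_by_default = should_skip_benchmark(oneflow, previous)
skipped_when_targeted = should_skip_benchmark(oneflow, previous, force_run_only="oneflow")
skipped_when_other_targeted = should_skip_benchmark(oneflow, previous, force_run_only="tensorrt")
print(skipped_by_default, skipped_when_targeted, skipped_when_other_targeted)
```

Because the match is a substring test on the lowercased name, `minsdxl` would select both the `minSDXL` and `minSDXL+` entries.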
2 changes: 1 addition & 1 deletion benchmarks/benchmark_diffusers.py
@@ -11,7 +11,7 @@
@fal.function(
requirements=[
"accelerate==0.24.1",
"diffusers==0.21.4",
"diffusers==0.22.0",
"torch==2.1.0",
"transformers==4.35.0",
"xformers==0.0.22.post7",
2 changes: 1 addition & 1 deletion benchmarks/benchmark_minsdxl.py
@@ -12,7 +12,7 @@
@fal.function(
requirements=[
"accelerate==0.24.1",
"diffusers==0.21.4",
"diffusers==0.22.0",
"torch==2.1.0",
"transformers==4.35.0",
"xformers==0.0.22.post7",

2 comments on commit 4e2c429

@isidentical
Collaborator Author

cc: @sayakpaul, it seems there aren't any performance regressions for SD/SDXL, which is amazing to see after such a major release!

@sayakpaul

Cc: @patrickvonplaten too :)
