Commit 95f801a

Update blog
1 parent af0e379 commit 95f801a

1 file changed: +30 -1 lines changed

content/blog/2025-10-10-1760088945.md

Lines changed: 30 additions & 1 deletion
@@ -55,4 +55,33 @@ On a Windows 11 desktop with NVIDIA 3060 12 GB (CUDA backend):
 * Executorch isn't focused on desktops/laptops.
 * XLA is pretty confusing. They apparently use cuDNN/cuBLAS finally (which exceeds the "< 200 MB" installation size target of the new engine).
 
-I don't have an AMD or Intel GPU to test MiGraphX or OpenVINO, but I plan on compiling with them anyway and asking for testing help on Easy Diffusion's [Discord server](https://discord.com/invite/u9yhsFmEkB). But from what I've read, their features fit my needs and I don't doubt their performance numbers (since it's their hardware).
+I don't have an AMD or Intel GPU to test MiGraphX or OpenVINO, but I plan on compiling with them anyway and asking for testing help on Easy Diffusion's [Discord server](https://discord.com/invite/u9yhsFmEkB). But from what I've read, their features fit my needs and I don't doubt their performance numbers (since it's their hardware).
+
+Raw notes:
+
+```
+# For SD VAE (130 MB):
+
+At fp32:
+- TensorRT-RTX: 100 ms / it
+- PyTorch (Windows): 137 ms / it
+- PyTorch (Linux, torch.compile): 137 ms / it
+- IREE (CUDA): 3033 ms / it
+
+At fp16:
+- TensorRT-RTX: 33 ms / it
+- PyTorch (Windows): 72 ms / it
+- PyTorch (Linux, torch.compile): 74 ms / it
+- IREE (CUDA): 3315 ms / it
+
+IREE (Vulkan) failed to compile.
+
+
+# For MobileNet (13.3 MB):
+
+At fp32:
+- TensorRT-RTX: 1 ms / it
+- PyTorch: 6.9 ms / it
+- IREE (CUDA): 5.4 ms / it
+- IREE (Vulkan): 12.8 ms / it
+```
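
To read the raw notes above as relative numbers, here is a minimal Python sketch that divides each timing by the fastest one in its group. The timings are copied from the notes; the dictionary layout and the `slowdown_vs_fastest` helper are illustrative assumptions and not part of the commit.

```python
# Timings copied from the raw notes, in milliseconds per iteration.
# The dict layout and helper name are hypothetical, chosen for illustration.
TIMINGS_MS = {
    ("SD VAE", "fp32"): {
        "TensorRT-RTX": 100,
        "PyTorch (Windows)": 137,
        "PyTorch (Linux, torch.compile)": 137,
        "IREE (CUDA)": 3033,
    },
    ("SD VAE", "fp16"): {
        "TensorRT-RTX": 33,
        "PyTorch (Windows)": 72,
        "PyTorch (Linux, torch.compile)": 74,
        "IREE (CUDA)": 3315,
    },
    ("MobileNet", "fp32"): {
        "TensorRT-RTX": 1,
        "PyTorch": 6.9,
        "IREE (CUDA)": 5.4,
        "IREE (Vulkan)": 12.8,
    },
}


def slowdown_vs_fastest(timings):
    """Return each backend's time as a multiple of the fastest backend's time."""
    fastest = min(timings.values())
    return {name: ms / fastest for name, ms in timings.items()}


if __name__ == "__main__":
    for (model, precision), timings in TIMINGS_MS.items():
        print(f"{model} @ {precision}")
        ratios = slowdown_vs_fastest(timings)
        # List the fastest backend first.
        for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
            print(f"  {name}: {ratio:.1f}x the fastest time")
```

For the SD VAE at fp16, for example, this puts PyTorch (Windows) at roughly 2.2x the TensorRT-RTX time (72 / 33).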
