Update blog

cmdr2 · cmdr2 · commit 95f801a5561d · 2025-10-10T15:23:23.000+05:30
diff --git a/content/blog/2025-10-10-1760088945.md b/content/blog/2025-10-10-1760088945.md
@@ -55,4 +55,33 @@ On a Windows 11 desktop with NVIDIA 3060 12 GB (CUDA backend):
 * Executorch isn't focused on desktops/laptops.
 * XLA is pretty confusing. They apparently use cuDNN/cuBLAS finally (which exceeds the "< 200 MB" installation size target of the new engine).
 
-I don't have an AMD or Intel GPU to test MiGraphX or OpenVINO, but I plan on compiling with them anyway and asking for testing help on Easy Diffusion's [Discord server](https://discord.com/invite/u9yhsFmEkB). But from what I've read, their features fit my needs and I don't doubt their performance numbers (since it's their hardware).
+I don't have an AMD or Intel GPU to test MIGraphX or OpenVINO, but I plan on compiling with them anyway and asking for testing help on Easy Diffusion's [Discord server](https://discord.com/invite/u9yhsFmEkB). But from what I've read, their features fit my needs and I don't doubt their performance numbers (since it's their hardware).
+
+Raw notes:
+
+```
+# For SD VAE (130 MB):
+
+At fp32:
+- TensorRT-RTX: 100 ms / it
+- PyTorch (Windows): 137 ms / it
+- PyTorch (Linux, torch.compile): 137 ms / it
+- IREE (CUDA): 3033 ms / it
+
+At fp16:
+- TensorRT-RTX: 33 ms / it
+- PyTorch (Windows): 72 ms / it
+- PyTorch (Linux, torch.compile): 74 ms / it
+- IREE (CUDA): 3315 ms / it
+
+IREE (Vulkan) failed to compile.
+
+
+# For MobileNet (13.3 MB):
+
+At fp32:
+- TensorRT-RTX: 1 ms / it
+- PyTorch: 6.9 ms / it
+- IREE (CUDA): 5.4 ms / it
+- IREE (Vulkan): 12.8 ms / it
+```
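
To make the SD VAE numbers in the raw notes easier to compare, here is a minimal sketch that expresses each engine's latency as a multiple of the TensorRT-RTX latency. The figures are copied from the notes. The names `SD_VAE_MS` and `relative_to` are made up for this illustration and are not part of the post or of any library.

```python
# Latencies in ms per iteration, copied from the raw notes for the SD VAE (130 MB).
SD_VAE_MS = {
    "fp32": {
        "TensorRT-RTX": 100,
        "PyTorch (Windows)": 137,
        "PyTorch (Linux, torch.compile)": 137,
        "IREE (CUDA)": 3033,
    },
    "fp16": {
        "TensorRT-RTX": 33,
        "PyTorch (Windows)": 72,
        "PyTorch (Linux, torch.compile)": 74,
        "IREE (CUDA)": 3315,
    },
}


def relative_to(baseline, timings):
    """Return each engine's latency as a multiple of the baseline engine's latency.

    Hypothetical helper for illustration only. A value of 2.0 means the engine
    takes twice as long per iteration as the baseline.
    """
    base = timings[baseline]
    return {name: round(ms / base, 2) for name, ms in timings.items()}


if __name__ == "__main__":
    for precision, timings in SD_VAE_MS.items():
        print(precision, relative_to("TensorRT-RTX", timings))
```

Read this way, PyTorch takes roughly 1.4x as long as TensorRT-RTX per iteration at fp32 and roughly 2.2x as long at fp16, while IREE (CUDA) is one to two orders of magnitude slower on this model.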