Update blog

cmdr2 · cmdr2 · commit 11ea9fddc5c1 · 2025-11-18T16:43:31.000+05:30
diff --git a/content/blog/2025-11-18-1763464399.md b/content/blog/2025-11-18-1763464399.md
@@ -0,0 +1,25 @@
+---
+title: "Post from Nov 18, 2025"
+date: 2025-11-18T11:13:19
+slug: "1763464399"
+tags:
+  - ml
+  - compiler
+  - sdkit
+  - onnx
+  - ggml
+---
+
+Successfully compiled the VAE of a Stable Diffusion 1.5 model using [graph-compiler](https://github.com/cmdr2/graph-compiler).
+
+The compiled model is terribly slow because I haven't written any performance optimizations, and it (conservatively) converts a lot of intermediate tensors to contiguous copies. But we don't need a lot of clever optimizations to get to decent performance.
+
+It's pretty exciting because I was able to bypass the need to port the model to C++ manually. Instead, I was able to just compile the exported ONNX model and get the same output values as the original PyTorch implementation (given the same input and weights). I could compile to any platform supported by ggml by just changing one flag (e.g. CPU, CUDA, ROCm, Vulkan or Metal).
+
+This further advances the idea of compiling models from their ONNX export (instead of rewriting them manually in C++). In the future, the compiler will be able to perform a number of optimizations (far more than we can do manually for larger models).
+
+The only big downside of this approach is the need to specify the input shape (e.g. `512x512`) during compilation. The compiled graph will only work with that input shape. I'm still thinking about this problem.
+
+The VAE of Stable Diffusion 1.5 is a 10x step-up in complexity from my first test model (i.e. [TinyCNN](https://cmdr2.github.io/notes/2025/11/1762514507/)). It has one `Attention` operation, a bunch of `Conv2D`, `MatMul` and `Transpose` operations, and the weights file is `133 MB` in size (float32).
+
+The next model to target is the Unet of Stable Diffusion 1.5. This will be another 10x step-up in complexity and model size (compared to the VAE). I think the ONNX-to-GGML translation layer (i.e. [ggml-onnx.h](https://github.com/cmdr2/graph-compiler/blob/main/src/ggml-onnx.h)) now covers most of the operators required for Unet, but I'll know more once I actually try to compile and run the model.
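
Note on the "contiguous copies" mentioned in the post: a transpose can usually be expressed as a view that only swaps shape and strides, while a contiguous copy rewrites the elements into row-major order in a new buffer. The sketch below illustrates that difference in plain Python. It is not taken from graph-compiler; `TensorView` and its methods are hypothetical names used for illustration only, and the copy step is assumed to be roughly what an operation like ggml's `ggml_cont` does.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TensorView:
    """A 2D strided view over a flat buffer (hypothetical, for illustration)."""
    data: List[float]
    shape: Tuple[int, int]
    strides: Tuple[int, int]  # measured in elements, not bytes

    def at(self, i: int, j: int) -> float:
        # Map a 2D index to an offset in the flat buffer.
        return self.data[i * self.strides[0] + j * self.strides[1]]

    def is_contiguous(self) -> bool:
        # Row-major contiguous: last dimension has stride 1,
        # and each row starts right after the previous one.
        return self.strides == (self.shape[1], 1)

    def transpose(self) -> "TensorView":
        # No data is moved: only shape and strides are swapped.
        return TensorView(self.data, self.shape[::-1], self.strides[::-1])

    def contiguous(self) -> "TensorView":
        # Materialize the elements in row-major order.
        # This is the extra copy that costs time and memory.
        rows, cols = self.shape
        flat = [self.at(i, j) for i in range(rows) for j in range(cols)]
        return TensorView(flat, self.shape, (cols, 1))


a = TensorView([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], (2, 3), (3, 1))
t = a.transpose()   # shape (3, 2), strides (1, 3): a view sharing a's buffer
c = t.contiguous()  # shape (3, 2), strides (2, 1): a freshly copied buffer
```

Here `t` shares its buffer with `a`, whereas `c` owns a new one. A compiler that inserts such a copy after every layout-changing operation stays correct for kernels that expect contiguous input, but pays for a full pass over each intermediate tensor, which is consistent with the slowdown described in the post.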