Commit 11ea9fd

Update blog
1 parent 25188be commit 11ea9fd

1 file changed: +25 -0 lines changed

@@ -0,0 +1,25 @@
---
title: "Post from Nov 18, 2025"
date: 2025-11-18T11:13:19
slug: "1763464399"
tags:
- ml
- compiler
- sdkit
- onnx
- ggml
---

Successfully compiled the VAE of a Stable Diffusion 1.5 model using [graph-compiler](https://github.com/cmdr2/graph-compiler).

The compiled model is terribly slow because I haven't written any performance optimizations yet, and it conservatively copies a lot of intermediate tensors into contiguous memory. But we don't need a lot of clever optimizations to get to decent performance.
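
To illustrate why those contiguous copies cost time, here is a minimal sketch of a strided tensor view in plain Python. It is not graph-compiler's or ggml's code: the `View` class and its methods are hypothetical names made up for this example. A transpose only swaps metadata, while making the result contiguous has to visit every element and write a new buffer (ggml exposes the equivalent operation as `ggml_cont`).

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class View:
    """Hypothetical strided view over a flat buffer (illustration only)."""
    data: list      # flat storage, possibly shared between views
    shape: tuple    # size of each dimension
    strides: tuple  # elements to skip per step along each dimension

    def is_contiguous(self) -> bool:
        # Row-major contiguous: each stride equals the product of the later sizes.
        expected = 1
        for size, stride in zip(reversed(self.shape), reversed(self.strides)):
            if size != 1 and stride != expected:
                return False
            expected *= size
        return True

    def transpose(self) -> "View":
        # Cheap: only the metadata is reversed, the buffer is shared.
        return View(self.data, self.shape[::-1], self.strides[::-1])

    def contiguous(self) -> "View":
        # Costly: reads every element and writes it into a new buffer.
        out = []
        for index in product(*(range(n) for n in self.shape)):
            offset = sum(i * s for i, s in zip(index, self.strides))
            out.append(self.data[offset])
        strides, step = [], 1
        for size in reversed(self.shape):
            strides.append(step)
            step *= size
        return View(out, self.shape, tuple(reversed(strides)))


a = View(list(range(6)), (2, 3), (3, 1))  # [[0, 1, 2], [3, 4, 5]]
t = a.transpose()                          # shape (3, 2), same buffer as a
c = t.contiguous()                         # new buffer in row-major order
```

Here `t` shares `a`'s buffer and is not contiguous, while `c` owns a fresh copy. An optimization pass could skip the copy whenever the consuming operation can read strided input directly, which is one reason the conservative approach leaves performance on the table.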

It's pretty exciting because I was able to bypass the need to port the model to C++ manually. Instead, I was able to just compile the exported ONNX model and get the same output values as the original PyTorch implementation (given the same input and weights). I could compile to any platform supported by ggml by just changing one flag (e.g. CPU, CUDA, ROCm, Vulkan or Metal).
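
One way to check "same output values" is an element-wise comparison with a small tolerance, since different backends can round floating-point results slightly differently. The sketch below is an assumption about how such a check could look, not the one used for this post. The helper names, the tolerances and the sample numbers are placeholders, and the lists stand in for flattened output tensors.

```python
def max_abs_diff(reference, candidate):
    """Largest absolute element-wise difference between two flat sequences."""
    if len(reference) != len(candidate):
        raise ValueError("outputs have different lengths")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(reference, candidate, rtol=1e-4, atol=1e-5):
    """True if every element satisfies |r - c| <= atol + rtol * |r|."""
    if len(reference) != len(candidate):
        return False
    return all(
        abs(r - c) <= atol + rtol * abs(r)
        for r, c in zip(reference, candidate)
    )


# Placeholder values, NOT real model outputs.
reference_out = [0.125, -0.5, 0.75]
compiled_out = [0.125, -0.5, 0.75001]
broken_out = [0.125, -0.5, 0.76]

same = outputs_match(reference_out, compiled_out)
different = outputs_match(reference_out, broken_out)
```

With these placeholder values `same` is true and `different` is false. Reporting `max_abs_diff` alongside the pass/fail result makes it easier to tell a rounding difference from a real bug.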

This takes the idea of compiling models from their ONNX export (instead of rewriting them manually in C++) a step further. In the future, the compiler will be able to perform a number of optimizations (far more than we can do manually for larger models).

The only big downside of this approach is the need to specify the input shape (e.g. `512x512`) during compilation. The compiled graph will only work with that input shape. I'm still thinking about this problem.
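
A rough sketch of why the shape gets baked in: every intermediate tensor size is derived from the input size, so a graph built for one input has fixed sizes all the way through. The layer list below is hypothetical and is not the real VAE layout. Only the Conv2D output-size formula (dilation 1) is standard.

```python
def conv2d_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output size of a Conv2D along one spatial dimension (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# Hypothetical layer stack as (kernel, stride, padding); NOT the real VAE.
LAYERS = [(3, 1, 1), (3, 2, 1), (3, 2, 1), (3, 2, 1)]


def propagate(height: int, width: int) -> list:
    """Return the spatial shape after each layer for a given input size."""
    shapes = [(height, width)]
    for kernel, stride, padding in LAYERS:
        height = conv2d_out(height, kernel, stride, padding)
        width = conv2d_out(width, kernel, stride, padding)
        shapes.append((height, width))
    return shapes


static_512 = propagate(512, 512)
static_768 = propagate(768, 768)
```

For a `512x512` input this toy stack ends at `64x64`, and for `768x768` it ends at `96x96`. Buffers and operations sized for the first case do not fit the second, which is why a graph compiled for one input shape cannot simply be reused for another.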

The VAE of Stable Diffusion 1.5 is a 10x step-up in complexity from my first test model (i.e. [TinyCNN](https://cmdr2.github.io/notes/2025/11/1762514507/)). It has one `Attention` operation, a bunch of `Conv2D`, `MatMul` and `Transpose` operations, and the weights file is `133 MB` in size (float32).

The next model to target is the Unet of Stable Diffusion 1.5. This will be another 10x step-up in complexity and model size (compared to the VAE). I think the ONNX-to-GGML translation layer (i.e. [ggml-onnx.h](https://github.com/cmdr2/graph-compiler/blob/main/src/ggml-onnx.h)) now covers most of the operators required for Unet, but I'll know more once I actually try to compile and run the model.
