Open
Commits
74 commits
af2146d
feat: Gemma 4 support
giladgd Apr 6, 2026
2471df8
fix: Gemma 4 resource requirements estimation
giladgd Apr 12, 2026
67b60c6
feat: more precise resource usage estimation, auto flash attention, r…
giladgd Apr 28, 2026
2fe0dd9
fix: Vulkan backend successful load detection even when no devices ar…
giladgd Apr 28, 2026
d5c4c2c
feat: optimize grammar sampling performance
giladgd Apr 28, 2026
2f01d10
fix: resolve Gemma 4 chat wrapper for relevant models
giladgd Apr 28, 2026
4ce206b
test: gemma 4 function calling
giladgd Apr 28, 2026
dacca3b
feat: `useMmap: "auto"`, bug fixes, fix tests
giladgd Apr 28, 2026
3142124
feat: support `Q1_0` quant, fix `MXFP4_MOE` quant name
giladgd Apr 28, 2026
b48681f
fix: apply `llama.cpp` patches if pending PRs aren't merged yet
giladgd May 5, 2026
74fef2f
fix: adapt to breaking `llama.cpp` changes
giladgd May 5, 2026
c772709
test: fix tests
giladgd May 5, 2026
40d204d
fix: bug
giladgd May 5, 2026
0acdc31
test: fix tests
giladgd May 5, 2026
ea7fce0
fix: type
giladgd May 5, 2026
3fc0363
test: fix tests
giladgd May 6, 2026
aa50af2
fix: don't crash on unsupported model architecture
giladgd May 6, 2026
ca607fd
feat: improve stability on unified memory systems
giladgd May 20, 2026
79543d9
fix: bugs
giladgd May 20, 2026
547c692
fix: correct wired memory calculation
giladgd May 20, 2026
fe284fe
fix: improve measure safety
giladgd May 20, 2026
5ef1c2b
fix: bug
giladgd May 20, 2026
1c62b87
fix: bug
giladgd May 20, 2026
b099ead
fix: bugs
giladgd May 20, 2026
6b387e6
fix: remove patch for merged PR
giladgd May 25, 2026
720a2d2
Merge remote-tracking branch 'origin/master' into gilad/gemma4
giladgd May 26, 2026
7f91df0
feat: try using github token to fetch latest llama.cpp release on rat…
giladgd May 26, 2026
cb6f8c1
feat: disabled residency sets on macOS by default for better OS respo…
giladgd May 26, 2026
6977bcd
fix: bug
giladgd May 26, 2026
9d9cccb
fix: Windows LLVM toolchain
giladgd May 27, 2026
0cf657e
feat: more optimized local build
giladgd May 27, 2026
68386d6
fix: consider paddings in resource usage calculations
giladgd Jun 3, 2026
d3f88c9
feat: skip specific patches, respect progress logs config
giladgd Jun 3, 2026
e15fbbd
fix: add missing change
giladgd Jun 3, 2026
4bf6958
fix: properly use HF token when needed and present
giladgd Jun 4, 2026
7879fb2
fix: improve thready safety
giladgd Jun 4, 2026
98977a1
fix: update pending PR patch
giladgd Jun 4, 2026
35b0184
fix: bugs
giladgd Jun 4, 2026
5201176
feat(`inspect estimate` command): auto resolve flash attention
giladgd Jun 4, 2026
18ebb8b
fix: circular imports from `config.ts`
giladgd Jun 5, 2026
876bb0b
feat: default to `progressLogs: "stderr"`
giladgd Jun 5, 2026
115ef68
feat(`inspect measure` command): support embedding models
giladgd Jun 5, 2026
6f592d4
fix: native types in sampler
giladgd Jun 5, 2026
0ebe562
fix: update pending PR patch
giladgd Jun 5, 2026
c41b189
test: fix tests
giladgd Jun 6, 2026
dab905e
fix: model memory estimation
giladgd Jun 6, 2026
94ad63a
fix(`inspect measure` command): align `useDirectIo` with the rest of …
giladgd Jun 6, 2026
15e1c38
fix: Vulkan thread safety
giladgd Jun 6, 2026
243ed2e
fix: Vulkan thread safety
giladgd Jun 6, 2026
ecac77a
fix: load deadlock
giladgd Jun 6, 2026
a1b5a27
fix: add missing change
giladgd Jun 6, 2026
f6b8b5e
test: fix test
giladgd Jun 6, 2026
5f2094a
fix: update pending PR patch
giladgd Jun 7, 2026
3f46e0a
feat: faster resource usage estimation
giladgd Jun 7, 2026
89e3e01
docs: inform about RAM cap behavior on unified memory systems
giladgd Jun 7, 2026
9cbb5cd
fix: bugs
giladgd Jun 7, 2026
6598cfd
fix(CLI): avoid redownloading existing model that consists of multipl…
giladgd Jun 14, 2026
db92aac
fix: join metadata from multi-file models for resource usage estimation
giladgd Jun 14, 2026
0b2348e
fix: update pending PR patch
giladgd Jun 14, 2026
86eb3ba
fix: optimize checkpoints management when using grammar
giladgd Jun 14, 2026
2ad108a
fix: don't crash when loading huge models
giladgd Jun 14, 2026
922475f
fix: attribute external JS memory to the relevant objects
giladgd Jun 14, 2026
9f31ca0
fix: github client ratelimit workaround
giladgd Jun 14, 2026
ce952ac
chore: update workflow versions
giladgd Jun 14, 2026
e0cfc89
test: fix tests
giladgd Jun 14, 2026
70cfccf
docs: update Electron CI example
giladgd Jun 14, 2026
4a5ee76
fix: build
giladgd Jun 14, 2026
a98d5b9
test: fix test
giladgd Jun 14, 2026
40c6f67
feat: reduce log spam and report layers resolution progress
giladgd Jun 16, 2026
f0156bd
feat: build CPU variants for arm
giladgd Jun 16, 2026
3a5a4fa
fix: windows build
giladgd Jun 16, 2026
f90e3cd
build: arm builds
giladgd Jun 16, 2026
33876c8
fix: bug
giladgd Jun 17, 2026
79b7090
feat: optimize layers resolution algorithm
giladgd Jun 18, 2026
164 changes: 102 additions & 62 deletions .github/workflows/build.yml

Large diffs are not rendered by default.

12 changes: 6 additions & 6 deletions .github/workflows/test.yml
@@ -12,10 +12,10 @@ jobs:
     name: Test
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v4
-      - uses: actions/setup-node@v4
+      - uses: actions/checkout@v6
+      - uses: actions/setup-node@v6
         with:
-          node-version: "20"
+          node-version: "22"
       - name: Install modules
         run: npm ci
       - name: ESLint
@@ -27,14 +27,14 @@ jobs:
     name: Test docs compilation
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
         with:
           lfs: true
           fetch-depth: 0
           fetch-tags: true
-      - uses: actions/setup-node@v4
+      - uses: actions/setup-node@v6
         with:
-          node-version: "20"
+          node-version: "22"
       - name: Install modules
         run: npm ci
       - name: Build
1 change: 1 addition & 0 deletions .vitepress/config/apiReferenceSidebar.ts
@@ -53,6 +53,7 @@ const chatWrappersOrder = [
     "Llama3ChatWrapper",
     "Llama2ChatWrapper",
     "MistralChatWrapper",
+    "Gemma4ChatWrapper",
     "GemmaChatWrapper",
     "ChatMLChatWrapper",
     "FalconChatWrapper",
9 changes: 5 additions & 4 deletions docs/guide/electron.md
@@ -65,10 +65,11 @@ jobs:
             os: macos-13
 
     steps:
-      - uses: actions/checkout@v4
-      - uses: actions/setup-node@v4
+      - uses: actions/checkout@v6
+      - uses: actions/setup-node@v6
         with:
-          node-version: "20"
+          node-version: "22"
+          package-manager-cache: false
 
       - name: Install dependencies on Ubuntu
         if: matrix.config.name == 'Ubuntu'
@@ -87,7 +88,7 @@ jobs:
         run: npm run build
 
       - name: Upload artifacts
-        uses: actions/upload-artifact@v4
+        uses: actions/upload-artifact@v7
         with:
           include-hidden-files: true
           name: "electron-app-${{ matrix.config.name }}"
4 changes: 2 additions & 2 deletions llama/CMakeLists.txt
@@ -120,8 +120,8 @@ list(REMOVE_DUPLICATES GPU_INFO_HEADERS)
 list(REMOVE_DUPLICATES GPU_INFO_SOURCES)
 list(REMOVE_DUPLICATES GPU_INFO_EXTRA_LIBS)
 
-addVariantSuffix(llama ${NLC_VARIANT})
-addVariantSuffix(ggml ${NLC_VARIANT})
+addVariantSuffix(llama "${NLC_VARIANT}")
+addVariantSuffix(ggml "${NLC_VARIANT}")
 
 file(GLOB SOURCE_FILES "addon/*.cpp" "addon/**/*.cpp" ${GPU_INFO_SOURCES})
 
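Note on the `llama/CMakeLists.txt` change: the likely motivation for quoting `${NLC_VARIANT}` is CMake's argument expansion. An unquoted reference to an empty variable expands to zero arguments, so the callee receives one argument fewer than expected; a quoted reference always passes exactly one argument, even when it is the empty string. The sketch below illustrates that behavior with a hypothetical `show_args` helper and a hypothetical `EMPTY_VARIANT` variable. It is not the project's `addVariantSuffix` implementation, whose signature is not shown in this diff.

```cmake
# Minimal sketch; run with: cmake -P quoting.cmake
# show_args and EMPTY_VARIANT are hypothetical names used only for illustration.
cmake_minimum_required(VERSION 3.10)

# Reports how many arguments the function actually received.
function(show_args label)
    message(STATUS "${label}: received ${ARGC} argument(s)")
endfunction()

# Simulates a build where no variant suffix is configured.
set(EMPTY_VARIANT "")

# Unquoted: the empty value expands to nothing, so only "unquoted" is passed.
show_args(unquoted ${EMPTY_VARIANT})

# Quoted: the empty string is still passed as an argument of its own.
show_args(quoted "${EMPTY_VARIANT}")
```

With an empty variable, the first call reports 1 argument and the second reports 2. If a function declares two named parameters, the unquoted form fails with an incorrect-arguments error when the variable is empty, which is the kind of failure the quoted form avoids.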