Extend Any Model to Unlimited Context

Works with all open-weight models as well as your own custom or self-trained GGUF models. Run any of them with ./alphallama -m <model>.gguf --port 18080.
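As a rough sketch of how the Run lines in this catalog seem to be derived, the shell functions below map a catalog slug to a GGUF file name and print the matching launch command. The names gguf_name_from_slug and alphallama_cmd are hypothetical helpers, not part of AlphaLlama. The naming rule (drop the vendor prefix, drop any ":variant" tag, append .gguf) is an assumption inferred from the entries listed here, and only the -m and --port flags from the command above are used.

```shell
#!/bin/sh
# Hypothetical helper: derive a GGUF file name from a catalog slug.
# Assumed rule (inferred from the catalog, not documented): strip the
# "vendor/" prefix and any ":variant" suffix, then append ".gguf".
gguf_name_from_slug() {
  _name="${1##*/}"     # drop everything up to the last "/"
  _name="${_name%%:*}" # drop everything from the first ":" onward
  printf '%s.gguf\n' "$_name"
}

# Hypothetical helper: print (not execute) the launch command for a model file.
# The port defaults to 18080, the value used in the example command.
alphallama_cmd() {
  printf './alphallama -m %s --port %s\n' "$1" "${2:-18080}"
}

alphallama_cmd "$(gguf_name_from_slug deepseek/deepseek-r1)"
# prints: ./alphallama -m deepseek-r1.gguf --port 18080
```

Printing the command instead of running it keeps the sketch safe to try without the binary; under this assumed rule, slugs that differ only in their ":variant" tag resolve to the same file name.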

175 models

DeepSeek: DeepSeek V3 0324 deepseek/deepseek-chat-v3-0324

MoE

Params

685B

Context

160K

Run

./alphallama -m deepseek-chat-v3-0324.gguf

DeepSeek: DeepSeek V3.1 deepseek/deepseek-chat-v3.1

MoE

Params

671B

Context

160K

Run

./alphallama -m deepseek-chat-v3.1.gguf

DeepSeek: R1 0528 deepseek/deepseek-r1-0528

MoE

Params

671B

Context

160K

Run

./alphallama -m deepseek-r1-0528.gguf

DeepSeek: R1 deepseek/deepseek-r1

MoE

Params

671B

Context

160K

Run

./alphallama -m deepseek-r1.gguf

Qwen: Qwen3 Coder Plus qwen/qwen3-coder-plus

MoE

Params

480B (35B active)

Context

977K

Run

./alphallama -m qwen3-coder-plus.gguf

Qwen: Qwen3 Coder 480B A35B qwen/qwen3-coder

MoE

Params

480B (35B active)

Context

1024K

Run

./alphallama -m qwen3-coder.gguf

Baidu: ERNIE 4.5 VL 424B A47B baidu/ernie-4.5-vl-424b-a47b

MoE

Params

424B (47B active)

Context

128K

Run

./alphallama -m ernie-4.5-vl-424b-a47b.gguf

Nous: Hermes 4 405B nousresearch/hermes-4-405b

Dense

Params

405B

Context

128K

Run

./alphallama -m hermes-4-405b.gguf

Nous: Hermes 3 405B Instruct nousresearch/hermes-3-llama-3.1-405b

Dense

Params

405B

Context

128K

Run

./alphallama -m hermes-3-llama-3.1-405b.gguf

Qwen: Qwen3.5 397B A17B qwen/qwen3.5-397b-a17b

MoE

Params

397B (17B active)

Context

256K

Run

./alphallama -m qwen3.5-397b-a17b.gguf

Qwen 3.5 397B MoE qwen3.5:397b

MoE

Params

397B (17B active)

Context

—

Run

./alphallama -m qwen3.5.gguf

Xiaomi: MiMo-V2-Flash xiaomi/mimo-v2-flash

Dense

Params

309B

Context

256K

Run

./alphallama -m mimo-v2-flash.gguf

DeepSeek: DeepSeek V4 Flash deepseek/deepseek-v4-flash

MoE

Params

284B (13B active)

Context

1024K

Run

./alphallama -m deepseek-v4-flash.gguf

Qwen: Qwen3 VL 235B A22B Thinking qwen/qwen3-vl-235b-a22b-thinking

MoE

Params

235B (22B active)

Context

128K

Run

./alphallama -m qwen3-vl-235b-a22b-thinking.gguf

Qwen: Qwen3 VL 235B A22B Instruct qwen/qwen3-vl-235b-a22b-instruct

MoE

Params

235B (22B active)

Context

256K

Run

./alphallama -m qwen3-vl-235b-a22b-instruct.gguf

Qwen: Qwen3 235B A22B Thinking 2507 qwen/qwen3-235b-a22b-thinking-2507

MoE

Params

235B (22B active)

Context

256K

Run

./alphallama -m qwen3-235b-a22b-thinking-2507.gguf

Qwen: Qwen3 235B A22B Instruct 2507 qwen/qwen3-235b-a22b-2507

MoE

Params

235B (22B active)

Context

256K

Run

./alphallama -m qwen3-235b-a22b-2507.gguf

Qwen: Qwen3 235B A22B qwen/qwen3-235b-a22b

MoE

Params

235B (22B active)

Context

128K

Run

./alphallama -m qwen3-235b-a22b.gguf

Mistral: Mistral Medium 3.5 mistralai/mistral-medium-3-5

Dense

Params

128B

Context

256K

Run

./alphallama -m mistral-medium-3-5.gguf

Mistral: Devstral 2 2512 mistralai/devstral-2512

Dense

Params

123B

Context

256K

Run

./alphallama -m devstral-2512.gguf

Qwen: Qwen3.5-122B-A10B qwen/qwen3.5-122b-a10b

MoE

Params

122B (10B active)

Context

256K

Run

./alphallama -m qwen3.5-122b-a10b.gguf

Qwen 3.5 122B MoE qwen3.5:122b

MoE

Params

122B (10B active)

Context

—

Run

./alphallama -m qwen3.5.gguf

NVIDIA: Nemotron 3 Super nvidia/nemotron-3-super-120b-a12b

MoE

Params

120B (12B active)

Context

977K

Run

./alphallama -m nemotron-3-super-120b-a12b.gguf

Prime Intellect: INTELLECT-3 prime-intellect/intellect-3

Dense

Params

106B

Context

128K

Run

./alphallama -m intellect-3.gguf

Z.ai: GLM 4.5V z-ai/glm-4.5v

MoE

Params

106B

Context

64K

Run

./alphallama -m glm-4.5v.gguf

inclusionAI: Ling-2.6-flash inclusionai/ling-2.6-flash

Dense

Params

104B

Context

256K

Run

./alphallama -m ling-2.6-flash.gguf

Upstage: Solar Pro 3 upstage/solar-pro-3

MoE

Params

102B

Context

125K

Run

./alphallama -m solar-pro-3.gguf

Qwen: Qwen3 Coder Next qwen/qwen3-coder-next

MoE

Params

80B

Context

256K

Run

./alphallama -m qwen3-coder-next.gguf

Qwen: Qwen3 Next 80B A3B Thinking qwen/qwen3-next-80b-a3b-thinking

MoE

Params

80B (3B active)

Context

256K

Run

./alphallama -m qwen3-next-80b-a3b-thinking.gguf

Qwen: Qwen3 Next 80B A3B Instruct qwen/qwen3-next-80b-a3b-instruct

MoE

Params

80B (3B active)

Context

256K

Run

./alphallama -m qwen3-next-80b-a3b-instruct.gguf

Qwen: Qwen3.6 Plus qwen/qwen3.6-plus

Dense

Params

72B

Context

977K

Run

./alphallama -m qwen3.6-plus.gguf

Arcee AI: Virtuoso Large arcee-ai/virtuoso-large

Dense

Params

72B

Context

128K

Run

./alphallama -m virtuoso-large.gguf

Qwen: Qwen2.5 VL 72B Instruct qwen/qwen2.5-vl-72b-instruct

Dense

Params

72B

Context

128K

Run

./alphallama -m qwen2.5-vl-72b-instruct.gguf

Magnum v4 72B anthracite-org/magnum-v4-72b

Dense

Params

72B

Context

32K

Run

./alphallama -m magnum-v4-72b.gguf

Qwen2.5 72B Instruct qwen/qwen-2.5-72b-instruct

Dense

Params

72B

Context

128K

Run

./alphallama -m qwen-2.5-72b-instruct.gguf

Nous: Hermes 4 70B nousresearch/hermes-4-70b

Dense

Params

70B

Context

128K

Run

./alphallama -m hermes-4-70b.gguf

DeepSeek: R1 Distill Llama 70B deepseek/deepseek-r1-distill-llama-70b

Dense

Params

70B

Context

125K

Run

./alphallama -m deepseek-r1-distill-llama-70b.gguf

Sao10K: Llama 3.1 70B Hanami x1 sao10k/l3.1-70b-hanami-x1

Dense

Params

70B

Context

16K

Run

./alphallama -m l3.1-70b-hanami-x1.gguf

Sao10K: Llama 3.3 Euryale 70B sao10k/l3.3-euryale-70b

Dense

Params

70B

Context

128K

Run

./alphallama -m l3.3-euryale-70b.gguf

Meta: Llama 3.3 70B Instruct meta-llama/llama-3.3-70b-instruct

Dense

Params

70B

Context

128K

Run

./alphallama -m llama-3.3-70b-instruct.gguf

Sao10K: Llama 3.1 Euryale 70B v2.2 sao10k/l3.1-euryale-70b

Dense

Params

70B

Context

128K

Run

./alphallama -m l3.1-euryale-70b.gguf

Nous: Hermes 3 70B Instruct nousresearch/hermes-3-llama-3.1-70b

Dense

Params

70B

Context

128K

Run

./alphallama -m hermes-3-llama-3.1-70b.gguf

Meta: Llama 3.1 70B Instruct meta-llama/llama-3.1-70b-instruct

Dense

Params

70B

Context

128K

Run

./alphallama -m llama-3.1-70b-instruct.gguf

Meta: Llama 3 70B Instruct meta-llama/llama-3-70b-instruct

Dense

Params

70B

Context

Run

./alphallama -m llama-3-70b-instruct.gguf

inclusionAI: Ring-2.6-1T inclusionai/ring-2.6-1t

Dense

Params

63B

Context

256K

Run

./alphallama -m ring-2.6-1t.gguf

NVIDIA: Nemotron 3 Ultra nvidia/nemotron-3-ultra-550b-a55b

MoE

Params

550B (55B active)

Context

977K

Run

./alphallama -m nemotron-3-ultra-550b-a55b.gguf

DeepSeek: DeepSeek V4 Pro deepseek/deepseek-v4-pro

Dense

Params

49B

Context

1024K

Run

./alphallama -m deepseek-v4-pro.gguf

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 nvidia/llama-3.3-nemotron-super-49b-v1.5

Dense

Params

49B

Context

128K

Run

./alphallama -m llama-3.3-nemotron-super-49b-v1.5.gguf

Mistral: Mistral Large 3 2512 mistralai/mistral-large-2512

Dense

Params

41B

Context

256K

Run

./alphallama -m mistral-large-2512.gguf

TheDrummer: Skyfall 36B V2 thedrummer/skyfall-36b-v2

Dense

Params

36B

Context

32K

Run

./alphallama -m skyfall-36b-v2.gguf

Qwen: Qwen3.6 35B A3B qwen/qwen3.6-35b-a3b

MoE

Params

35B (3B active)

Context

256K

Run

./alphallama -m qwen3.6-35b-a3b.gguf

Qwen: Qwen3.5-35B-A3B qwen/qwen3.5-35b-a3b

MoE

Params

35B (3B active)

Context

256K

Run

./alphallama -m qwen3.5-35b-a3b.gguf

Qwen 3.5 35B MoE qwen3.5:35b

MoE

Params

35B (3B active)

Context

—

Run

./alphallama -m qwen3.5.gguf

Qwen: Qwen3.6 Flash qwen/qwen3.6-flash

Dense

Params

32B

Context

977K

Run

./alphallama -m qwen3.6-flash.gguf

AllenAI: Olmo 3 32B Think allenai/olmo-3-32b-think

Dense

Params

32B

Context

64K

Run

./alphallama -m olmo-3-32b-think.gguf

Qwen: Qwen3 VL 32B Instruct qwen/qwen3-vl-32b-instruct

Dense

Params

32B

Context

256K

Run

./alphallama -m qwen3-vl-32b-instruct.gguf

Arcee AI: Coder Large arcee-ai/coder-large

Dense

Params

32B

Context

32K

Run

./alphallama -m coder-large.gguf

Qwen: Qwen3 32B qwen/qwen3-32b

Dense

Params

32B

Context

128K

Run

./alphallama -m qwen3-32b.gguf

AionLabs: Aion-1.0-Mini aion-labs/aion-1.0-mini

Dense

Params

32B

Context

128K

Run

./alphallama -m aion-1.0-mini.gguf

DeepSeek: R1 Distill Qwen 32B deepseek/deepseek-r1-distill-qwen-32b

Dense

Params

32B

Context

125K

Run

./alphallama -m deepseek-r1-distill-qwen-32b.gguf

Qwen2.5 Coder 32B Instruct qwen/qwen-2.5-coder-32b-instruct

Dense

Params

32B

Context

125K

Run

./alphallama -m qwen-2.5-coder-32b-instruct.gguf

Qwen 3.5 32B dense qwen3.5:32b

Dense

Params

32B

Context

—

Run

./alphallama -m qwen3.5.gguf

Google: Gemma 4 31B google/gemma-4-31b-it

Dense

Params

31B

Context

256K

Run

./alphallama -m gemma-4-31b-it.gguf

Z.ai: GLM 4.7 Flash z-ai/glm-4.7-flash

Dense

Params

30B

Context

198K

Run

./alphallama -m glm-4.7-flash.gguf

NVIDIA: Nemotron 3 Nano 30B A3B nvidia/nemotron-3-nano-30b-a3b

MoE

Params

30B (3B active)

Context

256K

Run

./alphallama -m nemotron-3-nano-30b-a3b.gguf

Qwen: Qwen3 VL 30B A3B Thinking qwen/qwen3-vl-30b-a3b-thinking

MoE

Params

30B (3B active)

Context

128K

Run

./alphallama -m qwen3-vl-30b-a3b-thinking.gguf

Qwen: Qwen3 VL 30B A3B Instruct qwen/qwen3-vl-30b-a3b-instruct

MoE

Params

30B (3B active)

Context

256K

Run

./alphallama -m qwen3-vl-30b-a3b-instruct.gguf

Qwen: Qwen3 30B A3B Thinking 2507 qwen/qwen3-30b-a3b-thinking-2507

MoE

Params

30B (3B active)

Context

128K

Run

./alphallama -m qwen3-30b-a3b-thinking-2507.gguf

Qwen: Qwen3 Coder 30B A3B Instruct qwen/qwen3-coder-30b-a3b-instruct

MoE

Params

30B (3B active)

Context

156K

Run

./alphallama -m qwen3-coder-30b-a3b-instruct.gguf

Qwen: Qwen3 30B A3B Instruct 2507 qwen/qwen3-30b-a3b-instruct-2507

MoE

Params

30B (3B active)

Context

128K

Run

./alphallama -m qwen3-30b-a3b-instruct-2507.gguf

Qwen: Qwen3 30B A3B qwen/qwen3-30b-a3b

MoE

Params

30B (3B active)

Context

128K

Run

./alphallama -m qwen3-30b-a3b.gguf

Qwen: Qwen3.6 27B qwen/qwen3.6-27b

Dense

Params

27B

Context

256K

Run

./alphallama -m qwen3.6-27b.gguf

Qwen: Qwen3.5-27B qwen/qwen3.5-27b

Dense

Params

27B

Context

256K

Run

./alphallama -m qwen3.5-27b.gguf

Google: Gemma 3 27B google/gemma-3-27b-it

Dense

Params

27B

Context

128K

Run

./alphallama -m gemma-3-27b-it.gguf

Google: Gemma 2 27B google/gemma-2-27b-it

Dense

Params

27B

Context

Run

./alphallama -m gemma-2-27b-it.gguf

Qwen 3.5 27B dense qwen3.5:27b

Dense

Params

27B

Context

—

Run

./alphallama -m qwen3.5.gguf

Google: Gemma 4 26B A4B google/gemma-4-26b-a4b-it

MoE

Params

26B (4B active)

Context

256K

Run

./alphallama -m gemma-4-26b-a4b-it.gguf

Arcee AI: Trinity Mini arcee-ai/trinity-mini

Dense

Params

26B

Context

128K

Run

./alphallama -m trinity-mini.gguf

LiquidAI: LFM2-24B-A2B liquid/lfm-2-24b-a2b

MoE

Params

24B (2B active)

Context

125K

Run

./alphallama -m lfm-2-24b-a2b.gguf

Mistral: Voxtral Small 24B 2507 mistralai/voxtral-small-24b-2507

Dense

Params

24B

Context

31K

Run

./alphallama -m voxtral-small-24b-2507.gguf

TheDrummer: Cydonia 24B V4.1 thedrummer/cydonia-24b-v4.1

Dense

Params

24B

Context

128K

Run

./alphallama -m cydonia-24b-v4.1.gguf

Mistral: Mistral Small 3.2 24B mistralai/mistral-small-3.2-24b-instruct

Dense

Params

24B

Context

125K

Run

./alphallama -m mistral-small-3.2-24b-instruct.gguf

Mistral: Mistral Small 3.1 24B mistralai/mistral-small-3.1-24b-instruct

Dense

Params

24B

Context

125K

Run

./alphallama -m mistral-small-3.1-24b-instruct.gguf

Mistral: Saba mistralai/mistral-saba

Dense

Params

24B

Context

32K

Run

./alphallama -m mistral-saba.gguf

Mistral: Mistral Small 3 mistralai/mistral-small-24b-instruct-2501

Dense

Params

24B

Context

32K

Run

./alphallama -m mistral-small-24b-instruct-2501.gguf

Mistral: Mixtral 8x22B Instruct mistralai/mixtral-8x22b-instruct

MoE

Params

141B (39B active)

Context

64K

Run

./alphallama -m mixtral-8x22b-instruct.gguf

WizardLM-2 8x22B microsoft/wizardlm-2-8x22b

MoE

Params

141B (39B active)

Context

64K

Run

./alphallama -m wizardlm-2-8x22b.gguf

Meta: Llama 4 Maverick meta-llama/llama-4-maverick

MoE

Params

400B (17B active)

Context

1024K

Run

./alphallama -m llama-4-maverick.gguf

Meta: Llama 4 Scout meta-llama/llama-4-scout

MoE

Params

109B (17B active)

Context

9766K

Run

./alphallama -m llama-4-scout.gguf

Mistral: Ministral 3 14B 2512 mistralai/ministral-14b-2512

Dense

Params

14B

Context

256K

Run

./alphallama -m ministral-14b-2512.gguf

Qwen: Qwen3 14B qwen/qwen3-14b

Dense

Params

14B

Context

129K

Run

./alphallama -m qwen3-14b.gguf

Tencent: Hunyuan A13B Instruct tencent/hunyuan-a13b-instruct

MoE

Params

80B (13B active)

Context

128K

Run

./alphallama -m hunyuan-a13b-instruct.gguf

ReMM SLERP 13B undi95/remm-slerp-l2-13b

Dense

Params

13B

Context

Run

./alphallama -m remm-slerp-l2-13b.gguf

MythoMax 13B gryphe/mythomax-l2-13b

Dense

Params

13B

Context

Run

./alphallama -m mythomax-l2-13b.gguf

Meta: Llama Guard 4 12B meta-llama/llama-guard-4-12b

Dense

Params

12B

Context

160K

Run

./alphallama -m llama-guard-4-12b.gguf

Google: Gemma 3 12B google/gemma-3-12b-it

Dense

Params

12B

Context

128K

Run

./alphallama -m gemma-3-12b-it.gguf

TheDrummer: UnslopNemo 12B thedrummer/unslopnemo-12b

Dense

Params

12B

Context

32K

Run

./alphallama -m unslopnemo-12b.gguf

TheDrummer: Rocinante 12B thedrummer/rocinante-12b

Dense

Params

12B

Context

32K

Run

./alphallama -m rocinante-12b.gguf

Mistral: Mistral Nemo mistralai/mistral-nemo

Dense

Params

12B

Context

128K

Run

./alphallama -m mistral-nemo.gguf

Meta: Llama 3.2 11B Vision Instruct meta-llama/llama-3.2-11b-vision-instruct

Dense

Params

11B

Context

128K

Run

./alphallama -m llama-3.2-11b-vision-instruct.gguf

Qwen: Qwen3.5-9B qwen/qwen3.5-9b

Dense

Params

9B

Context

256K

Run

./alphallama -m qwen3.5-9b.gguf

Qwen 3.5 9B dense qwen3.5:9b

Dense

Params

9B

Context

—

Run

./alphallama -m qwen3.5.gguf

IBM: Granite 4.1 8B ibm-granite/granite-4.1-8b

Dense

Params

8B

Context

128K

Run

./alphallama -m granite-4.1-8b.gguf

EssentialAI: Rnj 1 Instruct essentialai/rnj-1-instruct

Dense

Params

Context

32K

Run

./alphallama -m rnj-1-instruct.gguf

Mistral: Ministral 3 8B 2512 mistralai/ministral-8b-2512

Dense

Params

8B

Context

256K

Run

./alphallama -m ministral-8b-2512.gguf

Qwen: Qwen3 VL 8B Thinking qwen/qwen3-vl-8b-thinking

Dense

Params

8B

Context

250K

Run

./alphallama -m qwen3-vl-8b-thinking.gguf

Qwen: Qwen3 VL 8B Instruct qwen/qwen3-vl-8b-instruct

Dense

Params

8B

Context

250K

Run

./alphallama -m qwen3-vl-8b-instruct.gguf

Qwen: Qwen3 8B qwen/qwen3-8b

Dense

Params

8B

Context

128K

Run

./alphallama -m qwen3-8b.gguf

AionLabs: Aion-RP 1.0 (8B) aion-labs/aion-rp-llama-3.1-8b

Dense

Params

8B

Context

32K

Run

./alphallama -m aion-rp-llama-3.1-8b.gguf

Sao10K: Llama 3 8B Lunaris sao10k/l3-lunaris-8b

Dense

Params

8B

Context

Run

./alphallama -m l3-lunaris-8b.gguf

Meta: Llama 3.1 8B Instruct meta-llama/llama-3.1-8b-instruct

Dense

Params

8B

Context

128K

Run

./alphallama -m llama-3.1-8b-instruct.gguf

Meta: Llama 3 8B Instruct meta-llama/llama-3-8b-instruct

Dense

Params

8B

Context

Run

./alphallama -m llama-3-8b-instruct.gguf

Reka Edge rekaai/reka-edge

Dense

Params

Context

16K

Run

./alphallama -m reka-edge.gguf

Qwen: Qwen2.5 7B Instruct qwen/qwen-2.5-7b-instruct

Dense

Params

7B

Context

128K

Run

./alphallama -m qwen-2.5-7b-instruct.gguf

Google: Gemma 3n 4B google/gemma-3n-e4b-it

Dense

Params

Context

32K

Run

./alphallama -m gemma-3n-e4b-it.gguf

Google: Gemma 3 4B google/gemma-3-4b-it

Dense

Params

4B

Context

128K

Run

./alphallama -m gemma-3-4b-it.gguf

Qwen 3.5 4B dense qwen3.5:4b

Dense

Params

4B

Context

—

Run

./alphallama -m qwen3.5.gguf

Mistral: Ministral 3 3B 2512 mistralai/ministral-3b-2512

Dense

Params

3B

Context

128K

Run

./alphallama -m ministral-3b-2512.gguf

IBM: Granite 4.0 Micro ibm-granite/granite-4.0-h-micro

Dense

Params

Context

128K

Run

./alphallama -m granite-4.0-h-micro.gguf

Meta: Llama 3.2 3B Instruct meta-llama/llama-3.2-3b-instruct

Dense

Params

3B

Context

128K

Run

./alphallama -m llama-3.2-3b-instruct.gguf

Qwen 3.5 1.5B dense qwen3.5:1.5b

Dense

Params

1.5B

Context

—

Run

./alphallama -m qwen3.5.gguf

Meta: Llama 3.2 1B Instruct meta-llama/llama-3.2-1b-instruct

Dense

Params

1B

Context

128K

Run

./alphallama -m llama-3.2-1b-instruct.gguf

Qwen 3.5 0.8B dense qwen3.5:0.8b

Dense

Params

0.8B

Context

—

Run

./alphallama -m qwen3.5.gguf

Qwen: Qwen3.7 Plus qwen/qwen3.7-plus

Dense

Params

—

Context

977K

Run

./alphallama -m qwen3.7-plus.gguf

Qwen: Qwen3.7 Max qwen/qwen3.7-max

Dense

Params

—

Context

977K

Run

./alphallama -m qwen3.7-max.gguf

Perceptron: Perceptron Mk1 perceptron/perceptron-mk1

Dense

Params

—

Context

32K

Run

./alphallama -m perceptron-mk1.gguf

Qwen: Qwen3.5 Plus 2026-04-20 qwen/qwen3.5-plus-20260420

Dense

Params

—

Context

977K

Run

./alphallama -m qwen3.5-plus-20260420.gguf

Qwen: Qwen3.6 Max Preview qwen/qwen3.6-max-preview

Dense

Params

—

Context

256K

Run

./alphallama -m qwen3.6-max-preview.gguf

inclusionAI: Ling-2.6-1T inclusionai/ling-2.6-1t

Dense

Params

—

Context

256K

Run

./alphallama -m ling-2.6-1t.gguf

Tencent: Hy3 preview tencent/hy3-preview

Dense

Params

—

Context

256K

Run

./alphallama -m hy3-preview.gguf

Xiaomi: MiMo-V2.5-Pro xiaomi/mimo-v2.5-pro

Dense

Params

—

Context

1024K

Run

./alphallama -m mimo-v2.5-pro.gguf

Xiaomi: MiMo-V2.5 xiaomi/mimo-v2.5

Dense

Params

—

Context

1024K

Run

./alphallama -m mimo-v2.5.gguf

Z.ai: GLM 5.1 z-ai/glm-5.1

Dense

Params

—

Context

198K

Run

./alphallama -m glm-5.1.gguf

Arcee AI: Trinity Large Thinking arcee-ai/trinity-large-thinking

Dense

Params

—

Context

256K

Run

./alphallama -m trinity-large-thinking.gguf

Kwaipilot: KAT-Coder-Pro V2 kwaipilot/kat-coder-pro-v2

Dense

Params

—

Context

250K

Run

./alphallama -m kat-coder-pro-v2.gguf

Mistral: Mistral Small 4 mistralai/mistral-small-2603

Dense

Params

—

Context

256K

Run

./alphallama -m mistral-small-2603.gguf

Z.ai: GLM 5 Turbo z-ai/glm-5-turbo

Dense

Params

—

Context

256K

Run

./alphallama -m glm-5-turbo.gguf

ByteDance Seed: Seed-2.0-Lite bytedance-seed/seed-2.0-lite

Dense

Params

—

Context

256K

Run

./alphallama -m seed-2.0-lite.gguf

Inception: Mercury 2 inception/mercury-2

Dense

Params

—

Context

125K

Run

./alphallama -m mercury-2.gguf

ByteDance Seed: Seed-2.0-Mini bytedance-seed/seed-2.0-mini

Dense

Params

—

Context

256K

Run

./alphallama -m seed-2.0-mini.gguf

Qwen: Qwen3.5-Flash qwen/qwen3.5-flash-02-23

Dense

Params

—

Context

977K

Run

./alphallama -m qwen3.5-flash-02-23.gguf

AionLabs: Aion-2.0 aion-labs/aion-2.0

Dense

Params

—

Context

128K

Run

./alphallama -m aion-2.0.gguf

Qwen: Qwen3.5 Plus 2026-02-15 qwen/qwen3.5-plus-02-15

Dense

Params

—

Context

977K

Run

./alphallama -m qwen3.5-plus-02-15.gguf

Z.ai: GLM 5 z-ai/glm-5

Dense

Params

—

Context

198K

Run

./alphallama -m glm-5.gguf

Qwen: Qwen3 Max Thinking qwen/qwen3-max-thinking

Dense

Params

—

Context

256K

Run

./alphallama -m qwen3-max-thinking.gguf

Writer: Palmyra X5 writer/palmyra-x5

Dense

Params

—

Context

1016K

Run

./alphallama -m palmyra-x5.gguf

ByteDance Seed: Seed 1.6 Flash bytedance-seed/seed-1.6-flash

Dense

Params

—

Context

256K

Run

./alphallama -m seed-1.6-flash.gguf

ByteDance Seed: Seed 1.6 bytedance-seed/seed-1.6

Dense

Params

—

Context

256K

Run

./alphallama -m seed-1.6.gguf

Z.ai: GLM 4.7 z-ai/glm-4.7

Dense

Params

—

Context

198K

Run

./alphallama -m glm-4.7.gguf

Relace: Relace Search relace/relace-search

Dense

Params

—

Context

250K

Run

./alphallama -m relace-search.gguf

Z.ai: GLM 4.6V z-ai/glm-4.6v

Dense

Params

—

Context

128K

Run

./alphallama -m glm-4.6v.gguf

DeepSeek: DeepSeek V3.2 deepseek/deepseek-v3.2

MoE

Params

—

Context

128K

Run

./alphallama -m deepseek-v3.2.gguf

Microsoft: Phi 4 Mini Instruct microsoft/phi-4-mini-instruct

Dense

Params

—

Context

128K

Run

./alphallama -m phi-4-mini-instruct.gguf

Z.ai: GLM 4.6 z-ai/glm-4.6

Dense

Params

—

Context

198K

Run

./alphallama -m glm-4.6.gguf

DeepSeek: DeepSeek V3.2 Exp deepseek/deepseek-v3.2-exp

MoE

Params

—

Context

160K

Run

./alphallama -m deepseek-v3.2-exp.gguf

Relace: Relace Apply 3 relace/relace-apply-3

Dense

Params

—

Context

250K

Run

./alphallama -m relace-apply-3.gguf

Qwen: Qwen3 Max qwen/qwen3-max

Dense

Params

—

Context

256K

Run

./alphallama -m qwen3-max.gguf

DeepSeek: DeepSeek V3.1 Terminus deepseek/deepseek-v3.1-terminus

MoE

Params

—

Context

160K

Run

./alphallama -m deepseek-v3.1-terminus.gguf

Qwen: Qwen3 Coder Flash qwen/qwen3-coder-flash

Dense

Params

—

Context

977K

Run

./alphallama -m qwen3-coder-flash.gguf

Qwen: Qwen Plus 0728 (thinking) qwen/qwen-plus-2025-07-28:thinking

Dense

Params

—

Context

977K

Run

./alphallama -m qwen-plus-2025-07-28.gguf

Qwen: Qwen Plus 0728 qwen/qwen-plus-2025-07-28

Dense

Params

—

Context

977K

Run

./alphallama -m qwen-plus-2025-07-28.gguf

Mistral: Mistral Medium 3.1 mistralai/mistral-medium-3.1

Dense

Params

—

Context

128K

Run

./alphallama -m mistral-medium-3.1.gguf

Mistral: Codestral 2508 mistralai/codestral-2508

Dense

Params

—

Context

250K

Run

./alphallama -m codestral-2508.gguf

Z.ai: GLM 4.5 z-ai/glm-4.5

MoE

Params

—

Context

128K

Run

./alphallama -m glm-4.5.gguf

Z.ai: GLM 4.5 Air z-ai/glm-4.5-air

MoE

Params

—

Context

128K

Run

./alphallama -m glm-4.5-air.gguf

Switchpoint Router switchpoint/router

Dense

Params

—

Context

128K

Run

./alphallama -m router.gguf

Morph: Morph V3 Large morph/morph-v3-large

Dense

Params

—

Context

256K

Run

./alphallama -m morph-v3-large.gguf

Morph: Morph V3 Fast morph/morph-v3-fast

Dense

Params

—

Context

80K

Run

./alphallama -m morph-v3-fast.gguf

Mistral: Mistral Medium 3 mistralai/mistral-medium-3

Dense

Params

—

Context

128K

Run

./alphallama -m mistral-medium-3.gguf

Reka Flash 3 rekaai/reka-flash-3

Dense

Params

—

Context

64K

Run

./alphallama -m reka-flash-3.gguf

AionLabs: Aion-1.0 aion-labs/aion-1.0

Dense

Params

—

Context

128K

Run

./alphallama -m aion-1.0.gguf

Qwen: Qwen-Plus qwen/qwen-plus

Dense

Params

—

Context

977K

Run

./alphallama -m qwen-plus.gguf

Microsoft: Phi 4 microsoft/phi-4

Dense

Params

—

Context

16K

Run

./alphallama -m phi-4.gguf

DeepSeek: DeepSeek V3 deepseek/deepseek-chat

MoE

Params

—

Context

128K

Run

./alphallama -m deepseek-chat.gguf

Mancer: Weaver (alpha) mancer/weaver

Dense

Params

—

Context

Run

./alphallama -m weaver.gguf

AlphaLlama downloads and optimizes models automatically.