Small, local, open models — distilled from frontier teachers.
ADI is a line of compact language models built at theLAB (Learning. Algorithms. Breakthroughs.). Each model is produced by knowledge distillation: a strong frontier "teacher" generates high-quality answers to thousands of prompts, and a small "student" model is fine-tuned to imitate them. The result is a model that reasons and responds in the style of a much larger one while staying small enough to run on a single consumer GPU.
Every model here is built end-to-end on theLAB hardware — no cloud training — then quantized to GGUF and shipped ready to run in Ollama or any llama.cpp-based runtime.
Links: Website · theLAB · YouTube — Advanced Data Intelligence · YouTube — ADI Online
General-purpose local assistant. Qwen3.5-4B distilled from glm-5.2. Reasons and explains like a frontier model on general topics. Native tool-calling, 262K context, ~2.7 GB.
ollama run hf.co/AdvancedDataIntelligence/adi-qwen3.5-4b-glm5.2-general-GGUF:Q4_K_M
General-purpose local assistant. Qwen3-8B distilled from glm-5.2. Reasons and explains like a frontier model on general topics, with more headroom than the 4B. Native tool-calling, 128K context, ~5 GB.
ollama run hf.co/AdvancedDataIntelligence/adi-qwen3-8b-glm5.2-general-GGUF:Q4_K_M
General-purpose local assistant. Qwen3.5-9B distilled from glm-5.2. The most capable general student in the line — more parametric headroom for nuanced reasoning while still fitting a single consumer GPU. Native tool-calling, 262K context, ~5.6 GB.
ollama run hf.co/AdvancedDataIntelligence/adi-qwen3.5-9b-glm5.2-general-GGUF:Q4_K_M
Local coding assistant. Qwen2.5-Coder-7B distilled from kimi-k2.7-code. Writes, explains, and debugs code with frontier-style quality. Native tool-calling, 128K context, ~4.4 GB.
ollama run hf.co/AdvancedDataIntelligence/adi-qwen2.5-coder-7b-kimi2.7-code-GGUF:Q4_K_M
ADI Models Lab — the full lineup in one place. Pick a student from the rail (Qwen3.5 4B, Qwen3.5 9B, Qwen3 8B, Coder 7B, and the hey-adi wakeword), read its teacher, context, and size at a glance, then copy a ready-to-paste run command. Includes the live in-browser demo — no install to try, no sign-in to copy.
Pick a student. Copy a command. Run offline.
A hosted demo is available as a Hugging Face Space — chat with the model directly in your browser, no install required.
Ollama (recommended). Pull and run any model directly from this org — no manual download needed. Ollama fetches the GGUF from Hugging Face on first run:
ollama run hf.co/AdvancedDataIntelligence/adi-qwen3-8b-glm5.2-general-GGUF:Q4_K_M
Swap :Q4_K_M for another quantization tag if the model ships more than one. To pull without running:
ollama pull hf.co/AdvancedDataIntelligence/adi-qwen3-8b-glm5.2-general-GGUF:Q4_K_M
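The reference passed to Ollama in the commands above is the org, the repo name, and a quant tag joined together. As a minimal sketch, the hypothetical helper `adi_ref` below (not part of Ollama or of this org's tooling) assembles that string so the tag can be changed in one place. It only builds text; which tags a given repo actually ships is something to check on the model page.

```shell
# Hypothetical helper: build the hf.co reference that `ollama run` and `ollama pull` accept.
# Usage: adi_ref <repo> [quant-tag]
# The tag defaults to Q4_K_M, the one used throughout this page.
adi_ref() {
  printf 'hf.co/AdvancedDataIntelligence/%s:%s\n' "$1" "${2:-Q4_K_M}"
}

# Print the reference for the 8B general model at the default quant.
adi_ref adi-qwen3-8b-glm5.2-general-GGUF
```

It can then be used inline, for example `ollama pull "$(adi_ref adi-qwen3-8b-glm5.2-general-GGUF)"`.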
Manual download (llama.cpp or offline). Grab the raw GGUF with the Hugging Face CLI:
huggingface-cli download AdvancedDataIntelligence/adi-qwen3-8b-glm5.2-general-GGUF adi-qwen3-8b-glm5.2-q4_k_m.gguf --local-dir .
Then point any llama.cpp-based runtime at the downloaded file.
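As a rough sketch of that last step, the snippet below wraps llama.cpp's `llama-cli` binary (the name used by recent llama.cpp builds; older builds may name it differently). It passes the downloaded file with `-m`, a prompt with `-p`, and a token limit with `-n`. The wrapper name `adi_run_gguf` and the 256-token limit are illustrative choices, not something this org ships.

```shell
# Hypothetical wrapper around llama.cpp's llama-cli.
# Usage: adi_run_gguf <path-to-gguf> <prompt>
adi_run_gguf() {
  model=$1
  prompt=$2
  # Fail early with a readable message instead of a runtime error.
  if [ ! -f "$model" ]; then
    echo "GGUF file not found: $model" >&2
    return 1
  fi
  if ! command -v llama-cli >/dev/null 2>&1; then
    echo "llama-cli is not on PATH; install or build llama.cpp first" >&2
    return 1
  fi
  # -m: model file, -p: prompt, -n: maximum number of tokens to generate
  llama-cli -m "$model" -p "$prompt" -n 256
}
```

With the file from the download command in the current directory, a call might look like `adi_run_gguf ./adi-qwen3-8b-glm5.2-q4_k_m.gguf "Explain knowledge distillation in two sentences."`.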
More models are in the pipeline, distilled the same way and headed here soon.
Follow the org to catch them on release.
Models follow the pattern adi-<base>-<size>-<teacher>-<purpose> — so the name tells you the student base, its size, the teacher it learned from, and what it's tuned for.
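To make the pattern concrete, here is a minimal sketch that splits a repo name back into its four fields using POSIX parameter expansion. It assumes, as holds for the names on this page, that the teacher and purpose fields contain no hyphens while the base may (for example `qwen2.5-coder`), and that quantized repos carry an optional `-GGUF` suffix. The function name `adi_parse_name` is made up for illustration.

```shell
# Hypothetical helper: split adi-<base>-<size>-<teacher>-<purpose>[-GGUF] into fields.
# The body runs in a subshell ( ... ) so its variables do not leak into the caller.
adi_parse_name() (
  name=${1%-GGUF}      # drop the optional -GGUF suffix
  name=${name#adi-}    # drop the adi- prefix
  purpose=${name##*-}  # last field
  rest=${name%-*}
  teacher=${rest##*-}  # second-to-last field
  rest=${rest%-*}
  size=${rest##*-}     # third-to-last field
  base=${rest%-*}      # everything left, which may itself contain hyphens
  printf 'base=%s size=%s teacher=%s purpose=%s\n' "$base" "$size" "$teacher" "$purpose"
)

adi_parse_name adi-qwen2.5-coder-7b-kimi2.7-code-GGUF
# prints: base=qwen2.5-coder size=7b teacher=kimi2.7 purpose=code
```

The teacher field is a shortened form of the teacher's full name: `glm5.2` in a repo name corresponds to the teacher listed as glm-5.2, and `kimi2.7` to kimi-k2.7-code.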
Built at theLAB — Learning. Algorithms. Breakthroughs.