
Small Models

Big capability, small package

Category: models · Row 3: Deployment · Intermediate · 2 hours · Requires: Lg

Overview

Small Language Models (SLMs) offer efficient, deployable AI that can run on edge devices or with minimal resources.

What is it?

Compact AI models optimized for efficiency while maintaining useful capabilities.

Why it matters

Not every task needs GPT-4. SLMs offer faster inference, lower cost, and can run locally or on-device.

How it works

Compression techniques shrink large models: knowledge distillation trains a small model to mimic a larger one, quantization stores weights at lower numeric precision, and pruning removes weights that contribute little. Other SLMs skip compression entirely and are trained from scratch with efficiency-focused architectures.
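Quantization is the most widely used of these techniques. A minimal sketch of post-training symmetric int8 quantization, using NumPy (real toolchains use per-channel or per-block scales, but the core idea is the same):

```python
import numpy as np

def quantize_int8(w):
    # One symmetric scale for the whole tensor: map the largest
    # absolute weight to 127, round everything else onto the int8 grid.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    # Reconstruct approximate float32 weights from int8 codes.
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
err = np.abs(w - w_hat).max()  # rounding error, at most ~scale/2
```

Storing int8 codes instead of float32 cuts weight memory by 4x, which is a large part of why quantized small models fit on edge devices.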

Real-World Examples

Phi-3

Microsoft's efficient small model

Llama 3.2 1B

Meta's smallest Llama

Gemma

Google's lightweight models

Tools & Libraries

Ollama (framework)

Run small models locally

llama.cpp (library)

Efficient CPU inference

GGUF (technique)

Quantized model format
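GGUF stores weights in block-quantized form: weights are split into fixed-size blocks, each kept as low-bit integers plus one per-block scale. A toy sketch in the spirit of a 4-bit type like Q4_0 (the on-disk layout of real GGUF files differs; the block size of 32 and the scale-per-block idea are what this illustrates):

```python
import numpy as np

BLOCK = 32  # GGUF's 4-bit types group weights into blocks of 32

def quantize_q4(w):
    # Split into blocks, compute one scale per block, store 4-bit codes.
    w = w.reshape(-1, BLOCK)
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize_q4(q, scales):
    # Each block is rescaled by its own factor on the way back.
    return (q.astype(np.float32) * scales).reshape(-1)

w = np.random.randn(4 * BLOCK).astype(np.float32)
q, scales = quantize_q4(w)
w_hat = dequantize_q4(q, scales)
```

Per-block scales let outlier weights in one block inflate only that block's error, not the whole tensor's, which is why 4-bit GGUF models stay usable despite the aggressive compression.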