Model Catalog

All models available on the IIO AI Platform (updated 2026-05)

Chat & Reasoning

qwen2.5:7b Free

Fast, capable 7B-parameter model with low latency. Best for summaries, Q&A, classification, and everyday tasks.

params 7B
context 128k
latency ~1s

Use cases

Summarization, Q&A, Classification, Translation, Code assist

qwen2.5:72b Pro

High-capability 72B model for complex reasoning, analysis, and long-form generation. Use when qwen2.5:7b is not enough.

params 72B
context 128k
latency ~5s

Use cases

Complex analysis, Legal review, Report generation, Research

gpt-oss:20b Pro

Open-source GPT-class model fine-tuned for instruction following. Good balance of speed and quality.

params 20B
context 32k
latency ~2s

Use cases

Instruction following, Writing, Chatbots

qwen3:8b Pro

Qwen3 next-gen model with improved reasoning and multilingual capabilities. Excellent for German-language tasks.

params 8B
context 128k
latency ~1.5s
lang DE+EN

Use cases

German text, Reasoning, Multi-step tasks
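
The context and latency figures on the cards above can be used to shortlist models for a given prompt size. The following is a minimal sketch, not platform code: the `ChatModel` class and `models_fitting` function are hypothetical names, and it assumes "128k" and "32k" mean 128,000 and 32,000 tokens.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChatModel:
    name: str
    tier: str            # "Free" or "Pro", as shown on the card
    context_tokens: int  # assumption: "128k" is read as 128_000 tokens
    latency_s: float     # approximate latency from the card


# Values copied from the Chat & Reasoning cards.
CHAT_MODELS = [
    ChatModel("qwen2.5:7b", "Free", 128_000, 1.0),
    ChatModel("qwen2.5:72b", "Pro", 128_000, 5.0),
    ChatModel("gpt-oss:20b", "Pro", 32_000, 2.0),
    ChatModel("qwen3:8b", "Pro", 128_000, 1.5),
]


def models_fitting(required_tokens: int, allow_pro: bool = True) -> List[ChatModel]:
    """Return models whose context window fits, fastest first."""
    candidates = [
        m for m in CHAT_MODELS
        if m.context_tokens >= required_tokens
        and (allow_pro or m.tier == "Free")
    ]
    return sorted(candidates, key=lambda m: m.latency_s)


if __name__ == "__main__":
    for model in models_fitting(50_000):
        print(model.name, model.latency_s)
```

With a 50,000-token requirement, gpt-oss:20b drops out because its listed context is 32k; the remaining models are ordered by their listed latency.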

Embeddings & RAG

nomic-embed-text Free

High-quality text embeddings for semantic search and RAG. 768-dim vectors. Best general-purpose embedding model.

dims 768
max tokens 8192

Use cases

Semantic search, RAG, Similarity, Clustering

qwen3-embedding:8b Pro

Qwen3 embedding model with 4096-dim vectors. Better multilingual coverage than nomic-embed-text.

dims 4096
max tokens 32k
lang multilingual

Use cases

Multilingual RAG, Long documents, Cross-lingual search
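
Semantic search over embeddings usually comes down to ranking stored vectors by similarity to a query vector. The sketch below shows cosine-similarity ranking in plain Python; the function names `cosine_similarity` and `top_k` are illustrative, and obtaining the vectors from the platform is deliberately left out. Because the two models produce vectors of different sizes (768 and 4096), an index should hold vectors from one model only, which the dimension check enforces.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        # e.g. a 768-dim vector compared against a 4096-dim vector
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs for the k most similar documents."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Tiny 2-dim vectors stand in for real embeddings.
    docs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    print(top_k([1.0, 0.0], docs, k=2))
```

The indices returned by `top_k` would then select the document chunks to pass to a chat model as context.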

Model Selection Guide

Task                    | Recommended Model                 | Why
Simple Q&A, summaries   | qwen2.5:7b                        | Fast, low cost, good enough for most tasks
Complex analysis, legal | qwen2.5:72b                       | Higher reasoning quality, handles nuance
German-language tasks   | qwen3:8b                          | Best German comprehension
RAG / document search   | nomic-embed-text + any chat model | Fast embeddings, proven RAG quality
Production chatbot      | qwen2.5:7b with fallback to :72b  | Speed + quality fallback pattern
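
The fallback pattern in the last row can be sketched as follows. This is an illustration under assumptions, not the platform's client: `call_model` is a hypothetical callable you supply that sends a prompt to a named model and returns its reply, and the rule for when to fall back (an exception or an empty reply) is also an assumption, since the guide does not define it.

```python
from typing import Callable, Optional, Sequence, Tuple


def chat_with_fallback(
    prompt: str,
    call_model: Callable[[str, str], str],  # hypothetical: (model, prompt) -> reply
    models: Sequence[str] = ("qwen2.5:7b", "qwen2.5:72b"),
    is_good_enough: Callable[[str], bool] = lambda reply: bool(reply.strip()),
) -> Tuple[str, str]:
    """Try each model in order and return (model name, reply) for the first acceptable reply."""
    last_error: Optional[Exception] = None
    for model in models:
        try:
            reply = call_model(model, prompt)
        except Exception as exc:  # timeout, model unavailable, and so on
            last_error = exc
            continue
        if is_good_enough(reply):
            return model, reply
    raise RuntimeError("no model produced an acceptable reply") from last_error


if __name__ == "__main__":
    # Stub standing in for a real API call: the small model returns nothing.
    def fake_call(model: str, prompt: str) -> str:
        return "" if model == "qwen2.5:7b" else "Detailed answer"

    print(chat_with_fallback("Summarize this contract.", fake_call))
```

Passing a stricter `is_good_enough` check (for example, a minimum length or a format validation) moves more traffic to the larger model, trading latency for quality.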

Availability by Cell

Cell    | Tenant   | Models                                                        | Status
inhzgx1 | intelego | qwen2.5:7b, nomic-embed-text                                  | ● live
inhzgx2 | bilz     | qwen2.5:7b, gpt-oss:20b, qwen3-embedding:8b, nomic-embed-text | ● live
inhzgx3 | netplans | qwen2.5:7b, nomic-embed-text                                  | ● live
inhzgx4 | occ      | qwen2.5:7b, nomic-embed-text                                  | ● live
inhzgx5 | pm24     | qwen2.5:7b                                                    | ● live

For Developer API access, requests are routed to the nearest available cell automatically.
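
Although routing is automatic, it can be useful to check which cells list a given model. The sketch below encodes the availability table as a dictionary and looks up cells by model name. `CELL_MODELS` and `cells_serving` are hypothetical names, and the split between cell and tenant names is read from the table as an assumption.

```python
from typing import Dict, List

# Snapshot of the "Availability by Cell" table; all rows are listed as live.
CELL_MODELS: Dict[str, Dict[str, object]] = {
    "inhzgx1": {"tenant": "intelego", "models": ["qwen2.5:7b", "nomic-embed-text"]},
    "inhzgx2": {
        "tenant": "bilz",
        "models": ["qwen2.5:7b", "gpt-oss:20b", "qwen3-embedding:8b", "nomic-embed-text"],
    },
    "inhzgx3": {"tenant": "netplans", "models": ["qwen2.5:7b", "nomic-embed-text"]},
    "inhzgx4": {"tenant": "occ", "models": ["qwen2.5:7b", "nomic-embed-text"]},
    "inhzgx5": {"tenant": "pm24", "models": ["qwen2.5:7b"]},
}


def cells_serving(model: str) -> List[str]:
    """Return the names of cells that list the given model, sorted."""
    return sorted(
        cell for cell, info in CELL_MODELS.items() if model in info["models"]
    )


if __name__ == "__main__":
    for name in ("qwen2.5:7b", "gpt-oss:20b", "qwen2.5:72b"):
        print(name, cells_serving(name))
```

In this snapshot, the lookup returns an empty list for qwen2.5:72b and qwen3:8b, since no row of the table lists them.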