Model Discovery

ModelBeam provides a dynamic model registry. Never hardcode model names — use the models endpoint to discover available models at runtime.

GET /api/v1/client/models

Returns a paginated list of available models with their capabilities, limits, and defaults.

Query Parameters

Parameter	Type	Description
`filter[inference_types]`	string	Comma-separated inference types to filter by
`per_page`	integer	Models per page (default: 25)
`page`	integer	Page number

Inference Types

Type	Description
`txt2img`	Text to Image
`img2img`	Image to Image
`txt2audio`	Text to Speech
`txt2video`	Text to Video
`img2video`	Image to Video
`aud2video`	Audio to Video
`txt2music`	Text to Music
`txt2embedding`	Text to Embedding
`vid2txt`	Video URL to Text
`aud2txt`	Audio URL to Text
`videofile2txt`	Video File to Text
`audiofile2txt`	Audio File to Text
`transcribe`	Unified Transcription
`img2txt`	Image to Text (OCR)
`img_upscale`	Image Upscale
`img_rmbg`	Background Removal
`videos_replace`	Video Replace (Animate)

Example Request

curl -g "https://api.modelbeam.eu/api/v1/client/models?filter[inference_types]=txt2img" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Accept: application/json"
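The square brackets in the `filter[inference_types]` parameter are easy to mangle when a URL is built by hand. The following is a minimal sketch of assembling the same request URL in Python with percent-encoded query parameters. The helper name `build_models_url` is hypothetical, and the sketch assumes the server accepts percent-encoded brackets and commas, which is standard URL handling but is not stated on this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.modelbeam.eu"  # host taken from the example request


def build_models_url(inference_types=None, per_page=None, page=None):
    """Build a models-endpoint URL (hypothetical helper, not part of any SDK)."""
    params = {}
    if inference_types:
        # The filter takes a comma-separated list of inference types.
        params["filter[inference_types]"] = ",".join(inference_types)
    if per_page is not None:
        params["per_page"] = per_page
    if page is not None:
        params["page"] = page
    # urlencode percent-encodes the brackets and commas in keys and values.
    query = urlencode(params)
    url = f"{BASE_URL}/api/v1/client/models"
    return f"{url}?{query}" if query else url


print(build_models_url(["txt2img", "img2img"], per_page=50, page=2))
```

HTTP clients such as `requests` perform the same encoding when the parameters are passed as a dictionary, so building the URL by hand is only needed when the client does not do it.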

Model Schema

{
  "name": "FLUX.1 Schnell 12B NF4",
  "slug": "Flux1schnell",
  "inference_types": ["txt2img"],
  "info": {
    "limits": {
      "min_width": 256, "max_width": 2048,
      "min_height": 256, "max_height": 2048,
      "min_steps": 1, "max_steps": 10,
      "resolution_step": 128
    },
    "features": {
      "supports_steps": true,
      "supports_guidance": false,
      "supports_negative_prompt": true,
      "supports_last_frame": false,
      "supports_custom_output_size": false
    },
    "defaults": {
      "width": 768, "height": 768,
      "steps": 4,
      "negative_prompt": ""
    }
  },
  "loras": null,
  "languages": null
}

Schema Notes

`loras`: `null` for models without LoRA support, otherwise an array of `{"display_name", "name"}` objects. Pass them in generation requests as `loras: [{"name": "LoraSlug", "weight": 0.8}]`.
`features`: Varies by model type. Image models have `supports_guidance`, `supports_steps`, `supports_negative_prompt`. Video models add `supports_last_frame`. TTS models have `supports_voice_clone`, `supports_custom_voice`, `supports_voice_design`.
`limits`: Varies by model type. Image models have width/height/steps limits. Music models have `min_caption`/`max_caption`, `min_duration`/`max_duration`, `min_bpm`/`max_bpm`. TTS models have `min_text`/`max_text`, `min_speed`/`max_speed`. Embedding models have `max_input_tokens`/`max_total_tokens`.
`languages`: For TTS models, contains supported languages with voice presets.
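The `limits`, `features`, and `defaults` objects can be used to adjust a request before it is sent. The sketch below shows one way to do that for an image model. The function name `fit_request_to_model` is hypothetical. It assumes that `resolution_step` means width and height must be multiples of that value, and that generation requests use the same parameter names as the `defaults` keys. Neither is confirmed on this page. It does not interpret `supports_custom_output_size`.

```python
def fit_request_to_model(model, width, height, steps=None, negative_prompt=None):
    """Return generation parameters adjusted to a model's advertised limits."""
    info = model["info"]
    limits, features, defaults = info["limits"], info["features"], info["defaults"]

    def fit(value, low, high, step):
        # Clamp into [low, high], then round down to a multiple of step
        # (assumed meaning of resolution_step).
        value = max(low, min(high, value))
        return max(low, (value // step) * step)

    step = limits.get("resolution_step", 1)
    params = {
        "width": fit(width, limits["min_width"], limits["max_width"], step),
        "height": fit(height, limits["min_height"], limits["max_height"], step),
    }
    if features.get("supports_steps"):
        wanted = defaults["steps"] if steps is None else steps
        params["steps"] = max(limits["min_steps"], min(limits["max_steps"], wanted))
    # Only send a negative prompt when the model advertises support for it.
    if features.get("supports_negative_prompt") and negative_prompt is not None:
        params["negative_prompt"] = negative_prompt
    return params


# Trimmed copy of the schema example above.
sample_model = {
    "slug": "Flux1schnell",
    "info": {
        "limits": {
            "min_width": 256, "max_width": 2048,
            "min_height": 256, "max_height": 2048,
            "min_steps": 1, "max_steps": 10,
            "resolution_step": 128,
        },
        "features": {"supports_steps": True, "supports_negative_prompt": True},
        "defaults": {"width": 768, "height": 768, "steps": 4, "negative_prompt": ""},
    },
}

fitted = fit_request_to_model(sample_model, 1000, 3000, steps=50, negative_prompt="blurry")
print(fitted)
# → {'width': 896, 'height': 2048, 'steps': 10, 'negative_prompt': 'blurry'}
```

Reading the limits from the discovered model instead of hardcoding them keeps client code working when a model's limits change.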

Available Models

Text to Image

Model	Slug	Max Resolution	Max Steps
FLUX.1 Schnell 12B NF4	`Flux1schnell`	2048x2048	10
FLUX.2 Klein 4B BF16	`Flux_2_Klein_4B_BF16`	1536x1536	10
Z-Image-Turbo INT8	`ZImageTurbo_INT8`	1536x1536	8

Image to Image

Model	Slug	Features
FLUX.2 Klein 4B BF16	`Flux_2_Klein_4B_BF16`	Steps, guidance, negative prompt
Qwen Image Edit Plus NF4	`QwenImageEdit_Plus_NF4`	Prompt-only editing

Text to Speech

Model	Slug	Features
Kokoro	`Kokoro`	11 languages, 40+ voices
Qwen3 TTS 12Hz 1.7B CustomVoice	`Qwen3_TTS_12Hz_1_7B_CustomVoice`	Custom voice, voice clone, voice design
Qwen3 TTS 12Hz 1.7B VoiceDesign	`Qwen3_TTS_12Hz_1_7B_VoiceDesign`	Voice design from instructions
Qwen3 TTS 12Hz 1.7B Base	`Qwen3_TTS_12Hz_1_7B_Base`	Clone voice from reference audio
Chatterbox	`Chatterbox`	Voice cloning

Text to Video

Model	Slug	Max Resolution	Max Frames
LTX-Video 13B Distilled FP8	`Ltxv_13B_0_9_8_Distilled_FP8`	1280x1280	120
LTX Video 2.3 22B Distilled INT8	`LTX_2_3_22B_Dist_INT8`	1280x1280	120

Image to Video

Model	Slug	Max Resolution	Max Frames
LTX-2.3 22B Distilled INT8	`Ltx2_3_22B_Dist_INT8`	1280x1280	120
LTX Video 2.0 19B Distilled FP8	`LTX_2_19B_Dist_FP8`	1280x1280	120

Audio to Video

Model	Slug	Max Resolution	Max Frames
LTX Video 2.1 9B Distilled FP8	`Ltx2_19B_Dist_FP8`	1280x1280	120

Transcription

Model	Slug	Types
Whisper Large V3	`WhisperLargeV3`	vid2txt, aud2txt, transcribe, audiofile2txt, videofile2txt

OCR

Model	Slug
Nanonets OCR S F16	`Nanonets_Ocr_S_F16`

Embeddings

Model	Slug	Max Tokens
BGE M3 FP16	`Bge_M3_FP16`	8192 per input, 300K total

Music

Model	Slug	Duration
ACE-Step 1.5 Turbo	`AceStep_1_5_Turbo`	10-600s
ACE-Step 1.5 Base	`AceStep_1_5_Base`	10-600s
ACE-Step 1.5 XL Turbo INT8	`AceStep_1_5_XL_Turbo_INT8`	10-600s

Background Removal

Model	Slug
BEN2	`Ben2`

Image Upscale

Model	Slug
Real-ESRGAN x4	`RealESRGAN_x4`

Video Replace

Model	Slug
Wan 2.2 Animate	`Wan2_2_Animate`

Python Caching Example

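import time

import requests


class ModelBeamModelCache:
    """Cache model lists with a TTL to avoid redundant API calls."""

    def __init__(self, api_key, cache_ttl=300, timeout=30, per_page=25):
        self.api_key = api_key
        self.base_url = "https://api.modelbeam.eu"
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        }
        self.cache_ttl = cache_ttl  # seconds a cached list stays fresh
        self.timeout = timeout      # seconds before an HTTP request is abandoned
        self.per_page = per_page    # 25 is the documented default page size
        self._cache = {}     # cache key -> list of model dicts
        self._cache_ts = {}  # cache key -> time the entry was stored

    def get_models(self, inference_type=None, force_refresh=False):
        """Return models, optionally filtered by inference type, using the cache."""
        cache_key = inference_type or "__all__"
        now = time.monotonic()

        # Serve from the cache while the entry is younger than the TTL.
        if (not force_refresh
                and cache_key in self._cache
                and now - self._cache_ts[cache_key] < self.cache_ttl):
            return self._cache[cache_key]

        params = {"per_page": self.per_page}
        if inference_type:
            params["filter[inference_types]"] = inference_type

        # Walk the pages so models beyond the first page are not silently dropped.
        # Assumption: the last page holds fewer than per_page items (possibly none).
        models = []
        page = 1
        while True:
            resp = requests.get(
                f"{self.base_url}/api/v1/client/models",
                headers=self.headers,
                params={**params, "page": page},
                timeout=self.timeout,
            )
            resp.raise_for_status()
            batch = resp.json()["data"]
            models.extend(batch)
            if len(batch) < self.per_page:
                break
            page += 1

        self._cache[cache_key] = models
        self._cache_ts[cache_key] = now
        return models

    def get_model_slugs(self, inference_type):
        """Return only the slugs for one inference type."""
        return [m["slug"] for m in self.get_models(inference_type)]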
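Once a model list is available, a model can be chosen by capability instead of by a hardcoded slug. The sketch below picks the first image model whose limits cover a requested size. The function name `pick_model` is hypothetical, and the sample list is illustrative stand-in data: the maximum sizes follow the Text to Image table above, while the minimum sizes for `ZImageTurbo_INT8` are assumed.

```python
def pick_model(models, width, height):
    """Return the slug of the first model whose limits cover width x height, else None."""
    for model in models:
        limits = model.get("info", {}).get("limits", {})
        fits_width = limits.get("min_width", 0) <= width <= limits.get("max_width", 0)
        fits_height = limits.get("min_height", 0) <= height <= limits.get("max_height", 0)
        if fits_width and fits_height:
            return model["slug"]
    return None


# Illustrative stand-in for a discovered list. In practice it would come from
# ModelBeamModelCache("YOUR_API_KEY").get_models("txt2img").
sample_models = [
    {"slug": "ZImageTurbo_INT8",
     "info": {"limits": {"min_width": 256, "max_width": 1536,
                         "min_height": 256, "max_height": 1536}}},
    {"slug": "Flux1schnell",
     "info": {"limits": {"min_width": 256, "max_width": 2048,
                         "min_height": 256, "max_height": 2048}}},
]

print(pick_model(sample_models, 1024, 1024))  # → ZImageTurbo_INT8
print(pick_model(sample_models, 2048, 2048))  # → Flux1schnell
print(pick_model(sample_models, 4096, 4096))  # → None
```

Returning `None` when nothing fits leaves the fallback decision to the caller, for example reducing the requested size or reporting an error.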
