Model Discovery

ModelBeam provides a dynamic model registry. Never hardcode model names — use the models endpoint to discover available models at runtime.

GET /api/v1/client/models

Returns all available models with their capabilities, limits, and defaults.

Query Parameters

| Parameter | Type | Description |
|---|---|---|
| filter[inference_types] | string | Comma-separated inference types to filter by |
| per_page | integer | Models per page (default: 25) |
| page | integer | Page number |
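The endpoint is paginated (25 models per page by default), so a client that wants the complete list must walk pages. A minimal sketch in Python: `fetch_all_models` and `collect_pages` are our names, not part of any SDK, and the only response field relied on is the `data` array used in the examples below, with an empty array taken to mean the last page has been passed.

```python
import requests

def collect_pages(get_page):
    """Walk numbered pages until an empty batch comes back."""
    models, page = [], 1
    while True:
        batch = get_page(page)
        if not batch:  # an empty page means we are past the end
            break
        models.extend(batch)
        page += 1
    return models

def fetch_all_models(api_key, inference_type=None,
                     base_url="https://api.modelbeam.ai", per_page=100):
    """Collect every model across all pages of /api/v1/client/models."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Accept": "application/json",
    }
    params = {"per_page": per_page}
    if inference_type:
        params["filter[inference_types]"] = inference_type

    def get_page(page):
        resp = requests.get(f"{base_url}/api/v1/client/models",
                            headers=headers,
                            params={**params, "page": page},
                            timeout=30)
        resp.raise_for_status()
        return resp.json()["data"]

    return collect_pages(get_page)
```

Separating the page-walking loop from the HTTP call keeps the loop testable without network access.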

Inference Types

| Type | Description |
|---|---|
| txt2img | Text to Image |
| img2img | Image to Image |
| txt2audio | Text to Speech |
| txt2video | Text to Video |
| img2video | Image to Video |
| aud2video | Audio to Video |
| txt2music | Text to Music |
| txt2embedding | Text to Embedding |
| vid2txt | Video URL to Text |
| aud2txt | Audio URL to Text |
| videofile2txt | Video File to Text |
| audiofile2txt | Audio File to Text |
| transcribe | Unified Transcription |
| img2txt | Image to Text (OCR) |
| img_upscale | Image Upscale |
| img_rmbg | Background Removal |
| videos_replace | Video Replace (Animate) |

Example Request

curl "https://api.modelbeam.ai/api/v1/client/models?filter[inference_types]=txt2img" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Accept: application/json"

Model Schema

{
  "name": "FLUX.1 Schnell 12B NF4",
  "slug": "Flux1schnell",
  "inference_types": ["txt2img"],
  "info": {
    "limits": {
      "min_width": 256, "max_width": 2048,
      "min_height": 256, "max_height": 2048,
      "min_steps": 1, "max_steps": 10,
      "resolution_step": 128
    },
    "features": {
      "supports_steps": true,
      "supports_guidance": false,
      "supports_negative_prompt": true,
      "supports_last_frame": false,
      "supports_custom_output_size": false
    },
    "defaults": {
      "width": 768, "height": 768,
      "steps": 4,
      "negative_prompt": ""
    }
  },
  "loras": null,
  "languages": null
}

Schema Notes

  • loras — null for models without LoRA support, or an array of {"display_name", "name"} objects. Use in generation requests as loras: [{"name": "LoraSlug", "weight": 0.8}].
  • features — Varies by model type. Image models have supports_guidance, supports_steps, supports_negative_prompt. Video models add supports_last_frame. TTS models have supports_voice_clone, supports_custom_voice, supports_voice_design.
  • limits — Varies by model type. Image models have width/height/steps limits. Music models have min_caption/max_caption, min_duration/max_duration, min_bpm/max_bpm. TTS models have min_text/max_text, min_speed/max_speed. Embedding models have max_input_tokens/max_total_tokens.
  • languages — For TTS models, contains supported languages with voice presets.
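Because limits vary per model, request parameters can be sanitized client-side before a job is submitted. A sketch of one way to do it (the helper names are ours, not part of the API): clamp a requested size into the model's range, then snap it down to a multiple of resolution_step, assuming, as in the schema above, that the minimum dimensions are themselves multiples of the step.

```python
def clamp_dimension(value, lo, hi, step):
    """Clamp value to [lo, hi], then snap down to the nearest multiple of step."""
    value = max(lo, min(hi, value))
    return value - (value % step)

def sanitize_size(width, height, limits):
    """Fit a requested width/height into a model's published limits block."""
    step = limits.get("resolution_step", 1)
    return (
        clamp_dimension(width, limits["min_width"], limits["max_width"], step),
        clamp_dimension(height, limits["min_height"], limits["max_height"], step),
    )
```

With the FLUX.1 Schnell limits shown above, a request for 1000x3000 would be adjusted to 896x2048: 1000 snaps down to the nearest multiple of 128, and 3000 clamps to the 2048 maximum.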

Available Models

Text to Image

| Model | Slug | Max Resolution | Max Steps |
|---|---|---|---|
| FLUX.1 Schnell 12B NF4 | Flux1schnell | 2048x2048 | 10 |
| FLUX.2 Klein 4B BF16 | Flux_2_Klein_4B_BF16 | 1536x1536 | 10 |
| Z-Image-Turbo INT8 | ZImageTurbo_INT8 | 1536x1536 | 8 |

Image to Image

| Model | Slug | Features |
|---|---|---|
| FLUX.2 Klein 4B BF16 | Flux_2_Klein_4B_BF16 | Steps, guidance, negative prompt |
| Qwen Image Edit Plus NF4 | QwenImageEdit_Plus_NF4 | Prompt-only editing |

Text to Speech

| Model | Slug | Features |
|---|---|---|
| Kokoro | Kokoro | 11 languages, 40+ voices |
| Qwen3 TTS 12Hz 1.7B CustomVoice | Qwen3_TTS_12Hz_1_7B_CustomVoice | Custom voice, voice clone, voice design |
| Qwen3 TTS 12Hz 1.7B VoiceDesign | Qwen3_TTS_12Hz_1_7B_VoiceDesign | Voice design from instructions |
| Qwen3 TTS 12Hz 1.7B Base | Qwen3_TTS_12Hz_1_7B_Base | Clone voice from reference audio |
| Chatterbox | Chatterbox | Voice cloning |

Text to Video

| Model | Slug | Max Resolution | Max Frames |
|---|---|---|---|
| LTX-Video 13B Distilled FP8 | Ltxv_13B_0_9_8_Distilled_FP8 | 1280x1280 | 120 |
| LTX Video 2.3 22B Distilled INT8 | LTX_2_3_22B_Dist_INT8 | 1280x1280 | 120 |

Image to Video

| Model | Slug | Max Resolution | Max Frames |
|---|---|---|---|
| LTX-2.3 22B Distilled INT8 | Ltx2_3_22B_Dist_INT8 | 1280x1280 | 120 |
| LTX Video 2.0 19B Distilled FP8 | LTX_2_19B_Dist_FP8 | 1280x1280 | 120 |

Audio to Video

| Model | Slug | Max Resolution | Max Frames |
|---|---|---|---|
| LTX Video 2.1 9B Distilled FP8 | Ltx2_19B_Dist_FP8 | 1280x1280 | 120 |

Transcription

| Model | Slug | Types |
|---|---|---|
| Whisper Large V3 | WhisperLargeV3 | vid2txt, aud2txt, transcribe, audiofile2txt, videofile2txt |

OCR

| Model | Slug |
|---|---|
| Nanonets OCR S F16 | Nanonets_Ocr_S_F16 |

Embeddings

| Model | Slug | Max Tokens |
|---|---|---|
| BGE M3 FP16 | Bge_M3_FP16 | 8192 per input, 300K total |
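The per-input and per-request token caps suggest batching embedding texts client-side. A hedged sketch: `batch_texts` is our helper, not part of the API, and `count_tokens` is a caller-supplied function, since an accurate count requires the model's own tokenizer (a whitespace split is only a rough stand-in).

```python
def batch_texts(texts, count_tokens,
                max_input_tokens=8192, max_total_tokens=300_000):
    """Group texts into request batches respecting per-input and per-request caps."""
    batches, current, current_total = [], [], 0
    for text in texts:
        n = count_tokens(text)
        if n > max_input_tokens:
            raise ValueError(f"text of {n} tokens exceeds the per-input limit")
        # Start a new batch once the running total would exceed the request cap.
        if current and current_total + n > max_total_tokens:
            batches.append(current)
            current, current_total = [], 0
        current.append(text)
        current_total += n
    if current:
        batches.append(current)
    return batches
```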

Music

| Model | Slug | Duration |
|---|---|---|
| ACE-Step 1.5 Turbo | AceStep_1_5_Turbo | 10-600s |
| ACE-Step 1.5 Base | AceStep_1_5_Base | 10-600s |
| ACE-Step 1.5 XL Turbo INT8 | AceStep_1_5_XL_Turbo_INT8 | 10-600s |

Background Removal

| Model | Slug |
|---|---|
| BEN2 | Ben2 |

Image Upscale

| Model | Slug |
|---|---|
| Real-ESRGAN x4 | RealESRGAN_x4 |

Video Replace

| Model | Slug |
|---|---|
| Wan 2.2 Animate | Wan2_2_Animate |

Python Caching Example

import requests
import time

class ModelBeamModelCache:
    """Cache model list with TTL to avoid redundant API calls."""

    def __init__(self, api_key, cache_ttl=300):
        self.api_key = api_key
        self.base_url = "https://api.modelbeam.ai"
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json"
        }
        self.cache_ttl = cache_ttl
        self._cache = {}
        self._cache_ts = {}

    def get_models(self, inference_type=None, force_refresh=False):
        cache_key = inference_type or "__all__"
        now = time.time()

        if (not force_refresh
                and cache_key in self._cache
                and now - self._cache_ts[cache_key] < self.cache_ttl):
            return self._cache[cache_key]

        params = {}
        if inference_type:
            params["filter[inference_types]"] = inference_type

        resp = requests.get(
            f"{self.base_url}/api/v1/client/models",
            headers=self.headers,
            params=params,
            timeout=30
        )
        resp.raise_for_status()
        # Note: the endpoint is paginated (default 25 per page); pass a larger
        # per_page or walk the page parameter if you need more than one page.
        models = resp.json()["data"]

        self._cache[cache_key] = models
        self._cache_ts[cache_key] = now
        return models

    def get_model_slugs(self, inference_type):
        return [m["slug"] for m in self.get_models(inference_type)]
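With the cache in place, downstream code can choose a model at runtime instead of hardcoding one. `pick_slug` below is our convenience helper, not part of any SDK: it prefers a given slug when the registry still offers it and falls back to the first available model.

```python
def pick_slug(slugs, preferred=None):
    """Return the preferred slug if still offered, else the first available."""
    if not slugs:
        raise LookupError("no models available for this inference type")
    if preferred in slugs:
        return preferred
    return slugs[0]

# Typical use with the cache above (network call at most once per TTL):
# cache = ModelBeamModelCache("YOUR_API_KEY")
# slug = pick_slug(cache.get_model_slugs("txt2img"), preferred="Flux1schnell")
```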