Run Meta's SAM Audio segmentation model through a fast, dedicated API endpoint. The model stays loaded on GPUs, so there are no cold starts and latency remains consistently low.
Segment any sound from audio or video inputs using text, visual, or temporal prompts. Built for interactive web applications that need low-latency inference.
curl -X POST "https://api.samaudioapi.com/v1/separate" \
  -H "Authorization: Bearer $SAM_API_KEY" \
  -F "audio=@./input.wav" \
  -F 'request={
    "description": "dog barking"
  };type=application/json'
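For applications that call the endpoint from code rather than the shell, the curl command above can be approximated in Python. This is a minimal sketch using only the standard library. The endpoint, the `audio` and `request` field names, and the `description` key come from the curl example. The helper names `build_separate_request` and `send_request`, the `application/octet-stream` content type for the file part, and the handling of the response as raw bytes are assumptions, since the response format is not described here.

```python
import json
import urllib.request
import uuid

# Endpoint taken from the curl example above.
API_URL = "https://api.samaudioapi.com/v1/separate"


def build_separate_request(audio_bytes, filename, description, api_key):
    """Build (but do not send) a multipart POST mirroring the curl call.

    Hypothetical helper: not part of any official SDK.
    """
    boundary = uuid.uuid4().hex
    request_json = json.dumps({"description": description})

    # File part, equivalent to: -F "audio=@./input.wav"
    audio_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"  # assumed content type
    ).encode() + audio_bytes + b"\r\n"

    # JSON part, equivalent to: -F 'request={...};type=application/json'
    json_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="request"\r\n'
        "Content-Type: application/json\r\n\r\n"
        f"{request_json}\r\n"
    ).encode()

    closing = f"--{boundary}--\r\n".encode()

    return urllib.request.Request(
        API_URL,
        data=audio_part + json_part + closing,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def send_request(req, timeout=60):
    """Send the request and return the raw response body as bytes.

    The response format is not documented here, so no parsing is attempted.
    """
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()
```

A caller would read `input.wav` as bytes, pass it to `build_separate_request` along with the prompt text and the value of `SAM_API_KEY`, then hand the result to `send_request`. Keeping the build and send steps separate lets the request be inspected or unit-tested without network access.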
Remove background noise, isolate dialogue, and extract specific sounds from recordings. Clean up interviews, isolate speakers, or remove unwanted audio elements from your content.
Extract and isolate individual sound elements from complex audio for manipulation and reuse. Perfect for creating custom sound libraries and layering effects in post-production.
Integrate text-based audio segmentation into editing software, mobile apps, and digital audio workstations for faster, more intuitive post-production workflows.
Pricing starts at $0.20 per minute of audio processed and varies with latency requirements, batch processing volume, and use case.
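To budget for a workload, the starting rate can be turned into a lower-bound estimate. The sketch below assumes that billing is prorated by the second, which is not stated here. The function name `estimate_min_cost` is hypothetical, and actual pricing may be higher depending on the factors listed above.

```python
from decimal import Decimal

# Starting price quoted above, in USD per minute of audio processed.
BASE_RATE_PER_MINUTE = Decimal("0.20")


def estimate_min_cost(duration_seconds, rate_per_minute=BASE_RATE_PER_MINUTE):
    """Return a lower-bound cost in USD, rounded to cents.

    Assumes per-second proration, which is an assumption rather than
    a documented billing rule.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    minutes = Decimal(str(duration_seconds)) / Decimal(60)
    return (minutes * rate_per_minute).quantize(Decimal("0.01"))
```

For example, a 45-minute interview (2,700 seconds) comes to at least $9.00 at the starting rate, and a 90-second clip to at least $0.30.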
Standard rate limit: 10 requests/min sustained. Higher limits available on request.
API rate limits are calculated from the seconds of audio or video processed, which keeps usage fair for both short and long-form content.
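A client can stay under the standard request limit by throttling itself before each call. The following is a minimal sketch of a sliding-window limiter set to the 10 requests per minute quoted above. The class name `SlidingWindowLimiter` is hypothetical, and the sketch covers only the request count. It does not model the metering by seconds of audio or video, because no figures for that are given here.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls in any window_seconds span.

    Hypothetical client-side helper; the server's own accounting
    may differ.
    """

    def __init__(self, max_requests=10, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests inside the window

    def wait_time(self):
        """Return seconds to wait before the next request (0.0 if none)."""
        now = self._clock()
        # Discard timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def acquire(self):
        """Block until a slot is free, then record the request time."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self._sleep(delay)
        self._sent.append(self._clock())
```

Calling `limiter.acquire()` immediately before each API request keeps a single process under the limit. The `clock` and `sleep` parameters exist so the limiter can be tested without real delays.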