Use case

Text to Speech API for Accessibility and Audio Content Versions

Serve audio versions of your written content to users with visual impairments, reading disabilities, or situational constraints. One endpoint, 33 languages.

Why it matters

Screen readers mispronounce technical terms and proper nouns

Screen readers such as NVDA and VoiceOver apply generic pronunciation rules to whatever text they encounter. Audexum narrates your exact text with a trained TTS voice, so proper nouns, product names, and abbreviations tend to sound more natural. Pre-generated audio also avoids on-device synthesis latency, because the file is ready before the user presses play.

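Audio versions complement the text alternatives WCAG requires

WCAG Success Criterion 1.2.1 (Audio-only and Video-only, Prerecorded) applies to prerecorded media rather than to text: audio-only content needs a text alternative, and video-only content needs a text alternative or an audio track. An audio version of an article is therefore an enhancement rather than a requirement, and the article text already serves as its transcript. Generate an MP3 per article and link it with an <audio> element next to the text.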
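
As a rough illustration of the linking step, the sketch below builds the <audio> markup for one article. The function name, the aria-label wording, and the example URL are assumptions made for this example; adapt them to your templates.

```python
from html import escape


def audio_player_html(audio_url: str, article_title: str) -> str:
    """Return an <audio> element with native controls and a download fallback."""
    url = escape(audio_url, quote=True)
    label = escape(f"Listen to: {article_title}", quote=True)
    return (
        f'<audio controls preload="none" src="{url}" aria-label="{label}">'
        # Fallback link for browsers that do not render <audio>.
        f'<a href="{url}">Download the audio version (MP3)</a>'
        "</audio>"
    )


# Hypothetical URL and title, for illustration only.
print(audio_player_html("/audio/pricing-update.mp3", "Pricing update"))
```

Using the browser's native controls keeps the player keyboard-operable without custom scripting, and preload="none" avoids downloading audio that a visitor never plays.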

On-demand generation costs more than pre-generation at scale

For static content (blog posts, help docs, legal pages), render audio at publish time and store the file. You pay the character cost once. Serving a cached MP3 on repeat visits costs zero additional API calls.
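
One way to make "pay once" hold even when an article is republished unchanged is to key the stored file on a hash of the text. This is a minimal sketch under that assumption: cached_audio and its synthesize parameter are hypothetical names, and synthesize stands for whatever function of yours calls the API and returns MP3 bytes.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_audio(text: str, cache_dir: Path, synthesize: Callable[[str], bytes]) -> Path:
    """Return the cached MP3 for `text`, calling `synthesize` only on a cache miss."""
    # The file name is derived from the content, so edited text gets a new file.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
    dest = cache_dir / f"{digest}.mp3"
    if not dest.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        dest.write_bytes(synthesize(text))
    return dest
```

Republishing identical text reuses the existing file and spends no credits, while an edited article hashes to a new name and is rendered again.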

Multilingual audiences need audio in their own language

If you publish content in 5 languages, a single API integration generates audio for all 5. No separate TTS provider per language, no per-language pricing uplift.
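
A multilingual render can be sketched as a loop over your translations. The endpoint, the AUDEXUM_API_KEY variable, and the payload fields match the Integration example on this page; the helper names and the language codes other than "en" are assumptions, so check the parameter reference for the exact codes.

```python
import json
import os
import urllib.request
from typing import Callable, Dict

API = "https://audexum.com/api/synthesize"


def synthesize(text: str, lang: str) -> bytes:
    """POST one text to the API and return the MP3 bytes."""
    payload = json.dumps(
        {"text": text, "voice": "F1", "lang": lang, "format": "mp3"}
    ).encode("utf-8")
    req = urllib.request.Request(
        API,
        data=payload,
        headers={
            "Authorization": f"Bearer {os.environ['AUDEXUM_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return resp.read()


def render_translations(
    translations: Dict[str, str],
    synth: Callable[[str, str], bytes] = synthesize,
) -> Dict[str, bytes]:
    """Map each language code to the audio rendered from its translated text."""
    return {lang: synth(text, lang) for lang, text in translations.items()}


# Usage (language codes here are assumed, not confirmed by the docs):
# audio = render_translations({"en": "Hello", "de": "Hallo", "fr": "Bonjour"})
```

Passing the synthesis function as a parameter keeps the loop testable without network access.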

Integration

First audio in 60 seconds.

No SDK — one POST request, binary audio in the response body.

Python — generate audio version at article publish time

"""
Accessibility audio generator: called from your CMS publish webhook.
Stores one MP3 per article at /var/www/audio/<slug>.mp3 (swap in S3 / R2 as needed).
"""
import os

import requests

API = "https://audexum.com/api/synthesize"
KEY = os.environ["AUDEXUM_API_KEY"]

def publish_audio_version(article_slug: str, body_text: str, lang: str = "en") -> str:
    """
    Render body_text as MP3, save to disk, return file path.
    Call this from your CMS publish hook.
    """
    r = requests.post(
        API,
        headers={"Authorization": f"Bearer {KEY}"},
        json={
            "text": body_text,
            "voice": "F1",          # warm, readable narrator voice
            "lang": lang,
            "format": "mp3",
        },
        timeout=120,
    )
    r.raise_for_status()

    # Assumes article_slug is a trusted, filesystem-safe slug from your CMS.
    dest = f"/var/www/audio/{article_slug}.mp3"
    with open(dest, "wb") as f:
        f.write(r.content)

    return dest   # filesystem path; reference its public URL in <audio src="...">

Full parameter reference: audexum.com/docs. Supported formats: wav, mp3, ogg. Supported voices include F1–F5 and M1–M5 (43 voices total). Supported languages: 33.

Pricing

Transparent, no per-use-case surcharge.

Every plan covers every use case at the same credit rate. Pay-as-you-go (PAYG) credits never expire.

Plan	Credits/mo	Price
Free	30,000 credits	€0 / mo
Starter	250,000 credits	€4 / mo
Pro	1,200,000 credits	€12 / mo
Scale	4,000,000 credits	€30 / mo
Business	15,000,000 credits	€99 / mo
Pay-as-you-go	Unlimited	€20 / 1M credits

All plans include STT (speech-to-text dictation) at no extra cost.

FAQ

Does Audexum support SSML for pronunciation control?

SSML support is on the roadmap. Currently, the plain-text endpoint covers the majority of accessibility use cases. For abbreviation expansion or phoneme overrides, preprocess the text before the API call.
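
A preprocessing pass could look like the sketch below. The EXPANSIONS table and its spoken forms are placeholders invented for this example; how a given voice pronounces each replacement is something to verify by ear in the voice playground.

```python
import re

# Hypothetical pronunciation table; fill it with terms from your own content.
EXPANSIONS = {
    "WCAG": "W C A G",
    "NVDA": "N V D A",
    "e.g.": "for example",
}

# Longest terms first so a longer entry is never shadowed by a shorter one.
_PATTERN = re.compile(
    "|".join(
        rf"(?<!\w){re.escape(term)}(?!\w)"
        for term in sorted(EXPANSIONS, key=len, reverse=True)
    )
)


def expand_abbreviations(text: str) -> str:
    """Replace each listed term with its spoken form, matching whole terms only."""
    return _PATTERN.sub(lambda m: EXPANSIONS[m.group(0)], text)
```

Run the article body through expand_abbreviations before sending it to the API; the published text itself stays unchanged.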

Can I pre-generate audio for an entire documentation site?

Yes. Loop over your content pages, pass each article body, and store the returned MP3. The PAYG tier (€20/1M credits) is cost-effective for large documentation sites. A 1,000-word article (~5,000 credits) costs roughly €0.10.
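
To budget a site-wide render before running it, a rough estimate can be computed from the page texts. This sketch assumes one credit per character, which is inferred from the figures above rather than stated in them, so confirm the credit model in the docs; the function names are hypothetical.

```python
from typing import Iterable, List

PAYG_EUR_PER_MILLION = 20.0  # PAYG rate from the pricing table


def estimate_credits(texts: Iterable[str]) -> int:
    """Total credits for the given texts, assuming one credit per character."""
    return sum(len(t) for t in texts)


def estimate_cost_eur(
    texts: Iterable[str], eur_per_million: float = PAYG_EUR_PER_MILLION
) -> float:
    """Estimated PAYG cost in euros for rendering every text once."""
    return estimate_credits(texts) * eur_per_million / 1_000_000


# Placeholder content: 200 pages of 5,000 characters each.
pages: List[str] = ["x" * 5000] * 200
print(estimate_credits(pages), estimate_cost_eur(pages))
```

Under that assumption, 200 pages of 5,000 characters come to 1,000,000 credits, or about €20 at the PAYG rate.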

What voice is recommended for accessibility narration?

F1 (clear diction, measured pace) and M1 (neutral, unhurried) perform best in user testing for long-form reading. Both are available in all 33 languages.

Can I serve audio on-demand rather than pre-generating?

Yes — call the API in your server route handler and stream the response to the browser. Median latency is under 200 ms for texts under 300 characters, which is acceptable for on-demand "read this paragraph" interactions.
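
The relay step in a route handler might be sketched as follows, using only the standard library. The endpoint and payload fields follow the Integration example; the helper names are hypothetical, and whether the API delivers the body incrementally is an assumption, so treat this as a starting point rather than a reference implementation.

```python
import json
import os
import urllib.request
from typing import BinaryIO, Iterator

API = "https://audexum.com/api/synthesize"


def iter_chunks(resp: BinaryIO, chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield the body piece by piece so the handler can forward audio as it arrives."""
    while True:
        chunk = resp.read(chunk_size)
        if not chunk:
            break
        yield chunk


def stream_paragraph(text: str, lang: str = "en") -> Iterator[bytes]:
    """Request audio for one short paragraph and yield MP3 chunks."""
    payload = json.dumps(
        {"text": text, "voice": "F1", "lang": lang, "format": "mp3"}
    ).encode("utf-8")
    req = urllib.request.Request(
        API,
        data=payload,
        headers={
            "Authorization": f"Bearer {os.environ['AUDEXUM_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        yield from iter_chunks(resp)
```

Most Python web frameworks accept a byte generator as a response body; in Flask, for instance, Response(stream_paragraph(text), mimetype="audio/mpeg") would pass the chunks through to the browser.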

Other use cases

Same API, every use case.

One endpoint handles Discord bots, podcast narration, e-learning courses, accessibility audio, and newsletter editions.

Start free. Ship fast.

30,000 credits per month, no credit card required. First audio in your terminal in 60 seconds.

Questions? [email protected]