Text to Speech API for E-Learning and Course Narration
Consistent instructor voice across every slide, chapter, and language edition. 33 languages, 43 voices, single REST endpoint.
Human narrators break voice consistency on script revisions
When a compliance update changes 10 slides out of 200, a human narrator re-records those 10 with slightly different mic placement, room tone, and delivery. AI narration is deterministic: the same text and the same voice ID always produce the same acoustic character.
Localization triples production cost with human voice actors
To produce a French, German, Spanish, or Japanese edition of your course, send the translated script to the same endpoint with the lang parameter changed. The voice set (43 voices) and the price schedule are identical regardless of language.
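As a minimal sketch of that workflow, the helper below builds one request body per language edition while keeping the voice fixed. The field names mirror the request body in this page's Python example; the helper name `build_edition_payloads`, the two-letter language codes, and the sample translations are illustrative assumptions, so check audexum.com/docs for the accepted `lang` values.

```python
# Hypothetical helper: builds request bodies only, it does not call the API.
# Field names ("text", "voice", "lang", "format") follow this page's example.
def build_edition_payloads(translations, voice="F1", audio_format="mp3"):
    """Return one request body per language edition of the same slide."""
    return [
        {"text": text, "voice": voice, "lang": lang, "format": audio_format}
        for lang, text in translations.items()
    ]


# Placeholder translations; the language codes are assumed, not confirmed.
translations = {
    "fr": "Bienvenue dans ce cours.",
    "de": "Willkommen zu diesem Kurs.",
    "es": "Bienvenido a este curso.",
    "ja": "このコースへようこそ。",
}

payloads = build_edition_payloads(translations)
for p in payloads:
    print(p["lang"], p["voice"], p["format"])
```

Each payload can then be posted exactly like the single-language request, so a new edition only needs a translated script.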
LMS platforms need file-based audio, not an embedded player
Audexum returns a binary WAV or MP3. Store it in S3, Cloudflare R2, or your LMS asset library directly. No iframe embeds, no third-party player dependency, no outbound CDN request from the learner's browser at playback time.
Rapid course iteration needs batch generation
Script edits happen late in production. A Python loop over your JSON slide data regenerates all changed slides in minutes. The Batch page in the Audexum dashboard handles CSV-based bulk synthesis without writing code.
First audio in 60 seconds.
No SDK — one POST request, binary audio in the response body.
"""
Generate narration audio for every slide in a course manifest.
manifest.json: [{"id": "s01", "lang": "en", "text": "Welcome..."}, ...]
"""
import json
import pathlib

import requests

API = "https://audexum.com/api/synthesize"
KEY = "sk_..."
VOICE = "F1"  # consistent voice across all slides

slides = json.loads(pathlib.Path("manifest.json").read_text(encoding="utf-8"))
out_dir = pathlib.Path("audio")
out_dir.mkdir(exist_ok=True)

rendered = 0
for slide in slides:
    dest = out_dir / f"{slide['id']}_{slide['lang']}.mp3"
    if dest.exists():
        # Skip already-rendered slides. The check is by filename only, so
        # delete the file to force a re-render after a script edit.
        continue
    r = requests.post(
        API,
        headers={"Authorization": f"Bearer {KEY}"},
        json={
            "text": slide["text"],
            "voice": VOICE,
            "lang": slide["lang"],
            "format": "mp3",
        },
        timeout=60,
    )
    r.raise_for_status()  # stop on any HTTP error instead of saving a bad file
    dest.write_bytes(r.content)  # the response body is the binary audio
    rendered += 1
    print(f"Rendered {dest.name}")

print(f"Done: rendered {rendered} of {len(slides)} slides.")

Full parameter reference: audexum.com/docs. Supported formats: wav, mp3, ogg. Supported voices: F1–F5, M1–M5 (43 voices total). Supported languages: 33.
Transparent, no per-use-case surcharge.
Every plan covers every use case at the same character rate. PAYG credits never expire.
All plans include STT (speech-to-text dictation) at no extra cost. Full pricing details →
Can the same voice narrate in multiple languages?
Yes. Voice IDs (F1–F5, M1–M5) work across all 33 languages. F1 in English and F1 in French use the same underlying model weights, so learners hear a consistent instructor voice regardless of which language edition they choose.
How do I keep narration in sync after a script edit?
Only re-render slides whose text changed. Because Audexum is deterministic (same text + same voice = same audio), unchanged slides produce bit-identical output and can be safely cached.
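One way to find the slides whose text changed is to fingerprint the inputs that determine the audio. The sketch below is an assumption about how you might organize a local cache, not part of the Audexum API: `slide_fingerprint` and `slides_to_render` are hypothetical helpers, and the `rendered` mapping stands in for whatever store you keep between runs. It complements the `dest.exists()` check in the example on this page, which compares filenames only and therefore would not notice an edited script.

```python
import hashlib
import json


def slide_fingerprint(slide, voice):
    """Hash the inputs that determine the audio: text, voice, and language."""
    key = json.dumps(
        {"text": slide["text"], "voice": voice, "lang": slide["lang"]},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def slides_to_render(slides, voice, rendered):
    """Return slides whose fingerprint differs from the stored one.

    `rendered` maps slide id -> fingerprint of the last rendered audio.
    """
    return [
        s for s in slides
        if rendered.get(s["id"]) != slide_fingerprint(s, voice)
    ]


slides = [
    {"id": "s01", "lang": "en", "text": "Welcome to the course."},
    {"id": "s02", "lang": "en", "text": "Safety rules, 2023 edition."},
]
# Pretend both slides were rendered earlier with voice F1.
rendered = {s["id"]: slide_fingerprint(s, "F1") for s in slides}

slides[1]["text"] = "Safety rules, 2024 edition."  # a late script edit
changed = slides_to_render(slides, "F1", rendered)
print([s["id"] for s in changed])  # prints ['s02']
```

Changing the voice also changes every fingerprint, so switching from F1 to another voice re-renders the whole course rather than leaving a mix of voices.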
What output format do LMS platforms expect?
Most SCORM/xAPI-based LMS platforms and authoring tools (Moodle, Canvas, Articulate Rise) accept MP3, so set format to "mp3" in the API call. For SSML-heavy content or very precise timing, request "wav" and convert it to the target format afterwards with ffmpeg.
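The conversion step might look like the following sketch. It assumes ffmpeg is installed, on your PATH, and built with the libmp3lame encoder; the 128k bitrate and the file paths are placeholder choices, and both function names are hypothetical.

```python
import shutil
import subprocess


def ffmpeg_wav_to_mp3_cmd(wav_path, mp3_path, bitrate="128k"):
    """Build the ffmpeg argument list for a WAV -> MP3 conversion."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", str(wav_path),       # input WAV
        "-codec:a", "libmp3lame",  # MP3 encoder (requires an ffmpeg build with LAME)
        "-b:a", bitrate,           # audio bitrate; 128k is an assumed default
        str(mp3_path),
    ]


def convert_wav_to_mp3(wav_path, mp3_path, bitrate="128k"):
    """Run the conversion, failing loudly if ffmpeg is missing or errors out."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(ffmpeg_wav_to_mp3_cmd(wav_path, mp3_path, bitrate), check=True)


# Show the command without running it.
print(" ".join(ffmpeg_wav_to_mp3_cmd("audio/s01_en.wav", "audio/s01_en.mp3")))
```

Keeping the WAV as the master copy lets you re-encode at a different bitrate later without calling the API again.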
Is there a no-code option for instructional designers?
The Batch page (/batch) in your dashboard accepts a CSV with columns: text, voice, lang, filename. It renders all rows and lets you download a ZIP. No code required.
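If your slides already live in a manifest.json like the one in the example on this page, a short script can produce that CSV for the designer. This is a sketch under assumptions: the column names come from the answer above, but the header row, the `.mp3` extension in `filename`, and the helper name `manifest_to_batch_csv` are guesses to verify against the Batch page itself.

```python
import csv
import io

COLUMNS = ["text", "voice", "lang", "filename"]  # column names from the FAQ answer


def manifest_to_batch_csv(slides, voice="F1"):
    """Return CSV text with one row per slide, quoted correctly by csv."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()  # assumes the Batch page expects a header row
    for s in slides:
        writer.writerow({
            "text": s["text"],
            "voice": voice,
            "lang": s["lang"],
            # Assumed naming scheme, matching the API example on this page.
            "filename": f"{s['id']}_{s['lang']}.mp3",
        })
    return buf.getvalue()


slides = [
    {"id": "s01", "lang": "en", "text": "Welcome to the course."},
    {"id": "s02", "lang": "en", "text": 'Click "Next", then continue.'},
]
print(manifest_to_batch_csv(slides))
```

Using the csv module rather than string joins matters here because narration text routinely contains commas and quotation marks.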
Same API, every use case.
One endpoint handles Discord bots, podcast narration, e-learning courses, accessibility audio, and newsletter editions.
Start free. Ship fast.
10,000 characters per month, no credit card required. First audio in your terminal in 60 seconds.
Questions? [email protected]