AI University — Body of Knowledge

Speech & Audio AI

Understanding and generating sound — speech, music, and everything in between.

Speech & Audio AI is one of the core areas in the AI University map of AI. Explore the diagram, then dive into each topic; every subtopic grows into its own deep dive over time.

flowchart LR
  V([Voice]) --> ASR[ASR] --> TXT[/Text/] --> TTS[TTS] --> V2([Speech])
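
As a rough illustration of the chain in the diagram, the two stages can be treated as plain functions composed in order. This is only a sketch: the names `Recognizer`, `Synthesizer`, `voice_to_speech`, `fake_asr`, and `fake_tts` are hypothetical, and the toy stand-ins simply shuffle bytes and text rather than running any real model.

```python
from typing import Callable

# Hypothetical stage types: audio is raw bytes, text is a str.
Recognizer = Callable[[bytes], str]   # ASR: audio -> text
Synthesizer = Callable[[str], bytes]  # TTS: text -> audio


def voice_to_speech(audio: bytes, asr: Recognizer, tts: Synthesizer) -> bytes:
    """Run the Voice -> ASR -> Text -> TTS -> Speech chain from the diagram."""
    text = asr(audio)   # spoken audio becomes text
    return tts(text)    # text becomes generated speech


# Toy stand-ins so the sketch runs; a real system would call trained models here.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_tts(text: str) -> bytes:
    return text.upper().encode("utf-8")


result = voice_to_speech(b"hello", fake_asr, fake_tts)
print(result)
```

Keeping each stage behind a simple callable type means either side could be swapped for a real recognizer or synthesizer without touching the pipeline function.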

Key topics

Speech recognition (ASR)

Turning spoken audio into text.

Text-to-speech (TTS)

Generating natural-sounding speech from text.

Voice & speaker tech

Speaker identification, diarization, and voice cloning (and its ethics).

Music & audio generation

Composing and synthesizing music and sound effects.
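
For a feel of what "synthesizing sound" means at the lowest level, the sketch below generates a plain sine tone and saves it as a WAV file using only the Python standard library. It is a minimal example, not a generative model: the helper names `sine_tone` and `write_wav`, the 16 kHz sample rate, the 440 Hz pitch, and the output file name are all assumptions chosen for illustration.

```python
import math
import os
import struct
import tempfile
import wave


def sine_tone(freq_hz: float, seconds: float,
              sample_rate: int = 16000, amplitude: float = 0.5) -> bytes:
    """Return 16-bit mono PCM samples for a sine wave."""
    n_samples = int(seconds * sample_rate)
    frames = bytearray()
    for i in range(n_samples):
        sample = amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        # Scale to the signed 16-bit range and pack as little-endian.
        frames += struct.pack("<h", int(sample * 32767))
    return bytes(frames)


def write_wav(path: str, pcm: bytes, sample_rate: int = 16000) -> None:
    """Write mono 16-bit PCM data to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)


pcm = sine_tone(440.0, 0.5)  # half a second of A4
path = os.path.join(tempfile.gettempdir(), "tone.wav")
write_wav(path, pcm)
```

Music and sound-effect generation systems ultimately produce sample sequences like `pcm`; the difference is that a model, rather than a fixed formula, decides what the samples should be.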

Deep Learning · Generative AI

Learn this properly

Want hands-on training in Speech & Audio AI? Explore AI University courses and AI School camps for kids.