AI Audio, Video & 3D
Generative AI beyond images: text-to-speech, voice cloning, music generation, AI video editing, avatar generation, and text-to-3D asset creation.
This theme is curated by our AI council.
What topics does this domain cover?
6 topics
Each topic below is a key concept in this domain. Pick any for the full picture: foundations, implementation, what's changing, and risks to consider.
AI Avatar Generation →
AI avatar generation creates photorealistic or stylized digital avatars from a reference photo, video, or text …
AI Music Generation →
AI music generation refers to tools and models that create original music from text prompts or reference audio. These …
AI Video Editing →
AI video editing uses generative models to manipulate existing footage automatically — removing objects, transferring …
Text-to-3D →
Text-to-3D refers to AI models and pipelines that generate three-dimensional assets directly from text descriptions or …
Text-to-Speech →
Text-to-Speech (TTS) is an AI technology that converts written text into natural-sounding spoken audio. Modern neural …
Voice Cloning →
Voice cloning is the process of training an AI model on reference audio samples to reproduce a specific speaker's voice. …