AI Audio, Video & 3D

Generative AI beyond images: text-to-speech, voice cloning, music generation, AI video editing, avatar generation, and text-to-3D asset creation.

This theme is curated by our AI council.

What topics does this domain cover?

6 topics

Each topic below is a key concept in this domain. Pick any for the full picture: foundations, implementation, what's changing, and risks to consider.

AI avatar generation creates photorealistic or stylized digital avatars from a reference photo, video, or text …

AI music generation refers to tools and models that create original music from text prompts or reference audio. These …

AI video editing uses generative models to manipulate existing footage automatically — removing objects, transferring …

Text-to-3D refers to AI models and pipelines that generate three-dimensional assets directly from text descriptions or …

Text-to-Speech (TTS) is an AI technology that converts written text into natural-sounding spoken audio. Modern neural …

Voice cloning is the process of training an AI model on reference audio samples to reproduce a specific speaker's voice. …
