Generate videos with virtual avatars

Last updated on May 6, 2025

Learn how to generate a video featuring an avatar by providing scripted dialogue using the Text to avatar (beta) feature.

With Text to avatar (beta), you can generate a video featuring an avatar by providing a script for the avatar to speak, whether it's a simple sentence or a detailed paragraph. You can choose from a variety of avatars, each with a distinct appearance, voice style, and accent. You can also edit the script, insert pauses, and customize the background before you generate the video.

  1. On the Adobe Firefly homepage, select Text to avatar (beta).

  2. On the Text to avatar page, select Create new.

  3. In the Scene section, select an avatar from the list. Select Browse more to view the complete list and preview each avatar's voice.

    Preview the avatars with their respective voices, then select the one that best complements your video.

  4. Use the language dropdown menu to specify the language of the scripted dialogue for the avatar.

    Specify the language in which the avatar's scripted dialogue is written.

  5. In the Content field, add the scripted dialogue for the avatar.

    Add the script for the avatar to speak, written in the specified language.

    Tip:
    • If the text of your dialogue does not convey the emotion you want, try wrapping it in a character-acting quote. For example, He said with joy: "That is amazing!"
    • Spell out numbers as words if you want the avatar to speak them in a specific way.
    • Words in all capital letters are spoken letter by letter. For example, POC is spoken as Pee-Oh-See, but poc is spoken as pock.
    • A lowercase s after an abbreviation, as in PDFs, is spoken as an s sound rather than as the letter name es. As another example, POCi is spoken as posi rather than Pee-Oh-Si. Add a dash between the letters, for example, P-O-Ci, to get the intended speech.

  6. Use the preview option above the Content section to listen to the voice and pauses before generating the video.

    The preview displays the waveforms of the rendered voice according to the script.
    Use the preview feature to test and refine scripted dialogues in real time before finalizing.

  7. Under Background, select one of the following options:

    • Color: Select a color or no background.
    • Image: Select a background image from the provided options or upload your own in .png or .jpg format with the following specifications:
      • Aspect ratio: 16:9
      • Resolution: 1920x1080
    • Video: Upload a video in the .mp4 or .mov format with the following specifications:
      • Maximum length: 10 minutes
      • Codec: H264, H265
      • Aspect ratio: 16:9
      • Resolution: 1920x1080
    Add backgrounds to the video to highlight your brand, give context, or create a setting.

  8. Select Generate.

  9. Once the video is generated, use the on-screen controls to preview and scrub through the frames.

  10. Use the Download option to download the generated video in .mp4 file format.
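
The pronunciation tips in step 5 can also be applied to a script before you paste it into the Content field. The following Python sketch shows one possible way to do that. It is not part of Adobe Firefly, and the function names (spell_out_number, hyphenate_acronym, prepare_script) are hypothetical. It assumes English text, handles only whole numbers from 0 to 99, and leaves plural abbreviations such as PDFs unchanged.

```python
import re

# Hypothetical helpers, not an Adobe Firefly API. They only rewrite plain text
# according to the pronunciation tips in step 5.

_ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
_TENS = ["", "", "twenty", "thirty", "forty", "fifty",
         "sixty", "seventy", "eighty", "ninety"]


def spell_out_number(n: int) -> str:
    """Return the English words for a whole number from 0 to 99."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only supports 0 to 99")
    if n < 20:
        return _ONES[n]
    tens, ones = divmod(n, 10)
    return _TENS[tens] if ones == 0 else f"{_TENS[tens]}-{_ONES[ones]}"


def hyphenate_acronym(word: str) -> str:
    """Add dashes between the capitals of an abbreviation with a lowercase tail.

    "POCi" becomes "P-O-Ci". Plain abbreviations ("POC") and plurals ("PDFs")
    are returned unchanged.
    """
    match = re.fullmatch(r"([A-Z]{2,})([a-z]+)", word)
    if match is None or match.group(2) == "s":
        return word
    capitals, tail = match.groups()
    return "-".join(capitals) + tail


def prepare_script(text: str) -> str:
    """Spell out standalone one- or two-digit numbers, then fix abbreviations."""
    text = re.sub(r"\b\d{1,2}\b",
                  lambda m: spell_out_number(int(m.group())), text)
    return re.sub(r"\b[A-Z]{2,}[a-z]+\b",
                  lambda m: hyphenate_acronym(m.group()), text)
```

For example, prepare_script("Review 3 PDFs for the POCi team.") returns "Review three PDFs for the P-O-Ci team." Always listen to the result with the preview option in step 6, because the actual pronunciation is decided by the feature, not by this script.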

The generated file is also added to the Your media section and will be available for download for seven days. After that period, it will be permanently deleted.
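
If you prepare your own background files, you may want to check them against the specifications in step 7 before uploading. The following Python sketch is one way to express those rules. It is not an Adobe tool: MediaInfo and check_background are hypothetical names, and the sketch assumes you already know the file's dimensions, duration, and codec from your own media-inspection software.

```python
from dataclasses import dataclass
from typing import List, Optional

# Limits taken from the Background options in step 7.
REQUIRED_SIZE = (1920, 1080)          # 1920x1080 is a 16:9 aspect ratio
IMAGE_EXTENSIONS = {".png", ".jpg"}
VIDEO_EXTENSIONS = {".mp4", ".mov"}
VIDEO_CODECS = {"h264", "h265"}
MAX_VIDEO_SECONDS = 10 * 60           # 10 minutes


@dataclass
class MediaInfo:
    """Hypothetical container for values read with your own inspection tool."""
    extension: str
    width: int
    height: int
    duration_seconds: Optional[float] = None  # videos only
    codec: Optional[str] = None               # videos only, e.g. "H264"


def check_background(media: MediaInfo) -> List[str]:
    """Return a list of problems; an empty list means the file fits the specs."""
    ext = media.extension.lower()
    if ext not in IMAGE_EXTENSIONS | VIDEO_EXTENSIONS:
        return [f"unsupported file type: {media.extension}"]

    problems = []
    if (media.width, media.height) != REQUIRED_SIZE:
        problems.append(
            f"resolution is {media.width}x{media.height}, expected 1920x1080"
        )

    if ext in VIDEO_EXTENSIONS:
        if media.duration_seconds is None:
            problems.append("video duration is unknown")
        elif media.duration_seconds > MAX_VIDEO_SECONDS:
            problems.append("video is longer than 10 minutes")
        # Accept spellings such as "H.264" by removing the dot.
        codec = (media.codec or "").lower().replace(".", "")
        if codec not in VIDEO_CODECS:
            problems.append(f"codec {media.codec!r} is not H264 or H265")
    return problems
```

For example, check_background(MediaInfo(".png", 1920, 1080)) returns an empty list. Codec names vary between inspection tools, so map whatever your tool reports to H264 or H265 before calling the function.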