User

class User(xi_api_key: str)

Represents a user of the ElevenLabs API, including subscription information.

This is the class that can be used to query for the user’s available voices and create new ones.

__init__(xi_api_key: str)

Initializes a new instance of the User class.

Parameters:

xi_api_key (str) – The user’s API key.

Raises:

ValueError – If the API Key is invalid.

property headers: dict

Returns:

The headers used for API requests.

Return type:

dict

get_user_data() dict
Returns:

All the information returned by the /v1/user endpoint.

Return type:

dict

get_subscription_data() dict
Returns:

All the information returned by the /v1/user/subscription endpoint.

Return type:

dict

get_character_info()
Returns:

A tuple containing the number of characters used, the maximum allowed, and whether the maximum can be increased.

Return type:

(int, int, bool)
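
As a rough illustration, the returned tuple can be unpacked to work out how much quota is left. characters_remaining is a hypothetical helper, not part of the library; it only assumes that user exposes get_character_info() with the return shape documented above.

```python
def characters_remaining(user) -> int:
    """Return how many characters are left before the quota is reached.

    `user` is assumed to be a User instance; any object exposing
    get_character_info() with the documented (int, int, bool) shape works.
    """
    used, maximum, _can_increase = user.get_character_info()
    # Clamp at zero in case usage is reported above the maximum.
    return max(maximum - used, 0)
```

The third element of the tuple is ignored here; a caller could use it to decide whether to suggest raising the limit.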

get_voice_clone_available() bool
Returns:

True if the user can use instant voice cloning, False otherwise.

Return type:

bool

get_next_invoice() dict | None
Returns:

The next invoice’s data, or None if there is no next invoice.

Return type:

dict | None

get_models() list[Model]

Returns all the models available to this account as Model instances.

Returns:

All the available models for this account, as Model instances.

Return type:

list[Model]

get_all_voices(show_legacy: bool = True) list[Voice | DesignedVoice | ClonedVoice | ProfessionalVoice]

Gets a list of all voices registered to this account.

Caution

Some of these may be unusable due to subscription tier changes. Use get_available_voices if you only need the currently usable ones.

Returns:

A list containing all the voices.

Return type:

list[Voice|DesignedVoice|ClonedVoice|ProfessionalVoice]

get_available_voices(show_legacy: bool = True) list[Voice | DesignedVoice | ClonedVoice | ProfessionalVoice]

Gets a list of voices this account can currently use for TTS.

Returns:

A list of currently usable voices.

Return type:

list[Voice|DesignedVoice|ClonedVoice|ProfessionalVoice]

get_voice_by_ID(voiceID: str) Voice | DesignedVoice | ClonedVoice | ProfessionalVoice

Gets a specific voice by ID.

Parameters:

voiceID (str) – The ID of the voice to get.

Returns:

The requested voice.

Return type:

Voice|DesignedVoice|ClonedVoice|ProfessionalVoice

get_voices_by_name_v2(voiceName: str, score_threshold: int = 75) list[Voice | EditableVoice | ClonedVoice | ProfessionalVoice]

Gets a list of voices with the given name.

Note

This is a list as multiple voices can have the same name.

Parameters:
  • voiceName (str) – The name of the voices to get.

  • score_threshold (int, Optional) – The minimum match score, as a percentage, that a voice needs in order to be included in the returned list. Defaults to 75.

Returns:

A list of matching voices.

Return type:

list[Voice|EditableVoice|ClonedVoice|ProfessionalVoice]
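
Because the result is a list, callers usually need to handle both the empty case and the multiple-match case. The sketch below shows one way to do that. first_voice_named is a hypothetical helper, and it assumes nothing about the ordering of the returned list, which is not documented here.

```python
def first_voice_named(user, voice_name: str, score_threshold: int = 75):
    """Return the first voice matching `voice_name`, or raise LookupError.

    `user` is assumed to be a User instance. The ordering of the list
    returned by get_voices_by_name_v2 is not documented, so "first" simply
    means the first entry of that list.
    """
    matches = user.get_voices_by_name_v2(voice_name, score_threshold=score_threshold)
    if not matches:
        raise LookupError(
            f"No voice matched {voice_name!r} at threshold {score_threshold}"
        )
    return matches[0]
```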

get_history_items_paginated(maxNumberOfItems: int = 100, startAfterHistoryItem: str | HistoryItem = None) list[HistoryItem]

Returns up to maxNumberOfItems history items, starting from the newest (or from the one specified with startAfterHistoryItem) and moving towards older ones.

Parameters:
  • maxNumberOfItems (int) – The maximum number of history items to get. A value of 0 or less means all of them.

  • startAfterHistoryItem (str|HistoryItem) – The history item (or its ID) from which to start returning items.

Returns:

A list containing the requested history items.

Return type:

list[HistoryItem]
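
The pagination parameters can be combined into a loop that walks the history in fixed-size pages. The following is a sketch only: iter_history is a hypothetical helper, and it assumes that the item passed as startAfterHistoryItem is itself excluded from the next page, as the parameter name suggests.

```python
def iter_history(user, page_size: int = 100):
    """Yield history items from newest to oldest, one page at a time.

    `user` is assumed to be a User instance. Assumption: the page returned
    for startAfterHistoryItem=X does not include X itself.
    """
    if page_size <= 0:
        # 0 or less asks for every item at once, which defeats paging.
        raise ValueError("page_size must be positive")
    cursor = None
    while True:
        page = user.get_history_items_paginated(
            maxNumberOfItems=page_size, startAfterHistoryItem=cursor
        )
        if not page:
            return
        yield from page
        if len(page) < page_size:
            # A short page means there is nothing older left to fetch.
            return
        cursor = page[-1]
```

Using a generator keeps memory use bounded and lets the caller stop early without fetching the remaining pages.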

get_history_item(historyItemID: str | GenerationInfo) HistoryItem
Parameters:

historyItemID (str|GenerationInfo) – The HistoryItem ID.

Returns:

The corresponding HistoryItem

Return type:

HistoryItem

download_history_items_v2(historyItems: list[str | HistoryItem]) dict[HistoryItem, tuple[bytes, str]]

Download multiple history items and return a dictionary where the key is the HistoryItem and the value is a tuple consisting of the bytes of the audio and its filename.

Parameters:

historyItems (list[str|HistoryItem]) – List of history items (or their IDs) to download.

Returns:

Dictionary where the key is the historyItem and the value is a tuple of the bytes of the mp3 file and its filename.

Return type:

dict[HistoryItem, tuple[bytes, str]]
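
Since each value pairs the audio bytes with a filename, the result can be written straight to disk. The sketch below assumes the documented (bytes, filename) tuple shape; save_history_audio and out_dir are hypothetical names introduced for illustration.

```python
from pathlib import Path


def save_history_audio(user, history_items, out_dir):
    """Download the given history items and write each one into `out_dir`.

    `user` is assumed to be a User instance. Returns the list of paths
    that were written.
    """
    downloads = user.download_history_items_v2(history_items)
    target_dir = Path(out_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for _item, (audio_bytes, filename) in downloads.items():
        # Keep only the final path component so a filename cannot escape out_dir.
        target = target_dir / Path(filename).name
        target.write_bytes(audio_bytes)
        written.append(target)
    return written
```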

design_voice(gender: str, accent: str, age: str, accent_strength: float, sampleText: str = 'First we thought the PC was a calculator. Then we found out how to turn numbers into letters and we thought it was a typewriter.')

Calls the API endpoint that randomly generates a voice based on the given parameters.

Caution

To actually save the generated voice to your account, you must then call save_designed_voice with the temporary voiceID.

Parameters:
  • gender (str) – The gender.

  • accent (str) – The accent.

  • age (str) – The age.

  • accent_strength (float) – How strong the accent should be, between 0.3 and 2.

  • sampleText (str) – The text that will be used to randomly generate the new voice. Must be at least 100 characters long.

Returns:

A tuple containing the new, temporary voiceID and the bytes of the generated audio.

Return type:

(str, bytes)

save_designed_voice(temporaryVoiceID: str | tuple[str, bytes], voiceName: str, voiceDescription: str = '') DesignedVoice

Saves a voice generated via design_voice to your account, with the given name.

Parameters:
  • temporaryVoiceID (str|tuple(str,bytes)) – The temporary voiceID of the generated voice. It also supports directly passing the tuple from design_voice.

  • voiceName (str) – The name you would like to give to the new voice.

  • voiceDescription (str) – The description you would like to give to the new voice.

Returns:

The newly created voice

Return type:

DesignedVoice
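
The two calls are meant to be chained: design_voice produces a temporary voice, and save_designed_voice stores it. A minimal sketch of that flow follows. design_and_save is a hypothetical helper, and the accepted values for gender, accent and age are not listed here, so they are passed through unchanged.

```python
def design_and_save(user, voice_name, gender, accent, age, accent_strength=1.0):
    """Design a random voice and immediately save it under `voice_name`.

    `user` is assumed to be a User instance.
    """
    # The documented range for accent_strength is 0.3 to 2.
    if not 0.3 <= accent_strength <= 2:
        raise ValueError("accent_strength must be between 0.3 and 2")
    # design_voice returns (temporary voiceID, preview audio bytes).
    designed = user.design_voice(gender, accent, age, accent_strength)
    # save_designed_voice accepts that tuple directly, per its parameter notes.
    return user.save_designed_voice(designed, voice_name)
```

In practice one would probably listen to the preview bytes before saving; this sketch skips that step for brevity.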

generate_voice(voice_description: str, text: str | None = None, auto_generate_text: bool = False, output_format: str = 'mp3_44100_192') list[tuple[str, bytes, float]]

Calls the updated API endpoint that generates voice previews based on the given description.

Parameters:
  • voice_description (str) – Description of the voice to generate. Must be between 20-1000 characters.

  • text (str, optional) – Text to generate audio for. Must be between 100-1000 characters. Required unless auto_generate_text is True.

  • auto_generate_text (bool, optional) – Whether to automatically generate suitable text. Defaults to False.

  • output_format (str, optional) – Output format for the generated audio. Defaults to “mp3_44100_192”.

Returns:

A list of tuples, each containing:
  • generated_voice_id (str): Temporary ID for the generated voice

  • audio_bytes (bytes): Audio data for the preview

  • duration_secs (float): Duration of the audio in seconds

Return type:

list[tuple[str, bytes, float]]

Raises:

ValueError – If voice_description is not between 20-1000 characters or if text is not between 100-1000 characters when required.
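
The length constraints can be checked up front, for example to report every problem at once before making the call. The function below is an illustrative sketch that mirrors the documented limits; check_generate_voice_args is a hypothetical name and is not part of the library.

```python
def check_generate_voice_args(voice_description, text=None, auto_generate_text=False):
    """Return a list of problems with the arguments (empty if none found).

    Mirrors the documented limits: description 20-1000 characters, text
    100-1000 characters, text required unless auto_generate_text is True.
    """
    problems = []
    if not 20 <= len(voice_description) <= 1000:
        problems.append("voice_description must be between 20 and 1000 characters")
    if not auto_generate_text:
        if text is None:
            problems.append("text is required unless auto_generate_text is True")
        elif not 100 <= len(text) <= 1000:
            problems.append("text must be between 100 and 1000 characters")
    return problems
```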

save_generated_voice(generated_voice_id: str, voice_name: str, voice_description: str, labels: dict[str, str] | None = None) EditableVoice

Saves a voice generated via generate_voice to your account.

Parameters:
  • generated_voice_id (str) – The temporary voice ID returned by generate_voice.

  • voice_name (str) – The name you would like to give to the new voice.

  • voice_description (str) – The description for the voice. Must be between 20-1000 characters.

  • labels (dict[str, str], optional) – Metadata to add to the created voice. Defaults to None.

Returns:

The newly created voice

Return type:

EditableVoice

Raises:

ValueError – If voice_description is not between 20-1000 characters.
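
generate_voice and save_generated_voice form a two-step flow: generate previews, then save the one you want. A minimal sketch, assuming the documented (generated_voice_id, audio_bytes, duration_secs) tuple shape; save_preview and preview_index are hypothetical names.

```python
def save_preview(user, voice_name, voice_description, preview_index=0):
    """Generate previews for a description and save one of them.

    `user` is assumed to be a User instance. The sample text is
    auto-generated so that only the description has to be supplied.
    """
    previews = user.generate_voice(voice_description, auto_generate_text=True)
    # Each preview is (generated_voice_id, audio_bytes, duration_secs).
    generated_voice_id, _audio_bytes, _duration_secs = previews[preview_index]
    return user.save_generated_voice(generated_voice_id, voice_name, voice_description)
```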

clone_voice(name: str, samples: list[str] | dict[str, bytes], description: str = '', remove_background_noise: bool = False, labels: dict[str, str] | None = None)

Create a new ClonedVoice object from the given samples.

Parameters:
  • name (str) – Name of the voice to be created.

  • samples (list[str]|dict[str, bytes]) – List of file paths OR dictionary of sample file names and bytes for the voice samples.

  • description (str, Optional) – The description of the voice.

  • remove_background_noise (bool, optional) – Whether to automatically remove background noise. Defaults to False. Enabling it can worsen quality if no noise is present.

  • labels (dict[str, str], optional) – The labels to add to the voice.

Returns:

The new voice.

Return type:

ClonedVoice
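
When the samples are not plain files on disk at call time, the dict[str, bytes] form is the one to use. The sketch below builds such a dictionary from paths, which is handy for inspecting or filtering samples first. load_samples is a hypothetical helper, not part of the library.

```python
from pathlib import Path


def load_samples(paths):
    """Build a {file name: bytes} dictionary from a list of sample paths."""
    samples = {}
    for path in map(Path, paths):
        if path.name in samples:
            # Two files with the same name would silently overwrite each other.
            raise ValueError(f"duplicate sample file name: {path.name}")
        samples[path.name] = path.read_bytes()
    return samples


# Possible usage, assuming `user` is a User instance:
# voice = user.clone_voice("My voice", load_samples(["a.mp3", "b.mp3"]))
```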

search_voice_library(search_term: str | None = None, use_cases: list[str] | None = None, descriptives: list[str] | None = None, sort: ~elevenlabslib.helpers.LibSort | None = LibSort.TRENDING, advanced_filters: ~elevenlabslib.helpers.LibVoiceInfo = <elevenlabslib.helpers.LibVoiceInfo object>, starting_page=0, query_page_size=30) List[LibraryVoiceData]

Allows you to search the voice library with various filters. For parameters which are lists, all voices that match at least one of them will be returned.

Parameters:
  • search_term (str, Optional) – The search term to use, equivalent to typing it into the site.

  • use_cases (list, Optional) – A list of use cases.

  • descriptives (list, Optional) – A list of descriptives (Soft, Calm, etc).

  • sort (LibSort, Optional) – How to sort the voices.

  • advanced_filters (LibVoiceInfo, Optional) – Allows you to filter voices based on their characteristics (language, accent, etc).

  • query_page_size (int, Optional) – How many voices to return. Defaults to 30.

  • starting_page (int, Optional) – The page to start at.

add_shared_voice(voice: LibraryVoiceData, newName: str) Voice

Adds a voice from the library to your account.

Parameters:
  • voice (LibraryVoiceData) – A LibraryVoiceData object, from the voice library endpoint.

  • newName (str) – Name to give to the voice.

Returns:

The newly created voice.

Return type:

Voice

add_shared_voice_from_URL(shareURL: str, newName: str) Voice

Adds a voice from a share link to the account.

Parameters:
  • shareURL (str) – The sharing URL for the voice.

  • newName (str) – Name to give to the voice.

Returns:

The newly created voice.

Return type:

Voice

add_shared_voice_from_info(publicUserID: str, voiceID: str, newName: str) Voice

Adds a voice directly from the voiceID and the public userID.

Parameters:
  • publicUserID (str) – The public userID of the voice’s creator.

  • voiceID (str) – The voiceID of the voice.

  • newName (str) – Name to give to the voice.

Returns:

The newly created voice.

Return type:

Voice

add_project(name: str, default_title_voice: [str, Voice], default_paragraph_voice: [str, Voice], default_model: [str | Model], pronunciation_dictionaries=None, from_url: str | None = None, from_document: str | None = None, quality_preset: str = 'standard', title: str | None = None, author: str | None = None, isbn_number: str | None = None, volume_normalization: bool = False) Project

Creates a new project.

Parameters:
  • name (str) – Name of the project.

  • default_title_voice (str|Voice) – Default voice for titles.

  • default_paragraph_voice (str|Voice) – Default voice for paragraphs.

  • default_model (str|Model) – Model for the project.

  • pronunciation_dictionaries (list[PronunciationDictionary]) – Pronunciation dictionary locators.

  • from_url (str, optional) – Optional URL to initialize project content.

  • from_document (str, optional) – The filepath to a file from which to initialize the project.

  • quality_preset (str, optional) – Quality preset for audio. Must be “standard”, “high” or “ultra”. Qualities higher than standard increase character cost. Defaults to standard.

  • title (str, optional) – Project title.

  • author (str, optional) – Author name.

  • isbn_number (str, optional) – ISBN number.

  • volume_normalization (bool, optional) – Whether to enable volume normalization. Defaults to False.

create_podcast(model_id: str, podcast_type: str, host_voice: str | Voice, guest_voice: str | Voice | None = None, source_text: str | None = None, source_url: str | None = None, quality_preset: str = 'standard', duration_scale: str = 'default', language: str | None = None, highlights: List[str] | None = None, callback_url: str | None = None) Project

Creates a new podcast project with simplified parameters.

Parameters:
  • model_id (str) – ID of the model to use.

  • podcast_type (str) – Either ‘conversation’ or ‘bulletin’.

  • host_voice (str|Voice) – Voice for the host.

  • guest_voice (str|Voice, optional) – Voice for the guest (required for ‘conversation’ mode).

  • source_text (str, optional) – Text content for the podcast. Either this or source_url must be provided.

  • source_url (str, optional) – URL to extract content from. Either this or source_text must be provided.

  • quality_preset (str, optional) – Audio quality. Options: ‘standard’, ‘high’, ‘highest’, ‘ultra’, ‘ultra_lossless’. Defaults to ‘standard’.

  • duration_scale (str, optional) – Duration of the podcast. Options: ‘short’, ‘default’, ‘long’. Defaults to ‘default’.

  • language (str, optional) – ISO 639-1 two-letter language code.

  • highlights (list[str], optional) – Brief summary points (10-70 characters each).

  • callback_url (str, optional) – URL to call when project is converted.

Returns:

The created podcast project.

Return type:

Project
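
Several of these parameters depend on each other, so checking them before the call can be useful. The function below is a sketch that encodes only the rules stated in the parameter list above; check_podcast_args is a hypothetical name. It does not flag the case where both source_text and source_url are given, since the text does not say whether that is allowed.

```python
QUALITY_PRESETS = {"standard", "high", "highest", "ultra", "ultra_lossless"}
DURATION_SCALES = {"short", "default", "long"}


def check_podcast_args(podcast_type, guest_voice=None, source_text=None,
                       source_url=None, quality_preset="standard",
                       duration_scale="default", highlights=None):
    """Return a list of problems with create_podcast arguments (empty if none)."""
    problems = []
    if podcast_type not in ("conversation", "bulletin"):
        problems.append("podcast_type must be 'conversation' or 'bulletin'")
    if podcast_type == "conversation" and guest_voice is None:
        problems.append("guest_voice is required for 'conversation' mode")
    if source_text is None and source_url is None:
        problems.append("either source_text or source_url must be provided")
    if quality_preset not in QUALITY_PRESETS:
        problems.append(f"unknown quality_preset: {quality_preset!r}")
    if duration_scale not in DURATION_SCALES:
        problems.append(f"unknown duration_scale: {duration_scale!r}")
    for highlight in highlights or []:
        if not 10 <= len(highlight) <= 70:
            problems.append(f"highlight must be 10-70 characters: {highlight!r}")
    return problems
```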

create_transcript(audio: str | bytes | BinaryIO, model_id: str = 'scribe_v1', language_code: str | None = None, tag_audio_events: bool = True, num_speakers: int | None = None, timestamps_granularity: str = 'word', diarize: bool = False) dict

Transcribes speech from an audio file.

Parameters:
  • audio – Can be one of: - str: Path to the audio file - bytes: Raw audio data - BinaryIO: File-like object containing audio data

  • model_id (str) – The ID of the model to use for transcription. Currently only ‘scribe_v1’ is available.

  • language_code (str, optional) – ISO-639-1 or ISO-639-3 language code for the audio file.

  • tag_audio_events (bool, optional) – Whether to tag audio events like (laughter), (footsteps), etc. Defaults to True.

  • num_speakers (int, optional) – Maximum number of speakers (1-32). Defaults to model’s maximum.

  • timestamps_granularity (str, optional) – Granularity of timestamps: ‘none’, ‘word’, or ‘character’. Defaults to ‘word’.

  • diarize (bool, optional) – Whether to annotate which speaker is talking. Limits audio to 8 minutes. Defaults to False.

Returns:

The transcription results.

Return type:

dict

add_pronunciation_dictionary(name: str, description: str, dict_file: str | TextIO) PronunciationDictionary

Adds a pronunciation dictionary.

Parameters:
  • name (str) – The name for the dictionary.

  • description (str) – The description.

  • dict_file (str|TextIO) – The dictionary file, either as a filepath or a TextIO object.

Returns:

A PronunciationDictionary instance.

get_pronunciation_dictionary(dictionary_id: str) PronunciationDictionary
Parameters:

dictionary_id – The pronunciation dictionary ID.

Returns:

The corresponding PronunciationDictionary

Return type:

PronunciationDictionary

get_pronunciation_dictionaries(max_number_of_items: int = 30, start_after_dict: str | PronunciationDictionary | None = None) List[PronunciationDictionary]

This function returns max_number_of_items pronunciation dictionaries, starting from the newest (or the one specified with start_after_dict) and returning older ones.

Parameters:
  • max_number_of_items (int) – The maximum number of dictionaries to get. A value of 0 or less means all of them.

  • start_after_dict (str|PronunciationDictionary) – The pronunciation dict (or its ID) from which to start returning dicts.

Returns:

A list containing the requested pronunciation dictionaries.

Return type:

list[PronunciationDictionary]

generate_sfx(prompt: str, sfx_generation_options: SFXOptions = SFXOptions(duration_seconds=None, prompt_influence=None)) tuple[Future[bytes], Future[GenerationInfo]]

Generates a sound effect from a text prompt and returns futures for the audio bytes and for information about the generation.

Tip

If you would like to save the audio to disk or otherwise, you can use helpers.save_audio_bytes().

Parameters:
  • prompt (str) – The text prompt.

  • sfx_generation_options (SFXOptions) – Options for the SFX generation, such as duration, prompt adherence.

Returns:

  • A future that will contain the bytes of the audio file once the generation is complete.

  • An optional future that will contain information about the generation.

Return type:

tuple[Future[bytes], Future[GenerationInfo]]
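
Because the audio arrives as a future, a caller that simply wants the bytes has to wait on it. The sketch below assumes the returned objects behave like concurrent.futures.Future (in particular, that they offer result(timeout=...)); generate_sfx_blocking is a hypothetical helper.

```python
def generate_sfx_blocking(user, prompt, timeout=None):
    """Generate a sound effect and block until the audio bytes are ready.

    `user` is assumed to be a User instance. Assumption: the returned
    futures follow the concurrent.futures.Future interface.
    """
    audio_future, _info_future = user.generate_sfx(prompt)
    # result() blocks until the generation finishes; with a timeout it
    # raises TimeoutError instead of waiting indefinitely.
    return audio_future.result(timeout=timeout)
```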

isolate_audio(audio: bytes | BinaryIO) tuple[Future[bytes], Future[GenerationInfo]]

Isolate the voice in the given audio.

Parameters:

audio (bytes|BinaryIO) – The audio to isolate voice from.

Returns:

  • A future that will contain the bytes of the audio file once the generation is complete.

  • An optional future that will contain the GenerationInfo object for the generation.

Return type:

tuple[Future[bytes], Future[GenerationInfo]]

isolate_audio_stream(audio: bytes | ~typing.BinaryIO, playback_options: ~elevenlabslib.helpers.PlaybackOptions = PlaybackOptions(runInBackground=False, portaudioDeviceID=None, onPlaybackStart=<function PlaybackOptions.<lambda>>, onPlaybackEnd=<function PlaybackOptions.<lambda>>, audioPostProcessor=<function PlaybackOptions.<lambda>>), disable_playback: bool = False) tuple[Queue[ndarray], Future[OutputStream] | None, Future[GenerationInfo]]

Isolate the voice in the given audio and stream the result.

Parameters:
  • audio (bytes|BinaryIO) – The audio to isolate voice from.

  • playback_options (PlaybackOptions, optional) – Options for the audio playback such as the device to use and whether to run in the background.

  • disable_playback (bool, optional) – Allows you to disable playback altogether.

Returns:

  • A queue containing the numpy audio data as float32 arrays.

  • An optional future for controlling the playback, returned if playback is not disabled.

  • A future containing a GenerationInfo with metadata.

Return type:

tuple[queue.Queue[numpy.ndarray], Optional[Future[OutputStream]], Future[GenerationInfo]]

get_real_audio_format(generationOptions: GenerationOptions) GenerationOptions
Parameters:

generationOptions (GenerationOptions) – A GenerationOptions object.

Returns:

A GenerationOptions object with a real audio format (if the original was mp3_highest or pcm_highest, it’s modified accordingly, otherwise returned directly)

get_usage_stats(start_time: datetime | int, end_time: datetime | int | None = None, include_workspace_metrics: bool = False, breakdown_type: str | None = 'voice')

Returns the usage stats for the user.

Parameters:
  • start_time (datetime.datetime|int) – The start of the usage window, in milliseconds.

  • end_time (datetime.datetime|int, Optional) – The end of the usage window, in milliseconds. Defaults to today’s date.

  • include_workspace_metrics (bool, Optional) – Whether to include workspace metrics. Defaults to False.

  • breakdown_type (str, Optional) – How to break down the results. Must be one of none, voice, user, api_keys, product_type.

Returns:

A tuple containing:
  • The data formatted as a dict with datetime objects as keys.

  • The data in its raw format.

Return type:

tuple

create_dub(name: str, target_lang: str, source_url: str = '', source_file_path: str | None = None, source_lang: str = 'auto', num_speakers: int = 0, watermark: bool = False, start_time: int | None = None, end_time: int | None = None, highest_resolution: bool = False, drop_background_audio: bool = False, use_profanity_filter: bool = False) Tuple[Dub, int]

Dubs a video or an audio file into the given language.

Parameters:
  • name (str) – Name of the dubbing project.

  • target_lang (str) – The target language to dub the content into.

  • source_url (str) – URL of the source video/audio file.

  • source_file_path (str, optional) – File path of the audio/video file to dub. If provided, it will be used instead of source_url.

  • source_lang (str, optional) – Source language. Defaults to “auto”.

  • num_speakers (int, optional) – Number of speakers to use for the dubbing. Set to 0 to automatically detect the number of speakers. Defaults to 0.

  • watermark (bool, optional) – Whether to apply a watermark to the output video. Defaults to False.

  • start_time (int, optional) – Start time of the source video/audio file.

  • end_time (int, optional) – End time of the source video/audio file.

  • highest_resolution (bool, optional) – Whether to use the highest resolution available. Defaults to False.

  • drop_background_audio (bool, optional) – An advanced setting. Whether to drop background audio from the final dub. Defaults to False.

  • use_profanity_filter (bool, optional) – [BETA] Whether transcripts should have profanities censored with the words ‘[censored]’. Defaults to False.

Returns:

A tuple containing the Dub object for the dubbing task and its expected duration in seconds (expected_duration_sec).

Return type:

Tuple[Dub, int]

get_dub_by_id(dubbing_id: str) Dub

Returns metadata about a dubbing project, including whether it’s still in progress or not.

Parameters:

dubbing_id (str) – ID of the dubbing project.

Returns:

A Dub object containing the metadata of the dubbing project.

Return type:

Dub