User¶

class User(xi_api_key: str)¶

Represents a user of the ElevenLabs API, including subscription information.

This is the class that can be used to query for the user’s available voices and create new ones.

__init__(xi_api_key: str)¶

Initializes a new instance of the User class.

Parameters:: xi_api_key (str) – The user’s API key.
Raises:: ValueError – If the API Key is invalid.

property headers: dict¶

Returns:: The headers used for API requests.
Return type:: dict

get_user_data() → dict¶

Returns:: All the information returned by the /v1/user endpoint.
Return type:: dict

get_subscription_data() → dict¶

Returns:: All the information returned by the /v1/user/subscription endpoint.
Return type:: dict

get_character_info()¶

Returns:: A tuple containing the number of characters used, the maximum allowed, and whether the maximum can be increased.
Return type:: (int, int, bool)

get_voice_clone_available() → bool¶

Returns:: True if the user can use instant voice cloning, False otherwise.
Return type:: bool

get_next_invoice() → dict | None¶

Returns:: The next invoice’s data, or None if there is no next invoice.
Return type:: dict | None

get_models() → list[Model]¶

Gets all the models available to this account.

Returns:: All the available models for this account, as Model instances.
Return type:: list[Model]

get_all_voices(show_legacy: bool = True) → list[Voice | DesignedVoice | ClonedVoice | ProfessionalVoice]¶

Gets a list of all voices registered to this account.

Caution

Some of these may be unusable due to subscription tier changes. Use get_available_voices if you only need the currently usable ones.

Returns:: A list containing all the voices.
Return type:: list[Voice|DesignedVoice|ClonedVoice|ProfessionalVoice]

get_available_voices(show_legacy: bool = True) → list[Voice | DesignedVoice | ClonedVoice | ProfessionalVoice]¶

Gets a list of voices this account can currently use for TTS.

Returns:: A list of currently usable voices.
Return type:: list[Voice|DesignedVoice|ClonedVoice|ProfessionalVoice]

get_voice_by_ID(voiceID: str) → Voice | DesignedVoice | ClonedVoice | ProfessionalVoice¶

Gets a specific voice by ID.

Parameters:: voiceID (str) – The ID of the voice to get.
Returns:: The requested voice.
Return type:: Voice|DesignedVoice|ClonedVoice|ProfessionalVoice

get_voices_by_name_v2(voiceName: str, score_threshold: int = 75) → list[Voice | EditableVoice | ClonedVoice | ProfessionalVoice]¶

Gets a list of voices with the given name.

Note

This is a list as multiple voices can have the same name.

Parameters:

voiceName (str) – The name of the voices to get.
score_threshold (int, Optional) – The minimum match score, as a percentage, that a voice needs in order to be included in the returned list. Defaults to 75.

Returns:

A list of matching voices.

Return type:

list[Voice|EditableVoice|ClonedVoice|ProfessionalVoice]
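To illustrate what a percentage threshold does to a list of candidate names, the sketch below filters names by a similarity score. How the library actually computes its score is not described here; the use of `difflib.SequenceMatcher` and the name `filter_by_score` are assumptions made purely for illustration.

```python
from difflib import SequenceMatcher


def filter_by_score(names, query, score_threshold=75):
    """Keep the names whose similarity to `query` is at least `score_threshold` percent."""
    matches = []
    for name in names:
        # ratio() is in [0, 1]; scale it to a percentage. This scoring method
        # is a stand-in, not necessarily what the library uses.
        score = SequenceMatcher(None, query.lower(), name.lower()).ratio() * 100
        if score >= score_threshold:
            matches.append(name)
    return matches


default_matches = filter_by_score(["Rachel", "Rachael", "Bob"], "Rachel")
exact_matches = filter_by_score(["Rachel", "Rachael", "Bob"], "Rachel", score_threshold=100)
print(default_matches, exact_matches)
```

Raising the threshold narrows the result toward exact matches, while lowering it admits looser spellings.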

get_history_items_paginated(maxNumberOfItems: int = 100, startAfterHistoryItem: str | HistoryItem = None) → list[HistoryItem]¶

Returns up to maxNumberOfItems history items, starting from the newest (or from the one specified with startAfterHistoryItem) and moving on to older ones.

Parameters:

maxNumberOfItems (int) – The maximum number of history items to get. A value of 0 or less means all of them.
startAfterHistoryItem (str|HistoryItem) – The history item (or its ID) from which to start returning items.

Returns:

A list containing the requested history items.

Return type:

list[HistoryItem]
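The paging semantics described above can be sketched on a plain list of IDs. This is only an illustration, not the library's implementation: `paginate` is a hypothetical helper, and it assumes the item named by the cursor is itself excluded from the result, as the "start after" wording suggests.

```python
def paginate(ids_newest_first, max_items=100, start_after=None):
    """Return up to `max_items` IDs, newest first, beginning after `start_after`."""
    start = 0
    if start_after is not None:
        # Assumption: the cursor item itself is not returned again.
        start = ids_newest_first.index(start_after) + 1
    if max_items <= 0:
        # A value of 0 or less means "all of them".
        return ids_newest_first[start:]
    return ids_newest_first[start:start + max_items]


history = ["h5", "h4", "h3", "h2", "h1"]  # newest first
first_page = paginate(history, max_items=2)
second_page = paginate(history, max_items=2, start_after=first_page[-1])
everything = paginate(history, max_items=0)
print(first_page, second_page, everything)
```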

get_history_item(historyItemID: str | GenerationInfo) → HistoryItem¶

Parameters:: historyItemID – The HistoryItem ID.
Returns:: The corresponding HistoryItem
Return type:: HistoryItem

download_history_items_v2(historyItems: list[str | HistoryItem]) → dict[HistoryItem, tuple[bytes, str]]¶

Download multiple history items and return a dictionary where the key is the HistoryItem and the value is a tuple consisting of the bytes of the audio and its filename.

Parameters:: historyItems (list[str|HistoryItem]) – List of history items (or their IDs) to download.
Returns:: Dictionary where the key is the HistoryItem and the value is a tuple of the bytes of the mp3 file and its filename.
Return type:: dict[HistoryItem, tuple[bytes, str]]
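Since each value pairs the audio bytes with a filename, writing the results to disk only needs the dictionary's values. The sketch below assumes the filenames are unique and safe to use as-is; `save_downloads` is a hypothetical helper, and the demo dictionary uses a string key in place of a real HistoryItem.

```python
import tempfile
from pathlib import Path


def save_downloads(downloads, out_dir):
    """Write every (audio_bytes, filename) pair in `downloads` into `out_dir`."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for audio_bytes, filename in downloads.values():
        target = out_path / filename
        target.write_bytes(audio_bytes)
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the dict returned by download_history_items_v2.
    demo = {"item-1": (b"fake-mp3-bytes", "item-1.mp3")}
    saved = save_downloads(demo, tmp)
    saved_names = [path.name for path in saved]
    saved_contents = [path.read_bytes() for path in saved]
print(saved_names)
```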

design_voice(gender: str, accent: str, age: str, accent_strength: float, sampleText: str = 'First we thought the PC was a calculator. Then we found out how to turn numbers into letters and we thought it was a typewriter.')¶

Calls the API endpoint that randomly generates a voice based on the given parameters.

Caution

To actually save the generated voice to your account, you must then call save_designed_voice with the temporary voiceID.

Parameters:

gender (str) – The gender.
accent (str) – The accent.
age (str) – The age.
accent_strength (float) – How strong the accent should be, between 0.3 and 2.
sampleText (str) – The text that will be used to randomly generate the new voice. Must be at least 100 characters long.

Returns:

A tuple containing the new, temporary voiceID and the bytes of the generated audio.

Return type:

(str, bytes)

save_designed_voice(temporaryVoiceID: str | tuple[str, bytes], voiceName: str, voiceDescription: str = '') → DesignedVoice¶

Saves a voice generated via design_voice to your account, with the given name.

Parameters:

temporaryVoiceID (str|tuple(str,bytes)) – The temporary voiceID of the generated voice. It also supports directly passing the tuple from design_voice.
voiceName (str) – The name you would like to give to the new voice.
voiceDescription (str) – The description you would like to give to the new voice.

Returns:

The newly created voice

Return type:

DesignedVoice

generate_voice(voice_description: str, text: str | None = None, auto_generate_text: bool = False, output_format: str = 'mp3_44100_192') → list[tuple[str, bytes, float]]¶

Calls the updated API endpoint that generates voice previews based on the given description.

Parameters:

voice_description (str) – Description of the voice to generate. Must be between 20-1000 characters.
text (str, optional) – Text to generate audio for. Must be between 100-1000 characters. Required unless auto_generate_text is True.
auto_generate_text (bool, optional) – Whether to automatically generate suitable text. Defaults to False.
output_format (str, optional) – Output format for the generated audio. Defaults to “mp3_44100_192”.

Returns:

A list of tuples, each containing:

generated_voice_id (str): Temporary ID for the generated voice
audio_bytes (bytes): Audio data for the preview
duration_secs (float): Duration of the audio in seconds

Return type:

list[tuple[str, bytes, float]]

Raises:

ValueError – If voice_description is not between 20-1000 characters or if text is not between 100-1000 characters when required.
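Because several previews come back at once, a caller usually has to pick one before saving it. The sketch below selects the preview with the longest audio; the selection rule and the name `pick_longest_preview` are illustrative choices, and the sample data stands in for a real generate_voice result.

```python
def pick_longest_preview(previews):
    """Return the (generated_voice_id, audio_bytes, duration_secs) tuple with the longest audio."""
    if not previews:
        raise ValueError("no previews were returned")
    # Index 2 of each tuple is duration_secs.
    return max(previews, key=lambda preview: preview[2])


# Stand-in for the list returned by generate_voice.
previews = [
    ("id-a", b"audio-a", 3.2),
    ("id-b", b"audio-b", 5.8),
    ("id-c", b"audio-c", 4.1),
]
generated_voice_id, audio_bytes, duration_secs = pick_longest_preview(previews)
print(generated_voice_id, duration_secs)
```

The chosen generated_voice_id is the value that save_generated_voice expects.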

save_generated_voice(generated_voice_id: str, voice_name: str, voice_description: str, labels: dict[str, str] | None = None) → EditableVoice¶

Saves a voice generated via generate_voice to your account.

Parameters:

generated_voice_id (str) – The temporary voice ID returned by generate_voice.
voice_name (str) – The name you would like to give to the new voice.
voice_description (str) – The description for the voice. Must be between 20-1000 characters.
labels (dict[str, str], optional) – Metadata to add to the created voice. Defaults to None.

Returns:

The newly created voice

Return type:

EditableVoice

Raises:

ValueError – If voice_description is not between 20-1000 characters.

clone_voice(name: str, samples: list[str] | dict[str, bytes], description: str = '', remove_background_noise: bool = False, labels: dict[str, str] | None = None)¶

Create a new ClonedVoice object from the given samples.

Parameters:

name (str) – Name of the voice to be created.
samples (list[str]|dict[str, bytes]) – List of file paths OR dictionary of sample file names and bytes for the voice samples.
description (str, Optional) – The description of the voice.
remove_background_noise (bool, optional) – Whether to automatically remove background noise. Defaults to False. Can worsen quality if the samples contain no background noise.
labels (dict[str, str], optional) – The labels to add to the voice.

Returns:

The new voice.

Return type:

ClonedVoice
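The dictionary form of samples maps file names to raw bytes, which is convenient when the audio is already in memory. A minimal sketch for building such a dictionary from paths follows; `load_samples` is a hypothetical helper, and it assumes no two paths share the same file name (later entries would overwrite earlier ones).

```python
import tempfile
from pathlib import Path


def load_samples(paths):
    """Build a {file name: file bytes} dict suitable for the `samples` parameter."""
    return {Path(path).name: Path(path).read_bytes() for path in paths}


with tempfile.TemporaryDirectory() as tmp:
    # Create a fake sample file so the sketch runs without real audio.
    sample_path = Path(tmp) / "sample1.mp3"
    sample_path.write_bytes(b"fake-audio")
    samples = load_samples([sample_path])
print(sorted(samples))
```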

search_voice_library(search_term: str | None = None, use_cases: list[str] | None = None, descriptives: list[str] | None = None, sort: ~elevenlabslib.helpers.LibSort | None = LibSort.TRENDING, advanced_filters: ~elevenlabslib.helpers.LibVoiceInfo = <elevenlabslib.helpers.LibVoiceInfo object>, starting_page=0, query_page_size=30) → List[LibraryVoiceData]¶

Allows you to search the voice library with various filters. For parameters which are lists, all voices that match at least one of them will be returned.

Parameters:

search_term (str, Optional) – The search term to use, equivalent to typing it into the site.
use_cases (list, Optional) – A list of use cases.
descriptives (list, Optional) – A list of descriptives (Soft, Calm, etc).
sort (LibSort, Optional) – How to sort the voices.
advanced_filters (LibVoiceInfo, Optional) – Allows you to filter voices based on their characteristics (language, accent, etc).
query_page_size (int, Optional) – How many voices to return. Defaults to 30.
starting_page (int, Optional) – The page to start at.

add_shared_voice(voice: LibraryVoiceData, newName: str) → Voice¶

Adds a voice from the library to your account.

Parameters:

voice (LibraryVoiceData) – A LibraryVoiceData object, from the voice library endpoint.
newName (str) – Name to give to the voice.

Returns:

The newly created voice.

Return type:

Voice

add_shared_voice_from_URL(shareURL: str, newName: str) → Voice¶

Adds a voice from a share link to the account.

Parameters:

shareURL (str) – The sharing URL for the voice.
newName (str) – Name to give to the voice.

Returns:

The newly created voice.

Return type:

Voice

add_shared_voice_from_info(publicUserID: str, voiceID: str, newName: str) → Voice¶

Adds a voice directly from the voiceID and the public userID.

Parameters:

publicUserID (str) – The public userID of the voice’s creator.
voiceID (str) – The voiceID of the voice.
newName (str) – Name to give to the voice.

Returns:

The newly created voice.

Return type:

Voice

add_project(name: str, default_title_voice: [str, Voice], default_paragraph_voice: [str, Voice], default_model: [str | Model], pronunciation_dictionaries=None, from_url: str | None = None, from_document: str | None = None, quality_preset: str = 'standard', title: str | None = None, author: str | None = None, isbn_number: str | None = None, volume_normalization: bool = False) → Project¶

Creates a new project.

Parameters:

name (str) – Name of the project.
default_title_voice (str|Voice) – Default voice for titles.
default_paragraph_voice (str|Voice) – Default voice for paragraphs.
default_model (str|Model) – Model for the project.
pronunciation_dictionaries (list[PronunciationDictionary]) – Pronunciation dictionary locators.
from_url (str, optional) – Optional URL to initialize project content.
from_document (str, optional) – The filepath to a file from which to initialize the project.
quality_preset (str, optional) – Quality preset for audio. Must be “standard”, “high” or “ultra”. Qualities higher than standard increase character cost. Defaults to standard.
title (str, optional) – Project title.
author (str, optional) – Author name.
isbn_number (str, optional) – ISBN number.
volume_normalization (bool, optional) – Whether to enable volume normalization. Defaults to False.

Creates a new podcast project with simplified parameters.

Parameters:

model_id (str) – ID of the model to use.
podcast_type (str) – Either ‘conversation’ or ‘bulletin’.
host_voice (str|Voice) – Voice for the host.
guest_voice (str|Voice, optional) – Voice for the guest (required for ‘conversation’ mode).
source_text (str, optional) – Text content for the podcast. Either this or source_url must be provided.
source_url (str, optional) – URL to extract content from. Either this or source_text must be provided.
quality_preset (str, optional) – Audio quality. Options: ‘standard’, ‘high’, ‘highest’, ‘ultra’, ‘ultra_lossless’. Defaults to ‘standard’.
duration_scale (str, optional) – Duration of the podcast. Options: ‘short’, ‘default’, ‘long’. Defaults to ‘default’.
language (str, optional) – ISO 639-1 two-letter language code.
highlights (list[str], optional) – Brief summary points (10-70 characters each).
callback_url (str, optional) – URL to call when project is converted.

Returns:

The created podcast project.

Return type:

Project
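The constraints listed for the podcast parameters can be checked before making a request. The sketch below only restates those documented rules; `podcast_input_problems` is a hypothetical helper, and the library may perform its own, different validation.

```python
def podcast_input_problems(podcast_type, guest_voice=None, source_text=None,
                           source_url=None, highlights=()):
    """Return a list of human-readable problems; an empty list means the inputs look valid."""
    problems = []
    if podcast_type not in ("conversation", "bulletin"):
        problems.append("podcast_type must be 'conversation' or 'bulletin'")
    if podcast_type == "conversation" and guest_voice is None:
        problems.append("guest_voice is required for 'conversation' mode")
    if source_text is None and source_url is None:
        problems.append("either source_text or source_url must be provided")
    for highlight in highlights:
        if not 10 <= len(highlight) <= 70:
            problems.append(f"highlight must be 10-70 characters: {highlight!r}")
    return problems


ok_inputs = podcast_input_problems("bulletin", source_text="Some article text.")
bad_inputs = podcast_input_problems("conversation", highlights=["too short"])
print(ok_inputs, bad_inputs)
```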

create_transcript(audio: str | bytes | BinaryIO, model_id: str = 'scribe_v1', language_code: str | None = None, tag_audio_events: bool = True, num_speakers: int | None = None, timestamps_granularity: str = 'word', diarize: bool = False) → dict¶

Transcribes speech from an audio file.

Parameters:

audio (str|bytes|BinaryIO) – The audio to transcribe: a path to the audio file (str), raw audio data (bytes), or a file-like object containing audio data (BinaryIO).
model_id (str) – The ID of the model to use for transcription. Currently only ‘scribe_v1’ is available.
language_code (str, optional) – ISO-639-1 or ISO-639-3 language code for the audio file.
tag_audio_events (bool, optional) – Whether to tag audio events like (laughter), (footsteps), etc. Defaults to True.
num_speakers (int, optional) – Maximum number of speakers (1-32). Defaults to model’s maximum.
timestamps_granularity (str, optional) – Granularity of timestamps: ‘none’, ‘word’, or ‘character’. Defaults to ‘word’.
diarize (bool, optional) – Whether to annotate which speaker is talking. Limits audio to 8 minutes. Defaults to False.

Returns:

The transcription results.

Return type:

dict
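A few of the parameters above have a closed set of valid values, so they are easy to check up front. The following is a minimal sketch that restates the documented limits; `transcript_arg_problems` is a hypothetical helper and not part of the library.

```python
ALLOWED_GRANULARITIES = ("none", "word", "character")


def transcript_arg_problems(num_speakers=None, timestamps_granularity="word"):
    """Return a list of problems with the given arguments; empty means they look valid."""
    problems = []
    # num_speakers may be left as None to use the model's maximum.
    if num_speakers is not None and not 1 <= num_speakers <= 32:
        problems.append("num_speakers must be between 1 and 32")
    if timestamps_granularity not in ALLOWED_GRANULARITIES:
        problems.append("timestamps_granularity must be 'none', 'word', or 'character'")
    return problems


valid_args = transcript_arg_problems(num_speakers=2, timestamps_granularity="character")
invalid_args = transcript_arg_problems(num_speakers=40, timestamps_granularity="sentence")
print(valid_args, invalid_args)
```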

add_pronunciation_dictionary(name: str, description: str, dict_file: str | TextIO) → PronunciationDictionary¶

Adds a pronunciation dictionary.

Parameters:

name (str) – The name for the dictionary.
description (str) – The description.
dict_file (str|TextIO) – The dictionary file, either as a filepath or a TextIO object.

Returns:: A PronunciationDictionary instance.

get_pronunciation_dictionary(dictionary_id: str) → PronunciationDictionary¶

Parameters:: dictionary_id – The pronunciation dictionary ID.
Returns:: The corresponding PronunciationDictionary
Return type:: PronunciationDictionary

get_pronunciation_dictionaries(max_number_of_items: int = 30, start_after_dict: str | PronunciationDictionary | None = None) → List[PronunciationDictionary]¶

This function returns max_number_of_items pronunciation dictionaries, starting from the newest (or the one specified with start_after_dict) and returning older ones.

Parameters:

max_number_of_items (int) – The maximum number of dictionaries to get. A value of 0 or less means all of them.
start_after_dict (str|PronunciationDictionary) – The pronunciation dict (or its ID) from which to start returning dicts.

Returns:

A list containing the requested pronunciation dictionaries.

Return type:

list[PronunciationDictionary]

generate_sfx(prompt: str, sfx_generation_options: SFXOptions = SFXOptions(duration_seconds=None, prompt_influence=None)) → tuple[Future[bytes], Future[GenerationInfo]]¶

Generates a sound effect from a text prompt and returns the audio data as bytes.

Tip

If you would like to save the audio to disk or otherwise, you can use helpers.save_audio_bytes().

Parameters:

prompt (str) – The text prompt.
sfx_generation_options (SFXOptions) – Options for the SFX generation, such as duration and prompt influence.

Returns:

A future that will contain the bytes of the audio file once the generation is complete.
An optional future that will contain information about the generation.

Return type:

tuple[Future[bytes], Future[GenerationInfo]]
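Since the audio arrives as a future, the caller has to wait on it before using the bytes. The sketch below assumes the futures behave like `concurrent.futures.Future` objects (the exact Future type is not stated here); `wait_for_sfx` and `_StubUser` are hypothetical and exist only so the example runs offline.

```python
from concurrent.futures import Future


def wait_for_sfx(user, prompt, timeout=None):
    """Request a sound effect and block until its audio bytes are available."""
    audio_future, info_future = user.generate_sfx(prompt)
    # result() blocks until the generation finishes or the timeout expires.
    audio = audio_future.result(timeout=timeout)
    return audio, info_future


class _StubUser:
    """Hypothetical stand-in whose futures are already resolved."""

    def generate_sfx(self, prompt):
        audio_future, info_future = Future(), Future()
        audio_future.set_result(b"fake-audio")
        info_future.set_result({"prompt": prompt})
        return audio_future, info_future


sfx_audio, sfx_info_future = wait_for_sfx(_StubUser(), "door creaking", timeout=5)
print(len(sfx_audio))
```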

isolate_audio(audio: bytes | BinaryIO) → tuple[Future[bytes], Future[GenerationInfo]]¶

Isolate the voice in the given audio.

Parameters:

audio (bytes|BinaryIO) – The audio to isolate voice from.

Returns:

A future that will contain the bytes of the audio file once the generation is complete.
An optional future that will contain the GenerationInfo object for the generation.

Return type:

tuple[Future[bytes], Future[GenerationInfo]]

isolate_audio_stream(audio: bytes | ~typing.BinaryIO, playback_options: ~elevenlabslib.helpers.PlaybackOptions = PlaybackOptions(runInBackground=False, portaudioDeviceID=None, onPlaybackStart=<function PlaybackOptions.<lambda>>, onPlaybackEnd=<function PlaybackOptions.<lambda>>, audioPostProcessor=<function PlaybackOptions.<lambda>>), disable_playback: bool = False) → tuple[Queue[ndarray], Future[OutputStream] | None, Future[GenerationInfo]]¶

Isolate the voice in the given audio and stream the result.

Parameters:

audio (bytes|BinaryIO) – The audio to isolate voice from.
playback_options (PlaybackOptions, optional) – Options for the audio playback such as the device to use and whether to run in the background.
disable_playback (bool, optional) – Allows you to disable playback altogether.

Returns:

A queue containing the numpy audio data as float32 arrays.
An optional future for controlling the playback, returned if playback is not disabled.
A future containing a GenerationInfo with metadata.

Return type:

tuple[queue.Queue[numpy.ndarray], Optional[Future[OutputStream]], Future[GenerationInfo]]

get_real_audio_format(generationOptions: GenerationOptions) → GenerationOptions¶

Parameters:: generationOptions (GenerationOptions) – A GenerationOptions object.
Returns:: A GenerationOptions object with a real audio format (if the original was mp3_highest or pcm_highest, it’s modified accordingly, otherwise returned directly)

get_usage_stats(start_time: datetime | int, end_time: datetime | int | None = None, include_workspace_metrics: bool = False, breakdown_type: str | None = 'voice')¶

Returns the usage stats for the user.

Parameters:

start_time (datetime.datetime|int) – The start of the usage window in MILLIseconds.
end_time (datetime.datetime|int, Optional) – The end of the usage window in MILLIseconds. Defaults to today’s date.
include_workspace_metrics (bool, Optional) – Whether to include workspace metrics. Defaults to false.
breakdown_type (str, Optional) – How to break down the results. Must be one of none, voice, user, api_keys, product_type.

Returns:

A tuple containing:

The data formatted as a dict with datetime objects as keys
The data in its raw format

Return type:

tuple
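When passing integers rather than datetime objects, the values are in milliseconds, which is easy to get wrong by a factor of 1000. The sketch below converts a datetime to an integer, assuming the integer form is a Unix timestamp in milliseconds; `to_millis` is a hypothetical helper, and treating naive datetimes as UTC is a choice made for this example.

```python
from datetime import datetime, timezone


def to_millis(moment: datetime) -> int:
    """Convert a datetime to a Unix timestamp in milliseconds."""
    if moment.tzinfo is None:
        # Assumption for this sketch: naive datetimes are interpreted as UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)


start_ms = to_millis(datetime(2024, 1, 1, tzinfo=timezone.utc))
print(start_ms)  # 1704067200000
```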

create_dub(name: str, target_lang: str, source_url: str = '', source_file_path: str | None = None, source_lang: str = 'auto', num_speakers: int = 0, watermark: bool = False, start_time: int | None = None, end_time: int | None = None, highest_resolution: bool = False, drop_background_audio: bool = False, use_profanity_filter: bool = False) → Tuple[Dub, int]¶

Dubs a video or an audio file into the given language.

Parameters:

name (str) – Name of the dubbing project.
target_lang (str) – The target language to dub the content into.
source_url (str) – URL of the source video/audio file.
source_file_path (str, optional) – File path of the audio/video file to dub. If provided, it will be used instead of source_url.
source_lang (str, optional) – Source language. Defaults to “auto”.
num_speakers (int, optional) – Number of speakers to use for the dubbing. Set to 0 to automatically detect the number of speakers. Defaults to 0.
watermark (bool, optional) – Whether to apply a watermark to the output video. Defaults to False.
start_time (int, optional) – Start time of the source video/audio file.
end_time (int, optional) – End time of the source video/audio file.
highest_resolution (bool, optional) – Whether to use the highest resolution available. Defaults to False.
drop_background_audio (bool, optional) – An advanced setting. Whether to drop background audio from the final dub. Defaults to False.
use_profanity_filter (bool, optional) – [BETA] Whether transcripts should have profanities censored with the words ‘[censored]’. Defaults to False.

Returns:

A tuple containing the Dub object for the dubbing task and its expected duration in seconds.

Return type:

tuple[Dub, int]

get_dub_by_id(dubbing_id: str) → Dub¶

Returns metadata about a dubbing project, including whether it’s still in progress or not.

Parameters:: dubbing_id (str) – ID of the dubbing project.
Returns:: A Dub object containing the metadata of the dubbing project.
Return type:: Dub