User¶
- class User(xi_api_key: str)¶
Represents a user of the ElevenLabs API, including subscription information.
This is the class that can be used to query for the user’s available voices and create new ones.
- __init__(xi_api_key: str)¶
Initializes a new instance of the User class.
- Parameters:
xi_api_key (str) – The user’s API key.
- Raises:
ValueError – If the API Key is invalid.
- property headers: dict¶
- Returns:
The headers used for API requests.
- Return type:
dict
- get_user_data() dict¶
- Returns:
All the information returned by the /v1/user endpoint.
- Return type:
dict
- get_subscription_data() dict¶
- Returns:
All the information returned by the /v1/user/subscription endpoint.
- Return type:
dict
- get_character_info()¶
- Returns:
A tuple containing the number of characters used, the maximum number of characters, and whether the maximum can be increased.
- Return type:
(int, int, bool)
- get_voice_clone_available() bool¶
- Returns:
True if the user can use instant voice cloning, False otherwise.
- Return type:
bool
- get_next_invoice() dict | None¶
- Returns:
The next invoice’s data, or None if there is no next invoice.
- Return type:
dict | None
- get_models() list[Model]¶
Lists the models available to this account.
- Returns:
All the available models for this account, as Model instances.
- Return type:
list[Model]
- get_all_voices(show_legacy: bool = True) list[Voice | DesignedVoice | ClonedVoice | ProfessionalVoice]¶
Gets a list of all voices registered to this account.
Caution
Some of these may be unusable due to subscription tier changes. Use get_available_voices if you only need the currently usable ones.
- Returns:
A list containing all the voices.
- Return type:
list[Voice]
- get_available_voices(show_legacy: bool = True) list[Voice | DesignedVoice | ClonedVoice | ProfessionalVoice]¶
Gets a list of voices this account can currently use for TTS.
- Returns:
A list of currently usable voices.
- Return type:
list[Voice]
- get_voice_by_ID(voiceID: str) Voice | DesignedVoice | ClonedVoice | ProfessionalVoice¶
Gets a specific voice by ID.
- Parameters:
voiceID (str) – The ID of the voice to get.
- Returns:
The requested voice.
- Return type:
Voice | DesignedVoice | ClonedVoice | ProfessionalVoice
- get_voices_by_name_v2(voiceName: str, score_threshold: int = 75) list[Voice | EditableVoice | ClonedVoice | ProfessionalVoice]¶
Gets a list of voices with the given name.
Note
This is a list as multiple voices can have the same name.
- Parameters:
voiceName (str) – The name of the voices to get.
score_threshold (int, Optional) – The minimum match score, as a percentage, that a voice needs in order to be included in the returned list. Defaults to 75.
- Returns:
A list of matching voices.
- Return type:
list[Voice|DesignedVoice|ClonedVoice]
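The matching algorithm used by get_voices_by_name_v2 is not described here. To illustrate what a percentage score threshold means in general, the sketch below uses the standard library's difflib as a stand-in scorer. filter_names_by_score is a hypothetical helper and not part of the library, and its scores may differ from the ones the library computes.

```python
from difflib import SequenceMatcher


def filter_names_by_score(query: str, names: list[str], score_threshold: int = 75) -> list[str]:
    """Keep the names whose similarity to `query` is at least score_threshold (0-100)."""
    matches = []
    for name in names:
        # ratio() is in [0, 1]; scale it to a percentage before comparing.
        score = SequenceMatcher(None, query.lower(), name.lower()).ratio() * 100
        if score >= score_threshold:
            matches.append(name)
    return matches


print(filter_names_by_score("Rachel", ["Rachel", "Rachael", "Bob"]))
```

Raising the threshold narrows the result to closer matches; a threshold of 100 keeps only names identical to the query once case is ignored.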
- get_history_items_paginated(maxNumberOfItems: int = 100, startAfterHistoryItem: str | HistoryItem = None) list[HistoryItem]¶
This function returns up to maxNumberOfItems history items, starting from the newest (or the one specified with startAfterHistoryItem) and returning older ones.
- Parameters:
maxNumberOfItems (int) – The maximum number of history items to get. A value of 0 or less means all of them.
startAfterHistoryItem (str|HistoryItem) – The history item (or its ID) from which to start returning items.
- Returns:
A list containing the requested history items.
- Return type:
list[HistoryItem]
- get_history_item(historyItemID: str | GenerationInfo) HistoryItem¶
- Parameters:
historyItemID – The HistoryItem ID.
- Returns:
The corresponding HistoryItem.
- Return type:
HistoryItem
- download_history_items_v2(historyItems: list[str | HistoryItem]) dict[HistoryItem, tuple[bytes, str]]¶
Download multiple history items and return a dictionary where the key is the HistoryItem and the value is a tuple consisting of the bytes of the audio and its filename.
- Parameters:
historyItems (list[str|HistoryItem]) – List of history items (or their IDs) to download.
- Returns:
Dictionary where the key is the HistoryItem and the value is a tuple of the bytes of the mp3 file and its filename.
- Return type:
dict[HistoryItem, tuple[bytes, str]]
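Since each value pairs the audio bytes with a filename, the result can be written straight to disk. The sketch below is one possible way to do that; save_downloads is a hypothetical helper, and it accepts any dictionary shaped like the one described above.

```python
from pathlib import Path


def save_downloads(downloads: dict, target_dir: str) -> list[Path]:
    """Write each (audio_bytes, filename) value into target_dir and return the paths."""
    out_dir = Path(target_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for _item, (audio_bytes, filename) in downloads.items():
        # Keep only the final path component so a filename cannot escape target_dir.
        path = out_dir / Path(filename).name
        path.write_bytes(audio_bytes)
        written.append(path)
    return written
```

With the real library, the first argument would be the return value of download_history_items_v2.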
- design_voice(gender: str, accent: str, age: str, accent_strength: float, sampleText: str = 'First we thought the PC was a calculator. Then we found out how to turn numbers into letters and we thought it was a typewriter.')¶
Calls the API endpoint that randomly generates a voice based on the given parameters.
Caution
To actually save the generated voice to your account, you must then call save_designed_voice with the temporary voiceID.
- Parameters:
gender (str) – The gender.
accent (str) – The accent.
age (str) – The age.
accent_strength (float) – How strong the accent should be, between 0.3 and 2.
sampleText (str) – The text that will be used to randomly generate the new voice. Must be at least 100 characters long.
- Returns:
A tuple containing the new, temporary voiceID and the bytes of the generated audio.
- Return type:
(str, bytes)
- save_designed_voice(temporaryVoiceID: str | tuple[str, bytes], voiceName: str, voiceDescription: str = '') DesignedVoice¶
Saves a voice generated via design_voice to your account, with the given name.
- Parameters:
temporaryVoiceID (str|tuple(str,bytes)) – The temporary voiceID of the generated voice. It also supports directly passing the tuple from design_voice.
voiceName (str) – The name you would like to give to the new voice.
voiceDescription (str) – The description you would like to give to the new voice.
- Returns:
The newly created voice.
- Return type:
DesignedVoice
- generate_voice(voice_description: str, text: str | None = None, auto_generate_text: bool = False, output_format: str = 'mp3_44100_192') list[tuple[str, bytes, float]]¶
Calls the updated API endpoint that generates voice previews based on the given description.
- Parameters:
voice_description (str) – Description of the voice to generate. Must be between 20-1000 characters.
text (str, optional) – Text to generate audio for. Must be between 100-1000 characters. Required unless auto_generate_text is True.
auto_generate_text (bool, optional) – Whether to automatically generate suitable text. Defaults to False.
output_format (str, optional) – Output format for the generated audio. Defaults to “mp3_44100_192”.
- Returns:
- A list of tuples, each containing:
generated_voice_id (str): Temporary ID for the generated voice
audio_bytes (bytes): Audio data for the preview
duration_secs (float): Duration of the audio in seconds
- Return type:
list[tuple[str, bytes, float]]
- Raises:
ValueError – If voice_description is not between 20-1000 characters or if text is not between 100-1000 characters when required.
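The length limits above can also be checked before making the call, for example to validate user input early. The sketch below mirrors the documented limits; check_generate_voice_args is a hypothetical helper, not part of the library, and the library's own validation may differ in detail.

```python
def check_generate_voice_args(voice_description: str, text=None, auto_generate_text: bool = False) -> None:
    """Raise ValueError if the arguments fall outside the documented limits."""
    if not 20 <= len(voice_description) <= 1000:
        raise ValueError("voice_description must be between 20 and 1000 characters.")
    if auto_generate_text:
        return  # Text is generated automatically, so none is required.
    if text is None or not 100 <= len(text) <= 1000:
        raise ValueError("text must be between 100 and 1000 characters unless auto_generate_text is True.")


# Passes silently: a 44-character description with automatic text generation.
check_generate_voice_args("A calm, elderly narrator with a warm timbre.", auto_generate_text=True)
```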
- save_generated_voice(generated_voice_id: str, voice_name: str, voice_description: str, labels: dict[str, str] | None = None) EditableVoice¶
Saves a voice generated via generate_voice to your account.
- Parameters:
generated_voice_id (str) – The temporary voice ID returned by generate_voice.
voice_name (str) – The name you would like to give to the new voice.
voice_description (str) – The description for the voice. Must be between 20-1000 characters.
labels (dict[str, str], optional) – Metadata to add to the created voice. Defaults to None.
- Returns:
The newly created voice.
- Return type:
EditableVoice
- Raises:
ValueError – If voice_description is not between 20-1000 characters.
- clone_voice(name: str, samples: list[str] | dict[str, bytes], description: str = '', remove_background_noise: bool = False, labels: dict[str, str] | None = None)¶
Create a new ClonedVoice object from the given samples.
- Parameters:
name (str) – Name of the voice to be created.
samples (list[str]|dict[str, bytes]) – List of file paths OR dictionary of sample file names and bytes for the voice samples.
description (str, Optional) – The description of the voice.
remove_background_noise (bool, optional) – Whether to automatically remove background noise. Can worsen quality if no noise is present. Defaults to False.
labels (dict[str, str], optional) – The labels to add to the voice.
- Returns:
The new voice.
- Return type:
ClonedVoice
- search_voice_library(search_term: str | None = None, use_cases: list[str] | None = None, descriptives: list[str] | None = None, sort: ~elevenlabslib.helpers.LibSort | None = LibSort.TRENDING, advanced_filters: ~elevenlabslib.helpers.LibVoiceInfo = <elevenlabslib.helpers.LibVoiceInfo object>, starting_page=0, query_page_size=30) List[LibraryVoiceData]¶
Allows you to search the voice library with various filters. For parameters which are lists, all voices that match at least one of them will be returned.
- Parameters:
search_term (str, Optional) – The search term to use, equivalent to typing it into the site.
use_cases (list, Optional) – A list of use cases.
descriptives (list, Optional) – A list of descriptives (Soft, Calm, etc).
sort (LibSort, Optional) – How to sort the voices.
advanced_filters (LibVoiceInfo, Optional) – Allows you to filter voices based on their characteristics (language, accent, etc).
query_page_size (int, Optional) – How many voices to return. Defaults to 30.
starting_page (int, Optional) – The page to start at.
Adds a voice from the library to your account.
- Parameters:
voice (LibraryVoiceData) – A LibraryVoiceData object, from the voice library endpoint.
newName (str) – Name to give to the voice.
- Returns:
The newly created voice.
- Return type:
Adds a voice from a share link to the account.
- Parameters:
shareURL (str) – The sharing URL for the voice.
newName (str) – Name to give to the voice.
- Returns:
The newly created voice.
- Return type:
Adds a voice directly from the voiceID and the public userID.
- Parameters:
publicUserID (str) – The public userID of the voice’s creator.
voiceID (str) – The voiceID of the voice.
newName (str) – Name to give to the voice.
- Returns:
The newly created voice.
- Return type:
- add_project(name: str, default_title_voice: [str, Voice], default_paragraph_voice: [str, Voice], default_model: [str | Model], pronunciation_dictionaries=None, from_url: str | None = None, from_document: str | None = None, quality_preset: str = 'standard', title: str | None = None, author: str | None = None, isbn_number: str | None = None, volume_normalization: bool = False) Project¶
Creates a new project.
- Parameters:
name (str) – Name of the project.
default_title_voice (str|Voice) – Default voice for titles.
default_paragraph_voice (str|Voice) – Default voice for paragraphs.
default_model (str|Model) – Model for the project.
pronunciation_dictionaries (list[PronunciationDictionary]) – Pronunciation dictionary locators.
from_url (str, optional) – Optional URL to initialize project content.
from_document (str, optional) – The filepath to a file from which to initialize the project.
quality_preset (str, optional) – Quality preset for audio. Must be “standard”, “high” or “ultra”. Qualities higher than standard increase character cost. Defaults to standard.
title (str, optional) – Project title.
author (str, optional) – Author name.
isbn_number (str, optional) – ISBN number.
volume_normalization (bool, optional) – Whether to enable volume normalization. Defaults to False.
- create_podcast(model_id: str, podcast_type: str, host_voice: str | Voice, guest_voice: str | Voice | None = None, source_text: str | None = None, source_url: str | None = None, quality_preset: str = 'standard', duration_scale: str = 'default', language: str | None = None, highlights: List[str] | None = None, callback_url: str | None = None) Project¶
Creates a new podcast project with simplified parameters.
- Parameters:
model_id (str) – ID of the model to use.
podcast_type (str) – Either ‘conversation’ or ‘bulletin’.
host_voice (str|Voice) – Voice for the host.
guest_voice (str|Voice, optional) – Voice for the guest (required for ‘conversation’ mode).
source_text (str, optional) – Text content for the podcast. Either this or source_url must be provided.
source_url (str, optional) – URL to extract content from. Either this or source_text must be provided.
quality_preset (str, optional) – Audio quality. Options: ‘standard’, ‘high’, ‘highest’, ‘ultra’, ‘ultra_lossless’. Defaults to ‘standard’.
duration_scale (str, optional) – Duration of the podcast. Options: ‘short’, ‘default’, ‘long’. Defaults to ‘default’.
language (str, optional) – ISO 639-1 two-letter language code.
highlights (list[str], optional) – Brief summary points (10-70 characters each).
callback_url (str, optional) – URL to call when project is converted.
- Returns:
The created podcast project.
- Return type:
Project
- create_transcript(audio: str | bytes | BinaryIO, model_id: str = 'scribe_v1', language_code: str | None = None, tag_audio_events: bool = True, num_speakers: int | None = None, timestamps_granularity: str = 'word', diarize: bool = False) dict¶
Transcribes speech from an audio file.
- Parameters:
audio (str|bytes|BinaryIO) – The audio to transcribe: a path to the audio file (str), raw audio data (bytes), or a file-like object containing audio data (BinaryIO).
model_id (str) – The ID of the model to use for transcription. Currently only ‘scribe_v1’ is available.
language_code (str, optional) – ISO-639-1 or ISO-639-3 language code for the audio file.
tag_audio_events (bool, optional) – Whether to tag audio events like (laughter), (footsteps), etc. Defaults to True.
num_speakers (int, optional) – Maximum number of speakers (1-32). Defaults to model’s maximum.
timestamps_granularity (str, optional) – Granularity of timestamps: ‘none’, ‘word’, or ‘character’. Defaults to ‘word’.
diarize (bool, optional) – Whether to annotate which speaker is talking. Limits audio to 8 minutes. Defaults to False.
- Returns:
The transcription results.
- Return type:
dict
- add_pronunciation_dictionary(name: str, description: str, dict_file: str | TextIO) PronunciationDictionary¶
Adds a pronunciation dictionary.
- Parameters:
name (str) – The name for the dictionary.
description (str) – The description.
dict_file (str|TextIO) – The dictionary file, either as a filepath or a TextIO object.
- Returns:
A PronunciationDictionary instance.
- get_pronunciation_dictionary(dictionary_id: str) PronunciationDictionary¶
- Parameters:
dictionary_id – The pronunciation dictionary ID.
- Returns:
The corresponding PronunciationDictionary.
- Return type:
PronunciationDictionary
- get_pronunciation_dictionaries(max_number_of_items: int = 30, start_after_dict: str | PronunciationDictionary | None = None) List[PronunciationDictionary]¶
This function returns up to max_number_of_items pronunciation dictionaries, starting from the newest (or the one specified with start_after_dict) and returning older ones.
- Parameters:
max_number_of_items (int) – The maximum number of dictionaries to get. A value of 0 or less means all of them.
start_after_dict (str|PronunciationDictionary) – The pronunciation dict (or its ID) from which to start returning dicts.
- Returns:
A list containing the requested pronunciation dictionaries.
- Return type:
list[PronunciationDictionary]
- generate_sfx(prompt: str, sfx_generation_options: SFXOptions = SFXOptions(duration_seconds=None, prompt_influence=None)) tuple[Future[bytes], Future[GenerationInfo]]¶
Generates a sound effect from a text prompt and returns the audio data as bytes.
Tip
If you would like to save the audio to disk or otherwise, you can use helpers.save_audio_bytes().
- Parameters:
prompt (str) – The text prompt.
sfx_generation_options (SFXOptions) – Options for the SFX generation, such as duration and prompt influence.
- Returns:
A future that will contain the bytes of the audio file once the generation is complete.
An optional future that will contain information about the generation.
- Return type:
tuple[Future[bytes], Optional[GenerationInfo]]
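Because the audio arrives as a future, the caller has to wait on it before the bytes are available. The sketch below assumes the returned objects behave like concurrent.futures.Future; wait_for_audio is a hypothetical helper, and the futures here are resolved by hand so the example runs without calling the API.

```python
from concurrent.futures import Future


def wait_for_audio(result: tuple, timeout: float = 60.0) -> bytes:
    """Block until the audio future in a (audio_future, info_future) tuple resolves."""
    audio_future, _info_future = result
    # Future.result() blocks until the value is ready or the timeout expires.
    return audio_future.result(timeout=timeout)


# Stand-in for the tuple returned by generate_sfx, resolved manually.
audio_future: Future = Future()
info_future: Future = Future()
audio_future.set_result(b"fake-audio")

print(len(wait_for_audio((audio_future, info_future))))
```

If the future does not resolve in time, Future.result() raises a TimeoutError, so a caller may want to catch that around long generations.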
- isolate_audio(audio: bytes | BinaryIO) tuple[Future[bytes], Future[GenerationInfo]]¶
Isolate the voice in the given audio.
- Parameters:
audio (bytes|BinaryIO) – The audio to isolate voice from.
- Returns:
A future that will contain the bytes of the audio file once the generation is complete.
An optional future that will contain the GenerationInfo object for the generation.
- Return type:
tuple[Future[bytes], Optional[GenerationInfo]]
- isolate_audio_stream(audio: bytes | ~typing.BinaryIO, playback_options: ~elevenlabslib.helpers.PlaybackOptions = PlaybackOptions(runInBackground=False, portaudioDeviceID=None, onPlaybackStart=<function PlaybackOptions.<lambda>>, onPlaybackEnd=<function PlaybackOptions.<lambda>>, audioPostProcessor=<function PlaybackOptions.<lambda>>), disable_playback: bool = False) tuple[Queue[ndarray], Future[OutputStream] | None, Future[GenerationInfo]]¶
Isolate the voice in the given audio and stream the result.
- Parameters:
audio (bytes|BinaryIO) – The audio to isolate voice from.
playback_options (PlaybackOptions, optional) – Options for the audio playback such as the device to use and whether to run in the background.
disable_playback (bool, optional) – Allows you to disable playback altogether.
- Returns:
A queue containing the numpy audio data as float32 arrays.
An optional future for controlling the playback, returned if playback is not disabled.
A future containing a GenerationInfo with metadata.
- Return type:
tuple[queue.Queue[numpy.ndarray], Optional[Future[OutputStream]], Future[GenerationInfo]]
- get_real_audio_format(generationOptions: GenerationOptions) GenerationOptions¶
- Parameters:
generationOptions (GenerationOptions) – A GenerationOptions object.
- Returns:
A GenerationOptions object with a real audio format (if the original was mp3_highest or pcm_highest, it’s modified accordingly, otherwise returned directly)
- get_usage_stats(start_time: datetime | int, end_time: datetime | int | None = None, include_workspace_metrics: bool = False, breakdown_type: str | None = 'voice')¶
Returns the usage stats for the user.
- Parameters:
start_time (datetime.datetime|int) – The start of the usage window, in milliseconds.
end_time (datetime.datetime|int, Optional) – The end of the usage window, in milliseconds. Defaults to today’s date.
include_workspace_metrics (bool, Optional) – Whether to include workspace metrics. Defaults to False.
breakdown_type (str, Optional) – How to break down the results. Must be one of none, voice, user, api_keys, product_type.
- Returns:
A tuple containing:
The data formatted as a dict with datetime objects as keys.
The data in its raw format.
- Return type:
tuple
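When passing integers rather than datetime objects, the values are expected in milliseconds. The sketch below shows one way to derive such a value, assuming the timestamps are counted from the Unix epoch and that naive datetimes should be read as UTC; to_epoch_millis is a hypothetical helper, not part of the library.

```python
from datetime import datetime, timezone


def to_epoch_millis(moment: datetime) -> int:
    """Convert a datetime to integer milliseconds since the Unix epoch."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes are treated as UTC in this sketch.
        moment = moment.replace(tzinfo=timezone.utc)
    return round(moment.timestamp() * 1000)


print(to_epoch_millis(datetime(2024, 1, 1, tzinfo=timezone.utc)))
```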
- create_dub(name: str, target_lang: str, source_url: str = '', source_file_path: str | None = None, source_lang: str = 'auto', num_speakers: int = 0, watermark: bool = False, start_time: int | None = None, end_time: int | None = None, highest_resolution: bool = False, drop_background_audio: bool = False, use_profanity_filter: bool = False) Tuple[Dub, int]¶
Dubs a video or an audio file into the given language.
- Parameters:
name (str) – Name of the dubbing project.
target_lang (str) – The target language to dub the content into.
source_url (str) – URL of the source video/audio file.
source_file_path (str, optional) – File path of the audio/video file to dub. If provided, it will be used instead of source_url.
source_lang (str, optional) – Source language. Defaults to “auto”.
num_speakers (int, optional) – Number of speakers to use for the dubbing. Set to 0 to automatically detect the number of speakers. Defaults to 0.
watermark (bool, optional) – Whether to apply a watermark to the output video. Defaults to False.
start_time (int, optional) – Start time of the source video/audio file.
end_time (int, optional) – End time of the source video/audio file.
highest_resolution (bool, optional) – Whether to use the highest resolution available. Defaults to False.
drop_background_audio (bool, optional) – An advanced setting. Whether to drop background audio from the final dub. Defaults to False.
use_profanity_filter (bool, optional) – [BETA] Whether transcripts should have profanities censored with the words ‘[censored]’. Defaults to False.
- Returns:
A tuple containing the Dub object for the dubbing task and its expected duration in seconds.
- Return type:
Tuple[Dub, int]
- get_dub_by_id(dubbing_id: str) Dub¶
Returns metadata about a dubbing project, including whether it’s still in progress or not.
- Parameters:
dubbing_id (str) – ID of the dubbing project.
- Returns:
A Dub object containing the metadata of the dubbing project.
- Return type:
Dub