
elevenlabs-api's Introduction

🗣️🔊 elevenlabs-api

An unofficial ElevenLabs AI Voice Generation Java API

Getting Started

So you wanna make custom voices, huh? Well, you've come to the right place. This library should cover all the ElevenLabs API endpoints as of 11/15/23.

Update: It seems I jumped the gun (or happened to bump into an accidental push of the speech-to-speech API docs). The original documentation that my implementation was based on has since been removed or hidden. It may reappear as is, or it may require changes.

Installation

Maven

To add elevenlabs-api to your Maven project, use:

<dependency>
    <groupId>net.andrewcpu</groupId>
    <artifactId>elevenlabs-api</artifactId>
    <version>2.7.8</version>
</dependency>

JAR

Compiled JARs are available via the Releases tab.

Setting up your API Key

To find your ElevenLabs API key, head to the official website and open the 'Profile' tab, where your xi-api-key is shown. Then register the key with the library like below:

ElevenLabs.setApiKey("YOUR_API_KEY_HERE");


ElevenLabs.setDefaultModel("eleven_monolingual_v1"); // Optional, defaults to: "eleven_monolingual_v1"

If your repository is public, keep your API key out of your source code, for example by storing it in an environment variable.
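As a minimal sketch of the environment-variable approach, the helper below reads the key from a map of environment values so it can be exercised without touching the real environment. The variable name ELEVENLABS_API_KEY and the ApiKeyResolver class are assumptions made for illustration; they are not part of this library.

```java
import java.util.Map;
import java.util.Optional;

/** Resolves an API key from environment-style configuration instead of source code. */
final class ApiKeyResolver {
    // Hypothetical variable name; pick whatever fits your deployment.
    static final String ENV_VAR = "ELEVENLABS_API_KEY";

    /** Returns the trimmed key, or empty if the variable is unset or blank. */
    static Optional<String> resolve(Map<String, String> env) {
        String raw = env.get(ENV_VAR);
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(raw.trim());
    }

    public static void main(String[] args) {
        Optional<String> key = resolve(System.getenv());
        if (key.isPresent()) {
            // With the library on the classpath: ElevenLabs.setApiKey(key.get());
            System.out.println("API key loaded from " + ENV_VAR);
        } else {
            System.err.println("Set " + ENV_VAR + " before starting the application.");
        }
    }
}
```

The resolved value can then be passed to ElevenLabs.setApiKey(...) at startup, so the key never appears in version control.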


Links to ElevenLabs

ElevenLabs Website: https://elevenlabs.io

ElevenLabs API Documentation: https://api.elevenlabs.io/docs


Simplified Generation Handling with Builders

v2.7.8 now includes SpeechGenerationBuilder.java

Text to Speech

//File output
SpeechGenerationBuilder.textToSpeech()
        .file() // output type of file (or use .streamed() for an InputStream)
        .setText(String text)
        .setGeneratedAudioOutputFormat(GeneratedAudioOutputFormat.MP3_44100_128)
        .setVoiceId("voiceIdString")
        .setVoiceSettings(VoiceSettings)
        .setVoice(Voice) // or use a voice object, which will pull settings / ID out of the Voice
        .setModelId("modelIdString")
        .setModel(ElevenLabsVoiceModel.ELEVEN_ENGLISH_STS_V2)
        .setLatencyOptimization(StreamLatencyOptimization.NONE)
        .build();
//Streamed output
SpeechGenerationBuilder.textToSpeech()
        .streamed()
        .setText(String text)
        .setGeneratedAudioOutputFormat(GeneratedAudioOutputFormat.MP3_44100_128)
        .setVoiceId("voiceIdString")
        .setVoiceSettings(VoiceSettings)
        .setVoice(Voice) // or use a voice object, which will pull settings / ID out of the Voice
        .setModelId("modelIdString")
        .setModel(ElevenLabsVoiceModel.ELEVEN_ENGLISH_STS_V2)
        .setLatencyOptimization(StreamLatencyOptimization.NONE)
        .build();
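When you use the streamed output, you will usually want to do something with the resulting InputStream. The sketch below shows one option, writing the stream to a file with the standard library only. AudioStreamSaver is a hypothetical helper, and the byte array stands in for the stream a streamed builder would return.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Writes an audio stream to disk using only java.nio. */
final class AudioStreamSaver {
    /** Copies the whole stream to target, replacing any existing file; returns bytes written. */
    static long save(InputStream audio, Path target) throws IOException {
        // try-with-resources closes the stream even if the copy fails
        try (InputStream in = audio) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in bytes; in real use this would be the InputStream from the streamed builder.
        InputStream fake = new ByteArrayInputStream(new byte[] {1, 2, 3, 4});
        Path out = Files.createTempFile("speech", ".mp3");
        System.out.println(save(fake, out) + " bytes written to " + out);
    }
}
```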

Voices

Accessing your List of Available Voices

To retrieve your list of accessible Voices, you can call the static Voice#getVoices() method. This will return both ElevenLabs' pregenerated Voice models and any personal Voices you have generated.

List<Voice> voices = Voice.getVoices();

Accessing the Default Voice Settings

ElevenLabs provides a default VoiceSettings configuration, which can be accessed statically from VoiceSettings#getDefaultVoiceSettings(). This is a network request.

VoiceSettings.getDefaultVoiceSettings();

Getting a Voice by ID

Retrieving voices via their voiceId can be done in a few different ways.

The first retrieves a Voice by voiceId and by default includes the settings.

Voice.get(String voiceId);

If you don't wish to retrieve the Voice model with its default settings included, you can use the Voice#get(String voiceId, boolean withSettings) function. By specifying false for withSettings, you will receive a voice object without its default VoiceSettings. (They can be loaded later with Voice#fetchSettings(), or not at all by providing a VoiceSettings object when generating TTS)

Voice.get(String voiceId, boolean withSettings);

Deleting a voice

To delete a Voice, you can utilize the Voice#delete() function. This will delete a voice from the ElevenLabs API.

Voice voice;
voice.delete();

Retrieving an Updated VoiceSettings for a Voice

There may be times when the default VoiceSettings parameters are changed outside of this API (via the main website or another integrated system). To retrieve and apply the most up-to-date VoiceSettings object to a Voice, you can use the Voice#fetchSettings() function. (This is a network request, and it updates the object you're acting upon.)

Voice voice;
voice.fetchSettings(); // requests updated settings from ElevenLabs

Updating the VoiceSettings for a Voice

A VoiceSettings object can be modified and updated in a Voice. The Voice#updateVoiceSettings(VoiceSettings settings) function updates the default voice parameters to use when generating speech. (This is a network request, and updates the object you're acting upon.)

Voice voice;
voice.updateVoiceSettings(VoiceSettings settings);

Editing a Voice

To edit an existing Voice model, you can load the Voice into a VoiceBuilder with the VoiceBuilder#fromVoice(Voice voice) function. The fromVoice(Voice voice) function will add all the current values stored in a Voice object directly into the VoiceBuilder. (Meaning you do not have to redefine existing settings)

You'll also see that the Creating a Voice section utilizes this same VoiceBuilder class, except it uses the VoiceBuilder#create() function, instead of the VoiceBuilder#edit() function.

When editing a voice, you must use VoiceBuilder.edit()

Voice voice;
VoiceBuilder builder = VoiceBuilder.fromVoice(voice); // load the existing voice into the builder
builder.withLabel("key", "value"); // add a new label
builder.removeLabel("oldKey");
builder.withFile(new File("someAudioFile.mp3")); // add a new audio sample
voice = builder.edit(); // edit voice & return updated voice object

Creating a Voice

To generate a new Voice model from the API, you can use the VoiceBuilder class to assemble the required parameters for a Voice model. Some things to remember:

  • All Voice objects must have a name. (There can only be ONE name for a Voice)
  • At least 1 file (Voice Sample) must be provided with VoiceBuilder#withFile()
  • Labels are not required, but are helpful. As many labels as you like may be applied to a Voice.

Voice voice;
VoiceBuilder builder = new VoiceBuilder();
builder.withName("Voice name");
builder.withFile(new File("sample1.mp3"));
builder.withFile(new File("sample2.mp3"));
builder.withLabel("accent", "American");
voice = builder.create();
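The requirements listed above (a name and at least one sample file) can be checked locally before any network request is made. The following is a hedged sketch of such a pre-flight check; VoiceInputCheck is a hypothetical helper and not part of this library, and the library or API may perform its own validation as well.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Local sanity checks to run before assembling a new voice. */
final class VoiceInputCheck {
    /** Returns human-readable problems; an empty list means the inputs look usable. */
    static List<String> problems(String name, List<File> samples) {
        List<String> found = new ArrayList<>();
        if (name == null || name.trim().isEmpty()) {
            found.add("a voice name is required");
        }
        if (samples == null || samples.isEmpty()) {
            found.add("at least one sample file is required");
        } else {
            for (File sample : samples) {
                // isFile() is false for missing paths and for directories
                if (!sample.isFile()) {
                    found.add("sample not found: " + sample.getPath());
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> issues = problems("Voice name", List.of(new File("sample1.mp3")));
        if (issues.isEmpty()) {
            System.out.println("Inputs look fine; safe to hand to VoiceBuilder.");
        } else {
            issues.forEach(System.out::println);
        }
    }
}
```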

Generating Audio (TTS + STS)

To generate an audio file with a given Voice, you can utilize the Voice#generate(...) functions. How you retrieved your Voice (with or without settings) determines whether you can rely on its implicit VoiceSettings or have to pass a VoiceSettings object explicitly. Unless you explicitly requested the Voice without settings, every Voice object SHOULD contain its default VoiceSettings.

Voice voice;

File file = voice.generate("Hello world!", "my_favorite_model");
...
//Available Functions:
public File generate(String text, String model);

public File generate(String text, String model, VoiceSettings settings);

public File generate(String text, String model, VoiceSettings settings, GeneratedAudioOutputFormat outputFormat, StreamLatencyOptimization streamLatencyOptimization);

public File generate(String text, String model, VoiceSettings settings, GeneratedAudioOutputFormat outputFormat);

public File generate(String text, VoiceSettings settings, GeneratedAudioOutputFormat outputFormat);

public File generate(String text, VoiceSettings settings, StreamLatencyOptimization streamLatencyOptimization);

public File generate(String text, VoiceSettings settings, GeneratedAudioOutputFormat outputFormat, StreamLatencyOptimization streamLatencyOptimization);

public File generate(String text, GeneratedAudioOutputFormat outputFormat, StreamLatencyOptimization streamLatencyOptimization);

public File generate(String text, StreamLatencyOptimization streamLatencyOptimization);

public File generate(String text, VoiceSettings settings);

public File generate(String text);

public InputStream generateStream(String text, String model);

public InputStream generateStream(String text, String model, VoiceSettings settings);

public InputStream generateStream(String text, VoiceSettings settings);

public InputStream generateStream(String text);

public InputStream generateStream(String text, String model, GeneratedAudioOutputFormat generatedAudioOutputFormat, StreamLatencyOptimization streamLatencyOptimization);

public InputStream generateStream(String text, String model, GeneratedAudioOutputFormat generatedAudioOutputFormat, StreamLatencyOptimization streamLatencyOptimization, VoiceSettings settings);

public InputStream generateStream(String text, VoiceSettings settings, GeneratedAudioOutputFormat generatedAudioOutputFormat, StreamLatencyOptimization streamLatencyOptimization);

public InputStream generateStream(String text, GeneratedAudioOutputFormat generatedAudioOutputFormat, StreamLatencyOptimization streamLatencyOptimization);

public InputStream generateStream(String text, String model, StreamLatencyOptimization streamLatencyOptimization);

public InputStream generateStream(String text, String model, StreamLatencyOptimization streamLatencyOptimization, VoiceSettings settings);

public InputStream generateStream(String text, VoiceSettings settings, StreamLatencyOptimization streamLatencyOptimization);

public InputStream generateStream(String text, StreamLatencyOptimization streamLatencyOptimization);

Audio Native Projects

Creating an Audio Native Project

You can create audio native projects using the AudioNative API.

CreateAudioEnabledProjectRequest request = new CreateAudioEnabledProjectRequest()
        .setName("Project name")
        .setImage("https://...com/img.png")
        .setAuthor("Andrew")
        .setSmall(true)
        .setTextColor("red")
        .setBackgroundColor("black")
        .setSessionization(3)
        .setVoiceId("aso23809")
        .setModelId("my_favorite_model")
        .setFile(new File("input.dat"))
        .setAutoConvert(true);

CreateAudioEnabledProjectModelResponse response = ElevenLabs.getAudioNativeAPI()
                                                             .createAudioEnabledProject(request);

Projects

Create a project

You can create a new project using the AddProjectRequest builder.

AddProjectRequest request = new AddProjectRequest()
        .setName("name")
        .setFromUrl("...")
        .setFromDocument(new File("file.dat"))
        .setDefaultTitleVoiceId("voiceA")
        .setDefaultParagraphVoiceId("voiceB")
        .setDefaultModelId("the_default_model_of_your_dreams")
        .setProjectOutputQuality(ProjectOutputQuality.STANDARD)
        .setTitle("Big Title")
        .setAuthor("Best Author")
        .setIsbnNumber("THE. ISBN.")
        .setAcxVolumeNormalization(true);

Project project = Project.addProject(request);

Get a Project

Get a project by its specific project ID.

Project project = Project.getProjectById(projectId);

Get all Projects

Get all of the projects associated with your account.

List<Project> projects = Project.getProjects();

Interacting with Projects

Project project;
// Delete a project
String deleteResult = project.delete();

// Convert a project
String conversionResult = project.convertProject();

//Get the project's snapshots.
List<ProjectSnapshot> snapshots = project.getSnapshots();

// Access your chapters from memory
List<Chapter> chapters = project.getChapters();

// Refresh your local Project's chapters from the API
chapters = project.fetchUpdatedChapters(); // This will update the existing project object and return the list of new chapters 

// Get a chapter by ID
Chapter chapter = project.getChapterById(chapterId);

// Delete a chapter
String deleteChapterResult = project.deleteChapter(chapter);

// Convert a chapter
String convertChapterResult = project.convertChapter(chapter);

// Get chapter snapshots
List<ChapterSnapshot> chapterSnapshots = project.getChapterSnapshots(chapter);

Interacting with Chapters

Project project;
Chapter chapter = project.getChapterById("chapter_id");

// Delete a chapter
chapter.deleteChapter(project.getProjectId());

// Convert a chapter
chapter.convertChapter(project.getProjectId());

// Get a chapter's snapshots
List<ChapterSnapshot> snapshots = chapter.getChapterSnapshots(project.getProjectId());


Accessing Snapshot Audio

Accessing a ProjectSnapshot audio stream:

Project project;
List<ProjectSnapshot> projectSnapshots = project.getSnapshots();
ProjectSnapshot first = projectSnapshots.get(0);
InputStream audio = first.getAudioStream();

Accessing a ChapterSnapshot audio stream:

Project project;
Chapter chapter;
List<ChapterSnapshot> chapterSnapshots = project.getChapterSnapshots(chapter);
ChapterSnapshot first = chapterSnapshots.get(0);
InputStream audio = first.getAudioStream();

Both ProjectSnapshot and ChapterSnapshot are of type Snapshot.


Samples

A Sample is used as the training data for a given Voice model.

Accessing Voice Samples

To access the sample(s) for a given Voice, you can utilize Voice#getSamples().

Voice voice;
List<Sample> samples = voice.getSamples();

Downloading a Sample

You can download a Sample via the Sample#downloadAudio(File outputFile) function. The File parameter of downloadAudio() is the local path where the sample should be saved.

Voice voice;
File file = voice.getSamples().get(0).downloadAudio();

Deleting a Sample

Deleting a Sample is easy. This action makes a network request.

Sample sample;
sample.delete();

History

Your ElevenLabs History is a collection of all of your previous TTS generations.

Getting Generation History

To get your ElevenLabs generation History, you can utilize History#get(). (You can also retrieve your History from a User object, with User#getHistory())

History history = History.get();

Getting all History Items

The History endpoint accepts page size parameters and a start-after-history-id parameter. We can use this to fetch all of our HistoryItems.

    History history = History.get(); // the latest history object
    Optional<History> hist = Optional.of(history);
    List<HistoryItem> items = new ArrayList<>();
    do {
        items.addAll(hist.get().getHistoryItems());
        hist = hist.get().next();
    } while (hist.isPresent() && hist.get().hasMore()); // hasMore() is called on the History, not on the Optional
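The same "follow next() until it is empty" idea can be tried out without network access. The sketch below uses a hypothetical Page interface as a stand-in for a paged result; it is not the library's History class, and it only illustrates the general pattern of collecting items from every page, including the last one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class PagingSketch {
    /** Hypothetical stand-in for a paged result; not the library's History class. */
    interface Page<T> {
        List<T> items();
        Optional<Page<T>> next();
    }

    /** Walks the page chain and collects every item, including those on the last page. */
    static <T> List<T> collectAll(Page<T> first) {
        List<T> all = new ArrayList<>();
        Optional<Page<T>> current = Optional.of(first);
        while (current.isPresent()) {
            Page<T> page = current.get();
            all.addAll(page.items());
            current = page.next();
        }
        return all;
    }

    /** Builds an in-memory page for demonstration; a null next marks the last page. */
    static <T> Page<T> pageOf(List<T> items, Page<T> next) {
        return new Page<T>() {
            @Override public List<T> items() { return items; }
            @Override public Optional<Page<T>> next() { return Optional.ofNullable(next); }
        };
    }

    public static void main(String[] args) {
        Page<String> second = pageOf(List.of("c"), null);
        Page<String> first = pageOf(List.of("a", "b"), second);
        System.out.println(collectAll(first)); // prints [a, b, c]
    }
}
```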

Getting a History Item

To retrieve a HistoryItem from your History, you can use History#getHistoryItem(String itemId).

History history;
HistoryItem item = history.getHistoryItem("itemId");

Downloading History

The official ElevenLabs API provides an endpoint for downloading multiple HistoryItems as a ZIP file. To download such items, you can pass either a String[] containing the HistoryItem IDs or a List<HistoryItem>.

History history;
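// Option 1: download by item IDs
File downloadByIds = history.downloadHistory(new String[]{"item-id1", "item-id2"});
// Option 2: download by HistoryItem objects
List<HistoryItem> historyItems = history.getHistoryItems();
File downloadByItems = history.downloadHistory(historyItems);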

Deleting a HistoryItem

You can utilize the HistoryItem#delete() function to delete a HistoryItem from ElevenLabs.

HistoryItem item;
item.delete();

Requesting the Voice for a HistoryItem

By default, a HistoryItem contains the voiceId of the Voice used to generate it. The getVoice() function will send a request to the ElevenLabs API to retrieve the voice object. (See also Voice.get() and Voice.getVoices())

HistoryItem item;
Voice voice = item.getVoice();

Downloading a HistoryItem Audio

A HistoryItem is a previous TTS generation. You can download the generation as an MP3 file by executing the downloadAudio() function. The return value is the tmp File location of your download.

HistoryItem item;
File file = item.downloadAudio();
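Because the returned File lives in a temporary location, you may want to copy it somewhere permanent. Here is a minimal sketch using only the standard library; DownloadKeeper, the target directory, and the file name are hypothetical and chosen purely for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Copies a temporary download to a location you control. */
final class DownloadKeeper {
    /** Copies tempFile into targetDir under fileName and returns the new path. */
    static Path keep(File tempFile, Path targetDir, String fileName) {
        try {
            // Create the directory (and any missing parents) if needed
            Files.createDirectories(targetDir);
            Path target = targetDir.resolve(fileName);
            return Files.copy(tempFile.toPath(), target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the File returned by a download call
        File temp = File.createTempFile("history-item", ".mp3");
        Path kept = keep(temp, Files.createTempDirectory("audio"), "greeting.mp3");
        System.out.println("Saved copy at " + kept);
    }
}
```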

Projects

The Projects


User Management

Getting your Subscription

A Subscription contains all the relevant data to manage your API usage (character usage, next billing cycle, etc.)

Subscription subscription = Subscription.get();

Getting your User

This endpoint will return the User associated with a given API key.

User user = User.get();

Exceptions

ElevenLabsValidationException

This error indicates a malformed request to the ElevenLabs API. The exception should provide the location of any syntactically incorrect parameters within the request.


Misc

As specified in the official ElevenLabs API Documentation, their API is experimental and all endpoints are subject to change. Depending on how they modify their API, changes may break this library. Should you notice any API changes or library errors, feel free to submit an issue or a PR.

If you like what you see, give it a star! :)


Todo

I will probably rework the 2 new builders I added when I added projects support. Their usage should be more clear, though the documentation covers their use cases I believe.


Unit Testing

Unit tests have been created. The following endpoints are destructive and are not included in the testing:

  • deleteHistoryItem - WONT TEST
  • deleteSample - WONT TEST
  • deleteVoice - WONT TEST
  • editVoiceSettings - WONT TEST
  • createVoice - WONT TEST
  • editVoice - WONT TEST

To run the unit tests yourself, you have to clone this repo and update the ElevenLabsTest.java file with your API key and your voice to test on.

Thanks to ElevenLabs for making an awesome tool 🥂

elevenlabs-api's People

Contributors

andrewcpu


elevenlabs-api's Issues

Websocket API

Hi there, I saw in #15 that you're working on the websocket part of the API

I'm eager to use it!

Do you have any rough ETA? :)

Generate and GenerateStream take about the same time

Shouldn't generateStream be a lot faster than generate?

voice.generate("This is a small test. Can this test be any faster? Can the speech be generated faster?");
-> 1700-1800ms

voice.generateStream("This is a small test. Can this test be any faster? Can the speech be generated faster?");
->1700ms to generateStream, 120ms to read all data from Stream = 1820ms

Is the streaming endpoint not used correctly or something? :)

Problem with requests in PCM_16000

In my Java project, I am using a speech generation constructor like this:

image

However, I am receiving the following error:

image

After some investigation, I realized that ElevenNetworkUtil.FILE_RESULT_TYPES does not have the type "audio/pcm" which is the value that comes in the response.getEntity().getContentType() to the ElevenLabs request.

Getting error code 400 in /v1/speech-to-speech API from android for new added voice

We get the new added voice on fetching the voices, but when we try to use that voice in speech to speech transformation we get the following error :

Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: Cannot construct instance of java.io.InputStream (no Creators, like default constructor, exist): abstract types either need to be mapped to concrete types, have custom deserializer, or contain additional type information
at [Source: (String)"{"detail":{"status":"voice_does_not_exist","message":"A voice with voice_id DJfO3mfSaig9bYh34Kcb does not exist."}}"; line: 1, column: 1]

Server returned HTTP response code: 401 for URL

I get this error when i run the command Voice.generate() and when I click on the link that it goes to and outputs (https://api.elevenlabs.io/v1/text-to-speech/VR6AewLTigWG4xSOukaG) I get this "{"detail":"Method Not Allowed"}"

It seems like it's registering with my API because there is no connection to when I set the key in the ElevenLabsAPI object voiceAPI and when I try to generate a file with the Voice object LunaVoice. Any help is greatly appreciated

[REMOVED FOR PRIVACY]

IntelliJ CANNOT FIND Dependency

Hello. I would like to admit a few things. Firstly, i am new to using maven. Secondly, I am very lost. I am trying to create a new maven project in intellij, and add the dependency for this api. In my pom.xml file, i have the following:
<dependency>
    <groupId>net.andrewcpu.elevenlabs</groupId>
    <artifactId>elevenlabs-api</artifactId>
    <version>1.0-SNAPSHOT</version>
</dependency>
I have also tried:
<dependency>
    <groupId>net.andrewcpu</groupId>
    <artifactId>elevenlabs-api</artifactId>
    <version>1.0-SNAPSHOT</version>
</dependency>

Neither of these update the maven dependencies. I have even tried using the search bar in IntelliJ to find the dependency, and nothing works. Whenever I open the terminal in IntelliJ and type in "mvn install" it says that the command was not found. I really want to use this to create an app, but i am getting very frustrated. If anyone could help me, that would be awesome. Thank you.

Unsupported payload value type for speech to speech endpoints

Speech to speech endpoints will throw:
Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Unsupported payload value type
at net.andrewcpu.elevenlabs.util.ElevenNetworkUtil.handleMultipartBodyRequest(ElevenNetworkUtil.java:103)
at net.andrewcpu.elevenlabs.util.ElevenNetworkUtil.handleBodyRequest(ElevenNetworkUtil.java:118)
at net.andrewcpu.elevenlabs.util.ElevenNetworkUtil.getRequest(ElevenNetworkUtil.java:134)
at net.andrewcpu.elevenlabs.util.ElevenNetworkUtil.sendRequest(ElevenNetworkUtil.java:145)
at net.andrewcpu.elevenlabs.api.ElevenLabsAPI.sendRequest(ElevenLabsAPI.java:11)
at net.andrewcpu.elevenlabs.api.impl.SpeechToSpeechAPI.generateSpeechToSpeech(SpeechToSpeechAPI.java:18)
at net.andrewcpu.elevenlabs.model.voice.Voice.speechToSpeech(Voice.java:245)

I don't have a Speech to Speech model available to my account, so once I fix this I will not be able to confirm whether or not the fix will work.

Multiple API Keys

Hey Andres, it's me again :)

I am now at a point, where I need multiple elevenlabs instances running. This is currently NOT supported, correct?

Is this planned? Maybe with a builder pattern? How complex do you think such a change would be?
Found 57 static variables (ElevenLabs, ElevenNetworkUtil, Choice, GeneratedAudioOutputFormat, ProjectOutputQuality, StreamLatencyOptimization, History, Project, Subscription, User).

Would you be willing (and also have the time) to implement such a thing or should I get started and add a PR some time in the future?

Kind regards and merry christmas
Dustin
