
Creating voice clones with Autogen and OpenVoice technology #cloning

Waldo Bear

This guide shows how to build a voice clone agent with AutoGen and OpenVoice. It defines a SpeakingAgent class, a subclass of AutoGen's AssistantAgent, and a VoiceClone class that clones a voice using an OpenVoice embedding. The SpeakingAgent speaks each generated reply before returning it, while the VoiceClone class builds an embedding that maps the base TTS voice to the target voice. The SpeakingAgent is initialized with a base LLM config, a base voice clip, and a target voice clip. During a chat, the agent generates a text response with the LLM, synthesizes it as audio in the base voice, and then converts that audio to the target voice using the voice_clone_embedding.
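The flow described here (generate text, synthesize it in the base voice, convert to the target voice, play it, then return the text) can be sketched without either library installed. This is a minimal illustration, not the author's code: StubAssistantAgent stands in for AutoGen's AssistantAgent, the VoiceClone methods synthesize_base and convert are hypothetical placeholders for the OpenVoice TTS and tone-conversion steps, and reply_fn and play_fn are assumed hooks for the LLM call and audio playback.

```python
from typing import Callable


class StubAssistantAgent:
    """Stand-in for AutoGen's AssistantAgent (hypothetical, illustration only)."""

    def __init__(self, name: str, reply_fn: Callable[[str], str]):
        self.name = name
        self._reply_fn = reply_fn  # plays the role of the LLM call

    def generate_reply(self, message: str) -> str:
        return self._reply_fn(message)


class VoiceClone:
    """Hypothetical wrapper for a base-voice to target-voice mapping."""

    def __init__(self, base_clip: str, target_clip: str):
        self.base_clip = base_clip
        self.target_clip = target_clip

    def synthesize_base(self, text: str) -> bytes:
        # Placeholder for base TTS: tags the text with the base clip name.
        return f"[{self.base_clip}] {text}".encode("utf-8")

    def convert(self, base_audio: bytes) -> bytes:
        # Placeholder for voice conversion: swaps the base tag for the target tag.
        return base_audio.replace(
            self.base_clip.encode("utf-8"), self.target_clip.encode("utf-8"), 1
        )


class SpeakingAgent(StubAssistantAgent):
    """Speaks each reply in the target voice before returning the text."""

    def __init__(self, name, reply_fn, voice: VoiceClone,
                 play_fn: Callable[[bytes], None]):
        super().__init__(name, reply_fn)
        self.voice = voice
        self.play_fn = play_fn  # assumed audio-playback hook

    def generate_reply(self, message: str) -> str:
        text = super().generate_reply(message)         # 1. LLM text response
        base_audio = self.voice.synthesize_base(text)  # 2. audio in the base voice
        target_audio = self.voice.convert(base_audio)  # 3. convert to target voice
        self.play_fn(target_audio)                     # 4. speak before returning
        return text


# Usage: collect "played" audio in a list instead of sending it to a speaker.
played = []
agent = SpeakingAgent(
    "narrator",
    reply_fn=lambda m: "hello " + m,
    voice=VoiceClone("base.wav", "target.wav"),
    play_fn=played.append,
)
reply = agent.generate_reply("world")
```

Passing the playback function in as a parameter keeps the agent testable without audio hardware; in a real setup the placeholders would presumably call the voice cloning server the guide describes.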

The guide also includes code snippets for the SpeakingAgent and VoiceChanger classes, with instructions for running them on separate servers, and links to the server code and to utilities for interacting with the voice cloning server. It closes with a short usage example that creates a SpeakingAgent to role-play favorite characters or generate voices for other applications. The author invites readers to get in touch about interesting applications or about scaling the project for production use, and links to the GitHub repository used throughout the guide.

Source link: https://medium.com/@waldobear002/voice-clone-agent-using-autogen-and-openvoice-83eb637f41a1?source=rss------ai-5
