Hume Introduces Interpretability-Based Voice Control Feature for AI Voice Customisation

Hume, a New York-based artificial intelligence (AI) firm, unveiled a new tool on Monday that will allow users to customise AI voices. Dubbed Voice Control, the new feature is aimed at helping developers integrate these voices into their chatbots and other AI-based applications. Instead of offering a variety of preset voices, the company provides granular control over 10 different dimensions of a voice. By selecting the desired parameters along each of the dimensions, users can generate unique voices for their apps.

The company detailed the new AI tool in a blog post. Hume said that it is trying to solve the problem of enterprises struggling to find the right AI voice to match their brand identity. With this feature, users can customise different aspects of how a voice is perceived, allowing developers to create a more assertive, relaxed, or buoyant voice for AI-based applications.

Hume’s Voice Control is currently available in beta, but it can be accessed by anyone registered on the platform. Gadgets 360 staff members were able to access the tool and test the feature. There are 10 different dimensions developers can adjust: gender, assertiveness, buoyancy, confidence, enthusiasm, nasality, relaxedness, smoothness, tepidity, and tightness.

Instead of adding prompt-based customisation, the company has added a slider that goes from -100 to +100 for each of the metrics. The company said this approach was taken to eliminate the vagueness associated with textual descriptions of a voice and to offer granular control over the generated voices.
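To illustrate what such slider-based control might look like from a developer's side, here is a minimal Python sketch of a voice-customisation payload. Hume has not published this schema; the function, dimension keys, and payload shape below are hypothetical, assumed purely for illustration from the article's description of the sliders.

```python
# Hypothetical sketch: Hume has not published this exact schema, so the
# dimension keys, defaults, and payload shape are illustrative assumptions
# based on the -100 to +100 sliders described in the article.

DIMENSIONS = (
    "gender", "assertiveness", "buoyancy", "confidence", "enthusiasm",
    "nasality", "relaxedness", "smoothness", "tepidity", "tightness",
)

def clamp(value: float) -> float:
    # Keep every slider inside the -100 to +100 range described above.
    return max(-100.0, min(100.0, value))

def build_voice_config(base_voice: str, **sliders: float) -> dict:
    # Unspecified dimensions default to 0, i.e. the base voice unchanged.
    unknown = set(sliders) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"Unknown dimensions: {sorted(unknown)}")
    return {
        "base_voice": base_voice,
        "dimensions": {d: clamp(sliders.get(d, 0.0)) for d in DIMENSIONS},
    }

# Example: a more assertive, slightly more relaxed take on a base voice.
print(build_voice_config("example-base-voice", assertiveness=40, relaxedness=25))
```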

In our testing, we found that altering any of the 10 dimensions makes an audible difference to the AI voice, and the tool was able to disentangle the different dimensions correctly. The AI firm claimed this was achieved by developing a new “unsupervised approach” which preserves most characteristics of each base voice when specific parameters are varied. Notably, Hume did not detail the source of the data it used.

After creating an AI voice, developers will have to deploy it to their application by configuring Hume’s Empathic Voice Interface (EVI) AI model. While the company did not specify, the EVI-2 model was likely used for this experimental feature.
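Deploying the voice would presumably amount to attaching it to an EVI configuration through Hume's API. The endpoint path and payload fields in the sketch below are assumptions for illustration, not Hume's documented interface; consult the official API docs for the real shape.

```python
# Hypothetical sketch of attaching a custom voice to an EVI configuration.
# The endpoint path, request payload, and even the API-key header are
# assumptions here; check Hume's official documentation before use.
import os
import requests

def attach_voice_to_evi_config(config_id: str, voice_id: str) -> dict:
    response = requests.patch(
        f"https://api.hume.ai/v0/evi/configs/{config_id}",  # assumed path
        headers={"X-Hume-Api-Key": os.environ["HUME_API_KEY"]},  # assumed header
        json={"voice": {"id": voice_id}},  # assumed payload shape
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```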

In the future, Hume plans to expand the range of base voices, introduce more interpretable dimensions, improve the preservation of voice characteristics under extreme modifications, and develop advanced tools to analyse and visualise voice characteristics.
