OpenAI's Latest API Launches: DALL-E 3, Audio API, and Whisper large-v3

10 November 2023 Hi-network.com

OpenAI has recently announced the launch of several new APIs during its first-ever developer day event. The APIs include DALL-E 3, a text-to-image model, and the Audio API for text-to-speech conversion.

DALL-E 3, now accessible via an API, was first introduced in ChatGPT and Bing Chat. OpenAI has built moderation into the API to prevent misuse. The DALL-E 3 API offers different format and quality options for generating images, with pricing starting at $0.04 per generated image. However, it has certain limitations compared with its predecessor, DALL-E 2. For instance, the current version cannot be used to create edited versions or variations of existing images. OpenAI also automatically rewrites generation requests, both for safety reasons and to add more detail, which may produce less precise results depending on the prompt.
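As a rough illustration of the format and quality options mentioned above, the following Python sketch builds the JSON body for a single image request without sending it. The endpoint URL, the parameter names (`model`, `prompt`, `size`, `quality`, `n`) and the allowed values are assumptions based on OpenAI's public API reference, not details given in this article, and `build_dalle3_request` is a hypothetical helper. Check the current documentation before relying on any of them.

```python
import json

# Assumed values taken from OpenAI's public image-generation reference;
# none of them are stated in the article, so verify before use.
IMAGES_ENDPOINT = "https://api.openai.com/v1/images/generations"
ALLOWED_SIZES = {"1024x1024", "1792x1024", "1024x1792"}
ALLOWED_QUALITIES = {"standard", "hd"}


def build_dalle3_request(prompt: str,
                         size: str = "1024x1024",
                         quality: str = "standard") -> dict:
    """Return the JSON body for one DALL-E 3 generation request (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size}")
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"unsupported quality: {quality}")
    # One image per request is assumed here.
    return {
        "model": "dall-e-3",
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "n": 1,
    }


if __name__ == "__main__":
    body = build_dalle3_request("A watercolour lighthouse at dawn")
    # The body would be POSTed to IMAGES_ENDPOINT with an API key header.
    print(IMAGES_ENDPOINT)
    print(json.dumps(body, indent=2))
```

Because OpenAI rewrites prompts before generation, it is worth logging whatever rewritten prompt the response returns (understood to be exposed as a `revised_prompt` field, which is also an assumption here) alongside the original, so that unexpected results can be traced back to the rewrite.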

OpenAI's Audio API provides six preset voices and two generative AI model variants for text-to-speech conversion. With pricing starting at $0.015 per 1,000 input characters, the API aims to enhance user experiences and enable use cases such as language learning and voice assistance. Notably, unlike some speech synthesis platforms, OpenAI's Audio API does not currently allow control over the emotional affect of the generated audio. OpenAI's internal tests have yielded mixed results, with factors like capitalisation or grammar in the text influencing the sound of the generated voices.
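To make the pricing concrete, here is a minimal Python sketch that assembles a text-to-speech request body and estimates its cost at the article's starting rate of $0.015 per 1,000 characters. The voice names, model names and field names are assumptions drawn from OpenAI's public documentation rather than from this article, and both helper functions are hypothetical. Nothing is sent over the network.

```python
# Assumed identifiers from OpenAI's public text-to-speech documentation;
# the article only states that there are six voices and two model variants.
TTS_VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")
TTS_MODELS = ("tts-1", "tts-1-hd")

# Starting price quoted in the article, in US dollars per 1,000 input characters.
PRICE_PER_1K_CHARS_USD = 0.015


def build_speech_request(text: str, voice: str = "alloy", model: str = "tts-1") -> dict:
    """Return the JSON body for one speech request (hypothetical helper)."""
    if voice not in TTS_VOICES:
        raise ValueError(f"unknown voice: {voice}")
    if model not in TTS_MODELS:
        raise ValueError(f"unknown model: {model}")
    return {"model": model, "voice": voice, "input": text}


def estimate_cost_usd(text: str, price_per_1k: float = PRICE_PER_1K_CHARS_USD) -> float:
    """Estimate the cost of synthesising `text`, billed per input character."""
    return len(text) / 1000 * price_per_1k


if __name__ == "__main__":
    sample = "Bom dia! Welcome to today's language lesson."
    print(build_speech_request(sample, voice="nova"))
    print(f"Estimated cost: ${estimate_cost_usd(sample):.6f}")
```

Since capitalisation and grammar reportedly affect how the voices sound, it may help to normalise input text consistently before sending it, and to estimate cost on the normalised string.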

To ensure transparency, OpenAI requires developers who use its APIs to inform users that the generated audio or image is artificial. This step aims to make users aware that AI was involved in producing the content.

In addition to these APIs, OpenAI has also launched Whisper large-v3, the next version of its open-source automatic speech recognition model. The company claims that Whisper large-v3 offers improved performance across languages. The model is available on GitHub under a permissive licence.

