OpenAI’s ChatGPT Now Empowered with Voice and Image Capabilities 


OpenAI has announced voice and image capabilities for its widely used AI-driven chatbot, ChatGPT. With these newly integrated capabilities, you can speak to ChatGPT and show it images, which makes for a more natural conversational experience.

This opens up several ways to use OpenAI’s ChatGPT in everyday activities. For instance, while traveling, you can send ChatGPT a picture of a landmark and have a real-time discussion about it. Similarly, at home, you can take photos of the contents of your refrigerator and explore meal ideas or request step-by-step recipes.

OpenAI will gradually introduce these features to Plus and Enterprise users in the coming weeks. The voice feature will be accessible through mobile apps, while the image feature will be available across all platforms.

OpenAI’s ChatGPT Gets Vocal: Engage in Two-way Conversation

The new voice functionality lets you hold spoken conversations with ChatGPT, which can respond audibly in one of five synthesized voices. To activate the feature, opt in through the settings of the iOS or Android mobile app.

OpenAI has revealed that the voice feature relies on an advanced text-to-speech model trained on recordings from voice actors. For speech recognition, it uses Whisper, OpenAI’s open-source speech recognition system.
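The description above implies a three-stage flow: speech is transcribed to text, the chat model answers in text, and the answer is converted back to speech. The following is a minimal conceptual sketch of that flow in Python, not OpenAI’s implementation. Every name in it (`run_voice_turn`, `VoiceTurn`, the `fake_*` stand-ins, and the `"voice-1"` label) is hypothetical and exists only for illustration; no real speech or chat service is called.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """One spoken exchange (hypothetical structure, for illustration only)."""
    transcript: str     # text produced by the speech recognizer
    reply_text: str     # text answer from the chat model
    reply_audio: bytes  # synthesized speech for the answer


def run_voice_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    chat: Callable[[str], str],
    synthesize: Callable[[str, str], bytes],
    voice: str,
) -> VoiceTurn:
    """Chain the three stages: speech-to-text, chat, text-to-speech."""
    transcript = transcribe(audio_in)            # e.g. a Whisper-style recognizer
    reply_text = chat(transcript)                # e.g. a chat model
    reply_audio = synthesize(reply_text, voice)  # e.g. a text-to-speech model
    return VoiceTurn(transcript, reply_text, reply_audio)


# Stand-in stages (assumptions): they only move text around so the
# sketch runs without any external service.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_chat(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str, voice: str) -> bytes:
    return f"[{voice}] {text}".encode("utf-8")


turn = run_voice_turn(
    b"what can I cook tonight?",
    fake_transcribe,
    fake_chat,
    fake_synthesize,
    voice="voice-1",
)
print(turn.reply_text)
```

Passing the stages in as functions keeps the sketch independent of any particular service: each stand-in could be swapped for a real recognizer, chat model, or speech synthesizer without changing the flow.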

Getting started:

To start, follow these simple instructions:

  • Open the mobile app and navigate to the “Settings” section.
  • Scroll down and select “New Features”.
  • Opt in to voice conversations.
  • Tap the headphone icon in the top-right corner of the home screen.
  • Select your preferred voice from the five options.

Ask with Images: ChatGPT’s Latest Feature

You can now show ChatGPT one or more images, which gives it visual context and helps focus the conversation.

For instance, sharing a picture of a malfunctioning appliance can help ChatGPT identify problems and propose solutions. On mobile, a drawing tool is available, allowing you to circle or highlight specific areas within an image.

These image-related features leverage a multimodal version of the GPT-3.5 and GPT-4 models, specially refined to analyze visual inputs. OpenAI conducted rigorous safety assessments to evaluate potential risks before introducing these image capabilities.

Getting started:

To begin using image prompts, follow these simple instructions:

  • On iOS or Android, tap the plus button first.
  • Press the photo button to either capture an image or select one from your library.
  • Optionally, add more images to the discussion or use the drawing tool to point ChatGPT to a specific part of an image.

Prioritizing Security: A Step-by-Step Rollout

OpenAI is taking a cautious, gradual approach to rolling out these features.

The newly introduced voice technology opens the door to innovative applications, but it also raises concerns such as the impersonation of public figures. To reduce these risks, the voice feature is currently restricted to conversational chat.

For images, OpenAI has implemented limitations that prevent ChatGPT from directly analyzing people in photos. It also advises against relying on ChatGPT for high-risk use cases without proper verification.

Conclusion

OpenAI’s recent announcement marks a significant milestone in the evolution of ChatGPT, introducing voice and image capabilities that promise a more natural and immersive conversational experience. These enhancements open doors to countless practical applications, from on-the-go travel help to creative meal planning at home. However, the new technology also carries risks, including safety concerns and the impersonation of public figures.

Overall, this advancement showcases the potential of AI-driven chatbots like ChatGPT to provide more versatile and engaging interactions while prioritizing user safety and security. It’s a significant step forward in the ongoing development of conversational AI.

WRITTEN BY

Anjali Goyal

Anjali Goyal is a content writer at TechEela. She helps businesses increase their online presence with optimized and engaging content. Her services include blog writing, technical writing, and digital marketing.