Perplexity's Reasoning Mode, Ideogram’s Canvas and Runway's Act-One

Hello AI Enthusiasts,

Welcome to another exciting edition of The AI Pro Max! We have some groundbreaking stories that you won't want to miss.

Max Your AI Insights Today:

  • 💻 Anthropic’s Claude Now Controls Computer Interfaces

  • 🧠 Perplexity Unveils Reasoning Mode for Multi-Layered Questions

  • 🎙️ OpenAI Rolls Out Advanced Voice Mode Globally

  • 🖌️ Ideogram’s Canvas: Infinite Creativity with Inpainting & Outpainting

  • 🎥 Runway Act-One: Turn Video & Voice into Stunning Animations

  • 💻 Byte-Sized Buzz

  • 📚 Must-Reads

  • 🛍️ Nerdy Necessities

  • 🧠 Tech Trivia

A Shoutout to Our Sponsor

Learn AI in 5 Minutes a Day

AI Tool Report is one of the fastest-growing and most respected newsletters in the world, with over 550,000 readers from companies like OpenAI, Nvidia, Meta, Microsoft, and more.

Our research team spends hundreds of hours a week summarizing the latest news, and finding you the best opportunities to save time and earn more using AI.

💻 Anthropic’s Claude Now Controls Computer Interfaces - LINK

  • Claude 3.5 Sonnet has been upgraded with significant improvements, particularly in coding, where it outperforms other models on benchmarks like SWE-bench Verified and TAU-bench. It keeps the same price and speed as its predecessor and has been praised by early users such as GitLab, Cognition, and The Browser Company for its stronger reasoning and problem-solving.

  • Anthropic has also announced Claude 3.5 Haiku, which matches the performance of Claude 3 Opus, the company's previous largest model, on many evaluations while keeping a cost and speed similar to the previous generation of Haiku. It excels in coding tasks, instruction following, and tool use, making it suitable for user-facing products and specialized sub-agent tasks. It will be available later this month on various platforms.

  • Anthropic has introduced a public beta for "computer use," a capability that allows Claude 3.5 Sonnet to interact with computers like humans, performing tasks such as navigating interfaces, clicking buttons, and typing text. Although still experimental and imperfect, this feature has the potential to automate repetitive processes and conduct complex tasks. Developers are encouraged to provide feedback to improve this capability.

🧠 Perplexity Unveils Reasoning Mode for Multi-Layered Questions - LINK

  • The Pro Search feature now includes a Reasoning Mode that uses multi-step reasoning to break down complex queries into manageable components, providing more comprehensive and accurate results. This mode automatically activates when additional computations or searches are needed.

  • The upgrade integrates several powerful components, including advanced code execution and mathematical problem-solving through Wolfram|Alpha integration, improved programming capabilities for debugging and data analysis, and enhanced complex problem-solving through multi-step reasoning.

  • The feature is available to users across different tiers: free users can access 5 Pro Searches every 4 hours, while Pro subscribers ($20/month) get up to 600 Pro Searches daily. Pro Search is particularly useful in various professional contexts such as academic research, legal research, marketing trend analysis, and code debugging and development support.

🎙️ OpenAI Rolls Out Advanced Voice Mode Globally - LINK

  • OpenAI has officially rolled out the Advanced Voice Mode to users in the European Union, Switzerland, Iceland, Norway, and Liechtenstein. This rollout follows a period where the feature was not available in these regions due to regulatory considerations, particularly related to the EU's General Data Protection Regulation (GDPR).

  • Advanced Voice Mode lets users talk to ChatGPT by voice for more natural, conversational interactions. It requires a paid subscription: it is available to Plus users and to some Team and Enterprise users, while free users in these regions do not yet have access.

  • The delay in rolling out Advanced Voice Mode to EU users was attributed to the need for additional external reviews to ensure the feature aligns with local regulatory requirements. With these hurdles now cleared, users in the EU can access the feature without needing to use VPNs.

🖌️ Ideogram’s Canvas: Infinite Creativity with Inpainting & Outpainting - LINK

  • Ideogram Canvas is an infinite creative board that allows users to organize, generate, edit, and combine images. It supports uploading personal images or generating new ones, and integrates with Magic Fill (inpainting) and Extend (outpainting) tools for advanced editing capabilities.

  • Magic Fill lets users edit specific regions of an image to replace objects, add text, fix imperfections, or change backgrounds. Users can zoom into part of an image to generate high-resolution detail, or combine multiple images into a unified composition. The Extend tool expands images beyond their original borders while maintaining the original style and composition, so users can adjust the composition and aspect ratio to fit any screen size.

  • Ideogram Canvas is available with any paid plan on ideogram.ai, with additional editing capabilities for uploaded images available in the Plus or Pro plans. Developers can also integrate Magic Fill and Extend into their applications using the Ideogram API.

🎥 Runway Act-One: Turn Video & Voice into Stunning Animations - LINK

  • Act-One is a new tool from Runway for generating expressive character performances inside its Gen-3 Alpha video model. It takes a single driving video of an actor, which can be shot on a simple camera such as a phone, and transfers that performance onto a character supplied as a reference image.

  • Traditional facial animation pipelines rely on motion capture equipment, multiple footage references, and manual face rigging. Act-One skips those steps and is designed to preserve the actor's facial expressions, eye-lines, and timing across characters with different styles and proportions.

  • Act-One is a hosted feature of Runway's platform, not an open-source release. Runway says access is rolling out gradually to users and that the tool ships with content moderation safeguards.

💻 Byte-Sized Buzz

  • 🚗 Ford CEO Reveals He’s Been Cruising in a Xiaomi EV for 6 Months - LINK

  • 🚨 Norway Raises Social Media Age to 15 - LINK

  • ⚠️ Chatbot that caused teen’s suicide is now more dangerous for kids, lawsuit says - LINK

  • 📱 Huawei Ditches Android for Good with Game-Changing HarmonyOS NEXT - LINK

  • 🔒 Google made a watermark for AI images that you can’t edit out - LINK

📚 Must-Reads

📖 Nexus: A Brief History of Information Networks from the Stone Age to AI

In this book, Yuval Noah Harari explores how information networks have shaped human history, from the Stone Age to the AI era. He examines key events like the canonization of the Bible, witch-hunts, and the rise of political systems to show the link between information, truth, and power. The book is relevant today because it tackles pressing issues like misinformation, ecological collapse, and AI’s societal impact, and it urges careful management of AI’s future. Readers appreciate its engaging blend of history, philosophy, and concrete examples, which makes it a great read for anyone interested in how information shapes society.

🛍️ Nerdy Necessities

🛒 LEGO NES Console Set

Give your game room some vintage flair with this LEGO NES console set. With this 2,646-piece set you’ll be able to construct a classic Nintendo console with an accompanying controller and even a bulky old-school box television set.

🧠 Tech Trivia

In what year was the first version of GPT released?

Made it to the end? Awesome! Let’s keep in touch on Twitter.

See you in our next edition!😎

Farhan

We'd love to hear your thoughts on today's email!

Your feedback helps us improve our content
