Your New AI Co-Pilot: Anthropic’s Claude 3.5 Sonnet Learns to Navigate Computers

Anthropic has launched new versions of its Claude AI models with enhanced capabilities, including a groundbreaking Computer Use feature that lets the AI operate a computer the way a person does.

TLDR:

  • Anthropic released an upgraded Claude 3.5 Sonnet and a new model, Claude 3.5 Haiku, both with enhanced capabilities
  • Introduced “Computer Use” feature allowing AI to interact with computers like humans (in public beta)
  • Claude 3.5 Sonnet shows major improvements in coding, scoring 49% on SWE-bench Verified
  • Early access partners include Amazon, Asana, Canva, and Notion
  • Computer Use capability enables tasks like booking flights, scheduling, form filling, and research

Anthropic announced an update to its artificial intelligence platform on Tuesday, October 22, 2024, marking a major advancement in its capabilities.

The company introduced an upgraded version of Claude 3.5 Sonnet and a new model called Claude 3.5 Haiku, alongside a groundbreaking feature that allows its AI to control computers like human users.

The new Computer Use capability enables Claude to perform tasks such as interpreting screen content, selecting buttons, entering text, navigating websites, and executing complex operations through various software applications. This feature is currently available in public beta for developers, with plans to expand access to consumers and enterprise clients in early 2025.

Several major companies have already gained early access to the technology. Amazon served as an initial tester, while other early adopters include Asana, Canva, and Notion. These companies have been exploring the potential of the new feature since early 2024.

The upgraded Claude 3.5 Sonnet has demonstrated notable improvements in coding capabilities. The model achieved a 49% score on SWE-bench Verified, surpassing other publicly available models in the market. This represents a significant increase from its previous performance of 33.4%.

Jared Kaplan, Anthropic’s chief science officer, explained that the system can handle tasks requiring “tens or even hundreds of steps.”

The technology aims to assist users with practical applications such as booking flights, scheduling appointments, filling out forms, conducting online research, and filing expense reports.

The new Claude 3.5 Haiku model matches the performance of Claude 3 Opus, Anthropic’s previous largest model, on many evaluations, while keeping a speed and cost similar to the previous generation of Haiku. It scored 40.6% on SWE-bench Verified, outperforming many competing models, including GPT-4o and earlier versions of Claude.

Early customer feedback has been positive. GitLab reported up to 10% stronger reasoning across use cases with no added latency. The Browser Company noted that Claude 3.5 Sonnet outperformed all previously tested models in their evaluation process.

Safety measures have been implemented alongside these technological advances. The US AI Safety Institute and the UK AI Safety Institute conducted joint pre-deployment testing of the new Claude 3.5 Sonnet model. Anthropic has also developed new classifiers to identify computer use activity and potentially harmful applications.

The Computer Use feature operates through an API that allows Claude to perceive and interact with computer interfaces. On OSWorld, which evaluates AI models’ ability to use computers like humans, Claude 3.5 Sonnet achieved a 14.9% score in the screenshot-only category, exceeding the next-best AI system’s score of 7.8%.
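The perceive-and-interact cycle described above can be pictured as a simple loop: capture the screen, ask the model for the next action, carry it out, and repeat. The Python sketch below is only an illustration of that idea under stated assumptions. It does not call Anthropic's API, and every name in it (`Action`, `FakeScreen`, `scripted_model`, `run_loop`) is hypothetical, with a scripted function standing in for Claude and a toy object standing in for a real desktop.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One step requested by the model (hypothetical shape, not the real API)."""
    kind: str          # "type", "click", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeScreen:
    """Toy stand-in for a desktop: one text field and one submit button."""
    field_text: str = ""
    submitted: bool = False
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real system would return an image; a string is used here for simplicity.
        return f"field={self.field_text!r} submitted={self.submitted}"

    def apply(self, action: Action) -> None:
        # Record the action, then update the toy screen state.
        self.log.append(action.kind)
        if action.kind == "type":
            self.field_text += action.text
        elif action.kind == "click":
            self.submitted = True


def scripted_model(observation: str) -> Action:
    """Assumption: a hard-coded policy standing in for the model's decisions."""
    if "field=''" in observation:
        return Action("type", text="hello")
    if "submitted=False" in observation:
        return Action("click", x=100, y=200)
    return Action("done")


def run_loop(screen: FakeScreen,
             model: Callable[[str], Action],
             max_steps: int = 10) -> int:
    """Observe, decide, act; stop on "done" or when the step cap is reached."""
    for step in range(max_steps):
        action = model(screen.screenshot())
        if action.kind == "done":
            return step
        screen.apply(action)
    return max_steps


screen = FakeScreen()
steps_taken = run_loop(screen, scripted_model)
print(steps_taken, screen.log)
```

The `max_steps` cap is a design choice made for this sketch: because a single task may take many steps, bounding the loop keeps a confused run from continuing indefinitely.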

Current limitations of the system include challenges with common actions like scrolling, dragging, and zooming. Anthropic acknowledges these imperfections and advises developers to begin with low-risk tasks during the beta phase.

The release comes as part of a broader competition in the AI industry. Major tech companies including Google, Amazon, Microsoft, and Meta are all working to advance their AI capabilities in a market projected to exceed $1 trillion in revenue within the next decade.

Anthropic has seen significant growth since releasing the first version of Claude in March 2023. The company has expanded its offerings to include iOS and Android apps, a Team plan for businesses, and recently launched Claude Enterprise, designed for business integration.

The Computer Use capability will be available across multiple platforms, including the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Claude 3.5 Haiku will be released later this month, initially as a text-only model with image input capabilities to follow.

The upgraded Claude 3.5 Sonnet is now available to all users at the same price and speed as its predecessor. Developers can access the computer use beta through various cloud platforms, marking the beginning of a new phase in AI-human computer interaction.

Oliver Dale is Editor-in-Chief of Circlo and founder of Kooc Media Ltd, a UK-based online publishing company. He is a technology entrepreneur with over 15 years of professional experience in investing and UK business. His writing has been quoted by Nasdaq, Dow Jones, Investopedia, The New Yorker, Forbes, TechCrunch, and more. oliver@circlo.io