
Apple shares research stats on how its in-house AI model compares to competition

GPT-4-Turbo holds its own against Apple's LLM

Before WWDC 2024, leaks and rumours suggested that Cupertino-based Apple would employ OpenAI’s models to power AI-enabled features on the iPhone. Although Apple will integrate OpenAI’s ChatGPT, the model won’t power all of Apple’s AI features, and Siri will access it only when needed. “Siri can tap into ChatGPT’s expertise when helpful. Users are asked before any questions are sent to ChatGPT, along with any documents or photos, and Siri then presents the answer directly,” according to Apple.

AI (read: Apple Intelligence) will primarily be powered by Apple’s own LLMs. It will be available on iOS 18, iPadOS 18, and macOS Sequoia, but only on devices with an A17 Pro chip or any M-series chip. Compatible devices will process most requests on-device and send information to Apple’s data centres only when a request can’t be completed on-device.

Now, in a new blog post, via Android Authority, Apple has shared more details on its in-house generative models and offered insight into how they compare to some other widely used models. Apple Intelligence was compared with models like Mistral-7B, Microsoft’s Phi-3-mini, Google’s Gemma-7B and Gemma-2B, OpenAI’s GPT-4-Turbo, and more.

In terms of ‘Writing Benchmarks,’ Apple’s AI performed better than Mistral-7B, Gemma-7B, Phi-3-mini and Gemma-2B for on-device summarization and on-device text composition. When compared on servers, the model performed better than GPT-4-Turbo, Mixtral-8x22B, DBRX-Instruct and GPT-3.5-Turbo for text summarization, but fell short of GPT-4-Turbo in composition. See the image below for reference:

Image credit: Apple

Similarly, Apple Intelligence was the best of the bunch for on-device instruction following. Again, it fell short of GPT-4-Turbo, by a small percentage, when compared on servers.

When judged by humans for preferred responses, Apple’s model tied with GPT-3.5-Turbo, offering a better response roughly 50 percent of the time. On the other hand, it fell short of GPT-4-Turbo, offering a better response only 28.5 percent of the time. You can find other comparison stats here.

It’s worth noting that there is no way for users to put these claims to the test. Although Apple revealed a slate of AI features at WWDC, none of them are available for users to try out at the moment. I am currently running the first iOS 18 Developer Beta, and none of the AI features are available on it. It is likely that these features will roll out with the iOS 18 stable build later this fall.

That being said, several non-AI features shown at WWDC are available to try out with the beta. Find out all the features that are available now with the first iOS 18 Developer Beta here.

Header image credit: Shutterstock

Source: Apple, Via: Android Authority

