Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

All subtopics
Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

A Summary of the WWDC25 Group Lab - Machine Learning and AI Frameworks
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Machine Learning and AI Frameworks. What are you most excited about in the Foundation Models framework? The Foundation Models framework provides access to an on-device Large Language Model (LLM), enabling entirely on-device processing for intelligent features. This allows you to build features such as personalized search suggestions and dynamic NPC generation in games. The combination of guided generation and streaming capabilities is particularly exciting for creating delightful animations and features with reliable output. The seamless integration with SwiftUI and the new design material Liquid Glass is also a major advantage. When should I still bring my own LLM via CoreML? It's generally recommended to first explore Apple's built-in system models and APIs, including the Foundation Models framework, as they are highly optimized for Apple devices and cover a wide range of use cases. However, Core ML is still valuable if you need more control or choice over the specific model being deployed, such as customizing existing system models or augmenting prompts. Core ML provides the tools to get these models on-device, but you are responsible for model distribution and updates. Should I migrate PyTorch code to MLX? MLX is an open-source, general-purpose machine learning framework designed for Apple Silicon from the ground up. It offers a familiar API, similar to PyTorch, and supports C, C++, Python, and Swift. MLX emphasizes unified memory, a key feature of Apple Silicon hardware, which can improve performance. It's recommended to try MLX and see if its programming model and features better suit your application's needs. MLX shines when working with state-of-the-art, larger models. Can I test Foundation Models in Xcode simulator or device? Yes, you can use the Xcode simulator to test Foundation Models use cases. However, your Mac must be running macOS Tahoe. You can test on a physical iPhone running iOS 18 by connecting it to your Mac and running Playgrounds or live previews directly on the device. Which on-device models will be supported? any open source models? The Foundation Models framework currently supports Apple's first-party models only. This allows for platform-wide optimizations, improving battery life and reducing latency. While Core ML can be used to integrate open-source models, it's generally recommended to first explore the built-in system models and APIs provided by Apple, including those in the Vision, Natural Language, and Speech frameworks, as they are highly optimized for Apple devices. For frontier models, MLX can run very large models. How often will the Foundational Model be updated? How do we test for stability when the model is updated? The Foundation Model will be updated in sync with operating system updates. You can test your app against new model versions during the beta period by downloading the beta OS and running your app. It is highly recommended to create an "eval set" of golden prompts and responses to evaluate the performance of your features as the model changes or as you tweak your prompts. Report any unsatisfactory or satisfactory cases using Feedback Assistant. Which on-device model/API can I use to extract text data from images such as: nutrition labels, ingredient lists, cashier receipts, etc? Thank you. The Vision framework offers the RecognizeDocumentRequest which is specifically designed for these use cases. It not only recognizes text in images but also provides the structure of the document, such as rows in a receipt or the layout of a nutrition label. It can also identify data like phone numbers, addresses, and prices. What is the context window for the model? What are max tokens in and max tokens out? The context window for the Foundation Model is 4,096 tokens. The split between input and output tokens is flexible. For example, if you input 4,000 tokens, you'll have 96 tokens remaining for the output. The API takes in text, converting it to tokens under the hood. When estimating token count, a good rule of thumb is 3-4 characters per token for languages like English, and 1 character per token for languages like Japanese or Chinese. Handle potential errors gracefully by asking for shorter prompts or starting a new session if the token limit is exceeded. Is there a rate limit for Foundation Models API that is limited by power or temperature condition on the iPhone? Yes, there are rate limits, particularly when your app is in the background. A budget is allocated for background app usage, but exceeding it will result in rate-limiting errors. In the foreground, there is no rate limit unless the device is under heavy load (e.g., camera open, game mode). The system dynamically balances performance, battery life, and thermal conditions, which can affect the token throughput. Use appropriate quality of service settings for your tasks (e.g., background priority for background work) to help the system manage resources effectively. Do the foundation models support languages other than English? Yes, the on-device Foundation Model is multilingual and supports all languages supported by Apple Intelligence. To get the model to output in a specific language, prompt it with instructions indicating the user's preferred language using the locale API (e.g., "The user's preferred language is en-US"). Putting the instructions in English, but then putting the user prompt in the desired output language is a recommended practice. Are larger server-based models available through Foundation Models? No, the Foundation Models API currently only provides access to the on-device Large Language Model at the core of Apple Intelligence. It does not support server-side models. On-device models are preferred for privacy and for performance reasons. Is it possible to run Retrieval-Augmented Generation (RAG) using the Foundation Models framework? Yes, it is possible to run RAG on-device, but the Foundation Models framework does not include a built-in embedding model. You'll need to use a separate database to store vectors and implement nearest neighbor or cosine distance searches. The Natural Language framework offers simple word and sentence embeddings that can be used. Consider using a combination of Foundation Models and Core ML, using Core ML for your embedding model.
1
0
424
2w
Foundation Models performance reality check - anyone else finding it slow?
Testing Foundation Models framework with a health-focused recipe generation app. The on-device approach is appealing but performance is rough. Taking 20+ seconds just to get recipe name and description. Same content from Claude API: 4 seconds. I know it's beta and on-device has different tradeoffs, but this is approaching unusable territory for real-time user experience. The streaming helps psychologically but doesn't mask the underlying latency.The privacy/cost benefits are compelling but not if users abandon the feature before it completes. Anyone else seeing similar performance? Is this expected for beta, or are there optimization techniques I'm missing?
4
0
183
5h
"FoundationModels GenerationError error 2" on iOS 26 beta 3
Hi all, I'm working on an app that utilizes the FoundationModels found in iOS 26. I updated my phone to iOS 26 beta 3 and am now receiving the following error when trying to run code that worked in beta 2: Al Error: The operation couldn't be completed. (FoundationModels.LanguageModelSession.Genera- tionError error 2.) I admit I'm a bit of a new developer, but any idea if this is an issue with beta 3 or work that I'll need to do to adapt my code to some changes in the AI API? Thank you!
20
6
932
11h
Rate limit exceeded when using Foundation Model framework
When I use the FoundationModel framework to generate long text, it will always hit an error. "Passing along Client rate limit exceeded, try again later in response to ExecuteRequest" And stop generating. eg. for the prompt "Write a long story", it will almost certainly hit that error after 17 seconds of generation. do{ let session = LanguageModelSession() let prompt: String = "Write a long story" let response = try await session.respond(to: prompt) }catch{} If possible, I want to know how to prevent that error or at least how to handle it.
0
0
169
21h
The answer of "apple" goes to guardrailViolation?
I have been using "apple" to test foundation models. I thought this is local, but today the answer changed - half way through explanation, suddenly guardrailViolation error was activated! And yesterday, all reference to "Apple II", "Apple III" now refers me to consult apple.com! Does foundation models connect to Internet for answer? Using beta 3.
2
0
106
23h
Using Past Versions of Foundation Models As They Progress
Has Apple made any commitment to versioning the Foundation Models on device? What if you build a feature that works great on 26.0 but they change the model or guardrails in 26.1 and it breaks your feature, is your only recourse filing Feedback or pulling the feature from the app? Will there be a way to specify a model version like in all of the server based LLM provider APIs? If not, sounds risky to build on.
6
0
195
1d
Can I give additional context to Foundation Models?
I'm interested in using Foundation Models to act as an AI support agent for our extensive in-app documentation. We have many pages of in-app documents, which the user can currently search, but it would be great to use Foundation Models to let the user get answers to arbitrary questions. Is this possible with the current version of Foundation Models? It seems like the way to add new context to the model is with the instructions parameter on LanguageModelSession. As I understand it, the combined instructions and prompt need to consume less than 4096 tokens. That definitely wouldn't be enough for the amount of documentation I want the agent to be able to refer to. Is there another way of doing this, maybe as a series of recursive queries? If there is a solution based on multiple queries, should I expect this to be fast enough for interactive use?
4
0
196
1d
Unavailable error is wrong?
This is my code: witch SystemLanguageModel.default.availability { case .available: ContentView() .popover(isPresented: $showSettings) { SettingsView().presentationCompactAdaptation(.popover) } case .unavailable(.modelNotReady): ContentUnavailableView("Apple Intelligence is unavailable", systemImage: "apple.intelligence.badge.xmark", description: Text("Please come back later.")) case .unavailable(.appleIntelligenceNotEnabled): ContentUnavailableView("Apple Intelligence is unavailable", systemImage: "apple.intelligence.badge.xmark", description: Text("Please turn on Apple Intelligence.")) case .unavailable(.deviceNotEligible): ContentUnavailableView("Apple Intelligence is unavailable", systemImage: "apple.intelligence.badge.xmark", description: Text("This device is not eligible for Apple Intelligence.")) case .unavailable: ContentUnavailableView("Apple Intelligence is unavailable", systemImage: "apple.intelligence.badge.xmark") } When I switch off Apple Intelligence, I expected "Please turn on Apple Intelligence.", but instead I get "Please come back later." This seems to be wrong error?
1
0
199
1d
Selecting an output language with Foundation Models
When using Foundation Models, is it possible to ask the model to produce output in a specific language, apart from giving an instruction like "Provide answers in ." ? (I tried that and it kind of worked, but it seems fragile.) I haven't noticed an API to do so and have a use-case where the output should be in a user-selectable language that is not the current system language.
1
0
310
1d
AttributedString in App Intents
In this WWDC25 session, it is explictely mentioned that apps should support AttributedString for text parameters to their App Intents. However, I have not gotten this to work. Whenever I pass rich text (either generated by the new "Use Model" intent or generated manually for example using "Make Rich Text from Markdown"), my Intent gets an AttributedString with the correct characters, but with all attributes stripped (so in effect just plain text). struct TestIntent: AppIntent { static var title = LocalizedStringResource(stringLiteral: "Test Intent") static var description = IntentDescription("Tests Attributed Strings in Intent Parameters.") @Parameter var text: AttributedString func perform() async throws -> some IntentResult & ReturnsValue<AttributedString> { return .result(value: text) } } Is there anything else I am missing?
0
0
138
2d
Model Guardrails Too Restrictive?
I'm experimenting with using the Foundation Models framework to do news summarization in an RSS app but I'm finding that a lot of articles are getting kicked back with a vague message about guardrails. This seems really common with political news but we're talking mainstream stuff, i.e. Politico, etc. If the models are this restrictive, this will be tough to use. Is this intended? FB17904424
5
3
207
2d
Dynamically Create Tool Argument Type
According to the Tool documentation, the arguments to the tool are specified as a static struct type T, which is given to tool.call(argument: T) However, if the arguments are not known until runtime, is it possible to still create a Tool object with the proper parameters? Let's say a JSON-style dictionary is passed into the Tool init function to specify T, is this achievable?
1
0
295
3d
ModelManager received unentitled request. Expected entitlement com.apple.modelmanager.inference
Just tried to write a very simple test of using foundation models, but it gave me the error like this "ModelManager received unentitled request. Expected entitlement com.apple.modelmanager.inference establishment of session failed with Missing entitlement: com.apple.modelmanager.inference" The simple code is listed below: let session: LanguageModelSession = LanguageModelSession() let response = try? await session.respond(to: "What is the capital of France?") print("Response: (response)") So what's the problem of this one?
2
0
189
3d
FoundationModels guardrailViolation on Beta 3
Hello everybody! I’m encountering an unexpected guardrailViolation error when using Foundation Models on macOS Beta 3 (Tahoe) with an Apple M2 Pro chip. This issue didn’t occur on Beta 1 or Beta 2 using the same codebase. Reproduction Context I’m developing an app that leverages Foundation Models for structured generation, paired with a local database tool. After upgrading to macOS Beta 3, I started receiving this error consistently, despite no changes in the generation logic. To isolate the issue, I opened the official WWDC sample project from the Adding intelligent app features with generative models and the same guardrailViolation error appeared without any modifications. Simplified Working Example I attempted to narrow down the issue by starting with a minimal prompt structure. This basic case works fine: import Foundation import Playgrounds import FoundationModels @Generable struct GeneableLandmark { @Guide(description: "Name of the landmark to visit") var name: String } final class LandmarkSuggestionGenerator { var landmarkSuggestion: GeneableLandmark.PartiallyGenerated? private var session: LanguageModelSession init(){ self.session = LanguageModelSession( instructions: Instructions { """ generate a list of landmarks to visit """ } ) } func createLandmarkSuggestion(location: String) async throws { let stream = session.streamResponse( generating: GeneableLandmark.self, options: GenerationOptions(sampling: .greedy), includeSchemaInPrompt: false ) { """ Generate a list of landmarks to viist in \(location) """ } for try await partialResponse in stream { landmarkSuggestion = partialResponse } } } #Playground { let generator = LandmarkSuggestionGenerator() Task { do { try await generator.createLandmarkSuggestion(location: "New york") if let suggestion = generator.landmarkSuggestion { print("Suggested landmark: \(suggestion)") } else { print("No suggestion generated.") } } catch { print("Error generating landmark suggestion: \(error)") } } } But as soon as I use the Sample ItineraryPlanner: #Playground { // Example landmark for demonstration let exampleLandmark = Landmark( id: 1, name: "San Francisco", continent: "North America", description: "A vibrant city by the bay known for the Golden Gate Bridge.", shortDescription: "Iconic Californian city.", latitude: 37.7749, longitude: -122.4194, span: 0.2, placeID: nil ) let planner = ItineraryPlanner(landmark: exampleLandmark) Task { do { try await planner.suggestItinerary(dayCount: 3) if let itinerary = planner.itinerary { print("Suggested itinerary: \(itinerary)") } else { print("No itinerary generated.") } } catch { print("Error generating itinerary: \(error)") } } } The error pops up: Multiline Error generating itinerary: guardrailViolation(FoundationModels.LanguageModelSession. >GenerationError.Context(debug Description: "May contain sensitive or unsafe content", >underlyingErrors: [FoundationModels. LanguageModelSession. Gene >rationError.guardrailViolation(FoundationMo dels. >LanguageModelSession.GenerationError.C ontext (debugDescription: >"May contain unsafe content", underlyingErrors: []))])) Based on my tests: The error may not be tied to structure complexity (since more nested structures work) The issue may stem from the tools or prompt content used inside the ItineraryPlanner The guardrail sensitivity may have increased or changed in Beta 3, affecting models that worked in earlier betas Thank you in advance for your help. Let me know if more details or reproducible code samples are needed - I’m happy to provide them. Best, Sasha Morozov
2
1
267
4d
WWDC25 combining metal and ML
WWDC25: Combine Metal 4 machine learning and graphics Demonstrated a way to combine neural network in the graphics pipeline directly through the shaders, using an example of Texture Compression. However there is no mention of using which ML technique texture is compressed. Can anyone point me to some well known model/s for this particular use case shown in WWDC25.
2
0
291
4d
Is it allowed for an iOS app to download machine learning model files (e.g., .mlmodel, .onnx) from a separate cloud server?
Hello, I am developing an iOS app that uses machine learning models. To improve accuracy and user experience, I would like to download .mlmodel files (compiled and compressed as zip files) from our own server after the app is installed, and use them for inference within the app. No executable code, scripts, or dynamic libraries will be downloaded—only model data files are used. According to App Store Review Guideline 2.5.2, I understand that apps may not download or execute code which introduces or changes features or functionality. In this case, are compiled and zip-compressed .mlmodel files considered "data" rather than "code", and is it allowed to download and use them in the app? If there are any restrictions or best practices related to this, please let me know. Thank you.
1
0
261
4d