Core AI
Core AI (iOS 27) is the on-device inference framework that powers Apple Intelligence — now open to your apps. It is the modern path for running your own advanced models (large language models, vision transformers like SAM 3) locally across CPU, GPU, and Neural Engine, with no server and no per-token cost. It spans a Python conversion/optimization toolchain, a .aimodel format, a memory-safe Swift runtime, and developer tools (ahead-of-time compilation, Core AI Instruments, the Core AI Debugger).
This skill is part of the axiom-ai suite. For Apple's built-in LLM (no model to ship), see Foundation Models. For classic Core ML models, see iOS ML.
When to Use
Use this skill when you're:
- Bringing a PyTorch LLM, segmentation model, or custom transformer on-device via Core AI (the 27-cycle path)
- Loading and running a .aimodel from Swift (AIModel, InferenceFunction, NDArray)
- Fixing a transformer decode loop that slows down over time (KV-cache via Core AI states)
- Diagnosing first-launch stalls caused by model specialization, or planning model download and caching
- Backing a Foundation Models LanguageModelSession with your own custom model
Core AI vs Core ML vs Foundation Models
| What you're doing | Skill |
|---|---|
| Run Apple's built-in LLM (@Generable, no model to ship) | Foundation Models |
| Bring an LLM-scale / transformer model on-device (27-cycle) | This skill (Core AI) |
| Convert/compress a classic Core ML model (.mlpackage) | iOS ML |
| Back a LanguageModelSession with your own model | This skill (the Foundation Models bridge) |
Rule of thumb — Core ML is the established path for classic models; Core AI is the 27-cycle path built for modern/LLM-scale workloads and deep customization (custom Metal kernels, multi-function assets, ahead-of-time compilation). Both convert from PyTorch.
Example Prompts
Questions you can ask Claude that will draw from this skill:
- "How do I convert my PyTorch LLM to Core AI and run it from Swift?"
- "My Core AI model freezes the app on first launch — how do I handle specialization?"
- "How do I add a KV-cache to my Core AI transformer so it stops slowing down?"
- "How do I share a specialized model cache between my app and its extension?"
- "How do I back a LanguageModelSession with my own model?"
- "Should I bundle my 1 GB model in the app or download it?"
- "What's the difference between Core AI and Core ML?"
What This Skill Provides
- The deployment lifecycle – convert (coreai-torch), optimize (coreai-opt), debug (Core AI Debugger), integrate (Swift CoreAI framework), deploy (specialization, caching, ahead-of-time compilation)
- The Swift runtime API – AIModel, InferenceFunction, NDArray and its scalar types, KV-cache states, AIModelAsset inspection (all SDK-verified and compile-checked against Xcode 27)
- Specialization & caching discipline – why first-load is slow, why to keep it out of interactive flows, AIModelCache (including app-group sharing), and ahead-of-time compilation with coreai-build
- The Foundation Models bridge – using CoreAILanguageModel (from the open-source coreai-models Swift package) to reuse respond / @Generable / streaming with your own model
- The developer tools – Core AI Instruments, the Core AI debug gauge, and the standalone Core AI Debugger
Related
- iOS ML – classic Core ML conversion, compression, and deployment; the boundary with Core AI
- foundation-models-ref – the LanguageModel protocol and Ecosystem section that the Core AI bridge plugs into
- Metal Migration – writing the custom Metal kernels (TorchMetalKernel, MTLTensor) that Core AI embeds in a model
- Background Assets – delivering large models on demand instead of bundling them
Resources
WWDC: 2026-324, 2026-325, 2026-326, 2026-330
Docs: /CoreAI, /CoreAI/compiling-core-ai-models-ahead-of-time, /CoreAI/managing-model-specialization-and-caching