apfel

Project Url: Arthur-Ficial/apfel
Introduction: Apple Intelligence from the command line. On-device LLM via FoundationModels framework. No API keys, no cloud, no dependencies.

Version 0.9.2 | Swift 6.3+ | macOS 26+ | No Xcode Required | License: MIT | 100% On-Device

Use the FREE local Apple Intelligence LLM on your Mac - your model, your machine, your way.

No API keys. No cloud. No subscriptions. No per-token billing. The AI is already on your computer - apfel lets you use it.

What is this

Every Mac with Apple Silicon has a built-in LLM - Apple's on-device foundation model, shipped as part of Apple Intelligence. Apple provides the FoundationModels framework (macOS 26+) to access it, but only exposes it through Siri and system features. apfel wraps it in a CLI and an HTTP server - so you can actually use it. All inference runs on-device, no network calls.

  • UNIX tool - echo "summarize this" | apfel - pipe-friendly, file attachments, JSON output, exit codes
  • OpenAI-compatible server - apfel --serve - drop-in replacement at localhost:11434, works with any OpenAI SDK
  • Tool calling - function calling with schema conversion, full round-trip support
  • Zero cost - no API keys, no cloud, no subscriptions, 4096-token context window

apfel CLI

Requirements & Install

  • Apple Silicon Mac, macOS 26 Tahoe or newer, Apple Intelligence enabled
  • Building from source requires Command Line Tools with macOS 26.4 SDK (ships Swift 6.3). No Xcode required.

Homebrew (recommended):

brew tap Arthur-Ficial/tap
brew install apfel

Update:

brew upgrade apfel

Build from source:

git clone https://github.com/Arthur-Ficial/apfel.git
cd apfel
make install

Troubleshooting: docs/install.md

Quick Start

UNIX tool

Shell note: if your prompt contains !, prefer single quotes in zsh/bash so history expansion does not break copy-paste. Example: apfel 'Hello, Mac!'

# Single prompt
apfel "What is the capital of Austria?"

# Permissive mode -- reduces guardrail false positives for creative/long prompts
apfel --permissive "Write a dramatic opening for a thriller novel"

# Stream output
apfel --stream "Write a haiku about code"

# Pipe input
echo "Summarize: $(cat README.md)" | apfel

# Attach file content to prompt
apfel -f README.md "Summarize this project"

# Attach multiple files
apfel -f old.swift -f new.swift "What changed between these two files?"

# Combine files with piped input
git diff HEAD~1 | apfel -f CONVENTIONS.md "Review this diff against our conventions"

# JSON output for scripting
apfel -o json "Translate to German: hello" | jq .content

# System prompt
apfel -s "You are a pirate" "What is recursion?"

# System prompt from file
apfel --system-file persona.txt "Explain TCP/IP"

# Quiet mode for shell scripts
result=$(apfel -q "Capital of France? One word.")
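The `-o json` output can also be consumed without jq. A minimal Python sketch, assuming the JSON object exposes a `content` field (as the `jq .content` example above implies); the `ask()` wrapper requires apfel on PATH and is shown for shape only:

```python
import json
import subprocess

def extract_content(json_text):
    """Pull the response text out of apfel's -o json output.
    Assumes a top-level `content` field, as `jq .content` implies."""
    return json.loads(json_text)["content"]

def ask(prompt):
    """Run apfel quietly and return just the response text.
    Requires apfel on PATH; defined here but not executed."""
    out = subprocess.run(
        ["apfel", "-q", "-o", "json", prompt],
        capture_output=True, text=True, check=True,
    ).stdout
    return extract_content(out)
```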

OpenAI-compatible server

Start the server:

apfel --serve

Then in another terminal:

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"apple-foundationmodel","messages":[{"role":"user","content":"Hello"}]}'

Works with the official Python client:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
resp = client.chat.completions.create(
    model="apple-foundationmodel",
    messages=[{"role": "user", "content": "What is 1+1?"}],
)
print(resp.choices[0].message.content)

Interactive chat

apfel --chat
apfel --chat -s "You are a helpful coding assistant"
apfel --chat --mcp ./mcp/calculator/server.py      # chat with MCP tools
apfel --chat --debug                                # debug output to stderr

Ctrl-C exits cleanly. Context window is managed automatically with configurable strategies:

apfel --chat --context-strategy newest-first     # default: keep recent turns
apfel --chat --context-strategy oldest-first     # keep earliest turns
apfel --chat --context-strategy sliding-window --context-max-turns 6
apfel --chat --context-strategy summarize        # compress old turns via on-device model
apfel --chat --context-strategy strict           # error on overflow, no trimming
apfel --chat --context-output-reserve 256        # custom output token reserve
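The strategies above can be sketched in a few lines. This is an illustrative model of the default newest-first behavior, not apfel's actual implementation; the whitespace-based token count is a crude stand-in for the real TokenCounter:

```python
def trim_newest_first(turns, context_window=4096, output_reserve=512,
                      count_tokens=lambda t: len(t.split())):
    """Keep the most recent turns that fit the input budget.

    Illustrative sketch of a newest-first strategy: the budget is the
    context window minus the tokens reserved for the model's reply.
    """
    budget = context_window - output_reserve
    kept = []
    for turn in reversed(turns):       # walk history newest to oldest
        cost = count_tokens(turn)
        if cost > budget:
            break                      # oldest turns fall off first
        budget -= cost
        kept.append(turn)
    return list(reversed(kept))        # restore chronological order
```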

Updating

apfel --update

Or directly via Homebrew: brew upgrade apfel

Benchmarking

Measure internal overhead on the installed release binary:

apfel --benchmark -o json
make benchmark

File attachments (-f/--file)

Attach files to any prompt with -f (repeatable). Contents are prepended to your prompt.

apfel -f main.swift "Explain what this code does"
apfel -f before.txt -f after.txt "What are the differences?"
git diff HEAD~1 | apfel -f style-guide.md "Any style violations in this diff?"
apfel -f data.csv -o json "Extract the top 5 rows" | jq .content

Files, stdin, and prompt arguments all compose:

apfel -f poem.txt                                    # file only
apfel -f poem.txt "Translate this to German"          # file + prompt
echo "some text" | apfel "Summarize this"             # stdin + prompt
echo "ctx" | apfel -f code.swift "Explain with context" # all three

Demos

See demo/ for real-world shell scripts powered by apfel.

cmd - natural language to shell command:

demo/cmd "find all .log files modified today"
# $ find . -name "*.log" -type f -mtime -1

demo/cmd -x "show disk usage sorted by size"   # -x = execute after confirm
demo/cmd -c "list open ports"                   # -c = copy to clipboard

Shell function version - add to your .zshrc and use cmd from anywhere:

# cmd - natural language to shell command (apfel). Add to .zshrc:
cmd() {
  local x c r a
  while [[ $1 == -* ]]; do
    case $1 in
      -x) x=1; shift ;;
      -c) c=1; shift ;;
      *) break ;;
    esac
  done
  r=$(apfel -q -s 'Output only a shell command.' "$*" \
      | sed '/^```/d;/^#/d;s/\x1b\[[0-9;]*[a-zA-Z]//g;s/^[[:space:]]*//;/^$/d' \
      | head -1)
  [[ $r ]] || { echo "no command generated"; return 1; }
  printf '\e[32m$\e[0m %s\n' "$r"
  [[ $c ]] && printf %s "$r" | pbcopy && echo "(copied)"
  [[ $x ]] && { printf 'Run? [y/N] '; read -r a; [[ $a == y ]] && eval "$r"; }
  return 0
}
cmd find all swift files larger than 1MB     # shows: $ find . -name "*.swift" -size +1M
cmd -c show disk usage sorted by size        # shows command + copies to clipboard
cmd -x what process is using port 3000       # shows command + asks to run it
cmd list all git branches merged into main
cmd count lines of code by language

oneliner - complex pipe chains from plain English:

demo/oneliner "sum the third column of a CSV"
# $ awk -F',' '{sum += $3} END {print sum}' file.csv

demo/oneliner "count unique IPs in access.log"
# $ awk '{print $1}' access.log | sort | uniq -c | sort -rn

mac-narrator - your Mac's inner monologue:

demo/mac-narrator              # one-shot: what's happening right now?
demo/mac-narrator --watch      # continuous narration every 60s

Also in demo/:

  • wtd - "what's this directory?" - instant project orientation
  • explain - explain a command, error, or code snippet
  • naming - naming suggestions for functions, variables, files
  • port - what's using this port?
  • gitsum - summarize recent git activity

Debug GUI

brew install Arthur-Ficial/tap/apfel-gui

Native SwiftUI debug inspector with request timeline, MCP protocol viewer, chat, TTS/STT - all on-device. See the apfel-gui repo.

MCP Tool Support

Attach MCP tool servers with --mcp. apfel discovers tools, executes them automatically, and returns the final answer. No glue code needed.

apfel --mcp ./mcp/calculator/server.py "What is 15 times 27?"
mcp: ./mcp/calculator/server.py - add, subtract, multiply, divide, sqrt, power    ← stderr
tool: multiply({"a": 15, "b": 27}) = 405                                          ← stderr
15 times 27 is 405. ← stdout

Tool info goes to stderr; only the answer goes to stdout. Use -q to suppress tool info.

apfel --mcp ./server_a.py --mcp ./server_b.py "Use both tools"  # multiple servers
apfel --serve --mcp ./mcp/calculator/server.py                   # server mode
apfel --chat --mcp ./mcp/calculator/server.py                    # chat mode

Ships with a calculator MCP server at mcp/calculator/. See MCP docs for details.

OpenAI API Compatibility

Base URL: http://localhost:11434/v1

Feature                          Status      Notes
POST /v1/chat/completions        Supported   Streaming + non-streaming
GET /v1/models                   Supported   Returns apple-foundationmodel
GET /health                      Supported   Model availability, context window, languages
GET /v1/logs, /v1/logs/stats     Debug only  Requires --debug
Tool calling                     Supported   Native ToolDefinition + JSON detection. See Tool Calling Guide
response_format: json_object     Supported   Via system prompt injection
temperature, max_tokens, seed    Supported   Mapped to GenerationOptions
stream: true                     Supported   SSE with usage stats in final chunk
finish_reason                    Supported   stop, tool_calls, length
Context strategies               Supported   x_context_strategy, x_context_max_turns, x_context_output_reserve extension fields
CORS                             Supported   Enable with --cors
POST /v1/completions             501         Legacy text completions not supported
POST /v1/embeddings              501         Embeddings not available on-device
logprobs=true, n>1, stop,        400         Rejected explicitly; n=1 and logprobs=false are accepted as no-ops
  presence_penalty, frequency_penalty
Multi-modal (images)             400         Rejected with clear error
Authorization header             Supported   Required when --token is set. See Server Security
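Because the server speaks the OpenAI chat-completions dialect, any request body following that schema works. A minimal sketch that builds such a payload, including a function tool in OpenAI's standard shape (apfel converts this schema to a native ToolDefinition); the `multiply` tool is a hypothetical example, and this code only constructs the JSON, it does not send it:

```python
import json

def chat_payload(user_text, stream=False, tools=None, **extra):
    """Build a /v1/chat/completions request body for apfel's server.

    Field names follow the OpenAI chat-completions schema; the
    x_context_* extension fields are apfel-specific.
    """
    body = {
        "model": "apple-foundationmodel",
        "messages": [{"role": "user", "content": user_text}],
        "stream": stream,
    }
    if tools:
        body["tools"] = tools
    body.update(extra)  # e.g. temperature, max_tokens, x_context_strategy
    return json.dumps(body)

# A function tool in OpenAI's standard shape (hypothetical example).
multiply_tool = {
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "Multiply two numbers",
        "parameters": {
            "type": "object",
            "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
            "required": ["a", "b"],
        },
    },
}

payload = chat_payload("What is 15 times 27?", tools=[multiply_tool],
                       x_context_strategy="newest-first")
```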

Full API spec: openai/openai-openapi

Limitations

Constraint               Detail
Context window           4096 tokens (input + output combined)
Platform                 macOS 26+, Apple Silicon only
Model                    One model (apple-foundationmodel), not configurable
Guardrails               Apple's safety system may block benign prompts; use --permissive to reduce false positives. See comparison
Speed                    On-device, not cloud-scale; a few seconds per response
No embeddings / vision   Not available on-device

CLI Reference

MODES
  apfel <prompt>                          Single prompt (default)
  apfel --stream <prompt>                 Stream response tokens
  apfel --chat                            Interactive conversation
  apfel --serve                           Start OpenAI-compatible server
  apfel --benchmark                       Run internal performance benchmarks

INPUT
  -f, --file <path>                       Attach file content (repeatable)
  -s, --system <text>                     Set system prompt
  --system-file <path>                    Read system prompt from file
  --mcp <server.py>                       Attach MCP tool server (repeatable)

OUTPUT
  -o, --output <fmt>                      Output format: plain, json
  -q, --quiet                             Suppress non-essential output
  --no-color                              Disable ANSI colors

MODEL
  --temperature <n>                       Sampling temperature (e.g., 0.7)
  --seed <n>                              Random seed for reproducibility
  --max-tokens <n>                        Maximum response tokens
  --permissive                            Relaxed guardrails (reduces false positives)
  --retry [n]                             Retry transient errors with backoff (default: 3)
  --debug                                 Enable debug logging to stderr (all modes)

CONTEXT (--chat)
  --context-strategy <s>                  newest-first, oldest-first, sliding-window, summarize, strict
  --context-max-turns <n>                 Max history turns (sliding-window only)
  --context-output-reserve <n>            Tokens reserved for output (default: 512)

SERVER (--serve)
  --port <n>                              Server port (default: 11434)
  --host <addr>                           Bind address (default: 127.0.0.1)
  --cors                                  Enable CORS headers
  --allowed-origins <origins>             Comma-separated allowed origins
  --no-origin-check                       Disable origin checking
  --token <secret>                        Require Bearer token auth
  --token-auto                            Generate random Bearer token
  --public-health                         Keep /health unauthenticated
  --footgun                               Disable all protections
  --max-concurrent <n>                    Max concurrent requests (default: 5)

META
  -v, --version                           Print version
  -h, --help                              Show help
  --release                               Detailed build info
  --model-info                            Print model capabilities
  --update                                Check for updates via Homebrew

Examples by flag:

# -f, --file — attach file content to prompt (repeatable)
apfel -f main.swift "Explain this code"
apfel -f before.txt -f after.txt "What changed?"

# -s, --system — set a system prompt
apfel -s "You are a pirate" "What is recursion?"
apfel -s "Reply in JSON only" "List 3 colors"

# --system-file — read system prompt from a file
apfel --system-file persona.txt "Introduce yourself"

# --mcp — attach MCP tool servers (repeatable)
apfel --mcp ./mcp/calculator/server.py "What is 15 times 27?"
apfel --mcp ./calc.py --mcp ./weather.py "Use both tools"

# -o, --output
apfel -o json "Translate to German: hello" | jq .content

# -q, --quiet
apfel -q "Give me a UUID"

# --no-color
NO_COLOR=1 apfel "Hello"

# --temperature
apfel --temperature 0.0 "What is 2+2?"
apfel --temperature 1.5 "Write a wild poem"

# --seed
apfel --seed 42 "Tell me a joke"

# --max-tokens
apfel --max-tokens 50 "Explain quantum computing"

# --permissive — relaxed guardrails (see docs/PERMISSIVE.md for comparison)
apfel --permissive "Write a villain monologue"
apfel --permissive -f long-document.md "Summarize this"

# --retry
apfel --retry "What is 2+2?"

# --debug
apfel --debug "Hello world"

# --stream
apfel --stream "Write a haiku about code"

# --chat
apfel --chat
apfel --chat -s "You are a helpful coding assistant"

# --context-strategy
apfel --chat --context-strategy newest-first      # default
apfel --chat --context-strategy sliding-window --context-max-turns 6
apfel --chat --context-strategy summarize          # compress old turns

# --serve
apfel --serve
apfel --serve --port 3000 --host 0.0.0.0

# --cors, --token, --footgun
apfel --serve --cors
apfel --serve --token "my-secret-token"
apfel --serve --footgun   # only for local development!

# --token-auto, --public-health
apfel --serve --token-auto --host 0.0.0.0 --public-health

# --allowed-origins, --no-origin-check
apfel --serve --allowed-origins "https://myapp.com,https://staging.myapp.com"
apfel --serve --no-origin-check

# --max-concurrent
apfel --serve --max-concurrent 2

# --debug (server: also enables /v1/logs)
apfel --serve --debug

# --context-output-reserve
apfel --chat --context-output-reserve 256

# --benchmark, --model-info, --update, --release, --version, --help
apfel --benchmark -o json | jq '.benchmarks[] | {name, speedup_ratio}'
apfel --model-info
apfel --update
apfel --release
apfel --version
apfel --help

See Server Security for detailed documentation on security options.

Exit Codes

Code   Meaning
0      Success
1      Runtime error
2      Usage error (bad flags)
3      Guardrail blocked
4      Context overflow
5      Model unavailable
6      Rate limited
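These codes make apfel scriptable. A small lookup plus an illustrative retry policy; retrying guardrail blocks with --permissive and rate limits after a backoff is an assumption of this sketch, not apfel's built-in behavior:

```python
# Exit-code meanings from the table above, as a scriptable lookup.
APFEL_EXIT = {
    0: "success",
    1: "runtime error",
    2: "usage error (bad flags)",
    3: "guardrail blocked",
    4: "context overflow",
    5: "model unavailable",
    6: "rate limited",
}

def describe_exit(code):
    """Human-readable meaning for an apfel exit code."""
    return APFEL_EXIT.get(code, f"unknown exit code {code}")

def is_retryable(code):
    """Guardrail blocks (3) may succeed with --permissive; rate limits
    (6) may succeed after a backoff. This policy is an assumption of
    this sketch, not apfel's own --retry logic."""
    return code in (3, 6)
```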

Environment Variables

Variable                       Description
APFEL_SYSTEM_PROMPT            Default system prompt
APFEL_HOST                     Server bind address
APFEL_PORT                     Server port
APFEL_TOKEN                    Bearer token for server authentication
APFEL_TEMPERATURE              Default temperature
APFEL_MAX_TOKENS               Default max tokens
APFEL_CONTEXT_STRATEGY         Default context strategy
APFEL_CONTEXT_MAX_TURNS        Max turns for sliding-window
APFEL_CONTEXT_OUTPUT_RESERVE   Tokens reserved for output
NO_COLOR                       Disable colors (no-color.org)
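A resolution helper following the usual CLI convention, explicit flag beats environment variable beats built-in default; note the precedence order is an assumption of this sketch, not documented above:

```python
import os

def setting(flag_value, env_name, default):
    """Resolve one setting: flag > environment variable > default.
    (This precedence order is the common CLI convention, assumed here.)"""
    if flag_value is not None:
        return flag_value
    return os.environ.get(env_name, default)

# Example: the server port, falling back through APFEL_PORT to 11434.
port = setting(None, "APFEL_PORT", "11434")
```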

Architecture

CLI (single/stream/chat) ──┐
                           ├─→ FoundationModels.SystemLanguageModel
HTTP Server (/v1/*) ───────┘   (100% on-device, zero network)
                                ContextManager → Transcript API
                                SchemaConverter → native ToolDefinitions
                                TokenCounter → real token counts (SDK 26.4)

Swift 6.3 strict concurrency. Three targets: ApfelCore (pure logic, unit-testable), apfel (CLI + server), apfel-tests (pure Swift runner, no XCTest). No Xcode required.

Build & Test

make install                             # build release + install to /usr/local/bin
make build                               # build release only
make version                             # print current version
make release-minor                       # bump minor: 0.6.x -> 0.7.0
swift build                              # quick debug build (no version bump)
swift run apfel-tests                    # unit tests
python3 -m pytest Tests/integration/ -v  # integration tests (auto-starts servers)
apfel --benchmark -o json                # performance report

Every make build/make install auto-bumps the patch version, updates the README badge, and generates build metadata (.version is the single source of truth).

Related Projects

  • apfel-clip - AI clipboard actions from the menu bar
  • apfel-gui - Native macOS debug GUI (inspector, MCP viewer, TTS/STT)

Examples

See docs/EXAMPLES.md for 50+ real prompts with unedited model output.

License

MIT
