apfel


The free AI already on your Mac.

Version 1.3.3 | Swift 6.3+ | macOS 26 Tahoe+ | No Xcode required | License: MIT | 100% on-device | #agentswelcome

Apple Silicon Macs ship a built-in LLM via Apple FoundationModels. apfel exposes it as a UNIX tool and a local OpenAI-compatible server. 100% on-device. No API keys, no cloud.

  • UNIX tool (apfel "prompt", or echo "text" | apfel) - pipe-friendly answers, file attachments, JSON output, exit codes
  • OpenAI-compatible server (apfel --serve) - a drop-in local http://localhost:11434/v1 backend for OpenAI SDKs

apfel --chat - interactive REPL.

Tool calling works in all three modes (CLI, server, chat). The context window is 4096 tokens.

apfel CLI

Requirements & Install

macOS 26 Tahoe+, Apple Silicon (M1+), Apple Intelligence enabled.

brew install apfel

Update:

brew upgrade apfel

Build from source (Command Line Tools with macOS 26.4 SDK / Swift 6.3, no Xcode):

git clone https://github.com/Arthur-Ficial/apfel.git && cd apfel && make install

Same-day tap, Mint, mise, troubleshooting: docs/install.md.

Quick Start

UNIX tool

Quote prompts containing ! in single quotes to avoid zsh/bash history expansion: apfel 'Hello, Mac!'.

# Single prompt
apfel "What is the capital of Austria?"

# Permissive mode - reduces guardrail false positives for creative/long prompts
apfel --permissive "Write a dramatic opening for a thriller novel"

# Stream output
apfel --stream "Write a haiku about code"

# Pipe input
echo "Summarize: $(cat README.md)" | apfel

# Attach file content to prompt
apfel -f README.md "Summarize this project"

# Attach multiple files
apfel -f old.swift -f new.swift "What changed between these two files?"

# Combine files with piped input
git diff HEAD~1 | apfel -f CONVENTIONS.md "Review this diff against our conventions"

# JSON output for scripting
apfel -o json "Translate to German: hello" | jq .content

# System prompt
apfel -s "You are a pirate" "What is recursion?"

# System prompt from file
apfel --system-file persona.txt "Explain TCP/IP"

# Quiet mode for shell scripts
result=$(apfel -q "Capital of France? One word.")
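
Because apfel signals failure through its exit code, scripts can branch on it directly. A minimal sketch, assuming only that 0 means success and non-zero means failure (the specific non-zero codes beyond the documented context-overflow exit 4 are not enumerated here):

# Branch on apfel's exit code in a script (sketch)
if summary=$(apfel -q "Summarize in one line: $(cat README.md)"); then
  echo "OK: $summary"
else
  echo "apfel failed with exit code $?" >&2
fi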

OpenAI-compatible server

apfel --serve                              # foreground
brew services start apfel                  # background (like Ollama)
brew services stop apfel
APFEL_TOKEN=$(uuidgen) APFEL_MCP=/path/to/tools.py brew services start apfel

# Smoke test with curl:
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"apple-foundationmodel","messages":[{"role":"user","content":"Hello"}]}'

# Or from Python with the OpenAI SDK:
from openai import OpenAI
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
resp = client.chat.completions.create(
    model="apple-foundationmodel",
    messages=[{"role": "user", "content": "What is 1+1?"}],
)
print(resp.choices[0].message.content)
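
For shell scripting, the non-streaming response can be post-processed with jq; a small sketch that extracts the reply via the standard OpenAI response shape (choices[0].message.content):

curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"apple-foundationmodel","messages":[{"role":"user","content":"Capital of France? One word."}]}' \
  | jq -r '.choices[0].message.content'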

Background service details: docs/background-service.md.
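
To verify a background service is up, the documented GET /health and GET /v1/models endpoints make a quick smoke test; a sketch (the Authorization header is only needed when the server was started with a token):

curl -s http://localhost:11434/health        # model availability, context window, languages
curl -s http://localhost:11434/v1/models \
  -H "Authorization: Bearer $APFEL_TOKEN"    # only required when a token is set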

Quick testing chat

apfel --chat is a small REPL for testing prompts or MCP servers. For a GUI chat app, see apfel-chat.

apfel --chat
apfel --chat -s "You are a helpful coding assistant"
apfel --chat --mcp ./mcp/calculator/server.py      # chat with MCP tools
apfel --chat --debug                                # debug output to stderr

Ctrl-C exits. Context is trimmed automatically (docs/context-strategies.md).

Demos

Shell scripts in demo/:

cmd - natural language to shell command:

demo/cmd "find all .log files modified today"
# $ find . -name "*.log" -type f -mtime -1

demo/cmd -x "show disk usage sorted by size"   # -x = execute after confirm
demo/cmd -c "list open ports"                   # -c = copy to clipboard

Shell function version - add to your .zshrc and use cmd from anywhere:

# cmd - natural language to shell command (apfel). Add to .zshrc:
cmd(){ local x c r a; while [[ $1 == -* ]]; do case $1 in -x)x=1;shift;; -c)c=1;shift;; *)break;; esac; done; r=$(apfel -q -s 'Output only a shell command.' "$*" | sed '/^```/d;/^#/d;s/\x1b\[[0-9;]*[a-zA-Z]//g;s/^[[:space:]]*//;/^$/d' | head -1); [[ $r ]] || { echo "no command generated"; return 1; }; printf '\e[32m$\e[0m %s\n' "$r"; [[ $c ]] && printf %s "$r" | pbcopy && echo "(copied)"; [[ $x ]] && { printf 'Run? [y/N] '; read -r a; [[ $a == y ]] && eval "$r"; }; return 0; }
cmd find all swift files larger than 1MB     # shows: $ find . -name "*.swift" -size +1M
cmd -c show disk usage sorted by size        # shows command + copies to clipboard
cmd -x what process is using port 3000       # shows command + asks to run it
cmd list all git branches merged into main
cmd count lines of code by language

oneliner - complex pipe chains from plain English:

demo/oneliner "sum the third column of a CSV"
# $ awk -F',' '{sum += $3} END {print sum}' file.csv

demo/oneliner "count unique IPs in access.log"
# $ awk '{print $1}' access.log | sort | uniq -c | sort -rn

mac-narrator - your Mac's inner monologue:

demo/mac-narrator              # one-shot: what's happening right now?
demo/mac-narrator --watch      # continuous narration every 60s

Also in demo/:

  • wtd - "what's this directory?" instant project orientation
  • explain - explain a command, error, or code snippet
  • naming - naming suggestions for functions, variables, files
  • port - what's using this port?
  • gitsum - summarize recent git activity

Longer walkthroughs: docs/demos.md.

MCP Tool Support

Attach Model Context Protocol servers with --mcp. apfel discovers their tools, invokes them when the model calls for them, and returns the result.

apfel --mcp ./mcp/calculator/server.py "What is 15 times 27?"
mcp: ./mcp/calculator/server.py - add, subtract, multiply, divide, sqrt, power    ← stderr
tool: multiply({"a": 15, "b": 27}) = 405                                          ← stderr
15 times 27 is 405. ← stdout

Use -q to suppress tool info.

apfel --mcp ./server_a.py --mcp ./server_b.py "Use both tools"
apfel --serve --mcp ./mcp/calculator/server.py
apfel --chat --mcp ./mcp/calculator/server.py

Ships with a calculator at mcp/calculator/ (docs/mcp-calculator.md).
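
The MCP, quiet, and JSON flags compose, which makes tool-augmented one-liners scriptable; a sketch combining the documented flags with the bundled calculator's sqrt tool:

apfel -q -o json --mcp ./mcp/calculator/server.py "What is the square root of 1764?" \
  | jq -r .content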

Remote MCP servers (Streamable HTTP, MCP spec 2025-03-26):

apfel --mcp https://mcp.example.com/v1 "what tools do you have?"

# bearer token - prefer env var (flag is visible in ps aux)
APFEL_MCP_TOKEN=mytoken apfel --mcp https://mcp.example.com/v1 "..."

# mixed local + remote
apfel --mcp /path/to/local.py --mcp https://remote.example.com/v1 "..."

Security: prefer APFEL_MCP_TOKEN over --mcp-token (flag values are visible in ps aux). apfel refuses to send bearer tokens over plaintext http://.

apfel-run: optional config layer

apfel itself has no config file - flags + env vars, like any UNIX tool. If you want a TOML config (many MCPs, profiles, team configs in git), apfel-run is an MIT-licensed wrapper that adds one as an execve drop-in.

brew install Arthur-Ficial/tap/apfel-run
apfel-run config init                 # starter ~/.config/apfel/config.toml
alias apfel=apfel-run                 # optional, every apfel flag still works

OpenAI API Compatibility

Base URL: http://localhost:11434/v1

Feature | Status | Notes
POST /v1/chat/completions | Supported | Streaming + non-streaming
GET /v1/models | Supported | Returns apple-foundationmodel
GET /health | Supported | Model availability, context window, languages
GET /v1/logs, /v1/logs/stats | Debug only | Requires --debug
Tool calling | Supported | Native ToolDefinition + JSON detection. See docs/tool-calling-guide.md
response_format: json_object | Supported | System-prompt injection; markdown fences stripped from output
temperature, max_tokens, seed | Supported | Mapped to GenerationOptions. Omitting max_tokens uses the remaining context window (drop-in OpenAI semantics) - see Default response cap
stream: true | Supported | SSE; final usage chunk only when stream_options: {"include_usage": true} (per OpenAI spec)
finish_reason | Supported | stop, tool_calls, length
Context strategies | Supported | x_context_strategy, x_context_max_turns, x_context_output_reserve extension fields
CORS | Supported | Enable with --cors
POST /v1/completions | 501 | Legacy text completions not supported
POST /v1/embeddings | 501 | Embeddings not available on-device
logprobs=true, n>1, stop, presence_penalty, frequency_penalty | 400 | Rejected explicitly; n=1 and logprobs=false are accepted as no-ops
Multi-modal (images) | 400 | Rejected with a clear error
Authorization header | Supported | Required when --token is set. See docs/server-security.md

Full API spec: openai/openai-openapi.
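
Two of the table's rows in practice, as sketches (both request shapes follow the OpenAI spec as described above):

# Streaming; the final usage chunk is emitted only with stream_options
curl -sN http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"apple-foundationmodel","stream":true,
       "stream_options":{"include_usage":true},
       "messages":[{"role":"user","content":"Count to five."}]}'

# JSON mode; markdown fences are stripped from the output per the table above
curl -sS http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"apple-foundationmodel",
       "response_format":{"type":"json_object"},
       "messages":[{"role":"user","content":"Name and population of Vienna as JSON."}]}'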

Default response cap (max_tokens)

When max_tokens is omitted, CLI and OpenAI-compatible server behave identically: the value flows through as nil and the model uses whatever room is left in the 4096-token context window. This is drop-in OpenAI semantics - no arbitrary fallback constant.

The on-device model has a 4096-token context window that holds input and output combined. If generation runs into the ceiling, the response ends cleanly with finish_reason: "length" and the partial content is returned (server: HTTP 200; CLI: exit 0 with a stderr warning). Pass max_tokens explicitly when you want a tighter latency budget or a known cap for your client.

Examples

# Omitted: uses remaining window, finish_reason: "stop" or "length"
curl -sS http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"apple-foundationmodel",
       "messages":[{"role":"user","content":"Reply SKIP, MOVE, or RENAME."}]}'

# Explicit cap (recommended for tight latency budgets)
curl -sS http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"apple-foundationmodel","max_tokens":128,
       "messages":[{"role":"user","content":"Summarise: ..."}]}'

Picking a value

Use case | max_tokens
Single-word / classification reply | 16 - 32
One-line instruction | 64 - 128
Short paragraph | 256 - 512
Long paragraph / structured JSON | 1024 - 2048
As long as the context window allows | omit it

Keep input_tokens + max_tokens comfortably below 4096. If the prompt itself exceeds the window, generation cannot start and the request fails with [context overflow] (HTTP 400 / CLI exit 4). The validator rejects non-positive values (max_tokens <= 0).
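
To see how much of the window a given prompt consumes, inspect the usage block of a non-streaming response; a sketch, assuming the field names follow the standard OpenAI schema:

curl -sS http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"apple-foundationmodel","messages":[{"role":"user","content":"ping"}]}' \
  | jq '.usage'
# {"prompt_tokens": ..., "completion_tokens": ..., "total_tokens": ...}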

CLI parity

CLI and server share one rule: omitted = use remaining window. No constant to drift. Override with --max-tokens N or APFEL_MAX_TOKENS=N.

apfel "Reply SKIP."                    # uses remaining window
apfel --max-tokens 64 "Reply SKIP."    # explicit cap
APFEL_MAX_TOKENS=2048 apfel "..."      # via env var

Permissive guardrails for the server

apfel --serve --permissive makes the server use Apple's .permissiveContentTransformations guardrails for every request the process handles. Same flag, same semantics as the CLI's --permissive (docs/PERMISSIVE.md). There is no per-request override - the server operator decides for the whole process.

apfel --serve --permissive             # every request uses permissive guardrails

Limitations

Constraint | Detail
Context window | 4096 tokens (input + output combined)
Platform | macOS 26+, Apple Silicon only
Model | One model (apple-foundationmodel), not configurable
Guardrails | Apple's safety system may block benign prompts; --permissive reduces false positives (docs/PERMISSIVE.md)
Speed | On-device, not cloud-scale - a few seconds per response
No embeddings / vision | Not available on-device

Reference Docs

Guides to use apfel from Python, Node.js, Ruby, PHP, Bash/curl, Zsh, AppleScript, Swift, Perl, AWK - see docs/guides/index.md. Empirically tested; runnable proof at apfel-guides-lab.

Architecture

CLI (single/stream/chat) ──┐
                           ├─→ FoundationModels.SystemLanguageModel
HTTP Server (/v1/*) ───────┘   (100% on-device, zero network)
                                ContextManager → Transcript API
                                SchemaConverter → native ToolDefinitions
                                TokenCounter → real token counts (SDK 26.4)

Swift 6.3 strict concurrency. Three targets: ApfelCore (pure logic, unit-testable, also available as a Swift Package product - see docs/swift-library.md), apfel (CLI + server), and apfel-tests (pure Swift runner, no XCTest).

Build & Test

make test                                # release build + all unit/integration tests
make preflight                           # full release qualification
make install                             # build release + install to /usr/local/bin
make build                               # build release only
make version                             # print current version
make release                             # patch release
make release TYPE=minor                  # minor release
make release TYPE=major                  # major release
swift build                              # quick debug build (no version bump)
swift run apfel-tests                    # unit tests
python3 -m pytest Tests/integration/ -v  # integration tests
apfel --benchmark -o json                # performance report

.version is the single source of truth. Only make release bumps versions. Local builds do not change the version.

The apfel tree

Projects built on apfel. Each ships as its own repo + Homebrew formula.

Project | What it does | Install
apfel | The root. On-device FoundationModels CLI + OpenAI-compatible server. | brew install apfel
apfel-chat | macOS chat client: streaming markdown, speech I/O, Apple Vision image analysis. | brew install Arthur-Ficial/tap/apfel-chat
apfel-clip | Menu-bar AI actions on the clipboard: summarize, translate, rewrite. | brew install Arthur-Ficial/tap/apfel-clip
apfel-quick | Instant AI overlay: press a key, ask, answer, dismiss. | brew install Arthur-Ficial/tap/apfel-quick
apfelpad | Formula notepad - on-device AI as an inline cell function. | brew install Arthur-Ficial/tap/apfelpad
apfel-mcp | Token-budget-optimized MCPs for the 4096 window: url-fetch, ddg-search, search-and-fetch. | brew install Arthur-Ficial/tap/apfel-mcp
apfel-gui | SwiftUI debug inspector: request timeline, MCP protocol viewer, TTS/STT. | brew install Arthur-Ficial/tap/apfel-gui
apfel-run | UNIX wrapper adding a persistent MCP registry + TOML config on top of apfel. | brew install Arthur-Ficial/tap/apfel-run
apfel-server-kit | Swift package for ecosystem tools: discover, spawn, and stream from a local apfel --serve. | Swift Package

Community Projects

Built something on top of apfel? Open an issue and it can be added here.

Project | What it does | Links
apfelclaw by @julianYaman | Local AI agent that reads files, calendar, mail, and Mac status via read-only tools | github - site
fruit-chat by @bhaskarvilles | Browser-based chat UI that talks to apfel --serve over the OpenAI-compatible API | github
local-claude by @lucaspwo | Claude Code wrapper that swaps in apfel as a local backend via a small Anthropic-OpenAI proxy | github
apfeller by @hasit | App manager for local shell apps built around apfel | github - site - catalog

Contributing

Issues and PRs welcome on any Arthur-Ficial/apfel* repo.

#agentswelcome - AI agent PRs are fine. Read the repo's CLAUDE.md, run the tests, credit the tool in a Co-Authored-By trailer. Same bar as humans: clean code, passing tests, honest limits. Most agent-friendly entry point: apfel-mcp (contribution rules).

License

MIT
