turn old iPhones into AI agents

give it a goal in plain English. it reads the screen, thinks, taps, and repeats — via iOS accessibility APIs.
$ swift run Agent.swift
enter your goal: open music and play "lofi hip hop"
--- step 1/30 ---
think: i'm on the home screen. launching music.
action: launchApp (842ms)
--- step 2/30 ---
think: music is open. tapping search.
action: tap (623ms)
--- step 3/30 ---
think: search field focused.
action: type "lofi hip hop" (501ms)
--- step 4/30 ---
action: pressReturn (389ms)
--- step 5/30 ---
think: playlist showing. done.
action: done (412ms)
⚙️ perceive, reason, act, adapt
every step is a loop. dump the accessibility tree, filter elements, send to an LLM, execute via Accessibility APIs.
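
a minimal sketch of that loop, assuming each stage is passed in as a closure. `runAgent` and its parameter names are hypothetical, not iClaw's actual API; the 30-step cap mirrors the `step 1/30` counter in the transcript above.

```swift
import Foundation

/// runs perceive -> reason -> act until the model answers "done" or the step budget runs out.
/// returns the step number at which the goal finished, or nil if the budget was exhausted.
func runAgent(goal: String,
              maxSteps: Int = 30,
              perceive: () -> String,
              reason: (_ goal: String, _ screen: String) -> String,
              act: (_ action: String) -> Void) -> Int? {
    guard maxSteps >= 1 else { return nil }
    for step in 1...maxSteps {
        let screen = perceive()              // dump + filter the accessibility tree
        let action = reason(goal, screen)    // ask the LLM what to do next
        if action == "done" { return step }  // the model says the goal is complete
        act(action)                          // tap, type, swipe, ...
    }
    return nil
}
```

passing the stages as closures keeps the loop testable without a device attached: a test can feed canned screens and count the steps.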

1. perceive

checks accessibility permission with AXIsProcessTrusted, then reads the accessibility hierarchy: labels, values, frames.
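
one way the filtering could look, as a sketch. `UIElement`, `Frame`, and `filterElements` are hypothetical names, and the rule used here (drop elements with no text or a zero-size frame) is an assumption; the cap of 40 matches the `MAX_ELEMENTS` default.

```swift
import Foundation

// hypothetical model of one accessibility node: label, value, frame
struct Frame { let x, y, width, height: Double }

struct UIElement {
    let label: String
    let value: String?
    let frame: Frame
}

/// keeps elements that carry some text and have a non-empty frame,
/// then truncates the list to `maxElements` before it goes to the LLM.
func filterElements(_ elements: [UIElement], maxElements: Int = 40) -> [UIElement] {
    let useful = elements.filter { el in
        let hasText = !el.label.isEmpty || !(el.value ?? "").isEmpty
        let isVisible = el.frame.width > 0 && el.frame.height > 0
        return hasText && isVisible
    }
    return Array(useful.prefix(maxElements))
}
```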

2. reason

sends screen state + goal to an LLM (optimized for iOS UI). the LLM returns think, plan, action.
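
a sketch of decoding such a reply, assuming the LLM answers with a JSON object. only `think`, `plan`, and `action` come from the text; the exact wire format, the `plan` type, and the name `AgentDecision` are assumptions.

```swift
import Foundation

// hypothetical reply shape; the real schema may differ
struct AgentDecision: Decodable {
    let think: String
    let plan: [String]?   // assumed to be an optional list of upcoming steps
    let action: String
}

/// decodes one LLM reply; throws if the JSON is malformed or a required key is missing.
func decodeDecision(_ raw: String) throws -> AgentDecision {
    try JSONDecoder().decode(AgentDecision.self, from: Data(raw.utf8))
}

let reply = #"{"think": "music is open. tapping search.", "action": "tap"}"#
if let decision = try? decodeDecision(reply) {
    print(decision.think, "->", decision.action)
}
```

treating a decode failure as a recoverable error lets the loop re-prompt the model when it returns something that is not valid JSON.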

3. act

executes via XCUIRemote / CGEvent (Mac) or private APIs – tap, type, swipe, Siri, home button.

4. adapt

stuck recovery kicks in after 3 unchanged steps. falls back to the Vision framework if the accessibility tree is empty.
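
a sketch of the stuck check, assuming each screen can be reduced to a comparable string signature (for example a hash of the filtered elements). `StuckDetector` is a hypothetical name; the default threshold of 3 matches `STUCK_THRESHOLD`.

```swift
import Foundation

struct StuckDetector {
    let threshold: Int
    private var lastSignature: String?
    private var unchangedSteps = 0

    init(threshold: Int = 3) { self.threshold = threshold }

    /// records the latest screen signature.
    /// returns true once `threshold` consecutive steps have left the screen unchanged.
    mutating func record(_ signature: String) -> Bool {
        if signature == lastSignature {
            unchangedSteps += 1
        } else {
            lastSignature = signature
            unchangedSteps = 0   // the screen moved, so reset the counter
        }
        return unchangedSteps >= threshold
    }
}
```

in this sketch the first sighting of a screen does not count as "unchanged", so recovery triggers on the fourth identical observation.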
🔁 interactive, workflows, or flows
interactive

describe what you want. the agent figures out the rest.

$ swift run Agent.swift
enter your goal: send "running late" to Mom on whatsapp
workflows (JSON + AI)

chains goals across apps and handles popups & UI changes.

{
  "name": "weather to messages",
  "steps": [
    { "bundleId": "com.apple.weather",
      "goal": "check chennai weather" },
    { "goal": "share to Mom in Messages" }
  ]
}
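
a sketch of loading that file with `Codable`. the type names and `loadWorkflow` are hypothetical; `bundleId` is modeled as optional because the second step in the example omits it.

```swift
import Foundation

struct WorkflowStep: Decodable {
    let bundleId: String?   // optional: a step may continue in the current app
    let goal: String
}

struct Workflow: Decodable {
    let name: String
    let steps: [WorkflowStep]
}

/// decodes a workflow from its JSON text; throws on malformed input.
func loadWorkflow(from json: String) throws -> Workflow {
    try JSONDecoder().decode(Workflow.self, from: Data(json.utf8))
}
```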
flows (YAML, no AI)

fixed taps & types, instant execution.

appId: com.apple.MobileSMS
name: Send iMessage
---
- launchApp
- tap: "Mom"
- type: "hello from iClaw"
- tap: "Send"
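
a sketch of reading the step list, assuming only the three step forms shown above. this is a line-based reader, not a YAML parser, and `FlowStep` / `parseFlowSteps` are hypothetical names.

```swift
import Foundation

enum FlowStep: Equatable {
    case launchApp
    case tap(String)
    case typeText(String)
}

/// extracts steps from lines that start with "- "; header keys and other lines are skipped.
func parseFlowSteps(_ text: String) -> [FlowStep] {
    var steps: [FlowStep] = []
    for line in text.split(separator: "\n") {
        let trimmed = line.trimmingCharacters(in: .whitespaces)
        guard trimmed.hasPrefix("- ") else { continue }
        // split "tap: \"Mom\"" into the step name and its argument
        let parts = trimmed.dropFirst(2).split(separator: ":", maxSplits: 1)
        guard let first = parts.first else { continue }
        let name = first.trimmingCharacters(in: .whitespaces)
        let arg = parts.count > 1
            ? parts[1].trimmingCharacters(in: CharacterSet(charactersIn: " \""))
            : ""
        switch name {
        case "launchApp": steps.append(.launchApp)
        case "tap": steps.append(.tap(arg))
        case "type": steps.append(.typeText(arg))
        default: continue   // unknown step names are ignored in this sketch
        }
    }
    return steps
}
```

because no LLM is involved, the parsed list can be executed directly, which is where the "instant" speed comes from.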
| | workflows (AI) | flows (deterministic) |
|---|---|---|
| format | json + llm | yaml, no llm |
| adapt to UI change | ✅ yes | ❌ breaks |
| speed | slower (calls) | instant |
| best for | complex multi-app | simple repeatable |
🚀 what you can build
📱 delegate to on‑device AI — use ChatGPT, Perplexity, Shortcuts as tools. no extra API keys.
🌐 remote control via Tailscale/SSH — control your iPhone from anywhere, cron workflows.
🔄 old devices, always on — turn that drawer iPhone into a standup poster, flight checker, digest sender.
⚡ things it can do right now

💬 messaging

  • ✉️ iMessage to contacts
  • 📨 reply to latest SMS
  • 📧 compose via Mail
  • 💬 Telegram / Slack posts

🔍 research

  • 🌐 Safari search & collect
  • 🤖 ask ChatGPT / Perplexity
  • 📊 weather, stocks, flights
  • 🏷️ price comparison

📱 social

  • 📸 Instagram, X posts
  • ❤️ like / comment
  • 📈 engagement metrics

📋 productivity

  • ☀️ morning briefing
  • 📅 calendar events
  • 📝 Apple Notes capture
  • 🔔 triage notifications

🎵 lifestyle

  • 🍔 food delivery
  • 🚗 Uber booking
  • 🎧 Apple Music playlist
  • 🧘 toggle Focus mode

⚙️ device control

  • 📶 toggle Wi‑Fi, Bluetooth
  • 🔊 volume / brightness
  • 📲 install / uninstall apps
  • run Shortcuts
🧪 what works & what doesn’t
✅ works well
native Apple apps, multi‑app workflows, system settings via Shortcuts, stuck recovery, Vision fallback.
⚠️ unreliable
games (Metal rendering), webviews, drag & drop, notification interaction, clipboard on some iOS versions.
❌ can't do
banking apps with secure text, Face/Touch ID, bypass lock screen, other apps' private data, camera stream, pinch‑to‑zoom.

⚡ getting started
1. install
curl -fsSL https://iclaw.ai/install.sh | sh

or manually:

xcode-select --install
git clone https://github.com/unitedbyai/iclaw.git
cd iclaw && swift build
cp .env.example .env
2. configure LLM
LLM_PROVIDER=groq
GROQ_API_KEY=gsk_your_key

or ollama (local), openrouter, openai, bedrock.

3. connect iPhone
xcrun xctrace list devices
# grant accessibility when prompted
swift run Agent.swift
| provider | cost | vision |
|---|---|---|
| groq | free | no |
| ollama | free (local) | yes* |
| openrouter | per token | yes |
| openai | per token | yes |
| bedrock | per token | yes |
optional tune
| key | default | what |
|---|---|---|
| MAX_STEPS | 30 | steps before giving up |
| STEP_DELAY | 2 | seconds between actions |
| STUCK_THRESHOLD | 3 | steps before stuck recovery |
| VISION_MODE | fallback | off/fallback/always |
| MAX_ELEMENTS | 40 | UI elements sent to LLM |
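
a sketch of reading these keys with their defaults, assuming they arrive as environment variables (for example after `.env` has been loaded). `AgentConfig` is a hypothetical name and may not match what Config.swift actually does.

```swift
import Foundation

struct AgentConfig {
    var maxSteps = 30
    var stepDelay = 2.0          // seconds between actions
    var stuckThreshold = 3
    var visionMode = "fallback"  // off / fallback / always
    var maxElements = 40

    /// overrides each default only when the variable is present and parses cleanly.
    init(environment env: [String: String] = ProcessInfo.processInfo.environment) {
        if let v = env["MAX_STEPS"].flatMap({ Int($0) }) { maxSteps = v }
        if let v = env["STEP_DELAY"].flatMap({ Double($0) }) { stepDelay = v }
        if let v = env["STUCK_THRESHOLD"].flatMap({ Int($0) }) { stuckThreshold = v }
        if let v = env["VISION_MODE"], ["off", "fallback", "always"].contains(v) {
            visionMode = v
        }
        if let v = env["MAX_ELEMENTS"].flatMap({ Int($0) }) { maxElements = v }
    }
}
```

invalid values fall back to the default silently in this sketch; a stricter version could log or reject them instead.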
📦 35 workflows + 5 flows
messaging (10) — slack-standup, imessage-broadcast, telegram-send…
social (4) — social-media-post, instagram-check…
productivity (8) — morning-briefing, github-check-prs…
research (6) — weather-to-imessage, price-comparison…
lifestyle (8) — food-order, apple-music-playlist…
flows (5) — send-imessage, safari-search, toggle-wifi…
📁 10 files in src/
Agent.swift
Actions.swift
Skills.swift
Workflow.swift
Flow.swift
LLMProviders.swift
Sanitizer.swift
Config.swift
Constants.swift
Logger.swift