give it a goal in plain english. it reads the screen, thinks about what to do, taps and types via adb, and repeats until the job is done.
$ bun run src/kernel.ts
enter your goal: open youtube and search for "lofi hip hop"
--- step 1/30 ---
think: i'm on the home screen. launching youtube.
action: launch (842ms)
--- step 2/30 ---
think: youtube is open. tapping search icon.
action: tap (623ms)
--- step 3/30 ---
think: search field focused.
action: type "lofi hip hop" (501ms)
--- step 4/30 ---
action: enter (389ms)
--- step 5/30 ---
think: search results showing. done.
action: done (412ms)
every step is a loop. dump the accessibility tree, filter interactive elements, send to an llm, execute the action, repeat.
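the loop can be sketched roughly like this. it is a minimal illustration, not the project's actual kernel.ts: `perceive`, `decide`, `act`, and the `UiElement` / `Decision` shapes are hypothetical names, injected so the control flow is visible without adb or an llm.

```typescript
// hypothetical shapes for illustration; not taken from the droidclaw source
type UiElement = { text: string; x: number; y: number };
type Decision = { think: string; action: string };

interface Deps {
  perceive: () => Promise<UiElement[]>; // dump + filter the accessibility tree
  decide: (goal: string, screen: UiElement[]) => Promise<Decision>; // ask the llm
  act: (decision: Decision) => Promise<void>; // execute via adb
}

// runs perceive -> decide -> act until the model answers "done".
// returns the step number it finished on, or -1 if maxSteps ran out.
async function runLoop(goal: string, deps: Deps, maxSteps = 30): Promise<number> {
  for (let step = 1; step <= maxSteps; step++) {
    const screen = await deps.perceive();
    const decision = await deps.decide(goal, screen);
    if (decision.action === "done") return step;
    await deps.act(decision);
  }
  return -1;
}
```

keeping the three stages behind an interface also makes the loop testable with fakes, without a device attached.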
captures the screen via uiautomator dump and parses the accessibility xml into tappable elements with coordinates and state.
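as an illustration of that parsing step, here is a minimal sketch. it assumes the usual uiautomator dump layout, where each `<node>` carries `clickable`, `enabled`, `text`, `content-desc`, and `bounds="[x1,y1][x2,y2]"` attributes. `Tappable`, `attr`, and `parseTappable` are hypothetical names, and the regex approach is a shortcut for brevity; a real parser would likely use a proper xml library.

```typescript
// hypothetical element shape; the real sanitizer may keep more state
type Tappable = { text: string; x: number; y: number; enabled: boolean };

// read one attribute out of a <node ...> tag; returns "" when absent
function attr(tag: string, name: string): string {
  const m = tag.match(new RegExp(`\\s${name}="([^"]*)"`));
  return m ? (m[1] ?? "") : "";
}

// keep clickable nodes only and reduce each one to its centre point
function parseTappable(xml: string): Tappable[] {
  const out: Tappable[] = [];
  for (const m of xml.matchAll(/<node\b[^>]*>/g)) {
    const tag = m[0];
    if (attr(tag, "clickable") !== "true") continue;
    const b = attr(tag, "bounds").match(/\[(\d+),(\d+)\]\[(\d+),(\d+)\]/);
    if (!b) continue;
    const nums = b.slice(1).map(Number);
    const x1 = nums[0] ?? 0, y1 = nums[1] ?? 0, x2 = nums[2] ?? 0, y2 = nums[3] ?? 0;
    out.push({
      text: attr(tag, "text") || attr(tag, "content-desc"),
      x: Math.round((x1 + x2) / 2),
      y: Math.round((y1 + y2) / 2),
      enabled: attr(tag, "enabled") === "true",
    });
  }
  return out;
}
```

tapping the centre of the bounds rectangle is a common choice because it stays inside the element even when the edges overlap a neighbour.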
sends screen state + goal to an llm. the model returns think, plan, action - it explains its reasoning before acting.
executes the chosen action via adb - tap, type, swipe, launch, press back. 22 actions available.
if the screen doesn't change for 3 steps, stuck recovery kicks in. an empty accessibility tree falls back to screenshots.
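a reply with those three fields could be validated along these lines. this is a sketch under assumptions: the source only names the fields (think, plan, action), so treating the reply as a json object, `plan` as a list of strings, and `action` as a plain string is a guess. `Decision` and `parseDecision` are hypothetical names.

```typescript
// assumed reply shape; the actual schema is not given in the source
type Decision = { think: string; plan: string[]; action: string };

function parseDecision(raw: string): Decision {
  // models often wrap json in extra prose or a markdown fence,
  // so cut out the outermost {...} before parsing
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) throw new Error("no json object in reply");
  const obj = JSON.parse(raw.slice(start, end + 1)) as Record<string, unknown>;

  const action = obj["action"];
  const think = obj["think"];
  const plan = obj["plan"];
  // an action is mandatory; think and plan degrade to empty values
  if (typeof action !== "string") throw new Error("reply has no action");
  return {
    think: typeof think === "string" ? think : "",
    plan: Array.isArray(plan) ? plan.map(String) : [],
    action,
  };
}
```

rejecting a reply without an action, rather than guessing one, keeps a malformed response from turning into a stray tap.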
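for a sense of what those actions look like at the adb level, here is a minimal sketch covering five of them. the `Action` union and `adbArgs` are hypothetical names, not droidclaw's actions.ts; the underlying commands (`input tap`, `input text`, `input swipe`, `input keyevent 4` for back, and `monkey` with the launcher category) are standard android shell tools.

```typescript
// hypothetical action union; the real project lists 22 actions
type Action =
  | { kind: "tap"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "swipe"; x1: number; y1: number; x2: number; y2: number; ms: number }
  | { kind: "launch"; pkg: string }
  | { kind: "back" };

// build the argument list for `adb <args...>`; nothing is executed here
function adbArgs(a: Action): string[] {
  switch (a.kind) {
    case "tap":
      return ["shell", "input", "tap", String(a.x), String(a.y)];
    case "type":
      // `input text` takes %s for a space; other shell-special
      // characters would need extra escaping that this sketch omits
      return ["shell", "input", "text", a.text.replace(/ /g, "%s")];
    case "swipe":
      return ["shell", "input", "swipe", ...[a.x1, a.y1, a.x2, a.y2, a.ms].map(String)];
    case "launch":
      return ["shell", "monkey", "-p", a.pkg, "-c", "android.intent.category.LAUNCHER", "1"];
    case "back":
      return ["shell", "input", "keyevent", "4"]; // 4 = KEYCODE_BACK
  }
}
```

returning an argument array instead of a command string avoids quoting problems on the host side; under bun the result could be handed to something like `Bun.spawn(["adb", ...adbArgs(action)])`.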
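one way to implement both checks, as a sketch rather than the project's own logic: `makeStuckDetector` and `useScreenshot` are hypothetical names, the screen "fingerprint" is assumed to be any string that changes when the screen does (for example the serialized element list), and the three vision modes mirror the off / fallback / always values of VISION_MODE.

```typescript
// returns a function that is fed one screen fingerprint per step and
// answers true once `threshold` consecutive steps brought no change
function makeStuckDetector(threshold = 3): (fingerprint: string) => boolean {
  let last: string | null = null;
  let unchanged = 0;
  return (fingerprint: string): boolean => {
    if (fingerprint === last) {
      unchanged++;
    } else {
      last = fingerprint;
      unchanged = 0;
    }
    return unchanged >= threshold;
  };
}

type VisionMode = "off" | "fallback" | "always";

// decide whether this step should send a screenshot to the model
function useScreenshot(elementCount: number, mode: VisionMode): boolean {
  if (mode === "always") return true;
  if (mode === "off") return false;
  return elementCount === 0; // fallback: only when the tree is empty
}
```

in this version the counter resets as soon as the screen changes, so a slow but progressing task never triggers recovery.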
type a goal, chain goals across apps with ai, or run deterministic steps with no llm calls.
run it and describe what you want. the agent figures out the rest.
$ bun run src/kernel.ts
enter your goal: send "running late, 10 mins" to Mom on whatsapp
chain goals across multiple apps. natural-language steps, the llm navigates automatically.
{
"name": "weather to whatsapp",
"steps": [
{ "app": "com.google...",
"goal": "search chennai weather" },
{ "goal": "share to Sanju" }
]
}
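a workflow file in that shape could be checked before running it. this sketch infers the schema from the example alone (a `name`, plus `steps` that each carry a `goal` and an optional `app`); `Workflow`, `Step`, and `parseWorkflow` are hypothetical names and the real loader may accept more fields.

```typescript
// schema inferred from the example above; treat it as an assumption
type Step = { app?: string; goal: string };
type Workflow = { name: string; steps: Step[] };

function parseWorkflow(json: string): Workflow {
  const raw = JSON.parse(json) as { name?: unknown; steps?: unknown };
  const name = raw.name;
  const rawSteps = raw.steps;
  if (typeof name !== "string") throw new Error("workflow needs a name");
  if (!Array.isArray(rawSteps) || rawSteps.length === 0) {
    throw new Error("workflow needs at least one step");
  }
  const steps = rawSteps.map((s: unknown, i: number): Step => {
    const step = (s ?? {}) as { app?: unknown; goal?: unknown };
    const goal = step.goal;
    const app = step.app;
    if (typeof goal !== "string") throw new Error(`step ${i + 1} needs a goal`);
    // app is optional: a step without it continues in the current app
    return typeof app === "string" ? { app, goal } : { goal };
  });
  return { name, steps };
}
```

failing fast on a missing goal is cheaper than discovering it halfway through a multi-app run.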
fixed taps and typing. no llm needed, instant execution. good for repetitive tasks.
appId: com.whatsapp
name: Send WhatsApp Message
---
- launchApp
- tap: "Contact Name"
- type: "hello from droidclaw"
- tap: "Send"
use the ai apps on the device, control phones remotely, turn old devices into always-on agents.
open google's ai mode, ask a question, get the answer, forward it to whatsapp. or ask chatgpt something and share the response to slack. the agent treats your phone's apps as tools - no api keys for those services needed.
install tailscale on phone + laptop. connect adb over the tailnet. your phone is now a remote agent - control it from anywhere. run workflows from a cron job at 8am every morning.
# from anywhere:
adb connect <phone-tailscale-ip>:5555
bun run src/kernel.ts --workflow morning.json
that old phone in your drawer can now send slack standup messages, check flight prices, digest telegram channels, forward weather to whatsapp. it runs apps that don't have apis.
unlike predefined button flows, the agent actually thinks. if a button moves, a popup appears, or the layout changes - it adapts automatically. it reads the screen, understands context, and makes decisions.
across any app installed on the device.
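to show why a flow needs no llm call, here is a sketch of how already-parsed flow steps might be resolved against the current screen. everything here is an assumption made for illustration: `FlowStep`, `Op`, and `resolveStep` are hypothetical names, yaml parsing is left out, and matching a `tap` label by exact text is a guess at how the lookup works.

```typescript
// a flow step after yaml parsing: a bare keyword or a one-key object
type FlowStep = "launchApp" | { tap: string } | { type: string };
type UiElement = { text: string; x: number; y: number };
type Op =
  | { op: "launch"; appId: string }
  | { op: "tap"; x: number; y: number }
  | { op: "type"; text: string };

// turn one step into a concrete operation using only the screen contents
function resolveStep(step: FlowStep, appId: string, screen: UiElement[]): Op {
  if (step === "launchApp") return { op: "launch", appId };
  if ("type" in step) return { op: "type", text: step.type };
  const label = step.tap;
  // deterministic lookup: the label must be on screen, otherwise fail loudly
  const target = screen.find((el) => el.text === label);
  if (!target) throw new Error(`no element labelled "${label}" on screen`);
  return { op: "tap", x: target.x, y: target.y };
}
```

the trade-off is the one the copy describes: this is instant and repeatable, but a renamed button breaks it, where the llm-driven path would adapt.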
22 actions + 6 multi-step skills. here's the reality.
one command. installs bun and adb if missing, clones the repo, sets up .env.
curl -fsSL https://uclaw.cc/install.sh | sh
or install manually:
# install adb
brew install android-platform-tools
# install bun (required - npm/node won't work)
curl -fsSL https://bun.sh/install | bash
# clone and set up
git clone https://github.com/unitedbyai/droidclaw.git
cd droidclaw && bun install
cp .env.example .env
edit .env - fastest way to start is groq (free tier):
LLM_PROVIDER=groq
GROQ_API_KEY=gsk_your_key_here
# or run fully local with ollama (no api key needed)
# ollama pull llama3.2
# LLM_PROVIDER=ollama
| provider | cost | vision | notes |
|---|---|---|---|
| groq | free | no | fastest to start |
| ollama | free (local) | yes* | no api key, runs on your machine |
| openrouter | per token | yes | 200+ models |
| openai | per token | yes | gpt-4o |
| bedrock | per token | yes | claude on aws |
download and install the companion app on your android device.
enable usb debugging in developer options, then connect via usb.
adb devices # should show your device
cd droidclaw && bun run src/kernel.ts
| key | default | what |
|---|---|---|
| MAX_STEPS | 30 | steps before giving up |
| STEP_DELAY | 2 | seconds between actions |
| STUCK_THRESHOLD | 3 | unchanged steps before stuck recovery |
| VISION_MODE | fallback | off / fallback / always |
| MAX_ELEMENTS | 40 | ui elements sent to the llm |
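reading those keys with their defaults might look like this. it is a sketch, not the project's config.ts: `Config` and `loadConfig` are hypothetical names, and falling back to the default on a missing or non-numeric value is an assumption about the desired behaviour.

```typescript
type VisionMode = "off" | "fallback" | "always";
type Config = {
  maxSteps: number;
  stepDelay: number; // seconds
  stuckThreshold: number;
  visionMode: VisionMode;
  maxElements: number;
};

// env is passed in explicitly so the function stays pure and testable
function loadConfig(env: Record<string, string | undefined>): Config {
  const num = (key: string, fallback: number): number => {
    const raw = env[key];
    if (raw === undefined || raw.trim() === "") return fallback;
    const n = Number(raw);
    return Number.isFinite(n) ? n : fallback;
  };
  const vision = env["VISION_MODE"];
  return {
    maxSteps: num("MAX_STEPS", 30),
    stepDelay: num("STEP_DELAY", 2),
    stuckThreshold: num("STUCK_THRESHOLD", 3),
    // anything other than the three known modes falls back to the default
    visionMode: vision === "off" || vision === "always" || vision === "fallback" ? vision : "fallback",
    maxElements: num("MAX_ELEMENTS", 40),
  };
}
```

under bun this would typically be called as `loadConfig(process.env)`, since bun loads .env files into the environment on startup.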
ready to use. workflows are ai-powered (json), flows are deterministic (yaml).
kernel.ts         main loop
actions.ts        22 actions + adb retry
skills.ts         6 multi-step skills
workflow.ts       workflow orchestration
flow.ts           yaml flow runner
llm-providers.ts  5 providers + system prompt
sanitizer.ts      accessibility xml parser
config.ts         env config
constants.ts      keycodes, coordinates
logger.ts         session logging