now live — sign up & start controlling your device

turn old phones into
ai agents

give it a goal in plain english. it reads the screen, thinks about what to do, taps and types via adb, and repeats until the job is done.

droidclaw
$ bun run src/kernel.ts
enter your goal: open youtube and search for "lofi hip hop"

--- step 1/30 ---
think: i'm on the home screen. launching youtube.
action: launch (842ms)

--- step 2/30 ---
think: youtube is open. tapping search icon.
action: tap (623ms)

--- step 3/30 ---
think: search field focused.
action: type "lofi hip hop" (501ms)

--- step 4/30 ---
action: enter (389ms)

--- step 5/30 ---
think: search results showing. done.
action: done (412ms)

perceive, reason, act, adapt

every step is a loop. dump the accessibility tree, filter interactive elements, send to an llm, execute the action, repeat.
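
the loop above can be sketched roughly as follows. this is a minimal illustration, not droidclaw's actual kernel: `Device`, `Llm`, `UiElement`, `Decision` and `runGoal` are hypothetical names, and everything is kept synchronous for brevity even though real adb and llm calls would be async.

```typescript
// hypothetical shapes for illustration; not droidclaw's real types.
type UiElement = { text: string; x: number; y: number };
type Decision = { think: string; action: string };

interface Device {
  dumpElements(): UiElement[];   // perceive: read the current screen
  execute(action: string): void; // act: run the chosen action
}

// reason: given the goal and the current screen, pick the next action
type Llm = (goal: string, screen: UiElement[]) => Decision;

// returns the step at which the model answered "done", or -1 if maxSteps ran out
function runGoal(goal: string, device: Device, llm: Llm, maxSteps = 30): number {
  for (let step = 1; step <= maxSteps; step++) {
    const screen = device.dumpElements();
    const decision = llm(goal, screen);
    console.log(`--- step ${step}/${maxSteps} ---`);
    console.log(`think: ${decision.think}`);
    console.log(`action: ${decision.action}`);
    if (decision.action === "done") return step;
    device.execute(decision.action);
  }
  return -1;
}
```

passing the device and the model in as parameters keeps the loop easy to exercise with fakes before pointing it at a real phone.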

1. perceive

captures the screen via uiautomator dump and parses the accessibility xml into tappable elements with coordinates and state.
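
as an illustration of the parsing step, here is a small sketch that pulls clickable nodes out of a uiautomator dump and turns each `bounds="[x1,y1][x2,y2]"` attribute into a tap point. the regex approach and the names `TappableElement`, `attr` and `parseTappable` are assumptions for this example; a real parser would more likely use a proper xml library.

```typescript
// hypothetical element shape; droidclaw's real one may carry more state.
type TappableElement = { text: string; enabled: boolean; centerX: number; centerY: number };

// read one attribute value from a single <node ...> tag ("" if absent)
function attr(node: string, name: string): string {
  const m = node.match(new RegExp(`\\s${name}="([^"]*)"`));
  return m?.[1] ?? "";
}

// keep clickable nodes and compute the center of their bounds rectangle
function parseTappable(xml: string): TappableElement[] {
  const nodes = xml.match(/<node\b[^>]*>/g) ?? [];
  const out: TappableElement[] = [];
  for (const node of nodes) {
    if (attr(node, "clickable") !== "true") continue;
    const b = attr(node, "bounds").match(/^\[(\d+),(\d+)\]\[(\d+),(\d+)\]$/);
    if (!b) continue;
    const x1 = Number(b[1]);
    const y1 = Number(b[2]);
    const x2 = Number(b[3]);
    const y2 = Number(b[4]);
    out.push({
      text: attr(node, "text") || attr(node, "content-desc"),
      enabled: attr(node, "enabled") === "true",
      centerX: Math.round((x1 + x2) / 2),
      centerY: Math.round((y1 + y2) / 2),
    });
  }
  return out;
}
```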

2. reason

sends screen state + goal to an llm. the model returns think, plan, action - it explains its reasoning before acting.
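
a sketch of how such a reply might be validated before acting on it. the wire format is an assumption here: a json object with `think` (string), `plan` (array of strings) and `action` (string). `AgentDecision` and `parseDecision` are hypothetical names.

```typescript
// assumed reply shape; the real schema may differ.
type AgentDecision = { think: string; plan: string[]; action: string };

function parseDecision(raw: string): AgentDecision {
  // models sometimes wrap json in extra prose, so take the outermost {...}
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) throw new Error("no json object in model reply");

  const data: unknown = JSON.parse(raw.slice(start, end + 1));
  if (typeof data !== "object" || data === null) throw new Error("reply is not an object");
  const obj = data as Record<string, unknown>;

  const action = obj["action"];
  if (typeof action !== "string" || action === "") throw new Error("reply has no action");

  const think = obj["think"];
  const plan = obj["plan"];
  return {
    think: typeof think === "string" ? think : "",
    plan: Array.isArray(plan) ? plan.filter((p): p is string => typeof p === "string") : [],
    action,
  };
}
```

only `action` is treated as required, since the sample session above shows a step with no think line.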

3. act

executes the chosen action via adb - tap, type, swipe, launch, press back. 22 actions available.
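
to make the adb side concrete, here is a sketch that maps five of those actions to `adb` argument lists using the standard `input` and `monkey` shell tools. the `Action` union and `toAdbArgs` are hypothetical; the real action set has 22 entries and may be wired differently.

```typescript
// hypothetical action shapes covering only the five named above
type Action =
  | { type: "tap"; x: number; y: number }
  | { type: "type"; text: string }
  | { type: "swipe"; x1: number; y1: number; x2: number; y2: number; durationMs: number }
  | { type: "launch"; packageName: string }
  | { type: "back" };

// build the argument list that would follow `adb` on the command line
function toAdbArgs(action: Action): string[] {
  switch (action.type) {
    case "tap":
      return ["shell", "input", "tap", String(action.x), String(action.y)];
    case "type":
      // `input text` reads %s as a space; other shell metacharacters
      // would need escaping too, which this sketch does not attempt
      return ["shell", "input", "text", action.text.replace(/ /g, "%s")];
    case "swipe":
      return [
        "shell", "input", "swipe",
        String(action.x1), String(action.y1),
        String(action.x2), String(action.y2),
        String(action.durationMs),
      ];
    case "launch":
      // common idiom: ask monkey to fire the app's launcher intent once
      return ["shell", "monkey", "-p", action.packageName, "-c", "android.intent.category.LAUNCHER", "1"];
    case "back":
      return ["shell", "input", "keyevent", "KEYCODE_BACK"];
  }
}
```

returning an argument array rather than one command string keeps quoting on the host side out of the picture when the list is handed to a process spawner.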

4. adapt

if the screen doesn't change for 3 steps, stuck recovery kicks in. an empty accessibility tree falls back to screenshots.
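
one way stuck detection could work, as a sketch: keep a signature of the last screen and count consecutive steps where it stays the same. `StuckDetector` and the exact counting rule (stuck once the screen has gone unchanged for `threshold` steps in a row) are assumptions, not the project's confirmed logic.

```typescript
// hypothetical helper: flags when the screen signature stops changing
class StuckDetector {
  private last: string | null = null;
  private unchanged = 0;
  private readonly threshold: number;

  constructor(threshold = 3) {
    this.threshold = threshold;
  }

  // feed one signature per step (e.g. the serialized element list);
  // returns true once the screen has been unchanged for `threshold` steps
  observe(screenSignature: string): boolean {
    if (screenSignature === this.last) {
      this.unchanged++;
    } else {
      this.last = screenSignature;
      this.unchanged = 0;
    }
    return this.unchanged >= this.threshold;
  }
}
```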

interactive, workflows, or flows

type a goal, chain goals across apps with ai, or run deterministic steps with no llm calls.

interactive

just type

run it and describe what you want. the agent figures out the rest.

$ bun run src/kernel.ts
enter your goal: send "running late, 10 mins" to Mom on whatsapp

workflows

ai-powered · json

chain goals across multiple apps. steps are written in natural language and the llm handles navigation.

{
  "name": "weather to whatsapp",
  "steps": [
    { "app": "com.google...",
      "goal": "search chennai weather" },
    { "goal": "share to Sanju" }
  ]
}
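
a sketch of how a workflow file like the one above might be typed and checked before running. the field names `name`, `steps`, `app` and `goal` come from the sample; `Workflow`, `WorkflowStep` and `parseWorkflow` are hypothetical names, and treating `app` as optional is an inference from the second step omitting it.

```typescript
// shapes inferred from the sample workflow; not an official schema
type WorkflowStep = { app?: string; goal: string };
type Workflow = { name: string; steps: WorkflowStep[] };

function parseWorkflow(json: string): Workflow {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) throw new Error("workflow must be an object");
  const obj = data as Record<string, unknown>;

  const name = obj["name"];
  const rawSteps = obj["steps"];
  if (typeof name !== "string") throw new Error("workflow needs a name");
  if (!Array.isArray(rawSteps) || rawSteps.length === 0) throw new Error("workflow needs steps");

  const steps: WorkflowStep[] = rawSteps.map((raw: unknown, i: number) => {
    if (typeof raw !== "object" || raw === null) throw new Error(`step ${i + 1} must be an object`);
    const step = raw as Record<string, unknown>;
    const goal = step["goal"];
    const app = step["app"];
    if (typeof goal !== "string" || goal === "") throw new Error(`step ${i + 1} needs a goal`);
    // only attach `app` when the step names one
    return typeof app === "string" ? { app, goal } : { goal };
  });

  return { name, steps };
}
```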

flows

instant · yaml

fixed taps and typing. no llm needed, instant execution. suited to repetitive tasks.

appId: com.whatsapp
name: Send WhatsApp Message
---
- launchApp
- tap: "Contact Name"
- type: "hello from droidclaw"
- tap: "Send"
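
to show why flows need no llm, here is a sketch that reads the tiny subset of yaml used above (a `key: value` header, a `---` separator, then `- command` or `- command: "argument"` lines) into a fixed list of steps. it is a hand-rolled parser for illustration only; `Flow`, `FlowStep` and `parseFlow` are hypothetical, and a real runner would more likely use a yaml library.

```typescript
// hypothetical shapes for a parsed flow
type FlowStep = { command: string; arg?: string };
type Flow = { meta: Record<string, string>; steps: FlowStep[] };

function parseFlow(source: string): Flow {
  // header and step list are separated by a line containing only ---
  const [head = "", body = ""] = source.split(/^---[ \t]*$/m);

  const meta: Record<string, string> = {};
  for (const line of head.split("\n")) {
    const m = line.match(/^(\w+):\s*(.*)$/);
    const key = m?.[1];
    const value = m?.[2];
    if (key !== undefined && value !== undefined) meta[key] = value.trim();
  }

  const steps: FlowStep[] = [];
  for (const line of body.split("\n")) {
    // matches `- launchApp` and `- tap: "Send"` (quotes optional)
    const m = line.match(/^-\s+(\w+)(?::\s*"?(.*?)"?)?\s*$/);
    const command = m?.[1];
    if (command === undefined) continue;
    const arg = m?.[2];
    steps.push(arg === undefined ? { command } : { command, arg });
  }

  return { meta, steps };
}
```

because every step is already a concrete command, a runner can execute the list top to bottom with no model call in between.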

workflows

  • json format, uses ai
  • handles ui changes and popups
  • slower (llm calls each step)
  • best for complex multi-app tasks

flows

  • yaml format, no ai needed
  • breaks when the ui changes
  • instant execution
  • best for simple repeatable tasks

what you can do with it

call ai apps on the device, control phones remotely, turn old devices into always-on agents.

call ai apps on your phone

open google's ai mode, ask a question, get the answer, forward it to whatsapp. or ask chatgpt something and share the response to slack. the agent treats your phone's apps as tools - no api keys for those services needed.

remote control via tailscale

install tailscale on phone + laptop. connect adb over the tailnet. your phone is now a remote agent - control it from anywhere. run workflows from a cron job at 8am every morning.

# from anywhere:
adb connect <phone-tailscale-ip>:5555
bun run src/kernel.ts --workflow morning.json

old phone, always on

that old phone in your drawer can now send slack standups, check flight prices, digest telegram channels, forward weather to whatsapp. it runs apps that don't have apis.

ai-driven automation

unlike predefined button flows, the agent actually thinks. if a button moves, a popup appears, or the layout changes - it adapts automatically. it reads the screen, understands context, and makes decisions.

what it can do right now

across any app installed on the device.

messaging

  • send whatsapp to saved or unsaved numbers
  • reply to latest sms
  • compose emails via gmail
  • telegram messages to groups
  • post standups to slack
  • broadcast to multiple contacts

research

  • search google, collect results
  • ask chatgpt / gemini, grab answer
  • check weather, stocks, flights
  • compare prices across apps
  • translate via google translate
  • compile multi-source digests

social media

  • post to instagram, twitter/x
  • like and comment on posts
  • check engagement metrics
  • save youtube to watch later
  • follow/unfollow accounts
  • check linkedin notifications

productivity

  • morning briefing across apps
  • create calendar events
  • capture notes in google keep
  • check github pull requests
  • set alarms and reminders
  • triage notifications

lifestyle

  • order food from delivery apps
  • book an uber ride
  • play songs on spotify
  • check commute on maps
  • log workouts, track expenses
  • toggle do not disturb

device control

  • toggle wifi, bluetooth, airplane
  • adjust brightness, volume
  • force stop or clear cache
  • grant/revoke permissions
  • install/uninstall apps
  • run any adb shell command

what works, what doesn't

22 actions + 6 multi-step skills. here's the reality.

works well

  • native android apps with standard ui
  • multi-app workflows that chain goals
  • device settings via shell commands
  • text input, navigation, taps
  • stuck detection + recovery
  • vision fallback for empty trees

unreliable

  • flutter, react native, games
  • webviews (incomplete tree)
  • drag & drop, multi-finger
  • notification interaction
  • clipboard on android 12+
  • captchas and bot detection

not possible

  • banking apps (FLAG_SECURE)
  • biometrics (fingerprint, face)
  • bypass encrypted lock screen
  • access other apps' private data
  • audio or camera streams
  • pinch-to-zoom gestures

quick start

1

install

one command. installs bun and adb if missing, clones the repo, sets up .env.

curl -fsSL https://uclaw.cc/install.sh | sh

or install manually:

# install adb
brew install android-platform-tools

# install bun (required; npm/node won't work)
curl -fsSL https://bun.sh/install | bash

# clone and set up
git clone https://github.com/unitedbyai/droidclaw.git
cd droidclaw && bun install
cp .env.example .env

2

configure an llm provider

edit .env - fastest way to start is groq (free tier):

LLM_PROVIDER=groq
GROQ_API_KEY=gsk_your_key_here

# or run fully local with ollama (no api key needed)
# ollama pull llama3.2
# LLM_PROVIDER=ollama

provider     cost           vision   notes
groq         free           no       fastest to start
ollama       free (local)   yes*     no api key, runs on your machine
openrouter   per token      yes      200+ models
openai       per token      yes      gpt-4o
bedrock      per token      yes      claude on aws

3

install the android app

download and install the companion app on your android device.

download apk (v0.4.0)

4

connect your phone

enable usb debugging in developer options, then connect via usb.

adb devices   # should show your device
cd droidclaw && bun run src/kernel.ts

5

tune the config (optional)

key               default    what
MAX_STEPS         30         steps before giving up
STEP_DELAY        2          seconds between actions
STUCK_THRESHOLD   3          steps before stuck recovery
VISION_MODE       fallback   off / fallback / always
MAX_ELEMENTS      40         ui elements sent to the llm
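
a sketch of reading these keys with their defaults. the key names and default values come from the table; `Config`, `VisionMode` and `loadConfig` are hypothetical, and how invalid values are handled (falling back to the default) is an assumption.

```typescript
type VisionMode = "off" | "fallback" | "always";

// hypothetical config shape mirroring the table
type Config = {
  maxSteps: number;
  stepDelay: number; // seconds
  stuckThreshold: number;
  visionMode: VisionMode;
  maxElements: number;
};

function loadConfig(env: Record<string, string | undefined>): Config {
  // parse a non-negative integer, or fall back when missing or malformed
  const int = (key: string, fallback: number): number => {
    const n = Number.parseInt(env[key] ?? "", 10);
    return Number.isFinite(n) && n >= 0 ? n : fallback;
  };

  const vision = env["VISION_MODE"];
  const visionMode: VisionMode =
    vision === "off" || vision === "always" || vision === "fallback" ? vision : "fallback";

  return {
    maxSteps: int("MAX_STEPS", 30),
    stepDelay: int("STEP_DELAY", 2),
    stuckThreshold: int("STUCK_THRESHOLD", 3),
    visionMode,
    maxElements: int("MAX_ELEMENTS", 40),
  };
}
```

in a bun or node process this would typically be called as `loadConfig(process.env)`.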

35 workflows + 5 flows

ready to use. workflows are ai-powered (json), flows are deterministic (yaml).

messaging 10 workflows
social media 4 workflows
productivity 8 workflows
research 6 workflows
lifestyle 8 workflows
flows 5 deterministic

10 files in src/

kernel.ts          main loop
actions.ts         22 actions + adb retry
skills.ts          6 multi-step skills
workflow.ts        workflow orchestration
flow.ts            yaml flow runner
llm-providers.ts   5 providers + system prompt
sanitizer.ts       accessibility xml parser
config.ts          env config
constants.ts       keycodes, coordinates
logger.ts          session logging