operate-android-devices-with-bochi
Bochi - Android Device Control for AI Agents
Bochi is a command line tool for AI agents to control Android devices via ADB. Use this skill when you need to interact with Android UI elements programmatically, such as tapping buttons, waiting for elements to appear, or automating Android device interactions. Supports CSS-like element selectors with attribute assertions, AND/OR logic, descendant matching, and negation.
Features
- Uses adb shell uiautomator dump to obtain UI layout information
- Supports CSS-like element selectors with attribute assertions, AND/OR logic, descendant matching, and negation
- Commands: waitFor, tap, inputText, longTap, doubleTap, scrollUp, scrollDown
- Configurable timeout
Installation
Install from crates.io
cargo install bochi
Build from source
git clone https://github.com/linmx0130/bochi.git && cd bochi
cargo build --release
The binary will be available at target/release/bochi.
Usage
bochi [OPTIONS] --selector <SELECTOR> --command <COMMAND>
Options:
-h, --help Print help
Common Parameters:
-s, --serial <SERIAL>
-e, --selector <SELECTOR> Element selector. Supports CSS-like syntax:
    - [attr=value] - attribute assertion
    - [attr1=v1][attr2=v2] - AND of clauses
    - sel1,sel2 - OR of selectors
    - :has(cond) - has descendant matching cond
-c, --command <COMMAND>
-t, --timeout <TIMEOUT> [default: 30]
Command-Specific Parameters:
--text <TEXT> Text content for inputText command
--print-descendants Print the XML of matched elements including their descendants (for waitFor command)
--scroll-target <SELECTOR> Target element selector for scrollUp/scrollDown commands
Commands
All commands are executed against the elements matched by the selector. If the element is not found within the specified timeout, an error will be returned. If there are multiple elements matched, the command will be executed against the first element.
- waitFor: Wait for an element to appear
- tap: Tap an element
- inputText: Input text into an element
- longTap: Long tap (1000ms) an element
- doubleTap: Double tap an element
- scrollUp: Scroll up until the target element is visible (requires --scroll-target)
- scrollDown: Scroll down until the target element is visible (requires --scroll-target)
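The timeout and first-match rules above can be combined into a two-step helper. The following is a minimal sketch, assuming bochi is on PATH; the function name wait_then_tap is hypothetical, and it uses only the -e, -c, and -t options documented above.

```shell
# wait_then_tap SELECTOR [TIMEOUT]
# Hypothetical helper: wait for the element first, then tap it.
# Relies only on the documented behavior that bochi returns an error
# when nothing matches the selector within the timeout.
wait_then_tap() {
    selector=$1
    timeout=${2:-30}   # 30 mirrors bochi's documented default
    # Stop early if the element never appears.
    bochi -e "$selector" -c waitFor -t "$timeout" || return 1
    # The tap goes to the first element matched by the selector.
    bochi -e "$selector" -c tap -t "$timeout"
}
```

A call such as wait_then_tap '[text="Submit"]' 10 would then wait for the element and tap it, returning non-zero if either step fails.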
Selector Syntax
The selector syntax is inspired by CSS selectors:
Basic Attribute Assertion
Use square brackets to match elements by attribute:
# Match element with text="Submit"
bochi -e '[text="Submit"]' -c tap
# Match element with class="Button"
bochi -e '[class=Button]' -c tap
Attribute Operators
In addition to exact match (=), you can use:
- ^= - starts with: [attr^=value] matches if attribute starts with value
- $= - ends with: [attr$=value] matches if attribute ends with value
- *= - contains: [attr*=value] matches if attribute contains value
# Match text starting with "Submit"
bochi -e '[text^=Submit]' -c tap
# Match text ending with "Button"
bochi -e '[text$=Button]' -c tap
# Match text containing "Search"
bochi -e '[text*=Search]' -c tap
# Combine operators
bochi -e '[class^=android.widget][text*=Save]' -c tap
AND Logic (Multiple Clauses)
Multiple square-bracket clauses written back to back are combined with AND:
# Match element with class="android.widget.Button" AND text="OK"
bochi -e '[class=android.widget.Button][text="OK"]' -c tap
# Match element with package="com.example" AND clickable="true"
bochi -e '[package=com.example][clickable=true]' -c tap
OR Logic (Comma-separated)
Use a comma (,) to express OR between selectors:
# Match element with text="Cancel" OR text="Back"
bochi -e '[text=Cancel],[text=Back]' -c tap
# Match element with text="OK" OR text="Confirm"
bochi -e '[text=OK],[text=Confirm]' -c tap
Descendant Matching (:has())
Use :has(cond) to select nodes which have a descendant matching the condition:
# Match a ScrollView element that contains an item with text="Item 1"
bochi -e '[class=android.widget.ScrollView]:has([text="Item 1"])' -c tap
# Match any element that has a descendant with text="Submit"
bochi -e ':has([text=Submit])' -c tap
Negation (:not())
Use :not(cond) to select nodes that do NOT match the condition:
# Match elements that are not clickable=false
bochi -e ':not([clickable=false])' -c tap
# Match elements with text containing "Confirm" but not clickable=false
bochi -e '[text*=Confirm]:not([clickable=false])' -c tap
# Match elements that do not have a descendant with text="Loading"
bochi -e ':not(:has([text=Loading]))' -c waitFor
Child Combinator (>)
Use > to select direct children:
# Match clickable elements that are direct children of a ScrollView
bochi -e '[class=android.widget.ScrollView]>[clickable=true]' -c tap
# Chain child combinators
bochi -e '[class=android.widget.ScrollView]>[class*=Item]>[text=Settings]' -c tap
Note: > only matches direct children, unlike :has() which matches any descendant.
Descendant Combinator (space)
Use a space to select any descendant (direct or indirect):
# Match clickable elements that are descendants of a ScrollView (any depth)
bochi -e '[class=android.widget.ScrollView] [clickable=true]' -c tap
# Match text anywhere within a specific container
bochi -e '[resource-id=com.example:id/container] [text="Submit"]' -c tap
# Chain descendant combinators
bochi -e '[class=android.widget.ScrollView] [class=android.widget.LinearLayout] [text="Item 1"]' -c tap
Note: Unlike >, the space combinator matches elements at any depth, not just direct children.
Complex Selectors
Combine all features for powerful selection:
# Match Button with text "OK" OR "Confirm"
bochi -e '[class=android.widget.Button][text=OK],[class=android.widget.Button][text=Confirm]' -c tap
Supported Attributes
- text - The text content of the element
- contentDescription (or content-desc, content_desc) - The content description
- resourceId (or resource-id, resource_id) - The resource ID
- class - The class name of the element
- package - The package name
- checkable, checked, clickable, enabled, focusable, focused
- long-clickable (or long_clickable), password, scrollable, selected
- bounds - The bounding rectangle
Quoting Values
Values can be quoted or unquoted:
- [text=Submit] - unquoted
- [text="Submit Button"] - double quotes (required for values with spaces)
- [text='Submit'] - single quotes
Using Opposite Quote Types
You can include one type of quote inside the other without escaping:
- [text="It's done"] - single quote inside double quotes
- [text='Say "Hello"'] - double quotes inside single quotes
Escape Sequences
To include the same type of quote within quoted values, use backslash escaping:
- [text="Say \"Hello\""] - escaped double quotes
- [text='It\'s done'] - escaped single quote
- [text="C:\\Windows"] - escaped backslash
Supported escape sequences:
- \" - double quote
- \' - single quote
- \\ - backslash
- Unknown sequences (e.g., \n) are preserved as-is
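When the attribute value comes from a variable, the escaping rules above can be applied mechanically. The sketch below is one way to do that in shell; bochi_attr is a hypothetical helper name, not part of bochi, and it always emits a double-quoted value.

```shell
# bochi_attr NAME VALUE
# Hypothetical helper: prints [NAME="VALUE"] with backslashes and
# double quotes in VALUE escaped as described above.
bochi_attr() {
    # Escape backslashes first so the backslashes added for quotes
    # are not escaped a second time.
    escaped=$(printf '%s' "$2" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '[%s="%s"]\n' "$1" "$escaped"
}
```

It could be used as bochi -e "$(bochi_attr text 'Say "Hello"')" -c tap, which passes the selector [text="Say \"Hello\""] to bochi.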
Examples
Wait for an element to appear
bochi -e '[text=Submit]' -c waitFor
Wait for an element and print its descendants
bochi -e '[class=android.widget.ScrollView]' -c waitFor --print-descendants
Tap an element
bochi -e '[contentDescription="Open Menu"]' -c tap
If multiple elements match the selector, the first one is tapped. To select accurately, set a unique contentDescription or resource-id on the element in the app code.
Input text into an element
bochi -e '[resource-id=com.example:id/username]' -c inputText --text "myusername"
If multiple elements match the selector, the first one receives the input. To select accurately, set a unique contentDescription or resource-id on the element in the app code.
Tap element with OR condition
bochi -e '[text=OK],[text=Confirm]' -c tap
Tap a list item within a specific container
bochi -e '[class$=RecyclerView] [text="Settings"]' -c tap
Match text starting with a prefix
bochi -e '[text^=Loading]' -c waitFor
Match resource-id ending with a suffix
bochi -e '[resource-id$=submit_button]' -c tap
Match text containing a substring
bochi -e '[text*="Save Changes"]' -c tap
Select direct children
# Select clickable buttons directly under a toolbar
bochi -e '[class$=Toolbar]>[clickable=true]' -c tap
# Chain: Select the Settings item in a RecyclerView
bochi -e '[class$=RecyclerView]>[class$=LinearLayout]>[text=Settings]' -c tap
Use with specific device
bochi -s emulator-5554 -e '[resource-id=com.example:id/button]' -c tap
Set custom timeout
bochi -e '[text=Loading]' -c waitFor -t 60
Scroll to an element
For scrollable containers like RecyclerView or ScrollView, use scrollUp or scrollDown to find an element:
# Scroll down in a RecyclerView to find an item
bochi -e '[class$=RecyclerView]' -c scrollDown --scroll-target '[text="Item 50"]'
# Scroll up to find an element at the top
bochi -e '[scrollable=true]' -c scrollUp --scroll-target '[text="Header"]'
The -e selector specifies the scrollable container, and --scroll-target specifies the element to scroll into view. The command will perform gradual swipes until the target element becomes visible or the timeout is reached.
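Scrolling only brings the target into view; acting on it takes a second command. The following sketch chains the two steps, assuming bochi is on PATH. The function name scroll_down_and_tap is hypothetical, and only the documented -e, -c, and --scroll-target options are used.

```shell
# scroll_down_and_tap CONTAINER TARGET
# Hypothetical helper: scroll CONTAINER down until TARGET is visible,
# then tap TARGET.
scroll_down_and_tap() {
    container=$1
    target=$2
    # -e names the scrollable container, --scroll-target the element to find.
    bochi -e "$container" -c scrollDown --scroll-target "$target" || return 1
    # Once visible, the target can be matched directly with -e.
    bochi -e "$target" -c tap
}
```

For example, scroll_down_and_tap '[class$=RecyclerView]' '[text="Item 50"]' would scroll the list and then tap the item, stopping with a non-zero status if the scroll step fails.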
Selecting a Button Within a Specific Container
When you need to interact with a button that appears multiple times on the screen (e.g., "Reset" buttons for different layout configurations), you can combine the :has() pseudo-class with the child combinator (>) to precisely target the button within a specific container.
# Click the "Reset" button within the Portrait Layout section
bochi -e ':has([text*=Portrait]) > [clickable=true]:has([text="Reset"])' -c tap
How it works:
- :has([text*=Portrait]) - Selects a container element that contains a descendant with text matching "Portrait" (e.g., the "Portrait Layout" card)
- > - The child combinator restricts the search to direct children of the container
- [clickable=true]:has([text="Reset"]) - Matches a clickable element that contains the text "Reset"
This pattern is useful when:
- Multiple similar buttons exist on the same screen (e.g., "Edit" or "Reset" buttons for different settings categories)
- You need to distinguish between buttons based on their container context
- Elements don't have unique resource IDs but their parent containers have distinguishing text
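The same selector pattern can be reused for several containers by substituting the container text. This is a rough sketch only: reset_in_sections is a hypothetical helper, "Landscape" in the usage note is an assumed second section name, and each section name must be a single word because it is inserted unquoted.

```shell
# reset_in_sections SECTION...
# Hypothetical helper: tap the "Reset" button inside each named section.
# Each SECTION must be a single word, since it is placed unquoted after *=.
reset_in_sections() {
    for section in "$@"; do
        # Same pattern as above, with the container text substituted.
        bochi -e ":has([text*=$section]) > [clickable=true]:has([text=\"Reset\"])" -c tap || return 1
    done
}
```

Calling reset_in_sections Portrait Landscape would issue one tap per section and stop at the first failure.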
Exit Codes
- 0 - Success
- 1 - Error (element not found, timeout, ADB error, etc.)
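Because every failure maps to exit code 1, a script can branch on the exit status but cannot tell a missing element from an ADB error. The sketch below handles optional UI, such as a dialog that may or may not appear, under that limitation. tap_if_present is a hypothetical helper, and the short default timeout of 3 is an arbitrary choice.

```shell
# tap_if_present SELECTOR [TIMEOUT]
# Hypothetical helper: try to tap an optional element and carry on
# either way. Any non-zero exit from bochi is treated as "not present",
# which also hides ADB errors.
tap_if_present() {
    if bochi -e "$1" -c tap -t "${2:-3}"; then
        echo "tapped: $1"
    else
        echo "skipped: $1"
    fi
    return 0
}
```

For example, tap_if_present '[text=Allow]' taps the element if it shows up within the timeout and otherwise reports that it was skipped; the text "Allow" is only an illustration.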
Requirements
- Android Debug Bridge (ADB) installed and in PATH
- Android device connected and authorized for debugging
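These requirements can be checked before running any bochi command. The following is a minimal sketch, not part of bochi: preflight is a hypothetical function name, and it assumes a single connected device, since adb get-state without a serial fails when several devices are attached.

```shell
# preflight
# Hypothetical check that adb and bochi are available and that a
# device is connected and authorized.
preflight() {
    command -v adb >/dev/null 2>&1 || { echo "adb not found in PATH" >&2; return 1; }
    command -v bochi >/dev/null 2>&1 || { echo "bochi not found in PATH" >&2; return 1; }
    # adb get-state prints "device" for a connected, authorized device.
    state=$(adb get-state 2>/dev/null) || state=""
    if [ "$state" != "device" ]; then
        echo "no authorized device (state: ${state:-none})" >&2
        return 1
    fi
}
```

A script could start with preflight || exit 1 so that later failures are more likely to be about the UI than about the setup.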
Tips for using bochi during development
- For accurate selection, resource-id is the best attribute to query when it is available.
- In Jetpack Compose, testTag can be exposed as resource-id by applying Modifier.semantics { testTagsAsResourceId = true } on the containers.
- Selecting elements by an accurate content description is also a good idea. Because content descriptions are used by accessibility tools, giving elements unique, concise, and accurate content descriptions benefits both automated tools like bochi and human users.