
mobile-mcp
Model Context Protocol Server for Mobile Automation and Scraping
Mobile Next - MCP server for Mobile Automation
This is a Model Context Protocol (MCP) server that enables scalable mobile automation through a platform-agnostic interface, eliminating the need for distinct iOS or Android knowledge. This server allows Agents and LLMs to interact with native iOS/Android applications and devices through structured accessibility snapshots or coordinate-based taps based on screenshots.
https://github.com/user-attachments/assets/c4e89c4f-cc71-4424-8184-bdbc8c638fa1
🚀 Mobile MCP Roadmap: Building the Future of Mobile
Join us on our journey as we continuously enhance Mobile MCP! Check out our detailed roadmap to see upcoming features, improvements, and milestones. Your feedback is invaluable in shaping the future of mobile automation.
Main use cases
How we help to scale mobile automation:
- 📲 Native app automation (iOS and Android) for testing or data-entry scenarios.
- 📝 Scripted flows and form interactions without manually controlling simulators/emulators or physical devices (iPhone, Samsung, Google Pixel etc)
- 🧭 Automating multi-step user journeys driven by an LLM
- 👆 General-purpose mobile application interaction for agent-based frameworks
- 🤖 Agent-to-agent communication for mobile automation use cases and data extraction
Main Features
- 🚀 Fast and lightweight: Uses native accessibility trees for most interactions, or screenshot-based coordinates where a11y labels are not available.
- 🤖 LLM-friendly: No computer vision model is required in accessibility (snapshot) mode.
- 🧿 Visual Sense: Evaluates and analyses what’s actually rendered on screen to decide the next action. If accessibility data or view-hierarchy coordinates are unavailable, it falls back to screenshot-based analysis.
- 📊 Deterministic tool application: Reduces ambiguity found in purely screenshot-based approaches by relying on structured data whenever possible.
- 📺 Extract structured data: Enables you to extract structured data from anything visible on screen.
Mobile MCP Architecture
Installation and configuration
Detailed guide for Claude Desktop
{
  "mcpServers": {
    "mobile-mcp": {
      "command": "npx",
      "args": ["-y", "@mobilenext/mobile-mcp@latest"]
    }
  }
}

Claude Code:

claude mcp add mobile -- npx -y @mobilenext/mobile-mcp@latest
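
If your client config already lists other MCP servers, the mobile-mcp entry has to be merged in rather than pasted over them. The following is a minimal TypeScript sketch of that merge. The helper name withMobileMcp and the "other-server" entry are hypothetical and used only for illustration; the command and args values are the ones from the config above.

```typescript
// Shape of one MCP server entry, mirroring the JSON config above.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: returns a copy of `config` with the mobile-mcp entry
// added. Servers that are already configured are left untouched.
function withMobileMcp(config: Partial<McpConfig>): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "mobile-mcp": {
        command: "npx",
        args: ["-y", "@mobilenext/mobile-mcp@latest"],
      },
    },
  };
}

// Example input: a config that already lists another (hypothetical) server.
const merged = withMobileMcp({
  mcpServers: { "other-server": { command: "node", args: ["server.js"] } },
});
```

Serializing merged with JSON.stringify(merged, null, 2) should give a config in the same shape as the snippet above, with both servers present.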
Prerequisites
What you will need to connect MCP with your agent and mobile devices:
- Xcode command line tools
- Android Platform Tools
- Node.js
- MCP-supported foundation models or agents, such as Claude MCP, OpenAI Agent SDK, Copilot Studio
Simulators, Emulators, and Physical Devices
When launched, Mobile MCP can connect to:
- iOS Simulators on macOS (the iOS Simulator ships with Xcode, which runs only on macOS)
- Android Emulators on Linux/Windows/macOS
- Physical iOS or Android devices (requires proper platform tools and drivers)
Make sure you have your mobile platform SDKs (Xcode, Android SDK) installed and configured properly before running Mobile Next Mobile MCP.
Running in "headless" mode on Simulators/Emulators
When you do not have a physical phone connected to your machine, you can run Mobile MCP with an emulator or simulator in the background.
For example, on Android:
- Start an emulator (avdmanager / emulator command).
- Run Mobile MCP with the desired flags
On iOS, you'll need Xcode, and the Simulator must be running before you use Mobile MCP with that simulator instance.
- List the available simulators: xcrun simctl list
- Boot a specific simulator: xcrun simctl boot "iPhone 16"
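
To script the simulator steps above, one option is to read the machine-readable listing from xcrun simctl list devices --json and boot only when needed. The sketch below is a hedged illustration, not part of Mobile MCP: the interfaces describe the parts of the simctl JSON output this example relies on (devices grouped by runtime, each with name, udid, and state), the helper names are hypothetical, and the sample UDID is a placeholder.

```typescript
// Subset of the `xcrun simctl list devices --json` output used here (assumed shape).
interface SimDevice {
  name: string;
  udid: string;
  state: string; // e.g. "Booted" or "Shutdown"
}

interface SimctlListing {
  devices: Record<string, SimDevice[]>; // keyed by runtime identifier
}

// Hypothetical helper: find the first simulator with the given name in any runtime.
function findSimulator(listing: SimctlListing, name: string): SimDevice | undefined {
  for (const devices of Object.values(listing.devices)) {
    const match = devices.find((d) => d.name === name);
    if (match) return match;
  }
  return undefined;
}

// Hypothetical helper: arguments for `xcrun` that boot the device,
// or null if it is already booted.
function bootArgs(device: SimDevice): string[] | null {
  return device.state === "Booted" ? null : ["simctl", "boot", device.udid];
}

// Sample listing with a placeholder UDID, standing in for real simctl output.
const sampleListing: SimctlListing = {
  devices: {
    "com.apple.CoreSimulator.SimRuntime.iOS-18-0": [
      { name: "iPhone 16", udid: "00000000-0000-0000-0000-000000000000", state: "Shutdown" },
    ],
  },
};
```

In a real script, the listing would come from parsing the command's stdout, and the returned arguments would be passed to a process runner such as Node's child_process.execFile("xcrun", args).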
Mobile Commands and interaction tools
The commands and tools support both accessibility-based locators (preferred) and coordinate-based inputs, so automation stays reliable and seamless even when accessibility/automation IDs are missing.
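
The locator preference described above can be sketched as a small request builder: use mobile_element_tap when an accessibility reference is known, otherwise fall back to mobile_tap with coordinates. This is an illustrative TypeScript sketch, not Mobile MCP's own client code. The request envelope follows the MCP tools/call JSON-RPC shape; the TapTarget type and buildTapRequest helper are hypothetical, and transport details are left out.

```typescript
// JSON-RPC envelope for an MCP tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical description of what we want to tap.
interface TapTarget {
  description: string; // human-readable, e.g. "Login button"
  ref?: string;        // accessibility/automation ID, if one is available
  x?: number;          // fallback screen coordinates
  y?: number;
}

// Prefer the accessibility-based tool; fall back to coordinates otherwise.
function buildTapRequest(id: number, target: TapTarget): ToolCallRequest {
  if (target.ref !== undefined) {
    return {
      jsonrpc: "2.0",
      id,
      method: "tools/call",
      params: {
        name: "mobile_element_tap",
        arguments: { element: target.description, ref: target.ref },
      },
    };
  }
  if (target.x !== undefined && target.y !== undefined) {
    return {
      jsonrpc: "2.0",
      id,
      method: "tools/call",
      params: { name: "mobile_tap", arguments: { x: target.x, y: target.y } },
    };
  }
  throw new Error(`No ref or coordinates available for "${target.description}"`);
}
```

For example, buildTapRequest(1, { description: "Login button", ref: "login-btn" }) produces a mobile_element_tap call, while the same target with only x and y produces a mobile_tap call. The ref value here is made up for illustration.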
mobile_list_apps
- Description: List all the installed apps on the device
- Parameters:
  - bundleId (string): The application's unique bundle/package identifier, e.g. com.google.android.keep or com.apple.mobilenotes
mobile_launch_app
- Description: Launches the specified app on the device/emulator
- Parameters:
  - bundleId (string): The application's unique bundle/package identifier, e.g. com.google.android.keep or com.apple.mobilenotes
mobile_terminate_app
- Description: Terminates a running application
- Parameters:
  - packageName (string): The application's bundle/package identifier. The app is stopped with am force-stop or killed by its PID.
mobile_get_screen_size
- Description: Get the screen size of the mobile device in pixels
- Parameters: None
mobile_click_on_screen_at_coordinates
- Description: Taps on the screen at the specified coordinates.
- Parameters:
  - x (number): X-coordinate
  - y (number): Y-coordinate
mobile_list_elements_on_screen
- Description: List elements on screen and their coordinates, with display text or accessibility label.
- Parameters: None
mobile_element_tap
- Description: Taps on a UI element identified by accessibility locator
- Parameters:
  - element (string): Human-readable element description (e.g., "Login button")
  - ref (string): Accessibility/automation ID or reference from a snapshot
mobile_tap
- Description: Taps on specified screen coordinates
- Parameters:
  - x (number): X-coordinate
  - y (number): Y-coordinate
mobile_press_button
- Description: Presses a button on the device (home, back, volume, enter, or power).
- Parameters: None
mobile_open_url
- Description: Open a URL in browser on device
- Parameters:
  - url (string): The URL to be opened (e.g., "https://example.com").
mobile_type_text
- Description: Types text into a focused UI element (e.g., TextField, SearchField)
- Parameters:
  - text (string): Text to type
  - submit (boolean): Whether to press Enter/Return after typing
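
Because mobile_type_text types into whichever element is focused, a form-filling flow usually pairs it with a tap on each field first. The sketch below builds such a sequence of tool calls, setting submit only on the last field. It is an assumption-laden illustration: the FormField type, the formFillCalls helper, and the example refs are hypothetical; only the tool names and their parameters come from this document.

```typescript
// Tool name plus arguments, ready to be wrapped in an MCP tools/call request.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Hypothetical description of one form field to fill.
interface FormField {
  description: string; // e.g. "Email field"
  ref: string;         // accessibility/automation ID from a snapshot
  value: string;       // text to type into the field
}

// For each field: tap to focus it, then type. Only the last field submits.
function formFillCalls(fields: FormField[]): ToolCall[] {
  const calls: ToolCall[] = [];
  fields.forEach((field, index) => {
    calls.push({
      name: "mobile_element_tap",
      arguments: { element: field.description, ref: field.ref },
    });
    calls.push({
      name: "mobile_type_text",
      arguments: { text: field.value, submit: index === fields.length - 1 },
    });
  });
  return calls;
}

// Example with made-up refs and values.
const loginCalls = formFillCalls([
  { description: "Email field", ref: "email-input", value: "user@example.com" },
  { description: "Password field", ref: "password-input", value: "not-a-real-password" },
]);
```

Here loginCalls contains four calls in order: tap, type, tap, type, with submit set to true only on the final mobile_type_text.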
mobile_element_swipe
- Description: Performs a swipe gesture from one UI element to another
- Parameters:
  - startElement (string): Human-readable description of the start element
  - startRef (string): Accessibility/automation ID of the start element
  - endElement (string): Human-readable description of the end element
  - endRef (string): Accessibility/automation ID of the end element
mobile_swipe
- Description: Performs a swipe gesture between two sets of screen coordinates
- Parameters:
  - startX (number): Start X-coordinate
  - startY (number): Start Y-coordinate
  - endX (number): End X-coordinate
  - endY (number): End Y-coordinate
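
Swipe coordinates are easier to keep device-independent when they are derived from the screen size rather than hard-coded. As a rough sketch, the helper below computes mobile_swipe arguments for a vertical swipe centered on the screen. It assumes the result of mobile_get_screen_size can be read as a width and height in pixels; the ScreenSize type and the scrollDownSwipe helper are hypothetical.

```typescript
// Assumed shape of the screen size, in pixels.
interface ScreenSize {
  width: number;
  height: number;
}

// Arguments for mobile_swipe, as listed above.
interface SwipeArgs {
  startX: number;
  startY: number;
  endX: number;
  endY: number;
}

// Swipe upward along the vertical center line to scroll content down.
// `fraction` is the share of the screen height the swipe covers (0 < fraction <= 1).
function scrollDownSwipe(screen: ScreenSize, fraction: number = 0.6): SwipeArgs {
  if (fraction <= 0 || fraction > 1) {
    throw new Error("fraction must be in the range (0, 1]");
  }
  const x = Math.round(screen.width / 2);
  const startY = Math.round(screen.height * (0.5 + fraction / 2));
  const endY = Math.round(screen.height * (0.5 - fraction / 2));
  return { startX: x, startY, endX: x, endY };
}
```

On a 1080 x 2400 screen with the default fraction, this yields a swipe from (540, 1920) to (540, 480).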
mobile_press_key
- Description: Presses hardware keys or triggers special events (e.g., back button on Android)
- Parameters:
  - key (string): Key identifier (e.g., HOME, BACK, VOLUME_UP, etc.)
mobile_take_screenshot
- Description: Captures a screenshot of the current device screen
- Parameters: None
mobile_get_source
- Description: Fetches the current device UI structure as an accessibility snapshot in XML format
- Parameters: None
Thanks to all contributors ❤️
We appreciate everyone who has helped improve this project.