Draft
65 commits
c1a3398
fix(cli): allow duplicate YAML files in config.yaml (#1327)
Copilot Oct 17, 2025
46ffc1d
refactor(core): remove non-OpenAI SDK support and upgrade to OpenAI 6…
quanru Oct 17, 2025
80a2c97
feat(core,shared): enforce VL mode requirement for Planning (#1332)
quanru Oct 17, 2025
2a98471
chore(core): remove warning msg for gpt-4 (#1331)
yuyutaotao Oct 20, 2025
23c49d3
feat(core): update recorder (#1330)
yuyutaotao Oct 20, 2025
68952aa
refactor: rename API methods for improved clarity (#1335)
quanru Oct 20, 2025
c9b385b
chore(core): update tasks impementation (#1338)
yuyutaotao Oct 20, 2025
d91e8f7
chore(release): upgrade all packages to v1.0.0 (#1340)
quanru Oct 20, 2025
469b843
refactor(core): remove unused getXpathsById method (#1342)
quanru Oct 20, 2025
89b4067
feat(core): support custom OpenAI client instances for observability …
quanru Oct 21, 2025
95eb850
chore(ci): enable workflows for PRs targeting 1.0 branch (#1345)
quanru Oct 21, 2025
581d5fb
Merge main to 1.0 (#1348)
quanru Oct 21, 2025
dc60bc3
refine(core): use 'subTask' flag to reuse context (#1350)
yuyutaotao Oct 21, 2025
641d326
chore(lint): fix linting and formatting issues (#1351)
quanru Oct 21, 2025
767127e
feat(chrome-extension): enable hot reload for development (#1353)
quanru Oct 21, 2025
b150f2f
Merge main (#1371)
quanru Oct 23, 2025
5c5b7bd
feat(bridge-mode): add remote access support for cross-machine commun…
quanru Oct 24, 2025
b10ad89
fix(build): switch from nano-staged to lint-staged for proper auto-fi…
quanru Oct 24, 2025
fd58ead
feat(yaml): Support all device options in YAML configuration (#1367)
quanru Oct 24, 2025
87026c4
feat(report): update task display naming conventions (#1379)
quanru Oct 27, 2025
edc4064
fix(cli): use clonedYamlScript consistently for agent configuration (…
quanru Oct 27, 2025
5300926
refactor(env): modernize model configuration environment variables (#…
quanru Oct 27, 2025
57cd24a
refactor(core): remove tree in context (#1376)
yuyutaotao Oct 27, 2025
9894171
feat(core): update signature of warp-openai (#1383)
yuyutaotao Oct 27, 2025
4ee95a8
feat(env): add backward compatibility for MIDSCENE_OPENAI_* environme…
quanru Oct 27, 2025
52db2aa
feat(android): add screenshot polling fallback for remote devices (#1…
quanru Oct 27, 2025
b709f7f
refactor(report): consolidate PlaygroundSDK creation for report compo…
quanru Oct 28, 2025
2820964
refactor(core): rename Insight class to Service (#1386)
quanru Oct 28, 2025
bdf0999
docs(site): clarify file execution order in YAML scripts docs (#1397)
quanru Oct 29, 2025
11046a8
fix(core): improve Assert task error handling (#1399)
quanru Oct 29, 2025
75f2ff1
fix(web-integration): prevent temp file leakage in Playwright tests (…
quanru Oct 29, 2025
41944c9
feat(core): reuse context for screenshot (#1401)
yuyutaotao Oct 29, 2025
ef94425
chore(core): remove @midscene/recorder from core.deps (#1394)
EAGzzyCSL Oct 29, 2025
ff7ccb2
feat(core): optimize AI prompts and implement order-sensitive judgmen…
quanru Oct 29, 2025
77aa9dc
fix(ios): use WebDriver Clear API for dynamic input fields (#1403)
quanru Oct 30, 2025
cdeae16
feat(shared): add XPath match count helper and warning for ambiguous …
quanru Oct 30, 2025
c7fe766
fix(visualizer): prevent video export hang caused by animation race c…
quanru Oct 30, 2025
5baeddd
docs(site): add WebDriverAgent version requirement to iOS guide (#1411)
quanru Nov 1, 2025
9bffc83
fix(core): action context as param (#1415)
yuyutaotao Nov 3, 2025
6e4b196
feat(core): add runAdbShell support to YAML automation scripts (#1391)
Copilot Nov 3, 2025
44589ae
feat(core): show intent in report (#1407)
yuyutaotao Nov 4, 2025
7a454d5
refactor(core): change Locate task from Insight to Planning type (#1406)
quanru Nov 4, 2025
706f70c
feat(core): update timeout strategy of aiWaitFor (#1419)
yuyutaotao Nov 4, 2025
ee2ff72
chore(core): refine error processing of agent (#1417)
yuyutaotao Nov 4, 2025
94f4f74
feat(android,ios): expose mobile system navigation actions (#1420)
yuyutaotao Nov 4, 2025
1166529
fix(android): correct orientation handling for displayId screenshots …
Copilot Nov 6, 2025
b386a5e
docs(site): remove unreleased model env names (#1427)
yuyutaotao Nov 7, 2025
0017b20
feat(ios): add WebDriverAgent 5.x-7.x compatibility (#1426)
quanru Nov 7, 2025
03485d6
fix(visualizer): cursor not move in player (#1429)
quanru Nov 7, 2025
8378dae
docs(core): docs for 1.0 (#1423)
yuyutaotao Nov 7, 2025
8d19a22
fix(core): make paramSchema optional for actions without parameters (…
quanru Nov 7, 2025
341bdb8
fix: workflow of planning (#1431)
yuyutaotao Nov 10, 2025
ee40b90
Enhanced report UI with improved element rendering and highlight effe…
quanru Nov 10, 2025
6769784
feat(shared): unify VQA and grounding models into insight model (#1432)
quanru Nov 10, 2025
d8cf126
feat(report): Add comprehensive dark mode support (#1434)
quanru Nov 10, 2025
aa30ba8
feat(report): replace sidebar grid layout with antd Table (#1436)
quanru Nov 10, 2025
1228a7d
feat(core): update sidebar ui (#1437)
yuyutaotao Nov 11, 2025
d1104a0
fix(core): replay scripts (#1440)
yuyutaotao Nov 11, 2025
4a0680b
feat(report): Improve dark mode UI styling (#1438)
quanru Nov 11, 2025
2792369
feat(core): show markup in screenshot panel (#1444)
yuyutaotao Nov 11, 2025
6fcc788
feat(core): redefine scroll param (#1441)
yuyutaotao Nov 11, 2025
225224a
feat(mcp): implement auto-destroy agent after each tool call (#1443)
quanru Nov 11, 2025
43e3316
feat(core): redefine the ai shortcut (#1445)
yuyutaotao Nov 11, 2025
12345da
fix(core): report scripts
yuyutaotao Nov 11, 2025
b88a183
Merge branch '1.0' of https://github.com/web-infra-dev/midscene into 1.0
yuyutaotao Nov 11, 2025
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -7,6 +7,7 @@ on:
pull_request:
branches:
- main
+ - "1.0"

permissions:
contents: read
1 change: 1 addition & 0 deletions .github/workflows/lint.yml
@@ -7,6 +7,7 @@ on:
pull_request:
branches:
- main
+ - "1.0"

permissions:
contents: read
1 change: 1 addition & 0 deletions .gitignore
@@ -112,6 +112,7 @@ midscene_run/dump
*.ignore.png
extension_output
.cursor
+ apps/chrome-extension/web-ext-profile
packages/android-playground/static/
packages/ios-playground/static/
packages/ios/static/
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -121,7 +121,7 @@ To change the AI-related code of this repository, you need to create a '.env 'fi

```
OPENAI_API_KEY="your_token"
- MIDSCENE_MODEL_NAME="gpt-4o-2024-08-06"
+ MIDSCENE_MODEL_NAME="qwen3-vl-plus"
```


3 changes: 2 additions & 1 deletion README.md
@@ -81,7 +81,7 @@ Read more about [Choose a model](https://midscenejs.com/choose-a-model)
Midscene will automatically plan the steps and execute them. It may be slower and heavily rely on the quality of the AI model.

```javascript
- await aiAction('click all the records one by one. If one record contains the text "completed", skip it');
+ await aiAct('click all the records one by one. If one record contains the text "completed", skip it');
```

### Workflow Style
@@ -131,6 +131,7 @@ Community projects that extend Midscene.js capabilities:

* [midscene-ios](https://github.com/lhuanyu/midscene-ios) - iOS automation support for Midscene
* [Midscene-Python](https://github.com/Python51888/Midscene-Python) - Python SDK for Midscene automation
+ * [midscene-java](https://github.com/Master-Frank/midscene-java) - Java SDK that brings Midscene automation features to JVM projects


## 📝 Credits
3 changes: 2 additions & 1 deletion README.zh.md
@@ -81,7 +81,7 @@ Midscene.js 支持视觉语言模型,例如 `Qwen3-VL`、`Doubao-1.6-vision`
Midscene 会自动规划步骤并执行。它可能较慢,并且深度依赖于 AI 模型的质量。

```javascript
- await aiAction('click all the records one by one. If one record contains the text "completed", skip it');
+ await aiAct('click all the records one by one. If one record contains the text "completed", skip it');
```

### 工作流风格
@@ -131,6 +133,7 @@ for (const record of recordList) {

* [midscene-ios](https://github.com/lhuanyu/midscene-ios) - iOS 设备自动化工具
* [Midscene-Python](https://github.com/Python51888/Midscene-Python) - Python 版本的 Midscene SDK
+ * [midscene-java](https://github.com/Master-Frank/midscene-java) - Java 版本的 Midscene SDK,便于在 JVM 项目中使用自动化能力

## 📝 致谢

12 changes: 6 additions & 6 deletions apps/android-playground/package.json
@@ -27,12 +27,12 @@
"socket.io-client": "4.8.1"
},
"devDependencies": {
- "@rsbuild/core": "^1.3.22",
- "@rsbuild/plugin-less": "^1.2.4",
- "@rsbuild/plugin-node-polyfill": "1.3.0",
- "@rsbuild/plugin-react": "^1.3.1",
- "@rsbuild/plugin-svgr": "^1.1.1",
- "@rsbuild/plugin-type-check": "1.2.3",
+ "@rsbuild/core": "^1.5.17",
+ "@rsbuild/plugin-less": "^1.5.0",
+ "@rsbuild/plugin-node-polyfill": "1.4.2",
+ "@rsbuild/plugin-react": "^1.4.1",
+ "@rsbuild/plugin-svgr": "^1.2.2",
+ "@rsbuild/plugin-type-check": "1.2.4",
"@types/react": "^18.3.1",
"@types/react-dom": "^18.3.1",
"archiver": "^6.0.0",
46 changes: 40 additions & 6 deletions apps/android-playground/src/App.tsx
@@ -1,6 +1,10 @@
import './App.less';
- import { SCRCPY_SERVER_PORT } from '@midscene/shared/constants';
+ import {
+ PLAYGROUND_SERVER_PORT,
+ SCRCPY_SERVER_PORT,
+ } from '@midscene/shared/constants';
import {
+ ScreenshotViewer,
globalThemeConfig,
safeOverrideAIConfig,
useEnvConfig,
@@ -17,6 +21,13 @@ import ScrcpyPlayer, {

const { Content } = Layout;

+ // Helper function to detect if device is remote (IP:Port format)
+ const isRemoteDevice = (deviceId: string | null): boolean => {
+ if (!deviceId) return false;
+ // Remote device format: IP:Port (e.g., 192.168.1.10:5555)
+ return /^\d+\.\d+\.\d+\.\d+:\d+$/.test(deviceId);
+ };
+
export default function App() {
// Device and connection state - now simplified since device is pre-selected
const [selectedDeviceId, setSelectedDeviceId] = useState<string | null>(null);
@@ -26,6 +37,7 @@ export default function App() {
`http://localhost:${SCRCPY_SERVER_PORT}`,
);
const [isNarrowScreen, setIsNarrowScreen] = useState(false);
+ const [usePollingMode, setUsePollingMode] = useState(false);

// Configuration state
const { config } = useEnvConfig();
@@ -125,6 +137,11 @@ export default function App() {
return () => window.removeEventListener('resize', handleResize);
}, []);

+ // Detect if device is remote and switch to polling mode
+ useEffect(() => {
+ setUsePollingMode(isRemoteDevice(selectedDeviceId));
+ }, [selectedDeviceId]);
+
return (
<ConfigProvider theme={globalThemeConfig()}>
{contextHolder}
@@ -157,11 +174,28 @@ export default function App() {
selectedDeviceId={selectedDeviceId}
scrcpyPlayerRef={scrcpyPlayerRef}
/>
- <ScrcpyPlayer
- ref={scrcpyPlayerRef}
- serverUrl={serverUrl}
- autoConnect={connectToDevice}
- />
+ {!usePollingMode ? (
+ <ScrcpyPlayer
+ ref={scrcpyPlayerRef}
+ serverUrl={serverUrl}
+ autoConnect={connectToDevice}
+ />
+ ) : (
+ <ScreenshotViewer
+ getScreenshot={() =>
+ fetch(
+ `http://localhost:${PLAYGROUND_SERVER_PORT}/screenshot`,
+ ).then((r) => r.json())
+ }
+ getInterfaceInfo={() =>
+ fetch(
+ `http://localhost:${PLAYGROUND_SERVER_PORT}/interface-info`,
+ ).then((r) => r.json())
+ }
+ serverOnline={true}
+ isUserOperating={false}
+ />
+ )}
</div>
</Panel>
</PanelGroup>
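
Reviewer note: the `isRemoteDevice` helper in this file decides whether the playground renders `ScrcpyPlayer` or falls back to `ScreenshotViewer`. The following is a minimal standalone sketch of how its regular expression classifies a few device IDs. The sample IDs are illustrative assumptions and do not come from the PR.

```typescript
// Copy of the helper from the App.tsx diff, extracted for standalone experimentation.
const isRemoteDevice = (deviceId: string | null): boolean => {
  if (!deviceId) return false;
  // Remote device format: IP:Port (e.g., 192.168.1.10:5555)
  return /^\d+\.\d+\.\d+\.\d+:\d+$/.test(deviceId);
};

// Hypothetical sample IDs (assumptions for illustration only).
const samples: Array<string | null> = [
  '192.168.1.10:5555', // dotted IPv4 plus port: classified as remote
  'emulator-5554', // serial-style ID: classified as local
  'localhost:5555', // hostname plus port: not dotted digits, so local
  '999.999.999.999:1', // octet ranges are not validated, so still remote
  null, // no device selected: local
];

for (const id of samples) {
  console.log(id, '=>', isRemoteDevice(id));
}
```

As the sketch suggests, the pattern matches only dotted-decimal IPv4 addresses with a port. A device connected through a hostname or an IPv6 address would stay in scrcpy mode, which may or may not be intended.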
20 changes: 11 additions & 9 deletions apps/chrome-extension/package.json
@@ -5,7 +5,8 @@
"type": "module",
"scripts": {
"build": "rsbuild build && npm run pack-extension",
- "dev": "rsbuild dev --open",
+ "dev": "concurrently -k -n build,run \"rsbuild dev\" \"node scripts/wait-for-build.js && web-ext run --config web-ext-config.cjs --source-dir dist --target chromium\"",
+ "dev:simple": "rsbuild dev --open",
"preview": "rsbuild preview",
"pack-extension": "node scripts/pack-extension.js"
},
@@ -18,7 +19,6 @@
"@midscene/shared": "workspace:*",
"@midscene/visualizer": "workspace:*",
"@midscene/web": "workspace:*",
-
"@types/file-saver": "2.0.7",
"antd": "^5.21.6",
"canvas-confetti": "1.9.3",
@@ -32,20 +32,22 @@
"zustand": "4.5.2"
},
"devDependencies": {
- "@rsbuild/core": "^1.3.22",
- "@rsbuild/plugin-less": "^1.2.4",
- "@rsbuild/plugin-node-polyfill": "1.3.0",
- "@rsbuild/plugin-react": "^1.3.1",
- "@rsbuild/plugin-svgr": "^1.1.1",
- "@rsbuild/plugin-type-check": "1.2.3",
+ "@rsbuild/core": "^1.5.17",
+ "@rsbuild/plugin-less": "^1.5.0",
+ "@rsbuild/plugin-node-polyfill": "1.4.2",
+ "@rsbuild/plugin-react": "^1.4.1",
+ "@rsbuild/plugin-svgr": "^1.2.2",
+ "@rsbuild/plugin-type-check": "1.2.4",
"@tailwindcss/postcss": "4.1.11",
"@types/chrome": "0.0.279",
"@types/react": "^18.3.1",
"@types/react-dom": "^18.3.1",
"archiver": "^6.0.0",
+ "concurrently": "^8.2.0",
"less": "^4.2.0",
+ "openai": "6.3.0",
"tailwindcss": "4.1.11",
"typescript": "^5.8.3",
- "openai": "4.81.0"
+ "web-ext": "9.0.0"
}
}
8 changes: 7 additions & 1 deletion apps/chrome-extension/rsbuild.config.ts
@@ -11,7 +11,13 @@ export default defineConfig({
tools: {
rspack: {
watchOptions: {
- ignored: /\.git/,
+ ignored: [
+ '**/.git/**',
+ '**/web-ext-profile/**',
+ '**/extension_output/**',
+ 'dist/**', // Only ignore THIS app's dist folder, not workspace packages
+ '**/node_modules/**',
+ ],
},
},
},
59 changes: 59 additions & 0 deletions apps/chrome-extension/scripts/wait-for-build.js
@@ -0,0 +1,59 @@
#!/usr/bin/env node

import fs from 'node:fs';
import path from 'node:path';
import { fileURLToPath } from 'node:url';

const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

const manifestPath = path.resolve(__dirname, '../dist/manifest.json');
const indexHtmlPath = path.resolve(__dirname, '../dist/index.html');
const indexJsPath = path.resolve(__dirname, '../dist/static/js/index.js');
const popupJsPath = path.resolve(__dirname, '../dist/static/js/popup.js');
const maxWaitTime = 60000; // 60 seconds
const checkInterval = 500; // 500ms
const stabilityWait = 1000; // Wait 1 second after detection to ensure files are stable

let elapsed = 0;

console.log('Waiting for initial build to complete...');

const checkBuildComplete = () => {
const manifestExists = fs.existsSync(manifestPath);
const indexExists = fs.existsSync(indexHtmlPath);
const indexJsExists = fs.existsSync(indexJsPath);
const popupJsExists = fs.existsSync(popupJsPath);

if (manifestExists && indexExists && indexJsExists && popupJsExists) {
// Wait a bit more to ensure all files are written
console.log('Build files detected, waiting for stability...');
setTimeout(() => {
// Double check the files still exist
if (
fs.existsSync(manifestPath) &&
fs.existsSync(indexHtmlPath) &&
fs.existsSync(indexJsPath) &&
fs.existsSync(popupJsPath)
) {
console.log('Build complete! Starting web-ext...');
process.exit(0);
} else {
console.log('Build files disappeared, continuing to wait...');
setTimeout(checkBuildComplete, checkInterval);
}
}, stabilityWait);
return;
}

elapsed += checkInterval;

if (elapsed >= maxWaitTime) {
console.error('Timeout waiting for build to complete');
process.exit(1);
}

setTimeout(checkBuildComplete, checkInterval);
};

checkBuildComplete();
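
Reviewer note: the script above polls with nested `setTimeout` callbacks. For comparison, the same "detect, wait for stability, re-check" logic can be sketched as a small async helper. `waitForFiles`, its option names, and the `demo` function are hypothetical and not part of this PR. Unlike the script, this sketch measures the timeout against wall-clock time instead of counting polling intervals.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: resolves true once every path exists and still exists
// after `stabilityMs`; resolves false if `timeoutMs` elapses first.
async function waitForFiles(
  paths: string[],
  { timeoutMs = 60000, intervalMs = 500, stabilityMs = 1000 } = {},
): Promise<boolean> {
  const allExist = () => paths.every((p) => fs.existsSync(p));
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (allExist()) {
      // Wait a bit more, then double check the files are still there.
      await sleep(stabilityMs);
      if (allExist()) return true;
    }
    await sleep(intervalMs);
  }
  return false;
}

// Demo against a temporary directory (stand-in for the extension's dist folder).
async function demo(): Promise<void> {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'build-wait-'));
  const manifest = path.join(dir, 'manifest.json');
  // Simulate the build writing its output shortly after we start waiting.
  setTimeout(() => fs.writeFileSync(manifest, '{}'), 100);
  const ok = await waitForFiles([manifest], {
    timeoutMs: 5000,
    intervalMs: 50,
    stabilityMs: 100,
  });
  console.log(ok ? 'Build files detected' : 'Timeout waiting for build');
}

void demo();
```

Returning a boolean instead of calling `process.exit` inside the helper keeps it testable. A CLI wrapper could map `false` to exit code 1, as the script in this PR does on timeout.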
68 changes: 59 additions & 9 deletions apps/chrome-extension/src/extension/bridge/index.less
@@ -10,6 +10,58 @@
line-height: 25px;
}

+ .server-config-section {
+ margin-top: 12px;
+ margin-bottom: 20px;
+
+ .server-config-header {
+ display: flex;
+ align-items: center;
+ gap: 6px;
+ cursor: pointer;
+ user-select: none;
+ padding: 6px 0;
+ transition: opacity 0.2s ease;
+
+ &:hover {
+ opacity: 0.7;
+ }
+ }
+
+ .server-config-arrow {
+ font-size: 10px;
+ color: rgba(0, 0, 0, 0.45);
+ transition: transform 0.2s ease;
+ transform: rotate(-90deg);
+
+ &.expanded {
+ transform: rotate(0deg);
+ }
+ }
+
+ .server-config-title {
+ font-size: 13px;
+ color: rgba(0, 0, 0, 0.65);
+ }
+
+ .server-config-content {
+ margin-top: 12px;
+ padding-left: 16px;
+ }
+
+ .server-config-input {
+ width: 100%;
+ }
+
+ .server-config-hint {
+ display: block;
+ margin-top: 6px;
+ font-size: 12px;
+ color: rgba(0, 0, 0, 0.45);
+ line-height: 1.5;
+ }
+ }
+
.middle-dialog-area {
height: calc(100vh - 100px);
overflow: hidden;
@@ -130,12 +182,6 @@
font-style: italic;
}

- .bridge-log-container {
- }
-
- .bridge-log-item {
- }
-
.bottom-button-container {
position: absolute;
bottom: 16px;
@@ -251,15 +297,20 @@

.bottom-status-btn {
display: flex;
flex-direction: column;
flex-direction: row;
align-items: center;
justify-content: center;
position: relative;
margin-left: 12px;

.stop-button {
border: none;
box-shadow: none;
background: transparent;
display: flex;
align-items: center;
justify-content: center;
gap: 8px;
padding: 4px 8px;

&::before {
content: '';
@@ -268,7 +319,6 @@
height: 10px;
background-color: #000;
border-radius: 2px;
- vertical-align: middle;
}

&:hover,