@aprowe aprowe commented Oct 23, 2025

TD_RESOLUTION was changing the size of screenshots incorrectly. Now uses the actual size of the image.
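
For reference, a minimal sketch of what "uses the actual size of the image" can mean in practice: read the width and height from the screenshot's own PNG header instead of trusting the configured `TD_RESOLUTION` value. This is an illustration in Python, not the code in this PR. It assumes the screenshot is a PNG, and `png_size`, `parse_resolution`, and `matches_configured` are hypothetical helper names.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> Tuple[int, int]:
    """Return (width, height) read from a PNG's IHDR chunk."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type,
    # then width and height as big-endian unsigned 32-bit integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def parse_resolution(value: str) -> Tuple[int, int]:
    """Parse a 'WIDTHxHEIGHT' string such as '1920x1080'."""
    width, height = value.lower().split("x")
    return int(width), int(height)


def matches_configured(data: bytes, configured: str) -> bool:
    """Report whether the image's real size equals the configured one."""
    return png_size(data) == parse_resolution(configured)
```

When the two sizes disagree, the image's own dimensions are the safer basis for any coordinate math, since they describe the pixels that were actually captured.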

@aprowe aprowe force-pushed the alexrowe/resolution-fix branch from 2ee78db to 76e88e6 Compare October 23, 2025 21:47
chottuthejimmy commented Oct 28, 2025

@aprowe I tried this out by checking out the branch and running it. I had set TD_RESOLUTION to 1920x1080 in the .env, but the AI's screenshot seems to be just gray.
image

btw, the current version works fine for the same prompt

Here's the full output before it gave up:

> click on the google logo

    It appears that the screenshot is completely gray and does
    not display any visible UI elements, windows, or the
    Google logo. Therefore, I cannot identify or interact with
    the Google logo based on the current desktop state.
    
    Reasoning:
    There is nothing visible on the desktop—no browser, no
    Google logo, no text, or images. The screen is just a
    solid gray color.
    
    Here is the YAML reflecting that no action can be taken:
    
      commands:
        - command: wait
          timeout: 2000
    
    Next step:
    Please open the browser to the Google homepage or provide
    a screenshot where the Google logo is visible, so I can
    proceed to click on it.

command='wait' timeout=2000
    Confidence the task is complete: 0%
    
    Reasoning:
    The attached screenshots show a completely gray screen
    with no visible UI elements, browser windows, or the
    Google logo. No actions have been taken to open a browser
    or navigate to the Google homepage. Therefore, the task to
    "click on the google logo" has not been started or
    completed.
    
    Next step:
    Since there is no active window and nothing is visible,
    the first step is to open the browser (e.g., Google
    Chrome).
    
      commands:
        - command: focus-application
          name: "Google Chrome"
    
    Once the browser is open, I will proceed to navigate to
    the Google homepage and look for the Google logo.

command='focus-application' name='Google Chrome'
    Confidence the task is complete: 0%
    
    Reasoning:
    The attached screenshots show a completely gray screen
    with no visible UI elements, browser windows, or the
    Google logo. There is no indication that the Google logo
    is present or has been clicked. No progress has been made
    toward the task.
    
    Next step:
    Since nothing is visible and there is no active window,
    the next step is to attempt to open the browser (Google
    Chrome).
    
      commands:
        - command: focus-application
          name: "Google Chrome"

command='focus-application' name='Google Chrome'
>     Confidence the task is complete: 0%
    
    Reasoning:
    The attached screenshots still show a completely gray
    screen with no visible UI elements, browser windows, or
    the Google logo. No actions have resulted in the Google
    logo being visible or clickable. The task to "click on the
    google logo" has not been started or completed.
    
    Next step:
    Since there is still no visible browser or Google logo, I
    will attempt to open the application launcher to launch a
    browser.
    
      commands:
        - command: press-keys
          keys:
            - "command"
            - "space"

command='press-keys' keys=command,space
>     Confidence the task is complete: 0%
    
    Reasoning:
    The attached screenshots still show a completely gray
    screen with no visible UI elements, browser windows, or
    the Google logo. There is no indication that any progress
    has been made toward clicking the Google logo. The desktop
    remains unchanged, and the Google logo is not present.
    
    Next step:
    Since there is still no visible browser or Google logo, I
    will attempt to open the application launcher again to try
    to launch a browser.
    
      commands:
        - command: press-keys
          keys:
            - "command"
            - "space"

command='press-keys' keys=command,space
    Confidence the task is complete: 0%
    
    Reasoning:
    The attached screenshots still show a completely gray
    screen with no visible UI elements, browser windows, or
    the Google logo. There is no indication that the Google
    logo is present or has been clicked. No progress has been
    made toward the task.
    
    Next step:
    Since nothing is visible and there is no active window, I
    will attempt to open the application launcher to launch a
    browser.
    
      commands:
        - command: press-keys
          keys:
            - "command"
            - "space"

command='press-keys' keys=command,space
    Confidence the task is complete: 0%
    
    Reasoning:
    The attached screenshots still show a completely gray
    screen with no visible UI elements, browser windows, or
    the Google logo. There is no indication that the Google
    logo is present or has been clicked. No progress has been
    made toward the task.
    
    Next step:
    Since there is still nothing visible and no active window,
    I will attempt to open the application launcher again to
    try to launch a browser.
    
      commands:
        - command: press-keys
          keys:
            - "command"
            - "space"

command='press-keys' keys=command,space
    The test never actually opened or focused a visible
    browser window containing the Google homepage, so the
    Google logo was never present to click. Every
    attempt—focusing “Google Chrome” and repeatedly triggering
    the application launcher via “command + space”—failed to
    bring up any UI; the screen remained a uniform gray
    without any recognizable elements. Because no progress
    could be made toward displaying the Google logo, the loop
    detector eventually halted the sequence and marked the
    test as failed.
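
The run above ends when the loop detector halts the repeated `press-keys` commands. As a rough illustration of that kind of check (an assumption about the general technique, not the tool's actual implementation), the sketch below flags a loop once the same command string has been issued a fixed number of times in a row. `LoopDetector` and `max_repeats` are hypothetical names.

```python
from collections import deque


class LoopDetector:
    """Flag a loop when the last `max_repeats` commands are identical."""

    def __init__(self, max_repeats: int = 3) -> None:
        self.max_repeats = max_repeats
        # Keep only the most recent commands; older ones fall off the left.
        self.recent = deque(maxlen=max_repeats)

    def record(self, command: str) -> bool:
        """Store a command and return True if a loop is detected."""
        self.recent.append(command)
        window_full = len(self.recent) == self.max_repeats
        return window_full and len(set(self.recent)) == 1


detector = LoopDetector(max_repeats=3)
for command in ["wait", "focus-application", "press-keys", "press-keys", "press-keys"]:
    if detector.record(command):
        print(f"loop detected on repeated command: {command}")
```

With a gray screenshot the model never sees a change, so it keeps proposing the same command and a check like this is what eventually stops the run.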
