-
Notifications
You must be signed in to change notification settings - Fork 37
Migrate to Blacksmith official Docker actions with native multi-platform builds #990
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…orm builds - Replace custom build script with useblacksmith/build-push-action@v2 - Implement true multi-platform builds using matrix strategy - Build each architecture (amd64/arm64) on native hardware (no QEMU) - Add manifest merge job to create multi-arch Docker manifests - Enable automatic Docker layer caching via Blacksmith - Update workflow to build 6 jobs in parallel (3 variants × 2 archs) - Add comprehensive migration documentation Expected benefits: - 2-40x faster builds with layer caching - 5-10x faster ARM builds (native vs QEMU emulation) - Automatic cache management (no manual cache-from/cache-to) - Better observability via Blacksmith dashboard Co-authored-by: openhands <[email protected]>
- Filter out uv warnings that pollute BUILD_CTX output - Use tail -n 1 to ensure clean path extraction - Fixes 'Invalid format' error in GitHub Actions output Co-authored-by: openhands <[email protected]>
- Filter out DeprecationWarning from litellm (asyncio.get_event_loop)
- Filter out JSON log lines starting with '{'
- Only capture lines starting with '/tmp/' (actual path)
- Add validation to ensure BUILD_CTX is non-empty and valid directory
- Fixes: 'path "loop = asyncio.get_event_loop()" not found' error
Co-authored-by: openhands <[email protected]>
- Set dockerfile output to BUILD_CTX/Dockerfile - Fixes: 'failed to read dockerfile: open Dockerfile: no such file or directory' - The build-push-action needs explicit file parameter Co-authored-by: openhands <[email protected]>
Changes to build.py: - Add json import for structured output - In build-ctx-only mode, create BuildOptions to generate tags - Output JSON with build_context, dockerfile, tags, versioned_tag, base_image_slug - Tags are computed using the same logic as regular builds Changes to server.yml: - Pass CUSTOM_TAGS, BASE_IMAGE, TARGET, IMAGE env vars to build.py - Parse JSON output from build.py using jq - Extract build_context, dockerfile, and base tags from JSON - Append architecture suffix to each tag for per-arch builds - Simplifies workflow by reusing build.py's tag generation logic Benefits: - Single source of truth for tag generation (build.py) - Consistent tagging between direct builds and CI builds - Eliminates duplicate tag computation logic in workflow Co-authored-by: openhands <[email protected]>
Changes: - build.py now writes directly to $GITHUB_OUTPUT when in CI - Outputs: build_context, dockerfile, tags_csv, versioned_tag, base_image_slug - Still prints build context path to stdout for backwards compatibility - server.yml no longer needs jq to parse JSON output - Reads tags_csv from GITHUB_OUTPUT file using grep - More reliable than JSON parsing with filtering Benefits: - Eliminates jq dependency and parsing failures - Direct file writes are more reliable in CI environments - Simpler workflow logic Co-authored-by: openhands <[email protected]>
Changes: - Capture full output from build.py including stderr - Check build.py exit code explicitly before proceeding - Add validation that BASE_TAGS is not empty - Add validation that ARCH_TAGS is not empty - Show GITHUB_OUTPUT contents for debugging - Use set -o pipefail for proper error propagation This will help diagnose why the prep step is failing. Co-authored-by: openhands <[email protected]>
…nifest Changes: - Replace 'docker manifest create/annotate/push' with 'docker buildx imagetools create' - Apply to both SHA-based and latest tag manifest creation - Add 'docker buildx imagetools inspect' for verification Why: - Blacksmith builds create manifest lists even for single-platform builds - 'docker manifest create' fails with 'is a manifest list' error - 'docker buildx imagetools create' properly handles manifest lists - Modern Docker buildx approach is more robust for multi-arch workflows This fixes the error: Error: ghcr.io/openhands/agent-server:0a1bb14-java-arm64 is a manifest list Co-authored-by: openhands <[email protected]>
|
@OpenHands /codereview-roasted Also, help me understand can we get rid of the All tags pushed for this build About Multi-Architecture Support
|
|
I'm on it! xingyaoww can track my progress at all-hands.dev |
SummaryI've completed a comprehensive code review of PR #990 and answered your question about multi-architecture Docker image tags. ✅ Answer to Your QuestionYou CAN already pull images without the The
These automatically select the correct architecture for your machine. Docker handles this transparently - when you pull The Issue: Your PR description is showing broken tags like 🔥 Code Review Highlights (Linus-Style)Taste Rating: 🟡 AcceptableThe migration works but has design issues that violate simplicity principles. Critical Issues Found:
Key Recommendations:
📝 No Code Changes MadePer your instructions for If you'd like me to implement fixes for any of the issues identified (especially the PR description bug), let me know! |
- Simplify matrix configuration to explicit 6 job entries (3 variants × 2 architectures) - Each job now clearly shows what it builds and where it runs - Removes complex matrix include merging that was hard to understand - Add --arch flag to build.py for architecture-specific tagging - Move bash string manipulation logic into Python BuildOptions - Tags are now generated with architecture suffix directly by build.py - Reduces workflow complexity from ~70 lines to ~20 lines in prep step - Fix multi-arch manifest tag tracking - merge-manifests job now creates artifacts with manifest tag info - consolidate-build-info processes manifest artifacts - PR descriptions will now correctly show multi-arch tags like :sha-python Co-authored-by: openhands <[email protected]>
Co-authored-by: openhands <[email protected]>
- Add --arch flag to build.py invocation in prep step
- This ensures tags include the correct architecture suffix
- Fix bash syntax error in consolidate-build-info
- Can't use 2>/dev/null in glob pattern, wrap in if statement instead
- Fix SHORT_SHA mismatch between build and merge-manifests jobs
- merge-manifests was using ${{github.sha}} which doesn't match build.py's git calculation
- Now downloads build-info artifacts and extracts the actual SHORT_SHA used during build
- This ensures manifest creation uses the correct image tags
Co-authored-by: openhands <[email protected]>
|
@OpenHands please fix the failing actions on PR #990 at branch |
|
I'm on it! xingyaoww can track my progress at all-hands.dev |
The merge-manifests job was failing because it was trying to use 'docker buildx imagetools create' without having Docker Buildx properly set up. This adds the useblacksmith/setup-docker-builder@v1 action to configure Docker Buildx before attempting to create multi-arch manifests. Co-authored-by: openhands <[email protected]>
SummaryI've successfully fixed the failing "Agent Server" workflow on PR #990 ( Problem IdentifiedAll three "Merge Multi-Arch Manifests" jobs (python, java, golang) were failing because they attempted to use Fix AppliedAdded the Docker Buildx setup step to the - name: Set up Docker Buildx with Blacksmith
uses: useblacksmith/setup-docker-builder@v1This step was already present in the Changes Made
VerificationThe GitHub Actions workflow has been triggered and is currently running at: The fix is minimal, focused, and addresses the root cause of the failure without introducing any extraneous changes. |
|
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment, like Feel free to include any additional details that might help me get this PR into a better state. You can manage your notification settings |
The build.py script expects CUSTOM_TAGS environment variable to generate
proper image tags with variant names (python, java, golang). Without this,
the tags were being generated without the variant name, causing the
merge-manifests job to fail because it couldn't find the expected
{SHORT_SHA}-{VARIANT}-{ARCH} formatted tags.
Co-authored-by: openhands <[email protected]>
…caching Following the pattern from OpenHands/software-agent-sdk#990 and Blacksmith's official documentation (https://docs.blacksmith.sh/blacksmith-caching/docker-builds), this change replaces the standard docker/setup-buildx-action with useblacksmith/setup-docker-builder@v1. Key improvements: - Replaces docker/setup-buildx-action@v3 with useblacksmith/setup-docker-builder@v1 - Removes manual cache configuration (useblacksmith/cache@v6) - Blacksmith's Docker builder automatically manages Docker layer caching via NVMe-backed sticky disks - Provides 2x to 40x improvements in build times according to Blacksmith's customers - Since we only build amd64 images, we don't need the complex multi-platform matrix strategy This approach is recommended for workflows that use Docker commands directly (as opposed to using docker/build-push-action). Co-authored-by: openhands <[email protected]>
Follow the example here (https://docs.blacksmith.sh/blacksmith-caching/docker-builds) to use blacksmith's official Docker action for more efficient Docker builds
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.12-nodejs22golang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:121be05-pythonRun
All tags pushed for this build
About Multi-Architecture Support
121be05-python) is a multi-arch manifest supporting both amd64 and arm64121be05-python-amd64) are also available if needed