Sjors
Member

@Sjors Sjors commented Jul 22, 2025

The simulator build jobs and the simulator test jobs are each split into one group per device brand. This means that when one simulator fails to build, only the jobs for that simulator are cancelled, not the others. Previously this was done in #743 using always(), but that's a bit overkill.
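To make the intended behaviour concrete, here is a toy model in Python. It is not the workflow itself (that lives in the CI config), and the job names are made up for illustration; only the brand names come from this PR. It assumes a failure cancels the remaining jobs in the same brand group and nothing else.

```python
def cancelled_jobs(jobs, failed_job):
    """Return the jobs cancelled when `failed_job` fails.

    `jobs` maps a job name to its brand group. Only jobs in the same
    group as the failed job are cancelled; other groups keep running.
    """
    brand = jobs[failed_job]
    return sorted(
        name for name, group in jobs.items()
        if group == brand and name != failed_job
    )


# Hypothetical job names, grouped by brand.
JOBS = {
    "build-trezor-1": "trezor",
    "build-trezor-t": "trezor",
    "test-trezor-1": "trezor",
    "build-coldcard": "coldcard",
    "test-coldcard": "coldcard",
    "build-ledger": "ledger",
}

# A Coldcard build failure only takes down the Coldcard test job.
print(cancelled_jobs(JOBS, "build-coldcard"))  # ['test-coldcard']
```

With always() the dependent jobs would run regardless of what failed upstream, which is the part that seemed too broad.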

Screenshot 2025-08-29 at 19:09:56

Issues left for a followup:

@Sjors Sjors closed this Jul 22, 2025
@Sjors Sjors reopened this Jul 22, 2025
@Sjors
Member Author

Sjors commented Jul 22, 2025

It seems that CI doesn't re-run if I modify a commit and force push.

@Sjors
Member Author

Sjors commented Jul 22, 2025

What seems to be happening is that when a force push occurs, not all previous runners are cancelled. Eventually they finish or fail, and only then do the jobs for the new push start. The checks don't show up in the PR itself until that happens. In other words, just be patient...

You can see what it's actually working on here: https://github.com/bitcoin-core/HWI/actions/workflows/ci.yml

@Sjors Sjors force-pushed the 2025/07/fix-ci branch 2 times, most recently from 18a3a5b to 10fcca8 Compare July 22, 2025 10:46
@Sjors
Member Author

Sjors commented Jul 22, 2025

I suspect #743 is the culprit, so I'm reverting that for now.

Also included the Jade qemu changes from #779.

@Sjors
Member Author

Sjors commented Jul 22, 2025

Another problem is that we're building the bitcoind binaries with ubuntu-latest, which is 24.04. This produces binaries that require at least glibc 2.38. We then download those on Debian bookworm (the default for https://hub.docker.com/_/python), which comes with glibc 2.36. So the binary fails to run: https://github.com/bitcoin-core/HWI/actions/runs/16440824447/job/46463441345

Possible solutions:

  1. do a guix build instead, which uses glibc 2.31
  2. build using the Ubuntu 22 image, which I believe uses glibc 2.35
  3. use a modern version of Ubuntu with PyEnv to control the Python version

I'll try (2) though (3) is probably a more robust solution. (1) sounds like a massive pain.
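A quick sanity check of the options, using only the glibc versions quoted above. This is a simplified sketch: the helper names are made up, and it assumes a binary needs at least the glibc it was built against, which is conservative (in practice it only needs the newest symbol version it actually links to).

```python
def parse_version(text):
    """Turn a dotted version such as "2.38" into a comparable tuple (2, 38)."""
    return tuple(int(part) for part in text.split("."))


def can_run(built_against, runtime):
    """True if a binary needing `built_against` can load on `runtime` glibc."""
    return parse_version(built_against) <= parse_version(runtime)


RUNTIME_GLIBC = "2.36"  # Debian bookworm, as quoted above

# Required glibc per builder, as quoted above.
BUILDERS = {
    "ubuntu-latest (24.04)": "2.38",
    "guix": "2.31",
    "ubuntu 22": "2.35",
}

for name, glibc in BUILDERS.items():
    verdict = "ok" if can_run(glibc, RUNTIME_GLIBC) else "fails to load"
    print(f"{name}: needs glibc >= {glibc} -> {verdict}")
```

The versions are compared as integer tuples rather than strings, because a plain string comparison would order "2.4" after "2.36".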

@Sjors Sjors mentioned this pull request Jul 22, 2025
@Sjors
Member Author

Sjors commented Jul 22, 2025

Closing this and continuing in #796 because https://github.com/bitcoin-core/HWI/actions/runs/16440824447 has been stuck for hours.

@Sjors Sjors closed this Jul 22, 2025
@Sjors
Member Author

Sjors commented Jul 22, 2025

Never mind, opening a fresh PR doesn't help. It looks like the available workers are just tied up on that old job.

Also, although the timeout of each individual test is set to 45 minutes, only a handful run in parallel, so it really does take forever if all of them are stuck for that time. And some (?) jobs just ignore that timeout setting and stall for hours, e.g. https://github.com/bitcoin-core/HWI/actions/runs/16440824447/job/46463441359.
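Back-of-the-envelope version of the above. The 45 minute timeout is the real setting; the number of stuck jobs and parallel slots below are made-up numbers for illustration, and the model assumes every stuck job runs into the timeout and that slots are filled in batches.

```python
import math


def worst_case_minutes(stuck_jobs, parallel_slots, timeout_minutes=45):
    """Wall-clock time if every job hangs until its timeout."""
    batches = math.ceil(stuck_jobs / parallel_slots)
    return batches * timeout_minutes


# Hypothetical: 40 stuck jobs, 5 running at a time.
print(worst_case_minutes(stuck_jobs=40, parallel_slots=5))  # 360, i.e. 6 hours
```

Jobs that ignore the timeout altogether are not covered by this estimate.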

@Sjors Sjors reopened this Jul 22, 2025
@Sjors Sjors mentioned this pull request Jul 22, 2025
@Sjors
Member Author

Sjors commented Jul 23, 2025

I was able to "fix" both the Ledger Speculos and Keepkey builders by downgrading to Ubuntu 22.04. Since I don't want to needlessly downgrade the other builders, it now makes sense to pull #797 in here.

@Sjors
Member Author

Sjors commented Jul 23, 2025

I'm not 100% sure if the Coldcard multisig patch update in e782954 is necessary, because the previous CI failure may have been due to me messing up whitespace. Still it seems a good idea to refresh it given how much the underlying file changed in recent years.

Longer term though we should modify our test rather than sabotage the firmware.

@Sjors
Member Author

Sjors commented Jul 23, 2025

Ok, the Coldcard patches apply again, but the build fails. Trying the downgrade approach...

Update: that worked... 😕

Though the test jobs themselves fail:

  File "/github/home/.cache/pypoetry/virtualenvs/hwi-crEDFiR--py3.9/lib/python3.9/site-packages/sdl2/dll.py", line 362, in <module>
    dll = DLL("SDL2", ["SDL2", "SDL2-2.0", "SDL2-2.0.0"], os.getenv("PYSDL2_DLL_PATH"))
  File "/github/home/.cache/pypoetry/virtualenvs/hwi-crEDFiR--py3.9/lib/python3.9/site-packages/sdl2/dll.py", line 253, in __init__
    raise RuntimeError("could not find any library for %s (%s)" %
RuntimeError: could not find any library for SDL2 (PYSDL2_DLL_PATH: unset)

e.g. https://github.com/bitcoin-core/HWI/actions/runs/16466809973/job/46548656297?pr=795

So I'll try downgrading those too. Although some jobs did pass.
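For debugging this on a runner, something like the following could tell whether SDL2 is visible at all before the tests start. It is only a sketch: `locate_sdl2` is a made-up helper, the candidate names are copied from the traceback above, and `ctypes.util.find_library` does not necessarily search the same way pysdl2 does.

```python
import ctypes.util
import os

# Library names pysdl2 tries, taken from the traceback above.
CANDIDATES = ["SDL2", "SDL2-2.0", "SDL2-2.0.0"]


def locate_sdl2(env=None):
    """Return a hint about where SDL2 might be loaded from, or None."""
    env = os.environ if env is None else env
    override = env.get("PYSDL2_DLL_PATH")
    if override:
        # pysdl2 is pointed at an explicit location; report that.
        return override
    for name in CANDIDATES:
        found = ctypes.util.find_library(name)
        if found is not None:
            return found
    return None


found = locate_sdl2()
if found is None:
    print("SDL2 not found; install the system package or set PYSDL2_DLL_PATH")
else:
    print(f"SDL2 candidate: {found}")
```

If this prints "not found" on the newer image but not on the older one, the missing piece is the system SDL2 package rather than anything in the Python environment.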

The ColdCard test also kept lingering for 45 minutes until it hit the timeout: https://github.com/bitcoin-core/HWI/actions/runs/16466809973/job/46548656344

But thanks to 4a375d4 and 99ddf39, once that timeout was hit, it shut down its sibling jobs. Not all of them though, so we should probably check whether the simulator is running. I'll do that in the next push for all simulators.
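Roughly what such a check could look like, as a sketch rather than the actual change. It assumes the simulator is started as a subprocess; `ensure_started` and the stand-in "simulator" command are made up for illustration.

```python
import subprocess
import sys
import time


def ensure_started(proc, grace_seconds=1.0, poll_interval=0.1):
    """Raise quickly if the simulator process exits during start-up."""
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        code = proc.poll()  # None while the process is still running
        if code is not None:
            raise RuntimeError(f"simulator exited early with code {code}")
        time.sleep(poll_interval)


# Stand-in for a simulator that crashes on launch.
crashing = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"])
try:
    ensure_started(crashing)
except RuntimeError as err:
    print(err)
finally:
    crashing.wait()
```

Failing here turns a 45 minute hang into an immediate error, which should also make the sibling-job cancellation kick in much sooner.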


Trezor T jobs are fine, but all Trezor 1 jobs fail with:

./trezor.elf: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by ./trezor.elf)

https://github.com/bitcoin-core/HWI/actions/runs/16466809973/job/46548656415

So I'll downgrade those too.
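To spot this class of failure in the logs without reading every job, the required version can be pulled out of the loader message. A small sketch; `required_glibc` is a made-up helper and the pattern is based only on the single error line above.

```python
import platform
import re

# Matches the dynamic loader message quoted above.
GLIBC_ERROR = re.compile(r"version `GLIBC_(\d+(?:\.\d+)+)' not found")


def required_glibc(message):
    """Extract the missing glibc version as a tuple, or None if absent."""
    match = GLIBC_ERROR.search(message)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


line = (
    "./trezor.elf: /lib/x86_64-linux-gnu/libc.so.6: "
    "version `GLIBC_2.38' not found (required by ./trezor.elf)"
)
print(required_glibc(line))  # (2, 38)

# What the current machine provides, e.g. to log on the test runner.
print(platform.libc_ver())
```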

The Ledger runners seem stuck. Pushing again with the above change that aborts the test in that scenario. Hopefully that reveals an actual error message, because I already tried downgrading the sim builder.

@Sjors
Member Author

Sjors commented Jul 23, 2025

Ledger was missing flask-cors.

@Sjors Sjors force-pushed the 2025/07/fix-ci branch 2 times, most recently from abac277 to 69495bf Compare July 23, 2025 12:33
@Sjors
Member Author

Sjors commented Jul 23, 2025

Oh Speculos now also needs ledgered, whatever that is... LedgerHQ/speculos#593

Sjors and others added 18 commits August 30, 2025 08:34
Manually re-applied the patch after the original code seems to have
moved around a bit.
```
ERROR: coldcard: test_signtx (test_device.TestSignTx.test_signtx) (addrtypes=['legacy'], multisig_types=['legacy'], external=True, op_return=False)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/HWI/HWI/test/test_device.py", line 588, in test_signtx
    self._test_signtx(addrtypes, multisig_types, external, op_return)
  File "/__w/HWI/HWI/test/test_device.py", line 576, in _test_signtx
    self._generate_and_finalize(True, psbt)
  File "/__w/HWI/HWI/test/test_device.py", line 403, in _generate_and_finalize
    self.assertTrue(first_sign_res["signed"])
                    ~~~~~~~~~~~~~~^^^^^^^^^^
KeyError: 'signed'
```
This reverts commit edab2af.

The always() option is too powerful. The next commit implements
an alternative solution to the original issue.
This ensures the failure to build a simulator for one device
doesn't abort running jobs for the others. They're still grouped
by manufacturer.

Alternative to bitcoin-core#743.
Build failure on ubuntu-latest:

```
../py/stackctrl.c: In function ‘mp_stack_ctrl_init’:
../py/stackctrl.c:32:32: error: storing the address of local variable ‘stack_dummy’ in ‘mp_state_ctx.thread.stack_top’ [-Werror=dangling-pointer=]
   32 |     MP_STATE_THREAD(stack_top) = (char *)&stack_dummy;
../py/stackctrl.c:31:18: note: ‘stack_dummy’ declared here
   31 |     volatile int stack_dummy;
      |                  ^~~~~~~~~~~
In file included from ../py/runtime.h:29,
                 from ../py/stackctrl.c:27:
../py/mpstate.h:282:23: note: ‘mp_state_ctx’ declared here
  282 | extern mp_state_ctx_t mp_state_ctx;
      |                       ^~~~~~~~~~~~
cc1: all warnings being treated as errors
make: *** [../py/mkrules.mk:77: build/py/stackctrl.o] Error 1
```

Test failure (after downgrading build sim):

```
 File "/github/home/.cache/pypoetry/virtualenvs/hwi-crEDFiR--py3.9/lib/python3.9/site-packages/sdl2/dll.py", line 362, in <module>
    dll = DLL("SDL2", ["SDL2", "SDL2-2.0", "SDL2-2.0.0"], os.getenv("PYSDL2_DLL_PATH"))
  File "/github/home/.cache/pypoetry/virtualenvs/hwi-crEDFiR--py3.9/lib/python3.9/site-packages/sdl2/dll.py", line 253, in __init__
    raise RuntimeError("could not find any library for %s (%s)" %
RuntimeError: could not find any library for SDL2 (PYSDL2_DLL_PATH: unset)
```

https://github.com/bitcoin-core/HWI/actions/runs/16466809973/job/46548656293?pr=795
The build on ubuntu-latest succeeds, but the resulting binary requires
a version of glibc too recent for the test runners to handle.

This only seems to impact Trezor 1, but just downgrade for Trezor T
as well.
NanoS support has been dropped: LedgerHQ/app-bitcoin-new#262

NanoX also makes it possible to test MuSig2 in the future.

Keep NanoS for legacy.
This adds support for BitBox02 Nova. It has the same API as BitBox02.
Sjors added a commit to Sjors/HWI that referenced this pull request Aug 30, 2025
@benma
Contributor

benma commented Sep 2, 2025

I also need to add libcmocka0, any other new deps?

@Sjors bb02 master is now fixed and its simulator does not depend on libcmocka0 anymore.

Sjors and others added 3 commits September 2, 2025 11:37
Also fixes indentation.
Co-authored-by: Claude
Co-authored-by: GPT-5 (Preview)
@Sjors
Member Author

Sjors commented Sep 2, 2025

@benma thanks. I dropped d42e9b5 (and the apt-packages option in 570bfdf that was only used for this).

@achow101
Member

achow101 commented Sep 3, 2025

ACK 70e14aa

Seems fine I guess

@achow101 achow101 merged commit 4e342db into bitcoin-core:master Sep 3, 2025
233 checks passed