Description
Checks
- I've already read https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/troubleshooting-actions-runner-controller-errors and I'm sure my issue is not covered in the troubleshooting guide.
- I am using charts that are officially provided
Controller Version
0.12.1
Deployment Method
Helm
Checks
- This isn't a question or user support case (For Q&A and community support, go to Discussions).
- I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes
To Reproduce
1. Deploy ARC and a runner scale set (RSS) to an IPv6-enabled cluster, with the runner in 'kubernetes' mode.
2. Create a simple dispatch workflow that `runs-on: myrunner`
3. Dispatch.
4. Runner crashes during 'Initialize Containers'.
Describe the bug
When trying to migrate our runners to an IPv6 EKS cluster, we find that the runners consistently crash in the 'Initialize Containers' step.
Error [ERR_TLS_CERT_ALTNAME_INVALID]: Hostname/IP does not match certificate's altnames: Host: fd26. is not in the cert's altnames: DNS:c699ee59bb9e133834eae210f228abc6.yl4.eks-cluster.us-east-1.api.aws, DNS:ip-172-16-172-216.ec2.internal, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, IP Address:FD26:11D8:2382:0:0:0:0:1, IP Address:2600:1F18:427C:8111:0:0:0:F22, IP Address:172.16.172.216
This is a suspicious message, especially `Host: fd26. is not`. `fd26` is the first part of my cluster's "Service IPv6 Range": `fd26:11d8:2382::/108`. In fact, `KUBERNETES_SERVICE_HOST=fd26:11d8:2382::1`. And the DNS result for `kubernetes.default` is:
$ getent hosts kubernetes.default
fd26:11d8:2382::1 kubernetes.default.svc.cluster.local
This `Host: fd26.` is the outcome I would expect from improper handling of host addresses: splitting an assumed address+port string on `:`, then taking the first part as the address.
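To illustrate the failure mode I suspect, here is a minimal sketch. The helper names `naiveHost` and `splitHostPort` are hypothetical and are not taken from the runner or the container hooks; I have not located the actual code path, so this only shows how such a split would produce `fd26`.

```typescript
// Hypothetical helper: assumes the input is always "host:port" and
// takes everything before the first colon as the host.
function naiveHost(hostPort: string): string {
  const [first = ""] = hostPort.split(":");
  return first;
}

// Hypothetical helper: a bracket-aware split that tolerates IPv6 literals.
function splitHostPort(hostPort: string): { host: string; port: string } {
  if (hostPort.startsWith("[")) {
    // Bracketed IPv6 literal, optionally followed by ":port".
    const end = hostPort.indexOf("]");
    if (end === -1) {
      throw new Error(`missing ']' in ${hostPort}`);
    }
    const rest = hostPort.slice(end + 1);
    return {
      host: hostPort.slice(1, end),
      port: rest.startsWith(":") ? rest.slice(1) : "",
    };
  }
  const first = hostPort.indexOf(":");
  const last = hostPort.lastIndexOf(":");
  if (first === -1 || first !== last) {
    // No colon, or several colons without brackets: treat the whole
    // string as a host (a bare IPv6 literal cannot carry a port).
    return { host: hostPort, port: "" };
  }
  return { host: hostPort.slice(0, last), port: hostPort.slice(last + 1) };
}

// The naive split truncates the service address to its first group.
console.log(naiveHost("fd26:11d8:2382::1:443"));
// The bracket-aware split keeps the full address.
console.log(splitHostPort("[fd26:11d8:2382::1]:443"));
```

With the naive version, the cluster's service address collapses to `fd26`, which matches the host named in the certificate error.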
Describe the expected behavior
I expect that the runner should work on IPv6 without modification or override.
When migrating workloads to IPv6, I often encounter improper address handling in every language, from Ruby and JavaScript to Python and Go. Implementers often assume they can string-build a URI from component strings, which falls apart at the edge cases. All of these migrations were solved either by correcting the implementation to use proper URI-handling standard library functions, or by formatting the address using the bracketed notation, `[feeb:beef::1]:8080`, which overcomes some issues in third-party libraries and tools.
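As a sketch of the kind of fix that has worked for me elsewhere, the following brackets an IPv6 literal before handing it to the standard `URL` parser. The helpers `bracketIfIPv6` and `apiServerUrl` are hypothetical names for illustration, and port `8443` is an arbitrary example value; this is not the runner's code.

```typescript
// Hypothetical helper: wrap an IPv6 literal in brackets so that a
// ":port" suffix can follow it unambiguously. Hostnames and IPv4
// addresses contain no colon and pass through unchanged.
function bracketIfIPv6(host: string): string {
  const bare = host.replace(/^\[|\]$/g, ""); // tolerate already-bracketed input
  return bare.includes(":") ? `[${bare}]` : bare;
}

// Hypothetical helper: build an API server URL from a host and port,
// for example the values of KUBERNETES_SERVICE_HOST and
// KUBERNETES_SERVICE_PORT inside a pod.
function apiServerUrl(host: string, port: string): URL {
  return new URL(`https://${bracketIfIPv6(host)}:${port}`);
}

const url = apiServerUrl("fd26:11d8:2382::1", "8443");
console.log(url.hostname); // "[fd26:11d8:2382::1]"
console.log(url.port); // "8443"
```

Letting the URL parser own the host/port boundary means downstream code never has to split on `:` itself.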
I attempted to find the root cause of the issue in this repository, @actions/runner-container-hooks and @actions/runner but was unable to pinpoint it.
This effectively blocks me (and apparently everyone) from deploying ARC to IPv6-only clusters.
Controller Logs
https://gist.github.com/carl-reverb/5666aa77be92f57e16320d89c0f2c2db
Runner Pod Logs
https://gist.github.com/carl-reverb/40ac08236942c9bb2aa8330e78fb5c7f