-
Couldn't load subscription status.
- Fork 2.8k
feat(txt-registry): add apex record delegation check #5864
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: u-kai <[email protected]>
Signed-off-by: u-kai <[email protected]>
|
Hi @u-kai. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
After opening this PR, I realized I still need to think through a few more things. |
…and documentation Signed-off-by: u-kai <[email protected]>
…prehensive tests Signed-off-by: u-kai <[email protected]>
…cords Signed-off-by: u-kai <[email protected]>
|
Earlier I mentioned that I still needed to think through a few more things. |
|
To be honest, I'm not sure what this change is intend to improve. Do you have any kubernetes manifests to try out before/with-the-fix? |
|
@ivankatliarchuk Why: What this PR changes:
Backward-compatibility: existing apex records that already lack a TXT remain untouched. Manifests to try (before/with the fix) ExternalDNS Deployment (set an intentionally incompatible txt-prefix to simulate the problem): apiVersion: apps/v1
kind: Deployment
metadata:
name: external-dns
namespace: dns
spec:
replicas: 1
selector:
matchLabels: { app: external-dns }
template:
metadata:
labels: { app: external-dns }
spec:
serviceAccountName: external-dns
containers:
- name: external-dns
image: ghcr.io/kubernetes-sigs/external-dns:latest
args:
- --source=service
- --provider=aws
- --registry=txt
- --txt-prefix=hogeApex record via a Service: apiVersion: v1
kind: Service
metadata:
name: website
namespace: default
annotations:
external-dns.alpha.kubernetes.io/hostname: example.com
spec:
type: LoadBalancer
ports:
- port: 80
targetPort: 80
selector:
app: websiteExpected behavior: Before the fix: A record for example.com is created; TXT ownership for apex is skipped with only a debug log (no clear signal). With the fix: A record creation is blocked when the corresponding apex TXT would be invalid, and a warn-level log explains that txt-prefix is not suitable for apex. Related issue: |
|
/ok-to-test |
|
Potentially resolves or helps to resolve or simply relates |
Pull Request Test Coverage Report for Build 18129040433Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
Is this referring to cases where another registry (such as dynamodb, aws-sd, or noop) is used instead? |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mloiseleur The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
I only performed high-level smoke testing, and the PR behaves as expected. I’m not entirely confident about the apex record check—it doesn’t seem very reliable, and I’m not sure what a truly reliable check for apex records would look like. The change itself make sense. |
|
@ivankatliarchuk Could you share what would make you feel more confident about it? |
|
There is a long-standing achitectural gap: accurate differentiation of apex roots independent of provider or zone nesting I see a risk in the proposed APEX check. The rootApexDetector implementation operates entirely heuristically, relying on NS record metadata observed from endpoint listings, not on authoritative DNS resolution. We can’t get the apex definitively without querying SOA, at most we could make a reasonable heuristic using net.LookupNS(). Simply parsing records I'm unsure where it will work or not, and when the edge cases are. |
|
@ivankatliarchuk Our Please let me know if anything looks incorrect or if you see a case I might have missed. 🙇 |
|
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
The PR reduces risk of creating invalid apex TXT records, but it can’t guarantee correctness across all DNS topologies and providers. Golang standart libraries do not provide any ways to identify that either. I tried to summarize it
That’s key for: the code only sees what your provider’s managed zones expose via their API. I'm not too sure where all the providers will expose this information same way. Hence this needs validating for every single one of them. Unfortunately, I'm only aware of flow, where we could validate for example public suffixes https://pkg.go.dev/golang.org/x/net/publicsuffix for public domain. Without querying authoritative servers for SOA (which isn’t done here), this method relies solely on names observed from the registry — a local view of reality. But this approach adds latency and brittleness, and might fail in locked-down VPC networks. |
|
From my perspective apex detection requires policy, not just data. There’s no single, universally correct way to define a “root apex domain”, as a result, there is no support in standart libraries for the same. Examples:
This means determining the apex isn’t a pure DNS query problem, but a contextual policy problem - you need to know which DNS view, resolver, and authority you’re talking to. |
|
Thanks a lot for the detailed write-up — I read the code and thought this through. Before replying to each “unreliable” scenario you listed, I’d like to align on scope:
With that scope in mind, here is how I see the scenarios (please correct me if I’m off):
Out of scope. If the provider API does not expose those zones, ExternalDNS should not manage them, so the detector doesn’t consider them.
From the provider implementations I checked, when a provider supports both public and private zones, an ExternalDNS instance chooses one (e.g.,
While looking into this case, I found something more fundamental related to this PR. I explain this below. New finding: some providers do not return NS records via While investigating this, I found that CoreDNS and Pi-hole providers don’t return NS records at all through their Summary In summary, this change should work well for most providers — That’s a trade-off between benefit and consistency. As an alternative idea, we could deprecate configurations whose |
What does it do?
Adds apex record delegation checks to prevent creating apex records when the TXT registry configuration would otherwise generate ownership TXT records outside of the managed zone.
Specifically, when
--txt-prefixincludes a record type template with a trailing dot, ExternalDNS could attempt to create TXT records that do not belong to the managed zone. This change introduces a guard (apexChecker) to block such cases.Motivation
When using certain
--txt-prefixconfigurations with apex records (records at the zone root), ExternalDNS may try to create ownership TXT records outside of the intended zone.For example, with
--txt-prefix=txt-and an apex A record for example.com, the ownership TXT record would be created as txt-example.com, which is outside the example.com zone and therefore unmanaged.This PR ensures apex records are only created when the resulting TXT record can be safely managed within the zone.
Related: documentation clarification proposed in #5863 .
More