Skip to content

Conversation

mcoulombe
Copy link
Contributor

@mcoulombe mcoulombe commented Oct 3, 2025

What this PR does / why we need it:

Some resources like SCIM groups can be created and added as a reference to the policy file in the same run. The eager validation causes otherwise valid runs to fail at the plan step because the references do not exist yet.

We are relaxing the plan validation logic to avoid such false positives. The full validation results are still available in debug logs in case this is useful. Note that the validation is done in full during an apply before the policy file is changed so this does not bypass the full breath of validations taking place before the policy is modified.

Which issue this PR fixes (use fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged):

Fixes #546

Example of a validation error for a failed test in the debug logs:

2025-10-03T14:07:54.620-0400 [DEBUG] provider.terraform-provider-tailscale: ACL validation unsuccessful due to advisory error: tf_resource_type=tailscale_acl tf_rpc=PlanResourceChange @module=provider tf_provider_addr=provider error="ACL validation failed: test(s) failed; [{[email protected] [[acl test error]: address \"1.2.3.4:80\" (protocol \"tcp\"): want: Drop, got: Accept]}]" tf_req_id=4a4b79ad-9fb5-fe55-1ab8-961f4e20b4af @caller=/Users/maxc/Documents/projects/terraform-provider-tailscale/tailscale/resource_acl.go:54 timestamp=2025-10-03T14:07:54.620-0400

Resulting apply if the user ignores or does not check the validation logs before an apply:

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

tailscale_acl.as_hujson: Modifying... [id=acl]
╷
│ Error: Failed to set ACL
│ 
│   with tailscale_acl.as_hujson,
│   on main.tf line 15, in resource "tailscale_acl" "as_hujson":
│   15: resource "tailscale_acl" "as_hujson" {
│ 
│ test(s) failed (400)
╵
╷
│ Error: user: [email protected]
│ error: [acl test error]: address "1.2.3.4:80" (protocol "tcp"): want: Drop, got: Accept
│ 
│   with tailscale_acl.as_hujson,
│   on main.tf line 15, in resource "tailscale_acl" "as_hujson":
│   15: resource "tailscale_acl" "as_hujson" {
│ 
╵

ports = ["*:*"],
"src": ["*"],
"dst": ["*"],
"ip": ["*"],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by, the HuJSON example was not valid. Switching from the ACL to the Grant syntax at the same time.

// non-syntax errors are logged but do not fail the plan operation
tflog.Debug(ctx, "ACL validation unsuccessful due to advisory error", map[string]interface{}{"error": err})
return nil
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From our conversations, warnings might be counterproductive and confuse people, but we should not make the information completely unavailable.

Some resources like SCIM groups can be created and added as a reference to the policy file in the same run. The eager validation causes otherwise valid runs to fail at the plan step because the references do not exist yet.

We are relaxing the plan validation logic to avoid such false positives. The full validation results are still available in debug logs in case this is useful. Note that the validation is done in full during an apply before the policy file is changed so this does not permit a tailnet policy to be in an invalid state.

Fixes #546

Signed-off-by: mcoulombe <[email protected]>
@mcoulombe mcoulombe force-pushed the max/relax-acl-plan-validations branch from 5e104df to c70357a Compare October 3, 2025 18:15
Copy link
Collaborator

@oxtoacart oxtoacart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM, just a couple of suggestions.

return err
} else if err != nil {
// non-syntax errors are logged but do not fail the plan operation
tflog.Debug(ctx, "ACL validation unsuccessful due to advisory error", map[string]interface{}{"error": err})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tflog.Debug(ctx, "ACL validation unsuccessful due to advisory error", map[string]interface{}{"error": err})
tflog.Debug(ctx, "ACL validation unsuccessful due to advisory warning, apply may fail if dependencies are unmet", map[string]interface{}{"error": err})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can soften the wording and call them warnings, but the validation failures are not necessarily undefined dependencies. The provider should not characterize what caused the error if it just forwards what the BE returned, which can change in the future.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are not necessarily

Hence the use of "may"

// TODO: use an exported variable when https://github.com/hashicorp/terraform-plugin-sdk/issues/803 has been addressed.
const UnknownVariableValue = "74D93920-ED26-11E3-AC10-0800200C9A66"

var syntaxErrorMessage = regexp.MustCompile(`^ACL validation failed: line \d+, column \d+:`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe include a comment about what assumptions this makes about the control server.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by assumptions about the control server?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assumption that all syntax errors present as an error in this form, and that if this change, the logic breaks.

@oxtoacart
Copy link
Collaborator

@mcoulombe Should we just close this one?

@mcoulombe
Copy link
Contributor Author

@mcoulombe Should we just close this one?

Once I'm done with the WIF work my plan is to add error codes on the validate endpoint and use that code here instead of the error message format, it seems reasonable to me to update this PR once the BE is ready instead of opening a new one.

@mcoulombe mcoulombe marked this pull request as draft October 14, 2025 18:23
@mcoulombe
Copy link
Contributor Author

mcoulombe commented Oct 14, 2025

Converting to draft since it requires downstream BE changes and has been a bit deprioritized to get some internal projects over the finish line. I'll come back to it as soon when possible

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow bypassing ACL validation for groups

2 participants