feat(aws-connection): Enables code for the feature #831

d0weinberger · 2025-10-15T10:19:46Z

Why this PR?

This PR enables the creation of AWS connections via Terraform which are, for example, used in Workflows.

What has changed?

The code was already there, but disabled. I enabled it again, did some manual testing with it, and fixed/changed parts of it.

In role_arn/service.go:71, I let Update return nil. Only Create does anything, which is updating the role_arn property of the corresponding connection.
In aws/service.go, the commented logic for the Update did not work after enabling it, because even if an ARN is set in the Terraform resource, it would be overwritten by the ARN set on the server (which is empty). So the created ARN resource did not lead to an update of the connection settings on the server. I removed all of that and just called me.service.Update, but need to talk to @Reinhard-Pilz-Dynatrace whether this might be problematic, or to understand the intention of the code that was there before.
The call for adding the role ARN needed some retry logic. The reason for this is that when creating the IAM role on AWS, it is not immediately in effect. It can take a few seconds before this is the case. If the AWS connection settings object is updated immediately after IAM role creation, the call will fail, as HAS will fail to assume the role. I implemented a retry loop using retry.RetryContext. We retry on server-errors, non-HTTP-errors and errors of code 400 and 404 to handle the case with eventual consistency. The default timeout is 20 minutes. The timeout for the retry loop can be set with a timeouts block, as seen in the example below, where it is set to 30 minutes. One minute (if available) is deducted from the deadline for the retry context to have enough time for finalizing tasks. Example:

resource "dynatrace_aws_connection_role_arn" "test-aws-connection-rolebased-arn" {
  aws_connection_id = dynatrace_aws_connection.test-aws-connection-rolebased.id
  role_arn          = aws_iam_role.example_role_rolebased.arn

  timeouts {
    create = "30m"
  }
}

How does it do it?

The first step is to create a dynatrace_aws_connection resource. This also creates the settings object on the server.
Then, after having created the IAM role in AWS, the dynatrace_aws_connection_role_arn resource is used. This resource updates the role_arn of the AWS connection settings object on the server.

Since the AWS connection settings object is practically immutable after creation, all attributes and sub-resources of the resource dynatrace_aws_connection are flagged as forceNew. The Update should only ever be called via role_arn/service to set the role_arn.

So for dynatrace_aws_connection, if a terraform resource is

created --> Create is called, new settings object is created (without role ARN)
changed --> because of forceNew, the old resource is deleted and a new one is created
removed --> Delete is called and the connection is deleted.

For dynatrace_aws_connection_role_arn, if a terraform resource is

created --> Create is called, which calls Update for dynatrace_aws_connection, so the AWS connection settings object is updated - this should only set the ARN. Other changes, or resetting the ARN, would lead to API errors
changed --> returns nil.
removed --> Delete is called but nothing happens - nil is returned. Since the role ARN is also immutable once it is set, we cannot update the AWS connection with an empty ARN.

How is it tested?

Just manual tests right now. No E2E test setup.
An example using web identity is added.
An acceptance test is added, but skipped for now.

How does it affect users?

Users can now use the TF resources dynatrace_aws_connection and dynatrace_aws_connection_role_arn to configure AWS connections, which in turn can be used in, e.g. workflows.

Issue:
CA-16697

.../builtin/hyperscalerauthentication/connections/aws/testdata/terraform/example_webidentity.tf

dynatrace/api/builtin/hyperscalerauthentication/connections/aws/service.go

dynatrace/api/builtin/hyperscalerauthentication/connections/aws/role_arn/service.go

Kirdock · 2025-10-17T09:00:48Z

dynatrace/api/builtin/hyperscalerauthentication/connections/aws/role_arn/service.go

+	var err error
+	var retry = 10
+	for i := 0; i < retry; i++ {
+		if err = me.connService.Update(ctx, v.AWSConnectionID, &connValue); err == nil {


[nit] Should we limit it to only retry on 400 status codes?

We could, but I don't see any immediaty harm in retrying on every error. Do you?
What I'd like to discuss, though, is: number of retries, duration of timeout and whether these 2 values should be customizable, i.e. should we pass this from the outside via env vars instead of magic numbers? What do you think?

dynatrace/api/builtin/hyperscalerauthentication/connections/aws/role_arn/service.go

Kirdock · 2025-10-27T12:28:37Z

dynatrace/api/builtin/hyperscalerauthentication/connections/aws/role_arn/service.go

+			retryTimeout = remaining - timeoutDeadlineBuffer
+		} else {
+			retryTimeout = remaining
+		}


I'm not quite sure what that does.
timeoutDeadlineBuffer is like an expected amount of time of API call (e.g., min time)?
If the remaining time is less than timeoutDeadlineBuffer I would expect that the call is not executed. With this else here, I don't quite get the point of the timeoutDeadlineBuffer

If the timeout for the whole create-process triggered via Terraform is, say, 10 minutes, then I don't want to have the whole 10 minutes allocated for the (internal) update call to the settings API. I want to give it maybe 9 minutes, so there is still enough time for all the other operations after the update call to finish, before the whole create times out.

Whether to do or not to do the update call if the remaining time is less than what we would like to have for buffer ... yes, that's a fair point. To play it save, we could just not do the call at all instead of using the whole remaining time for the update call.

@arthurpitman could you also have a look?
I would prefer either returning in the else branch with DeadlineExceeded or not considering a timeoutDeadlineBuffer at all

- Code that was previously commented in `enums.go`, `provider.go`, `resource_descriptor`, `aws/service.go`, and `role_arn/service.go` is now enabled. - A (minimal) `example.tf` using web identity is added. The AWS provider part is missing to have a complete example. - An acceptance test is added, but skipped for now.

This is necessary because when the IAM role is created on AWS, it takes some time until it comes into effect. If immediately after creating the IAM role the role ARN is added to the AWS connection, the HAS will fail when trying to assume the role. Therefore, we need to retry adding the role ARN

…able With the changes in this commit, users of the Terraform provider can configure the timeout for the `create` operation for the `role_arn` resource by adding the following block to the resource: ``` timeouts { create = "<x>m" } `` where <x> is the max amount of minutes the user wants to wait.

Kirdock

The current implementation solves the issue. The other comment open is minor and doesn't have a big impact on the implementation. So lgtm

sonarqubecloud · 2025-10-28T10:18:23Z

Quality Gate passed

Issues
3 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

d0weinberger force-pushed the feat/aws-connection branch from cfc1a29 to 613f1b2 Compare October 15, 2025 11:45

Kirdock reviewed Oct 17, 2025

View reviewed changes

d0weinberger force-pushed the feat/aws-connection branch 3 times, most recently from 182c43d to b57e9e7 Compare October 22, 2025 07:35

Kirdock reviewed Oct 27, 2025

View reviewed changes

d0weinberger added 5 commits October 27, 2025 14:47

feat(aws-connection): Add dependency for settings permissions

76ab0b6

feat(aws-connection): Adds webidentity example

5bf325b

d0weinberger force-pushed the feat/aws-connection branch from b57e9e7 to 6281dfb Compare October 27, 2025 13:47

Kirdock previously approved these changes Oct 27, 2025

View reviewed changes

d0weinberger dismissed Kirdock’s stale review via 41c6d06 October 27, 2025 16:11

Kirdock previously approved these changes Oct 27, 2025

View reviewed changes

feat(aws-connection): Simplifies retry logic

f53e62a

d0weinberger dismissed Kirdock’s stale review via f53e62a October 28, 2025 10:17

d0weinberger force-pushed the feat/aws-connection branch from 41c6d06 to f53e62a Compare October 28, 2025 10:17

Kirdock approved these changes Oct 30, 2025

View reviewed changes

d0weinberger merged commit 2489714 into main Oct 30, 2025
6 of 8 checks passed

d0weinberger deleted the feat/aws-connection branch October 30, 2025 13:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(aws-connection): Enables code for the feature #831

feat(aws-connection): Enables code for the feature #831

Uh oh!

d0weinberger commented Oct 15, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Kirdock Oct 17, 2025

Uh oh!

d0weinberger Oct 17, 2025

Uh oh!

Uh oh!

Kirdock Oct 27, 2025

Uh oh!

d0weinberger Oct 27, 2025

Uh oh!

Kirdock Oct 27, 2025

Uh oh!

Kirdock left a comment

Uh oh!

sonarqubecloud bot commented Oct 28, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

feat(aws-connection): Enables code for the feature #831

feat(aws-connection): Enables code for the feature #831

Uh oh!

Conversation

d0weinberger commented Oct 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why this PR?

What has changed?

How does it do it?

How is it tested?

How does it affect users?

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Kirdock Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

d0weinberger Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Kirdock Oct 27, 2025

Choose a reason for hiding this comment

Uh oh!

d0weinberger Oct 27, 2025

Choose a reason for hiding this comment

Uh oh!

Kirdock Oct 27, 2025

Choose a reason for hiding this comment

Uh oh!

Kirdock left a comment

Choose a reason for hiding this comment

Uh oh!

sonarqubecloud bot commented Oct 28, 2025

Quality Gate passed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

d0weinberger commented Oct 15, 2025 •

edited

Loading