Merged
41 commits
7854f9a
add new client protocol module
razvan Sep 2, 2025
32dfb3a
update crd fields to match decision
razvan Sep 3, 2025
e732da4
add spooling config to STS and some unit tests
razvan Sep 3, 2025
cc24f64
add kuttl test
razvan Sep 3, 2025
318c600
fix spool secret length
razvan Sep 4, 2025
202e25f
integration test successful
razvan Sep 4, 2025
9e2d738
rename python script
razvan Sep 4, 2025
d220ea4
Merge remote-tracking branch 'origin/main' into issues/673
razvan Sep 4, 2025
9086f55
increase the number of rows to fetch from Trino
razvan Sep 4, 2025
0a0c093
update changelog
razvan Sep 4, 2025
bea8115
update docs
razvan Sep 4, 2025
c5dc376
remove unused enum
razvan Sep 5, 2025
8f58e0e
remove Optional from config_overrides
razvan Sep 5, 2025
78f2e64
remove the "enabled" property
razvan Sep 5, 2025
e916710
refactor crd to use spec.clusterConfig.clientProtocol.spooling
razvan Sep 5, 2025
fffd646
remove clientProtocol.configOverrides field
razvan Sep 8, 2025
9ceafd6
handle config overrides for spooling-manager.properties
razvan Sep 8, 2025
206b5df
update docs
razvan Sep 8, 2025
83f8879
use config-utils to resolve S3 credentials
razvan Sep 5, 2025
a020f50
add comment to function
razvan Sep 5, 2025
26b5097
revert the unsafe function
razvan Sep 5, 2025
b63c9de
fix merge and update test
razvan Sep 8, 2025
efe1003
remove test timeout
razvan Sep 9, 2025
9e55ce1
not all Trino versions support spooling
razvan Sep 9, 2025
da557fd
Apply suggestions from code review
razvan Sep 11, 2025
76c88da
remove leftovers
razvan Sep 11, 2025
d76c4a1
remove `clusterConfig.faultTolerantExecution.configOverrides` property
razvan Sep 12, 2025
0ba2fed
amend FTE changelog
razvan Sep 15, 2025
4d32b6f
client_protocol: refactor s3 config into crd::s3 and config::s3 to m…
razvan Sep 15, 2025
d6943fc
fte: refactor to reuse crd::s3 anc move resolved struct out of the cr…
razvan Sep 15, 2025
c030c38
Apply suggestions from code review
razvan Sep 16, 2025
6757b5e
remove unused imports after suggestion
razvan Sep 16, 2025
b5c1941
remove crd::s3 which included iam fields
razvan Sep 16, 2025
af0806b
fte tests: replace inline structs with indoc
razvan Sep 16, 2025
dc31a89
controller tests: apply PR review patch
razvan Sep 16, 2025
68809f8
client-spooling: update schema for docs and tests
razvan Sep 16, 2025
f7f1e95
Apply suggestions from code review
razvan Sep 17, 2025
2bd9cd4
Merge branch 'main' into issues/673
razvan Sep 17, 2025
67945ba
client spooling: make s3 filesystem consistent with FTE backend
razvan Sep 17, 2025
35e7f2c
ensure error message are lowercased
razvan Sep 17, 2025
ae1dc26
client spooling: raise error for trino 451
razvan Sep 17, 2025
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -6,10 +6,12 @@ All notable changes to this project will be documented in this file.

 ### Added

-- Support for fault-tolerant execution ([#779]).
+- Support for fault-tolerant execution ([#779], [#793]).
+- Support for the client spooling protocol ([#793]).
 - Helm: Allow Pod `priorityClassName` to be configured ([#798]).

 [#779]: https://github.com/stackabletech/trino-operator/pull/779
+[#793]: https://github.com/stackabletech/trino-operator/pull/793
 [#798]: https://github.com/stackabletech/trino-operator/pull/798

 ## [25.7.0] - 2025-07-23
176 changes: 148 additions & 28 deletions deploy/helm/trino-operator/crds/crds.yaml
@@ -105,6 +105,154 @@ spec:
description: matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is "key", the operator is "In", and the values array contains only "value". The requirements are ANDed.
type: object
type: object
clientProtocol:
description: Client spooling protocol configuration.
nullable: true
oneOf:
- required:
- spooling
properties:
spooling:
properties:
filesystem:
oneOf:
- required:
- s3
properties:
s3:
properties:
connection:
oneOf:
- required:
- inline
- required:
- reference
properties:
inline:
description: S3 connection definition as a resource. Learn more on the [S3 concept documentation](https://docs.stackable.tech/home/nightly/concepts/s3).
properties:
accessStyle:
default: VirtualHosted
description: Which access style to use. Defaults to virtual hosted-style as most of the data products out there. Have a look at the [AWS documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html).
enum:
- Path
- VirtualHosted
type: string
credentials:
description: If the S3 uses authentication you have to specify your S3 credentials. In most cases a [SecretClass](https://docs.stackable.tech/home/nightly/secret-operator/secretclass) providing `accessKey` and `secretKey` is sufficient.
nullable: true
properties:
scope:
description: '[Scope](https://docs.stackable.tech/home/nightly/secret-operator/scope) of the [SecretClass](https://docs.stackable.tech/home/nightly/secret-operator/secretclass).'
nullable: true
properties:
listenerVolumes:
default: []
description: The listener volume scope allows Node and Service scopes to be inferred from the applicable listeners. This must correspond to Volume names in the Pod that mount Listeners.
items:
type: string
type: array
node:
default: false
description: The node scope is resolved to the name of the Kubernetes Node object that the Pod is running on. This will typically be the DNS name of the node.
type: boolean
pod:
default: false
description: The pod scope is resolved to the name of the Kubernetes Pod. This allows the secret to differentiate between StatefulSet replicas.
type: boolean
services:
default: []
description: The service scope allows Pod objects to specify custom scopes. This should typically correspond to Service objects that the Pod participates in.
items:
type: string
type: array
type: object
secretClass:
description: '[SecretClass](https://docs.stackable.tech/home/nightly/secret-operator/secretclass) containing the LDAP bind credentials.'
type: string
required:
- secretClass
type: object
host:
description: 'Host of the S3 server without any protocol or port. For example: `west1.my-cloud.com`.'
type: string
port:
description: Port the S3 server listens on. If not specified the product will determine the port to use.
format: uint16
minimum: 0.0
nullable: true
type: integer
region:
default:
name: us-east-1
description: |-
Bucket region used for signing headers (sigv4).

This defaults to `us-east-1` which is compatible with other implementations such as Minio.

WARNING: Some products use the Hadoop S3 implementation which falls back to us-east-2.
properties:
name:
default: us-east-1
type: string
type: object
tls:
description: Use a TLS connection. If not specified no TLS will be used.
nullable: true
properties:
verification:
description: The verification method used to verify the certificates of the server and/or the client.
oneOf:
- required:
- none
- required:
- server
properties:
none:
description: Use TLS but don't verify certificates.
type: object
server:
description: Use TLS and a CA certificate to verify the server.
properties:
caCert:
description: CA cert to verify the server.
oneOf:
- required:
- webPki
- required:
- secretClass
properties:
secretClass:
description: Name of the [SecretClass](https://docs.stackable.tech/home/nightly/secret-operator/secretclass) which will provide the CA certificate. Note that a SecretClass does not need to have a key but can also work with just a CA certificate, so if you got provided with a CA cert but don't have access to the key you can still use this method.
type: string
webPki:
description: Use TLS and the CA certificates trusted by the common web browsers to verify the server. This can be useful when you e.g. use public AWS S3 or other public available services.
type: object
type: object
required:
- caCert
type: object
type: object
required:
- verification
type: object
required:
- host
type: object
reference:
type: string
type: object
required:
- connection
type: object
type: object
location:
type: string
required:
- filesystem
- location
type: object
type: object
faultTolerantExecution:
description: Fault tolerant execution configuration. When enabled, Trino can automatically retry queries or tasks in case of failures.
nullable: true
@@ -132,12 +280,6 @@ spec:
 - required:
 - local
 properties:
-configOverrides:
-additionalProperties:
-type: string
-default: {}
-description: The `configOverrides` allow overriding arbitrary exchange manager properties.
-type: object
 encryptionEnabled:
 description: Whether to enable encryption of spooling data.
 nullable: true
@@ -312,14 +454,6 @@ spec:
 reference:
 type: string
 type: object
-externalId:
-description: External ID for the IAM role trust policy.
-nullable: true
-type: string
-iamRole:
-description: IAM role to assume for S3 access.
-nullable: true
-type: string
 maxErrorRetries:
 description: Maximum number of times the S3 client should retry a request.
 format: uint32
@@ -394,12 +528,6 @@ spec:
 - required:
 - local
 properties:
-configOverrides:
-additionalProperties:
-type: string
-default: {}
-description: The `configOverrides` allow overriding arbitrary exchange manager properties.
-type: object
 encryptionEnabled:
 description: Whether to enable encryption of spooling data.
 nullable: true
@@ -574,14 +702,6 @@ spec:
 reference:
 type: string
 type: object
-externalId:
-description: External ID for the IAM role trust policy.
-nullable: true
-type: string
-iamRole:
-description: IAM role to assume for S3 access.
-nullable: true
-type: string
 maxErrorRetries:
 description: Maximum number of times the S3 client should retry a request.
 format: uint32
43 changes: 43 additions & 0 deletions docs/modules/trino/pages/usage-guide/client-spooling-protocol.adoc
@@ -0,0 +1,43 @@
= Client Spooling Protocol
:description: Enable and configure the Client Spooling Protocol in Trino for efficient handling of large result sets.
:keywords: client spooling protocol, Trino, large result sets, memory management
:trino-docs-spooling-url: https://trino.io/docs/476/client/client-protocol.html

The Client Spooling Protocol in Trino is designed to efficiently handle large result sets. When enabled, this protocol allows the Trino server to spool results to external storage systems, reducing memory consumption and improving performance for queries that return large datasets.

For more details, refer to the link:{trino-docs-spooling-url}[Trino documentation on Client Spooling Protocol {external-link-icon}^].

[IMPORTANT]
====
The client spooling protocol was introduced in Trino 466 but it only works reliably starting with Trino 476.
====

== Configuration

The client spooling protocol is disabled by default.
To enable it, set the `spec.clusterConfig.clientProtocol.spooling` configuration property as shown below.

[source,yaml]
----
spec:
  clusterConfig:
    clientProtocol:
      spooling:
        location: "s3://spooling-bucket/trino/" # <1>
        filesystem:
          s3: # <2>
            connection:
              reference: "minio"
----
<1> Specifies the location where spooled data will be stored. This example uses an S3 bucket.
<2> Configures the filesystem type for spooling. Only S3 is supported currently via the custom resource definition.
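
According to the CRD schema, `connection` accepts either a `reference` or an `inline` definition. The following is a minimal sketch of the inline variant; the host, port and SecretClass name are placeholder assumptions and need to be replaced with values from your environment.

[source,yaml]
----
spec:
  clusterConfig:
    clientProtocol:
      spooling:
        location: "s3://spooling-bucket/trino/"
        filesystem:
          s3:
            connection:
              inline:
                host: minio.example.svc.cluster.local # hypothetical host, no protocol or port
                port: 9000 # hypothetical port
                accessStyle: Path # schema default is VirtualHosted
                credentials:
                  secretClass: spooling-s3-credentials # hypothetical SecretClass name
----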

The operator automatically fills in additional settings required by Trino, such as the `protocol.spooling.shared-secret-key`.
To add or replace properties in the generated `spooling-manager.properties` file, use the `configOverrides` property as described in xref:usage-guide/configuration.adoc[].
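
As an illustration only, an override on the coordinator role might look like the sketch below. It assumes that `fs.segment.ttl` is a valid spooling manager property in your Trino version and that the coordinator role is the right place for it; verify both against the Trino documentation before use.

[source,yaml]
----
spec:
  coordinators:
    configOverrides:
      spooling-manager.properties:
        fs.segment.ttl: "6h" # assumed property name and example value
----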

[IMPORTANT]
====
Even if enabled, Trino may decide not to use the client spooling protocol for certain queries. Clients cannot force Trino to use it.
====

Clients must be able to access the storage location configured for spooling.
1 change: 1 addition & 0 deletions docs/modules/trino/pages/usage-guide/configuration.adoc
@@ -16,6 +16,7 @@ For a role or role group, at the same level of `config`, you can specify `config
* `password-authenticator.properties`
* `security.properties`
* `exchange-manager.properties`
* `spooling-manager.properties`

For a list of possible configuration properties consult the https://trino.io/docs/current/admin/properties.html[Trino Properties Reference].

1 change: 1 addition & 0 deletions docs/modules/trino/partials/nav.adoc
@@ -7,6 +7,7 @@
** xref:trino:usage-guide/listenerclass.adoc[]
** xref:trino:usage-guide/configuration.adoc[]
** xref:trino:usage-guide/fault-tolerant-execution.adoc[]
** xref:trino:usage-guide/client-spooling-protocol.adoc[]
** xref:trino:usage-guide/s3.adoc[]
** xref:trino:usage-guide/security.adoc[]
** xref:trino:usage-guide/monitoring.adoc[]
10 changes: 5 additions & 5 deletions rust/operator-binary/src/authentication/mod.rs
@@ -43,26 +43,26 @@ const HTTP_SERVER_AUTHENTICATION_TYPE: &str = "http-server.authentication.type";
 #[derive(Snafu, Debug)]
 pub enum Error {
     #[snafu(display(
-        "The Trino Operator does not support the AuthenticationClass provider [{authentication_class_provider}] from AuthenticationClass [{authentication_class}]."
+        "the Trino Operator does not support the AuthenticationClass provider [{authentication_class_provider}] from AuthenticationClass [{authentication_class}]."
     ))]
     AuthenticationClassProviderNotSupported {
         authentication_class_provider: String,
         authentication_class: ObjectRef<core::v1alpha1::AuthenticationClass>,
     },

-    #[snafu(display("Failed to format trino authentication java properties"))]
+    #[snafu(display("failed to format trino authentication java properties"))]
     FailedToWriteJavaProperties {
         source: product_config::writer::PropertiesWriterError,
     },

-    #[snafu(display("Failed to configure trino password authentication"))]
+    #[snafu(display("failed to configure trino password authentication"))]
     InvalidPasswordAuthenticationConfig { source: password::Error },

-    #[snafu(display("Failed to configure trino OAuth2 authentication"))]
+    #[snafu(display("failed to configure trino OAuth2 authentication"))]
     InvalidOauth2AuthenticationConfig { source: oidc::Error },

     #[snafu(display(
-        "OIDC authentication details not specified. The AuthenticationClass {auth_class_name:?} uses an OIDC provider, you need to specify OIDC authentication details (such as client credentials) as well"
+        "oidc authentication details not specified. The AuthenticationClass {auth_class_name:?} uses an OIDC provider, you need to specify OIDC authentication details (such as client credentials) as well"
     ))]
     OidcAuthenticationDetailsNotSpecified { auth_class_name: String },