-
Notifications
You must be signed in to change notification settings - Fork 25.6k
ESQL: Views REST API #137818
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
ESQL: Views REST API #137818
Conversation
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Pinging @elastic/es-data-management (Team:Data Management) |
|
Hi @craigtaverner, I've created a changelog YAML for you. |
259d88e to
2b93fa5
Compare
|
Since these don't yet do anything, and will be part of multiple PRs before the feature is fully-fleshed out, I think it would be helpful for these to be behind a feature flag, so that they are not inadvertently released-and-then-changed. |
We've made this SNAPSHOT only, which is essentially the same as a feature flag, and is the approach normally taken in the ES|QL team for cases like this where we do not yet want the feature to be released. However, I will add more tests that explicitly assert that the feature is missing outside of SNAPSHOT builds. |
As I read this, that is for the implementation itself, but does not prevent the REST APIs from being registered and available, even without the node feature. Unless I am misunderstanding it? |
|
(Another drive-by comment, but this is not a full review) Why do we have both a "get" and "list" action/API for these? All of our other APIs that follow this model register a single get REST handler that takes an optional |
bb1ce21 to
89f766e
Compare
|
Hi @craigtaverner, I've created a changelog YAML for you. |
This is extracted from the Views Prototype at elastic#134995 and is originally based on the "ESQL: View" prototype by Nik but with the following changes: * No ES|QL language integration, only REST API for adding/removing and managing Views in cluster state * ViewMetadata now inside ProjectMetadata * NodeFeature to ensure metadata only exists in clusters fully upgraded * Wide CRUD support (add, update, delete, list) * View name validation * View count, length and depth restrictions Co-authored-by: Nik Everett <[email protected]>
f21d063 to
73da7cd
Compare
nielsbauman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this, @craigtaverner! I scanned most parts of the code briefly and left a few comments, some of which are not a big deal.
...vileges-tests/src/javaRestTest/java/org/elasticsearch/xpack/security/operator/Constants.java
Outdated
Show resolved
Hide resolved
| PARSER.declareObject(ConstructingObjectParser.constructorArg(), (p, c) -> { | ||
| Map<String, View> patterns = new HashMap<>(); | ||
| String fieldName = null; | ||
| for (XContentParser.Token token = p.nextToken(); token != XContentParser.Token.END_OBJECT; token = p.nextToken()) { | ||
| if (token == XContentParser.Token.FIELD_NAME) { | ||
| fieldName = p.currentName(); | ||
| } else if (token == XContentParser.Token.START_OBJECT) { | ||
| patterns.put(fieldName, View.fromXContent(p)); | ||
| } else { | ||
| throw new ElasticsearchParseException("unexpected token [" + token + "]"); | ||
| } | ||
| } | ||
| return patterns; | ||
| }, VIEWS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use AbstractObjectParser#declareNamedObjects instead here? That avoids using the old parser syntax, which is a lot harder to read (and debug). I think that would look something like this:
| PARSER.declareObject(ConstructingObjectParser.constructorArg(), (p, c) -> { | |
| Map<String, View> patterns = new HashMap<>(); | |
| String fieldName = null; | |
| for (XContentParser.Token token = p.nextToken(); token != XContentParser.Token.END_OBJECT; token = p.nextToken()) { | |
| if (token == XContentParser.Token.FIELD_NAME) { | |
| fieldName = p.currentName(); | |
| } else if (token == XContentParser.Token.START_OBJECT) { | |
| patterns.put(fieldName, View.fromXContent(p)); | |
| } else { | |
| throw new ElasticsearchParseException("unexpected token [" + token + "]"); | |
| } | |
| } | |
| return patterns; | |
| }, VIEWS); | |
| PARSER.declareObject(ConstructingObjectParser.constructorArg(), (p, c) -> View.fromXContent(p), VIEWS); |
but I didn't test that, so there's a chance I'm saying something that doesn't make sense :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is copied verbatim from EnrichMetadata, which has a very similar structure with the constructor taking a Map<String, EnrichPolicy>, and here we take Map<String, View>. The difference is handled on line 56 which decodes the View using View.fromXContent. You suggest that I skip the map, which I'm not sure makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That code in EnrichMetadata hasn't been modified since 2019, so not every aspect of that class serves as a great example for us here. I indeed now realize that declareObject returns a list and not a map, that's unfortunate... I've asked around if we don't have any utility for this, but we can leave it for now.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/view/View.java
Outdated
Show resolved
Hide resolved
| @SuppressForbidden(reason = "legacy usage of unbatched task") // TODO add support for batching here | ||
| private void submitUnbatchedTask(@SuppressWarnings("SameParameterValue") String source, ClusterStateUpdateTask task) { | ||
| clusterService.submitUnbatchedStateUpdateTask(source, task); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if you were still planning on doing this, but it would be good to use a proper task queue here. I would recommend using a SimpleBatchedAckListenerTaskExecutor, as that avoids you having to implement a boilerplate executor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comes from the original prototype, and it would be good to fix this already now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, the enrich policy api also uses submitUnbatchedStateUpdateTask...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that code is definitely not recent either, hence the TODO in the code :)
| * Simple implementation of {@link ClusterViewService} that keeps the views in memory. | ||
| * This is useful for testing. | ||
| */ | ||
| public class InMemoryViewService extends ViewService { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm probably missing something, but why do we need this in-memory version exactly?
And, follow-up question: wouldn't it make more sense to put that in a test package somewhere, rather than having it in production code, if it's only useful for testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is only moderately useful here, but very useful in the actual ES|QL views were we have semi-unit tests that run the full ES|QL stack, but outside of Elasticsearch, and then this becomes the main mechanism for enabling that. I could indeed move it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this PR it serves primarily as unit tests for the ViewService, without touching cluster state (so only tests the ViewService API internally). That still has value, so I'm happy to keep it here (although I moved to the test package, as you suggested).
| submitUnbatchedTask("update-esql-view-metadata", new ClusterStateUpdateTask() { | ||
| @Override | ||
| public ClusterState execute(ClusterState currentState) { | ||
| var project = getProjectMetadata(currentState); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getProjectMetadata uses the ProjectResolver, but the ProjectResolver is not meant to be used inside cluster state update tasks, as the thread context isn't copied over when running cluster state update tasks (which is what the project resolver relies on in multi-project mode). It's totally understandable that you would implement it like this; when the multi-project work gets picked up, we should put effort into making it clearer that the project resolver isn't supposed to work like this.
That means that you'll have to explicitly pass a ProjectId to this method. I would advise resolving the project ID in the transport actions, because:
I think having this getMetadata() method that relies on the project resolver can cause confusion and possible bugs, because it's not apparent at all that this method only works if the project ID is present in the thread context, which isn't always the case, such as here. I would advise doing something along the lines of:
protected ViewMetadata getMetadata(ProjectId projectId) {
return getMetadata(clusterService.state().metadata().getProject(projectId));
}
protected ViewMetadata getMetadata(ProjectMetadata projectMetadata) {
return projectMetadata.custom(ViewMetadata.TYPE, ViewMetadata.EMPTY);
}Let me know what you think and if you have any questions/concerns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally got this approach from the EnrichPolicyResolver, which uses ProjectResolver, but then also noticed that the EnrichPolicy CRUD used ProjectID instead, so it seems Enrich uses both approaches (but in different places, so probably correct). I can try switch to using ProjectID for CRUD as you suggest.
This is extracted from the Views Prototype at #134995 and is originally based on the "ESQL: View" prototype by Nik but with the following changes:
The REST API looks like:
Checklist:
listinterface and generalise GET command to cover list use cases (more in-line with other APIs)SequentialAckingBatchedTaskExecutorProjectResolverand instead use theProjectIDas is done in Enrich REST API