Dashboard Multi Dataset Support #244
base: main
return {}

def get_union_view(self, dataset_names):
Queries across multiple datasets should probably be strictly limited to "full dataset" plots.
This implementation combines the current ctx.view with the entire contents of the other datasets, which is weird behavior. I think "multiple dataset" queries should do one of two things:
1. Be strictly limited to "full dataset" queries, i.e. use ctx.dataset.add_stage() instead of ctx.view.add_stage().
2. Apply the current view's filters to all datasets. For some views, this would be as simple as injecting the Mongo() stage as the first stage in ctx.view. However, for views that involve things like limit/skip/take, we'd need a version of the concat() stage that allows combining multiple datasets instead.
Querying across multiple datasets is a bit dubious because there is no guarantee that other datasets will have the correct field names/types to be queried by whatever filter you've built based on the current dataset's schema.
TL;DR: should we add guardrails here to ensure the user doesn't define an invalid plot? 🤔
Option 2 is tirc
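To make the guardrail idea concrete, here is a minimal sketch of a schema compatibility check that could run before a multi-dataset plot is accepted. It is not taken from this PR: `find_schema_conflicts` and the plain-dict schemas (flattened field path mapped to a type name) are hypothetical stand-ins, and a real implementation would need to build them from each dataset's actual field schema.

```python
from typing import Dict, List

# Hypothetical stand-in: a schema maps a flattened field path to a type name.
Schema = Dict[str, str]


def find_schema_conflicts(
    required: Schema, schemas: Dict[str, Schema]
) -> Dict[str, List[str]]:
    """Return, per dataset name, the reasons the plot's fields cannot be queried."""
    conflicts: Dict[str, List[str]] = {}
    for name, schema in schemas.items():
        problems = []
        for path, expected_type in required.items():
            actual_type = schema.get(path)
            if actual_type is None:
                problems.append(f"missing field '{path}'")
            elif actual_type != expected_type:
                problems.append(
                    f"field '{path}' is {actual_type}, expected {expected_type}"
                )
        if problems:
            conflicts[name] = problems
    return conflicts


# Fields the plot needs, taken from the current dataset's schema (assumed values)
required = {"metadata.width": "int"}
schemas = {
    "images_a": {"metadata.width": "int", "metadata.height": "int"},
    "videos_b": {"metadata.frame_width": "int"},
}
conflicts = find_schema_conflicts(required, schemas)
```

With a check like this, the panel could refuse to save the plot, or exclude the conflicting datasets, whenever `conflicts` is non-empty.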
return [ctx.dataset]

dataset_names = item.selected_datasets
if "all" in item.selected_datasets:
ALL datasets??? 🤯🤯🤯
These plots will surely take a loooong time to generate when the user has many datasets. Are we sure we can recommend this as a usable option?
Another consideration: how likely is it that any given aggregation would actually be valid across all datasets? Even if you are plotting a default field like metadata, image and video datasets have different attributes (e.g. metadata.width for image datasets and metadata.frame_width for video datasets).
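One possible way to handle both concerns is sketched below: expand "all" into a bounded list of datasets and pick the width field by media type. Everything here is an assumption for illustration, including the `resolve_selection` helper, the `MAX_DATASETS` cap, and the plain dict of media types; only the two field names come from the comment above.

```python
from typing import Dict, List

MAX_DATASETS = 10  # hypothetical cap to keep "all" from scanning everything

# Field names from the comment above: width lives in a different place per media type
WIDTH_FIELD_BY_MEDIA_TYPE = {
    "image": "metadata.width",
    "video": "metadata.frame_width",
}


def resolve_selection(
    selected: List[str],
    media_types: Dict[str, str],
    max_datasets: int = MAX_DATASETS,
) -> Dict[str, str]:
    """Map each selected dataset name to the width field for its media type."""
    names = sorted(media_types) if "all" in selected else list(selected)
    if len(names) > max_datasets:
        raise ValueError(
            f"{len(names)} datasets selected, limit is {max_datasets}"
        )
    resolved = {}
    for name in names:
        field = WIDTH_FIELD_BY_MEDIA_TYPE.get(media_types[name])
        if field is not None:  # skip media types with no known width field
            resolved[name] = field
    return resolved


media_types = {"photos": "image", "clips": "video", "scans": "point-cloud"}
resolved = resolve_selection(["all"], media_types)
```

Datasets whose media type has no mapped field are silently skipped in this sketch; surfacing them to the user instead may be the better UX.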
inputs = types.Object()

# Dataset selection tabs
dataset_mode_choices = types.TabsView()
I'd need to see it IRL to confirm, but I do think using tabs is a reasonable UX here 👍
No description provided.