Open
22 commits
a90ba5d
Add list aggregation examples in lists.py
cr7pt0gr4ph7 Nov 11, 2025
0ac185b
docs: Explain list aggregation & sorting
cr7pt0gr4ph7 Nov 11, 2025
ba292a8
fix: Fix syntax error in docs
cr7pt0gr4ph7 Nov 11, 2025
3cb6d84
fix: Make the code formatter happy
cr7pt0gr4ph7 Nov 11, 2025
e541911
Fix formatting in lists-and-arrays.md
cr7pt0gr4ph7 Nov 11, 2025
71fd867
Update lists-and-arrays.md
cr7pt0gr4ph7 Nov 11, 2025
a72c0f8
Update aggregation examples to use struct fields
cr7pt0gr4ph7 Nov 12, 2025
413335b
fix: Make the code formatter happy
cr7pt0gr4ph7 Nov 12, 2025
88a48c8
Reference related API functions
cr7pt0gr4ph7 Nov 12, 2025
a9c9254
Fix formatting in lists-and-arrays.md
cr7pt0gr4ph7 Nov 12, 2025
0e5f410
Rename section markers in lists.py
cr7pt0gr4ph7 Nov 12, 2025
fcc0c18
Rename section markers in lists.py
cr7pt0gr4ph7 Nov 12, 2025
4524d28
Update lists-and-arrays.md
cr7pt0gr4ph7 Nov 12, 2025
cc15c1c
fix: Use correct code snippet section names
cr7pt0gr4ph7 Nov 12, 2025
fc49453
Update comments for struct list example section
cr7pt0gr4ph7 Nov 12, 2025
3157f86
fix: Add stub sections to the Rust example files
cr7pt0gr4ph7 Nov 12, 2025
d739879
fix: Remove trailing spaces
cr7pt0gr4ph7 Nov 12, 2025
b7c114b
Add markers for list-sorting section in lists.rs
cr7pt0gr4ph7 Nov 14, 2025
a999ed9
Add list-sorting section
cr7pt0gr4ph7 Nov 14, 2025
65190fd
Update lists.py
cr7pt0gr4ph7 Nov 14, 2025
4fdbc01
Clarify usage of eval and agg functions
cr7pt0gr4ph7 Nov 14, 2025
729803e
fix: Make the code formatter happy
cr7pt0gr4ph7 Nov 15, 2025
73 changes: 73 additions & 0 deletions docs/source/src/python/user-guide/expressions/lists.py
@@ -123,6 +123,79 @@
print(result.equals(result2))
# --8<-- [end:element-wise-regex]

# --8<-- [start:children]
df = pl.DataFrame(
    {
        "children": [
            [
                {"name": "Anne", "age": 5},
                {"name": "Averill", "age": 7},
            ],
            [
                {"name": "Brandon", "age": 12},
                {"name": "Brooke", "age": 9},
                {"name": "Branson", "age": 11},
            ],
            [{"name": "Camila", "age": 19}],
            [
                {"name": "Dennis", "age": 8},
                {"name": "Doyle", "age": 11},
                {"name": "Dina", "age": 18},
            ],
        ],
    }
)

print(df)
# --8<-- [end:children]

# --8<-- [start:list-sorting]
result = df.select(
    pl.col("children")
    .list.eval(
        pl.element()
        .sort_by(pl.element().struct.field("age"), descending=True)
        .struct.field("name")
    )
    .alias("names_by_age"),
    pl.col("children")
    .list.eval(pl.element().struct.field("age").min())
    .alias("min_age"),
    pl.col("children")
    .list.eval(pl.element().struct.field("age").max())
    .alias("max_age"),
)
print(result)
# --8<-- [end:list-sorting]

# --8<-- [start:list-aggregation]
result = df.select(
    pl.col("children")
    .list.eval(
        pl.element()
        .sort_by(pl.element().struct.field("age"), descending=True)
        .struct.field("name")
    )
    .alias("names_by_age"),
    pl.col("children")
    .list.agg(pl.element().struct.field("age").min())
    .alias("min_age"),
    pl.col("children")
    .list.agg(pl.element().struct.field("age").max())
    .alias("max_age"),
)
print(result)
# --8<-- [end:list-aggregation]

# --8<-- [start:list-entropy]
result = df.with_columns(
    pl.col("children")
    .list.agg(pl.element().struct.field("age").entropy())
    .alias("age_entropy"),
)
print(result)
# --8<-- [end:list-entropy]

# --8<-- [start:weather_by_day]
weather_by_day = pl.DataFrame(
{
16 changes: 16 additions & 0 deletions docs/source/src/rust/user-guide/expressions/lists.rs
@@ -35,6 +35,22 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
// Contribute the Rust translation of the Python example by opening a PR.
// --8<-- [end:element-wise-regex]

// --8<-- [start:children]
// Contribute the Rust translation of the Python example by opening a PR.
// --8<-- [end:children]

// --8<-- [start:list-sorting]
// Contribute the Rust translation of the Python example by opening a PR.
// --8<-- [end:list-sorting]

// --8<-- [start:list-aggregation]
// Contribute the Rust translation of the Python example by opening a PR.
// --8<-- [end:list-aggregation]

// --8<-- [start:list-entropy]
// Contribute the Rust translation of the Python example by opening a PR.
// --8<-- [end:list-entropy]

// --8<-- [start:weather_by_day]
// Contribute the Rust translation of the Python example by opening a PR.
// --8<-- [end:weather_by_day]
48 changes: 46 additions & 2 deletions docs/source/user-guide/expressions/lists-and-arrays.md
@@ -151,10 +151,54 @@ If you are unfamiliar with the namespace `str` or the notation `(?i)` in the reg
time to
[look at how to work with strings and regular expressions in Polars](strings.md#check-for-the-existence-of-a-pattern).

### Aggregation & sorting

Just as `select` evaluates expressions over the columns of a dataframe, the two related functions
`eval` and `agg` evaluate an expression over the elements of each list, which lets us sort or
aggregate those elements.

We'll reuse a slightly modified version of the example data from the very beginning:

{{code_block('user-guide/expressions/lists', 'children', ['List'])}}

```python exec="on" result="text" session="expressions/lists"
--8<-- "python/user-guide/expressions/lists.py:children"
```

Using `eval`, we can sort the list elements or compute some aggregations:
Collaborator

If possible, this should reference list.agg as that makes more sense for aggregations.

Contributor Author

@cr7pt0gr4ph7 cr7pt0gr4ph7 Nov 14, 2025

Good point, I didn't even know about list.agg. OTOH, what's even the point of having both list.agg and list.eval if the latter can do everything the former can? Dataframes also only have DataFrame.select, which is used for both use cases.

I'm mainly asking so I can explain the differences (if any) here.


Update: I've created a separate issue for this discussion:

Contributor Author

@coastalwhite Thanks for the suggestion (and the clarification in #25336). I've now modified the explanation by first showing the result of using eval for aggregations, and then showing why it makes sense to use agg instead in those cases.

I've incorporated a slightly modified version of your statement from #25336, and hope that's okay for you:

The list.agg and list.eval expressions are exactly the same, except one difference. If the evaluation expression is statically determined to return only one value, it will automatically explode the list into the inner values. This matches what .group_by(...).agg(...) does, hence the name.


{{code_block('user-guide/expressions/lists', 'list-sorting', ['list.eval', 'Expr.sort_by'])}}

```python exec="on" result="text" session="expressions/lists"
--8<-- "python/user-guide/expressions/lists.py:list-sorting"
```
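
For readers more at home in plain Python, the per-row effect of the `names_by_age` expression can be
modelled with the built-in `sorted`. This is only a sketch of the semantics, not of how Polars
evaluates the expression: the helper `names_by_age` and the literal `children` data are
illustrative and simply mirror the example dataframe.

```python
# Plain-Python model of the `names_by_age` expression (illustrative only,
# not Polars code). Each inner list plays the role of one row's list value.
children = [
    [{"name": "Anne", "age": 5}, {"name": "Averill", "age": 7}],
    [
        {"name": "Brandon", "age": 12},
        {"name": "Brooke", "age": 9},
        {"name": "Branson", "age": 11},
    ],
    [{"name": "Camila", "age": 19}],
    [
        {"name": "Dennis", "age": 8},
        {"name": "Doyle", "age": 11},
        {"name": "Dina", "age": 18},
    ],
]


def names_by_age(row):
    """Sort one list of structs by age, oldest first, and keep only the names."""
    ordered = sorted(row, key=lambda child: child["age"], reverse=True)
    return [child["name"] for child in ordered]


# One output list per input row, just as `list.eval` yields one list per row.
result = [names_by_age(row) for row in children]
print(result)
```

The sort key and the extracted field are different struct fields, which is why the Polars version
combines `sort_by` on `age` with `struct.field("name")`.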

`eval` always returns a list, so `min_age` and `max_age` above are single-element lists. Use `agg`
to get them as scalar values instead:

{{code_block('user-guide/expressions/lists', 'list-aggregation', ['list.agg'])}}

```python exec="on" result="text" session="expressions/lists"
--8<-- "python/user-guide/expressions/lists.py:list-aggregation"
```

When the evaluated expression is statically determined to return only one value per list, `agg`
automatically explodes the resulting single-element list into its inner value. This matches what
`df.group_by(...).agg(...)` does, hence the name. `eval`, in contrast, never performs this
unwrapping.
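
The difference can be modelled in a few lines of plain Python. This is a hedged sketch of the
behaviour described here, not Polars' implementation: the functions `list_eval` and `list_agg` are
hypothetical stand-ins, and the `returns_scalar` flag is an assumption that takes the place of the
static analysis Polars performs on the expression.

```python
# Plain-Python model of `list.eval` vs. `list.agg` (illustrative only).


def list_eval(rows, expr):
    """Model of `eval`: the result for every row is always a list."""
    out = []
    for row in rows:
        value = expr(row)
        out.append(value if isinstance(value, list) else [value])
    return out


def list_agg(rows, expr, returns_scalar):
    """Model of `agg`: unwrap the per-row result when `expr` yields one value.

    `returns_scalar` is a hypothetical flag standing in for the static
    determination that Polars makes from the expression itself.
    """
    if returns_scalar:
        return [expr(row) for row in rows]
    return list_eval(rows, expr)


ages = [[5, 7], [12, 9, 11], [19], [8, 11, 18]]

min_eval = list_eval(ages, min)  # single-element lists
min_agg = list_agg(ages, min, returns_scalar=True)  # plain scalars
sorted_agg = list_agg(ages, sorted, returns_scalar=False)  # still lists

print(min_eval)
print(min_agg)
print(sorted_agg)
```

For an expression that keeps more than one value per list, such as a sort, the two functions
behave identically in this model, which mirrors the statement that `agg` and `eval` differ only
in the single-value case.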

Some aggregation functions, like `.list.sum()`, are available directly in the `list` namespace, but
more exotic aggregations like `entropy` are only accessible via `agg`/`eval`:

{{code_block('user-guide/expressions/lists', 'list-entropy', ['list.agg', 'Expr.entropy'])}}

```python exec="on" result="text" session="expressions/lists"
--8<-- "python/user-guide/expressions/lists.py:list-entropy"
```
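
To get a feel for the numbers in `age_entropy`, here is a small plain-Python sketch of the
computation. It assumes the formula given in the Polars documentation for `Expr.entropy`,
`-sum(pk * log(pk))`, with the natural logarithm and with the values first normalized to sum to 1;
if the defaults differ in your Polars version, adjust the parameters accordingly. The function
name `entropy` and the skipping of zero probabilities are choices made for this sketch only.

```python
import math


def entropy(values, base=math.e, normalize=True):
    """Sketch of -sum(pk * log(pk)) over one list of non-negative values.

    Assumption: with `normalize=True` the values are divided by their sum
    so that they can be read as discrete probabilities.
    """
    total = sum(values)
    probs = [v / total for v in values] if normalize else list(values)
    # Skipping p == 0 is a convention of this sketch, not a claim about Polars.
    return -sum(p * math.log(p, base) for p in probs if p > 0)


single_child = entropy([19])  # one value: all probability mass on one element
equal_ages = entropy([1, 1])  # two equal values: natural log of 2
first_family = entropy([5, 7])  # ages of the first row in the example

print(single_child, equal_ages, first_family)
```

Under these assumptions a list with a single element has entropy zero, and the value grows as the
ages within a list become more evenly spread.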

### Row-wise computations

The function `eval` gives us access to the list elements and `pl.element` refers to each individual
element, but we can also use `pl.all()` to refer to all of the elements of the list.
`pl.all()` can be combined with `pl.concat_list(...)` to perform row-wise aggregations over a subset
of columns.

To show this in action, we will start by creating another dataframe with some more weather data:
