Release tidyr 0.4.0 · tidyverse/tidyr

Nested data frames

nest() and unnest() have been overhauled to support a useful way of structuring data frames: the nested data frame. In a grouped data frame, you have one row per observation, and additional metadata define the groups. In a nested data frame, you have one row per group, and the individual observations are stored in a column that is a list of data frames. This is a useful structure when you have lists of other objects (like models) with one element per group.

nest() now produces a single list of data frames called "data" rather
than a list column for each variable. Nesting variables are not included
in nested data frames. It also works with grouped data frames made
by dplyr::group_by(). You can override the default column name with .key.

unnest() gains a .drop argument which controls what happens to
other list columns. By default, they're kept if the output doesn't require
row duplication; otherwise they're dropped.

unnest() now has mutate() semantics for ... - this allows you to
unnest transformed columns more easily. (Previously it used select semantics).
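
As a rough illustration of the nested structure described above, the sketch below nests the built-in mtcars data by cyl, stores one model per group next to the data, and unnests again. The choice of mtcars, the lm() formula, and the names by_cyl, model and flat are assumptions made for this example and are not part of the release. It also assumes dplyr is installed.

```r
library(dplyr)
library(tidyr)

# One row per value of cyl; the remaining columns are packed into "data".
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup()

names(by_cyl)  # the nesting variable plus the list-column "data"
nrow(by_cyl)   # one row per group

# The nesting variable is not repeated inside the nested data frames.
"cyl" %in% names(by_cyl$data[[1]])

# A list of models sits naturally alongside the list of data frames.
by_cyl$model <- lapply(by_cyl$data, function(d) lm(mpg ~ wt, data = d))

# unnest() goes back to one row per observation.
flat <- by_cyl %>%
  select(cyl, data) %>%
  unnest(data)
nrow(flat) == nrow(mtcars)
```

Keeping the models in the same data frame as the data they were fitted to means each model stays aligned with its group when rows are filtered or reordered.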

Expanding

expand() once again allows you to evaluate arbitrary expressions like
full_seq(year). If you were previously using c() to create nested
combinations, you'll now need to use nesting() (#85, #121).

nesting() and crossing() allow you to create nested and crossed data
frames from individual vectors. crossing() is similar to
base::expand.grid().

full_seq(x, period) creates the full sequence of values from min(x) to
max(x), with consecutive values spaced period apart.
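
To make the difference between crossed and nested combinations concrete, here is a small sketch. The data frame df and its columns year, qtr and sales are invented for the example, as are the result names.

```r
library(tidyr)

# Toy data: 2011 is missing, and only some quarters were observed.
df <- data.frame(
  year  = c(2010, 2010, 2012, 2013),
  qtr   = c(1, 2, 1, 4),
  sales = c(10, 12, 9, 14)
)

# Every year between min(year) and max(year), in steps of 1.
years <- full_seq(df$year, 1)

# Crossed: every year in the full sequence paired with every quarter.
all_cells <- expand(df, year = full_seq(year, 1), qtr = 1:4)

# Nested: only the year/qtr pairs that actually occur in df.
seen <- expand(df, nesting(year, qtr))

# crossing() builds a crossed data frame directly from vectors.
grid <- crossing(x = 1:3, y = c("a", "b"))

nrow(all_cells)  # 4 years x 4 quarters
nrow(seen)       # 4 observed pairs
nrow(grid)       # 3 x 2 combinations
```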

Minor bug fixes and improvements

fill() fills in NULLs in list-columns.

fill() gains a .direction argument so that it can fill either upwards or
downwards (#114).

gather() now stores the key column as character, by default. To revert to
the previous behaviour of using a factor (which allows you to preserve the
ordering of the columns), use factor_key = TRUE (#96).

All tidyr verbs do the right thing for grouped data frames created by
group_by() (#122, #129, #81).

seq_range() has been removed. It was never used or announced.

spread() once again creates columns of mixed type when convert = TRUE
(#118, @jennybc). spread() with drop = FALSE handles zero-length
factors (#56). spread()ing a data frame with only key and value columns
creates a one row output (#41).

unite() now removes old columns before adding new (#89, @krlmlr).

separate() now warns if the defunct ... argument is used (#151, @krlmlr).
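
The fill() and gather() changes above can be tried out with a short sketch. It assumes the arguments are spelled .direction and factor_key, as in the fill() and gather() signatures of released tidyr. The data frames obs and wide and their columns are made up for the example.

```r
library(tidyr)

obs <- data.frame(
  id  = 1:5,
  grp = c(NA, "a", NA, NA, "b"),
  stringsAsFactors = FALSE
)

# Default: carry the last non-missing value downwards.
down <- fill(obs, grp)

# Fill upwards instead, taking the next non-missing value.
up <- fill(obs, grp, .direction = "up")

# Columns deliberately listed in non-alphabetical order (b before a).
wide <- data.frame(id = 1:2, b = c(1, 2), a = c(3, 4))

# The key column is a character vector by default.
long_chr <- gather(wide, key, value, b, a)
class(long_chr$key)

# With factor_key = TRUE the key is a factor whose levels follow
# the original column order.
long_fct <- gather(wide, key, value, b, a, factor_key = TRUE)
levels(long_fct$key)
```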
