
a new method of the flatten function in DataFrames  #2890

@sprmnt21

What do you think of adding a method of the flatten function in DataFrames that expands a data frame like df2 below, which has a column of nested data frames, into the form of dfexp?

using CSV, DataFrames

df = DataFrame(a=1:3)
df1 = DataFrame(b=11:15)

CSV.write("tmp1.csv", df)
CSV.write("tmp2.csv", df)
CSV.write("tmp3.csv", df1)

julia> df2
3×3 DataFrame
 Row │ files     other_cols  subdf                 
     │ String    Int64       DataFrame             
─────┼─────────────────────────────────────        
   1 │ tmp1.csv           1  3×1 DataFrame         
   2 │ tmp2.csv           2  3×1 DataFrame         
   3 │ tmp3.csv           3  5×1 DataFrame         


julia> dfexp=flatten(df2,:subdf)
11×4 DataFrame
 Row │ files     other_cols  a        b            
     │ String    Int64       Int64?   Int64?       
─────┼────────────────────────────────────────     
   1 │ tmp1.csv           1        1  missing      
   2 │ tmp1.csv           1        2  missing      
   3 │ tmp1.csv           1        3  missing      
   4 │ tmp2.csv           2        1  missing      
   5 │ tmp2.csv           2        2  missing      
   6 │ tmp2.csv           2        3  missing      
   7 │ tmp3.csv           3  missing       11      
   8 │ tmp3.csv           3  missing       12      
   9 │ tmp3.csv           3  missing       13      
  10 │ tmp3.csv           3  missing       14      
  11 │ tmp3.csv           3  missing       15    

The following expression produces the result shown above. It is only meant to illustrate what is required, not to suggest how it should be implemented.

mapreduce(r -> crossjoin(DataFrame(r[Not(:subdf)]), r.subdf),
          (x, y) -> vcat(x, y; cols = :union),
          eachrow(df2))
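
For anyone who wants to try this without the CSV files, here is a minimal self-contained sketch of the same idea. The helper name flatten_subdf is hypothetical and not part of DataFrames.jl, and df2 is built in memory on the assumption that its subdf column simply holds the data frames that would have been read from the three files.

```julia
using DataFrames

# Hypothetical helper (not part of DataFrames.jl): expand a column whose
# elements are data frames, repeating the other columns for each nested row.
function flatten_subdf(df::AbstractDataFrame, col::Symbol)
    parts = map(eachrow(df)) do r
        outer = DataFrame(r[Not(col)])   # one-row data frame of the remaining columns
        crossjoin(outer, r[col])         # repeat that row for every row of the nested data frame
    end
    # cols = :union keeps every column seen in any part and fills gaps with missing
    return reduce(vcat, parts; cols = :union)
end

df  = DataFrame(a = 1:3)
df1 = DataFrame(b = 11:15)

# Assumed shape of df2: the nested data frames are stored directly in :subdf
df2 = DataFrame(files = ["tmp1.csv", "tmp2.csv", "tmp3.csv"],
                other_cols = 1:3,
                subdf = [df, df, df1])

dfexp = flatten_subdf(df2, :subdf)
```

This sketch collects the per-row pieces first and concatenates them once with reduce(vcat, ...; cols = :union) rather than pairwise. It does not handle an empty df2, and crossjoin will throw if a nested data frame shares a column name with the outer columns.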
