I am on the last step of my R session, where I have a huge data.table object that must be passed on to two clients in CSV format. Each client needs only a few columns of the data.table. However, I am running tight on RAM and can't afford to create another R object, and I can't delete the columns one client doesn't need because the other client needs them.
Currently, my implementation is:
library(data.table)
set.seed(1)
dt <- data.table(
  a = rnorm(1e8),
  b = runif(1e8),
  c = sample(letters, 1e8, TRUE)
)
fwrite(dt)
fwrite(dt[, .(a, c)]) # data requested by the first client
fwrite(dt[, .(b, c)]) # data requested by the second client
From my inspection and understanding, the second and third fwrite calls each create a throwaway object that requires additional RAM. Is it possible to have a select argument for fwrite that doesn't require such a temporary object? Something like fwrite(dt, select = c("a", "c"))?
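Until something like that exists, one possible workaround is to hand fwrite a plain list that refers to the existing column vectors instead of a subsetted data.table. The sketch below assumes that fwrite accepts any list of equal-length vectors (as its documentation describes) and that subsetting the unclassed table copies only the list of column references, not the column data. The helper name fwrite_cols and the temporary file paths are hypothetical, and a 10-row table stands in for the large one.

```r
library(data.table)

# Hypothetical helper (not part of data.table): write only `cols` of `dt`.
fwrite_cols <- function(dt, cols, file, ...) {
  stopifnot(all(cols %in% names(dt)))
  # unclass() yields a plain list and `[` picks the requested elements.
  # Assumption: the resulting list points at the existing column vectors,
  # so no deep copy of the column data should be made.
  view <- unclass(dt)[cols]
  fwrite(view, file = file, ...)
  invisible(file)
}

# Small stand-in for the large table in the question.
set.seed(1)
dt <- data.table(a = rnorm(10), b = runif(10), c = sample(letters, 10, TRUE))

out_ac <- tempfile(fileext = ".csv")
out_bc <- tempfile(fileext = ".csv")
fwrite_cols(dt, c("a", "c"), out_ac)  # data requested by the first client
fwrite_cols(dt, c("b", "c"), out_bc)  # data requested by the second client
```

Whether this really avoids the extra allocation at the 1e8-row scale is worth checking on the actual session, for example by comparing data.table::address() of a column before and after selection.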
Output of sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)
Matrix products: default
locale:
[1] LC_COLLATE=English_Singapore.1252 LC_CTYPE=English_Singapore.1252
[3] LC_MONETARY=English_Singapore.1252 LC_NUMERIC=C
[5] LC_TIME=English_Singapore.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.12.9
loaded via a namespace (and not attached):
[1] compiler_3.6.1 tools_3.6.1 curl_4.2