@@ -54,7 +54,7 @@ Predictors:
   b(unknown) & c(unknown)
 
 julia> df = DataFrame(y = rand(9), a = 1:9, b = rand(9), c = repeat(["d","e","f"], 3))
-9×4 DataFrame
+9×4 DataFrames.DataFrame
 │ Row │ y          │ a     │ b         │ c      │
 │     │ Float64    │ Int64 │ Float64   │ String │
 ├─────┼────────────┼───────┼───────────┼────────┤
@@ -108,6 +108,11 @@ The left-hand side has one term `y` which means that the response variable is
 the column from the data named `:y`. The response can be accessed with the
 analogous `response(f, df)` function.
 
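As a minimal sketch of the two accessors mentioned above (column names here are illustrative, and `modelmatrix`/`response` are assumed to accept a formula plus a table, as in the examples later in this page):

```julia
using StatsModels, DataFrames

df = DataFrame(y = rand(9), a = 1:9)

# the response is just the (possibly transformed) left-hand-side column
response(@formula(y ~ 1 + a), df) == df.y
response(@formula(log(y) ~ 1 + a), df) == log.(df.y)
```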
+!!! note
+
+    To make a "one-sided" formula (with no response), put a `0` on the left-hand
+    side, like `@formula(0 ~ 1 + a + b)`.
+
 The right-hand side is made up of a number of different **terms**, separated by
 `+`: `1 + a + b + c + b&c`. Each term corresponds to one or more columns in the
 generated model matrix:
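One way to see this expansion concretely is to resolve the formula against the data's schema and materialize the columns; a sketch assuming the `df` defined above:

```julia
using StatsModels, DataFrames

df = DataFrame(y = rand(9), a = 1:9, b = rand(9), c = repeat(["d","e","f"], 3))

# resolve each term against the data's column types, then build the columns
f = apply_schema(@formula(y ~ 1 + a + b + c + b&c), schema(df))
resp, pred = modelcols(f, df)
size(pred)  # 9 rows; one column each for the intercept, a, and b,
            # plus dummy columns for c and for the b&c interaction
```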
@@ -214,34 +219,34 @@ For instance, to fit a linear regression to a log-transformed response:
 julia> using GLM
 
 julia> lm(@formula(log(y) ~ 1 + a + b), df)
-StatsModels.TableRegressionModel{LinearModel{LmResp{Array{Float64,1}},DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}
+StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}
 
 :(log(y)) ~ 1 + a + b
 
 Coefficients:
-──────────────────────────────────────────────────────
-               Estimate  Std.Error    t value  Pr(>|t|)
-──────────────────────────────────────────────────────
-(Intercept)  -4.16168     2.98788   -1.39285     0.2131
-a             0.357482    0.342126   1.04489     0.3363
-b             2.32528     3.13735    0.741159    0.4866
-──────────────────────────────────────────────────────
+──────────────────────────────────────────────────────────────────────────────
+               Estimate  Std. Error    t value  Pr(>|t|)  Lower 95%  Upper 95%
+──────────────────────────────────────────────────────────────────────────────
+(Intercept)  -4.16168      2.98788   -1.39285     0.2131  -11.4727     3.14939
+a             0.357482     0.342126   1.04489     0.3363   -0.479669   1.19463
+b             2.32528      3.13735    0.741159    0.4866   -5.35154   10.0021
+──────────────────────────────────────────────────────────────────────────────
 
-julia> df[:log_y] = log.(df[:y]);
+julia> df.log_y = log.(df.y);
 
 julia> lm(@formula(log_y ~ 1 + a + b), df) # equivalent
-StatsModels.TableRegressionModel{LinearModel{LmResp{Array{Float64,1}},DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}
+StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}
 
 log_y ~ 1 + a + b
 
 Coefficients:
-──────────────────────────────────────────────────────
-               Estimate  Std.Error    t value  Pr(>|t|)
-──────────────────────────────────────────────────────
-(Intercept)  -4.16168     2.98788   -1.39285     0.2131
-a             0.357482    0.342126   1.04489     0.3363
-b             2.32528     3.13735    0.741159    0.4866
-──────────────────────────────────────────────────────
+──────────────────────────────────────────────────────────────────────────────
+               Estimate  Std. Error    t value  Pr(>|t|)  Lower 95%  Upper 95%
+──────────────────────────────────────────────────────────────────────────────
+(Intercept)  -4.16168      2.98788   -1.39285     0.2131  -11.4727     3.14939
+a             0.357482     0.342126   1.04489     0.3363   -0.479669   1.19463
+b             2.32528      3.13735    0.741159    0.4866   -5.35154   10.0021
+──────────────────────────────────────────────────────────────────────────────
 
 ```
 
@@ -262,9 +267,9 @@ julia> modelmatrix(@formula(y ~ 1 + b + identity(1+b)), df)
  1.0  0.0203749  1.02037
 ```
 
-## Constructing a formula programatically
+## Constructing a formula programmatically
 
-A formula can be constructed at run-time by creating `Term`s and combining them
+A formula can be constructed at runtime by creating `Term`s and combining them
 with the formula operators `+`, `&`, and `~`:
 
 ```jldoctest 1
@@ -279,6 +284,20 @@ Predictors:
   a(unknown) & b(unknown)
 ```
 
+!!! warning
+
+    Even though the `@formula` macro supports arbitrary Julia functions,
+    runtime (programmatic) formula construction does not. This is because
+    resolving a symbol giving a function's _name_ into the actual _function_
+    itself requires `eval`. In practice this is not often an issue,
+    _except_ in cases where a package provides special syntax by overloading a
+    function (like `|` for
+    [MixedModels.jl](https://github.com/dmbates/MixedModels.jl), or `absorb`
+    for [Econometrics.jl](https://github.com/Nosferican/Econometrics.jl)). In
+    these cases, you should use the corresponding constructors for the actual
+    terms themselves (e.g., `RanefTerm` and `FixedEffectsTerm` respectively), as
+    long as the packages have [implemented support for them](@ref extend-runtime).
+
 The [`term`](@ref) function constructs a term of the appropriate type from
 symbols and numbers, which makes it easy to work with collections of mixed type:
 
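For instance, a sketch of building a formula from a list of column names known only at runtime (the names `predictors` and `:y` are illustrative):

```julia
using StatsModels

# `term` converts numbers to ConstantTerm and symbols to Term,
# so mixed runtime inputs can be folded into one right-hand side
predictors = [:a, :b]                          # known only at runtime
rhs = foldl(+, term.(predictors); init=term(1))
f = term(:y) ~ rhs                             # y ~ 1 + a + b
```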
@@ -338,26 +357,26 @@ julia> β_true = 1:8;
 
 julia> ϵ = randn(100)*0.1;
 
-julia> data[:y] = X*β_true .+ ϵ;
+julia> data.y = X*β_true .+ ϵ;
 
 julia> mod = fit(LinearModel, @formula(y ~ 1 + a*b), data)
-StatsModels.TableRegressionModel{LinearModel{LmResp{Array{Float64,1}},DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}
+StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}
 
 y ~ 1 + a + b + a & b
 
 Coefficients:
-───────────────────────────────────────────────────
-              Estimate  Std.Error  t value  Pr(>|t|)
-───────────────────────────────────────────────────
-(Intercept)   0.98878   0.0384341  25.7266    <1e-43
-a             2.00843   0.0779388  25.7694    <1e-43
-b: e          3.03726   0.0616371  49.2764    <1e-67
-b: f          4.03909   0.0572857  70.5078    <1e-81
-b: g          5.02948   0.0587224  85.6484    <1e-88
-a & b: e      5.9385    0.10753    55.2264    <1e-71
-a & b: f      6.9073    0.112483   61.4075    <1e-75
-a & b: g      7.93918   0.111285   71.3407    <1e-81
-───────────────────────────────────────────────────
+──────────────────────────────────────────────────────────────────────────
+              Estimate  Std. Error  t value  Pr(>|t|)  Lower 95%  Upper 95%
+──────────────────────────────────────────────────────────────────────────
+(Intercept)   0.98878   0.0384341   25.7266    <1e-43   0.912447    1.06511
+a             2.00843   0.0779388   25.7694    <1e-43   1.85364     2.16323
+b: e          3.03726   0.0616371   49.2764    <1e-67   2.91484     3.15967
+b: f          4.03909   0.0572857   70.5078    <1e-81   3.92531     4.15286
+b: g          5.02948   0.0587224   85.6484    <1e-88   4.91285     5.14611
+a & b: e      5.9385    0.10753     55.2264    <1e-71   5.72494     6.15207
+a & b: f      6.9073    0.112483    61.4075    <1e-75   6.6839      7.1307
+a & b: g      7.93918   0.111285    71.3407    <1e-81   7.71816     8.16021
+──────────────────────────────────────────────────────────────────────────
 
 ```
 