@@ -49,7 +49,7 @@ than one place.
 julia> loss = L2DistLoss()
 L2DistLoss()
 
-julia> value(loss, 3, 2)
+julia> loss(3, 2)
 1
 ```
 
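The hunk above replaces `value(loss, 3, 2)` with a direct call on the loss object. For readers unfamiliar with the pattern, a minimal sketch of such a functor in plain Julia (a hypothetical `ToyL2Loss`, for illustration only, not the package's actual definition) looks like this:

```julia
# Toy sketch of a callable loss type (illustration only)
struct ToyL2Loss end

# adding a call method on instances turns the struct into a functor
(loss::ToyL2Loss)(output, target) = abs2(output - target)

loss = ToyL2Loss()
loss(3, 2)    # == 1, matching the doctest above
```
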
@@ -66,9 +66,9 @@ yourself in the code below. As such they are zero-cost
 abstractions.
 
 ```julia-repl
-julia> v1(loss,y,t) = value(loss, y,t)
+julia> v1(loss,y,t) = loss(y,t)
 
-julia> v2(y,t) = value(L2DistLoss(), y,t)
+julia> v2(y,t) = L2DistLoss()(y,t)
 
 julia> @code_llvm v1(loss, 3, 2)
 define i64 @julia_v1_70944(i64, i64) #0 {
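The `@code_llvm` listing above is meant to show that both wrappers lower to the same bare integer arithmetic. A quick sanity check of the zero-allocation claim, assuming a LossFunctions version that already supports the functor call introduced by this change, might look like:

```julia
using LossFunctions

v2(y, t) = L2DistLoss()(y, t)

v2(3, 2)              # run once so compilation is excluded from the measurement
@allocated v2(3, 2)   # expected to report 0 bytes allocated on the heap
```
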
@@ -115,46 +115,17 @@ performance overhead, and zero memory allocations on the heap.
 
 The first thing we may want to do is compute the loss for some
 observation (singular). In fact, all losses are implemented on
-single observations under the hood. The core function to compute
-the value of a loss is `value`. We will see throughout the
-documentation that this function allows for a lot of different
-method signatures to accomplish a variety of tasks.
-
-```@docs
-value
-```
-
-It may be interesting to note, that this function also supports
-broadcasting and all the syntax benefits that come with it. Thus,
-it is quite simple to make use of preallocated memory for storing
-the element-wise results.
+single observations under the hood, and are functors.
 
 ```jldoctest bcast1
-julia> value.(L1DistLoss(), [2,5,-2], [1,2,3])
+julia> loss = L1DistLoss()
+L1DistLoss()
+
+julia> loss.([2,5,-2], [1,2,3])
 3-element Vector{Int64}:
  1
  3
  5
-
-julia> buffer = zeros(3); # preallocate a buffer
-
-julia> buffer .= value.(L1DistLoss(), [2,5,-2], [1.,2,3])
-3-element Vector{Float64}:
- 1.0
- 3.0
- 5.0
-```
-
-Furthermore, with the loop fusion changes that were introduced in
-Julia 0.6, one can also easily weight the influence of each
-observation without allocating a temporary array.
-
-```jldoctest bcast1
-julia> buffer .= value.(L1DistLoss(), [2,5,-2], [1.,2,3]) .* [2,1,0.5]
-3-element Vector{Float64}:
- 2.0
- 3.0
- 2.5
 ```
 
 ## Computing the 1st Derivatives
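The rewritten doctest in the hunk above drops the preallocated-buffer variants that used `value.`. Because a loss object is an ordinary callable, plain dot-broadcasting should still fuse into a preallocated buffer; the sketch below assumes only that and reuses the numbers from the removed example:

```julia
using LossFunctions

loss = L1DistLoss()
outputs, targets = [2, 5, -2], [1.0, 2, 3]

buffer = zeros(3)                   # preallocate a result buffer
buffer .= loss.(outputs, targets)   # in-place broadcast, yields [1.0, 3.0, 5.0]

# per-observation weights fuse into the same loop, no temporary array
buffer .= loss.(outputs, targets) .* [2, 1, 0.5]   # yields [2.0, 3.0, 2.5]
```
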
@@ -166,8 +137,7 @@ derivatives of the loss in one way or the other during the
 training process.
 
 To compute the derivative of some loss we expose the function
-[`deriv`](@ref). It supports the same exact method signatures as
-[`value`](@ref). It may be interesting to note explicitly, that
+[`deriv`](@ref). It may be interesting to note explicitly, that
 we always compute the derivative in respect to the predicted
 `output`, since we are interested in deducing in which direction
 the output should change.
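For a concrete call, the doctests removed in the next hunk use the argument order `deriv(loss, output, target)`; assuming that order is unchanged by this pull request, a single-observation example reads:

```julia
using LossFunctions

# first derivative of the L2 distance loss with respect to the predicted output
deriv(L2DistLoss(), 3, 2)                    # 2 * (3 - 2) == 2

# broadcasting works just like it does for the loss value itself
deriv.(L2DistLoss(), [2, 5, -2], [1, 2, 3])  # == [2, 6, -10]
```
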
@@ -176,39 +146,6 @@ the output should change.
 deriv
 ```
 
-Similar to [`value`](@ref), this function also supports
-broadcasting and all the syntax benefits that come with it. Thus,
-one can make use of preallocated memory for storing the
-element-wise derivatives.
-
-```jldoctest bcast2
-julia> deriv.(L2DistLoss(), [2,5,-2], [1,2,3])
-3-element Vector{Int64}:
- 2
- 6
- -10
-
-julia> buffer = zeros(3); # preallocate a buffer
-
-julia> buffer .= deriv.(L2DistLoss(), [2,5,-2], [1.,2,3])
-3-element Vector{Float64}:
- 2.0
- 6.0
- -10.0
-```
-
-Furthermore, with the loop fusion changes that were introduced in
-Julia 0.6, one can also easily weight the influence of each
-observation without allocating a temporary array.
-
-```jldoctest bcast2
-julia> buffer .= deriv.(L2DistLoss(), [2,5,-2], [1.,2,3]) .* [2,1,0.5]
-3-element Vector{Float64}:
- 4.0
- 6.0
- -5.0
-```
-
 ## Computing the 2nd Derivatives
 Additionally to the first derivative, we also provide the
@@ -220,30 +157,6 @@ derivative in respect to the predicted `output`.
 deriv2
 ```
 
-Just like [`deriv`](@ref) and [`value`](@ref), this function also
-supports broadcasting and all the syntax benefits that come with
-it. Thus, one can make use of preallocated memory for storing the
-element-wise derivatives.
-
-```jldoctest
-julia> deriv2.(LogitDistLoss(), [0.3, 2.3, -2], [-0.5, 1.2, 3])
-3-element Vector{Float64}:
- 0.42781939304058886
- 0.3747397590950413
- 0.013296113341580313
-
-julia> buffer = zeros(3); # preallocate a buffer
-
-julia> buffer .= deriv2.(LogitDistLoss(), [0.3, 2.3, -2], [-0.5, 1.2, 3])
-3-element Vector{Float64}:
- 0.42781939304058886
- 0.3747397590950413
- 0.013296113341580313
-```
-
-Furthermore [`deriv2`](@ref) supports all the same method
-signatures as [`deriv`](@ref) does.
-
 ## Properties of a Loss
 
 In some situations it can be quite useful to assert certain
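Returning to the `deriv2` hunk above: the removed doctests relied only on broadcasting, so the same pattern should carry over to the functor-style API. A sketch reusing the removed example's numbers, under the assumption that `deriv2` keeps the `(loss, output, target)` order:

```julia
using LossFunctions

# second derivative with respect to the predicted output
deriv2.(LogitDistLoss(), [0.3, 2.3, -2], [-0.5, 1.2, 3])   # ≈ [0.428, 0.375, 0.013]

# the preallocated-buffer idiom carries over unchanged
buffer = zeros(3)
buffer .= deriv2.(LogitDistLoss(), [0.3, 2.3, -2], [-0.5, 1.2, 3])
```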