This PR is in a draft state.
Closes #596 (Gradient Clipping).
Adds `Storage` and `Gradients` view/mutating methods:

- Adds the `dfdx::nn_traits::WithGrads` trait and the `dfdx_derives::WithGrads` proc macro, based on `ZeroGrads`. The `ZeroGrads` trait could be merged into `WithGrads` by mostly just merging their methods.
- Adds the `dfdx_core::tensor::WithStorage` trait.
- Changes some methods from `Gradients`: makes `get_mut` `pub`; makes `get_ref` `pub` and lowers its requirement from `&mut self` to `&self`.
- Adds gradient clamping and clipping methods.
Example using `clip_norm`:
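The original example was not captured here. As a stand-in, the following self-contained sketch shows the idea behind a global-norm clip (the free function, its signature, and the slice-based storage are illustrative assumptions, not the actual dfdx API added by this PR):

```rust
// Illustrative sketch of a global-norm clip; not the actual dfdx method.
fn clip_norm(grads: &mut [f32], max_norm: f32) {
    // Global L2 norm over all gradient values.
    let norm = grads.iter().map(|g| g * g).sum::<f32>().sqrt();
    if norm > max_norm {
        // Every value is scaled by the same factor, preserving direction.
        let scale = max_norm / norm;
        for g in grads.iter_mut() {
            *g *= scale;
        }
    }
}

fn main() {
    let mut grads = vec![3.0f32, 4.0]; // L2 norm = 5
    clip_norm(&mut grads, 1.0);
    println!("{:?}", grads); // scaled down to norm 1: [0.6, 0.8]
}
```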
Note that `clip_norm` doesn't change the gradients' direction, because all gradient values are scaled by the same factor, while `clip_value` does change the direction (because some values are changed while others are left intact). So for gradient descent, where the gradient direction is supposed to be somewhat followed, my guess is that `clip_norm` is better.
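A tiny numeric check of that claim, using illustrative stand-ins rather than the actual dfdx methods: element-wise value clipping alters the ratio between components, while norm clipping preserves it.

```rust
// Illustrative helpers; not the actual dfdx API from this PR.
fn clip_value(grads: &mut [f32], max_abs: f32) {
    // Clamp each value independently: components at the bound are changed
    // while smaller ones are left intact, so the direction shifts.
    for g in grads.iter_mut() {
        *g = g.clamp(-max_abs, max_abs);
    }
}

fn clip_norm(grads: &mut [f32], max_norm: f32) {
    // Scale all values by one common factor, so the direction is kept.
    let norm = grads.iter().map(|g| g * g).sum::<f32>().sqrt();
    if norm > max_norm {
        let scale = max_norm / norm;
        for g in grads.iter_mut() {
            *g *= scale;
        }
    }
}

fn main() {
    let mut a = vec![1.0f32, 10.0];
    let mut b = a.clone();

    clip_value(&mut a, 2.0); // component ratio changes from 1:10 to 1:2
    clip_norm(&mut b, 2.0);  // both components shrink together; ratio stays 1:10
    println!("{:?} {:?}", a, b);
}
```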