Skip to content

Conversation

@barankarakus
Copy link
Contributor

@barankarakus barankarakus commented Sep 6, 2020

Fixes #51.

Two changes:

  1. Change to computeλ to ensure λmax = 0 leads to an output of [0] and
    not [NaN, ..., NaN].
  2. Change to fit! to ensure the case where autoλ = true and λmax = 0 is
    handled correctly (rather than throwing an error).

Two changes:
1) Change to computeλ to ensure λmax = 0 leads to an output of [0] and 
not [NaN, ..., NaN].
2) Change to fit! to ensure the case where autoλ = true and λmax = 0 is 
handled correctly (rather than throwing an error).
@barankarakus barankarakus changed the title Correctly handling the case λmax = 0. Correctly handling the case λmax = 0; fixes #51 Sep 6, 2020
@barankarakus barankarakus changed the title Correctly handling the case λmax = 0; fixes #51 Correctly handling the case λmax = 0. Sep 6, 2020
@coveralls
Copy link

coveralls commented Sep 6, 2020

Pull Request Test Coverage Report for Build 202

  • 19 of 19 (100.0%) changed or added relevant lines in 2 files are covered.
  • 7 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+6.6%) to 91.049%

Files with Coverage Reduction New Missed Lines %
src/segselect.jl 1 89.23%
src/coordinate_descent.jl 3 93.63%
src/Lasso.jl 3 88.05%
Totals Coverage Status
Change from base Build 198: 6.6%
Covered Lines: 885
Relevant Lines: 972

💛 - Coveralls

Copy link
Collaborator

@AsafManela AsafManela left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this change.
It seems like the test case is basically one where there is no variation in y.
Do you think you could add a test for this case?

@barankarakus
Copy link
Contributor Author

Added tests (and some more minor changes). Let me know if anything else needs done!

# Compute automatic λ values based on λmax and λminratio
function computeλ(λmax, λminratio, α, nλ)
λmax /= α
if isapprox(λmax, 0; atol=1e-10) # then assuming λmax = 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is tricky because I think lambda is not unitless, so if it is small or not depends on the data given.
How does glmnet in R (or GLMNet.jl) handle this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I've changed the equality check to an isapprox() check is due to floating point arithmetic leading to a lambdamax that should actually be zero being very close to zero but non-zero instead. Simple example when this happens is a design matrix X with entries sampled from U[0, 1] and y a non-zero vector with identical entries.

I agree, the data could be such that lambdamax is genuinely very small but non-zero.

That said, I think it would be very rare to encounter such data in practice... especially since lambdamax (for the linear model) scales linearly with X and y, and we tend to standardise these.

I see two approaches going forward:

  1. Keep this check as is - the case where it would fail to produce correct output basically never occurs anyway.

  2. Revert back to the equality check. The real case in which the package failed was the case where lambdamax was exactly zero, anyway. Moreover, even if lambdamax should be zero but instead is a very small number, there is no major problem: the solver works very fast and it is clear from the output that every value of lambda yields zero active coefficients.

I'll leave it to you to decide 😃.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally: I'm not sure how glmnet in R or Julia handles this.

return true
end

@test zero_variation_test() == true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use @test_log instead?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, any idea why the tests stopped working in julia v1.0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use @test_log instead?

I agree. Will implement tomorrow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, any idea why the tests stopped working in julia v1.0?

Unfortunately nope!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Package fails when response is zero vector.

3 participants