Fix numerical instability with high values of the degrees of freedom of the t-distribution by FelixNoessler · Pull Request #166 · JuliaStats/StatsFuns.jl

FelixNoessler · 2024-04-22T16:45:46Z

try to fix #165

…reedom of the t-distribution

devmotion

Thank you for the PR!

Can you check whether this fixes the problems, e.g. similar to the visual analysis in #165? I think it would be good to also add a test that checks inputs for which these functions are broken on the master branch.

devmotion · 2024-04-23T22:02:08Z

src/distrs/fdist.jl

+        # the evaluation of the CDF with the t-distribution combined with the
+        # beta distribution is unstable for high ν2,
+        # therefore, we switch from the beta to the chi-squared distribution at ν > 1.0e7
+        if ν2 > 1.0e7


I wonder if this cutoff has to be type-dependent, ie whether a different value is needed for Float32 or Float64. Can you check the current output for these number types as well?

Here is comparison between using the threshold and not using it. Apparently, there is no big difference between calculating with Float32 and Float64. We could set the threshold a bit higher but smaller than 10^8.

I am a bit wondering why the quantile function is not smooth.

ν_f64 = 10.0 .^ LinRange(6, 11, 1000) ν_f32 = Float32.(ν_f64) d_f64 = TDist.(ν_f64) d_f32 = TDist.(ν_f32) qs = [0.75, 0.8, 0.9, 0.95, 0.99] before_f64 = Vector{Float64}[] before_f32 = Vector{Float32}[] for q in qs push!(before_f64, quantile.(d_f64, q)) push!(before_f32, quantile.(d_f32, q)) end ################# "fix" the quantile function after_f64 = Vector{Float64}[] after_f32 = Vector{Float32}[] for q in qs push!(after_f64, quantile.(d_f64, q)) push!(after_f32, quantile.(d_f32, q)) end begin fig = Figure(size = (500, 1300)) for i in eachindex(qs) ax = Axis(fig[i,1]; ylabel = "$(qs[i]) quantile", xlabel = "degrees of freedom ν", xscale = log10, xlabelvisible = i == length(qs) ? true : false) lines!(ν_f64, before_f64[i]; label = "before f64") lines!(ν_f32, before_f32[i]; label = "before f32") lines!(ν_f64, after_f64[i]; label = "after f64") lines!(ν_f32, after_f32[i]; label = "after f32") if i == length(qs) axislegend(ax) end end display(fig) end

devmotion · 2024-04-23T22:02:48Z

src/distrs/tdist.jl

+    # the logpdf equation of the t-distribution is numerically unstable for high ν,
+    # therefore we switch at ν > 1.0e7 to the normal distribution
+    ν > 1.0e7 && return normlogpdf(x)


Similarly, what's the current output (without the change) for e.g. Float32 and Float64?

Here is the same analysis for the log pdf. It seems that there is no best value for the threshold because the numerical instability can start from 10 ^ 7.2 for the log pdf of 0.5, but for higher x values it is better to have a higher threshold value.

ν_f64 = 10.0 .^ LinRange(6.5, 8.5, 1000) ν_f32 = Float32.(ν_f64) d_f64 = TDist.(ν_f64) d_f32 = TDist.(ν_f32) xs = [0.1, 0.5, 10.0, 20.0] before_f64 = Vector{Float64}[] before_f32 = Vector{Float32}[] for x in xs push!(before_f64, logpdf.(d_f64, x)) push!(before_f32, logpdf.(d_f32, x)) end ################# "fix" the quantile function after_f64 = Vector{Float64}[] after_f32 = Vector{Float32}[] for x in xs push!(after_f64, logpdf.(d_f64, x)) push!(after_f32, logpdf.(d_f32, x)) end begin fig = Figure(size = (500, 800)) for i in eachindex(xs) ax = Axis(fig[i,1]; ylabel = "logpdf of $(xs[i])", xlabel = "degrees of freedom ν", xscale = log10, xlabelvisible = i == length(xs) ? true : false) lines!(ν_f64, before_f64[i]; label = "before f64") lines!(ν_f32, before_f32[i]; label = "before f32") lines!(ν_f64, after_f64[i]; label = "after f64") lines!(ν_f32, after_f32[i]; label = "after f32") if i == length(xs) axislegend(ax) end end display(fig) end

try to fix numerical instability with high values of the degrees of f…

0424b04

…reedom of the t-distribution

andreasnoack requested a review from devmotion April 22, 2024 17:51

devmotion reviewed Apr 23, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix numerical instability with high values of the degrees of freedom of the t-distribution#166

Fix numerical instability with high values of the degrees of freedom of the t-distribution#166
FelixNoessler wants to merge 1 commit intoJuliaStats:masterfrom
FelixNoessler:t_dist_numerical_instability

FelixNoessler commented Apr 22, 2024

Uh oh!

devmotion left a comment

Uh oh!

devmotion Apr 23, 2024

Uh oh!

FelixNoessler Apr 24, 2024

Uh oh!

devmotion Apr 23, 2024

Uh oh!

FelixNoessler Apr 24, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

FelixNoessler commented Apr 22, 2024

Uh oh!

devmotion left a comment

Choose a reason for hiding this comment

Uh oh!

devmotion Apr 23, 2024

Choose a reason for hiding this comment

Uh oh!

FelixNoessler Apr 24, 2024

Choose a reason for hiding this comment

Uh oh!

devmotion Apr 23, 2024

Choose a reason for hiding this comment

Uh oh!

FelixNoessler Apr 24, 2024

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants