Remove idxfloor and shrink age, and maxprobe #48380

oscardssmith · 2023-01-22T23:58:10Z

I believe this frees up 16 bytes essentially for free. idxfloor seems like a good idea, but it adds very little performance: with uniformly distributed hashes, idxfloor will be very low on average. The other change shrinks age and maxprobe to 32 bits each, because there should never be 4 billion collisions next to each other, and age doesn't need to be 64 bits either.
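A rough way to see where the 16 bytes could come from is to compare two stand-in structs. This is only a sketch: the field names follow the constructor call quoted later in this thread, but the field types (64-bit `Int`/`UInt` before, `UInt32`/`Int32` after) and the struct names `OldLayout` and `NewLayout` are assumptions made for illustration, not the actual definition in base/dict.jl.

```julia
# Hypothetical stand-ins for the Dict layout before and after this PR.
# Field names come from the thread; the field types are assumed.
mutable struct OldLayout
    slots::Vector{UInt8}
    keys::Vector{Int}
    vals::Vector{Int}
    ndel::Int
    count::Int
    age::UInt        # assumed 64-bit counter
    idxfloor::Int    # field removed by this PR
    maxprobe::Int    # assumed 64-bit
end

mutable struct NewLayout
    slots::Vector{UInt8}
    keys::Vector{Int}
    vals::Vector{Int}
    ndel::Int
    count::Int
    age::UInt32      # shrunk to 32 bits
    maxprobe::Int32  # shrunk to 32 bits
end

# On a 64-bit build the difference is expected to be 16 bytes:
# 8 from dropping idxfloor, 4 + 4 from the two narrower fields.
println("old: ", sizeof(OldLayout), " bytes")
println("new: ", sizeof(NewLayout), " bytes")
println("saved: ", sizeof(OldLayout) - sizeof(NewLayout), " bytes")
```

The two 32-bit fields only save space because they sit next to each other and share one 8-byte slot; separated by a 64-bit field, padding would likely eat the gain.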

petvana · 2023-01-23T12:21:51Z

The only major benefit of idxfloor is that we do not iterate over an empty dict (#44411). We can add a fast path for count == 0 without idxfloor to get the same performance.
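The fast path suggested here might look roughly like the following. It is a minimal sketch on a toy table, not the code in base/dict.jl: `ToyTable`, `first_filled`, and the `0x00`/`0x01` slot encoding are all hypothetical names and conventions chosen for the example.

```julia
# Toy open-addressing table: only the slot markers and the element count.
# Assumed encoding: 0x00 = empty slot, 0x01 = filled slot.
struct ToyTable
    slots::Vector{UInt8}
    count::Int
end

# Return the index of the first filled slot, or 0 if there is none.
function first_filled(t::ToyTable)
    # Fast path: an empty table never needs a slot scan, no matter how
    # many slots it still has allocated.
    t.count == 0 && return 0
    for i in eachindex(t.slots)
        t.slots[i] == 0x01 && return i
    end
    return 0
end

emptied = ToyTable(zeros(UInt8, 1 << 20), 0)               # large but empty
sparse  = ToyTable(UInt8[0x00, 0x00, 0x01, 0x00], 1)

println(first_filled(emptied))  # returns without touching the slots
println(first_filled(sparse))
```

Without the `count == 0` check, starting iteration on the emptied table would scan all of its slots, which is the case idxfloor was guarding against.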

oscardssmith · 2023-01-23T17:08:51Z

that makes sense to me.

simeonschaub · 2023-01-30T14:27:36Z

I believe we have some basic Dict benchmarks in BaseBenchmarks, so would be good to verify with those before merging

gbaraldi · 2023-01-30T15:03:20Z

@nanosoldier runbenchmarks("!scalar", vs=":master")

nanosoldier · 2023-01-30T15:11:47Z

Your job failed.

vtjnash · 2023-01-30T19:05:10Z

maxprobe could be arbitrarily large, if the hash function is poorly chosen (it just degrades the dict into an array)
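The degradation described in this comment can be reproduced on a toy linear-probing table. This is a sketch under stated assumptions: `longest_probe` and `constant_hash` are hypothetical helpers, the table size must be a power of two, and the probing scheme is plain linear probing rather than a copy of what base/dict.jl does.

```julia
# Insert the keys 1:n into a linear-probing table with `sz` slots
# (sz must be a power of two and n <= sz) and return the longest probe
# sequence that any insertion needed. `hashfn` must return a UInt.
function longest_probe(hashfn, n::Int, sz::Int)
    filled = falses(sz)
    maxprobe = 0
    for key in 1:n
        idx = ((hashfn(key) % Int) & (sz - 1)) + 1   # home slot, 1-based
        probe = 0
        while filled[idx]
            idx = (idx & (sz - 1)) + 1               # next slot, wrapping
            probe += 1
        end
        filled[idx] = true
        maxprobe = max(maxprobe, probe)
    end
    return maxprobe
end

# A deliberately bad hash: every key lands in the same home slot.
constant_hash(_) = UInt(1)

println(longest_probe(constant_hash, 100, 256))  # grows linearly with n
println(longest_probe(hash, 100, 256))           # Base.hash, for comparison
```

With the constant hash, the k-th insertion has to walk past all k-1 earlier keys, so the longest probe is n-1 and the table behaves like an unsorted array.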

oscardssmith · 2023-01-30T19:14:23Z

if the hash function is poorly chosen everything else about our Dict's performance breaks anyway. I don't see why iteration should be special.

petvana · 2023-01-30T23:47:51Z

> maxprobe could be arbitrarily large, if the hash function is poorly chosen (it just degrades the dict into an array)

For maxprobe to reach 32 bits, you would first need to execute ~2^62 comparisons just to insert the elements, which seems quite intractable for any non-quantum computer. A short example:

julia> struct X; x::Int64; end

julia> Base.hash(x::X) = UInt64(1)

julia> for n = 5000:5000:20000; print("n=",n); 
         @time (d = Dict(); for i=1:n; d[X(i)] = i; end)
       end
n=5000  0.458749 seconds (12.52 M allocations: 213.679 MiB, 30.10% gc time, 0.72% compilation time)
n=10000  1.442519 seconds (50.03 M allocations: 785.996 MiB, 7.78% gc time)
n=15000  3.760717 seconds (112.53 M allocations: 1.699 GiB, 10.08% gc time)
n=20000  9.156206 seconds (200.07 M allocations: 3.070 GiB, 7.67% gc time)
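The allocation counts in these timings track n(n-1)/2 closely, which is what one would expect if each insertion compares against every key already stored. The check below uses only the numbers reported above. Treating allocations as a proxy for comparisons is an assumption, and the extrapolation at the end is plain arithmetic, not a measurement.

```julia
# n and allocation counts copied from the @time output above.
ns     = [5000, 10000, 15000, 20000]
allocs = [12.52e6, 50.03e6, 112.53e6, 200.07e6]

# With every key hashing to the same slot, the k-th insertion walks past
# the k-1 keys already present, so the total work is 0 + 1 + ... + (n-1).
pairwise(n) = n * (n - 1) / 2

ratios = [a / pairwise(n) for (n, a) in zip(ns, allocs)]
for (n, r) in zip(ns, ratios)
    println("n=", n, "  reported / (n(n-1)/2) = ", round(r, digits=4))
end

# Same formula extrapolated to probe counts near the 32-bit limit,
# reported as log2 of the number of comparisons.
for bits in (31, 32)
    println("n = 2^", bits, ":  log2(comparisons) ≈ ",
            round(log2(pairwise(2.0^bits)), digits=1))
end
```

All four ratios land within about 1% of 1, so the quadratic model fits the measurements. Extrapolated, it puts the insertion cost at the same order of magnitude as the ~2^62 figure quoted above.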

petvana · 2023-03-26T12:33:14Z

base/dict.jl

        new(copy(d.slots), copy(d.keys), copy(d.vals), d.ndel, d.count, d.age,
-            d.idxfloor, d.maxprobe)
+            d.maxprobe)


I guess it can be moved to the previous line.

petvana · 2023-03-30T12:39:51Z

Now, the error from docs is that skip_deleted_floor! is missing in base/set.jl and base/weakkeydict.jl

oscardssmith · 2025-10-07T05:39:08Z

closing in favor of #59769

oscardssmith added the performance Must go faster label Jan 22, 2023

oscardssmith added 4 commits January 28, 2023 20:52

since our dicts are almost always going to be between 1/4th and 2/3rd…

6d7b7d3

…s full idxfloor will usually be <4 so the compute and memory time of tracking this number aren't worth much

maxprobe will realistically never be >1000 or so and a 1/4 billion ch…

efc047c

…ance of missing a concurrent write due to overflow in age seems acceptable to me (these changes together shrink by 8 bytes)

change maxallowedprobe to better reflect the literature

dc1cd8d

improvements

8293543

oscardssmith force-pushed the remove-idxfloor branch from a5ee881 to 8293543 Compare January 30, 2023 13:30

KristofferC reviewed Jan 30, 2023

Update base/dict.jl

bfe5c5e

Co-authored-by: Kristoffer Carlsson <[email protected]>

simeonschaub added the needs nanosoldier run This PR should have benchmarks run on it label Jan 30, 2023

typos

b410717

typo

9fd92d1

petvana added the collections Data structures holding multiple items, e.g. sets label Mar 18, 2023

petvana reviewed Mar 26, 2023

petvana reviewed Mar 26, 2023

oscardssmith and others added 2 commits March 30, 2023 07:46

Update base/dict.jl

c1ee823

Co-authored-by: Petr Vana <[email protected]>

Merge branch 'master' into remove-idxfloor

142102e

oscardssmith mentioned this pull request Oct 7, 2025

Simplify Dict representation #59769

Open

oscardssmith closed this Oct 7, 2025

7 participants