Skip to content

Conversation

@ChicChip
Copy link

According to Java's compare function , I code a new function CompareValueONeil for go version ,and it only support number which is >= 0 , it is faster than CompareValue. When bsi include 200 million columnIds, using same range query,CompareValueONeil is cost about 1s when CompareValue cost about 12s(https://github.com/RoaringBitmap/RoaringBitmap/blob/cca90c986d5c0096bbeabb5f968833bf12c28c0e/bsi/src/main/java/org/roaringbitmap/bsi/longlong/Roaring64BitmapSliceIndex.java#L456)

…oNeilCompare), only for number which is >= 0
@ChicChip
Copy link
Author

ChicChip commented Nov 26, 2024

Test code read my file "total.bin"
file, _ := os.OpenFile(path+"/total.bin", os.O_RDWR, 0755)
defer file.Close()
total := roaring64.NewDefaultBSI()
total.ReadFrom(file)
t := time.Now().UnixNano()
res := total.CompareValue(0, roaring64.RANGE, 25, 30, nil)
fmt.Printf("totalCount[%v], count[%v], CompareValue cost %v [ms]\n", total.GetCardinality(), res.GetCardinality(), (time.Now().UnixNano()-t)/1000000)
t1 := time.Now().UnixNano()
res = total.CompareValueONeil(roaring64.RANGE, 25, 30, nil)
fmt.Printf("totalCount[%v], count[%v], CompareValueONeil cost %v [ms]\n", total.GetCardinality(), res.GetCardinality(), (time.Now().UnixNano()-t1)/1000000)

result
totalCount[207701480], count[234], CompareValue cost 14051 [ms]
totalCount[207701480], count[234], CompareValueONeil cost 882 [ms]

@guymolinari
Copy link
Contributor

I will run some tests in my environment. In the meantime, could we come up with a better more descriptive name for this function?

@guymolinari
Copy link
Contributor

Also (and better yet), when constructing the BSI we could potentially set min/max value providing some context about the nature of the data being stored in the BSI. Then we could choose to apply the optimization without creating an entirely new API.

@ChicChip
Copy link
Author

I will run some tests in my environment. In the meantime, could we come up with a better more descriptive name for this function?

yes, more descriptive name is better

@ChicChip
Copy link
Author

Also (and better yet), when constructing the BSI we could potentially set min/max value providing some context about the nature of the data being stored in the BSI. Then we could choose to apply the optimization without creating an entirely new API.

Yes, util now, BSI's min/max doesn't be refreshed when setValues or deserialization. If min/max exist in BSI, new API can cancel.

@lemire
Copy link
Member

lemire commented Jul 10, 2025

Is there some news on this PR?

@ofpiyush
Copy link
Contributor

ofpiyush commented Aug 7, 2025

I'm also interested in the progress of this.

Are you still working on this @ChicChip?

@ChicChip
Copy link
Author

ChicChip commented Aug 7, 2025

I'm also interested in the progress of this.

Are you still working on this @ChicChip?

bsi32 and bsi64 are supported in commit, but this request(#461 (comment)) about set min/max i haven't done yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants