Commit 6aa90e8

md5: minor optimization in software backend (#755)
Replaces the bitwise OR in the G function with addition. This seemingly results in better ALU utilization and improves performance by several percent: from 699 MB/s to 753 MB/s on my x86 PC and from 910 MB/s to 960 MB/s on a Mac M4.
1 parent 8af25ee commit 6aa90e8

1 file changed: +5 -1 lines changed
md5/src/compress/soft.rs

Lines changed: 5 additions & 1 deletion
@@ -12,7 +12,11 @@ fn op_f(w: u32, x: u32, y: u32, z: u32, m: u32, c: u32, s: u32) -> u32 {
 }
 #[inline(always)]
 fn op_g(w: u32, x: u32, y: u32, z: u32, m: u32, c: u32, s: u32) -> u32 {
-    ((x & z) | (y & !z))
+    // We replace the logical OR in `(x & z) | (y & !z)` with addition.
+    // Since masked bits do not overlap, the expressions are equivalent;
+    // however, addition results in better performance on high-end CPUs,
+    // likely due to improved ALU utilization.
+    ((x & z).wrapping_add(y & !z))
         .wrapping_add(w)
         .wrapping_add(m)
         .wrapping_add(c)
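
The equivalence claimed in the code comment can be checked in isolation. The sketch below is not part of the commit: `g_or`, `g_add`, `masks_are_disjoint`, `xorshift32`, and `count_mismatches` are hypothetical names introduced here for illustration. It relies on the fact that `x & z` only has bits set where `z` is 1 and `y & !z` only has bits set where `z` is 0, so adding them never produces a carry and gives the same result as OR.

```rust
/// G selection term written with bitwise OR (the form before the commit).
fn g_or(x: u32, y: u32, z: u32) -> u32 {
    (x & z) | (y & !z)
}

/// G selection term written with wrapping addition (the form after the commit).
fn g_add(x: u32, y: u32, z: u32) -> u32 {
    (x & z).wrapping_add(y & !z)
}

/// The two masked operands never share a set bit, because one is masked
/// by `z` and the other by `!z`. This is why the addition cannot carry.
fn masks_are_disjoint(x: u32, y: u32, z: u32) -> bool {
    (x & z) & (y & !z) == 0
}

/// Hypothetical helper: a small xorshift32 generator used only to produce
/// test inputs without pulling in an external crate.
fn xorshift32(state: &mut u32) -> u32 {
    let mut v = *state;
    v ^= v << 13;
    v ^= v >> 17;
    v ^= v << 5;
    *state = v;
    v
}

/// Compares both forms on `samples` pseudo-random triples and returns
/// how many inputs disagree or violate the disjointness property.
fn count_mismatches(samples: u32) -> u32 {
    let mut state = 0x1234_5678_u32; // arbitrary non-zero seed
    let mut mismatches = 0;
    for _ in 0..samples {
        let x = xorshift32(&mut state);
        let y = xorshift32(&mut state);
        let z = xorshift32(&mut state);
        if g_or(x, y, z) != g_add(x, y, z) || !masks_are_disjoint(x, y, z) {
            mismatches += 1;
        }
    }
    mismatches
}
```

Calling `count_mismatches` with any sample count should return 0. This only confirms that the two forms compute the same value; whether addition is actually faster depends on the CPU and compiler, and has to be measured with a benchmark as the commit message describes.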
