diff-delta: bound hash list length to avoid O(m*n) behavior
The diff-delta code can exhibit O(m*n) behavior with some pathological data sets where most hash entries end up in the same hash bucket. The latest code rework reduced the block size, making it particularly vulnerable to this issue, but the issue was always there and can be triggered regardless of the block size.

This patch does two things:

1) the hashing has been reworked to offer a better distribution, to attenuate the problem a bit, and

2) a limit is imposed on the number of entries that can exist in the same hash bucket.

Because of the above, the code is a bit more expensive on average, but the problematic samples used to diagnose the issue are now orders of magnitude less expensive to process, with only a slight loss in compression.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
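To illustrate the second point, here is a minimal sketch of bounding the chain length while building a block hash index. It is not the code from this patch: the names `BLOCK_SIZE`, `HASH_BITS`, `HASH_LIMIT`, `block_hash`, `build_index`, `longest_chain` and `free_index` are hypothetical, the hash is a toy, and the policy shown (drop further entries once a bucket is full) is just one simple way to enforce such a limit.

```c
#include <assert.h>
#include <stdlib.h>

#define BLOCK_SIZE 16   /* hypothetical block width in bytes */
#define HASH_BITS  8    /* hypothetical table size: 256 buckets */
#define HASH_SIZE  (1u << HASH_BITS)
#define HASH_LIMIT 64   /* hypothetical cap on entries per bucket */

struct entry {
	const unsigned char *ptr;   /* start of the indexed block */
	struct entry *next;         /* next entry in the same bucket */
};

struct index {
	struct entry *bucket[HASH_SIZE];
	unsigned int count[HASH_SIZE];  /* entries currently in each bucket */
	struct entry *pool;             /* backing storage for all entries */
};

/* Toy hash over one block; an assumption, not the hash used by the patch. */
static unsigned int block_hash(const unsigned char *p)
{
	unsigned int h = 0;
	for (int i = 0; i < BLOCK_SIZE; i++)
		h = h * 31u + p[i];
	return (h ^ (h >> 16)) & (HASH_SIZE - 1);
}

/* Index every full block of buf, never letting a bucket exceed HASH_LIMIT. */
static struct index *build_index(const unsigned char *buf, size_t len)
{
	size_t nblocks = len / BLOCK_SIZE;
	size_t used = 0;
	struct index *idx = calloc(1, sizeof(*idx));

	if (!idx)
		return NULL;
	idx->pool = malloc((nblocks ? nblocks : 1) * sizeof(*idx->pool));
	if (!idx->pool) {
		free(idx);
		return NULL;
	}
	for (size_t b = 0; b < nblocks; b++) {
		const unsigned char *p = buf + b * BLOCK_SIZE;
		unsigned int h = block_hash(p);
		struct entry *e;

		if (idx->count[h] >= HASH_LIMIT)
			continue;   /* bucket is full: skip this block */
		e = &idx->pool[used++];
		assert(used <= nblocks);
		e->ptr = p;
		e->next = idx->bucket[h];
		idx->bucket[h] = e;
		idx->count[h]++;
	}
	return idx;
}

/* Walk every chain and return the length of the longest one. */
static unsigned int longest_chain(const struct index *idx)
{
	unsigned int best = 0;
	for (unsigned int h = 0; h < HASH_SIZE; h++) {
		unsigned int n = 0;
		for (const struct entry *e = idx->bucket[h]; e; e = e->next)
			n++;
		if (n > best)
			best = n;
	}
	return best;
}

static void free_index(struct index *idx)
{
	if (idx) {
		free(idx->pool);
		free(idx);
	}
}
```

With an input made of identical blocks, every block hashes to the same bucket. Without the limit, a lookup in that bucket would walk one entry per source block; with it, the walk stops at `HASH_LIMIT` entries, at the cost of missing some potential matches, which corresponds to the slight loss in compression mentioned above.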
Nicolas Pitre authored and Junio C Hamano committed Mar 2, 2006
1 parent cc5c59a, commit 5bb86b8
Showing 1 changed file with 56 additions and 13 deletions.