grep: load file data after checking binary-ness
Usually we load each file to grep into memory, check whether
it's binary, and then either grep it (the default) or not
(if "-I" was given).

In the "-I" case, we can skip loading the file entirely if
it is marked as binary via gitattributes. On my giant
3-gigabyte media repository, doing "git grep -I foo" went
from:

  real    0m0.712s
  user    0m0.044s
  sys     0m4.780s

to:

  real    0m0.026s
  user    0m0.016s
  sys     0m0.020s

Obviously this is an extreme example. The repo is almost
entirely binary files, and you can see that we spent all of
our time asking the kernel to read() the data. However, with
a cold disk cache, even avoiding a few binary files can have
an impact.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King authored and Junio C Hamano committed Feb 2, 2012
1 parent 41b59bf commit 0826579
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions grep.c
@@ -1019,9 +1019,6 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 	}
 	opt->last_shown = 0;
 
-	if (grep_source_load(gs) < 0)
-		return 0;
-
 	switch (opt->binary) {
 	case GREP_BINARY_DEFAULT:
 		if (grep_source_is_binary(gs))
@@ -1042,6 +1039,9 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 
 	try_lookahead = should_lookahead(opt);
 
+	if (grep_source_load(gs) < 0)
+		return 0;
+
 	bol = gs->buf;
 	left = gs->size;
 	while (left) {
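
The reordering above can be illustrated with a small standalone sketch. This is not git's code: `struct source`, `source_load`, `source_is_binary`, `grep_skip_binary`, and the `loads` counter are hypothetical stand-ins, and the sketch assumes the binary check can answer from an attribute alone and only reads the contents when no attribute is set. Under that assumption, deferring the load until after the check means a source marked binary is never read when binary files are being skipped (the "-I" case).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are not git's real types. */
enum binary_attr { ATTR_UNSET, ATTR_BINARY, ATTR_TEXT };

struct source {
	const char *data;      /* simulated on-disk contents */
	size_t size;
	enum binary_attr attr; /* what an attribute lookup would report */
	const char *buf;       /* NULL until the contents are loaded */
	int loads;             /* how many times the contents were read */
};

/* Load the contents once; later calls are no-ops. */
static int source_load(struct source *s)
{
	if (s->buf)
		return 0;
	s->buf = s->data;
	s->loads++;
	return 0;
}

/* Answer from the attribute when possible; only then fall back to content. */
static int source_is_binary(struct source *s)
{
	if (s->attr == ATTR_BINARY)
		return 1;
	if (s->attr == ATTR_TEXT)
		return 0;
	if (source_load(s) < 0)
		return 0;
	return memchr(s->buf, '\0', s->size) != NULL;
}

/* Search while skipping binary sources: check first, load afterwards. */
static int grep_skip_binary(struct source *s, const char *needle)
{
	size_t n = strlen(needle);
	size_t i;

	if (source_is_binary(s))
		return 0; /* skipped without reading the contents */

	if (source_load(s) < 0)
		return 0;
	assert(s->buf != NULL);

	for (i = 0; i + n <= s->size; i++)
		if (!memcmp(s->buf + i, needle, n))
			return 1;
	return 0;
}
```

In this sketch, calling `grep_skip_binary` on a source whose `attr` is `ATTR_BINARY` leaves `loads` at zero, while a source with `ATTR_UNSET` is loaded exactly once, because the content-based check and the search share the same buffer.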