builtin-grep: use external grep when we can take advantage of it
It's not perfect, but it gets the "git grep some-random-string" down to
the good old half-a-second range for the kernel.

It should convert more of the argument flags for "grep", that should be
trivial to expand (I did a few just as an example). It should also bother
to try to return the right "hit" value (which it doesn't, right now - the
code is kind of there, but I didn't actually bother to do it _right_).
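
As a rough illustration of how the flag conversion mentioned above might be expanded, the sketch below drives the translation from a small table. `struct sketch_opt` and `push_grep_flags()` are hypothetical names invented for this example and are not git's `struct grep_opt`; only `word_regexp` and `name_only` appear in the patch, and the other fields are assumed. The flags themselves (`-w`, `-l`, `-i`, `-v`, `-n`) are standard grep options.

```c
#include <stddef.h>

/* Hypothetical subset of option flags; NOT git's actual struct grep_opt. */
struct sketch_opt {
	int word_regexp;	/* -w (handled by the patch) */
	int name_only;		/* -l (handled by the patch) */
	int ignore_case;	/* -i (assumed field) */
	int invert;		/* -v (assumed field) */
	int line_number;	/* -n (assumed field) */
};

/*
 * Append the grep flags implied by "opt" to argv, starting at index nr.
 * Returns the new argument count, or -1 if the flags do not fit into
 * "max" slots.
 */
static int push_grep_flags(const struct sketch_opt *opt,
			   const char **argv, int nr, int max)
{
	const struct {
		int on;
		const char *flag;
	} map[] = {
		{ opt->word_regexp, "-w" },
		{ opt->name_only,   "-l" },
		{ opt->ignore_case, "-i" },
		{ opt->invert,      "-v" },
		{ opt->line_number, "-n" },
	};
	size_t i;

	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
		if (!map[i].on)
			continue;
		if (nr >= max)
			return -1;
		argv[nr++] = map[i].flag;
	}
	return nr;
}
```

A table keeps each new flag to a one-line addition, and the bounds check guards the fixed-size argv array before any filenames are appended.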

Also, right now it _just_ limits by number of arguments, but it should
also strictly speaking limit by total argument size (ie add up the length
of the filenames, and do the "exec_grep()" flush call if it's bigger than
some random value like 32kB).
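
The size limit described above could be tracked next to the argument count. The following is a minimal sketch under stated assumptions: `struct arg_batch`, `batch_add()`, `batch_reset()` and the `SKETCH_*` constants are hypothetical names, and the batch counts only filenames (the patch also counts the fixed leading arguments in `argc`).

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAXARGS  1000		/* same count limit as MAXARGS in the patch */
#define SKETCH_MAXBYTES (32 * 1024)	/* the "random value like 32kB" */

/* Hypothetical bookkeeping for one pending grep invocation. */
struct arg_batch {
	int argc;	/* number of filenames queued so far */
	size_t bytes;	/* total size of queued filenames, including NULs */
};

/*
 * Account for one more filename. Returns 1 when either limit has been
 * reached, meaning the caller should flush (run grep on the batch) and
 * then call batch_reset(); returns 0 otherwise.
 */
static int batch_add(struct arg_batch *b, const char *name)
{
	b->argc++;
	b->bytes += strlen(name) + 1;
	return b->argc >= SKETCH_MAXARGS || b->bytes >= SKETCH_MAXBYTES;
}

/* Start a fresh batch after a flush. */
static void batch_reset(struct arg_batch *b)
{
	b->argc = 0;
	b->bytes = 0;
}
```

In the loop over the index, the caller would append the name to argv, call `batch_add()`, and invoke `exec_grep()` whenever it returns 1.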

But I think that it's _conceptually_ doing all the right things, and it
seems to work. So maybe somebody else can do some of the final polish.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Linus Torvalds authored and Junio C Hamano committed May 15, 2006
1 parent 07ea91d commit 1e2398d
Showing 1 changed file with 79 additions and 0 deletions.
79 changes: 79 additions & 0 deletions builtin-grep.c
@@ -12,6 +12,7 @@
#include "builtin.h"
#include <regex.h>
#include <fnmatch.h>
#include <sys/wait.h>

/*
 * git grep pathspecs are somewhat different from diff-tree pathspecs;
@@ -409,12 +410,90 @@ static int grep_file(struct grep_opt *opt, const char *filename)
	return i;
}

static int exec_grep(int argc, const char **argv)
{
	pid_t pid;
	int status;

	argv[argc] = NULL;
	pid = fork();
	if (pid < 0)
		return pid;
	if (!pid) {
		execvp("grep", (char **) argv);
		exit(255);
	}
	while (waitpid(pid, &status, 0) < 0) {
		if (errno == EINTR)
			continue;
		return -1;
	}
	if (WIFEXITED(status)) {
		if (!WEXITSTATUS(status))
			return 1;
		return 0;
	}
	return -1;
}

#define MAXARGS 1000

static int external_grep(struct grep_opt *opt, const char **paths, int cached)
{
	int i, nr, argc, hit;
	const char *argv[MAXARGS+1];
	struct grep_pat *p;

	nr = 0;
	argv[nr++] = "grep";
	if (opt->word_regexp)
		argv[nr++] = "-w";
	if (opt->name_only)
		argv[nr++] = "-l";
	for (p = opt->pattern_list; p; p = p->next) {
		argv[nr++] = "-e";
		argv[nr++] = p->pattern;
	}
	argv[nr++] = "--";

	hit = 0;
	argc = nr;
	for (i = 0; i < active_nr; i++) {
		struct cache_entry *ce = active_cache[i];
		if (ce_stage(ce) || !S_ISREG(ntohl(ce->ce_mode)))
			continue;
		if (!pathspec_matches(paths, ce->name))
			continue;
		argv[argc++] = ce->name;
		if (argc < MAXARGS)
			continue;
		hit += exec_grep(argc, argv);
		argc = nr;
	}
	if (argc > nr)
		hit += exec_grep(argc, argv);
	return 0;
}

static int grep_cache(struct grep_opt *opt, const char **paths, int cached)
{
	int hit = 0;
	int nr;
	read_cache();

#ifdef __unix__
	/*
	 * Use the external "grep" command for the case where
	 * we grep through the checked-out files. It tends to
	 * be a lot more optimized
	 */
	if (!cached) {
		hit = external_grep(opt, paths, cached);
		if (hit >= 0)
			return hit;
	}
#endif

	for (nr = 0; nr < active_nr; nr++) {
		struct cache_entry *ce = active_cache[nr];
		if (ce_stage(ce) || !S_ISREG(ntohl(ce->ce_mode)))
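
The commit message notes that the "hit" value is not yet returned correctly. One possible way to tighten this is sketched below; it is not the fix that was later applied to git. It relies on the documented grep exit codes (0: a line matched, 1: nothing matched, greater than 1: error). `hit_from_exit_code()` and `run_and_classify()` are hypothetical helper names.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map a grep exit code to a hit value: 1 = match, 0 = no match, -1 = error. */
static int hit_from_exit_code(int code)
{
	if (code == 0)
		return 1;
	if (code == 1)
		return 0;
	return -1;
}

/*
 * Run the NULL-terminated command in argv and classify its result.
 * Returns 1 on a match, 0 on no match, and -1 on any failure, so a
 * caller can tell "nothing found" apart from "grep could not run".
 */
static int run_and_classify(char *const argv[])
{
	pid_t pid = fork();
	int status;

	if (pid < 0)
		return -1;
	if (!pid) {
		execvp(argv[0], argv);
		_exit(255);	/* exec failed; the parent sees this as an error */
	}
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	if (!WIFEXITED(status))
		return -1;	/* child was killed by a signal */
	return hit_from_exit_code(WEXITSTATUS(status));
}
```

By contrast, `exec_grep()` in the patch treats every non-zero exit, including the 255 used when `execvp()` fails, as "no hit", and `external_grep()` returns 0 regardless of the accumulated `hit`. A caller adopting this sketch would still have to decide how to combine a -1 from one batch with hits from earlier batches.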
