Skip to content

Commit

Permalink
Optimize match_pathspec() to avoid fnmatch()
Browse files Browse the repository at this point in the history
"git add *" is actually fundamentally different from "git add .", and
yeah, you should generally use the latter.

The reason? The argument list is actually something different from what
you think it is. For git, it's a "pathspec", so what actualy happens is
that in *both* cases, it will really traverse the whole tree, and then
match every file it finds against the pathspec.

So think of the arguments not as a file list, but as a random bunch of
patterns to match against the files you have!

Which is why the cost is actually approximately O(n*m), where "n" is the
size of the working tree, and "m" is the number of pathspecs.

So the reason "git add ." is fast is actually that "m" in that case is
just 1 (just one trivial pattern), and then "git add *" is slow because
"m" is large (lots of complicated patterns). In both cases, 'n' is the
same (== the whole set of files in your working tree).

Anyway, here's a trivial patch that doesn't change this fundamental fact,
but that avoids doing anything *expensive* until we've done some cheap
initial tests. It may or may not help your test-case, but it's pretty
simple and it matches the other git optimizations in this area (ie
"conceptually handle the general case, but optimize the simple cases where
we can exit early")

Notice how this patch doesn' actually change the fundamental O(n^2)
behaviour, but it makes it much cheaper by generally avoiding the
expensive 'fnmatch' and 'strlen/strncmp' when they are obviously not
needed.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
  • Loading branch information
Linus Torvalds authored and Junio C Hamano committed Apr 27, 2008
1 parent 36c79d2 commit 88ea811
Showing 1 changed file with 20 additions and 2 deletions.
22 changes: 20 additions & 2 deletions dir.c
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,11 @@ int common_prefix(const char **pathspec)
return prefix;
}

static inline int special_char(unsigned char c1)
{
return !c1 || c1 == '*' || c1 == '[' || c1 == '?';
}

/*
* Does 'match' matches the given name?
* A match is found if
Expand All @@ -69,14 +74,27 @@ static int match_one(const char *match, const char *name, int namelen)
int matchlen;

/* If the match was just the prefix, we matched */
matchlen = strlen(match);
if (!matchlen)
if (!*match)
return MATCHED_RECURSIVELY;

for (;;) {
unsigned char c1 = *match;
unsigned char c2 = *name;
if (special_char(c1))
break;
if (c1 != c2)
return 0;
match++;
name++;
namelen--;
}


/*
* If we don't match the matchstring exactly,
* we need to match by fnmatch
*/
matchlen = strlen(match);
if (strncmp(match, name, matchlen))
return !fnmatch(match, name, 0) ? MATCHED_FNMATCH : 0;

Expand Down

0 comments on commit 88ea811

Please sign in to comment.