Merge branch 'nd/untracked-cache'
Teach the index to optionally remember already seen untracked files
to speed up "git status" in a working tree with tons of cruft.

* nd/untracked-cache: (24 commits)
  git-status.txt: advertisement for untracked cache
  untracked cache: guard and disable on system changes
  mingw32: add uname()
  t7063: tests for untracked cache
  update-index: test the system before enabling untracked cache
  update-index: manually enable or disable untracked cache
  status: enable untracked cache
  untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHE
  untracked cache: mark index dirty if untracked cache is updated
  untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATS
  untracked cache: avoid racy timestamps
  read-cache.c: split racy stat test to a separate function
  untracked cache: invalidate at index addition or removal
  untracked cache: load from UNTR index extension
  untracked cache: save to an index extension
  ewah: add convenient wrapper ewah_serialize_strbuf()
  untracked cache: don't open non-existent .gitignore
  untracked cache: mark what dirs should be recursed/saved
  untracked cache: record/validate dir mtime and reuse cached output
  untracked cache: make a wrapper around {open,read,close}dir()
  ...
Junio C Hamano committed May 26, 2015
2 parents a26d48a + aeb6f8b commit 38ccaf9
Showing 21 changed files with 1,822 additions and 58 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -184,6 +184,7 @@
/test-delta
/test-dump-cache-tree
/test-dump-split-index
/test-dump-untracked-cache
/test-scrap-cache-tree
/test-genrandom
/test-hashmap
5 changes: 4 additions & 1 deletion Documentation/git-status.txt
@@ -66,7 +66,10 @@ When `-u` option is not used, untracked files and directories are
shown (i.e. the same as specifying `normal`), to help you avoid
forgetting to add newly created files. Because it takes extra work
to find untracked files in the filesystem, this mode may take some
time in a large working tree. You can use `no` to have `git status`
time in a large working tree.
Consider enabling untracked cache and split index if supported (see
`git update-index --untracked-cache` and `git update-index
--split-index`). Otherwise you can use `no` to have `git status`
return more quickly without showing untracked files.
+
The default can be changed using the status.showUntrackedFiles
14 changes: 14 additions & 0 deletions Documentation/git-update-index.txt
@@ -170,6 +170,20 @@ may not support it yet.
the shared index file. This mode is designed for very large
indexes that take a significant amount of time to read or write.

--untracked-cache::
--no-untracked-cache::
Enable or disable the untracked cache extension. This could speed
up commands that involve determining untracked files such
as `git status`. The underlying operating system and file
system must change `st_mtime` field of a directory if files
are added or deleted in that directory.

--force-untracked-cache::
For safety, `--untracked-cache` performs tests on the working
directory to make sure untracked cache can be used. These
tests can take a few seconds. `--force-untracked-cache` can be
used to skip the tests.

\--::
Do not interpret any more arguments as options.
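
The `st_mtime` requirement stated for `--untracked-cache` can be probed with a few POSIX calls. The following is a minimal sketch and not the code from this commit: the function name `dir_mtime_tracks_new_file` and the file name `newfile` are hypothetical, and only the "file added" case is checked.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create the scratch directory "dir", add a file to it, and report
 * whether the directory's st_mtime changed.
 * Returns 1 if it changed, 0 if it did not, and -1 if the probe
 * itself could not be set up (e.g. "dir" already exists).
 */
int dir_mtime_tracks_new_file(const char *dir)
{
	struct stat before, after;
	char path[4096];
	int fd, n, ret = -1;

	if (mkdir(dir, 0700))
		return -1;
	if (stat(dir, &before))
		goto out;

	sleep(1); /* step past one-second timestamp granularity */

	n = snprintf(path, sizeof(path), "%s/newfile", dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		goto out;
	fd = open(path, O_CREAT | O_RDWR, 0644);
	if (fd < 0)
		goto out;
	close(fd);

	if (!stat(dir, &after))
		ret = before.st_mtime != after.st_mtime;
	unlink(path);
out:
	rmdir(dir);
	return ret;
}
```

A result of 0 would suggest that the untracked cache cannot be trusted on that filesystem. The one-second sleep keeps the two timestamps from landing in the same second.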

62 changes: 62 additions & 0 deletions Documentation/technical/index-format.txt
@@ -233,3 +233,65 @@ Git index format
The remaining index entries after replaced ones will be added to the
final index. These added entries are also sorted by entry name then
stage.

== Untracked cache

Untracked cache saves the untracked file list and necessary data to
verify the cache. The signature for this extension is { 'U', 'N',
'T', 'R' }.

The extension starts with

- A sequence of NUL-terminated strings, preceded by the size of the
sequence in variable width encoding. Each string describes the
environment where the cache can be used.

- Stat data of $GIT_DIR/info/exclude. See "Index entry" section from
ctime field until "file size".

- Stat data of core.excludesfile

- 32-bit dir_flags (see struct dir_struct)

- 160-bit SHA-1 of $GIT_DIR/info/exclude. Null SHA-1 means the file
does not exist.

- 160-bit SHA-1 of core.excludesfile. Null SHA-1 means the file does
not exist.

- NUL-terminated string of per-dir exclude file name. This usually
is ".gitignore".

- The number of following directory blocks, variable width
encoding. If this number is zero, the extension ends here with a
following NUL.

- A number of directory blocks in depth-first-search order, each
consists of

- The number of untracked entries, variable width encoding.

- The number of sub-directory blocks, variable width encoding.

- The directory name terminated by NUL.

- A number of untracked file/dir names terminated by NUL.
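
The leading fields of one directory block can be read as sketched below. This is an illustration under assumptions: "variable width encoding" is taken to be the scheme of Git's `varint.c` (big-endian base-128 with a +1 bias on continuation bytes), and `struct dir_block`, `decode_varint`, `skip_string` and `parse_dir_block` are hypothetical names written for this example, not Git's own reader.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of one directory block; not Git's own struct. */
struct dir_block {
	uint64_t untracked_nr;  /* number of untracked entries */
	uint64_t dirs_nr;       /* number of sub-directory blocks */
	const char *name;       /* directory name, NUL-terminated */
	const char *untracked;  /* first of untracked_nr NUL-terminated names */
};

/*
 * Decode one variable-width integer (assumed encoding, see lead-in).
 * Returns 0 on success, -1 on truncated input. Overflow is not
 * checked in this sketch.
 */
static int decode_varint(const unsigned char **bufp, const unsigned char *end,
			 uint64_t *out)
{
	const unsigned char *buf = *bufp;
	unsigned char c;
	uint64_t val;

	if (buf >= end)
		return -1;
	c = *buf++;
	val = c & 127;
	while (c & 128) {
		if (buf >= end)
			return -1;
		val += 1;
		c = *buf++;
		val = (val << 7) + (c & 127);
	}
	*bufp = buf;
	*out = val;
	return 0;
}

/* Advance *bufp past one NUL-terminated string; -1 if no NUL before end. */
static int skip_string(const unsigned char **bufp, const unsigned char *end)
{
	const unsigned char *nul = memchr(*bufp, 0, (size_t)(end - *bufp));

	if (!nul)
		return -1;
	*bufp = nul + 1;
	return 0;
}

/* Parse the counts, the directory name and the untracked names. */
int parse_dir_block(const unsigned char **bufp, const unsigned char *end,
		    struct dir_block *blk)
{
	const unsigned char *p = *bufp;
	uint64_t i;

	if (decode_varint(&p, end, &blk->untracked_nr) ||
	    decode_varint(&p, end, &blk->dirs_nr))
		return -1;
	blk->name = (const char *)p;
	if (skip_string(&p, end))
		return -1;
	blk->untracked = (const char *)p;
	for (i = 0; i < blk->untracked_nr; i++)
		if (skip_string(&p, end))
			return -1;
	*bufp = p;
	return 0;
}
```

Because the blocks are stored in depth-first order, a caller would read `dirs_nr` further blocks for the children right after each parent.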

The remaining data of each directory block is grouped by type:

- An ewah bitmap, the n-th bit marks whether the n-th directory has
valid untracked cache entries.

- An ewah bitmap, the n-th bit records "check-only" bit of
read_directory_recursive() for the n-th directory.

- An ewah bitmap, the n-th bit indicates whether SHA-1 and stat data
is valid for the n-th directory and exists in the next data.

- An array of stat data. The n-th data corresponds with the n-th
"one" bit in the previous ewah bitmap.

- An array of SHA-1. The n-th SHA-1 corresponds with the n-th "one" bit
in the previous ewah bitmap.

- One NUL.
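
The rule that "the n-th data corresponds with the n-th one bit" means the stat and SHA-1 arrays are compact: a directory's slot is the count of set bits before its own position. A minimal sketch follows, using a plain uncompressed bit array as a stand-in for the EWAH bitmap; `bit_is_set` and `compact_index` are hypothetical names for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Test bit n of an uncompressed bitmap stored in 64-bit words. */
static int bit_is_set(const uint64_t *words, size_t n)
{
	return (int)((words[n / 64] >> (n % 64)) & 1);
}

/*
 * Map directory number n to its slot in the compact stat/SHA-1 arrays.
 * Returns -1 if bit n is clear, i.e. the directory has no entry.
 */
long compact_index(const uint64_t *words, size_t n)
{
	long rank = 0;
	size_t i;

	if (!bit_is_set(words, n))
		return -1;
	for (i = 0; i < n; i++)
		rank += bit_is_set(words, i);
	return rank;
}
```

With the bitmap `1011` in binary (directories 0, 1 and 3 marked), directory 3 maps to slot 2 and directory 2 has no slot.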
1 change: 1 addition & 0 deletions Makefile
@@ -574,6 +574,7 @@ TEST_PROGRAMS_NEED_X += test-date
TEST_PROGRAMS_NEED_X += test-delta
TEST_PROGRAMS_NEED_X += test-dump-cache-tree
TEST_PROGRAMS_NEED_X += test-dump-split-index
TEST_PROGRAMS_NEED_X += test-dump-untracked-cache
TEST_PROGRAMS_NEED_X += test-genrandom
TEST_PROGRAMS_NEED_X += test-hashmap
TEST_PROGRAMS_NEED_X += test-index-version
5 changes: 3 additions & 2 deletions builtin/commit.c
@@ -1366,13 +1366,14 @@ int cmd_status(int argc, const char **argv, const char *prefix)
refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED, &s.pathspec, NULL, NULL);

fd = hold_locked_index(&index_lock, 0);
if (0 <= fd)
update_index_if_able(&the_index, &index_lock);

s.is_initial = get_sha1(s.reference, sha1) ? 1 : 0;
s.ignore_submodule_arg = ignore_submodule_arg;
wt_status_collect(&s);

if (0 <= fd)
update_index_if_able(&the_index, &index_lock);

if (s.relative_paths)
s.prefix = prefix;

188 changes: 188 additions & 0 deletions builtin/update-index.c
@@ -33,6 +33,7 @@ static int mark_valid_only;
static int mark_skip_worktree_only;
#define MARK_FLAG 1
#define UNMARK_FLAG 2
static struct strbuf mtime_dir = STRBUF_INIT;

__attribute__((format (printf, 1, 2)))
static void report(const char *fmt, ...)
@@ -48,6 +49,166 @@ static void report(const char *fmt, ...)
va_end(vp);
}

static void remove_test_directory(void)
{
if (mtime_dir.len)
remove_dir_recursively(&mtime_dir, 0);
}

static const char *get_mtime_path(const char *path)
{
static struct strbuf sb = STRBUF_INIT;
strbuf_reset(&sb);
strbuf_addf(&sb, "%s/%s", mtime_dir.buf, path);
return sb.buf;
}

static void xmkdir(const char *path)
{
path = get_mtime_path(path);
if (mkdir(path, 0700))
die_errno(_("failed to create directory %s"), path);
}

static int xstat_mtime_dir(struct stat *st)
{
if (stat(mtime_dir.buf, st))
die_errno(_("failed to stat %s"), mtime_dir.buf);
return 0;
}

static int create_file(const char *path)
{
int fd;
path = get_mtime_path(path);
fd = open(path, O_CREAT | O_RDWR, 0644);
if (fd < 0)
die_errno(_("failed to create file %s"), path);
return fd;
}

static void xunlink(const char *path)
{
path = get_mtime_path(path);
if (unlink(path))
die_errno(_("failed to delete file %s"), path);
}

static void xrmdir(const char *path)
{
path = get_mtime_path(path);
if (rmdir(path))
die_errno(_("failed to delete directory %s"), path);
}

static void avoid_racy(void)
{
/*
* not sure if we could usleep(10) if USE_NSEC is defined. The
* field nsec could be there, but the OS could choose to
* ignore it?
*/
sleep(1);
}

static int test_if_untracked_cache_is_supported(void)
{
struct stat st;
struct stat_data base;
int fd, ret = 0;

strbuf_addstr(&mtime_dir, "mtime-test-XXXXXX");
if (!mkdtemp(mtime_dir.buf))
die_errno("Could not make temporary directory");

fprintf(stderr, _("Testing "));
atexit(remove_test_directory);
xstat_mtime_dir(&st);
fill_stat_data(&base, &st);
fputc('.', stderr);

avoid_racy();
fd = create_file("newfile");
xstat_mtime_dir(&st);
if (!match_stat_data(&base, &st)) {
close(fd);
fputc('\n', stderr);
fprintf_ln(stderr,_("directory stat info does not "
"change after adding a new file"));
goto done;
}
fill_stat_data(&base, &st);
fputc('.', stderr);

avoid_racy();
xmkdir("new-dir");
xstat_mtime_dir(&st);
if (!match_stat_data(&base, &st)) {
close(fd);
fputc('\n', stderr);
fprintf_ln(stderr, _("directory stat info does not change "
"after adding a new directory"));
goto done;
}
fill_stat_data(&base, &st);
fputc('.', stderr);

avoid_racy();
write_or_die(fd, "data", 4);
close(fd);
xstat_mtime_dir(&st);
if (match_stat_data(&base, &st)) {
fputc('\n', stderr);
fprintf_ln(stderr, _("directory stat info changes "
"after updating a file"));
goto done;
}
fputc('.', stderr);

avoid_racy();
close(create_file("new-dir/new"));
xstat_mtime_dir(&st);
if (match_stat_data(&base, &st)) {
fputc('\n', stderr);
fprintf_ln(stderr, _("directory stat info changes after "
"adding a file inside subdirectory"));
goto done;
}
fputc('.', stderr);

avoid_racy();
xunlink("newfile");
xstat_mtime_dir(&st);
if (!match_stat_data(&base, &st)) {
fputc('\n', stderr);
fprintf_ln(stderr, _("directory stat info does not "
"change after deleting a file"));
goto done;
}
fill_stat_data(&base, &st);
fputc('.', stderr);

avoid_racy();
xunlink("new-dir/new");
xrmdir("new-dir");
xstat_mtime_dir(&st);
if (!match_stat_data(&base, &st)) {
fputc('\n', stderr);
fprintf_ln(stderr, _("directory stat info does not "
"change after deleting a directory"));
goto done;
}

if (rmdir(mtime_dir.buf))
die_errno(_("failed to delete directory %s"), mtime_dir.buf);
fprintf_ln(stderr, _(" OK"));
ret = 1;

done:
strbuf_release(&mtime_dir);
return ret;
}

static int mark_ce_flags(const char *path, int flag, int mark)
{
int namelen = strlen(path);
@@ -741,6 +902,7 @@ static int reupdate_callback(struct parse_opt_ctx_t *ctx,
int cmd_update_index(int argc, const char **argv, const char *prefix)
{
int newfd, entries, has_errors = 0, line_termination = '\n';
int untracked_cache = -1;
int read_from_stdin = 0;
int prefix_length = prefix ? strlen(prefix) : 0;
int preferred_index_format = 0;
@@ -832,6 +994,10 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
N_("write index in this format")),
OPT_BOOL(0, "split-index", &split_index,
N_("enable or disable split index")),
OPT_BOOL(0, "untracked-cache", &untracked_cache,
N_("enable/disable untracked cache")),
OPT_SET_INT(0, "force-untracked-cache", &untracked_cache,
N_("enable untracked cache without testing the filesystem"), 2),
OPT_END()
};

@@ -938,6 +1104,28 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
the_index.split_index = NULL;
the_index.cache_changed |= SOMETHING_CHANGED;
}
if (untracked_cache > 0) {
struct untracked_cache *uc;

if (untracked_cache < 2) {
setup_work_tree();
if (!test_if_untracked_cache_is_supported())
return 1;
}
if (!the_index.untracked) {
uc = xcalloc(1, sizeof(*uc));
strbuf_init(&uc->ident, 100);
uc->exclude_per_dir = ".gitignore";
/* should be the same flags used by git-status */
uc->dir_flags = DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES;
the_index.untracked = uc;
}
add_untracked_ident(the_index.untracked);
the_index.cache_changed |= UNTRACKED_CHANGED;
} else if (!untracked_cache && the_index.untracked) {
the_index.untracked = NULL;
the_index.cache_changed |= UNTRACKED_CHANGED;
}
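
The `untracked_cache` variable in this hunk carries four states in one `int`: it starts at -1, `OPT_BOOL` stores 1 or 0, and `--force-untracked-cache` stores 2. The sketch below spells out that mapping. The enum and the function `classify_untracked_cache` are hypothetical names used only for illustration and do not appear in the commit.

```c
/* Hypothetical names for the four states of the option variable. */
enum uc_request {
	UC_UNSPECIFIED,    /* option not given: leave the index alone */
	UC_DISABLE,        /* --no-untracked-cache */
	UC_ENABLE_TESTED,  /* --untracked-cache: probe the filesystem first */
	UC_ENABLE_FORCED   /* --force-untracked-cache: skip the probe */
};

/* Mirror the comparisons made in cmd_update_index(). */
enum uc_request classify_untracked_cache(int untracked_cache)
{
	if (untracked_cache < 0)
		return UC_UNSPECIFIED;
	if (untracked_cache == 0)
		return UC_DISABLE;
	if (untracked_cache < 2)
		return UC_ENABLE_TESTED;
	return UC_ENABLE_FORCED;
}
```

Note that in the hunk above, the disable branch only acts when `the_index.untracked` is already set, so `UC_DISABLE` on an index without the extension changes nothing.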

if (active_cache_changed) {
if (newfd < 0) {
6 changes: 6 additions & 0 deletions cache.h
@@ -297,8 +297,11 @@ static inline unsigned int canon_mode(unsigned int mode)
#define RESOLVE_UNDO_CHANGED (1 << 4)
#define CACHE_TREE_CHANGED (1 << 5)
#define SPLIT_INDEX_ORDERED (1 << 6)
#define UNTRACKED_CHANGED (1 << 7)

struct split_index;
struct untracked_cache;

struct index_state {
struct cache_entry **cache;
unsigned int version;
@@ -312,6 +315,7 @@ struct index_state {
struct hashmap name_hash;
struct hashmap dir_hash;
unsigned char sha1[20];
struct untracked_cache *untracked;
};

extern struct index_state the_index;
@@ -563,6 +567,8 @@ extern void fill_stat_data(struct stat_data *sd, struct stat *st);
* INODE_CHANGED, and DATA_CHANGED.
*/
extern int match_stat_data(const struct stat_data *sd, struct stat *st);
extern int match_stat_data_racy(const struct index_state *istate,
const struct stat_data *sd, struct stat *st);

extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st);

11 changes: 11 additions & 0 deletions compat/mingw.c
@@ -2128,3 +2128,14 @@ void mingw_startup()
/* initialize Unicode console */
winansi_init();
}

int uname(struct utsname *buf)
{
DWORD v = GetVersion();
memset(buf, 0, sizeof(*buf));
strcpy(buf->sysname, "Windows");
sprintf(buf->release, "%u.%u", v & 0xff, (v >> 8) & 0xff);
/* assuming NT variants only.. */
sprintf(buf->version, "%u", (v >> 16) & 0x7fff);
return 0;
}
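
The bit layout that `uname()` relies on can be exercised without the Windows headers. This is a portable sketch: `struct win_version` and `unpack_win_version` are hypothetical names, and the sample value in the usage note is made up for illustration rather than taken from a real system.

```c
#include <stdint.h>

/* Hypothetical holder for the three fields uname() extracts. */
struct win_version {
	unsigned major;  /* low byte of the packed value */
	unsigned minor;  /* second byte */
	unsigned build;  /* bits 16..30, as in the NT-only assumption above */
};

/* Split a packed version word using the same masks and shifts as uname(). */
struct win_version unpack_win_version(uint32_t v)
{
	struct win_version r;

	r.major = v & 0xff;
	r.minor = (v >> 8) & 0xff;
	r.build = (v >> 16) & 0x7fff;
	return r;
}
```

For the packed value `0x1DB10106`, this yields major 6, minor 1 and build 7601, which `uname()` would format as release "6.1" and version "7601".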