cat-file: split --batch input lines on whitespace
If we get an input line to --batch or --batch-check that
looks like "HEAD foo bar", we will currently feed the whole
thing to get_sha1(). This means that to use --batch-check
with `rev-list --objects`, one must pre-process the input,
like:

  git rev-list --objects HEAD |
  cut -d' ' -f1 |
  git cat-file --batch-check

Besides requiring more typing and the slight inefficiency of
invoking `cut`, the result loses information: we no longer
know which path each object was found at.

This patch teaches cat-file to split input lines at the
first whitespace. Everything to the left of the whitespace
is considered an object name, and everything to the right is
made available as the %(rest) atom. So you can now do:

  git rev-list --objects HEAD |
  git cat-file --batch-check='%(objectsize) %(rest)'

to collect object sizes at particular paths.

Even if %(rest) is not used, we always do the whitespace
split (which means you can simply eliminate the `cut`
command from the first example above).

This whitespace split is backwards compatible for any
reasonable input. Object names cannot contain spaces, so any
input with spaces would have resulted in a "missing" line.
The only input hurt is if somebody really expected input of
the form "HEAD is a fine-looking ref!" to fail; it will now
parse HEAD, and make "is a fine-looking ref!" available as
%(rest).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King authored and Junio C Hamano committed Jul 12, 2013
1 parent a4ac106 commit c334b87
Showing 3 changed files with 34 additions and 3 deletions.
10 changes: 8 additions & 2 deletions Documentation/git-cat-file.txt
@@ -88,8 +88,10 @@ BATCH OUTPUT
 If `--batch` or `--batch-check` is given, `cat-file` will read objects
 from stdin, one per line, and print information about them.

-Each line is considered as a whole object name, and is parsed as if
-given to linkgit:git-rev-parse[1].
+Each line is split at the first whitespace boundary. All characters
+before that whitespace are considered as a whole object name, and are
+parsed as if given to linkgit:git-rev-parse[1]. Characters after that
+whitespace can be accessed using the `%(rest)` atom (see below).

 You can specify the information shown for each object by using a custom
 `<format>`. The `<format>` is copied literally to stdout for each
@@ -110,6 +112,10 @@ newline. The available atoms are:
 	The size, in bytes, that the object takes up on disk. See the
 	note about on-disk sizes in the `CAVEATS` section below.

+`rest`::
+	The text (if any) found after the first run of whitespace on the
+	input line (i.e., the "rest" of the line).
+
 If no format is specified, the default format is `%(objectname)
 %(objecttype) %(objectsize)`.

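To make the documented behavior of the `rest` atom concrete, here is a minimal standalone sketch of format expansion. It is not the code from this patch: `struct line_data`, `expand_format`, and the fixed-size output buffer are all hypothetical names and simplifications chosen for illustration, and unknown atoms return -1 here rather than aborting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-line data; the field names are illustrative only. */
struct line_data {
	const char *objectname;
	unsigned long objectsize;
	const char *rest;	/* NULL when the input line had no whitespace */
};

/*
 * Expand "%(atom)" placeholders from `fmt` into `out` (capacity `cap`).
 * Only objectname, objectsize and rest are recognized in this sketch.
 * Returns 0 on success, -1 on an unknown atom, an unterminated atom,
 * or truncated output.
 */
static int expand_format(char *out, size_t cap, const char *fmt,
			 const struct line_data *d)
{
	size_t used = 0;

	assert(out && cap > 0 && fmt && d);
	out[0] = '\0';
	while (*fmt) {
		int n;

		if (fmt[0] == '%' && fmt[1] == '(') {
			const char *atom = fmt + 2;
			const char *end = strchr(atom, ')');
			size_t len;

			if (!end)
				return -1;
			len = (size_t)(end - atom);
			if (len == 4 && !strncmp(atom, "rest", len))
				/* A missing remainder expands to nothing. */
				n = snprintf(out + used, cap - used, "%s",
					     d->rest ? d->rest : "");
			else if (len == 10 && !strncmp(atom, "objectname", len))
				n = snprintf(out + used, cap - used, "%s",
					     d->objectname);
			else if (len == 10 && !strncmp(atom, "objectsize", len))
				n = snprintf(out + used, cap - used, "%lu",
					     d->objectsize);
			else
				return -1;
			fmt = end + 1;
		} else {
			/* Everything outside an atom is copied literally. */
			n = snprintf(out + used, cap - used, "%c", *fmt++);
		}
		if (n < 0 || (size_t)n >= cap - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

With `rest` set to a path, the format `%(objectsize) %(rest)` yields the size followed by that path. When `rest` is NULL, the atom contributes no text, and the literal space before it in the format is still copied.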
20 changes: 19 additions & 1 deletion builtin/cat-file.c
@@ -119,6 +119,7 @@ struct expand_data {
 	enum object_type type;
 	unsigned long size;
 	unsigned long disk_size;
+	const char *rest;

 	/*
 	 * If mark_query is true, we do not expand anything, but rather
@@ -161,6 +162,9 @@ static void expand_atom(struct strbuf *sb, const char *atom, int len,
 			data->info.disk_sizep = &data->disk_size;
 		else
 			strbuf_addf(sb, "%lu", data->disk_size);
+	} else if (is_atom("rest", atom, len)) {
+		if (!data->mark_query && data->rest)
+			strbuf_addstr(sb, data->rest);
 	} else
 		die("unknown format element: %.*s", len, atom);
 }
@@ -263,7 +267,21 @@ static int batch_objects(struct batch_options *opt)
 	data.mark_query = 0;

 	while (strbuf_getline(&buf, stdin, '\n') != EOF) {
-		int error = batch_one_object(buf.buf, opt, &data);
+		char *p;
+		int error;
+
+		/*
+		 * Split at first whitespace, tying off the beginning of the
+		 * string and saving the remainder (or NULL) in data.rest.
+		 */
+		p = strpbrk(buf.buf, " \t");
+		if (p) {
+			while (*p && strchr(" \t", *p))
+				*p++ = '\0';
+		}
+		data.rest = p;
+
+		error = batch_one_object(buf.buf, opt, &data);
 		if (error)
 			return error;
 	}
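The in-place split added to `batch_objects` can be tried outside of Git. The sketch below lifts the same `strpbrk`/`strchr` logic into a standalone function. The name `split_at_whitespace` is hypothetical and does not exist in the patch, which performs the split inline on `buf.buf`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): split `line` in place at
 * the first run of spaces or tabs. The object name is left NUL-terminated
 * at the start of `line`. The return value points at the text after the
 * whitespace run, or is NULL if the line contains no whitespace at all.
 */
static char *split_at_whitespace(char *line)
{
	char *p;

	assert(line != NULL);
	p = strpbrk(line, " \t");
	if (p) {
		/*
		 * Overwrite the whole whitespace run with NULs. The *p
		 * check is needed because strchr() also matches the
		 * terminating NUL of its first argument.
		 */
		while (*p && strchr(" \t", *p))
			*p++ = '\0';
	}
	return p;
}
```

For the input "HEAD foo bar", the buffer is cut down to "HEAD" and the returned pointer refers to "foo bar". A line with no whitespace returns NULL. A line that ends in whitespace, such as "HEAD ", returns a pointer to an empty string rather than NULL, and the `rest` atom expands to nothing in either case.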
7 changes: 7 additions & 0 deletions t/t1006-cat-file.sh
@@ -78,6 +78,13 @@ $content"
 	echo $sha1 | git cat-file --batch-check="%(objecttype) %(objectname)" >actual &&
 	test_cmp expect actual
     '
+
+    test_expect_success '--batch-check with %(rest)' '
+	echo "$type this is some extra content" >expect &&
+	echo "$sha1 this is some extra content" |
+		git cat-file --batch-check="%(objecttype) %(rest)" >actual &&
+	test_cmp expect actual
+    '
 }

 hello_content="Hello World"
