drop support for "experimental" loose objects
In git v1.4.3, we introduced a new loose object format that
encoded some object information outside of the zlib stream.
Ultimately the format was dropped in v1.5.3, but we kept the
reading side around to help people migrate objects. Each
time we open a loose object, we use a heuristic to check
whether it is in the normal loose format, or the
experimental one.

This heuristic is robust in the face of valid data, but it
tends to treat corrupted or garbage data as an experimental
object. With the regular format, we would notice quickly
that zlib's crc does not check out and complain. With the
experimental object, we are likely to extract a nonsensical
object size and try to allocate a huge buffer, resulting in
xmalloc calling "die".

This latter behavior is much worse, for two reasons. One,
git reports an allocation error when the real error is
corruption. And two, the program dies unconditionally, so
you cannot even run fsck (which would otherwise ignore the
broken object and keep going).

We could try to improve the heuristic to err on the side of
normal objects in the face of corruption, but there is
really little point. The experimental format is long-dead,
and was never enabled by default to begin with. We can
instead simply remove it. The only affected repository would
be one that explicitly set core.legacyheaders in 2007, and
then never repacked in the intervening 6 years.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King authored and Junio C Hamano committed Nov 21, 2013
1 parent becb433 commit b039718
Showing 19 changed files with 0 additions and 143 deletions.
74 changes: 0 additions & 74 deletions sha1_file.c
@@ -1372,51 +1372,6 @@ void *map_sha1_file(const unsigned char *sha1, unsigned long *size)
 	return map;
 }
 
-/*
- * There used to be a second loose object header format which
- * was meant to mimic the in-pack format, allowing for direct
- * copy of the object data. This format turned up not to be
- * really worth it and we no longer write loose objects in that
- * format.
- */
-static int experimental_loose_object(unsigned char *map)
-{
-	unsigned int word;
-
-	/*
-	 * We must determine if the buffer contains the standard
-	 * zlib-deflated stream or the experimental format based
-	 * on the in-pack object format. Compare the header byte
-	 * for each format:
-	 *
-	 * RFC1950 zlib w/ deflate : 0www1000 : 0 <= www <= 7
-	 * Experimental pack-based : Stttssss : ttt = 1,2,3,4
-	 *
-	 * If bit 7 is clear and bits 0-3 equal 8, the buffer MUST be
-	 * in standard loose-object format, UNLESS it is a Git-pack
-	 * format object *exactly* 8 bytes in size when inflated.
-	 *
-	 * However, RFC1950 also specifies that the 1st 16-bit word
-	 * must be divisible by 31 - this checksum tells us our buffer
-	 * is in the standard format, giving a false positive only if
-	 * the 1st word of the Git-pack format object happens to be
-	 * divisible by 31, ie:
-	 * ((byte0 * 256) + byte1) % 31 = 0
-	 * => 0ttt10000www1000 % 31 = 0
-	 *
-	 * As it happens, this case can only arise for www=3 & ttt=1
-	 * - ie, a Commit object, which would have to be 8 bytes in
-	 * size. As no Commit can be that small, we find that the
-	 * combination of these two criteria (bitmask & checksum)
-	 * can always correctly determine the buffer format.
-	 */
-	word = (map[0] << 8) + map[1];
-	if ((map[0] & 0x8F) == 0x08 && !(word % 31))
-		return 0;
-	else
-		return 1;
-}
-
 unsigned long unpack_object_header_buffer(const unsigned char *buf,
 		unsigned long len, enum object_type *type, unsigned long *sizep)
 {
@@ -1444,42 +1399,13 @@ unsigned long unpack_object_header_buffer(const unsigned char *buf,
 
 int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz)
 {
-	unsigned long size, used;
-	static const char valid_loose_object_type[8] = {
-		0, /* OBJ_EXT */
-		1, 1, 1, 1, /* "commit", "tree", "blob", "tag" */
-		0, /* "delta" and others are invalid in a loose object */
-	};
-	enum object_type type;
-
 	/* Get the data stream */
 	memset(stream, 0, sizeof(*stream));
 	stream->next_in = map;
 	stream->avail_in = mapsize;
 	stream->next_out = buffer;
 	stream->avail_out = bufsiz;
 
-	if (experimental_loose_object(map)) {
-		/*
-		 * The old experimental format we no longer produce;
-		 * we can still read it.
-		 */
-		used = unpack_object_header_buffer(map, mapsize, &type, &size);
-		if (!used || !valid_loose_object_type[type])
-			return -1;
-		map += used;
-		mapsize -= used;
-
-		/* Set up the stream for the rest.. */
-		stream->next_in = map;
-		stream->avail_in = mapsize;
-		git_inflate_init(stream);
-
-		/* And generate the fake traditional header */
-		stream->total_out = 1 + snprintf(buffer, bufsiz, "%s %lu",
-						 typename(type), size);
-		return 0;
-	}
 	git_inflate_init(stream);
 	return git_inflate(stream, 0);
 }
66 changes: 0 additions & 66 deletions t/t1013-loose-object-format.sh

This file was deleted.

7 binary files not shown.
2 changes: 0 additions & 2 deletions t/t1013/objects/76/e7fa9941f4d5f97f64fea65a2cba436bc79cbb

This file was deleted.

8 binary files not shown.
1 change: 0 additions & 1 deletion t/t1013/objects/f8/16d5255855ac160652ee5253b06cd8ee14165a

This file was deleted.
