filter-branch: Big syntax change; support rewriting multiple refs
We used to take the first non-option argument as the name for the new
branch.  This syntax is not extensible to support rewriting more than just
HEAD.

Instead, we now have the following syntax:

	git filter-branch [<filter options>...] [<rev-list options>]

All positive refs given in <rev-list options> are rewritten.  Yes,
in-place.  If a ref was changed, the original head is stored in
refs/original/$ref now, for your inspecting pleasure, in addition to the
reflogs (since it is easier to inspect "git show-ref | grep original" than
to inspect all the reflogs).

This commit also adds the --force option to remove .git-rewrite/ and all
refs from refs/original/ before filtering.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
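
As a rough illustration of the in-place behaviour described in the message (a sketch, not part of the commit): the throwaway repository, the commit messages, the message filter, and the function name `demo_rewrite_in_place` are all assumptions made for the example. `FILTER_BRANCH_SQUELCH_WARNING` is only relevant to much newer git versions, which warn before running filter-branch.

```shell
#!/bin/sh
# Sketch: rewrite the current branch in place, then list the backup refs.
# Assumes a git installation that ships filter-branch.
demo_rewrite_in_place() {
	repo=$(mktemp -d) || return 1
	(
		cd "$repo" &&
		git init -q . 2>/dev/null &&
		git config user.name "Example" &&
		git config user.email "example@example.com" &&
		echo one >file && git add file && git commit -q -m "first" &&
		echo two >file && git commit -q -a -m "second" &&
		# No branch name argument any more: HEAD is the positive ref,
		# so the current branch itself is rewritten.
		FILTER_BRANCH_SQUELCH_WARNING=1 \
		git filter-branch --msg-filter 'sed "s/^/[rewritten] /"' HEAD \
			>/dev/null 2>&1 &&
		# The pre-rewrite head is kept under refs/original/.
		git show-ref | grep original
	)
	status=$?
	rm -rf "$repo"
	return $status
}
```

Calling `demo_rewrite_in_place` should print one line naming a ref under `refs/original/refs/heads/`, which is the backup of the branch that was just rewritten.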
Johannes Schindelin authored and Junio C Hamano committed Jul 24, 2007
1 parent 3b38ec1 commit dfd05e3
Showing 3 changed files with 182 additions and 60 deletions.
51 changes: 28 additions & 23 deletions Documentation/git-filter-branch.txt
@@ -12,7 +12,7 @@ SYNOPSIS
[--index-filter <command>] [--parent-filter <command>]
[--msg-filter <command>] [--commit-filter <command>]
[--tag-name-filter <command>] [--subdirectory-filter <directory>]
[-d <directory>] <new-branch-name> [<rev-list options>...]
[-d <directory>] [-f | --force] [<rev-list options>...]

DESCRIPTION
-----------
@@ -26,10 +26,9 @@ information) will be preserved.
The command takes the new branch name as a mandatory argument and
the filters as optional arguments. If you specify no filters, the
commits will be recommitted without any changes, which would normally
have no effect and result in the new branch pointing to the same
branch as your current branch. Nevertheless, this may be useful in
the future for compensating for some git bugs or such, therefore
such a usage is permitted.
have no effect. Nevertheless, this may be useful in the future for
compensating for some git bugs or such, therefore such a usage is
permitted.

*WARNING*! The rewritten history will have different object names for all
the objects and will not converge with the original branch. You will not
@@ -38,8 +37,9 @@ original branch. Please do not use this command if you do not know the
full implications, and avoid using it anyway, if a simple single commit
would suffice to fix your problem.

Always verify that the rewritten version is correct before disposing
the original branch.
Always verify that the rewritten version is correct: The original refs,
if different from the rewritten ones, will be stored in the namespace
'refs/original/'.

Note that since this operation is extensively I/O expensive, it might
be a good idea to redirect the temporary directory off-disk, e.g. on
@@ -142,6 +142,11 @@ definition impossible to preserve signatures at any rate.)
does this in the '.git-rewrite/' directory but you can override
that choice by this parameter.

-f\|--force::
`git filter-branch` refuses to start with an existing temporary
directory or when there are already refs starting with
'refs/original/', unless forced.

<rev-list-options>::
When options are given after the new branch name, they will
be passed to gitlink:git-rev-list[1]. Only commits in the resulting
@@ -156,14 +161,14 @@ Suppose you want to remove a file (containing confidential information
or copyright violation) from all commits:

-------------------------------------------------------
git filter-branch --tree-filter 'rm filename' newbranch
git filter-branch --tree-filter 'rm filename' HEAD
-------------------------------------------------------

A significantly faster version:

-------------------------------------------------------------------------------
git filter-branch --index-filter 'git update-index --remove filename' newbranch
-------------------------------------------------------------------------------
--------------------------------------------------------------------------
git filter-branch --index-filter 'git update-index --remove filename' HEAD
--------------------------------------------------------------------------

Now, you will get the rewritten history saved in the branch 'newbranch'
(your current branch is left untouched).
@@ -172,25 +177,25 @@ To set a commit (which typically is at the tip of another
history) to be the parent of the current initial commit, in
order to paste the other history behind the current history:

------------------------------------------------------------------------
git filter-branch --parent-filter 'sed "s/^\$/-p <graft-id>/"' newbranch
------------------------------------------------------------------------
-------------------------------------------------------------------
git filter-branch --parent-filter 'sed "s/^\$/-p <graft-id>/"' HEAD
-------------------------------------------------------------------

(if the parent string is empty - therefore we are dealing with the
initial commit - add graftcommit as a parent). Note that this assumes
history with a single root (that is, no merge without common ancestors
happened). If this is not the case, use:

-------------------------------------------------------------------------------
--------------------------------------------------------------------------
git filter-branch --parent-filter \
'cat; test $GIT_COMMIT = <commit-id> && echo "-p <graft-id>"' newbranch
-------------------------------------------------------------------------------
'cat; test $GIT_COMMIT = <commit-id> && echo "-p <graft-id>"' HEAD
--------------------------------------------------------------------------

or even simpler:

-----------------------------------------------
echo "$commit-id $graft-id" >> .git/info/grafts
git filter-branch newbranch $graft-id..
git filter-branch $graft-id..HEAD
-----------------------------------------------

To remove commits authored by "Darl McBribe" from the history:
@@ -208,7 +213,7 @@ git filter-branch --commit-filter '
done;
else
git commit-tree "$@";
fi' newbranch
fi' HEAD
------------------------------------------------------------------------------

The shift magic first throws away the tree id and then the -p
@@ -238,14 +243,14 @@ A--B-----C
To rewrite only commits D,E,F,G,H, but leave A, B and C alone, use:

--------------------------------
git filter-branch ... new-H C..H
git filter-branch ... C..H
--------------------------------

To rewrite commits E,F,G,H, use one of these:

----------------------------------------
git filter-branch ... new-H C..H --not D
git filter-branch ... new-H D..H --not C
git filter-branch ... C..H --not D
git filter-branch ... D..H --not C
----------------------------------------

To move the whole tree into a subdirectory, or remove it from there:
@@ -255,7 +260,7 @@ git filter-branch --index-filter \
'git ls-files -s | sed "s-\t-&newsubdir/-" |
GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info &&
mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' directorymoved
mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' HEAD
---------------------------------------------------------------
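
The `-f`/`--force` rule documented in this file (refuse to start while refs under 'refs/original/' exist, unless forced) can be sketched in isolation. This is a minimal illustration of the `case "$force,$name"` idiom; the function name `check_backup_ref` and its printed words are hypothetical, and it prints a decision instead of calling `die` or `git update-ref -d`.

```shell
#!/bin/sh
# Sketch: decide what to do with one ref name before filtering starts.
#   $1 = force flag ("t" or empty), $2 = ref name,
#   $3 = backup namespace (defaults to refs/original/)
check_backup_ref() {
	force=$1
	name=$2
	orig_namespace=${3:-refs/original/}
	# Joining both values with a comma lets one case statement
	# match on the flag and the ref prefix at the same time.
	case "$force,$name" in
	,$orig_namespace*)
		echo "refuse" ;;	# not forced, leftover backup ref
	t,$orig_namespace*)
		echo "delete" ;;	# forced, so the old backup ref is removed
	*)
		echo "keep" ;;		# any ref outside the namespace
	esac
}
```

For example, `check_backup_ref "" refs/original/refs/heads/master` prints `refuse`, while the same ref with `t` as the first argument prints `delete`.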


150 changes: 128 additions & 22 deletions git-filter-branch.sh
@@ -78,13 +78,20 @@ filter_msg=cat
filter_commit='git commit-tree "$@"'
filter_tag_name=
filter_subdir=
orig_namespace=refs/original/
force=
while case "$#" in 0) usage;; esac
do
case "$1" in
--)
shift
break
;;
--force|-f)
shift
force=t
continue
;;
-*)
;;
*)
@@ -126,24 +133,43 @@ do
--subdirectory-filter)
filter_subdir="$OPTARG"
;;
--original)
orig_namespace="$OPTARG"
;;
*)
usage
;;
esac
done

dstbranch="$1"
shift
test -n "$dstbranch" || die "missing branch name"
git show-ref "refs/heads/$dstbranch" 2> /dev/null &&
die "branch $dstbranch already exists"

test ! -e "$tempdir" || die "$tempdir already exists, please remove it"
case "$force" in
t)
rm -rf "$tempdir"
;;
'')
test -d "$tempdir" &&
die "$tempdir already exists, please remove it"
esac
mkdir -p "$tempdir/t" &&
tempdir="$(cd "$tempdir"; pwd)" &&
cd "$tempdir/t" &&
workdir="$(pwd)" ||
die ""

# Make sure refs/original is empty
git for-each-ref > "$tempdir"/backup-refs
while read sha1 type name
do
case "$force,$name" in
,$orig_namespace*)
die "Namespace $orig_namespace not empty"
;;
t,$orig_namespace*)
git update-ref -d "$name" $sha1
;;
esac
done < "$tempdir"/backup-refs

case "$GIT_DIR" in
/*)
;;
@@ -153,6 +179,29 @@ case "$GIT_DIR" in
esac
export GIT_DIR GIT_WORK_TREE=.

# These refs should be updated if their heads were rewritten

git rev-parse --revs-only --symbolic "$@" |
while read ref
do
# normalize ref
case "$ref" in
HEAD)
ref="$(git symbolic-ref "$ref")"
;;
refs/*)
;;
*)
ref="$(git for-each-ref --format='%(refname)' |
grep /"$ref")"
esac

git check-ref-format "$ref" && echo "$ref"
done > "$tempdir"/heads

test -s "$tempdir"/heads ||
die "Which ref do you want to rewrite?"

export GIT_INDEX_FILE="$(pwd)/../index"
git read-tree || die "Could not seed the index"

@@ -174,6 +223,8 @@ commits=$(wc -l <../revs | tr -d " ")

test $commits -eq 0 && die "Found nothing to rewrite"

# Rewrite the commits

i=0
while read commit parents; do
i=$(($i+1))
@@ -234,22 +285,75 @@ while read commit parents; do
$(git write-tree) $parentstr < ../message > ../map/$commit
done <../revs

src_head=$(tail -n 1 ../revs | sed -e 's/ .*//')
target_head=$(head -n 1 ../map/$src_head)
case "$target_head" in
'')
echo Nothing rewritten
# In case of a subdirectory filter, it is possible that a specified head
# is not in the set of rewritten commits, because it was pruned by the
# revision walker. Fix it by mapping these heads to the next rewritten
# ancestor(s), i.e. the boundaries in the set of rewritten commits.

# NEEDSWORK: we should sort the unmapped refs topologically first
while read ref
do
sha1=$(git rev-parse "$ref"^0)
test -f "$workdir"/../map/$sha1 && continue
# Assign the boundarie(s) in the set of rewritten commits
# as the replacement commit(s).
# (This would look a bit nicer if --not --stdin worked.)
for p in $((cd "$workdir"/../map; ls | sed "s/^/^/") |
git rev-list $ref --boundary --stdin |
sed -n "s/^-//p")
do
map $p >> "$workdir"/../map/$sha1
done
done < "$tempdir"/heads
# Finally update the refs
_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"
count=0
echo
while read ref
do
# avoid rewriting a ref twice
test -f "$orig_namespace$ref" && continue
sha1=$(git rev-parse "$ref"^0)
rewritten=$(map $sha1)
test $sha1 = "$rewritten" &&
warn "WARNING: Ref '$ref' is unchanged" &&
continue
case "$rewritten" in
'')
echo "Ref '$ref' was deleted"
git update-ref -m "filter-branch: delete" -d "$ref" $sha1 ||
die "Could not delete $ref"
;;
*)
git update-ref refs/heads/"$dstbranch" $target_head ||
die "Could not update $dstbranch with $target_head"
if [ $(wc -l <../map/$src_head) -gt 1 ]; then
echo "WARNING: Your commit filter caused the head commit to expand to several rewritten commits. Only the first such commit was recorded as the current $dstbranch head but you will need to resolve the situation now (probably by manually merging the other commits). These are all the commits:" >&2
sed 's/^/ /' ../map/$src_head >&2
ret=1
fi
$_x40)
echo "Ref '$ref' was rewritten"
git update-ref -m "filter-branch: rewrite" \
"$ref" $rewritten $sha1 ||
die "Could not rewrite $ref"
;;
esac
*)
# NEEDSWORK: possibly add -Werror, making this an error
warn "WARNING: '$ref' was rewritten into multiple commits:"
warn "$rewritten"
warn "WARNING: Ref '$ref' points to the first one now."
rewritten=$(echo "$rewritten" | head -n 1)
git update-ref -m "filter-branch: rewrite to first" \
"$ref" $rewritten $sha1 ||
die "Could not rewrite $ref"
;;
esac
git update-ref -m "filter-branch: backup" "$orig_namespace$ref" $sha1
count=$(($count+1))
done < "$tempdir"/heads

# TODO: This should possibly go, with the semantics that all positive given
# refs are updated, and their original heads stored in refs/original/
# Filter tags

if [ "$filter_tag_name" ]; then
git for-each-ref --format='%(objectname) %(objecttype) %(refname)' refs/tags |
@@ -286,6 +390,8 @@ fi

cd ../..
rm -rf "$tempdir"
printf "\nRewritten history saved to the $dstbranch branch\n"
echo
test $count -gt 0 && echo "These refs were rewritten:"
git show-ref | grep ^"$orig_namespace"

exit $ret
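
The three-way decision in the "Finally update the refs" loop (ref deleted, rewritten to one commit, or expanded into several) depends on matching the mapped value against the 40-hex-digit glob `$_x40`. The following is a standalone sketch of that classification only. The function name `classify_rewritten` and the words it prints are hypothetical, and it makes no `git update-ref` calls.

```shell
#!/bin/sh
# Build a glob that matches exactly 40 lowercase hex digits:
# 5 bracket expressions, repeated 8 times.
_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"

# Sketch: classify the mapped value of a ref's original commit.
#   $1 = contents of the map entry (empty, one sha1, or several lines)
classify_rewritten() {
	case "$1" in
	'')
		echo "deleted" ;;	# the commit was filtered away
	$_x40)
		echo "rewritten" ;;	# exactly one replacement commit
	*)
		echo "multiple" ;;	# anything else, e.g. several sha1 lines
	esac
}
```

A single object name such as `0123456789abcdef0123456789abcdef01234567` is classified as `rewritten`; two such names separated by a newline fall through to `multiple`, which is the case where the script warns and points the ref at the first commit.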