There are three commands with similar names: git reset, git restore and git revert.
- git-revert(1) is about making a new commit that reverts the changes made by other commits.
- git-restore(1) is about restoring files in the working tree from either the index or another commit. This command does not update your branch. The command can also be used to restore files in the index from another commit.
- git-reset(1) is about updating your branch, moving the tip in order to add or remove commits from the branch. This operation changes the commit history.
git reset can also be used to restore the index, overlapping with git restore.
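The difference between the three commands is easiest to see in a throwaway repository. The following is a minimal sketch, assuming Git 2.23 or later (for git restore); the scratch directory, file name and demo identity are placeholders and not part of the original text.

```shell
#!/bin/sh
# Sketch: contrast revert, restore and reset in a scratch repository.
set -e
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Demo"              # placeholder identity for the demo
git config user.email "demo@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -am "second"

# revert: history grows by one new commit that undoes "second"
git revert --no-edit HEAD >/dev/null
after_revert=$(git rev-list --count HEAD)
content_after_revert=$(cat file.txt)

# restore: only the working tree changes, the branch tip stays where it is
tip_before=$(git rev-parse HEAD)
echo scratch > file.txt
git restore file.txt
tip_after=$(git rev-parse HEAD)

# reset: the branch tip moves back, dropping the revert commit from the branch
git reset -q --hard HEAD~1
after_reset=$(git rev-list --count HEAD)
content_after_reset=$(cat file.txt)

echo "commits after revert: $after_revert, after reset: $after_reset"
```

After the revert the branch has three commits and the file is back to its first version; after the reset the branch has two commits again and the file holds the second version.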
redo the last commit
One of the common undos takes place when you commit too early and possibly forget to add some files, or you mess up your commit message. If you want to redo that commit, make the additional changes you forgot, stage them, and commit again using the --amend option:
git commit --amend
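As a rough illustration of the amend workflow, the sketch below commits too early, stages the forgotten file and amends. The file names and the scratch repository are hypothetical; --no-edit is used only so the script runs without opening an editor.

```shell
#!/bin/sh
# Sketch: fix a commit that forgot a file, using --amend.
set -e
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Demo"              # placeholder identity for the demo
git config user.email "demo@example.com"

echo a > a.txt
echo b > b.txt
git add a.txt
git commit -q -m "Add a and b"           # oops: b.txt was never staged
old_tip=$(git rev-parse HEAD)

git add b.txt
git commit -q --amend --no-edit          # omit --no-edit to rewrite the message as well
new_tip=$(git rev-parse HEAD)
commit_count=$(git rev-list --count HEAD)

echo "commits on branch: $commit_count"
git ls-tree --name-only HEAD
```

The branch still has a single commit, but its ID changes: --amend replaces the old commit with a new one rather than editing it in place.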
Unstaging a Staged File
git reset HEAD xxx
Unmodifying a Modified File
git checkout -- xxx
git restore
Git version 2.23.0 introduced a new command: git restore. It’s basically an alternative to git reset which we just covered. From Git version 2.23.0 onwards, Git suggests git restore instead of git reset for many undo operations, for example in the hints printed by git status.
Unstaging a Staged File
git restore --staged xxx
Unmodifying a Modified File
git restore xxx
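The two git restore forms can be tried side by side with their older spellings. This is a sketch in a scratch repository, assuming Git 2.23 or later; notes.txt stands in for the xxx placeholder used above.

```shell
#!/bin/sh
# Sketch: unstage a file, then discard its working-tree change.
set -e
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Demo"              # placeholder identity for the demo
git config user.email "demo@example.com"

echo v1 > notes.txt
git add notes.txt
git commit -q -m "initial"

# Stage an edit, then unstage it; the edit itself stays in the working tree.
echo v2 > notes.txt
git add notes.txt
git restore --staged notes.txt           # older spelling: git reset HEAD notes.txt
staged=$(git diff --cached --name-only)
worktree=$(cat notes.txt)

# Discard the working-tree edit; this cannot be undone, the edit was never committed.
git restore notes.txt                    # older spelling: git checkout -- notes.txt
restored=$(cat notes.txt)

echo "after unstaging: $worktree, after restoring: $restored"
```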
Data Recovery
Remember, anything that is committed in Git can almost always be recovered. Even commits that were on branches that were deleted or commits that were overwritten with an --amend commit can be recovered (see the Data Recovery link under References). However, anything you lose that was never committed is likely never to be seen again.
garbage collection
Occasionally, Git automatically runs a command called “auto gc”. Most of the time, this command does nothing. However, if there are too many loose objects (objects not in a packfile) or too many packfiles, Git launches a full-fledged git gc command. The “gc” stands for garbage collect, and the command does a number of things: it gathers up all the loose objects and places them in packfiles, it consolidates packfiles into one big packfile, and it removes objects that aren’t reachable from any commit and are a few months old.
# Again, this generally does nothing. You must have around 7,000 loose objects or more than 50 packfiles for Git to fire up a real gc command. You can modify these limits with the gc.auto and gc.autopacklimit config settings, respectively.
git gc --auto
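To make the loose-object and packfile counts concrete, the sketch below sets the two thresholds for one scratch repository and compares git count-objects -v before and after an explicit git gc. The threshold values are arbitrary choices for the demo, not recommendations.

```shell
#!/bin/sh
# Sketch: inspect loose objects and packfiles around a manual gc.
set -e
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Demo"              # placeholder identity for the demo
git config user.email "demo@example.com"

# Lower the auto-gc thresholds for this repository only (arbitrary demo values).
git config gc.auto 100
git config gc.autoPackLimit 10

echo hello > hello.txt
git add hello.txt
git commit -q -m "first"

# One blob, one tree and one commit now exist as loose objects.
loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')

git gc -q                                # an explicit gc packs regardless of the thresholds

loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packs_after=$(git count-objects -v | awk '/^packs:/ {print $2}')

echo "loose before: $loose_before, loose after: $loose_after, packs: $packs_after"
```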
git reflog
git reflog
git log -g
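One way the reflog can be used for recovery is sketched below: a commit is dropped from the branch with a hard reset and then found again through HEAD@{1}. The branch name recover-branch and the scratch repository are hypothetical.

```shell
#!/bin/sh
# Sketch: recover a commit lost to a hard reset by way of the reflog.
set -e
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Demo"              # placeholder identity for the demo
git config user.email "demo@example.com"

echo 1 > f.txt
git add f.txt
git commit -q -m "first"
echo 2 > f.txt
git commit -q -am "second"
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~1               # "second" is no longer on the branch

# HEAD@{1} is where HEAD pointed before its most recent move (the reset).
found=$(git rev-parse 'HEAD@{1}')
git branch recover-branch "$found"       # hypothetical branch name
recovered_subject=$(git log -1 --format=%s recover-branch)

echo "recovered commit: $recovered_subject"
```

In a real repository you would normally read the git reflog output first and pick the entry you want, since HEAD@{1} is only correct when the reset was the last thing that moved HEAD.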
delete the log
rm -rf .git/logs
git fsck
Because the reflog data is kept in the .git/logs/ directory, deleting it means you effectively have no reflog. How can you recover a lost commit at this point? One way is to use the git fsck utility, which checks your database for integrity. If you run it with the --full option, it shows you all objects that aren’t pointed to by another object:
git fsck --full
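The sketch below walks through that situation end to end: a commit loses its only branch, the reflog is deleted, and git fsck --full reports it as a dangling commit. The branch names topic and recovered are hypothetical, and the awk filter assumes exactly one dangling commit, which holds in this scratch repository but not necessarily in a real one.

```shell
#!/bin/sh
# Sketch: find a commit with git fsck after both its branch and the reflog are gone.
set -e
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Demo"              # placeholder identity for the demo
git config user.email "demo@example.com"

echo 1 > f.txt
git add f.txt
git commit -q -m "first"

git checkout -q -b topic                 # hypothetical branch name
echo 2 > f.txt
git commit -q -am "second"
lost=$(git rev-parse HEAD)

git checkout -q -                        # back to the initial branch
git branch -q -D topic                   # the only ref to "second" is gone
rm -rf .git/logs                         # and so is the reflog

# fsck lists the unreachable commit as "dangling commit <sha>".
dangling=$(git fsck --full 2>/dev/null | awk '/^dangling commit/ {print $3}')
git branch recovered "$dangling"         # hypothetical branch name
recovered_subject=$(git log -1 --format=%s recovered)

echo "recovered commit: $recovered_subject"
```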
Deleting Objects
There are a lot of great things about Git, but one feature that can cause issues is the fact that a git clone downloads the entire history of the project, including every version of every file. This is fine if the whole thing is source code, because Git is highly optimized to compress that data efficiently. However, if someone at any point in the history of your project added a single huge file, every clone for all time will be forced to download that large file, even if it was removed from the project in the very next commit. Because it’s reachable from the history, it will always be there.
This can be a huge problem when you’re converting Subversion or Perforce repositories into Git. Because you don’t download the whole history in those systems, this type of addition carries few consequences. If you did an import from another system or otherwise find that your repository is much larger than it should be, here is how you can find and remove large objects.
# check how much space you are using
git count-objects -v
# First you have to find the large object. In this walkthrough you already know which file it is, but suppose you didn’t; how would you identify what file or files were taking up so much space? If you run git gc, all the objects are in a packfile; you can identify the big objects by running another plumbing command called git verify-pack and sorting on the third field in the output, which is file size. You can also pipe it through the tail command because you’re only interested in the last few largest files:
git verify-pack -v .git/objects/pack/pack-29…69.idx \
| sort -k 3 -n \
| tail -3
# To find out what file the big blob is (82c99a3 is its abbreviated SHA-1 in this example), use the rev-list command. If you pass --objects to rev-list, it lists all the commit SHA-1s and also the blob SHA-1s with the file paths associated with them.
git rev-list --objects --all | grep 82c99a3
# see what commits modified this file
git log --oneline --branches -- git.tgz
# The oldest commit in that log (7b30847 in this example) is the one that added the file. You must rewrite all the commits downstream from 7b30847 to fully remove this file from your Git history.
git filter-branch --index-filter \
'git rm --ignore-unmatch --cached git.tgz' -- 7b30847^..
# Your history no longer contains a reference to that file. However, your reflog and a new set of refs that Git added when you did the filter-branch under .git/refs/original still do, so you have to remove them and then repack the database. You need to get rid of anything that has a pointer to those old commits before you repack:
rm -Rf .git/refs/original
rm -Rf .git/logs/
git gc
# The packed repository size is down to 8K, which is much better than 5MB. You can see from the size value that the big object is still in your loose objects, so it’s not gone; but it won’t be transferred on a push or subsequent clone, which is what is important. If you really wanted to, you could remove the object completely by running git prune with the --expire option:
git prune --expire now
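As an alternative to reading a specific pack index, the largest blobs can also be listed straight from history. The sketch below swaps in a different technique from the one above: it pipes git rev-list --objects --all into git cat-file --batch-check, which works for loose and packed objects alike. The file names and the 100000-byte stand-in file are hypothetical.

```shell
#!/bin/sh
# Sketch: list the largest blob reachable from any ref, without naming a pack file.
set -e
repo=$(mktemp -d)                        # hypothetical scratch directory
cd "$repo"
git init -q
git config user.name "Demo"              # placeholder identity for the demo
git config user.email "demo@example.com"

echo small > small.txt
head -c 100000 /dev/zero > big.bin       # stand-in for an accidentally committed large file
git add small.txt big.bin
git commit -q -m "add files"
git rm -q big.bin
git commit -q -m "remove big file"       # the blob is still reachable from history

# Columns: type, SHA-1, size in bytes, path. Keep blobs only and sort by size.
largest=$(
  git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  awk '$1 == "blob"' |
  sort -k 3 -n |
  tail -1
)
largest_path=$(echo "$largest" | awk '{print $4}')
largest_size=$(echo "$largest" | awk '{print $3}')

echo "largest blob: $largest_path ($largest_size bytes)"
```

Note that %(objectsize) is the uncompressed size, so the ranking can differ from the on-disk sizes that git verify-pack reports.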
References
https://git-scm.com/book/en/v2/Git-Basics-Undoing-Things
https://git-scm.com/book/en/v2/Git-Internals-Maintenance-and-Data-Recovery#_data_recovery