Have you ever encountered the dreaded "file exceeds size limit" error when trying to push to GitLab, GitHub, or other Git hosting services? You're not alone! This is a common issue that developers face, especially when accidentally committing large build artifacts, dependencies, or media files to their repositories.
The Problem: Git Push Rejected Due to Large Files
Recently, while working on a Java project, I encountered this exact error:
remote: GitLab: You are attempting to check in one or more blobs which exceed the 100.0MiB limit:
remote:
remote: - 7fd3bc8c77bf9608054e674f2e69a02a7d73191c (106 MiB)
remote:
remote: To resolve this error, you must either reduce the size of the above blobs, or utilize LFS.
The issue was a GWT plugin ZIP file (GWT plugins/gwt-2.5.1.zip) that was over 100 MB, something that should never have been committed to the repository in the first place.
Step 1: Identify the Problematic File
When Git gives you a blob ID and the file is still present in the current commit (HEAD), you can find the exact path using:
git ls-tree -r HEAD | grep 7fd3bc8c77bf9608054e674f2e69a02a7d73191c
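Because git ls-tree -r HEAD only looks at the current commit, it prints nothing if the large file was already deleted in a later commit but still lives in history. A minimal sketch of a history-wide lookup follows; the function name find_blob_path is hypothetical and only there for illustration:

```shell
# Print every "<blob-id> <path>" entry in history that matches a blob ID.
# Usage (inside the repository): find_blob_path <blob-id>
find_blob_path() {
    # rev-list --objects prints "<object-id> <path>" for each object
    # reachable from any ref, so deleted files are included too
    git rev-list --objects --all | grep "^$1"
}

# Example:
#   find_blob_path 7fd3bc8c77bf9608054e674f2e69a02a7d73191c
```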
Alternative: Find All Large Files
If you want to audit your entire repository for large files:
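# Find all blobs in history larger than 50 MiB (52428800 bytes); paths with spaces are printed in full
git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | awk '$1 == "blob" && $3 > 52428800 {size = $3; sub(/^[^ ]+ [^ ]+ [^ ]+ /, ""); printf "%.1f MB %s\n", size/1024/1024, $0}' | sort -n
# Or check the working tree only (skipping Git's own .git directory)
find . -path ./.git -prune -o -type f -size +50M -exec ls -lh {} \;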
Step 2: Remove the File from Git History
Once you've identified the large file, you have several options to remove it completely from your Git history.
Method 1: git filter-repo (Modern Approach)
First, install git-filter-repo:
pip install git-filter-repo
Then remove the file:
git filter-repo --path "GWT plugins/gwt-2.5.1.zip" --invert-paths
Method 2: git filter-branch (Legacy but Reliable)
This is the method that worked perfectly in our case:
git filter-branch --force --index-filter \
'git rm --cached --ignore-unmatch "GWT plugins/gwt-2.5.1.zip"' \
--prune-empty --tag-name-filter cat -- --all
Why this works:
- --force: Overwrites existing filter-branch results
- --index-filter: Runs the command on the index (staging area)
- git rm --cached --ignore-unmatch: Removes the file from the index, ignoring commits where it doesn't exist
- --prune-empty: Removes commits that become empty after filtering
- --tag-name-filter cat: Preserves tag names
- -- --all: Applies to all branches and tags
Method 3: BFG Repo-Cleaner (Alternative)
Download BFG and run:
java -jar bfg.jar --delete-files "gwt-2.5.1.zip" your-repo.git
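Whichever method is used, it can help to confirm that the path is really gone before moving on. The sketch below checks local branches and tags only (filter-branch keeps its backups under refs/original until they are deleted, so those are deliberately not searched); the function name path_is_gone is an assumption made for this example:

```shell
# Succeeds (exit 0) only if no commit on any local branch or tag touches the path.
# Usage (inside the repository): path_is_gone "GWT plugins/gwt-2.5.1.zip"
path_is_gone() {
    # git log prints one line per commit that touched the path;
    # empty output means the path no longer appears in that history
    [ -z "$(git log --branches --tags --oneline -- "$1")" ]
}

# Example:
#   path_is_gone "GWT plugins/gwt-2.5.1.zip" && echo "removed from history"
```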
Step 3: Clean Up and Push
After removing the file from history:
Clean up Git references:
git for-each-ref --format="delete %(refname)" refs/original | git update-ref --stdin
git reflog expire --expire=now --all
git gc --prune=now --aggressive
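To see whether the cleanup actually shrank the repository, one option is to compare the packed object size before and after running it. This is a small sketch built on git count-objects; the wrapper name repo_pack_size is hypothetical:

```shell
# Print the on-disk size of packed objects in human-readable form.
# Usage (inside the repository): repo_pack_size
repo_pack_size() {
    # -v adds the detailed fields, -H prints sizes with units;
    # "size-pack" is the line worth comparing before and after cleanup
    git count-objects -vH | grep '^size-pack'
}
```

If the number barely changes, something is probably still referencing the old commits, for example a leftover ref or a stash.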
Force push to update remote:
git push origin --force --all
git push origin --force --tags
Step 4: Prevent Future Issues
Add problematic file types to .gitignore:
echo "GWT plugins/" >> .gitignore
echo "*.zip" >> .gitignore
echo "*.war" >> .gitignore
echo "*.jar" >> .gitignore
echo "build/" >> .gitignore
echo "target/" >> .gitignore
git add .gitignore
git commit -m "Add gitignore for build artifacts and large files"
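A .gitignore only helps for patterns you thought of in advance. As a second line of defense, a local pre-commit hook can reject any staged file above a size limit. The following is a rough sketch, not a drop-in hook: the 50 MiB default, the MAX_BYTES variable, and the function name check_staged_sizes are all assumptions for illustration.

```shell
#!/bin/sh
# Assumed threshold: 50 MiB unless MAX_BYTES is already set in the environment
MAX_BYTES=${MAX_BYTES:-52428800}

# Fails (non-zero exit) at the first staged file larger than MAX_BYTES.
check_staged_sizes() {
    # List staged paths that were added, copied, or modified
    git diff --cached --name-only --diff-filter=ACM |
    while IFS= read -r path; do
        # ":<path>" names the staged version of the file; -s prints its size in bytes
        size=$(git cat-file -s ":$path")
        if [ "$size" -gt "$MAX_BYTES" ]; then
            echo "Refusing to commit $path ($size bytes exceeds $MAX_BYTES)" >&2
            exit 1   # leaves the pipeline subshell, which makes the function fail
        fi
    done
}
```

To try it as a hook, the script would go in .git/hooks/pre-commit (made executable) with a call to check_staged_sizes as its last line. Hooks are not shared through the repository, so each team member would need to install it.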
Common Large Files to Avoid in Git
- Build artifacts: .war, .jar, .ear files
- Dependencies: Node modules, Maven dependencies, Python packages
- Media files: Large images, videos, audio files
- Database dumps: SQL files, database backups
- IDE files: Large project files, caches
- Compressed archives: .zip, .tar.gz, .rar files
Alternative Solutions
Git LFS (Large File Storage)
If you need to track large files:
git lfs install
git lfs track "*.zip"
git add .gitattributes
git add your-large-file.zip
git commit -m "Add large file with LFS"
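git lfs track works by writing a filter=lfs attribute for the pattern into .gitattributes. Before committing a large file, you can check that its path really matches that attribute using plain Git. A small sketch, with the hypothetical function name is_lfs_tracked:

```shell
# Succeeds if .gitattributes routes the given path through the LFS filter.
# Usage (inside the repository): is_lfs_tracked your-large-file.zip
is_lfs_tracked() {
    # check-attr prints "<path>: filter: <value>"; LFS-tracked paths report "lfs"
    git check-attr filter -- "$1" | grep -q ': filter: lfs$'
}

# Example:
#   is_lfs_tracked your-large-file.zip || echo "not tracked by LFS yet"
```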
External Storage
Consider storing large files in:
- Cloud storage (AWS S3, Google Cloud Storage)
- Artifact repositories (Nexus, Artifactory)
- CDNs for media files
Important Warnings
⚠️ Before rewriting Git history:
- Create a backup of your repository
- Coordinate with your team - they'll need to re-clone after force pushing
- Understand the impact - this changes commit hashes and can break existing pull requests
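For the backup step, one simple option is a mirror clone, which copies every branch, tag, and other ref into a separate bare repository. The sketch below assumes that approach; the function name backup_repo and the example paths are placeholders:

```shell
# Create a full backup of a repository as a bare mirror clone.
# Usage: backup_repo <repo-path> <backup-path>
backup_repo() {
    # --mirror copies all refs (branches, tags, notes), not just the current branch
    git clone --quiet --mirror "$1" "$2"
}

# Example:
#   backup_repo . ../my-project-backup.git
```

If the rewrite goes wrong, the original history can still be fetched or cloned from that backup.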
⚠️ Force pushing considerations:
- Only force push to branches you own
- Never force push to main/master without team agreement
- Consider using --force-with-lease for safer force pushing
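With --force-with-lease, Git refuses the push if the remote branch has moved since your last fetch, so you are less likely to overwrite a teammate's new commits. A minimal sketch for a single branch; the wrapper name safe_force_push is hypothetical:

```shell
# Force-push one branch, but only if the remote branch still matches
# the local remote-tracking ref (nobody else pushed in the meantime).
# Usage (inside the repository): safe_force_push <remote> <branch>
safe_force_push() {
    git push --force-with-lease "$1" "$2"
}

# Example:
#   safe_force_push origin my-feature-branch
```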
Conclusion
Large files in Git repositories are a common problem, but they're easily solvable with the right tools. The git filter-branch command proved to be the most reliable solution for completely removing the problematic GWT plugin file from the repository history.
Key takeaways:
- Always use .gitignore to prevent committing large files
- Regular repository audits can catch issues early
- git filter-branch is a powerful tool for cleaning Git history
- Consider Git LFS for legitimate large file needs
- Always backup before rewriting history
Remember, the best approach is prevention: set up proper .gitignore files from the start and educate your team about what should and shouldn't be committed to version control.
Have you encountered similar issues with large files in Git? What solutions worked best for your team? Share your experiences in the comments below!
This guide was based on a real-world scenario where a 106MB GWT plugin ZIP file was accidentally committed to a Java project repository. The git filter-branch solution successfully resolved the issue and allowed the code to be pushed to GitLab.