I was working on a repository on my GitHub account and this is a problem I stumbled upon.
- Node.js project with a folder with a few npm packages installed
- The packages were in the node_modules folder
- Added that folder to the Git repository and pushed the code to GitHub (wasn't thinking about the npm part at that time)
- Realized that you don't really need that folder to be a part of the code
- Deleted that folder, pushed it
At that point, the total size of the Git repository was around 6 MB, whereas the actual code (everything except that folder) was only around 300 KB.
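For anyone who wants to reproduce the situation, here is a minimal sketch in a throwaway repository. The file names, the 1 MB stand-in file, and the commit messages are all hypothetical; it only illustrates that deleting the folder in a later commit does not remove it from history.

```shell
#!/bin/sh
# Sketch: reproduce the problem in a temporary repository.
# All names below (package.bin, index.js, demo identity) are made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir node_modules
# Stand-in for installed packages: about 1 MB of incompressible data
head -c 1048576 /dev/urandom > node_modules/package.bin
echo "console.log('hi');" > index.js
git add .
git commit -q -m "Add code and node_modules"

# Delete the folder and commit the deletion
git rm -r -q node_modules
git commit -q -m "Remove node_modules"

# Every path that ever appeared in history, including the deleted one
history_paths=$(git log --all --pretty=format: --name-only | sort -u)
echo "$history_paths"

# Size of the object database, which still contains the deleted file
git count-objects -vH
```

The working tree no longer has the folder, but the listing of historical paths still includes it, and the reported object size stays at roughly the size of the deleted file.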
What I'm ultimately looking for is a way to remove that package folder from Git's history, so that someone cloning the repository doesn't have to download 6 MB of history when the files in the latest commit amount to only about 300 KB.
I looked up possible solutions for this and tried these methods:
- Remove file from git repository (history)
- http://help.github.com/remove-sensitive-data/ (now (effectively) broken)
- bartlomiejdanek/git-remove-file.sh
The Gist seemed to work: after I ran the script, it showed that it had removed the folder and that 50 different commits were modified. But it didn't let me push that code. When I tried to push, it said Branch up to date, even though git status showed that 50 commits were modified. The other two methods didn't help either.
Even though it showed that the folder's history had been removed, when I checked the size of the repository on my local machine, it was still around 6 MB. (I also deleted the refs/original folder but saw no change in the size of the repository.)
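To make the size check concrete, this is a sketch of how one might look at what could still be holding on to the old objects. The function name inspect_leftovers is hypothetical, and it only reads from the repository; it is not a fix. It relies on two facts: git filter-branch stores backup refs under refs/original/, and reflog entries keep rewritten commits reachable until they expire.

```shell
# Sketch: list things that may still reference the old objects.
# Run from inside the repository; nothing here modifies it.
inspect_leftovers() {
    # Backup refs written by git filter-branch, if any remain
    git for-each-ref refs/original/
    # Recent reflog entries, which also keep old commits reachable
    git reflog show --all | head -n 20
    # Loose and packed object counts and sizes
    git count-objects -vH
}
```

Calling inspect_leftovers before and after a history rewrite gives a like-for-like comparison of the size-pack and size values.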
What I want to clarify is whether there's a way to get rid of not only the commit history (which is the only thing I think happened) but also the files Git keeps around in case someone wants to roll back.
Suppose a solution is presented and works on my local machine but can't be reproduced on the GitHub repository. Would it be possible to clone that repository, roll back to the first commit, perform the trick, and push it? Or would Git still keep a history of all those commits (i.e., the 6 MB)?
My end goal is to find the best way to remove the folder's contents from Git so that a user doesn't have to download 6 MB worth of data, while still keeping in Git's history the other commits that never touched the modules folder (that's pretty much all of them).
How can I do this?