This week, I came across an article that talks about Microsoft encountering an issue with Git and version control. Microsoft software engineers use Git as their version control system, and they recently discovered a problem with the amount of memory being used in their repository. Their first commit was only 2 GB, then grew to 4. However, it soon was taking up 178 GB of memory. Obviously, this is a bit of an issue. It would be very difficult for a team to make commits with that much storage space in that amount of time. But how does this happen? Well, the article states that the issue stems from name-hash collisions. Basically, Git was finding a bunch of differences with each commit despite there not being many at all. Obviously, this is a huge issue, so steps are already being made to resolve it.
This is an issue that was particularly interesting to me because we are currently learning about Git and how to properly use it in my CS 348 class. Git is an incredibly useful and necessary tool in the computer science world. To see an issue this scale at a company this important grabbed my attention immediately. I was also curious to see how Git was used on large projects at a company like Microsoft.
Seeing that problems like this can occur was for sure interesting. However, what was more interesting to me was that by the time anyone could report on this, changes were already being made. The article does state that this issue would not affect smaller repos, which would be what I am dealing with at the moment. However, it is fascinating to see how on top of everything these developers are in order to assure a smooth running product. It makes me feel better knowing that these issues will be resolved by the time I could ever encounter them. It also motivates me knowing that some day, I will have to be this vigilant in my work to truly be successful. Also, the fact that over 90% of developers use Git for version control was intriguing. I knew it was popular, but I also was aware of other version control systems. This just goes to show how reliable Git is, despite some occasional issues such as this one. I am glad to be learning how to use this system, and I am excited to learn more about it going forward.
From the blog CS@Worcester – Auger CS by Joseph Auger and used with permission of the author. All other rights reserved by the author.