For the entirety of this course so far we have been working with and getting familiar with the version control tool git. In this week’s blog I will be writing about an article called “What is Git | Explore a Distributed Version Control Tool” by Reshma Ahmed. I chose this article in particular to see how git came into existence and the role it plays in companies, and to build on what we have learned in class.
From the article I learned about the different types of version control, namely centralized version control systems and distributed version control systems. Git was created in 2005 by Linus Torvalds and designed to “handle small to large projects with efficiency”. Git is a distributed version control system: because everyone has a full copy of the repository on their own machine, it guards against the corruption and crashes that can result when the repository is hosted on a single server, as in centralized version control systems. Reflecting back on the course, we have been practicing pushing to and pulling from repositories, working with branches, and adding and removing commits. The GitKit activities we worked on in class showed us practical issues that one may face with git, such as merge conflicts and issues involving different versions of commits.
The article describes some of the features of git that we have looked at in class, such as being open source and being secure. Open source software encourages transparency, collaboration, and accessibility, similar to the FOSS communities mentioned in the first GitKit activity done in class. The article reads “Git uses the SHA1 to name and identify objects within its repository”. This concept became apparent in the GitKit activities done in class when running commands such as “git log”, which shows a commit’s hash as well as the date the commit was made.
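To make the SHA-1 idea more concrete, here is a minimal sketch of how git names one kind of object, a file's contents (a "blob"). It assumes git's blob format, where the hash is taken over a short header ("blob", the size in bytes, and a null byte) followed by the content. The function name git_blob_hash is my own and is not part of git.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 name git would give a blob with this content.

    git_blob_hash is a hypothetical helper name used for illustration.
    """
    # Git hashes a header ("blob <size>\0") followed by the raw bytes,
    # not the raw bytes alone.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The empty blob has the same name in every repository.
    print(git_blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Running “git hash-object” on a file should print the same value as this function does for that file's bytes, which is why two identical files always end up with the same name in a repository.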
Git is used not only in the wider software development community but also plays an important role in industry, with more and more companies using git as their go-to version control system. These companies include tech giants such as “Facebook, Yahoo, Twitter, eBay, Salesforce, [and] Microsoft”, which shows the significance of git.
Reflecting on my own future use of git, one practical application in the field of machine learning and data science is using git to manage datasets to ensure reproducibility as well as data integrity. In addition, I plan on using git for my data mining course project when collaborating with my peers.
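As a rough sketch of what checking data integrity could look like next to a git repository, the snippet below records a checksum for every file in a dataset folder. The function names file_sha256 and build_manifest are hypothetical, and this is only one possible approach, not something taken from the article. The idea is that the resulting manifest could be committed, so any later change to a data file shows up as a changed checksum.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large datasets do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file's relative path under data_dir to its checksum."""
    root = Path(data_dir)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

Comparing a freshly built manifest with the committed one is a quick way to tell whether the data a result was produced from is still the same data.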
Links
https://www.edureka.co/blog/what-is-git/#companies_using_git
From the blog CS@Worcester – Anthony Duong CS Blog by anthony duong and used with permission of the author. All other rights reserved by the author.