
Encapsulate What Varies

When we write code, we try to think ahead to the changes we may need to implement in the future. There are many ways to handle these changes: slapping together a quick patch, methodically going through the code and updating every affected part, or writing the code so that anticipated changes can be added with just one or two small adjustments. This last approach is what “encapsulate what varies” means: by isolating the parts of the code most likely to change, we can save ourselves time down the road. I found an article that does a good job explaining this concept, and while reading through it I was reminded of a recent project where encapsulation ended up saving me a lot of time and headaches.

The specific event the article brought to mind occurred during my most recent internship. One of the projects I worked on was a script that would automatically assemble 3D CAD models of any of the systems the company was working on at the time. This script needed to read the system specifications from a database, organize that data, identify key parts of the system, and figure out how it was assembled so that it could send those instructions to the CAD software and create the 3D model. It was a big project, and the other intern and I were daunted by the amount of ever-changing data that would need to be accounted for. Many systems were of a unique design, so we couldn't use the exact same code for every system. The engineers we were attached to for this internship introduced us to Python dataclasses. These essentially allowed us to structure the parts of our code that we knew would be subject to change so that adding or removing certain data points from the database wouldn't break the overall program. When a change arises, we only need to alter the related dataclass for the rest of the code to work with it. Without dataclasses we would have had to create new methods or classes for each unique change every time it came up, which is not something anyone wanted. I am glad I learned a way of “encapsulating what varies,” since I can now write better, more future-proof code by isolating the parts I believe will change most often.
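
Here is a minimal sketch of the idea with hypothetical field names (the real system data was far more complex): the varying system data lives in one dataclass, so adding or removing a field only touches this one definition rather than every function that consumes it.

```python
from dataclasses import dataclass, field

# All of the data that varies between systems is isolated here.
# Adding or removing a field only requires changing this class,
# not every function that consumes system data.
@dataclass
class SystemSpec:
    name: str
    part_count: int
    materials: list[str] = field(default_factory=list)

def summarize(spec: SystemSpec) -> str:
    # Code like this keeps working as long as the fields it uses exist.
    return f"{spec.name}: {spec.part_count} parts"

spec = SystemSpec(name="Conveyor-A", part_count=42, materials=["steel"])
print(summarize(spec))
```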

https://alexkondov.com/encapsulate-what-varies/

From the blog CS@Worcester – Sebastian's CS Blog by sserafin1 and used with permission of the author. All other rights reserved by the author.

Don't Repeat Yourself

DRY, or “Don't Repeat Yourself,” is an approach to writing code that emphasizes avoiding repetition. In other words, it tells developers to write a method for anything they think they might need in more than one place. For example, imagine you are writing a program where several parts of the code need to do similar things. One way to approach this is to write a fresh segment of code for the task each time the need comes up. While this would work in practice, it is far from the best approach. Creating a single method that achieves the goal and can be called wherever it is needed is a far better and more time-efficient solution. GeeksforGeeks has a great, concise article about this, and even gives some examples using Java code.
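
As a minimal sketch of the idea (a made-up example in Python, not one of the article's Java examples), here is a computation written once as a function instead of pasted everywhere it is needed:

```python
# Without DRY, this tax computation would be copy-pasted into every
# place a price is displayed. With DRY, it is written once and called.
TAX_RATE = 0.0625

def price_with_tax(price: float) -> float:
    """Return price plus sales tax, rounded to cents."""
    return round(price * (1 + TAX_RATE), 2)

print(price_with_tax(10.00))   # 10.62
print(price_with_tax(4.99))    # 5.3
```

If the tax rate ever changes, only one line of the program has to change.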

And that is really all there is to “Don't Repeat Yourself.” It's a straightforward rule that keeps developers from wasting time writing repetitive code snippets. While it may seem simple to apply, I know for a fact that I have plenty of experience writing repetitive code, especially during my first internship. The issue came down to ever-changing project requirements and my need to adjust my code to meet them. In doing that, I definitely wrote repetitive code that could have been its own class or function; because I was working across many different files and pieces of the code, it didn't occur to me at first that some of it could be written as one method and called as needed. Eventually, while polishing up the code, I realized this mistake and corrected it. I wrote functions that accommodated most of what the repetitive code was supposed to do and replaced that code with calls to these new methods. This ended up causing many small bugs to pop up, however, and I had to spend more time finding and fixing them. Had I slowed down when writing my code, I would have been able to plan ahead and create these functions from the get-go, saving me time and energy in the long run. Going forward, I try to be more careful with the code I write and think ahead to what may need to be reused. Once I figure that out, I can create a function for it and save myself time and energy later on.

https://www.geeksforgeeks.org/dry-dont-repeat-yourself-principle-in-java-with-examples/

From the blog CS@Worcester – Sebastian's CS Blog by sserafin1 and used with permission of the author. All other rights reserved by the author.

Anti-Patterns

As computer science students, many of us, myself included, had never really written much code before starting classes here. And since many of us were beginners at writing code and the practices that come with it, our code tended to be an affront to the eyes of anyone unfortunate enough to have to read it. As it turns out, many of the issues with our code actually have names and are usually grouped under the umbrella term of anti-patterns. I found out about anti-patterns back when we were covering design smells in class, and while thinking about what to write for a blog entry this week I decided to go more in depth on some of the different types of anti-patterns out there, many of which I have definitely been guilty of in the past. GeeksforGeeks has a great article covering what anti-patterns are and the different types of them. There were too many in that article to cover in this post without making it too long, so I will go over the ones I have committed most frequently (although I have almost certainly done every one in that article at some point).

First and foremost is spaghetti code, which is basically a term for messy code: code that works but is incredibly disorganized, difficult to read, and overall unmaintainable. I can confidently say that many of my first personal projects and class assignments suffered from this. My code would work, but God forbid you wanted to quickly add or remove a feature without having to interpret what the code was originally meant to do. Thankfully, with time I have gotten better at organizing my code and making it much more readable and maintainable.

Another anti-pattern I have committed plenty of times, and one that relates to a previous blog post of mine, is the “boat anchor.” This is essentially useless code within your program, also referred to as “dead code”: code that you wrote thinking it would be useful, but that ended up just taking up space. This is the very issue that YAGNI tries to address. There are still functions in many of my projects that I thought would have a purpose but now just take up space. I tend to do this less now, although I can't say I don't slip up every now and then.
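
For a concrete (entirely made-up) illustration of a boat anchor, here is a helper that nothing ever calls. It runs fine, which is exactly why it can sit around unnoticed, quietly taking up space:

```python
def load_data(path: str) -> list[str]:
    """Actually used by the program."""
    with open(path) as f:
        return f.read().splitlines()

def export_to_xml(rows: list[str]) -> str:
    """Boat anchor: written "just in case" and never called anywhere.
    YAGNI says delete it until a real requirement appears."""
    return "<rows>" + "".join(f"<row>{r}</row>" for r in rows) + "</rows>"
```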

https://www.geeksforgeeks.org/6-types-of-anti-patterns-to-avoid-in-software-development/

From the blog CS@Worcester – Sebastian's CS Blog by sserafin1 and used with permission of the author. All other rights reserved by the author.

YAGNI, or are you?

Surprise: you probably won't. Let's back up a step, though, and talk about what that acronym actually means. YAGNI stands for “You Aren't Gonna Need It” and is an important principle to consider in software development. I heard about YAGNI in our class and wanted to know more about what exactly it consists of and what its purpose is. That's where I found a good article by scaleyourapp.com, which explains it well. In essence, YAGNI means that you shouldn't create something until it's actually necessary for your application. After all, why would you want to waste time on things that might never be needed? As many people who have written code know, we often write a function or class expecting to need it in the future, only to never actually use it. That leads to wasted space in the program, wasted time writing the unneeded code, and then having to decide whether to keep it or dump it to save space.

I have plenty of personal experience with this in my own projects and work. One example that comes to mind: during a 36-hour hackathon last spring, my teammate and I spent many hours trying to figure out a good way to incorporate a login/user-tracking system into the app we were writing. While this feature would EVENTUALLY be needed, it was far from crucial for a basic proof-of-concept app. Because we didn't follow YAGNI principles, we wasted time that could have been spent on any of the myriad other tasks we had to do for the rest of the app. Those several wasted hours meant we couldn't add other needed features, and the app was not as complete as it could have been given the time constraints.

Even before hearing about YAGNI as an idea, that experience left my partner and me with the realization that there are much more efficient ways to approach writing code than what we attempted. Since then I have been much better about not writing code until it's needed; however, I still have instances where I create a function or class that realistically doesn't need to exist. It can be difficult to figure out what is needed and what isn't, especially at the beginning of a project. I always assume I am going to need one feature or another, so I begin writing it, only to find out a little while later that it's not really that important at the moment. While I still start writing it, I have gotten better at stopping before I go too far and waste too much time. That way I can spend more time and energy focusing on the more important aspects of the program I am writing.

From the blog CS@Worcester – Sebastian's CS Blog by sserafin1 and used with permission of the author. All other rights reserved by the author.

Training a Machine to Learn

Machine learning: it's a buzzword that many people like to throw around when discussing a myriad of tech-related subjects, often as a way to convey that this product or that service is “state of the art” and “cutting edge” because it uses machine learning. While in some instances that can absolutely be the case, in most modern applications it's a bit of an oversell. Regardless, teaching a computer to do a task, and then to improve at it as more data comes in, is no small feat and takes a lot of know-how to do properly and efficiently. In this blog post I hope to share what I have learned about machine learning and how it works, to help others who find themselves in a similar spot to mine: somewhat understanding how it works, but not knowing the specifics behind it.

So how does machine learning work? In essence, machine learning is a set of techniques that give computers the ability to learn and improve at a task without being specifically programmed to do so. This is what most people familiar with the term understand it to be. But as with many things, we can go deeper. The underlying science of actually having the machine learn its task is quite complex, and I won't pretend that I taught myself the math and logic behind it in the time I spent researching this blog post. Simply put, to start the process you need the computer to run through its task over and over again while you introduce more data points and allow it to sort them out on its own.

This is not as hands-off as many people may think. Just because you wrote the initial algorithm doesn't mean you can run it on a dataset and call it a day. Not only do you have to procure and categorize the initial training datasets, you also need to guide the machine through its initial learning process. And I'm not talking about a dataset of tens or even a few hundred data points: for a machine learning algorithm to be successful you realistically need thousands of data points in each set. The datasets also need to vary in their content to give the algorithm the best chance at learning its task.

Take a common use of machine learning, image identification. Let's say you want to teach a computer to identify photos that contain a monarch butterfly. You write your algorithm, and now you need to procure a dataset to teach it. What kinds of photos should you train the algorithm on? The obvious answer is photos containing monarch butterflies. While this approach is sound, it leaves the door open for false positives and false negatives. Realistically you need a balance: photos containing monarch butterflies, photos not containing them, photos containing things that look like but aren't monarch butterflies, photos containing other species of butterflies, and so on. Keep in mind that you realistically need thousands of data points, and each one in the training dataset needs to be categorized by a human first. Take each of those data variants, multiply them by at least a thousand, and you can see how teaching a machine to do something becomes very difficult. This isn't to say you can't teach a machine with much less data; it just won't be as accurate or reliable as a machine taught on a much, much larger dataset.

With all that data in hand, the monarch-butterfly-identifying machine can begin to chew through it, identify patterns between the images, and form its own method of categorizing the initial training images. There is an entire postgraduate field of study dedicated to what actually happens INSIDE the algorithm, so I will definitely not be covering the ins and outs of that here. Simply put, the algorithm comes up with its own way to categorize these images and applies that method to any new data that comes in, continuously adjusting its own model with each additional data point. This, in its absolute simplest form, is how many machine learning algorithms work.
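
As a toy sketch of that last idea (nothing like a real image classifier, and with made-up numbers), here is a tiny model that nudges its parameters with every labeled data point it sees:

```python
# A toy one-feature classifier: predict label 1 if w*x + b > 0.
# Each labeled example nudges w and b toward the correct answer,
# which is the "continuously adjusting its own model with each
# additional data point" idea in miniature.
def train(data, epochs=20, lr=0.1):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, label in data:
            prediction = 1 if w * x + b > 0 else 0
            error = label - prediction      # -1, 0, or 1
            w += lr * error * x             # adjust only when wrong
            b += lr * error
    return w, b

# Made-up labeled dataset: inputs above ~2.0 are labeled 1.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.5, 1), (3.0, 1), (4.0, 1)]
w, b = train(data)
print([1 if w * x + b > 0 else 0 for x, _ in data])  # [0, 0, 0, 1, 1, 1]
```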

From the blog CS@Worcester – Sebastian's CS Blog by sserafin1 and used with permission of the author. All other rights reserved by the author.

BASH Scripts

The command line can be an incredibly useful tool, allowing for quick navigation of directories, launching apps and executables, and a plethora of other tasks. For all its use cases, however, it can be difficult to keep track of all the different commands, let alone repeat them often. When someone needs to repeat a series of command line commands, the process can be time-consuming and tedious. Luckily there is a tool in our tool belt that allows anyone to automate this process: the BASH script.

BASH stands for Bourne Again SHell and is, in essence, a command line interpreter: it interprets user commands and lets us carry out different actions. We can use this to our advantage by creating a script file (commonly given a .sh extension) and entering the commands we want to run within that file. It really is that simple. Once we have saved the file with all the commands we want to run, we can go back to the command line and run it. Each command in the script then executes one by one until they have all run, at which point the command line is ready to accept another command.
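
For instance, a minimal sketch (with hypothetical file names) might look like this; save it as backup.sh and run it with `bash backup.sh`:

```bash
#!/bin/bash
# backup.sh -- each command below runs in order, exactly as if
# you had typed it at the command line yourself.
mkdir -p ~/backups
cp ~/notes.txt ~/backups/notes.txt
echo "Backup finished at $(date)"
```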

But what if you need to loop through certain commands? Bash scripting allows loops to be written directly within the file and supports a myriad of loop types, such as for-loops, while-loops, and until-loops. You can also control the exit conditions of these loops with preset ranges, breakpoints, or good old enumeration. If-statements are also supported and work much like they do in traditional programming languages. In reality, bash scripting is its own kind of programming language, one that focuses on executing command line commands. Many things you can do in most high- and low-level languages you can do in a bash script. You can even write individual functions in a bash script and have them execute only if specific conditions are met, just like in a conventional programming language such as Java or Python.
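
Here is a small sketch combining those three ideas, a function, a for-loop, and an if-statement, using made-up names:

```bash
#!/bin/bash
# greet is a function: defined once, then called like a command.
greet() {
    echo "Hello, $1!"
}

for name in Alice Bob Carol; do
    if [ "$name" = "Bob" ]; then
        continue    # skip this iteration, just like in Java or Python
    fi
    greet "$name"
done
```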

Given all this, one can easily see how versatile a bash script can be. From simple clusters of commands to complex functions with loops and conditional statements, bash scripting gives anyone the tools they need to get the job done. Being able to automate command line tasks saves time, and being able to do so in a sophisticated way opens the door to intricate automation scripts that can, in some cases, remove the need for the user to interact with the command line at all.

https://ryanstutorials.net/bash-scripting-tutorial/bash-script.php

https://ryanstutorials.net/bash-scripting-tutorial/bash-loops.php

https://ryanstutorials.net/bash-scripting-tutorial/bash-if-statements.php

https://ryanstutorials.net/bash-scripting-tutorial/bash-functions.php

From the blog CS@Worcester – Sebastian's CS Blog by sserafin1 and used with permission of the author. All other rights reserved by the author.

Containers vs. Virtual Machines and why they are important to modern computing

Most people in the computer science field have either heard of or directly interacted with containers or virtual machines. They are an important part of modern computing; from massive cloud servers to simply running multiple operating systems on one machine, there are many uses for these tools. While they operate in a similar manner, there are situational benefits to using one over the other. First, however, we should discuss what each of these tools is, how they work, and how they differ from each other.

Virtual Machines: A virtual machine at its simplest works like a normal computer, containing an operating system and any other apps/services it has been configured with. The difference between a virtual machine and a physical computer is that a virtual machine is purely software that runs on top of the host computer's hardware without interfering directly with the native operating system. This gives virtual machines many different use cases, such as testing new code in a separate environment, running different operating systems on top of the native OS, or even running a cloud-based server with multiple operating systems. Since virtual machines are just software, they are extremely portable: a user can easily transfer a virtual machine from one device to a completely different one without much hassle. One big downside is that virtual machines can become very large. Since each virtual machine uses its own OS image and runs its own apps/services, it has space requirements similar to the native OS, and running multiple virtual machines on the same host only compounds this issue.

Containers: Like a virtual machine, a container is used to virtualize a system environment; unlike a virtual machine, however, a container does not need its own operating system image, because it shares the host machine's operating system. This lets containers take up much less space, so more of them can run on the same machine. It also makes them much faster, since containers start up far more quickly than virtual machines. Like virtual machines, containers can be moved from one machine to another, as long as the new machine runs the same OS the old one did. This brings up the downside of containerization: once a container is configured for the host OS, it is mated to that OS and will not run on any other operating system unless it is reconfigured from scratch. This is the tradeoff that allows containers to be so much quicker and more efficient than virtual machines.
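
As a small illustration (assuming Docker is installed), starting a container is a one-line command, and it comes up in seconds precisely because it reuses the host's kernel instead of booting a whole guest OS:

```bash
# Download the Ubuntu image (once) and start an interactive container.
# No guest OS boots; the container shares the host kernel, so this
# takes seconds rather than the minutes a fresh VM might need.
docker run --rm -it ubuntu:22.04 bash
```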

What are the use cases? Now that the similarities and differences between virtual machines and containers have been covered, we can talk about where each of these tools shines. For someone just looking to run a separate instance of an OS on their machine, a virtual machine will suffice. Although it is slower and larger than a container would be, it allows much more versatility with the host machine's OS and hardware. In corporate or large-scale settings, virtual machines can be used when multiple system configurations are needed to test a product: instead of buying a new machine for each configuration, virtual machines can be spun up and configured as needed. Containers can achieve a similar effect, although they are stuck with the base OS configuration. Where containers really shine is in massive server deployments, where their smaller size and modularity let them perform their designated tasks much more efficiently and allow many more containers to fit on a single server. Likewise, containers can be used to split a complex application across multiple containers, allowing developers to make changes to each individual module without needing to rework the entire application.

Sources:

https://azure.microsoft.com/en-us/overview/what-is-a-container/#why-containers

https://azure.microsoft.com/en-us/overview/what-is-a-virtual-machine/#what-benefits

https://docs.microsoft.com/en-us/virtualization/windowscontainers/about/containers-vs-vm

From the blog CS@Worcester – Sebastian's CS Blog by sserafin1 and used with permission of the author. All other rights reserved by the author.

REST API, you keep hearing about it, but what is it?

Throughout my time in college, I, like many others, have heard plenty of different terms related to computer science. REST API was one of them. Now, I was familiar with what an API was, but I did not know the distinction between that and REST. With a little research I found that the difference is that REST APIs are simply APIs that follow the REST architectural style. REST stands for representational state transfer, an architectural style defined by Dr. Roy Fielding in 2000 that provides six design principles which, should developers choose to follow them, give a relatively high level of freedom and flexibility. These six design principles are:

  • Uniform interface:
    • All requests for the same resource must look the same, ensuring that each piece of data belongs to only one URI, or Uniform Resource Identifier
  • Client-server decoupling:
    • Client applications and server applications must remain independent of each other
  • Statelessness:
    • REST APIs do not require requests to be processed through a server-side session; instead, every request must contain all the information needed to process it
  • Cacheability:
    • Resources should be cacheable on either the client or the server side. This helps improve performance on the client side while improving scalability on the server side
  • Layered system architecture:
    • Requests to the API may pass through multiple layers, so neither the client nor the server should be able to tell whether it is communicating with an intermediary or the end application
  • Code on demand:
    • This principle is optional; if the developer elects to include it, the API can send back executable code in its responses, as opposed to the static resources it usually sends
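
As a small sketch of what statelessness looks like from the client side (using Python's requests library against a hypothetical endpoint and a placeholder token), every request carries everything the server needs, including authentication; no server-side session is assumed:

```python
import requests

# Hypothetical REST endpoint; every request is self-contained
# (statelessness), so the auth token travels with each call.
BASE_URL = "https://api.example.com/v1"
HEADERS = {"Authorization": "Bearer <token>", "Accept": "application/json"}

response = requests.get(f"{BASE_URL}/users/42", headers=HEADERS)
response.raise_for_status()   # raise if the server returned an error status
print(response.json())        # parsed JSON body of the resource representation
```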

Now, all this may seem like a lot of rules to follow, and the benefits REST gives developers may not be immediately obvious. I recommend watching this video, which goes into detail on what a REST API is and its benefits: https://www.youtube.com/watch?v=lsMQRaeKNDk. In summary, the main benefits are that REST APIs are simple to use and standardize, scalable, and tend to perform well. REST frees developers from worrying about the state of the data and how requests need to be formatted for different situations. It eliminates some of the headaches that come with nonstandardized APIs, letting developers focus on getting the necessary information from the API and applying it to their application. I would also recommend this short article from IBM, which includes the information in this blog post along with more detail about REST API best practices and use cases: https://www.ibm.com/cloud/learn/rest-apis.

From the blog CS@Worcester – Sebastian's CS Blog by sserafin1 and used with permission of the author. All other rights reserved by the author.

Welcome to my blog

This is the blog where I will document all the things I learn and discover about the field of computer science.

From the blog CS@Worcester – Sebastian's CS Blog by sserafin1 and used with permission of the author. All other rights reserved by the author.