
Docker and Automated Testing

Last week, in my post about CI/CD, I brought up Docker. Docker can be used to create an “image”, which is a series of layers built on top of each other. For example, you can start from an image of the Ubuntu operating system. From there, you can define your own image with Python pre-installed as a second layer. When you run this image, you create a “container”. The container is isolated and has everything installed already, so you, or anyone else on your development team, can use the image knowing reliably that it has all necessary dependencies and behaves consistently.
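
As a rough sketch, the layering described above could be written as a Dockerfile like this (the tag and package names are only illustrative):

# First layer: start from the Ubuntu base image
FROM ubuntu:latest

# Second layer: install Python on top of it
RUN apt-get update && apt-get install -y python3

Each instruction adds a layer, and layers are cached and shared between images that start the same way.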

Of course, Docker setups can get much more complicated, and images tend to have many more layers. In projects that run on various platforms, you will also have images that are built differently for different targets.

So how does this apply to CI/CD? Docker images can be used to run your pipeline, build your software, and run your tests.

GitLab released a webinar discussing how CI/CD can be integrated with Docker. They discuss the three-step process of developing with Docker and GitLab: Build, Ship, Run. These are the stages they use in the .gitlab-ci.yml file, but remember that you can define other intermediate stages if needed. The power of CI/CD with Docker is apparent, because “from a developer’s perspective, all you have to do is a ‘git push’ — and that’s it”. The developer only needs to write the code and push it to version control; the rest is automated, the exception being human testers who give feedback on the deployed product. Even then, good test coverage should prevent most issues, and those testers are more about evaluating the overall experience.

Docker CI and Delivery Workflow
From Docker Demo Webinar, 4:49

Only five lines of added code in .gitlab-ci.yml are necessary to automate the entire process, although the Dockerfile contains much more detail about which containers to make. The Dockerfile defines the images to create and the code that needs to be run. In the demo, the latest Ubuntu image is pulled from a server to create a container in which the code will run. Then variables are defined, and Git is automated to pull the source code from the GitLab repository within this container.
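
The demo’s actual five lines aren’t reproduced here, but a hedged sketch of a Build/Ship/Run pipeline in .gitlab-ci.yml (the image name and registry are invented) might look like:

stages:
  - build
  - ship
  - run

build:
  stage: build
  script:
    - docker build -t registry.example.com/demo-app .

ship:
  stage: ship
  script:
    - docker push registry.example.com/demo-app

run:
  stage: run
  script:
    - docker run -d -p 5000:5000 registry.example.com/demo-app

Each job belongs to a stage, and GitLab runs the stages in the order listed, stopping if one fails.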

Then, a second container is created from an image with Python pre-installed. This container is automated to copy the code from a directory in the first container, described above. Next, the dependencies for Flask are automatically installed, and Flask is run to host the actual code that was copied over from the first container.
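
The hosted application itself can be tiny. A minimal Flask app of the sort such a pipeline might serve (the route and message are made up):

# app.py: a minimal Flask application (illustrative)
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "Hello from inside a Docker container!"

if __name__ == "__main__":
    # Bind to 0.0.0.0 so the app is reachable from outside the container
    app.run(host="0.0.0.0", port=5000)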

This defines the blueprint for what happens when changes are uploaded to GitLab. When the code is pushed, each stage in the pipeline from the .gitlab-ci.yml file is run, each stage passes, and the result is a simple web application already hosted from the Docker image. Everything is done automatically.

In the demo, as should usually be done in practice, this was done on a development branch. Once the features are complete, they can be merged with the master branch and deployed to actual users. And again, from the developer’s perspective, this is done with a simple ‘git push’.

From the blog CS@Worcester – Inquiries and Queries by ausausdauer and used with permission of the author. All other rights reserved by the author.

Inversion of Control and Dependency Injection

When I started programming, one subject I felt I never fully understood was the Inversion of Control design principle, in the context of using dependency injection when testing. These were commonly mentioned in tutorials, blogs, and documentation. My main takeaway was “make it another class’s problem”, so I think it’s time for a refresher on this topic to see if that holds true. Inversion of Control extends the Dependency Inversion principle, the D in SOLID.

Once again, Martin Fowler has a great article on the subject, this time with code examples. Before I summarize his description, what is the problem that IoC solves? A picture says approximately 2¹⁰ words, so:

A MovieLister depends on both the Abstraction and Concrete Class
From https://martinfowler.com/articles/injection.html

Fowler describes a MovieLister class that can act on the MovieFinder interface. This is great for the methods in MovieLister and makes it easy to modify, but MovieLister is also creating the implementation. It would be nice to change the implementation without modifying MovieLister, especially if we want to package MovieLister for use by others, who might need a different implementation of MovieFinder.

“We would prefer it if it were only dependent on the interface, but then how do we make an instance to work with?”

Martin Fowler, Inversion of Control Containers and the Dependency Injection pattern

The basic idea is simple: create the implementation of MovieFinder in another class and “inject” it into the MovieLister. This can be done in a separate Container class, or an Assembler class.
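
Fowler’s own examples are in Java; a minimal Python sketch of the same idea (the file format and names here are assumptions for illustration, not Fowler’s code) could look like:

from abc import ABC, abstractmethod

class MovieFinder(ABC):
    # The abstraction that MovieLister depends on.
    @abstractmethod
    def find_all(self):
        """Return a list of (title, director) pairs."""

class ColonDelimitedMovieFinder(MovieFinder):
    # One concrete implementation; another might query a database.
    def __init__(self, filename):
        self._filename = filename

    def find_all(self):
        # Assumes lines of the form "title:director".
        with open(self._filename) as f:
            return [tuple(line.strip().split(":", 1)) for line in f]

class MovieLister:
    # Depends only on the interface; the implementation is injected
    # through the constructor.
    def __init__(self, finder):
        self._finder = finder

    def movies_directed_by(self, director):
        return [title for title, d in self._finder.find_all() if d == director]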

A class acts to assemble the MovieLister by creating the concrete implementation and injecting it to MovieLister
From https://martinfowler.com/articles/injection.html

Dependency Containers allow you to keep track of the classes you must create and their dependencies. The container constructs the dependencies, creates the object you need, and returns said object. This isolates all changes to dependencies in a single class. Note that this is encapsulating what varies. The user only needs to decide on concrete implementations in their own custom Container. They can implement MovieFinder in a custom class and use it in their Container.
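
Continuing that sketch, a hypothetical container is only a few lines:

class Container:
    # The one place in the application that names a concrete finder.
    def movie_lister(self):
        finder = ColonDelimitedMovieFinder("movies.txt")
        return MovieLister(finder)

lister = Container().movie_lister()  # client code sees only abstractions

Swapping in a different MovieFinder now means editing one line in one class.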

There are three types of dependency injection: constructor, setter, and interface. The difference is straightforward: the dependency is injected either through a constructor at instantiation, through a public setter method, or through an interface method. Fowler prefers constructors, and I’m inclined to agree. If possible, it is better to have a completely constructed object immediately.

I do think this really boils down to “make it another class’s problem”, but this phrase was misleading to me a few years ago. The problem I had with it was figuring out where to end the abstraction and create a concrete class, thinking it should be yet another class’s problem. It was tempting to misuse the design pattern, which left me feeling confused when I didn’t have a good answer. At some point, you need a concrete class to actually do the work. With this design pattern that concrete class is the Dependency Container.

From the blog CS@Worcester – Inquiries and Queries by ausausdauer and used with permission of the author. All other rights reserved by the author.

The KISS and RoP Trade-off

I’ve discussed trade-offs in software before. The more you delve into the world of design and engineering, the more evident it becomes that trade-offs are an underlying principle to rule all principles. There is no perfect software; there is no perfect solution. My aim to strive for the best software, however, led me to choose this topic to discuss.

I intended to keep it simple and focus on KISS (Keep It Simple, Stupid), which claims that a simple, straightforward solution is the better solution. But after listening to a podcast about design principles by SoftwareArchitekTOUR*, I was forced to make a trade-off between a simple blog post and a more powerful, generic one. Specifically, they mentioned the RoP (Rule of Power) principle, or GP (generalization principle), which states that “a more generic solution is better than a specific one”.

“You will never produce a perfect design… but you can’t typically have both [KISS and RoP]. Either it is simple and specific, or it’s very generic and flexible and therefore no longer simple but rather has a certain degree of complexity. You can typically only strive for a Pareto-optimal solution.”

Christian Rehn (translated from German), SoftwareArchitekTOUR Episode 60, 14:30 – 15:45

Rehn says it’s a balancing act, and neither principle is right or wrong. The concept of a Pareto-optimal solution is rather simple: find a solution that is better in at least one aspect and at least as good in every other aspect. As Rehn says, you have to ask yourself what works best in your situation. What is the best compromise?

So what does this mean when choosing between KISS and RoP? Like any new concept or principle, both should be practiced in isolation, with small projects. It is only with this basic experience that you can learn how to make these decisions in more complicated systems.

In learning design principles, I have refactored code in order to make it adhere to a design pattern or principle. When new situations arise, the old solution may no longer hold water. But each time, the code was kept as simple as possible given the requirements. The Pareto-optimal solution is the simplest solution that provides the required functionality. This leads me to conclude that KISS trumps RoP, especially considering other principles such as YAGNI, which strives to prevent the code smell of unneeded complexity. On the other hand, the simplest possible solution, in which every class does a single, specific thing, would itself become extremely verbose, violating DRY in particular.

Determining how to make these trade-offs is an art and comes with experience. In the meantime, I plan to keep my code as simple as possible and add complexity and generalizations as needed.

* This podcast is only available in German, but the English show notes explain the topics of the podcast in detail.

From the blog CS@Worcester – Inquiries and Queries by ausausdauer and used with permission of the author. All other rights reserved by the author.

Understanding Continuous Integration and Continuous Delivery (CI/CD)

While good testing and version control habits are always helpful and save a lot of time, the process of committing changes, running tests, and deploying the changes can in itself be time consuming. This is especially true in an agile development process, which aims to make small, incremental changes in a series of sprints. Ideally, the process of committing, testing and delivering could be done in a single command.

This is where CI/CD comes in. An Oracle blog post on CI/CD describes the software development pipeline as being a four-step process: commit, build, automated tests, and deploy. CI/CD involves automating this process so that a single commit results in changes deployed to production, so long as the changes can be built and pass the tests. However, any of these stages can be skipped and not all of them have to be automated if it doesn’t make sense to do so. Additionally, other stages can be added if desired.

Oracle stresses the importance of testing in this process, and the requirement to have a suite of unit, integration, and systems tests. These are important to have for any project, but are vital if a commit is automatically deployed to production to make sure users are getting a reliable product.

GitLab uses Runners to run the jobs defined in a file named “.gitlab-ci.yml”. Jobs can be defined in the stages mentioned above, or in any number of custom stages, to create a single pipeline. When a commit is made, the pipeline lists the stages and whether they pass. Of course, unit tests should be run before a commit to ensure that the code behaves as expected, but once a change is committed the rest of the pipeline can be automatic.
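
As a hedged sketch (job names and commands are invented), such a file might look like:

stages:
  - build
  - test
  - deploy

build_job:
  stage: build
  script:
    - make build

test_job:
  stage: test
  script:
    - make test

deploy_job:
  stage: deploy
  script:
    - make deploy
  only:
    - master

The commit itself isn’t a stage; it is what triggers the pipeline.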

In large systems, this automation is especially useful. You may not be able to compile an entire code base on a local machine. You may not be working on the same schedule as the rest of your team. Automating other portions of testing and deployment allows a single developer to finish a feature or patch on their own and reliably get it into the hands of users.

In researching software engineering jobs, it seems that many companies use Docker as part of their CI/CD process. Imagine you want to make sure your software runs on Mac and Windows machines. Docker can use predefined images to build isolated containers in which the code can be run. A pipeline can contain separate build stages, for example, to make sure the code builds on both operating systems before running tests and deploying. Next week, I will look at Docker in more detail and describe how it can be applied to software testing.

From the blog CS@Worcester – Inquiries and Queries by ausausdauer and used with permission of the author. All other rights reserved by the author.

Test-Driven Development (TDD)

Writing software is fun. Mostly.

When I look back on the projects I’ve made, a couple of big ones stick out that were mostly not fun. For example, I made the mistake of trying to build an Android app before I fully understood what a class is. Design patterns were not even something I had considered. The consequence was a fairly simple app with code that was supremely complicated and confusing.

For my next project, I tried to do better with the lessons I had learned. I continuously had errors that I didn’t understand; errors that were seemingly unrelated to the changes I had made. After a couple weeks of struggle I deleted all my files and uninstalled Android Studio. The frustration was so bad it made me want to quit programming. The frustration could have been prevented.

This is where tests, and specifically Test-Driven Development (TDD) could have saved me.

James Grenning has a blog post discussing how he feels about TDD after years of writing software without it. TDD forces you to think through what you want your code to do. If you can’t test your code, it’s likely a sign of bad design, so TDD indirectly leads to a better software architecture. Tests describe what the code is supposed to do, and the tests will always be there. If the code stops doing it in the future, the tests will fail, acting as permanent, enforced documentation. This leads Grenning to bring up the concept of regression tests, which confirm that your new changes don’t modify legacy behavior. The suite of regression tests that you slowly build over time will make sure nothing is inadvertently broken. And as you write new tests, you’re also getting instant feedback that your new code is performing to specification.
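
A tiny illustration of that rhythm, with invented names: the test is written first and fails, then the simplest implementation that passes follows.

# test_discount.py: written first; it fails until the code exists (red)
from discount import apply_discount

def test_discount_reduces_price():
    assert apply_discount(100.0, 0.25) == 75.0

def test_zero_discount_changes_nothing():
    assert apply_discount(100.0, 0.0) == 100.0

# discount.py: the simplest implementation that passes (green)
def apply_discount(price, rate):
    return price * (1.0 - rate)

Once green, these tests stay in the suite and become exactly the kind of regression tests Grenning describes.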

My life as a programmer is much better with TDD.

James Grenning

TDD brings you back to the programming exercises you did when you first started learning. Struggling through a problem and making your project do what you want is satisfying in itself. But it is so much more satisfying to see a big green bar that tells you everything is working, including everything you’ve done since the beginning of the project. Then you can play with your new features manually if you so desire.

There are some downsides to TDD. More code means more time and more to maintain. Sometimes you’ll have to change tests because what your code does also changes. Furthermore, TDD is an art in itself and must be learned, meaning you may need to break old habits of bad design to write code that is testable. I’m inclined to agree with Grenning that debugging a complex system is much more time consuming than these downsides.

It only took a couple weeks of clear thinking to realize I had overreacted when I uninstalled Android Studio. I learned to have a lot more patience and take the time to do things the proper way. I still forgo testing and TDD for small, personal projects, but the benefits of TDD greatly outweigh the downsides for any important project.

From the blog CS@Worcester – Inquiries and Queries by ausausdauer and used with permission of the author. All other rights reserved by the author.

The Factory Design Pattern

There are many design patterns that assist in building software architecture. They fall into four categories: creational, structural, behavioral, and concurrency. Each of these groups contains many specific design patterns, but a single blog post can only do justice to one pattern at a time.

The Coding Blocks Podcast has a series on design patterns, the first of which discusses creational patterns, which are described in the “Gang of Four” book, “Design Patterns” [1]. While the hosts find the overuse of these design patterns humorous, they admit that seeing the word “Factory” instantly describes code at a higher level. Furthermore, design patterns assist in adhering to other object-oriented design principles.

[Design Patterns] help make a system independent of how its objects are created, composed, and represented. 

– Design Patterns, Gamma, E., et al

I have come to strongly dislike nested if statements, runtime checks to determine which subclass an object is an instance of, and sprawling switch statements. Clean Code by Robert Martin [2] influenced that a lot, because it made me realize there’s a better way.

Imagine an app that asks the user what kind of Car they would like. This car needs to drive(), brake(), and calculateFuelCost(). Assume there are dozens of types of Car, each implementing the Car interface with these methods. The app will need a bloated switch statement to make all of these concrete classes, depending on user input.

The idea behind the Factory design pattern is that a separate class creates the concrete objects but returns an abstraction (an abstract class or interface). You don’t need to scatter logic across the application to determine which type of Car should be created. In fact, you don’t even need or want to know what kind of car it is once it’s created. A CarFactory class returns an abstract Car class or a Car interface. In the case of an interface, concrete classes such as ElectricCar or HybridCar implement the Car interface.

The result is an application that can create a CarFactory, tell it what kind of car the user wants, and then manipulate the returned interface as needed. And no matter the concrete object it gets back, it can be treated the same. This separates the application logic from specific implementation of concrete Car classes.
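
A minimal Python sketch of that arrangement (the class names follow the prose above; the cost figures are made up):

from abc import ABC, abstractmethod

class Car(ABC):
    @abstractmethod
    def drive(self): ...

    @abstractmethod
    def brake(self): ...

    @abstractmethod
    def calculate_fuel_cost(self, miles): ...

class ElectricCar(Car):
    def drive(self):
        return "Accelerating silently"

    def brake(self):
        return "Regenerative braking"

    def calculate_fuel_cost(self, miles):
        return miles * 0.04  # illustrative cost per mile

class HybridCar(Car):
    def drive(self):
        return "Blending gas and battery"

    def brake(self):
        return "Friction plus regenerative braking"

    def calculate_fuel_cost(self, miles):
        return miles * 0.08  # illustrative cost per mile

class CarFactory:
    # The only place that knows about concrete Car classes.
    _cars = {"electric": ElectricCar, "hybrid": HybridCar}

    def create(self, kind):
        # The caller receives a Car; the concrete type stays hidden.
        return self._cars[kind]()

car = CarFactory().create("electric")
print(car.drive(), car.calculate_fuel_cost(120))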

So when your application needs to add a FlyingCar option, a concrete class can be created that implements the same Car interface. The application only needs to tell the CarFactory to create a new FlyingCar object. The rest can be treated identically, because the CarFactory returns an interface with all the required methods.

As stated in the podcast episode, if this is confusing it means you’re learning. These patterns are not trivial to learn, and require many more examples and plenty of UML diagrams to fully understand the idea behind the pattern.

[1] Gamma, E., et al. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
[2] Martin, Robert C., et al. Clean Code: A Handbook of Agile Software Craftsmanship. Prentice Hall, 2016.

From the blog CS@Worcester – Inquiries and Queries by ausausdauer and used with permission of the author. All other rights reserved by the author.

YAGNI: Because More Isn’t Always Better

YAGNI stands for “You Aren’t Gonna Need It”. This is an acronym I’ve only taken to heart recently, despite reading about design principles on and off over the past few years. I have learned enough from my mistakes in the early days of programming to know that planning pays off, and what could be better planning than adding code now to help yourself add things in the future? Two words: clean code.

This summer, I was tasked with designing and implementing a desktop application from scratch, as the only developer. The specification was vague (essentially, “these are the user inputs, this is the output, and leave room to add more inputs and outputs later”), and it required research into topics that had nothing to do with software and that I had never seen before. This caused me to create unnecessary abstractions and extra features in many places, just in case. In the end, I refactored quite a bit once I understood more about what was required and realized what wasn’t needed. The extensibility that I actually needed worked out great. Everything else got in the way.

A blog post on YAGNI by Martin Fowler does a great job of describing what I did wrong, and I chose it in order to learn how I could have improved what I did. His summary of YAGNI is that features you expect to need should not be built. Features should be built only when you actually need them, because it’s very likely that “you aren’t gonna need it”.

Fowler goes on to describe the main argument YAGNI makes, which is that you might be wrong about presumed features. When you’re wrong about features, you accrue four costs: the cost of building the presumptive feature, the cost of delaying other features, the cost of carrying the presumptive feature (making it harder to modify other code), and the cost of repairing the presumptive feature (because even unused features must be maintained).

One major point that Fowler stresses is that this does not apply to efforts to make code extensible, unless it adds unneeded complexity. There is a difference between adding code that is easy to refactor in the future, and adding unused code. The former follows other design principles. The latter adds clutter and creates confusion. And who needs more of that?

Could I have done better with the aforementioned project? I think so. There were a couple of major overhauls that would have been a nightmare without some extensibility baked in, but most of that code didn’t add unnecessary complexity and was therefore in line with YAGNI. I also have to consider that it was a single-person project done in three months, so the negative side effects were likely kept to a minimum. It’s also impossible to determine how much time I spent searching through unnecessary code, so a little more YAGNI could have sped things up. In the future, I’ll be considering this important design principle.

From the blog CS@Worcester – Inquiries and Queries by ausausdauer and used with permission of the author. All other rights reserved by the author.

Software Quality-Quantity Trade-off

Programming was extremely hard for me at first. Once I discovered that programming isn’t just typing code, I spent so much time worrying about getting the perfect design, imagining what could go wrong, and researching alternatives that, at least for a brief period, I rarely actually coded anything. While all that worrying taught me a lot through the research, I know better now. There’s a trade-off between time spent designing and time spent actually making software work.

Likewise, there is a trade-off between creating high-quality software and adding more valuable features, as Martin Fowler discusses in this blog post. Fowler is a software developer and author who has written extensively on the topic of software quality and design. In this blog post, he poses the question “is high quality software worth the cost?”.

Short answer: yes it is. To go further, Fowler believes that the question really doesn’t apply to software because high quality software is cheaper. He also makes the distinction between internal and external quality in software. For example, users notice a quality user interface, but have no idea which design patterns were used.

Internal quality is the goal of quality assurance and testing, and while it isn’t directly visible to the user, it makes it so much easier to provide a user with additional features by adding to an existing code base. In my last post, I wrote about SOLID and a podcast by Coding Blocks. They discussed how good patterns take a lot more code to write in the beginning, because things are broken up into smaller classes, with more, smaller methods. But as Fowler describes, the initial work more than pays off in the long run when code is easy to read and understand.

Anyone who has learned to program has spent hours looking for a mistake in a small project. These mistakes get harder to find as a project grows, especially when there are hidden dependencies among poorly formatted code. Now imagine you wrote the code two years ago. And you can’t even fix it because you’re on vacation, but it’s preventing users from accessing their accounts, so your coworker has to. Quality code and tests will drastically improve the chances that the problem is quickly patched and your users stay happy. Take it one step further: the problem could have been prevented entirely with sufficient testing and more focus on quality.

Most time is spent reading code rather than writing it. Fowler mentions this, but it is also an important theme in Clean Code by Robert C. Martin [1]. If you write quality code now, you can write even more quality code in the future.

[1] Martin, Robert C., et al. Clean Code: a Handbook of Agile Software Craftsmanship. Prentice Hall, 2016.

From the blog CS@Worcester – Inquiries and Queries by ausausdauer and used with permission of the author. All other rights reserved by the author.

SOLID: Laying the Groundwork for Design Patterns

Basic principles must be at the core of software design. Following them keeps us on track with a standard. It creates consistency and makes our code easier to maintain in the future, because we will know what to expect.

Following a single principle might be easy, but there are many object-oriented design principles. Sometimes, it is quicker to hack a solution together than to think up an elegant solution. These solutions may then be built upon, and the project will slowly stray from its original intent. Gaining the intuition of when and how to apply these principles is necessary to ensure that a project doesn’t become a mess of spaghetti as these quick-fix solutions add up.

“Simple can be harder than complex: You have to work hard to get your thinking clean to make it simple. But it’s worth it in the end because once you get there, you can move mountains.”

– Steve Jobs

Coding Blocks Episode 7 discusses these topics in the context of five of the most important design principles, known as SOLID. This podcast is great and I highly recommend it: it is hosted by professional programmers who discuss and debate the topics of each episode.

Each of these principles likely warrants its own podcast, and one or more blog posts to describe them in detail. Below is a short description of each one.

S: Single Responsibility Principle

  • An object should only do one thing.

O: Open Closed Principle

  • Code should have the ability to be extended, but it shouldn’t have to be modified.

L: Liskov Substitution Principle

  • Replacing an object with one of its subtypes should not break the code.

I: Interface Segregation Principle

  • Smaller, more specific interfaces are better. If an interface is too broad, an object may need to implement many methods it doesn’t need.

D: Dependency Inversion Principle

  • Within a class, you don’t want to depend on an implementation. Instead, a higher-level class should depend on interfaces, which are implemented by lower-level classes. Don’t have an abstraction rely on an implementation; have the implementation rely on an abstraction (see the sketch after this list).
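
A quick sketch of that last principle (all names are invented for illustration): the high-level class depends on an abstraction, and the low-level class implements it.

import csv
from abc import ABC, abstractmethod

class DataSource(ABC):
    @abstractmethod
    def read_rows(self):
        ...

class CsvDataSource(DataSource):
    # Low-level detail: implements the abstraction.
    def __init__(self, path):
        self._path = path

    def read_rows(self):
        with open(self._path) as f:
            return list(csv.DictReader(f))

class ReportGenerator:
    # High-level policy: depends only on the DataSource abstraction.
    def __init__(self, source):
        self._source = source

    def row_count(self):
        return len(self._source.read_rows())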

Conclusion

The biggest takeaway from this podcast episode was that even professional programmers do not follow all of these principles all the time. Some of the concepts are confusing and contradict other principles. Some of them should only be applied when refactoring code after it is working, with the goal of improving your code.

However, when learning these principles, projects should be created with the sole goal of implementing them. Mentally solidifying these concepts by focusing on each one and converting it into code will help a developer predict issues that may come up later when implementing production code. Knowing the context of the system, a developer can write code that mostly follows these principles right off the bat, and improvements can be made later as code is added and the patterns emerge. Most importantly: the foundation will be SOLID.

From the blog CS@Worcester – Inquiries and Queries by ausausdauer and used with permission of the author. All other rights reserved by the author.

The Importance of Testing and QA

When I first started learning Python, I had no idea what I was getting myself into. I wasn’t sure what programmers did to make things work. I didn’t really know what programming meant besides typing code.

I was eager to learn, though, and to make cool things. So without even knowing what a loop was, I started writing a Scrabble game. I spent what I remember being countless hours trying to get that thing to work, but I had to copy and paste a lot of code over and over. Frustrated, I kept learning.

After a few short lessons I realized I could do it better, so naturally I opened a new terminal and typed:

$ touch scrabble2.py

And I began fresh with my new knowledge of loops. Then I repeated this process when I learned about functions. And again when I learned about classes. Looking back at my first Python projects folder, there are only scrabble4.py, scrabble5.py, and for some reason scrabble53.py. I’m a strange mixture of relieved and disappointed that I can’t find the first three iterations, and I really hope there weren’t 47 more.

So I was eager, but I did not take the time to learn all the tools in the toolbox before I tried to build a house. On one hand, it was a good lesson and a good way to solidify what I knew. On the other hand, it was a royal waste of time.

Tools make our jobs easier. When I discovered Git and the concept of version control, I was shocked. No more appending numbers to a rewritten version of source code! And I can see the changes I just made when adding a new feature! When I discovered automated testing, I was just as excited. When I started Android development and began reading about software architecture, it opened up a whole new world and solved a lot of problems I had when I tried to build things myself.

Programming is not the grind I thought it was before I started learning the tools. It’s a much more creative process. It’s a lot more fun. It is not banging your head against the keyboard because it won’t work, but rather realizing there’s an error in your logic* and having the chance to find it. And software development isn’t even mostly coding; it is mostly design.

Quality assurance and testing allows programmers to do their job efficiently. They are tools that prevent us from banging our heads against our keyboards in 6 months when we add a feature to code we’ve long forgotten. Learn the tools. They’re there to help.

*Or a coworker’s logic, but don’t bang their head against the keyboard either.

From the blog CS@Worcester – Inquiries and Queries by ausausdauer and used with permission of the author. All other rights reserved by the author.