CS443: The Tribal Engineering Model (“the Spotify Model”)—An Autopsy

This codebase ain’t big enough for the two of us, pardner.

Something I think about quite frequently is how engineers seem to live in a world of their own creation.

No, really: open an app—look at a website—play a game of (virtual) cards. The designers of these programs each had their own take on how these experiences ought to be represented; the visual language, the information density, the flow from one experiential strand to another. These aren’t happenstance—they’re conscious decisions about how to approach a design.

But engineering is a bit like collectively building a diorama. Things look well enough in miniature, but the problems—with the product, and with the people working on it—become more clear the more you scale upwards.

This problem isn’t new, or all that far-fetched: engineers are specialists, and they’re good at what they do. Put two engineers together, and you’ve got yourself a crowd. Put three, and you have a quorum. Put a whole team of engineers on the same problem, and now you have a new problem.

(If you’ve ever wondered what happens when an engineer fears being obsolesced out of their job, it’s not hard to imagine. It rhymes with “reductoring.” Alternatively, imagine a small, innocent codebase exploding into a million pieces, and you get the idea.)

So, how do you get engineers to play nice?

Plenty of solutions have been suggested; most of them involve Agile in some form or another.

(I personally don’t believe in it. I like—and have used—many of its byproducts, such as Continuous Delivery and Continuous Integration, but I don’t find that the underlying methodology behind the model itself is as universally applicable as it would have you believe.)

About a decade ago, there was a lot of buzz over Spotify having finally, definitively, solved the problem of widescale Agile implementation. It was highly ambitious—virtually utopian in its claimed economies-of-scale, with reports of unprecedented efficiency/productivity gains. Almost too good to be true.

Just one problem: it didn’t actually work. Also, they never actually used it, according to the testimony of one of the senior product managers involved in implementing it. Despite a grand showing in 2012, Scaling Agile @ Spotify was pretty much DOA before it had even hit the ground.

In hindsight, it should have been fairly obvious that things over at Spotify weren’t as brilliantine-sheened as the glowing whitepaper that supposedly contained the key to the company’s success may have suggested. Years of complaints about incomprehensible UX decisions seemingly passed unilaterally with little user (or developer) input; a platform that abruptly pivoted towards delivering podcast and, strangely enough, audiobook content with almost the same priority as music streaming; and a lukewarm reception to the arrival of social networking capabilities.

So, what went wrong?

Firstly: every single organizational unit under this model, with the exception of Guilds, was organized as an essentially fixed, permanent structure. If a Squad was assigned to work on UX, that was their sole responsibility, unless a member had other duties through a Guild (more on that later). Accordingly, each Squad had equal representation within a Tribe, regardless of how important or pressing their work was. Further, each Squad had its own manager, who existed in direct opposition to three other managers in the same Tribe, all of whom had diametrically opposing interests because they represented different product segments. If an overriding voice was required, Spotify sought to preempt such issues by including a fifth Übermanager in each Tribe, who had no day-to-day purpose other than mediating disputes between managers, presumably because such disputes were, or were expected to be, extremely common. (It is unknown whether this fifth manager was included in a similarly structured college of Middle Managers.)

Worse yet, it becomes painfully evident how little work could get done under such a model. Because Tribes were, by design, interdependent on each other due to the cross-pollination of key devs through Guilds, a work blockage in one Tribe not only required the intervention of two or more Tribes, but required the key drivers of the entire Tribe to do the intervening, preventing any of the involved Tribes from doing any meaningful work. This is on top of the presupposition that each Squad had already mastered Agile in small groups, which is necessary for the idea of an Agile economy-of-scale to even make sense.

Probably most damning, though, is the impulse to just double down on your own work when confronted by a deluge of meetings, progress reports, post-mortems, and pulse checks. Rather than focusing on the interconnectedness of the Team-Tribe-Guild structure and how best to contribute as but a small part of a greater whole, many Spotify engineers simply did what they were instinctually led to do and submitted product as if working in small groups.

This essentially created a “push” rather than “pull” system, where engineers delivered the product they believed higher-ups expected them to deliver, rather than following the actual direction Spotify executives wanted to steer them in. When upper management noticed, the result was “course corrections”: sweeping, unilateral, and uniquely managerial.

And that was pretty much the end of Agile for Spotify.

Things look well enough in miniature, but the problems become more clear the more you scale upwards.

So, why even talk about this?

I’ve had plenty of good collaborative experiences with engineers in industry. I want to work with people—I have worked with people, and well. But I believe there’s a science to it, and as engineers, we simply do not focus enough on the human element of the equation. The Tribal model reminds me a lot of Mocks, and even the POGIL methods we’ve used in this course so far.

Mocks are a way to implement that at an individual, codebase level. POGIL teaches us to think for ourselves and expand our individual skills. I think what Spotify failed to recognize is that people do not instinctively know how to work together, especially tech-minded people like engineers. Interdependence isn’t something to be cultivated absentmindedly, as if we could ignore the effects of losing the one person who knows what piece of the pie to cut for a day, week, or more; rather, it’s something to be warded against.

The Guinness pie from O’Connor’s in Worcester, by the way, is life-changingly good.

Some food for thought.

Kevin N.

From the blog CS-443 – Kevin D. Nguyen by Kevin Nguyen and used with permission of the author. All other rights reserved by the author.

Customer and Enterprise: Why is one valued over the other?


Hello, Debug Ducker here. Have you ever thought about how low-quality some of the software you use feels, despite being made by a well-known company? This is how I feel when it comes to video games.

It was a thought that came to me during class, when a professor said that if a company releases buggy, untested software, that may ruin the company’s reputation. A student asked: well, what makes the game industry different, then? For those in the know, the game industry has been plagued with the problem of releasing products in a buggy or half-finished state that they still expect the consumer to buy.

You would think that after years of doing such things, game development companies would be careful about development. Many gamers have criticized this ongoing problem within the industry, and some gaming companies are seen in a poor light; though such a reputation never seems to completely ruin them, it does make them less trustworthy. So why are video games different in terms of software testing?

This question kept bothering me, and I brought it up with a friend who may know more. He says it is because the consumer is not the most important person to avoid disappointing: in the software testing field, the one you don’t want to hand a poor or low-quality product to is a company or a business, since they aren’t the average customer and have a lot more money to spend.

This is where I did a bit more digging and found out a lot of interesting things about making a product for the average consumer versus making one for a company.

There is a lot of money in making products for companies. The graph of Dell’s revenue throughout the years showcases how much money can be made from enterprise products.

As you can see, commercial products, which are the products businesses themselves purchase, make up most of Dell’s revenue compared to the average consumer. In a way, I can see those customers being prioritized when it comes to reputation; you don’t want to have bad relations with the ones bringing in the money.

There is possibly a more logical than financial answer to the question. Consumers are the common people, and there are a lot of them. They may have different reactions to the product, but since there are so many of them, there will always be someone willing to buy a product despite its quality. Then comes the company, which probably needs the product to perform a service and would prefer it thoroughly tested before getting it.

With this, I can understand a little of why it is so important to test products in software testing, especially when it comes to businesses. We need them for their continued support, and they bring in a lot of money.

Admin. “Dell Statistics 2024 by User and Revenue.” KMA Solutions, 22 Apr. 2024, http://www.kma.ie/dell-statistics-2024-by-user-and-revenue/.

From the blog CS@Worcester – Debug Duck by debugducker and used with permission of the author. All other rights reserved by the author.

Different Types Of Behavioral Unit Tests

Hello everyone,

The topic of this week’s blog will be Behavioral Testing. Testing your code is one of the most important things every programmer has to master in their professional career. There are many, many ways to test your code, and each niche technique focuses on and works differently for specific purposes. Some can be similar enough to each other but different enough that they can be kept separate, and what we will focus on today is the “Different Types Of Behavioral Unit Tests.” As the name suggests, Behavioral Testing focuses more on how your code behaves rather than how it is written. While it may sound plain and simple, this type of testing gives programmers a lot of different ways to test their code.

The first one you always start with is “Happy Path Tests,” which basically check that everything is working the way it should. The first goal each time you work on a project is to make sure that it runs and outputs the wanted results, and only after that do you try to see how it reacts when things get a bit more complicated. Next we have “Negative Tests,” which you use to see how the program reacts when bad inputs are entered on purpose. This is used to check that specific features work, like entering the wrong password: if that happens, the program should give you another chance to enter the password or guide you on how to make a new one. This makes the program more secure and trustworthy for all users. The next most common type of Behavioral Testing is “Boundary Tests,” which let you see how the code behaves when inputs outside of the wanted range are entered; they can also be used to check the limits and boundaries of the code. This helps with scaling the program if things are predicted to grow, from the database to the number of users and so on.

One of my favorite things about this blog is that it covers a lot of key aspects that everyone should know about Behavioral Testing. Some tips that I learned from it are that when writing this type of test you need to test one behavior at a time; trying to test two at the same time will just ruin the purpose of it. You should also be very clear and describe exactly what you are trying to test. Another good habit is to simulate events, from successful ones to trying to break your code on purpose, to see how well it behaves in all conditions.
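To make those three types a bit more concrete, here is a minimal sketch in Java with JUnit. The Account class and its rule that withdrawals must be between 1 and 500 are made up purely for illustration; they are not from the article.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

// Made-up class under test: withdrawals must be between 1 and 500.
class Account {
    private int balance = 500;

    int withdraw(int amount) {
        if (amount < 1 || amount > 500) {
            throw new IllegalArgumentException("amount out of range");
        }
        balance -= amount;
        return balance;
    }
}

class AccountBehaviorTest {
    @Test // Happy path: a normal withdrawal produces the expected balance.
    void happyPath_validWithdrawal() {
        assertEquals(400, new Account().withdraw(100));
    }

    @Test // Negative test: a bad input entered on purpose should be rejected.
    void negative_invalidAmountRejected() {
        assertThrows(IllegalArgumentException.class, () -> new Account().withdraw(-5));
    }

    @Test // Boundary test: the smallest and largest allowed amounts still work.
    void boundary_limitsAccepted() {
        assertEquals(499, new Account().withdraw(1));
        assertEquals(0, new Account().withdraw(500));
    }
}
```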

In conclusion, Behavioral Testing is important because it not only lets you check for errors in the code from early development to the end, but also helps you understand how the code behaves in different scenarios, which is so important to know; it helps you understand the program better and indirectly makes debugging a lot easier.

Source:

https://keploy.io/blog/community/understanding-different-types-of-behavioral-unit-tests

From the blog Elio's Blog by Elio Ngjelo and used with permission of the author. All other rights reserved by the author.

Test-Driven Development: An efficient development process

Summary Of The Source

  1. The process of TDD: There are three main steps to the TDD process: the “red” phase, where a test is written but expected to fail; the “green” phase, where just enough code is written to make the test pass; and finally the refactoring phase, which aims to improve the code in functionality and in design metrics. (A short sketch of this cycle follows the list.)
  2. Benefits of TDD for software teams: TDD is an implementation of agile development, since there is constant feedback on the tests and the code being written for them. This keeps the teams regularly communicating and on the same page about fulfilling the requirements. There are other benefits as well, such as lower long-term costs, documentation, etc.
  3. Best practices for TDD: Organize tests correctly, making sure that naming conventions specify which function is being tested. Create tests that are simple and target only one aspect to assess; this makes it easier to pinpoint failures in the code. 
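As a rough illustration of that red-green-refactor cycle, here is a minimal sketch in Java with JUnit; the Calculator class is a made-up example, not anything from the source article.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

// Red phase: this test is written first; it fails (or does not even compile)
// because Calculator.add() does not exist yet.
class CalculatorTest {
    @Test
    void addsTwoNumbers() {
        assertEquals(5, new Calculator().add(2, 3));
    }
}

// Green phase: write just enough code to make the test pass.
class Calculator {
    int add(int a, int b) {
        return a + b;
    }
}

// Refactor phase: with the test still passing, clean up names and remove
// duplication, re-running the test after every change.
```

In practice the test would be written and seen to fail before Calculator exists at all; the two are shown together here only to keep the sketch in one place.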

Why I Chose This Resource

I chose this blog post in particular because it is very easy to read for someone who is new to this process, providing all the necessary information needed to really understand what it is and its benefits without too much technical jargon. 

Personal Reflection

I have learned in the past about the waterfall development process vs the agile process, and that agile was pretty much just better, both in terms of time and resources spent. This blog made me realize that TDD falls under the agile framework and that makes the development more responsive to change along the way. It personally seemed silly to me at first that the test would be written first, testing seemingly nothing and only being there to cause confusion, but now I understand that it acts as more of a set of restrictions that keep the code produced working correctly, kind of like a mold that the code has to fit to work. This approach actually seems very intuitive because of the constant feedback and the way the functionality is built around something that is tested to work. To me, there don’t really seem to be any glaring downsides to this form of development unless the team decides against it and has a more comfortable development process, which for them would work better. It’s a simple strategy but it seems very clean in its workflow and deployment. 

Future Application

After learning more about this process, I would like to work in an environment where this is practiced. It seems very intuitive and efficient and utilizing it would help me get a better personal feel for it. Developing around tests probably seems confining, but I think it does produce more correctly working code than the other way around.

Citation


CircleCI. (2020, August 11). What is test-driven development? Retrieved from https://circleci.com/blog/test-driven-development-tdd/

From the blog CS@Worcester – The Science of Computation by Adam Jacher and used with permission of the author. All other rights reserved by the author.

Data Flow Testing

Source: https://www.testbytes.net/blog/data-flow-testing/

Data Flow Testing, viewed as a nuanced approach in software testing, is the process of examining data variables and their values through the use of the control flow graph. Data Flow Testing is an example of white box and structural testing. By targeting the gaps that path and branch testing leave, data flow testing aims to locate bugs arising from the misuse of data variables/values.

Data flow testing is used at both the static and dynamic levels. At the static level, data flow testing involves analyzing the source code without actually running the application. The control flow graph in this instance represents the execution paths through the application’s code. Errors that can be found at the static level include definition-use anomalies (a variable is defined but never used), redundant definitions (a variable is defined multiple times before it is used), and uninitialized use (a variable is used before a value has been assigned).
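For instance, a hypothetical method like the one below exhibits those static-level anomalies; the class and variable names are made up for illustration, and the uninitialized use is left commented out because the Java compiler refuses it outright.

```java
// Made-up method showing the static-level anomalies named above.
public class DiscountExample {

    public double finalPrice(double price) {
        double tax;               // definition-use anomaly: 'tax' is declared but never used
        double discount = 0.10;   // redundant definition: this value is overwritten before it is ever used
        discount = 0.15;

        double total;
        // Uninitialized use: reading 'total' here would use a variable before any value
        // was assigned. Java rejects this at compile time, so the line stays commented out:
        // double doubled = total * 2;

        return price - (price * discount);
    }
}
```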

At the dynamic level, the application is executed, and the flow of data values/variables is analyzed. Issues that can be found at this level include data corruption (the value of a variable is modified in an unexpected way, leading to undesirable behaviors), memory leaks (unnecessary memory allocations, so that memory consumption becomes uncontrolled), and invalid data manipulation (data is manipulated in an unpredicted way, causing unexpected outputs).

The steps of data flow testing include identifying the variables, constructing a control flow graph, analyzing the flow of data, identifying anomalies in the data, dynamic testing, designing test cases and executing them, resolving anomalies, and documentation. I decided to read about data flow testing because I thought it would be valuable to learn how this type of testing allows for early bug detection and overall improved code quality and maintainability. The obvious downsides, however, are that not every possible anomaly can be caught, the process can be time-consuming, and other testing techniques should also be used alongside it.

From the blog CS@Worcester – Shawn In Tech by Shawn Budzinski and used with permission of the author. All other rights reserved by the author.

Why Boundary Testing is so Important

Boundary testing is an integral aspect of software testing used to test the extreme limits of a program. It consists of testing the minimum value, a value right above the minimum, a nominal value, the maximum value, and a value right below the maximum. If we’re using a stronger testing method, we would also test below the minimum and above the maximum.

By testing this way, we can assure that the “boundaries” of the program work correctly, as well as the center between those boundaries. We can also ensure that invalid inputs return an invalid response from the program.
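As a small illustration of those test points, here is what the checks might look like in a JUnit test; the AgeValidator class and its 18-to-65 range are made-up assumptions, not taken from the referenced blog.

```java
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

// Made-up validator: ages from 18 to 65 inclusive are valid.
class AgeValidator {
    boolean isValid(int age) {
        return age >= 18 && age <= 65;
    }
}

class AgeValidatorBoundaryTest {
    @Test
    void boundaryValues() {
        AgeValidator v = new AgeValidator();
        assertFalse(v.isValid(17)); // below minimum (the "stronger" method's case)
        assertTrue(v.isValid(18));  // minimum
        assertTrue(v.isValid(19));  // right above minimum
        assertTrue(v.isValid(40));  // nominal value
        assertTrue(v.isValid(64));  // right below maximum
        assertTrue(v.isValid(65));  // maximum
        assertFalse(v.isValid(66)); // above maximum (the "stronger" method's case)
    }
}
```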

As Jayesh Karkar says in his blog “Everything You Need to Know About Boundary Value Testing”, “This testing is imperative when it comes to testing the functional competence of software modules.”

He mentions some factors that justify the necessity of boundary value testing: functionality errors are more likely to occur at boundaries; the technique only tests a small range of values, making it simple to execute; and, because the testing is based on guidelines and a framework, there is no chance of compromising the boundary values, which ensures maximum test effectiveness.

Now as most of us in the computer science field know, there are many ways of testing programs. Boundary value testing is just one of them. With this in mind, boundary value testing is certainly one you should keep in your “tool kit” and combine with other forms to create the most effective tests possible.

As I continue my career as a student and my future career as a computer scientist, I plan on taking this form of testing with me wherever I go as a fundamental.

From the blog CS@Worcester – DPCS Blog by Daniel Parker and used with permission of the author. All other rights reserved by the author.

Mocha, Chai, and Why

URL: TestoMat Mocha and Chai Web Application Testing
The chosen article walks us through the implementation and inner workings of two JavaScript testing tools. Mocha and Chai are two different testing tools that, when combined, form a powerful addition to any software development project. Mocha serves as a test runner, being responsible for executing the test suites. While Mocha can function independently and does not necessarily require any additional tools, combining it with Chai, an assertion library, allows for the validation of expected outcomes in test cases.

The blog post also provides us with several reasons why the utilization of both tools can help improve your software development speed and quality. Among the listed benefits are:

Mocha:

  • Flexible and customizable testing.
  • Simplifies the testing of asynchronous code.
  • The ability to apply before, beforeEach, after, and afterEach.
  • Runs tests in both browser and Node.js environments.

Chai:

  • Provides clear and expressive assertions that improve the readability of your test scripts.
  • Allows you to write descriptive tests using natural language constructions.
  • Enables the creation of custom assertions tailored to your specific testing requirements.
  • Supports assertions on complex data structures such as arrays, objects, etc.

At the end, the article presents an opportunity to implement Mocha through some real-world examples, which is great and especially helpful for anyone unfamiliar with the framework.

The reason I chose this article relates mainly to another class. I will be taking on the task of developing test suites for the Thea’s Pantry app. I also found it interesting because it explains why you would use both Mocha and Chai instead of using other tools or Mocha alone.

The content is helpful and does not seem biased: it does not present Mocha and Chai as the only testing environment for JavaScript. Instead, it highlights their pros and invites the reader to determine whether these tools will suit their specific use case. One thing that caught my attention about using Mocha is its ability to generate documentation through testing. This is great, as it will help me better understand how documentation related to testing is created and how useful it can be.

Mocha and Chai’s use of natural, human-like language makes testing much easier. The use of natural or human-like language in programming often makes me skeptical of such tools, libraries, or frameworks. Sometimes, this characteristic is marketed as a way to draw people in, as if it will help them code more effectively. Although Mocha and Chai are distinct tools, their implementation of this characteristic leans more toward improving readability for the programmer rather than for just anyone. What I mean by this is that they avoid technical or overly formal wording in favor of keywords that resemble everyday human language.

From the blog CS@Worcester – CS Today by Guilherme Salazar Almeida Nazareth and used with permission of the author. All other rights reserved by the author.

More About Mocks

This week in class, we took another deep dive into the world of test doubles, this time pertaining to mocks specifically. While doing class activities related to mocks, it seemed to me that mocks are easier to use than stubs while offering more insight into how a program is supposed to run. With that in mind, I took the opportunity to learn more about mocks. I ended up finding a blog that gave me an even better understanding of them, called “Understanding Mock Objects in Software Testing: A Tale of Simulated Reality”.

One quote from the blog does a great job of summing up what mocks are: “It’s like a stand-in actor in a movie – the mock object behaves like the real one but is controlled within the testing environment.” Mocks mimic the actions of the objects they replace when called upon in a test class. Mocks also check the order and frequency of the method calls in which they are used, which stubs don’t do.

Frameworks

There are mocking frameworks for many different languages. Some common mocking frameworks include Mockito in Java, Moq in .NET, and Jasmine in JavaScript. Mockito is popular for its simplicity. It does well at creating mock objects for a certain class or database, simulating how a real object would respond in the program. Moq is primarily used for C# applications and is popular for its strong typing and fluent interface. A language with strong typing demands that data types be specified, so variables must have a type when they are defined. Jasmine can mock HTTP requests, which allows front-end applications to be tested without back-end interactions.

Pros

Some of the benefits of using mocks are that they run in a controlled and predictable environment and that they are efficient. Mocks give predetermined behaviors and don’t leave the testing framework, which leads to consistency and predictability. And since mocks are used in place of real-world objects, the testing process is sped up. This allows testers to isolate external factors and focus solely on the code being tested. Mock testing is also a cheaper alternative to testing against real-world objects that may be very expensive.

Cons

A drawback of mock testing is that while tests may pass in the testing environment, they may not in real-world situations. And just like stubs, changes to the system can cause mocks to become outdated, so it is important that mocks are frequently revised in order to be kept up to date with any changes to the system being tested.

Example
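Roughly, the test described below might look like this with Mockito and JUnit. The exact wiring here (a single-method WeatherService interface and a WeatherModule that receives the service through its constructor) is an assumption made only for the sake of the sketch.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import org.junit.jupiter.api.Test;

// The external dependency (weather lookups we don't want to call for real).
interface WeatherService {
    String getWeather(String city);
}

// The class being tested; it simply asks the service for the weather.
class WeatherModule {
    private final WeatherService service;

    WeatherModule(WeatherService service) {
        this.service = service;
    }

    String getWeather(String city) {
        return service.getWeather(city);
    }
}

class WeatherModuleTest {
    @Test
    void returnsSunnyForNewYork() {
        WeatherService mockService = mock(WeatherService.class);
        when(mockService.getWeather("New York")).thenReturn("Sunny");

        WeatherModule module = new WeatherModule(mockService);
        String weather = module.getWeather("New York");

        assertEquals("Sunny", weather);
        verify(mockService).getWeather("New York"); // the mock also checks how it was called
    }
}
```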

In the example above, Mockito is the framework being used. WeatherService represents the external dependency, and WeatherModule represents what is being tested. mockService should return “Sunny” when “New York” is the string passed to the .getWeather() method. This allows WeatherModule to be tested in isolation. module.getWeather() is then assigned to the string variable “weather”. From there, weather is tested to see if it returns “Sunny”. This is fascinating because the tests compile and are run using make-believe objects.

Reference

https://www.thetesttribe.com/blog/mock-testing/

From the blog Blog del William by William Cordor and used with permission of the author. All other rights reserved by the author.

Use-Testing Graphs

In the previous week we spoke about use-testing graphs, which are graphs of nodes and edges where the edges show the flow of the program and each node is a line of code. These can be very big graphs for a large project, but luckily we can always shrink them down by taking out unnecessary nodes. This also means redirecting the edges correctly, but after working with the full model it is much simpler to shrink it down. The reduced graphs are called decision-to-decision path graphs, which remove redundancy while keeping the important parts of the graph visible and consistent with the flow of the program. These graphs go well with decision tables, where each condition is a True or False scenario with a “don’t care” option, and the two can be used together with the path graph.

We also learned to define the paths by splitting each variable, the node where it is defined, and the nodes where it is used into separate columns. This simplifies things, letting us assign node numbers to each variable. For example, node 1 defines numCells, which is used in nodes 4 and 10; when we use this method, we can see where each variable is defined and used, and therefore, when we shrink our graphs, we know how to better organize them.
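As a small illustration of that define/use idea, here is a hypothetical method with its statements treated as nodes; the node numbers in the comments are only for this sketch and do not match the class exercise.

```java
// Hypothetical method with each statement treated as a node; the comments mark
// where each variable is defined and where it is later used.
public class CellCounter {
    public int countFilled(int[] cells) {
        int numCells = cells.length;          // node 1: numCells defined
        int filled = 0;                       // node 2: filled defined
        for (int i = 0; i < numCells; i++) {  // node 3: i defined; numCells used
            if (cells[i] != 0) {              // node 4: i used
                filled = filled + 1;          // node 5: filled used and redefined
            }
        }
        return filled;                        // node 6: filled used
    }
}
```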

Overall, the use of these graphs helps us organize test cases for the entire system. They help us determine whether something is useful or not: an empty line, for example, can be counted as a node because the IDE is still going to check whether there is code on that line, but in the final graph empty lines and else statements can be erased or shown as an edge. The graphs also help us notice what we need for the test cases, not only what we do not need. This makes it easier to visualize the structure of the code while also being able to communicate what is required to make the test cases, because the last thing anyone wants to do is write unnecessary test cases or an overabundant amount of test cases that should not be there.

Also, by using path testing you can easily visualize which node interacts with a certain edge or another node. If you need to explain it to a team of engineers, it will be easy and organized for anyone to look at while also holding all the valid and important information they need to create the test cases for the software. I also learned about the cases that come along with these graphs, cases 1 through 5. Case 1 is a node with no edge going into it, which is always the first node of the graph. Case 2 is a node with no edge going out to another node, implying in some cases the last node. Case 3 is a node with an indegree of 2 or more or an outdegree of 2 or more. Case 4 consists of an indegree of 1 and an outdegree of 1, which usually occurs in variable declarations or in sequential code. Case 5 is a single entry or single exit. These cases help define the nodes further, in order to dive deeper into the flow of the program.

Source: Software Testing – Use Case Testing | GeeksforGeeks

From the blog Cinnamon Codes by CinCodes and used with permission of the author. All other rights reserved by the author.

Week 10- Stubs and Mocks

During week 10, we experimented with stubs and mocks. Stubs and mocks are two forms of test doubles that allow a team to write tests before the whole program has been written. Stubs and mocks simulate the code’s fleshed-out methods and allow testing methods to be written in advance.

Stubs are methods or classes that are either empty or return a set value, so that the code can run and be tested. Stubs are state testers and focus on testing the outcome of methods. Mocks are more dynamic: the test block can define what it wants the outcome of any method to be and then test for that outcome. It can also test for multiple set outcomes in the same block. Mocks are behavior testers and test the interactions between methods rather than just the outcome.
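To make the distinction concrete, here is a minimal sketch in Java with JUnit and Mockito; the PriceService interface and Checkout class are made up for illustration.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.times;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import org.junit.jupiter.api.Test;

// Made-up dependency and class under test.
interface PriceService {
    double priceOf(String item);
}

class Checkout {
    private final PriceService prices;

    Checkout(PriceService prices) {
        this.prices = prices;
    }

    double total(String item, int quantity) {
        return prices.priceOf(item) * quantity;
    }
}

class CheckoutTest {
    @Test // Stub: a hand-written double that always returns a set value; the test checks the outcome (state).
    void stubChecksTheOutcome() {
        PriceService stub = item -> 2.50;
        assertEquals(7.50, new Checkout(stub).total("apple", 3), 0.001);
    }

    @Test // Mock: a Mockito double that also lets the test verify the interaction (behavior).
    void mockChecksTheInteraction() {
        PriceService mockPrices = mock(PriceService.class);
        when(mockPrices.priceOf("apple")).thenReturn(2.50);

        new Checkout(mockPrices).total("apple", 3);

        verify(mockPrices, times(1)).priceOf("apple");
    }
}
```

The first test only cares about the returned total; the second passes only if priceOf() was actually called once with “apple”, which is the behavioral check that looking at the outcome alone would not catch.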

The reference I found was a blog that compares and contrasts stubs and mocks by BairesDev to get a better idea of their use cases. They explained what stubs and mocks are, the advantages and disadvantages of each, and the situations to use both of them.

Advantages of stubs are their predictability and the way they isolate the method under test. They will always return what you are expecting because of how simple they are. Since stubs do not involve any other calls or methods, they are great at isolating testing to just that method. Disadvantages are user error and the lack of behavior testing. The user might have a discrepancy between what they return in the stub and what they expect in the test. If your method needs to interact with other methods, stubs are not great for testing behavior because they only look at the outcome.

Advantages of mocks are that they are better at catching subtle bugs and issues and at testing the interactions between methods in your code. Since mocks are behavioral tests, if the methods don’t interact how they’re expected to, the test will not pass, which goes beyond what stubs do. Disadvantages are increased complexity and brittle tests. Mocks make your tests more complex than if you were testing the fully written code, which may take more time to adapt to once the program is finished. Brittle tests can occur if tests are too tightly coupled to the mock expectations, where small changes can cause a lot of errors.

Overall, stubs are useful when testing independent methods and those that only need to be tested for the outcome. Mocks are useful when methods are dependent on others and can find errors that might not show up if you were just testing outcomes. Both are great when writing tests, but have different applications and both should be used when testing programs. 

Source: https://www.bairesdev.com/blog/stub-vs-mock/

From the blog CS@Worcester – ALIDA NORDQUIST by alidanordquist and used with permission of the author. All other rights reserved by the author.