White box testing involves using a program's internal structure to your advantage during the testing phase. One component of this internal structure usually makes up a small percentage of the code body but can contribute a large share of the problem cases: variable and data type declarations. Testing for these cases is called dataflow testing. In the blog "All about dataflow testing in software testing," Prashant Bisht details how dataflow testing is implemented and gives some examples of what it might look like.
Before any interaction with the code itself, dataflow testing starts with a control flow graph, which tracks where variables are defined and where they are used. This organization enables the first important check in dataflow testing: tracking unused variables. Removing these unused variables can help narrow the search for the source of other problem cases. The second anomaly commonly tested for is the undefined variable. These are more obvious than unused variables, since they almost always produce an error, because the program relies on nonexistent data. The final anomaly tested for is multiple definitions of the same variable; the redundancy this introduces can lead to unexpected results or output. The sketch below illustrates all three.
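To make these anomalies concrete, here is a minimal sketch of my own (not from the blog); the class and variable names are hypothetical, and the point is just to show what a dataflow test would flag in each case.

```java
public class DataflowAnomalies {
    public static int compute(int input) {
        int unused = 42;          // anomaly 1: defined but never used

        int undefined;            // anomaly 2: declared but never assigned;
        // return undefined;      // using it here would fail, since the
                                  // program would rely on nonexistent data

        int total = input * 2;    // anomaly 3: 'total' is defined here...
        total = input * 3;        // ...and redefined before the first value
                                  // is ever used, a redundant definition
        return total;
    }
}
```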
Subtypes of dataflow testing exist, each specialized for analyzing data in a different way. For example, static dataflow testing tracks the flow of variables without running the code under test; it consists only of analyzing the code's structure. Dynamic dataflow testing, by contrast, focuses on how the data held by variables changes throughout the code's execution.
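To show the contrast, here is a small hypothetical sketch of my own: a static tool would flag the definition-use pairs just by reading the source, while a dynamic approach observes the variable's values as the program actually runs, for instance through trace output like this.

```java
public class DynamicTraceDemo {
    public static void main(String[] args) {
        int balance = 100;                              // definition
        System.out.println("def: balance=" + balance);  // runtime trace point
        balance -= 30;                                  // use followed by redefinition
        System.out.println("use/def: balance=" + balance);
    }
}
```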
To show how dataflow testing works in practice, the author provides an example involving the variables num1, num2, and num3. First, the initialization of these variables is checked; for instance, if num1 is accidentally declared as int nuM1 = some_int, the testing phase would catch the misspelling. Next, it ensures that the use of these variables doesn't cause errors, which depends on the program's specifications, such as whether the program is meant to add the variables together. The data flow is then analyzed to ensure that operations involving multiple variables function properly: if num1 + num2 = result1 and num2 + num3 = result2, the dataflow phase would verify that the operation result1 + result2 = result3 works as intended (though whether result3 is properly defined is a problem that would be handled by the first phase). The final phase is the data update phase, where the results of operations are verified to match their expected values.
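Here is one way that walkthrough might look as an actual test. This is my own sketch using JUnit 5, not code from the blog, and the concrete values are just illustrative; the assertions mirror the phases described above.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

public class DataflowExampleTest {
    @Test
    public void tracksFlowAcrossOperations() {
        int num1 = 2, num2 = 3, num3 = 4;  // initialization: every variable defined

        int result1 = num1 + num2;         // use: each variable feeds an operation
        int result2 = num2 + num3;

        int result3 = result1 + result2;   // flow: earlier results combine downstream

        assertEquals(12, result3);         // update: final value matches expectation
    }
}
```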
From the blog CS@Worcester – My first blog by Michael and used with permission of the author. All other rights reserved by the author.