Category Archives: Python

Getting Started With Python Unit Testing After Learning JUnit

Christian Shadis

CS-443 Self-Directed Blog Post #5

This past semester I have been spending a lot of time in my QA class working with JUnit 5. I want to be able to take my familiarity with JUnit and apply the same principles with unit testing in Python. I am leaning toward a data-centric career path and Python is widely used for data analytics, so this would be valuable information for me.

This post is not an expert-authored tutorial on Python unit testing because I, myself, am just getting started with it. In this post I will instead give tiny, bite-sized examples of just the basics, translating it from JUnit to Python unittest. I built identical, small classes in Java and Python, and will build tests to go with them. Below are the classes in Java and Python, respectively.

/**
 * Simple class with basic methods, written in Java
 * @author Christian Shadis
 */
public class main {
    public static void main(String[] args){
        int i = 0; // dummy code to keep compiler happy
    }

    public static int addTwoNumbers(int x, int y){
        return x + y;
    }

    public static String toCapital(String str){
        return str.toUpperCase();
    }
}

# Simple class with basic functions, written in Python
# Author: Christian Shadis

class main:
    @staticmethod
    def add_two_numbers(x, y):
        return x + y

    @staticmethod
    def to_capital(string):
        return string.upper()

Writing the unit tests in JUnit is simple: we import the JUnit assertions, and the @Test annotation. Then we create the test class, each of the two tests, setup, exercise, and verify just as always.

/**
 * Test Class for main.java
 * @author: Christian Shadis
 */
import static org.junit.jupiter.api.Assertions.*;
import org.junit.jupiter.api.Test;

public class maintest {

    @Test
    void testAddTwoNumbers(){
        int result; // setup
        result = main.addTwoNumbers(566, 42); // exercise
        assertEquals(608, result); // verify (expected value first, then actual)
    }

    @Test
    void testToCapital(){
        String result; // setup
        result = main.toCapital("string"); // exercise
        assertEquals("STRING", result); // verify (expected value first, then actual)
    }
}

Luckily, writing unit tests is just as easy in Python as in Java. We import the unittest library and the class under test, then define the test class as a subclass of unittest.TestCase.

import unittest
from main import *

class maintest(unittest.TestCase):

    def test_add_two_numbers(self):
        pass

    def test_to_capital(self):
        pass

Now we need a test case for each of our two functions, add_two_numbers and to_capital. Python’s unittest.TestCase provides assertions very similar to those in JUnit. Use assertEqual(x, y) to check that x == y. This is the assertion we will use in this example, but any of the following are commonly used unittest assertions:

  • assertEqual
  • assertNotEqual
  • assertTrue
  • assertFalse
  • assertIs
  • assertIsNot
  • assertIsNone
  • assertIsNotNone
  • assertIn
  • assertNotIn
  • assertIsInstance
  • assertNotIsInstance
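
As a rough illustration of a few of the assertions listed above, here is a small sketch. The class name AssertionSamplerTest and the sample values are made up for this example and are not part of the code from this post.

```python
import unittest


class AssertionSamplerTest(unittest.TestCase):
    """Hypothetical test case that exercises several common assertions."""

    def test_equality(self):
        self.assertEqual(2 + 2, 4)        # passes when the values are ==
        self.assertNotEqual(2 + 2, 5)

    def test_truthiness(self):
        self.assertTrue("abc".islower())
        self.assertFalse("abc".isupper())

    def test_identity_and_none(self):
        value = None
        self.assertIsNone(value)          # checks that value is None
        self.assertIsNotNone("something")

    def test_membership_and_type(self):
        self.assertIn(3, [1, 2, 3])
        self.assertNotIn(4, [1, 2, 3])
        self.assertIsInstance("hi", str)
        self.assertNotIsInstance("hi", int)
```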

assertEqual takes two arguments and, as the name suggests, asserts their equality. See the implementation below:

    def test_add_two_numbers(self):
        result = main.add_two_numbers(33, 44)
        self.assertEqual(result, 77)

    def test_to_capital(self):
        result = main.to_capital("hi")
        self.assertEqual(result, "HI")
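
Putting the pieces together, a self-contained version of the test file might look like the sketch below. To keep it runnable as a single file, it defines a stand-in Main class inline instead of importing from main.py; that class and the run_tests helper are assumptions made for this example.

```python
import unittest


class Main:
    """Stand-in for the post's main class (defined inline for this sketch)."""

    @staticmethod
    def add_two_numbers(x, y):
        return x + y

    @staticmethod
    def to_capital(string):
        return string.upper()


class MainTest(unittest.TestCase):

    def test_add_two_numbers(self):
        result = Main.add_two_numbers(33, 44)   # exercise
        self.assertEqual(result, 77)            # verify

    def test_to_capital(self):
        result = Main.to_capital("hi")          # exercise
        self.assertEqual(result, "HI")          # verify


def run_tests():
    """Load MainTest and run it, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MainTest)
    return unittest.TextTestRunner(verbosity=2).run(suite)


if __name__ == "__main__":
    run_tests()
```

Outside an IDE, tests like these can also be run from a terminal with python -m unittest, which by default discovers files named test*.py.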

Run the tests and you will see them both pass. Below is a screenshot of the tests executing in Python and then in JUnit. As you can see, the Python tests are slightly faster than the Java tests, but not by much. I used IntelliJ IDEA and PyCharm.

I have found the most helpful way to learn testing is to just play around with the unit tests, see what works and what doesn’t, see what causes failures and what those failures look like, and so forth. I would suggest that any other beginner QA student do the same. Playing around with different assertions and looking at the unittest documentation is a great way to learn this library. I hope this post gave you some insight into how to get started with unit testing your Python modules if you have done some work with JUnit in the past.

4/28/2021

Works Cited:
unittest – Unit testing framework. (n.d.). Retrieved April 28, 2021, from https://docs.python.org/3/library/unittest.html

From the blog CS@Worcester – Christian Shadis' Blog by ctshadis and used with permission of the author. All other rights reserved by the author.

All Aboard the Coding Train!

cue Ozzy Osbourne laughter…

This blog is coming to you direct from Amtrak Northeast Regional Train 95, where Stoney Jackson and I are on our way to POSSE 2014 at Drexel University in Philadelphia, PA. This is becoming an annual tradition for us.

So, why is it called the Coding Train? Because we are spending the 5 hour train ride writing code!

When we did this for the first time last year, we worked on the code for the grading scripts that I had started writing in bash (https://github.com/kwurst/grading-scripts/tree/bash-version). Stoney started adding error checking, and then a Python version – neither of which he finished, but we learned a lot about how GitHub works for collaborative development.

This year we discussed a number of options for what project we would sprint on (after we spent a lot of time on professor-talk about curricula, and courses, and learning outcomes, and assessment) but we ended up back on the same project. This time our starting point was the Python conversion of the original scripts that I had started in December, and which I had just begun to refactor this month (https://github.com/kwurst/grading-scripts/tree/master).

Stoney has been doing some serious refactoring on the code, adding one major new feature: a JSON configuration file so that I don’t need 15 different scripts – just different configuration files to pass to a single, more general script. He’s also undertaken a major cleanup of the code, and added the project’s first unit test!

I, on the other hand, have been installing tools that Stoney suggested – git flow and git bash prompt, and in the process having to debug my Mac’s installation of Homebrew and cleaning up my .bashrc file (being completely ignored by my shell) and my .bash_profile file (full of lots of cruft from previous installs.)

Stoney has just pushed his branch, so now it’s time for me to pull it, and test it on some data on my computer. And we’re almost to Philadelphia, so just in time…

From the blog On becoming an Eccentric Professor... » CS@Worcester by Karl R. Wurst and used with permission of the author. All other rights reserved by the author.

Code Break: GitLab API Part 1 – Creating student accounts

Now that we have the CS Department’s GitLab server set up, and CS-140 Lab 1 is rewritten and tested using the new server, I’ve started to think about how to automate my interactions with the server. I had already written some Bash scripts to interact with the Bitbucket server to get student code, convert it to PDF files, and put it back on the server after grading. Those scripts should still work fine with GitLab, since it’s just git on a different server.

One thing that I had not been able to automate previously is the step of issuing a pull request for students to merge my grading branch into their repository. This was not too much of an issue when there were only 6 students in the summer class (so only 3 repositories per lab assignment), but it was going to take more time with ~48 students in the spring class. While reading RSS feeds, I came across a post mentioning the GitLab API. This could be the solution to my problems! And there’s a Python module for the API! I had already been writing Python scripts to make my grading easier, and had been starting to rewrite my Bash scripts in Python.

I started playing with the GitLab API in Python, and had managed to create a merge request (GitLab’s term for pull request.) I had also noticed that you could create GitLab accounts through the API. This seemed like something I should pursue – creating ~48 accounts per semester seemed like something that should be automated.

Since I intended to post my code on GitHub, one of the first issues I had to address was how to avoid publishing my private token for GitLab. I could have put in a dummy token before pushing my code, but I would have to remember to do that every time I committed my code. The solution was to use the .gitignore file. If I put my token into a file, then I could add a line to my .gitignore file so that it would not be committed.

# Private GitLab Token - not to be stored in repository #
########################################################
gitlabtoken.txt

Then I could just read the token out of the file, and use that string.

# Get my private GitLab token
# stored in a file so that I can .gitignore the file
token = open('gitlabtoken.txt').readline().strip()
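
A slightly more defensive way to read the token is sketched below. The read_token helper is hypothetical and not part of the original script; it uses a with block so the file is closed promptly, and it fails with a clear message if the token file turns out to be empty.

```python
from pathlib import Path


def read_token(path='gitlabtoken.txt'):
    """Return the first line of the token file, stripped of whitespace."""
    with Path(path).open() as token_file:
        token = token_file.readline().strip()
    if not token:
        raise ValueError(f'No token found in {path}')
    return token
```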

After importing the pyapi-gitlab module, I could use that token, along with the server’s URL, to create a GitLab object. Notice that I had to turn SSL verification off, since we only have a self-signed certificate.

# Create a GitLab object
# For our server, verify_ssl has to be False, since we have a self-signed certificate
git = gitlab.Gitlab(GITLAB_URL, token, verify_ssl=False)

Creating a user account is pretty simple using the API:

# Create the account  
success = git.createuser(name, username, password, email)

The returned success value is a boolean — either it worked, or it failed (but you can’t tell why…).

One thing that’s a bit odd about the createuser call is that you have to set a password for the user, but the notification email to the user doesn’t include the password. (If you create a user account from the web interface, it generates a random password, includes it in the notification email to the user, and requires the user to change their password when first logging in.) And the password you set doesn’t seem to work either!

So, I’m just telling the students that they should use the “Forgot Password” link to have a password reset email sent to them, and then proceed from there. (If this is ever fixed, I’ll have to generate a random password.)

Getting the class list as a CSV file from the Blackboard Grade Center is pretty easy, and the first three columns contain the student’s last name, first name, and username. I can use those three strings to generate the name, username, and email needed for the createuser API call.

The only challenge with processing the CSV file is that Blackboard puts some strange character at the beginning of the file, so the file has to be opened with utf-8 encoding. (And the header line needs to be thrown away.)
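
The CSV handling described above could be sketched as follows. This assumes the strange leading character is a UTF-8 byte order mark (which the utf-8-sig codec strips), that the columns appear in the order last name, first name, username, and that email addresses follow a username@example.edu pattern. The read_students helper and the domain are placeholders, not taken from the actual script.

```python
import csv


def read_students(filename, email_domain='example.edu'):
    """Return a list of (name, username, email) tuples from a Blackboard CSV.

    Assumes columns 0-2 hold last name, first name, and username.
    """
    students = []
    # 'utf-8-sig' decodes UTF-8 and drops a leading byte order mark, if any
    with open(filename, newline='', encoding='utf-8-sig') as csvfile:
        reader = csv.reader(csvfile)
        next(reader, None)  # throw away the header line
        for row in reader:
            if len(row) < 3:
                continue  # skip blank or malformed rows
            last, first, username = row[0], row[1], row[2]
            name = first + ' ' + last
            email = username + '@' + email_domain  # hypothetical address pattern
            students.append((name, username, email))
    return students
```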

The last thing I wanted to add is a way to have optional verbose output, so that I could see if the user creation was working. (I decided that it should always notify the user if the account creation failed.) To do this I had to learn two new things about Python: how to parse arguments [1], and how to send output to stderr.

I used the argparse module:

import argparse
# Set up to parse arguments
parser = argparse.ArgumentParser()
parser.add_argument('filename', help='Blackboard CSV filename with user information')
parser.add_argument('-v', '--verbose', help='increase output verbosity', action='store_true')
args = parser.parse_args()

and used the verbose argument to determine what to print:

if not success:
    sys.stderr.write('Failed to create account for: '+name+ ', '+username+', '+email+'\n')
elif args.verbose:
    sys.stderr.write('Created account for: '+name+', '+username+', '+email+'\n')
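
For reference, the two pieces can be combined into one small sketch. The build_parser and report helpers and the sample values are invented for this example; print(..., file=sys.stderr) is used here as an alternative to calling sys.stderr.write directly.

```python
import argparse
import sys


def build_parser():
    """Create a parser with one required filename and an optional -v flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument('filename', help='Blackboard CSV filename with user information')
    parser.add_argument('-v', '--verbose', help='increase output verbosity',
                        action='store_true')
    return parser


def report(success, verbose, name, username, email):
    """Always report failures; report successes only in verbose mode."""
    details = ', '.join([name, username, email])
    if not success:
        print('Failed to create account for: ' + details, file=sys.stderr)
    elif verbose:
        print('Created account for: ' + details, file=sys.stderr)


if __name__ == '__main__':
    # Passing an explicit list stands in for real command-line arguments
    args = build_parser().parse_args(['students.csv', '--verbose'])
    report(True, args.verbose, 'Jane Smith', 'jsmith1', 'jsmith1@example.edu')
```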

Full code is on GitHub here.

  1. I already knew how to do simple argument parsing, but I wanted to learn how to deal with optional arguments.

From the blog On becoming an Eccentric Professor... » CS@Worcester by Karl R. Wurst and used with permission of the author. All other rights reserved by the author.

Code Break: Making My Grading Easier

Downloading student assignment files from Blackboard as a single zip file saves a lot of time — you don’t have to individually open each “attempt”, download the file (renaming it in the process, so you don’t keep overwriting the previous file, since they are all named “Homework1.pdf” ;) ), and then move on to the next one. Instead you get one convenient .zip file that contains all of the assignment files.

Unfortunately, Blackboard does some other things that make your life a bit more difficult. Once you unzip the file, you will find:

  1. The student files are renamed from filename.ext to assignmentname_username_attempt_datetime_filename.ext
  2. A text file is created for each student named assignmentname_username_attempt_datetime.txt even if the student has not entered any text data or comments.

Before I can start grading, I have to check all of the text files to see if they really contain a comment, delete those that don’t, and rename all of the assignment files to username.ext [1]. This process takes 15 minutes or more per assignment, which certainly lowers my enthusiasm for grading.

Today, I decided that I should write some code to automate this task. The time it would take to write the script would be recouped in only a few assignments. I decided to write the script in Python because I could easily see how to do the string manipulations. My shell scripting string manipulations are not as good. I would have to learn how to do the file system manipulations in Python, but I figured that would be relatively simple.

The first step is getting a list of all the files in the directory (leaving out all of the subdirectories) [2]:

onlyfiles = [ f for f in os.listdir(dir) if os.path.isfile(os.path.join(dir,f)) ]

The next step is filtering that list to get just the .txt files:

txtfiles = [ f for f in onlyfiles if f.endswith('.txt') ]

Then you can search the contents of the textfiles. You’ll notice that there are two characteristic phrases that indicate no text data and no comments. You can just delete the files that contain both of those:

for f in txtfiles:
    # the with block closes the file even if reading fails
    with open(f) as file:
        contents = file.read()
    if ('There are no student comments for this assignment' in contents and
            'There is no student submission text data for this assignment.' in contents):
        os.remove(f)
        print('Deleted', f)

After refreshing the list of files to be just the remaining files, you can go about renaming the files. They all have _attempt_ embedded in their filename. Then you want to strip off everything up-to-and-including the first underscore, and from the second underscore up to the file extension. Then rename the file.

for f in onlyfiles:
    if '_attempt_' in f:
        first = f.find('_') # location of first underscore
        second = f.find('_',first+1) # location of second underscore
        extension = f[f.rfind('.'):] # get file extension
        newf = f[first+1:second] + extension
        os.rename(f, newf)
        print('Renamed', f, 'to', newf)
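
The renaming rule can also be pulled out into a small function so it can be tried on a filename without touching the file system. This is only a sketch: short_name and the sample filename are made up, and like the loop above it assumes the assignment name itself contains no underscore.

```python
def short_name(filename):
    """Map assignmentname_username_attempt_datetime_filename.ext to username.ext."""
    first = filename.find('_')               # first underscore
    second = filename.find('_', first + 1)   # second underscore
    extension = filename[filename.rfind('.'):]
    return filename[first + 1:second] + extension


if __name__ == '__main__':
    print(short_name('Homework1_jsmith_attempt_2013-11-20-10-15-30_Homework1.pdf'))
```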

There are probably other features I can add, but this works well enough for now. Back to grading…

Full code is on GitHub here.

  1. I may still have to convert some of them to PDFs, if the students have not followed instructions, since I grade them by marking up the PDFs on my iPad. But that’s something I’ll tackle later. For my programming classes, I do that with my grading scripts which are still a work-in-progress.
  2. http://stackoverflow.com/a/3207973

From the blog On becoming an Eccentric Professor... » CS@Worcester by Karl R. Wurst and used with permission of the author. All other rights reserved by the author.

Code Break: Data File Manipulations in Python

In my CS-135 Programming for Non-CS Majors class, one of the primary objectives for the students is to learn to work with collections of data in files. I’m always happy when this requires manipulations that can’t be performed with other tools that the students are comfortable with — thus motivating the need to learn to code.

This afternoon in class, students were working in groups on their final projects. Two groups came up against some problems in getting their data into a format that could be easily processed in Python. Both cases involved data that was only available in the form of PDF files.

The old standby of selecting text and pasting it into Excel did not provide nice columns of information. Our second attempt was to export the data as text.

Case 1

In the first case, we got text data that looked like:

Biology 306 N/A 306
Biotechnology 80 26 106
Business Administration 748 N/A 748
Chemistry 141 N/A 141
Communication 245 N/A 245
Communication Sciences & Disorders 218 N/A 218
Community Health 158 N/A 158
Computer Science 116 N/A 116
Criminal Justice 445 N/A 445
Early Childhood Education 80 19 99
Early Childhood Education, Non-Licensure 26 N/A 26

This looked promising – we’ve dealt with one-record-per-line-space-delimited data files in class before. You just need to read a line at a time, and use Python’s string split method to turn it into a list… But wait! The first item is a variable number of words separated by spaces. That will make for some messy lists, since they’ll all be of different lengths:

['Communication', '245', 'N/A', '245']
['Communication', 'Sciences', '&', 'Disorders', '218', 'N/A', '218']
['Community', 'Health', '158', 'N/A', '158']

Here’s the solution: Python lists can be indexed from the end using negative indices. So, we can definitely get at the last three values (numbers of majors — undergraduate, graduate, and total). Assuming a list in a variable department, they are at positions department[-3], department[-2], and department[-1] respectively.

But, what about the department name, which may be in multiple list items? Well, we can get it as a sub-list, using list slicing: department[:-3] yields:

['Communication']
['Communication', 'Sciences', '&', 'Disorders']
['Community', 'Health']

All that’s left is to concatenate them together into a single string:

name = ''
for item in department[:-3]:
    name = name + item + ' '
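
Putting the steps above together, one possible version is sketched below. The parse_department function name is invented for this example, and it uses ' '.join(...) in place of the loop, which also avoids the trailing space the loop leaves at the end of the name.

```python
def parse_department(line):
    """Split one record into (name, undergraduate, graduate, total)."""
    department = line.split()
    # The last three items are always the counts, however long the name is
    undergraduate = department[-3]
    graduate = department[-2]
    total = department[-1]
    # Everything before the counts belongs to the department name
    name = ' '.join(department[:-3])
    return name, undergraduate, graduate, total


if __name__ == '__main__':
    print(parse_department('Communication Sciences & Disorders 218 N/A 218'))
```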

Full code is here: https://gist.github.com/kwurst/7761789

Case 2

In the second case, we got text data that looked like:

Boston    00350000    4368    65.9    15.2    0.8    2.1    15.9    0.1
Boston Collegiate Charter (District)    04490000    34    67.6    32.4    0.0    0.0    0.0    0.0
Boston Day and Evening Academy Charter (District)    04240000    162    13.0    55.6    0.0    6.8    24.7    0.0
Boston Green Academy
Horace Mann Charter School
(District)    04110000    72    70.8    26.4    0.0    1.4    1.4    0.0
Boston Preparatory Charter Public (District)    04160000    27    74.1    11.1    0.0    3.7    11.1    0.0
Bourne    00360000    145    90.3    4.8    0.0    2.1    2.8    0.0
Braintree    00400000    369    95.1    3.3    0.3    0.3    1.1    0.0

Which could be fixed the same way, except for the fact that some of the district names ended up broken across multiple lines. (I’m not sure why this happened, and it turned out that exporting the data in a different way fixed the problem. But I’d already found a solution, so I’m going to document it here…)

Working from the assumption that the district org code always starts with a zero (I know — not a good assumption, but it works in this case…), the solution involves checking for lines with no zero in them and concatenating them together. Then you can treat the lines as in Case 1.

for line in f:
    while line.find('0') == -1:
        line = line + f.readline()
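
One caveat with the loop above: at the end of the file, readline() returns an empty string, so a trailing line with no zero in it would keep the while loop spinning. A more defensive sketch of the same idea is below. The merge_broken_lines helper is hypothetical, and it keeps the same (admittedly fragile) assumption that a complete record always contains a '0'.

```python
def merge_broken_lines(lines):
    """Join consecutive lines until the combined record contains a '0'."""
    records = []
    pending = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        pending.append(line)
        if '0' in line:
            # The org code has been seen, so the record is complete
            records.append(' '.join(pending))
            pending = []
    if pending:
        # Leftover fragment with no org code; keep it rather than lose data
        records.append(' '.join(pending))
    return records
```

Because it accepts any iterable of strings, it can be handed an open file object directly, and each returned record can then be treated as in Case 1.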

Full code is here: https://gist.github.com/kwurst/7761789

From the blog On becoming an Eccentric Professor... » CS@Worcester by Karl R. Wurst and used with permission of the author. All other rights reserved by the author.

My First Real FOSS Contribution

I spend a lot of my free time writing code. I usually work on my own
personal projects that never really go anywhere. So, I decided to take
a detour from my normal hacking routine and contribute to an existing
free software project. My contribution was accepted a while ago now,
but I wasn’t blogging then so I’m rambling about it now.

It’s wise to find a project with a low barrier to entry. An active IRC
channel and/or mailing list with people willing to help newcomers is
ideal. I remembered hearing about GNU MediaGoblin at LibrePlanet
2012, so I decided to check things out. MediaGoblin is a media sharing
web application written in Python. Their bug tracker marks tickets
that require little work and don’t require a deep understanding of
MediaGoblin as ‘bitesized’.

I chose to work on this ticket because it didn’t require any
complicated database migrations or knowledge of the media processing
code. I added a new configuration option, ‘allow_comments’, and a
small amount of code to enforce the setting.

Eventually, the ticket got reviewed and Christopher Webber
(MediaGoblin’s friendly project leader) merged it: "Heya. Great
branch, this works perfectly. Merged!"

It was a very small change, but I was happy to finally have some
actual code of mine in a real free software project. I have a strong
passion for free software and the GNU philosophy, so it’s really great
to participate in the community. My job as a professional software
developer eats up a lot of my time these days, but I hope to find the
time to continue hacking and contributing.

From the blog dthompson by David Thompson and used with permission of the author. All other rights reserved by the author.

More Python

While things with infrastructure were being dealt with, I decided to jump on the eutester documentation project. So, during class I cloned the repo that kwurst forked and got to it. I took some time throughout the week to go through multiple directories in the eutester code and try to understand how it all worked and connected. I’m still not sure how everything works, but I do have a much better understanding of the code.

As far as documentation goes, I did do some in the eucaweb directory. The files I added to were euwebaccount.py and euwebgroup.py. These are fairly basic scripts that set up accounts and groups, respectively, for testing purposes. Once I was done commenting those for the day, I committed and pushed my changes to the repo. I checked the GitHub page just to make sure, and seeing my first push ever was pretty cool!

It looks like a few of us are doing well on commenting the code, so I wouldn’t be surprised if we had this project done within the next week. Now that I have a better understanding of the code I will be able to get some more commits in before the semester is over.

From the blog clacroix12 by clacroix12 and used with permission of the author. All other rights reserved by the author.

Python and Euca2ools

Since Eucalyptus is coded in Python, I spent the past week going through online tutorials and learning more about how to code in Python. Compared to the programming languages I have used before, the syntax is very straightforward and easy to comprehend and learn. It may be that Python is far from the first programming language that I have learned, or that it is just an easy language to learn in general. I’m going to continue with the tutorials I have been using and hopefully understand most of what is going on in the Eucalyptus code so I can contribute to commenting it.

In addition to learning Python, I also had to redownload and initialize Euca2ools on my new laptop. This time was actually much more successful than the first time I attempted this. I couldn’t figure out how to download the credentials .zip from our matrix cluster yet, so I tested this process out with the ECC credentials. I was able to initialize, create a key pair, find an image, and create an instance of it. In addition to that, I was also able to assign that instance an available IP, SSH into it, and terminate the instance using euca2ools. I did have to use a few extra commands in order to make the process from our euca2ools wiki page work, which I later added in.

This is how you can generate an accessible instance IP:

euca-allocate-address

Associate the allocated address with your VM instance:

euca-associate-address <IP from allocate> -i <instance ID>

Last Monday I was put on the task of figuring out how to give multiple people permission to receive user requests and approve them. After some research and playing around with our matrix graphical interface, I couldn’t figure out how to give multiple users that ability. So as of right now, the only way that I know to have multiple people receive requests is to give them all access to the email address the requests get sent to. If we wish to have Professor Wurst have access to user requests, then we will either have to make a new email address that our “Administrators” can access, or just switch the email address to Professor Wurst’s so he will receive them.

Now that I have a better understanding of Python and euca2ools, I plan on getting to know more about eutester and how that area of our project works. I also plan on looking at more of the Eucalyptus code to see if I can start commenting parts of it. But more of that stuff soon enough! I’m looking forward to this coming week and seeing where our next step is as a class.

From the blog clacroix12 by clacroix12 and used with permission of the author. All other rights reserved by the author.

Meeting 6 – More research on Eutester

There was a lot of research I had to do on Eutester, since we will be using this framework a lot later on to test out the cloud. So the first thing I decided to do was try to understand the code in the Eutester files. They’re written in Python, so I started with understanding the Python language first. In general Python is an OOP language that is kinda similar to Java, so it didn’t take long to pick it up. I went through some manuals and tutorials online and put together the main differences between Python and other programming languages on the wiki. Here is the link: http://cs.worcester.edu/wiki/index.php?n=Main.Python.

Thanks also to Prof. Wurst for providing us with useful links to Eutester blogs, which let me test out some commands. But first things first: if anyone has an outdated Boto version installed on their computer (the latest version is 2.2), they need to get the latest version. Here is the link to get it: http://pypi.python.org/pypi/boto

After that was taken care of, I decided to try out the commands that were shown on the blog http://testingclouds.wordpress.com/2012/03/04/test1/ to test out the connectivity of the instance. Everything went smoothly up to when I was ready to ping the instance. But it failed; the ping was unsuccessful.

This was one busy week for me but I really learned a lot and finally, I’ve started to get my hands on testing out the EuTester. Can’t wait to test it out on our cloud.

From the blog longnguyen16 » wsu-cs by watever10 and used with permission of the author. All other rights reserved by the author.