A Scaredy-Cat's Guide to Getting Started in Automation Testing with Python

by Michael Ruttenberg

Scared cat is really scared by Kabukicho Shinjuku (CC BY-ND 2.0)

Scaredy-cat (noun, informal): a timid person, someone frightened by almost everything

Why am I a scaredy-cat? I am not from a developer background. Coding is scary and confusing. I don't understand lots of codey-techie stuff and I am impatient to just get going. Classes? Page objects? Huh?! Crickets...

So let me tell you how I got started in my automation journey...

The story

I once had a permanent QA role that gave me little to work on for 18 months. Yes, you read that right: 18 months! What to do? Between the long lunches and extensive use of Facebook, I needed something to do. I hate being bored.

I had a Raspberry Pi sitting around at home (still do, still boxed), and the Pi is usually programmed in Python. I've not done any programming since I had a ZX Spectrum (Timex 2000 in the US), so I thought, "Hey, I could learn some Python to program my Pi, and maybe use it at work too." I started reading a book about Python: Python Programming for the Absolute Beginner by Michael Dawson. (NOTE: This book is for Python 2; some syntax has changed in Python 3.) I also did a (free) Codecademy course in Python.

I needed something to do and a reason to do it, but I wasn't from a developer background, and I wasn't ready to jump in with the heavy languages like Java or the Microsoft-based ones like .NET. The best catalyst for learning something is having a problem to solve, so I found one. I had something repetitive and tedious that I needed to do in my hobby - amateur ("ham") radio.

The project

I had made a radio contact with Navassa, an obscure US island (an oversized rock) in the Caribbean between Haiti and Jamaica. I somehow hadn't logged the contact. I don't recall having made it, but I'm in their log so it must have happened, right? The guys on Navassa had uploaded their log and I was in it, so they had me but I didn't have them in my log (and yes, we keep a log of people we contact around the world). To get a pretty commemorative postcard (called a "QSL card") for this obscure place, I needed to provide the date and time of the contact. I didn't have those, so I wanted to reverse-engineer them.

The plan

  1. Work out the start date and end date of when they went to Navassa. This is published online so that was all good.
  2. Create a text file containing rows for my contact with them, with the timestamp value spaced in 5-minute increments across the 14-day period they were there: around 4000 rows, of which only 1 would be a match.
  3. Upload the file to the web-based matching system (clublog.org)
  4. Find the match and then delete the roughly 3999 non-matching rows. This involves running a search and then clicking a Delete button, then repeating the search over and over.
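
The file described in step 2 can be sketched roughly as follows. This is only an illustration, not the script I actually used: the start date and the plain timestamp format are made up for the example, and a real log file would need whatever row format the matching system expects.

```python
from datetime import datetime, timedelta


def candidate_timestamps(start, days=14, step_minutes=5):
    """Yield one datetime per step, starting at `start` and covering `days` days."""
    end = start + timedelta(days=days)
    current = start
    while current < end:
        yield current
        current += timedelta(minutes=step_minutes)


# Hypothetical start date, chosen only so the sketch can run.
start = datetime(2020, 1, 1, 0, 0)

# One text row per candidate timestamp (the format is an assumption).
rows = [ts.strftime("%Y-%m-%d %H:%M") for ts in candidate_timestamps(start)]

print(len(rows))   # 4032 (14 days x 24 hours x 12 five-minute slots)
print(rows[0])     # 2020-01-01 00:00
print(rows[-1])    # 2020-01-14 23:55
```

That is where the "around 4000 rows" figure comes from: every 5-minute slot in a fortnight.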

So I had a mission and now had to find out how to make that happen. Enter Selenium...

The first problem

Selenium and browsers change. From v48 onwards Firefox ripped out the bits that Selenium relied on, so midway through my Selenium journey, once I had upgraded Firefox, Selenium stopped working.

The second problem

Documentation is often either wrong or incomplete. I found that guidance about setting up a Marionette server was REALLY unhelpful, unclear and often incorrect. (Marionette is the automation driver built into Firefox since v48; Selenium now relies on Geckodriver to talk to it.) I just wanted a script I could copy, paste and run. My boss at the time said, "That's normal in IT." ARGH! Documentation goes out of date really quickly, and if you're the only QA and no-one around you has ever done automation, who do you call on? I still have this issue. And what is that Marionette stuff anyway? I didn't know then and I still don't, and do I need to? I am still unconvinced. I read some really confusing articles about Marionette and struggled for weeks to get anything to work. Frustrating.

The third problem

The version of Geckodriver (the interface between Firefox and Selenium) I used had a massive breakage in it: You couldn't send a click action. It's been fixed subsequently but at that time it was horrible.

The solution

I cobbled together something that worked, and I pass this knowledge to you now.

  • To use Chrome you need a file called Chromedriver installed on your machine, so that Selenium can talk to Chrome.
  • To use Firefox you need a file called Geckodriver present on your machine.

Let's assume you'll be using Python on a PC and get started with some basic setup. I'll also assume you haven't got any requirements installed. The process is similar for a Mac, so these steps are mostly transferable. I am assuming you are using Firefox for now.

  1. Install a version of Python 3.x from https://www.python.org/downloads/. It must be Python 3.x, as current versions of Selenium no longer work with Python 2.x. When prompted, I put it in a folder called C:\Python3 (you will need to edit the default path that is offered to this value). When prompted to add Python to the system PATH, I said Yes. That means Windows knows where to find Python, whichever folder you run it from.
  2. Choose how you want to edit your scripts. You can use the Python GUI (called "IDLE"). I use SublimeText 3.
  3. In order to access the scripts independently of any specific user on the PC, I put them in a new folder off the root of the drive. In this case I created C:\selenium.
  4. Go to https://github.com/mozilla/geckodriver/releases and download the geckodriver 64-bit file (assuming you're using a 64-bit machine and 64-bit version of Python). It's a zip file. Unzip it. Then copy the geckodriver.exe file to C:\Python3.
  5. Go to http://chromedriver.chromium.org/downloads and download the chromedriver file. It's a zip file. Unzip it. Then copy the chromedriver.exe file to C:\Python3.
  6. In Windows start a CMD window (press the Windows key + R, then type cmd and press Enter).
  7. Change the current folder to C:\selenium (using cd c:\selenium).
  8. Type pip install -U selenium then press Enter. This installs the Selenium library for Python. The -U parameter installs it if you don't have it yet and upgrades it to the latest version if you do, so it is handy to re-run from time to time.
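
If you want to check that steps 4 and 5 worked before going any further, a small sketch like the one below can help. It assumes the driver files are named geckodriver and chromedriver and that the folder you copied them to (C:\Python3 in my case) is on the PATH; the find_drivers helper is just a name I made up for this example. It only uses Python's standard library, so it runs even if Selenium isn't installed yet.

```python
import shutil


def find_drivers(names=("geckodriver", "chromedriver")):
    """Return a dict mapping each driver name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_drivers().items():
    # shutil.which returns None when the file cannot be found on the PATH
    print(name, "->", path or "NOT FOUND on PATH")
```

If a driver shows as not found, the usual culprits are copying the file to a folder that isn't on the PATH, or not having opened a new CMD window since changing the PATH.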

Now we're ready to go with some basic Selenium scripts which will start Firefox and open Google in a browser window. Enter the following in your Python editor. Any line starting with a # sign is a comment, so it won't be executed.

This is all you need to get started:

# call basic libraries for Selenium
from selenium import webdriver

# always useful to have this but we’re not using it
#from selenium.webdriver.common.by import By

# allows us to send special keys such as Enter to the browser
from selenium.webdriver.common.keys import Keys

# set the "driver" variable to be our instance of Selenium.
# Everything from now on will reference "driver".
# Recent Selenium versions use Geckodriver (and Marionette) for Firefox
# by default, so no extra Firefox parameters are needed.
driver = webdriver.Firefox()

Everything above here can generally stay untouched and you'll reuse it in every script you write, if you just want to interact with a basic site and click around forms and links. I'm not using more complicated stuff like PageObjects (because I never have done and that's next on my to-do list), but for basic getting going we're good to go.

Now the fun stuff. Add the following to the script so that it opens a web page in the browser window:

# open the URL of the page here
driver.get("https://www.google.co.uk")

Let's give it a try. If you are using IDLE press F5 to run what you have typed above. For SublimeText, press Ctrl+B to run it.

Hopefully the script won't error and a browser window will pop open.

You'll notice that the browser window may not fill the screen. Let's maximise it by adding the following line to the script and then running it.

# maximise the browser window
driver.maximize_window()

Now we can search for something. Let's tell Selenium to put the cursor in the search field on the form and enter some text.

# "name" tells Selenium to look the element up by its name attribute
browser = driver.find_element("name", "q")
browser.send_keys("wonderproxy")

Great! When you run the script it will fill in the form field. It will leave the Firefox window open when it finishes. If you're happy with the results, you can close the Firefox window now.

Next let's edit the previous line and submit the form by pressing Enter.

browser = driver.find_element("name", "q")
browser.send_keys("wonderproxy" + Keys.ENTER)

There are a number of things that are tricky up there so let's highlight them (I can't explain them but that's okay, we're Script Kiddying here).

  1. Keys has a capital letter because it is the name of a class.
  2. ENTER is capitalised because it is a constant defined on Keys. Constants in Python are conventionally written in capitals.
  3. What is "q"? It's the value of the "name" attribute specified in the HTML for the search field. If you right click on the search field on google.co.uk and select Inspect Element, you'll see this, and we want to interact with it, so we choose something there to attach an action to. The element has several attributes we could choose from, and some of them are easier to use in Selenium than others.
  4. We used a variable browser so that the element we found can be used a second time, for the send_keys action. If we don't have this, then the second action doesn't know which element we are using.

You'll see that name="q" is in there somewhere so we will be using that for now.
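
To make the idea of a "name" attribute a bit more concrete, here is a small sketch that doesn't need a browser at all. The HTML is made up for this example (it is not Google's real markup), and InputNameFinder is just a name I chose. It uses Python's built-in HTML parser to list the name of every input field, which is the same value you would pass to Selenium when finding an element by name.

```python
from html.parser import HTMLParser


class InputNameFinder(HTMLParser):
    """Collect the name attribute of every <input> and <textarea> tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "textarea"):
            name = dict(attrs).get("name")
            if name:
                self.names.append(name)


# Made-up HTML, loosely shaped like a search form.
sample_html = """
<form action="/search">
  <input class="search-box" name="q" type="text" title="Search">
  <input name="btnK" type="submit" value="Search">
</form>
"""

finder = InputNameFinder()
finder.feed(sample_html)
print(finder.names)  # ['q', 'btnK']
```

The same element also has class, type and title attributes, which is what I mean by having several choices to attach an action to.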

Next let’s refactor the two lines that find the search field and send the keys into a single line, since Selenium lets us chain commands. Since we are doing it in one command we can drop the browser variable we used above.

driver.find_element("name", "q").send_keys("wonderproxy" + Keys.ENTER)

Next we will see what happens if we put the text in the Google search field and want to ignore the Google search suggestions. If we were doing this by hand we could press Escape to dismiss the search suggestions. Selenium lets us do that too.

driver.find_element("name", "q").send_keys("wonderproxy" + Keys.ESCAPE)

Why did we not want to use the Google search suggestions? Google offers contextual search terms in a dropdown menu and we don't know what they will offer or the order in which they will be offered. The dropdown also obscures the Search button. We can skip all of that by dismissing the dropdown and then submitting the text we entered ourselves.

Now let us find the Search button and click it using .click()

driver.find_element("name", "btnK").click()

Boom! We have a script that can navigate pages and click on buttons.

Let's click on the first result on the results page.

Warning: Each result has the same class name, so there are many instances of that class on the page, one for each link. That doesn't really matter here, because Selenium returns the first match it finds and the first result is the one we want to click.

driver.find_element("class name", "LC20lb").click()

There are ways of selecting other links on the page so fear not. Small steps...

Lastly, we need to close our browser session. Add the following to the end of your script:

driver.close()

If you run the script now with driver.close() it will execute quickly and the browser will disappear before you have had a chance to see anything, but if it closed, it probably also succeeded. If it didn't succeed, Python prints a very verbose (and not that helpful) error message. Hopefully that isn't an issue for now if the code above worked.

Save the file with a name ending .py in C:\selenium, e.g. firstscript.py and now you can execute it and mess about with it in your chosen editor package.

So that's how I got started in Selenium: I used the script to do the following, all of which are available with the commands above:

  • Open a session to the website I needed,
  • log me in by entering my credentials in two fields and submitting the form,
  • navigate to the relevant page,
  • enter search terms in a form field,
  • submit the search,
  • on the results page click a button,
  • wait for the page to refresh.

I put everything that happens after logging in inside a while loop which would run until nothing was left in the result set. This worked to clean up the many thousands of contacts I had injected into my online radio log. It removed everything including the one good match I wanted, but I knew which row it was so I could reload it later. The script ran overnight, whereas it would have taken me many days to do manually, so I probably wouldn't have done it. It was the perfect repeatable action to automate. As we say here in the UK, "I was well chuffed" (which means, "Very pleased with myself").
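
The shape of that while loop can be sketched without a browser. This is not my original script: FakeLog is a made-up, in-memory stand-in for the website, and the callsigns are invented. In the real thing, search() and delete() would be the Selenium steps shown earlier (fill in the search field, submit, click the Delete button).

```python
class FakeLog:
    """In-memory stand-in for the online log, so the loop can run without a browser."""

    def __init__(self, rows):
        self.rows = list(rows)

    def search(self, callsign):
        # Real script: type the search term into the form and submit it.
        return [row for row in self.rows if callsign in row]

    def delete(self, row):
        # Real script: click the Delete button next to the result.
        self.rows.remove(row)


def delete_all_matches(log, callsign):
    """Search, delete the first result, and repeat until the search comes back empty."""
    deleted = 0
    while True:
        results = log.search(callsign)
        if not results:
            break  # nothing left in the result set, so we are done
        log.delete(results[0])
        deleted += 1
    return deleted


# Invented callsigns and timestamps, purely for illustration.
log = FakeLog(["XX1XX 00:00", "XX1XX 00:05", "XX1XX 00:10", "ZZ9ZZ 12:00"])
print(delete_all_matches(log, "XX1XX"))  # 3
print(log.rows)                          # ['ZZ9ZZ 12:00']
```

Running the search again on every pass is deliberate: the results page changes after each delete, so the loop always works from a fresh result set.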

Picture or it didn’t happen... Tada!

radio contact record

Thanks for reading and good luck!