
Structuring your test code in Playwright and Python

Matthew Heusser Mar 30, 2022 Automation

Deciding on a testing toolstack is a major decision with long-term impact on software quality and delivery speed. The proof of concept is easy enough; you can follow the directions to run the first test in Playwright with Python, then to add localization. The question is: Then what? How big should the test files be? Where should they be stored in version control? How do you perform setup?

Today we'll explore how to answer these questions, in light of the strengths and weaknesses of Playwright and Python.

Python Test Structure

Traditional user interface testing is often about the user journey. A test might be a scenario that takes the user from creating an account to order complete, end-to-end. This has the most value for a human, who can notice problems, file a bug, then work around the issue and keep testing. A computer using this strategy will either error out or keep trying to click links that are no longer there, on the wrong screen. The most common counter-strategy is to make each test a small, isolated, independent check of one piece of functionality. The software might have a one-minute test of login, a one-minute test of search, a one-minute test of editing user details, and so on. These tests should be independent, so they can run in any order, or in parallel. The good news is that Playwright and Python combine to create a way of working that enables parallel, independent test scripts.

The most popular Python unit test tool is "pytest", which we use in our examples. Pytest runs through the current directory, as well as all sub-directories, looking for files that begin with test_ and end in .py. In each matching file, pytest executes the functions that begin with test_. That means you can put your tests in the same directory as the main program, along with "helper" modules, and only the tests will be executed by pytest. Keeping the code next to its tests, with identical file names except for the prepended test_, can aid in debugging and fixing. Then again, those test scripts may need shared libraries, and a large project can get confusing when test code and production code mix. You'll need to balance the convenience of side-by-side tests against a cleaner separation in your codebase.
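For instance, a minimal, self-contained test file might look like the sketch below. The URL, selectors, and expected message are illustrative, not from a real project; the page fixture comes from the pytest-playwright plugin discussed later.

# test_login.py -- picked up automatically because the file name and the
# function name start with "test_". URL and selectors are hypothetical.
from playwright.sync_api import Page


def test_login_rejects_blank_password(page: Page):
    # "page" is the browser page fixture provided by pytest-playwright
    page.goto("https://example.com/login")
    page.fill("#username", "demo_user")
    page.fill("#password", "")
    page.click("text=Log in")
    assert "required" in page.locator(".error").inner_text()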

Modern Test Construction

Tests tend to do very similar things, over and over again. A login test, for example, will need to "type" into the username and password fields, submit the login form, and check the output. That doesn't happen just once; it happens for a half-dozen invalid conditions such as each field left blank, a bad username, a bad password, special characters, and so on. Typing in all those details tends to make tests long and hard to debug.

One pattern to resolve this is to create a "login" helper method that takes a username and password. The method either drives the browser to the correct page, where the test can make its assertions, or takes the expected text as input and returns true/false. Once this method exists, your tests can simply call it. Page Objects take this idea to the next level, making every logical "web page" an object. With Page Objects, the workflows on each page, such as search, tag, or check out, all become functions. When the application changes and a test breaks inside one of these functions, the fix only has to be made in one place. For our login example, if an "account type" input is added, the tester might add a variable for account type and a default when no account type is selected.
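As a rough sketch of that first pattern, a login helper could look something like this; the selectors and the "success means we landed on the dashboard" check are assumptions for illustration, not part of the sample project.

# login_helpers.py -- a hypothetical shared helper module
from playwright.sync_api import Page


def login(page: Page, username: str, password: str) -> bool:
    """Drive the login form and report whether the login appeared to succeed."""
    page.goto("https://example.com/login")
    page.fill("#username", username)
    page.fill("#password", password)
    page.click("text=Log in")
    # Treat landing on the dashboard as success, anything else as failure.
    return page.url.endswith("/dashboard")

A test for a bad password then collapses to a single line: assert not login(page, "demo_user", "wrong-password").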

Using Page Objects splits the code into the model of the workflow versus the test itself, and can result in a test that is readable by a non-technical person. Here's a sample code snippet of what a client test looks like using the Page Object approach; the full code is on GitHub.

# Arrange: instantiate Page Model
google_search = GoogleSearch(browser, location['server'])

# Act: trigger action:
google_search.visit()

# Assert values match current location:
assert google_search.search_button_text == location['searchButtonText']
assert google_search.lucky_button_text == location['luckyButtonText']
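Behind that test, the GoogleSearch class might be shaped roughly like the sketch below. The real implementation lives in the GitHub repository; the constructor signature and selectors here are assumptions that show the general shape, not the actual code.

# google_search.py -- a sketch of a Page Object, not the repository's code
class GoogleSearch:
    """Page Object wrapping the Google home page for one proxy server."""

    def __init__(self, browser, server):
        self.page = browser.new_page()
        self.server = server

    def visit(self):
        self.page.goto(f"https://{self.server}/")

    @property
    def search_button_text(self):
        # Selector is illustrative: read the label on the "Google Search" button
        return self.page.get_attribute("input[name='btnK']", "value")

    @property
    def lucky_button_text(self):
        return self.page.get_attribute("input[name='btnI']", "value")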

Browsers to Test Against

Since Playwright was created and is supported by Microsoft, it naturally supports the Edge browser, along with Safari, Firefox, and Chrome. Which browsers to test against is a business decision. That decision can change over time, and the organization might wish to triage testing and features by browser. For example, the team might have a set of "smoke" tests that cover essential user flows for Edge, while the full test suite runs in Chrome and Firefox. To do this we'll want to be able to run suites of tests by browser.

The pytest plugin for Playwright, which is creatively named pytest-playwright, allows the user to select browsers from the command line – even multiple browsers. The command:

pytest --browser chromium --browser webkit

will actually run all the tests twice: once against Chromium, and once against WebKit (Safari). See the plugin documentation for more details.

Another option is to keep all the tests but encode in each test's name which browsers it should (or should not) run under. Pytest's -k command-line option makes it possible to run only the tests whose names match a pattern.
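For example, if WebKit-specific cases carry "webkit_only" in their names (a naming convention of our own, not a Playwright requirement), a command like this runs everything else against Firefox:

pytest --browser firefox -k "not webkit_only"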

Running Tests in Parallel

By default, pytest runs tests one at a time, which can become a performance bottleneck if you have a lot of tests. The pytest-parallel and pytest-xdist plugins allow the programmer to set the number of workers from the command line. You can also limit the number of workers to one if the tests depend on running in a specific order or "step" on each other's data; in general, that will cause the run time to increase. A better option is to build tests so they can run independently – we'll tell you how in a minute. Before that, though, consider running the tests in a perfectly repeatable environment, such as a Sauce Labs Docker container with their TestRunner Toolkit. TestRunner supports Python and takes a video of the test, correlating commands to timing, for easy playback and debugging.
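With pytest-xdist installed, for example, the worker count is a single command-line flag:

pytest -n 4      # split the run across four worker processes
pytest -n 1      # effectively serial, for tests that must run in order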

Making tests independent will depend on your software. eCommerce accounts usually cannot see each other, and social media accounts only see each other if configured to. If every test creates a new account, that account will be "empty" and can then be seeded with data. While you might have a single test that creates an account through the browser, redoing that work for every test through the browser will be slow and painful. Instead, we suggest a command-line or API tool to create test users. The username could be a combination of the test name and a date stamp down to the millisecond; that will ensure the account names are unique.
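A minimal sketch of that naming scheme might look like this; how the resulting username is fed to your account-creation tool is up to you.

from datetime import datetime


def unique_username(test_name: str) -> str:
    """Combine the test name with a timestamp down to the millisecond,
    e.g. test_search_20220330143015123."""
    stamp = datetime.now().strftime("%Y%m%d%H%M%S%f")[:-3]  # trim to milliseconds
    return f"{test_name}_{stamp}"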

Playwright and Continuous Integration

Pytest is a classic command-line tool that returns an exit code of 0 for success and positive numbers for various kinds of failure. That means it will work with almost any continuous integration tool that looks at exit codes for pass/fail, such as Jenkins, CircleCI, Travis CI, and so on. Pytest can also publish the results in a machine-readable format, so the CI tool can report how many tests passed or failed, and which ones. Pytest has a verbose mode, -v, which can be helpful for metrics and for quickly figuring out what failed and why.
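For example, most CI servers can ingest JUnit-style XML, which pytest produces out of the box; the output path here is just an example:

pytest -v --junitxml=results/pytest-results.xml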

Password Storage

Our examples store passwords in environment variables, which keeps them out of version control. Larger customers might want to keep passwords out of even employees' hands, lest they leave the company and take the logins with them. Larger enterprises will have that problem in more than one area, and generally have a secrets storage policy built around a tool such as AWS Secrets Manager.
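Reading the credential from the environment is a one-liner; the variable name below is a placeholder, not one our sample project defines.

import os

# Fail fast with a clear message if the variable was never exported.
TEST_PASSWORD = os.environ.get("TEST_ACCOUNT_PASSWORD")
if TEST_PASSWORD is None:
    raise RuntimeError("Set TEST_ACCOUNT_PASSWORD before running the suite")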

A Word about Localization

WonderProxy allows you to change the location your browser's requests come from. Select "Venice", and your website under test will receive requests from Venice, Italy. That makes localization testing easy – the website should show up in Italian.

Which brings up the question: Once you can test in Italian, how should you test?

The heaviest option is to create a configuration file with a mapping: for each symbol, a key/value pair of language and translation. The tests contain the symbol and look up the expected translation by language. That approach overlaps significantly with the translation effort itself. A lighter approach is to check only a few key translations, but test under every key locale. If you use locator strings instead of text, then testing in every location becomes as easy as adding a for loop, possibly storing the current location-to-use in an environment variable. Another approach is the style of the code example above, where the expected values are stored in a lookup array.
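In that lookup style, the per-location expectations and the loop over locations might look roughly like this. The structure mirrors the earlier snippet, but the server names and translated strings below are illustrative, not taken from the sample project.

# locations.py -- illustrative lookup of expectations per proxy location
locations = {
    'venice': {
        'server': 'venice.example-proxy.com',
        'searchButtonText': 'Cerca con Google',
        'luckyButtonText': 'Mi sento fortunato',
    },
    'frankfurt': {
        'server': 'frankfurt.example-proxy.com',
        'searchButtonText': 'Google Suche',
        'luckyButtonText': 'Auf gut Glück!',
    },
}


def test_buttons_are_localized(browser):
    # Loop over every configured location with the same Page Object
    for name, location in locations.items():
        google_search = GoogleSearch(browser, location['server'])
        google_search.visit()
        assert google_search.search_button_text == location['searchButtonText']
        assert google_search.lucky_button_text == location['luckyButtonText']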

Putting it Together

When moving from proof of concept to test strategy, consider making the tests small and independent. Also think about which elements should be isolated from the tests themselves. Another term for this paradigm is "cross-cutting concerns": the parameters that you might pass in to every test-suite run, such as which browser to use, what location to source requests from, which test URLs to hit, and so on. If you can supply these as command-line arguments or environment variables, then you can slice, dice, and execute tests independently, leading to a more scalable and maintainable architecture.
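One way to wire that up is a conftest.py fixture that reads the cross-cutting values once and hands them to every test; the variable names and defaults below are placeholders.

# conftest.py -- picked up automatically by pytest in this directory
import os

import pytest


@pytest.fixture(scope="session")
def settings():
    """Cross-cutting run settings, taken from the environment with safe defaults."""
    return {
        "base_url": os.environ.get("TEST_BASE_URL", "https://example.com"),
        "location": os.environ.get("TEST_LOCATION", "venice"),
    }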

Matthew Heusser

The managing director of Excelon Development, Matt Heusser writes and consults on software delivery with a focus on quality.