Test Practices
Some guidelines and recommendations on testing from the Selenium project.
A note on “Best Practices”: We’ve intentionally avoided the phrase “Best
Practices” in this documentation. No one approach works for all situations.
We prefer the idea of “Guidelines and Recommendations.” We encourage
you to read through these and thoughtfully decide what approaches
will work for you in your particular environment.
Functional testing is challenging to get right for many reasons.
As if application state, complexity, and dependencies do not make testing difficult enough,
dealing with browsers (especially with cross-browser incompatibilities)
makes writing good tests a challenge.
Selenium provides tools to make functional user interaction easier,
but does not help you write well-architected test suites.
In this chapter, we offer advice, guidelines, and recommendations
on how to approach functional web page automation.
This chapter records software design patterns popular
amongst many of the users of Selenium
that have proven successful over the years.
1 - Design patterns and development strategies
(previously located: https://github.com/SeleniumHQ/selenium/wiki/Bot-Style-Tests)
Overview
Over time, projects tend to accumulate large numbers of tests. As the total number of tests increases,
it becomes harder to make changes to the codebase — a single “simple” change
may cause numerous tests to fail, even though the application still works properly.
Sometimes these problems are unavoidable, but when they do occur you want to be up
and running again as quickly as possible. The following design patterns and strategies
have been used before with WebDriver to help make tests easier to write and maintain.
They may help you too.
DomainDrivenDesign: Express your tests in the language of the end-user of the app.
PageObjects: A simple abstraction of the UI of your web app.
LoadableComponent: Modeling PageObjects as components.
BotStyleTests: Using a command-based approach to automating tests, rather than the object-based approach that PageObjects encourage.
Loadable Component
What Is It?
The LoadableComponent is a base class that aims to make writing PageObjects less painful.
It does this by providing a standard way of ensuring that pages are loaded and providing
hooks to make debugging the failure of a page to load easier. You can use it to help
reduce the amount of boilerplate code in your tests, which in turn makes maintaining
your tests less tiresome.
There is currently an implementation in Java that ships as part of Selenium 2, but the approach used is simple enough to be implemented in any language.
Simple Usage
As an example of a UI that we’d like to model, take a look at
the new issue page. From the point of view of a test author,
this offers the service of being able to file a new issue. A basic Page Object would look like:
package com.example.webdriver;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

public class EditIssue {

  private final WebDriver driver;

  public EditIssue(WebDriver driver) {
    this.driver = driver;
  }

  public void setSummary(String summary) {
    WebElement field = driver.findElement(By.name("summary"));
    clearAndType(field, summary);
  }

  public void enterDescription(String description) {
    WebElement field = driver.findElement(By.name("comment"));
    clearAndType(field, description);
  }

  public IssueList submit() {
    driver.findElement(By.id("submit")).click();
    return new IssueList(driver);
  }

  private void clearAndType(WebElement field, String text) {
    field.clear();
    field.sendKeys(text);
  }
}
In order to turn this into a LoadableComponent, all we need to do is to set that as the base type:
public class EditIssue extends LoadableComponent<EditIssue> {
  // rest of class ignored for now
}
This signature looks a little unusual, but all it means is that this class
represents a LoadableComponent that loads the EditIssue page.
By extending this base class, we need to implement two new methods:
@Override
protected void load() {
  driver.get("https://github.com/SeleniumHQ/selenium/issues/new");
}

@Override
protected void isLoaded() throws Error {
  String url = driver.getCurrentUrl();
  assertTrue("Not on the issue entry page: " + url, url.endsWith("/new"));
}
The load method is used to navigate to the page, whilst the isLoaded method is used to
determine whether we are on the right page. Although the method looks like it should return
a boolean, instead it performs a series of assertions using JUnit’s Assert class.
There can be as few or as many assertions as you like. By using these assertions
it’s possible to give users of the class clear information that can be used to debug tests.
With a little rework, our PageObject looks like:
package com.example.webdriver;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.FindBy;
import org.openqa.selenium.support.PageFactory;
import org.openqa.selenium.support.ui.LoadableComponent;

import static org.junit.Assert.assertTrue;

public class EditIssue extends LoadableComponent<EditIssue> {

  private final WebDriver driver;

  // By default the PageFactory will locate elements with the same name or id
  // as the field. Since the summary element has a name attribute of "summary"
  // we don't need any additional annotations.
  private WebElement summary;

  // Same with the submit element, which has the ID "submit"
  private WebElement submit;

  // But we'd prefer a different name in our code than "comment", so we use the
  // FindBy annotation to tell the PageFactory how to locate the element.
  @FindBy(name = "comment") private WebElement description;

  public EditIssue(WebDriver driver) {
    this.driver = driver;

    // This call sets the WebElement fields.
    PageFactory.initElements(driver, this);
  }

  @Override
  protected void load() {
    driver.get("https://github.com/SeleniumHQ/selenium/issues/new");
  }

  @Override
  protected void isLoaded() throws Error {
    String url = driver.getCurrentUrl();
    assertTrue("Not on the issue entry page: " + url, url.endsWith("/new"));
  }

  public void setSummary(String issueSummary) {
    clearAndType(summary, issueSummary);
  }

  public void enterDescription(String issueDescription) {
    clearAndType(description, issueDescription);
  }

  public IssueList submit() {
    submit.click();
    return new IssueList(driver);
  }

  private void clearAndType(WebElement field, String text) {
    field.clear();
    field.sendKeys(text);
  }
}
That doesn’t seem to have bought us much, right? One thing it has done is encapsulate
the information about how to navigate to the page into the page itself, meaning that
this information’s not scattered through the code base. It also means that we can do this in our tests:
EditIssue page = new EditIssue(driver).get();
This call will cause the driver to navigate to the page if that’s necessary.
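To make the "if that's necessary" part concrete, here is a simplified sketch of what a LoadableComponent-style base class does when get() is called. It is a stand-in written for illustration, not the source that ships with Selenium: the names GetDemo, SketchComponent and FakePage, and the loadCount field, are all invented, and no browser is involved.

```java
class GetDemo {

  // Simplified stand-in for a LoadableComponent-style base class (illustrative only).
  static abstract class SketchComponent<T extends SketchComponent<T>> {

    // Check first; load only if the check fails, then check again.
    @SuppressWarnings("unchecked")
    public T get() {
      try {
        isLoaded();
        return (T) this;
      } catch (Error e) {
        load();
      }
      // Re-check so that a failed load surfaces as an assertion error.
      isLoaded();
      return (T) this;
    }

    protected abstract void load();

    protected abstract void isLoaded() throws Error;
  }

  // Hypothetical component that counts how often it had to be loaded.
  static class FakePage extends SketchComponent<FakePage> {
    boolean onPage = false;
    int loadCount = 0;

    @Override
    protected void load() {
      onPage = true; // a real page object would call driver.get(...) here
      loadCount++;
    }

    @Override
    protected void isLoaded() throws Error {
      if (!onPage) {
        throw new AssertionError("Not on the expected page");
      }
    }
  }

  public static void main(String[] args) {
    FakePage page = new FakePage().get(); // not loaded yet, so load() runs
    page.get();                           // already loaded, so load() is skipped
    System.out.println("load() calls: " + page.loadCount); // prints "load() calls: 1"
  }
}
```

Because isLoaded() signals failure by throwing rather than returning false, the same assertion message that triggers a load on the first check becomes the test failure message if the page still is not right after loading.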
Nested Components
LoadableComponents start to become more useful when they are used in conjunction
with other LoadableComponents. Using our example, we could view the “edit issue”
page as a component within a project’s website (after all, we access it via a tab
on that site). You also need to be logged in to file an issue. We could model this
as a tree of nested components:
+ ProjectPage
+---+ SecuredPage
+---+ EditIssue
What would this look like in code? For a start, each logical component would
have its own class. The “load” method in each of them would “get” the parent.
The end result, in addition to the EditIssue class above, is:
ProjectPage.java:
package com.example.webdriver;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.support.ui.LoadableComponent;

import static org.junit.Assert.assertTrue;

public class ProjectPage extends LoadableComponent<ProjectPage> {

  private final WebDriver driver;
  private final String projectName;

  public ProjectPage(WebDriver driver, String projectName) {
    this.driver = driver;
    this.projectName = projectName;
  }

  @Override
  protected void load() {
    driver.get("http://" + projectName + ".googlecode.com/");
  }

  @Override
  protected void isLoaded() throws Error {
    String url = driver.getCurrentUrl();
    assertTrue(url.contains(projectName));
  }
}
and SecuredPage.java:
package com.example.webdriver;

import org.openqa.selenium.By;
import org.openqa.selenium.NoSuchElementException;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.LoadableComponent;

import static org.junit.Assert.fail;

public class SecuredPage extends LoadableComponent<SecuredPage> {

  private final WebDriver driver;
  private final LoadableComponent<?> parent;
  private final String username;
  private final String password;

  public SecuredPage(WebDriver driver, LoadableComponent<?> parent, String username, String password) {
    this.driver = driver;
    this.parent = parent;
    this.username = username;
    this.password = password;
  }

  @Override
  protected void load() {
    parent.get();

    String originalUrl = driver.getCurrentUrl();

    // Sign in
    driver.get("https://www.google.com/accounts/ServiceLogin?service=code");
    driver.findElement(By.name("Email")).sendKeys(username);
    WebElement passwordField = driver.findElement(By.name("Passwd"));
    passwordField.sendKeys(password);
    passwordField.submit();

    // Now return to the original URL
    driver.get(originalUrl);
  }

  @Override
  protected void isLoaded() throws Error {
    // If you're signed in, you have the option of picking a different login.
    // Let's check for the presence of that.
    try {
      WebElement div = driver.findElement(By.id("multilogin-dropdown"));
    } catch (NoSuchElementException e) {
      fail("Cannot locate user name link");
    }
  }
}
The “load” method in EditIssue now looks like this (EditIssue also needs a securedPage field, set through its constructor):
@Override
protected void load() {
  securedPage.get();
  driver.get("https://github.com/SeleniumHQ/selenium/issues/new");
}
This shows that the components are all “nested” within each other.
A call to get() in EditIssue will cause all its dependencies to
load too. The example usage:
package com.example.webdriver;

import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class FooTest {
  private EditIssue editIssue;

  @Before
  public void prepareComponents() {
    WebDriver driver = new FirefoxDriver();
    ProjectPage project = new ProjectPage(driver, "selenium");
    SecuredPage securedPage = new SecuredPage(driver, project, "example", "top secret");
    editIssue = new EditIssue(driver, securedPage);
  }

  @Test
  public void demonstrateNestedLoadableComponents() {
    editIssue.get();
    editIssue.setSummary("Summary");
    editIssue.enterDescription("This is an example");
  }
}
If you’re using a library such as Guiceberry in your tests,
the preamble of setting up the PageObjects can be omitted leading to nice, clear, readable tests.
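The load order that nesting produces can be traced without a browser. The sketch below is a stand-in built on assumptions rather than Selenium code: NestedLoadDemo, Component, Node and the shared log list are hypothetical names, and each node simply records the moment its load() runs.

```java
import java.util.ArrayList;
import java.util.List;

class NestedLoadDemo {

  // Minimal stand-in for a loadable component: load only when the check fails.
  static abstract class Component {
    void get() {
      try {
        isLoaded();
      } catch (Error e) {
        load();
        isLoaded();
      }
    }

    abstract void load();

    abstract void isLoaded();
  }

  // Hypothetical component that "gets" its parent first, then logs its own load.
  static class Node extends Component {
    final String name;
    final Component parent; // null for the root component
    final List<String> log;
    boolean loaded = false;

    Node(String name, Component parent, List<String> log) {
      this.name = name;
      this.parent = parent;
      this.log = log;
    }

    @Override
    void load() {
      if (parent != null) {
        parent.get();
      }
      log.add(name);
      loaded = true;
    }

    @Override
    void isLoaded() {
      if (!loaded) {
        throw new AssertionError(name + " is not loaded");
      }
    }
  }

  // Builds the same three-level tree as the example and loads the leaf.
  static List<String> loadOrder() {
    List<String> log = new ArrayList<>();
    Node project = new Node("ProjectPage", null, log);
    Node secured = new Node("SecuredPage", project, log);
    Node editIssue = new Node("EditIssue", secured, log);
    editIssue.get();
    return log;
  }

  public static void main(String[] args) {
    System.out.println(loadOrder()); // prints "[ProjectPage, SecuredPage, EditIssue]"
  }
}
```

Only the leaf is asked to load; each parent is pulled in by its child, which is why the test body never has to mention ProjectPage or SecuredPage.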
Bot Pattern
(previously located: https://github.com/SeleniumHQ/selenium/wiki/Bot-Style-Tests)
Although PageObjects are a useful way of reducing duplication in your tests,
it’s not always a pattern that teams feel comfortable following.
An alternative approach is to follow a more “command-like” style of testing.
A “bot” is an action-oriented abstraction over the raw Selenium APIs.
This means that if you find that commands aren’t doing the Right Thing
for your app, it’s easy to change them. As an example:
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

public class ActionBot {
  private final WebDriver driver;

  public ActionBot(WebDriver driver) {
    this.driver = driver;
  }

  public void click(By locator) {
    driver.findElement(locator).click();
  }

  public void submit(By locator) {
    driver.findElement(locator).submit();
  }

  /**
   * Type something into an input field. WebDriver doesn't normally clear these
   * before typing, so this method does that first. It also sends a return key
   * to move the focus out of the element.
   */
  public void type(By locator, String text) {
    WebElement element = driver.findElement(locator);
    element.clear();
    element.sendKeys(text + "\n");
  }
}
Once these abstractions have been built and duplication in your tests identified, it’s possible to layer PageObjects on top of bots.
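One way that layering might look is sketched below. To keep the example runnable without a browser, the bot is reduced to a hypothetical Bot interface that takes plain string locators (real code would pass By locators to an ActionBot), and RecordingBot, LoginPage and the locator strings are likewise invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class BotLayeringDemo {

  // Hypothetical command-style abstraction; a real bot would wrap WebDriver.
  interface Bot {
    void type(String locator, String text);

    void click(String locator);
  }

  // Test double that records commands instead of driving a browser.
  static class RecordingBot implements Bot {
    final List<String> commands = new ArrayList<>();

    @Override
    public void type(String locator, String text) {
      commands.add("type " + locator + " " + text);
    }

    @Override
    public void click(String locator) {
      commands.add("click " + locator);
    }
  }

  // Page object layered on the bot: it owns the locators, the bot owns the mechanics.
  static class LoginPage {
    private final Bot bot;

    LoginPage(Bot bot) {
      this.bot = bot;
    }

    void loginAs(String username, String password) {
      bot.type("name=username", username);
      bot.type("name=password", password);
      bot.click("id=submit");
    }
  }

  public static void main(String[] args) {
    RecordingBot bot = new RecordingBot();
    new LoginPage(bot).loginAs("example", "top secret");
    bot.commands.forEach(System.out::println);
  }
}
```

With this split, a change in how typing or clicking should behave is made once in the bot, while a change in page layout is made once in the page object.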
2 - Overview of Test Automation
First, start by asking yourself whether or not you really need to use a browser.
Odds are that, at some point, if you are working on a complex web application,
you will need to open a browser and actually test it.
Functional end-user tests such as Selenium tests are expensive to run, however.
Furthermore, they typically require substantial infrastructure
to be in place to be run effectively.
It is a good rule to always ask yourself if what you want to test
can be done using more lightweight test approaches such as unit tests
or with a lower-level approach.
Once you have made the determination that you are in the web browser testing business,
and you have your Selenium environment ready to begin writing tests,
you will generally perform some combination of three steps:
- Set up the data
- Perform a discrete set of actions
- Evaluate the results
You will want to keep these steps as short as possible;
one or two operations should be enough most of the time.
Browser automation has the reputation of being “flaky”,
but in reality, that is because users frequently demand too much of it.
In later chapters, we will return to techniques you can use
to mitigate apparent intermittent problems in tests,
in particular on how to overcome race conditions
between the browser and WebDriver.
By keeping your tests short
and using the web browser only when you have absolutely no alternative,
you can have many tests with minimal flake.
A distinct advantage of Selenium tests
is their inherent ability to test all components of the application,
from backend to frontend, from a user’s perspective.
So in other words, whilst functional tests may be expensive to run,
they also encompass large business-critical portions at one time.
Testing requirements
As mentioned before, Selenium tests can be expensive to run.
To what extent depends on the browser you are running the tests against,
but historically browsers’ behaviour has varied so much that it has often
been a stated goal to cross-test against multiple browsers.
Selenium allows you to run the same instructions against multiple browsers
on multiple operating systems,
but the enumeration of all the possible browsers,
their different versions, and the many operating systems they run on
will quickly become a non-trivial undertaking.
Let’s start with an example
Larry has written a web site which allows users to order their
custom unicorns.
The general workflow (what we will call the “happy path”) is something
like this:
- Create an account
- Configure the unicorn
- Add it to the shopping cart
- Check out and pay
- Give feedback about the unicorn
It would be tempting to write one grand Selenium script
to perform all these operations–many will try.
Resist the temptation!
Doing so will result in a test that
a) takes a long time,
b) will be subject to some common issues around page rendering timing, and
c) is such that if it fails,
it will not give you a concise, “glanceable” method for diagnosing what went wrong.
The preferred strategy for testing this scenario would be
to break it down to a series of independent, speedy tests,
each of which has one “reason” to exist.
Let us pretend you want to test the second step:
Configuring your unicorn.
It will perform the following actions:
- Create an account
- Configure a unicorn
Note that we are skipping the rest of these steps;
we will test the rest of the workflow in other small, discrete test cases
after we are done with this one.
To start, you need to create an account.
Here you have some choices to make:
- Do you want to use an existing account?
- Do you want to create a new account?
- Are there any special properties of such a user that need to be
taken into account before configuration begins?
Regardless of how you answer this question,
the solution is to make it part of the “set up the data” portion of the test.
If Larry has exposed an API that enables you (or anyone)
to create and update user accounts,
be sure to use that to answer this question.
If possible, you want to launch the browser only after you have a user “in hand”,
whose credentials you can just log in with.
If each test for each workflow begins with the creation of a user account,
many seconds will be added to the execution of each test.
Calling an API and talking to a database are quick,
“headless” operations that don’t require the expensive process of
opening a browser, navigating to the right pages,
clicking and waiting for the forms to be submitted, etc.
Ideally, you can address this set-up phase in one line of code,
which will execute before any browser is launched:
Java:
// Create a user who has read-only permissions--they can configure a unicorn,
// but they do not have payment information set up, nor do they have
// administrative privileges. At the time the user is created, its email
// address and password are randomly generated--you don't even need to
// know them.
User user = UserFactory.createCommonUser(); //This method is defined elsewhere.
// Log in as this user.
// Logging in on this site takes you to your personal "My Account" page, so the
// AccountPage object is returned by the loginAs method, allowing you to then
// perform actions from the AccountPage.
AccountPage accountPage = loginAs(user.getEmail(), user.getPassword());
Python:
# Create a user who has read-only permissions--they can configure a unicorn,
# but they do not have payment information set up, nor do they have
# administrative privileges. At the time the user is created, its email
# address and password are randomly generated--you don't even need to
# know them.
user = user_factory.create_common_user() #This method is defined elsewhere.
# Log in as this user.
# Logging in on this site takes you to your personal "My Account" page, so the
# AccountPage object is returned by the loginAs method, allowing you to then
# perform actions from the AccountPage.
account_page = login_as(user.get_email(), user.get_password())
C#:
// Create a user who has read-only permissions--they can configure a unicorn,
// but they do not have payment information set up, nor do they have
// administrative privileges. At the time the user is created, its email
// address and password are randomly generated--you don't even need to
// know them.
User user = UserFactory.CreateCommonUser(); //This method is defined elsewhere.
// Log in as this user.
// Logging in on this site takes you to your personal "My Account" page, so the
// AccountPage object is returned by the loginAs method, allowing you to then
// perform actions from the AccountPage.
AccountPage accountPage = LoginAs(user.Email, user.Password);
Ruby:
# Create a user who has read-only permissions--they can configure a unicorn,
# but they do not have payment information set up, nor do they have
# administrative privileges. At the time the user is created, its email
# address and password are randomly generated--you don't even need to
# know them.
user = UserFactory.create_common_user #This method is defined elsewhere.
# Log in as this user.
# Logging in on this site takes you to your personal "My Account" page, so the
# AccountPage object is returned by the loginAs method, allowing you to then
# perform actions from the AccountPage.
account_page = login_as(user.email, user.password)
JavaScript:
// Create a user who has read-only permissions--they can configure a unicorn,
// but they do not have payment information set up, nor do they have
// administrative privileges. At the time the user is created, its email
// address and password are randomly generated--you don't even need to
// know them.
var user = userFactory.createCommonUser(); //This method is defined elsewhere.
// Log in as this user.
// Logging in on this site takes you to your personal "My Account" page, so the
// AccountPage object is returned by the loginAs method, allowing you to then
// perform actions from the AccountPage.
var accountPage = loginAs(user.email, user.password);
Kotlin:
// Create a user who has read-only permissions--they can configure a unicorn,
// but they do not have payment information set up, nor do they have
// administrative privileges. At the time the user is created, its email
// address and password are randomly generated--you don't even need to
// know them.
val user = UserFactory.createCommonUser() //This method is defined elsewhere.
// Log in as this user.
// Logging in on this site takes you to your personal "My Account" page, so the
// AccountPage object is returned by the loginAs method, allowing you to then
// perform actions from the AccountPage.
val accountPage = loginAs(user.getEmail(), user.getPassword())
As you can imagine, the UserFactory can be extended
to provide methods such as createAdminUser() and createUserWithPayment().
The point is, these two lines of code do not distract you from the ultimate purpose of this test:
configuring a unicorn.
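A factory along these lines could be as small as the sketch below. Everything in it is an assumption made for illustration: the User fields, the Role enum, and the in-memory construction, which stands in for whatever API or database call Larry's site would actually expose.

```java
import java.util.UUID;

class UserFactoryDemo {

  enum Role { COMMON, ADMIN }

  // Hypothetical value holder for what a test needs to know about a user.
  static final class User {
    private final String email;
    private final String password;
    private final Role role;
    private final boolean paymentConfigured;

    User(String email, String password, Role role, boolean paymentConfigured) {
      this.email = email;
      this.password = password;
      this.role = role;
      this.paymentConfigured = paymentConfigured;
    }

    String getEmail() { return email; }
    String getPassword() { return password; }
    Role getRole() { return role; }
    boolean hasPayment() { return paymentConfigured; }
  }

  static final class UserFactory {
    private UserFactory() {}

    static User createCommonUser() { return create(Role.COMMON, false); }

    static User createAdminUser() { return create(Role.ADMIN, false); }

    static User createUserWithPayment() { return create(Role.COMMON, true); }

    // Random credentials keep tests independent of one another.
    // A real factory would also register the user through an API or database here.
    private static User create(Role role, boolean paymentConfigured) {
      String email = "user-" + UUID.randomUUID() + "@example.com";
      String password = UUID.randomUUID().toString();
      return new User(email, password, role, paymentConfigured);
    }
  }

  public static void main(String[] args) {
    User user = UserFactory.createCommonUser();
    System.out.println(user.getEmail() + " / " + user.getRole());
  }
}
```

Keeping each variant behind a named method means a test states which kind of user it needs without repeating how that user is built.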
The intricacies of the Page Object model
will be discussed in later chapters, but we will introduce the concept here:
Your tests should be composed of actions,
performed from the user’s point of view,
within the context of pages in the site.
These pages are stored as objects,
which will contain specific information about how the web page is composed
and how actions are performed–
very little of which should concern you as a tester.
What kind of unicorn do you want?
You might want pink, but not necessarily.
Purple has been quite popular lately.
Does she need sunglasses? Star tattoos?
These choices, while difficult, are your primary concern as a tester–
you need to ensure that your order fulfillment center
sends out the right unicorn to the right person,
and that starts with these choices.
Notice that nowhere in that paragraph do we talk about buttons,
fields, drop-downs, radio buttons, or web forms.
Neither should your tests!
You want to write your code like the user trying to solve their problem.
Here is one way of doing this (continuing from the previous example):
Java:
// The Unicorn is a top-level Object--it has attributes, which are set here.
// This only stores the values; it does not fill out any web forms or interact
// with the browser in any way.
Unicorn sparkles = new Unicorn("Sparkles", UnicornColors.PURPLE, UnicornAccessories.SUNGLASSES, UnicornAdornments.STAR_TATTOOS);
// Since we are already "on" the account page, we have to use it to get to the
// actual place where you configure unicorns. Calling the "Add Unicorn" method
// takes us there.
AddUnicornPage addUnicornPage = accountPage.addUnicorn();
// Now that we're on the AddUnicornPage, we will pass the "sparkles" object to
// its createUnicorn() method. This method will take Sparkles' attributes,
// fill out the form, and click submit.
UnicornConfirmationPage unicornConfirmationPage = addUnicornPage.createUnicorn(sparkles);
Python:
# The Unicorn is a top-level Object--it has attributes, which are set here.
# This only stores the values; it does not fill out any web forms or interact
# with the browser in any way.
sparkles = Unicorn("Sparkles", UnicornColors.PURPLE, UnicornAccessories.SUNGLASSES, UnicornAdornments.STAR_TATTOOS)
# Since we're already "on" the account page, we have to use it to get to the
# actual place where you configure unicorns. Calling the "Add Unicorn" method
# takes us there.
add_unicorn_page = account_page.add_unicorn()
# Now that we're on the AddUnicornPage, we will pass the "sparkles" object to
# its createUnicorn() method. This method will take Sparkles' attributes,
# fill out the form, and click submit.
unicorn_confirmation_page = add_unicorn_page.create_unicorn(sparkles)
C#:
// The Unicorn is a top-level Object--it has attributes, which are set here.
// This only stores the values; it does not fill out any web forms or interact
// with the browser in any way.
Unicorn sparkles = new Unicorn("Sparkles", UnicornColors.Purple, UnicornAccessories.Sunglasses, UnicornAdornments.StarTattoos);
// Since we are already "on" the account page, we have to use it to get to the
// actual place where you configure unicorns. Calling the "Add Unicorn" method
// takes us there.
AddUnicornPage addUnicornPage = accountPage.AddUnicorn();
// Now that we're on the AddUnicornPage, we will pass the "sparkles" object to
// its createUnicorn() method. This method will take Sparkles' attributes,
// fill out the form, and click submit.
UnicornConfirmationPage unicornConfirmationPage = addUnicornPage.CreateUnicorn(sparkles);
Ruby:
# The Unicorn is a top-level Object--it has attributes, which are set here.
# This only stores the values; it does not fill out any web forms or interact
# with the browser in any way.
sparkles = Unicorn.new('Sparkles', UnicornColors.PURPLE, UnicornAccessories.SUNGLASSES, UnicornAdornments.STAR_TATTOOS)
# Since we're already "on" the account page, we have to use it to get to the
# actual place where you configure unicorns. Calling the "Add Unicorn" method
# takes us there.
add_unicorn_page = account_page.add_unicorn
# Now that we're on the AddUnicornPage, we will pass the "sparkles" object to
# its createUnicorn() method. This method will take Sparkles' attributes,
# fill out the form, and click submit.
unicorn_confirmation_page = add_unicorn_page.create_unicorn(sparkles)
JavaScript:
// The Unicorn is a top-level Object--it has attributes, which are set here.
// This only stores the values; it does not fill out any web forms or interact
// with the browser in any way.
var sparkles = new Unicorn("Sparkles", UnicornColors.PURPLE, UnicornAccessories.SUNGLASSES, UnicornAdornments.STAR_TATTOOS);
// Since we are already "on" the account page, we have to use it to get to the
// actual place where you configure unicorns. Calling the "Add Unicorn" method
// takes us there.
var addUnicornPage = accountPage.addUnicorn();
// Now that we're on the AddUnicornPage, we will pass the "sparkles" object to
// its createUnicorn() method. This method will take Sparkles' attributes,
// fill out the form, and click submit.
var unicornConfirmationPage = addUnicornPage.createUnicorn(sparkles);
Kotlin:
// The Unicorn is a top-level Object--it has attributes, which are set here.
// This only stores the values; it does not fill out any web forms or interact
// with the browser in any way.
val sparkles = Unicorn("Sparkles", UnicornColors.PURPLE, UnicornAccessories.SUNGLASSES, UnicornAdornments.STAR_TATTOOS)
// Since we are already "on" the account page, we have to use it to get to the
// actual place where you configure unicorns. Calling the "Add Unicorn" method
// takes us there.
val addUnicornPage = accountPage.addUnicorn()
// Now that we're on the AddUnicornPage, we will pass the "sparkles" object to
// its createUnicorn() method. This method will take Sparkles' attributes,
// fill out the form, and click submit.
val unicornConfirmationPage = addUnicornPage.createUnicorn(sparkles)
Now that you have configured your unicorn,
you need to move on to step 3: making sure it actually worked.
Java:
// The exists() method from UnicornConfirmationPage will take the Sparkles
// object--a specification of the attributes you want to see, and compare
// them with the fields on the page.
Assert.assertTrue("Sparkles should have been created, with all attributes intact", unicornConfirmationPage.exists(sparkles));
Python:
# The exists() method from UnicornConfirmationPage will take the Sparkles
# object--a specification of the attributes you want to see, and compare
# them with the fields on the page.
assert unicorn_confirmation_page.exists(sparkles), "Sparkles should have been created, with all attributes intact"
C#:
// The exists() method from UnicornConfirmationPage will take the Sparkles
// object--a specification of the attributes you want to see, and compare
// them with the fields on the page.
Assert.True(unicornConfirmationPage.Exists(sparkles), "Sparkles should have been created, with all attributes intact");
Ruby:
# The exists() method from UnicornConfirmationPage will take the Sparkles
# object--a specification of the attributes you want to see, and compare
# them with the fields on the page.
expect(unicorn_confirmation_page.exists?(sparkles)).to be, 'Sparkles should have been created, with all attributes intact'
JavaScript:
// The exists() method from UnicornConfirmationPage will take the Sparkles
// object--a specification of the attributes you want to see, and compare
// them with the fields on the page.
assert(unicornConfirmationPage.exists(sparkles), "Sparkles should have been created, with all attributes intact");
Kotlin:
// The exists() method from UnicornConfirmationPage will take the Sparkles
// object--a specification of the attributes you want to see, and compare
// them with the fields on the page.
assertTrue("Sparkles should have been created, with all attributes intact", unicornConfirmationPage.exists(sparkles))
Note that the tester still has not done anything but talk about unicorns in this code–
no buttons, no locators, no browser controls.
This method of modelling the application
allows you to keep these test-level commands in place and unchanging,
even if Larry decides next week that he no longer likes Ruby-on-Rails
and decides to re-implement the entire site
in the latest Haskell bindings with a Fortran front-end.
Your page objects will require some small maintenance in order to
conform to the site redesign,
but these tests will remain the same.
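As a rough sketch of what might sit behind those test-level calls, the Unicorn object can be a plain value holder and exists() a field-by-field comparison. The enum names mirror the example above, but the class layout, the extra enum values, and the field keys are assumed, and the Map of displayed fields is a hypothetical stand-in for text that a real page object would read from the browser.

```java
import java.util.HashMap;
import java.util.Map;

class UnicornDemo {

  enum UnicornColors { PINK, PURPLE }
  enum UnicornAccessories { NONE, SUNGLASSES }
  enum UnicornAdornments { NONE, STAR_TATTOOS }

  // Value object: stores the desired attributes and never touches the browser.
  static final class Unicorn {
    final String name;
    final UnicornColors color;
    final UnicornAccessories accessory;
    final UnicornAdornments adornment;

    Unicorn(String name, UnicornColors color, UnicornAccessories accessory, UnicornAdornments adornment) {
      this.name = name;
      this.color = color;
      this.accessory = accessory;
      this.adornment = adornment;
    }
  }

  // Hypothetical page object; the map stands in for values read from the page.
  static final class UnicornConfirmationPage {
    private final Map<String, String> displayedFields;

    UnicornConfirmationPage(Map<String, String> displayedFields) {
      this.displayedFields = displayedFields;
    }

    // True only if every attribute of the expected unicorn matches the page.
    boolean exists(Unicorn expected) {
      return expected.name.equals(displayedFields.get("name"))
          && expected.color.name().equals(displayedFields.get("color"))
          && expected.accessory.name().equals(displayedFields.get("accessory"))
          && expected.adornment.name().equals(displayedFields.get("adornment"));
    }
  }

  public static void main(String[] args) {
    Map<String, String> fields = new HashMap<>();
    fields.put("name", "Sparkles");
    fields.put("color", "PURPLE");
    fields.put("accessory", "SUNGLASSES");
    fields.put("adornment", "STAR_TATTOOS");

    Unicorn sparkles = new Unicorn("Sparkles", UnicornColors.PURPLE,
        UnicornAccessories.SUNGLASSES, UnicornAdornments.STAR_TATTOOS);
    System.out.println(new UnicornConfirmationPage(fields).exists(sparkles)); // prints "true"
  }
}
```

If the site is redesigned, only the code that fills the page object's view of the page changes; the Unicorn object and the assertion in the test stay as they are.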
Taking this basic design,
you will want to keep going through your workflows with the fewest browser-facing steps possible.
Your next workflow will involve adding a unicorn to the shopping cart.
You will probably want many iterations of this test in order to make sure the cart is keeping its state properly:
Is there more than one unicorn in the cart before you start?
How many can fit in the shopping cart?
If you create more than one with the same name and/or features, will it break?
Will it only keep the existing one or will it add another?
Each time you move through the workflow,
you want to try to avoid having to create an account,
log in as the user, and configure the unicorn.
Ideally, you will be able to create an account
and pre-configure a unicorn via the API or database.
Then all you have to do is log in as the user, locate Sparkles,
and add her to the cart.
To automate or not to automate?
Is automation always advantageous? When should one decide to automate test
cases?
It is not always advantageous to automate test cases. There are times when
manual testing may be more appropriate. For instance, if the application’s user
interface will change considerably in the near future, then any automation
might need to be rewritten anyway. Also, sometimes there simply is not enough
time to build test automation. For the short term, manual testing may be more
effective. If an application has a very tight deadline, there is currently no
test automation available, and it’s imperative that the testing gets done within
that time frame, then manual testing is the best solution.
3 - Types of Testing
Acceptance testing
This type of testing is done to determine if a feature or system
meets the customer expectations and requirements.
This type of testing generally involves the customer’s
cooperation or feedback, being a validation activity that
answers the question:
Are we building the right product?
For web applications, the automation of this testing can be done
directly with Selenium by simulating the user’s expected behaviour.
This simulation could be done by record/playback or through the
different supported languages as explained in this documentation.
Note: Acceptance testing is a subtype of functional testing,
and some people refer to it by that broader name.
Functional testing
This type of testing is done to determine if a
feature or system functions properly without issues. It checks
the system at different levels to ensure that all scenarios
are covered and that the system does what it’s
supposed to do. It’s a verification activity that
answers the question:
Are we building the product right?
This generally includes checking that the system works without errors
(404, exceptions…), in a usable way (correct redirections),
in an accessible way, and according to its specifications
(see acceptance testing above).
For web applications, the automation of this testing can be
done directly with Selenium by simulating expected returns.
This simulation could be done by record/playback or through
the different supported languages as explained in this documentation.
Performance testing
As its name indicates, performance tests are done
to measure how well an application is performing.
There are two main sub-types for performance testing:
Load testing
Load testing is done to verify how well the
application works under different defined loads
(usually a particular number of users connected at once).
Stress testing
Stress testing is done to verify how well the
application works under stress (or above the maximum supported load).
Generally, performance tests are done by executing some
tests written with Selenium that simulate different users
hitting a particular function on the web app and
retrieving some meaningful measurements.
The metrics themselves are generally retrieved by other tools.
One such tool is JMeter.
For a web application, details to measure include
throughput, latency, data loss, individual component loading times…
Note 1: All browsers have a performance tab in their
developers’ tools section (accessible by pressing F12).
Note 2: Performance testing is a subtype of non-functional testing,
as it is generally measured per system and not per function/feature.
Regression testing
This testing is generally done after a change, fix or feature addition.
To ensure that the change has not broken any of the existing
functionality, some already executed tests are executed again.
The set of re-executed tests can be full or partial
and can include several different types, depending
on the application and development team.
Test driven development (TDD)
Rather than a test type per se, TDD is an iterative
development methodology in which tests drive the design of a feature.
Each cycle starts by creating a set of unit tests that
the feature should eventually pass (they should fail their first time executed).
After this, development takes place to make the tests pass.
The tests are executed again, starting another cycle
and this process continues until all tests are passing.
This aims to speed up the development of an application
based on the fact that defects are less costly the earlier they are found.
Behavior-driven development (BDD)
BDD is also an iterative development methodology
based on the above TDD, in which the goal is to involve
all the parties in the development of an application.
Each cycle starts by creating some specifications
(which should fail). Then create the unit
tests (which should also fail), and then do the development.
This cycle is repeated until all types of tests are passing.
In order to do so, a specification language is
used. It should be understandable by all parties and
simple, standard and explicit.
Most tools use Gherkin as this language.
The goal is to be able to detect even more errors
than TDD, by targeting potential acceptance errors
too and make communication between parties smoother.
Several tools are currently available
to write the specifications and match them with code functions,
such as Cucumber or SpecFlow.
Other tools are built on top of Selenium to make this process
even faster by directly transforming the BDD specifications into
executable code.
Some of these are JBehave, Capybara and Robot Framework.
4 - Encouraged behaviors
4.1 - Page object models
Note: this page has merged contents from multiple sources, including
the Selenium wiki
Overview
Within your web app’s UI, there are areas that your tests interact with.
A Page Object simply models these as objects within the test code.
This reduces the amount of duplicated code and means that if the UI changes,
the fix needs only to be applied in one place.
Page Object is a Design Pattern that has become popular in test automation for
enhancing test maintenance and reducing code duplication. A page object is an
object-oriented class that serves as an interface to a page of your AUT. The
tests then use the methods of this page object class whenever they need to
interact with the UI of that page. The benefit is that if the UI changes for
the page, the tests themselves don’t need to change, only the code within the
page object needs to change. Consequently, all changes to support that new UI
are located in one place.
Advantages
- There is a clean separation between the test code and page-specific code, such as
locators (or their use if you’re using a UI Map) and layout.
- There is a single repository for the services or operations the page offers
rather than having these services scattered throughout the tests.
In both cases, this allows any modifications required due to UI changes to all
be made in one place. Helpful information on this technique can be found on
numerous blogs as this ‘test design pattern’ is becoming widely used. We
encourage readers who wish to know more to search the internet for blogs
on this subject. Many have written on this design pattern and can provide
helpful tips beyond the scope of this user guide. To get you started,
we’ll illustrate page objects with a simple example.
Examples
First, consider an example, typical of test automation, that does not use a
page object:
/**
 * Tests login feature
 */
public class Login {

    public void testLogin() {
        // fill login data on sign-in page
        driver.findElement(By.name("user_name")).sendKeys("userName");
        driver.findElement(By.name("password")).sendKeys("my supersecret password");
        driver.findElement(By.name("sign_in")).click();

        // verify h1 tag is "Hello userName" after login
        driver.findElement(By.tagName("h1")).isDisplayed();
        assertThat(driver.findElement(By.tagName("h1")).getText(), is("Hello userName"));
    }
}
There are two problems with this approach.
- There is no separation between the test method and the AUT’s locators (element
names in this example); both are intertwined in a single method. If the AUT’s UI
changes its identifiers, layout, or how a login is input and processed, the test
itself must change.
- The locators would be spread across multiple tests: every test that had to
use this login page.
Applying the page object technique, this example could be rewritten as follows,
starting with a page object for the Sign-in page.
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

/**
 * Page Object encapsulates the Sign-in page.
 */
public class SignInPage {
    protected WebDriver driver;

    // <input name="user_name" type="text" value="">
    private By usernameBy = By.name("user_name");
    // <input name="password" type="password" value="">
    private By passwordBy = By.name("password");
    // <input name="sign_in" type="submit" value="SignIn">
    private By signinBy = By.name("sign_in");

    public SignInPage(WebDriver driver) {
        this.driver = driver;
        if (!driver.getTitle().equals("Sign In Page")) {
            throw new IllegalStateException("This is not Sign In Page," +
                    " current page is: " + driver.getCurrentUrl());
        }
    }

    /**
     * Login as valid user
     *
     * @param userName
     * @param password
     * @return HomePage object
     */
    public HomePage loginValidUser(String userName, String password) {
        driver.findElement(usernameBy).sendKeys(userName);
        driver.findElement(passwordBy).sendKeys(password);
        driver.findElement(signinBy).click();
        return new HomePage(driver);
    }
}
A page object for the Home page could look like this.
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

/**
 * Page Object encapsulates the Home Page
 */
public class HomePage {
    protected WebDriver driver;

    // <h1>Hello userName</h1>
    private By messageBy = By.tagName("h1");

    public HomePage(WebDriver driver) {
        this.driver = driver;
        if (!driver.getTitle().equals("Home Page of logged in user")) {
            throw new IllegalStateException("This is not Home Page of logged in user," +
                    " current page is: " + driver.getCurrentUrl());
        }
    }

    /**
     * Get message (h1 tag)
     *
     * @return String message text
     */
    public String getMessageText() {
        return driver.findElement(messageBy).getText();
    }

    public HomePage manageProfile() {
        // Page encapsulation to manage profile functionality
        return new HomePage(driver);
    }

    /* More methods offering the services represented by Home Page
       of Logged User. These methods in turn might return more Page Objects
       for example click on Compose mail button could return ComposeMail class object */
}
So now, the login test would use these two page objects as follows.
/**
 * Tests login feature
 */
public class TestLogin {

    @Test
    public void testLogin() {
        SignInPage signInPage = new SignInPage(driver);
        HomePage homePage = signInPage.loginValidUser("userName", "password");
        assertThat(homePage.getMessageText(), is("Hello userName"));
    }
}
There is a lot of flexibility in how the page objects may be designed, but
there are a few basic rules for getting the desired maintainability of your
test code.
Assertions in Page Objects
Page objects themselves should never make verifications or assertions. This is
part of your test and should always be within the test’s code, never in a page
object. The page object will contain the representation of the page, and the
services the page provides via methods but no code related to what is being
tested should be within the page object.
There is one single verification which can, and should, be within the page
object and that is to verify that the page, and possibly critical elements on
the page, were loaded correctly. This verification should be done while
instantiating the page object. In the examples above, both the SignInPage and
HomePage constructors check that the expected page is available and ready for
requests from the test.
Page Component Objects
A page object does not necessarily need to represent all the parts of a
page itself. The same principles used for page objects can be used to
create “Page Component Objects” that represent discrete chunks of the
page and can be included in page objects. These component objects can
provide references to the elements inside those discrete chunks, and
methods to leverage the functionality provided by them. You can even
nest component objects inside other component objects for more complex
pages. If a page in the AUT has multiple components, or common
components used throughout the site (e.g. a navigation bar), then
modelling them as component objects may improve maintainability and
reduce code duplication.
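The composition described above can be sketched as follows. This is a minimal illustration rather than Selenium code: so that it runs without a browser, the hypothetical `UiElement` interface stands in for Selenium's `WebElement`, and `SearchBox`, `NavigationBar`, `ProductsPage`, and `RecordingElement` are names invented for the example.

```java
import java.util.List;

// Hypothetical stand-in for Selenium's WebElement so the sketch runs without a browser.
interface UiElement {
    void click();
    void sendKeys(String text);
}

// A component nested inside another component: the search box of the navigation bar.
class SearchBox {
    private final UiElement input;
    private final UiElement submit;

    SearchBox(UiElement input, UiElement submit) {
        this.input = input;
        this.submit = submit;
    }

    void searchFor(String term) {
        input.sendKeys(term);
        submit.click();
    }
}

// A component shared by many pages; it alone knows about its own elements.
class NavigationBar {
    private final UiElement homeLink;
    private final SearchBox searchBox;

    NavigationBar(UiElement homeLink, SearchBox searchBox) {
        this.homeLink = homeLink;
        this.searchBox = searchBox;
    }

    void goHome() {
        homeLink.click();
    }

    SearchBox searchBox() {
        return searchBox;
    }
}

// A page object composed of components instead of modelling every element itself.
class ProductsPage {
    private final NavigationBar navigation;

    ProductsPage(NavigationBar navigation) {
        this.navigation = navigation;
    }

    NavigationBar navigation() {
        return navigation;
    }
}

// Records interactions in a shared log so the sketch can be exercised without a real page.
class RecordingElement implements UiElement {
    private final String name;
    private final List<String> log;

    RecordingElement(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    @Override
    public void click() {
        log.add(name + ".click");
    }

    @Override
    public void sendKeys(String text) {
        log.add(name + ".sendKeys(" + text + ")");
    }
}
```

A test would then read `page.navigation().searchBox().searchFor("selenium")`, and any page that shows the same navigation bar can reuse `NavigationBar` unchanged. In real Selenium code the components would hold `WebElement` or `By` references instead of `UiElement`.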
Other Design Patterns Used in Testing
There are other design patterns that also may be used in testing. Some use a
Page Factory for instantiating their page objects. Discussing all of these is
beyond the scope of this user guide. Here, we merely want to introduce the
concepts to make the reader aware of some of the things that can be done. As
was mentioned earlier, many have blogged on this topic and we encourage the
reader to search for blogs on these topics.
Implementation Notes
PageObjects can be thought of as facing in two directions simultaneously. Facing toward the developer of a test, they represent the services offered by a particular page. Facing away from the developer, they should be the only thing that has a deep knowledge of the structure of the HTML of a page (or part of a page). It’s simplest to think of the methods on a Page Object as offering the “services” that a page offers rather than exposing the details and mechanics of the page. As an example, think of the inbox of any web-based email system. Amongst the services it offers are the ability to compose a new email, choose to read a single email, and list the subject lines of the emails in the inbox. How these are implemented shouldn’t matter to the test.
Because we’re encouraging the developer of a test to try to think about the services they’re interacting with rather than the implementation, PageObjects should seldom expose the underlying WebDriver instance. To facilitate this, methods on the PageObject should return other PageObjects. This means we can effectively model the user’s journey through our application. It also means that should the way that pages relate to one another change (like when the login page asks the user to change their password the first time they log into a service when it previously didn’t do that), simply changing the appropriate method’s signature will cause the tests to fail to compile. Put another way, we can tell which tests would fail without needing to run them when we change the relationship between pages and reflect this in the PageObjects.
One consequence of this approach is that it may be necessary to model (for example) both a successful and unsuccessful login; or a click could have a different result depending on the app’s state. When this happens, it is common to have multiple methods on the PageObject:
public class LoginPage {
    public HomePage loginAs(String username, String password) {
        // ... clever magic happens here
    }

    public LoginPage loginAsExpectingError(String username, String password) {
        // ... failed login here, maybe because one or both of the username and password are wrong
    }

    public String getErrorMessage() {
        // So we can verify that the correct error is shown
    }
}
The code presented above shows an important point: the tests, not the PageObjects, should be responsible for making assertions about the state of a page. For example:
public void testMessagesAreReadOrUnread() {
    Inbox inbox = new Inbox(driver);
    inbox.assertMessageWithSubjectIsUnread("I like cheese");
    inbox.assertMessageWithSubjectIsNotUnread("I'm not fond of tofu");
}
could be re-written as:
public void testMessagesAreReadOrUnread() {
    Inbox inbox = new Inbox(driver);
    assertTrue(inbox.isMessageWithSubjectIsUnread("I like cheese"));
    assertFalse(inbox.isMessageWithSubjectIsUnread("I'm not fond of tofu"));
}
Of course, as with every guideline, there are exceptions, and one that is commonly seen with PageObjects is to check that the WebDriver is on the correct page when we instantiate the PageObject. This is done in the example below.
Finally, a PageObject need not represent an entire page. It may represent a section that appears frequently within a site or page, such as site navigation. The essential principle is that there is only one place in your test suite with knowledge of the structure of the HTML of a particular (part of a) page.
Summary
- The public methods represent the services that the page offers
- Try not to expose the internals of the page
- Generally don’t make assertions
- Methods return other PageObjects
- Need not represent an entire page
- Different results for the same action are modelled as different methods
Example
public class LoginPage {
    private final WebDriver driver;

    public LoginPage(WebDriver driver) {
        this.driver = driver;

        // Check that we're on the right page.
        if (!"Login".equals(driver.getTitle())) {
            // Alternatively, we could navigate to the login page, perhaps logging out first
            throw new IllegalStateException("This is not the login page");
        }
    }

    // The login page contains several HTML elements that will be represented as WebElements.
    // The locators for these elements should only be defined once.
    private final By usernameLocator = By.id("username");
    private final By passwordLocator = By.id("passwd");
    private final By loginButtonLocator = By.id("login");

    // The login page allows the user to type their username into the username field
    public LoginPage typeUsername(String username) {
        // This is the only place that "knows" how to enter a username
        driver.findElement(usernameLocator).sendKeys(username);

        // Return the current page object as this action doesn't navigate to a page represented by another PageObject
        return this;
    }

    // The login page allows the user to type their password into the password field
    public LoginPage typePassword(String password) {
        // This is the only place that "knows" how to enter a password
        driver.findElement(passwordLocator).sendKeys(password);

        // Return the current page object as this action doesn't navigate to a page represented by another PageObject
        return this;
    }

    // The login page allows the user to submit the login form
    public HomePage submitLogin() {
        // This is the only place that submits the login form and expects the destination to be the home page.
        // A separate method should be created for the instance of clicking login whilst expecting a login failure.
        driver.findElement(loginButtonLocator).submit();

        // Return a new page object representing the destination. Should the login page ever
        // go somewhere else (for example, a legal disclaimer) then changing the method signature
        // for this method will mean that all tests that rely on this behaviour won't compile.
        return new HomePage(driver);
    }

    // The login page allows the user to submit the login form knowing that an invalid username and / or password were entered
    public LoginPage submitLoginExpectingFailure() {
        // This is the only place that submits the login form and expects the destination to be the login page due to login failure.
        driver.findElement(loginButtonLocator).submit();

        // Return a new page object representing the destination. Should the user ever be navigated to the home page after submitting a login with credentials
        // expected to fail login, the script will fail when it attempts to instantiate the LoginPage PageObject.
        return new LoginPage(driver);
    }

    // Conceptually, the login page offers the user the service of being able to "log into"
    // the application using a user name and password.
    public HomePage loginAs(String username, String password) {
        // The PageObject methods that enter username, password & submit login have already been defined and should not be repeated here.
        typeUsername(username);
        typePassword(password);
        return submitLogin();
    }
}
Support in WebDriver
There is a PageFactory in the support package that provides support for this pattern and helps to remove some boiler-plate code from your Page Objects at the same time.
4.2 - Domain specific language
A domain specific language (DSL) is a system which provides the user with
an expressive means of solving a problem. It allows a user to
interact with the system on their terms – not just programmer-speak.
Your users, in general, do not care how your site looks. They do not
care about the decoration, animations, or graphics. They
want to use your system to push their new employees through the
process with minimal difficulty; they want to book travel to Alaska;
they want to configure and buy unicorns at a discount. Your job as a
tester is to come as close as you can to “capturing” this mind-set.
With that in mind, we set about “modeling” the application you are
working on, such that the test scripts (the user’s only pre-release
proxy) “speak” for, and represent the user.
The goal is to use ubiquitous language. Rather than referring to “load data into this table” or
“click on the third column” it should be possible to use language such as “create a new account” or
“order displayed results by name”.
With Selenium, DSL is usually represented by methods, written to make
the API simple and readable – they enable a rapport between the
developers and the stakeholders (users, product owners, business
intelligence specialists, etc.).
Benefits
- Readable: Business stakeholders can understand it.
- Writable: Easy to write, avoids unnecessary duplication.
- Extensible: Functionality can (reasonably) be added
without breaking contracts and existing functionality.
- Maintainable: By leaving the implementation details out of test
cases, you are well-insulated against changes to the AUT*.
Further Reading
(previously located: https://github.com/SeleniumHQ/selenium/wiki/Domain-Driven-Design)
There is a good book on Domain Driven Design by Eric Evans http://www.amazon.com/exec/obidos/ASIN/0321125215/domainlanguag-20
And to whet your appetite there’s a useful smaller book available online for
download at http://www.infoq.com/minibooks/domain-driven-design-quickly
Java
Here is an example of a reasonable DSL method in Java.
For brevity’s sake, it assumes the driver object is pre-defined
and available to the method.
/**
 * Takes a username and password, fills out the fields, and clicks "login".
 * @return An instance of the AccountPage
 */
public AccountPage loginAsUser(String username, String password) {
    // Fill out the username field.
    WebElement loginField = driver.findElement(By.id("loginField"));
    loginField.clear();
    loginField.sendKeys(username);

    // Fill out the password field. The locator we're using is "By.id", and we should
    // have it defined elsewhere in the class.
    WebElement passwordField = driver.findElement(By.id("password"));
    passwordField.clear();
    passwordField.sendKeys(password);

    // Click the login button, which happens to have the id "submit".
    driver.findElement(By.id("submit")).click();

    // Create and return a new instance of the AccountPage (via the built-in Selenium
    // PageFactory).
    return PageFactory.initElements(driver, AccountPage.class);
}
This method completely abstracts the concepts of input fields,
buttons, clicking, and even pages from your test code. Using this
approach, all a tester has to do is call this method. This gives
you a maintenance advantage: if the login fields ever changed, you
would only ever have to change this method - not your tests.
public void loginTest() {
    loginAsUser("cbrown", "cl0wn3");

    // Now that we're logged in, do some other stuff--since we used a DSL to support
    // our testers, it's as easy as choosing from available methods.
    // (doSomething, doSomethingElse and somethingWasDone stand for other DSL methods.)
    doSomething();
    doSomethingElse();
    Assert.assertTrue("Something should have been done!", somethingWasDone());

    // Note that we still haven't referred to a button or web control anywhere in this
    // script...
}
It bears repeating: one of your primary goals should be writing an
API that allows your tests to address the problem at hand, and NOT
the problem of the UI. The UI is a secondary concern for your
users – they do not care about the UI, they just want to get their job
done. Your test scripts should read like a laundry list of things
the user wants to DO, and the things they want to KNOW. The tests
should not concern themselves with HOW the UI requires you to go
about it.
*AUT: Application under test
4.3 - Generating application state
Selenium should not be used to prepare a test case. All repetitive
actions and preparations for a test case should be done through other
methods. For example, most web UIs have authentication (e.g. a login
form). Eliminating logging in via web browser before every test will
improve both the speed and stability of the test. A method should be
created to gain access to the AUT* (e.g. using an API to log in and set a
cookie). Likewise, pre-loading data for
testing should not be done using Selenium. As mentioned previously,
existing APIs should be leveraged to create data for the AUT*.
*AUT: Application under test
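As a rough sketch of the "use an API to log in and set a cookie" idea: assuming the AUT exposes a login endpoint whose response carries a `Set-Cookie` header, the session cookie can be extracted with the JDK's `java.net.HttpCookie` and then handed to the browser. The class name `SessionCookies` and the cookie name `SESSIONID` are hypothetical.

```java
import java.net.HttpCookie;
import java.util.List;

class SessionCookies {
    // Extracts the named cookie from the Set-Cookie header returned by a login API call.
    static HttpCookie fromSetCookieHeader(String header, String cookieName) {
        List<HttpCookie> cookies = HttpCookie.parse(header);
        for (HttpCookie cookie : cookies) {
            if (cookie.getName().equals(cookieName)) {
                return cookie;
            }
        }
        throw new IllegalStateException("Login response did not set cookie " + cookieName);
    }
}
```

With Selenium, the result could then be installed after navigating to any page on the AUT's domain, for example `driver.manage().addCookie(new Cookie(cookie.getName(), cookie.getValue()))`, so that the test starts already authenticated.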
4.4 - Mock external services
Eliminating the dependencies on external services will greatly improve
the speed and stability of your tests.
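One possible way to do this, sketched with the HTTP server that ships with the JDK (`com.sun.net.httpserver.HttpServer`): a local stub answers with a canned response, and the AUT is configured to call the stub instead of the real service. The class name `StubService`, the path, and the payload are illustrative assumptions; dedicated mocking tools offer far more than this.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Local stand-in for an external HTTP service; serves one canned response.
class StubService {
    private final HttpServer server;

    StubService(String path, String cannedBody) {
        try {
            // Port 0 asks the operating system for any free port.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] body = cannedBody.getBytes(StandardCharsets.UTF_8);
        server.createContext(path, exchange -> {
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }

    // Base URL to configure in the AUT in place of the real service.
    String baseUrl() {
        return "http://127.0.0.1:" + server.getAddress().getPort();
    }

    void stop() {
        server.stop(0);
    }
}
```

A test would create the stub in its setup, point the AUT at `baseUrl()`, and call `stop()` in its teardown, so the test no longer depends on the availability or speed of the real service.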
4.5 - Improved reporting
Selenium is not designed to report on the status of test cases
run. Taking advantage of the built-in reporting capabilities of unit
test frameworks is a good start. Most unit test frameworks
can generate xUnit or HTML formatted reports. xUnit
reports are popular for importing results to a Continuous Integration
(CI) server like Jenkins, Travis, Bamboo, etc. Here are some links
for more information regarding report outputs for several languages.
NUnit 3 Console Runner
NUnit 3 Console Command Line
xUnit getting test results in TeamCity
xUnit getting test results in CruiseControl.NET
xUnit getting test results in Azure DevOps
4.6 - Avoid sharing state
Although mentioned in several places, it is worth mentioning again.
We must ensure that the tests are isolated from one another.
- Do not share test data. Imagine several tests that each query the database
for valid orders before picking one to perform an action on. Should two tests
pick up the same order you are likely to get unexpected behavior.
- Clean up stale data in the application that might be picked up by another
test e.g. invalid order records.
- Create a new WebDriver instance per test. This helps ensure test isolation
and makes parallelization simpler.
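One simple way to avoid sharing test data is for each test to create its own records under names that no other test can collide with. A minimal sketch, in which the helper class `TestData` is a hypothetical name:

```java
import java.util.UUID;

class TestData {
    // Builds a value that no other test (or earlier run) will have used.
    static String uniqueName(String prefix) {
        return prefix + "-" + UUID.randomUUID();
    }
}
```

A test would then create, say, an order named `TestData.uniqueName("order")` and act only on that order, instead of querying the database for whichever valid order happens to exist.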
4.7 - Tips on working with locators
When to use which locators and how best to manage them in your code.
Take a look at examples of the supported locator strategies.
In general, if HTML IDs are available, unique, and consistently
predictable, they are the preferred method for locating an element on
a page. They tend to work very quickly, and forgo much of the processing
that comes with complicated DOM traversals.
If unique IDs are unavailable, a well-written CSS selector is the
preferred method of locating an element. XPath works as well as CSS
selectors, but the syntax is complicated and frequently difficult to
debug. Though XPath selectors are very flexible, they are typically
not performance tested by browser vendors and tend to be quite slow.
Selection strategies based on linkText and partialLinkText have
drawbacks in that they only work on link elements. Additionally, they
call down to querySelectorAll selectors internally in WebDriver.
Tag name can be a dangerous way to locate elements. There are
frequently multiple elements of the same tag present on the page.
This is mostly useful when calling the findElements(By) method which
returns a collection of elements.
The recommendation is to keep your locators as compact and
readable as possible. Asking WebDriver to traverse the DOM structure
is an expensive operation, and the more you can narrow the scope of
your search, the better.
4.8 - Test independency
Write each test as its own unit. Write the tests in a way that will not be
reliant on other tests to complete:
Let us say there is a content management system with which you can create
some custom content which then appears on your website as a module after
publishing, and it may take some time to sync between the CMS and the
application.
A wrong way of testing your module is to create and publish the content
in one test and then check the module in another test. This
is not reliable, as the content may not be available immediately for the
other test after publishing.
Instead, you can create stub content which can be turned on and off
within the affected test, and use that for validating the module. However,
for content creation, you can still have a separate test.
4.9 - Consider using a fluent API
Martin Fowler coined the term “Fluent API”. Selenium already
implements something like this in its FluentWait class, which is
meant as an alternative to the standard Wait class.
You could enable the Fluent API design pattern in your page object
and then query the Google search page with a code snippet like this one:
driver.get("http://www.google.com/webhp?hl=en&tab=ww");
GoogleSearchPage gsp = new GoogleSearchPage(driver);
gsp.withFluent().setSearchString("selenium").clickSearchButton(); // "selenium" is just an example search term
The Google page object class with this fluent behavior
might look like this:
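import java.time.Duration;
import org.junit.Assert;
import org.openqa.selenium.By;
import org.openqa.selenium.NoSuchElementException;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.FindBy;
import org.openqa.selenium.support.PageFactory;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.LoadableComponent;
import org.openqa.selenium.support.ui.Wait;
import org.openqa.selenium.support.ui.WebDriverWait;

public class GoogleSearchPage extends LoadableComponent<GoogleSearchPage> {
    private final WebDriver driver;
    private final GSPFluentInterface gspfi;

    // Fluent wrapper: each method returns this so that calls can be chained
    public class GSPFluentInterface {
        private final GoogleSearchPage gsp;

        public GSPFluentInterface(GoogleSearchPage googleSearchPage) {
            gsp = googleSearchPage;
        }

        public GSPFluentInterface clickSearchButton() {
            gsp.searchButton.click();
            return this;
        }

        public GSPFluentInterface setSearchString(String sstr) {
            gsp.clearAndType(gsp.searchField, sstr);
            return this;
        }
    }

    @FindBy(id = "gbqfq") private WebElement searchField;
    @FindBy(id = "gbqfb") private WebElement searchButton;

    public GoogleSearchPage(WebDriver driver) {
        this.driver = driver;
        gspfi = new GSPFluentInterface(this);
        PageFactory.initElements(driver, this); // Initialize WebElements on page
        this.get(); // Calls isLoaded(); if that fails, calls load() and then isLoaded() again
    }

    public GSPFluentInterface withFluent() {
        return gspfi;
    }

    public void clickSearchButton() {
        searchButton.click();
    }

    public void setSearchString(String sstr) {
        clearAndType(searchField, sstr);
    }

    // Helper assumed by this example: empty the field, then type into it
    private void clearAndType(WebElement field, String text) {
        field.clear();
        field.sendKeys(text);
    }

    // Helper assumed by this example: a missing element counts as "not visible"
    private boolean isSearchFieldVisible() {
        try {
            return searchField.isDisplayed();
        } catch (NoSuchElementException e) {
            return false;
        }
    }

    @Override
    protected void isLoaded() throws Error {
        Assert.assertTrue("Google search page is not yet loaded.", isSearchFieldVisible());
    }

    @Override
    protected void load() {
        // Wait up to three seconds for the search field to become visible, then click it
        Wait<WebDriver> wait = new WebDriverWait(driver, Duration.ofSeconds(3));
        wait.until(ExpectedConditions.visibilityOfElementLocated(By.id("gbqfq"))).click();
    }
}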
4.10 - Fresh browser per test
Start each test from a clean, known state.
Ideally, spin up a new virtual machine for each test.
If spinning up a new virtual machine is not practical,
at least start a new WebDriver for each test.
Most browser drivers like GeckoDriver and ChromeDriver will start with a clean
known state with a new user profile, by default.
WebDriver driver = new FirefoxDriver();
5 - Discouraged behaviors
Things to avoid when automating browsers with Selenium.
5.1 - Captchas
CAPTCHA, short for Completely Automated Public Turing test
to tell Computers and Humans Apart,
is explicitly designed to prevent automation, so do not try!
There are two primary strategies to get around CAPTCHA checks:
- Disable CAPTCHAs in your test environment
- Add a hook to allow tests to bypass the CAPTCHA
5.2 - File downloads
Whilst it is possible to start a download
by clicking a link with a browser under Selenium’s control,
the API does not expose download progress,
making it less than ideal for testing downloaded files.
This is because downloading files is not considered an important aspect
of emulating user interaction with the web platform.
Instead, find the link using Selenium
(and any required cookies)
and pass it to an HTTP request library like libcurl.
The HtmlUnit driver can download attachments
by accessing them as input streams by implementing the
AttachmentHandler interface.
The AttachmentHandler can then be added to the HtmlUnit WebClient.
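The hand-off from Selenium to an HTTP library can be sketched as below. The text mentions libcurl; this sketch swaps in the JDK's own `java.net.http` client (Java 11 and later) to stay in Java. The class name `DownloadRequests` is hypothetical, and the link and cookies are assumed to have been read from the browser session beforehand.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Map;
import java.util.stream.Collectors;

class DownloadRequests {
    // Builds a GET request for the download link, replaying the browser's cookies.
    static HttpRequest forLink(String href, Map<String, String> cookies) {
        String cookieHeader = cookies.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("; "));
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(href)).GET();
        if (!cookieHeader.isEmpty()) {
            builder.header("Cookie", cookieHeader);
        }
        return builder.build();
    }
}
```

In a test, `href` would typically come from the link element's `href` attribute and the cookies from `driver.manage().getCookies()`. The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofFile(path))`, after which the test asserts on the saved file.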
5.3 - HTTP response codes
For some browser configurations in Selenium RC,
Selenium acted as a proxy between the browser
and the site being automated.
This meant that all browser traffic passed through Selenium
could be captured or manipulated.
The captureNetworkTraffic() method
purported to capture all of the network traffic between the browser
and the site being automated,
including HTTP response codes.
Selenium WebDriver is a completely different approach
to browser automation,
preferring to act more like a user.
This is represented in the way you write tests with WebDriver.
In automated functional testing,
checking the status code
is not a particularly important detail of a test’s failure;
the steps that preceded it are more important.
The browser will always represent the HTTP status code;
imagine for example a 404 or a 500 error page.
A simple way to “fail fast” when you encounter one of these error pages
is to check the page title or content of a reliable point
(e.g. the <h1> tag) after every page load.
If you are using the page object model,
you can include this check in your class constructor
or similar point where the page load is expected.
Occasionally, the HTTP code may even be represented
in the browser’s error page
and you could use WebDriver to read this
and improve your debugging output.
Checking the webpage itself is in line
with WebDriver’s ideal practice
of representing and asserting upon the user’s view of the website.
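A minimal sketch of such a "fail fast" check follows. It works on plain strings so that it can be called from a page object constructor with whatever the page reports; the class name `ErrorPageCheck` and the marker list are hypothetical and should be replaced with the text your application's error pages really display.

```java
import java.util.List;

class ErrorPageCheck {
    // Hypothetical markers; use whatever your application's error pages actually show.
    private static final List<String> ERROR_MARKERS =
            List.of("404", "500", "Not Found", "Internal Server Error");

    // Throws immediately if the heading (or title) looks like an error page.
    static void failFastIfErrorPage(String heading) {
        for (String marker : ERROR_MARKERS) {
            if (heading.contains(marker)) {
                throw new IllegalStateException("Error page detected: " + heading);
            }
        }
    }
}
```

In a page object constructor this might be called as `ErrorPageCheck.failFastIfErrorPage(driver.findElement(By.tagName("h1")).getText())`. Substring matching is deliberately crude here; a heading that legitimately contains one of the markers would need a stricter rule.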
If you insist, an advanced solution to capturing HTTP status codes
is to replicate the behaviour of Selenium RC by using a proxy.
The WebDriver API provides the ability to set a proxy for the browser,
and there are a number of proxies that
allow you to programmatically manipulate
the contents of requests sent to and received from the web server.
Using a proxy lets you decide how you want to respond
to redirection response codes.
Additionally, not every browser
makes the response codes available to WebDriver,
so opting to use a proxy
allows you to have a solution that works for every browser.
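As a rough sketch of the browser side of this setup, the helper below builds a `proxy` capability using the field names from the W3C WebDriver specification (`proxyType`, `httpProxy`, `sslProxy`, `noProxy`). The function name, host, and port are assumptions for illustration. How the proxy itself records response codes depends on the proxy you choose and is not shown here.

```python
def manual_proxy_capability(host, port, bypass=()):
    """Return a W3C-style `proxy` capability pointing the browser at host:port.

    `bypass` is an optional sequence of hosts that should not go through
    the proxy.
    """
    address = f"{host}:{port}"
    proxy = {
        "proxyType": "manual",
        "httpProxy": address,
        "sslProxy": address,
    }
    if bypass:
        proxy["noProxy"] = list(bypass)
    return proxy
```

With the Python bindings, a dict like this can usually be attached to a browser options object, for example via `options.set_capability("proxy", manual_proxy_capability("localhost", 8080))`, before the driver is created.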
5.4 - Gmail, email and Facebook logins
For multiple reasons, logging into sites like Gmail and Facebook
using WebDriver is not recommended.
Aside from being against the usage terms for these sites
(where you risk having the account shut down),
it is slow and unreliable.
The ideal practice is to use the APIs that email providers offer,
or in the case of Facebook the developer tools service
which exposes an API for creating test accounts, friends and so forth.
Although using an API might seem like a bit of extra hard work,
you will be paid back in speed, reliability, and stability.
The API is also unlikely to change,
whereas webpages and HTML locators change often
and require you to update your test framework.
Logging in to third-party sites using WebDriver
at any point of your test increases the risk
of your test failing because it makes your test longer.
A general rule of thumb is that longer tests
are more fragile and unreliable.
WebDriver implementations that are W3C conformant
also annotate the navigator object with a webdriver property
so that Denial of Service attacks can be mitigated.
5.5 - Test dependency
A common misconception about automated testing is that tests must run
in a specific order. Your tests should be able to run in any order,
and should not rely on other tests having completed in order to succeed.
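One way to keep tests order-independent is to have each test build its own starting state. The sketch below uses Python's `unittest` with a hypothetical shopping cart stored in a plain list. In a real suite the `setUp` method would more likely create a fresh browser session or test data. The `run_in_random_order` helper is an illustrative way to check that shuffling the tests does not change the outcome.

```python
import random
import unittest


class ShoppingCartTest(unittest.TestCase):
    def setUp(self):
        # Each test builds its own starting state instead of relying on
        # whatever a previously run test left behind.
        self.cart = []

    def test_add_item(self):
        self.cart.append("book")
        self.assertEqual(self.cart, ["book"])

    def test_cart_starts_empty(self):
        self.assertEqual(self.cart, [])


def run_in_random_order(test_case_class, seed=None):
    """Run all tests of a TestCase class in shuffled order and return the result."""
    names = list(unittest.TestLoader().getTestCaseNames(test_case_class))
    random.Random(seed).shuffle(names)
    suite = unittest.TestSuite(test_case_class(name) for name in names)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

If `test_cart_starts_empty` depended on `test_add_item` not having run first, some shuffles would fail; with per-test setup every ordering passes.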
5.6 - Performance testing
Performance testing using Selenium and WebDriver
is generally not advised,
not because it is incapable,
but because it is not optimised for the job
and you are unlikely to get good results.
It may seem ideal to performance test
in the context of the user, but a suite of WebDriver tests
is subject to many points of external and internal fragility
which are beyond your control;
for example browser startup speed,
speed of HTTP servers,
response of third party servers that host JavaScript or CSS,
and the instrumentation penalty
of the WebDriver implementation itself.
Variation at these points will cause variation in your results.
It is difficult to separate
the performance of your website
from the performance of external resources,
and it is also hard to tell what the performance penalty is
for using WebDriver in the browser,
especially if you are injecting scripts.
The other potential attraction is “saving time” —
carrying out functional and performance tests at the same time.
However, functional and performance tests have opposing objectives.
To test functionality, a tester may need to be patient
and wait for loading,
but this will cloud the performance testing results and vice versa.
To improve the performance of your website,
you will need to be able to analyse overall performance
independent of environment differences,
identify poor code practices,
and break down the performance of individual resources
(e.g. CSS or JavaScript)
in order to know what to improve.
There are performance testing tools available
that can do this job already,
that provide reporting and analysis,
and can even make improvement suggestions.
An example (open source) package to use is JMeter.
5.7 - Link spidering
Using WebDriver to spider through links
is not a recommended practice, not because it cannot be done,
but because WebDriver is definitely not the ideal tool for this.
WebDriver needs time to start up,
and can take several seconds, up to a minute
depending on how your test is written,
just to get to the page and traverse through the DOM.
Instead of using WebDriver for this,
you could save a lot of time
by executing a curl command,
or using a library such as BeautifulSoup,
since these methods do not rely
on creating a browser and navigating to a page.
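As a minimal sketch of browser-free link extraction, the following uses Python's standard `html.parser` module rather than BeautifulSoup, so that it has no third-party dependencies. The names `LinkCollector` and `extract_links` are illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> in a document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all link targets found in `html`, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

The HTML itself can be fetched with any HTTP client (for example `urllib.request.urlopen`), and each extracted link can then be requested in turn to check its status, all without starting a browser. Note that links generated by JavaScript at runtime will not appear in the raw HTML.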
5.8 - Two Factor Authentication
Two Factor Authentication (2FA) is an authentication
mechanism where a One Time Password (OTP) is generated using “Authenticator”
mobile apps such as “Google Authenticator” or “Microsoft Authenticator”,
or delivered by SMS or e-mail. Automating this seamlessly
and consistently is a big challenge in Selenium. There are some ways
to automate this process, but doing so adds another layer on top of your
Selenium tests and is not as secure. So, you should avoid automating 2FA.
There are a few options to get around 2FA checks:
- Disable 2FA for certain Users in the test environment, so that you can
use those user credentials in the automation.
- Disable 2FA in your test environment.
- Disable 2FA if you log in from certain IPs. That way you can configure your
test machine IPs to bypass the check.