Abbot
(with support from Costello)
Applying what I learned with JUnit to writing GUI tests, and building a better
'bot.
Contents
The Pain (Problems and Goals)
On the subject of why testing is so important, I won't repeat here what
has been aptly described in
Test
Infected. If you haven't read that document, please do so before
proceeding here.
Often a developer will want to make some optimizations to a piece of code on
which a lot of user-interface behavior depends. Sure, the new table sorter
will be ten times faster and more memory efficient, but are you sure the
changes won't affect the report generator? Run the tests and find out. A
developer is much more likely to run a test that has encapsulated how to set
up and test a report than he is to launch the application by hand and try to
track down who knows how to do that particular test.
To test a GUI, you need a reliable method of finding components of interest,
clicking buttons, selecting cells in a table, dragging things about. If
you've ever used java.awt.Robot, you know that you need A Better 'Bot in order
to perform any user-level actions. The events the Robot provides are like
assembly language for GUI testing. To facilitate and encourage testing, you
need a higher-level representation.
Describing expected behavior before writing the code can clarify the
developer's goals and avoid overbuilding useless feature sets. This principle
applies to GUI component development as well as composite GUI development,
since the process of writing tests against the API can elucidate and clarify
what is required from the client/user point of view.
GUI Design
First of all, any GUI component should provide a public API which can be
invoked in the same manner via a system user event or programmatically. Keep
this in mind when writing new components. In the case of Java's Swing
components, the event handling is mixed up with complex component behavior
triggers in the Look and Feel code. It's not possible to execute the same
code paths without triggering the originating events. A better design would
be to have the Look and Feel code simply translate arbitrary,
platform-specific event sequences into public API calls provided by the
underlying component. Such a design enables the component to be operated
equally as well by code, whether for accessibility or testing purposes.
GUI Testing
GUIs need testing. Contrary to some opinion, the problem is not always (or
even commonly) solvable by making the GUI as stupid as possible. GUIs that are
sufficiently simple to not require testing are also uninteresting, so they do
not play into this discussion. Any GUI of sufficient utility will have some
level of complexity, and even if that complexity is limited to listeners (code
responding to GUI changes) and updates (GUI listening to code state changes),
those hookups need to be verified.
Getting developers to test is not easy, especially if the testing itself
requires additional learning. Developers will not want to learn the details
of specific application procedures when it has no bearing on their immediate
work, nor should they have to. If I'm working on a pie chart graph, I don't
really want to know the details of connecting to the database and making
queries simply to get an environment set up for testing. So the framework
for testing GUIs should require no more special knowledge than you might need
to use the GUI manually. That means
- Look up a component, usually by some obvious attribute like its label.
- Perform some user action on it, e.g. "click" or "select row".
Scripts vs compiled code
How can I test a GUI prior to writing code? One alternative (and useful in
certain cases) is developing a mockup in a RAD tool. Unfortunately, the
usefulness is relatively short-lived; it's not really possible (at this point
in time) to automatically generate a very interesting interface. If your
entire interface consists of buttons and forms, you may not really need a gui
tester anyway. Mockups don't convert well to tests for the actual developed
code, and RAD tool output usually requires some hand modification
afterwards.
What a test script could plainly describe the GUI components of interest, and
simply describe the actions to take on those components? Providing you know
the basic building blocks, you can edit the scripts by hand or if you don't
know the building blocks, in a script editor. No compilation necessary, which
speeds development and maintenance of tests.
I wanted scripts to be hand-editable, with no separate compilation step. I
wanted to be able to drop scripts into directory hierarchies as needed, and
have the test framework pick them up automatically, similar to how JUnit
auto-generates test cases.
How to Test
One issue in defining a GUI test is that you need to map from a semantic
action (select the second item in the records table) onto a programmatic
action (myTable.setSelectedIndex(1)
). This comprises two separate
problems. First, the target of the action, "records table", must be somehow
translated into an actual code object. Second, the semantic event "select the
second item" must be translated into a programmatic action on that code
object.
Tracking components
There are many methods of identifying a component in a GUI hierarchy. We want
the tests to be flexible enough to not break simply because another button or
panel was added to the GUI layout.
- Component Names Java provides for naming components, but since no
one ever names them, this method of identification is mostly useless. Worse,
some auto-generated components (frames for otherwise frameless windows,
windows for popup menus, and most dialog instantiations) have auto-generated
names, which aren't particularly helpful, or even downright misleading if
components get created in a different order than expected.
- Position in hierarchy This method guarantess a unique match, but
also guarantees your script will break when that hierarchy changes, even for
otherwise trivial modifications (like inserting a scrollpane). Each component
would need to store its parent reference and index within that parent. Note
that this implies each parent reference might need to store the same
information for itself.
- Parent window title This is useful as a first-pass discriminator in
multi-window applications.
- Component class This helps discriminate from the available
components list, but is not likely sufficient on its own to identify a
component.
- Custom tags Here's where the real component resolution is done.
Many component classes will have some aspect or property that can be used to
uniquely identify them, typically a label (the text of a button or menu, a
labelFor component, a window's title), but potentially any identifying
element may be used. Abbot uses custom component testers to get the unique
tag for any given component. These testers are dynamically loaded, so we
don't have to know a priori how to get a tag for a given component class.
Combined with all the previous attributes, Abbot does a pretty good job of
tracking components.
Recording events
Ideally, there would be a programmatic function on every component that
performed a given user action, similar to the doClick
method on
AbstractButton
. Given that this is not the case, we want to
provide for both implementing such functions and loading them dynamically, and
constructing semantic events from low-level events when corresponding
functions are not available. Having dedicated semantic functions makes
writing scripts easier, but supporting the low-level events means they are not
absolutely required.
The most basic events to support are those that correspond to low-level OS
events for user input, namely
- Pointer motion
- Button press/release
- Key press/release
In addition, the most common semantic events that affect user input are
windows opening and closing, so we throw in support for those (Alternatively,
the time delta between events could be preserved for replay, but doing so has
not yet proved useful in very many cases)
- Wait for window open/closed
The easiest way to write a script is to record a series of actions for later
replay, adding checkpoints or assertions along the way to verify correctness.
Eventually, you might want to add heuristics that strip out meaningless
events, but keeping track of a few basic events is sufficient for good
functionality.
Over and above event stream support, it's useful to have custom, class-based
component functions that provide higher-level semantic events. For example, a
custom table might export a select(int row, int col)
action to
select a cell within the table, or sortByColumn(int col)
to
invoke a click on the column header which would cause a sort in the table.
The custom actions use existing building blocks (low-level events or existing
semantic actions) to construct the higher-level semantic event.