
Many of you are using image recognition or other types of visual, element, or optical character recognition implementations for testing. This is especially handy for mobile games, where all of the graphical content is rendered with OpenGL ES or comes directly from UI engines whose elements are difficult for test automation scripts to recognize.

In this example, I'll walk you through a basic example of how to use image recognition for mobile game testing, and what sort of assets and test script you need for it.

Download Our Appium Beginner’s Guide to Set Up a Proper Testing Environment for Appium and Image Recognition

In this example, we'll be using Hill Climb Racing (downloaded directly from Google Play) for Android. For testing, we use a set of real Android devices (from different OEMs, with different OS versions, different form factors, etc.) and compare the results. We'll be using the Appium test automation framework together with our own image recognition/optical character recognition feature implemented for Bitbar Testing. And to keep things easy and straightforward, we'll use server-side Appium execution so that only minimal configuration is required from a test automation perspective.

File Structure

[Screenshot: file structure of the test package]
When it comes to putting the zip file together with all the required test assets, the basic file structure is as illustrated in the picture above. The three core files for the test package – pom.xml (for Maven, including the build-specific configuration), one with Bitbar Testing and Appium specific URL information, and a shell script for execution – are the actual configuration for the test infrastructure. The image files (under 'queryimages' / 'hc') are .png files that capture certain visual elements from Hill Climb Racing and will be used to define areas and actions for the test script.
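As a rough illustration, the pom.xml is a standard Maven build file that pulls in the Appium Java client; the snippet below is a minimal sketch with illustrative coordinates and versions, not the exact contents of the Bitbar sample package:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>image-recognition-test</artifactId>
  <version>1.0-SNAPSHOT</version>
  <dependencies>
    <!-- Appium Java client, used by the test script -->
    <dependency>
      <groupId>io.appium</groupId>
      <artifactId>java-client</artifactId>
      <version>2.1.0</version>
    </dependency>
    <!-- JUnit, for assertions such as assertNotNull -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
    </dependency>
  </dependencies>
</project>
```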

The Test Script and Images as Visual Identifiers

[Screenshot: test script with .png image assets used as visual identifiers]
With the help of AKAZE and OpenCV, we can quickly build functions that compare the screen content against the graphics stored in .png files. The idea is to provide the visual assets (.png files) as they are, and the test script will compare them against the screen and perform an action whenever those assets appear. For example, timing and delays have been problematic with some test frameworks. With this sort of implementation, you don't need to implement delays or any timing logic in your scripts; instead, the script can simply wait until certain visual assets are shown on screen.

    public Point[] findImageOnScreenAndSetRotation(String image) throws Exception {
        int retries = 5;
        Point[] imgRect = null;
        while ((retries > 0) && (imgRect == null)) {
            if (retries < 5) {
                log("Find image failed, retries left: " + retries);
            }
            takeScreenshot(image + "_screenshot");
            // this will identify the rotation initially
            imgRect = findImage(image, image + "_screenshot", "notSet");
            retries = retries - 1;
        }
        assertNotNull("Image " + image + " not found on screen.", imgRect);
        return imgRect;
    }
The above function is used to determine whether the screenshots need to be rotated and by how many degrees. It also sets the rotation for the screen so that images are recognized in the proper orientation. Next, we will perform a simple click on any identified visual asset described in a .png file:
[Screenshot: the 'more button' image compared against the screen content]

For example, the image with the "more button" content is compared here with the screen content, and if a matching element is spotted, a click is performed. The function is called with the right coordinates, and the time for which the button/visual element should be pressed is given as a parameter.

    public void tapImageOnScreen(String image) throws Exception {
        Point[] imgRect = findImageOnScreen(image);
        // imgRect[4] will have the center of the rectangle containing the image
        if (automationName.equals("selendroid")) {
            selendroidTapAtCoordinate((int) imgRect[4].x, (int) imgRect[4].y, 1);
        } else {
            driver.tap(1, (int) imgRect[4].x, (int) imgRect[4].y, 1);
        }
    }

    public void selendroidTapAtCoordinate(int x, int y, int secs) throws Exception {
        TouchActions actions = new TouchActions(driver);
        actions.down(x, y).perform();
        Thread.sleep(secs * 1000L); // hold for the requested duration
        actions.up(x, y).perform();
    }
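For reference, the center entry at index 4 can be derived from the four corner points of the matched rectangle. A minimal sketch of that computation (class and method names hypothetical, using java.awt.Point for illustration; the script itself uses OpenCV's Point type with double coordinates, hence the casts above):

```java
import java.awt.Point;

public class MatchRect {
    // Average the four corners (indices 0..3) of the matched rectangle
    // to get the center point that findImage stores at index 4.
    public static Point centerOf(Point[] corners) {
        int sumX = 0, sumY = 0;
        for (int i = 0; i < 4; i++) {
            sumX += corners[i].x;
            sumY += corners[i].y;
        }
        return new Point(sumX / 4, sumY / 4);
    }
}
```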

In the game-play, we use selendroidTapAtCoordinate directly. The .png name is given as a parameter, along with the time for which the key should be pressed. At this point the test script has advanced to the actual game-play stage, where there are really only three possible (and clickable) items on the screen. With this configuration, the 'gas' pedal is pressed down for 15 seconds and then released:
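The press-and-hold itself is just a touch-down, a wait, and a touch-up. A minimal sketch of that pattern (class and callback names hypothetical; in the real script the two callbacks wrap TouchActions.down(x, y).perform() and TouchActions.up(x, y).perform()):

```java
public class HoldHelper {
    // Press down, keep the "pedal" held for the given number of seconds,
    // then release.
    public static void holdFor(Runnable down, Runnable up, int secs) throws InterruptedException {
        down.run();
        Thread.sleep(secs * 1000L); // e.g. 15 seconds for the gas pedal
        up.run();
    }
}
```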

[Screenshot: game-play view with the gas pedal pressed]

As you may know, when the car runs out of fuel the game-play ends and a social media sharing view with the score is shown. Before that, the test script checks whether the test passed or failed; this, too, is done based on the images shown on screen.

    public Point[] findImageOnScreenNoAssert(String image, int retries) throws Exception {
        Point[] imgRect = null;
        while ((retries > 0) && (imgRect == null)) {
            if (retries < 5) {
                log("Find image failed, retries left: " + retries);
            }
            takeScreenshot(image + "_screenshot");
            imgRect = findImage(image, image + "_screenshot");
            retries = retries - 1;
        }
        return imgRect;
    }
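Because findImageOnScreenNoAssert returns null instead of failing the test, the pass/fail decision can be made by probing for the relevant images. A sketch of that decision logic (class name and image-match inputs are hypothetical illustrations, not part of the Bitbar API):

```java
import java.awt.Point;

public class ResultCheck {
    // passMatch/failMatch are the (possibly null) rectangles returned by
    // findImageOnScreenNoAssert for the "passed" and "failed" images.
    public static String verdict(Point[] passMatch, Point[] failMatch) {
        if (passMatch != null) {
            return "PASS";
        }
        if (failMatch != null) {
            return "FAIL";
        }
        return "UNKNOWN"; // neither image found; retry or fail the run
    }
}
```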

[Screenshot: end-of-game view used for the pass/fail check]

Sudden Notification Messages/Distractions for Test Scripts

A typical "issue" with visual image recognition on Android is notification pop-ups, which may disturb the execution of the script. However, those notifications can also be identified visually as a .png and handled in the script. For example, if the following notification message comes up, your script can simply click OK and execution carries on:
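The same non-asserting lookup makes handling these pop-ups straightforward: probe for the OK button's image, and tap only if a match comes back. A sketch of that guard (class name, inputs, and tap callback are hypothetical):

```java
import java.awt.Point;
import java.util.function.Consumer;

public class PopupGuard {
    // okMatch is the (possibly null) result of probing for the OK button's
    // .png with findImageOnScreenNoAssert; tap performs the actual click.
    public static boolean dismissIfPresent(Point[] okMatch, Consumer<Point> tap) {
        if (okMatch == null) {
            return false;        // no pop-up on screen; keep going
        }
        tap.accept(okMatch[4]);  // tap the center of the OK button
        return true;
    }
}
```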

[Screenshot: notification pop-up with an OK button]

Reviewing the Test Results

As this sort of test can be executed simultaneously against hundreds of devices on Bitbar Testing, you will get screenshots, device logs, Appium logs, test data, etc. as results. Let's start with the screenshots and their comparison. Screenshots are taken along the way as the test script advances; they are used to follow the script's progress, verify actions, and see how quickly the script advances on each device.

[Screenshot: device screenshots compared side by side in Bitbar Testing]

Another very important part of testing is performance. This is especially true for mobile games, so we've integrated Gamebench into test execution at Bitbar Testing to provide comprehensive and accurate performance results.

[Screenshot: Gamebench performance results]

Then, after execution on a variety of devices, the logs can be very instrumental in understanding whether the app/game has any problems or potential issues, and how they can be fixed. We gather all important metrics (from the logs) under the project, and all of them can be inspected after the test run with correct timestamps:

[Screenshot: device logs in the test results view]

Finally, with test automation and the ability to run tests simultaneously on a variety of devices, users get a PASS/FAIL status from each device. This gives a great overview of which devices might require special attention as you fine-tune your app/game for the market and its users:

[Screenshot: PASS/FAIL overview across devices]

Happy testing!

Ville-Veikko Helppi

Mobile Testing Product Expert