Appium Tip #19: Explore How To Interact With Elements

Everything About Appium Test Automation

In the previous blog we looked at how to use different types of XPath locators in Appium and how those function calls work when building your test scripts. Today we’re continuing with the same theme, but this time we’ll focus on Selenium WebElements and Appium MobileElements and how to use them to interact with UI elements. This is the 19th blog in our 37 Things You Should Know About Appium blog series. Also, I’d like to thank many of you for the great feedback on our prior blogs.


There is a distinct difference between how Appium interacts with UI elements in an app and how Selenium interacts with web elements. Identifying UI elements is the key to writing test scripts that efficiently exercise the UI logic. We’ve previously discussed identifying those UI elements with tools like Appium Inspector and uiautomatorviewer, and how to find them in the Appium back-end context.

For testers who have used Selenium to identify UI elements, the main adjustment is learning how to do the same with Appium in a mobile app context.

In short, there are many different ways to interact with the elements of your apps. Let’s take a look at some of the most useful ones.

Java for MobileElements

In Java, for example, it’s easy to check the API documentation for MobileElement. Here’s a shortened list of some of the most useful methods for your tests:

Methods inherited from Selenium’s WebElement are clear, click, getSize, getText, getAttribute, getLocation, getTagName, isDisplayed, isEnabled, isSelected and sendKeys. When you call any of these methods, an auto-refresh check determines whether the UI element is still valid (visible). The typical issue is that if the UI element is no longer visible (for any reason), the call throws an error and all the following calls fail. The same applies to an Appium test script: if any of your calls fails, the following ones fail as well. Expected failures can be handled with try-catch statements.
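As a minimal sketch of the try-catch idea, the helper below wraps a single interaction so that an expected failure does not stop the rest of the script. The class and method names (SafeInteraction, attempt) are hypothetical and not part of Selenium or Appium. The sketch relies on the fact that Selenium's exceptions are unchecked, and the lambdas in main are stand-ins for real calls such as element.getText().

```java
import java.util.Optional;
import java.util.function.Supplier;

public class SafeInteraction {

    /**
     * Runs one element interaction and returns its result.
     * Returns an empty Optional if the call throws or yields null.
     * Selenium's WebDriverException family extends RuntimeException,
     * so catching RuntimeException covers those failures as well.
     */
    public static <T> Optional<T> attempt(Supplier<T> interaction) {
        try {
            return Optional.ofNullable(interaction.get());
        } catch (RuntimeException e) {
            // Expected failure: report it and let the script continue.
            System.out.println("Interaction failed: " + e.getMessage());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a call that succeeds, e.g. () -> element.getText()
        Optional<String> ok = attempt(() -> "Login");

        // Stand-in for a call on an element that is no longer valid
        Optional<String> failed = attempt(() -> {
            throw new IllegalStateException("element is no longer attached");
        });

        System.out.println("first call succeeded: " + ok.isPresent());
        System.out.println("second call succeeded: " + failed.isPresent());
    }
}
```

Catching broadly like this only makes sense for failures you actually expect. An unexpected failure should usually still fail the test.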

For comparison, the Appium MobileElement methods add calls for more advanced gestures and direct UI interaction: getCenter, pinch, swipe, tap and zoom. Most of these are convenience methods for writing cleaner scripts. One of the great things about these gestures is that they don’t have to target a specific element; they can be applied to the whole device screen instead. For example, it usually makes more sense to define a swipe gesture relative to the device screen dimensions. This way the gesture works without relying on any UI element and never ends up outside the screen. If a gesture did reach outside the screen, Appium would throw an outOfBounds error.
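To illustrate a swipe defined relative to the screen dimensions, here is a small sketch that only computes the coordinates. ScreenSwipe, Swipe and vertical are hypothetical names, and the 1080x1920 screen size is an assumed example value. In a real script the size would typically come from driver.manage().window().getSize(), and the points would be passed to whatever swipe or touch call your client version provides.

```java
public class ScreenSwipe {

    /** Start and end points of a swipe, in screen pixels. */
    public static final class Swipe {
        public final int startX, startY, endX, endY;

        Swipe(int startX, int startY, int endX, int endY) {
            this.startX = startX;
            this.startY = startY;
            this.endX = endX;
            this.endY = endY;
        }
    }

    /**
     * Builds a vertical swipe along the horizontal centre of the screen.
     * The fractions are relative to the screen height (0.0 = top, 1.0 = bottom)
     * and are clamped so the points always stay inside the screen.
     */
    public static Swipe vertical(int width, int height,
                                 double startFraction, double endFraction) {
        int x = width / 2;
        int startY = clamp((int) Math.round(height * startFraction), height);
        int endY = clamp((int) Math.round(height * endFraction), height);
        return new Swipe(x, startY, x, endY);
    }

    /** Keeps a coordinate within 0..size-1. */
    private static int clamp(int value, int size) {
        return Math.max(0, Math.min(size - 1, value));
    }

    public static void main(String[] args) {
        // Assumed screen size; swipe up from 80% to 20% of the height.
        Swipe up = vertical(1080, 1920, 0.8, 0.2);
        System.out.println(up.startX + "," + up.startY
                + " -> " + up.endX + "," + up.endY);
    }
}
```

Because the points are derived from the screen size, the same script should work across devices with different resolutions.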

As you can see, you can do the basic click interaction as well as sendKeys via the Selenium WebElement class, and also get useful information from an element with the get and is methods. Appium’s MobileElement class adds a few more options for gestures in the form of convenience methods such as pinch, swipe and tap.

To take this even further, take a look at the TouchAction and MultiTouchAction classes, which offer even more advanced options for gestures. TouchActions are explained in more detail on, for example, Appium’s GitHub docs page.

To make sure your driver always returns MobileElements instead of the default Selenium WebElements, you’ll need to declare MobileElement as the type parameter of AppiumDriver like so:

For Android:

AppiumDriver<MobileElement> driver = new AndroidDriver<MobileElement>(url, capabilities);

For iOS:

AppiumDriver<MobileElement> driver = new IOSDriver<MobileElement>(url, capabilities);

If you don’t declare MobileElement on the driver, it will treat the elements it finds as Selenium WebElements, which lack the methods of MobileElement. Of course, a WebElement can also be cast to a MobileElement when finding it:

MobileElement element = (MobileElement)driver.findElement(By.xpath("//xpath"));
element.swipe(SwipeElementDirection.DOWN, 1000);


Rest of the Bindings

When using Appium as the server, you can interact with the app either via Appium’s own language bindings or directly via Selenium’s libraries. The difference is which API library you import into your test script. In general it’s a better idea to use Appium’s library, since it gives you more possibilities. You could also stay purely with the Selenium API to keep things simpler and less cluttered.

To see what Appium’s Python bindings have to offer on top of the Selenium API, you could take a look at the Python client’s GitHub page or go directly to the class.

For Appium’s Ruby library, the docs are split into generic, Android-specific and iOS-specific docs.

The Node.js client can be used with different frameworks (such as Mocha), so its page starts off with examples of using those, followed by a listing of the API commands.

The page for the .NET client in C# gives a simple nudge towards how to set up your environment. You could take a look at the AppiumDriverCommand class for a listing of all Appium-specific commands, but in the end, this is the same info you’d get directly from your IDE’s code completion once set up.
