We’re firm believers that teams that have incorporated DevOps practices get more done, plain and simple. Whether DevOps is a movement, a practice or a company culture is a debate of its own, but we’ve seen plenty of concrete results from adopting this approach: collaborating cross-functionally between all our teams while maintaining and upgrading the huge infrastructure we host here at Bitbar. Creating, testing and releasing software, and all the communication related to that flow, can happen quickly, frequently and reliably. In this blog, I’d like to shed some light on how we do DevOps, how we maintain and run our mobile device farm, what infrastructure we use, what value this approach provides for our customers and a few other things you’ll probably find interesting.
Yet Another Day In Parad… DevOps!
My name is Radek and I run our operations team. No joke: maintaining thousands of devices, hundreds of server machines and a few dozen different monitoring software components is a really fun job. Seeing this unbelievable infrastructure running 24/7 is extremely rewarding, and seeing people get significant value from it is simply awesome. It does require lots of attention, hard work and sometimes quick workarounds or fire-fighting, and I’d like to share a few thoughts on how we do it. I hope this also gives our users some idea of what it takes to build an internal device lab, and not only to build it but, more importantly, to maintain the infrastructure and keep the show going.
What It Takes to Maintain Our Devices and Infrastructure
Working with real mobile devices can be challenging. Maintaining, upgrading and monitoring that many devices isn’t a small task, even with the internal monitoring software we’ve built for it. Some of the more exotic devices need a lot of attention. In the Android context, this means that we have to find a way to get those devices recognized and seen by ADB (Android Debug Bridge) before they can be connected to our mobile device farm.
Once the basic connection works and the servers communicate with the newly added devices, we have to sort out the software side of things. The first step is to get Google Play (again, on Android) working seamlessly. This lets users pull apps, services and more from app markets onto those devices for their test sessions.
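To give an idea of what "seen by ADB" means in practice, here is a minimal sketch in Python that parses the output of `adb devices` and reports which expected devices are missing or not ready. This is not our production tooling; the function names and the serial numbers in the usage note are made up for illustration, and the sketch only assumes the standard `adb devices` output of one serial and one state per line.

```python
import subprocess


def parse_adb_devices(output):
    """Map each device serial to the state reported by `adb devices`."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the header, and adb daemon status messages.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def unrecognized(expected_serials, devices):
    """Return expected serials that are missing or not in the 'device' state."""
    return sorted(s for s in expected_serials if devices.get(s) != "device")


def list_attached_devices():
    """Run `adb devices` on this host (requires adb on the PATH)."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)
```

Calling `unrecognized(["SERIAL_A", "SERIAL_B"], list_attached_devices())` on a host would list the devices that still need attention, for example ones stuck in the `unauthorized` or `offline` state.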
Running a Mobile Device Farm – Hardware Point of View
From the hardware perspective, the top challenge in running 24/7 and keeping these devices constantly available for test sessions is power. Charging and batteries add another dimension to our daily support and DevOps work. Quite a few devices do not charge properly over a regular USB connection and need to be recharged with their proper chargers. This causes a little downtime, but we’ve found a good solution for that as well: because we host numerous copies of devices, we rotate them through charging so that users don’t see any downtime. This is critical, as we’re determined to provide the most robust user experience when it comes to access to our devices.
Furthermore, it’s a good trick to dim the displays as low as possible. This saves a lot of energy, keeps the battery going much longer and makes recharging less urgent. With some devices it’s actually possible to darken the screen entirely, and those devices seem to run more consistently for longer periods of time.
Power is critical, but so is connectivity. All our devices on Bitbar Testing are WiFi connected, so you can imagine what level of internet connection we need here. We use dozens of WiFi networks, and our monitors constantly check the service level and how much we are getting out of our network provider. Once the basic connection is working properly, we also have to make sure that all devices get connected to Google Play, the App Store and other app markets (if applicable).
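The rotation idea can be sketched roughly like this. It is an illustrative example only, not our actual scheduler: the function name, the 30% threshold and the rule of always keeping at least one copy online are assumptions made for the sketch.

```python
def plan_rotation(battery_levels, low_threshold=30, min_online=1):
    """Split copies of one device model into (online, charging) lists.

    battery_levels maps a device serial to its battery percentage.
    Copies below low_threshold go to charging, as long as at least
    min_online copies stay available for test sessions.
    """
    # Visit copies from the lowest battery level to the highest.
    ordered = sorted(battery_levels.items(), key=lambda item: item[1])
    online, charging = [], []
    for index, (serial, level) in enumerate(ordered):
        # Copies that come after this one could still stay online.
        still_available = len(ordered) - index - 1 + len(online)
        if level < low_threshold and still_available >= min_online:
            charging.append(serial)
        else:
            online.append(serial)
    return online, charging
```

With three copies at 10%, 20% and 90%, the two low ones go to the chargers and the third keeps serving sessions. If every copy is low, the one with the most charge stays online so the model never disappears from the farm.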
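On Android, one common way to dim a screen from the host is through the system settings over ADB. The sketch below only builds the command lines; it assumes the standard `screen_brightness_mode` and `screen_brightness` system settings with a typical 0 to 255 range, and some OEM builds may clamp or ignore very low values. The helper name is made up for this example.

```python
def dim_screen_commands(serial, brightness=0):
    """Build adb commands that turn off auto-brightness and dim the screen.

    Assumes the usual 0-255 brightness scale; the effective minimum
    varies by device, so treat the default as a starting point.
    """
    if not 0 <= brightness <= 255:
        raise ValueError("brightness must be between 0 and 255")
    base = ["adb", "-s", serial, "shell", "settings", "put", "system"]
    return [
        # 0 selects manual brightness so the device does not override us.
        base + ["screen_brightness_mode", "0"],
        base + ["screen_brightness", str(brightness)],
    ]
```

Each returned list can be passed to `subprocess.run` for the device in question.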
Devices Do Need Constant Attention
Our devices, whether they are in use for test runs or just idling, need attention every now and then. Our standard practice is to reboot and clean devices automatically before the next session starts, but sometimes we have to do this manually as well. Things do get stuck, or leftover background resources, threads or apps keep running and may affect the next sessions. Therefore, we take pride in making sure all our devices are clean, rebooted and ready for our users’ test sessions. As you know, after a manual reboot the device has to be powered up again and checked so that all settings are properly configured (e.g. WiFi connected, app market connection working, Bluetooth and other connectivity enabled).
Also, while devices are used for mobile app testing, OS-level notifications about the battery, OS updates, new OEM versions and so on sometimes show up, and we are the ones who take care of them. You, as a user, simply get clean, readily available sessions with these devices, which is a tremendous value from the user’s point of view.
Monitoring of Mobile Device Farm – The Thing That Keeps Us Going
Our mobile device farm has grown so large so quickly (in just two years) that it would not be maintainable without proper monitoring software. Most of the monitoring software we use was built internally. The devices and all the fine-grained pieces of our cloud implementation require 24/7 monitoring and, of course, sometimes manual effort to get things back online. Our cloud implementation sends a lot of data back to the servers (not only the data you see as your test runs, but also our internal verification messaging), and it needs to be carefully monitored and securely handled.
In short, devices need to be online. We also use good old ADB and WiFi tricks to verify that devices are usable: that they are properly connected and logged into app markets, and that batteries and other connections are working.
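As one example of such an ADB trick, battery state can be read with `adb shell dumpsys battery` and checked against a threshold. The sketch below is a simplified illustration, not our internal monitoring software; the function names and the 20% threshold are assumptions, and it relies only on the usual `key: value` lines (such as `level` and `scale`) in the `dumpsys battery` output.

```python
def parse_battery(output):
    """Parse `adb shell dumpsys battery` output into a dict of strings."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Keep only "key: value" lines; the header line has no value.
        if sep and value:
            info[key.strip()] = value
    return info


def battery_ok(info, min_level=20):
    """Return True if the battery percentage is at least min_level."""
    try:
        level = int(info["level"])
        scale = int(info.get("scale", "100"))
    except (KeyError, ValueError):
        # Missing or unreadable data counts as a failed check.
        return False
    return scale > 0 and 100 * level / scale >= min_level
```

A monitor could run this periodically for every device and raise an alert, or trigger a charging rotation, whenever `battery_ok` returns `False`.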
Next time, I’ll focus a bit more on our process and on how we deal with the alerts our monitoring system raises.