
Building the Future of Device Testing: My Journey with DAZN's TestOps Team

When the pandemic hit and our Katowice office became empty, I realized something profound about my role on DAZN's TestOps team: while most of my colleagues could seamlessly transition to remote work, our team couldn't. Why? Because you can't take a room full of smart TVs, PlayStation consoles, and set-top boxes home with you.

[Image: DAZN Living Room Testing Platform]

Working on an Autonomous Team

Being part of DAZN's TestOps team meant being a true full-stack developer in every sense. Our team was completely autonomous - we didn't just write code, we managed infrastructure, worked with physical hardware, designed user interfaces, and built entire systems from the ground up. In a single day, I might find myself debugging WebDriver implementations, configuring HDMI capture cards, writing React components, and optimizing database queries.

This autonomy was both challenging and incredibly rewarding. When we identified a problem with our testing infrastructure, we didn't wait for another team to fix it - we rolled up our sleeves and built the solution ourselves.

The Challenge: Testing Sports on Every Screen

DAZN streams live sports to millions of viewers across an incredibly diverse ecosystem of devices. Data showed that a significant percentage of our viewing hours came from living room devices - smart TVs, game consoles, and set-top boxes. Each platform presented unique challenges:

  • Smart TVs: Samsung Tizen, LG webOS, Android TV with different WebView engines
  • Gaming Consoles: PlayStation 4/5, Xbox Series X/S with proprietary frameworks
  • Set-Top Boxes: Various cable boxes and streaming devices with limited APIs

The traditional Selenium approach to automated testing simply doesn't work when you're dealing with a PlayStation 5 or a Samsung Smart TV. There's no WebDriver for a console, and you certainly can't install browser extensions on a smart TV.

Building the Test Automation Framework (TAF)

Our solution started as a side project called the Test Automation Framework (TAF). What made TAF special was its architecture - instead of trying to force traditional web testing tools onto living room devices, we built a middleware layer that could adapt to any platform.

The TAF-Middleware Architecture

The heart of TAF was our custom middleware - a cloud service that acted as a translator between standard WebDriver commands and device-specific actions. Here's how it worked:

  1. Test Framework sends standard WebDriver commands (click, screenshot, etc.)
  2. TAF-Middleware processes these commands and routes them to the target device
  3. Injected Scripts on the device execute the actual actions and return results
  4. TAF-Middleware formats the response back to the test framework

This architecture meant our test scripts could remain platform-agnostic while TAF handled the complexity of communicating with different devices.
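
To make the flow concrete, here is a minimal sketch of how such a middleware might route a WebDriver-style command to a device-specific adapter. This is not the actual TAF code - the names (TafMiddleware, InjectedScriptAdapter, the /execute endpoint) are all illustrative:

```typescript
// Minimal sketch of a middleware that translates WebDriver-style
// commands into device-specific actions. All names are illustrative.

interface WebDriverCommand {
  sessionId: string;
  name: "click" | "screenshot" | "getText";
  params: Record<string, unknown>;
}

interface DeviceAdapter {
  // Executes one command on the physical device and returns a
  // WebDriver-shaped result the test framework understands.
  execute(cmd: WebDriverCommand): Promise<unknown>;
}

// Example adapter: forwards commands to a script injected into the
// device's WebView, over plain HTTP.
class InjectedScriptAdapter implements DeviceAdapter {
  constructor(private deviceUrl: string) {}

  async execute(cmd: WebDriverCommand): Promise<unknown> {
    const res = await fetch(`${this.deviceUrl}/execute`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(cmd),
    });
    return res.json();
  }
}

// The middleware keeps a session-to-device mapping and routes commands.
class TafMiddleware {
  private sessions = new Map<string, DeviceAdapter>();

  registerSession(sessionId: string, adapter: DeviceAdapter): void {
    this.sessions.set(sessionId, adapter);
  }

  async handle(cmd: WebDriverCommand): Promise<unknown> {
    const adapter = this.sessions.get(cmd.sessionId);
    if (!adapter) throw new Error(`Unknown session ${cmd.sessionId}`);
    return adapter.execute(cmd); // platform-specific translation happens here
  }
}
```

Because test scripts only ever see the WebDriver-shaped interface, adding a new platform means writing one new adapter, not touching any tests.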

The Screenshot Challenge

One of the biggest challenges I faced was capturing screenshots from these devices. Screenshots are crucial for visual regression testing and debugging, but each platform required a different approach:

  • Device APIs: When available (as on Android TV), we could use platform screenshot APIs
  • HDMI Capture: Pixel-perfect, but DRM protection often blocked it for protected content
  • Camera Solutions: Universal but quality-dependent on camera placement and lighting

I spent countless hours optimizing our HDMI capture setup, dealing with DRM restrictions, and fine-tuning camera positioning to get reliable visual feedback from our tests.
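
The selection logic boiled down to a fallback chain: prefer the highest-fidelity source available, and degrade gracefully when DRM gets in the way. A simplified, hypothetical version (the capture functions are placeholders, not real integrations):

```typescript
// Sketch of a screenshot strategy with fallbacks. The capture
// functions are placeholders for the real integrations.

type ScreenshotSource = "device-api" | "hdmi-capture" | "camera";

interface Screenshot {
  source: ScreenshotSource;
  png: Uint8Array;
}

async function captureScreenshot(device: {
  hasPlatformApi: boolean;
  drmActive: boolean;
}): Promise<Screenshot> {
  // 1. Prefer the platform API when the OS exposes one.
  if (device.hasPlatformApi) {
    return { source: "device-api", png: await captureViaPlatformApi() };
  }
  // 2. HDMI capture is pixel-perfect, but DRM-protected content
  //    often arrives as a black frame, so skip it during playback.
  if (!device.drmActive) {
    return { source: "hdmi-capture", png: await captureViaHdmiCard() };
  }
  // 3. A camera pointed at the screen always works, at the cost of
  //    quality that depends on placement and lighting.
  return { source: "camera", png: await captureViaCamera() };
}

// Placeholder implementations for illustration only.
async function captureViaPlatformApi(): Promise<Uint8Array> { return new Uint8Array(); }
async function captureViaHdmiCard(): Promise<Uint8Array> { return new Uint8Array(); }
async function captureViaCamera(): Promise<Uint8Array> { return new Uint8Array(); }
```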

Creating TVLab

While most DAZN employees could work from home during the pandemic, our TestOps team needed to maintain physical access to our devices. This led us to create TVLab - a dedicated room in our Katowice office filled with every device we needed to support.

TVLab wasn't just a storage room for devices; it was a sophisticated testing environment. Every TV, console, and set-top box was connected to our network and equipped with infrared emitters for remote control simulation. We had HDMI capture cards, multiple camera setups, and network monitoring equipment.

[Image: TVLab Hardware Setup]

Managing 20+ different devices in a single room presented unique challenges. Infrared signals could interfere with each other when multiple devices of the same brand were close together. We solved this by creating focused IR emitters and carefully positioning devices to minimize cross-interference.

The Virtual Remote Revolution

One of our most innovative solutions was the Virtual Remote system. This allowed anyone at DAZN to remotely access and control devices in TVLab from their home computer.

How Virtual Remote Worked

The Virtual Remote was essentially a small computer that could:

  • Capture real-time video from connected devices
  • Send infrared, HDMI-CEC, or network commands to control devices
  • Record sessions for later analysis
  • Learn new remote control patterns for unsupported devices

Technical Implementation

I worked extensively on the kernel patches and system hooks that made Virtual Remote possible. The system needed to:

  • Route video streams efficiently without lag
  • Handle multiple simultaneous user sessions
  • Manage device control conflicts when multiple users tried to access the same device (a lease-based approach is sketched after this list)
  • Maintain device state consistency
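
For the conflict problem, a standard pattern is the time-limited lease: one user holds a device at a time, and abandoned sessions expire on their own rather than locking a device forever. A minimal sketch of the idea (not the actual implementation):

```typescript
// Sketch of conflict management via time-limited leases. All names
// here are illustrative.

interface Lease {
  userId: string;
  expiresAt: number; // epoch millis
}

class DeviceLeaseManager {
  private leases = new Map<string, Lease>();

  constructor(private ttlMs: number) {}

  // Returns true if the user now holds the device.
  acquire(deviceId: string, userId: string): boolean {
    const current = this.leases.get(deviceId);
    const now = Date.now();
    // Grant if the device is free, the old lease expired, or the
    // same user is renewing their own lease.
    if (!current || current.expiresAt < now || current.userId === userId) {
      this.leases.set(deviceId, { userId, expiresAt: now + this.ttlMs });
      return true;
    }
    return false; // someone else is using the device
  }

  release(deviceId: string, userId: string): void {
    if (this.leases.get(deviceId)?.userId === userId) {
      this.leases.delete(deviceId);
    }
  }
}

// A 10-minute lease keeps a forgotten session from blocking a device:
const leases = new DeviceLeaseManager(10 * 60 * 1000);
```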

The learning mode was particularly interesting to implement - we could teach the system new device control patterns by recording IR signals and mapping them to standardized commands.
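
As a toy illustration (the real system was considerably more involved), learning a remote amounts to recording the raw IR pulse timings for each button and filing them under a standardized command name:

```typescript
// Toy sketch of IR learning mode: record raw pulse timings from an
// IR receiver and map them to standardized command names.

type StandardCommand = "OK" | "BACK" | "UP" | "DOWN" | "PLAY_PAUSE";

// Raw IR signals are commonly stored as mark/space durations in microseconds.
type IrPattern = number[];

class IrCodeLibrary {
  private codes = new Map<string, IrPattern>();

  // Called while the user presses the button on the physical remote.
  learn(device: string, command: StandardCommand, recorded: IrPattern): void {
    this.codes.set(`${device}:${command}`, recorded);
  }

  // Later, tests send standardized commands and we replay the pattern.
  lookup(device: string, command: StandardCommand): IrPattern {
    const pattern = this.codes.get(`${device}:${command}`);
    if (!pattern) throw new Error(`No learned code for ${device}:${command}`);
    return pattern;
  }
}

// Record once, replay forever (device name and timings are made up):
const library = new IrCodeLibrary();
library.learn("samsung-q80", "OK", [9000, 4500, 560, 560, 560, 1690]);
const pattern = library.lookup("samsung-q80", "OK");
// `pattern` would then be handed to the IR emitter hardware to transmit.
```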

Building the Web Dashboard

As a full-stack developer, I was responsible for building the web interface that internal teams used to interact with our testing infrastructure. The React-based dashboard needed to handle real-time data from dozens of devices while providing an intuitive experience for both technical and non-technical users.

Real-Time Device Monitoring

The dashboard displayed live feeds from all devices in TVLab, their current status, and any active test sessions. Using WebSocket connections, we could stream video and device status updates in real-time. I implemented a queue system to handle multiple simultaneous video streams without overwhelming the browser.
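
The core of that queue is simple admission control: cap the number of simultaneously open streams and park the rest until a slot frees up. A stripped-down sketch (the endpoint URL and message format are invented):

```typescript
// Sketch of a client-side queue that caps concurrent video streams
// so dozens of device feeds don't overwhelm the browser.

class StreamQueue {
  private active = 0;
  private waiting: Array<() => void> = [];

  constructor(private maxConcurrent: number) {}

  // Resolves when a slot is available.
  private acquire(): Promise<void> {
    if (this.active < this.maxConcurrent) {
      this.active++;
      return Promise.resolve();
    }
    return new Promise((resolve) => this.waiting.push(resolve));
  }

  private release(): void {
    const next = this.waiting.shift();
    if (next) next(); // hand the freed slot straight to the next waiter
    else this.active--;
  }

  // Opens a WebSocket feed for a device, respecting the limit.
  async openFeed(
    deviceId: string,
    onFrame: (data: Blob) => void,
  ): Promise<() => void> {
    await this.acquire();
    const ws = new WebSocket(`wss://example.invalid/devices/${deviceId}/stream`);
    ws.onmessage = (event) => onFrame(event.data as Blob);
    // The returned close function also frees the slot.
    return () => {
      ws.close();
      this.release();
    };
  }
}

// e.g. show at most six live feeds at once:
const queue = new StreamQueue(6);
```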

Test Tagging and Categorization

One feature I'm particularly proud of is the tagging system we built. Teams could tag their tests by:

  • Region: EU, US, APAC for geolocation-specific testing
  • Device Type: Smart TV, Console, STB for hardware-specific tests
  • Feature: Live Sports, VOD, Login for functionality-specific tests
  • Priority: P0, P1, P2 for test importance and scheduling

This made it incredibly easy for teams to run exactly the tests they needed without having to manually configure device lists.
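
Conceptually, each test carried a small tag record, and "run exactly what you need" became a filter over the test catalogue. A sketch with invented names and values:

```typescript
// Sketch of the tagging model: selecting tests becomes a filter
// over the catalogue instead of a hand-maintained device list.

interface TestTags {
  region: "EU" | "US" | "APAC";
  deviceType: "SmartTV" | "Console" | "STB";
  feature: "LiveSports" | "VOD" | "Login";
  priority: "P0" | "P1" | "P2";
}

interface TestCase {
  name: string;
  tags: TestTags;
}

const catalogue: TestCase[] = [
  { name: "live-playback-start", tags: { region: "EU", deviceType: "Console", feature: "LiveSports", priority: "P0" } },
  { name: "vod-resume", tags: { region: "US", deviceType: "SmartTV", feature: "VOD", priority: "P1" } },
];

function selectTests(tests: TestCase[], filter: Partial<TestTags>): TestCase[] {
  return tests.filter((test) =>
    (Object.keys(filter) as (keyof TestTags)[]).every(
      (key) => test.tags[key] === filter[key],
    ),
  );
}

// "Run all P0 EU live-sports tests on consoles":
const toRun = selectTests(catalogue, {
  region: "EU",
  deviceType: "Console",
  feature: "LiveSports",
  priority: "P0",
});
```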

Reporting and Analytics

Moving Beyond TestRail

Initially, we used TestRail for test reporting, but as an autonomous team we wanted something more tailored to our needs. I built a custom reporting system using Allure as the base and created TeReBu - our internal tool for historical test analysis.

TeReBu provided two key functions, the second of which is sketched below:

  1. Test Report Uploads: Automated collection of test results from all TAF runs
  2. Historical Trend Analysis: Visualization of test performance over time, regression detection, and device reliability metrics
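
As a flavor of the trend analysis (not TeReBu's actual code), here is one useful metric: the daily pass rate of a test, with a naive flag for days that drop well below the running average:

```typescript
// Sketch of a TeReBu-style trend metric. Types are illustrative.

interface ResultRow {
  testName: string;
  status: "passed" | "failed";
  startedAt: Date;
}

function dailyPassRates(rows: ResultRow[], testName: string): Map<string, number> {
  const byDay = new Map<string, { passed: number; total: number }>();
  for (const row of rows) {
    if (row.testName !== testName) continue;
    const day = row.startedAt.toISOString().slice(0, 10); // YYYY-MM-DD
    const bucket = byDay.get(day) ?? { passed: 0, total: 0 };
    bucket.total++;
    if (row.status === "passed") bucket.passed++;
    byDay.set(day, bucket);
  }
  const rates = new Map<string, number>();
  for (const [day, { passed, total }] of byDay) {
    rates.set(day, passed / total);
  }
  return rates;
}

// Naive regression detection: flag any day whose pass rate falls
// more than `drop` below the average of all preceding days.
function regressedDays(rates: Map<string, number>, drop = 0.2): string[] {
  const days = [...rates.keys()].sort();
  const flagged: string[] = [];
  let sum = 0;
  days.forEach((day, i) => {
    const rate = rates.get(day)!;
    if (i > 0 && rate < sum / i - drop) flagged.push(day);
    sum += rate;
  });
  return flagged;
}
```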

Data Storage and Retrieval

I designed the database schema to efficiently handle our massive volume of test data. We needed to store not just pass/fail results, but also performance metrics, screenshots, video recordings, and device telemetry data. The challenge was making this searchable and fast - engineers needed to quickly find why a specific test failed on a specific device three weeks ago.
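
The real schema is internal, but the shape of the problem looks roughly like this: a lean result row holding status plus the fields engineers actually filter on, with heavy artifacts (screenshots, video, telemetry) referenced in object storage rather than stored inline. A hypothetical sketch:

```typescript
// Rough shape of a test-result record (not the real schema).
// Heavy artifacts live in object storage; the database keeps keys
// plus the fields engineers filter on.

interface TestResult {
  id: string;
  testName: string;         // e.g. "playback/live-resume"
  deviceId: string;         // e.g. "tvlab-samsung-07"
  runId: string;            // groups results from a single TAF run
  status: "passed" | "failed" | "broken" | "skipped";
  startedAt: Date;
  durationMs: number;
  failureMessage?: string;
  screenshotKeys: string[]; // object-storage references, not blobs
  videoKey?: string;
  telemetryKey?: string;    // device CPU/memory/network samples
}

// "Why did this test fail on that device three weeks ago?" is a
// query over (testName, deviceId, startedAt) - exactly the fields
// a compound index should cover.
function findFailures(
  results: TestResult[],
  testName: string,
  deviceId: string,
  since: Date,
): TestResult[] {
  return results.filter(
    (r) =>
      r.testName === testName &&
      r.deviceId === deviceId &&
      r.status === "failed" &&
      r.startedAt >= since,
  );
}
```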

The Day-to-Day Reality

Hardware Challenges

Working with physical devices meant dealing with hardware failures regularly. TVs would randomly reboot, consoles would overheat, and network connections would drop. I became adept at remote device management - writing scripts to automatically power cycle devices, monitoring temperature sensors, and implementing health checks that could detect when a device needed manual intervention.
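
The pattern behind those scripts is a simple probe-and-recover loop: check each device, power cycle it via a networked PDU when it stops responding, and only page a human after automation gives up. A hypothetical sketch (the URLs and thresholds are invented):

```typescript
// Sketch of an automated device health check with power-cycle
// recovery. The probe and PDU endpoints are hypothetical.

interface Device {
  id: string;
  failedProbes: number;
}

const MAX_AUTO_RECOVERIES = 3;

async function probe(device: Device): Promise<boolean> {
  try {
    // e.g. ping a health endpoint exposed by the on-device script.
    const res = await fetch(`http://tvlab.local/${device.id}/health`, {
      signal: AbortSignal.timeout(5_000),
    });
    return res.ok;
  } catch {
    return false;
  }
}

async function powerCycle(device: Device): Promise<void> {
  // In TVLab terms: toggle the device's outlet on a networked PDU.
  await fetch(`http://pdu.local/outlets/${device.id}/cycle`, { method: "POST" });
}

async function healthCheck(device: Device): Promise<void> {
  if (await probe(device)) {
    device.failedProbes = 0;
    return;
  }
  device.failedProbes++;
  if (device.failedProbes <= MAX_AUTO_RECOVERIES) {
    await powerCycle(device);
  } else {
    // Automation has given up - someone has to walk to the rack.
    console.warn(`${device.id} needs manual intervention`);
  }
}
```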

One memorable incident involved a Samsung TV that would only fail tests on Fridays. After weeks of investigation, we discovered it was a network congestion issue - our office cleaning schedule interfered with the WiFi connectivity just enough to cause intermittent failures.

The Multi-Device Control Problem

Controlling multiple devices simultaneously presented fascinating technical challenges. When testing live sports, we needed to verify that the same stream worked correctly across dozens of devices at once, each with different capabilities and network conditions.

I developed a device orchestration system (sketched after this list) that could:

  • Queue commands across multiple devices to avoid IR interference
  • Synchronize actions (like starting playback at the same time)
  • Handle device-specific timing requirements (some TVs need longer to process commands)
  • Manage resource conflicts when multiple tests needed the same device
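
A minimal sketch of two of those pieces - per-device command queues with device-specific settle times, and parallel dispatch for synchronized actions (all names and timings invented):

```typescript
// Sketch of per-device command queues (so commands to one device
// never overlap) plus parallel dispatch across devices.

class DeviceQueue {
  private tail: Promise<void> = Promise.resolve();

  // Device-specific settle time: some TVs need longer between commands.
  constructor(private settleMs: number) {}

  // Commands for the same device run strictly one after another,
  // with a settle delay after each.
  enqueue(action: () => Promise<void>): Promise<void> {
    this.tail = this.tail
      .then(action)
      .then(() => new Promise<void>((r) => setTimeout(r, this.settleMs)));
    return this.tail;
  }
}

interface OrchestratedDevice {
  queue: DeviceQueue;
  pressPlay: () => Promise<void>;
}

// Fire "play" across many devices in parallel; each device's own
// queue still serializes it against other commands to that device.
async function synchronizedPlay(devices: OrchestratedDevice[]): Promise<void> {
  await Promise.all(devices.map((d) => d.queue.enqueue(() => d.pressPlay())));
}

// A slow TV gets a longer settle time than a console:
const samsungTv = new DeviceQueue(1500);
const ps5 = new DeviceQueue(300);
```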

Performance Insights from Real Hardware

One of the most valuable aspects of our approach was getting performance data from actual consumer devices. Unlike synthetic tests, we could see exactly how the DAZN app performed on a three-year-old Smart TV with limited memory, or how input lag affected the user experience on gaming consoles.

This real-world data led to several critical optimizations:

  • Memory usage improvements for older Smart TV models
  • Buffering strategies optimized for different device capabilities
  • UI responsiveness tuning based on actual device performance profiles

Enabling Remote Work

While DAZN embraced flexible working during the pandemic, our TestOps team had a unique responsibility: ensuring that everyone else could work effectively from home. We became the bridge between remote developers and the physical devices they needed to test their code.

The Virtual Remote system was crucial here. Product managers could demo new features to stakeholders using real devices from their home offices. Developers could debug device-specific issues without traveling to the office. QA engineers could run exploratory tests on actual hardware during their home working hours.

But someone still needed to maintain TVLab, manage the hardware, and ensure everything was working. That someone was us - the TestOps team.

Lessons from Building Something Unique

The Full-Stack Reality

Working on an autonomous team meant I gained expertise across the entire technology stack in ways that wouldn't have been possible in a more specialized role. In a typical week, I might:

  • Debug kernel-level issues with video capture drivers
  • Optimize React components for real-time video streaming
  • Design database schemas for test result storage
  • Configure network infrastructure for device connectivity
  • Write automation scripts for device management
  • Implement WebDriver protocols for custom platforms

This breadth of experience was invaluable, but it also meant constantly context-switching between very different technical domains.

Infrastructure Complexity

Managing physical devices at scale taught me lessons that no amount of cloud computing experience could provide. When a server fails in the cloud, you spin up a new one. When a Smart TV fails in TVLab, someone has to physically walk to the device, diagnose the issue, and potentially replace hardware.

I learned to build systems that could gracefully handle hardware failures, automatically detect when devices needed maintenance, and provide enough redundancy to keep tests running even when multiple devices were offline.

The Importance of Real-World Testing

Our approach validated something important: there's no substitute for testing on real hardware with real network conditions. The performance characteristics, edge cases, and user experience nuances we discovered through actual device testing simply couldn't be replicated in emulated environments.

Looking back, I'm incredibly proud of what we built with TAF, Virtual Remote, and TVLab. The system evolved from a side project into a critical piece of DAZN's infrastructure that enabled the company to scale its living room presence while maintaining quality.

But perhaps more importantly, the experience taught me the value of autonomous teams and full-stack thinking. When you're responsible for everything from the hardware to the user interface, you develop a deeper understanding of how systems work together and where the real bottlenecks lie.

Reflections on Autonomous Development

The Power of Ownership

Being part of an autonomous team meant we had complete ownership of our problems and solutions. When we identified an issue, we didn't create a ticket for another team - we fixed it ourselves. This led to faster iteration cycles and solutions that were perfectly tailored to our specific needs.

The downside was the constant context switching and the breadth of knowledge required. Some days I felt like I was spreading myself too thin across too many different technical domains.

Building for the Future

The systems we built were designed to last and scale. TAF's architecture was flexible enough to support new device types as they emerged. Virtual Remote's learning mode meant we could quickly add support for new hardware without rewriting core functionality. The reporting infrastructure could handle growth in both test volume and result complexity.

The Ultimate Test

You know the system works when you can't see it working. The ultimate validation came during major live sports events - Champions League finals, Premier League matches, NFL games - when millions of viewers tuned in simultaneously across hundreds of different devices.

When everything worked seamlessly, when fans could focus on the game instead of fighting with their TV remote or dealing with buffering issues, that's when I knew our work had made a difference.

After a long day of debugging infrared signals and optimizing video capture drivers, there's something deeply satisfying about settling in to watch a match on DAZN, knowing that the technology infrastructure you helped build is making that experience possible for millions of fans around the world.