How we built mobile replay

Sep 18, 2024

Posted by
  • Ian Vanagas
  • Manoel Aranda Neto

Session replay is one of the most powerful tools for understanding user behavior. Web session replay has been a core part of PostHog for a long time now (it was built in our first hackathon), but mobile teams have had to wait longer.

Fortunately, it's finally here on iOS, Android, and React Native (with Flutter coming soon).

What took so long? Although we had the structure to ingest and play back replays, recording them in mobile apps is much trickier than in web apps. This post goes over why, and how we finally managed to overcome those challenges.

What's so difficult about mobile session replay

Others have complained about the lack of good mobile replay options. Why is that the case?

1. Multiple platforms

The secret about web session replay industry-wide is that it largely relies on a single open source library to work: rrweb. It includes tools for recording web interactions and state changes, structuring session data, and playback.

Unfortunately, rrweb for mobile doesn't exist. To build mobile session replay, we needed to do all the work ourselves, and compared to the web, this is a lot of work. Instead of a single JavaScript library, language, and SDK, mobile requires one for each platform (like iOS, Android, React Native, and Flutter).

There are even breaking differences within platforms. For example, Jetpack Compose uses a compositional model for UI, different from Android's traditional view-based model. This means you need to develop separate ways of doing replays when using it. iOS has a similar problem with SwiftUI.

2. Performance

Desktops are a lot more powerful than phones. Because of this, we need to be much more sensitive about performance.

If you've ever tried to record your phone screen, you know its impact on performance. Apps take longer to load, animations become choppy, responsiveness degrades, and your phone heats up.

Anything that degrades the experience this much is not an option for many developers.

3. Privacy

A big difference between web and mobile apps is the DOM.

In web apps, this provides a hierarchical tree structure that represents and uniquely identifies elements. The web also has standardized elements, like <input type="password">. Combining these makes it easier to identify, mask, and exclude sensitive elements on the web.

Mobile doesn't have standardized structures or elements. Accessibility identifiers are also inconsistently implemented. This means identifying, masking, and excluding content is a lot trickier.

4. Testing

We're big fans of dogfooding at PostHog. Often, we are our own best users. As you might know, though, we don't have a mobile app.

This meant that during development, we needed to rely on demo and open source apps, and we risked creating something that doesn't work well for larger, production-quality apps.

Developing a high-quality mobile replay product means relying more on our users and their feedback.

The prerequisites for mobile session replay

First, none of this would be possible without Manoel. We had mobile experience, but not the dedicated mobile SDK experience needed for this big of a feature. Manoel had that experience and this was only possible thanks to him.

When Manoel joined, the first thing he worked on was rewriting the SDKs in Kotlin and Swift, removing code we didn't use, improving tests, automating deployments, and making sure they worked with the latest platforms.

We already had the other prerequisite, our existing session replay infrastructure. We had the data structure, the way to store replays, as well as a complete product for playback and analysis. All this could be reused for mobile replay.

Replay

Importantly, Paul and Manoel realized the mobile data needed to be transformed into a format the rrweb player could use. They wanted to keep the rrweb schema to ensure the fewest changes possible to our API and player. To do this, they wrote a validation and testing tool to rapidly test the transformations before deploying them in our main app.

Once this was done, everything was ready for the mobile replay capture to be worked on.

How we built mobile session replay

Work started by developing a proof of concept for Android session replay. This progressed from sending any data at all from an Android app and seeing it replayed, to capturing basic components like text and images.

PoC

From here, we added wireframe capture along with logs and network requests. Afterwards came standard Android components like radios, checkboxes, Calendar, Toggle, RatingBar, and more.

These need to be transformed to render as an HTML wireframe (as rrweb expects). Many of them required custom transformation and components to render properly. For example, radio buttons didn't group, padding wasn't applied, and positioning was wrong.
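
To make the idea concrete, here is a minimal sketch of turning a captured wireframe into an HTML-like node tree. The `Wireframe` and `HtmlNode` shapes and the `toHtmlNode` function are hypothetical and simplified for illustration. They are not PostHog's actual schema or rrweb's full serialized node format.

```typescript
// Hypothetical wireframe shape sent by a mobile SDK (an assumption, not the real schema).
interface Wireframe {
  id: number;
  type: "text" | "image" | "div";
  x: number;
  y: number;
  width: number;
  height: number;
  text?: string;
  children?: Wireframe[];
}

// Simplified stand-in for a serialized HTML element node.
interface HtmlNode {
  id: number;
  tagName: string;
  attributes: Record<string, string>;
  textContent?: string;
  childNodes: HtmlNode[];
}

// Mobile views report absolute coordinates, so each node is pinned in place
// with inline CSS instead of relying on normal document flow.
function positionStyle(w: Wireframe): string {
  return `position:fixed;left:${w.x}px;top:${w.y}px;width:${w.width}px;height:${w.height}px;`;
}

// Recursively map each wireframe to an HTML-like node.
function toHtmlNode(w: Wireframe): HtmlNode {
  const tagName = w.type === "image" ? "img" : w.type === "text" ? "span" : "div";
  const node: HtmlNode = {
    id: w.id,
    tagName,
    attributes: { style: positionStyle(w) },
    childNodes: (w.children ?? []).map(toHtmlNode),
  };
  if (w.text !== undefined) {
    node.textContent = w.text;
  }
  return node;
}

// Usage: a screen containing a single text label.
const screen: Wireframe = {
  id: 1,
  type: "div",
  x: 0,
  y: 0,
  width: 360,
  height: 640,
  children: [{ id: 2, type: "text", x: 16, y: 24, width: 200, height: 20, text: "Hello" }],
};
const html = toHtmlNode(screen);
console.log(JSON.stringify(html, null, 2));
```

Explicit positioning like this is one way to avoid the kind of layout problems mentioned above, since the browser's default flow knows nothing about where a native view sat on screen.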

Beyond this, fitting the data into a service meant for web caused many challenges. For example, click events weren't showing even though the types and data were the same as data that worked. After some investigation, we found that rrweb expects touch events to be associated with specific elements. Setting the ID to the body element was enough to fix it.
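
As a rough sketch of that fix, the transformation below attaches every touch to a fallback element ID when no specific target is known. The numeric constants are intended to mirror rrweb's event enums and should be checked against the rrweb version in use. `BODY_NODE_ID`, `MobileTouch`, and `toTouchEvent` are hypothetical names for illustration.

```typescript
// Constants intended to mirror rrweb's enums (assumptions; verify against your rrweb version).
const EVENT_TYPE_INCREMENTAL_SNAPSHOT = 3;
const INCREMENTAL_SOURCE_MOUSE_INTERACTION = 2;
const MOUSE_INTERACTION_TOUCH_START = 7;

// Hypothetical: the node ID assigned to <body> when the full snapshot was built.
const BODY_NODE_ID = 5;

// Hypothetical shape of a touch reported by a mobile SDK.
interface MobileTouch {
  x: number;
  y: number;
  timestamp: number;
}

interface TouchInteractionEvent {
  type: number;
  timestamp: number;
  data: { source: number; type: number; id: number; x: number; y: number };
}

// The player looks up the target element by ID, so every touch needs one.
// Falling back to the body element keeps the event renderable.
function toTouchEvent(touch: MobileTouch, targetId: number = BODY_NODE_ID): TouchInteractionEvent {
  return {
    type: EVENT_TYPE_INCREMENTAL_SNAPSHOT,
    timestamp: touch.timestamp,
    data: {
      source: INCREMENTAL_SOURCE_MOUSE_INTERACTION,
      type: MOUSE_INTERACTION_TOUCH_START,
      id: targetId,
      x: touch.x,
      y: touch.y,
    },
  };
}

// Usage: a tap with no known target element.
const tapEvent = toTouchEvent({ x: 120, y: 300, timestamp: 1726646400000 });
console.log(tapEvent.data.id);
```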

Once we got consistent and useful results from what we built, we recruited our first test users and iterated from there. Later, we followed a similar process for iOS.

Solving mobile replay's big problems

So how did we solve some of the big problems we identified earlier? Two were relatively simple:

  1. Multiple platforms: Do the work to develop mobile replay for all the platforms (which is still ongoing).
  2. Testing: Use open source and test apps to develop a proof of concept. Once complete, use our large user base to find users willing to test our prototype. Luckily, this feature had massive demand and there were users who were willing to try the earliest versions of it.

The other two have more clever solutions.

1. Performance

Not slowing down people's apps was core to our mission with session replay.

Our strategy of capturing wireframes was much less performance-intensive than other tools' reliance on screenshots. We still built screenshot mode, which provides a more accurate representation, but mostly focused on making our wireframe mode as good as possible.

Screenshot vs Wireframe

Many performance issues users faced were either caused by screenshot mode or unsupported data being captured. Both of these were solved by making wireframe mode better.

On top of this, we try to offload as much work to our servers as possible. For example, the transformation to the rrweb schema happens on the server side.

2. Privacy

As for privacy, we built the ability to mask all text inputs and images as well as redact certain views with ph-no-capture like this:

<ImageView
    android:id="@+id/imvProfilePhoto"
    android:layout_width="200dp"
    android:layout_height="200dp"
    android:tag="ph-no-capture"
/>

We added this functionality to both wireframe and screenshot modes. As nice as it would have been to have automatic masking like we do on the web, the ability to have masking at all enables privacy-sensitive teams to actually use replay.
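
The masking rules can be sketched as a walk over the captured view tree. This illustrates the idea only and is not the SDKs' actual code. The `CapturedView` shape, the `MaskOptions` names, and the `maskView` function are all assumptions. Only the `ph-no-capture` tag comes from the example above.

```typescript
// Hypothetical captured view shape (an assumption for illustration).
interface CapturedView {
  type: "text" | "input" | "image" | "view" | "redacted";
  tag?: string;
  text?: string;
  children?: CapturedView[];
}

// Hypothetical option names for the two masking switches.
interface MaskOptions {
  maskAllTextInputs: boolean;
  maskAllImages: boolean;
}

const NO_CAPTURE_TAG = "ph-no-capture";

function maskView(view: CapturedView, options: MaskOptions): CapturedView {
  // Views tagged ph-no-capture are dropped entirely, including their children.
  if (view.tag !== undefined && view.tag.includes(NO_CAPTURE_TAG)) {
    return { type: "redacted" };
  }
  // Images are replaced with a placeholder when image masking is on.
  if (view.type === "image" && options.maskAllImages) {
    return { type: "redacted" };
  }
  const masked: CapturedView = { type: view.type };
  if (view.tag !== undefined) {
    masked.tag = view.tag;
  }
  if (view.text !== undefined) {
    // Input values keep their length but lose their content.
    masked.text =
      view.type === "input" && options.maskAllTextInputs
        ? "*".repeat(view.text.length)
        : view.text;
  }
  if (view.children !== undefined) {
    masked.children = view.children.map((child) => maskView(child, options));
  }
  return masked;
}

// Usage: a form with a password field, a tagged profile photo, and a label.
const form: CapturedView = {
  type: "view",
  children: [
    { type: "input", text: "hunter2" },
    { type: "image", tag: "ph-no-capture" },
    { type: "text", text: "Hello" },
  ],
};
const maskedForm = maskView(form, { maskAllTextInputs: true, maskAllImages: false });
console.log(JSON.stringify(maskedForm, null, 2));
```

Masking before anything leaves the device means the sensitive values never reach the server, which matters more than how the placeholder looks in playback.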

There are more challenges to solve, specifically around SwiftUI and Jetpack Compose, since the way they transpile code causes the representation to not be a 1:1 match and properties are lost.

Making mobile session replay available for everyone

In many ways, mobile has been neglected by the analytics industry. Tools like session replay have either not existed, been locked behind enterprise plans, or been too expensive for most developers.

We want to change this. Mobile replay is free while in beta, and once it's out of beta, we'll follow our pricing principles, making it as affordable as possible.

This enables us to help more developers have the tools they need to build successful products.
