We’re solving a personal pain point: broken analytics, and how much effort it took to keep an overview of what was being tracked across product teams and platforms. We all worked together on a game called QuizUp (100M+ users), where we used metrics to make decisions. The problem was that we repeatedly “broke” the conversion funnels and retention charts we relied on whenever we shipped product updates, by mistakenly removing or changing the analytics implementation.
It was driving everyone involved mad – so we built internal dev tools and processes that made implementation easier and our data more reliable. What we built was never perfect, and it was clunky in many ways:
1) No one really wanted to maintain this – but developers ended up agreeing to, because it was better than the alternative: frustrated data scientists requesting fixes to analytics implementation that the developers had worked on weeks or months earlier.
2) JSON files (or the crappy web apps we invested time in building on top of the JSON schemas) didn’t give us a “human-accessible” overview of what was being tracked and when. So people who weren’t working on analytics every day had no idea what data they should look at to dig into user behavior.
We also discovered that a lot of companies build similar stuff – i.e. some version of internal tools for data validation, either through code gen or through server-side validation, often based on JSON schemas. The same problems seem to apply to those companies: the tooling is clunky to update, doesn’t give a proper overview, and no one wants to maintain it – yet it beats the alternative of not having it.
So now, six years after we started maintaining tools like these internally, we’ve built Avo to solve these issues for more people.
Here’s how it works:
1) The web app is built to optimize the experience of maintaining and version controlling complicated event schemas. That means a few things, for example:
- We built a “differ” that feels similar to git, but instead of a line-based diff, it’s object-based.
- When you make updates, Avo gives you suggestions to maintain casing and reuse properties across events.
- You can view the change history of each object, similar to Asana tasks.
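To make "object-based" concrete, here is a rough sketch of the idea in TypeScript. It is not Avo's actual differ: the schema shape, the `Change` type and `diffSchemas` are all hypothetical names made up for illustration. The point is only that changes are reported per event and per property rather than per line of a JSON file.

```typescript
// Hypothetical schema shape, for illustration only (not Avo's real format):
// event name -> property name -> property type.
type PropertyType = "string" | "int" | "float" | "bool";
type EventSchema = Record<string, Record<string, PropertyType>>;

// One entry per changed object, instead of one entry per changed line.
type Change =
  | { kind: "event_added"; event: string }
  | { kind: "event_removed"; event: string }
  | { kind: "property_added"; event: string; property: string }
  | { kind: "property_removed"; event: string; property: string }
  | { kind: "type_changed"; event: string; property: string; from: PropertyType; to: PropertyType };

// Own-property check, so names like "toString" are not matched by accident.
function has(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function diffSchemas(before: EventSchema, after: EventSchema): Change[] {
  const changes: Change[] = [];
  for (const event of Object.keys(before)) {
    if (!has(after, event)) {
      changes.push({ kind: "event_removed", event });
      continue;
    }
    const oldProps = before[event];
    const newProps = after[event];
    // Properties that disappeared or changed type.
    for (const property of Object.keys(oldProps)) {
      if (!has(newProps, property)) {
        changes.push({ kind: "property_removed", event, property });
      } else if (oldProps[property] !== newProps[property]) {
        changes.push({
          kind: "type_changed", event, property,
          from: oldProps[property], to: newProps[property],
        });
      }
    }
    // Properties that are new in this event.
    for (const property of Object.keys(newProps)) {
      if (!has(oldProps, property)) {
        changes.push({ kind: "property_added", event, property });
      }
    }
  }
  // Events that only exist in the new schema.
  for (const event of Object.keys(after)) {
    if (!has(before, event)) changes.push({ kind: "event_added", event });
  }
  return changes;
}

// Example: one property changes type, one is added, one event is removed.
const oldSchema: EventSchema = {
  "Cart Updated": { "Item Count": "string" },
  "Signup Complete": { "Method": "string" },
};
const newSchema: EventSchema = {
  "Cart Updated": { "Item Count": "int", "Currency": "string" },
};
const schemaChanges = diffSchemas(oldSchema, newSchema);
```

A reviewer then sees "Signup Complete was removed" as a single change, which is much harder to miss than a few deleted lines in a large JSON diff.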
2) The code gen is optimized for bringing type safety and rigour to analytics implementation:
- You install a CLI to easily update (`avo pull`) your tracking library to the latest version of the event schema. The generated code contains one type-safe function per event.
- For example: a "Cart Updated" event with an "Item Count" property would generate `cartUpdated(itemCount: Int)` for Swift. For dynamic languages, and for constraints that type systems cannot express (such as min / max), runtime validation logs warnings or errors when the data doesn't match the schema.
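As a rough illustration of the shape this takes, here is a hand-written TypeScript sketch of such a wrapper. It is not Avo's actual generated code: the `AnalyticsSdk` interface, `makeTracker`, `checkIntInRange` and the 0 to 1000 range for "Item Count" are all assumptions made up for this example.

```typescript
// Hypothetical sketch of a generated tracking wrapper (not Avo's real output).

// Stand-in for whatever analytics SDK you already use.
interface AnalyticsSdk {
  track(eventName: string, properties: Record<string, unknown>): void;
}

// Validation messages are collected here so the example is easy to inspect;
// a real wrapper would more likely log them.
const warnings: string[] = [];

// Runtime check for constraints the type system cannot express.
// Returns true when the value is an integer within [min, max].
function checkIntInRange(name: string, value: number, min: number, max: number): boolean {
  if (!Number.isInteger(value)) {
    warnings.push(`${name} should be an integer, got ${value}`);
    return false;
  }
  if (value < min || value > max) {
    warnings.push(`${name} should be between ${min} and ${max}, got ${value}`);
    return false;
  }
  return true;
}

function makeTracker(sdk: AnalyticsSdk) {
  return {
    // One function per event: the event name and property keys live in
    // one place, so call sites cannot misspell them.
    cartUpdated(props: { itemCount: number }): void {
      // Assumed constraint for illustration: 0 <= Item Count <= 1000.
      checkIntInRange("Item Count", props.itemCount, 0, 1000);
      // This sketch warns but still sends the event.
      sdk.track("Cart Updated", { "Item Count": props.itemCount });
    },
  };
}

// Usage with a fake SDK that just records calls.
const sent: Array<{ name: string; properties: Record<string, unknown> }> = [];
const fakeSdk: AnalyticsSdk = {
  track: (name, properties) => {
    sent.push({ name, properties });
  },
};
const tracker = makeTracker(fakeSdk);
tracker.cartUpdated({ itemCount: 3 });  // valid, no warning
tracker.cartUpdated({ itemCount: -1 }); // sent, but records one warning
```

Passing a string or omitting `itemCount` fails at compile time, while the out-of-range value is only caught by the runtime check.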
Things to note:
- Avo does not store, process or access your data – so no GDPR approval required.
- The libraries Avo generates wrap whatever analytics SDK you already use. You can use the Avo library alongside the tracking you already have, or do a full migration to make sure all your events conform to the specs in Avo.
- Avo is not another analytics or data pipeline vendor. We love the ones that exist already. We’ve just built Avo to make sure we can use the data we send into them.
Thanks for reading, HN. We would love to hear your feedback, as well as stories of when you built this internally or when you wished you had it.
- Did you use JSON, Thrift, Protobuf, Avro or something else to define the schema for the events in the stream?
- Did you version the schema?
- Did you version each event or object in the schema?
- What type of versioning did you use? (E.g. semantic, incremental counters, git hashes, another type of hash, etc.)
- Any other cool things you'd like to share?
We're looking to understand people's experience with broken analytics, such as inconsistent event names for the same user actions, multiple versions of the same property, inconsistent types, forgetting to send events, etc. For example, iOS sends "Order Completed" and Android sends "Buy Item" for the same action. Or web refactors the signup flow and accidentally stops sending the "Signup Complete" event.
Can you please help us learn from your experience with product and user behavior analytics?
1) When did you last experience "broken analytics"?
2) How did it break?
3) How did you realize that something was broken?
4) How long until you realized it?
5) What were the consequences?
6) How did you deal with it?
7) How could you have prevented your analytics from breaking?
8) How many people work at your company?
9) What are you building?
Link to survey, if you prefer that: https://goo.gl/forms/bptiCWfF5h86yrAs1
All feedback, insights, and discussions appreciated!