(The user event stream for the individual user shows the correct session length though.)
This would make me wary of the rest of the data. Not to mention that all of the more interesting reports are in the paid plans, which makes 10M free events that you can't report on of limited value.
Edit: The first event for the first user/session in the "real-time activity details" has a usage time of "20 min", whereas the first event for the second user/session starts at 0 seconds, so the bug lies somewhere in there. It seems like it shouldn't have a meaningful impact on statistics for a production app with lots of users, but it's disconcerting for a first-time dev user of Amplitude to see bogus values.
While I think reports like individual user streams are neat, I find they're not very good at diagnosing a product and driving growth.
One of the best charts for doing that is a simple cohort analysis / retention chart. If you've been storing historical data about your users in your database or in a log file, one thing you could try is importing historical data into Amplitude and then looking at your retention chart. I just finished doing this for a friend in Mixpanel earlier today. Here's the result: http://aacook.co/retention.png
This chart uses only two user events (Sign Up and some usage event you define) but tells you so much: week-over-week acquisition (number of new users signing up) in the first column, new-user activation in the second column (number of new sign-ups who reached a moment of value), and a basic form of retention (number of users coming back at week N).
In my friend's startup, they're doing a great job with new user acquisition but they have a clear onboarding/activation problem. Less than half of new sign ups reach the authentic usage state. In the following week, another 50% of those users drop off.
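The two-event cohort logic above can be sketched in a few lines. This is a toy illustration, not Mixpanel's or Amplitude's implementation; the event names, the sample data, and the Monday-based week bucketing are all assumptions:

```python
from datetime import date, timedelta

# Hypothetical minimal event log: (user_id, event_name, date).
# "Sign Up" / "Used Feature" stand in for whatever signup and
# moment-of-value events your own app defines.
events = [
    ("u1", "Sign Up", date(2015, 6, 1)),
    ("u1", "Used Feature", date(2015, 6, 3)),
    ("u1", "Used Feature", date(2015, 6, 10)),
    ("u2", "Sign Up", date(2015, 6, 2)),
    ("u3", "Sign Up", date(2015, 6, 8)),
    ("u3", "Used Feature", date(2015, 6, 8)),
]

def weekly_retention(events):
    """Bucket users into cohorts by signup week, then count how many
    users in each cohort triggered the usage event N weeks later."""
    signup = {u: d for u, e, d in events if e == "Sign Up"}
    cohorts = {}
    for u, d in signup.items():
        week = d - timedelta(days=d.weekday())  # Monday of signup week
        cohorts.setdefault(week, {"size": 0, "weeks": {}})
        cohorts[week]["size"] += 1
    for u, e, d in events:
        if e != "Used Feature" or u not in signup:
            continue
        week = signup[u] - timedelta(days=signup[u].weekday())
        n = (d - week).days // 7  # weeks since the cohort's start
        cohorts[week]["weeks"].setdefault(n, set()).add(u)
    return {w: {"size": c["size"],
                "retained": {n: len(us) for n, us in c["weeks"].items()}}
            for w, c in cohorts.items()}
```

Each row of the resulting table is one signup week: the cohort size is the acquisition number, week 0 of "retained" is activation, and the later weeks show drop-off.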
I was only referencing the fact that the individual user data appeared to be correct in one view, even though the session length distribution chart was way off. Had it been off for all users, it would be a big problem, but it only affects the first session (see my edit above).
I agree cohort analysis is useful, but not particularly difficult to capture and chart on the server side for web apps, or mobile apps with a server back-end. For SPAs or hybrid or mobile apps you need a facility to capture client side events, which is where something like Amplitude really adds value.
It's one less thing to build or run, but if it's hamstrung by limited reporting, free isn't really free - you have to pay to unlock the value of that event stream.
To give a little more context around how web sessions work, the first event creates the session and each subsequent event triggered within 30 min of the previous one will be considered in the same session. The session length is calculated as the time between the first event of the session and the last.
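That sessionization rule can be sketched as follows; this is a minimal illustration of the rule as described above, not Amplitude's actual code:

```python
SESSION_GAP = 30 * 60  # seconds; events within 30 min continue the session

def session_lengths(timestamps):
    """Split a sorted list of event timestamps (epoch seconds) into
    sessions and return each session's length, defined as the time
    between the session's first and last event. A single-event
    session therefore has length 0."""
    if not timestamps:
        return []
    lengths, start, prev = [], timestamps[0], timestamps[0]
    for t in timestamps[1:]:
        if t - prev > SESSION_GAP:   # gap too long: close the session
            lengths.append(prev - start)
            start = t                # this event opens a new session
        prev = t
    lengths.append(prev - start)     # close the final session
    return lengths
```

Note the consequence: a visit that fires only one event counts as a zero-length session, which is one reason session-length distributions can look surprising.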
So the trends were similar, but the actual numbers were completely different.
I don't understand why people will use "analytics tools" that don't actually document what they're counting, and how. At least with Free software, one can have a look and try to figure it out (e.g. Piwik, Visitors). With a lot of "services", your only option for a sanity check is to look at access logs, which of course is one of the things one tries to get away from when moving to an analytics platform...
(Note: I haven't yet looked at what documentation Amplitude provides; I'm just noting that documenting how you're counting is an essential part of the product, and something any user needs to know. Seeing your comment here at least hints that that information isn't (readily) available on amplitude.com/docs. If it were, you'd just have linked to it, right? ;-))
I guess what I'm trying to say is that, while services like these are great, I find that after reaching a certain size, startups will almost always need in-house analytics infrastructure to cover all the use cases and needs specific to the business. Not that services like Amplitude don't have their place, which they do; I just wish there were a universal solution to that problem. But I suppose you can't help rolling your own infrastructure given specific enough needs.
We've been thinking a lot about this exact issue over the last few years and have come to the same conclusion: when dealing with a platform that provides a ton of value out of the box, you need to be somewhat opinionated, and you can't expect to cover every use case. Different customers can have very different questions about their data and unique insights that they want to discover.
To address that, we offer direct SQL access to the raw data in Amazon Redshift. While Redshift isn't the best solution for powering high-performance, realtime dashboards, its main benefit is that it supports most of the SQL standard, which means that you can answer close to any question about the data. This duality of the fast, easy-to-use dashboards that provide immediate value and the flexibility of Amazon Redshift for deeper, customized queries has helped a lot of our customers overcome the problems you describe.
Hope this makes sense, and we'd love to hear your feedback!
Edit: Also just saw you guys just recently raised a round, congrats!
Jeffrey's the VP of Engineering at Amplitude who built the core of our current infrastructure.
If you ever get to that point, you can use raw Redshift access to do whatever queries you want. I used to use Zynga's pretty state-of-the-art analytics infrastructure, and the Amplitude dashboard was the only thing that covered most of my edge cases. For the rest you can always go raw SQL.
The other strategy companies use is "Burn out the support team by starving them for resources" but I don't like that either.
The other way to attack it is to do things that scale better: for example, better onboarding documentation or hosting webinars.
You can use the normal self-service helpdesk methods to filter out the newbie questions.
We're also in the middle of a 3rd party SOC 2 audit right now.
Is Redshift that cost-effective?
We've actually developed an in-house backend from scratch that's gone through around 10 iterations. There are 3 main data stores:
1) A real-time, in-memory data store (similar to Redis) for recent data, which aggregates data in various ways for display on the dashboard
2) A batch service backed by Amazon S3 for data older than 24 hours, which also aggregates data
3) A column store for more complex queries that can't be answered from the aggregated data
On top of all that, we have a distributed query engine that accesses all the data stores and queries the appropriate ones in parallel at query time on the dashboards.
We'll have a blog post that has more details about our stack in the next week or so.
I'll personally guarantee that we'll never remove stuff from the current free plans; in fact, we hope to add features to the free plan and increase its volume as time goes on. Let me know if you need any other guarantees!
Want Analytics? It's spelled Amplitude.
The dashboard API is amazing. We only need a few simple numbers, and the API works great.
Things like custom dashboards are cool, but we're able to just pull the numbers into our own dashboard, which we've melded with other services as well.
Your growth discovery engine feature looks pretty cool, as does the event path report. This looks similar to conversion path reports I've seen in GA's attribution data.
Unfortunately, a lot of times those reports look nice but don't help much because they fail to provide meaningful insights when dealing with any sort of user volume. I've always wanted some sort of zoom out view that lets me view color-coded patterns or something like that so I can visually get a sense of what is going on and let my brain's pattern recognition abilities go to work to spot clusters. This is what in turn informs the questions to feed into your growth report, since your customers may not know the right questions to ask yet. Providing tools to give the answers isn't enough, you have to provide tools that spit out the questions ;)
We actually tried to make a fully automated version of the growth discovery engine, and customers hated it! To our surprise, people don't trust the result if Amplitude hands it out on a silver platter. They want control in setting up the analysis so they know what they're getting out of the system. It's the same sort of aversion people have towards "magic" in programming. Eventually, as people get more comfortable, we'll automate more of it!
Out of curiosity, what sorts of things was your initial attempt spitting out as insights?
Also, as someone likely in your target audience (senior digital marketing/analytics guy), I have a suggestion for your site that I see lots of similar businesses fall victim to.
I want to see how the product works without investing time or having a sales conversation so I can determine if there might be sufficient fit to even warrant investing more of my extremely limited time.
You have some small screenshots and some marketing text. Consider adding a full demo account with pre-populated data that people can poke around in WITHOUT providing an email address. At the very least give me full-screen images when I click a screenshot. I want to see your interface. I want to see what data looks like in there, how I can expect to manipulate it, create reporting views, filter, etc. And again, I want to do all this without being pestered by a sales person or being added to a list.
The marketer in me is saying "well, maybe consider A/B testing gating such a demo behind an email capture or more involved signup form", but the other part of me that gets pestered by sales people for things like this day in and day out, and really geeks out on this stuff is screaming "just let me see the F-ing product!"
https://medium.com/@illscience/mixpanel-vs-amplitude-6b3ba36...
All that being said, the free plan doesn't include all those neat features like cohort analysis, and the paid plan is way beyond our reach at ~$1k. We're currently paying around $300 to mixpanel, and that's already maxing out our small budget. We would love to have a paid plan with less events, but more functionality at a more affordable price point.
Here is the reality of how this works. Someone on the business side sees a shiny dashboard and wants one. The product involved gets pitched over the fence to engineering, and we go to the site. We know from the get-go that we are fucked because the header doesn't mention API/Developer or the word "integration". Because we have a JOB that we probably like, we click on help...
Oh look integration docs, Maybe there is hope for this turd!
Wait, I'm on Zendesk? There are broken pages (there are broken pages)? No public forums (well, at least they aren't apparent).
This tool, like the 16 other analytics packages we run, is going to be a flash in the pan, or so narrowly defined that it will be single-use. I'm not going to put anything useful in here, because when we hit that cap we will be asked to "cull data" rather than pay.
Keys to success in this space: give a shit about the guys implementing your service. If you can't be bothered to build, run, and host your own documentation, I don't know what to say. If you're going to send me to a third party, at least have user forums / public user communication front and center. Let me know that people are having issues and that you're addressing them.
Give me a way to tier out data. Yes, you sold my business person on the "free" tier, that's nice. Let me give you the data they are going to want at some point, and let me do it in such a way that you can put it in cold storage until I need it. Give me a way to back it up to your S3 at cost, or to where I want (my own S3, my own data storage solution) for free. Give me a way to "keep myself under the cap" by moving the data I want around.
A while later, some business person wants to see "something new" in the dashboard... Well, guess what, we have been collecting that all along: go pay the vendor, get the history, and see it in real time. If you can do THAT without having to involve ME as an engineer, then you have a winner on your hands; otherwise you're just analytics implementation 17 and we will move on to 18.
Also, totally agreed on there being a lot of other analytics products out there, which is why we felt the need to be aggressive in giving away so much data for free.
What do you mean to tier out data? Would love to hear how we could make it compelling for you.
1. Product person A adds your product into the stack and pushes RIGHT UP to the limit you have set. 2. Product person B needs "some new slice of data", but guess what: your 12k annual cost is way more than they want to or need to spend out of their P&L, and their data is going to push the organization over the cap, so they look at a new product.
1. Product person A adds your product to whatever portion of the platform they are responsible for; it lives in a vacuum there. 2. Product person B comes along and lacks awareness of or training on how to use your tool. They can implement another tool, get free training, and not have to involve themselves with product person A.
1. Product person A adds your product into the stack, and the engineers who implement it find issues with your API/documentation/support. 2. Product person B wants to add your product into the mix, and the ENGINEERS say "we're not doing that again, find another vendor".
All of the above situations create "fragmented" insights. As an example, one of my employers used Mixpanel for mobile analytics. They were happy with "real-time sales numbers", but the reality of "we're not shipping a subset of those orders" is where the rubber meets the road. Not only does fragmentation make it hard to get real insight, it makes it hard to troubleshoot things as well.
You want my data? Take it, but take ALL of it. Don't just take the events I want to send you: take the logs, take it from mobile and web, take it from the backend systems, take it from my shippers. Take more than JSON over HTTP; let me open up a socket and give you the fire hose. Give me libraries in Go, Java, PHP, Python, C*. Give me a set of logging tools that gets you that data quickly. Give engineers a way to send you more context than just the user (process, session, transaction, response time). Give me a way to tell YOU what data to keep "hot" and what data to archive... Just take all the data. If you have it all in one place, you're going to make the engineer's life easy. The first time some product person runs up and says "conversions are up on our new launch" and wants to take credit for the UI changes, and I can lay a graph of response times over it and say, "well, it might be because the server is faster and people aren't walking away", you're going to have not only my business but that of everyone else I can convince to adopt your product.