I'd like to know: what are HN's opinions and/or solutions to this dichotomy?
1) Don't go overboard. GA is one thing. GA plus twenty other obscure analytics cookies and scripts is another. If your site looks like [1], there is 0 chance of ever being on my white-list for ads.
2) Just be honest and transparent. You don't need a big "WE USE COOKIESSSSSS" banner; a short little message is enough, along the lines of "We use GA to help develop a better experience for you. All data is in aggregate, and not identifiable. Please consider white-listing our website if you would like to participate in making this website even better".
Ideally when I visit a site I'd like to see minimal tracking by default with an opt-in model. If a site tracks the bare minimum by default and unobtrusively asks me if I'd be okay with sharing a little bit more (or participating in certain testing, etc.) I will generally say yes. Showing me you care enough to let me opt in vastly increases my trust in you as a provider. Opt-in is best, opt-out is not ideal but acceptable if clear and easy, no option = no whitelist.
Also, I think if you are asking as a privacy conscious startup, you should consider taking a step back and really asking yourself how much you really need to track. What is essential vs. what is nice to have? Anything that you implement that is non-essential is another step away from being privacy conscious, in my books at least.
We're living in a time where the default seems to be to track as many data points as possible, then figure out if they are relevant/useful. This is backwards. Implement the absolute minimum and see how it goes. A month down the road you might realize you desperately need to see X metric, so you implement X tracking. More likely, in my opinion, you'll realize that X metric adds minimal value to you, yet tracking it collects significantly more PII from your customers. It's all about balance.
My anonymizer tool uses a bloom filter to determine which IP addresses are unique, maintaining a semi-accurate unique visitor count. User agents and referrers are also anonymized.
My blog post explains this in more technical detail: https://www.jamieweb.net/blog/using-a-bloom-filter-to-anonym...
[1] https://gitlab.com/jamieweb/web-server-log-anonymizer-bloom-...
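For anyone wondering how a bloom filter can count unique visitors without keeping the IPs around, here is a rough sketch of the general idea. This is not the code from the tool linked above; the class name, the filter size, the number of hashes and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative bloom filter (hypothetical, not the linked tool's code)."""

    def __init__(self, size_bits=8192, num_hashes=4):
        # size_bits and num_hashes are arbitrary values chosen for the sketch.
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        """Insert item; return True if it was (probably) already present."""
        already_seen = True
        for pos in self._positions(item):
            byte, bit = divmod(pos, 8)
            if not self.bits[byte] & (1 << bit):
                already_seen = False
                self.bits[byte] |= 1 << bit
        return already_seen


def count_unique_visitors(ips, bloom=None):
    """Count IPs the filter has not seen before; the IPs are never stored."""
    if bloom is None:
        bloom = BloomFilter()
    unique = 0
    for ip in ips:
        if not bloom.add(ip):
            unique += 1
    return unique
```

Only the bit array is kept, so the raw addresses cannot be read back out of it. The trade-off is the "semi-accurate" part: a false positive makes a new visitor look like a returning one, so the count can come out slightly low but never high.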
If you really need to use a third party, I'd prefer one where you pay for the service (Mixpanel, etc.) rather than a "free" one like Google Analytics. At least the paid one has less incentive to use the data for its own purposes, while the whole point of Google Analytics (and the reason it is free) is to provide data for Google's advertising business.
If HTTP referrer isn't an option, a custom query parameter like ?referer=source_site is what I'd consider okay - it looks straightforward, so someone can decide to remove it manually if they have a good reason to do so. I would avoid stuff like utm_ parameters, as they just mean Google Analytics. Not only is that blocked by default by my browser & DNS server, it also means you don't care about my privacy and are happy for Google to track me (I never click on links in emails for this reason, because they all have some kind of scummy shit to track me that involves a third party).
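A minimal sketch of handling such a parameter on the server side, assuming the parameter is literally called `referer` as in the example above. The function name `extract_source` and the example URL are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def extract_source(url, param="referer"):
    """Return (source, clean_url): the parameter's value and the URL without it.

    `param` defaults to the hypothetical ?referer=... parameter name.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # First value of the tracking parameter, or None if it is absent.
    source = next((value for key, value in pairs if key == param), None)
    # Keep every other query parameter untouched.
    remaining = [(key, value) for key, value in pairs if key != param]
    clean_url = urlunsplit(parts._replace(query=urlencode(remaining)))
    return source, clean_url


source, clean = extract_source(
    "https://example.com/pricing?referer=source_site&plan=pro"
)
```

The only thing recorded is the source name the visitor could already read in the URL; stripping the parameter and redirecting to `clean` keeps it out of bookmarks and shared links.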
If it's just a flat rate per conversion, this could be accomplished with minimal (if any) PII collected from the customer.
I recall someone posting a privacy-friendly analytics tool on Show HN a few months back, so you don’t have to roll your own.
The downside is that you lose GA’s insight into who your users are (gender, interests, age group...). But that is some seriously creepy stuff when you stop to think about it.
You may not need user tracking at all. You can track how many signups came from which domain by setting a cookie with the referrer and incrementing a count for that domain at signup. Here’s an interesting post about it https://doingdone.app/blog/building-a-startup-without-user-t...
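A rough sketch of that idea, not taken from the linked post: the function names, the in-memory `Counter` and the "direct" bucket are assumptions for illustration, and a real app would persist the counts somewhere.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical in-memory store: one integer per referring domain, nothing per user.
signup_counts = Counter()


def referrer_domain(referer_header):
    """Reduce a Referer header to its hostname; fall back to 'direct'."""
    if not referer_header:
        return "direct"
    host = urlsplit(referer_header).hostname
    return host if host else "direct"


def cookie_value_for_first_visit(referer_header):
    """Value to store in a first-party cookie on the first page view."""
    return referrer_domain(referer_header)


def record_signup(cookie_value):
    """At signup, bump the counter for whatever domain the cookie holds."""
    signup_counts[cookie_value or "direct"] += 1


# Example flow: a visitor arrives from HN and later signs up.
cookie = cookie_value_for_first_visit("https://news.ycombinator.com/item?id=1")
record_signup(cookie)
record_signup(None)  # visitor with no cookie set
```

The cookie holds only a domain name and the server keeps only per-domain totals, so nothing in the store ties a signup back to an individual visitor.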
* Anonymized and/or aggregated data
* Behavior tracking via cookies (in app vs marketing?)
* Referral/Affiliate tracking