number6 1y ago

I am a DPO. The claims Plausible makes won't hold up to scrutiny.

It's a simple trick: declaring all collected data to be technical data, when in fact it is linkable to a data subject.

Thus collection of the data requires consent, because a subject is identified at least for the session.

If you can identify unique visitors you are clearly identifying individuals.

makach1y ago

Indeed, you are correct. Plausible it is not. They should put their cookie consent banner back up and inform their users how they are actually processing the data collected from individual users.

Symbiote1y ago

  hash(daily_salt + website_domain + ip_address + user_agent)

That's what they do. Within 24 hours the daily salt is gone, and the data is anonymous.

https://plausible.io/data-policy#how-we-count-unique-users-w...
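
For readers who want to see the formula above concretely, here is a minimal Python sketch. SHA-256, the 16-byte random salt, the NUL separator and the names `new_daily_salt` and `visitor_id` are assumptions made for illustration; the thread does not say which hash function or encoding Plausible actually uses.

```python
import hashlib
import secrets


def new_daily_salt() -> bytes:
    """Return a fresh random salt (assumed to be rotated and deleted every 24 hours)."""
    return secrets.token_bytes(16)


def visitor_id(daily_salt: bytes, website_domain: str, ip_address: str, user_agent: str) -> str:
    """Sketch of hash(daily_salt + website_domain + ip_address + user_agent) using SHA-256."""
    h = hashlib.sha256()
    h.update(daily_salt)
    for field in (website_domain, ip_address, user_agent):
        h.update(field.encode("utf-8"))
        h.update(b"\x00")  # separator so adjacent fields cannot run together
    return h.hexdigest()


salt_today = new_daily_salt()
salt_tomorrow = new_daily_salt()

# Same visitor, same day: the identifier is stable, so visits can be linked.
same_day_a = visitor_id(salt_today, "example.com", "203.0.113.7", "Mozilla/5.0")
same_day_b = visitor_id(salt_today, "example.com", "203.0.113.7", "Mozilla/5.0")

# Same visitor, next day: a new salt yields an unrelated identifier.
next_day = visitor_id(salt_tomorrow, "example.com", "203.0.113.7", "Mozilla/5.0")

print(same_day_a == same_day_b)
print(same_day_a == next_day)
```

The sketch shows both sides of the argument in this thread: within one salt period the value behaves like a stable per-visitor identifier, and once the salt is replaced the old values can no longer be reproduced from the same inputs.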

makach1y ago

The problem is that this is what they say they do; there are too many examples of companies not complying with their own policies and with regulations. They should explain the above-mentioned algorithm in their published data privacy declaration. Also, even a hash can be considered personal data unless it has been sufficiently protected, so you need to inform your users anyway.

number6OP1y ago

Good approach. IP addresses are personal data, so both the data and the hash are subject to GDPR.

You still need consent to collect it, or else some other kind of legal shenanigans. The intent is to track a person; it is not technically necessary. You might have a legitimate interest, but in the end you still have to consider the GDPR to use this tool.

https://europa.eu/youreurope/business/dealing-with-customers...

omnimus1y ago

Turns out that many officials believe this is fine. Companies using Plausible, Matomo and similar services have been under scrutiny.

An IP address is required for a site to function; your server cannot avoid receiving it. Plausible also only processes it for uniqueness and doesn't save it as is. Interestingly, most web servers and firewalls have to keep track of IP addresses, so they end up saved in access logs and caches, which makes them more problematic than Plausible. Yet that is most likely fine, because the intent is not to track individual users but to improve the service and keep it running. Plausible's intent is likewise not to track individual users but to collect visitor counts, which are also used for improving the service.

I think you might be prematurely spreading fear.

JimDabell1y ago

> Turns out that many officials believe this is fine.

Who has gone on record with this, and in which jurisdictions?

number6OP1y ago

> Plausible also only processes it for uniqueness and doesn't save it as is

That's exactly the point. Processing of personal data to identify a unique person.

Regarding firewalls and logs: it's argued that this is a legitimate interest, as stated in Recital 49 of the GDPR. So they got a free pass, for better or worse.

> I think you might be prematurely spreading fear

Don't get me wrong, I like the approach. But it's not a get-out-of-GDPR-free card.

dgroshev1y ago

That's a bit simplistic. IP addresses are not unequivocally personal data. Let's rewind back a bit, GDPR Art. 4:

> ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;

IP addresses only allow a natural person to be identified when combined with other data, such as ISP data or a profile built over dozens of websites. This is not the same kind of personal data as a name + address, Breyer notwithstanding (note the bit about the ISP in the judgment).

GDPR is not about identifying an abstract entity, it's about identifying a natural person. Doing the former for long enough/with enough data allows the latter, but especially with time-limited in-memory hashes that's a non-existent window of opportunity.

In practice this'd probably need to be resolved in court, and I'm sure not a single SME using Plausible or similar will even get a stern letter, much less fined.

number6OP1y ago

> In practice this'd probably need to be resolved in court, and I'm sure not a single SME using Plausible or similar will even get a stern letter, much less fined.

Agreed.

Plausible just makes false claims like:

> All the site measurement is carried out absolutely anonymously. Cookies are not used and no personal data is collected. There are no persistent identifiers.

That's a heavy statement and it is simply not true, as you quoted:

> an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person

hash(daily_salt + website_domain + ip_address + user_agent) will fall under this definition.

But again, you are right: it is better than anything any other service does.

aspect05451y ago

What’s your thought on the approach adjust.com takes? They say you can claim legitimate interest.

newusertoday1y ago

What are your thoughts on aggregated data? You can still identify unique visitors, but it's aggregated data, so you can't link it back to the individual.

I have doubts that just identifying unique visitors would also identify individuals. Their current approach of creating a random ID that is unique for 24 hours should not violate GDPR, or would it?

number6OP1y ago

You begin at a point where you have data to aggregate. This data is linked to individuals.

Anonymisation of data is itself data processing, and some argue that it is subject to a privacy impact assessment, on the grounds that if it is done poorly and individuals can be deanonymised, the consequences for them can be severe.

The duration itself does not change the outcome.

That said, the approach Plausible takes is much better than using any cookie.
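
The point that aggregation starts from per-visitor data can be illustrated with a small sketch. It assumes the hash formula quoted earlier in the thread, with SHA-256 standing in for the unspecified hash function; `visitor_id`, `count_unique_visitors`, the fixed demo salt and the sample requests are all hypothetical.

```python
import hashlib
from typing import Iterable, Tuple


def visitor_id(daily_salt: bytes, domain: str, ip: str, user_agent: str) -> str:
    """Per-visitor identifier; SHA-256 is an assumption, not a documented choice."""
    material = b"\x00".join(
        [daily_salt, domain.encode("utf-8"), ip.encode("utf-8"), user_agent.encode("utf-8")]
    )
    return hashlib.sha256(material).hexdigest()


def count_unique_visitors(
    requests: Iterable[Tuple[str, str]], daily_salt: bytes, domain: str
) -> int:
    """Return only an aggregate count, but build it from one identifier per visitor."""
    seen = set()
    for ip, user_agent in requests:
        # This intermediate value is what the thread argues about:
        # it exists per visitor before anything is aggregated.
        seen.add(visitor_id(daily_salt, domain, ip, user_agent))
    return len(seen)


sample_requests = [
    ("203.0.113.7", "Mozilla/5.0"),
    ("203.0.113.7", "Mozilla/5.0"),  # same visitor, second page view
    ("198.51.100.23", "Mozilla/5.0"),
]
unique_today = count_unique_visitors(sample_requests, b"fixed-demo-salt", "example.com")
print(unique_today)  # 2
```

Only the final count is aggregated. The set of identifiers it is derived from is still one entry per visitor, which is the stage the argument above is concerned with.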

anonzzzies1y ago

I think you can argue if this holds up: you cannot retrieve the ip from the hash (and residential IPs are usually dynamic). The short lifetime together with never storing the hash makes it so you cannot de-anonymise the user.

No one will get fined for not asking consent for this. Our DPO just said ‘don’t be silly’ when I asked him. But we will see if it gets tested (my bet: it won’t).

ralferoo1y ago

> I think you can argue if this holds up:

Sadly, reckons don't hold up in court.

> you cannot retrieve the ip from the hash

You don't need to retrieve the IP to make it PII; the hash itself is PII.

You might not think of it as containing actual "personal information", but its sole purpose is to attempt to uniquely identify a person. That makes it PII.

> (and residential IPs are usually dynamic)

This actually makes the short lifetime work better as PII, because it reduces the likelihood that a different person who later uses the same IP is tracked as the same person.

> The short lifetime together with never storing the hash makes it so you cannot de-anonymise the user.

That also doesn't matter, because the lifetime of the token is long enough to track the user through an entire typical session, maybe several.

The stupid thing in all these shenanigans is that collecting the data isn't itself the problem; not getting the user's consent is. Just tell the user what you're doing, and it's not a problem. If it's a "technically required" cookie, they can make an informed choice whether to use your site; if it's an "optionally required" cookie, they can choose whether to accept it. Most users won't care and will click on the biggest, most obvious buttons. The ones that do care are likely atypical and would skew your metrics anyway.

JimDabell1y ago

> you cannot retrieve the ip from the hash

You can as long as you have IPv4 visitors, because the search space is small enough to brute-force. There are only four billion IP addresses. The user-agent complicates things a little but there aren’t many of those, so you could retrieve the IP addresses of most visitors from the hash if you wanted to.

> residential IPs are usually dynamic

Usually isn’t good enough. I’ve had residential IPs that are on public record belonging to me personally. IP addresses can be personally identifying information, so they need to be treated that way.
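
The brute-force argument above can be sketched in a few lines of Python. This assumes the attacker holds the daily salt (for example the operator, before the salt is deleted), that the hash is SHA-256, and that a short list of candidate user agents is known; `visitor_id`, `recover_ip` and all sample values are hypothetical. To keep the example fast it searches a single /24 rather than the full IPv4 space.

```python
import hashlib
import ipaddress
from typing import List, Optional, Tuple


def visitor_id(daily_salt: bytes, domain: str, ip: str, user_agent: str) -> str:
    """Same formula as quoted in the thread; SHA-256 is an assumption."""
    material = b"\x00".join(
        [daily_salt, domain.encode("utf-8"), ip.encode("utf-8"), user_agent.encode("utf-8")]
    )
    return hashlib.sha256(material).hexdigest()


def recover_ip(
    target_hash: str,
    daily_salt: bytes,
    domain: str,
    user_agents: List[str],
    network: str,
) -> Optional[Tuple[str, str]]:
    """Try every address in `network` with every candidate user agent."""
    for address in ipaddress.ip_network(network):
        ip = str(address)
        for user_agent in user_agents:
            if visitor_id(daily_salt, domain, ip, user_agent) == target_hash:
                return ip, user_agent
    return None


# Full IPv4 search space per candidate user agent: about four billion.
IPV4_SEARCH_SPACE = 2 ** 32

salt = b"fixed-demo-salt"
observed = visitor_id(salt, "example.com", "203.0.113.7", "Mozilla/5.0")
recovered = recover_ip(
    observed, salt, "example.com", ["curl/8.0", "Mozilla/5.0"], "203.0.113.0/24"
)
print(recovered)  # ('203.0.113.7', 'Mozilla/5.0')
```

The same loop over the whole IPv4 range is `IPV4_SEARCH_SPACE` hash evaluations per candidate user agent. Without the salt the search is not possible in this form, which is why the deletion of the salt matters to the other side of the argument.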

number6OP1y ago

You would still have to produce the paperwork for this.

Most websites don't get fined using GA. Plausible is a huge step in the right direction, but their claims are very strong and not backed up by the GDPR if you take a closer look.

Regarding fines: most offices will give you a warning instead of a fine; you adjust your cookie banner and you are good to go.
