Yahoo to anonymize logs after 90 days

The Electronic Frontier Foundation's Kevin Bankston discusses the news that Yahoo! will radically reduce the retention period for its logs, anonymizing them after just 90 days (compared with Google's 9 months). It's a pretty radical development: for years, I've been skeptical of claims that tech companies would compete on privacy, issuing press releases that said, in effect, "Use us, we're less snoopy and creepy than those guys!" But here we are -- the company whose data-retention and pally relationship with the Chinese Politburo put a campaigning journalist in jail is now saying that it's going to sanitize its logs on a quarterly basis. Kevin's got a reality check:
Unfortunately, it's hard to gauge the true privacy impact of this policy change until we know exactly what steps Yahoo will be taking to anonymize the data. The devil's in the details, and if Yahoo's anonymization process isn't robust enough, this new logging policy may end up being more privacy PR than privacy protection. Fully anonymizing IP addresses and cookie data can be tricky, and even if that data is thrown away completely, there's still the possibility of individuals being identified based on the content of their search queries, as AOL's search data spill demonstrated.

So, as Yahoo finalizes its policy plans, it should take a look at EFF's newly-revised Best Practices for Online Service Providers, which recommends a range of techniques to strongly anonymize online user data. Hopefully, we'll see the details of Yahoo's plan soon, as well as new announcements from other search engines trying to keep up in this accelerating privacy competition. Internet users have long trusted search engines and internet portals like Yahoo and Google with the privacy of their most intimate and sensitive data, and we're glad to see those companies finally vying to earn that trust.

Yahoo To Anonymize Logs After 90 Days, Compared to Google's 9 Months

Update: Christopher sez, "You note that Google currently 'anonymizes' logs after 9 months. That is not true, because they do not attempt to mask cookies until the 18-month mark. Removing some tiny portion of an IP address from the logs is worthless if cookies can be used to match up new log entries and older log entries."
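
Christopher's point can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the log fields, the cookie values, and the idea that "masking" means zeroing the last octet of an IPv4 address (neither company's actual process is described in the post). The sketch only shows that entries with masked IPs can still be matched up if the cookie is left intact.

```python
import ipaddress
from collections import defaultdict

def mask_ip(ip: str) -> str:
    """Zero the final octet of an IPv4 address (assumed masking scheme)."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)

def link_by_cookie(entries):
    """Group log entries by cookie ID, ignoring the IP field entirely."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["cookie"]].append(entry)
    return dict(groups)

# Hypothetical log entries: the old one has had its IP masked,
# the new one has not, but both carry the same cookie.
old_entry = {"ip": mask_ip("203.0.113.57"), "cookie": "cookie-A", "age": "old"}
new_entry = {"ip": "203.0.113.57", "cookie": "cookie-A", "age": "new"}
other_entry = {"ip": "198.51.100.9", "cookie": "cookie-B", "age": "new"}

linked = link_by_cookie([old_entry, new_entry, other_entry])

# The masked old entry and the unmasked new entry land in the same group,
# so the full IP from the new entry can be read back onto the old one.
for cookie, entries in linked.items():
    print(cookie, [e["ip"] for e in entries])
```

In this toy case the masked address and the full address end up side by side under the same cookie, which is the matching-up the update describes.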


Discussion


It might also be an unintended side-effect of a new process for archiving data, in order to meet applicable data retention laws. Perhaps they find it more cost-effective, or just cannot afford any longer to hold onto data whose future utility is dubious.


I'm also unamused by the synchronicity between the outgoing Bush administration's last days and this move.


From TFPR:

To protect users and our business partners, there will be some specific and limited exceptions to the anonymization policy. In order to fight fraud and preserve system security, Yahoo! will retain system specific data in identifiable form for no more than 6 months -- but only for this purpose. Yahoo! may have to retain data for longer periods to meet other legal obligations.

Just sayin...


What does it mean to "anonymize" logs? Remember when AOL released anonymous logs and it turned out that if you have a bunch of "anonymous" searches from someone, you can pretty well figure out who they are?

Unless each search is unrelated to other searches by the same user, it's not anonymous.
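
A minimal sketch of that distinction, under stated assumptions: the user ID, the salt, the made-up queries, and both helper functions are hypothetical, and this is not a description of how AOL or Yahoo processed their logs. It contrasts a stable pseudonym (the same stand-in ID on every search by one user) with a fresh random token per search.

```python
import hashlib
import secrets
from collections import defaultdict

def stable_pseudonym(user_id: str, salt: bytes) -> str:
    """Replace a user ID with a salted hash: same user, same pseudonym."""
    return hashlib.sha256(salt + user_id.encode()).hexdigest()[:12]

def per_record_token() -> str:
    """A fresh random identifier for every record, unrelated to the user."""
    return secrets.token_hex(6)

def group_queries(records):
    """Group (identifier, query) pairs by identifier."""
    groups = defaultdict(list)
    for ident, query in records:
        groups[ident].append(query)
    return dict(groups)

salt = secrets.token_bytes(16)
# Invented queries, for illustration only.
queries = ["plumbers near a small town", "an unusual surname", "a rare medical condition"]

pseudonymized = [(stable_pseudonym("user-42", salt), q) for q in queries]
unlinked = [(per_record_token(), q) for q in queries]

# One group holding all three searches: a profile of a single person.
print(len(group_queries(pseudonymized)))
# Three separate groups: each search stands alone.
print(len(group_queries(unlinked)))
```

Even with per-record tokens, a single query can still identify someone by its content, so this only addresses the linking problem raised above.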


"Yahoo! may have to retain data for longer periods to meet other legal obligations."

I hear Chinese judges are sticklers for correct procedure.
