Yahoo to anonymize logs after 90 days
Unfortunately, it's hard to gauge the true privacy impact of this policy change until we know exactly what steps Yahoo will be taking to anonymize the data. The devil's in the details, and if Yahoo's anonymization process isn't robust enough, this new logging policy may end up being more privacy PR than privacy protection. Fully anonymizing IP addresses and cookie data can be tricky, and even if that data is thrown away completely, there's still the possibility of individuals being identified based on the content of their search queries, as AOL's search data spill demonstrated.
Yahoo To Anonymize Logs After 90 Days, Compared to Google's 9 Months
So, as Yahoo finalizes its policy plans, it should take a look at EFF's newly-revised Best Practices for Online Service Providers, which recommends a range of techniques to strongly anonymize online user data. Hopefully, we'll see the details of Yahoo's plan soon, as well as new announcements from other search engines trying to keep up in this accelerating privacy competition. Internet users have long trusted search engines and internet portals like Yahoo and Google with the privacy of their most intimate and sensitive data, and we're glad to see those companies finally vying to earn that trust.
Update: Christopher sez, "You note that Google currently 'anonymizes' logs after 9 months. That is not true, due to the fact that they do not attempt to mask cookies until the 18-month mark. Removing some tiny portion of an IP address from the logs is worthless if cookies can be used to match up new log entries and older log entries."
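To make Christopher's point concrete, here is a minimal Python sketch of how a surviving cookie can undo partial IP masking. Everything in it is an assumption for illustration: the log layout, the cookie IDs, the sample queries, and the "zero the last octet" rule are hypothetical and are not taken from Google's or Yahoo's actual processes.

```python
from collections import defaultdict


def truncate_ip(ip: str) -> str:
    """Zero the final octet of an IPv4 address (a hypothetical masking step)."""
    parts = ip.split(".")
    parts[-1] = "0"
    return ".".join(parts)


def anonymize(entries):
    """Mask the IP in each (ip, cookie_id, query) entry but keep the cookie."""
    return [(truncate_ip(ip), cookie, query) for ip, cookie, query in entries]


def link_by_cookie(*logs):
    """Group entries from any number of logs by their cookie ID."""
    linked = defaultdict(list)
    for log in logs:
        for ip, cookie, query in log:
            linked[cookie].append((ip, query))
    return dict(linked)


# Hypothetical log entries: (ip, cookie_id, query). Addresses come from
# ranges reserved for documentation.
old_logs = [
    ("203.0.113.42", "cookie-A", "knee surgery recovery"),
    ("198.51.100.7", "cookie-B", "cheap flights"),
]
new_logs = [
    ("203.0.113.42", "cookie-A", "pizza near 123 main st"),
]

# The old log is "anonymized", the new one is not yet. The shared cookie
# still ties the masked entries to the full IP in the fresh entry.
linked = link_by_cookie(anonymize(old_logs), new_logs)
for cookie, entries in linked.items():
    print(cookie, entries)
```

In this toy data the masked entry for cookie-A sits right next to a fresh entry carrying the complete address, so the masking buys nothing as long as the cookie is kept.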


It might also be an unintended side-effect of a new process for archiving data, in order to meet applicable data retention laws. Perhaps they find it more cost-effective, or just cannot afford any longer to hold onto data whose future utility is dubious.
I'm also unamused by the synchronicity between the outgoing Bush administration's last days and this move.
From TFPR:
To protect users and our business partners, there will be some specific and limited exceptions to the anonymization policy. In order to fight fraud and preserve system security, Yahoo! will retain system specific data in identifiable form for no more than 6 months -- but only for this purpose. Yahoo! may have to retain data for longer periods to meet other legal obligations.
Just sayin...
What does it mean to "anonymize" logs? Remember when AOL released anonymous logs and it turned out that if you have a bunch of "anonymous" searches from someone, you can pretty well figure out who they are?
Unless each search is unrelated to other searches by the same user, it's not anonymous.
"Yahoo! may have to retain data for longer periods to meet other legal obligations."
I hear Chinese judges are sticklers for correct procedure.