Article here.
Discuss.
Not sure you’ll get much discussion without a summary of the main concern, but it’s worth mentioning that, by using hashing, they just have something in place to detect exact duplicates of known files. It’s not like the NSA, where individuals are looking at your communications and attachments. So it does not really touch on any significant privacy concerns. Google also said they are developing a way to use encrypted “fingerprints” of files so that other companies could implement the same thing, something which could be made mandatory by law for corporations and websites in various countries.
So, I’m not really sure there’s much to discuss here, unless I am missing something.
Sorry. Are you saying Google had an existing copy of the file and simply detected that one of their users was storing it in his email? If so, I agree there is not much to discuss. I didn’t get that from what I read.
Thanks.
A hash value is, for practical purposes, a unique identifier for a file (or other piece of data, like a text string). Hash values look like long strings of jumbled letters and numbers. (For example, the md5 hash value of “NewAPPS” is “4c90c305d1f20d2a75113a18a2f6fcc9”.) You can calculate that value by running a hashing function on that data, and there are different functions with different algorithms. Identical files will have identical hashes for a given algorithm. What looks to have happened here is that there is a large database of hash values of known child pornography images, and Google ran the hashing function against files on its servers (which it probably does anyway for other technical purposes) and then checked to see if any of the resulting hash values matched ones already in the database. If a file on its servers turns out to have a hash value that’s already in the database, then it’s very likely child pornography. (It’s important that the hashing function be “collision resistant” [see http://en.wikipedia.org/wiki/Collision_resistance] so that it is practically infeasible to find two different files with the same hash. But if there’s some guy with a bunch of images that have the same hashes as those in the database, it’s very unlikely that they would all be due to collision.) So Google doesn’t have to have a human even look at the guy’s emails/images, nor does it even have to hold a prior copy of the image to do this; the hash value is enough.
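To make the lookup concrete, here is a minimal Python sketch of the idea, not a description of Google’s actual system. The byte strings, the file names and the `KNOWN_HASHES` set are made-up stand-ins, and I’ve used SHA-256 from the standard `hashlib` module for the matching rather than md5, since md5 is no longer considered collision resistant.

```python
import hashlib


def file_hash(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical database: only hash values are stored, never the files.
KNOWN_HASHES = {
    file_hash(b"stand-in bytes for a known flagged image"),
}


def is_flagged(data: bytes) -> bool:
    """True if the data's hash is already in the database."""
    return file_hash(data) in KNOWN_HASHES


# Hypothetical attachments sitting on a mail server.
attachments = {
    "holiday.jpg": b"some ordinary photo bytes",
    "copy.jpg": b"stand-in bytes for a known flagged image",
}

# Hash every attachment and keep the names whose hash is in the database.
flagged = sorted(name for name, data in attachments.items() if is_flagged(data))
print(flagged)  # ['copy.jpg']

# Prints the md5 of the string, to compare with the value quoted above.
print(hashlib.md5(b"NewAPPS").hexdigest())
```

Note that nothing in the loop opens or displays a file; the only comparison made is between hash strings.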
Yeah, it was on some other blogs I read and I looked up how they did it (because I can’t let people think I am wrong on the internet!!). Here is what I have from earlier:
This is Google’s description:
Since 2008, we have used “hashing” technology to tag known child sexual abuse images, allowing us to identify duplicate images which may exist elsewhere. Each offending image in effect gets a unique fingerprint that our computers can recognize without humans having to view them again. Recently, we have started working to incorporate these fingerprints into a cross-industry database.
So as far as I know, they are only detecting exact duplicates of known files. This is a little less concerning than if it detected near matches (because of false positives) or if there were an element of manual searching, like if keywords or text combinations got people’s stuff flagged and then a Google employee started browsing around. Though I believe Google has made clear elsewhere that it would be totally within their legal rights to do so.
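Here is a quick Python sketch of why plain hash matching only catches exact duplicates. It assumes an ordinary cryptographic hash (SHA-256 here, which is my choice for illustration; the quoted description doesn’t say which function Google uses), and the byte strings are made up.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Made-up bytes standing in for a known image and a slightly altered copy.
original = b"fake image bytes, for illustration only"
altered = original + b"\x00"  # one extra byte appended

# Hypothetical database holding only the hash of the known file.
known = {sha256_hex(original)}


def matches_known(data: bytes) -> bool:
    """True only when the data is byte-for-byte identical to a known file."""
    return sha256_hex(data) in known


print(matches_known(original))  # True
print(matches_known(altered))   # False
```

So with this kind of matching, a near match is simply not a match, which is why false positives are not much of a worry here.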
Looking at it again, the original report by KHOU ends with a quote of the detective saying “I really don’t know how they do their job. But I’m just glad they do it.” And it says that the files were “detected.” Then other news sites said that Google “spotted” the images and “tipped off” the authorities, which suggests that a person was looking in the emails and stuff and then called them.
Actually, what Google does is use hashes to find exact matches in email attachments and automatically forward this information to the tip line, which is also what they do with their search engine. It’s a legal obligation to do this anyway, including with their web crawler thingy for the search engine. No individuals at Google are involved in the process of detection or reporting.
Probably the guy’s email account was under his name or phone number and this info matched the sex offender registry, so the local police got a search warrant. Which, meh. This is exactly what even privacy diehards would want from Google and friends. So, a false alarm and a viral headline, and there are much more troubling privacy issues out there if you’re into that sorta thing, like how law enforcement people don’t need a warrant or cause to go and read your email, Facebook stuff, etc., because the data is hosted on those companies’ servers.
Thanks guys. Comforting to know that’s how Google spotted him.