howard rheingold on crap detection
Howard Rheingold on Crap Detection 101:
The answer to almost any question is available within seconds, courtesy of the invention that has altered how we discover knowledge – the search engine. Materializing answers from the air turns out to be the easy part – the part a machine can do. The real difficulty kicks in when you click down into your search results. At that point, it’s up to you to sort the accurate bits from the misinfo, disinfo, spam, scams, urban legends, and hoaxes. “Crap detection,” as Hemingway called it half a century ago, is more important than ever before, now that the automation of crapcasting has generated its own word: “spamming.”
Unless a great many people learn the basics of online crap detection and begin applying their critical faculties en masse and very soon, I fear for the future of the Internet as a useful source of credible news, medical advice, financial information, educational resources, scholarly and scientific research. Some critics argue that a tsunami of hogwash has already rendered the Web useless. I disagree. We are indeed inundated by online noise pollution, but the problem is soluble. The good stuff is out there if you know how to find and verify it. Basic information literacy, widely distributed, is the best protection for the knowledge commons: A sufficient portion of critical consumers among the online population can become a strong defense against the noise-death of the Internet.
The first thing we all need to know about information online is how to detect crap, a technical term I use for information tainted by ignorance, inept communication, or deliberate deception. Learning to be a critical consumer of Webinfo is not rocket science. It’s not even algebra. Becoming acquainted with the fundamentals of web credibility testing is easier than learning the multiplication tables. The hard part, as always, is the exercise of flabby think-for-yourself muscles.
Read more: http://www.sfgate.com/cgi-bin/blogs/rheingold/detail?entry_id=42805#ixzz0iTjBVWA7
(etiquette note: 3 paragraphs is fair reblog use, right?)
Funny he should mention algebra. I’ve been thinking a lot about the ways in which mathematics can contribute to crap detection. Examples of what I’m thinking of so far:
Hans Rosling, Let My Dataset Change Your Mindset, TED talk
I’ve tweeted this three times at least. You really must watch it. I had some dumb ideas about “the developing world” before Rosling corrected me with a whump upside the head of brilliantly narrated data. I’d love to interrogate http://gapminder.org but I don’t know the right questions. Anyone for some digital social science?
A few weeks back there was an interactive infographic that showed perceived sustainability graphed against actual sustainability (I’m afraid I’ve lost the link). Whole Foods in particular is perceived as being really green, without being substantially more sustainable than other grocery stores. Of course, have we detected crap re: Whole Foods, or have we detected crap re: the survey methods?
Benford’s Law
The leading digits of naturally occurring data tend to follow a particular probability distribution, almost no matter what kind of data: a leading 1 turns up about 30% of the time, a leading 9 less than 5%. Seriously, the examples are bizarre. (Radiolab has a great story on this law, I believe in the ‘Numbers’ episode.) That gives it a potential use in detecting data which either 1. has interesting anomalies, or 2. is fraudulent. It seems we should be testing this law against all kinds of open data, if for no other reason than to ooh and aah over Benford’s law, but also with an eye for detecting surprising things.
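For anyone who wants to try this on open data, here is a rough sketch in Python of what such a test might look like. Benford's law says a leading digit d should show up with probability log10(1 + 1/d). The function names are my own invention, and comparing a chi-square statistic against 15.507 (the usual 5% cutoff with 8 degrees of freedom) is just one common rule of thumb, not the definitive way to run the test.

```python
import math
from collections import Counter

# Hypothetical helper names, chosen for this sketch only.

def benford_expected(digit):
    """Probability that the leading digit equals `digit` (1-9) under Benford's law."""
    return math.log10(1 + 1 / digit)

def leading_digit(value):
    """First significant digit of a nonzero number, ignoring sign and scale."""
    if value == 0:
        raise ValueError("zero has no leading digit")
    # Scientific notation puts the first significant digit in position 0.
    return int(f"{abs(value):.15e}"[0])

def benford_chi_square(values):
    """Chi-square statistic comparing observed leading digits to Benford's law."""
    digits = [leading_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * benford_expected(d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat

# Assumed rule-of-thumb cutoff: 5% critical value of chi-square with 8 degrees of freedom.
CRITICAL_5_PERCENT = 15.507

if __name__ == "__main__":
    powers_of_two = [2 ** k for k in range(1, 200)]  # known to follow Benford closely
    evenly_spread = list(range(100, 1000))           # every leading digit equally common
    for name, data in [("powers of two", powers_of_two), ("100..999", evenly_spread)]:
        stat = benford_chi_square(data)
        verdict = "looks Benford-ish" if stat < CRITICAL_5_PERCENT else "worth a closer look"
        print(f"{name}: chi-square = {stat:.2f} ({verdict})")
```

A high score only says the digits look unusual. Plenty of honest data (anything confined to a narrow range, like adult heights) fails Benford's law for perfectly boring reasons.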
Re: your comment on Benford’s Law. I think you will run into a problem with Goodhart’s law.
* any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes
If Benford’s Law is widely used to test the probability that data is real, then spammers will manipulate their data to make it more compliant with Benford’s Law than real-world data is.
I don’t know enough about statistical magics to know if that sort of manipulation is supposed to be detectable. I’ll look into Goodhart’s, thank you!
You’ve probably seen it, but: Terence Tao on Goodhart’s:
http://www.google.com/buzz/114134834346472219368/UGWBBEPG3LE/To-illustrate-Goodharts-law-as-discussed-in-the
http://www.google.com/buzz/114134834346472219368/UHQZQdCB86m/Goodharts-law-Any-observed-statistical-regularity