Is Google’s Book project just another content scraper?
Over at The Chronicle of Higher Education, Geoffrey Nunberg points out some serious flaws in the Google Book project.
I think it can be summed up by this:
“The ad placement on Google’s book search right now is often comical, as when a search for Leaves of Grass brings up ads for plant and sod retailers…”
Google’s intent for the project seems to be twofold: to amass an immense amount of data for developing natural language processing, and to have something else to plug text links into.
Building a usable database that scholars can rely upon seems a little too advanced for its algorithms at this time:
“To take Google’s word for it, 1899 was a literary annus mirabilis, which saw the publication of Raymond Chandler’s Killer in the Rain, The Portable Dorothy Parker, André Malraux’s La Condition Humaine, Stephen King’s Christine, The Complete Shorter Fiction of Virginia Woolf, Raymond Williams’s Culture and Society 1780-1950, and Robert Shelton’s biography of Bob Dylan, to name just a few.”
As it is, the project can feel as useful as a link farm filled with scraped content. If Google Books were a site about web content and not print, Google probably would have blacklisted it by now.
Andrew Mayne is founder of Blurbtastic.com and publisher of WeirdThings.com. His personal website can be found at AndrewMayne.com.
