Issues and developments related to IP, AI, and OM, examined in the IP and tech ethics graduate courses I teach at the University of Pittsburgh School of Computing and Information. My Bloomsbury book "Ethics, Information, and Technology", coming in Summer 2025, includes major chapters on IP, AI, OM, and other emerging technologies (IoT, drones, robots, autonomous vehicles, VR/AR). Kip Currier, PhD, JD
Showing posts with label mining Google's scanned books.
Tuesday, December 10, 2013
In a Scoreboard of Words, a Cultural Guide; New York Times, 12/7/13
Natasha Singer, New York Times; In a Scoreboard of Words, a Cultural Guide:
"“We wanted to create a scientific measuring instrument, something like a telescope, but instead of pointing it at a star, you point it at human culture,” Mr. Michel recalls. The pair approached Peter Norvig, the director of research at Google, with a pie-in-the-sky proposal: to mine Google’s library of digital books so they could apply automated quantitative analysis to the typically qualitative study of history.
According to the book, Mr. Norvig was intrigued. But he challenged the graduate students by asking how such a system could work without violating copyright.
After some thought, Mr. Aiden and Mr. Michel proposed creating a kind of “shadow data set” that would contain frequency statistics on the most common words or phrases in the digitized books — but would not contain the books’ actual texts.
The pair developed a prototype interface, called Bookworm, to prove their idea. Soon after, engineers at Google, including Jon Orwant and Will Brockman, built a public, web-based version of the tool."
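The "shadow data set" idea in the excerpt above can be illustrated with a small sketch: keep counts of words or phrases that appear often enough, and discard the running text. This is only an illustration under assumed choices, not the method Google or the researchers actually used; the function name `ngram_counts`, the `min_count` threshold, and the simple tokenizer are all hypothetical.

```python
import re
from collections import Counter


def ngram_counts(texts, n=1, min_count=2):
    """Return frequency statistics for n-grams seen at least min_count times.

    Hypothetical helper: only aggregate counts are returned, never the
    original texts, which is the point of a "shadow" data set.
    """
    counts = Counter()
    for text in texts:
        # Assumed tokenizer: lowercase runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    # Dropping rare n-grams keeps only the most common words or phrases.
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Toy stand-ins for digitized books (made-up sentences, not real data).
books = [
    "the whale swam and the whale dove",
    "the ship followed the whale",
]
shadow = ngram_counts(books, n=2, min_count=2)
print(shadow)
```

With these toy inputs, only the two-word phrase "the whale" clears the threshold, so the resulting table says how often that phrase occurs without reproducing either sentence.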