Illustration: Mark Allen Miller
Had Narrative Science, a company that trains computers to write news stories, created this piece, it probably would not mention that the company’s Chicago headquarters lie only a long baseball toss from the Tribune newspaper building. Nor would it dwell on the fact that this potentially job-killing technology was incubated in part at Northwestern’s Medill School of Journalism, Media, Integrated Marketing Communications. Those ironies are obvious to a human. But not to a computer.
At least not yet.
For now consider this: Every 30 seconds or so, the algorithmic bull pen of Narrative Science, a 30-person company occupying a large room on the fringes of the Chicago Loop, extrudes a story whose very byline is a question of philosophical inquiry. The computer-written product could be a pennant-waving second-half update of a Big Ten basketball contest, a sober preview of a corporate earnings statement, or a blithe summary of the presidential horse race drawn from Twitter posts. The articles run on the websites of respected publishers like Forbes, as well as other Internet media powers (many of which are keeping their identities private). Niche news services hire Narrative Science to write updates for their subscribers, be they sports fans, small-cap investors, or fast-food franchise owners.
And the articles don’t read like robots wrote them:
Friona fell 10-8 to Boys Ranch in five innings on Monday at Friona despite racking up seven hits and eight runs. Friona was led by a flawless day at the dish by Hunter Sundre, who went 2-2 against Boys Ranch pitching. Sundre singled in the third inning and tripled in the fourth inning … Friona piled up the steals, swiping eight bags in all …
OK, it’s not Roger Angell. But the grandparents of a Little Leaguer would find this game summary—available on the web even before the two teams finished shaking hands—as welcome as anything on the sports pages. Narrative Science’s algorithms built the article using pitch-by-pitch game data that parents entered into an iPhone app called GameChanger. Last year the software produced nearly 400,000 accounts of Little League games. This year that number is expected to top 1.5 million.
Narrative Science’s CTO and cofounder, Kristian Hammond, works in a small office just a few feet away from the buzz of coders and engineers. To Hammond, these stories are only the first step toward what will eventually become a news universe dominated by computer-generated stories. How dominant? Last year at a small conference of journalists and technologists, I asked Hammond to predict what percentage of news would be written by computers in 15 years. At first he tried to duck the question, but with some prodding he sighed and gave in: “More than 90 percent.”
That’s when I decided to write this article, hoping to finish it before being scooped by a MacBook Air.
Hammond assures me I have nothing to worry about. This robonews tsunami, he insists, will not wash away the remaining human reporters who still collect paychecks. Instead the universe of newswriting will expand dramatically, as computers mine vast troves of data to produce ultracheap, totally readable accounts of events, trends, and developments that no journalist is currently covering.
That’s not to say that computer-generated stories will remain in the margins, limited to producing more and more Little League write-ups and formulaic earnings previews. Hammond was recently asked for his reaction to a prediction that a computer would win a Pulitzer Prize within 20 years. He disagreed. It would happen, he said, in five.
Hammond was raised in Utah, where his archaeologist dad taught at a state university. He grew up thinking he’d become a lawyer. But in the late 1980s, as an undergraduate at Yale, he fell under the sway of Roger Schank, a renowned artificial intelligence researcher and chair of the computer science department. After earning a doctorate in computer science, Hammond was hired by the University of Chicago to lead a new AI lab. While there, in the mid-1990s, he created a system that tracked users’ reading and writing and then recommended relevant documents. Hammond built a small company around that technology, which he later sold. By that time, he had moved to Northwestern University, becoming codirector of its Intelligent Information Laboratory.
In 2009, Hammond and his colleague Larry Birnbaum taught a class at Medill that included both programmers and prospective journalists. They encouraged their students to create a system that could transform data into prose stories. One of the students in the class was a stringer for the Tribune who covered high school sports; he and two other journalism students were paired with a computer science student. Their prototype software, Stats Monkey, collected box scores and play-by-play data to spit out credible accounts of college baseball games.
At the end of the semester, the class participated in a demo day, where students presented their projects to a roomful of executives from the likes of ESPN, Hearst, and the Tribune. The Stats Monkey presentation was particularly impressive. “They put a box score and play-by-play into the program, and in something close to 12 seconds it drew examples from 40 years of Major League history, wrote a game account, located the best picture, and wrote a caption,” recalls the Medill dean, John Lavine.