I wanted to call this story “Father of Data Warehousing Talks Turkey About Big Data at Upcoming Mission College Event.” But I didn’t, because while my readers would understand it, Google might not, and could index the story with websites about cooking Thanksgiving turkey.
You understand because you speak colloquial American English, and because you know that Thanksgiving is 10 months away and that November is the only month of the year when turkey recipes make headlines. But nothing in the raw text tells Google this.
That illustrates the problem that Bill Inmon will talk about on March 6 in the second installment of Mission College Center for Innovation and Technology’s (MC2IT, mc2it.missioncollege.edu) free speaker series, “Creating Business Value from Big Data.”
Inmon is credited with being the father of the data warehouse – he coined the term in the 1970s and, in 1992, published “Building the Data Warehouse,” which continues to be a central text for data management. Computerworld named him one of the 10 most influential people in the 40-year history of the computing industry. Inmon also founded several businesses starting with Prism Solutions in 1991, and his most recent, Forest Rim Technology, in 2008.
It’s estimated that we’re creating 2.5 quintillion bytes of data every day (a quintillion is 1 followed by 18 zeros). That’s an ocean – more figurative language! – of potential insight for everything from curing cancer to figuring out the most likely customers for your brand of insurance.
Big Data is the technology for making sense of this data ocean, and it’s been getting a lot of buzz lately. But despite the buzz, “doing analytics and trying to get insight from Big Data is devilishly difficult,” says Inmon.
One large consulting company reports that of 150 Big Data “proof of concept” projects that its clients began, only five went on to a successful project implementation. “A Wall Street Journal survey in December said that corporations are receiving a return of 55 cents for every dollar spent on Big Data,” Inmon says.
“Despite all of the potential of Big Data, achieving business value is something that is very difficult to do. It doesn’t come in the traditional methods [for handling information]. It’s not IT business as usual – the approaches, techniques used in the past don’t work with Big Data.”
Inmon frequently uses this example. Two men are talking about a woman and one says to the other, “She’s hot.” What does that mean? If they’re in a singles bar, it probably means the woman is attractive. If they’re doctors in a hospital, they’re probably talking about a patient’s temperature.
In other words, the information “she’s hot” only has meaning if we know its context – something that we supply all the time without paying much attention to it.
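For readers who think in code, the idea can be sketched in a few lines of Python. This is only an illustration of why context matters, not Inmon’s textual disambiguation method; the lookup table, the context labels and the function name `interpret` are all made up for this example.

```python
# Hypothetical lookup table: (phrase, context) -> meaning.
# The contexts and meanings are illustrative assumptions, not a real dataset.
MEANINGS = {
    ("she's hot", "singles bar"): "the woman is attractive",
    ("she's hot", "hospital"): "the patient has a high temperature",
}


def interpret(phrase, context=None):
    """Return the meaning of a phrase in a given context, or flag it as ambiguous."""
    key = (phrase.lower(), context)
    return MEANINGS.get(key, "ambiguous without context")


print(interpret("She's hot", "singles bar"))
print(interpret("She's hot", "hospital"))
print(interpret("She's hot"))  # no context supplied
```

The same three words produce different answers depending on the second argument, and no usable answer at all when it is missing.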
Traditional business information systems, however, operate within a specified context – a structure that tells you what a piece of information is. It’s in a specific file, such as “part number,” or in a defined field in a record, such as “customer name.” With unstructured data – think of corporate emails, sales reps’ notes, desktop spreadsheets, Tweets – you don’t have the structure to tell you what it means.
“There is context in Big Data,” Inmon notes. “But it’s not where you think it is.” As you may expect, Inmon hasn’t been at this for 40 years without having a solution. It’s “textual disambiguation,” a method for bringing context to raw data from sales, marketing, manufacturing and other business functions.
“It’s 90 percent a business job and 10 percent an IT job,” he says. That collaboration is “a change that the corporation hasn’t accepted gracefully,” he observes. “There’s a whole new set of skills that are needed. If people are going to be making decisions about Big Data, they need to know that word [textual disambiguation] and how it helps you determine business value.”
If you want to know more, you’ll have to attend Inmon’s talk on March 6 from 6 to 8 p.m. in the Betty E. Hangs Theatre at the Santa Clara Convention Center, 5001 Great America Parkway. Admission is free. Find out more, including details about the VIP reception from 4:30 to 6 p.m., at tinyurl.com/mc2itinmon.