As Bob Eve sees it, big data should be a team sport. One where the IT department deploys and maintains foundational tools and business teams use those tools to perform ad hoc analyses and unearth new insights. But despite the emergence of powerful data management platforms and sophisticated analytics software, most teams are stuck on the practice field.
“Eighty percent of every analytics project is devoted to data preparation,” says Eve, a director of data and analytics products at Cisco. “It needs to be easier and faster to find, combine, and normalize data in advance of analysis. Not just for IT specialists and data scientists, but for everyone.”
- To get the most out of big data, companies need to put the power of analytics into the hands of business users. After all, they are the ones with the questions and the ability to act on the answers.
- Access to data is the key, but it must be granted in a way that doesn’t create chaos in the IT environment.
Confronting a bottleneck
Many companies have built data lakes to store large volumes of information and advance their analytics capabilities. But the results haven’t always matched the hype.
“Companies are getting frustrated because they are pouring tons of money into their big data systems, but not seeing a comparable return,” says Michele Goetz, principal analyst at Forrester. “But you can’t just dump IoT data into a Hadoop environment, for example, and expect business insights to magically appear. Sensor data is just a log. It doesn’t mean anything without some context or correlation.”
Today, aggregating data sets from diverse systems and adding the context that makes them meaningful is largely a manual process. Its complexity places a tremendous strain on IT groups and slows down the business.
“Big data environments and analytics appliances aren’t the problem,” says Goetz. “It’s the bottleneck that is created when every request and query is reliant on the IT organization.”
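To make Goetz's point about context concrete, here is a minimal Python sketch of what "adding context" to a sensor log can look like. Everything in it is hypothetical and for illustration only: the device IDs, the readings, the site names, and the helper functions `enrich` and `mean_temp_by_site` are assumptions, not part of any product or system described in this article.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw sensor log: on its own, just device IDs and numbers.
sensor_log = [
    {"device_id": "s1", "temp_c": 21.0},
    {"device_id": "s2", "temp_c": 35.0},
    {"device_id": "s1", "temp_c": 23.0},
    {"device_id": "s3", "temp_c": 19.5},
]

# Hypothetical context held in a separate system: where each device sits.
device_context = {
    "s1": {"site": "warehouse"},
    "s2": {"site": "server room"},
}


def enrich(log, context):
    """Attach a site to each reading; devices with no context are labeled 'unknown'."""
    enriched = []
    for row in log:
        site = context.get(row["device_id"], {}).get("site", "unknown")
        enriched.append({**row, "site": site})
    return enriched


def mean_temp_by_site(rows):
    """Group enriched readings by site and average the temperature."""
    by_site = defaultdict(list)
    for row in rows:
        by_site[row["site"]].append(row["temp_c"])
    return {site: mean(values) for site, values in by_site.items()}


summary = mean_temp_by_site(enrich(sensor_log, device_context))
print(summary)
```

The raw log only says that device "s2" reported 35.0. Once it is joined with the site information, the same number becomes a statement about the server room. Readings from devices with no known context are kept and labeled rather than dropped, so gaps in the reference data stay visible.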
Self-service data preparation
What’s needed is a self-service, front-end data portal, Goetz suggests. One that is platform agnostic and helps business users find, pull, and prepare data from a number of repositories. One that allows data manipulation and experimentation without adversely affecting the underlying systems or overarching governance policies.
- Cisco® Data Preparation—which runs on the Intel® Xeon® processor-based Cisco Unified Computing System™—allows raw data to be quickly and easily gathered, combined, and enriched.
- The self-service application puts the power of analytics into the hands of nontechnical business users.
“Data from multiple sources can be integrated, cleansed, and explored without coding or scripting,” says Eve. “It works a bit like Excel, where columns can be added and things can be moved around. And you don’t have to be an IT whiz.”
What may seem like a risky relinquishing of control can greatly benefit an IT organization, says Goetz.
“IT groups can focus on foundational capabilities instead of one-off projects,” she says. “They can maintain security and access parameters that protect underlying systems and the data within. And they can learn which data sources and queries are most valuable to the business, and use that knowledge to continually enhance their analytics capabilities.”
Self-service data preparation not only brings IT and business teams closer together, it also helps them get off the big data practice field and into the game.