Countless companies have taken preliminary steps to extract more value from their data. Some have adopted the latest open source repositories to reduce the cost of storage and free up precious space in their enterprise data warehouse. Others have brought together a limited number of data sets to conduct trial analyses and see what insights can be attained.
Today, many of these companies are reaching an inflection point, where cost reduction and experimentation must give way to production capabilities and revenue generation.
“Most companies are over the hump of gathering and managing high volumes of data,” says Ron Kasabian, vice president and general manager of big data solutions at Intel. “And now they are looking for more return on their big data investments. They’re ready to make the leap.”
Preparing to scale and evolve
As with any advanced technology, this leap requires the right underpinnings—an infrastructure that delivers world-class performance, management, and scalability.
“As they move beyond pilot projects and targeted use cases, companies must learn how to deal with the increased complexity and scale,” says DD Dasgupta, vice president of enterprise product marketing at Cisco. “It requires an infrastructure that can bring together Hadoop platforms, analytics software, and other enterprise systems with immense processing power and automated, policy-driven governance.”
Scalability is often the greatest inhibitor of a company’s big data evolution. Pulling a few data sources into an isolated data lake is one thing; correlating hundreds of disparate data streams across thousands of nodes and dozens of enterprise systems is quite another.
“Scalability and management are critical, especially as IoT [Internet of Things] becomes more mainstream,” says Dasgupta. “These new workloads are so different from traditional enterprise data, which is more structured and static. Companies need an infrastructure that can align new and legacy systems without negatively impacting the security and governance policies that are already in place.”
The infrastructure must also be able to support an evolving set of software tools.
“Hadoop has been around for a decade, so it has matured quite a bit. But analytics software is still relatively nascent and continues to change at a rapid clip,” says Kasabian, who recommends infrastructure consistency with software flexibility. “You need to be open-minded and willing to change course as new tools and capabilities come to the fore. At the same time, you don’t want to overhaul the infrastructure every time you adopt a new piece of software.”
A powerful foundation
Cisco and Intel work closely with big data vendors to make sure Hadoop distributions and analytics software perform best on the Intel® Xeon® processor-based Cisco Unified Computing System™ (Cisco UCS®).
“Cisco UCS Integrated Infrastructure for Big Data provides a secure and scalable infrastructure to support enterprise requirements,” Forrester stated in a recent report, listing Cisco as a “Leader” in Big Data Hadoop-optimized systems.1 “Cisco’s UCS solution comes pretested and prevalidated for Cloudera, Hortonworks, IBM, and MapR, providing a lower-cost and scalable storage platform to support Hadoop deployments.”
It’s a powerful foundation that includes centralized management and policy automation, which are critical as a big data environment grows.
“Going from fewer than ten nodes to hundreds or even thousands of nodes is a huge difference,” says Dave Kloempken, director of global data center solution sales at Cisco. “You need to spin them all up, load them with software, and keep the firmware up to date. Without centralized management and automation, it can be extremely difficult to expand an environment and maintain that level of scale.”
“Management tools such as Cisco UCS Manager and Cisco UCS Director allow for simple configuration of big data Hadoop clusters that can adapt dynamically to changing workloads,” the Forrester report noted. “Cisco’s key differentiators lie in its ability to offer a wide range of configurations, its strong focus on Internet of Things (IoT) use cases, and its broad partner ecosystem.”
1 Forrester, The Forrester Wave™: Big Data Hadoop-Optimized Systems, Q2 2016, May 2016.