Information may indeed be power, but organizations large and small are finding that they have to expend more energy than ever just to keep up with the data streams flooding their infrastructures.
It’s not just the internal data that provides intelligence about their businesses. Organizations are also having to manage all that external Big Data coming in that can help them land new customers and add to their revenue.
Those levels of data complexity can be daunting, but many businesses and other entities are finding that automated solutions, together with an agnostic, end-to-end tool chain approach, can help streamline their data and information management.
In this podcast, Matt Wolken, executive director and general manager for information management at Dell Software, shares his insight into the current state of the information management market and its future path. Dana Gardner, principal analyst with Interarbor Solutions, leads the discussion.
The full podcast runs 36:23.
Here are some excerpts:
Dana Gardner: From your perspective, what are the biggest challenges that businesses need to solve now when it comes to data and information management? What are the big hurdles that they’re facing?
Matt Wolken: It’s an interesting question. When we look at customers today, we’re noticing how their environments have significantly changed from maybe 10 or 15 years ago.
About 10 or 15 years ago, the problem was that data was sitting in individual databases around the company: in a database on the back end of an application, such as the customer relationship management (CRM) application or the enterprise resource planning (ERP) application, or in data marts around the company. The challenge was how to bring all this together to create a single cohesive view of the company.
That was yesterday’s problem, and the answer was technology. The technology was a single, large data warehouse. All of the data was moved to it, and you then queried that larger data warehouse where all of the data was for a complete answer about your company.
What we’re seeing now is that there are many complexities that have been added to that situation over time. We have different vendor silos with different technologies in them. We have different data types, as the technology industry overall has learned to capture new and different types of data — textual data, semi-structured data, and unstructured data — all in addition to the already existing relational data. Now, you have this proliferation of other data types and therefore other databases.
The other thing that we notice is that a lot of data isn’t on premises anymore. It’s not even owned by the company. It’s at your software-as-a-service provider for CRM, your SaaS provider for ERP, or your travel or human resources provider. So data again becomes siloed, not only by vendor and data type, but also by location. This is the complexity of today, as we notice it.
All of this data is spread about, and the challenge becomes how do you understand and otherwise consume that data or create a cohesive view of your company? Then there is still the additional social data in the form of Twitter or Facebook information that you wouldn’t have had in prior years. And it’s that environment, and the complexity that comes with it, that we really would like to help customers solve.
Gardner: When it comes to this so-called data dichotomy, is it oversimplified to say it’s internal and external, or is there perhaps a better way to categorize these larger sets that organizations need to deal with?
Wolken: There’s been a critical change in the way companies go about using data, and you brought it out a little bit in the intro. There are some people who want to use data for an outcome-based result. This is generally what I would call the line-of-business concern, where the challenge with data is how do I derive more revenue out of the data source that I am looking at?
What’s the business benefit for me examining this data? Is there a new segment I can codify and therefore market to? Is there a campaign that’s currently running that is not getting a good response rate, and if so, do I want to switch to another campaign or otherwise improve it midstream to drive more real value in terms of revenue to the company?
That’s the more modern aspect of it. All of the prior activity inside business intelligence (let’s flip those words around and say intelligence about the business) was really internally focused. How do I get sanctioned data off of approved systems to understand the official company point of view in terms of operations?
That second goal is not a bad goal. It’s still needed, and IT is still required to create that sanctioned data, that master data, and the approved, official sources of data. But there is this other piece of data, this other outcome that the line of business is asking for, which is how do I go out and use data to derive a better outcome for my business? That’s more operationally revenue-oriented, whereas the internal side is oriented around cost and operations.
So where you get executive dashboards for internal consumption off of BI or intelligence for the business, the business units themselves are about visualization, exploration, and understanding and driving new insights.
It’s a change in both focus and direction. It sometimes ends up in a conflict between the groups, but it doesn’t really have to be that way. At least, we don’t think it does. That’s something that we try to help people through. How do you get the sanctioned data you need, but also bring in this third-party data and unstructured data and add nuance to what you are seeing about your company?
Gardner: Just as the problem to solve 10 or 15 years ago was the silos of data within the organization, is there any way for traditional technology offerings to join the two sides of this dichotomy now, or do we need a different way to create insights using both internal and external information?
Wolken: There are certainly ways to get to anything. But if you’re still amending program after program or technology after technology, you end up with something less than the best path, and there might be new and better ways of doing things.
There are lots of ways to take a data warehouse forward in today’s environment. You can manipulate other forms of data so they can enter a relational data warehouse, or go the other way and put everything into an unstructured environment. But there’s also another way to approach things, and that’s with an agnostic tool chain.
Tools have existed in the traditional sense for a long time. Generally, a tool is utilized to hide complexity and all of the issues underneath the tool itself. The tool has intelligence to comprehend all of the challenges below it, but it really abstracts that from the user.
Instead of buying three or four database types (a structured database, something that can handle text, a solution that handles semi-structured or unstructured data, or even a high-performance analytical engine, for that matter), what if the tool chain abstracted much of that complexity? This means the tools that you use every day can comprehend any database type, data structure type, or any vendor changes or nuances between platforms.
That’s the strategy we’re pursuing at Dell. We’re defining a set of tools, not the underlying technologies or proliferation of technologies, but the tools themselves, so that day-to-day operations are shielded from the complexity of those underlying sources of vendor, data type, and location.
That’s how we really came at it — from a tool-chain perspective, as opposed to deploying additional technologies. We’re looking to enable customers to leverage those technologies for a smoother, more efficient, and more effective operation.
Gardner: Am I right then in understanding that this is at more of a meta level, above the underlying technologies, but that, in a sense, makes the whole greater than the sum of the parts of those technologies?
Wolken: That’s a fair way of looking at it. Let’s just take data integration as a point. I can sometimes go after certain siloed data integration products. I can go after a data product that goes after cloud resources. I can get a data product that only goes after relational. I can get another data product to extract or load into Hive or Hadoop. But what if I had one that could do all of that? Rather than buying separate ones for the separate use cases, what if you just had one?
Metadata, in one way, is a descriptor language, if I use it in that sense. Can I otherwise just see and describe everything below it, or can I actually manipulate it as well? So in that sense, it’s a real tool to actually manipulate and cause the effective change in the environment.
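To make the single-tool idea from this exchange a little more concrete, here is a minimal Python sketch of one integration layer reading from several kinds of sources through a common interface. Everything in it is an assumption made for illustration: the class names (`RelationalSource`, `CsvSource`, `JsonLinesSource`), the `unified_view` helper, the sample records, and the use of SQLite, CSV, and JSON lines as stand-ins for relational, exported, and semi-structured data. It is not a description of any Dell product.

```python
import csv
import io
import json
import sqlite3
from typing import Dict, Iterable, Iterator, List, Protocol

Record = Dict[str, str]


class Source(Protocol):
    """Hypothetical common interface: any source that yields plain records."""

    def records(self) -> Iterator[Record]: ...


class RelationalSource:
    """Reads rows from a SQL query (SQLite stands in for any relational store)."""

    def __init__(self, conn: sqlite3.Connection, query: str) -> None:
        self.conn = conn
        self.query = query

    def records(self) -> Iterator[Record]:
        cursor = self.conn.execute(self.query)
        columns = [col[0] for col in cursor.description]
        for row in cursor:
            yield {name: str(value) for name, value in zip(columns, row)}


class CsvSource:
    """Reads rows from CSV text, such as an export from a hosted application."""

    def __init__(self, text: str) -> None:
        self.text = text

    def records(self) -> Iterator[Record]:
        for row in csv.DictReader(io.StringIO(self.text)):
            yield {name: str(value) for name, value in row.items()}


class JsonLinesSource:
    """Reads one JSON object per line, a common semi-structured format."""

    def __init__(self, text: str) -> None:
        self.text = text

    def records(self) -> Iterator[Record]:
        for line in self.text.splitlines():
            if line.strip():
                obj = json.loads(line)
                yield {name: str(value) for name, value in obj.items()}


def unified_view(sources: Iterable[Source]) -> List[Record]:
    """Combine records from every source; callers never see the storage type."""
    combined: List[Record] = []
    for source in sources:
        combined.extend(source.records())
    return combined


def demo_view() -> List[Record]:
    """Build three tiny sample sources and return their combined records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, region TEXT)")
    conn.execute("INSERT INTO customers VALUES ('Acme', 'EMEA')")
    sources = [
        RelationalSource(conn, "SELECT name, region FROM customers"),
        CsvSource("name,region\nGlobex,APAC\n"),
        JsonLinesSource('{"name": "Initech", "region": "AMER"}\n'),
    ]
    return unified_view(sources)


if __name__ == "__main__":
    for record in demo_view():
        print(record)
```

The point of the sketch is the shape rather than the details: the code that consumes `unified_view` works the same way no matter which vendor, data type, or location sits behind each source, which is the kind of abstraction described above.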