Businesses are generating more data than ever before. In fact, the total size of business data around the globe is estimated to almost double every year. Storing, accessing and mining this data for actionable insights poses some serious challenges, including the existence of dark data.

Dark data is a term used to refer to any data that is collected but not used. You might think that a majority of dark data is generated by internet users, but the truth is that 80 percent of existing data is currently stored by businesses. It is estimated that $430 billion in productivity gains could be made by 2020 if all the data available could be turned into actionable information.

Failing to turn data sets into actionable insights isn’t an isolated issue. In fact, only 0.5 percent of data that exists on a global scale is used for analytics.

Discovering your dark data

Dark data is often text-based and unstructured. Qualitative data is often left unused since extracting analytics from a quantitative data set tends to be easier.

Here are a few examples of dark data your business probably stores:

  • Email communication, both internal and with customers.
  • Transcripts from interactions with customer service.
  • Data from your social media channels, including engagement metrics and direct interactions with your audience.
  • Your website traffic logs.
  • Log files from your network.
  • Log files from your mobile app.
  • Old documents, such as spreadsheets you no longer use, notes from old projects, and former employee files.
  • Inactive databases.

Dark data doesn’t necessarily refer to a large data set. If there are data fields in your customer files that you don’t use to target your audience, this information falls into the dark data category.

Obstacles to using and harnessing dark data

Many businesses lack company-wide best practices for reviewing data and deciding how it should be mined. The first obstacle to using dark data is visibility: you might not know what data is available or whether it has real potential to generate more revenue.

Secondly, mining dark data might not be a priority. A data discovery process needs to be completed before you can get an idea of the kind of commitment of time and resources required to mine this data.

Another potential obstacle is the lack of a scalable data solution. You may need to upgrade to a more powerful cloud-based analytics solution to process large data sets and generate actionable insights.

The unstructured nature of dark data is another issue you will have to address. How can you mine texts, images, videos and log files for insights? Some data can be semi-structured by adding tags, but accessibility can still be an obstacle if you are dealing with a large unstructured data set that is mostly image- or text-based.
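As a rough illustration of adding tags to text, here is a minimal Python sketch that tags a customer service transcript by keyword. The tag names, the keyword lists and the sample transcript are all hypothetical; a real tagging scheme would depend on your own data and would likely need more than keyword matching.

```python
import re

# Hypothetical tag rules: tag name -> keywords that suggest it.
TAG_RULES = {
    "billing": {"invoice", "refund", "charge"},
    "shipping": {"delivery", "shipping", "tracking"},
    "login": {"password", "login", "account"},
}

def tag_text(text, rules=TAG_RULES):
    """Return the sorted list of tags whose keywords appear in the text."""
    # Lowercase the text and split it into a set of alphabetic words.
    words = set(re.findall(r"[a-z]+", text.lower()))
    # A tag applies when at least one of its keywords is among the words.
    return sorted(tag for tag, keywords in rules.items() if words & keywords)

# Made-up transcript used only for illustration.
transcript = "I was charged twice and need a refund. Also, my tracking number is missing."
print(tag_text(transcript))  # ['billing', 'shipping']
```

Once each transcript carries tags like these, it can be filtered and counted like any other structured field.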

And while a mathematical approach can easily be taken with quantitative data sets, extracting actionable information from qualitative data is more challenging.

Using dark data to generate revenues and improve business operations

You need to engage in a dark data discovery process sooner rather than later because some of the information stored by your business might represent a security risk. Old employee and customer files probably include some personal data that isn’t necessary to your business. Unmanaged data is a liability that you need to address, either by deleting sensitive data you no longer need or by choosing a more secure storage option for some data.
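To give an idea of how old files could be screened for personal data, the following Python sketch counts email addresses and US Social Security number patterns in a piece of text. The two patterns and the sample text are assumptions chosen for illustration; they are deliberately simplified and would miss many kinds of personal data.

```python
import re

# Simplified patterns for illustration only; real detection needs broader rules.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def find_pii(text):
    """Count matches of each pattern; return only the kinds that were found."""
    counts = {}
    for kind, pattern in PII_PATTERNS.items():
        n = len(pattern.findall(text))
        if n:
            counts[kind] = n
    return counts

# Made-up record, not real personal data.
sample = "Former employee: jane.doe@example.com, SSN 123-45-6789."
print(find_pii(sample))  # {'email': 1, 'us_ssn': 1}
```

A file that returns any matches is a candidate for deletion or for a move to more secure storage.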

Surveying your unmanaged data will also help you assess your data storage needs. This might be a small expense now, but with business analytics becoming more complex, the size of your dark data will keep growing at a rapid pace if you don’t determine what needs to be stored and for how long.

In order to use dark data, you need to plan a thorough surveying process to assess what is available. You can then determine how valuable the data is and choose a relevant strategy to use it.

You will need to create a system to categorize the data you find, assess potential liabilities, and report how the data could be used.
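A first pass at such a survey could look like the Python sketch below, which walks a folder and reports, per category, how many files it holds, how much space they take and how many have not been modified in a year. The extension-to-category mapping and the one-year threshold are assumptions, not recommendations.

```python
import os
import time
from collections import defaultdict

# Hypothetical mapping from file extension to a reporting category.
CATEGORIES = {
    ".log": "logs",
    ".csv": "spreadsheets",
    ".xlsx": "spreadsheets",
    ".txt": "documents",
    ".docx": "documents",
}

def survey(root, stale_after_days=365, now=None):
    """Walk root and summarize file count, bytes and stale files per category."""
    now = time.time() if now is None else now
    cutoff = now - stale_after_days * 86400  # 86400 seconds per day
    report = defaultdict(lambda: {"files": 0, "bytes": 0, "stale": 0})
    for folder, _dirs, names in os.walk(root):
        for name in names:
            info = os.stat(os.path.join(folder, name))
            ext = os.path.splitext(name)[1].lower()
            entry = report[CATEGORIES.get(ext, "other")]
            entry["files"] += 1
            entry["bytes"] += info.st_size
            # A file counts as stale if it was last modified before the cutoff.
            if info.st_mtime < cutoff:
                entry["stale"] += 1
    return dict(report)
```

Calling survey() on a shared drive returns one summary per category, which can serve as the starting point for a discussion about what to keep, secure or delete.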

Once data has been discovered, you can structure it. Structuring quantitative data shouldn’t be an issue, but you’ll still need to assess the level of granularity that will be most valuable. When it comes to qualitative data, you might want to create a structure with tags or extract only the most helpful information from the data set.
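As an example of extracting only the most helpful information, the Python sketch below turns raw web server log lines into named fields and then reduces them to a coarser level of granularity: successful requests per page. It assumes the logs follow the widely used Common Log Format; the sample lines are made up, and your own logs may need a different pattern.

```python
import re
from collections import Counter

# Pattern for Common Log Format-style lines; your logs may use another layout.
LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LINE.match(line)
    return match.groupdict() if match else None

def hits_per_path(lines):
    """Coarser granularity: count successful (2xx) requests per path."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record and record["status"].startswith("2"):
            counts[record["path"]] += 1
    return counts

# Made-up log lines used only for illustration.
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:56:09 +0000] "GET /missing HTTP/1.1" 404 153',
]
print(hits_per_path(sample))  # Counter({'/pricing': 2})
```

Whether you keep every parsed field or only the per-page totals depends on which level of detail turns out to be most valuable.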

When data is properly structured, the next step is integrating it into an existing data system. A centralized system makes data more accessible and allows you and your team to find connections between different data sets. You can then place those data sets in a broader context and generate insights on a higher level. For instance, the log files from your website or your mobile app will help you get a better idea of what kind of experience these channels deliver and could be valuable when analyzed along with data from your sales department.
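To make the website and sales example concrete, here is a small Python sketch that joins daily page views with daily orders and computes orders per 100 page views. All figures are hypothetical, and the date-keyed dictionaries simply stand in for whatever your centralized system would provide.

```python
# Hypothetical daily figures: page views derived from web logs, orders from sales.
page_views = {"2024-03-01": 1200, "2024-03-02": 950, "2024-03-03": 1430}
orders = {"2024-03-01": 36, "2024-03-02": 19, "2024-03-04": 22}

def conversion_by_day(views, sales):
    """Join the two data sets on date and compute orders per 100 page views."""
    rows = []
    # Only days present in both data sets can be compared.
    for day in sorted(views.keys() & sales.keys()):
        if views[day] == 0:
            continue  # skip days without traffic to avoid dividing by zero
        rate = 100 * sales[day] / views[day]
        rows.append((day, views[day], sales[day], round(rate, 2)))
    return rows

for row in conversion_by_day(page_views, orders):
    print(row)
# ('2024-03-01', 1200, 36, 3.0)
# ('2024-03-02', 950, 19, 2.0)
```

Neither data set shows a conversion rate on its own; the figure only appears once the two are combined.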

Ideally, you should create a plan for storing, structuring and analyzing data as it is being generated to minimize dark data production in the future.

Dark data represents a real challenge because of the size of the data sets as well as the time and resources required to discover and analyze them. There is, however, strong potential for generating better analytics, improving business processes, products and customer service, and ultimately increasing revenue streams. Integrating a larger quantity of data into your system will result in more accurate insights that show the whole picture, including for processes not previously analyzed due to a lack of structured data.

Midmarket CIO Forum kicks off April 8-10 in Savannah and is proud to showcase offerings from some of the market’s leading and emerging business intelligence and analytics service providers.