And the real-life challenges midsized enterprises are juggling as they scale to achieve them

The practice of data discovery and classification varies widely depending on a company’s size, the industry it operates in and the kind of information it needs to track to meet that industry’s requirements. Classification methods and outcomes vary too: while larger organizations embrace technology like data loss prevention (DLP) solutions, SMBs and midsized enterprises with limited budgets and small teams rely on manual methods to keep track of their data. I recently sat down with Mark Dillon, VP of Information Technology at Waterloo North Hydro, for perspective on the real-life challenges midsized enterprises are juggling as they scale data discovery and classification processes.

James Mignacca (JM): How has the process of data discovery and classification evolved over the last several years? 

Mark Dillon (MD): Data sprawl is a growing issue and it’s getting worse with the proliferation of data within business networks. Historically, IT leaders were rarely expected to know where all of the data lived within the organization, except to whatever degree their particular industry required.

Municipalities are an example of an industry that loves data classification. They’re held to a number of standards and need classification for things like freedom of information access requests and data retention requirements. They, and organizations conforming to similar requirements, tackled data classification through similar systems: data would be put in walled networks with double logins to secure it within the network perimeter, but people could still copy data and nobody would check it. Data loss prevention (DLP) technologies introduced a new way to track data and get notified when data was moved out of the network, but those notifications never came in real time. DLP solutions were also very expensive, and usually only larger organizations could afford them.

JM: How are most small or midsized enterprises addressing data discovery today? 

MD: Midsized enterprises don’t necessarily do data discovery so much as data classification, and usually that’s achieved through folder structures and metadata tagging. Some of that process can be a little more sophisticated and automated, depending on the document type in question and whether teams can apply OCR (optical character recognition) to make some documents searchable. Overall, small and medium-sized businesses just don’t have the staff to handle the heavy lift that manual, traditional processes require.

JM: What gaps do the traditional methods create? 

MD: These processes are impractical to manage and barely work. One of the greatest challenges they create is that teams believe in a policy that isn’t enforceable. They’re making assumptions about who has file access, about file integrity and about retention periods, which leads to bad work instructions, work knowledge gaps, creeping permissions and, overall, bad discipline. The workload behind these manual processes is so large that teams can’t manage it effectively.

JM: Are there best practices companies can use to lead program improvements? 

MD: Teams need to rethink the way they’re doing classification and, holistically, how it supports operational, security and data privacy requirements. People tend to fall back to a folder structure, fully indexing instead of searching. We’ve also reached a perfect storm: a convergence of different data storage methods. We have a new generation of search-first employees who don’t think in folders, so operationally businesses are working in departmental silos with a lot of unstructured data. Fundamentally, teams need to change the way they store and classify their data. Embracing automation and an outcome-based approach will drive a scalable and more secure solution.

JM: Thinking of today’s security and data privacy use cases, what sorts of outcomes should businesses expect from their data discovery and classification processes?

MD: Fundamentally, outcomes should address core operational and security-focused pillars. First, teams need to understand their unique risks based on the data they use and store, and how much risk the organization is willing to accept. Second, teams need to address data integrity and access control to limit who can access and share data based on data sensitivity. Lastly, information is knowledge. Understanding what information your business has can help you get more value out of that information. Achieving these pillars, with buy-in from every department, helps institute data privacy and protection best practices across the organization, which will go a long way in optimizing operational and security processes.
