The future is Lean Data, not Big Data
Big Data has dominated the headlines in the technology and business media for some time. In recent years it has also become much more widely discussed in the general media as innovations spread from traditional roots in online retail and social media to other sectors such as Healthcare, Government, Manufacturing and Finance. Research by Deloitte suggested that during 2012 90% of Fortune 500 companies would pursue Big Data projects. Undoubtedly there are numerous, indisputable and powerful use cases demonstrating the social, environmental and economic benefits of leveraging Big Data. But is following the trend of hoarding data in the hope of finding nuggets of insight really the right approach?
Whilst many commentators refer to Data as the new oil, it has been a key corporate asset for some time, decades in fact. The first Data Warehouse was conceived in the 1960s, and as relational databases grew in popularity in the 1980s so did corporate data volumes. This growth led to control challenges, and in response a number of organisations and frameworks formed to bring structure, organisation and planning to Data Management. A variety of sectors began to adopt the principles espoused by organisations such as DAMA and BCS and frameworks such as Zachman and TOGAF. Many of these approaches focused on eliminating waste and a belief that ‘less is more’: the objective was to minimise the storage, movement, cleansing and processing of data. Big Data, to an extent, therefore represents the antithesis of these traditional principles.
We are now in the era of feast, not famine. Falling processing and storage costs, together with a range of new technologies and approaches including NoSQL databases, Machine Learning and Natural Language Processing, have enabled many organisations’ data volumes to grow without causing meltdowns. If anything, firms such as Facebook, Google and eBay have demonstrated that accumulating huge volumes of data can be immensely valuable and yield previously undiscoverable insights. Clearly many facets of Big Data are quite distinct from traditional Data Management tools and techniques. In fact the latest data analysis techniques are so different from historical ones that many herald the triumph of Data-driven decision making over gut instinct. Does this mean, however, that the Big Data blueprint should be embraced by all types of organisations and that greater data storage and processing is the answer? Not quite.
Whilst Data Management costs have fallen as processor and storage technologies have advanced, that progress was driven by historical requirements for data; one could argue that falling costs have been not only a significant driver of Big Data innovation but also a result of it. Many studies using cost-per-gigabyte measures also show that the decline in storage costs has slowed over the last six years, and some even believe storage costs will begin to rise again. In many ways Big Data’s growth therefore shares characteristics with other societal trends such as car ownership and road congestion. Historically it was believed that building more roads would reduce congestion. In fact the reverse has happened: the effective cost to drivers of using the road network has fallen, leading to more drivers making more journeys. Big Data techniques are still developing and powerful innovations continue to emerge, but one trend is clear – the genie is out of the bottle and future infrastructure cost reductions are likely to be rapidly consumed by new data requirements.
Big Data isn’t solely about storage, however; it’s also about processing and interpreting data. One major, often overlooked constraint in this area is Information Overload and our limitations as humans in absorbing and processing ever-growing volumes of data. Many studies have found worrying trends in support of this, such as shortening attention spans and falling IQ scores, particularly in highly developed economies. Within the Big Data domain, inference algorithms, Natural Language Processing and other semantic technologies are reducing the requirement for detailed human decision making. How far organisations will go in delegating decision making to machines, however, depends on society’s ability to progress related issues such as democratising decision making, privacy, data security and control. Indeed, with eminent thought leaders such as Elon Musk and Stephen Hawking amongst those voicing concerns about Artificial Intelligence developments, it’s unlikely that the need for considerable human oversight will disappear any time soon.
So what’s the solution? It appears organisations may be damned if they fully embrace Big Data, but also damned if they don’t. One movement that has grown in popularity in recent years while retaining its traditional roots may be able to help. Lean as a philosophy has been around for some time, originating in manufacturing, and has recently been applied very successfully to areas as diverse as Change Management, start-ups and project delivery. It uses a number of techniques and principles to focus on reducing waste – any activity that doesn’t directly create value for customers, whether those customers are internal stakeholders or external clients. Applying this framework to Information Management has yielded some interesting results which go some way towards helping organisations decide what level of Big Data adoption is right for them. A central premise is that every piece of Data within an organisation’s Information landscape should in some way be linked to creating value for the end customer and to the organisation’s objectives – whether that is revenue maximisation, cost reduction or something else. Whilst this is easier said than done, there are a number of Lean Information Management techniques organisations can employ.
Using automated discovery techniques it’s now possible to classify, catalogue, model and define an organisation’s data assets more rapidly and thoroughly than ever before. Metadata discovery, Profiling and Semantic technologies in particular are becoming much more usable and cost effective. Not only does this reduce the time spent finding data, which can be as much as 25% of an employee’s day, it also aids data security, archiving and deletion strategies. Modelling your Data Architecture and Data Management practices is also invaluable for understanding whether data travels between producers and consumers via the shortest path, and this modelling generates valuable metrics to help steer Lean’s hypothesis-driven approach to Data Strategy. Time is also a key form of waste, so another key principle is to collate data for decision making only when absolutely necessary: not all decisions need hard-fought empirical evidence; sometimes common sense and trust are enough.
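To make the profiling idea concrete, below is a minimal sketch of column-level data profiling in Python using pandas. It is illustrative only: the customers.csv file, its columns and the specific metadata fields captured are assumptions, and real metadata discovery tools go considerably further (lineage, semantic classification, sensitivity tagging).

```python
# A minimal sketch of automated data profiling: for each column in a
# dataset, capture basic metadata (inferred type, null rate, distinct
# values) that could feed a data catalogue. The "customers.csv" input
# is hypothetical.
import pandas as pd

def profile_dataframe(df: pd.DataFrame) -> pd.DataFrame:
    """Return column-level metadata suitable for a catalogue entry."""
    rows = []
    for column in df.columns:
        series = df[column]
        rows.append({
            "column": column,
            "inferred_type": str(series.dtype),
            "null_rate": round(series.isna().mean(), 3),
            "distinct_values": series.nunique(dropna=True),
            "sample_value": series.dropna().iloc[0] if series.notna().any() else None,
        })
    return pd.DataFrame(rows)

if __name__ == "__main__":
    # Hypothetical extract; in practice this would be pointed at the
    # tables discovered across an organisation's data landscape.
    df = pd.read_csv("customers.csv")
    print(profile_dataframe(df))
```

Even this simple summary highlights candidates for archiving or deletion (columns that are mostly null, or duplicated across systems), which is exactly the kind of waste a Lean Information Management approach targets.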
If your organisation is considering major investments in Big Data, it’s worth checking whether these key concerns and principles are addressed. If not, perhaps it’s time to consider Lean Data as an alternative.