Experian Data Quality Summit 2014

[Image: Experian Data Quality Summit panel]

Events

Data to Value had the pleasure of being invited to speak at the Experian Data Quality Summit earlier this month in London. This proved to be a great opportunity to discuss the latest trends, ideas and challenges for organisations in many sectors addressing data quality issues. The day kicked off with a fantastic keynote speech by Gary Barnett of Ovum, which many will remember for its highly entertaining metaphors involving penguins, unicorns and donkeys. This was swiftly followed by some thought-provoking insights on the role of Chief Data Officers (CDOs) and the evolving market for Data Quality products and services.

 

James was a member of the expert panel hosted by Dylan Jones of Data Quality Pro.com, which had the challenging task of responding to queries raised by practitioners during the day via Twitter and other channels. Whilst there were many different opinions and perspectives during the day, one thing seemed clear: there has never been a more exciting time to be involved in Data Quality Management.


Data Science on demand – a realistic next step for immature organisations seeking insight?

Blog

Interest in Big Data and Data Science as disciplines, toolsets, roles and philosophies is at an all-time high. Job boards, software tools, specialist consultancies and even university courses are all rapidly expanding to embrace what Harvard Business Review has described as “the sexiest job of the 21st century”. Cynics argue that the discipline is simply the modern application of statistics and nothing new. Proponents point to the difference in scale, the innovative use of technology and the use of new techniques such as machine learning to go beyond the data-centric nature of statistics into modelling and mining knowledge itself.

 

[Image: Ways of achieving a Big Data capability without the infrastructure and expense.]

 

Adoption is growing rapidly across some sectors, with online retail, healthcare, social media and manufacturing amongst those accelerating what McKinsey suggests will be a substantial skills shortage by 2018. Across other sectors, however, adoption is less pronounced and is held back by a variety of factors. The Economist, in a report on Big Data adoption, suggests “a company’s biggest hindrance to gaining value from big data is often itself”, with the two largest inhibitors being a lack of suitable software and in-house skills. The wider market is not, however, the cause of slow adoption: the tools and professional landscape are more diverse, mature and inexpensive than ever.

 

Prior to the Big Data era many firms embarked on enterprise data warehousing and business intelligence projects which were often expensive and prolonged, with returns that were difficult to quantify. These projects were often driven by insight rather than efficiency objectives, much as Data Science and Big Data solutions are promoted today. The two approaches are, however, sometimes unfairly treated as synonymous, with the challenges of one tarnishing the other. Whilst Data Warehousing and BI tools are well geared to answering what Donald Rumsfeld would describe as known unknowns, Data Science and Big Data approaches are much better at providing insight on unknown unknowns. Both still have a very valid place in the Information Management toolkit, addressing somewhat different use cases.

 

So what possible solutions exist for organisations looking to gain insight from large volumes of varied data without the implementation risk? One approach currently evolving is the use of service-based models, with both startups and large players offering a growing number of options inspired by cloud computing. For processing power and advanced machine learning capabilities there are a number of smaller startups now offering services, such as Wise.io, Datumbox and BigML, the last of which offers storage- and prediction-based pricing. For more sophisticated, context-based requirements, IBM have also started to rent out the processing infrastructure of their famous Jeopardy!-winning Watson supercomputer, which has tackled some impressive and diverse use cases ranging from cancer research to cognitive cooking. Not everyone, however, requires the full infrastructure to be provided; for some, the challenge is more one of expertise and leveraging existing best practice. This is an area that startup Algorithmia aims to tackle by creating a marketplace for algorithms accessible via both an API and a code library.
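
As a rough illustration of the service-based model described above, the sketch below shows what a call to a hosted prediction API typically looks like: input data is posted to the provider and a prediction is returned, with no modelling infrastructure maintained in-house. The endpoint, key and field names here are hypothetical; each vendor (BigML, Algorithmia and the others mentioned) defines its own API, authentication and payload format.

```python
import json
import urllib.request

# Hypothetical hosted-prediction endpoint; real services each define
# their own URLs, authentication schemes and payload formats.
API_URL = "https://api.example-ml-service.com/v1/models/churn/predict"
API_KEY = "YOUR_API_KEY"  # issued by the service provider

def predict(record):
    """Send one input record to the hosted model and return its prediction."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps({"input": record}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + API_KEY,
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))

# Score a single customer record without owning any ML infrastructure.
print(predict({"tenure_months": 14, "monthly_spend": 42.5, "complaints": 2}))
```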

 

Clearly there is an ever-growing number of options for those wishing to benefit from Big Data Analytics, Machine Learning and other related Data Science capabilities without the distraction of managing complex infrastructure. There are, however, also a number of risks to understand and prerequisites to maximising the benefits of the service-based approach. Given that many of the solutions are third-party hosted, cloud-based platforms, many of the traditional concerns in this area need to be considered. In addition to these long-debated privacy, information security and support issues, there are intellectual property considerations to weigh carefully when reusing algorithms from a marketplace. More important still is that, to get the most value from any third-party analysis toolset, you need well-understood data quality, definitions and coverage for the data inputs. Without these important prerequisites the unknown unknowns will always remain just that: unknown unknowns.
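
To make that prerequisite concrete, the sketch below shows a minimal pre-flight completeness check that could gate a batch of records before it is sent to any hosted analytics service. The field names and threshold are illustrative assumptions rather than any particular product's rules.

```python
# Minimal pre-flight data quality check, run before shipping records to a
# third-party analytics service. Fields and threshold are illustrative.
REQUIRED_FIELDS = ("customer_id", "tenure_months", "monthly_spend")
MIN_COMPLETENESS = 0.95  # reject the batch if over 5% of values are missing

def check_batch(records):
    """Return (ok, issues) for a list of dict records."""
    issues = []
    for field in REQUIRED_FIELDS:
        present = sum(1 for r in records if r.get(field) is not None)
        completeness = present / len(records) if records else 0.0
        if completeness < MIN_COMPLETENESS:
            issues.append(f"{field}: only {completeness:.0%} populated")
    return (not issues, issues)

ok, issues = check_batch([
    {"customer_id": 1, "tenure_months": 14, "monthly_spend": 42.5},
    {"customer_id": 2, "tenure_months": None, "monthly_spend": 18.0},
])
print(ok, issues)  # -> False ['tenure_months: only 50% populated']
```

A check like this will not guarantee insight, but it stops obviously incomplete data from silently degrading whatever a third-party model returns.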


Grass-roots Data Quality Management as a catalyst for innovation

Blog

Most organisations, irrespective of size, have data quality issues. A study by Omikron involving 200 business managers found that almost half of those questioned rated their confidence in the quality of their data below 60%, with just 7% rating it at 90% or more.[1] These issues degrade and delay decision making and often add risk and cost to day-to-day activities. The last thing professionals with a busy day want is to be scrubbing data before they can start working productively. One often overlooked point, however, is that widespread data quality issues also inhibit creativity, entrepreneurship and innovation at a grass-roots level within organisations.

 

Most of us at some point in our careers have been required to build or enhance ‘creative’ solutions to common operational challenges that businesses face. It’s not always possible to wait for a well-defined project or for the IT department to implement a strategic solution. Sometimes it’s not even clear whether a strategic solution is feasible, hence the need to act creatively and prototype one. These prototypes often manifest as Access databases, monolithic Excel spreadsheets and other intimidating, outdated solutions. They keep a business running but can often create more problems than they solve when they begin to affect data quality and indirectly inhibit innovation.

 

Paradigm-changing discoveries are often made accidentally by those with curiosity and access to suitably clean data. There are numerous examples throughout history, ranging from industrial innovations such as microwave ovens and Teflon to scientific discoveries such as Sommer’s link between Vitamin A deficiency and blindness. It’s difficult enough to place a value on these innovations, let alone estimate the opportunity cost of data quality issues preventing their discovery in the first place. Clearly, however, there is a link between enterprise-wide high quality data and an organisational culture that fosters employee innovation and creativity. Google allocates 20% of employees’ time to ‘innovation’ side projects, which have led to some of its most profitable products, such as Google News, Gmail, Google Earth and Google Maps Street View. Imagine if your organisation could fund this innovation capacity purely through Data Quality Management time savings.

 

[1] http://nmk.co.uk/2014/01/28/companies-need-to-be-doing-more-than-simply-paying-lip-service-to-data-quality/
