Author: jphare

Data Science on demand – a realistic next step for immature organisations seeking insight?

  |   Blog

Interest in Big Data and Data Science as disciplines, toolsets, roles and philosophies is at an all-time high. Job boards, software tools, specialist consultancies and even university courses are all rapidly expanding to embrace the data scientist role, which Harvard Business Review has described as “the sexiest job of the 21st century”. Cynics argue that the discipline is simply the modern application of Statistics and nothing new. Proponents point to the difference in scale, the innovative use of technology and the use of newer techniques such as machine learning to go beyond the data-centric nature of statistics into modelling and mining knowledge itself.

 

Ways of achieving a Big Data capability without the infrastructure and expense.

 

Adoption is growing rapidly in some sectors, with online retail, healthcare, social media and manufacturing amongst those accelerating what McKinsey suggest will be a substantial skills shortage by 2018. In other sectors, however, adoption is less pronounced and is held back by a variety of factors. The Economist, in a report on Big Data Adoption, suggests “a company’s biggest hindrance to gaining value from big data is often itself”, with the two largest inhibitors being a lack of suitable software and in-house skills. Availability is not, however, the underlying cause of slow adoption: the tools and professional landscape are more diverse, mature and inexpensive than ever at present.

 

 

Prior to the Big Data era many firms embarked on enterprise data warehousing and business intelligence projects, which were often expensive and prolonged, with difficult-to-quantify returns. Often these were driven by insight rather than efficiency objectives, in the same way that Data Science and Big Data solutions are often promoted. As a result the two approaches are sometimes unfairly treated as synonymous, with the challenges of one tarnishing the other. Whilst Data Warehousing and BI tools are well geared to answering what Donald Rumsfeld would describe as known unknowns, Data Science and Big Data approaches are much better at providing insight on unknown unknowns. Both approaches still have a very valid place in the Information Management toolkit, addressing somewhat different use cases.

 

So what possible solutions exist for organisations looking to gain insight from large volumes of varied data without the implementation risk? One approach that is currently evolving is the use of service based models, with both startups and large players offering a growing number of options inspired by cloud computing. For processing power and advanced machine learning capabilities there are a number of smaller startups now offering services, such as Wise.io, Datumbox and BigML, the last of which offers storage and prediction based pricing. For more sophisticated and context based requirements IBM have also started to rent out the processing infrastructure of their famous Jeopardy!-winning Watson supercomputer. Watson has tackled some impressive and diverse use cases ranging from cancer research to cognitive cooking. Not everyone, however, requires the full infrastructure to be provided; for some the challenge is more around expertise and leveraging existing best practice. This is an area that startup Algorithmia aim to tackle by creating a marketplace for algorithms accessible via both an API and a code library.

 

Clearly there is an ever growing number of options for those wishing to benefit from Big Data Analytics, Machine Learning and other related Data Science capabilities without the distraction of managing complex infrastructure. There are, however, also a number of risks to understand, as well as prerequisites to maximising the benefits of the service based approach. Given that many of the solutions are third party hosted, cloud based platforms, many of the traditional concerns in this area need to be considered. In addition to these long debated privacy, information security and support issues, there are also intellectual property considerations that have to be weighed carefully when reusing algorithms from a marketplace. More important, however, is the consideration that to really get the most value from any kind of third party analysis toolset you need to have well understood data quality, definitions and coverage of the data inputs. Without these important prerequisites the unknown unknowns will always remain just that, unknown unknowns.
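
As a rough illustration of what checking "coverage of the data inputs" might look like before sending data to a hosted service, here is a minimal Python sketch. The function name field_coverage, the sample records and the rule that None or an empty string counts as missing are all assumptions made for this example; they do not come from any of the services mentioned above.

```python
from typing import Any


def field_coverage(records: list[dict[str, Any]]) -> dict[str, float]:
    """Return, per field, the share of records holding a usable value.

    Assumption for this sketch: a value is 'missing' when the key is
    absent, or the value is None or an empty string.
    """
    fields = sorted({key for record in records for key in record})
    total = len(records)
    coverage: dict[str, float] = {}
    for field in fields:
        filled = sum(
            1 for record in records if record.get(field) not in (None, "")
        )
        coverage[field] = filled / total if total else 0.0
    return coverage


# Hypothetical customer extract, invented purely for illustration.
sample = [
    {"customer_id": 1, "postcode": "E1 4NS", "segment": "retail"},
    {"customer_id": 2, "postcode": "", "segment": "retail"},
    {"customer_id": 3, "postcode": None},
    {"customer_id": 4, "postcode": "SW1A 1AA", "segment": None},
]

coverage = field_coverage(sample)
for name, share in coverage.items():
    print(f"{name}: {share:.0%} populated")
```

A profile like this says nothing about whether the populated values are correct or consistently defined, but it is a cheap first check on whether an input is complete enough to be worth analysing at all.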


Grass roots Data Quality Management as a catalyst for innovation

  |   Blog

Most organisations, irrespective of size, have data quality issues. A study by Omikron involving 200 business managers found that almost half of those questioned rated their confidence in the quality of their data at below 60%, with just 7% reporting a level of 90% or more.[1] These issues degrade and delay decision making and often add risk and cost to day to day activities. The last thing professionals with a busy day want is to be scrubbing data before they can start working productively. One often overlooked point, however, is that widespread data quality issues also inhibit creativity, entrepreneurship and innovation at a grass roots level within organisations.

 

Most of us at some point in our careers have been required to build or enhance ‘creative’ solutions to common operational challenges that businesses face. It’s not always possible to wait for a well-defined project or the IT department to implement a strategic solution. Sometimes it’s not even clear whether a strategic solution is feasible, hence the need to act creatively and prototype one. These prototypes often manifest as Access databases, monolithic Excel spreadsheets and other intimidating, outdated solutions. They keep a business running but can often create more problems than they solve when they begin to affect data quality and indirectly inhibit innovation.

 

Paradigm changing discoveries are often made accidentally by those with curiosity and access to suitably clean data. There are numerous examples of these throughout history, ranging from industrial innovations such as microwave ovens and Teflon to scientific discoveries such as Sommer’s link between Vitamin A deficiency and blindness. It’s difficult enough to place a value on these innovations, let alone estimate the opportunity cost of data quality issues preventing their discovery in the first place. Clearly, however, there is a link between enterprise wide high quality data and an organisational culture that fosters employee innovation and creativity. Google allocate 20% of employees’ time to ‘innovation’ side projects, which have led to some of their most profitable products such as Google News, Gmail, Google Earth and Google Maps Street View. Imagine if your organisation could fund this innovation capacity purely through Data Quality Management time savings.

 


 

[1] http://nmk.co.uk/2014/01/28/companies-need-to-be-doing-more-than-simply-paying-lip-service-to-data-quality/


Data-driven Sailing – Data to Value sponsor International Moth Open

  |   Events

Data to Value sponsored the first UK International Moth open championship event of the year at London’s Queen Mary Sailing Club. The event was hotly contested, with a number of America’s Cup and Olympic sailors in attendance.

As part of the sponsorship package Data to Value provided onsite tracking of every competitor using GPS devices from TracTrac. This enabled live broadcasting of the races over the internet and captured speed, heading and location data for post-race analytics using Tableau.

Data to Value’s Managing Director, James Phare, competing in his UK-designed Rocket, won race 2. Not bad going in a fleet featuring Olympic Gold, Silver and Bronze medallists! Fingers crossed that he’ll be able to repeat that performance a few times as the season progresses!

 

Replays of the races:

http://www.tractrac.com/index.php?page=clubpage&id=20

 

Full press release:

http://www.yachtsandyachting.com/news/175222

 

Post race analytics:

 
