Author: Nigel Higgs

Outside In Data Governance – a value driven approach


Those of us who have been in the data sphere for a while, and in and around Data Governance, will have seen the pitch-decks, watched the webinars, read the blogs and attended the conferences. Some of us will have hired the staff, taken sage advice from expensive consultants and kicked off programmes to get the organisation up the Data Governance maturity curve. It’s almost like a religion: Data Governance is so clearly the answer, so why can’t everybody in the organisation see it? It’s a no-brainer. Unfortunately, speaking as someone who has been a Data Governance practitioner for far too many years, I can honestly say that I have yet to see a fully functioning enterprise-wide Data Governance implementation. Look, I appreciate that could be down to my incompetence, but I know this is not an isolated or unique sentiment. Lots of peers, colleagues and people far smarter than me have been preaching the benefits of data administration, data architecture, data governance, or whatever it will be called next, for many years, and yet many of them struggle to come up with success stories. In fact, when pressed, they often don’t have any!

 

 

So why so much denial? Einstein is reputed to have said something along the lines of ‘the definition of insanity is to keep doing the same thing and expect the outcome to be different’. It is also reputed to be the most wrongly attributed and quoted platitude on the planet! But hey, this is a LinkedIn post and, like most of my writings, nobody will read it.

 

 

What’s that got to do with Data Governance? Well, ‘Outside In Data Governance’ is about approaching the problem from a different angle. There is little doubt that the problem Data Governance is trying to solve is very real. Very few organisations know what data they have got, what it means, where it is, who is responsible for it or what its quality is.

 

 

But how to solve the problem? What I typically hear is that you need to write a policy, form committees, define processes and assign roles, and then everything will be working like clockwork within months: data governed, quality data delivered to users and the organisation flying up the data maturity curve. But is that what happens? Does the story painted in the pitch-decks become reality? Sadly, it very rarely, if ever, does.

 

 

What is needed is a value driven approach. Start with: who are we doing Data Governance for? We are doing it for the business users. Then ask: what are they interested in? They are interested in something that makes their lives easier right now. So ‘Outside In Data Governance’ starts with a single business report and works back from there. Answer those fundamental questions (what, where, who and how good?) about the fields and outputs on the report and make that knowledge accessible. You could do this with a simple Excel-based approach or maybe a Wiki or SharePoint, but pretty soon you will need some tooling to make it scalable and responsive to increasing demands for more reports to be included in the scope. There are ways to do this in a ‘proof of concept’ environment and demonstrate the benefits before committing to spend. A friend of mine is fond of saying ‘it’s easier to ask for forgiveness than for permission’. In this case he is right. There are browser-based tools that sit outside your firewall and can offer this try-before-you-buy approach.
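
As a rough illustration of the ‘single report’ starting point, here is a minimal Python sketch of a field-level catalogue that records the four answers (what, where, who, how good) and shows which ones are still missing. The class, the function and the sample report fields are all made up for illustration; this is one possible shape for the idea, not a prescribed tool or method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReportField:
    """One field on a business report, with the four basic answers."""
    name: str
    meaning: Optional[str] = None        # what does it mean?
    source: Optional[str] = None         # where does it come from?
    owner: Optional[str] = None          # who is responsible for it?
    quality_note: Optional[str] = None   # how good is it?

    def gaps(self) -> list[str]:
        """Return the questions that are still unanswered for this field."""
        questions = {
            "what": self.meaning,
            "where": self.source,
            "who": self.owner,
            "how good": self.quality_note,
        }
        return [q for q, answer in questions.items() if not answer]


def coverage(fields: list[ReportField]) -> float:
    """Share of the four questions answered across all fields (0.0 to 1.0)."""
    if not fields:
        return 0.0
    total = 4 * len(fields)
    missing = sum(len(f.gaps()) for f in fields)
    return (total - missing) / total


# Hypothetical fields from a single, made-up sales report.
report = [
    ReportField("net_revenue", meaning="Invoiced sales less returns",
                source="finance ledger extract", owner="Finance",
                quality_note="reconciled monthly"),
    ReportField("region", meaning="Sales region of the customer",
                source="CRM export"),
]

for f in report:
    print(f.name, "missing:", f.gaps())
print(f"coverage: {coverage(report):.0%}")
```

Even something this small gives users a visible answer to ‘what do we know about this report?’ and makes the remaining gaps obvious, which is the kind of feedback loop the approach relies on before any spend on tooling.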

 

 

This is what a value driven and lean approach is all about. If what you do at this small scale doesn’t get traction, then what makes you think a £250k project will end up any better? Start small, ensure you get honest feedback from users at every iteration of your solution and focus on delivering value. If you bring the data users with you, they will demand that the capability is extended. Beat the Einstein quote and start from the ‘Outside In’.

 


Top 21 Open Data sources


Data is everywhere, created and used by just about anyone. The days when companies or individuals had to pay significant sums of money to access useful and interesting datasets are long gone. Here is our top 21 list of the best free data sources available online.

 

1. Data.gov.uk the UK government’s open data portal including the British National Bibliography – metadata on all UK books and publications since 1950.

 

2. Data.gov search through 194,832 USA data sets about topics ranging from education to agriculture.

 

3. US Census Bureau latest population, behaviour and economic data in the USA.

 

4. Socrata – software provider that works with governments to provide open data to the public; it also has its own open data network to explore.

 

5. European Union Open Data Portal thousands of datasets about a broad range of topics in the European Union.

 

6. European Data Portal is a European portal that harvests metadata from public sector portals throughout Europe. EDP therefore focuses on data made available by European countries. In addition, EDP also harvests metadata from ODP.

 

7. DBpedia crowd-sourced community trying to create a public database of all Wikipedia entries.

 

8. The New York Times a searchable archive of all New York Times articles from 1851 to today.

 

9. Dataportals.org datasets from all around the world collected in one place.

 

10. The World Factbook information prepared by the CIA about, what seems like, all of the countries of the world.

 

11. NHS Health and Social Care Information Centre data sets from the UK National Health Service.

 

12. Healthdata.gov detailed USA healthcare data covering loads of health related topics.

 

13. UNICEF statistics about the situation of children and women around the world.

 

14. World Health Organisation statistics concerning nutrition, disease and health.

 

15. Amazon Web Services large repository of interesting data sets including the Human Genome Project, NASA’s database and an index of 5 billion web pages.

 

16. Google Public Data Explorer search through already mentioned and lesser-known open data repositories.

 

17. Gapminder a collection of datasets from the World Health Organisation and World Bank covering economic, medical and social statistics.

 

18. Google Trends analyse the shift of searches throughout the years.

 

19. Google Finance real-time finance data that goes back as far as 40 years.

 

20. UCI Machine Learning Repository a collection of databases for the machine learning community.

 

21. National Climatic Data Center world’s largest archive of climate data.

 

For more interesting articles, projects and events, visit our news section or contact us directly.


Introducing the Data To Value Lean Data Process


Here at Data To Value we have pooled the many years of experience of our partners, consultants and affiliates and taken what works best to create an iterative and agile data development methodology. This work has evolved into the Lean Data Process, and we will be building out each of the components over the coming months as we continue to apply the principles, tools and techniques to the real problems that our customers experience.

 

The overall approach is represented by the Lean Data Framework which encapsulates the data development and management life-cycle. All data projects should be driven by business need and the start of the process is always the business requirement or the business problem to be addressed.

The top and bottom of the stack below are business owned domains. These are supported by the middle two domains which are specialist IT domains where the business requirements are satisfied and the business problems are solved using technology.

 

We also recognise that the world of data and information is changing rapidly. New technologies for data management are coming on the scene in quick-fire succession. The Lean Data Framework covers all types of data (structured, semi-structured and unstructured), whether it is stored in databases, files, email, websites, content stores or anywhere else it needs to be understood, used, tagged and accessed.

 

Lean Data Framework

 

The Lean Data Process uses a Build, Measure, Learn cycle to create a continuous development environment geared to delivering rapid business benefit. A very popular engagement is our Lean Data Quality service. Another typical application is the creation of an enterprise-wide information model. This and other components from the overall process will be described in more detail in future posts.

 

To achieve accelerated delivery we leverage partnerships with innovative software tool vendors. Each tool supports specific areas of the process. The benefit to our customers is that following a consulting engagement they will be left with real collateral and not merely PowerPoint slides.

 

 

Lean Data Tool Accelerators

Each tool has specific capabilities and components that support those capabilities. We have demonstrations of each tool and how it fits within the Lean Data Process, which will be shared in future posts. If you want to find out more about the process and the tools, contact us here.

 

 

Lean Data Framework Tools Stack

 

Semanta Encyclopaedia

 

Experian Pandora

 

Manta Tools

 

PoolParty Thesaurus Server

 

PoolParty Semantic Integrator

PoolParty Power Tagging

PoolParty Web Mining

 

 

 

 
