Data engineers: your knowledge bridge
By Rob Harris
The post Data engineers: your knowledge bridge appeared first on data.world.
I’m a bit of a juggler at heart. For the last twenty years, I’ve supported data engineers, scientists, and more in their line of work. And I’ve been everywhere from the smallest of startups, to massive data centers (some even hosting military data).
If I’ve learned one thing in this time, it’s this: data is everywhere, and it’s getting bigger. Organizations big and small need to figure out what all this data means.
But how do you make sense of it all?
Think about our words
Some writing systems have thousands of characters. Many alphabets have only around thirty letters.
Those characters make up hundreds of thousands of words. Those words fill millions of books, entertainment, music, and libraries full of knowledge. Those libraries fuel billions of ideas.
Communication is challenging. But it’s critical. The common understanding and alignment around language at a global scale is an incredible example of its importance.
Just a few letters can create hundreds of millions of meanings, and so can data. For your business to be successful in today’s data-driven economy, your data must be like words: intuitive to understand. Not just for your IT or data engineering teams, but for everyone.
We get carried away by the day-to-day work, and get used to the status quo. As a data engineer, you tolerate requests like:
- Can you email me that dataset?
- Who manages that data?
- Could you create a login for me to our data lake?
- Is this the right data to use for this analysis?
Ad-hoc requests come from everywhere. Instead of feeling like an engineer who builds a bridge across the river, you’re throwing life preservers to those who try to swim.
Our data workflows today are reactive, not proactive. The good news is there’s a better way for data and analytics teams to operate, and it starts with a new framework for collaboration.
There’s no silver bullet
The geek in me hates to admit it, but no tool or technology alone can solve this problem. A combination of people, process, and technology is imperative to a world-class data strategy.
A data catalog is your gateway to data empowerment. A catalog powers data democratization that, in turn, fuels analytics-driven creativity and insights. That’s how we can unlock an environment where data is fully embedded in everyone’s daily work.
Tedious, repetitive data work
The data engineering team at Aceable, a digital, mobile-first driver’s education platform, loves its data. Its data management process, however, left something to be desired.
For example, here’s what a data analyst had to do in order to create a single dashboard:
- ask a data engineer to transform and clean data, which could take up to a week
- ask the BI team to turn that data into a visualization, which added another one to three days to the process
- finally get the data, only to find out it’s a static graph or chart that’s now outdated due to the lag between request and fulfillment. Start over.
That’s a lot of friction for a single dashboard. And this is for a report that’s produced quarterly, monthly, or possibly even more often.
Naturally, this level of difficulty discourages people from using the data, even if they know how important it is to be data-driven. But we can make people, process, and technology work effectively together. It starts by building the bridge.
Let’s cross the river
There are only so many life preservers a data engineer can throw out before they run out. It’s a delicate balancing act of deciding which task or request is most important.
That’s why you need a bridge. It’s more sustainable, allows you to support more people, and is more rewarding. As a data engineer, your role must evolve to handle more requests. That can only happen if you establish repeatable workflows and pipelines.
Check out our blog about the role of the data product manager, an increasingly important function in your data ecosystem.
Aceable’s bridge looked like this:
- Data engineers wired up MongoDB data directly to data.world
- The BI team set up queries, self-service query templates, and dashboards within the data catalog platform
- Data consumers would then log into data.world, run the query, and pull the data they needed
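The self-service step above can be sketched roughly as follows. This is a minimal illustration, not Aceable’s or data.world’s actual implementation: it uses Python’s built-in sqlite3 module as a stand-in for the connected data source, and the `enrollments` table, the `enrollments_by_state` template, and the `run_template` helper are all hypothetical names invented for the example. The idea is that the BI team publishes a named, parameterized query once, and data consumers run it with their own parameters instead of filing a request.

```python
import sqlite3

# Hypothetical templates a BI team might publish once for everyone to reuse.
# The template name, table, and columns are illustrative assumptions.
QUERY_TEMPLATES = {
    "enrollments_by_state": (
        "SELECT state, COUNT(*) AS enrollments "
        "FROM enrollments WHERE course = :course "
        "GROUP BY state ORDER BY state"
    ),
}


def run_template(conn, name, params):
    """Run a published template with bound parameters; return rows as dicts."""
    if name not in QUERY_TEMPLATES:
        raise KeyError(f"unknown query template: {name}")
    cursor = conn.execute(QUERY_TEMPLATES[name], params)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def build_demo_source():
    """Create an in-memory database standing in for the connected source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE enrollments (student TEXT, state TEXT, course TEXT)")
    conn.executemany(
        "INSERT INTO enrollments VALUES (?, ?, ?)",
        [
            ("a", "TX", "drivers_ed"),
            ("b", "TX", "drivers_ed"),
            ("c", "CA", "drivers_ed"),
            ("d", "CA", "defensive_driving"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_source()
    # A data consumer picks a template and supplies only the parameter.
    for row in run_template(conn, "enrollments_by_state", {"course": "drivers_ed"}):
        print(row)
```

Because the SQL lives in one shared place and consumers pass only parameters, the same query can be reused across teams without a new engineering request each time.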
The result: Aceable went from multiple points of friction to self-service analytics on findable, understandable, trustworthy data, on demand. It’s a win-win for data analysts and engineers because data workflows and pipelines can now be reused across the business. Meanwhile, data engineers are doing more interesting and rewarding work than ever before.
Aceable’s team collaborated on data, captured meaning and context in a way they never could before, and made critical business decisions in real time.
Data engineers: it’s time to build your data and knowledge bridge.
Click here to learn more about Aceable’s story.