
We added a summarised version below for those who prefer the written word, made easy for you to skim and record top insights! 📝
Additional note from community moderators: We’re presenting the insights as-is and do not promote any specific tool, platform, or brand. This is to simply share raw experiences and opinions from actual voices in the analytics space to further discussions.
Prefer watching over listening? Watch the Full Episode here ⚡️
Matthew is a Senior Data Engineer with a rich background in building scalable data solutions across industry giants like Meta, Disney, and Nielsen. With a strong focus on IoT data, cloud technologies, and high-performance data pipelines, Matthew brings deep expertise in handling complex datasets and optimizing data infrastructure. Beyond his technical skills, he’s an active writer on Medium, sharing valuable insights with the data engineering community.
We’ve covered a RANGE of topics with Matthew. Dive in! 🤿
At Samsara, my focus is on enabling data usage and literacy across the organization, ensuring high-quality, timely data for stakeholders in the IoT space. Having worked with large-scale data systems at Meta and Disney, I’ve seen common challenges everywhere—whether it’s gigabytes or petabytes of data, issues like quality and late-arriving data persist. The key is leveraging past experience to navigate them smoothly.
Defining North Star metrics starts with close collaboration with stakeholders to identify key success indicators. These must be backed by strong data queries and flexible slicing on meaningful dimensions. A metric is only as useful as the context it provides, so ensuring its stability means keeping an eye on its underlying dimensions and continuously refining them.
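As a rough illustration of "flexible slicing on meaningful dimensions", the sketch below aggregates one metric over any chosen set of dimensions. The field names (`region`, `device_type`, `active_hours`) and the `slice_metric` helper are hypothetical, invented for this example and not taken from the interview.

```python
from collections import defaultdict

def slice_metric(events, dimensions, value_field):
    """Sum value_field over events, grouped by the given dimensions."""
    totals = defaultdict(float)
    for event in events:
        # The group key is the tuple of this event's dimension values.
        key = tuple(event[d] for d in dimensions)
        totals[key] += event[value_field]
    return dict(totals)

# Hypothetical sample events; a real metric would come from a warehouse query.
events = [
    {"region": "NA", "device_type": "camera", "active_hours": 5.0},
    {"region": "NA", "device_type": "sensor", "active_hours": 3.0},
    {"region": "EU", "device_type": "camera", "active_hours": 4.0},
    {"region": "NA", "device_type": "camera", "active_hours": 2.0},
]

by_region = slice_metric(events, ["region"], "active_hours")
by_region_device = slice_metric(events, ["region", "device_type"], "active_hours")
```

The same metric total should hold no matter which dimensions are used for slicing; if the totals diverge, that usually points to a problem in an underlying dimension, such as missing or duplicated values.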
Data quality is non-negotiable. The goal is to detect issues before stakeholders do, using proper checks, observability, and alerts. Understanding data deeply—knowing primary keys, trends, and expected fluctuations—helps in setting up proactive monitoring. If issues arise, fast detection and clear communication ensure minimal disruption.
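The checks mentioned above could look something like the following minimal sketch: one check on primary keys (missing or duplicated values) and one on expected fluctuations in daily row counts. The function names, the sample rows, and the three-standard-deviation threshold are assumptions made for illustration; in practice these checks would typically run inside an orchestration or observability tool and feed an alerting system.

```python
from statistics import mean, stdev

def check_primary_key(rows, key):
    """Return key values that are missing (None) or duplicated."""
    seen, bad = set(), []
    for row in rows:
        value = row.get(key)
        if value is None or value in seen:
            bad.append(value)
        else:
            seen.add(value)
    return bad

def check_volume(history, today, max_sigma=3.0):
    """Flag today's row count if it deviates from the historical mean
    by more than max_sigma sample standard deviations."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > max_sigma

# Hypothetical IoT readings with one duplicate and one missing key.
rows = [
    {"device_id": "a1", "reading": 20.5},
    {"device_id": "a2", "reading": 21.0},
    {"device_id": "a1", "reading": 19.8},  # duplicate key
    {"device_id": None, "reading": 22.1},  # missing key
]

bad_keys = check_primary_key(rows, "device_id")
volume_alert = check_volume([1000, 1020, 980, 1010, 990], today=400)
```

Here `bad_keys` collects the offending key values and `volume_alert` is true when the day's volume falls outside the expected range, which is the kind of signal that lets the team detect an issue before stakeholders do.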
Governance has evolved significantly, especially with regulations like GDPR. Early on, many teams scrambled to comply, but today, security and compliance are foundational. At Disney, handling sensitive customer data meant hashing and limiting exposure. Compliance isn't just a legal necessity—it’s key to building trust.
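The interview does not describe how the hashing was actually done, so the following is only a sketch of one common approach: a keyed hash (HMAC-SHA256) to pseudonymize a sensitive field, plus an allow-list to limit which fields are exposed downstream. The field names, the `SECRET_KEY` placeholder, and both helper functions are hypothetical.

```python
import hashlib
import hmac

# Hypothetical placeholder; a real key would be loaded from a secrets manager.
SECRET_KEY = b"replace-with-a-managed-secret"

SENSITIVE_FIELDS = {"email"}                   # hashed before exposure
ALLOWED_FIELDS = {"email", "country", "plan"}  # everything else is dropped

def pseudonymize(value: str) -> str:
    """Return a keyed SHA-256 hash of a normalized value as a hex string."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(SECRET_KEY, normalized, hashlib.sha256).hexdigest()

def limit_exposure(record: dict) -> dict:
    """Keep only allow-listed fields and hash the sensitive ones."""
    out = {}
    for field, value in record.items():
        if field not in ALLOWED_FIELDS:
            continue
        out[field] = pseudonymize(value) if field in SENSITIVE_FIELDS else value
    return out

record = {
    "email": "Jane@Example.com",
    "country": "US",
    "plan": "premium",
    "card_number": "0000-0000-0000-0000",  # dummy value, dropped by the allow-list
}
safe = limit_exposure(record)
```

A keyed hash is used here rather than a plain hash because unkeyed hashes of low-entropy values such as email addresses can often be reversed by guessing. Normalizing before hashing keeps the same customer joinable across datasets without exposing the raw value.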
Start with software engineering fundamentals; they provide a strong foundation for data engineering. SQL, data modeling, and pipeline design are essential, but sound programming practices make the difference. Fundamentals of Data Engineering by Joe Reis and Matt Housley is a must-read, as core principles remain relevant despite rapid industry changes.
Data modeling is harder to master because there’s no single "right" approach. Models must be scalable, efficient, and easy for stakeholders to use, which can be conflicting goals. It’s a skill learned over time through experience, not something that can be "mastered" overnight.
Migrating legacy infrastructure to modern cloud solutions is challenging but essential. A Nielsen project required transitioning connected TV data from local servers to cloud-based solutions like AWS and Databricks to handle big data growth. Staying updated with evolving trends in data engineering is key, as today's modern platforms will be legacy tomorrow.
Small-scale experimentation is the best way to assess new tools. For instance, DuckDB offers fast local querying, making it useful for testing without cloud overhead. When evaluating tools, data engineers should balance immediate wins with long-term projects, considering migration effort and business priorities.
Leading a data migration project early in my career helped me grasp the full data pipeline—ingestion, transformation, and delivery to stakeholders. Hands-on experience, whether through professional projects or personal data sets, is crucial for mastering data engineering concepts.
Three major challenges for data engineers:
Generative AI and LLMs are reshaping the data ecosystem, but companies must be cautious in their adoption. While these tools can enhance processes like data cataloging, ensuring they retrieve the right datasets remains a challenge. They're not perfect, but when they work well, they can make life much easier.
Semantic layers are gaining traction as they help align data with business context. While I haven’t fully developed one, we’re working toward it. It’s crucial to have a structured layer that allows people to easily locate and understand where data is housed.
The impact of data engineering is measured by how well insights drive business success. If teams can access timely, accurate data to inform decisions, we know we're making an impact. While it may not always be visible in direct sales figures, effective data enables high-performing teams and better business outcomes.
Yes, data catalogs help reduce ad-hoc requests by centralizing data discovery, which cuts down on repeated questions like "where is this data?" Maintaining them takes effort, but they free up engineers to focus on development rather than answering repetitive questions. The tools are improving, and broader adoption enables more self-service analytics.
Nothing is ever truly future-proof in data. The best approach is designing for scalability and flexibility—anticipating evolving needs instead of solving just one problem at a time. Constant collaboration with stakeholders and awareness of upcoming changes help avoid rework and inefficiencies.
Everyone interacts with data products—whether enabling clickstream analytics at Disney or building core datasets at Samsara. The key is creating structured, reusable data solutions tailored to stakeholder needs.
Data Engineering Weekly is my go-to resource—shoutout to Ananth for that. I also follow company blogs from DoorDash, Meta, Spotify, and others leading in data engineering. Staying updated is key because no one works in isolation—ideas evolve and influence the industry.
I’m still working on having a life outside of work! But when I do, concerts, sports, and getting outdoors help me unwind—Seattle winters make that tough, though.
📝 Note from Editor
The above insights are summarised versions of Matthew’s actual dialogue. Feel free to refer to the transcript or play the audio/video to capture the true essence and details of his as-is insights. There’s also a lot more information and hidden bytes of wonder in the interview, so listen in for a treat!
Thanks for reading Modern Data 101! Subscribe for free to receive new posts and support our work.