Description

This project aims to develop a comprehensive Data Observability Dashboard that provides insights into key aspects of data quality and reliability. The dashboard will track:

Data Freshness: Monitor when data was last updated and flag potential delays.

Data Volume: Track table row counts to detect unexpected surges or drops in data.

Data Distribution: Analyze data for null values, outliers, and anomalies to ensure accuracy.

Data Schema: Track schema changes over time to prevent breaking changes.

The dashboard will also keep a history of these metrics, enabling proactive data management and increasing trust in data across the data function.
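As a rough illustration of the four checks listed above, here is a minimal Python sketch. It is not the project's implementation: the function names, the input shapes (plain values instead of Redshift query results), and any thresholds are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_updated, now, max_age):
    """Freshness: True when the last update is older than the allowed age."""
    return now - last_updated > max_age


def volume_change_ratio(previous_rows, current_rows):
    """Volume: relative change in row count between two snapshots."""
    if previous_rows == 0:
        return float("inf") if current_rows > 0 else 0.0
    return (current_rows - previous_rows) / previous_rows


def null_rate(values):
    """Distribution: fraction of values that are None (SQL NULL)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def schema_diff(old_schema, new_schema):
    """Schema: columns added, removed, or retyped between two snapshots.

    Each schema is a dict mapping column name to type name.
    """
    added = sorted(set(new_schema) - set(old_schema))
    removed = sorted(set(old_schema) - set(new_schema))
    retyped = sorted(
        col
        for col in set(old_schema) & set(new_schema)
        if old_schema[col] != new_schema[col]
    )
    return {"added": added, "removed": removed, "retyped": retyped}


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    now = datetime(2024, 11, 20, 12, 0, tzinfo=timezone.utc)
    print(is_stale(now - timedelta(hours=30), now, timedelta(hours=24)))
    print(volume_change_ratio(1000, 400))
    print(null_rate(["a", None, "c", None]))
    print(schema_diff({"id": "int", "name": "varchar"},
                      {"id": "bigint", "email": "varchar"}))
```

In practice the inputs would come from queries against the warehouse, and the alerting thresholds (for example, how large a volume drop counts as unexpected) would need to be tuned per table.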

Goals

Although the final goal is a Power BI dashboard that we can monitor, our goals for this project are to:

1. Create the necessary tables that track the relevant metadata about our current data.
2. Automate the process so that it runs on a regular schedule.
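One possible shape for such a metadata table is an append-only snapshot log, sketched below. Everything here is an assumption made for illustration: the `table_snapshots` table, its columns, and the helper functions are hypothetical, and SQLite from the Python standard library stands in for Redshift so that the example runs anywhere.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical table layout; one row per table per collection run.
CREATE_SNAPSHOTS = """
CREATE TABLE IF NOT EXISTS table_snapshots (
    table_name   TEXT    NOT NULL,
    captured_at  TEXT    NOT NULL,  -- ISO 8601 timestamp, UTC
    row_count    INTEGER NOT NULL,
    last_updated TEXT               -- NULL when unknown
)
"""


def record_snapshot(conn, table_name, row_count, last_updated=None, captured_at=None):
    """Append one snapshot row; rows are never updated, so history is kept."""
    captured_at = captured_at or datetime.now(timezone.utc)
    conn.execute(CREATE_SNAPSHOTS)
    conn.execute(
        "INSERT INTO table_snapshots VALUES (?, ?, ?, ?)",
        (
            table_name,
            captured_at.isoformat(),
            row_count,
            last_updated.isoformat() if last_updated else None,
        ),
    )
    conn.commit()


def row_count_history(conn, table_name):
    """Return (captured_at, row_count) pairs for one table, oldest first."""
    cur = conn.execute(
        "SELECT captured_at, row_count FROM table_snapshots "
        "WHERE table_name = ? ORDER BY captured_at",
        (table_name,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    record_snapshot(conn, "orders", 100,
                    captured_at=datetime(2024, 11, 18, tzinfo=timezone.utc))
    record_snapshot(conn, "orders", 120,
                    captured_at=datetime(2024, 11, 19, tzinfo=timezone.utc))
    print(row_count_history(conn, "orders"))
```

Because each run only appends rows, the dashboard can chart trends over time, and a scheduler such as Airflow could call `record_snapshot` once per monitored table on whatever interval suits the data.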

Resources

AWS Redshift, AWS Glue, Airflow, Python, SQL

Why Hedgehogs?

Because we like them.

Looking for hackers with the skills:

sql python

This project is part of:

Hack Week 24

Activity

  • 6 months ago: ihannemann joined this project.
  • 6 months ago: ihannemann liked this project.
  • 6 months ago: gsamardzhiev liked this project.
  • 6 months ago: gsamardzhiev added keyword "sql" to this project.
  • 6 months ago: gsamardzhiev added keyword "python" to this project.
  • 6 months ago: gsamardzhiev started this project.
  • 6 months ago: gsamardzhiev originated this project.
