Description

This project aims to develop a comprehensive Data Observability Dashboard that provides r insights into key aspects of data quality and reliability. The dashboard will track:

Data Freshness: Monitor when data was last updated and flag potential delays.

Data Volume: Track table row counts to detect unexpected surges or drops in data.

Data Distribution: Analyze data for null values, outliers, and anomalies to ensure accuracy.

Data Schema: Track schema changes over time to prevent breaking changes.

The dashboard's aim is to support historical tracking to support proactive data management and enhance data trust across the data function.

Goals

Although the final goal is to create a power bi dashboard that we are able to monitor, our goals is to 1. Create the necessary tables that track the relevant metadata about our current data 2. Automate the process so it runs in a timely manner

Resources

AWS Redshift; AWS Glue, Airflow, Python, SQL

Why Hedgehogs?

Because we like them.

Looking for hackers with the skills:

sql python

This project is part of:

Hack Week 24

Activity

  • about 1 year ago: ihannemann joined this project.
  • about 1 year ago: ihannemann liked this project.
  • about 1 year ago: gsamardzhiev liked this project.
  • about 1 year ago: gsamardzhiev added keyword "sql" to this project.
  • about 1 year ago: gsamardzhiev added keyword "python" to this project.
  • about 1 year ago: gsamardzhiev started this project.
  • about 1 year ago: gsamardzhiev originated this project.

  • Comments

    Be the first to comment!

    Similar Projects

    Update M2Crypto by mcepl

    There are couple of projects I work on, which need my attention and putting them to shape:

    Goal for this Hackweek

    • Put M2Crypto into better shape (most issues closed, all pull requests processed)
    • More fun to learn jujutsu
    • Play more with Gemini, how much it help (or not).
    • Perhaps, also (just slightly related), help to fix vis to work with LuaJIT, particularly to make vis-lspc working.


    Improvements to osc (especially with regards to the Git workflow) by mcepl

    Description

    There is plenty of hacking on osc, where we could spent some fun time. I would like to see a solution for https://github.com/openSUSE/osc/issues/2006 (which is sufficiently non-serious, that it could be part of HackWeek project).


    Liz - Prompt autocomplete by ftorchia

    Description

    Liz is the Rancher AI assistant for cluster operations.

    Goals

    We want to help users when sending new messages to Liz, by adding an autocomplete feature to complete their requests based on the context.

    Example:

    • User prompt: "Can you show me the list of p"
    • Autocomplete suggestion: "Can you show me the list of p...od in local cluster?"

    Example:

    • User prompt: "Show me the logs of #rancher-"
    • Chat console: It shows a drop-down widget, next to the # character, with the list of available pod names starting with "rancher-".

    Technical Overview

    1. The AI agent should expose a new ws/autocomplete endpoint to proxy autocomplete messages to the LLM.
    2. The UI extension should be able to display prompt suggestions and allow users to apply the autocomplete to the Prompt via keyboard shortcuts.

    Resources

    GitHub repository


    Enhance git-sha-verify: A tool to checkout validated git hashes by gpathak

    Description

    git-sha-verify is a simple shell utility to verify and checkout trusted git commits signed using GPG key. This tool helps ensure that only authorized or validated commit hashes are checked out from a git repository, supporting better code integrity and security within the workflow.

    Supports:

    • Verifying commit authenticity signed using gpg key
    • Checking out trusted commits

    Ideal for teams and projects where the integrity of git history is crucial.

    Goals

    A minimal python code of the shell script exists as a pull request.

    The goal of this hackweek is to:

    • DONE: Add more unit tests
      • New and more tests can be added later
    • Partially DONE: Make the python code modular
    • DONE: Add code coverage if possible

    Resources


    Improve/rework household chore tracker `chorazon` by gniebler

    Description

    I wrote a household chore tracker named chorazon, which is meant to be deployed as a web application in the household's local network.

    It features the ability to set up different (so far only weekly) schedules per task and per person, where tasks may span several days.

    There are "tokens", which can be collected by users. Tasks can (and usually will) have rewards configured where they yield a certain amount of tokens. The idea is that they can later be redeemed for (surprise) gifts, but this is not implemented yet. (So right now one needs to edit the DB manually to subtract tokens when they're redeemed.)

    Days are not rolled over automatically, to allow for task completion control.

    We used it in my household for several months, with mixed success. There are many limitations in the system that would warrant a revisit.

    It's written using the Pyramid Python framework with URL traversal, ZODB as the data store and Web Components for the frontend.

    Goals

    • Add admin screens for users, tasks and schedules
    • Add models, pages etc. to allow redeeming tokens for gifts/surprises
    • …?

    Resources

    tbd (Gitlab repo)