Project Description

Most customers today are looking for multi-cloud and container solutions. What matters most to their business is providing a better service and keeping their own customers happy, and the efficiency of the IT Ops team is key to that customer experience. In most cases the customer reports an issue and support fixes it, but support is not aware of problems in the multi-container environment (such as node failures or resources hitting their limits) until customers report them. Monitoring and alerting systems exist on the market, but they only raise alerts once an issue has occurred. We need smarter solutions that analyze existing systems and predict future anomalies.

The proposed system will:

  1. Collect data (unstructured) from k8s components across the environments.
  2. Identify the common patterns that occur in failure cases.
  3. Create a knowledge base of the identified patterns and their related components (structured data).
  4. Use a specific data model for the prediction.
  5. Use the output of the data model to predict anomalies.
  6. Send alerts and reports.
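
Steps 2 and 3 above (finding failure patterns in unstructured data and storing them in structured form) could be sketched roughly as follows. This is only a minimal illustration, not the project's implementation: the pattern names, the regular expressions, and the sample log lines are all made up for the example, and a real knowledge base would be derived from the collected data.

```python
import re
from collections import Counter

# Hypothetical failure signatures (assumptions for illustration only);
# real patterns would come from analysing the collected logs.
FAILURE_PATTERNS = {
    "node_not_ready": re.compile(r"node .* status is now: NodeNotReady", re.IGNORECASE),
    "oom_killed": re.compile(r"OOMKilled|Out of memory", re.IGNORECASE),
    "disk_pressure": re.compile(r"DiskPressure|no space left on device", re.IGNORECASE),
}


def classify_line(line):
    """Return the name of the first matching failure pattern, or None."""
    for name, pattern in FAILURE_PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def build_knowledge_base(lines):
    """Turn unstructured log lines into a structured {pattern: count} mapping."""
    counts = Counter()
    for line in lines:
        label = classify_line(line)
        if label is not None:
            counts[label] += 1
    return dict(counts)


# Made-up sample lines, not taken from a real cluster.
sample_lines = [
    "kubelet: Node worker-1 status is now: NodeNotReady",
    "kernel: Out of memory: Killed process 4242 (java)",
    "kubelet: write /var/log/pods: no space left on device",
    "kubelet: Successfully pulled image nginx",
]
knowledge_base = build_knowledge_base(sample_lines)
```

Keeping the patterns in a plain dictionary makes it easy to add new signatures as more failure cases are identified.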

These steps are grouped into 3 main components in the proposed architecture:

  1. Data collection
  2. Data Prediction
  3. Alerts & Reports

Resources that can be considered for the analysis and prediction:

  • Storage devices: capacity, state
  • Network devices (LB, firewalls): link status, packet drops
  • Compute nodes: CPU, memory, I/O, storage
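
One possible way to represent these resources in a common structured record is sketched below. The class name `ResourceSample`, its fields, and the example values are assumptions chosen for illustration; the actual data model is what this project sets out to define.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class ResourceSample:
    """One measurement of one resource (hypothetical schema)."""
    timestamp: datetime
    cluster: str
    resource_kind: str   # e.g. "storage", "network", "compute"
    resource_name: str   # e.g. a node or device name
    metric: str          # e.g. "capacity", "packet_drops", "cpu_utilization"
    value: float
    unit: str


# Made-up example values, not real measurements.
sample = ResourceSample(
    timestamp=datetime(2021, 3, 22, 12, 0, tzinfo=timezone.utc),
    cluster="demo-cluster",
    resource_kind="compute",
    resource_name="worker-1",
    metric="cpu_utilization",
    value=87.5,
    unit="percent",
)

# asdict() gives a plain dictionary that can be serialised or fed to a model.
record = asdict(sample)
```

A single flat record type like this keeps storage, network, and compute metrics comparable, at the cost of pushing metric-specific meaning into the `metric` and `unit` fields.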

Solution Approach:

  1. Create data model
  2. Scan & filter data
  3. Extract entities
  4. Annotate data and input to model
  5. Process output from model
  6. Notify / recommend / self-heal
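
The last two stages (processing model output and notifying) can be illustrated with a very simple stand-in for the model. The sketch below uses a z-score check against recent history, which detects deviations rather than predicting them; it is not the ML model this project aims for. The function names, the threshold of 3.0, and the sample values are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, value, threshold=3.0):
    """Flag value if it lies more than `threshold` standard deviations
    from the mean of `history` (simple z-score check, a stand-in model)."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


def recommend(metric, history, value):
    """Return an alert message for an anomalous value, otherwise None."""
    if is_anomalous(history, value):
        return f"ALERT: {metric}={value} deviates from the recent baseline"
    return None


# Made-up CPU utilisation history in percent.
cpu_history = [41.0, 43.5, 40.2, 42.8, 44.1, 41.7]
alert = recommend("cpu_utilization", cpu_history, 95.0)
no_alert = recommend("cpu_utilization", cpu_history, 43.0)
```

Separating `is_anomalous` from `recommend` means the detection step can later be replaced by a trained model without touching the notification logic.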

Goal for this Hackweek

Use the existing log collector to collect data from Rancher k8s clusters and come up with an appropriate data model.

https://support.rancher.com/hc/en-us/articles/360039113911-The-Rancher-v2-x-log-collector-script
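
As a first step towards a data model, the collected logs could be scanned for lines that hint at failures. The sketch below assumes the collector's output has already been unpacked into a directory of plain-text files; the layout of that output is not described here, so the code simply walks whatever files it finds. The keyword list and the demo file names are hypothetical.

```python
import os
import tempfile

# Hypothetical keywords; a real list would be refined from observed failures.
KEYWORDS = ("error", "failed", "timeout")


def scan_logs(root, keywords=KEYWORDS):
    """Map each file (path relative to root) to the number of lines
    containing at least one keyword. Files with no hits are omitted."""
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            count = 0
            with open(path, "r", encoding="utf-8", errors="replace") as handle:
                for line in handle:
                    lowered = line.lower()
                    if any(keyword in lowered for keyword in keywords):
                        count += 1
            if count:
                hits[os.path.relpath(path, root)] = count
    return hits


# Demo on a throwaway directory with made-up log content.
with tempfile.TemporaryDirectory() as demo_root:
    with open(os.path.join(demo_root, "kubelet.log"), "w", encoding="utf-8") as f:
        f.write("I0101 sync ok\nE0101 Failed to pull image\nE0101 request timeout\n")
    with open(os.path.join(demo_root, "clean.log"), "w", encoding="utf-8") as f:
        f.write("all good\n")
    demo_hits = scan_logs(demo_root)
```

To use it on real data, call `scan_logs` with the path of the unpacked collector output instead of the temporary directory.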

Resources

ML engineer

ML, Python, Kubernetes, data model, monitoring tools.

Looking for hackers with the skills:

python3 machinelearning

This project is part of:

Hack Week 20

Activity

  • 7 months ago: sbabusadhu added keyword "python3" to this project.
  • 7 months ago: sbabusadhu added keyword "machinelearning" to this project.
  • 7 months ago: sbabusadhu originated this project.

    Similar Projects

    Cluster Python API by fmherschel


    Phoebe - where AI meets Linux by mvarlese

    Project Description

    Phoeβe (/ˈfiːbi/) wan...


    Ambrogio - a privata consierge for you and your pets by rsblendido


    FuseML - accelerate your Hack Week ML projects by stefannica
