SUSE Hack Week: YAML 1.2 Schema support for PyYAML

Project Description

PyYAML is a YAML processor for Python, and it was one of the first libraries written for YAML. It is used by tools like Ansible and SaltStack.

It was written when YAML 1.1 came out, but was never updated for the new schemas introduced by YAML 1.2.

Look here for an overview of Schemas/Types in YAML 1.1 and 1.2

As you can see there, in YAML 1.1 many strings are recognized as booleans. One prominent example is NO, the country code for Norway, which is read as the boolean False when unquoted.

YAML 1.2 improved this by reducing the number of strings that are treated as booleans, but so far few libraries implement it.
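To illustrate the difference, here is a minimal sketch that uses only Python's `re` module. The patterns are written from the boolean type definitions of YAML 1.1 and the YAML 1.2 Core and JSON schemas; they are not taken from PyYAML's source (PyYAML's own 1.1 resolver, for instance, leaves out the single-letter `y`/`n` forms). The names `YAML11_BOOL`, `YAML12_CORE_BOOL`, `YAML12_JSON_BOOL` and `classify` are made up for this example.

```python
import re

# Boolean spellings per schema (assumption: transcribed from the YAML type
# definitions, not copied from any particular library).
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)
YAML12_CORE_BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")
YAML12_JSON_BOOL = re.compile(r"true|false")


def classify(scalar):
    """Report under which schema an unquoted scalar counts as a boolean."""
    return {
        "1.1": YAML11_BOOL.fullmatch(scalar) is not None,
        "1.2 core": YAML12_CORE_BOOL.fullmatch(scalar) is not None,
        "1.2 json": YAML12_JSON_BOOL.fullmatch(scalar) is not None,
    }


if __name__ == "__main__":
    for scalar in ("NO", "yes", "True", "true", "Norway"):
        print(scalar, classify(scalar))
```

With these patterns, `NO` only counts as a boolean under YAML 1.1, `True` under 1.1 and the 1.2 Core schema, and `true` under all three.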

I have been working on a lot of YAML-related projects over the past years, and YAML 1.2 support for PyYAML is requested regularly.

Goal for this Hackweek

I want to add support for the YAML 1.2 Core and JSON Schemas.

Luckily the corresponding test data is already there.

At the end of hackweek, it should be possible to create a PyYAML loader class that loads a YAML 1.2 document.
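As a rough idea of what such a loader could look like, the sketch below uses only PyYAML's existing resolver mechanism (`yaml_implicit_resolvers` and `add_implicit_resolver`). It is not the API from the yaml12 branch, and it only swaps the boolean resolver; other differences between YAML 1.1 and 1.2 (integers, floats, null and so on) are left untouched. The class name `Core12BoolLoader` is hypothetical.

```python
import re

import yaml

BOOL_TAG = "tag:yaml.org,2002:bool"


class Core12BoolLoader(yaml.SafeLoader):
    """Hypothetical loader: SafeLoader with YAML 1.2 Core-style booleans only."""


# Give the subclass its own resolver table, dropping the YAML 1.1 boolean
# resolver so that yes/no/on/off stay plain strings.
Core12BoolLoader.yaml_implicit_resolvers = {
    first: [(tag, regexp) for tag, regexp in resolvers if tag != BOOL_TAG]
    for first, resolvers in yaml.SafeLoader.yaml_implicit_resolvers.items()
}

# Register the reduced set of boolean spellings from the 1.2 Core schema.
Core12BoolLoader.add_implicit_resolver(
    BOOL_TAG,
    re.compile(r"^(?:true|True|TRUE|false|False|FALSE)$"),
    list("tTfF"),
)

DOCUMENT = "country: NO\nenabled: true\nlegacy: yes\n"

if __name__ == "__main__":
    print(yaml.safe_load(DOCUMENT))                       # YAML 1.1 behaviour
    print(yaml.load(DOCUMENT, Loader=Core12BoolLoader))   # reduced booleans
```

The default `SafeLoader` turns both `NO` and `yes` into booleans, while the sketched loader keeps them as strings and still reads `true` as a boolean.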

The branch I'm working on: yaml12

Help

If you are a PyYAML user and know a bit about custom loaders, you can help by discussing the API for the new methods.

You can reach me in our internal chat or in #pyyaml on freenode.

Resources

PyYAML on Github

Progress

Some weeks ago I already added the test data for the existing YAML 1.1 types, so that adding the tests for the 1.2 Schemas wasn't much work.

I already made progress on Tuesday and Wednesday, and the tests are now passing. The challenge now is to create a good API that is flexible and easy to use.

On Thursday and Friday I fixed minor issues and updated the related PRs:

Add a test for YAML 1.1 types - This is the base for the new schema. First we need to make sure all existing types are tested.
Fix yaml11 float resolver for '.'

Then I created a draft PR:

Support for the YAML 1.2 Core and JSON schemas

It will probably take a while until this gets merged (or rejected). Feedback is welcome!

Looking for hackers with the skills:

python yaml

This project is part of:

Hack Week 20

Activity

almost 5 years ago: tinita started this project.

almost 5 years ago: tinita originated this project.
