Motivation

The openQA test database, based on PostgreSQL, has a lot of potential to be able to answer various questions but often we do not know how to put those questions into the right SQL queries so let's harness the power of artificial intelligence to go wild in this direction :)

Goals

  • G1: A web based service frontend can be used to query the openQA database for various tasks without needing to know or write SQL

Execution

  • Ask AI how to use AI to query SQL ;) or do a simple we search for current best practices and industry standards
  • Try out solutions yourself, e.g. web services or local tools
  • Investigate the feasibility to use production databases from e.g. openqa.opensuse.org or openqa.suse.de
  • Prepare a demonstration
  • Gather results, possibilities, limitations and implications

Progress

Day 1

Julián Díaz joined the project and could gladly already do some research while I was still busy with non-Hack-Week tasks, see the comment below from their research based on my suggestion that we start with researching each on our own what solutions and products exist.

Day 2

Ideas to continue: We can use the access to a database dump from o3 on which one would have real data to look into. As alternative have AI to construct SQL queries and then execute them ourself against the production database after reviewing. As a safer approach instead of using an existing openQA database we can at best use data from a local instance like the one that openQA integration tests as part of github.com/os-autoinst/openQA/ spawn. Or just our own openQA instance where we create some dummy data. As Julián is likely doing the "real work" here I am "playing around" a bit with my VR headset trying to find what is actually realistic as a fancy user interface using AI. I tried out asking a virtual companion, not that I expect much technical depth from any conversations ;) Asking the companion to access https://openqa.opensuse.org the companion could give a not so bad answer what openQA is and what can be found on the next but further follow-up quickly derailed in me and the companion argued about "The current URL for the Tumbleweed test script on o3 is https://openqa.opensuse.org/test/openqa/package/tumbleweed-test/test_script." if I entered the correct URLs as that address does not make sense to me :D Also trying out current ChatGPT with the "Nova" mobile client. Trying in "German" to make it a bit more challenging. And ChatGPT at least provides a very good explanation on what openQA is and does (before my free quota ran out). I tried out https://www.sqlchat.ai/. That system does not seem to provide a way to read code to teach it a schema however it seems to be possible to directly connect to an existing database service. However it provides a start especially for people not so versed with SQL and is a fun way to improve SQL queries which the users can then try out in action and provide back feedback to sqlchat and let that system refine the next steps.

The conversation goes something like this: Q: Formulate SQL queries to find the latest test results in openQA A: To find the latest test results in openQA, you can use the following SQL query:

SELECT *
FROM test_results
ORDER BY timestamp_column DESC
LIMIT 1;

This query retrieves all columns from the test_results table and orders the results in descending order based on the timestamp_column. The LIMIT 1 clause ensures that only the latest test result is returned.

Please note that you may need to replace test_results with the actual table name and timestamp_column with the appropriate column name in your openQA database schema.

Q: filter that to show only tests with result "failed" A: To filter the query to show only tests with a result of "failed", you can modify the SQL query as follows:

SELECT *
FROM test_results
WHERE result = 'failed'
ORDER BY timestamp_column DESC
LIMIT 1;

[...]

Q: the test table is called "tests". Can you replace that? A: Certainly! Here's the modified SQL query with the table name replaced as "tests": [...] Q: instead of listing all failed tests can you count the ratio of failed compared to all tests? ... still here? A: Apologies for the delay. Yes, I'm still here. To count the ratio of failed tests compared to all tests, you can use the following SQL query:

SELECT COUNT(CASE WHEN result = 'failed' THEN 1 END) AS failed_count,
       COUNT(*) AS total_count,
       COUNT(CASE WHEN result = 'failed' THEN 1 END) * 100.0 / COUNT(*) AS failure_ratio
FROM tests
WHERE t_finished >= current_date - interval '2 days';

This query uses conditional aggregation to count the number of failed tests (failedcount), the total number of tests (totalcount), and calculates the failure ratio as a percentage (failureratio). The WHERE clause filters the results to include only tests from the last 2 days based on the tfinished column.

The whole conversation feels like a dynamic "stackoverflow" thread with quick feedback :)

Day 3

Preparing o3 database dump without sensitive data based on a nice hint from jgomez referencing https://gitlab.suse.de/scc/infrastructure/db-backup/:

podman run --rm -it -v $PWD:/src registry.opensuse.org/opensuse/tumbleweed
zypper -n in sudo postgresql-server
mkdir -p /run/postgresql/
chown postgres /run/postgresql
sudo -u postgres pg_ctl -D /var/lib/pgsql init
sudo -u postgres pg_ctl -D /var/lib/pgsql start
sudo -u postgresql psql -c 'create database openqa; create role geekotest;'
sudo -u postgres pg_restore -d openqa -C 2023-11-06.dump
sudo -u postgresql psql -d openqa -c "TRUNCATE TABLE api_keys, audit_events, secrets; update users set email = 'foo@bar.com',fullname = concat('fullname-',id), nickname = concat('nickname-',id), username = concat('username',id);"
sudo -u postgres pg_dump --compress=zstd -f /tmp/2023-11-06_anonymized.dump.zstd openqa

and then synced it over so that it's available on http://w3.nue.suse.com/~okurz/2023-11-06_anonymized.dump.zstd

Results

From my side (Julian) installed openQA in local aswell postgres in order to interact with the openQA database. Found a few problems when trying to connect to openQA database, but at the end I was able to connect and to run the dump file.

Looking for hackers with the skills:

Nothing? Add some keywords!

This project is part of:

Hack Week 23

Activity

  • 6 months ago: mgriessmeier joined this project.
  • 6 months ago: jad12 joined this project.
  • 6 months ago: jad12 liked this project.
  • 7 months ago: okurz started this project.
  • 7 months ago: okurz originated this project.

  • Comments

    • jad12
      6 months ago by jad12 | Reply

      Hello masters, here I found some interesting information:

      TEXT2SQL.AI Link: https://www.text2sql.ai/app Supply your database schema to Text2SQL.AI, choose the database you are using (MySQL, PostgreSQL, Snowflake, BigQuery, MS SQL Server), and Text2SQL.AI will generate the corresponding SQL for you.

      When compared with other SQL AI tools, their website that besides being affordable (current price is $4/month for 300 requests), it is also the only tool capable of generating SQL as well as regular expressions, Excel and Google Sheets formulas (again, perfect for business analysts who need to work with Excel and data all the time!).

      AI2sql.io Link: https://app.ai2sql.io/?section=8 AI-driven SQL query generator, which means you can use natural language to instruct AI2sql to complete SQL queries. AI2sql has been around since 2021, way before the AIGC wave arrived. Recently, it has also integrated with OpenAI's GPT-3. Compared to ChatGPT, AI2sql is designed explicitly for querying databases or generating SQL queries. It supports the most popular databases on the market such as MySQL, PostgreSQL, MongoDB, Oracle, etc.

      AI Query (PAY at first) Link: https://aiquery.co/ AI Query uses the GPT-3 model to generate SQL queries from natural language. It currently supports database types such as Postgres, MySQL, MariaDB, and SQL Server, with plans to include others in the future (taking notes from this genius marketing approach).

      OUTERBASE (NEED to connect database, few databases support) Link: https://app.outerbase.com/julian/ Can be used on top of Postgres, MySQL, and most relational databases. Outerbase has a super modern and minimal interface compared to traditional database management tools. The user experience is similar to that of a spreadsheet. Although it also has AI-enhanced capabilities to assist with writing SQL and gaining insights from the database, it feels more like a SQL client designed for data analysts and business users.

      SQL Chat Link: https://www.sqlchat.ai/ Brings SQL clients from the traditional GUI-based stage to the CUI (Chat-based UI) stage: it integrates with ChatGPT to help you write SQL (and of course, answer any questions about databases!). It currently supports MySQL, PostgreSQL, SQL Server, and TiDB Serverless. And if you connect SQL Chat to your own database, it can write more accurate SQL queries.

      Recently, SQL Chat introduced a subscription model (If you don't have an OpenAI account, you can still enjoy SQL Chat!) and the option to choose between GPT-3.5 or GPT-4 models.

    • Sheldon
      3 months ago by Sheldon | Reply

      The only one that really works for chatting with your database is AskYourDatabase.

      Link: https://www.askyourdatabase.com/

      It allows users to chat with their SQL & NoSQL databases for various tasks like:

      1. Gaining insights
      2. Visualizing data
      3. Designing table schemas
      4. Data analysis

    Similar Projects

    This project is one of its kind!