Ever left a restaurant wanting to write a review, but thinking it wasn't worth the trouble to tap out all those words on your phone -- you just want to give the place your n stars and provide a few words of praise or condemnation? If only you could press a button to generate a plausible review. If this project happens, you will.

We'll use the Yelp API to grab as many reviews of certain types of restaurants as the terms of service allow (I assume "Use any robot, spider, site search/retrieval application, or other automated device, process or means to access, retrieve, scrape, or index any portion of the Site or any Site Content;" doesn't apply to API users -- otherwise it wouldn't be much of an API).

We'll investigate libraries like spaCy for doing natural language processing (in Python).

We'll dive into the research on Markov Chains for Natural Language Generation

Finally we'll put this functionality on the web, using Flask & SQLAlchemy in front of a postgres database. The idea is to generate a sample review, and allow the user to easily tweak it and copy it to the clipboard, not to automatically post the reviews to Yelp.

Looking for hackers with the skills:

python nltk webapps flask sqlalchemy

This project is part of:

Hack Week 15

Activity

  • almost 9 years ago: cschum liked this project.
  • almost 9 years ago: ericp added keyword "flask" to this project.
  • almost 9 years ago: ericp added keyword "sqlalchemy" to this project.
  • almost 9 years ago: ericp added keyword "python" to this project.
  • almost 9 years ago: ericp added keyword "nltk" to this project.
  • almost 9 years ago: ericp added keyword "webapps" to this project.
  • almost 9 years ago: ericp started this project.
  • almost 9 years ago: dmacvicar liked this project.
  • almost 9 years ago: ericp originated this project.

  • Comments

    • ericp
      almost 9 years ago by ericp | Reply

      Mixed results on the hack.

      On one hand, I learned how to use Python's nltk library to do natural language processing on the reviews, to the point that I could lex each word. nltk has parsing abilities as well, but I decided to see how far I could get with just lexing and statistical analysis. Also, the flask and sqlalchemy libraries were straightforward Python analogs of Ruby's sinatra and ActiveRecord/Sequel ORMs, nothing new here. And the Yelp API was straightforward to work with.

      The downside was that the Yelp API only exposed the first 10 words of three reviews for each business. If we assume the average business has 100 reviews of about 200 words each, this wasn't going to give me the data I needed. However, each review in the resource returned byhttps://api.yelp.com/v3/businessesBUSINESSID/reviews also contained a URL, and following that URL gave me the full text of 19 reviews.

      And apparently following that URL violated Yelp's general terms of service, but not the developers's ToS, and I was cut off after pulling down reviews for 350 restaurants. At least I randomized my selection procedure, so I ended up with a smattering of Mexican, Chinese, Japanese and American-style restaurants.

      The best generated sentence might have been one of the first: "I travel the bone tender." I also liked "My wife had the chipotle pancakes." But most of the sentences were grammatically incorrect, or made no sense, or both. I did try tweaking the Markov generator to use a mix of single-word and double-word prefixes, but given the lack of data, I ended the hack and went back to work.

    Similar Projects

    Liz - Prompt autocomplete by ftorchia

    Description

    Liz is the Rancher AI assistant for cluster operations.

    Goals

    We want to help users when sending new messages to Liz, by adding an autocomplete feature to complete their requests based on the context.

    Example:

    • User prompt: "Can you show me the list of p"
    • Autocomplete suggestion: "Can you show me the list of p...od in local cluster?"

    Example:

    • User prompt: "Show me the logs of #rancher-"
    • Chat console: It shows a drop-down widget, next to the # character, with the list of available pod names starting with "rancher-".

    Technical Overview

    1. The AI agent should expose a new ws/autocomplete endpoint to proxy autocomplete messages to the LLM.
    2. The UI extension should be able to display prompt suggestions and allow users to apply the autocomplete to the Prompt via keyboard shortcuts.

    Resources

    GitHub repository


    Collection and organisation of information about Bulgarian schools by iivanov

    Description

    To achieve this it will be necessary:

    • Collect/download raw data from various government and non-governmental organizations
    • Clean up raw data and organise it in some kind database.
    • Create tool to make queries easy.
    • Or perhaps dump all data into AI and ask questions in natural language.

    Goals

    By selecting particular school information like this will be provided:

    • School scores on national exams.
    • School scores from the external evaluations exams.
    • School town, municipality and region.
    • Employment rate in a town or municipality.
    • Average health of the population in the region.

    Resources

    Some of these are available only in bulgarian.

    • https://danybon.com/klasazia
    • https://nvoresults.com/index.html
    • https://ri.mon.bg/active-institutions
    • https://www.nsi.bg/nrnm/ekatte/archive


    Update M2Crypto by mcepl

    There are couple of projects I work on, which need my attention and putting them to shape:

    Goal for this Hackweek

    • Put M2Crypto into better shape (most issues closed, all pull requests processed)
    • More fun to learn jujutsu
    • Play more with Gemini, how much it help (or not).
    • Perhaps, also (just slightly related), help to fix vis to work with LuaJIT, particularly to make vis-lspc working.


    Help Create A Chat Control Resistant Turnkey Chatmail/Deltachat Relay Stack - Rootless Podman Compose, OpenSUSE BCI, Hardened, & SELinux by 3nd5h1771fy

    Description

    The Mission: Decentralized & Sovereign Messaging

    FYI: If you have never heard of "Chatmail", you can visit their site here, but simply put it can be thought of as the underlying protocol/platform decentralized messengers like DeltaChat use for their communications. Do not confuse it with the honeypot looking non-opensource paid for prodect with better seo that directs you to chatmailsecure(dot)com

    In an era of increasing centralized surveillance by unaccountable bad actors (aka BigTech), "Chat Control," and the erosion of digital privacy, the need for sovereign communication infrastructure is critical. Chatmail is a pioneering initiative that bridges the gap between classic email and modern instant messaging, offering metadata-minimized, end-to-end encrypted (E2EE) communication that is interoperable and open.

    However, unless you are a seasoned sysadmin, the current recommended deployment method of a Chatmail relay is rigid, fragile, difficult to properly secure, and effectively takes over the entire host the "relay" is deployed on.

    Why This Matters

    A simple, host agnostic, reproducible deployment lowers the entry cost for anyone wanting to run a privacy‑preserving, decentralized messaging relay. In an era of perpetually resurrected chat‑control legislation threats, EU digital‑sovereignty drives, and many dangers of using big‑tech messaging platforms (Apple iMessage, WhatsApp, FB Messenger, Instagram, SMS, Google Messages, etc...) for any type of communication, providing an easy‑to‑use alternative empowers:

    • Censorship resistance - No single entity controls the relay; operators can spin up new nodes quickly.
    • Surveillance mitigation - End‑to‑end OpenPGP encryption ensures relay operators never see plaintext.
    • Digital sovereignty - Communities can host their own infrastructure under local jurisdiction, aligning with national data‑policy goals.

    By turning the Chatmail relay into a plug‑and‑play container stack, we enable broader adoption, foster a resilient messaging fabric, and give developers, activists, and hobbyists a concrete tool to defend privacy online.

    Goals

    As I indicated earlier, this project aims to drastically simplify the deployment of Chatmail relay. By converting this architecture into a portable, containerized stack using Podman and OpenSUSE base container images, we can allow anyone to deploy their own censorship-resistant, privacy-preserving communications node in minutes.

    Our goal for Hack Week: package every component into containers built on openSUSE/MicroOS base images, initially orchestrated with a single container-compose.yml (podman-compose compatible). The stack will:

    • Run on any host that supports Podman (including optimizations and enhancements for SELinux‑enabled systems).
    • Allow network decoupling by refactoring configurations to move from file-system constrained Unix sockets to internal TCP networking, allowing containers achieve stricter isolation.
    • Utilize Enhanced Security with SELinux by using purpose built utilities such as udica we can quickly generate custom SELinux policies for the container stack, ensuring strict confinement superior to standard/typical Docker deployments.
    • Allow the use of bind or remote mounted volumes for shared data (/var/vmail, DKIM keys, TLS certs, etc.).
    • Replace the local DNS server requirement with a remote DNS‑provider API for DKIM/TXT record publishing.

    By delivering a turnkey, host agnostic, reproducible deployment, we lower the barrier for individuals and small communities to launch their own chatmail relays, fostering a decentralized, censorship‑resistant messaging ecosystem that can serve DeltaChat users and/or future services adopting this protocol

    Resources


    HTTP API for nftables by crameleon

    Background

    The idea originated in https://progress.opensuse.org/issues/164060 and is about building RESTful API which translates authorized HTTP requests to operations in nftables, possibly utilizing libnftables-json(5).

    Originally, I started developing such an interface in Go, utilizing https://github.com/google/nftables. The conversion of string networks to nftables set elements was problematic (unfortunately no record of details), and I started a second attempt in Python, which made interaction much simpler thanks to native nftables Python bindings.

    Goals

    1. Find and track the issue with google/nftables
    2. Revisit and polish the Go or Python code (prefer Go, but possibly depends on implementing missing functionality), primarily the server component
    3. Finish functionality to interact with nftables sets (retrieving and updating elements), which are of interest for the originating issue
    4. Align test suite
    5. Packaging

    Resources

    • https://git.netfilter.org/nftables/tree/py/src/nftables.py
    • https://git.com.de/Georg/nftables-http-api (to be moved to GitHub)
    • https://build.opensuse.org/package/show/home:crameleon:containers/pytest-nftables-container

    Results

    • Go nftables issue was related to set elements needing to be added with different start and end addresses - coincidentally, this was recently discovered by someone else, who added a useful helper function for this: https://github.com/google/nftables/pull/342.

    Side results

    Upon starting to unify the structure and implementing more functionality, missing JSON output support was noticed for some subcommands in libnftables. I am submitting patches as needed:

    • https://lore.kernel.org/netfilter-devel/20251203131736.4036382-2-georg@syscid.com/T/#u


    Collection and organisation of information about Bulgarian schools by iivanov

    Description

    To achieve this it will be necessary:

    • Collect/download raw data from various government and non-governmental organizations
    • Clean up raw data and organise it in some kind database.
    • Create tool to make queries easy.
    • Or perhaps dump all data into AI and ask questions in natural language.

    Goals

    By selecting particular school information like this will be provided:

    • School scores on national exams.
    • School scores from the external evaluations exams.
    • School town, municipality and region.
    • Employment rate in a town or municipality.
    • Average health of the population in the region.

    Resources

    Some of these are available only in bulgarian.

    • https://danybon.com/klasazia
    • https://nvoresults.com/index.html
    • https://ri.mon.bg/active-institutions
    • https://www.nsi.bg/nrnm/ekatte/archive