Based on the little-used HTML5 outline spec, investigate and implement an in-browser tool (for now a Chrome extension or browser user script) to easily and interactively scrape a documentation web page into an 'index-content' map for (offline) searching.
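
To make the 'index-content' map idea concrete, here is a minimal sketch in TypeScript. It is an assumption-laden illustration, not the project's implementation: it approximates the outline using heading levels only (the full HTML5 outline algorithm also considers sectioning elements such as `section` and `article`), and the names `Block`, `buildIndexContentMap` and `pathOf` are hypothetical. The page is assumed to be already flattened into headings and content blocks in document order.

```typescript
// Hypothetical input: a page flattened into document order.
// Headings carry a level (1-6); content blocks are plain text.
type Block =
  | { kind: "heading"; level: number; text: string }
  | { kind: "content"; text: string };

type OpenSection = { level: number; text: string };

// Join the currently open headings into a key such as "Guide > Install".
function pathOf(stack: OpenSection[]): string {
  return stack.map((s) => s.text).join(" > ");
}

// Build a map from heading path (the index) to the text under it (the content).
function buildIndexContentMap(blocks: Block[]): Map<string, string> {
  const map = new Map<string, string>();
  const stack: OpenSection[] = [];

  for (const block of blocks) {
    if (block.kind === "heading") {
      // A new heading closes every open section at the same or a deeper level.
      while (stack.length > 0 && stack[stack.length - 1].level >= block.level) {
        stack.pop();
      }
      stack.push({ level: block.level, text: block.text });
      const key = pathOf(stack);
      if (!map.has(key)) {
        map.set(key, "");
      }
    } else if (stack.length > 0) {
      // Append content to the innermost open section.
      const key = pathOf(stack);
      const previous = map.get(key) ?? "";
      map.set(key, previous ? `${previous}\n${block.text}` : block.text);
    }
    // Content that appears before the first heading is skipped in this sketch.
  }
  return map;
}

// Made-up sample page, for illustration only.
const sample: Block[] = [
  { kind: "heading", level: 1, text: "Guide" },
  { kind: "content", text: "Overview text." },
  { kind: "heading", level: 2, text: "Install" },
  { kind: "content", text: "Run the installer." },
  { kind: "content", text: "Then reboot." },
  { kind: "heading", level: 2, text: "Usage" },
  { kind: "content", text: "Start the tool." },
];

const index = buildIndexContentMap(sample);
```

In a browser, the `Block` list could plausibly be gathered with `document.querySelectorAll("h1, h2, h3, h4, h5, h6, p, pre")`, which returns elements in document order; which elements count as content is a design choice left open here. Keying the map by the full heading path keeps sections with the same title under different parents apart, which helps when searching offline.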

Motivated by the fact that most scrapers today are command-line tools, which demand too much technical skill from ordinary users.

Currently this only targets documentation web pages, which are much better structured and so easier to scrape. I also think these pages benefit most from indexing and scraping.

Looking for hackers with the skills:

web scrapper indexer documentation

This project is part of:

Hack Week 13

Activity

  • about 7 years ago: cxiong added keyword "web" to this project.
  • about 7 years ago: cxiong added keyword "scrapper" to this project.
  • about 7 years ago: cxiong added keyword "indexer" to this project.
  • about 7 years ago: cxiong added keyword "documentation" to this project.
  • about 7 years ago: cxiong started this project.
  • about 7 years ago: cxiong originated this project.
