When working with files that should be available in an accessible file system, I have quite often been in a situation where

  • I couldn't find a document by name but remembered attributes such as 'document' (format unclear), '> 12 pages', and dates from '2011 - 2015'
  • I needed to remove a lot of duplicates, both to save disk space and to clean up directory structures
  • I wanted to search for 'similar' files (where 'similar' is to be defined by content type)

Therefore, I started a project that consists of three elements:

  • a database that holds not only the file attributes stored in a file system, but also additional values such as content type, checksum, and type-specific characteristics
  • a script to manage that database (import, cleanup, ...)
  • a frontend to search for documents of interest
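
To make the three elements more concrete, here is a minimal sketch in Python with SQLite. It is not taken from the fdb repository. The table layout, the column names, and the functions `import_tree`, `find_duplicates`, and `search` are assumptions made for illustration only. Type-specific characteristics such as page counts are left out, because extracting them needs a parser per format.

```python
import hashlib
import mimetypes
import os
import sqlite3

# Hypothetical schema: file system attributes plus content type and checksum.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    path         TEXT PRIMARY KEY,
    size         INTEGER NOT NULL,
    mtime        REAL NOT NULL,
    content_type TEXT,
    checksum     TEXT NOT NULL
);
"""


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def import_tree(conn, root):
    """Walk `root` and store one row per regular file (the 'import' step)."""
    conn.executescript(SCHEMA)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks, sockets, and similar entries
            info = os.stat(path)
            # Content type is guessed from the file name only, not the content.
            content_type, _encoding = mimetypes.guess_type(path)
            conn.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?, ?)",
                (path, info.st_size, info.st_mtime, content_type, sha256_of(path)),
            )
    conn.commit()


def find_duplicates(conn):
    """Return lists of paths that share the same checksum."""
    rows = conn.execute(
        "SELECT checksum, path FROM files WHERE checksum IN "
        "(SELECT checksum FROM files GROUP BY checksum HAVING COUNT(*) > 1) "
        "ORDER BY checksum, path"
    ).fetchall()
    groups = {}
    for checksum, path in rows:
        groups.setdefault(checksum, []).append(path)
    return list(groups.values())


def search(conn, type_prefix, mtime_from, mtime_to):
    """Find paths by content type prefix and modification time range."""
    rows = conn.execute(
        "SELECT path FROM files "
        "WHERE content_type LIKE ? AND mtime BETWEEN ? AND ? ORDER BY path",
        (type_prefix + "%", mtime_from, mtime_to),
    )
    return [row[0] for row in rows]
```

With a connection from `sqlite3.connect("files.db")`, calling `import_tree(conn, "/some/dir")` fills the table, `find_duplicates(conn)` lists candidates for cleanup, and `search(conn, "application/pdf", start, end)` narrows by type and date, where `start` and `end` are Unix timestamps. Guessing the content type from the file name is a shortcut. Inspecting the content itself would be more reliable.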

See

https://github.com/hrommel/fdb

Looking for hackers with the skills:

patternmatching databases hpc

This project is part of:

Hack Week 17

Activity

  • over 6 years ago: hrommel1 added keyword "hpc" to this project.
  • over 6 years ago: hrommel1 added keyword "databases" to this project.
  • over 6 years ago: hrommel1 added keyword "patternmatching" to this project.
  • over 6 years ago: hrommel1 started this project.
  • over 6 years ago: hrommel1 originated this project.
