libextractor is a library for extracting meta-data from files of arbitrary type. It is designed to use helper libraries to perform the actual extraction, and to be easily extensible by linking against external extractors for additional file types. The goal is to provide developers of file-sharing networks, file managers, and WWW-indexing bots with a universal library to obtain meta-data about files. It includes a command-line tool and bindings for Java (JNI) and Python.

2012-09-26

Major changes to the plugin mechanism now give out-of-process plugins full random access to the entire file. Most plugins have been rewritten for the new plugin API. The external (libextractor) API remains unchanged and compatible with 0.6. As part of the rewrite, many plugins were changed to use standard third-party libraries (libjpeg, libtiff, libgif, libtidy, and libmagic) for parsing. A new plugin based on gstreamer replaces many existing multimedia plugins. Automated test cases were also written for almost all of the plugins, and the documentation was updated.
Stable

