NEWS
07 Feb 2012 - Version 1.0 released.
06 Feb 2012 - Project started.
About k5n EDocIAS
k5n EDocIAS is an open source PHP-based electronic document
index and search system. It is not a document management
system. With the proper 3rd party text extraction tools, you can easily
search for text embedded in binary files (PDF, DOC, XLS, HTML, etc.). And
with Tesseract installed, any documents you have
scanned (and saved as TIFF, JPEG or PNG) can also be searched.
Requirements
- PHP 5
- MySQL (other databases will likely work but have not been tested)
- 3rd party tools for extracting text (Tesseract, etc.)
(See the README.txt file for a list of possible tools.)
Goals
It's important to keep in mind not only what this tool was designed for but
also what it was not designed for.
- Open source: GPL v2
- Multiplatform (Mac, Windows, Linux): Achieved by using PHP and MySQL.
- Simple: This is not a document management system with file management,
user access control, etc. This is intended for users who already have
a bunch of documents of various formats and just need to find a way
to quickly search them from any machine on their network.
- Lean: Since this app may not be used for days at a time (depending on
how it's being used), it should not require additional memory or CPU
time while not in use. So, no standalone Java server for example.
- Accessible from the network: Even though all the files are centrally
located on a single machine, you can search for and download the documents
using the simple web interface. This makes all the documents instantly
accessible to the other machines on your network. And no mounting of
network drives is required.
- Don't reinvent the wheel: There are plenty of existing tools out there
that will extract the plain text from binary data. So, there is no
need to rewrite that functionality again. The document index process
invokes 3rd party tools.
- Expandable: If you have files of a different type that you want to
include, all you need to do is find or build a tool to extract the
text you want to index. For example, find a tool to pull the metadata
out of photos and index your pictures.
- Intranet-focused: Easily integrated into other PHP-based apps for use
in an intranet. Custom header, trailer and CSS are easily configured.
See Plans for the features that will be added next.
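The "Don't reinvent the wheel" and "Expandable" goals both come down to one idea: the indexer looks up an external tool by file type and captures the plain text it prints. The sketch below illustrates that idea only. It is written in Python for brevity and is not the project's PHP code. The names EXTRACTORS, build_command and extract_text are hypothetical, and the tool choices (pdftotext for PDF, tesseract for scanned images) are assumptions, not the list from README.txt.

```python
import subprocess
from pathlib import Path

# Hypothetical table: file extension -> external command that prints plain text
# to stdout. "{path}" is a placeholder for the document being indexed.
# The tools listed here are assumptions; substitute whatever is installed.
EXTRACTORS = {
    ".pdf": ["pdftotext", "{path}", "-"],
    ".tif": ["tesseract", "{path}", "stdout"],
    ".tiff": ["tesseract", "{path}", "stdout"],
    ".png": ["tesseract", "{path}", "stdout"],
}


def build_command(path, extractors=EXTRACTORS):
    """Return the argument list for this file's type, or None if no tool is registered."""
    template = extractors.get(Path(path).suffix.lower())
    if template is None:
        return None
    # Substitute the real path for the placeholder; other arguments pass through.
    return [str(path) if arg == "{path}" else arg for arg in template]


def extract_text(path, extractors=EXTRACTORS):
    """Run the registered tool and return what it printed ("" if no tool is registered)."""
    command = build_command(path, extractors)
    if command is None:
        return ""
    # check=True raises CalledProcessError if the tool exits with an error.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

Under this design, supporting a new file type means adding one entry to the table. Nothing else in the indexer has to change, which is the point of the "Expandable" goal.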
SourceForge.net Services
Development resources from SourceForge.net
are used in the development of this component.
For a complete list, see the Developers page.