DocumentCloud will be an index of primary source documents and a tool for annotating, organizing and publishing them to the web.
Documents will be contributed by journalists, researchers and archivists. If your organization does document-driven investigations, we’d love to have you join us. Using the DocumentCloud workspace, you can upload documents and share them with the rest of your organization. In return, you get back the full text and a PDF copy, along with access to a structured search engine and analytical tools that help you make the best use of your documents. You’ll even be able to download a lightweight JavaScript document viewer that will let you display documents right on your own website.
At the moment, we’re working towards an initial beta release. As we develop the DocumentCloud application, we’re packaging up the components that support it and releasing them as open-source projects. Our releases so far include a utility for extracting images and text from document files, an asset packager, a library of functional helpers for JavaScript and a parallel processing system.
You should follow us on Twitter or email us. We’re even on Identi.ca.
