Frequently Asked Questions

What is it?

DocumentCloud is both a repository of primary source documents and a tool for document-based investigative reporting. Think of the repository as a card catalog for primary source documents. Our tools accelerate the work of reporters who need to make sense of large sets of documents. (You can use it on small sets, too.)

Yes, but what is it?

Ruby (Rails, Sinatra) and JavaScript. We're using Tesseract for OCR and OpenCalais for entity extraction. Take a look at our GitHub projects and our blog to get a better sense of the tools we’re using.

What can I do with it?

Take a look at what other reporters are doing with it! And that's just the beginning. When you upload a document to DocumentCloud, you can annotate it, share it with colleagues in your newsroom or beyond your newsroom, view lists of people and places named in it, plot the dates it contains on a timeline and more.

How do I get an account?

Learn more about getting accounts via our Contact page. When you're ready, fill out an application.

Is there an API?


Are my documents automatically public?

No. DocumentCloud was started by journalists who understand journalism. Documents you upload aren't public until you make them public.

Can I keep documents behind a paywall?

Sure. Other reporters who are DocumentCloud users can view your public documents from our workspace, but your public-facing copy can sit behind a paywall. If the general public has no access to your reporting, however, you probably aren't a good fit -- DocumentCloud is intended as a public catalog.

What does it cost?

Currently, DocumentCloud is a free service. In the future, we will develop models to ensure the platform's sustainability. See the next section for more.

Who funds it?

DocumentCloud has been funded by generous grants from the John S. and James L. Knight Foundation. Initial funding came in 2009, when DocumentCloud won a Knight News Challenge grant. The Knight Foundation awarded subsequent grants in 2011 and 2014 to further develop the platform. The latest grant is designed to expand the DocumentCloud team, improve the service and find revenue models that can help DocumentCloud reach sustainability. In 2011, DocumentCloud also merged with Investigative Reporters and Editors (IRE), a nonprofit grassroots organization committed to fostering excellence in investigative journalism. IRE, itself housed at the Missouri School of Journalism at the University of Missouri in Columbia, provides staff support and guidance to the team while connecting us strongly to the journalists whose work is at the core of the platform.

Why would I want to share my documents?

Because it makes your documents and your reporting more findable, more useful and ultimately more popular.

We know that many journalists want to share their source materials. It’s one of the reasons why most news organizations post source documents alongside news stories on their websites. DocumentCloud makes it easy to keep those documents useful and findable after the story fades from the headlines.

Many other organizations (bloggers, watchdog groups, citizen journalists) are in that same boat. They have a wealth of documents but are only able to post them as individual PDF files. This is the problem DocumentCloud solves.

Can’t I already find documents using a search engine?

Search engines are very powerful; our goal is to make documents even easier to find with them. DocumentCloud also holds information about documents and the relations between them, for example which locations, people, or organizations a group of documents has in common.

How do you guarantee authenticity?

Currently, we limit access to journalists and other researchers who have an established editorial process and a history of publishing high quality reporting. Each contributing organization takes responsibility for ensuring that the documents it uploads to DocumentCloud are what they say they are.