We started the LostDocs blog back in September 2009 to collect e-mail receipts for items that were reported to GPO as “fugitive documents” — agency documents that should have made it into the Federal Depository Library Program and/or the Catalog of Government Publications.

In the process of running this blog, we have identified 40 documents reported since April 2008 that were cataloged by GPO after being reported as “fugitive documents.” These fall into the “found documents” category of our blog.

You can find our list of 40 (and counting) cataloged fugitives at http://spreadsheets.google.com/pub?key=t8pEBNg2FGqGgHx5IxhtoAQ&single=true&gid=0&output=html. This spreadsheet will be updated whenever we identify new GPO cataloging for items that had been reported as fugitive documents.

The results are interesting and somewhat disturbing, but not definitive.

The 40 items took anywhere from three days to 524 days to be cataloged. The mean cataloging time was 213 days. The median cataloging time was 184 days, or about six months.

If the cataloging times above were typical of all documents reported through the LostDocs process, we think this would be a major problem for GPO. It would call for some serious soul searching and dialog about how this result could be changed, and about what tradeoffs and/or extra community involvement that change would require.

We are NOT making the claim that these cataloging times are typical for reported fugitive documents. We honestly do not know what is typical. Jim Jacobs, FGI’s resident data librarian, had this to say about our sample of cataloged documents:

As for sample size and relevance: the number of items in the sample can’t tell us the significance or accuracy of the results. We’d have to know two other things: the size of the universe (of all reported lost docs), and the accuracy of the sample. Since the sample was self-selected (by those reporting) rather than random, and since we don’t know if the sample is 1% or 85% of all submitted lostdocs, we can’t claim that the findings necessarily reflect the status of the whole universe. (does that make sense? If only people w/ long waits reported to us, our sample does not accurately reflect all lostdocs.)
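
To make Jim's point concrete, here is a small Python sketch. Every number in it is invented for illustration; none of it is LostDocs or GPO data. It assumes a hypothetical universe of ten reports and a sample made up only of the people who waited more than 30 days.

```python
from statistics import median

# Invented waiting times, in days, for a hypothetical universe of ten reports.
# These are NOT real LostDocs or GPO figures.
all_waits = [2, 5, 9, 14, 30, 45, 60, 120, 200, 400]

# Assumption for illustration: only reporters who waited more than 30 days
# bother to send us their receipts, so the sample selects itself.
self_selected = [days for days in all_waits if days > 30]

universe_median = median(all_waits)      # median over every report
sample_median = median(self_selected)    # median over the self-selected sample
sample_share = len(self_selected) / len(all_waits)

print(f"universe median: {universe_median} days")
print(f"sample median:   {sample_median} days")
print(f"sample covers {sample_share:.0%} of the universe")
```

In this made-up case the self-selected sample looks far slower than the universe it came from, and without knowing the size of the universe we could not even say how much of it the sample covers.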

When we first thought about making LostDocs reports available to the community at large, we approached GPO with a partnering opportunity. We would maintain the blog and offer them the opportunity to comment there on whether something was out of scope for the CGP or already in the catalog. In return, we asked them to modify their LostDocs form so that the blog would automatically get a copy of every report they received. If this partnership had been accepted, we would know the two facts Jim cited above that are needed to tell us whether our results are typical. GPO declined our partnership proposal, citing their workload. We’re not questioning that they are overworked.

We do feel that the results above deserve further investigation. Perhaps GPO could prepare a report on documents cataloged as a result of fugitive reports over the past few years. Unless they’ve discarded the e-mail receipts (which would be defensible), they have the dates on which documents were reported. The CGP lists when each item was first added to it. They could have an intern make a semester project of putting the two together and then posting the results to fdlp.gov.
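
Putting the two sets of dates together would not take much. Here is a minimal Python sketch of the idea, assuming the report dates and the CGP "first added" dates can each be keyed by some shared document identifier. The identifiers, dates, and function names below are all hypothetical; we do not know how GPO actually stores either set of records.

```python
from datetime import date
from statistics import mean, median


def cataloging_delays(reported, cataloged):
    """Return {doc_id: days from report to cataloging} for cataloged documents.

    reported:  dict of document identifier -> date the document was reported
    cataloged: dict of document identifier -> date first added to the CGP
    Documents that were reported but never cataloged are left out.
    """
    delays = {}
    for doc_id, reported_on in reported.items():
        cataloged_on = cataloged.get(doc_id)
        if cataloged_on is None:
            continue  # reported, but no catalog record (yet)
        delays[doc_id] = (cataloged_on - reported_on).days
    return delays


def summarize(delays):
    """Summary statistics of the kind we quote above for our own sample."""
    days = list(delays.values())
    return {
        "count": len(days),
        "min": min(days),
        "max": max(days),
        "mean": mean(days),
        "median": median(days),
    }


# Made-up records for illustration only; not real LostDocs or CGP data.
reported = {
    "doc-a": date(2009, 1, 5),
    "doc-b": date(2009, 2, 1),
    "doc-c": date(2009, 3, 10),
    "doc-d": date(2009, 4, 1),   # never cataloged in this example
}
cataloged = {
    "doc-a": date(2009, 1, 8),
    "doc-b": date(2009, 5, 2),
    "doc-c": date(2009, 3, 20),
}

delays = cataloging_delays(reported, cataloged)
summary = summarize(delays)
print(summary)
```

A fuller analysis would also count the reports that never show up in the catalog ("doc-d" here), since those say as much about the process as the cataloging times do.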

If they have tossed previous e-mail receipts, they could start saving them for a year starting in January 2010 and do the analysis we propose above in 2011. In either case, we feel the analysis should be done. If it confirms our results, it will be good ammunition in Congress for procuring more cataloging staff or for starting cataloging collaborations with FDLP members. If the GPO analysis concludes that items reported to LostDocs are in fact cataloged in a timely manner, that will help build trust with the documents community and motivate more people to report fugitive documents. Either way, it is a win for GPO.