In May 2011, John Palfrey, then Chairman of the Steering Committee of the Digital Public Library of America (DPLA), put out a call for a Beta Sprint of ideas for the new common platform. By June 2011, the committee had received 60 enthusiastic responses. The Smithsonian's submission was among them, and we participated in the Beta Sprint Contest to show our idea of what a DPLA could be.
As part of this demonstration, we wanted to highlight the fact that records from different organizations could work well together. Both the Library of Congress and the National Archives sent records that they could produce on short notice. Among them were catalog records of photographs, personal letters, and music manuscripts. These items told stories of Civil War veterans, the Union Pacific Railroad, and musical history, and they gave insight into the lives of many famous people from American history. Because the National Archives had a proprietary system, it was not easy for them to produce records in MARC (Machine-Readable Cataloging) format, and it took some hand coding to produce these archival records in MARC. Though the Smithsonian system did not require records to be stored in this format, using MARC gave us a standardized starting point. It also made the point that, even though our data could come from different places, a common system needs a standard format to achieve the data consistency required to work well.
We quickly mapped the two record sets from MARC to the Smithsonian EDAN (Enterprise Digital Asset Network) data format. After ingesting them into the Smithsonian system, we matched the records from the Library of Congress and the National Archives against Smithsonian data. Even though the two record sets comprised fewer than 200 records, exciting results appeared immediately. For example, the Library of Congress’s photographs of “Civil War veterans” came up in searches alongside Smithsonian records of sculptures, paintings, and photographs on the same topic. The National Archives’ photographs of “railroad trains” matched Smithsonian photographs, trade catalogs, postcards, and posters. The National Archives’ letter written by “Rose Greenhow” matched multiple Smithsonian photographs of Rose Greenhow and a book about her life. The Library of Congress’s letter by Johannes Brahms matched Smithsonian photographs of Johannes Brahms. The following are some of the examples we used in our presentation.
This experiment provided evidence that the DPLA concept would work very well. Even though these records had never been on the same system before, this preliminary experiment worked immediately; the standard metadata and proper vocabulary control used in these records were the key to its success. The records all used Library of Congress subject headings and Form and Genre terms, and all contained properly formulated name headings. The system architecture proposed to the Beta Sprint proved to be robust and able to handle dynamic situations with very different records.
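To illustrate why shared vocabulary made the matching work, here is a minimal Python sketch of grouping records from different institutions by a common controlled heading. It is not the EDAN implementation; the record layout, the field names (`id`, `source`, `headings`), the function names, and the sample records are all assumptions made up for this example, and only the heading strings echo the searches described above.

```python
from collections import defaultdict


def normalize(heading: str) -> str:
    """Case-fold and collapse whitespace so identical headings compare equal."""
    return " ".join(heading.casefold().split())


def build_index(records):
    """Map each normalized heading to the ids of the records that carry it."""
    index = defaultdict(set)
    for rec in records:
        for heading in rec["headings"]:
            index[normalize(heading)].add(rec["id"])
    return index


def cross_source_matches(records):
    """Return {heading: sorted record ids} for headings shared across sources."""
    source_of = {rec["id"]: rec["source"] for rec in records}
    matches = {}
    for heading, ids in build_index(records).items():
        # Keep only headings that link records from more than one institution.
        if len({source_of[i] for i in ids}) > 1:
            matches[heading] = sorted(ids)
    return matches


# Hypothetical records for illustration only; not actual catalog data.
SAMPLE_RECORDS = [
    {"id": "loc-1", "source": "LOC", "headings": ["Civil War veterans"]},
    {"id": "si-1", "source": "SI", "headings": ["Civil War veterans", "Sculpture"]},
    {"id": "nara-1", "source": "NARA", "headings": ["Railroad trains"]},
    {"id": "si-2", "source": "SI", "headings": ["railroad trains", "Postcards"]},
    {"id": "si-3", "source": "SI", "headings": ["Posters"]},
]

if __name__ == "__main__":
    for heading, ids in cross_source_matches(SAMPLE_RECORDS).items():
        print(heading, ids)
```

Because every record carries headings from the same controlled vocabulary, a simple normalized lookup is enough to connect them in this sketch; with uncontrolled, free-text subjects the same approach would miss most of these links.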
This is a win-win project for all, and we encourage more libraries, archives and museums to join this great national project!
Ching-hsien Wang, Project Manager
Collections Systems & Digital Assets Division
Office of the Chief Information Officer