Tuesday, December 17, 2019

The Smithsonian’s Journey of Computerized Library and Archives

The Smithsonian Institution, with its 19 museums, 20 libraries and 14 archival units, prioritizes sharing its resources and discovering knowledge together with the public. Today, 15.5 million library, archives and museum objects and 10.4 million images are online to support research, education and public service. It has been a challenging but rewarding journey to transform a manual, paper-based Smithsonian into the digital Smithsonian of today. The evolution of automated library and archives systems at the Smithsonian, and the collaboration that made it all possible, is impressive, and I wrote these three blog posts to share this history with you.


PART I: THE FIRST INTEGRATED LIBRARY SYSTEM (1965-1994)


Ahead of Its Time from the Beginning

At its inception, the Smithsonian Institution Libraries (SIL) depended on paper card catalogs. In 1965, SIL began slowly converting from the Dewey Decimal System to the Library of Congress classification system. At the same time, it began the transition from handwritten catalog cards to computer-printed ones. The Smithsonian invested in the latest technology of the day: card-punching machines to support data entry.

Starting in 1975, SIL began working toward in-house automation of library operations, and in 1976 it joined OCLC (Ohio College Library Center) as a member.

In 1980, under the leadership of Director Robert Maloy, SIL envisioned a unified electronic system that would link the ordering, accounting, receiving, indexing, circulation and inventory control functions into one data flow. Mr. Maloy and others also began advocating for making information from libraries, archives, and museums accessible to the public through Smithsonian computers. This vision proved very challenging to accomplish, since SIL was still using an assortment of manual and semi-automated systems for its daily business. Even though it seemed an impossible goal at the time, it set the Smithsonian on the path to what we accomplished 30 years later.

In 1980, Stephen Toney, the first systems librarian at SIL, began to work closely with the Smithsonian central IT office, OIRM (Office of Information Resource Management), to secure stronger computer support. OIRM and SIL worked to purchase a dedicated computer system intended to support the needs not only of the library, but also of the archives and museum research offices. An RFP (request for proposal) was sent out to library system vendors in February 1983. The proposals were reviewed by staff from SIL, OIRM, the Smithsonian Institution Archives (SIA), the National Museum of American Art (NMAA) and others. In September 1983, a GEAC system was selected and named SIBIS (Smithsonian Institution Bibliographic Information System).

Implementing the First Library System

The GEAC system contained multiple modules to support Acquisition, Cataloging, Circulation and Email functionality. It was built around data in MARC (Machine-Readable Cataloging) format. The GEAC mainframe computer was installed in the basement of the National Museum of Natural History (NMNH), the museum in which most of the SIL staff worked. At the time, there was no Local Area Network (LAN) at the Smithsonian, so computer terminals could be connected to the mainframe only by long cables running within the building.

SIL successfully imported the catalog card data from OCLC into SIBIS via computer tapes. The new automation system brought a change in the library’s work culture: many staff were surprised that automation didn’t reduce their workload; instead, it required different kinds of work. The automated system demanded more accurate data, exposed mismatched inventory lists and shelving issues, identified missing or unreturned books, and produced lists of records needing enhancement. The inconsistent data from pre-automation days caused inaccurate search results and display problems; as a result, the top priorities for many years after the implementation were data cleanup, problem tracking, data standardization and enhancement work.

The library also transformed its departments and workflow to integrate the automated system, which allowed copy cataloging from records in OCLC. The head of the newly formed SIL Systems Office, Tom Garnett, learned to program on GEAC to produce reports such as new title lists, inventories, and acquisition orders. Marcia Adams, a systems librarian, focused initially on automating the circulation system that tracked book check-ins, check-outs, borrower records, and circulation reports. Even though it meant much more work, everyone agreed that the automated system increased work efficiency and the quality of library management.

A GEAC Computer Room in the 1980s

OIRM provided critical operational support for this groundbreaking endeavor. The GEAC system required 24-hour coverage by computer operators and was composed of proprietary hardware and software: an oversized mainframe CPU chassis, disk storage units, and tape drives for 10.5-inch magnetic tape reels. Computer operations support included regular magnetic tape loads of OCLC records, daily batch jobs that helped to maintain the databases, the generation of reports and printouts, and daily backups and restores during the midnight shift.


Adapting Existing Standards for Non-Bibliographic Content

Soon after installing the GEAC system, the Smithsonian began installing CO-LAN modems, which served as a primitive predecessor of the computer network. These allowed connections from the GEAC mainframe to computer terminals in different buildings. The American Art Museum Research and Scholar Center and the Archives of American Art were the first museum and archives to use a library management system for automation. In the early 1980s, there were no established data standards for non-library materials undergoing computer automation. Among existing standards, two came closest to fitting the Smithsonian’s needs:
1. UNIMARC (Universal MARC) format: Although most of its standards related more closely to library materials than to archival ones, the general approach and specific guidelines were still relevant.
2. AMC (Archival and Manuscripts Control) format: The instruction manual, developed by the Society of American Archivists in 1985, provided standards written specifically for archives.

With the standard selected, the immediate challenge was to map the data into the MARC format and enter it into the library system. The GEAC system was implemented as three separate databases: Library, Archives and Art Inventory. The Archives of American Art began creating descriptions (mostly at the collection level) for its collections in the Archives database. A couple of thousand descriptions were entered in just two years. However, the limitations of using a library system as an archival system soon became apparent: limits on record size and on the number of occurrences of a field caused major frustration among archivists for years.

I joined the Smithsonian OIRM in 1988 as system administrator and technical lead and became part of this exciting project. We worked hard to push the software vendors to fix these limitations, but the technology needed to address them was not available at that time. Nevertheless, several more archives joined SIBIS and continued to add records of greater complexity. Early implementers included the Smithsonian Institution Archives, the National Museum of American History Archives Center, the National Anthropological Archives, and the Human Studies Film Archives. The Smithsonian grew to become the institution with the largest body of electronic archival records online.

NMAA’s Art Inventory project also joined SIBIS as an early museum adopter of a library system. The highly specialized Art Inventory Database, which compiled and cataloged artworks created by American artists, was one of the leading online reference resources. The dataset documented sculptures and paintings with many data elements outside the traditional MARC format. Eleanor Fink, Chief of the Office of Research Support at NMAA, advocated adapting the MARC Visual Materials format and testing its flexibility for three-dimensional objects. The OIRM SIBIS team customized the GEAC system to accommodate the unusual data fields for indexing, searching and display. This strong collaboration between NMAA and OIRM produced the first successful large-scale adaptation of an art project to a library system. The implementation had early success, with 16,000 sculpture records imported in just a couple of years. It also pushed the GEAC system to its limits: the system could not support many of the customized data fields and special search indexes.


Raising Expectations and Improving Automation

Encouraged by the initial success of SIBIS in 1989, the Smithsonian Castle formed a SIBIS Management and Planning Committee with the purpose of elevating its performance, increasing funding to OIRM, and expanding its usage to more Smithsonian units. The funding structure was a “Cost Center” model in which units would transfer funds annually to OIRM. Ross Simon, an assistant to the Smithsonian Secretary, became the first chairman of this management committee. In 1992, SIBIS was renamed SIRIS (Smithsonian Institution Research Information System) to match the broader goal of the committee. The SIRIS board decided to look for a new generation of library information system. In December 1993, the NOTIS system was purchased and records were migrated to it. This new system ran on IBM 4381 mainframe computers. The terminals were Zenith PCs, booted from floppy disks, that emulated IBM 3270 terminals. Later, the PCs were upgraded to PS/2 computers, which had local disk drives that could hold the terminal-emulation software, and the floppy disks were retired.

Before the Smithsonian joined the World Wide Web (WWW), there were WAN, Gopher, and WAIS, which allowed internet access beyond the Smithsonian network. SIRIS was one of the first to take advantage of this, successfully implementing remote Telnet connections. The NOTIS system supported the TCP protocol through a TAG machine (an IBM RISC server) that provided internet searching capability. George Bowman, the main library system administrator, was the key technical staff member who took advantage of the latest technology.

In 1994, the OIRM SIRIS team successfully implemented a PACLINK function which, for the first time, allowed SIRIS terminals to access the online catalogs of several remote institutions, such as the Harvard Library, the Yale Library, and the WRLC Consortium (of George Washington University, Catholic University, American University and George Mason University). In the same year, we also made the Smithsonian catalogs (Library, Archives and Art Inventories) available to many other libraries around the US and Canada. The PACLINK function was based on the Z39.50 protocol for searching and retrieving information from a remote database over TCP. These services predated the WWW at the Smithsonian and the first desktop PC web browser; they were cutting edge!




Ching-hsien Wang, Branch Manager
Library and Archives Systems Support Branch (LASSB)
Office of the Chief Information Officer

