Smithsonian Collections Blog

Highlighting the hidden treasures from over 2 million collections

Tuesday, August 13, 2013

Unpacking Our Treasures: An Introduction to the Smithsonian Transcription Center

Across every museum, archive, and library, Smithsonian staff are hard at work digitizing our wonderful collection materials, with the goal of preserving and making accessible the millions of documents, specimens, and artifacts that represent our nation's heritage, our world cultures and our diverse planet.

But what happens after we digitize something? With a mission of "the increase and diffusion of knowledge", we at the Smithsonian are always looking for ways to make our efforts more useful and informative to researchers and members of the public.

The Need for Transcription

When it comes to letters, diaries, and manuscripts, it's clear that transcription is very handy. Before the invention of typewriters, documents were far more commonly written in longhand, which is not machine-readable and is therefore hard to search. Plus, sometimes it's just plain hard to read!

For instance, take a look at the paragraph below:

That's an excerpt from Mary Anna Henry's diary. She was the daughter of the Smithsonian's first Secretary, Joseph Henry, and wrote about her experiences in Washington, D.C., her father's work, and the start of the Civil War.

There's a wealth of wonderful stories that could be shared from that diary, but unless you're personally poring over these documents for hours or reading someone else's report on the diary, you'll never hear them.

A Quick Peek at the Smithsonian Transcription Center

That's why the Smithsonian is working on a crowdsourced transcription project that invites the public to help us uncover our many treasures. In the past few weeks, hundreds of volunteers have participated in our open beta and made thousands of contributions to a variety of Smithsonian collections, from field notebooks of early 20th-century American naturalists, to the painting diaries of the artist Oscar Bluemner, to the research diaries of chemist Leo Baekeland.

My colleague Sarah Allen and I (Jason Shen) are new additions to the Smithsonian, and we're incredibly excited and honored to build on the great work already started by a team that includes folks from the Office of the Chief Information Officer, SI Archives, Archives of American Art, Natural History, American History and many more.

We're still in beta and plan to make lots of improvements to the software, and if you're up for becoming a digital volunteer, we'd love to have you. Come check out the Smithsonian Transcription Center.

A little background on us:

  • Jason is a tech entrepreneur from San Francisco and cofounded a social transportation company called Ridejoy. He writes a blog on startups and personal development called The Art of Ass-Kicking.
  • Sarah has 20 years of software development experience and helped build the technology behind After Effects, Shockwave and Flash. She also helps women and minorities learn how to code through her nonprofit RailsBridge and writes at Ultrasaurus.
We look forward to sharing more of our work in the coming weeks and months, so stay tuned!

Jason Shen & Sarah Allen
Presidential Innovation Fellows

Oh, and about that diary entry: thanks to the work of volunteer transcribers, we now have a machine (and human!) readable form for that paragraph:
Nov 22 1858
Father has been looking over one of his old books tonight in which are recorded some of the experiments he made while at Princeton. The review has made him somewhat sad. He spoke of one experiment upon the effect of electricity encircling a ray of polarized light. It failed for want of sufficiently strong galvanic battery. Five years later Faraday made the same experiment, succeeded & gained the plaudits of the scientific world. I do not exactly understand this experiment -- shall ask for an explanation.


  1. Excellent post, Jason! We are glad to have your help in developing our transcription center.
