Scattered collections, brittle diaries, rare artifacts, handwritten plays, and more are flashing across computer screens worldwide, via the University’s groundbreaking digital library project.
By Caren Lissner
Ms. Doro Petre, who lived in America in the early 1700s, had a great recipe for almond pudding: “A quorter of a pound of almonds, 8 yoaks of egges, and 1/2 lb of butter, and halfe a pound of light honey sugear, and the peal of a leamon.”
She also had a concoction for combating consumption (tuberculosis): “Take 1 ounce of liquorish, 1 ounce of anneseeds, 1 ounce of fox lungs, 1/2 an ounce of flower of brimston, 4 drachms of alecampane & 4 ounces of brown or white sugar candy, beat all these to a fine powder & take as much att a time as will lye upon a groute, either fasting or going to bed it is best, but any time so it be 2 hours from eating.”
Such mixtures came from Petre’s 1705 cookbook, where herbal remedies had a place beside such edibles as “potato Pye,” venison sauce, pickled mushrooms, and wine custard. Early cookbooks, according to Michael Ryan, the director of rare books and manuscripts at Van Pelt Library, tell us about Colonial medicine, the culinary arts, and women’s roles in 18th-century home life.
To read Ms. Petre’s book in her own loopy ink, you could travel across campus or around the world to the sixth floor of Van Pelt Library, make a request to see it, and remain at a carrel with the dusty volume; the pages are as brittle and discolored as Goody Petre’s French bread, so you can’t take them with you. However, the pages also have been scanned by the staff of the library so that they can appear, as is, on your computer screen, allowing you to voyeuristically peek into a young woman’s innermost culinary thoughts. And just a few mouse-clicks away are page upon page from Philadelphian Elizabeth Cowperthwaite’s 1857-58 personal diaries; images of Alexander Calder’s and Alexander Liberman’s respective large red steel sculptures on opposite ends of the Penn campus; pages from Shakespeare’s original folios; and the 1909 Sears, Roebuck annual corporate report, which cheerfully announces a tidy profit of $6 million. This is all part of the virtual infrastructure being laid for Penn’s evolving “digital library,” a growing cyber-tool that allows rare and scholarly materials to be viewed, linked, and compared side by side on the Web.
The goal of Penn’s groundbreaking digital library project, which can be entered by clicking a link on the University library system’s main Web page (www.library.upenn.edu), is not just accessibility via computer screen. It is also easy use of information, with possibilities like “hotlinking” footnotes, so that someone reading a research paper might click on a footnote and jump directly to the article the paper cites. Another project allows viewers to cross-reference Shakespeare’s plays with a number of contemporary works that Shakespeare might have read; for example, 400-year-old pages of Hamlet and a 16th-century Bible appear side by side on a computer screen.
Five years of work have made Penn’s growing digital library one of the best in the world. Along with similar projects elsewhere, it will change scholarship by making more and more items visible, linkable, and accessible from all over.
No longer will scholars studying the 10th-century Egyptian Genizah text fragments that describe Mediterranean Jewish life have to travel to Philadelphia, New York, and the United Kingdom to see all of them, or to put the pieces together. No longer will music fans wanting to hear Robert and Molly Freedman’s 3,000 Yiddish folk records have to visit the couple’s home in Philadelphia. And someday, no longer will a scholar who wants to look at a collection of Colonial cookbooks, a favorite of Michael Ryan’s, have to visit the sixth floor of Van Pelt.
“They contain unexpected surprises,” says Ryan about the cookbooks, “and that’s the stuff of scholarship.”
The digital library started with what used to be the card catalogue. In 1986, the University’s library system—a network of 15 libraries, including fine arts, business, and engineering—began to make its card catalogue available on a network of computers. This computerized “Franklin” system is now accessible on a Web page rather than by dialing into a computer network.
Many functions have been added to the library’s main Web page over the years, including various databases and electronic journals. One recent addition of special interest to alumni is the “Alumni and Friends Portal,” which provides links available to alumni and others (see box).
The digital library, which can also be accessed from the library’s home page, contains several categories of information that have their own links. They include:
- The Schoenberg Center for Electronic Text and Image (SCETI), where the images of rare books, art, and other items are exhibited. There are currently 10 collections on the site that came out of specific Penn projects for which the library received a grant or support from alumni. There are also 10 “exhibitions,” including one about the invention of ENIAC at Penn’s Moore School of Engineering.
- Oxford University Press history books, which are readable online. The library is adding to the site the 200 to 300 historical volumes published by Oxford University Press each year. The books do not have to be scanned in, because the manuscripts are already on publishers’ computers and can be converted into Web format by Adobe Acrobat.
- “The online books page,” a rapidly growing list of links to thousands of volumes of fiction and nonfiction on Web sites all over the world. There’s a special (and quite helpful) database for women’s books.
- “Research and prototypes,” tools created to blaze new paths in digital librarianship. A big one is the Typed Object Model (TOM), a way to interpret and translate data formats so that materials added to or reachable through the library’s Web site are accessible even if the format is obscure.
Of course, each of these categories has links within it, some overlap, and many are likely to change as the digital library evolves.
The initial impetus for the digital library came from individual projects being pursued by various alumni and faculty members. In the 1990s, members of the library system’s Board of Overseers began proposing funds for digital library projects. Board chairman Larry Schoenberg C’53 WG’56, founder of AGS Computers, Inc. (an early software and computer services company), who has a collection of medieval manuscripts, gave $300,000 to the library in 1996 to begin scanning rare materials and putting them on the Web.
The library bought digital cameras, computers, flatbed scanners, CD writers, and slide scanners. They hired student assistants and set to work. The Schoenberg Center for Electronic Text and Image was born.
Soon after, the National Endowment for the Humanities awarded the library a $500,000 grant to help endow the center. The grant must be matched by $2 million in additional contributions to reach a $2.5 million endowment. So far, they’re halfway there.
“We grew the site large enough that we had to create a digital library,” says Michael Ryan. “We had gotten so big that it was just unwieldy to have this collection of sites not knotted together. We needed to stitch all of our sites into a single database. It took it from being just another conglomeration of Web pages to, I think, being the first totally integrated Web library.”
At the same time, an English Department project was sprouting and eventually attached its tendrils to the library Web site.
In 1996, Dr. Rebecca Bushnell, professor of English and associate dean in the School of Arts and Sciences, along with Dr. Jamey Saeger Gr’96, then a doctoral student, considered making the original quartos and folios of Shakespeare’s plays (the different-sized and different-quality versions dating from the Bard’s own time) available to their students. Shakespeare’s writings have been altered so much over the centuries, and even during his own time, that it’s difficult to know which words he originally used. But scholars can compare and decide by reading the first quartos and folios. Until the site went up, there was no way to do that without sitting in a university library shifting a pile of 400-year-old books.
Bushnell and Saeger applied to the National Endowment for the Humanities and in 1998 received a $200,000 grant to create a Web site over three years. What resulted is a site that does not merely allow researchers to see pictures of Shakespeare’s quartos and folios. Instead, the site boasts a mechanism by which users can call up a Shakespeare text and a book Shakespeare might have read at the time and view them side by side. (To try it, go to www.library.upenn.edu/etext/collections/furness/index.html.)
Say, for example, that you are skimming through Othello and recognize Iago’s line “I am not what I am” as similar to a line from Exodus in the Bible. You can then click the Shakespeare library’s 1583 Bible, “translated according to the Ebrew and Greeke, and conferred with the best translations in divers languages … imprinted at London, by Christopher Barker, Printer to the Queenes most excellent Maiestie.” Pages of both texts appear on the screen simultaneously.
You can also compare two early versions of Hamlet (which had three versions during Shakespeare’s time) side by side, or simultaneously view Raphael Holinshed’s Historicall Description of the Iland of Britaine (1587) and Henry Smith’s late-1500s sermon on The Examination of Usury.
(Unfortunately—and this is something digital librarians generally are working on—there is no particular search method available to find the actual line you are looking for in the Bible or King Lear. Because each page is scanned as a single image, the individual words are not recognized by a search. Coding each word of handwritten text is a long and expensive process, but it has been done at other universities with some materials.)
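The limitation is easy to see in a minimal sketch. It assumes, purely for illustration, that each facsimile page is stored as an image file with an optional hand-keyed transcription; the `ScannedPage` record, the `search_pages` function, and the file names are invented here and do not describe the library's actual system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScannedPage:
    """One facsimile page: always an image, only sometimes a transcription."""
    image_file: str                      # hypothetical file name for the page scan
    transcription: Optional[str] = None  # text keyed in by hand, if anyone has done it


def search_pages(pages: List[ScannedPage], phrase: str) -> List[str]:
    """Return the image files of pages whose transcription contains the phrase."""
    needle = phrase.lower()
    return [
        page.image_file
        for page in pages
        if page.transcription is not None and needle in page.transcription.lower()
    ]


pages = [
    ScannedPage("page_004.jpg"),  # image only: invisible to any text search
    ScannedPage("page_005.jpg", transcription="I am not what I am."),
]

print(search_pages(pages, "not what I am"))  # ['page_005.jpg']
```

A page that exists only as a picture can never match, however clearly a human reader can see the words on it. That is why coding the text is the expensive step.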
“I don’t know of any other place where you have the two frames where you can view the materials,” says Bushnell. “It is really cool. The thing that’s exciting and unique about the site as well is that it’s digital facsimiles, rather than hypertext. It preserves the way the text looks. What we’re trying to do is reproduce the experience of reading an early modern book.”
A valuable tool for both serious scholars and novice Shakespeareans, the texts have been used to teach Penn undergraduates and Philadelphia high school students as well.
“What you usually get when you buy the Penguin edition is a modern editor’s reconstruction of what that text could be,” Bushnell says. “If you go to the Web site, you get to make up your own mind about what text you want to read. It allows students, scholars, and teachers to get the raw materials that people use to study Shakespeare.”
Rayna Goldfarb, chair of the English Department at Lincoln High School in Northeast Philadelphia, says that she and more than a dozen other teachers in the Philadelphia public schools benefited from being able to bring the scanned Shakespeare texts to students. But she also feels that the site reinvigorated the teachers’ own intellectual curiosity.
“It taught me to look at the various texts, the permutations and drafts, to look at textual criticism, to look at the history,” Goldfarb says. “It was an opportunity for English teachers to come together from several high schools, and to make it a little more scholarly, although not dauntingly so.”
Besides the Shakespeare site, other SCETI collections include:
- Annual corporate reports from the Lippincott business library. How did GM do in 1918? How did the Bangor & Aroostook Railroad do in 1924? How did United Carbon do in 1941? (A peek at the report shows that it did a lot better than in 1931!)
- The Robert and Molly Freedman Jewish Music archive
- South Asia studies
- Women’s studies (that’s where Ms. Petre’s cookbook is)
- The Marian Anderson collection of photographs, 1898-1992
Alongside the “collections” section on the Schoenberg site is a list of equally interesting “exhibitions,” including, among others:
- Audio and video of “Marian Anderson: A Life in Song”
- “John W. Mauchly and the development of the ENIAC Computer,” which recounts the efforts of Mauchly Hon’60 and J. Presper Eckert EE’41 GEE’43 Hon’64 to create the Electronic Numerical Integrator and Computer.
Rare books director Michael Ryan wants to get all of the cookbooks up, along with Penn’s holding of Italian imprints from the 16th and 17th centuries.
“There are some Spanish Golden Age materials,” he says. “There’s nothing like it in this country. If we could find a pot of gold somewhere, I could begin working on this. It’s terribly important.”
Although the images of rare materials in the Schoenberg digital center are of great interest, the digital library site is also useful for finding actual digital books, which fall into two categories:
1. Books on the Internet: The online books page, which is free, has links to 14,400 English-language volumes available all over the Web. John Mark Ockerbloom established it in the early 1990s as a personal project when he was a Carnegie Mellon student. (He was hired by Penn in the summer of 1999 and allowed to take his work with him.) Ockerbloom doesn’t put the books up himself, but receives information about books and magazines posted on other sites and links to them.
The page also links to “A Celebration of Women Writers,” edited by Ockerbloom’s wife, Mary. The site includes a database of books by or about women, which is searchable, amazingly, by author, century, country, and ethnicity. The site also encourages a “Build-a-Book” program through which people can volunteer to type or scan in their favorite book and add it to the site. Only books whose copyright has expired (meaning books that were published before 1923 and not revised or translated since then) can be added. An exception is a book whose copyright holder agrees to allow it to appear online for free. Once the books are scanned or typed in, they must be proofread, a service often performed by Mary Ockerbloom. Approximately 90 books have been added this way.
2. Oxford University Press history books: Assisted by a grant from the Mellon Foundation, the library expects to post every new history book published by Oxford University Press over five years, or 200 to 300 books annually. (In all, the Press publishes about 3,000 books per year.) Besides offering the books to students and faculty, the library is doing a study on how the e-books “impact on teaching, learning, and book sales.” (Cambridge University Press is scheduled to begin participating later this year, and will post 75 history titles.)
The study will examine who is using the e-books, how often they are being used, and how electronic availability affects sales and library circulation of the actual hard copies. Library researchers wanted to wait to start the study until there were enough books on the site. Now that they have 300, they will conduct the study for the last three years of the grant, collecting data via the Web and from bookstores, as well as from focus groups and interviews with users. They’re also considering the use of online surveys.
History books were chosen because of the strengths of both Penn and Oxford University Press in the field. “We have a very strong and large history department,” says Paul Mosher, vice provost and director of the libraries. “History cuts across so many disciplines—anthropology, English, foreign languages, the history and sociology of science, anything that looks at the past to educate us about the present. And Oxford University Press is a good history publisher. So what they’re strong in would be the thing to do.”
The Oxford books can’t be accessed by alumni, but there are three sample books on the “public preview” portion of the site that can be read by anyone with Adobe Acrobat Reader (which is free to download).
Also in the works is an effort to use hyperlinks to connect listings in the books’ tables of contents and indexes to related passages inside the books. And the feasibility of footnote hotlinks is being tested by Penn. The footnotes could be linked to the electronic library’s 2,000-plus journals, or to journals on other institutions’ sites.
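As a rough illustration of what a footnote hotlink involves, here is a minimal sketch. It assumes a simple lookup table from journal title to Web address; the table, the `hotlink_footnote` function, and the example.org address are hypothetical and are not the mechanism Penn is testing.

```python
import html
from typing import Dict

# Hypothetical catalogue: journal title -> address of its online edition.
JOURNAL_URLS: Dict[str, str] = {
    "Cultural Analysis": "https://example.org/journals/cultural-analysis",
}


def hotlink_footnote(citation: str, journal: str) -> str:
    """Wrap a footnote in an HTML link if its journal is in the catalogue."""
    safe_text = html.escape(citation)
    url = JOURNAL_URLS.get(journal)
    if url is None:
        return safe_text  # no online copy known: leave the footnote as plain text
    return f'<a href="{html.escape(url, quote=True)}">{safe_text}</a>'


print(hotlink_footnote("After the Ball is Over, Cultural Analysis", "Cultural Analysis"))
print(hotlink_footnote("Smith & Jones, Unknown Journal", "Unknown Journal"))
```

The first footnote comes back wrapped in a link; the second, whose journal is not in the table, stays plain text. The hard part in practice is the table itself: someone has to know where each cited journal lives online, on Penn's site or another institution's.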
With all of the universities in the world, some of which, like Penn, are hard at work on digital libraries, and all of which have their own particular academic interests, who decides what goes up on Penn’s site and what goes elsewhere? Why does Penn get to do the Shakespeare project? What about the Egyptian Genizah fragments?
“That’s clearly a big issue right now,” Ryan says. “So much has happened. There is a very pressing need for coordination. We try to let [others know] now, since people are already advanced on projects. We’re going to try to, in the future, steer clear of Shakespeare-type projects someone else might be doing. There’s a lot of encoded text out there for Shakespeare’s plays (text that is marked up so it’s searchable in certain ways). A number of places are focusing on that: MIT [and] Tufts in Boston, Canadian universities, and I’m sure in England.”
Penn is a member of the Digital Library Federation, a Washington-based consortium of libraries and related agencies founded in 1995 by the Library of Congress, the National Archives and Records Administration, the New York Public Library, and 12 university libraries. Its main goal, according to its 1995 charter, is to promote “the implementation of a distributed, open digital library conforming to the overall theme and accessible across the global Internet. This library shall consist of collections to be created from the conversion to digital form of documents contained in our and other libraries and archives, and from the incorporation of holdings already in electronic form.” The DLF is kept alive by grants and by donations from members.
“Penn has attracted talented computer scientists and systems experts,” says Sarah Thomas, the university librarian at Cornell’s Olin Library. “The Penn Library stands out for the elegance and clarity of its communications. [Paul] Mosher and his staff are developing considerable momentum—much to the benefit of Penn students and faculty and the research community at large.”
Mosher cites a research study that said that 93 percent of what was on the Internet was sales-related, and that much of the rest was religion and pornography. A site like Penn’s cuts out the chaff. “The attempt is to try and select and make accessible information that has use within our scholarly community,” he says. “If people do a Google search, 90 percent of what they uncover has no real value.”
In the future, of course, the initial steps taken on the digital library will seem primitive. “It will never end, quite literally,” Mosher says. “By the time we even get a big hunk of it done, we’ll be entering a post-digital age. You’ll be able to read the back of your hand and pull out whatever you want.”
Obsolescence of technology worries Mosher a bit. But there’s something else that worries him more.
“In the beginning,” he says, “if we wanted to study the creative process, we looked at the introductory sketches. We have those from all over the world. Now, if you are working on a computer, you usually strike out the old as you create the new.”
But while those creative steps are erased because of the advancement of technology, other items are preserved. The use of the digital library prevents pages in old books from continually being flipped through and thus falling apart (go to the Shakespeare site and check out the sad, sad first page of the second quarto of Othello).
And because the browsing is done in the comfort of one’s own home or office, alumni, scholars, Philadelphia high school students, and anyone else who visits have a better chance of actually using the materials. And this, according to Michael Ryan, is the point.
“If we can build new constituencies, that’s going to make the humanities so much stronger and so much healthier in this country,” Ryan says. “All the fights that go on over the piddling humanities budget, it’s because the humanities are marginalized. Now the humanities have a chance to come out of the closet and demarginalize themselves on the Web. And that’s where the University is headed as well.”
Caren Lissner C’93 liZZner@aol.com is the editor of the Hudson Reporter newspaper group in North Jersey. A novel she wrote was recently optioned by a film production company. She has no idea where to find fox lungs.
What’s Available Through the Alumni Portal
You can tap a few computer keys and read “After the Ball is Over: Bringing Cinderella Home” in the new folklore periodical Cultural Analysis, but you can’t read anything in the Scandinavian Journal of Rheumatology.
You can access the Population Index (a database of demographic information from 1986 to the present), Toxnet (a group of government databases on hazardous chemicals and poisons) and Northern Light (a database full of searchable newspaper and magazine articles and Internet documents), but you can’t get into Lexis-Nexis (an extensive compendium of major newspaper and magazine articles and biographical data).
The library system has worked to make many Web site materials available to alumni and the general public, but not everything is. Thus, it has created a link called the Alumni and Friends Portal that contains only unrestricted items.
Here are some links that can be found on the Alumni and Friends Portal (www.library.upenn.edu/portal/) on the library homepage:
- E-news and e-journals (links to newspapers like The New York Times, The Wall Street Journal and The Daily Pennsylvanian, plus a host of electronic journals organized by subject)
- High-content Web sites (links to thousands of sites for specific topics, ranging from law to Philadelphia to organic chemistry to African-American studies)
- The “library showcase” (a link to the digital library)
- The library book club (descriptions of featured books by Penn authors that are available for purchase)
- The library catalogue (Franklin)
- Alumni news and events, and information about donating.
Of the databases available on the library’s main page, 22 are available to anyone, and 187 can be accessed only by Penn faculty, students, and staff. A section for online journals has 951 available for anyone and 4,877 restricted to Penn faculty, students, and staff.
The reason for this is money. The University pays a fee each year to license certain online databases and journals. Sometimes Penn pays as part of a consortium, sometimes alone. To make the materials available to the 40,000 students and others in the University community costs one amount, but to make them available to the 248,000 people in the alumni community is much more expensive.