Thornton Staples
Director, Office of Research Information Services at the Smithsonian Institution
Interviewed 1/28/2010
Summary: Thornton Staples shares the story of how he discovered Sandra Payette’s paper on the Flexible Extensible Digital Object Repository Architecture system (better known as Fedora), how he liked its information architecture, and how his team tested (and proved) its scalability with 30 million objects. Staples also shares a model of how to successfully get humanities faculty hooked on digitizing their work and taking it to the classroom. What is the new frontier in digital libraries? According to Staples, it is in developing durable, extensible and interoperable repositories.
Quote: “So the new frontier is pulling it all together in a way that doesn’t get in the way of the scholars doing their work, but ends up with a durable product that can be in a repository and can be moved from one repository to another as it needs to be but is a stable part of the scholarly record, or I would even say the human record. I think the human record is the Web, and is this digital—sphere that we’re building. And—if we don’t get good at it, I think we’re in for a dark age.”
Hook the humanities faculty
The key was to hook the humanities faculty on using their digital information. This was the way to reach the classrooms and the students. The library needs to “work with faculty to digitize texts and make them available to them more generally.”
“And their notion was that if you hooked the faculty on using digital information in their work—in their research—they’ll take it to the classroom, and to spend a lot of money and a lot of time trying to inject this directly into classroom situations was a non-starter.”
“The committee that put together the original vision had enough vision to say, get the faculty hooked on their own research, and they’ll take it to the classroom. And that was really the driving—and I really don’t—wouldn’t change that. I think it really—it was successful.”
Test your system for scalability
Will your digital repository system be able to handle 30 million objects? Test it for its stability and scalability.
“So we did a new interpretation of their—of their architecture using one SQL database and one Java servlet and demonstrated all the principles would work and put thirty million objects by doubling—like, copying objects and changing the identifiers until we had, like, forty thousand real objects and we kept duplicating them to get thirty million, and the system was still working.”
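The duplication approach Staples describes (copy real objects, change the identifiers, repeat until the target count is reached) can be sketched in a few lines. This is only an illustration under stated assumptions, not the team's actual code, which ran against a SQL database and a Java servlet. The function name `synthesize_objects`, the dictionary layout, and the `id_prefix` parameter are all hypothetical.

```python
from itertools import cycle, islice


def synthesize_objects(seed_objects, target_count, id_prefix="test"):
    """Yield target_count objects by cycling over the seed objects.

    Each yielded object is a shallow copy of a seed with a fresh,
    unique identifier, so the seeds themselves are never modified.
    The "id" key and the prefix format are assumptions for this sketch.
    """
    if not seed_objects:
        raise ValueError("need at least one seed object")
    # cycle() repeats the seeds endlessly; islice() stops at target_count.
    for n, seed in enumerate(islice(cycle(seed_objects), target_count)):
        copy = dict(seed)
        copy["id"] = f"{id_prefix}:{n}"
        yield copy


if __name__ == "__main__":
    seeds = [
        {"id": "demo:1", "title": "First real object"},
        {"id": "demo:2", "title": "Second real object"},
    ]
    # A generator keeps memory use flat even for very large target counts.
    for obj in synthesize_objects(seeds, 5):
        print(obj["id"], obj["title"])
```

Because the copies are produced lazily, the same loop could feed a repository's ingest step one object at a time while the target count is raised toward the millions.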
Network and join forces to get funding
Thornton Staples optimized Fedora for real use. He joined forces with Sandra Payette of Cornell to seek funding for the Fedora project. Fedora is based on Payette’s original research on the Flexible Extensible Digital Object Repository Architecture. By joining forces, they obtained a Mellon grant that marked the beginning of the Fedora project.
“So in the meantime we started looking around for funding, and I had a couple of other Mellon grants for some other things. And Don Waters at Mellon had been the head of the Digital Library Federation right before he went to Mellon. And he’s very—he’d always been interested in FEDORA, the architecture. And we were having a drink one night at a conference, talking about another grant, and he says, “What about FEDORA? What are you guys doing with FEDORA?” and so that meant green light, green light, make a proposal, and so Sandra Payette who had done the original work at Cornell had in the meantime contacted me, saying, I’d really, she was basically getting jealous, so—we decided we got—we had a meeting. She came to Charlottesville and we had a meeting and we decided we were gonna join forces and try to get some funding. And Don had already—we’d already had this sort of opening with Mellon, so we put together a proposal. And that was the beginning of the FEDORA project.”
No rules
Sometimes, you’ll have to make it up as you go along.
“There weren’t any rules; no one knew what we were doing; we were making it all up from scratch.”
You are a peer, not the help
As a technical expert, consider yourself a peer of the faculty, not the help. The faculty know their subject, and you know the technology.
“We were worried in the very beginning that...these faculty were very well known in their fields and we thought we were going to be treated as the help and we weren’t.”
“We were treated—we were considered peers around the table at IF because they didn’t know what they were doing either.”
“They knew their subject; we knew the technology, but none of us really knew computing and the humanities and what it meant to put these two things together.”
“So they were pretty good about it. But we were worried, because when you work in the university as a technical person, you often get treated like the help.”
A successful model
Thornton Staples shares a model that was successful in getting faculty hooked on digitizing their work and taking it to the classroom.
• A committee of faculty judges proposals
• Faculty members propose projects
• When the committee approves a project, the faculty member takes a year off teaching to work at the institute on digitizing their work
• They get hooked on digitizing their work
• They take their digitized work back to the classroom and to the students
“The way IF got set up is the faculty would apply, they had a project that they’d propose, there was a committee of faculty who had to say that that was an interesting enough project, presented interesting technical problems, and the scholarly—value was high enough to make it worthwhile, and then they got a year off teaching. They got office space in the institute and we were—all the technical people, myself and others—were actually housed in the institute, so we were there together for a year. So—I think—that model worked really well.”
Lessons from Thornton Staples
Focus on structure, standards and organization early on in the project as part of the research.
“I think I would have switched—not in the first year but—over the four years, if we had switched to think more about—thinking of the overall structure and the organization of these projects as being part of the research, I think we would have—we would have shortened—we would have been where we should have been sooner.”
“It was—it was very much about new technology and not about standardizing the output. And that’s a good thing, but if we had just a little more thinking that standardization is research, I think we would have—we would have put the pieces of the puzzle together better sooner. And I don’t think we’ve put those pieces of the puzzle together yet, really, at all.”
“But if we had been thinking about that…we would have arrived at the digital library platting later—with the idea that we’re not just putting—digitizing books and putting them online, that’s part of it, but we’re really have to prepare ourselves for these complex—webs, graph-like structures of related objects that—that we’re—it’s clearly dealing with now.”
“The Web, scholarly record is clearly becoming like the Web, not like books and articles and journals.”
The new frontier
The new frontier is developing durable, extensible and interoperable repositories.
“So the new frontier is pulling it all together in a way that doesn’t get in the way of the scholars doing their work, but ends up with a durable product that can be in a repository and can be moved from one repository to another as it needs to be but is a stable part of the scholarly record, or I would even say the human record. I think the human record is the Web, and is this digital—sphere that we’re building. And—if we don’t get good at it, I think we’re in for a dark age.”
“They—they worked their butts off and they worked all these graduate students for years to get these really brilliant projects out there, and they’re like built on sand, and they don’t know it.”
“And they all, you would ask them and they would tell you the library’s gonna collect it and save it forever. And you know, the library, we already knew that we didn’t know how to do that…I think people still think that the libraries or the archives are just gonna do it for them.”