Kenning Arlitsch - Learning Resources
Associate Dean for Information Technology Services, Marriott Library, University of Utah
Kenning Arlitsch discusses four very successful digitization projects: the Mountain West Digital Library (portal for digital collections about the Mountain West region), Utah Digital Newspapers, Western Waters Digital Library (primary and secondary resources on water in the western United States) and the Western Soundscape Archive (thousands of recordings of Western animal species and their environments). Learn about the “lightweight” digital library model, the power of pilot projects, the importance of marketing your digital collections to the public, and the demand for catalogers who understand linked data and how it can help create context around materials.
“Everybody, virtually everybody in the library is now a stakeholder in how their digital collections…get presented to the world. And whether the world can find them.”
Mountain West Digital Library (MWDL)
MWDL “is a portal to digital resources from universities, colleges, public libraries, museums, archives, and historical societies in Utah, Nevada, Idaho, and Hawaii.” (About)
What initially began as a project in 2001 to digitize 200 glass-plate negatives from the Utah State Historical Society’s collection has grown into a portal for over 690,000 resources (which translates to several million digital objects) from the digital collections of 17 institutions and over 60 partners including universities, colleges, public libraries, museums, historical societies, and government agencies, counties, and municipalities in Utah, Nevada, Idaho, Wyoming, Hawaii and other parts of the U.S. West.
MWDL’s successful distributed, lightweight digital library model
- The University of Utah manages the aggregating server and the website
- Each digitization center digitizes their own collections and puts them on the University of Utah’s aggregating server
- Each center also supports and hosts other institutions (historical societies, museums, etc.) that cannot afford their own digitization and digital preservation infrastructure
- Use an OAI harvester that collects only the metadata from each center’s server and brings it back to the aggregating server; each server exposes its metadata through OAI
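To make the metadata-only harvesting step concrete, here is a minimal sketch of what a harvester does with one page of results. It assumes the servers speak OAI-PMH 2.0 and expose Dublin Core records, which the interview does not spell out. The sample response, the example.org identifiers, and the function name are hypothetical. A real harvester would fetch the XML over HTTP and repeat the request with the resumption token until none is returned.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-written reply shaped like a ListRecords response.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:photo-1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Glass-plate negative, Main Street</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header>
        <identifier>oai:example.org:photo-2</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Glass-plate negative, Rail Depot</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall("oai:ListRecords/oai:record", NS):
        records.append({
            "identifier": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": record.findtext(".//dc:title", namespaces=NS),
        })
    # An empty or missing token means there are no more pages to request.
    token = root.findtext("oai:ListRecords/oai:resumptionToken", namespaces=NS)
    return records, (token or None)


records, token = parse_list_records(SAMPLE_RESPONSE)
for rec in records:
    print(rec["identifier"], "|", rec["title"])
print("next page token:", token)
```

Only identifiers and descriptive metadata cross the network here. The digital objects themselves stay on the hosting server, which is what keeps the aggregating side lightweight.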
-- Lessons --
- Management: Hire a program director to lead, manage, and grow the digital library by bringing in new partners and new collections
- Financial: Use a lightweight model that does not rely long-term on soft money (grant funding)
- Ex Libris
- MWDL is using Ex Libris’ Primo for their harvesting mechanism and search interface.
- The University of Utah recently became only the third institution in the United States to purchase Ex Libris’ Rosetta digital preservation software. Two of the other purchasing institutions (the LDS Church and BYU) are also in Utah, so a digital preservation network built on the backbone of MWDL is now under consideration.
Utah Digital Newspapers (UDN)
The Utah Digital Newspapers Program has over 1.3 million pages. Because every article on every page is currently a separate file, those 1.3 million pages translate to about 17 million individual files.
They are gradually transitioning to the newer method of digitizing, which focuses more on creating individual JPEG 2000 files for each page, with built-in article coordinates. This will reduce the file count from 17 million down to the actual number of pages, currently at 1.3 million. They will use METS/ALTO, the new metadata and processing standard for digital newspapers. This will address scalability issues.
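The scale of that reduction is easy to check with the two figures quoted above. The short sketch below is only illustrative arithmetic on those rounded numbers; the function name is hypothetical.

```python
def file_count_summary(pages, current_files):
    """Average files per page today, and the share of files that
    disappears once each page is stored as a single file."""
    files_per_page = current_files / pages
    fraction_removed = 1 - pages / current_files
    return files_per_page, fraction_removed


# Rounded figures quoted for Utah Digital Newspapers.
files_per_page, fraction_removed = file_count_summary(1_300_000, 17_000_000)
print(f"average files per page today: {files_per_page:.1f}")
print(f"share of files eliminated: {fraction_removed:.0%}")
```

On these figures each page currently averages about 13 article files, and moving to one file per page removes roughly 92% of the files that have to be stored and tracked.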
-- Lessons --
- If you try developing processes in-house and they prove unable to scale, find external help. Work with an outside vendor to develop the process.
- Hire a project manager
- Write the grant with the project manager
- Encourage the project manager to write and submit other grants
Western Waters Digital Library (WWDL)
WWDL “provides free public access to digital collections of significant primary and secondary resources on water in the western United States.” (Homepage)
WWDL was essentially built on the model of the MWDL. A successful concept, project, and model demonstrate a track record, which helps you expand and win more grants.
Western Soundscape Archive
This collection has nearly 3,000 individual sound files. All of them can be streamed on the website. Some of the files can also be downloaded and re-used for educational purposes, if the copyright or Creative Commons licenses allow.
-- Lessons --
Audio and video files are huge. They will require petabyte-scale storage, which is expensive.
- “Audio and video files are our biggest sector of growth.”
- “We have—we have roughly 100 terabytes of digitized data that we’re—that we’re having to manage. Over the next five years, we expect that to grow to about 250 terabytes, or a quarter petabyte.”
- “The vast majority of that growth will be in audio and video files. Because they’re—they’re just bigger. So yeah, that creates—that creates stresses on our infrastructure, it creates stresses on our funding, and it creates huge stress on storage and digital preservation.”
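For planning purposes, the quoted figures (about 100 terabytes now, about 250 terabytes in five years) can be turned into an implied yearly growth rate. The sketch below assumes steady compound growth, which is a simplification the interview does not claim, and the function names are hypothetical.

```python
def implied_annual_growth(start_tb, end_tb, years):
    """Constant yearly growth rate that turns start_tb into end_tb."""
    return (end_tb / start_tb) ** (1 / years) - 1


def project_storage(start_tb, rate, years):
    """Storage at the end of each year, beginning with year 0."""
    return [start_tb * (1 + rate) ** year for year in range(years + 1)]


rate = implied_annual_growth(100, 250, 5)
projection = project_storage(100, rate, 5)
print(f"implied growth: {rate:.1%} per year")
for year, size in enumerate(projection):
    print(f"year {year}: {size:.0f} TB")
```

Under that assumption the collection grows by roughly 20% per year, so storage and preservation budgets would need to grow at a similar pace.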
A lightweight funding sustainability model
Develop lightweight projects that require minimal staff and funding.
- “The models that have been set up for Mountain West Digital Library and for Western Waters Digital Library are not contingent on new funding coming in. The model is lightweight enough that there are very few personnel, beside my IT staff, who support these, and—it just sort of vacuums up the metadata of collections that the participating institutions we hope would digitize anyway. So in that way it’s a relatively lightweight model.”
Marketing what libraries have created
Libraries, in general, do not market either themselves or what they have created very well. Libraries create products and services that are focused on the public. Libraries would benefit from greater public support if they marketed these products and services effectively.
Fund your marketing costs by building a marketing budget into grant proposals. Create a sustained marketing plan. Engage your marketing and development departments to promote your digital collections.
- “So as popular as this program has been, I think it could be a lot more popular if we had a genuine marketing and advertising program.”
- “I think what we’re still lacking is getting into the—general consciousness of—of the public. And this particular project, the Utah Digital Newspapers Program, and Mountain West Digital Library are—are really focused at the general public.”
- “We don’t advertise ourselves well enough. And I think what we have to do—it’s a culture change, frankly, in libraries.”
- “The public doesn’t need us as much as they used to. So we have to make more of an effort to show them what we can provide. And part of that means thinking more like businesses. Actively engaging our university marketing departments.”
- “I think in general, development departments and marketing departments have to start promoting digital collections more.”
Utah Digital Newspapers is still not part of the Mountain West Digital Library due to scalability issues. Using Ex Libris’ Primo aggregating software will help make UDN more scalable. Then the metadata from the millions of records in UDN will be aggregated into MWDL. Using the new METS/ALTO metadata and processing standard will reduce the file count from 17 million down to the actual number of pages, currently at 1.3 million, and will address some scalability issues.
The power of pilot projects
Pilot projects allow you to prove concepts. When you develop a concept, build a pilot website that lets people see and hear it. Pilots are powerful tools that help you communicate your vision to grant funders.
- “Pilot projects are always great things…they allow you to prove the concept.”
- “I hired Jeff Rice for six months just on departmental money to help us develop the concept, build a pilot website, put some—put some sound files into a collection. And then, based on that, we wrote a proposal to IMLS. And this was late 2006, early 2007 and we were funded in September of 2007. And that was another three year proposal.”
Find private donors
Focus on working with private donors to fund your digital collections.
- “Most of what I see in terms of—in terms of development, in terms of donor relations, still focuses on the building and print materials, in particular our special collections. There’s not very much of a focus yet on trying to get external funders, private donors, to contribute money to digital collections like this. And that—that has to change.”
- “How we do that? We just have to keep talking about it. Just have to keep promoting it and pushing it. And—you know, things change slowly. But they do change.”
Copyright limits innovation and the growth of knowledge.
- “Copyright was never intended to be for the life of the author or even beyond. The—the current copyright laws are ridiculous. It’s something like life of the author plus 75 years or maybe—maybe it’s even longer at this point.”
- “Initially—when copyright was developed by the founding fathers, it was intended to give the inventor or the author some years of profit from their—from their works. But eventually, things have to go into the public domain because that’s how innovation happens. That’s how knowledge increase happens. And locking these things up and making them unavailable—freely accessible—I think hurts us.”
ADVICE FOR STUDENTS
Scientists need help managing their data.
- “There are—there is room for tremendous innovation and the places now where I’m seeing a lot of room for growth and a lot of room for librarians really taking the bull by the horns is data management—managing—figuring out with scientists how to manage their data.”
- “Because I can tell you that most scientists really have no idea how to deal with their data. They don’t even know the right questions to ask. Everything from basic storage, where do I—where do I store my data to migrating it to—putting it in a database to make—and assigning metadata to it and making it accessible. So—so data management is a huge issue.”
Catalogers and linked data
Wanted: Catalogers who understand linked data and how it can help create context around materials.
- “Metadata itself is—I think is—I think there is more need now for cataloging librarians than there’s ever been before, it’s just a—it’s just a tremendous paradigm shift.”
- “The cataloguing librarians who are willing to think in new ways, who are willing to think about linked data and how it can—help—create context around materials is—is just enormous.”
- “And we’re behind the 8-ball. We’re falling behind in that area again. So I think there’s tremendous potential for growth there.”
Read widely and broadly
- “I would suggest that you—read widely. You don’t have to become an expert in any particular field but you have to know what’s going on broadly.”
Hire the right people into the right positions, give them the power to run projects and the opportunity to creatively solve problems.
- “Try to achieve success by leading and providing opportunity.”
- “I think the best things that I’ve done in my career are hiring the right people and putting them into the right positions. Giving them the power to run with projects. You have—you have no idea what—what people can do until you present them with a problem and an opportunity and see how creative they can be. See what they can bring back to you.”
- “If you go into management, you have to be prepared to earn your salary. Which means that sometimes you have to—you have to counsel people out. You have to—deal with—with staff who are not doing a good job and who are unproductive.”
- “Always try to—to achieve success by leading and providing opportunity, but you have to be prepared to deal with the other side as well, and that’s—that’s not easy.”
Core issues in digital libraries
The core issues in libraries and digital libraries are the same. What has changed are the tools, methods and the need to work across boundaries.
- “We still have to acquire materials, we still have to organize them, we still have to make them accessible to the public and we still have to preserve them. Right? Those are the core—core issues in libraries”
- “In a digital library it’s no different. I need the special collections people to bring the collections in, I need the public services people to tell their patrons about these collections, I need the catalogers to help—help people make sense of the digital collections. None of that has really changed. It’s just the tools and the methods have changed.”
- “But what has changed is the need to work across the lines, the need to work across boundaries.”
SEO and institutional repositories (IRs)
Arlitsch’s search engine optimization (SEO) research shows that institutional repositories are practically invisible to Google Scholar. However, Arlitsch discovered that “this is less of a technical problem than it is of an administrative and a communication problem.”
“Everybody, virtually everybody in the library is now a stakeholder in how their digital collections—or how the library’s digital collections get—presented to the world. And whether the world can find them. And so search engine optimization has to be talked about broadly across all departments.”
Article: Invisible institutional repositories: addressing the low indexing ratios of IRs in Google.
Kenning Arlitsch, Associate Dean for IT Services and Patrick O’Brien, SEO Research Manager, are both from the J. Willard Marriott Library at the University of Utah.
Two of the authors’ pilot studies on the University of Utah’s IR, USpace, demonstrated that converting the metadata tags to the more precise bibliographic Highwire Press tags increased the indexing ratio from 0% to 62% in the second pilot and to over 90% in the third pilot.
The authors conclude that IRs can substantially increase indexing ratios when libraries do the following:
- Use the metadata schemas that Google Scholar recommends -- Highwire Press, EPrints, PRISM, and Bepress
- Provide precise bibliographic information in the HTML page header tags.
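As an illustration of the second recommendation, the sketch below builds Highwire Press style meta tags for the header of one IR landing page. The tag names (citation_title, citation_author, citation_publication_date, citation_pdf_url) follow Google Scholar’s published inclusion guidelines. The function name and the PDF URL are hypothetical, and this is not the code used for USpace.

```python
from html import escape


def highwire_meta_tags(title, authors, publication_date, pdf_url):
    """Return HTML <meta> tags for one item, one citation_author tag per author."""
    fields = [("citation_title", title)]
    fields += [("citation_author", author) for author in authors]
    fields.append(("citation_publication_date", publication_date))
    fields.append(("citation_pdf_url", pdf_url))
    # Escape values so quotes or ampersands cannot break the attribute.
    return "\n".join(
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in fields
    )


tags = highwire_meta_tags(
    title="Invisible institutional repositories: addressing the low "
          "indexing ratios of IRs in Google",
    authors=["Arlitsch, Kenning", "O'Brien, Patrick"],
    publication_date="2012",
    pdf_url="https://repository.example.edu/ir/1234.pdf",  # hypothetical URL
)
print(tags)
```

Each author gets a separate tag, and the values are plain bibliographic fields, so a crawler does not have to guess which part of the page is the title or the date.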
The authors specify two additional factors that can improve IR content visibility to search engine crawlers:
- Addressing technical SEO issues
- Optimizing HTML tags in PDF files
This article has implications for academic libraries and institutions. Libraries invest significant hours in marketing their IRs to faculty in order to establish buy-in and persuade faculty to deposit their publications. With these findings, libraries can demonstrate to faculty that using the IR will make their publications significantly more visible. Communicating that personal benefit can help establish long-term adoption. Communicating how the university benefits from the higher rankings that result from an improved IR can also help bolster support for the library.
Arlitsch, K. and O’Brien, P.S. (2012). Invisible institutional repositories: addressing the low indexing ratios of IRs in Google. Library Hi Tech, 30(1), 60–81. Retrieved from http://dx.doi.org/10.1108/07378831211213210
ACRL Webinar from June 6, 2012: Google Scholar and Institutional Repositories: Improving IR Discovery (http://www.ala.org/acrl/irdiscovery)