By Allison Campbell-Jensen
Collecting research data on Lake Superior’s seasonal variations and understanding how lake temperatures might respond to climate change is the work of Jay Austin, a professor in the Department of Physics and Astronomy at the University of Minnesota Duluth and a member of the Large Lakes Observatory.
Ensuring that data can be shared with other researchers, however, takes the efforts of librarians and data curators at the University of Minnesota and beyond — members of the Data Curation Network (DCN).
Austin and other researchers increasingly are encountering requirements from scientific journals and grant funders to store their data in places and ways that make it available online to anyone.
“This is the direction that things are going — more transparency — and I embrace that,” Austin says. The Data Repository for University of Minnesota (DRUM), part of the University Digital Conservancy (UDC), meets these storage requirements.
Ensuring that the data is reusable, however, takes expertise. When Lisa Johnston, Data Management/Curation Lead and Co-director of the UDC, noted that one of Austin’s datasets used MATLAB software, she tapped into the expertise of the broader Data Curation Network, of which the University of Minnesota is a founding member and the administrative home. The DCN coordinator consulted the pool of curators, and a Johns Hopkins University data curator with expertise in that software was assigned to curate the dataset.
Data curators act as a second set of eyes, looking for any changes needed in Austin’s data: perhaps a file is missing, or the curator catches a mistake. After such a review, the data is stored in a way that is findable, accessible, interoperable, and reusable (FAIR), which is the mission of the DCN. Austin, his research peers, and community members seeking to use his data all benefit from the DCN curator’s work.
The DCN website cites 271 datasets that have been curated via this cross-institutional staffing model. “The network is always learning new methods as we share and iterate our processes of data curation,” Johnston says. “It’s been amazing to do that with these 50 people.”
The Data Curation Network is about four years old. During that time, Johnston, the principal investigator on a grant from the Alfred P. Sloan Foundation to plan and launch a Data Curation Network, has seen it grow from six institutions to 14. The Sloan Foundation and the Institute of Museum and Library Services have supported the DCN as it has evolved.
Duke University joined the DCN in its second year. Duke was building a program and wanted to create a model connected to others, says Tim McGeary, Associate University Librarian for Digital Strategies and Technology at Duke University. Duke was also in the process of hiring four data curators but recognized that demand would likely exceed its staff capacity, so having collaborators available was important. Another valuable part of the DCN, McGeary says, is that the network is “normalizing the expectation that data curation is important to our research institutions.”
One Duke faculty member was particularly eager to pursue data curation, McGeary says. This chemistry professor literally could not sleep at night because he kept receiving requests for projects and research data he hadn’t worked on in years. In contrast to the past, many research assistants do not continue in academia.
“The expectation was if you stay in academia, you’re taking the data with you, you’re expanding on it or continue to work on it,” McGeary says. Outside of academia, there isn’t that level of commitment.
“After we had hired a team, he was the first to work through a process of what a data curation workflow looks like, all the way through publishing his data in the research data repository, which we had just recently built,” McGeary says. The professor now requires all his collaborators, inside Duke University and outside, to participate in data curation. And, he tells McGeary, now he can sleep at night.
While most of the data curators and data repositories belong to universities, a recent addition was the Michael J. Fox Foundation. Johnston was initially surprised that a funder of research would also serve as a data repository. “But they just want researchers to cure Parkinson’s, and part of that is hosting the data” that they want researchers to re-use, she says.
Part of the vision for the DCN, as well as its grant support, called for developing a sustainability plan for the network. At first, a fee-for-service model was considered, but that was not what the data curators as a community wanted. They prefer to help each other, Johnston says, and they have grown to trust each other.
Recent step toward sustainability
Because of the data curators’ preferences, the move to sustainability involved developing a membership approach to address shared costs, such as the salary of the data curation network coordinator.
As of April 2021, DCN has three tiers of membership:
- DCN Sustainer Institutions contribute curator staff to the network and have full participation in governance. Membership costs $10,000 and includes up to 200 hours of curation.
- DCN Member Institutions are just beginning data curation and receive support and consultation from the network. This tier is in the testing stage — participants have been invited — and a price has not been set for it.
- DCN Ambassador Institutions invite three DCN instructors to teach a two-day workshop for local and/or regional staff for $2,500 plus travel.
So far the DCN has grown without any advertising.
“We want to expand the diversity of institutions that are part of the Data Curation Network,” McGeary says. “If there are institutions that are just getting started or considering getting started, we can help train, we can help do instruction, we can help advocate at the administrative level.”
The network’s fiscal home is the University of Minnesota, and it benefits from the U’s in-kind support. “Everybody is thrilled we have the stability of the U of M,” Johnston says.
Surveys of data curators showed that no one wanted the DCN to be “just another professional society,” she adds. “We are very applied; we really get our hands dirty.” Annual get-togethers — All-Hands Meetings — introduce participants, demonstrate case studies, share primers and research, and invite guests, such as this year’s Portage CEG, a similar data curator network from Canada.
Building a sustainable model has not been easy, Johnston says. “Everybody knows how to run a grant; few people know how to run a sustainable organization. But we’re doing that and I’m proud of it.”