Environmental Data Management
Data have become a critical issue for contemporary science…
In recent years, data-driven approaches have become an increasingly dominant feature of the scientific landscape. They are highly collaborative, observation-driven and data-heavy, they demand substantial computing power, and they depend on scientists having access to well-managed data.
The movement towards data-intensive science is of considerable significance, such that some commentators are mooting it as the next great paradigm in scientific history.
…and data bodies represent significant latent value.
New Zealand’s vast and varied data assets represent significant taxpayer and ratepayer investment, going back many decades.
As well as the valuable information they hold, these data bodies represent thousands of hours of expert time in collection, and in many cases ongoing effort to curate them.
Data holdings are growing exponentially; since the “digital revolution”, commonplace equipment can now produce and hold orders of magnitude more data than was previously possible.
Mobilising them for reuse unlocks this value…
Re-using existing data, often for purposes and in ways the original collectors never envisaged, is a source of potentially immense benefit to society.
Internationally, governments’ and scientific communities’ recognition of this is driving the burgeoning “open access” movement in scientific practice and science policy (see links for more).
Federated (distributed) data infrastructures mean data can remain with their holding agencies and be accessed remotely, and the construction of large data ‘warehouses’ is often unnecessary.
…and also starts exercising new e-science infrastructure.
High-powered e-science infrastructure (such as grid computing and the KAREN) comes into its own when it is being used to capacity. This enables scientists to undertake powerful new research and analysis, using e-research infrastructure to process large volumes of mobilised data through e-science tools such as multi-layer geospatial visualisations and sophisticated predictive models.
Mobilising data requires some changes from the status quo in science systems.
Internationally, there is strong consensus that tapping into the benefits requires subtle but pervasive shifts in scientific practice, science policy and science funding. There is a variety of barriers to making these changes.
The environmental RS&T sector holds particular promise…
Data-heavy science, using large multivariate models, is the only way for contemporary society to get the information it needs to tackle major challenges. Many of the most critical are environmental issues, such as climate change, sustainable resource management and natural hazards.
The environment research community is leading the way in many aspects of e-science, with nodes of excellence in areas such as the development of infrastructure for collaborative e-research and data-sharing, and in internationally-federated data infrastructures.
Some of New Zealand’s most significant publicly-funded datasets are in the environmental RS&T domain - from datasets created by individual research projects and local government monitoring, through to the Nationally Significant Databases.
…so as part of its push for re-use of scientific data, MoRST has developed a set of Principles for Managing Publicly-Funded Environmental Data.
The principles are here. They signpost the direction of the Government’s intention for publicly-funded environmental research outputs.
The principles are designed to guide organisations towards this vision:
In 2015, open access to environmental research data from public funding is easy, timely, user-friendly and preferably web-based.
This year MoRST is working in all spheres of the environment RS&T domain to encourage pursuit of these principles:
- We are working with the Foundation for Research, Science & Technology and other funding agencies to encourage the RS&T sector to progress towards open access and the requisite data management practices
- We are supporting other government agencies’ work on open government information in the State Services Commission, Department of Internal Affairs, the National Library and Archives New Zealand
- We are supporting agencies’ opening up of their own information, such as Ministry for the Environment, Ministry of Fisheries, local government, the Department of Conservation, Land Information New Zealand (LINZ) and the Geospatial Office
- We will be celebrating leaders and encouraging progress alongside individual research organisations and across the sector, such as the New Zealand Organisms Register, Landcare Research and NIWA
- We are engaging with the private and NGO sectors in the open access to government information movement
Putting principles into practice
The principles are designed to signpost the direction of the Government’s intention for environmental research outputs. In practice, mobilising New Zealand’s data bodies for re-use in scientific activity involves:
- standardised data collection, storage and metadata (information about data), and "translators" between obsolete and modern standards
- using accepted protocols so that systems holding and disseminating data can interoperate
- "open access" as the business-as-usual paradigm in science systems
- science funding that prioritises data management as an integral part of data collection
- science policy that encourages sharing and re-use, and promotes an open access approach.
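As an illustration of the first point in the list above, a "translator" between an obsolete and a modern metadata standard can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration: the legacy field names, the standard field names, the DD/MM/YYYY legacy date format and the function name translate_legacy are hypothetical and are not taken from any MoRST or agency standard.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical mapping from legacy field names to a shared, modern vocabulary.
LEGACY_TO_STANDARD = {
    "stn_name": "site_name",
    "lat": "latitude",
    "long": "longitude",
    "obs_dt": "observed_on",
    "param": "variable",
}


@dataclass(frozen=True)
class MetadataRecord:
    """A minimal standardised metadata record (illustrative fields only)."""
    site_name: str
    latitude: float
    longitude: float
    observed_on: str  # ISO 8601 date, e.g. "1998-07-03"
    variable: str


def translate_legacy(record: dict) -> MetadataRecord:
    """Translate one legacy metadata record into the standardised form."""
    # Rename legacy keys; keys not in the mapping are kept under their own name.
    renamed = {LEGACY_TO_STANDARD.get(key, key): value for key, value in record.items()}
    # Assumption: legacy dates are written day/month/year.
    observed = datetime.strptime(renamed["observed_on"], "%d/%m/%Y").date()
    return MetadataRecord(
        site_name=renamed["site_name"].strip(),
        latitude=float(renamed["latitude"]),
        longitude=float(renamed["longitude"]),
        observed_on=observed.isoformat(),
        variable=renamed["variable"].strip().lower(),
    )
```

Once records from different holdings share one structure like this, they can be searched and combined by the same tools, which is what makes re-use across agencies practical.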
MoRST has an established work-stream dedicated to environmental data access and management, and has commissioned key research to investigate practical ways of encouraging best practice.
The 2007 Environment Data 2.0 report presents a picture of the environmental data ‘landscape’ in New Zealand. It confirms the existence of immense public benefit in freeing up and reusing environmental data, and enabling the environmental RS&T community to harness eResearch.
For more information on environmental data management, contact Isabella Cawthorn on (04) 917 3066.