Ecology is a young and evolving science. Historically, approaches to ecological data collection have been individualistic in nature (i.e. focused only on the specific question being addressed), undertaken at small spatial scales over short time periods and at relatively low cost.
More recently, there has been growing awareness of the need for more multi-disciplinary approaches, in recognition of some of the new problems being faced (e.g. impacts of climate change on biodiversity). To address these more integrated issues, much larger volumes of data are required over broad spatial scales and long time periods.
The sheer breadth of ecological diversity and complexity means there are many different aspects of ecosystems to understand, and hence a myriad of different observations that can be recorded. Observations can range from those of individual organisms and their interactions, through populations, communities and ecosystems, to broad global landscapes. Over time, the focus of ecological research changes in response to current paradigms, problems and technologies, which in turn affects the kinds of ecological observations recorded. This diversity of observations poses a major challenge for data management systems.
Fragmentation of methods and classification
There is often very little consistency between datasets in terms of methods and approaches, classification systems, data storage structures, and information expression (e.g. measurement units, datums and other values). This inconsistency is termed ‘fragmentation’. In addition, specific storage structures can often make adding new data to an existing dataset difficult, leading to further fragmentation (e.g. different parts of the same dataset held in a corporate system and a standalone database). At a national level, this fragmentation makes federating and integrating different datasets extremely challenging as they don’t ‘match up’, a problem akin to the different rail gauges originally used from State to State across Australia. In isolation, the individual specifications were unimportant; it was only when the system was to be integrated nationally that problems arose.
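As a minimal illustration of the 'information expression' problem described above, the sketch below harmonises one measurement recorded in different units by two datasets. The record layout, field names, site codes and values are all hypothetical and are not drawn from any real State or Territory system; real integration would also need to reconcile methods, classifications and datums.

```python
# Hypothetical conversion table: factors that convert a length unit to metres.
UNIT_TO_METRES = {"m": 1.0, "cm": 0.01, "ft": 0.3048}


def to_metres(value, unit):
    """Convert a length to metres, failing loudly on an unrecognised unit."""
    if unit not in UNIT_TO_METRES:
        raise ValueError(f"unknown unit: {unit!r}")
    return value * UNIT_TO_METRES[unit]


def harmonise(records):
    """Return copies of the records with height expressed in metres."""
    return [
        {"site": r["site"], "height_m": to_metres(r["height"], r["unit"])}
        for r in records
    ]


# Two invented datasets recording tree height in different units.
dataset_a = [{"site": "A1", "height": 12.5, "unit": "m"}]
dataset_b = [{"site": "B7", "height": 1250, "unit": "cm"}]

# Once units agree, the records can be combined and compared directly.
combined = harmonise(dataset_a) + harmonise(dataset_b)
```

Raising an error on an unknown unit, rather than guessing, keeps silent mismatches out of the combined dataset.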
Comprehension and Re-use
During data collection, implicit and contextual information is rarely stored with the data. Details that are obvious at the time of collection, or that do not directly support the observation, are generally not recorded. For other researchers to interpret datasets accurately (specifically in terms of their fitness for repurposing), characterisation of the data content and context in terms of scope, coverage, sampling, accuracy and other associated factors is critical. How to locate and integrate this contextual information to ensure optimum data comprehension is a major challenge.
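One way to picture this is to store the context alongside each observation and to check which contextual details are absent before the data are reused. The following is a sketch only: the class names, field names and example values are hypothetical, chosen to mirror the factors listed above (scope, coverage, sampling, accuracy), and do not represent any particular system's data model.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class Context:
    """Contextual details that are often left unrecorded (all optional)."""
    scope: Optional[str] = None
    coverage: Optional[str] = None
    sampling: Optional[str] = None
    accuracy: Optional[str] = None


@dataclass
class Observation:
    """A single recorded value together with its context."""
    subject: str
    value: float
    unit: str
    context: Context = field(default_factory=Context)


def missing_context(obs: Observation) -> List[str]:
    """List the context fields that were never filled in."""
    return [f.name for f in fields(obs.context)
            if getattr(obs.context, f.name) is None]


# Invented example: sampling and scope were noted, the rest was not.
obs = Observation(
    subject="tree height",
    value=12.5,
    unit="m",
    context=Context(scope="single survey plot", sampling="all stems measured"),
)
gaps = missing_context(obs)
```

A later user could inspect `gaps` to judge whether the observation is fit for a new purpose, or whether the missing context must first be recovered from the original collectors.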
Australia has a huge wealth of ecological knowledge, collected over many decades by dedicated researchers and managers. A major obstacle to routine synthesis of these data is the current dispersed nature of data storage at a national level. Data are typically stored independently in each State and Territory across a broad range of different data management systems, each with a specific thematic focus (e.g., pastoral and biological survey databases). These systems are rarely interoperable and are often combined with other offline or otherwise inaccessible storage systems. Bringing these disparate data collections together in a single integrated system presents arguably the biggest challenge, and hence the greatest opportunity to improve ecological research and policy capacity in Australia.
The Current Data Management Problem
These key data issues create a problem that is difficult to resolve with traditional data management approaches.
A new flexible approach is therefore needed to address method drift over time, additional data, modified scope, and changes in organisational intent and purpose.
When we talk about ecological data, we need to recognise that it is not just the data themselves that are important. The data are context-dependent: significant additional detail is required to represent them correctly, so that data users can draw valid conclusions about their appropriate reuse.
One of the key differences between ÆKOS and other data publication platforms is that the data are put in context using ontological, description and indexing models, so that the related information needed for appropriate interpretation is supplied to data users.