MEDITERRANEAN TARGETED PROJECT ASSISTANCE IN DATA MANAGEMENT
Racapé1, J. F., Pfeiffer2, K., Stanislas3, G., Berle4, H. K.,
Ruester5, A., Philipovitch1, T., Hellmann2, B. & Nodar3, D.
1 CETIIS, BP 33000, F-13791 Aix-en-Provence Cédex 3, France
2 HYDROMOD, Scientific Consulting, Bahnhofstr. 52, D-22880 Wedel, Germany
3 OCI Services, 3 Ave de l'Europe, F-31000 Toulouse, France
4 CRI/AS, Bregnerødvej 144, P.O. Box 173, DK-3460 Birkerød, Denmark
5 Freie Universität Berlin, Institut für Geologie, Geophysik und
Geoinformatik, Malteserstr. 74 - 100 (Haus D), D-1000 Berlin 46, Germany
Introduction
The MADAM Project is a MAST supporting initiative to
assist the Mediterranean Targeted Project (MTP) in
data management. Its purpose is to implement or test
solutions for managing the very large amount of data
collected during the MTP projects.
In that context, the MADAM project aims to develop (or
test) in parallel two ways of managing this kind of
data set:
*the centralised way, by gathering all the available data
in a single data set and publishing it at the end of
the project on a single medium,
*the distributed way, by testing a network of distributed
servers based on current WWW tools.
The objectives of the project are outlined in Fig. 1.
Three main objectives can be identified:
* To compile and produce an MTP Dataset. As far as
possible, this data set will gather the data collected
during the ongoing MTP. Publication of the MTP Dataset
on a CD-ROM is scheduled for the end of the project
(i.e. the end of 1996).
* To set up and run a network. This network
provides links between the MTP partners and a Focal
Point located at CETIIS, Aix-en-Provence. The collected
data are gathered at this Focal Point, where a MADAM
WWW server is installed and provides information on
the ongoing project.
* To carry out demonstrations of future networking options
for data management, in particular with Internet tools.
Description of the data collected during the MTP
The first task of MADAM was a survey of each project
to collect the information needed for the future adaptation,
modification and, if required, establishment of procedures,
guidelines and standards for receiving and merging the MTP
data into the joint data set. The survey was performed
by direct interview with each project co-ordinator.
A questionnaire was prepared to help the MADAM
interviewers collect the relevant information.
This questionnaire was sent to each MTP co-ordinator
before the interviews. A survey report was
written from information collected from diverse sources:
*filled-in questionnaires,
*interviews with co-ordinators,
*technical annexes of the projects,
*ROSCOP forms of MTP cruises,
*some annual reports.
The great diversity of the MTP projects in terms of
area of interest (Fig. 1), scientific domain, methodology,
scales of interest, etc., leads to a great diversity
of data collected during the MTP. Furthermore,
the MTP is very "data productive": a large
amount of data has already been collected during the
project, and these data are highly diverse in
terms of:
*scientific domain: hydrology, currents, biology, sedimentology,
biogeochemistry,
*type of data: vertical profiles, time series, discrete
data, underway measurements,
*formats.
Fig. 2 and Table 1 give an overview of the location
of the data collected during the MTP. The data have been
gathered into four main groups, defined as
follows:
*HYD: Hydrological data collected with CTD and bottles
(including basic biogeochemical data such as nutrients,
POC, PON, DOC, DON, Chlorophyll, etc.)
*CUR: Current data obtained either from moored current
meters (classical [...]
*SED: Sediment data obtained by sediment traps, sediment
cores, box cores, etc.
*BIO: Pelagic and benthic macro and micro-biological
data
Some cruises served several MTP projects. These
cruises have been included only once in Table 1. The
survey of CINCS, PALEOFLUX and PELAGOS is not yet
complete. The number of specific cruises of these
projects is estimated from the respective Technical
Annexes.
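
The counting rule described above (a cruise shared by several projects
is included only once) can be illustrated with a minimal Python sketch.
The record layout, the function name and the project and cruise
identifiers are hypothetical and are not taken from Table 1; only the
four group codes come from the text.

```python
from collections import defaultdict
from enum import Enum


class DataGroup(Enum):
    """The four data groups named in the text."""
    HYD = "hydrological data"
    CUR = "current data"
    SED = "sediment data"
    BIO = "biological data"


def cruises_per_group(records):
    """Count distinct cruises per data group.

    `records` is an iterable of (cruise_id, project, group) tuples.
    A cruise shared by several projects appears in several records
    but is counted only once per group.
    """
    seen = defaultdict(set)
    for cruise_id, _project, group in records:
        seen[group].add(cruise_id)
    return {group: len(ids) for group, ids in seen.items()}


# Hypothetical records: cruise "C1" served two projects.
records = [
    ("C1", "PROJECT_A", DataGroup.HYD),
    ("C1", "PROJECT_B", DataGroup.HYD),
    ("C2", "PROJECT_B", DataGroup.SED),
]
counts = cruises_per_group(records)
```

Using a set of cruise identifiers per group makes the "only once" rule
independent of the order in which the projects were surveyed.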
The MTP Dataset
One of MADAM's main activities and goals is the compilation
of a unified dataset which includes most of the data
collected by the research cruises and field measurements
conducted within the MTP. This activity will be carried out
as far as technically possible and as far as input
is provided by the scientific teams of the MTP in
due time while the project is running. To a limited
extent this dataset will include supporting information,
for instance about projects, institutions, scientists,
cruises, research vessels, or instruments used.
This so-called MTP Dataset will include a library of
tools supporting its customisation and user-friendly
operation. The dataset will be operated via a geographically
based graphical user shell controlling access to the data and
information. Users can select, filter, browse,
or display data according to their own research needs and
requirements. However, data
analysis within the shell will be limited to a standard, feasible
level that keeps operation simple. For more
sophisticated evaluations, authorised users will
be able to extract data in a standard
file format for further evaluation and processing with
the software packages they are used to.
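
As an illustration of the select, filter and extract steps described
above, the following Python sketch selects stations inside a
latitude/longitude box and exports them as text. It is not the MADAM
shell itself: the `Station` fields, the function names and the choice of
CSV as the "standard file format" are assumptions made for this example,
and the station values are invented.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Station:
    station_id: str
    lat: float   # degrees north
    lon: float   # degrees east
    group: str   # e.g. "HYD", "CUR", "SED", "BIO"


def select(stations, south, north, west, east, group=None):
    """Return stations inside a lat/lon box, optionally of one group."""
    return [
        s for s in stations
        if south <= s.lat <= north and west <= s.lon <= east
        and (group is None or s.group == group)
    ]


def export_csv(stations):
    """Serialise a selection as CSV text for use in external software."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["station_id", "lat", "lon", "group"])
    for s in stations:
        writer.writerow([s.station_id, s.lat, s.lon, s.group])
    return buf.getvalue()


# Invented example stations.
stations = [
    Station("S1", 36.0, 25.0, "HYD"),
    Station("S2", 42.5, 5.0, "HYD"),
    Station("S3", 42.0, 4.5, "SED"),
]
subset = select(stations, south=41.0, north=44.0, west=3.0, east=7.0,
                group="HYD")
text = export_csv(subset)
```

The simple box test assumes that west is smaller than east, which holds
for any region within the Mediterranean.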
According to the present plans and schedules the MTP
Dataset will include the following data acquired within
the MTP:
*Vertical profiles as acquired by CTD and other multi-parameter
sondes, XBTs, XCTDs, rosette samplers, or other bottle
samplings as well as single probes (i.e. treated as
vertical profiles with one set of values) or bottom
samples (treated as sub-bottom profiles or single samples)
in processed or analysed and quality controlled form;
*time series as acquired by moored instruments in processed
or analysed and quality controlled form;
*track-related data as acquired by automatic ship- or
drifter-borne devices (e.g. navigational logs, meteorological
logs, T-S logs, ADCP) in processed or analysed and
quality controlled form;
*data header information as described for the MODB/MEDATLAS
headers and as far as necessary supplemented by additional
comments included in the respective header section;
*supporting information (text, references, figures)
as far as available and possible to integrate with
reasonable effort, and as far as it does not exceed
the storage limits of the CD-ROM or the access capabilities
of the PC software tools.
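
The convention mentioned in the list above, that a single probe is
treated as a vertical profile with one set of values, can be sketched
in Python as follows. The class and field names are hypothetical and
much simpler than a real MODB/MEDATLAS header; they only illustrate how
one record type can serve both cases.

```python
from dataclasses import dataclass, field


@dataclass
class Header:
    """Simplified stand-in for a data header (not the MEDATLAS layout)."""
    cruise: str
    lat: float
    lon: float
    comments: list = field(default_factory=list)


@dataclass
class VerticalProfile:
    header: Header
    parameters: tuple   # e.g. ("depth_m", "temp_degC")
    levels: list        # one tuple of values per depth level


def profile_from_single_probe(header, parameters, values):
    """Wrap a single-point measurement as a profile with one level."""
    if len(values) != len(parameters):
        raise ValueError("one value is required per parameter")
    return VerticalProfile(header, tuple(parameters), [tuple(values)])


# Invented example values.
hdr = Header(cruise="HYPOTHETICAL-01", lat=40.0, lon=6.0)
probe = profile_from_single_probe(hdr, ["depth_m", "temp_degC"],
                                  [10.0, 18.2])
```

With this convention, tools written for full profiles can read single
probes without a separate code path.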
The bulk of the data will be standard oceanographic
data. Biogeochemical data are included to the extent
described in the Technical Annex of the MTP. So
far, no laboratory data or results of numerical modelling
are included. Some very specialised data (e.g. those
derived from tracer oceanography or ocean tomography)
will also not be included in the dataset, as they either
require special interpretation techniques and knowledge
or will not be available within a reasonable period of
time.
The quality control of the data is strictly the
responsibility of the scientific teams who have acquired,
processed, or delivered the data, but some quality information
is included in the data itself.
The dataset design will be open, so that
it can easily be upgraded with additional tools
if users and customers so require. The dataset
will also be open to the incorporation of supplementary
data and information in a manner similar to that used
within the MADAM project. We consider this very
important, as it is already foreseeable that not all the
data collected within the MTP, and the related
information, can be made available in the short period
between the envisaged close-out or
data delivery dates of most of the MTP projects
and the end of the MADAM project.
Special emphasis is placed on defining convenient and,
as far as possible, standardised data formats for the
large variety of multi-disciplinary data, which will
be delivered in heterogeneous formats by the scientific
teams. For vertical profiles and cruise-describing
data the MADAM team has already decided to use
the MEDATLAS format as the standard for the MTP
Dataset. This ensures compatibility with many other activities
within the MTP II and hopefully also within the MAST
III Programme. Other data envisaged for incorporation
in this dataset are time series and ship-track
related data. For these we have applied and modified
existing data formatting concepts to stay "as close
as possible" to the MODB format. This could prove
an interesting and efficient contribution for a
large variety of applications, as it deals with commonly
known problems of handling and managing oceanographic
and, more generally, geographically referenced
data arising from different measurement and survey
concepts and processed in different institutions.
After completion of the adaptation and data merging
activities, the MTP Dataset will be published in a limited
number of copies on CD-ROM, ready to run on a
state-of-the-art personal computer under the MS Windows operating
system. We expect the MTP Dataset to become a unified,
merged, and comprehensive collection of data and information
acquired and processed within the MTP for "everybody's
use and benefit". Besides the value of such a
dataset for future research and science, we trust that
it will illustrate how the results
of research activities of many disciplines, institutions,
and scientific teams from all over Europe can finally be
brought together, integrated, and disseminated.
However, the success of this task depends on the input
and support from the scientific teams and the National
Oceanographic Data Centres responsible for the delivery and
banking of data acquired by individual MTP sub-projects.
We are confident that the corresponding MADAM activities
can be optimised and closely combined with the data delivery
procedures, and that we can thereby assist
the scientific teams and reduce their related
effort and workload.
MADAM WWW Presentation for MTP
During the course of the project, the MADAM Consortium
will collect a lot of information on scientists,
projects, data sets, etc. All this information will benefit
MTP members if it is made available. One of the purposes
of the MADAM WWW Demonstrator is to show how the
World Wide Web and Internet technologies can help
scientists to be better informed, to contribute more
immediately to the MTP effort, and to benefit almost in
"real time" from the contributions of other members.
Why these tools? The Internet was selected because it is
already widespread among scientific teams and enables
them to communicate electronically. The World Wide
Web was selected for its simplicity of use: this
technology efficiently hides the apparent complexity
of the Internet, making a world-wide electronic journey as
easy as reading a book.
The main objectives of the Demonstrator are:
*To make all the available information about the MTP accessible
from a WWW browser (such as Mosaic or Netscape). Thanks
to this interface, users can connect from anywhere
and follow the evolving information easily and
much more efficiently than in a book or newspaper.
*To improve communication between MTP projects through
the Internet, by helping scientific teams to be present
on the World Wide Web and to communicate with each other
using Internet-related tools, and by giving teams that
want to set up their own World Wide Web server
the tools and knowledge needed to manage it as
easily as possible.
*To help identify and collect data sets and keep
other members quickly informed. The WWW will become
the fastest way to share information within a dispersed
group of scientists such as the MTP.
*To provide on-line access to the evolving MTP Dataset
with all the necessary security. Distribution of
the data or documents must be controlled by the scientists:
they will decide to whom and when the data will be delivered.
*To try to make the MTP Dataset available simultaneously
on the Net and on a CD-ROM in a new, complementary
way, taking advantage of both the high capacity
of a CD-ROM and the regular updating of an on-line
archive.
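
The requirement that scientists decide to whom and when their data are
delivered could be expressed, for instance, as a per-dataset release
policy. The following Python sketch is only an illustration under that
assumption; the class, the field names, the user names and the dates are
hypothetical and do not describe the security mechanism of the MADAM
server.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ReleasePolicy:
    """Policy set by the data owner for one dataset."""
    release_date: date                               # open to all from this day
    early_access: set = field(default_factory=set)   # users allowed earlier


def may_download(policy, user, today):
    """Return True if `user` may receive the data on `today`."""
    return today >= policy.release_date or user in policy.early_access


# Hypothetical policy: one named partner has access before release.
policy = ReleasePolicy(release_date=date(1997, 1, 1),
                       early_access={"partner_a"})
```

Here the "to whom" part is the `early_access` set and the "when" part is
the `release_date`; both remain under the control of the data owner.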
To conclude, here is a short passage from a book
available on the Internet: "What is the Internet
World Wide Web?
From the user point of view, the World Wide Web is information,
a great tangled web of information. The user doesn't
care anything (well, almost anything) about where the
information is stored, about how it's stored, or about
how it gets to his screen - he just says 'Oh, that
looks interesting', clicks the mouse, and (after a
short time), the information arrives..." (Jon
Crowcroft)
For this to be true, however, someone, somewhere must
make this information available to others. The Internet and
the Web will then help users to easily retrieve the large amount
of information spread over many places.
Acknowledgements
This work is undertaken in the framework of a MAST supporting
initiative for ocean data and information management
under contract MAS2-CT94-0097. The participants gratefully
acknowledge all the scientists involved in the MTP,
without whose helpful collaboration this project
would not be achievable.