Problems Of Data Retrieval From Entomological Databases
Eckhard K. Groll and Andreas Taeger
Deutsches Entomologisches Institut, Schicklerstrasse
5, D-16225 Eberswalde, Germany
One of the most important tasks for the DEI, now and
in the future, is the recording of entomological data.
This includes entomological literature, names of taxa,
descriptive data and systematic placement, names and
biographical data of entomologists, faunistic records,
and biological and other notes.
In this presentation, we discuss one problem associated
with the use of computerised databases for this task:
how to keep the data in a state that allows all the
information stored in the databases to be retrieved.
In our opinion, the main retrieval problems arise
from:
1. different spellings (names) for the same objects
(synonyms)
2. change of the sense of names
3. same name for different objects (homonyms)
Modelling the data structure and normalizing the data,
both necessary for designing and programming databases,
do not reduce these problems. On the contrary, much
additional data is needed to make complete retrieval
possible.
1. Different spellings (names) for the same objects
(synonyms)
Example: Names of entomologists extracted from labels,
literature and other databases
Lepeletier, St. Farg., Le Peletier, Le Peletier de Saint
Fargeau, Lep., Le Pelletier, Saint Fargeau, and so
on.
Lepeletier (the spelling used here by convention) himself
used several spellings. From a taxonomist's point of
view, all the spellings mentioned above must be considered
valid: they are objective synonyms. Misspellings also
occur, and vernacular spellings cause similar problems.
The solution is an additional piece of information (a
code) that links all the spellings: the same code is
used for every variant.
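The shared-code idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the code value "ent-0001", the function names and the whitespace/case normalisation are hypothetical and are not taken from the DEI databases.

```python
# Every spelling variant of one entomologist maps to the same code.
# "ent-0001" is a hypothetical code; the source gives none for Lepeletier.
SPELLING_TO_CODE = {
    "Lepeletier": "ent-0001",
    "St. Farg.": "ent-0001",
    "Le Peletier": "ent-0001",
    "Le Peletier de Saint Fargeau": "ent-0001",
    "Lep.": "ent-0001",
    "Le Pelletier": "ent-0001",
    "Saint Fargeau": "ent-0001",
}


def normalise(spelling):
    """Collapse runs of whitespace and ignore case (an assumed convention)."""
    return " ".join(spelling.split()).casefold()


# Lookup index keyed by the normalised spelling.
INDEX = {normalise(s): code for s, code in SPELLING_TO_CODE.items()}


def code_for(spelling):
    """Return the shared code for a spelling, or None if it is unknown."""
    return INDEX.get(normalise(spelling))


def all_spellings(code):
    """Return every recorded spelling that carries the given code."""
    return sorted(s for s, c in SPELLING_TO_CODE.items() if c == code)


def find_records(records, spelling):
    """Return the records whose author shares a code with the query spelling."""
    code = code_for(spelling)
    if code is None:
        return []
    return [r for r in records if code_for(r["author"]) == code]
```

A query for "Lep." then also returns records entered under "Le Peletier de Saint Fargeau", because the comparison is made on the code and not on the string.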
2. Change of the sense of names
Objects sometimes change. Unfortunately, their names
often do not reflect the change in sense. Consequently,
the use of the names becomes ambiguous. Databases should
therefore be able to store the sense in addition to
the name.
Solution: Geographical names in their historical context
Name         Explanation of the object                        Code
Deutschland  until 1945                                       e5z
Deutschland  1945-1949                                        q3wr
Deutschland  1949-1990 Bundesrepublik Deutschland             p9c5
             (Western Germany)
Deutschland  since 1990 Bundesrepublik Deutschland            q3wr
             (Western and Eastern Germany)
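A minimal sketch of how such a table could be queried by year follows. The names and codes are copied from the table as given; the `Sense` type, the `sense_for` function and the treatment of boundary years (each period is taken to start in its first year and to end just before the first year of the next period) are assumptions made for the illustration.

```python
from typing import NamedTuple, Optional


class Sense(NamedTuple):
    name: str
    start: Optional[int]  # first year covered; None means open-ended
    end: Optional[int]    # first year no longer covered; None means open-ended
    explanation: str
    code: str


# Rows copied from the table; the half-open year ranges are an assumption.
SENSES = [
    Sense("Deutschland", None, 1945, "until 1945", "e5z"),
    Sense("Deutschland", 1945, 1949, "1945-1949", "q3wr"),
    Sense("Deutschland", 1949, 1990,
          "1949-1990 Bundesrepublik Deutschland (Western Germany)", "p9c5"),
    Sense("Deutschland", 1990, None,
          "since 1990 Bundesrepublik Deutschland "
          "(Western and Eastern Germany)", "q3wr"),
]


def sense_for(name, year):
    """Return the sense a name had in a given year, or None if not recorded."""
    for sense in SENSES:
        after_start = sense.start is None or year >= sense.start
        before_end = sense.end is None or year < sense.end
        if sense.name == name and after_start and before_end:
            return sense
    return None
```

A label reading "Deutschland" on a specimen collected in 1960 would thus be stored with the code p9c5, so a later query can distinguish it from a label with the same wording from 1930.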
3. Same name for different objects (homonyms)
Example: The same name for more than one object
Name         Explanation of the name  Code
Brandenburg  town                     ap3
Brandenburg  district                 9h8
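The homonym case can be sketched in the same way: one name leads to several candidate codes, and the explanation selects among them. The names, explanations and codes are those of the table; the data layout and the functions `candidates` and `resolve` are hypothetical.

```python
# One name, several objects: each candidate is an (explanation, code) pair.
HOMONYMS = {
    "Brandenburg": [("town", "ap3"), ("district", "9h8")],
}


def candidates(name):
    """Return every (explanation, code) pair recorded for a name."""
    return HOMONYMS.get(name, [])


def resolve(name, explanation):
    """Return the single code matching name and explanation.

    Raises LookupError when the pair is unknown or still ambiguous,
    so that an ambiguous name is never stored silently.
    """
    matches = [code for expl, code in candidates(name) if expl == explanation]
    if len(matches) != 1:
        raise LookupError(f"cannot resolve {name!r} as {explanation!r}")
    return matches[0]
```

At data entry, `candidates("Brandenburg")` can be shown to the person entering the record, who picks the intended object; retrieval by code then returns only the town or only the district.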
The examples show some of the difficulties that arise
between entering data into a database and retrieving
them from it. We propose to solve these problems through
the following steps:
II. semantic bridges should be created;
III. these bridges should be created and maintained
by specialists;
IV. these bridges should be available for other users
in a network, perhaps the Internet.