The IBUKI Named Entity Hierarchy
IBUKI uses IBML as the basis of its Named Entity Hierarchy. IBML provides the
following features.
IBML types sometimes behave like schemas
Hierarchical organization with multiple inheritance
multiple inheritance means that a thing may have
more than one type, and those types do not need to be comparable
e.g., ass names a kind of animal
ass names a body part
multiple inheritance also means that the types
a 3 toed 2 legged animal
a 2 legged 3 toed animal
are the same, i.e., the order in which the attributes of a type are specified
doesn't matter. This is not true of single inheritance systems,
e.g., Java classes.
abstract objects/types are available
My apple is an example of a Pippen apple
Pippen apple is a subtype of apple
apple is a subtype of fruit
IBML deals with terms not words
terms are case sensitive and may incorporate spaces and punctuation
e.g., |Pippen apple| not 'pippen_apple'
|Schrodinger's cat| not ???
A set of basic entity types comes predefined for terms
Entity sets are easily extended by adding new definitions
with easy-to-use tools for creating and editing entity sets
Extensive database of nameable objects and their properties
'JFK' names a person usually called 'John F. Kennedy'
The data entry for this person records
that he was the 35th president of the United States
terms may have many different senses.
This and other lexical information is associated with terms
software tools for language processing
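
The type features listed above can be sketched in Python. This is only an illustration, not IBML syntax and not IBUKI's implementation: the names TypeHierarchy, define, is_subtype, and type_from_attributes are hypothetical, and the attribute strings are invented. The sketch shows a term with two incomparable parent types, a transitive subtype check, and attribute sets whose order of specification does not matter.

```python
from typing import Dict, FrozenSet, Iterable, Set


class TypeHierarchy:
    """Toy type hierarchy with multiple inheritance (hypothetical, not IBML)."""

    def __init__(self) -> None:
        # Maps each term to the set of its direct supertypes.
        self._parents: Dict[str, Set[str]] = {}

    def define(self, term: str, parents: Iterable[str] = ()) -> None:
        """Register a term, also registering any parents not yet known."""
        parent_set = set(parents)
        for parent in parent_set:
            self._parents.setdefault(parent, set())
        self._parents.setdefault(term, set()).update(parent_set)

    def supertypes(self, term: str) -> Set[str]:
        """Return every direct or indirect supertype of a term."""
        seen: Set[str] = set()
        stack = list(self._parents.get(term, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self._parents.get(current, ()))
        return seen

    def is_subtype(self, term: str, ancestor: str) -> bool:
        """A term is a subtype of itself and of all its supertypes."""
        return term == ancestor or ancestor in self.supertypes(term)


def type_from_attributes(*attributes: str) -> FrozenSet[str]:
    """Model a type as an unordered set of attributes."""
    return frozenset(attributes)


hierarchy = TypeHierarchy()
hierarchy.define("fruit")
hierarchy.define("apple", ["fruit"])
hierarchy.define("Pippen apple", ["apple"])
# One term with two parent types that are not comparable with each other.
hierarchy.define("ass", ["animal", "body part"])

print(hierarchy.is_subtype("Pippen apple", "fruit"))   # True
print(hierarchy.is_subtype("animal", "body part"))     # False
print(type_from_attributes("3 toed", "2 legged")
      == type_from_attributes("2 legged", "3 toed"))   # True
```

Terms are used directly as case-sensitive dictionary keys, so a term such as "Pippen apple" keeps its space and capitalization.
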
The hierarchy is a broad generalization of the more traditional Named Entity
sets (e.g., the first set, defined during MUC (Grishman et al., 1996); the set
developed by IREX (Sekine et al., 2000); and the Extended Named Entity
hierarchy (Sekine et al., 2002)).
The usual applications for Named Entity recognition include Question
Answering (Q&A), Information Extraction (IE), Machine Translation (MT),
Summarization, Information Retrieval (IR), and general search.
Previous Named Entity systems have rigid type hierarchies determined by the
designer. The IBUKI Named Entity Hierarchy is described as a collection of
IBML types, which can be replaced by other definitions if the provided ones
seem poorly chosen. The software that actually performs NER simply takes this
collection as a parameter.
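
As a rough illustration of passing the type collection as a parameter, the Python sketch below uses a hypothetical recognize function that does naive term lookup. The function, the dictionary layout, and the sample types ("person", "airport", "country") are assumptions made for this example; they do not reflect IBUKI's actual software or IBML syntax.

```python
from typing import Dict, List, Tuple

# A type collection maps terms (case sensitive, may contain spaces) to type names.
TypeCollection = Dict[str, str]


def recognize(text: str, types: TypeCollection) -> List[Tuple[str, str]]:
    """Return (term, type) pairs for every known term found in the text.

    Longer terms are tried first, and characters already claimed by a
    match cannot be reused by a later, shorter term.
    """
    found: List[Tuple[int, str, str]] = []
    covered = [False] * len(text)
    for term in sorted(types, key=len, reverse=True):
        start = text.find(term)
        while start != -1:
            end = start + len(term)
            if not any(covered[start:end]):
                found.append((start, term, types[term]))
                for i in range(start, end):
                    covered[i] = True
            start = text.find(term, start + 1)
    # Report matches in the order they appear in the text.
    return [(term, type_name) for _, term, type_name in sorted(found)]


# Two interchangeable collections: the recognizer itself does not change.
default_types = {"JFK": "person", "United States": "country"}
custom_types = {"JFK": "airport", "United States": "country"}

sentence = "JFK was the 35th president of the United States"
print(recognize(sentence, default_types))
# [('JFK', 'person'), ('United States', 'country')]
print(recognize(sentence, custom_types))
# [('JFK', 'airport'), ('United States', 'country')]
```

Swapping default_types for custom_types changes the labels without touching the recognizer, which is the point of treating the hierarchy as a parameter.
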
See: http://nlp.cs.nyu.edu/ene/version6_1_0eng.html