[NLP2RDF] Changes to NIF Ontologies was: (Re: Extending NIF Ontologies)

Carina Haupt carina.haupt at scai-extern.fraunhofer.de
Thu Mar 8 15:37:23 CET 2012


Hi Sebastian,

sorry for the late reply, but we had some further schema development on 
our side.



On 16.02.2012 18:40, Sebastian Hellmann wrote:
> Hello Carina,
> finally I found some time to review everything and think about an
> extension of NIF. I talked extensively with Raphaël Troncy and
> Giuseppe Rizzo of the NERD project [1] and we finally seem to converge.
> Please give us feedback. I attached an early draft, which is already
> outdated again.
> Here are the *proposed* changes:
>
> 1. the offset URI will not have a human readable part any more. It
> serves no function after all.
> 2. The class str:Document is replaced by str:Context. Context is
> defined in the attached PDF.
> 3. All URIs of type str:String have to refer to an element of the
> powerset of the concatenation of Unicode characters. So they will get a
> strict formal interpretation.
> 4. scms:means will be removed.
> 5. we will include 2 properties sso:oen and sso:oec which allow to
> attach Linked Data URIs to Strings. (this is One Entity per Name and One
> Entity per Context)
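
To make points 1 and 3 concrete, here is a minimal sketch of how an offset URI without a human-readable part could be built and resolved back to a span of Unicode characters in its context. The prefix http://example.org/doc1# and both function names are assumptions for illustration only; the offset_x_y pattern follows the example quoted further down in this thread.

```python
def offset_uri(prefix: str, begin: int, end: int) -> str:
    """Build an offset-based URI that carries no human-readable part."""
    return f"{prefix}offset_{begin}_{end}"


def resolve(uri: str, prefix: str, context: str) -> str:
    """Return the substring of the context that the offset URI refers to."""
    local = uri[len(prefix):]            # e.g. "offset_0_15"
    _, begin, end = local.split("_")
    # Python strings are indexed by Unicode code point, so the slice
    # is a span of Unicode characters within the context.
    return context[int(begin):int(end)]


# Hypothetical context string and URI prefix (not part of NIF itself).
context = "Fraunhofer SCAI is located in Sankt Augustin."
prefix = "http://example.org/doc1#"
uri = offset_uri(prefix, 0, 15)
```

With this reading, the URI identifies the string purely by its position in the context, so the same URI always resolves to the same characters.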

I think these changes make sense. However, I would suggest not naming 
the predicates sso:oec and sso:oen, since these labels are not 
understandable without background knowledge.

> For the string to entity part, we should explain the so-called variant
> 1, and mention 3 cases:
> - case 1: a NER extractor has provided a linked data URI to disambiguate
> the entity ... we re-use it
> - case 2: a NER extractor has provided a non-linked data URI to
> disambiguate the entity (typically, the foaf:homepage of an
> organization) ... we mint a new linked data URI
> - case 3: a NER extractor does not provide disambiguation links ... we
> mint a new linked data URI
>
> We are still unsure what the Linked Data URI will look like, though...
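
The three cases above could be sketched roughly as follows. This is only an illustration under stated assumptions: the test for what counts as a linked data URI (a known prefix list) and the minting base http://example.org/entity/ are hypothetical, since the thread leaves both open.

```python
from typing import Optional
from urllib.parse import quote

# Assumption: a URI counts as "linked data" if it starts with a known prefix.
LINKED_DATA_PREFIXES = ("http://dbpedia.org/resource/",)
# Hypothetical base for newly minted URIs; the real pattern is undecided.
MINT_BASE = "http://example.org/entity/"


def entity_uri(label: str, extractor_uri: Optional[str]) -> str:
    """Pick the URI to attach to a recognized string."""
    # Case 1: the extractor already provided a linked data URI, so re-use it.
    if extractor_uri and extractor_uri.startswith(LINKED_DATA_PREFIXES):
        return extractor_uri
    # Case 2 (non-linked-data URI, e.g. a homepage) and
    # case 3 (no disambiguation link at all): mint a new URI.
    return MINT_BASE + quote(label.replace(" ", "_"))
```

Cases 2 and 3 end up in the same branch here; a fuller version would probably also keep the original homepage URI as an extra statement about the minted entity.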

We are actually linking our entity to a data (or concept) URI of our 
own, which in turn is linked to an existing data (or concept) URI.
What exactly do you mean by "linked" data URI? Does it mean that the 
URI has to be part of an existing database, that it has to be 
dereferenceable, or just that it has to be a URI at all?

Regarding the concept and hit problem: We decided to use the AO 
(Annotation Ontology) schema. The advantage of their schema is that 
they already deal with versioning, storing provenance information for 
annotation sets, and especially they can also handle e.g. images, which 
is one of our next steps. Here sso does not fit our needs, since we 
want to do not only text mining but data mining in general.
But we will still use sso. We plan to use the AO schema only for the 
basic structure to connect the documents with the annotations and these 
with the concepts (which have to be of type skos:Concept or we could not 
use skos to describe the relations between the concepts).
AO also has so-called Selectors, which describe the annotation. We want 
these to be subclasses of NamedEntity, Relation, Image, etc., and we 
would like to reuse all text mining related classes that become part of 
sso.
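
As a rough illustration of the structure described above (document, annotation, selector, concept), here is a small sketch using plain tuples as triples. All ex: names are hypothetical placeholders and not actual AO or sso terms; only rdf:type, skos:Concept and skos:broader are existing vocabulary.

```python
# Triples as (subject, predicate, object); ex: terms are illustrative only.
triples = {
    ("ex:annotation1", "rdf:type", "ex:Annotation"),
    ("ex:annotation1", "ex:annotates", "ex:document1"),
    ("ex:annotation1", "ex:hasSelector", "ex:selector1"),
    ("ex:selector1", "rdf:type", "ex:NamedEntity"),
    ("ex:annotation1", "ex:hasConcept", "ex:concept1"),
    ("ex:concept1", "rdf:type", "skos:Concept"),
    ("ex:concept1", "skos:broader", "ex:concept2"),
    ("ex:concept2", "rdf:type", "skos:Concept"),
}


def objects(subject: str, predicate: str) -> set:
    """All objects of the triples with the given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def untyped_concepts() -> set:
    """Concepts attached to annotations that lack the type skos:Concept."""
    attached = {o for s, p, o in triples if p == "ex:hasConcept"}
    return {c for c in attached if "skos:Concept" not in objects(c, "rdf:type")}
```

The check in untyped_concepts() mirrors the constraint that every attached concept must be a skos:Concept, since otherwise skos relations such as skos:broader could not be used between them.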

Best regards,
Carina

> It is still open how to attach skos:concepts :
>
> On 01/18/2012 02:44 PM, Carina Haupt wrote:
>> I would propose to generate a property like pao:incarnationOf
>> (actually I am not 100% happy with this expression), which needs a
>> pao:Hit as domain and skos:Concept as range, and also is a subproperty
>> of dc-terms:subject and perhaps also of scms:means. But to be able to
>> include scms:means, we would first need to have its definition, so
>> that we can check if everything is consistent.
>
> I would suggest naming the class "NamedEntity", as this would cover all
> three occurrences (OEN, OEC, skos:Concept).
> Here is what Raphael said:
>
>> dcterms:subject seems to fit well:
>> :offset_x_y dcterms:subject
>> <http://dbpedia.org/resource/Category:International_nongovernmental_organizations>
>>
>
> "I thought about that ... but this predicate is very general, on
> purpose, while I think here we want to be a bit more precise, stating
> that a particular string of chars, that happen to be recognized as the
> label of a real world named entity, occurs within a context ... so I
> would prefer creating a new predicate to materialize this semantics,
> thus the sso:oen ... now I'm happy if you define this term in the sso
> ontology or at least if we agree on the definition. "
>
> sso:oen could have a NamedEntity as Domain. It could cover both use
> cases, i.e. any Entity including skos:Concepts. Or we could make a
> separate Property, having NamedEntity as Domain and skos:Concept as
> Range. I am also not 100% happy with calling it "sso:incarnationOf".
> Any suggestions?
>
> I am not sure how your provenance model can cope with the new grounding
> of String and Context on Unicode. I hope it separates the layers more
> nicely now...
> If I look at your image Datenschema.png I think you would need to
> replace str:Document with foaf:Document and then define a str:Context
> node and connect via a property. Should we call it str:occursIn with
> Domain str:Context and Range foaf:Document?
>
> Sorry again for answering so late. Ontologies seem to need endless
> discussions. But I think, we are close to covering the core concepts of
> the NERD domain ....
> All the best,
> Sebastian
>
> [1] http://nerd.eurecom.fr
>
>

-- 
Carina Haupt

Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI)
Schloss Birlinghoven
D-53754 Sankt Augustin

Tel.: +49 - 2241 - 14 - 3480
E-mail: carina.haupt at scai-extern.fraunhofer.de
Internet: http://www.scai.fraunhofer.de

and

Bonn-Aachen International Center for Information Technology (B-IT)
Dahlmannstrasse 2
D-53113 Bonn

E-mail: hauptc at informatik.uni-bonn.de
Internet: http://www.b-it-center.de

