[NLP2RDF] Extending NIF Ontologies

Carina Haupt carina.haupt at scai-extern.fraunhofer.de
Fri Jan 13 11:30:01 CET 2012


Dear Sebastian,

thanks for your reply. I think I have been a bit fuzzy too with my 
description.

On 13.01.2012 07:44, Sebastian Hellmann wrote:
> Dear Carina,
> nice to hear that NIF fits your Use Case, I have several comments inline.
> I think my answers are a little bit fuzzy, as I would need more concrete
> examples to answer more precisely.
>
> On 01/11/2012 05:06 PM, Carina Haupt wrote:
>> Hi,
>>
>> I am part of the OpenPHACTS project and it is my task to present
>> textmining results in RDF. Therefore I am using the String as well as
>> the SSO ontology of NIF. It allows me to represent most of my
>> information, but unfortunately some predicates and classes are missing
>> for my use case.
>> What I want to do is to represent not only the annotations made by the
>> text mining tool, but also the texts they were found in, as well as
>> the concepts which the annotations represent.
>> To represent the texts I use dc-term and to describe the concepts skos.
> NIF uses URIs to represent texts or fragments of text. So I do not
> really understand what you mean by "To represent the texts I use
> dc-term" . Does it mean you annotate NIF URIs (representing texts) with
> the dcterms vocab?

I mean that I use dcterms to describe the publications, for example 
author, title, abstract, etc.

>>
>> What I am missing is the connection between a concept and an
>> annotation, as well as a type for the annotation itself. In my
>> Institute (Fraunhofer SCAI), we call such an annotation a hit. To be
>> able to complete my RDF schema I extended the ontology by adding
>> pao:Hit and pao:incarnationOf (pao stands for Prominer Annotation
>> Ontology and is based on SSO). pao:Hit thereby is a subclass of
>> string:String and pao:incarnationOf needs a skos:Concept as domain and
>> has pao:Hit as range.
> Hm, as far as I understood it, the pao:incarnationOf property is quite
> similar to the scms:means property, which is used to connect Strings
> with DBpedia Entities. So your basic use case is that you have a text
> with "Mentions" or "Hits" which are mapped to "Concepts" . Using your
> own property for this is fine. You could also use dcterms:subject
> directly. An example would really help here.

An example would be:

scai:Hit123 rdf:type pao:Hit .
scai:Hit123 rdfs:label "Acetylsalicylsäure" .
prominer:concept123 pao:incarnationOf scai:Hit123 .
prominer:concept123 rdf:type skos:Concept .
prominer:concept123 rdfs:label "Aspirin" .
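
As a rough illustration of how these statements fit together, the following Python sketch holds the five triples as plain tuples and looks up the concept behind a hit. The helper names `objects` and `concepts_for_hit` are invented for this sketch, and the prefixed names are kept as plain strings rather than resolved to full URIs.

```python
# The example statements as (subject, predicate, object) tuples.
# Prefixed names are kept as plain strings; no namespace URIs are assumed.
TRIPLES = [
    ("scai:Hit123", "rdf:type", "pao:Hit"),
    ("scai:Hit123", "rdfs:label", "Acetylsalicylsäure"),
    ("prominer:concept123", "pao:incarnationOf", "scai:Hit123"),
    ("prominer:concept123", "rdf:type", "skos:Concept"),
    ("prominer:concept123", "rdfs:label", "Aspirin"),
]


def objects(triples, subject, predicate):
    """Return all objects stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def concepts_for_hit(triples, hit):
    """Hypothetical helper: find concepts that point at a hit via pao:incarnationOf."""
    return [s for s, p, o in triples if p == "pao:incarnationOf" and o == hit]


if __name__ == "__main__":
    hit = "scai:Hit123"
    hit_label = objects(TRIPLES, hit, "rdfs:label")[0]
    for concept in concepts_for_hit(TRIPLES, hit):
        concept_label = objects(TRIPLES, concept, "rdfs:label")[0]
        print(f"{hit_label} -> {concept} ({concept_label})")
```

Because the property runs from the concept to the hit, the lookup has to walk it backwards; a property pointing from the hit to the concept would make this a plain forward lookup.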

From your explanation I gather that scms:means is exactly the property 
I need. I just didn't use it, because it lacks any description 
(see 
http://nlp2rdf.lod2.eu/schema/doc/string/objectproperties/means___-186529037.html).

> Can we have a look at the
> prominer ontology? We could include the ontology into NIF and also
> generate Java Classes (OWL2Java) and Documentation (OWLDoc) for it.

I attached the ontology. It does not contain much, since most things are 
covered by your ontologies.

>>
>> Next to the text mining results I also store provenance information
>> where I need to describe the used text mining tools. I think this use
>> case is not covered by NIF so far, but should be suggested in further
>> development. In my case I added the class pao:Annotator and the
>> predicate pao:annotatorClass.
> That is a problem of RDF in general and it is indeed an issue that has
> not been solved. In general, the NIF architecture pushes this problem to
> the client. So if there is a request, the client receives RDF data and
> then needs to store it in a way that it can attach provenance
> information. E.g. it could make one Named Graph for each different tool
> it requested or partition it with higher granularity (e.g. a named graph
> per tool and per request). Another possibility is to use the "prefix"
> variable and encode the provenance in the URI, e.g.
> http://prominer.org/syntax/tool/doc1#....
> In case you would like to annotate individual triples, OWL axiom
> annotations would be a possibility although they increase the size of
> the model immensely.

Actually I am doing exactly what you suggest. I add provenance to the 
different graphs using the OPMV schema. But to fill this schema with 
meaningful information I have to describe the tool I used to create the 
annotations, and for that I need the predicate pao:annotatorClass and, 
even more so, the class pao:Annotator.
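
To make this concrete, here is a minimal Python sketch of a tool description stored in a provenance graph and linked to the named graph that holds the tool's annotations. Everything under the `ex:` prefix and both helper functions are placeholders invented for this sketch. Using pao:annotatorClass with a literal class name is only an assumption about how the predicate is meant to be used, and the OPMV terms shown are one plausible choice, not necessarily the ones in my actual data.

```python
# Quads are (subject, predicate, object, graph) tuples with prefixed names.
# The ex: names are placeholders, not real identifiers.
GRAPH_PROV = "ex:provenance"  # hypothetical graph holding provenance statements


def describe_annotator(annotator, annotator_class, process, graph):
    """Hypothetical helper: describe a tool and link it to an annotation graph.

    Assumes pao:annotatorClass takes the tool's implementation class as a
    literal and that the annotation graph is treated as an OPMV artifact.
    """
    return [
        (annotator, "rdf:type", "pao:Annotator", GRAPH_PROV),
        (annotator, "pao:annotatorClass", f'"{annotator_class}"', GRAPH_PROV),
        (process, "rdf:type", "opmv:Process", GRAPH_PROV),
        (process, "opmv:wasControlledBy", annotator, GRAPH_PROV),
        (graph, "rdf:type", "opmv:Artifact", GRAPH_PROV),
        (graph, "opmv:wasGeneratedBy", process, GRAPH_PROV),
    ]


def annotators_for_graph(quads, graph):
    """Follow graph -> process -> annotator through the provenance quads."""
    processes = [o for s, p, o, g in quads
                 if s == graph and p == "opmv:wasGeneratedBy"]
    return [o for s, p, o, g in quads
            if s in processes and p == "opmv:wasControlledBy"]


if __name__ == "__main__":
    quads = describe_annotator(
        annotator="ex:prominer",
        annotator_class="org.example.ProMinerAnnotator",  # made-up class name
        process="ex:run-2012-01-13",
        graph="ex:graph/prominer-run-1",
    )
    for s, p, o, g in quads:
        print(s, p, o, g, ".")
    print(annotators_for_graph(quads, "ex:graph/prominer-run-1"))
```

Without a class such as pao:Annotator, the OPMV statements would only say that some process produced the graph, not which kind of tool was behind it.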

> For your specific application you might also mix RDF with a relational
> database table: varChar:TripleID, varChar:key, varChar:value
> As TripleID you can use the md5 hash over the NTriple serialization of
> the triple. This should be sufficiently unique. (Do not forget to index the
> first column, otherwise it will be very slow.) key and value would be
> the provenance info. It only works one-way of course.
>
> Regards,
> Sebastian
>
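
The relational side table suggested above could look roughly like the following Python sketch, which uses only the standard library (`hashlib` and `sqlite3`). The table name `provenance`, the column names `prov_key` and `prov_value` (chosen to avoid the SQL keyword `key`), the helper functions, and the example.org URI are all assumptions made for this sketch.

```python
import hashlib
import sqlite3


def triple_id(ntriple_line):
    """MD5 hex digest over the UTF-8 bytes of one N-Triples line."""
    return hashlib.md5(ntriple_line.strip().encode("utf-8")).hexdigest()


def open_store(path=":memory:"):
    """Create the side table and index its first column."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS provenance ("
        "triple_id VARCHAR(32) NOT NULL, "
        "prov_key VARCHAR(255) NOT NULL, "
        "prov_value VARCHAR(255))"
    )
    # Without this index every lookup scans the whole table.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_provenance_triple "
        "ON provenance (triple_id)"
    )
    return conn


def add_provenance(conn, ntriple_line, key, value):
    """Attach one key/value pair to the triple identified by its hash."""
    conn.execute(
        "INSERT INTO provenance (triple_id, prov_key, prov_value) "
        "VALUES (?, ?, ?)",
        (triple_id(ntriple_line), key, value),
    )


def lookup(conn, ntriple_line):
    """Return all (key, value) pairs recorded for a triple."""
    rows = conn.execute(
        "SELECT prov_key, prov_value FROM provenance "
        "WHERE triple_id = ? ORDER BY prov_key",
        (triple_id(ntriple_line),),
    )
    return rows.fetchall()


if __name__ == "__main__":
    line = ('<http://example.org/concept123> '
            '<http://www.w3.org/2000/01/rdf-schema#label> "Aspirin" .')
    conn = open_store()
    add_provenance(conn, line, "tool", "ProMiner")
    print(lookup(conn, line))
```

The hash depends on the exact serialization, so whitespace and escaping have to be produced the same way every time. The lookup also only goes from a triple to its provenance; the triple cannot be recovered from the hash.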

Regards,
Carina

-- 

Carina Haupt

Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI)
Schloss Birlinghoven
D-53754 Sankt Augustin

Tel.: +49 - 2241 - 14 - 3480
E-mail: carina.haupt at scai-extern.fraunhofer.de
Internet: http://www.scai.fraunhofer.de

and

Bonn-Aachen International Center for Information Technology (B-IT)
Dahlmannstrasse 2
D-53113 Bonn

E-mail: hauptc at informatik.uni-bonn.de
Internet: http://www.b-it-center.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pao.owl
Type: text/xml
Size: 4056 bytes
Desc: not available
URL: <http://lists.informatik.uni-leipzig.de/pipermail/nlp2rdf/attachments/20120113/eba93964/attachment.xml>

