[NLP2RDF] Changes to NIF Ontologies was: (Re: Extending NIF Ontologies)

Sebastian Hellmann hellmann at informatik.uni-leipzig.de
Thu Feb 16 18:40:25 CET 2012


Hello Carina,
I finally found some time to review everything and think about an
extension of NIF. I talked extensively with Raphaël Troncy and
Giuseppe Rizzo of the NERD project [1], and we finally seem to be
converging. Please give us feedback. I attached an early draft, which
is already outdated again.
Here are the *proposed* changes:

1. The offset URI will no longer have a human-readable part. It
serves no function after all.
2. The class str:Document is replaced by str:Context. Context is
defined in the attached PDF.
3. All URIs of type str:String have to refer to an element of the
powerset of the concatenation of Unicode characters. So they will get a
strict formal interpretation.
4. scms:means will be removed.
5. We will include two properties, sso:oen and sso:oec, which allow
attaching Linked Data URIs to Strings (these stand for One Entity per
Name and One Entity per Context).
Note that the document is outdated. There will be no blank nodes or the
like, but rather this:
:offset_23107_23110 sso:oen dbpedia:W3C .
dbpedia:W3C rdf:type nerd:Organization .
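
As a rough illustration of changes 1 and 5, the two triples above could be
generated as follows. This is only a sketch: the "offset_<begin>_<end>"
pattern is read off the example, and the function names offset_uri and
oen_triples are hypothetical, not part of NIF.

```python
# Sketch only: the URI pattern is inferred from the example in this mail.
def offset_uri(begin: int, end: int, prefix: str = ":") -> str:
    """Build an offset-based string URI without a human-readable part."""
    if begin < 0 or end < begin:
        raise ValueError("offsets must satisfy 0 <= begin <= end")
    return f"{prefix}offset_{begin}_{end}"


def oen_triples(begin: int, end: int, entity: str, entity_type: str) -> list:
    """Return Turtle-like lines linking a string to an entity via sso:oen."""
    subject = offset_uri(begin, end)
    return [
        f"{subject} sso:oen {entity} .",
        f"{entity} rdf:type {entity_type} .",
    ]


lines = oen_triples(23107, 23110, "dbpedia:W3C", "nerd:Organization")
print("\n".join(lines))
```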

For the string-to-entity part, we should explain the so-called variant
1 and mention three cases:
   - case 1: a NER extractor has provided a linked data URI to 
disambiguate the entity ... we re-use it
   - case 2: a NER extractor has provided a non-linked data URI to 
disambiguate the entity (typically, the foaf:homepage of an 
organization) ... we mint a new linked data URI
   - case 3: a NER extractor does not provide disambiguation links ... 
we mint a new linked data URI

We are still unsure what the Linked Data URI will look like, though...
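
The three cases can be sketched as a small decision function. Since the
shape of minted URIs is still open, MINT_BASE and mint_uri below are pure
placeholders, and the is_linked_data flag is assumed to be supplied by the
caller; none of these names come from NIF or NERD.

```python
from typing import Optional

# Hypothetical namespace: the thread leaves the shape of minted URIs open.
MINT_BASE = "http://example.org/entity/"


def mint_uri(label: str) -> str:
    """Placeholder minting scheme (an assumption, not part of the proposal)."""
    return MINT_BASE + label.strip().replace(" ", "_")


def entity_uri(label: str, extractor_uri: Optional[str], is_linked_data: bool) -> str:
    """Pick the URI to attach to a recognized string, following the three cases."""
    if extractor_uri is not None and is_linked_data:
        # Case 1: the extractor gave a linked data URI, so re-use it.
        return extractor_uri
    # Case 2 (non-linked-data URI, e.g. a homepage) and
    # case 3 (no disambiguation link at all): mint a new URI.
    return mint_uri(label)
```

Cases 2 and 3 end up in the same branch here; whether a minted URI should
keep a pointer to the original homepage is not settled in this thread.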

It is still open how to attach skos:Concepts:

On 01/18/2012 02:44 PM, Carina Haupt wrote:
> I would propose to generate a property like pao:incarnationOf 
> (actually I am not 100% happy with this expression), which needs a 
> pao:Hit as domain and skos:Concept as range, and also is a subproperty 
> of dc-terms:subject and perhaps also of scms:means. But to be able 
> include scms:means, we would first need to have it's definition, so 
> that we can check if everything is consistent.

I would suggest naming the class "NamedEntity", as this would cover all
three occurrences (OEN, OEC, skos:Concept).
Here is what Raphael said:

> dcterms:subject seems to fit well:
> :offset_x_y dcterms:subject
> <http://dbpedia.org/resource/Category:International_nongovernmental_organizations> 
>

"I thought about that ... but this predicate is very general, on 
purpose, while I think here we want to be a bit more precise, stating 
that a particular string of chars, that happen to be recognized as the 
label of a real world named entity, occurs within a context ... so I 
would prefer creating a new predicate to materialize this semantics, 
thus the sso:oen ... now I'm happy if you define this term in the sso 
ontology or at least if we agree on the definition. "

sso:oen could have NamedEntity as its domain. It could cover both use
cases, i.e. any entity, including skos:Concepts. Or we could make a
separate property with NamedEntity as domain and skos:Concept as
range. I am also not 100% happy with calling it "sso:incarnationOf".
Any suggestions?

I am not sure how your provenance model can cope with the new grounding
of String and Context on Unicode. I hope it separates the layers more
cleanly now...
Looking at your image Datenschema.png, I think you would need to
replace str:Document with foaf:Document, then define a str:Context
node and connect the two via a property. Should we call it
str:occursIn, with domain str:Context and range foaf:Document?
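
To make the suggestion concrete, the resulting triples might look like the
output of the sketch below. str:occursIn is only the name proposed in this
mail, and the context and document identifiers are made-up examples.

```python
def context_triples(context: str, document: str) -> list:
    """Return Turtle-like lines connecting a str:Context to a foaf:Document.

    str:occursIn is the property name proposed in this mail, not an
    established term.
    """
    return [
        f"{context} rdf:type str:Context .",
        f"{document} rdf:type foaf:Document .",
        f"{context} str:occursIn {document} .",
    ]


# Hypothetical identifiers, for illustration only.
triples = context_triples(":context_1", "<http://example.org/doc.html>")
print("\n".join(triples))
```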

Sorry again for answering so late. Ontologies seem to need endless
discussions. But I think we are close to covering the core concepts of
the NERD domain.
All the best,
Sebastian

[1] http://nerd.eurecom.fr


-- 
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Projects: http://nlp2rdf.org , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: nif_draft.pdf
Type: application/pdf
Size: 231984 bytes
Desc: not available
URL: <http://lists.informatik.uni-leipzig.de/pipermail/nlp2rdf/attachments/20120216/5f3e1737/attachment-0001.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Datenschema.png
Type: image/png
Size: 86476 bytes
Desc: not available
URL: <http://lists.informatik.uni-leipzig.de/pipermail/nlp2rdf/attachments/20120216/5f3e1737/attachment-0001.png>

