[NLP2RDF] document and corpus level aggregates

Steve Cassidy steve.cassidy at mq.edu.au
Thu May 30 08:24:01 CEST 2013


Thanks Felix. Is there a difference, though, between making an assertion
about the document and making one about the string that results from
pre-processing the document?

It's probably not an important point but it seems odd to me to qualify it
in this way.
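
To make the distinction concrete, here is a minimal sketch of how the
extracted string can differ from the raw document. It assumes a
pre-processing step that only collapses whitespace; the function names
extract_context and context_uri and the sample text are hypothetical and
not part of NIF or any converter.

```python
import re


def extract_context(raw: str) -> str:
    """Collapse whitespace runs and trim the ends.

    Hypothetical stand-in for the pre-processing a real converter performs.
    """
    return re.sub(r"\s+", " ", raw).strip()


def context_uri(doc_uri: str, text: str) -> str:
    """Build a char-range URI spanning the whole extracted string."""
    return f"{doc_uri}#char=0,{len(text)}"


# Made-up sample content, not taken from Alcoholism.txt.
raw = "Alcoholism   is\n a  broad term.\n"
context = extract_context(raw)
uri = context_uri("Alcoholism.txt", context)

# The raw document and the extracted string have different lengths, so the
# offsets in the URI describe the extracted string, not the file on disk.
print(len(raw), len(context), uri)
```

Under that assumption the end offset in the URI is the length of the
extracted string, which need not match the length of the document itself.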

Steve


On 30 May 2013 16:14, Felix Sasaki <fsasaki at w3.org> wrote:

>  On 30.05.13 08:07, Steve Cassidy wrote:
>
>
>
>> The basic unit in NIF is the nif:Context, so the document-level is
>> covered, when the string in a nif:Context equals the content of a document.
>
> ...
>> <Alcoholism.txt#char=37028,37043>
>>         a  nif:RFC5147String ;
>>         nif:beginIndex "37028" ;
>>         nif:endIndex "37043" ;
>>         itsrdf:taIdentRef <http://dbpedia.org/resource/Benzodiazepine> ;
>>         nif:referenceContext <Alcoholism.txt#char=0,91429>  .
>>
>
>  Just wondering why you don't use <Alcoholism.txt> when making assertions
> about the document as a whole rather than giving the entire character range
> as a qualifier.
>
>
> Hi Steve,
>
> Sebastian may have a different answer, but here is my view, based on how
> this is used in ITS 2.0: when you convert a document like
>
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-HTML-whitespace-normalization
> to NIF, you make a lot of decisions about what to drop (white space nodes,
> content of HTML "head" or "script" inside "body") and how to segment (e.g.
> not extracting the content of "span" separately but rather as part of "p").
> nif:referenceContext, together with nif:isString, tells you exactly what
> the complete extracted string is.
>
> Best,
>
> Felix
>
>    Presumably the same assertion would be true of
> <Alcoholism.txt#char=0,91427>  too but if you are trying to encode document
> level meta-data and you have an identifier for the document, why not use
> it?
>
>  Steve
>
>  --
> Department of Computing, Macquarie University
> http://web.science.mq.edu.au/~cassidy/
>
>
> _______________________________________________
> NLP2RDF mailing list
> NLP2RDF at lists.informatik.uni-leipzig.de
> http://lists.informatik.uni-leipzig.de/mailman/listinfo/nlp2rdf
>
>
>


-- 
Department of Computing, Macquarie University
http://web.science.mq.edu.au/~cassidy/