[NLP2RDF] document and corpus level aggregates

Thu May 30 09:01:58 CEST 2013

On 30 May 2013 16:39, Felix Sasaki <fsasaki at w3.org> wrote:

> Well, to avoid the problem you need two pieces of information:
> - document URI independent of complete character range
> - document URI + complete character range
> http://example.com/exampledoc.html#=char=0,29 gives you both, and the
> ability to distinguish between different calculations of complete character
> ranges.
>

<http://example.com/exampledoc.html#=char=0,29> xx:wordcount 5 .
<http://example.com/exampledoc.html> xx:wordcount 5 .

These are two separate statements and not related unless we say

<http://example.com/exampledoc.html>
        xx:full_character_range <http://example.com/exampledoc.html#=char=0,29> .

which of course you could assert.
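To make the point concrete, here is a rough sketch in plain Python (no RDF library; triples are just tuples, and the helper name subjects_linked is made up for illustration). It only shows that the two wordcount statements stay unconnected until the linking triple is asserted.

```python
# Triples modelled as plain (subject, predicate, object) tuples.
DOC = "http://example.com/exampledoc.html"
RANGE = DOC + "#=char=0,29"

graph = {
    (RANGE, "xx:wordcount", 5),
    (DOC, "xx:wordcount", 5),
}

def subjects_linked(graph, doc, rng):
    """Hypothetical helper: True only if the graph explicitly says that
    rng is the full character range of doc."""
    return (doc, "xx:full_character_range", rng) in graph

# Without the extra statement the two subjects are unrelated resources.
before = subjects_linked(graph, DOC, RANGE)

# Asserting the link is what relates them.
graph.add((DOC, "xx:full_character_range", RANGE))
after = subjects_linked(graph, DOC, RANGE)

print(before, after)
```

Nothing here infers the link from the shape of the URIs; the relation exists only because it was stated.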

I guess the question arises for a processing component that wants to make an
assertion in its output about the document as a whole, so that a subsequent
step can use it.  Should it use the input document URI, or make an assertion
about the character range that it used to represent the document
internally?  Given that the character range might differ between
components, it would seem useful to have a way of making assertions about
the whole document that doesn't depend on how it was pre-processed.

> Can you give a triple and a SPARQL query that only works if we drop
> #=char=0,29 from the URI?
>

Well, it would be the result of two components making assertions about
different character ranges, each believing that it is making an assertion
about the whole document.
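
As a rough illustration of that case, again in plain Python with tuples standing in for triples: the second range (0,31), the property xx:sentencecount and the helper properties_of are all invented for the example. The assumption is simply that two components normalise the text differently and so disagree on the full range.

```python
DOC = "http://example.com/exampledoc.html"

# Assumed scenario: component A sees 29 characters, component B sees 31.
# Each believes its range is "the whole document".
range_based = {
    (DOC + "#=char=0,29", "xx:wordcount", 5),       # from component A
    (DOC + "#=char=0,31", "xx:sentencecount", 1),   # from component B (hypothetical)
}

# The same two assertions made against the bare document URI instead.
doc_based = {
    (DOC, "xx:wordcount", 5),
    (DOC, "xx:sentencecount", 1),
}

def properties_of(graph, subject):
    """Hypothetical helper: collect predicate -> object for one subject,
    roughly what a SPARQL pattern with a fixed subject would return."""
    return {p: o for (s, p, o) in graph if s == subject}

# A later step that only knows component A's range misses B's assertion.
by_range = properties_of(range_based, DOC + "#=char=0,29")

# With the bare document URI, both assertions are found together.
by_doc = properties_of(doc_based, DOC)

print(by_range)
print(by_doc)
```

So the lookup keyed on one component's range finds only that component's statement, while the lookup on the document URI finds both.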

Steve

-- 
Department of Computing, Macquarie University
http://web.science.mq.edu.au/~cassidy/