<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 30.05.13 09:01, Steve
Cassidy wrote:<br>
</div>
<blockquote
cite="mid:CADg8aoinp3bPqSPK=hkNwG0NHpK_b+R7Ec5L2oiVAjgkQ-SVrg@mail.gmail.com"
type="cite">
<div dir="ltr">On 30 May 2013 16:39, Felix Sasaki <span dir="ltr">&lt;<a
moz-do-not-send="true" href="mailto:fsasaki@w3.org"
target="_blank">fsasaki@w3.org</a>&gt;</span> wrote:
<div><br>
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Well, to avoid
the problem you need two pieces of information:<br>
- document URI independent of complete character range<br>
- document URI + complete character range <br>
<a moz-do-not-send="true"
href="http://example.com/exampledoc.html#=char=0,29"
target="_blank">http://example.com/exampledoc.html#=char=0,29</a>
gives you both, and the ability to distinguish between
different calculations of complete character ranges.<br>
</div>
</blockquote>
<div style=""><br>
</div>
<div>&lt;<a moz-do-not-send="true"
href="http://example.com/exampledoc.html#=char=0,29"
target="_blank">http://example.com/exampledoc.html#=char=0,29</a>&gt;
xx:wordcount 5 .</div>
<div>&lt;<a moz-do-not-send="true"
href="http://example.com/exampledoc.html"
target="_blank">http://example.com/exampledoc.html</a>&gt;
xx:wordcount 5 .<br>
</div>
<div><br>
</div>
<div style="">These are two separate statements, and they are
not related unless we also say</div>
<div style=""><br class="">
&lt;<a moz-do-not-send="true"
href="http://example.com/exampledoc.html"
target="_blank">http://example.com/exampledoc.html</a>&gt; </div>
<div style=""> xx:full_character_range &lt;<a
moz-do-not-send="true"
href="http://example.com/exampledoc.html#=char=0,29"
target="_blank">http://example.com/exampledoc.html#=char=0,29</a>&gt;
.</div>
<div style=""><br>
</div>
<div style="">which of course you could assert. </div>
<div style=""><br>
</div>
<div style="">I guess the question concerns a processing
component that wants to make an assertion in its output
about the document as a whole, so that a subsequent step
can use it. Should it use the input document URI, or
make an assertion about the character range that it used
to represent the document internally? Given that the
character range might differ between
components, it would seem useful to have a way of making
assertions about the whole document that doesn't depend
on how it was pre-processed.</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
<br>
I think you have the pre-processing information via
nif:wasConvertedFrom, see<br>
<a class="moz-txt-link-freetext" href="http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/nif/EX-nif-conversion-output.xml">http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/nif/EX-nif-conversion-output.xml</a><br>
and a URI like<br>
<a class="moz-txt-link-freetext" href="http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/b[1])">http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/b[1])</a><br>
gives you the source of the NIF data before the pre-processing.
nif:wasConvertedFrom is defined in <br>
<a class="moz-txt-link-freetext" href="http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/version-1.0/nif-core.ttl">http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/version-1.0/nif-core.ttl</a><br>
as a subproperty of prov:wasDerivedFrom.<br>
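<br>
To make this concrete, here is a rough sketch in plain Python (no RDF
library) of how a consumer might follow nif:wasConvertedFrom and then
drop the fragment, so that assertions made on different character
ranges end up grouped under the same document URI. The second range
(0,31), its word count, its provenance triple, and the helper names are
all made up for illustration, and the prefixed names are kept as plain
strings rather than resolved to real namespaces:<br>
<pre>
```python
# Hypothetical triples: two components processed the same document but
# computed different "complete" character ranges (0,29 and 0,31).
TRIPLES = [
    ("http://example.com/exampledoc.html#=char=0,29", "xx:wordcount", 5),
    ("http://example.com/exampledoc.html#=char=0,29", "nif:wasConvertedFrom",
     "http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/b[1])"),
    ("http://example.com/exampledoc.html#=char=0,31", "xx:wordcount", 6),
    ("http://example.com/exampledoc.html#=char=0,31", "nif:wasConvertedFrom",
     "http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/b[1])"),
]


def document_uri(uri):
    """Drop the fragment, leaving the URI of the document itself."""
    return uri.partition("#")[0]


def converted_from(triples, subject):
    """Return the nif:wasConvertedFrom object for a subject, or None."""
    for s, p, o in triples:
        if s == subject and p == "nif:wasConvertedFrom":
            return o
    return None


def wordcounts_by_document(triples):
    """Group xx:wordcount values by the fragment-free source document URI."""
    grouped = {}
    for s, p, o in triples:
        if p != "xx:wordcount":
            continue
        # Prefer the recorded source; fall back to the subject itself.
        source = converted_from(triples, s) or s
        grouped.setdefault(document_uri(source), []).append((s, o))
    return grouped
```
</pre>
With these triples, wordcounts_by_document(TRIPLES) has a single key,
http://example.com/exampledoc.html, holding both word counts together
with the character-range URI each one was asserted on, so the two
calculations stay distinguishable.<br>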
<br>
Best,<br>
<br>
Felix<br>
<br>
<br>
<blockquote
cite="mid:CADg8aoinp3bPqSPK=hkNwG0NHpK_b+R7Ec5L2oiVAjgkQ-SVrg@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<div class="gmail_extra">
<div class="gmail_quote">
<div style=""><br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Can you give a
triple and a SPARQL query that only work if we drop
#=char=0,29 from the URI?<br>
<div class="im"><br>
</div>
</div>
</blockquote>
<div style="">Well, it would be the result of two
components making assertions about different character
ranges, each believing that it is making an assertion
about the whole document.</div>
<div style=""><br>
</div>
<div style="">Steve</div>
<div style=""><br>
</div>
</div>
-- <br>
Department of Computing, Macquarie University
<div><a moz-do-not-send="true"
href="http://web.science.mq.edu.au/%7Ecassidy/"
target="_blank">http://web.science.mq.edu.au/~cassidy/</a></div>
</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>