<div dir="ltr">Thanks Felix. Is there a difference, though, between making an assertion about the document and making one about the string that results from pre-processing the document? <div><br></div><div style>It's probably not an important point, but it seems odd to me to qualify it in this way.</div>
<div style><br></div><div style>Steve</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On 30 May 2013 16:14, Felix Sasaki <span dir="ltr"><<a href="mailto:fsasaki@w3.org" target="_blank">fsasaki@w3.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>On 30.05.13 08:07, Steve
Cassidy wrote:<br>
</div><div class="im">
<blockquote type="cite">
<div dir="ltr"><br>
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br>
The basic unit in NIF is the nif:Context, so the
document-level is covered, when the string in a
nif:Context equals the content of a document. </blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">...<br>
<Alcoholism.txt#char=37028,37043><br>
a nif:RFC5147String ;<br>
nif:beginIndex "37028" ;<br>
nif:endIndex "37043" ;<br>
itsrdf:taIdentRef <<a href="http://dbpedia.org/resource/Benzodiazepine" target="_blank">http://dbpedia.org/resource/Benzodiazepine</a>> ;<br>
nif:referenceContext
<Alcoholism.txt#char=0,91429> .<br>
</blockquote>
<div><br>
</div>
<div>Just wondering why you don't use
<Alcoholism.txt> when making assertions about the
document as a whole rather than giving the entire
character range as a qualifier. </div>
</div>
</div>
</div>
</blockquote>
<br></div>
Hi Steve,<br>
<br>
Sebastian may have a different answer, but here is my view, based on how
this is used in ITS 2.0: when you convert a document like<br>
<a href="http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-HTML-whitespace-normalization" target="_blank">http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-HTML-whitespace-normalization</a><br>
to NIF, you make many decisions about what to drop (white-space
nodes, the content of HTML "head", or "script" inside "body") and how to
segment (e.g. extracting the content of "span" not separately but as
part of "p"). Together with nif:isString, nif:referenceContext tells
you exactly what the complete extracted string is, and that string
can differ from the raw document content.<br>
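<br>
A minimal sketch of the effect, in Python and under stated assumptions: the file name "example.txt", the sample text, and the helper names normalize and char_uri are all hypothetical, and whitespace collapsing stands in for whatever extraction policy a real converter applies. It only shows that character offsets are computed against the extracted string, not against the raw document.<br>

```python
import re

def normalize(raw):
    """Collapse whitespace runs into single spaces (one possible extraction policy)."""
    return re.sub(r"\s+", " ", raw).strip()

def char_uri(doc, begin, end):
    """Build an RFC 5147-style fragment URI like the ones quoted in this thread."""
    return f"{doc}#char={begin},{end}"

raw = "Hello   big\n world"      # hypothetical raw document content
context = normalize(raw)          # the extracted string: "Hello big world"

# Offsets of a mention are measured inside the extracted string.
begin = context.index("world")
end = begin + len("world")

context_uri = char_uri("example.txt", 0, len(context))  # example.txt#char=0,15
mention_uri = char_uri("example.txt", begin, end)       # example.txt#char=10,15

# The same word sits at a different offset in the raw text.
raw_begin = raw.index("world")    # 13, versus 10 in the extracted string
```

Here the raw text has 18 characters but the context covers 0 to 15, so the character range on the context URI records which extracted string the other offsets refer to.<br>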
<br>
Best,<br>
<br>
Felix<br>
<br>
<blockquote type="cite"><div class="im">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> Presumably the same assertion would be true
of <Alcoholism.txt#char=0,91427> too but if you are
trying to encode document level meta-data and you have an
identifier for the document, why not use it? </div>
<div><br>
</div>
<div>Steve</div>
<div> </div>
</div>
-- <br>
Department of Computing, Macquarie University
<div><a href="http://web.science.mq.edu.au/%7Ecassidy/" target="_blank">http://web.science.mq.edu.au/~cassidy/</a></div>
</div>
</div>
<br>
<fieldset></fieldset>
<br>
</div><div class="im"><pre>_______________________________________________
NLP2RDF mailing list
<a href="mailto:NLP2RDF@lists.informatik.uni-leipzig.de" target="_blank">NLP2RDF@lists.informatik.uni-leipzig.de</a>
<a href="http://lists.informatik.uni-leipzig.de/mailman/listinfo/nlp2rdf" target="_blank">http://lists.informatik.uni-leipzig.de/mailman/listinfo/nlp2rdf</a>
</pre>
</div></blockquote>
<br>
</div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>Department of Computing, Macquarie University<div><a href="http://web.science.mq.edu.au/~cassidy/" target="_blank">http://web.science.mq.edu.au/~cassidy/</a></div>
</div>