<div dir="ltr">Thanks, Felix. Is there a difference, though, between making an assertion about the document and making one about the string that results from pre-processing the document? <div><br></div><div style>It&#39;s probably not an important point, but it seems odd to me to qualify it in this way.</div>

<div style><br></div><div style>Steve</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On 30 May 2013 16:14, Felix Sasaki <span dir="ltr">&lt;<a href="mailto:fsasaki@w3.org" target="_blank">fsasaki@w3.org</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  
  <div bgcolor="#FFFFFF" text="#000000">

    <div>On 30.05.13 08:07, Steve Cassidy wrote:<br>

    </div><div class="im">

    <blockquote type="cite">

      <div dir="ltr"><br>

        <div class="gmail_extra">

          <div class="gmail_quote">

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><br>

              The basic unit in NIF is the nif:Context, so the document
              level is covered when the string in a nif:Context equals the
              content of the document. </blockquote>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">...<br>

              &lt;Alcoholism.txt#char=37028,37043&gt;<br>

                      a  nif:RFC5147String ;<br>

                      nif:beginIndex &quot;37028&quot; ;<br>

                      nif:endIndex &quot;37043&quot; ;<br>

                      itsrdf:taIdentRef &lt;<a href="http://dbpedia.org/resource/Benzodiazepine" target="_blank">http://dbpedia.org/resource/Benzodiazepine</a>&gt;

              ;<br>

                      nif:referenceContext

              &lt;Alcoholism.txt#char=0,91429&gt;  .<br>

            </blockquote>

            <div><br>

            </div>

            <div>Just wondering why you don&#39;t use

              &lt;Alcoholism.txt&gt; when making assertions about the

              document as a whole rather than giving the entire

              character range as a qualifier. </div>

          </div>

        </div>

      </div>

    </blockquote>

    <br></div>

    Hi Steve,<br>

    <br>

    Sebastian may have a different answer, but here is my view, based on
    how this is used in ITS 2.0: when you convert a document like<br>

<a href="http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-HTML-whitespace-normalization" target="_blank">http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-HTML-whitespace-normalization</a><br>


    to NIF, you make many decisions about what to drop (whitespace
    nodes, the content of HTML &quot;head&quot;, or &quot;script&quot; inside &quot;body&quot;) and
    how to segment (e.g. not extracting the content of &quot;span&quot; separately
    but rather as part of &quot;p&quot;). Together with nif:isString,
    nif:referenceContext tells you exactly what the complete extracted
    string is.<br>
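    <br>
    To illustrate the point, here is a minimal Python sketch. It assumes a
    hypothetical converter that only collapses whitespace; a real ITS 2.0
    to NIF conversion makes many more decisions. The function names and
    the sample file name are illustrative and not part of NIF or ITS 2.0.
    It shows that the #char=0,N context URI describes the extracted
    string, whose length can differ from that of the raw document.<br>
<pre>
```python
import re


def extract_context_string(raw_text: str) -> str:
    """Hypothetical pre-processing step: collapse whitespace runs and trim.

    A real converter would also drop markup, head and script content, etc.
    """
    return re.sub(r"\s+", " ", raw_text).strip()


def context_uri(document_uri: str, context_string: str) -> str:
    """Build an RFC 5147-style URI covering the whole extracted string."""
    return f"{document_uri}#char=0,{len(context_string)}"


def annotation_uri(document_uri: str, context_string: str, phrase: str) -> str:
    """Build a URI for the first occurrence of phrase, with offsets counted
    in the extracted string (not in the raw document)."""
    begin = context_string.index(phrase)
    return f"{document_uri}#char={begin},{begin + len(phrase)}"


# Illustrative raw content with whitespace that the converter drops.
raw = "  Alcoholism \n\n  and   Benzodiazepine\tuse  "
context = extract_context_string(raw)

print(len(raw), len(context))
print(context_uri("Example.txt", context))
print(annotation_uri("Example.txt", context, "Benzodiazepine"))
```
</pre>
    Because the offsets refer to the extracted string rather than the raw
    document, a consumer can check them against the exact text that was
    annotated.<br>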

    <br>

    Best,<br>

    <br>

    Felix<br>

    <br>

    <blockquote type="cite"><div class="im">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div> Presumably the same assertion would be true
              of &lt;Alcoholism.txt#char=0,91427&gt; too, but if you are
              trying to encode document-level metadata and you have an
              identifier for the document, why not use it? </div>

            <div><br>

            </div>

            <div>Steve</div>

            <div> </div>

          </div>

          -- <br>

          Department of Computing, Macquarie University

          <div><a href="http://web.science.mq.edu.au/%7Ecassidy/" target="_blank">http://web.science.mq.edu.au/~cassidy/</a></div>

        </div>

      </div>

      <br>

      </div><div class="im">

    </div></blockquote>

    <br>

  </div>


</blockquote></div><br><br clear="all"><div><br></div>-- <br>Department of Computing, Macquarie University<div><a href="http://web.science.mq.edu.au/~cassidy/" target="_blank">http://web.science.mq.edu.au/~cassidy/</a></div>


</div>