<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">


    <p>


      <style type="text/css">

        <!--

                @page { margin: 2cm }

                h3.cjk { font-family: "Droid Sans Fallback" }

                h3.ctl { font-family: "FreeSans" }

                p { margin-bottom: 0.25cm; line-height: 120% }

                a:link { so-language: zxx }

        -->

        </style>

    </p>

    <p><b>DBpedia Open Text Extraction Challenge - TextExt</b></p>

    <p>Website: <a href="http://wiki.dbpedia.org/textext">http://wiki.dbpedia.org/textext</a></p>

    <p><strong><u>Disclaimer: This call is under constant development;
          please refer to the news section for updates. We acknowledge
          the initial engineering effort involved, so for the first
          submissions we will be lenient on the technical requirements
          and focus the evaluation on the extracted triples. Late
          submissions are allowed if they are coordinated with us.</u></strong></p>

    <h3 class="western">Background</h3>

    <p>DBpedia and Wikidata currently focus primarily on representing

      factual knowledge as contained in Wikipedia infoboxes. A vast

      amount

      of information, however, is contained in the unstructured

      Wikipedia

      article texts. With the DBpedia Open Text Extraction Challenge, we
      aim to spur knowledge extraction from Wikipedia article texts in
      order to dramatically broaden and deepen the structured
      DBpedia/Wikipedia data and to provide a platform for benchmarking
      various extraction tools.</p>

    <h3 class="western">Mission</h3>

    <p>Wikipedia has become the ubiquitous source of knowledge for the
      world, enabling people to look up definitions, quickly become
      familiar with new topics, read background information on news
      events and much more, even settling coffee house arguments with a
      quick mobile search. The general mission of DBpedia is to harvest
      Wikipedia’s knowledge, refine and structure it, and then
      disseminate it on the web, in a free and open manner, for IT users
      and businesses.</p>

    <h3 class="western">News and next events</h3>

    <p>Twitter: <a href="https://twitter.com/dbpedia">Follow @dbpedia</a>,

      Hashtag: <a

href="https://twitter.com/search?f=tweets&q=%23dbpedianlp&src=typd">#dbpedianlp</a></p>

    <ul>

      <li>

        <p style="margin-bottom: 0cm">The <a href="http://ldk2017.org/">LDK</a>
          conference joined the challenge (deadlines: March 19th and
          April 24th)</p>

      </li>

      <li>

        <p style="margin-bottom: 0cm"><a
            href="http://2017.semantics.cc/">SEMANTiCS</a> joined the
          challenge (deadlines: June 11th and July 17th)</p>

      </li>

      <li>

        <p style="margin-bottom: 0cm">Feb 20th, 2017: Full example added

          to this website </p>

      </li>

      <li>

        <p>March 1st, 2017: Docker image (beta) <a

href="https://github.com/NLP2RDF/DBpediaOpenDBpediaTextExtractionChallenge">https://github.com/NLP2RDF/DBpediaOpenDBpediaTextExtractionChallenge</a>

        </p>

      </li>

    </ul>

    <p>Coming soon:</p>

    <ul>

      <li>

        <p style="margin-bottom: 0cm">Beginning of March: full example
          within the Docker image</p>

      </li>

      <li>

        <p>Beginning of March: DBpedia full article text and tables
          (currently only abstracts are available) <a
            href="http://downloads.dbpedia.org/2016-10/core-i18n/">http://downloads.dbpedia.org/2016-10/core-i18n/</a>
        </p>

      </li>

    </ul>

    <h3 class="western">Methodology</h3>

    <p>The DBpedia Open Text Extraction Challenge differs significantly
      from other challenges in language technology and related areas in
      that it is not a one-time call, but a continuously growing and
      expanding challenge that aims to <strong>sustainably</strong>
      advance the state of the art and transcend boundaries in a <strong>systematic</strong>
      way. The DBpedia Association and the people behind this challenge
      are committed to providing the necessary infrastructure and to
      driving the challenge for an indefinite time, as well as to
      potentially extending it beyond Wikipedia.</p>

    <p>We provide the extracted and cleaned full text of all Wikipedia
      articles in 9 languages at regular intervals, both for download
      and as a Docker image, in the machine-readable <a
        href="http://persistence.uni-leipzig.org/nlp2rdf/">NIF-RDF</a>
      format (example for <a
href="https://github.com/NLP2RDF/DBpediaOpenDBpediaTextExtractionChallenge/blob/master/BO.ttl">Barack
        Obama in English</a>). Challenge participants are asked to wrap
      their NLP and extraction engines in Docker images and submit them
      to us. We will run participants’ tools at regular intervals in
      order to extract:</p>

    <ol>

      <li>

        <p>Facts, relations, events, terminology and ontologies as RDF
          triples (Triple track)</p>

      </li>

      <li>

        <p>Useful NLP annotations such as POS tags, dependencies and
          coreference (Annotation track)</p>

      </li>

    </ol>
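    <p>To make the two tracks more concrete, here is a minimal Python
      sketch; it is an illustration under stated assumptions, not the
      challenge's reference implementation or required output format.
      It assumes that annotations are expressed with properties from
      the NIF 2.0 core ontology and the ITS RDF vocabulary, and that
      strings are identified by character offsets in the URI fragment.
      The function name <code>annotate_mention</code>, the document URI
      <code>http://example.org/doc1</code> and the example sentence are
      hypothetical, and triples are modelled as plain Python tuples
      rather than serialized RDF.</p>

```python
# Namespaces of the NIF 2.0 core ontology, ITS RDF and the rdf:type property.
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
ITSRDF = "http://www.w3.org/2005/11/its/rdf#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotate_mention(doc_uri: str, text: str, mention: str, entity_uri: str) -> list:
    """Return NIF-style (subject, predicate, object) tuples for the first
    occurrence of `mention` in `text`, or an empty list if it is absent."""
    begin = text.find(mention)
    if begin == -1:
        return []
    end = begin + len(mention)
    # Assumption: strings are addressed by character offsets in the fragment.
    context = f"{doc_uri}#char=0,{len(text)}"
    span = f"{doc_uri}#char={begin},{end}"
    return [
        (context, RDF_TYPE, NIF + "Context"),
        (context, NIF + "isString", text),
        (span, RDF_TYPE, NIF + "Phrase"),
        (span, NIF + "referenceContext", context),
        (span, NIF + "beginIndex", begin),
        (span, NIF + "endIndex", end),
        (span, NIF + "anchorOf", mention),
        # Link the mention to the entity it refers to (Annotation track).
        (span, ITSRDF + "taIdentRef", entity_uri),
    ]


if __name__ == "__main__":
    sentence = "Barack Obama was born in Honolulu."
    for triple in annotate_mention(
        "http://example.org/doc1",
        sentence,
        "Honolulu",
        "http://dbpedia.org/resource/Honolulu",
    ):
        print(triple)
    # Hypothetical Triple track output for the same sentence.
    print((
        "http://dbpedia.org/resource/Barack_Obama",
        "http://dbpedia.org/ontology/birthPlace",
        "http://dbpedia.org/resource/Honolulu",
    ))
```

    <p>Python string indices count Unicode code points, so the offsets
      computed here stay aligned with the text as long as the same
      string is used for the context and the mention. A real submission
      would serialize such tuples with an RDF library instead of
      printing them.</p>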

    <p>We allow submissions 2 months prior to selected conferences
      (currently <a href="http://ldk2017.org/"><u>http://ldk2017.org/</u></a>
      and <a href="http://2017.semantics.cc/"><u>http://2017.semantics.cc/</u></a>).
      Participants who fulfil the technical requirements and provide a
      sufficient description will be able to present at the conference
      and will be included in the yearly proceedings. <strong>At each
        conference, the challenge committee will select a winner among
        the challenge participants, who will receive €1000.</strong>
    </p>

    <h3 class="western">Results</h3>

    <p>Every December, we will publish a summary article and proceedings
      of participants’ submissions at <a href="http://ceur-ws.org/"><u>http://ceur-ws.org/</u></a>.
      The first proceedings are planned for publication in December
      2017. We will try to briefly summarize any intermediate progress
      online in this section.</p>

    <h3 class="western">Acknowledgements</h3>

    <p>We would like to thank the Computer Center of Leipzig University
      for giving us access to their 6TB RAM server Sirius to run all
      extraction tools.</p>

    <p>The project was created with the support of the H2020 EU projects
      <a href="https://project-hobbit.eu/">HOBBIT</a> (GA-688227) and
      <a href="http://aligned-project.eu/">ALIGNED</a> (GA-644055) as
      well as the BMWi project <a href="http://smartdataweb.de/">Smart Data
        Web</a> (GA-01MD15010B).</p>

    <h3 class="western">Challenge Committee</h3>

    <ul>

      <li>

        <p>Sebastian Hellmann, AKSW, DBpedia Association, KILT

          Competence Center, InfAI, Leipzig</p>

      </li>

      <li>

        <p>Sören Auer, Fraunhofer IAIS, University of Bonn</p>

      </li>

      <li>

        <p>Ricardo Usbeck, AKSW, Simba Competence Center, Leipzig

          University</p>

      </li>

      <li>

        <p>Dimitris Kontokostas, AKSW, DBpedia Association, KILT

          Competence Center, InfAI, Leipzig</p>

      </li>

      <li>

        <p>Sandro Coelho, AKSW, DBpedia Association, KILT Competence

          Center, InfAI, Leipzig</p>

      </li>

    </ul>

    <p>Contact Email: <a

        href="mailto:dbpedia-textext-challenge@infai.org"><u>dbpedia-textext-challenge@infai.org</u></a></p>

  </body>

</html>