[NLP2RDF] Salient words extraction (e.g. ontology, Open Source projects)

Jean-Marc Vanel jeanmarc.vanel at gmail.com
Sun Jan 13 13:03:09 CET 2013


In fact it is not hard to understand an ontology, but it is hard to
know which ontology to use.

There is no "directory" of ontologies. It's like an ice cream menu:
there are many to choose from. There are search engines, but they are
traditional keyword-based ones rather than conceptual ones, such as
Swoogle and Falcons [1]. The field is so open that it's hard even for
knowledge experts to choose good ontologies.

To remedy this, what I have planned is to create tools to help
authors, users or developers to annotate ontologies with concepts from
DBpedia or WordNet, using NLP analyzers.

The tool would extract salient words from English text, typically an
rdfs:comment, and output the 5 to 10 most relevant words. These words
are then disambiguated if necessary, for example using a Wikipedia Web
Service (the one you use when typing in the Wikipedia search field).
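
As a rough sketch of the extraction step (not an existing tool), the
snippet below ranks words by plain frequency after removing stop words,
which is only a weak stand-in for a real NLP analyzer. The stop-word
list, the function names and the sample comment are all made up for
illustration. The lookup URL assumes that the MediaWiki "opensearch"
action is the service behind the Wikipedia search field; the code only
builds the URL and does not call it.

```python
import re
from collections import Counter
from urllib.parse import urlencode

# Tiny illustrative stop-word list (an assumption; a real tool would use a fuller one).
STOP_WORDS = frozenset("""
a an and are as at be by can for from in is it its of on or that the this to
which with such used describes describing
""".split())


def salient_words(text, max_words=10):
    """Return up to max_words candidate salient words, most frequent first."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if len(t) > 2 and t not in STOP_WORDS)
    # Sort by descending frequency, then alphabetically, so results are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:max_words]]


def wikipedia_lookup_url(word, limit=5):
    """Build a MediaWiki opensearch URL that lists candidate article titles."""
    query = urlencode({"action": "opensearch", "search": word,
                       "limit": limit, "format": "json"})
    return "https://en.wikipedia.org/w/api.php?" + query


# Hypothetical rdfs:comment, written for this example.
comment = ("The Music Ontology provides a vocabulary for describing music: "
           "artists, albums, tracks, and the performance and recording of music.")

print(salient_words(comment, max_words=5))
print(wikipedia_lookup_url("music"))
```

A human (or a further heuristic) would still have to pick the right
article among the candidates returned by the lookup.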

Salient words (here, music) will be put in triples such as:

<myOntology> skos:subject DBpedia:Music.

which can then be placed in the ontology itself (the best option), or
added to online Turtle or RDF documents, SPARQL databases, and/or
collaborative sites such as prefix.cc [2].
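
Generating such triples could look like the sketch below. It assumes
the disambiguated words are Wikipedia article titles and that the
matching DBpedia resource IRI is http://dbpedia.org/resource/ followed
by the title with spaces replaced by underscores. The function name is
hypothetical and the ontology IRI is just an example. The predicate is
a parameter because, as far as I know, skos:subject did not make it
into the final SKOS Recommendation; dcterms:subject would be a possible
substitute.

```python
from urllib.parse import quote

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def subject_triples(ontology_iri, titles, predicate="skos:subject"):
    """Render one Turtle triple per disambiguated article title."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
    ]
    for title in titles:
        # Full IRIs avoid Turtle escaping issues with titles such as "Rock (music)".
        name = quote(title.strip().replace(" ", "_"), safe="(),'")
        lines.append(f"<{ontology_iri}> {predicate} <{DBPEDIA_RESOURCE}{name}> .")
    return "\n".join(lines)


# Example ontology IRI, chosen only for illustration.
print(subject_triples("http://purl.org/ontology/mo/", ["Music", "Sound recording"]))
```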

Thus a human or an agent program could find a software component more
accurately. The discovery problem for ontologies is similar to the one
for Open Source programs and many other types of resources.

Ideally, the software component for the NLP extraction would be in Java
and Open Source, which would make it easier to add to the EulerGUI
environment [3]. I feel that nlp2rdf could help. It already has a web
service for parsing. What is missing is processing the RDF syntax tree
to find the salient words, or directly using an NLP tool.

[1] Finding ontologies on the Web:

http://eulergui.svn.sourceforge.net/viewvc/eulergui/trunk/eulergui/html/documentation.html#Finding2

[2] collaborative website for ontologies and their prefixes: http://prefix.cc

[3] EulerGUI, GUI environment and framework for Semantic Web and rules:

http://eulergui.svn.sourceforge.net/viewvc/eulergui/trunk/eulergui/html/documentation.html



-- 
Jean-Marc Vanel
Déductions SARL - Consulting, services, training,
Rule-based programming, Semantic Web
http://deductions-software.com/
+33 (0)6 89 16 29 52
Twitter: @jmvanel ; chat: irc://irc.freenode.net#eulergui

