[NLP2RDF] [Corpora-List] Announcement: NLP Interchange Format (NIF)

Thu Dec 8 11:01:43 CET 2011

Hello,

On 06.12.2011 19:04, John F. Sowa wrote:
>> great stuff (I did enjoy the Tim Bray piece)
>
> Thanks. Another point to consider is the comment by R.V. Guha,
> who worked with Bray in defining RDF:
>
> "Somehow, RDF never caught on... At least RDFa is here to stay."
>
> Guha now works at Google, where he is working on schema.org, which
> avoids RDF/XML. He made the above comment in the following talk:
>
> http://ontolog.cim3.net/file/resource/presentation/Schema.org--RVGuha_20111201/Schema.org_RVGuha_20111201b.mp3

RDFa is used for embedding RDF in HTML pages. Hence, it is quite obvious 
that it is a better choice for schema.org than other RDF syntaxes. There 
are, of course, other scenarios in which you just want to exchange 
information (without HTML), in which one of the other RDF serialisations 
is more appropriate.

> MB
>> Everybody can use other serializations like N3 or Turtle.
>
> I agree. But the major computational issue is not the aesthetics
> of RDF/XML, but its inefficiency for computer processing. See
> the note below.
>
> WW
>> Obviously literals can have types...
>> You can, of course, make statements about a URI, and with a little bit
>> of indirection can even talk about the URI itself rather than what it
>> denotes.
>
> Yes, but that just adds more bloat to an already overstuffed notation.
> In JSON, for example, a typed triple can be represented
>
> {Type1:A, Type2:B, Type3:C}
>
> That would require 4 RDF triples to represent the same information.

I don't see the fourth triple, but do you think the following is too 
complex?

A a Type1 .
B a Type2 .
C a Type3 .

In particular, how is that "inefficient for computer processing" as you 
write above?
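
To make the comparison concrete, here is a minimal Python sketch that turns 
the JSON object from your example into the three type statements above. The 
helper name typed_triple_to_turtle is made up for illustration, and the 
output uses the same schematic names (A, Type1, ...) as this mail rather 
than full IRIs, so it is not meant as valid Turtle.

```python
import json

# Hypothetical helper (not part of any library): map each
# "TypeN": "X" pair of the JSON object to one "X a TypeN ." statement,
# where "a" is the usual shorthand for rdf:type.
def typed_triple_to_turtle(obj):
    """Return one schematic type statement per key/value pair, in input order."""
    return [f"{value} a {type_name} ." for type_name, value in obj.items()]

# Keys and values have to be quoted strings for this to be valid JSON.
record = json.loads('{"Type1": "A", "Type2": "B", "Type3": "C"}')
statements = typed_triple_to_turtle(record)
print("\n".join(statements))
```

Under these assumptions the mapping is one statement per key, i.e. three 
statements for three types.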

Note that there is also a JSON task force at W3C: 
http://www.w3.org/2011/rdf-wg/wiki/TF-JSON (including some examples of 
RDF/JSON)
This might be what you are looking for.

> JFS
>>> They [Facebook] do not use RDF. They use RDFa, which is a notation
>>> for tagging HTML (or XML) documents. But RDFa has nothing in
>>> common with RDF/XML other than the three letters R, D, and F.
>
> WW
>> Here the confusion behind your arguments is quite clear. RDF/XML and
>> RDFa are just two ways of writing down exactly the same thing.
>
> I agree that the semantics is critical, and RDF/XML defines the
> semantics for the RDFa tags. See the note below.

RDF/XML does not define RDF semantics. RDFa and RDF/XML are two 
serialisation formats for RDF.
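
As a rough illustration of "one model, several syntaxes", the following 
Python sketch reads the same statement from two serialisations and checks 
that both yield the same triple. Note that I use N-Triples instead of RDFa 
here, simply because it can be parsed in a few lines; the example.org names 
are made up, and both parsers handle only the tiny subset needed for this 
example (a real application would use an RDF library).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The same single statement written in two syntaxes.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.org/A">
    <rdf:type rdf:resource="http://example.org/Type1"/>
  </rdf:Description>
</rdf:RDF>"""

N_TRIPLES = ("<http://example.org/A> "
             "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
             "<http://example.org/Type1> .")

def triples_from_rdfxml(text):
    """Handle only rdf:Description elements whose properties use rdf:resource."""
    triples = set()
    root = ET.fromstring(text)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; joining both
            # parts gives the property IRI.
            predicate = prop.tag[1:].replace("}", "", 1)
            triples.add((subject, predicate, prop.get(f"{{{RDF_NS}}}resource")))
    return triples

def triples_from_ntriples(text):
    """Handle only lines made of three IRIs (no literals, no blank nodes)."""
    triples = set()
    for line in text.splitlines():
        line = line.strip()
        if line:
            s, p, o = line.rstrip(" .").split()
            triples.add(tuple(term.strip("<>") for term in (s, p, o)))
    return triples

same = triples_from_rdfxml(RDF_XML) == triples_from_ntriples(N_TRIPLES)
print(same)
```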

> WW
>> Personally I don't think RDFa is a terribly good idea since it
>> mixes up the data with the presentation but there are some use cases
>> for which it makes sense
>
> Any metalanguage about web documents must have some way of linking
> the language to the documents. Webmasters prefer RDFa because it is
> a concise extension to what they've been doing for years. That is
> also why the new schema.org is growing much faster than RDF, as
> Guha said in his talk.
>
> At the end of the following note, I recommend an alternative to
> RDF/XML for the Semantic Web. Whether or not the SemWebbers adopt it,
> I would suggest something along these lines for NIF.

Note that RDF in JSON would just be another serialisation, and there have 
been many efforts towards it (the task force I linked above is perhaps the 
most "official" one). Please also look at the following post explaining why 
JSON is not always useful (although it can be in many scenarios): 
http://www.ldodds.com/blog/2010/12/rdf-and-json-a-clash-of-model-and-syntax/

> For Watson, they used a large volume of web resources, including some
> that may have been developed with RDF.  But to say that IBM actually
> used RDF in any essential way would be misleading.

I talked to Chris Welty (http://en.wikipedia.org/wiki/Chris_Welty) about 
the use of DBpedia (http://dbpedia.org) in Watson. They do make use of 
it. The strength of Watson is to use many algorithms and many web 
resources. DBpedia is used for 6-8% of the questions generated by the 
system in its latest version if I remember correctly. Text corpora are 
more important for Watson than structured data, so it is misleading to 
state "IBM did not adopt [RDF] for Watson" as in your initial mail.

> They do not use RDF.  They use RDFa, which is a notation for tagging
> HTML (or XML) documents.  But RDFa has nothing in common with RDF/XML
> other than the three letters R, D, and F.  Facebook, like nearly
> everybody who uses RDFa tags, translates the data from those tags
> to a more efficient notation than RDF -- JSON, for example.

RDF is the underlying formalism and RDFa a syntax for representing it.

> GoodRelations is an ontology that happens to be expressed in OWL.
> But if you look at the actual OWL statements, you'll notice that
> they don't use any features of OWL that could not be expressed
> in Aristotle's original syllogisms.  In fact, the overwhelming
> majority of sites that claim to use OWL don't go beyond Aristotle.

That may be right (I did not check it), but your original argument was 
that "Google never adopted it [RDF]", which is not correct in the case of 
Google Shopping. Whether or not the OWL ontologies should be more 
expressive is a different matter.

> Furthermore, Google is one of the founding members of schema.org,
> which has developed their own vocabulary and methods of processing.
> See their hierarchy of terms:  http://schema.org/docs/full.html
>
> Look at the way they use those terms:  http://schema.org/docs/gs.html
> You won't see any RDF or OWL there.

While Microdata is their preferred choice, it is not hard to map it to 
RDF. They maintain an official OWL ontology 
(http://schema.org/docs/schemaorg.owl) for schema.org.
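
To give an idea of what such a mapping could look like, here is a heavily 
simplified Python sketch. It is my own illustration and not the official 
mapping: it assumes a single flat item, text-valued properties only, a 
blank node label "_:item" chosen by me, and property IRIs formed by 
appending the itemprop name to the vocabulary part of the itemtype. The 
class name MicrodataToTriples and the sample markup are likewise made up.

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

class MicrodataToTriples(HTMLParser):
    """Very reduced mapping: one flat item, text-valued properties only."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.vocab = None          # e.g. "http://schema.org/"
        self.pending_prop = None   # itemprop whose text content is awaited

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs and "itemtype" in attrs:
            item_type = attrs["itemtype"]
            # Assumption: everything up to the last "/" is the vocabulary.
            self.vocab = item_type.rsplit("/", 1)[0] + "/"
            self.triples.append(("_:item", RDF_TYPE, item_type))
        elif "itemprop" in attrs and self.vocab:
            self.pending_prop = attrs["itemprop"]

    def handle_data(self, data):
        # The first non-blank text after an itemprop tag becomes its value.
        if self.pending_prop and data.strip():
            self.triples.append(
                ("_:item", self.vocab + self.pending_prop, data.strip()))
            self.pending_prop = None

HTML = """<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Doe</span>
</div>"""

parser = MicrodataToTriples()
parser.feed(HTML)
triples = parser.triples
for triple in triples:
    print(triple)
```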

Kind regards,

Jens

-- 
Dr. Jens Lehmann
Head of AKSW/MOLE group, University of Leipzig
Homepage: http://www.jens-lehmann.org
GPG Key: http://jens-lehmann.org/jens_lehmann.asc