[NLP2RDF] [Corpora-List] Announcement: NLP Interchange Format (NIF)

John F. Sowa sowa at bestweb.net
Tue Dec 6 19:04:15 CET 2011


Dear Adam, William, and Michael,

I sent a note to Ontolog Forum (copy below), which addresses
many of the points raised in this thread.

AK
> great stuff (I did enjoy the Tim Bray piece)

Thanks.  Another point to consider is the comment by R.V. Guha,
who worked with Bray in defining RDF:

    "Somehow, RDF never caught on... At least RDFa is here to stay."

Guha is now at Google, where he works on schema.org, which
avoids RDF/XML.  He made the above comment in the following talk:

http://ontolog.cim3.net/file/resource/presentation/Schema.org--RVGuha_20111201/Schema.org_RVGuha_20111201b.mp3

WW
> I don't think you will find many who disagree that RDF/XML is not very
> pretty and not very convenient to use or process. But that's just the
> surface encoding. Really it's just an obtuse way to represent 3-tuples
> and there are other more convenient ways.

MB
> Everybody can use other serializations like N3 or Turtle.

I agree.  But the major computational issue is not the aesthetics
of RDF/XML, but its inefficiency for computer processing.  See
the note below.

WW
> Obviously literals can have types...
> You can, of course, make statements about a URI, and with a little bit
> of indirection can even talk about the URI itself rather than what it
> denotes.

Yes, but that just adds more bloat to an already overstuffed notation.
In JSON, for example, a typed triple can be represented as

    {Type1:A, Type2:B, Type3:C}

That would require 4 RDF triples to represent the same information.
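
As a rough sketch of that count, and only under one reading of the
example: assume the three values are the subject, predicate, and
object, and that each type is stated with a separate rdf:type triple.
Strict JSON also requires quoted strings, so the keys and values are
quoted here.  The function name typed_triple_to_rdf is hypothetical.

```python
import json

RDF_TYPE = "rdf:type"

def typed_triple_to_rdf(obj):
    """Expand a typed triple {type: value, ...} into plain RDF-style triples.

    Assumes exactly three entries, in subject/predicate/object order
    (Python dicts and json.loads preserve insertion order).
    """
    (s_type, s), (p_type, p), (o_type, o) = obj.items()
    return [
        (s, p, o),              # the statement itself
        (s, RDF_TYPE, s_type),  # one extra triple per type annotation
        (p, RDF_TYPE, p_type),
        (o, RDF_TYPE, o_type),
    ]

typed = json.loads('{"Type1": "A", "Type2": "B", "Type3": "C"}')
triples = typed_triple_to_rdf(typed)
for triple in triples:
    print(triple)
```

One JSON object becomes one statement triple plus three type triples.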

JFS
>> They [Facebook] do not use RDF.  They use RDFa, which is a notation
>> for tagging HTML (or XML) documents.  But RDFa has nothing in
>> common with RDF/XML other than the three letters R, D, and F.

WW
> Here the confusion behind your arguments is quite clear. RDF/XML and
> RDFa are just two ways of writing down exactly the same thing.

I agree that the semantics is critical, and RDF/XML defines the
semantics for the RDFa tags.  See the note below.

WW
> Personally I don't think RDFa is a terribly good idea since it
> mixes up the data with the presentation but there are some use cases
> for which it makes sense

Any metalanguage about web documents must have some way of linking
the language to the documents.  Webmasters prefer RDFa because it is
a concise extension to what they've been doing for years.  That is
also why the new schema.org is growing much faster than RDF, as
Guha said in his talk.

At the end of the following note, I recommend an alternative to
RDF/XML for the Semantic Web. Whether or not the SemWebbers adopt it,
I would suggest something along these lines for NIF.

John

-------- Original Message --------
Subject: [ontolog-forum] If you can't beat 'em, join 'em.
Date: Tue, 06 Dec 2011 12:11:11 -0500
From: John F. Sowa <sowa at bestweb.net>
To: [ontolog-forum] <ontolog-forum at ontolog.cim3.net>

Schema.org can be viewed as a threat or an opportunity for the
Semantic Web.  It was founded by a collaboration of Google,
Microsoft (Bing), and Yahoo! as an alternative to RDF or RDFa
for tagging web pages.  See http://schema.org/docs/faq.html

With that backing and with the simplicity of the schema.org
notation, the adoption rate of schema.org has been faster
than RDFa and much, much faster than RDF/XML.  Some people
have considered that a threat to the Semantic Web.

But a new web site provides a mapping of the full schema.org
type hierarchy to JSON and four notations for RDF:  XML,
N3, Turtle, and NTriples.  See http://schema.rdfs.org/

Of those notations, JSON is the most human-readable and
the most computationally efficient.  JSON is the native data
format of JavaScript, and mappings have been defined to all
the major programming languages.  See http://www.json.org/

The original RDF/XML was a disaster for humans and for computers.
It is horribly inefficient for computation, and the native XML
tools that process it are too slow for critical applications.
For that reason, its adoption rate has been glacially slow.

The rapid adoption rate of schema.org and the JSON notation
should be a wake-up call for the Semantic Web.  R. V. Guha,
the original designer of RDF, said that he "wished" he could
have used LISP notation for RDF.  The JSON notation is
essentially LISP with brackets and curly braces.

The schema.rdfs.org web site is useful for showing how the
Semantic Web tools can interoperate with schema.org.  But
anybody who compares JSON to the RDF notations will have
no incentive to adopt any version of RDF.

For these reasons, Schema.org and the JSON notation are the
wave of the future.  The W3C cannot compete with Google,
Microsoft, Yahoo!, and other companies that are joining the
consortium.  (One example is the Russian search company
Yandex, which is now translating the vocabulary.)

To avoid sinking into irrelevance, the Semantic Web must do
more than specify a way to migrate from XML notation to JSON.
Even declaring JSON to be an alternative is not sufficient.
A modest proposal:

   1. Phase out RDF/XML as the official base for RDF.  There is
      no need to say that it's "deprecated". A softer term would
      be IBM's euphemism "functionally stabilized".

   2. Adopt JSON notation as the official base, but define a formal
      semantics for JSON.  Pat Hayes collaborated with Guha to define
      the logic base (LBase) for RDF.  Pat also worked on the ISO
      project for Common Logic (CL) and defined the CL model theory
      as an upward compatible extension to LBase.  Define the JSON
      semantics by a mapping to CLIF (Common Logic Interchange Format).
      CLIF uses a LISP-like notation that has an almost one-to-one
      mapping from JSON.

   3. Use the CL semantics to define other useful logic languages
      as extensions to JSON.  One example would be a version of OWL
      that uses JSON.  Another would be a rule language that uses
      a Horn-clause subset of CL with a syntax based on JavaScript.

   4. The option of writing N-tuples in JSON can support a direct
      mapping to and from the tables of a relational database.
      The rule language could include a version of Datalog to state
      SQL queries, constraints, and updates.  The types defined by
      schema.org would be a valuable enhancement to SQL.
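
The mapping in item 4 can be sketched as follows.  This is a minimal
illustration, not a proposed standard: the JSON layout (a "columns"
list plus "rows" as arrays) and the employee data are assumptions
made up for the example, and SQLite stands in for any relational
database.

```python
import json
import sqlite3

# Hypothetical layout: each N-tuple is a JSON array, one per table row.
doc = json.loads("""
{
  "table": "employee",
  "columns": ["name", "dept", "salary"],
  "rows": [["Ann", "R&D", 100], ["Bob", "Sales", 80]]
}
""")

conn = sqlite3.connect(":memory:")
table = doc["table"]
cols = ", ".join(doc["columns"])
marks = ", ".join("?" for _ in doc["columns"])

# Identifiers are interpolated directly; acceptable only because this
# sketch controls them.  Row values go through ? placeholders.
conn.execute(f"CREATE TABLE {table} ({cols})")
conn.executemany(f"INSERT INTO {table} VALUES ({marks})", doc["rows"])

# Map the table back to JSON N-tuples.
query = f"SELECT {cols} FROM {table} ORDER BY rowid"
rows_back = [list(row) for row in conn.execute(query)]
print(json.dumps(rows_back))
```

Each JSON array corresponds to exactly one row, so the round trip
returns the original tuples unchanged.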

Common Logic is very expressive, and it is not necessary for the
Semantic Web tools to implement theorem provers for the full
ISO 24707 standard.  However, it would be possible to extend
the JSON-based notation to support the full CL semantics.
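
To give a feel for how close JSON is to CLIF's LISP-like notation,
here is a minimal sketch.  It assumes a hypothetical encoding in
which every CLIF expression is written as a nested JSON array of
names; no such encoding is defined by ISO 24707 or by JSON itself,
and quoting of special characters is ignored.

```python
import json

def to_sexpr(node):
    """Render a nested JSON array as a parenthesized, LISP-like expression."""
    if isinstance(node, list):
        return "(" + " ".join(to_sexpr(child) for child in node) + ")"
    return str(node)

# "Every cat is an animal", written as nested JSON arrays (assumed encoding).
text = '["forall", ["x"], ["if", ["Cat", "x"], ["Animal", "x"]]]'
clif = to_sexpr(json.loads(text))
print(clif)  # (forall (x) (if (Cat x) (Animal x)))
```

Under this encoding the translation only swaps brackets and commas
for parentheses and spaces.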

In fact, the W3C could work with ISO to include a JSON-based
dialect in the next update to the 24707 standard.  A collaboration
of ISO, W3C, and the major web companies could establish the
Semantic Web as a solid foundation for mainstream applications.

John

