Fear the reaper?

Sounds like someone is trying to ring the funeral bell for the semantic web:

Should semantic web experts fear the reaper?

Dearly beloved, we are gathered here today to mark the end of an era. I’m talking about the passing of Web 3.0 – ostensibly the era of the next great revolution in the information industry.

In its short life the semantic web we knew so little passed through the peak of inflated expectation, went round the cape of unrealistic ambition and finally found a resting place in the great junkyard of unwanted technology in the virtual cloud. At one time our information industry seemed to have the most to gain (or lose) from the threats and opportunities presented by our recently lost friend. So, what went wrong?

The era passed with the recent announcement by Google, Yahoo and Microsoft of the launch of schema.org. Schema.org provides technical documentation on the ways in which the major search engines will recognize structured data in your web pages. It shows how to get rich snippets of content and data from your site directly into search engine results pages. Rich snippets are the next step in the evolution of search, because they allow search engines to read meaningful semantics into content on the web.

For an e-commerce-centric publishing tech company, that seems true. From their point of view, getting your wares harvested by ‘Bingle’ and other commercial partners is the only thing that really matters. Microdata could do this reasonably well with minimal extra hassle.

When we explored the emerging schema.org microdata formats as part of the COMET investigation into linked data (plug), we found the two technologies to have different and in some cases complementary uses. One improves how search engine results are displayed (if not harvesting and ranking; I haven’t got a clue about that), while the other provides structured datasets for total re-purposing or enrichment. Schema.org is a better way of presenting documents for scraping. Linked Data (RDF or not) is a better way of sharing re-usable metadata. Neither totally works for academic libraries either.
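To make the distinction a little more concrete, here is a rough sketch in Python of one record expressed both ways. The record, its identifier and the helper names (`to_microdata`, `to_triples`) are invented for illustration, and this is not how COMET produced its data. It assumes the schema.org `Book` type with its `name` and `author` properties, and the Dublin Core `title` and `creator` terms.

```python
from html import escape

# Hypothetical record: the identifier and values are made up for illustration.
record = {
    "id": "http://example.org/bib/123",
    "title": "An Example Book",
    "author": "A. N. Author",
}


def to_microdata(rec):
    """Render the record as HTML annotated with schema.org microdata.

    The semantics ride along inside a page meant for display, which is
    what a search engine scrapes to build a rich snippet.
    """
    return (
        '<div itemscope itemtype="http://schema.org/Book">'
        f'<span itemprop="name">{escape(rec["title"])}</span> by '
        f'<span itemprop="author">{escape(rec["author"])}</span>'
        "</div>"
    )


def to_triples(rec):
    """Express the same record as (subject, predicate, object) statements.

    There is no page here at all, only statements about an identified
    thing, which is what makes the data easy to merge and re-purpose.
    """
    dc = "http://purl.org/dc/terms/"
    return [
        (rec["id"], dc + "title", rec["title"]),
        (rec["id"], dc + "creator", rec["author"]),
    ]


if __name__ == "__main__":
    print(to_microdata(record))
    for triple in to_triples(record):
        print(triple)
```

The first form only makes sense embedded in a document; the second can be lifted out and combined with statements from elsewhere, which is roughly the difference described above.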

What the blog does not acknowledge is that the semantic web has actually died and been reborn several times, with Linked Data being perhaps the most popular incarnation yet. Nor have search engines ignored one format outright in favour of another; they are simply promoting schema.org right now because it fits their model.

Neither linked data nor embedded HTML5 microdata can meet all use case requirements for semantic enrichment of documents and data. And because neither is the expected semantic magic bullet train we’ve all been hoping to catch for the past ten years, each will have its critics.

Both mechanisms are also equally open to abuse. Anyone remember search engine meta tags?

Back in HE and cultural heritage, where we think about the Internet in terms beyond search engines and advertising, there is still strong interest in linked data. Here in the UK, the JISC are funding resource discovery projects with linked data firmly in mind. It’s still experimental, but has such a wide potential application that it can’t be ignored.

Carl Grant from Ex Libris has a wonderfully pragmatic approach to Linked Data. We need better and new examples of its functionality (as core services, rather than tech demos). Don’t discount it, but don’t expect to make a profit yet either. Laying long term foundations takes time.


  1. Hi Ed,

    Semantic enrichment and text mining are, in my view, technologies with demonstrable practical utility. Ditto ontologies and taxonomies for tagging content to aid faceted search and discovery.

    However, I’d contend that JSON has far more practical utility as a flexible data interchange format than RDF, and that NoSQL databases (e.g. MongoDB, Cassandra, CouchDB) are demonstrably more successful technologies for fluid schema data storage than triple stores.

    The grand vision of a parallel linked-data web, allowing machines to harvest all of the rich semantics of the human readable web, has not materialised. The principal use case for this type of data mining is clearly web scale search, and Google et al. have routed round the RDF stack (after having briefly flirted with RDFa).

    Other world-changing use cases for the semantic web have not appeared. This is not due to a lack of technology – it’s just that there are no killer applications for the semantic web.

    Sometimes what looks like a long investment window simply turns out to be a dead end. This is a dead end (if it was ever alive).

  2. Cheers Richard. After a bit of toe-dipping, I’d personally be tempted to write off SPARQL and triplestores in favour of NoSQL and fast indexing tech (Sphinx, Elasticsearch etc.). I should add that JISC are also funding some serious investigations into this, but others disagree.

    JSON is a great interchange format indeed, but only that. I’m tracking the efforts of a number of folk involved in development of BibJSON, a lightweight bib exchange format based on BibTeX. It’s good stuff and refreshing compared to hefty data formats, but structure always begins to creep in, either by design or by default.

    The richness of RDF vocabularies and the ability to interchange them is incredibly useful, even if it makes the structure harder to ‘grok’ at first glance. The trouble is some folk have gone overboard. I’m particularly fond of the phrase ‘linked to death’.

    There are many out there who exist beyond ‘Google et al’ and their definition of data mining. This may result in more silos, but there we go.

  3. Is there any RDF browser still available online? I tried a couple of demos and they all appeared to be dead (including dataviewer.zitgist.com). I simply wanted to try out a couple of RDF files I found at ontologycentral, and I couldn’t figure out how to do it. Based on my search so far RDF does seem to be dead.

