How can we sell semantic technologies?

By Paulo Villegas, September 28, 2009 12:34 pm

There is a permanent discussion on the current status and prospects of semantic processing (see a previous post by Javier): whether it is now ready for the masses, still in the lab for the foreseeable future, or being phased out because it did not fulfill its initial promise (i.e., it got stuck in the Trough of Disillusionment).

My current stance is that the grand unified vision of the overall Semantic Web is likely to remain a vision for a significant amount of time (Linked Data progress notwithstanding). At the same time, I also expect semantic processing to be used within information systems almost as a commodity. However, it will take the form of “semantic islands”, capable of processing rich information in increasingly higher volumes, but interacting with their surroundings in more traditional ways. It’s the usual way of getting a foothold from which to grow, while not disrupting the environment too much (i.e. an “incremental revolution”). It will be part of the fabric, and as such it will compete with alternative solutions such as the usual suspects (traditional RDBMSs) or the new kid in town (key-value stores).

Now, the issue is: how do you sell one of these “islands”? In other words, how can you convince a client (or a management layer) that they should invest in a semantic processing module (a new thing, less proven, which sounds advanced but maybe too advanced to be real) for their concrete data processing, mining or similar needs?

First let’s clarify what I’m referring to when I talk about a “semantic solution”. I mean the “hardcore concept” of ontology-based processing, querying and inferencing with a Knowledge Base, generally throwing in all the standard soup of RDF, OWL, SPARQL, triple stores, you name it. Which means I’m excluding two particular semantic-like systems:

  • Processing of web-based structured information such as RDFa or microformats, which can be gathered with a framework like the ones provided by Yahoo!’s Search Monkey and Google’s Rich Snippets. Though the data can be considered semantic, it is usually processed in “simple” ways (e.g. for structured presentation purposes). In summary, it is a way to consume semantic data, not to process it (note that applications using Linked Data are not in this group; that is a different beast).

  • Extraction of semantic data from text or text-like sources, in other words, from mostly unstructured data, probably by using technology related to NLP (Natural Language Processing) and/or statistical pattern recognition (machine learning). This is exemplified by applications such as Open Calais (which works mostly as a Named Entity Recognition service). In this case those applications are producing semantic data, but not necessarily processing it.

In other words, to qualify as a “semantic island” an application needs to work mostly with semantic data internally, but not necessarily to produce or consume it (though obviously at some point something will be done with the result). So here’s the catch: most, if not all, of what can be done within that semantic island can also be done through other “traditional” means: once you have the structured semantic data, store it in an RDBMS, use standard ways of querying it and develop application code that hooks into the database to perform the analytics needed. It works. Inferencing can also be reduced to plain procedural programming (it’s a matter of re-scheduling processes and throwing in lines of code), and all in all, the resulting run-of-the-mill module would probably consume fewer resources and work faster than a pure semantic module anyway (given the current maturity of triple stores).
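To make the “traditional means” concrete, here is a minimal sketch of subclass inferencing done with nothing but a relational table and a recursive SQL query. It is only an illustration under stated assumptions: the `triples` table layout, the `type` and `subClassOf` predicate names, and the sample data are all made up for the example, and SQLite stands in for whatever RDBMS is already in place.

```python
import sqlite3


def build_db():
    """Create an in-memory database holding a tiny, hypothetical triple table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany(
        "INSERT INTO triples VALUES (?, ?, ?)",
        [
            ("Dog", "subClassOf", "Mammal"),
            ("Mammal", "subClassOf", "Animal"),
            ("rex", "type", "Dog"),
        ],
    )
    return conn


def inferred_types(conn, individual):
    """Return the asserted types of an individual plus every superclass
    reachable through subClassOf, computed with a recursive CTE."""
    rows = conn.execute(
        """
        WITH RECURSIVE types(cls) AS (
            SELECT o FROM triples WHERE s = ? AND p = 'type'
            UNION
            SELECT t.o FROM triples t JOIN types ON t.s = types.cls
            WHERE t.p = 'subClassOf'
        )
        SELECT cls FROM types
        """,
        (individual,),
    ).fetchall()
    return {row[0] for row in rows}


conn = build_db()
print(sorted(inferred_types(conn, "rex")))  # ['Animal', 'Dog', 'Mammal']
```

Note that the inference rule (types propagate up the subclass hierarchy) is hard-wired into the query text: it works, but any change to the reasoning means changing application code.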

So, what are then the advantages of the semantic way? Just think of having two demos of a given processing “island”, one semantic and the other non-semantic, running side-by-side. Taken as black boxes, would anyone notice the difference?

I can only think of two arguments (which are both variants on the same idea):

  1. Versatility: semantic apps are more malleable, and can be better tailored to the problem at hand. As I said above, any of the approaches will work, but maybe the semantic approach can be made more precise, thus potentially offering better results.

  2. Upgrade path: a big selling point is that semantics separates more clearly the infrastructure (KB, rule engine, etc.) from the data and the logic associated with it (ontology, rules). Evolving a semantic system can be done by modifying the semantic structures, updating rules, etc., a task that (if the system has been properly designed) is usually more fluid than modifying an RDBMS schema and hacking code around it.
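The upgrade-path argument can be sketched in a few lines: the “engine” below is generic, and the domain logic lives entirely in a list of rules treated as data. This is a toy under stated assumptions, not how a real reasoner works: the rule format (a chain of two predicates implying a third), the predicate names and the facts are all hypothetical, and a production system would use OWL or a rule language with a proper reasoner.

```python
def forward_chain(facts, rules):
    """Naive forward chaining to a fixpoint.

    Each rule is a tuple (body1, body2, head) meaning:
    if (x body1 y) and (y body2 z) then (x head z).
    """
    facts = set(facts)
    while True:
        derived = {
            (x, head, z)
            for (body1, body2, head) in rules
            for (x, p, y) in facts if p == body1
            for (y2, q, z) in facts if q == body2 and y2 == y
        }
        if derived <= facts:  # nothing new: fixpoint reached
            return facts
        facts |= derived


facts = {
    ("alice", "parentOf", "bob"),
    ("bob", "parentOf", "carol"),
    ("carol", "parentOf", "dave"),
}

# Version 1 of the "ontology": a single rule.
rules_v1 = [("parentOf", "parentOf", "grandparentOf")]

# Version 2: the system evolves by adding a rule; the engine is untouched.
rules_v2 = rules_v1 + [("grandparentOf", "parentOf", "greatGrandparentOf")]

kb_v1 = forward_chain(facts, rules_v1)
kb_v2 = forward_chain(facts, rules_v2)
print(("alice", "greatGrandparentOf", "dave") in kb_v1)  # False
print(("alice", "greatGrandparentOf", "dave") in kb_v2)  # True
```

Going from version 1 to version 2 required no schema migration and no change to `forward_chain`; whether that benefit survives in a real deployment depends on how well the ontology and rules were designed in the first place.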

The point is: is that convincing enough? Especially when you factor in that, for the semantic option, you need to involve more specialized (and harder to find) development staff. Moreover, those features are much harder to show in a demo.

4 Responses to “How can we sell semantic technologies?”

  1. Maria says:

    Now, the important thing is the development of semantic technology. Then, the benefits of technology will make it salable.

    However, I think that for any enterprise, its information system will be better and finding clients will be easier.

  2. I absolutely agree with both arguments, but they are not convincing enough, as BI technologies are more mature than semantic ones, and most clients have invested a lot of money in data mining software in recent years. BI vendors have lots of client references, while there are just a few for semantic applications.

  3. Maria says:

    I think an option is not incompatible with the other. In fact, a good idea would be a mixture of the two: BI and semantics.

  4. I was actually implying the mixture when I spoke about “islands”: a semantic module interacting with the pre-existing infrastructure, since I agree that it’s utopian to try to clash head-on with the existing mature and proven base of non-semantic systems.
    The idea, then, was to “sell” one semantic module that can coexist happily with the existing base, but can also provide enhanced aspects. I was then looking for those “enhancements” as selling points. The problem is that semantic technology is mostly obscure and prone to technical jargon; it’s difficult to turn that into layman’s arguments.
