Obviously, one of the great technical innovations of our time is the World Wide Web, invented by Tim Berners-Lee. Here, he speaks about the launch of the next generation of his creation: the Semantic Web.
What is the Semantic Web? Perhaps the best way to understand the concept is to contrast it with the current Web, which is set up to help you find documents that may (or may not) have the information you want. The Semantic Web, on the other hand, would catalogue important data to indicate the type of information that data represents (places, things, people), thus enabling a new dimension of archiving and search. The Semantic Web can therefore be thought of as a "smarter," more useful resource.
The World Wide Web Consortium (W3C), founded and led by Berners-Lee, has taken the lead in developing the necessary standards to make the next generation of the Web possible. W3C has been instrumental in the development of the core infrastructure of the current Web, which has grown significantly from the original Berners-Lee specifications for URIs [Uniform Resource Identifiers], HTTP and HTML.
Recently, the W3C completed work on the standards that will enable the Semantic Web to come into existence: the Resource Description Framework (RDF) and the Web Ontology Language (OWL). Links to these and other Semantic Web Recommendations (as W3C refers to its standards) and materials can be found here: https://www.w3.org/2001/sw/
Updegrove: Before we get into specifics, what is it like bringing your vision to the world for the second time, now that the Semantic Web is beginning to take hold?
Berners-Lee: Our work in promoting rather than developing the Semantic Web technologies has been like "déjà vu all over again" for me. Fifteen years ago, one of the hardest things to do was not to develop the initial version of HTTP, or to create a browser that was also an editor, or even to get approval for the purchase of the equipment (!). The difficult thing was to convince people that the Web was something they should adopt.
At CERN (the European Laboratory for Particle Physics, in Geneva), the killer app that got us through the technical barriers (operating system, hardware, philosophy) was making the phone book available through the Web. In the outside world, beyond lab settings, what helped the Web break through were two simultaneous developments: that CERN was making the code available to anyone who would like it free of charge or other encumbrance, and that young developers were coming up with browser software, including multiple implementations that supported inline images.
And so with the potential licensing barriers down and the relative ease of setting up a server, things took off. But imagine, if you can, online information systems before the Web, and what it was like to try to explain the whole idea of the Web to people.
Envisioning life in the Semantic Web is a similar proposition. Some people have said, "Why do I need the Semantic Web? I have Google!"
Google is great for helping people find things, yes! But finding things more easily is not the same thing as using the Semantic Web. The Semantic Web is about creating things from data you've compiled yourself, or combining that data with volumes (think databases, not so much individual documents) of data from other sources to make new discoveries. It's about the ability to use and reuse vast volumes of data.
Yes, Google can claim to index billions of pages, but given the format of those diverse pages, there may not be a whole lot more the search engine tool can reliably do. We're looking at applications that enable transformations, by being able to take large amounts of data and being able to run models on the fly—whether these are financial models for oil futures, discovering the synergies between biology and chemistry researchers in the Life Sciences, or getting the best price and service on a new pair of hiking boots.
Updegrove: As you look at the Semantic Web project now, some eight years after its inception, are you encouraged or discouraged? Does it look to you today as if you will be able to accomplish less, as much, or more than you had originally envisioned?
Berners-Lee: The Semantic Web has a whole lot more to it than the original Web. Building something which will be a firm logical foundation for interoperating business systems and query systems and so on takes more work and has to be a lot more well defined than a simple jotting down of some HTML tags! However, we have the entire URI and HTTP infrastructure to build on, of course.
One can always wish things were further along, but in fact I think the progress has been great. We were asked to hold up the query and rules work because people didn't want to start on it until the ontology work had finished, so for some we were in danger of going too fast. Now we have a good solid layer of RDF and OWL, which allows systems to be described, and data to be exchanged. OWL turned out to be more powerful than I had expected, and that is great. The query language I think will be a major step, as it will allow major databases to be exposed without one having to transfer the whole file. It will also provide a way of integrating across SQL and XQuery systems.
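As an illustration of the kind of query this answer has in mind (not part of the interview), the sketch below models RDF data as subject-predicate-object triples and answers a question by matching patterns against them, so a client only receives the rows it asked for rather than the whole dataset. It is a minimal in-memory stand-in, not the W3C query language itself; the `http://example.org/` URIs, the sample data, and the function names `match` and `people_and_workplace_locations` are all assumptions made for the example.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace and data, used only for illustration.
EX = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

TRIPLES: List[Triple] = [
    (EX + "alice", RDF_TYPE, EX + "Person"),
    (EX + "alice", EX + "worksAt", EX + "cern"),
    (EX + "cern", RDF_TYPE, EX + "Laboratory"),
    (EX + "cern", EX + "locatedIn", EX + "geneva"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


def people_and_workplace_locations(triples: List[Triple]) -> List[Tuple[str, str]]:
    """Join three patterns: ?x is a Person, ?x worksAt ?org, ?org locatedIn ?place."""
    results = []
    for person, _, _ in match(triples, p=RDF_TYPE, o=EX + "Person"):
        for _, _, org in match(triples, s=person, p=EX + "worksAt"):
            for _, _, place in match(triples, s=org, p=EX + "locatedIn"):
                results.append((person, place))
    return results


print(people_and_workplace_locations(TRIPLES))
```

A real deployment would expose such pattern matching through a standard query interface over a database; the nested loops here only show the shape of the join.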
I'm disappointed that we haven't seen RDF used as an export format on random applications such as desktop and enterprise applications. This may be because the RDF/XML syntax is a little off-putting. It is an irony that the RDF model itself is simpler than that of XML, but that isn't evident when you encode it in the standard syntax. The informal N3 syntax provides an on-ramp for export and import that is easier to learn and more human-friendly, and it may be that standardizing that would be a useful step. On the other hand, there is an ever-growing set of adapters from various formats to RDF.
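To make the point about syntax concrete (this illustration is not part of the interview), the sketch below writes a handful of triples in the simple statement-per-line style that N3 uses: `subject predicate object .`, with `@prefix` declarations to shorten URIs. It is a rough sketch under stated assumptions, covering only URIs and plain string literals, not the full N3 grammar; the `Literal` class, the `format_term` and `to_n3` helpers, and the `http://example.org/people/` namespace are hypothetical names invented for the example.

```python
import re
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple, Union


@dataclass(frozen=True)
class Literal:
    """A plain string value, as opposed to a URI (hypothetical helper type)."""
    value: str


Term = Union[str, Literal]
_LOCAL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def format_term(term: Term, prefixes: Dict[str, str]) -> str:
    """Render a literal in quotes, a URI as prefix:local when possible, else <uri>."""
    if isinstance(term, Literal):
        escaped = (
            term.value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        return f'"{escaped}"'
    for name, namespace in prefixes.items():
        local = term[len(namespace):]
        if term.startswith(namespace) and _LOCAL_NAME.fullmatch(local):
            return f"{name}:{local}"
    return f"<{term}>"


def to_n3(triples: Iterable[Tuple[Term, Term, Term]], prefixes: Dict[str, str]) -> str:
    """Emit prefix declarations followed by one statement per triple."""
    lines = [f"@prefix {name}: <{ns}> ." for name, ns in prefixes.items()]
    lines.append("")
    for s, p, o in triples:
        lines.append(" ".join(format_term(t, prefixes) for t in (s, p, o)) + " .")
    return "\n".join(lines)


PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "ex": "http://example.org/people/",  # illustrative namespace
}
TRIPLES = [
    ("http://example.org/people/alice", "http://xmlns.com/foaf/0.1/name", Literal("Alice")),
    ("http://example.org/people/alice", "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/people/bob"),
]

print(to_n3(TRIPLES, PREFIXES))
```

Each output line reads as a short sentence about one resource, which is roughly the readability advantage over the nested RDF/XML encoding that the answer describes.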
I am very happy about the reception that the Semantic Web has had in specific areas where people "get it." The FOAF [Friend of a Friend] project, for example, has a great spirit, and is a quite decentralized web of information about people's business cards, CVs, and who knows who. The whole area of life sciences and healthcare has been hopping with excitement as work is done to take down the boundaries between different silos of information across the field. We had a very vibrant workshop in the area, and Semantic Web was the talk of the recent BIO-IT conference.
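The decentralized character of FOAF can be sketched in a few lines (an illustration added here, not part of the interview). Two sites each publish a few statements about people; because both use the same URIs for the same individuals, the data can be merged by simple set union and then traversed to find who knows whom. The `foaf:knows` property comes from the FOAF vocabulary; the people, the `http://example.org/people/` URIs, the two "sites", and the function names are assumptions made for the example.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"
PEOPLE = "http://example.org/people/"  # illustrative namespace

ALICE, BOB, CAROL = PEOPLE + "alice", PEOPLE + "bob", PEOPLE + "carol"

# Each set stands in for a file published independently on a different site.
site_a: Set[Triple] = {(ALICE, FOAF_KNOWS, BOB)}
site_b: Set[Triple] = {(BOB, FOAF_KNOWS, CAROL), (BOB, FOAF_KNOWS, ALICE)}


def merge(*graphs: Set[Triple]) -> Set[Triple]:
    """Merging graphs is set union: shared URIs line the data up automatically."""
    return set().union(*graphs)


def friends_of_friends(graph: Set[Triple], person: str) -> Set[str]:
    """People reachable in exactly two foaf:knows steps, excluding direct contacts."""
    direct = {o for s, p, o in graph if s == person and p == FOAF_KNOWS}
    second = {o for s, p, o in graph if s in direct and p == FOAF_KNOWS}
    return second - direct - {person}


combined = merge(site_a, site_b)
print(friends_of_friends(combined, ALICE))
```

Neither site alone can connect Alice to Carol; the link only appears once the two sources are combined, which is the kind of cross-silo discovery the answer describes.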
I think the hope for more true interactivity in terms of collaborative tools, particularly real-time collaborative tools, has yet to be realized—it's something I had hoped for in the early days, and I am still hoping to see it happen.
Updegrove: Since this is your second time around designing the Web, what did you learn from taking the Web from concept to reality the first time that may help us anticipate how the Semantic Web will become real?
Berners-Lee: The Semantic Web idea—that of having data as well as documents on the Web—has been around since the start of the Web. It is just more complicated to do.
Experience from the initial growth of the Web of documents? Well, it was a very rigid exponential growth, which couldn't be slowed or hastened. Different people "got it" in different years, and to them it seemed that the Web had "happened" all that year. It spread first among enthusiasts, and then among small sub-communities where one could get to critical mass with the momentum of a few champions. These communities (High Energy Physics for the WWW, possibly Life Sciences for Semantic Web) are full of people who have very big challenges to tackle, and are largely scientifically minded people who understand the new paradigm. These things may be very similar.
Where it is different is that there is attention from the press. We work under floodlights. Whereas the WWW took off in the hands of the converts, and others were left in blissful ignorance, the Semantic Web takes off with articles like this one, and people checking to see whether it is time for them to get involved. This has helped in some ways, hindered in others. We have to work hard to make sure that expectations are not overstated.