Wicked Cool Java: Crawling the Semantic Web

February 6th, 2006  |  Published in News, Technology

Java World

The Semantic Web is the next-generation web of concepts linked to other concepts, rather than a collection of hypertext documents linked by keywords. If you think about it, an HTML anchor tag (link) is a keyword reference to another document. It supplies a word or phrase that links to another document, usually displayed as underlined text on a browser. But the link doesn’t exactly say how the two documents are related to each other. HTML hyperlinks don’t give any real indication about relationships between files, and the text in the link may be extremely vague. A new standard, the Resource Description Framework (RDF), makes it possible to be much more specific about how things are related to each other. In fact, RDF describes much more than documents—any entities or concepts can be linked together. This is the basic idea behind the Semantic Web—that concepts, rather than documents, can be linked together.

As Java developers, how can we participate in building the Semantic Web? First, you’ll need to know something about official standards such as RDF. You will then need to tag your documents appropriately. Many sites are already starting to do some of this by creating RDF Site Summary (RSS) feeds. An RSS feed syndicates the content from a Website so that it can be combined with information from other sites and delivered to the users as aggregated content. RSS makes a small portion of a site available as a summary, similar to what you see in an article or news abstract. However, RSS enabling is only the first step in moving toward a Semantic Web. In this article I discuss enough to get you started working with RDF and introduce some APIs that help in producing or consuming content.

Go To Article »

Popularity: 4% [?]

Leave a Response