Archive for the 'Conferences' Category

Putting a Conference into the Semantic Web

Chris Gutteridge asked this question about semantically enabling conference Web sites, which is a subject close to my heart. It’s hard to give a meaningful response in 140 characters, so I decided to get some headline thoughts down for posterity. If you want a fuller account of some first-hand experiences, then the following papers are a good place to start:

Top Five Tips for Semantic Web-enabling a Conference

1. Exploit Existing Workflows

Conferences are incredibly data-rich, but much of this richness is bound up in systems (for paper submission, delegate registration, scheduling, and so on) that aren’t native to the Semantic Web. Recognise this in advance and plan how you intend to get the data from these systems out onto the Web. The good news is that scripts now exist to handle dumps from submission systems such as EasyChair, but you may need to ensure that the conference instance of these systems is configured correctly for your needs. For example, getting dumps from these systems often comes at a price, and if you’re using one instance per track rather than the multi-track options, you may be in for a shock when you ask for the dumps. Speak to the Programme Chairs about this as soon as possible.

In my experience, delegate registration opens months in advance of a conference and often uses a proprietary, one-off system. As early as possible make contact with the person who will be developing and/or running this system, and agree how the registration system can be extended to collect data about the delegates and their affiliations, for example. Obviously there needs to be an opt-in process before this data is published on the public Web.

Collecting these types of data from existing workflows is monumentally easier than asking people to submit it later through some dedicated means. With this in mind, have modest expectations (in terms of degree of participation) for any system you hope to deploy for people to use before, during and after the conference, whether this is a personalised schedule planner, a paper annotation system or a rating system for local restaurants. People always have massive demands on their time, especially at a conference, so any system that isn’t already part of a workflow they are engaged with is likely to see limited uptake.

2. Publish Data Early then Incrementally Improve

Perhaps your goal in publishing RDF data about your conference is simply to do the right thing by eating your own dog food and providing an archival record of the event in machine-readable form. This is fine, but ideally you want people to use the published data before and during the event, not just afterwards. In an ideal world, people will use the data you publish as a foundation for demos of their applications and services at the conference, as a means to enhance the event and also to promote their own work. To maximise the chances of this happening you need to make it clear in advance that you will be publishing this data, and give an indication of what its scope will be. The RDF available from previous events in the ESWC and ISWC series can give an impression of the shape of the data you will publish (assuming you follow the same modelling patterns), but get samples out early and basic structures in place so people have the chance to prepare. Better to incrementally enhance something than save it all up for a big bang just one week before the conference.

3. Attend to the Details

Many of the recent ESWC and ISWC events have done a great job of publishing conference data, and have certainly streamlined the process considerably. However, along the way we’ve lost (or failed to attend to) some of the small but significant facts that relate to a conference, such as the location, venue, sponsors and keynote speakers. This stuff matters, and is the kind of data that probably doesn’t get recorded elsewhere. Obviously publishing data about the conference papers is important, but from an archival point of view this information is at least recorded by the publishers of the proceedings. The more tacit, historical knowledge about a conference series may be of great interest in the future, but is at risk of slipping away.

4. Piggy-back on Existing Infrastructure

As I discovered while coordinating Semantic Web technologies for ESWC2006, deploying event-specific services is simply making a rod for your own back. Who is going to ensure these stay alive after the event is over and everyone moves on to the next thing? The answer is probably no-one. The domain registration will lapse, the server will get hacked or develop a fault, the person who once knew why that site mattered will take a job elsewhere, and the data will disappear in the process. Therefore it’s critical that every event uses infrastructure that is already embedded in everyday usage and therefore has a future. The best example of this is, the de facto home for Linked Data from Web-related events. This service has support from SWSA, and enough buy-in from the community, to minimise the risk that it will ever go away. By all means host the data on the conference Web site if you must, but be sure also to mirror it at, with owl:sameAs links to equivalent URIs in that namespace for all entities in your data set.

5. Put Your Data in the Web

Remember that while putting your data on the Web for others to use is a great start, it’s going to be of greatest use to people if it’s also *in* the Web. This is a frequently overlooked distinction, but it really matters. No one in their right mind would dream of having a Web site with no incoming or outgoing links, and the same applies to data. Wherever possible the entities in your data set need to be linked to related entities in other data sets. This could be as simple as linking the conference venue to the town in which it is located, where the URI for the town comes from Geonames. Linking in this way ensures that consumers of the data can discover related information, and avoids you having to publish redundant information that already exists somewhere else on the Web. The really great news is that already provides URIs for many people who have published in the Semantic Web field, and (aside from some complexities with special characters in names) linking to these really can be achieved in one line of code. When it’s this easy there really are no excuses.


Reading the above points back before I hit publish, I realise they focus on Semantic Web-enabling the conference as a whole, rather than specifically the conference Web site, which was the focus of Chris’s original question. I think we know a decent amount about publishing Linked Data on the Web, so hopefully these tips usefully address the process-oriented, rather than the purely technical, aspects.

I’m not a lawyer, but… ISWC2009 Tutorial on Data Licensing

As the Linked Data community shifts its emphasis from publishing data on the Web to consuming it in applications, one question inevitably arises: “what are the terms under which different data sets can be reused?” There’s been a considerable amount of time and money invested by Talis and others in providing some clarity in this area, work that has evolved into the Open Data Commons. However, there remains a lot of confusion on the subject, as evidenced by this thread on the Linking Open Data mailing list. Clearly there is more education and outreach work to be done about how licenses and waivers can be applied to data.

With this in mind Leigh Dodds, Jordan Hatcher, Kaitlin Thaney and I submitted a tutorial proposal to this year’s International Semantic Web Conference addressing exactly these kinds of issues. The good news is that our proposal has been accepted, and therefore there will be a half-day tutorial on “Legal and Social Frameworks for Sharing Data on the Web” at ISWC2009 in Washington DC in October (more details online soon). With Jordan present to provide the legal perspective there’ll finally be someone taking part in the discussion who doesn’t need to prefix their statements with “I’m not a lawyer, but…”

Common Sense prevails at Triple-I

I’m here in Graz for Triple-I. All credit to Klaus Tochtermann, Hermann Maurer and many others for putting together what is shaping up to be a great event. Things got off to a good start with the first keynote, from Henry Lieberman, who talked about how we manage and utilise common sense knowledge. The talk got me thinking on many levels, such as how we build task- or goal-oriented interfaces, and whether we might be able to get the 21 “most commonly used relationships in common sense statements” into an appropriate form for use on the Semantic Web. With the bar already set very high, I’d best go and work some more on the slides for my own keynote on Friday, titled “Humans and the Web of Data”.

Semantic Web In Use at ISWC2009

ISWC2008 isn’t upon us just yet, but already the preparations for ISWC2009 are under way. I’m pleased to say that I’ll be serving as co-chair of the Semantic Web In Use track alongside Lee Feigenbaum.

In my experience the ISWC series has been growing steadily in strength year on year and, while I’m inevitably biased, last year’s conference did seem like a watershed moment for the series and the Semantic Web as a whole. There was a tangible energy in the air that suggested the Semantic Web was no longer just a vision, but both real and inevitable. It will be interesting to see in Karlsruhe how things are shaping up one year on. I can only speculate about where we’ll be by autumn 2009, but I’m very much looking forward to finding out.