Linked Data impact and long-term URL preservation

December 21, 2009

After a quick pass through Paul Miller’s draft Linked Data report for JISC, I looked out the notes I had made when we talked in the Black Medicine cafe. There were unusually few notes, for quite a long conversation.

I don’t think we really discussed anything that featured in the draft Linked Data report, and certainly not the implementation issues. Instead we talked about the broader implications of linked open data for JISC services, about business models for supporting open data, about the upcoming effort on …

One topic I did take notes on was that of long-term URL preservation – what kind of institution to approach to make a commitment to keep a URL around for 30+ years for the use of, say, a library special collection georeferencing project (and hopefully many others).

Here is an edited version of a set of notes I wrote for @simonjbains and others at the Digital Library in Edinburgh. This requirement is probably not unique to geo-services; bibliography and media archive projects would surely face a similar need to make sure that references really stick around.

RCAHMS was another interesting choice given their involvement in digital gazetteer reference already with Scotland’s Places.

Improving the geo-text parser interface

December 17, 2009

We’ve fixed some issues with Unlock Text, the placename text extraction service, thanks to feedback from members of the OSGeo-UK community.

Parsing of HTML pages is now reinstated and there’s a bit more documentation on the form-based interface; the meaning of the options should be a bit clearer.

The main outstanding caveat is that if you use the ‘XML’ document type option, but the document you send isn’t well-formed XML, you’ll still see the unfriendly message “Error executing command-line application”.
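Until the service reports this more helpfully, one workaround is to check well-formedness on your own side before submitting a document. The snippet below is only a rough sketch of that idea using Python’s standard library; the function name `check_well_formed` is made up for illustration and is not part of Unlock Text.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text):
    """Return (True, None) if `text` parses as XML, else (False, message).

    Hypothetical client-side helper: it only tests well-formedness,
    it does not validate against any schema.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # The parser's message includes the line and column of the problem.
        return False, str(err)
    return True, None


ok, problem = check_well_formed("<doc><p>Edinburgh</doc>")
if not ok:
    print("Not well-formed XML:", problem)
```

If the check fails, the parser’s message points at the line and column of the problem, which is rather more useful than the error above.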

The next step is to trap the different error states better, so we can offer more meaningful feedback to the user if these errors do occur. However, that will take a while longer to resolve, with input from our partners in this part of the project, the wonderful people at the Language Technology Group in the University of Edinburgh’s School of Informatics.

We’ve also streamlined the process of registering an API key to access OS data through Digimap, so it should be a bit clearer where to go to add a new key for a new IP address.

Build it because we can?

December 8, 2009

We came up with a long list of functions that we could implement in Unlock Places.

We could get into more detailed location searches – going beyond ‘where is it?’ to ‘what’s next to it?’ or ‘where is the same size as it?’ But we’d be answering these questions “because we can”, not because someone needs to know. Here are some of the new functions we’ve considered for the API:

  • Buffer searches around footprints (within 1 mile of the edge of town…)
  • Area of a footprint
  • Centroid, or approximate centre point, of a footprint
  • Perimeter of a footprint
  • Distance between features
  • More spatial operators (only ‘within’ and ‘contains’ right now – could get into ‘overlaps’, ‘intersects’ etc.)
  • Searches within footprints (pass in ID of a polygon, get back matching names or feature types)
  • Buffered searches for “find all places within x miles of feature y’s footprint”
  • Equivalence – “what towns are there, the same size as Edinburgh?”
  • Reprojection of output (all in WGS84 now)
  • New output formats – GML, WKT, even microformats?
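To give a feel for how small some of these are, here is a rough sketch of the area, perimeter, centroid and distance calculations in plain Python. It is not how Unlock Places works or would implement them; the function names are invented for illustration. The polygon function assumes a simple (non-self-intersecting) footprint in a planar, projected coordinate system, so WGS84 output would need reprojecting first, and the distance function assumes a spherical Earth.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def polygon_metrics(ring):
    """Return (area, perimeter, centroid) of a simple polygon.

    `ring` is a list of (x, y) tuples in planar coordinates; the first
    vertex does not need to be repeated at the end.
    """
    n = len(ring)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = cx = cy = perimeter = 0.0
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]  # wrap round to close the ring
        cross = x0 * y1 - x1 * y0   # shoelace term for this edge
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        perimeter += math.hypot(x1 - x0, y1 - y0)
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    centroid = (cx / (3.0 * twice_area), cy / (3.0 * twice_area))
    return abs(twice_area) / 2.0, perimeter, centroid


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two WGS84 points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# A 4 x 3 rectangle in arbitrary planar units.
area, perimeter, centroid = polygon_metrics([(0, 0), (4, 0), (4, 3), (0, 3)])
```

The arithmetic is the easy part; the harder question is whether anyone needs the answers.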

I’m reluctant to “build it because we can”; our focus is on “enhancing the productivity of research” so we need some evidence of a research need or benefit for whatever we implement.

Got a use case or a criticism?