Geospatial anarchy: Managing datasets the Open Source way

Session

Technology & Methods II

Summary

Created by a select few professionals, managed by a limited set of users, and used by a limited number of organizations. This is a traditional description of geospatial datasets. The advent of the Open Data movement and crowdsourcing initiatives has drastically changed this situation.

Despite what seems like absolute anarchy, OpenStreetMap has become a major player in the geospatial world. Several companies provide paid access to mapping products built on OpenStreetMap data. In many areas the data found in OpenStreetMap is equivalent to or better than commercial or governmental data.

If the Open Source model works great for OpenStreetMap, how can these ideas be incorporated into management of governmental datasets? How can governmental geospatial data released using an open license be used to enhance an open, much less structured, dataset like OpenStreetMap?

Target audience

Intended for an "academic track", but the target audience includes both governmental bodies working with the distribution of open data and users of both "official" data and OpenStreetMap-type data.

Abstract

Created by a select few professionals, managed by a limited set of users, and used by a limited number of organizations. This is a traditional description of geospatial datasets. The advent of the Open Data movement and crowdsourcing initiatives has drastically changed this situation.

Anyone with basic computer knowledge and access to the Internet can access, edit and use a large pool of geospatial data from various sources. These sources vary. Some are traditional datasets created and managed in a rigid manner by governmental bodies, using strict data schemas, codes for data types and specific standards for the representation and exchange of data. Others are loosely structured collections of points of interest (POIs), each consisting of a position and a set of attributes. Yet another source is OpenStreetMap, an open mapping platform where everyone can add, edit and delete features, a mapping equivalent of Wikipedia.
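The contrast between a rigid, schema-driven dataset and a loosely structured collection of POIs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the feature codes and the `source:feature_code` tag key are hypothetical and are not taken from any governmental standard or from OpenStreetMap itself.

```python
from dataclasses import dataclass, field

# Hypothetical code list, standing in for an agency's fixed feature codes.
ALLOWED_FEATURE_CODES = {"ROAD", "BUILDING", "WATER"}


@dataclass(frozen=True)
class SchemaFeature:
    """A feature in a rigid dataset: fixed fields and a controlled code list."""
    feature_code: str
    lon: float
    lat: float
    name: str

    def __post_init__(self):
        # Reject anything outside the predefined code list.
        if self.feature_code not in ALLOWED_FEATURE_CODES:
            raise ValueError(f"unknown feature code: {self.feature_code}")


@dataclass
class TaggedPoint:
    """A loosely structured POI: a position plus free-form attributes."""
    lon: float
    lat: float
    tags: dict = field(default_factory=dict)


def to_tagged_point(feature: SchemaFeature) -> TaggedPoint:
    """Carry a schema-bound feature over into the free-form model.

    The tag keys used here are assumptions made for this sketch.
    """
    return TaggedPoint(
        lon=feature.lon,
        lat=feature.lat,
        tags={"name": feature.name, "source:feature_code": feature.feature_code},
    )
```

In this sketch the rigid model refuses a feature whose code is not on the list, while the free-form model accepts any key and value a contributor chooses. Converting from the rigid model to the free-form one is straightforward; going the other way would require deciding which free-form attributes map onto which fixed fields.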

Despite what seems like absolute anarchy, OpenStreetMap has become a major player in the geospatial world. Several companies provide paid access to mapping products built on OpenStreetMap data. In many areas the data found in OpenStreetMap is equivalent to or better than commercial or governmental data.

How is this possible? How can an uncontrolled group of volunteers, communicating only through mailing lists and commit messages, manage geospatial data? On the surface this looks like a doomed project, but like Wikipedia, this bottom-up approach to mapping and data management seems to hold some merit. An interesting observation is that this model of collaboration and data management is closely related to the methods used to develop Open Source software. The Linux kernel, for example, includes contributions from about 8000 individuals working from all parts of the world.

If the Open Source model of decentralizing work, defining attributes less rigidly and opening up for contributions from everyone works great for OpenStreetMap, how can these ideas be incorporated into management of governmental datasets? How can governmental geospatial data released using an open license be used to enhance an open, much less structured, dataset like OpenStreetMap?

These are among the questions we deal with. For decades, we have been developing software that government agencies use to manage geospatial datasets. In recent years, we have extended our business by also providing these datasets to other customers, often in combination with other open datasets, including OpenStreetMap.

We think that this mixture of data from different sources will be even more important in the future. In September 2016 I will start as a PhD candidate at NTNU, focusing on the matters described here (and more).

At Kortdage 2016 I will present some of my initial findings, focusing on the practices and workflows of crowdsourcing initiatives like OpenStreetMap and how some of these ideas can be applied in managing other datasets.

See also Atle Sveen's article on the topic in Geoforum Perspektiv: Data anarchy