Gamma Blog: The Open Data Advantage (and disadvantage)

What is Open Data?

At Gamma, data is our lifeblood. We rely on it to make decisions and gain insight, we blend data from an extensive range of sources to meet and exceed client needs, and we build systems that present this data clearly and understandably. The data landscape has changed vastly in recent years as we have moved from data scarcity to data abundance.

One of the main drivers of this change is the rise of Open Data. In the interest of transparency, accountability, cost savings and a myriad of other reasons, data producers across the globe have embraced Open Data. Public bodies, in particular, are moving towards a scenario where the data they produce is ‘Open by Default’. In Ireland, Open Data has been recognised by the government as having a powerful role to play in Public Sector Reform, and therefore there is considerable support for Open Data initiatives at a high level.

At its core, Open Data is about making data available for reuse by others. That means that the data should be openly licensed, openly accessible, structured, linked and made available in an open format. Many administrations have established Open Data portals at a national or local level which serve as a central location for accessing Open Data. The UK’s portal is particularly impressive with over 43,000 datasets, Germany’s has over 25,000, and the EU itself has a portal with over 12,000 datasets. Ireland’s Open Data portal isn’t quite as large but still contains over 5,300 datasets from 100 producers today, up from 4,700 datasets from 94 publishers in early 2017.

Looking at which bodies produce that data in the Irish context, the Central Statistics Office produces the majority of the datasets available on data.gov.ie, about 61% of the total. It is also notable that several governmental bodies working in the geospatial space, such as Ordnance Survey Ireland and the Geological Survey of Ireland, appear in the top ten.

Publisher                                             Number of datasets  Percent
Central Statistics Office                                           3250      61%
Health Service Executive                                             285       5%
Department of Housing, Planning and Local Government                 238       4%
Environmental Protection Agency                                      228       4%
Ordnance Survey Ireland                                              160       3%
Marine Institute                                                     130       2%
Roscommon County Council                                              92       2%
Dublin City Council                                                   85       2%
Geological Survey of Ireland                                          69       1%
Department of Culture, Heritage and the Gaeltacht                     60       1%
Others                                                               743      14%
Total                                                               5340     100%
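
For readers who want to check the arithmetic, here is a minimal sketch that recomputes the percentage column from the dataset counts in the table above. The function name `percent_shares` is made up for this example. Because each share is rounded to a whole percent, the rounded figures do not necessarily add up to exactly 100.

```python
# Dataset counts per publisher, copied from the table above.
counts = {
    "Central Statistics Office": 3250,
    "Health Service Executive": 285,
    "Department of Housing, Planning and Local Government": 238,
    "Environmental Protection Agency": 228,
    "Ordnance Survey Ireland": 160,
    "Marine Institute": 130,
    "Roscommon County Council": 92,
    "Dublin City Council": 85,
    "Geological Survey of Ireland": 69,
    "Department of Culture, Heritage and the Gaeltacht": 60,
    "Others": 743,
}


def percent_shares(counts):
    """Return each publisher's share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


total = sum(counts.values())
shares = percent_shares(counts)

# Print one line per publisher, largest share first.
for name, share in sorted(shares.items(), key=lambda item: -counts[item[0]]):
    print(f"{name}: {counts[name]} datasets ({share}%)")
print(f"Total: {total} datasets")
```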

Of course, official governmental portals are not the only sources of Open Data. In the geospatial context, the standout source is OpenStreetMap (OSM). The entire OSM database, named planet.osm, can be downloaded, but at 64GB the file can be a bit unwieldy. Smaller extracts are available from services such as Geofabrik and BBBike.

Using data from OpenStreetMap and some other Open Data sources does raise potential licensing issues. Some licences, notably CC-BY-SA and the ODbL (which is used by OSM), are ‘share-alike’ licences. This means that any changes the end user makes to the source data must be shared under the same terms as the source data. This licence type is sometimes referred to as ‘viral’, because incorporating share-alike data into business processes and enriching it can have legal repercussions. Other licence terms to watch out for are NC, which means no commercial use of the data, and ND, which forbids making changes to the data. There is more information on licence types on the Creative Commons website.
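
As a rough illustration of the licence terms described above, the sketch below flags share-alike, NC and ND terms from a licence identifier. The function name `licence_flags` and the hyphen-separated identifier style (for example `CC-BY-NC-4.0` or `ODbL-1.0`) are assumptions made for this example. A check like this can only prompt a closer look; it is no substitute for reading the licence text itself.

```python
def licence_flags(licence_id):
    """Return the restrictive terms suggested by a licence identifier.

    Assumes hyphen-separated identifiers such as "CC-BY-SA-4.0" or
    "ODbL-1.0"; other naming styles will not be recognised.
    """
    tokens = licence_id.upper().split("-")
    flags = set()
    # SA marks Creative Commons share-alike; the ODbL is share-alike too.
    if "SA" in tokens or tokens[0] == "ODBL":
        flags.add("share-alike")
    # NC: no commercial use of the data.
    if "NC" in tokens:
        flags.add("non-commercial")
    # ND: no changes may be made to the data.
    if "ND" in tokens:
        flags.add("no-derivatives")
    return flags


# Example identifiers, chosen for illustration only.
for licence in ["CC-BY-4.0", "CC-BY-SA-4.0", "ODbL-1.0", "CC-BY-NC-ND-4.0"]:
    flags = licence_flags(licence)
    print(licence, "->", ", ".join(sorted(flags)) or "no flagged terms")
```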

At Gamma, we are big users of and believers in Open Data. Nonetheless, we are very careful about the licensing of the Open Data we use in our systems.

© 2018 Gamma.ie by Richard Cantwell