12 November 2020

The Impact of High Value Datasets

As former EuroGeographics Secretary General and Executive Director Mick Cory retires from a lifetime career working in national and international geospatial agencies, he examines the impact of high-value datasets. The views expressed in this article are his own and do not necessarily represent the views of EuroGeographics or its members.

The Open Data PSI Directive

The Open Data PSI Directive[1] encourages European Union Member States to make certain high-value public sector datasets available as open data. ‘Open’ means free for re-use with minimal legal restrictions, free of charge and in machine-readable format, via suitable APIs[2] and, where relevant, as a bulk download. According to the Directive, ‘high-value’ means data with the potential to:

  • generate significant socio-economic or environmental benefits and innovative services;
  • benefit a high number of users, in particular small to medium sized enterprises (SMEs);
  • assist in generating revenues; and
  • be combined with other datasets.

Six thematic categories of high-value datasets are identified in the Directive:

  1. Geospatial
  2. Earth observation and environment
  3. Meteorological
  4. Statistics
  5. Companies and company ownership
  6. Mobility

National Mapping, Land Registry and Cadastral Authorities in the European Union are included within the scope of the Directive, as are all public bodies such as government departments, state agencies and municipalities, as well as organisations funded mostly by or under the control of public authorities. 

Proposed High-Value Geospatial datasets

Administrative Units

Administrative units datasets describe the geographic location and extent of areas of public administration where Member States exercise legal jurisdictional rights for local, regional and national governance, separated by administrative boundaries. When combined with demographic and other statistical data, they inform regional and urban policy implementation, urban and regional development planning and the delivery of public services, and they are used for judicial or other legal purposes and for determining parliamentary or local democratic constituencies.

Place Names

Place names, or geographical names, are the proper nouns applied to natural, man-made or cultural features on Earth. They represent an important reference system used by individuals and societies throughout the world and often have historical and cultural significance. They are valuable in emergency response, in economic, social and environmental analysis, and as a reference for cultural identity and heritage studies.

Addresses

Address datasets typically contain the road name, house number, postal code and geographic location of properties. An address normally refers to a building, or other permanent construction intended or used for the shelter of people, having at least one entrance from publicly accessible space. Such data are of high value because the combination of addresses with location permits sophisticated geospatial analysis for a wide range of uses, including statistical analysis based on location (linking, for example, census data to location), locating people for emergency rescue, permitting accessibility studies and the analysis of economic activities. The efficient and effective delivery of mail, parcels and a wide range of public services (such as utilities) relies on addresses in general, and can be improved greatly by geospatial analysis, such as route optimisation, when addresses are combined with transportation network data.

Buildings

This dataset contains the location and extent of the two-dimensional footprint of a building or a three-dimensional model of the building. Such data are considered high-value as they refer to facilities essential for the shelter and employment of people. In combination with other datasets they can provide important information on building usage and environmental impact, and support air and noise pollution studies, risk assessment for earthquake, fire or flood, monitoring of land use and consumption, and analysis of population concentration and of the requirement for and access to services.

Cadastral Parcels

Cadastral parcels describe the geographic location and boundaries of areas of the Earth's surface under homogeneous real property rights and unique ownership. Cadastral parcel datasets are considered high-value as they provide a link between the land parcel, its ownership or other rights and potentially other information held in a national cadastral database (such as property value). These datasets are important for the definition and protection of state lands. They reduce land disputes, facilitate land reform, agriculture, land management, disaster management and the real estate market, and form the basis of an equitable property tax system.

Geospatial datasets - legislative intervention

The geospatial thematic category is estimated to have the largest share (at 34%) of the public sector information market, with the potential for further growth if the Commission introduces some low-level legislative interventions in:

  • licences and terms of use,
  • APIs and bulk download,
  • formats,
  • granularity (scale),
  • metadata, and
  • key attributes.

In all of these cases, proposals are broadly aligned to the requirements, definitions and standards already set out in the INSPIRE Directive, with some new open standards and formats being suggested to increase re-usability.

Earth Observation and Environment

The Earth Observation and Environment thematic category includes the following geospatial datasets:

  • Digital elevation models. These are three-dimensional models of the earth’s surface, and include terrestrial elevation, bathymetry and shoreline.
  • Hydrography, covering the topographic description of all inland water and marine areas covered by river basin districts as defined in the Water Framework Directive.
  • Land parcels describe the location and extent of areas of land in terms of their physical or biological cover, such as agricultural, woodland or water bodies (land cover) and their economic and ecological purpose, such as farming, tourism etc (land use).
  • Ortho-images are geographically referenced satellite or airborne imagery (from the visible and non-visible parts of the electromagnetic spectrum) that have been geometrically corrected (orthorectified) to remove distortion caused by differences in elevation, sensor tilt and by sensor optics.

These data are of high value for environmental reporting, geological investigations and engineering planning, and ortho-imagery may be used to supplement a wide range of mapping applications, cadastral surveying and agricultural planning and management. Higher levels of legislative intervention are being considered for Earth Observation and Environment datasets: removing restrictive terms of use and fees, and extending the INSPIRE Directive data harmonisation efforts to include the datasets listed as open data.

Mobility

These data include transportation datasets required to support Intelligent Transport Systems (ITS) and published under the INSPIRE Directive. Geospatial datasets within this thematic category contain the geographic location and extent of road, rail, water and other transportation networks. They are considered vital to increase safety, to tackle Europe's growing emission and congestion problems and to achieve a more efficient management of the transport network for passengers and business. A low-level legislative intervention is being considered by the Commission for data under this thematic category, including on licensing, formats, accessibility, completeness, granularity and data attributes.

Next steps

The final proposal for high-value datasets is expected to be submitted to the European Commission’s Open Data Committee during the first quarter of 2021. An Implementing Regulation is then planned during 2021. It will define the agreed list of specific high-value datasets, along with the legislative interventions agreed to further encourage the availability of the datasets identified.

A number of actions are underway or planned to support this policy implementation. These include making data available through the European Union’s Open Data digital infrastructure (the European Data Portal and the EU Open Data Portal), funding through the Connecting Europe Facility (CEF), and the development of the Digital Europe Programme (DEP), with the specific objective of supporting the provision of data for Artificial Intelligence (AI).

In the longer term, the European Commission is required to carry out an evaluation of the Directive no sooner than 17 July 2025. This will assess the scope and scale of its social and economic impact and identify further possibilities for improving the proper functioning of the internal market and supporting economic and labour market development.

[1] The ‘Open Data PSI Directive’ (Directive (EU) 2019/1024) entered into force on 16 July 2019. It replaces the Public Sector Information Directive, also known as the ‘PSI Directive’ (Directive 2003/98/EC), which dated from 2003 and was subsequently amended by Directive 2013/37/EU.

[2] An API is an Application Programming Interface. It is a set of rules and protocols that acts as a software intermediary to allow two applications to talk to each other: in this case, as a means to utilise the data.