About open data
In this way, society as a whole (citizens, companies, universities and any other institution) can easily access the data to inform themselves or to create new services, increasing its social value and, where applicable, its commercial value.
Thus, providing public information in open formats for everyone to use goes beyond simply allowing the re-use of information: it returns public information to society and encourages people to use it for whatever purpose they wish.
Public administrations hold a great deal of information that they need in order to deliver the public services requested of them. This information becomes much more useful when it is returned to citizens, who can reuse it for other purposes and so increase its benefit.
An open data project must pursue the following objectives:
- Open public data for all sectors of the institution.
- Contribute to changing the culture of re-use of public information.
- Stimulate the use and reuse of open data.
- Strengthen the initiative to open public data in other public and private institutions.
- Promote the economic fabric through this initiative.
The European Directive 2003/98/CE of 17 November 2003 on the re-use of public sector information established a set of rules for the treatment of reusable public information. It was amended by Directive 2013/37/UE of 26 June 2013, which was transposed at the state level as Law 18/2015 of 9 July, modifying the previous Law 37/2007 of 16 November on the reuse of public sector information. Law 18/2015 sets out the basic legal regime applicable to the reuse of documents prepared or held by public sector administrations and bodies.
In addition, other regulations relate to the opening of public data, such as State Law 19/2013 of 9 December on transparency, access to public information and good governance, and Regional Law 19/2014 of 29 December on transparency, access to public information and good governance. These laws aim to promote citizen participation by requiring public entities to account to citizens for their activity and their management of public resources, in accordance with the principle of responsibility.
About the use of data
- CSV: Comma-Separated Values (CSV) files are an open document type that represents tables with columns separated by commas and rows by line breaks.
- XLS, XLSX: The XLS and XLSX formats are the file formats used by the Microsoft Excel spreadsheet program. The data is presented in rows and columns.
- XML: The XML (eXtensible Markup Language) files are based on a language developed by the World Wide Web Consortium (W3C) that allows defining the grammar of specific languages to structure large documents.
- RDF: RDF (Resource Description Framework) is a set of World Wide Web Consortium (W3C) specifications originally designed as a metadata model. It is typically used to give a conceptual description of web pages.
- KML: Keyhole Markup Language (KML) files specify a set of features (place marks, images, polygons, 3D models, textual descriptions, etc.) for display in Google Earth, Maps and Mobile, or any other geospatial application that supports the KML encoding. Each place always has a longitude and a latitude.
- DAT: DAT files can be encoded in plain text, while some DAT files follow binary encoding specifications.
The star-based classification developed by Tim Berners-Lee makes it possible to quantify the technological quality of open data according to the format used to represent it.
The scheme is incremental: each level includes the requirements of the previous one.
- ★ One star
- Data or documents available on the web in any format.
- Under an open, non-restrictive license.
- Unstructured format.
- The dataset or document can be viewed on the web but not automatically processed.
Examples: an image in JPG or PNG format, or a document scanned in PDF format.
- ★★ Two stars
- Structured data or documents.
- Automatically processable.
- Proprietary format (not open).
Example: A spreadsheet in Microsoft Excel format.
- ★★★ Three stars
- Structured and open format (non-proprietary).
Example: Spreadsheet in CSV (Comma Separated Values) format instead of Microsoft Excel.
- ★★★★ Four stars
- Data can be referenced with persistent web addresses or Uniform Resource Identifiers (URIs).
- W3C standard and open formats are used to semantically describe the information.
Example: a representation in the RDF (Resource Description Framework) model of the buildings of a public body, with their contact and location data, in which each atomic piece of data can be accessed through a web address (URI). Certain APIs could also be considered.
- ★★★★★ Five stars
- Data is linked and semantically described with other external datasets to provide context to the information.
- Semantic relationships are established between linked information.
Example: In the above case, descriptions of the location of public buildings could be enriched with links to DBpedia (http://dbpedia.org). These links could include a detailed description of localities, regions, or countries and thus have direct access to socioeconomic or toponymic information of these places.
Technical excellence - five stars - is achieved when data is linked to other resources on the web through semantic mechanisms, which offer full interoperability between different systems, and allow a much more efficient reuse later.
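The incremental nature of the scheme can be sketched in a few lines of Python. This is only an illustration: the function name `star_rating` and its five boolean parameters are hypothetical, chosen here to mirror the criteria listed above, and are not part of any official tool.

```python
def star_rating(open_license, structured, open_format, uses_uris, linked):
    """Return a star level from 0 to 5.

    Each level requires every previous level, so counting stops
    at the first criterion that is not met.
    """
    criteria = [open_license, structured, open_format, uses_uris, linked]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# A spreadsheet in Microsoft Excel format: structured but proprietary.
excel_stars = star_rating(True, True, False, False, False)
# The same spreadsheet published as CSV: structured and open.
csv_stars = star_rating(True, True, True, False, False)
print(excel_stars, csv_stars)
```

Because the loop stops at the first unmet criterion, a dataset that uses URIs but is not in a structured format would still only earn one star under this sketch.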
By default, the resources published in the Open Data BCN portal are subject to a CC-BY 4.0 license, which means that the information can be reused (copied, adapted, processed, etc.) and distributed on condition that the origin of the data is cited.
The themes and subthemes available are as follows:
- Society and Welfare
- Town planning and Infrastructures
- Culture and Leisure
- Public opinion
- Public sector
- Human resources
- Legislation and justice
- Economy and Business
- Science and technology
Moreover, the catalogue can be downloaded in .csv, .rdf and .json formats from the 'API Catalogue' option, and also as the dataset 'Barcelona City Council Open Data catalogue - Open Data BCN'.
When reusing data from the catalogue, its origin must be cited by stating explicitly that it comes from the Open Data BCN portal, as follows:
If HTML code is allowed:
This product or service uses data from the Open Data BCN portal.
If only plain text is allowed:
This product or service uses data from the Open Data BCN portal (https://opendata-ajuntament.barcelona.cat/en).
In georeferenced datasets, the coordinate system in which the information is published is indicated. Note that the datasets that come from CartoBCN, as indicated in the 'More information' metadata of the dataset file, are published in the official reference system ETRS89 (EPSG: 25831), except for the GeoJSON, GeoPDF and KML formats, which follow their own standards and are in WGS84. Likewise, the historical cartographic products are published in the ED50 system (EPSG: 23031).
To convert coordinates in internal format to UTM31 ED50 coordinates, follow this procedure:
1. Divide them by 1,000 (or place a decimal point three digits from the right)
2. Add 400,000 to the X coordinate (put a 4 in front)
3. Add 4,500,000 to the Y coordinate (put a 45 in front)
As an example, take these coordinates in internal format:
X = 30733208
Y = 88007542
Applying the procedure above, we obtain:
X = 30733208 / 1000 = 30733.208
30733.208 + 400000 = 430733.208
Y = 88007542 / 1000 = 88007.542
88007.542 + 4500000 = 4588007.542
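The three-step procedure can be sketched as a small Python function. The function name `internal_to_utm31_ed50` is hypothetical, and the Y input of 88007542 is assumed from the result shown in the worked example. `Decimal` is used so that the printed values match the hand calculation exactly, without floating-point rounding noise.

```python
from decimal import Decimal


def internal_to_utm31_ed50(x_internal, y_internal):
    """Convert integer internal-format coordinates to UTM31 ED50.

    Step 1: divide by 1,000.
    Step 2: add 400,000 to X.
    Step 3: add 4,500,000 to Y.
    """
    x = Decimal(x_internal) / Decimal(1000) + Decimal(400000)
    y = Decimal(y_internal) / Decimal(1000) + Decimal(4500000)
    return x, y


x, y = internal_to_utm31_ed50(30733208, 88007542)
print(x, y)
```

For these inputs the function reproduces the values from the example, 430733.208 and 4588007.542.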
CSV files follow a standardized format, and how the data is displayed depends on the tool used. In a text editor, each record appears as a single line of comma-separated values. Most spreadsheets and similar tools, on the other hand, detect the format automatically or offer options to display the data in a grid.
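CSV files can also be read programmatically. The following is a minimal sketch using Python's standard `csv` module. The column names and the population figures in `sample` are placeholders invented for illustration, not real portal data.

```python
import csv
import io

# Placeholder CSV content: a header line followed by one record per line.
sample = "Neighbourhood,Population\nel Raval,100\nla Barceloneta,200\n"

# DictReader uses the header line to name the fields of each record.
rows = list(csv.DictReader(io.StringIO(sample)))

for row in rows:
    print(row["Neighbourhood"], row["Population"])
```

To read a downloaded file instead, the `io.StringIO(sample)` object could be replaced with a file opened via `open(path, newline="", encoding="utf-8")`, assuming the file is UTF-8 encoded.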
The RDF model allows you to specify metadata to describe resources of any kind (physical or virtual) on the web. This model can be represented in different formats and allows the exchange of information between automatic systems.
As an example, the dataset catalogue of the Open Data BCN portal is represented in this format. This allows other initiatives at the supranational level to process its contents and aggregate them (e.g., the European Data Portal aggregates European catalogues through their RDF descriptions).
The RDF/XML format can be opened with any XML or text editor, such as Notepad++.
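Since RDF/XML is XML, it can also be inspected with a general XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a small invented snippet. It assumes, purely for illustration, a catalogue described with the W3C DCAT and Dublin Core vocabularies. The `example.org` URIs, the titles and the function name `list_datasets` are hypothetical, and the actual structure of the portal's catalogue may differ.

```python
import xml.etree.ElementTree as ET

# Invented RDF/XML snippet; not taken from the real catalogue.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcat="http://www.w3.org/ns/dcat#"
         xmlns:dct="http://purl.org/dc/terms/">
  <dcat:Dataset rdf:about="https://example.org/dataset/sample-1">
    <dct:title>Sample dataset one</dct:title>
  </dcat:Dataset>
  <dcat:Dataset rdf:about="https://example.org/dataset/sample-2">
    <dct:title>Sample dataset two</dct:title>
  </dcat:Dataset>
</rdf:RDF>
"""

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def list_datasets(rdf_xml):
    """Return (URI, title) pairs for each dcat:Dataset under the root."""
    root = ET.fromstring(rdf_xml)
    about = "{" + NS["rdf"] + "}about"
    return [
        (ds.get(about), ds.findtext("dct:title", namespaces=NS))
        for ds in root.findall("dcat:Dataset", NS)
    ]


datasets = list_datasets(RDF_XML)
for uri, title in datasets:
    print(uri, title)
```

A plain XML parser only sees the serialization. For real RDF work, where the same graph can be written in several equivalent ways, a dedicated RDF library would be a more robust choice.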
The WMS service, defined by the OGC, produces maps of spatially referenced data dynamically from geographic information. This service does not return the raw data but an image that represents that data in digital form.
The WMTS service, or tiled map service, also provides a digital image generated from geographic data, like WMS, but with faster response times because it relies on collections of tiles (portions of images) that have already been generated at defined scale intervals.
A link to a WMS service in Open Data BCN does not provide a map directly, but rather the URL of a map service that operates with the WMS (Web Map Service) standard. This standard is meant to be used with GIS software such as QGIS (open license), which can connect to the map server and navigate the map (pan, zoom, etc.). Other GIS software, for example ArcGIS or Geomedia, can also work with WMS.
The service can also be used in any geoportal that supports it, such as GeoportalBCN, the portal of the Institut Cartogràfic i Geològic de Catalunya, or any other web map client.
Keep in mind that the display of these services at scales greater than 1:5000 may be disabled.
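To illustrate what a GIS client does behind the scenes, the sketch below builds a WMS 1.3.0 `GetMap` request URL with Python's standard library. The parameter names are those defined by the WMS standard. The base URL, the layer name and the bounding box are hypothetical placeholders, so the resulting URL will not return a real map.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width=512, height=512,
                     crs="EPSG:25831"):
    """Build a WMS 1.3.0 GetMap URL; bbox is (minx, miny, maxx, maxy)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, layer and extent, for illustration only.
url = build_getmap_url(
    "https://example.org/wms",
    "sample_layer",
    (430000, 4580000, 431000, 4581000),
)
print(url)
```

The response to such a request is an image, which is why the raw data cannot be obtained through this service.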
The data in the resources of a dataset can be laid out in different ways.
In WIDE format, the concept is expressed in rows, and a column is created for each value the concept can take. For example, to represent the ‘Population by neighbourhoods of the city of Barcelona’ data, there would be one row for the concept ‘Population’ and one column for each neighbourhood.
|el Raval||el Barri Gòtic||la Barceloneta|
In LONG format, the concept is expressed in columns and the rows hold the different values. For the same ‘Population by neighbourhoods of the city of Barcelona’ example, two columns are created: one for the concept, ‘Neighbourhood’, and one for the value, ‘Population’.
|el Barri Gòtic||14.734|
The LONG format makes it easy to add new values, although it is less visually intuitive.
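The relationship between the two layouts can be sketched in plain Python. The function name `wide_to_long` is hypothetical. The figure for el Barri Gòtic is the one shown in the example above, while the other two population values are placeholders invented for illustration.

```python
# One WIDE row: the concept plus one column per neighbourhood.
wide_row = {
    "Concept": "Population",
    "el Raval": 100,           # placeholder value
    "el Barri Gòtic": 14734,   # value from the example above
    "la Barceloneta": 200,     # placeholder value
}


def wide_to_long(row, id_column, key_name, value_name):
    """Turn every non-identifier column of a WIDE row into a LONG row."""
    return [
        {key_name: column, value_name: value}
        for column, value in row.items()
        if column != id_column
    ]


long_rows = wide_to_long(wide_row, "Concept", "Neighbourhood", "Population")
for r in long_rows:
    print(r["Neighbourhood"], r["Population"])
```

Adding a new neighbourhood in LONG format means appending one more row, whereas in WIDE format it requires a new column.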