Over the last two years, I have taken part in a project to implement open government policies in local authorities in Catalonia, within the framework of the “Network of Transparent Local Governments of Catalonia”. We have promoted the opening of public information while ensuring open formats, interoperability, standardization, reusability, and good data quality.
We have done it by applying the “once only” principle of efficiency: we do not ask local administrations to publish data that they have already reported to provincial, regional, or federal administrations. Instead, we have contacted those administrations and requested that they open their consolidated data for all the local administrations: budgets, personnel, debt, subsidies, contracts, etc. This information has been made available on a shared open data platform, and more than 1,000 local public authorities that use the transparency service of the Open Government of Catalonia have it automatically published and regularly updated on their websites, without having to perform any additional task.
From this experience, I would like to present a list of the main barriers, attitudes, or “syndromes” that have shown up during the process of opening consolidated public data. According to the Collins dictionary, a “syndrome” is a group of symptoms that, together, are signs of a certain specific disorder or disease. Here is the list of “syndromes”:
Raiders of the lost law. Nobody in Spain mentions or remembers the Directive on the re-use of public sector information, understands what it says, or knows where to find it (Directive 2003/98/EC, which entered into force in 2003 and was revised by Directive 2013/37/EU, which came into effect in 2013). Its obligations are almost unheard of. Apparently, after its approval, this Directive was lost in the secret valley of the Well of the Souls in Egypt, next to the lost ark.
Gollum’s syndrome. In the movie The Lord of the Rings, the ring is “my precious” and transforms its owner into a powerful but selfish person, obsessed with not sharing it and paranoid about “losing” it. The same happens with government data. There are people in the authorities who have spent their whole careers preserving and protecting government data, and who now suffer from “paranoia” when someone wants to make it open. The famous statement that “data belongs to citizens” is a chimera for them.
Kilian Jornet syndrome. The list of difficulties that show up when trying to open public-sector data seems longer and harder than climbing Mount Everest twice in a week without supplemental oxygen, as the famous Catalan climber Kilian Jornet did in May 2017. The data “guardians” cite privacy problems, poor data quality (with the risk of damaging their image), technological difficulties, high costs, the claim that no one needs the data, and so on. It is an extensive list of excuses. As Kilian Jornet says, “I do not look for excuses to train, do you?”
Twin Peaks syndrome. As in the Twin Peaks series, nothing is as it seems, and there is a surprise at every step. Data controllers agree to share the information in an open data format, but our initial satisfaction fades when we realize that it is only available as PDF files. The data are “semi-open” but still quite “dark”, because they are not easy to exploit and analyze.
Forrest Gump syndrome. This fictional character unintentionally became a world ping-pong champion with a very simple strategy: return every ball and defeat the opponent by exhaustion. In the public authorities, we have many “Forrest Gumps”, experts in the art of returning ping-pong requests. They are asked to share information in an open data format and they respond, but with something that is obviously not what we asked for. We return the ball with more energy and precision, and they send it back to our side of “the table”, but the answer is still of little use. And so on and on, until we give up in exhaustion.
“Home alone” syndrome. We sometimes get this answer: of course you can access “my public” information, but only at my “home” (website). To access the data, we have to link to “their house”, which makes it evident who the “owner” of the data is and, moreover, pushes up the visitor counter of “their website”.
Despacito syndrome. You can access the data, and you can even download it as open, structured data files (CSV or spreadsheet) through a search engine but, as the song says, “despacito” (slowly). The search engine only allows the information to be downloaded in “little bits” of a few hundred records at a time, supposedly because of some strange technological limitation and so that no one gets stuffed with too much data. Anyone who wants to do a global analysis of the data can do it, but “pasito a pasito” (step by step).
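To make the “pasito a pasito” workaround concrete, here is a minimal Python sketch of how a reuser typically ends up collecting a full dataset from an engine that only serves small pages. Everything in it is an assumption for illustration: the 200-record cap, the `fetch_page(offset, limit)` interface, and the fake records stand in for a real search engine that the article does not name.

```python
from typing import Callable, Dict, Iterator, List

PAGE_SIZE = 200  # assumed cap, standing in for "a few hundred records at a time"


def fetch_all(
    fetch_page: Callable[[int, int], List[Dict]],
    page_size: int = PAGE_SIZE,
) -> Iterator[Dict]:
    """Yield every record by requesting consecutive pages until a short page arrives."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:
            break  # a short (or empty) page means there is nothing left
        offset += len(page)


# Hypothetical stand-in for the real search engine: 1,050 fake records.
_RECORDS = [{"id": i} for i in range(1050)]


def fake_fetch_page(offset: int, limit: int) -> List[Dict]:
    """Return at most `limit` records starting at `offset`."""
    return _RECORDS[offset:offset + limit]


all_records = list(fetch_all(fake_fetch_page))

# Number of requests needed for the fake dataset (ceiling division).
request_count = -(-len(_RECORDS) // PAGE_SIZE)
```

With these assumed numbers, a dataset of just over a thousand records already needs six separate requests; a bulk download file would need one.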
The Da Vinci Code syndrome. The data are all available in an open, structured, and standardized format. It seems that we have finally succeeded, but then we discover that some of the key codes needed to solve the puzzle are missing: for example, in a contracts database, the company code is missing and we only have the company name. As each public authority may have written the name of the company differently, it becomes very complicated to cross-reference the data and produce analytical reports, for example to analyze all the contracts awarded to one company by various administrations. Not even a genius like Da Vinci could manage it.
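A small Python sketch may help show why the missing code matters. The records, the company name variants, the `company_code` values, and the `total_by` helper below are all invented for illustration; they are not taken from any real contracts dataset.

```python
from collections import defaultdict

# Hypothetical contract records: the same company, written three different ways.
contracts = [
    {"authority": "Town A", "company_name": "Obres i Serveis Exemple, S.L.",
     "company_code": "X0000001", "amount": 12000},
    {"authority": "Town B", "company_name": "OBRES I SERVEIS EXEMPLE SL",
     "company_code": "X0000001", "amount": 8000},
    {"authority": "Town C", "company_name": "Obres i Serveis Exemple",
     "company_code": "X0000001", "amount": 5000},
]


def total_by(records, key):
    """Sum contract amounts grouped by the given field."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["amount"]
    return dict(totals)


by_name = total_by(contracts, "company_name")  # three separate "companies"
by_code = total_by(contracts, "company_code")  # one company, one total
```

Grouping by name splits the company into three unrelated entries, while grouping by the code gives a single total. When the code is not published, the reuser is left guessing which spellings refer to the same company.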
“Others have it bigger” syndrome. Excuse the foul language, but it is a very graphic expression. We eventually manage to publish a dataset containing, for example, all the budget information of all the local administrations of Catalonia in an open, interoperable, standardized format, with all the key codes, etc. But then an alleged expert who compiles open data rankings shows up and disregards it because it is only a single dataset, holding up as examples of good practice the authorities that have published hundreds of datasets. Apparently, the more the better. When we analyze these supposed good practices, we see that there are open data portals with one dataset for each entity, for each year, and for one very specific concept (for example, the expenditure budget by one of the three concepts of Spanish public accounting). Well, if we used this criterion, a single dataset with the budgets of all the local governments of Catalonia would become at least 30,000 datasets: (1,000+ local entities) x (10 years of budget history) x (3 concepts). But which is more useful if you want to do a comparative analysis by year or between local authorities? I think the simplest is the best.
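The arithmetic, and the practical difference between the two approaches, can be sketched in a few lines of Python. The entity names, years, concept labels, and amounts are made up for illustration, and `compare` is a hypothetical helper, not part of any real portal.

```python
# Figures quoted in the text: 1,000+ entities, 10 years, 3 concepts.
ENTITIES, YEARS, CONCEPTS = 1000, 10, 3
fragmented_dataset_count = ENTITIES * YEARS * CONCEPTS  # 30,000 small datasets

# One consolidated dataset: every row carries entity, year, and concept.
budget = [
    {"entity": "Town A", "year": 2016, "concept": "concept_1", "amount": 100},
    {"entity": "Town B", "year": 2016, "concept": "concept_1", "amount": 250},
    {"entity": "Town A", "year": 2017, "concept": "concept_1", "amount": 120},
    {"entity": "Town B", "year": 2017, "concept": "concept_2", "amount": 90},
]


def compare(rows, year, concept):
    """Compare entities for one year and one concept with a single filter."""
    return {
        row["entity"]: row["amount"]
        for row in rows
        if row["year"] == year and row["concept"] == concept
    }


comparison_2016 = compare(budget, 2016, "concept_1")
```

With one consolidated dataset, a comparison between authorities is a single filter. With the fragmented layout, the same comparison would first require locating, downloading, and merging one file per entity.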
Despite the difficulties, I believe that an excellent job has been done, as we have managed to publish 35 sets of open data, with consolidated information for all the local authorities of Catalonia, although certainly much remains to be done. As Confucius said, “the man who moves a mountain begins by carrying away small stones”. There we are.
Note: Thanks to Josep Matas