Exploring €1.3 trillion in public contracts with graph visualization
The European Union is giving free access to detailed information about public purchasing contracts. That data describes which European institutions are spending money, for what and who benefits from it. We are going to explore the network of public institutions and suppliers and look for interesting patterns with graph visualization.
Every year, more than €1.3 trillion in contracts are awarded by public entities in Europe. In an effort to make these public contracts more transparent, the European Union has decided to make the tenders announcements public. The information can be found online through the EU’s open data portal. OpenTED, a non profit organization, has gone a step further and made the data available in a CSV format.
Public contracts are complex though. It involves at least one commercial entity which is awarded a contract by a public authority. The public entity may be acting as a “delegate” for another public entity. The contract can be disputed in a certain jurisdiction of appeal.
We have multiple entities and relationships. What this means is that the tenders data describe a graph or network. We are going to explore the tenders graph with Neo4j and Linkurious.
We will focus on the 2015 tenders. There are 73,269 tenders in one single CSV file with 45 columns.
We decided to model the graph in the following way:
The graph model above highlights the relationships between the contracts, appeal bodies, operators, delegates and authorities in our data.
To put it in a Neo4j database, we wrote a script that you can see here.
This script takes the 2015 tenders data and turn it into a Neo4j graph database with 161,541 nodes and 536,936 edges. Now that it’s in Neo4j, we can search, explore and visualize it with Linkurious. It’s time to start asking questions to the data!
As a first step, let’s try to identify the big public contracts which happened in Europe in 2015 and what organizations they involved. In order to get that answer, we’ll use Cypher (the Neo4j query language) within Linkurious.
Here’s how to find the 10 biggest contracts and the public authorities and commercial providers they involved.
// The top 10 biggest contracts and who they involve
MATCH (b:CONTRACT)
WHERE b.contract_initial_value_cost_eur <> ‘null’
WITH b
ORDER BY b.contract_initial_value_cost_eur DESC
LIMIT 5
MATCH (a:AUTHORITY)-[:IS_AUTHORITY_OF]->(b)<-[:IS_OPERATOR_OF]-(c:OPERATOR)
RETURN a, c, b
The result is the following graph.
We can see for example that Ministry of Defence has awarded a large contract to Babcock International Group Plc.
The graph structure of the data also allows us to spot missed opportunities. Let’s take for example KPMG. It’s one of the biggest recipient of public contracts within our dataset. What other opportunities the company could have been awarded?
To answer that question, we can identify which of KPMG’s customers awarded contracts to its competitors.
Let’s identify the biggest missed opportunities of KPMG:
// KPMG’s biggest missed opportunities
MATCH (a:OPERATOR {official_name: ‘KPMG’})-[:IS_OPERATOR_OF]->(b:CONTRACT)<-[:IS_AUTHORITY_OF]-(customer:AUTHORITY)-[:IS_AUTHORITY_OF]->(lost_opportunity:CONTRACT)<-[:IS_OPERATOR_OF]-(competitor:OPERATOR)
WHERE NOT a-[:IS_OPERATOR_OF]->competitor
RETURN a,b,customer,lost_opportunity,competitor
We can visualize the result in Linkurious:
KPMG is highlighted in red. It is connected to 10 contracts by 9 customers. These customers have awarded 246 other contracts to 181 firms.
This visualization could be used by KPMG to identify its “unfaithful” customers and its competitors. Specifically we may want to filter the visualization to focus on the contracts for similar services to the ones KPMG offer. To do that we will use the CPV code, a procurement taxonomy of products and services.
Here is a visualization filtered to only display contracts with similar CPV codes as KMPG’s contracts:
If we zoom in, we can see for example that GCS Uniha, one of KMPG’s customer, has also awarded contracts to some of KPMG’s competitor (Ernst and Young, PwC and Deloitte).
Finally, we can use Linkurious to visualize a particular economic sector. Let’s focus for example on IT spending in France. Who are the biggest spenders in that sector? Which companies are capturing these contracts? Finally what are the relationships between all these organizations?
Using a Cypher query, we can identify the customers and suppliers linked to IT contracts (which all have a CPV code starting with “48”):
//The French public IT market
MATCH (a:AUTHORITY)-[:IS_AUTHORITY_OF]->(b:CONTRACT)<-[:IS_OPERATOR_OF]-(c:OPERATOR)
WHERE b.contract_cpv_code STARTS WITH “48” AND a.country = ‘FR’
RETURN a,b,c
We can visualize the result directly in Linkurious.
If we zoom in, we can see that Conseil Général des Bouches du Rhône has awarded 9 IT contracts in 2015, mostly to Dalkia and Idex énérgies.
We have dived into €1.3 trillion of public spending using Neo4j and Linkurious. We were able to identify some key actors of the public contracts market and map interesting ecosystems.
A spotlight on graph technology directly in your inbox.