Bird: A Silver Eye; Open source procurement tools often take ornithological names

Collating Comparable Contracting Data from Multiple Publishers

National Treasury South Africa,


GDS’s Global Digital Marketplace Programme (GDMP) wanted to reduce the barriers to publishing good quality procurement data openly using the Open Contracting Data Standard (OCDS). Less than 3% of the world’s public procurement data is published openly, and even less is published in OCDS.

At the time, GDMP was working with South Africa’s National Treasury. The National Treasury is responsible for managing procurement data for South African government agencies. It wanted to create a new data pipeline to collect spending data from those agencies, for example to track spending on Covid-19 related procurements.

GDMP therefore worked with Spend Network, a UK digital SME developer, to develop Silvereye, a new tool that makes it easier to submit and aggregate OCDS data, and to monitor what data has been submitted and by whom.

This was implemented with South Africa’s National Treasury from 2020 to 2021, by a team including the National Treasury, a local South African technology firm, and GDMP technical experts. The team used Silvereye to create a data pipeline from government agencies to the National Treasury’s database, testing the user interface with the procurement officers who would be providing the data.

At the highest level, Silvereye is an open-source ‘OCDS aggregation toolkit’. It allows users to aggregate procurement data from multiple sources and to monitor both the completeness and quality of the data submitted. Users can quickly and easily create OCDS data by uploading spreadsheets or by submitting data automatically via an API. The tool includes the ability to submit tender notices, contract notices and transaction data. Central buying departments can monitor submissions to identify which departments are or are not publishing and where data is falling below critical quality standards, so they can take the necessary action to improve publication.

Accessing and contributing to Silvereye

Silvereye is built on an open source technology stack to allow anyone to use or contribute to it. A demo deployment of Silvereye can be accessed on this link:

https://ocds-silvereye.herokuapp.com/publisher-hub/

Silvereye’s source code is open source and hosted on GitHub, so anyone can fork it and build on it. The source code and its full documentation, including deployment instructions, can be found at this link: https://github.com/spendnetwork/silvereye

Lastly, Silvereye is viewable on OCP Helpdesk’s tools Airtable Gallery.

Technical report on Silvereye

Below is a summary of Silvereye’s recommended technical stack as well as a diagram showing how each technology fits in:

Data schema:

  • OCDS schema: https://standard.open-contracting.org/latest/en/getting_started/

Datastore and processing:

  • Python 3 (required)
  • PostgreSQL 11+ (recommended: provides performance benefits)

Web framework and hosting:

  • Django (required)
  • Heroku for app hosting (recommended: provides quick start deployment)
  • AWS S3 Storage (recommended: provides cloud storage benefits)

Deploying Silvereye

Upon submission to Silvereye, data is uploaded to a Django application, which validates the file and then, if valid, submits the data to the PostgreSQL database in compliant OCDS (JSON) format. The data is then restructured in native, normalised data tables, making it accessible to the Django application for analysis.

Figure showing Silvereye data processing / structures

Data submission

Our discovery research showed us three things:

  • While central ministries had the technical capability to process data, local authorities often lacked these same skills when it came to producing and submitting data.
  • A lack of resource capacity in central ministries was being exacerbated by the overly manual task of processing the poorly or inconsistently formatted data that was submitted.
  • Local and regional authorities and smaller ministries had their own problems with resource capacity, which were worsened by the lack of process around data submission.

The lack of capacity at the local level meant that submissions were slow or even omitted entirely by local authorities. The lack of capability meant that these submissions were inconsistent, incomplete, undigitised or semi-digitised (screenshots and PDF scans were not uncommon). At a central level, this made it difficult and labour intensive to work out who had and had not submitted, and to understand the quality of those submissions and the completeness of the overall data set.

The quality of technical systems was also inconsistent, with e-procurement systems available to central ministries but no systems available to sub-national authorities and smaller ministries. Even when multiple systems were available, such as in South Africa, staff were being required to export their data to a new schema for central aggregation.

Finally, in some cases, internet infrastructure can be limited, which has a knock-on effect on data submission.

There is also frequently a requirement for public entities to submit extra data (for instance data on transactions) and data from multiple sources. The user need identified in transaction data is centred on two use cases: contract management and anti-corruption (when procurement data is linked to personnel and/or beneficial ownership data). For contract management, departments can identify whether spend levels were in line with the overall contract and therefore inform managers when better control needs to be exercised. For contract management and anti-corruption, transaction data offers visibility on who is authorising spend on the buyer’s side and who is receiving spend on the supplier’s side.

Given these issues, the team aimed to streamline the process of publishing and aggregating OCDS data regardless of the technology available to the publisher. The goal was to ensure that users with only a knowledge of spreadsheets could become publishers of valid OCDS. To do so, users fill in a streamlined, simple-to-complete CSV template and upload it using the Silvereye web interface.

The team also incorporated OCDS spend into this submission process, by creating a dedicated CSV template for spend, and creating a dedicated JSON that is generated on upload.

The fields needed for the three datasets are as follows. These fields are configurable to better accommodate users’ existing data fields. For example, the fields required by a given administration might be changed, and individual fields may be renamed in order to be pertinent to local context:

Table with field list for ocds mapping

Silvereye’s upload page

Users can also use a URL to direct Silvereye to a CSV data source. In this way Silvereye has a mechanism to communicate with, and ingest data from, data APIs. Silvereye will then validate this source and present a validation report (see Data Validation below). If the data is compliant to Silvereye’s ingestion parameters, Silvereye will ingest the data into its database.
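
As a rough illustration of the URL route described above, the sketch below downloads a CSV and runs a basic header check before anything is ingested. It is not taken from the Silvereye codebase: the function names and the required headers are assumptions made for this example, and the real checks are driven by Silvereye’s mapping file and validation tooling.

```python
import csv
import io
import urllib.request

# Hypothetical minimum headers for a tender template. The real list comes
# from Silvereye's mapping file, not from this constant.
REQUIRED_HEADERS = {"Buyer Name", "Tender Title", "Tender Deadline Date"}


def fetch_csv_text(url, timeout=30):
    """Download a CSV document from a URL and return it as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # utf-8-sig strips the byte order mark that spreadsheet exports often add.
        return response.read().decode("utf-8-sig")


def check_csv_text(text):
    """Parse CSV text and report missing headers before any ingestion."""
    reader = csv.DictReader(io.StringIO(text))
    headers = set(reader.fieldnames or [])
    rows = list(reader)
    missing = sorted(REQUIRED_HEADERS - headers)
    return {
        "valid": not missing,
        "missing_headers": missing,
        "row_count": len(rows),
    }
```

A caller would pass the result of fetch_csv_text(url) to check_csv_text and only proceed to ingestion when the report is valid.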

Silvereye’s url upload tool

Through simple CSV uploads, Silvereye can provide the ‘missing link’ between regional authorities and the systems of central administrators, who are likely to want to monitor procurement data nationally. Because users of Silvereye only need a knowledge of spreadsheet software and the data they are submitting, there is less need for expensive software, infrastructure, and technical skills.

Data ingestion

Silvereye is optimised to convert a rudimentary spreadsheet into a format that is then converted to compliant Open Contracting Data Standard records. Silvereye currently provides the ability to submit three types of record: tender notices, contract notices and transactions. Silvereye includes the ability to amend the data fields required by anyone submitting data by simply updating a mapping file. This mapping file dynamically creates the CSV templates that can be downloaded and ensures compliance with OCDS.
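
To make the idea of a mapping file concrete, here is a minimal sketch of how one lookup table could drive both the downloadable template and the later conversion to OCDS paths. The mapping columns (notice_type, csv_header, ocds_path, required) and the rows are assumptions for illustration only; Silvereye’s actual mapping file may be laid out differently.

```python
import csv
import io

# Hypothetical mapping file contents. The column names and rows are
# illustrative and are not copied from Silvereye's own mapping file.
MAPPING_CSV = """notice_type,csv_header,ocds_path,required
tender,Buyer Name,buyer/name,yes
tender,Tender Title,tender/title,yes
tender,Tender Deadline Date,tender/tenderPeriod/endDate,yes
tender,Tender Description,tender/description,no
award,Supplier Name,awards/0/suppliers/0/name,yes
"""


def load_mapping(text):
    """Read the mapping file into a list of dictionaries, one per field."""
    return list(csv.DictReader(io.StringIO(text)))


def template_headers(mapping, notice_type):
    """Return the user-friendly headers for one type of notice, in file order."""
    return [row["csv_header"] for row in mapping if row["notice_type"] == notice_type]


def render_template(mapping, notice_type):
    """Render an empty CSV template (header row only) for one notice type."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(template_headers(mapping, notice_type))
    return buffer.getvalue()
```

Under this layout, renaming a field for a local context means editing one row of the mapping, and the template picks up the change the next time it is generated.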

Below is a diagram showing how data can move through Cove and into the database before being presented in Silvereye’s web interface. Additional information can be added to the flow of data into the Silvereye database. For instance, Silvereye records the details of the organisation submitting data and the individuals who are authorised to submit data (see ‘Data augmentation’ below). Additional data, such as HR data, could also be added to the Silvereye database if required.

Silvereye’s data pathway

Data validation

At the point of ingestion, the data is validated using the Cove tool, developed by the Open Contracting Partnership. In line with the vision of a simple to use tool, the team streamlined the reporting of the validation tool, so errors and exceptions for data submission are reported back in a more user-friendly way and users see a less detailed set of results. Invalid files are rejected with advice on how to submit a successful file.

At the time of mapping data to OCDS each record is validated in the following ways:

  • The CSV file goes through a mapping from the user-friendly field headers to the OCDS JSON paths needed to convert the data into OCDS JSON.
  • Silvereye checks whether the CSV constitutes a viable record. A minimum set of fields must be present in a record for it to be a viable record. For instance, buyer name, a title and a tender deadline date are required for tenders.
  • The record must include viable data. For example, the deadline date should be later than the current date.
  • Every record is then checked for OCDS compliance.
  • Mappings can be customised and added to. At present, Silvereye operates a 1 tender to 1 award approach, but this can be expanded to multiple awards in the event of a large contract being let to multiple winners.
  • Silvereye can record IDs as part of submission of simple sheets. This means that the data created with this tool can be compatible with Beneficial Ownership data submitted through Bluetail.

Image: An example of successful and validated submission to Silvereye.
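
The mapping and viability checks listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not Silvereye’s implementation: the header names, the function names and the error messages are hypothetical, and the final OCDS compliance check performed by Cove is not reproduced here. A complete OCDS release also needs further fields (for example an ocid) that this sketch leaves out.

```python
from datetime import date

# Hypothetical mapping from user-friendly headers to OCDS JSON paths.
HEADER_TO_PATH = {
    "Buyer Name": "buyer/name",
    "Tender Title": "tender/title",
    "Tender Deadline Date": "tender/tenderPeriod/endDate",
}


def row_to_ocds(row):
    """Build a nested dictionary from one CSV row by walking each path."""
    release = {}
    for header, path in HEADER_TO_PATH.items():
        value = (row.get(header) or "").strip()
        if not value:
            continue
        node = release
        *parents, leaf = path.split("/")
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return release


def validate_row(row, today):
    """Return a list of human-readable problems; an empty list means viable."""
    errors = []
    # Viable record: every field in the minimum set must be present.
    for header in HEADER_TO_PATH:
        if not (row.get(header) or "").strip():
            errors.append(f"Missing required field: {header}")
    # Viable data: the deadline must parse and must be in the future.
    deadline_text = (row.get("Tender Deadline Date") or "").strip()
    if deadline_text:
        try:
            deadline = date.fromisoformat(deadline_text)
        except ValueError:
            errors.append("Tender Deadline Date must be a valid ISO date")
        else:
            if deadline <= today:
                errors.append("Tender Deadline Date must be later than today")
    return errors
```

Passing today in as an argument, rather than reading the clock inside the function, keeps the check easy to test.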

Once submitted, a compliant file is converted into OCDS JSON and saved to an Amazon S3 data store. The data is then converted from JSON into a relational database, so that it can easily be made accessible to anyone who can write SQL, business intelligence tools and the Silvereye website.
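
The step from OCDS JSON to relational tables might look something like the sketch below. It uses Python’s built-in sqlite3 module purely as a stand-in so the example runs anywhere; Silvereye itself uses PostgreSQL, and the table and column names here are assumptions rather than Silvereye’s real schema.

```python
import json
import sqlite3


def store_release(conn, release_json):
    """Keep the raw OCDS JSON and also flatten key fields into a table."""
    release = json.loads(release_json)
    # Hypothetical tables: one for the raw document, one for flattened tenders.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_release (ocid TEXT PRIMARY KEY, body TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tender "
        "(ocid TEXT PRIMARY KEY, buyer_name TEXT, title TEXT, deadline TEXT)"
    )
    tender = release.get("tender", {})
    conn.execute(
        "INSERT INTO raw_release VALUES (?, ?)", (release["ocid"], release_json)
    )
    conn.execute(
        "INSERT INTO tender VALUES (?, ?, ?, ?)",
        (
            release["ocid"],
            release.get("buyer", {}).get("name"),
            tender.get("title"),
            tender.get("tenderPeriod", {}).get("endDate"),
        ),
    )
    conn.commit()
```

Once the data is in this shape, a question such as "which tenders has this buyer published?" becomes a single SQL query over the tender table.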

Data augmentation

Silvereye augments data received in CSV format as part of mapping to make it compliant with OCDS. The OCDS scheme requires ‘Buyer’ to be listed in two locations in the flattened version. However, Silvereye does not force people to put the same name twice as doing so would make errors more likely. Instead, Silvereye duplicates the relevant columns automatically.

Silvereye also automatically adds a date stamp at the time of submission and validation. This overcomes faulty date entry and invalid or inconsistent date formats, and allows the data to be analysed in a meaningful way.
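
A minimal sketch of these two augmentation steps is shown below. It assumes that the two buyer locations are the flattened buyer/name and parties/0/name columns and that the stamp is written to a date column; those column names, like the function name, are assumptions for illustration and may not match what Silvereye writes.

```python
from datetime import datetime, timezone


def augment_row(row, submitted_at=None):
    """Copy the buyer into both assumed locations and stamp the submission."""
    augmented = dict(row)
    # The user types the buyer name once; it is duplicated here rather than
    # asking for it twice, which would invite mismatched entries.
    augmented["buyer/name"] = row["Buyer Name"]
    augmented["parties/0/name"] = row["Buyer Name"]
    augmented["parties/0/roles"] = "buyer"
    # The timestamp is generated by the system, not typed by the user.
    stamp = submitted_at or datetime.now(timezone.utc)
    augmented["date"] = stamp.isoformat()
    return augmented
```

Because the date comes from the server clock in a single format, later analysis does not depend on how each publisher happens to write dates.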

Using the data in Silvereye

Silvereye comes with concise metrics and measurements that allow central procurement agencies to monitor the publishing of different public bodies and to determine whether a publisher should have published data recently and the nature of the data being published.

Agencies can identify through top level metrics on counts and coloured trends how much data is flowing through into Silvereye and what the differences are in publishing behaviours over time. The top level metrics also measure field coverage.
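
As an illustration of what such metrics involve, the sketch below computes a row count, the average number of fields in use and per-field coverage for one submission. The function name and the exact definitions are assumptions made for this example; Silvereye’s own calculations may differ.

```python
def submission_metrics(rows):
    """Summarise one submission: volume, average fields used, field coverage."""
    if not rows:
        return {"row_count": 0, "average_fields_in_use": 0.0, "field_coverage": {}}
    fields = sorted({name for row in rows for name in row})
    # A field counts as "in use" on a row when it holds a non-blank value.
    filled = {
        name: sum(1 for row in rows if (row.get(name) or "").strip())
        for name in fields
    }
    return {
        "row_count": len(rows),
        "average_fields_in_use": sum(filled.values()) / len(rows),
        "field_coverage": {name: count / len(rows) for name, count in filled.items()},
    }
```

Coverage is expressed here as a fraction between 0 and 1 per field, which makes it straightforward to compare publishers of very different sizes.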

page with high level statistics

Image: An example of the highest level dashboard for Silvereye (metrics are based on sample data).

Silvereye offers the ability to search and filter publishers based on agency type so that central procurement agencies can monitor publishing rates and behaviours. Counts and colour codes based on the last submission are also available and the latter can be used to sort the view.

A demo list of UK public sector organisations who could publish OCDS data

Image: An example list of publishers on Silvereye (metrics are based on sample data)

Lastly, central agencies can drill down into individual publishers and identify publisher level metrics. These metrics are the same as the top level metrics. Identifiers and contact details can be applied and added.

Low level statistics for a single demo publisher

Image: An example drill down of a publisher on Silvereye (note metrics are based on sample data).

Visualising OCDS data

The team’s original intention was to add a dashboard containing a suite of analytics tools and visualisations to Silvereye. To that end, they explored the use of open source tools (specifically, Apache Superset) with the intention of creating a dashboard that would assist a central procurement agency or a Finance ministry to visualise and analyse data.

To inform this work, the team conducted user research and held discussions with the Racial Equality team at the Cabinet Office. From this, they learnt that conflicting and often divergent use cases meant that dashboards were either fit for purpose only for a particular segment of users or risked suffering from significant bloat in features, charts and data. It is really hard to get a single tool to effectively cover all of the user journeys / needs that different user groups might want.

The team also learnt that, regardless of the visualisations on offer, the user nonetheless extracts the data and then spends time trying different visualisations to find their own least worst visual solution for each facet of the dataset. The challenge of communicating information therefore ends up being subsumed into the practicalities of using the tool. These conclusions led the team to drop the idea of a dashboard altogether.

The team nonetheless wanted to give users of Silvereye the means of setting up analytics tools. There is value in commoditised tools, as they make creating and testing visualisations quick. Analysts in particular would find these valuable for creating and sharing charts as part of their reporting and research. Therefore, the team created a series of open source charts and notebooks that can take data extracted from Silvereye and automatically create visualisations, available here: https://github.com/spendnetwork/charts

To fulfil the need for analysis of procurement data, creation of visualisations and to expedite the creation of a contract register (for instance for Covid-19), the team built the functionality to allow data downloads from Silvereye. This means that data extracted from Silvereye can be aggregated and analysed, either through the above visualisations or through the user’s own creations. Data is available in tabular CSV format, with all headers and the data within each row.

Image: Data download page and buttons in Silvereye

This allows users to analyse structured data submissions. Example analysis use cases include efficiencies: does one supplier cost more in a category than another, does one department spend more on a category than another? Anti-corruption is another example: identifying spend or contracting with a contentious supplier to supplement an investigation.
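
For instance, the question of whether one department spends more on a category than another can be answered from a downloaded CSV with a few lines of code. The sketch below is hedged accordingly: the column names buyer_name, category and amount are hypothetical and would need to be replaced with the headers in the actual Silvereye download.

```python
import csv
import io
from collections import defaultdict


def spend_by_buyer_and_category(csv_text):
    """Total the spend for each (buyer, category) pair in a CSV export."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Hypothetical column names; adjust to match the real export.
        totals[(row["buyer_name"], row["category"])] += float(row["amount"])
    return dict(totals)
```

The same grouping, keyed on supplier rather than buyer, would support the supplier cost comparison mentioned above.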

Data extracted or downloaded from Silvereye can be used to create a contracting register in which all the contracts across a country’s public sector are available from one source that can be analysed. This is useful for identifying buying efficiencies and trends in buying behaviours, especially around new categories such as technology buying at a local level. Covid-19 has further highlighted the need for a contracting register outlining buyers, suppliers and categories.

Considerations when implementing Silvereye

  1. Users submitting data need to understand open contracting and their responsibilities when submitting data. Clearly, mandates and instructions about what data should be submitted will be established at a local level, so instructions on how and what to submit to Silvereye need to be created for each context.
  2. By default, anyone with access to the tool can submit data. Although a username and password login feature and an email login feature have both been trialled successfully, neither has been incorporated into the Silvereye codebase, because we found that the requirement and the chosen mechanism will be context specific. Extending the default Django username and password authentication used for the admin to cover the submission view should not be beyond the competency of a modern web application developer.
    While the exact appropriate authentication mechanism will vary according to context, detailed notes on the approach implemented with one national government are published online.
  3. Mapping configuration is currently deployed using a CSV to provide a lookup reference. This facility might benefit from being deployed as a live database table that has been exposed to the Django Admin facility, so that users can configure these mappings within the application.
  4. The metrics provided by Silvereye are relatively limited, covering the volume of rows, the average number of fields in use and the date of submission. Additional metrics covering data quality (for example, number of null contract values or outliers to contract values) could be useful to some users.
  5. Following our research into the value of ‘dashboards’ to users, there is no visualisation beyond metrics and trend arrows in place. Given that charts are sometimes seen as a proxy for sophistication in software, it may be desirable to deploy some of the visualisations we have open sourced in the tool.
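
The additional data quality metrics suggested in item 4 could start from something as simple as the sketch below, which counts missing contract values and flags unusually large ones. The rule used here (a value more than ten times the median is an outlier) is an arbitrary assumption chosen for illustration, not a recommendation from the project, and the function name is hypothetical.

```python
import statistics


def contract_value_quality(values, outlier_factor=10):
    """Count missing contract values and flag values far above the median."""
    null_count = 0
    numeric = []
    for value in values:
        if value is None or str(value).strip() == "":
            null_count += 1
        else:
            numeric.append(float(value))
    outliers = []
    if numeric:
        median = statistics.median(numeric)
        # Assumed rule of thumb: flag anything above outlier_factor x median.
        outliers = [v for v in numeric if median > 0 and v > outlier_factor * median]
    return {"null_count": null_count, "outliers": outliers}
```

A real deployment would probably tune the threshold per category, since typical contract values vary widely between, say, stationery and construction.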

Conclusion

Silvereye fills an important gap in the OCDS publishing lifecycle, both as a tool for placing OCDS data into a queryable database from a low-tech source and as a mechanism for aggregating data from multiple sources. Silvereye overcomes the need for multiple publishers to implement and maintain high-cost, sophisticated software such as tender portals when the primary focus is to publish and aggregate data.

Silvereye is highly configurable, and can be used as a contracts register as well as a data quality monitoring platform. By ensuring that only compliant data is submitted to the database, Silvereye provides a robust, consistent and reliable data pipeline for administrators who can use the underlying data for either further publishing or for their own transparency needs.

Silvereye also uses the same underlying data structures as Bluetail, meaning that it is possible to implement both projects on the same database.

We hope that Silvereye will be used extensively by Governments as they seek to expand their publishing of OCDS data.