CKAN Gateway
Your data catalogue as a broker: CKAN discovers data wherever it lives and proxies access to it.
The Catalogue That Brokers Access
CKAN is one of the world’s most widely deployed open data platforms. It catalogues datasets: metadata, descriptions, download links. But it doesn’t control access to the data itself. The catalogue knows where data lives; it doesn’t broker the connection. If the data requires authentication, rate-limiting, usage tracking, or licence enforcement, CKAN can’t help. You download the file, and whatever happens next is between you and the data.
CKAN Gateway changes that. It extends CKAN from a catalogue into a data broker. Instead of pointing users to a download link, it proxies the request through a policy layer. Who is asking? What agreement governs their access? Which fields are they permitted to see? The gateway enforces these rules and returns only the data the consumer is entitled to — nothing more.
The catalogue already knows where data lives. CKAN Gateway gives it the authority to manage who gets in.
A Librarian Who Checks Your Card
Imagine a library where the catalogue tells you which shelf a book is on, but anyone can take any book. Now imagine the librarian starts checking your library card before handing you a book — making sure you’re allowed to borrow it, noting down that you took it, and only giving you the chapters you’re allowed to read. That’s CKAN Gateway: the same catalogue, but now with a librarian in the loop.
From Catalogue to Broker
1. Discover
CKAN Gateway uses standard CKAN catalogue functionality to discover data sources. Datasets from CKAN portals, DCAT-AP feeds, and API endpoints are all catalogued with their metadata.
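As a rough illustration of the discovery step, the sketch below builds a query against CKAN's standard Action API (`package_search`) and flattens a response into one entry per resource. The portal URL, the helper names, and the sample response are hypothetical; the sample is hand-written in the shape `package_search` returns, and no network call is made.

```python
from urllib.parse import urlencode


def build_search_url(portal_url: str, query: str, rows: int = 10) -> str:
    """Build a package_search URL for a CKAN portal's Action API."""
    base = portal_url.rstrip("/")
    params = urlencode({"q": query, "rows": rows})
    return f"{base}/api/3/action/package_search?{params}"


def extract_sources(response: dict) -> list:
    """Flatten a package_search response into one entry per resource."""
    if not response.get("success"):
        return []
    sources = []
    for dataset in response["result"]["results"]:
        for resource in dataset.get("resources", []):
            sources.append({
                "dataset": dataset["name"],
                "title": dataset.get("title", ""),
                "url": resource.get("url", ""),
                "format": resource.get("format", ""),
            })
    return sources


# Hypothetical portal and hand-written sample response (not real data).
search_url = build_search_url("https://catalogue.example/", "company register", rows=5)

sample_response = {
    "success": True,
    "result": {
        "count": 1,
        "results": [{
            "name": "company-register",
            "title": "Company Register",
            "resources": [
                {"url": "https://source.example/companies.csv", "format": "CSV"},
                {"url": "https://source.example/api/companies", "format": "JSON"},
            ],
        }],
    },
}

sources = extract_sources(sample_response)
```

In a gateway, the resource URLs collected here would be the targets that later requests are proxied to, rather than links handed to the consumer.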
2. Agree
Before a consumer can access data, a data sharing agreement must be in place. The agreement specifies: who (consumer organisation), what (which dataset/registry), how (permitted fields, rate limits), and why (stated purpose). Agreements follow ODRL (Open Digital Rights Language) for machine-readable policy expression.
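One way such an agreement might be expressed is as an ODRL `Agreement` in JSON-LD, sketched here as a Python dictionary. The `purpose` left operand and the `eq`, `isAnyOf`, and `lteq` operators come from the ODRL vocabulary. The `ex:` terms for permitted fields and rate limits are assumptions, since core ODRL has no such operands and a real deployment would need a profile that defines them. All identifiers and URLs are made up.

```python
import json

ODRL_CONTEXT = "http://www.w3.org/ns/odrl.jsonld"


def build_agreement(uid, provider, consumer, dataset, purpose, fields, requests_per_hour):
    """Assemble an ODRL Agreement covering who, what, how, and why."""
    return {
        # "ex" is a hypothetical namespace for gateway-specific operands.
        "@context": [ODRL_CONTEXT, {"ex": "https://gateway.example/ns#"}],
        "@type": "Agreement",
        "uid": uid,
        "permission": [{
            "target": dataset,      # what
            "assigner": provider,   # data provider
            "assignee": consumer,   # who
            "action": "use",
            "constraint": [
                # why: stated purpose
                {"leftOperand": "purpose", "operator": "eq", "rightOperand": purpose},
                # how: permitted fields (hypothetical operand)
                {"leftOperand": "ex:field", "operator": "isAnyOf", "rightOperand": fields},
                # how: rate limit (hypothetical operand)
                {"leftOperand": "ex:requestsPerHour", "operator": "lteq",
                 "rightOperand": requests_per_hour},
            ],
        }],
    }


def constraint_value(agreement, left_operand):
    """Return the right operand of the first constraint with this left operand."""
    for permission in agreement["permission"]:
        for constraint in permission.get("constraint", []):
            if constraint["leftOperand"] == left_operand:
                return constraint["rightOperand"]
    return None


agreement = build_agreement(
    uid="https://gateway.example/agreements/42",
    provider="https://provider.example/org",
    consumer="https://consumer.example/org",
    dataset="https://catalogue.example/dataset/company-register",
    purpose="regulatory-compliance",
    fields=["name", "status"],
    requests_per_hour=100,
)

agreement_json = json.dumps(agreement, indent=2)
```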
3. Proxy
When a consumer requests data, the gateway checks the active agreement, verifies the purpose matches, enforces rate limits, fetches the data from the source, filters it to only the permitted fields, and returns the result. The consumer never talks directly to the data source.
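The sequence of checks described above can be sketched as a single function. This is a minimal illustration, not the gateway's design: the `Agreement` class, `handle_request`, and the sample records are hypothetical, and the source fetch and rate-limit check are passed in as plain callables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agreement:
    consumer: str
    dataset: str
    purpose: str
    permitted_fields: frozenset
    active: bool = True


class AccessDenied(Exception):
    """Raised when a request fails any policy check."""


def handle_request(agreements, consumer, dataset, purpose, fetch, within_rate_limit):
    """Check the agreement, purpose, and rate limit, then fetch and filter."""
    agreement = agreements.get((consumer, dataset))
    if agreement is None or not agreement.active:
        raise AccessDenied("no active agreement")
    if purpose != agreement.purpose:
        raise AccessDenied("purpose does not match agreement")
    if not within_rate_limit(consumer, dataset):
        raise AccessDenied("rate limit exceeded")
    # Only the gateway talks to the source; the consumer never does.
    records = fetch(dataset)
    return [
        {key: value for key, value in record.items() if key in agreement.permitted_fields}
        for record in records
    ]


def fetch_from_source(dataset):
    """Stand-in for a real data source (made-up record)."""
    return [{"name": "Acme Ltd", "status": "active", "director_address": "1 Example Street"}]


agreements = {
    ("dept-a", "company-register"): Agreement(
        consumer="dept-a",
        dataset="company-register",
        purpose="regulatory-compliance",
        permitted_fields=frozenset({"name", "status"}),
    )
}

result = handle_request(
    agreements, "dept-a", "company-register", "regulatory-compliance",
    fetch_from_source, lambda consumer, dataset: True,
)
# result == [{"name": "Acme Ltd", "status": "active"}]
```

Filtering by an allow-list of permitted fields, rather than a deny-list of sensitive ones, means a field newly added at the source stays hidden until an agreement names it.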
4. Audit
Every access is logged: who requested what, when, which fields were returned, and under which agreement. The audit trail is immutable and supports GDPR accountability requirements.
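How immutability would be achieved is not specified here. One common approach, shown purely as an assumption, is an append-only log in which each entry carries a hash of the previous one, so that later alteration is detectable. Strictly speaking this gives tamper evidence rather than immutability. The `AuditLog` class and its field names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64


class AuditLog:
    """Append-only log where each entry is chained to the previous one."""

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(body: dict) -> str:
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def append(self, consumer, dataset, fields, agreement_id, timestamp=None):
        """Record one access: who, what, which fields, when, which agreement."""
        body = {
            "consumer": consumer,
            "dataset": dataset,
            "fields": sorted(fields),
            "agreement": agreement_id,
            "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
            "prev_hash": self._entries[-1]["hash"] if self._entries else GENESIS_HASH,
        }
        entry = dict(body, hash=self._digest(body))
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check that each entry links to the last."""
        prev_hash = GENESIS_HASH
        for entry in self._entries:
            body = {key: value for key, value in entry.items() if key != "hash"}
            if entry["prev_hash"] != prev_hash or self._digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.append("dept-a", "company-register", ["name", "status"], "agreement-42")
log.append("dept-a", "company-register", ["name"], "agreement-42")
intact = log.verify()  # True while no entry has been altered
```

Changing any field of an earlier entry changes its hash, which breaks the link stored in every entry after it.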
Controlled Data Sharing
Cross-agency data access
A government department needs company registration data from the CRO for regulatory purposes. Rather than sharing the data in bulk, the gateway mediates access per request. The department sees only the fields its agreement permits. Every query is logged for audit.
Researcher access to sensitive datasets
A university research group needs access to health statistics. Their agreement permits aggregated data but not individual records. The gateway enforces this automatically — no manual redaction needed.
Commercial data marketplace
A data provider wants to monetise access to premium datasets. The gateway handles authentication, usage tracking, and field-level filtering. The provider sets the policies; the gateway enforces them.
Under the Bonnet
Standards
DCAT-AP for catalogue metadata. ODRL for usage policies and data sharing agreements. SHACL for data shape validation. Dataspace Protocol (DSP) for interoperability with EU data spaces.
Architecture
CKAN catalogue extended with a proxy/gateway layer. Policy engine (Open Policy Agent) evaluates access requests against ODRL policies. Connectors translate between the gateway and heterogeneous data sources (REST APIs, SPARQL endpoints, file-based resources).
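The connector idea might look something like the following sketch, in which every source type implements the same `fetch` method and the gateway picks a connector by type. The `Connector` protocol, both connector classes, and the registry are hypothetical names. The REST connector takes an injected callable rather than making a real HTTP request, so the example runs offline.

```python
import csv
import io
from typing import Callable, Protocol


class Connector(Protocol):
    """Common interface the gateway would call, whatever the source type."""

    def fetch(self, resource: str) -> list: ...


class CsvConnector:
    """File-based resource: parses CSV text into a list of records."""

    def __init__(self, files: dict):
        self._files = files  # maps resource name to CSV text

    def fetch(self, resource: str) -> list:
        return [dict(row) for row in csv.DictReader(io.StringIO(self._files[resource]))]


class RestConnector:
    """REST API: delegates to an injected callable that returns parsed JSON."""

    def __init__(self, get_json: Callable):
        self._get_json = get_json

    def fetch(self, resource: str) -> list:
        return list(self._get_json(resource))


class ConnectorRegistry:
    """Looks up the connector registered for a source type."""

    def __init__(self):
        self._connectors = {}

    def register(self, source_type: str, connector: Connector) -> None:
        self._connectors[source_type] = connector

    def fetch(self, source_type: str, resource: str) -> list:
        if source_type not in self._connectors:
            raise KeyError(f"no connector registered for {source_type!r}")
        return self._connectors[source_type].fetch(resource)


registry = ConnectorRegistry()
registry.register("csv", CsvConnector({"companies.csv": "name,status\nAcme Ltd,active\n"}))
registry.register("rest", RestConnector(lambda path: [{"name": "Acme Ltd", "status": "active"}]))

csv_records = registry.fetch("csv", "companies.csv")
rest_records = registry.fetch("rest", "/companies")
# Both calls return the same record shape, so policy filtering is source-agnostic.
```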
Access control
Agreement-based: consumer must have an active data sharing agreement with the data provider. Field-level filtering: only agreed fields are returned. Rate limiting: per-agreement request caps. Purpose binding: stated purpose must match agreement terms.
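Per-agreement request caps could be enforced in several ways. The sketch below uses a simple fixed-window counter keyed by agreement. The algorithm choice, the `FixedWindowLimiter` class, and the numbers are assumptions for illustration; a fake clock is injected so the behaviour can be followed without waiting.

```python
import time


class FixedWindowLimiter:
    """Allows at most max_requests per window, counted separately per agreement."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._windows = {}  # agreement id -> (window start, request count)

    def allow(self, agreement_id: str) -> bool:
        now = self._clock()
        start, count = self._windows.get(agreement_id, (now, 0))
        if now - start >= self._window:
            # The previous window has elapsed, so start a fresh one.
            start, count = now, 0
        if count >= self._max:
            self._windows[agreement_id] = (start, count)
            return False
        self._windows[agreement_id] = (start, count + 1)
        return True


# Fake clock so the example is deterministic.
current_time = [0.0]
limiter = FixedWindowLimiter(max_requests=2, window_seconds=60.0,
                             clock=lambda: current_time[0])

first = limiter.allow("agreement-42")    # True
second = limiter.allow("agreement-42")   # True
third = limiter.allow("agreement-42")    # False: cap of 2 reached
other = limiter.allow("agreement-7")     # True: separate agreement, separate count

current_time[0] = 60.0
after_reset = limiter.allow("agreement-42")  # True: new window
```

A fixed window is easy to reason about but permits short bursts at window boundaries; a sliding window or token bucket would smooth those out at the cost of a little more state.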
Audit
Immutable access log. Every query logged with: consumer identity, dataset accessed, fields returned, timestamp, agreement reference, query parameters. Supports GDPR Article 30 record-keeping.
Status
Design phase. Architecture validated through the DDSP prototype. Implementation planned as a CKAN extension with the proxy/policy layer as a separate service.