Validating JSON Schemas in PostgreSQL

Posted by gavin

DEVELOPMENT

Now that postgres has JSON columns, it's tempting to store data that's awkward to model relationally as JSON. However, I don't want to give up the safety of an automatically-checked schema just to be able to use JSON. I'd like the database to always verify the integrity of the data my application gives it, like a good database should.

JSON Schema is a standard for writing for specifying the schema of a JSON document, as a JSON document itself. It's similar to XSD for XML. By implementing JSON schema as a postgres function, we could ensure our data always validates against our schema by calling the validator as a CHECK constraint.

To do this, I wrote the postgres-json-schema postgres extension. Once installed, it provides a single function validate_json_schema(schema, data) that returns true if the data validates against the schema, and false otherwise.

Example

To install postgres-json-schema, clone the repository and run make install as root, then CREATE EXTENSION "postgres-json-schema"; as the database superuser. Then create a table with a CHECK constraint. As an example, we'll use a schema that requires our data to be a list of strings or integers:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
CREATE TABLE example (id serial PRIMARY KEY, data jsonb);
ALTER TABLE example ADD CONSTRAINT data_is_valid CHECK (
  validate_json_schema($$
    {
      "type": "array", 
      "items": {
        "anyOf": [
          {"type": "integer"},
          {"type": "string"}
        ]
      }
    }
  $$, data)
);

Now, postgres will reject any INSERTs or UPDATES to the table that set data to anything not valid against its schema.

1
2
3
4
5
6
7
8
example=> INSERT INTO example (data) VALUES ('[1, 2, "abc"]');
INSERT 0 1
example=> INSERT INTO example (data) VALUES ('[1, 2, true]');
ERROR:  new row for relation "example" violates check constraint "data_is_valid"
DETAIL:  Failing row contains (2, [1, 2, true]).
example=> INSERT INTO example (data) VALUES ('{}');
ERROR:  new row for relation "example" violates check constraint "data_is_valid"
DETAIL:  Failing row contains (3, {}).

postgres will now help protect our data integrity from being compromised by application bugs. JSON schema has many more features to validate more complex conditions as well, I would recommend looking into it more to be able to write schemas that are as strict as possible.

Thanks for reading! Fusionbox provides Postgres Consulting. Contact us if we can help with your project.

← Return to Blog