Home Blog Validating JSON Schemas in PostgreSQL

DEV

Validating JSON Schemas in PostgreSQL

Posted by gavin on Nov. 7, 2016, 2:01 p.m.

Now that postgres has JSON columns, it's tempting to store data that's awkward to model relationally as JSON. However, I don't want to give up the safety of an automatically-checked schema just to be able to use JSON. I'd like the database to always verify the integrity of the data my application gives it, like a good database should.

JSON Schema is a standard for writing for specifying the schema of a JSON document, as a JSON document itself. It's similar to XSD for XML. By implementing JSON schema as a postgres function, we could ensure our data always validates against our schema by calling the validator as a CHECK constraint.

To do this, I wrote the postgres-json-schema postgres extension. Once installed, it provides a single function validate_json_schema(schema, data) that returns true if the data validates against the schema, and false otherwise.

Example

To install postgres-json-schema, clone the repository and run make install as root, then CREATE EXTENSION "postgres-json-schema"; as the database superuser. Then create a table with a CHECK constraint. As an example, we'll use a schema that requires our data to be a list of strings or integers:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
CREATE TABLE example (id serial PRIMARY KEY, data jsonb);
ALTER TABLE example ADD CONSTRAINT data_is_valid CHECK (
  validate_json_schema($$
    {
      "type": "array", 
      "items": {
        "anyOf": [
          {"type": "integer"},
          {"type": "string"}
        ]
      }
    }
  $$, data)
);

Now, postgres will reject any INSERTs or UPDATES to the table that set data to anything not valid against its schema.

1
2
3
4
5
6
7
8
example=> INSERT INTO example (data) VALUES ('[1, 2, "abc"]');
INSERT 0 1
example=> INSERT INTO example (data) VALUES ('[1, 2, true]');
ERROR:  new row for relation "example" violates check constraint "data_is_valid"
DETAIL:  Failing row contains (2, [1, 2, true]).
example=> INSERT INTO example (data) VALUES ('{}');
ERROR:  new row for relation "example" violates check constraint "data_is_valid"
DETAIL:  Failing row contains (3, {}).

postgres will now help protect our data integrity from being compromised by application bugs. JSON schema has many more features to validate more complex conditions as well, I would recommend looking into it more to be able to write schemas that are as strict as possible.