Creating a schema¶
- Introduction
- Starting the schema
- Defining the properties
- Going deeper with properties
- Nesting data structures
- References outside the schema
- Taking a look at data for our defined JSON Schema
Introduction¶
The following example does not cover everything JSON Schema can do. For that, you will need to dig into the specification itself -- learn more at https://json-schema.org/specification.
Let's pretend we're interacting with a JSON-based product catalog. This catalog has a product which has:
- An identifier: productId
- A product name: productName
- A selling cost for the consumer: price
- An optional set of tags: tags
For example:
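A product in this catalog might look like the sketch below. The field values are made up for illustration, and Python's standard json module is used only to show that the document parses.

```python
import json

# Illustrative product document; the values are assumptions, not real catalog data.
product = json.loads("""
{
  "productId": 1,
  "productName": "A green door",
  "price": 12.50,
  "tags": ["home", "green"]
}
""")

# The four keys described in the list above.
print(sorted(product))
```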
While generally straightforward, the example leaves some open questions. Here are just a few of them:
- What is productId?
- Is productName required?
- Can the price be zero (0)?
- Are all of the tags string values?
When you're talking about a data format, you want to have metadata about what keys mean, including the valid inputs for those keys. JSON Schema is a proposed IETF standard for answering those questions about data.
Starting the schema¶
To start a schema definition, let's begin with a basic JSON schema.
We start with five properties called keywords, which are expressed as JSON keys.
Yes, the standard uses a JSON data document to describe data documents, which are most often also JSON data documents but could be in any number of other content types, such as text/xml.
- The $schema keyword states that this schema is written according to a specific draft of the standard. It is used for a variety of reasons, primarily version control.
- The $id keyword defines a URI for the schema, and the base URI that other URI references within the schema are resolved against.
- The title and description annotation keywords are descriptive only. They do not add constraints to the data being validated. The intent of the schema is stated with these two keywords.
- The type validation keyword defines the first constraint on our JSON data; in this case it has to be a JSON Object.
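A starting schema built from these keywords might look like the following sketch. The draft named in $schema (2020-12), the example.com $id, and the title and description wording are assumptions chosen for illustration. The is_json_object helper is a hand-written stand-in for the single type constraint, not a real validator.

```python
import json

# Starting schema; the draft, $id URL, and annotation text are placeholder assumptions.
schema = json.loads("""
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/product.schema.json",
  "title": "Product",
  "description": "A product in the catalog",
  "type": "object"
}
""")

def is_json_object(instance):
    # Mirrors "type": "object": a parsed JSON object is a Python dict.
    return isinstance(instance, dict)

print(is_json_object({"productId": 1}))  # an object satisfies the constraint
print(is_json_object([1, 2, 3]))         # an array does not
```

At this stage any JSON object is accepted, because title and description only annotate the schema.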
We introduce the following pieces of terminology when we start the schema:
- Schema Keyword: $schema and $id.
- Schema Annotations: title and description.
- Validation Keyword: type.
Defining the properties¶
productId is a numeric value that uniquely identifies a product. Since this is the canonical identifier for a product, it doesn't make sense to have a product without one, so it is required.
In JSON Schema terms, we update our schema to add:
- The properties validation keyword.
- The productId key.
  - The description schema annotation and type validation keyword are noted -- we covered both of these in the previous section.
- The required validation keyword listing productId.
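The additions listed above can be sketched as the schema fragment below. The description text is illustrative, and using integer for the numeric productId is an assumption (number would also fit). The missing_required helper is a hand-written check covering only the required keyword.

```python
import json

# Fragment of the product schema after adding productId.
schema = json.loads("""
{
  "type": "object",
  "properties": {
    "productId": {
      "description": "The unique identifier for a product",
      "type": "integer"
    }
  },
  "required": ["productId"]
}
""")

def missing_required(instance, schema):
    # Return the keys listed under "required" that the instance lacks.
    return [key for key in schema.get("required", []) if key not in instance]

print(missing_required({"productId": 1}, schema))             # nothing missing
print(missing_required({"productName": "A door"}, schema))    # productId is missing
```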
productName is a string value that describes a product. Since there isn't much to a product without a name, it is also required.
- Since the required validation keyword is an array of strings, we can note multiple keys as required; we now include productName.
- There isn't really any difference between productId and productName -- we include both for completeness since computers typically pay attention to identifiers and humans typically pay attention to names.
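With productName added, the fragment might look like this sketch. The description strings are illustrative assumptions, and the problems helper is a hand-written check that covers only required and the string type.

```python
import json

# Fragment of the product schema with both required keys.
schema = json.loads("""
{
  "type": "object",
  "properties": {
    "productId": {"description": "The unique identifier for a product", "type": "integer"},
    "productName": {"description": "Name of the product", "type": "string"}
  },
  "required": ["productId", "productName"]
}
""")

def problems(instance, schema):
    # Report missing required keys, then any string-typed key holding a non-string.
    found = [f"missing {key}" for key in schema["required"] if key not in instance]
    for key, sub in schema["properties"].items():
        if sub["type"] == "string" and key in instance and not isinstance(instance[key], str):
            found.append(f"{key} must be a string")
    return found

print(problems({"productId": 1, "productName": "A green door"}, schema))
print(problems({"productId": 1}, schema))
```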
Going deeper with properties¶
According to the store owner there are no free products. ;)
- The price key is added with the usual description schema annotation and type validation keywords covered previously. It is also included in the array of keys defined by the required validation keyword.
- We specify that the value of price must be greater than zero using the exclusiveMinimum validation keyword.
  - If we wanted to include zero as a valid price we would have specified the minimum validation keyword.
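The price rules can be sketched as the property schema below. The description text is an assumption, and price_ok is a hand-written check covering only the number type and exclusiveMinimum, not a full validator.

```python
import json

# Schema for the price property alone; it would sit under "properties" in the product schema.
price_schema = json.loads("""
{
  "description": "The price of the product",
  "type": "number",
  "exclusiveMinimum": 0
}
""")

def price_ok(value, schema):
    # JSON numbers map to int or float; bool is excluded because it is an int subclass in Python.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    # exclusiveMinimum is a strict bound: the value itself is not allowed.
    return value > schema["exclusiveMinimum"]

print(price_ok(12.50, price_schema))  # a positive price passes
print(price_ok(0, price_schema))      # zero is rejected
```

Swapping exclusiveMinimum for minimum would turn the strict comparison into an inclusive one, so zero would pass.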
Next, we come to the tags key.
The store owner has said this:
- If there are tags there must be at least one tag.
- All tags must be unique; no duplication within a single product.
- All tags must be text.
- Tags are nice but they aren't required to be present.
Therefore:
- The tags key is added with the usual annotations and keywords.
- This time the type validation keyword is array.
- We introduce the items validation keyword so we can define what appears in the array. In this case: string values via the type validation keyword.
- The minItems validation keyword is used to make sure there is at least one item in the array.
- The uniqueItems validation keyword notes all of the items in the array must be unique relative to one another.
- We did not add this key to the required validation keyword array because it is optional.
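The store owner's rules for tags can be sketched as the property schema below. The description text is an assumption, and tags_ok is a hand-written check covering only the four keywords used here.

```python
import json

# Schema for the tags property alone; it is not listed under "required".
tags_schema = json.loads("""
{
  "description": "Tags for the product",
  "type": "array",
  "items": {"type": "string"},
  "minItems": 1,
  "uniqueItems": true
}
""")

def tags_ok(value, schema):
    if not isinstance(value, list):                     # "type": "array"
        return False
    if len(value) < schema["minItems"]:                 # at least one tag
        return False
    if not all(isinstance(tag, str) for tag in value):  # "items": {"type": "string"}
        return False
    # All tags are strings at this point, so a set is enough to detect duplicates.
    return len(set(value)) == len(value)

print(tags_ok(["home", "green"], tags_schema))
print(tags_ok(["green", "green"], tags_schema))
```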
Nesting data structures¶
Up until this point we've been dealing with a very flat schema -- only one level. This section demonstrates nested data structures.
- The dimensions key is added using the concepts we've previously discovered. Since the type validation keyword is object we can use the properties validation keyword to define a nested data structure.
  - We omitted the description annotation keyword for brevity in the example. While it's usually preferable to annotate thoroughly, in this case the structure and key names are fairly familiar to most developers.
- You will note the scope of the required validation keyword is applicable to the dimensions key and not beyond.
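A nested dimensions object might be declared as in the sketch below. The key names length, width, and height are assumptions, since the text does not name them. The missing_required helper is a hand-written walk that shows how each required list applies only at its own level.

```python
import json

# Fragment of the product schema with a nested object; the dimension key names are assumed.
schema = json.loads("""
{
  "type": "object",
  "properties": {
    "productId": {"type": "integer"},
    "dimensions": {
      "type": "object",
      "properties": {
        "length": {"type": "number"},
        "width": {"type": "number"},
        "height": {"type": "number"}
      },
      "required": ["length", "width", "height"]
    }
  },
  "required": ["productId"]
}
""")

def missing_required(instance, schema, path=""):
    # Check this level's "required" list, then recurse into nested objects that are present.
    missing = [path + key for key in schema.get("required", []) if key not in instance]
    for key, subschema in schema.get("properties", {}).items():
        if isinstance(instance.get(key), dict):
            missing += missing_required(instance[key], subschema, path + key + ".")
    return missing

# dimensions itself is optional, so leaving it out is fine.
print(missing_required({"productId": 1}, schema))
# Once dimensions is present, its own required list applies.
print(missing_required({"productId": 1, "dimensions": {"length": 7.0}}, schema))
```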
References outside the schema¶
So far our JSON schema has been wholly self-contained. It is very common to share a JSON schema across many data structures for reuse, readability and maintainability, among other reasons.
For this example we introduce a new JSON Schema resource. For both properties therein:
- We use the minimum validation keyword noted earlier.
- We add the maximum validation keyword.
- Combined, these give us a range to use in validation.
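One shape such a shared resource could take is sketched below. The text does not name the two properties, so latitude and longitude, their ranges, the example.com $id, and the title are all assumptions. The in_range helper is a hand-written check for the two bounds only.

```python
import json

# Hypothetical shared schema with two range-limited properties.
location_schema = json.loads("""
{
  "$id": "https://example.com/geographical-location.schema.json",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Longitude and Latitude",
  "type": "object",
  "required": ["latitude", "longitude"],
  "properties": {
    "latitude": {"type": "number", "minimum": -90, "maximum": 90},
    "longitude": {"type": "number", "minimum": -180, "maximum": 180}
  }
}
""")

def in_range(value, subschema):
    # "minimum" and "maximum" are inclusive bounds.
    return subschema["minimum"] <= value <= subschema["maximum"]

latitude = location_schema["properties"]["latitude"]
print(in_range(90, latitude))    # the boundary value is allowed
print(in_range(90.1, latitude))  # just outside the range
```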
Next we add a reference to this new schema so it can be incorporated.
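A reference is written with the $ref keyword. In the sketch below, the property name warehouseLocation and both example.com URIs are hypothetical. The snippet also shows how a relative reference would resolve against the schema's $id, using urljoin from Python's standard library as an approximation of URI reference resolution.

```python
import json
from urllib.parse import urljoin

# Fragment of the product schema; warehouseLocation and the URIs are assumed names.
product_schema = json.loads("""
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/product.schema.json",
  "type": "object",
  "properties": {
    "warehouseLocation": {
      "description": "Coordinates of the warehouse where the product is located.",
      "$ref": "https://example.com/geographical-location.schema.json"
    }
  }
}
""")

ref = product_schema["properties"]["warehouseLocation"]["$ref"]

# A relative reference resolves against the base URI given by $id.
resolved = urljoin(product_schema["$id"], "geographical-location.schema.json")
print(resolved == ref)
```

Because the two schemas share a base URI here, the relative form and the absolute form point at the same resource.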
Taking a look at data for our defined JSON Schema¶
We've certainly expanded on the concept of a product since our earliest sample data (scroll up to the top). Let's take a look at data which matches the JSON Schema we have defined.
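A document along these lines might look like the sketch below. The values are invented, and the dimensions keys, warehouseLocation, latitude, and longitude are assumed names that the text does not spell out. The spot checks are hand-written and cover only a few of the constraints discussed in this guide.

```python
import json

# Illustrative product data; key names beyond the first four are assumptions.
product = json.loads("""
{
  "productId": 1,
  "productName": "An ice sculpture",
  "price": 12.50,
  "tags": ["cold", "ice"],
  "dimensions": {"length": 7.0, "width": 12.0, "height": 9.5},
  "warehouseLocation": {"latitude": -78.75, "longitude": 20.4}
}
""")

tags = product["tags"]
location = product["warehouseLocation"]

# Spot checks mirroring constraints described in the guide.
checks = {
    "required keys present": {"productId", "productName", "price"}.issubset(product),
    "price above zero": product["price"] > 0,
    "tags non-empty": len(tags) >= 1,
    "tags unique": len(set(tags)) == len(tags),
    "latitude in range": -90 <= location["latitude"] <= 90,
    "longitude in range": -180 <= location["longitude"] <= 180,
}
print(all(checks.values()))
```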