schematools.types module
Python types for the Amsterdam Schema JSON file contents.
- class schematools.types.AdditionalRelationSchema(_id: str, _parent_table: DatasetTableSchema | None = None, **kwargs)
Bases:
DatasetType
Data class describing the additional relation block.
- __init__(_id: str, _parent_table: DatasetTableSchema | None = None, **kwargs)
- property format
Format: “summary” or “embedded”.
- property id
- is_reverse_relation(field: DatasetFieldSchema)
Tell whether this relation is the reverse relation of the given field.
- property parent_table
Return the field this reverse relation queries to find objects.
Return the table this relation references.
- class schematools.types.DatasetFieldSchema(*args: Any, _parent_table: DatasetTableSchema | None, _parent_field: DatasetFieldSchema | None = None, _required: bool = False, _temporal_range: bool = False, **kwargs: Any)
Bases:
DatasetType
A single field (column) in a table.
- __init__(*args: Any, _parent_table: DatasetTableSchema | None, _parent_field: DatasetFieldSchema | None = None, _required: bool = False, _temporal_range: bool = False, **kwargs: Any) None
- property db_name: str
Return the name that is used in the database. This can differ from the internal name when the field is a relation or has a shortname.
- get_field_by_id = <methodtools._LruCacheWire object>
- property has_shortname: bool
Reports whether this field has a shortname.
You should never have to call this: name returns the shortname, if present.
- property id: str
The id of a field uniquely identifies it among the fields of a table.
Note that comparisons against id should be avoided when fields are retrieved using .get_fields(include_subfields=True). In that case, a subfield with a similar ID will match with the top-level field.
- property is_composite_key
Tell whether the relation uses a composite key
- property is_loose_relation
Determine if relation is loose or not.
- property is_object: bool
Tell whether the field references an object. This might also be a relation, with a composite key. In both cases, the object subfields could be inlined in the main SQL table. See also: is_nested_object and is_composite_key.
- property is_primary: bool
When name is ‘id’, the field should be the primary key. For composite keys (table.identifier has > 1 item), an ‘id’ field is autogenerated.
- property is_relation_temporal
Tell whether the 1-N relationship is modelled by an intermediate table. This allows tracking multiple versions of the relationship.
- property is_subfield: bool
Tell whether this field is part of an embedded object (e.g. temporal relation)
- property is_temporal_range: bool
Tell whether the field is used to store the range of a temporal dimension. (e.g. beginGeldigheid or eindGeldigheid).
- property is_through_table: bool
Checks if field is a possible through table.
NM relations always have a through table. For 1N relations, there is a through table if the target of the relation is temporal.
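The rule described above can be summarised in a small sketch. The function name and the "NM"/"1N" labels are hypothetical and only mirror the description; this is not the schematools implementation.

```python
def needs_through_table(relation_kind: str, target_is_temporal: bool) -> bool:
    """Mirror the documented rule for a relation field (illustrative only).

    relation_kind is a hypothetical label: "NM" or "1N".
    """
    if relation_kind == "NM":
        # Many-to-many relations always need a junction table.
        return True
    if relation_kind == "1N":
        # A 1N relation only needs one when the target table is temporal.
        return target_is_temporal
    raise ValueError(f"unknown relation kind: {relation_kind!r}")
```

For example, a 1N relation to a non-temporal table would not be considered a through table under this reading.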
- property name: str
The name as it is shown to the external world, camel-cased. In general, the “id” field is already camel-cased, but in case that didn’t happen this property will fix that.
- property nested_table: DatasetTableSchema | None
Access the nested table that this field needs to store its data.
- property parent_field: DatasetFieldSchema | None
Provide access to the top-level field that this field is a property of.
For a relation field, returns the identifiers of the referenced fields.
The returned list contains only the fields, e.g., [“id”, “volgnummer”]. These are fields on the table self.related_table.
For loose relations, it will only return the first field of the related table.
If self is not a relation field, the return value is None.
Convenience property that returns the related field schemas.
If this field is a relation, return the table this relation references.
- property reverse_relation: AdditionalRelationSchema | None
Find the opposite description of a relation.
When there is a relation, this only returns a description when the linked table also describes the other end of relationship.
- property shortname: str
The shorter name if present, otherwise the ID. Note this is only used to generate human-readable database table names.
- property subfields: list[schematools.types.DatasetFieldSchema]
Return the subfields for a nested structure.
For a nested object, fields are based on its properties, for an array of objects, fields are based on the properties of the “items” field.
When subfields are added as part of a 1m-relation, those subfields need to be prefixed with the name of the relation field. However, this is not the case for the so-called dimension fields of a temporal relation (e.g. beginGeldigheid and eindGeldigheid).
If self is not an object or array, the return value is an empty iterator.
- property table: DatasetTableSchema | None
The table that this field is a part of
- property through_table: DatasetTableSchema | None
Access the through table that this field needs to store its data.
- property type: str
Returns the type of this field.
The type is one of the JSON Schema types “string”, “integer”, “number”, “object”, “array” or “boolean”, or the URL of a schema defining a type (for geo types). “null” is never used by Amsterdam Schemas.
Dates and URLs have type “string”. Check the format to distinguish them from free-form text.
See https://schemas.data.amsterdam.nl/docs/ams-schema-spec.html#data-types for details.
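As a rough illustration of checking the format to tell dates and URLs apart from free-form text, the sketch below works on a plain dict rather than a DatasetFieldSchema. The helper name is hypothetical, and the format values "date", "date-time" and "uri" are the standard JSON Schema format names; whether a particular schema uses exactly these is an assumption.

```python
def describe_string_field(field: dict) -> str:
    """Classify a JSON Schema field definition given as a plain dict."""
    if field.get("type") != "string":
        return "not a string"
    fmt = field.get("format")
    if fmt in ("date", "date-time"):
        # Dates are stored as strings and only distinguished by format.
        return "date"
    if fmt == "uri":
        return "url"
    # No recognised format: treat as free-form text.
    return "text"
```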
- class schematools.types.DatasetSchema(data: dict, dataset_collection: CachedSchemaLoader | None = None)
Bases:
SchemaType
The schema of a dataset.
This is a collection of JSON Schemas within a single file.
- class Status(value)
Bases:
Enum
The allowed status values according to the Amsterdam schema spec.
- beschikbaar = 'beschikbaar'
- niet_beschikbaar = 'niet_beschikbaar'
- __init__(data: dict, dataset_collection: CachedSchemaLoader | None = None) None
When initializing a dataset, a cache of related datasets can be added (at class level). Thus, we are able to get (temporal) info about the related datasets.
- Parameters
data – The JSON data from the file.
dataset_collection – The shared collection that the dataset should become part of. This is used to resolve relations between different datasets.
- build_nested_table(field: DatasetFieldSchema) DatasetTableSchema
Construct an in-line table object for a nested field.
- build_through_table(field: DatasetFieldSchema) DatasetTableSchema
Build the through table.
The through tables are not defined separately in a schema. The fact that a M2M relation needs an extra table is an implementation aspect. However, the through (aka. junction) table schema is needed for the dynamic model generation and for data-importing.
FK relations also have an additional through table, because the temporal information of the relation needs to be stored somewhere.
For relations with an object-type definition of the relation, the fields for the source and target side of the relation are stored separately in the through table. E.g. for a M2M relation like this:
    "bestaatUitBuurten": {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "identificatie": {
                    "type": "string"
                },
                "volgnummer": {
                    "type": "integer"
                }
            }
        },
        "relation": "gebieden:buurten",
        "description": "De buurten waaruit het object bestaat."
    }

The through table has the following fields:
ggwgebieden_id
buurten_id
ggwgebieden_identificatie
ggwgebieden_volgnummer
bestaat_uit_buurten_identificatie
bestaat_uit_buurten_volgnummer
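The field list above can be reproduced with a short sketch. The naming rules here are inferred from this one example only (source table id, target table id, and the snake-cased relation field name as prefixes); the helper names are hypothetical and this is not the library's build_through_table() code.

```python
import re


def to_snake_case(name: str) -> str:
    """Insert an underscore before each inner capital, then lowercase."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def through_table_field_names(
    source_table: str, target_table: str, relation_field: str, subfields: list
) -> list:
    """List through-table column names for an object-type relation (assumed rules)."""
    # One plain reference column per side of the relation.
    names = [f"{source_table}_id", f"{target_table}_id"]
    # Source side: identifier parts prefixed with the source table id.
    names += [f"{source_table}_{sub}" for sub in subfields]
    # Target side: identifier parts prefixed with the relation field name.
    prefix = to_snake_case(relation_field)
    names += [f"{prefix}_{sub}" for sub in subfields]
    return names


fields = through_table_field_names(
    "ggwgebieden", "buurten", "bestaatUitBuurten", ["identificatie", "volgnummer"]
)
```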
- classmethod from_dict(obj: dict[str, Any], dataset_collection: CachedSchemaLoader | None = None) DatasetSchema
Parses given dict and validates the given schema.
- get_table_by_id = <methodtools._LruCacheWire object>
- get_tables(include_nested: bool = False, include_through: bool = False) list[schematools.types.DatasetTableSchema]
List tables, including nested.
- property is_default_version: bool
Tell whether this is the default dataset version. Defaults to True, in order to stay backwards compatible.
- json_data(inline_tables: bool = False) Union[str, int, float, bool, None, Dict[str, Any], List[Any]]
Overridden logic that inlines tables.
- property nested_tables: list[schematools.types.DatasetTableSchema]
Access list of nested tables.
Access the list of related schema ids.
This property calculates the related data that are needed, so the users of this dataset can preload these datasets. This can also include the current dataset, for relations that point to other tables within the same dataset.
- property table_versions: dict[str, schematools.types.TableVersions]
Access different versions of the table, as mentioned in the dataset file.
- property tables: list[schematools.types.DatasetTableSchema]
Access the tables within the file.
- property through_tables: list[schematools.types.DatasetTableSchema]
Access list of through_tables, for n-m relations.
- class schematools.types.DatasetTableSchema(*args: Any, parent_schema: DatasetSchema, _parent_table: DatasetTableSchema | None = None, nested_table: bool = False, through_table: bool = False, **kwargs: Any)
Bases:
SchemaType
The table within a dataset. This table definition follows the JSON Schema spec.
This class has an id property (inherited from SchemaType) to uniquely address this dataset-table in the scope of the DatasetSchema. This id is used in lots of places in the dynamic model generation in Django.
There is also a db_name property that is used for the auto-generation of database table names. This also reads the shortname, to define a human-readable abbreviation that fits inside the maximum database table name length.
- __init__(*args: Any, parent_schema: DatasetSchema, _parent_table: DatasetTableSchema | None = None, nested_table: bool = False, through_table: bool = False, **kwargs: Any) None
- property additional_relations: list[schematools.types.AdditionalRelationSchema]
Fetch list of additional (backwards or N-N) relations.
This is a dictionary of names for existing forward relations in other tables with either the ‘embedded’ or ‘summary’ property
- property dataset: DatasetSchema
The dataset that this table is part of.
- property db_name: str
Return the standard database name for the table.
For some custom situations (e.g. importer, or handling table versions), use db_name_variant().
- db_name_variant(*, with_dataset_prefix: bool = True, with_version: bool = False, postfix: str = '', check_assert: bool = True) str
Return derived table name for DB usage.
- Parameters
with_dataset_prefix – if True, include dataset ID as a prefix to the table name.
with_version – if True, include the major and minor version number in the table name.
postfix – An optional postfix to append to the table name
check_assert – Check the maximum table name length. Can be turned off to have the check done by validation code (with much better error reporting).
- Returns
A derived table name suitable for DB usage.
- property display_field: DatasetFieldSchema | None
Tell which fields can be used as display field.
- property fields: list[schematools.types.DatasetFieldSchema]
All the fields of the table.
This returns the direct fields that are part of the table. Fields that have “type=object” can define nested fields, which are not included here. These fields can either be read using field.subfields, or be inlined using get_fields(include_subfields=True).
- get_additional_relation_by_id = <methodtools._LruCacheWire object>
- get_dataset_schema(dataset_id: str) DatasetSchema | None
Return the associated parent datasetschema for this table.
- get_field_by_id = <methodtools._LruCacheWire object>
- get_fields(include_subfields: bool = False) Iterator[DatasetFieldSchema]
Get the fields for this table.
- Parameters
include_subfields – This includes the subfields of object fields, so those can be inlined in the main table. This is useful for ORM and SQL databases, which can’t support a nested structure.
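The effect of include_subfields can be pictured with a minimal sketch over plain dicts. The "subfields" key and the choice to yield the object field itself before its subfields are assumptions made for illustration; the real method operates on DatasetFieldSchema objects and may order or filter differently.

```python
from __future__ import annotations

from typing import Iterator


def iter_fields(fields: list[dict], include_subfields: bool = False) -> Iterator[dict]:
    """Yield top-level fields, optionally followed by the subfields of objects."""
    for field in fields:
        yield field
        if include_subfields and field.get("type") == "object":
            # Inline the nested fields so a flat SQL table can hold them.
            yield from field.get("subfields", [])


example_fields = [
    {"id": "id", "type": "string"},
    {
        "id": "address",
        "type": "object",
        "subfields": [{"id": "street", "type": "string"}],
    },
]
flat_ids = [f["id"] for f in iter_fields(example_fields, include_subfields=True)]
```

Because the flattened result mixes top-level fields and subfields, looking a field up by id alone in such a list can pick the wrong one when a subfield shares an id with a top-level field.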
- get_reverse_relation(field: DatasetFieldSchema) AdditionalRelationSchema | None
Find the description of a reverse relation for a field.
- property has_composite_key: bool
Tell whether the table uses multiple attributes together as its identifier.
- property identifier: list[str]
The main identifier field, if there is an identifier field available. Default to “id” for existing schemas without an identifier field.
- property identifier_fields: list[schematools.types.DatasetFieldSchema]
Return the field schemas for the identifier fields.
- property is_autoincrement: bool
Return bool indicating autoincrement behaviour of the table identifier.
- property is_through_table: bool
Indicate if table is an intersection table (n:m relation table) or base table.
- property main_geometry: str
The main geometry field, if there is a geometry field available. Default to “geometry” for existing schemas without a mainGeometry field.
- property main_geometry_field: DatasetFieldSchema
The main geometry as field object
- property parent_table: DatasetTableSchema | None
The parent table of this table.
For nested and through tables, the parent table is available.
- property parent_table_field: DatasetFieldSchema | None
Provide the NM-relation that generated this through table.
Tell which dataset IDs the relations point to.
This can also include the current dataset, for relations that point to other tables within the same dataset.
- property shortname: str
The shorter name if present, otherwise the ID. This is only used to generate human-readable database table names.
- property temporal: Temporal | None
The temporal property of a Table. Describes validity of objects for tables where different versions of objects are valid over time.
Temporal has an identifier property that refers to the attribute of objects in the table that uniquely identifies a specific version of an object from among other versions of the same object.
Temporal also has a dimensions property, which gives the attributes of objects that determine for what (time)period an object is valid.
- property through_fields: tuple[DatasetFieldSchema, DatasetFieldSchema] | None
Return the left and right side of an M2M through table.
This only returns results when the table describes the intermediate table of an M2M relation (is_through_table is true).
- class schematools.types.DatasetType(dict=None, /, **kwargs)
Bases:
JsonDict
Base class for child elements of the schema.
- class schematools.types.Faker(name: str, properties: dict[str, typing.Any] = <factory>)
Bases:
object
Name and properties that can be used for mock data.
- class schematools.types.Permission(level: PermissionLevel, sub_value: str | None = None, source: str | None = None)
Bases:
object
The result of an authorisation check.
The extra fields in this dataclass are mainly provided for debugging purposes. The dataclass can also be ordered; they get sorted by access level.
- classmethod from_string(value: str | None, source: str | None = None) Permission
Cast the string value to a permission level object.
- level: PermissionLevel
The permission level given by the profile
- none = Permission(level=<PermissionLevel.NONE: 0>, sub_value=None, source='schema')
- class schematools.types.PermissionLevel(value)
Bases:
Enum
The various levels that can be provided on specific fields.
- ENCODED = 40
- LETTERS = 10
- NONE = 0
- RANDOM = 30
- READ = 50
- SUBOBJECTS_ONLY = 1
- classmethod from_string(value: str | None) PermissionLevel
Cast the string value to a permission level object.
- highest = 50
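To illustrate how a string such as “read” or “letters:3” might map onto a level and a sub-value, here is a standalone sketch. The Level enum copies the values listed above, but parse_permission is a hypothetical helper; how the library's from_string treats empty or unknown values is an assumption here.

```python
from __future__ import annotations

from enum import Enum


class Level(Enum):
    """Standalone copy of the documented level values, for illustration."""

    NONE = 0
    SUBOBJECTS_ONLY = 1
    LETTERS = 10
    RANDOM = 30
    ENCODED = 40
    READ = 50


def parse_permission(value: str | None) -> tuple[Level, str | None]:
    """Split "name:sub_value" into a level and an optional sub-value."""
    if not value:
        # Assumption: a missing value means no access.
        return Level.NONE, None
    name, _, sub_value = value.partition(":")
    # Level[...] raises KeyError for names that are not defined above.
    return Level[name.upper()], (sub_value or None)


# Levels can be ranked by their numeric value, highest access last.
ranked = sorted(Level, key=lambda level: level.value)
```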
- class schematools.types.ProfileDatasetSchema(_id: str, _parent_schema: ProfileSchema, data: Union[str, int, float, bool, None, Dict[str, Any], List[Any]])
Bases:
DatasetType
A schema inside the profile dataset.
It grants permissions to a dataset on a global level, or more fine-grained permissions to specific tables.
- __init__(_id: str, _parent_schema: ProfileSchema, data: Union[str, int, float, bool, None, Dict[str, Any], List[Any]]) None
- property permissions: Permission
Global permissions that are granted to the dataset. e.g. “read”.
- property profile: ProfileSchema | None
The profile that this definition is part of.
- property tables: dict[str, schematools.types.ProfileTableSchema]
The tables that this profile provides additional access rules for.
- class schematools.types.ProfileSchema(dict=None, /, **kwargs)
Bases:
SchemaType
The complete profile object.
It contains the scopes that the user should match, and definitions for various datasets.
- property datasets: dict[str, schematools.types.ProfileDatasetSchema]
The datasets that this profile provides additional access rules for.
- classmethod from_dict(obj: Union[str, int, float, bool, None, Dict[str, Any], List[Any]]) ProfileSchema
Parses given dict and validates the given schema
- classmethod from_file(filename: str) ProfileSchema
Open an Amsterdam schema from a file.
- class schematools.types.ProfileTableSchema(_id: str, _parent_schema: ProfileDatasetSchema, data: Union[str, int, float, bool, None, Dict[str, Any], List[Any]])
Bases:
DatasetType
A single table in the profile.
This grants permissions to a specific table, or more fine-grained permissions to specific fields. When mandatory_filtersets is defined, the table may only be queried when specific search query parameters are issued.
- __init__(_id: str, _parent_schema: ProfileDatasetSchema, data: Union[str, int, float, bool, None, Dict[str, Any], List[Any]]) None
- property dataset: ProfileDatasetSchema | None
The profile that this definition is part of.
- property fields: dict[str, schematools.types.Permission]
The fields with their granted permission level.
This can be “read” or things like “letters:3”.
- property mandatory_filtersets: list[list[str]]
Tell whether the listing can only be requested with certain inputs.
E.g., an API user may only list data when they supply the lastname + birthdate.
Example value:
[ ["bsn", "lastname"], ["postcode", "regimes.aantal[gte]"] ]
- property permissions: Permission
Global permissions that are granted for the table, e.g. “read”.
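One plausible reading of mandatory_filtersets is that a request is acceptable when it supplies every parameter of at least one inner list. The sketch below encodes that reading; the function name is hypothetical and the actual enforcement lives in the code that consumes this property, not in schematools.types.

```python
def filters_allowed(mandatory_filtersets: list, query_params: dict) -> bool:
    """Check whether the supplied query parameters satisfy any filterset."""
    if not mandatory_filtersets:
        # Nothing is mandatory, so any query is acceptable.
        return True
    supplied = set(query_params)
    # Each inner list is treated as an AND-group; the outer list as OR.
    return any(set(filterset) <= supplied for filterset in mandatory_filtersets)


filtersets = [["bsn", "lastname"], ["postcode", "regimes.aantal[gte]"]]
```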
- class schematools.types.SchemaType(dict=None, /, **kwargs)
Bases:
JsonDict
Base class for top-level schema objects (dataset, table, profile).
- class schematools.types.SemVer(version: str)
Bases:
str
Semantic version numbers.
Semantic version numbers take the form X.Y.Z where X, Y, and Z are non-negative integers, and MUST NOT contain leading zeroes. X is the major version, Y is the minor version, and Z is the patch version. Each element MUST increase numerically. For instance: 1.9.0 -> 1.10.0 -> 1.11.0.
See also: https://semver.org/ (where the above “definition” was taken from)
This class allows semantic version numbers to be prefixed with “v”, e.g. “v1.11.0”. However, their canonical form, as output by the __str__() and __repr__() methods, will not include that prefix.
In addition, the minor and patch version can be left unspecified. SemVer will assume them to be equal to 0 in that case.
This class was specifically made a subclass of str to allow for seamless JSON serialization.
- PAT: ClassVar[Pattern[str]] = re.compile("\n ^v? # Optionally start with a 'v' (for version)\n (?P<major>\\d+) # A major version number is compulsory\n (?:\\. # Optionally f, re.VERBOSE)
- __init__(version: str) None
Create a SemVer using a str that could be interpreted as a semantic version number.
Examples
>>> SemVer("1.0.0")
SemVer("1.0.0")
>>> SemVer("v54")
SemVer("54.0.0")
>>> SemVer("v3.9.0")
SemVer("3.9.0")
- Parameters
version – A semantic version number, optionally prefixed with a “v”.
- Raises
ValueError – if the string supplied is not a semantic version number.
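The parsing behaviour described for SemVer (optional “v” prefix, compulsory major version, minor and patch defaulting to 0) can be approximated with the standalone sketch below. The pattern is written from that description and is not the class's PAT attribute; parse_semver is a hypothetical helper, and unlike the strict definition it does not reject leading zeroes.

```python
from __future__ import annotations

import re

# Optional "v", compulsory major, optional minor, optional patch.
SEMVER_RE = re.compile(
    r"^v?(?P<major>\d+)(?:\.(?P<minor>\d+)(?:\.(?P<patch>\d+))?)?$"
)


def parse_semver(version: str) -> tuple[int, int, int]:
    """Return (major, minor, patch), filling unspecified parts with 0."""
    match = SEMVER_RE.match(version)
    if match is None:
        raise ValueError(f"not a semantic version number: {version!r}")
    return (
        int(match["major"]),
        int(match["minor"] or 0),
        int(match["patch"] or 0),
    )
```

Comparing the resulting tuples orders versions numerically, so 1.9.0 sorts before 1.10.0, whereas comparing the raw strings would put them the other way round.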
- class schematools.types.TableVersions(table_id: str, default_version: str, version_paths: dict[str, str], parent_dataset: DatasetSchema)
Bases:
Mapping[str, DatasetTableSchema]
Lazy evaluated dict that provides access to other table versions.
- class schematools.types.Temporal(identifier: str, identifier_field: ~schematools.types.DatasetFieldSchema, dimensions: dict[str, schematools.types.TemporalDimensionFields] = <factory>)
Bases:
object
The temporal property of a Table.
Describes validity of objects for tables where different versions of objects are valid over time.
- identifier
The key to the property that uniquely identifies a specific version of an object from among other versions of the same object.
This property combined with the fixed identifier forms a unique key for an object.
These identifier properties are non-contiguous increasing integers. The latest version of an object will have the highest value for identifier.
- dimensions
Contains the attributes of objects that determine for what (time)period an object is valid.
Dimensions is of type dict. A dimension is a tuple of the form “(‘valid_start’, ‘valid_end’)”, describing a closed set along the dimension for which an object is valid.
Example
With dimensions = {“time”:(‘valid_start’, ‘valid_end’)} an_object will be valid on some_time if: an_object.valid_start <= some_time < an_object.valid_end
- __init__(identifier: str, identifier_field: ~schematools.types.DatasetFieldSchema, dimensions: dict[str, schematools.types.TemporalDimensionFields] = <factory>) None
- dimensions: dict[str, schematools.types.TemporalDimensionFields]
- identifier_field: DatasetFieldSchema
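The validity check from the dimensions example above can be written out as a small sketch. It uses plain dicts and the attribute names valid_start and valid_end from that example; the helper name is hypothetical, and it assumes both bounds are always set.

```python
from datetime import date


def is_valid_on(
    obj: dict,
    when: date,
    start_key: str = "valid_start",
    end_key: str = "valid_end",
) -> bool:
    """Apply the check from the example: start <= when < end."""
    return obj[start_key] <= when < obj[end_key]


version_1 = {"valid_start": date(2020, 1, 1), "valid_end": date(2021, 1, 1)}
```

With this check the start date is included and the end date is excluded, so two consecutive versions that share a boundary date never overlap.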
- class schematools.types.TemporalDimensionFields(start: DatasetFieldSchema, end: DatasetFieldSchema)
Bases:
NamedTuple
A tuple that describes the fields for start field and end field of a range.
This could be something like ("beginGeldigheid", "eindGeldigheid").
- end: DatasetFieldSchema
Alias for field number 1
- start: DatasetFieldSchema
Alias for field number 0