Commit d93a175

Merge branch 'main' of https://github.com/seedcase-project/check-datapackage into refactor/rename-file-to-extensions
2 parents 388acef + cc3d6c9 commit d93a175

18 files changed: +470 -266 lines changed

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -25,6 +25,6 @@ repos:
   # sub-packages, which confuses pre-commit when it tries to find the latest
   # version
   - repo: https://github.com/adhtruong/mirrors-typos
-    rev: v1.38.1
+    rev: v1.39.0
     hooks:
       - id: typos

CHANGELOG.md

Lines changed: 24 additions & 0 deletions
@@ -9,6 +9,30 @@ often, sometimes several in a day. It also means any individual release
 will not have many changes within it. Below is a list of releases along
 with what was changed within it.
 
+## 0.12.0 (2025-11-03)
+
+### Feat
+
+- :sparkles: adds `Extensions` meta-class (#165)
+
+## 0.11.1 (2025-11-03)
+
+### Refactor
+
+- :recycle: handle grouped errors without modifying input (#177)
+
+## 0.11.0 (2025-11-03)
+
+### Feat
+
+- :sparkles: add example field in resource (#174)
+
+## 0.10.0 (2025-11-03)
+
+### Feat
+
+- :sparkles: exclude required issues at a given JSON path (#138)
+
 ## 0.9.0 (2025-10-29)
 
 ### Feat
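
Note on the 0.11.1 entry (#177): in this commit's `check.py`, each grouped-error handler returns a `SchemaErrorEdits` value (errors to add, errors to remove), and `_handle_grouped_error` applies those edits to build a new list, so the list passed in is never mutated. The sketch below illustrates that pattern only. `Err`, `Edits`, `HANDLERS`, and `replace_parent` are hypothetical stand-ins, not the package's own names, and the handlers are combined with a plain loop where the commit uses a recursive helper.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Err:
    """Hypothetical stand-in for the package's SchemaError."""

    message: str
    schema_path: str


@dataclass(frozen=True)
class Edits:
    """Stand-in for SchemaErrorEdits: errors to add and errors to remove."""

    add: list[Err] = field(default_factory=list)
    remove: list[Err] = field(default_factory=list)


Handler = Callable[[Err, list[Err]], Edits]


def replace_parent(parent: Err, errors: list[Err]) -> Edits:
    """Toy handler: swap the parent error for a more informative one."""
    return Edits(
        add=[Err("friendlier message", parent.schema_path)],
        remove=[parent],
    )


# Schema-path suffix paired with the handler to run when it matches.
HANDLERS: list[tuple[str, Handler]] = [("items/oneOf", replace_parent)]


def handle_grouped_error(errors: list[Err], parent: Err) -> list[Err]:
    """Collect edits from matching handlers and return a new list."""
    add: list[Err] = []
    remove: list[Err] = []
    for suffix, handler in HANDLERS:
        if parent.schema_path.endswith(suffix):
            edits = handler(parent, errors)
            add.extend(edits.add)
            remove.extend(edits.remove)
    # Build a fresh list; `errors` itself is left untouched.
    return [error for error in errors if error not in remove] + add


parent = Err("not valid under any of the given schemas", "resources/items/oneOf")
other = Err("name is required", "required")
original = [parent, other]
result = handle_grouped_error(original, parent)
```

After the call, `original` still holds both errors, while `result` drops the parent and ends with the replacement. Because no handler mutates its input, handlers can be applied in any order.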

_quarto.yml

Lines changed: 2 additions & 0 deletions
@@ -73,6 +73,7 @@ quartodoc:
       desc: Classes to configure behaviour of the checks.
       contents:
         - name: Config
+        - name: Extensions
         - name: CustomCheck
         - name: RequiredCheck
         - name: Exclusion
@@ -87,6 +88,7 @@ quartodoc:
       contents:
         - name: example_package_properties
         - name: example_resource_properties
+        - name: example_field_properties
 
 
 metadata-files:

docs/design/interface.qmd

Lines changed: 5 additions & 53 deletions
@@ -144,63 +144,15 @@ Package standard that are not relevant to your use case.
 
 See the help documentation with `help(Exclusion)` for more details.
 
-#### {{< var wip >}} `Extensions`
+#### {{< var done >}} `Extensions`
 
 This sub-item of `Config` defines extensions, i.e., additional checks
-that supplement those specified by the Data Package standard. It
-contains subitems that store additional checks, such as `RequiredCheck`
-and `CustomCheck`. This `Extensions` class might be expanded to include
-more types of extensions.
-
-```` python
-@dataclass(frozen=True)
-class Extensions:
-    """Extensions to the standard checks.
-
-    This contains additional checks to be made alongside the standard
-    Data Package checks.
-
-    Attributes:
-        required_checks: A list of `RequiredCheck` objects defining properties to set as required.
-        custom_checks: A list of `CustomCheck` objects defining extra, custom checks to run alongside the standard
-            checks.
-
-    Examples:
-
-    ```python
-    # TODO: Add curly brackets around python on the line above when the class has been implemented.
-    import check_datapackage as cdp
-
-    extensions = cdp.Extensions(
-        required_checks=[
-            cdp.RequiredCheck(
-                jsonpath="$.description",
-                message="Data Packages must include a description."
-            ),
-            cdp.RequiredCheck(
-                jsonpath="$.contributors[*].email",
-                message="All contributors must have an email address."
-            )
-        ],
-        custom_checks=[cdp.CustomCheck(
-            type="only-mit",
-            jsonpath="$.licenses[*].name",
-            message="Data Packages may only be licensed under MIT.",
-            check=lambda license_name: license_name == "mit",
-        )]
-    )
-    # check(properties, config=cdp.Config(extensions=extensions))
-    ```
-    """
-    required_checks : list[RequiredCheck] = field(default_factory=list)
-    custom_checks : list[CustomCheck] = field(default_factory=list)
-````
+that supplement those specified by the Data Package standard.
 
-Each extension class must implement its own `apply()` method that takes
-the `datapackage.json` properties `dict` as input and outputs an `Issue`
-list that contains the issues found by that extension.
+See the help documentation with
+[`Extensions`](/docs/reference/Extensions.qmd) for more details.
 
-#### {{< var wip >}} `RequiredCheck`
+#### {{< var done >}} `RequiredCheck`
 
 A sub-item of `Extensions` that allows users to set specific properties
 as required that are not required by the Data Package standard. See the

docs/guide/check.qmd

Lines changed: 1 addition & 1 deletion
@@ -100,7 +100,7 @@ Data Package standard, or add your own custom checks.
 ::: callout-tip
 For more details on configuring checks, see the [Configuring
 checks](config.qmd) guide or the [`Config` reference
-documentation](/docs/reference/config.qmd).
+documentation](/docs/reference/Config.qmd).
 :::
 
 ## Stop program on failed checks (`error=True`) {#stop-program-on-failed-checks-errortrue}

docs/guide/config.qmd

Lines changed: 1 addition & 0 deletions
@@ -135,6 +135,7 @@ To register your custom checks with the `check()` function, you add them
 to the `Config` object passed to the function:
 
 ```{python}
+#| eval: false
 package_properties = {
     "name": "woolly-dormice",
     "title": "Hibernation Physiology of the Woolly Dormouse: A Scoping Review.",

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "check-datapackage"
-version = "0.9.0"
+version = "0.12.0"
 # TODO: Add a description of the package.
 description = ""
 authors = [

src/check_datapackage/__init__.py

Lines changed: 8 additions & 2 deletions
@@ -2,20 +2,26 @@
 
 from .check import check
 from .config import Config
-from .examples import example_package_properties, example_resource_properties
+from .examples import (
+    example_field_properties,
+    example_package_properties,
+    example_resource_properties,
+)
 from .exclusion import Exclusion
-from .extensions import CustomCheck, RequiredCheck
+from .extensions import CustomCheck, Extensions, RequiredCheck
 from .issue import Issue
 from .read_json import read_json
 
 __all__ = [
     "Config",
     "Exclusion",
     "Issue",
+    "Extensions",
     "CustomCheck",
     "RequiredCheck",
     "example_package_properties",
     "example_resource_properties",
+    "example_field_properties",
     "check",
     "read_json",
 ]

src/check_datapackage/check.py

Lines changed: 64 additions & 34 deletions
@@ -1,7 +1,7 @@
 import re
-from dataclasses import dataclass
+from dataclasses import dataclass, field
 from functools import reduce
-from typing import Any, Iterator, Optional
+from typing import Any, Callable, Iterator, Optional
 
 from jsonschema import Draft7Validator, FormatChecker, ValidationError
 
@@ -44,7 +44,7 @@ class for more details, especially about the default values.
         _set_should_fields_to_required(schema)
 
     issues = _check_object_against_json_schema(properties, schema)
-    issues += apply_extensions(properties, config.custom_checks)
+    issues += apply_extensions(properties, config.extensions)
     issues = exclude(issues, config.exclusions, properties)
 
     return sorted(set(issues))
@@ -137,46 +137,31 @@ def _validation_errors_to_issues(
     return _map(schema_errors, _create_issue)
 
 
-def _handle_grouped_error(
-    schema_errors: list[SchemaError], parent_error: SchemaError
-) -> list[SchemaError]:
-    """Handle grouped schema errors that need special treatment.
-
-    Args:
-        schema_errors: All remaining schema errors.
-        parent_error: The parent error of a group.
-
-    Returns:
-        The schema errors after processing.
-    """
-    # Handle issues at $.resources[x]
-
-    if parent_error.schema_path.endswith("resources/items/oneOf"):
-        schema_errors = _handle_S_resources_x(parent_error, schema_errors)
-
-    # Handle issues at $.resources[x].path
-    if parent_error.schema_path.endswith("resources/items/properties/path/oneOf"):
-        schema_errors = _handle_S_resources_x_path(parent_error, schema_errors)
+@dataclass(frozen=True)
+class SchemaErrorEdits:
+    """Expresses which errors to add to or remove from schema errors."""
 
-    return schema_errors
+    add: list[SchemaError] = field(default_factory=list)
+    remove: list[SchemaError] = field(default_factory=list)
 
 
 def _handle_S_resources_x(
     parent_error: SchemaError,
     schema_errors: list[SchemaError],
-) -> list[SchemaError]:
+) -> SchemaErrorEdits:
     """Do not flag missing `path` and `data` separately."""
+    edits = SchemaErrorEdits()
     errors_in_group = _filter(schema_errors, lambda error: error.parent == parent_error)
     # If the parent error is caused by other errors, remove it
     if errors_in_group:
-        schema_errors.remove(parent_error)
+        edits.remove.append(parent_error)
 
     path_or_data_required_errors = _filter(
         errors_in_group, _path_or_data_required_error
     )
     # If path and data are both missing, add a more informative error
     if len(path_or_data_required_errors) > 1:
-        schema_errors.append(
+        edits.add.append(
             SchemaError(
                 message=(
                     "This resource has no `path` or `data` field. "
@@ -189,31 +174,31 @@ def _handle_S_resources_x(
         )
 
     # Remove all required errors on path and data
-    return _filter(
-        schema_errors, lambda error: error not in path_or_data_required_errors
-    )
+    edits.remove.extend(path_or_data_required_errors)
+    return edits
 
 
 def _handle_S_resources_x_path(
     parent_error: SchemaError,
     schema_errors: list[SchemaError],
-) -> list[SchemaError]:
+) -> SchemaErrorEdits:
     """Only flag errors for the relevant type.
 
     If `path` is a string, flag errors for the string-based schema.
     If `path` is an array, flag errors for the array-based schema.
     """
+    edits = SchemaErrorEdits()
     errors_in_group = _filter(schema_errors, lambda error: error.parent == parent_error)
     type_errors = _filter(errors_in_group, _is_path_type_error)
     only_type_errors = len(errors_in_group) == len(type_errors)
 
     if type_errors:
-        schema_errors.remove(parent_error)
+        edits.remove.append(parent_error)
 
     # If the only error is that $.resources[x].path is of the wrong type,
     # add a more informative error
     if only_type_errors:
-        schema_errors.append(
+        edits.add.append(
             SchemaError(
                 message="The `path` property must be either a string or an array.",
                 type="type",
@@ -223,7 +208,52 @@ def _handle_S_resources_x_path(
         )
 
     # Remove all original type errors on $.resources[x].path
-    return _filter(schema_errors, lambda error: error not in type_errors)
+    edits.remove.extend(type_errors)
+    return edits
+
+
+_schema_path_to_handler: list[
+    tuple[str, Callable[[SchemaError, list[SchemaError]], SchemaErrorEdits]]
+] = [
+    ("resources/items/oneOf", _handle_S_resources_x),
+    ("resources/items/properties/path/oneOf", _handle_S_resources_x_path),
+]
+
+
+def _handle_grouped_error(
+    schema_errors: list[SchemaError], parent_error: SchemaError
+) -> list[SchemaError]:
+    """Handle grouped schema errors that need special treatment.
+
+    Args:
+        schema_errors: All remaining schema errors.
+        parent_error: The parent error of a group.
+
+    Returns:
+        The schema errors after processing.
+    """
+
+    def _get_edits(
+        handlers: list[
+            tuple[str, Callable[[SchemaError, list[SchemaError]], SchemaErrorEdits]]
+        ],
+    ) -> SchemaErrorEdits:
+        schema_path, handler = handlers[0]
+        edits = SchemaErrorEdits()
+        if parent_error.schema_path.endswith(schema_path):
+            edits = handler(parent_error, schema_errors)
+
+        if len(handlers) == 1:
+            return edits
+
+        next_edits = _get_edits(handlers[1:])
+        return SchemaErrorEdits(
+            add=edits.add + next_edits.add,
+            remove=edits.remove + next_edits.remove,
+        )
+
+    edits = _get_edits(_schema_path_to_handler)
+    return _filter(schema_errors, lambda error: error not in edits.remove) + edits.add
 
 
 def _validation_error_to_schema_errors(error: ValidationError) -> list[SchemaError]:

src/check_datapackage/config.py

Lines changed: 11 additions & 6 deletions
@@ -2,7 +2,7 @@
 from typing import Literal
 
 from check_datapackage.exclusion import Exclusion
-from check_datapackage.extensions import CustomCheck, RequiredCheck
+from check_datapackage.extensions import Extensions
 
 
 @dataclass
@@ -12,8 +12,8 @@ class Config:
     Attributes:
         exclusions (list[Exclusion]): Any issues matching any of Exclusion objects will
             be excluded (i.e., removed from the output of the check function).
-        custom_checks (list[CustomCheck | RequiredCheck]): Custom checks listed here
-            will be done in addition to checks defined in the Data Package standard.
+        extensions (Extensions): Additional checks (called extensions)
+            that supplement those specified by the Data Package standard.
         strict (bool): Whether to include "SHOULD" checks in addition to "MUST" checks
             from the Data Package standard. If True, "SHOULD" checks will also be
             included. Defaults to False.
@@ -37,12 +37,17 @@ class Config:
         )
         config = cdp.Config(
             exclusions=[exclusion_required],
-            custom_checks=[license_check, required_title_check],
+            extensions=cdp.Extensions(
+                custom_checks=[license_check],
+                required_checks=[required_title_check]
+            )
         )
+
+        # check(properties, config=config)
         ```
         """
 
-    exclusions: list[Exclusion] = field(default_factory=list)
-    custom_checks: list[CustomCheck | RequiredCheck] = field(default_factory=list)
+    exclusions: list[Exclusion] = field(default_factory=list[Exclusion])
+    extensions: Extensions = Extensions()
     strict: bool = False
     version: Literal["v1", "v2"] = "v2"
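
Note on the `Config` change: the flat `custom_checks` list is replaced by a nested `Extensions` object holding `required_checks` and `custom_checks` separately. The sketch below shows only that shape, using stand-in dataclasses so it runs without the package installed. The field names come from the diff. The class bodies are assumptions, and the stand-in `Config` uses `default_factory=Extensions` where the commit assigns `Extensions()` directly.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class RequiredCheck:
    """Stand-in: marks the property at `jsonpath` as required."""

    jsonpath: str
    message: str


@dataclass(frozen=True)
class CustomCheck:
    """Stand-in: runs `check` on each value found at `jsonpath`."""

    type: str
    jsonpath: str
    message: str
    check: Callable[[object], bool]


@dataclass(frozen=True)
class Extensions:
    """Stand-in: groups both kinds of additional checks."""

    required_checks: list[RequiredCheck] = field(default_factory=list)
    custom_checks: list[CustomCheck] = field(default_factory=list)


@dataclass
class Config:
    """Stand-in: only the field this commit changes."""

    extensions: Extensions = field(default_factory=Extensions)


config = Config(
    extensions=Extensions(
        required_checks=[
            RequiredCheck(
                jsonpath="$.description",
                message="Data Packages must include a description.",
            )
        ],
        custom_checks=[
            CustomCheck(
                type="only-mit",
                jsonpath="$.licenses[*].name",
                message="Data Packages may only be licensed under MIT.",
                check=lambda license_name: license_name == "mit",
            )
        ],
    )
)
```

With the real package, the equivalent call would use `cdp.Config(extensions=cdp.Extensions(...))`, as in the updated docstring above. Callers that previously passed `custom_checks=[...]` directly to `Config` need to move required checks into `required_checks` and the rest into `custom_checks`.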
