This repository demonstrates LinkML schema development for modeling plant tissue sample metadata. It serves as a training resource for learning key LinkML features including data modeling, validation, and artifact generation.
This project implements a PlantTissueSample schema that captures comprehensive metadata for plant tissue samples, including:
- Sample identification and container information (tubes, plates, well locations)
- Taxonomic classification using NCBI Taxonomy IDs
- Biological characteristics (ploidy levels, tissue types, cultivar/strain information)
- Collection metadata (timestamps, sample sizes, tissue descriptions)
- Environmental context using ENVO (Environment Ontology) terms
- Geospatial information (depth, elevation)
- Plant anatomy using Plant Ontology (PO) terms
This tutorial showcases important LinkML modeling patterns:
- Uses standard biomedical ontologies (ENVO, PO, PATO, NCBITaxon)
- Demonstrates semantic mappings with `meaning`, `exact_mappings`, and `slot_uri`
- Shows `reachable_from` for dynamic enumeration from ontology hierarchies
- Required vs. optional fields
- Enumerated values with controlled vocabularies
- Pattern constraints (e.g., plate well positions: `^[A-H][1-9][0-2]?$`)
- Type ranges (string, integer, float, datetime, uriorcurie)
- Multivalued slots for multiple ontology term annotations
- Classes: PlantTissueSample with identifier and metadata slots
- Enumerations: SampleContainerEnum, PloidyEnum with PATO mappings
- Dynamic Enumerations: NCBITaxonEnum and TissueTypeEnum using `reachable_from`
- Slots: Field definitions with descriptions, constraints, and semantic annotations
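To get a feel for what the plate well position pattern listed above accepts, the following minimal sketch tests a few candidate values with Python's `re` module. It assumes the validator applies the pattern with semantics equivalent to Python regular expressions; the helper name `is_valid_well` is hypothetical and not part of the repository.

```python
import re

# Pattern quoted in the feature list for plate well positions.
WELL_PATTERN = re.compile(r"^[A-H][1-9][0-2]?$")


def is_valid_well(position: str) -> bool:
    """Return True if the whole string matches the well-position pattern.

    Hypothetical helper for illustration only.
    """
    return WELL_PATTERN.fullmatch(position) is not None


if __name__ == "__main__":
    # Row letter A-H, then a digit 1-9, then an optional digit 0-2.
    for candidate in ["A1", "H12", "B10", "I5", "A0", "a1", "C20"]:
        print(f"{candidate}: {is_valid_well(candidate)}")
```

Under these assumptions, `A1`, `H12`, and `B10` match, while `I5`, `A0`, and `a1` do not. The pattern as written also accepts values such as `C20`, so it is somewhat looser than the 12 columns of a standard 96-well plate.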
By exploring this repository, you will learn how to:
- Define LinkML schemas with classes, slots, and enumerations
- Integrate ontologies for semantic interoperability
- Add validation constraints (required fields, patterns, ranges)
- Generate artifacts (Python classes, Pydantic models, JSON Schema)
- Create test data (valid and invalid examples)
- Validate data using linkml-validate
- Document schemas with auto-generated documentation
https://linkml.github.io/linkml-tutorial-2025
- docs/ - mkdocs-managed documentation
  - elements/ - generated schema documentation
- examples/ - Examples of using the schema
- project/ - project files (these files are auto-generated, do not edit)
- src/ - source files (edit these)
  - linkml_tutorial_2025
    - schema/ - LinkML schema (edit this)
    - datamodel/ - generated Python datamodel
- tests/ - Python tests
  - data/ - Example data
# Clone the repository
git clone https://github.com/linkml/linkml-tutorial-2025.git
cd linkml-tutorial-2025
# Install dependencies
uv sync
# Run tests
just test
# Validate example data
uv run linkml-validate -s src/linkml_tutorial_2025/schema/linkml_tutorial_2025.yaml \
-C PlantTissueSample tests/data/valid/PlantTissueSample-001.yaml
# Generate artifacts (Python, Pydantic, JSON Schema, etc.)
just gen-project
The repository includes example data to demonstrate validation:
- PlantTissueSample-001.yaml - Complete valid sample with all required fields
- PlantTissueSample-missing-required.yaml - Missing required fields (strain_variety_cultivar, ncbi_taxonomy_id, tissue)
- PlantTissueSample-bad-range.yaml - Invalid enum values and type mismatches
- PlantTissueSample-pattern-violation.yaml - Pattern constraint violations (plate location, sample size format)
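To check every example file in a directory in one pass, a small wrapper around the `linkml-validate` command can help. This is only a sketch: the function names are hypothetical, it reuses the schema path and `-s`/`-C` options shown in this README, and it assumes the CLI reports validation problems through a non-zero exit code.

```python
import subprocess
from pathlib import Path

# Schema path and target class as used elsewhere in this README.
SCHEMA = "src/linkml_tutorial_2025/schema/linkml_tutorial_2025.yaml"
TARGET_CLASS = "PlantTissueSample"


def build_validate_command(data_file, schema=SCHEMA, target_class=TARGET_CLASS):
    """Build the argument list for one linkml-validate run (hypothetical helper)."""
    return [
        "uv", "run", "linkml-validate",
        "-s", schema,
        "-C", target_class,
        str(data_file),
    ]


def validate_directory(directory):
    """Run the validator on every .yaml file in `directory`.

    Returns a dict mapping file name to the process exit code.
    Assumption: a non-zero exit code means the file failed validation.
    """
    results = {}
    for data_file in sorted(Path(directory).glob("*.yaml")):
        completed = subprocess.run(
            build_validate_command(data_file),
            capture_output=True,
            text=True,
        )
        results[data_file.name] = completed.returncode
    return results


if __name__ == "__main__":
    for name, code in validate_directory("tests/data/invalid").items():
        status = "passed" if code == 0 else "failed"
        print(f"{name}: validation {status} (exit code {code})")
```

Run from the repository root, every file under `tests/data/invalid` would be expected to fail validation, provided the exit-code assumption holds.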
Run validation to see error messages:
uv run linkml-validate -s src/linkml_tutorial_2025/schema/linkml_tutorial_2025.yaml \
-C PlantTissueSample tests/data/invalid/PlantTissueSample-missing-required.yaml
There are several pre-defined command-recipes available.
They are written for the command runner `just`. To list all pre-defined commands, run `just` or `just --list`.
- `just test` - Run all tests and generate artifacts
- `just gen-project` - Generate Python datamodels, JSON Schema, etc.
- `just docs-serve` - Serve documentation locally
This project uses the template linkml-project-copier published as doi:10.5281/zenodo.15163584.