Add soft link and external links in JLD2 #686
Open
A few words from me: This is the result of a much more refined AI coding experiment. It certainly needs more human review, but it successfully added new features, refactoring a non-trivial amount of JLD2 internals to do so. All previous tests still pass, and new tests and docs were added. There's quite a bit of security code for detecting whether external file links point somewhere malicious. I think it might be better to remove these checks for simplicity, since JLD2 struct loading has already been shown to allow arbitrary code execution with maliciously crafted files (and unfixably so).
Here's the AI's PR summary:
Add External Links and Soft Links Support
This PR implements comprehensive external link and soft link support for JLD2, enabling cross-file references and flexible data organization patterns while maintaining full backward compatibility and HDF5 specification compliance.
Features Added
🔗 External Links
🔗 Soft Links
🏗️ Architecture
HardLink, SoftLink, ExternalLink extending AbstractLink
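To make the hierarchy above concrete, here is a minimal sketch of how three link types extending `AbstractLink` could look. Only the four type names come from this PR summary; the field names (`offset`, `path`, `file_path`, `object_path`) and the `describe` helper are assumptions for illustration, and a plain `Dict` stands in for the `OrderedDict` used in JLD2 so the snippet needs no extra packages.

```julia
# Hypothetical sketch: only the type names appear in the PR summary.
# Field names and `describe` are illustrative assumptions.
abstract type AbstractLink end

# Points directly at an object stored in the same file.
struct HardLink <: AbstractLink
    offset::UInt64
end

# Stores a path inside the same file, resolved when the entry is accessed.
struct SoftLink <: AbstractLink
    path::String
end

# Stores another file's path plus an object path inside that file.
struct ExternalLink <: AbstractLink
    file_path::String
    object_path::String
end

# Multiple dispatch selects the behavior per link type.
describe(l::HardLink) = "hard link to offset $(l.offset)"
describe(l::SoftLink) = "soft link to $(l.path)"
describe(l::ExternalLink) = "external link to $(l.object_path) in $(l.file_path)"

# A group's entries can then hold any link type under one value type.
links = Dict{String,AbstractLink}(
    "a" => HardLink(0x30),
    "b" => SoftLink("/group/a"),
    "c" => ExternalLink("other.jld2", "/data"),
)

for name in sort(collect(keys(links)))
    println(name, ": ", describe(links[name]))
end
```

Storing an abstract value type in the dictionary is what lets one container mix the three kinds of link; callers that only care about hard links can still filter with `isa HardLink`.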
API
Performance Characteristics
Compatibility
h5dump, h5debug, h5py
jldsave/load
Implementation Highlights
Files Changed
Core Implementation
src/links.jl - Abstract link type hierarchy
src/external_files.jl - External file management and caching
src/path_resolution.jl - Secure path resolution
src/groups.jl - Enhanced group operations with link support
src/headermessages.jl - HDF5 link message parsing/writing
src/explicit_datasets.jl - Dataset access through links
Integration
src/JLD2.jl - Module integration and exports
OrderedDict{String,RelOffset} to OrderedDict{String,AbstractLink}
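For readers weighing whether the path checks are worth keeping, the following is a rough sketch of what a "secure path resolution" step for external links might involve. It is not the code in src/path_resolution.jl: the function name `resolve_external_path` and the policy (reject absolute targets, reject targets that leave the linking file's directory) are assumptions chosen for illustration, using only Julia's Base filesystem functions.

```julia
# Hypothetical helper, not the PR's implementation.
# Resolves an external link target relative to the directory of the file
# containing the link and rejects targets outside that directory.
function resolve_external_path(current_file::AbstractString, target::AbstractString)
    # Directory of the linking file, as an absolute path.
    base = dirname(abspath(current_file))

    # Assumed policy: absolute targets are refused outright.
    isabspath(target) &&
        throw(ArgumentError("absolute external path rejected: $target"))

    # Collapse "." and ".." segments lexically (symlinks are not followed).
    resolved = normpath(joinpath(base, target))

    # A relative path that starts with ".." lies outside `base`.
    if first(splitpath(relpath(resolved, base))) == ".."
        throw(ArgumentError("external path escapes $base: $target"))
    end
    return resolved
end

ok = resolve_external_path("/data/project/main.jld2", "shared/ref.jld2")
println(ok)

escaped = try
    resolve_external_path("/data/project/main.jld2", "../../etc/passwd")
    false
catch err
    err isa ArgumentError
end
println("traversal rejected: ", escaped)
```

The check is purely lexical, so it does not defend against symlinks inside the allowed directory; a stricter variant would need `realpath`, which requires the target to exist.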
Testing
test/links.jl - Core link functionality tests
test/phase2_external_links.jl - External link creation tests
test/phase4_advanced_error_handling.jl - Error handling tests
test/phase5_soft_link_support.jl - Soft link functionality tests
test/performance_benchmarks.jl - Performance validation
Documentation
docs/external_links.md - Complete user documentation
example_external_links.jl - Comprehensive demo script
LINK_DEV_PROGRESS.md, DEVELOPMENT_INSIGHTS.md
Example Usage
Testing
All tests pass with comprehensive coverage:
Breaking Changes
None - This is a purely additive feature with full backward compatibility.
This implementation enables powerful modular data workflows while maintaining JLD2's performance and reliability characteristics. External links allow splitting large datasets across files, creating reusable data libraries, and building complex analysis pipelines with clear data provenance.