Skip to content
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions .claude/settings.local.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
{
"permissions": {
"allow": [
"Bash(git branch:*)"
],
"deny": [],
"ask": []
}
}
2 changes: 0 additions & 2 deletions _quarto.yml
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,6 @@ website:
search: true
logo: assets/isampleslogopetal.png
tools:
- icon: table
href: https://hyde.cyverse.org/isamples_central/ui/
- icon: github
href: https://github.com/isamplesorg
- icon: slack
Expand Down
10 changes: 9 additions & 1 deletion about.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -4,10 +4,18 @@ title: "About iSamples"

# Project Objectives

1. Design and develop iSamples infrastructure (iSamples in a Box and iSamples Central);
1. Design and develop iSamples infrastructure (iSamples in a Box and distributed data systems);
2. Build four initial implementations of iSamples for adoption and use case testing (Open Context, GEOME, SESAR, and Smithsonian Institution);
3. Conduct outreach and community engagement to developers, individual researchers, and international organizations concerned with material samples.

## Current Data Access

**Note**: iSamples Central is currently unavailable. The project has transitioned to a **geoparquet-based approach** for data access and analysis:

- **Primary Data Source**: Comprehensive geoparquet files containing millions of sample records
- **Analysis Platform**: Browser-based tools using DuckDB-WASM and Observable
- **Coverage**: Complete datasets from SESAR, OpenContext, GEOME, and Smithsonian collections

![iSamples diagram](assets/iSamplesArchitecture.png)


Expand Down
2 changes: 1 addition & 1 deletion design/requirements.md
Original file line number Diff line number Diff line change
Expand Up @@ -337,7 +337,7 @@ Components
## 15 All content sources should be assumed to be dynamic and attached components should facilitate efficient synchronization of subscribed content.


iSamples central will need to continually update the catalog and promote dissemination of the content to subscribers (e.g. iSB instances).
With the transition to geoparquet-based data access, content synchronization now occurs through periodic updates of parquet files rather than real-time API synchronization. This approach provides better performance and reliability for analytical workloads.

Derived from:

Expand Down
9 changes: 9 additions & 0 deletions index.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,15 @@ subtitle: "Toward an Interdisciplinary Cyberinfrastructure for Material Samples

The Internet of Samples (iSamples) is a multi-disciplinary and multi-institutional project funded by the National Science Foundation to design, develop, and promote service infrastructure to uniquely, consistently, and conveniently identify material samples, record metadata about them, and persistently link them to other samples and derived digital content, including images, data, and publications.

## Current Data Access: Geoparquet-Based Approach

**Note**: iSamples Central is currently unavailable. The project now uses **geoparquet files** for efficient, browser-based data access and analysis:

- 📊 **[Interactive Tutorials](/tutorials/)** - Modern browser-based analysis with DuckDB-WASM
- 🗺️ **Comprehensive Coverage** - Complete datasets from SESAR, OpenContext, GEOME, and Smithsonian
- 🚀 **High Performance** - 5-10x faster than traditional approaches with minimal memory usage
- 🌐 **Universal Access** - Works in any modern browser without software installation

**Resources**

* [Recording of project presentation at the 2020 SPNHC & ICOM NATHIST Conference](https://youtu.be/eRUw5NMksFo?t=105)
Expand Down
81 changes: 27 additions & 54 deletions tutorials/index.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -2,66 +2,39 @@
title: "Tutorials: Overview"
---

Here's where we park our various tutorials!
Welcome to the iSamples tutorials! These tutorials demonstrate how to work with sample data using modern browser-based tools and geoparquet files.

Get the OpenAPI spec.
## Available Data Sources

```{ojs}
//| echo: true
With iSamples Central currently unavailable, all tutorials now use **geoparquet files** as the primary data source:

// Get the OpenAPI specification and display detailed endpoint information
viewof apiEndpointDetails = {
// Show loading indicator
const loadingElement = html`<div>Loading API endpoints...</div>`;
document.body.appendChild(loadingElement);
### Primary Data Sources
- **Zenodo Complete Dataset**: ~300MB, 6+ million records from all iSamples sources
- **OpenContext Parquet**: Curated archaeological sample data
- **Domain-specific Collections**: Specialized datasets for focused analysis

try {
const OPENAPI_URL = 'https://central.isample.xyz/isamples_central/openapi.json';
### Tutorial Categories

// Fetch the OpenAPI spec
const response = await fetch(OPENAPI_URL);
if (!response.ok) throw new Error(`Failed to fetch API spec: ${response.status}`);
**🗺️ Geographic Analysis**
- Interactive mapping and spatial exploration
- Regional distribution analysis
- Cesium-based 3D visualizations

const apiSpec = await response.json();
**📊 Data Analysis**
- Statistical analysis with DuckDB-WASM
- Material category distributions
- Cross-collection comparisons

// Extract detailed information about each endpoint
const endpointDetails = [];
**🚀 Performance Demonstrations**
- Browser-based big data analysis
- Efficient sampling and visualization techniques
- HTTP range request optimization

for (const [path, pathMethods] of Object.entries(apiSpec.paths)) {
for (const [method, details] of Object.entries(pathMethods)) {
endpointDetails.push({
endpoint: path,
method: method.toUpperCase(),
summary: details.summary || '',
operationId: details.operationId || '',
tags: (details.tags || []).join(', '),
parameters: (details.parameters || [])
.map(p => `${p.name} (${p.required ? 'required' : 'optional'})`)
.join(', ')
});
}
}
## Why Geoparquet?

// Create a table with the detailed endpoint information
return Inputs.table(
endpointDetails,
{
label: "iSamples API Endpoints Details",
width: {
endpoint: 150,
method: 80,
summary: 200,
operationId: 200,
tags: 100,
parameters: 300
}
}
);
} catch (error) {
return html`<div style="color: red">Error fetching API endpoints: ${error.message}</div>`;
} finally {
// Remove loading indicator
loadingElement.remove();
}
}
```
Our tutorials showcase how **geoparquet + DuckDB-WASM** enables:
- ✅ **Universal access**: No software installation required
- ✅ **Fast analysis**: 5-10x faster than traditional approaches
- ✅ **Memory efficient**: Analyze 300MB datasets using <100MB browser memory
- ✅ **Minimal data transfer**: Only download what you need
- ✅ **Interactive exploration**: Real-time parameter adjustment