temporary methylation readme #807
Open: jkgoodrich wants to merge 1 commit into main from jg/temp-methylation-readme.txt
@@ -0,0 +1,49 @@
# gnomAD Methylation Sites Data

This directory contains methylation site annotations for the GRCh38 reference genome, used in gnomAD constraint calculations and variant annotation pipelines.

## Files Description

### BED Files
- **methylation.bed**: Methylation site annotations for autosomes (chr1-22)
- **methylation_chrX.bed**: Methylation site annotations for chromosome X

### Hail Table Files
- **methylation.ht/**: Hail Table containing methylation annotations for autosomes only
Review comment: keeping the slash helps reinforce that HTs are actually directories rather than single files, but they do seem a little distracting (maybe just me?). What do you think about keeping or removing them?
- **methylation_chrX.ht/**: Hail Table containing methylation annotations for chromosome X only
- **methylation_all.ht/**: Merged Hail Table containing methylation annotations for all chromosomes (autosomes, chrX, and chrY)

## Recommended Usage

**We recommend using `methylation_all.ht/` for all applications** as it provides methylation annotations for all chromosomes (autosomes, chrX, and chrY) in a single Hail Table. This eliminates the need to handle multiple tables and ensures consistent annotation coverage across the entire genome.

The individual chromosome-specific tables (`methylation.ht/` and `methylation_chrX.ht/`) are legacy files that were created when complete methylation data was not available for all chromosomes. These files are maintained for backward compatibility but should not be used for new analyses.
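
To illustrate why the merged table is simpler to work with, here is a minimal sketch of the per-contig dispatch that the legacy layout would require. The function name `table_for_contig` and the `use_legacy` flag are hypothetical and not part of gnomAD's code; only the table names come from this README.

```python
# Hypothetical helper: pick the Hail Table that covers a given contig.
AUTOSOMES = {f"chr{i}" for i in range(1, 23)}

MERGED_TABLE = "methylation_all.ht"


def table_for_contig(contig: str, use_legacy: bool = False) -> str:
    """Return the name of the table that holds annotations for `contig`."""
    if not use_legacy:
        # Recommended path: one table for autosomes, chrX, and chrY.
        return MERGED_TABLE
    if contig in AUTOSOMES:
        return "methylation.ht"
    if contig == "chrX":
        return "methylation_chrX.ht"
    if contig == "chrY":
        # No legacy table is listed for chrY, so fall back to the merged one.
        return MERGED_TABLE
    raise ValueError(f"Unsupported contig: {contig}")
```

With the default arguments every supported contig resolves to `methylation_all.ht`, which is the behaviour recommended above.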

## Methylation Score Scales

The methylation scores use different scales depending on the genomic region:

- **Autosomes (chr1-22)**: 0-15 scale
- **chrX PAR regions**: 0-15 scale (same as autosomes)
- **chrX and chrY non-PAR regions**: 0-12 scale
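
The scale rules above can be sketched as a small lookup. This is an illustration only: the function name is hypothetical, the chrX PAR coordinates are the commonly cited GRCh38 values and should be checked against your reference, and because this README does not state a scale for the chrY PAR regions, the sketch treats all of chrY as non-PAR.

```python
# Assumed GRCh38 chrX pseudoautosomal regions (1-based, inclusive).
# Verify these coordinates against your reference genome before use.
CHRX_PAR = [(10_001, 2_781_479), (155_701_383, 156_030_895)]

AUTOSOMES = {f"chr{i}" for i in range(1, 23)}


def max_methylation_score(contig: str, pos: int) -> int:
    """Return the top of the methylation scale for a 1-based position."""
    if contig in AUTOSOMES:
        return 15
    if contig == "chrX":
        in_par = any(start <= pos <= end for start, end in CHRX_PAR)
        return 15 if in_par else 12
    if contig == "chrY":
        # Assumption: the README only documents chrY non-PAR (0-12).
        return 12
    raise ValueError(f"Unsupported contig: {contig}")
```

A lookup like this is mainly useful when comparing scores across regions, since a score of 12 is the maximum on non-PAR chrX but not on an autosome.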

## Data Format

### BED Files
Standard BED format with the following columns:
- Column 1: Chromosome
- Column 2: Start position (0-based, inclusive)
- Column 3: End position (exclusive, so numerically equal to the 1-based inclusive end)
- Column 4: Methylation score
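
As a rough sketch of reading these columns, the snippet below parses one BED line and converts the 0-based start to a 1-based position. The names `MethylationSite`, `parse_bed_line`, and `one_based_position` are hypothetical, the example line is made up, and parsing the score as an integer is an assumption based on the 0-15 and 0-12 scales.

```python
from typing import NamedTuple


class MethylationSite(NamedTuple):
    contig: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive
    score: int  # assumed to be an integer level


def parse_bed_line(line: str) -> MethylationSite:
    """Parse the first four whitespace-separated BED columns."""
    contig, start, end, score = line.split()[:4]
    return MethylationSite(contig, int(start), int(end), int(score))


def one_based_position(site: MethylationSite) -> int:
    """Return the 1-based coordinate of the first base in the interval."""
    return site.start + 1


# Made-up example line, not taken from the real files.
site = parse_bed_line("chr1\t10468\t10469\t3")
```

For a single-base interval, `one_based_position(site)` equals `site.end`, which is why the end column is often described as 1-based.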

### Hail Tables
The Hail Tables contain the same methylation score information in a format optimized for large-scale genomic analyses. Key annotations include:
- `locus`: Genomic position
- `methylation_level`: Methylation score (0-15 for autosomes/PAR, 0-12 for non-PAR)

## Citation

If you use these methylation annotations in your research, please cite:

Chen, S., et al. "A genomic mutational constraint map using variation in 76,156 human genomes." bioRxiv (2022). https://www.biorxiv.org/content/10.1101/2022.03.20.485034v2.full