Data Preparation Modules: panoply_preprocess_gct
wcorinne edited this page Aug 27, 2025 · 2 revisions

This module collapses a feature-level GCT to the gene-centric level (for ssGSEA) or the site-centric level (for PTM-SEA), producing a GCT suitable as input to the panoply_ssgsea module.

- input_ds: (.gct file) Input GCT file.
- yaml_file: (.yaml file) master-parameters.yaml
- output_prefix: (String) File prefix for output files.

- acc_type: (String) Type of accession number in the 'rid' object of the GCT file: "uniprot", "refseq" (default), or "symbol".
- id_type: (String) Notation of site IDs: 'sm' - Spectrum Mill (default); 'wg' - WebGestalt; 'ph' - Philosopher. Only relevant for PTM-SEA.
- id_type_out: (String) Type of site ID for output: 'uniprot' (default), 'refseq', 'seqwin'. Only relevant for PTM-SEA.
- level: (String) Mode of report:
  - 'ssc' - single-site-centric
  - 'gc' - gene-centric (default)
  - 'gcr' - gene-centric-redundant
- loc: (Boolean) If TRUE, only fully localized sites are considered (default: TRUE). Localization information is expected to be encoded in the site identifier; the parsing rules are determined by '--id_type'.
- gene_col: (String) Name of the column listing gene names; used for gene-centric reports (default: "geneSymbol").
- humanize_gene: (Boolean) If TRUE, gene symbols will be capitalized; can be used to crudely humanize mouse or rat gene symbols.
- seqwin_col: (String) Column containing flanking sequences, separated by '|'. Only relevant for PTM-SEA and if '--id_type_out' = 'seqwin' (default: 'VMsiteFlanks').
- SGT_col: (String) Column used to collapse subgroup-top (SGT) reports (default: "subgroupNum"). Only relevant for Spectrum Mill protein reports.
- mod_res: (String) Modified residues, e.g. "S|T|Y" or "K" (default: "S|T|Y").
- mod_type: (String) Type of post-translational modification, e.g. "p" for phospho (default) or "ac" for acetylation.
- mode: (String) Determines how multiple features (e.g. proteins, PTM sites) mapping to the same gene symbol are aggregated:
  - "mean" - mean
  - "median" - median
  - "sd" - most variable (standard deviation) across sample columns
  - "SGT" - subgroup top: first subgroup in protein group (Spectrum Mill)
  - "abs.max" - for log-transformed, signed p-values

- result: (.gct file) Preprocessed GCT file, appropriate for use in ssGSEA or PTM-SEA.
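
To make the `mode` options more concrete, here is a minimal Python sketch of how features that share a gene symbol might be collapsed into one row per gene. It is an illustration under stated assumptions, not the module's actual implementation: the function name `collapse_to_genes` and the input layout (a list of gene symbol and per-sample value pairs) are hypothetical, "sd" is read as "keep the single feature with the largest standard deviation across samples", "abs.max" is read as "per sample, keep the value with the largest absolute magnitude, sign included", missing values are not handled, and the "SGT" mode is omitted because it depends on Spectrum Mill subgroup annotations.

```python
from collections import defaultdict
from statistics import mean, median, stdev


def collapse_to_genes(rows, mode="mean"):
    """Collapse feature rows to one row per gene symbol.

    rows: list of (gene_symbol, [value per sample]) pairs (hypothetical layout).
    mode: "mean", "median", "sd", or "abs.max" (interpretations are assumptions).
    """
    # Group all feature rows that map to the same gene symbol.
    groups = defaultdict(list)
    for gene, values in rows:
        groups[gene].append(list(values))

    collapsed = {}
    for gene, features in groups.items():
        if mode == "sd":
            # Keep the one feature that varies most across the sample columns.
            collapsed[gene] = max(
                features, key=lambda v: stdev(v) if len(v) > 1 else 0.0
            )
            continue

        # For the remaining modes, aggregate sample by sample.
        columns = list(zip(*features))
        if mode == "mean":
            collapsed[gene] = [mean(col) for col in columns]
        elif mode == "median":
            collapsed[gene] = [median(col) for col in columns]
        elif mode == "abs.max":
            # Largest magnitude wins, and the sign is preserved.
            collapsed[gene] = [max(col, key=abs) for col in columns]
        else:
            raise ValueError(f"unsupported mode: {mode!r}")
    return collapsed


# Two features map to TP53, one to EGFR; three sample columns each.
rows = [
    ("TP53", [1.0, -4.0, 2.0]),
    ("TP53", [3.0, 2.0, 2.0]),
    ("EGFR", [0.5, 0.5, 0.5]),
]

print(collapse_to_genes(rows, mode="abs.max")["TP53"])  # [3.0, -4.0, 2.0]
```

The "abs.max" case shows why that mode suits signed, log-transformed p-values: averaging 1.0 and 3.0 or -4.0 and 2.0 would dilute the strongest signal, whereas taking the value of largest magnitude keeps both its strength and its direction.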