FernandoDuarteF commented May 13, 2025

Closes #67

Changes

  • Modified input validation. The samplesheet has 4 columns:
    • Id. Identifier name (not the sample name, as sample information should be in the vcf, phenotype and covariate files)
    • vcf. Genotype information in vcf format
    • phenotype. Phenotype information in plink format. It has at least three columns:
      • FID. Family identifier. If no information about the relationship between individuals is present or needed, this should be the same as the individual identifier
      • IID. Individual identifier
      • Phen. Phenotype/trait information (quantitative/qualitative)
    • covariate. File with covariates. It has at least three columns:
      • FID. Family identifier. If no information about the relationship between individuals is present or needed, this should be the same as the individual identifier
      • IID. Individual identifier
      • Cov. Covariate information (quantitative/qualitative)
  • Added the plink/vcf nf-core module. Converts vcf into plink 1 binary format (.bed, .bim, .fam)
  • Added the plink/gwas nf-core module. Runs the association study
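
To illustrate the layout expected for the phenotype and covariate files, here is a minimal Python sketch of a header check. It is not part of the pipeline: the function name `check_plink_table` is hypothetical, and it assumes the files are whitespace-delimited with a header row whose first two columns are named FID and IID.

```python
def check_plink_table(lines):
    """Validate a whitespace-delimited phenotype or covariate table.

    Assumes (hypothetically) a header row starting with FID and IID,
    followed by one or more trait or covariate columns.
    Returns the names of the columns after IID.
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    if not rows:
        raise ValueError("file is empty")
    header = rows[0]
    if header[:2] != ["FID", "IID"] or len(header) < 3:
        raise ValueError("header must be FID, IID and at least one more column")
    # Every data row must have the same number of fields as the header.
    for n, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            raise ValueError(
                f"row {n}: expected {len(header)} fields, got {len(row)}"
            )
    return header[2:]
```

For example, `check_plink_table(["FID IID Cov1 Cov2", "s1 s1 0.3 1"])` returns the two covariate names, while a header that lacks IID raises a `ValueError`.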

Comments

I had to modify the plink/vcf module because it didn't accept a pheno file as input. The pheno file is important: if it is not given, plink outputs a .fam file without phenotype information (the last column holds the integer -9 where the phenotype should be).
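
A quick way to spot this problem is to count the -9 entries in the last column of the resulting .fam file. The sketch below is only an illustration, not code from the modules: the function name `count_missing_phenotypes` is hypothetical, and it assumes the standard six-column .fam layout (FID, IID, father, mother, sex, phenotype).

```python
def count_missing_phenotypes(fam_lines):
    """Count samples whose phenotype field in a .fam file is -9.

    Assumes the six-column plink 1 .fam layout, with the phenotype
    in the sixth (last) column.
    """
    missing = 0
    for line in fam_lines:
        fields = line.split()
        if len(fields) >= 6 and fields[5] == "-9":
            missing += 1
    return missing
```

If the count equals the number of samples, the conversion most likely ran without a pheno file.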

I also had to modify the plink/gwas module because the output association table has a different file extension depending on the phenotype data (quantitative or qualitative):

  • If pheno is qualitative (case/control), it will output a .assoc file
  • If pheno is quantitative, it will output a .qassoc file

Before these modifications, only the .assoc extension was considered, so the pipeline failed when a quantitative phenotype was given.
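
The idea of accepting either extension can be sketched in Python as follows. This is an illustration only, not the module code: the function name `find_assoc_output` and the `outdir`/`prefix` parameters are hypothetical.

```python
from pathlib import Path


def find_assoc_output(outdir, prefix):
    """Return the association table plink wrote, whichever extension it has.

    Looks for <prefix>.assoc and <prefix>.qassoc in outdir and expects
    exactly one of them to exist.
    """
    candidates = [Path(outdir) / f"{prefix}{ext}" for ext in (".assoc", ".qassoc")]
    found = [p for p in candidates if p.is_file()]
    if len(found) != 1:
        raise FileNotFoundError(
            f"expected exactly one of {[p.name for p in candidates]} in {outdir}"
        )
    return found[0]
```

Downstream steps can then use the returned path without needing to know the phenotype type in advance.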

More than one covariate can be given: columns Cov1, Cov2, ... can be added after the IID field. Regarding the phenotype, analyses are univariate for now.

FernandoDuarteF changed the title from "Input valid" to "Input validation" on May 13, 2025
chriswyatt1 moved this to In progress in Hackathon May 2025 Boston on May 14, 2025
FernandoDuarteF changed the title from "Input validation" to "WIP: Input validation" on May 14, 2025
Successfully merging this pull request may close these issues.

Add validation of file inputs