@Zepeng-Mu

Hi,
I developed a new track class for plotting sashimi plots in pyGenomeTracks. This is my first PR, so please let me know if I did anything wrong.

Thanks so much!

@lldelisle
Collaborator

Dear @Zepeng-Mu ,
Thanks for sharing this work with us. I have some questions regarding the choices you made for this track:

  • The input files are:
    • bw with the coverage
    • links with exon-exon junctions and a score. If I align with STAR, the splice junction file has a different format (chr, first base of intron, last base of intron, strand, intron motif, annotation, number of uniquely mapped reads, number of multi-mapped reads, max spliced alignment overhang). Which software did you use to generate your link file? Was it the tool's natural output format, or did you need to reformat it to match the "links" format?
  • You kept 'operation', 'second_file' and 'transformation' from the bigwig track, but I don't think they are really useful here, especially because you expect only positive coverage in order to display the links below the line y = 0, right? Do you mind if I remove them? Alternatively, I can keep them and use the minimum value between the transformed value and 0, but I don't think 'second_file' would be useful.

@Zepeng-Mu
Author

Dear @lldelisle,
Thank you for taking care of this.
For the link and score file, I believe I'm following the link track class in the original pgt. You are right that we usually reformat the input. I personally have not used STAR output directly; we usually use leafcutter or regtools, which I guess can have different outputs. I was hoping that this link file could be very generic, with users formatting their outputs according to the specific tools they used upstream. Do you think this makes sense?

For the additional inputs, I agree that 'operation' and 'second_file' are probably not useful, so they can be removed. For 'transformation', I imagine log10 might be helpful in some cases, for example if the change in splicing is very subtle?

Thanks so much!

@lldelisle
Collaborator

Dear @Zepeng-Mu,
Thanks for your swift reply.
Let's keep 'transformation' then.
For the input format, regtools outputs a bed12 file and STAR outputs a bed-like file. I think it makes more sense to use a bed file to describe the junctions, where start and end are the intron coordinates and the fifth column (score) is the score for the junction. This means that the output of regtools could be used directly, as could any custom bed file, and STAR output would need to be slightly reshaped. This will also save processing time, because a bed file can be subset with bedtools whereas links (bedpe) cannot, so we would need to process all links on the chromosome. Would you agree with this change?
Best,
Lucille
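
For illustration, the "slight reshaping" of STAR output mentioned above could look roughly like the sketch below. It is only a sketch under stated assumptions: that STAR's `SJ.out.tab` intron coordinates are 1-based, that the number of uniquely mapped reads is the score one wants in the fifth column, and that the strand codes 0/1/2 mean undefined/+/-. The function name `star_sj_to_bed` is hypothetical and is not part of pyGenomeTracks.

```python
# Hypothetical helper: reshape one STAR SJ.out.tab line into a 6-column BED line.
# Assumed SJ.out.tab columns: chr, first base of intron, last base of intron,
# strand, intron motif, annotation, unique reads, multi-mapped reads, max overhang.

# Assumed STAR strand encoding: 0 = undefined, 1 = +, 2 = -
STRAND = {"0": ".", "1": "+", "2": "-"}


def star_sj_to_bed(line):
    """Return a BED6 line whose start/end are the intron coordinates."""
    fields = line.rstrip("\n").split("\t")
    chrom = fields[0]
    first_base = int(fields[1])  # 1-based, inclusive
    last_base = int(fields[2])   # 1-based, inclusive
    strand = STRAND.get(fields[3], ".")
    unique_reads = fields[6]     # assumption: used as the junction score

    # BED is 0-based and half-open, so only the start is shifted by one.
    start = first_base - 1
    end = last_base
    name = f"{chrom}:{first_base}-{last_base}"
    return "\t".join([chrom, str(start), str(end), name, unique_reads, strand])


if __name__ == "__main__":
    # Made-up example line, for demonstration only.
    example = "chr1\t14830\t14969\t2\t2\t1\t12\t3\t38"
    print(star_sj_to_bed(example))
```

A file could be converted by applying the function to every non-empty line of `SJ.out.tab` and writing the results to a `.bed` file; whether multi-mapped reads should also contribute to the score is a choice left to the user.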

@Zepeng-Mu
Author

Hi, I think it's a good idea. Thanks.
