mrunyan1/SpHAEC

Genotype-Specific Splice Site and Usage Prediction

This repository provides a pipeline for analyzing RNA-seq data to identify and quantify splice site usage (SSU), map genetic variants to transcripts, and train neural networks to predict splice sites and their usage.


Requirements

  • Python: 3.7-3.10
  • GCC: Tested with GCC 11.1.0.
    • Note: Older GCC versions might not support the C++ standard required by RegTools (e.g., C++11).
  • CUDA: 11.2 (for TensorFlow 2.10.0)
  • cuDNN: 8.1
    • Note: Ensure your CUDA/cuDNN versions are compatible with your TensorFlow version. Refer to the TensorFlow GPU support guide for compatibility details.
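Before creating the environment, it can help to confirm that the interpreter falls inside the supported range and to see which TensorFlow build and GPUs are visible. The following is a minimal sketch, not part of this repository: the helper names `python_supported` and `describe_tensorflow` are hypothetical, and the version bounds are copied from the list above.

```python
import sys

# Supported interpreter range, taken from the Requirements list above.
MIN_PYTHON = (3, 7)
MAX_PYTHON = (3, 10)


def python_supported(version=None):
    """Return True if the (major, minor) version is within the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = (version[0], version[1])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def describe_tensorflow():
    """Return the TensorFlow version and visible GPUs, or None if not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.list_physical_devices("GPU")
    return {"version": tf.__version__, "gpus": [gpu.name for gpu in gpus]}


if __name__ == "__main__":
    print("Python supported:", python_supported())
    info = describe_tensorflow()
    if info is None:
        print("TensorFlow is not installed in this environment.")
    else:
        # An empty GPU list usually points to a CUDA/cuDNN mismatch or a CPU-only build.
        print("TensorFlow", info["version"], "GPUs:", info["gpus"])
```

If TensorFlow reports no GPUs, compare the installed CUDA and cuDNN versions against the TensorFlow GPU support guide mentioned above.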

Installation

Create the conda environment with the Python requirements, then clone SpliSER into the pipeline directory.

conda env create -f environment.yml
conda activate proc-rnaseq

cd pipeline
git clone git@github.com:NNeuralDynamics/SpliSER.git

Repository Structure

spliceai_ssu

Contains the modified SpliceAI model integrated with SSU regression. Includes training and testing scripts, as well as tools for calculating evaluation metrics.

pipeline

Nextflow pipeline and supporting scripts that process RNA-seq data in BAM format, find splice sites and SSU values in each sample, and combine the per-sample results into a dataset for machine learning.

