This pipeline is based on Juicer and HiC-Pro and combines the advantages of both. HiCpipe is much faster than Juicer and HiC-Pro and can output multiple features of Hi-C maps.
The outputs are listed below:

| name | software | output content |
|---|---|---|
| mapping | bwa | merged mapped reads (.bam) |
| filter | HiC-Pro | contact pairs (.txt) |
| pair2hic | Juicer (pre) | compressed Hi-C maps (.hic) |
| hic2map | Juicer (dump) | sparse and dense matrices (.mat) |
| compartment | R eigen | PC1 values (.txt, .bw) |
| TAD | Insulation score | TAD boundaries (.bed); insulation score (.bw) |
| CDB | HiCDB | CDBs (.bed); relative insulation score (.bw) |
| loop | HiCloop | loops (.bedpe) |
| qc | shell | Hi-C quality report |

Other utilities:
- Easy clustering based on compartment and insulation.
- Statistics of Hi-C features.

All software mentioned above must be installed first. To install this pipeline, simply download it and use the shell scripts.

git clone https://github.com/ChenFengling/HiCpipe.git

Organize your data as PROJECT_PATH/sample/sample_R1.fq.gz and PROJECT_PATH/sample/sample_R2.fq.gz, for example:

BLHiC-project1
├── sample1
│   ├── sample1_R1.fq.gz
│   └── sample1_R2.fq.gz
└── sample2
    ├── sample2_R1.fq.gz
    └── sample2_R2.fq.gz

After the run, you will find the summarized data in PROJECT_PATH/all_results/.
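
Before launching the pipeline, it can help to confirm that every sample folder follows the layout above. The following is a minimal sketch, not part of HiCpipe: `check_layout` is a hypothetical helper, and it assumes that each sub-folder of the project is a sample (apart from `all_results`) and that both mates are named `<sample>_R1.fq.gz` and `<sample>_R2.fq.gz`.

```shell
# Sketch: verify that each sample folder holds <sample>_R1.fq.gz and
# <sample>_R2.fq.gz. check_layout is a hypothetical helper, not part of HiCpipe.
check_layout() {
    local project status dir sample mate
    project=$1
    status=0
    for dir in "$project"/*/; do
        # If the project has no sub-folders the glob stays literal; skip it.
        [ -d "$dir" ] || continue
        sample=$(basename "$dir")
        # all_results holds pipeline output, not reads (assumption: skip it).
        if [ "$sample" = "all_results" ]; then
            continue
        fi
        for mate in R1 R2; do
            if [ ! -f "${dir}${sample}_${mate}.fq.gz" ]; then
                echo "missing: ${dir}${sample}_${mate}.fq.gz" >&2
                status=1
            fi
        done
    done
    return "$status"
}
```

Calling `check_layout "$PROJECT_PATH"` returns a non-zero status and lists the missing files on stderr if any mate is absent, so it can be chained in front of the pipeline call with `&&`.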
Use the following command to analyse your BL-HiC data:

sh main.sh $PROJECT_PATH $Resolution $genome $core $HiCpipe_PATH

Configurations should be changed in config-hicpro_*.txt: BOWTIE2_IDX_PATH, GENOME_SIZE, GENOME_FRAGMENT.
1. Change the TSS annotation in compartment.r:

tss=read.table("YOUR_TSS_FOLDER/tss.bed")

2. Change BOWTIE2_IDX_PATH, GENOME_SIZE, and GENOME_FRAGMENT in config-hicpro.txt. Follow the instructions at https://github.com/nservant/HiC-Pro/tree/master/annotation to generate the restriction enzyme sites:

/home/software/HiC-Pro/bin/utils/digest_genome.py -r GG^CC -o mm9_ggcc.bed /home/reference/mouse/mm9/Sequence/BWAIndex/genome.fa

Use HiCqc.sh to generate the Hi-C QC report:

sh HiCqc.sh PROJECT_PATH REPORT_NAME

You will find the QC report REPORT_NAME_report.txt under PROJECT_PATH.

Trim the BL-linker and discard reads shorter than 15 bases:
Total_PETs
Expect_PETs
Expect_both_PETs
Chim_PETs
1Empty_PETs: PETs in which one end has no linker
2Empty_PETs: PETs in which neither end has a linker
Valid_PETs: trimmed PETs remaining after short reads are filtered out
Total_pairs_processed
Unmapped_pairs
Low_qual_pairs
Unique_paired_alignments
Multiple_pairs_alignments
Pairs_with_singleton
Low_qual_singleton
Unique_singleton_alignments
Multiple_singleton_alignments
Reported_pairs
Filter the data according to restriction sites:
Valid_interaction_pairs
Valid_interaction_pairs_FF
Valid_interaction_pairs_RR
Valid_interaction_pairs_RF
Valid_interaction_pairs_FR
Dangling_end_pairs
Religation_pairs
Self_Cycle_pairs
Single-end_pairs
Dumped_pairs
valid_interaction
valid_interaction_rmdup
trans_interaction
cis_interaction
cis_shortRange
cis_longRange
valid/total
rmdump/valid
intra/inter
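
The last entries of the report are ratios of the counts listed above. As a rough sketch of how such ratios can be recomputed by hand, the snippet below uses a hypothetical `ratio` helper and made-up counts. It assumes that `rmdump/valid` means valid_interaction_rmdup divided by valid_interaction and that `intra/inter` means cis_interaction divided by trans_interaction. The report itself does not state which total `valid/total` uses, so that ratio is left out.

```shell
# Sketch: recompute summary ratios from raw counts.
# ratio is a hypothetical helper, not part of HiCqc.sh.
ratio() {
    # Print numerator/denominator with 4 decimals, or "NA" if the denominator is 0.
    awk -v n="$1" -v d="$2" 'BEGIN { if (d == 0) print "NA"; else printf "%.4f\n", n / d }'
}

# Made-up illustrative counts, not real results.
valid_interaction=800000
valid_interaction_rmdup=600000
cis_interaction=450000
trans_interaction=150000

# Assumed definitions of the two ratios (see the note above).
echo "rmdup/valid: $(ratio "$valid_interaction_rmdup" "$valid_interaction")"
echo "intra/inter: $(ratio "$cis_interaction" "$trans_interaction")"
```

With these illustrative numbers the script prints 0.7500 for the first ratio and 3.0000 for the second. Replace the counts with the values from your own REPORT_NAME_report.txt.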

- ChIA-PET2: https://github.com/GuipengLi/ChIA-PET2
- HiC-Pro sample
- HiC-Pro
- Juicer tools pre: https://github.com/theaidenlab/juicer/wiki/Pre#4dn-dcic-format
- Juicebox: https://github.com/theaidenlab/Juicebox
- Video for Juicebox usage
- CNV and translocation tools: HiCtrans, HiCnv, HiCapp
- Related papers: Li Cheng lab (CNV)