a. Bedtools
b. Bioconductor packages: affy, R.basic and MASS
type following lines to install these 3 packages in R
#affy:
source("http://bioconductor.org/biocLite.R")
biocLite(c("affy"))
#R.basic and MASS:
source("http://www.braju.com/R/hbLite.R")
hbLite(c("R.basic","MASS"))
2. Usage:
I. Create a folder and place in the folder MAnorm.sh, MAnorm.r, and all 4 bed files (peak file and read file for the 2 samples under comparison) to be analyzed.
II. run command:
./MAnorm.sh sample1_peakfile[BED] sample2_peakfile[BED] sample1_readfile[BED] sample2_readfile[BED] sample1_readshift_lentgh[INT] sample2_readshift_length[INT] output_folder[string]
3. Example:
./MAnorm.sh sample1_peaks.bed sample2_peaks.bed sample1_read.bed sample2_read.bed 150 150 output_folder
The first 4 input files should be in bed format with no header lines
The first 2 peak files have 3 columns: chromosome, start, end.
The next 2 read files should have 6 columns (information from column 1, 2,3,6 are used): chromosome, start, end, description, description, strand (+/-)
The 5th and 6th parameters are the number of bp to be shifted for each read, which is approximately half of the fragment size. MACS estimates and give the shiftsize in the headerlines of MACS peak file *_peaks.xls after "# d =".
The last parameter is the output folder
MAnorm.r is called from MAnorm.sh, and there is no need to run it separately. Please check file Rcommand.out for the output file from running the R script for error tracking.
4. Output:
4.1 Output files created:
MAnorm_result.xls: Includes all peak coordinates, # raw reads from sample1, #raw reads from sample2, M-value_rescaled, A-value_rescaled, -log10(p-value).
This file lists the peak coordinates, raw reads, M and A values, and MA-norm p-value for the set of common peaks in each sample and for the peaks unique to each sample.
The file below (MAnorm_result_commonPeak_merged.xls) provides the list of merged common peaks.
MAnorm_result_commonPeak_merged.xls: This output file is similar to MAnorm_result.xls, except that the common peaks are merged. This file corresponds to a list of merged peaks for the two samples being compared.
sample1_peaks.wig: wig file for sample1 peak list, with the width of the bar representing the M value
sample2_peaks.wig: wig file for sample2 peak list, with the width of the bar representing the M value
MAplot_before_rescaling (common peaks).png: MAplot for common peaks before MAnorm
green line: robust regression
red line: LOWESS regression
blue line: line M = 0
MAplot_after_rescaling (all peaks).png: MAplot for all peaks after MAnorm
red line: LOWESS regression
blue line: line M = 0
4.2 Temporary Output files removed after MAnorm is done (If these files are needed, please modify MAnorm.sh):
peak1.bed: sample 1 peak list
peak2.bed: sample 2 peak list
read1.bed: sample 1 read used for mapping to peaks
read2.bed: sample 2 read used for mapping to peaks
peak1_dump.bed: sample 1 peaks not used
peak2_dump.bed: sample 2 peaks not used
read1_dump.bed: sample 1 read NOT used for mapping to peaks
read2_dump.bed: sample 2 read NOT used for mapping to peaks
5. other useful scripts:
5.1 classify_by_M_value.sh:
description:a useful script for peak classification based on M value, which could be uploaded directly to a genome browser. Required to be in the same folder of MAnorm_result.xls generated by MAnorm.sh.
usage:
./classify_by_M_value.sh absolute_M-value_cutoff_unbiased absolute_M-value_minimum_for_unique_peaks -log10p_cutoff_unique
example_command:
./classify_by_M_value.sh 0.5 1 5
In the above example, sample 1 unique peaks (non-concordant peaks) are peaks with M-values greater than 1 and that have a log base 10(p-value) greater than 5. Similarly, sample 2 unique peaks (non-concordant peaks) are peaks with M-values less than (-1) that have a log base 10(p-value) greater than 5. Unbiased peaks (concordant peaks) are peaks with M-values between (-0.5) and (+0.5).
output:
3 bed files: 2 unique peak lists (classI_sample1_uniq_peaks.bed and classII_sample2_uniq_peaks.bed)
1 unbiased peak list (classIII_unbiased_peaks.bed)