Tutorial 2 - Analyzing Splicing Data


Introduction
The primary use of AltAnalyze is to calculate statistics that assess alternative splicing, alternative promoters or other forms of alternative gene regulation. To do this, AltAnalyze filters the user's raw expression data to remove probe sets considered to be "not-expressed", calculates a splicing score (splicing index fold and t-test p-value), assigns exon/intron/splicing annotations to the high-scoring results and further assesses protein, protein domain and microRNA binding site changes based on these results. AltAnalyze makes this process relatively easy: the user is only required to provide three very basic files in a simple format. In the following tutorial we will walk through these steps using a sample dataset and examine the results that are produced.
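
The splicing score mentioned above can be illustrated with a small sketch. It assumes the commonly used definition of the splicing index: each probe set's expression is divided by the expression of its gene, and the log2 of that ratio is compared between the two groups. The function name, argument names and example numbers are hypothetical, and AltAnalyze's own implementation may differ in detail (for example, in how gene expression is estimated).

```python
import math
from statistics import mean


def splicing_index(exon_exp, gene_exp, exon_base, gene_base):
    """Return (splicing index, normalized experimental values, normalized baseline values).

    Each argument is a list of non-log intensities, one value per sample.
    Assumed definition: mean log2(exon/gene) in the experimental group
    minus mean log2(exon/gene) in the baseline group.
    """
    norm_exp = [math.log2(e / g) for e, g in zip(exon_exp, gene_exp)]
    norm_base = [math.log2(e / g) for e, g in zip(exon_base, gene_base)]
    return mean(norm_exp) - mean(norm_base), norm_exp, norm_base


# Hypothetical intensities for one probe set and its gene, three samples per group.
si, norm_exp, norm_base = splicing_index(
    exon_exp=[200.0, 220.0, 210.0],   # probe set, experimental group
    gene_exp=[100.0, 110.0, 105.0],   # gene, experimental group
    exon_base=[100.0, 110.0, 105.0],  # probe set, baseline group
    gene_base=[100.0, 110.0, 105.0],  # gene, baseline group
)
print(si)  # a two-fold relative increase of the exon gives 1.0
```

A two-sample t-test on the two lists of normalized values (norm_exp and norm_base) would supply the accompanying p-value; it is left out here to keep the sketch free of external libraries.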

Downloading Sample Data
Before you run AltAnalyze, you should have either CEL files or two text files (expression values and detection p-values for all probe sets). AltAnalyze can process your CEL files to produce these two files using built-in calls to the program Affymetrix Power Tools. To download sample CEL files, click here. Otherwise, you can download already processed AltAnalyze expression files from here. These files contain data for all 1.4 million probe sets on the human Exon 1.0 array.

Installing AltAnalyze and Saving Your Data
AltAnalyze can be downloaded for multiple operating systems from http://AltAnalyze.org. Once you have downloaded the compressed archive to your computer, extract it to an accessible folder on your hard drive (e.g., within your user account).

Creating a Comparison and Groups File (OPTIONAL)
If your dataset has over 30 CEL files or dozens of groups, it may save you time to make the groups and comps files in advance. Although not recommended when working with this sample dataset, go here if this applies to your own dataset.
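
For larger datasets, the two files can be generated with a short script rather than typed by hand. The sketch below rests on an assumed layout that should be checked against the AltAnalyze documentation before use: a tab-delimited groups file with one line per sample (file name, group number, group name) and a tab-delimited comps file with one line per comparison (experimental group number, baseline group number). The CEL file names and helper function names are hypothetical.

```python
import csv
import io


def build_groups_rows(samples_by_group):
    """One row per sample: (file name, group number, group name). Assumed layout."""
    rows = []
    for number, (group_name, samples) in enumerate(samples_by_group.items(), start=1):
        for sample in samples:
            rows.append((sample, number, group_name))
    return rows


def build_comps_rows(samples_by_group, comparisons):
    """One row per comparison: (experimental group number, baseline group number)."""
    numbers = {name: i for i, name in enumerate(samples_by_group, start=1)}
    return [(numbers[exp], numbers[base]) for exp, base in comparisons]


def to_tsv(rows):
    """Render rows as tab-delimited text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical file names; the group names follow the sample dataset.
samples_by_group = {
    "hESC": ["hESC_1.CEL", "hESC_2.CEL", "hESC_3.CEL"],
    "cardiac_precursors": ["CP_1.CEL", "CP_2.CEL", "CP_3.CEL"],
}
groups_text = to_tsv(build_groups_rows(samples_by_group))
comps_text = to_tsv(build_comps_rows(samples_by_group, [("cardiac_precursors", "hESC")]))
print(groups_text)
print(comps_text)
```

The resulting text can then be written to files whose names match the dataset name; the exact naming convention AltAnalyze expects is not stated here and should be taken from its documentation.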

Running AltAnalyze
Now you are ready to process your raw input files and obtain alternative probe sets with splicing and functional annotations. To proceed:

1) Open the AltAnalyze_v1beta folder and select the binary file "AltAnalyze". In Windows, this file has the extension ".exe". If you are working on a Linux machine or are having problems starting AltAnalyze, you can also start the program directly from the source code.
2) (AltAnalyze: Introduction) In the resulting introduction window, select "Begin Analysis".
3) (AltAnalyze: Main Dataset Parameters) Select the species type "Homo sapiens" and "Continue".
4) (AltAnalyze: Main Dataset Parameters) Select the "CEL files" button and the array type "Affymetrix Exon ST 1.0 Array".
5) (AltAnalyze: Select CEL files for APT) For dataset name, type in "hESC_differentiation", select the directory containing the CEL files (make sure these have been extracted from the TAR and Gzip archives) and then select an empty directory to save the results to.
6) (AltAnalyze: Select CEL files for APT) In the resulting warning window, select "download" for AltAnalyze to automatically download and install the library and annotation files for that array.
7) (AltAnalyze: Expression Analysis Parameters) Accept the default parameters by clicking "Continue". These options can be modified later on, if you wish to increase or decrease the stringency of the analysis.
8) (AltAnalyze: Alternative Exon Analysis Parameters) Accept the default parameters by clicking "Continue".
9) (AltAnalyze: Assign CEL files to a Group) Type in the group name for each sample (first three are hESC and second three are cardiac_precursors).
10) (AltAnalyze: Establish All Pairwise Comparisons) Select the groups to compare (cardiac_precursors vs. hESC).
11) A new window will appear that displays the progress of the analysis. Analysis of the sample dataset should take approximately 20 minutes, but can take longer depending upon the machine and operating system. When finished, AltAnalyze will display a new pop-up window, informing the user that the analysis is complete. If analyzing multiple experimental groups, these will be run in succession.
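
Step 5 requires that the CEL files already be extracted from their TAR and Gzip archives. If you prefer to script this rather than use an archive utility, the following is a minimal sketch using only the Python standard library. It assumes a single TAR archive containing individually gzipped CEL files; the function name and paths are hypothetical, and the archive should come from a source you trust, since extraction writes whatever paths the archive contains.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def extract_cel_files(archive_path, out_dir):
    """Unpack a TAR archive, decompress any .gz members, and return the CEL file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Unpack the TAR archive (compression, if any, is detected automatically).
    with tarfile.open(archive_path) as tar:
        tar.extractall(out_dir)

    # Decompress each gzipped file next to itself, then remove the .gz copy.
    for gz_path in list(out_dir.rglob("*.gz")):
        target = gz_path.with_suffix("")  # e.g. sample.CEL.gz -> sample.CEL
        with gzip.open(gz_path, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        gz_path.unlink()

    return sorted(p for p in out_dir.rglob("*") if p.suffix.lower() == ".cel")
```

Calling extract_cel_files("sample_data.tar", "cel_files") with your own archive name would leave the uncompressed CEL files in the cel_files directory, which can then be selected in step 5.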

Interpreting the Results
Proceed here.