GCAT 2004 DATA Workshop with MAGIC Tool and SAM
Assigned Reading to be Completed Prior to Workshop
The GCAT 2004 DATA workshops will focus on using DNA microarrays to analyze gene expression. If you are not familiar with DNA microarrays, please begin with the Flash animation of how an expression microarray experiment is performed, at http://www.bio.davidson.edu/courses/genomics/chip/chip.html. The animation is also a good resource to assign to classes that will use or interpret microarrays, so it is worth trying yourself even if you already know microarrays well.
The purpose of this reading assignment is to familiarize you with many of the terms and concepts commonly used in microarray papers. If possible, read the papers and answer the questions in this handout. If you are not able to do this in advance, we recommend that you do it during the workshop itself. Additional readings will be assigned at the workshop.
Butte, A. The use and analysis of microarray data. Nature Reviews Drug Discovery 1:951-960 (Dec. 2002).
DeRisi, J., Iyer, V., and Brown, P. O. Exploring the metabolic and genetic control of gene expression on a global scale. Science 278:680-686 (1997).
Tilstone, C. Vital statistics (News Feature). Nature 424:610-612 (2003).
Preparation and Hybridization of Microarrays
c. Explain how the arrays pictured could be used to obtain the data graphed in Figure 5. In other words, what must happen to convert the images you see in Figures 1 and 2 into the relative intensities plotted as ‘Fold induction / fold repression’ in Figure 5? (Hint: refer to the Flash animation of microarrays at the site given above if you are unclear about what happens between the image and the quantitative data.)
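Hint: as a rough illustration of the final arithmetic step only, the sketch below turns the two channel intensities of a single spot into a signed fold change and a log2 ratio. The function names, the optional background values, the example intensities, and the assumption that the red channel is the experimental sample and the green channel the reference are all illustrative and are not taken from the papers. Check the labeling used in the figures before applying the same sign convention.

```python
import math


def background_corrected_ratio(red, green, red_bg=0.0, green_bg=0.0):
    """Return red/green for one spot after subtracting local background.

    Assumption: red is the experimental sample and green is the reference.
    """
    r = red - red_bg
    g = green - green_bg
    if r <= 0 or g <= 0:
        raise ValueError("background-corrected intensities must be positive")
    return r / g


def signed_fold_change(ratio):
    """Report induction as +N-fold and repression as -N-fold."""
    return ratio if ratio >= 1 else -1.0 / ratio


def log2_ratio(ratio):
    """Log2 ratio: induction and repression become symmetric around zero."""
    return math.log2(ratio)


# Hypothetical intensities for two spots (arbitrary scanner units).
induced = background_corrected_ratio(red=4000, green=1000)
repressed = background_corrected_ratio(red=500, green=2000)

print(signed_fold_change(induced), log2_ratio(induced))
print(signed_fold_change(repressed), log2_ratio(repressed))
```

The log2 form is often preferred for plotting because a 4-fold induction and a 4-fold repression sit the same distance from zero.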
Analysis
6. What is the difference between supervised and unsupervised analysis?
7. List the different supervised and unsupervised methods described in the Butte paper.
Unsupervised:
Supervised:
8. List 4 aspects you need to consider when measuring gene expression according to Butte.
9. List two major caveats to measurements of gene expression (things to be careful to avoid), according to Butte:
Unsupervised Analysis
10. Referring to ‘Unsupervised Analysis’ in Butte, define these terms:
a. Feature Determination –
b. Cluster Determination –
c. Network Determination –
d. Dissimilarity – Can you reword the paper’s definition of dissimilarity without using a form of the word “similarity”?
e. Clustering –
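Hint for 10d: it may help to see how a dissimilarity can be computed for two expression profiles. The sketch below shows two widely used measures, Euclidean distance and one minus the Pearson correlation. The function names and the three example profiles are made up for illustration, and whether these measures match the ones discussed in the paper is for you to check against the reading.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correlation_distance(a, b):
    """One minus the Pearson correlation: 0 for identical shapes, 2 for opposite."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)


# Hypothetical log ratios for three genes across three time points.
gene_a = [1.0, 2.0, 3.0]
gene_b = [2.0, 4.0, 6.0]   # same shape as gene_a, larger magnitude
gene_c = [3.0, 2.0, 1.0]   # opposite trend to gene_a

print(euclidean_distance(gene_a, gene_b), correlation_distance(gene_a, gene_b))
print(euclidean_distance(gene_a, gene_c), correlation_distance(gene_a, gene_c))
```

Note that the two measures can disagree: gene_a and gene_b are far apart by Euclidean distance but essentially identical by correlation, so the choice of measure changes which genes end up grouped together.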
11. Name some of the types of clustering that are being used, as cited in Butte and in Tilstone.
12. Using the Tilstone article, identify statistical methods, other than those used to group genes with similar expression patterns, that have been used to examine microarray data.
13. From the Tilstone article, why is it considered so important to replicate microarrays?
14. From Tilstone and Butte, what is a false positive in a set of microarray data? Are many of these expected?
15. What does it mean to get a ‘significant result’ from a microarray experimental analysis?
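For questions 14 and 15, it can help to work through the arithmetic of testing many genes at once. The sketch below assumes, purely for illustration, an array of 6,000 genes, a per-gene significance threshold of 0.05, and no genes that truly change; none of these numbers come from the readings. The Bonferroni threshold shown is one simple, conservative correction, and it is not necessarily the approach the articles recommend.

```python
def expected_false_positives(n_genes, alpha):
    """Expected number of genes passing the threshold by chance alone,
    assuming no gene truly changes and each test is run at level alpha."""
    return n_genes * alpha


def bonferroni_threshold(n_genes, alpha):
    """Per-gene threshold that keeps the chance of any false positive near alpha."""
    return alpha / n_genes


# Hypothetical experiment size and threshold.
n_genes = 6000
alpha = 0.05

print("Expected false positives:", expected_false_positives(n_genes, alpha))
print("Bonferroni per-gene threshold:", bonferroni_threshold(n_genes, alpha))
```

Under these assumptions, roughly 300 genes would look "significant" by chance, which is one reason replication and corrected thresholds matter when interpreting a list of changed genes.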