May 27, 2004
Significance Analysis of Microarrays (SAM) Installation and Usage
SAM Installation and Setup
For SAM to run properly, you must first have WinZip, Microsoft Java Virtual Machine, and Microsoft Data Access Components installed on your computer. For Microsoft Java Virtual Machine you will need msjavx86.exe.
To download SAM, register for a username and password at http://www-stat-class.stanford.edu/~tibs/clickwrap/sam.html. An email confirming your username and password will be sent to you, along with the following site to log into:
http://www-stat.stanford.edu/~tibs/clickwrap/sam/academic. Once you have logged into this site, click the highlighted sam.zip link and save the file to your computer. WinZip may ask “Associate WinZip with Zip files now?”; if it does, click Yes. Once the download is complete, a window titled WinZip - sam.zip will open. Click on setup.exe to install SAM, then proceed with the installation and setup.
Once installation is complete, open the SAM ‘Significance Analysis of Microarrays’ Users Guide and Technical Document PDF file for help with using SAM. This file should be located at C:\Program Files\SAMVB\doc\SAM.pdf. It is referred to below as the Users Guide.
To install SAM into Excel, first open Excel, go to ‘Tools’ > ‘Macro’ > ‘Security’, and set the security level to Medium. Then click on ‘Tools’ > ‘Add-Ins’ and browse for the SAM add-in (go to C:\Program Files\SAMVB\Addin and double-click on the SAM icon). ‘Significance Analysis for Microarrays’ will now appear as an add-in option. Check the box next to it. A security warning may appear asking whether you would like to ‘Enable Macros’; choose to enable them. At this point ‘SAM’ and ‘SAM Plot Control’ should appear on your Excel toolbar every time you open Excel. If they do not, you can open SAM from C:\Program Files\SAMVB\Addin at any time.
SAM Usage
Once you are in Excel, you should be able to open the SAM example datasets from C:\Program Files\SAMVB\examples. These examples are briefly described on page 8 of the Users Guide. When you have selected an example, it should appear highlighted in Excel (leave it highlighted). If you are using your own data, you must first make sure that it is set up according to the SAM layout described on page 8 of the Users Guide.
For an example of how to use SAM, open the dataset labeled ‘twoclass.xls’ and click on ‘SAM’ in the toolbar. Within the SAM dialogue box, make sure ‘Two class, unpaired data’ is highlighted for the response type. You should also check ‘logged’ data and leave the rest as the default (100 should be highlighted as the number of permutations). Then click ‘OK’. A graph should appear on a worksheet named SAM Plot, with a SAM PLOT CONTROLLER box on top of it. The graph shows positive significant genes in red and negative significant genes in green. There is a solid blue line where x = y and two dashed black lines that represent the positive and negative cutoff points. The cutoff can be adjusted by changing delta.
Within the controller there is a slider for adjusting the delta value and a box for specifying a fold change. Adjusting delta lets you decide the median number of false positive genes (referred to as false significant) you are comfortable with. As you increase delta, the number of significant genes and the number of false positives both decrease. You can see the effect of different deltas on the number of significant genes by checking ‘List Delta Table’. Once you have chosen a delta, click on ‘List Significant Genes’. An Excel sheet labeled ‘SAM Output’ should appear, listing the significant genes and information on them. An explanation of this output can be found on page 16 of the Users Guide. At this point, to make another adjustment you have to click on ‘SAM Plot Control’ for the SAM PLOT CONTROLLER box to reappear. Note that making another change will replace the ‘SAM Output’ Excel sheet.
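The effect of delta on the gene list can be illustrated with a short Python sketch. This is not SAM's own code: the function name sam_calls and the toy scores are hypothetical, and the rule shown (walk outward from the origin of the plot until a point lies more than delta from the x = y line, then call everything beyond it significant) is a simplified reading of how the cutoff lines on the SAM Plot work.

```python
def sam_calls(observed, expected, delta):
    """Return (positive, negative) lists of indices called significant.

    Assumes `observed` and `expected` are the same length and both sorted
    in ascending order, as the points appear on the SAM Plot.
    """
    n = len(observed)
    # Start from the point closest to the origin on the expected axis.
    start = min(range(n), key=lambda i: abs(expected[i]))

    # Walk right: first point lying more than delta above the x = y line.
    pos_cut = next((i for i in range(start, n)
                    if observed[i] - expected[i] > delta), None)
    # Walk left: first point lying more than delta below the x = y line.
    neg_cut = next((i for i in range(start, -1, -1)
                    if expected[i] - observed[i] > delta), None)

    positive = list(range(pos_cut, n)) if pos_cut is not None else []
    negative = list(range(0, neg_cut + 1)) if neg_cut is not None else []
    return positive, negative


# Toy scores, invented purely for illustration.
observed = [-3.0, -1.0, -0.2, 0.1, 0.9, 3.5]
expected = [-1.5, -0.8, -0.2, 0.2, 0.8, 1.5]

small_delta = sam_calls(observed, expected, 0.5)
large_delta = sam_calls(observed, expected, 2.5)
```

With these toy numbers, the smaller delta calls one positive and one negative gene, while the larger delta calls none, which matches the behavior described above: raising delta shrinks the list.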
As an additional criterion for determining significant genes, you can ask SAM to apply a fold change threshold. Enter a number greater than or equal to 1 in the fold change box in the SAM PLOT CONTROLLER box. Increasing the fold change decreases the number of significant genes and false positives. Further explanations and figures of running SAM can be found on pages 11 through 15 of the Users Guide.
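As a rough illustration of what a fold change criterion does, the Python sketch below keeps a gene only if the ratio of its two group averages is at least the chosen fold in either direction. The function name passes_fold_change is hypothetical, and the handling of logged data (converting back from log base 2 before averaging) is an assumption made for this example; consult the Users Guide for how SAM itself computes fold change.

```python
def passes_fold_change(group1, group2, fold, logged_base2=True):
    """Return True if the group averages differ by at least `fold`.

    Assumption for this sketch: logged data is log base 2 and is
    converted back to the original scale before averaging.
    """
    if fold < 1:
        raise ValueError("fold change must be greater than or equal to 1")
    if logged_base2:
        group1 = [2.0 ** v for v in group1]
        group2 = [2.0 ** v for v in group2]
    mean1 = sum(group1) / len(group1)
    mean2 = sum(group2) / len(group2)
    ratio = mean2 / mean1
    # Keep genes that go up by at least `fold` or down by at least `fold`.
    return ratio >= fold or ratio <= 1.0 / fold


# Toy log2 values for one gene: averages of 2 and 8 after unlogging.
keep_at_2 = passes_fold_change([1.0, 1.0], [3.0, 3.0], fold=2)
keep_at_5 = passes_fold_change([1.0, 1.0], [3.0, 3.0], fold=5)
```

Here the gene changes fourfold, so it passes a fold change of 2 but not a fold change of 5. A fold change of 1 lets every gene through, so the criterion has no effect at that setting.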
Complicated Data
If you are working with data that has more than 256 columns (such as the SAM example ‘twoclassb’), you will have to work with multiple sheets. In this case, highlight the data you want from all sheets. You must have sheet one open when you click on SAM. This time, highlight the additional sheets you are using in the dialogue box. You can then proceed as normal. See page 15 of the Users Guide. If you are working with data that has missing entries (such as the SAM example ‘twoclassm’), refer to page 11 of the Users Guide.
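To judge how many sheets a wide dataset will need, the Python sketch below splits a range of column positions into groups of at most 256. The function name split_columns is hypothetical, and the sketch covers only the arithmetic; it does not reproduce the layout SAM requires on the additional sheets, which is described on page 15 of the Users Guide.

```python
def split_columns(n_columns, max_per_sheet=256):
    """Split column positions 0..n_columns-1 into per-sheet groups."""
    if n_columns < 0 or max_per_sheet < 1:
        raise ValueError("n_columns must be >= 0 and max_per_sheet >= 1")
    return [
        list(range(start, min(start + max_per_sheet, n_columns)))
        for start in range(0, n_columns, max_per_sheet)
    ]


# A hypothetical dataset with 600 columns needs three sheets.
sheets = split_columns(600)
sheet_sizes = [len(s) for s in sheets]
```

For 600 columns this gives sheets of 256, 256, and 88 columns.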