StatGene Research Group Blog
This blog is for information sharing, bookkeeping, and other related group activities of the statistical genetics research group in the Department of Statistics at Columbia University.
Tuesday, May 19, 2009
Saturday, March 21, 2009
Some good readings on important trends in statistical genetics
I. OVERVIEW
1. Science 7 November 2008: Vol. 322. no. 5903, pp. 881 - 888
Genetic Mapping in Human Disease
2. Nature Reviews Genetics 10, 241-251 (April 2009)
Human genetic variation and its contribution to complex traits
II. Association Studies, especially GWAS
3. Nature Reviews Genetics 9, 356-369 (May 2008)
Genome-wide association studies for complex traits: consensus, uncertainty and challenges
4. Nature Reviews Cancer 4, 850-860 (November 2004)
Association studies for finding cancer-susceptibility genetic variants
III. Challenges of association studies (interaction, population stratification, etc.)
5. Human Molecular Genetics, 2002, Vol. 11, No. 20 2463-2468
Epistasis: what it means, what it doesn't mean, and statistical methods to detect it in humans
6. Nature Genetics 37, 1243 - 1246 (2005)
Population structure, differential bias and genomic control in a large-scale, case-control association study
IV. LD and Haplotype block
7. The American Journal of Human Genetics, Volume 69, Issue 1, 1-14, 1 July 2001
Linkage Disequilibrium in Humans: Models and Data
8. Nature Reviews Genetics 4, 587-597 (August 2003)
Haplotype blocks and linkage disequilibrium in the human genome
V. Selected methodology reviews
9. The American Journal of Human Genetics, Volume 78, Issue 3, 437-450, 1 March 2006
A Comparison of Phasing Algorithms for Trios and Unrelated Individuals
10. The American Journal of Human Genetics, Volume 74, Issue 5, 979-1000, 1 May 2004
Methods for High-Density Admixture Mapping of Disease Genes
Friday, October 05, 2007
Reference for Tim's presentation
History of the Human Genome Project:
http://www.sciencemag.org/cgi/content/full/291/5507/1195
Science paper:
The sequence of the human genome
http://www.sciencemag.org/cgi/content/full/291/5507/1195
The International HapMap Project
http://www.hapmap.org/downloads/nature02168.pdf
Properties of linkage disequilibrium
http://www.pnas.org/cgi/content/full/99/26/17004
SNP discovery (a presentation)
http://wheat.pw.usda.gov/ITMI/RafalskiXITMI2/index.htm
Let me know if you would like to read about other topics along these lines and I will find more references.
Friday, September 28, 2007
Group meeting [Sep.28.2007]
Today we discussed the dbGaP data sets.
1. [Tian] Even though we have been granted access to the data, I don't think the data files are available yet. We submitted archive requests online, but the status of those requests has remained "preparing". Another clue: file sizes were listed for only one of the three studies, and they were huge.
2. [Yuejing] Two papers were listed as references for the Major Depression data, but only one was available. [Yuejing] found a presentation file from UNC that updated the status of the study. The study screened families that participated in a Netherlands study on MDD over a span of 10 years and selected those with concordant twins with extremely high or low trait values, and discordant twins with sharp contrasts (high <-> low). Linkage results from previous studies show little overlap. The sample size described on dbGaP differs slightly from the presentation file.
3. [Chien Hsun Huang] searched for "Nephropathy" and Type I diabetes and found that there are not many papers on this topic. We can only assume that little is known about its genetic component.
A little background: Nephrotic syndrome is a disorder where the kidneys have been damaged, causing them to leak protein from the blood into the urine. It is characterised by proteinuria (>3.5 g/day), hypoalbuminemia, hyperlipidemia and edema. Diabetic nephropathy (nephropatia diabetica) is a progressive kidney disease caused by angiopathy of capillaries in the kidney glomeruli. It is characterized by nephrotic syndrome and nodular glomerulosclerosis. It is due to longstanding diabetes mellitus, and is a prime cause for dialysis in many Western countries.
The reference paper describes the study data and design without any analysis results. The most important point is that this study has three designs (case/control, case-trio and control-trio), which could be an advantage for validating results but can also make the results harder to interpret, as the authors discussed.
There is one recent paper from Nature that identifies a Type I diabetes gene.
4. [Jun and Tian] The ADHD data might be the first data we can get our hands on. The 2006 Molecular Psychiatry paper outlines the recruitment of subjects. The data came from a multi-center study involving eight countries, which might create some population structure issues. However, since this data set consists of only case-parent trios, population stratification is not a concern. Despite the disorder's high heritability, no genes have been identified for it. Previous genome scans did not produce many overlapping significant results. In the 2006 paper, candidate genes were studied and only nominally significant results were obtained for 7 out of 51 genes. The 2006 Behavioral and Brain Functions paper described the genome scan efforts using the same sample of subjects and outlines the plan of this project. 600,000 tagging SNPs are to be used. Gene-gene interactions should be considered. Gene-environment interaction may play an important role for this disorder but is hard to account for in analysis. One thing we might try first is to construct a candidate gene study.
5. Next week, we will check the data status. We will also go over some review papers in genetic epidemiology (see the last post) for Jun, Bo, Julia, Chien-Hsun.
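A side note on item 4: the reason case-parent trios sidestep population stratification is that tests such as the TDT (see the Spielman and Ewens paper in the reading list) compare the alleles that heterozygous parents transmit to the affected child against the alleles they do not transmit, so each family serves as its own control. Below is a minimal sketch of the McNemar-type TDT statistic for a single biallelic marker. The function name and the example counts are made up for illustration; they are not from the ADHD data or from any of the papers above.

```python
import math


def tdt_statistic(transmitted, untransmitted):
    """McNemar-type TDT chi-square for one biallelic marker.

    transmitted:   count of heterozygous parents who passed the allele of
                   interest to the affected child
    untransmitted: count of heterozygous parents who passed the other allele
    Returns (chi-square statistic, p-value from a chi-square with 1 df).
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 60 transmissions vs. 40 non-transmissions
chi2, p = tdt_statistic(60, 40)
print(f"TDT chi-square = {chi2:.2f}, p = {p:.4f}")
```

Only heterozygous parents enter the counts, since homozygous parents carry no information about preferential transmission. With 600,000 SNPs, any per-marker p-value like this would of course need a multiple-testing correction.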
[Reading list] Background reading on statistical genetics
Here are papers we want to discuss for next Friday.
If you have difficulty accessing the PDF, please let me know.
1. Nat Rev Genet. 2001 Feb;2(2):91-9.
Association study designs for complex diseases.
By Cardon LR, Bell JI.
2. Nat Rev Genet. 2005 Feb;6(2):95-108.
Genome-wide association studies for common diseases and complex traits.
By Hirschhorn JN, Daly MJ
3. Lancet. 2005 Sep 24-30;366(9491):1121-31.
Genetic association studies.
By Cordell HJ, Clayton DG.
4. Lancet. 2005 Sep 17-23;366(9490):1036-44.
Genetic linkage studies.
By Dawn Teare M, Barrett JH.
5. Am J Hum Genet. 1996 Nov;59(5):983-9.
The TDT and other family-based tests for linkage disequilibrium and association.
By Spielman RS, Ewens WJ.
Friday, September 21, 2007
Group meeting [Sep.21.2007]
Last Friday, we held a research open house for PhD students and visitors. In the future, this format might be turned into a regular research club of some sort.
Today, two MA students joined our group meeting.
- Bo Qiao. A recent graduate from Tsinghua University.
- Chien Hsun Huang (sound). A statistical genetics researcher from Taiwan
Iuliana visited us today. We discussed the possibility of her visiting us monthly, supported by our newly funded NSF/NIGMS grant.
Dr. Jun Xie (associate professor from Purdue University) is visiting our department and would like to participate in our research for this semester. Welcome!
We discussed the three data sets that were granted to us from NCBI: genome association studies on major depression, type I diabetes, and ADHD. These three sets have very different data structures. We will study the background information of these studies in the coming week and will discuss how to analyze them during the next meeting.
Major depression: Ding and Lo
Diabetes: Qiao and Huang
ADHD: Zheng and Xie
It is nearly unthinkable that we still have confusion over the meeting time. We have now decided that we will meet at 10:30 on Friday in Room 901, until further notice.
Friday, September 07, 2007
Group meeting [Sep.07.2007]
This was the first meeting of the year, and also the first time Tim and Hugh joined the group meeting. We discussed the following items:
- Apply for available genetic data: we have applied for several sets of genetic data and are waiting for board approval.
- Hugh reported his study on subset coverage issues in his computing.
- We discussed new research problems related to true signals and "imposter" signals and how to deal with them.
- Next week, we are going to have a research open hour for PhD students and visiting researchers in our department.

