Statistics

Statistical modeling and analysis, including the collection and interpretation of data, form an essential part of the scientific method in diverse fields, including social, biological, and physical sciences. Statistical theory is primarily based on the mathematical theory of probability, and covers a wide range of topics, from highly abstract areas to topics directly relevant for applications. Research in statistics covers many issues, some closely tied to theoretical principles of statistical inference, and others more concerned with developing and extending techniques for descriptive and exploratory analysis of data. The theory and practice of designing the efficient collection of data through experiments, surveys, and observational studies constitute important areas of statistics. Since computers play a major and often crucial role in statistical research through simulation techniques, and in statistical applications through the analysis of data, statistical computation is another major subfield.

Statisticians are frequently concerned with modeling complex phenomena, especially by developing and applying appropriate probability models to empirical data, and often these efforts are intimately connected to policy-relevant decision-making in business and government. We seek to train statisticians who will contribute to theory, develop innovative and useful statistical models and methods, and conduct serious applied statistical scientific investigations. Individual statisticians will vary in their emphasis, but the field includes all of these aspects.

Statisticians with advanced training are in substantial demand for positions in academic teaching and research, in research laboratories and organizations, in government agencies, and in business. As society, science, and the technology of data handling grow in complexity, the need for highly qualified statisticians is expected to grow steadily.

The Department of Statistics offers courses of study leading to both the PhD and Master’s degrees. The department encourages applications from students with strong mathematical backgrounds who plan to concentrate on theoretical statistics, students with training in substantive fields whose primary interest is in applied statistics, and students whose backgrounds and interests lie between these two extremes. In addition to formal course work and dissertation research, students are encouraged to work closely with faculty and to attend seminars concerning current problems in empirical research and thereby to gain experience with interdisciplinary statistical research and consulting. All PhD candidates are expected to engage in some teaching during their period of training.

Preparation in Mathematics, Statistics, and Computation

The minimum mathematical preparation for admission to graduate study in statistics is linear algebra and advanced calculus. Ideally, each student’s preparation should include at least one term each of mathematical probability and mathematical statistics. Additional study in statistics and related mathematical areas, such as analysis and measure theory, is helpful. In the initial stages of graduate study, students should give high priority to acquiring the mathematical level required to satisfy their objectives. Before registering for their fall term classes, all entering students will be required to take a diagnostic test in mathematics. Performance on this test will assist the department in determining whether students need additional mathematics preparation.

Successful applicants demonstrate that they understand what the discipline of statistics entails, and show evidence of involvement in applications or a strong theoretical interest. They are able to articulate a strong motivation for studying statistics.

Because statistics is so intimately connected with computation, computation is an important part of almost all courses and research projects in the department. Ideally, students should have programming experience in, or exposure to, some high-level computer language, such as SAS, S+, Fortran, or C.

Doctor of Philosophy (PhD)

The formal residence requirement for the PhD is 16 half-courses devoted to advanced study. Other formal requirements are the passing of a qualifying exam, the completion of a qualifying paper, the fulfillment of the cognate requirement, and the completion of a PhD dissertation. Details are provided below.

Program of Study. Students should plan their course program with three objectives in view: (i) acquiring basic knowledge in preparation for the qualifying examination; (ii) investigating a range of advanced topics; and (iii) exploring in some depth a field outside of statistics. To satisfy (i) and (ii), students will normally take a minimum of nine half-year courses offered by the Department of Statistics, including at least four on advanced topics.

Cognate Requirement. To satisfy (iii), a substantial connection to a cognate field is required. This connection can be made through four half-year courses at the graduate level in the selected cognate field, chosen with the help of the advisor. For example, students with strong mathematical aptitudes and preparation who wish to pursue this direction may select a cognate in mathematics in preparation for research in mathematical statistics, and satisfy this requirement with four half-courses in mathematics; an analogous plan may be appropriate for students with a strong interest in computer science.

Students who intend to work primarily in some field of application of statistics may select a cognate such as astrophysics, computational biology, environmental science, economics, psychology, education, engineering science, sociology, public policy, business, or public health, and may satisfy this requirement with four half-courses at the graduate level in the chosen field.

In other cases, the cognate can be satisfied by a major investment in the cognate field through involvement in projects at another School or department at Harvard, at some other research institution, or at a government agency. The most important criterion is a major investment in the language, methods, and use of statistics in the cognate field. The department has no formal language requirement but does require demonstrated competence at communication in the selected cognate field. Details of programs should be established in consultation with the faculty advisors.

During the second year of study, students should submit their prospective programs for approval by the department. Students will be expected to complete all work with distinction.

Qualifying Examination. The student must pass a written qualifying examination in statistics, which is given once each year. The examination is normally taken by students in their second year. It is given at the end of the spring term in two parts: the first on theoretical statistics, including probability and mathematical statistics, and the second on applied statistics, including statistical design and data analysis.

Research Presentations. At the end of each term, all students who have passed the qualifying exam present to department faculty and to fellow students brief summaries of their research in progress.

Qualifying Paper. The objective of the qualifying paper is to provide the student with an opportunity to explore a serious topic in statistics and to express the findings coherently in a written document. Although the work need not be original, it should demonstrate understanding of the topic, knowledge of the tools of research, and clarity of exposition. The effort involved is expected to require no more than the equivalent of one term at one-third time. This paper should be submitted and accepted by the department as early as possible, and preferably during the year following the qualifying exam. Delays in submission require permission of the department.

Dissertation. Each student is expected to exercise initiative in seeking out both a dissertation topic and a faculty advisor who will take primary responsibility for supervising the student’s work. The PhD dissertation is expected to be a research contribution of high quality adding to our knowledge of either the theory or practice of statistics. A PhD dissertation in statistics may also consist primarily of an innovative analysis of a specific, complex body of data in some substantive field. Generally, the material in a PhD dissertation should be publishable in a refereed journal.

Two copies of the completed dissertation must be submitted for consideration in the department office at least two weeks prior to a department colloquium on the substance of the dissertation. The faculty will consider the submitted dissertation and make recommendations, which generally lead to revisions. Next, the faculty, with the explicit advice of three faculty readers nominated by the department, vote on the completed dissertation as submitted in finished form, which must conform to the requirements described in The Form of the PhD Dissertation, available in the Registrar’s office. The approved final dissertation can be submitted to the Registrar. The time from the colloquium to the final vote is ordinarily about one month.

Recent dissertation topics have included:

2002
“Bayesian Hierarchical Time Series Modeling of Mortality Rates” (Claudia Pedroza)
“An Application of Missing Data Methods: Testing for the Presence of a Spectral Line in Astronomy and Parameter Estimation of the Generalized Hyperbolic Distributions” (Rostislav Protassov)
“Fitting and Evaluating Certain Two-Level Hierarchical Models” (Ruoxi Tang)
“Causal Inference with Principal Stratification: Some Theory and Applications” (Junni Zhang)
2003
“Extensions and Applications of Three Statistical Models” (David Esch)
“Stochastic Models for Sequence Pattern Discovery” (Mayetri Gupta)
2004
“Belief Functions Applied to Reliability Testing and Product Improvement” (Wai Fung Chiu)
“Modeling Monotone Nonlinear Disease Progression and Checking the Correctness of the Associated Software” (Samantha R. Cook)
“Statistical Techniques for Examining Gene Regulation” (Shane T. Jensen)
“Matching Methods for Estimating Causal Effects Using Multiple Control Groups” (Elizabeth A. Stuart)
2005
“On Population Based Markov Chain Monte Carlo Methods” (Gopika Goswami)
“Principal Stratification for Causal Inference with Extended Partial Compliance” (Hui Jin)
“Markov Chain Monte Carlo Applications in Bioinformatics and Astrophysics” (Hosung Kang)
“Three Contributions to Statistical Computing” (Yaming Yu)
2006
“Towards Inference on Bayesian Network Structures” (Byron Ellis)
“Inference and Efficient Computation for Highly Structured Models with Applications” (Taeyoung Park)
“Detecting Cis-Regulatory Modules by Modeling Correlated Structures in Genomic Sequences” (Qing Zhou)
2007
“Contributions to Law and Empirical Methods” (D. James Greiner)
“Decoding Mammalian Gene Regulatory Programs through Efficient Microarray, ChIP-chip and Sequence Analysis” (Hongkai Ji)

Limitation of Time to Degree. The department policy is that, except in unusual circumstances, students cannot register for the PhD program or be paid research assistant or teaching assistant salaries after their sixth year. A student who has completed the sixth year in the department and satisfied all requirements except the PhD dissertation may take a leave of absence, and the department will ordinarily consider a dissertation submitted before or during the ninth year. After the ninth year, the student is required to petition the faculty to have a dissertation considered, and will ordinarily be required to retake and pass the qualifying exam.

Master of Arts (AM)

The Department of Statistics welcomes applicants for the terminal AM degree. Typical AM candidates are PhD candidates in another field at Harvard for whom a statistics minor is appropriate, well-prepared undergraduates eligible for the AB/AM program, and candidates with appropriate mathematics backgrounds (linear algebra and multivariate calculus) who can demonstrate motivation for pursuing a terminal AM degree. As the Department of Statistics cannot provide tuition fellowships for terminal AM candidates, candidates seeking only the AM degree must be financially self-supporting. Teaching fellowships may be available for partial financial support.

The AM degree requires the satisfactory completion of eight half-courses approved by the department, normally requiring two terms of residence and study at Harvard. The courses must include at least six letter-graded half-courses at the level of Statistics 110 and above taken within the Department of Statistics. The actual course of study will vary according to the student’s interest and preparation and will be determined in consultation with the student’s advisor. Statistics 110 or 210 and Statistics 111 or 211 or equivalent are required. AM students must earn a B average in Statistics courses and no more than one C in all courses. Terminal AM students can take at most one 300-level course, which ordinarily cannot be used to meet the minimum requirement for letter-graded statistics courses.

The remaining two half-courses may include courses in related areas (such as economics, psychology, and biostatistics) that develop statistical methodology and are judged to be at an equivalent level to Statistics 110 or above. They may also include upper-level mathematics courses, computer science courses, or, in some cases, other courses that broaden the student’s ability to apply statistical methods. The department maintains a list of approved related courses. Generally, the department encourages a coherent theme connecting the related courses.

Admissions and Financial Aid

Students are admitted for the fall term only; applications must be received by December 15 for admission in the following fall. Applications received after December 15 cannot be guaranteed consideration. For more detailed information and forms, write to: Admissions Office, Harvard Graduate School of Arts and Sciences, 1350 Massachusetts Avenue, Cambridge, MA 02138. We encourage online submission of the application. See http://apply.embark.com/grad/Harvard/GSAS. GRE General scores are required, and subject scores, particularly in mathematics, are recommended. GREs should be taken by October so that examination score reports arrive in time for admission decisions. For financial aid, the appropriate financial aid application should be completed.

The statistics department usually provides adequate financial support, which includes tuition, health fees, and living expenses, to PhD students in good standing. In the first year of graduate study, this support typically involves a grant-in-aid to cover tuition, fees, and living expenses. In the second year, support is typically a grant-in-aid to cover tuition and fees, and teaching/research fellowships to cover living expenses. In the third through sixth years, when tuition is considerably reduced, the department usually can provide a teaching and research fellowship sufficient to cover tuition and living expenses. The department cannot provide financial aid beyond the sixth year.

Teaching and research fellowships are normally limited to 40 percent of full-time in the first two years and to 60 percent of full-time in the third through sixth years.

The statistics department is able to support a very limited number of qualified applicants each year. Applicants are therefore expected to apply for all non-Harvard and competitive Harvard scholarships for which they are eligible. For example, US citizens should investigate fellowships offered by the National Science Foundation and many other public and private sources.

Students with an interest in biostatistics should explore the PhD program in biostatistics at the School of Public Health. For more information, write to Department of Biostatistics, HSPH, 677 Huntington Avenue, Boston, MA 02115, or see www.biostat.harvard.edu.

Research Interests of the Faculty Currently Teaching in the Department

For more information on research in the Department of Statistics, see www.stat.harvard.edu.

Jose Blanchet, Assistant Professor of Statistics. B.Sc. in Applied Mathematics and in Actuarial Science, ITAM (Mexican Institute of Technology); M.Sc. and Ph.D. in Operations Research, Stanford University. Applied probability, computational finance, MCMC, queueing theory, simulation methodology, rare event estimation, and risk theory.

Joseph Blitzstein, Assistant Professor of Statistics. B.S. in Mathematics, California Institute of Technology, M.S. in Statistics, Stanford University, Ph.D. in Mathematics, Stanford University. Monte Carlo algorithms, Markov chains, importance sampling and variance-reduction techniques, combinatorics, and random graph models.

Tirthankar Dasgupta, Assistant Professor of Statistics; B.Sc. in Statistics, University of Calcutta; M.Stat. in Applied Statistics and Data Analysis and M.Tech. in Quality, Reliability and Operations Research, Indian Statistical Institute; Ph.D. in Industrial Engineering, Georgia Institute of Technology. Design of experiments, modeling and optimization of experimental data, process control and quality engineering. Areas of application include large-scale, reproducible, high-yield manufacture of nanostructures, the design of robust control systems, measurement systems, six sigma quality and supply chain management.

Yingying Fan, Lecturer on Statistics; B.S. in Statistics and Finance, University of Science and Technology of China; Ph.D. in Operations Research and Financial Engineering. Nonparametric methods, financial econometrics, machine learning, bioinformatics.

Rima Izem, Assistant Professor of Statistics. Maitrise and Licence de Mathematiques, Universite de Montpellier II; PhD, Statistics, University of North Carolina at Chapel Hill. Statistical methods in genetics and evolutionary biology. Analysis of high-dimension-, low-sample-size data; non-parametric methods. Functional data analysis (analysis of curves, images or shapes), smoothing, longitudinal data analysis, analysis of non-linear variations; and, more recently, spatial analysis.

S.C. Samuel Kou, John L. Loeb Associate Professor of the Natural Sciences. Computational Mathematics, Peking University; MS and PhD, Statistics, Stanford University. Stochastic modeling in natural sciences (such as nano-biophysics, chemistry and biology) and in economics and finance; inference about stochastic models (processes); statistical analysis of single-molecule experiments; non-parametric methods; model selection; Bayesian and empirical Bayesian methodology; Monte Carlo methods.

Yoonjung Lee, Assistant Professor of Statistics. BSc in Statistics, minor in Economics, Ewha Women’s University; MS, Statistics, Florida State University; Joint PhD in Statistics and Finance, University of Wisconsin-Madison. Stochastic analysis with applications to finance, financial engineering, market microstructure; financial markets; liquidity/credit risk modeling; filtering; parameter estimation in partially observed systems; stochastic partial differential equations; and stochastic control.

Jun S. Liu, Professor of Statistics. BS, Mathematics, Beijing University; PhD, The University of Chicago. Bayesian methodology: modeling, testing, and nonparametrics. Monte Carlo methodology: Gibbs sampling and MCMC methods; MC methods in physics, material science, chemistry, and structural biology; rate of convergence. Dynamic systems: nonlinear state-space models; target tracking; digital signal processing; financial data modeling. Bioinformatics and computational biology: gene regulation; sequence alignment; protein structure prediction; gene clustering and classification; genetics. Statistical missing data problems.

Xiao-Li Meng, Professor of Statistics. BS, Mathematics, Fudan University; AM and PhD, Statistics, Harvard University. Statistical inference under complex settings, such as partially observed data, pre-processed data, and simulated data. Quantifying statistical information and efficiency in scientific studies. Statistical principles and foundational issues, especially regarding model uncongeniality, self-efficiency, and quantifying ignorance. Bayesian wavelet and multi-resolution methods, especially with missing data. Bayesian ranking and mapping. Stochastic and deterministic iterative algorithms, especially perfect sampling. Applications of the above research to astrophysics, genetic and environmental studies, demosaicing and image reconstruction, mental health surveys, and survival analysis.

Carl N. Morris, Professor of Statistics. BS, Engineering, California Institute of Technology; MS and PhD, Statistics, Stanford University. Hierarchical modeling, Bayesian and likelihood theory, exponential families, and statistical applications, especially in mental health and health policy research, and also in other scientific areas, including sports and competition.

Bernard Rosner, Professor of Medicine, Harvard Medical School, and Professor of Biostatistics, Harvard School of Public Health. BA, Mathematics, Columbia University; MA, Statistics, Stanford University; PhD, Statistics, Harvard. Methodological issues in biostatistics and outlier detection theory; the former includes longitudinal data analysis, analysis of clustered binary data, measurement error problems, methodological problems in hypertension screening and evaluation, sports statistics, and modeling breast cancer incidence data.

Donald B. Rubin, John L. Loeb Professor of Statistics. AB, Psychology, Princeton University; MS, Computer Science, Harvard University; PhD, Statistics, Harvard. Causal inference in experiments and observational studies, including complex situations with noncompliance and dropout; computation and inference in sample surveys with nonresponse and in missing data problems, including EM-type algorithms; application of Bayesian and empirical Bayesian techniques; and developing and applying statistical models to data in a variety of scientific and policy-relevant disciplines.

Alan M. Zaslavsky, Associate Professor of Statistics in the Department of Health Care Policy, Harvard Medical School. AB, Government, Harvard; MS, Mathematics, Northeastern University; PhD, Applied Mathematics-Statistics, Massachusetts Institute of Technology. Health care policy, hierarchical Bayes models, design and analysis of surveys, discrete data, small area estimation, government statistics, statistical computing environments.

Emeritus Faculty

Herman Chernoff, Professor of Statistics Emeritus. BS, Mathematics, City College of New York; MSc and PhD, Applied Mathematics, Brown University. Sequential analysis, optimal design of experiments and pattern recognition, statistics applied to molecular biology.

Arthur P. Dempster, Research Professor of Theoretical Statistics. BA, Mathematics and Physics, University of Toronto; MA, Mathematics, University of Toronto; PhD, Mathematical Statistics, Princeton University. Statistical science as probabilistic reasoning from data and model assumptions with reference to unique inferential situations, primarily through methods and analyses based on the Dempster-Shafer calculus and its specializations to Fisherian and Bayesian inference. Areas of applied interest include biometric identification, machine learning applied to pattern and network identification in genomics, and physical and statistical modeling and analysis related to climate change and similar environmental issues.