Frontiers in Statistics
(Leader: Dr. George Yanev)
Friday, April 20, 2007
| Title |
SAS Opportunities for Students and Faculty |
| Speaker |
Elizabeth Ceranowski
SAS Student Program Manager
SAS Institute |
| Time |
1:00-2:00 p.m. |
| Place |
PHY 120 |
Abstract
We will talk about the opportunities SAS has for students and faculty. This
includes: software, recognition, jobs, and certification — technical or otherwise.
The speaker is also a BASE and Advanced SAS Certified Programmer.
Thursday, April 12, 2007
| Title |
Subgroup Analysis: A Stylized Bayes Approach |
| Speaker |
Siva Sivaganesan
University of Cincinnati |
| Time |
3:00-4:00 p.m. |
| Place |
LIF 262 |
Abstract
Subgroup analysis is recognized to be important in clinical trials but lacks
a formal approach that addresses the main issues such as accounting for multiple
testing and limits on the number of tests. We introduce a new approach
to inference for subgroups. The main elements of the proposed approach are
the use of a priority ordering on covariates to define potential subgroups
and the use of the posterior probabilities to identify subgroup effects for
reporting. We employ Bayesian model selection methods with objective priors
to determine the posterior probabilities of subgroup effects. As is usual in Bayesian
clinical trial design, we compute frequentist operating characteristics (OCs).
We achieve the desired OCs by obtaining a suitable threshold for the posterior
probabilities.
Monday, February 26, 2007
| Title |
Mixture models in genetic research |
| Speaker |
Wongkuk Kim
Stony Brook University |
| Time |
2:00-3:00 p.m. |
| Place |
PHY 109 |
Abstract
Two different types of mixture models will be presented. My first model is
a mixture model with known mixing proportions. The classical regularity conditions
for the asymptotic convergence of the null distribution of the likelihood ratio
test statistic (LRTS) are not satisfied because of the degeneracy of the Fisher
information matrix. The talk covers a brief sketch of the proof that the asymptotic
null distribution of the likelihood ratio test (LRT) for two or more components
does not depend on the number of components. As an example, Gamma mixtures
are applied to an F-2 breeding experiment in classical genetics to detect a
major gene.
Second, the test of whether the distribution of genotypes of a single nucleotide
polymorphism (SNP) in a control population is the same as the distribution
in an affected population can be carried out using the 2×3 test of independence.
When the genotyping is determined by an underlying continuous measure that
is a mixture of three normal components, the LRT of the equality of mixing
proportions is an alternative. We compare the performance of these tests by
first calculating the power of the LRT and the relative efficiency of the 2×3
test to the LRT. When the minor SNP allele frequency is less than 0.2 in both
cases and controls and the separation between genotype components is small,
the LRT is more efficient than the 2×3 test. We present detailed tables
of efficiencies and the limiting behavior of the relative efficiency.
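
As a rough illustration of the 2×3 test of independence mentioned in the abstract, the following sketch computes the Pearson chi-square statistic for a table of genotype counts in controls and cases. The counts and the function name chi_square_2x3 are hypothetical and are not taken from the talk. With 2 degrees of freedom the chi-square upper tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def chi_square_2x3(table):
    """Pearson chi-square test of independence for a 2x3 table of counts.

    `table` is a list of two rows (e.g. controls, cases), each holding
    three genotype counts. Returns (statistic, p_value).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # Degrees of freedom: (2 - 1) * (3 - 1) = 2.
    # For 2 degrees of freedom, P(X > x) = exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical genotype counts (AA, Aa, aa), for illustration only.
controls = [60, 30, 10]
cases = [40, 40, 20]
stat, p_value = chi_square_2x3([controls, cases])
print(f"chi-square = {stat:.3f}, p-value = {p_value:.4f}")
```

The likelihood ratio test on the underlying continuous measure, which the abstract compares against this test, is not reproduced here.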
Friday, February 23, 2007
| Title |
A general formulation for a one-sided group sequential design |
| Speaker |
Barry K. Moser
Duke University Medical Center |
| Time |
2:00-3:00 p.m. |
| Place |
PHY 108 |
Abstract
This talk focuses on one-sided group sequential designs based on conditional
probabilities. A general design formulation is developed. This formulation
is then shown to be equivalent to the commonly used one-sided group sequential
procedures developed by Pampallona and Tsiatis. The value of the unknown parameter
of the conditional probability is shown to control the interpretation of the
results of the design. A graphical procedure is proposed to address issues
of futility or efficacy when any test statistic is used at an interim stage.
An example is used to illustrate the proposed graphical procedure. Finally,
the interim boundaries developed from the conditional probabilities also have
implications for stochastic curtailment procedures. These implications lead
to recommendations on the application of stochastic curtailment stopping rules.
Thursday, February 22, 2007
| Title |
Screening for Differentially Expressed Genes Using Bayes
Factors |
| Speaker |
Fang Yu
University of Connecticut |
| Time |
2:00-3:00 p.m. |
| Place |
LIF 261 |
Abstract
A common interest in microarray data analysis is to identify genes having
different expression levels between two conditions. The existing methods include
using two-sample t-statistics, a modified t-statistic (SAM), semiparametric
hierarchical Bayesian models, and nonparametric permutation tests. All of these
methods essentially compare two population means. In this talk, we consider
using the Bayes factor to compare gene expression levels. The Bayes factor
approach is quite attractive and flexible in evaluating the evidence for a
gene to be differentially expressed as it allows us to compare not only two
population means but also the population distributions. To facilitate the use
of the Bayes factor, we propose a new calibration approach that weighs two
types of error probabilities differently from the prior predictive distribution
of the Bayes factor for each gene and at the same time controls overall error
rates for all genes under consideration. Moreover, a novel gene selection
algorithm based on the calibration of the Bayes factor is developed and the
theoretical properties of the proposed method are carefully examined. Our method
is shown to have smaller false discovery rate (FDR) and false non-discovery
rate (FNDR) than several existing methods through simulations. Finally, a real
dataset from an Affymetrix microarray experiment to identify genes associated
with the onset of osteoblast differentiation is used to further illustrate
the proposed methodology.
Tuesday, February 20, 2007
| Title |
Analyzing and modeling dichotomous traits in large complex pedigrees |
| Speaker |
Charalampos Papachristou
University of Chicago |
| Time |
3:00-4:00 p.m. |
| Place |
PHY 118 |
Abstract
Although it is believed that many common complex disorders have a genetic
basis, attempts to unravel the transmission mechanism governing such traits
have met with limited success. It has been suggested that isolated founder
populations with large, known pedigrees may be advantageous for complex trait
mapping. However, their utility has been moderated by the extreme computational
intensity involved in the analysis of such pedigrees as a whole.
We propose a likelihood method for modeling the transmission of
dichotomous traits that can handle large pedigrees in a fast and efficient way.
Using generalized linear mixed models, we extend the method of Abney
et al. (2002) for mapping quantitative trait loci (QTLs) to accommodate
binary traits. The high dimensionality of the integration involved in the
likelihood prohibits exact computations. We show that one can overcome this hurdle
and obtain the maximum likelihood estimates of the model parameters through the
use of an efficient Monte Carlo expectation maximization (MCEM) algorithm.
Analysis of data from a 13-generation pedigree consisting of 1,653 Hutterites,
focusing on the diabetes phenotype, reveals evidence for the existence of at
least one locus with a dominant mode of trait transmission.
Monday, February 19, 2007
| Title |
Estimating Reaction Constants in Stochastic Intracellular Networks |
| Speaker |
Greg A. Rempala
University of Louisville |
| Time |
2:00-3:00 p.m. |
| Place |
PHY 013 |
Abstract
One of the key issues of interest in analyzing stochastic kinetic models of
reaction networks involving RNA and DNA molecules (e.g., gene transcription)
is how to infer the values of the reaction constants. Under the mass action kinetics
assumption this is relatively straightforward when the system trajectories
are fully observed; however, this is rarely the case in practice. The talk
shall summarize some recent developments in the area of Bayesian inference
for reaction constants using MCMC methodology in “data-poor” settings.
In particular, it shall attempt to indicate the benefits as well as the challenges
of this approach with some examples of inference for well-known biochemical
network models, e.g., gene transcription and auto-regulation.
Friday, February 9, 2007
| Title |
Inferential Procedures Based on the Generalized Variable
Approach With Applications |
| Speaker |
Kalimuthu Krishnamoorthy
University of Louisiana at Lafayette |
| Time |
2:00-3:00 p.m. |
| Place |
PHY 013 |
Abstract
The generalized p-value was introduced by Tsui and Weerahandi
(1989, JASA) and the generalized confidence interval by Weerahandi (1993, JASA).
The concepts of generalized p-values and generalized confidence
intervals have turned out to be extremely fruitful for obtaining tests and
confidence intervals involving non-standard parameters, such as the lognormal
mean and quantiles in a one-way random model. In this talk, I will first explain
a method of constructing a generalized pivotal quantity for a parameter in
a general setup. Then, the construction of generalized pivotal quantities and inferential
procedures based on them will be outlined for normal parameters, the lognormal
mean, and the comparison of two lognormal means. I will briefly explain the applications
of the generalized variable (GV) approach for setting tolerance limits in a one-way
random model, for correlation analysis in a multivariate normal distribution,
and for finding one-sided limits for stress-strength reliability involving two-parameter
exponential distributions. I will also compare the results based on the GV
approach with those of the other methods, and illustrate the results with practical
examples.
Friday, February 2, 2007
| Title |
Exceedance Problems for a Family of Branching Processes |
| Speaker |
Dr. Husna Hasan
School of Mathematical Sciences
Universiti Sains Malaysia |
| Time |
1:00-2:00 p.m. |
| Place |
PHY 120 |
Abstract
The problem of the first exceedance of a given level by a family of independent
branching processes is considered. Limit theorems will be presented for the index
of the first process exceeding a fixed or increasing level in the subcritical, critical,
and supercritical cases, when the processes have common or different offspring
distributions.
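
To make the quantity in this abstract concrete, here is a minimal simulation sketch, not the speaker's method. It assumes Galton-Watson processes that each start from one individual, and it assumes a process "exceeds" the level when its population size at some generation is strictly greater than that level. The function names, the generation cap, and the offspring distribution below are all hypothetical choices made for illustration.

```python
import random


def exceeds_level(offspring_probs, level, rng, max_generations=200):
    """Simulate one Galton-Watson process started from a single individual.

    offspring_probs[k] is the probability that an individual has k children.
    Returns True if the population size goes above `level` within
    `max_generations` generations (an assumed cap), otherwise False.
    """
    counts = range(len(offspring_probs))
    size = 1
    for _ in range(max_generations):
        if size > level:
            return True
        if size == 0:
            return False  # the process is extinct
        # Each of the `size` individuals reproduces independently.
        size = sum(rng.choices(counts, weights=offspring_probs, k=size))
    return size > level


def first_exceedance_index(offspring_probs, level, rng, max_processes=100_000):
    """Index (starting at 1) of the first process in an independent family
    that exceeds `level`, or None if none does within `max_processes`."""
    for index in range(1, max_processes + 1):
        if exceeds_level(offspring_probs, level, rng):
            return index
    return None


# Hypothetical critical offspring law: 0, 1 or 2 children, mean 1.
rng = random.Random(2007)
index = first_exceedance_index([0.25, 0.5, 0.25], level=20, rng=rng)
print("first process exceeding level 20:", index)
```

Repeating the last call over many seeds and several levels gives an empirical distribution of the index, which is the object the limit theorems describe.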
Friday, January 26, 2007
| Title |
Parameter Estimation of Record Breaking Data and Some Characterization
Problems |
| Speaker |
Alfred Mbah |
| Time |
1:00-2:00 p.m. |
| Place |
PHY 120 |
Abstract
We shall present the general problem of classical parametric inference from
record breaking data which was first addressed by Samaniego and Whitaker (1986,
1988). Hoinkes and Padgett (1994) extended the work of Samaniego and Whitaker
(1986, 1988) to the Weibull distribution. I will present analogous results
for the Gumbel distribution and a comparison between the Weibull and
Gumbel distributions.
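
For readers unfamiliar with record breaking data, the sketch below shows one common way such data are summarized: the successive lower records (new minima) in a sequence, together with the number of observations each record stood as the current minimum. This is an illustrative assumption about the data layout, not the estimation procedure from the talk. The function name and the sample values are hypothetical.

```python
def lower_records(observations):
    """Extract lower record values and how long each one stood.

    Returns (records, waits), where records[i] is the i-th new minimum and
    waits[i] counts the observations, including the record itself, seen
    while records[i] was the current minimum.
    """
    records, waits = [], []
    for x in observations:
        if not records or x < records[-1]:
            # A strictly smaller value sets a new lower record.
            records.append(x)
            waits.append(1)
        else:
            # The current record still stands for one more observation.
            waits[-1] += 1
    return records, waits


# Hypothetical measurements, for illustration only.
records, waits = lower_records([5, 7, 3, 4, 6, 1])
print(records, waits)
```

Parametric inference then works from these pairs alone rather than from the full sample.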
I will also talk about some distributional properties of lower generalized
order statistics (LGOS). Based on the distributional properties of LGOS some
characterizations of the power function distribution will be given.
Friday, January 19, 2007
| Title |
Johnson System and Mixture Modeling for Gene Expression Data Analysis |
| Speaker |
Florence George |
| Time |
1:00-2:00 p.m. |
| Place |
PHY 120 |
Abstract
A common task in analyzing microarray data is to determine which genes are
differentially expressed across two kinds of tissue samples or samples obtained
under two experimental conditions. In recent years several statistical methods
have been proposed to accomplish this goal when there are replicated samples
under each condition. In this talk the Johnson system of curves will be introduced.
We will discuss how the Johnson system can be used for gene expression data analysis.
A mixture model approach for gene expression data will also be discussed.