Mathematics & Statistics

# Frontiers in Statistics (Leader: Dr. George Yanev)

## Friday, April 20, 2007

| Title | Speaker | Time | Place |
|---|---|---|---|
| SAS Opportunities for Students and Faculty | Elizabeth Ceranowski, SAS Student Program Manager, SAS Institute | 1:00pm-2:00pm | PHY 120 |

Abstract

We will talk about the opportunities SAS has for students and faculty. These include software, recognition, jobs, and certification, technical or otherwise.

The speaker is also a BASE and Advanced SAS Certified Programmer.

## Thursday, April 12, 2007

| Title | Speaker | Time | Place |
|---|---|---|---|
| Subgroup Analysis: A Stylized Bayes Approach | Siva Sivaganesan, University of Cincinnati | 3:00pm-4:00pm | LIF 262 |

Abstract

Subgroup analysis is recognized to be important in clinical trials but lacks a formal approach that addresses the main issues, such as accounting for multiple testing and limits on the number of tests. We introduce a new approach to inference for subgroups. The main elements of the proposed approach are the use of a priority ordering on covariates to define potential subgroups and the use of posterior probabilities to identify subgroup effects for reporting. We employ Bayesian model selection methods with objective priors to determine the posterior probabilities of subgroup effects. As usual in Bayesian clinical trial design, we compute frequentist operating characteristics (OCs). We achieve the desired OCs by obtaining a suitable threshold for the posterior probabilities.

## Monday, February 26, 2007

| Title | Speaker | Time | Place |
|---|---|---|---|
| Mixture models in genetic research | Wonkuk Kim, Stony Brook University | 2:00pm-3:00pm | PHY 109 |

Abstract

Two different types of mixture models will be presented. My first model is a mixture model with known mixing proportions. The classical regularity conditions for the asymptotic convergence of the null distribution of the likelihood ratio test statistic (LRTS) are not satisfied because of the degeneracy of the Fisher information matrix. The talk covers a brief sketch of the proof that the asymptotic null distribution of the likelihood ratio test (LRT) for two or more components does not depend on the number of components. As an example, Gamma mixtures are applied to an F-2 breeding experiment in classical genetics to detect a major gene.

Second, the test of whether the distribution of genotypes of a single nucleotide polymorphism (SNP) in a control population is the same as the distribution in an affected population can be carried out using the $$2\times 3$$ test of independence. When the genotype is determined by an underlying continuous measure that is a mixture of three normal components, the LRT of the equality of mixing proportions is an alternative. We compare the performance of these tests by first calculating the power of the LRT and the relative efficiency of the $$2\times 3$$ test to the LRT. When the minor SNP allele frequency is less than $$0.2$$ in both cases and controls and the separation between genotype components is small, the LRT is more efficient than the $$2\times 3$$ test. We present detailed tables of efficiencies and the limiting behavior of the relative efficiency.
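For readers unfamiliar with the $$2\times 3$$ test of independence, the following is a minimal sketch, assuming the usual Pearson chi-square statistic on a cases-by-genotype table. The function name and the genotype counts are hypothetical and are not taken from the talk; the p-value uses the fact that a chi-square distribution with 2 degrees of freedom has survival function $$e^{-x/2}$$.

```python
import math

def chi_square_2x3(table):
    """Pearson chi-square test of independence for a 2x3 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    # (2 - 1) * (3 - 1) = 2 degrees of freedom, so P(X > x) = exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value

# Hypothetical genotype counts (AA, Aa, aa): first row cases, second row controls.
counts = [[30, 50, 20],
          [45, 45, 10]]
stat, p_value = chi_square_2x3(counts)
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")
```

This sketch covers only the $$2\times 3$$ test; the LRT for mixing proportions discussed in the abstract requires fitting the normal mixture and is not shown.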

## Friday, February 23, 2007

| Title | Speaker | Time | Place |
|---|---|---|---|
| A general formulation for a one-sided group sequential design | Barry K. Moser, Duke University Medical Center | 2:00pm-3:00pm | PHY 108 |

Abstract

This talk focuses on one-sided group sequential designs based on conditional probabilities. A general design formulation is developed. This formulation is then shown to be equivalent to the commonly used one-sided group sequential procedures developed by Pampallona and Tsiatis. The value of the unknown parameter of the conditional probability is shown to control the interpretation of the results of the design. A graphical procedure is proposed to address issues of futility or efficacy when any test statistic is used at an interim stage. An example is used to illustrate the proposed graphical procedure. Finally, the interim boundaries developed from the conditional probabilities also have implications for stochastic curtailment procedures. These implications lead to recommendations on the application of stochastic curtailment stopping rules.

## Thursday, February 22, 2007

| Title | Speaker | Time | Place |
|---|---|---|---|
| Screening for Differentially Expressed Genes Using Bayes Factors | Fang Yu, University of Connecticut | 2:00pm-3:00pm | LIF 261 |

Abstract

A common interest in microarray data analysis is to identify genes having different expression levels between two conditions. The existing methods include two-sample t-statistics, a modified t-statistic (SAM), semiparametric hierarchical Bayesian models, and nonparametric permutation tests. All of these methods essentially compare two population means. In this talk, we consider using the Bayes factor to compare gene expression levels. The Bayes factor approach is quite attractive and flexible in evaluating the evidence for a gene to be differentially expressed, as it allows us to compare not only two population means but also the population distributions. To facilitate the use of the Bayes factor, we propose a new calibration approach that weighs two types of error probabilities differently from the prior predictive distribution of the Bayes factor for each gene and at the same time controls overall error rates for all genes under consideration. Moreover, a novel gene selection algorithm based on the calibration of the Bayes factor is developed, and the theoretical properties of the proposed method are carefully examined. Our method is shown through simulations to have a smaller false discovery rate (FDR) and false non-discovery rate (FNDR) than several existing methods. Finally, a real dataset from an Affymetrix microarray experiment to identify genes associated with the onset of osteoblast differentiation is used to further illustrate the proposed methodology.

## Tuesday, February 20, 2007

| Title | Speaker | Time | Place |
|---|---|---|---|
| Analyzing and modeling dichotomous traits in large complex pedigrees | Charalampos Papachristou, University of Chicago | 3:00pm-4:00pm | PHY 118 |

Abstract

Although it is believed that many common complex disorders have a genetic basis, attempts to unravel the transmission mechanism governing such traits have met with limited success. It has been suggested that isolated founder populations with large, known pedigrees may be advantageous for complex trait mapping. However, their utility has been moderated by the extreme computational intensity involved in the analysis of such pedigrees as a whole.

We propose a likelihood method for modeling the transmission of dichotomous traits that can handle large pedigrees in a fast and efficient way. Using generalized linear mixed models, we extend the method of Abney et al. (2002) for mapping quantitative trait loci (QTLs) to accommodate binary traits. The high dimensionality of the integration involved in the likelihood prohibits exact computations. We show that one can overcome this hurdle and obtain the maximum likelihood estimates of the model parameters through the use of an efficient Monte Carlo expectation maximization (MCEM) algorithm.

Analysis of data from a 13-generation pedigree consisting of 1,653 Hutterites, focusing on the diabetes phenotype, reveals evidence for the existence of at least one locus with a dominant mode of trait transmission.

## Monday, February 19, 2007

| Title | Speaker | Time | Place |
|---|---|---|---|
| Estimating Reaction Constants in Stochastic Intracellular Networks | Greg A. Rempala, University of Louisville | 2:00pm-3:00pm | PHY 013 |

Abstract

One of the key issues of interest in analyzing stochastic kinetic models of reaction networks involving RNA and DNA molecules (e.g., gene transcription) is how to infer the values of the reaction constants. Under the mass action kinetics assumption, this is relatively straightforward when the system trajectories are fully observed; however, this is rarely the case in practice. The talk will summarize some recent developments in the area of Bayesian inference for reaction constants using MCMC methodology in “data-poor” settings. In particular, it will attempt to indicate the benefits as well as the challenges of this approach with some examples of inference for well-known biochemical network models, e.g., gene transcription and auto-regulation.
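To illustrate why the fully observed case is straightforward, here is a minimal sketch, assuming a single degradation reaction X → ∅ with mass action hazard $$c\,x$$ simulated by the Gillespie algorithm. The reaction, the function names, and the parameter values are hypothetical choices for illustration, not the models from the talk. With a complete trajectory, the maximum likelihood estimate of $$c$$ is the number of observed reactions divided by the time integral of $$x(t)$$.

```python
import random

def simulate_degradation(x0, c, rng):
    """Gillespie simulation of X -> 0 with mass-action hazard c * x.

    Returns (state, holding_time) pairs: the fully observed trajectory.
    Each pair ends with exactly one reaction event.
    """
    path = []
    x = x0
    while x > 0:
        hazard = c * x
        tau = rng.expovariate(hazard)  # waiting time until the next reaction
        path.append((x, tau))
        x -= 1
    return path

def mle_rate_constant(path):
    """MLE of c from a complete trajectory: events / integral of x(t) dt."""
    n_events = len(path)
    exposure = sum(x * tau for x, tau in path)
    return n_events / exposure

rng = random.Random(2007)  # fixed seed so the run is reproducible
true_c = 0.5
path = simulate_degradation(x0=1000, c=true_c, rng=rng)
c_hat = mle_rate_constant(path)
print(f"true c = {true_c}, estimated c = {c_hat:.3f}")
```

When only partial observations are available, the holding times and event counts are unknown, which is the setting where the MCMC methods mentioned in the abstract come in.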

## Friday, February 9, 2007

| Title | Speaker | Time | Place |
|---|---|---|---|
| Inferential Procedures Based on the Generalized Variable Approach With Applications | Kalimuthu Krishnamoorthy, University of Louisiana at Lafayette | 2:00pm-3:00pm | PHY 013 |

Abstract

The generalized $$p$$-value was introduced by Tsui and Weerahandi (1989, JASA) and the generalized confidence interval by Weerahandi (1993, JASA). The concepts of generalized $$p$$-values and generalized confidence intervals have turned out to be extremely fruitful for obtaining tests and confidence intervals involving non-standard parameters, such as the lognormal mean and quantiles in a one-way random model. In this talk, I will first explain a method of constructing a generalized pivotal quantity for a parameter in a general setup. Then, the construction of generalized pivotal quantities and inferential procedures based on them will be outlined for normal parameters, the lognormal mean, and the comparison of two lognormal means. I will briefly explain the applications of the generalized variable (GV) approach for setting tolerance limits in a one-way random model, for correlation analysis in a multivariate normal distribution, and for finding one-sided limits for stress-strength reliability involving two-parameter exponential distributions. I will also compare the results based on the GV approach with those of other methods, and illustrate the results with practical examples.
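As an illustration of the GV approach for the lognormal mean, the sketch below simulates a commonly used generalized pivotal quantity for $$\eta = \mu + \sigma^2/2$$, namely $$T = \bar{y} - \frac{Z}{\sqrt{U/(n-1)}}\,\frac{s}{\sqrt{n}} + \frac{s^2}{2\,U/(n-1)}$$, where $$\bar{y}$$ and $$s^2$$ are the observed mean and variance of the log-data, $$Z \sim N(0,1)$$, and $$U \sim \chi^2_{n-1}$$. Whether the talk used exactly this form is an assumption; the function name, the sample, and the Monte Carlo settings are hypothetical.

```python
import math
import random
import statistics

def lognormal_mean_gci(data, level=0.95, draws=20000, seed=0):
    """Monte Carlo generalized confidence interval for the lognormal mean."""
    rng = random.Random(seed)
    logs = [math.log(x) for x in data]
    n = len(logs)
    ybar = statistics.mean(logs)
    s2 = statistics.variance(logs)  # sample variance, divisor n - 1
    s = math.sqrt(s2)
    values = []
    for _ in range(draws):
        z = rng.gauss(0.0, 1.0)
        u = rng.gammavariate((n - 1) / 2.0, 2.0)  # chi-square with n - 1 df
        ratio = u / (n - 1)
        # Generalized pivotal quantity for eta = mu + sigma^2 / 2.
        t = ybar - z / math.sqrt(ratio) * s / math.sqrt(n) + s2 / (2.0 * ratio)
        values.append(t)
    values.sort()
    alpha = 1.0 - level
    lo_idx = max(0, int(alpha / 2.0 * draws))
    hi_idx = min(draws - 1, int((1.0 - alpha / 2.0) * draws))
    # Exponentiate to move from eta to the lognormal mean exp(eta).
    return math.exp(values[lo_idx]), math.exp(values[hi_idx])

# Hypothetical positive measurements, for illustration only.
sample = [1.2, 0.8, 2.5, 3.1, 0.6, 1.9, 4.2, 1.1, 0.9, 2.2]
lower, upper = lognormal_mean_gci(sample)
print(f"95% generalized CI for the lognormal mean: ({lower:.3f}, {upper:.3f})")
```

The interval endpoints are empirical percentiles of the simulated values, so they vary slightly with the seed and the number of draws.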

## Friday, February 2, 2007

| Title | Speaker | Time | Place |
|---|---|---|---|
| Exceedance Problems for a Family of Branching Processes | Husna Hasan, School of Mathematical Sciences, Universiti Sains Malaysia | 1:00pm-2:00pm | PHY 120 |

Abstract

The problem of the first exceedance of a given level by a family of independent branching processes is considered. Limit theorems will be presented for the index of the first process exceeding some fixed and increasing levels in the subcritical, critical, and supercritical cases, when the processes have common and different offspring distributions.
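The quantity studied here can be explored by simulation. The sketch below is one possible reading, assuming independent Galton-Watson processes with a common offspring distribution, each started from one individual, where a process "exceeds" the level if any generation size is larger than the level. That definition, the function names, the offspring distribution, and the caps on generations and processes are all assumptions made for illustration.

```python
import random

def exceeds_level(offspring_probs, level, rng, max_generations=200):
    """True if some generation size of a Galton-Watson process exceeds level."""
    sizes = range(len(offspring_probs))  # possible offspring counts 0, 1, 2, ...
    population = 1
    for _ in range(max_generations):
        if population > level:
            return True
        if population == 0:
            return False  # extinct before reaching the level
        # Each individual reproduces independently.
        population = sum(rng.choices(sizes, weights=offspring_probs, k=population))
    return population > level

def first_exceedance_index(offspring_probs, level, rng, max_processes=100000):
    """Index (starting at 1) of the first process exceeding level, or None."""
    for index in range(1, max_processes + 1):
        if exceeds_level(offspring_probs, level, rng):
            return index
    return None

rng = random.Random(42)
critical = [0.25, 0.5, 0.25]  # offspring mean 1, the critical case
index = first_exceedance_index(critical, level=20, rng=rng)
print(f"first process exceeding the level has index {index}")
```

Repeating the last call many times gives an empirical distribution of the index, which can be compared across subcritical, critical, and supercritical offspring distributions.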

## Friday, January 26, 2007

| Title | Speaker | Time | Place |
|---|---|---|---|
| Parameter Estimation of Record Breaking Data and Some Characterization Problems | Alfred Mbah | 1:00pm-2:00pm | PHY 120 |

Abstract

We will present the general problem of classical parametric inference from record-breaking data, which was first addressed by Samaniego and Whitaker (1986, 1988). Hoinkes and Padgett (1994) extended the work of Samaniego and Whitaker (1986, 1988) to the Weibull distribution. I will present results along this line for the Gumbel distribution and a comparison between the Weibull and Gumbel distributions.

I will also talk about some distributional properties of lower generalized order statistics (LGOS). Based on the distributional properties of LGOS some characterizations of the power function distribution will be given.

## Friday, January 19, 2007

| Title | Speaker | Time | Place |
|---|---|---|---|
| Johnson System and Mixture Modeling for Gene Expression Data Analysis | Florence George | 1:00pm-2:00pm | PHY 120 |

Abstract

A common task in analyzing microarray data is to determine which genes are differentially expressed across two kinds of tissue samples or samples obtained under two experimental conditions. In recent years, several statistical methods have been proposed to accomplish this goal when there are replicated samples under each condition. In this talk, the Johnson system of curves will be introduced. We will discuss how the Johnson system can be used for gene expression data analysis. A mixture model approach for gene expression data will also be discussed.