In the first part of this work, we develop a novel scheme for solving nonparametric regression problems. That is the approximation of possibly low regular and noised functions from the knowledge of their approximate values given at some random points. Our proposed scheme is based on the use of the pseudo-inverse of a random projection matrix, combined with some specific properties of the Jacobi polynomials system, as well as some properties of positive definite random matrices. This scheme has the advantages to be stable, robust, accurate and fairly fast in terms of execution time. Moreover and unlike most of the existing nonparametric regression estimators, no extra regularization step is required by our proposed estimator. Although, this estimator is initially designed to work with random sampling set of uni-variate i.i.d. random variables following a Beta distribution, we show that it is still work for a wide range of sampling distribution laws. Moreover, we briefly describe how our estimator can be adapted in order to handle the multivariate case of random sampling sets. In the second part of this work, we extend the random pseudo-inverse scheme technique to build a stable and accurate estimator for solving linear functional regression (LFR) problems. A dyadic decomposition approach is used to construct this last stable estimator for the LFR problem. The performance of the two proposed estimators are illustrated by various numerical simulations. In particular, a real dataset is used to illustrate the performance of our nonparametric regression estimator.

### 相关内容

We study approximation methods for a large class of mixed models with a probit link function that includes mixed versions of the binomial model, the multinomial model, and generalized survival models. The class of models is special because the marginal likelihood can be expressed as Gaussian weighted integrals or as multivariate Gaussian cumulative density functions. The latter approach is unique to the probit link function models and has been proposed for parameter estimation in complex, mixed effects models. However, it has not been investigated in which scenarios either form is preferable. Our simulations and data example show that neither form is preferable in general and give guidance on when to approximate the cumulative density functions and when to approximate the Gaussian weighted integrals and, in the case of the latter, which general purpose method to use among a large list of methods.

Distributional regression is extended to Gaussian response vectors of dimension greater than two by parameterizing the covariance matrix $\Sigma$ of the response distribution using the entries of its Cholesky decomposition. The more common variance-correlation parameterization limits such regressions to bivariate responses -- higher dimensions require complicated constraints among the correlations to ensure positive definite $\Sigma$ and a well-defined probability density function. In contrast, Cholesky-based parameterizations ensure positive definiteness for all distributional dimensions no matter what values the parameters take, enabling estimation and regularization as for other distributional regression models. In cases where components of the response vector are assumed to be conditionally independent beyond a certain lag $r$, model complexity can be further reduced by setting Cholesky parameters beyond this lag to zero a priori. Cholesky-based multivariate Gaussian regression is first illustrated and assessed on artificial data and subsequently applied to a real-world 10-dimensional weather forecasting problem. There the regression is used to obtain reliable joint probabilities of temperature across ten future times, leveraging temporal correlations over the prediction period to obtain more precise and meteorologically consistent probabilistic forecasts.

Consider the task of matrix estimation in which a dataset $X \in \mathbb{R}^{n\times m}$ is observed with sparsity $p$, and we would like to estimate $\mathbb{E}[X]$, where $\mathbb{E}[X_{ui}] = f(\alpha_u, \beta_i)$ for some Holder smooth function $f$. We consider the setting where the row covariates $\alpha$ are unobserved yet the column covariates $\beta$ are observed. We provide an algorithm and accompanying analysis which shows that our algorithm improves upon naively estimating each row separately when the number of rows is not too small. Furthermore when the matrix is moderately proportioned, our algorithm achieves the minimax optimal nonparametric rate of an oracle algorithm that knows the row covariates. In simulated experiments we show our algorithm outperforms other baselines in low data regimes.

We consider the Bayesian analysis of models in which the unknown distribution of the outcomes is specified up to a set of conditional moment restrictions. The nonparametric exponentially tilted empirical likelihood function is constructed to satisfy a sequence of unconditional moments based on an increasing (in sample size) vector of approximating functions (such as tensor splines based on the splines of each conditioning variable). For any given sample size, results are robust to the number of expanded moments. We derive Bernstein-von Mises theorems for the behavior of the posterior distribution under both correct and incorrect specification of the conditional moments, subject to growth rate conditions (slower under misspecification) on the number of approximating functions. A large-sample theory for comparing different conditional moment models is also developed. The central result is that the marginal likelihood criterion selects the model that is less misspecified. We also introduce sparsity-based model search for high-dimensional conditioning variables, and provide efficient MCMC computations for high-dimensional parameters. Along with clarifying examples, the framework is illustrated with real-data applications to risk-factor determination in finance, and causal inference under conditional ignorability.

In this paper several related estimation problems are addressed from a Bayesian point of view and optimal estimators are obtained for each of them when some natural loss functions are considered. Namely, we are interested in estimating a regression curve. Simultaneously, the estimation problems of a conditional distribution function, or a conditional density, or even the conditional distribution itself, are considered. All these problems are posed in a sufficiently general framework to cover continuous and discrete, univariate and multivariate, parametric and non-parametric cases, without the need to use a specific prior distribution. The loss functions considered come naturally from the quadratic error loss function comonly used in estimating a real function of the unknown parameter. The cornerstone of the mentioned Bayes estimators is the posterior predictive distribution. Some examples are provided to illustrate these results.

For the class of Gauss-Markov processes we study the problem of asymptotic equivalence of the nonparametric regression model with errors given by the increments of the process and the continuous time model, where a whole path of a sum of a deterministic signal and the Gauss-Markov process can be observed. In particular we provide sufficient conditions such that asymptotic equivalence of the two models holds for functions from a given class, and we verify these for the special cases of Sobolev ellipsoids and H\"older classes with smoothness index $> 1/2$ under mild assumptions on the Gauss-Markov process at hand. To derive these results, we develop an explicit characterization of the reproducing kernel Hilbert space associated with the Gauss-Markov process, that hinges on a characterization of such processes by a property of the corresponding covariance kernel introduced by Doob. In order to demonstrate that the given assumptions on the Gauss-Markov process are in some sense sharp we also show that asymptotic equivalence fails to hold for the special case of Brownian bridge. Our findings demonstrate that the well-known asymptotic equivalence of the Gaussian white noise model and the nonparametric regression model with i.i.d. standard normal errors can be extended to a result treating general Gauss-Markov noises in a unified manner.

Compared to the nominal scale, the ordinal scale for a categorical outcome variable has the property of making a monotonicity assumption for the covariate effects meaningful. This assumption is encoded in the commonly used proportional odds model, but there it is combined with other parametric assumptions such as linearity and additivity. Herein, the considered models are non-parametric and the only condition imposed is that the effects of the covariates on the outcome categories are stochastically monotone according to the ordinal scale. We are not aware of the existence of other comparable multivariable models that would be suitable for inference purposes. We generalize our previously proposed Bayesian monotonic multivariable regression model to ordinal outcomes, and propose an estimation procedure based on reversible jump Markov chain Monte Carlo. The model is based on a marked point process construction, which allows it to approximate arbitrary monotonic regression function shapes, and has a built-in covariate selection property. We study the performance of the proposed approach through extensive simulation studies, and demonstrate its practical application in two real data examples.

We provide (high probability) bounds on the condition number of random feature matrices. In particular, we show that if the complexity ratio $\frac{N}{m}$ where $N$ is the number of neurons and $m$ is the number of data samples scales like $\log^{-3}(N)$ or $\log^{3}(m)$, then the random feature matrix is well-conditioned. This result holds without the need of regularization and relies on establishing a bound on the restricted isometry constant of the random feature matrix. In addition, we prove that the risk associated with regression problems using a random feature matrix exhibits the double descent phenomenon and that this is an effect of the double descent behavior of the condition number. The risk bounds include the underparameterized setting using the least squares problem and the overparameterized setting where using either the minimum norm interpolation problem or a sparse regression problem. For the least squares or sparse regression cases, we show that the risk decreases as $m$ and $N$ increase, even in the presence of bounded or random noise. The risk bound matches the optimal scaling in the literature and the constants in our results are explicit and independent of the dimension of the data.

Performing causal inference in observational studies requires we assume confounding variables are correctly adjusted for. G-computation methods are often used in these scenarios, with several recent proposals using Bayesian versions of g-computation. In settings with few confounders, standard models can be employed, however as the number of confounders increase these models become less feasible as there are fewer observations available for each unique combination of confounding variables. In this paper we propose a new model for estimating treatment effects in observational studies that incorporates both parametric and nonparametric outcome models. By conceptually splitting the data, we can combine these models while maintaining a conjugate framework, allowing us to avoid the use of MCMC methods. Approximations using the central limit theorem and random sampling allows our method to be scaled to high dimensional confounders while maintaining computational efficiency. We illustrate the model using carefully constructed simulation studies, as well as compare the computational costs to other benchmark models.

Discrete random structures are important tools in Bayesian nonparametrics and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and, then, normalising to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop an Markov Chain Monte Carlo sampler for Bayesian inferences. A test for distributional homogeneity across groups is obtained as a by product. The results and their inferential implications are showcased on synthetic and real data.

Benjamin Christoffersen,Mark Clements,Hedvig Kjellström,Keith Humphreys
0+阅读 · 10月27日
Thomas Muschinski,Georg J. Mayr,Thorsten Simon,Nikolaus Umlauf,Achim Zeileis
0+阅读 · 10月27日
Christina Lee Yu
0+阅读 · 10月26日
Siddhartha Chib,Minchul Shin,Anna Simoni
0+阅读 · 10月26日
Olli Saarela,Christian Rohrbeck,Elja Arjas
0+阅读 · 10月22日
Federico Camerlenghi,David B. Dunson,Antonio Lijoi,Igor Prünster,Abel Rodríguez
4+阅读 · 2018年1月15日

CreateAMind
12+阅读 · 2019年5月22日

12+阅读 · 2019年3月1日
CreateAMind
20+阅读 · 2019年1月4日

3+阅读 · 2018年5月17日

31+阅读 · 2017年12月10日

24+阅读 · 2017年11月16日
Call4Papers
3+阅读 · 2017年10月13日

22+阅读 · 2017年9月8日

3+阅读 · 2017年8月20日
CreateAMind
5+阅读 · 2017年8月4日
Top