Network meta-analysis (NMA) of rare events has attracted little attention in the literature. Until recently, networks of interventions with rare events were analyzed using the inverse-variance NMA approach. However, when events are rare, the normal approximation made by this model can be poor and effect estimates are potentially biased. Other methods for the synthesis of such data are the recent extension of the Mantel-Haenszel approach to NMA and the use of the non-central hypergeometric distribution. In this article, we suggest a new common-effect NMA approach that can be applied even in networks of interventions with extremely low or even zero numbers of events, without requiring study exclusion or arbitrary imputations. Our method applies the penalized likelihood proposed by Firth for bias reduction of the maximum likelihood estimate to the logistic expression of the NMA model. A limitation of our method is that heterogeneity cannot be taken into account as an additive parameter, as in most meta-analytical models. However, we account for heterogeneity by incorporating a multiplicative overdispersion term using a two-stage approach. We show through simulations that our method performs consistently well across all tested scenarios and most often results in smaller bias than other available methods. We also illustrate the use of our method through two clinical examples. We conclude that our "penalized likelihood NMA" approach is promising for the analysis of binary outcomes with rare events, especially for networks with very few studies per comparison and very low control group risks.
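
For concreteness, the bias-reducing penalty the article builds on is Firth's: the log-likelihood is augmented with half the log-determinant of the Fisher information, which for logistic models corresponds to a Jeffreys-prior penalty and keeps estimates finite even when some treatment arms have zero events. A generic sketch of the penalized objective and modified score equations (not the paper's exact NMA parameterization):

```latex
\ell^{*}(\beta) \;=\; \ell(\beta) \;+\; \tfrac{1}{2}\log\det I(\beta),
\qquad
U^{*}(\beta_r) \;=\; U(\beta_r) \;+\; \tfrac{1}{2}\,
\mathrm{tr}\!\left\{ I(\beta)^{-1}\,\frac{\partial I(\beta)}{\partial \beta_r} \right\} \;=\; 0,
```

where \(\ell\) is the binomial (logistic) log-likelihood of the NMA model, \(I(\beta)\) the Fisher information, and \(U\) the score.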

Related content

Networking: IFIP International Conferences on Networking. Explanation: international conference series on networking. Publisher: IFIP. Site: http://dblp.uni-trier.de/db/conf/networking/index.html

In recent years, local differential privacy (LDP) has emerged as a technique of choice for privacy-preserving data collection in scenarios where the aggregator is not trustworthy. LDP provides client-side privacy by adding noise at the user's end, so clients need not rely on the trustworthiness of the aggregator. In this work, we provide a noise-aware probabilistic modeling framework that allows Bayesian inference to take into account the noise added for privacy under LDP, conditioned on locally perturbed observations. The stronger privacy protection that LDP protocols provide compared to the central model comes at the cost of a much harsher privacy-utility trade-off. Our framework tackles several computational and statistical challenges posed by LDP for accurate uncertainty quantification in Bayesian settings. We demonstrate the efficacy of our framework in parameter estimation for univariate and multivariate distributions as well as logistic and linear regression.
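
A minimal sketch of what "noise-aware" inference means here, assuming a toy model in which each client holds x_i ~ N(theta, sigma) and releases z_i perturbed by the Laplace mechanism; the grid posterior and all names below are illustrative simplifications, not the paper's framework.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
theta_true, sigma, epsilon, n = 0.6, 0.1, 1.0, 500
b = 1.0 / epsilon                           # Laplace scale for unit sensitivity
x = rng.normal(theta_true, sigma, n)        # private values, never leave the clients
z = x + rng.laplace(0.0, b, n)              # locally perturbed values sent to the server

# Server side: the likelihood of each z_i integrates out the unobserved x_i,
#   p(z | theta) = integral of N(x; theta, sigma) * Laplace(z - x; b) dx,
# evaluated by numerical integration over a grid of latent values.
theta_grid = np.linspace(0.0, 1.2, 121)
x_grid = np.linspace(-1.0, 2.0, 601)
dx = x_grid[1] - x_grid[0]
lap = stats.laplace.pdf(z[:, None] - x_grid[None, :], scale=b)    # shape (n, nx)
log_post = np.empty_like(theta_grid)
for j, th in enumerate(theta_grid):
    lik = (lap * stats.norm.pdf(x_grid, th, sigma)).sum(axis=1) * dx
    log_post[j] = np.sum(np.log(lik + 1e-300))                    # flat prior on theta
post = np.exp(log_post - log_post.max())
dth = theta_grid[1] - theta_grid[0]
post /= post.sum() * dth
print("noise-aware posterior mean:", (theta_grid * post).sum() * dth)
print("naive mean of perturbed z: ", z.mean())
```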

We study regression discontinuity designs in which many covariates, possibly many more than the number of observations, are available. We provide a two-step algorithm that first selects the set of covariates to be used through a localized Lasso-type procedure and then, in a second step, estimates the treatment effect by including the selected covariates in the usual local linear estimator. We provide an in-depth analysis of the algorithm's theoretical properties, showing that, under an approximate sparsity condition, the resulting estimator is asymptotically normal, with asymptotic bias and variance that are conceptually similar to those obtained in low-dimensional settings. Bandwidth selection and inference can be carried out using standard methods. We also provide simulations and an empirical application.
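
A sketch of the two-step idea on simulated data. A sharp design with cutoff 0 is assumed; the triangular kernel, fixed bandwidth, residualization step, and all variable names are illustrative simplifications of the paper's localized Lasso procedure.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(1)
n, p = 1000, 200
r = rng.uniform(-1, 1, n)                        # running variable
d = (r >= 0).astype(float)                       # treatment indicator
X = rng.normal(size=(n, p))                      # high-dimensional covariates
y = 0.5 * d + 0.3 * r + X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)

h = 0.5                                          # bandwidth around the cutoff
w = np.clip(1 - np.abs(r) / h, 0, None)          # triangular kernel weights
local = w > 0

# Step 1: localized Lasso to select covariates (outcome residualized on the basic
# RDD terms so that selection focuses on the covariates).
base = np.column_stack([d, r, d * r])
resid = y - LinearRegression().fit(base[local], y[local], sample_weight=w[local]).predict(base)
lasso = LassoCV(cv=5).fit(X[local], resid[local])
selected = np.flatnonzero(np.abs(lasso.coef_) > 1e-8)

# Step 2: weighted local linear regression including the selected covariates.
Z = np.column_stack([np.ones(n), d, r, d * r, X[:, selected]])
sw = np.sqrt(w[local])
beta, *_ = np.linalg.lstsq(Z[local] * sw[:, None], y[local] * sw, rcond=None)
print("selected covariates:", selected)
print("treatment effect estimate at the cutoff:", beta[1])
```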

In this paper several related estimation problems are addressed from a Bayesian point of view and optimal estimators are obtained for each of them when some natural loss functions are considered. Namely, we are interested in estimating a regression curve. Simultaneously, the estimation problems of a conditional distribution function, or a conditional density, or even the conditional distribution itself, are considered. All these problems are posed in a sufficiently general framework to cover continuous and discrete, univariate and multivariate, parametric and non-parametric cases, without the need to use a specific prior distribution. The loss functions considered come naturally from the quadratic error loss function commonly used in estimating a real function of the unknown parameter. The cornerstone of the mentioned Bayes estimators is the posterior predictive distribution. Some examples are provided to illustrate these results.
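
For the regression-curve case, the connection to the posterior predictive distribution can be made explicit with the standard result (stated generically, not in the paper's full generality): under quadratic loss, the Bayes estimate of the regression curve at x is the mean of the posterior predictive distribution,

```latex
\hat m(x)
\;=\; \mathbb{E}\bigl[\, \mathbb{E}_\theta[\,Y \mid x\,] \;\bigm|\; \text{data} \,\bigr]
\;=\; \int y \, f(y \mid x, \text{data}) \, dy,
\qquad
f(y \mid x, \text{data}) \;=\; \int f(y \mid x, \theta)\, \pi(\theta \mid \text{data})\, d\theta .
```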

Bayesian optimization (BO) is a powerful approach for optimizing black-box, expensive-to-evaluate functions. To enable a flexible trade-off between cost and accuracy, many applications allow the function to be evaluated at different fidelities. In order to reduce the optimization cost while maximizing the benefit-cost ratio, in this paper we propose Batch Multi-fidelity Bayesian Optimization with Deep Auto-Regressive Networks (BMBO-DARN). We use a set of Bayesian neural networks to construct a fully auto-regressive model, which is expressive enough to capture strong yet complex relationships across all the fidelities, so as to improve the surrogate learning and optimization performance. Furthermore, to enhance the quality and diversity of queries, we develop a simple yet efficient batch querying method, without any combinatorial search over the fidelities. We propose a batch acquisition function based on the Max-value Entropy Search (MES) principle, which penalizes highly correlated queries and encourages diversity. We use posterior samples and moment matching to enable efficient computation of the acquisition function and conduct alternating optimization over every fidelity-input pair, which guarantees an improvement at each step. We demonstrate the advantage of our approach on four real-world hyperparameter optimization applications.
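
A minimal sketch of what "auto-regressive across fidelities" means for the surrogate, assuming a toy 1-D problem with two fidelities and Gaussian-process stand-ins. BMBO-DARN instead stacks Bayesian neural networks across all fidelities and adds the batch MES acquisition; everything named below is illustrative, not the paper's implementation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(2)

def f_low(x):   # cheap, biased fidelity
    return np.sin(8 * x)

def f_high(x):  # expensive target fidelity
    return np.sin(8 * x) + 0.3 * x ** 2

x_lo = rng.uniform(0, 1, (40, 1)); y_lo = f_low(x_lo).ravel()
x_hi = rng.uniform(0, 1, (10, 1)); y_hi = f_high(x_hi).ravel()

gp_lo = GaussianProcessRegressor(normalize_y=True).fit(x_lo, y_lo)

# Auto-regressive step: the higher-fidelity model sees the input x together with
# the lower-fidelity model's prediction at x, so cross-fidelity structure is learned.
feat_hi = np.hstack([x_hi, gp_lo.predict(x_hi).reshape(-1, 1)])
gp_hi = GaussianProcessRegressor(normalize_y=True).fit(feat_hi, y_hi)

x_test = np.linspace(0, 1, 5).reshape(-1, 1)
feat_test = np.hstack([x_test, gp_lo.predict(x_test).reshape(-1, 1)])
print("high-fidelity predictions:", gp_hi.predict(feat_test))
print("true high-fidelity values:", f_high(x_test).ravel())
```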

Understanding how treatment effects vary with individual characteristics is critical in the contexts of personalized medicine, personalized advertising and policy design. When the characteristics of practical interest are only a subset of the full covariates, non-parametric estimation is often desirable, but few methods are available due to the computational difficulty. Existing non-parametric methods such as inverse probability weighting have limitations that hinder their use in many practical settings where the values of propensity scores are close to 0 or 1. We propose propensity score regression (PSR), which allows non-parametric estimation of heterogeneous treatment effects in a wide range of settings. PSR performs two non-parametric regressions in turn: it first runs a regression on the propensity scores together with the characteristics of interest to obtain an intermediate estimate, and then regresses the intermediate estimates on the characteristics of interest only. By including propensity scores as regressors in a non-parametric manner, PSR substantially eases the computational difficulty while remaining (locally) insensitive to any value of the propensity scores. We present several appealing properties of PSR, including consistency and asymptotic normality, and in particular the existence of an explicit variance estimator, from which the analytical behaviour of PSR and its precision can be assessed. Simulation studies indicate that PSR outperforms existing methods in varying settings with extreme values of propensity scores. We apply our method to the national 2009 flu survey (NHFS) data to investigate the effects of seasonal influenza vaccination and having paid sick leave across different age groups.
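
A two-stage sketch of the PSR idea on simulated data. The simulated model, the kNN smoothers, and the per-arm fitting and differencing are assumptions made for illustration; the paper's nonparametric regressions and asymptotic theory are more general.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(3)
n = 2000
X = rng.normal(size=(n, 3))                     # full covariates
v = X[:, 0]                                     # characteristic of interest
ps_true = 1 / (1 + np.exp(-(X @ np.array([1.0, -1.0, 0.5]))))
T = rng.binomial(1, ps_true)
y = 2.0 + (1.0 + 0.5 * v) * T + X[:, 1] + rng.normal(size=n)   # effect varies with v

e = LogisticRegression().fit(X, T).predict_proba(X)[:, 1]      # estimated propensity

# Stage 1: regress y on (propensity score, v) within each treatment arm and
# difference the fitted surfaces to get intermediate effect estimates.
def fit_arm(t):
    idx = T == t
    return KNeighborsRegressor(n_neighbors=50).fit(
        np.column_stack([e[idx], v[idx]]), y[idx])

m1, m0 = fit_arm(1), fit_arm(0)
grid = np.column_stack([e, v])
tau_intermediate = m1.predict(grid) - m0.predict(grid)

# Stage 2: regress the intermediate estimates on v only.
psr = KNeighborsRegressor(n_neighbors=100).fit(v.reshape(-1, 1), tau_intermediate)
v_grid = np.linspace(-2, 2, 5).reshape(-1, 1)
print(np.c_[v_grid, psr.predict(v_grid), 1.0 + 0.5 * v_grid])   # estimate vs truth
```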

Discrete data are abundant and often arise as counts or rounded data. However, even for linear regression models, conjugate priors and closed-form posteriors are typically unavailable, thereby necessitating approximations or Markov chain Monte Carlo for posterior inference. For a broad class of count and rounded data regression models, we introduce conjugate priors that enable closed-form posterior inference. Key posterior and predictive functionals are computable analytically or via direct Monte Carlo simulation. Crucially, the predictive distributions are discrete to match the support of the data and can be evaluated or simulated jointly across multiple covariate values. These tools are broadly useful for linear regression, nonlinear models via basis expansions, and model and variable selection. Multiple simulation studies demonstrate significant advantages in computing, predictive modeling, and selection relative to existing alternatives.
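
One standard way such count/rounded-data regression models are built (used here only as a hedged illustration, not necessarily the authors' exact construction) is to round a latent continuous response from a conjugate Gaussian linear model, so that discrete predictive probabilities are differences of the latent posterior predictive CDF, which is available in closed form under conjugacy:

```latex
y = j \;\Longleftrightarrow\; a_j \le z < a_{j+1},
\qquad z = x^\top\beta + \varepsilon,\quad \varepsilon \sim \mathcal{N}(0, \sigma^2),
\\[4pt]
P(y = j \mid x, \text{data}) \;=\; F_z(a_{j+1} \mid x, \text{data}) \;-\; F_z(a_j \mid x, \text{data}),
```

where \(F_z\) is the latent posterior predictive CDF and \(\{a_j\}\) are known rounding thresholds, so predictions are discrete and match the support of the data.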

Longitudinal item response data are common in social science, educational science, and psychology, among other disciplines. Studying the time-varying relationships between items is crucial for educational assessment or designing marketing strategies from survey questions. Although dynamic network models have been widely developed, we cannot apply them directly to item response data because there are multiple systems of nodes with various types of local interactions among items, resulting in multiplex network structures. We propose a new model to study these temporal interactions among items by embedding the functional parameters within the exponential random graph model framework. Inference on such models is difficult because the likelihood functions contain intractable normalizing constants. Furthermore, the number of functional parameters grows exponentially as the number of items increases. Variable selection for such models is not trivial because standard shrinkage approaches do not consider temporal trends in functional parameters. To overcome these challenges, we develop a novel Bayesian approach that combines an auxiliary variable MCMC algorithm with a recently developed functional shrinkage method. We apply our algorithm to survey and review data sets, illustrating that the proposed approach can avoid the evaluation of intractable normalizing constants while detecting significant temporal interactions among items. Through a simulation study under different scenarios, we examine the performance of our algorithm. Our method is, to our knowledge, the first attempt to select functional variables for models with intractable normalizing constants.
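
A standard auxiliary-variable move of the kind referenced here is the exchange algorithm: propose theta', draw an auxiliary network w from the model at theta', and accept with a ratio in which the intractable normalizing constants Z(theta) and Z(theta') cancel (a generic sketch of that well-known algorithm, not necessarily the authors' exact sampler):

```latex
p(y \mid \theta) = \frac{q(y \mid \theta)}{Z(\theta)}, \qquad w \sim p(\cdot \mid \theta'),
\qquad
\alpha \;=\; \min\!\left\{ 1,\;
\frac{ q(y \mid \theta')\, q(w \mid \theta)\, \pi(\theta')\, h(\theta \mid \theta') }
     { q(y \mid \theta)\, q(w \mid \theta')\, \pi(\theta)\, h(\theta' \mid \theta) }
\right\},
```

where \(q\) is the unnormalized ERGM likelihood, \(\pi\) the prior, and \(h\) the proposal; no evaluation of \(Z(\theta)\) or \(Z(\theta')\) is needed.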

Active learning algorithms select a subset of data for annotation to maximize the model performance on a budget. One such algorithm is Expected Gradient Length, which as the name suggests uses the approximate gradient induced per example in the sampling process. While Expected Gradient Length has been successfully used for classification and regression, the formulation for regression remains intuitively driven. Hence, our theoretical contribution involves deriving this formulation, thereby supporting the experimental evidence. Subsequently, we show that expected gradient length in regression is equivalent to Bayesian uncertainty. If certain assumptions are infeasible, our algorithmic contribution (EGL++) approximates the effect of ensembles with a single deterministic network. Instead of computing multiple possible inferences per input, we leverage previously annotated samples to quantify the probability of previous labels being the true label. Such an approach allows us to extend expected gradient length to a new task: human pose estimation. We perform experimental validation on two human pose datasets (MPII and LSP/LSPET), highlighting the interpretability and competitiveness of EGL++ with different active learning algorithms for human pose estimation.
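
For reference, the classification form of Expected Gradient Length scores a candidate x by the label-expectation of the gradient norm that adding (x, y) would induce (the standard definition; the paper's contribution is deriving the analogous regression formulation, where a sum over discrete labels is not available):

```latex
\mathrm{EGL}(x) \;=\; \sum_{y \in \mathcal{Y}} P_\theta(y \mid x)\,
\bigl\lVert \nabla_\theta\, \ell\bigl(\mathcal{L} \cup \{(x, y)\};\, \theta\bigr) \bigr\rVert,
```

where \(\mathcal{L}\) is the current labeled set and \(\ell\) the training objective.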

In this paper, we study the properties of nonparametric least squares regression using deep neural networks. We derive non-asymptotic upper bounds for the prediction error of the empirical risk minimizer for feedforward deep neural regression. Our error bounds achieve the minimax optimal rate and significantly improve over the existing ones in the sense that they depend polynomially on the dimension of the predictor, instead of exponentially on the dimension. We show that the neural regression estimator can circumvent the curse of dimensionality under the assumption that the predictor is supported on an approximate low-dimensional manifold or a set with low Minkowski dimension. These assumptions differ from the structural condition imposed on the target regression function and are weaker and more realistic than the exact low-dimensional manifold support assumption. We investigate how the prediction error of the neural regression estimator depends on the structure of neural networks and propose a notion of network relative efficiency between two types of neural networks, which provides a quantitative measure for evaluating the relative merits of different network structures. To establish these results, we derive a novel approximation error bound for H\"older smooth functions with a positive smoothness index using ReLU-activated neural networks, which may be of independent interest. Our results are derived under weaker assumptions on the data distribution and the neural network structure than those in the existing literature.
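
As a point of orientation for the rates being compared (a standard nonparametric benchmark stated with constants omitted, not a result copied from the paper): the minimax risk for estimating a Hölder-smooth regression function of smoothness \(\beta\) on a d-dimensional domain, versus the analogous rate when the predictor effectively lives on a set of intrinsic dimension \(d_0 \ll d\), scales as

```latex
\inf_{\hat f}\ \sup_{f \in \mathcal{H}^{\beta}}\
\mathbb{E}\,\lVert \hat f - f \rVert_{2}^{2}\ \asymp\ n^{-\frac{2\beta}{2\beta + d}}
\qquad\text{vs.}\qquad
n^{-\frac{2\beta}{2\beta + d_0}}\ \ \text{on a support of intrinsic dimension } d_0 .
```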

Heatmap-based methods dominate the field of human pose estimation by modelling the output distribution through likelihood heatmaps. In contrast, regression-based methods are more efficient but suffer from inferior performance. In this work, we explore maximum likelihood estimation (MLE) to develop an efficient and effective regression-based method. From the perspective of MLE, adopting different regression losses amounts to making different assumptions about the output density function. A density function closer to the true distribution leads to better regression performance. In light of this, we propose a novel regression paradigm with Residual Log-likelihood Estimation (RLE) to capture the underlying output distribution. Concretely, RLE learns the change of the distribution instead of the unreferenced underlying distribution to facilitate the training process. With the proposed reparameterization design, our method is compatible with off-the-shelf flow models. The proposed method is effective, efficient and flexible. We show its potential in various human pose estimation tasks with comprehensive experiments. Compared to the conventional regression paradigm, regression with RLE brings a 12.4 mAP improvement on MSCOCO without any test-time overhead. Moreover, for the first time, our regression method is superior to heatmap-based methods, especially on multi-person pose estimation. Our code is available at https://github.com/Jeff-sjtu/res-loglikelihood-regression
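
The observation that the choice of regression loss encodes a density assumption can be made explicit: minimizing the negative log-likelihood of a fixed-scale Gaussian recovers the L2 loss and a Laplace density recovers the L1 loss, which is the gap RLE fills by learning the residual distribution with a flow model (a generic statement of the MLE view, not the paper's exact objective):

```latex
\mathcal{L}(\theta) = -\log P_\theta(y \mid x); \qquad
P_\theta = \mathcal{N}\bigl(\mu_\theta(x), \sigma^2\bigr) \;\Rightarrow\;
\mathcal{L} \propto \lVert y - \mu_\theta(x) \rVert_2^{2}, \qquad
P_\theta = \mathrm{Laplace}\bigl(\mu_\theta(x), b\bigr) \;\Rightarrow\;
\mathcal{L} \propto \lVert y - \mu_\theta(x) \rVert_1 .
```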

Related papers
Tejas Kulkarni, Joonas Jälkö, Samuel Kaski, Antti Honkela · Oct 27
Alexander Kreiß, Christoph Rothe · Oct 26
Shibo Li, Robert M. Kirby, Shandian Zhe · Oct 25
Peng Wu, ShaSha Han, Xingwei Tong, Runze Li · Oct 25
Daniel R. Kowal · Oct 23
Jaewoo Park, Yeseul Jeon, Minsuk Shin, Minjeong Jeon, Ick Hoon Jin · Oct 22
Yuling Jiao, Guohao Shen, Yuanyuan Lin, Jian Huang · Oct 21
Jiefeng Li, Siyuan Bian, Ailing Zeng, Can Wang, Bo Pang, Wentao Liu, Cewu Lu · Jul 26