Signal maps are essential for the planning and operation of cellular networks. However, the measurements needed to create such maps are expensive and often biased, do not always reflect the metrics of interest, and pose privacy risks. In this paper, we develop a unified framework for predicting cellular signal maps from limited measurements. Our framework builds on a state-of-the-art random-forest predictor, but can use any other base predictor. We propose and combine three mechanisms that address the fact that not all measurements are equally important for a particular prediction task. First, we design quality-of-service functions ($Q$) that cover signal strength (RSRP) as well as other metrics of interest to operators, namely coverage and call drop probability. By implicitly altering the loss function used in learning, quality functions can also improve prediction of RSRP itself where it matters (e.g., an MSE reduction of up to 27% in the low signal strength regime, where errors are critical). Second, we introduce weight functions ($W$) to specify the relative importance of prediction at different locations and in other parts of the feature space. We propose re-weighting based on importance sampling to obtain unbiased estimators when the sampling and target distributions differ. This yields improvements of up to 20% for targets based on a spatially uniform loss or on losses weighted by user population density. Third, we apply the Data Shapley framework for the first time in this context, assigning values ($\phi$) to individual measurement points that capture the importance of their contribution to the prediction task. Removing points with negative values improves prediction (e.g., from 64% to 94% in recall for coverage loss) and can also enable data minimization. We evaluate our methods on several real-world datasets and demonstrate significant improvement in prediction performance.
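To make the importance-sampling re-weighting idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy setting with two hypothetical cells ("urban" and "rural"), made-up prediction errors, a spatially uniform target distribution, and a sampling distribution estimated from empirical cell frequencies. The function names `importance_weights` and `reweighted_mse` are illustrative only.

```python
from collections import Counter


def importance_weights(regions, target_prob):
    """Return w_i = q(region_i) / p_hat(region_i).

    q is the target distribution over regions; p_hat is the empirical
    sampling frequency of each region (an assumption of this sketch).
    """
    n = len(regions)
    counts = Counter(regions)
    return [target_prob[r] / (counts[r] / n) for r in regions]


def reweighted_mse(errors, weights):
    """Importance-weighted mean squared error: (1/n) * sum_i w_i * e_i^2."""
    return sum(w * e * e for w, e in zip(weights, errors)) / len(errors)


# Hypothetical toy data: 8 measurements in a densely sampled cell,
# 2 measurements in a sparsely sampled one.
regions = ["urban"] * 8 + ["rural"] * 2
errors = [1.0] * 8 + [3.0] * 2  # made-up prediction errors in dB
target = {"urban": 0.5, "rural": 0.5}  # spatially uniform target

weights = importance_weights(regions, target)
naive_mse = sum(e * e for e in errors) / len(errors)
target_mse = reweighted_mse(errors, weights)

print(f"naive MSE under the sampling distribution: {naive_mse:.2f}")
print(f"re-weighted MSE under the target distribution: {target_mse:.2f}")
```

In this toy case the naive average is dominated by the oversampled cell, while the re-weighted estimate matches the error one would see if both cells counted equally. The same per-sample weights could in principle be passed to a base predictor that accepts sample weights during training, though how the paper integrates them is not specified here.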