Personalized Exposure Control Using Adaptive Metering and Reinforcement Learning

We propose a reinforcement learning approach to real-time exposure control for mobile cameras that can be personalized. Our approach is based on a Markov Decision Process (MDP). In the camera viewfinder or live preview mode, given the current frame, our system predicts the change in exposure so as to optimize the trade-off among image quality, fast convergence, and minimal temporal oscillation. We model the exposure prediction function as a fully convolutional neural network that can be trained end to end with a Gaussian policy gradient. As a result, our system can associate scene semantics with exposure values; it can also be extended to personalize the exposure adjustments for a user and device. We improve the learning performance by incorporating an adaptive metering module that links semantics with exposure. This adaptive metering module generalizes the conventional spot or matrix metering techniques. We validate our system using the MIT FiveK dataset and our own datasets captured using an iPhone 7 and a Google Pixel. Experimental results show that our system exhibits stable real-time behavior while improving visual quality compared to the native camera control.
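To make the Gaussian policy gradient idea concrete, the following is a minimal sketch on a toy problem, not the system described above. It assumes a one-dimensional state `s` (the exposure error in stops), a policy mean `theta * s` standing in for the fully convolutional network, a fixed standard deviation `sigma`, and a hypothetical reward `-(s - a)**2` that only penalizes residual error. The function name, the reward, and all hyperparameters are illustrative assumptions.

```python
import random


def train_gaussian_policy(num_updates=300, batch_size=64, lr=0.02,
                          sigma=0.5, seed=0):
    """REINFORCE with a Gaussian policy on a toy exposure problem.

    Assumptions (not from the abstract): scalar state, linear policy mean,
    fixed sigma, and a quadratic reward. The toy optimum is theta = 1.
    """
    rng = random.Random(seed)
    theta = 0.0  # policy mean is mu(s) = theta * s
    for _ in range(num_updates):
        samples = []
        for _ in range(batch_size):
            s = rng.uniform(-2.0, 2.0)         # toy state: exposure error in stops
            mu = theta * s
            a = rng.gauss(mu, sigma)           # sampled exposure change
            reward = -(s - a) ** 2             # toy reward: penalize residual error
            # Score function: d/dtheta log N(a; theta * s, sigma^2)
            score = (a - mu) / sigma ** 2 * s
            samples.append((reward, score))
        # Batch-mean baseline to reduce the variance of the gradient estimate
        baseline = sum(r for r, _ in samples) / batch_size
        grad = sum((r - baseline) * g for r, g in samples) / batch_size
        theta += lr * grad                     # gradient ascent on expected reward
    return theta


if __name__ == "__main__":
    print(train_gaussian_policy())
```

In this toy setting the learned `theta` should approach 1, meaning the policy learns to cancel the exposure error in a single step. A reward that also accounts for convergence speed and temporal oscillation, as the abstract describes, would replace the quadratic term here.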

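One way to read the claim that adaptive metering generalizes spot or matrix metering is as a weighted average of per-region luminance, where the weight map is fixed for conventional modes and predicted from the scene in the adaptive case. The sketch below illustrates that reading only; the grid representation, the function names, and the use of a uniform average as a stand-in for matrix-style metering are assumptions rather than details taken from the abstract.

```python
from typing import List

Grid = List[List[float]]


def normalize(weights: Grid) -> Grid:
    """Scale a non-negative weight map so that its entries sum to 1."""
    total = sum(sum(row) for row in weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [[w / total for w in row] for row in weights]


def metered_luminance(lum: Grid, weights: Grid) -> float:
    """Weighted average luminance over a grid of image regions."""
    w = normalize(weights)
    return sum(w[i][j] * lum[i][j]
               for i in range(len(lum))
               for j in range(len(lum[0])))


def uniform_weights(rows: int, cols: int) -> Grid:
    """Every region counts equally (simplified stand-in for matrix metering)."""
    return [[1.0] * cols for _ in range(rows)]


def spot_weights(rows: int, cols: int, row: int, col: int) -> Grid:
    """All weight on a single region (spot metering)."""
    grid = [[0.0] * cols for _ in range(rows)]
    grid[row][col] = 1.0
    return grid


if __name__ == "__main__":
    lum = [[0.1, 0.2, 0.3],
           [0.4, 0.9, 0.6],
           [0.7, 0.8, 0.5]]
    print(metered_luminance(lum, uniform_weights(3, 3)))
    print(metered_luminance(lum, spot_weights(3, 3, 1, 1)))
```

Under this reading, an adaptive module would supply any other non-negative weight map, for example one that emphasizes regions judged semantically important, and the two conventional modes fall out as special cases.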