We consider the problem of embedding a subset of $\mathbb{R}^n$ into a low-dimensional Hamming cube in an almost isometric way. We construct a simple and computationally efficient map that achieves this task with high probability: we first apply a specific structured random matrix, which we call the double circulant matrix; using that matrix requires little storage and matrix-vector multiplication can be performed in near-linear time. We then binarize each vector by comparing each of its entries to a random threshold, selected uniformly at random from a well-chosen interval. We estimate the number of bits required for this encoding scheme in terms of two natural geometric complexity parameters of the set -- its Euclidean covering numbers and its localized Gaussian complexity. The estimate we derive turns out to be the best that one can hope for -- up to logarithmic terms. The key to the proof is a phenomenon of independent interest: we show that the double circulant matrix mimics the behavior of a Gaussian matrix in two important ways. First, it yields an almost isometric embedding of any subset of $\ell_2^n$ into $\ell_1^m$ and, second, it maps an arbitrary set in $\mathbb{R}^n$ into a set of well-spread vectors.
翻译:我们考虑将一个子集的 $mathbb{R ⁇ {R ⁇ n$ 嵌入一个低维Hamming 立方体的问题。 我们建造了一个简单且计算高效的地图, 以极有可能的方式完成这项任务: 我们首先应用一个特定的结构随机矩阵, 我们称之为双螺旋矩阵矩阵; 使用该矩阵, 只需要在近线时间进行少量存储和矩阵- 矢量倍增。 然后, 我们通过将每个矢量的条目与随机阈值进行比较, 将每个矢量分解成一个随机的临界值, 从选自一个精密的间隔点中随机选取出。 我们估计这个编码方案需要的位数是两个自然几何复杂参数 -- -- 包括数字和本地化的高频复杂度。 我们得出的估计结果是人们所期望的最好的 -- 最高为对数术语。 证明的关键是独立的兴趣现象: 我们显示, 双光量基矩阵以两种重要的方式模拟高斯矩阵的行为。 首先, 它产生一个几乎是美元 美元 的正位数 。