We propose an approach to compute inner and outer-approximations of the sets of values satisfying constraints expressed as arbitrarily quantified formulas. Such formulas arise for instance when specifying important problems in control such as robustness, motion planning or controllers comparison. We propose an interval-based method which allows for tractable but tight approximations. We demonstrate its applicability through a series of examples and benchmarks using a prototype implementation.
暂无翻译
Cloud computing and virtualization solutions allow one to rent the virtual machines (VMs) needed to run applications on a pay-per-use basis, but rented VMs do not offer any guarantee on their performance. Cloud platforms are known to be affected by performance variability, but a better understanding is still required. This paper moves in that direction and presents an in-depth, multi-faceted study on the performance variability of VMs. Unlike previous studies, our assessment covers a wide range of factors: 16 VM types from 4 well-known cloud providers, 10 benchmarks, and 28 different metrics. We present four new contributions. First, we introduce a new benchmark suite (VMBS) that let researchers and practitioners systematically collect a diverse set of performance data. Second, we present a new indicator, called Variability Indicator, that allows for measuring variability in the performance of VMs. Third, we illustrate an analysis of the collected data across four different dimensions: resources, isolation, time, and cost. Fourth, we present multiple predictive models based on Machine Learning that aim to forecast future performance and detect time patterns. Our experiments provide important insights on the resource variability of VMs, highlighting differences and similarities between various cloud providers. To the best of our knowledge, this is the widest analysis ever conducted on the topic.
暂无翻译
ASR systems have become increasingly widespread in recent years. However, their textual outputs often require post-processing tasks before they can be practically utilized. To address this issue, we draw inspiration from the multifaceted capabilities of LLMs and Whisper, and focus on integrating multiple ASR text processing tasks related to speech recognition into the ASR model. This integration not only shortens the multi-stage pipeline, but also prevents the propagation of cascading errors, resulting in direct generation of post-processed text. In this study, we focus on ASR-related processing tasks, including Contextual ASR and multiple ASR post processing tasks. To achieve this objective, we introduce the CPPF model, which offers a versatile and highly effective alternative to ASR processing. CPPF seamlessly integrates these tasks without any significant loss in recognition performance.
暂无翻译
Generating multi-instrument music from symbolic music representations is an important task in Music Information Retrieval (MIR). A central but still largely unsolved problem in this context is musically and acoustically informed control in the generation process. As the main contribution of this work, we propose enhancing control of multi-instrument synthesis by conditioning a generative model on a specific performance and recording environment, thus allowing for better guidance of timbre and style. Building on state-of-the-art diffusion-based music generative models, we introduce performance conditioning - a simple tool indicating the generative model to synthesize music with style and timbre of specific instruments taken from specific performances. Our prototype is evaluated using uncurated performances with diverse instrumentation and achieves state-of-the-art FAD realism scores while allowing novel timbre and style control. Our project page, including samples and demonstrations, is available at benadar293.github.io/midipm
暂无翻译
In this era of large language models (LLMs), the traditional training of models has become increasingly unimaginable for regular users and institutions. The exploration of efficient fine-tuning for high-resource languages on these models is an undeniable trend that is gradually gaining popularity. However, there has been very little exploration for various low-resource languages, such as Tibetan. Research in Tibetan NLP is inherently scarce and limited. While there is currently no existing large language model for Tibetan due to its low-resource nature, that day will undoubtedly arrive. Therefore, research on efficient fine-tuning for low-resource language models like Tibetan is highly necessary. Our research can serve as a reference to fill this crucial gap. Efficient fine-tuning strategies for pre-trained language models (PLMs) in Tibetan have seen minimal exploration. We conducted three types of efficient fine-tuning experiments on the publicly available TNCC-title dataset: "prompt-tuning," "Adapter lightweight fine-tuning," and "prompt-tuning + Adapter fine-tuning." The experimental results demonstrate significant improvements using these methods, providing valuable insights for advancing Tibetan language applications in the context of pre-trained models.
暂无翻译
Efficient routing algorithms based on vehicular ad hoc networks (VANETs) play an important role in emerging intelligent transportation systems. This highly dynamic topology faces a number of wireless communication service challenges. In this paper, we propose a protocol based on reinforcement learning and vehicle node clustering, the protocol is called Qucts, solve vehicle-to-fixed-destination or V2V messaging problems. Improve message delivery rates with minimal hops and latency, link stability is also taken into account. The agreement is divided into three levels, first cluster the vehicles, each cluster head broadcasts its own coordinates and speed, to get more cluster members. Also when a cluster member receives another cluster head broadcast message, the cluster head generates a list of surrounding clusters, find the best cluster to the destination as the next cluster during message passing. Second, the protocol constructs a Q-value table based on the state after clustering, used to participate in the selection of messaging clusters. Finally, we introduce parameters that express the stability of the vehicle within the cluster, for communication node selection. This protocol hierarchy makes Qucts an offline and online solution. In order to distinguish unstable nodes within a cluster, Coding of each road, will have vehicles with planned routes, For example, car hailing and public bus. Compare the overlap with other planned paths vehicles in the cluster, low overlap is labeled as unstable nodes. Vehicle path overlap rate without a planned path is set to the mean value. Comparing Qucts with existing routing protocols through simulation, Our proposed Qucts scheme provides large improvements in both data delivery rate and end-to-end delay reduction.
暂无翻译
In this paper, we study the effect of popularity degradation bias in the context of local music recommendations. Specifically, we examine how accurate two top-performing recommendation algorithms, Weight Relevance Matrix Factorization (WRMF) and Multinomial Variational Autoencoder (Mult-VAE), are at recommending artists as a function of artist popularity. We find that both algorithms improve recommendation performance for more popular artists and, as such, exhibit popularity degradation bias. While both algorithms produce a similar level of performance for more popular artists, Mult-VAE shows better relative performance for less popular artists. This suggests that this algorithm should be preferred for local (long-tail) music artist recommendation.
暂无翻译
Sequential change detection is a classical problem with a variety of applications. However, the majority of prior work has been parametric, for example, focusing on exponential families. We develop a fundamentally new and general framework for sequential change detection when the pre- and post-change distributions are nonparametrically specified (and thus composite). Our procedures come with clean, nonasymptotic bounds on the average run length (frequency of false alarms). In certain nonparametric cases (like sub-Gaussian or sub-exponential), we also provide near-optimal bounds on the detection delay following a changepoint. The primary technical tool that we introduce is called an \emph{e-detector}, which is composed of sums of e-processes -- a fundamental generalization of nonnegative supermartingales -- that are started at consecutive times. We first introduce simple Shiryaev-Roberts and CUSUM-style e-detectors, and then show how to design their mixtures in order to achieve both statistical and computational efficiency. Our e-detector framework can be instantiated to recover classical likelihood-based procedures for parametric problems, as well as yielding the first change detection method for many nonparametric problems. As a running example, we tackle the problem of detecting changes in the mean of a bounded random variable without i.i.d. assumptions, with an application to tracking the performance of a basketball team over multiple seasons.
暂无翻译
Long-Term Person Re-Identification (LT-ReID) has become increasingly crucial in computer vision and biometrics. In this work, we aim to extend LT-ReID beyond pedestrian recognition to include a wider range of real-world human activities while still accounting for cloth-changing scenarios over large time gaps. This setting poses additional challenges due to the geometric misalignment and appearance ambiguity caused by the diversity of human pose and clothing. To address these challenges, we propose a new approach 3DInvarReID for (i) disentangling identity from non-identity components (pose, clothing shape, and texture) of 3D clothed humans, and (ii) reconstructing accurate 3D clothed body shapes and learning discriminative features of naked body shapes for person ReID in a joint manner. To better evaluate our study of LT-ReID, we collect a real-world dataset called CCDA, which contains a wide variety of human activities and clothing changes. Experimentally, we show the superior performance of our approach for person ReID.
暂无翻译
Prompt Tuning is emerging as a scalable and cost-effective method to fine-tune Pretrained Language Models (PLMs). This study benchmarks the performance and computational efficiency of Prompt Tuning and baseline methods on a multi-label text classification task. This is applied to the use case of classifying companies into an investment firm's proprietary industry taxonomy, supporting their thematic investment strategy. Text-to-text classification with PLMs is frequently reported to outperform classification with a classification head, but has several limitations when applied to a multi-label classification problem where each label consists of multiple tokens: (a) Generated labels may not match any label in the industry taxonomy; (b) During fine-tuning, multiple labels must be provided in an arbitrary order; (c) The model provides a binary decision for each label, rather than an appropriate confidence score. Limitation (a) is addressed by applying constrained decoding using Trie Search, which slightly improves classification performance. All limitations (a), (b), and (c) are addressed by replacing the PLM's language head with a classification head. This improves performance significantly, while also reducing computational costs during inference. The results indicate the continuing need to adapt state-of-the-art methods to domain-specific tasks, even in the era of PLMs with strong generalization abilities.
暂无翻译