Paper Group ANR 565
A Comparative Study of Glottal Source Estimation Techniques
Title | A Comparative Study of Glottal Source Estimation Techniques |
Authors | Thomas Drugman, Baris Bozkurt, Thierry Dutoit |
Abstract | Source-tract decomposition (or glottal flow estimation) is one of the basic problems of speech processing. Several techniques have been proposed for it in the literature. However, studies comparing different approaches are almost nonexistent. Besides, experiments have been systematically performed either on synthetic speech or on sustained vowels. In this study we compare three of the main representative state-of-the-art methods of glottal flow estimation: closed-phase inverse filtering, iterative and adaptive inverse filtering, and mixed-phase decomposition. These techniques are first submitted to an objective assessment test on synthetic speech signals. Their sensitivity to various factors affecting the estimation quality, as well as their robustness to noise, are studied. In a second experiment, their ability to label voice quality (tensed, modal, soft) is studied on a large corpus of real connected speech. It is shown that changes of voice quality are reflected by significant modifications in glottal feature distributions. Techniques based on the mixed-phase decomposition and on a closed-phase inverse filtering process turn out to give the best results on both clean synthetic and real speech signals. On the other hand, iterative and adaptive inverse filtering is recommended in noisy environments for its high robustness. |
Tasks | |
Published | 2019-12-28 |
URL | https://arxiv.org/abs/2001.00840v1 |
https://arxiv.org/pdf/2001.00840v1.pdf | |
PWC | https://paperswithcode.com/paper/a-comparative-study-of-glottal-source |
Repo | |
Framework | |
A Rank-1 Sketch for Matrix Multiplicative Weights
Title | A Rank-1 Sketch for Matrix Multiplicative Weights |
Authors | Yair Carmon, John C. Duchi, Aaron Sidford, Kevin Tian |
Abstract | We show that a simple randomized sketch of the matrix multiplicative weight (MMW) update enjoys (in expectation) the same regret bounds as MMW, up to a small constant factor. Unlike MMW, where every step requires full matrix exponentiation, our steps require only a single product of the form $e^A b$, which the Lanczos method approximates efficiently. Our key technique is to view the sketch as a $\textit{randomized mirror projection}$, and perform mirror descent analysis on the $\textit{expected projection}$. Our sketch solves the online eigenvector problem, improving the best known complexity bounds by $\Omega(\log^5 n)$. We also apply this sketch to semidefinite programming in saddle-point form, yielding a simple primal-dual scheme with guarantees matching the best in the literature. |
Tasks | |
Published | 2019-03-07 |
URL | https://arxiv.org/abs/1903.02675v2 |
https://arxiv.org/pdf/1903.02675v2.pdf | |
PWC | https://paperswithcode.com/paper/a-rank-1-sketch-for-matrix-multiplicative |
Repo | |
Framework | |
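The abstract above says each step needs only a single product of the form $e^A b$. The block below is a minimal sketch of what such a rank-1 sketch could look like. It assumes the sketch takes the form $e^{Y/2} u u^\top e^{Y/2} / (u^\top e^{Y} u)$ for a Gaussian vector $u$ and a symmetric matrix $Y$; that form is not stated in the abstract. A truncated Taylor series stands in for the Lanczos method, and the names `expm_times_vector` and `rank_one_sketch` are hypothetical.

```python
import math
import random


def expm_times_vector(A, b, terms=30):
    """Approximate e^A b with a truncated Taylor series.

    Illustrative stand-in for the Lanczos method mentioned in the abstract;
    only suitable for small matrices with modest norm.
    """
    n = len(b)
    result = list(b)
    term = list(b)
    for k in range(1, terms):
        # After this update, term equals A^k b / k!
        term = [sum(A[i][j] * term[j] for j in range(n)) / k for i in range(n)]
        result = [r + t for r, t in zip(result, term)]
    return result


def rank_one_sketch(Y, rng):
    """Return v v^T / (v^T v) with v = e^{Y/2} u and u standard Gaussian.

    Assumes Y is symmetric, so that v^T v equals u^T e^Y u.
    """
    n = len(Y)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    half = [[0.5 * y for y in row] for row in Y]
    v = expm_times_vector(half, u)
    norm_sq = sum(x * x for x in v)
    return [[v[i] * v[j] / norm_sq for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    Y = [[0.0, 1.0], [1.0, -0.5]]
    X = rank_one_sketch(Y, random.Random(0))
    print("trace of sketch:", X[0][0] + X[1][1])
```

Under these assumptions the result is a rank-1 positive semidefinite matrix with unit trace, computed from one matrix-exponential-vector product instead of a full matrix exponential.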
Link Prediction in the Stochastic Block Model with Outliers
Title | Link Prediction in the Stochastic Block Model with Outliers |
Authors | Solenne Gaucher, Olga Klopp, Geneviève Robin |
Abstract | The Stochastic Block Model is a popular model for network analysis in the presence of community structure. However, in numerous examples, the assumptions underlying this classical model are violated by the behaviour of a small number of outlier nodes such as hubs, nodes with mixed membership profiles, or corrupted nodes. In addition, real-life networks are likely to be incomplete, due to non-response or machine failures. We introduce a new algorithm to estimate the connection probabilities in a network, which is robust to both outlier nodes and missing observations. Under fairly general assumptions, this method detects the outliers, and achieves the best known error for the estimation of connection probabilities with polynomial computation cost. In addition, we prove sub-linear convergence of our algorithm. We provide a simulation study which demonstrates the good behaviour of the method in terms of outlier selection and prediction of the missing links. |
Tasks | Link Prediction |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1911.13122v1 |
https://arxiv.org/pdf/1911.13122v1.pdf | |
PWC | https://paperswithcode.com/paper/link-prediction-in-the-stochastic-block-model |
Repo | |
Framework | |
Glottal Closure and Opening Instant Detection from Speech Signals
Title | Glottal Closure and Opening Instant Detection from Speech Signals |
Authors | Thomas Drugman, Thierry Dutoit |
Abstract | This paper proposes a new procedure to detect Glottal Closure and Opening Instants (GCIs and GOIs) directly from speech waveforms. The procedure is divided into two successive steps. First a mean-based signal is computed, and intervals where speech events are expected to occur are extracted from it. Secondly, at each interval a precise position of the speech event is assigned by locating a discontinuity in the Linear Prediction residual. The proposed method is compared to the DYPSA algorithm on the CMU ARCTIC database. A significant improvement as well as a better noise robustness are reported. Besides, results of GOI identification accuracy are promising for the glottal source characterization. |
Tasks | |
Published | 2019-12-28 |
URL | https://arxiv.org/abs/2001.00841v1 |
https://arxiv.org/pdf/2001.00841v1.pdf | |
PWC | https://paperswithcode.com/paper/glottal-closure-and-opening-instant-detection |
Repo | |
Framework | |
Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff
Title | Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff |
Authors | Yochai Blau, Tomer Michaeli |
Abstract | Lossy compression algorithms are typically designed and analyzed through the lens of Shannon’s rate-distortion theory, where the goal is to achieve the lowest possible distortion (e.g., low MSE or high SSIM) at any given bit rate. However, in recent years, it has become increasingly accepted that “low distortion” is not a synonym for “high perceptual quality”, and in fact optimization of one often comes at the expense of the other. In light of this understanding, it is natural to seek a generalization of rate-distortion theory which takes perceptual quality into account. In this paper, we adopt the mathematical definition of perceptual quality recently proposed by Blau & Michaeli (2018), and use it to study the three-way tradeoff between rate, distortion, and perception. We show that restricting the perceptual quality to be high generally leads to an elevation of the rate-distortion curve, thus necessitating a sacrifice in either rate or distortion. We prove several fundamental properties of this triple-tradeoff, calculate it in closed form for a Bernoulli source, and illustrate it visually on a toy MNIST example. |
Tasks | |
Published | 2019-01-23 |
URL | https://arxiv.org/abs/1901.07821v4 |
https://arxiv.org/pdf/1901.07821v4.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-lossy-compression-the-rate |
Repo | |
Framework | |
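The three-way tradeoff described in the abstract above can be written as a constrained optimization problem. The formulation below is a sketch, and its notation is an assumption not taken from the abstract: $X$ is the source, $\hat{X}$ the reconstruction, $\Delta$ a distortion measure, and $d$ a divergence between distributions that serves as the perceptual quality index.

```latex
R(D, P) \;=\; \min_{p_{\hat{X} \mid X}} \; I(X; \hat{X})
\quad \text{subject to} \quad
\mathbb{E}\big[\Delta(X, \hat{X})\big] \le D,
\qquad
d\big(p_X, p_{\hat{X}}\big) \le P .
```

Dropping the perception constraint (letting $P \to \infty$) gives back the classical rate-distortion function, which is consistent with the abstract's statement that requiring high perceptual quality elevates the rate-distortion curve.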
Prioritized Unit Propagation with Periodic Resetting is (Almost) All You Need for Random SAT Solving
Title | Prioritized Unit Propagation with Periodic Resetting is (Almost) All You Need for Random SAT Solving |
Authors | Xujie Si, Yujia Li, Vinod Nair, Felix Gimeno |
Abstract | We propose prioritized unit propagation with periodic resetting, which is a simple but surprisingly effective algorithm for solving random SAT instances that are meant to be hard. In particular, an evaluation on the Random Track of the 2017 and 2018 SAT competitions shows that a basic prototype of this simple idea already ranks at second place in both years. We share this observation in the hope that it helps the SAT community better understand the hardness of random instances used in competitions and inspire other interesting ideas on SAT solving. |
Tasks | |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.05906v1 |
https://arxiv.org/pdf/1912.05906v1.pdf | |
PWC | https://paperswithcode.com/paper/prioritized-unit-propagation-with-periodic |
Repo | |
Framework | |
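For readers unfamiliar with the building block named in the title above, the block below sketches plain unit propagation on a CNF formula. It is not the authors' prioritized variant and does no periodic resetting, since the abstract does not describe either. Clauses are assumed to be lists of nonzero integers in the DIMACS convention (`-3` means "not x3"), and the function name `unit_propagate` is hypothetical.

```python
def unit_propagate(clauses, assignment=None):
    """Repeatedly assign variables forced by unit clauses.

    Returns (assignment, remaining_clauses), or None if a clause becomes
    empty under the current assignment (a conflict).
    """
    assignment = dict(assignment or {})
    clauses = [list(c) for c in clauses]
    while True:
        simplified = []
        unit = None
        for clause in clauses:
            remaining = []
            satisfied = False
            for lit in clause:
                var = abs(lit)
                if var in assignment:
                    if assignment[var] == (lit > 0):
                        satisfied = True
                        break
                else:
                    remaining.append(lit)
            if satisfied:
                continue  # clause already true, drop it
            if not remaining:
                return None  # every literal is false: conflict
            if len(remaining) == 1 and unit is None:
                unit = remaining[0]  # remember one forced literal
            simplified.append(remaining)
        clauses = simplified
        if unit is None:
            return assignment, clauses
        assignment[abs(unit)] = unit > 0


if __name__ == "__main__":
    print(unit_propagate([[1], [-1, 2], [-2, 3, 4]]))
```

On the example formula, propagation forces x1 and x2 to true and leaves the clause `[3, 4]` undecided; a full solver would then have to branch on a remaining variable.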
Deep CSI Learning for Gait Biometric Sensing and Recognition
Title | Deep CSI Learning for Gait Biometric Sensing and Recognition |
Authors | Kalvik Jakkala, Arupjyoti Bhuya, Zhi Sun, Pu Wang, Zhuo Cheng |
Abstract | Gait is a person’s natural walking style and a complex biological process that is unique to each person. Recently, the channel state information (CSI) of WiFi devices has been exploited to capture human gait biometrics for user identification. However, the performance of existing CSI-based gait identification systems is far from satisfactory. They achieve limited identification accuracy (maximum $93\%$) and only for a very small group of people (i.e., between 2 and 10). To address this challenge, an end-to-end deep CSI learning system is developed, which exploits deep neural networks to automatically learn the salient gait features in CSI data that are discriminative enough to distinguish different people. Firstly, the raw CSI data are sanitized through window-based denoising, mean centering and normalization. The sanitized data are then passed to a residual deep convolutional neural network (DCNN), which automatically extracts the hierarchical features of gait signatures embedded in the CSI data. Finally, a softmax classifier utilizes the extracted features to make the final prediction about the identity of the user. In a typical indoor environment, a top-1 accuracy of $97.12 \pm 1.13\%$ is achieved for a dataset of 30 people. |
Tasks | Denoising, Gait Identification |
Published | 2019-02-06 |
URL | http://arxiv.org/abs/1902.02300v1 |
http://arxiv.org/pdf/1902.02300v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-csi-learning-for-gait-biometric-sensing |
Repo | |
Framework | |
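The sanitization stage named in the abstract above (window-based denoising, mean centering, normalization) can be sketched for a single one-dimensional stream of CSI amplitudes. The details are assumptions, since the abstract gives none: a centered moving average is used as the window-based denoiser, normalization is to unit variance, and the names `moving_average` and `sanitize` are hypothetical.

```python
import math


def moving_average(signal, window=5):
    """Centered moving average; the window is clipped at the edges."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def sanitize(signal, window=5):
    """Denoise, mean-center, and scale a 1-D signal to unit variance."""
    smoothed = moving_average(signal, window)
    mean = sum(smoothed) / len(smoothed)
    centered = [x - mean for x in smoothed]
    std = math.sqrt(sum(x * x for x in centered) / len(centered))
    if std == 0.0:
        return centered  # constant input: nothing to scale
    return [x / std for x in centered]


if __name__ == "__main__":
    raw = [1.0, 1.2, 0.9, 5.0, 1.1, 1.0, 0.8, 1.3]
    print(sanitize(raw, window=3))
```

In a real system the same steps would presumably be applied per subcarrier and per antenna pair before the data reach the network, but that layout is not specified in the abstract.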
Information asymmetry in KL-regularized RL
Title | Information asymmetry in KL-regularized RL |
Authors | Alexandre Galashov, Siddhant M. Jayakumar, Leonard Hasenclever, Dhruva Tirumala, Jonathan Schwarz, Guillaume Desjardins, Wojciech M. Czarnecki, Yee Whye Teh, Razvan Pascanu, Nicolas Heess |
Abstract | Many real world tasks exhibit rich structure that is repeated across different parts of the state space or in time. In this work we study the possibility of leveraging such repeated structure to speed up and regularize learning. We start from the KL regularized expected reward objective which introduces an additional component, a default policy. Instead of relying on a fixed default policy, we learn it from data. But crucially, we restrict the amount of information the default policy receives, forcing it to learn reusable behaviors that help the policy learn faster. We formalize this strategy and discuss connections to information bottleneck approaches and to the variational EM algorithm. We present empirical results in both discrete and continuous action domains and demonstrate that, for certain tasks, learning a default policy alongside the policy can significantly speed up and improve learning. |
Tasks | |
Published | 2019-05-03 |
URL | https://arxiv.org/abs/1905.01240v1 |
https://arxiv.org/pdf/1905.01240v1.pdf | |
PWC | https://paperswithcode.com/paper/information-asymmetry-in-kl-regularized-rl-1 |
Repo | |
Framework | |
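The KL-regularized expected reward objective mentioned in the abstract above can be sketched as follows. The notation is an assumption not taken from the abstract: $\pi$ is the policy, $\pi_0$ the default policy, $\gamma$ a discount factor, $\alpha$ a regularization weight, $x_t$ the full information available to the policy at time $t$, and $x_t^{D}$ the restricted subset that the default policy is allowed to see.

```latex
\mathcal{L}(\pi, \pi_0) \;=\;
\mathbb{E}_{\tau \sim \pi}\!\left[
  \sum_{t \ge 0} \gamma^{t} \, r(s_t, a_t)
  \;-\; \alpha \, \gamma^{t} \,
  \mathrm{KL}\!\big( \pi(\cdot \mid x_t) \,\big\|\, \pi_0(\cdot \mid x_t^{D}) \big)
\right]
```

The information asymmetry lies in the conditioning: because $\pi_0$ sees only $x_t^{D}$, it cannot simply copy $\pi$ and has to capture behaviour that is reusable across contexts.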
Learning Privately over Distributed Features: An ADMM Sharing Approach
Title | Learning Privately over Distributed Features: An ADMM Sharing Approach |
Authors | Yaochen Hu, Peng Liu, Linglong Kong, Di Niu |
Abstract | Distributed machine learning has been widely studied in order to handle the exploding amount of data. In this paper, we study an important yet less visited distributed learning problem where features are inherently distributed or vertically partitioned among multiple parties, and sharing of raw data or model parameters among parties is prohibited due to privacy concerns. We propose an ADMM sharing framework to approach risk minimization over distributed features, where each party only needs to share a single value for each sample in the training process, thus minimizing the data leakage risk. We establish convergence and iteration complexity results for the proposed parallel ADMM algorithm under non-convex loss. We further introduce a novel differentially private ADMM sharing algorithm and bound the privacy guarantee with carefully designed noise perturbation. Experiments based on a prototype system show that the proposed ADMM algorithms converge efficiently in a robust fashion, demonstrating an advantage over gradient-based methods, especially for data sets with high-dimensional feature spaces. |
Tasks | |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07735v1 |
https://arxiv.org/pdf/1907.07735v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-privately-over-distributed-features |
Repo | |
Framework | |
Natural Compression for Distributed Deep Learning
Title | Natural Compression for Distributed Deep Learning |
Authors | Samuel Horvath, Chen-Yu Ho, Ludovit Horvath, Atal Narayan Sahu, Marco Canini, Peter Richtarik |
Abstract | Modern deep learning models are often trained in parallel over a collection of distributed machines to reduce training time. In such settings, communication of model updates among machines becomes a significant performance bottleneck and various lossy update compression techniques have been proposed to alleviate this problem. In this work, we introduce a new, simple yet theoretically and practically effective compression technique: natural compression (NC). Our technique is applied individually to all entries of the to-be-compressed update vector and works by randomized rounding to the nearest (negative or positive) power of two, which can be computed in a “natural” way by ignoring the mantissa. We show that compared to no compression, NC increases the second moment of the compressed vector by not more than the tiny factor $9/8$, which means that the effect of NC on the convergence speed of popular training algorithms, such as distributed SGD, is negligible. However, the communication savings enabled by NC are substantial, leading to a $3$-$4\times$ improvement in overall theoretical running time. For applications requiring more aggressive compression, we generalize NC to natural dithering, which we prove is exponentially better than the common random dithering technique. Our compression operators can be used on their own or in combination with existing operators for a more aggressive combined effect, and offer new state-of-the-art both in theory and practice. |
Tasks | Quantization |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.10988v2 |
https://arxiv.org/pdf/1905.10988v2.pdf | |
PWC | https://paperswithcode.com/paper/natural-compression-for-distributed-deep |
Repo | |
Framework | |
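The randomized rounding described in the abstract above can be sketched for a single scalar. The probabilities are an assumption chosen so that the rounding is unbiased, and the function name `natural_compress` is hypothetical; the abstract specifies only that each entry is rounded randomly to a nearby power of two.

```python
import math
import random


def natural_compress(x, rng=random):
    """Randomly round x to one of the two neighbouring powers of two.

    The probability of rounding up is chosen so that the expected output
    equals x. The sign is kept and zero maps to zero.
    """
    if x == 0.0:
        return 0.0
    sign = 1.0 if x > 0 else -1.0
    # frexp gives |x| = m * 2**e with 0.5 <= m < 1,
    # so 2**(e-1) <= |x| < 2**e.
    m, e = math.frexp(abs(x))
    lower = math.ldexp(1.0, e - 1)
    p_up = 2.0 * m - 1.0  # equals (|x| - lower) / lower
    magnitude = 2.0 * lower if rng.random() < p_up else lower
    return sign * magnitude


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [natural_compress(3.0, rng) for _ in range(10000)]
    print("empirical mean for x = 3.0:", sum(samples) / len(samples))
```

For x = 3.0 the output is 2.0 or 4.0 with equal probability, so the mean is 3.0 and the second moment is 10, below the bound of (9/8) · 9 = 10.125 quoted in the abstract. Exact powers of two pass through unchanged.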
Anomaly detecting and ranking of the cloud computing platform by multi-view learning
Title | Anomaly detecting and ranking of the cloud computing platform by multi-view learning |
Authors | Jing Zhang |
Abstract | Anomaly detection is an important technique in cloud computing, used to keep the cloud platform running smoothly. Traditional detection methods based on statistics, analysis, etc. lead to a high false-alarm rate because of non-adaptive and sensitive parameter settings. We present an online model for anomaly detection based on machine learning theory. However, most existing machine learning methods directly concatenate all features from different sub-systems into one long feature vector, which makes it difficult to exploit the complementary information between sub-systems and ignores the multi-view features that could enhance classification performance. To address this problem, the proposed method automatically fuses multi-view features and optimizes the discriminative model to improve accuracy. The model uses an extreme learning machine (ELM) to improve detection efficiency. An ELM is a single-hidden-layer neural network that replaces the iterative solution for the output weights with the solution of a system of linear equations, thereby avoiding local optima. Moreover, we rank anomalies according to the relationship between samples and the classification boundary, assign weights to the ranked anomalies, and finally retrain the classification model. Our method fully exploits the complementary information between sub-systems and avoids the influence of imbalanced datasets, and can therefore deal with various challenges of the cloud computing platform. We deploy a private cloud platform with OpenStack, verify the proposed model, and compare the results with state-of-the-art methods, showing better efficiency and simplicity. |
Tasks | Multi-View Learning |
Published | 2019-01-27 |
URL | http://arxiv.org/abs/1901.09294v1 |
http://arxiv.org/pdf/1901.09294v1.pdf | |
PWC | https://paperswithcode.com/paper/anomaly-detecting-and-ranking-of-the-cloud |
Repo | |
Framework | |
Grounding-Tracking-Integration
Title | Grounding-Tracking-Integration |
Authors | Zhengyuan Yang, Tushar Kumar, Tianlang Chen, Jiebo Luo |
Abstract | In this paper, we study tracking by language that localizes the target box sequence in a video based on a language query. We propose a framework called GTI that decomposes the problem into three sub-tasks: Grounding, Tracking and Integration. The three sub-task modules operate simultaneously and predict the box sequence frame-by-frame. “Grounding” predicts the referred region directly from the language query. “Tracking” localizes the target based on the history of the grounded regions in previous frames. “Integration” generates final predictions by synergistically combining grounding and tracking. With the “integration” task as the key, we explore how to indicate the quality of the grounded regions in each frame and achieve the desired mutually beneficial combination. To this end, we propose an “RT-integration” method that defines and predicts two scores to guide the integration: 1) R-score represents the Region correctness whether the grounding prediction accurately covers the target, and 2) T-score represents the Template quality whether the region provides informative visual cues to improve tracking in future frames. We present our real-time GTI implementation with the proposed RT-integration, and benchmark the framework on LaSOT and Lingual OTB99 with highly promising results. Moreover, a disambiguated version of LaSOT queries can be used to facilitate future tracking by language studies. |
Tasks | |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06316v1 |
https://arxiv.org/pdf/1912.06316v1.pdf | |
PWC | https://paperswithcode.com/paper/grounding-tracking-integration |
Repo | |
Framework | |
Sufficiently Accurate Model Learning
Title | Sufficiently Accurate Model Learning |
Authors | Clark Zhang, Arbaaz Khan, Santiago Paternain, Alejandro Ribeiro |
Abstract | Modeling how a robot interacts with the environment around it is an important prerequisite for designing control and planning algorithms. In fact, the performance of controllers and planners is highly dependent on the quality of the model. One popular approach is to learn data-driven models in order to compensate for inaccurate physical measurements and to adapt to systems that evolve over time. In this paper, we investigate a method to regularize model learning techniques to provide better error characteristics for traditional control and planning algorithms. This work proposes learning “Sufficiently Accurate” models of dynamics using a primal-dual method that can explicitly enforce constraints on the error in pre-defined parts of the state-space. The result of this method is that the error characteristics of the learned model are more predictable and can be better utilized by planning and control algorithms. The characteristics of Sufficiently Accurate models are analyzed through experiments on a simulated ball paddle system. |
Tasks | |
Published | 2019-02-19 |
URL | https://arxiv.org/abs/1902.06862v2 |
https://arxiv.org/pdf/1902.06862v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-task-agnostic-sufficiently-accurate |
Repo | |
Framework | |
ActiveMoCap: Optimized Drone Flight for Active Human Motion Capture
Title | ActiveMoCap: Optimized Drone Flight for Active Human Motion Capture |
Authors | Sena Kiciroglu, Helge Rhodin, Sudipta Sinha, Mathieu Salzmann, Pascal Fua |
Abstract | The accuracy of monocular 3D human pose estimation depends on the viewpoint from which the image is captured. While camera-equipped drones provide control over this viewpoint, automatically positioning them at the location which will yield the highest accuracy remains an open problem. This is the problem that we address in this paper. Specifically, given a short video sequence, we introduce an algorithm that predicts where a drone should go in the future frame so as to maximize 3D human pose estimation accuracy. A key idea underlying our approach is a method to estimate the uncertainty of the 3D body pose estimates. We integrate several sources of uncertainty, originating from deep learning based regressors and temporal smoothness. The resulting motion planner leads to improved 3D body pose estimates and outperforms or matches existing planners that are based on person following and orbiting. |
Tasks | 3D Human Pose Estimation, Motion Capture, Pose Estimation |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08568v1 |
https://arxiv.org/pdf/1912.08568v1.pdf | |
PWC | https://paperswithcode.com/paper/activemocap-optimized-drone-flight-for-active |
Repo | |
Framework | |
Efficient Navigation of Colloidal Robots in an Unknown Environment via Deep Reinforcement Learning
Title | Efficient Navigation of Colloidal Robots in an Unknown Environment via Deep Reinforcement Learning |
Authors | Yuguang Yang, Michael A. Bevan, Bo Li |
Abstract | Equipping active colloidal robots with intelligence such that they can efficiently navigate in unknown complex environments could dramatically impact their use in emerging applications like precision surgery and targeted drug delivery. Here we develop a model-free deep reinforcement learning approach that can train colloidal robots to learn effective navigation strategies in unknown environments with random obstacles. We show that trained robot agents learn to make navigation decisions regarding both obstacle avoidance and travel time minimization, based solely on local sensory inputs without prior knowledge of the global environment. Such agents with biologically inspired mechanisms can acquire competitive navigation capabilities in large-scale, complex environments containing obstacles of diverse shapes, sizes, and configurations. This study illustrates the potential of artificial intelligence in engineering active colloidal systems for future applications and constructing complex active systems with visual and learning capability. |
Tasks | |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.10844v2 |
https://arxiv.org/pdf/1906.10844v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-navigation-of-active-particles-in |
Repo | |
Framework | |