January 29, 2020

Paper Group ANR 565

A Comparative Study of Glottal Source Estimation Techniques. A Rank-1 Sketch for Matrix Multiplicative Weights. Link Prediction in the Stochastic Block Model with Outliers. Glottal Closure and Opening Instant Detection from Speech Signals. Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff. Prioritized Unit Propagation with Perio …

A Comparative Study of Glottal Source Estimation Techniques


Title	A Comparative Study of Glottal Source Estimation Techniques
Authors	Thomas Drugman, Baris Bozkurt, Thierry Dutoit
Abstract	Source-tract decomposition (or glottal flow estimation) is one of the basic problems of speech processing. For this, several techniques have been proposed in the literature. However, studies comparing different approaches are almost nonexistent. Besides, experiments have been systematically performed either on synthetic speech or on sustained vowels. In this study we compare three of the main representative state-of-the-art methods of glottal flow estimation: closed-phase inverse filtering, iterative and adaptive inverse filtering, and mixed-phase decomposition. These techniques are first submitted to an objective assessment test on synthetic speech signals. Their sensitivity to various factors affecting the estimation quality and their robustness to noise are studied. In a second experiment, their ability to label voice quality (tensed, modal, soft) is studied on a large corpus of real connected speech. It is shown that changes of voice quality are reflected by significant modifications in glottal feature distributions. Techniques based on the mixed-phase decomposition and on a closed-phase inverse filtering process turn out to give the best results on both clean synthetic and real speech signals. On the other hand, iterative and adaptive inverse filtering is recommended in noisy environments for its high robustness.
Tasks
Published	2019-12-28
URL	https://arxiv.org/abs/2001.00840v1
PDF	https://arxiv.org/pdf/2001.00840v1.pdf
PWC	https://paperswithcode.com/paper/a-comparative-study-of-glottal-source
Repo
Framework

A Rank-1 Sketch for Matrix Multiplicative Weights


Title	A Rank-1 Sketch for Matrix Multiplicative Weights
Authors	Yair Carmon, John C. Duchi, Aaron Sidford, Kevin Tian
Abstract	We show that a simple randomized sketch of the matrix multiplicative weight (MMW) update enjoys (in expectation) the same regret bounds as MMW, up to a small constant factor. Unlike MMW, where every step requires full matrix exponentiation, our steps require only a single product of the form $e^A b$, which the Lanczos method approximates efficiently. Our key technique is to view the sketch as a $\textit{randomized mirror projection}$, and perform mirror descent analysis on the $\textit{expected projection}$. Our sketch solves the online eigenvector problem, improving the best known complexity bounds by $\Omega(\log^5 n)$. We also apply this sketch to semidefinite programming in saddle-point form, yielding a simple primal-dual scheme with guarantees matching the best in the literature.
Tasks
Published	2019-03-07
URL	https://arxiv.org/abs/1903.02675v2
PDF	https://arxiv.org/pdf/1903.02675v2.pdf
PWC	https://paperswithcode.com/paper/a-rank-1-sketch-for-matrix-multiplicative
Repo
Framework

Link Prediction in the Stochastic Block Model with Outliers


Title	Link Prediction in the Stochastic Block Model with Outliers
Authors	Solenne Gaucher, Olga Klopp, Geneviève Robin
Abstract	The Stochastic Block Model is a popular model for network analysis in the presence of community structure. However, in numerous examples, the assumptions underlying this classical model are violated by the behaviour of a small number of outlier nodes such as hubs, nodes with mixed membership profiles, or corrupted nodes. In addition, real-life networks are likely to be incomplete, due to non-response or machine failures. We introduce a new algorithm to estimate the connection probabilities in a network, which is robust to both outlier nodes and missing observations. Under fairly general assumptions, this method detects the outliers, and achieves the best known error for the estimation of connection probabilities with polynomial computation cost. In addition, we prove sub-linear convergence of our algorithm. We provide a simulation study which demonstrates the good behaviour of the method in terms of outlier selection and prediction of the missing links.
Tasks	Link Prediction
Published	2019-11-29
URL	https://arxiv.org/abs/1911.13122v1
PDF	https://arxiv.org/pdf/1911.13122v1.pdf
PWC	https://paperswithcode.com/paper/link-prediction-in-the-stochastic-block-model
Repo
Framework

Glottal Closure and Opening Instant Detection from Speech Signals


Title	Glottal Closure and Opening Instant Detection from Speech Signals
Authors	Thomas Drugman, Thierry Dutoit
Abstract	This paper proposes a new procedure to detect Glottal Closure and Opening Instants (GCIs and GOIs) directly from speech waveforms. The procedure is divided into two successive steps. First, a mean-based signal is computed, and intervals where speech events are expected to occur are extracted from it. Second, within each interval a precise position of the speech event is assigned by locating a discontinuity in the Linear Prediction residual. The proposed method is compared to the DYPSA algorithm on the CMU ARCTIC database. A significant improvement and better noise robustness are reported. In addition, the GOI identification accuracy is promising for glottal source characterization.
Tasks
Published	2019-12-28
URL	https://arxiv.org/abs/2001.00841v1
PDF	https://arxiv.org/pdf/2001.00841v1.pdf
PWC	https://paperswithcode.com/paper/glottal-closure-and-opening-instant-detection
Repo
Framework

Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff


Title	Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff
Authors	Yochai Blau, Tomer Michaeli
Abstract	Lossy compression algorithms are typically designed and analyzed through the lens of Shannon’s rate-distortion theory, where the goal is to achieve the lowest possible distortion (e.g., low MSE or high SSIM) at any given bit rate. However, in recent years, it has become increasingly accepted that “low distortion” is not a synonym for “high perceptual quality”, and in fact optimization of one often comes at the expense of the other. In light of this understanding, it is natural to seek for a generalization of rate-distortion theory which takes perceptual quality into account. In this paper, we adopt the mathematical definition of perceptual quality recently proposed by Blau & Michaeli (2018), and use it to study the three-way tradeoff between rate, distortion, and perception. We show that restricting the perceptual quality to be high, generally leads to an elevation of the rate-distortion curve, thus necessitating a sacrifice in either rate or distortion. We prove several fundamental properties of this triple-tradeoff, calculate it in closed form for a Bernoulli source, and illustrate it visually on a toy MNIST example.
Tasks
Published	2019-01-23
URL	https://arxiv.org/abs/1901.07821v4
PDF	https://arxiv.org/pdf/1901.07821v4.pdf
PWC	https://paperswithcode.com/paper/rethinking-lossy-compression-the-rate
Repo
Framework
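
The three-way tradeoff described in this abstract can be written as a constrained optimization. The following is a sketch only; the symbols are assumptions chosen for illustration and do not appear in the abstract: $X$ is the source, $\hat{X}$ the reconstruction, $\Delta$ a distortion measure, and $d$ a divergence between the source and reconstruction distributions, used as the measure of perceptual quality in the sense of Blau & Michaeli (2018).

```latex
$$
R(D, P) \;=\; \min_{p_{\hat{X} \mid X}} \; I(X; \hat{X})
\quad \text{s.t.} \quad
\mathbb{E}\big[\Delta(X, \hat{X})\big] \le D,
\qquad
d\big(p_X, p_{\hat{X}}\big) \le P
$$
```

Under this reading, dropping the perception constraint (letting $P \to \infty$) recovers the classical rate-distortion function, and tightening $P$ can only shrink the feasible set, which is consistent with the abstract's statement that demanding high perceptual quality elevates the rate-distortion curve.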

Prioritized Unit Propagation with Periodic Resetting is (Almost) All You Need for Random SAT Solving


Title	Prioritized Unit Propagation with Periodic Resetting is (Almost) All You Need for Random SAT Solving
Authors	Xujie Si, Yujia Li, Vinod Nair, Felix Gimeno
Abstract	We propose prioritized unit propagation with periodic resetting, which is a simple but surprisingly effective algorithm for solving random SAT instances that are meant to be hard. In particular, an evaluation on the Random Track of the 2017 and 2018 SAT competitions shows that a basic prototype of this simple idea already ranks at second place in both years. We share this observation in the hope that it helps the SAT community better understand the hardness of random instances used in competitions and inspire other interesting ideas on SAT solving.
Tasks
Published	2019-12-04
URL	https://arxiv.org/abs/1912.05906v1
PDF	https://arxiv.org/pdf/1912.05906v1.pdf
PWC	https://paperswithcode.com/paper/prioritized-unit-propagation-with-periodic
Repo
Framework
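
For readers unfamiliar with the building block named in the title, here is a minimal sketch of plain unit propagation on a CNF formula. It deliberately omits the prioritization and the periodic resetting, which the abstract does not specify; the function name `unit_propagate` and the DIMACS-style integer encoding of literals are assumptions made for this illustration, not the authors' implementation.

```python
def unit_propagate(clauses, assignment=None):
    """Repeatedly assign literals that are forced by unit clauses.

    clauses: list of lists of non-zero ints (k means variable k is true,
             -k means variable k is false).
    Returns (assignment, conflict), where assignment maps var -> bool and
    conflict is True if some clause has all of its literals falsified.
    """
    assignment = dict(assignment or {})
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var, want = abs(lit), lit > 0
                if var not in assignment:
                    unassigned.append(lit)
                elif assignment[var] == want:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                # Every literal in this clause is false under the assignment.
                return assignment, True
            if len(unassigned) == 1:
                # Unit clause: its single remaining literal must be true.
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return assignment, False


if __name__ == "__main__":
    # (x1) and (not x1 or x2) and (not x2 or x3) forces all three variables.
    print(unit_propagate([[1], [-1, 2], [-2, 3]]))
```

A full solver would combine this routine with a decision heuristic; the sketch only shows the propagation step itself.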

Deep CSI Learning for Gait Biometric Sensing and Recognition


Title	Deep CSI Learning for Gait Biometric Sensing and Recognition
Authors	Kalvik Jakkala, Arupjyoti Bhuya, Zhi Sun, Pu Wang, Zhuo Cheng
Abstract	Gait is a person’s natural walking style and a complex biological process that is unique to each person. Recently, the channel state information (CSI) of WiFi devices has been exploited to capture human gait biometrics for user identification. However, the performance of existing CSI-based gait identification systems is far from satisfactory. They achieve only limited identification accuracy (maximum $93\%$) and only for a very small group of people (i.e., between 2 and 10). To address this challenge, an end-to-end deep CSI learning system is developed, which exploits deep neural networks to automatically learn the salient gait features in CSI data that are discriminative enough to distinguish different people. First, the raw CSI data are sanitized through window-based denoising, mean centering and normalization. The sanitized data are then passed to a residual deep convolutional neural network (DCNN), which automatically extracts the hierarchical features of gait signatures embedded in the CSI data. Finally, a softmax classifier utilizes the extracted features to make the final prediction about the identity of the user. In a typical indoor environment, a top-1 accuracy of $97.12 \pm 1.13\%$ is achieved for a dataset of 30 people.
Tasks	Denoising, Gait Identification
Published	2019-02-06
URL	http://arxiv.org/abs/1902.02300v1
PDF	http://arxiv.org/pdf/1902.02300v1.pdf
PWC	https://paperswithcode.com/paper/deep-csi-learning-for-gait-biometric-sensing
Repo
Framework
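
The sanitization stage in this abstract (window-based denoising, mean centering, normalization) can be sketched for a single 1-D series as follows. This is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say which window filter or which normalization is used, so a centered moving average and scaling to unit standard deviation are assumed here, and the names `moving_average` and `sanitize` are hypothetical.

```python
from statistics import fmean, pstdev


def moving_average(signal, window=5):
    """Smooth a sequence with a centered moving average.

    The window shrinks near the edges so the output has the same length.
    (Assumed denoising filter; the abstract only says "window-based".)
    """
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(fmean(signal[lo:hi]))
    return out


def sanitize(signal, window=5):
    """Denoise, mean-center, and scale a series to unit standard deviation."""
    smoothed = moving_average(signal, window)
    mean = fmean(smoothed)
    centered = [v - mean for v in smoothed]
    scale = pstdev(centered)
    if scale == 0:
        # Constant input: nothing to scale, return the centered zeros.
        return centered
    return [v / scale for v in centered]


if __name__ == "__main__":
    print(sanitize([1.0, 2.0, 4.0, 3.0, 5.0, 7.0, 6.0, 8.0]))
```

In a real CSI setting the same steps would be applied per subcarrier before the data are fed to the network.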

Information asymmetry in KL-regularized RL


Title	Information asymmetry in KL-regularized RL
Authors	Alexandre Galashov, Siddhant M. Jayakumar, Leonard Hasenclever, Dhruva Tirumala, Jonathan Schwarz, Guillaume Desjardins, Wojciech M. Czarnecki, Yee Whye Teh, Razvan Pascanu, Nicolas Heess
Abstract	Many real world tasks exhibit rich structure that is repeated across different parts of the state space or in time. In this work we study the possibility of leveraging such repeated structure to speed up and regularize learning. We start from the KL regularized expected reward objective which introduces an additional component, a default policy. Instead of relying on a fixed default policy, we learn it from data. But crucially, we restrict the amount of information the default policy receives, forcing it to learn reusable behaviors that help the policy learn faster. We formalize this strategy and discuss connections to information bottleneck approaches and to the variational EM algorithm. We present empirical results in both discrete and continuous action domains and demonstrate that, for certain tasks, learning a default policy alongside the policy can significantly speed up and improve learning.
Tasks
Published	2019-05-03
URL	https://arxiv.org/abs/1905.01240v1
PDF	https://arxiv.org/pdf/1905.01240v1.pdf
PWC	https://paperswithcode.com/paper/information-asymmetry-in-kl-regularized-rl-1
Repo
Framework
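
One common way to write the KL-regularized expected reward objective mentioned in this abstract is sketched below. The notation is an assumption made for illustration and is not taken from the abstract: $\pi$ is the policy, $\pi_0$ the learned default policy, $\gamma$ the discount factor, $\alpha$ the regularization weight, $x_t$ the full information available to the policy at time $t$, and $x_t^D$ the restricted subset of that information given to the default policy.

```latex
$$
\mathcal{L}(\pi, \pi_0) \;=\;
\mathbb{E}_{\tau \sim \pi}\!\left[\,\sum_{t \ge 0} \gamma^{t}\, r(s_t, a_t)
\;-\; \alpha\, \gamma^{t}\,
\mathrm{KL}\!\big[\pi(\cdot \mid x_t)\,\big\|\,\pi_0(\cdot \mid x_t^{D})\big]\right]
$$
```

The information asymmetry in the title corresponds to $x_t^D$ carrying less information than $x_t$, so the default policy can only capture behavior that is reusable across contexts.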


Learning Privately over Distributed Features: An ADMM Sharing Approach


Title	Learning Privately over Distributed Features: An ADMM Sharing Approach
Authors	Yaochen Hu, Peng Liu, Linglong Kong, Di Niu
Abstract	Distributed machine learning has been widely studied in order to handle the exploding amount of data. In this paper, we study an important yet less visited distributed learning problem where features are inherently distributed or vertically partitioned among multiple parties, and sharing of raw data or model parameters among parties is prohibited due to privacy concerns. We propose an ADMM sharing framework to approach risk minimization over distributed features, where each party only needs to share a single value for each sample in the training process, thus minimizing the data leakage risk. We establish convergence and iteration complexity results for the proposed parallel ADMM algorithm under non-convex loss. We further introduce a novel differentially private ADMM sharing algorithm and bound the privacy guarantee with carefully designed noise perturbation. The experiments based on a prototype system show that the proposed ADMM algorithms converge efficiently in a robust fashion, demonstrating an advantage over gradient-based methods, especially for data sets with high-dimensional feature spaces.
Tasks
Published	2019-07-17
URL	https://arxiv.org/abs/1907.07735v1
PDF	https://arxiv.org/pdf/1907.07735v1.pdf
PWC	https://paperswithcode.com/paper/learning-privately-over-distributed-features
Repo
Framework

Natural Compression for Distributed Deep Learning


Title	Natural Compression for Distributed Deep Learning
Authors	Samuel Horvath, Chen-Yu Ho, Ludovit Horvath, Atal Narayan Sahu, Marco Canini, Peter Richtarik
Abstract	Modern deep learning models are often trained in parallel over a collection of distributed machines to reduce training time. In such settings, communication of model updates among machines becomes a significant performance bottleneck and various lossy update compression techniques have been proposed to alleviate this problem. In this work, we introduce a new, simple yet theoretically and practically effective compression technique: natural compression (NC). Our technique is applied individually to all entries of the to-be-compressed update vector and works by randomized rounding to the nearest (negative or positive) power of two, which can be computed in a “natural” way by ignoring the mantissa. We show that compared to no compression, NC increases the second moment of the compressed vector by not more than the tiny factor $9/8$, which means that the effect of NC on the convergence speed of popular training algorithms, such as distributed SGD, is negligible. However, the communications savings enabled by NC are substantial, leading to $3$-$4\times$ improvement in overall theoretical running time. For applications requiring more aggressive compression, we generalize NC to natural dithering, which we prove is exponentially better than the common random dithering technique. Our compression operators can be used on their own or in combination with existing operators for a more aggressive combined effect, and offer new state-of-the-art both in theory and practice.
Tasks	Quantization
Published	2019-05-27
URL	https://arxiv.org/abs/1905.10988v2
PDF	https://arxiv.org/pdf/1905.10988v2.pdf
PWC	https://paperswithcode.com/paper/natural-compression-for-distributed-deep
Repo
Framework
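
The rounding rule described in this abstract can be sketched in a few lines. This is an illustration, not the authors' code: the function name `natural_compress` is hypothetical, and the rounding probabilities are an assumption chosen so that the result is unbiased (the abstract only says "randomized rounding"). A value with magnitude between $2^a$ and $2^{a+1}$ is rounded to one of those two powers, keeping its sign.

```python
import math
import random


def natural_compress(x, rng=random):
    """Randomly round x to a neighbouring signed power of two.

    Assumed rule: round up with probability proportional to the distance
    from the lower power of two, so that the expected output equals x.
    Zero is returned unchanged; inf and NaN are not handled in this sketch.
    """
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    # frexp gives mag = m * 2**e with 0.5 <= m < 1, so 2**(e-1) <= mag < 2**e.
    _, e = math.frexp(mag)
    low = math.ldexp(1.0, e - 1)
    high = math.ldexp(1.0, e)
    p_up = (mag - low) / low
    return sign * (high if rng.random() < p_up else low)


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [natural_compress(3.0, rng) for _ in range(10)]
    print(samples)  # every entry is either 2.0 or 4.0
```

Because the output is always a signed power of two, only the sign and exponent need to be transmitted, which is where the communication savings mentioned in the abstract would come from.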

Anomaly detecting and ranking of the cloud computing platform by multi-view learning


Title	Anomaly detecting and ranking of the cloud computing platform by multi-view learning
Authors	Jing Zhang
Abstract	Anomaly detection is an important technique in cloud computing that supports the smooth running of the cloud platform. Traditional detection methods based on statistics, analysis, etc. lead to a high false-alarm rate because their parameter settings are non-adaptive and sensitive. We present an online model for anomaly detection based on machine learning theory. Most existing machine-learning methods concatenate all features from different sub-systems directly into one long feature vector, which makes it difficult to exploit the complementary information between sub-systems and ignores the multi-view features that could enhance classification performance. To address this problem, the proposed method automatically fuses multi-view features and optimizes the discriminative model to improve accuracy. The model takes advantage of the extreme learning machine (ELM) to improve detection efficiency. ELM is a single-hidden-layer neural network that replaces the iterative solution of the output weights with the solution of linear equations and thereby avoids local optima. Moreover, we rank anomalies according to the relationship between samples and the classification boundary, assign weights to the ranked anomalies, and finally retrain the classification model. Our method fully exploits the complementary information between sub-systems and avoids the influence of an imbalanced dataset, and can therefore deal with various challenges of the cloud computing platform. We deploy a private cloud platform with OpenStack, verify the proposed model, and compare the results with state-of-the-art methods, obtaining better efficiency and simplicity.
Tasks	Multi-View Learning
Published	2019-01-27
URL	http://arxiv.org/abs/1901.09294v1
PDF	http://arxiv.org/pdf/1901.09294v1.pdf
PWC	https://paperswithcode.com/paper/anomaly-detecting-and-ranking-of-the-cloud
Repo
Framework

Grounding-Tracking-Integration


Title	Grounding-Tracking-Integration
Authors	Zhengyuan Yang, Tushar Kumar, Tianlang Chen, Jiebo Luo
Abstract	In this paper, we study tracking by language that localizes the target box sequence in a video based on a language query. We propose a framework called GTI that decomposes the problem into three sub-tasks: Grounding, Tracking and Integration. The three sub-task modules operate simultaneously and predict the box sequence frame-by-frame. “Grounding” predicts the referred region directly from the language query. “Tracking” localizes the target based on the history of the grounded regions in previous frames. “Integration” generates final predictions by synergistically combining grounding and tracking. With the “integration” task as the key, we explore how to indicate the quality of the grounded regions in each frame and achieve the desired mutually beneficial combination. To this end, we propose an “RT-integration” method that defines and predicts two scores to guide the integration: 1) R-score represents the Region correctness whether the grounding prediction accurately covers the target, and 2) T-score represents the Template quality whether the region provides informative visual cues to improve tracking in future frames. We present our real-time GTI implementation with the proposed RT-integration, and benchmark the framework on LaSOT and Lingual OTB99 with highly promising results. Moreover, a disambiguated version of LaSOT queries can be used to facilitate future tracking by language studies.
Tasks
Published	2019-12-13
URL	https://arxiv.org/abs/1912.06316v1
PDF	https://arxiv.org/pdf/1912.06316v1.pdf
PWC	https://paperswithcode.com/paper/grounding-tracking-integration
Repo
Framework

Sufficiently Accurate Model Learning


Title	Sufficiently Accurate Model Learning
Authors	Clark Zhang, Arbaaz Khan, Santiago Paternain, Alejandro Ribeiro
Abstract	Modeling how a robot interacts with the environment around it is an important prerequisite for designing control and planning algorithms. In fact, the performance of controllers and planners is highly dependent on the quality of the model. One popular approach is to learn data-driven models in order to compensate for inaccurate physical measurements and to adapt to systems that evolve over time. In this paper, we investigate a method to regularize model learning techniques to provide better error characteristics for traditional control and planning algorithms. This work proposes learning “Sufficiently Accurate” models of dynamics using a primal-dual method that can explicitly enforce constraints on the error in pre-defined parts of the state-space. The result of this method is that the error characteristics of the learned model are more predictable and can be better utilized by planning and control algorithms. The characteristics of Sufficiently Accurate models are analyzed through experiments on a simulated ball paddle system.
Tasks
Published	2019-02-19
URL	https://arxiv.org/abs/1902.06862v2
PDF	https://arxiv.org/pdf/1902.06862v2.pdf
PWC	https://paperswithcode.com/paper/learning-task-agnostic-sufficiently-accurate
Repo
Framework

ActiveMoCap: Optimized Drone Flight for Active Human Motion Capture


Title	ActiveMoCap: Optimized Drone Flight for Active Human Motion Capture
Authors	Sena Kiciroglu, Helge Rhodin, Sudipta Sinha, Mathieu Salzmann, Pascal Fua
Abstract	The accuracy of monocular 3D human pose estimation depends on the viewpoint from which the image is captured. While camera-equipped drones provide control over this viewpoint, automatically positioning them at the location which will yield the highest accuracy remains an open problem. This is the problem that we address in this paper. Specifically, given a short video sequence, we introduce an algorithm that predicts where a drone should go in the future frame so as to maximize 3D human pose estimation accuracy. A key idea underlying our approach is a method to estimate the uncertainty of the 3D body pose estimates. We integrate several sources of uncertainty, originating from deep-learning-based regressors and temporal smoothness. The resulting motion planner leads to improved 3D body pose estimates and outperforms or matches existing planners that are based on person following and orbiting.
Tasks	3D Human Pose Estimation, Motion Capture, Pose Estimation
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08568v1
PDF	https://arxiv.org/pdf/1912.08568v1.pdf
PWC	https://paperswithcode.com/paper/activemocap-optimized-drone-flight-for-active
Repo
Framework


Efficient Navigation of Colloidal Robots in an Unknown Environment via Deep Reinforcement Learning


Title	Efficient Navigation of Colloidal Robots in an Unknown Environment via Deep Reinforcement Learning
Authors	Yuguang Yang, Michael A. Bevan, Bo Li
Abstract	Equipping active colloidal robots with intelligence such that they can efficiently navigate in unknown complex environments could dramatically impact their use in emerging applications like precision surgery and targeted drug delivery. Here we develop a model-free deep reinforcement learning algorithm that can train colloidal robots to learn effective navigation strategies in unknown environments with random obstacles. We show that trained robot agents learn to make navigation decisions regarding both obstacle avoidance and travel time minimization, based solely on local sensory inputs without prior knowledge of the global environment. Such agents with biologically inspired mechanisms can acquire competitive navigation capabilities in large-scale, complex environments containing obstacles of diverse shapes, sizes, and configurations. This study illustrates the potential of artificial intelligence in engineering active colloidal systems for future applications and constructing complex active systems with visual and learning capability.
Tasks
Published	2019-06-26
URL	https://arxiv.org/abs/1906.10844v2
PDF	https://arxiv.org/pdf/1906.10844v2.pdf
PWC	https://paperswithcode.com/paper/efficient-navigation-of-active-particles-in
Repo
Framework