Paper Group ANR 1051
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data. Harmonic Unpaired Image-to-image Translation. Optimal coding and the origins of Zipfian laws. Mango Tree Net – A fully convolutional network for semantic segmentation and individual crown detection of mango trees. Memory-Sample Tradeoffs for Linear …
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data
Title | In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data |
Authors | Nishant Prateek, Mateusz Łajszczak, Roberto Barra-Chicote, Thomas Drugman, Jaime Lorenzo-Trueba, Thomas Merritt, Srikanth Ronanki, Trevor Wood |
Abstract | Neural text-to-speech synthesis (NTTS) models have shown significant progress in generating high-quality speech; however, they require a large quantity of training data, which makes creating models for multiple styles expensive and time-consuming. In this paper, different styles of speech are analysed on the basis of prosodic variations, and from this analysis a model is proposed to synthesise speech in the style of a newscaster with just a few hours of supplementary data. We pose the problem of synthesising in a target style using limited data as that of creating a bi-style model that can synthesise both neutral-style and newscaster-style speech via a one-hot vector which factorises the two styles. We also propose conditioning the model on contextual word embeddings, and extensively evaluate it against neutral NTTS and neutral concatenative-based synthesis. This model closes the gap in perceived style-appropriateness between natural recordings of newscaster-style speech and neutral speech synthesis by approximately two-thirds. |
Tasks | Speech Synthesis, Text-To-Speech Synthesis, Word Embeddings |
Published | 2019-04-04 |
URL | http://arxiv.org/abs/1904.02790v1 |
PDF | http://arxiv.org/pdf/1904.02790v1.pdf |
PWC | https://paperswithcode.com/paper/in-other-news-a-bi-style-text-to-speech-model |
Repo | |
Framework | |
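A minimal sketch (assuming PyTorch) of the conditioning idea the abstract describes: a single acoustic model whose input combines phoneme embeddings with contextual word embeddings and is modulated by a one-hot style vector that factorises neutral and newscaster speech. All module choices, layer sizes and the mel-spectrogram output are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class BiStyleAcousticModel(nn.Module):
    """Toy sketch: phoneme + word-embedding inputs, conditioned on a 2-way style one-hot."""
    def __init__(self, n_phonemes=70, word_emb_dim=768, hidden=256, n_mels=80, n_styles=2):
        super().__init__()
        self.phoneme_emb = nn.Embedding(n_phonemes, hidden)
        self.word_proj = nn.Linear(word_emb_dim, hidden)     # contextual word embeddings (e.g. from a pretrained LM)
        self.style_proj = nn.Linear(n_styles, hidden)        # one-hot style factor: neutral vs. newscaster
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.Linear(hidden, n_mels)             # mel-spectrogram frames (greatly simplified)

    def forward(self, phonemes, word_emb, style_onehot):
        x = self.phoneme_emb(phonemes) + self.word_proj(word_emb)
        x = x + self.style_proj(style_onehot).unsqueeze(1)   # broadcast the style over the sequence
        h, _ = self.encoder(x)
        return self.decoder(h)

model = BiStyleAcousticModel()
phonemes = torch.randint(0, 70, (1, 12))        # dummy phoneme ids
word_emb = torch.randn(1, 12, 768)              # dummy contextual embeddings, one per position
newscaster = torch.tensor([[0.0, 1.0]])         # one-hot selecting the newscaster style
mels = model(phonemes, word_emb, newscaster)    # (1, 12, 80)
```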
Harmonic Unpaired Image-to-image Translation
Title | Harmonic Unpaired Image-to-image Translation |
Authors | Rui Zhang, Tomas Pfister, Jia Li |
Abstract | The recent direction of unpaired image-to-image translation is, on the one hand, very exciting, as it removes the heavy burden of obtaining label-intensive pixel-to-pixel supervision, but, on the other hand, not fully satisfactory due to the presence of artifacts and degenerate transformations. In this paper, we take a manifold view of the problem by introducing a smoothness term over the sample graph to attain harmonic functions that enforce consistent mappings during the translation. We develop HarmonicGAN to learn bi-directional translations between the source and the target domains. With the help of similarity-consistency, the inherent self-consistency property of samples can be maintained. Distance metrics defined on two types of features, histogram and CNN, are exploited. Under an identical problem setting to CycleGAN, without additional manual inputs and at only a small training-time cost, HarmonicGAN demonstrates a significant qualitative and quantitative improvement over the state of the art, as well as improved interpretability. We show experimental results in a number of applications including medical imaging, object transfiguration, and semantic labeling. We outperform the competing methods in all tasks, and for a medical imaging task in particular our method turns CycleGAN from a failure into a success, halving the mean-squared error and generating images that radiologists prefer over those of competing methods in 95% of cases. |
Tasks | Image-to-Image Translation |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.09727v1 |
PDF | http://arxiv.org/pdf/1902.09727v1.pdf |
PWC | https://paperswithcode.com/paper/harmonic-unpaired-image-to-image-translation |
Repo | |
Framework | |
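The smoothness term described above can be read as a graph regulariser: patches that are similar in the source domain (by histogram or CNN features) are penalised if their translated counterparts drift apart. Below is a hedged PyTorch sketch of such a penalty; the Gaussian affinity and the patch descriptors are illustrative choices, not the paper's exact formulation.

```python
import torch

def smoothness_loss(features_src, translated, sigma=1.0):
    """Graph smoothness term (illustrative): if two source patches are similar,
    their translated counterparts are pushed to stay similar as well.
    features_src: (N, d) patch descriptors (e.g. histogram or CNN features) of the source image
    translated:   (N, d) descriptors of the corresponding patches after translation
    """
    # Affinity w_ij from source-domain similarity (Gaussian kernel on pairwise distances).
    d2 = torch.cdist(features_src, features_src) ** 2
    w = torch.exp(-d2 / (2 * sigma ** 2))
    # Harmonic-style penalty: sum_ij w_ij * ||t_i - t_j||^2
    t2 = torch.cdist(translated, translated) ** 2
    return (w * t2).mean()

src = torch.randn(16, 64)                        # 16 patches, 64-dim descriptors
out = torch.randn(16, 64, requires_grad=True)    # translated-patch descriptors
loss = smoothness_loss(src, out)
loss.backward()                                  # would be added to the usual GAN + cycle losses
```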
Optimal coding and the origins of Zipfian laws
Title | Optimal coding and the origins of Zipfian laws |
Authors | Ramon Ferrer-i-Cancho, Christian Bentz, Caio Seguin |
Abstract | The problem of compression in standard information theory consists of assigning codes as short as possible to numbers. Here we consider the problem of optimal coding – under an arbitrary coding scheme – and show that it predicts Zipf’s law of abbreviation, namely the tendency in natural languages for more frequent words to be shorter. We apply this result to investigate optimal coding also under so-called non-singular coding, a scheme where unique segmentation is not guaranteed but each code stands for a distinct number. Optimal non-singular coding predicts that the length of a word should grow approximately as the logarithm of its frequency rank, which is again consistent with Zipf’s law of abbreviation. Optimal non-singular coding in combination with the maximum entropy principle also predicts Zipf’s rank-frequency distribution. Furthermore, our findings on optimal non-singular coding challenge common beliefs about random typing: it turns out that random typing is in fact an optimal coding process, in stark contrast with the common assumption that it is detached from cost-cutting considerations. Finally, we discuss the implications of optimal coding for the construction of a compact theory of Zipfian laws and other linguistic laws. |
Tasks | |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01545v3 |
PDF | https://arxiv.org/pdf/1906.01545v3.pdf |
PWC | https://paperswithcode.com/paper/optimal-coding-and-the-origins-of-zipfian |
Repo | |
Framework | |
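The prediction that word length grows roughly as the logarithm of frequency rank under optimal non-singular coding is easy to illustrate: assign the r-th most frequent word the next-shortest unused string over the alphabet. A small Python illustration, assuming a 26-letter alphabet; the alphabet size and rank values are arbitrary.

```python
import math

def optimal_nonsingular_lengths(n_words, alphabet_size=26):
    """Assign each frequency rank the next-shortest distinct string over the alphabet
    (optimal non-singular coding: more frequent words get shorter codes)."""
    lengths = []
    length = 1
    strings_at_length = alphabet_size   # number of distinct strings of the current length
    used = 0
    for rank in range(1, n_words + 1):
        if used == strings_at_length:   # all strings of this length are taken; move to the next length
            length += 1
            used = 0
            strings_at_length = alphabet_size ** length
        used += 1
        lengths.append(length)
    return lengths

lengths = optimal_nonsingular_lengths(10000)
# Code length tracks log_26(rank) up to a small additive constant, i.e. Zipf's law of abbreviation.
for rank in (1, 10, 100, 1000, 10000):
    print(rank, lengths[rank - 1], round(math.log(rank, 26) + 1, 2))
```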
Mango Tree Net – A fully convolutional network for semantic segmentation and individual crown detection of mango trees
Title | Mango Tree Net – A fully convolutional network for semantic segmentation and individual crown detection of mango trees |
Authors | Vikas Agaradahalli Gurumurthy, Ramesh Kestur, Omkar Narasipura |
Abstract | This work presents a method for semantic segmentation of mango trees in high-resolution aerial imagery and a novel method for individual crown detection of mango trees using the segmentation output. Mango Tree Net, a fully convolutional neural network (FCN), is trained using supervised learning to perform semantic segmentation of mango trees in imagery acquired by an unmanned aerial vehicle (UAV). The proposed network is retrained to separate touching/overlapping tree crowns in the segmentation output. Contour-based connected-object detection is then performed on the segmentation output of the retrained network, and bounding boxes are drawn on the original images using the coordinates of the connected objects to achieve individual crown detection. The training dataset consists of 8,824 image patches of size 240 x 240. The approach is tested on the segmentation and individual crown detection tasks using test datasets containing 36 and 4 images, respectively. Performance is analyzed using the standard metrics precision, recall, F1-score and accuracy. The results demonstrate the robustness of the proposed methods despite variations in factors such as scale, occlusion, lighting conditions and surrounding vegetation. |
Tasks | Object Detection, Semantic Segmentation |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.06915v1 |
PDF | https://arxiv.org/pdf/1907.06915v1.pdf |
PWC | https://paperswithcode.com/paper/mango-tree-net-a-fully-convolutional-network |
Repo | |
Framework | |
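The crown-detection step (contour-based connected-object detection on the binary segmentation mask, followed by bounding boxes) can be sketched with OpenCV. This is an illustrative re-implementation under assumed parameters (OpenCV 4, a minimum-area threshold), not the authors' code.

```python
import cv2
import numpy as np

def crowns_from_mask(mask, min_area=50):
    """Contour-based connected-object detection on a binary segmentation mask,
    returning one bounding box per detected crown."""
    mask = (mask > 0).astype(np.uint8) * 255
    # OpenCV >= 4 returns (contours, hierarchy).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for c in contours:
        if cv2.contourArea(c) >= min_area:    # drop tiny speckles
            boxes.append(cv2.boundingRect(c)) # (x, y, w, h)
    return boxes

# Dummy 240x240 mask with two "crowns"; in the paper the mask comes from the FCN.
mask = np.zeros((240, 240), np.uint8)
cv2.circle(mask, (60, 60), 25, 255, -1)
cv2.circle(mask, (170, 150), 30, 255, -1)
image = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)
for x, y, w, h in crowns_from_mask(mask):
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
```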
Memory-Sample Tradeoffs for Linear Regression with Small Error
Title | Memory-Sample Tradeoffs for Linear Regression with Small Error |
Authors | Vatsal Sharan, Aaron Sidford, Gregory Valiant |
Abstract | We consider the problem of performing linear regression over a stream of $d$-dimensional examples, and show that any algorithm that uses a subquadratic amount of memory exhibits a slower rate of convergence than can be achieved without memory constraints. Specifically, consider a sequence of labeled examples $(a_1,b_1), (a_2,b_2),\ldots,$ with $a_i$ drawn independently from a $d$-dimensional isotropic Gaussian, and where $b_i = \langle a_i, x\rangle + \eta_i,$ for a fixed $x \in \mathbb{R}^d$ with $\|x\|_2 = 1$ and with independent noise $\eta_i$ drawn uniformly from the interval $[-2^{-d/5},2^{-d/5}].$ We show that any algorithm with at most $d^2/4$ bits of memory requires at least $\Omega(d \log \log \frac{1}{\epsilon})$ samples to approximate $x$ to $\ell_2$ error $\epsilon$ with probability of success at least $2/3$, for $\epsilon$ sufficiently small as a function of $d$. In contrast, for such $\epsilon$, $x$ can be recovered to error $\epsilon$ with probability $1-o(1)$ with memory $O\left(d^2 \log(1/\epsilon)\right)$ using $d$ examples. This represents the first nontrivial lower bound for regression with super-linear memory, and may open the door to strong memory/sample tradeoffs for continuous optimization. |
Tasks | |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.08544v1 |
PDF | http://arxiv.org/pdf/1904.08544v1.pdf |
PWC | https://paperswithcode.com/paper/memory-sample-tradeoffs-for-linear-regression |
Repo | |
Framework | |
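For contrast with the memory-limited lower bound, the memory-rich baseline in the abstract is straightforward: store $d$ examples and solve the linear system. A NumPy illustration of the problem setup and that baseline; the dimension and random seed are arbitrary.

```python
import numpy as np

# Illustration of the unconstrained-memory baseline: with ~d examples
# (and O(d^2 log(1/eps)) bits to store them), x is recovered by solving a linear system.
rng = np.random.default_rng(0)
d = 50
x = rng.standard_normal(d)
x /= np.linalg.norm(x)                                 # ||x||_2 = 1

A = rng.standard_normal((d, d))                        # rows a_i ~ isotropic Gaussian
noise = rng.uniform(-2.0 ** (-d / 5), 2.0 ** (-d / 5), size=d)
b = A @ x + noise                                      # b_i = <a_i, x> + eta_i

x_hat = np.linalg.solve(A, b)                          # d examples suffice without memory limits
print(np.linalg.norm(x_hat - x))                       # small l2 error, limited only by the noise
```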
Ensemble Super-Resolution with A Reference Dataset
Title | Ensemble Super-Resolution with A Reference Dataset |
Authors | Junjun Jiang, Yi Yu, Zheng Wang, Suhua Tang, Ruimin Hu, Jiayi Ma |
Abstract | By developing sophisticated image priors or designing deep(er) architectures, a variety of image Super-Resolution (SR) approaches have been proposed recently and have achieved very promising performance. A natural question that arises is whether these methods can be reformulated into a unifying framework and whether this framework assists in SR reconstruction. In this paper, we present a simple but effective single-image SR method based on ensemble learning, which can produce better performance than could be obtained from any of the SR methods being ensembled (the component super-resolvers). Based on the assumption that a better component super-resolver should receive a larger ensemble weight when performing SR reconstruction, we present a Maximum A Posteriori (MAP) estimation framework for inferring the optimal ensemble weights. Specifically, we introduce a reference dataset, composed of High-Resolution (HR) and Low-Resolution (LR) image pairs, to measure the super-resolution abilities (prior knowledge) of the different component super-resolvers. To obtain the optimal ensemble weights, we incorporate the reconstruction constraint, which states that the degraded HR image should be equal to the LR observation, as well as the prior knowledge of the ensemble weights into the MAP estimation framework. Moreover, the resulting optimization problem admits an analytical solution. We study the performance of the proposed method by comparing it with competitive approaches, including four state-of-the-art non-deep-learning-based methods, four recent deep-learning-based methods and one ensemble-learning-based method, and demonstrate its effectiveness and superiority on three public datasets. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-05-12 |
URL | https://arxiv.org/abs/1905.04696v1 |
PDF | https://arxiv.org/pdf/1905.04696v1.pdf |
PWC | https://paperswithcode.com/paper/ensemble-super-resolution-with-a-reference |
Repo | |
Framework | |
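A heavily simplified sketch of the ensemble idea: fit weights for the component super-resolvers on a reference set and fuse their outputs. This drops the paper's reconstruction constraint and weight prior and replaces the MAP solution with plain least squares, so it only illustrates the workflow, not the proposed estimator.

```python
import numpy as np

def fit_ensemble_weights(component_outputs, hr_reference):
    """Least-squares fit of ensemble weights on a reference set (simplified: the paper's
    MAP formulation additionally uses a reconstruction constraint on the LR input and a
    prior on the weights; both are omitted here for brevity)."""
    X = np.stack([out.ravel() for out in component_outputs], axis=1)   # (n_pixels, K)
    y = hr_reference.ravel()
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w / w.sum()                                                 # normalise so weights sum to 1

def ensemble_sr(component_outputs, weights):
    return sum(w * out for w, out in zip(weights, component_outputs))

# Dummy data standing in for real super-resolver outputs on the reference images.
rng = np.random.default_rng(0)
hr = rng.random((32, 32))
outputs = [hr + 0.05 * rng.standard_normal(hr.shape) for _ in range(4)]  # 4 component super-resolvers
w = fit_ensemble_weights(outputs, hr)
fused = ensemble_sr(outputs, w)
print(w, np.abs(fused - hr).mean())   # the fused output is closer to HR than any single component
```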
Sub-linear RACE Sketches for Approximate Kernel Density Estimation on Streaming Data
Title | Sub-linear RACE Sketches for Approximate Kernel Density Estimation on Streaming Data |
Authors | Benjamin Coleman, Anshumali Shrivastava |
Abstract | Kernel density estimation is a simple and effective method that lies at the heart of many important machine learning applications. Unfortunately, kernel methods scale poorly for large, high dimensional datasets. Approximate kernel density estimation has a prohibitively high memory and computation cost, especially in the streaming setting. Recent sampling algorithms for high dimensional densities can reduce the computation cost but cannot operate online, while streaming algorithms cannot handle high dimensional datasets due to the curse of dimensionality. We propose RACE, an efficient sketching algorithm for kernel density estimation on high-dimensional streaming data. RACE compresses a set of N high dimensional vectors into a small array of integer counters. This array is sufficient to estimate the kernel density for a large class of kernels. Our sketch is practical to implement and comes with strong theoretical guarantees. We evaluate our method on real-world high-dimensional datasets and show that our sketch achieves 10x better compression compared to competing methods. |
Tasks | Density Estimation |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.02283v1 |
PDF | https://arxiv.org/pdf/1912.02283v1.pdf |
PWC | https://paperswithcode.com/paper/sub-linear-race-sketches-for-approximate |
Repo | |
Framework | |
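An illustrative RACE-style sketch: repeated LSH functions index rows of an integer counter array, insertion increments one counter per repetition, and a query averages the counters it lands on. Signed random projections are used here as the LSH family (so the implied kernel is angular), and all parameters are assumptions rather than the paper's settings.

```python
import numpy as np

class RACESketch:
    """Illustrative RACE-style sketch: an array of integer counters indexed by LSH codes."""
    def __init__(self, dim, repetitions=50, bits_per_hash=4, seed=0):
        rng = np.random.default_rng(seed)
        self.proj = rng.standard_normal((repetitions, bits_per_hash, dim))  # SRP hyperplanes
        self.counts = np.zeros((repetitions, 2 ** bits_per_hash), dtype=np.int64)
        self.n = 0

    def _codes(self, x):
        bits = (self.proj @ x > 0).astype(np.int64)        # (R, p) sign bits
        return bits @ (1 << np.arange(bits.shape[1]))      # pack the bits into one index per repetition

    def add(self, x):
        idx = self._codes(x)
        self.counts[np.arange(len(idx)), idx] += 1         # one counter increment per repetition
        self.n += 1

    def query(self, q):
        idx = self._codes(q)
        # Average collision frequency: a KDE estimate under the SRP collision kernel.
        return self.counts[np.arange(len(idx)), idx].mean() / self.n

rng = np.random.default_rng(1)
data = rng.standard_normal((1000, 32)) + 2.0   # a cluster away from the origin
sketch = RACESketch(dim=32)
for x in data:
    sketch.add(x)
print(sketch.query(data[0]))     # high: query near the data
print(sketch.query(-data[0]))    # low: query far from the data (roughly antipodal)
```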
The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation
Title | The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation |
Authors | Zhe Feng, David C. Parkes, Haifeng Xu |
Abstract | We study the behavior of stochastic bandit algorithms under \emph{strategic behavior} conducted by rational actors, i.e., the arms. Each arm is a strategic player who can modify its own reward whenever pulled, subject to a cross-period budget constraint. Each arm is \emph{self-interested} and seeks to maximize its own expected number of pulls over a decision horizon. Strategic manipulations naturally arise in various economic applications, e.g., recommendation systems such as Yelp and Amazon. We analyze the robustness of three popular bandit algorithms: UCB, $\varepsilon$-Greedy, and Thompson Sampling. We prove that all three algorithms achieve a regret upper bound of $\mathcal{O}(\max\{B, \ln T\})$ under \emph{any} (possibly adaptive) strategy of the strategic arms, where $B$ is the total budget across arms. Moreover, we prove that our regret upper bound is \emph{tight}. Our results illustrate the intrinsic robustness of bandit algorithms against strategic manipulation so long as $B=o(T)$. This is in sharp contrast to the more pessimistic model of adversarial attacks, where an attack budget of $\mathcal{O}(\ln T)$ can trick UCB and $\varepsilon$-Greedy into pulling the optimal arm only $o(T)$ times. Our results hold for both bounded and unbounded rewards. |
Tasks | Recommendation Systems |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01528v1 |
PDF | https://arxiv.org/pdf/1906.01528v1.pdf |
PWC | https://paperswithcode.com/paper/the-intrinsic-robustness-of-stochastic |
Repo | |
Framework | |
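For reference, here is a compact implementation of UCB1, one of the three algorithms analyzed. In the strategic setting each arm could inflate the rewards it returns subject to a total budget $B$, and the paper's bound says regret stays $\mathcal{O}(\max\{B, \ln T\})$; the instance below is a plain stochastic toy, not a strategic simulation.

```python
import numpy as np

def ucb(reward_fn, n_arms, horizon):
    """Standard UCB1. reward_fn(arm, t) returns the (possibly manipulated) reward in [0, 1]."""
    counts = np.zeros(n_arms)
    means = np.zeros(n_arms)
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1                                  # pull each arm once to initialise
        else:
            bonus = np.sqrt(2 * np.log(t) / counts)      # optimism bonus
            arm = int(np.argmax(means + bonus))
        r = reward_fn(arm, t)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]     # running mean update
    return counts

# Toy instance: arm 0 is best. A strategic arm would add to its own observed reward, up to budget B.
true_means = [0.7, 0.5, 0.3]
rng = np.random.default_rng(1)
pulls = ucb(lambda a, t: float(rng.random() < true_means[a]), n_arms=3, horizon=5000)
print(pulls)   # arm 0 dominates; with manipulation budget B = o(T), the O(max{B, ln T}) bound still applies
```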
Reduce Noise in Computed Tomography Image using Adaptive Gaussian Filter
Title | Reduce Noise in Computed Tomography Image using Adaptive Gaussian Filter |
Authors | Rini Mayasari, Nono Heryana |
Abstract | One image processing application that is very helpful for humans is improving image quality: poor image quality makes an image more difficult to interpret because the information it conveys is reduced. During the acquisition of medical images, the resulting image is degraded by external factors and by the medical equipment used. For this reason, image processing is needed to improve the quality of medical images, so that medical personnel can more easily analyse and interpret them, which in turn improves the quality of diagnosis. In this study, noise reduction with the Gaussian filter method is analysed as a way to improve medical image quality. The method is then applied to and tested on medical images, in this case lung images. The test image is corrupted with impulse (salt & pepper) and adaptive Gaussian noise, and the filter's performance is assessed qualitatively by visually comparing the filtered output, the noisy image, and the original image. |
Tasks | |
Published | 2019-02-05 |
URL | http://arxiv.org/abs/1902.05985v1 |
PDF | http://arxiv.org/pdf/1902.05985v1.pdf |
PWC | https://paperswithcode.com/paper/reduce-noise-in-computed-tomography-image |
Repo | |
Framework | |
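A small sketch of the experimental pipeline described above: corrupt a test image with salt & pepper plus Gaussian noise, then smooth it with a Gaussian filter. A plain (non-adaptive) Gaussian filter from SciPy stands in for the paper's adaptive variant, and a synthetic disc replaces the lung images used in the paper.

```python
import numpy as np
from scipy import ndimage

def add_noise(img, salt_pepper_frac=0.02, gauss_sigma=10.0, seed=0):
    """Corrupt an image with salt & pepper plus additive Gaussian noise."""
    rng = np.random.default_rng(seed)
    noisy = img.astype(float) + rng.normal(0, gauss_sigma, img.shape)
    mask = rng.random(img.shape)
    noisy[mask < salt_pepper_frac / 2] = 0           # pepper
    noisy[mask > 1 - salt_pepper_frac / 2] = 255     # salt
    return np.clip(noisy, 0, 255)

def gaussian_denoise(noisy, sigma=1.5):
    """Plain Gaussian smoothing as a simplified stand-in; the paper's adaptive variant
    would vary the kernel with local image statistics."""
    return ndimage.gaussian_filter(noisy, sigma=sigma)

# Synthetic test image (a bright disc); the paper's experiments used lung images.
yy, xx = np.mgrid[:128, :128]
img = (((xx - 64) ** 2 + (yy - 64) ** 2) < 40 ** 2) * 200.0
noisy = add_noise(img)
denoised = gaussian_denoise(noisy)
print(np.abs(noisy - img).mean(), np.abs(denoised - img).mean())  # smoothing lowers the mean absolute error
```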
Safe Reinforcement Learning on Autonomous Vehicles
Title | Safe Reinforcement Learning on Autonomous Vehicles |
Authors | David Isele, Alireza Nakhaei, Kikuo Fujimura |
Abstract | There have been numerous advances in reinforcement learning, but the typically unconstrained exploration of the learning process prevents the adoption of these methods in many safety-critical applications. Recent work in safe reinforcement learning uses idealized models to achieve its guarantees, but these models do not easily accommodate the stochasticity or high dimensionality of real-world systems. We investigate how prediction provides a general and intuitive framework to constrain exploration, and show how it can be used to safely learn intersection-handling behaviors on an autonomous vehicle. |
Tasks | Autonomous Vehicles |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1910.00399v1 |
PDF | https://arxiv.org/pdf/1910.00399v1.pdf |
PWC | https://paperswithcode.com/paper/safe-reinforcement-learning-on-autonomous |
Repo | |
Framework | |
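One generic way to use prediction to constrain exploration, in the spirit of the abstract (though not necessarily the authors' exact scheme), is to mask out actions that a predictive model flags as unsafe before the usual epsilon-greedy choice. A toy sketch with a hypothetical safety predictor follows; the fallback action and Q-table are illustrative.

```python
import numpy as np

def safe_action(q_values, predict_safe, state, actions, epsilon=0.1, rng=None):
    """Prediction-constrained exploration (generic sketch): a predictive model filters out
    actions forecast to lead to unsafe states, and epsilon-greedy exploration is restricted
    to the remaining safe set."""
    rng = rng or np.random.default_rng()
    safe = [a for a in actions if predict_safe(state, a)]
    if not safe:                                  # fall back to a designated safe default (e.g. brake)
        return actions[0]
    if rng.random() < epsilon:
        return safe[rng.integers(len(safe))]      # explore, but only among safe actions
    return max(safe, key=lambda a: q_values[state][a])

# Toy intersection: action 1 is predicted unsafe in state 0, so it is never explored there.
q = {0: {0: 0.2, 1: 0.9, 2: 0.5}}
predict_safe = lambda s, a: not (s == 0 and a == 1)   # hypothetical learned/physics-based predictor
print(safe_action(q, predict_safe, state=0, actions=[0, 1, 2]))   # picks 2, the best safe action
```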
Data-driven sensor scheduling for remote estimation in wireless networks
Title | Data-driven sensor scheduling for remote estimation in wireless networks |
Authors | Marcos M. Vasconcelos, Urbashi Mitra |
Abstract | Sensor scheduling is a well-studied problem in signal processing and control with numerous applications. Despite this successful history, most of the related literature assumes knowledge of the underlying probabilistic model of the sensor measurements, such as the correlation structure or the entire joint probability density function. Herein, a framework for sensor scheduling for remote estimation is introduced in which the system design and the scheduling decisions are based solely on observed data. Unicast and broadcast networks and the corresponding receivers are considered. In both cases, the empirical risk minimization can be posed as a difference-of-convex optimization problem, and locally optimal solutions are obtained efficiently by applying the convex-concave procedure. Our results are independent of the data’s probability density function, correlation structure and the number of sensors. |
Tasks | |
Published | 2019-12-05 |
URL | https://arxiv.org/abs/1912.02411v1 |
PDF | https://arxiv.org/pdf/1912.02411v1.pdf |
PWC | https://paperswithcode.com/paper/data-driven-sensor-scheduling-for-remote |
Repo | |
Framework | |
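The convex-concave procedure the abstract relies on solves a difference-of-convex problem by repeatedly linearising the concave part and minimising the resulting convex surrogate. Below is a generic sketch on a scalar toy objective (not the paper's empirical risk), assuming SciPy for the inner solve.

```python
import numpy as np
from scipy.optimize import minimize

def convex_concave_procedure(g, h, grad_h, x0, iters=50):
    """Generic convex-concave procedure for f = g - h with g, h convex:
    at each step, -h is replaced by its linearization at the current point
    and the resulting convex surrogate is minimised."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    for _ in range(iters):
        x_k = x.copy()
        surrogate = lambda z: g(z) - (h(x_k) + grad_h(x_k) @ (z - x_k))
        x = minimize(surrogate, x).x
    return x

# Toy DC objective f(x) = x^4 - x^2 = g(x) - h(x), with local minima at +/- 1/sqrt(2).
g = lambda z: float(np.sum(z ** 4))
h = lambda z: float(np.sum(z ** 2))
grad_h = lambda z: 2 * z
print(convex_concave_procedure(g, h, grad_h, x0=[0.3]))   # converges to about 0.707
```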
Approaching Adaptation Guided Retrieval in Case-Based Reasoning through Inference in Undirected Graphical Models
Title | Approaching Adaptation Guided Retrieval in Case-Based Reasoning through Inference in Undirected Graphical Models |
Authors | Luigi Portinale |
Abstract | In Case-Based Reasoning, when the similarity assumption does not hold, retrieving a set of cases structurally similar to the query does not guarantee obtaining a reusable or revisable solution. Knowledge about the adaptability of solutions has to be exploited in order to define a method for adaptation-guided retrieval. We propose a novel approach to this problem in which knowledge about the adaptability of the solutions is captured inside a metric Markov Random Field (MRF). Nodes of the MRF represent cases, and edges connect nodes whose solutions are close in the solution space. The states of the nodes represent different adaptation levels with respect to the potential query. Metric-based potentials push connected nodes to share the same state, since cases with similar solutions should have the same adaptability level with respect to the query. The main goal is to enlarge the set of potentially adaptable cases that are retrieved without significantly sacrificing the precision and accuracy of retrieval. We report on experiments with a retrieval architecture in which a simple kNN retrieval (on the problem description) is followed by a further retrieval step based on MRF inference. |
Tasks | |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12464v1 |
PDF | https://arxiv.org/pdf/1905.12464v1.pdf |
PWC | https://paperswithcode.com/paper/approaching-adaptation-guided-retrieval-in |
Repo | |
Framework | |
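A toy stand-in for the MRF inference step: cases are nodes, states are adaptation levels, and a metric pairwise term pulls cases with similar solutions toward the same level. Iterated conditional modes is used here purely for illustration; the potentials, weights and inference method are assumptions, not the paper's implementation.

```python
import numpy as np

def icm_adaptation_levels(unary, edges, weights, n_iters=20, lam=1.0):
    """Toy inference (iterated conditional modes) on a metric MRF over cases.
    unary:   (n_cases, n_levels) costs from description similarity to the query
    edges:   list of (i, j) pairs whose solutions are close in the solution space
    weights: edge weights (solution similarity); the metric term is |s_i - s_j|
    """
    n, k = unary.shape
    states = unary.argmin(axis=1)
    for _ in range(n_iters):
        for i in range(n):
            cost = unary[i].copy()
            for (a, b), w in zip(edges, weights):
                j = b if a == i else (a if b == i else None)
                if j is not None:
                    cost += lam * w * np.abs(np.arange(k) - states[j])  # metric pairwise penalty
            states[i] = cost.argmin()
    return states

# 4 cases, 3 adaptation levels; cases (0,1) and (2,3) have close solutions.
unary = np.array([[0.1, 0.9, 1.5], [0.8, 0.3, 1.2], [1.4, 0.9, 0.2], [1.3, 0.7, 0.4]], float)
edges, weights = [(0, 1), (2, 3)], [1.0, 1.0]
print(icm_adaptation_levels(unary, edges, weights))   # connected cases end up sharing a level
```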
Device-Free User Authentication, Activity Classification and Tracking using Passive Wi-Fi Sensing: A Deep Learning Based Approach
Title | Device-Free User Authentication, Activity Classification and Tracking using Passive Wi-Fi Sensing: A Deep Learning Based Approach |
Authors | Vinoj Jayasundara, Hirunima Jayasekara, Tharaka Samarasinghe, Kasun T. Hemachandra |
Abstract | Privacy issues related to video camera feeds have led to a growing need for alternatives that provide functionalities such as user authentication, activity classification and tracking in a noninvasive manner. Existing infrastructure makes Wi-Fi a possible candidate, yet utilizing traditional signal processing methods to extract the information necessary to fully characterize an event by sensing weak ambient Wi-Fi signals is challenging. This paper introduces a novel end-to-end deep learning framework that simultaneously predicts the identity, activity and location of a user to create user profiles similar to the information provided by a video camera. The system is fully autonomous and requires zero user intervention, unlike systems that require user-initiated initialization or a user-held transmitting device to facilitate the prediction. The system can also predict the trajectory of the user by predicting the user's location over consecutive time steps. The performance of the system is evaluated through experiments. |
Tasks | |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11743v1 |
PDF | https://arxiv.org/pdf/1911.11743v1.pdf |
PWC | https://paperswithcode.com/paper/device-free-user-authentication-activity |
Repo | |
Framework | |
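A sketch of the multi-task architecture the abstract implies: a shared backbone over a window of Wi-Fi channel measurements with separate heads for identity, activity and location (the per-step location outputs give a trajectory). Written in PyTorch with entirely assumed input shapes, layer sizes and losses; it is not the authors' network.

```python
import torch
import torch.nn as nn

class WiFiProfileNet(nn.Module):
    """Sketch: shared recurrent backbone over CSI windows, three task heads."""
    def __init__(self, csi_dim=90, n_users=10, n_activities=6, hidden=128):
        super().__init__()
        self.backbone = nn.LSTM(csi_dim, hidden, batch_first=True)
        self.id_head = nn.Linear(hidden, n_users)        # who
        self.act_head = nn.Linear(hidden, n_activities)  # what
        self.loc_head = nn.Linear(hidden, 2)             # where (x, y); per-step outputs form a trajectory

    def forward(self, csi):                              # csi: (batch, seq_len, csi_dim)
        h, _ = self.backbone(csi)
        last = h[:, -1]
        return self.id_head(last), self.act_head(last), self.loc_head(h)

model = WiFiProfileNet()
csi = torch.randn(4, 100, 90)                            # dummy CSI windows (assumed shape)
user_logits, activity_logits, trajectory = model(csi)
loss = (nn.functional.cross_entropy(user_logits, torch.randint(0, 10, (4,)))
        + nn.functional.cross_entropy(activity_logits, torch.randint(0, 6, (4,)))
        + nn.functional.mse_loss(trajectory, torch.randn(4, 100, 2)))
loss.backward()
```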
The Role of Local Intrinsic Dimensionality in Benchmarking Nearest Neighbor Search
Title | The Role of Local Intrinsic Dimensionality in Benchmarking Nearest Neighbor Search |
Authors | Martin Aumüller, Matteo Ceccarello |
Abstract | This paper reconsiders common benchmarking approaches to nearest neighbor search. It is shown that the concept of local intrinsic dimensionality (LID) makes it possible to choose query sets spanning a wide range of difficulty for real-world datasets. Moreover, the effect of different LID distributions on the running-time performance of implementations is studied empirically. To this end, different visualization concepts are introduced that give a more fine-grained overview of the inner workings of nearest neighbor search principles. The paper closes with remarks about the diversity of datasets commonly used for nearest neighbor search benchmarking: it is shown that such real-world datasets are not diverse, and results on a single dataset predict results on all other datasets well. |
Tasks | |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07387v1 |
PDF | https://arxiv.org/pdf/1907.07387v1.pdf |
PWC | https://paperswithcode.com/paper/the-role-of-local-intrinsic-dimensionality-in |
Repo | |
Framework | |
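The LID of a query can be estimated from its nearest-neighbour distances with a commonly used maximum-likelihood estimator, LID(q) = -((1/k) * sum_i ln(r_i / r_k))^{-1}, where r_1 <= ... <= r_k are the distances to the k nearest neighbours. A NumPy sketch, with k and the synthetic data chosen purely for illustration.

```python
import numpy as np

def lid_mle(query, data, k=20):
    """Maximum-likelihood LID estimate at a query point from its k nearest neighbours."""
    dists = np.sort(np.linalg.norm(data - query, axis=1))
    r = dists[dists > 0][:k]                    # drop the query itself if it is in the dataset
    return -1.0 / np.mean(np.log(r / r[-1]))

rng = np.random.default_rng(0)
ambient, intrinsic = 50, 5
basis = rng.standard_normal((intrinsic, ambient))
data = rng.standard_normal((5000, intrinsic)) @ basis   # 5-dimensional subspace embedded in 50-d space
print(lid_mle(data[0], data))                           # roughly 5, far below the ambient dimension of 50
```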
Counting and Segmenting Sorghum Heads
Title | Counting and Segmenting Sorghum Heads |
Authors | Min-hwan Oh, Peder Olsen, Karthikeyan Natesan Ramamurthy |
Abstract | Phenotyping is the process of measuring an organism’s observable traits. Manual phenotyping of crops is a labor-intensive, time-consuming, costly, and error-prone process. Accurate, automated, high-throughput phenotyping can relieve a huge burden in the crop breeding pipeline. In this paper, we propose a scalable, high-throughput approach to automatically count and segment panicles (heads), a key phenotype, from aerial sorghum crop imagery. Our counting approach uses the image density map obtained from dot or region annotations as the regression target for a novel deep convolutional neural network architecture. We also propose a novel instance segmentation algorithm that uses the estimated density map to identify individual panicles in the presence of occlusion. On real aerial sorghum images, we obtain a mean absolute error (MAE) of 1.06 for counting, which is better than well-known crowd counting approaches such as the CCNN, MCNN and CSRNet models. The instance segmentation model also produces respectable results, which will ultimately be useful in reducing the manual annotation workload for future data. |
Tasks | Crowd Counting, Instance Segmentation, Semantic Segmentation |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13291v1 |
PDF | https://arxiv.org/pdf/1905.13291v1.pdf |
PWC | https://paperswithcode.com/paper/counting-and-segmenting-sorghum-heads |
Repo | |
Framework | |
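Density-map counting of the kind described above regresses a map built by placing a Gaussian at every dot annotation, so that the map's integral equals the object count. A short sketch of constructing such a target and reading off the count; the smoothing sigma and the dummy annotations are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map_from_dots(dots, shape, sigma=4.0):
    """Build the counting target used in density-based approaches: place a unit impulse at
    each annotated panicle centre and smooth with a Gaussian, so the map sums to the count."""
    impulses = np.zeros(shape, dtype=float)
    for y, x in dots:
        impulses[y, x] += 1.0
    return gaussian_filter(impulses, sigma=sigma)

dots = [(30, 40), (31, 44), (80, 120), (100, 20)]   # dummy dot annotations for 4 panicles
density = density_map_from_dots(dots, shape=(128, 160))
print(round(density.sum(), 2))                      # ~4.0: the predicted count is the map's integral
```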