April 2, 2020


Paper Group ANR 345


Detecting Mismatch between Text Script and Voice-over Using Utterance Verification Based on Phoneme Recognition Ranking


Title	Detecting Mismatch between Text Script and Voice-over Using Utterance Verification Based on Phoneme Recognition Ranking
Authors	Yoonjae Jeong, Hoon-Young Cho
Abstract	The purpose of this study is to detect mismatches between a text script and its voice-over. For this, we present a novel utterance verification (UV) method, which calculates the degree of correspondence between a voice-over and the phoneme sequence of a script. We found that the phoneme recognition probabilities of exaggerated voice-overs decrease compared to ordinary utterances, but their rankings do not demonstrate any significant change. The proposed method, therefore, uses the recognition ranking of each phoneme segment corresponding to a phoneme sequence to measure the confidence of a voice-over utterance for its corresponding script. The experimental results show that the proposed UV method outperforms a state-of-the-art approach that uses cross-modal attention to detect mismatches between speech and transcription.
Tasks
Published	2020-03-20
URL	https://arxiv.org/abs/2003.09180v1
PDF	https://arxiv.org/pdf/2003.09180v1.pdf
PWC	https://paperswithcode.com/paper/detecting-mismatch-between-text-script-and
Repo
Framework
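
The ranking idea in the abstract above can be illustrated with a minimal sketch. It assumes each phoneme segment comes with a dictionary of recognition scores and that the script supplies one expected phoneme per segment. The function names, the toy posteriors, and the mean-reciprocal-rank aggregation are hypothetical choices made for illustration; the abstract does not specify how the authors aggregate the rankings.

```python
def phoneme_rank(posteriors, expected):
    """Return the 1-based rank of `expected` among the scores of one segment."""
    target = posteriors[expected]
    # Rank = 1 + number of phonemes scored strictly higher than the expected one.
    return 1 + sum(1 for score in posteriors.values() if score > target)


def rank_confidence(segments, script_phonemes):
    """Mean reciprocal rank of the script phonemes (an assumed aggregation)."""
    if len(segments) != len(script_phonemes):
        raise ValueError("one posterior dict is needed per script phoneme")
    ranks = [phoneme_rank(seg, ph) for seg, ph in zip(segments, script_phonemes)]
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy posteriors for a two-phoneme script; the numbers are made up.
ordinary = [{"k": 0.7, "g": 0.2, "t": 0.1}, {"a": 0.6, "o": 0.3, "e": 0.1}]
# Flatter posteriors: lower probability for the correct phoneme, same ordering.
exaggerated = [{"k": 0.4, "g": 0.35, "t": 0.25}, {"a": 0.38, "o": 0.34, "e": 0.28}]

matched = rank_confidence(ordinary, ["k", "a"])
matched_exaggerated = rank_confidence(exaggerated, ["k", "a"])
mismatched = rank_confidence(ordinary, ["t", "e"])
```

In this toy setup the exaggerated utterance has lower probabilities for the correct phonemes but the same rank-based score as the ordinary one, while the mismatched script scores lower. That is the property the abstract attributes to rankings.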

A review of machine learning applications in wildfire science and management


Title	A review of machine learning applications in wildfire science and management
Authors	Piyush Jain, Sean C P Coogan, Sriram Ganapathi Subramanian, Mark Crowley, Steve Taylor, Mike D Flannigan
Abstract	Artificial intelligence has been applied in wildfire science and management since the 1990s, with early applications including neural networks and expert systems. Since then the field has rapidly progressed congruently with the wide adoption of machine learning (ML) in the environmental sciences. Here, we present a scoping review of ML in wildfire science and management. Our objective is to improve awareness of ML among wildfire scientists and managers, as well as illustrate the challenging range of problems in wildfire science available to data scientists. We first present an overview of popular ML approaches used in wildfire science to date, and then review their use in wildfire science within six problem domains: 1) fuels characterization, fire detection, and mapping; 2) fire weather and climate change; 3) fire occurrence, susceptibility, and risk; 4) fire behavior prediction; 5) fire effects; and 6) fire management. We also discuss the advantages and limitations of various ML approaches and identify opportunities for future advances in wildfire science and management within a data science context. We identified 298 relevant publications, where the most frequently used ML methods included random forests, MaxEnt, artificial neural networks, decision trees, support vector machines, and genetic algorithms. There exist opportunities to apply more current ML methods (e.g., deep learning and agent-based learning) in wildfire science. However, despite the ability of ML models to learn on their own, expertise in wildfire science is necessary to ensure realistic modelling of fire processes across multiple scales, while the complexity of some ML methods requires sophisticated knowledge for their application. Finally, we stress that the wildfire research and management community plays an active role in providing relevant, high-quality data for use by practitioners of ML methods.
Tasks
Published	2020-03-02
URL	https://arxiv.org/abs/2003.00646v1
PDF	https://arxiv.org/pdf/2003.00646v1.pdf
PWC	https://paperswithcode.com/paper/a-review-of-machine-learning-applications-in-1
Repo
Framework

Investigating the Decoders of Maximum Likelihood Sequence Models: A Look-ahead Approach


Title	Investigating the Decoders of Maximum Likelihood Sequence Models: A Look-ahead Approach
Authors	Yu-Siang Wang, Yen-Ling Kuo, Boris Katz
Abstract	We demonstrate how we can practically incorporate multi-step future information into a decoder of maximum likelihood sequence models. We propose a “k-step look-ahead” module to consider the likelihood information of a rollout up to k steps. Unlike other approaches that need to train another value network to evaluate the rollouts, we can directly apply this look-ahead module to improve the decoding of any sequence model trained in a maximum likelihood framework. We evaluate our look-ahead module on three datasets of varying difficulties: IM2LATEX-100k OCR image to LaTeX, WMT16 multimodal machine translation, and WMT14 machine translation. Our look-ahead module improves performance on the simpler datasets such as IM2LATEX-100k and WMT16 multimodal machine translation. However, the improvement on the more difficult dataset (e.g., containing longer sequences), WMT14 machine translation, becomes marginal. Our further investigation using the k-step look-ahead suggests that the more difficult tasks suffer from an overestimated EOS (end-of-sentence) probability. We argue that the overestimated EOS probability also causes the decreased performance of beam search when increasing its beam width. We tackle the EOS problem by integrating an auxiliary EOS loss into the training to estimate whether the model should emit EOS or other words. Our experiments show that improving EOS estimation increases not only the performance of our proposed look-ahead module but also the robustness of the beam search.
Tasks	Machine Translation, Multimodal Machine Translation, Optical Character Recognition
Published	2020-03-08
URL	https://arxiv.org/abs/2003.03716v1
PDF	https://arxiv.org/pdf/2003.03716v1.pdf
PWC	https://paperswithcode.com/paper/investigating-the-decoders-of-maximum
Repo
Framework
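
The k-step look-ahead described in the abstract above can be sketched on a toy model. This is a sketch under stated assumptions, not the authors' implementation: the lookup-table model `TOY_MODEL`, the function names, and the exhaustive search over rollouts are all hypothetical and chosen only to keep the example self-contained.

```python
import math

# Hypothetical toy model: next-token probabilities keyed by the prefix so far.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"x": 0.9, "y": 0.1},
}


def next_log_probs(prefix):
    """Log-probabilities of the next token; empty once the toy model runs out."""
    return {tok: math.log(p) for tok, p in TOY_MODEL.get(tuple(prefix), {}).items()}


def best_rollout(prefix, k):
    """Best log-likelihood over rollouts of k steps (fewer if no tokens remain)."""
    options = next_log_probs(prefix)
    if k == 0 or not options:
        return 0.0
    return max(lp + best_rollout(prefix + [tok], k - 1) for tok, lp in options.items())


def look_ahead_step(prefix, k):
    """Pick the next token by its own log-prob plus the best k-step rollout."""
    options = next_log_probs(prefix)
    return max(options, key=lambda tok: options[tok] + best_rollout(prefix + [tok], k))


greedy_choice = look_ahead_step([], k=0)      # plain greedy decoding
look_ahead_choice = look_ahead_step([], k=1)  # one step of look-ahead
```

With `k=0` the step reduces to greedy decoding and picks `a` (0.6 against 0.4). With `k=1` it picks `b`, because the best two-token sequence starting with `b` has probability 0.36 against 0.30 for `a`. No extra value network is trained; only the model's own likelihoods are used.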

Bayesian nonparametric shared multi-sequence time series segmentation


Title	Bayesian nonparametric shared multi-sequence time series segmentation
Authors	Olga Mikheeva, Ieva Kazlauskaite, Hedvig Kjellström, Carl Henrik Ek
Abstract	In this paper, we introduce a method for segmenting time series data using tools from Bayesian nonparametrics. We consider the task of temporal segmentation of a set of time series data into representative stationary segments. We use Gaussian process (GP) priors to impose our knowledge about the characteristics of the underlying stationary segments, and use a nonparametric distribution to partition the sequences into such segments, formulated in terms of a prior distribution on segment length. Given the segmentation, the model can be viewed as a variant of a Gaussian mixture model where the mixture components are described using the covariance function of a GP. We demonstrate the effectiveness of our model on synthetic data as well as on real time-series data of heartbeats where the task is to segment the indicative types of beats and to classify the heartbeat recordings into classes that correspond to healthy and abnormal heart sounds.
Tasks	Time Series
Published	2020-01-27
URL	https://arxiv.org/abs/2001.09886v1
PDF	https://arxiv.org/pdf/2001.09886v1.pdf
PWC	https://paperswithcode.com/paper/bayesian-nonparametric-shared-multi-sequence
Repo
Framework

Machine Learning for a Music Glove Instrument


Title	Machine Learning for a Music Glove Instrument
Authors	Joseph Bakarji
Abstract	A music glove instrument equipped with force-sensitive, flex, and IMU sensors is trained on an electric piano to learn note sequences based on a time series of sensor inputs. Once trained, the glove is used on any surface to generate the sequence of notes most closely related to the hand motion. The data is collected manually by a performer wearing the glove and playing on an electric keyboard. The feature space is designed to account for key hand motions, such as the thumb-under movement. Logistic regression and Bayesian belief networks are used to learn the transition probabilities from one note to another. This work demonstrates a data-driven approach for digital musical instruments in general.
Tasks	Time Series
Published	2020-01-27
URL	https://arxiv.org/abs/2001.09551v1
PDF	https://arxiv.org/pdf/2001.09551v1.pdf
PWC	https://paperswithcode.com/paper/machine-learning-for-a-music-glove-instrument
Repo
Framework

Tighter Bound Estimation of Sensitivity Analysis for Incremental and Decremental Data Modification


Title	Tighter Bound Estimation of Sensitivity Analysis for Incremental and Decremental Data Modification
Authors	Rui Zhou
Abstract	In large-scale classification problems, the data set may face frequent updates, e.g., a small fraction of the data is added to or removed from the original data set. In this case, incremental learning, which updates an existing classifier by explicitly modeling the data modification, is more efficient than retraining a new classifier from scratch. Conventional incremental learning algorithms try to solve the problem exactly. However, for some tasks, we are only interested in the lower and upper bounds of some values relevant to the coefficient vector of the updated classifier without actually solving for it, e.g., determining whether we should update the classifier or performing some sensitivity analysis tasks. To deal with such tasks, we propose an algorithm to make rational inferences about the updated classifier with low computational complexity. Specifically, we present a method to calculate tighter bounds of a general linear score for the updated classifier, so that the range of interest is estimated more accurately than in existing work. The proposed method can be applied to any linear classifier with a differentiable convex loss function and L2 regularization. Both theoretical analysis and experimental results show that the proposed approach is superior to existing methods.
Tasks	L2 Regularization
Published	2020-03-06
URL	https://arxiv.org/abs/2003.03351v1
PDF	https://arxiv.org/pdf/2003.03351v1.pdf
PWC	https://paperswithcode.com/paper/tighter-bound-estimation-of-sensitivity
Repo
Framework

MobilePose: Real-Time Pose Estimation for Unseen Objects with Weak Shape Supervision


Title	MobilePose: Real-Time Pose Estimation for Unseen Objects with Weak Shape Supervision
Authors	Tingbo Hou, Adel Ahmadyan, Liangkai Zhang, Jianing Wei, Matthias Grundmann
Abstract	In this paper, we address the problem of detecting unseen objects from RGB images and estimating their poses in 3D. We propose two mobile-friendly networks: MobilePose-Base and MobilePose-Shape. The former is used when there is only pose supervision, and the latter is for the case when shape supervision is available, even a weak one. We revisit shape features used in previous methods, including segmentation and coordinate map. We explain when and why pixel-level shape supervision can improve pose estimation. Consequently, we add shape prediction as an intermediate layer in MobilePose-Shape, and let the network learn pose from shape. Our models are trained on mixed real and synthetic data, with weak and noisy shape supervision. They are ultra-lightweight and can run in real time on modern mobile devices (e.g. 36 FPS on Galaxy S20). Compared with previous single-shot solutions, our method has higher accuracy while using a significantly smaller model (2~3% in model size or number of parameters).
Tasks	Pose Estimation
Published	2020-03-07
URL	https://arxiv.org/abs/2003.03522v1
PDF	https://arxiv.org/pdf/2003.03522v1.pdf
PWC	https://paperswithcode.com/paper/mobilepose-real-time-pose-estimation-for
Repo
Framework

Holographic Image Sensing


Title	Holographic Image Sensing
Authors	Alfred Marcel Bruckstein, Martianus Frederic Ezerman, Adamas Aqsa Fahreza, San Ling
Abstract	Holographic representations of data enable distributed storage with progressive refinement when the stored packets of data are made available in any arbitrary order. In this paper, we propose and test patch-based transform coding holographic sensing of image data. Our proposal is optimized for progressive recovery under random order of retrieval of the stored data. The coding of the image patches relies on the design of distributed projections ensuring best image recovery, in terms of the $\ell_2$ norm, at each retrieval stage. The performance depends only on the number of data packets that have been retrieved thus far. Several possible options to enhance the quality of the recovery while changing the size and number of data packets are discussed and tested. This leads us to examine several interesting bit-allocation and rate-distortion trade-offs, highlighted for a set of natural images with ensemble-estimated statistical properties.
Tasks
Published	2020-02-09
URL	https://arxiv.org/abs/2002.03314v2
PDF	https://arxiv.org/pdf/2002.03314v2.pdf
PWC	https://paperswithcode.com/paper/holographic-image-sensing
Repo
Framework

Signaling in Bayesian Network Congestion Games: the Subtle Power of Symmetry


Title	Signaling in Bayesian Network Congestion Games: the Subtle Power of Symmetry
Authors	Matteo Castiglioni, Andrea Celli, Alberto Marchesi, Nicola Gatti
Abstract	Network congestion games are a well-understood model of multi-agent strategic interactions. Despite their ubiquitous applications, it is not clear whether it is possible to design information structures to ameliorate the overall experience of the network users. We focus on Bayesian games with atomic players, where network vagaries are modeled via a (random) state of nature which determines the costs incurred by the players. A third-party entity—the sender—can observe the realized state of the network and exploit this additional information to send a signal to each player. A natural question is the following: is it possible for an informed sender to reduce the overall social cost via the strategic provision of information to players who update their beliefs rationally? The paper focuses on the problem of computing optimal ex ante persuasive signaling schemes, showing that symmetry is a crucial property for its solution. Indeed, we show that an optimal ex ante persuasive signaling scheme can be computed in polynomial time when players are symmetric and have affine cost functions. Moreover, the problem becomes NP-hard when players are asymmetric, even in non-Bayesian settings.
Tasks
Published	2020-02-12
URL	https://arxiv.org/abs/2002.05190v1
PDF	https://arxiv.org/pdf/2002.05190v1.pdf
PWC	https://paperswithcode.com/paper/signaling-in-bayesian-network-congestion
Repo
Framework

PointTrackNet: An End-to-End Network For 3-D Object Detection and Tracking From Point Clouds


Title	PointTrackNet: An End-to-End Network For 3-D Object Detection and Tracking From Point Clouds
Authors	Sukai Wang, Yuxiang Sun, Chengju Liu, Ming Liu
Abstract	Recent machine learning-based multi-object tracking (MOT) frameworks are becoming popular for 3-D point clouds. Most traditional tracking approaches use filters (e.g., Kalman filter or particle filter) to predict object locations in a time sequence; however, they are vulnerable to extreme motion conditions, such as sudden braking and turning. In this letter, we propose PointTrackNet, an end-to-end 3-D object detection and tracking network, to generate foreground masks, 3-D bounding boxes, and point-wise tracking association displacements for each detected object. The network takes only two adjacent point-cloud frames as input. Experimental results on the KITTI tracking dataset are competitive with the state of the art, especially in irregularly and rapidly changing scenarios.
Tasks	Multi-Object Tracking, Object Detection, Object Tracking
Published	2020-02-26
URL	https://arxiv.org/abs/2002.11559v1
PDF	https://arxiv.org/pdf/2002.11559v1.pdf
PWC	https://paperswithcode.com/paper/pointtracknet-an-end-to-end-network-for-3-d
Repo
Framework

Improving Irregularly Sampled Time Series Learning with Dense Descriptors of Time


Title	Improving Irregularly Sampled Time Series Learning with Dense Descriptors of Time
Authors	Rafael T. Sousa, Lucas A. Pereira, Anderson S. Soares
Abstract	Supervised learning with irregularly sampled time series has been a challenge for machine learning methods due to the obstacle of dealing with irregular time intervals. Some recent papers introduced recurrent neural network models that deal with irregularity, but most of them rely on complex mechanisms to achieve better performance. This work proposes a novel method to represent timestamps (hours or dates) as dense vectors using sinusoidal functions, called Time Embeddings. As a data input method, it can be applied to most machine learning models. The method was evaluated on two predictive tasks from MIMIC III, a dataset of irregularly sampled time series of electronic health records. Our tests showed an improvement for LSTM-based and classical machine learning models, especially with very irregular data.
Tasks	Time Series
Published	2020-03-20
URL	https://arxiv.org/abs/2003.09291v1
PDF	https://arxiv.org/pdf/2003.09291v1.pdf
PWC	https://paperswithcode.com/paper/improving-irregularly-sampled-time-series
Repo
Framework
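
Representing a timestamp as a dense vector of sinusoids, as the abstract above describes, can be sketched as follows. The abstract does not give the formula, so this sketch borrows the sinusoidal positional-encoding scheme commonly used in Transformer models as a stand-in. The function name `time_embedding`, the parameters `dim` and `max_period`, and the sample timestamps are assumptions for illustration and may differ from the paper's Time Embeddings.

```python
import math


def time_embedding(t, dim=8, max_period=10000.0):
    """Map a scalar timestamp t (e.g. hours since admission) to a dense vector."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    vec = []
    for i in range(dim // 2):
        # Frequencies decay geometrically from 1 down towards 1 / max_period.
        freq = 1.0 / (max_period ** (2 * i / dim))
        vec.append(math.sin(t * freq))
        vec.append(math.cos(t * freq))
    return vec


# Irregularly spaced observation times, in hours (made-up values).
timestamps = [0.0, 0.5, 7.25, 30.0]
embeddings = [time_embedding(t) for t in timestamps]
```

Each observation's embedding can then be concatenated with its measured features before being passed to an LSTM or a classical model, so the model sees when a value was recorded without needing a fixed sampling interval.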

BrazilDAM: A Benchmark dataset for Tailings Dam Detection


Title	BrazilDAM: A Benchmark dataset for Tailings Dam Detection
Authors	Edemir Ferreira, Matheus Brito, Remis Balaniuk, Mário S. Alvim, Jefersson A. dos Santos
Abstract	In this work we present BrazilDAM, a novel public dataset based on Sentinel-2 and Landsat-8 satellite images covering all tailings dams cataloged by the Brazilian National Mining Agency (ANM). The dataset was built using georeferenced images from 769 dams, recorded between 2016 and 2019. The time series were processed in order to produce cloud-free images. The dams contain mining waste from different ore categories and have highly varying shapes, areas and volumes, making BrazilDAM particularly interesting and challenging for use in machine learning benchmarks. The original catalog contains, besides the dam coordinates, information about the main ore, constructive method, risk category, and associated potential damage. To evaluate BrazilDAM’s predictive potential we performed classification experiments using state-of-the-art deep Convolutional Neural Networks (CNNs). In the experiments, we achieved an average classification accuracy of 94.11% in the tailings dam binary classification task. In addition, four other experimental setups were run using the complementary information from the original catalog, exhaustively exploiting the capacity of the proposed dataset.
Tasks	Time Series
Published	2020-03-17
URL	https://arxiv.org/abs/2003.07948v1
PDF	https://arxiv.org/pdf/2003.07948v1.pdf
PWC	https://paperswithcode.com/paper/brazildam-a-benchmark-dataset-for-tailings
Repo
Framework

Do Deep Minds Think Alike? Selective Adversarial Attacks for Fine-Grained Manipulation of Multiple Deep Neural Networks


Title	Do Deep Minds Think Alike? Selective Adversarial Attacks for Fine-Grained Manipulation of Multiple Deep Neural Networks
Authors	Zain Khan, Jirong Yi, Raghu Mudumbai, Xiaodong Wu, Weiyu Xu
Abstract	Recent works have demonstrated the existence of adversarial examples targeting a single machine learning system. In this paper we ask a simple but fundamental question of “selective fooling”: given multiple machine learning systems assigned to solve the same classification problem and taking the same input signal, is it possible to construct a perturbation to the input signal that manipulates the outputs of these multiple machine learning systems simultaneously in arbitrary pre-defined ways? For example, is it possible to selectively fool a set of “enemy” machine learning systems without fooling the other “friend” machine learning systems? The answer to this question depends on the extent to which these different machine learning systems “think alike”. We formulate the problem of “selective fooling” as a novel optimization problem, and report on a series of experiments on the MNIST dataset. Our preliminary findings from these experiments show that it is in fact very easy to selectively manipulate multiple MNIST classifiers simultaneously, even when the classifiers are identical in their architectures, training algorithms and training datasets except for random initialization during training. This suggests that two nominally equivalent machine learning systems do not in fact “think alike” at all, and opens the possibility for many novel applications and deeper understandings of the working principles of deep neural networks.
Tasks
Published	2020-03-26
URL	https://arxiv.org/abs/2003.11816v1
PDF	https://arxiv.org/pdf/2003.11816v1.pdf
PWC	https://paperswithcode.com/paper/do-deep-minds-think-alike-selective
Repo
Framework

A Joint Approach to Compound Splitting and Idiomatic Compound Detection


Title	A Joint Approach to Compound Splitting and Idiomatic Compound Detection
Authors	Irina Krotova, Sergey Aksenov, Ekaterina Artemova
Abstract	Applications such as machine translation, speech recognition, and information retrieval require efficient handling of noun compounds as they are one of the possible sources of out-of-vocabulary (OOV) words. In-depth processing of noun compounds requires not only splitting them into smaller components (or even roots) but also identifying instances that should remain unsplit because they are idiomatic. We develop a two-fold deep learning-based approach to noun compound splitting and idiomatic compound detection for the German language that we train using a newly collected corpus of annotated German compounds. Our neural noun compound splitter operates on a sub-word level and outperforms the current state of the art by about 5%.
Tasks	Information Retrieval, Machine Translation, Speech Recognition
Published	2020-03-21
URL	https://arxiv.org/abs/2003.09606v1
PDF	https://arxiv.org/pdf/2003.09606v1.pdf
PWC	https://paperswithcode.com/paper/a-joint-approach-to-compound-splitting-and
Repo
Framework

No Regret Sample Selection with Noisy Labels


Title	No Regret Sample Selection with Noisy Labels
Authors	N. Mitsuo, S. Uchida, D. Suehiro
Abstract	Deep neural networks (DNNs) suffer from noisily labeled data because of the high risk of overfitting. To avoid this risk, we propose a novel sample selection framework for learning from noisy samples. The core idea is to employ a “regret” minimization approach. The proposed sample selection method adaptively selects a subset of noisy-labeled training samples so as to minimize the regret of selecting noisy samples. The algorithm is efficient and comes with theoretical support. Moreover, unlike typical approaches, the algorithm does not require any side information or learning information that depends on the training settings of the DNN. The experimental results demonstrate that the proposed method improves the performance of a black-box DNN trained on noisily labeled data.
Tasks
Published	2020-03-06
URL	https://arxiv.org/abs/2003.03179v1
PDF	https://arxiv.org/pdf/2003.03179v1.pdf
PWC	https://paperswithcode.com/paper/no-regret-sample-selection-with-noisy-labels
Repo
Framework