Paper Group ANR 345
Detecting Mismatch between Text Script and Voice-over Using Utterance Verification Based on Phoneme Recognition Ranking
Title | Detecting Mismatch between Text Script and Voice-over Using Utterance Verification Based on Phoneme Recognition Ranking |
Authors | Yoonjae Jeong, Hoon-Young Cho |
Abstract | The purpose of this study is to detect mismatches between a text script and its voice-over. For this, we present a novel utterance verification (UV) method, which calculates the degree of correspondence between a voice-over and the phoneme sequence of a script. We found that the phoneme recognition probabilities of exaggerated voice-overs decrease compared to ordinary utterances, but their rankings do not demonstrate any significant change. The proposed method, therefore, uses the recognition ranking of each phoneme segment corresponding to a phoneme sequence for measuring the confidence of a voice-over utterance for its corresponding script. The experimental results show that the proposed UV method outperforms a state-of-the-art approach that uses cross-modal attention to detect mismatches between speech and transcription. |
Tasks | |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09180v1 |
https://arxiv.org/pdf/2003.09180v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-mismatch-between-text-script-and |
Repo | |
Framework | |
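The ranking idea in the abstract above can be illustrated with a small sketch. The abstract does not give the scoring formula, so the mean reciprocal rank used here, the function names `phoneme_ranks` and `rank_confidence`, and the toy posteriors are all assumptions made for illustration, not the authors' method.

```python
def phoneme_ranks(posteriors, expected):
    """Rank (1 = best) of each expected phoneme within its segment's posterior."""
    ranks = []
    for probs, target in zip(posteriors, expected):
        # Count phonemes scored strictly higher than the expected one.
        higher = sum(1 for p in probs if p > probs[target])
        ranks.append(higher + 1)
    return ranks


def rank_confidence(posteriors, expected):
    """Mean reciprocal rank over segments (an assumed score, not the paper's)."""
    ranks = phoneme_ranks(posteriors, expected)
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical posteriors over three phonemes for two segments.
ordinary = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
exaggerated = [[0.4, 0.35, 0.25], [0.3, 0.4, 0.3]]  # flatter probabilities
script = [0, 1]  # phoneme indices expected from the script

# Probabilities drop for the exaggerated utterance, but the ranks stay the same.
print(phoneme_ranks(ordinary, script), phoneme_ranks(exaggerated, script))
```

In this toy case a probability threshold would penalize the exaggerated utterance, while the rank-based score treats both utterances alike, which is the property the abstract reports.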
A review of machine learning applications in wildfire science and management
Title | A review of machine learning applications in wildfire science and management |
Authors | Piyush Jain, Sean C P Coogan, Sriram Ganapathi Subramanian, Mark Crowley, Steve Taylor, Mike D Flannigan |
Abstract | Artificial intelligence has been applied in wildfire science and management since the 1990s, with early applications including neural networks and expert systems. Since then the field has rapidly progressed congruently with the wide adoption of machine learning (ML) in the environmental sciences. Here, we present a scoping review of ML in wildfire science and management. Our objective is to improve awareness of ML among wildfire scientists and managers, as well as illustrate the challenging range of problems in wildfire science available to data scientists. We first present an overview of popular ML approaches used in wildfire science to date, and then review their use in wildfire science within six problem domains: 1) fuels characterization, fire detection, and mapping; 2) fire weather and climate change; 3) fire occurrence, susceptibility, and risk; 4) fire behavior prediction; 5) fire effects; and 6) fire management. We also discuss the advantages and limitations of various ML approaches and identify opportunities for future advances in wildfire science and management within a data science context. We identified 298 relevant publications, where the most frequently used ML methods included random forests, MaxEnt, artificial neural networks, decision trees, support vector machines, and genetic algorithms. There exist opportunities to apply more current ML methods (e.g., deep learning and agent-based learning) in wildfire science. However, despite the ability of ML models to learn on their own, expertise in wildfire science is necessary to ensure realistic modelling of fire processes across multiple scales, while the complexity of some ML methods requires sophisticated knowledge for their application. Finally, we stress that the wildfire research and management community plays an active role in providing relevant, high-quality data for use by practitioners of ML methods. |
Tasks | |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00646v1 |
https://arxiv.org/pdf/2003.00646v1.pdf | |
PWC | https://paperswithcode.com/paper/a-review-of-machine-learning-applications-in-1 |
Repo | |
Framework | |
Investigating the Decoders of Maximum Likelihood Sequence Models: A Look-ahead Approach
Title | Investigating the Decoders of Maximum Likelihood Sequence Models: A Look-ahead Approach |
Authors | Yu-Siang Wang, Yen-Ling Kuo, Boris Katz |
Abstract | We demonstrate how we can practically incorporate multi-step future information into a decoder of maximum likelihood sequence models. We propose a “k-step look-ahead” module to consider the likelihood information of a rollout up to k steps. Unlike other approaches that need to train another value network to evaluate the rollouts, we can directly apply this look-ahead module to improve the decoding of any sequence model trained in a maximum likelihood framework. We evaluate our look-ahead module on three datasets of varying difficulties: IM2LATEX-100k OCR image to LaTeX, WMT16 multimodal machine translation, and WMT14 machine translation. Our look-ahead module improves performance on the simpler datasets such as IM2LATEX-100k and WMT16 multimodal machine translation. However, the improvement on the more difficult dataset (e.g., containing longer sequences), WMT14 machine translation, becomes marginal. Our further investigation using the k-step look-ahead suggests that the more difficult tasks suffer from the overestimated EOS (end-of-sentence) probability. We argue that the overestimated EOS probability also causes the decreased performance of beam search when increasing its beam width. We tackle the EOS problem by integrating an auxiliary EOS loss into the training to estimate if the model should emit EOS or other words. Our experiments show that improving EOS estimation increases not only the performance of our proposed look-ahead module but also the robustness of the beam search. |
Tasks | Machine Translation, Multimodal Machine Translation, Optical Character Recognition |
Published | 2020-03-08 |
URL | https://arxiv.org/abs/2003.03716v1 |
https://arxiv.org/pdf/2003.03716v1.pdf | |
PWC | https://paperswithcode.com/paper/investigating-the-decoders-of-maximum |
Repo | |
Framework | |
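A rough sketch of what a k-step look-ahead decoding step could look like follows. The abstract does not specify how rollouts are enumerated or scored, so the exhaustive rollout, the names `best_rollout`, `lookahead_step`, and `toy_model`, and the toy probabilities are assumptions for illustration only.

```python
import math

EOS = "<eos>"


def best_rollout(next_log_probs, prefix, k):
    """Highest total log-likelihood of any continuation of `prefix` up to k steps."""
    if k == 0 or (prefix and prefix[-1] == EOS):
        return 0.0  # nothing more to roll out
    return max(
        lp + best_rollout(next_log_probs, prefix + [tok], k - 1)
        for tok, lp in next_log_probs(prefix).items()
    )


def lookahead_step(next_log_probs, prefix, k):
    """Choose the next token by its own log-prob plus the best k-step rollout after it."""
    scores = {
        tok: lp + best_rollout(next_log_probs, prefix + [tok], k)
        for tok, lp in next_log_probs(prefix).items()
    }
    return max(scores, key=scores.get)


def toy_model(prefix):
    """Hypothetical next-token distribution standing in for a trained model."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 0.9, EOS: 0.1},
    }
    probs = table.get(tuple(prefix), {EOS: 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


print(lookahead_step(toy_model, [], k=0))  # greedy choice
print(lookahead_step(toy_model, [], k=1))  # one-step look-ahead
```

With `k=0` the step reduces to greedy decoding and picks `a` (0.6 against 0.4). With `k=1` it picks `b`, because the best two-token path through `b` has probability 0.36 against 0.30 through `a`. The exhaustive recursion is exponential in `k`, so it is only practical for small `k` or a pruned candidate set.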
Bayesian nonparametric shared multi-sequence time series segmentation
Title | Bayesian nonparametric shared multi-sequence time series segmentation |
Authors | Olga Mikheeva, Ieva Kazlauskaite, Hedvig Kjellström, Carl Henrik Ek |
Abstract | In this paper, we introduce a method for segmenting time series data using tools from Bayesian nonparametrics. We consider the task of temporal segmentation of a set of time series data into representative stationary segments. We use Gaussian process (GP) priors to impose our knowledge about the characteristics of the underlying stationary segments, and use a nonparametric distribution to partition the sequences into such segments, formulated in terms of a prior distribution on segment length. Given the segmentation, the model can be viewed as a variant of a Gaussian mixture model where the mixture components are described using the covariance function of a GP. We demonstrate the effectiveness of our model on synthetic data as well as on real time-series data of heartbeats where the task is to segment the indicative types of beats and to classify the heartbeat recordings into classes that correspond to healthy and abnormal heart sounds. |
Tasks | Time Series |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.09886v1 |
https://arxiv.org/pdf/2001.09886v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-nonparametric-shared-multi-sequence |
Repo | |
Framework | |
Machine Learning for a Music Glove Instrument
Title | Machine Learning for a Music Glove Instrument |
Authors | Joseph Bakarji |
Abstract | A music glove instrument equipped with force-sensitive, flex and IMU sensors is trained on an electric piano to learn note sequences based on a time series of sensor inputs. Once trained, the glove is used on any surface to generate the sequence of notes most closely related to the hand motion. The data is collected manually by a performer wearing the glove and playing on an electric keyboard. The feature space is designed to account for key hand motions, such as the thumb-under movement. Logistic regression, along with Bayesian belief networks, is used to learn the transition probabilities from one note to another. This work demonstrates a data-driven approach for digital musical instruments in general. |
Tasks | Time Series |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.09551v1 |
https://arxiv.org/pdf/2001.09551v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-for-a-music-glove-instrument |
Repo | |
Framework | |
Tighter Bound Estimation of Sensitivity Analysis for Incremental and Decremental Data Modification
Title | Tighter Bound Estimation of Sensitivity Analysis for Incremental and Decremental Data Modification |
Authors | Rui Zhou |
Abstract | In large-scale classification problems, the data set may be faced with frequent updates, e.g., a small ratio of data is added to or removed from the original data set. In this case, incremental learning, which updates an existing classifier by explicitly modeling the data modification, is more efficient than retraining a new classifier from scratch. Conventional incremental learning algorithms try to solve the problem exactly. However, for some tasks, we are only interested in the lower and upper bounds for some values relevant to the coefficient vector of the updated classifier without really solving it, e.g., determining whether we should update the classifier or performing some sensitivity analysis tasks. To deal with such tasks, we propose an algorithm to make rational inferences about the updated classifier with low computational complexity. Specifically, we present a method to calculate tighter bounds of a general linear score for the updated classifier, so that the range of interest is estimated more accurately than in existing work. The proposed method can be applied to any linear classifier with a differentiable convex L2-regularized loss function. Both theoretical analysis and experimental results show that the proposed approach is superior to existing methods. |
Tasks | L2 Regularization |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03351v1 |
https://arxiv.org/pdf/2003.03351v1.pdf | |
PWC | https://paperswithcode.com/paper/tighter-bound-estimation-of-sensitivity |
Repo | |
Framework | |
MobilePose: Real-Time Pose Estimation for Unseen Objects with Weak Shape Supervision
Title | MobilePose: Real-Time Pose Estimation for Unseen Objects with Weak Shape Supervision |
Authors | Tingbo Hou, Adel Ahmadyan, Liangkai Zhang, Jianing Wei, Matthias Grundmann |
Abstract | In this paper, we address the problem of detecting unseen objects from RGB images and estimating their poses in 3D. We propose two mobile-friendly networks: MobilePose-Base and MobilePose-Shape. The former is used when there is only pose supervision, and the latter is for the case when shape supervision is available, even a weak one. We revisit shape features used in previous methods, including segmentation and coordinate maps. We explain when and why pixel-level shape supervision can improve pose estimation. Consequently, we add shape prediction as an intermediate layer in MobilePose-Shape, and let the network learn pose from shape. Our models are trained on mixed real and synthetic data, with weak and noisy shape supervision. They are ultra-lightweight and can run in real time on modern mobile devices (e.g. 36 FPS on Galaxy S20). Compared with previous single-shot solutions, our method has higher accuracy, while using a significantly smaller model (2~3% in model size or number of parameters). |
Tasks | Pose Estimation |
Published | 2020-03-07 |
URL | https://arxiv.org/abs/2003.03522v1 |
https://arxiv.org/pdf/2003.03522v1.pdf | |
PWC | https://paperswithcode.com/paper/mobilepose-real-time-pose-estimation-for |
Repo | |
Framework | |
Holographic Image Sensing
Title | Holographic Image Sensing |
Authors | Alfred Marcel Bruckstein, Martianus Frederic Ezerman, Adamas Aqsa Fahreza, San Ling |
Abstract | Holographic representations of data enable distributed storage with progressive refinement when the stored packets of data are made available in any arbitrary order. In this paper, we propose and test patch-based transform coding holographic sensing of image data. Our proposal is optimized for progressive recovery under random order of retrieval of the stored data. The coding of the image patches relies on the design of distributed projections ensuring best image recovery, in terms of the $\ell_2$ norm, at each retrieval stage. The performance depends only on the number of data packets that have been retrieved thus far. Several possible options to enhance the quality of the recovery while changing the size and number of data packets are discussed and tested. This leads us to examine several interesting bit-allocation and rate-distortion trade-offs, highlighted for a set of natural images with ensemble-estimated statistical properties. |
Tasks | |
Published | 2020-02-09 |
URL | https://arxiv.org/abs/2002.03314v2 |
https://arxiv.org/pdf/2002.03314v2.pdf | |
PWC | https://paperswithcode.com/paper/holographic-image-sensing |
Repo | |
Framework | |
Signaling in Bayesian Network Congestion Games: the Subtle Power of Symmetry
Title | Signaling in Bayesian Network Congestion Games: the Subtle Power of Symmetry |
Authors | Matteo Castiglioni, Andrea Celli, Alberto Marchesi, Nicola Gatti |
Abstract | Network congestion games are a well-understood model of multi-agent strategic interactions. Despite their ubiquitous applications, it is not clear whether it is possible to design information structures to ameliorate the overall experience of the network users. We focus on Bayesian games with atomic players, where network vagaries are modeled via a (random) state of nature which determines the costs incurred by the players. A third-party entity—the sender—can observe the realized state of the network and exploit this additional information to send a signal to each player. A natural question is the following: is it possible for an informed sender to reduce the overall social cost via the strategic provision of information to players who update their beliefs rationally? The paper focuses on the problem of computing optimal ex ante persuasive signaling schemes, showing that symmetry is a crucial property for its solution. Indeed, we show that an optimal ex ante persuasive signaling scheme can be computed in polynomial time when players are symmetric and have affine cost functions. Moreover, the problem becomes NP-hard when players are asymmetric, even in non-Bayesian settings. |
Tasks | |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.05190v1 |
https://arxiv.org/pdf/2002.05190v1.pdf | |
PWC | https://paperswithcode.com/paper/signaling-in-bayesian-network-congestion |
Repo | |
Framework | |
PointTrackNet: An End-to-End Network For 3-D Object Detection and Tracking From Point Clouds
Title | PointTrackNet: An End-to-End Network For 3-D Object Detection and Tracking From Point Clouds |
Authors | Sukai Wang, Yuxiang Sun, Chengju Liu, Ming Liu |
Abstract | Recent machine learning-based multi-object tracking (MOT) frameworks are becoming popular for 3-D point clouds. Most traditional tracking approaches use filters (e.g., Kalman filter or particle filter) to predict object locations in a time sequence; however, they are vulnerable to extreme motion conditions, such as sudden braking and turning. In this letter, we propose PointTrackNet, an end-to-end 3-D object detection and tracking network, to generate foreground masks, 3-D bounding boxes, and point-wise tracking association displacements for each detected object. The network takes only two adjacent point-cloud frames as input. Experimental results on the KITTI tracking dataset are competitive with the state of the art, especially in irregularly and rapidly changing scenarios. |
Tasks | Multi-Object Tracking, Object Detection, Object Tracking |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11559v1 |
https://arxiv.org/pdf/2002.11559v1.pdf | |
PWC | https://paperswithcode.com/paper/pointtracknet-an-end-to-end-network-for-3-d |
Repo | |
Framework | |
Improving Irregularly Sampled Time Series Learning with Dense Descriptors of Time
Title | Improving Irregularly Sampled Time Series Learning with Dense Descriptors of Time |
Authors | Rafael T. Sousa, Lucas A. Pereira, Anderson S. Soares |
Abstract | Supervised learning with irregularly sampled time series has been a challenge for machine learning methods because of the difficulty of dealing with irregular time intervals. Some recent papers introduced recurrent neural network models that deal with irregularity, but most of them rely on complex mechanisms to achieve better performance. This work proposes a novel method, called Time Embeddings, to represent timestamps (hours or dates) as dense vectors using sinusoidal functions. As a data input method, it can be applied to most machine learning models. The method was evaluated on two predictive tasks from MIMIC III, a dataset of irregularly sampled time series of electronic health records. Our tests showed an improvement for LSTM-based and classical machine learning models, especially with very irregular data. |
Tasks | Time Series |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09291v1 |
https://arxiv.org/pdf/2003.09291v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-irregularly-sampled-time-series |
Repo | |
Framework | |
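One way to picture a sinusoidal timestamp representation is sketched below. The abstract says only that timestamps become dense vectors via sinusoidal functions; the fixed frequency schedule here is borrowed from Transformer positional encodings, and the name `time_embedding` and the parameters `dim` and `max_period` are assumptions, not the paper's formulation.

```python
import math


def time_embedding(t, dim=8, max_period=10000.0):
    """Map a scalar timestamp t (e.g. hours since admission) to a dim-length vector."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    vec = []
    for i in range(dim // 2):
        # Geometrically decreasing frequencies (assumed schedule).
        freq = 1.0 / (max_period ** (2 * i / dim))
        vec.append(math.sin(t * freq))
        vec.append(math.cos(t * freq))
    return vec


# Irregular gaps need no resampling: each observation gets its own vector.
observation_hours = [0.0, 0.5, 7.25, 31.0]
embedded = [time_embedding(h) for h in observation_hours]
print(len(embedded), len(embedded[0]))
```

Each vector can then be concatenated with the observation's measured features before it is passed to an LSTM or a classical model, so the model sees when a value was recorded without the series being forced onto a regular grid.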
BrazilDAM: A Benchmark dataset for Tailings Dam Detection
Title | BrazilDAM: A Benchmark dataset for Tailings Dam Detection |
Authors | Edemir Ferreira, Matheus Brito, Remis Balaniuk, Mário S. Alvim, Jefersson A. dos Santos |
Abstract | In this work we present BrazilDAM, a novel public dataset based on Sentinel-2 and Landsat-8 satellite images covering all tailings dams cataloged by the Brazilian National Mining Agency (ANM). The dataset was built using georeferenced images from 769 dams, recorded between 2016 and 2019. The time series were processed in order to produce cloud-free images. The dams contain mining waste from different ore categories and have highly varying shapes, areas and volumes, making BrazilDAM particularly interesting and challenging to use in machine learning benchmarks. The original catalog contains, besides the dam coordinates, information about the main ore, constructive method, risk category, and associated potential damage. To evaluate BrazilDAM’s predictive potential, we performed classification experiments using state-of-the-art deep Convolutional Neural Networks (CNNs). In the experiments, we achieved an average classification accuracy of 94.11% in the tailings dam binary classification task. In addition, four other experimental setups were run using the complementary information from the original catalog, exhaustively exploiting the capacity of the proposed dataset. |
Tasks | Time Series |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07948v1 |
https://arxiv.org/pdf/2003.07948v1.pdf | |
PWC | https://paperswithcode.com/paper/brazildam-a-benchmark-dataset-for-tailings |
Repo | |
Framework | |
Do Deep Minds Think Alike? Selective Adversarial Attacks for Fine-Grained Manipulation of Multiple Deep Neural Networks
Title | Do Deep Minds Think Alike? Selective Adversarial Attacks for Fine-Grained Manipulation of Multiple Deep Neural Networks |
Authors | Zain Khan, Jirong Yi, Raghu Mudumbai, Xiaodong Wu, Weiyu Xu |
Abstract | Recent works have demonstrated the existence of adversarial examples targeting a single machine learning system. In this paper we ask a simple but fundamental question of “selective fooling”: given multiple machine learning systems assigned to solve the same classification problem and taking the same input signal, is it possible to construct a perturbation to the input signal that manipulates the outputs of these multiple machine learning systems simultaneously in arbitrary pre-defined ways? For example, is it possible to selectively fool a set of “enemy” machine learning systems without fooling the other “friend” machine learning systems? The answer to this question depends on the extent to which these different machine learning systems “think alike”. We formulate the problem of “selective fooling” as a novel optimization problem, and report on a series of experiments on the MNIST dataset. Our preliminary findings from these experiments show that it is in fact very easy to selectively manipulate multiple MNIST classifiers simultaneously, even when the classifiers are identical in their architectures, training algorithms and training datasets except for random initialization during training. This suggests that two nominally equivalent machine learning systems do not in fact “think alike” at all, and opens the possibility for many novel applications and deeper understandings of the working principles of deep neural networks. |
Tasks | |
Published | 2020-03-26 |
URL | https://arxiv.org/abs/2003.11816v1 |
https://arxiv.org/pdf/2003.11816v1.pdf | |
PWC | https://paperswithcode.com/paper/do-deep-minds-think-alike-selective |
Repo | |
Framework | |
A Joint Approach to Compound Splitting and Idiomatic Compound Detection
Title | A Joint Approach to Compound Splitting and Idiomatic Compound Detection |
Authors | Irina Krotova, Sergey Aksenov, Ekaterina Artemova |
Abstract | Applications such as machine translation, speech recognition, and information retrieval require efficient handling of noun compounds as they are one of the possible sources of out-of-vocabulary (OOV) words. In-depth processing of noun compounds requires not only splitting them into smaller components (or even roots) but also identifying instances that should remain unsplit because they are idiomatic. We develop a two-fold deep learning-based approach to noun compound splitting and idiomatic compound detection for the German language that we train using a newly collected corpus of annotated German compounds. Our neural noun compound splitter operates on a sub-word level and outperforms the current state of the art by about 5%. |
Tasks | Information Retrieval, Machine Translation, Speech Recognition |
Published | 2020-03-21 |
URL | https://arxiv.org/abs/2003.09606v1 |
https://arxiv.org/pdf/2003.09606v1.pdf | |
PWC | https://paperswithcode.com/paper/a-joint-approach-to-compound-splitting-and |
Repo | |
Framework | |
No Regret Sample Selection with Noisy Labels
Title | No Regret Sample Selection with Noisy Labels |
Authors | N. Mitsuo, S. Uchida, D. Suehiro |
Abstract | Deep neural networks (DNNs) suffer from noisy labeled data because of the high risk of overfitting. To avoid this risk, in this paper, we propose a novel sample selection framework for learning from noisy samples. The core idea is to employ a “regret” minimization approach. The proposed sample selection method adaptively selects a subset of noisy-labeled training samples so as to minimize the regret of selecting noisy samples. The algorithm works efficiently and has theoretical support. Moreover, unlike typical approaches, the algorithm does not require any side information or learning information that depends on the training settings of the DNN. The experimental results demonstrate that the proposed method improves the performance of a black-box DNN with noisy labeled data. |
Tasks | |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03179v1 |
https://arxiv.org/pdf/2003.03179v1.pdf | |
PWC | https://paperswithcode.com/paper/no-regret-sample-selection-with-noisy-labels |
Repo | |
Framework | |