October 16, 2019

Paper Group ANR 1090

Acoustic Scene Classification: A Competition Review

Title Acoustic Scene Classification: A Competition Review
Authors Shayan Gharib, Honain Derrar, Daisuke Niizumi, Tuukka Senttula, Janne Tommola, Toni Heittola, Tuomas Virtanen, Heikki Huttunen
Abstract In this paper we study the problem of acoustic scene classification, i.e., categorization of audio sequences into mutually exclusive classes based on their spectral content. We describe the methods and results discovered during a competition organized in the context of a graduate machine learning course; both by the students and external participants. We identify the most suitable methods and study the impact of each by performing an ablation study of the mixture of approaches. We also compare the results with a neural network baseline, and show the improvement over that. Finally, we discuss the impact of using a competition as a part of a university course, and justify its importance in the curriculum based on student feedback.
Tasks Acoustic Scene Classification, Scene Classification
Published 2018-08-02
URL http://arxiv.org/abs/1808.02357v1
PDF http://arxiv.org/pdf/1808.02357v1.pdf
PWC https://paperswithcode.com/paper/acoustic-scene-classification-a-competition
Repo
Framework

Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering

Title Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering
Authors Rui Zhang, Cicero Nogueira dos Santos, Michihiro Yasunaga, Bing Xiang, Dragomir Radev
Abstract Coreference resolution aims to identify in a text all mentions that refer to the same real-world entity. The state-of-the-art end-to-end neural coreference model considers all text spans in a document as potential mentions and learns to link an antecedent for each possible mention. In this paper, we propose to improve the end-to-end coreference resolution system by (1) using a biaffine attention model to get antecedent scores for each possible mention, and (2) jointly optimizing the mention detection accuracy and the mention clustering log-likelihood given the mention cluster labels. Our model achieves the state-of-the-art performance on the CoNLL-2012 Shared Task English test set.
Tasks Coreference Resolution
Published 2018-05-13
URL http://arxiv.org/abs/1805.04893v1
PDF http://arxiv.org/pdf/1805.04893v1.pdf
PWC https://paperswithcode.com/paper/neural-coreference-resolution-with-deep
Repo
Framework
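
The antecedent scoring step described in the abstract above can be illustrated with a generic biaffine scorer. This is a minimal sketch, not the authors' model: the score form x^T U y + w . [x; y] + b, the function names `biaffine_score` and `best_antecedent`, and the toy vectors are all assumptions made for illustration. In the paper the inputs would be learned span representations.

```python
def biaffine_score(x, y, U, w, b):
    """Generic biaffine score: x^T U y + w . [x; y] + b (assumed form)."""
    bilinear = sum(x[i] * U[i][j] * y[j]
                   for i in range(len(x)) for j in range(len(y)))
    linear = sum(wk * vk for wk, vk in zip(w, list(x) + list(y)))
    return bilinear + linear + b


def best_antecedent(mention, candidates, U, w, b):
    """Score every candidate antecedent and return (index of best, all scores)."""
    scores = [biaffine_score(mention, c, U, w, b) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy 2-dimensional representations (hypothetical values).
U = [[1.0, 0.0], [0.0, 1.0]]
w = [0.0, 0.0, 0.0, 0.0]
mention = [1.0, 2.0]
candidates = [[3.0, 4.0], [-1.0, 0.5]]
best, scores = best_antecedent(mention, candidates, U, w, b=0.5)
```

The bilinear term lets the mention and the candidate interact directly, while the linear term and bias act like a prior on each representation taken separately.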

Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD

Title Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD
Authors Marten van Dijk, Lam M. Nguyen, Phuong Ha Nguyen, Dzung T. Phan
Abstract We study Stochastic Gradient Descent (SGD) with diminishing step sizes for convex objective functions. We introduce a definitional framework and theory that defines and characterizes a core property, called curvature, of convex objective functions. In terms of curvature we can derive a new inequality that can be used to compute an optimal sequence of diminishing step sizes by solving a differential equation. Our exact solutions confirm known results in the literature and allow us to fully characterize a new regularizer with its corresponding expected convergence rates.
Tasks
Published 2018-10-09
URL https://arxiv.org/abs/1810.04100v2
PDF https://arxiv.org/pdf/1810.04100v2.pdf
PWC https://paperswithcode.com/paper/characterization-of-convex-objective
Repo
Framework
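
As a rough illustration of SGD with diminishing step sizes, the sketch below minimizes the expected value of 0.5 * (w - z)^2 over noisy samples z. The schedule eta_t = eta0 / (t + 1), the toy objective, and the function name `sgd_diminishing` are assumptions chosen for illustration. The paper derives its optimal step-size sequence from a differential equation, which this sketch does not reproduce.

```python
import random


def sgd_diminishing(samples, w0=0.0, eta0=1.0):
    """Run SGD on f(w) = E[0.5 * (w - z)^2] with step size eta0 / (t + 1)."""
    w = w0
    for t, z in enumerate(samples):
        eta = eta0 / (t + 1)   # diminishing step size (assumed schedule)
        grad = w - z           # stochastic gradient of 0.5 * (w - z)^2
        w -= eta * grad
    return w


rng = random.Random(0)
samples = [3.0 + rng.gauss(0.0, 1.0) for _ in range(1000)]
w_final = sgd_diminishing(samples)
```

With eta0 = 1 the update reduces to w_{t+1} = (t * w_t + z_t) / (t + 1), so the final iterate equals the running mean of the samples. This makes the effect of the shrinking step size easy to check by hand.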

Exploring Gap Filling as a Cheaper Alternative to Reading Comprehension Questionnaires when Evaluating Machine Translation for Gisting

Title Exploring Gap Filling as a Cheaper Alternative to Reading Comprehension Questionnaires when Evaluating Machine Translation for Gisting
Authors Mikel L. Forcada, Carolina Scarton, Lucia Specia, Barry Haddow, Alexandra Birch
Abstract A popular application of machine translation (MT) is gisting: MT is consumed as is to make sense of text in a foreign language. Evaluation of the usefulness of MT for gisting is surprisingly uncommon. The classical method uses reading comprehension questionnaires (RCQ), in which informants are asked to answer professionally-written questions in their language about a foreign text that has been machine-translated into their language. Recently, gap-filling (GF), a form of cloze testing, has been proposed as a cheaper alternative to RCQ. In GF, certain words are removed from reference translations and readers are asked to fill the gaps left using the machine-translated text as a hint. This paper reports, for the first time, a comparative evaluation, using both RCQ and GF, of translations from multiple MT systems for the same foreign texts, and a systematic study on the effect of variables such as gap density, gap-selection strategies, and document context in GF. The main findings of the study are: (a) both RCQ and GF clearly identify MT to be useful, (b) global RCQ and GF rankings for the MT systems are mostly in agreement, (c) GF scores vary very widely across informants, making comparisons among MT systems hard, and (d) unlike RCQ, which is framed around documents, GF evaluation can be framed at the sentence level. These findings support the use of GF as a cheaper alternative to RCQ.
Tasks Machine Translation, Reading Comprehension
Published 2018-09-02
URL http://arxiv.org/abs/1809.00315v1
PDF http://arxiv.org/pdf/1809.00315v1.pdf
PWC https://paperswithcode.com/paper/exploring-gap-filling-as-a-cheaper
Repo
Framework
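
The gap-filling procedure in the abstract above can be sketched as follows. This is a hypothetical illustration: uniform random gap selection, the `density` parameter, the `____` placeholder, and the exact-match scoring are assumptions, and the paper studies several gap-selection strategies that are not reproduced here.

```python
import random


def make_gaps(reference, density, seed=0):
    """Blank out roughly `density` of the words in a reference translation."""
    words = reference.split()
    n_gaps = min(len(words), max(1, round(len(words) * density)))
    rng = random.Random(seed)
    positions = sorted(rng.sample(range(len(words)), n_gaps))
    answers = {p: words[p] for p in positions}
    gapped = ["____" if i in answers else w for i, w in enumerate(words)]
    return " ".join(gapped), answers


def score_fills(answers, fills):
    """Fraction of gaps filled with the removed word (case-insensitive)."""
    correct = sum(1 for p, w in answers.items()
                  if fills.get(p, "").lower() == w.lower())
    return correct / len(answers)


reference = "the committee approved the new budget after a long debate"
gapped, answers = make_gaps(reference, density=0.2)
```

An informant would see `gapped` together with the machine-translated text as a hint, and `score_fills` would compare their answers with the removed words.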

Deep Discriminative Model for Video Classification

Title Deep Discriminative Model for Video Classification
Authors Mohammad Tavakolian, Abdenour Hadid
Abstract This paper presents a new deep learning approach for video-based scene classification. We design a Heterogeneous Deep Discriminative Model (HDDM) whose parameters are initialized by performing an unsupervised pre-training in a layer-wise fashion using Gaussian Restricted Boltzmann Machines (GRBM). In order to avoid the redundancy of adjacent frames, we extract spatiotemporal variation patterns within frames and represent them sparsely using Sparse Cubic Symmetrical Pattern (SCSP). Then, a pre-initialized HDDM is separately trained using the videos of each class to learn class-specific models. According to the minimum reconstruction error from the learnt class-specific models, a weighted voting strategy is employed for the classification. The performance of the proposed method is extensively evaluated on two action recognition datasets (UCF101 and Hollywood II) and three dynamic texture and dynamic scene datasets (DynTex, YUPENN, and Maryland). The experimental results and comparisons against state-of-the-art methods demonstrate that the proposed method consistently achieves superior performance on all datasets.
Tasks Scene Classification, Temporal Action Localization, Video Classification
Published 2018-07-22
URL http://arxiv.org/abs/1807.08259v1
PDF http://arxiv.org/pdf/1807.08259v1.pdf
PWC https://paperswithcode.com/paper/deep-discriminative-model-for-video
Repo
Framework

Free-breathing cardiac MRI using bandlimited manifold modelling

Title Free-breathing cardiac MRI using bandlimited manifold modelling
Authors Sunrita Poddar, Yasir Mohsin, Deidra Ansah, Bijoy Thattaliyath, Ravi Ashwath, Mathews Jacob
Abstract We introduce a novel bandlimited manifold framework and an algorithm to recover free-breathing and ungated cardiac MR images from highly undersampled measurements. The image frames in the free-breathing and ungated dataset are assumed to be points on a bandlimited manifold. We introduce a novel kernel low-rank algorithm to estimate the manifold structure (Laplacian) from a navigator-based acquisition scheme. The structure of the manifold is then used to recover the images from highly undersampled measurements. A computationally efficient algorithm, which relies on the bandlimited approximation of the Laplacian matrix, is used to recover the images. The proposed scheme is demonstrated on several patients with different breathing patterns and cardiac rates, without the need to manually tune the reconstruction parameters in each case. The proposed scheme enabled the recovery of free-breathing and ungated data, providing reconstructions that are qualitatively similar to breath-held scans performed on the same patients. This shows the potential of the technique as a clinical protocol for free-breathing cardiac scans.
Tasks
Published 2018-02-24
URL http://arxiv.org/abs/1802.08909v1
PDF http://arxiv.org/pdf/1802.08909v1.pdf
PWC https://paperswithcode.com/paper/free-breathing-cardiac-mri-using-bandlimited
Repo
Framework

Using Machine Learning to Predict the Evolution of Physics Research

Title Using Machine Learning to Predict the Evolution of Physics Research
Authors Wenyuan Liu, Stanisław Saganowski, Przemysław Kazienko, Siew Ann Cheong
Abstract The advancement of science as outlined by Popper and Kuhn is largely qualitative, but with bibliometric data it is possible and desirable to develop a quantitative picture of scientific progress. Furthermore it is also important to allocate finite resources to research topics that have growth potential, to accelerate the process from scientific breakthroughs to technological innovations. In this paper, we address this problem of quantitative knowledge evolution by analysing the APS publication data set from 1981 to 2010. We build the bibliographic coupling and co-citation networks, use the Louvain method to detect topical clusters (TCs) in each year, measure the similarity of TCs in consecutive years, and visualize the results as alluvial diagrams. Having the predictive features describing a given TC and its known evolution in the next year, we can train a machine learning model to predict future changes of TCs, i.e., their continuing, dissolving, merging and splitting. We found the number of papers from certain journals, the degree, closeness, and betweenness to be the most predictive features. Additionally, betweenness increases significantly for merging events, and decreases significantly for splitting events. Our results represent a first step from a descriptive understanding of the Science of Science (SciSci), towards one that is ultimately prescriptive.
Tasks
Published 2018-10-29
URL http://arxiv.org/abs/1810.12116v1
PDF http://arxiv.org/pdf/1810.12116v1.pdf
PWC https://paperswithcode.com/paper/using-machine-learning-to-predict-the
Repo
Framework
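
The step of measuring the similarity of topical clusters (TCs) in consecutive years can be sketched with a set-overlap measure. The abstract does not state which similarity measure is used, so the Jaccard index, the 0.3 threshold, and the names `jaccard` and `link_clusters` are assumptions for illustration only.

```python
def jaccard(a, b):
    """Jaccard index between two collections of paper identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_clusters(prev, curr, threshold=0.3):
    """Link clusters of consecutive years whose similarity reaches the threshold."""
    links = []
    for p_name, p_members in prev.items():
        for c_name, c_members in curr.items():
            s = jaccard(p_members, c_members)
            if s >= threshold:
                links.append((p_name, c_name, s))
    return links


# Hypothetical clusters: keys are cluster names, values are paper ids.
prev_year = {"A": {1, 2, 3, 4}, "B": {5, 6}}
curr_year = {"X": {1, 2, 3}, "Y": {4, 5, 6}}
links = link_clusters(prev_year, curr_year)
```

Links of this kind are what an alluvial diagram draws between years. A cluster with several outgoing links would be a candidate for a splitting event, and one with several incoming links a candidate for a merging event.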

Accelerated Optimization in the PDE Framework: Formulations for the Manifold of Diffeomorphisms

Title Accelerated Optimization in the PDE Framework: Formulations for the Manifold of Diffeomorphisms
Authors Ganesh Sundaramoorthi, Anthony Yezzi
Abstract We consider the problem of optimization of cost functionals on the infinite-dimensional manifold of diffeomorphisms. We present a new class of optimization methods, valid for any optimization problem setup on the space of diffeomorphisms by generalizing Nesterov accelerated optimization to the manifold of diffeomorphisms. While our framework is general for infinite dimensional manifolds, we specifically treat the case of diffeomorphisms, motivated by optical flow problems in computer vision. This is accomplished by building on a recent variational approach to a general class of accelerated optimization methods by Wibisono, Wilson and Jordan, which applies in finite dimensions. We generalize that approach to infinite dimensional manifolds. We derive the surprisingly simple continuum evolution equations, which are partial differential equations, for accelerated gradient descent, and relate it to simple mechanical principles from fluid mechanics. Our approach has natural connections to the optimal mass transport problem. This is because one can think of our approach as an evolution of an infinite number of particles endowed with mass (represented with a mass density) that moves in an energy landscape. The mass evolves with the optimization variable, and endows the particles with dynamics. This is different than the finite dimensional case where only a single particle moves and hence the dynamics does not depend on the mass. We derive the theory, compute the PDEs for accelerated optimization, and illustrate the behavior of these new accelerated optimization schemes.
Tasks Optical Flow Estimation
Published 2018-04-04
URL http://arxiv.org/abs/1804.02307v2
PDF http://arxiv.org/pdf/1804.02307v2.pdf
PWC https://paperswithcode.com/paper/accelerated-optimization-in-the-pde-framework
Repo
Framework

Survey of Face Detection on Low-quality Images

Title Survey of Face Detection on Low-quality Images
Authors Yuqian Zhou, Ding Liu, Thomas Huang
Abstract Face detection is a well-explored problem. Many challenges for face detectors, such as extreme pose, illumination, low resolution, and small scales, have been studied in previous work. However, previously proposed models are mostly trained and tested on good-quality images, which is not always the case in practical applications such as surveillance systems. In this paper, we first review the current state-of-the-art face detectors and their performance on the FDDB benchmark dataset, and compare the design protocols of the algorithms. Secondly, we investigate their performance degradation when tested on low-quality images with different levels of blur, noise, and contrast. Our results demonstrate that both hand-crafted and deep-learning based face detectors are not robust enough for low-quality images. This should inspire researchers to produce more robust designs for face detection in the wild.
Tasks Face Detection
Published 2018-04-19
URL http://arxiv.org/abs/1804.07362v1
PDF http://arxiv.org/pdf/1804.07362v1.pdf
PWC https://paperswithcode.com/paper/survey-of-face-detection-on-low-quality
Repo
Framework

Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks

Title Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks
Authors Daniel Tanneberg, Jan Peters, Elmar Rueckert
Abstract Autonomous robots need to interact with unknown, unstructured and changing environments, constantly facing novel challenges. Therefore, continuous online adaptation for lifelong-learning and the need for sample-efficient mechanisms to adapt to changes in the environment, the constraints, the tasks, or the robot itself are crucial. In this work, we propose a novel framework for probabilistic online motion planning with online adaptation based on a bio-inspired stochastic recurrent neural network. By using learning signals which mimic the intrinsic motivation signal cognitive dissonance in addition to a mental replay strategy to intensify experiences, the stochastic recurrent network can learn from few physical interactions and adapts to novel environments in seconds. We evaluate our online planning and adaptation framework on an anthropomorphic KUKA LWR arm. The rapid online adaptation is shown by learning unknown workspace constraints sample-efficiently from few physical interactions while following given way points.
Tasks Motion Planning
Published 2018-02-22
URL http://arxiv.org/abs/1802.08013v2
PDF http://arxiv.org/pdf/1802.08013v2.pdf
PWC https://paperswithcode.com/paper/intrinsic-motivation-and-mental-replay-enable
Repo
Framework

IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles

Title IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles
Authors Tianze Shi, Kedar Tatwawadi, Kaushik Chakrabarti, Yi Mao, Oleksandr Polozov, Weizhu Chen
Abstract We present a sequence-to-action parsing approach for the natural language to SQL task that incrementally fills the slots of a SQL query with feasible actions from a pre-defined inventory. To account for the fact that typically there are multiple correct SQL queries with the same or very similar semantics, we draw inspiration from syntactic parsing techniques and propose to train our sequence-to-action models with non-deterministic oracles. We evaluate our models on the WikiSQL dataset and achieve an execution accuracy of 83.7% on the test set, a 2.1% absolute improvement over the models trained with traditional static oracles assuming a single correct target SQL query. When further combined with the execution-guided decoding strategy, our model sets a new state-of-the-art performance at an execution accuracy of 87.1%.
Tasks Action Parsing, Text-To-Sql
Published 2018-09-13
URL http://arxiv.org/abs/1809.05054v2
PDF http://arxiv.org/pdf/1809.05054v2.pdf
PWC https://paperswithcode.com/paper/incsql-training-incremental-text-to-sql
Repo
Framework

Deep Attention-guided Hashing

Title Deep Attention-guided Hashing
Authors Zhan Yang, Osolo Ian Raymond, Wuqing Sun, Jun Long
Abstract With the rapid growth of multimedia data (e.g., image, audio, and video) on the web, learning-based hashing techniques such as Deep Supervised Hashing (DSH) have proven to be very efficient for large-scale multimedia search. The recent successes seen in learning-based hashing methods are largely due to the success of deep learning-based hashing methods. However, there are some limitations to previous learning-based hashing methods (e.g., the learned hash codes containing repetitive and highly correlated information). In this paper, we propose a novel learning-based hashing method, named Deep Attention-guided Hashing (DAgH). DAgH is implemented using two stream frameworks. The core idea is to use guided hash codes which are generated by the hashing network of the first stream framework (called first hashing network) to guide the training of the hashing network of the second stream framework (called second hashing network). Specifically, in the first network, it leverages an attention network and hashing network to generate the attention-guided hash codes from the original images. The loss function we propose contains two components: the semantic loss and the attention loss. The attention loss is used to punish the attention network to obtain the salient region from pairs of images. In the second network, these attention-guided hash codes are used to guide the training of the second hashing network (i.e., these codes are treated as supervised labels to train the second network). By doing this, DAgH can make full use of the most critical information contained in images to guide the second hashing network in order to learn efficient hash codes in a true end-to-end fashion. Results from our experiments demonstrate that DAgH can generate high quality hash codes and it outperforms current state-of-the-art methods on three benchmark datasets: CIFAR-10, NUS-WIDE, and ImageNet.
Tasks Deep Attention
Published 2018-12-04
URL http://arxiv.org/abs/1812.01404v2
PDF http://arxiv.org/pdf/1812.01404v2.pdf
PWC https://paperswithcode.com/paper/deep-attention-guided-hashing
Repo
Framework

An Adaptive Learning Method of Personality Trait Based Mood in Mental State Transition Network by Recurrent Neural Network

Title An Adaptive Learning Method of Personality Trait Based Mood in Mental State Transition Network by Recurrent Neural Network
Authors Takumi Ichimura, Kosuke Tanabe, Toshiyuki Yamashita
Abstract Mental State Transition Network (MSTN) is a basic concept for approximating human psychological and mental responses. A stimulus calculated by the Emotion Generating Calculations (EGC) method can cause the transition of mood from one emotional state to another. In this paper, the agent interacts with a human and achieves smooth communication through an adaptive learning method for mood based on the user’s personality traits. The learning method consists of the profit sharing (PS) method and the recurrent neural network (RNN). An emotion for sensor inputs to MSTN is calculated by EGC, the variance of emotion leads to a change of mental state, and the sequence of states then forms an episode. In order to learn the tendency of the personality trait effectively, the ineffective rules should be removed from the episode. The PS method finds detours in an episode, which should be deleted. Furthermore, the RNN serves to realize the variance of the user’s mood. Experimental results show success in representing a variety of delicate human emotions.
Tasks
Published 2018-04-09
URL http://arxiv.org/abs/1804.02813v1
PDF http://arxiv.org/pdf/1804.02813v1.pdf
PWC https://paperswithcode.com/paper/an-adaptive-learning-method-of-personality
Repo
Framework

Local Distance Metric Learning for Nearest Neighbor Algorithm

Title Local Distance Metric Learning for Nearest Neighbor Algorithm
Authors Hossein Rajabzadeh, Mansoor Zolghadri Jahromi, Mohammad Sadegh Zare, Mostafa Fakhrahmad
Abstract Distance metric learning is a successful way to enhance the performance of the nearest neighbor classifier. In most cases, however, the distribution of data does not obey a regular form and may change in different parts of the feature space. Regarding that, this paper proposes a novel local distance metric learning method, namely Local Mahalanobis Distance Learning (LMDL), in order to enhance the performance of the nearest neighbor classifier. LMDL considers the neighborhood influence and learns multiple distance metrics for a reduced set of input samples. The reduced set consists of prototypes, which try to preserve local discriminative information as much as possible. The proposed LMDL can be kernelized very easily, which is significantly desirable in the case of highly nonlinear data. The quality as well as the efficiency of the proposed method are assessed through a set of different experiments on various datasets, and the obtained results show that LMDL as well as the kernelized version is superior to the other related state-of-the-art methods.
Tasks Metric Learning
Published 2018-03-05
URL http://arxiv.org/abs/1803.01562v2
PDF http://arxiv.org/pdf/1803.01562v2.pdf
PWC https://paperswithcode.com/paper/local-distance-metric-learning-for-nearest
Repo
Framework
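
To make the idea of one distance metric per prototype concrete, the sketch below classifies a point by its nearest prototype, where each prototype carries its own Mahalanobis matrix. It is an illustration only: the matrices are fixed by hand rather than learned, and the names `mahalanobis_sq` and `classify` and the dictionary layout are hypothetical, not taken from the paper.

```python
def mahalanobis_sq(x, p, M):
    """Squared Mahalanobis distance (x - p)^T M (x - p) for a given matrix M."""
    d = [xi - pi for xi, pi in zip(x, p)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def classify(x, prototypes):
    """Return the label of the prototype closest to x under its own metric."""
    best = min(prototypes,
               key=lambda pr: mahalanobis_sq(x, pr["point"], pr["M"]))
    return best["label"]


# Hand-picked prototypes and metrics (hypothetical values, not learned).
prototypes = [
    {"point": (0.0, 0.0), "label": "a", "M": [[1.0, 0.0], [0.0, 1.0]]},
    {"point": (4.0, 0.0), "label": "b", "M": [[0.1, 0.0], [0.0, 1.0]]},
]
label = classify((1.5, 0.0), prototypes)
```

Under a single Euclidean metric the query (1.5, 0) would be closer to the first prototype. The second prototype's metric shrinks distances along the first axis, so the local metric changes the decision.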

Visual Attention on the Sun: What Do Existing Models Actually Predict?

Title Visual Attention on the Sun: What Do Existing Models Actually Predict?
Authors Jia Li, Daowei Li, Kui Fu, Long Xu
Abstract Visual attention prediction is a classic problem that seems to be well addressed in the deep learning era. One compelling concern, however, gradually arises along with the rapidly growing performance scores over existing visual attention datasets: do existing deep models really capture the inherent mechanism of human visual attention? To address this concern, this paper proposes a new dataset, named VASUN, that records the free-viewing human attention on solar images. Different from previous datasets, images in VASUN contain many irregular visual patterns that existing deep models have never seen. By benchmarking existing models on VASUN, we find the performances of many state-of-the-art deep models drop remarkably, while many classic shallow models perform impressively. From these results, we find that the significant performance advance of existing deep attention models may come from their capabilities of memorizing and predicting the occurrence of some specific visual patterns rather than learning the inherent mechanism of human visual attention. In addition, we also train several baseline models on VASUN to demonstrate the feasibility and key issues of predicting visual attention on the sun. These baseline models, together with the proposed dataset, can be used to revisit the problem of visual attention prediction from a novel perspective that is complementary to existing ones.
Tasks Deep Attention
Published 2018-11-25
URL http://arxiv.org/abs/1811.10004v1
PDF http://arxiv.org/pdf/1811.10004v1.pdf
PWC https://paperswithcode.com/paper/visual-attention-on-the-sun-what-do-existing
Repo
Framework