October 16, 2019

Paper Group ANR 1090

Acoustic Scene Classification: A Competition Review. Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering. Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD. Exploring Gap Filling as a Cheaper Alternative to Reading Comprehension Questionnaires when …

Acoustic Scene Classification: A Competition Review


Title	Acoustic Scene Classification: A Competition Review
Authors	Shayan Gharib, Honain Derrar, Daisuke Niizumi, Tuukka Senttula, Janne Tommola, Toni Heittola, Tuomas Virtanen, Heikki Huttunen
Abstract	In this paper we study the problem of acoustic scene classification, i.e., categorization of audio sequences into mutually exclusive classes based on their spectral content. We describe the methods and results discovered during a competition organized in the context of a graduate machine learning course, by both the students and external participants. We identify the most suitable methods and study the impact of each by performing an ablation study of the mixture of approaches. We also compare the results with a neural network baseline and show the improvement over it. Finally, we discuss the impact of using a competition as a part of a university course, and justify its importance in the curriculum based on student feedback.
Tasks	Acoustic Scene Classification, Scene Classification
Published	2018-08-02
URL	http://arxiv.org/abs/1808.02357v1
PDF	http://arxiv.org/pdf/1808.02357v1.pdf
PWC	https://paperswithcode.com/paper/acoustic-scene-classification-a-competition
Repo
Framework

Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering


Title	Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering
Authors	Rui Zhang, Cicero Nogueira dos Santos, Michihiro Yasunaga, Bing Xiang, Dragomir Radev
Abstract	Coreference resolution aims to identify in a text all mentions that refer to the same real-world entity. The state-of-the-art end-to-end neural coreference model considers all text spans in a document as potential mentions and learns to link an antecedent for each possible mention. In this paper, we propose to improve the end-to-end coreference resolution system by (1) using a biaffine attention model to get antecedent scores for each possible mention, and (2) jointly optimizing the mention detection accuracy and the mention clustering log-likelihood given the mention cluster labels. Our model achieves the state-of-the-art performance on the CoNLL-2012 Shared Task English test set.
Tasks	Coreference Resolution
Published	2018-05-13
URL	http://arxiv.org/abs/1805.04893v1
PDF	http://arxiv.org/pdf/1805.04893v1.pdf
PWC	https://paperswithcode.com/paper/neural-coreference-resolution-with-deep
Repo
Framework
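
The biaffine antecedent scoring mentioned in the abstract can be illustrated with a minimal sketch. It assumes the generic biaffine form s(i, j) = h_i^T U h_j + v^T h_j; the exact parameterization used in the paper is not given here, and the names `biaffine_score`, `best_antecedent`, `U`, and `v` are hypothetical. The span representations are toy vectors, not learned ones.

```python
from typing import List, Optional

Vector = List[float]
Matrix = List[List[float]]


def biaffine_score(h_mention: Vector, h_antecedent: Vector, U: Matrix, v: Vector) -> float:
    """Generic biaffine score: h_mention^T U h_antecedent + v^T h_antecedent."""
    # Bilinear term couples the two span representations through U.
    bilinear = sum(
        h_mention[r] * U[r][c] * h_antecedent[c]
        for r in range(len(h_mention))
        for c in range(len(h_antecedent))
    )
    # Linear term acts as a prior on how likely a span is to be an antecedent.
    linear = sum(v[c] * h_antecedent[c] for c in range(len(h_antecedent)))
    return bilinear + linear


def best_antecedent(mention_idx: int, reps: List[Vector], U: Matrix, v: Vector) -> Optional[int]:
    """Return the index of the highest-scoring earlier span, or None if there is none."""
    if mention_idx == 0:
        return None
    scores = [biaffine_score(reps[mention_idx], reps[j], U, v) for j in range(mention_idx)]
    return max(range(len(scores)), key=scores.__getitem__)
```

With an identity `U` and a zero `v`, the score reduces to a dot product, so a mention is linked to the earlier span whose representation is most similar to its own.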

Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD


Title	Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD
Authors	Marten van Dijk, Lam M. Nguyen, Phuong Ha Nguyen, Dzung T. Phan
Abstract	We study Stochastic Gradient Descent (SGD) with diminishing step sizes for convex objective functions. We introduce a definitional framework and theory that defines and characterizes a core property, called curvature, of convex objective functions. In terms of curvature we can derive a new inequality that can be used to compute an optimal sequence of diminishing step sizes by solving a differential equation. Our exact solutions confirm known results in the literature and allow us to fully characterize a new regularizer with its corresponding expected convergence rates.
Tasks
Published	2018-10-09
URL	https://arxiv.org/abs/1810.04100v2
PDF	https://arxiv.org/pdf/1810.04100v2.pdf
PWC	https://paperswithcode.com/paper/characterization-of-convex-objective
Repo
Framework
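
As a rough illustration of SGD with diminishing step sizes, the sketch below uses the classical schedule eta_t = 1/(t+1) on the toy objective f(w) = E[(w - xi)^2]/2. This is not the step-size sequence derived in the paper from its curvature inequality; the function name `sgd_diminishing` and the objective are assumptions chosen so the behavior is easy to check by hand.

```python
from typing import Iterable


def sgd_diminishing(samples: Iterable[float], w0: float = 0.0) -> float:
    """Run SGD on f(w) = E[(w - xi)^2] / 2 with step size eta_t = 1 / (t + 1)."""
    w = w0
    for t, xi in enumerate(samples):
        eta = 1.0 / (t + 1)   # diminishing step size
        grad = w - xi         # stochastic gradient of (w - xi)^2 / 2
        w -= eta * grad
    return w
```

For this particular objective and schedule the iterate after T steps equals the running mean of the first T samples, which makes it a convenient sanity check: larger early steps move quickly toward the data, and shrinking later steps average out the noise.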

Exploring Gap Filling as a Cheaper Alternative to Reading Comprehension Questionnaires when Evaluating Machine Translation for Gisting


Title	Exploring Gap Filling as a Cheaper Alternative to Reading Comprehension Questionnaires when Evaluating Machine Translation for Gisting
Authors	Mikel L. Forcada, Carolina Scarton, Lucia Specia, Barry Haddow, Alexandra Birch
Abstract	A popular application of machine translation (MT) is gisting: MT is consumed as is to make sense of text in a foreign language. Evaluation of the usefulness of MT for gisting is surprisingly uncommon. The classical method uses reading comprehension questionnaires (RCQ), in which informants are asked to answer professionally-written questions in their language about a foreign text that has been machine-translated into their language. Recently, gap-filling (GF), a form of cloze testing, has been proposed as a cheaper alternative to RCQ. In GF, certain words are removed from reference translations and readers are asked to fill the gaps left using the machine-translated text as a hint. This paper reports, for the first time, a comparative evaluation, using both RCQ and GF, of translations from multiple MT systems for the same foreign texts, and a systematic study on the effect of variables such as gap density, gap-selection strategies, and document context in GF. The main findings of the study are: (a) both RCQ and GF clearly identify MT to be useful, (b) global RCQ and GF rankings for the MT systems are mostly in agreement, (c) GF scores vary very widely across informants, making comparisons among MT systems hard, and (d) unlike RCQ, which is framed around documents, GF evaluation can be framed at the sentence level. These findings support the use of GF as a cheaper alternative to RCQ.
Tasks	Machine Translation, Reading Comprehension
Published	2018-09-02
URL	http://arxiv.org/abs/1809.00315v1
PDF	http://arxiv.org/pdf/1809.00315v1.pdf
PWC	https://paperswithcode.com/paper/exploring-gap-filling-as-a-cheaper
Repo
Framework
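
The gap-filling procedure described in the abstract can be sketched as follows. The fixed-interval gap selection (`gap_every`) is a hypothetical stand-in for the gap-density and gap-selection strategies the paper studies, and exact-match scoring in `score_fill` is likewise an assumption made for illustration.

```python
from typing import List, Tuple


def make_gaps(reference: str, gap_every: int, placeholder: str = "_____") -> Tuple[str, List[str]]:
    """Blank out every `gap_every`-th word of a reference translation."""
    gapped, answers = [], []
    for pos, word in enumerate(reference.split(), start=1):
        if pos % gap_every == 0:
            gapped.append(placeholder)   # word removed; the reader must restore it
            answers.append(word)
        else:
            gapped.append(word)
    return " ".join(gapped), answers


def score_fill(answers: List[str], responses: List[str]) -> float:
    """Fraction of gaps filled with exactly the removed word (case-insensitive)."""
    if not answers:
        return 0.0
    correct = sum(a.lower() == r.lower() for a, r in zip(answers, responses))
    return correct / len(answers)
```

A smaller `gap_every` gives a higher gap density; an informant would see the gapped reference alongside the machine-translated text and supply one response per gap.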

Deep Discriminative Model for Video Classification


Title	Deep Discriminative Model for Video Classification
Authors	Mohammad Tavakolian, Abdenour Hadid
Abstract	This paper presents a new deep learning approach for video-based scene classification. We design a Heterogeneous Deep Discriminative Model (HDDM) whose parameters are initialized by performing an unsupervised pre-training in a layer-wise fashion using Gaussian Restricted Boltzmann Machines (GRBM). In order to avoid the redundancy of adjacent frames, we extract spatiotemporal variation patterns within frames and represent them sparsely using Sparse Cubic Symmetrical Pattern (SCSP). Then, a pre-initialized HDDM is separately trained using the videos of each class to learn class-specific models. According to the minimum reconstruction error from the learnt class-specific models, a weighted voting strategy is employed for the classification. The performance of the proposed method is extensively evaluated on two action recognition datasets (UCF101 and Hollywood II) and three dynamic texture and dynamic scene datasets (DynTex, YUPENN, and Maryland). The experimental results and comparisons against state-of-the-art methods demonstrate that the proposed method consistently achieves superior performance on all datasets.
Tasks	Scene Classification, Temporal Action Localization, Video Classification
Published	2018-07-22
URL	http://arxiv.org/abs/1807.08259v1
PDF	http://arxiv.org/pdf/1807.08259v1.pdf
PWC	https://paperswithcode.com/paper/deep-discriminative-model-for-video
Repo
Framework

Free-breathing cardiac MRI using bandlimited manifold modelling


Title	Free-breathing cardiac MRI using bandlimited manifold modelling
Authors	Sunrita Poddar, Yasir Mohsin, Deidra Ansah, Bijoy Thattaliyath, Ravi Ashwath, Mathews Jacob
Abstract	We introduce a novel bandlimited manifold framework and an algorithm to recover free-breathing and ungated cardiac MR images from highly undersampled measurements. The image frames in the free-breathing and ungated dataset are assumed to be points on a bandlimited manifold. We introduce a novel kernel low-rank algorithm to estimate the manifold structure (Laplacian) from a navigator-based acquisition scheme. The structure of the manifold is then used to recover the images from highly undersampled measurements. A computationally efficient algorithm, which relies on the bandlimited approximation of the Laplacian matrix, is used to recover the images. The proposed scheme is demonstrated on several patients with different breathing patterns and cardiac rates, without the need to manually tune the reconstruction parameters in each case. The proposed scheme enabled the recovery of free-breathing and ungated data, providing reconstructions that are qualitatively similar to breath-held scans performed on the same patients. This shows the potential of the technique as a clinical protocol for free-breathing cardiac scans.
Tasks
Published	2018-02-24
URL	http://arxiv.org/abs/1802.08909v1
PDF	http://arxiv.org/pdf/1802.08909v1.pdf
PWC	https://paperswithcode.com/paper/free-breathing-cardiac-mri-using-bandlimited
Repo
Framework

Using Machine Learning to Predict the Evolution of Physics Research


Title	Using Machine Learning to Predict the Evolution of Physics Research
Authors	Wenyuan Liu, Stanisław Saganowski, Przemysław Kazienko, Siew Ann Cheong
Abstract	The advancement of science as outlined by Popper and Kuhn is largely qualitative, but with bibliometric data it is possible and desirable to develop a quantitative picture of scientific progress. Furthermore, it is also important to allocate finite resources to research topics that have growth potential, to accelerate the process from scientific breakthroughs to technological innovations. In this paper, we address this problem of quantitative knowledge evolution by analysing the APS publication data set from 1981 to 2010. We build the bibliographic coupling and co-citation networks, use the Louvain method to detect topical clusters (TCs) in each year, measure the similarity of TCs in consecutive years, and visualize the results as alluvial diagrams. Having the predictive features describing a given TC and its known evolution in the next year, we can train a machine learning model to predict future changes of TCs, i.e., their continuing, dissolving, merging and splitting. We found the number of papers from certain journals, the degree, closeness, and betweenness to be the most predictive features. Additionally, betweenness increases significantly for merging events, and decreases significantly for splitting events. Our results represent a first step from a descriptive understanding of the Science of Science (SciSci), towards one that is ultimately prescriptive.
Tasks
Published	2018-10-29
URL	http://arxiv.org/abs/1810.12116v1
PDF	http://arxiv.org/pdf/1810.12116v1.pdf
PWC	https://paperswithcode.com/paper/using-machine-learning-to-predict-the
Repo
Framework
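
The two networks named in the abstract can be built from reference lists alone. The sketch below assumes a hypothetical input format, a dictionary mapping each paper to the set of works it cites, and covers only the edge-weight computation; community detection with the Louvain method and the later prediction steps are not shown.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set, Tuple


def bibliographic_coupling(refs: Dict[str, Set[str]]) -> Dict[Tuple[str, str], int]:
    """Weight between two citing papers = number of references they share."""
    weights = {}
    for a, b in combinations(sorted(refs), 2):
        shared = len(refs[a] & refs[b])
        if shared:
            weights[(a, b)] = shared
    return weights


def co_citation(refs: Dict[str, Set[str]]) -> Dict[Tuple[str, str], int]:
    """Weight between two cited works = number of papers that cite both."""
    counts = Counter()
    for cited in refs.values():
        for x, y in combinations(sorted(cited), 2):
            counts[(x, y)] += 1
    return dict(counts)
```

Bibliographic coupling links the citing papers, while co-citation links the cited works, so the same reference data yields two different graphs on which topical clusters could then be detected.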

Accelerated Optimization in the PDE Framework: Formulations for the Manifold of Diffeomorphisms


Title	Accelerated Optimization in the PDE Framework: Formulations for the Manifold of Diffeomorphisms
Authors	Ganesh Sundaramoorthi, Anthony Yezzi
Abstract	We consider the problem of optimization of cost functionals on the infinite-dimensional manifold of diffeomorphisms. We present a new class of optimization methods, valid for any optimization problem set up on the space of diffeomorphisms, by generalizing Nesterov accelerated optimization to the manifold of diffeomorphisms. While our framework is general for infinite dimensional manifolds, we specifically treat the case of diffeomorphisms, motivated by optical flow problems in computer vision. This is accomplished by building on a recent variational approach to a general class of accelerated optimization methods by Wibisono, Wilson and Jordan, which applies in finite dimensions. We generalize that approach to infinite dimensional manifolds. We derive the surprisingly simple continuum evolution equations, which are partial differential equations, for accelerated gradient descent, and relate them to simple mechanical principles from fluid mechanics. Our approach has natural connections to the optimal mass transport problem. This is because one can think of our approach as an evolution of an infinite number of particles endowed with mass (represented with a mass density) that moves in an energy landscape. The mass evolves with the optimization variable, and endows the particles with dynamics. This is different from the finite dimensional case where only a single particle moves and hence the dynamics does not depend on the mass. We derive the theory, compute the PDEs for accelerated optimization, and illustrate the behavior of these new accelerated optimization schemes.
Tasks	Optical Flow Estimation
Published	2018-04-04
URL	http://arxiv.org/abs/1804.02307v2
PDF	http://arxiv.org/pdf/1804.02307v2.pdf
PWC	https://paperswithcode.com/paper/accelerated-optimization-in-the-pde-framework
Repo
Framework

Survey of Face Detection on Low-quality Images


Title	Survey of Face Detection on Low-quality Images
Authors	Yuqian Zhou, Ding Liu, Thomas Huang
Abstract	Face detection is a well-explored problem. Many challenges for face detectors, such as extreme pose, illumination, low resolution and small scales, have been studied in previous work. However, previously proposed models are mostly trained and tested on good-quality images, which is not always the case for practical applications like surveillance systems. In this paper, we first review the current state-of-the-art face detectors and their performance on the benchmark dataset FDDB, and compare the design protocols of the algorithms. Secondly, we investigate their performance degradation while testing on low-quality images with different levels of blur, noise, and contrast. Our results demonstrate that both hand-crafted and deep-learning based face detectors are not robust enough for low-quality images. This should inspire researchers to produce more robust designs for face detection in the wild.
Tasks	Face Detection
Published	2018-04-19
URL	http://arxiv.org/abs/1804.07362v1
PDF	http://arxiv.org/pdf/1804.07362v1.pdf
PWC	https://paperswithcode.com/paper/survey-of-face-detection-on-low-quality
Repo
Framework

Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks


Title	Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks
Authors	Daniel Tanneberg, Jan Peters, Elmar Rueckert
Abstract	Autonomous robots need to interact with unknown, unstructured and changing environments, constantly facing novel challenges. Therefore, continuous online adaptation for lifelong learning and sample-efficient mechanisms to adapt to changes in the environment, the constraints, the tasks, or the robot itself are crucial. In this work, we propose a novel framework for probabilistic online motion planning with online adaptation based on a bio-inspired stochastic recurrent neural network. By using learning signals which mimic the intrinsic motivation signal cognitive dissonance in addition to a mental replay strategy to intensify experiences, the stochastic recurrent network can learn from few physical interactions and adapt to novel environments in seconds. We evaluate our online planning and adaptation framework on an anthropomorphic KUKA LWR arm. The rapid online adaptation is shown by learning unknown workspace constraints sample-efficiently from few physical interactions while following given way points.
Tasks	Motion Planning
Published	2018-02-22
URL	http://arxiv.org/abs/1802.08013v2
PDF	http://arxiv.org/pdf/1802.08013v2.pdf
PWC	https://paperswithcode.com/paper/intrinsic-motivation-and-mental-replay-enable
Repo
Framework

IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles


Title	IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles
Authors	Tianze Shi, Kedar Tatwawadi, Kaushik Chakrabarti, Yi Mao, Oleksandr Polozov, Weizhu Chen
Abstract	We present a sequence-to-action parsing approach for the natural language to SQL task that incrementally fills the slots of a SQL query with feasible actions from a pre-defined inventory. To account for the fact that typically there are multiple correct SQL queries with the same or very similar semantics, we draw inspiration from syntactic parsing techniques and propose to train our sequence-to-action models with non-deterministic oracles. We evaluate our models on the WikiSQL dataset and achieve an execution accuracy of 83.7% on the test set, a 2.1% absolute improvement over the models trained with traditional static oracles assuming a single correct target SQL query. When further combined with the execution-guided decoding strategy, our model sets a new state-of-the-art performance at an execution accuracy of 87.1%.
Tasks	Action Parsing, Text-To-Sql
Published	2018-09-13
URL	http://arxiv.org/abs/1809.05054v2
PDF	http://arxiv.org/pdf/1809.05054v2.pdf
PWC	https://paperswithcode.com/paper/incsql-training-incremental-text-to-sql
Repo
Framework
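
The difference between a static and a non-deterministic oracle can be sketched on the WHERE clause, where conditions may be emitted in any order without changing the query's meaning. This is a simplified illustration under that assumption only; the function names and the list-based "model preference" are hypothetical and do not reproduce the paper's action inventory or training procedure.

```python
from typing import List, Set


def oracle_actions(gold_conditions: List[str], emitted: List[str]) -> Set[str]:
    """Non-deterministic oracle: every gold condition not yet emitted is a correct next action."""
    return set(gold_conditions) - set(emitted)


def follow_model(gold_conditions: List[str], model_preference: List[str]) -> List[str]:
    """Emit conditions in the order the model prefers, as long as the oracle allows each step."""
    emitted: List[str] = []
    while True:
        valid = oracle_actions(gold_conditions, emitted)
        if not valid:
            break
        # Among the valid actions, take the one the (toy) model ranks highest.
        choice = min(valid, key=model_preference.index)
        emitted.append(choice)
    return emitted
```

A static oracle would accept only the single order in which the gold query happens to be written, so a model that prefers a different but equivalent order would be penalized; here any order over the same set of conditions is treated as correct.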

Deep Attention-guided Hashing


Title	Deep Attention-guided Hashing
Authors	Zhan Yang, Osolo Ian Raymond, Wuqing Sun, Jun Long
Abstract	With the rapid growth of multimedia data (e.g., image, audio, and video) on the web, learning-based hashing techniques such as Deep Supervised Hashing (DSH) have proven to be very efficient for large-scale multimedia search. The recent successes seen in learning-based hashing methods are largely due to the success of deep learning-based hashing methods. However, there are some limitations to previous learning-based hashing methods (e.g., the learned hash codes containing repetitive and highly correlated information). In this paper, we propose a novel learning-based hashing method, named Deep Attention-guided Hashing (DAgH). DAgH is implemented using two stream frameworks. The core idea is to use guided hash codes which are generated by the hashing network of the first stream framework (called first hashing network) to guide the training of the hashing network of the second stream framework (called second hashing network). Specifically, in the first network, it leverages an attention network and hashing network to generate the attention-guided hash codes from the original images. The loss function we propose contains two components: the semantic loss and the attention loss. The attention loss is used to punish the attention network to obtain the salient region from pairs of images; in the second network, these attention-guided hash codes are used to guide the training of the second hashing network (i.e., these codes are treated as supervised labels to train the second network). By doing this, DAgH can make full use of the most critical information contained in images to guide the second hashing network in order to learn efficient hash codes in a true end-to-end fashion. Results from our experiments demonstrate that DAgH can generate high quality hash codes and that it outperforms current state-of-the-art methods on three benchmark datasets: CIFAR-10, NUS-WIDE, and ImageNet.
Tasks	Deep Attention
Published	2018-12-04
URL	http://arxiv.org/abs/1812.01404v2
PDF	http://arxiv.org/pdf/1812.01404v2.pdf
PWC	https://paperswithcode.com/paper/deep-attention-guided-hashing
Repo
Framework
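
How learned hash codes are typically used for search can be shown with a small sketch: real-valued network outputs are binarized by sign, and database items are ranked by Hamming distance to the query code. This illustrates only the retrieval step, not DAgH's two-stream training; the function names, the sign threshold, and the toy codes are assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def binarize(outputs: Sequence[float]) -> Tuple[int, ...]:
    """Turn real-valued outputs into a binary code by thresholding at zero."""
    return tuple(1 if value >= 0 else 0 for value in outputs)


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of bit positions in which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query_code: Sequence[int], database: Dict[str, Sequence[int]]) -> List[str]:
    """Return database keys ordered from nearest to farthest code."""
    return sorted(database, key=lambda name: hamming(query_code, database[name]))
```

Codes with repetitive or highly correlated bits, the limitation the abstract mentions, waste positions in this comparison, which is why shorter but more informative codes retrieve better.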

An Adaptive Learning Method of Personality Trait Based Mood in Mental State Transition Network by Recurrent Neural Network


Title	An Adaptive Learning Method of Personality Trait Based Mood in Mental State Transition Network by Recurrent Neural Network
Authors	Takumi Ichimura, Kosuke Tanabe, Toshiyuki Yamashita
Abstract	Mental State Transition Network (MSTN) is a basic concept for approximating human psychological and mental responses. A stimulus calculated by the Emotion Generating Calculations (EGC) method can cause the transition of mood from one emotional state to another. In this paper, the agent interacts with a human to realize smooth communication through an adaptive learning method for the user’s personality-trait-based mood. The learning method consists of the profit sharing (PS) method and the recurrent neural network (RNN). An emotion for sensor inputs to MSTN is calculated by EGC, the variance of emotion leads to a change of mental state, and the sequence of states forms an episode. In order to learn the tendency of the personality trait effectively, the ineffective rules should be removed from the episode. The PS method finds detours in an episode, which should be deleted. Furthermore, the RNN works to realize the variance of the user’s mood. Experimental results show that the method succeeds in representing a variety of delicate human emotions.
Tasks
Published	2018-04-09
URL	http://arxiv.org/abs/1804.02813v1
PDF	http://arxiv.org/pdf/1804.02813v1.pdf
PWC	https://paperswithcode.com/paper/an-adaptive-learning-method-of-personality
Repo
Framework

Local Distance Metric Learning for Nearest Neighbor Algorithm


Title	Local Distance Metric Learning for Nearest Neighbor Algorithm
Authors	Hossein Rajabzadeh, Mansoor Zolghadri Jahromi, Mohammad Sadegh Zare, Mostafa Fakhrahmad
Abstract	Distance metric learning is a successful way to enhance the performance of the nearest neighbor classifier. In most cases, however, the distribution of data does not obey a regular form and may change in different parts of the feature space. Regarding that, this paper proposes a novel local distance metric learning method, namely Local Mahalanobis Distance Learning (LMDL), in order to enhance the performance of the nearest neighbor classifier. LMDL considers the neighborhood influence and learns multiple distance metrics for a reduced set of input samples. The samples in the reduced set are called prototypes, which try to preserve local discriminative information as much as possible. The proposed LMDL can be kernelized very easily, which is significantly desirable in the case of highly nonlinear data. The quality as well as the efficiency of the proposed method are assessed through a set of different experiments on various datasets, and the obtained results show that LMDL as well as the kernelized version are superior to the other related state-of-the-art methods.
Tasks	Metric Learning
Published	2018-03-05
URL	http://arxiv.org/abs/1803.01562v2
PDF	http://arxiv.org/pdf/1803.01562v2.pdf
PWC	https://paperswithcode.com/paper/local-distance-metric-learning-for-nearest
Repo
Framework
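
The idea of attaching a separate Mahalanobis metric to each prototype can be illustrated with a nearest-prototype classifier. In this sketch the prototypes and their matrices are set by hand; how LMDL selects prototypes and learns the metrics is not reproduced, and the names `mahalanobis_sq` and `classify` are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]
Prototype = Tuple[Sequence[float], Matrix, str]   # (point, local metric M, class label)


def mahalanobis_sq(x: Sequence[float], p: Sequence[float], M: Matrix) -> float:
    """Squared Mahalanobis distance (x - p)^T M (x - p) under the prototype's own metric."""
    d = [xi - pi for xi, pi in zip(x, p)]
    return sum(d[r] * M[r][c] * d[c] for r in range(len(d)) for c in range(len(d)))


def classify(x: Sequence[float], prototypes: List[Prototype]) -> str:
    """Assign x the label of the prototype that is nearest under that prototype's metric."""
    nearest = min(prototypes, key=lambda proto: mahalanobis_sq(x, proto[0], proto[1]))
    return nearest[2]
```

Because each prototype carries its own matrix, a prototype in a region where one feature is uninformative can down-weight that feature locally, so a point that is closer to one prototype in Euclidean terms may still be assigned to another.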

Visual Attention on the Sun: What Do Existing Models Actually Predict?


Title	Visual Attention on the Sun: What Do Existing Models Actually Predict?
Authors	Jia Li, Daowei Li, Kui Fu, Long Xu
Abstract	Visual attention prediction is a classic problem that seems to be well addressed in the deep learning era. One compelling concern, however, gradually arises along with the rapidly growing performance scores over existing visual attention datasets: do existing deep models really capture the inherent mechanism of human visual attention? To address this concern, this paper proposes a new dataset, named VASUN, that records the free-viewing human attention on solar images. Different from previous datasets, images in VASUN contain many irregular visual patterns that existing deep models have never seen. By benchmarking existing models on VASUN, we find the performances of many state-of-the-art deep models drop remarkably, while many classic shallow models perform impressively. From these results, we find that the significant performance advance of existing deep attention models may come from their capabilities of memorizing and predicting the occurrence of some specific visual patterns rather than learning the inherent mechanism of human visual attention. In addition, we also train several baseline models on VASUN to demonstrate the feasibility and key issues of predicting visual attention on the sun. These baseline models, together with the proposed dataset, can be used to revisit the problem of visual attention prediction from a novel perspective that is complementary to existing ones.
Tasks	Deep Attention
Published	2018-11-25
URL	http://arxiv.org/abs/1811.10004v1
PDF	http://arxiv.org/pdf/1811.10004v1.pdf
PWC	https://paperswithcode.com/paper/visual-attention-on-the-sun-what-do-existing
Repo
Framework