Paper Group ANR 1090
Acoustic Scene Classification: A Competition Review
Title | Acoustic Scene Classification: A Competition Review |
Authors | Shayan Gharib, Honain Derrar, Daisuke Niizumi, Tuukka Senttula, Janne Tommola, Toni Heittola, Tuomas Virtanen, Heikki Huttunen |
Abstract | In this paper we study the problem of acoustic scene classification, i.e., categorization of audio sequences into mutually exclusive classes based on their spectral content. We describe the methods and results discovered during a competition organized in the context of a graduate machine learning course, by both the students and external participants. We identify the most suitable methods and study the impact of each by performing an ablation study of the mixture of approaches. We also compare the results with a neural network baseline and show the improvement over it. Finally, we discuss the impact of using a competition as a part of a university course, and justify its importance in the curriculum based on student feedback. |
Tasks | Acoustic Scene Classification, Scene Classification |
Published | 2018-08-02 |
URL | http://arxiv.org/abs/1808.02357v1 |
http://arxiv.org/pdf/1808.02357v1.pdf | |
PWC | https://paperswithcode.com/paper/acoustic-scene-classification-a-competition |
Repo | |
Framework | |
Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering
Title | Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering |
Authors | Rui Zhang, Cicero Nogueira dos Santos, Michihiro Yasunaga, Bing Xiang, Dragomir Radev |
Abstract | Coreference resolution aims to identify in a text all mentions that refer to the same real-world entity. The state-of-the-art end-to-end neural coreference model considers all text spans in a document as potential mentions and learns to link an antecedent for each possible mention. In this paper, we propose to improve the end-to-end coreference resolution system by (1) using a biaffine attention model to get antecedent scores for each possible mention, and (2) jointly optimizing the mention detection accuracy and the mention clustering log-likelihood given the mention cluster labels. Our model achieves the state-of-the-art performance on the CoNLL-2012 Shared Task English test set. |
Tasks | Coreference Resolution |
Published | 2018-05-13 |
URL | http://arxiv.org/abs/1805.04893v1 |
http://arxiv.org/pdf/1805.04893v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-coreference-resolution-with-deep |
Repo | |
Framework | |
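The biaffine antecedent scoring mentioned in the abstract above can be illustrated with a generic biaffine form, `x^T U y + w^T [x; y] + b`. This is a minimal sketch only: the names `biaffine_score` and `best_antecedent`, the toy vectors, and the exact parameterization are assumptions for illustration and are not taken from the paper, where mention representations and weights are learned.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = Sequence[Sequence[float]]


def biaffine_score(x: Vector, y: Vector, U: Matrix, w: Vector, b: float = 0.0) -> float:
    """Generic biaffine score: x^T U y + w^T [x; y] + b (hypothetical parameterization)."""
    # Bilinear term couples every dimension of x with every dimension of y.
    bilinear = sum(x[i] * U[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))
    # Linear term acts on the concatenation of the two vectors.
    linear = sum(wk * zk for wk, zk in zip(w, x + y))
    return bilinear + linear + b


def best_antecedent(mention: Vector, candidates: List[Vector], U: Matrix, w: Vector) -> int:
    """Return the index of the candidate antecedent with the highest score."""
    scores = [biaffine_score(mention, c, U, w) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example with hand-set (not learned) parameters.
U_toy = [[1.0, 0.0], [0.0, 1.0]]
w_toy = [0.0, 0.0, 0.0, 0.0]
chosen = best_antecedent([1.0, 2.0], [[0.0, 1.0], [3.0, 4.0]], U_toy, w_toy)
```

With an identity `U` and a zero `w`, the score reduces to a dot product, which makes the toy case easy to check by hand.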
Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD
Title | Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD |
Authors | Marten van Dijk, Lam M. Nguyen, Phuong Ha Nguyen, Dzung T. Phan |
Abstract | We study Stochastic Gradient Descent (SGD) with diminishing step sizes for convex objective functions. We introduce a definitional framework and theory that defines and characterizes a core property, called curvature, of convex objective functions. In terms of curvature we can derive a new inequality that can be used to compute an optimal sequence of diminishing step sizes by solving a differential equation. Our exact solutions confirm known results in the literature and allow us to fully characterize a new regularizer with its corresponding expected convergence rates. |
Tasks | |
Published | 2018-10-09 |
URL | https://arxiv.org/abs/1810.04100v2 |
https://arxiv.org/pdf/1810.04100v2.pdf | |
PWC | https://paperswithcode.com/paper/characterization-of-convex-objective |
Repo | |
Framework | |
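As a rough illustration of SGD with diminishing step sizes, the sketch below runs noisy gradient steps on a one-dimensional quadratic. The schedule `eta0 / (1 + t)`, the objective, and all function names are assumptions chosen for illustration; this is a common textbook schedule, not the optimal step-size sequence the paper derives from curvature.

```python
import random
from typing import Callable


def sgd_diminishing(
    grad: Callable[[float, random.Random], float],
    w0: float,
    steps: int,
    eta0: float = 1.0,
    seed: int = 0,
) -> float:
    """Run SGD with step size eta_t = eta0 / (1 + t) (an assumed schedule)."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    w = w0
    for t in range(steps):
        eta_t = eta0 / (1 + t)  # step size shrinks as iterations proceed
        w -= eta_t * grad(w, rng)
    return w


def noisy_quadratic_grad(w: float, rng: random.Random,
                         target: float = 3.0, noise_std: float = 1.0) -> float:
    """Stochastic gradient of f(w) = 0.5 * (w - target)^2 with Gaussian noise."""
    return (w - target) + rng.gauss(0.0, noise_std)


w_final = sgd_diminishing(noisy_quadratic_grad, w0=0.0, steps=5000)
```

With `eta0 = 1.0` on this particular objective, the iterate becomes a running average of noisy estimates of the minimizer, so the noise averages out as the number of steps grows.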
Exploring Gap Filling as a Cheaper Alternative to Reading Comprehension Questionnaires when Evaluating Machine Translation for Gisting
Title | Exploring Gap Filling as a Cheaper Alternative to Reading Comprehension Questionnaires when Evaluating Machine Translation for Gisting |
Authors | Mikel L. Forcada, Carolina Scarton, Lucia Specia, Barry Haddow, Alexandra Birch |
Abstract | A popular application of machine translation (MT) is gisting: MT is consumed as is to make sense of text in a foreign language. Evaluation of the usefulness of MT for gisting is surprisingly uncommon. The classical method uses reading comprehension questionnaires (RCQ), in which informants are asked to answer professionally-written questions in their language about a foreign text that has been machine-translated into their language. Recently, gap-filling (GF), a form of cloze testing, has been proposed as a cheaper alternative to RCQ. In GF, certain words are removed from reference translations and readers are asked to fill the gaps left using the machine-translated text as a hint. This paper reports, for the first time, a comparative evaluation, using both RCQ and GF, of translations from multiple MT systems for the same foreign texts, and a systematic study on the effect of variables such as gap density, gap-selection strategies, and document context in GF. The main findings of the study are: (a) both RCQ and GF clearly identify MT to be useful, (b) global RCQ and GF rankings for the MT systems are mostly in agreement, (c) GF scores vary very widely across informants, making comparisons among MT systems hard, and (d) unlike RCQ, which is framed around documents, GF evaluation can be framed at the sentence level. These findings support the use of GF as a cheaper alternative to RCQ. |
Tasks | Machine Translation, Reading Comprehension |
Published | 2018-09-02 |
URL | http://arxiv.org/abs/1809.00315v1 |
http://arxiv.org/pdf/1809.00315v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-gap-filling-as-a-cheaper |
Repo | |
Framework | |
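The gap-filling procedure described in the abstract above (removing words from a reference translation and scoring how many gaps a reader fills correctly) can be sketched as follows. The every-nth-word selection strategy, the exact-match scoring, and the function names are assumptions for illustration; the paper studies several gap densities and gap-selection strategies that are not reproduced here.

```python
from typing import List, Tuple


def make_gap_filling_item(reference: str, every_nth: int = 5,
                          gap_token: str = "_____") -> Tuple[str, List[str]]:
    """Replace every n-th word of a reference translation with a gap.

    Returns the gapped text and the list of removed words (the answer key).
    """
    gapped: List[str] = []
    answers: List[str] = []
    for position, word in enumerate(reference.split(), start=1):
        if position % every_nth == 0:
            gapped.append(gap_token)
            answers.append(word)
        else:
            gapped.append(word)
    return " ".join(gapped), answers


def score_responses(responses: List[str], answers: List[str]) -> float:
    """Fraction of gaps filled with the exact removed word (case-insensitive)."""
    if not answers:
        return 0.0
    hits = sum(r.strip().lower() == a.lower() for r, a in zip(responses, answers))
    return hits / len(answers)


gapped_text, answer_key = make_gap_filling_item("the cat sat on the mat today", every_nth=3)
```

A smaller `every_nth` gives a higher gap density, which is one of the variables the study examines.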
Deep Discriminative Model for Video Classification
Title | Deep Discriminative Model for Video Classification |
Authors | Mohammad Tavakolian, Abdenour Hadid |
Abstract | This paper presents a new deep learning approach for video-based scene classification. We design a Heterogeneous Deep Discriminative Model (HDDM) whose parameters are initialized by performing an unsupervised pre-training in a layer-wise fashion using Gaussian Restricted Boltzmann Machines (GRBM). In order to avoid the redundancy of adjacent frames, we extract spatiotemporal variation patterns within frames and represent them sparsely using Sparse Cubic Symmetrical Pattern (SCSP). Then, a pre-initialized HDDM is separately trained using the videos of each class to learn class-specific models. According to the minimum reconstruction error from the learnt class-specific models, a weighted voting strategy is employed for the classification. The performance of the proposed method is extensively evaluated on two action recognition datasets (UCF101 and Hollywood II) and three dynamic texture and dynamic scene datasets (DynTex, YUPENN, and Maryland). The experimental results and comparisons against state-of-the-art methods demonstrate that the proposed method consistently achieves superior performance on all datasets. |
Tasks | Scene Classification, Temporal Action Localization, Video Classification |
Published | 2018-07-22 |
URL | http://arxiv.org/abs/1807.08259v1 |
http://arxiv.org/pdf/1807.08259v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-discriminative-model-for-video |
Repo | |
Framework | |
Free-breathing cardiac MRI using bandlimited manifold modelling
Title | Free-breathing cardiac MRI using bandlimited manifold modelling |
Authors | Sunrita Poddar, Yasir Mohsin, Deidra Ansah, Bijoy Thattaliyath, Ravi Ashwath, Mathews Jacob |
Abstract | We introduce a novel bandlimited manifold framework and an algorithm to recover free-breathing and ungated cardiac MR images from highly undersampled measurements. The image frames in the free-breathing and ungated dataset are assumed to be points on a bandlimited manifold. We introduce a novel kernel low-rank algorithm to estimate the manifold structure (Laplacian) from a navigator-based acquisition scheme. The structure of the manifold is then used to recover the images from highly undersampled measurements. A computationally efficient algorithm, which relies on the bandlimited approximation of the Laplacian matrix, is used to recover the images. The proposed scheme is demonstrated on several patients with different breathing patterns and cardiac rates, without requiring manual tuning of the reconstruction parameters in each case. The proposed scheme enabled the recovery of free-breathing and ungated data, providing reconstructions that are qualitatively similar to breath-held scans performed on the same patients. This shows the potential of the technique as a clinical protocol for free-breathing cardiac scans. |
Tasks | |
Published | 2018-02-24 |
URL | http://arxiv.org/abs/1802.08909v1 |
http://arxiv.org/pdf/1802.08909v1.pdf | |
PWC | https://paperswithcode.com/paper/free-breathing-cardiac-mri-using-bandlimited |
Repo | |
Framework | |
Using Machine Learning to Predict the Evolution of Physics Research
Title | Using Machine Learning to Predict the Evolution of Physics Research |
Authors | Wenyuan Liu, Stanisław Saganowski, Przemysław Kazienko, Siew Ann Cheong |
Abstract | The advancement of science as outlined by Popper and Kuhn is largely qualitative, but with bibliometric data it is possible and desirable to develop a quantitative picture of scientific progress. Furthermore it is also important to allocate finite resources to research topics that have growth potential, to accelerate the process from scientific breakthroughs to technological innovations. In this paper, we address this problem of quantitative knowledge evolution by analysing the APS publication data set from 1981 to 2010. We build the bibliographic coupling and co-citation networks, use the Louvain method to detect topical clusters (TCs) in each year, measure the similarity of TCs in consecutive years, and visualize the results as alluvial diagrams. Having the predictive features describing a given TC and its known evolution in the next year, we can train a machine learning model to predict future changes of TCs, i.e., their continuing, dissolving, merging and splitting. We found the number of papers from certain journals, the degree, closeness, and betweenness to be the most predictive features. Additionally, betweenness increases significantly for merging events, and decreases significantly for splitting events. Our results represent a first step from a descriptive understanding of the Science of Science (SciSci), towards one that is ultimately prescriptive. |
Tasks | |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12116v1 |
http://arxiv.org/pdf/1810.12116v1.pdf | |
PWC | https://paperswithcode.com/paper/using-machine-learning-to-predict-the |
Repo | |
Framework | |
Accelerated Optimization in the PDE Framework: Formulations for the Manifold of Diffeomorphisms
Title | Accelerated Optimization in the PDE Framework: Formulations for the Manifold of Diffeomorphisms |
Authors | Ganesh Sundaramoorthi, Anthony Yezzi |
Abstract | We consider the problem of optimization of cost functionals on the infinite-dimensional manifold of diffeomorphisms. We present a new class of optimization methods, valid for any optimization problem set up on the space of diffeomorphisms, by generalizing Nesterov accelerated optimization to the manifold of diffeomorphisms. While our framework is general for infinite-dimensional manifolds, we specifically treat the case of diffeomorphisms, motivated by optical flow problems in computer vision. This is accomplished by building on a recent variational approach to a general class of accelerated optimization methods by Wibisono, Wilson and Jordan, which applies in finite dimensions. We generalize that approach to infinite-dimensional manifolds. We derive the surprisingly simple continuum evolution equations, which are partial differential equations, for accelerated gradient descent, and relate them to simple mechanical principles from fluid mechanics. Our approach has natural connections to the optimal mass transport problem. This is because one can think of our approach as an evolution of an infinite number of particles endowed with mass (represented with a mass density) that move in an energy landscape. The mass evolves with the optimization variable, and endows the particles with dynamics. This is different from the finite-dimensional case, where only a single particle moves and hence the dynamics do not depend on the mass. We derive the theory, compute the PDEs for accelerated optimization, and illustrate the behavior of these new accelerated optimization schemes. |
Tasks | Optical Flow Estimation |
Published | 2018-04-04 |
URL | http://arxiv.org/abs/1804.02307v2 |
http://arxiv.org/pdf/1804.02307v2.pdf | |
PWC | https://paperswithcode.com/paper/accelerated-optimization-in-the-pde-framework |
Repo | |
Framework | |
Survey of Face Detection on Low-quality Images
Title | Survey of Face Detection on Low-quality Images |
Authors | Yuqian Zhou, Ding Liu, Thomas Huang |
Abstract | Face detection is a well-explored problem. Many challenges for face detectors, such as extreme pose, illumination, low resolution and small scales, have been studied in previous work. However, previously proposed models are mostly trained and tested on good-quality images, which is not always the case for practical applications like surveillance systems. In this paper, we first review the current state-of-the-art face detectors and their performance on the benchmark dataset FDDB, and compare the design protocols of the algorithms. Secondly, we investigate their performance degradation when tested on low-quality images with different levels of blur, noise, and contrast. Our results demonstrate that both hand-crafted and deep-learning based face detectors are not robust enough for low-quality images. This should inspire researchers to produce more robust designs for face detection in the wild. |
Tasks | Face Detection |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.07362v1 |
http://arxiv.org/pdf/1804.07362v1.pdf | |
PWC | https://paperswithcode.com/paper/survey-of-face-detection-on-low-quality |
Repo | |
Framework | |
Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks
Title | Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks |
Authors | Daniel Tanneberg, Jan Peters, Elmar Rueckert |
Abstract | Autonomous robots need to interact with unknown, unstructured and changing environments, constantly facing novel challenges. Therefore, continuous online adaptation for lifelong learning and sample-efficient mechanisms to adapt to changes in the environment, the constraints, the tasks, or the robot itself are crucial. In this work, we propose a novel framework for probabilistic online motion planning with online adaptation based on a bio-inspired stochastic recurrent neural network. By using learning signals which mimic the intrinsic motivation signal cognitive dissonance, in addition to a mental replay strategy to intensify experiences, the stochastic recurrent network can learn from few physical interactions and adapt to novel environments in seconds. We evaluate our online planning and adaptation framework on an anthropomorphic KUKA LWR arm. The rapid online adaptation is shown by learning unknown workspace constraints sample-efficiently from few physical interactions while following given way points. |
Tasks | Motion Planning |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.08013v2 |
http://arxiv.org/pdf/1802.08013v2.pdf | |
PWC | https://paperswithcode.com/paper/intrinsic-motivation-and-mental-replay-enable |
Repo | |
Framework | |
IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles
Title | IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles |
Authors | Tianze Shi, Kedar Tatwawadi, Kaushik Chakrabarti, Yi Mao, Oleksandr Polozov, Weizhu Chen |
Abstract | We present a sequence-to-action parsing approach for the natural language to SQL task that incrementally fills the slots of a SQL query with feasible actions from a pre-defined inventory. To account for the fact that typically there are multiple correct SQL queries with the same or very similar semantics, we draw inspiration from syntactic parsing techniques and propose to train our sequence-to-action models with non-deterministic oracles. We evaluate our models on the WikiSQL dataset and achieve an execution accuracy of 83.7% on the test set, a 2.1% absolute improvement over the models trained with traditional static oracles assuming a single correct target SQL query. When further combined with the execution-guided decoding strategy, our model sets a new state-of-the-art performance at an execution accuracy of 87.1%. |
Tasks | Action Parsing, Text-To-Sql |
Published | 2018-09-13 |
URL | http://arxiv.org/abs/1809.05054v2 |
http://arxiv.org/pdf/1809.05054v2.pdf | |
PWC | https://paperswithcode.com/paper/incsql-training-incremental-text-to-sql |
Repo | |
Framework | |
Deep Attention-guided Hashing
Title | Deep Attention-guided Hashing |
Authors | Zhan Yang, Osolo Ian Raymond, Wuqing Sun, Jun Long |
Abstract | With the rapid growth of multimedia data (e.g., images, audio and video) on the web, learning-based hashing techniques such as Deep Supervised Hashing (DSH) have proven to be very efficient for large-scale multimedia search. The recent successes seen in learning-based hashing methods are largely due to the success of deep learning-based hashing methods. However, there are some limitations to previous learning-based hashing methods (e.g., the learned hash codes contain repetitive and highly correlated information). In this paper, we propose a novel learning-based hashing method, named Deep Attention-guided Hashing (DAgH). DAgH is implemented using a two-stream framework. The core idea is to use guided hash codes, which are generated by the hashing network of the first stream (called the first hashing network), to guide the training of the hashing network of the second stream (called the second hashing network). Specifically, the first network leverages an attention network and a hashing network to generate the attention-guided hash codes from the original images. The loss function we propose contains two components: the semantic loss and the attention loss. The attention loss is used to penalize the attention network so that it obtains the salient region from pairs of images. In the second network, these attention-guided hash codes are used to guide the training of the second hashing network (i.e., these codes are treated as supervised labels to train the second network). By doing this, DAgH can make full use of the most critical information contained in images to guide the second hashing network in order to learn efficient hash codes in a true end-to-end fashion. Results from our experiments demonstrate that DAgH can generate high quality hash codes and that it outperforms current state-of-the-art methods on three benchmark datasets: CIFAR-10, NUS-WIDE, and ImageNet. |
Tasks | Deep Attention |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01404v2 |
http://arxiv.org/pdf/1812.01404v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-attention-guided-hashing |
Repo | |
Framework | |
An Adaptive Learning Method of Personality Trait Based Mood in Mental State Transition Network by Recurrent Neural Network
Title | An Adaptive Learning Method of Personality Trait Based Mood in Mental State Transition Network by Recurrent Neural Network |
Authors | Takumi Ichimura, Kosuke Tanabe, Toshiyuki Yamashita |
Abstract | Mental State Transition Network (MSTN) is a basic concept for approximating human psychological and mental responses. A stimulus calculated by the Emotion Generating Calculations (EGC) method can cause the mood to transition from one emotional state to another. In this paper, the agent can interact with a human to realize smooth communication by means of an adaptive learning method for the user’s personality-trait-based mood. The learning method consists of the profit sharing (PS) method and a recurrent neural network (RNN). An emotion for sensor inputs to MSTN is calculated by EGC, the variance of emotion leads to a change of mental state, and the sequence of states forms an episode. In order to learn the tendency of the personality trait effectively, ineffective rules should be removed from the episode. The PS method finds detours in the episode, which should be deleted. Furthermore, the RNN works to realize the variance of the user’s mood. Experimental results showed success in representing a variety of delicate human emotions. |
Tasks | |
Published | 2018-04-09 |
URL | http://arxiv.org/abs/1804.02813v1 |
http://arxiv.org/pdf/1804.02813v1.pdf | |
PWC | https://paperswithcode.com/paper/an-adaptive-learning-method-of-personality |
Repo | |
Framework | |
Local Distance Metric Learning for Nearest Neighbor Algorithm
Title | Local Distance Metric Learning for Nearest Neighbor Algorithm |
Authors | Hossein Rajabzadeh, Mansoor Zolghadri Jahromi, Mohammad Sadegh Zare, Mostafa Fakhrahmad |
Abstract | Distance metric learning is a successful way to enhance the performance of the nearest neighbor classifier. In most cases, however, the distribution of data does not obey a regular form and may change in different parts of the feature space. In view of this, this paper proposes a novel local distance metric learning method, namely Local Mahalanobis Distance Learning (LMDL), in order to enhance the performance of the nearest neighbor classifier. LMDL considers the neighborhood influence and learns multiple distance metrics for a reduced set of input samples. The samples in the reduced set are called prototypes, and they try to preserve local discriminative information as much as possible. The proposed LMDL can be kernelized very easily, which is highly desirable in the case of highly nonlinear data. The quality as well as the efficiency of the proposed method is assessed through a set of different experiments on various datasets, and the obtained results show that LMDL as well as its kernelized version is superior to the other related state-of-the-art methods. |
Tasks | Metric Learning |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.01562v2 |
http://arxiv.org/pdf/1803.01562v2.pdf | |
PWC | https://paperswithcode.com/paper/local-distance-metric-learning-for-nearest |
Repo | |
Framework | |
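The idea of attaching a separate Mahalanobis metric to each prototype, as described in the abstract above, can be sketched with a nearest-prototype classifier. This is a minimal sketch under stated assumptions: the metric matrices here are hand-set rather than learned, the names `mahalanobis_sq` and `classify` are hypothetical, and the learning procedure of LMDL itself is not reproduced.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]
# Each prototype carries its position, its own metric matrix, and a class label.
Prototype = Tuple[Vector, Matrix, str]


def mahalanobis_sq(x: Vector, p: Vector, M: Matrix) -> float:
    """Squared Mahalanobis distance (x - p)^T M (x - p) under metric M."""
    d = [xi - pi for xi, pi in zip(x, p)]
    return sum(d[i] * M[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))


def classify(x: Vector, prototypes: List[Prototype]) -> str:
    """Assign x the label of the prototype that is nearest under its own local metric."""
    _, _, label = min(prototypes, key=lambda proto: mahalanobis_sq(x, proto[0], proto[1]))
    return label


# Toy prototypes with hand-set (not learned) metrics.
toy_prototypes: List[Prototype] = [
    ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], "a"),  # Euclidean metric
    ([4.0, 0.0], [[0.1, 0.0], [0.0, 1.0]], "b"),  # first axis heavily discounted
]
predicted = classify([1.5, 0.0], toy_prototypes)
```

In this toy case the query is closer to prototype "a" in Euclidean terms, but the local metric of prototype "b" discounts distance along the first axis, so the query is assigned to "b".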
Visual Attention on the Sun: What Do Existing Models Actually Predict?
Title | Visual Attention on the Sun: What Do Existing Models Actually Predict? |
Authors | Jia Li, Daowei Li, Kui Fu, Long Xu |
Abstract | Visual attention prediction is a classic problem that seems to be well addressed in the deep learning era. One compelling concern, however, has gradually arisen along with the rapidly growing performance scores on existing visual attention datasets: do existing deep models really capture the inherent mechanism of human visual attention? To address this concern, this paper proposes a new dataset, named VASUN, that records free-viewing human attention on solar images. Unlike previous datasets, images in VASUN contain many irregular visual patterns that existing deep models have never seen. By benchmarking existing models on VASUN, we find that the performance of many state-of-the-art deep models drops remarkably, while many classic shallow models perform impressively. From these results, we find that the significant performance advance of existing deep attention models may come from their capability of memorizing and predicting the occurrence of some specific visual patterns rather than from learning the inherent mechanism of human visual attention. In addition, we also train several baseline models on VASUN to demonstrate the feasibility and key issues of predicting visual attention on the sun. These baseline models, together with the proposed dataset, can be used to revisit the problem of visual attention prediction from a novel perspective that is complementary to existing ones. |
Tasks | Deep Attention |
Published | 2018-11-25 |
URL | http://arxiv.org/abs/1811.10004v1 |
http://arxiv.org/pdf/1811.10004v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-attention-on-the-sun-what-do-existing |
Repo | |
Framework | |