October 17, 2019

Paper Group ANR 929

An Attention-Based Word-Level Interaction Model: Relation Detection for Knowledge Base Question Answering


Title	An Attention-Based Word-Level Interaction Model: Relation Detection for Knowledge Base Question Answering
Authors	Hongzhi Zhang, Guandong Xu, Xiao Liang, Tinglei Huang, Kun Fu
Abstract	Relation detection plays a crucial role in Knowledge Base Question Answering (KBQA) because of the high variance of relation expression in the question. Traditional deep learning methods follow an encoding-comparing paradigm, where the question and the candidate relation are represented as vectors to compare their semantic similarity. The max- or average-pooling operation, which compresses the sequence of words into a fixed-dimensional vector, becomes the information bottleneck. In this paper, we propose to learn attention-based word-level interactions between questions and relations to alleviate the bottleneck issue. As in traditional models, the question and relation are first represented as sequences of vectors. Then, instead of merging the sequence into a single vector with a pooling operation, soft alignments between words from the question and the relation are learned. The aligned words are subsequently compared with a convolutional neural network (CNN), and the comparison results are finally merged. By performing the comparison on low-level representations, the attention-based word-level interaction model (ABWIM) relieves the information loss caused by merging the sequence into a fixed-dimensional vector before the comparison. The experimental results of relation detection on both the SimpleQuestions and WebQuestions datasets show that ABWIM achieves state-of-the-art accuracy, demonstrating its effectiveness.
Tasks	Knowledge Base Question Answering, Question Answering, Semantic Similarity, Semantic Textual Similarity
Published	2018-01-30
URL	http://arxiv.org/abs/1801.09893v1
PDF	http://arxiv.org/pdf/1801.09893v1.pdf
PWC	https://paperswithcode.com/paper/an-attention-based-word-level-interaction
Repo
Framework
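
The soft alignment described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes plain dot-product attention scores and toy 2-dimensional word vectors, and the names `soft_align`, `question`, and `relation` are hypothetical. The paper's learned scoring function and its CNN comparison step are not reproduced here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def soft_align(question, relation):
    """For each relation word, attend over all question words.

    Returns one (weights, context) pair per relation word, where
    `weights` sums to 1 and `context` is the attention-weighted
    average of the question word vectors.
    Assumption: dot-product scoring (the paper may score differently).
    """
    aligned = []
    for r in relation:
        weights = softmax([dot(r, q) for q in question])
        context = [
            sum(w * q[d] for w, q in zip(weights, question))
            for d in range(len(r))
        ]
        aligned.append((weights, context))
    return aligned


# Toy word vectors (hypothetical values, for illustration only).
question = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
relation = [[2.0, 0.0], [0.0, 2.0]]
alignment = soft_align(question, relation)
```

Each relation word keeps its own aligned view of the question, so no single pooled vector has to carry the whole sequence before the comparison takes place.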

Democratizing Production-Scale Distributed Deep Learning


Title	Democratizing Production-Scale Distributed Deep Learning
Authors	Minghuang Ma, Hadi Pouransari, Daniel Chao, Saurabh Adya, Santiago Akle Serrano, Yi Qin, Dan Gimnicher, Dominic Walsh
Abstract	The interest in and demand for training deep neural networks have been growing rapidly, spanning a wide range of applications in both academia and industry. However, training them in a distributed fashion and at scale remains difficult due to the complex ecosystem of tools and hardware involved. One consequence is that the responsibility of orchestrating these complex components is often left to one-off scripts and glue code customized for specific problems. To address these restrictions, we introduce Alchemist, an internal service built at Apple from the ground up for easy, fast, and scalable distributed training. We discuss its design, implementation, and examples of running different flavors of distributed training. We also present case studies of its internal adoption in the development of autonomous systems, where training times have been reduced by 10x to keep up with the ever-growing data collection.
Tasks
Published	2018-10-31
URL	http://arxiv.org/abs/1811.00143v2
PDF	http://arxiv.org/pdf/1811.00143v2.pdf
PWC	https://paperswithcode.com/paper/democratizing-production-scale-distributed
Repo
Framework

An Analysis of Approaches Taken in the ACM RecSys Challenge 2018 for Automatic Music Playlist Continuation


Title	An Analysis of Approaches Taken in the ACM RecSys Challenge 2018 for Automatic Music Playlist Continuation
Authors	Hamed Zamani, Markus Schedl, Paul Lamere, Ching-Wei Chen
Abstract	The ACM Recommender Systems Challenge 2018 focused on the task of automatic music playlist continuation, which is a form of the more general task of sequential recommendation. Given a playlist of arbitrary length with some additional meta-data, the task was to recommend up to 500 tracks that fit the target characteristics of the original playlist. For the RecSys Challenge, Spotify released a dataset of one million user-generated playlists. Participants could compete in two tracks, i.e., main and creative tracks. Participants in the main track were only allowed to use the provided training set; in the creative track, however, the use of external public sources was permitted. In total, 113 teams submitted 1,228 runs to the main track; 33 teams submitted 239 runs to the creative track. The highest performing team in the main track achieved an R-precision of 0.2241, an NDCG of 0.3946, and an average number of recommended songs clicks of 1.784. In the creative track, the best team obtained an R-precision of 0.2233, an NDCG of 0.3939, and a click rate of 1.785. This article provides an overview of the challenge, including motivation, task definition, dataset description, and evaluation. We further report and analyze the results obtained by the top performing teams in each track and explore the approaches taken by the winners. We finally summarize our key findings, discuss generalizability of approaches and results to domains other than music, and list the open avenues and possible future directions in the area of automatic playlist continuation.
Tasks	Recommendation Systems
Published	2018-10-02
URL	https://arxiv.org/abs/1810.01520v2
PDF	https://arxiv.org/pdf/1810.01520v2.pdf
PWC	https://paperswithcode.com/paper/an-analysis-of-approaches-taken-in-the-acm
Repo
Framework
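
Two of the metrics reported in the abstract above, R-precision and NDCG, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses track-level matches only, binary relevance, and the common textbook discount `1 / log2(rank + 1)`. The challenge's official definitions may differ in detail, and the function names and track IDs are hypothetical.

```python
import math


def r_precision(recommended, relevant):
    """Fraction of the top-R recommendations that are relevant,
    where R is the number of held-out (relevant) tracks."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return len(set(recommended[:r]) & relevant) / r


def ndcg(recommended, relevant):
    """Normalized discounted cumulative gain with binary relevance.

    Assumes `recommended` contains no duplicate tracks.
    Ranks are 1-based, so position i (0-based) is discounted by log2(i + 2).
    """
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, track in enumerate(recommended)
        if track in relevant
    )
    ideal_hits = min(len(relevant), len(recommended))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical ranked recommendations and held-out tracks.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "c"}
rp = r_precision(recommended, relevant)
gain = ndcg(recommended, relevant)
```

R-precision ignores ordering within the top R positions, while NDCG rewards placing relevant tracks earlier, which is why the two are reported side by side.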

Accurate Hand Keypoint Localization on Mobile Devices


Title	Accurate Hand Keypoint Localization on Mobile Devices
Authors	Filippos Gouidis, Paschalis Panteleris, Iason Oikonomidis, Antonis Argyros
Abstract	We present a novel approach for 2D hand keypoint localization from regular color input. The proposed approach relies on an appropriately designed Convolutional Neural Network (CNN) that computes a set of heatmaps, one per hand keypoint of interest. Extensive experiments compare the proposed method against state-of-the-art approaches and demonstrate its accuracy and computational performance on standard, publicly available datasets. The obtained results demonstrate that the proposed method matches or outperforms the competing methods in accuracy and clearly outperforms them in computational efficiency, making it a suitable building block for applications that require hand keypoint estimation on mobile devices.
Tasks	Hand Keypoint Localization
Published	2018-12-19
URL	http://arxiv.org/abs/1812.08028v1
PDF	http://arxiv.org/pdf/1812.08028v1.pdf
PWC	https://paperswithcode.com/paper/accurate-hand-keypoint-localization-on-mobile
Repo
Framework
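
The abstract above says the network outputs one heatmap per keypoint but does not say how coordinates are read off. A common convention, assumed here and not confirmed by the source, is to take the location of the maximum response in each heatmap. The function name and the toy 2x2 heatmaps are hypothetical.

```python
def keypoints_from_heatmaps(heatmaps):
    """Decode one (x, y, score) triple per heatmap via argmax.

    `heatmaps` is a list of 2D lists indexed as heatmap[y][x].
    Assumption: the peak response marks the keypoint location.
    """
    keypoints = []
    for heatmap in heatmaps:
        score, x, y = max(
            ((value, x, y)
             for y, row in enumerate(heatmap)
             for x, value in enumerate(row)),
            key=lambda item: item[0],
        )
        keypoints.append((x, y, score))
    return keypoints


# Two toy heatmaps standing in for two hand keypoints.
heatmaps = [
    [[0.1, 0.2],
     [0.9, 0.0]],
    [[0.0, 0.7],
     [0.3, 0.1]],
]
keypoints = keypoints_from_heatmaps(heatmaps)
```

The returned score can serve as a rough confidence value, for example to discard keypoints whose peak response is weak.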

Incomplete Multi-view Clustering via Graph Regularized Matrix Factorization


Title	Incomplete Multi-view Clustering via Graph Regularized Matrix Factorization
Authors	Jie Wen, Zheng Zhang, Yong Xu, Zuofeng Zhong
Abstract	Clustering with incomplete views is a challenge in multi-view clustering. In this paper, we provide a novel and simple method to address this issue. Specifically, the proposed method simultaneously exploits the local information of each view and the complementary information among views to learn the common latent representation for all samples, which can greatly improve the compactness and discriminability of the obtained representation. Compared with conventional graph embedding methods, the proposed method does not introduce any extra regularization term and corresponding penalty parameter to preserve the local structure of data, and thus does not increase the burden of extra parameter selection. By imposing the orthogonal constraint on the basis matrix of each view, the proposed method is able to handle out-of-sample data. Moreover, the proposed method can be viewed as a unified framework for multi-view learning since it can handle both incomplete and complete multi-view clustering and classification tasks. Extensive experiments conducted on several multi-view datasets show that the proposed method can significantly improve the clustering performance.
Tasks	Graph Embedding, Multi-View Learning
Published	2018-09-17
URL	http://arxiv.org/abs/1809.05998v1
PDF	http://arxiv.org/pdf/1809.05998v1.pdf
PWC	https://paperswithcode.com/paper/incomplete-multi-view-clustering-via-graph
Repo
Framework

A Compositional Textual Model for Recognition of Imperfect Word Images


Title	A Compositional Textual Model for Recognition of Imperfect Word Images
Authors	Wei Tang, John Corring, Ying Wu, Gang Hua
Abstract	Printed text recognition is an important problem for industrial OCR systems. Printed text is constructed in a standard procedural fashion in most settings. We develop a mathematical model for this process that can be applied to the backward inference problem of text recognition from an image. Through ablation experiments we show that this model is realistic and that a multi-task objective setting can help to stabilize estimation of its free parameters, enabling use of conventional deep learning methods. Furthermore, by directly modeling the geometric perturbations of text synthesis we show that our model can help recover missing characters from incomplete text regions, the bane of multicomponent OCR systems, enabling recognition even when the detection returns incomplete information.
Tasks	Optical Character Recognition
Published	2018-11-27
URL	http://arxiv.org/abs/1811.11239v1
PDF	http://arxiv.org/pdf/1811.11239v1.pdf
PWC	https://paperswithcode.com/paper/a-compositional-textual-model-for-recognition
Repo
Framework

On Chatbots Exhibiting Goal-Directed Autonomy in Dynamic Environments


Title	On Chatbots Exhibiting Goal-Directed Autonomy in Dynamic Environments
Authors	Biplav Srivastava
Abstract	Conversation interfaces (CIs), or chatbots, are a popular form of intelligent agents that engage humans in task-oriented or informal conversation. In this position paper and demonstration, we argue that chatbots working in dynamic environments, such as those involving sensor data, can not only serve as a promising platform to research issues at the intersection of learning, reasoning, representation and execution for goal-directed autonomy, but also handle non-trivial business applications. We explore the underlying issues in the context of Water Advisor, a preliminary multi-modal conversation system that can access and explain water quality data.
Tasks
Published	2018-03-26
URL	http://arxiv.org/abs/1803.09789v1
PDF	http://arxiv.org/pdf/1803.09789v1.pdf
PWC	https://paperswithcode.com/paper/on-chatbots-exhibiting-goal-directed-autonomy
Repo
Framework

Tight Bounds for Collaborative PAC Learning via Multiplicative Weights


Title	Tight Bounds for Collaborative PAC Learning via Multiplicative Weights
Authors	Jiecao Chen, Qin Zhang, Yuan Zhou
Abstract	We study the collaborative PAC learning problem recently proposed in Blum et al. [BHPQ17], in which we have $k$ players and they want to learn a target function collaboratively, such that the learned function approximates the target function well on all players’ distributions simultaneously. The quality of the collaborative learning algorithm is measured by the ratio between the sample complexity of the algorithm and that of the learning algorithm for a single distribution (called the overhead). We obtain a collaborative learning algorithm with overhead $O(\ln k)$, improving the one with overhead $O(\ln^2 k)$ in [BHPQ17]. We also show that an $\Omega(\ln k)$ overhead is inevitable when $k$ is polynomially bounded by the VC dimension of the hypothesis class. Finally, our experimental study has demonstrated the superiority of our algorithm compared with the one in Blum et al. on real-world datasets.
Tasks
Published	2018-05-23
URL	http://arxiv.org/abs/1805.09217v2
PDF	http://arxiv.org/pdf/1805.09217v2.pdf
PWC	https://paperswithcode.com/paper/tight-bounds-for-collaborative-pac-learning
Repo
Framework
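
The title above names multiplicative weights as the underlying tool. The sketch below shows only a generic multiplicative-weights reweighting step over $k$ players, not the paper's algorithm: the doubling factor, the `failed` flags, and the function name `mw_update` are assumptions made for illustration.

```python
def mw_update(weights, failed, factor=2.0):
    """One generic multiplicative-weights step.

    Players on whose distribution the current hypothesis performed
    poorly (failed[i] is True) have their weight multiplied by
    `factor`; the weights are then renormalized to sum to 1.
    Assumption: `factor` and the failure test are illustrative choices.
    """
    updated = [w * factor if f else w for w, f in zip(weights, failed)]
    total = sum(updated)
    return [w / total for w in updated]


# Four players with uniform initial weights; the hypothesis fails on player 0.
weights = [0.25, 0.25, 0.25, 0.25]
new_weights = mw_update(weights, [True, False, False, False])
```

Repeating such a step shifts sampling effort toward the players that are currently served worst, which is the intuition behind using multiplicative weights to satisfy all distributions simultaneously.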

Profitable Bandits


Title	Profitable Bandits
Authors	Mastane Achab, Stephan Clémençon, Aurélien Garivier
Abstract	Originally motivated by default risk management applications, this paper investigates a novel problem, referred to here as the profitable bandit problem. At each step, an agent chooses a subset of the K possible actions. For each action chosen, she then receives the sum of a random number of rewards. Her objective is to maximize her cumulative earnings. For this purpose, we adapt and study three well-known strategies that were proved to be most efficient in other settings: kl-UCB, Bayes-UCB and Thompson Sampling. For each of them, we prove a finite-time regret bound which, together with a lower bound we also obtain, establishes asymptotic optimality. Our goal is also to compare these three strategies from both a theoretical and an empirical perspective. We give simple, self-contained proofs that emphasize their similarities, as well as their differences. While both Bayesian strategies are automatically adapted to the geometry of information, the numerical experiments carried out show a slight advantage for Thompson Sampling in practice.
Tasks
Published	2018-05-08
URL	http://arxiv.org/abs/1805.02908v1
PDF	http://arxiv.org/pdf/1805.02908v1.pdf
PWC	https://paperswithcode.com/paper/profitable-bandits
Repo
Framework
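
Thompson Sampling, one of the three strategies named in the abstract above, can be illustrated in its classic form. The sketch below is the standard single-arm Bernoulli version with Beta(1, 1) priors, not the paper's adaptation to subsets of actions and random numbers of rewards; the arm means, the seed, and the function name are hypothetical.

```python
import random


def thompson_sampling(true_means, rounds, seed=0):
    """Classic Bernoulli Thompson Sampling with Beta(1, 1) priors.

    Each round: draw one sample per arm from its Beta posterior,
    pull the arm with the largest sample, and update that arm's
    success or failure count. Returns the number of pulls per arm.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(rounds):
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(k)
        ]
        arm = samples.index(max(samples))
        # Simulated Bernoulli reward from the (hypothetical) true mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = thompson_sampling([0.1, 0.9], rounds=2000)
```

Because arms are chosen by posterior samples rather than point estimates, exploration fades on its own as the posteriors concentrate.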

Survey of Computational Approaches to Lexical Semantic Change


Title	Survey of Computational Approaches to Lexical Semantic Change
Authors	Nina Tahmasebi, Lars Borin, Adam Jatowt
Abstract	Our languages are in constant flux driven by external factors such as cultural, societal and technological changes, as well as by only partially understood internal motivations. Words acquire new meanings and lose old senses, new words are coined or borrowed from other languages and obsolete words slide into obscurity. Understanding the characteristics of shifts in the meaning and in the use of words is useful for those who work with the content of historical texts, the interested general public, but also in and of itself. The findings from automatic lexical semantic change detection, and the models of diachronic conceptual change are currently being incorporated in approaches for measuring document across-time similarity, information retrieval from long-term document archives, the design of OCR algorithms, and so on. In recent years we have seen a surge in interest in the academic community in computational methods and tools supporting inquiry into diachronic conceptual change and lexical replacement. This article is an extract of a survey of recent computational techniques to tackle lexical semantic change currently under review. In this article we focus on diachronic conceptual change as an extension of semantic change.
Tasks	Information Retrieval, Optical Character Recognition
Published	2018-11-15
URL	http://arxiv.org/abs/1811.06278v2
PDF	http://arxiv.org/pdf/1811.06278v2.pdf
PWC	https://paperswithcode.com/paper/survey-of-computational-approaches-to
Repo
Framework

Microscopic Nuclei Classification, Segmentation and Detection with improved Deep Convolutional Neural Network (DCNN) Approaches


Title	Microscopic Nuclei Classification, Segmentation and Detection with improved Deep Convolutional Neural Network (DCNN) Approaches
Authors	Md Zahangir Alom, Chris Yakopcic, Tarek M. Taha, Vijayan K. Asari
Abstract	Due to cellular heterogeneity, cell nuclei classification, segmentation, and detection from pathological images are challenging tasks. In the last few years, Deep Convolutional Neural Network (DCNN) approaches have shown state-of-the-art (SOTA) performance on histopathological imaging in different studies. In this work, we propose several advanced DCNN models and evaluate them for nuclei classification, segmentation, and detection. First, the Densely Connected Recurrent Convolutional Network (DCRN) model is used for nuclei classification. Second, the Recurrent Residual U-Net (R2U-Net) is applied for nuclei segmentation. Third, the R2U-Net regression model, named UD-Net, is used for nuclei detection from pathological images. The experiments are conducted on different datasets, including the Routine Colon Cancer (RCC) classification and detection dataset and the Nuclei Segmentation Challenge 2018 dataset. The experimental results show that the proposed DCNN models provide superior performance compared to the existing approaches for nuclei classification, segmentation, and detection tasks. The results are evaluated with different performance metrics including precision, recall, Dice Coefficient (DC), Mean Squared Error (MSE), F1-score, and overall accuracy. We have achieved F1-scores around 3.4% and 4.5% better for the nuclei classification and detection tasks, respectively, compared to a recently published DCNN-based method. In addition, R2U-Net shows around 92.15% testing accuracy in terms of DC. These improved methods will support pathology practice through better quantitative analysis of nuclei in Whole Slide Images (WSI), which will ultimately help in better understanding different types of cancer in the clinical workflow.
Tasks	Nuclei Classification
Published	2018-11-08
URL	http://arxiv.org/abs/1811.03447v1
PDF	http://arxiv.org/pdf/1811.03447v1.pdf
PWC	https://paperswithcode.com/paper/microscopic-nuclei-classification
Repo
Framework
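
The Dice Coefficient (DC) used above to score segmentation is 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a ground-truth mask B. A minimal sketch on binary masks follows; the function name, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice coefficient between two binary masks given as 2D lists.

    Returns 2 * |A ∩ B| / (|A| + |B|).
    Assumption: two empty masks count as a perfect match (1.0).
    """
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x2 masks: the prediction marks two pixels, the ground truth one.
pred = [[1, 1],
        [0, 0]]
target = [[1, 0],
          [0, 0]]
score = dice_coefficient(pred, target)
```

Unlike pixel accuracy, the score is unaffected by the large number of background pixels that both masks leave empty.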

From Videos to URLs: A Multi-Browser Guide To Extract User’s Behavior with Optical Character Recognition


Title	From Videos to URLs: A Multi-Browser Guide To Extract User’s Behavior with Optical Character Recognition
Authors	Mojtaba Heidarysafa, James Reed, Kamran Kowsari, April Celeste R. Leviton, Janet I. Warren, Donald E. Brown
Abstract	Tracking users’ activities on the World Wide Web (WWW) allows researchers to analyze each user’s internet behavior over time, including the amount of time spent on a particular domain. This analysis can be used in research design, as researchers may access their participants’ behavior while browsing the web. Web search behavior has been a subject of interest because of its real-world applications in marketing, digital advertisement, and identifying potential threats online. In this paper, we present an image-processing-based method to extract the domains visited by a participant over multiple browsers during a lab session. This method could provide another way to collect users’ activities during an online session, provided that a session recorder collected the data. The method can also be used to collect the textual content of web pages that an individual visits for later analysis.
Tasks	Optical Character Recognition
Published	2018-11-15
URL	http://arxiv.org/abs/1811.06193v1
PDF	http://arxiv.org/pdf/1811.06193v1.pdf
PWC	https://paperswithcode.com/paper/from-videos-to-urls-a-multi-browser-guide-to
Repo
Framework

Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery


Title	Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery
Authors	Thomas Stepleton, Razvan Pascanu, Will Dabney, Siddhant M. Jayakumar, Hubert Soyer, Remi Munos
Abstract	Reinforcement learning (RL) agents performing complex tasks must be able to remember observations and actions across sizable time intervals. This is especially true during the initial learning stages, when exploratory behaviour can increase the delay between specific actions and their effects. Many new or popular approaches for learning these distant correlations employ backpropagation through time (BPTT), but this technique requires storing observation traces long enough to span the interval between cause and effect. Besides memory demands, learning dynamics like vanishing gradients and slow convergence due to infrequent weight updates can reduce BPTT’s practicality; meanwhile, although online recurrent network learning is a developing topic, most approaches are not efficient enough to use as replacements. We propose a simple, effective memory strategy that can extend the window over which BPTT can learn without requiring longer traces. We explore this approach empirically on a few tasks and discuss its implications.
Tasks
Published	2018-05-13
URL	http://arxiv.org/abs/1805.04955v1
PDF	http://arxiv.org/pdf/1805.04955v1.pdf
PWC	https://paperswithcode.com/paper/low-pass-recurrent-neural-networks-a-memory
Repo
Framework

Fuzzy Clustering to Identify Clusters at Different Levels of Fuzziness: An Evolutionary Multi-Objective Optimization Approach


Title	Fuzzy Clustering to Identify Clusters at Different Levels of Fuzziness: An Evolutionary Multi-Objective Optimization Approach
Authors	Avisek Gupta, Shounak Datta, Swagatam Das
Abstract	Fuzzy clustering methods identify naturally occurring clusters in a dataset, where the extent to which different clusters are overlapped can differ. Most methods have a parameter to fix the level of fuzziness. However, the appropriate level of fuzziness depends on the application at hand. This paper presents Entropy $c$-Means (ECM), a method of fuzzy clustering that simultaneously optimizes two contradictory objective functions, resulting in the creation of fuzzy clusters with different levels of fuzziness. This allows ECM to identify clusters with different degrees of overlap. ECM optimizes the two objective functions using two multi-objective optimization methods, Non-dominated Sorting Genetic Algorithm II (NSGA-II), and Multiobjective Evolutionary Algorithm based on Decomposition (MOEA/D). We also propose a method to select a suitable trade-off clustering from the Pareto front. Experiments on challenging synthetic datasets as well as real-world datasets show that ECM leads to better cluster detection compared to the conventional fuzzy clustering methods as well as previously used multi-objective methods for fuzzy clustering.
Tasks
Published	2018-08-09
URL	http://arxiv.org/abs/1808.03327v1
PDF	http://arxiv.org/pdf/1808.03327v1.pdf
PWC	https://paperswithcode.com/paper/fuzzy-clustering-to-identify-clusters-at
Repo
Framework

RotDCF: Decomposition of Convolutional Filters for Rotation-Equivariant Deep Networks


Title	RotDCF: Decomposition of Convolutional Filters for Rotation-Equivariant Deep Networks
Authors	Xiuyuan Cheng, Qiang Qiu, Robert Calderbank, Guillermo Sapiro
Abstract	Explicit encoding of group actions in deep features makes it possible for convolutional neural networks (CNNs) to handle global deformations of images, which is critical to success in many vision tasks. This paper proposes to decompose the convolutional filters over joint steerable bases across the space and the group geometry simultaneously, namely a rotation-equivariant CNN with decomposed convolutional filters (RotDCF). This decomposition facilitates computing the joint convolution, which is proved to be necessary for the group equivariance. It significantly reduces the model size and computational complexity while preserving performance, and truncation of the bases expansion serves implicitly to regularize the filters. On datasets involving in-plane and out-of-plane object rotations, RotDCF deep features demonstrate greater robustness and interpretability than regular CNNs. The stability of the equivariant representation to input variations is also proved theoretically under generic assumptions on the filters in the decomposed form. The RotDCF framework can be extended to groups other than rotations, providing a general approach which achieves both group equivariance and representation stability at a reduced model size.
Tasks
Published	2018-05-17
URL	http://arxiv.org/abs/1805.06846v1
PDF	http://arxiv.org/pdf/1805.06846v1.pdf
PWC	https://paperswithcode.com/paper/rotdcf-decomposition-of-convolutional-filters
Repo
Framework