Paper Group ANR 929
An Attention-Based Word-Level Interaction Model: Relation Detection for Knowledge Base Question Answering
Title | An Attention-Based Word-Level Interaction Model: Relation Detection for Knowledge Base Question Answering |
Authors | Hongzhi Zhang, Guandong Xu, Xiao Liang, Tinglei Huang, Kun Fu |
Abstract | Relation detection plays a crucial role in Knowledge Base Question Answering (KBQA) because of the high variance of relation expression in the question. Traditional deep learning methods follow an encoding-comparing paradigm, where the question and the candidate relation are represented as vectors to compare their semantic similarity. The max- or average-pooling operation, which compresses the sequence of words into a fixed-dimensional vector, becomes an information bottleneck. In this paper, we propose to learn attention-based word-level interactions between questions and relations to alleviate the bottleneck issue. As in the traditional models, the question and relation are first represented as sequences of vectors. Then, instead of merging the sequence into a single vector with a pooling operation, soft alignments between words from the question and the relation are learned. The aligned words are subsequently compared with a convolutional neural network (CNN), and the comparison results are finally merged. By performing the comparison on low-level representations, the attention-based word-level interaction model (ABWIM) relieves the information loss caused by merging the sequence into a fixed-dimensional vector before the comparison. The experimental results of relation detection on both SimpleQuestions and WebQuestions datasets show that ABWIM achieves state-of-the-art accuracy, demonstrating its effectiveness. |
Tasks | Knowledge Base Question Answering, Question Answering, Semantic Similarity, Semantic Textual Similarity |
Published | 2018-01-30 |
URL | http://arxiv.org/abs/1801.09893v1 |
http://arxiv.org/pdf/1801.09893v1.pdf | |
PWC | https://paperswithcode.com/paper/an-attention-based-word-level-interaction |
Repo | |
Framework | |
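The soft-alignment step described in the ABWIM abstract above can be illustrated with a minimal sketch. It assumes dot-product scores and a softmax over question words for each relation word; the abstract does not state the scoring function or which side attends to which, so both choices, along with the names `soft_align`, `question`, and `relation`, are hypothetical. The CNN comparison and final merge are omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def soft_align(question, relation):
    """For each relation word vector, build an attention-weighted
    mixture of the question word vectors (assumed dot-product scoring)."""
    dim = len(question[0])
    aligned = []
    for r in relation:
        weights = softmax([dot(r, q) for q in question])
        mixed = [sum(w * q[d] for w, q in zip(weights, question))
                 for d in range(dim)]
        aligned.append(mixed)
    return aligned

# Toy 2-d embeddings, invented for illustration only.
question = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
relation = [[2.0, 0.0], [0.0, 2.0]]
aligned = soft_align(question, relation)
```

Each relation word keeps its own aligned question summary, so no single pooled vector has to carry the whole sequence before the comparison.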
Democratizing Production-Scale Distributed Deep Learning
Title | Democratizing Production-Scale Distributed Deep Learning |
Authors | Minghuang Ma, Hadi Pouransari, Daniel Chao, Saurabh Adya, Santiago Akle Serrano, Yi Qin, Dan Gimnicher, Dominic Walsh |
Abstract | The interest and demand for training deep neural networks have been experiencing rapid growth, spanning a wide range of applications in both academia and industry. However, training them distributed and at scale remains difficult due to the complex ecosystem of tools and hardware involved. One consequence is that the responsibility of orchestrating these complex components is often left to one-off scripts and glue code customized for specific problems. To address these restrictions, we introduce Alchemist - an internal service built at Apple from the ground up for easy, fast, and scalable distributed training. We discuss its design, implementation, and examples of running different flavors of distributed training. We also present case studies of its internal adoption in the development of autonomous systems, where training times have been reduced by 10x to keep up with the ever-growing data collection. |
Tasks | |
Published | 2018-10-31 |
URL | http://arxiv.org/abs/1811.00143v2 |
http://arxiv.org/pdf/1811.00143v2.pdf | |
PWC | https://paperswithcode.com/paper/democratizing-production-scale-distributed |
Repo | |
Framework | |
An Analysis of Approaches Taken in the ACM RecSys Challenge 2018 for Automatic Music Playlist Continuation
Title | An Analysis of Approaches Taken in the ACM RecSys Challenge 2018 for Automatic Music Playlist Continuation |
Authors | Hamed Zamani, Markus Schedl, Paul Lamere, Ching-Wei Chen |
Abstract | The ACM Recommender Systems Challenge 2018 focused on the task of automatic music playlist continuation, which is a form of the more general task of sequential recommendation. Given a playlist of arbitrary length with some additional meta-data, the task was to recommend up to 500 tracks that fit the target characteristics of the original playlist. For the RecSys Challenge, Spotify released a dataset of one million user-generated playlists. Participants could compete in two tracks, i.e., main and creative tracks. Participants in the main track were only allowed to use the provided training set; in the creative track, however, the use of external public sources was permitted. In total, 113 teams submitted 1,228 runs to the main track; 33 teams submitted 239 runs to the creative track. The highest performing team in the main track achieved an R-precision of 0.2241, an NDCG of 0.3946, and an average number of recommended songs clicks of 1.784. In the creative track, an R-precision of 0.2233, an NDCG of 0.3939, and a click rate of 1.785 were obtained by the best team. This article provides an overview of the challenge, including motivation, task definition, dataset description, and evaluation. We further report and analyze the results obtained by the top performing teams in each track and explore the approaches taken by the winners. We finally summarize our key findings, discuss generalizability of approaches and results to domains other than music, and list the open avenues and possible future directions in the area of automatic playlist continuation. |
Tasks | Recommendation Systems |
Published | 2018-10-02 |
URL | https://arxiv.org/abs/1810.01520v2 |
https://arxiv.org/pdf/1810.01520v2.pdf | |
PWC | https://paperswithcode.com/paper/an-analysis-of-approaches-taken-in-the-acm |
Repo | |
Framework | |
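Two of the metrics reported in the abstract above, R-precision and NDCG, can be sketched as follows. These are the textbook track-level definitions with binary relevance; the challenge's official formulas may differ in detail (for example in partial credit or the exact discount), so treat this as an illustration only. The track identifiers are invented.

```python
import math

def r_precision(recommended, relevant):
    """Share of the first |relevant| recommendations that are relevant."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    top = recommended[:len(relevant)]
    return len(set(top) & relevant) / len(relevant)

def ndcg(recommended, relevant):
    """Binary-relevance NDCG with a log2(rank + 1) discount."""
    relevant = set(relevant)
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, track in enumerate(recommended) if track in relevant)
    ideal_hits = min(len(relevant), len(recommended))
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / ideal if ideal else 0.0

recommended = ["a", "b", "c", "d"]   # ranked list, assumed free of duplicates
relevant = {"a", "c", "x"}           # held-out tracks of the playlist

rp = r_precision(recommended, relevant)   # 2 of the top 3 are relevant
score = ndcg(recommended, relevant)
```

R-precision ignores order within the top |relevant| positions, while NDCG rewards placing relevant tracks earlier, which is why the two are reported side by side.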
Accurate Hand Keypoint Localization on Mobile Devices
Title | Accurate Hand Keypoint Localization on Mobile Devices |
Authors | Filippos Gouidis, Paschalis Panteleris, Iason Oikonomidis, Antonis Argyros |
Abstract | We present a novel approach for 2D hand keypoint localization from regular color input. The proposed approach relies on an appropriately designed Convolutional Neural Network (CNN) that computes a set of heatmaps, one per hand keypoint of interest. Extensive experiments with the proposed method compare it against state-of-the-art approaches and demonstrate its accuracy and computational performance on standard, publicly available datasets. The obtained results demonstrate that the proposed method matches or outperforms the competing methods in accuracy and clearly outperforms them in computational efficiency, making it a suitable building block for applications that require hand keypoint estimation on mobile devices. |
Tasks | Hand Keypoint Localization |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.08028v1 |
http://arxiv.org/pdf/1812.08028v1.pdf | |
PWC | https://paperswithcode.com/paper/accurate-hand-keypoint-localization-on-mobile |
Repo | |
Framework | |
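The abstract above says the network outputs one heatmap per hand keypoint. A common way to turn such heatmaps into 2D coordinates is to take the location of each heatmap's maximum; the abstract does not say how the authors decode their heatmaps, so the argmax decoding below is an assumption, and the tiny heatmaps are invented.

```python
def keypoints_from_heatmaps(heatmaps):
    """Return one (row, col, score) triple per heatmap, at its maximum."""
    keypoints = []
    for heatmap in heatmaps:
        best = (0, 0, heatmap[0][0])
        for r, row in enumerate(heatmap):
            for c, value in enumerate(row):
                if value > best[2]:
                    best = (r, c, value)
        keypoints.append(best)
    return keypoints

# Two toy 2x3 heatmaps standing in for two keypoints.
heatmaps = [
    [[0.1, 0.2, 0.1],
     [0.0, 0.9, 0.3]],
    [[0.7, 0.1, 0.0],
     [0.2, 0.1, 0.1]],
]
keypoints = keypoints_from_heatmaps(heatmaps)
```

The peak value can double as a confidence score, which is useful for discarding keypoints that are occluded or out of frame.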
Incomplete Multi-view Clustering via Graph Regularized Matrix Factorization
Title | Incomplete Multi-view Clustering via Graph Regularized Matrix Factorization |
Authors | Jie Wen, Zheng Zhang, Yong Xu, Zuofeng Zhong |
Abstract | Clustering with incomplete views is a challenge in multi-view clustering. In this paper, we provide a novel and simple method to address this issue. Specifically, the proposed method simultaneously exploits the local information of each view and the complementary information among views to learn the common latent representation for all samples, which can greatly improve the compactness and discriminability of the obtained representation. Compared with the conventional graph embedding methods, the proposed method does not introduce any extra regularization term and corresponding penalty parameter to preserve the local structure of data, and thus does not increase the burden of extra parameter selection. By imposing the orthogonal constraint on the basis matrix of each view, the proposed method is able to handle out-of-sample data. Moreover, the proposed method can be viewed as a unified framework for multi-view learning since it can handle both incomplete and complete multi-view clustering and classification tasks. Extensive experiments conducted on several multi-view datasets show that the proposed method can significantly improve the clustering performance. |
Tasks | Graph Embedding, Multi-view Learning |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.05998v1 |
http://arxiv.org/pdf/1809.05998v1.pdf | |
PWC | https://paperswithcode.com/paper/incomplete-multi-view-clustering-via-graph |
Repo | |
Framework | |
A Compositional Textual Model for Recognition of Imperfect Word Images
Title | A Compositional Textual Model for Recognition of Imperfect Word Images |
Authors | Wei Tang, John Corring, Ying Wu, Gang Hua |
Abstract | Printed text recognition is an important problem for industrial OCR systems. Printed text is constructed in a standard procedural fashion in most settings. We develop a mathematical model for this process that can be applied to the backward inference problem of text recognition from an image. Through ablation experiments we show that this model is realistic and that a multi-task objective setting can help to stabilize estimation of its free parameters, enabling use of conventional deep learning methods. Furthermore, by directly modeling the geometric perturbations of text synthesis we show that our model can help recover missing characters from incomplete text regions, the bane of multicomponent OCR systems, enabling recognition even when the detection returns incomplete information. |
Tasks | Optical Character Recognition |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.11239v1 |
http://arxiv.org/pdf/1811.11239v1.pdf | |
PWC | https://paperswithcode.com/paper/a-compositional-textual-model-for-recognition |
Repo | |
Framework | |
On Chatbots Exhibiting Goal-Directed Autonomy in Dynamic Environments
Title | On Chatbots Exhibiting Goal-Directed Autonomy in Dynamic Environments |
Authors | Biplav Srivastava |
Abstract | Conversation interfaces (CIs), or chatbots, are a popular form of intelligent agents that engage humans in task-oriented or informal conversation. In this position paper and demonstration, we argue that chatbots working in dynamic environments, like with sensor data, can not only serve as a promising platform to research issues at the intersection of learning, reasoning, representation and execution for goal-directed autonomy; but also handle non-trivial business applications. We explore the underlying issues in the context of Water Advisor, a preliminary multi-modal conversation system that can access and explain water quality data. |
Tasks | |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09789v1 |
http://arxiv.org/pdf/1803.09789v1.pdf | |
PWC | https://paperswithcode.com/paper/on-chatbots-exhibiting-goal-directed-autonomy |
Repo | |
Framework | |
Tight Bounds for Collaborative PAC Learning via Multiplicative Weights
Title | Tight Bounds for Collaborative PAC Learning via Multiplicative Weights |
Authors | Jiecao Chen, Qin Zhang, Yuan Zhou |
Abstract | We study the collaborative PAC learning problem recently proposed in Blum et al. [BHPQ17], in which $k$ players want to learn a target function collaboratively, such that the learned function approximates the target function well on all players’ distributions simultaneously. The quality of the collaborative learning algorithm is measured by the ratio between the sample complexity of the algorithm and that of the learning algorithm for a single distribution (called the overhead). We obtain a collaborative learning algorithm with overhead $O(\ln k)$, improving the one with overhead $O(\ln^2 k)$ in [BHPQ17]. We also show that an $\Omega(\ln k)$ overhead is inevitable when $k$ is polynomially bounded by the VC dimension of the hypothesis class. Finally, our experimental study demonstrates the superiority of our algorithm over the one in Blum et al. on real-world datasets. |
Tasks | |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.09217v2 |
http://arxiv.org/pdf/1805.09217v2.pdf | |
PWC | https://paperswithcode.com/paper/tight-bounds-for-collaborative-pac-learning |
Repo | |
Framework | |
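The title above names multiplicative weights as the core tool. The following is a generic sketch of one multiplicative-weights round over $k$ players, not the authors' algorithm: the doubling factor, the `failed` flags (players whose distribution the current hypothesis serves poorly), and the function name are all hypothetical.

```python
def multiplicative_weights_round(weights, failed, factor=2.0):
    """Scale up the weight of each flagged player, then renormalize
    so the weights again form a probability distribution."""
    updated = [w * factor if f else w for w, f in zip(weights, failed)]
    total = sum(updated)
    return [w / total for w in updated]

k = 4
weights = [1.0 / k] * k                  # start uniform over players
failed = [False, True, False, False]     # assumed outcome of one round
weights = multiplicative_weights_round(weights, failed)
```

Drawing the next round's samples from the weighted mixture of player distributions focuses effort on the players that are still poorly served, which is the intuition behind using this kind of update in a collaborative setting.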
Profitable Bandits
Title | Profitable Bandits |
Authors | Mastane Achab, Stephan Clémençon, Aurélien Garivier |
Abstract | Originally motivated by default risk management applications, this paper investigates a novel problem, referred to here as the profitable bandit problem. At each step, an agent chooses a subset of the K possible actions. For each action chosen, she then receives the sum of a random number of rewards. Her objective is to maximize her cumulative earnings. For this purpose, we adapt and study three well-known strategies that were proved to be most efficient in other settings: kl-UCB, Bayes-UCB and Thompson Sampling. For each of them, we prove a finite time regret bound which, together with a lower bound we obtain as well, establishes asymptotic optimality. Our goal is also to compare these three strategies from both a theoretical and an empirical perspective. We give simple, self-contained proofs that emphasize their similarities, as well as their differences. While both Bayesian strategies are automatically adapted to the geometry of information, the numerical experiments carried out show a slight advantage for Thompson Sampling in practice. |
Tasks | |
Published | 2018-05-08 |
URL | http://arxiv.org/abs/1805.02908v1 |
http://arxiv.org/pdf/1805.02908v1.pdf | |
PWC | https://paperswithcode.com/paper/profitable-bandits |
Repo | |
Framework | |
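The subset-selection setting in the abstract above can be sketched with Thompson Sampling under strong simplifications: each chosen action yields a single Bernoulli outcome per step (the paper allows a random number of rewards), and an action counts as profitable when its success probability exceeds a fixed `threshold`. The Beta(1, 1) prior, the threshold rule, and all names are assumptions made for illustration.

```python
import random

def thompson_select(successes, failures, threshold, rng):
    """Pick every arm whose posterior sample looks profitable."""
    chosen = []
    for arm, (s, f) in enumerate(zip(successes, failures)):
        sample = rng.betavariate(s + 1, f + 1)   # Beta(1, 1) prior
        if sample > threshold:
            chosen.append(arm)
    return chosen

def simulate(true_probs, threshold, steps, seed=0):
    """Run the selection rule and count how often each arm is played."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes, failures, pulls = [0] * k, [0] * k, [0] * k
    for _ in range(steps):
        for arm in thompson_select(successes, failures, threshold, rng):
            pulls[arm] += 1
            if rng.random() < true_probs[arm]:
                successes[arm] += 1
            else:
                failures[arm] += 1
    return pulls

# Arm 0 is profitable (0.9 > 0.5), arm 1 is not (0.2 < 0.5).
pulls = simulate([0.9, 0.2], threshold=0.5, steps=500)
```

Unlike the classical bandit, several arms (or none) may be played in one step, so the quantity to control is how often unprofitable arms are selected rather than how often the single best arm is missed.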
Survey of Computational Approaches to Lexical Semantic Change
Title | Survey of Computational Approaches to Lexical Semantic Change |
Authors | Nina Tahmasebi, Lars Borin, Adam Jatowt |
Abstract | Our languages are in constant flux driven by external factors such as cultural, societal and technological changes, as well as by only partially understood internal motivations. Words acquire new meanings and lose old senses, new words are coined or borrowed from other languages and obsolete words slide into obscurity. Understanding the characteristics of shifts in the meaning and in the use of words is useful for those who work with the content of historical texts, the interested general public, but also in and of itself. The findings from automatic lexical semantic change detection, and the models of diachronic conceptual change are currently being incorporated in approaches for measuring document across-time similarity, information retrieval from long-term document archives, the design of OCR algorithms, and so on. In recent years we have seen a surge in interest in the academic community in computational methods and tools supporting inquiry into diachronic conceptual change and lexical replacement. This article is an extract of a survey of recent computational techniques to tackle lexical semantic change currently under review. In this article we focus on diachronic conceptual change as an extension of semantic change. |
Tasks | Information Retrieval, Optical Character Recognition |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06278v2 |
http://arxiv.org/pdf/1811.06278v2.pdf | |
PWC | https://paperswithcode.com/paper/survey-of-computational-approaches-to |
Repo | |
Framework | |
Microscopic Nuclei Classification, Segmentation and Detection with improved Deep Convolutional Neural Network (DCNN) Approaches
Title | Microscopic Nuclei Classification, Segmentation and Detection with improved Deep Convolutional Neural Network (DCNN) Approaches |
Authors | Md Zahangir Alom, Chris Yakopcic, Tarek M. Taha, Vijayan K. Asari |
Abstract | Due to cellular heterogeneity, cell nuclei classification, segmentation, and detection from pathological images are challenging tasks. In the last few years, Deep Convolutional Neural Network (DCNN) approaches have shown state-of-the-art (SOTA) performance on histopathological imaging in different studies. In this work, we propose several advanced DCNN models and evaluate them for nuclei classification, segmentation, and detection. First, the Densely Connected Recurrent Convolutional Network (DCRN) model is used for nuclei classification. Second, the Recurrent Residual U-Net (R2U-Net) is applied for nuclei segmentation. Third, the R2U-Net regression model, named UD-Net, is used for nuclei detection from pathological images. The experiments are conducted on different datasets, including the Routine Colon Cancer (RCC) classification and detection dataset and the Nuclei Segmentation Challenge 2018 dataset. The experimental results show that the proposed DCNN models provide superior performance compared to the existing approaches for nuclei classification, segmentation, and detection tasks. The results are evaluated with different performance metrics including precision, recall, Dice Coefficient (DC), Mean Squared Error (MSE), F1-score, and overall accuracy. We achieve around 3.4% and 4.5% better F1-score for nuclei classification and detection tasks compared to a recently published DCNN-based method. In addition, R2U-Net shows around 92.15% testing accuracy in terms of DC. These improved methods will support pathological practice by enabling better quantitative analysis of nuclei in Whole Slide Images (WSI), which will ultimately help in better understanding different types of cancer in the clinical workflow. |
Tasks | Nuclei Classification |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03447v1 |
http://arxiv.org/pdf/1811.03447v1.pdf | |
PWC | https://paperswithcode.com/paper/microscopic-nuclei-classification |
Repo | |
Framework | |
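Two of the metrics listed in the abstract above, the Dice Coefficient and the F1-score, are closely related, as this small sketch shows. The masks are flattened binary lists invented for illustration; returning 1.0 for two empty masks is a convention chosen here, not something stated in the source.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for 0/1 masks."""
    overlap = sum(p & t for p, t in zip(pred, target))
    size = sum(pred) + sum(target)
    return 2.0 * overlap / size if size else 1.0

def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2.0 * tp / denom if denom else 0.0

pred = [1, 1, 0, 0, 1]     # predicted nucleus pixels
target = [1, 0, 0, 1, 1]   # ground-truth nucleus pixels

dice = dice_coefficient(pred, target)

# The same masks expressed as pixel-level counts.
tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
f1 = f1_score(tp, fp, fn)
```

On binary masks the two values coincide, since |pred| + |target| equals 2TP + FP + FN; the names differ mainly by convention between segmentation and classification or detection reporting.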
From Videos to URLs: A Multi-Browser Guide To Extract User’s Behavior with Optical Character Recognition
Title | From Videos to URLs: A Multi-Browser Guide To Extract User’s Behavior with Optical Character Recognition |
Authors | Mojtaba Heidarysafa, James Reed, Kamran Kowsari, April Celeste R. Leviton, Janet I. Warren, Donald E. Brown |
Abstract | Tracking users’ activities on the World Wide Web (WWW) allows researchers to analyze each user’s internet behavior over time and the amount of time spent on a particular domain. This analysis can be used in research design, as researchers may access their participants’ behaviors while browsing the web. Web search behavior has been a subject of interest because of its real-world applications in marketing, digital advertisement, and identifying potential threats online. In this paper, we present an image-processing based method to extract domains which are visited by a participant over multiple browsers during a lab session. This method could provide another way to collect users’ activities during an online session given that the session recorder collected the data. The method can also be used to collect the textual content of web-pages that an individual visits for later analysis. |
Tasks | Optical Character Recognition |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06193v1 |
http://arxiv.org/pdf/1811.06193v1.pdf | |
PWC | https://paperswithcode.com/paper/from-videos-to-urls-a-multi-browser-guide-to |
Repo | |
Framework | |
Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery
Title | Low-pass Recurrent Neural Networks - A memory architecture for longer-term correlation discovery |
Authors | Thomas Stepleton, Razvan Pascanu, Will Dabney, Siddhant M. Jayakumar, Hubert Soyer, Remi Munos |
Abstract | Reinforcement learning (RL) agents performing complex tasks must be able to remember observations and actions across sizable time intervals. This is especially true during the initial learning stages, when exploratory behaviour can increase the delay between specific actions and their effects. Many new or popular approaches for learning these distant correlations employ backpropagation through time (BPTT), but this technique requires storing observation traces long enough to span the interval between cause and effect. Besides memory demands, learning dynamics like vanishing gradients and slow convergence due to infrequent weight updates can reduce BPTT’s practicality; meanwhile, although online recurrent network learning is a developing topic, most approaches are not efficient enough to use as replacements. We propose a simple, effective memory strategy that can extend the window over which BPTT can learn without requiring longer traces. We explore this approach empirically on a few tasks and discuss its implications. |
Tasks | |
Published | 2018-05-13 |
URL | http://arxiv.org/abs/1805.04955v1 |
http://arxiv.org/pdf/1805.04955v1.pdf | |
PWC | https://paperswithcode.com/paper/low-pass-recurrent-neural-networks-a-memory |
Repo | |
Framework | |
Fuzzy Clustering to Identify Clusters at Different Levels of Fuzziness: An Evolutionary Multi-Objective Optimization Approach
Title | Fuzzy Clustering to Identify Clusters at Different Levels of Fuzziness: An Evolutionary Multi-Objective Optimization Approach |
Authors | Avisek Gupta, Shounak Datta, Swagatam Das |
Abstract | Fuzzy clustering methods identify naturally occurring clusters in a dataset, where the extent to which different clusters are overlapped can differ. Most methods have a parameter to fix the level of fuzziness. However, the appropriate level of fuzziness depends on the application at hand. This paper presents Entropy $c$-Means (ECM), a method of fuzzy clustering that simultaneously optimizes two contradictory objective functions, resulting in the creation of fuzzy clusters with different levels of fuzziness. This allows ECM to identify clusters with different degrees of overlap. ECM optimizes the two objective functions using two multi-objective optimization methods, Non-dominated Sorting Genetic Algorithm II (NSGA-II), and Multiobjective Evolutionary Algorithm based on Decomposition (MOEA/D). We also propose a method to select a suitable trade-off clustering from the Pareto front. Experiments on challenging synthetic datasets as well as real-world datasets show that ECM leads to better cluster detection compared to the conventional fuzzy clustering methods as well as previously used multi-objective methods for fuzzy clustering. |
Tasks | |
Published | 2018-08-09 |
URL | http://arxiv.org/abs/1808.03327v1 |
http://arxiv.org/pdf/1808.03327v1.pdf | |
PWC | https://paperswithcode.com/paper/fuzzy-clustering-to-identify-clusters-at |
Repo | |
Framework | |
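The abstract above notes that most fuzzy clustering methods fix the level of fuzziness with a parameter. In standard fuzzy c-means (not the ECM method proposed in the paper) that parameter is the fuzzifier $m > 1$, and its effect on memberships can be sketched as follows. The distances are invented and assumed to be strictly positive.

```python
def fcm_memberships(distances, m):
    """Standard fuzzy c-means membership of one point in each cluster:
    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), for fuzzifier m > 1."""
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
            for d_i in distances]

distances = [1.0, 2.0]   # distances from one point to two cluster centers

crisp = fcm_memberships(distances, m=1.1)   # close to a hard assignment
usual = fcm_memberships(distances, m=2.0)   # [0.8, 0.2]
fuzzy = fcm_memberships(distances, m=5.0)   # memberships drift toward 0.5
```

Small values of $m$ push memberships toward 0 or 1, and large values flatten them, which is why a single fixed $m$ cannot suit datasets whose clusters overlap to different degrees.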
RotDCF: Decomposition of Convolutional Filters for Rotation-Equivariant Deep Networks
Title | RotDCF: Decomposition of Convolutional Filters for Rotation-Equivariant Deep Networks |
Authors | Xiuyuan Cheng, Qiang Qiu, Robert Calderbank, Guillermo Sapiro |
Abstract | Explicit encoding of group actions in deep features makes it possible for convolutional neural networks (CNNs) to handle global deformations of images, which is critical to success in many vision tasks. This paper proposes to decompose the convolutional filters over joint steerable bases across the space and the group geometry simultaneously, namely a rotation-equivariant CNN with decomposed convolutional filters (RotDCF). This decomposition facilitates computing the joint convolution, which is proved to be necessary for the group equivariance. It significantly reduces the model size and computational complexity while preserving performance, and truncation of the bases expansion serves implicitly to regularize the filters. On datasets involving in-plane and out-of-plane object rotations, RotDCF deep features demonstrate greater robustness and interpretability than regular CNNs. The stability of the equivariant representation to input variations is also proved theoretically under generic assumptions on the filters in the decomposed form. The RotDCF framework can be extended to groups other than rotations, providing a general approach which achieves both group equivariance and representation stability at a reduced model size. |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06846v1 |
http://arxiv.org/pdf/1805.06846v1.pdf | |
PWC | https://paperswithcode.com/paper/rotdcf-decomposition-of-convolutional-filters |
Repo | |
Framework | |