May 7, 2019

Paper Group ANR 11

Associative Memory using Dictionary Learning and Expander Decoding. Spatio-Temporal Sentiment Hotspot Detection Using Geotagged Photos. A Comparison between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition. Importance Sampling for Minibatches. Domain Transfer Multi-Instance Dictionary Learning. Learning Semantic Part-Based Models …

Associative Memory using Dictionary Learning and Expander Decoding


Title	Associative Memory using Dictionary Learning and Expander Decoding
Authors	Arya Mazumdar, Ankit Singh Rawat
Abstract	An associative memory is a framework of content-addressable memory that stores a collection of message vectors (or a dataset) over a neural network while enabling a neurally feasible mechanism to recover any message in the dataset from its noisy version. Designing an associative memory requires addressing two main tasks: 1) learning phase: given a dataset, learn a concise representation of the dataset in the form of a graphical model (or a neural network), 2) recall phase: given a noisy version of a message vector from the dataset, output the correct message vector via a neurally feasible algorithm over the network learnt during the learning phase. This paper studies the problem of designing a class of neural associative memories which learns a network representation for a large dataset that ensures correction against a large number of adversarial errors during the recall phase. Specifically, the associative memories designed in this paper can store a dataset containing $\exp(n)$ $n$-length message vectors over a network with $O(n)$ nodes and can tolerate $\Omega(\frac{n}{{\rm polylog} n})$ adversarial errors. This paper carries out this memory design by mapping the learning phase and recall phase to the tasks of dictionary learning with a square dictionary and iterative error correction in an expander code, respectively.
Tasks	Dictionary Learning
Published	2016-11-29
URL	http://arxiv.org/abs/1611.09621v1
PDF	http://arxiv.org/pdf/1611.09621v1.pdf
PWC	https://paperswithcode.com/paper/associative-memory-using-dictionary-learning
Repo
Framework

Spatio-Temporal Sentiment Hotspot Detection Using Geotagged Photos


Title	Spatio-Temporal Sentiment Hotspot Detection Using Geotagged Photos
Authors	Yi Zhu, Shawn Newsam
Abstract	We perform spatio-temporal analysis of public sentiment using geotagged photo collections. We develop a deep learning-based classifier that predicts the emotion conveyed by an image. This allows us to associate sentiment with place. We perform spatial hotspot detection and show that different emotions have distinct spatial distributions that match expectations. We also perform temporal analysis using the capture time of the photos. Our spatio-temporal hotspot detection correctly identifies emerging concentrations of specific emotions and year-by-year analyses of select locations show there are strong temporal correlations between the predicted emotions and known events.
Tasks
Published	2016-09-21
URL	http://arxiv.org/abs/1609.06772v1
PDF	http://arxiv.org/pdf/1609.06772v1.pdf
PWC	https://paperswithcode.com/paper/spatio-temporal-sentiment-hotspot-detection
Repo
Framework

A Comparison between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition


Title	A Comparison between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition
Authors	Zhiyun Lu, Dong Guo, Alireza Bagheri Garakani, Kuan Liu, Avner May, Aurelien Bellet, Linxi Fan, Michael Collins, Brian Kingsbury, Michael Picheny, Fei Sha
Abstract	We study large-scale kernel methods for acoustic modeling and compare them to DNNs on performance metrics related to both acoustic modeling and recognition. Measured by perplexity and frame-level classification accuracy, kernel-based acoustic models are as effective as their DNN counterparts. However, on token error rates DNN models can be significantly better. We have discovered that this might be attributed to the DNN’s unique strength in reducing both the perplexity and the entropy of the predicted posterior probabilities. Motivated by our findings, we propose a new technique, entropy regularized perplexity, for model selection. This technique can noticeably improve the recognition performance of both types of models, and reduces the gap between them. While effective on Broadcast News, this technique could also be applicable to other tasks.
Tasks	Model Selection, Speech Recognition
Published	2016-03-18
URL	http://arxiv.org/abs/1603.05800v1
PDF	http://arxiv.org/pdf/1603.05800v1.pdf
PWC	https://paperswithcode.com/paper/a-comparison-between-deep-neural-nets-and
Repo
Framework
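
The abstract above contrasts two quantities computed from a model's predicted posteriors: perplexity and entropy. As a rough illustration only, the sketch below computes both on made-up toy data. The function names and numbers are assumptions for this example, and it does not reproduce the paper's "entropy regularized perplexity" criterion, whose exact form the abstract does not give.

```python
import math

def perplexity(posteriors, labels):
    """exp of the mean negative log-probability assigned to the true labels."""
    nll = -sum(math.log(p[y]) for p, y in zip(posteriors, labels)) / len(labels)
    return math.exp(nll)

def mean_entropy(posteriors):
    """Average Shannon entropy (in nats) of the predicted distributions."""
    total = 0.0
    for p in posteriors:
        # Zero-probability classes contribute nothing to the entropy.
        total -= sum(q * math.log(q) for q in p if q > 0.0)
    return total / len(posteriors)

# Two toy frames over three classes (made-up numbers, for illustration only).
posteriors = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]]
labels = [0, 1]

ppl = perplexity(posteriors, labels)   # depends only on the true-class probabilities
ent = mean_entropy(posteriors)         # depends on the whole predicted distribution
print(ppl, ent)
```

Two models can share the same perplexity while differing in entropy, because perplexity only looks at the probability given to the correct class.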

Importance Sampling for Minibatches


Title	Importance Sampling for Minibatches
Authors	Dominik Csiba, Peter Richtárik
Abstract	Minibatching is a very well studied and highly popular technique in supervised learning, used by practitioners due to its ability to accelerate training through better utilization of parallel processing power and reduction of stochastic variance. Another popular technique is importance sampling – a strategy for preferential sampling of more important examples also capable of accelerating the training process. However, despite considerable effort by the community in these areas, and due to the inherent technical difficulty of the problem, there is no existing work combining the power of importance sampling with the strength of minibatching. In this paper we propose the first importance sampling for minibatches and give simple and rigorous complexity analysis of its performance. We illustrate on synthetic problems that for training data of certain properties, our sampling can lead to several orders of magnitude improvement in training time. We then test the new sampling on several popular datasets, and show that the improvement can reach an order of magnitude.
Tasks
Published	2016-02-06
URL	http://arxiv.org/abs/1602.02283v1
PDF	http://arxiv.org/pdf/1602.02283v1.pdf
PWC	https://paperswithcode.com/paper/importance-sampling-for-minibatches
Repo
Framework
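
To make the idea of importance sampling in the abstract above concrete, here is a minimal sketch of the classical reweighting trick: draw examples with non-uniform probabilities and scale each draw by 1 / (n * p_i) so the estimate of the mean stays unbiased. This is not the minibatch sampling scheme proposed in the paper. It draws indices i.i.d. with replacement, and all names and numbers are hypothetical.

```python
import random

def sampling_probabilities(scores):
    """Normalize non-negative importance scores into a distribution."""
    total = sum(scores)
    return [s / total for s in scores]

def sample_minibatch(probs, batch_size, rng):
    """Draw indices i.i.d. with replacement according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)

def reweighted_mean(values, probs, batch):
    """Unbiased estimate of mean(values): each draw is scaled by 1 / (n * p_i)."""
    n = len(values)
    return sum(values[i] / (n * probs[i]) for i in batch) / len(batch)

def expected_estimate(values, probs):
    """Exact expectation of a single reweighted draw, to check unbiasedness."""
    n = len(values)
    return sum(p * v / (n * p) for v, p in zip(values, probs))

# Toy per-example quantities (e.g. gradient magnitudes); made-up numbers.
values = [1.0, 2.0, 3.0, 10.0]
probs = sampling_probabilities(values)      # sample large values more often
rng = random.Random(0)
batch = sample_minibatch(probs, batch_size=3, rng=rng)
estimate = reweighted_mean(values, probs, batch)
print(batch, estimate)
```

In this toy case the probabilities are exactly proportional to the values, so every reweighted draw equals the true mean and the estimator has zero variance. With real gradients the scores are only a proxy, and the variance is reduced rather than eliminated.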

Domain Transfer Multi-Instance Dictionary Learning


Title	Domain Transfer Multi-Instance Dictionary Learning
Authors	Ke Wang, Jiayong Liu, Daniel González
Abstract	In this paper, we investigate the domain transfer learning problem with multi-instance data. We assume we already have a well-trained multi-instance dictionary and its corresponding classifier from the source domain, which can be used to represent and classify the bags. However, they cannot be directly applied to the target domain. Thus we propose to adapt them to the target domain by adding an adaptive term to the source domain classifier. The adaptive function is a linear function based on a domain transfer multi-instance dictionary. Given a target domain bag, we first map it to a bag-level feature space using the domain transfer dictionary, and then apply the linear adaptive function to its bag-level feature vector. To learn the domain-transfer dictionary and the adaptive function parameter, we simultaneously minimize the average classification error of the target domain classifier over the target domain training set, and the complexities of both the adaptive function parameter and the domain transfer dictionary. The minimization problem is solved by an iterative algorithm which updates the dictionary and the function parameter alternately. Experiments over several benchmark data sets show the advantage of the proposed method over existing state-of-the-art domain transfer multi-instance learning methods.
Tasks	Dictionary Learning, Transfer Learning
Published	2016-05-26
URL	http://arxiv.org/abs/1605.08397v1
PDF	http://arxiv.org/pdf/1605.08397v1.pdf
PWC	https://paperswithcode.com/paper/domain-transfer-multi-instance-dictionary
Repo
Framework

Learning Semantic Part-Based Models from Google Images


Title	Learning Semantic Part-Based Models from Google Images
Authors	Davide Modolo, Vittorio Ferrari
Abstract	We propose a technique to train semantic part-based models of object classes from Google Images. Our models encompass the appearance of parts and their spatial arrangement on the object, specific to each viewpoint. We learn these rich models by collecting training instances for both parts and objects, and automatically connecting the two levels. Our framework works incrementally, by learning from easy examples first, and then gradually adapting to harder ones. A key benefit of this approach is that it requires no manual part location annotations. We evaluate our models on the challenging PASCAL-Part dataset [1] and show how their performance increases at every step of the learning, with the final models more than doubling the performance of directly training from images retrieved by querying for part names (from 12.9 to 27.2 AP). Moreover, we show that our part models can help object detection performance by enriching the R-CNN detector with parts.
Tasks	Object Detection
Published	2016-09-11
URL	http://arxiv.org/abs/1609.03140v2
PDF	http://arxiv.org/pdf/1609.03140v2.pdf
PWC	https://paperswithcode.com/paper/learning-semantic-part-based-models-from
Repo
Framework

Tasks for agent-based negotiation teams: Analysis, review, and challenges


Title	Tasks for agent-based negotiation teams: Analysis, review, and challenges
Authors	Victor Sanchez-Anguix, Vicente Julian, Vicente Botti, Ana Garcia-Fornes
Abstract	An agent-based negotiation team is a group of interdependent agents that join together as a single negotiation party due to their shared interests in the negotiation at hand. The reasons to employ an agent-based negotiation team may vary: (i) more computation and parallelization capabilities, (ii) the need to unite agents with different expertise and skills whose joint work makes it possible to tackle complex negotiation domains, (iii) the necessity to represent different stakeholders or different preferences in the same party (e.g., organizations, countries, and married couples). The topic of agent-based negotiation teams has been recently introduced in multi-agent research. Therefore, it is necessary to identify good practices, challenges, and related research that may help in advancing the state-of-the-art in agent-based negotiation teams. For that reason, in this article we review the tasks to be carried out by agent-based negotiation teams. Each task is analyzed and related to current advances in different research areas. The analysis aims to identify special challenges that may arise due to the particularities of agent-based negotiation teams.
Tasks
Published	2016-04-16
URL	http://arxiv.org/abs/1604.04727v1
PDF	http://arxiv.org/pdf/1604.04727v1.pdf
PWC	https://paperswithcode.com/paper/tasks-for-agent-based-negotiation-teams
Repo
Framework

Towards Visual Type Theory as a Mathematical Tool and Mathematical User Interface


Title	Towards Visual Type Theory as a Mathematical Tool and Mathematical User Interface
Authors	Lucius Schoenbaum
Abstract	A visual type theory is a cognitive tool that has much in common with language, and may be regarded as an exceptional form of spatial text adjunct. A mathematical visual type theory, called NPM, has been under development that can be viewed as an early-stage project in mathematical knowledge management and mathematical user interface development. We discuss in greater detail the notion of a visual type theory, report on progress towards a usable mathematical visual type theory, and discuss the outlook for future work on this project.
Tasks
Published	2016-08-10
URL	http://arxiv.org/abs/1608.03026v1
PDF	http://arxiv.org/pdf/1608.03026v1.pdf
PWC	https://paperswithcode.com/paper/towards-visual-type-theory-as-a-mathematical
Repo
Framework

MinMax Radon Barcodes for Medical Image Retrieval


Title	MinMax Radon Barcodes for Medical Image Retrieval
Authors	H. R. Tizhoosh, Shujin Zhu, Hanson Lo, Varun Chaudhari, Tahmid Mehdi
Abstract	Content-based medical image retrieval can support diagnostic decisions by clinical experts. Examining similar images may provide clues to the expert to remove uncertainties in his/her final diagnosis. Beyond conventional feature descriptors, binary features have recently been proposed in different ways to encode the image content. A recent proposal is “Radon barcodes” that employ binarized Radon projections to tag/annotate medical images with content-based binary vectors, called barcodes. In this paper, MinMax Radon barcodes are introduced which are superior to the “local thresholding” scheme suggested in the literature. Using the IRMA dataset with 14,410 x-ray images from 193 different classes, the advantage of using MinMax Radon barcodes over thresholded Radon barcodes is demonstrated. The retrieval error for direct search drops by more than 15%. In addition, SURF, as a well-established non-binary approach, and BRISK, as a recent binary method, are examined to compare their results with MinMax Radon barcodes when retrieving images from the IRMA dataset. The results demonstrate that MinMax Radon barcodes are faster and more accurate when applied on IRMA images.
Tasks	Image Retrieval, Medical Image Retrieval
Published	2016-10-02
URL	http://arxiv.org/abs/1610.00318v1
PDF	http://arxiv.org/pdf/1610.00318v1.pdf
PWC	https://paperswithcode.com/paper/minmax-radon-barcodes-for-medical-image
Repo
Framework
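
The abstract above describes barcodes as binarized Radon projections that tag an image with a binary vector. The following is a simplified sketch of the thresholded baseline only, not the MinMax scheme the paper introduces. It assumes just two axis-aligned projections (row and column sums) and a median threshold. Both choices, and every function name, are assumptions made for illustration.

```python
def projections(image):
    """Row sums and column sums: Radon projections at two axis-aligned angles."""
    rows = [sum(r) for r in image]
    cols = [sum(c) for c in zip(*image)]
    return rows, cols

def binarize(projection):
    """Threshold a projection at its median (an assumed thresholding rule)."""
    ordered = sorted(projection)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2
    return [1 if v > median else 0 for v in projection]

def barcode(image):
    """Concatenate the binarized projections into one binary vector."""
    rows, cols = projections(image)
    return binarize(rows) + binarize(cols)

def hamming(a, b):
    """Number of positions where two equal-length barcodes differ."""
    return sum(x != y for x, y in zip(a, b))

# Two toy 4x4 "images": a centered blob and the same blob in a corner.
centered = [[0, 0, 0, 0], [0, 9, 9, 0], [0, 9, 9, 0], [0, 0, 0, 0]]
corner = [[9, 9, 0, 0], [9, 9, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

code_a = barcode(centered)
code_b = barcode(corner)
print(code_a, code_b, hamming(code_a, code_b))
```

Retrieval with such codes amounts to ranking stored images by Hamming distance to the query barcode, which is cheap because the vectors are binary.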

Minimal Problems for the Calibrated Trifocal Variety


Title	Minimal Problems for the Calibrated Trifocal Variety
Authors	Joe Kileel
Abstract	We determine the algebraic degree of minimal problems for the calibrated trifocal variety in computer vision. We rely on numerical algebraic geometry and the homotopy continuation software Bertini.
Tasks
Published	2016-11-18
URL	http://arxiv.org/abs/1611.05947v1
PDF	http://arxiv.org/pdf/1611.05947v1.pdf
PWC	https://paperswithcode.com/paper/minimal-problems-for-the-calibrated-trifocal
Repo
Framework

Utilizing Large Scale Vision and Text Datasets for Image Segmentation from Referring Expressions


Title	Utilizing Large Scale Vision and Text Datasets for Image Segmentation from Referring Expressions
Authors	Ronghang Hu, Marcus Rohrbach, Subhashini Venugopalan, Trevor Darrell
Abstract	Image segmentation from referring expressions is a joint vision and language modeling task, where the input is an image and a textual expression describing a particular region in the image; and the goal is to localize and segment the specific image region based on the given expression. One major difficulty in training such language-based image segmentation systems is the lack of datasets with joint vision and text annotations. Although existing vision datasets such as MS COCO provide image captions, there are few datasets with region-level textual annotations for images, and these are often smaller in scale. In this paper, we explore how existing large scale vision-only and text-only datasets can be utilized to train models for image segmentation from referring expressions. We propose a method to address this problem, and show in experiments that our method can help this joint vision and language modeling task with vision-only and text-only data and outperforms previous results.
Tasks	Image Captioning, Language Modelling, Semantic Segmentation
Published	2016-08-30
URL	http://arxiv.org/abs/1608.08305v1
PDF	http://arxiv.org/pdf/1608.08305v1.pdf
PWC	https://paperswithcode.com/paper/utilizing-large-scale-vision-and-text
Repo
Framework

Adiabatic Persistent Contrastive Divergence Learning


Title	Adiabatic Persistent Contrastive Divergence Learning
Authors	Hyeryung Jang, Hyungwon Choi, Yung Yi, Jinwoo Shin
Abstract	This paper studies the problem of parameter learning in probabilistic graphical models having latent variables, where the standard approach is the expectation maximization algorithm alternating expectation (E) and maximization (M) steps. However, both E and M steps are computationally intractable for high dimensional data, while replacing one step with a faster surrogate to combat intractability can often cause convergence to fail. We propose a new learning algorithm which is computationally efficient and provably ensures convergence to a correct optimum. Its key idea is to run only a few cycles of Markov Chains (MC) in both E and M steps. Such an idea of running incomplete MC has been well studied only for the M step in the literature, where it is called Contrastive Divergence (CD) learning. While such known CD-based schemes find approximated gradients of the log-likelihood via the mean-field approach in the E step, our proposed algorithm finds exact ones via MC algorithms in both steps due to the multi-time-scale stochastic approximation theory. Despite its theoretical guarantee of convergence, the proposed scheme might suffer from the slow mixing of MC in the E step. To tackle it, we also propose a hybrid approach applying both mean-field and MC approximation in the E step, where the hybrid approach outperforms the bare mean-field CD scheme in our experiments on real-world datasets.
Tasks
Published	2016-05-26
URL	http://arxiv.org/abs/1605.08174v2
PDF	http://arxiv.org/pdf/1605.08174v2.pdf
PWC	https://paperswithcode.com/paper/adiabatic-persistent-contrastive-divergence
Repo
Framework

Proposing Plausible Answers for Open-ended Visual Question Answering


Title	Proposing Plausible Answers for Open-ended Visual Question Answering
Authors	Omid Bakhshandeh, Trung Bui, Zhe Lin, Walter Chang
Abstract	Answering open-ended questions is an essential capability for any intelligent agent. One of the most interesting recent open-ended question answering challenges is Visual Question Answering (VQA) which attempts to evaluate a system’s visual understanding through its answers to natural language questions about images. There exist many approaches to VQA, the majority of which do not exhibit deeper semantic understanding of the candidate answers they produce. We study the importance of generating plausible answers to a given question by introducing the novel task of ‘Answer Proposal’: for a given open-ended question, a system should generate a ranked list of candidate answers informed by the semantics of the question. We experiment with various models including a neural generative model as well as a semantic graph matching one. We provide both intrinsic and extrinsic evaluations for the task of Answer Proposal, showing that our best model learns to propose plausible answers with a high recall and performs competitively with some other solutions to VQA.
Tasks	Graph Matching, Question Answering, Visual Question Answering
Published	2016-10-20
URL	http://arxiv.org/abs/1610.06620v2
PDF	http://arxiv.org/pdf/1610.06620v2.pdf
PWC	https://paperswithcode.com/paper/proposing-plausible-answers-for-open-ended
Repo
Framework

Policy Networks with Two-Stage Training for Dialogue Systems


Title	Policy Networks with Two-Stage Training for Dialogue Systems
Authors	Mehdi Fatemi, Layla El Asri, Hannes Schulz, Jing He, Kaheer Suleman
Abstract	In this paper, we propose to use deep policy networks which are trained with an advantage actor-critic method for statistically optimised dialogue systems. First, we show that, on summary state and action spaces, deep Reinforcement Learning (RL) outperforms Gaussian Processes methods. Summary state and action spaces lead to good performance but require pre-engineering effort, RL knowledge, and domain expertise. In order to remove the need to define such summary spaces, we show that deep RL can also be trained efficiently on the original state and action spaces. Dialogue systems based on partially observable Markov decision processes are known to require many dialogues to train, which makes them unappealing for practical deployment. We show that a deep RL method based on an actor-critic architecture can exploit a small amount of data very efficiently. Indeed, with only a few hundred dialogues collected with a handcrafted policy, the actor-critic deep learner is considerably bootstrapped from a combination of supervised and batch RL. In addition, convergence to an optimal policy is significantly sped up compared to other deep RL methods initialized on the data with batch RL. All experiments are performed on a restaurant domain derived from the Dialogue State Tracking Challenge 2 (DSTC2) dataset.
Tasks	Dialogue State Tracking, Gaussian Processes
Published	2016-06-10
URL	http://arxiv.org/abs/1606.03152v4
PDF	http://arxiv.org/pdf/1606.03152v4.pdf
PWC	https://paperswithcode.com/paper/policy-networks-with-two-stage-training-for
Repo
Framework

A Systematic Approach for Cross-source Point Cloud Registration by Preserving Macro and Micro Structures


Title	A Systematic Approach for Cross-source Point Cloud Registration by Preserving Macro and Micro Structures
Authors	Xiaoshui Huang, Jian Zhang, Lixin Fan, Qiang Wu, Chun Yuan
Abstract	We propose a systematic approach for registering cross-source point clouds. The compelling need for cross-source point cloud registration is motivated by the rapid development of a variety of 3D sensing techniques, but many existing registration methods face critical challenges as a result of the large variations in cross-source point clouds. This paper therefore illustrates a novel registration method which successfully aligns two cross-source point clouds in the presence of significant missing data, large variations in point density, scale difference and so on. The robustness of the method is attributed to the extraction of macro and micro structures. Our work has three main contributions: (1) a systematic pipeline to deal with cross-source point cloud registration; (2) a graph construction method to maintain macro and micro structures; (3) a new graph matching method which considers the global geometric constraint to robustly register these variable graphs. Compared to most of the related methods, the experiments show that the proposed method successfully registers cross-source datasets, while other methods have difficulty achieving satisfactory results. The proposed method also shows great ability in same-source datasets.
Tasks	Graph Construction, Graph Matching, Point Cloud Registration
Published	2016-08-18
URL	http://arxiv.org/abs/1608.05143v2
PDF	http://arxiv.org/pdf/1608.05143v2.pdf
PWC	https://paperswithcode.com/paper/a-systematic-approach-for-cross-source-point
Repo
Framework