July 27, 2019

3029 words 15 mins read

Paper Group ANR 652

Relating Input Concepts to Convolutional Neural Network Decisions. Fine-graind Image Classification via Combining Vision and Language. Context Aware Document Embedding. Facial 3D Model Registration Under Occlusions With SensiblePoints-based Reinforced Hypothesis Refinement. Adversarial Discriminative Heterogeneous Face Recognition. Attention-Aware …

Relating Input Concepts to Convolutional Neural Network Decisions

Title Relating Input Concepts to Convolutional Neural Network Decisions
Authors Ning Xie, Md Kamruzzaman Sarker, Derek Doran, Pascal Hitzler, Michael Raymer
Abstract Many current methods to interpret convolutional neural networks (CNNs) use visualization techniques and words to highlight concepts of the input seemingly relevant to a CNN’s decision. These methods hypothesize that the recognition of such concepts is instrumental in the decision a CNN reaches, but the nature of this relationship has not been well explored. To address this gap, this paper examines the quality of a concept’s recognition by a CNN and the degree to which these recognitions are associated with CNN decisions. The study considers a CNN trained for scene recognition over the ADE20k dataset. It uses a novel approach to find and score the strength of minimally distributed representations of input concepts (defined by objects in scene images) across late-stage feature maps. Subsequent analysis finds evidence that concept recognition impacts decision making. Strong recognition of concepts that occur frequently in only a few scenes is indicative of correct decisions, but recognizing concepts common to many scenes may mislead the network.
Tasks Decision Making, Scene Recognition
Published 2017-11-21
URL http://arxiv.org/abs/1711.08006v1
PDF http://arxiv.org/pdf/1711.08006v1.pdf
PWC https://paperswithcode.com/paper/relating-input-concepts-to-convolutional
Repo
Framework
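
The paper’s exact procedure for locating and scoring minimally distributed representations is not spelled out in the abstract, but its core ingredient, checking how well a late-stage feature map’s strong activations line up with a concept’s object mask, can be illustrated with a small sketch. Everything below (the IoU-based score, the 0.5·max threshold, the toy 7x7 map) is an illustrative assumption, not the authors’ method:

```python
import numpy as np

def upsample_nearest(fmap, out_h, out_w):
    """Nearest-neighbour upsampling of a 2D activation map (assumes integer scale factors)."""
    fh, fw = fmap.shape
    return np.repeat(np.repeat(fmap, out_h // fh, axis=0), out_w // fw, axis=1)

def concept_recognition_score(feature_map, concept_mask):
    """IoU between the strongly-activated region of one late-stage feature map
    and a concept's segmentation mask (e.g. an ADE20k object)."""
    H, W = concept_mask.shape
    act = upsample_nearest(feature_map, H, W)
    if act.max() <= 0:
        return 0.0
    active = act >= 0.5 * act.max()          # keep only strong activations
    inter = np.logical_and(active, concept_mask).sum()
    union = np.logical_or(active, concept_mask).sum()
    return inter / union

# Toy example: a 7x7 feature map that fires on the top-left of a 224x224 image.
fmap = np.zeros((7, 7)); fmap[:3, :3] = 1.0
mask = np.zeros((224, 224), dtype=bool); mask[:96, :96] = True
print(round(concept_recognition_score(fmap, mask), 3))  # 1.0 -- perfect overlap
```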

Fine-graind Image Classification via Combining Vision and Language

Title Fine-graind Image Classification via Combining Vision and Language
Authors Xiangteng He, Yuxin Peng
Abstract Fine-grained image classification, which aims to recognize hundreds of sub-categories belonging to the same basic-level category, is a challenging task due to the large intra-class variance and small inter-class variance. Most existing fine-grained image classification methods learn part detection models to obtain semantic parts for better classification accuracy. Despite achieving promising results, these methods have two main limitations: (1) not all the parts obtained through the part detection models are beneficial and indispensable for classification, and (2) fine-grained image classification requires more detailed visual descriptions than part locations or attribute annotations can provide. To address these two limitations, this paper proposes a two-stream model combining vision and language (CVL) for learning latent semantic representations. The vision stream learns deep representations from the original visual information via a deep convolutional neural network. The language stream utilizes natural language descriptions, which can point out the discriminative parts or characteristics of each image, and provides a flexible and compact way of encoding the salient visual aspects that distinguish sub-categories. Since the two streams are complementary, combining them further improves classification accuracy. Compared with 12 state-of-the-art methods on the widely used CUB-200-2011 dataset for fine-grained image classification, the experimental results demonstrate that our CVL approach achieves the best performance.
Tasks Fine-Grained Image Classification, Image Classification
Published 2017-04-10
URL http://arxiv.org/abs/1704.02792v2
PDF http://arxiv.org/pdf/1704.02792v2.pdf
PWC https://paperswithcode.com/paper/fine-graind-image-classification-via
Repo
Framework
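
As a rough illustration of the two-stream idea (not the paper’s CVL architecture), a minimal PyTorch sketch that fuses precomputed image features with averaged word embeddings of a textual description before a joint classifier might look as follows; all layer sizes and module names are hypothetical:

```python
import torch
import torch.nn as nn

class TwoStreamClassifier(nn.Module):
    """Toy two-stream model: fuse precomputed image features with encoded text.

    In the paper the vision stream is a deep CNN and the language stream encodes
    natural-language descriptions; here both streams are small placeholders."""
    def __init__(self, img_dim=2048, vocab_size=10000, emb_dim=128, n_classes=200):
        super().__init__()
        self.vision = nn.Sequential(nn.Linear(img_dim, 256), nn.ReLU())
        self.embed = nn.EmbeddingBag(vocab_size, emb_dim)   # averages word embeddings
        self.language = nn.Sequential(nn.Linear(emb_dim, 256), nn.ReLU())
        self.classifier = nn.Linear(256 + 256, n_classes)

    def forward(self, img_feat, text_ids):
        v = self.vision(img_feat)                 # (B, 256)
        l = self.language(self.embed(text_ids))   # (B, 256)
        return self.classifier(torch.cat([v, l], dim=1))

model = TwoStreamClassifier()
logits = model(torch.randn(4, 2048), torch.randint(0, 10000, (4, 20)))
print(logits.shape)  # torch.Size([4, 200])
```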

Context Aware Document Embedding

Title Context Aware Document Embedding
Authors Zhaocheng Zhu, Junfeng Hu
Abstract Recently, doc2vec has achieved excellent results on a variety of tasks. In this paper, we present a context-aware variant of doc2vec. We introduce a novel weight-estimating mechanism that, using deep neural networks, generates a weight for each word occurrence according to its contribution in the context. Our context-aware model achieves results similar to doc2vec initialized with Wikipedia-trained vectors, while being much more efficient and free from the need for a heavy external corpus. Analysis of the context-aware weights shows that they act as a kind of enhanced IDF weight that captures sub-topic-level keywords in documents. They might result from deep neural networks that learn hidden representations with the least entropy.
Tasks Document Embedding
Published 2017-07-05
URL http://arxiv.org/abs/1707.01521v1
PDF http://arxiv.org/pdf/1707.01521v1.pdf
PWC https://paperswithcode.com/paper/context-aware-document-embedding
Repo
Framework
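
The learned per-occurrence weights cannot be reproduced from the abstract alone, but since the analysis describes them as behaving like enhanced IDF weights, a crude stand-in is an IDF-weighted average of word vectors. The sketch below is exactly that simplification, with toy synthetic data, and should not be read as the paper’s model:

```python
import numpy as np
from collections import Counter

def idf_weights(docs):
    """IDF per word over a list of tokenized documents."""
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc))
    return {w: np.log(n / df[w]) for w in df}

def embed_document(doc, word_vectors, idf, dim=50):
    """IDF-weighted average of word vectors; a rough stand-in for the learned
    per-occurrence weights (which the paper finds behave like enhanced IDF)."""
    vecs, weights = [], []
    for w in doc:
        if w in word_vectors:
            vecs.append(word_vectors[w])
            weights.append(idf.get(w, 0.0))
    if not vecs or sum(weights) == 0:
        return np.zeros(dim)
    return np.average(np.vstack(vecs), axis=0, weights=weights)

docs = [["deep", "neural", "networks"], ["neural", "document", "embedding"], ["deep", "learning"]]
rng = np.random.default_rng(0)
wv = {w: rng.normal(size=50) for doc in docs for w in doc}   # stand-in word vectors
print(embed_document(docs[1], wv, idf_weights(docs)).shape)  # (50,)
```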

Facial 3D Model Registration Under Occlusions With SensiblePoints-based Reinforced Hypothesis Refinement

Title Facial 3D Model Registration Under Occlusions With SensiblePoints-based Reinforced Hypothesis Refinement
Authors Yuhang Wu, Ioannis A. Kakadiaris
Abstract Registering a 3D facial model to a 2D image under occlusion is difficult. First, not all of the detected facial landmarks are accurate under occlusions. Second, the number of reliable landmarks may not be enough to constrain the problem. We propose a method to synthesize additional points (SensiblePoints) to create pose hypotheses. The visual clues extracted from the fiducial points, non-fiducial points, and facial contour are jointly employed to verify the hypotheses. We define a reward function to measure whether the projected dense 3D model is well-aligned with the confidence maps generated by two fully convolutional networks, and use the function to train recurrent policy networks to move the SensiblePoints. The same reward function is employed in testing to select the best hypothesis from a candidate pool of hypotheses. Experimentation demonstrates that the proposed approach is very promising in solving the facial model registration problem under occlusion.
Tasks
Published 2017-09-02
URL http://arxiv.org/abs/1709.00531v1
PDF http://arxiv.org/pdf/1709.00531v1.pdf
PWC https://paperswithcode.com/paper/facial-3d-model-registration-under-occlusions
Repo
Framework

Adversarial Discriminative Heterogeneous Face Recognition

Title Adversarial Discriminative Heterogeneous Face Recognition
Authors Lingxiao Song, Man Zhang, Xiang Wu, Ran He
Abstract The gap between the sensing patterns of different face modalities remains a challenging problem in heterogeneous face recognition (HFR). This paper proposes an adversarial discriminative feature learning framework that closes the sensing gap via adversarial learning in both the raw-pixel space and a compact feature space. The framework integrates cross-spectral face hallucination and discriminative feature learning into an end-to-end adversarial network. In the pixel space, we use generative adversarial networks to perform cross-spectral face hallucination; an elaborate two-path model is introduced to alleviate the lack of paired images, giving consideration to both global structures and local textures. In the feature space, an adversarial loss and a high-order variance discrepancy loss are employed to measure the global and local discrepancy between the two heterogeneous distributions, respectively. These two losses enhance domain-invariant feature learning and modality-independent noise removal. Experimental results on three NIR-VIS databases show that our proposed approach outperforms state-of-the-art HFR methods, without requiring a complex network or a large-scale training dataset.
Tasks Face Hallucination, Face Recognition, Heterogeneous Face Recognition
Published 2017-09-12
URL http://arxiv.org/abs/1709.03675v1
PDF http://arxiv.org/pdf/1709.03675v1.pdf
PWC https://paperswithcode.com/paper/adversarial-discriminative-heterogeneous-face
Repo
Framework

Attention-Aware Face Hallucination via Deep Reinforcement Learning

Title Attention-Aware Face Hallucination via Deep Reinforcement Learning
Authors Qingxing Cao, Liang Lin, Yukai Shi, Xiaodan Liang, Guanbin Li
Abstract Face hallucination is a domain-specific super-resolution problem whose goal is to generate high-resolution (HR) faces from low-resolution (LR) input images. In contrast to existing methods, which often learn a single patch-to-patch mapping from LR to HR images and disregard the contextual interdependency between patches, we propose a novel Attention-aware Face Hallucination (Attention-FH) framework that resorts to deep reinforcement learning to sequentially discover attended patches and then perform facial part enhancement by fully exploiting the global interdependency of the image. Specifically, at each time step, a recurrent policy network dynamically specifies a new attended region by incorporating what happened in the past. The state (i.e., the face hallucination result for the whole image) can thus be exploited and updated by the local enhancement network on the selected region. The Attention-FH approach jointly learns the recurrent policy network and the local enhancement network by maximizing a long-term reward that reflects the hallucination performance over the whole image. Our proposed Attention-FH is therefore capable of adaptively personalizing an optimal search path for each face image according to its own characteristics. Extensive experiments show that our approach significantly surpasses state-of-the-art methods on in-the-wild faces with large pose and illumination variations.
Tasks Face Hallucination, Super-Resolution
Published 2017-08-10
URL http://arxiv.org/abs/1708.03132v1
PDF http://arxiv.org/pdf/1708.03132v1.pdf
PWC https://paperswithcode.com/paper/attention-aware-face-hallucination-via-deep
Repo
Framework

Avoiding Echo-Responses in a Retrieval-Based Conversation System

Title Avoiding Echo-Responses in a Retrieval-Based Conversation System
Authors Denis Fedorenko, Nikita Smetanin, Artem Rodichev
Abstract Retrieval-based conversation systems generally tend to highly rank responses that are semantically similar or even identical to the given conversation context. While the system’s goal is to find the most appropriate response, rather than the most semantically similar one, this tendency results in low-quality responses. We refer to this challenge as the echoing problem. To mitigate this problem, we utilize a hard negative mining approach at the training stage. The evaluation shows that the resulting model reduces echoing and achieves better results in terms of Average Precision and Recall@N metrics, compared to the models trained without the proposed approach.
Tasks
Published 2017-12-15
URL http://arxiv.org/abs/1712.05626v2
PDF http://arxiv.org/pdf/1712.05626v2.pdf
PWC https://paperswithcode.com/paper/avoiding-echo-responses-in-a-retrieval-based
Repo
Framework
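
A minimal sketch of a hard negative mining step, assuming a retrieval model that scores context-response pairs by cosine similarity of their encodings; the encoder, data, and similarity choice are all placeholders, not the authors’ system:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def mine_hard_negative(context_vec, response_vecs, positive_idx):
    """Pick the non-matching response the current model scores highest.

    Such 'hard' negatives often include echoes of the context itself, which is
    exactly what training should learn to push down."""
    scores = [cosine(context_vec, r) for r in response_vecs]
    scores[positive_idx] = -np.inf          # exclude the true response
    return int(np.argmax(scores))

rng = np.random.default_rng(1)
context = rng.normal(size=64)
candidates = rng.normal(size=(10, 64))
candidates[3] = context + 0.01 * rng.normal(size=64)   # an "echo" of the context
print(mine_hard_negative(context, candidates, positive_idx=7))  # 3 -- the echo
```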

From Image to Text Classification: A Novel Approach based on Clustering Word Embeddings

Title From Image to Text Classification: A Novel Approach based on Clustering Word Embeddings
Authors Andrei M. Butnaru, Radu Tudor Ionescu
Abstract In this paper, we propose a novel approach to text classification based on clustering word embeddings, inspired by the bag-of-visual-words model widely used in computer vision. After each word in a collection of documents is represented as a word vector using a pre-trained word-embedding model, a k-means algorithm is applied to the word vectors to obtain a fixed-size set of clusters. The centroid of each cluster is interpreted as a super word embedding that embodies all the semantically related word vectors in a certain region of the embedding space. Every embedded word in the collection of documents is then assigned to the nearest cluster centroid. In the end, each document is represented as a bag of super word embeddings by computing the frequency of each super word embedding in the respective document. We also diverge from the idea of building a single vocabulary for the entire collection of documents and propose building class-specific vocabularies for better performance. Using this kind of representation, we report results on two text mining tasks, namely text categorization by topic and polarity classification. On both tasks, our model yields better performance than the standard bag of words.
Tasks Text Categorization, Text Classification, Word Embeddings
Published 2017-07-25
URL http://arxiv.org/abs/1707.08098v1
PDF http://arxiv.org/pdf/1707.08098v1.pdf
PWC https://paperswithcode.com/paper/from-image-to-text-classification-a-novel
Repo
Framework
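
The representation pipeline described above (cluster word embeddings with k-means, treat centroids as super word embeddings, count them per document) is concrete enough for a small sketch. The version below uses a single shared vocabulary rather than the class-specific vocabularies the paper also proposes, and all data is synthetic:

```python
import numpy as np
from sklearn.cluster import KMeans

def bag_of_super_word_embeddings(docs, word_vectors, n_clusters=3, seed=0):
    """Represent each document as a histogram over word-vector clusters.

    Each k-means centroid plays the role of a 'super word embedding'; a
    document's feature vector counts how many of its words fall in each cluster."""
    vocab = sorted({w for doc in docs for w in doc if w in word_vectors})
    X = np.vstack([word_vectors[w] for w in vocab])
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
    word_to_cluster = dict(zip(vocab, km.labels_))
    feats = np.zeros((len(docs), n_clusters))
    for i, doc in enumerate(docs):
        for w in doc:
            if w in word_to_cluster:
                feats[i, word_to_cluster[w]] += 1
    return feats

rng = np.random.default_rng(0)
docs = [["good", "movie", "great", "plot"], ["bad", "boring", "movie"]]
wv = {w: rng.normal(size=50) for doc in docs for w in doc}   # stand-in embeddings
print(bag_of_super_word_embeddings(docs, wv, n_clusters=3))
```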

N-gram Language Modeling using Recurrent Neural Network Estimation

Title N-gram Language Modeling using Recurrent Neural Network Estimation
Authors Ciprian Chelba, Mohammad Norouzi, Samy Bengio
Abstract We investigate the effective memory depth of RNN models by using them for $n$-gram language model (LM) smoothing. Experiments on a small corpus (UPenn Treebank, one million words of training data and a 10k vocabulary) find the LSTM cell with dropout to be the best model for encoding the $n$-gram state when compared with feed-forward and vanilla RNN models. When preserving the sentence-independence assumption, the LSTM $n$-gram matches the LSTM LM performance for $n=9$ and slightly outperforms it for $n=13$. When allowing dependencies across sentence boundaries, the LSTM $13$-gram almost matches the perplexity of the unlimited-history LSTM LM. LSTM $n$-gram smoothing also has the desirable property of improving with increasing $n$-gram order, unlike the Katz or Kneser-Ney back-off estimators. Using multinomial distributions as targets in training, instead of the usual one-hot target, is only slightly beneficial for low $n$-gram orders. Experiments on the One Billion Words benchmark show that the results hold at larger scale: while LSTM smoothing for short $n$-gram contexts does not provide significant advantages over classic $n$-gram models, it becomes effective with long contexts ($n > 5$); depending on the task and amount of data, it can match fully recurrent LSTM models at about $n=13$. This may have implications when modeling short-format text, e.g., voice search/query LMs. Building LSTM $n$-gram LMs may be appealing in some practical situations: the state of an $n$-gram LM can be succinctly represented with $(n-1) \times 4$ bytes storing the identity of the words in the context, and batches of $n$-gram contexts can be processed in parallel. On the downside, the $n$-gram context encoding computed by the LSTM is discarded, making the model more expensive than a regular recurrent LSTM LM.
Tasks Language Modelling
Published 2017-03-31
URL http://arxiv.org/abs/1703.10724v2
PDF http://arxiv.org/pdf/1703.10724v2.pdf
PWC https://paperswithcode.com/paper/n-gram-language-modeling-using-recurrent
Repo
Framework
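
A minimal PyTorch sketch of an LSTM $n$-gram LM: the model sees only the previous $n-1$ word ids, so its state is just those ids (hence the $(n-1) \times 4$ bytes with 32-bit ids mentioned above). Layer sizes and names are illustrative, not the paper’s configuration:

```python
import torch
import torch.nn as nn

class LSTMNgram(nn.Module):
    """Toy LSTM n-gram LM: the only carried state is the previous n-1 word ids;
    the LSTM re-encodes that truncated context from scratch for each prediction."""
    def __init__(self, vocab_size=10000, emb_dim=64, hidden=128, n=5):
        super().__init__()
        self.n = n
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, context_ids):
        # context_ids: (B, n-1) word ids of the truncated history
        emb = self.embed(context_ids)
        _, (h, _) = self.lstm(emb)          # h: (1, B, hidden)
        return self.out(h[-1])              # next-word logits, (B, vocab)

model = LSTMNgram(n=5)
logits = model(torch.randint(0, 10000, (8, 4)))   # batches of 4-word contexts
print(logits.shape)  # torch.Size([8, 10000])
```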

Multilingual Adaptation of RNN Based ASR Systems

Title Multilingual Adaptation of RNN Based ASR Systems
Authors Markus Müller, Sebastian Stüker, Alex Waibel
Abstract In this work, we focus on multilingual systems based on recurrent neural networks (RNNs), trained using the Connectionist Temporal Classification (CTC) loss function. Using a multilingual set of acoustic units poses difficulties; to address this issue, we previously proposed Language Feature Vectors (LFVs) to train language-adaptive multilingual systems. Language adaptation, in contrast to speaker adaptation, needs to be applied not only at the feature level but also to deeper layers of the network. In this work, we therefore extend our previous approach by introducing a novel technique which we call “modulation”: based on this method, we modulate the hidden layers of RNNs using LFVs. We evaluate this approach in both full and low-resource conditions, as well as for grapheme- and phone-based systems. The use of modulation achieved lower error rates across the different conditions.
Tasks
Published 2017-11-13
URL http://arxiv.org/abs/1711.04569v2
PDF http://arxiv.org/pdf/1711.04569v2.pdf
PWC https://paperswithcode.com/paper/multilingual-adaptation-of-rnn-based-asr
Repo
Framework
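
One simple way to picture the “modulation” idea is to rescale a recurrent layer’s hidden activations with a gate computed from the Language Feature Vector. The PyTorch sketch below does exactly that; it is a schematic stand-in, not the authors’ CTC-trained ASR system, and all dimensions are made up:

```python
import torch
import torch.nn as nn

class LFVModulatedLayer(nn.Module):
    """One recurrent layer whose hidden activations are modulated by an LFV.

    A gate derived from the Language Feature Vector rescales every hidden unit,
    letting the same layer adapt its behaviour per language."""
    def __init__(self, in_dim=40, hidden=128, lfv_dim=8):
        super().__init__()
        self.rnn = nn.LSTM(in_dim, hidden, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(lfv_dim, hidden), nn.Sigmoid())

    def forward(self, feats, lfv):
        # feats: (B, T, in_dim) acoustic features, lfv: (B, lfv_dim)
        out, _ = self.rnn(feats)              # (B, T, hidden)
        g = self.gate(lfv).unsqueeze(1)       # (B, 1, hidden)
        return out * g                        # broadcast the gate over time

layer = LFVModulatedLayer()
y = layer(torch.randn(2, 100, 40), torch.randn(2, 8))
print(y.shape)  # torch.Size([2, 100, 128])
```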

Cluster Based Symbolic Representation for Skewed Text Categorization

Title Cluster Based Symbolic Representation for Skewed Text Categorization
Authors Lavanya Narayana Raju, Mahamad Suhil, D S Guru, Harsha S Gowda
Abstract In this work, a problem associated with imbalanced text corpora is addressed: a method of converting an imbalanced text corpus into a balanced one is presented. The method employs a clustering algorithm for the conversion. Initially, to avoid the curse of dimensionality, an effective representation scheme based on a term-class relevancy measure is adopted, which drastically reduces the dimensionality to the number of classes in the corpus. Subsequently, the samples of larger classes are grouped into a number of smaller subclasses to make the entire corpus balanced. Each subclass is then given a single symbolic vector representation through the use of interval-valued features. This symbolic representation, in addition to being compact, reduces both the space requirement and the classification time. The superiority of the proposed model is empirically demonstrated on benchmark datasets, viz. Reuters-21578 and TDT2. Further, it is compared against several existing contemporary models, including a model based on support vector machines. The comparative analysis indicates that the proposed model outperforms the other existing models.
Tasks Text Categorization
Published 2017-06-24
URL http://arxiv.org/abs/1706.07912v1
PDF http://arxiv.org/pdf/1706.07912v1.pdf
PWC https://paperswithcode.com/paper/cluster-based-symbolic-representation-for
Repo
Framework
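
The interval-valued subclass representation can be sketched compactly: split an oversized class into subclasses with a clustering algorithm and summarize each subclass by per-feature [min, max] intervals. The sketch below uses k-means and a simple containment score on synthetic data; the paper’s term-class relevancy representation is not reproduced:

```python
import numpy as np
from sklearn.cluster import KMeans

def interval_subclasses(X, n_subclasses=3, seed=0):
    """Split one large class into subclasses and summarize each as per-feature
    [min, max] intervals -- a compact symbolic representation of the subclass."""
    labels = KMeans(n_clusters=n_subclasses, n_init=10, random_state=seed).fit_predict(X)
    intervals = []
    for c in range(n_subclasses):
        Xc = X[labels == c]
        intervals.append(np.stack([Xc.min(axis=0), Xc.max(axis=0)], axis=1))  # (d, 2)
    return intervals

def containment_score(x, interval):
    """Fraction of features of x that fall inside the subclass intervals."""
    lo, hi = interval[:, 0], interval[:, 1]
    return np.mean((x >= lo) & (x <= hi))

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))          # one imbalanced "large" class, 4 features
ivs = interval_subclasses(X, n_subclasses=3)
print([round(containment_score(X[0], iv), 2) for iv in ivs])
```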

Game-theoretic Network Centrality: A Review

Title Game-theoretic Network Centrality: A Review
Authors Mateusz K. Tarkowski, Tomasz P. Michalak, Talal Rahwan, Michael Wooldridge
Abstract Game-theoretic centrality is a flexible and sophisticated approach to identify the most important nodes in a network. It builds upon the methods from cooperative game theory and network theory. The key idea is to treat nodes as players in a cooperative game, where the value of each coalition is determined by certain graph-theoretic properties. Using solution concepts from cooperative game theory, it is then possible to measure how responsible each node is for the worth of the network. The literature on the topic is already quite large, and is scattered among game-theoretic and computer science venues. We review the main game-theoretic network centrality measures from both bodies of literature and organize them into two categories: those that are more focused on the connectivity of nodes, and those that are more focused on the synergies achieved by nodes in groups. We present and explain each centrality, with a focus on algorithms and complexity.
Tasks
Published 2017-12-31
URL http://arxiv.org/abs/1801.00218v1
PDF http://arxiv.org/pdf/1801.00218v1.pdf
PWC https://paperswithcode.com/paper/game-theoretic-network-centrality-a-review
Repo
Framework
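
To make the “nodes as players in a cooperative game” idea concrete, here is a minimal Monte Carlo Shapley value sketch for one classic coalition value, the number of nodes in or adjacent to the coalition. This is only an illustrative example of a game-theoretic centrality, not an implementation of the measures surveyed:

```python
import random

def coalition_value(coalition, adj):
    """Value = number of nodes either in the coalition or adjacent to it."""
    covered = set(coalition)
    for v in coalition:
        covered |= adj[v]
    return len(covered)

def shapley_centrality(adj, samples=2000, seed=0):
    """Monte Carlo Shapley value of each node for the coverage game above."""
    rng = random.Random(seed)
    nodes = list(adj)
    shapley = {v: 0.0 for v in nodes}
    for _ in range(samples):
        rng.shuffle(nodes)
        coalition, prev = [], 0
        for v in nodes:
            coalition.append(v)
            val = coalition_value(coalition, adj)
            shapley[v] += val - prev        # marginal contribution of v
            prev = val
    return {v: s / samples for v, s in shapley.items()}

# Small star graph: the hub should come out as the most central node.
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(shapley_centrality(adj))
```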

Predicting Rankings of Software Verification Competitions

Title Predicting Rankings of Software Verification Competitions
Authors Mike Czech, Eyke Hüllermeier, Marie-Christine Jakobs, Heike Wehrheim
Abstract Software verification competitions, such as the annual SV-COMP, evaluate software verification tools with respect to their effectiveness and efficiency. Typically, the outcome of a competition is a (possibly category-specific) ranking of the tools. For many applications, such as building portfolio solvers, it would be desirable to have an idea of the (relative) performance of verification tools on a given verification task beforehand, i.e., prior to actually running all tools on the task. In this paper, we present a machine learning approach to predicting rankings of tools on verification tasks. The method builds upon so-called label ranking algorithms, which we complement with appropriate kernels providing a similarity measure for verification tasks. Our kernels employ a graph representation of software source code that mixes elements of control flow and program dependence graphs with abstract syntax trees. Using data sets from SV-COMP, we demonstrate that our rank prediction technique generalizes well and achieves rather high predictive accuracy. In particular, our method outperforms a recently proposed feature-based approach of Demyanova et al. (when applied to rank predictions).
Tasks
Published 2017-03-02
URL http://arxiv.org/abs/1703.00757v1
PDF http://arxiv.org/pdf/1703.00757v1.pdf
PWC https://paperswithcode.com/paper/predicting-rankings-of-software-verification
Repo
Framework
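
Label ranking via pairwise preferences can be sketched without the paper’s graph kernels: train one binary classifier per pair of tools on plain task features and aggregate the pairwise predictions Borda-style. Everything below (features, runtimes, the logistic models) is synthetic and only hints at the general approach:

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import LogisticRegression

def train_pairwise_rankers(X, runtimes, n_tools):
    """One binary classifier per tool pair: does tool i beat tool j on a task?"""
    models = {}
    for i, j in combinations(range(n_tools), 2):
        y = (runtimes[:, i] < runtimes[:, j]).astype(int)   # lower runtime = better
        models[(i, j)] = LogisticRegression().fit(X, y)
    return models

def predict_ranking(x, models, n_tools):
    """Aggregate pairwise win probabilities Borda-style into a full ranking."""
    wins = np.zeros(n_tools)
    for (i, j), m in models.items():
        p = m.predict_proba(x.reshape(1, -1))[0, 1]   # P(tool i beats tool j)
        wins[i] += p
        wins[j] += 1 - p
    return np.argsort(-wins)          # tool indices, best to worst

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))                 # synthetic verification-task features
runtimes = rng.exponential(size=(200, 4))     # synthetic per-tool runtimes
models = train_pairwise_rankers(X, runtimes, n_tools=4)
print(predict_ranking(X[0], models, n_tools=4))
```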

Multi-Objective Vehicle Routing Problem Applied to Large Scale Post Office Deliveries

Title Multi-Objective Vehicle Routing Problem Applied to Large Scale Post Office Deliveries
Authors Luis A. A. Meira, Paulo S. Martins, Mauro Menzori, Guilherme A. Zeni
Abstract The number of optimization techniques in the combinatorial domain is large and diversified. Nevertheless, real-world benchmarks for testing algorithms are few. This work creates an extensible, real-world mail delivery benchmark for the Vehicle Routing Problem (VRP) on a planar graph embedded in 2D Euclidean space. The problem is multi-objective, on a roadmap with up to 25 vehicles and 30,000 deliveries per day. Each instance models one generic day of mail delivery, allowing both comparison and validation of optimization algorithms for routing problems. The benchmark may be extended to model other scenarios.
Tasks
Published 2017-12-23
URL http://arxiv.org/abs/1801.00712v1
PDF http://arxiv.org/pdf/1801.00712v1.pdf
PWC https://paperswithcode.com/paper/multi-objective-vehicle-routing-problem
Repo
Framework

Approximate Bayesian Inference in Linear State Space Models for Intermittent Demand Forecasting at Scale

Title Approximate Bayesian Inference in Linear State Space Models for Intermittent Demand Forecasting at Scale
Authors Matthias Seeger, Syama Rangapuram, Yuyang Wang, David Salinas, Jan Gasthaus, Tim Januschowski, Valentin Flunkert
Abstract We present a scalable and robust Bayesian inference method for linear state space models. The method is applied to demand forecasting in the context of a large e-commerce platform, paying special attention to intermittent and bursty target statistics. Inference is approximated by the Newton-Raphson algorithm, reduced to linear-time Kalman smoothing, which allows us to operate on problems several orders of magnitude larger than in previous related work. In a study on large real-world sales datasets, our method outperforms competing approaches on fast- and medium-moving items.
Tasks Bayesian Inference
Published 2017-09-22
URL http://arxiv.org/abs/1709.07638v1
PDF http://arxiv.org/pdf/1709.07638v1.pdf
PWC https://paperswithcode.com/paper/approximate-bayesian-inference-in-linear
Repo
Framework
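
The linear-time Kalman smoothing that the inference is reduced to can be illustrated on a univariate local-level model; the Newton-Raphson handling of the intermittent, non-Gaussian demand likelihood is not attempted here, so the sketch below is only the Gaussian building block with made-up noise variances:

```python
import numpy as np

def local_level_kalman_smoother(y, obs_var=1.0, state_var=0.1):
    """Kalman filter + RTS smoother for the local-level model
    x_t = x_{t-1} + w_t,  y_t = x_t + v_t (all univariate Gaussian)."""
    T = len(y)
    m = np.zeros(T); P = np.zeros(T)       # filtered mean / variance
    m_pred, P_pred = 0.0, 1e6              # diffuse prior on the initial level
    for t in range(T):
        K = P_pred / (P_pred + obs_var)    # Kalman gain
        m[t] = m_pred + K * (y[t] - m_pred)
        P[t] = (1 - K) * P_pred
        m_pred, P_pred = m[t], P[t] + state_var
    ms, Ps = m.copy(), P.copy()            # backward (RTS) smoothing pass
    for t in range(T - 2, -1, -1):
        J = P[t] / (P[t] + state_var)
        ms[t] = m[t] + J * (ms[t + 1] - m[t])
        Ps[t] = P[t] + J**2 * (Ps[t + 1] - (P[t] + state_var))
    return ms, Ps

y = np.array([0., 0., 3., 0., 1., 0., 0., 2.])   # a bursty, intermittent-looking series
ms, _ = local_level_kalman_smoother(y)
print(np.round(ms, 2))
```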