October 18, 2019

Paper Group ANR 614


Predicting Good Configurations for GitHub and Stack Overflow Topic Models


Title	Predicting Good Configurations for GitHub and Stack Overflow Topic Models
Authors	Christoph Treude, Markus Wagner
Abstract	Software repositories contain large amounts of textual data, ranging from source code comments and issue descriptions to questions, answers, and comments on Stack Overflow. To make sense of this textual data, topic modelling is frequently used as a text-mining tool for the discovery of hidden semantic structures in text bodies. Latent Dirichlet allocation (LDA) is a commonly used topic model that aims to explain the structure of a corpus by grouping texts. LDA requires multiple parameters to work well, and there are only rough and sometimes conflicting guidelines available on how these parameters should be set. In this paper, we contribute (i) a broad study of parameters to arrive at good local optima for GitHub and Stack Overflow text corpora, (ii) an a-posteriori characterisation of text corpora related to eight programming languages, and (iii) an analysis of corpus feature importance via per-corpus LDA configuration. We find that (1) popular rules of thumb for topic modelling parameter configuration are not applicable to the corpora used in our experiments, (2) corpora sampled from GitHub and Stack Overflow have different characteristics and require different configurations to achieve good model fit, and (3) we can predict good configurations for unseen corpora reliably. These findings support researchers and practitioners in efficiently determining suitable configurations for topic modelling when analysing textual data contained in software repositories.
Tasks	Feature Importance, Topic Models
Published	2018-04-13
URL	http://arxiv.org/abs/1804.04749v3
PDF	http://arxiv.org/pdf/1804.04749v3.pdf
PWC	https://paperswithcode.com/paper/per-corpus-configuration-of-topic-modelling
Repo
Framework

ID Preserving Generative Adversarial Network for Partial Latent Fingerprint Reconstruction


Title	ID Preserving Generative Adversarial Network for Partial Latent Fingerprint Reconstruction
Authors	Ali Dabouei, Sobhan Soleymani, Hadi Kazemi, Seyed Mehdi Iranmanesh, Jeremy Dawson, Nasser M. Nasrabadi
Abstract	Performing recognition tasks using latent fingerprint samples is often challenging for automated identification systems due to poor quality, distortion, and partially missing information from the input samples. We propose a direct latent fingerprint reconstruction model based on conditional generative adversarial networks (cGANs). Two modifications are applied to the cGAN to adapt it for the task of latent fingerprint reconstruction. First, the model is forced to generate three additional maps alongside the ridge map to ensure that the orientation and frequency information is considered in the generation process, and to prevent the model from filling large missing areas and generating erroneous minutiae. Second, a perceptual ID preservation approach is developed to force the generator to preserve the ID information during the reconstruction process. Using a synthetically generated database of latent fingerprints, the deep network learns to predict missing information from the input latent samples. We evaluate the proposed method in combination with two different fingerprint matching algorithms on several publicly available latent fingerprint datasets. We achieved a rank-10 accuracy of 88.02% on the IIIT-Delhi latent fingerprint database for the task of latent-to-latent matching and a rank-50 accuracy of 70.89% on the IIIT-Delhi MOLF database for the task of latent-to-sensor matching. Experimental results of matching reconstructed samples in both latent-to-sensor and latent-to-latent frameworks indicate that the proposed method significantly increases the matching accuracy of the fingerprint recognition systems for the latent samples.
Tasks
Published	2018-07-31
URL	http://arxiv.org/abs/1808.00035v1
PDF	http://arxiv.org/pdf/1808.00035v1.pdf
PWC	https://paperswithcode.com/paper/id-preserving-generative-adversarial-network
Repo
Framework
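
The rank-10 and rank-50 figures quoted in the abstract above are rank-k identification accuracies: the fraction of probe samples whose true identity appears among the k best-scoring gallery candidates. A minimal sketch of that metric follows, assuming each probe comes with a dictionary of similarity scores per gallery identity; the function name and data layout are illustrative assumptions, not taken from the paper.

```python
def rank_k_accuracy(probe_scores, true_ids, k):
    """Fraction of probes whose true identity is among the top-k candidates.

    probe_scores: list of dicts, one per probe, mapping gallery id -> similarity
                  (higher means more similar; this layout is an assumption).
    true_ids:     list of the correct gallery id for each probe.
    """
    if not true_ids:
        raise ValueError("need at least one probe")
    hits = 0
    for scores, true_id in zip(probe_scores, true_ids):
        # Sort gallery ids from most to least similar and keep the first k.
        ranked = sorted(scores, key=scores.get, reverse=True)
        if true_id in ranked[:k]:
            hits += 1
    return hits / len(true_ids)


probe_scores = [
    {"a": 0.9, "b": 0.2, "c": 0.1},    # true id "a" is ranked first
    {"a": 0.5, "b": 0.4, "c": 0.45},   # true id "b" is ranked third
]
true_ids = ["a", "b"]
print(rank_k_accuracy(probe_scores, true_ids, k=1))
print(rank_k_accuracy(probe_scores, true_ids, k=3))
```

Larger k can only keep or raise the score, which is why a rank-50 figure is not directly comparable with a rank-10 figure.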

Convolutional Attention Networks for Multimodal Emotion Recognition from Speech and Text Data


Title	Convolutional Attention Networks for Multimodal Emotion Recognition from Speech and Text Data
Authors	Chan Woo Lee, Kyu Ye Song, Jihoon Jeong, Woo Yong Choi
Abstract	Emotion recognition has become a popular topic of interest, especially in the field of human-computer interaction. Previous works involve unimodal analysis of emotion, while recent efforts focus on multi-modal emotion recognition from vision and speech. In this paper, we propose a new method of learning about the hidden representations between just speech and text data using convolutional attention networks. Compared to the shallow model which employs simple concatenation of feature vectors, the proposed attention model performs much better in classifying emotion from speech and text data contained in the CMU-MOSEI dataset.
Tasks	Emotion Recognition, Multimodal Emotion Recognition
Published	2018-05-17
URL	http://arxiv.org/abs/1805.06606v2
PDF	http://arxiv.org/pdf/1805.06606v2.pdf
PWC	https://paperswithcode.com/paper/convolutional-attention-networks-for
Repo
Framework

Large Scale Constrained Linear Regression Revisited: Faster Algorithms via Preconditioning


Title	Large Scale Constrained Linear Regression Revisited: Faster Algorithms via Preconditioning
Authors	Di Wang, Jinhui Xu
Abstract	In this paper, we revisit the large-scale constrained linear regression problem and propose faster methods based on some recent developments in sketching and optimization. Our algorithms combine (accelerated) mini-batch SGD with a new method called two-step preconditioning to achieve an approximate solution with a time complexity lower than that of the state-of-the-art techniques for the low precision case. Our idea can also be extended to the high precision case, which gives an alternative implementation to the Iterative Hessian Sketch (IHS) method with significantly improved time complexity. Experiments on benchmark and synthetic datasets suggest that our methods indeed outperform existing ones considerably in both the low and high precision cases.
Tasks
Published	2018-02-09
URL	http://arxiv.org/abs/1802.03337v1
PDF	http://arxiv.org/pdf/1802.03337v1.pdf
PWC	https://paperswithcode.com/paper/large-scale-constrained-linear-regression
Repo
Framework

Efficient Single-Shot Multibox Detector for Construction Site Monitoring


Title	Efficient Single-Shot Multibox Detector for Construction Site Monitoring
Authors	Viral Thakar, Himani Saini, Walid Ahmed, Mohammad M Soltani, Ahmed Aly, Jia Yuan Yu
Abstract	Asset monitoring on construction sites is an intricate, manually intensive task that can benefit greatly from automated solutions engineered using deep neural networks. We use the Single-Shot Multibox Detector (SSD), for its fine balance between speed and accuracy, to leverage ubiquitously available images and videos from the surveillance cameras on construction sites and automate the monitoring tasks, enabling project managers to better track the performance and optimize the utilization of each resource. We propose to improve the performance of SSD by clustering the predicted boxes instead of using a greedy approach like non-maximum suppression. We do so using Affinity Propagation Clustering (APC) to cluster the predicted boxes based on a similarity index computed from the spatial features as well as the location of the predicted boxes. In our experiments, we improved the mean average precision of SSD by 3.77% on a custom dataset consisting of images from construction sites and by 1.67% on the PASCAL VOC Challenge.
Tasks
Published	2018-08-17
URL	http://arxiv.org/abs/1808.05730v2
PDF	http://arxiv.org/pdf/1808.05730v2.pdf
PWC	https://paperswithcode.com/paper/efficient-single-shot-multibox-detector-for
Repo
Framework
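
For context, the greedy baseline that the abstract above proposes to replace, non-maximum suppression, can be sketched in a few lines. This shows the standard technique only, not the authors' Affinity Propagation variant; the box format `(x1, y1, x2, y2)` and the 0.5 threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first.

    A box is kept only if it does not overlap an already kept box by more
    than iou_threshold; each decision is final, which makes this greedy.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))
```

Because a lower-scoring box is discarded as soon as it overlaps a kept one, the procedure never reconsiders a choice; a clustering approach instead groups overlapping predictions jointly.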

Topological Map Extraction from Overhead Images


Title	Topological Map Extraction from Overhead Images
Authors	Zuoyue Li, Jan Dirk Wegner, Aurélien Lucchi
Abstract	We propose a new approach, named PolyMapper, to circumvent the conventional pixel-wise segmentation of (aerial) images and predict objects in a vector representation directly. PolyMapper directly extracts the topological map of a city from overhead images as collections of building footprints and road networks. In order to unify the shape representation for different types of objects, we also propose a novel sequentialization method that reformulates a graph structure as closed polygons. Experiments are conducted on both existing and self-collected large-scale datasets of several cities. Our empirical results demonstrate that our end-to-end learnable model is capable of drawing polygons of building footprints and road networks that very closely approximate the structure of existing online map services, in a fully automated manner. Quantitative and qualitative comparison to the state-of-the-art also shows that our approach achieves good levels of performance. To the best of our knowledge, the automatic extraction of large-scale topological maps is a novel contribution in the remote sensing community that we believe will help develop models with more informed geometrical constraints.
Tasks	Semantic Segmentation
Published	2018-12-04
URL	https://arxiv.org/abs/1812.01497v3
PDF	https://arxiv.org/pdf/1812.01497v3.pdf
PWC	https://paperswithcode.com/paper/polymapper-extracting-city-maps-using
Repo
Framework

Explanation Methods in Deep Learning: Users, Values, Concerns and Challenges


Title	Explanation Methods in Deep Learning: Users, Values, Concerns and Challenges
Authors	Gabrielle Ras, Marcel van Gerven, Pim Haselager
Abstract	Issues regarding explainable AI involve four components: users, laws & regulations, explanations and algorithms. Together these components provide a context in which explanation methods can be evaluated regarding their adequacy. The goal of this chapter is to bridge the gap between expert users and lay users. Different kinds of users are identified and their concerns revealed, relevant statements from the General Data Protection Regulation are analyzed in the context of Deep Neural Networks (DNNs), a taxonomy for the classification of existing explanation methods is introduced, and finally, the various classes of explanation methods are analyzed to verify if user concerns are justified. Overall, it is clear that (visual) explanations can be given about various aspects of the influence of the input on the output. However, it is noted that explanation methods or interfaces for lay users are missing and we speculate which criteria these methods / interfaces should satisfy. Finally it is noted that two important concerns are difficult to address with explanation methods: the concern about bias in datasets that leads to biased DNNs, as well as the suspicion about unfair outcomes.
Tasks
Published	2018-03-20
URL	http://arxiv.org/abs/1803.07517v2
PDF	http://arxiv.org/pdf/1803.07517v2.pdf
PWC	https://paperswithcode.com/paper/explanation-methods-in-deep-learning-users
Repo
Framework

Faster Shift-Reduce Constituent Parsing with a Non-Binary, Bottom-Up Strategy


Title	Faster Shift-Reduce Constituent Parsing with a Non-Binary, Bottom-Up Strategy
Authors	Daniel Fernández-González, Carlos Gómez-Rodríguez
Abstract	An increasingly wide range of artificial intelligence applications rely on syntactic information to process and extract meaning from natural language text or speech, with constituent trees being one of the most widely used syntactic formalisms. To produce these phrase-structure representations from sentences in natural language, shift-reduce constituent parsers have become one of the most efficient approaches. Increasing their accuracy and speed is still one of the main objectives pursued by the research community so that artificial intelligence applications that make use of parsing outputs, such as machine translation or voice assistant services, can improve their performance. With this goal in mind, we propose in this article a novel non-binary shift-reduce algorithm for constituent parsing. Our parser follows a classical bottom-up strategy but, unlike others, it straightforwardly creates non-binary branchings with just one Reduce transition, instead of requiring prior binarization or a sequence of binary transitions, allowing its direct application to any language without the need of further resources such as percolation tables. As a result, it uses fewer transitions per sentence than existing transition-based constituent parsers, becoming the fastest such system and, as a consequence, speeding up downstream applications. Using static oracle training and greedy search, the accuracy of this novel approach is on par with state-of-the-art transition-based constituent parsers and outperforms all top-down and bottom-up greedy shift-reduce systems on the Wall Street Journal section from the English Penn Treebank and the Penn Chinese Treebank. Additionally, we develop a dynamic oracle for training the proposed transition-based algorithm, achieving further improvements in both benchmarks and obtaining the best accuracy to date on the Penn Chinese Treebank among greedy shift-reduce parsers.
Tasks	Machine Translation
Published	2018-04-21
URL	https://arxiv.org/abs/1804.07961v3
PDF	https://arxiv.org/pdf/1804.07961v3.pdf
PWC	https://paperswithcode.com/paper/faster-shift-reduce-constituent-parsing-with
Repo
Framework
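
The idea of building an n-ary constituent with a single Reduce transition, as described in the abstract above, can be illustrated with a toy transition executor. This is only a sketch under assumptions: the tuple encoding of transitions, with Reduce carrying a label and a child count, is a hypothetical choice for illustration and is not claimed to match the paper's exact transition system.

```python
def run_transitions(words, transitions):
    """Apply shift/reduce transitions and return the final stack.

    ("shift",)           moves the next buffer word onto the stack.
    ("reduce", label, k) pops the top k stack items and pushes one
                         constituent (label, children) in a single step.
    """
    stack, buffer = [], list(words)
    for transition in transitions:
        if transition[0] == "shift":
            if not buffer:
                raise ValueError("shift on empty buffer")
            stack.append(buffer.pop(0))
        elif transition[0] == "reduce":
            _, label, k = transition
            if not 1 <= k <= len(stack):
                raise ValueError("reduce needs between 1 and len(stack) items")
            children = stack[-k:]
            del stack[-k:]
            stack.append((label, children))
        else:
            raise ValueError(f"unknown transition: {transition!r}")
    return stack


words = ["the", "big", "dog", "barks"]
transitions = [
    ("shift",), ("shift",), ("shift",),
    ("reduce", "NP", 3),   # one transition builds a ternary node
    ("shift",),
    ("reduce", "VP", 1),
    ("reduce", "S", 2),
]
print(run_transitions(words, transitions))
```

A binarized parser would need two binary reductions plus an intermediate label for the three-word noun phrase, so the n-ary Reduce shortens the transition sequence.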

Classification of Findings with Localized Lesions in Fundoscopic Images using a Regionally Guided CNN


Title	Classification of Findings with Localized Lesions in Fundoscopic Images using a Regionally Guided CNN
Authors	Jaemin Son, Woong Bae, Sangkeun Kim, Sang Jun Park, Kyu-Hwan Jung
Abstract	Fundoscopic images are often investigated by ophthalmologists to spot abnormal lesions to make diagnoses. Recent successes of convolutional neural networks are confined to diagnoses of a few diseases without proper localization of lesions. In this paper, we propose an efficient annotation method for localizing lesions and a CNN architecture that can classify an individual finding and localize the lesions at the same time. Also, we introduce a new loss function to guide the network to learn meaningful patterns with the guidance of the regional annotations. In experiments, we demonstrate that our network performed better than the widely used network and that the guidance loss helps achieve up to 4.1% higher AUROC and superior localization capability.
Tasks
Published	2018-11-02
URL	http://arxiv.org/abs/1811.00871v1
PDF	http://arxiv.org/pdf/1811.00871v1.pdf
PWC	https://paperswithcode.com/paper/classification-of-findings-with-localized
Repo
Framework

Traits & Transferability of Adversarial Examples against Instance Segmentation & Object Detection


Title	Traits & Transferability of Adversarial Examples against Instance Segmentation & Object Detection
Authors	Raghav Gurbaxani, Shivank Mishra
Abstract	Despite the recent advancements in deploying neural networks for image classification, it has been found that adversarial examples are able to fool these models, leading them to misclassify the images. Since these models are now being widely deployed, we provide an insight into the threat of these adversarial examples by evaluating their characteristics and transferability to more complex models that utilize Image Classification as a subtask. We demonstrate the ineffectiveness of adversarial examples when applied to Instance Segmentation & Object Detection models. We show that this ineffectiveness arises from the inability of adversarial examples to withstand transformations such as scaling or a change in lighting conditions. Moreover, we show that there exists a small threshold below which the adversarial property is retained while applying these input transformations. Additionally, these attacks demonstrate weak cross-network transferability across neural network architectures, e.g. VGG16 and ResNet50; however, the attack may fool both networks if passed sequentially through them during its formation. The lack of scalability and transferability calls into question how effective adversarial images would be in the real world.
Tasks	Image Classification, Instance Segmentation, Object Detection, Semantic Segmentation
Published	2018-08-04
URL	http://arxiv.org/abs/1808.01452v1
PDF	http://arxiv.org/pdf/1808.01452v1.pdf
PWC	https://paperswithcode.com/paper/traits-transferability-of-adversarial
Repo
Framework

Semantic Channel and Shannon’s Channel Mutually Match for Multi-Label Classification


Title	Semantic Channel and Shannon’s Channel Mutually Match for Multi-Label Classification
Authors	Chenguang Lu
Abstract	A group of transition probability functions form a Shannon’s channel whereas a group of truth functions form a semantic channel. Label learning is to let semantic channels match Shannon’s channels and label selection is to let Shannon’s channels match semantic channels. The Channel Matching (CM) algorithm is provided for multi-label classification. This algorithm adheres to the maximum semantic information criterion, which is compatible with the maximum likelihood criterion and the regularized least squares criterion. If samples are very large, we can directly convert Shannon’s channels into semantic channels by the third kind of Bayes’ theorem; otherwise, we can train truth functions with parameters by sampling distributions. A label may be a Boolean function of some atomic labels. To simplify learning, we may only obtain the truth functions of some atomic labels. For a given label, instances are divided into three kinds (positive, negative, and unclear) instead of two kinds as in popular studies so that the problem with binary relevance is avoided. For each instance, the classifier selects a compound label with the most semantic information or the richest connotation. As a predictive model, the semantic channel does not change with the prior probability distribution (source) of instances. It still works when the source is changed. The classifier changes with the source, and hence can overcome the class-imbalance problem. It is shown that the growth of the old population will change the classifier for the label “Old” and has been driving the semantic evolution of “Old”. The CM iteration algorithm for unseen instance classification is introduced.
Tasks	Multi-Label Classification
Published	2018-05-02
URL	http://arxiv.org/abs/1805.01288v1
PDF	http://arxiv.org/pdf/1805.01288v1.pdf
PWC	https://paperswithcode.com/paper/semantic-channel-and-shannons-channel
Repo
Framework

KF-LAX: Kronecker-factored curvature estimation for control variate optimization in reinforcement learning


Title	KF-LAX: Kronecker-factored curvature estimation for control variate optimization in reinforcement learning
Authors	Mohammad Firouzi
Abstract	A key challenge for gradient-based optimization methods in model-free reinforcement learning is to develop an approach that is sample efficient and has low variance. In this work, we apply the Kronecker-factored curvature estimation technique (KFAC) to a recently proposed gradient estimator for control variate optimization, RELAX, to increase the sample efficiency of using this gradient estimation method in reinforcement learning. The performance of the proposed method is demonstrated on a synthetic problem and a set of three discrete-control Atari game tasks.
Tasks	Atari Games
Published	2018-12-11
URL	http://arxiv.org/abs/1812.04181v1
PDF	http://arxiv.org/pdf/1812.04181v1.pdf
PWC	https://paperswithcode.com/paper/kf-lax-kronecker-factored-curvature
Repo
Framework

SparseNet: A Sparse DenseNet for Image Classification


Title	SparseNet: A Sparse DenseNet for Image Classification
Authors	Wenqi Liu, Kun Zeng
Abstract	Deep neural networks have made remarkable progress on various computer vision tasks. Recent works have shown that depth, width and shortcut connections of networks are all vital to their performance. In this paper, we introduce a method to sparsify DenseNet which can reduce the connections of an L-layer DenseNet from O(L^2) to O(L), and thus we can simultaneously increase depth, width and connections of neural networks in a more parameter-efficient and computation-efficient way. Moreover, an attention module is introduced to further boost our network’s performance. We denote our network as SparseNet. We evaluate SparseNet on the CIFAR (including CIFAR10 and CIFAR100) and SVHN datasets. Experiments show that SparseNet can obtain improvements over the state-of-the-art on CIFAR10 and SVHN. Furthermore, while achieving performance comparable to DenseNet on these datasets, SparseNet is 2.6x smaller and 3.7x faster than the original DenseNet.
Tasks	Image Classification
Published	2018-04-15
URL	http://arxiv.org/abs/1804.05340v1
PDF	http://arxiv.org/pdf/1804.05340v1.pdf
PWC	https://paperswithcode.com/paper/sparsenet-a-sparse-densenet-for-image
Repo
Framework
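
The O(L^2) versus O(L) claim in the abstract above follows from counting incoming connections per layer. In a dense block every layer receives the outputs of all earlier layers plus the block input, giving L(L+1)/2 connections in total. The sketch below compares that count with a fixed-window rule in which each layer keeps at most `window` predecessors; the window rule is a hypothetical stand-in used to show linear growth and is not claimed to be SparseNet's actual connection pattern.

```python
def dense_connections(num_layers):
    """Layer l (1-based) has l incoming connections: sum = L * (L + 1) / 2."""
    return sum(range(1, num_layers + 1))


def windowed_connections(num_layers, window):
    """Hypothetical sparsification: layer l keeps at most `window` inputs."""
    return sum(min(layer, window) for layer in range(1, num_layers + 1))


for depth in (10, 100, 1000):
    print(depth, dense_connections(depth), windowed_connections(depth, window=4))
```

For 100 layers the dense count is 5050, while the windowed count with `window=4` is 394; the first grows quadratically with depth and the second linearly.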

Learning structure-from-motion from motion


Title	Learning structure-from-motion from motion
Authors	Clément Pinard, Laure Chevalley, Antoine Manzanera, David Filliat
Abstract	This work is based on a questioning of the quality metrics used by deep neural networks performing depth prediction from a single image, and then of the usability of recently published works on unsupervised learning of depth from videos. To overcome their limitations, we propose to learn in the same unsupervised manner a depth map inference system from monocular videos that takes a pair of images as input. This algorithm actually learns structure-from-motion from motion, and not only structure from context appearance. The scale factor issue is explicitly treated, and the absolute depth map can be estimated from camera displacement magnitude, which can be easily measured from cheap external sensors. Our solution is also much more robust with respect to domain variation and adaptation via fine tuning, because it does not rely entirely on depth from context. Two use cases are considered: unstabilized moving camera videos, and stabilized ones. This choice is motivated by the UAV (Unmanned Aerial Vehicle) use case, which generally provides reliable orientation measurement. We provide a set of experiments showing that, used in real conditions where only speed can be known, our network outperforms competitors for most depth quality measures. Results are given on the well-known KITTI dataset, which provides robust stabilization for our second use case, but also contains moving scenes which are very typical of the in-car road context. We then present results on a synthetic dataset that we believe to be more representative of typical UAV scenes. Lastly, we present two domain adaptation use cases showing superior robustness of our method compared to single view depth algorithms, which indicates that it is better suited for highly variable visual contexts.
Tasks	Depth Estimation, Domain Adaptation
Published	2018-09-12
URL	http://arxiv.org/abs/1809.04471v2
PDF	http://arxiv.org/pdf/1809.04471v2.pdf
PWC	https://paperswithcode.com/paper/learning-structure-from-motion-from-motion
Repo
Framework

COBRAS: Fast, Iterative, Active Clustering with Pairwise Constraints


Title	COBRAS: Fast, Iterative, Active Clustering with Pairwise Constraints
Authors	Toon Van Craenendonck, Sebastijan Dumančić, Elia Van Wolputte, Hendrik Blockeel
Abstract	Constraint-based clustering algorithms exploit background knowledge to construct clusterings that are aligned with the interests of a particular user. This background knowledge is often obtained by allowing the clustering system to pose pairwise queries to the user: should these two elements be in the same cluster or not? Active clustering methods aim to minimize the number of queries needed to obtain a good clustering by querying the most informative pairs first. Ideally, a user should be able to answer a couple of these queries, inspect the resulting clustering, and repeat these two steps until a satisfactory result is obtained. We present COBRAS, an approach to active clustering with pairwise constraints that is suited for such an interactive clustering process. A core concept in COBRAS is that of a super-instance: a local region in the data in which all instances are assumed to belong to the same cluster. COBRAS constructs such super-instances in a top-down manner to produce high-quality results early on in the clustering process, and keeps refining these super-instances as more pairwise queries are given to get more detailed clusterings later on. We experimentally demonstrate that COBRAS produces good clusterings at fast run times, making it an excellent candidate for the iterative clustering scenario outlined above.
Tasks
Published	2018-03-29
URL	http://arxiv.org/abs/1803.11060v1
PDF	http://arxiv.org/pdf/1803.11060v1.pdf
PWC	https://paperswithcode.com/paper/cobras-fast-iterative-active-clustering-with
Repo
Framework