Paper Group ANR 775
Convex-constrained Sparse Additive Modeling and Its Extensions. A Feature Embedding Strategy for High-level CNN representations from Multiple ConvNets. Meta-QSAR: a large-scale application of meta-learning to drug design and discovery. Detection and Localization of Image Forgeries using Resampling Features and Deep Learning. BranchConnect: Large-Scale Visual Recognition with Learned Branch Connections. CATERPILLAR: Coarse Grain Reconfigurable Architecture for Accelerating the Training of Deep Neural Networks. Matrix and Graph Operations for Relationship Inference: An Illustration with the Kinship Inference in the China Biographical Database. Attending to All Mention Pairs for Full Abstract Biological Relation Extraction. A Collective, Probabilistic Approach to Schema Mapping: Appendix. FusionSeg: Learning to combine motion and appearance for fully automatic segmentation of generic objects in videos. A Riemannian gossip approach to subspace learning on Grassmann manifold. Efficient Transfer Learning Schemes for Personalized Language Modeling using Recurrent Neural Network. Deformable Registration through Learning of Context-Specific Metric Aggregation. ORGB: Offset Correction in RGB Color Space for Illumination-Robust Image Processing. Improving End-to-End Speech Recognition with Policy Learning
Convex-constrained Sparse Additive Modeling and Its Extensions
Title | Convex-constrained Sparse Additive Modeling and Its Extensions |
Authors | Junming Yin, Yaoliang Yu |
Abstract | Sparse additive modeling is a class of effective methods for performing high-dimensional nonparametric regression. In this work we show how shape constraints, such as convexity/concavity and their extensions, can be integrated into additive models. The proposed sparse difference of convex additive models (SDCAM) can estimate most continuous functions without any a priori smoothness assumption. Motivated by a characterization of difference of convex functions, our method incorporates a natural regularization functional to avoid overfitting and to reduce model complexity. Computationally, we develop an efficient backfitting algorithm with linear per-iteration complexity. Experiments on both synthetic and real data verify that our method is competitive against state-of-the-art sparse additive models, with improved performance in most scenarios. |
Tasks | |
Published | 2017-05-01 |
URL | http://arxiv.org/abs/1705.00687v1 |
http://arxiv.org/pdf/1705.00687v1.pdf | |
PWC | https://paperswithcode.com/paper/convex-constrained-sparse-additive-modeling |
Repo | |
Framework | |
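The abstract above describes SDCAM as an additive model fit by backfitting with linear per-iteration cost. A minimal sketch of that generic backfitting loop is given below; the 1-D component fit is a crude binned-mean stand-in (hypothetical), whereas SDCAM fits difference-of-convex components with a variation-based regularizer, which is not reproduced here.

```python
import numpy as np

def smooth_1d(x, r, n_bins=10):
    """Placeholder 1-D component fit: a crude binned-mean smoother.
    SDCAM instead fits a difference-of-convex function with its own
    regularizer; this stand-in only shows the interface."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    means = np.array([r[idx == b].mean() if np.any(idx == b) else 0.0
                      for b in range(n_bins)])
    return means[idx]

def backfit_additive(X, y, n_iter=20):
    """Generic backfitting: cycle through features, refitting each component
    to the partial residual of all the others (one sweep is linear in n*d)."""
    n, d = X.shape
    f = np.zeros((n, d))                # fitted component values f_j(x_ij)
    intercept = y.mean()
    for _ in range(n_iter):
        for j in range(d):
            residual = y - intercept - f.sum(axis=1) + f[:, j]
            f[:, j] = smooth_1d(X[:, j], residual)
            f[:, j] -= f[:, j].mean()   # identifiability: centre each component
    return intercept, f

# toy check: y = x0^2 - |x1| + noise
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 5))
y = X[:, 0] ** 2 - np.abs(X[:, 1]) + 0.1 * rng.standard_normal(500)
b0, f = backfit_additive(X, y)
print("residual std:", np.std(y - b0 - f.sum(axis=1)))
```

Each sweep touches every feature once against the partial residual of the others, which is where the linear per-iteration complexity quoted in the abstract comes from.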
A Feature Embedding Strategy for High-level CNN representations from Multiple ConvNets
Title | A Feature Embedding Strategy for High-level CNN representations from Multiple ConvNets |
Authors | Thangarajah Akilan, Q. M. Jonathan Wu, Wei Jiang |
Abstract | Following the rapidly growing use of digital images, automatic image categorization has become a preeminent research area. It has broadened over time and adopted many algorithms, whereby multi-feature (generally hand-engineered) image characterization comes in handy to improve accuracy. Recently, in machine learning, it has been shown that features extracted through pre-trained deep convolutional neural networks (DCNNs or ConvNets) can improve classification accuracy. Hence, in this paper, we further investigate a feature embedding strategy to exploit cues from multiple DCNNs. We derive a generalized feature space by embedding three different DCNN bottleneck features with weights set with respect to their Softmax cross-entropy loss. Test outcomes on six different object classification data-sets and an action classification data-set show that, regardless of variation in image statistics and tasks, the proposed multi-DCNN bottleneck feature fusion is well suited to image classification tasks and is an effective complement to DCNNs. Comparisons to existing fusion-based image classification approaches show that the proposed method surpasses the state of the art and produces results competitive with fully trained DCNNs as well. |
Tasks | Action Classification, Image Categorization, Image Classification, Object Classification |
Published | 2017-05-11 |
URL | http://arxiv.org/abs/1705.04301v1 |
http://arxiv.org/pdf/1705.04301v1.pdf | |
PWC | https://paperswithcode.com/paper/a-feature-embedding-strategy-for-high-level |
Repo | |
Framework | |
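As a rough illustration of the fusion described above, the sketch below concatenates L2-normalised bottleneck features from several ConvNets, each scaled by a weight derived from its validation Softmax cross-entropy loss. The inverse-loss weighting and all variable names are assumptions for illustration; the paper's exact weighting scheme is not specified in the abstract.

```python
import numpy as np

def fuse_bottleneck_features(feature_sets, val_losses):
    """Concatenate L2-normalised bottleneck features from several ConvNets,
    each scaled by a weight derived from its validation cross-entropy loss.
    The inverse-loss weighting is an assumption for illustration."""
    losses = np.asarray(val_losses, dtype=float)
    weights = (1.0 / losses) / np.sum(1.0 / losses)   # lower loss -> larger weight
    fused = []
    for feats, w in zip(feature_sets, weights):
        norms = np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12
        fused.append(w * feats / norms)
    return np.hstack(fused)            # one generalized feature space

# e.g. bottleneck features of three hypothetical DCNNs for 100 images
rng = np.random.default_rng(1)
f1 = rng.normal(size=(100, 2048))
f2 = rng.normal(size=(100, 1024))
f3 = rng.normal(size=(100, 1536))
X = fuse_bottleneck_features([f1, f2, f3], val_losses=[0.9, 1.4, 1.1])
print(X.shape)   # (100, 4608) -> feed to a downstream classifier
```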
Meta-QSAR: a large-scale application of meta-learning to drug design and discovery
Title | Meta-QSAR: a large-scale application of meta-learning to drug design and discovery |
Authors | Ivan Olier, Noureddin Sadawi, G. Richard Bickerton, Joaquin Vanschoren, Crina Grosan, Larisa Soldatova, Ross D. King |
Abstract | We investigate the learning of quantitative structure activity relationships (QSARs) as a case study of meta-learning. This application area is of the highest societal importance, as it is a key step in the development of new medicines. The standard QSAR learning problem is: given a target (usually a protein) and a set of chemical compounds (small molecules) with associated bioactivities (e.g. inhibition of the target), learn a predictive mapping from molecular representation to activity. Although almost every type of machine learning method has been applied to QSAR learning, there is no agreed single best way of learning QSARs, and therefore the problem area is well suited to meta-learning. We first carried out the most comprehensive comparison of machine learning methods for QSAR learning to date: 18 regression methods and 6 molecular representations, applied to more than 2,700 QSAR problems. (These results have been made publicly available on OpenML and represent a valuable resource for testing novel meta-learning methods.) We then investigated the utility of algorithm selection for QSAR problems. We found that this meta-learning approach outperformed the best individual QSAR learning method (random forests using a molecular fingerprint representation) by up to 13% on average. We conclude that meta-learning outperforms base-learning methods for QSAR learning, and as this investigation is one of the most extensive comparisons of base- and meta-learning methods ever made, it provides evidence for the general effectiveness of meta-learning over base-learning. |
Tasks | Meta-Learning |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.03854v1 |
http://arxiv.org/pdf/1709.03854v1.pdf | |
PWC | https://paperswithcode.com/paper/meta-qsar-a-large-scale-application-of-meta |
Repo | |
Framework | |
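The algorithm-selection flavour of meta-learning described above can be sketched as follows: describe each QSAR problem by meta-features, record which base regressor performed best on it, and train a meta-classifier to recommend a method for unseen problems. The meta-features, the use of scikit-learn, and the random data are all hypothetical; only the overall recipe follows the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical meta-dataset: one row per QSAR problem.
# meta_features: e.g. #compounds, #descriptors, activity variance, target class...
# best_learner:  index of the base regression method that won on that problem.
rng = np.random.default_rng(2)
meta_features = rng.normal(size=(2700, 8))
best_learner = rng.integers(0, 18, size=2700)       # 18 candidate regression methods

# Algorithm selection = a classifier from dataset meta-features to the best method.
selector = RandomForestClassifier(n_estimators=200, random_state=0)
selector.fit(meta_features[:2500], best_learner[:2500])

new_problems = meta_features[2500:]                  # unseen QSAR problems
chosen = selector.predict(new_problems)              # which regressor to run on each
print(chosen[:10])
```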
Detection and Localization of Image Forgeries using Resampling Features and Deep Learning
Title | Detection and Localization of Image Forgeries using Resampling Features and Deep Learning |
Authors | Jason Bunk, Jawadul H. Bappy, Tajuddin Manhar Mohammed, Lakshmanan Nataraj, Arjuna Flenner, B. S. Manjunath, Shivkumar Chandrasekaran, Amit K. Roy-Chowdhury, Lawrence Peterson |
Abstract | Resampling is an important signature of manipulated images. In this paper, we propose two methods to detect and localize image manipulations based on a combination of resampling features and deep learning. In the first method, the Radon transform of resampling features is computed on overlapping image patches. Deep learning classifiers and a Gaussian conditional random field model are then used to create a heatmap. Tampered regions are located using a Random Walker segmentation method. In the second method, resampling features computed on overlapping image patches are passed through a long short-term memory (LSTM) based network for classification and localization. We compare the detection/localization performance of both methods. Our experimental results show that both techniques are effective in detecting and localizing digital image forgeries. |
Tasks | |
Published | 2017-07-03 |
URL | http://arxiv.org/abs/1707.00433v1 |
http://arxiv.org/pdf/1707.00433v1.pdf | |
PWC | https://paperswithcode.com/paper/detection-and-localization-of-image-forgeries |
Repo | |
Framework | |
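A sketch of the patch-level pipeline from the first method above: score overlapping patches and accumulate the scores into a per-pixel heatmap. The patch scorer here is a dummy statistic standing in for the Radon-transformed resampling features and deep classifiers, and the Gaussian CRF smoothing and Random Walker segmentation are only indicated in comments.

```python
import numpy as np

def patch_score(patch):
    """Placeholder for the resampling-feature classifier.
    The paper computes Radon transforms of resampling features and feeds
    them to deep classifiers; here we just return a dummy statistic."""
    return float(np.var(np.diff(patch, axis=1)))    # stand-in score

def tamper_heatmap(image, patch=64, stride=32):
    """Score overlapping patches and average the scores per pixel."""
    h, w = image.shape
    heat = np.zeros((h, w))
    count = np.zeros((h, w))
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            s = patch_score(image[y:y + patch, x:x + patch])
            heat[y:y + patch, x:x + patch] += s
            count[y:y + patch, x:x + patch] += 1
    return heat / np.maximum(count, 1)

img = np.random.rand(256, 256)
hm = tamper_heatmap(img)
# the heatmap would then be smoothed (Gaussian CRF) and segmented
# (Random Walker) to localize the tampered region
print(hm.shape, hm.min(), hm.max())
```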
BranchConnect: Large-Scale Visual Recognition with Learned Branch Connections
Title | BranchConnect: Large-Scale Visual Recognition with Learned Branch Connections |
Authors | Karim Ahmed, Lorenzo Torresani |
Abstract | We introduce an architecture for large-scale image categorization that enables the end-to-end learning of separate visual features for the different classes to distinguish. The proposed model consists of a deep CNN shaped like a tree. The stem of the tree includes a sequence of convolutional layers common to all classes. The stem then splits into multiple branches implementing parallel feature extractors, which are ultimately connected to the final classification layer via learned gated connections. These learned gates determine for each individual class the subset of features to use. Such a scheme naturally encourages the learning of a heterogeneous set of specialized features through the separate branches and it allows each class to use the subset of features that are optimal for its recognition. We show the generality of our proposed method by reshaping several popular CNNs from the literature into our proposed architecture. Our experiments on the CIFAR100, CIFAR10, and Synth datasets show that in each case our resulting model yields a substantial improvement in accuracy over the original CNN. Our empirical analysis also suggests that our scheme acts as a form of beneficial regularization improving generalization performance. |
Tasks | Image Categorization, Object Recognition |
Published | 2017-04-20 |
URL | http://arxiv.org/abs/1704.06010v3 |
http://arxiv.org/pdf/1704.06010v3.pdf | |
PWC | https://paperswithcode.com/paper/branchconnect-large-scale-visual-recognition |
Repo | |
Framework | |
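A minimal numpy sketch of the gated branch-to-class connections described above: each class scores a gated sum of the parallel branches' features through the final classification layer. Shapes and the gate binarisation are assumptions; the stem, branch convolutions, and gate learning are omitted.

```python
import numpy as np

def branchconnect_logits(branch_feats, gates, W, b):
    """Sketch of a BranchConnect-style head (not the authors' code).
    branch_feats: (n, B, d)  -- features from B parallel branches
    gates:        (C, B)     -- learned gates; each class picks its branch subset
    W:            (C, d), b: (C,) -- final classification layer
    Class c scores the gated sum of branch features: W[c] . sum_b gates[c,b]*f_b."""
    gated = np.einsum('cb,nbd->ncd', gates, branch_feats)   # per-class feature mix
    return np.einsum('ncd,cd->nc', gated, W) + b            # (n, C) logits

n, B, d, C = 4, 3, 128, 10
rng = np.random.default_rng(3)
feats = rng.normal(size=(n, B, d))
gates = (rng.random((C, B)) > 0.5).astype(float)   # e.g. binarised at inference
logits = branchconnect_logits(feats, gates, rng.normal(size=(C, d)), np.zeros(C))
print(logits.shape)   # (4, 10)
```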
CATERPILLAR: Coarse Grain Reconfigurable Architecture for Accelerating the Training of Deep Neural Networks
Title | CATERPILLAR: Coarse Grain Reconfigurable Architecture for Accelerating the Training of Deep Neural Networks |
Authors | Yuanfang Li, Ardavan Pedram |
Abstract | Accelerating the inference of a trained DNN is a well studied subject. In this paper we switch the focus to the training of DNNs. The training phase is compute intensive, demands complicated data communication, and contains multiple levels of data dependencies and parallelism. This paper presents an algorithm/architecture space exploration of efficient accelerators to achieve better network convergence rates and higher energy efficiency for training DNNs. We further demonstrate that an architecture with hierarchical support for collective communication semantics provides flexibility in training various networks performing both stochastic and batched gradient descent based techniques. Our results suggest that smaller networks favor non-batched techniques while performance for larger networks is higher using batched operations. At 45nm technology, CATERPILLAR achieves performance efficiencies of 177 GFLOPS/W at over 80% utilization for SGD training on small networks and 211 GFLOPS/W at over 90% utilization for pipelined SGD/CP training on larger networks using a total area of 103.2 mm$^2$ and 178.9 mm$^2$ respectively. |
Tasks | |
Published | 2017-06-01 |
URL | http://arxiv.org/abs/1706.00517v2 |
http://arxiv.org/pdf/1706.00517v2.pdf | |
PWC | https://paperswithcode.com/paper/caterpillar-coarse-grain-reconfigurable |
Repo | |
Framework | |
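The abstract contrasts non-batched (per-sample) SGD with batched gradient descent. The toy sketch below shows the two update schedules on a linear least-squares model, purely to make that algorithmic distinction concrete; it says nothing about the CATERPILLAR hardware itself.

```python
import numpy as np

def sgd_per_sample(X, y, lr=0.01, epochs=5):
    """Non-batched SGD: one weight update per training example."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            grad = (xi @ w - yi) * xi        # squared-loss gradient for one sample
            w -= lr * grad
    return w

def batched_gd(X, y, lr=0.01, epochs=5, batch=32):
    """Batched gradient descent: one update per mini-batch (maps to matrix ops)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in range(0, len(X), batch):
            Xb, yb = X[i:i + batch], y[i:i + batch]
            grad = Xb.T @ (Xb @ w - yb) / len(Xb)
            w -= lr * grad
    return w

rng = np.random.default_rng(4)
X = rng.normal(size=(1024, 16))
w_true = rng.normal(size=16)
y = X @ w_true + 0.01 * rng.standard_normal(1024)
print(np.linalg.norm(sgd_per_sample(X, y) - w_true),
      np.linalg.norm(batched_gd(X, y) - w_true))
```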
Matrix and Graph Operations for Relationship Inference: An Illustration with the Kinship Inference in the China Biographical Database
Title | Matrix and Graph Operations for Relationship Inference: An Illustration with the Kinship Inference in the China Biographical Database |
Authors | Chao-Lin Liu, Hongsu Wang |
Abstract | Biographical databases contain diverse information about individuals. Person names, birth information, career, friends, family and special achievements are some possible items in the record for an individual. The relationships between individuals, such as kinship and friendship, provide invaluable insights about hidden communities which are not directly recorded in databases. We show that some simple matrix and graph-based operations are effective for inferring relationships among individuals, and illustrate the main ideas with the China Biographical Database (CBDB). |
Tasks | |
Published | 2017-09-09 |
URL | http://arxiv.org/abs/1709.02968v1 |
http://arxiv.org/pdf/1709.02968v1.pdf | |
PWC | https://paperswithcode.com/paper/matrix-and-graph-operations-for-relationship |
Repo | |
Framework | |
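A minimal sketch of the kind of matrix operation the abstract alludes to: representing one kinship relation as a directed adjacency matrix and composing relations by matrix multiplication (e.g., parent-of composed with parent-of yields grandparent-of). The names and relations are invented for illustration and are not taken from CBDB.

```python
import numpy as np

people = ["Wang An", "Wang Bo", "Wang Chen", "Li Mei"]
n = len(people)

# Directed adjacency matrix for one relation: parent[i, j] = 1 if i is a parent of j.
parent = np.zeros((n, n), dtype=int)
parent[0, 1] = 1        # Wang An -> parent of Wang Bo
parent[1, 2] = 1        # Wang Bo -> parent of Wang Chen
parent[3, 1] = 1        # Li Mei  -> parent of Wang Bo

# Composing relations = multiplying adjacency matrices:
grandparent = (parent @ parent > 0).astype(int)        # parent of a parent
shares_parent = (parent.T @ parent > 0).astype(int)    # two people with a common parent
np.fill_diagonal(shares_parent, 0)

for i, j in zip(*np.nonzero(grandparent)):
    print(f"{people[i]} is a grandparent of {people[j]}")
```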
Attending to All Mention Pairs for Full Abstract Biological Relation Extraction
Title | Attending to All Mention Pairs for Full Abstract Biological Relation Extraction |
Authors | Patrick Verga, Emma Strubell, Ofer Shai, Andrew McCallum |
Abstract | Most work in relation extraction forms a prediction by looking at a short span of text within a single sentence containing a single entity pair mention. However, many relation types, particularly in biomedical text, are expressed across sentences or require a large context to disambiguate. We propose a model to consider all mention and entity pairs simultaneously in order to make a prediction. We encode full paper abstracts using an efficient self-attention encoder and form pairwise predictions between all mentions with a bi-affine operation. Entity-pair-wise pooling aggregates mention-pair scores to make a final prediction while alleviating training noise by performing within-document multi-instance learning. We improve our model's performance by jointly training the model to predict named entities and adding an additional corpus of weakly labeled data. We demonstrate our model's effectiveness by achieving state of the art on the BioCreative V Chemical Disease Relation dataset for models without KB resources, outperforming ensembles of models which use hand-crafted features and additional linguistic resources. |
Tasks | Relation Extraction |
Published | 2017-10-23 |
URL | http://arxiv.org/abs/1710.08312v2 |
http://arxiv.org/pdf/1710.08312v2.pdf | |
PWC | https://paperswithcode.com/paper/attending-to-all-mention-pairs-for-full |
Repo | |
Framework | |
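A numpy sketch of the bi-affine pairwise scoring and a smooth-max pooling over mention pairs, as described above. The self-attention encoder, joint NER training, and weakly labeled data are omitted, and the log-sum-exp pooling is only a stand-in for the paper's entity-pair-wise pooling.

```python
import numpy as np

def biaffine_scores(head, tail, U, W, b):
    """Bi-affine relation scoring between all head/tail mention pairs (sketch).
    head, tail: (m, d) mention vectors; U: (r, d, d) bilinear tensor;
    W: (r, 2d) linear term; b: (r,) bias.
    Returns (m, m, r): one score per mention pair per relation type."""
    d = head.shape[1]
    bilinear = np.einsum('id,rde,je->ijr', head, U, tail)
    linear = (head @ W[:, :d].T)[:, None, :] + (tail @ W[:, d:].T)[None, :, :]
    return bilinear + linear + b

def entity_pair_pool(pair_scores, head_mask, tail_mask):
    """Aggregate mention-pair scores for one entity pair with log-sum-exp
    (a smooth max), standing in for the paper's pooling."""
    sel = pair_scores[np.ix_(np.nonzero(head_mask)[0], np.nonzero(tail_mask)[0])]
    return np.log(np.exp(sel).sum(axis=(0, 1)) + 1e-12)

m, d, r = 6, 32, 4
rng = np.random.default_rng(5)
H, T = rng.normal(size=(m, d)), rng.normal(size=(m, d))
scores = biaffine_scores(H, T, rng.normal(size=(r, d, d)) * 0.1,
                         rng.normal(size=(r, 2 * d)) * 0.1, np.zeros(r))
head_mask = np.array([1, 1, 0, 0, 0, 0])   # mentions of the head entity
tail_mask = np.array([0, 0, 0, 1, 1, 0])   # mentions of the tail entity
print(entity_pair_pool(scores, head_mask, tail_mask))  # one score per relation type
```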
A Collective, Probabilistic Approach to Schema Mapping: Appendix
Title | A Collective, Probabilistic Approach to Schema Mapping: Appendix |
Authors | Angelika Kimmig, Alex Memory, Renee J. Miller, Lise Getoor |
Abstract | In this appendix we provide additional supplementary material to “A Collective, Probabilistic Approach to Schema Mapping.” We include an additional extended example, supplementary experiment details, and proof for the complexity result stated in the main paper. |
Tasks | |
Published | 2017-02-11 |
URL | http://arxiv.org/abs/1702.03447v1 |
http://arxiv.org/pdf/1702.03447v1.pdf | |
PWC | https://paperswithcode.com/paper/a-collective-probabilistic-approach-to-schema |
Repo | |
Framework | |
FusionSeg: Learning to combine motion and appearance for fully automatic segmentation of generic objects in videos
Title | FusionSeg: Learning to combine motion and appearance for fully automatic segmentation of generic objects in videos |
Authors | Suyog Dutt Jain, Bo Xiong, Kristen Grauman |
Abstract | We propose an end-to-end learning framework for segmenting generic objects in videos. Our method learns to combine appearance and motion information to produce pixel level segmentation masks for all prominent objects in videos. We formulate this task as a structured prediction problem and design a two-stream fully convolutional neural network which fuses together motion and appearance in a unified framework. Since large-scale video datasets with pixel level segmentations are problematic, we show how to bootstrap weakly annotated videos together with existing image recognition datasets for training. Through experiments on three challenging video segmentation benchmarks, our method substantially improves the state-of-the-art for segmenting generic (unseen) objects. Code and pre-trained models are available on the project website. |
Tasks | Structured Prediction, Video Semantic Segmentation |
Published | 2017-01-19 |
URL | http://arxiv.org/abs/1701.05384v2 |
http://arxiv.org/pdf/1701.05384v2.pdf | |
PWC | https://paperswithcode.com/paper/fusionseg-learning-to-combine-motion-and |
Repo | |
Framework | |
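The two-stream idea above can be illustrated with a trivial late fusion of per-pixel object probabilities from an appearance stream and a motion stream. FusionSeg learns the fusion end-to-end inside a two-stream fully convolutional network; the fixed element-wise rules below are only a stand-in for that learned fusion.

```python
import numpy as np

def fuse_streams(appearance_prob, motion_prob, mode="max"):
    """Fuse per-pixel object probabilities from an appearance stream and a
    motion stream. FusionSeg learns this fusion; a fixed element-wise rule
    is used here only to illustrate the two-stream idea."""
    if mode == "max":
        return np.maximum(appearance_prob, motion_prob)
    if mode == "mean":
        return 0.5 * (appearance_prob + motion_prob)
    return appearance_prob * motion_prob     # "product": both cues must agree

H, W = 120, 160
rng = np.random.default_rng(6)
app = rng.random((H, W))               # would come from the appearance (RGB) stream
mot = rng.random((H, W))               # would come from the motion (optical flow) stream
mask = fuse_streams(app, mot) > 0.5    # binary mask of the prominent object
print(mask.mean())
```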
A Riemannian gossip approach to subspace learning on Grassmann manifold
Title | A Riemannian gossip approach to subspace learning on Grassmann manifold |
Authors | Bamdev Mishra, Hiroyuki Kasai, Pratik Jawanpuria, Atul Saroop |
Abstract | In this paper, we focus on subspace learning problems on the Grassmann manifold. Interesting applications in this setting include low-rank matrix completion and low-dimensional multivariate regression, among others. Motivated by privacy concerns, we aim to solve such problems in a decentralized setting where multiple agents have access to (and solve) only a part of the whole optimization problem. The agents communicate with each other to arrive at a consensus, i.e., agree on a common quantity, via the gossip protocol. We propose a novel cost function for subspace learning on the Grassmann manifold, which is a weighted sum of several sub-problems (each solved by an agent) and the communication cost among the agents. The cost function has a finite-sum structure. In the proposed modeling approach, different agents learn individual local subspaces, but they achieve asymptotic consensus on the global learned subspace. The approach is scalable and parallelizable. Numerical experiments show the efficacy of the proposed decentralized algorithms on various matrix completion and multivariate regression benchmarks. |
Tasks | Low-Rank Matrix Completion, Matrix Completion |
Published | 2017-05-01 |
URL | http://arxiv.org/abs/1705.00467v2 |
http://arxiv.org/pdf/1705.00467v2.pdf | |
PWC | https://paperswithcode.com/paper/a-riemannian-gossip-approach-to-subspace |
Repo | |
Framework | |
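A simplified sketch of one gossip exchange between two agents holding subspace bases: each agent moves toward the other in the ambient space and a QR-based retraction restores orthonormal columns. The actual algorithm takes Riemannian gradient steps on a weighted cost that mixes each agent's local task term with this consensus term; that cost is not reproduced here.

```python
import numpy as np

def qr_retract(U):
    """Restore orthonormal columns via thin QR (a common retraction)."""
    Q, _ = np.linalg.qr(U)
    return Q

def gossip_round(U_a, U_b, step=0.3):
    """One simplified gossip exchange between two agents holding subspace
    bases U_a, U_b (n x r): each moves toward the other, then retracts."""
    U_a_new = qr_retract((1 - step) * U_a + step * U_b)
    U_b_new = qr_retract((1 - step) * U_b + step * U_a)
    return U_a_new, U_b_new

def subspace_distance(U, V):
    """Distance between column spaces via principal angles."""
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return np.linalg.norm(np.arccos(np.clip(s, -1.0, 1.0)))

n, r = 50, 3
rng = np.random.default_rng(7)
U_a = qr_retract(rng.normal(size=(n, r)))
U_b = qr_retract(rng.normal(size=(n, r)))
for _ in range(15):
    U_a, U_b = gossip_round(U_a, U_b)
print("distance after gossip:", subspace_distance(U_a, U_b))  # shrinks toward consensus
```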
Efficient Transfer Learning Schemes for Personalized Language Modeling using Recurrent Neural Network
Title | Efficient Transfer Learning Schemes for Personalized Language Modeling using Recurrent Neural Network |
Authors | Seunghyun Yoon, Hyeongu Yun, Yuna Kim, Gyu-tae Park, Kyomin Jung |
Abstract | In this paper, we propose efficient transfer learning methods for training a personalized language model using a recurrent neural network with long short-term memory architecture. With our proposed fast transfer learning schemes, a general language model is updated to a personalized language model with a small amount of user data and limited computing resources. These methods are especially useful in a mobile device environment, where data is prevented from being transferred out of the device for privacy purposes. Through experiments on dialogue data from a drama, we verify that our transfer learning methods successfully generate a personalized language model whose output is more similar to the personal language style in both qualitative and quantitative aspects. |
Tasks | Language Modelling, Transfer Learning |
Published | 2017-01-13 |
URL | http://arxiv.org/abs/1701.03578v1 |
http://arxiv.org/pdf/1701.03578v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-transfer-learning-schemes-for |
Repo | |
Framework | |
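One of the simplest fast-transfer schemes consistent with the abstract is sketched below in PyTorch: load a general LSTM language model, freeze it, and fine-tune only the output layer on a small amount of user data. The network sizes, file names, and the choice of which layer to adapt are assumptions; the paper proposes its own specific schemes.

```python
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    """A small word-level LSTM language model (generic, not the paper's exact net)."""
    def __init__(self, vocab_size, emb=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens):
        h, _ = self.lstm(self.embed(tokens))
        return self.out(h)

vocab_size = 10000
model = LSTMLanguageModel(vocab_size)
# model.load_state_dict(torch.load("general_lm.pt"))   # hypothetical general LM

# Fast personalization: freeze the general model, adapt only the output layer.
for p in model.parameters():
    p.requires_grad = False
for p in model.out.parameters():
    p.requires_grad = True

optimizer = torch.optim.Adam(model.out.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

user_tokens = torch.randint(0, vocab_size, (8, 21))      # small user corpus (dummy)
inputs, targets = user_tokens[:, :-1], user_tokens[:, 1:]
for _ in range(3):                                       # a few cheap epochs on-device
    logits = model(inputs)
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad(); loss.backward(); optimizer.step()
print(float(loss))
```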
Deformable Registration through Learning of Context-Specific Metric Aggregation
Title | Deformable Registration through Learning of Context-Specific Metric Aggregation |
Authors | Enzo Ferrante, Puneet K Dokania, Rafael Marini, Nikos Paragios |
Abstract | We propose a novel weakly supervised discriminative algorithm for learning context-specific registration metrics as a linear combination of conventional similarity measures. Conventional metrics have been extensively used over the past two decades, and therefore both their strengths and limitations are known. The challenge is to find the optimal relative weighting (or parameters) of the different metrics forming the similarity measure of the registration algorithm. Hand-tuning these parameters would result in sub-optimal solutions and quickly becomes infeasible as the number of metrics increases. Furthermore, such a hand-crafted combination can only be made at a global scale (the entire volume) and therefore cannot account for different tissue properties. We propose a learning algorithm for estimating these parameters locally, conditioned on the data's semantic classes. The objective function of our formulation is a special case of a non-convex function, a difference of convex functions, which we optimize using the concave-convex procedure. As a proof of concept, we show the impact of our approach on three challenging datasets for different anatomical structures and modalities. |
Tasks | |
Published | 2017-07-19 |
URL | http://arxiv.org/abs/1707.06263v1 |
http://arxiv.org/pdf/1707.06263v1.pdf | |
PWC | https://paperswithcode.com/paper/deformable-registration-through-learning-of |
Repo | |
Framework | |
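A sketch of the aggregated similarity measure described above: a linear combination of conventional metrics (here SSD and NCC) whose weights depend on the local semantic class. The weight values are hypothetical; the paper's contribution is learning them in a weakly supervised way with the concave-convex procedure, which is not shown.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences (mean-normalised)."""
    return float(np.mean((a - b) ** 2))

def ncc(a, b):
    """Normalised cross-correlation."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float(np.mean(a * b))

def aggregated_similarity(src_patch, tgt_patch, tissue_class, weights):
    """Context-specific metric aggregation (sketch): a linear combination of
    conventional metrics whose weights depend on the local semantic class.
    The paper learns these weights; here they are simply given."""
    w = weights[tissue_class]
    # lower is better: use SSD and (1 - NCC) so both terms are dissimilarities
    return w[0] * ssd(src_patch, tgt_patch) + w[1] * (1.0 - ncc(src_patch, tgt_patch))

weights = {"bone": [0.8, 0.2], "soft_tissue": [0.3, 0.7]}   # hypothetical learned weights
rng = np.random.default_rng(8)
p, q = rng.random((16, 16)), rng.random((16, 16))
print(aggregated_similarity(p, q, "bone", weights),
      aggregated_similarity(p, q, "soft_tissue", weights))
```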
ORGB: Offset Correction in RGB Color Space for Illumination-Robust Image Processing
Title | ORGB: Offset Correction in RGB Color Space for Illumination-Robust Image Processing |
Authors | Zhenqiang Ying, Ge Li, Sixin Wen, Guozhen Tan |
Abstract | Single materials have colors which form straight lines in RGB space. However, in severe shadow cases, those lines do not intersect the origin, which is inconsistent with the description in most of the literature. This paper is concerned with the detection and correction of the offset between the intersection and the origin. First, we analyze the reason this offset forms via an optical imaging model. Second, we present a simple and effective way to detect and remove the offset. The resulting images, named ORGB, have almost the same appearance as the original RGB images while being more illumination-robust for color space conversion. Moreover, image processing using ORGB instead of RGB is free from the interference of shadows. Finally, the proposed offset correction method is applied to a road detection task, improving performance in both quantitative and qualitative evaluations. |
Tasks | |
Published | 2017-08-03 |
URL | http://arxiv.org/abs/1708.00975v1 |
http://arxiv.org/pdf/1708.00975v1.pdf | |
PWC | https://paperswithcode.com/paper/orgb-offset-correction-in-rgb-color-space-for |
Repo | |
Framework | |
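The geometric idea above can be sketched as follows: fit a line to each material's RGB samples, find the single point closest to all the lines in a least-squares sense (the offset), and subtract it so the color lines pass through the origin. The material segmentation and the paper's exact estimator are not reproduced; this is only a plausible reconstruction of the offset-correction step.

```python
import numpy as np

def fit_color_line(pixels):
    """Fit a line to one material's RGB samples: mean point + principal direction."""
    mean = pixels.mean(axis=0)
    _, _, vt = np.linalg.svd(pixels - mean, full_matrices=False)
    return mean, vt[0]                       # point on line, unit direction

def estimate_offset(material_pixel_sets):
    """Least-squares point closest to all fitted color lines.
    With no offset, that point would be the RGB origin."""
    A = np.zeros((3, 3)); b = np.zeros(3)
    for pixels in material_pixel_sets:
        p, d = fit_color_line(pixels)
        P = np.eye(3) - np.outer(d, d)       # projector orthogonal to the line
        A += P; b += P @ p
    return np.linalg.solve(A, b)

# synthetic example: two "materials" whose color lines share the offset (20, 15, 10)
rng = np.random.default_rng(9)
offset = np.array([20.0, 15.0, 10.0])
m1 = offset + np.outer(rng.uniform(0.2, 1.0, 300), [90, 60, 30]) + rng.normal(0, 1, (300, 3))
m2 = offset + np.outer(rng.uniform(0.2, 1.0, 300), [30, 70, 80]) + rng.normal(0, 1, (300, 3))

est = estimate_offset([m1, m2])
print("estimated offset:", est)              # ~ (20, 15, 10)
# ORGB-style correction: subtract the offset so the color lines pass through the origin
m1_corrected = np.clip(m1 - est, 0, None)
```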
Improving End-to-End Speech Recognition with Policy Learning
Title | Improving End-to-End Speech Recognition with Policy Learning |
Authors | Yingbo Zhou, Caiming Xiong, Richard Socher |
Abstract | Connectionist temporal classification (CTC) is widely used for maximum likelihood learning in end-to-end speech recognition models. However, there is usually a disparity between the negative maximum likelihood and the performance metric used in speech recognition, e.g., word error rate (WER). This results in a mismatch between the objective function and the metric during training. We show that the above problem can be mitigated by jointly training with maximum likelihood and policy gradient. In particular, with policy learning we are able to directly optimize the (otherwise non-differentiable) performance metric. We show that joint training improves relative performance by 4% to 13% for our end-to-end model as compared to the same model learned through maximum likelihood. The model achieves 5.53% WER on the Wall Street Journal dataset, and 5.42% and 14.70% on the Librispeech test-clean and test-other sets, respectively. |
Tasks | End-To-End Speech Recognition, Speech Recognition |
Published | 2017-12-19 |
URL | http://arxiv.org/abs/1712.07101v1 |
http://arxiv.org/pdf/1712.07101v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-end-to-end-speech-recognition-with-1 |
Repo | |
Framework | |
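A schematic PyTorch sketch of the joint objective described above: the CTC loss plus a REINFORCE-style policy-gradient term whose reward is the negative edit distance of a sampled transcription. The frame-level sampling, the missing reward baseline, and all hyperparameters are simplifications and assumptions, not the authors' training code.

```python
import torch
import torch.nn.functional as F

def edit_distance(a, b):
    """Levenshtein distance between two token sequences (used for the reward)."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (x != y))
    return dp[-1]

def collapse_ctc(path, blank=0):
    """Collapse a frame-level CTC path: merge repeats, drop blanks."""
    out, last = [], None
    for t in path:
        if t != last and t != blank:
            out.append(t)
        last = t
    return out

def joint_loss(log_probs, target, lam=0.1, blank=0):
    """Schematic joint objective: CTC loss + lambda * REINFORCE term.
    log_probs: (T, 1, C) log-softmax outputs for one utterance; target: label ids.
    Reward = negative edit distance of a sampled transcription (no baseline)."""
    T, _, C = log_probs.shape
    ctc = F.ctc_loss(log_probs, torch.tensor([target]),
                     torch.tensor([T]), torch.tensor([len(target)]), blank=blank)
    dist = torch.distributions.Categorical(logits=log_probs.squeeze(1))
    path = dist.sample()                                     # one sampled frame path
    reward = -float(edit_distance(collapse_ctc(path.tolist(), blank), target))
    pg = -(reward * dist.log_prob(path).sum())               # REINFORCE estimator
    return ctc + lam * pg

# dummy example: 50 frames, 30-symbol vocabulary (id 0 = blank), target "7 3 12 5"
logits = torch.randn(50, 1, 30, requires_grad=True)
loss = joint_loss(F.log_softmax(logits, dim=-1), [7, 3, 12, 5])
loss.backward()
print(float(loss))
```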