Paper Group ANR 775
Convex-constrained Sparse Additive Modeling and Its Extensions. A Feature Embedding Strategy for High-level CNN representations from Multiple ConvNets. Meta-QSAR: a large-scale application of meta-learning to drug design and discovery. Detection and Localization of Image Forgeries using Resampling Features and Deep Learning. BranchConnect: Large-Scale Visual Recognition with Learned Branch Connections. CATERPILLAR: Coarse Grain Reconfigurable Architecture for Accelerating the Training of Deep Neural Networks. Matrix and Graph Operations for Relationship Inference: An Illustration with the Kinship Inference in the China Biographical Database. Attending to All Mention Pairs for Full Abstract Biological Relation Extraction. A Collective, Probabilistic Approach to Schema Mapping: Appendix. FusionSeg: Learning to combine motion and appearance for fully automatic segmentation of generic objects in videos. A Riemannian gossip approach to subspace learning on Grassmann manifold. Efficient Transfer Learning Schemes for Personalized Language Modeling using Recurrent Neural Network. Deformable Registration through Learning of Context-Specific Metric Aggregation. ORGB: Offset Correction in RGB Color Space for Illumination-Robust Image Processing. Improving End-to-End Speech Recognition with Policy Learning
Convex-constrained Sparse Additive Modeling and Its Extensions
Title | Convex-constrained Sparse Additive Modeling and Its Extensions |
Authors | Junming Yin, Yaoliang Yu |
Abstract | Sparse additive modeling is a class of effective methods for performing high-dimensional nonparametric regression. In this work we show how shape constraints, such as convexity/concavity and their extensions, can be integrated into additive models. The proposed sparse difference of convex additive models (SDCAM) can estimate most continuous functions without any a priori smoothness assumption. Motivated by a characterization of difference of convex functions, our method incorporates a natural regularization functional to avoid overfitting and to reduce model complexity. Computationally, we develop an efficient backfitting algorithm with linear per-iteration complexity. Experiments on both synthetic and real data verify that our method is competitive against state-of-the-art sparse additive models, with improved performance in most scenarios. |
Tasks | |
Published | 2017-05-01 |
URL | http://arxiv.org/abs/1705.00687v1 |
http://arxiv.org/pdf/1705.00687v1.pdf | |
PWC | https://paperswithcode.com/paper/convex-constrained-sparse-additive-modeling |
Repo | |
Framework | |
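The abstract above describes SDCAM as an additive model fit by backfitting with linear per-iteration cost. A minimal sketch of that generic backfitting loop is given below; the 1-D component fit is a crude binned-mean stand-in (hypothetical), whereas SDCAM fits difference-of-convex components with a variation-based regularizer, which is not reproduced here.

```python
import numpy as np

def smooth_1d(x, r, n_bins=10):
    """Placeholder 1-D component fit: a crude binned-mean smoother.
    SDCAM instead fits a difference-of-convex function with its own
    regularizer; this stand-in only shows the interface."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    means = np.array([r[idx == b].mean() if np.any(idx == b) else 0.0
                      for b in range(n_bins)])
    return means[idx]

def backfit_additive(X, y, n_iter=20):
    """Generic backfitting: cycle through features, refitting each component
    to the partial residual of all the others (one sweep is linear in n*d)."""
    n, d = X.shape
    f = np.zeros((n, d))                # fitted component values f_j(x_ij)
    intercept = y.mean()
    for _ in range(n_iter):
        for j in range(d):
            residual = y - intercept - f.sum(axis=1) + f[:, j]
            f[:, j] = smooth_1d(X[:, j], residual)
            f[:, j] -= f[:, j].mean()   # identifiability: centre each component
    return intercept, f

# toy check: y = x0^2 - |x1| + noise
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 5))
y = X[:, 0] ** 2 - np.abs(X[:, 1]) + 0.1 * rng.standard_normal(500)
b0, f = backfit_additive(X, y)
print("residual std:", np.std(y - b0 - f.sum(axis=1)))
```

Each sweep touches every feature once against the partial residual of the others, which is where the linear per-iteration complexity quoted in the abstract comes from.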
A Feature Embedding Strategy for High-level CNN representations from Multiple ConvNets
Title | A Feature Embedding Strategy for High-level CNN representations from Multiple ConvNets |
Authors | Thangarajah Akilan, Q. M. Jonathan Wu, Wei Jiang |
Abstract | Following the rapidly growing use of digital images, automatic image categorization has become a preeminent research area. It has broadened over time and adopted many algorithms, whereby multi-feature (generally hand-engineered) image characterization comes in handy to improve accuracy. Recently, in machine learning, it has been shown that features extracted through pre-trained deep convolutional neural networks (DCNNs or ConvNets) can improve classification accuracy. Hence, in this paper, we further investigate a feature embedding strategy to exploit cues from multiple DCNNs. We derive a generalized feature space by embedding three different DCNN bottleneck features with weights set with respect to their Softmax cross-entropy loss. Test outcomes on six different object classification data-sets and an action classification data-set show that, regardless of variation in image statistics and tasks, the proposed multi-DCNN bottleneck feature fusion is well suited to image classification tasks and is an effective complement to DCNNs. Comparisons to existing fusion-based image classification approaches show that the proposed method surpasses the state of the art and produces results competitive with fully trained DCNNs as well. |
Tasks | Action Classification, Image Categorization, Image Classification, Object Classification |
Published | 2017-05-11 |
URL | http://arxiv.org/abs/1705.04301v1 |
http://arxiv.org/pdf/1705.04301v1.pdf | |
PWC | https://paperswithcode.com/paper/a-feature-embedding-strategy-for-high-level |
Repo | |
Framework | |
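As a rough illustration of the fusion described above, the sketch below concatenates L2-normalised bottleneck features from several ConvNets, each scaled by a weight derived from its validation Softmax cross-entropy loss. The inverse-loss weighting and all variable names are assumptions for illustration; the paper's exact weighting scheme is not specified in the abstract.

```python
import numpy as np

def fuse_bottleneck_features(feature_sets, val_losses):
    """Concatenate L2-normalised bottleneck features from several ConvNets,
    each scaled by a weight derived from its validation cross-entropy loss.
    The inverse-loss weighting is an assumption for illustration."""
    losses = np.asarray(val_losses, dtype=float)
    weights = (1.0 / losses) / np.sum(1.0 / losses)   # lower loss -> larger weight
    fused = []
    for feats, w in zip(feature_sets, weights):
        norms = np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12
        fused.append(w * feats / norms)
    return np.hstack(fused)            # one generalized feature space

# e.g. bottleneck features of three hypothetical DCNNs for 100 images
rng = np.random.default_rng(1)
f1 = rng.normal(size=(100, 2048))
f2 = rng.normal(size=(100, 1024))
f3 = rng.normal(size=(100, 1536))
X = fuse_bottleneck_features([f1, f2, f3], val_losses=[0.9, 1.4, 1.1])
print(X.shape)   # (100, 4608) -> feed to a downstream classifier
```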
Meta-QSAR: a large-scale application of meta-learning to drug design and discovery
Title | Meta-QSAR: a large-scale application of meta-learning to drug design and discovery |
Authors | Ivan Olier, Noureddin Sadawi, G. Richard Bickerton, Joaquin Vanschoren, Crina Grosan, Larisa Soldatova, Ross D. King |
Abstract | We investigate the learning of quantitative structure activity relationships (QSARs) as a case study of meta-learning. This application area is of the highest societal importance, as it is a key step in the development of new medicines. The standard QSAR learning problem is: given a target (usually a protein) and a set of chemical compounds (small molecules) with associated bioactivities (e.g. inhibition of the target), learn a predictive mapping from molecular representation to activity. Although almost every type of machine learning method has been applied to QSAR learning, there is no agreed single best way of learning QSARs, and therefore the problem area is well suited to meta-learning. We first carried out the most comprehensive comparison of machine learning methods for QSAR learning to date: 18 regression methods and 6 molecular representations, applied to more than 2,700 QSAR problems. (These results have been made publicly available on OpenML and represent a valuable resource for testing novel meta-learning methods.) We then investigated the utility of algorithm selection for QSAR problems. We found that this meta-learning approach outperformed the best individual QSAR learning method (random forests using a molecular fingerprint representation) by up to 13% on average. We conclude that meta-learning outperforms base-learning methods for QSAR learning, and as this investigation is one of the most extensive comparisons of base- and meta-learning methods ever made, it provides evidence for the general effectiveness of meta-learning over base-learning. |
Tasks | Meta-Learning |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.03854v1 |
http://arxiv.org/pdf/1709.03854v1.pdf | |
PWC | https://paperswithcode.com/paper/meta-qsar-a-large-scale-application-of-meta |
Repo | |
Framework | |
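The algorithm-selection flavour of meta-learning described above can be sketched as follows: describe each QSAR problem by meta-features, record which base regressor performed best on it, and train a meta-classifier to recommend a method for unseen problems. The meta-features, the use of scikit-learn, and the random data are all hypothetical; only the overall recipe follows the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical meta-dataset: one row per QSAR problem.
# meta_features: e.g. #compounds, #descriptors, activity variance, target class...
# best_learner:  index of the base regression method that won on that problem.
rng = np.random.default_rng(2)
meta_features = rng.normal(size=(2700, 8))
best_learner = rng.integers(0, 18, size=2700)       # 18 candidate regression methods

# Algorithm selection = a classifier from dataset meta-features to the best method.
selector = RandomForestClassifier(n_estimators=200, random_state=0)
selector.fit(meta_features[:2500], best_learner[:2500])

new_problems = meta_features[2500:]                  # unseen QSAR problems
chosen = selector.predict(new_problems)              # which regressor to run on each
print(chosen[:10])
```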
Detection and Localization of Image Forgeries using Resampling Features and Deep Learning
Title | Detection and Localization of Image Forgeries using Resampling Features and Deep Learning |
Authors | Jason Bunk, Jawadul H. Bappy, Tajuddin Manhar Mohammed, Lakshmanan Nataraj, Arjuna Flenner, B. S. Manjunath, Shivkumar Chandrasekaran, Amit K. Roy-Chowdhury, Lawrence Peterson |
Abstract | Resampling is an important signature of manipulated images. In this paper, we propose two methods to detect and localize image manipulations based on a combination of resampling features and deep learning. In the first method, the Radon transform of resampling features is computed on overlapping image patches. Deep learning classifiers and a Gaussian conditional random field model are then used to create a heatmap. Tampered regions are located using a Random Walker segmentation method. In the second method, resampling features computed on overlapping image patches are passed through a long short-term memory (LSTM) based network for classification and localization. We compare the detection/localization performance of both methods. Our experimental results show that both techniques are effective in detecting and localizing digital image forgeries. |
Tasks | |
Published | 2017-07-03 |
URL | http://arxiv.org/abs/1707.00433v1 |
http://arxiv.org/pdf/1707.00433v1.pdf | |
PWC | https://paperswithcode.com/paper/detection-and-localization-of-image-forgeries |
Repo | |
Framework | |
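A sketch of the patch-level pipeline from the first method above: score overlapping patches and accumulate the scores into a per-pixel heatmap. The patch scorer here is a dummy statistic standing in for the Radon-transformed resampling features and deep classifiers, and the Gaussian CRF smoothing and Random Walker segmentation are only indicated in comments.

```python
import numpy as np

def patch_score(patch):
    """Placeholder for the resampling-feature classifier.
    The paper computes Radon transforms of resampling features and feeds
    them to deep classifiers; here we just return a dummy statistic."""
    return float(np.var(np.diff(patch, axis=1)))    # stand-in score

def tamper_heatmap(image, patch=64, stride=32):
    """Score overlapping patches and average the scores per pixel."""
    h, w = image.shape
    heat = np.zeros((h, w))
    count = np.zeros((h, w))
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            s = patch_score(image[y:y + patch, x:x + patch])
            heat[y:y + patch, x:x + patch] += s
            count[y:y + patch, x:x + patch] += 1
    return heat / np.maximum(count, 1)

img = np.random.rand(256, 256)
hm = tamper_heatmap(img)
# the heatmap would then be smoothed (Gaussian CRF) and segmented
# (Random Walker) to localize the tampered region
print(hm.shape, hm.min(), hm.max())
```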
BranchConnect: Large-Scale Visual Recognition with Learned Branch Connections
Title | BranchConnect: Large-Scale Visual Recognition with Learned Branch Connections |
Authors | Karim Ahmed, Lorenzo Torresani |
Abstract | We introduce an architecture for large-scale image categorization that enables the end-to-end learning of separate visual features for the different classes to distinguish. The proposed model consists of a deep CNN shaped like a tree. The stem of the tree includes a sequence of convolutional layers common to all classes. The stem then splits into multiple branches implementing parallel feature extractors, which are ultimately connected to the final classification layer via learned gated connections. These learned gates determine for each individual class the subset of features to use. Such a scheme naturally encourages the learning of a heterogeneous set of specialized features through the separate branches and it allows each class to use the subset of features that are optimal for its recognition. We show the generality of our proposed method by reshaping several popular CNNs from the literature into our proposed architecture. Our experiments on the CIFAR100, CIFAR10, and Synth datasets show that in each case our resulting model yields a substantial improvement in accuracy over the original CNN. Our empirical analysis also suggests that our scheme acts as a form of beneficial regularization improving generalization performance. |
Tasks | Image Categorization, Object Recognition |
Published | 2017-04-20 |
URL | http://arxiv.org/abs/1704.06010v3 |
http://arxiv.org/pdf/1704.06010v3.pdf | |
PWC | https://paperswithcode.com/paper/branchconnect-large-scale-visual-recognition |
Repo | |
Framework | |
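A minimal numpy sketch of the gated branch-to-class connections described above: each class scores a gated sum of the parallel branches' features through the final classification layer. Shapes and the gate binarisation are assumptions; the stem, branch convolutions, and gate learning are omitted.

```python
import numpy as np

def branchconnect_logits(branch_feats, gates, W, b):
    """Sketch of a BranchConnect-style head (not the authors' code).
    branch_feats: (n, B, d)  -- features from B parallel branches
    gates:        (C, B)     -- learned gates; each class picks its branch subset
    W:            (C, d), b: (C,) -- final classification layer
    Class c scores the gated sum of branch features: W[c] . sum_b gates[c,b]*f_b."""
    gated = np.einsum('cb,nbd->ncd', gates, branch_feats)   # per-class feature mix
    return np.einsum('ncd,cd->nc', gated, W) + b            # (n, C) logits

n, B, d, C = 4, 3, 128, 10
rng = np.random.default_rng(3)
feats = rng.normal(size=(n, B, d))
gates = (rng.random((C, B)) > 0.5).astype(float)   # e.g. binarised at inference
logits = branchconnect_logits(feats, gates, rng.normal(size=(C, d)), np.zeros(C))
print(logits.shape)   # (4, 10)
```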
CATERPILLAR: Coarse Grain Reconfigurable Architecture for Accelerating the Training of Deep Neural Networks
Title | CATERPILLAR: Coarse Grain Reconfigurable Architecture for Accelerating the Training of Deep Neural Networks |
Authors | Yuanfang Li, Ardavan Pedram |
Abstract | Accelerating the inference of a trained DNN is a well studied subject. In this paper we switch the focus to the training of DNNs. The training phase is compute intensive, demands complicated data communication, and contains multiple levels of data dependencies and parallelism. This paper presents an algorithm/architecture space exploration of efficient accelerators to achieve better network convergence rates and higher energy efficiency for training DNNs. We further demonstrate that an architecture with hierarchical support for collective communication semantics provides flexibility in training various networks performing both stochastic and batched gradient descent based techniques. Our results suggest that smaller networks favor non-batched techniques while performance for larger networks is higher using batched operations. At 45nm technology, CATERPILLAR achieves performance efficiencies of 177 GFLOPS/W at over 80% utilization for SGD training on small networks and 211 GFLOPS/W at over 90% utilization for pipelined SGD/CP training on larger networks using a total area of 103.2 mm$^2$ and 178.9 mm$^2$ respectively. |
Tasks | |
Published | 2017-06-01 |
URL | http://arxiv.org/abs/1706.00517v2 |
http://arxiv.org/pdf/1706.00517v2.pdf | |
PWC | https://paperswithcode.com/paper/caterpillar-coarse-grain-reconfigurable |
Repo | |
Framework | |
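The abstract contrasts non-batched (per-sample) SGD with batched gradient descent. The toy sketch below shows the two update schedules on a linear least-squares model, purely to make that algorithmic distinction concrete; it says nothing about the CATERPILLAR hardware itself.

```python
import numpy as np

def sgd_per_sample(X, y, lr=0.01, epochs=5):
    """Non-batched SGD: one weight update per training example."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            grad = (xi @ w - yi) * xi        # squared-loss gradient for one sample
            w -= lr * grad
    return w

def batched_gd(X, y, lr=0.01, epochs=5, batch=32):
    """Batched gradient descent: one update per mini-batch (maps to matrix ops)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in range(0, len(X), batch):
            Xb, yb = X[i:i + batch], y[i:i + batch]
            grad = Xb.T @ (Xb @ w - yb) / len(Xb)
            w -= lr * grad
    return w

rng = np.random.default_rng(4)
X = rng.normal(size=(1024, 16))
w_true = rng.normal(size=16)
y = X @ w_true + 0.01 * rng.standard_normal(1024)
print(np.linalg.norm(sgd_per_sample(X, y) - w_true),
      np.linalg.norm(batched_gd(X, y) - w_true))
```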
Matrix and Graph Operations for Relationship Inference: An Illustration with the Kinship Inference in the China Biographical Database
Title | Matrix and Graph Operations for Relationship Inference: An Illustration with the Kinship Inference in the China Biographical Database |
Authors | Chao-Lin Liu, Hongsu Wang |
Abstract | Biographical databases contain diverse information about individuals. Person names, birth information, career, friends, family and special achievements are some possible items in the record for an individual. The relationships between individuals, such as kinship and friendship, provide invaluable insights about hidden communities which are not directly recorded in databases. We show that some simple matrix and graph-based operations are effective for inferring relationships among individuals, and illustrate the main ideas with the China Biographical Database (CBDB). |
Tasks | |
Published | 2017-09-09 |
URL | http://arxiv.org/abs/1709.02968v1 |
http://arxiv.org/pdf/1709.02968v1.pdf | |
PWC | https://paperswithcode.com/paper/matrix-and-graph-operations-for-relationship |
Repo | |
Framework | |
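A minimal sketch of the kind of matrix operation the abstract alludes to: representing one kinship relation as a directed adjacency matrix and composing relations by matrix multiplication (e.g., parent-of composed with parent-of yields grandparent-of). The names and relations are invented for illustration and are not taken from CBDB.

```python
import numpy as np

people = ["Wang An", "Wang Bo", "Wang Chen", "Li Mei"]
n = len(people)

# Directed adjacency matrix for one relation: parent[i, j] = 1 if i is a parent of j.
parent = np.zeros((n, n), dtype=int)
parent[0, 1] = 1        # Wang An -> parent of Wang Bo
parent[1, 2] = 1        # Wang Bo -> parent of Wang Chen
parent[3, 1] = 1        # Li Mei  -> parent of Wang Bo

# Composing relations = multiplying adjacency matrices:
grandparent = (parent @ parent > 0).astype(int)        # parent of a parent
shares_parent = (parent.T @ parent > 0).astype(int)    # two people with a common parent
np.fill_diagonal(shares_parent, 0)

for i, j in zip(*np.nonzero(grandparent)):
    print(f"{people[i]} is a grandparent of {people[j]}")
```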
Attending to All Mention Pairs for Full Abstract Biological Relation Extraction
Title | Attending to All Mention Pairs for Full Abstract Biological Relation Extraction |
Authors | Patrick Verga, Emma Strubell, Ofer Shai, Andrew McCallum |
Abstract | Most work in relation extraction forms a prediction by looking at a short span of text within a single sentence containing a single entity pair mention. However, many relation types, particularly in biomedical text, are expressed across sentences or require a large context to disambiguate. We propose a model to consider all mention and entity pairs simultaneously in order to make a prediction. We encode full paper abstracts using an efficient self-attention encoder and form pairwise predictions between all mentions with a bi-affine operation. Entity-pair-wise pooling aggregates mention-pair scores to make a final prediction while alleviating training noise by performing within-document multi-instance learning. We improve our model's performance by jointly training the model to predict named entities and adding an additional corpus of weakly labeled data. We demonstrate our model's effectiveness by achieving state of the art on the BioCreative V Chemical Disease Relation dataset for models without KB resources, outperforming ensembles of models which use hand-crafted features and additional linguistic resources. |
Tasks | Relation Extraction |
Published | 2017-10-23 |
URL | http://arxiv.org/abs/1710.08312v2 |
http://arxiv.org/pdf/1710.08312v2.pdf | |
PWC | https://paperswithcode.com/paper/attending-to-all-mention-pairs-for-full |
Repo | |
Framework | |
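A numpy sketch of the bi-affine pairwise scoring and a smooth-max pooling over mention pairs, as described above. The self-attention encoder, joint NER training, and weakly labeled data are omitted, and the log-sum-exp pooling is only a stand-in for the paper's entity-pair-wise pooling.

```python
import numpy as np

def biaffine_scores(head, tail, U, W, b):
    """Bi-affine relation scoring between all head/tail mention pairs (sketch).
    head, tail: (m, d) mention vectors; U: (r, d, d) bilinear tensor;
    W: (r, 2d) linear term; b: (r,) bias.
    Returns (m, m, r): one score per mention pair per relation type."""
    d = head.shape[1]
    bilinear = np.einsum('id,rde,je->ijr', head, U, tail)
    linear = (head @ W[:, :d].T)[:, None, :] + (tail @ W[:, d:].T)[None, :, :]
    return bilinear + linear + b

def entity_pair_pool(pair_scores, head_mask, tail_mask):
    """Aggregate mention-pair scores for one entity pair with log-sum-exp
    (a smooth max), standing in for the paper's pooling."""
    sel = pair_scores[np.ix_(np.nonzero(head_mask)[0], np.nonzero(tail_mask)[0])]
    return np.log(np.exp(sel).sum(axis=(0, 1)) + 1e-12)

m, d, r = 6, 32, 4
rng = np.random.default_rng(5)
H, T = rng.normal(size=(m, d)), rng.normal(size=(m, d))
scores = biaffine_scores(H, T, rng.normal(size=(r, d, d)) * 0.1,
                         rng.normal(size=(r, 2 * d)) * 0.1, np.zeros(r))
head_mask = np.array([1, 1, 0, 0, 0, 0])   # mentions of the head entity
tail_mask = np.array([0, 0, 0, 1, 1, 0])   # mentions of the tail entity
print(entity_pair_pool(scores, head_mask, tail_mask))  # one score per relation type
```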
A Collective, Probabilistic Approach to Schema Mapping: Appendix
Title | A Collective, Probabilistic Approach to Schema Mapping: Appendix |
Authors | Angelika Kimmig, Alex Memory, Renee J. Miller, Lise Getoor |
Abstract | In this appendix we provide additional supplementary material to “A Collective, Probabilistic Approach to Schema Mapping.” We include an additional extended example, supplementary experiment details, and proof for the complexity result stated in the main paper. |
Tasks | |
Published | 2017-02-11 |
URL | http://arxiv.org/abs/1702.03447v1 |
http://arxiv.org/pdf/1702.03447v1.pdf | |
PWC | https://paperswithcode.com/paper/a-collective-probabilistic-approach-to-schema |
Repo | |
Framework | |
FusionSeg: Learning to combine motion and appearance for fully automatic segmentation of generic objects in videos
Title | FusionSeg: Learning to combine motion and appearance for fully automatic segmentation of generic objects in videos |
Authors | Suyog Dutt Jain, Bo Xiong, Kristen Grauman |
Abstract | We propose an end-to-end learning framework for segmenting generic objects in videos. Our method learns to combine appearance and motion information to produce pixel level segmentation masks for all prominent objects in videos. We formulate this task as a structured prediction problem and design a two-stream fully convolutional neural network which fuses together motion and appearance in a unified framework. Since large-scale video datasets with pixel level segmentations are problematic, we show how to bootstrap weakly annotated videos together with existing image recognition datasets for training. Through experiments on three challenging video segmentation benchmarks, our method substantially improves the state-of-the-art for segmenting generic (unseen) objects. Code and pre-trained models are available on the project website. |
Tasks | Structured Prediction, Video Semantic Segmentation |
Published | 2017-01-19 |
URL | http://arxiv.org/abs/1701.05384v2 |
http://arxiv.org/pdf/1701.05384v2.pdf | |
PWC | https://paperswithcode.com/paper/fusionseg-learning-to-combine-motion-and |
Repo | |
Framework | |
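The two-stream idea above can be illustrated with a trivial late fusion of per-pixel object probabilities from an appearance stream and a motion stream. FusionSeg learns the fusion end-to-end inside a two-stream fully convolutional network; the fixed element-wise rules below are only a stand-in for that learned fusion.

```python
import numpy as np

def fuse_streams(appearance_prob, motion_prob, mode="max"):
    """Fuse per-pixel object probabilities from an appearance stream and a
    motion stream. FusionSeg learns this fusion; a fixed element-wise rule
    is used here only to illustrate the two-stream idea."""
    if mode == "max":
        return np.maximum(appearance_prob, motion_prob)
    if mode == "mean":
        return 0.5 * (appearance_prob + motion_prob)
    return appearance_prob * motion_prob     # "product": both cues must agree

H, W = 120, 160
rng = np.random.default_rng(6)
app = rng.random((H, W))               # would come from the appearance (RGB) stream
mot = rng.random((H, W))               # would come from the motion (optical flow) stream
mask = fuse_streams(app, mot) > 0.5    # binary mask of the prominent object
print(mask.mean())
```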
A Riemannian gossip approach to subspace learning on Grassmann manifold
Title | A Riemannian gossip approach to subspace learning on Grassmann manifold |
Authors | Bamdev Mishra, Hiroyuki Kasai, Pratik Jawanpuria, Atul Saroop |
Abstract | In this paper, we focus on subspace learning problems on the Grassmann manifold. Interesting applications in this setting include low-rank matrix completion and low-dimensional multivariate regression, among others. Motivated by privacy concerns, we aim to solve such problems in a decentralized setting where multiple agents have access to (and solve) only a part of the whole optimization problem. The agents communicate with each other to arrive at a consensus, i.e., agree on a common quantity, via the gossip protocol. We propose a novel cost function for subspace learning on the Grassmann manifold, which is a weighted sum of several sub-problems (each solved by an agent) and the communication cost among the agents. The cost function has a finite-sum structure. In the proposed modeling approach, different agents learn individual local subspaces, but they achieve asymptotic consensus on the global learned subspace. The approach is scalable and parallelizable. Numerical experiments show the efficacy of the proposed decentralized algorithms on various matrix completion and multivariate regression benchmarks. |
Tasks | Low-Rank Matrix Completion, Matrix Completion |
Published | 2017-05-01 |
URL | http://arxiv.org/abs/1705.00467v2 |
http://arxiv.org/pdf/1705.00467v2.pdf | |
PWC | https://paperswithcode.com/paper/a-riemannian-gossip-approach-to-subspace |
Repo | |
Framework | |
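A simplified sketch of one gossip exchange between two agents holding subspace bases: each agent moves toward the other in the ambient space and a QR-based retraction restores orthonormal columns. The actual algorithm takes Riemannian gradient steps on a weighted cost that mixes each agent's local task term with this consensus term; that cost is not reproduced here.

```python
import numpy as np

def qr_retract(U):
    """Restore orthonormal columns via thin QR (a common retraction)."""
    Q, _ = np.linalg.qr(U)
    return Q

def gossip_round(U_a, U_b, step=0.3):
    """One simplified gossip exchange between two agents holding subspace
    bases U_a, U_b (n x r): each moves toward the other, then retracts."""
    U_a_new = qr_retract((1 - step) * U_a + step * U_b)
    U_b_new = qr_retract((1 - step) * U_b + step * U_a)
    return U_a_new, U_b_new

def subspace_distance(U, V):
    """Distance between column spaces via principal angles."""
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return np.linalg.norm(np.arccos(np.clip(s, -1.0, 1.0)))

n, r = 50, 3
rng = np.random.default_rng(7)
U_a = qr_retract(rng.normal(size=(n, r)))
U_b = qr_retract(rng.normal(size=(n, r)))
for _ in range(15):
    U_a, U_b = gossip_round(U_a, U_b)
print("distance after gossip:", subspace_distance(U_a, U_b))  # shrinks toward consensus
```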
Efficient Transfer Learning Schemes for Personalized Language Modeling using Recurrent Neural Network
Title | Efficient Transfer Learning Schemes for Personalized Language Modeling using Recurrent Neural Network |
Authors | Seunghyun Yoon, Hyeongu Yun, Yuna Kim, Gyu-tae Park, Kyomin Jung |
Abstract | In this paper, we propose efficient transfer learning methods for training a personalized language model using a recurrent neural network with long short-term memory architecture. With our proposed fast transfer learning schemes, a general language model is updated to a personalized language model with a small amount of user data and limited computing resources. These methods are especially useful in a mobile device environment, where data is prevented from being transferred out of the device for privacy purposes. Through experiments on dialogue data from a drama, we verify that our transfer learning methods successfully generate a personalized language model whose output is more similar to the personal language style in both qualitative and quantitative aspects. |
Tasks | Language Modelling, Transfer Learning |
Published | 2017-01-13 |
URL | http://arxiv.org/abs/1701.03578v1 |
http://arxiv.org/pdf/1701.03578v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-transfer-learning-schemes-for |
Repo | |
Framework | |
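One of the simplest fast-transfer schemes consistent with the abstract is sketched below in PyTorch: load a general LSTM language model, freeze it, and fine-tune only the output layer on a small amount of user data. The network sizes, file names, and the choice of which layer to adapt are assumptions; the paper proposes its own specific schemes.

```python
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    """A small word-level LSTM language model (generic, not the paper's exact net)."""
    def __init__(self, vocab_size, emb=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens):
        h, _ = self.lstm(self.embed(tokens))
        return self.out(h)

vocab_size = 10000
model = LSTMLanguageModel(vocab_size)
# model.load_state_dict(torch.load("general_lm.pt"))   # hypothetical general LM

# Fast personalization: freeze the general model, adapt only the output layer.
for p in model.parameters():
    p.requires_grad = False
for p in model.out.parameters():
    p.requires_grad = True

optimizer = torch.optim.Adam(model.out.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

user_tokens = torch.randint(0, vocab_size, (8, 21))      # small user corpus (dummy)
inputs, targets = user_tokens[:, :-1], user_tokens[:, 1:]
for _ in range(3):                                       # a few cheap epochs on-device
    logits = model(inputs)
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad(); loss.backward(); optimizer.step()
print(float(loss))
```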
Deformable Registration through Learning of Context-Specific Metric Aggregation
Title | Deformable Registration through Learning of Context-Specific Metric Aggregation |
Authors | Enzo Ferrante, Puneet K Dokania, Rafael Marini, Nikos Paragios |
Abstract | We propose a novel weakly supervised discriminative algorithm for learning context-specific registration metrics as a linear combination of conventional similarity measures. Conventional metrics have been extensively used over the past two decades, and therefore both their strengths and limitations are known. The challenge is to find the optimal relative weighting (or parameters) of the different metrics forming the similarity measure of the registration algorithm. Hand-tuning these parameters would result in sub-optimal solutions and quickly becomes infeasible as the number of metrics increases. Furthermore, such a hand-crafted combination can only be made at a global scale (the entire volume) and therefore cannot account for different tissue properties. We propose a learning algorithm for estimating these parameters locally, conditioned on the data's semantic classes. The objective function of our formulation is a special case of a non-convex function, a difference of convex functions, which we optimize using the concave-convex procedure. As a proof of concept, we show the impact of our approach on three challenging datasets for different anatomical structures and modalities. |
Tasks | |
Published | 2017-07-19 |
URL | http://arxiv.org/abs/1707.06263v1 |
http://arxiv.org/pdf/1707.06263v1.pdf | |
PWC | https://paperswithcode.com/paper/deformable-registration-through-learning-of |
Repo | |
Framework | |
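A sketch of the aggregated similarity measure described above: a linear combination of conventional metrics (here SSD and NCC) whose weights depend on the local semantic class. The weight values are hypothetical; the paper's contribution is learning them in a weakly supervised way with the concave-convex procedure, which is not shown.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences (mean-normalised)."""
    return float(np.mean((a - b) ** 2))

def ncc(a, b):
    """Normalised cross-correlation."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float(np.mean(a * b))

def aggregated_similarity(src_patch, tgt_patch, tissue_class, weights):
    """Context-specific metric aggregation (sketch): a linear combination of
    conventional metrics whose weights depend on the local semantic class.
    The paper learns these weights; here they are simply given."""
    w = weights[tissue_class]
    # lower is better: use SSD and (1 - NCC) so both terms are dissimilarities
    return w[0] * ssd(src_patch, tgt_patch) + w[1] * (1.0 - ncc(src_patch, tgt_patch))

weights = {"bone": [0.8, 0.2], "soft_tissue": [0.3, 0.7]}   # hypothetical learned weights
rng = np.random.default_rng(8)
p, q = rng.random((16, 16)), rng.random((16, 16))
print(aggregated_similarity(p, q, "bone", weights),
      aggregated_similarity(p, q, "soft_tissue", weights))
```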
ORGB: Offset Correction in RGB Color Space for Illumination-Robust Image Processing
Title | ORGB: Offset Correction in RGB Color Space for Illumination-Robust Image Processing |
Authors | Zhenqiang Ying, Ge Li, Sixin Wen, Guozhen Tan |
Abstract | Single materials have colors which form straight lines in RGB space. However, in severe shadow cases, those lines do not intersect the origin, which is inconsistent with the description in most of the literature. This paper is concerned with the detection and correction of the offset between the intersection and the origin. First, we analyze the reason this offset forms via an optical imaging model. Second, we present a simple and effective way to detect and remove the offset. The resulting images, named ORGB, have almost the same appearance as the original RGB images while being more illumination-robust for color space conversion. Moreover, image processing using ORGB instead of RGB is free from the interference of shadows. Finally, the proposed offset correction method is applied to a road detection task, improving performance in both quantitative and qualitative evaluations. |
Tasks | |
Published | 2017-08-03 |
URL | http://arxiv.org/abs/1708.00975v1 |
http://arxiv.org/pdf/1708.00975v1.pdf | |
PWC | https://paperswithcode.com/paper/orgb-offset-correction-in-rgb-color-space-for |
Repo | |
Framework | |
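The geometric idea above can be sketched as follows: fit a line to each material's RGB samples, find the single point closest to all the lines in a least-squares sense (the offset), and subtract it so the color lines pass through the origin. The material segmentation and the paper's exact estimator are not reproduced; this is only a plausible reconstruction of the offset-correction step.

```python
import numpy as np

def fit_color_line(pixels):
    """Fit a line to one material's RGB samples: mean point + principal direction."""
    mean = pixels.mean(axis=0)
    _, _, vt = np.linalg.svd(pixels - mean, full_matrices=False)
    return mean, vt[0]                       # point on line, unit direction

def estimate_offset(material_pixel_sets):
    """Least-squares point closest to all fitted color lines.
    With no offset, that point would be the RGB origin."""
    A = np.zeros((3, 3)); b = np.zeros(3)
    for pixels in material_pixel_sets:
        p, d = fit_color_line(pixels)
        P = np.eye(3) - np.outer(d, d)       # projector orthogonal to the line
        A += P; b += P @ p
    return np.linalg.solve(A, b)

# synthetic example: two "materials" whose color lines share the offset (20, 15, 10)
rng = np.random.default_rng(9)
offset = np.array([20.0, 15.0, 10.0])
m1 = offset + np.outer(rng.uniform(0.2, 1.0, 300), [90, 60, 30]) + rng.normal(0, 1, (300, 3))
m2 = offset + np.outer(rng.uniform(0.2, 1.0, 300), [30, 70, 80]) + rng.normal(0, 1, (300, 3))

est = estimate_offset([m1, m2])
print("estimated offset:", est)              # ~ (20, 15, 10)
# ORGB-style correction: subtract the offset so the color lines pass through the origin
m1_corrected = np.clip(m1 - est, 0, None)
```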
Improving End-to-End Speech Recognition with Policy Learning
Title | Improving End-to-End Speech Recognition with Policy Learning |
Authors | Yingbo Zhou, Caiming Xiong, Richard Socher |
Abstract | Connectionist temporal classification (CTC) is widely used for maximum likelihood learning in end-to-end speech recognition models. However, there is usually a disparity between the negative maximum likelihood and the performance metric used in speech recognition, e.g., word error rate (WER). This results in a mismatch between the objective function and the metric during training. We show that the above problem can be mitigated by jointly training with maximum likelihood and policy gradient. In particular, with policy learning we are able to directly optimize the (otherwise non-differentiable) performance metric. We show that joint training improves relative performance by 4% to 13% for our end-to-end model as compared to the same model learned through maximum likelihood. The model achieves 5.53% WER on the Wall Street Journal dataset, and 5.42% and 14.70% on the Librispeech test-clean and test-other sets, respectively. |
Tasks | End-To-End Speech Recognition, Speech Recognition |
Published | 2017-12-19 |
URL | http://arxiv.org/abs/1712.07101v1 |
http://arxiv.org/pdf/1712.07101v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-end-to-end-speech-recognition-with-1 |
Repo | |
Framework | |
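A schematic PyTorch sketch of the joint objective described above: the CTC loss plus a REINFORCE-style policy-gradient term whose reward is the negative edit distance of a sampled transcription. The frame-level sampling, the missing reward baseline, and all hyperparameters are simplifications and assumptions, not the authors' training code.

```python
import torch
import torch.nn.functional as F

def edit_distance(a, b):
    """Levenshtein distance between two token sequences (used for the reward)."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (x != y))
    return dp[-1]

def collapse_ctc(path, blank=0):
    """Collapse a frame-level CTC path: merge repeats, drop blanks."""
    out, last = [], None
    for t in path:
        if t != last and t != blank:
            out.append(t)
        last = t
    return out

def joint_loss(log_probs, target, lam=0.1, blank=0):
    """Schematic joint objective: CTC loss + lambda * REINFORCE term.
    log_probs: (T, 1, C) log-softmax outputs for one utterance; target: label ids.
    Reward = negative edit distance of a sampled transcription (no baseline)."""
    T, _, C = log_probs.shape
    ctc = F.ctc_loss(log_probs, torch.tensor([target]),
                     torch.tensor([T]), torch.tensor([len(target)]), blank=blank)
    dist = torch.distributions.Categorical(logits=log_probs.squeeze(1))
    path = dist.sample()                                     # one sampled frame path
    reward = -float(edit_distance(collapse_ctc(path.tolist(), blank), target))
    pg = -(reward * dist.log_prob(path).sum())               # REINFORCE estimator
    return ctc + lam * pg

# dummy example: 50 frames, 30-symbol vocabulary (id 0 = blank), target "7 3 12 5"
logits = torch.randn(50, 1, 30, requires_grad=True)
loss = joint_loss(F.log_softmax(logits, dim=-1), [7, 3, 12, 5])
loss.backward()
print(float(loss))
```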