July 29, 2019

Paper Group ANR 148

Weakly Supervised Object Localization Using Things and Stuff Transfer

Title Weakly Supervised Object Localization Using Things and Stuff Transfer
Authors Miaojing Shi, Holger Caesar, Vittorio Ferrari
Abstract We propose to help weakly supervised object localization for classes where location annotations are not available, by transferring things and stuff knowledge from a source set with available annotations. The source and target classes might share similar appearance (e.g. bear fur is similar to cat fur) or appear against similar background (e.g. horse and sheep appear against grass). To exploit this, we acquire three types of knowledge from the source set: a segmentation model trained on both thing and stuff classes; similarity relations between target and source classes; and co-occurrence relations between thing and stuff classes in the source. The segmentation model is used to generate thing and stuff segmentation maps on a target image, while the class similarity and co-occurrence knowledge help refine them. We then incorporate these maps as new cues into a multiple instance learning (MIL) framework, propagating the transferred knowledge from the pixel level to the object proposal level. In extensive experiments, we conduct our transfer from the PASCAL Context dataset (source) to the ILSVRC, COCO and PASCAL VOC 2007 datasets (targets). We evaluate our transfer across widely different thing classes, including some that are not similar in appearance, but appear against similar background. The results demonstrate significant improvement over standard MIL, and we outperform the state-of-the-art in the transfer setting.
Tasks Multiple Instance Learning, Object Localization, Weakly-Supervised Object Localization
Published 2017-03-23
URL http://arxiv.org/abs/1703.08000v2
PDF http://arxiv.org/pdf/1703.08000v2.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-object-localization-using-1
Repo
Framework

KeyVec: Key-semantics Preserving Document Representations

Title KeyVec: Key-semantics Preserving Document Representations
Authors Bin Bi, Hao Ma
Abstract Previous studies have demonstrated the empirical success of word embeddings in various applications. In this paper, we investigate the problem of learning distributed representations for text documents which many machine learning algorithms take as input for a number of NLP tasks. We propose a neural network model, KeyVec, which learns document representations with the goal of preserving key semantics of the input text. It enables the learned low-dimensional vectors to retain the topics and important information from the documents that will flow to downstream tasks. Our empirical evaluations show the superior quality of KeyVec representations in two different document understanding tasks.
Tasks Word Embeddings
Published 2017-09-27
URL http://arxiv.org/abs/1709.09749v1
PDF http://arxiv.org/pdf/1709.09749v1.pdf
PWC https://paperswithcode.com/paper/keyvec-key-semantics-preserving-document
Repo
Framework

MOLTE: a Modular Optimal Learning Testing Environment

Title MOLTE: a Modular Optimal Learning Testing Environment
Authors Yingfei Wang, Warren Powell
Abstract We address the relative paucity of empirical testing of learning algorithms (of any type) by introducing a new public-domain, Modular, Optimal Learning Testing Environment (MOLTE) for Bayesian ranking and selection problems, stochastic bandits, or sequential experimental design problems. The Matlab-based simulator allows the comparison of a number of learning policies (represented as a series of .m modules) in the context of a wide range of problems (each represented in its own .m module), which makes it easy to add new algorithms and new test problems. State-of-the-art policies and various problem classes are provided in the package. The choice of problems and policies is guided through a spreadsheet-based interface. Different graphical metrics are included. MOLTE is designed to be compatible with parallel computing to scale up from local desktop to clusters and clouds. We offer MOLTE as an easy-to-use tool for the research community that will make it possible to perform much more comprehensive testing, spanning a broader selection of algorithms and test problems. We demonstrate the capabilities of MOLTE through a series of comparisons of policies on a starter library of test problems. We also address the problem of tuning and constructing priors, which has been largely overlooked in the optimal learning literature. We envision MOLTE as a modest spur to provide researchers an easy environment to study interesting questions involved in optimal learning.
Tasks
Published 2017-09-13
URL http://arxiv.org/abs/1709.04553v1
PDF http://arxiv.org/pdf/1709.04553v1.pdf
PWC https://paperswithcode.com/paper/molte-a-modular-optimal-learning-testing
Repo
Framework

Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora

Title Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora
Authors Jooyeon Kim, Dongwoo Kim, Alice Oh
Abstract Much of scientific progress stems from previously published findings, but searching through the vast sea of scientific publications is difficult. We often rely on metrics of scholarly authority to find the prominent authors, but these authority indices do not differentiate authority based on research topics. We present Latent Topical-Authority Indexing (LTAI) for jointly modeling the topics, citations, and topical authority in a corpus of academic papers. Compared to previous models, LTAI differs in two main aspects. First, it explicitly models the generative process of the citations, rather than treating the citations as given. Second, it models each author’s influence on citations of a paper based on the topics of the cited papers, as well as the citing papers. We fit LTAI to four academic corpora: CORA, Arxiv Physics, PNAS, and Citeseer. We compare the performance of LTAI against various baselines, ranging from latent Dirichlet allocation to more advanced models, including the author-link topic model and the dynamic author citation topic model. The results show that LTAI achieves improved accuracy over other similar models when predicting words, citations and authors of publications.
Tasks
Published 2017-06-02
URL http://arxiv.org/abs/1706.00593v1
PDF http://arxiv.org/pdf/1706.00593v1.pdf
PWC https://paperswithcode.com/paper/joint-modeling-of-topics-citations-and
Repo
Framework

Improving Small Object Proposals for Company Logo Detection

Title Improving Small Object Proposals for Company Logo Detection
Authors Christian Eggert, Dan Zecha, Stephan Brehm, Rainer Lienhart
Abstract Many modern approaches for object detection are two-staged pipelines. The first stage identifies regions of interest which are then classified in the second stage. Faster R-CNN is such an approach for object detection which combines both stages into a single pipeline. In this paper we apply Faster R-CNN to the task of company logo detection. Motivated by its weak performance on small object instances, we examine in detail both the proposal and the classification stage with respect to a wide range of object sizes. We investigate the influence of feature map resolution on the performance of those stages. Based on theoretical considerations, we introduce an improved scheme for generating anchor proposals and propose a modification to Faster R-CNN which leverages higher-resolution feature maps for small objects. We evaluate our approach on the FlickrLogos dataset improving the RPN performance from 0.52 to 0.71 (MABO) and the detection performance from 0.52 to 0.67 (mAP).
Tasks Object Detection
Published 2017-04-28
URL http://arxiv.org/abs/1704.08881v1
PDF http://arxiv.org/pdf/1704.08881v1.pdf
PWC https://paperswithcode.com/paper/improving-small-object-proposals-for-company
Repo
Framework
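
The MABO figure quoted in the abstract (mean average best overlap) can be made concrete with a small sketch. This is an illustration of the usual definition, not the authors' evaluation code: for each ground-truth box, take the best intersection-over-union (IoU) with any proposal, average per class, then average over classes. The function names, the `(x1, y1, x2, y2)` box format, and the single-image simplification are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_best_overlap(gt_boxes, proposals):
    """Mean, over ground-truth boxes, of the best IoU achieved by any proposal."""
    if not gt_boxes:
        return 0.0
    best = [max((iou(gt, p) for p in proposals), default=0.0) for gt in gt_boxes]
    return sum(best) / len(best)


def mean_average_best_overlap(gt_by_class, proposals):
    """Average the per-class ABO over all classes that have ground truth."""
    per_class = [average_best_overlap(boxes, proposals)
                 for boxes in gt_by_class.values() if boxes]
    return sum(per_class) / len(per_class) if per_class else 0.0


# Hypothetical example: two logo classes, two proposals on one image.
gt_by_class = {"logo_a": [(0, 0, 10, 10)], "logo_b": [(20, 20, 30, 30)]}
proposals = [(0, 0, 10, 10), (25, 20, 35, 30)]
mabo = mean_average_best_overlap(gt_by_class, proposals)
```

Small objects are penalized heavily by this metric: a proposal shifted by a few pixels loses a large share of its IoU with a small box, which is consistent with the abstract's focus on proposal quality for small instances.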

Nonnegative Matrix Factorization with Transform Learning

Title Nonnegative Matrix Factorization with Transform Learning
Authors Dylan Fagot, Cédric Févotte, Herwig Wendt
Abstract Traditional NMF-based signal decomposition relies on the factorization of spectral data, which is typically computed by means of a short-time frequency transform. In this paper we propose to relax the choice of a pre-fixed transform and learn a short-time orthogonal transform together with the factorization. To this end, we formulate a regularized optimization problem reminiscent of conventional NMF, yet with the transform as an additional unknown parameter, and design a novel block-descent algorithm that finds stationary points of this objective function. The proposed joint transform learning and factorization approach is tested in two audio signal processing experiments, illustrating its conceptual and practical benefits.
Tasks
Published 2017-05-11
URL http://arxiv.org/abs/1705.04193v2
PDF http://arxiv.org/pdf/1705.04193v2.pdf
PWC https://paperswithcode.com/paper/nonnegative-matrix-factorization-with
Repo
Framework
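
For readers unfamiliar with the "conventional NMF" the abstract builds on, the following is a minimal sketch of NMF with the standard multiplicative updates for the squared Frobenius error. It factorizes a fixed nonnegative matrix only; the transform learning that the paper adds on top is not shown. The helper names, the random initialization, and the toy matrix are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate nonnegative V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
    return w, h


# Toy nonnegative "spectrogram" (hypothetical data).
v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
w0, h0 = nmf(v, rank=2, n_iter=0)   # initial factors only
w, h = nmf(v, rank=2, n_iter=200)   # factors after the updates
```

Because the updates only multiply by nonnegative ratios, `W` and `H` stay nonnegative throughout, which is the property that makes the factors interpretable as additive parts.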

Kill Two Birds With One Stone: Boosting Both Object Detection Accuracy and Speed With adaptive Patch-of-Interest Composition

Title Kill Two Birds With One Stone: Boosting Both Object Detection Accuracy and Speed With adaptive Patch-of-Interest Composition
Authors Shihao Zhang, Weiyao Lin, Ping Lu, Weihua Li, Shuo Deng
Abstract Object detection is an important yet challenging task in video understanding and analysis, where one major challenge lies in the proper balance between two conflicting factors: detection accuracy and detection speed. In this paper, we propose a new adaptive patch-of-interest composition approach for boosting both the accuracy and speed of object detection. The proposed approach first extracts patches in a video frame which have the potential to include objects-of-interest. Then, an adaptive composition process is introduced to compose the extracted patches into an optimal number of sub-frames for object detection. With this process, we are able to maintain the resolution of the original frame during object detection (for guaranteeing the accuracy), while minimizing the number of inputs in detection (for boosting the speed). Experimental results on various datasets demonstrate the effectiveness of the proposed approach.
Tasks Object Detection, Video Understanding
Published 2017-08-12
URL http://arxiv.org/abs/1708.03795v3
PDF http://arxiv.org/pdf/1708.03795v3.pdf
PWC https://paperswithcode.com/paper/kill-two-birds-with-one-stone-boosting-both
Repo
Framework

Wavelet-based Reflection Symmetry Detection via Textural and Color Histograms

Title Wavelet-based Reflection Symmetry Detection via Textural and Color Histograms
Authors Mohamed Elawady, Christophe Ducottet, Olivier Alata, Cecile Barat, Philippe Colantoni
Abstract Symmetry is one of the significant visual properties of an image plane and helps identify geometrically balanced structures in real-world objects. Existing symmetry detection methods rely on descriptors of local image features and their neighborhood behavior, which yields incomplete symmetry-axis candidates when searching for mirror similarities on a global scale. In this paper, we propose a new reflection symmetry detection scheme, based on reliable edge-based feature extraction using Log-Gabor filters, plus an efficient voting scheme parameterized by their corresponding textural and color neighborhood information. Experimental evaluation on four single-case and three multiple-case symmetry detection datasets validates the superior performance of the proposed work in finding global symmetries inside an image.
Tasks
Published 2017-07-10
URL http://arxiv.org/abs/1707.02931v4
PDF http://arxiv.org/pdf/1707.02931v4.pdf
PWC https://paperswithcode.com/paper/wavelet-based-reflection-symmetry-detection
Repo
Framework

Critical Hyper-Parameters: No Random, No Cry

Title Critical Hyper-Parameters: No Random, No Cry
Authors Olivier Bousquet, Sylvain Gelly, Karol Kurach, Olivier Teytaud, Damien Vincent
Abstract The selection of hyper-parameters is critical in Deep Learning. Because of the long training time of complex models and the availability of compute resources in the cloud, “one-shot” optimization schemes - where the sets of hyper-parameters are selected in advance (e.g. on a grid or in a random manner) and the training is executed in parallel - are commonly used. Grid search is known to be sub-optimal, especially when only a few critical parameters matter, and random search has been suggested instead. Yet, random search can be “unlucky” and produce sets of values that leave some part of the domain unexplored. Quasi-random methods, such as Low Discrepancy Sequences (LDS), avoid these issues. We show that such methods have theoretical properties that make them appealing for performing hyperparameter search, and demonstrate that, when applied to the selection of hyperparameters of complex Deep Learning models (such as state-of-the-art LSTM language models and image classification models), they yield suitable hyperparameter values with far fewer runs than random search. We propose a particularly simple LDS method which can be used as a drop-in replacement for grid or random search in any Deep Learning pipeline, both as a fully one-shot hyperparameter search or as an initializer in iterative batch optimization.
Tasks Image Classification
Published 2017-06-10
URL http://arxiv.org/abs/1706.03200v1
PDF http://arxiv.org/pdf/1706.03200v1.pdf
PWC https://paperswithcode.com/paper/critical-hyper-parameters-no-random-no-cry
Repo
Framework
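
The quasi-random idea in the abstract can be illustrated with a Halton sequence, one common low-discrepancy sequence. This is a minimal sketch and not necessarily the LDS variant the authors propose; the function names, the two hyperparameters, and their ranges are assumptions chosen for illustration.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse: mirror the base-b digits of index around the point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(n_points, bases=(2, 3)):
    """First n_points of the Halton sequence in the unit cube, one prime base per dimension."""
    return [tuple(radical_inverse(i, b) for b in bases)
            for i in range(1, n_points + 1)]


def scale_log_uniform(u, low, high):
    """Map u in [0, 1) onto [low, high) on a logarithmic scale."""
    return low * (high / low) ** u


def sample_configs(n_points):
    """One-shot set of hyperparameter configurations (hypothetical search space)."""
    configs = []
    for u_lr, u_dropout in halton_points(n_points):
        configs.append({
            "learning_rate": scale_log_uniform(u_lr, 1e-5, 1e-1),
            "dropout": 0.5 * u_dropout,  # linear scale on [0, 0.5)
        })
    return configs


configs = sample_configs(8)
```

All configurations are known up front, so they can be trained in parallel exactly as with grid or random search; the difference is that successive points fill the gaps left by earlier ones instead of clustering by chance.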

Network Classification and Categorization

Title Network Classification and Categorization
Authors James P. Canning, Emma E. Ingram, Sammantha Nowak-Wolff, Adriana M. Ortiz, Nesreen K. Ahmed, Ryan A. Rossi, Karl R. B. Schmitt, Sucheta Soundarajan
Abstract To the best of our knowledge, this paper presents the first large-scale study that tests whether network categories (e.g., social networks vs. web graphs) are distinguishable from one another (using both categories of real-world networks and synthetic graphs). A classification accuracy of 94.2% was achieved using a random forest classifier with both real and synthetic networks. This work makes two important findings. First, real-world networks from various domains have distinct structural properties that allow us to predict with high accuracy the category of an arbitrary network. Second, classifying synthetic networks is trivial as our models can easily distinguish between synthetic graphs and the real-world networks they are supposed to model.
Tasks
Published 2017-09-13
URL http://arxiv.org/abs/1709.04481v1
PDF http://arxiv.org/pdf/1709.04481v1.pdf
PWC https://paperswithcode.com/paper/network-classification-and-categorization
Repo
Framework

Better Text Understanding Through Image-To-Text Transfer

Title Better Text Understanding Through Image-To-Text Transfer
Authors Karol Kurach, Sylvain Gelly, Michal Jastrzebski, Philip Haeusser, Olivier Teytaud, Damien Vincent, Olivier Bousquet
Abstract Generic text embeddings are successfully used in a variety of tasks. However, they are often learnt by capturing the co-occurrence structure from pure text corpora, resulting in limitations of their ability to generalize. In this paper, we explore models that incorporate visual information into the text representation. Based on comprehensive ablation studies, we propose a conceptually simple, yet well performing architecture. It outperforms previous multimodal approaches on a set of well established benchmarks. We also improve the state-of-the-art results for image-related text datasets, using orders of magnitude less data.
Tasks
Published 2017-05-23
URL http://arxiv.org/abs/1705.08386v2
PDF http://arxiv.org/pdf/1705.08386v2.pdf
PWC https://paperswithcode.com/paper/better-text-understanding-through-image-to
Repo
Framework

Enhanced Experience Replay Generation for Efficient Reinforcement Learning

Title Enhanced Experience Replay Generation for Efficient Reinforcement Learning
Authors Vincent Huang, Tobias Ley, Martha Vlachou-Konchylaki, Wenfeng Hu
Abstract Applying deep reinforcement learning (RL) on real systems suffers from slow data sampling. We propose an enhanced generative adversarial network (EGAN) to initialize an RL agent in order to achieve faster learning. The EGAN utilizes the relation between states and actions to enhance the quality of data samples generated by a GAN. Pre-training the agent with the EGAN shows a steeper learning curve with a 20% improvement of training time in the beginning of learning, compared to no pre-training, and an improvement compared to training with GAN by about 5% with smaller variations. For real time systems with sparse and slow data sampling the EGAN could be used to speed up the early phases of the training process.
Tasks
Published 2017-05-23
URL http://arxiv.org/abs/1705.08245v2
PDF http://arxiv.org/pdf/1705.08245v2.pdf
PWC https://paperswithcode.com/paper/enhanced-experience-replay-generation-for
Repo
Framework

Separation of Water and Fat Magnetic Resonance Imaging Signals Using Deep Learning with Convolutional Neural Networks

Title Separation of Water and Fat Magnetic Resonance Imaging Signals Using Deep Learning with Convolutional Neural Networks
Authors James W Goldfarb
Abstract Purpose: A new method for magnetic resonance (MR) imaging water-fat separation using a convolutional neural network (ConvNet) and deep learning (DL) is presented. Feasibility of the method with complex and magnitude images is demonstrated with a series of patient studies and accuracy of predicted quantitative values is analyzed. Methods: Water-fat separation of 1200 gradient-echo acquisitions from 90 imaging sessions (normal, acute and chronic myocardial infarction) was performed using a conventional model-based method with modeling of R2* and off-resonance and a multi-peak fat spectrum. A U-Net convolutional neural network for calculation of water-only, fat-only, R2* and off-resonance images was trained with 900 gradient-echo acquisitions. Multiple- and single-echo complex and magnitude input data algorithms were studied and compared to conventional extended echo modeling. Results: The U-Net ConvNet was easily trained and provided water-fat separation results visually comparable to conventional methods. Myocardial fat deposition in chronic myocardial infarction and intramyocardial hemorrhage in acute myocardial infarction were well visualized in the DL results. Predicted values for R2*, off-resonance, water and fat signal intensities were well correlated with conventional model-based water-fat separation (R2>=0.97, p<0.001). DL images had a 14% higher signal-to-noise ratio (p<0.001) when compared to the conventional method. Conclusion: Deep learning utilizing ConvNets is a feasible method for MR water-fat separation imaging with complex, magnitude and single-echo image data. A trained U-Net can be efficiently used for MR water-fat separation, providing results comparable to conventional model-based methods.
Tasks
Published 2017-10-27
URL http://arxiv.org/abs/1711.00107v1
PDF http://arxiv.org/pdf/1711.00107v1.pdf
PWC https://paperswithcode.com/paper/separation-of-water-and-fat-magnetic
Repo
Framework

Learning Structured Semantic Embeddings for Visual Recognition

Title Learning Structured Semantic Embeddings for Visual Recognition
Authors Dong Li, Hsin-Ying Lee, Jia-Bin Huang, Shengjin Wang, Ming-Hsuan Yang
Abstract Numerous embedding models have been recently explored to incorporate semantic knowledge into visual recognition. Existing methods typically focus on minimizing the distance between the corresponding images and texts in the embedding space but do not explicitly optimize the underlying structure. Our key observation is that modeling the pairwise image-image relationship improves the discrimination ability of the embedding model. In this paper, we propose the structured discriminative and difference constraints to learn visual-semantic embeddings. First, we exploit the discriminative constraints to capture the intra- and inter-class relationships of image embeddings. The discriminative constraints encourage separability for image instances of different classes. Second, we align the difference vector between a pair of image embeddings with that of the corresponding word embeddings. The difference constraints help regularize image embeddings to preserve the semantic relationships among word embeddings. Extensive evaluations demonstrate the effectiveness of the proposed structured embeddings for single-label classification, multi-label classification, and zero-shot recognition.
Tasks Multi-Label Classification, Word Embeddings, Zero-Shot Learning
Published 2017-06-05
URL http://arxiv.org/abs/1706.01237v1
PDF http://arxiv.org/pdf/1706.01237v1.pdf
PWC https://paperswithcode.com/paper/learning-structured-semantic-embeddings-for
Repo
Framework

Image Segmentation Algorithms Overview

Title Image Segmentation Algorithms Overview
Authors Song Yuheng, Yan Hao
Abstract Image segmentation is widely used in medical image processing, face recognition, pedestrian detection, etc. Current image segmentation techniques include region-based segmentation, edge detection segmentation, segmentation based on clustering, segmentation based on weakly-supervised learning in CNN, etc. This paper analyzes and summarizes these image segmentation algorithms and compares their advantages and disadvantages. Finally, we predict the development trend of image segmentation based on combinations of these algorithms.
Tasks Edge Detection, Face Recognition, Pedestrian Detection, Semantic Segmentation
Published 2017-07-07
URL http://arxiv.org/abs/1707.02051v1
PDF http://arxiv.org/pdf/1707.02051v1.pdf
PWC https://paperswithcode.com/paper/image-segmentation-algorithms-overview
Repo
Framework