October 17, 2019

2935 words 14 mins read

Paper Group ANR 899

When Does Hillclimbing Fail on Monotone Functions: An entropy compression argument. Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks. A Statistical Method for Object Counting. Adaptive Representation Selection in Contextual Bandit. Do deep nets really need weight decay and dropout?. Admissible Time Series Motif Discov …

When Does Hillclimbing Fail on Monotone Functions: An entropy compression argument


Title	When Does Hillclimbing Fail on Monotone Functions: An entropy compression argument
Authors	Johannes Lengler, Anders Martinsson, Angelika Steger
Abstract	Hillclimbing is an essential part of any optimization algorithm. An important benchmark for hillclimbing algorithms on pseudo-Boolean functions $f: {0,1}^n \to \mathbb{R}$ are (strictly) montone functions, on which a surprising number of hillclimbers fail to be efficient. For example, the $(1+1)$-Evolutionary Algorithm is a standard hillclimber which flips each bit independently with probability $c/n$ in each round. Perhaps surprisingly, this algorithm shows a phase transition: it optimizes any monotone pseudo-boolean function in quasilinear time if $c<1$, but there are monotone functions for which the algorithm needs exponential time if $c>2.2$. But so far it was unclear whether the threshold is at $c=1$. In this paper we show how Moser’s entropy compression argument can be adapted to this situation, that is, we show that a long runtime would allow us to encode the random steps of the algorithm with less bits than their entropy. Thus there exists a $c_0 > 1$ such that for all $0<c\le c_0$ the $(1+1)$-Evolutionary Algorithm with rate $c/n$ finds the optimum in $O(n \log^2 n)$ steps in expectation.
Tasks
Published	2018-08-03
URL	http://arxiv.org/abs/1808.01137v1
PDF	http://arxiv.org/pdf/1808.01137v1.pdf
PWC	https://paperswithcode.com/paper/when-does-hillclimbing-fail-on-monotone
Repo
Framework

Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks


Title	Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks
Authors	Mengdi Wang, Qing Zhang, Jun Yang, Xiaoyuan Cui, Wei Lin
Abstract	In this work, we propose a graph-adaptive pruning (GAP) method for efficient inference of convolutional neural networks (CNNs). In this method, the network is viewed as a computational graph, in which the vertices denote the computation nodes and edges represent the information flow. Through topology analysis, GAP is capable of adapting to different network structures, especially the widely used cross connections and multi-path data flow in recent novel convolutional models. The models can be adaptively pruned at vertex-level as well as edge-level without any post-processing, thus GAP can directly get practical model compression and inference speed-up. Moreover, it does not need any customized computation library or hardware support. Finetuning is conducted after pruning to restore the model performance. In the finetuning step, we adopt a self-taught knowledge distillation (KD) strategy by utilizing information from the original model, through which, the performance of the optimized model can be sufficiently improved, without introduction of any other teacher model. Experimental results show the proposed GAP can achieve promising result to make inference more efficient, e.g., for ResNeXt-29 on CIFAR10, it can get 13X model compression and 4.3X practical speed-up with marginal loss of accuracy.
Tasks	Model Compression
Published	2018-11-21
URL	http://arxiv.org/abs/1811.08589v1
PDF	http://arxiv.org/pdf/1811.08589v1.pdf
PWC	https://paperswithcode.com/paper/graph-adaptive-pruning-for-efficient
Repo
Framework

A Statistical Method for Object Counting


Title	A Statistical Method for Object Counting
Authors	Jans Glagolevs, Karlis Freivalds
Abstract	In this paper we present a new object counting method that is intended for counting similarly sized and mostly round objects. Unlike many other algorithms of the same purpose, the proposed method does not rely on identifying every object, it uses statistical data obtained from the image instead. The method is evaluated on images with human bone cells, oranges and pills achieving good accuracy. Its strengths are ability to deal with touching and partly overlapping objects, ability to work with different kinds of objects without prior configuration and good performance.
Tasks	Object Counting
Published	2018-07-22
URL	http://arxiv.org/abs/1807.08335v1
PDF	http://arxiv.org/pdf/1807.08335v1.pdf
PWC	https://paperswithcode.com/paper/a-statistical-method-for-object-counting
Repo
Framework

Adaptive Representation Selection in Contextual Bandit


Title	Adaptive Representation Selection in Contextual Bandit
Authors	Baihan Lin, Guillermo Cecchi, Djallel Bouneffouf, Irina Rish
Abstract	We consider an extension of the contextual bandit setting, motivated by several practical applications, where an unlabeled history of contexts can become available for pre-training before the online decision-making begins. We propose an approach for improving the performance of contextual bandit in such setting, via adaptive, dynamic representation learning, which combines offline pre-training on unlabeled history of contexts with online selection and modification of embedding functions. Our experiments on a variety of datasets and in different nonstationary environments demonstrate clear advantages of our approach over the standard contextual bandit.
Tasks	Decision Making, Representation Learning
Published	2018-02-03
URL	http://arxiv.org/abs/1802.00981v2
PDF	http://arxiv.org/pdf/1802.00981v2.pdf
PWC	https://paperswithcode.com/paper/adaptive-representation-selection-in
Repo
Framework

Do deep nets really need weight decay and dropout?


Title	Do deep nets really need weight decay and dropout?
Authors	Alex Hernández-García, Peter König
Abstract	The impressive success of modern deep neural networks on computer vision tasks has been achieved through models of very large capacity compared to the number of available training examples. This overparameterization is often said to be controlled with the help of different regularization techniques, mainly weight decay and dropout. However, since these techniques reduce the effective capacity of the model, typically even deeper and wider architectures are required to compensate for the reduced capacity. Therefore, there seems to be a waste of capacity in this practice. In this paper we build upon recent research that suggests that explicit regularization may not be as important as widely believed and carry out an ablation study that concludes that weight decay and dropout may not be necessary for object recognition if enough data augmentation is introduced.
Tasks	Data Augmentation, Object Recognition
Published	2018-02-20
URL	http://arxiv.org/abs/1802.07042v3
PDF	http://arxiv.org/pdf/1802.07042v3.pdf
PWC	https://paperswithcode.com/paper/do-deep-nets-really-need-weight-decay-and
Repo
Framework

Admissible Time Series Motif Discovery with Missing Data


Title	Admissible Time Series Motif Discovery with Missing Data
Authors	Yan Zhu, Abdullah Mueen, Eamonn Keogh
Abstract	The discovery of time series motifs has emerged as one of the most useful primitives in time series data mining. Researchers have shown its utility for exploratory data mining, summarization, visualization, segmentation, classification, clustering, and rule discovery. Although there has been more than a decade of extensive research, there is still no technique to allow the discovery of time series motifs in the presence of missing data, despite the well-documented ubiquity of missing data in scientific, industrial, and medical datasets. In this work, we introduce a technique for motif discovery in the presence of missing data. We formally prove that our method is admissible, producing no false negatives. We also show that our method can piggy-back off the fastest known motif discovery method with a small constant factor time/space overhead. We will demonstrate our approach on diverse datasets with varying amounts of missing data
Tasks	Time Series
Published	2018-02-15
URL	http://arxiv.org/abs/1802.05472v1
PDF	http://arxiv.org/pdf/1802.05472v1.pdf
PWC	https://paperswithcode.com/paper/admissible-time-series-motif-discovery-with
Repo
Framework

A Universal Sampling Method for Reconstructing Signals with Simple Fourier Transforms


Title	A Universal Sampling Method for Reconstructing Signals with Simple Fourier Transforms
Authors	Haim Avron, Michael Kapralov, Cameron Musco, Christopher Musco, Ameya Velingker, Amir Zandieh
Abstract	Reconstructing continuous signals from a small number of discrete samples is a fundamental problem across science and engineering. In practice, we are often interested in signals with ‘simple’ Fourier structure, such as bandlimited, multiband, and Fourier sparse signals. More broadly, any prior knowledge about a signal’s Fourier power spectrum can constrain its complexity. Intuitively, signals with more highly constrained Fourier structure require fewer samples to reconstruct. We formalize this intuition by showing that, roughly, a continuous signal from a given class can be approximately reconstructed using a number of samples proportional to the statistical dimension of the allowed power spectrum of that class. Further, in nearly all settings, this natural measure tightly characterizes the sample complexity of signal reconstruction. Surprisingly, we also show that, up to logarithmic factors, a universal non-uniform sampling strategy can achieve this optimal complexity for any class of signals. We present a simple and efficient algorithm for recovering a signal from the samples taken. For bandlimited and sparse signals, our method matches the state-of-the-art. At the same time, it gives the first computationally and sample efficient solution to a broad range of problems, including multiband signal reconstruction and kriging and Gaussian process regression tasks in one dimension. Our work is based on a novel connection between randomized linear algebra and signal reconstruction with constrained Fourier structure. We extend tools based on statistical leverage score sampling and column-based matrix reconstruction to the approximation of continuous linear operators that arise in signal reconstruction. We believe that these extensions are of independent interest and serve as a foundation for tackling a broad range of continuous time problems using randomized methods.
Tasks
Published	2018-12-20
URL	http://arxiv.org/abs/1812.08723v1
PDF	http://arxiv.org/pdf/1812.08723v1.pdf
PWC	https://paperswithcode.com/paper/a-universal-sampling-method-for
Repo
Framework


Title	Discriminative Supervised Hashing for Cross-Modal similarity Search
Authors	Jun Yu, Xiao-Jun Wu, Josef Kittler
Abstract	With the advantage of low storage cost and high retrieval efficiency, hashing techniques have recently been an emerging topic in cross-modal similarity search. As multiple modal data reflect similar semantic content, many researches aim at learning unified binary codes. However, discriminative hashing features learned by these methods are not adequate. This results in lower accuracy and robustness. We propose a novel hashing learning framework which jointly performs classifier learning, subspace learning and matrix factorization to preserve class-specific semantic content, termed Discriminative Supervised Hashing (DSH), to learn the discrimative unified binary codes for multi-modal data. Besides, reducing the loss of information and preserving the non-linear structure of data, DSH non-linearly projects different modalities into the common space in which the similarity among heterogeneous data points can be measured. Extensive experiments conducted on the three publicly available datasets demonstrate that the framework proposed in this paper outperforms several state-of -the-art methods.
Tasks	Cross-Modal Retrieval
Published	2018-12-06
URL	http://arxiv.org/abs/1812.07660v3
PDF	http://arxiv.org/pdf/1812.07660v3.pdf
PWC	https://paperswithcode.com/paper/discriminative-supervised-hashing-for-cross
Repo
Framework

Quantum distance-based classifier with constant size memory, distributed knowledge and state recycling


Title	Quantum distance-based classifier with constant size memory, distributed knowledge and state recycling
Authors	Przemysław Sadowski
Abstract	In this work we examine recently proposed distance-based classification method designed for near-term quantum processing units with limited resources. We further study possibilities to reduce the quantum resources without any efficiency decrease. We show that only a part of the information undergoes coherent evolution and this fact allows us to introduce an algorithm with significantly reduced quantum memory size. Additionally, considering only partial information at a time, we propose a classification protocol with information distributed among a number of agents. Finally, we show that the information evolution during a measurement can lead to a better solution and that accuracy of the algorithm can be improved by harnessing the state after the final measurement.
Tasks
Published	2018-03-02
URL	http://arxiv.org/abs/1803.00853v1
PDF	http://arxiv.org/pdf/1803.00853v1.pdf
PWC	https://paperswithcode.com/paper/quantum-distance-based-classifier-with
Repo
Framework

Adversarial Feature Learning of Online Monitoring Data for Operational Risk Assessment in Distribution Networks


Title	Adversarial Feature Learning of Online Monitoring Data for Operational Risk Assessment in Distribution Networks
Authors	Xin Shi, Robert Qiu, Tiebin Mi, Xing He, Yongli Zhu
Abstract	With the deployment of online monitoring systems in distribution networks, massive amounts of data collected through them contains rich information on the operating states of the networks. By leveraging the data, an unsupervised approach based on bidirectional generative adversarial networks (BiGANs) is proposed for operational risk assessment in distribution networks in this paper. The approach includes two stages: (1) adversarial feature learning. The most representative features are extracted from the online monitoring data and a statistical index $\mathcal{N}{\phi}$ is calculated for the features, during which we make no assumptions or simplifications on the real data. (2) operational risk assessment. The confidence level $1-\alpha$ for the population mean of the standardized $\mathcal{N}{\phi}$ is combined with the operational risk levels which are divided into emergency, high risk, preventive and normal, and the p value for each data point is calculated and compared with $\frac{\alpha}{2}$ to determine the risk levels. The proposed approach is capable of discovering the latent structure of the real data and providing more accurate assessment result. The synthetic data is employed to illustrate the selection of parameters involved in the proposed approach. Case studies on the real-world online monitoring data validate the effectiveness and advantages of the proposed approach in risk assessment.
Tasks
Published	2018-08-27
URL	https://arxiv.org/abs/1808.09050v2
PDF	https://arxiv.org/pdf/1808.09050v2.pdf
PWC	https://paperswithcode.com/paper/adversarial-feature-learning-of-online
Repo
Framework

Textually Guided Ranking Network for Attentional Image Retweet Modeling


Title	Textually Guided Ranking Network for Attentional Image Retweet Modeling
Authors	Zhou Zhao, Hanbing Zhan, Lingtao Meng, Jun Xiao, Jun Yu, Min Yang, Fei Wu, Deng Cai
Abstract	Retweet prediction is a challenging problem in social media sites (SMS). In this paper, we study the problem of image retweet prediction in social media, which predicts the image sharing behavior that the user reposts the image tweets from their followees. Unlike previous studies, we learn user preference ranking model from their past retweeted image tweets in SMS. We first propose heterogeneous image retweet modeling network (IRM) that exploits users’ past retweeted image tweets with associated contexts, their following relations in SMS and preference of their followees. We then develop a novel attentional multi-faceted ranking network learning framework with textually guided multi-modal neural networks for the proposed heterogenous IRM network to learn the joint image tweet representations and user preference representations for prediction task. The extensive experiments on a large-scale dataset from Twitter site shows that our method achieves better performance than other state-of-the-art solutions to the problem.
Tasks
Published	2018-10-24
URL	http://arxiv.org/abs/1810.10226v1
PDF	http://arxiv.org/pdf/1810.10226v1.pdf
PWC	https://paperswithcode.com/paper/textually-guided-ranking-network-for
Repo
Framework

Diagnostic Visualization for Deep Neural Networks Using Stochastic Gradient Langevin Dynamics


Title	Diagnostic Visualization for Deep Neural Networks Using Stochastic Gradient Langevin Dynamics
Authors	Biye Jiang, David M. Chan, Tianhao Zhang, John F. Canny
Abstract	The internal states of most deep neural networks are difficult to interpret, which makes diagnosis and debugging during training challenging. Activation maximization methods are widely used, but lead to multiple optima and are hard to interpret (appear noise-like) for complex neurons. Image-based methods use maximally-activating image regions which are easier to interpret, but do not provide pixel-level insight into why the neuron responds to them. In this work we introduce an MCMC method: Langevin Dynamics Activation Maximization (LDAM), which is designed for diagnostic visualization. LDAM provides two affordances in combination: the ability to explore the set of maximally activating pre-images, and the ability to trade-off interpretability and pixel-level accuracy using a GAN-style discriminator as a regularizer. We present case studies on MNIST, CIFAR and ImageNet datasets exploring these trade-offs. Finally we show that diagnostic visualization using LDAM leads to a novel insight into the parameter averaging method for deep net training.
Tasks
Published	2018-12-11
URL	http://arxiv.org/abs/1812.04604v1
PDF	http://arxiv.org/pdf/1812.04604v1.pdf
PWC	https://paperswithcode.com/paper/diagnostic-visualization-for-deep-neural
Repo
Framework

Bayesian active learning for choice models with deep Gaussian processes


Title	Bayesian active learning for choice models with deep Gaussian processes
Authors	Jie Yang, Diego Klabjan
Abstract	In this paper, we propose an active learning algorithm and models which can gradually learn individual’s preference through pairwise comparisons. The active learning scheme aims at finding individual’s most preferred choice with minimized number of pairwise comparisons. The pairwise comparisons are encoded into probabilistic models based on assumptions of choice models and deep Gaussian processes. The next-to-compare decision is determined by a novel acquisition function. We benchmark the proposed algorithm and models using functions with multiple local optima and one public airline itinerary dataset. The experiments indicate the effectiveness of our active learning algorithm and models.
Tasks	Active Learning, Gaussian Processes
Published	2018-05-04
URL	http://arxiv.org/abs/1805.01867v1
PDF	http://arxiv.org/pdf/1805.01867v1.pdf
PWC	https://paperswithcode.com/paper/bayesian-active-learning-for-choice-models
Repo
Framework

Saliency guided deep network for weakly-supervised image segmentation


Title	Saliency guided deep network for weakly-supervised image segmentation
Authors	Fengdong Sun, Wenhui Li
Abstract	Weakly-supervised image segmentation is an important task in computer vision. A key problem is how to obtain high quality objects location from image-level category. Classification activation mapping is a common method which can be used to generate high-precise object location cues. However these location cues are generally very sparse and small such that they can not provide effective information for image segmentation. In this paper, we propose a saliency guided image segmentation network to resolve this problem. We employ a self-attention saliency method to generate subtle saliency maps, and render the location cues grow as seeds by seeded region growing method to expand pixel-level labels extent. In the process of seeds growing, we use the saliency values to weight the similarity between pixels to control the growing. Therefore saliency information could help generate discriminative object regions, and the effects of wrong salient pixels can be suppressed efficiently. Experimental results on a common segmentation dataset PASCAL VOC2012 demonstrate the effectiveness of our method.
Tasks	Semantic Segmentation
Published	2018-10-19
URL	http://arxiv.org/abs/1810.08378v1
PDF	http://arxiv.org/pdf/1810.08378v1.pdf
PWC	https://paperswithcode.com/paper/saliency-guided-deep-network-for-weakly
Repo
Framework


Title	Y^2Seq2Seq: Cross-Modal Representation Learning for 3D Shape and Text by Joint Reconstruction and Prediction of View and Word Sequences
Authors	Zhizhong Han, Mingyang Shang, Xiyang Wang, Yu-Shen Liu, Matthias Zwicker
Abstract	A recent method employs 3D voxels to represent 3D shapes, but this limits the approach to low resolutions due to the computational cost caused by the cubic complexity of 3D voxels. Hence the method suffers from a lack of detailed geometry. To resolve this issue, we propose Y^2Seq2Seq, a view-based model, to learn cross-modal representations by joint reconstruction and prediction of view and word sequences. Specifically, the network architecture of Y^2Seq2Seq bridges the semantic meaning embedded in the two modalities by two coupled `Y’ like sequence-to-sequence (Seq2Seq) structures. In addition, our novel hierarchical constraints further increase the discriminability of the cross-modal representations by employing more detailed discriminative information. Experimental results on cross-modal retrieval and 3D shape captioning show that Y^2Seq2Seq outperforms the state-of-the-art methods. \|
Tasks	3D Shape Representation, Cross-Modal Retrieval, Representation Learning
Published	2018-11-07
URL	http://arxiv.org/abs/1811.02745v1
PDF	http://arxiv.org/pdf/1811.02745v1.pdf
PWC	https://paperswithcode.com/paper/y2seq2seq-cross-modal-representation-learning
Repo
Framework