Paper Group ANR 899
When Does Hillclimbing Fail on Monotone Functions: An entropy compression argument. Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks. A Statistical Method for Object Counting. Adaptive Representation Selection in Contextual Bandit. Do deep nets really need weight decay and dropout?. Admissible Time Series Motif Discov …
When Does Hillclimbing Fail on Monotone Functions: An entropy compression argument
Title | When Does Hillclimbing Fail on Monotone Functions: An entropy compression argument |
Authors | Johannes Lengler, Anders Martinsson, Angelika Steger |
Abstract | Hillclimbing is an essential part of any optimization algorithm. An important benchmark for hillclimbing algorithms on pseudo-Boolean functions $f: {0,1}^n \to \mathbb{R}$ are (strictly) montone functions, on which a surprising number of hillclimbers fail to be efficient. For example, the $(1+1)$-Evolutionary Algorithm is a standard hillclimber which flips each bit independently with probability $c/n$ in each round. Perhaps surprisingly, this algorithm shows a phase transition: it optimizes any monotone pseudo-boolean function in quasilinear time if $c<1$, but there are monotone functions for which the algorithm needs exponential time if $c>2.2$. But so far it was unclear whether the threshold is at $c=1$. In this paper we show how Moser’s entropy compression argument can be adapted to this situation, that is, we show that a long runtime would allow us to encode the random steps of the algorithm with less bits than their entropy. Thus there exists a $c_0 > 1$ such that for all $0<c\le c_0$ the $(1+1)$-Evolutionary Algorithm with rate $c/n$ finds the optimum in $O(n \log^2 n)$ steps in expectation. |
Tasks | |
Published | 2018-08-03 |
URL | http://arxiv.org/abs/1808.01137v1 |
http://arxiv.org/pdf/1808.01137v1.pdf | |
PWC | https://paperswithcode.com/paper/when-does-hillclimbing-fail-on-monotone |
Repo | |
Framework | |
Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks
Title | Graph-Adaptive Pruning for Efficient Inference of Convolutional Neural Networks |
Authors | Mengdi Wang, Qing Zhang, Jun Yang, Xiaoyuan Cui, Wei Lin |
Abstract | In this work, we propose a graph-adaptive pruning (GAP) method for efficient inference of convolutional neural networks (CNNs). In this method, the network is viewed as a computational graph, in which the vertices denote the computation nodes and edges represent the information flow. Through topology analysis, GAP is capable of adapting to different network structures, especially the widely used cross connections and multi-path data flow in recent novel convolutional models. The models can be adaptively pruned at vertex-level as well as edge-level without any post-processing, thus GAP can directly get practical model compression and inference speed-up. Moreover, it does not need any customized computation library or hardware support. Finetuning is conducted after pruning to restore the model performance. In the finetuning step, we adopt a self-taught knowledge distillation (KD) strategy by utilizing information from the original model, through which, the performance of the optimized model can be sufficiently improved, without introduction of any other teacher model. Experimental results show the proposed GAP can achieve promising result to make inference more efficient, e.g., for ResNeXt-29 on CIFAR10, it can get 13X model compression and 4.3X practical speed-up with marginal loss of accuracy. |
Tasks | Model Compression |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08589v1 |
http://arxiv.org/pdf/1811.08589v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-adaptive-pruning-for-efficient |
Repo | |
Framework | |
A Statistical Method for Object Counting
Title | A Statistical Method for Object Counting |
Authors | Jans Glagolevs, Karlis Freivalds |
Abstract | In this paper we present a new object counting method that is intended for counting similarly sized and mostly round objects. Unlike many other algorithms of the same purpose, the proposed method does not rely on identifying every object, it uses statistical data obtained from the image instead. The method is evaluated on images with human bone cells, oranges and pills achieving good accuracy. Its strengths are ability to deal with touching and partly overlapping objects, ability to work with different kinds of objects without prior configuration and good performance. |
Tasks | Object Counting |
Published | 2018-07-22 |
URL | http://arxiv.org/abs/1807.08335v1 |
http://arxiv.org/pdf/1807.08335v1.pdf | |
PWC | https://paperswithcode.com/paper/a-statistical-method-for-object-counting |
Repo | |
Framework | |
Adaptive Representation Selection in Contextual Bandit
Title | Adaptive Representation Selection in Contextual Bandit |
Authors | Baihan Lin, Guillermo Cecchi, Djallel Bouneffouf, Irina Rish |
Abstract | We consider an extension of the contextual bandit setting, motivated by several practical applications, where an unlabeled history of contexts can become available for pre-training before the online decision-making begins. We propose an approach for improving the performance of contextual bandit in such setting, via adaptive, dynamic representation learning, which combines offline pre-training on unlabeled history of contexts with online selection and modification of embedding functions. Our experiments on a variety of datasets and in different nonstationary environments demonstrate clear advantages of our approach over the standard contextual bandit. |
Tasks | Decision Making, Representation Learning |
Published | 2018-02-03 |
URL | http://arxiv.org/abs/1802.00981v2 |
http://arxiv.org/pdf/1802.00981v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-representation-selection-in |
Repo | |
Framework | |
Do deep nets really need weight decay and dropout?
Title | Do deep nets really need weight decay and dropout? |
Authors | Alex Hernández-García, Peter König |
Abstract | The impressive success of modern deep neural networks on computer vision tasks has been achieved through models of very large capacity compared to the number of available training examples. This overparameterization is often said to be controlled with the help of different regularization techniques, mainly weight decay and dropout. However, since these techniques reduce the effective capacity of the model, typically even deeper and wider architectures are required to compensate for the reduced capacity. Therefore, there seems to be a waste of capacity in this practice. In this paper we build upon recent research that suggests that explicit regularization may not be as important as widely believed and carry out an ablation study that concludes that weight decay and dropout may not be necessary for object recognition if enough data augmentation is introduced. |
Tasks | Data Augmentation, Object Recognition |
Published | 2018-02-20 |
URL | http://arxiv.org/abs/1802.07042v3 |
http://arxiv.org/pdf/1802.07042v3.pdf | |
PWC | https://paperswithcode.com/paper/do-deep-nets-really-need-weight-decay-and |
Repo | |
Framework | |
Admissible Time Series Motif Discovery with Missing Data
Title | Admissible Time Series Motif Discovery with Missing Data |
Authors | Yan Zhu, Abdullah Mueen, Eamonn Keogh |
Abstract | The discovery of time series motifs has emerged as one of the most useful primitives in time series data mining. Researchers have shown its utility for exploratory data mining, summarization, visualization, segmentation, classification, clustering, and rule discovery. Although there has been more than a decade of extensive research, there is still no technique to allow the discovery of time series motifs in the presence of missing data, despite the well-documented ubiquity of missing data in scientific, industrial, and medical datasets. In this work, we introduce a technique for motif discovery in the presence of missing data. We formally prove that our method is admissible, producing no false negatives. We also show that our method can piggy-back off the fastest known motif discovery method with a small constant factor time/space overhead. We will demonstrate our approach on diverse datasets with varying amounts of missing data |
Tasks | Time Series |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1802.05472v1 |
http://arxiv.org/pdf/1802.05472v1.pdf | |
PWC | https://paperswithcode.com/paper/admissible-time-series-motif-discovery-with |
Repo | |
Framework | |
A Universal Sampling Method for Reconstructing Signals with Simple Fourier Transforms
Title | A Universal Sampling Method for Reconstructing Signals with Simple Fourier Transforms |
Authors | Haim Avron, Michael Kapralov, Cameron Musco, Christopher Musco, Ameya Velingker, Amir Zandieh |
Abstract | Reconstructing continuous signals from a small number of discrete samples is a fundamental problem across science and engineering. In practice, we are often interested in signals with ‘simple’ Fourier structure, such as bandlimited, multiband, and Fourier sparse signals. More broadly, any prior knowledge about a signal’s Fourier power spectrum can constrain its complexity. Intuitively, signals with more highly constrained Fourier structure require fewer samples to reconstruct. We formalize this intuition by showing that, roughly, a continuous signal from a given class can be approximately reconstructed using a number of samples proportional to the statistical dimension of the allowed power spectrum of that class. Further, in nearly all settings, this natural measure tightly characterizes the sample complexity of signal reconstruction. Surprisingly, we also show that, up to logarithmic factors, a universal non-uniform sampling strategy can achieve this optimal complexity for any class of signals. We present a simple and efficient algorithm for recovering a signal from the samples taken. For bandlimited and sparse signals, our method matches the state-of-the-art. At the same time, it gives the first computationally and sample efficient solution to a broad range of problems, including multiband signal reconstruction and kriging and Gaussian process regression tasks in one dimension. Our work is based on a novel connection between randomized linear algebra and signal reconstruction with constrained Fourier structure. We extend tools based on statistical leverage score sampling and column-based matrix reconstruction to the approximation of continuous linear operators that arise in signal reconstruction. We believe that these extensions are of independent interest and serve as a foundation for tackling a broad range of continuous time problems using randomized methods. |
Tasks | |
Published | 2018-12-20 |
URL | http://arxiv.org/abs/1812.08723v1 |
http://arxiv.org/pdf/1812.08723v1.pdf | |
PWC | https://paperswithcode.com/paper/a-universal-sampling-method-for |
Repo | |
Framework | |
Discriminative Supervised Hashing for Cross-Modal similarity Search
Title | Discriminative Supervised Hashing for Cross-Modal similarity Search |
Authors | Jun Yu, Xiao-Jun Wu, Josef Kittler |
Abstract | With the advantage of low storage cost and high retrieval efficiency, hashing techniques have recently been an emerging topic in cross-modal similarity search. As multiple modal data reflect similar semantic content, many researches aim at learning unified binary codes. However, discriminative hashing features learned by these methods are not adequate. This results in lower accuracy and robustness. We propose a novel hashing learning framework which jointly performs classifier learning, subspace learning and matrix factorization to preserve class-specific semantic content, termed Discriminative Supervised Hashing (DSH), to learn the discrimative unified binary codes for multi-modal data. Besides, reducing the loss of information and preserving the non-linear structure of data, DSH non-linearly projects different modalities into the common space in which the similarity among heterogeneous data points can be measured. Extensive experiments conducted on the three publicly available datasets demonstrate that the framework proposed in this paper outperforms several state-of -the-art methods. |
Tasks | Cross-Modal Retrieval |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.07660v3 |
http://arxiv.org/pdf/1812.07660v3.pdf | |
PWC | https://paperswithcode.com/paper/discriminative-supervised-hashing-for-cross |
Repo | |
Framework | |
Quantum distance-based classifier with constant size memory, distributed knowledge and state recycling
Title | Quantum distance-based classifier with constant size memory, distributed knowledge and state recycling |
Authors | Przemysław Sadowski |
Abstract | In this work we examine recently proposed distance-based classification method designed for near-term quantum processing units with limited resources. We further study possibilities to reduce the quantum resources without any efficiency decrease. We show that only a part of the information undergoes coherent evolution and this fact allows us to introduce an algorithm with significantly reduced quantum memory size. Additionally, considering only partial information at a time, we propose a classification protocol with information distributed among a number of agents. Finally, we show that the information evolution during a measurement can lead to a better solution and that accuracy of the algorithm can be improved by harnessing the state after the final measurement. |
Tasks | |
Published | 2018-03-02 |
URL | http://arxiv.org/abs/1803.00853v1 |
http://arxiv.org/pdf/1803.00853v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-distance-based-classifier-with |
Repo | |
Framework | |
Adversarial Feature Learning of Online Monitoring Data for Operational Risk Assessment in Distribution Networks
Title | Adversarial Feature Learning of Online Monitoring Data for Operational Risk Assessment in Distribution Networks |
Authors | Xin Shi, Robert Qiu, Tiebin Mi, Xing He, Yongli Zhu |
Abstract | With the deployment of online monitoring systems in distribution networks, massive amounts of data collected through them contains rich information on the operating states of the networks. By leveraging the data, an unsupervised approach based on bidirectional generative adversarial networks (BiGANs) is proposed for operational risk assessment in distribution networks in this paper. The approach includes two stages: (1) adversarial feature learning. The most representative features are extracted from the online monitoring data and a statistical index $\mathcal{N}{\phi}$ is calculated for the features, during which we make no assumptions or simplifications on the real data. (2) operational risk assessment. The confidence level $1-\alpha$ for the population mean of the standardized $\mathcal{N}{\phi}$ is combined with the operational risk levels which are divided into emergency, high risk, preventive and normal, and the p value for each data point is calculated and compared with $\frac{\alpha}{2}$ to determine the risk levels. The proposed approach is capable of discovering the latent structure of the real data and providing more accurate assessment result. The synthetic data is employed to illustrate the selection of parameters involved in the proposed approach. Case studies on the real-world online monitoring data validate the effectiveness and advantages of the proposed approach in risk assessment. |
Tasks | |
Published | 2018-08-27 |
URL | https://arxiv.org/abs/1808.09050v2 |
https://arxiv.org/pdf/1808.09050v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-feature-learning-of-online |
Repo | |
Framework | |
Textually Guided Ranking Network for Attentional Image Retweet Modeling
Title | Textually Guided Ranking Network for Attentional Image Retweet Modeling |
Authors | Zhou Zhao, Hanbing Zhan, Lingtao Meng, Jun Xiao, Jun Yu, Min Yang, Fei Wu, Deng Cai |
Abstract | Retweet prediction is a challenging problem in social media sites (SMS). In this paper, we study the problem of image retweet prediction in social media, which predicts the image sharing behavior that the user reposts the image tweets from their followees. Unlike previous studies, we learn user preference ranking model from their past retweeted image tweets in SMS. We first propose heterogeneous image retweet modeling network (IRM) that exploits users’ past retweeted image tweets with associated contexts, their following relations in SMS and preference of their followees. We then develop a novel attentional multi-faceted ranking network learning framework with textually guided multi-modal neural networks for the proposed heterogenous IRM network to learn the joint image tweet representations and user preference representations for prediction task. The extensive experiments on a large-scale dataset from Twitter site shows that our method achieves better performance than other state-of-the-art solutions to the problem. |
Tasks | |
Published | 2018-10-24 |
URL | http://arxiv.org/abs/1810.10226v1 |
http://arxiv.org/pdf/1810.10226v1.pdf | |
PWC | https://paperswithcode.com/paper/textually-guided-ranking-network-for |
Repo | |
Framework | |
Diagnostic Visualization for Deep Neural Networks Using Stochastic Gradient Langevin Dynamics
Title | Diagnostic Visualization for Deep Neural Networks Using Stochastic Gradient Langevin Dynamics |
Authors | Biye Jiang, David M. Chan, Tianhao Zhang, John F. Canny |
Abstract | The internal states of most deep neural networks are difficult to interpret, which makes diagnosis and debugging during training challenging. Activation maximization methods are widely used, but lead to multiple optima and are hard to interpret (appear noise-like) for complex neurons. Image-based methods use maximally-activating image regions which are easier to interpret, but do not provide pixel-level insight into why the neuron responds to them. In this work we introduce an MCMC method: Langevin Dynamics Activation Maximization (LDAM), which is designed for diagnostic visualization. LDAM provides two affordances in combination: the ability to explore the set of maximally activating pre-images, and the ability to trade-off interpretability and pixel-level accuracy using a GAN-style discriminator as a regularizer. We present case studies on MNIST, CIFAR and ImageNet datasets exploring these trade-offs. Finally we show that diagnostic visualization using LDAM leads to a novel insight into the parameter averaging method for deep net training. |
Tasks | |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04604v1 |
http://arxiv.org/pdf/1812.04604v1.pdf | |
PWC | https://paperswithcode.com/paper/diagnostic-visualization-for-deep-neural |
Repo | |
Framework | |
Bayesian active learning for choice models with deep Gaussian processes
Title | Bayesian active learning for choice models with deep Gaussian processes |
Authors | Jie Yang, Diego Klabjan |
Abstract | In this paper, we propose an active learning algorithm and models which can gradually learn individual’s preference through pairwise comparisons. The active learning scheme aims at finding individual’s most preferred choice with minimized number of pairwise comparisons. The pairwise comparisons are encoded into probabilistic models based on assumptions of choice models and deep Gaussian processes. The next-to-compare decision is determined by a novel acquisition function. We benchmark the proposed algorithm and models using functions with multiple local optima and one public airline itinerary dataset. The experiments indicate the effectiveness of our active learning algorithm and models. |
Tasks | Active Learning, Gaussian Processes |
Published | 2018-05-04 |
URL | http://arxiv.org/abs/1805.01867v1 |
http://arxiv.org/pdf/1805.01867v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-active-learning-for-choice-models |
Repo | |
Framework | |
Saliency guided deep network for weakly-supervised image segmentation
Title | Saliency guided deep network for weakly-supervised image segmentation |
Authors | Fengdong Sun, Wenhui Li |
Abstract | Weakly-supervised image segmentation is an important task in computer vision. A key problem is how to obtain high quality objects location from image-level category. Classification activation mapping is a common method which can be used to generate high-precise object location cues. However these location cues are generally very sparse and small such that they can not provide effective information for image segmentation. In this paper, we propose a saliency guided image segmentation network to resolve this problem. We employ a self-attention saliency method to generate subtle saliency maps, and render the location cues grow as seeds by seeded region growing method to expand pixel-level labels extent. In the process of seeds growing, we use the saliency values to weight the similarity between pixels to control the growing. Therefore saliency information could help generate discriminative object regions, and the effects of wrong salient pixels can be suppressed efficiently. Experimental results on a common segmentation dataset PASCAL VOC2012 demonstrate the effectiveness of our method. |
Tasks | Semantic Segmentation |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08378v1 |
http://arxiv.org/pdf/1810.08378v1.pdf | |
PWC | https://paperswithcode.com/paper/saliency-guided-deep-network-for-weakly |
Repo | |
Framework | |
Y^2Seq2Seq: Cross-Modal Representation Learning for 3D Shape and Text by Joint Reconstruction and Prediction of View and Word Sequences
Title | Y^2Seq2Seq: Cross-Modal Representation Learning for 3D Shape and Text by Joint Reconstruction and Prediction of View and Word Sequences |
Authors | Zhizhong Han, Mingyang Shang, Xiyang Wang, Yu-Shen Liu, Matthias Zwicker |
Abstract | A recent method employs 3D voxels to represent 3D shapes, but this limits the approach to low resolutions due to the computational cost caused by the cubic complexity of 3D voxels. Hence the method suffers from a lack of detailed geometry. To resolve this issue, we propose Y^2Seq2Seq, a view-based model, to learn cross-modal representations by joint reconstruction and prediction of view and word sequences. Specifically, the network architecture of Y^2Seq2Seq bridges the semantic meaning embedded in the two modalities by two coupled `Y’ like sequence-to-sequence (Seq2Seq) structures. In addition, our novel hierarchical constraints further increase the discriminability of the cross-modal representations by employing more detailed discriminative information. Experimental results on cross-modal retrieval and 3D shape captioning show that Y^2Seq2Seq outperforms the state-of-the-art methods. | |
Tasks | 3D Shape Representation, Cross-Modal Retrieval, Representation Learning |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.02745v1 |
http://arxiv.org/pdf/1811.02745v1.pdf | |
PWC | https://paperswithcode.com/paper/y2seq2seq-cross-modal-representation-learning |
Repo | |
Framework | |