Paper Group ANR 412
Stochastic Maximum Likelihood Optimization via Hypernetworks
Title | Stochastic Maximum Likelihood Optimization via Hypernetworks |
Authors | Abdul-Saboor Sheikh, Kashif Rasul, Andreas Merentitis, Urs Bergmann |
Abstract | This work explores maximum likelihood optimization of neural networks through hypernetworks. A hypernetwork initializes the weights of another network, which in turn can be employed for typical functional tasks such as regression and classification. We optimize hypernetworks to directly maximize the conditional likelihood of target variables given input. Using this approach we obtain competitive empirical results on regression and classification benchmarks. |
Tasks | |
Published | 2017-12-04 |
URL | http://arxiv.org/abs/1712.01141v2 |
http://arxiv.org/pdf/1712.01141v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-maximum-likelihood-optimization |
Repo | |
Framework | |
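As a rough illustration of the idea in the abstract above, the following minimal sketch lets a hypernetwork produce the weights of a tiny regression network and then scores those weights by the conditional log-likelihood of the targets. The linear hypernetwork, the Gaussian noise model with fixed `sigma`, the names `H` and `c`, and the toy data are all assumptions made for illustration; they are not taken from the paper, and no training loop is shown.

```python
import math

def hypernetwork(z, H, c):
    """Linear hypernetwork (assumed form): maps a latent vector z to target-network weights."""
    return [sum(h * zi for h, zi in zip(row, z)) + ci for row, ci in zip(H, c)]

def target_network(x, weights):
    """Target network: a 1-D linear regressor whose weights come from the hypernetwork."""
    w, b = weights
    return w * x + b

def conditional_log_likelihood(xs, ys, weights, sigma=1.0):
    """Sum of log N(y | f(x), sigma^2) over the data set (Gaussian noise is an assumption)."""
    total = 0.0
    for x, y in zip(xs, ys):
        resid = y - target_network(x, weights)
        total += -0.5 * math.log(2 * math.pi * sigma ** 2) - resid ** 2 / (2 * sigma ** 2)
    return total

# Toy data from y = 2x + 1 and a hand-picked hypernetwork (hypothetical values).
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]
H = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical hypernetwork matrix
c = [2.0, 1.0]                 # hypothetical hypernetwork bias
weights = hypernetwork([0.0, 0.0], H, c)   # z = 0 yields the weights (2, 1)
ll = conditional_log_likelihood(xs, ys, weights)
```

In a full method, `H` and `c` would be optimized so that weights generated from sampled `z` make `ll` large; here they are fixed by hand so the quantities are easy to inspect.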
The Power of Sparsity in Convolutional Neural Networks
Title | The Power of Sparsity in Convolutional Neural Networks |
Authors | Soravit Changpinyo, Mark Sandler, Andrey Zhmoginov |
Abstract | Deep convolutional networks are well-known for their high computational and memory demands. Given limited resources, how does one design a network that balances its size, training time, and prediction accuracy? A surprisingly effective approach to trade accuracy for size and speed is to simply reduce the number of channels in each convolutional layer by a fixed fraction and retrain the network. In many cases this leads to significantly smaller networks with only minimal changes to accuracy. In this paper, we take a step further by empirically examining a strategy for deactivating connections between filters in convolutional layers in a way that allows us to harvest savings both in run-time and memory for many network architectures. More specifically, we generalize 2D convolution to use a channel-wise sparse connection structure and show that this leads to significantly better results than the baseline approach for large networks including VGG and Inception V3. |
Tasks | |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06257v1 |
http://arxiv.org/pdf/1702.06257v1.pdf | |
PWC | https://paperswithcode.com/paper/the-power-of-sparsity-in-convolutional-neural |
Repo | |
Framework | |
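The trade-off described in the abstract above can be made concrete with a little parameter arithmetic. The sketch below compares a dense 3x3 convolution, the baseline of shrinking both channel counts by a fixed fraction, and a channel-wise sparse connection mask at the same weight budget. The random mask, the layer sizes, and the function names are assumptions for illustration only; the paper's actual connection structure may differ.

```python
import random

def conv_params(c_in, c_out, k=3):
    """Weights in a dense k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def sparse_channel_mask(c_in, c_out, density, seed=0):
    """Binary c_out x c_in mask keeping a fixed fraction of channel-to-channel connections."""
    rng = random.Random(seed)
    pairs = [(o, i) for o in range(c_out) for i in range(c_in)]
    kept = set(rng.sample(pairs, round(density * len(pairs))))
    return [[1 if (o, i) in kept else 0 for i in range(c_in)] for o in range(c_out)]

def sparse_conv_params(mask, k=3):
    """Weights that remain when each kept connection carries one k x k filter slice."""
    return k * k * sum(sum(row) for row in mask)

dense = conv_params(64, 128)                       # 73728
halved = conv_params(32, 64)                       # 18432: both channel counts halved
mask = sparse_channel_mask(64, 128, density=0.25)
sparse = sparse_conv_params(mask)                  # 18432: same budget, full width kept
```

Halving the channels and keeping a quarter of the connections cost the same number of weights, but the sparse variant preserves all 64 input and 128 output channels.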
Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields
Title | Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields |
Authors | Ylva Jansson, Tony Lindeberg |
Abstract | This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatio-temporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to the state of the art. In particular, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives, either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives. |
Tasks | Dynamic Texture Recognition, Object Recognition |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.04842v3 |
http://arxiv.org/pdf/1710.04842v3.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-texture-recognition-using-time-causal |
Repo | |
Framework | |
On Detection of Faint Edges in Noisy Images
Title | On Detection of Faint Edges in Noisy Images |
Authors | Nati Ofir, Meirav Galun, Sharon Alpert, Achi Brandt, Boaz Nadler, Ronen Basri |
Abstract | A fundamental question for edge detection in noisy images is how faint can an edge be and still be detected. In this paper we offer a formalism to study this question and subsequently introduce computationally efficient multiscale edge detection algorithms designed to detect faint edges in noisy images. In our formalism we view edge detection as a search in a discrete, though potentially large, set of feasible curves. First, we derive approximate expressions for the detection threshold as a function of curve length and the complexity of the search space. We then present two edge detection algorithms, one for straight edges, and the second for curved ones. Both algorithms efficiently search for edges in a large set of candidates by hierarchically constructing difference filters that match the curves traced by the sought edges. We demonstrate the utility of our algorithms in both simulations and applications involving challenging real images. Finally, based on these principles, we develop an algorithm for fiber detection and enhancement. We exemplify its utility to reveal and enhance nerve axons in light microscopy images. |
Tasks | Edge Detection |
Published | 2017-06-22 |
URL | http://arxiv.org/abs/1706.07717v1 |
http://arxiv.org/pdf/1706.07717v1.pdf | |
PWC | https://paperswithcode.com/paper/on-detection-of-faint-edges-in-noisy-images |
Repo | |
Framework | |
Evaluation of Automatic Video Captioning Using Direct Assessment
Title | Evaluation of Automatic Video Captioning Using Direct Assessment |
Authors | Yvette Graham, George Awad, Alan Smeaton |
Abstract | We present Direct Assessment, a method for manually assessing the quality of automatically-generated captions for video. Evaluating the accuracy of video captions is particularly difficult because for any given video clip there is no definitive ground truth or correct answer against which to measure. Automatic metrics for comparing automatic video captions against a manual caption, such as BLEU and METEOR, drawn from techniques used in evaluating machine translation, were used in the TRECVid video captioning task in 2016 but these are shown to have weaknesses. The work presented here brings human assessment into the evaluation by crowdsourcing how well a caption describes a video. We automatically degrade the quality of some sample captions which are assessed manually and from this we are able to rate the quality of the human assessors, a factor we take into account in the evaluation. Using data from the TRECVid video-to-text task in 2016, we show how our direct assessment method is replicable and robust and should scale to where there are many caption-generation techniques to be evaluated. |
Tasks | Machine Translation, Video Captioning |
Published | 2017-10-29 |
URL | http://arxiv.org/abs/1710.10586v1 |
http://arxiv.org/pdf/1710.10586v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluation-of-automatic-video-captioning |
Repo | |
Framework | |
Rotation Adaptive Visual Object Tracking with Motion Consistency
Title | Rotation Adaptive Visual Object Tracking with Motion Consistency |
Authors | Litu Rout, Sidhartha, Gorthi R. K. S. S. Manyam, Deepak Mishra |
Abstract | Visual object tracking research has undergone significant improvement in the past few years. The emergence of the tracking-by-detection approach in the tracking paradigm has been quite successful in many ways. Recently, deep convolutional neural networks have been extensively used in most successful trackers. Yet, the standard approach has been based on correlation or feature selection with minimal consideration given to motion consistency. Thus, there is still a need to capture various physical constraints through motion consistency which will improve accuracy, robustness and, more importantly, rotation adaptiveness. Therefore, one of the major aspects of this paper is to investigate the outcome of rotation adaptiveness in visual object tracking. Among other key contributions, the paper also includes various consistencies that turn out to be more effective than the current state of the art on numerous challenging sequences. |
Tasks | Feature Selection, Object Tracking, Visual Object Tracking |
Published | 2017-09-18 |
URL | http://arxiv.org/abs/1709.06057v2 |
http://arxiv.org/pdf/1709.06057v2.pdf | |
PWC | https://paperswithcode.com/paper/rotation-adaptive-visual-object-tracking-with |
Repo | |
Framework | |
Safe Policy Search with Gaussian Process Models
Title | Safe Policy Search with Gaussian Process Models |
Authors | Kyriakos Polymenakos, Alessandro Abate, Stephen Roberts |
Abstract | We propose a method to optimise the parameters of a policy which will be used to safely perform a given task in a data-efficient manner. We train a Gaussian process model to capture the system dynamics, based on the PILCO framework. Our model has useful analytic properties, which allow closed form computation of error gradients and estimating the probability of violating given state space constraints. During training, as well as operation, only policies that are deemed safe are implemented on the real system, minimising the risk of failure. |
Tasks | |
Published | 2017-12-15 |
URL | https://arxiv.org/abs/1712.05556v3 |
https://arxiv.org/pdf/1712.05556v3.pdf | |
PWC | https://paperswithcode.com/paper/safe-policy-search-with-gaussian-process |
Repo | |
Framework | |
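The abstract above mentions estimating the probability of violating state-space constraints and only running policies deemed safe. A minimal sketch of that kind of check is given below for a one-dimensional state with Gaussian predictions at each time step. The interval constraint, the union bound over time steps, the `max_risk` threshold, and all function names are assumptions for illustration; the paper derives its own estimate from the Gaussian process model.

```python
import math

def normal_cdf(x, mean, std):
    """Cumulative distribution function of a univariate Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def violation_probability(mean, std, lower, upper):
    """P(state outside [lower, upper]) for a Gaussian state prediction."""
    inside = normal_cdf(upper, mean, std) - normal_cdf(lower, mean, std)
    return 1.0 - inside

def is_safe(predictions, lower, upper, max_risk):
    """Union bound over time steps: sum the per-step violation probabilities.

    `predictions` is a list of (mean, std) pairs, one per predicted time step.
    """
    risk = sum(violation_probability(m, s, lower, upper) for m, s in predictions)
    return risk <= max_risk, risk

# Hypothetical rollouts of two candidate policies.
safe_ok, safe_risk = is_safe([(0.0, 0.1), (0.2, 0.1)], -1.0, 1.0, max_risk=0.01)
risky_ok, risky_risk = is_safe([(0.9, 0.5)], -1.0, 1.0, max_risk=0.05)
```

The union bound is conservative: it can reject a policy that is in fact safe, but it never understates the total risk across time steps.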
Maximum Volume Inscribed Ellipsoid: A New Simplex-Structured Matrix Factorization Framework via Facet Enumeration and Convex Optimization
Title | Maximum Volume Inscribed Ellipsoid: A New Simplex-Structured Matrix Factorization Framework via Facet Enumeration and Convex Optimization |
Authors | Chia-Hsiang Lin, Ruiyuan Wu, Wing-Kin Ma, Chong-Yung Chi, Yue Wang |
Abstract | Consider a structured matrix factorization model where one factor is restricted to have its columns lying in the unit simplex. This simplex-structured matrix factorization (SSMF) model and the associated factorization techniques have spurred much interest in research topics over different areas, such as hyperspectral unmixing in remote sensing, topic discovery in machine learning, to name a few. In this paper we develop a new theoretical SSMF framework whose idea is to study a maximum volume ellipsoid inscribed in the convex hull of the data points. This maximum volume inscribed ellipsoid (MVIE) idea has not been attempted in prior literature, and we show a sufficient condition under which the MVIE framework guarantees exact recovery of the factors. The sufficient recovery condition we show for MVIE is much more relaxed than that of separable non-negative matrix factorization (or pure-pixel search); coincidentally it is also identical to that of minimum volume enclosing simplex, which is known to be a powerful SSMF framework for non-separable problem instances. We also show that MVIE can be practically implemented by performing facet enumeration and then by solving a convex optimization problem. The potential of the MVIE framework is illustrated by numerical results. |
Tasks | Hyperspectral Unmixing |
Published | 2017-08-09 |
URL | http://arxiv.org/abs/1708.02883v3 |
http://arxiv.org/pdf/1708.02883v3.pdf | |
PWC | https://paperswithcode.com/paper/maximum-volume-inscribed-ellipsoid-a-new |
Repo | |
Framework | |
A Trie-Structured Bayesian Model for Unsupervised Morphological Segmentation
Title | A Trie-Structured Bayesian Model for Unsupervised Morphological Segmentation |
Authors | Murathan Kurfalı, Ahmet Üstün, Burcu Can |
Abstract | In this paper, we introduce a trie-structured Bayesian model for unsupervised morphological segmentation. We adopt prior information from different sources in the model. We use neural word embeddings to discover words that are morphologically derived from each other and are thereby semantically similar. We use letter successor variety counts obtained from tries that are built by neural word embeddings. Our results show that using different information sources such as neural word embeddings and letter successor variety as prior information improves morphological segmentation in a Bayesian model. Our model outperforms other unsupervised morphological segmentation models on Turkish and gives promising results on English and German for scarce resources. |
Tasks | Word Embeddings |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07329v1 |
http://arxiv.org/pdf/1704.07329v1.pdf | |
PWC | https://paperswithcode.com/paper/a-trie-structured-bayesian-model-for |
Repo | |
Framework | |
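Letter successor variety, one of the prior information sources named in the abstract above, counts how many distinct letters can follow a given prefix in a trie. The sketch below builds a trie from a plain word list and uses a simple threshold to propose morpheme boundaries. The word list, the threshold rule, and the function names are assumptions for illustration; the paper builds its tries with the help of word embeddings and uses the counts inside a Bayesian model, neither of which is reproduced here.

```python
def build_trie(words):
    """Nested-dict trie; the key "$" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = {}
    return root

def successor_variety(trie, prefix):
    """Number of distinct letters that follow `prefix` in the trie (end marker excluded)."""
    node = trie
    for ch in prefix:
        if ch not in node:
            return 0
        node = node[ch]
    return sum(1 for ch in node if ch != "$")

def segment(word, trie, threshold=2):
    """Propose a split after every prefix whose successor variety reaches the threshold."""
    cuts = [i for i in range(1, len(word))
            if successor_variety(trie, word[:i]) >= threshold]
    pieces, start = [], 0
    for cut in cuts:
        pieces.append(word[start:cut])
        start = cut
    pieces.append(word[start:])
    return pieces

words = ["walk", "walks", "walked", "walking", "talk", "talks", "talked"]
trie = build_trie(words)
```

With this toy vocabulary, the prefix `walk` is followed by three distinct letters (`s`, `e`, `i`), so `segment("walked", trie)` splits the word into `walk` and `ed`.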
A self-organizing neural network architecture for learning human-object interactions
Title | A self-organizing neural network architecture for learning human-object interactions |
Authors | Luiza Mici, German I. Parisi, Stefan Wermter |
Abstract | The visual recognition of transitive actions comprising human-object interactions is a key component for artificial systems operating in natural environments. This challenging task requires jointly the recognition of articulated body actions as well as the extraction of semantic elements from the scene such as the identity of the manipulated objects. In this paper, we present a self-organizing neural network for the recognition of human-object interactions from RGB-D videos. Our model consists of a hierarchy of Grow-When-Required (GWR) networks that learn prototypical representations of body motion patterns and objects, accounting for the development of action-object mappings in an unsupervised fashion. We report experimental results on a dataset of daily activities collected for the purpose of this study as well as on a publicly available benchmark dataset. In line with neurophysiological studies, our self-organizing architecture exhibits higher neural activation for congruent action-object pairs learned during training sessions with respect to synthetically created incongruent ones. We show that our unsupervised model shows competitive classification results on the benchmark dataset with respect to strictly supervised approaches. |
Tasks | Human-Object Interaction Detection |
Published | 2017-10-05 |
URL | http://arxiv.org/abs/1710.01916v2 |
http://arxiv.org/pdf/1710.01916v2.pdf | |
PWC | https://paperswithcode.com/paper/a-self-organizing-neural-network-architecture |
Repo | |
Framework | |
Causal Discovery Using Proxy Variables
Title | Causal Discovery Using Proxy Variables |
Authors | Mateo Rojas-Carulla, Marco Baroni, David Lopez-Paz |
Abstract | Discovering causal relations is fundamental to reasoning and intelligence. In particular, observational causal discovery algorithms estimate the cause-effect relation between two random entities $X$ and $Y$, given $n$ samples from $P(X,Y)$. In this paper, we develop a framework to estimate the cause-effect relation between two static entities $x$ and $y$: for instance, an art masterpiece $x$ and its fraudulent copy $y$. To this end, we introduce the notion of proxy variables, which allow the construction of a pair of random entities $(A,B)$ from the pair of static entities $(x,y)$. Then, estimating the cause-effect relation between $A$ and $B$ using an observational causal discovery algorithm leads to an estimation of the cause-effect relation between $x$ and $y$. For example, our framework detects the causal relation between unprocessed photographs and their modifications, and orders in time a set of shuffled frames from a video. As our main case study, we introduce a human-elicited dataset of 10,000 causally-linked pairs of words from natural language. Our methods discover 75% of these causal relations. Finally, we discuss the role of proxy variables in machine learning, as a general tool to incorporate static knowledge into prediction tasks. |
Tasks | Causal Discovery |
Published | 2017-02-23 |
URL | http://arxiv.org/abs/1702.07306v1 |
http://arxiv.org/pdf/1702.07306v1.pdf | |
PWC | https://paperswithcode.com/paper/causal-discovery-using-proxy-variables |
Repo | |
Framework | |
Multiscale Hierarchical Convolutional Networks
Title | Multiscale Hierarchical Convolutional Networks |
Authors | Jörn-Henrik Jacobsen, Edouard Oyallon, Stéphane Mallat, Arnold W. M. Smeulders |
Abstract | Deep neural network algorithms are difficult to analyze because they lack structure that would allow one to understand the properties of the underlying transforms and invariants. Multiscale hierarchical convolutional networks are structured deep convolutional networks where layers are indexed by progressively higher dimensional attributes, which are learned from training data. Each new layer is computed with multidimensional convolutions along spatial and attribute variables. We introduce an efficient implementation of such networks where the dimensionality is progressively reduced by averaging intermediate layers along attribute indices. Hierarchical networks are tested on CIFAR image databases, where they obtain precision comparable to state-of-the-art networks with far fewer parameters. We study some properties of the attributes learned from these databases. |
Tasks | |
Published | 2017-03-12 |
URL | http://arxiv.org/abs/1703.04140v1 |
http://arxiv.org/pdf/1703.04140v1.pdf | |
PWC | https://paperswithcode.com/paper/multiscale-hierarchical-convolutional |
Repo | |
Framework | |
Computational Content Analysis of Negative Tweets for Obesity, Diet, Diabetes, and Exercise
Title | Computational Content Analysis of Negative Tweets for Obesity, Diet, Diabetes, and Exercise |
Authors | George Shaw Jr., Amir Karami |
Abstract | Social media based digital epidemiology has the potential to support faster response and deeper understanding of public health related threats. This study proposes a new framework to analyze unstructured health-related textual data from Twitter users’ posts (tweets) to characterize the negative health sentiments and non-health-related concerns in relation to the corpus of negative sentiments regarding Diet, Diabetes, Exercise, and Obesity (DDEO). Through the collection of 6 million tweets over one month, this study identified the prominent topics among users as they relate to the negative sentiments. Our proposed framework uses two text mining methods, sentiment analysis and topic modeling, to discover negative topics. The negative sentiments of Twitter users support the literature narratives and the many morbidity issues that are associated with DDEO and the linkage between obesity and diabetes. The framework offers a potential method to understand the public’s opinions and sentiments regarding DDEO. More importantly, this research provides new opportunities for computational social scientists, medical experts, and public health professionals to collectively address DDEO-related issues. |
Tasks | Epidemiology, Sentiment Analysis |
Published | 2017-09-22 |
URL | http://arxiv.org/abs/1709.07915v1 |
http://arxiv.org/pdf/1709.07915v1.pdf | |
PWC | https://paperswithcode.com/paper/computational-content-analysis-of-negative |
Repo | |
Framework | |
Simulated Annealing for JPEG Quantization
Title | Simulated Annealing for JPEG Quantization |
Authors | Max Hopkins, Michael Mitzenmacher, Sebastian Wagner-Carena |
Abstract | JPEG is one of the most widely used image formats, but in some ways remains surprisingly unoptimized, perhaps because some natural optimizations would go outside the standard that defines JPEG. We show how to improve JPEG compression in a standard-compliant, backward-compatible manner, by finding improved default quantization tables. We describe a simulated annealing technique that has allowed us to find several quantization tables that perform better than the industry standard, in terms of both compressed size and image fidelity. Specifically, we derive tables that reduce the FSIM error by over 10% while improving compression by over 20% at quality level 95 in our tests; we also provide similar results for other quality levels. While we acknowledge our approach can in some images lead to visible artifacts under large magnification, we believe use of these quantization tables, or additional tables that could be found using our methodology, would significantly reduce JPEG file sizes with improved overall image quality. |
Tasks | Quantization |
Published | 2017-09-03 |
URL | http://arxiv.org/abs/1709.00649v1 |
http://arxiv.org/pdf/1709.00649v1.pdf | |
PWC | https://paperswithcode.com/paper/simulated-annealing-for-jpeg-quantization |
Repo | |
Framework | |
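The search procedure named in the abstract above can be sketched generically: perturb one entry of a 64-entry quantization table, accept worse tables with a temperature-dependent probability, and keep the best table seen. Everything specific in the sketch below is an assumption for illustration: the `toy_cost` function is a stand-in (the paper evaluates compressed size and FSIM on real images), and the move set, the geometric cooling schedule, and the starting table are hypothetical choices rather than the authors' settings.

```python
import math
import random

def anneal(table, cost, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Generic simulated annealing over a flat list of quantization values in [1, 255]."""
    rng = random.Random(seed)
    current = list(table)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Propose a neighbor: nudge one random entry by +/-1, clamped to the valid range.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] = min(255, max(1, candidate[i] + rng.choice((-1, 1))))
        cand_cost = cost(candidate)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse moves with probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
    return best, best_cost

# Stand-in objective: squared distance to a made-up target table (NOT a JPEG metric).
target = [10 + (i % 8) + (i // 8) for i in range(64)]

def toy_cost(table):
    return sum((q - t) ** 2 for q, t in zip(table, target))

start = [16] * 64
best_table, best_cost = anneal(start, toy_cost)
```

To apply this to real quantization tables, `toy_cost` would be replaced by a function that encodes a set of test images with the candidate table and combines file size with an image-fidelity score.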
View-Invariant Recognition of Action Style Self-Dissimilarity
Title | View-Invariant Recognition of Action Style Self-Dissimilarity |
Authors | Yuping Shen, Hassan Foroosh |
Abstract | Self-similarity was recently introduced as a measure of inter-class congruence for classification of actions. Herein, we investigate the dual problem of intra-class dissimilarity for classification of action styles. We introduce self-dissimilarity matrices that discriminate between the same actions performed by different subjects regardless of viewing direction and camera parameters. We investigate two frameworks using these invariant style dissimilarity measures based on Principal Component Analysis (PCA) and Fisher Discriminant Analysis (FDA). Extensive experiments performed on the IXMAS dataset indicate remarkably good discriminant characteristics for the proposed invariant measures for gender recognition from video data. |
Tasks | |
Published | 2017-05-22 |
URL | http://arxiv.org/abs/1705.07609v1 |
http://arxiv.org/pdf/1705.07609v1.pdf | |
PWC | https://paperswithcode.com/paper/view-invariant-recognition-of-action-style |
Repo | |
Framework | |