May 5, 2019

Paper Group ANR 562

Dictionary Learning Strategies for Compressed Fiber Sensing Using a Probabilistic Sparse Model


Title	Dictionary Learning Strategies for Compressed Fiber Sensing Using a Probabilistic Sparse Model
Authors	Christian Weiss, Abdelhak M. Zoubir
Abstract	We present a sparse estimation and dictionary learning framework for compressed fiber sensing based on a probabilistic hierarchical sparse model. To handle severe dictionary coherence, selective shrinkage is achieved using a Weibull prior, which can be related to non-convex optimization with $p$-norm constraints for $0 < p < 1$. In addition, we leverage the specific dictionary structure to promote collective shrinkage based on a local similarity model. This is incorporated in the form of a kernel function in the joint prior density of the sparse coefficients, thereby establishing a Markov random field relation. Approximate inference is accomplished using a hybrid technique that combines Hamiltonian Monte Carlo and Gibbs sampling. To estimate the dictionary parameter, we pursue two strategies, relying on either a deterministic or a probabilistic model for the dictionary parameter. In the first strategy, the parameter is estimated based on alternating estimation. In the second strategy, it is jointly estimated along with the sparse coefficients. The performance is evaluated in comparison to an existing method in various scenarios using simulations and experimental data.
Tasks	Dictionary Learning
Published	2016-10-21
URL	http://arxiv.org/abs/1610.06902v1
PDF	http://arxiv.org/pdf/1610.06902v1.pdf
PWC	https://paperswithcode.com/paper/dictionary-learning-strategies-for-compressed
Repo
Framework

Learning Null Space Projections in Operational Space Formulation


Title	Learning Null Space Projections in Operational Space Formulation
Authors	Hsiu-Chin Lin, Matthew Howard
Abstract	In recent years, a number of tools have become available that recover the underlying control policy from constrained movements. However, few have explicitly considered learning the constraints of the motion and ways to cope with an unknown environment. In this paper, we consider learning the null space projection matrix of a kinematically constrained system in the absence of any prior knowledge of the underlying policy, the geometry, or the dimensionality of the constraints. Our evaluations demonstrate the effectiveness of the proposed approach on problems of differing dimensionality and with different degrees of non-linearity.
Tasks
Published	2016-07-26
URL	http://arxiv.org/abs/1607.07611v1
PDF	http://arxiv.org/pdf/1607.07611v1.pdf
PWC	https://paperswithcode.com/paper/learning-null-space-projections-in
Repo
Framework

Attributes for Improved Attributes: A Multi-Task Network for Attribute Classification


Title	Attributes for Improved Attributes: A Multi-Task Network for Attribute Classification
Authors	Emily M. Hand, Rama Chellappa
Abstract	Attributes, or semantic features, have gained popularity in the past few years in domains ranging from activity recognition in video to face verification. Improving the accuracy of attribute classifiers is an important first step in any application which uses these attributes. In most works to date, attributes have been considered to be independent. However, we know this not to be the case. Many attributes are very strongly related, such as heavy makeup and wearing lipstick. We propose to take advantage of attribute relationships in three ways: by using a multi-task deep convolutional neural network (MCNN) sharing the lowest layers amongst all attributes, sharing the higher layers for related attributes, and by building an auxiliary network on top of the MCNN which utilizes the scores from all attributes to improve the final classification of each attribute. We demonstrate the effectiveness of our method by producing results on two challenging publicly available datasets.
Tasks	Activity Recognition, Face Verification
Published	2016-04-25
URL	http://arxiv.org/abs/1604.07360v1
PDF	http://arxiv.org/pdf/1604.07360v1.pdf
PWC	https://paperswithcode.com/paper/attributes-for-improved-attributes-a-multi
Repo
Framework

Automatic measurement of vowel duration via structured prediction


Title	Automatic measurement of vowel duration via structured prediction
Authors	Yossi Adi, Joseph Keshet, Emily Cibelli, Erin Gustafson, Cynthia Clopper, Matthew Goldrick
Abstract	A key barrier to making phonetic studies scalable and replicable is the need to rely on subjective, manual annotation. To help meet this challenge, a machine learning algorithm was developed for automatic measurement of a widely used phonetic measure: vowel duration. Manually-annotated data were used to train a model that takes as input an arbitrary length segment of the acoustic signal containing a single vowel that is preceded and followed by consonants and outputs the duration of the vowel. The model is based on the structured prediction framework. The input signal and a hypothesized set of a vowel’s onset and offset are mapped to an abstract vector space by a set of acoustic feature functions. The learning algorithm is trained in this space to minimize the difference in expectations between predicted and manually-measured vowel durations. The trained model can then automatically estimate vowel durations without phonetic or orthographic transcription. Results comparing the model to three sets of manually annotated data suggest it out-performed the current gold standard for duration measurement, an HMM-based forced aligner (which requires orthographic or phonetic transcription as an input).
Tasks	Structured Prediction
Published	2016-10-26
URL	http://arxiv.org/abs/1610.08166v1
PDF	http://arxiv.org/pdf/1610.08166v1.pdf
PWC	https://paperswithcode.com/paper/automatic-measurement-of-vowel-duration-via
Repo
Framework

Parallel Chromatic MCMC with Spatial Partitioning


Title	Parallel Chromatic MCMC with Spatial Partitioning
Authors	Jun Song, David A. Moore
Abstract	We introduce a novel approach for parallelizing MCMC inference in models with spatially determined conditional independence relationships, for which existing techniques exploiting graphical model structure are not applicable. Our approach is motivated by a model of seismic events and signals, where events detected in distant regions are approximately independent given those in intermediate regions. We perform parallel inference by coloring a factor graph defined over regions of latent space, rather than individual model variables. Evaluating on a model of seismic event detection, we achieve significant speedups over serial MCMC with no degradation in inference quality.
Tasks
Published	2016-12-02
URL	http://arxiv.org/abs/1612.00595v2
PDF	http://arxiv.org/pdf/1612.00595v2.pdf
PWC	https://paperswithcode.com/paper/parallel-chromatic-mcmc-with-spatial
Repo
Framework
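
The coloring idea in the abstract above can be illustrated with a minimal sketch. It assumes the regions of latent space are given as a plain adjacency dictionary (a hypothetical input format, not the paper's factor graph) and uses a simple greedy coloring rather than whatever coloring scheme the authors use. Regions that share a color have no edge between them, so their MCMC updates could in principle run in parallel.

```python
def greedy_coloring(adjacency):
    """Assign each region the smallest color not used by an already-colored neighbor.

    `adjacency` maps a region id to the ids of regions it interacts with
    (hypothetical representation chosen for this sketch).
    """
    colors = {}
    for node in sorted(adjacency):
        used = {colors[n] for n in adjacency[node] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


def parallel_batches(colors):
    """Group regions by color; each group can be updated concurrently."""
    batches = {}
    for node, color in colors.items():
        batches.setdefault(color, []).append(node)
    return [sorted(batches[c]) for c in sorted(batches)]


# Four regions along a line: each region only interacts with its neighbors.
regions = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
coloring = greedy_coloring(regions)
schedule = parallel_batches(coloring)
```

For this chain of four regions the schedule alternates between two batches, which is the usual "red-black" pattern for chain-structured dependencies.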

Modelling Radiological Language with Bidirectional Long Short-Term Memory Networks


Title	Modelling Radiological Language with Bidirectional Long Short-Term Memory Networks
Authors	Savelie Cornegruta, Robert Bakewell, Samuel Withey, Giovanni Montana
Abstract	Motivated by the need to automate medical information extraction from free-text radiological reports, we present a bi-directional long short-term memory (BiLSTM) neural network architecture for modelling radiological language. The model has been used to address two NLP tasks: medical named-entity recognition (NER) and negation detection. We investigate whether learning several types of word embeddings improves BiLSTM’s performance on those tasks. Using a large dataset of chest x-ray reports, we compare the proposed model to a baseline dictionary-based NER system and a negation detection system that leverages the hand-crafted rules of the NegEx algorithm and the grammatical relations obtained from the Stanford Dependency Parser. Compared to these more traditional rule-based systems, we argue that BiLSTM offers a strong alternative for both our tasks.
Tasks	Medical Named Entity Recognition, Named Entity Recognition, Negation Detection, Word Embeddings
Published	2016-09-27
URL	http://arxiv.org/abs/1609.08409v1
PDF	http://arxiv.org/pdf/1609.08409v1.pdf
PWC	https://paperswithcode.com/paper/modelling-radiological-language-with
Repo
Framework

Neural Style Representations and the Large-Scale Classification of Artistic Style


Title	Neural Style Representations and the Large-Scale Classification of Artistic Style
Authors	Jeremiah Johnson
Abstract	The artistic style of a painting is a subtle aesthetic judgment used by art historians for grouping and classifying artwork. The recently introduced 'neural-style' algorithm substantially succeeds in merging the perceived artistic style of one image or set of images with the perceived content of another. In light of this and other recent developments in image analysis via convolutional neural networks, we investigate the effectiveness of a 'neural-style' representation for classifying the artistic style of paintings.
Tasks
Published	2016-11-16
URL	http://arxiv.org/abs/1611.05368v1
PDF	http://arxiv.org/pdf/1611.05368v1.pdf
PWC	https://paperswithcode.com/paper/neural-style-representations-and-the-large
Repo
Framework

Orthogonal symmetric non-negative matrix factorization under the stochastic block model


Title	Orthogonal symmetric non-negative matrix factorization under the stochastic block model
Authors	Subhadeep Paul, Yuguo Chen
Abstract	We present a method based on the orthogonal symmetric non-negative matrix tri-factorization of the normalized Laplacian matrix for community detection in complex networks. While the exact factorization of a given order may not exist and is NP-hard to compute, we obtain an approximate factorization by solving an optimization problem. We establish the connection of the factors obtained through the factorization to a non-negative basis of an invariant subspace of the estimated matrix, drawing a parallel with spectral clustering. Using such a factorization for clustering in networks is motivated by analyzing a block-diagonal Laplacian matrix with the blocks representing the connected components of a graph. The method is shown to be consistent for community detection in graphs generated from the stochastic block model and the degree-corrected stochastic block model. Simulation results and real data analysis show the effectiveness of these methods under a wide variety of situations, including sparse and highly heterogeneous graphs where the usual spectral clustering is known to fail. Our method also performs better than the state of the art on popular benchmark network datasets, e.g., the political web blogs and the karate club data.
Tasks	Community Detection
Published	2016-05-17
URL	http://arxiv.org/abs/1605.05349v1
PDF	http://arxiv.org/pdf/1605.05349v1.pdf
PWC	https://paperswithcode.com/paper/orthogonal-symmetric-non-negative-matrix
Repo
Framework

Retrospective Causal Inference with Machine Learning Ensembles: An Application to Anti-Recidivism Policies in Colombia


Title	Retrospective Causal Inference with Machine Learning Ensembles: An Application to Anti-Recidivism Policies in Colombia
Authors	Cyrus Samii, Laura Paler, Sarah Zukerman Daly
Abstract	We present new methods to estimate causal effects retrospectively from micro data with the assistance of a machine learning ensemble. This approach overcomes two important limitations in conventional methods like regression modeling or matching: (i) ambiguity about the pertinent retrospective counterfactuals and (ii) potential misspecification, overfitting, and otherwise bias-prone or inefficient use of a large identifying covariate set in the estimation of causal effects. Our method targets the analysis toward a well-defined "retrospective intervention effect" (RIE) based on hypothetical population interventions and applies a machine learning ensemble that allows data to guide us, in a controlled fashion, on how to use a large identifying covariate set. We illustrate with an analysis of policy options for reducing ex-combatant recidivism in Colombia.
Tasks	Causal Inference
Published	2016-07-11
URL	http://arxiv.org/abs/1607.03026v1
PDF	http://arxiv.org/pdf/1607.03026v1.pdf
PWC	https://paperswithcode.com/paper/retrospective-causal-inference-with-machine
Repo
Framework

The Crossover Process: Learnability and Data Protection from Inference Attacks


Title	The Crossover Process: Learnability and Data Protection from Inference Attacks
Authors	Richard Nock, Giorgio Patrini, Finnian Lattimore, Tiberio Caetano
Abstract	It is usual to consider data protection and learnability as conflicting objectives. This is not always the case: we show how to jointly control inference (seen as the attack) and learnability by a noise-free process that mixes training examples, the Crossover Process (cp). One key point is that the cp is typically able to alter joint distributions without touching the marginals or altering the sufficient statistic for the class. In other words, it saves (and sometimes improves) generalization for supervised learning, but can alter the relationship between covariates, and therefore fool measures of nonlinear independence and causal inference into misleading ad-hoc conclusions. For example, a cp can increase or decrease odds ratios, bring fairness or break fairness, tamper with disparate impact, strengthen, weaken or reverse causal directions, and change observed statistical measures of dependence. For each of these, we quantify changes brought by a cp, as well as its statistical impact on generalization abilities via a new complexity measure that we call the Rademacher cp complexity. Experiments on a dozen readily available domains validate the theory.
Tasks	Causal Inference
Published	2016-06-13
URL	http://arxiv.org/abs/1606.04160v2
PDF	http://arxiv.org/pdf/1606.04160v2.pdf
PWC	https://paperswithcode.com/paper/the-crossover-process-learnability-and-data
Repo
Framework

DefExt: A Semi Supervised Definition Extraction Tool


Title	DefExt: A Semi Supervised Definition Extraction Tool
Authors	Luis Espinosa-Anke, Roberto Carlini, Horacio Saggion, Francesco Ronzano
Abstract	We present DefExt, an easy-to-use semi-supervised Definition Extraction Tool. DefExt is designed to extract from a target corpus those textual fragments where a term is explicitly mentioned together with its core features, i.e. its definition. It works on the back of a Conditional Random Fields based sequential labeling algorithm and a bootstrapping approach. Bootstrapping enables the model to gradually become more aware of the idiosyncrasies of the target corpus. In this paper we describe the main components of the toolkit as well as experimental results stemming from both automatic and manual evaluation. We release DefExt as open source along with the necessary files to run it on any Unix machine. We also provide access to training and test data for immediate use.
Tasks
Published	2016-06-08
URL	http://arxiv.org/abs/1606.02514v1
PDF	http://arxiv.org/pdf/1606.02514v1.pdf
PWC	https://paperswithcode.com/paper/defext-a-semi-supervised-definition
Repo
Framework

Relaxed Earth Mover’s Distances for Chain- and Tree-connected Spaces and their use as a Loss Function in Deep Learning


Title	Relaxed Earth Mover’s Distances for Chain- and Tree-connected Spaces and their use as a Loss Function in Deep Learning
Authors	Manuel Martinez, Monica Haurilet, Ziad Al-Halah, Makarand Tapaswi, Rainer Stiefelhagen
Abstract	The Earth Mover’s Distance (EMD) computes the optimal cost of transforming one distribution into another, given a known transport metric between them. In deep learning, the EMD loss allows us to embed information during training about the output space structure, such as hierarchical or semantic relations. This helps in achieving better output smoothness and generalization. However, EMD is computationally expensive. Moreover, solving EMD optimization problems usually requires complex techniques like lasso. These properties limit the applicability of EMD-based approaches in large-scale machine learning. In this work we address the difficulties of incorporating an EMD-based loss in deep learning frameworks. Additionally, we provide insight and novel solutions on how to integrate such a loss function in training deep neural networks. Specifically, we make three main contributions: (i) we provide an in-depth analysis of the fastest state-of-the-art EMD algorithm (Sinkhorn Distance) and discuss its limitations in deep learning scenarios; (ii) we derive fast and numerically stable closed-form solutions for the EMD gradient in output spaces with chain- and tree-connectivity; and (iii) we propose a relaxed form of the EMD gradient with equivalent computational complexity but faster convergence rate. We support our claims with experiments on real datasets. In a restricted data setting on the ImageNet dataset, we train a model to classify 1000 categories using 50K images, and demonstrate that our relaxed EMD loss achieves better Top-1 accuracy than the cross entropy loss. Overall, we show that our relaxed EMD loss criterion is a powerful asset for deep learning in the small data regime.
Tasks
Published	2016-11-22
URL	http://arxiv.org/abs/1611.07573v1
PDF	http://arxiv.org/pdf/1611.07573v1.pdf
PWC	https://paperswithcode.com/paper/relaxed-earth-movers-distances-for-chain-and
Repo
Framework
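
As background for the chain-connected case mentioned in the abstract above, here is a minimal sketch of the well-known closed form for EMD between two histograms on an ordered chain: with unit distance between adjacent bins and equal total mass, the EMD equals the sum of absolute differences between the two cumulative sums. This is the standard one-dimensional result, not the paper's relaxed loss or its gradient; the function name `chain_emd` is made up for this illustration.

```python
from itertools import accumulate


def chain_emd(p, q):
    """EMD between histograms p and q on a chain with unit spacing between bins.

    Assumes both histograms have the same length and the same total mass.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have equal total mass")
    # The mass that has to cross the boundary after bin i is |CDF_p(i) - CDF_q(i)|.
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


# Moving all mass from the first bin to the last bin of a 3-bin chain costs 2 steps.
far = chain_emd([1, 0, 0], [0, 0, 1])
# Shifting a two-bin block by one position moves a total mass of 1 by one step.
near = chain_emd([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
```

Unlike cross entropy, this cost grows with how far the predicted mass sits from the target bin, which is the kind of output-space structure the abstract refers to.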

Zipf’s law emerges asymptotically during phase transitions in communicative systems


Title	Zipf’s law emerges asymptotically during phase transitions in communicative systems
Authors	Bohdan B. Khomtchouk, Claes Wahlestedt
Abstract	Zipf’s law predicts a power-law relationship between word rank and frequency in language communication systems, and is widely reported in texts yet remains enigmatic as to its origins. Computer simulations have shown that language communication systems emerge at an abrupt phase transition in the fidelity of mappings between symbols and objects. Since the phase transition approximates the Heaviside or step function, we show that Zipfian scaling emerges asymptotically at high rank based on the Laplace transform. We thereby demonstrate that Zipf’s law gradually emerges from the moment of phase transition in communicative systems. We show that this power-law scaling behavior explains the emergence of natural languages at phase transitions. We find that the emergence of Zipf’s law during language communication suggests that the use of rare words in a lexicon is critical for the construction of an effective communicative system at the phase transition.
Tasks
Published	2016-03-10
URL	http://arxiv.org/abs/1603.03153v2
PDF	http://arxiv.org/pdf/1603.03153v2.pdf
PWC	https://paperswithcode.com/paper/zipfs-law-emerges-asymptotically-during-phase
Repo
Framework
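
To make the rank-frequency relationship in the abstract above concrete, the following sketch estimates the exponent $s$ in $f(r) \propto r^{-s}$ by ordinary least squares on the log-log rank-frequency pairs. It is a generic illustration, not the Laplace-transform argument of the paper; the helper name `zipf_exponent` is hypothetical, and the input is assumed to be a list of at least two strictly positive frequencies.

```python
import math


def zipf_exponent(freqs):
    """Least-squares slope of log(frequency) against log(rank), sign-flipped.

    Frequencies are sorted in descending order and ranked from 1.
    """
    ranked = sorted(freqs, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(f) for f in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic frequencies that follow f(r) = 1000 / r exactly.
ideal = [1000.0 / rank for rank in range(1, 51)]
s = zipf_exponent(ideal)
```

On real word counts the fit is only approximate, and a plain log-log regression is known to be a rough estimator, so the result should be read as indicative.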

Multi-Sensor Prognostics using an Unsupervised Health Index based on LSTM Encoder-Decoder


Title	Multi-Sensor Prognostics using an Unsupervised Health Index based on LSTM Encoder-Decoder
Authors	Pankaj Malhotra, Vishnu TV, Anusha Ramakrishnan, Gaurangi Anand, Lovekesh Vig, Puneet Agarwal, Gautam Shroff
Abstract	Many approaches for estimating the Remaining Useful Life (RUL) of a machine from its operational sensor data make assumptions about how a system degrades or a fault evolves, e.g., exponential degradation. However, in many domains degradation may not follow a pattern. We propose a Long Short Term Memory based Encoder-Decoder (LSTM-ED) scheme to obtain an unsupervised health index (HI) for a system using multi-sensor time-series data. LSTM-ED is trained to reconstruct the time-series corresponding to the healthy state of a system. The reconstruction error is used to compute the HI, which is then used for RUL estimation. We evaluate our approach on publicly available Turbofan Engine and Milling Machine datasets. We also present results on a real-world industry dataset from a pulverizer mill, where we find significant correlation between the LSTM-ED based HI and maintenance costs.
Tasks	Time Series
Published	2016-08-22
URL	http://arxiv.org/abs/1608.06154v1
PDF	http://arxiv.org/pdf/1608.06154v1.pdf
PWC	https://paperswithcode.com/paper/multi-sensor-prognostics-using-an
Repo
Framework
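
The step from reconstruction error to a health index can be sketched as follows. The abstract above does not state the exact mapping, so this example assumes a simple min-max scaling in which the smallest error maps to HI = 1 and the largest to HI = 0; both function names and the toy data are hypothetical, and the reconstructions are hard-coded instead of coming from a trained LSTM encoder-decoder.

```python
import math


def reconstruction_errors(original, reconstructed):
    """Euclidean distance between each multi-sensor reading and its reconstruction."""
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
        for x, y in zip(original, reconstructed)
    ]


def health_index(errors):
    """Min-max scale errors to [0, 1] and invert them (assumed mapping, see lead-in)."""
    low, high = min(errors), max(errors)
    if high == low:
        return [1.0] * len(errors)
    return [(high - e) / (high - low) for e in errors]


# Toy two-sensor series whose reconstruction drifts further away over time.
readings = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
rebuilt = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
errors = reconstruction_errors(readings, rebuilt)
hi_curve = health_index(errors)
```

A declining curve like this one could then be matched against curves from run-to-failure data to estimate RUL, though that matching step is outside this sketch.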

Semi-supervised Vocabulary-informed Learning


Title	Semi-supervised Vocabulary-informed Learning
Authors	Yanwei Fu, Leonid Sigal
Abstract	Despite significant progress in object categorization in recent years, a number of important challenges remain, mainly the ability to learn from limited labeled data and the ability to recognize object classes within a large, potentially open, set of labels. Zero-shot learning is one way of addressing these challenges, but it has only been shown to work with limited-size class vocabularies and typically requires separation between supervised and unsupervised classes, allowing the former to inform the latter but not vice versa. We propose the notion of semi-supervised vocabulary-informed learning to alleviate the above-mentioned challenges and address problems of supervised, zero-shot and open set recognition using a unified framework. Specifically, we propose a maximum margin framework for semantic manifold-based recognition that incorporates distance constraints from (both supervised and unsupervised) vocabulary atoms, ensuring that labeled samples are projected closer to their correct prototypes in the embedding space than to others. We show that the resulting model yields improvements in supervised, zero-shot, and large open set recognition, with up to a 310K class vocabulary on the AwA and ImageNet datasets.
Tasks	Open Set Learning, Zero-Shot Learning
Published	2016-04-24
URL	http://arxiv.org/abs/1604.07093v1
PDF	http://arxiv.org/pdf/1604.07093v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-vocabulary-informed-learning
Repo
Framework