July 27, 2019

Paper Group ANR 548

An Automatic Solver for Very Large Jigsaw Puzzles Using Genetic Algorithms. Class-specific image denoising using importance sampling. Scalable synthesis of safety certificates from data with application to learning-based control. Attentive Recurrent Comparators. Causal Consistency of Structural Equation Models. Action-Attending Graphic Neural Netwo …

An Automatic Solver for Very Large Jigsaw Puzzles Using Genetic Algorithms


Title	An Automatic Solver for Very Large Jigsaw Puzzles Using Genetic Algorithms
Authors	Dror Sholomon, Eli David, Nathan S. Netanyahu
Abstract	In this paper we propose the first effective genetic algorithm (GA)-based jigsaw puzzle solver. We introduce a novel crossover procedure that merges two “parent” solutions into an improved “child” configuration by detecting, extracting, and combining correctly assembled puzzle segments. The proposed solver exhibits state-of-the-art performance, handling previously attempted puzzles more accurately and efficiently, as well as puzzle sizes that have not been attempted before. The extended experimental results provided in this paper include, among others, a thorough inspection of up to 30,745-piece puzzles (compared to previous attempts on 22,755-piece puzzles), using a considerably faster concurrent implementation of the algorithm. Furthermore, we explore the impact of different phases of the novel crossover operator by experimenting with several variants of the GA. Finally, we compare different fitness functions and their effect on the overall results of the GA-based solver.
Tasks
Published	2017-11-17
URL	http://arxiv.org/abs/1711.06767v1
PDF	http://arxiv.org/pdf/1711.06767v1.pdf
PWC	https://paperswithcode.com/paper/an-automatic-solver-for-very-large-jigsaw
Repo
Framework

Class-specific image denoising using importance sampling


Title	Class-specific image denoising using importance sampling
Authors	Milad Niknejad, Jose M. Bioucas-Dias, Mario A. T. Figueiredo
Abstract	In this paper, we propose a new image denoising method, tailored to specific classes of images, assuming that a dataset of clean images of the same class is available. Similarly to the non-local means (NLM) algorithm, the proposed method computes a weighted average of non-local patches, which we interpret under the importance sampling framework. This viewpoint introduces flexibility regarding the adopted priors, the noise statistics, and the computation of Bayesian estimates. The importance sampling viewpoint is exploited to approximate the minimum mean squared error (MMSE) patch estimates, using the true underlying prior on image patches. The estimates thus obtained converge to the true MMSE estimates, as the number of samples approaches infinity. Experimental results provide evidence that the proposed denoiser outperforms the state-of-the-art in the specific classes of face and text images.
Tasks	Denoising, Image Denoising
Published	2017-06-21
URL	http://arxiv.org/abs/1706.06917v1
PDF	http://arxiv.org/pdf/1706.06917v1.pdf
PWC	https://paperswithcode.com/paper/class-specific-image-denoising-using
Repo
Framework
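
The weighted patch average described in this abstract can be read as a self-normalized importance sampling estimate of the MMSE patch. The sketch below is only an illustration of that reading, not the authors' implementation: it assumes additive Gaussian noise with known standard deviation `sigma`, treats the clean patches of the class dataset as samples from the prior, and uses the hypothetical function name `mmse_patch_estimate`.

```python
import math

def mmse_patch_estimate(noisy_patch, clean_patches, sigma):
    """Approximate E[x | y] by a weighted average of clean patches.

    Assumption: y = x + Gaussian noise with standard deviation sigma, and the
    clean patches are samples from the prior, so each importance weight is
    proportional to the Gaussian likelihood p(y | x_i).
    """
    # Log-likelihood of the noisy patch under each clean candidate patch.
    log_w = [
        -sum((y - x) ** 2 for y, x in zip(noisy_patch, patch)) / (2.0 * sigma ** 2)
        for patch in clean_patches
    ]
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    # Self-normalized weighted average, computed pixel by pixel.
    return [
        sum(wi * patch[k] for wi, patch in zip(w, clean_patches)) / total
        for k in range(len(noisy_patch))
    ]

# A noisy patch halfway between two clean patches is averaged evenly.
estimate = mmse_patch_estimate([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], sigma=0.1)
```

With more clean patches the average covers the prior more densely, which matches the abstract's statement that the estimates converge to the true MMSE estimates as the number of samples grows.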

Scalable synthesis of safety certificates from data with application to learning-based control


Title	Scalable synthesis of safety certificates from data with application to learning-based control
Authors	Kim P. Wabersich, Melanie N. Zeilinger
Abstract	The control of complex systems faces a trade-off between high performance and safety guarantees, which in particular restricts the application of learning-based methods to safety-critical systems. A recently proposed framework to address this issue is the use of a safety controller, which guarantees to keep the system within a safe region of the state space. This paper introduces efficient techniques for the synthesis of a safe set and control law, which offer improved scalability properties by relying on approximations based on convex optimization problems. The first proposed method requires only an approximate linear system model and Lipschitz continuity of the unknown nonlinear dynamics. The second method extends the results by showing how a Gaussian process prior on the unknown system dynamics can be used in order to reduce conservatism of the resulting safe set. We demonstrate the results with numerical examples, including an autonomous convoy of vehicles.
Tasks
Published	2017-11-30
URL	http://arxiv.org/abs/1711.11417v3
PDF	http://arxiv.org/pdf/1711.11417v3.pdf
PWC	https://paperswithcode.com/paper/scalable-synthesis-of-safety-certificates
Repo
Framework

Attentive Recurrent Comparators


Title	Attentive Recurrent Comparators
Authors	Pranav Shyam, Shubham Gupta, Ambedkar Dukkipati
Abstract	Rapid learning requires flexible representations to quickly adapt to new evidence. We develop a novel class of models called Attentive Recurrent Comparators (ARCs) that form representations of objects by cycling through them and making observations. Using the representations extracted by ARCs, we develop a way of approximating a dynamic representation space and use it for one-shot learning. In the task of one-shot classification on the Omniglot dataset, we achieve state-of-the-art performance with an error rate of 1.5%. This represents the first super-human result achieved for this task with a generic model that uses only pixel information.
Tasks	Omniglot, One-Shot Learning
Published	2017-03-02
URL	http://arxiv.org/abs/1703.00767v3
PDF	http://arxiv.org/pdf/1703.00767v3.pdf
PWC	https://paperswithcode.com/paper/attentive-recurrent-comparators
Repo
Framework

Causal Consistency of Structural Equation Models


Title	Causal Consistency of Structural Equation Models
Authors	Paul K. Rubenstein, Sebastian Weichwald, Stephan Bongers, Joris M. Mooij, Dominik Janzing, Moritz Grosse-Wentrup, Bernhard Schölkopf
Abstract	Complex systems can be modelled at various levels of detail. Ideally, causal models of the same system should be consistent with one another in the sense that they agree in their predictions of the effects of interventions. We formalise this notion of consistency in the case of Structural Equation Models (SEMs) by introducing exact transformations between SEMs. This provides a general language to consider, for instance, the different levels of description in the following three scenarios: (a) models with large numbers of variables versus models in which the ‘irrelevant’ or unobservable variables have been marginalised out; (b) micro-level models versus macro-level models in which the macro-variables are aggregate features of the micro-variables; (c) dynamical time series models versus models of their stationary behaviour. Our analysis stresses the importance of well-specified interventions in the causal modelling process and sheds light on the interpretation of cyclic SEMs.
Tasks	Time Series
Published	2017-07-04
URL	http://arxiv.org/abs/1707.00819v1
PDF	http://arxiv.org/pdf/1707.00819v1.pdf
PWC	https://paperswithcode.com/paper/causal-consistency-of-structural-equation
Repo
Framework

Action-Attending Graphic Neural Network


Title	Action-Attending Graphic Neural Network
Authors	Chaolong Li, Zhen Cui, Wenming Zheng, Chunyan Xu, Rongrong Ji, Jian Yang
Abstract	The motion analysis of human skeletons is crucial for human action recognition, which is one of the most active topics in computer vision. In this paper, we propose a fully end-to-end action-attending graphic neural network (A$^2$GNN) for skeleton-based action recognition, in which each irregular skeleton is structured as an undirected attribute graph. To extract high-level semantic representations from skeletons, we perform local spectral graph filtering on the constructed attribute graphs, analogous to the standard image convolution operation. Considering that not all joints are informative for action analysis, we design an action-attending layer to detect salient action units (AUs) by adaptively weighting skeletal joints. Herein the filtering responses are parameterized into a weighting function that is independent of the order of the input nodes. To further encode continuous motion variations, the deep features learnt from skeletal graphs are gathered along consecutive temporal slices and then fed into a recurrent gated network. Finally, the spectral graph filtering, action-attending and recurrent temporal encoding are integrated and trained jointly for the sake of robust action recognition as well as the intelligibility of human actions. To evaluate our A$^2$GNN, we conduct extensive experiments on four benchmark skeleton-based action datasets, including the large-scale challenging NTU RGB+D dataset. The experimental results demonstrate that our network achieves state-of-the-art performance.
Tasks	Skeleton Based Action Recognition, Temporal Action Localization
Published	2017-11-17
URL	http://arxiv.org/abs/1711.06427v1
PDF	http://arxiv.org/pdf/1711.06427v1.pdf
PWC	https://paperswithcode.com/paper/action-attending-graphic-neural-network
Repo
Framework

A Service-Oriented Architecture for Assisting the Authoring of Semantic Crowd Maps


Title	A Service-Oriented Architecture for Assisting the Authoring of Semantic Crowd Maps
Authors	Henrique Santos, Vasco Furtado
Abstract	Although there are increasingly more initiatives for the generation of semantic knowledge based on user participation, there is still a shortage of platforms for regular users to create applications on which semantic data can be exploited and generated automatically. We propose an architecture, called Semantic Maps (SeMaps), for assisting the authoring and hosting of applications in which the maps combine the aggregation of a Geographic Information System and crowd-generated content (called here crowd maps). In these systems, the digital map works as a blackboard for accommodating stories told by people about events they want to share with others typically participating in their social networks. SeMaps offers an environment for the creation and maintenance of sites based on crowd maps with the possibility for the user to characterize semantically that which s/he intends to mark on the map. The designer of a crowd map, by informing a linguistic expression that designates what has to be marked on the maps, is guided in a process that aims to associate a concept from a common-sense base to this linguistic expression. Thus, the crowd maps start to have dominion over common-sense inferential relations that define the meaning of the marker, and are able to make inferences about the network of linked data. This makes it possible to generate maps that have the power to perform inferences and access external sources (such as DBpedia) that constitute information that is useful and appropriate to the context of the map. In this paper we describe the architecture of SeMaps and how it was applied in a crowd map authoring tool.
Tasks	Common Sense Reasoning
Published	2017-04-06
URL	http://arxiv.org/abs/1704.01855v1
PDF	http://arxiv.org/pdf/1704.01855v1.pdf
PWC	https://paperswithcode.com/paper/a-service-oriented-architecture-for-assisting
Repo
Framework

Infinite Latent Feature Selection: A Probabilistic Latent Graph-Based Ranking Approach


Title	Infinite Latent Feature Selection: A Probabilistic Latent Graph-Based Ranking Approach
Authors	Giorgio Roffo, Simone Melzi, Umberto Castellani, Alessandro Vinciarelli
Abstract	Feature selection is playing an increasingly significant role with respect to many computer vision applications spanning from object recognition to visual object tracking. However, most of the recent solutions in feature selection are not robust across different and heterogeneous sets of data. In this paper, we address this issue by proposing a robust probabilistic latent graph-based feature selection algorithm that performs the ranking step while considering all the possible subsets of features, as paths on a graph, bypassing the combinatorial problem analytically. An appealing characteristic of the approach is that it aims to discover an abstraction behind low-level sensory data, that is, relevancy. Relevancy is modelled as a latent variable in a PLSA-inspired generative process that allows the investigation of the importance of a feature when injected into an arbitrary set of cues. The proposed method has been tested on ten diverse benchmarks, and compared against eleven state-of-the-art feature selection methods. Results show that the proposed approach attains the highest performance levels across many different scenarios and difficulties, thereby confirming its strong robustness while setting a new state of the art in the feature selection domain.
Tasks	Feature Selection, Object Recognition, Object Tracking, Visual Object Tracking
Published	2017-07-24
URL	http://arxiv.org/abs/1707.07538v1
PDF	http://arxiv.org/pdf/1707.07538v1.pdf
PWC	https://paperswithcode.com/paper/infinite-latent-feature-selection-a
Repo
Framework

Incremental Import Vector Machines for Classifying Hyperspectral Data


Title	Incremental Import Vector Machines for Classifying Hyperspectral Data
Authors	Ribana Roscher, Björn Waske, Wolfgang Förstner
Abstract	In this paper we propose an incremental learning strategy for import vector machines (IVM), which is a sparse kernel logistic regression approach. We use the procedure for the concept of self-training for sequential classification of hyperspectral data. The strategy comprises the inclusion of new training samples to increase the classification accuracy and the deletion of non-informative samples to be memory- and runtime-efficient. Moreover, we update the parameters in the incremental IVM model without re-training from scratch. Therefore, the incremental classifier is able to deal with large data sets. The performance of the IVM in comparison to support vector machines (SVM) is evaluated in terms of accuracy, and experiments are conducted to assess the potential of the probabilistic outputs of the IVM. Experimental results demonstrate that the IVM and SVM perform similarly in terms of classification accuracy. However, the number of import vectors is significantly lower than the number of support vectors, and thus the computation time during classification can be decreased. Moreover, the probabilities provided by the IVM are more reliable than the probabilistic information derived from an SVM’s output. In addition, the proposed self-training strategy can increase the classification accuracy. Overall, the IVM and its incremental version are worthwhile for the classification of hyperspectral data.
Tasks
Published	2017-08-20
URL	http://arxiv.org/abs/1708.05966v1
PDF	http://arxiv.org/pdf/1708.05966v1.pdf
PWC	https://paperswithcode.com/paper/incremental-import-vector-machines-for
Repo
Framework

Self-Reinforced Cascaded Regression for Face Alignment


Title	Self-Reinforced Cascaded Regression for Face Alignment
Authors	Xin Fan, Risheng Liu, Kang Huyan, Yuyao Feng, Zhongxuan Luo
Abstract	Cascaded regression is prevailing in face alignment thanks to its accuracy and robustness, but typically demands manually annotated examples having low discrepancy between shape-indexed features and shape updates. In this paper, we propose a self-reinforced strategy that iteratively expands the quantity and improves the quality of training examples, thus upgrading the performance of cascaded regression itself. The reinforced term evaluates the example quality based on the consistency of both local appearance and global geometry of human faces, and constitutes the example evolution by the philosophy of “survival of the fittest”. We train a set of discriminative classifiers, each associated with one landmark label, to prune those examples with inconsistent local appearance, and further validate the geometric relationship among groups of labeled landmarks against the common global geometry derived from a projective invariant. We embed this generic strategy into typical cascaded regressions, and the alignment results on several benchmark data sets demonstrate its effectiveness in predicting good examples starting from a small subset.
Tasks	Face Alignment
Published	2017-11-23
URL	http://arxiv.org/abs/1711.08624v1
PDF	http://arxiv.org/pdf/1711.08624v1.pdf
PWC	https://paperswithcode.com/paper/self-reinforced-cascaded-regression-for-face
Repo
Framework

Exploiting Nontrivial Connectivity for Automatic Speech Recognition


Title	Exploiting Nontrivial Connectivity for Automatic Speech Recognition
Authors	Marius Paraschiv, Lasse Borgholt, Tycho Max Sylvester Tax, Marco Singh, Lars Maaløe
Abstract	Nontrivial connectivity has allowed the training of very deep networks by addressing the problem of vanishing gradients and offering a more efficient method of reusing parameters. In this paper we make a comparison between residual networks, densely-connected networks and highway networks on an image classification task. Next, we show that these methodologies can easily be deployed into automatic speech recognition and provide significant improvements to existing models.
Tasks	Image Classification, Speech Recognition
Published	2017-11-28
URL	http://arxiv.org/abs/1711.10271v1
PDF	http://arxiv.org/pdf/1711.10271v1.pdf
PWC	https://paperswithcode.com/paper/exploiting-nontrivial-connectivity-for
Repo
Framework
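
The three connectivity patterns compared in this abstract differ only in how a layer's input is combined with its transformed output. The sketch below shows the standard textbook form of each on plain Python lists; the function names and the way the transform and gate are passed in are illustrative assumptions, not code from the paper.

```python
def residual(x, f):
    """Residual connection: output = x + f(x)."""
    return [xi + fi for xi, fi in zip(x, f(x))]

def highway(x, h, gate):
    """Highway connection: output = t * h(x) + (1 - t) * x.

    gate(x) returns per-element values t in [0, 1]; in a real network this
    would be a learned sigmoid layer (assumed here, not implemented).
    """
    return [ti * hi + (1.0 - ti) * xi for xi, hi, ti in zip(x, h(x), gate(x))]

def dense(x, f):
    """Dense connection: output = concatenation of x and f(x)."""
    return list(x) + list(f(x))

# Toy transform standing in for a learned layer.
double = lambda v: [2.0 * a for a in v]

res_out = residual([1.0, 2.0], double)
hwy_out = highway([1.0, 2.0], double, lambda v: [0.0] * len(v))
dns_out = dense([1.0, 2.0], double)
```

In all three cases the input reaches the output through a path that bypasses the transform, which is the property the abstract credits with easing the vanishing gradient problem.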

Influence of Resampling on Accuracy of Imbalanced Classification


Title	Influence of Resampling on Accuracy of Imbalanced Classification
Authors	Evgeny Burnaev, Pavel Erofeev, Artem Papanov
Abstract	In many real-world binary classification tasks (e.g. detection of certain objects in images), the available dataset is imbalanced, i.e., it has far fewer representatives of one class (the minor class) than of the other. Generally, accurate prediction of the minor class is crucial, but it is hard to achieve since there is not much information about the minor class. One approach to deal with this problem is to resample the dataset beforehand, i.e., add new elements to the dataset or remove existing ones. Resampling can be done in various ways, which raises the problem of choosing the most appropriate one. In this paper we experimentally investigate the impact of resampling on classification accuracy, compare resampling methods, and highlight key points and difficulties of resampling.
Tasks
Published	2017-07-12
URL	http://arxiv.org/abs/1707.03905v1
PDF	http://arxiv.org/pdf/1707.03905v1.pdf
PWC	https://paperswithcode.com/paper/influence-of-resampling-on-accuracy-of
Repo
Framework
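
As one concrete instance of "adding new elements to the dataset", the sketch below implements plain random oversampling: minority-class samples are duplicated at random until every class is as large as the largest one. It is a generic illustration and not necessarily one of the methods evaluated in the paper; the function name `random_oversample` and the `seed` parameter are assumptions made for the example.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # Original samples of this class, used as the pool to draw from.
        pool = [s for s, l in zip(samples, labels) if l == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels

# Four samples of class 0 and one of class 1 become four of each.
X, y = random_oversample([[0], [1], [2], [3], [10]], [0, 0, 0, 0, 1])
```

Oversampling of this kind adds no new information about the minor class, so any resampling should be applied to the training split only and its effect checked on untouched test data.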

Deep Neural Networks for Multiple Speaker Detection and Localization


Title	Deep Neural Networks for Multiple Speaker Detection and Localization
Authors	Weipeng He, Petr Motlicek, Jean-Marc Odobez
Abstract	We propose to use neural networks for simultaneous detection and localization of multiple sound sources in human-robot interaction. In contrast to conventional signal processing techniques, neural network-based sound source localization methods require fewer strong assumptions about the environment. Previous neural network-based methods have focused on localizing a single sound source and do not extend to multiple sources in terms of detection and localization. In this paper, we thus propose a likelihood-based encoding of the network output, which naturally allows the detection of an arbitrary number of sources. In addition, we investigate the use of sub-band cross-correlation information as features for better localization in sound mixtures, as well as three different network architectures based on different motivations. Experiments on real data recorded from a robot show that our proposed methods significantly outperform the popular spatial spectrum-based approaches.
Tasks
Published	2017-11-30
URL	http://arxiv.org/abs/1711.11565v3
PDF	http://arxiv.org/pdf/1711.11565v3.pdf
PWC	https://paperswithcode.com/paper/deep-neural-networks-for-multiple-speaker
Repo
Framework

A Modified Construction for a Support Vector Classifier to Accommodate Class Imbalances


Title	A Modified Construction for a Support Vector Classifier to Accommodate Class Imbalances
Authors	Matt Parker, Colin Parker
Abstract	Given a training set with binary classification, the Support Vector Machine identifies the hyperplane maximizing the margin between the two classes of training data. This general formulation is useful in that it can be applied without regard to variance differences between the classes. Ignoring these differences is not optimal, however, as the general SVM will give the class with lower variance an unjustifiably wide berth. This increases the chance of misclassification of the other class and results in an overall loss of predictive performance. An alternate construction is proposed in which the margins of the separating hyperplane are different for each class, each proportional to the standard deviation of its class along the direction perpendicular to the hyperplane. The construction agrees with the SVM in the case of equal class variances. This paper will then examine the impact to the dual representation of the modified constraint equations.
Tasks
Published	2017-02-08
URL	http://arxiv.org/abs/1702.02555v2
PDF	http://arxiv.org/pdf/1702.02555v2.pdf
PWC	https://paperswithcode.com/paper/a-modified-construction-for-a-support-vector
Repo
Framework

Topic Identification for Speech without ASR


Title	Topic Identification for Speech without ASR
Authors	Chunxi Liu, Jan Trmal, Matthew Wiesner, Craig Harman, Sanjeev Khudanpur
Abstract	Modern topic identification (topic ID) systems for speech use automatic speech recognition (ASR) to produce speech transcripts, and perform supervised classification on such ASR outputs. However, under resource-limited conditions, the manually transcribed speech required to develop standard ASR systems can be severely limited or unavailable. In this paper, we investigate alternative unsupervised solutions to obtaining tokenizations of speech in terms of a vocabulary of automatically discovered word-like or phoneme-like units, without depending on the supervised training of ASR systems. Moreover, using automatic phoneme-like tokenizations, we demonstrate that a convolutional neural network based framework for learning spoken document representations provides competitive performance compared to a standard bag-of-words representation, as evidenced by comprehensive topic ID evaluations on both single-label and multi-label classification tasks.
Tasks	Multi-Label Classification, Speech Recognition
Published	2017-03-22
URL	http://arxiv.org/abs/1703.07476v2
PDF	http://arxiv.org/pdf/1703.07476v2.pdf
PWC	https://paperswithcode.com/paper/topic-identification-for-speech-without-asr
Repo
Framework