Paper Group ANR 233
Locally Adaptive Learning Loss for Semantic Image Segmentation
Title | Locally Adaptive Learning Loss for Semantic Image Segmentation |
Authors | Jinjiang Guo, Pengyuan Ren, Aiguo Gu, Jian Xu, Weixin Wu |
Abstract | We propose a novel locally adaptive learning estimator for enhancing the inter- and intra-class discriminative capabilities of Deep Neural Networks, which can be used as an improved loss layer for semantic image segmentation tasks. Most loss layers compute a pixel-wise cost between feature maps and ground truths, ignoring spatial layouts and interactions between neighboring pixels of the same object category, and thus networks are not sufficiently sensitive to intra-class connections. Stride by stride, our method first applies an adaptive pooling filter over the predicted feature maps, aiming to merge the predicted distributions over a small group of neighboring pixels of the same category, and then computes the cost between the merged distribution vector and their category label. This design lets groups of neighboring predictions from the same category contribute jointly to the estimate of prediction correctness with respect to their category, and hence trains networks to be more sensitive to regional connections between adjacent pixels based on their categories. In experiments on the Pascal VOC 2012 segmentation datasets, the consistently improved results show that our proposed approach achieves better segmentation masks than previous counterparts. |
Tasks | Semantic Segmentation |
Published | 2018-02-23 |
URL | http://arxiv.org/abs/1802.08290v2 |
http://arxiv.org/pdf/1802.08290v2.pdf | |
PWC | https://paperswithcode.com/paper/locally-adaptive-learning-loss-for-semantic |
Repo | |
Framework | |
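The pooling-then-cost idea in the abstract above can be illustrated with a minimal pure-Python sketch. It makes several assumptions that are not in the abstract: the "adaptive pooling filter" is taken to be a plain average over same-category pixels inside each window, the cost is cross-entropy, and the function name `locally_pooled_loss` and its `window` and `stride` parameters are hypothetical. It is not the authors' implementation.

```python
import math


def locally_pooled_loss(probs, labels, window=2, stride=2):
    """Average cross-entropy over merged same-category groups.

    probs[i][j] is a list of predicted class probabilities for pixel (i, j);
    labels[i][j] is the integer ground-truth class of that pixel.
    """
    height, width = len(labels), len(labels[0])
    total, groups = 0.0, 0
    for top in range(0, height, stride):
        for left in range(0, width, stride):
            # Group the pixels of this window by their ground-truth category.
            by_class = {}
            for i in range(top, min(top + window, height)):
                for j in range(left, min(left + window, width)):
                    by_class.setdefault(labels[i][j], []).append(probs[i][j])
            for cls, members in by_class.items():
                # Merge the group's predictions (assumption: plain average)
                # and score the merged probability of the group's category.
                merged = sum(p[cls] for p in members) / len(members)
                total += -math.log(max(merged, 1e-12))
                groups += 1
    return total / groups
```

Under these assumptions, a group's loss depends on the averaged prediction for the group's category rather than on each pixel in isolation, which is one way to make the cost depend on neighboring pixels of the same category.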
Learning with Analytical Models
Title | Learning with Analytical Models |
Authors | Huda Ibeid, Siping Meng, Oliver Dobon, Luke Olson, William Gropp |
Abstract | To understand and predict the performance of scientific applications, several analytical and machine learning approaches have been proposed, each having its advantages and disadvantages. In this paper, we propose and validate a hybrid approach for performance modeling and prediction, which combines analytical and machine learning models. The proposed hybrid model aims to minimize prediction cost while providing reasonable prediction accuracy. Our validation results show that the hybrid model is able to learn and correct the analytical models to better match the actual performance. Furthermore, the proposed hybrid model improves the prediction accuracy in comparison to pure machine learning techniques while using small training datasets, thus making it suitable for hardware and workload changes. |
Tasks | |
Published | 2018-10-28 |
URL | http://arxiv.org/abs/1810.11772v2 |
http://arxiv.org/pdf/1810.11772v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-with-analytical-models |
Repo | |
Framework | |
Formal Context Generation using Dirichlet Distributions
Title | Formal Context Generation using Dirichlet Distributions |
Authors | Maximilian Felde, Tom Hanika |
Abstract | We suggest an improved way to randomly generate formal contexts based on Dirichlet distributions. For this purpose we investigate the predominant way to generate formal contexts, a coin-tossing model, recapitulate some of its shortcomings and examine its stochastic model. Building on this we propose our Dirichlet model and develop an algorithm employing this idea. By comparing our generation model to a coin-tossing model we show that our approach is a significant improvement with respect to the variety of contexts generated. Finally, we outline a possible application in null model generation for formal contexts. |
Tasks | |
Published | 2018-09-28 |
URL | http://arxiv.org/abs/1809.11160v1 |
http://arxiv.org/pdf/1809.11160v1.pdf | |
PWC | https://paperswithcode.com/paper/formal-context-generation-using-dirichlet |
Repo | |
Framework | |
Non-Factorised Variational Inference in Dynamical Systems
Title | Non-Factorised Variational Inference in Dynamical Systems |
Authors | Alessandro Davide Ialongo, Mark van der Wilk, James Hensman, Carl Edward Rasmussen |
Abstract | We focus on variational inference in dynamical systems where the discrete time transition function (or evolution rule) is modelled by a Gaussian process. The dominant approach so far has been to use a factorised posterior distribution, decoupling the transition function from the system states. This is not exact in general and can lead to an overconfident posterior over the transition function as well as an overestimation of the intrinsic stochasticity of the system (process noise). We propose a new method that addresses these issues and incurs no additional computational costs. |
Tasks | |
Published | 2018-12-14 |
URL | http://arxiv.org/abs/1812.06067v1 |
http://arxiv.org/pdf/1812.06067v1.pdf | |
PWC | https://paperswithcode.com/paper/non-factorised-variational-inference-in |
Repo | |
Framework | |
Parser Training with Heterogeneous Treebanks
Title | Parser Training with Heterogeneous Treebanks |
Authors | Sara Stymne, Miryam de Lhoneux, Aaron Smith, Joakim Nivre |
Abstract | How to make the most of multiple heterogeneous treebanks when training a monolingual dependency parser is an open question. We start by investigating previously suggested, but little evaluated, strategies for exploiting multiple treebanks based on concatenating training sets, with or without fine-tuning. We go on to propose a new method based on treebank embeddings. We perform experiments for several languages and show that in many cases fine-tuning and treebank embeddings lead to substantial improvements over single treebanks or concatenation, with average gains of 2.0–3.5 LAS points. We argue that treebank embeddings should be preferred due to their conceptual simplicity, flexibility and extensibility. |
Tasks | |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05089v1 |
http://arxiv.org/pdf/1805.05089v1.pdf | |
PWC | https://paperswithcode.com/paper/parser-training-with-heterogeneous-treebanks |
Repo | |
Framework | |
Inference of Users Demographic Attributes based on Homophily in Communication Networks
Title | Inference of Users Demographic Attributes based on Homophily in Communication Networks |
Authors | Jorge Brea, Javier Burroni, Carlos Sarraute |
Abstract | Over the past decade, mobile phones have become prevalent in all parts of the world, across all demographic backgrounds. Mobile phones are used by men and women across a wide age range in both developed and developing countries. Consequently, they have become one of the most important mechanisms for social interaction within a population, making them an increasingly important source of information to understand human demographics and human behaviour. In this work we combine two sources of information: communication logs from a major mobile operator in a Latin American country, and information on the demographics of a subset of the user population. This allows us to perform an observational study of mobile phone usage, differentiated by age group. This study is interesting in its own right, since it provides knowledge on the structure and demographics of the mobile phone market in the studied country. We then tackle the problem of inferring the age group for all users in the network. We present here an exclusively graph-based inference method relying solely on the topological structure of the mobile network, together with a topological analysis of the performance of the algorithm. The equations for our algorithm can be described as a diffusion process with two added properties: (i) memory of its initial state, and (ii) the information is propagated as a probability vector for each node attribute (instead of the value of the attribute itself). Our algorithm can successfully infer different age groups within the network population given known values for a subset of nodes (seed nodes). Most interestingly, we show that by carefully analysing the topological relationships between correctly predicted nodes and the seed nodes, we can characterize particular subsets of nodes for which our inference method has significantly higher accuracy. |
Tasks | |
Published | 2018-08-01 |
URL | http://arxiv.org/abs/1808.00527v1 |
http://arxiv.org/pdf/1808.00527v1.pdf | |
PWC | https://paperswithcode.com/paper/inference-of-users-demographic-attributes |
Repo | |
Framework | |
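The two properties named in the abstract above, memory of the initial state and propagation of a probability vector per node, can be sketched with a generic diffusion update. The abstract does not give the paper's equations, so the update rule below (a convex mix of the node's initial vector and the average of its neighbours' current vectors), the `memory` weight, the uniform prior for non-seed nodes, and the name `diffuse_labels` are all assumptions made for illustration.

```python
def diffuse_labels(neighbors, seeds, num_groups, memory=0.5, steps=50):
    """Propagate per-node probability vectors over an undirected graph.

    neighbors: dict mapping each node to a list of adjacent nodes.
    seeds: dict mapping seed nodes to a known probability vector.
    """
    uniform = [1.0 / num_groups] * num_groups
    # Non-seed nodes start from an uninformative prior (assumption).
    initial = {v: list(seeds.get(v, uniform)) for v in neighbors}
    state = {v: list(p) for v, p in initial.items()}
    for _ in range(steps):
        new_state = {}
        for v, nbrs in neighbors.items():
            if nbrs:
                avg = [sum(state[u][k] for u in nbrs) / len(nbrs)
                       for k in range(num_groups)]
            else:
                avg = state[v]
            # Memory term: every update is pulled back toward the initial state.
            new_state[v] = [memory * initial[v][k] + (1 - memory) * avg[k]
                            for k in range(num_groups)]
        state = new_state
    return state


# Hypothetical path graph a - b - c - d with two seed nodes at the ends.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
result = diffuse_labels(graph, {"a": [1.0, 0.0], "d": [0.0, 1.0]}, num_groups=2)
```

In this toy graph, node `b` ends up leaning toward the group of its nearer seed `a`, and node `c` toward the group of `d`; the predicted group for a node would be the index of the largest entry in its vector.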
The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization
Title | The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization |
Authors | Constantinos Daskalakis, Ioannis Panageas |
Abstract | Motivated by applications in Optimization, Game Theory, and the training of Generative Adversarial Networks, the convergence properties of first order methods in min-max problems have received extensive study. It has been recognized that they may cycle, and there is no good understanding of their limit points when they do not. When they converge, do they converge to local min-max solutions? We characterize the limit points of two basic first order methods, namely Gradient Descent/Ascent (GDA) and Optimistic Gradient Descent Ascent (OGDA). We show that both dynamics avoid unstable critical points for almost all initializations. Moreover, for small step sizes and under mild assumptions, the set of OGDA-stable critical points is a superset of GDA-stable critical points, which is a superset of local min-max solutions (strict in some cases). The connecting thread is that the behavior of these dynamics can be studied from a dynamical systems perspective. |
Tasks | |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.03907v1 |
http://arxiv.org/pdf/1807.03907v1.pdf | |
PWC | https://paperswithcode.com/paper/the-limit-points-of-optimistic-gradient |
Repo | |
Framework | |
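The difference between the two dynamics named in the abstract above can be seen on a small example. The sketch below runs both on the bilinear objective f(x, y) = x * y (minimise over x, maximise over y), whose only critical point is the origin. The choice of objective, step size, iteration count, and the initialisation of the "previous gradient" in OGDA are assumptions for illustration and are not taken from the abstract.

```python
import math


def gda(x, y, lr, steps):
    """Simultaneous gradient descent/ascent on f(x, y) = x * y."""
    for _ in range(steps):
        # df/dx = y (descent step on x), df/dy = x (ascent step on y).
        x, y = x - lr * y, y + lr * x
    return math.hypot(x, y)  # distance from the critical point (0, 0)


def ogda(x, y, lr, steps):
    """Optimistic GDA: step with 2 * current gradient minus previous gradient."""
    prev_gx, prev_gy = y, x  # assumption: previous gradient = first gradient
    for _ in range(steps):
        gx, gy = y, x
        x, y = (x - 2 * lr * gx + lr * prev_gx,
                y + 2 * lr * gy - lr * prev_gy)
        prev_gx, prev_gy = gx, gy
    return math.hypot(x, y)


gda_distance = gda(1.0, 1.0, lr=0.1, steps=2000)
ogda_distance = ogda(1.0, 1.0, lr=0.1, steps=2000)
```

On this objective each GDA step multiplies the distance from the origin by sqrt(1 + lr^2), so the iterates spiral outward, while the OGDA iterates contract toward the origin for this step size. The example shows one point that is stable for OGDA but not for GDA, in line with the superset relation stated in the abstract.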
On the Persistence of Clustering Solutions and True Number of Clusters in a Dataset
Title | On the Persistence of Clustering Solutions and True Number of Clusters in a Dataset |
Authors | Amber Srivastava, Mayank Baranwal, Srinivasa Salapaka |
Abstract | Typically clustering algorithms provide clustering solutions with a prespecified number of clusters. The lack of a priori knowledge on the true number of underlying clusters in the dataset makes it important to have a metric to compare the clustering solutions with different numbers of clusters. This article quantifies a notion of persistence of clustering solutions that enables comparing solutions with different numbers of clusters. The persistence relates to the range of data-resolution scales over which a clustering solution persists; it is quantified in terms of the maximum over two-norms of all the associated cluster-covariance matrices. Thus we associate a persistence value with each element in a set of clustering solutions with different numbers of clusters. We show that, for datasets where the natural clusters are known a priori, the clustering solutions that identify the natural clusters are most persistent - in this way, this notion can be used to identify solutions with the true number of clusters. Detailed experiments on a variety of standard and synthetic datasets demonstrate that the proposed persistence-based indicator outperforms existing approaches, such as the gap-statistic method, X-means, G-means, PG-means, dip-means algorithms and the information-theoretic method, in accurately identifying the clustering solutions with the true number of clusters. Interestingly, our method can be explained in terms of the phase-transition phenomenon in the deterministic annealing algorithm, where the number of distinct cluster centers changes (bifurcates) with respect to an annealing parameter. |
Tasks | |
Published | 2018-10-31 |
URL | http://arxiv.org/abs/1811.00102v2 |
http://arxiv.org/pdf/1811.00102v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-persistence-of-clustering-solutions |
Repo | |
Framework | |
Knowledge-Embedded Representation Learning for Fine-Grained Image Recognition
Title | Knowledge-Embedded Representation Learning for Fine-Grained Image Recognition |
Authors | Tianshui Chen, Liang Lin, Riquan Chen, Yang Wu, Xiaonan Luo |
Abstract | Humans can naturally understand an image in depth with the aid of rich knowledge accumulated from daily lives or professions. For example, achieving fine-grained image recognition (e.g., categorizing hundreds of subordinate categories of birds) usually requires a comprehensive visual concept organization including category labels and part-level attributes. In this work, we investigate how to unify rich professional knowledge with deep neural network architectures and propose a Knowledge-Embedded Representation Learning (KERL) framework for handling the problem of fine-grained image recognition. Specifically, we organize the rich visual concepts in the form of a knowledge graph and employ a Gated Graph Neural Network to propagate node messages through the graph for generating the knowledge representation. By introducing a novel gated mechanism, our KERL framework incorporates this knowledge representation into the discriminative image feature learning, i.e., implicitly associating the specific attributes with the feature maps. Compared with existing methods of fine-grained image classification, our KERL framework has several appealing properties: i) The embedded high-level knowledge enhances the feature representation, thus facilitating distinguishing the subtle differences among subordinate categories. ii) Our framework can learn feature maps with a meaningful configuration in which the highlighted regions finely accord with the nodes (specific attributes) of the knowledge graph. Extensive experiments on the widely used Caltech-UCSD bird dataset demonstrate the superiority of our KERL framework over existing state-of-the-art methods. |
Tasks | Fine-Grained Image Classification, Fine-Grained Image Recognition, Image Classification, Representation Learning |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00505v1 |
http://arxiv.org/pdf/1807.00505v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-embedded-representation-learning |
Repo | |
Framework | |
Piecewise classifier mappings: Learning fine-grained learners for novel categories with few examples
Title | Piecewise classifier mappings: Learning fine-grained learners for novel categories with few examples |
Authors | Xiu-Shen Wei, Peng Wang, Lingqiao Liu, Chunhua Shen, Jianxin Wu |
Abstract | Humans are capable of learning a new fine-grained concept with very little supervision, e.g., few exemplary images for a species of bird, yet our best deep learning systems need hundreds or thousands of labeled examples. In this paper, we try to reduce this gap by studying the fine-grained image recognition problem in a challenging few-shot learning setting, termed few-shot fine-grained recognition (FSFG). The task of FSFG requires the learning systems to build classifiers for novel fine-grained categories from few examples (only one or fewer than five). To solve this problem, we propose an end-to-end trainable deep network which is inspired by the state-of-the-art fine-grained recognition model and is tailored for the FSFG task. Specifically, our network consists of a bilinear feature learning module and a classifier mapping module: while the former encodes the discriminative information of an exemplar image into a feature vector, the latter maps the intermediate feature into the decision boundary of the novel category. The key novelty of our model is a “piecewise mappings” function in the classifier mapping module, which generates the decision boundary via learning a set of more attainable sub-classifiers in a more parameter-economic way. We learn the exemplar-to-classifier mapping based on an auxiliary dataset in a meta-learning fashion, which is expected to be able to generalize to novel categories. By conducting comprehensive experiments on three fine-grained datasets, we demonstrate that the proposed method achieves superior performance over the competing baselines. |
Tasks | Few-Shot Learning, Fine-Grained Image Recognition, Meta-Learning |
Published | 2018-05-11 |
URL | https://arxiv.org/abs/1805.04288v2 |
https://arxiv.org/pdf/1805.04288v2.pdf | |
PWC | https://paperswithcode.com/paper/piecewise-classifier-mappings-learning-fine |
Repo | |
Framework | |
A comparable study of modeling units for end-to-end Mandarin speech recognition
Title | A comparable study of modeling units for end-to-end Mandarin speech recognition |
Authors | Wei Zou, Dongwei Jiang, Shuaijiang Zhao, Xiangang Li |
Abstract | End-to-end speech recognition has become increasingly popular for Mandarin speech recognition and has achieved strong performance. Mandarin is a tonal language, which differs from English and requires special treatment of the acoustic modeling units. Several kinds of modeling units have been used for Mandarin, such as phoneme, syllable and Chinese character. In this work, we explore two major end-to-end models for Mandarin speech recognition: the connectionist temporal classification (CTC) model and the attention-based encoder-decoder model. We compare the performance of three modeling units of different scales: context-dependent phoneme (CDP), syllable with tone, and Chinese character. We find that all types of modeling units achieve a similar character error rate (CER) in the CTC model, and that the Chinese character attention model performs better than the syllable attention model. Furthermore, we find that the Chinese character is a reasonable unit for Mandarin speech recognition. On the DidiCallcenter task, the Chinese character attention model achieves a CER of 5.68% and the CTC model a CER of 7.29%; on the DidiReading task, the CERs are 4.89% and 5.79%, respectively. Moreover, the attention model outperforms the CTC model on both datasets. |
Tasks | End-To-End Speech Recognition, Speech Recognition |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.03832v2 |
http://arxiv.org/pdf/1805.03832v2.pdf | |
PWC | https://paperswithcode.com/paper/a-comparable-study-of-modeling-units-for-end |
Repo | |
Framework | |
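The character error rate (CER) reported in the abstract above is commonly computed as the character-level edit distance between the reference and the hypothesis, divided by the reference length. The sketch below follows that common definition; the abstract does not say how the authors scored their systems, so the details (no text normalisation, unit cost for each insertion, deletion and substitution) and the function names are assumptions.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_char != hyp_char),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical transcripts: one substituted character out of six.
example = cer("今天天气很好", "今天天汽很好")
```

Because Chinese text is scored per character here, the same function applies regardless of whether the recogniser's internal modeling unit was a phoneme, a syllable, or a character, provided its output is first converted to characters.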
Adaptive motor control and learning in a spiking neural network realised on a mixed-signal neuromorphic processor
Title | Adaptive motor control and learning in a spiking neural network realised on a mixed-signal neuromorphic processor |
Authors | Sebastian Glatz, Julien N. P. Martel, Raphaela Kreiser, Ning Qiao, Yulia Sandamirskaya |
Abstract | Neuromorphic computing is a new paradigm for design of both the computing hardware and algorithms inspired by biological neural networks. The event-based nature and the inherent parallelism make neuromorphic computing a promising paradigm for building efficient neural network based architectures for control of fast and agile robots. In this paper, we present a spiking neural network architecture that uses sensory feedback to control the rotational velocity of a robotic vehicle. When the velocity reaches the target value, the mapping from the target velocity of the vehicle to the correct motor command, both represented in the spiking neural network on the neuromorphic device, is autonomously stored on the device using on-chip plastic synaptic weights. We validate the controller using a wheel motor of a miniature mobile vehicle and an inertial measurement unit as the sensory feedback and demonstrate online learning of a simple ‘inverse model’ in a two-layer spiking neural network on the neuromorphic chip. The prototype neuromorphic device that features 256 spiking neurons allows us to realise a simple proof of concept architecture for the purely neuromorphic motor control and learning. The architecture can be easily scaled up if a larger neuromorphic device is available. |
Tasks | |
Published | 2018-10-25 |
URL | http://arxiv.org/abs/1810.10801v1 |
http://arxiv.org/pdf/1810.10801v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-motor-control-and-learning-in-a |
Repo | |
Framework | |
Optimal Single Sample Tests for Structured versus Unstructured Network Data
Title | Optimal Single Sample Tests for Structured versus Unstructured Network Data |
Authors | Guy Bresler, Dheeraj Nagaraj |
Abstract | We study the problem of testing, using only a single sample, between mean field distributions (like Curie-Weiss, Erdős-Rényi) and structured Gibbs distributions (like Ising model on sparse graphs and Exponential Random Graphs). Our goal is to test without knowing the parameter values of the underlying models: only the structure of dependencies is known. We develop a new approach that applies to both the Ising and Exponential Random Graph settings based on a general and natural statistical test. The test can distinguish the hypotheses with high probability above a certain threshold in the (inverse) temperature parameter, and is optimal in that below the threshold no test can distinguish the hypotheses. The thresholds do not correspond to the presence of long-range order in the models. By aggregating information at a global scale, our test works even at very high temperatures. The proofs are based on distributional approximation and sharp concentration of quadratic forms, when restricted to Hamming spheres. The restriction to Hamming spheres is necessary, since otherwise any scalar statistic is useless without explicit knowledge of the temperature parameter. At the same time, this restriction radically changes the behavior of the functions under consideration, resulting in a much smaller variance than in the independent setting; this makes it hard to directly apply standard methods (i.e., Stein’s method) for concentration of weakly dependent variables. Instead, we carry out an additional tensorization argument using a Markov chain that respects the symmetry of the Hamming sphere. |
Tasks | |
Published | 2018-02-17 |
URL | http://arxiv.org/abs/1802.06186v2 |
http://arxiv.org/pdf/1802.06186v2.pdf | |
PWC | https://paperswithcode.com/paper/optimal-single-sample-tests-for-structured |
Repo | |
Framework | |
Cascaded V-Net using ROI masks for brain tumor segmentation
Title | Cascaded V-Net using ROI masks for brain tumor segmentation |
Authors | Adrià Casamitjana, Marcel Catà, Irina Sánchez, Marc Combalia, Verónica Vilaplana |
Abstract | In this work we approach the brain tumor segmentation problem with a cascade of two CNNs inspired by the V-Net architecture \cite{VNet}, reformulating residual connections and making use of ROI masks to constrain the networks to train only on relevant voxels. This architecture allows dense training on problems with highly skewed class distributions, such as brain tumor segmentation, by focusing training only on the vicinity of the tumor area. We report results on BraTS2017 Training and Validation sets. |
Tasks | Brain Tumor Segmentation |
Published | 2018-12-30 |
URL | http://arxiv.org/abs/1812.11588v1 |
http://arxiv.org/pdf/1812.11588v1.pdf | |
PWC | https://paperswithcode.com/paper/cascaded-v-net-using-roi-masks-for-brain |
Repo | |
Framework | |
A3T: Adversarially Augmented Adversarial Training
Title | A3T: Adversarially Augmented Adversarial Training |
Authors | Akram Erraqabi, Aristide Baratin, Yoshua Bengio, Simon Lacoste-Julien |
Abstract | Recent research showed that deep neural networks are highly sensitive to so-called adversarial perturbations, which are tiny perturbations of the input data purposely designed to fool a machine learning classifier. Most classification models, including deep learning models, are highly vulnerable to adversarial attacks. In this work, we investigate a procedure to improve adversarial robustness of deep neural networks through enforcing representation invariance. The idea is to train the classifier jointly with a discriminator attached to one of its hidden layers and trained to filter the adversarial noise. We perform preliminary experiments to test the viability of the approach and to compare it to other standard adversarial training methods. |
Tasks | |
Published | 2018-01-12 |
URL | http://arxiv.org/abs/1801.04055v1 |
http://arxiv.org/pdf/1801.04055v1.pdf | |
PWC | https://paperswithcode.com/paper/a3t-adversarially-augmented-adversarial |
Repo | |
Framework | |