October 20, 2019

Paper Group AWR 182

Calibrated Prediction Intervals for Neural Network Regressors. CompNet: Complementary Segmentation Network for Brain MRI Extraction. Geo-Supervised Visual Depth Prediction. Parameter Transfer Extreme Learning Machine based on Projective Model. Bio-inspired digit recognition using reward-modulated spike-timing-dependent plasticity in deep convolutio …

Calibrated Prediction Intervals for Neural Network Regressors


Title	Calibrated Prediction Intervals for Neural Network Regressors
Authors	Gil Keren, Nicholas Cummins, Björn Schuller
Abstract	Ongoing developments in neural network models are continually advancing the state of the art in terms of system accuracy. However, the predicted labels should not be regarded as the only core output; also important is a well-calibrated estimate of the prediction uncertainty. Such estimates and their calibration are critical in many practical applications. Despite their obvious aforementioned advantage in relation to accuracy, contemporary neural networks can, generally, be regarded as poorly calibrated and as such do not produce reliable output probability estimates. Further, while post-processing calibration solutions can be found in the relevant literature, these tend to be for systems performing classification. In this regard, we herein present two novel methods for acquiring calibrated prediction intervals for neural network regressors: empirical calibration and temperature scaling. In experiments using different regression tasks from the audio and computer vision domains, we find that both our proposed methods are indeed capable of producing calibrated prediction intervals for neural network regressors with any desired confidence level, a finding that is consistent across all datasets and neural network architectures we experimented with. In addition, we derive an additional practical recommendation for producing more accurate calibrated prediction intervals. We release the source code implementing our proposed methods for computing calibrated prediction intervals.
Tasks	Calibration
Published	2018-03-26
URL	http://arxiv.org/abs/1803.09546v3
PDF	http://arxiv.org/pdf/1803.09546v3.pdf
PWC	https://paperswithcode.com/paper/calibrated-prediction-intervals-for-neural
Repo	https://github.com/cruvadom/Prediction_Intervals
Framework	none
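
The abstract does not spell out how the intervals are computed, so the following is only a minimal sketch of the general idea of calibrating intervals on held-out data. It is an assumption-laden illustration, not the authors' method: it builds a constant-width interval from the quantile of absolute residuals on a calibration split, and the names `interval_half_width` and `coverage` are hypothetical.

```python
import numpy as np

def interval_half_width(val_true, val_pred, confidence):
    """Half-width h such that |y - y_hat| <= h for about `confidence` of held-out points."""
    residuals = np.abs(np.asarray(val_true) - np.asarray(val_pred))
    return float(np.quantile(residuals, confidence))

def coverage(y_true, y_pred, half_width):
    """Fraction of targets that fall inside [y_hat - h, y_hat + h]."""
    errors = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    return float(np.mean(errors <= half_width))

# Synthetic stand-in for a regressor: predictions are the truth plus noise.
rng = np.random.default_rng(0)
y = rng.normal(size=4000)
pred = y + rng.normal(scale=0.5, size=4000)

# Calibrate on the first half, then check coverage on the second half.
h = interval_half_width(y[:2000], pred[:2000], confidence=0.9)
cov = coverage(y[2000:], pred[2000:], h)
```

A calibrated interval is one whose empirical coverage on unseen data is close to the requested confidence level, which is what `cov` measures here.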

CompNet: Complementary Segmentation Network for Brain MRI Extraction


Title	CompNet: Complementary Segmentation Network for Brain MRI Extraction
Authors	Raunak Dey, Yi Hong
Abstract	Brain extraction is a fundamental step for most brain imaging studies. In this paper, we investigate the problem of skull stripping and propose complementary segmentation networks (CompNets) to accurately extract the brain from T1-weighted MRI scans, for both normal and pathological brain images. The proposed networks are designed in the framework of encoder-decoder networks and have two pathways to learn features from both the brain tissue and its complementary part located outside of the brain. The complementary pathway extracts the features in the non-brain region and leads to a robust solution to brain extraction from MRIs with pathologies, which do not exist in our training dataset. We demonstrate the effectiveness of our networks by evaluating them on the OASIS dataset, resulting in state-of-the-art performance under the two-fold cross-validation setting. Moreover, the robustness of our networks is verified by testing on images with introduced pathologies and by showing their invariance to unseen brain pathologies. In addition, our complementary network design is general and can be extended to address other image segmentation problems with better generalization.
Tasks	Semantic Segmentation, Skull Stripping
Published	2018-03-27
URL	http://arxiv.org/abs/1804.00521v2
PDF	http://arxiv.org/pdf/1804.00521v2.pdf
PWC	https://paperswithcode.com/paper/compnet-complementary-segmentation-network
Repo	https://github.com/raun1/ISBI-2020-LITS_Hybrid_Comp_Net
Framework	tf

Geo-Supervised Visual Depth Prediction


Title	Geo-Supervised Visual Depth Prediction
Authors	Xiaohan Fei, Alex Wong, Stefano Soatto
Abstract	We propose using global orientation from inertial measurements, and the bias it induces on the shape of objects populating the scene, to inform visual 3D reconstruction. We test the effect of using the resulting prior in depth prediction from a single image, where the normal vectors to surfaces of objects of certain classes tend to align with gravity or be orthogonal to it. Adding such a prior to baseline methods for monocular depth prediction yields improvements beyond the state-of-the-art and illustrates the power of gravity as a supervisory signal.
Tasks	3D Reconstruction, Depth Estimation
Published	2018-07-30
URL	https://arxiv.org/abs/1807.11130v4
PDF	https://arxiv.org/pdf/1807.11130v4.pdf
PWC	https://paperswithcode.com/paper/geo-supervised-visual-depth-prediction
Repo	https://github.com/feixh/GeoSup
Framework	tf
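
The abstract describes a prior in which surface normals of certain object classes tend to align with gravity or be orthogonal to it. One way such a prior could be scored is sketched below. This is a hypothetical formulation for illustration only: the function name `gravity_penalty`, the `kind` argument, and the cosine-based penalty are assumptions, not the loss used in the paper.

```python
import numpy as np

def gravity_penalty(normals, gravity, kind):
    """Mean penalty for unit-normalised surface normals against the gravity direction.

    kind="parallel":   normals should align with gravity (e.g. ground-like surfaces).
    kind="orthogonal": normals should be perpendicular to gravity (e.g. wall-like surfaces).
    """
    g = gravity / np.linalg.norm(gravity)
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    cos = np.abs(n @ g)  # |cos| of the angle between each normal and gravity
    if kind == "parallel":
        return float(np.mean(1.0 - cos))
    if kind == "orthogonal":
        return float(np.mean(cos))
    raise ValueError(f"unknown kind: {kind!r}")

gravity = np.array([0.0, 0.0, -9.8])
floor_normals = np.array([[0.0, 0.0, 1.0]])
wall_normals = np.array([[1.0, 0.0, 0.0]])

floor_cost = gravity_penalty(floor_normals, gravity, "parallel")
wall_cost = gravity_penalty(wall_normals, gravity, "orthogonal")
```

Both costs are zero when the surfaces obey the prior and grow as the normals tilt away from the expected orientation.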

Parameter Transfer Extreme Learning Machine based on Projective Model


Title	Parameter Transfer Extreme Learning Machine based on Projective Model
Authors	Chao Chen, Boyuan Jiang, Xinyu Jin
Abstract	In recent years, transfer learning has attracted much attention in the machine learning community. In this paper, we mainly focus on the tasks of parameter transfer under the framework of extreme learning machine (ELM). Unlike the existing parameter transfer approaches, which incorporate the source model information into the target by regularizing the difference between the source and target domain parameters, an intuitively appealing projective model is proposed to bridge the source and target model parameters. Specifically, we formulate the parameter transfer in the ELM networks by means of parameter projection, and train the model by optimizing the projection matrix and classifier parameters jointly. Furthermore, the $\ell_{2,1}$-norm structured sparsity penalty is imposed on the source domain parameters, which encourages joint feature selection and parameter transfer. To evaluate the effectiveness of the proposed method, comprehensive experiments on several commonly used domain adaptation datasets are presented. The results show that the proposed method significantly outperforms the non-transfer ELM networks and other classical transfer learning methods.
Tasks	Domain Adaptation, Feature Selection, Transfer Learning
Published	2018-09-04
URL	http://arxiv.org/abs/1809.01018v2
PDF	http://arxiv.org/pdf/1809.01018v2.pdf
PWC	https://paperswithcode.com/paper/parameter-transfer-extreme-learning-machine
Repo	https://github.com/BoyuanJiang/PTELM
Framework	none
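
The structured sparsity penalty mentioned in the abstract can be made concrete with a small example. The sketch below assumes the common row-wise convention, in which the $\ell_{2,1}$-norm is the sum of the Euclidean norms of the rows of a matrix; the abstract does not state which convention the paper uses, and `l21_norm` is a hypothetical helper name.

```python
import numpy as np

def l21_norm(W):
    """Sum over rows of the Euclidean norm of each row (row-wise convention, assumed)."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

# Rows correspond to features; an all-zero row contributes nothing to the penalty.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [5.0, 12.0]])
penalty = l21_norm(W)  # 5 + 0 + 13
```

Because the penalty is a sum of unsquared row norms, minimising it tends to drive whole rows to zero, which is why it is associated with feature selection.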

Bio-inspired digit recognition using reward-modulated spike-timing-dependent plasticity in deep convolutional networks


Title	Bio-inspired digit recognition using reward-modulated spike-timing-dependent plasticity in deep convolutional networks
Authors	Milad Mozafari, Mohammad Ganjtabesh, Abbas Nowzari-Dalini, Simon J. Thorpe, Timothée Masquelier
Abstract	The primate visual system has inspired the development of deep artificial neural networks, which have revolutionized the computer vision domain. Yet these networks are much less energy-efficient than their biological counterparts, and they are typically trained with backpropagation, which is extremely data-hungry. To address these limitations, we used a deep convolutional spiking neural network (DCSNN) and a latency-coding scheme. We trained it using a combination of spike-timing-dependent plasticity (STDP) for the lower layers and reward-modulated STDP (R-STDP) for the higher ones. In short, with R-STDP a correct (resp. incorrect) decision leads to STDP (resp. anti-STDP). This approach led to an accuracy of $97.2\%$ on MNIST, without requiring an external classifier. In addition, we demonstrated that R-STDP extracts features that are diagnostic for the task at hand, and discards the other ones, whereas STDP extracts any feature that repeats. Finally, our approach is biologically plausible, hardware friendly, and energy-efficient.
Tasks
Published	2018-03-31
URL	https://arxiv.org/abs/1804.00227v3
PDF	https://arxiv.org/pdf/1804.00227v3.pdf
PWC	https://paperswithcode.com/paper/bio-inspired-digit-recognition-using-spike
Repo	https://github.com/miladmozafari/SpykeTorch
Framework	pytorch
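
The rule stated in the abstract, that a correct decision leads to STDP and an incorrect one to anti-STDP, can be sketched as a sign flip on the weight change. This is a deliberately simplified illustration: the learning rates `a_plus` and `a_minus`, the function names, and the use of a plain sign flip for anti-STDP are assumptions, and the update used in the paper may differ in its details.

```python
def stdp_delta(pre_time, post_time, a_plus=0.004, a_minus=0.003):
    """Simplified STDP: strengthen the synapse if the input spike did not come
    after the output spike, otherwise weaken it. Rates are illustrative values."""
    return a_plus if pre_time <= post_time else -a_minus

def r_stdp_delta(pre_time, post_time, correct):
    """Reward-modulated variant: apply STDP after a correct decision and the
    opposite-signed update (anti-STDP) after an incorrect one."""
    delta = stdp_delta(pre_time, post_time)
    return delta if correct else -delta

# An input spike at t=1 preceding an output spike at t=2:
rewarded = r_stdp_delta(1, 2, correct=True)    # synapse is strengthened
punished = r_stdp_delta(1, 2, correct=False)   # same synapse is weakened
```

The effect is that synapses which contributed to a wrong decision are pushed in the opposite direction from plain STDP, so only features that help the task are reinforced.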

ColNet: Embedding the Semantics of Web Tables for Column Type Prediction


Title	ColNet: Embedding the Semantics of Web Tables for Column Type Prediction
Authors	Jiaoyan Chen, Ernesto Jimenez-Ruiz, Ian Horrocks, Charles Sutton
Abstract	Automatically annotating column types with knowledge base (KB) concepts is a critical task to gain a basic understanding of web tables. Current methods rely on either table metadata like column name or entity correspondences of cells in the KB, and may fail to deal with growing web tables with incomplete meta information. In this paper we propose a neural network based column type annotation framework named ColNet which is able to integrate KB reasoning and lookup with machine learning and can automatically train Convolutional Neural Networks for prediction. The prediction model not only considers the contextual semantics within a cell using word representation, but also embeds the semantics of a column by learning locality features from multiple cells. The method is evaluated with DBPedia and two different web table datasets, T2Dv2 from the general Web and Limaye from Wikipedia pages, and achieves higher performance than the state-of-the-art approaches.
Tasks
Published	2018-11-04
URL	http://arxiv.org/abs/1811.01304v2
PDF	http://arxiv.org/pdf/1811.01304v2.pdf
PWC	https://paperswithcode.com/paper/colnet-embedding-the-semantics-of-web-tables
Repo	https://github.com/alan-turing-institute/SemAIDA
Framework	none

Classifying and Visualizing Emotions with Emotional DAN


Title	Classifying and Visualizing Emotions with Emotional DAN
Authors	Ivona Tautkute, Tomasz Trzcinski
Abstract	Classification of human emotions remains an important and challenging task for many computer vision algorithms, especially in the era of humanoid robots which coexist with humans in their everyday life. Currently proposed methods for emotion recognition solve this task using multi-layered convolutional networks that do not explicitly infer any facial features in the classification phase. In this work, we postulate a fundamentally different approach to solving the emotion recognition task that relies on incorporating facial landmarks as a part of the classification loss function. To that end, we extend a recently proposed Deep Alignment Network (DAN) with a term related to facial features. Thanks to this simple modification, our model called EmotionalDAN is able to outperform state-of-the-art emotion classification methods on two challenging benchmark datasets by up to 5%. Furthermore, we visualize image regions analyzed by the network when making a decision and the results indicate that our EmotionalDAN model is able to correctly identify facial landmarks responsible for expressing the emotions.
Tasks	Emotion Classification, Emotion Recognition
Published	2018-10-23
URL	http://arxiv.org/abs/1810.10529v1
PDF	http://arxiv.org/pdf/1810.10529v1.pdf
PWC	https://paperswithcode.com/paper/classifying-and-visualizing-emotions-with
Repo	https://github.com/IvonaTau/emotionaldan
Framework	tf

Combatting Adversarial Attacks through Denoising and Dimensionality Reduction: A Cascaded Autoencoder Approach


Title	Combatting Adversarial Attacks through Denoising and Dimensionality Reduction: A Cascaded Autoencoder Approach
Authors	Rajeev Sahay, Rehana Mahfuz, Aly El Gamal
Abstract	Machine Learning models are vulnerable to adversarial attacks that rely on perturbing the input data. This work proposes a novel strategy using Autoencoder Deep Neural Networks to defend a machine learning model against two gradient-based attacks: the Fast Gradient Sign attack and the Fast Gradient attack. First, we use an autoencoder, trained with both clean and corrupted data, to denoise the test data. Then, we reduce the dimension of the denoised data using the hidden layer representation of another autoencoder. We perform this experiment for multiple values of the bound of adversarial perturbations, and consider different numbers of reduced dimensions. When the test data is preprocessed using this cascaded pipeline, the tested deep neural network classifier yields a much higher accuracy, thus mitigating the effect of the adversarial perturbation.
Tasks	Denoising, Dimensionality Reduction
Published	2018-12-07
URL	http://arxiv.org/abs/1812.03087v1
PDF	http://arxiv.org/pdf/1812.03087v1.pdf
PWC	https://paperswithcode.com/paper/combatting-adversarial-attacks-through
Repo	https://github.com/rajeevsahay/ae-defenses
Framework	none
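
To illustrate the kind of attack the abstract refers to, here is a minimal sketch of a fast gradient sign perturbation. It assumes a logistic-regression classifier as a stand-in for the deep network so that the input gradient can be written in closed form; the names `fgsm`, `bce`, and the toy weights are hypothetical and not taken from the paper or its repository.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(x, y, w, b):
    """Binary cross-entropy of a logistic-regression model on one example."""
    p = sigmoid(w @ x + b)
    return float(-(y * np.log(p) + (1 - y) * np.log(1 - p)))

def fgsm(x, y, w, b, eps):
    """Move x by eps in the sign direction of the loss gradient w.r.t. the input."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w  # closed-form gradient of the cross-entropy w.r.t. x
    return x + eps * np.sign(grad_x)

x = np.array([1.0, -2.0])
w = np.array([2.0, -1.0])
b, y, eps = 0.0, 1, 0.1

x_adv = fgsm(x, y, w, b, eps)
loss_clean = bce(x, y, w, b)
loss_adv = bce(x_adv, y, w, b)
```

Each input coordinate moves by at most `eps`, which plays the role of the perturbation bound that the abstract says is varied across experiments.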

The Price of Fair PCA: One Extra Dimension


Title	The Price of Fair PCA: One Extra Dimension
Authors	Samira Samadi, Uthaipon Tantipongpipat, Jamie Morgenstern, Mohit Singh, Santosh Vempala
Abstract	We investigate whether the standard dimensionality reduction technique of PCA inadvertently produces data representations with different fidelity for two different populations. We show that on several real-world data sets, PCA has higher reconstruction error on population A than on B (for example, women versus men or lower- versus higher-educated individuals). This can happen even when the data set has a similar number of samples from A and B. This motivates our study of dimensionality reduction techniques which maintain similar fidelity for A and B. We define the notion of Fair PCA and give a polynomial-time algorithm for finding a low dimensional representation of the data which is nearly-optimal with respect to this measure. Finally, we show on real-world data sets that our algorithm can be used to efficiently generate a fair low dimensional representation of the data.
Tasks	Dimensionality Reduction
Published	2018-10-31
URL	http://arxiv.org/abs/1811.00103v1
PDF	http://arxiv.org/pdf/1811.00103v1.pdf
PWC	https://paperswithcode.com/paper/the-price-of-fair-pca-one-extra-dimension
Repo	https://github.com/samirasamadi/Fair-PCA
Framework	none
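
The observation that standard PCA can reconstruct one population better than another, even with equal sample counts, can be reproduced on synthetic data. The sketch below only measures per-group reconstruction error under ordinary PCA; it does not implement the Fair PCA algorithm, and the data, group names, and helper functions are assumptions made for illustration.

```python
import numpy as np

def pca_basis(X, k):
    """Top-k principal directions of X as columns of a (d, k) matrix."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k].T

def mean_reconstruction_error(X, mean, basis):
    """Average squared distance between points and their projection onto the basis."""
    Xc = X - mean
    residual = Xc - Xc @ basis @ basis.T
    return float(np.mean(np.sum(residual ** 2, axis=1)))

# Two equally sized synthetic groups that vary along different axes.
rng = np.random.default_rng(0)
group_a = rng.normal(size=(200, 2)) * np.array([3.0, 0.1])
group_b = rng.normal(size=(200, 2)) * np.array([0.1, 1.0])

pooled = np.vstack([group_a, group_b])
mean = pooled.mean(axis=0)
basis = pca_basis(pooled, k=1)  # one shared direction for both groups

err_a = mean_reconstruction_error(group_a, mean, basis)
err_b = mean_reconstruction_error(group_b, mean, basis)
```

Because the pooled variance is dominated by group A's axis, the single shared direction serves group A well and leaves group B with a much larger error.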

Timeception for Complex Action Recognition


Title	Timeception for Complex Action Recognition
Authors	Noureldien Hussein, Efstratios Gavves, Arnold W. M. Smeulders
Abstract	This paper focuses on the temporal aspect for recognizing human activities in videos; an important visual cue that has long been undervalued. We revisit the conventional definition of activity and restrict it to Complex Action: a set of one-actions with a weak temporal pattern that serves a specific purpose. Related works use spatiotemporal 3D convolutions with fixed kernel size, too rigid to capture the varieties in temporal extents of complex actions, and too short for long-range temporal modeling. In contrast, we use multi-scale temporal convolutions, and we reduce the complexity of 3D convolutions. The outcome is Timeception convolution layers, which reason about minute-long temporal patterns, a factor of 8 longer than best related works. As a result, Timeception achieves impressive accuracy in recognizing the human activities of Charades, Breakfast Actions, and MultiTHUMOS. Further, we demonstrate that Timeception learns long-range temporal dependencies and tolerates temporal extents of complex actions.
Tasks	Action Recognition In Videos
Published	2018-12-04
URL	http://arxiv.org/abs/1812.01289v2
PDF	http://arxiv.org/pdf/1812.01289v2.pdf
PWC	https://paperswithcode.com/paper/timeception-for-complex-action-recognition
Repo	https://github.com/noureldien/timeception
Framework	pytorch

Learning with privileged information via adversarial discriminative modality distillation


Title	Learning with privileged information via adversarial discriminative modality distillation
Authors	Nuno C. Garcia, Pietro Morerio, Vittorio Murino
Abstract	Heterogeneous data modalities can provide complementary cues for several tasks, usually leading to more robust algorithms and better performance. However, while training data can be accurately collected to include a variety of sensory modalities, it is often the case that not all of them are available in real life (testing) scenarios, where a model has to be deployed. This raises the challenge of how to extract information from multimodal data in the training stage, in a form that can be exploited at test time, considering limitations such as noisy or missing modalities. This paper presents a new approach in this direction for RGB-D vision tasks, developed within the adversarial learning and privileged information frameworks. We consider the practical case of learning representations from depth and RGB videos, while relying only on RGB data at test time. We propose a new approach to train a hallucination network that learns to distill depth information via adversarial learning, resulting in a clean approach without several losses to balance or hyperparameters. We report state-of-the-art results on object classification on the NYUD dataset and video action recognition on the largest multimodal dataset available for this task, the NTU RGB+D, as well as on the Northwestern-UCLA.
Tasks	Action Recognition In Videos, Object Classification
Published	2018-10-19
URL	https://arxiv.org/abs/1810.08437v2
PDF	https://arxiv.org/pdf/1810.08437v2.pdf
PWC	https://paperswithcode.com/paper/learning-with-privileged-information-via
Repo	https://github.com/pmorerio/admd
Framework	tf

Extracting and Analyzing Semantic Relatedness between Cities Using News Articles


Title	Extracting and Analyzing Semantic Relatedness between Cities Using News Articles
Authors	Yingjie Hu, Xinyue Ye, Shih-Lung Shaw
Abstract	News articles capture a variety of topics about our society. They reflect not only the socioeconomic activities that happened in our physical world, but also some of the cultures, human interests, and public concerns that exist only in the perceptions of people. Cities are frequently mentioned in news articles, and two or more cities may co-occur in the same article. Such co-occurrence often suggests certain relatedness between the mentioned cities, and the relatedness may be under different topics depending on the contents of the news articles. We consider the relatedness under different topics as semantic relatedness. By reading news articles, one can grasp the general semantic relatedness between cities, yet, given hundreds of thousands of news articles, it is very difficult, if not impossible, for anyone to manually read them. This paper proposes a computational framework which can “read” a large number of news articles and extract the semantic relatedness between cities. This framework is based on a natural language processing model and employs a machine learning process to identify the main topics of news articles. We describe the overall structure of this framework and its individual modules, and then apply it to an experimental dataset with more than 500,000 news articles covering the top 100 U.S. cities spanning a 10-year period. We perform exploratory visualization of the extracted semantic relatedness under different topics and over multiple years. We also analyze the impact of geographic distance on semantic relatedness and find varied distance decay effects. The proposed framework can be used to support large-scale content analysis in city network research.
Tasks
Published	2018-09-08
URL	http://arxiv.org/abs/1809.02823v1
PDF	http://arxiv.org/pdf/1809.02823v1.pdf
PWC	https://paperswithcode.com/paper/extracting-and-analyzing-semantic-relatedness
Repo	https://github.com/YingjieHu/CityRelatednessViaNews
Framework	none

Classification using margin pursuit


Title	Classification using margin pursuit
Authors	Matthew J. Holland
Abstract	In this work, we study a new approach to optimizing the margin distribution realized by binary classifiers. The classical approach to this problem is simply maximization of the expected margin, while more recent proposals consider simultaneous variance control and proxy objectives based on robust location estimates, in the vein of keeping the margin distribution sharply concentrated in a desirable region. While conceptually appealing, these new approaches are often computationally unwieldy, and theoretical guarantees are limited. Given this context, we propose an algorithm which searches the hypothesis space in such a way that a pre-set “margin level” ends up being a distribution-robust estimator of the margin location. This procedure is easily implemented using gradient descent, and admits finite-sample bounds on the excess risk under unbounded inputs. Empirical tests on real-world benchmark data reinforce the basic principles highlighted by the theory, and are suggestive of a promising new technique for classification.
Tasks
Published	2018-10-11
URL	http://arxiv.org/abs/1810.04863v1
PDF	http://arxiv.org/pdf/1810.04863v1.pdf
PWC	https://paperswithcode.com/paper/classification-using-margin-pursuit
Repo	https://github.com/feedbackward/catcube
Framework	none

Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting


Title	Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting
Authors	Yen-Chun Chen, Mohit Bansal
Abstract	Inspired by how humans summarize long documents, we propose an accurate and fast summarization model that first selects salient sentences and then rewrites them abstractively (i.e., compresses and paraphrases) to generate a concise overall summary. We use a novel sentence-level policy gradient method to bridge the non-differentiable computation between these two neural networks in a hierarchical way, while maintaining language fluency. Empirically, we achieve the new state-of-the-art on all metrics (including human evaluation) on the CNN/Daily Mail dataset, as well as significantly higher abstractiveness scores. Moreover, by first operating at the sentence-level and then the word-level, we enable parallel decoding of our neural generative model that results in substantially faster (10-20x) inference speed as well as 4x faster training convergence than previous long-paragraph encoder-decoder models. We also demonstrate the generalization of our model on the test-only DUC-2002 dataset, where we achieve higher scores than a state-of-the-art model.
Tasks	Abstractive Text Summarization
Published	2018-05-28
URL	http://arxiv.org/abs/1805.11080v1
PDF	http://arxiv.org/pdf/1805.11080v1.pdf
PWC	https://paperswithcode.com/paper/fast-abstractive-summarization-with-reinforce
Repo	https://github.com/ChenRocks/fast_abs_rl
Framework	pytorch

Translating a Math Word Problem to an Expression Tree


Title	Translating a Math Word Problem to an Expression Tree
Authors	Lei Wang, Yan Wang, Deng Cai, Dongxiang Zhang, Xiaojiang Liu
Abstract	Sequence-to-sequence (SEQ2SEQ) models have been successfully applied to automatic math word problem solving. Despite their simplicity, a drawback still remains: a math word problem can be correctly solved by more than one equation. This non-deterministic transduction harms the performance of maximum likelihood estimation. In this paper, by considering the uniqueness of the expression tree, we propose an equation normalization method to normalize the duplicated equations. Moreover, we analyze the performance of three popular SEQ2SEQ models on math word problem solving. We find that each model has its own specialty in solving problems; consequently, an ensemble model is proposed to combine their advantages. Experiments on the Math23K dataset show that the ensemble model with equation normalization significantly outperforms the previous state-of-the-art methods.
Tasks	Math Word Problem Solving
Published	2018-11-14
URL	http://arxiv.org/abs/1811.05632v2
PDF	http://arxiv.org/pdf/1811.05632v2.pdf
PWC	https://paperswithcode.com/paper/translating-a-math-word-problem-to-an
Repo	https://github.com/SumbeeLei/Math_EN
Framework	pytorch
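
The problem described in the abstract, that several equations can solve the same word problem, can be illustrated by mapping equivalent expressions to one canonical form. The sketch below is a generic approach and an assumption on my part, not the paper's normalization rules: it parses an expression into a tree with Python's `ast` module and sorts the operands of commutative operators. The function name `canonical` and the placeholder names `n1`, `n2`, `n3` are hypothetical.

```python
import ast

_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def canonical(expr: str) -> str:
    """Return a fully parenthesised string with commutative operands in sorted order."""
    def norm(node):
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            left, right = norm(node.left), norm(node.right)
            op = _OPS[type(node.op)]
            if op in ("+", "*"):  # operand order does not matter for these
                left, right = sorted((left, right))
            return f"({left}{op}{right})"
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        raise ValueError(f"unsupported expression node: {type(node).__name__}")
    return norm(ast.parse(expr, mode="eval").body)

# Two surface forms of the same equation collapse to one target string.
same = canonical("n1 + n2 * n3") == canonical("n3 * n2 + n1")
# Subtraction is not commutative, so operand order is preserved.
different = canonical("n1 - n2") != canonical("n2 - n1")
```

Training a sequence model against a single canonical target per problem removes the ambiguity that the abstract identifies as harmful to maximum likelihood estimation.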