October 20, 2019

2814 words 14 mins read

Paper Group AWR 302

Empirical Risk Minimization and Stochastic Gradient Descent for Relational Data. Legendre Decomposition for Tensors. Towards Understanding Regularization in Batch Normalization. On the iterative refinement of densely connected representation levels for semantic segmentation. Generating Realistic Geology Conditioned on Physical Measurements with Gen …

Empirical Risk Minimization and Stochastic Gradient Descent for Relational Data

Title Empirical Risk Minimization and Stochastic Gradient Descent for Relational Data
Authors Victor Veitch, Morgane Austern, Wenda Zhou, David M. Blei, Peter Orbanz
Abstract Empirical risk minimization is the main tool for prediction problems, but its extension to relational data remains unsolved. We solve this problem using recent ideas from graph sampling theory to (i) define an empirical risk for relational data and (ii) obtain stochastic gradients for this empirical risk that are automatically unbiased. This is achieved by considering the method by which data is sampled from a graph as an explicit component of model design. By integrating fast implementations of graph sampling schemes with standard automatic differentiation tools, we provide an efficient turnkey solver for the risk minimization problem. We establish basic theoretical properties of the procedure. Finally, we demonstrate relational ERM with application to two non-standard problems: one-stage training for semi-supervised node classification, and learning embedding vectors for vertex attributes. Experiments confirm that the turnkey inference procedure is effective in practice, and that the sampling scheme used for model specification has a strong effect on model performance. Code is available at https://github.com/wooden-spoon/relational-ERM.
Tasks Node Classification
Published 2018-06-27
URL http://arxiv.org/abs/1806.10701v2
PDF http://arxiv.org/pdf/1806.10701v2.pdf
PWC https://paperswithcode.com/paper/empirical-risk-minimization-and-stochastic
Repo https://github.com/wooden-spoon/relational-ERM
Framework tf
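
The key move here is treating the sampling scheme as an explicit component of the model: each sampled subgraph yields an unbiased stochastic gradient of the sampled empirical risk. A minimal sketch of that loop, with a toy p-sampler and edge loss of my own devising (not the authors' TensorFlow implementation):

```python
# Sketch of relational ERM: sample a subgraph, take an SGD step on the
# sampled risk. `p_sample` and the edge loss are illustrative choices.
import numpy as np
import torch

def p_sample(edges, num_vertices, p=0.2, rng=np.random):
    """p-sampling: keep each vertex independently with prob p and
    return the edges of the induced subgraph."""
    keep = rng.random(num_vertices) < p
    mask = keep[edges[:, 0]] & keep[edges[:, 1]]
    return edges[mask]

# toy ring graph and a vertex-embedding model
num_vertices, dim = 100, 16
edges = np.array([(i, (i + 1) % num_vertices) for i in range(num_vertices)])
emb = torch.nn.Embedding(num_vertices, dim)
opt = torch.optim.SGD(emb.parameters(), lr=0.1)

for step in range(100):
    sub = p_sample(edges, num_vertices)
    if len(sub) == 0:
        continue
    src, dst = torch.as_tensor(sub[:, 0]), torch.as_tensor(sub[:, 1])
    # edge loss: pull embeddings of linked vertices together
    loss = ((emb(src) - emb(dst)) ** 2).sum(dim=1).mean()
    opt.zero_grad()
    loss.backward()   # stochastic gradient of the sampled risk
    opt.step()
```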

Legendre Decomposition for Tensors

Title Legendre Decomposition for Tensors
Authors Mahito Sugiyama, Hiroyuki Nakahara, Koji Tsuda
Abstract We present a novel nonnegative tensor decomposition method, called Legendre decomposition, which factorizes an input tensor into a multiplicative combination of parameters. Thanks to the well-developed theory of information geometry, the reconstructed tensor is unique and always minimizes the KL divergence from an input tensor. We empirically show that Legendre decomposition can more accurately reconstruct tensors than other nonnegative tensor decomposition methods.
Tasks
Published 2018-02-13
URL http://arxiv.org/abs/1802.04502v2
PDF http://arxiv.org/pdf/1802.04502v2.pdf
PWC https://paperswithcode.com/paper/legendre-decomposition-for-tensors
Repo https://github.com/mahito-sugiyama/Legendre-decomposition
Framework none
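
The decomposition is log-linear, which is why the KL-minimizing reconstruction is unique. A heavily simplified sketch of that idea for a matrix, assuming a plain row/column parameter basis and gradient descent in place of the paper's Newton-style natural-gradient updates:

```python
# Simplified Legendre-style fit: parameterize the reconstruction
# multiplicatively (log-linearly) and minimize KL(P || Q) by gradient
# descent. The real method works over a chosen basis on the tensor's
# index poset; this toy uses only row and column parameters.
import numpy as np

def legendre_like_decompose(P, steps=2000, lr=0.5):
    P = P / P.sum()                      # treat input as a distribution
    n, m = P.shape
    theta_r, theta_c = np.zeros(n), np.zeros(m)
    for _ in range(steps):
        Q = np.exp(theta_r[:, None] + theta_c[None, :])
        Q /= Q.sum()                     # keep Q normalized
        # gradient of KL(P||Q) w.r.t. theta = difference of marginals
        theta_r -= lr * (Q - P).sum(axis=1)
        theta_c -= lr * (Q - P).sum(axis=0)
    return Q

P = np.random.rand(4, 5)
Q = legendre_like_decompose(P)
Pn = P / P.sum()
print(f"KL divergence after fitting: {np.sum(Pn * np.log(Pn / Q)):.4f}")
```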

Towards Understanding Regularization in Batch Normalization

Title Towards Understanding Regularization in Batch Normalization
Authors Ping Luo, Xinjiang Wang, Wenqi Shao, Zhanglin Peng
Abstract Batch Normalization (BN) improves both convergence and generalization in training neural networks. This work studies these phenomena theoretically. We analyze BN using a basic block of neural networks, consisting of a kernel layer, a BN layer, and a nonlinear activation function. This basic network helps us understand the impact of BN in three respects. First, viewing BN as an implicit regularizer, it can be decomposed into population normalization (PN) plus gamma decay as an explicit regularization. Second, the learning dynamics of BN and this regularization show that training converges with a large maximum and effective learning rate. Third, the generalization of BN is explored using statistical mechanics. Experiments demonstrate that BN in convolutional neural networks shares the regularization traits identified by these analyses.
Tasks
Published 2018-09-04
URL http://arxiv.org/abs/1809.00846v4
PDF http://arxiv.org/pdf/1809.00846v4.pdf
PWC https://paperswithcode.com/paper/towards-understanding-regularization-in-batch
Repo https://github.com/darshansiddu01/CNN
Framework pytorch
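
To make the decomposition concrete, here is a small sketch (assumptions mine) of the analyzed basic block, contrasting standard BN with population normalization (PN); the gap between batch and population statistics is the noise that the paper formalizes as an implicit gamma-decay regularizer:

```python
# Basic block from the paper -- kernel layer, BN, nonlinearity --
# evaluated with batch statistics (BN) vs population statistics (PN).
import torch

torch.manual_seed(0)
W = torch.randn(64, 32)                   # kernel layer
gamma, beta = torch.ones(64), torch.zeros(64)

def block(x, mean, var):
    h = x @ W.t()                                   # kernel layer
    h_hat = (h - mean) / torch.sqrt(var + 1e-5)     # normalization
    return torch.relu(gamma * h_hat + beta)         # scale/shift + ReLU

population = torch.randn(10000, 32)
pop_h = population @ W.t()
pop_mean, pop_var = pop_h.mean(0), pop_h.var(0)

batch = population[:64]
bat_h = batch @ W.t()
out_bn = block(batch, bat_h.mean(0), bat_h.var(0))  # standard BN
out_pn = block(batch, pop_mean, pop_var)            # population normalization
print("mean |BN - PN| gap:", (out_bn - out_pn).abs().mean().item())
```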

On the iterative refinement of densely connected representation levels for semantic segmentation

Title On the iterative refinement of densely connected representation levels for semantic segmentation
Authors Arantxa Casanova, Guillem Cucurull, Michal Drozdzal, Adriana Romero, Yoshua Bengio
Abstract State-of-the-art semantic segmentation approaches increase the receptive field of their models by using either a downsampling path composed of poolings/strided convolutions or successive dilated convolutions. However, it is not clear which operation leads to the best results. In this paper, we systematically study the differences introduced by distinct receptive field enlargement methods and their impact on the performance of a novel architecture, called Fully Convolutional DenseResNet (FC-DRN). FC-DRN has a densely connected backbone composed of residual networks. Following standard image segmentation architectures, receptive field enlargement operations that change the representation level are interleaved among the residual networks. This allows the model to exploit the benefits of both residual and dense connectivity patterns, namely: gradient flow, iterative refinement of representations, multi-scale feature combination, and deep supervision. To highlight the potential of our model, we test it on the challenging CamVid urban scene understanding benchmark and make the following observations: 1) downsampling operations outperform dilations when the model is trained from scratch, 2) dilations are useful during the fine-tuning step, 3) coarser representations require fewer refinement steps, and 4) ResNets (by construction) are good regularizers, since they can reduce the model capacity when needed. Finally, we compare our architecture to alternative methods and report state-of-the-art results on the CamVid dataset with at least half as many parameters.
Tasks Scene Understanding, Semantic Segmentation
Published 2018-04-30
URL http://arxiv.org/abs/1804.11332v1
PDF http://arxiv.org/pdf/1804.11332v1.pdf
PWC https://paperswithcode.com/paper/on-the-iterative-refinement-of-densely
Repo https://github.com/ArantxaCasanova/fc-drn
Framework pytorch
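
The two receptive-field enlargement operations under comparison can be sketched in a few lines; the module choices below are illustrative, not the FC-DRN code:

```python
# Two ways to enlarge the receptive field: a pooled path (changes the
# representation level) vs a dilated conv (keeps full resolution).
import torch
import torch.nn as nn

x = torch.randn(1, 16, 64, 64)

# Option 1: downsampling path -- pooling halves spatial resolution
down = nn.Sequential(nn.MaxPool2d(2), nn.Conv2d(16, 16, 3, padding=1))
print(down(x).shape)   # torch.Size([1, 16, 32, 32])

# Option 2: dilated convolution -- same resolution, wider context
dil = nn.Conv2d(16, 16, 3, padding=2, dilation=2)
print(dil(x).shape)    # torch.Size([1, 16, 64, 64])
```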

Generating Realistic Geology Conditioned on Physical Measurements with Generative Adversarial Networks

Title Generating Realistic Geology Conditioned on Physical Measurements with Generative Adversarial Networks
Authors Emilien Dupont, Tuanfeng Zhang, Peter Tilke, Lin Liang, William Bailey
Abstract An important problem in geostatistics is to build models of the subsurface of the Earth given physical measurements at sparse spatial locations. Typically, this is done using spatial interpolation methods or by reproducing patterns from a reference image. However, these algorithms fail to produce realistic patterns and do not exhibit the wide range of uncertainty inherent in the prediction of geology. In this paper, we show how semantic inpainting with Generative Adversarial Networks can be used to generate varied realizations of geology which honor physical measurements while matching the expected geological patterns. In contrast to other algorithms, our method scales well with the number of data points and mimics a distribution of patterns as opposed to a single pattern or image. The generated conditional samples are state of the art.
Tasks
Published 2018-02-08
URL http://arxiv.org/abs/1802.03065v3
PDF http://arxiv.org/pdf/1802.03065v3.pdf
PWC https://paperswithcode.com/paper/generating-realistic-geology-conditioned-on
Repo https://github.com/amoodie/StratGAN
Framework tf
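
The conditioning step amounts to a latent-space search: find generator samples that honor the sparse measurements while the discriminator still deems them realistic. A hedged sketch, where `G` and `D` stand in for the paper's trained GAN and `latent_dim` is a placeholder attribute:

```python
# Semantic-inpainting-style conditioning: optimize the latent code so
# the generated image matches measurements (masked L2) while staying
# realistic (discriminator score). G and D are pretrained placeholders.
import torch

def condition_on_measurements(G, D, y, mask, steps=200, lam=0.1):
    """y: measurement image; mask: 1 where measured, 0 elsewhere."""
    z = torch.randn(1, G.latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=0.05)
    for _ in range(steps):
        x = G(z)
        context = ((x - y) * mask).pow(2).sum()   # honor the measurements
        realism = -D(x).mean()                    # stay on the data manifold
        loss = context + lam * realism
        opt.zero_grad(); loss.backward(); opt.step()
    return G(z).detach()   # one conditional realization; rerun for more
```

Re-running from fresh random `z` values produces the varied realizations the abstract describes, which is how the method exposes geological uncertainty.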

MNIST Dataset Classification Utilizing k-NN Classifier with Modified Sliding-window Metric

Title MNIST Dataset Classification Utilizing k-NN Classifier with Modified Sliding-window Metric
Authors Divas Grover, Behrad Toghi
Abstract The MNIST dataset of handwritten digits is one of the most commonly used datasets for machine learning and computer vision research. We study this widely applicable classification problem and apply a simple yet efficient k-nearest neighbor classifier with an enhanced heuristic. We evaluate the performance of the k-nearest neighbor classification algorithm on the MNIST dataset, comparing the $L_2$ Euclidean distance metric with a modified metric that uses a sliding-window technique to avoid performance degradation due to slight spatial misalignments. Accuracy and confusion matrices are used as performance indicators to compare the baseline algorithm with the enhanced sliding-window method, and the results show a significant improvement with the proposed method.
Tasks
Published 2018-09-18
URL http://arxiv.org/abs/1809.06846v4
PDF http://arxiv.org/pdf/1809.06846v4.pdf
PWC https://paperswithcode.com/paper/mnist-dataset-classification-utilizing-k-nn
Repo https://github.com/BehradToghi/kNN_SWin
Framework none
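
The modified metric is easy to state: take the minimum L2 distance over small spatial shifts, so slight misalignments no longer inflate the distance. A sketch (naming mine):

```python
# Sliding-window distance for k-NN on MNIST-sized images: the distance
# between two digits is the best L2 match over small shifts of one image.
import numpy as np

def sliding_window_distance(a, b, max_shift=1):
    """Min L2 distance between 28x28 images over shifts of b.
    np.roll wraps around at the borders; zero-padding would be a
    closer match to a true sliding window, but this keeps it short."""
    best = np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(b, dy, axis=0), dx, axis=1)
            best = min(best, np.linalg.norm(a - shifted))
    return best
```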

FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation

Title FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation
Authors Xu Han, Hao Zhu, Pengfei Yu, Ziyun Wang, Yuan Yao, Zhiyuan Liu, Maosong Sun
Abstract We present a Few-Shot Relation Classification Dataset (FewRel), consisting of 70,000 sentences on 100 relations derived from Wikipedia and annotated by crowdworkers. The relation of each sentence is first recognized by distant supervision methods, and then filtered by crowdworkers. We adapt the most recent state-of-the-art few-shot learning methods for relation classification and conduct a thorough evaluation of these methods. Empirical results show that even the most competitive few-shot learning models struggle on this task, especially as compared with humans. We also show that a range of different reasoning skills are needed to solve our task. These results indicate that few-shot relation classification remains an open problem and still requires further research. Our detailed analysis points out multiple directions for future research. All details and resources about the dataset and baselines are released at http://zhuhao.me/fewrel.
Tasks Few-Shot Learning, Few-Shot Relation Classification, Relation Classification, Relation Extraction
Published 2018-10-24
URL http://arxiv.org/abs/1810.10147v2
PDF http://arxiv.org/pdf/1810.10147v2.pdf
PWC https://paperswithcode.com/paper/fewrel-a-large-scale-supervised-few-shot
Repo https://github.com/ProKil/FewRel
Framework pytorch
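
The N-way K-shot protocol behind the evaluation can be sketched as episode sampling; the logic below is mine, not the released baseline code:

```python
# One few-shot episode: draw N relations, K support sentences per
# relation, and a query sentence to classify among those N relations.
import random

def sample_episode(data, n_way=5, k_shot=1):
    """data: dict mapping relation name -> list of sentences.
    Assumes each relation has more than k_shot sentences."""
    relations = random.sample(list(data), n_way)
    support = {r: random.sample(data[r], k_shot) for r in relations}
    query_rel = random.choice(relations)
    query = random.choice([s for s in data[query_rel]
                           if s not in support[query_rel]])
    return support, query, query_rel
```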

Non-local U-Net for Biomedical Image Segmentation

Title Non-local U-Net for Biomedical Image Segmentation
Authors Zhengyang Wang, Na Zou, Dinggang Shen, Shuiwang Ji
Abstract Deep learning has shown great promise in various biomedical image segmentation tasks. Existing models are typically based on U-Net and rely on an encoder-decoder architecture with stacked local operators to aggregate long-range information gradually. However, relying only on local operators limits both efficiency and effectiveness. In this work, we propose the non-local U-Nets, which are equipped with flexible global aggregation blocks, for biomedical image segmentation. These blocks can be inserted into U-Net as size-preserving processes, as well as down-sampling and up-sampling layers. We perform thorough experiments on the 3D multimodality isointense infant brain MR image segmentation task to evaluate the non-local U-Nets. Results show that our proposed models achieve top performance with fewer parameters and faster computation.
Tasks Brain Image Segmentation, Semantic Segmentation
Published 2018-12-10
URL https://arxiv.org/abs/1812.04103v2
PDF https://arxiv.org/pdf/1812.04103v2.pdf
PWC https://paperswithcode.com/paper/global-deep-learning-methods-for
Repo https://github.com/zhengyang-wang/3D-Unet--Tensorflow
Framework tf
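
A global aggregation block is essentially self-attention over all spatial positions, so every output location can gather long-range information in a single step. A hedged PyTorch sketch of the size-preserving variant (the paper also defines down- and up-sampling versions):

```python
# Size-preserving global aggregation: 1x1 projections to queries, keys,
# values, attention over all h*w positions, plus a residual connection.
import torch
import torch.nn as nn

class GlobalAggregation(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)
        self.scale = channels ** -0.5

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)   # (b, hw, c)
        k = self.k(x).flatten(2)                   # (b, c, hw)
        v = self.v(x).flatten(2).transpose(1, 2)   # (b, hw, c)
        attn = torch.softmax(q @ k * self.scale, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + out                             # residual connection

x = torch.randn(2, 32, 16, 16)
print(GlobalAggregation(32)(x).shape)   # torch.Size([2, 32, 16, 16])
```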

Automatic Skin Lesion Segmentation Using Deep Fully Convolutional Networks

Title Automatic Skin Lesion Segmentation Using Deep Fully Convolutional Networks
Authors Hongming Xu, Tae Hyun Hwang
Abstract This paper summarizes our method and validation results for the ISIC Challenge 2018 - Skin Lesion Analysis Towards Melanoma Detection - Task 1: Lesion Segmentation.
Tasks Lesion Segmentation
Published 2018-07-17
URL http://arxiv.org/abs/1807.06466v1
PDF http://arxiv.org/pdf/1807.06466v1.pdf
PWC https://paperswithcode.com/paper/automatic-skin-lesion-segmentation-using-deep
Repo https://github.com/RegulusReggie/CS259
Framework none

Deep $k$-Means: Re-Training and Parameter Sharing with Harder Cluster Assignments for Compressing Deep Convolutions

Title Deep $k$-Means: Re-Training and Parameter Sharing with Harder Cluster Assignments for Compressing Deep Convolutions
Authors Junru Wu, Yue Wang, Zhenyu Wu, Zhangyang Wang, Ashok Veeraraghavan, Yingyan Lin
Abstract The current trend of pushing CNNs deeper with convolutions has created a pressing demand for higher compression gains on CNNs where convolutions dominate the computation and parameter count (e.g., GoogLeNet, ResNet, and Wide ResNet). Further, the high energy consumption of convolutions limits their deployment on mobile devices. To this end, we propose a simple yet effective scheme for compressing convolutions by applying k-means clustering to the weights: compression is achieved through weight sharing, recording only $K$ cluster centers and per-weight assignment indexes. We then introduce a novel spectrally relaxed $k$-means regularization, which tends to make hard assignments of convolutional layer weights to the $K$ learned cluster centers during re-training. We additionally propose an improved set of metrics to estimate the energy consumption of CNN hardware implementations, whose estimates are verified to be consistent with a previously proposed energy estimation tool extrapolated from actual hardware measurements. We finally evaluate Deep $k$-Means across several CNN models in terms of both compression ratio and energy consumption reduction, observing promising results without incurring accuracy loss. The code is available at https://github.com/Sandbox3aster/Deep-K-Means
Tasks
Published 2018-06-24
URL http://arxiv.org/abs/1806.09228v1
PDF http://arxiv.org/pdf/1806.09228v1.pdf
PWC https://paperswithcode.com/paper/deep-k-means-re-training-and-parameter-1
Repo https://github.com/Sandbox3aster/Deep-K-Means
Framework pytorch
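
The compression step itself is plain weight sharing; the paper's contribution is the spectrally relaxed k-means regularizer that makes weights cluster tightly during re-training, so the hard assignment below loses little accuracy. A simplified sketch:

```python
# Weight-sharing compression: cluster a layer's weights into K centers,
# then store only the centers plus a small index per weight.
import numpy as np
from sklearn.cluster import KMeans

def kmeans_compress(weights, k=16):
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=k, n_init=10).fit(flat)
    centers = km.cluster_centers_.ravel()   # K floats to store
    indexes = km.labels_                    # log2(K) bits per weight
    return centers[indexes].reshape(weights.shape), centers, indexes

w = np.random.randn(64, 3, 3, 3)            # a conv layer's weights
w_shared, centers, idx = kmeans_compress(w)
print("reconstruction error:", np.abs(w - w_shared).mean())
```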

QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites

Title QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites
Authors Jiankai Sun, Sobhan Moosavi, Rajiv Ramnath, Srinivasan Parthasarathy
Abstract In this paper, we present a framework for Question Difficulty and Expertise Estimation (QDEE) in Community Question Answering sites (CQAs) such as Yahoo! Answers and Stack Overflow, which tackles a fundamental challenge in crowdsourcing: how to appropriately route and assign questions to users with suitable expertise. This problem domain has been the subject of much research and includes both language-agnostic as well as language-conscious solutions. We bring to bear a key language-agnostic insight: that users gain expertise and therefore tend to ask as well as answer more difficult questions over time. We use this insight within the popular competition (directed) graph model to estimate question difficulty and user expertise by identifying key hierarchical structure within said model. An important and novel contribution here is the application of “social agony” to this problem domain. Difficulty levels of newly posted questions (the cold-start problem) are estimated by using our QDEE framework and additional textual features. We also propose a model to route newly posted questions to appropriate users based on the difficulty level of the question and the expertise of the user. Extensive experiments on real world CQAs such as Yahoo! Answers and Stack Overflow data demonstrate the improved efficacy of our approach over contemporary state-of-the-art models. The QDEE framework also allows us to characterize user expertise in novel ways by identifying interesting patterns and roles played by different users in such CQAs.
Tasks Community Question Answering, Question Answering
Published 2018-03-31
URL http://arxiv.org/abs/1804.00109v2
PDF http://arxiv.org/pdf/1804.00109v2.pdf
PWC https://paperswithcode.com/paper/qdee-question-difficulty-and-expertise
Repo https://github.com/zhenv5/QDEE
Framework none
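
A heavily simplified sketch of the competition-graph idea: each answered question induces a directed edge from the asker (loser) to the best answerer (winner), and a hierarchy over users emerges. The toy rating update below is my stand-in for the paper's social-agony minimization:

```python
# Toy expertise estimation on a competition graph: iteratively push
# each edge's winner above its loser until scores form a hierarchy.
from collections import defaultdict

def expertise_scores(edges, rounds=50, lr=0.1):
    """edges: list of (loser, winner) pairs from the competition graph."""
    score = defaultdict(float)
    for _ in range(rounds):
        for lo, hi in edges:
            gap = score[hi] - score[lo]
            if gap < 1.0:                 # winner should sit above loser
                score[hi] += lr * (1.0 - gap) / 2
                score[lo] -= lr * (1.0 - gap) / 2
    return dict(score)

edges = [("alice", "bob"), ("bob", "carol"), ("alice", "carol")]
print(expertise_scores(edges))   # carol > bob > alice
```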

DropFilter: Dropout for Convolutions

Title DropFilter: Dropout for Convolutions
Authors Zhengsu Chen, Jianwei Niu, Qi Tian
Abstract Using a large number of parameters, deep neural networks have achieved remarkable performance on computer vision and natural language processing tasks. However, networks with so many parameters usually suffer from overfitting. Dropout is a widely used method to deal with overfitting. Although dropout can significantly regularize densely connected layers in neural networks, it leads to suboptimal results when used for convolutional layers. To tackle this problem, we propose DropFilter, a new dropout method for convolutional layers. DropFilter randomly suppresses the outputs of some filters, because co-adaptations are observed to occur between filters rather than within filters in convolutional layers. Using DropFilter, we remarkably improve the performance of convolutional networks on CIFAR and ImageNet.
Tasks
Published 2018-10-23
URL http://arxiv.org/abs/1810.09849v1
PDF http://arxiv.org/pdf/1810.09849v1.pdf
PWC https://paperswithcode.com/paper/dropfilter-dropout-for-convolutions
Repo https://github.com/harirajeev/Reference
Framework pytorch
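
As described, DropFilter zeroes out whole filter outputs (channels) at random rather than individual activations. A minimal sketch:

```python
# DropFilter-style channel dropout: during training, drop entire
# filter outputs at random and rescale the survivors (inverted dropout).
import torch
import torch.nn as nn

class DropFilter(nn.Module):
    def __init__(self, p=0.2):
        super().__init__()
        self.p = p

    def forward(self, x):                       # x: (batch, channels, h, w)
        if not self.training or self.p == 0:
            return x
        keep = (torch.rand(x.size(0), x.size(1), 1, 1,
                           device=x.device) > self.p).float()
        return x * keep / (1 - self.p)          # rescale kept channels

layer = DropFilter(p=0.25).train()
print(layer(torch.randn(8, 16, 32, 32)).shape)
```

PyTorch's built-in `nn.Dropout2d` implements essentially this per-channel dropout, which makes the technique easy to try in existing models.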

Adversarial Defense of Image Classification Using a Variational Auto-Encoder

Title Adversarial Defense of Image Classification Using a Variational Auto-Encoder
Authors Yi Luo, Henry Pfister
Abstract Deep neural networks are known to be vulnerable to adversarial attacks. This exposes them to potential exploits in security-sensitive applications and highlights their lack of robustness. This paper uses a variational auto-encoder (VAE) to defend against adversarial attacks on image classification tasks. This VAE defense has a few nice properties: (1) it is quite flexible and its use of randomness makes it harder to attack; (2) it can learn disentangled representations that prevent blurry reconstruction; and (3) a patch-wise VAE defense strategy is used that does not require retraining for different image sizes. For moderate to severe attacks, this system outperforms or closely matches the performance of JPEG compression with the best quality parameter. It also has more flexibility and potential for improvement via training.
Tasks Adversarial Defense, Image Classification
Published 2018-12-07
URL http://arxiv.org/abs/1812.02891v1
PDF http://arxiv.org/pdf/1812.02891v1.pdf
PWC https://paperswithcode.com/paper/adversarial-defense-of-image-classification
Repo https://github.com/Roy-YL/VAE-Adversarial-Defense
Framework tf
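
At inference time the defense is a preprocessing step: reconstruct the input patch by patch through the trained VAE, which strips off small adversarial perturbations, then classify. A sketch where `vae` and `classifier` are placeholders for trained models:

```python
# Patch-wise VAE defense: replace each patch with its VAE
# reconstruction before handing the image to the classifier.
import torch

def vae_defend(image, vae, classifier, patch=8):
    """image: (1, c, h, w), with h and w divisible by `patch`."""
    _, c, h, w = image.shape
    cleaned = image.clone()
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            p = image[:, :, i:i + patch, j:j + patch]
            cleaned[:, :, i:i + patch, j:j + patch] = vae(p)  # reconstruction
    return classifier(cleaned)
```

Working patch-wise is what lets the same trained VAE serve images of different sizes without retraining, as the abstract notes.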

Masking: A New Perspective of Noisy Supervision

Title Masking: A New Perspective of Noisy Supervision
Authors Bo Han, Jiangchao Yao, Gang Niu, Mingyuan Zhou, Ivor Tsang, Ya Zhang, Masashi Sugiyama
Abstract It is important to learn various types of classifiers given training data with noisy labels. Noisy labels, in the most popular noise model hitherto, are corrupted from ground-truth labels by an unknown noise transition matrix. Thus, by estimating this matrix, classifiers can escape from overfitting those noisy labels. However, such estimation is practically difficult, due to either the indirect nature of two-step approaches or insufficient data to afford end-to-end approaches. In this paper, we propose a human-assisted approach called Masking that conveys human cognition of invalid class transitions and naturally speculates the structure of the noise transition matrix. To this end, we derive a structure-aware probabilistic model incorporating a structure prior, and solve the challenges of structure extraction and structure alignment. Thanks to Masking, we only estimate unmasked noise transition probabilities, and the burden of estimation is tremendously reduced. We conduct extensive experiments on CIFAR-10 and CIFAR-100 with three noise structures, as well as the industrial-level Clothing1M with agnostic noise structure, and the results show that Masking can improve the robustness of classifiers significantly.
Tasks Image Classification
Published 2018-05-21
URL http://arxiv.org/abs/1805.08193v2
PDF http://arxiv.org/pdf/1805.08193v2.pdf
PWC https://paperswithcode.com/paper/masking-a-new-perspective-of-noisy
Repo https://github.com/bhanML/Co-teaching
Framework pytorch
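
A minimal rendering of the idea (my own, not the authors' probabilistic model): a human-specified 0/1 mask declares which class transitions are plausible, only the unmasked entries of the noise transition matrix are learned, and the matrix forward-corrects the training loss:

```python
# Masked noise transition matrix: invalid transitions are pinned to
# ~zero probability, so only unmasked entries need to be estimated.
import torch
import torch.nn.functional as F

num_classes = 3
# human prior: e.g. class 0 can be confused with 1, but never with 2
mask = torch.tensor([[1., 1., 0.],
                     [1., 1., 1.],
                     [0., 1., 1.]])
logits_T = torch.zeros(num_classes, num_classes, requires_grad=True)

def transition_matrix():
    # masked-out transitions get -inf logits; rows sum to 1 after softmax
    masked = logits_T.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(masked, dim=1)

def forward_corrected_loss(model_logits, noisy_labels):
    clean_probs = torch.softmax(model_logits, dim=1)
    noisy_probs = clean_probs @ transition_matrix()  # predicted noisy labels
    return F.nll_loss(torch.log(noisy_probs + 1e-12), noisy_labels)

loss = forward_corrected_loss(torch.randn(4, num_classes),
                              torch.tensor([0, 1, 2, 1]))
print(loss.item())
```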

Incorporating Literals into Knowledge Graph Embeddings

Title Incorporating Literals into Knowledge Graph Embeddings
Authors Agustinus Kristiadi, Mohammad Asif Khan, Denis Lukovnikov, Jens Lehmann, Asja Fischer
Abstract Knowledge graphs, on top of entities and their relationships, contain other important elements: literals. Literals encode interesting properties (e.g. the height) of entities that are not captured by links between entities alone. Most of the existing work on embedding (or latent feature) based knowledge graph analysis focuses mainly on the relations between entities. In this work, we study the effect of incorporating literal information into existing link prediction methods. Our approach, which we name LiteralE, is an extension that can be plugged into existing latent feature methods. LiteralE merges entity embeddings with their literal information using a learnable, parametrized function, such as a simple linear or nonlinear transformation, or a multilayer neural network. We extend several popular embedding models based on LiteralE and evaluate their performance on the task of link prediction. Despite its simplicity, LiteralE proves to be an effective way to incorporate literal information into existing embedding based methods, improving their performance on different standard datasets, which we augmented with their literals and provide as a testbed for further research.
Tasks Entity Embeddings, Knowledge Graph Embeddings, Knowledge Graphs, Link Prediction
Published 2018-02-03
URL https://arxiv.org/abs/1802.00934v3
PDF https://arxiv.org/pdf/1802.00934v3.pdf
PWC https://paperswithcode.com/paper/incorporating-literals-into-knowledge-graph
Repo https://github.com/SmartDataAnalytics/LiteralE
Framework pytorch
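
The core operation is small: merge each entity embedding with its literal features through a learnable gate, then hand the enriched embedding to any existing scoring function (DistMult, ComplEx, and the like). A hedged sketch with illustrative dimensions and names:

```python
# Gated merge of entity embeddings with literal features: the gate
# decides, per dimension, how much literal information to mix in.
import torch
import torch.nn as nn

class LiteralGate(nn.Module):
    def __init__(self, emb_dim, lit_dim):
        super().__init__()
        self.gate = nn.Linear(emb_dim + lit_dim, emb_dim)
        self.proj = nn.Linear(emb_dim + lit_dim, emb_dim)

    def forward(self, e, lit):
        z = torch.cat([e, lit], dim=-1)
        g = torch.sigmoid(self.gate(z))              # mixing weights in [0, 1]
        return g * torch.tanh(self.proj(z)) + (1 - g) * e

e = torch.randn(32, 100)     # entity embeddings
lit = torch.randn(32, 5)     # numeric literals (e.g. height, population)
print(LiteralGate(100, 5)(e, lit).shape)   # torch.Size([32, 100])
```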