Paper Group AWR 302
Empirical Risk Minimization and Stochastic Gradient Descent for Relational Data
Title | Empirical Risk Minimization and Stochastic Gradient Descent for Relational Data |
Authors | Victor Veitch, Morgane Austern, Wenda Zhou, David M. Blei, Peter Orbanz |
Abstract | Empirical risk minimization is the main tool for prediction problems, but its extension to relational data remains unsolved. We solve this problem using recent ideas from graph sampling theory to (i) define an empirical risk for relational data and (ii) obtain stochastic gradients for this empirical risk that are automatically unbiased. This is achieved by considering the method by which data is sampled from a graph as an explicit component of model design. By integrating fast implementations of graph sampling schemes with standard automatic differentiation tools, we provide an efficient turnkey solver for the risk minimization problem. We establish basic theoretical properties of the procedure. Finally, we demonstrate relational ERM with application to two non-standard problems: one-stage training for semi-supervised node classification, and learning embedding vectors for vertex attributes. Experiments confirm that the turnkey inference procedure is effective in practice, and that the sampling scheme used for model specification has a strong effect on model performance. Code is available at https://github.com/wooden-spoon/relational-ERM. |
Tasks | Node Classification |
Published | 2018-06-27 |
URL | http://arxiv.org/abs/1806.10701v2 |
http://arxiv.org/pdf/1806.10701v2.pdf | |
PWC | https://paperswithcode.com/paper/empirical-risk-minimization-and-stochastic |
Repo | https://github.com/wooden-spoon/relational-ERM |
Framework | tf |
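The core recipe — treat the graph-sampling scheme as an explicit part of the model, then run SGD on the induced empirical risk — can be illustrated with a minimal sketch. This is not the paper's implementation (which pairs fast samplers with automatic differentiation); here we use p-sampling of vertices and a toy edge-logistic loss over embedding dot products, with hand-derived gradients, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def p_sample(adj, p=0.5):
    """p-sampling: keep each vertex independently with probability p,
    then take the induced subgraph."""
    keep = np.flatnonzero(rng.random(adj.shape[0]) < p)
    return keep, adj[np.ix_(keep, keep)]

def edge_loss_and_grad(emb, keep, sub_adj):
    """Logistic loss over all vertex pairs of the sampled subgraph,
    scored by embedding dot products; returns loss and full gradient."""
    z = emb[keep]
    scores = z @ z.T
    probs = 1.0 / (1.0 + np.exp(-scores))
    loss = -(sub_adj * np.log(probs + 1e-12)
             + (1 - sub_adj) * np.log(1 - probs + 1e-12)).mean()
    g_scores = (probs - sub_adj) / sub_adj.size   # dL/d(scores)
    grad = np.zeros_like(emb)
    grad[keep] = (g_scores + g_scores.T) @ z      # chain rule through z z^T
    return loss, grad

# Toy graph and the relational-ERM SGD loop
n, d = 50, 8
adj = np.triu((rng.random((n, n)) < 0.1).astype(float), 1)
adj += adj.T
emb = 0.1 * rng.standard_normal((n, d))
for step in range(200):
    keep, sub = p_sample(adj, p=0.4)
    if keep.size < 2:
        continue
    loss, grad = edge_loss_and_grad(emb, keep, sub)
    emb -= 0.5 * grad
```

Swapping `p_sample` for, say, a random-walk sampler changes the model and not just the optimizer — that is the sense in which the sampling scheme is a modeling choice.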
Legendre Decomposition for Tensors
Title | Legendre Decomposition for Tensors |
Authors | Mahito Sugiyama, Hiroyuki Nakahara, Koji Tsuda |
Abstract | We present a novel nonnegative tensor decomposition method, called Legendre decomposition, which factorizes an input tensor into a multiplicative combination of parameters. Thanks to the well-developed theory of information geometry, the reconstructed tensor is unique and always minimizes the KL divergence from an input tensor. We empirically show that Legendre decomposition can more accurately reconstruct tensors than other nonnegative tensor decomposition methods. |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04502v2 |
http://arxiv.org/pdf/1802.04502v2.pdf | |
PWC | https://paperswithcode.com/paper/legendre-decomposition-for-tensors |
Repo | https://github.com/mahito-sugiyama/Legendre-decomposition |
Framework | none |
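In our notation (reconstructed from the abstract and the standard information-geometric setup, so treat the details as an assumption), Legendre decomposition fits a tensor from an exponential family by convex minimization of the KL divergence, which is why the reconstruction is unique:

```latex
\min_{\theta}\; D_{\mathrm{KL}}\!\left(P \,\middle\|\, Q_\theta\right)
  \;=\; \sum_{v} P(v)\,\log\frac{P(v)}{Q_\theta(v)},
\qquad
Q_\theta(v) \;=\; \exp\!\Big(\sum_{u \le v,\; u \in B} \theta(u)\Big),
```

where $v$ ranges over index tuples of the (normalized, nonnegative) input tensor $P$, $B$ is a chosen basis of parameter positions, and $\le$ is the componentwise partial order. Convexity of the objective in $\theta$ yields the unique, always-KL-minimizing reconstruction the abstract refers to.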
Towards Understanding Regularization in Batch Normalization
Title | Towards Understanding Regularization in Batch Normalization |
Authors | Ping Luo, Xinjiang Wang, Wenqi Shao, Zhanglin Peng |
Abstract | Batch Normalization (BN) improves both convergence and generalization in training neural networks. This work studies these phenomena theoretically. We analyze BN using a basic building block of neural networks, consisting of a kernel layer, a BN layer, and a nonlinear activation function. This basic network helps us understand the impact of BN in three respects. First, viewing BN as an implicit regularizer, we decompose it into population normalization (PN) plus gamma decay as an explicit regularizer. Second, the learning dynamics of BN and this regularization show that training converges with a large maximum and effective learning rate. Third, the generalization behavior of BN is explored using statistical mechanics. Experiments demonstrate that BN in convolutional neural networks shares the regularization traits predicted by the above analyses. |
Tasks | |
Published | 2018-09-04 |
URL | http://arxiv.org/abs/1809.00846v4 |
http://arxiv.org/pdf/1809.00846v4.pdf | |
PWC | https://paperswithcode.com/paper/towards-understanding-regularization-in-batch |
Repo | https://github.com/darshansiddu01/CNN |
Framework | pytorch |
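A minimal sketch of the decomposition's two ingredients, batch statistics versus population statistics (illustrative code, not the paper's analysis): the paper's claim is that the gap between the two acts as an implicit regularizer, expressible as population normalization plus a gamma-decay term.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Standard BN: normalize with statistics of the current mini-batch.
    x: (batch, features)."""
    mu, var = x.mean(axis=0), x.var(axis=0)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def population_norm(x, pop_mu, pop_var, gamma, beta, eps=1e-5):
    """PN: the same affine-normalize transform, but with fixed
    population statistics; the sampling noise that BN adds on top of
    this is what the paper identifies as gamma decay."""
    return gamma * (x - pop_mu) / np.sqrt(pop_var + eps) + beta
```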
On the iterative refinement of densely connected representation levels for semantic segmentation
Title | On the iterative refinement of densely connected representation levels for semantic segmentation |
Authors | Arantxa Casanova, Guillem Cucurull, Michal Drozdzal, Adriana Romero, Yoshua Bengio |
Abstract | State-of-the-art semantic segmentation approaches increase the receptive field of their models by using either a downsampling path composed of poolings/strided convolutions or successive dilated convolutions. However, it is not clear which operation leads to the best results. In this paper, we systematically study the differences introduced by distinct receptive field enlargement methods and their impact on the performance of a novel architecture, called Fully Convolutional DenseResNet (FC-DRN). FC-DRN has a densely connected backbone composed of residual networks. Following standard image segmentation architectures, receptive field enlargement operations that change the representation level are interleaved among the residual networks. This allows the model to exploit the benefits of both residual and dense connectivity patterns, namely: gradient flow, iterative refinement of representations, multi-scale feature combination and deep supervision. To highlight the potential of our model, we test it on the challenging CamVid urban scene understanding benchmark and make the following observations: 1) downsampling operations outperform dilations when the model is trained from scratch, 2) dilations are useful during the finetuning step of the model, 3) coarser representations require fewer refinement steps, and 4) ResNets (by model construction) are good regularizers, since they can reduce the model capacity when needed. Finally, we compare our architecture to alternative methods and report state-of-the-art results on the CamVid dataset, with fewer than half the parameters of the alternatives. |
Tasks | Scene Understanding, Semantic Segmentation |
Published | 2018-04-30 |
URL | http://arxiv.org/abs/1804.11332v1 |
http://arxiv.org/pdf/1804.11332v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-iterative-refinement-of-densely |
Repo | https://github.com/ArantxaCasanova/fc-drn |
Framework | pytorch |
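The two receptive-field enlargement methods the paper compares can be made concrete with standard receptive-field arithmetic (a sketch under the usual conv-arithmetic assumptions; the layer stacks below are illustrative, not FC-DRN's):

```python
def receptive_field(layers):
    """Track receptive field size through a stack of layers.
    Each layer is (kernel, stride, dilation); both strided/pooled and
    dilated convolutions enlarge the receptive field."""
    rf, jump = 1, 1
    for k, s, d in layers:
        rf += (k - 1) * d * jump   # dilation widens the kernel's span
        jump *= s                  # stride multiplies the step size
    return rf

# Two routes to the same receptive field:
print(receptive_field([(3, 2, 1)] * 4))                     # 4 strided 3x3 convs -> 31
print(receptive_field([(3, 1, d) for d in (1, 2, 4, 8)]))   # dilated 3x3 convs  -> 31
```

Both stacks reach a receptive field of 31, but only the dilated stack preserves spatial resolution — which is why the trade-off the paper studies is nontrivial.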
Generating Realistic Geology Conditioned on Physical Measurements with Generative Adversarial Networks
Title | Generating Realistic Geology Conditioned on Physical Measurements with Generative Adversarial Networks |
Authors | Emilien Dupont, Tuanfeng Zhang, Peter Tilke, Lin Liang, William Bailey |
Abstract | An important problem in geostatistics is to build models of the subsurface of the Earth given physical measurements at sparse spatial locations. Typically, this is done using spatial interpolation methods or by reproducing patterns from a reference image. However, these algorithms fail to produce realistic patterns and do not exhibit the wide range of uncertainty inherent in the prediction of geology. In this paper, we show how semantic inpainting with Generative Adversarial Networks can be used to generate varied realizations of geology which honor physical measurements while matching the expected geological patterns. In contrast to other algorithms, our method scales well with the number of data points and mimics a distribution of patterns as opposed to a single pattern or image. The generated conditional samples are state of the art. |
Tasks | |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.03065v3 |
http://arxiv.org/pdf/1802.03065v3.pdf | |
PWC | https://paperswithcode.com/paper/generating-realistic-geology-conditioned-on |
Repo | https://github.com/amoodie/StratGAN |
Framework | tf |
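A sketch of the conditioning step as we read it: with a trained generator held fixed, semantic inpainting searches the latent space for realizations whose outputs honor the sparse measurements. The linear-`tanh` "generator" below is a stand-in for the trained GAN, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in generator; in the paper this is a trained GAN generator.
W = rng.standard_normal((64, 16))
def G(z):
    return np.tanh(W @ z)

# Sparse "physical measurements": values known at a few locations.
mask = np.zeros(64); mask[[3, 17, 42]] = 1.0
target = np.zeros(64); target[[3, 17, 42]] = [0.8, -0.2, 0.5]

# Latent-space search: minimize the masked reconstruction loss so the
# generated realization honors the measurements.
z = rng.standard_normal(16)
for _ in range(500):
    r = mask * (G(z) - target)                   # residual at measured points
    grad = W.T @ ((1 - np.tanh(W @ z) ** 2) * r) # chain rule through tanh
    z -= 0.05 * grad
```

Different random restarts of `z` give the varied realizations the abstract mentions.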
MNIST Dataset Classification Utilizing k-NN Classifier with Modified Sliding-window Metric
Title | MNIST Dataset Classification Utilizing k-NN Classifier with Modified Sliding-window Metric |
Authors | Divas Grover, Behrad Toghi |
Abstract | The MNIST dataset of handwritten digits is one of the most commonly used datasets in machine learning and computer vision research. We study this widely applicable classification problem and apply a simple yet efficient k-nearest-neighbor classifier with an enhanced heuristic. We evaluate the performance of the k-nearest-neighbor classification algorithm on the MNIST dataset, comparing the $L_2$ Euclidean distance metric to a modified distance metric that utilizes a sliding-window technique to avoid performance degradation due to slight spatial misalignments. Accuracy and confusion matrices are used as performance indicators to compare the baseline algorithm against the enhanced sliding-window method; results show a significant improvement with the proposed method. |
Tasks | |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06846v4 |
http://arxiv.org/pdf/1809.06846v4.pdf | |
PWC | https://paperswithcode.com/paper/mnist-dataset-classification-utilizing-k-nn |
Repo | https://github.com/BehradToghi/kNN_SWin |
Framework | none |
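A hedged sketch of the modified metric (the exact windowing in the paper may differ, and `np.roll` wraps around where a real implementation would pad): the distance between two digit images is the minimum L2 distance over small spatial shifts, which absorbs slight misalignments.

```python
import numpy as np

def sliding_window_distance(a, b, shifts=(-1, 0, 1)):
    """Shift-tolerant squared-L2 distance between two 2-D images:
    take the minimum over small translations of one image."""
    best = np.inf
    for dy in shifts:
        for dx in shifts:
            shifted = np.roll(np.roll(b, dy, axis=0), dx, axis=1)
            best = min(best, np.sum((a - shifted) ** 2))
    return best

def knn_predict(x, train_images, train_labels, k=3):
    """k-NN vote using the shift-tolerant metric.
    train_images / train_labels are assumed to be numpy arrays."""
    d = np.array([sliding_window_distance(x, t) for t in train_images])
    nearest = np.argsort(d)[:k]
    vals, counts = np.unique(train_labels[nearest], return_counts=True)
    return vals[np.argmax(counts)]
```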
FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation
Title | FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation |
Authors | Xu Han, Hao Zhu, Pengfei Yu, Ziyun Wang, Yuan Yao, Zhiyuan Liu, Maosong Sun |
Abstract | We present a Few-Shot Relation Classification Dataset (FewRel), consisting of 70,000 sentences covering 100 relations, derived from Wikipedia and annotated by crowdworkers. The relation of each sentence is first recognized by distant supervision methods and then filtered by crowdworkers. We adapt the most recent state-of-the-art few-shot learning methods for relation classification and conduct a thorough evaluation of these methods. Empirical results show that even the most competitive few-shot learning models struggle on this task, especially as compared with humans. We also show that a range of different reasoning skills are needed to solve our task. These results indicate that few-shot relation classification remains an open problem and still requires further research. Our detailed analysis points out multiple directions for future research. All details and resources about the dataset and baselines are released at http://zhuhao.me/fewrel. |
Tasks | Few-Shot Learning, Few-Shot Relation Classification, Relation Classification, Relation Extraction |
Published | 2018-10-24 |
URL | http://arxiv.org/abs/1810.10147v2 |
http://arxiv.org/pdf/1810.10147v2.pdf | |
PWC | https://paperswithcode.com/paper/fewrel-a-large-scale-supervised-few-shot |
Repo | https://github.com/ProKil/FewRel |
Framework | pytorch |
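The evaluation protocol behind the benchmark is N-way K-shot episode sampling; a minimal sketch (our helper, with `data` assumed to map each relation name to a list of encoded sentences):

```python
import numpy as np

def sample_episode(data, N=5, K=5, Q=1, rng=np.random.default_rng()):
    """Sample one N-way K-shot episode: pick N relations, then K support
    and Q query sentences per relation. Returns support set, query set,
    and query labels (indices into the sampled relations)."""
    relations = rng.choice(list(data), N, replace=False)
    support, query, labels = [], [], []
    for i, r in enumerate(relations):
        picks = rng.choice(len(data[r]), K + Q, replace=False)
        support += [data[r][p] for p in picks[:K]]
        query += [data[r][p] for p in picks[K:]]
        labels += [i] * Q
    return support, query, labels
```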
Non-local U-Net for Biomedical Image Segmentation
Title | Non-local U-Net for Biomedical Image Segmentation |
Authors | Zhengyang Wang, Na Zou, Dinggang Shen, Shuiwang Ji |
Abstract | Deep learning has shown great promise in various biomedical image segmentation tasks. Existing models are typically based on U-Net and rely on an encoder-decoder architecture with stacked local operators to aggregate long-range information gradually. However, using only local operators limits efficiency and effectiveness. In this work, we propose non-local U-Nets, which are equipped with flexible global aggregation blocks, for biomedical image segmentation. These blocks can be inserted into U-Net as size-preserving processes, as well as down-sampling and up-sampling layers. We perform thorough experiments on the 3D multimodality isointense infant brain MR image segmentation task to evaluate the non-local U-Nets. Results show that our proposed models achieve top performance with fewer parameters and faster computation. |
Tasks | Brain Image Segmentation, Semantic Segmentation |
Published | 2018-12-10 |
URL | https://arxiv.org/abs/1812.04103v2 |
https://arxiv.org/pdf/1812.04103v2.pdf | |
PWC | https://paperswithcode.com/paper/global-deep-learning-methods-for |
Repo | https://github.com/zhengyang-wang/3D-Unet--Tensorflow |
Framework | tf |
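A sketch of the idea behind a global aggregation block (simplified to plain self-attention over flattened positions; the paper's blocks also come in size-preserving and down-/up-sampling variants): every position aggregates information from all others in one step, instead of growing the receptive field through stacked local convolutions.

```python
import numpy as np

def global_aggregation(x, wq, wk, wv):
    """x: (positions, channels); wq/wk/wv: projection matrices with
    illustrative shapes (channels, d). Returns globally mixed features."""
    q, k, v = x @ wq, x @ wk, x @ wv
    att = q @ k.T / np.sqrt(k.shape[1])
    att = np.exp(att - att.max(axis=1, keepdims=True))  # stable softmax
    att /= att.sum(axis=1, keepdims=True)
    return att @ v   # each position is a weighted sum over all positions
```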
Automatic Skin Lesion Segmentation Using Deep Fully Convolutional Networks
Title | Automatic Skin Lesion Segmentation Using Deep Fully Convolutional Networks |
Authors | Hongming Xu, Tae Hyun Hwang |
Abstract | This paper summarizes our method and validation results for the ISIC Challenge 2018 - Skin Lesion Analysis Towards Melanoma Detection - Task 1: Lesion Segmentation |
Tasks | Lesion Segmentation |
Published | 2018-07-17 |
URL | http://arxiv.org/abs/1807.06466v1 |
http://arxiv.org/pdf/1807.06466v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-skin-lesion-segmentation-using-deep |
Repo | https://github.com/RegulusReggie/CS259 |
Framework | none |
Deep $k$-Means: Re-Training and Parameter Sharing with Harder Cluster Assignments for Compressing Deep Convolutions
Title | Deep $k$-Means: Re-Training and Parameter Sharing with Harder Cluster Assignments for Compressing Deep Convolutions |
Authors | Junru Wu, Yue Wang, Zhenyu Wu, Zhangyang Wang, Ashok Veeraraghavan, Yingyan Lin |
Abstract | The current trend of pushing CNNs deeper with convolutions has created a pressing demand for higher compression gains on CNNs where convolutions dominate the computation and parameter count (e.g., GoogLeNet, ResNet and Wide ResNet). Further, the high energy consumption of convolutions limits their deployment on mobile devices. To this end, we propose a simple yet effective scheme for compressing convolutions by applying k-means clustering to the weights; compression is achieved through weight sharing, by recording only $K$ cluster centers and per-weight assignment indexes. We then introduce a novel spectrally relaxed $k$-means regularization, which tends to make hard assignments of convolutional layer weights to the $K$ learned cluster centers during re-training. We additionally propose an improved set of metrics to estimate the energy consumption of CNN hardware implementations, whose estimates are verified to be consistent with a previously proposed energy estimation tool extrapolated from actual hardware measurements. We finally evaluate Deep $k$-Means across several CNN models in terms of both compression ratio and energy consumption reduction, observing promising results without incurring accuracy loss. The code is available at https://github.com/Sandbox3aster/Deep-K-Means |
Tasks | |
Published | 2018-06-24 |
URL | http://arxiv.org/abs/1806.09228v1 |
http://arxiv.org/pdf/1806.09228v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-k-means-re-training-and-parameter-1 |
Repo | https://github.com/Sandbox3aster/Deep-K-Means |
Framework | pytorch |
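The compression step is easy to sketch: cluster a layer's weights with k-means and store only the K centers plus per-weight indexes. This is plain Lloyd's algorithm, not the paper's exact pipeline — the spectrally relaxed regularization used during re-training is not shown:

```python
import numpy as np

def kmeans_quantize(weights, K=16, iters=20, seed=0):
    """Weight-sharing compression: replace every weight with its
    nearest of K cluster centers. Returns the quantized weights,
    the centers, and the per-weight index table."""
    rng = np.random.default_rng(seed)
    w = weights.ravel()
    centers = rng.choice(w, K, replace=False)
    for _ in range(iters):
        idx = np.abs(w[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(K):
            if np.any(idx == k):
                centers[k] = w[idx == k].mean()  # Lloyd update
    return centers[idx].reshape(weights.shape), centers, idx
```

With 32-bit weights and K=16, each stored index needs only 4 bits, which is where the compression gain comes from.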
QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites
Title | QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites |
Authors | Jiankai Sun, Sobhan Moosavi, Rajiv Ramnath, Srinivasan Parthasarathy |
Abstract | In this paper, we present a framework for Question Difficulty and Expertise Estimation (QDEE) in Community Question Answering sites (CQAs) such as Yahoo! Answers and Stack Overflow, which tackles a fundamental challenge in crowdsourcing: how to appropriately route and assign questions to users with the suitable expertise. This problem domain has been the subject of much research and includes both language-agnostic and language-conscious solutions. We bring to bear a key language-agnostic insight: that users gain expertise and therefore tend to ask as well as answer more difficult questions over time. We use this insight within the popular competition (directed) graph model to estimate question difficulty and user expertise by identifying key hierarchical structure within said model. An important and novel contribution here is the application of “social agony” to this problem domain. Difficulty levels of newly posted questions (the cold-start problem) are estimated using our QDEE framework and additional textual features. We also propose a model to route newly posted questions to appropriate users based on the difficulty level of the question and the expertise of the user. Extensive experiments on real-world CQAs such as Yahoo! Answers and Stack Overflow demonstrate the improved efficacy of our approach over contemporary state-of-the-art models. The QDEE framework also allows us to characterize user expertise in novel ways by identifying interesting patterns and roles played by different users in such CQAs. |
Tasks | Community Question Answering, Question Answering |
Published | 2018-03-31 |
URL | http://arxiv.org/abs/1804.00109v2 |
http://arxiv.org/pdf/1804.00109v2.pdf | |
PWC | https://paperswithcode.com/paper/qdee-question-difficulty-and-expertise |
Repo | https://github.com/zhenv5/QDEE |
Framework | none |
DropFilter: Dropout for Convolutions
Title | DropFilter: Dropout for Convolutions |
Authors | Zhengsu Chen, Jianwei Niu, Qi Tian |
Abstract | Using a large number of parameters, deep neural networks have achieved remarkable performance on computer vision and natural language processing tasks. However, networks with many parameters usually suffer from overfitting. Dropout is a widely used method for dealing with overfitting. Although dropout can significantly regularize densely connected layers in neural networks, it leads to suboptimal results when used for convolutional layers. To tackle this problem, we propose DropFilter, a new dropout method for convolutional layers. DropFilter randomly suppresses the outputs of some filters, motivated by the observation that co-adaptations are more likely to occur between filters than within filters in convolutional layers. Using DropFilter, we remarkably improve the performance of convolutional networks on CIFAR and ImageNet. |
Tasks | |
Published | 2018-10-23 |
URL | http://arxiv.org/abs/1810.09849v1 |
http://arxiv.org/pdf/1810.09849v1.pdf | |
PWC | https://paperswithcode.com/paper/dropfilter-dropout-for-convolutions |
Repo | https://github.com/harirajeev/Reference |
Framework | pytorch |
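The mechanism reduces to one masking step — zero out whole filters (channels) rather than individual activations. A minimal sketch with inverted-dropout rescaling (the rescaling is our assumption; the abstract does not specify it):

```python
import numpy as np

rng = np.random.default_rng(0)

def drop_filter(x, p=0.3, training=True):
    """DropFilter sketch: suppress entire output channels at random.
    x: (batch, channels, H, W); each channel is kept with prob. 1 - p
    and, as in inverted dropout, surviving channels are rescaled."""
    if not training or p == 0:
        return x
    keep = (rng.random(x.shape[1]) >= p).astype(x.dtype)
    return x * keep[None, :, None, None] / (1 - p)
```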
Adversarial Defense of Image Classification Using a Variational Auto-Encoder
Title | Adversarial Defense of Image Classification Using a Variational Auto-Encoder |
Authors | Yi Luo, Henry Pfister |
Abstract | Deep neural networks are known to be vulnerable to adversarial attacks. This exposes them to potential exploits in security-sensitive applications and highlights their lack of robustness. This paper uses a variational auto-encoder (VAE) to defend against adversarial attacks on image classification tasks. This VAE defense has a few nice properties: (1) it is quite flexible, and its use of randomness makes it harder to attack; (2) it can learn disentangled representations that prevent blurry reconstruction; and (3) a patch-wise VAE defense strategy is used that does not require retraining for different image sizes. For moderate to severe attacks, this system outperforms or closely matches the performance of JPEG compression with its best quality parameter, and it offers more flexibility and potential for improvement via training. |
Tasks | Adversarial Defense, Image Classification |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.02891v1 |
http://arxiv.org/pdf/1812.02891v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-defense-of-image-classification |
Repo | https://github.com/Roy-YL/VAE-Adversarial-Defense |
Framework | tf |
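The patch-wise strategy can be sketched independently of the VAE itself: purify each fixed-size patch with the trained VAE's encode/decode and reassemble, so one model serves any image size. `vae_reconstruct` is an assumed callable standing in for the trained VAE, and the patch size is assumed to divide the image size:

```python
import numpy as np

def patchwise_purify(image, vae_reconstruct, patch=8):
    """Run each patch through the VAE before classification, stripping
    adversarial perturbations while keeping the model size-agnostic."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            out[i:i + patch, j:j + patch] = vae_reconstruct(
                image[i:i + patch, j:j + patch])
    return out
```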
Masking: A New Perspective of Noisy Supervision
Title | Masking: A New Perspective of Noisy Supervision |
Authors | Bo Han, Jiangchao Yao, Gang Niu, Mingyuan Zhou, Ivor Tsang, Ya Zhang, Masashi Sugiyama |
Abstract | It is important to learn various types of classifiers given training data with noisy labels. Noisy labels, in the most popular noise model hitherto, are corrupted from ground-truth labels by an unknown noise transition matrix. Thus, by estimating this matrix, classifiers can escape from overfitting those noisy labels. However, such estimation is practically difficult, due to either the indirect nature of two-step approaches or data that is not large enough to support end-to-end approaches. In this paper, we propose a human-assisted approach called Masking that conveys human cognition of invalid class transitions and naturally speculates the structure of the noise transition matrix. To this end, we derive a structure-aware probabilistic model incorporating a structure prior, and solve the challenges of structure extraction and structure alignment. Thanks to Masking, we only estimate unmasked noise transition probabilities, and the burden of estimation is tremendously reduced. We conduct extensive experiments on CIFAR-10 and CIFAR-100 with three noise structures, as well as the industrial-level Clothing1M with agnostic noise structure, and the results show that Masking can improve the robustness of classifiers significantly. |
Tasks | Image Classification |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.08193v2 |
http://arxiv.org/pdf/1805.08193v2.pdf | |
PWC | https://paperswithcode.com/paper/masking-a-new-perspective-of-noisy |
Repo | https://github.com/bhanML/Co-teaching |
Framework | pytorch |
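The key reduction is easy to state in code: given a human-specified mask of valid class transitions, only the unmasked entries of the noise transition matrix need estimating. A sketch of just the masking arithmetic (the paper's structure-aware probabilistic model is not shown):

```python
import numpy as np

def masked_transition(counts, mask):
    """counts: raw estimates of label-flip frequencies; mask: 1 where a
    transition is considered valid (keep the diagonal unmasked).
    Invalid transitions are fixed to zero and rows renormalized."""
    T = counts * mask
    return T / T.sum(axis=1, keepdims=True)

# Example: tri-diagonal structure -- noise only between adjacent classes
mask = np.eye(4) + np.eye(4, k=1) + np.eye(4, k=-1)
counts = np.full((4, 4), 0.1) + np.eye(4)
print(masked_transition(counts, mask))
```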
Incorporating Literals into Knowledge Graph Embeddings
Title | Incorporating Literals into Knowledge Graph Embeddings |
Authors | Agustinus Kristiadi, Mohammad Asif Khan, Denis Lukovnikov, Jens Lehmann, Asja Fischer |
Abstract | Knowledge graphs, on top of entities and their relationships, contain other important elements: literals. Literals encode interesting properties (e.g. the height) of entities that are not captured by links between entities alone. Most of the existing work on embedding (or latent feature) based knowledge graph analysis focuses mainly on the relations between entities. In this work, we study the effect of incorporating literal information into existing link prediction methods. Our approach, which we name LiteralE, is an extension that can be plugged into existing latent feature methods. LiteralE merges entity embeddings with their literal information using a learnable, parametrized function, such as a simple linear or nonlinear transformation, or a multilayer neural network. We extend several popular embedding models based on LiteralE and evaluate their performance on the task of link prediction. Despite its simplicity, LiteralE proves to be an effective way to incorporate literal information into existing embedding based methods, improving their performance on different standard datasets, which we augment with their literals and provide as a testbed for further research. |
Tasks | Entity Embeddings, Knowledge Graph Embeddings, Knowledge Graphs, Link Prediction |
Published | 2018-02-03 |
URL | https://arxiv.org/abs/1802.00934v3 |
https://arxiv.org/pdf/1802.00934v3.pdf | |
PWC | https://paperswithcode.com/paper/incorporating-literals-into-knowledge-graph |
Repo | https://github.com/SmartDataAnalytics/LiteralE |
Framework | pytorch |
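A sketch of the merging function — a gated combination in the spirit of LiteralE, where the parameter names and exact form are our assumption (the paper also evaluates simpler linear and deeper variants): the gate decides, per dimension, how much literal-derived signal to mix into the entity embedding.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def literal_e_gate(e, l, Wz, Wh, bz):
    """Merge entity embedding e (d,) with literal features l (dl,)
    via a learnable gate; Wz, Wh have shape (d, d + dl), bz shape (d,).
    Returns a literal-enriched embedding of the same size as e."""
    x = np.concatenate([e, l])
    z = sigmoid(Wz @ x + bz)     # gate: how much literal signal to admit
    h = np.tanh(Wh @ x)          # candidate literal-aware embedding
    return z * h + (1 - z) * e   # gated mix with the original embedding

# Illustrative usage with random parameters
d, dl = 4, 2
rng = np.random.default_rng(0)
e, l = rng.standard_normal(d), rng.standard_normal(dl)
Wz, Wh = rng.standard_normal((d, d + dl)), rng.standard_normal((d, d + dl))
enriched = literal_e_gate(e, l, Wz, Wh, np.zeros(d))
```

Because the output keeps the entity embedding's dimensionality, the merged vector can be dropped into any existing scoring function, which is what makes the extension pluggable.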