January 25, 2020

2996 words 15 mins read

Paper Group NANR 44



LSH Microbatches for Stochastic Gradients: Value in Rearrangement

Title LSH Microbatches for Stochastic Gradients: Value in Rearrangement
Authors Eliav Buchnik, Edith Cohen, Avinatan Hassidim, Yossi Matias
Abstract Metric embeddings are immensely useful representations of associations between entities (images, users, search queries, words, and more). Embeddings are learned by optimizing a loss objective of the general form of a sum over example associations. Typically, the optimization uses stochastic gradient updates over minibatches of examples that are arranged independently at random. In this work, we propose the use of structured arrangements through randomized microbatches of examples that are more likely to include similar ones. We make a principled argument for the properties of our arrangements that accelerate the training and present efficient algorithms to generate microbatches that respect the marginal distribution of training examples. Finally, we observe experimentally that our structured arrangements accelerate training by 3-20%. Structured arrangements emerge as a powerful and novel performance knob for SGD that is independent of and complementary to other SGD hyperparameters, and is thus a candidate for wide deployment.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=r1erRoCqtX
PDF https://openreview.net/pdf?id=r1erRoCqtX
PWC https://paperswithcode.com/paper/lsh-microbatches-for-stochastic-gradients-1
Repo
Framework
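A minimal sketch of the bucketing idea from the abstract above, not the authors' code: similar examples are grouped via random-hyperplane (SimHash) signatures, and microbatches are then read off the sorted buckets, so each batch is enriched with similar examples while every example still appears once per epoch. The paper's generators additionally randomize arrangements to preserve the marginal distribution; re-drawing the hyperplanes each epoch is a stand-in for that here.

```python
import numpy as np

def lsh_microbatches(X, n_bits=8, microbatch_size=32, seed=0):
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_bits))
    # SimHash signature: sign pattern of projections onto random hyperplanes.
    codes = (X @ planes > 0).astype(np.uint8)
    keys = codes @ (1 << np.arange(n_bits))         # pack bits into an integer key
    order = np.argsort(keys, kind="stable")         # similar examples become adjacent
    for start in range(0, len(order), microbatch_size):
        yield order[start:start + microbatch_size]  # indices of one microbatch

# Usage: iterate microbatches and feed each one to an SGD step.
X = np.random.default_rng(1).standard_normal((1000, 64))
for idx in lsh_microbatches(X):
    pass  # gradient_step(X[idx]) would go here
```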

SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter

Title SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter
Authors Valerio Basile, Cristina Bosco, Elisabetta Fersini, Debora Nozza, Viviana Patti, Francisco Manuel Rangel Pardo, Paolo Rosso, Manuela Sanguinetti
Abstract The paper describes the organization of SemEval 2019 Task 5 on the detection of hate speech against immigrants and women in Spanish and English messages extracted from Twitter. The task is organized into two related classification subtasks: a main binary subtask for detecting the presence of hate speech, and a finer-grained one devoted to identifying further features of hateful content, such as aggressiveness and the target harassed, and to distinguishing whether the incitement is against an individual or a group. HatEval has been one of the most popular tasks in SemEval-2019, with a total of 108 submitted runs for Subtask A and 70 runs for Subtask B from 74 different teams. The data provided for the task are described, showing how they were collected and annotated. Moreover, the paper provides an analysis and discussion of the participating systems and the results they achieved in both subtasks.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2007/
PDF https://www.aclweb.org/anthology/S19-2007
PWC https://paperswithcode.com/paper/semeval-2019-task-5-multilingual-detection-of
Repo
Framework

Improving deep forest by confidence screening

Title Improving deep forest by confidence screening
Authors Ming Pang, Kai-Ming Ting, Peng Zhao, Zhi-Hua Zhou
Abstract Most studies of deep learning are based on neural network models, where many layers of parameterized nonlinear differentiable modules are trained by backpropagation. Recently, it has been shown that deep learning can also be realized by non-differentiable modules trained without backpropagation, an approach called deep forest. The resulting representation learning process is based on a cascade of cascades of decision tree forests, where the high memory requirement and the high time cost inhibit the training of large models. In this paper, we propose a simple yet effective approach to improve the efficiency of deep forest. The key idea is to pass instances with high confidence directly to the final stage rather than through all the levels. We also provide a theoretical analysis suggesting a means to vary the model complexity from low to high as the level increases in the cascade, which further reduces the memory requirement and time cost. Our experiments show that the proposed approach achieves highly competitive predictive performance, with time cost and memory requirement reduced by up to one order of magnitude.
Tasks Representation Learning
Published 2019-11-17
URL https://ieeexplore.ieee.org/abstract/document/8594967
PDF http://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/icdm18.pdf
PWC https://paperswithcode.com/paper/improving-deep-forest-by-confidence-screening
Repo
Framework
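A minimal sketch of the confidence-screening idea described above, with scikit-learn random forests standing in for the paper's cascade levels and a simplified threshold rule; it assumes integer class labels. Instances whose predicted class probability clears the threshold exit the cascade early; the rest continue with class-vector-augmented features, as in deep forest.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def cascade_with_screening(X_tr, y_tr, X_te, levels=3, threshold=0.9):
    preds = np.full(len(X_te), -1)                 # -1 marks "not yet predicted"
    feats_tr, feats_te = X_tr, X_te
    active = np.arange(len(X_te))                  # instances still in the cascade
    for level in range(levels):
        forest = RandomForestClassifier(n_estimators=50, random_state=level)
        forest.fit(feats_tr, y_tr)
        proba = forest.predict_proba(feats_te[active])
        confident = proba.max(axis=1) >= threshold
        if level == levels - 1:
            confident[:] = True                    # final level: everyone exits
        preds[active[confident]] = forest.classes_[proba[confident].argmax(axis=1)]
        active = active[~confident]
        if len(active) == 0:
            break
        # Augment features with class vectors, as in deep forest cascades.
        feats_tr = np.hstack([feats_tr, forest.predict_proba(feats_tr)])
        feats_te = np.hstack([feats_te, forest.predict_proba(feats_te)])
    return preds
```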

Does Multi-Task Learning Always Help?: An Evaluation on Health Informatics

Title Does Multi-Task Learning Always Help?: An Evaluation on Health Informatics
Authors Aditya Joshi, Sarvnaz Karimi, Ross Sparks, Cecile Paris, C Raina MacIntyre
Abstract Multi-Task Learning (MTL) has been an attractive approach to dealing with limited labeled datasets or leveraging related tasks across a variety of NLP problems. We examine the benefit of MTL for three specific pairs of health informatics tasks that deal with: (a) overlapping symptoms for the same classification problem (personal health mention classification for influenza and for a set of symptoms); (b) overlapping medical concepts for related classification problems (vaccine usage and drug usage detection); and (c) related classification problems (vaccination intent and vaccination relevance detection). We experiment with a simple neural architecture: a shared layer followed by task-specific dense layers. The novelty of this work is that it compares alternatives for the shared layers for these pairs of tasks. While our observations agree with the promise of MTL as compared to single-task learning, for health informatics we show that the benefit also comes with caveats in terms of the choice of shared layers and the relatedness between the participating tasks.
Tasks Multi-Task Learning
Published 2019-04-01
URL https://www.aclweb.org/anthology/U19-1020/
PDF https://www.aclweb.org/anthology/U19-1020
PWC https://paperswithcode.com/paper/does-multi-task-learning-always-help-an
Repo
Framework
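A minimal sketch of the architecture named in the abstract (a shared layer followed by task-specific dense layers); dimensions and the alternating-batch training loop are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn

class SharedMTL(nn.Module):
    def __init__(self, in_dim=300, hidden=128, n_classes=(2, 2)):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(hidden, c) for c in n_classes)

    def forward(self, x, task):
        return self.heads[task](self.shared(x))   # task index selects the head

model = SharedMTL()
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.Adam(model.parameters())
# Alternate batches between tasks so the shared layer learns from both.
for task, (x, y) in [(0, (torch.randn(8, 300), torch.randint(2, (8,)))),
                     (1, (torch.randn(8, 300), torch.randint(2, (8,))))]:
    opt.zero_grad()
    loss_fn(model(x, task), y).backward()
    opt.step()
```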

BEHAVIOR MODULE IN NEURAL NETWORKS

Title BEHAVIOR MODULE IN NEURAL NETWORKS
Authors Andrey Sakryukin, Yongkang Wong, Mohan S. Kankanhalli
Abstract The prefrontal cortex (PFC) is the part of the brain responsible for the behavior repertoire. Inspired by PFC functionality and connectivity, as well as the human behavior formation process, we propose a novel modular neural network architecture with a Behavioral Module (BM) and a corresponding end-to-end training strategy. This approach allows the efficient learning of behavior and preference representations. This property is particularly useful for user modeling (as in dialog agents) and recommendation tasks, as it allows learning personalized representations of different user states. In experiments with video game playing, the results show that the proposed method allows the separation of the main task's objectives and behaviors between different BMs. The experiments also show network extendability through the independent learning of new behavior patterns. Moreover, we demonstrate a strategy for the efficient transfer of newly learned BMs to unseen tasks.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=Syl6tjAqKX
PDF https://openreview.net/pdf?id=Syl6tjAqKX
PWC https://paperswithcode.com/paper/behavior-module-in-neural-networks
Repo
Framework
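A hypothetical sketch of the modular layout described above: a backbone computes task features while a small, swappable Behavioral Module injects a behavior signal. The gating form is an assumption for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class BehaviorModule(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, h):
        return h * self.gate(h)         # behavior-specific modulation of features

class AgentWithBM(nn.Module):
    def __init__(self, obs_dim=16, dim=64, n_actions=4):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(obs_dim, dim), nn.ReLU())
        self.bm = BehaviorModule(dim)   # swap this module to change behavior
        self.policy = nn.Linear(dim, n_actions)

    def forward(self, obs):
        return self.policy(self.bm(self.backbone(obs)))

agent = AgentWithBM()
agent.bm = BehaviorModule()             # transferring a newly learned BM
print(agent(torch.randn(1, 16)).shape)  # torch.Size([1, 4])
```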

Causal importance of orientation selectivity for generalization in image recognition

Title Causal importance of orientation selectivity for generalization in image recognition
Authors Jumpei Ukita
Abstract Although both our brain and deep neural networks (DNNs) can perform high-level sensory-perception tasks such as image or speech recognition, the inner mechanism of these hierarchical information-processing systems is poorly understood in both neuroscience and machine learning. Recently, Morcos et al. (2018) examined the effect of class-selective units in DNNs, i.e., units with high-level selectivity, on network generalization, concluding that hidden units that are selectively activated by specific input patterns may harm the network’s performance. In this study, we revisit their hypothesis, considering units with selectivity for lower-level features, and argue that selective units are not always harmful to the network performance. Specifically, by using DNNs trained for image classification (7-layer CNNs and VGG16 trained on CIFAR-10 and ImageNet, respectively), we analyzed the orientation selectivity of individual units. Orientation selectivity is a low-level selectivity widely studied in visual neuroscience, in which, when images of bars with several orientations are presented to the eye, many neurons in the visual cortex respond selectively to a specific orientation. We found that orientation-selective units exist in both lower and higher layers of these DNNs, as in our brain. In particular, units in the lower layers become more orientation-selective as the generalization performance improves during the course of training of the DNNs. Consistently, networks that generalize better are more orientation-selective in the lower layers. We finally reveal that ablating these selective units in the lower layers substantially degrades the generalization performance, at least by disrupting the shift-invariance of the higher layers. These results suggest to the machine-learning community that, contrary to the triviality of units with high-level selectivity, lower-layer units with selectivity for low-level features can be indispensable for generalization, and for neuroscientists, orientation selectivity can play a causally important role in object recognition.
Tasks Image Classification, Object Recognition, Speech Recognition
Published 2019-05-01
URL https://openreview.net/forum?id=Bkx_Dj09tQ
PDF https://openreview.net/pdf?id=Bkx_Dj09tQ
PWC https://paperswithcode.com/paper/causal-importance-of-orientation-selectivity
Repo
Framework
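A minimal sketch of measuring a unit's orientation selectivity as described above, using the vector-averaging index common in visual neuroscience; the stimuli, the stand-in "unit", and the exact index are simplifications of the paper's protocol. The index is near 1 for a sharply tuned unit and near 0 for an untuned one.

```python
import numpy as np

def grating(theta, size=32, freq=0.3):
    # Oriented sinusoidal grating: the classic orientation stimulus.
    y, x = np.mgrid[0:size, 0:size]
    return np.cos(2 * np.pi * freq * (x * np.cos(theta) + y * np.sin(theta)))

def selectivity(unit_response, thetas):
    r = np.maximum([unit_response(grating(t)) for t in thetas], 0)  # rectify
    vec = np.sum(r * np.exp(2j * np.array(thetas)))  # 2*theta: orientation, not direction
    return float(np.abs(vec) / (np.sum(r) + 1e-8))   # 1 = selective, 0 = untuned

# Example "unit": a 7x7 filter applied to the image, standing in for a CNN unit.
filt = grating(0.0, size=7)                          # horizontally tuned filter
unit = lambda img: float((img[:7, :7] * filt).mean())
thetas = np.linspace(0, np.pi, 8, endpoint=False)
print(selectivity(unit, thetas))                     # high value => orientation-selective
```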

DBee: A Database for Creating and Managing Knowledge Graphs and Embeddings

Title DBee: A Database for Creating and Managing Knowledge Graphs and Embeddings
Authors Viktor Schlegel, André Freitas
Abstract This paper describes DBee, a database to support the construction of data-intensive AI applications. DBee provides a unique data model which operates jointly over large-scale knowledge graphs (KGs) and embedding vector spaces (VSs). This model supports queries which exploit the semantic properties of both types of representations (KGs and VSs). Additionally, DBee aims to facilitate the construction of KGs and VSs, by providing a library of generators, which can be used to create, integrate and transform data into KGs and VSs.
Tasks Knowledge Graphs
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5322/
PDF https://www.aclweb.org/anthology/D19-5322
PWC https://paperswithcode.com/paper/dbee-a-database-for-creating-and-managing
Repo
Framework
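DBee's own API is not shown in the abstract, so the following is a hypothetical illustration of the kind of joint query its data model supports: constrain candidates with a knowledge-graph relation, then rank them in the embedding vector space. The entities, relation names, and toy vectors are all invented for the example.

```python
import numpy as np
import networkx as nx

kg = nx.DiGraph()
kg.add_edge("aspirin", "analgesic", relation="is_a")
kg.add_edge("ibuprofen", "analgesic", relation="is_a")
kg.add_edge("insulin", "hormone", relation="is_a")

rng = np.random.default_rng(0)
vs = {e: rng.standard_normal(16) for e in kg.nodes}   # toy embedding space

def joint_query(kg, vs, cls, query_vec, k=2):
    # KG step: entities related to `cls` by the is_a relation.
    cands = [u for u, v, d in kg.edges(data=True)
             if v == cls and d["relation"] == "is_a"]
    # VS step: rank candidates by cosine similarity to the query vector.
    cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return sorted(cands, key=lambda e: -cos(vs[e], query_vec))[:k]

print(joint_query(kg, vs, "analgesic", vs["aspirin"]))
```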

A Robust Non-Clairvoyant Dynamic Mechanism for Contextual Auctions

Title A Robust Non-Clairvoyant Dynamic Mechanism for Contextual Auctions
Authors Yuan Deng, Sébastien Lahaie, Vahab Mirrokni
Abstract Dynamic mechanisms offer powerful techniques to improve on both revenue and efficiency by linking sequential auctions using state information, but these techniques rely on exact distributional information of the buyers’ valuations (present and future), which limits their use in learning settings. In this paper, we consider the problem of contextual auctions where the seller gradually learns a model of the buyer’s valuation as a function of the context (e.g., item features) and seeks a pricing policy that optimizes revenue. Building on the concept of a bank account mechanism—a special class of dynamic mechanisms that is known to be revenue-optimal—we develop a non-clairvoyant dynamic mechanism that is robust to both estimation errors in the buyer’s value distribution and strategic behavior on the part of the buyer. We then tailor its structure to achieve a policy with provably low regret against a constant approximation of the optimal dynamic mechanism in contextual auctions. Our result substantially improves on previous results that only provide revenue guarantees against static benchmarks.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9071-a-robust-non-clairvoyant-dynamic-mechanism-for-contextual-auctions
PDF http://papers.nips.cc/paper/9071-a-robust-non-clairvoyant-dynamic-mechanism-for-contextual-auctions.pdf
PWC https://paperswithcode.com/paper/a-robust-non-clairvoyant-dynamic-mechanism
Repo
Framework

Word2Sense: Sparse Interpretable Word Embeddings

Title Word2Sense: Sparse Interpretable Word Embeddings
Authors Abhishek Panigrahi, Harsha Vardhan Simhadri, Chiranjib Bhattacharyya
Abstract We present an unsupervised method to generate Word2Sense word embeddings that are interpretable: each dimension of the embedding space corresponds to a fine-grained sense, and the non-negative value of the embedding along the j-th dimension represents the relevance of the j-th sense to the word. The underlying LDA-based generative model can be extended to refine the representation of a polysemous word in a short context, allowing us to use the embeddings in contextual tasks. On computational NLP tasks, Word2Sense embeddings compare well with other word embeddings generated by unsupervised methods. Across tasks such as word similarity, entailment, sense induction, and contextual interpretation, Word2Sense is competitive with the state-of-the-art method for each task. Word2Sense embeddings are at least as sparse and fast to compute as prior art.
Tasks Word Embeddings
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1570/
PDF https://www.aclweb.org/anthology/P19-1570
PWC https://paperswithcode.com/paper/word2sense-sparse-interpretable-word
Repo
Framework
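A minimal sketch of why such embeddings are interpretable, with a toy sense inventory and made-up values (real Word2Sense vectors come from the LDA-based model): because each dimension is a sense, the largest entries of a word's sparse non-negative vector directly name its most relevant senses.

```python
import numpy as np

senses = ["finance", "river", "seating", "music", "sports"]
# Sparse, non-negative embedding for the word "bank" (illustrative values).
bank = np.array([0.7, 0.25, 0.0, 0.0, 0.05])

def top_senses(vec, senses, k=2):
    idx = np.argsort(-vec)[:k]                 # dimensions with the largest relevance
    return [(senses[i], float(vec[i])) for i in idx if vec[i] > 0]

print(top_senses(bank, senses))                # [('finance', 0.7), ('river', 0.25)]
```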

Towards a better understanding of Vector Quantized Autoencoders

Title Towards a better understanding of Vector Quantized Autoencoders
Authors Aurko Roy, Ashish Vaswani, Niki Parmar, Arvind Neelakantan
Abstract Deep neural networks with discrete latent variables offer the promise of better symbolic reasoning and of learning abstractions that are more useful for new tasks. There has been a surge of interest in discrete latent variable models; however, despite several recent improvements, their training has remained challenging and their performance has mostly failed to match that of their continuous counterparts. Recent work on vector quantized autoencoders (VQ-VAE) has made substantial progress in this direction, with its perplexity almost matching that of a VAE on datasets such as CIFAR-10. In this work, we investigate an alternative training technique for VQ-VAE, inspired by its connection to the Expectation Maximization (EM) algorithm. Training the discrete autoencoder with EM and combining it with sequence-level knowledge distillation allows us to develop a non-autoregressive machine translation model whose accuracy almost matches a strong greedy autoregressive baseline Transformer, while being 3.3 times faster at inference.
Tasks Latent Variable Models, Machine Translation
Published 2019-05-01
URL https://openreview.net/forum?id=HkGGfhC5Y7
PDF https://openreview.net/pdf?id=HkGGfhC5Y7
PWC https://paperswithcode.com/paper/towards-a-better-understanding-of-vector
Repo
Framework
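A minimal sketch of the EM view of vector quantization mentioned above (shapes, temperature, and the standalone numpy setting are illustrative): the E-step softly assigns encoder outputs to codewords by distance, and the M-step re-estimates each codeword as the weighted mean of the vectors assigned to it.

```python
import numpy as np

def em_codebook_step(z, codebook, temp=1.0):
    # E-step: responsibilities from (negative) squared distances.
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)   # (N, K)
    logits = -d2 / temp
    logits -= logits.max(axis=1, keepdims=True)                  # numerical stability
    resp = np.exp(logits)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: weighted mean of assigned encoder outputs per codeword.
    new_codebook = (resp.T @ z) / (resp.sum(axis=0)[:, None] + 1e-8)
    return new_codebook, resp

rng = np.random.default_rng(0)
z = rng.standard_normal((256, 8))           # encoder outputs
codebook = rng.standard_normal((16, 8))     # K = 16 codewords
for _ in range(10):
    codebook, resp = em_codebook_step(z, codebook)
```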

A Weakly Supervised Fine Label Classifier Enhanced by Coarse Supervision

Title A Weakly Supervised Fine Label Classifier Enhanced by Coarse Supervision
Authors Fariborz Taherkhani, Hadi Kazemi, Ali Dabouei, Jeremy Dawson, Nasser M. Nasrabadi
Abstract Objects are usually organized in a hierarchical structure in which each coarse category (e.g., big cat) corresponds to a superclass of several fine categories (e.g., cheetah, leopard). The objects grouped within the same coarse category, but in different fine categories, usually share a set of global visual features; however, these objects have distinctive local properties that characterize them at a fine level. This paper addresses the challenge of fine image classification in a weakly supervised fashion, whereby a subset of images is tagged with fine labels while the remainder are tagged with coarse labels. We propose a new deep model that leverages coarse images to improve the classification performance on fine images within the coarse category. Our model is an end-to-end framework consisting of a Convolutional Neural Network (CNN) which uses both fine and coarse images to tune its parameters. The CNN outputs are then fanned out into two separate branches: the first branch uses a supervised low-rank self-expressive layer to project the CNN outputs onto low-rank subspaces that capture the global structures for coarse classification, while the other branch uses a supervised sparse self-expressive layer to project them onto sparse subspaces that capture the local structures for fine classification. Our deep model uses coarse images in conjunction with fine images to jointly explore the low-rank and sparse subspaces by sharing parameters during training, which causes the data points produced by the CNN to be well projected onto both the sparse and low-rank subspaces for classification.
Tasks Image Classification
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Taherkhani_A_Weakly_Supervised_Fine_Label_Classifier_Enhanced_by_Coarse_Supervision_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Taherkhani_A_Weakly_Supervised_Fine_Label_Classifier_Enhanced_by_Coarse_Supervision_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/a-weakly-supervised-fine-label-classifier
Repo
Framework
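A minimal sketch of the shared two-branch layout and the mixed supervision described above; the low-rank and sparse self-expressive layers are replaced by plain linear heads here, so this shows only the overall structure, not the full method.

```python
import torch
import torch.nn as nn

class CoarseFineNet(nn.Module):
    def __init__(self, n_coarse=3, n_fine=9):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.coarse_head = nn.Linear(16, n_coarse)   # global-structure branch
        self.fine_head = nn.Linear(16, n_fine)       # local-structure branch

    def forward(self, x):
        h = self.cnn(x)                              # shared CNN features
        return self.coarse_head(h), self.fine_head(h)

model, ce = CoarseFineNet(), nn.CrossEntropyLoss()
x = torch.randn(8, 3, 32, 32)
y_coarse = torch.randint(3, (8,))
y_fine = torch.randint(9, (8,))
has_fine = torch.rand(8) < 0.5                       # only a subset has fine labels
c_logits, f_logits = model(x)
loss = ce(c_logits, y_coarse)                        # every image supervises coarse
if has_fine.any():
    loss = loss + ce(f_logits[has_fine], y_fine[has_fine])  # fine loss on subset
loss.backward()
```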

Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)

Title Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)
Authors
Abstract
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-6200/
PDF https://www.aclweb.org/anthology/D19-6200
PWC https://paperswithcode.com/paper/proceedings-of-the-tenth-international-1
Repo
Framework

Comparing the Intrinsic Performance of Clinical Concept Embeddings by Their Field of Medicine

Title Comparing the Intrinsic Performance of Clinical Concept Embeddings by Their Field of Medicine
Authors John-Jose Nunez, Giuseppe Carenini
Abstract Pre-trained word embeddings are becoming increasingly popular for natural language processing tasks. This includes medical applications, where embeddings are trained for clinical concepts using specific medical data. Recent work continues to improve on these embeddings. However, no one has yet sought to determine whether these embeddings work as well for one field of medicine as they do in others. In this work, we use intrinsic methods to evaluate embeddings from the various fields of medicine as defined by their ICD-9 systems. We find significant differences between fields, and motivate future work to investigate whether extrinsic tasks will follow a similar pattern.
Tasks Word Embeddings
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-6202/
PDF https://www.aclweb.org/anthology/D19-6202
PWC https://paperswithcode.com/paper/comparing-the-intrinsic-performance-of
Repo
Framework
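A minimal sketch of one intrinsic comparison in the spirit of the abstract above (toy vectors and field names; the paper's actual intrinsic measures may differ): group clinical concept vectors by ICD-9 field and compare the mean pairwise cosine similarity within each group.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
# Toy stand-ins for clinical concept embeddings, grouped by ICD-9 field.
fields = {"circulatory": [rng.standard_normal(50) for _ in range(5)],
          "respiratory": [rng.standard_normal(50) for _ in range(5)]}

def mean_within_similarity(vectors):
    cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.mean([cos(a, b) for a, b in combinations(vectors, 2)]))

for field, vecs in fields.items():                   # differences across fields
    print(field, round(mean_within_similarity(vecs), 3))
```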

Bipartite expander Hopfield networks as self-decoding high-capacity error correcting codes

Title Bipartite expander Hopfield networks as self-decoding high-capacity error correcting codes
Authors Rishidev Chaudhuri, Ila Fiete
Abstract Neural network models of memory and error correction famously include the Hopfield network, which can directly store—and error-correct through its dynamics—arbitrary N-bit patterns, but only ~N of them. On the other end of the spectrum, Shannon’s coding theory established that it is possible to represent exponentially many states (~e^N) using N symbols in such a way that an optimal decoder can correct all noise up to a threshold. We prove that it is possible to construct an associative content-addressable network that combines the properties of strong error-correcting codes and Hopfield networks: it simultaneously possesses exponentially many stable states; these states are robust, with basins of attraction large enough that they can be correctly recovered despite errors in a finite fraction of all nodes; and the errors are intrinsically corrected by the network’s own dynamics. The network is a two-layer Boltzmann machine with simple neural dynamics, low dynamic-range (binary) pairwise synaptic connections, and sparse expander-graph connectivity. Thus, quasi-random sparse structures—characteristic of important error-correcting codes—may provide for high-performance computation in artificial neural networks and the brain.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8985-bipartite-expander-hopfield-networks-as-self-decoding-high-capacity-error-correcting-codes
PDF http://papers.nips.cc/paper/8985-bipartite-expander-hopfield-networks-as-self-decoding-high-capacity-error-correcting-codes.pdf
PWC https://paperswithcode.com/paper/bipartite-expander-hopfield-networks-as-self
Repo
Framework
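A crude sketch of the self-decoding idea on a sparse bipartite graph, using random parity checks and classic bit-flipping rather than the paper's exact construction: any bit that violates a majority of its checks is flipped, and these local dynamics alone drive a noisy state back to the stored codeword (here, the all-zeros word).

```python
import numpy as np

rng = np.random.default_rng(0)
n_var, n_chk, var_deg = 100, 50, 3
H = np.zeros((n_chk, n_var), dtype=int)
for v in range(n_var):                        # sparse bipartite connectivity:
    H[rng.choice(n_chk, var_deg, replace=False), v] = 1  # each bit in 3 checks

def bit_flip_decode(x, steps=20):
    for _ in range(steps):
        unsat = H @ x % 2                     # parity checks currently violated
        votes = H.T @ unsat                   # violated checks touching each bit
        bad = votes > var_deg / 2             # majority of a bit's checks unhappy
        if not bad.any():
            break                             # reached a stable state
        x = x ^ bad.astype(int)               # the network's own dynamics: flip
    return x

x = np.zeros(n_var, dtype=int)                # stored state: all-zeros codeword
x[rng.choice(n_var, 5, replace=False)] = 1    # corrupt a fraction of the nodes
print(int(bit_flip_decode(x).sum()), "errors remain")   # typically 0
```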

A Mutual Learning Method for Salient Object Detection With Intertwined Multi-Supervision

Title A Mutual Learning Method for Salient Object Detection With Intertwined Multi-Supervision
Authors Runmin Wu, Mengyang Feng, Wenlong Guan, Dong Wang, Huchuan Lu, Errui Ding
Abstract Though deep learning techniques have made great progress in salient object detection recently, the predicted saliency maps still suffer from incomplete predictions due to the internal complexity of objects, and from inaccurate boundaries caused by strides in convolution and pooling operations. To alleviate these issues, we propose to train saliency detection networks by exploiting supervision not only from salient object detection, but also from foreground contour detection and edge detection. First, we leverage the salient object detection and foreground contour detection tasks in an intertwined manner to generate saliency maps with uniform highlights. Second, the foreground contour and edge detection tasks guide each other simultaneously, leading to more precise foreground contour predictions and reducing local noise in edge predictions. In addition, we develop a novel mutual learning module (MLM) which serves as the building block of our method. Each MLM consists of multiple network branches trained in a mutual learning manner, which improves performance by a large margin. Extensive experiments on seven challenging datasets demonstrate that the proposed method delivers state-of-the-art results in both salient object detection and edge detection.
Tasks Contour Detection, Edge Detection, Object Detection, Saliency Detection, Salient Object Detection
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Wu_A_Mutual_Learning_Method_for_Salient_Object_Detection_With_Intertwined_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Wu_A_Mutual_Learning_Method_for_Salient_Object_Detection_With_Intertwined_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/a-mutual-learning-method-for-salient-object
Repo
Framework
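A minimal sketch of mutual learning between branches, with two toy convolutional branches standing in for the paper's MLM (which operates on saliency maps with additional contour and edge supervision): each branch is trained against the ground truth and also to match the other branch's prediction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

branch_a = nn.Conv2d(3, 1, 3, padding=1)      # stand-ins for two network branches
branch_b = nn.Conv2d(3, 1, 3, padding=1)
opt = torch.optim.Adam([*branch_a.parameters(), *branch_b.parameters()])

x = torch.randn(4, 3, 32, 32)
gt = torch.rand(4, 1, 32, 32).round()         # toy binary saliency ground truth

pa, pb = branch_a(x), branch_b(x)
bce = F.binary_cross_entropy_with_logits
mimicry = F.mse_loss(torch.sigmoid(pa), torch.sigmoid(pb))  # branches pull together
loss = bce(pa, gt) + bce(pb, gt) + mimicry    # supervision + mutual learning term
opt.zero_grad()
loss.backward()
opt.step()
```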