Paper Group AWR 173
Limitations of the Empirical Fisher Approximation for Natural Gradient Descent
Title | Limitations of the Empirical Fisher Approximation for Natural Gradient Descent |
Authors | Frederik Kunstner, Lukas Balles, Philipp Hennig |
Abstract | Natural gradient descent, which preconditions a gradient descent update with the Fisher information matrix of the underlying statistical model, is a way to capture partial second-order information. Several highly visible works have advocated an approximation known as the empirical Fisher, drawing connections between approximate second-order methods and heuristics like Adam. We dispute this argument by showing that the empirical Fisher—unlike the Fisher—does not generally capture second-order information. We further argue that the conditions under which the empirical Fisher approaches the Fisher (and the Hessian) are unlikely to be met in practice, and that, even on simple optimization problems, the pathologies of the empirical Fisher can have undesirable effects. |
Tasks | |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12558v2 |
https://arxiv.org/pdf/1905.12558v2.pdf | |
PWC | https://paperswithcode.com/paper/limitations-of-the-empirical-fisher |
Repo | https://github.com/fkunstner/limitations-empirical-fisher |
Framework | none |
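As a toy illustration of the distinction the abstract above draws (not code from the paper or its repository), the sketch below assumes a one-parameter Gaussian regression model y ~ N(theta * x, 1). For that model the Fisher equals the Hessian of the negative log-likelihood, the sum of x^2, while the empirical Fisher sums squared per-example gradients at the observed labels and therefore scales with the residuals. The function names and data are hypothetical.

```python
# Toy model (an assumption for illustration): y ~ N(theta * x, 1).
# Per-example gradient of log p(y | x, theta) w.r.t. theta is (y - theta * x) * x.

def fisher(xs):
    """Fisher information: expectation over the model's own y, which gives sum x^2."""
    return sum(x * x for x in xs)

def empirical_fisher(xs, ys, theta):
    """Empirical Fisher: squared gradients evaluated at the observed labels."""
    return sum(((y - theta * x) * x) ** 2 for x, y in zip(xs, ys))

xs = [1.0, 2.0, 3.0]
noisy_ys = [1.5, 3.5, 7.0]   # hypothetical labels with nonzero residuals
exact_ys = [2.0, 4.0, 6.0]   # hypothetical labels fit exactly by theta = 2
theta = 2.0

print(fisher(xs))                             # 14.0
print(empirical_fisher(xs, noisy_ys, theta))  # 10.25
print(empirical_fisher(xs, exact_ys, theta))  # 0.0
```

When the data are fit exactly, the empirical Fisher vanishes while the curvature of this toy model stays at 14, which is one simple way the two quantities can disagree.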
Multi-Task Deep Neural Networks for Natural Language Understanding
Title | Multi-Task Deep Neural Networks for Natural Language Understanding |
Authors | Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao |
Abstract | In this paper, we present a Multi-Task Deep Neural Network (MT-DNN) for learning representations across multiple natural language understanding (NLU) tasks. MT-DNN not only leverages large amounts of cross-task data, but also benefits from a regularization effect that leads to more general representations in order to adapt to new tasks and domains. MT-DNN extends the model proposed in Liu et al. (2015) by incorporating a pre-trained bidirectional transformer language model, known as BERT (Devlin et al., 2018). MT-DNN obtains new state-of-the-art results on ten NLU tasks, including SNLI, SciTail, and eight out of nine GLUE tasks, pushing the GLUE benchmark to 82.7% (2.2% absolute improvement). We also demonstrate using the SNLI and SciTail datasets that the representations learned by MT-DNN allow domain adaptation with substantially fewer in-domain labels than the pre-trained BERT representations. The code and pre-trained models are publicly available at https://github.com/namisan/mt-dnn. |
Tasks | Domain Adaptation, Language Modelling, Linguistic Acceptability, Natural Language Inference, Paraphrase Identification, Sentiment Analysis |
Published | 2019-01-31 |
URL | https://arxiv.org/abs/1901.11504v2 |
https://arxiv.org/pdf/1901.11504v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-deep-neural-networks-for-natural |
Repo | https://github.com/phueb/BabyBertSRL |
Framework | pytorch |
A Clustering-Based Combinatorial Approach to Unsupervised Matching of Product Titles
Title | A Clustering-Based Combinatorial Approach to Unsupervised Matching of Product Titles |
Authors | Leonidas Akritidis, Athanasios Fevgas, Panayiotis Bozanis, Christos Makris |
Abstract | The constant growth of the e-commerce industry has rendered the problem of product retrieval particularly important. As more enterprises move their activities on the Web, the volume and the diversity of the product-related information increase quickly. These factors make it difficult for the users to identify and compare the features of their desired products. Recent studies proved that the standard similarity metrics cannot effectively identify identical products, since similar titles often refer to different products and vice-versa. Other studies employed external data sources (search engines) to enrich the titles; these solutions are rather impractical mainly because the external data fetching is slow. In this paper we introduce UPM, an unsupervised algorithm for matching products by their titles. UPM is independent of any external sources, since it analyzes the titles and extracts combinations of words out of them. These combinations are evaluated according to several criteria, and the most appropriate of them constitutes the cluster where a product is classified into. UPM is also parameter-free, it avoids product pairwise comparisons, and includes a post-processing verification stage which corrects the erroneous matches. The experimental evaluation of UPM demonstrated its superiority against the state-of-the-art approaches in terms of both efficiency and effectiveness. |
Tasks | |
Published | 2019-03-07 |
URL | http://arxiv.org/abs/1903.04276v1 |
http://arxiv.org/pdf/1903.04276v1.pdf | |
PWC | https://paperswithcode.com/paper/a-clustering-based-combinatorial-approach-to |
Repo | https://github.com/BinaryWiz/Unsupervised-Product-Matching-Using-Combinations-and-Permutations |
Framework | none |
Quality of Uncertainty Quantification for Bayesian Neural Network Inference
Title | Quality of Uncertainty Quantification for Bayesian Neural Network Inference |
Authors | Jiayu Yao, Weiwei Pan, Soumya Ghosh, Finale Doshi-Velez |
Abstract | Bayesian Neural Networks (BNNs) place priors over the parameters in a neural network. Inference in BNNs, however, is difficult; all inference methods for BNNs are approximate. In this work, we empirically compare the quality of predictive uncertainty estimates for 10 common inference methods on both regression and classification tasks. Our experiments demonstrate that commonly used metrics (e.g. test log-likelihood) can be misleading. Our experiments also indicate that inference innovations designed to capture structure in the posterior do not necessarily produce high quality posterior approximations. |
Tasks | |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.09686v1 |
https://arxiv.org/pdf/1906.09686v1.pdf | |
PWC | https://paperswithcode.com/paper/quality-of-uncertainty-quantification-for |
Repo | https://github.com/renato145/ClassificationUncertainty |
Framework | pytorch |
Writer Independent Offline Signature Recognition Using Ensemble Learning
Title | Writer Independent Offline Signature Recognition Using Ensemble Learning |
Authors | Sourya Dipta Das, Himanshu Ladia, Vaibhav Kumar, Shivansh Mishra |
Abstract | The area of Handwritten Signature Verification has been broadly researched in the last decades, but remains an open research problem. In offline (static) signature verification, the dynamic information of the signature writing process is lost, and it is difficult to design good feature extractors that can distinguish genuine signatures and skilled forgeries. This verification task is even harder in writer-independent scenarios, which is undeniably fiscal for realistic cases. In this paper, we propose an ensemble model for the offline, writer-independent signature verification task using deep learning. We use two CNNs for feature extraction, followed by RGBT for classification and stacking to generate the final prediction vector. We have done extensive experiments on various datasets from various sources to maintain a variance in the dataset. We have achieved state-of-the-art performance on various datasets. |
Tasks | |
Published | 2019-01-19 |
URL | http://arxiv.org/abs/1901.06494v1 |
http://arxiv.org/pdf/1901.06494v1.pdf | |
PWC | https://paperswithcode.com/paper/writer-independent-offline-signature |
Repo | https://github.com/himanshuladia/signature-recognizer |
Framework | none |
EMPNet: Neural Localisation and Mapping Using Embedded Memory Points
Title | EMPNet: Neural Localisation and Mapping Using Embedded Memory Points |
Authors | Gil Avraham, Yan Zuo, Thanuja Dharmasiri, Tom Drummond |
Abstract | Continuously estimating an agent’s state space and a representation of its surroundings has proven vital towards full autonomy. A shared common ground among systems which successfully achieve this feat is the integration of previously encountered observations into the current state being estimated. This necessitates the use of a memory module for incorporating previously visited states whilst simultaneously offering an internal representation of the observed environment. In this work we develop a memory module which contains rigidly aligned point-embeddings that represent a coherent scene structure acquired from an RGB-D sequence of observations. The point-embeddings are extracted using modern convolutional neural network architectures, and alignment is performed by computing a dense correspondence matrix between a new observation and the current embeddings residing in the memory module. The whole framework is end-to-end trainable, resulting in a recurrent joint optimisation of the point-embeddings contained in the memory. This process amplifies the shared information across states, providing increased robustness and accuracy. We show significant improvement of our method across a set of experiments performed on the synthetic VIZDoom environment and a real world Active Vision Dataset. |
Tasks | |
Published | 2019-07-31 |
URL | https://arxiv.org/abs/1907.13268v2 |
https://arxiv.org/pdf/1907.13268v2.pdf | |
PWC | https://paperswithcode.com/paper/empnet-neural-localisation-and-mapping-using |
Repo | https://github.com/IntelligenceDatum/ICCV2019_Model_Compression |
Framework | none |
A Novel Bi-directional Interrelated Model for Joint Intent Detection and Slot Filling
Title | A Novel Bi-directional Interrelated Model for Joint Intent Detection and Slot Filling |
Authors | Haihong E, Peiqing Niu, Zhongfu Chen, Meina Song |
Abstract | A spoken language understanding (SLU) system includes two main tasks, slot filling (SF) and intent detection (ID). The joint model for the two tasks is becoming a tendency in SLU. But the bi-directional interrelated connections between the intent and slots are not established in the existing joint models. In this paper, we propose a novel bi-directional interrelated model for joint intent detection and slot filling. We introduce an SF-ID network to establish direct connections for the two tasks to help them promote each other mutually. Besides, we design an entirely new iteration mechanism inside the SF-ID network to enhance the bi-directional interrelated connections. The experimental results show that the relative improvement in the sentence-level semantic frame accuracy of our model is 3.79% and 5.42% on ATIS and Snips datasets, respectively, compared to the state-of-the-art model. |
Tasks | Intent Detection, Slot Filling, Spoken Language Understanding |
Published | 2019-06-30 |
URL | https://arxiv.org/abs/1907.00390v1 |
https://arxiv.org/pdf/1907.00390v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-bi-directional-interrelated-model-for |
Repo | https://github.com/Polly42Rose/SiriusIntentPredictionSlotFilling |
Framework | pytorch |
Enhancing Item Response Theory for Cognitive Diagnosis
Title | Enhancing Item Response Theory for Cognitive Diagnosis |
Authors | Song Cheng, Qi Liu |
Abstract | Cognitive diagnosis is a fundamental and crucial task in many educational applications, e.g., computer adaptive test and cognitive assignments. Item Response Theory (IRT) is a classical cognitive diagnosis method which can provide interpretable parameters (i.e., student latent trait, question discrimination, and difficulty) for analyzing student performance. However, traditional IRT ignores the rich information in question texts, cannot diagnose knowledge concept proficiency, and it is inaccurate to diagnose the parameters for the questions which only appear several times. To this end, in this paper, we propose a general Deep Item Response Theory (DIRT) framework to enhance traditional IRT for cognitive diagnosis by exploiting semantic representation from question texts with deep learning. In DIRT, we first use a proficiency vector to represent students’ proficiency in knowledge concepts and embed question texts and knowledge concepts to dense vectors by Word2Vec. Then, we design a deep diagnosis module to diagnose parameters in traditional IRT by deep learning techniques. Finally, with the diagnosed parameters, we input them into the logistic-like formula of IRT to predict student performance. Extensive experimental results on real-world data clearly demonstrate the effectiveness and interpretation power of DIRT framework. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.10957v3 |
https://arxiv.org/pdf/1905.10957v3.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-item-response-theory-for-cognitive |
Repo | https://github.com/chsong513/DIRT |
Framework | pytorch |
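The abstract above refers to IRT's logistic-like formula without stating it. A minimal sketch follows, assuming the standard two-parameter logistic (2PL) form, which may differ in detail from the variant used in DIRT. The function and argument names are illustrative, not taken from the paper's repository.

```python
import math

def irt_2pl(theta, a, b):
    """P(correct answer) under 2PL IRT (assumed form): sigmoid(a * (theta - b)).

    theta: student latent trait, a: question discrimination, b: question difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A student whose trait equals the question difficulty answers correctly half the time.
print(irt_2pl(theta=0.0, a=1.0, b=0.0))  # 0.5
# A higher trait gives a higher predicted probability for the same question.
print(irt_2pl(1.0, 1.5, 0.0) > irt_2pl(0.0, 1.5, 0.0))  # True
```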
Graph Transformer Networks
Title | Graph Transformer Networks |
Authors | Seongjun Yun, Minbyul Jeong, Raehyun Kim, Jaewoo Kang, Hyunwoo J. Kim |
Abstract | Graph neural networks (GNNs) have been widely used in representation learning on graphs and achieved state-of-the-art performance in tasks such as node classification and link prediction. However, most existing GNNs are designed to learn node representations on the fixed and homogeneous graphs. The limitations especially become problematic when learning representations on a misspecified graph or a heterogeneous graph that consists of various types of nodes and edges. In this paper, we propose Graph Transformer Networks (GTNs) that are capable of generating new graph structures, which involve identifying useful connections between unconnected nodes on the original graph, while learning effective node representation on the new graphs in an end-to-end fashion. Graph Transformer layer, a core layer of GTNs, learns a soft selection of edge types and composite relations for generating useful multi-hop connections so-called meta-paths. Our experiments show that GTNs learn new graph structures, based on data and tasks without domain knowledge, and yield powerful node representation via convolution on the new graphs. Without domain-specific graph preprocessing, GTNs achieved the best performance in all three benchmark node classification tasks against the state-of-the-art methods that require pre-defined meta-paths from domain knowledge. |
Tasks | Link Prediction, Node Classification, Representation Learning |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.06455v2 |
https://arxiv.org/pdf/1911.06455v2.pdf | |
PWC | https://paperswithcode.com/paper/graph-transformer-networks-1 |
Repo | https://github.com/seongjunyun/Graph_Transformer_Networks |
Framework | pytorch |
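The soft selection of edge types described in the abstract above can be sketched as a softmax-weighted sum of per-edge-type adjacency matrices, with a two-hop meta-path adjacency obtained by multiplying two such selections. This is a simplified illustration in plain Python, not the authors' implementation. The logits, the tiny graph, and all function names are assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_select(adjs, logits):
    """Convex combination of adjacency matrices, one matrix per edge type."""
    w = softmax(logits)
    n = len(adjs[0])
    return [[sum(w[t] * adjs[t][i][j] for t in range(len(adjs)))
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Square matrix product; composes two relations into a two-hop relation."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two hypothetical edge types on 3 nodes: type 0 has edge 0 -> 1, type 1 has edge 1 -> 2.
A0 = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
A1 = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]

# Equal logits weight both edge types by 0.5; in training these would be learned.
q1 = soft_select([A0, A1], [0.0, 0.0])
q2 = soft_select([A0, A1], [0.0, 0.0])
meta_path = matmul(q1, q2)
print(meta_path[0][2])  # 0.25
```

Nodes 0 and 2 share no edge in either input graph, yet the composed matrix assigns them a nonzero weight, which is the sense in which new connections between unconnected nodes are generated.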
BiSET: Bi-directional Selective Encoding with Template for Abstractive Summarization
Title | BiSET: Bi-directional Selective Encoding with Template for Abstractive Summarization |
Authors | Kai Wang, Xiaojun Quan, Rui Wang |
Abstract | The success of neural summarization models stems from the meticulous encodings of source articles. To overcome the impediments of limited and sometimes noisy training data, one promising direction is to make better use of the available training data by applying filters during summarization. In this paper, we propose a novel Bi-directional Selective Encoding with Template (BiSET) model, which leverages template discovered from training data to softly select key information from each source article to guide its summarization process. Extensive experiments on a standard summarization dataset were conducted and the results show that the template-equipped BiSET model manages to improve the summarization performance significantly with a new state of the art. |
Tasks | Abstractive Text Summarization |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05012v1 |
https://arxiv.org/pdf/1906.05012v1.pdf | |
PWC | https://paperswithcode.com/paper/biset-bi-directional-selective-encoding-with |
Repo | https://github.com/InitialBug/BiSET |
Framework | pytorch |
Towards Knowledge-Based Personalized Product Description Generation in E-commerce
Title | Towards Knowledge-Based Personalized Product Description Generation in E-commerce |
Authors | Qibin Chen, Junyang Lin, Yichang Zhang, Hongxia Yang, Jingren Zhou, Jie Tang |
Abstract | Quality product descriptions are critical for providing competitive customer experience in an e-commerce platform. An accurate and attractive description not only helps customers make an informed decision but also improves the likelihood of purchase. However, crafting a successful product description is tedious and highly time-consuming. Due to its importance, automating the product description generation has attracted considerable interests from both research and industrial communities. Existing methods mainly use templates or statistical methods, and their performance could be rather limited. In this paper, we explore a new way to generate the personalized product description by combining the power of neural networks and knowledge base. Specifically, we propose a KnOwledge Based pErsonalized (or KOBE) product description generation model in the context of e-commerce. In KOBE, we extend the encoder-decoder framework, the Transformer, to a sequence modeling formulation using self-attention. In order to make the description both informative and personalized, KOBE considers a variety of important factors during text generation, including product aspects, user categories, and knowledge base, etc. Experiments on real-world datasets demonstrate that the proposed method out-performs the baseline on various metrics. KOBE can achieve an improvement of 9.7% over state-of-the-arts in terms of BLEU. We also present several case studies as the anecdotal evidence to further prove the effectiveness of the proposed approach. The framework has been deployed in Taobao, the largest online e-commerce platform in China. |
Tasks | Text Generation |
Published | 2019-03-29 |
URL | https://arxiv.org/abs/1903.12457v3 |
https://arxiv.org/pdf/1903.12457v3.pdf | |
PWC | https://paperswithcode.com/paper/towards-knowledge-based-personalized-product |
Repo | https://github.com/THUDM/KOBE |
Framework | pytorch |
AIRD: Adversarial Learning Framework for Image Repurposing Detection
Title | AIRD: Adversarial Learning Framework for Image Repurposing Detection |
Authors | Ayush Jaiswal, Yue Wu, Wael AbdAlmageed, Iacopo Masi, Premkumar Natarajan |
Abstract | Image repurposing is a commonly used method for spreading misinformation on social media and online forums, which involves publishing untampered images with modified metadata to create rumors and further propaganda. While manual verification is possible, given vast amounts of verified knowledge available on the internet, the increasing prevalence and ease of this form of semantic manipulation call for the development of robust automatic ways of assessing the semantic integrity of multimedia data. In this paper, we present a novel method for image repurposing detection that is based on the real-world adversarial interplay between a bad actor who repurposes images with counterfeit metadata and a watchdog who verifies the semantic consistency between images and their accompanying metadata, where both players have access to a reference dataset of verified content, which they can use to achieve their goals. The proposed method exhibits state-of-the-art performance on location-identity, subject-identity and painting-artist verification, showing its efficacy across a diverse set of scenarios. |
Tasks | |
Published | 2019-03-02 |
URL | http://arxiv.org/abs/1903.00788v3 |
http://arxiv.org/pdf/1903.00788v3.pdf | |
PWC | https://paperswithcode.com/paper/aird-adversarial-learning-framework-for-image |
Repo | https://github.com/isi-vista/AIRD-Datasets |
Framework | none |
Detecting Photoshopped Faces by Scripting Photoshop
Title | Detecting Photoshopped Faces by Scripting Photoshop |
Authors | Sheng-Yu Wang, Oliver Wang, Andrew Owens, Richard Zhang, Alexei A. Efros |
Abstract | Most malicious photo manipulations are created using standard image editing tools, such as Adobe Photoshop. We present a method for detecting one very popular Photoshop manipulation – image warping applied to human faces – using a model trained entirely using fake images that were automatically generated by scripting Photoshop itself. We show that our model outperforms humans at the task of recognizing manipulated images, can predict the specific location of edits, and in some cases can be used to “undo” a manipulation to reconstruct the original, unedited image. We demonstrate that the system can be successfully applied to real, artist-created image manipulations. |
Tasks | Image Manipulation Detection |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05856v2 |
https://arxiv.org/pdf/1906.05856v2.pdf | |
PWC | https://paperswithcode.com/paper/detecting-photoshopped-faces-by-scripting |
Repo | https://github.com/PeterWang512/FALdetector |
Framework | pytorch |
Learning representations of irregular particle-detector geometry with distance-weighted graph networks
Title | Learning representations of irregular particle-detector geometry with distance-weighted graph networks |
Authors | Shah Rukh Qasim, Jan Kieseler, Yutaro Iiyama, Maurizio Pierini |
Abstract | We explore the use of graph networks to deal with irregular-geometry detectors in the context of particle reconstruction. Thanks to their representation-learning capabilities, graph networks can exploit the full detector granularity, while natively managing the event sparsity and arbitrarily complex detector geometries. We introduce two distance-weighted graph network architectures, dubbed GarNet and GravNet layers, and apply them to a typical particle reconstruction task. The performance of the new architectures is evaluated on a data set of simulated particle interactions on a toy model of a highly granular calorimeter, loosely inspired by the endcap calorimeter to be installed in the CMS detector for the High-Luminosity LHC phase. We study the clustering of energy depositions, which is the basis for calorimetric particle reconstruction, and provide a quantitative comparison to alternative approaches. The proposed algorithms provide an interesting alternative to existing methods, offering equally performing or less resource-demanding solutions with less underlying assumptions on the detector geometry and, consequently, the possibility to generalize to other detectors. |
Tasks | Representation Learning |
Published | 2019-02-21 |
URL | https://arxiv.org/abs/1902.07987v2 |
https://arxiv.org/pdf/1902.07987v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-representations-of-irregular |
Repo | https://github.com/jkiesele/caloGraphNN |
Framework | tf |
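The abstract above does not spell out how distance weighting enters the GarNet and GravNet layers. As a generic sketch of the idea only, the code below averages the features of each vertex's k nearest neighbours with weights that decay with squared distance. The kernel exp(-d^2), the use of a plain weighted mean, and all names are assumptions made for illustration, not the paper's layer definitions.

```python
import math

def distance_weighted_mean(coords, feats, k):
    """For each vertex, average the features of its k nearest neighbours,
    weighting each neighbour by exp(-squared_distance) (assumed kernel)."""
    out = []
    for i, ci in enumerate(coords):
        # Squared distances to every other vertex, nearest first.
        nearest = sorted(
            (sum((a - b) ** 2 for a, b in zip(ci, cj)), j)
            for j, cj in enumerate(coords) if j != i
        )[:k]
        weights = [math.exp(-d2) for d2, _ in nearest]
        total = sum(weights)
        out.append(sum(w * feats[j] for w, (_, j) in zip(weights, nearest)) / total)
    return out

# Three hypothetical hits on a line, each with one scalar feature (e.g. energy).
coords = [(0.0,), (1.0,), (3.0,)]
feats = [1.0, 2.0, 4.0]
print(distance_weighted_mean(coords, feats, k=1))  # [2.0, 1.0, 2.0]
```

Because neighbours are found from coordinates rather than from a fixed grid, the same routine applies to irregularly placed sensors.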
Physically-interpretable classification of network dynamics for complex collective motions
Title | Physically-interpretable classification of network dynamics for complex collective motions |
Authors | Keisuke Fujii, Naoya Takeishi, Motokazu Hojo, Yuki Inaba, Yoshinobu Kawahara |
Abstract | Understanding complex network dynamics is a fundamental issue in various scientific and engineering fields. Network theory is capable of revealing the relationship between elements and their propagation; however, for complex collective motions, the network properties often transiently and complexly change. A fundamental question addressed here pertains to the classification of collective motion network based on physically-interpretable dynamical properties. Here we apply a data-driven spectral analysis called graph dynamic mode decomposition, which obtains the dynamical properties for collective motion classification. Using a ballgame as an example, we classified the strategic collective motions in different global behaviours and discovered that, in addition to the physical properties, the contextual node information was critical for classification. Furthermore, we discovered the label-specific stronger spectra in the relationship among the nearest agents, providing physical and semantic interpretations. Our approach contributes to the understanding of complex networks involving collective motions from the perspective of nonlinear dynamical systems. |
Tasks | |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.04859v1 |
https://arxiv.org/pdf/1905.04859v1.pdf | |
PWC | https://paperswithcode.com/paper/physically-interpretable-classification-of |
Repo | https://github.com/keisuke198619/GraphDMD |
Framework | none |