Paper Group NAWR 16
Cross-Modal Commentator: Automatic Machine Commenting Based on Cross-Modal Information. Fast and Accurate Stochastic Gradient Estimation. Learning Macroscopic Brain Connectomes via Group-Sparse Factorization. Bridging Machine Learning and Logical Reasoning by Abductive Learning. Cross-channel Communication Networks. Selective Sampling-based Scalable Sparse Subspace Clustering. Annotating and Characterizing Clinical Sentences with Explicit Why-QA Cues. Generating Correctness Proofs with Neural Networks. CogNet: A Large-Scale Cognate Database. An Empirical study of Binary Neural Networks’ Optimisation. Inducing a Decision Tree with Discriminative Paths to Classify Entities in a Knowledge Graph. Certifying Geometric Robustness of Neural Networks. From legal to technical concept: Towards an automated classification of German political Twitter postings as criminal offenses. Joint Activity Recognition and Indoor Localization With WiFi Fingerprints. d-SNE: Domain Adaptation Using Stochastic Neighborhood Embedding.
Cross-Modal Commentator: Automatic Machine Commenting Based on Cross-Modal Information
Title | Cross-Modal Commentator: Automatic Machine Commenting Based on Cross-Modal Information |
Authors | Pengcheng Yang, Zhihan Zhang, Fuli Luo, Lei Li, Chengyang Huang, Xu Sun |
Abstract | Automatic commenting of online articles can provide additional opinions and facts to the reader, which improves user experience and engagement on social media platforms. Previous work focuses on automatic commenting based solely on textual content. However, in real scenarios, online articles usually contain multiple modal contents. For instance, graphic news contains plenty of images in addition to text. Contents other than text are also vital because they are not only more attractive to the reader but may also provide critical information. To remedy this, we propose a new task: cross-modal automatic commenting (CMAC), which aims to make comments by integrating multiple modal contents. We construct a large-scale dataset for this task and explore several representative methods. Going a step further, an effective co-attention model is presented to capture the dependency between textual and visual information. Evaluation results show that our proposed model can achieve better performance than competitive baselines. |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1257/ |
https://www.aclweb.org/anthology/P19-1257 | |
PWC | https://paperswithcode.com/paper/cross-modal-commentator-automatic-machine |
Repo | https://github.com/lancopku/CMAC |
Framework | pytorch |
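
A minimal PyTorch sketch of the co-attention idea described in the CMAC abstract above: text token features attend over image region features and vice versa. The dimensions, module names, and layer choices are illustrative assumptions, not the authors' released architecture.

```python
# Minimal co-attention sketch: text tokens attend over image regions and
# vice versa. Dimensions and module names are illustrative, not the CMAC
# authors' exact model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoAttention(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.q_text = nn.Linear(dim, dim)
        self.k_img = nn.Linear(dim, dim)
        self.q_img = nn.Linear(dim, dim)
        self.k_text = nn.Linear(dim, dim)

    def forward(self, text, image):
        # text: (B, T, D) token features; image: (B, R, D) region features
        scale = text.size(-1) ** 0.5
        # text queries attend over image regions
        a_ti = F.softmax(self.q_text(text) @ self.k_img(image).transpose(1, 2) / scale, dim=-1)
        text_ctx = a_ti @ image                     # (B, T, D) image-aware text
        # image queries attend over text tokens
        a_it = F.softmax(self.q_img(image) @ self.k_text(text).transpose(1, 2) / scale, dim=-1)
        img_ctx = a_it @ text                       # (B, R, D) text-aware image
        return text_ctx, img_ctx

text = torch.randn(2, 20, 256)    # 20 text tokens
image = torch.randn(2, 49, 256)   # 49 image regions (e.g., a 7x7 CNN grid)
t_ctx, i_ctx = CoAttention()(text, image)
print(t_ctx.shape, i_ctx.shape)   # torch.Size([2, 20, 256]) torch.Size([2, 49, 256])
```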
Fast and Accurate Stochastic Gradient Estimation
Title | Fast and Accurate Stochastic Gradient Estimation |
Authors | Beidi Chen, Yingchen Xu, Anshumali Shrivastava |
Abstract | Stochastic Gradient Descent (SGD) is the most popular optimization algorithm for large-scale problems. SGD estimates the gradient by uniform sampling with sample size one. There have been several other works that suggest faster epoch-wise convergence by using weighted non-uniform sampling for better gradient estimates. Unfortunately, the per-iteration cost of maintaining this adaptive distribution for gradient estimation is more than calculating the full gradient itself, which we call the chicken-and-the-egg loop. As a result, the false impression of faster convergence in iterations, in reality, leads to slower convergence in time. In this paper, we break this barrier by providing the first demonstration of a scheme, Locality Sensitive Hashing (LSH) sampled Stochastic Gradient Descent (LGD), which leads to superior gradient estimation while keeping the sampling cost per iteration similar to that of uniform sampling. Such an algorithm is possible due to the sampling view of LSH, which came to light recently. As a consequence of superior and fast estimation, we reduce the running time of all existing gradient descent algorithms that rely on gradient estimates, including Adam, AdaGrad, etc. We demonstrate the effectiveness of our proposal with experiments on linear models as well as the non-linear BERT, a recently popular deep-learning-based language representation model. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9401-fast-and-accurate-stochastic-gradient-estimation |
http://papers.nips.cc/paper/9401-fast-and-accurate-stochastic-gradient-estimation.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-accurate-stochastic-gradient |
Repo | https://github.com/keroro824/LGD |
Framework | none |
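
A toy NumPy sketch of the unbiased non-uniform sampling estimator that LGD targets. For clarity the adaptive distribution below is computed exactly from per-example gradient norms, which is precisely the expensive step the paper replaces with cheap LSH-based sampling; this is an illustration of the estimator, not the paper's algorithm or code.

```python
# Non-uniform gradient sampling with importance weights for least squares.
# Sampling with p_i proportional to the gradient norm and rescaling by
# 1/(n * p_i) keeps the estimate unbiased for the mean gradient.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
theta = np.zeros(d)

def per_example_grads(theta):
    # gradient of 0.5*(x_i.theta - y_i)^2 is (x_i.theta - y_i) * x_i
    residuals = X @ theta - y
    return residuals[:, None] * X

for step in range(500):
    G = per_example_grads(theta)                 # exact per-example gradients (the costly part)
    norms = np.linalg.norm(G, axis=1) + 1e-12
    p = norms / norms.sum()                      # adaptive sampling distribution
    i = rng.choice(n, p=p)
    g = G[i] / (n * p[i])                        # importance weight: E[g] = mean gradient
    theta -= 0.05 * g

print("final loss:", 0.5 * np.mean((X @ theta - y) ** 2))
```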
Learning Macroscopic Brain Connectomes via Group-Sparse Factorization
Title | Learning Macroscopic Brain Connectomes via Group-Sparse Factorization |
Authors | Farzane Aminmansour, Andrew Patterson, Lei Le, Yisu Peng, Daniel Mitchell, Franco Pestilli, Cesar F. Caiafa, Russell Greiner, Martha White |
Abstract | Mapping structural brain connectomes for living human brains typically requires expert analysis and rule-based models on diffusion-weighted magnetic resonance imaging. A data-driven approach, however, could overcome limitations in such rule-based approaches and improve precision mappings for individuals. In this work, we explore a framework that facilitates applying learning algorithms to automatically extract brain connectomes. Using a tensor encoding, we design an objective with a group-regularizer that prefers biologically plausible fascicle structure. We show that the objective is convex and has unique solutions, ensuring identifiable connectomes for an individual. We develop an efficient optimization strategy for this extremely high-dimensional sparse problem, by reducing the number of parameters using a greedy algorithm designed specifically for the problem. We show that this greedy algorithm significantly improves on a standard greedy algorithm, called Orthogonal Matching Pursuit. We conclude with an analysis of the solutions found by our method, showing we can accurately reconstruct the diffusion information while maintaining contiguous fascicles with smooth direction changes. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9088-learning-macroscopic-brain-connectomes-via-group-sparse-factorization |
http://papers.nips.cc/paper/9088-learning-macroscopic-brain-connectomes-via-group-sparse-factorization.pdf | |
PWC | https://paperswithcode.com/paper/learning-macroscopic-brain-connectomes-via |
Repo | https://github.com/framinmansour/Learning-Macroscopic-Brain-Connectomes-via-Group-Sparse-Factorization |
Framework | none |
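
The group regularizer in the abstract above is a group-sparsity penalty over sets of coefficients. A minimal NumPy sketch of the proximal operator for a standard group-lasso penalty, lambda * sum_g ||w_g||_2, is shown below; this only illustrates the kind of regularizer involved and is not the paper's tensor-encoded objective or its specialized greedy solver.

```python
# Block soft-thresholding: the prox of lam * sum_g ||w_g||_2 zeroes out
# weak groups entirely and shrinks the rest, producing group-sparse solutions.
import numpy as np

def group_soft_threshold(w, groups, lam):
    """Prox of lam * sum_g ||w[g]||_2 evaluated at w."""
    out = np.zeros_like(w)
    for g in groups:
        norm = np.linalg.norm(w[g])
        if norm > lam:
            out[g] = (1.0 - lam / norm) * w[g]
    return out

w = np.array([0.1, -0.2, 0.05, 3.0, -2.5, 1.0])
groups = [np.arange(0, 3), np.arange(3, 6)]   # two groups of three coefficients
print(group_soft_threshold(w, groups, lam=0.5))
# the weak first group is zeroed entirely; the strong second group is only shrunk
```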
Bridging Machine Learning and Logical Reasoning by Abductive Learning
Title | Bridging Machine Learning and Logical Reasoning by Abductive Learning |
Authors | Wang-Zhou Dai, Qiuling Xu, Yang Yu, Zhi-Hua Zhou |
Abstract | Perception and reasoning are two representative abilities of intelligence that are integrated seamlessly during human problem-solving. In the area of artificial intelligence (AI), the two abilities are usually realised by machine learning and logic programming, respectively. However, the two categories of techniques were developed separately throughout most of the history of AI. In this paper, we present abductive learning, which targets unifying the two AI paradigms in a mutually beneficial way: the machine learning model learns to perceive primitive logic facts from data, while logical reasoning exploits symbolic domain knowledge and corrects the wrongly perceived facts to improve the machine learning model. Furthermore, we propose a novel approach to jointly optimise the machine learning model and the logical reasoning model. We demonstrate that, using abductive learning, machines can learn to recognise numbers and resolve unknown mathematical operations simultaneously from images of simple hand-written equations. Moreover, the learned models can be generalised to longer equations and adapted to different tasks, which is beyond the capability of state-of-the-art deep learning models. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8548-bridging-machine-learning-and-logical-reasoning-by-abductive-learning |
http://papers.nips.cc/paper/8548-bridging-machine-learning-and-logical-reasoning-by-abductive-learning.pdf | |
PWC | https://paperswithcode.com/paper/bridging-machine-learning-and-logical |
Repo | https://github.com/AbductiveLearning/ABL-HED |
Framework | tf |
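
A toy Python sketch of the abductive learning loop described above: a perception model proposes symbol labels, logical abduction revises the least-confident labels so the domain knowledge holds, and the revised labels retrain the model. The task (noisy bits obeying an "a XOR b = c" rule), the revision heuristic, and all names are illustrative assumptions; this is not the authors' ABL-HED handwritten-equation system.

```python
# Abductive-learning-style loop on a toy constraint: perceive -> abduce -> retrain.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 300
bits = rng.integers(0, 2, size=(n, 2))
triples_true = np.column_stack([bits, bits[:, 0] ^ bits[:, 1]])   # (a, b, a XOR b)
X = triples_true + rng.normal(scale=0.4, size=(n, 3))             # noisy "perceptions"

# crude initial perception: threshold each feature at the global mean
labels = (X > X.mean()).astype(int)
clf = LogisticRegression()

for it in range(5):
    clf.fit(X.reshape(-1, 1), labels.reshape(-1))                  # retrain perception
    proba = clf.predict_proba(X.reshape(-1, 1))[:, 1].reshape(n, 3)
    pred = (proba > 0.5).astype(int)
    # abduction: where the knowledge "a XOR b = c" is violated, flip the
    # single least-confident symbol so the constraint holds again
    violated = (pred[:, 0] ^ pred[:, 1]) != pred[:, 2]
    confidence = np.abs(proba - 0.5)
    flip = confidence.argmin(axis=1)
    rows = np.where(violated)[0]
    pred[rows, flip[rows]] ^= 1
    labels = pred                                                  # revised pseudo-labels

acc = (labels == triples_true).mean()
print(f"symbol accuracy after abductive retraining: {acc:.3f}")
```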
Cross-channel Communication Networks
Title | Cross-channel Communication Networks |
Authors | Jianwei Yang, Zhile Ren, Chuang Gan, Hongyuan Zhu, Devi Parikh |
Abstract | Convolutional neural networks process input data by sending channel-wise feature response maps to subsequent layers. While a lot of progress has been made by making networks deeper, information from each channel can only be propagated from lower levels to higher levels in a hierarchical feed-forward manner. When viewing each filter in a convolutional layer as a neuron, those neurons do not communicate explicitly with one another within a layer. We introduce a novel network unit called the Cross-channel Communication (C3) block, a simple yet effective module that encourages neuron communication within the same layer. The C3 block enables neurons to exchange information through a micro neural network, which consists of a feature encoder, a message communicator, and a feature decoder, before sending the information to the next layer. With the C3 block, each neuron accounts for the channel-wise responses of other neurons at the same layer and learns more discriminative and complementary representations. Extensive experiments on multiple computer vision tasks show that our proposed mechanism allows shallower networks to aggregate useful information within each layer, outperforming baseline deep networks and other competitive methods. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8411-cross-channel-communication-networks |
http://papers.nips.cc/paper/8411-cross-channel-communication-networks.pdf | |
PWC | https://paperswithcode.com/paper/cross-channel-communication-networks |
Repo | https://github.com/jwyang/C3net |
Framework | pytorch |
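
A simplified PyTorch sketch of a cross-channel communication block: each channel's response map is encoded, channels exchange messages through a softmax similarity ("message communicator"), and a decoder maps the result back with a residual connection. Layer sizes and details are assumptions for illustration, not the released C3net code.

```python
# Simplified cross-channel communication block: encode each channel,
# exchange messages via channel-to-channel attention, decode, add residually.
import torch
import torch.nn as nn
import torch.nn.functional as F

class C3Block(nn.Module):
    def __init__(self, spatial_size, hidden=64):
        super().__init__()
        self.encoder = nn.Linear(spatial_size, hidden)   # per-channel feature encoder
        self.decoder = nn.Linear(hidden, spatial_size)   # per-channel feature decoder

    def forward(self, x):
        b, c, h, w = x.shape
        flat = x.view(b, c, h * w)                       # one "neuron" per channel
        z = self.encoder(flat)                           # (B, C, hidden)
        attn = F.softmax(z @ z.transpose(1, 2) / z.size(-1) ** 0.5, dim=-1)  # (B, C, C)
        msg = attn @ z                                   # aggregate messages across channels
        return x + self.decoder(msg).view(b, c, h, w)    # residual update

x = torch.randn(2, 32, 8, 8)
print(C3Block(spatial_size=64)(x).shape)                 # torch.Size([2, 32, 8, 8])
```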
Selective Sampling-based Scalable Sparse Subspace Clustering
Title | Selective Sampling-based Scalable Sparse Subspace Clustering |
Authors | Shin Matsushima, Maria Brbic |
Abstract | Sparse subspace clustering (SSC) represents each data point as a sparse linear combination of other data points in the dataset. In the representation learning step, SSC finds a lower-dimensional representation of the data points, while in the spectral clustering step the data points are clustered according to the underlying subspaces. However, both steps suffer from high computational and memory complexity, preventing the application of SSC to large-scale datasets. To overcome this limitation, we introduce the Selective Sampling-based Scalable Sparse Subspace Clustering (S5C) algorithm, which selects subsamples based on approximated subgradients and scales linearly with the number of data points in terms of time and memory requirements. Along with the computational advantages, we derive theoretical guarantees for the correctness of S5C. Our theoretical result presents a novel contribution for SSC in the case of a limited number of subsamples. Extensive experimental results demonstrate the effectiveness of our approach. |
Tasks | Representation Learning |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9408-selective-sampling-based-scalable-sparse-subspace-clustering |
http://papers.nips.cc/paper/9408-selective-sampling-based-scalable-sparse-subspace-clustering.pdf | |
PWC | https://paperswithcode.com/paper/selective-sampling-based-scalable-sparse |
Repo | https://github.com/smatsus/S5C |
Framework | none |
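
A scikit-learn sketch of subsample-based sparse subspace clustering: each point is expressed as a sparse combination of a small dictionary of sampled points, the coefficients form an affinity, and spectral clustering recovers the subspaces. S5C's contribution is choosing the subsample via approximated subgradients; this toy substitutes a random subsample, so it only illustrates the overall pipeline, not the paper's selection rule.

```python
# Subsample-based SSC pipeline: sparse codes over a sampled dictionary,
# symmetric affinity, then spectral clustering.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)

def subspace_points(n, dim=10, sub=2):
    # points drawn from a random `sub`-dimensional subspace of R^dim
    basis = np.linalg.qr(rng.normal(size=(dim, sub)))[0]
    return rng.normal(size=(n, sub)) @ basis.T

X = np.vstack([subspace_points(100), subspace_points(100)])   # two subspaces
n = len(X)
sample = rng.choice(n, size=40, replace=False)                 # the subsample (dictionary)
D = X[sample]

C = np.zeros((n, n))
for i in range(n):
    lasso = Lasso(alpha=0.01, max_iter=5000)
    lasso.fit(D.T, X[i])                                       # sparse code of point i over the dictionary
    C[i, sample] = np.abs(lasso.coef_)
np.fill_diagonal(C, 0)                                         # no self-representation

A = C + C.T                                                    # symmetric non-negative affinity
pred = SpectralClustering(n_clusters=2, affinity="precomputed",
                          random_state=0).fit_predict(A)
print("cluster sizes:", np.bincount(pred))
```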
Annotating and Characterizing Clinical Sentences with Explicit Why-QA Cues
Title | Annotating and Characterizing Clinical Sentences with Explicit Why-QA Cues |
Authors | Jungwei Fan |
Abstract | Many clinical information needs can be stated as why-questions. The answers to them represent important clinical reasoning and justification. Clinical notes are a rich source for such why-question answering (why-QA). However, there are few dedicated corpora, and little is known about the characteristics of clinical why-QA narratives. To address this gap, the study performed manual annotation of 277 sentences containing explicit why-QA cues and summarized their quantitative and qualitative properties. The contributions are: 1) sharing a seed corpus that can be used for various QA-related training purposes, 2) adding to our knowledge about the diversity and distribution of clinical why-QA contents. |
Tasks | Question Answering |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/W19-1913/ |
https://www.aclweb.org/anthology/W19-1913 | |
PWC | https://paperswithcode.com/paper/annotating-and-characterizing-clinical |
Repo | https://github.com/Jung-wei/ClinicalWhyQA |
Framework | none |
Generating Correctness Proofs with Neural Networks
Title | Generating Correctness Proofs with Neural Networks |
Authors | Alex Sanchez-Stern, Yousef Alhessi, Lawrence Saul, Sorin Lerner |
Abstract | Foundational verification allows programmers to build software which has been empirically shown to have high levels of assurance in a variety of important domains. However, the cost of producing foundationally verified software remains prohibitively high for most projects, as it requires significant manual effort by highly trained experts. In this paper we present Proverbot9001, a proof search system using machine learning techniques to produce proofs of software correctness in interactive theorem provers. We demonstrate Proverbot9001 on the proof obligations from a large practical proof project, the CompCert verified C compiler, and show that it can effectively automate what were previously manual proofs, automatically solving 15.77% of the proofs in our test dataset. This corresponds to an over 3X improvement over the prior state-of-the-art machine learning technique for generating proofs in Coq. |
Tasks | Automated Theorem Proving |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07794 |
https://arxiv.org/pdf/1907.07794.pdf | |
PWC | https://paperswithcode.com/paper/generating-correctness-proofs-with-neural |
Repo | https://github.com/UCSD-PL/proverbot9001 |
Framework | pytorch |
CogNet: A Large-Scale Cognate Database
Title | CogNet: A Large-Scale Cognate Database |
Authors | Khuyagbaatar Batsuren, Gabor Bella, Fausto Giunchiglia |
Abstract | This paper introduces CogNet, a new, large-scale lexical database that provides cognates (words of common origin and meaning) across languages. The database currently contains 3.1 million cognate pairs across 338 languages using 35 writing systems. The paper also describes the automated method by which the cognates were computed from publicly available wordnets, with an accuracy evaluated at 94%. Finally, it presents statistics about the cognate data and some initial insights into it, hinting at possible future exploitation of the resource by various fields of linguistics. |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1302/ |
https://www.aclweb.org/anthology/P19-1302 | |
PWC | https://paperswithcode.com/paper/cognet-a-large-scale-cognate-database |
Repo | https://github.com/kbatsuren/wiktra |
Framework | none |
An Empirical study of Binary Neural Networks’ Optimisation
Title | An Empirical study of Binary Neural Networks’ Optimisation |
Authors | Milad Alizadeh, Javier Fernández-Marqués, Nicholas D. Lane, Yarin Gal |
Abstract | Binary neural networks using the Straight-Through-Estimator (STE) have been shown to achieve state-of-the-art results, but their training process is not well-founded. This is due to the discrepancy between the evaluated function in the forward path, and the weight updates in the back-propagation, updates which do not correspond to gradients of the forward path. Efficient convergence and accuracy of binary models often rely on careful fine-tuning and various ad-hoc techniques. In this work, we empirically identify and study the effectiveness of the various ad-hoc techniques commonly used in the literature, providing best-practices for efficient training of binary models. We show that adapting learning rates using second moment methods is crucial for the successful use of the STE, and that other optimisers can easily get stuck in local minima. We also find that many of the commonly employed tricks are only effective towards the end of the training, with these methods making early stages of the training considerably slower. Our analysis disambiguates necessary from unnecessary ad-hoc techniques for training of binary neural networks, paving the way for future development of solid theoretical foundations for these. Our newly-found insights further lead to new procedures which make training of existing binary neural networks notably faster. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=rJfUCoR5KX |
https://openreview.net/pdf?id=rJfUCoR5KX | |
PWC | https://paperswithcode.com/paper/an-empirical-study-of-binary-neural-networks |
Repo | https://github.com/mi-lad/studying-binary-neural-networks |
Framework | tf |
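
A minimal PyTorch sketch of the straight-through estimator (STE) discussed in the abstract above: the forward pass binarizes weights with sign(.), and the backward pass passes gradients through unchanged where |w| <= 1. The network, data, and hyperparameters are placeholders, not the authors' experimental setup; Adam is used here because the paper highlights second-moment methods as important for STE training.

```python
# STE for binary weights: binarize in the forward pass, use a clipped
# identity gradient in the backward pass.
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float()   # clipped straight-through gradient

class BinaryLinear(nn.Module):
    def __init__(self, in_f, out_f):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_f, in_f) * 0.1)

    def forward(self, x):
        return x @ BinarizeSTE.apply(self.weight).t()

model = nn.Sequential(BinaryLinear(20, 64), nn.ReLU(), BinaryLinear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(128, 20), torch.randint(0, 2, (128,))
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()
print("final training loss:", loss.item())
```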
Inducing a Decision Tree with Discriminative Paths to Classify Entities in a Knowledge Graph
Title | Inducing a Decision Tree with Discriminative Paths to Classify Entities in a Knowledge Graph |
Authors | Gilles Vandewiele, Bram Steenwinckel, Femke Ongenae, Filip De Turck |
Abstract | Deep-learning based techniques are increasingly being used for different machine learning tasks on knowledge graphs. While it has been shown empirically that these techniques often achieve better predictive performances than their classical counterparts, where features are extracted from the graph, they lack interpretability. Interpretability is a vital aspect in critical domains such as the health and financial sector. In this paper, we present a technique that builds a decision tree of class-specific substructures in order to classify different entities within the knowledge graph. We show how our proposed technique is competitive to current state-of-the-art deep-learning techniques on four benchmark datasets, while being fully interpretable. |
Tasks | Knowledge Graphs, Node Classification |
Published | 2019-08-22 |
URL | http://ceur-ws.org/Vol-2427/SEPDA_2019_paper_3.pdf |
http://ceur-ws.org/Vol-2427/SEPDA_2019_paper_3.pdf | |
PWC | https://paperswithcode.com/paper/inducing-a-decision-tree-with-discriminative |
Repo | https://github.com/IBCNServices/KGPTree |
Framework | none |
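
A toy Python sketch of the idea in the abstract above: turn candidate path patterns in a small knowledge graph into boolean features ("does a walk with this predicate sequence reach this node from the entity?") and fit an ordinary decision tree on them, keeping the model fully interpretable. The graph, the pattern enumeration, and the labels are invented for illustration; this is not the KGPTree implementation.

```python
# Decision tree over path features extracted from a tiny knowledge graph.
from sklearn.tree import DecisionTreeClassifier, export_text

# tiny KG as (subject, predicate, object) triples
triples = [
    ("alice", "worksAt", "hospital"), ("hospital", "locatedIn", "ghent"),
    ("bob", "worksAt", "bank"), ("bank", "locatedIn", "brussels"),
    ("carol", "worksAt", "clinic"), ("clinic", "locatedIn", "ghent"),
    ("dave", "worksAt", "bank"), ("eve", "worksAt", "hospital"),
]
entities = ["alice", "bob", "carol", "dave", "eve"]
labels = [1, 0, 1, 0, 1]                      # e.g., "works in healthcare in Ghent"

out_edges = {}
for s, p, o in triples:
    out_edges.setdefault(s, []).append((p, o))

def walks(entity, max_len=2):
    """All (predicate-sequence, endpoint) pairs reachable from entity."""
    results, frontier = [], [((), entity)]
    for _ in range(max_len):
        nxt = []
        for preds, node in frontier:
            for p, o in out_edges.get(node, []):
                results.append((preds + (p,), o))
                nxt.append((preds + (p,), o))
        frontier = nxt
    return results

# candidate patterns = every walk observed from any labelled entity
patterns = sorted({w for e in entities for w in walks(e)})
X = [[int(pat in set(walks(e))) for pat in patterns] for e in entities]

tree = DecisionTreeClassifier(max_depth=2).fit(X, labels)
names = ["->".join(preds) + "->" + end for preds, end in patterns]
print(export_text(tree, feature_names=names))   # human-readable path-based rules
```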
Certifying Geometric Robustness of Neural Networks
Title | Certifying Geometric Robustness of Neural Networks |
Authors | Mislav Balunovic, Maximilian Baader, Gagandeep Singh, Timon Gehr, Martin Vechev |
Abstract | The use of neural networks in safety-critical computer vision systems calls for their robustness certification against natural geometric transformations (e.g., rotation, scaling). However, current certification methods target mostly norm-based pixel perturbations and cannot certify robustness against geometric transformations. In this work, we propose a new method to compute sound and asymptotically optimal linear relaxations for any composition of transformations. Our method is based on a novel combination of sampling and optimization. We implemented the method in a system called DeepG and demonstrated that it certifies significantly more complex geometric transformations than existing methods on both defended and undefended networks while scaling to large architectures. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9666-certifying-geometric-robustness-of-neural-networks |
http://papers.nips.cc/paper/9666-certifying-geometric-robustness-of-neural-networks.pdf | |
PWC | https://paperswithcode.com/paper/certifying-geometric-robustness-of-neural |
Repo | https://github.com/eth-sri/deepg |
Framework | tf |
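
A toy NumPy sketch of the sampling-plus-optimization idea behind linear relaxations of geometric transforms: sample the parameter interval, fit linear lower and upper bounds on the transformed quantity, then widen them by the worst violation seen on a dense grid. DeepG certifies soundness exactly with an optimization step; the dense-grid margin here is only an approximation of that step, and the transform and interval are made up for illustration.

```python
# Sample-then-bound: linear lower/upper bounds on a pixel's rotated
# x-coordinate as a function of the rotation angle theta.
import numpy as np

x0, y0 = 0.7, 0.4                                   # a pixel coordinate
f = lambda t: x0 * np.cos(t) - y0 * np.sin(t)       # its rotated x-coordinate
lo, hi = -0.3, 0.3                                  # rotation-angle interval

samples = np.linspace(lo, hi, 20)
A = np.column_stack([samples, np.ones_like(samples)])
(slope, intercept), *_ = np.linalg.lstsq(A, f(samples), rcond=None)   # least-squares fit

grid = np.linspace(lo, hi, 10_000)
residual = f(grid) - (slope * grid + intercept)
lower = (slope, intercept + residual.min())          # f(t) >= slope*t + lower offset on the grid
upper = (slope, intercept + residual.max())          # f(t) <= slope*t + upper offset on the grid

print(f"lower bound: {lower[0]:.4f}*theta + {lower[1]:.4f}")
print(f"upper bound: {upper[0]:.4f}*theta + {upper[1]:.4f}")
```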
From legal to technical concept: Towards an automated classification of German political Twitter postings as criminal offenses
Title | From legal to technical concept: Towards an automated classification of German political Twitter postings as criminal offenses |
Authors | Frederike Zufall, Tobias Horsmann, Torsten Zesch |
Abstract | Advances in the automated detection of offensive Internet postings make this mechanism very attractive to social media companies, who are increasingly under pressure to monitor and action activity on their sites. However, these advances also have important implications as a threat to the fundamental right of free expression. In this article, we analyze which Twitter posts could actually be deemed offenses under German criminal law. German law follows the deductive method of the Roman law tradition based on abstract rules as opposed to the inductive reasoning in Anglo-American common law systems. This allows us to show how legal conclusions can be reached and implemented without relying on existing court decisions. We present a data annotation schema, consisting of a series of binary decisions, for determining whether a specific post would constitute a criminal offense. This schema serves as a step towards an inexpensive creation of a sufficient amount of data for an automated classification. We find that the majority of posts deemed offensive actually do not constitute a criminal offense and still contribute to public discourse. Furthermore, laymen can provide sufficiently reliable data to an expert reference but are, for instance, more lenient in the interpretation of what constitutes a disparaging statement. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1135/ |
https://www.aclweb.org/anthology/N19-1135 | |
PWC | https://paperswithcode.com/paper/from-legal-to-technical-concept-towards-an |
Repo | https://github.com/Horsmann/NAACL-2019-legal |
Framework | none |
Joint Activity Recognition and Indoor Localization With WiFi Fingerprints
Title | Joint Activity Recognition and Indoor Localization With WiFi Fingerprints |
Authors | Fei Wang, Jianwei Feng, Yinliang Zhao, Xiaobin Zhang, Shiyuan Zhang, Jinsong Han |
Abstract | Recent years have witnessed rapid development in the research topic of WiFi sensing, which automatically senses humans with commercial WiFi devices. This work falls into two major categories, i.e., activity recognition and indoor localization. The former utilizes WiFi devices to recognize human daily activities such as smoking, walking, and dancing. The latter, indoor localization, can be used for indoor navigation, location-based services, and through-wall surveillance. The key rationale behind this type of work is that people's behaviors can influence WiFi signal propagation and introduce specific patterns into WiFi signals, called WiFi fingerprints, which can be further explored to identify human activities and locations. In this paper, we propose a novel deep learning framework for the joint task of activity recognition and indoor localization using WiFi Channel State Information (CSI) fingerprints. More precisely, we develop a system running the standard IEEE 802.11n WiFi protocol and collect more than 1400 CSI fingerprints on 6 activities at 16 indoor locations. Then we propose a dual-task convolutional neural network with 1-dimensional convolutional layers for the joint task of activity recognition and indoor localization. Experimental results and an ablation study show that our approach achieves good performance on this joint WiFi sensing task. Data and code have been made publicly available at https://github.com/geekfeiw/apl |
Tasks | Activity Recognition, RF-based Action Recognition |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1904.04964 |
https://arxiv.org/pdf/1904.04964 | |
PWC | https://paperswithcode.com/paper/joint-activity-recognition-and-indoor |
Repo | https://github.com/geekfeiw/apl |
Framework | pytorch |
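
A minimal PyTorch sketch of a dual-task 1-D CNN over CSI fingerprints with one shared convolutional trunk and two heads (activity and location), trained with the sum of two cross-entropy losses. The channel counts, sequence length, and layer sizes are placeholders, not the values used in the released code at https://github.com/geekfeiw/apl.

```python
# Dual-task 1-D CNN: shared trunk, separate activity and location heads.
import torch
import torch.nn as nn

class DualTaskCSINet(nn.Module):
    def __init__(self, csi_channels=30, n_activities=6, n_locations=16):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv1d(csi_channels, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.activity_head = nn.Linear(128, n_activities)
        self.location_head = nn.Linear(128, n_locations)

    def forward(self, csi):                      # csi: (B, channels, time)
        z = self.trunk(csi).squeeze(-1)          # (B, 128) shared representation
        return self.activity_head(z), self.location_head(z)

model = DualTaskCSINet()
csi = torch.randn(8, 30, 500)                    # a batch of CSI time series
act_logits, loc_logits = model(csi)
loss = nn.functional.cross_entropy(act_logits, torch.randint(0, 6, (8,))) \
     + nn.functional.cross_entropy(loc_logits, torch.randint(0, 16, (8,)))
print(act_logits.shape, loc_logits.shape, loss.item())
```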
d-SNE: Domain Adaptation Using Stochastic Neighborhood Embedding
Title | d-SNE: Domain Adaptation Using Stochastic Neighborhood Embedding |
Authors | Xiang Xu, Xiong Zhou, Ragav Venkatesan, Gurumurthy Swaminathan, Orchid Majumder |
Abstract | On the one hand, deep neural networks are effective in learning large datasets. On the other, they are inefficient with their data usage: they often require copious amounts of labeled data to train their scads of parameters. Training larger and deeper networks is hard without appropriate regularization, particularly while using a small dataset. Separately, collecting well-annotated data is expensive, time-consuming and often infeasible. A popular way to regularize these networks is to simply train the network with more data from an alternate representative dataset. This can lead to adverse effects if the statistics of the representative dataset are dissimilar to our target. This predicament is due to the problem of domain shift: data from a shifted domain might not produce bespoke features when a feature extractor from the representative domain is used. Several techniques of domain adaptation have been proposed in the past to solve this problem. In this paper, we propose a new domain adaptation technique (d-SNE) that cleverly uses stochastic neighborhood embedding techniques and a novel modified-Hausdorff distance. The proposed technique is learnable end-to-end and is therefore ideally suited to training neural networks. Extensive experiments demonstrate that d-SNE outperforms the current state of the art and is robust to variances in different datasets, even in the one-shot and semi-supervised learning settings. d-SNE also demonstrates the ability to generalize to multiple domains concurrently. |
Tasks | Domain Adaptation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Xu_d-SNE_Domain_Adaptation_Using_Stochastic_Neighborhood_Embedding_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Xu_d-SNE_Domain_Adaptation_Using_Stochastic_Neighborhood_Embedding_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/d-sne-domain-adaptation-using-stochastic |
Repo | https://github.com/aws-samples/d-SNE |
Framework | mxnet |
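
A simplified PyTorch sketch of a d-SNE-style alignment loss: for each target embedding, push its farthest same-class source neighbor closer than its nearest other-class source neighbor. This is a hedged reading of the neighborhood-embedding idea; the function name, margin, and exact formulation (including the modified-Hausdorff details) are assumptions, not the authors' loss or training code.

```python
# d-SNE-like alignment loss between source and target embeddings.
import torch
import torch.nn.functional as F

def dsne_like_loss(src_feat, src_y, tgt_feat, tgt_y, margin=1.0):
    d = torch.cdist(tgt_feat, src_feat) ** 2            # (Nt, Ns) squared distances
    same = tgt_y.unsqueeze(1) == src_y.unsqueeze(0)     # (Nt, Ns) same-class mask
    big = torch.finfo(d.dtype).max
    intra_max = d.masked_fill(~same, -big).max(dim=1).values   # farthest same-class source
    inter_min = d.masked_fill(same, big).min(dim=1).values     # nearest other-class source
    return F.relu(intra_max - inter_min + margin).mean()

src = torch.randn(64, 128)
tgt = torch.randn(32, 128, requires_grad=True)
src_y = torch.randint(0, 10, (64,))
tgt_y = torch.randint(0, 10, (32,))
loss = dsne_like_loss(src, src_y, tgt, tgt_y)
loss.backward()
print(loss.item())
```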