Paper Group ANR 383
s-DRN: Stabilized Developmental Resonance Network
Title | s-DRN: Stabilized Developmental Resonance Network |
Authors | In-Ug Yoon, Ue-Hwan Kim, Jong-Hwan Kim |
Abstract | Online incremental clustering of sequentially incoming data without prior knowledge suffers from changing cluster numbers and tends to fall into local extrema according to given data order. To overcome these limitations, we propose a stabilized developmental resonance network (s-DRN). First, we analyze the instability of the conventional choice function during the node activation process and design a scalable activation function to make clustering performance stable over all input data scales. Next, we devise three criteria for the node grouping algorithm: distance, intersection over union (IoU) and size criteria. The proposed node grouping algorithm effectively excludes unnecessary clusters from incrementally created clusters, diminishes the performance dependency on vigilance parameters and makes the clustering process robust. To verify the performance of the proposed s-DRN model, comparative studies are conducted on six real-world datasets whose statistical characteristics are distinctive. The comparative studies demonstrate the proposed s-DRN outperforms baselines in terms of stability and accuracy. |
Tasks | |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08541v1 |
https://arxiv.org/pdf/1912.08541v1.pdf | |
PWC | https://paperswithcode.com/paper/s-drn-stabilized-developmental-resonance |
Repo | |
Framework | |
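The abstract names intersection over union (IoU) as one of the three node grouping criteria but does not give its formula. The following is a minimal sketch, assuming each cluster is represented as an axis-aligned box with per-dimension lower and upper bounds; that representation and the name `box_iou` are illustrative assumptions, not details taken from the paper.

```python
def box_iou(lo_a, hi_a, lo_b, hi_b):
    """IoU of two axis-aligned boxes given as per-dimension bounds.

    Assumption: clusters are boxes described by lower/upper corner vectors.
    """
    inter = 1.0
    vol_a = 1.0
    vol_b = 1.0
    for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b):
        # Overlap along this dimension, clamped to zero when disjoint.
        overlap = min(ha, hb) - max(la, lb)
        inter *= max(overlap, 0.0)
        vol_a *= ha - la
        vol_b *= hb - lb
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0
```

Two clusters whose IoU exceeds a chosen threshold would be candidates for grouping under such a criterion; the threshold itself is not specified in the abstract.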
Online Community Detection by Spectral CUSUM
Title | Online Community Detection by Spectral CUSUM |
Authors | Minghe Zhang, Liyan Xie, Yao Xie |
Abstract | We present an online community change detection algorithm called spectral CUSUM to detect the emergence of a community using a subspace projection procedure based on a Gaussian model setting. Theoretical analysis is provided to characterize the average run length (ARL) and expected detection delay (EDD), as well as the asymptotic optimality. Simulation and real data examples demonstrate the good performance of the proposed method. |
Tasks | Community Detection, Online Community Detection |
Published | 2019-10-20 |
URL | https://arxiv.org/abs/1910.09083v2 |
https://arxiv.org/pdf/1910.09083v2.pdf | |
PWC | https://paperswithcode.com/paper/online-community-detection-by-spectral-cusum |
Repo | |
Framework | |
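For readers unfamiliar with CUSUM, the generic recursion accumulates a per-sample evidence term, resets at zero, and raises an alarm when the statistic crosses a threshold. The sketch below shows only that generic recursion; the spectral, subspace-projection statistic used in the paper is not given in the abstract, so `increments` is a hypothetical placeholder for whatever per-sample term is used.

```python
def cusum_stopping_time(increments, threshold):
    """Return the first time (1-indexed) the CUSUM statistic exceeds threshold.

    increments: per-sample evidence terms (placeholder for the paper's statistic).
    Returns None if no alarm is raised.
    """
    stat = 0.0
    for t, inc in enumerate(increments, start=1):
        # Classic CUSUM update: accumulate evidence, never drop below zero.
        stat = max(0.0, stat + inc)
        if stat > threshold:
            return t
    return None
```

A larger threshold lengthens the average run length (fewer false alarms) at the cost of a longer expected detection delay, which is the trade-off the abstract's ARL and EDD analysis characterizes.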
Open Named Entity Modeling from Embedding Distribution
Title | Open Named Entity Modeling from Embedding Distribution |
Authors | Ying Luo, Hai Zhao, Tao Wang, Linlin Li, Luo Si |
Abstract | In this paper, we report our discovery on named entity distribution in general word embedding space, which supports an open definition of multilingual named entities rather than the previous closed and constrained definition through a named entity dictionary, which is usually derived from human labor and relies on scheduled updates. Our initial visualization of monolingual word embeddings indicates that named entities tend to gather together regardless of named entity type and language difference, which enables us to model all named entities using a specific geometric structure inside embedding space, namely, the named entity hypersphere. For the monolingual case, the proposed named entity model gives an open description of diverse named entity types and different languages. For the cross-lingual case, mapping the proposed named entity model provides a novel way to build named entity datasets for resource-poor languages. Finally, the proposed named entity model may serve as a very useful clue to significantly enhance state-of-the-art named entity recognition systems generally. |
Tasks | Named Entity Recognition, Word Embeddings |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00170v1 |
https://arxiv.org/pdf/1909.00170v1.pdf | |
PWC | https://paperswithcode.com/paper/open-named-entity-modeling-from-embedding |
Repo | |
Framework | |
A Self-Attentional Neural Architecture for Code Completion with Multi-Task Learning
Title | A Self-Attentional Neural Architecture for Code Completion with Multi-Task Learning |
Authors | Fang Liu, Ge Li, Bolin Wei, Xin Xia, Ming Li, Zhiyi Fu, Zhi Jin |
Abstract | Code completion, one of the most useful features in integrated development environments, can accelerate software development by suggesting libraries, APIs, and method names in real time. Recent studies have shown that statistical language models can improve the performance of code completion tools through learning from large-scale software repositories. However, these models suffer from three major drawbacks: a) the hierarchical structural information of the programs is not fully utilized in the program’s representation; b) semantic relationships in programs can be very long, and existing LSTM-based language models are not sufficient to model the long-term dependency; c) existing approaches perform a specific task in one model, which leads to the underuse of the information from related tasks. In this paper, we present a novel method that introduces the hierarchical structural information into the representation of programs by considering the path from the predicting node to the root node. To capture the long-term dependency in the input programs, we apply the Transformer-XL network as the base language model. In addition, we propose a Multi-Task Learning (MTL) framework to learn two related tasks in code completion jointly, where knowledge acquired from one task could be beneficial to the other. Experiments on three real-world datasets demonstrate the effectiveness of our model when compared with state-of-the-art methods. |
Tasks | Language Modelling, Multi-Task Learning |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.06983v2 |
https://arxiv.org/pdf/1909.06983v2.pdf | |
PWC | https://paperswithcode.com/paper/a-self-attentional-neural-architecture-for |
Repo | |
Framework | |
A Search for the Underlying Equation Governing Similar Systems
Title | A Search for the Underlying Equation Governing Similar Systems |
Authors | Changwei Loh, Daniel Schneegass, Pengwei Tian |
Abstract | We show a data-driven approach to discover the underlying structural form of the mathematical equation governing the dynamics of multiple but similar systems induced by the same mechanisms. This approach hinges on theories that we lay out involving arguments based on the nature of physical systems. In the same vein, we also introduce a metric to search for the best candidate equation using the datasets generated from the systems. This approach involves symbolic regression by means of genetic programming and regressions to compute the strength of the interplay between the extrinsic parameters in a candidate equation. We relate these extrinsic parameters to the hidden properties of the data-generating systems. The behavior of a new similar system can be predicted easily by utilizing the discovered structural form of the general equation. As illustrations, we apply the approach to identify candidate structural forms of the underlying equation governing two cases: the changes in a sensor measurement of degrading engines; and the search for the governing equation of systems with known variations of an intrinsic parameter. |
Tasks | |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10673v1 |
https://arxiv.org/pdf/1908.10673v1.pdf | |
PWC | https://paperswithcode.com/paper/a-search-for-the-underlying-equation |
Repo | |
Framework | |
Towards Generation of Visual Attention Map for Source Code
Title | Towards Generation of Visual Attention Map for Source Code |
Authors | Takeshi D. Itoh, Takatomi Kubo, Kiyoka Ikeda, Yuki Maruno, Yoshiharu Ikutani, Hideaki Hata, Kenichi Matsumoto, Kazushi Ikeda |
Abstract | Program comprehension is a dominant process in software development and maintenance. Experts are considered to comprehend the source code efficiently by directing their gaze, or attention, to important components in it. However, reflecting the importance of components is still a remaining issue in gaze behavior analysis for source code comprehension. Here we show a conceptual framework to compare the quantified importance of source code components with the gaze behavior of programmers. We use “attention” in attention models (e.g., code2vec) as the importance indices for source code components and evaluate programmers’ gaze locations based on the quantified importance. In this report, we introduce the idea of our gaze behavior analysis using the attention map, and the results of a preliminary experiment. |
Tasks | |
Published | 2019-07-14 |
URL | https://arxiv.org/abs/1907.06182v2 |
https://arxiv.org/pdf/1907.06182v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-generation-of-visual-attention-map |
Repo | |
Framework | |
HellaSwag: Can a Machine Really Finish Your Sentence?
Title | HellaSwag: Can a Machine Really Finish Your Sentence? |
Authors | Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, Yejin Choi |
Abstract | Recent work by Zellers et al. (2018) introduced a new task of commonsense natural language inference: given an event description such as “A woman sits at a piano,” a machine must select the most likely followup: “She sets her fingers on the keys.” With the introduction of BERT, near human-level performance was reached. Does this mean that machines can perform human level commonsense inference? In this paper, we show that commonsense inference still proves difficult for even state-of-the-art models, by presenting HellaSwag, a new challenge dataset. Though its questions are trivial for humans (>95% accuracy), state-of-the-art models struggle (<48%). We achieve this via Adversarial Filtering (AF), a data collection paradigm wherein a series of discriminators iteratively select an adversarial set of machine-generated wrong answers. AF proves to be surprisingly robust. The key insight is to scale up the length and complexity of the dataset examples towards a critical ‘Goldilocks’ zone wherein generated text is ridiculous to humans, yet often misclassified by state-of-the-art models. Our construction of HellaSwag, and its resulting difficulty, sheds light on the inner workings of deep pretrained models. More broadly, it suggests a new path forward for NLP research, in which benchmarks co-evolve with the evolving state-of-the-art in an adversarial way, so as to present ever-harder challenges. |
Tasks | Natural Language Inference |
Published | 2019-05-19 |
URL | https://arxiv.org/abs/1905.07830v1 |
https://arxiv.org/pdf/1905.07830v1.pdf | |
PWC | https://paperswithcode.com/paper/hellaswag-can-a-machine-really-finish-your |
Repo | |
Framework | |
Impact of novel aggregation methods for flexible, time-sensitive EHR prediction without variable selection or cleaning
Title | Impact of novel aggregation methods for flexible, time-sensitive EHR prediction without variable selection or cleaning |
Authors | Jacob Deasy, Ari Ercole, Pietro Liò |
Abstract | Dynamic assessment of patient status (e.g. by an automated, continuously updated assessment of outcome) in the Intensive Care Unit (ICU) is of paramount importance for early alerting, decision support and resource allocation. Extraction and cleaning of expert-selected clinical variables discards information and protracts collaborative efforts to introduce machine learning in medicine. We present improved aggregation methods for a flexible deep learning architecture which learns a joint representation of patient chart, lab and output events. Our models outperform recent deep learning models for patient mortality classification using ICU timeseries, by embedding and aggregating all events with no pre-processing or variable selection. Our model achieves a strong performance of AUROC 0.87 at 48 hours on the MIMIC-III dataset while using 13,233 unique un-preprocessed variables in an interpretable manner via hourly softmax aggregation. This demonstrates how our method can be easily combined with existing electronic health record systems for automated, dynamic patient risk analysis. |
Tasks | |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.08981v1 |
https://arxiv.org/pdf/1909.08981v1.pdf | |
PWC | https://paperswithcode.com/paper/impact-of-novel-aggregation-methods-for |
Repo | |
Framework | |
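The abstract mentions hourly softmax aggregation without defining it. One plausible reading, sketched below under that assumption, is that every event embedding within an hour receives a scalar score, the scores are normalized with a softmax, and the hour's representation is the weighted sum of the embeddings. The function names and the scoring input are hypothetical; the paper's actual architecture may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_hour(embeddings, scores):
    """Pool the event embeddings of one hour into a single vector.

    embeddings: list of equal-length vectors, one per event (assumed input).
    scores: one scalar relevance score per event (assumed to be learned).
    Returns the pooled vector and the softmax weights.
    """
    weights = softmax(scores)
    dim = len(embeddings[0])
    pooled = [
        sum(w * emb[d] for w, emb in zip(weights, embeddings))
        for d in range(dim)
    ]
    return pooled, weights
```

Because the weights sum to one and attach to individual events, they can be inspected to see which events dominated a given hour, which is one way such an aggregation can be read as interpretable.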
Learning spectrograms with convolutional spectral kernels
Title | Learning spectrograms with convolutional spectral kernels |
Authors | Zheyang Shen, Markus Heinonen, Samuel Kaski |
Abstract | We introduce the convolutional spectral kernel (CSK), a novel family of non-stationary, nonparametric covariance kernels for Gaussian process (GP) models, derived from the convolution between two imaginary radial basis functions. We present a principled framework to interpret CSK, as well as other deep probabilistic models, using approximated Fourier transform, yielding a concise representation of input-frequency spectrogram. Observing through the lens of the spectrogram, we provide insight on the interpretability of deep models. We then infer the functional hyperparameters using scalable variational and MCMC methods. On small- and medium-sized spatiotemporal datasets, we demonstrate improved generalization of GP models when equipped with CSK, and their capability to extract non-stationary periodic patterns. |
Tasks | Gaussian Processes |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09917v2 |
https://arxiv.org/pdf/1905.09917v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-spectrograms-with-convolutional |
Repo | |
Framework | |
Learning to Separate: Detecting Heavily-Occluded Objects in Urban Scenes
Title | Learning to Separate: Detecting Heavily-Occluded Objects in Urban Scenes |
Authors | Chenhongyi Yang, Vitaly Ablavsky, Kaihong Wang, Qi Feng, Margrit Betke |
Abstract | In the past decade, deep learning based visual object detection has received a significant amount of attention, but cases when heavy intra-class occlusions occur are not studied thoroughly. In this work, we propose a novel Non-Maximum Suppression (NMS) algorithm that dramatically improves the detection recall while maintaining high precision in scenes with heavy occlusions. Our NMS algorithm is derived from a novel embedding mechanism, in which the semantic and geometric features of the detected boxes are jointly exploited. The embedding makes it possible to determine whether two heavily-overlapping boxes belong to the same object in the physical world. Our approach is particularly useful for car detection and pedestrian detection in urban scenes where occlusions tend to happen. We validate our approach on two widely-adopted datasets, KITTI and CityPersons, and achieve state-of-the-art performance. |
Tasks | Object Detection, Pedestrian Detection |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01674v2 |
https://arxiv.org/pdf/1912.01674v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-separate-detecting-heavily |
Repo | |
Framework | |
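For context, the sketch below shows the classical greedy NMS that methods like this one set out to improve; it is not the embedding-based algorithm proposed in the paper. It illustrates why heavy occlusion is a problem: a valid detection of a second object is discarded whenever its box overlaps a higher-scoring box by more than the threshold. Boxes are assumed to be `(x1, y1, x2, y2)` tuples.

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Classical greedy NMS: return indices of kept boxes, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

With a fixed IoU threshold, two genuinely distinct but heavily overlapping objects cannot both survive, which is the failure case the abstract's embedding is meant to resolve.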
Active Learning for Deep Detection Neural Networks
Title | Active Learning for Deep Detection Neural Networks |
Authors | Hamed H. Aghdam, Abel Gonzalez-Garcia, Joost van de Weijer, Antonio M. López |
Abstract | The cost of drawing object bounding boxes (i.e. labeling) for millions of images is prohibitively high. For instance, labeling pedestrians in a regular urban image could take 35 seconds on average. Active learning aims to reduce the cost of labeling by selecting only those images that are informative to improve the detection network accuracy. In this paper, we propose a method to perform active learning of object detectors based on convolutional neural networks. We propose a new image-level scoring process to rank unlabeled images for their automatic selection, which clearly outperforms classical scores. The proposed method can be applied to videos and sets of still images. In the former case, temporal selection rules can complement our scoring process. As a relevant use case, we extensively study the performance of our method on the task of pedestrian detection. Overall, the experiments show that the proposed method performs better than random selection. Our codes are publicly available at www.gitlab.com/haghdam/deep_active_learning. |
Tasks | Active Learning, Pedestrian Detection |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.09168v1 |
https://arxiv.org/pdf/1911.09168v1.pdf | |
PWC | https://paperswithcode.com/paper/active-learning-for-deep-detection-neural-1 |
Repo | |
Framework | |
Automatically Inferring Gender Associations from Language
Title | Automatically Inferring Gender Associations from Language |
Authors | Serina Chang, Kathleen McKeown |
Abstract | In this paper, we pose the question: do people talk about women and men in different ways? We introduce two datasets and a novel integration of approaches for automatically inferring gender associations from language, discovering coherent word clusters, and labeling the clusters for the semantic concepts they represent. The datasets allow us to compare how people write about women and men in two different settings - one set draws from celebrity news and the other from student reviews of computer science professors. We demonstrate that there are large-scale differences in the ways that people talk about women and men and that these differences vary across domains. Human evaluations show that our methods significantly outperform strong baselines. |
Tasks | |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1909.00091v1 |
https://arxiv.org/pdf/1909.00091v1.pdf | |
PWC | https://paperswithcode.com/paper/automatically-inferring-gender-associations |
Repo | |
Framework | |
Fast Polynomial Kernel Classification for Massive Data
Title | Fast Polynomial Kernel Classification for Massive Data |
Authors | Jinshan Zeng, Minrun Wu, Shao-Bo Lin, Ding-Xuan Zhou |
Abstract | In the era of big data, it is highly desired to develop efficient machine learning algorithms to tackle massive data challenges such as storage bottleneck, algorithmic scalability, and interpretability. In this paper, we develop a novel efficient classification algorithm, called fast polynomial kernel classification (FPC), to conquer the scalability and storage challenges. Our main tools are a suitable selected feature mapping based on polynomial kernels and an alternating direction method of multipliers (ADMM) algorithm for a related non-smooth convex optimization problem. Fast learning rates as well as feasibility verifications including the convergence of ADMM and the selection of center points are established to justify theoretical behaviors of FPC. Our theoretical assertions are verified by a series of simulations and real data applications. The numerical results demonstrate that FPC significantly reduces the computational burden and storage memory of the existing learning schemes such as support vector machines and boosting, without sacrificing their generalization abilities much. |
Tasks | |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/1911.10558v2 |
https://arxiv.org/pdf/1911.10558v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-polynomial-kernel-classification-for |
Repo | |
Framework | |
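To make the phrase "feature mapping based on polynomial kernels" concrete, the sketch below evaluates an inhomogeneous polynomial kernel against a set of center points and uses the results as features. This is a generic construction shown for illustration; the kernel form, the degree, and how FPC actually selects its centers are assumptions not confirmed by the abstract.

```python
def poly_kernel(x, y, degree=2):
    """Inhomogeneous polynomial kernel (1 + <x, y>) ** degree."""
    return (1.0 + sum(a * b for a, b in zip(x, y))) ** degree


def kernel_features(x, centers, degree=2):
    """Map x to one feature per center point (centers are assumed given)."""
    return [poly_kernel(x, c, degree) for c in centers]
```

With a small, fixed set of centers, each sample is reduced to a short feature vector, so a linear classifier trained on these features avoids storing a full kernel matrix over all training points.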
Artificial Consciousness and Security
Title | Artificial Consciousness and Security |
Authors | Andrew Powell |
Abstract | This paper describes a possible way to improve computer security through a program that implements the following three features related to a weak notion of artificial consciousness: (partial) self-monitoring, the ability to compute the truth of quantifier-free propositions, and the ability to communicate with the user. The integrity of the program could be enhanced by using a trusted computing approach, that is to say a hardware module that is at the root of a chain of trust. This paper outlines a possible approach but does not refer to an implementation (which would need further work); the author believes, however, that an implementation using current processors, a debugger, a monitoring program and a trusted processing module is currently possible. |
Tasks | |
Published | 2019-05-11 |
URL | https://arxiv.org/abs/1905.11807v1 |
https://arxiv.org/pdf/1905.11807v1.pdf | |
PWC | https://paperswithcode.com/paper/190511807 |
Repo | |
Framework | |
Investigations on the inference optimization techniques and their impact on multiple hardware platforms for Semantic Segmentation
Title | Investigations on the inference optimization techniques and their impact on multiple hardware platforms for Semantic Segmentation |
Authors | Sethu Hareesh Kolluru |
Abstract | In this work, the task of pixel-wise semantic segmentation in the context of self-driving is explored, with the goal of reducing inference time. Fully Convolutional Networks (FCN-8s, FCN-16s, and FCN-32s) with a VGG16 encoder architecture and skip connections are trained and validated on the Cityscapes dataset. Numerical investigations are carried out for several inference optimization techniques built into TensorFlow and TensorRT to quantify their impact on the inference time and network size. Finally, the trained network is ported onto an embedded platform (Nvidia Jetson TX1), and the inference time, as well as the total energy consumed for inference, is compared across hardware platforms. |
Tasks | Semantic Segmentation |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1911.12993v1 |
https://arxiv.org/pdf/1911.12993v1.pdf | |
PWC | https://paperswithcode.com/paper/investigations-on-the-inference-optimization |
Repo | |
Framework | |