July 29, 2019


Paper Group AWR 122

Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks. Accelerating Science with Generative Adversarial Networks: An Application to 3D Particle Showers in Multi-Layer Calorimeters. Neutral evolution and turnover over centuries of English word popularity. Neural network analysis of sleep stages enables efficient diagno …

Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks

Title Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks
Authors Nathan Hartmann, Erick Fonseca, Christopher Shulby, Marcos Treviso, Jessica Rodrigues, Sandra Aluisio
Abstract Word embeddings have been found to provide meaningful representations for words in an efficient way; therefore, they have become common in Natural Language Processing systems. In this paper, we evaluated different word embedding models trained on a large Portuguese corpus, including both Brazilian and European variants. We trained 31 word embedding models using FastText, GloVe, Wang2Vec and Word2Vec. We evaluated them intrinsically on syntactic and semantic analogies and extrinsically on POS tagging and sentence semantic similarity tasks. The obtained results suggest that word analogies are not appropriate for word embedding evaluation; task-specific evaluations appear to be a better option.
Tasks Semantic Similarity, Semantic Textual Similarity, Word Embeddings
Published 2017-08-20
URL http://arxiv.org/abs/1708.06025v1
PDF http://arxiv.org/pdf/1708.06025v1.pdf
PWC https://paperswithcode.com/paper/portuguese-word-embeddings-evaluating-on-word
Repo https://github.com/nathanshartmann/portuguese_word_embeddings
Framework none
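
For readers unfamiliar with the intrinsic evaluation mentioned in the abstract, the following is a minimal sketch of how a word analogy query is typically scored with vector arithmetic and cosine similarity. The toy vectors, the helper names `cosine` and `solve_analogy`, and the choice of this particular scoring rule are assumptions made for illustration; they are not taken from the paper or its repository.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(embeddings, a, b, c):
    """Answer 'a is to b as c is to ?' by ranking cos(v, b - a + c).

    The three query words are excluded from the candidates, which is a
    common convention in analogy benchmarks.
    """
    target = [vb - va + vc
              for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))

# Hypothetical 3-dimensional toy vectors, not taken from any trained model.
toy = {
    "homem": [1.0, 0.0, 0.2],
    "mulher": [1.0, 1.0, 0.2],
    "rei": [0.2, 0.0, 1.0],
    "rainha": [0.2, 1.0, 1.0],
    "mesa": [0.9, 0.1, 0.0],
}

prediction = solve_analogy(toy, "homem", "mulher", "rei")
print(prediction)
```

Analogy accuracy is then the fraction of such queries answered correctly. The abstract's conclusion is that this score should not be relied on as a proxy for performance on downstream tasks.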

Accelerating Science with Generative Adversarial Networks: An Application to 3D Particle Showers in Multi-Layer Calorimeters

Title Accelerating Science with Generative Adversarial Networks: An Application to 3D Particle Showers in Multi-Layer Calorimeters
Authors Michela Paganini, Luke de Oliveira, Benjamin Nachman
Abstract Physicists at the Large Hadron Collider (LHC) rely on detailed simulations of particle collisions to build expectations of what experimental data may look like under different theory modeling assumptions. Petabytes of simulated data are needed to develop analysis techniques, though they are expensive to generate using existing algorithms and computing resources. The modeling of detectors and the precise description of particle cascades as they interact with the material in the calorimeter are the most computationally demanding steps in the simulation pipeline. We therefore introduce a deep neural network-based generative model to enable high-fidelity, fast, electromagnetic calorimeter simulation. There are still challenges for achieving precision across the entire phase space, but our current solution can reproduce a variety of particle shower properties while achieving speed-up factors of up to 100,000$\times$. This opens the door to a new era of fast simulation that could save significant computing time and disk space, while extending the reach of physics searches and precision measurements at the LHC and beyond.
Tasks
Published 2017-05-05
URL http://arxiv.org/abs/1705.02355v2
PDF http://arxiv.org/pdf/1705.02355v2.pdf
PWC https://paperswithcode.com/paper/accelerating-science-with-generative
Repo https://github.com/hep-lbdl/CaloGAN
Framework none

Neutral evolution and turnover over centuries of English word popularity

Title Neutral evolution and turnover over centuries of English word popularity
Authors Damian Ruck, R. Alexander Bentley, Alberto Acerbi, Philip Garnett, Daniel J. Hruschka
Abstract Here we test Neutral models against the evolution of English word frequency and vocabulary at the population scale, as recorded in annual word frequencies from three centuries of English language books. Against these data, we test both static and dynamic predictions of two neutral models, including the relation between corpus size and vocabulary size, frequency distributions, and turnover within those frequency distributions. Although a commonly used Neutral model fails to replicate all these emergent properties at once, we find that a modified two-stage Neutral model does replicate the static and dynamic properties of the corpus data. This two-stage model is meant to represent a relatively small corpus (population) of English books, analogous to a 'canon', sampled by an exponentially increasing corpus of books in the wider population of authors. More broadly, this model (a smaller neutral model within a larger neutral model) could represent those situations where mass attention is focused on a small subset of the cultural variants.
Tasks
Published 2017-03-30
URL http://arxiv.org/abs/1703.10698v1
PDF http://arxiv.org/pdf/1703.10698v1.pdf
PWC https://paperswithcode.com/paper/neutral-evolution-and-turnover-over-centuries
Repo https://github.com/dr2g08/Neutral-evolution-and-turnover-over-centuries-of-English-word-popularity
Framework none

Neural network analysis of sleep stages enables efficient diagnosis of narcolepsy

Title Neural network analysis of sleep stages enables efficient diagnosis of narcolepsy
Authors Jens B. Stephansen, Alexander N. Olesen, Mads Olsen, Aditya Ambati, Eileen B. Leary, Hyatt E. Moore, Oscar Carrillo, Ling Lin, Fang Han, Han Yan, Yun L. Sun, Yves Dauvilliers, Sabine Scholz, Lucie Barateau, Birgit Hogl, Ambra Stefani, Seung Chul Hong, Tae Won Kim, Fabio Pizza, Giuseppe Plazzi, Stefano Vandi, Elena Antelmi, Dimitri Perrin, Samuel T. Kuna, Paula K. Schweitzer, Clete Kushida, Paul E. Peppard, Helge B. D. Sorensen, Poul Jennum, Emmanuel Mignot
Abstract Analysis of sleep for the diagnosis of sleep disorders such as Type-1 Narcolepsy (T1N) currently requires visual inspection of polysomnography records by trained scoring technicians. Here, we used neural networks in approximately 3,000 normal and abnormal sleep recordings to automate sleep stage scoring, producing a hypnodensity graph - a probability distribution conveying more information than classical hypnograms. Accuracy of sleep stage scoring was validated in 70 subjects assessed by six scorers. The best model performed better than any individual scorer (87% versus consensus). It also reliably scores sleep down to 5-second instead of 30-second scoring epochs. A T1N marker based on unusual sleep-stage overlaps achieved a specificity of 96% and a sensitivity of 91%, validated in independent datasets. Addition of HLA-DQB1*06:02 typing increased specificity to 99%. Our method can reduce time spent in sleep clinics and automates T1N diagnosis. It also opens the possibility of diagnosing T1N using home sleep studies.
Tasks
Published 2017-10-05
URL http://arxiv.org/abs/1710.02094v2
PDF http://arxiv.org/pdf/1710.02094v2.pdf
PWC https://paperswithcode.com/paper/neural-network-an1alysis-of-sleep-stages
Repo https://github.com/stanford-stages/stanford-stages
Framework tf

Topic Modeling based on Keywords and Context

Title Topic Modeling based on Keywords and Context
Authors Johannes Schneider
Abstract Current topic models often suffer from discovering topics not matching human intuition, unnatural switching of topics within documents and high computational demands. We address these concerns by proposing a topic model and an inference algorithm based on automatically identifying characteristic keywords for topics. Keywords influence topic-assignments of nearby words. Our algorithm learns (key)word-topic scores and it self-regulates the number of topics. Inference is simple and easily parallelizable. Qualitative analysis yields comparable results to state-of-the-art models (e.g., LDA), but with different strengths and weaknesses. Quantitative analysis using 9 datasets shows gains in terms of classification accuracy, PMI score, computational performance and consistency of topic assignments within documents, while most often using fewer topics.
Tasks Topic Models
Published 2017-10-07
URL http://arxiv.org/abs/1710.02650v2
PDF http://arxiv.org/pdf/1710.02650v2.pdf
PWC https://paperswithcode.com/paper/topic-modeling-based-on-keywords-and-context
Repo https://github.com/JohnTailor/tkm
Framework none

Dialogue Act Sequence Labeling using Hierarchical encoder with CRF

Title Dialogue Act Sequence Labeling using Hierarchical encoder with CRF
Authors Harshit Kumar, Arvind Agarwal, Riddhiman Dasgupta, Sachindra Joshi, Arun Kumar
Abstract Dialogue Act recognition associates dialogue acts (i.e., semantic labels) with utterances in a conversation. The problem of associating semantic labels with utterances can be treated as a sequence labeling problem. In this work, we build a hierarchical recurrent neural network using bidirectional LSTM as a base unit and the conditional random field (CRF) as the top layer to classify each utterance into its corresponding dialogue act. The hierarchical network learns representations at multiple levels, i.e., word level, utterance level, and conversation level. The conversation level representations are input to the CRF layer, which takes into account not only all previous utterances but also their dialogue acts, thus modeling the dependency among both labels and utterances, an important consideration of natural dialogue. We validate our approach on two different benchmark data sets, Switchboard and Meeting Recorder Dialogue Act, and show performance improvement over the state-of-the-art methods by $2.2\%$ and $4.1\%$ absolute points, respectively. It is worth noting that the inter-annotator agreement on the Switchboard data set is $84\%$, and our method is able to achieve an accuracy of about $79\%$ despite being trained on the noisy data.
Tasks Dialogue Act Classification
Published 2017-09-13
URL http://arxiv.org/abs/1709.04250v2
PDF http://arxiv.org/pdf/1709.04250v2.pdf
PWC https://paperswithcode.com/paper/dialogue-act-sequence-labeling-using
Repo https://github.com/ilimugur/short-text-classification
Framework tf

Adversarial Discriminative Sim-to-real Transfer of Visuo-motor Policies

Title Adversarial Discriminative Sim-to-real Transfer of Visuo-motor Policies
Authors Fangyi Zhang, Jürgen Leitner, Zongyuan Ge, Michael Milford, Peter Corke
Abstract Various approaches have been proposed to learn visuo-motor policies for real-world robotic applications. One solution is first learning in simulation then transferring to the real world. In the transfer, most existing approaches need real-world images with labels. However, the labelling process is often expensive or even impractical in many robotic applications. In this paper, we propose an adversarial discriminative sim-to-real transfer approach to reduce the cost of labelling real data. The effectiveness of the approach is demonstrated with modular networks in a table-top object reaching task where a 7 DoF arm is controlled in velocity mode to reach a blue cuboid in clutter through visual observations. The adversarial transfer approach reduced the labelled real data requirement by 50%. Policies can be transferred to real environments with only 93 labelled and 186 unlabelled real images. The transferred visuo-motor policies are robust to novel (not seen in training) objects in clutter and even a moving target, achieving a 97.8% success rate and 1.8 cm control accuracy.
Tasks
Published 2017-09-18
URL http://arxiv.org/abs/1709.05746v2
PDF http://arxiv.org/pdf/1709.05746v2.pdf
PWC https://paperswithcode.com/paper/adversarial-discriminative-sim-to-real
Repo https://github.com/Fanleyrobot/ADT
Framework torch

Generative Partition Networks for Multi-Person Pose Estimation

Title Generative Partition Networks for Multi-Person Pose Estimation
Authors Xuecheng Nie, Jiashi Feng, Junliang Xing, Shuicheng Yan
Abstract This paper proposes a new Generative Partition Network (GPN) to address the challenging multi-person pose estimation problem. Different from existing models that are either completely top-down or bottom-up, the proposed GPN introduces a novel strategy: it generates partitions for multiple persons from their global joint candidates and infers instance-specific joint configurations simultaneously. The GPN features low complexity and high accuracy of joint detection and re-organization. In particular, GPN designs a generative model that performs one feed-forward pass to efficiently generate robust person detections with joint partitions, relying on dense regressions from global joint candidates in an embedding space parameterized by centroids of persons. In addition, GPN formulates the inference procedure for joint configurations of human poses as a graph partition problem, and conducts local optimization for each person detection with reliable global affinity cues, leading to complexity reduction and performance improvement. GPN is implemented with the Hourglass architecture as the backbone network to simultaneously learn the joint detector and dense regressor. Extensive experiments on the benchmarks MPII Human Pose Multi-Person, extended PASCAL-Person-Part, and WAF show the efficiency of GPN with new state-of-the-art performance.
Tasks Human Detection, Multi-Person Pose Estimation, Pose Estimation
Published 2017-05-21
URL http://arxiv.org/abs/1705.07422v2
PDF http://arxiv.org/pdf/1705.07422v2.pdf
PWC https://paperswithcode.com/paper/generative-partition-networks-for-multi
Repo https://github.com/NieXC/pytorch-ppn
Framework pytorch

Automatic Differentiation for Tensor Algebras

Title Automatic Differentiation for Tensor Algebras
Authors Sebastian Urban, Patrick van der Smagt
Abstract Kjolstad et al. proposed a tensor algebra compiler. It takes expressions that define a tensor element-wise, such as $f_{ij}(a,b,c,d) = \exp\left[-\sum_{k=0}^4 \left((a_{ik}+b_{jk})^2\, c_{ii} + d_{i+k}^3 \right) \right]$, and generates the corresponding compute kernel code. For machine learning, especially deep learning, it is often necessary to compute the gradient of a loss function $l(a,b,c,d)=l(f(a,b,c,d))$ with respect to parameters $a,b,c,d$. If tensor compilers are to be applied in this field, it is necessary to derive expressions for the derivatives of element-wise defined tensors, i.e. expressions for $(da)_{ik}=\partial l/\partial a_{ik}$. When the mapping between function indices and argument indices is not 1:1, special attention is required. For the function $f_{ij} (x) = x_i^2$, the derivative of the loss is $(dx)_i=\partial l/\partial x_i=\sum_j (df)_{ij}2x_i$; the sum is necessary because index $j$ does not appear in the indices of $x$. Another example is $f_{i}(x)=x_{ii}^2$, where $x$ is a matrix; here we have $(dx)_{ij}=\delta_{ij}(df)_i2x_{ii}$; the Kronecker delta is necessary because the derivative is zero for off-diagonal elements. Another indexing scheme is used by $f_{ij}(x)=\exp x_{i+j}$; here the correct derivative is $(dx)_{k}=\sum_i (df)_{i,k-i} \exp x_{k}$, where the range of the sum must be chosen appropriately. In this publication we present an algorithm that can handle any case in which the indices of an argument are an arbitrary linear combination of the indices of the function, thus all the above examples can be handled. Sums (and their ranges) and Kronecker deltas are automatically inserted into the derivatives as necessary. Additionally, the indices are transformed, if required (as in the last example). The algorithm outputs a symbolic expression that can be subsequently fed into a tensor algebra compiler. Source code is provided.
Tasks
Published 2017-11-03
URL http://arxiv.org/abs/1711.01348v1
PDF http://arxiv.org/pdf/1711.01348v1.pdf
PWC https://paperswithcode.com/paper/automatic-differentiation-for-tensor-algebras
Repo https://github.com/surban/TensorAlgDiff
Framework none
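
The three derivative formulas quoted in the abstract can be checked numerically. The sketch below is not the paper's symbolic algorithm: it assumes a hypothetical linear loss $l = \sum w \cdot f$ (so that $(df)$ is just the weight array) and compares each closed-form derivative against central finite differences. The names `numeric_grad`, `max_abs_diff` and `check_examples`, and all numeric values, are illustrative.

```python
import math

def numeric_grad(loss, x, eps=1e-6):
    """Central finite-difference gradient of a scalar loss w.r.t. a flat list x."""
    grad = []
    for k in range(len(x)):
        up = x[:k] + [x[k] + eps] + x[k + 1:]
        down = x[:k] + [x[k] - eps] + x[k + 1:]
        grad.append((loss(up) - loss(down)) / (2 * eps))
    return grad

def max_abs_diff(a, b):
    return max(abs(p - q) for p, q in zip(a, b))

def check_examples():
    """Return the max error of each analytic derivative vs. finite differences."""
    n, m = 3, 2
    w = [[0.5, -1.0], [2.0, 0.25], [-0.75, 1.5]]  # plays the role of (df)_ij
    v = [1.0, -2.0, 0.5]                           # plays the role of (df)_i

    # Example 1: f_ij(x) = x_i^2, so (dx)_i = sum_j (df)_ij * 2 * x_i
    x1 = [0.3, -1.2, 0.8]
    loss1 = lambda x: sum(w[i][j] * x[i] ** 2 for i in range(n) for j in range(m))
    grad1 = [sum(w[i][j] for j in range(m)) * 2 * x1[i] for i in range(n)]

    # Example 2: f_i(x) = x_ii^2 with x an n x n matrix stored row-major,
    # so (dx)_ij = delta_ij * (df)_i * 2 * x_ii
    x2 = [0.2, 0.5, -0.3, 1.1, -0.7, 0.4, 0.9, -0.6, 0.8]
    loss2 = lambda x: sum(v[i] * x[i * n + i] ** 2 for i in range(n))
    grad2 = [v[i] * 2 * x2[i * n + i] if i == j else 0.0
             for i in range(n) for j in range(n)]

    # Example 3: f_ij(x) = exp(x_{i+j}), so (dx)_k = sum_i (df)_{i,k-i} * exp(x_k),
    # with i restricted so that both i and k - i are valid indices.
    x3 = [0.1, -0.4, 0.7, 0.2]  # length n + m - 1
    loss3 = lambda x: sum(w[i][j] * math.exp(x[i + j])
                          for i in range(n) for j in range(m))
    grad3 = [sum(w[i][k - i] for i in range(n) if 0 <= k - i < m) * math.exp(x3[k])
             for k in range(len(x3))]

    return (max_abs_diff(grad1, numeric_grad(loss1, x1)),
            max_abs_diff(grad2, numeric_grad(loss2, x2)),
            max_abs_diff(grad3, numeric_grad(loss3, x3)))

errors = check_examples()
print(errors)
```

The third example shows why the summation range matters: for each $k$, only those $i$ with $0 \le k-i < m$ contribute, which is the kind of bookkeeping the abstract says the algorithm inserts automatically.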

Deep Forest

Title Deep Forest
Authors Zhi-Hua Zhou, Ji Feng
Abstract Current deep learning models are mostly built upon neural networks, i.e., multiple layers of parameterized differentiable nonlinear modules that can be trained by backpropagation. In this paper, we explore the possibility of building deep models based on non-differentiable modules. We conjecture that the mystery behind the success of deep neural networks owes much to three characteristics, i.e., layer-by-layer processing, in-model feature transformation and sufficient model complexity. We propose the gcForest approach, which generates a deep forest holding these characteristics. This is a decision tree ensemble approach, with far fewer hyper-parameters than deep neural networks, and its model complexity can be automatically determined in a data-dependent way. Experiments show that its performance is quite robust to hyper-parameter settings, such that in most cases, even across different data from different domains, it is able to get excellent performance by using the same default setting. This study opens the door to deep learning based on non-differentiable modules, and exhibits the possibility of constructing deep models without using backpropagation.
Tasks
Published 2017-02-28
URL http://arxiv.org/abs/1702.08835v3
PDF http://arxiv.org/pdf/1702.08835v3.pdf
PWC https://paperswithcode.com/paper/deep-forest
Repo https://github.com/leopiney/deep-forest
Framework none

FFDNet: Toward a Fast and Flexible Solution for CNN based Image Denoising

Title FFDNet: Toward a Fast and Flexible Solution for CNN based Image Denoising
Authors Kai Zhang, Wangmeng Zuo, Lei Zhang
Abstract Due to the fast inference and good performance, discriminative learning methods have been widely studied in image denoising. However, these methods mostly learn a specific model for each noise level, and require multiple models for denoising images with different noise levels. They also lack flexibility to deal with spatially variant noise, limiting their applications in practical denoising. To address these issues, we present a fast and flexible denoising convolutional neural network, namely FFDNet, with a tunable noise level map as the input. The proposed FFDNet works on downsampled sub-images, achieving a good trade-off between inference speed and denoising performance. In contrast to the existing discriminative denoisers, FFDNet enjoys several desirable properties, including (i) the ability to handle a wide range of noise levels (i.e., [0, 75]) effectively with a single network, (ii) the ability to remove spatially variant noise by specifying a non-uniform noise level map, and (iii) faster speed than benchmark BM3D even on CPU without sacrificing denoising performance. Extensive experiments on synthetic and real noisy images are conducted to evaluate FFDNet in comparison with state-of-the-art denoisers. The results show that FFDNet is effective and efficient, making it highly attractive for practical denoising applications.
Tasks Denoising, Image Denoising
Published 2017-10-11
URL http://arxiv.org/abs/1710.04026v2
PDF http://arxiv.org/pdf/1710.04026v2.pdf
PWC https://paperswithcode.com/paper/ffdnet-toward-a-fast-and-flexible-solution
Repo https://github.com/Aoi-hosizora/FFDNet_pytorch
Framework pytorch
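
To make the two input ingredients named in the abstract concrete (downsampled sub-images and a tunable noise level map), here is a minimal sketch of how such a network input could be assembled. The downsampling factor of 2, the channel ordering, and the helper names `to_sub_images` and `build_input` are assumptions for illustration; the abstract does not specify them, and this is not code from the listed repository.

```python
def to_sub_images(image, scale=2):
    """Split an H x W image (list of rows) into scale*scale interleaved sub-images.

    Sub-image (dy, dx) keeps the pixels at rows dy, dy+scale, ... and
    columns dx, dx+scale, ..., so no pixel is discarded.
    """
    height, width = len(image), len(image[0])
    if height % scale or width % scale:
        raise ValueError("image size must be divisible by the scale factor")
    return [
        [[image[y][x] for x in range(dx, width, scale)]
         for y in range(dy, height, scale)]
        for dy in range(scale)
        for dx in range(scale)
    ]

def build_input(image, sigma, scale=2):
    """Stack the sub-images with a uniform noise level map as an extra channel."""
    channels = to_sub_images(image, scale)
    sub_h, sub_w = len(channels[0]), len(channels[0][0])
    noise_map = [[float(sigma)] * sub_w for _ in range(sub_h)]
    return channels + [noise_map]

# Hypothetical 4 x 4 grayscale image with pixel values 0..15.
image = [[4 * row + col for col in range(4)] for row in range(4)]
channels = build_input(image, sigma=25)
print(len(channels), channels[0], channels[-1])
```

A spatially variant noise map, as described in property (ii) of the abstract, would replace the constant `noise_map` with per-pixel values at the sub-image resolution.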

Protein identification with deep learning: from abc to xyz

Title Protein identification with deep learning: from abc to xyz
Authors Ngoc Hieu Tran, Zachariah Levine, Lei Xin, Baozhen Shan, Ming Li
Abstract Proteins are the main workhorses of biological functions in a cell, a tissue, or an organism. Identification and quantification of proteins in a given sample, e.g. a cell type under normal/disease conditions, are fundamental tasks for the understanding of human health and disease. In this paper, we present DeepNovo, a deep learning-based tool to address the problem of protein identification from tandem mass spectrometry data. The idea was first proposed in the context of de novo peptide sequencing [1] in which convolutional neural networks and recurrent neural networks were applied to predict the amino acid sequence of a peptide from its spectrum, a similar task to generating a caption from an image. We further develop DeepNovo to perform sequence database search, the main technique for peptide identification that greatly benefits from numerous existing protein databases. We combine two modules, de novo sequencing and database search, into a single deep learning framework for peptide identification, and integrate the de Bruijn graph assembly technique to offer a complete solution to reconstruct protein sequences from tandem mass spectrometry data. This paper describes a comprehensive protocol of DeepNovo for protein identification, including training neural network models, dynamic programming search, database querying, estimation of false discovery rate, and de Bruijn graph assembly. Training and testing data, model implementations, and comprehensive tutorials in the form of IPython notebooks are available in our GitHub repository (https://github.com/nh2tran/DeepNovo).
Tasks
Published 2017-10-08
URL http://arxiv.org/abs/1710.02765v1
PDF http://arxiv.org/pdf/1710.02765v1.pdf
PWC https://paperswithcode.com/paper/protein-identification-with-deep-learning
Repo https://github.com/nh2tran/DeepNovo
Framework tf

Universal Semantic Parsing

Title Universal Semantic Parsing
Authors Siva Reddy, Oscar Täckström, Slav Petrov, Mark Steedman, Mirella Lapata
Abstract Universal Dependencies (UD) offer a uniform cross-lingual syntactic representation, with the aim of advancing multilingual applications. Recent work shows that semantic parsing can be accomplished by transforming syntactic dependencies to logical forms. However, this work is limited to English, and cannot process dependency graphs, which allow handling complex phenomena such as control. In this work, we introduce UDepLambda, a semantic interface for UD, which maps natural language to logical forms in an almost language-independent fashion and can process dependency graphs. We perform experiments on question answering against Freebase and provide German and Spanish translations of the WebQuestions and GraphQuestions datasets to facilitate multilingual evaluation. Results show that UDepLambda outperforms strong baselines across languages and datasets. For English, it achieves a 4.9 F1 point improvement over the state-of-the-art on GraphQuestions. Our code and data can be downloaded at https://github.com/sivareddyg/udeplambda.
Tasks Question Answering, Semantic Parsing
Published 2017-02-10
URL http://arxiv.org/abs/1702.03196v4
PDF http://arxiv.org/pdf/1702.03196v4.pdf
PWC https://paperswithcode.com/paper/universal-semantic-parsing
Repo https://github.com/sivareddyg/udeplambda
Framework none

FDR-Corrected Sparse Canonical Correlation Analysis with Applications to Imaging Genomics

Title FDR-Corrected Sparse Canonical Correlation Analysis with Applications to Imaging Genomics
Authors Alexej Gossmann, Pascal Zille, Vince Calhoun, Yu-Ping Wang
Abstract Reducing the number of false discoveries is presently one of the most pressing issues in the life sciences. It is of especially great importance for many applications in neuroimaging and genomics, where datasets are typically high-dimensional, which means that the number of explanatory variables exceeds the sample size. The false discovery rate (FDR) is a criterion that can be employed to address that issue. Thus it has gained great popularity as a tool for testing multiple hypotheses. Canonical correlation analysis (CCA) is a statistical technique that is used to make sense of the cross-correlation of two sets of measurements collected on the same set of samples (e.g., brain imaging and genomic data for the same mental illness patients), and sparse CCA extends the classical method to high-dimensional settings. Here we propose a way of applying the FDR concept to sparse CCA, and a method to control the FDR. The proposed FDR correction directly influences the sparsity of the solution, adapting it to the unknown true sparsity level. Theoretical derivation as well as simulation studies show that our procedure indeed keeps the FDR of the canonical vectors below a user-specified target level. We apply the proposed method to an imaging genomics dataset from the Philadelphia Neurodevelopmental Cohort. Our results link the brain connectivity profiles derived from brain activity during an emotion identification task, as measured by functional magnetic resonance imaging (fMRI), to the corresponding subjects’ genomic data.
Tasks
Published 2017-05-11
URL http://arxiv.org/abs/1705.04312v4
PDF http://arxiv.org/pdf/1705.04312v4.pdf
PWC https://paperswithcode.com/paper/fdr-corrected-sparse-canonical-correlation
Repo https://github.com/agisga/FDRcorrectedSCCA
Framework none
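
As background for the FDR criterion discussed in the abstract, the sketch below shows the classical Benjamini-Hochberg step-up procedure for a list of p-values. It is included only to illustrate what controlling the FDR at a target level means; it is not the sparse-CCA correction proposed in the paper, and the function name and example p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: sort the p-values, find the largest rank k such that
    p_(k) <= (k / m) * q, and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected

# Hypothetical p-values for eight tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
decisions = benjamini_hochberg(p_values, q=0.05)
print(decisions)
```

With these values only the two smallest p-values pass their rank-dependent thresholds (0.00625 and 0.0125), so two hypotheses are rejected at a target FDR of 0.05.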

From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood

Title From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood
Authors Kelvin Guu, Panupong Pasupat, Evan Zheran Liu, Percy Liang
Abstract Our goal is to learn a semantic parser that maps natural language utterances into executable programs when only indirect supervision is available: examples are labeled with the correct execution result, but not the program itself. Consequently, we must search the space of programs for those that output the correct result, while not being misled by spurious programs: incorrect programs that coincidentally output the correct result. We connect two common learning paradigms, reinforcement learning (RL) and maximum marginal likelihood (MML), and then present a new learning algorithm that combines the strengths of both. The new algorithm guards against spurious programs by combining the systematic search traditionally employed in MML with the randomized exploration of RL, and by updating parameters such that probability is spread more evenly across consistent programs. We apply our learning algorithm to a new neural semantic parser and show significant gains over existing state-of-the-art results on a recent context-dependent semantic parsing task.
Tasks Semantic Parsing
Published 2017-04-25
URL http://arxiv.org/abs/1704.07926v1
PDF http://arxiv.org/pdf/1704.07926v1.pdf
PWC https://paperswithcode.com/paper/from-language-to-programs-bridging
Repo https://github.com/kelvinguu/lang2program
Framework tf
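
The connection between RL and MML mentioned in the abstract can be seen from how each objective weights the gradient of $\log p(z \mid x)$ for a candidate program $z$ with reward $R(z)$. Differentiating the expected reward $\sum_z p(z \mid x) R(z)$ gives weight $p(z \mid x) R(z)$, while differentiating $\log \sum_z p(z \mid x) R(z)$ gives the same quantity normalized over programs. The sketch below computes both. The `beta` tempering knob is a hypothetical illustration of spreading weight more evenly across consistent programs, and should not be read as the paper's exact update rule.

```python
def gradient_weights(probs, rewards, beta=1.0):
    """Per-program weights on grad log p(z|x).

    rl:       p(z|x) * R(z)                      (expected-reward objective)
    mml:      p(z|x) * R(z) / sum_z' p(z'|x) R(z')  (marginal-likelihood objective)
    tempered: like mml, but with p(z|x) ** beta; beta=0 is uniform over
              rewarded programs (hypothetical knob, for illustration only).
    """
    rl = [p * r for p, r in zip(probs, rewards)]
    total = sum(rl)
    if total == 0:
        raise ValueError("no candidate program receives reward")
    mml = [value / total for value in rl]
    powered = [(p ** beta) * r for p, r in zip(probs, rewards)]
    tempered = [value / sum(powered) for value in powered]
    return rl, mml, tempered

# Hypothetical candidates: two programs produce the correct result (reward 1),
# one high-probability program does not (reward 0).
probs = [0.01, 0.04, 0.95]
rewards = [1.0, 1.0, 0.0]
rl, mml, uniform = gradient_weights(probs, rewards, beta=0.0)
print(rl, mml, uniform)
```

In this example both RL and MML favor the second program over the first by a factor of four, purely because the model already prefers it. If that program happens to be spurious, the preference is reinforced, which is the failure mode the abstract's "spread more evenly" remark addresses.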