May 7, 2019

2755 words 13 mins read

Paper Group AWR 75


A FOFE-based Local Detection Approach for Named Entity Recognition and Mention Detection

Title A FOFE-based Local Detection Approach for Named Entity Recognition and Mention Detection
Authors Mingbin Xu, Hui Jiang
Abstract In this paper, we study a novel approach for named entity recognition (NER) and mention detection in natural language processing. Instead of treating NER as a sequence labelling problem, we propose a new local detection approach, which relies on the recent fixed-size ordinally forgetting encoding (FOFE) method to fully encode each sentence fragment and its left/right contexts into a fixed-size representation. Afterwards, a simple feedforward neural network is used to reject or predict the entity label for each individual fragment. The proposed method has been evaluated on several popular NER and mention detection tasks, including the CoNLL 2003 NER task and the TAC-KBP2015 and TAC-KBP2016 Tri-lingual Entity Discovery and Linking (EDL) tasks. Our method has yielded strong performance in all of these tasks. This local detection approach has shown many advantages over traditional sequence labelling methods.
Tasks Named Entity Recognition
Published 2016-11-02
URL http://arxiv.org/abs/1611.00801v1
PDF http://arxiv.org/pdf/1611.00801v1.pdf
PWC https://paperswithcode.com/paper/a-fofe-based-local-detection-approach-for
Repo https://github.com/xmb-cipher/fofe-ner
Framework tf
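
A minimal sketch of the fixed-size ordinally forgetting encoding described in the abstract: each fragment or context is collapsed into a single vector via z_t = alpha * z_{t-1} + e_t, where e_t is the one-hot vector of word t. The vocabulary size, forgetting factor, and the concatenation scheme below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def fofe_encode(word_ids, vocab_size, alpha=0.7):
    """Encode a variable-length word-id sequence into a fixed-size vector:
    z_t = alpha * z_{t-1} + e_t, with e_t the one-hot vector of word t."""
    z = np.zeros(vocab_size)
    for idx in word_ids:
        z = alpha * z
        z[idx] += 1.0
    return z

# Toy usage: encode a left context and a candidate fragment separately,
# then concatenate them as input to a feedforward classifier.
left_ctx = fofe_encode([3, 17, 42], vocab_size=100)
fragment = fofe_encode([42, 7], vocab_size=100)
features = np.concatenate([left_ctx, fragment])
print(features.shape)  # (200,)
```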

DeepSurv: Personalized Treatment Recommender System Using A Cox Proportional Hazards Deep Neural Network

Title DeepSurv: Personalized Treatment Recommender System Using A Cox Proportional Hazards Deep Neural Network
Authors Jared Katzman, Uri Shaham, Jonathan Bates, Alexander Cloninger, Tingting Jiang, Yuval Kluger
Abstract Medical practitioners use survival models to explore and understand the relationships between patients’ covariates (e.g. clinical and genetic features) and the effectiveness of various treatment options. Standard survival models like the linear Cox proportional hazards model require extensive feature engineering or prior medical knowledge to model treatment interaction at an individual level. While nonlinear survival methods, such as neural networks and survival forests, can inherently model these high-level interaction terms, they have yet to be shown as effective treatment recommender systems. We introduce DeepSurv, a Cox proportional hazards deep neural network and state-of-the-art survival method for modeling interactions between a patient’s covariates and treatment effectiveness in order to provide personalized treatment recommendations. We perform a number of experiments training DeepSurv on simulated and real survival data. We demonstrate that DeepSurv performs as well as or better than other state-of-the-art survival models and validate that DeepSurv successfully models increasingly complex relationships between a patient’s covariates and their risk of failure. We then show how DeepSurv models the relationship between a patient’s features and the effectiveness of different treatment options, and how it can be used to provide individual treatment recommendations. Finally, we train DeepSurv on real clinical studies to demonstrate how its personalized treatment recommendations would increase the survival time of a set of patients. The predictive and modeling capabilities of DeepSurv will enable medical researchers to use deep neural networks as a tool in their exploration, understanding, and prediction of the effects of a patient’s characteristics on their risk of failure.
Tasks Feature Engineering, Predicting Patient Outcomes, Recommendation Systems, Survival Analysis
Published 2016-06-02
URL http://arxiv.org/abs/1606.00931v3
PDF http://arxiv.org/pdf/1606.00931v3.pdf
PWC https://paperswithcode.com/paper/deepsurv-personalized-treatment-recommender
Repo https://github.com/jaredleekatzman/DeepSurv
Framework none
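
A hedged sketch of the Cox partial-likelihood loss that a network like DeepSurv minimizes, with the network output h(x) playing the role of the log-risk. The sorting-based risk-set computation assumes no tied event times and is a common simplification, not code taken from the paper or repository.

```python
import numpy as np

def neg_cox_partial_log_likelihood(log_risk, times, events):
    """log_risk: network outputs h(x_i); times: observed durations;
    events: 1 if the event occurred, 0 if censored."""
    order = np.argsort(-times)              # sort by decreasing duration
    log_risk, events = log_risk[order], events[order]
    # cumulative log-sum-exp gives log sum_{j in risk set} exp(h_j)
    log_cum_hazard = np.logaddexp.accumulate(log_risk)
    return -np.sum((log_risk - log_cum_hazard) * events) / max(events.sum(), 1)

# Toy call with random data
rng = np.random.default_rng(0)
print(neg_cox_partial_log_likelihood(rng.normal(size=5),
                                     rng.uniform(1, 10, size=5),
                                     np.array([1, 0, 1, 1, 0])))
```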

Learning through Dialogue Interactions by Asking Questions

Title Learning through Dialogue Interactions by Asking Questions
Authors Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc’Aurelio Ranzato, Jason Weston
Abstract A good dialogue agent should have the ability to interact with users by both responding to questions and by asking questions, and importantly to learn from both types of interaction. In this work, we explore this direction by designing a simulator and a set of synthetic tasks in the movie domain that allow such interactions between a learner and a teacher. We investigate how a learner can benefit from asking questions in both offline and online reinforcement learning settings, and demonstrate that the learner improves when asking questions. Finally, real experiments with Mechanical Turk validate the approach. Our work represents a first step in developing such end-to-end learned interactive dialogue agents.
Tasks
Published 2016-12-15
URL http://arxiv.org/abs/1612.04936v4
PDF http://arxiv.org/pdf/1612.04936v4.pdf
PWC https://paperswithcode.com/paper/learning-through-dialogue-interactions-by
Repo https://github.com/aus10powell/Automated-Health-Responses
Framework tf

C-mix: a high dimensional mixture model for censored durations, with applications to genetic data

Title C-mix: a high dimensional mixture model for censored durations, with applications to genetic data
Authors Simon Bussy, Agathe Guilloux, Stéphane Gaïffas, Anne-Sophie Jannot
Abstract We introduce a mixture model for censored durations (C-mix), and develop maximum likelihood inference for the joint estimation of the time distributions and latent regression parameters of the model. We consider a high-dimensional setting, with datasets containing a large number of biomedical covariates. We therefore penalize the negative log-likelihood by the Elastic-Net, which leads to a sparse parameterization of the model. Inference is achieved using an efficient Quasi-Newton Expectation Maximization (QNEM) algorithm, for which we provide convergence properties. We then propose a score that assesses a patient’s risk of an early adverse event. The statistical performance of the method is examined in an extensive Monte Carlo simulation study, and finally illustrated on three genetic datasets with high-dimensional covariates. We show that our approach outperforms the state-of-the-art, namely both the CURE and Cox proportional hazards models for this task, in terms of both C-index and AUC(t).
Tasks
Published 2016-10-24
URL http://arxiv.org/abs/1610.07407v5
PDF http://arxiv.org/pdf/1610.07407v5.pdf
PWC https://paperswithcode.com/paper/c-mix-a-high-dimensional-mixture-model-for
Repo https://github.com/SimonBussy/C-mix
Framework none
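
A minimal illustration of the Elastic-Net penalty that yields the sparse parameterization described in the abstract; the mixture and QNEM machinery of C-mix is not reproduced here, and `lam`, `eta`, and the toy negative log-likelihood are illustrative assumptions.

```python
import numpy as np

def elastic_net_penalty(beta, lam=0.1, eta=0.5):
    """lam * (eta * ||beta||_1 + (1 - eta)/2 * ||beta||_2^2)."""
    return lam * (eta * np.abs(beta).sum() + 0.5 * (1 - eta) * np.dot(beta, beta))

def penalized_objective(neg_log_lik, beta, lam=0.1, eta=0.5):
    """Objective minimized at each (Quasi-Newton) M-step: NLL + penalty."""
    return neg_log_lik(beta) + elastic_net_penalty(beta, lam, eta)

# Toy usage with a quadratic stand-in for the negative log-likelihood
beta = np.array([0.5, -0.2, 0.0, 1.3])
print(penalized_objective(lambda b: np.dot(b, b), beta))
```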

Temporal Generative Adversarial Nets with Singular Value Clipping

Title Temporal Generative Adversarial Nets with Singular Value Clipping
Authors Masaki Saito, Eiichi Matsumoto, Shunta Saito
Abstract In this paper, we propose a generative model, Temporal Generative Adversarial Nets (TGAN), which can learn a semantic representation of unlabeled videos, and is capable of generating videos. Unlike existing Generative Adversarial Nets (GAN)-based methods that generate videos with a single generator consisting of 3D deconvolutional layers, our model exploits two different types of generators: a temporal generator and an image generator. The temporal generator takes a single latent variable as input and outputs a set of latent variables, each of which corresponds to an image frame in a video. The image generator transforms such a set of latent variables into a video. To deal with the instability of GAN training with such advanced networks, we adopt a recently proposed model, Wasserstein GAN, and propose a novel method to train it stably in an end-to-end manner. The experimental results demonstrate the effectiveness of our method.
Tasks Video Generation
Published 2016-11-21
URL http://arxiv.org/abs/1611.06624v3
PDF http://arxiv.org/pdf/1611.06624v3.pdf
PWC https://paperswithcode.com/paper/temporal-generative-adversarial-nets-with
Repo https://github.com/pfnet-research/tgan
Framework none
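
A hedged sketch of the singular value clipping referred to in the title: a weight matrix is projected so that all of its singular values are at most 1, which enforces the 1-Lipschitz constraint required by the WGAN critic. Which layers are clipped and how often are assumptions, not details from the paper.

```python
import numpy as np

def clip_singular_values(W, max_sv=1.0):
    """Project W onto the set of matrices with spectral norm <= max_sv."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.minimum(s, max_sv)) @ Vt

W = np.random.default_rng(1).normal(size=(4, 3))
W_clipped = clip_singular_values(W)
print(np.linalg.svd(W_clipped, compute_uv=False).max())  # <= 1.0
```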

A guide to convolution arithmetic for deep learning

Title A guide to convolution arithmetic for deep learning
Authors Vincent Dumoulin, Francesco Visin
Abstract We introduce a guide to help deep learning practitioners understand and manipulate convolutional neural network architectures. The guide clarifies the relationship between various properties (input shape, kernel shape, zero padding, strides and output shape) of convolutional, pooling and transposed convolutional layers, as well as the relationship between convolutional and transposed convolutional layers. Relationships are derived for various cases, and are illustrated in order to make them intuitive.
Tasks
Published 2016-03-23
URL http://arxiv.org/abs/1603.07285v2
PDF http://arxiv.org/pdf/1603.07285v2.pdf
PWC https://paperswithcode.com/paper/a-guide-to-convolution-arithmetic-for-deep
Repo https://github.com/vdumoulin/conv_arithmetic
Framework none
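
The core relationships the guide derives can be written out as two helper functions; the transposed-convolution formula below covers the case where (i + 2p - k) is a multiple of the stride s (no extra output padding).

```python
def conv_output_size(i, k, p=0, s=1):
    """Output size of a convolution: floor((i + 2p - k) / s) + 1."""
    return (i + 2 * p - k) // s + 1

def transposed_conv_output_size(i, k, p=0, s=1):
    """Output size of the associated transposed convolution: s*(i - 1) + k - 2p."""
    return s * (i - 1) + k - 2 * p

print(conv_output_size(5, k=3, p=1, s=2))              # 3
print(transposed_conv_output_size(3, k=3, p=1, s=2))   # 5
```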

ATR4S: Toolkit with State-of-the-art Automatic Terms Recognition Methods in Scala

Title ATR4S: Toolkit with State-of-the-art Automatic Terms Recognition Methods in Scala
Authors N. Astrakhantsev
Abstract Automatically recognized terminology is widely used for various domain-specific text processing tasks, such as machine translation, information retrieval or sentiment analysis. However, there is still no agreement on which methods are best suited for particular settings and, moreover, there is no reliable comparison of already developed methods. We believe that one of the main reasons is the lack of implementations of state-of-the-art methods, which are usually non-trivial to recreate. In order to address these issues, we present ATR4S, an open-source software package written in Scala that comprises more than 15 methods for automatic terminology recognition (ATR) and implements the whole pipeline from text document preprocessing to term candidate collection, scoring, and ranking. It is a highly scalable, modular and configurable tool with support for automatic caching. We also compare 10 state-of-the-art methods on 7 open datasets by average precision and processing time. Experimental comparison reveals that no single method demonstrates the best average precision for all datasets and that other available tools for ATR do not contain the best methods.
Tasks Information Retrieval, Machine Translation, Sentiment Analysis
Published 2016-11-23
URL http://arxiv.org/abs/1611.07804v1
PDF http://arxiv.org/pdf/1611.07804v1.pdf
PWC https://paperswithcode.com/paper/atr4s-toolkit-with-state-of-the-art-automatic
Repo https://github.com/ispras/atr4s
Framework none
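
A minimal sketch of the pipeline shape described in the abstract (preprocessing, candidate collection, scoring, ranking). The regex-based candidate extraction and the length-weighted frequency scorer are illustrative stand-ins, not the specific methods implemented in ATR4S.

```python
import re
from collections import Counter
from math import log2

def extract_candidates(text, max_len=3):
    """Collect n-gram term candidates (n up to max_len) with their frequencies."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tuple(tokens[i:i + n])
                   for n in range(1, max_len + 1)
                   for i in range(len(tokens) - n + 1))

def score_candidates(candidates):
    """Simplified length-weighted frequency score, in the spirit of C-value."""
    return {c: log2(len(c) + 1) * f for c, f in candidates.items()}

cands = extract_candidates("automatic term recognition extracts domain terms "
                           "from domain specific texts")
for term, score in sorted(score_candidates(cands).items(),
                          key=lambda kv: -kv[1])[:5]:
    print(" ".join(term), round(score, 2))
```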

Continuous 3D Label Stereo Matching using Local Expansion Moves

Title Continuous 3D Label Stereo Matching using Local Expansion Moves
Authors Tatsunori Taniai, Yasuyuki Matsushita, Yoichi Sato, Takeshi Naemura
Abstract We present an accurate stereo matching method using local expansion moves based on graph cuts. This new move-making scheme is used to efficiently infer per-pixel 3D plane labels on a pairwise Markov random field (MRF) that effectively combines recently proposed slanted patch matching and curvature regularization terms. The local expansion moves are presented as many alpha-expansions defined for small grid regions. The local expansion moves extend traditional expansion moves in two ways: localization and spatial propagation. By localization, we use different candidate alpha-labels according to the locations of local alpha-expansions. By spatial propagation, we design our local alpha-expansions to propagate currently assigned labels to nearby regions. With this localization and spatial propagation, our method can efficiently infer MRF models with a continuous label space using randomized search. Our method has several advantages over previous approaches that are based on fusion moves or belief propagation: it produces submodular moves deriving a subproblem optimality; it helps find good, smooth, piecewise linear disparity maps; it is suitable for parallelization; it can use cost-volume filtering techniques for accelerating the matching cost computations. Even using a simple pairwise MRF, our method is shown to have the best performance in the Middlebury stereo benchmarks V2 and V3.
Tasks Stereo Matching, Stereo Matching Hand
Published 2016-03-28
URL http://arxiv.org/abs/1603.08328v3
PDF http://arxiv.org/pdf/1603.08328v3.pdf
PWC https://paperswithcode.com/paper/continuous-3d-label-stereo-matching-using
Repo https://github.com/kbatsos/CBMV
Framework none
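
A small illustration of the continuous 3D plane labels the method optimizes over: each pixel's label (a, b, c) encodes a slanted plane whose disparity at pixel (u, v) is d = a*u + b*v + c. The expansion-move optimization itself (graph cuts over local grid regions) is not reproduced here.

```python
import numpy as np

def disparity_from_plane_labels(labels, height, width):
    """labels: (H, W, 3) array of per-pixel plane coefficients (a, b, c)."""
    v, u = np.mgrid[0:height, 0:width]
    a, b, c = labels[..., 0], labels[..., 1], labels[..., 2]
    return a * u + b * v + c

labels = np.zeros((4, 5, 3))
labels[..., 2] = 10.0          # a fronto-parallel plane at disparity 10
labels[2:, :, 0] = 0.5         # a slanted plane in the bottom rows
print(disparity_from_plane_labels(labels, 4, 5))
```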

XGBoost: A Scalable Tree Boosting System

Title XGBoost: A Scalable Tree Boosting System
Authors Tianqi Chen, Carlos Guestrin
Abstract Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.
Tasks Dimensionality Reduction
Published 2016-03-09
URL http://arxiv.org/abs/1603.02754v3
PDF http://arxiv.org/pdf/1603.02754v3.pdf
PWC https://paperswithcode.com/paper/xgboost-a-scalable-tree-boosting-system
Repo https://github.com/jlanday/X-ray-Object-Classification
Framework none
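
A short usage example of the XGBoost Python package described in the paper; the random dataset and parameter values below are arbitrary illustrations, not settings from the paper's experiments.

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X, y = rng.normal(size=(500, 10)), rng.integers(0, 2, size=500)

dtrain = xgb.DMatrix(X, label=y)            # sparsity-aware data container
params = {"objective": "binary:logistic", "max_depth": 4, "eta": 0.1}
booster = xgb.train(params, dtrain, num_boost_round=50)
preds = booster.predict(xgb.DMatrix(X))
print(preds[:5])
```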

Stability selection for component-wise gradient boosting in multiple dimensions

Title Stability selection for component-wise gradient boosting in multiple dimensions
Authors Janek Thomas, Andreas Mayr, Bernd Bischl, Matthias Schmid, Adam Smith, Benjamin Hofner
Abstract We present a new algorithm for boosting generalized additive models for location, scale and shape (GAMLSS) that allows stability selection to be incorporated, an increasingly popular way to obtain stable sets of covariates while controlling the per-family error rate (PFER). The model is fitted repeatedly to subsampled data and variables with high selection frequencies are extracted. To apply stability selection to boosted GAMLSS, we develop a new “noncyclical” fitting algorithm that incorporates an additional selection step of the best-fitting distribution parameter in each iteration. This new algorithm has the additional advantage that optimizing the tuning parameters of boosting is reduced from a multi-dimensional to a one-dimensional problem with vastly decreased complexity. The performance of the novel algorithm is evaluated in an extensive simulation study. We apply this new algorithm to a study to estimate the abundance of common eider in Massachusetts, USA, featuring excess zeros, overdispersion, non-linearity and spatio-temporal structures. Eider abundance is estimated via boosted GAMLSS, allowing both mean and overdispersion to be regressed on covariates. Stability selection is used to obtain a sparse set of stable predictors.
Tasks
Published 2016-11-30
URL http://arxiv.org/abs/1611.10171v1
PDF http://arxiv.org/pdf/1611.10171v1.pdf
PWC https://paperswithcode.com/paper/stability-selection-for-component-wise
Repo https://github.com/boost-R/gamboostLSS
Framework none
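
A generic sketch of the stability-selection idea from the abstract: refit the model on many subsamples and keep covariates whose selection frequency exceeds a threshold. A lasso selector stands in for the boosted GAMLSS fit, so this illustrates the principle rather than the gamboostLSS implementation.

```python
import numpy as np
from sklearn.linear_model import Lasso

def stability_selection(X, y, n_subsamples=100, threshold=0.6, alpha=0.1, seed=0):
    """Return indices of covariates selected in >= threshold of subsample fits."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)   # subsample half the data
        coef = Lasso(alpha=alpha).fit(X[idx], y[idx]).coef_
        counts += coef != 0
    freq = counts / n_subsamples
    return np.where(freq >= threshold)[0], freq

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 20))
y = X[:, 0] - 2 * X[:, 3] + rng.normal(size=200)
stable, freq = stability_selection(X, y)
print(stable)        # ideally the informative covariates {0, 3}
```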

Error bounds for approximations with deep ReLU networks

Title Error bounds for approximations with deep ReLU networks
Authors Dmitry Yarotsky
Abstract We study the expressive power of shallow and deep neural networks with piece-wise linear activation functions. We establish new rigorous upper and lower bounds for the network complexity in the setting of approximations in Sobolev spaces. In particular, we prove that deep ReLU networks approximate smooth functions more efficiently than shallow networks. In the case of approximations of 1D Lipschitz functions, we describe adaptive depth-6 network architectures that are more efficient than the standard shallow architecture.
Tasks
Published 2016-10-03
URL http://arxiv.org/abs/1610.01145v3
PDF http://arxiv.org/pdf/1610.01145v3.pdf
PWC https://paperswithcode.com/paper/error-bounds-for-approximations-with-deep
Repo https://github.com/arsenal9971/TUB_MoDL
Framework tf
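
A hedged sketch of the kind of ReLU construction such error bounds rest on: a "tooth" function (itself a three-ReLU network) is composed with itself, and f_m(x) = x - sum_s g^(s)(x) / 4^s approximates x^2 on [0, 1] with error shrinking exponentially in the depth m. The constants follow the standard piecewise-linear interpolation argument and are not quoted from the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def tooth(x):
    # g(x) = 2x on [0, 1/2] and 2(1 - x) on [1/2, 1], expressed with ReLUs
    return 2 * relu(x) - 4 * relu(x - 0.5) + 2 * relu(x - 1.0)

def approx_square(x, m=4):
    """Depth-m ReLU approximation of x**2 on [0, 1]."""
    out, g = x.copy(), x.copy()
    for s in range(1, m + 1):
        g = tooth(g)
        out -= g / 4.0 ** s
    return out

x = np.linspace(0, 1, 1001)
print(np.max(np.abs(approx_square(x, m=4) - x ** 2)))  # roughly 2 ** (-2*4 - 2)
```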

Can Peripheral Representations Improve Clutter Metrics on Complex Scenes?

Title Can Peripheral Representations Improve Clutter Metrics on Complex Scenes?
Authors Arturo Deza, Miguel P. Eckstein
Abstract Previous studies have proposed image-based clutter measures that correlate with human search times and/or eye movements. However, most models do not take into account the fact that the effects of clutter interact with the foveated nature of the human visual system: visual clutter further from the fovea has an increasingly detrimental influence on perception. Here, we introduce a new foveated clutter model to predict the detrimental effects in target search utilizing a forced fixation search task. We use Feature Congestion (Rosenholtz et al.) as our non-foveated clutter model, and we stack a peripheral architecture on top of Feature Congestion for our foveated model. We introduce the Peripheral Integration Feature Congestion (PIFC) coefficient as a fundamental ingredient of our model that modulates clutter as a non-linear gain contingent on eccentricity. We finally show that Foveated Feature Congestion (FFC) clutter scores (r(44) = -0.82) correlate better with target detection (hit rate) than regular Feature Congestion (r(44) = -0.19) in forced fixation search. Thus, our model allows us to enrich clutter perception research by computing fixation-specific clutter maps. A toolbox for creating peripheral architectures, Piranhas (Peripheral Architectures for Natural, Hybrid and Artificial Systems), will be made available.
Tasks
Published 2016-08-14
URL http://arxiv.org/abs/1608.04042v1
PDF http://arxiv.org/pdf/1608.04042v1.pdf
PWC https://paperswithcode.com/paper/can-peripheral-representations-improve
Repo https://github.com/ArturoDeza/Piranhas
Framework torch

TextBoxes: A Fast Text Detector with a Single Deep Neural Network

Title TextBoxes: A Fast Text Detector with a Single Deep Neural Network
Authors Minghui Liao, Baoguang Shi, Xiang Bai, Xinggang Wang, Wenyu Liu
Abstract This paper presents an end-to-end trainable fast scene text detector, named TextBoxes, which detects scene text with both high accuracy and efficiency in a single network forward pass, involving no post-process except for a standard non-maximum suppression. TextBoxes outperforms competing methods in terms of text localization accuracy and is much faster, taking only 0.09s per image in a fast implementation. Furthermore, combined with a text recognizer, TextBoxes significantly outperforms state-of-the-art approaches on word spotting and end-to-end text recognition tasks.
Tasks
Published 2016-11-21
URL http://arxiv.org/abs/1611.06779v1
PDF http://arxiv.org/pdf/1611.06779v1.pdf
PWC https://paperswithcode.com/paper/textboxes-a-fast-text-detector-with-a-single
Repo https://github.com/MhLiao/TextBoxes
Framework none
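
The abstract notes that the only post-processing step is standard non-maximum suppression; below is a plain NumPy NMS for axis-aligned boxes as a reference sketch. TextBoxes itself uses elongated default boxes and a trained single-shot detector, which are not reproduced here.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.45):
    """boxes: (N, 4) array of [x1, y1, x2, y2]; returns indices of boxes kept."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # intersection of the top box with the remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(xx2 - xx1, 0) * np.maximum(yy2 - yy1, 0)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_threshold]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], float)
print(nms(boxes, np.array([0.9, 0.8, 0.7])))  # [0, 2]
```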

Unsupervised, Efficient and Semantic Expertise Retrieval

Title Unsupervised, Efficient and Semantic Expertise Retrieval
Authors Christophe Van Gysel, Maarten de Rijke, Marcel Worring
Abstract We introduce an unsupervised discriminative model for the task of retrieving experts in online document collections. We exclusively employ textual evidence and avoid explicit feature engineering by learning distributed word representations in an unsupervised way. We compare our model to state-of-the-art unsupervised statistical vector space and probabilistic generative approaches. Our proposed log-linear model achieves the retrieval performance levels of state-of-the-art document-centric methods with the low inference cost of so-called profile-centric approaches. It yields a statistically significant improved ranking over vector space and generative models in most cases, matching the performance of supervised methods on various benchmarks. That is, by using solely text we can do as well as methods that work with external evidence and/or relevance feedback. A contrastive analysis of rankings produced by discriminative and generative approaches shows that they have complementary strengths due to the ability of the unsupervised discriminative model to perform semantic matching.
Tasks Feature Engineering
Published 2016-08-23
URL http://arxiv.org/abs/1608.06651v2
PDF http://arxiv.org/pdf/1608.06651v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-efficient-and-semantic-expertise
Repo https://github.com/cvangysel/SERT
Framework none

ImageNet pre-trained models with batch normalization

Title ImageNet pre-trained models with batch normalization
Authors Marcel Simon, Erik Rodner, Joachim Denzler
Abstract Convolutional neural networks (CNNs) pre-trained on ImageNet are the backbone of most state-of-the-art approaches. In this paper, we present a new set of pre-trained models with popular state-of-the-art architectures for the Caffe framework. The first release includes Residual Networks (ResNets) with a generation script as well as batch-normalization variants of AlexNet and VGG19. All models outperform previous models with the same architecture. The models and training code are available at http://www.inf-cv.uni-jena.de/Research/CNN+Models.html and https://github.com/cvjena/cnn-models
Tasks
Published 2016-12-05
URL http://arxiv.org/abs/1612.01452v2
PDF http://arxiv.org/pdf/1612.01452v2.pdf
PWC https://paperswithcode.com/paper/imagenet-pre-trained-models-with-batch
Repo https://github.com/fdac18/ForensicImages
Framework none
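
A minimal sketch of the batch-normalization transform used in the released AlexNet/VGG19 variants: normalize each channel over the batch, then apply a learned scale gamma and shift beta. Only the training-time statistics are shown; the running-average inference path is omitted.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """x: (N, C) activations; gamma, beta: (C,) learned parameters."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.default_rng(0).normal(loc=3.0, scale=2.0, size=(8, 4))
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0).round(6), y.std(axis=0).round(3))  # ~0 mean, ~1 std
```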