May 7, 2019


Paper Group AWR 75


A FOFE-based Local Detection Approach for Named Entity Recognition and Mention Detection


Title	A FOFE-based Local Detection Approach for Named Entity Recognition and Mention Detection
Authors	Mingbin Xu, Hui Jiang
Abstract	In this paper, we study a novel approach for named entity recognition (NER) and mention detection in natural language processing. Instead of treating NER as a sequence labelling problem, we propose a new local detection approach, which relies on the recent fixed-size ordinally forgetting encoding (FOFE) method to fully encode each sentence fragment and its left/right contexts into a fixed-size representation. Afterwards, a simple feedforward neural network is used to reject or predict an entity label for each individual fragment. The proposed method has been evaluated in several popular NER and mention detection tasks, including the CoNLL 2003 NER task and TAC-KBP2015 and TAC-KBP2016 Tri-lingual Entity Discovery and Linking (EDL) tasks. Our methods have yielded pretty strong performance in all of these examined tasks. This local detection approach has shown many advantages over the traditional sequence labelling methods.
Tasks	Named Entity Recognition
Published	2016-11-02
URL	http://arxiv.org/abs/1611.00801v1
PDF	http://arxiv.org/pdf/1611.00801v1.pdf
PWC	https://paperswithcode.com/paper/a-fofe-based-local-detection-approach-for
Repo	https://github.com/xmb-cipher/fofe-ner
Framework	tf
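
The FOFE encoding that the abstract relies on can be sketched in a few lines. This is a minimal illustration, assuming the usual recursive definition z_t = alpha * z_(t-1) + e_t with one-hot word vectors e_t; the function name `fofe_encode`, the toy vocabulary and the forgetting factor 0.5 are illustrative choices and are not taken from the authors' repository.

```python
def fofe_encode(token_ids, vocab_size, alpha=0.5):
    """Encode a variable-length token sequence as one fixed-size vector.

    Each step decays the running code by `alpha` (the forgetting factor)
    and then adds the one-hot vector of the current token.
    """
    z = [0.0] * vocab_size
    for t in token_ids:
        z = [alpha * v for v in z]  # forget older tokens a little
        z[t] += 1.0                 # add the one-hot vector of token t
    return z


# Hypothetical three-word vocabulary, used only for this example.
vocab = {"A": 0, "B": 1, "C": 2}
code_abc = fofe_encode([vocab[w] for w in "ABC"], len(vocab))
code_cba = fofe_encode([vocab[w] for w in "CBA"], len(vocab))
print(code_abc)  # [0.25, 0.5, 1.0]
print(code_cba)  # [1.0, 0.5, 0.25]
```

The two sequences contain the same words but get different codes, which is what lets a feedforward network see word order in a fixed-size input.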

DeepSurv: Personalized Treatment Recommender System Using A Cox Proportional Hazards Deep Neural Network


Title	DeepSurv: Personalized Treatment Recommender System Using A Cox Proportional Hazards Deep Neural Network
Authors	Jared Katzman, Uri Shaham, Jonathan Bates, Alexander Cloninger, Tingting Jiang, Yuval Kluger
Abstract	Medical practitioners use survival models to explore and understand the relationships between patients’ covariates (e.g. clinical and genetic features) and the effectiveness of various treatment options. Standard survival models like the linear Cox proportional hazards model require extensive feature engineering or prior medical knowledge to model treatment interaction at an individual level. While nonlinear survival methods, such as neural networks and survival forests, can inherently model these high-level interaction terms, they have yet to be shown as effective treatment recommender systems. We introduce DeepSurv, a Cox proportional hazards deep neural network and state-of-the-art survival method for modeling interactions between a patient’s covariates and treatment effectiveness in order to provide personalized treatment recommendations. We perform a number of experiments training DeepSurv on simulated and real survival data. We demonstrate that DeepSurv performs as well as or better than other state-of-the-art survival models and validate that DeepSurv successfully models increasingly complex relationships between a patient’s covariates and their risk of failure. We then show how DeepSurv models the relationship between a patient’s features and effectiveness of different treatment options to show how DeepSurv can be used to provide individual treatment recommendations. Finally, we train DeepSurv on real clinical studies to demonstrate how its personalized treatment recommendations would increase the survival time of a set of patients. The predictive and modeling capabilities of DeepSurv will enable medical researchers to use deep neural networks as a tool in their exploration, understanding, and prediction of the effects of a patient’s characteristics on their risk of failure.
Tasks	Feature Engineering, Predicting Patient Outcomes, Recommendation Systems, Survival Analysis
Published	2016-06-02
URL	http://arxiv.org/abs/1606.00931v3
PDF	http://arxiv.org/pdf/1606.00931v3.pdf
PWC	https://paperswithcode.com/paper/deepsurv-personalized-treatment-recommender
Repo	https://github.com/jaredleekatzman/DeepSurv
Framework	none
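
A Cox proportional hazards network is typically trained by minimizing the negative log partial likelihood of its predicted log-risk scores. The sketch below shows that loss in plain Python. It assumes no tied event times and omits any regularization term; the function name and the toy data are hypothetical and are not taken from the DeepSurv repository.

```python
import math


def neg_log_partial_likelihood(risk, time, event):
    """Average negative Cox log partial likelihood.

    risk  : predicted log-risk scores, e.g. the outputs of a network
    time  : observed times (event time or censoring time)
    event : 1 if the event was observed, 0 if the record is censored
    """
    total, n_events = 0.0, 0
    for i in range(len(risk)):
        if not event[i]:
            continue  # censored records only appear inside risk sets
        # Risk set: everyone still under observation at time[i].
        log_sum = math.log(sum(math.exp(risk[j])
                               for j in range(len(risk))
                               if time[j] >= time[i]))
        total += risk[i] - log_sum
        n_events += 1
    if n_events == 0:
        raise ValueError("at least one observed event is required")
    return -total / n_events


# Toy data: the third patient is censored.
time = [1.0, 2.0, 3.0]
event = [1, 1, 0]
loss_good = neg_log_partial_likelihood([2.0, 1.0, 0.0], time, event)
loss_bad = neg_log_partial_likelihood([0.0, 1.0, 2.0], time, event)
```

Scores that rank early failures as high risk (`loss_good`) give a lower loss than scores that rank them as low risk (`loss_bad`).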

Learning through Dialogue Interactions by Asking Questions


Title	Learning through Dialogue Interactions by Asking Questions
Authors	Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc’Aurelio Ranzato, Jason Weston
Abstract	A good dialogue agent should have the ability to interact with users by both responding to questions and by asking questions, and importantly to learn from both types of interaction. In this work, we explore this direction by designing a simulator and a set of synthetic tasks in the movie domain that allow such interactions between a learner and a teacher. We investigate how a learner can benefit from asking questions in both offline and online reinforcement learning settings, and demonstrate that the learner improves when asking questions. Finally, real experiments with Mechanical Turk validate the approach. Our work represents a first step in developing such end-to-end learned interactive dialogue agents.
Tasks
Published	2016-12-15
URL	http://arxiv.org/abs/1612.04936v4
PDF	http://arxiv.org/pdf/1612.04936v4.pdf
PWC	https://paperswithcode.com/paper/learning-through-dialogue-interactions-by
Repo	https://github.com/aus10powell/Automated-Health-Responses
Framework	tf

C-mix: a high dimensional mixture model for censored durations, with applications to genetic data


Title	C-mix: a high dimensional mixture model for censored durations, with applications to genetic data
Authors	Simon Bussy, Agathe Guilloux, Stéphane Gaïffas, Anne-Sophie Jannot
Abstract	We introduce a mixture model for censored durations (C-mix), and develop maximum likelihood inference for the joint estimation of the time distributions and latent regression parameters of the model. We consider a high-dimensional setting, with datasets containing a large number of biomedical covariates. We therefore penalize the negative log-likelihood by the Elastic-Net, which leads to a sparse parameterization of the model. Inference is achieved using an efficient Quasi-Newton Expectation Maximization (QNEM) algorithm, for which we provide convergence properties. We then propose a score by assessing the patient's risk of an early adverse event. The statistical performance of the method is examined on an extensive Monte Carlo simulation study, and finally illustrated on three genetic datasets with high-dimensional covariates. We show that our approach outperforms the state-of-the-art, namely both the CURE and Cox proportional hazards models for this task, both in terms of C-index and AUC(t).
Tasks
Published	2016-10-24
URL	http://arxiv.org/abs/1610.07407v5
PDF	http://arxiv.org/pdf/1610.07407v5.pdf
PWC	https://paperswithcode.com/paper/c-mix-a-high-dimensional-mixture-model-for
Repo	https://github.com/SimonBussy/C-mix
Framework	none
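
The abstract reports results in terms of the C-index. As a rough illustration of what that metric measures, here is a sketch of Harrell's concordance index for right-censored data. It assumes the common pairwise definition with ties in the score counted as one half; the paper may use a different variant, and the function name and toy data are hypothetical.

```python
def concordance_index(risk, time, event):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when i has an observed event and
    time[i] < time[j]; it is concordant when i has the higher risk score.
    """
    concordant, comparable = 0.0, 0
    n = len(risk)
    for i in range(n):
        if not event[i]:
            continue  # a censored record cannot be the earlier failure
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: the third record is censored.
time = [1.0, 2.0, 3.0, 4.0]
event = [1, 1, 0, 1]
c_perfect = concordance_index([4.0, 3.0, 2.0, 1.0], time, event)
c_reversed = concordance_index([1.0, 2.0, 3.0, 4.0], time, event)
c_constant = concordance_index([0.0, 0.0, 0.0, 0.0], time, event)
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 corresponds to an uninformative score, and 0.0 to a fully reversed ranking.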

Temporal Generative Adversarial Nets with Singular Value Clipping


Title	Temporal Generative Adversarial Nets with Singular Value Clipping
Authors	Masaki Saito, Eiichi Matsumoto, Shunta Saito
Abstract	In this paper, we propose a generative model, Temporal Generative Adversarial Nets (TGAN), which can learn a semantic representation of unlabeled videos, and is capable of generating videos. Unlike existing Generative Adversarial Nets (GAN)-based methods that generate videos with a single generator consisting of 3D deconvolutional layers, our model exploits two different types of generators: a temporal generator and an image generator. The temporal generator takes a single latent variable as input and outputs a set of latent variables, each of which corresponds to an image frame in a video. The image generator transforms a set of such latent variables into a video. To deal with instability in training of GAN with such advanced networks, we adopt a recently proposed model, Wasserstein GAN, and propose a novel method to train it stably in an end-to-end manner. The experimental results demonstrate the effectiveness of our methods.
Tasks	Video Generation
Published	2016-11-21
URL	http://arxiv.org/abs/1611.06624v3
PDF	http://arxiv.org/pdf/1611.06624v3.pdf
PWC	https://paperswithcode.com/paper/temporal-generative-adversarial-nets-with
Repo	https://github.com/pfnet-research/tgan
Framework	none

A guide to convolution arithmetic for deep learning


Title	A guide to convolution arithmetic for deep learning
Authors	Vincent Dumoulin, Francesco Visin
Abstract	We introduce a guide to help deep learning practitioners understand and manipulate convolutional neural network architectures. The guide clarifies the relationship between various properties (input shape, kernel shape, zero padding, strides and output shape) of convolutional, pooling and transposed convolutional layers, as well as the relationship between convolutional and transposed convolutional layers. Relationships are derived for various cases, and are illustrated in order to make them intuitive.
Tasks
Published	2016-03-23
URL	http://arxiv.org/abs/1603.07285v2
PDF	http://arxiv.org/pdf/1603.07285v2.pdf
PWC	https://paperswithcode.com/paper/a-guide-to-convolution-arithmetic-for-deep
Repo	https://github.com/vdumoulin/conv_arithmetic
Framework	none
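
The relationships between input size, kernel size, padding, stride and output size that the guide covers can be written as two small functions. This is a sketch for a single spatial dimension using the standard formulas; the function names and the extra-size parameter `a` are naming choices made here for illustration.

```python
def conv_output_size(i, k, s=1, p=0):
    """Output size of a convolution along one axis.

    i: input size, k: kernel size, s: stride, p: zero padding per side.
    """
    return (i + 2 * p - k) // s + 1


def transposed_conv_output_size(i, k, s=1, p=0, a=0):
    """Output size of the transposed convolution associated with (k, s, p).

    `a` restores the (input + 2p - k) mod s positions that the forward
    convolution dropped when its stride did not divide evenly.
    """
    return s * (i - 1) + a + k - 2 * p


# A 5-wide input with a 3-wide kernel, stride 2 and padding 1 gives 3 outputs,
# and the matching transposed convolution maps 3 back to 5.
down = conv_output_size(5, k=3, s=2, p=1)
up = transposed_conv_output_size(down, k=3, s=2, p=1)
print(down, up)  # 3 5
```

When the stride does not divide evenly, for example a 6-wide input with the same kernel, stride and padding, the forward pass also yields 3 outputs, and `a=1` is needed to recover 6.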

ATR4S: Toolkit with State-of-the-art Automatic Terms Recognition Methods in Scala


Title	ATR4S: Toolkit with State-of-the-art Automatic Terms Recognition Methods in Scala
Authors	N. Astrakhantsev
Abstract	Automatically recognized terminology is widely used for various domain-specific text processing tasks, such as machine translation, information retrieval or sentiment analysis. However, there is still no agreement on which methods are best suited for particular settings and, moreover, there is no reliable comparison of already developed methods. We believe that one of the main reasons is the lack of implementations of state-of-the-art methods, which are usually non-trivial to recreate. In order to address these issues, we present ATR4S, open-source software written in Scala that comprises more than 15 methods for automatic terminology recognition (ATR) and implements the whole pipeline from text document preprocessing, to term candidates collection, term candidates scoring, and finally, term candidates ranking. It is a highly scalable, modular and configurable tool with support of automatic caching. We also compare 10 state-of-the-art methods on 7 open datasets by average precision and processing time. Experimental comparison reveals that no single method demonstrates best average precision for all datasets and that other available tools for ATR do not contain the best methods.
Tasks	Information Retrieval, Machine Translation, Sentiment Analysis
Published	2016-11-23
URL	http://arxiv.org/abs/1611.07804v1
PDF	http://arxiv.org/pdf/1611.07804v1.pdf
PWC	https://paperswithcode.com/paper/atr4s-toolkit-with-state-of-the-art-automatic
Repo	https://github.com/ispras/atr4s
Framework	none

Continuous 3D Label Stereo Matching using Local Expansion Moves


Title	Continuous 3D Label Stereo Matching using Local Expansion Moves
Authors	Tatsunori Taniai, Yasuyuki Matsushita, Yoichi Sato, Takeshi Naemura
Abstract	We present an accurate stereo matching method using local expansion moves based on graph cuts. This new move-making scheme is used to efficiently infer per-pixel 3D plane labels on a pairwise Markov random field (MRF) that effectively combines recently proposed slanted patch matching and curvature regularization terms. The local expansion moves are presented as many alpha-expansions defined for small grid regions. The local expansion moves extend traditional expansion moves in two ways: localization and spatial propagation. By localization, we use different candidate alpha-labels according to the locations of local alpha-expansions. By spatial propagation, we design our local alpha-expansions to propagate currently assigned labels for nearby regions. With this localization and spatial propagation, our method can efficiently infer MRF models with a continuous label space using randomized search. Our method has several advantages over previous approaches that are based on fusion moves or belief propagation; it produces submodular moves deriving a subproblem optimality; it helps find good, smooth, piecewise linear disparity maps; it is suitable for parallelization; it can use cost-volume filtering techniques for accelerating the matching cost computations. Even using a simple pairwise MRF, our method is shown to have the best performance in the Middlebury stereo benchmark V2 and V3.
Tasks	Stereo Matching, Stereo Matching Hand
Published	2016-03-28
URL	http://arxiv.org/abs/1603.08328v3
PDF	http://arxiv.org/pdf/1603.08328v3.pdf
PWC	https://paperswithcode.com/paper/continuous-3d-label-stereo-matching-using
Repo	https://github.com/kbatsos/CBMV
Framework	none

XGBoost: A Scalable Tree Boosting System


Title	XGBoost: A Scalable Tree Boosting System
Authors	Tianqi Chen, Carlos Guestrin
Abstract	Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.
Tasks	Dimensionality Reduction
Published	2016-03-09
URL	http://arxiv.org/abs/1603.02754v3
PDF	http://arxiv.org/pdf/1603.02754v3.pdf
PWC	https://paperswithcode.com/paper/xgboost-a-scalable-tree-boosting-system
Repo	https://github.com/jlanday/X-ray-Object-Classification
Framework	none

Stability selection for component-wise gradient boosting in multiple dimensions


Title	Stability selection for component-wise gradient boosting in multiple dimensions
Authors	Janek Thomas, Andreas Mayr, Bernd Bischl, Matthias Schmid, Adam Smith, Benjamin Hofner
Abstract	We present a new algorithm for boosting generalized additive models for location, scale and shape (GAMLSS) that allows incorporating stability selection, an increasingly popular way to obtain stable sets of covariates while controlling the per-family error rate (PFER). The model is fitted repeatedly to subsampled data and variables with high selection frequencies are extracted. To apply stability selection to boosted GAMLSS, we develop a new “noncyclical” fitting algorithm that incorporates an additional selection step of the best-fitting distribution parameter in each iteration. This new algorithm has the additional advantage that optimizing the tuning parameters of boosting is reduced from a multi-dimensional to a one-dimensional problem with vastly decreased complexity. The performance of the novel algorithm is evaluated in an extensive simulation study. We apply this new algorithm to a study to estimate abundance of common eider in Massachusetts, USA, featuring excess zeros, overdispersion, non-linearity and spatio-temporal structures. Eider abundance is estimated via boosted GAMLSS, allowing both mean and overdispersion to be regressed on covariates. Stability selection is used to obtain a sparse set of stable predictors.
Tasks
Published	2016-11-30
URL	http://arxiv.org/abs/1611.10171v1
PDF	http://arxiv.org/pdf/1611.10171v1.pdf
PWC	https://paperswithcode.com/paper/stability-selection-for-component-wise
Repo	https://github.com/boost-R/gamboostLSS
Framework	none
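
The core loop of stability selection described in the abstract (fit repeatedly on subsampled data, then keep variables with high selection frequencies) can be sketched as follows. This is only an illustration: the base learner here is a simple "top q features by absolute correlation" rule standing in for boosted GAMLSS, the PFER bound is not computed, and all names, the threshold 0.6 and the synthetic data are assumptions made for the example.

```python
import random


def abs_corr(xs, ys):
    """Absolute Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return abs(sxy) / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def top_q_selector(X, y, q):
    """Stand-in base learner: indices of the q most correlated features."""
    p = len(X[0])
    scores = [abs_corr([row[j] for row in X], y) for j in range(p)]
    return sorted(range(p), key=lambda j: scores[j], reverse=True)[:q]


def stability_selection(X, y, q=1, n_subsamples=50, threshold=0.6, seed=0):
    """Return per-feature selection frequencies and the stable set."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)  # subsample half of the rows
        chosen = top_q_selector([X[i] for i in idx], [y[i] for i in idx], q)
        for j in chosen:
            counts[j] += 1
    freqs = [c / n_subsamples for c in counts]
    stable = [j for j, f in enumerate(freqs) if f >= threshold]
    return freqs, stable


# Synthetic data: only feature 0 carries signal.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(60)]
y = [3.0 * row[0] + data_rng.gauss(0, 0.1) for row in X]
freqs, stable = stability_selection(X, y)
```

Features that are picked in most subsamples end up in `stable`; features picked only occasionally are treated as unstable and dropped.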

Error bounds for approximations with deep ReLU networks


Title	Error bounds for approximations with deep ReLU networks
Authors	Dmitry Yarotsky
Abstract	We study the expressive power of shallow and deep neural networks with piece-wise linear activation functions. We establish new rigorous upper and lower bounds for the network complexity in the setting of approximations in Sobolev spaces. In particular, we prove that deep ReLU networks more efficiently approximate smooth functions than shallow networks. In the case of approximations of 1D Lipschitz functions we describe adaptive depth-6 network architectures more efficient than the standard shallow architecture.
Tasks
Published	2016-10-03
URL	http://arxiv.org/abs/1610.01145v3
PDF	http://arxiv.org/pdf/1610.01145v3.pdf
PWC	https://paperswithcode.com/paper/error-bounds-for-approximations-with-deep
Repo	https://github.com/arsenal9971/TUB_MoDL
Framework	tf

Can Peripheral Representations Improve Clutter Metrics on Complex Scenes?


Title	Can Peripheral Representations Improve Clutter Metrics on Complex Scenes?
Authors	Arturo Deza, Miguel P. Eckstein
Abstract	Previous studies have proposed image-based clutter measures that correlate with human search times and/or eye movements. However, most models do not take into account the fact that the effects of clutter interact with the foveated nature of the human visual system: visual clutter further from the fovea has an increasing detrimental influence on perception. Here, we introduce a new foveated clutter model to predict the detrimental effects in target search utilizing a forced fixation search task. We use Feature Congestion (Rosenholtz et al.) as our non-foveated clutter model, and we stack a peripheral architecture on top of Feature Congestion for our foveated model. We introduce the Peripheral Integration Feature Congestion (PIFC) coefficient, as a fundamental ingredient of our model that modulates clutter as a non-linear gain contingent on eccentricity. We finally show that Foveated Feature Congestion (FFC) clutter scores r(44) = -0.82 correlate better with target detection (hit rate) than regular Feature Congestion r(44) = -0.19 in forced fixation search. Thus, our model allows us to enrich clutter perception research by computing fixation-specific clutter maps. A toolbox for creating peripheral architectures: Piranhas: Peripheral Architectures for Natural, Hybrid and Artificial Systems will be made available.
Tasks
Published	2016-08-14
URL	http://arxiv.org/abs/1608.04042v1
PDF	http://arxiv.org/pdf/1608.04042v1.pdf
PWC	https://paperswithcode.com/paper/can-peripheral-representations-improve
Repo	https://github.com/ArturoDeza/Piranhas
Framework	torch

TextBoxes: A Fast Text Detector with a Single Deep Neural Network


Title	TextBoxes: A Fast Text Detector with a Single Deep Neural Network
Authors	Minghui Liao, Baoguang Shi, Xiang Bai, Xinggang Wang, Wenyu Liu
Abstract	This paper presents an end-to-end trainable fast scene text detector, named TextBoxes, which detects scene text with both high accuracy and efficiency in a single network forward pass, involving no post-process except for a standard non-maximum suppression. TextBoxes outperforms competing methods in terms of text localization accuracy and is much faster, taking only 0.09s per image in a fast implementation. Furthermore, combined with a text recognizer, TextBoxes significantly outperforms state-of-the-art approaches on word spotting and end-to-end text recognition tasks.
Tasks
Published	2016-11-21
URL	http://arxiv.org/abs/1611.06779v1
PDF	http://arxiv.org/pdf/1611.06779v1.pdf
PWC	https://paperswithcode.com/paper/textboxes-a-fast-text-detector-with-a-single
Repo	https://github.com/MhLiao/TextBoxes
Framework	none
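
The only post-processing step the abstract mentions is standard non-maximum suppression. A minimal sketch of the usual greedy procedure is given below, assuming axis-aligned boxes in (x1, y1, x2, y2) form and an IoU threshold of 0.5; the function names, threshold and example boxes are illustrative and are not taken from the TextBoxes code.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap any already kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping detections and one separate detection.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap anything and is kept.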

Unsupervised, Efficient and Semantic Expertise Retrieval


Title	Unsupervised, Efficient and Semantic Expertise Retrieval
Authors	Christophe Van Gysel, Maarten de Rijke, Marcel Worring
Abstract	We introduce an unsupervised discriminative model for the task of retrieving experts in online document collections. We exclusively employ textual evidence and avoid explicit feature engineering by learning distributed word representations in an unsupervised way. We compare our model to state-of-the-art unsupervised statistical vector space and probabilistic generative approaches. Our proposed log-linear model achieves the retrieval performance levels of state-of-the-art document-centric methods with the low inference cost of so-called profile-centric approaches. It yields a statistically significant improved ranking over vector space and generative models in most cases, matching the performance of supervised methods on various benchmarks. That is, by using solely text we can do as well as methods that work with external evidence and/or relevance feedback. A contrastive analysis of rankings produced by discriminative and generative approaches shows that they have complementary strengths due to the ability of the unsupervised discriminative model to perform semantic matching.
Tasks	Feature Engineering
Published	2016-08-23
URL	http://arxiv.org/abs/1608.06651v2
PDF	http://arxiv.org/pdf/1608.06651v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-efficient-and-semantic-expertise
Repo	https://github.com/cvangysel/SERT
Framework	none

ImageNet pre-trained models with batch normalization


Title	ImageNet pre-trained models with batch normalization
Authors	Marcel Simon, Erik Rodner, Joachim Denzler
Abstract	Convolutional neural networks (CNN) pre-trained on ImageNet are the backbone of most state-of-the-art approaches. In this paper, we present a new set of pre-trained models with popular state-of-the-art architectures for the Caffe framework. The first release includes Residual Networks (ResNets) with generation script as well as the batch-normalization-variants of AlexNet and VGG19. All models outperform previous models with the same architecture. The models and training code are available at http://www.inf-cv.uni-jena.de/Research/CNN+Models.html and https://github.com/cvjena/cnn-models
Tasks
Published	2016-12-05
URL	http://arxiv.org/abs/1612.01452v2
PDF	http://arxiv.org/pdf/1612.01452v2.pdf
PWC	https://paperswithcode.com/paper/imagenet-pre-trained-models-with-batch
Repo	https://github.com/fdac18/ForensicImages
Framework	none