Paper Group AWR 75
A FOFE-based Local Detection Approach for Named Entity Recognition and Mention Detection
Title | A FOFE-based Local Detection Approach for Named Entity Recognition and Mention Detection |
Authors | Mingbin Xu, Hui Jiang |
Abstract | In this paper, we study a novel approach for named entity recognition (NER) and mention detection in natural language processing. Instead of treating NER as a sequence labelling problem, we propose a new local detection approach, which relies on the recent fixed-size ordinally forgetting encoding (FOFE) method to fully encode each sentence fragment and its left/right contexts into a fixed-size representation. Afterwards, a simple feedforward neural network is used to reject or predict an entity label for each individual fragment. The proposed method has been evaluated on several popular NER and mention detection tasks, including the CoNLL 2003 NER task and the TAC-KBP2015 and TAC-KBP2016 Tri-lingual Entity Discovery and Linking (EDL) tasks. Our methods have yielded strong performance in all of these examined tasks. This local detection approach has shown many advantages over the traditional sequence labelling methods. |
Tasks | Named Entity Recognition |
Published | 2016-11-02 |
URL | http://arxiv.org/abs/1611.00801v1 |
PDF | http://arxiv.org/pdf/1611.00801v1.pdf |
PWC | https://paperswithcode.com/paper/a-fofe-based-local-detection-approach-for |
Repo | https://github.com/xmb-cipher/fofe-ner |
Framework | tf |
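The FOFE encoding mentioned in the abstract above can be illustrated with a short sketch. It assumes the standard FOFE recursion z_t = alpha * z_{t-1} + e_t, where e_t is the one-hot vector of the t-th token and alpha is a forgetting factor between 0 and 1. The function names, the default alpha, and the way the three parts are concatenated in `fragment_features` are illustrative assumptions, not the layout used in the linked repository.

```python
def fofe_encode(token_ids, vocab_size, alpha=0.5):
    """Return the FOFE code z_T, where z_t = alpha * z_{t-1} + one_hot(token_t)."""
    z = [0.0] * vocab_size
    for token in token_ids:
        z = [alpha * value for value in z]  # decay everything seen so far
        z[token] += 1.0                     # add the one-hot vector of the current token
    return z


def fragment_features(token_ids, start, end, vocab_size, alpha=0.5):
    """Hypothetical feature layout: left context + fragment + right context.

    The right context is read right-to-left so that the token closest to the
    fragment receives the largest weight, mirroring the left context.
    """
    left = fofe_encode(token_ids[:start], vocab_size, alpha)
    fragment = fofe_encode(token_ids[start:end], vocab_size, alpha)
    right = fofe_encode(token_ids[end:][::-1], vocab_size, alpha)
    return left + fragment + right  # fixed size: 3 * vocab_size


# Vocabulary {A: 0, B: 1, C: 2}; the sequence "ABCBC" is [0, 1, 2, 1, 2].
code = fofe_encode([0, 1, 2, 1, 2], vocab_size=3, alpha=0.5)
features = fragment_features([0, 1, 2, 1, 2], start=1, end=3, vocab_size=3)
```

Whatever the length of the fragment or its contexts, the feature vector has the same size, which is what lets a plain feedforward network score each candidate fragment independently.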
DeepSurv: Personalized Treatment Recommender System Using A Cox Proportional Hazards Deep Neural Network
Title | DeepSurv: Personalized Treatment Recommender System Using A Cox Proportional Hazards Deep Neural Network |
Authors | Jared Katzman, Uri Shaham, Jonathan Bates, Alexander Cloninger, Tingting Jiang, Yuval Kluger |
Abstract | Medical practitioners use survival models to explore and understand the relationships between patients’ covariates (e.g. clinical and genetic features) and the effectiveness of various treatment options. Standard survival models like the linear Cox proportional hazards model require extensive feature engineering or prior medical knowledge to model treatment interaction at an individual level. While nonlinear survival methods, such as neural networks and survival forests, can inherently model these high-level interaction terms, they have yet to be shown as effective treatment recommender systems. We introduce DeepSurv, a Cox proportional hazards deep neural network and state-of-the-art survival method for modeling interactions between a patient’s covariates and treatment effectiveness in order to provide personalized treatment recommendations. We perform a number of experiments training DeepSurv on simulated and real survival data. We demonstrate that DeepSurv performs as well as or better than other state-of-the-art survival models and validate that DeepSurv successfully models increasingly complex relationships between a patient’s covariates and their risk of failure. We then show how DeepSurv models the relationship between a patient’s features and the effectiveness of different treatment options, and how it can be used to provide individual treatment recommendations. Finally, we train DeepSurv on real clinical studies to demonstrate how its personalized treatment recommendations would increase the survival time of a set of patients. The predictive and modeling capabilities of DeepSurv will enable medical researchers to use deep neural networks as a tool in their exploration, understanding, and prediction of the effects of a patient’s characteristics on their risk of failure. |
Tasks | Feature Engineering, Predicting Patient Outcomes, Recommendation Systems, Survival Analysis |
Published | 2016-06-02 |
URL | http://arxiv.org/abs/1606.00931v3 |
PDF | http://arxiv.org/pdf/1606.00931v3.pdf |
PWC | https://paperswithcode.com/paper/deepsurv-personalized-treatment-recommender |
Repo | https://github.com/jaredleekatzman/DeepSurv |
Framework | none |
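A Cox proportional hazards network such as the one described above is typically trained by minimizing the negative Cox partial log-likelihood of the risk scores it outputs. The sketch below shows that objective in plain Python. The function name and the averaging over observed events are assumptions made for illustration; regularization terms and the network itself are omitted, and the linked repository may handle ties and batching differently.

```python
import math


def cox_neg_partial_log_likelihood(risk_scores, times, events):
    """Average negative Cox partial log-likelihood.

    risk_scores: predicted log-risk h(x_i) for each patient
    times:       observed time (event or censoring) for each patient
    events:      1 if the event was observed, 0 if the patient was censored
    """
    total = 0.0
    n_events = 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored patients only appear in risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_set_sum = sum(
            math.exp(risk_scores[j]) for j, t_j in enumerate(times) if t_j >= t_i
        )
        total += risk_scores[i] - math.log(risk_set_sum)
        n_events += 1
    return -total / n_events if n_events else 0.0


times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
good = cox_neg_partial_log_likelihood([2.0, 1.0, 0.0], times, events)  # early failures scored as high risk
bad = cox_neg_partial_log_likelihood([0.0, 1.0, 2.0], times, events)   # ranking reversed
```

The loss is lower when patients who fail earlier receive higher risk scores, which is the ordering a survival model is meant to learn.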
Learning through Dialogue Interactions by Asking Questions
Title | Learning through Dialogue Interactions by Asking Questions |
Authors | Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc’Aurelio Ranzato, Jason Weston |
Abstract | A good dialogue agent should have the ability to interact with users by both responding to questions and by asking questions, and importantly to learn from both types of interaction. In this work, we explore this direction by designing a simulator and a set of synthetic tasks in the movie domain that allow such interactions between a learner and a teacher. We investigate how a learner can benefit from asking questions in both offline and online reinforcement learning settings, and demonstrate that the learner improves when asking questions. Finally, real experiments with Mechanical Turk validate the approach. Our work represents a first step in developing such end-to-end learned interactive dialogue agents. |
Tasks | |
Published | 2016-12-15 |
URL | http://arxiv.org/abs/1612.04936v4 |
PDF | http://arxiv.org/pdf/1612.04936v4.pdf |
PWC | https://paperswithcode.com/paper/learning-through-dialogue-interactions-by |
Repo | https://github.com/aus10powell/Automated-Health-Responses |
Framework | tf |
C-mix: a high dimensional mixture model for censored durations, with applications to genetic data
Title | C-mix: a high dimensional mixture model for censored durations, with applications to genetic data |
Authors | Simon Bussy, Agathe Guilloux, Stéphane Gaïffas, Anne-Sophie Jannot |
Abstract | We introduce a mixture model for censored durations (C-mix), and develop maximum likelihood inference for the joint estimation of the time distributions and latent regression parameters of the model. We consider a high-dimensional setting, with datasets containing a large number of biomedical covariates. We therefore penalize the negative log-likelihood by the Elastic-Net, which leads to a sparse parameterization of the model. Inference is achieved using an efficient Quasi-Newton Expectation Maximization (QNEM) algorithm, for which we provide convergence properties. We then propose a score assessing the patient’s risk of an early adverse event. The statistical performance of the method is examined in an extensive Monte Carlo simulation study, and finally illustrated on three genetic datasets with high-dimensional covariates. We show that our approach outperforms the state-of-the-art, namely both the CURE and Cox proportional hazards models for this task, both in terms of C-index and AUC(t). |
Tasks | |
Published | 2016-10-24 |
URL | http://arxiv.org/abs/1610.07407v5 |
PDF | http://arxiv.org/pdf/1610.07407v5.pdf |
PWC | https://paperswithcode.com/paper/c-mix-a-high-dimensional-mixture-model-for |
Repo | https://github.com/SimonBussy/C-mix |
Framework | none |
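The C-index used as an evaluation metric in the abstract above measures how well a risk score orders patients by their time to event. The sketch below implements Harrell's pairwise version as an illustration. The function name is hypothetical, and the paper may rely on a different estimator of the C-index, so treat this as a reading aid and not as the evaluation code of the linked repository.

```python
def concordance_index(risk_scores, times, events):
    """Harrell's C-index: fraction of comparable pairs that are ordered correctly.

    A pair (i, j) is comparable when patient i had an observed event before
    time j. It is concordant when patient i also has the higher risk score.
    Tied risk scores count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


times = [1.0, 2.0, 3.0]
events = [1, 1, 0]  # the last patient is censored
perfect = concordance_index([3.0, 2.0, 1.0], times, events)
reversed_order = concordance_index([1.0, 2.0, 3.0], times, events)
uninformative = concordance_index([1.0, 1.0, 1.0], times, events)
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to an uninformative score, and 0.0 to a fully reversed ranking.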
Temporal Generative Adversarial Nets with Singular Value Clipping
Title | Temporal Generative Adversarial Nets with Singular Value Clipping |
Authors | Masaki Saito, Eiichi Matsumoto, Shunta Saito |
Abstract | In this paper, we propose a generative model, Temporal Generative Adversarial Nets (TGAN), which can learn a semantic representation of unlabeled videos, and is capable of generating videos. Unlike existing Generative Adversarial Nets (GAN)-based methods that generate videos with a single generator consisting of 3D deconvolutional layers, our model exploits two different types of generators: a temporal generator and an image generator. The temporal generator takes a single latent variable as input and outputs a set of latent variables, each of which corresponds to an image frame in a video. The image generator transforms a set of such latent variables into a video. To deal with instability in training of GAN with such advanced networks, we adopt a recently proposed model, Wasserstein GAN, and propose a novel method to train it stably in an end-to-end manner. The experimental results demonstrate the effectiveness of our methods. |
Tasks | Video Generation |
Published | 2016-11-21 |
URL | http://arxiv.org/abs/1611.06624v3 |
PDF | http://arxiv.org/pdf/1611.06624v3.pdf |
PWC | https://paperswithcode.com/paper/temporal-generative-adversarial-nets-with |
Repo | https://github.com/pfnet-research/tgan |
Framework | none |
A guide to convolution arithmetic for deep learning
Title | A guide to convolution arithmetic for deep learning |
Authors | Vincent Dumoulin, Francesco Visin |
Abstract | We introduce a guide to help deep learning practitioners understand and manipulate convolutional neural network architectures. The guide clarifies the relationship between various properties (input shape, kernel shape, zero padding, strides and output shape) of convolutional, pooling and transposed convolutional layers, as well as the relationship between convolutional and transposed convolutional layers. Relationships are derived for various cases, and are illustrated in order to make them intuitive. |
Tasks | |
Published | 2016-03-23 |
URL | http://arxiv.org/abs/1603.07285v2 |
PDF | http://arxiv.org/pdf/1603.07285v2.pdf |
PWC | https://paperswithcode.com/paper/a-guide-to-convolution-arithmetic-for-deep |
Repo | https://github.com/vdumoulin/conv_arithmetic |
Framework | none |
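The relationship between input shape, kernel shape, zero padding, strides and output shape that the guide above covers can be summarized, along a single axis, by two small functions. This is a minimal sketch: the function and parameter names are chosen here for illustration, and the parameter `a` stands for the extra output rows needed when the forward convolution's stride does not divide `i + 2p - k` evenly.

```python
def conv_output_size(i, k, s=1, p=0):
    """Output size of a convolution (or pooling with p=0) along one axis.

    i: input size, k: kernel size, s: stride, p: zero padding on each side.
    """
    return (i + 2 * p - k) // s + 1


def transposed_conv_output_size(i, k, s=1, p=0, a=0):
    """Output size of the transposed convolution associated with a convolution
    of kernel size k, stride s and padding p.

    a: remainder (input + 2p - k) mod s of the forward convolution, used to
       recover input sizes that the stride does not divide evenly.
    """
    return s * (i - 1) + a + k - 2 * p


# Forward: a 5x5 input, 3x3 kernel, stride 2, padding 1 gives a 3x3 output.
forward = conv_output_size(5, k=3, s=2, p=1)
# Transposed: mapping that 3x3 output back recovers the 5x5 shape.
backward = transposed_conv_output_size(forward, k=3, s=2, p=1)
```

The same forward formula applies independently to each spatial axis, so non-square inputs and kernels only require calling it once per axis.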
ATR4S: Toolkit with State-of-the-art Automatic Terms Recognition Methods in Scala
Title | ATR4S: Toolkit with State-of-the-art Automatic Terms Recognition Methods in Scala |
Authors | N. Astrakhantsev |
Abstract | Automatically recognized terminology is widely used for various domain-specific text processing tasks, such as machine translation, information retrieval or sentiment analysis. However, there is still no agreement on which methods are best suited for particular settings and, moreover, there is no reliable comparison of already developed methods. We believe that one of the main reasons is the lack of implementations of state-of-the-art methods, which are usually non-trivial to recreate. In order to address these issues, we present ATR4S, an open-source software package written in Scala that comprises more than 15 methods for automatic terminology recognition (ATR) and implements the whole pipeline from text document preprocessing, to term candidate collection, term candidate scoring, and finally, term candidate ranking. It is a highly scalable, modular and configurable tool with support for automatic caching. We also compare 10 state-of-the-art methods on 7 open datasets by average precision and processing time. Experimental comparison reveals that no single method demonstrates the best average precision for all datasets and that other available tools for ATR do not contain the best methods. |
Tasks | Information Retrieval, Machine Translation, Sentiment Analysis |
Published | 2016-11-23 |
URL | http://arxiv.org/abs/1611.07804v1 |
PDF | http://arxiv.org/pdf/1611.07804v1.pdf |
PWC | https://paperswithcode.com/paper/atr4s-toolkit-with-state-of-the-art-automatic |
Repo | https://github.com/ispras/atr4s |
Framework | none |
Continuous 3D Label Stereo Matching using Local Expansion Moves
Title | Continuous 3D Label Stereo Matching using Local Expansion Moves |
Authors | Tatsunori Taniai, Yasuyuki Matsushita, Yoichi Sato, Takeshi Naemura |
Abstract | We present an accurate stereo matching method using local expansion moves based on graph cuts. This new move-making scheme is used to efficiently infer per-pixel 3D plane labels on a pairwise Markov random field (MRF) that effectively combines recently proposed slanted patch matching and curvature regularization terms. The local expansion moves are presented as many alpha-expansions defined for small grid regions. The local expansion moves extend traditional expansion moves in two ways: localization and spatial propagation. By localization, we use different candidate alpha-labels according to the locations of local alpha-expansions. By spatial propagation, we design our local alpha-expansions to propagate currently assigned labels for nearby regions. With this localization and spatial propagation, our method can efficiently infer MRF models with a continuous label space using randomized search. Our method has several advantages over previous approaches that are based on fusion moves or belief propagation: it produces submodular moves deriving a subproblem optimality; it helps find good, smooth, piecewise linear disparity maps; it is suitable for parallelization; and it can use cost-volume filtering techniques for accelerating the matching cost computations. Even using a simple pairwise MRF, our method is shown to have the best performance on the Middlebury stereo benchmark V2 and V3. |
Tasks | Stereo Matching, Stereo Matching Hand |
Published | 2016-03-28 |
URL | http://arxiv.org/abs/1603.08328v3 |
PDF | http://arxiv.org/pdf/1603.08328v3.pdf |
PWC | https://paperswithcode.com/paper/continuous-3d-label-stereo-matching-using |
Repo | https://github.com/kbatsos/CBMV |
Framework | none |
XGBoost: A Scalable Tree Boosting System
Title | XGBoost: A Scalable Tree Boosting System |
Authors | Tianqi Chen, Carlos Guestrin |
Abstract | Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems. |
Tasks | Dimensionality Reduction |
Published | 2016-03-09 |
URL | http://arxiv.org/abs/1603.02754v3 |
PDF | http://arxiv.org/pdf/1603.02754v3.pdf |
PWC | https://paperswithcode.com/paper/xgboost-a-scalable-tree-boosting-system |
Repo | https://github.com/jlanday/X-ray-Object-Classification |
Framework | none |
Stability selection for component-wise gradient boosting in multiple dimensions
Title | Stability selection for component-wise gradient boosting in multiple dimensions |
Authors | Janek Thomas, Andreas Mayr, Bernd Bischl, Matthias Schmid, Adam Smith, Benjamin Hofner |
Abstract | We present a new algorithm for boosting generalized additive models for location, scale and shape (GAMLSS) that allows stability selection to be incorporated, an increasingly popular way to obtain stable sets of covariates while controlling the per-family error rate (PFER). The model is fitted repeatedly to subsampled data and variables with high selection frequencies are extracted. To apply stability selection to boosted GAMLSS, we develop a new “noncyclical” fitting algorithm that incorporates an additional selection step of the best-fitting distribution parameter in each iteration. This new algorithm has the additional advantage that optimizing the tuning parameters of boosting is reduced from a multi-dimensional to a one-dimensional problem with vastly decreased complexity. The performance of the novel algorithm is evaluated in an extensive simulation study. We apply this new algorithm to a study to estimate the abundance of common eider in Massachusetts, USA, featuring excess zeros, overdispersion, non-linearity and spatio-temporal structures. Eider abundance is estimated via boosted GAMLSS, allowing both mean and overdispersion to be regressed on covariates. Stability selection is used to obtain a sparse set of stable predictors. |
Tasks | |
Published | 2016-11-30 |
URL | http://arxiv.org/abs/1611.10171v1 |
PDF | http://arxiv.org/pdf/1611.10171v1.pdf |
PWC | https://paperswithcode.com/paper/stability-selection-for-component-wise |
Repo | https://github.com/boost-R/gamboostLSS |
Framework | none |
Error bounds for approximations with deep ReLU networks
Title | Error bounds for approximations with deep ReLU networks |
Authors | Dmitry Yarotsky |
Abstract | We study the expressive power of shallow and deep neural networks with piecewise linear activation functions. We establish new rigorous upper and lower bounds for the network complexity in the setting of approximations in Sobolev spaces. In particular, we prove that deep ReLU networks approximate smooth functions more efficiently than shallow networks. In the case of approximations of 1D Lipschitz functions, we describe adaptive depth-6 network architectures that are more efficient than the standard shallow architecture. |
Tasks | |
Published | 2016-10-03 |
URL | http://arxiv.org/abs/1610.01145v3 |
PDF | http://arxiv.org/pdf/1610.01145v3.pdf |
PWC | https://paperswithcode.com/paper/error-bounds-for-approximations-with-deep |
Repo | https://github.com/arsenal9971/TUB_MoDL |
Framework | tf |
Can Peripheral Representations Improve Clutter Metrics on Complex Scenes?
Title | Can Peripheral Representations Improve Clutter Metrics on Complex Scenes? |
Authors | Arturo Deza, Miguel P. Eckstein |
Abstract | Previous studies have proposed image-based clutter measures that correlate with human search times and/or eye movements. However, most models do not take into account the fact that the effects of clutter interact with the foveated nature of the human visual system: visual clutter further from the fovea has an increasing detrimental influence on perception. Here, we introduce a new foveated clutter model to predict the detrimental effects in target search utilizing a forced fixation search task. We use Feature Congestion (Rosenholtz et al.) as our non foveated clutter model, and we stack a peripheral architecture on top of Feature Congestion for our foveated model. We introduce the Peripheral Integration Feature Congestion (PIFC) coefficient, as a fundamental ingredient of our model that modulates clutter as a non-linear gain contingent on eccentricity. We finally show that Foveated Feature Congestion (FFC) clutter scores r(44) = -0.82 correlate better with target detection (hit rate) than regular Feature Congestion r(44) = -0.19 in forced fixation search. Thus, our model allows us to enrich clutter perception research by computing fixation specific clutter maps. A toolbox for creating peripheral architectures: Piranhas: Peripheral Architectures for Natural, Hybrid and Artificial Systems will be made available. |
Tasks | |
Published | 2016-08-14 |
URL | http://arxiv.org/abs/1608.04042v1 |
PDF | http://arxiv.org/pdf/1608.04042v1.pdf |
PWC | https://paperswithcode.com/paper/can-peripheral-representations-improve |
Repo | https://github.com/ArturoDeza/Piranhas |
Framework | torch |
TextBoxes: A Fast Text Detector with a Single Deep Neural Network
Title | TextBoxes: A Fast Text Detector with a Single Deep Neural Network |
Authors | Minghui Liao, Baoguang Shi, Xiang Bai, Xinggang Wang, Wenyu Liu |
Abstract | This paper presents an end-to-end trainable fast scene text detector, named TextBoxes, which detects scene text with both high accuracy and efficiency in a single network forward pass, involving no post-process except for a standard non-maximum suppression. TextBoxes outperforms competing methods in terms of text localization accuracy and is much faster, taking only 0.09s per image in a fast implementation. Furthermore, combined with a text recognizer, TextBoxes significantly outperforms state-of-the-art approaches on word spotting and end-to-end text recognition tasks. |
Tasks | |
Published | 2016-11-21 |
URL | http://arxiv.org/abs/1611.06779v1 |
PDF | http://arxiv.org/pdf/1611.06779v1.pdf |
PWC | https://paperswithcode.com/paper/textboxes-a-fast-text-detector-with-a-single |
Repo | https://github.com/MhLiao/TextBoxes |
Framework | none |
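The only post-processing step named in the abstract above is standard non-maximum suppression. The sketch below shows the usual greedy variant on axis-aligned boxes. The box format `(x1, y1, x2, y2)`, the function names and the default threshold are assumptions for illustration and are not taken from the TextBoxes code.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_maximum_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it
    by more than iou_threshold, and repeat. Returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_maximum_suppression(boxes, scores)
```

In this example the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the third box does not overlap anything and is kept.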
Unsupervised, Efficient and Semantic Expertise Retrieval
Title | Unsupervised, Efficient and Semantic Expertise Retrieval |
Authors | Christophe Van Gysel, Maarten de Rijke, Marcel Worring |
Abstract | We introduce an unsupervised discriminative model for the task of retrieving experts in online document collections. We exclusively employ textual evidence and avoid explicit feature engineering by learning distributed word representations in an unsupervised way. We compare our model to state-of-the-art unsupervised statistical vector space and probabilistic generative approaches. Our proposed log-linear model achieves the retrieval performance levels of state-of-the-art document-centric methods with the low inference cost of so-called profile-centric approaches. It yields a statistically significant improved ranking over vector space and generative models in most cases, matching the performance of supervised methods on various benchmarks. That is, by using solely text we can do as well as methods that work with external evidence and/or relevance feedback. A contrastive analysis of rankings produced by discriminative and generative approaches shows that they have complementary strengths due to the ability of the unsupervised discriminative model to perform semantic matching. |
Tasks | Feature Engineering |
Published | 2016-08-23 |
URL | http://arxiv.org/abs/1608.06651v2 |
PDF | http://arxiv.org/pdf/1608.06651v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-efficient-and-semantic-expertise |
Repo | https://github.com/cvangysel/SERT |
Framework | none |
ImageNet pre-trained models with batch normalization
Title | ImageNet pre-trained models with batch normalization |
Authors | Marcel Simon, Erik Rodner, Joachim Denzler |
Abstract | Convolutional neural networks (CNN) pre-trained on ImageNet are the backbone of most state-of-the-art approaches. In this paper, we present a new set of pre-trained models with popular state-of-the-art architectures for the Caffe framework. The first release includes Residual Networks (ResNets) with generation script as well as the batch-normalization-variants of AlexNet and VGG19. All models outperform previous models with the same architecture. The models and training code are available at http://www.inf-cv.uni-jena.de/Research/CNN+Models.html and https://github.com/cvjena/cnn-models |
Tasks | |
Published | 2016-12-05 |
URL | http://arxiv.org/abs/1612.01452v2 |
PDF | http://arxiv.org/pdf/1612.01452v2.pdf |
PWC | https://paperswithcode.com/paper/imagenet-pre-trained-models-with-batch |
Repo | https://github.com/fdac18/ForensicImages |
Framework | none |