Paper Group ANR 394
About Learning in Recurrent Bistable Gradient Networks
Title | About Learning in Recurrent Bistable Gradient Networks |
Authors | J. Fischer, S. Lackner |
Abstract | Recurrent Bistable Gradient Networks are attractor-based neural networks characterized by bistable dynamics of each single neuron. Coupled together using linear interaction determined by the interconnection weights, these networks no longer suffer from spurious states or very limited capacity. Vladimir Chinarov and Michael Menzinger, who invented these networks, trained them using Hebb’s learning rule. We show that this way of computing the weights leads to unwanted behaviour and limitations of the networks’ capabilities. Furthermore, we show that using the first order of Hinton’s Contrastive Divergence algorithm leads to a quite promising recurrent neural network. These findings are tested by learning images from the MNIST database of handwritten digits. |
Tasks | |
Published | 2016-08-29 |
URL | http://arxiv.org/abs/1608.08265v1 |
http://arxiv.org/pdf/1608.08265v1.pdf | |
PWC | https://paperswithcode.com/paper/about-learning-in-recurrent-bistable-gradient |
Repo | |
Framework | |
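The Hebbian weight computation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the textbook Hopfield-style form of Hebb's rule (summed outer products of bipolar patterns, scaled by the number of units, with a zero diagonal); the exact normalisation used by Chinarov and Menzinger is not given in the abstract, and the sign-based recall check is an ordinary Hopfield-style update used only for illustration, not the bistable gradient dynamics of the paper. The function name `hebbian_weights` and the toy patterns are hypothetical.

```python
import numpy as np

def hebbian_weights(patterns):
    """Hebb's rule (assumed textbook form): W = (1/N) * sum of outer products."""
    patterns = np.asarray(patterns, dtype=float)
    n_units = patterns.shape[1]
    weights = patterns.T @ patterns / n_units
    np.fill_diagonal(weights, 0.0)  # no self-connections
    return weights

# Two toy bipolar patterns (made up for this example).
patterns = np.array([
    [1, -1, 1, -1, 1, -1],
    [1, 1, 1, -1, -1, -1],
])
weights = hebbian_weights(patterns)

# One synchronous sign update starting from the first stored pattern.
recalled = np.sign(weights @ patterns[0])
```

With only two stored patterns the first one is a fixed point of the sign update; the abstract's claim is that this way of computing weights breaks down in the bistable gradient setting, which this toy check does not reproduce.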
Identification of refugee influx patterns in Greece via model-theoretic analysis of daily arrivals
Title | Identification of refugee influx patterns in Greece via model-theoretic analysis of daily arrivals |
Authors | Harris V. Georgiou |
Abstract | The refugee crisis is perhaps the single most challenging problem for Europe today. Hundreds of thousands of people have already traveled across dangerous sea passages from Turkish shores to Greek islands, resulting in thousands of dead and missing, despite the best rescue efforts from both sides. One of the main reasons is the total lack of any early warning-alerting system, which could provide some preparation time for the prompt and effective deployment of resources at the hot zones. This work is such an attempt for a systemic analysis of the refugee influx in Greece, aiming at (a) the statistical and signal-level characterization of the smuggling networks and (b) the formulation and preliminary assessment of such models for predictive purposes, i.e., as the basis of such an early warning-alerting protocol. To our knowledge, this is the first-ever attempt to design such a system, since this refugee crisis itself and its geographical properties are unique (intense event handling, little or no warning). The analysis employs a wide range of statistical, signal-based and matrix factorization (decomposition) techniques, including linear & linear-cosine regression, spectral analysis, ARMA, SVD, Probabilistic PCA, ICA, K-SVD for Dictionary Learning, as well as fractal dimension analysis. It is established that the behavioral patterns of the smuggling networks closely match (as expected) the regular burst and pause periods of store-and-forward networks in digital communications. There are also major periodic trends in the range of 6.2-6.5 days and strong correlations in lags of four or more days, with distinct preference in the Sunday-Monday 48-hour time frame. These results show that such models can be used successfully for short-term forecasting of the influx intensity, producing an invaluable operational asset for planners, decision-makers and first-responders. |
Tasks | Dictionary Learning |
Published | 2016-05-09 |
URL | http://arxiv.org/abs/1605.02784v1 |
http://arxiv.org/pdf/1605.02784v1.pdf | |
PWC | https://paperswithcode.com/paper/identification-of-refugee-influx-patterns-in |
Repo | |
Framework | |
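The spectral-analysis step named in the abstract above (finding a dominant period in a daily series) can be sketched as follows. This is a generic periodogram on synthetic data, not the author's pipeline: the series, its 6.4-day period, the noise level, and the function name `dominant_period` are all assumptions chosen for illustration.

```python
import numpy as np

def dominant_period(series, sample_spacing=1.0):
    """Return the period (in units of sample_spacing) of the strongest spectral peak."""
    series = np.asarray(series, dtype=float)
    centered = series - series.mean()            # remove the constant offset
    power = np.abs(np.fft.rfft(centered)) ** 2   # periodogram
    freqs = np.fft.rfftfreq(len(centered), d=sample_spacing)
    peak = np.argmax(power[1:]) + 1              # skip the zero-frequency bin
    return 1.0 / freqs[peak]

# Synthetic daily counts with a 6.4-day cycle plus noise (not real arrival data).
rng = np.random.default_rng(0)
days = np.arange(320)
arrivals = 1000 + 300 * np.sin(2 * np.pi * days / 6.4) + rng.normal(scale=50, size=days.size)

period = dominant_period(arrivals)
```

The series length is chosen so that the 6.4-day cycle falls exactly on a frequency bin; on real data the peak would generally be spread over neighbouring bins and the estimate correspondingly coarser.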
Causal Inference in Observational Data
Title | Causal Inference in Observational Data |
Authors | Pranjul Yadav, Lisiane Prunelli, Alexander Hoff, Michael Steinbach, Bonnie Westra, Vipin Kumar, Gyorgy Simon |
Abstract | Our aging population increasingly suffers from multiple chronic diseases simultaneously, necessitating the comprehensive treatment of these conditions. Finding the optimal set of drugs for a combinatorial set of diseases is a combinatorial pattern exploration problem. Association rule mining is a popular tool for such problems, but the requirement of health care for finding causal, rather than associative, patterns renders association rule mining unsuitable. To address this issue, we propose a novel framework based on the Rubin-Neyman causal model for extracting causal rules from observational data, correcting for a number of common biases. Specifically, given a set of interventions and a set of items that define subpopulations (e.g., diseases), we wish to find all subpopulations in which effective intervention combinations exist and in each such subpopulation, we wish to find all intervention combinations such that dropping any intervention from this combination will reduce the efficacy of the treatment. A key aspect of our framework is the concept of closed intervention sets which extend the concept of quantifying the effect of a single intervention to a set of concurrent interventions. We also evaluated our causal rule mining framework on the Electronic Health Records (EHR) data of a large cohort of patients from Mayo Clinic and showed that the patterns we extracted are sufficiently rich to explain the controversial findings in the medical literature regarding the effect of a class of cholesterol drugs on Type-II Diabetes Mellitus (T2DM). |
Tasks | Causal Inference |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04660v1 |
http://arxiv.org/pdf/1611.04660v1.pdf | |
PWC | https://paperswithcode.com/paper/causal-inference-in-observational-data |
Repo | |
Framework | |
Sensitivity Maps of the Hilbert-Schmidt Independence Criterion
Title | Sensitivity Maps of the Hilbert-Schmidt Independence Criterion |
Authors | Adrián Pérez-Suay, Gustau Camps-Valls |
Abstract | Kernel dependence measures yield accurate estimates of nonlinear relations between random variables, and they are also endowed with solid theoretical properties and convergence rates. Besides, the empirical estimates are easy to compute in closed form, involving only linear algebra operations. However, they are hampered by two important problems: the high computational cost involved, as two kernel matrices of the sample size have to be computed and stored, and the interpretability of the measure, which remains hidden behind the implicit feature map. We here address these two issues. We introduce the Sensitivity Maps (SMs) for the Hilbert-Schmidt independence criterion (HSIC). Sensitivity maps allow us to explicitly analyze and visualize the relative relevance of both examples and features on the dependence measure. We also present the randomized HSIC (RHSIC) and its corresponding sensitivity maps to cope with large scale problems. We build upon the framework of random features and Bochner’s theorem to approximate the involved kernels in the canonical HSIC. The power of the RHSIC measure scales favourably with the number of samples, and it approximates HSIC and the sensitivity maps efficiently. Convergence bounds of both the measure and the sensitivity map are also provided. Our proposal is illustrated in synthetic examples, and challenging real problems of dependence estimation, feature selection, and causal inference from empirical data. |
Tasks | Causal Inference, Feature Selection |
Published | 2016-11-02 |
URL | http://arxiv.org/abs/1611.00555v1 |
http://arxiv.org/pdf/1611.00555v1.pdf | |
PWC | https://paperswithcode.com/paper/sensitivity-maps-of-the-hilbert-schmidt |
Repo | |
Framework | |
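The closed-form empirical estimate referred to in the abstract above can be sketched with the standard biased HSIC estimator, trace(K H L H) / n^2, where K and L are kernel matrices and H is the centering matrix. This is a minimal illustration assuming Gaussian (RBF) kernels with fixed bandwidths and the 1/n^2 normalisation (some references use 1/(n-1)^2); it does not implement the paper's sensitivity maps or the randomized variant, and the function names are hypothetical.

```python
import numpy as np

def rbf_kernel(x, sigma=1.0):
    """Gaussian (RBF) kernel matrix between all pairs of rows of x."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * x @ x.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))

def hsic(x, y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2."""
    k = rbf_kernel(x, sigma_x)
    l = rbf_kernel(y, sigma_y)
    n = k.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(k @ h @ l @ h)) / n ** 2

# Synthetic data: one nonlinearly dependent pair and one independent pair.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y_dependent = x ** 2 + 0.1 * rng.normal(size=200)
y_independent = rng.normal(size=200)

hsic_dependent = hsic(x, y_dependent)
hsic_independent = hsic(x, y_independent)
```

Both n-by-n kernel matrices are built and held in memory here, which is the cost the abstract identifies as the motivation for a random-feature approximation.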
The Inflation Technique for Causal Inference with Latent Variables
Title | The Inflation Technique for Causal Inference with Latent Variables |
Authors | Elie Wolfe, Robert W. Spekkens, Tobias Fritz |
Abstract | The problem of causal inference is to determine if a given probability distribution on observed variables is compatible with some causal structure. The difficult case is when the causal structure includes latent variables. We here introduce the $\textit{inflation technique}$ for tackling this problem. An inflation of a causal structure is a new causal structure that can contain multiple copies of each of the original variables, but where the ancestry of each copy mirrors that of the original. To every distribution of the observed variables that is compatible with the original causal structure, we assign a family of marginal distributions on certain subsets of the copies that are compatible with the inflated causal structure. It follows that compatibility constraints for the inflation can be translated into compatibility constraints for the original causal structure. Even if the constraints at the level of inflation are weak, such as observable statistical independences implied by disjoint causal ancestry, the translated constraints can be strong. We apply this method to derive new inequalities whose violation by a distribution witnesses that distribution’s incompatibility with the causal structure (of which Bell inequalities and Pearl’s instrumental inequality are prominent examples). We describe an algorithm for deriving all such inequalities for the original causal structure that follow from ancestral independences in the inflation. For three observed binary variables with pairwise common causes, it yields inequalities that are stronger in at least some aspects than those obtainable by existing methods. We also describe an algorithm that derives a weaker set of inequalities but is more efficient. Finally, we discuss which inflations are such that the inequalities one obtains from them remain valid even for quantum (and post-quantum) generalizations of the notion of a causal model. |
Tasks | Causal Inference |
Published | 2016-09-02 |
URL | https://arxiv.org/abs/1609.00672v5 |
https://arxiv.org/pdf/1609.00672v5.pdf | |
PWC | https://paperswithcode.com/paper/the-inflation-technique-for-causal-inference |
Repo | |
Framework | |
Hough-CNN: Deep Learning for Segmentation of Deep Brain Regions in MRI and Ultrasound
Title | Hough-CNN: Deep Learning for Segmentation of Deep Brain Regions in MRI and Ultrasound |
Authors | Fausto Milletari, Seyed-Ahmad Ahmadi, Christine Kroll, Annika Plate, Verena Rozanski, Juliana Maiostre, Johannes Levin, Olaf Dietrich, Birgit Ertl-Wagner, Kai Bötzel, Nassir Navab |
Abstract | In this work we propose a novel approach to perform segmentation by leveraging the abstraction capabilities of convolutional neural networks (CNNs). Our method is based on Hough voting, a strategy that allows for fully automatic localisation and segmentation of the anatomies of interest. This approach not only uses the CNN classification outcomes, but also implements voting by exploiting the features produced by the deepest portion of the network. We show that this learning-based segmentation method is robust, multi-region, flexible and can be easily adapted to different modalities. To show the capabilities and the behaviour of CNNs when they are applied to medical image analysis, we perform a systematic study of the performance of six different network architectures, conceived according to state-of-the-art criteria, in various situations. We evaluate the impact of both different amounts of training data and different data dimensionality (2D, 2.5D and 3D) on the final results. We show results on both MRI and transcranial US volumes depicting respectively 26 regions of the basal ganglia and the midbrain. |
Tasks | |
Published | 2016-01-26 |
URL | http://arxiv.org/abs/1601.07014v3 |
http://arxiv.org/pdf/1601.07014v3.pdf | |
PWC | https://paperswithcode.com/paper/hough-cnn-deep-learning-for-segmentation-of |
Repo | |
Framework | |
Context-aware Natural Language Generation with Recurrent Neural Networks
Title | Context-aware Natural Language Generation with Recurrent Neural Networks |
Authors | Jian Tang, Yifan Yang, Sam Carton, Ming Zhang, Qiaozhu Mei |
Abstract | This paper studies generating natural language in particular contexts or situations. We propose two novel approaches which encode the contexts into a continuous semantic representation and then decode the semantic representation into text sequences with recurrent neural networks. During decoding, the context information is attended to through a gating mechanism, addressing the problem of long-range dependency caused by lengthy sequences. We evaluate the effectiveness of the proposed approaches on user review data, in which rich contexts are available and two informative contexts, sentiments and products, are selected for evaluation. Experiments show that the fake reviews generated by our approaches are very natural. Results of fake review detection with human judges show that more than 50% of the fake reviews are misclassified as real reviews, and more than 90% are misclassified by an existing state-of-the-art fake review detection algorithm. |
Tasks | Text Generation |
Published | 2016-11-29 |
URL | http://arxiv.org/abs/1611.09900v1 |
http://arxiv.org/pdf/1611.09900v1.pdf | |
PWC | https://paperswithcode.com/paper/context-aware-natural-language-generation |
Repo | |
Framework | |
Learning to Generate Posters of Scientific Papers
Title | Learning to Generate Posters of Scientific Papers |
Authors | Yuting Qiang, Yanwei Fu, Yanwen Guo, Zhi-Hua Zhou, Leonid Sigal |
Abstract | Researchers often summarize their work in the form of posters. Posters provide a coherent and efficient way to convey core ideas from scientific papers. Generating a good scientific poster, however, is a complex and time-consuming cognitive task, since such posters need to be readable, informative, and visually aesthetic. In this paper, for the first time, we study the challenging problem of learning to generate posters from scientific papers. To this end, a data-driven framework that utilizes graphical models is proposed. Specifically, given content to display, the key elements of a good poster, including panel layout and attributes of each panel, are learned and inferred from data. Then, given the inferred layout and attributes, the composition of graphical elements within each panel is synthesized. To learn and validate our model, we collect and make public a Poster-Paper dataset, which consists of scientific papers and corresponding posters with exhaustively labelled panels and attributes. Qualitative and quantitative results indicate the effectiveness of our approach. |
Tasks | |
Published | 2016-04-05 |
URL | http://arxiv.org/abs/1604.01219v1 |
http://arxiv.org/pdf/1604.01219v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-generate-posters-of-scientific |
Repo | |
Framework | |
SYSTRAN’s Pure Neural Machine Translation Systems
Title | SYSTRAN’s Pure Neural Machine Translation Systems |
Authors | Josep Crego, Jungi Kim, Guillaume Klein, Anabel Rebollo, Kathy Yang, Jean Senellart, Egor Akhanov, Patrice Brunelle, Aurelien Coquard, Yongchao Deng, Satoshi Enoue, Chiyo Geiss, Joshua Johanson, Ardas Khalsa, Raoum Khiari, Byeongil Ko, Catherine Kobus, Jean Lorieux, Leidiana Martins, Dang-Chuan Nguyen, Alexandra Priori, Thomas Riccardi, Natalia Segal, Christophe Servan, Cyril Tiquet, Bo Wang, Jin Yang, Dakun Zhang, Jing Zhou, Peter Zoldan |
Abstract | Since the first online demonstration of Neural Machine Translation (NMT) by LISA, NMT development has recently moved from laboratory to production systems, as demonstrated by several entities announcing the roll-out of NMT engines to replace their existing technologies. NMT systems have a large number of training configurations and the training process of such systems is usually very long, often a few weeks, so the role of experimentation is critical and important to share. In this work, we present our approach to production-ready systems simultaneously with the release of online demonstrators covering a large variety of languages (12 languages, for 32 language pairs). We explore different practical choices: an efficient and evolutive open-source framework; data preparation; network architecture; additional implemented features; tuning for production; etc. We discuss evaluation methodology, present our first findings, and finally outline further work. Our ultimate goal is to share our expertise to build competitive production systems for “generic” translation. We aim to contribute to setting up a collaborative framework to speed up adoption of the technology, foster further research efforts and enable the delivery and adoption to/by industry of use-case specific engines integrated in real production workflows. Mastering the technology would allow us to build translation engines suited for particular needs, outperforming current simplest/uniform systems. |
Tasks | Machine Translation |
Published | 2016-10-18 |
URL | http://arxiv.org/abs/1610.05540v1 |
http://arxiv.org/pdf/1610.05540v1.pdf | |
PWC | https://paperswithcode.com/paper/systrans-pure-neural-machine-translation |
Repo | |
Framework | |
Learning Tree-Structured Detection Cascades for Heterogeneous Networks of Embedded Devices
Title | Learning Tree-Structured Detection Cascades for Heterogeneous Networks of Embedded Devices |
Authors | Hamid Dadkhahi, Benjamin M. Marlin |
Abstract | In this paper, we present a new approach to learning cascaded classifiers for use in computing environments that involve networks of heterogeneous and resource-constrained, low-power embedded compute and sensing nodes. We present a generalization of the classical linear detection cascade to the case of tree-structured cascades where different branches of the tree execute on different physical compute nodes in the network. Different nodes have access to different features, as well as access to potentially different computation and energy resources. We concentrate on the problem of jointly learning the parameters for all of the classifiers in the cascade given a fixed cascade architecture and a known set of costs required to carry out the computation at each node. To accomplish the objective of joint learning of all detectors, we propose a novel approach to combining classifier outputs during training that better matches the hard cascade setting in which the learned system will be deployed. This work is motivated by research in the area of mobile health, where energy-efficient real-time detectors integrating information from multiple wireless on-body sensors and a smart phone are needed for real-time monitoring and delivering just-in-time adaptive interventions. We apply our framework to two activity recognition datasets as well as the problem of cigarette smoking detection from a combination of wrist-worn actigraphy data and respiration chest band data. |
Tasks | Activity Recognition |
Published | 2016-07-30 |
URL | http://arxiv.org/abs/1608.00159v4 |
http://arxiv.org/pdf/1608.00159v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-tree-structured-detection-cascades |
Repo | |
Framework | |
Quantifying the vanishing gradient and long distance dependency problem in recursive neural networks and recursive LSTMs
Title | Quantifying the vanishing gradient and long distance dependency problem in recursive neural networks and recursive LSTMs |
Authors | Phong Le, Willem Zuidema |
Abstract | Recursive neural networks (RNNs) and their recently proposed extension, recursive long short term memory networks (RLSTMs), are models that compute representations for sentences by recursively combining word embeddings according to an externally provided parse tree. Both models thus, unlike recurrent networks, explicitly make use of the hierarchical structure of a sentence. In this paper, we demonstrate that RNNs nevertheless suffer from the vanishing gradient and long distance dependency problem, and that RLSTMs greatly improve over RNNs on these problems. We present an artificial learning task that allows us to quantify the severity of these problems for both models. We further show that a ratio of gradients (at the root node and a focal leaf node) is highly indicative of the success of backpropagation at optimizing the relevant weights low in the tree. This paper thus provides an explanation for existing, superior results of RLSTMs on tasks such as sentiment analysis, and suggests that the benefits of including hierarchical structure and of including LSTM-style gating are complementary. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2016-03-01 |
URL | http://arxiv.org/abs/1603.00423v1 |
http://arxiv.org/pdf/1603.00423v1.pdf | |
PWC | https://paperswithcode.com/paper/quantifying-the-vanishing-gradient-and-long |
Repo | |
Framework | |
Detecting Engagement in Egocentric Video
Title | Detecting Engagement in Egocentric Video |
Authors | Yu-Chuan Su, Kristen Grauman |
Abstract | In a wearable camera video, we see what the camera wearer sees. While this makes it easy to know roughly what he chose to look at, it does not immediately reveal when he was engaged with the environment. Specifically, at what moments did his focus linger, as he paused to gather more information about something he saw? Knowing this answer would benefit various applications in video summarization and augmented reality, yet prior work focuses solely on the “what” question (estimating saliency, gaze) without considering the “when” (engagement). We propose a learning-based approach that uses long-term egomotion cues to detect engagement, specifically in browsing scenarios where one frequently takes in new visual information (e.g., shopping, touring). We introduce a large, richly annotated dataset for ego-engagement that is the first of its kind. Our approach outperforms a wide array of existing methods. We show engagement can be detected well independent of both scene appearance and the camera wearer’s identity. |
Tasks | Video Summarization |
Published | 2016-04-04 |
URL | http://arxiv.org/abs/1604.00906v1 |
http://arxiv.org/pdf/1604.00906v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-engagement-in-egocentric-video |
Repo | |
Framework | |
Concept Generation in Language Evolution
Title | Concept Generation in Language Evolution |
Authors | Martha Lewis, Jonathan Lawry |
Abstract | This thesis investigates the generation of new concepts from combinations of existing concepts as a language evolves. We give a method for combining concepts, and will be investigating the utility of composite concepts in language evolution and thence the utility of concept generation. |
Tasks | |
Published | 2016-01-25 |
URL | http://arxiv.org/abs/1601.06732v1 |
http://arxiv.org/pdf/1601.06732v1.pdf | |
PWC | https://paperswithcode.com/paper/concept-generation-in-language-evolution |
Repo | |
Framework | |
Active Information Acquisition
Title | Active Information Acquisition |
Authors | He He, Paul Mineiro, Nikos Karampatziakis |
Abstract | We propose a general framework for sequential and dynamic acquisition of useful information in order to solve a particular task. While our goal could in principle be tackled by general reinforcement learning, our particular setting is constrained enough to allow more efficient algorithms. In this paper, we work under the Learning to Search framework and show how to formulate the goal of finding a dynamic information acquisition policy in that framework. We apply our formulation to two tasks, sentiment analysis and image recognition, and show that the learned policies exhibit good statistical performance. As an emergent byproduct, the learned policies show a tendency to focus on the most prominent parts of each instance and give harder instances more attention without explicitly being trained to do so. |
Tasks | Sentiment Analysis |
Published | 2016-02-05 |
URL | http://arxiv.org/abs/1602.02181v1 |
http://arxiv.org/pdf/1602.02181v1.pdf | |
PWC | https://paperswithcode.com/paper/active-information-acquisition |
Repo | |
Framework | |
Adversarial Delays in Online Strongly-Convex Optimization
Title | Adversarial Delays in Online Strongly-Convex Optimization |
Authors | Daniel Khashabi, Kent Quanrud, Amirhossein Taghvaei |
Abstract | We consider the problem of strongly-convex online optimization in the presence of adversarial delays: in a $T$-iteration online game, the feedback of the player’s query at time $t$ is arbitrarily delayed by an adversary for $d_t$ rounds and delivered before the game ends, at iteration $t+d_t-1$. Specifically, for the online-gradient-descent algorithm we show a simple regret bound of $O\left(\sum_{t=1}^T \log\left(1+\frac{d_t}{t}\right)\right)$. This gives a clear and simple bound without resorting to any distributional or limiting assumptions on the delays. We further show how this result encompasses and generalizes several known results in the literature. Specifically, it matches the celebrated logarithmic regret $O(\log T)$ when there are no delays (i.e., $d_t = 1$) and the regret bound of $O(\tau \log T)$ for constant delays $d_t = \tau$. |
Tasks | |
Published | 2016-05-20 |
URL | https://arxiv.org/abs/1605.06201v4 |
https://arxiv.org/pdf/1605.06201v4.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-delays-in-online-strongly-convex |
Repo | |
Framework | |
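The two special cases quoted in the abstract above for adversarial delays can be checked numerically. The sketch below evaluates only the sum inside the bound, sum over t of log(1 + d_t / t), and ignores the constants hidden by the big-O notation; the function name `delay_regret_sum` and the chosen values of T and tau are assumptions made for illustration.

```python
import math

def delay_regret_sum(delays):
    """Evaluate sum_{t=1}^{T} log(1 + d_t / t) for delays d_1, ..., d_T."""
    return sum(math.log(1.0 + d / t) for t, d in enumerate(delays, start=1))

T = 10_000
tau = 5

# No delays (d_t = 1): the sum telescopes to log(T + 1).
no_delay = delay_regret_sum([1] * T)

# Constant delays (d_t = tau): the sum equals log C(T + tau, tau),
# which is at most tau * (1 + log T).
constant_delay = delay_regret_sum([tau] * T)
```

Both closed forms follow from writing each term as log((t + d_t) / t) and cancelling factors in the product, which is why the general bound recovers the logarithmic regret with no delays and the tau log T regret for constant delays.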