Paper Group ANR 240
Multivariate Regression with Gross Errors on Manifold-valued Data
Title | Multivariate Regression with Gross Errors on Manifold-valued Data |
Authors | Xiaowei Zhang, Xudong Shi, Yu Sun, Li Cheng |
Abstract | We consider the topic of multivariate regression on manifold-valued output, that is, for a multivariate observation, its output response lies on a manifold. Moreover, we propose a new regression model to deal with the presence of grossly corrupted manifold-valued responses, a bottleneck issue commonly encountered in practical scenarios. Our model first takes a correction step on the grossly corrupted responses via geodesic curves on the manifold, and then performs multivariate linear regression on the corrected data. This results in a nonconvex and nonsmooth optimization problem on manifolds. To this end, we propose a dedicated approach named PALMR, by utilizing and extending the proximal alternating linearized minimization techniques. Theoretically, we investigate its convergence property, where it is shown to converge to a critical point under mild conditions. Empirically, we test our model on both synthetic and real diffusion tensor imaging data, and show that our model outperforms other multivariate regression models when manifold-valued responses contain gross errors, and is effective in identifying gross errors. |
Tasks | |
Published | 2017-03-26 |
URL | http://arxiv.org/abs/1703.08772v2 |
http://arxiv.org/pdf/1703.08772v2.pdf | |
PWC | https://paperswithcode.com/paper/multivariate-regression-with-gross-errors-on |
Repo | |
Framework | |
Deep Semantic Abstractions of Everyday Human Activities: On Commonsense Representations of Human Interactions
Title | Deep Semantic Abstractions of Everyday Human Activities: On Commonsense Representations of Human Interactions |
Authors | Jakob Suchan, Mehul Bhatt |
Abstract | We propose a deep semantic characterization of space and motion categorically from the viewpoint of grounding embodied human-object interactions. Our key focus is on an ontological model that would be adept to formalisation from the viewpoint of commonsense knowledge representation, relational learning, and qualitative reasoning about space and motion in cognitive robotics settings. We demonstrate key aspects of the space & motion ontology and its formalization as a representational framework in the backdrop of select examples from a dataset of everyday activities. Furthermore, focussing on human-object interaction data obtained from RGBD sensors, we also illustrate how declarative (spatio-temporal) reasoning in the (constraint) logic programming family may be performed with the developed deep semantic abstractions. |
Tasks | Human-Object Interaction Detection, Relational Reasoning |
Published | 2017-10-10 |
URL | http://arxiv.org/abs/1710.04076v1 |
http://arxiv.org/pdf/1710.04076v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-semantic-abstractions-of-everyday-human |
Repo | |
Framework | |
EMR-based medical knowledge representation and inference via Markov random fields and distributed representation learning
Title | EMR-based medical knowledge representation and inference via Markov random fields and distributed representation learning |
Authors | Chao Zhao, Jingchi Jiang, Yi Guan |
Abstract | Objective: Electronic medical records (EMRs) contain an amount of medical knowledge which can be used for clinical decision support (CDS). Our objective is a general system that can extract and represent the knowledge contained in EMRs to support three CDS tasks: test recommendation, initial diagnosis, and treatment plan recommendation, with the given condition of one patient. Methods: We extracted four kinds of medical entities from records and constructed an EMR-based medical knowledge network (EMKN), in which nodes are entities and edges reflect their co-occurrence in a single record. Three bipartite subgraphs (bi-graphs) were extracted from the EMKN to support each task. One part of the bi-graph was the given condition (e.g., symptoms), and the other was the condition to be inferred (e.g., diseases). Each bi-graph was regarded as a Markov random field to support the inference. Three lazy energy functions and one parameter-based energy function were proposed, as well as two knowledge representation learning-based energy functions, which can provide a distributed representation of medical entities. Three measures were utilized for performance evaluation. Results: On the initial diagnosis task, 80.11% of the test records identified at least one correct disease from the top 10 candidates. Test and treatment recommendation results were 87.88% and 92.55%, respectively. These results altogether indicate that the proposed system outperformed the baseline methods. The distributed representation of medical entities does reflect similarity relationships at the knowledge level. Conclusion: Combining EMKN and MRF is an effective approach for general medical knowledge representation and inference. Different tasks, however, require designing their energy functions individually. |
Tasks | Representation Learning |
Published | 2017-09-20 |
URL | http://arxiv.org/abs/1709.06908v1 |
http://arxiv.org/pdf/1709.06908v1.pdf | |
PWC | https://paperswithcode.com/paper/emr-based-medical-knowledge-representation |
Repo | |
Framework | |
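The network construction described in the abstract above (entities as nodes, co-occurrence within a single record as edges) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function name `build_cooccurrence_network`, the record format, and the toy records are all hypothetical, and weighting edges by raw co-occurrence count is an assumption.

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence_network(records):
    """Count how often each unordered pair of entities shares a record.

    `records` is an iterable of entity collections, one per record
    (a hypothetical input format chosen for this sketch).
    """
    edges = Counter()
    for entities in records:
        # Deduplicate and sort so each pair has a single canonical key.
        for a, b in combinations(sorted(set(entities)), 2):
            edges[(a, b)] += 1
    return edges

# Toy records, invented for illustration only.
records = [
    ["cough", "fever", "pneumonia"],
    ["cough", "pneumonia"],
    ["fever", "influenza"],
]
network = build_cooccurrence_network(records)
```

A bipartite subgraph for one task (for example symptoms versus diseases) would then keep only the edges whose endpoints fall on opposite sides of that split.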
Care about you: towards large-scale human-centric visual relationship detection
Title | Care about you: towards large-scale human-centric visual relationship detection |
Authors | Bohan Zhuang, Qi Wu, Chunhua Shen, Ian Reid, Anton van den Hengel |
Abstract | Visual relationship detection aims to capture interactions between pairs of objects in images. Relationships between objects and humans represent a particularly important subset of this problem, with implications for challenges such as understanding human behaviour, and identifying affordances, amongst others. In addressing this problem we first construct a large-scale human-centric visual relationship detection dataset (HCVRD), which provides many more types of relationship annotation (nearly 10K categories) than previously released datasets. This large label space better reflects the reality of human-object interactions, but gives rise to a long-tail distribution problem, which in turn demands a zero-shot approach to labels appearing only in the test set. This is the first time this issue has been addressed. We propose a webly-supervised approach to these problems and demonstrate that the proposed model provides a strong baseline on our HCVRD dataset. |
Tasks | Human-Object Interaction Detection |
Published | 2017-05-28 |
URL | http://arxiv.org/abs/1705.09892v1 |
http://arxiv.org/pdf/1705.09892v1.pdf | |
PWC | https://paperswithcode.com/paper/care-about-you-towards-large-scale-human |
Repo | |
Framework | |
Markov Brains: A Technical Introduction
Title | Markov Brains: A Technical Introduction |
Authors | Arend Hintze, Jeffrey A. Edlund, Randal S. Olson, David B. Knoester, Jory Schossau, Larissa Albantakis, Ali Tehrani-Saleh, Peter Kvam, Leigh Sheneman, Heather Goldsby, Clifford Bohm, Christoph Adami |
Abstract | Markov Brains are a class of evolvable artificial neural networks (ANN). They differ from conventional ANNs in many aspects, but the key difference is that instead of a layered architecture, with each node performing the same function, Markov Brains are networks built from individual computational components. These computational components interact with each other, receive inputs from sensors, and control motor outputs. The function of the computational components, their connections to each other, as well as connections to sensors and motors are all subject to evolutionary optimization. Here we describe in detail how a Markov Brain works, what techniques can be used to study them, and how they can be evolved. |
Tasks | |
Published | 2017-09-17 |
URL | http://arxiv.org/abs/1709.05601v1 |
http://arxiv.org/pdf/1709.05601v1.pdf | |
PWC | https://paperswithcode.com/paper/markov-brains-a-technical-introduction |
Repo | |
Framework | |
A Robust Genetic Algorithm for Learning Temporal Specifications from Data
Title | A Robust Genetic Algorithm for Learning Temporal Specifications from Data |
Authors | Laura Nenzi, Simone Silvetti, Ezio Bartocci, Luca Bortolussi |
Abstract | We consider the problem of mining signal temporal logical requirements from a dataset of regular (good) and anomalous (bad) trajectories of a dynamical system. We assume the training set to be labeled by human experts and that we have access only to a limited amount of data, typically noisy. We provide a systematic approach to synthesize both the syntactical structure and the parameters of the temporal logic formula using a two-step procedure: first, we leverage a novel evolutionary algorithm for learning the structure of the formula; second, we perform the parameter synthesis operating on the statistical emulation of the average robustness for a candidate formula w.r.t. its parameters. We compare our results with our previous work [BufoBSBLB14] and with a recently proposed decision-tree [bombara_decision_2016] based method. We present experimental results on two case studies: an anomalous trajectory detection problem of a naval surveillance system and the characterization of an Ineffective Respiratory effort, showing the usefulness of our work. |
Tasks | |
Published | 2017-11-13 |
URL | http://arxiv.org/abs/1711.06202v3 |
http://arxiv.org/pdf/1711.06202v3.pdf | |
PWC | https://paperswithcode.com/paper/a-robust-genetic-algorithm-for-learning |
Repo | |
Framework | |
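The abstract above relies on the "robustness" of a temporal logic formula. Under the standard quantitative semantics of signal temporal logic, robustness is a signed margin: positive when the trajectory satisfies the formula, negative when it violates it. A minimal sketch for two simple formulas follows; the function names and sample trajectories are hypothetical, and the temporal operators are assumed to range over the whole finite trajectory rather than a bounded time window.

```python
def robustness_always_lt(signal, threshold):
    """Robustness of 'always (x < threshold)': the worst-case margin."""
    return min(threshold - x for x in signal)

def robustness_eventually_gt(signal, threshold):
    """Robustness of 'eventually (x > threshold)': the best-case margin."""
    return max(x - threshold for x in signal)

# Invented trajectories: one stays below 5.0, one exceeds it once.
good = [1.0, 2.0, 3.0]
bad = [1.0, 6.0, 3.0]

r_good = robustness_always_lt(good, 5.0)   # satisfied, positive margin
r_bad = robustness_always_lt(bad, 5.0)     # violated, negative margin
```

A threshold that separates regular from anomalous trajectories is one for which the two groups receive robustness values of opposite sign.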
Opposition based Ensemble Micro Differential Evolution
Title | Opposition based Ensemble Micro Differential Evolution |
Authors | Hojjat Salehinejad, Shahryar Rahnamayan, Hamid R. Tizhoosh |
Abstract | Differential evolution (DE) algorithm with a small population size is called Micro-DE (MDE). A small population size decreases the computational complexity but also reduces the exploration ability of DE by limiting the population diversity. In this paper, we propose the idea of combining ensemble mutation scheme selection and opposition-based learning concepts to enhance the diversity of population in MDE at mutation and selection stages. The proposed algorithm enhances the diversity of population by generating a random mutation scale factor per individual and per dimension, randomly assigning a mutation scheme to each individual in each generation, and diversifying individual selection using opposition-based learning. This approach is easy to implement and does not require the setting of mutation scheme selection and mutation scale factor. Experiments are conducted for a variety of objective functions with low and high dimensionality on the CEC Black-Box Optimization Benchmarking 2015 (CEC-BBOB 2015). The results show superior performance of the proposed algorithm compared to the other micro-DE algorithms. |
Tasks | |
Published | 2017-09-08 |
URL | http://arxiv.org/abs/1709.06909v2 |
http://arxiv.org/pdf/1709.06909v2.pdf | |
PWC | https://paperswithcode.com/paper/opposition-based-ensemble-micro-differential |
Repo | |
Framework | |
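Opposition-based learning, mentioned in the abstract above, is usually defined through the opposite point of a candidate: for a coordinate x in the interval [a, b], the opposite is a + b - x. The sketch below shows only that definition and a simple "keep the fitter of the pair" rule. It is not the paper's algorithm; the helper names, the bounds, and the sphere objective are assumptions made for illustration.

```python
def opposite_point(x, lower, upper):
    """Mirror each coordinate of x within its [lower, upper] interval."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]

def keep_fitter(x, lower, upper, objective):
    """Return whichever of x and its opposite has the lower objective value."""
    x_opp = opposite_point(x, lower, upper)
    return min((x, x_opp), key=objective)

def sphere(x):
    """Toy objective to minimise (hypothetical choice)."""
    return sum(v * v for v in x)

lower, upper = [-5.0, -5.0], [10.0, 10.0]
candidate = [9.0, 8.0]
best = keep_fitter(candidate, lower, upper, sphere)
```

With a very small population, evaluating opposite points is a cheap way to probe regions of the search space that the current individuals do not cover.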
Information Directed Sampling for Stochastic Bandits with Graph Feedback
Title | Information Directed Sampling for Stochastic Bandits with Graph Feedback |
Authors | Fang Liu, Swapna Buccapatnam, Ness Shroff |
Abstract | We consider stochastic multi-armed bandit problems with graph feedback, where the decision maker is allowed to observe the neighboring actions of the chosen action. We allow the graph structure to vary with time and consider both deterministic and Erdős-Rényi random graph models. For such a graph feedback model, we first present a novel analysis of Thompson sampling that leads to a tighter performance bound than existing work. Next, we propose new Information Directed Sampling based policies that are graph-aware in their decision making. Under the deterministic graph case, we establish a Bayesian regret bound for the proposed policies that scales with the clique cover number of the graph instead of the number of actions. Under the random graph case, we provide a Bayesian regret bound for the proposed policies that scales with the ratio of the number of actions over the expected number of observations per iteration. To the best of our knowledge, this is the first analytical result for stochastic bandits with random graph feedback. Finally, using numerical evaluations, we demonstrate that our proposed IDS policies outperform existing approaches, including adaptations of upper confidence bound, $\epsilon$-greedy and Exp3 algorithms. |
Tasks | Decision Making |
Published | 2017-11-08 |
URL | http://arxiv.org/abs/1711.03198v1 |
http://arxiv.org/pdf/1711.03198v1.pdf | |
PWC | https://paperswithcode.com/paper/information-directed-sampling-for-stochastic |
Repo | |
Framework | |
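To make the graph feedback setting in the abstract above concrete, here is a minimal sketch of Thompson sampling in which playing an arm also reveals the rewards of its neighbors. It assumes Bernoulli rewards with Beta(1, 1) priors and a fixed graph; these choices, the function name `thompson_step`, and the example means are all assumptions for illustration, and this is not the paper's Information Directed Sampling policy.

```python
import random

def thompson_step(alpha, beta, neighbors, true_means, rng):
    """Play one round and update the posteriors of every observed arm."""
    # Sample a plausible mean for each arm from its Beta posterior.
    samples = [rng.betavariate(a, b) for a, b in zip(alpha, beta)]
    arm = max(range(len(samples)), key=lambda i: samples[i])
    # Graph feedback: the chosen arm and all of its neighbors are observed.
    for j in neighbors[arm] | {arm}:
        reward = 1 if rng.random() < true_means[j] else 0
        alpha[j] += reward
        beta[j] += 1 - reward
    return arm

rng = random.Random(0)
true_means = [0.2, 0.5, 0.8]          # hypothetical arm means
neighbors = [{1, 2}, {0, 2}, {0, 1}]  # complete graph on three arms
alpha = [1, 1, 1]
beta = [1, 1, 1]
for _ in range(5):
    thompson_step(alpha, beta, neighbors, true_means, rng)
```

On a complete graph every round updates all arms, which is why regret in this setting can depend on the graph structure rather than on the raw number of actions.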
Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform
Title | Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform |
Authors | Chaim Baskin, Natan Liss, Evgenii Zheltonozhskii, Alex M. Bronshtein, Avi Mendelson |
Abstract | Deep neural networks (DNNs) are used by different applications that are executed on a range of computer architectures, from IoT devices to supercomputers. The footprint of these networks is huge, as are their computational and communication needs. In order to ease the pressure on resources, research indicates that in many cases a low precision representation (1-2 bit per parameter) of weights and other parameters can achieve similar accuracy while requiring fewer resources. Using quantized values enables the use of FPGAs to run NNs, since FPGAs are well fitted to these primitives; e.g., FPGAs provide efficient support for bitwise operations and can work with arbitrary-precision representation of numbers. This paper presents a new streaming architecture for running QNNs on FPGAs. The proposed architecture scales out better than alternatives, allowing us to take advantage of systems with multiple FPGAs. We also included support for skip connections, which are used in state-of-the-art NNs, and showed that our architecture allows those connections to be added almost for free. All this allowed us to implement an 18-layer ResNet for 224x224 image classification, achieving 57.5% top-1 accuracy. In addition, we implemented a full-sized quantized AlexNet. In contrast to previous works, we use 2-bit activations instead of 1-bit ones, which improves AlexNet’s top-1 accuracy from 41.8% to 51.03% for the ImageNet classification. Both AlexNet and ResNet can handle 1000-class real-time classification on an FPGA. Our implementation of ResNet-18 consumes 5x less power and is 4x slower for ImageNet, when compared to the same NN on the latest Nvidia GPUs. Smaller NNs that fit on a single FPGA run faster than on GPUs on small (32x32) inputs, while consuming up to 20x less energy and power. |
Tasks | |
Published | 2017-07-31 |
URL | http://arxiv.org/abs/1708.00052v3 |
http://arxiv.org/pdf/1708.00052v3.pdf | |
PWC | https://paperswithcode.com/paper/streaming-architecture-for-large-scale |
Repo | |
Framework | |
Combining Static and Dynamic Features for Multivariate Sequence Classification
Title | Combining Static and Dynamic Features for Multivariate Sequence Classification |
Authors | Anna Leontjeva, Ilya Kuzovkin |
Abstract | Model precision in a classification task is highly dependent on the feature space that is used to train the model. Moreover, whether the features are sequential or static will dictate which classification method can be applied as most of the machine learning algorithms are designed to deal with either one or another type of data. In real-life scenarios, however, it is often the case that both static and dynamic features are present, or can be extracted from the data. In this work, we demonstrate how generative models such as Hidden Markov Models (HMM) and Long Short-Term Memory (LSTM) artificial neural networks can be used to extract temporal information from the dynamic data. We explore how the extracted information can be combined with the static features in order to improve the classification performance. We evaluate the existing techniques and suggest a hybrid approach, which outperforms other methods on several public datasets. |
Tasks | |
Published | 2017-12-20 |
URL | http://arxiv.org/abs/1712.08160v1 |
http://arxiv.org/pdf/1712.08160v1.pdf | |
PWC | https://paperswithcode.com/paper/combining-static-and-dynamic-features-for |
Repo | |
Framework | |
Social media mining for identification and exploration of health-related information from pregnant women
Title | Social media mining for identification and exploration of health-related information from pregnant women |
Authors | Pramod Bharadwaj Chandrashekar, Arjun Magge, Abeed Sarker, Graciela Gonzalez |
Abstract | Widespread use of social media has led to the generation of substantial amounts of information about individuals, including health-related information. Social media provides the opportunity to study health-related information about selected population groups who may be of interest for a particular study. In this paper, we explore the possibility of utilizing social media to perform targeted data collection and analysis from a particular population group – pregnant women. We hypothesize that we can use social media to identify cohorts of pregnant women and follow them over time to analyze crucial health-related information. To identify potentially pregnant women, we employ simple rule-based searches that attempt to detect pregnancy announcements with moderate precision. To further filter out false positives and noise, we employ a supervised classifier using a small amount of hand-annotated data. We then collect their posts over time to create longitudinal health timelines and attempt to divide the timelines into different pregnancy trimesters. Finally, we assess the usefulness of the timelines by performing a preliminary analysis to estimate drug intake patterns of our cohort at different trimesters. Our rule-based cohort identification technique collected 53,820 users over thirty months from Twitter. Our pregnancy announcement classification technique achieved an F-measure of 0.81 for the pregnancy class, resulting in 34,895 user timelines. Analysis of the timelines revealed that pertinent health-related information, such as drug intake and adverse reactions, can be mined from the data. Our approach to using user timelines in this fashion has produced very encouraging results and can be employed for other important tasks where cohorts, for which health-related information may not be available from other sources, are required to be followed over time to derive population-based estimates. |
Tasks | |
Published | 2017-02-08 |
URL | http://arxiv.org/abs/1702.02261v1 |
http://arxiv.org/pdf/1702.02261v1.pdf | |
PWC | https://paperswithcode.com/paper/social-media-mining-for-identification-and |
Repo | |
Framework | |
A Double Parametric Bootstrap Test for Topic Models
Title | A Double Parametric Bootstrap Test for Topic Models |
Authors | Skyler Seto, Sarah Tan, Giles Hooker, Martin T. Wells |
Abstract | Non-negative matrix factorization (NMF) is a technique for finding latent representations of data. The method has been applied to corpora to construct topic models. However, NMF has likelihood assumptions which are often violated by real document corpora. We present a double parametric bootstrap test for evaluating the fit of an NMF-based topic model based on the duality of the KL divergence and Poisson maximum likelihood estimation. The test correctly identifies whether a topic model based on an NMF approach yields reliable results in simulated and real data. |
Tasks | Topic Models |
Published | 2017-11-19 |
URL | http://arxiv.org/abs/1711.07104v2 |
http://arxiv.org/pdf/1711.07104v2.pdf | |
PWC | https://paperswithcode.com/paper/a-double-parametric-bootstrap-test-for-topic |
Repo | |
Framework | |
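The duality mentioned in the abstract above can be checked numerically: for counts V and a positive reconstruction M (standing in for the entries of a factorization WH), the generalized KL divergence and the Poisson log-likelihood sum to a quantity that depends on V alone, so minimizing one is the same as maximizing the other. The sketch below flattens the matrices to plain lists and uses made-up numbers. The function names are hypothetical, and this is not the paper's bootstrap test.

```python
import math

def generalized_kl(V, M):
    """Generalized KL divergence D(V || M) over flattened entries."""
    total = 0.0
    for v, m in zip(V, M):
        if v > 0:
            total += v * math.log(v / m)
        total += m - v
    return total

def poisson_loglik(V, M):
    """Log-likelihood of counts V under independent Poisson means M."""
    return sum(v * math.log(m) - m - math.lgamma(v + 1) for v, m in zip(V, M))

V = [3, 0, 5, 2]            # toy counts
M1 = [2.5, 0.4, 4.0, 2.2]   # one candidate reconstruction
M2 = [1.0, 1.0, 6.0, 3.0]   # another candidate reconstruction

# The sum does not depend on the reconstruction, only on V.
const1 = generalized_kl(V, M1) + poisson_loglik(V, M1)
const2 = generalized_kl(V, M2) + poisson_loglik(V, M2)
```

Both sums equal the total over entries of v log v - v - log(v!), which contains no term in M.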
Image Pivoting for Learning Multilingual Multimodal Representations
Title | Image Pivoting for Learning Multilingual Multimodal Representations |
Authors | Spandana Gella, Rico Sennrich, Frank Keller, Mirella Lapata |
Abstract | In this paper we propose a model to learn multimodal multilingual representations for matching images and sentences in different languages, with the aim of advancing multilingual versions of image search and image understanding. Our model learns a common representation for images and their descriptions in two different languages (which need not be parallel) by considering the image as a pivot between two languages. We introduce a new pairwise ranking loss function which can handle both symmetric and asymmetric similarity between the two modalities. We evaluate our models on image-description ranking for German and English, and on semantic textual similarity of image descriptions in English. In both cases we achieve state-of-the-art performance. |
Tasks | Image Retrieval, Semantic Textual Similarity |
Published | 2017-07-24 |
URL | http://arxiv.org/abs/1707.07601v1 |
http://arxiv.org/pdf/1707.07601v1.pdf | |
PWC | https://paperswithcode.com/paper/image-pivoting-for-learning-multilingual |
Repo | |
Framework | |
Maximally Correlated Principal Component Analysis
Title | Maximally Correlated Principal Component Analysis |
Authors | Soheil Feizi, David Tse |
Abstract | In the era of big data, reducing data dimensionality is critical in many areas of science. The widely used Principal Component Analysis (PCA) addresses this problem by computing a low dimensional data embedding that maximally explains the variance of the data. However, PCA has two major weaknesses. Firstly, it only considers linear correlations among variables (features), and secondly it is not suitable for categorical data. We resolve these issues by proposing Maximally Correlated Principal Component Analysis (MCPCA). MCPCA computes transformations of variables whose covariance matrix has the largest Ky Fan norm. Variable transformations are unknown, can be nonlinear and are computed in an optimization. MCPCA can also be viewed as a multivariate extension of Maximal Correlation. For jointly Gaussian variables we show that the covariance matrix corresponding to the identity (or the negative of the identity) transformations majorizes covariance matrices of non-identity functions. Using this result we characterize global MCPCA optimizers for nonlinear functions of jointly Gaussian variables for every rank constraint. For categorical variables we characterize global MCPCA optimizers for the rank one constraint based on the leading eigenvector of a matrix computed using pairwise joint distributions. For a general rank constraint we propose a block coordinate descent algorithm and show its convergence to stationary points of the MCPCA optimization. We compare MCPCA with PCA and other state-of-the-art dimensionality reduction methods including Isomap, LLE, multilayer autoencoders (neural networks), kernel PCA, probabilistic PCA and diffusion maps on several synthetic and real datasets. We show that MCPCA consistently provides improved performance compared to other methods. |
Tasks | Dimensionality Reduction |
Published | 2017-02-17 |
URL | http://arxiv.org/abs/1702.05471v2 |
http://arxiv.org/pdf/1702.05471v2.pdf | |
PWC | https://paperswithcode.com/paper/maximally-correlated-principal-component |
Repo | |
Framework | |
A Novel Weight-Shared Multi-Stage CNN for Scale Robustness
Title | A Novel Weight-Shared Multi-Stage CNN for Scale Robustness |
Authors | Ryo Takahashi, Takashi Matsubara, Kuniaki Uehara |
Abstract | Convolutional neural networks (CNNs) have demonstrated remarkable results in image classification for benchmark tasks and practical applications. CNNs with deeper architectures have achieved even higher performance recently thanks to their robustness to the parallel shift of objects in images as well as their numerous parameters and the resulting high expression ability. However, CNNs have limited robustness to other geometric transformations such as scaling and rotation. This limits the performance improvement of the deep CNNs, but there is no established solution. This study focuses on scale transformation and proposes a network architecture called the weight-shared multi-stage network (WSMS-Net), which consists of multiple stages of CNNs. The proposed WSMS-Net is easily combined with existing deep CNNs such as ResNet and DenseNet and enables them to acquire robustness to object scaling. Experimental results on the CIFAR-10, CIFAR-100, and ImageNet datasets demonstrate that existing deep CNNs combined with the proposed WSMS-Net achieve higher accuracies for image classification tasks with only a minor increase in the number of parameters and computation time. |
Tasks | Image Classification |
Published | 2017-02-12 |
URL | http://arxiv.org/abs/1702.03505v3 |
http://arxiv.org/pdf/1702.03505v3.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-weight-shared-multi-stage-network |
Repo | |
Framework | |