January 28, 2020

Paper Group ANR 1023

Transcriptional Response of SK-N-AS Cells to Methamidophos

Title Transcriptional Response of SK-N-AS Cells to Methamidophos
Authors Akos Vertes, Albert-Baskar Arul, Peter Avar, Andrew R. Korte, Lida Parvin, Ziad J. Sahab, Deborah I. Bunin, Merrill Knapp, Denise Nishita, Andrew Poggio, Mark-Oliver Stehr, Carolyn L. Talcott, Brian M. Davis, Christine A. Morton, Christopher J. Sevinsky, Maria I. Zavodszky
Abstract The transcriptional response of SK-N-AS cells to methamidophos (an acetylcholinesterase inhibitor) was measured at 10 time points between 0.5 and 48 h. The data were analyzed using a combination of traditional statistical methods and novel machine learning algorithms to detect anomalous behavior and to infer causal relations between time profiles. We identified several processes that appeared to be upregulated in cells treated with methamidophos, including the unfolded protein response, response to cAMP, calcium ion response, and cell-cell signaling. The data confirmed the expected consequence of acetylcholine buildup. In addition, transcripts with potentially key roles were identified, and causal networks relating these transcripts were inferred using two different computational methods: Siamese convolutional networks and time-warp causal inference. Two types of anomaly detection algorithms, one based on autoencoders and the other on generative adversarial networks (GANs), were applied to narrow down the set of relevant transcripts.
Tasks Anomaly Detection, Causal Inference
Published 2019-08-11
URL https://arxiv.org/abs/1908.03841v1
PDF https://arxiv.org/pdf/1908.03841v1.pdf
PWC https://paperswithcode.com/paper/transcriptional-response-of-sk-n-as-cells-to
Repo
Framework
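The autoencoder-based anomaly detection described above reduces, at its core, to scoring each transcript's time profile by reconstruction error and flagging the worst-reconstructed profiles. A minimal sketch of that idea, with the per-timepoint mean standing in for a trained autoencoder's reconstruction (the function names and quantile threshold are illustrative, not from the paper):

```python
def reconstruction_error(profile, reference):
    # Squared error between a transcript's time profile and its
    # reconstruction (here simply the per-timepoint mean over all
    # profiles, standing in for an autoencoder's output).
    return sum((p - r) ** 2 for p, r in zip(profile, reference))

def flag_anomalies(profiles, quantile=0.9):
    # Flag profiles whose reconstruction error exceeds the given
    # quantile of the error distribution.
    n_points = len(profiles[0])
    mean_profile = [sum(p[t] for p in profiles) / len(profiles)
                    for t in range(n_points)]
    errors = [reconstruction_error(p, mean_profile) for p in profiles]
    cutoff = sorted(errors)[int(quantile * (len(errors) - 1))]
    return [i for i, e in enumerate(errors) if e > cutoff]
```

A real autoencoder would replace the mean with a learned nonlinear reconstruction, but the thresholding logic is the same.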

Signal propagation in continuous approximations of binary neural networks

Title Signal propagation in continuous approximations of binary neural networks
Authors George Stamatescu, Federica Gerace, Carlo Lucibello, Ian Fuss, Langford B. White
Abstract The training of stochastic neural network models with binary ($\pm1$) weights and activations via a deterministic and continuous surrogate network is investigated. Using mean field theory, we derive a set of scalar equations describing how input signals propagate through the surrogate network. The equations reveal that these continuous models exhibit an order-to-chaos transition, and the presence of depth scales that limit the maximum trainable depth. Moreover, we predict theoretically, and confirm numerically, that common weight initialization schemes used in standard continuous networks, when applied to the mean values of the stochastic binary weights, yield poor training performance. This study shows that, contrary to common intuition, the means of the stochastic binary weights should be initialised close to $\pm 1$ for deeper networks to be trainable.
Tasks
Published 2019-02-01
URL http://arxiv.org/abs/1902.00177v1
PDF http://arxiv.org/pdf/1902.00177v1.pdf
PWC https://paperswithcode.com/paper/signal-propagation-in-continuous
Repo
Framework
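The "scalar equations" in the abstract are mean-field recursions describing how activation statistics evolve with depth. As illustration, here is the standard length (variance) map for a continuous tanh network, a close relative of the surrogate-network equations the paper derives, but not the paper's own equations (the constants below are illustrative):

```python
import math

def gauss_expectation(f, num=801, lim=6.0):
    # E[f(z)] for z ~ N(0, 1), via simple trapezoidal quadrature.
    h = 2 * lim / (num - 1)
    total = 0.0
    for i in range(num):
        z = -lim + i * h
        total += h * math.exp(-z * z / 2) / math.sqrt(2 * math.pi) * f(z)
    return total

def length_map(q, sigma_w2, sigma_b2):
    # One layer of the mean-field variance recursion for a tanh network:
    # q_{l+1} = sigma_w^2 * E[tanh(sqrt(q_l) * z)^2] + sigma_b^2
    return sigma_w2 * gauss_expectation(
        lambda z: math.tanh(math.sqrt(q) * z) ** 2) + sigma_b2

def fixed_point(sigma_w2, sigma_b2, iters=100):
    # Iterate the variance map to its fixed point q*; with zero bias,
    # a nonzero fixed point appears once sigma_w^2 exceeds 1.
    q = 1.0
    for _ in range(iters):
        q = length_map(q, sigma_w2, sigma_b2)
    return q
```

The order-to-chaos transition itself is diagnosed with a companion recursion on correlations between two inputs; the depth scales come from how fast these maps converge.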

Transformer-based Acoustic Modeling for Hybrid Speech Recognition

Title Transformer-based Acoustic Modeling for Hybrid Speech Recognition
Authors Yongqiang Wang, Abdelrahman Mohamed, Duc Le, Chunxi Liu, Alex Xiao, Jay Mahadeokar, Hongzhao Huang, Andros Tjandra, Xiaohui Zhang, Frank Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer
Abstract We propose and evaluate transformer-based acoustic models (AMs) for hybrid speech recognition. Several modeling choices are discussed in this work, including various positional embedding methods and an iterated loss that enables training deep transformers. We also present a preliminary study of using limited right context in transformer models, which makes streaming applications possible. We demonstrate that on the widely used Librispeech benchmark, our transformer-based AM outperforms the best published hybrid result by 19% to 26% relative when the standard n-gram language model (LM) is used. Combined with a neural network LM for rescoring, our proposed approach achieves state-of-the-art results on Librispeech. Our findings are also confirmed on a much larger internal dataset.
Tasks Language Modelling, Speech Recognition
Published 2019-10-22
URL https://arxiv.org/abs/1910.09799v1
PDF https://arxiv.org/pdf/1910.09799v1.pdf
PWC https://paperswithcode.com/paper/transformer-based-acoustic-modeling-for
Repo
Framework
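The limited-right-context idea mentioned in the abstract can be expressed as an attention mask: each frame sees its full left context but only a bounded lookahead, which is what bounds latency for streaming. A minimal sketch (the mask convention is illustrative; the paper's exact context/chunking scheme may differ):

```python
def limited_context_mask(seq_len, right_context):
    # mask[i][j] is True if output position i may attend to input
    # position j: unrestricted left context, at most `right_context`
    # frames of lookahead to the right.
    return [[j <= i + right_context for j in range(seq_len)]
            for i in range(seq_len)]
```

In a transformer AM the attention logits at masked-out positions are set to negative infinity before the softmax, so the model never depends on far-future frames.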

A Proposal-based Approach for Activity Image-to-Video Retrieval

Title A Proposal-based Approach for Activity Image-to-Video Retrieval
Authors Ruicong Xu, Li Niu, Jianfu Zhang, Liqing Zhang
Abstract The activity image-to-video retrieval task aims to retrieve videos containing an activity similar to that in the query image, which is challenging because videos generally have many background segments irrelevant to the activity. In this paper, we utilize the R-C3D model to represent a video by a bag of activity proposals, which can filter out background segments to some extent. However, there are still noisy proposals in each bag. Thus, we propose an Activity Proposal-based Image-to-Video Retrieval (APIVR) approach, which incorporates multi-instance learning into a cross-modal retrieval framework to address the proposal noise issue. Specifically, we propose a Graph Multi-Instance Learning (GMIL) module with a graph convolutional layer, and integrate this module with classification loss, adversarial loss, and triplet loss in our cross-modal retrieval framework. Moreover, we propose a geometry-aware triplet loss based on point-to-subspace distance to preserve the structural information of activity proposals. Extensive experiments on three widely used datasets verify the effectiveness of our approach.
Tasks Cross-Modal Retrieval, Video Retrieval
Published 2019-11-24
URL https://arxiv.org/abs/1911.10531v1
PDF https://arxiv.org/pdf/1911.10531v1.pdf
PWC https://paperswithcode.com/paper/a-proposal-based-approach-for-activity-image
Repo
Framework
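The geometry-aware triplet loss rests on a point-to-subspace distance: the distance from a query embedding to the span of a video's proposal embeddings. A sketch of that distance via Gram-Schmidt orthonormalisation and projection (pure illustration; the paper computes it inside a learned embedding space):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def point_to_subspace_distance(x, basis):
    # Distance from point x to the linear subspace spanned by `basis`:
    # orthonormalise the basis, project x onto it, and measure the
    # norm of the residual.
    ortho = []
    for b in basis:
        w = list(b)
        for q in ortho:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > 1e-12:  # skip linearly dependent vectors
            ortho.append([wi / norm for wi in w])
    residual = list(x)
    for q in ortho:
        c = dot(residual, q)
        residual = [ri - c * qi for ri, qi in zip(residual, q)]
    return dot(residual, residual) ** 0.5
```

In the triplet loss, this residual norm replaces the usual point-to-point distance, so a query is pulled toward the whole proposal subspace of the matching video rather than a single pooled vector.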

Hindi Visual Genome: A Dataset for Multimodal English-to-Hindi Machine Translation

Title Hindi Visual Genome: A Dataset for Multimodal English-to-Hindi Machine Translation
Authors Shantipriya Parida, Ondřej Bojar, Satya Ranjan Dash
Abstract Visual Genome is a dataset connecting structured image information with the English language. We present "Hindi Visual Genome", a multimodal dataset consisting of text and images suitable for the English-Hindi multimodal machine translation task and for multimodal research. We selected short English segments (captions) from Visual Genome along with the associated images and automatically translated them to Hindi, with manual post-editing that took the associated images into account. We prepared a set of 31,525 segments, accompanied by a challenge test set of 1,400 segments. This challenge test set was created by searching for (particularly) ambiguous English words based on embedding similarity and manually selecting those where the image helps to resolve the ambiguity. Our dataset is the first for multimodal English-Hindi machine translation and is freely available for non-commercial research purposes. Our Hindi version of Visual Genome also makes it possible to create Hindi image labelers or other practical tools. Hindi Visual Genome also serves in the Workshop on Asian Translation (WAT) 2019 Multi-Modal Translation Task.
Tasks Machine Translation, Multimodal Machine Translation
Published 2019-07-21
URL https://arxiv.org/abs/1907.08948v1
PDF https://arxiv.org/pdf/1907.08948v1.pdf
PWC https://paperswithcode.com/paper/hindi-visual-genome-a-dataset-for-multimodal
Repo
Framework

Design Space of Behaviour Planning for Autonomous Driving

Title Design Space of Behaviour Planning for Autonomous Driving
Authors Marko Ilievski, Sean Sedwards, Ashish Gaurav, Aravind Balakrishnan, Atrisha Sarkar, Jaeyoung Lee, Frédéric Bouchard, Ryan De Iaco, Krzysztof Czarnecki
Abstract We explore the complex design space of behaviour planning for autonomous driving. Design choices that successfully address one aspect of behaviour planning can critically constrain others. To aid the design process, in this work we decompose the design space with respect to important choices arising from the current state of the art approaches, and describe the resulting trade-offs. In doing this, we also identify interesting directions of future work.
Tasks Autonomous Driving
Published 2019-08-21
URL https://arxiv.org/abs/1908.07931v1
PDF https://arxiv.org/pdf/1908.07931v1.pdf
PWC https://paperswithcode.com/paper/190807931
Repo
Framework

Language Grounding through Social Interactions and Curiosity-Driven Multi-Goal Learning

Title Language Grounding through Social Interactions and Curiosity-Driven Multi-Goal Learning
Authors Nicolas Lair, Cédric Colas, Rémy Portelas, Jean-Michel Dussoux, Peter Ford Dominey, Pierre-Yves Oudeyer
Abstract Autonomous reinforcement learning agents, like children, do not have access to predefined goals and reward functions. They must discover potential goals, learn their own reward functions and engage in their own learning trajectory. Children, however, benefit from exposure to language, helping to organize and mediate their thought. We propose LE2 (Language Enhanced Exploration), a learning algorithm leveraging intrinsic motivations and natural language (NL) interactions with a descriptive social partner (SP). Using NL descriptions from the SP, it can learn an NL-conditioned reward function to formulate goals for intrinsically motivated goal exploration and learn a goal-conditioned policy. By exploring, collecting descriptions from the SP and jointly learning the reward function and the policy, the agent grounds NL descriptions into real behavioral goals. From simple goals discovered early to more complex goals discovered by experimenting on simpler ones, our agent autonomously builds its own behavioral repertoire. This naturally occurring curriculum is supplemented by an active learning curriculum resulting from the agent’s intrinsic motivations. Experiments are presented with a simulated robotic arm that interacts with several objects including tools.
Tasks Active Learning
Published 2019-11-08
URL https://arxiv.org/abs/1911.03219v1
PDF https://arxiv.org/pdf/1911.03219v1.pdf
PWC https://paperswithcode.com/paper/language-grounding-through-social
Repo
Framework
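The "naturally occurring curriculum" supplemented by intrinsic motivation typically comes down to sampling goals in proportion to a learning-progress signal. A toy sketch of such a sampler (a common intrinsic-motivation heuristic in this literature, not necessarily the paper's exact curriculum rule; the goal names are illustrative):

```python
import random

def sample_goal(learning_progress, rng):
    # Sample a goal with probability proportional to its absolute
    # learning progress, so the agent focuses on goals it is
    # currently improving on; the small epsilon keeps stalled goals
    # reachable.
    goals = list(learning_progress)
    weights = [abs(learning_progress[g]) + 1e-6 for g in goals]
    r = rng.random() * sum(weights)
    acc = 0.0
    for g, w in zip(goals, weights):
        acc += w
        if r <= acc:
            return g
    return goals[-1]
```

In LE2 the goals themselves are NL descriptions collected from the social partner, and the learned NL-conditioned reward function decides whether a sampled goal was achieved.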

On a method to construct exponential families by representation theory

Title On a method to construct exponential families by representation theory
Authors Koichi Tojo, Taro Yoshino
Abstract The exponential family plays an important role in information geometry. In arXiv:1811.01394, we introduced a method to construct an exponential family $\mathcal{P}=\{p_\theta\}_{\theta\in\Theta}$ on a homogeneous space $G/H$ from a pair $(V,v_0)$, where $V$ is a representation of $G$ and $v_0$ is an $H$-fixed vector in $V$. Two questions then naturally arise: (Q1) when is the correspondence $\theta\mapsto p_\theta$ injective? (Q2) when do distinct pairs $(V,v_0)$ and $(V',v_0')$ generate the same family? In this paper, we answer these two questions (Theorems 1 and 2). Moreover, in Section 3, we consider the case $(G,H)=(\mathbb{R}_{>0}, \{1\})$ with a certain representation on $\mathbb{R}^2$ and show that the family obtained by our method is essentially the generalized inverse Gaussian (GIG) distribution.
Tasks
Published 2019-07-06
URL https://arxiv.org/abs/1907.04212v1
PDF https://arxiv.org/pdf/1907.04212v1.pdf
PWC https://paperswithcode.com/paper/on-a-method-to-construct-exponential-families
Repo
Framework
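For reference, the generalized inverse Gaussian family that Section 3 recovers has densities on $\mathbb{R}_{>0}$ of the standard form (here $K_\lambda$ is the modified Bessel function of the second kind; parametrisation conventions vary across texts):

```latex
f(x;\lambda,a,b)
  = \frac{(a/b)^{\lambda/2}}{2\,K_\lambda(\sqrt{ab})}\,
    x^{\lambda-1}\exp\!\Big(-\tfrac{1}{2}\big(ax + b/x\big)\Big),
  \qquad x>0,\ a,b>0,
```

which is indeed an exponential family with sufficient statistics $(\log x,\, x,\, 1/x)$ and natural parameters $(\lambda-1,\, -a/2,\, -b/2)$.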

Few-Shot Knowledge Graph Completion

Title Few-Shot Knowledge Graph Completion
Authors Chuxu Zhang, Huaxiu Yao, Chao Huang, Meng Jiang, Zhenhui Li, Nitesh V. Chawla
Abstract Knowledge graphs (KGs) serve as useful resources for various natural language processing applications. Previous KG completion approaches require a large number of training instances (i.e., head-tail entity pairs) for every relation. In reality, however, very few entity pairs are available for most relations. Existing work on one-shot learning does not fully use the supervisory information and generalizes poorly to few-shot scenarios, and few-shot KG completion has not yet been well studied. In this work, we propose a novel few-shot relation learning model (FSRL) that aims at discovering facts of new relations with few-shot references. FSRL can effectively capture knowledge from heterogeneous graph structure, aggregate representations of few-shot references, and match similar entity pairs of the reference set for every relation. Extensive experiments on two public datasets demonstrate that FSRL outperforms the state of the art.
Tasks Knowledge Graph Completion, Knowledge Graphs, One-Shot Learning
Published 2019-11-26
URL https://arxiv.org/abs/1911.11298v1
PDF https://arxiv.org/pdf/1911.11298v1.pdf
PWC https://paperswithcode.com/paper/few-shot-knowledge-graph-completion
Repo
Framework
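At its simplest, the reference-aggregation-and-matching step in FSRL-style few-shot completion is: pool the few reference pair embeddings into a prototype and score candidate pairs against it. A toy sketch with mean pooling and cosine scoring (FSRL's actual aggregator and matcher are learned networks; this only shows the shape of the computation):

```python
def cosine(u, v):
    # Cosine similarity between two vectors.
    du = sum(a * a for a in u) ** 0.5
    dv = sum(b * b for b in v) ** 0.5
    return sum(a * b for a, b in zip(u, v)) / (du * dv)

def match_score(reference_pairs, candidate):
    # Aggregate the few-shot reference pair embeddings by averaging
    # (standing in for FSRL's learned aggregator), then score a
    # candidate pair embedding by cosine similarity to the prototype.
    dim = len(reference_pairs[0])
    proto = [sum(p[i] for p in reference_pairs) / len(reference_pairs)
             for i in range(dim)]
    return cosine(proto, candidate)
```

Candidate tail entities for a new relation are then ranked by this score against the relation's handful of known pairs.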

Comment: Reflections on the Deconfounder

Title Comment: Reflections on the Deconfounder
Authors Alexander D’Amour
Abstract The aim of this comment (set to appear in a formal discussion in JASA) is to draw out some conclusions from an extended back-and-forth I have had with Wang and Blei regarding the deconfounder method proposed in “The Blessings of Multiple Causes” [arXiv:1805.06826]. I will make three points here. First, in my role as the critic in this conversation, I will summarize some arguments about the lack of causal identification in the bulk of settings where the “informal” message of the paper suggests that the deconfounder could be used. This is a point that is discussed at length in D’Amour 2019 [arXiv:1902.10286], which motivated the results concerning causal identification in Theorems 6–8 of “Blessings”. Second, I will argue that adding parametric assumptions to the working model in order to obtain identification of causal parameters (a strategy followed in Theorem 6 and in the experimental examples) is a risky strategy, and should only be done when extremely strong prior information is available. Finally, I will consider the implications of the nonparametric identification results provided for a narrow, but non-trivial, set of causal estimands in Theorems 7 and 8. I will highlight that these results may be even more interesting from the perspective of detecting causal identification from observed data, under relatively weak assumptions about confounders.
Tasks Causal Identification
Published 2019-10-17
URL https://arxiv.org/abs/1910.08042v1
PDF https://arxiv.org/pdf/1910.08042v1.pdf
PWC https://paperswithcode.com/paper/comment-reflections-on-the-deconfounder
Repo
Framework

Estimating Individualized Treatment Regimes from Crossover Designs

Title Estimating Individualized Treatment Regimes from Crossover Designs
Authors Crystal T. Nguyen, Daniel J. Luckett, Anna R. Kahkoska, Grace E. Shearrer, Donna Spruijt-Metz, Jaimie N. Davis, Michael R. Kosorok
Abstract The field of precision medicine aims to tailor treatment based on patient-specific factors in a reproducible way. To this end, estimating an optimal individualized treatment regime (ITR) that recommends treatment decisions based on patient characteristics to maximize the mean of a pre-specified outcome is of particular interest. Several methods have been proposed for estimating an optimal ITR from clinical trial data in the parallel group setting where each subject is randomized to a single intervention. However, little work has been done in the area of estimating the optimal ITR from crossover study designs. Such designs naturally lend themselves to precision medicine, because they allow for observing the response to multiple treatments for each patient. In this paper, we introduce a method for estimating the optimal ITR using data from a 2x2 crossover study with or without carryover effects. The proposed method is similar to policy search methods such as outcome weighted learning; however, we take advantage of the crossover design by using the difference in responses under each treatment as the observed reward. We establish Fisher and global consistency, present numerical experiments, and analyze data from a feeding trial to demonstrate the improved performance of the proposed method compared to standard methods for a parallel study design.
Tasks
Published 2019-02-05
URL http://arxiv.org/abs/1902.05499v1
PDF http://arxiv.org/pdf/1902.05499v1.pdf
PWC https://paperswithcode.com/paper/estimating-individualized-treatment-regimes
Repo
Framework
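The key observation in the abstract, that a crossover design observes both treatments per subject so the within-subject difference can serve directly as the reward, can be illustrated with a toy decision rule (illustrative only; the paper's estimator is an outcome-weighted-learning-style policy search with consistency guarantees):

```python
def estimate_itr(data):
    # data: list of (covariate, response_under_A, response_under_B)
    # tuples from a 2x2 crossover study, where each subject is
    # observed under both treatments. For each covariate value,
    # recommend the treatment with the larger mean within-subject
    # response difference.
    diffs_by_x = {}
    for x, y_a, y_b in data:
        diffs_by_x.setdefault(x, []).append(y_a - y_b)
    return {x: ('A' if sum(d) / len(d) > 0 else 'B')
            for x, d in diffs_by_x.items()}
```

The within-subject differencing is what removes subject-level confounding; a parallel-group design would have to estimate each arm's mean separately.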

A Comparative Review of Recent Kinect-based Action Recognition Algorithms

Title A Comparative Review of Recent Kinect-based Action Recognition Algorithms
Authors Lei Wang, Du Q. Huynh, Piotr Koniusz
Abstract Video-based human action recognition is currently one of the most active research areas in computer vision. Various research studies indicate that the performance of action recognition is highly dependent on the type of features being extracted and how the actions are represented. Since the release of the Kinect camera, a large number of Kinect-based human action recognition techniques have been proposed in the literature. However, there still does not exist a thorough comparison of these Kinect-based techniques under the grouping of feature types, such as handcrafted versus deep learning features and depth-based versus skeleton-based features. In this paper, we analyze and compare ten recent Kinect-based algorithms for both cross-subject action recognition and cross-view action recognition using six benchmark datasets. In addition, we have implemented and improved some of these techniques and included their variants in the comparison. Our experiments show that the majority of methods perform better on cross-subject action recognition than cross-view action recognition, that skeleton-based features are more robust for cross-view recognition than depth-based features, and that deep learning features are suitable for large datasets.
Tasks Skeleton Based Action Recognition, Temporal Action Localization
Published 2019-06-24
URL https://arxiv.org/abs/1906.09955v1
PDF https://arxiv.org/pdf/1906.09955v1.pdf
PWC https://paperswithcode.com/paper/a-comparative-review-of-recent-kinect-based
Repo
Framework

Utilizing Deep Learning Towards Multi-modal Bio-sensing and Vision-based Affective Computing

Title Utilizing Deep Learning Towards Multi-modal Bio-sensing and Vision-based Affective Computing
Authors Siddharth Siddharth, Tzyy-Ping Jung, Terrence J. Sejnowski
Abstract In recent years, the use of bio-sensing signals such as the electroencephalogram (EEG) and electrocardiogram (ECG) has garnered interest for applications in affective computing. The parallel trend of deep learning has led to a huge leap in performance in solving various vision-based research problems such as object detection. Yet, these advances in deep learning have not adequately translated into bio-sensing research. This work applies novel deep-learning-based methods to the bio-sensing and video data of four publicly available multi-modal emotion datasets. For each dataset, we first individually evaluate the emotion-classification performance obtained by each modality. We then evaluate the performance obtained by fusing the features from these modalities. We show that our algorithms outperform the results reported by other studies for emotion/valence/arousal/liking classification on the DEAP and MAHNOB-HCI datasets and set up benchmarks for the newer AMIGOS and DREAMER datasets. We also evaluate the performance of our algorithms by combining the datasets and by using transfer learning, showing that the proposed method overcomes the inconsistencies between the datasets. In all, we perform a thorough analysis of multi-modal affective data from more than 120 subjects and 2,800 trials. Finally, utilizing a convolution-deconvolution network, we propose a new technique for identifying salient brain regions corresponding to various affective states.
Tasks EEG, Emotion Classification, Object Detection, Transfer Learning
Published 2019-05-16
URL https://arxiv.org/abs/1905.07039v1
PDF https://arxiv.org/pdf/1905.07039v1.pdf
PWC https://paperswithcode.com/paper/utilizing-deep-learning-towards-multi-modal
Repo
Framework

Shallow Unorganized Neural Networks using Smart Neuron Model for Visual Perception

Title Shallow Unorganized Neural Networks using Smart Neuron Model for Visual Perception
Authors Richard Jiang, Danny Crookes
Abstract The recent success of Deep Neural Networks (DNNs) has revealed the significant capability of neural computing in many challenging applications. Although DNNs are derived from emulating biological neurons, there remain doubts over whether DNNs are the final and best model for emulating the mechanism of human intelligence. In particular, there are two discrepancies between computational DNN models and the observed behavior of biological neurons. First, human neurons are interconnected randomly, while DNNs need carefully designed architectures to work properly. Second, human neurons usually have a long spiking latency (~100 ms), which implies that not many layers can be involved in making a decision, while DNNs may have hundreds of layers to guarantee high accuracy. In this paper, we propose a new computational model, shallow unorganized neural networks (SUNNs), in contrast to ANNs/DNNs. The proposed SUNNs differ from standard ANNs or DNNs in three fundamental aspects: 1) SUNNs are based on an adaptive neuron cell model, Smart Neurons, that allows each artificial neuron cell to adaptively respond to its inputs rather than carrying out a fixed weighted-sum operation like the classic neuron model in ANNs/DNNs; 2) SUNNs can cope with computational tasks using very shallow architectures; 3) SUNNs have a natural topology with random interconnections, as the human brain does, and as proposed by Turing's B-type unorganized machines. We implemented the proposed SUNN architecture and tested it on a number of unsupervised early-stage visual perception tasks. Surprisingly, such simple shallow architectures achieved very good results in our experiments. The success of our new computational model makes it the first workable example of Turing's B-type unorganized machine that can achieve performance comparable to or better than state-of-the-art algorithms.
Tasks
Published 2019-07-21
URL https://arxiv.org/abs/1907.09050v2
PDF https://arxiv.org/pdf/1907.09050v2.pdf
PWC https://paperswithcode.com/paper/shallow-unorganized-neural-networks-using
Repo
Framework

Improving Prostate Cancer Detection with Breast Histopathology Images

Title Improving Prostate Cancer Detection with Breast Histopathology Images
Authors Umair Akhtar Hasan Khan, Carolin Stürenberg, Oguzhan Gencoglu, Kevin Sandeman, Timo Heikkinen, Antti Rannikko, Tuomas Mirtti
Abstract Deep neural networks have introduced significant advancements in the machine learning-based analysis of digital pathology images, including prostate tissue images. With the help of transfer learning, the classification and segmentation performance of neural network models has been further increased. However, due to the absence of large, extensively annotated, publicly available prostate histopathology datasets, several previous studies employ datasets from well-studied computer vision tasks, such as the ImageNet dataset. In this work, we propose a transfer learning scheme from breast histopathology images to improve prostate cancer detection performance. We validate our approach on annotated prostate whole-slide images, using a publicly available breast histopathology dataset for pre-training. We show that the proposed cross-cancer approach outperforms transfer learning from the ImageNet dataset.
Tasks Transfer Learning
Published 2019-03-14
URL http://arxiv.org/abs/1903.05769v1
PDF http://arxiv.org/pdf/1903.05769v1.pdf
PWC https://paperswithcode.com/paper/improving-prostate-cancer-detection-with
Repo
Framework
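The cross-cancer transfer scheme boils down to: initialise the prostate model's backbone with weights pre-trained on breast histopathology, keep the task head at a fresh initialisation, and fine-tune. A toy sketch with models represented as plain weight dictionaries (layer names like 'head' are illustrative, not from the paper):

```python
def transfer_backbone(source, target, head_names=('head',)):
    # Copy pre-trained backbone weights from the source (breast) model
    # into the target (prostate) model wherever layer names match,
    # leaving the task-head layers at their fresh initialisation so
    # they can be trained for the new task.
    for name, weights in source.items():
        if name in target and name not in head_names:
            target[name] = list(weights)
    return target
```

With a real framework this corresponds to loading a pre-trained checkpoint and replacing the final classification layer before fine-tuning on the prostate slides.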