Paper Group ANR 1023
Transcriptional Response of SK-N-AS Cells to Methamidophos
Title | Transcriptional Response of SK-N-AS Cells to Methamidophos |
Authors | Akos Vertes, Albert-Baskar Arul, Peter Avar, Andrew R. Korte, Lida Parvin, Ziad J. Sahab, Deborah I. Bunin, Merrill Knapp, Denise Nishita, Andrew Poggio, Mark-Oliver Stehr, Carolyn L. Talcott, Brian M. Davis, Christine A. Morton, Christopher J. Sevinsky, Maria I. Zavodszky |
Abstract | The transcriptomic response of SK-N-AS cells to methamidophos (an acetylcholinesterase inhibitor) exposure was measured at 10 time points between 0.5 and 48 h. The data were analyzed using a combination of traditional statistical methods and novel machine learning algorithms to detect anomalous behavior and infer causal relations between time profiles. We identified several processes that appeared to be upregulated in cells treated with methamidophos, including the unfolded protein response, response to cAMP, calcium ion response, and cell-cell signaling. The data confirmed the expected consequence of acetylcholine buildup. In addition, transcripts with potentially key roles were identified, and causal networks relating these transcripts were inferred using two different computational methods: Siamese convolutional networks and time warp causal inference. Two types of anomaly detection algorithms, one based on autoencoders and the other on Generative Adversarial Networks (GANs), were applied to narrow down the set of relevant transcripts. |
Tasks | Anomaly Detection, Causal Inference |
Published | 2019-08-11 |
URL | https://arxiv.org/abs/1908.03841v1 |
https://arxiv.org/pdf/1908.03841v1.pdf | |
PWC | https://paperswithcode.com/paper/transcriptional-response-of-sk-n-as-cells-to |
Repo | |
Framework | |
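The abstract names two anomaly detectors but does not spell them out. As a hedged illustration of the autoencoder variant, the sketch below scores transcripts by reconstruction error; a linear autoencoder (the PCA closed form) stands in for the authors' model, and the data are synthetic.

```python
import numpy as np

def fit_linear_autoencoder(train: np.ndarray, k: int = 2):
    """A k-unit linear autoencoder is equivalent to projecting onto the
    top-k principal components, so we use that closed form via SVD."""
    mean = train.mean(axis=0)
    _, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, Vt[:k]

def anomaly_scores(X: np.ndarray, mean: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Per-transcript reconstruction error: encode, decode, measure residual."""
    Xc = X - mean
    return np.linalg.norm(Xc - Xc @ W.T @ W, axis=1)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 10)                 # 10 time points, as in the study
basis = np.stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
coeffs = rng.normal(size=(50, 2))
normal = coeffs @ basis + 0.01 * rng.normal(size=(50, 10))   # typical profiles
spike = np.zeros((1, 10))
spike[0, 5] = 5.0                             # one anomalous time profile
mean, W = fit_linear_autoencoder(normal)
scores = anomaly_scores(np.vstack([normal, spike]), mean, W)
# the spiked profile receives the largest reconstruction error
```

Transcripts with the largest errors would then be flagged for causal-network inference downstream.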
Signal propagation in continuous approximations of binary neural networks
Title | Signal propagation in continuous approximations of binary neural networks |
Authors | George Stamatescu, Federica Gerace, Carlo Lucibello, Ian Fuss, Langford B. White |
Abstract | The training of stochastic neural network models with binary ($\pm1$) weights and activations via a deterministic and continuous surrogate network is investigated. We derive, using mean field theory, a set of scalar equations describing how input signals propagate through the surrogate network. The equations reveal that these continuous models exhibit an order-to-chaos transition, and the presence of depth scales that limit the maximum trainable depth. Moreover, we predict theoretically, and confirm numerically, that common weight initialization schemes used in standard continuous networks, when applied to the mean values of the stochastic binary weights, yield poor training performance. This study shows that, contrary to common intuition, the means of the stochastic binary weights should be initialised close to $\pm 1$ for deeper networks to be trainable. |
Tasks | |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.00177v1 |
http://arxiv.org/pdf/1902.00177v1.pdf | |
PWC | https://paperswithcode.com/paper/signal-propagation-in-continuous |
Repo | |
Framework | |
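The paper's scalar propagation equations are not reproduced in the abstract; the sketch below iterates a generic mean-field length map of the same family, assuming a tanh surrogate activation for illustration (not the paper's exact equations).

```python
import numpy as np

def variance_map(q: float, sigma_w2: float, sigma_b2: float, n: int = 101) -> float:
    """One step of the mean-field length recursion
        q_{l+1} = sigma_w^2 * E_{z~N(0,1)}[phi(sqrt(q_l) * z)^2] + sigma_b^2,
    with phi = tanh. The Gaussian expectation is computed by Gauss-Hermite
    quadrature in the probabilists' convention (weight exp(-x^2/2))."""
    x, w = np.polynomial.hermite_e.hermegauss(n)
    w = w / np.sqrt(2.0 * np.pi)               # weights now sum to 1
    return sigma_w2 * float(np.sum(w * np.tanh(np.sqrt(q) * x) ** 2)) + sigma_b2

# iterate to the fixed point q*; the depth scales discussed in the abstract
# are governed by the slope of maps like this one at their fixed point
q = 1.0
for _ in range(100):
    q = variance_map(q, sigma_w2=2.0, sigma_b2=0.01)
```

Whether a signal's variance converges or the map's slope exceeds one at the fixed point is what separates the ordered phase from the chaotic one.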
Transformer-based Acoustic Modeling for Hybrid Speech Recognition
Title | Transformer-based Acoustic Modeling for Hybrid Speech Recognition |
Authors | Yongqiang Wang, Abdelrahman Mohamed, Duc Le, Chunxi Liu, Alex Xiao, Jay Mahadeokar, Hongzhao Huang, Andros Tjandra, Xiaohui Zhang, Frank Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer |
Abstract | We propose and evaluate transformer-based acoustic models (AMs) for hybrid speech recognition. Several modeling choices are discussed in this work, including various positional embedding methods and an iterated loss to enable training deep transformers. We also present a preliminary study of using limited right context in transformer models, which makes streaming applications possible. We demonstrate that on the widely used Librispeech benchmark, our transformer-based AM outperforms the best published hybrid result by 19% to 26% relative when the standard n-gram language model (LM) is used. Combined with a neural network LM for rescoring, our proposed approach achieves state-of-the-art results on Librispeech. Our findings are also confirmed on a much larger internal dataset. |
Tasks | Language Modelling, Speech Recognition |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.09799v1 |
https://arxiv.org/pdf/1910.09799v1.pdf | |
PWC | https://paperswithcode.com/paper/transformer-based-acoustic-modeling-for |
Repo | |
Framework | |
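The limited-right-context idea can be pictured as an attention mask: each frame attends to the full left context but only a bounded number of future frames. A minimal, hedged sketch of that masking pattern (not the paper's exact scheme):

```python
import numpy as np

def limited_context_mask(num_frames: int, right_context: int) -> np.ndarray:
    """Boolean self-attention mask: frame t may attend to every past frame
    and to at most `right_context` future frames (True = attention allowed).
    Bounding the right context bounds lookahead latency, which is what
    enables streaming decoding."""
    idx = np.arange(num_frames)
    return idx[None, :] <= idx[:, None] + right_context

mask = limited_context_mask(5, right_context=1)
# in a transformer layer, disallowed positions would receive -inf
# added to the attention logits before the softmax
```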
A Proposal-based Approach for Activity Image-to-Video Retrieval
Title | A Proposal-based Approach for Activity Image-to-Video Retrieval |
Authors | Ruicong Xu, Li Niu, Jianfu Zhang, Liqing Zhang |
Abstract | The activity image-to-video retrieval task aims to retrieve videos containing an activity similar to that in the query image, which is challenging because videos generally have many background segments irrelevant to the activity. In this paper, we utilize the R-C3D model to represent a video by a bag of activity proposals, which can filter out background segments to some extent. However, there are still noisy proposals in each bag. Thus, we propose an Activity Proposal-based Image-to-Video Retrieval (APIVR) approach, which incorporates multi-instance learning into a cross-modal retrieval framework to address the proposal noise issue. Specifically, we propose a Graph Multi-Instance Learning (GMIL) module with a graph convolutional layer, and integrate this module with classification loss, adversarial loss, and triplet loss in our cross-modal retrieval framework. Moreover, we propose a geometry-aware triplet loss based on point-to-subspace distance to preserve the structural information of activity proposals. Extensive experiments on three widely-used datasets verify the effectiveness of our approach. |
Tasks | Cross-Modal Retrieval, Video Retrieval |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/1911.10531v1 |
https://arxiv.org/pdf/1911.10531v1.pdf | |
PWC | https://paperswithcode.com/paper/a-proposal-based-approach-for-activity-image |
Repo | |
Framework | |
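The geometry-aware triplet loss rests on a point-to-subspace distance. The sketch below shows that distance for a bag of proposal features; it illustrates the geometric ingredient only, not the paper's full loss.

```python
import numpy as np

def point_to_subspace_distance(x: np.ndarray, proposals: np.ndarray) -> float:
    """Distance from a query feature x to the subspace spanned by the rows
    of `proposals`: orthonormalize the proposals (QR), project x onto the
    subspace, and take the norm of the residual."""
    Q, _ = np.linalg.qr(proposals.T)      # columns of Q: orthonormal basis
    residual = x - Q @ (Q.T @ x)
    return float(np.linalg.norm(residual))

# the subspace spanned by e1 and e2 leaves only the third coordinate of x
d = point_to_subspace_distance(np.array([3.0, 4.0, 5.0]),
                               np.array([[1.0, 0.0, 0.0],
                                         [0.0, 1.0, 0.0]]))
```

A triplet loss would then pull the query's distance to its matching bag's subspace below its distance to non-matching ones by a margin.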
Hindi Visual Genome: A Dataset for Multimodal English-to-Hindi Machine Translation
Title | Hindi Visual Genome: A Dataset for Multimodal English-to-Hindi Machine Translation |
Authors | Shantipriya Parida, Ondřej Bojar, Satya Ranjan Dash |
Abstract | Visual Genome is a dataset connecting structured image information with the English language. We present "Hindi Visual Genome", a multimodal dataset consisting of text and images suitable for the English-Hindi multimodal machine translation task and for multimodal research. We selected short English segments (captions) from Visual Genome along with the associated images and automatically translated them to Hindi, with manual post-editing that took the associated images into account. We prepared a set of 31,525 segments, accompanied by a challenge test set of 1,400 segments. This challenge test set was created by searching for (particularly) ambiguous English words based on embedding similarity and manually selecting those where the image helps to resolve the ambiguity. Our dataset is the first for multimodal English-Hindi machine translation and is freely available for non-commercial research purposes. Our Hindi version of Visual Genome also makes it possible to create Hindi image labelers or other practical tools. Hindi Visual Genome also serves in the Workshop on Asian Translation (WAT) 2019 Multi-Modal Translation Task. |
Tasks | Machine Translation, Multimodal Machine Translation |
Published | 2019-07-21 |
URL | https://arxiv.org/abs/1907.08948v1 |
https://arxiv.org/pdf/1907.08948v1.pdf | |
PWC | https://paperswithcode.com/paper/hindi-visual-genome-a-dataset-for-multimodal |
Repo | |
Framework | |
Design Space of Behaviour Planning for Autonomous Driving
Title | Design Space of Behaviour Planning for Autonomous Driving |
Authors | Marko Ilievski, Sean Sedwards, Ashish Gaurav, Aravind Balakrishnan, Atrisha Sarkar, Jaeyoung Lee, Frédéric Bouchard, Ryan De Iaco, Krzysztof Czarnecki |
Abstract | We explore the complex design space of behaviour planning for autonomous driving. Design choices that successfully address one aspect of behaviour planning can critically constrain others. To aid the design process, we decompose the design space with respect to important choices arising from current state-of-the-art approaches and describe the resulting trade-offs. In doing so, we also identify interesting directions for future work. |
Tasks | Autonomous Driving |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.07931v1 |
https://arxiv.org/pdf/1908.07931v1.pdf | |
PWC | https://paperswithcode.com/paper/190807931 |
Repo | |
Framework | |
Language Grounding through Social Interactions and Curiosity-Driven Multi-Goal Learning
Title | Language Grounding through Social Interactions and Curiosity-Driven Multi-Goal Learning |
Authors | Nicolas Lair, Cédric Colas, Rémy Portelas, Jean-Michel Dussoux, Peter Ford Dominey, Pierre-Yves Oudeyer |
Abstract | Autonomous reinforcement learning agents, like children, do not have access to predefined goals and reward functions. They must discover potential goals, learn their own reward functions and engage in their own learning trajectory. Children, however, benefit from exposure to language, helping to organize and mediate their thought. We propose LE2 (Language Enhanced Exploration), a learning algorithm leveraging intrinsic motivations and natural language (NL) interactions with a descriptive social partner (SP). Using NL descriptions from the SP, it can learn an NL-conditioned reward function to formulate goals for intrinsically motivated goal exploration and learn a goal-conditioned policy. By exploring, collecting descriptions from the SP and jointly learning the reward function and the policy, the agent grounds NL descriptions into real behavioral goals. From simple goals discovered early to more complex goals discovered by experimenting on simpler ones, our agent autonomously builds its own behavioral repertoire. This naturally occurring curriculum is supplemented by an active learning curriculum resulting from the agent’s intrinsic motivations. Experiments are presented with a simulated robotic arm that interacts with several objects including tools. |
Tasks | Active Learning |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03219v1 |
https://arxiv.org/pdf/1911.03219v1.pdf | |
PWC | https://paperswithcode.com/paper/language-grounding-through-social |
Repo | |
Framework | |
On a method to construct exponential families by representation theory
Title | On a method to construct exponential families by representation theory |
Authors | Koichi Tojo, Taro Yoshino |
Abstract | Exponential families play an important role in information geometry. In arXiv:1811.01394, we introduced a method to construct an exponential family $\mathcal{P}=\{p_\theta\}_{\theta\in\Theta}$ on a homogeneous space $G/H$ from a pair $(V,v_0)$. Here $V$ is a representation of $G$ and $v_0$ is an $H$-fixed vector in $V$. Two questions then naturally arise: (Q1) when is the correspondence $\theta\mapsto p_\theta$ injective? (Q2) when do distinct pairs $(V,v_0)$ and $(V',v_0')$ generate the same family? In this paper, we answer these two questions (Theorems 1 and 2). Moreover, in Section 3, we consider the case $(G,H)=(\mathbb{R}_{>0}, \{1\})$ with a certain representation on $\mathbb{R}^2$. We then see that the family obtained by our method is essentially the generalized inverse Gaussian (GIG) distribution. |
Tasks | |
Published | 2019-07-06 |
URL | https://arxiv.org/abs/1907.04212v1 |
https://arxiv.org/pdf/1907.04212v1.pdf | |
PWC | https://paperswithcode.com/paper/on-a-method-to-construct-exponential-families |
Repo | |
Framework | |
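For readers outside information geometry, the standard natural-parameter form underlying the abstract is worth recalling; this is the textbook definition, not the paper's specific $(V, v_0)$ construction:

```latex
% exponential family on a sample space X with sufficient statistic T : X -> V
% and log-partition function psi:
p_\theta(x) = h(x)\,\exp\bigl(\langle \theta, T(x) \rangle - \psi(\theta)\bigr),
\qquad
\psi(\theta) = \log \int_X h(x)\, e^{\langle \theta, T(x) \rangle}\, d\mu(x).

% the generalized inverse Gaussian (GIG) family on R_{>0}, mentioned in the
% abstract, is of this form:
p_{a,b,\lambda}(x) \propto x^{\lambda - 1}
  \exp\!\Bigl(-\tfrac{1}{2}\bigl(a x + \tfrac{b}{x}\bigr)\Bigr).
```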
Few-Shot Knowledge Graph Completion
Title | Few-Shot Knowledge Graph Completion |
Authors | Chuxu Zhang, Huaxiu Yao, Chao Huang, Meng Jiang, Zhenhui Li, Nitesh V. Chawla |
Abstract | Knowledge graphs (KGs) serve as useful resources for various natural language processing applications. Previous KG completion approaches require a large number of training instances (i.e., head-tail entity pairs) for every relation. In reality, however, very few entity pairs are available for most relations. Existing work on one-shot learning limits method generalizability to few-shot scenarios and does not fully use the supervisory information; moreover, few-shot KG completion has not been well studied yet. In this work, we propose a novel few-shot relation learning model (FSRL) that aims at discovering facts of new relations with few-shot references. FSRL can effectively capture knowledge from heterogeneous graph structure, aggregate representations of few-shot references, and match similar entity pairs of the reference set for every relation. Extensive experiments on two public datasets demonstrate that FSRL outperforms the state-of-the-art. |
Tasks | Knowledge Graph Completion, Knowledge Graphs, One-Shot Learning |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11298v1 |
https://arxiv.org/pdf/1911.11298v1.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-knowledge-graph-completion |
Repo | |
Framework | |
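The matching step described in the abstract (aggregate the few-shot references for a relation, then score candidate entity pairs against them) can be sketched minimally; mean pooling and cosine similarity stand in here for the paper's learned aggregator and matching network.

```python
import numpy as np

def match_score(query: np.ndarray, references: np.ndarray) -> float:
    """Score a candidate entity-pair embedding against a few-shot reference
    set for a relation: mean-pool the references, then compare by cosine
    similarity. Both steps are deliberate simplifications of the learned
    modules named in the abstract."""
    ref = references.mean(axis=0)
    return float(query @ ref /
                 (np.linalg.norm(query) * np.linalg.norm(ref)))

refs = np.array([[1.0, 0.1], [0.9, -0.1]])    # 2-shot reference embeddings
good = match_score(np.array([1.0, 0.0]), refs)  # aligned candidate
bad = match_score(np.array([0.0, 1.0]), refs)   # orthogonal candidate
```

Candidate pairs would be ranked by this score, with the top-ranked ones predicted as new facts for the relation.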
Comment: Reflections on the Deconfounder
Title | Comment: Reflections on the Deconfounder |
Authors | Alexander D’Amour |
Abstract | The aim of this comment (set to appear in a formal discussion in JASA) is to draw out some conclusions from an extended back-and-forth I have had with Wang and Blei regarding the deconfounder method proposed in “The Blessings of Multiple Causes” [arXiv:1805.06826]. I will make three points here. First, in my role as the critic in this conversation, I will summarize some arguments about the lack of causal identification in the bulk of settings where the “informal” message of the paper suggests that the deconfounder could be used. This is a point that is discussed at length in D’Amour 2019 [arXiv:1902.10286], which motivated the results concerning causal identification in Theorems 6–8 of “Blessings”. Second, I will argue that adding parametric assumptions to the working model in order to obtain identification of causal parameters (a strategy followed in Theorem 6 and in the experimental examples) is a risky strategy, and should only be done when extremely strong prior information is available. Finally, I will consider the implications of the nonparametric identification results provided for a narrow, but non-trivial, set of causal estimands in Theorems 7 and 8. I will highlight that these results may be even more interesting from the perspective of detecting causal identification from observed data, under relatively weak assumptions about confounders. |
Tasks | Causal Identification |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.08042v1 |
https://arxiv.org/pdf/1910.08042v1.pdf | |
PWC | https://paperswithcode.com/paper/comment-reflections-on-the-deconfounder |
Repo | |
Framework | |
Estimating Individualized Treatment Regimes from Crossover Designs
Title | Estimating Individualized Treatment Regimes from Crossover Designs |
Authors | Crystal T. Nguyen, Daniel J. Luckett, Anna R. Kahkoska, Grace E. Shearrer, Donna Spruijt-Metz, Jaimie N. Davis, Michael R. Kosorok |
Abstract | The field of precision medicine aims to tailor treatment based on patient-specific factors in a reproducible way. To this end, estimating an optimal individualized treatment regime (ITR) that recommends treatment decisions based on patient characteristics to maximize the mean of a pre-specified outcome is of particular interest. Several methods have been proposed for estimating an optimal ITR from clinical trial data in the parallel group setting where each subject is randomized to a single intervention. However, little work has been done in the area of estimating the optimal ITR from crossover study designs. Such designs naturally lend themselves to precision medicine, because they allow for observing the response to multiple treatments for each patient. In this paper, we introduce a method for estimating the optimal ITR using data from a 2x2 crossover study with or without carryover effects. The proposed method is similar to policy search methods such as outcome weighted learning; however, we take advantage of the crossover design by using the difference in responses under each treatment as the observed reward. We establish Fisher and global consistency, present numerical experiments, and analyze data from a feeding trial to demonstrate the improved performance of the proposed method compared to standard methods for a parallel study design. |
Tasks | |
Published | 2019-02-05 |
URL | http://arxiv.org/abs/1902.05499v1 |
http://arxiv.org/pdf/1902.05499v1.pdf | |
PWC | https://paperswithcode.com/paper/estimating-individualized-treatment-regimes |
Repo | |
Framework | |
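The abstract's key observation is that a crossover design gives each subject's response difference between treatments, which can serve directly as the reward. A hedged sketch of that idea, using a weighted logistic classifier in the spirit of outcome weighted learning rather than the paper's estimator:

```python
import numpy as np

def fit_itr(X: np.ndarray, diff: np.ndarray,
            lr: float = 0.5, steps: int = 500) -> np.ndarray:
    """Estimate a treatment rule from crossover data, where
    diff[i] = Y_i(A) - Y_i(B) is subject i's observed response difference.
    Fit a weighted logistic classifier with label sign(diff) and sample
    weight |diff|; the learned rule recommends A when [X, 1] @ w > 0."""
    y = (diff > 0).astype(float)
    sample_w = np.abs(diff)                        # bigger gap, bigger say
    Xb = np.hstack([X, np.ones((len(X), 1))])      # add intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (sample_w * (p - y)) / len(X)
    return w

# toy crossover data: treatment A is better exactly when the covariate x > 0
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 1))
diff = X[:, 0] + 0.1 * rng.normal(size=200)
w = fit_itr(X, diff)
```

The fitted rule recovers the sign structure: a positive coefficient on the covariate means A is recommended for subjects with x > 0.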
A Comparative Review of Recent Kinect-based Action Recognition Algorithms
Title | A Comparative Review of Recent Kinect-based Action Recognition Algorithms |
Authors | Lei Wang, Du Q. Huynh, Piotr Koniusz |
Abstract | Video-based human action recognition is currently one of the most active research areas in computer vision. Various research studies indicate that the performance of action recognition is highly dependent on the type of features being extracted and how the actions are represented. Since the release of the Kinect camera, a large number of Kinect-based human action recognition techniques have been proposed in the literature. However, there still does not exist a thorough comparison of these Kinect-based techniques under the grouping of feature types, such as handcrafted versus deep learning features and depth-based versus skeleton-based features. In this paper, we analyze and compare ten recent Kinect-based algorithms for both cross-subject action recognition and cross-view action recognition using six benchmark datasets. In addition, we have implemented and improved some of these techniques and included their variants in the comparison. Our experiments show that the majority of methods perform better on cross-subject action recognition than cross-view action recognition, that skeleton-based features are more robust for cross-view recognition than depth-based features, and that deep learning features are suitable for large datasets. |
Tasks | Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.09955v1 |
https://arxiv.org/pdf/1906.09955v1.pdf | |
PWC | https://paperswithcode.com/paper/a-comparative-review-of-recent-kinect-based |
Repo | |
Framework | |
Utilizing Deep Learning Towards Multi-modal Bio-sensing and Vision-based Affective Computing
Title | Utilizing Deep Learning Towards Multi-modal Bio-sensing and Vision-based Affective Computing |
Authors | Siddharth Siddharth, Tzyy-Ping Jung, Terrence J. Sejnowski |
Abstract | In recent years, the use of bio-sensing signals such as the electroencephalogram (EEG) and electrocardiogram (ECG) has garnered interest for applications in affective computing. The parallel trend of deep learning has led to a huge leap in performance on various vision-based research problems such as object detection. Yet, these advances in deep learning have not adequately translated into bio-sensing research. This work applies novel deep-learning-based methods to the bio-sensing and video data of four publicly available multi-modal emotion datasets. For each dataset, we first individually evaluate the emotion-classification performance obtained by each modality. We then evaluate the performance obtained by fusing the features from these modalities. We show that our algorithms outperform the results reported by other studies for emotion/valence/arousal/liking classification on the DEAP and MAHNOB-HCI datasets and set up benchmarks for the newer AMIGOS and DREAMER datasets. We also evaluate the performance of our algorithms by combining the datasets and by using transfer learning, showing that the proposed method overcomes the inconsistencies between the datasets. In total, we thus analyze multi-modal affective data from more than 120 subjects and 2,800 trials. Finally, utilizing a convolution-deconvolution network, we propose a new technique for identifying the salient brain regions corresponding to various affective states. |
Tasks | EEG, Emotion Classification, Object Detection, Transfer Learning |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.07039v1 |
https://arxiv.org/pdf/1905.07039v1.pdf | |
PWC | https://paperswithcode.com/paper/utilizing-deep-learning-towards-multi-modal |
Repo | |
Framework | |
Shallow Unorganized Neural Networks using Smart Neuron Model for Visual Perception
Title | Shallow Unorganized Neural Networks using Smart Neuron Model for Visual Perception |
Authors | Richard Jiang, Danny Crookes |
Abstract | The recent success of Deep Neural Networks (DNNs) has revealed the significant capability of neural computing in many challenging applications. Although DNNs are derived from emulating biological neurons, there still exist doubts over whether or not DNNs are the final and best model to emulate the mechanism of human intelligence. In particular, there are two discrepancies between computational DNN models and the observed facts of biological neurons. First, human neurons are interconnected randomly, while DNNs need carefully-designed architectures to work properly. Second, human neurons usually have a long spiking latency (~100 ms), which implies that not many layers can be involved in making a decision, while DNNs can have hundreds of layers to guarantee high accuracy. In this paper, we propose a new computational model, namely shallow unorganized neural networks (SUNNs), in contrast to ANNs/DNNs. The proposed SUNNs differ from standard ANNs or DNNs in three fundamental aspects: 1) SUNNs are based on an adaptive neuron cell model, Smart Neurons, that allows each artificial neuron cell to adaptively respond to its inputs rather than carrying out a fixed weighted-sum operation like the classic neuron model in ANNs/DNNs; 2) SUNNs can cope with computational tasks using very shallow architectures; 3) SUNNs have a natural topology with random interconnections, as the human brain does, and as proposed by Turing's B-type unorganized machines. We implemented the proposed SUNN architecture and tested it on a number of unsupervised early-stage visual perception tasks. Surprisingly, such simple shallow architectures achieved very good results in our experiments. The success of our new computational model makes it the first workable example of Turing's B-type unorganized machine that can achieve comparable or better performance relative to state-of-the-art algorithms. |
Tasks | |
Published | 2019-07-21 |
URL | https://arxiv.org/abs/1907.09050v2 |
https://arxiv.org/pdf/1907.09050v2.pdf | |
PWC | https://paperswithcode.com/paper/shallow-unorganized-neural-networks-using |
Repo | |
Framework | |
Improving Prostate Cancer Detection with Breast Histopathology Images
Title | Improving Prostate Cancer Detection with Breast Histopathology Images |
Authors | Umair Akhtar Hasan Khan, Carolin Stürenberg, Oguzhan Gencoglu, Kevin Sandeman, Timo Heikkinen, Antti Rannikko, Tuomas Mirtti |
Abstract | Deep neural networks have brought significant advances to machine-learning-based analysis of digital pathology images, including prostate tissue images. With the help of transfer learning, the classification and segmentation performance of neural network models has been further increased. However, due to the absence of large, extensively annotated, publicly available prostate histopathology datasets, several previous studies employ datasets from well-studied computer vision tasks, such as the ImageNet dataset. In this work, we propose a transfer learning scheme from breast histopathology images to improve prostate cancer detection performance. We validate our approach on annotated prostate whole-slide images, using a publicly available breast histopathology dataset for pre-training. We show that the proposed cross-cancer approach outperforms transfer learning from the ImageNet dataset. |
Tasks | Transfer Learning |
Published | 2019-03-14 |
URL | http://arxiv.org/abs/1903.05769v1 |
http://arxiv.org/pdf/1903.05769v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-prostate-cancer-detection-with |
Repo | |
Framework | |
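The abstract's cross-cancer transfer idea (pre-train on the data-rich task, warm-start the data-poor one) can be sketched generically; logistic regression on synthetic data stands in for deep networks on histopathology images, so every dataset and dimension below is illustrative only.

```python
import numpy as np

def train_logreg(X, y, w0=None, lr=0.1, steps=300):
    """Logistic regression by gradient descent. When w0 is given, it
    warm-starts the weights -- that warm start is the transfer step."""
    w = np.zeros(X.shape[1]) if w0 is None else w0.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

rng = np.random.default_rng(0)
# "source" task (stand-in for breast histopathology): ample labeled data
Xs = rng.normal(size=(500, 5))
ys = (Xs[:, 0] > 0).astype(float)
w_src = train_logreg(Xs, ys)
# "target" task (stand-in for prostate): few labels, shared useful feature
Xt = rng.normal(size=(20, 5))
yt = (Xt[:, 0] > 0).astype(float)
w_tgt = train_logreg(Xt, yt, w0=w_src, steps=50)
```

The transfer pays off exactly when the tasks share discriminative structure, which is the paper's argument for preferring breast tissue over ImageNet as the source domain.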