July 27, 2019


Paper Group ANR 476

Discrete Event, Continuous Time RNNs. Demystifying AlphaGo Zero as AlphaGo GAN. Learning Causal Structures Using Regression Invariance. Complexity-Aware Assignment of Latent Values in Discriminative Models for Accurate Gesture Recognition. SenGen: Sentence Generating Neural Variational Topic Model. ReFACTor: Practical Low-Rank Matrix Estimation Und …

Discrete Event, Continuous Time RNNs


Title	Discrete Event, Continuous Time RNNs
Authors	Michael C. Mozer, Denis Kazakov, Robert V. Lindsey
Abstract	We investigate recurrent neural network architectures for event-sequence processing. Event sequences, characterized by discrete observations stamped with continuous-valued times of occurrence, are challenging due to the potentially wide dynamic range of relevant time scales as well as interactions between time scales. We describe four forms of inductive bias that should benefit architectures for event sequences: temporal locality, position and scale homogeneity, and scale interdependence. We extend the popular gated recurrent unit (GRU) architecture to incorporate these biases via intrinsic temporal dynamics, obtaining a continuous-time GRU. The CT-GRU arises by interpreting the gates of a GRU as selecting a time scale of memory, and the CT-GRU generalizes the GRU by incorporating multiple time scales of memory and performing context-dependent selection of time scales for information storage and retrieval. Event time-stamps drive decay dynamics of the CT-GRU, whereas they serve as generic additional inputs to the GRU. Despite the very different manner in which the two models consider time, their performance on eleven data sets we examined is essentially identical. Our surprising results point both to the robustness of GRU and LSTM architectures for handling continuous time, and to the potency of incorporating continuous dynamics into neural architectures.
Tasks
Published	2017-10-11
URL	http://arxiv.org/abs/1710.04110v1
PDF	http://arxiv.org/pdf/1710.04110v1.pdf
PWC	https://paperswithcode.com/paper/discrete-event-continuous-time-rnns
Repo
Framework
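The abstract says that event time-stamps drive the decay dynamics of the CT-GRU and that memory is held at several time scales. A minimal sketch of that idea, assuming simple exponential decay per scale, is given below. The function name `decay_traces` and the three time constants are hypothetical and are not taken from the paper, which defines its own update equations.

```python
import math

def decay_traces(traces, time_scales, dt):
    """Decay each memory trace by exp(-dt / tau) for its own time scale tau."""
    return [h * math.exp(-dt / tau) for h, tau in zip(traces, time_scales)]

# Hypothetical time constants (seconds) for three memory scales.
time_scales = [1.0, 10.0, 100.0]
traces = [1.0, 1.0, 1.0]  # stored value at each scale, for illustration only

# Ten seconds pass before the next event arrives.
decayed = decay_traces(traces, time_scales, dt=10.0)
print([round(h, 3) for h in decayed])  # prints [0.0, 0.368, 0.905]
```

The short-scale trace has essentially vanished after ten seconds while the long-scale trace is nearly intact, which is the sense in which the elapsed time, rather than the number of steps, controls what is remembered.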

Demystifying AlphaGo Zero as AlphaGo GAN


Title	Demystifying AlphaGo Zero as AlphaGo GAN
Authors	Xiao Dong, Jiasong Wu, Ling Zhou
Abstract	The astonishing success of AlphaGo Zero\cite{Silver_AlphaGo} invokes a worldwide discussion of the future of our human society with a mixed mood of hope, anxiousness, excitement and fear. We try to demystify AlphaGo Zero by a qualitative analysis to indicate that AlphaGo Zero can be understood as a specially structured GAN system which is expected to possess an inherent good convergence property. Thus we deduce that the success of AlphaGo Zero may not be a sign of a new generation of AI.
Tasks
Published	2017-11-24
URL	http://arxiv.org/abs/1711.09091v1
PDF	http://arxiv.org/pdf/1711.09091v1.pdf
PWC	https://paperswithcode.com/paper/demystifying-alphago-zero-as-alphago-gan
Repo
Framework

Learning Causal Structures Using Regression Invariance


Title	Learning Causal Structures Using Regression Invariance
Authors	AmirEmad Ghassami, Saber Salehkaleybar, Negar Kiyavash, Kun Zhang
Abstract	We study causal inference in a multi-environment setting, in which the functional relations for producing the variables from their direct causes remain the same across environments, while the distribution of exogenous noises may vary. We introduce the idea of using the invariance of the functional relations of the variables to their causes across a set of environments. We define a notion of completeness for a causal inference algorithm in this setting and prove the existence of such an algorithm by proposing the baseline algorithm. Additionally, we present an alternate algorithm that has significantly improved computational and sample complexity compared to the baseline algorithm. The experimental results show that the proposed algorithm outperforms the other existing algorithms.
Tasks	Causal Inference
Published	2017-05-26
URL	http://arxiv.org/abs/1705.09644v1
PDF	http://arxiv.org/pdf/1705.09644v1.pdf
PWC	https://paperswithcode.com/paper/learning-causal-structures-using-regression
Repo
Framework
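The invariance idea in the abstract can be illustrated with a two-variable toy example. This is a sketch under stated assumptions, not the authors' algorithm: it assumes a linear model X -> Y with Y = 2X + noise, two environments that differ only in the noise scale of X, and hypothetical helper names `ols_slope` and `sample_environment`. Regressing Y on X (the causal direction) should give nearly the same slope in both environments, while regressing X on Y should not.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least-squares regression of ys on xs (with intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

def sample_environment(rng, n, x_noise_sd, b=2.0, y_noise_sd=1.0):
    """Draw n samples from X -> Y, where Y = b * X + Gaussian noise."""
    xs = [rng.gauss(0.0, x_noise_sd) for _ in range(n)]
    ys = [b * x + rng.gauss(0.0, y_noise_sd) for x in xs]
    return xs, ys

rng = random.Random(0)  # fixed seed so the run is repeatable
x1, y1 = sample_environment(rng, 5000, x_noise_sd=0.5)
x2, y2 = sample_environment(rng, 5000, x_noise_sd=2.0)

# Gap between environments for each regression direction.
causal_gap = abs(ols_slope(x1, y1) - ols_slope(x2, y2))      # Y on X
anticausal_gap = abs(ols_slope(y1, x1) - ols_slope(y2, x2))  # X on Y
print(causal_gap < anticausal_gap)
```

In this toy setting the population slope of X on Y is b·Var(X) / (b²·Var(X) + Var(noise)), which moves from 0.25 to about 0.47 between the two environments, whereas the slope of Y on X stays at b = 2.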

Complexity-Aware Assignment of Latent Values in Discriminative Models for Accurate Gesture Recognition


Title	Complexity-Aware Assignment of Latent Values in Discriminative Models for Accurate Gesture Recognition
Authors	Manoel Horta Ribeiro, Bruno Teixeira, Antônio Otávio Fernandes, Wagner Meira Jr., Erickson R. Nascimento
Abstract	Many of the state-of-the-art algorithms for gesture recognition are based on Conditional Random Fields (CRFs). Successful approaches, such as the Latent-Dynamic CRFs, extend the CRF by incorporating latent variables, whose values are mapped to the values of the labels. In this paper we propose a novel methodology to set the latent values according to the gesture complexity. We use a heuristic that iterates through the samples associated with each label value, estimating their complexity. We then use it to assign the latent values to the label values. We evaluate our method on the task of recognizing human gestures from video streams. The experiments were performed on binary datasets, generated by grouping different labels. Our results demonstrate that our approach outperforms the arbitrary one in many cases, increasing the accuracy by up to 10%.
Tasks	Gesture Recognition
Published	2017-04-01
URL	http://arxiv.org/abs/1704.00180v1
PDF	http://arxiv.org/pdf/1704.00180v1.pdf
PWC	https://paperswithcode.com/paper/complexity-aware-assignment-of-latent-values
Repo
Framework

SenGen: Sentence Generating Neural Variational Topic Model


Title	SenGen: Sentence Generating Neural Variational Topic Model
Authors	Ramesh Nallapati, Igor Melnyk, Abhishek Kumar, Bowen Zhou
Abstract	We present a new topic model that generates documents by sampling a topic for one whole sentence at a time, and generating the words in the sentence using an RNN decoder that is conditioned on the topic of the sentence. We argue that this novel formalism will help us not only visualize and model the topical discourse structure in a document better, but also potentially lead to more interpretable topics since we can now illustrate topics by sampling representative sentences instead of bag of words or phrases. We present a variational auto-encoder approach for learning in which we use a factorized variational encoder that independently models the posterior over topical mixture vectors of documents using a feed-forward network, and the posterior over topic assignments to sentences using an RNN. Our preliminary experiments on two different datasets indicate early promise, but also expose many challenges that remain to be addressed.
Tasks
Published	2017-08-01
URL	http://arxiv.org/abs/1708.00308v1
PDF	http://arxiv.org/pdf/1708.00308v1.pdf
PWC	https://paperswithcode.com/paper/sengen-sentence-generating-neural-variational
Repo
Framework

ReFACTor: Practical Low-Rank Matrix Estimation Under Column-Sparsity


Title	ReFACTor: Practical Low-Rank Matrix Estimation Under Column-Sparsity
Authors	Matan Gavish, Regev Schweiger, Elior Rahmani, Eran Halperin
Abstract	Various problems in data analysis and statistical genetics call for recovery of a column-sparse, low-rank matrix from noisy observations. We propose ReFACTor, a simple variation of the classical Truncated Singular Value Decomposition (TSVD) algorithm. In contrast to previous sparse principal component analysis (PCA) algorithms, our algorithm can provably reveal a low-rank signal matrix better, and often significantly better, than the widely used TSVD, making it the algorithm of choice whenever column-sparsity is suspected. Empirically, we observe that ReFACTor consistently outperforms TSVD even when the underlying signal is not sparse, suggesting that it is generally safe to use ReFACTor instead of TSVD and PCA. The algorithm is extremely simple to implement and its running time is dominated by the runtime of PCA, making it as practical as standard principal component analysis.
Tasks
Published	2017-05-22
URL	http://arxiv.org/abs/1705.07654v1
PDF	http://arxiv.org/pdf/1705.07654v1.pdf
PWC	https://paperswithcode.com/paper/refactor-practical-low-rank-matrix-estimation
Repo
Framework

Attenuation correction for brain PET imaging using deep neural network based on dixon and ZTE MR images


Title	Attenuation correction for brain PET imaging using deep neural network based on dixon and ZTE MR images
Authors	Kuang Gong, Jaewon Yang, Kyungsang Kim, Georges El Fakhri, Youngho Seo, Quanzheng Li
Abstract	Positron Emission Tomography (PET) is a functional imaging modality widely used in neuroscience studies. To obtain meaningful quantitative results from PET images, attenuation correction is necessary during image reconstruction. For PET/MR hybrid systems, PET attenuation is challenging as Magnetic Resonance (MR) images do not reflect attenuation coefficients directly. To address this issue, we present deep neural network methods to derive the continuous attenuation coefficients for brain PET imaging from MR images. With only Dixon MR images as the network input, the existing U-net structure was adopted and analysis using forty patient data sets shows it is superior to other Dixon-based methods. When both Dixon and zero echo time (ZTE) images are available, we have proposed a modified U-net structure, named GroupU-net, to efficiently make use of both Dixon and ZTE information through group convolution modules when the network goes deeper. Quantitative analysis based on fourteen real patient data sets demonstrates that both network approaches can perform better than the standard methods, and the proposed network structure can further reduce the PET quantification error compared to the U-net structure.
Tasks	Image Reconstruction
Published	2017-12-17
URL	http://arxiv.org/abs/1712.06203v2
PDF	http://arxiv.org/pdf/1712.06203v2.pdf
PWC	https://paperswithcode.com/paper/attenuation-correction-for-brain-pet-imaging
Repo
Framework

Grounding Visual Explanations (Extended Abstract)


Title	Grounding Visual Explanations (Extended Abstract)
Authors	Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata
Abstract	Existing models which generate textual explanations enforce task relevance through a discriminative term loss function, but such mechanisms only weakly constrain mentioned object parts to actually be present in the image. In this paper, a new model is proposed for generating explanations by utilizing localized grounding of constituent phrases in generated explanations to ensure image relevance. Specifically, we introduce a phrase-critic model to refine (re-score/re-rank) generated candidate explanations and employ a relative-attribute inspired ranking loss using “flipped” phrases as negative examples for training. At test time, our phrase-critic model takes an image and a candidate explanation as input and outputs a score indicating how well the candidate explanation is grounded in the image.
Tasks
Published	2017-11-17
URL	http://arxiv.org/abs/1711.06465v1
PDF	http://arxiv.org/pdf/1711.06465v1.pdf
PWC	https://paperswithcode.com/paper/grounding-visual-explanations-extended
Repo
Framework
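The ranking loss with "flipped" phrases as negatives can be sketched as a standard margin ranking loss. This is an illustration only: the abstract does not give the exact loss, so the hinge form, the margin value, and the names `margin_ranking_loss` and `rerank` are assumptions, and the scores are made-up numbers standing in for the phrase-critic's output.

```python
def margin_ranking_loss(score_true, score_flipped, margin=1.0):
    """Hinge-style loss: zero once the true explanation outscores the
    flipped (negative) one by at least `margin`."""
    return max(0.0, margin - (score_true - score_flipped))

def rerank(candidates):
    """Order (explanation, score) pairs from best to worst score."""
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

# Hypothetical critic scores for a grounded phrase and its flipped variant.
print(margin_ranking_loss(2.0, 0.5))  # prints 0.0 (already separated)
print(margin_ranking_loss(0.5, 0.5))  # prints 1.0 (no separation yet)

# At test time the critic's score is used to re-rank candidate explanations.
ranked = rerank([("has a red beak", 0.2), ("has a yellow beak", 1.4)])
print(ranked[0][0])  # prints has a yellow beak
```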

Multidimensional classification of hippocampal shape features discriminates Alzheimer’s disease and mild cognitive impairment from normal aging


Title	Multidimensional classification of hippocampal shape features discriminates Alzheimer’s disease and mild cognitive impairment from normal aging
Authors	Emilie Gerardin, Gaël Chételat, Marie Chupin, Rémi Cuingnet, Béatrice Desgranges, Ho-Sung Kim, Marc Niethammer, Bruno Dubois, Stéphane Lehéricy, Line Garnero, Francis Eustache, Olivier Colliot
Abstract	We describe a new method to automatically discriminate between patients with Alzheimer’s disease (AD) or mild cognitive impairment (MCI) and elderly controls, based on multidimensional classification of hippocampal shape features. This approach uses spherical harmonics (SPHARM) coefficients to model the shape of the hippocampi, which are segmented from magnetic resonance images (MRI) using a fully automatic method that we previously developed. SPHARM coefficients are used as features in a classification procedure based on support vector machines (SVM). The most relevant features for classification are selected using a bagging strategy. We evaluate the accuracy of our method in a group of 23 patients with AD (10 males, 13 females, age ± standard deviation (SD) = 73 ± 6 years, mini-mental score (MMS) = 24.4 ± 2.8), 23 patients with amnestic MCI (10 males, 13 females, age ± SD = 74 ± 8 years, MMS = 27.3 ± 1.4) and 25 elderly healthy controls (13 males, 12 females, age ± SD = 64 ± 8 years), using leave-one-out cross-validation. For AD vs controls, we obtain a correct classification rate of 94%, a sensitivity of 96%, and a specificity of 92%. For MCI vs controls, we obtain a classification rate of 83%, a sensitivity of 83%, and a specificity of 84%. This accuracy is superior to that of hippocampal volumetry and is comparable to recently published SVM-based whole-brain classification methods, which relied on a different strategy. This new method may become a useful tool to assist in the diagnosis of Alzheimer’s disease.
Tasks
Published	2017-07-19
URL	http://arxiv.org/abs/1707.05961v1
PDF	http://arxiv.org/pdf/1707.05961v1.pdf
PWC	https://paperswithcode.com/paper/multidimensional-classification-of
Repo
Framework
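The reported rates can be checked against the group sizes with a short calculation. The abstract does not list the confusion counts; the counts below are inferred as the ones that reproduce the reported percentages for 23 patients and 25 controls, and should be read as an assumption rather than as figures from the paper. The function name `classification_rates` is hypothetical.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as whole percentages."""
    sensitivity = tp / (tp + fn)   # patients correctly identified
    specificity = tn / (tn + fp)   # controls correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return tuple(round(100 * r) for r in (accuracy, sensitivity, specificity))

# Inferred counts (assumption): 22 of 23 AD patients and 23 of 25 controls correct.
print(classification_rates(tp=22, fn=1, tn=23, fp=2))  # prints (94, 96, 92)

# Inferred counts (assumption): 19 of 23 MCI patients and 21 of 25 controls correct.
print(classification_rates(tp=19, fn=4, tn=21, fp=4))  # prints (83, 83, 84)
```

With leave-one-out cross-validation each of the 48 subjects is classified once by a model trained on the other 47, so the rates are simple fractions of these small group sizes.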

Learning to Segment Instances in Videos with Spatial Propagation Network


Title	Learning to Segment Instances in Videos with Spatial Propagation Network
Authors	Jingchun Cheng, Sifei Liu, Yi-Hsuan Tsai, Wei-Chih Hung, Shalini De Mello, Jinwei Gu, Jan Kautz, Shengjin Wang, Ming-Hsuan Yang
Abstract	We propose a deep learning-based framework for instance-level object segmentation. Our method mainly consists of three steps. First, we train a generic model based on ResNet-101 for foreground/background segmentations. Second, based on this generic model, we fine-tune it to learn instance-level models and segment individual objects by using augmented object annotations in first frames of test videos. To distinguish different instances in the same video, we compute a pixel-level score map for each object from these instance-level models. Each score map indicates the objectness likelihood and is only computed within the foreground mask obtained in the first step. To further refine this per frame score map, we learn a spatial propagation network. This network aims to learn how to propagate a coarse segmentation mask spatially based on the pairwise similarities in each frame. In addition, we apply a filter on the refined score map that aims to recognize the best connected region using spatial and temporal consistencies in the video. Finally, we decide the instance-level object segmentation in each video by comparing score maps of different instances.
Tasks	Semantic Segmentation
Published	2017-09-14
URL	http://arxiv.org/abs/1709.04609v1
PDF	http://arxiv.org/pdf/1709.04609v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-segment-instances-in-videos-with
Repo
Framework
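The last step in the abstract, deciding the instance label by comparing score maps inside the foreground mask, can be sketched as a per-pixel argmax. This is a simplified illustration: it omits the spatial propagation network and the connected-region filter, and the function name `assign_instances` and the tiny 2x2 inputs are hypothetical.

```python
def assign_instances(score_maps, foreground):
    """Label each foreground pixel with the instance whose score is highest.

    score_maps: list of 2D lists, one per instance.
    foreground: 2D list of booleans from the foreground/background model.
    Returns a 2D list where 0 is background and k is the k-th instance.
    """
    height, width = len(foreground), len(foreground[0])
    labels = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if foreground[r][c]:
                scores = [score_map[r][c] for score_map in score_maps]
                labels[r][c] = scores.index(max(scores)) + 1
    return labels

# Two instances on a 2x2 frame; the bottom-right pixel is background.
maps = [
    [[0.9, 0.2], [0.4, 0.8]],  # instance 1
    [[0.1, 0.7], [0.6, 0.9]],  # instance 2
]
mask = [[True, True], [True, False]]
print(assign_instances(maps, mask))  # prints [[1, 2], [2, 0]]
```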

The ALAMO approach to machine learning


Title	The ALAMO approach to machine learning
Authors	Zachary T. Wilson, Nikolaos V. Sahinidis
Abstract	ALAMO is a computational methodology for learning algebraic functions from data. Given a data set, the approach begins by building a low-complexity, linear model composed of explicit non-linear transformations of the independent variables. Linear combinations of these non-linear transformations allow a linear model to better approximate complex behavior observed in real processes. The model is refined as additional data are obtained in an adaptive fashion through error maximization sampling using derivative-free optimization. Models built using ALAMO can enforce constraints on the response variables to incorporate first-principles knowledge. The ability of ALAMO to generate simple and accurate models for a number of reaction problems is demonstrated. The error maximization sampling is compared with Latin hypercube designs to demonstrate its sampling efficiency. ALAMO’s constrained regression methodology is used to further refine concentration models, resulting in models that perform better on validation data and satisfy upper and lower bounds placed on model outputs.
Tasks
Published	2017-05-31
URL	http://arxiv.org/abs/1705.10918v1
PDF	http://arxiv.org/pdf/1705.10918v1.pdf
PWC	https://paperswithcode.com/paper/the-alamo-approach-to-machine-learning
Repo
Framework
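The phrase "linear model composed of explicit non-linear transformations" can be made concrete with a two-term example. The sketch below is not ALAMO itself: it skips basis selection, adaptive sampling, and constrained regression, and only shows ordinary least squares over two fixed basis functions. The function name `fit_two_basis`, the basis choice (x and x²), and the synthetic data are all assumptions.

```python
def fit_two_basis(xs, ys, basis):
    """Least-squares fit of y ~ c0 * basis[0](x) + c1 * basis[1](x).

    Solves the 2x2 normal equations directly with Cramer's rule.
    """
    f = [basis[0](x) for x in xs]
    g = [basis[1](x) for x in xs]
    sff = sum(a * a for a in f)
    sgg = sum(b * b for b in g)
    sfg = sum(a * b for a, b in zip(f, g))
    sfy = sum(a * y for a, y in zip(f, ys))
    sgy = sum(b * y for b, y in zip(g, ys))
    det = sff * sgg - sfg * sfg
    c0 = (sfy * sgg - sfg * sgy) / det
    c1 = (sff * sgy - sfg * sfy) / det
    return c0, c1

# Synthetic data generated from y = 2x + 3x^2 (no noise).
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [2 * x + 3 * x ** 2 for x in xs]

# The model is non-linear in x but linear in the coefficients c0 and c1.
c0, c1 = fit_two_basis(xs, ys, basis=[lambda x: x, lambda x: x ** 2])
print(round(c0, 6), round(c1, 6))
```

Because the coefficients enter linearly, the fit reduces to a small linear system even though the fitted curve is quadratic in x.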

Discriminative Neural Topic Models


Title	Discriminative Neural Topic Models
Authors	Gaurav Pandey, Ambedkar Dukkipati
Abstract	We propose a neural network based approach for learning topics from text and image datasets. The model makes no assumptions about the conditional distribution of the observed features given the latent topics. This allows us to perform topic modelling efficiently using sentences of documents and patches of images as observed features, rather than limiting ourselves to words. Moreover, the proposed approach is online, and hence can be used for streaming data. Furthermore, since the approach utilizes neural networks, it can be implemented on GPU with ease, and hence it is very scalable.
Tasks	Topic Models
Published	2017-01-24
URL	http://arxiv.org/abs/1701.06796v2
PDF	http://arxiv.org/pdf/1701.06796v2.pdf
PWC	https://paperswithcode.com/paper/discriminative-neural-topic-models
Repo
Framework

Belief Propagation Min-Sum Algorithm for Generalized Min-Cost Network Flow


Title	Belief Propagation Min-Sum Algorithm for Generalized Min-Cost Network Flow
Authors	Andrii Riazanov, Yury Maximov, Michael Chertkov
Abstract	Belief Propagation algorithms are instruments used broadly to solve graphical model optimization and statistical inference problems. In the general case of a loopy Graphical Model, Belief Propagation is a heuristic which is quite successful in practice, even though its empirical success, typically, lacks theoretical guarantees. This paper extends the short list of special cases where correctness and/or convergence of a Belief Propagation algorithm is proven. We generalize the formulation of the Min-Sum Network Flow problem by relaxing the flow conservation (balance) constraints and then prove that the Belief Propagation algorithm converges to the exact result.
Tasks
Published	2017-10-20
URL	http://arxiv.org/abs/1710.07600v2
PDF	http://arxiv.org/pdf/1710.07600v2.pdf
PWC	https://paperswithcode.com/paper/belief-propagation-min-sum-algorithm-for
Repo
Framework

Person Re-Identification with Vision and Language


Title	Person Re-Identification with Vision and Language
Authors	Fei Yan, Krystian Mikolajczyk, Josef Kittler
Abstract	In this paper we propose a new approach to person re-identification using images and natural language descriptions. We propose a joint vision and language model based on CCA and CNN architectures to match across the two modalities as well as to enrich visual examples for which there are no language descriptions. We also introduce new annotations in the form of natural language descriptions for two standard Re-ID benchmarks, namely CUHK03 and VIPeR. We perform experiments on these two datasets with techniques based on CNN, hand-crafted features as well as LSTM for analysing visual and natural description data. We investigate and demonstrate the advantages of using natural language descriptions compared to attributes as well as CNN compared to LSTM in the context of Re-ID. We show that the joint use of language and vision can significantly improve the state-of-the-art performance on standard Re-ID benchmarks.
Tasks	Language Modelling, Person Re-Identification
Published	2017-10-03
URL	http://arxiv.org/abs/1710.01202v1
PDF	http://arxiv.org/pdf/1710.01202v1.pdf
PWC	https://paperswithcode.com/paper/person-re-identification-with-vision-and
Repo
Framework

Dense 3D Facial Reconstruction from a Single Depth Image in Unconstrained Environment


Title	Dense 3D Facial Reconstruction from a Single Depth Image in Unconstrained Environment
Authors	Shu Zhang, Hui Yu, Ting Wang, Junyu Dong, Honghai Liu
Abstract	With the increasing demands of applications in virtual reality such as 3D films, virtual Human-Machine Interactions and virtual agents, 3D human face analysis is considered to be more and more important as a fundamental step for those virtual reality tasks. Due to information provided by an additional dimension, 3D facial reconstruction enables aforementioned tasks to be achieved with higher accuracy than those based on 2D facial analysis. The denser the 3D facial model is, the more information it could provide. However, most existing dense 3D facial reconstruction methods require complicated processing and high system cost. To this end, this paper presents a novel method that simplifies the process of dense 3D facial reconstruction by employing only one frame of depth data obtained with an off-the-shelf RGB-D sensor. The experiments showed competitive results with real world data.
Tasks
Published	2017-04-24
URL	http://arxiv.org/abs/1704.07142v1
PDF	http://arxiv.org/pdf/1704.07142v1.pdf
PWC	https://paperswithcode.com/paper/dense-3d-facial-reconstruction-from-a-single
Repo
Framework