Paper Group ANR 476
Discrete Event, Continuous Time RNNs
Title | Discrete Event, Continuous Time RNNs |
Authors | Michael C. Mozer, Denis Kazakov, Robert V. Lindsey |
Abstract | We investigate recurrent neural network architectures for event-sequence processing. Event sequences, characterized by discrete observations stamped with continuous-valued times of occurrence, are challenging due to the potentially wide dynamic range of relevant time scales as well as interactions between time scales. We describe four forms of inductive bias that should benefit architectures for event sequences: temporal locality, position and scale homogeneity, and scale interdependence. We extend the popular gated recurrent unit (GRU) architecture to incorporate these biases via intrinsic temporal dynamics, obtaining a continuous-time GRU. The CT-GRU arises by interpreting the gates of a GRU as selecting a time scale of memory, and the CT-GRU generalizes the GRU by incorporating multiple time scales of memory and performing context-dependent selection of time scales for information storage and retrieval. Event time-stamps drive decay dynamics of the CT-GRU, whereas they serve as generic additional inputs to the GRU. Despite the very different manner in which the two models consider time, their performance on eleven data sets we examined is essentially identical. Our surprising results point both to the robustness of GRU and LSTM architectures for handling continuous time, and to the potency of incorporating continuous dynamics into neural architectures. |
Tasks | |
Published | 2017-10-11 |
URL | http://arxiv.org/abs/1710.04110v1 |
http://arxiv.org/pdf/1710.04110v1.pdf | |
PWC | https://paperswithcode.com/paper/discrete-event-continuous-time-rnns |
Repo | |
Framework | |
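The abstract's statement that event time-stamps drive the decay dynamics can be illustrated with a minimal sketch. It assumes a memory split across a fixed set of time scales, with each trace decaying exponentially over the gap between two events. The function name `decay_traces`, the time constants, and the trace values are hypothetical and are not taken from the paper, and the sketch leaves out the gating that selects time scales for storage and retrieval.

```python
import math


def decay_traces(traces, time_constants, dt):
    """Decay each memory trace exponentially over an elapsed time dt.

    Assumption for illustration: trace i has time constant tau_i and
    decays as exp(-dt / tau_i). This is not the paper's full update rule.
    """
    return [h * math.exp(-dt / tau) for h, tau in zip(traces, time_constants)]


# Hypothetical log-spaced time scales (in seconds) and unit-strength traces.
time_constants = [1.0, 10.0, 100.0]
traces = [1.0, 1.0, 1.0]

# After a 10-second gap, short-scale traces have faded far more than long-scale ones.
decayed = decay_traces(traces, time_constants, dt=10.0)
print(decayed)
```

In a sketch like this the time gap changes the state directly, whereas a standard GRU would receive the gap only as one more input feature.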
Demystifying AlphaGo Zero as AlphaGo GAN
Title | Demystifying AlphaGo Zero as AlphaGo GAN |
Authors | Xiao Dong, Jiasong Wu, Ling Zhou |
Abstract | The astonishing success of AlphaGo Zero\cite{Silver_AlphaGo} has provoked a worldwide discussion of the future of human society, with a mixed mood of hope, anxiousness, excitement and fear. We try to demystify AlphaGo Zero through a qualitative analysis indicating that AlphaGo Zero can be understood as a specially structured GAN system which is expected to possess an inherently good convergence property. Thus we deduce that the success of AlphaGo Zero may not be a sign of a new generation of AI. |
Tasks | |
Published | 2017-11-24 |
URL | http://arxiv.org/abs/1711.09091v1 |
http://arxiv.org/pdf/1711.09091v1.pdf | |
PWC | https://paperswithcode.com/paper/demystifying-alphago-zero-as-alphago-gan |
Repo | |
Framework | |
Learning Causal Structures Using Regression Invariance
Title | Learning Causal Structures Using Regression Invariance |
Authors | AmirEmad Ghassami, Saber Salehkaleybar, Negar Kiyavash, Kun Zhang |
Abstract | We study causal inference in a multi-environment setting, in which the functional relations for producing the variables from their direct causes remain the same across environments, while the distribution of exogenous noises may vary. We introduce the idea of using the invariance of the functional relations of the variables to their causes across a set of environments. We define a notion of completeness for a causal inference algorithm in this setting and prove the existence of such an algorithm by proposing a baseline algorithm. Additionally, we present an alternate algorithm that has significantly improved computational and sample complexity compared to the baseline algorithm. The experimental results show that the proposed algorithm outperforms the other existing algorithms. |
Tasks | Causal Inference |
Published | 2017-05-26 |
URL | http://arxiv.org/abs/1705.09644v1 |
http://arxiv.org/pdf/1705.09644v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-causal-structures-using-regression |
Repo | |
Framework | |
Complexity-Aware Assignment of Latent Values in Discriminative Models for Accurate Gesture Recognition
Title | Complexity-Aware Assignment of Latent Values in Discriminative Models for Accurate Gesture Recognition |
Authors | Manoel Horta Ribeiro, Bruno Teixeira, Antônio Otávio Fernandes, Wagner Meira Jr., Erickson R. Nascimento |
Abstract | Many of the state-of-the-art algorithms for gesture recognition are based on Conditional Random Fields (CRFs). Successful approaches, such as the Latent-Dynamic CRFs, extend the CRF by incorporating latent variables, whose values are mapped to the values of the labels. In this paper we propose a novel methodology to set the latent values according to the gesture complexity. We use a heuristic that iterates through the samples associated with each label value, estimating their complexity. We then use it to assign the latent values to the label values. We evaluate our method on the task of recognizing human gestures from video streams. The experiments were performed on binary datasets, generated by grouping different labels. Our results demonstrate that our approach outperforms the arbitrary one in many cases, increasing the accuracy by up to 10%. |
Tasks | Gesture Recognition |
Published | 2017-04-01 |
URL | http://arxiv.org/abs/1704.00180v1 |
http://arxiv.org/pdf/1704.00180v1.pdf | |
PWC | https://paperswithcode.com/paper/complexity-aware-assignment-of-latent-values |
Repo | |
Framework | |
SenGen: Sentence Generating Neural Variational Topic Model
Title | SenGen: Sentence Generating Neural Variational Topic Model |
Authors | Ramesh Nallapati, Igor Melnyk, Abhishek Kumar, Bowen Zhou |
Abstract | We present a new topic model that generates documents by sampling a topic for one whole sentence at a time, and generating the words in the sentence using an RNN decoder that is conditioned on the topic of the sentence. We argue that this novel formalism will help us not only visualize and model the topical discourse structure in a document better, but also potentially lead to more interpretable topics since we can now illustrate topics by sampling representative sentences instead of bag of words or phrases. We present a variational auto-encoder approach for learning in which we use a factorized variational encoder that independently models the posterior over topical mixture vectors of documents using a feed-forward network, and the posterior over topic assignments to sentences using an RNN. Our preliminary experiments on two different datasets indicate early promise, but also expose many challenges that remain to be addressed. |
Tasks | |
Published | 2017-08-01 |
URL | http://arxiv.org/abs/1708.00308v1 |
http://arxiv.org/pdf/1708.00308v1.pdf | |
PWC | https://paperswithcode.com/paper/sengen-sentence-generating-neural-variational |
Repo | |
Framework | |
ReFACTor: Practical Low-Rank Matrix Estimation Under Column-Sparsity
Title | ReFACTor: Practical Low-Rank Matrix Estimation Under Column-Sparsity |
Authors | Matan Gavish, Regev Schweiger, Elior Rahmani, Eran Halperin |
Abstract | Various problems in data analysis and statistical genetics call for recovery of a column-sparse, low-rank matrix from noisy observations. We propose ReFACTor, a simple variation of the classical Truncated Singular Value Decomposition (TSVD) algorithm. In contrast to previous sparse principal component analysis (PCA) algorithms, our algorithm can provably reveal a low-rank signal matrix better, and often significantly better, than the widely used TSVD, making it the algorithm of choice whenever column-sparsity is suspected. Empirically, we observe that ReFACTor consistently outperforms TSVD even when the underlying signal is not sparse, suggesting that it is generally safe to use ReFACTor instead of TSVD and PCA. The algorithm is extremely simple to implement and its running time is dominated by the runtime of PCA, making it as practical as standard principal component analysis. |
Tasks | |
Published | 2017-05-22 |
URL | http://arxiv.org/abs/1705.07654v1 |
http://arxiv.org/pdf/1705.07654v1.pdf | |
PWC | https://paperswithcode.com/paper/refactor-practical-low-rank-matrix-estimation |
Repo | |
Framework | |
Attenuation correction for brain PET imaging using deep neural network based on dixon and ZTE MR images
Title | Attenuation correction for brain PET imaging using deep neural network based on dixon and ZTE MR images |
Authors | Kuang Gong, Jaewon Yang, Kyungsang Kim, Georges El Fakhri, Youngho Seo, Quanzheng Li |
Abstract | Positron Emission Tomography (PET) is a functional imaging modality widely used in neuroscience studies. To obtain meaningful quantitative results from PET images, attenuation correction is necessary during image reconstruction. For PET/MR hybrid systems, PET attenuation correction is challenging as Magnetic Resonance (MR) images do not reflect attenuation coefficients directly. To address this issue, we present deep neural network methods to derive the continuous attenuation coefficients for brain PET imaging from MR images. With only Dixon MR images as the network input, the existing U-net structure was adopted, and analysis using forty patient data sets shows it is superior to other Dixon-based methods. When both Dixon and zero echo time (ZTE) images are available, we have proposed a modified U-net structure, named GroupU-net, to efficiently make use of both Dixon and ZTE information through group convolution modules when the network goes deeper. Quantitative analysis based on fourteen real patient data sets demonstrates that both network approaches can perform better than the standard methods, and the proposed network structure can further reduce the PET quantification error compared to the U-net structure. |
Tasks | Image Reconstruction |
Published | 2017-12-17 |
URL | http://arxiv.org/abs/1712.06203v2 |
http://arxiv.org/pdf/1712.06203v2.pdf | |
PWC | https://paperswithcode.com/paper/attenuation-correction-for-brain-pet-imaging |
Repo | |
Framework | |
Grounding Visual Explanations (Extended Abstract)
Title | Grounding Visual Explanations (Extended Abstract) |
Authors | Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata |
Abstract | Existing models which generate textual explanations enforce task relevance through a discriminative term loss function, but such mechanisms only weakly constrain mentioned object parts to actually be present in the image. In this paper, a new model is proposed for generating explanations by utilizing localized grounding of constituent phrases in generated explanations to ensure image relevance. Specifically, we introduce a phrase-critic model to refine (re-score/re-rank) generated candidate explanations and employ a relative-attribute inspired ranking loss using “flipped” phrases as negative examples for training. At test time, our phrase-critic model takes an image and a candidate explanation as input and outputs a score indicating how well the candidate explanation is grounded in the image. |
Tasks | |
Published | 2017-11-17 |
URL | http://arxiv.org/abs/1711.06465v1 |
http://arxiv.org/pdf/1711.06465v1.pdf | |
PWC | https://paperswithcode.com/paper/grounding-visual-explanations-extended |
Repo | |
Framework | |
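The ranking loss with "flipped" phrases as negative examples can be sketched as a margin-based hinge loss. This is a generic form and not necessarily the paper's exact objective; the function name `margin_ranking_loss`, the margin of 1.0, and the scores are illustrative assumptions.

```python
def margin_ranking_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss that is zero once the positive outscores the negative by `margin`."""
    return max(0.0, margin - (pos_score - neg_score))


# Hypothetical critic scores: a grounded explanation vs. one with a flipped phrase.
print(margin_ranking_loss(2.5, 0.5))  # well separated, so the loss is zero
print(margin_ranking_loss(0.8, 0.5))  # separated by less than the margin, so the loss is positive
```

Minimizing a loss of this shape pushes the score of the grounded explanation above the score of its flipped counterpart.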
Multidimensional classification of hippocampal shape features discriminates Alzheimer’s disease and mild cognitive impairment from normal aging
Title | Multidimensional classification of hippocampal shape features discriminates Alzheimer’s disease and mild cognitive impairment from normal aging |
Authors | Emilie Gerardin, Gaël Chételat, Marie Chupin, Rémi Cuingnet, Béatrice Desgranges, Ho-Sung Kim, Marc Niethammer, Bruno Dubois, Stéphane Lehéricy, Line Garnero, Francis Eustache, Olivier Colliot |
Abstract | We describe a new method to automatically discriminate between patients with Alzheimer’s disease (AD) or mild cognitive impairment (MCI) and elderly controls, based on multidimensional classification of hippocampal shape features. This approach uses spherical harmonics (SPHARM) coefficients to model the shape of the hippocampi, which are segmented from magnetic resonance images (MRI) using a fully automatic method that we previously developed. SPHARM coefficients are used as features in a classification procedure based on support vector machines (SVM). The most relevant features for classification are selected using a bagging strategy. We evaluate the accuracy of our method in a group of 23 patients with AD (10 males, 13 females, age ± standard deviation (SD) = 73 ± 6 years, mini-mental score (MMS) = 24.4 ± 2.8), 23 patients with amnestic MCI (10 males, 13 females, age ± SD = 74 ± 8 years, MMS = 27.3 ± 1.4) and 25 elderly healthy controls (13 males, 12 females, age ± SD = 64 ± 8 years), using leave-one-out cross-validation. For AD vs controls, we obtain a correct classification rate of 94%, a sensitivity of 96%, and a specificity of 92%. For MCI vs controls, we obtain a classification rate of 83%, a sensitivity of 83%, and a specificity of 84%. This accuracy is superior to that of hippocampal volumetry and is comparable to recently published SVM-based whole-brain classification methods, which relied on a different strategy. This new method may become a useful tool to assist in the diagnosis of Alzheimer’s disease. |
Tasks | |
Published | 2017-07-19 |
URL | http://arxiv.org/abs/1707.05961v1 |
http://arxiv.org/pdf/1707.05961v1.pdf | |
PWC | https://paperswithcode.com/paper/multidimensional-classification-of |
Repo | |
Framework | |
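The reported rates for AD vs controls can be checked with simple arithmetic. The abstract gives only group sizes (23 patients, 25 controls) and rounded percentages, so the per-class counts below are assumptions chosen to be consistent with those figures; they are not reported in the source.

```python
def rates(true_pos, false_neg, true_neg, false_pos):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = true_pos + false_neg + true_neg + false_pos
    accuracy = (true_pos + true_neg) / total
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return accuracy, sensitivity, specificity


# Assumed counts: 22 of 23 AD patients and 23 of 25 controls classified correctly.
acc, sens, spec = rates(true_pos=22, false_neg=1, true_neg=23, false_pos=2)
print(f"accuracy={acc:.0%} sensitivity={sens:.0%} specificity={spec:.0%}")
```

Under these assumed counts, 45 of 48 subjects are classified correctly, which rounds to the 94% quoted in the abstract.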
Learning to Segment Instances in Videos with Spatial Propagation Network
Title | Learning to Segment Instances in Videos with Spatial Propagation Network |
Authors | Jingchun Cheng, Sifei Liu, Yi-Hsuan Tsai, Wei-Chih Hung, Shalini De Mello, Jinwei Gu, Jan Kautz, Shengjin Wang, Ming-Hsuan Yang |
Abstract | We propose a deep learning-based framework for instance-level object segmentation. Our method mainly consists of three steps. First, we train a generic model based on ResNet-101 for foreground/background segmentation. Second, based on this generic model, we fine-tune it to learn instance-level models and segment individual objects by using augmented object annotations in first frames of test videos. To distinguish different instances in the same video, we compute a pixel-level score map for each object from these instance-level models. Each score map indicates the objectness likelihood and is only computed within the foreground mask obtained in the first step. To further refine this per-frame score map, we learn a spatial propagation network. This network aims to learn how to propagate a coarse segmentation mask spatially based on the pairwise similarities in each frame. In addition, we apply a filter on the refined score map that aims to recognize the best connected region using spatial and temporal consistencies in the video. Finally, we decide the instance-level object segmentation in each video by comparing score maps of different instances. |
Tasks | Semantic Segmentation |
Published | 2017-09-14 |
URL | http://arxiv.org/abs/1709.04609v1 |
http://arxiv.org/pdf/1709.04609v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-segment-instances-in-videos-with |
Repo | |
Framework | |
The ALAMO approach to machine learning
Title | The ALAMO approach to machine learning |
Authors | Zachary T. Wilson, Nikolaos V. Sahinidis |
Abstract | ALAMO is a computational methodology for learning algebraic functions from data. Given a data set, the approach begins by building a low-complexity, linear model composed of explicit non-linear transformations of the independent variables. Linear combinations of these non-linear transformations allow a linear model to better approximate complex behavior observed in real processes. The model is refined as additional data are obtained adaptively through error maximization sampling using derivative-free optimization. Models built using ALAMO can enforce constraints on the response variables to incorporate first-principles knowledge. The ability of ALAMO to generate simple and accurate models for a number of reaction problems is demonstrated. The error maximization sampling is compared with Latin hypercube designs to demonstrate its sampling efficiency. ALAMO’s constrained regression methodology is used to further refine concentration models, resulting in models that perform better on validation data and satisfy upper and lower bounds placed on model outputs. |
Tasks | |
Published | 2017-05-31 |
URL | http://arxiv.org/abs/1705.10918v1 |
http://arxiv.org/pdf/1705.10918v1.pdf | |
PWC | https://paperswithcode.com/paper/the-alamo-approach-to-machine-learning |
Repo | |
Framework | |
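The idea of a linear model built from explicit non-linear transformations of the inputs can be sketched with an ordinary least-squares fit over a small set of basis functions. This is only an illustration of the linear-in-coefficients form: it does not implement ALAMO's model selection, adaptive sampling, or constrained regression, and the basis set, the helper names `solve` and `fit`, and the synthetic data are all assumptions.

```python
import math


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit(xs, ys, basis):
    """Least-squares coefficients for y ~ sum_k coef_k * basis_k(x), via normal equations."""
    phi = [[f(x) for f in basis] for x in xs]
    k = len(basis)
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(k)]
    return solve(gram, rhs)


# Hypothetical basis: constant, x, x^2, exp(x).
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x, math.exp]
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ys = [3.0 + 2.0 * x * x for x in xs]  # synthetic data generated from 3 + 2 x^2

coefs = fit(xs, ys, basis)
print([round(c, 6) for c in coefs])
```

Because the synthetic data come from 3 + 2x^2, the fitted coefficients on x and exp(x) should be close to zero, which is the kind of signal a low-complexity model search could use to drop those terms.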
Discriminative Neural Topic Models
Title | Discriminative Neural Topic Models |
Authors | Gaurav Pandey, Ambedkar Dukkipati |
Abstract | We propose a neural network based approach for learning topics from text and image datasets. The model makes no assumptions about the conditional distribution of the observed features given the latent topics. This allows us to perform topic modelling efficiently using sentences of documents and patches of images as observed features, rather than limiting ourselves to words. Moreover, the proposed approach is online, and hence can be used for streaming data. Furthermore, since the approach utilizes neural networks, it can be implemented on GPU with ease, and hence it is very scalable. |
Tasks | Topic Models |
Published | 2017-01-24 |
URL | http://arxiv.org/abs/1701.06796v2 |
http://arxiv.org/pdf/1701.06796v2.pdf | |
PWC | https://paperswithcode.com/paper/discriminative-neural-topic-models |
Repo | |
Framework | |
Belief Propagation Min-Sum Algorithm for Generalized Min-Cost Network Flow
Title | Belief Propagation Min-Sum Algorithm for Generalized Min-Cost Network Flow |
Authors | Andrii Riazanov, Yury Maximov, Michael Chertkov |
Abstract | Belief Propagation algorithms are instruments used broadly to solve graphical model optimization and statistical inference problems. In the general case of a loopy Graphical Model, Belief Propagation is a heuristic which is quite successful in practice, even though its empirical success typically lacks theoretical guarantees. This paper extends the short list of special cases where correctness and/or convergence of a Belief Propagation algorithm is proven. We generalize the formulation of the Min-Sum Network Flow problem by relaxing the flow conservation (balance) constraints and then prove that the Belief Propagation algorithm converges to the exact result. |
Tasks | |
Published | 2017-10-20 |
URL | http://arxiv.org/abs/1710.07600v2 |
http://arxiv.org/pdf/1710.07600v2.pdf | |
PWC | https://paperswithcode.com/paper/belief-propagation-min-sum-algorithm-for |
Repo | |
Framework | |
Person Re-Identification with Vision and Language
Title | Person Re-Identification with Vision and Language |
Authors | Fei Yan, Krystian Mikolajczyk, Josef Kittler |
Abstract | In this paper we propose a new approach to person re-identification using images and natural language descriptions. We propose a joint vision and language model based on CCA and CNN architectures to match across the two modalities as well as to enrich visual examples for which there are no language descriptions. We also introduce new annotations in the form of natural language descriptions for two standard Re-ID benchmarks, namely CUHK03 and VIPeR. We perform experiments on these two datasets with techniques based on CNN, hand-crafted features as well as LSTM for analysing visual and natural description data. We investigate and demonstrate the advantages of using natural language descriptions compared to attributes as well as CNN compared to LSTM in the context of Re-ID. We show that the joint use of language and vision can significantly improve the state-of-the-art performance on standard Re-ID benchmarks. |
Tasks | Language Modelling, Person Re-Identification |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01202v1 |
http://arxiv.org/pdf/1710.01202v1.pdf | |
PWC | https://paperswithcode.com/paper/person-re-identification-with-vision-and |
Repo | |
Framework | |
Dense 3D Facial Reconstruction from a Single Depth Image in Unconstrained Environment
Title | Dense 3D Facial Reconstruction from a Single Depth Image in Unconstrained Environment |
Authors | Shu Zhang, Hui Yu, Ting Wang, Junyu Dong, Honghai Liu |
Abstract | With the increasing demands of applications in virtual reality such as 3D films, virtual Human-Machine Interactions and virtual agents, 3D human face analysis is considered to be more and more important as a fundamental step for those virtual reality tasks. Due to information provided by an additional dimension, 3D facial reconstruction enables the aforementioned tasks to be achieved with higher accuracy than those based on 2D facial analysis. The denser the 3D facial model is, the more information it can provide. However, most existing dense 3D facial reconstruction methods require complicated processing and high system cost. To this end, this paper presents a novel method that simplifies the process of dense 3D facial reconstruction by employing only one frame of depth data obtained with an off-the-shelf RGB-D sensor. The experiments showed competitive results with real-world data. |
Tasks | |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07142v1 |
http://arxiv.org/pdf/1704.07142v1.pdf | |
PWC | https://paperswithcode.com/paper/dense-3d-facial-reconstruction-from-a-single |
Repo | |
Framework | |