October 19, 2019

Paper Group ANR 349

Camera Model Identification Using Convolutional Neural Networks


Title	Camera Model Identification Using Convolutional Neural Networks
Authors	Artur Kuzin, Artur Fattakhov, Ilya Kibardin, Vladimir Iglovikov, Ruslan Dautov
Abstract	Source camera identification is the process of determining which camera or model has been used to capture an image. In recent years, there has been a rapid growth of research interest in the domain of forensics. In the current work, we describe our Deep Learning approach to the camera detection task of 10 cameras as a part of the Camera Model Identification Challenge hosted by Kaggle.com, where our team finished 2nd out of 582 teams with an accuracy of 98% on unseen data. We used aggressive data augmentations that allowed the model to stay robust against transformations. A number of experiments are carried out on datasets collected by the organizers and scraped from the web.
Tasks
Published	2018-10-06
URL	http://arxiv.org/abs/1810.02981v2
PDF	http://arxiv.org/pdf/1810.02981v2.pdf
PWC	https://paperswithcode.com/paper/camera-model-identification-using
Repo
Framework

Forecasting Individualized Disease Trajectories using Interpretable Deep Learning


Title	Forecasting Individualized Disease Trajectories using Interpretable Deep Learning
Authors	Ahmed M. Alaa, Mihaela van der Schaar
Abstract	Disease progression models are instrumental in predicting individual-level health trajectories and understanding disease dynamics. Existing models are capable of providing either accurate predictions of patients' prognoses or clinically interpretable representations of disease pathophysiology, but not both. In this paper, we develop the phased attentive state space (PASS) model of disease progression, a deep probabilistic model that captures complex representations for disease progression while maintaining clinical interpretability. Unlike Markovian state space models, which assume memoryless dynamics, PASS uses an attention mechanism to induce “memoryful” state transitions, whereby repeatedly updated attention weights are used to focus on past state realizations that best predict future states. This gives rise to complex, non-stationary state dynamics that remain interpretable through the generated attention weights, which designate the relationships between the realized state variables for individual patients. PASS uses phased LSTM units (with time gates controlled by parametrized oscillations) to generate the attention weights in continuous time, which enables handling irregularly sampled and potentially missing medical observations. Experiments on data from a real-world cohort of patients show that PASS successfully balances the tradeoff between accuracy and interpretability: it demonstrates superior predictive accuracy and learns insightful individual-level representations of disease progression.
Tasks	Disease Prediction, Disease Trajectory Forecasting
Published	2018-10-24
URL	http://arxiv.org/abs/1810.10489v1
PDF	http://arxiv.org/pdf/1810.10489v1.pdf
PWC	https://paperswithcode.com/paper/forecasting-individualized-disease
Repo
Framework

The Sample Complexity of Up-to-$\varepsilon$ Multi-Dimensional Revenue Maximization


Title	The Sample Complexity of Up-to-$\varepsilon$ Multi-Dimensional Revenue Maximization
Authors	Yannai A. Gonczarowski, S. Matthew Weinberg
Abstract	We consider the sample complexity of revenue maximization for multiple bidders in unrestricted multi-dimensional settings. Specifically, we study the standard model of $n$ additive bidders whose values for $m$ heterogeneous items are drawn independently. For any such instance and any $\varepsilon>0$, we show that it is possible to learn an $\varepsilon$-Bayesian Incentive Compatible auction whose expected revenue is within $\varepsilon$ of the optimal $\varepsilon$-BIC auction from only polynomially many samples. Our approach is based on ideas that hold quite generally and completely sidestep the difficulty of characterizing optimal (or near-optimal) auctions for these settings. Therefore, our results easily extend to general multi-dimensional settings, including valuations that aren’t necessarily even subadditive, and arbitrary allocation constraints. For the cases of a single bidder and many goods, or a single parameter (good) and many bidders, our analysis yields exact incentive compatibility (and for the latter also computational efficiency). Although the single-parameter case is already well understood, our corollary for this case slightly extends the state of the art.
Tasks
Published	2018-08-07
URL	http://arxiv.org/abs/1808.02458v2
PDF	http://arxiv.org/pdf/1808.02458v2.pdf
PWC	https://paperswithcode.com/paper/the-sample-complexity-of-up-to-varepsilon
Repo
Framework

Near-Optimal Sample Complexity Bounds for Maximum Likelihood Estimation of Multivariate Log-concave Densities


Title	Near-Optimal Sample Complexity Bounds for Maximum Likelihood Estimation of Multivariate Log-concave Densities
Authors	Timothy Carpenter, Ilias Diakonikolas, Anastasios Sidiropoulos, Alistair Stewart
Abstract	We study the problem of learning multivariate log-concave densities with respect to a global loss function. We obtain the first upper bound on the sample complexity of the maximum likelihood estimator (MLE) for a log-concave density on $\mathbb{R}^d$, for all $d \geq 4$. Prior to this work, no finite sample upper bound was known for this estimator in more than $3$ dimensions. In more detail, we prove that for any $d \geq 1$ and $\epsilon>0$, given $\tilde{O}_d((1/\epsilon)^{(d+3)/2})$ samples drawn from an unknown log-concave density $f_0$ on $\mathbb{R}^d$, the MLE outputs a hypothesis $h$ that with high probability is $\epsilon$-close to $f_0$, in squared Hellinger loss. A sample complexity lower bound of $\Omega_d((1/\epsilon)^{(d+1)/2})$ was previously known for any learning algorithm that achieves this guarantee. We thus establish that the sample complexity of the log-concave MLE is near-optimal, up to an $\tilde{O}(1/\epsilon)$ factor.
Tasks
Published	2018-02-28
URL	http://arxiv.org/abs/1802.10575v2
PDF	http://arxiv.org/pdf/1802.10575v2.pdf
PWC	https://paperswithcode.com/paper/near-optimal-sample-complexity-bounds-for-1
Repo
Framework
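
To make the "near-optimal" claim in the abstract above concrete, the gap between the quoted upper and lower bounds can be worked out directly. This is only a worked ratio of the two exponents stated in the abstract, ignoring logarithmic factors and dimension-dependent constants:

```latex
\[
\frac{(1/\epsilon)^{(d+3)/2}}{(1/\epsilon)^{(d+1)/2}}
  = (1/\epsilon)^{\frac{d+3}{2} - \frac{d+1}{2}}
  = 1/\epsilon .
\]
```

For instance, at $d = 4$ the upper bound scales as $(1/\epsilon)^{7/2}$ and the lower bound as $(1/\epsilon)^{5/2}$, which differ by exactly one factor of $1/\epsilon$.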

A Model-Based Reinforcement Learning Approach for a Rare Disease Diagnostic Task


Title	A Model-Based Reinforcement Learning Approach for a Rare Disease Diagnostic Task
Authors	Rémi Besson, Erwan Le Pennec, Stéphanie Allassonnière, Julien Stirnemann, Emmanuel Spaggiari, Antoine Neuraz
Abstract	In this work, we present our various contributions to the objective of building a decision support tool for the diagnosis of rare diseases. Our goal is to achieve a state of knowledge where the uncertainty about the patient’s disease is below a predetermined threshold. We aim to reach such states while minimizing the average number of medical tests to perform. In doing so, we take into account the need, in many medical applications, to avoid misdiagnosis as much as possible. To solve this optimization task, we investigate several reinforcement learning algorithms and make them operable in our high-dimensional and sparse-reward setting. We also present a way to combine expert knowledge, expressed as conditional probabilities, with real clinical data. This is crucial because the scarcity of data in the field of rare diseases prevents any approach based solely on clinical data. Finally, we show that it is possible to integrate the ontological information about symptoms while remaining within our probabilistic reasoning. This enables our decision support tool to process information given at different levels of precision by the user.
Tasks
Published	2018-11-25
URL	http://arxiv.org/abs/1811.10112v1
PDF	http://arxiv.org/pdf/1811.10112v1.pdf
PWC	https://paperswithcode.com/paper/a-model-based-reinforcement-learning-approach
Repo
Framework

Towards WARSHIP: Combining Components of Brain-Inspired Computing of RSH for Image Super Resolution


Title	Towards WARSHIP: Combining Components of Brain-Inspired Computing of RSH for Image Super Resolution
Authors	Wendi Xu, Ming Zhang
Abstract	The evolution of deep learning shows that some algorithmic tricks are more durable, while others are not. To the best of our knowledge, we are the first to summarize 5 more durable and complete deep learning components for vision, that is, WARSHIP. Moreover, we give a biological overview of WARSHIP, emphasizing brain-inspired computing of WARSHIP. As a step towards WARSHIP, our case study of image super resolution combines 3 components of RSH to deploy a CNN model, WARSHIP-XZNet, which strikes a happy medium between speed and performance.
Tasks	Image Super-Resolution, Super-Resolution
Published	2018-10-03
URL	http://arxiv.org/abs/1810.01620v1
PDF	http://arxiv.org/pdf/1810.01620v1.pdf
PWC	https://paperswithcode.com/paper/towards-warship-combining-components-of-brain
Repo
Framework

Système de traduction automatique statistique Anglais-Arabe


Title	Système de traduction automatique statistique Anglais-Arabe
Authors	Marwa Hadj Salah, Didier Schwab, Hervé Blanchon, Mounir Zrigui
Abstract	Machine translation (MT) is the process of translating text written in a source language into text in a target language. In this article, we present our English-Arabic statistical machine translation system. First, we present the general process for setting up a statistical machine translation system, then we describe the tools as well as the different corpora we used to build our MT system. Our system was evaluated in terms of the BLEU score (24.51%).
Tasks	Machine Translation
Published	2018-02-06
URL	http://arxiv.org/abs/1802.02053v1
PDF	http://arxiv.org/pdf/1802.02053v1.pdf
PWC	https://paperswithcode.com/paper/systeme-de-traduction-automatique-statistique
Repo
Framework
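
The abstract above reports a BLEU score but does not say which tool or variant produced it. As a rough illustration of what the metric measures, here is a minimal sketch of a simplified BLEU: sentence-level, single-reference, unsmoothed, with uniform weights over 1- to 4-grams. The function names `ngrams` and `bleu` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified single-reference, unsmoothed BLEU on a 0-to-1 scale."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        total = sum(cand.values())
        # Clipped matches: a candidate n-gram counts at most as often
        # as it occurs in the reference.
        matched = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, any zero precision gives 0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    brevity = 1.0 if c > r else math.exp(1 - r / c)
    return brevity * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat".split()
perfect = bleu(reference, reference)
shorter = bleu("the cat sat on the".split(), reference)
```

Published systems are normally scored at corpus level with an established toolkit, so this sketch only shows the ingredients (clipped n-gram precision and a brevity penalty). A score reported as a percentage, such as 24.51%, corresponds to 0.2451 on the 0-to-1 scale used here.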

Flexible Mixture Modeling on Constrained Spaces


Title	Flexible Mixture Modeling on Constrained Spaces
Authors	Putu Ayu Sudyanti, Vinayak Rao
Abstract	This paper addresses challenges in flexibly modeling multimodal data that lie on constrained spaces. Such data are commonly found in spatial applications, such as climatology and criminology, where measurements are restricted to a geographical area. Other settings include domains where unsuitable recordings are discarded, such as flow-cytometry measurements. A simple approach to modeling such data is through the use of mixture models, especially nonparametric mixture models. Mixture models, while flexible and theoretically well understood, are unsuitable for settings involving complicated constraints, leading to difficulties in specifying the component distributions and in evaluating normalization constants. Bayesian inference over the parameters of these models results in posterior distributions that are doubly intractable. We address this problem via an algorithm based on rejection sampling and data augmentation. We view samples from a truncated distribution as outcomes of a rejection sampling scheme, where proposals are made from a simple mixture model and are rejected if they violate the constraints. Our scheme proceeds by imputing the rejected samples given mixture parameters and then resampling parameters given all samples. We study two modeling approaches: mixtures of truncated Gaussians and truncated mixtures of Gaussians, along with their associated Markov chain Monte Carlo sampling algorithms. We also discuss variations of the models, as well as approximations to improve mixing, reduce computational cost, and lower variance. We present results on simulated data and apply our algorithms to two problems: one involving flow-cytometry data, and the other involving crime recorded in the city of Chicago.
Tasks	Bayesian Inference, Data Augmentation
Published	2018-09-24
URL	https://arxiv.org/abs/1809.09238v2
PDF	https://arxiv.org/pdf/1809.09238v2.pdf
PWC	https://paperswithcode.com/paper/flexible-mixture-modeling-on-constrained
Repo
Framework
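
The rejection-sampling view described in the abstract above (propose from a simple mixture, reject proposals that violate the constraints) can be sketched as follows. This is a minimal illustration of the generative view only, not the authors' MCMC algorithm. It assumes a one-dimensional Gaussian mixture and an interval constraint, and the name `sample_truncated_mixture` and all parameter values are hypothetical.

```python
import random


def sample_truncated_mixture(weights, means, stds, lower, upper, rng):
    """Draw one point from a Gaussian mixture truncated to [lower, upper].

    Returns the accepted point together with the rejected proposals,
    the kind of latent quantity the abstract describes imputing.
    """
    rejected = []
    while True:
        # Pick a mixture component, then propose from its Gaussian.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        x = rng.gauss(means[k], stds[k])
        if lower <= x <= upper:
            return x, rejected
        rejected.append(x)  # proposal violates the constraint


rng = random.Random(0)
accepted, rejected = sample_truncated_mixture(
    weights=[0.5, 0.5], means=[-1.0, 2.0], stds=[1.0, 0.5],
    lower=0.0, upper=3.0, rng=rng,
)
```

Because the whole mixture is proposed from and then restricted, this corresponds to the "truncated mixture of Gaussians" variant named in the abstract. Truncating each component separately would give the other variant, a mixture of truncated Gaussians.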

A Sparse Non-negative Matrix Factorization Framework for Identifying Functional Units of Tongue Behavior from MRI


Title	A Sparse Non-negative Matrix Factorization Framework for Identifying Functional Units of Tongue Behavior from MRI
Authors	Jonghye Woo, Jerry L. Prince, Maureen Stone, Fangxu Xing, Arnold Gomez, Jordan R. Green, Christopher J. Hartnick, Thomas J. Brady, Timothy G. Reese, Van J. Wedeen, Georges El Fakhri
Abstract	Muscle coordination patterns of lingual behaviors are synergies generated by deforming local muscle groups in a variety of ways. Functional units are functional muscle groups of local structural elements within the tongue that compress, expand, and move in a cohesive and consistent manner. Identifying the functional units using tagged-Magnetic Resonance Imaging (MRI) sheds light on the mechanisms of normal and pathological muscle coordination patterns, yielding improvement in surgical planning, treatment, or rehabilitation procedures. Here, to mine this information, we propose a matrix factorization and probabilistic graphical model framework to produce building blocks and their associated weighting map using motion quantities extracted from tagged-MRI. Our tagged-MRI imaging and accurate voxel-level tracking provide previously unavailable internal tongue motion patterns, thus revealing the inner workings of the tongue during speech or other lingual behaviors. We then employ spectral clustering on the weighting map to identify the cohesive regions defined by the tongue motion that may involve multiple or undocumented regions. To evaluate our method, we perform a series of experiments. We first use two-dimensional images and synthetic data to demonstrate the accuracy of our method. We then use three-dimensional synthetic and in vivo tongue motion data using protrusion and simple speech tasks to identify subject-specific and data-driven functional units of the tongue in localized regions.
Tasks
Published	2018-04-15
URL	http://arxiv.org/abs/1804.05370v3
PDF	http://arxiv.org/pdf/1804.05370v3.pdf
PWC	https://paperswithcode.com/paper/a-sparse-non-negative-matrix-factorization
Repo
Framework

Seq2Seq2Sentiment: Multimodal Sequence to Sequence Models for Sentiment Analysis


Title	Seq2Seq2Sentiment: Multimodal Sequence to Sequence Models for Sentiment Analysis
Authors	Hai Pham, Thomas Manzini, Paul Pu Liang, Barnabas Poczos
Abstract	Multimodal machine learning is a core research area spanning the language, visual and acoustic modalities. The central challenge in multimodal learning involves learning representations that can process and relate information from multiple modalities. In this paper, we propose two methods for unsupervised learning of joint multimodal representations using sequence to sequence (Seq2Seq) methods: a Seq2Seq Modality Translation Model and a Hierarchical Seq2Seq Modality Translation Model. We also explore multiple different variations on the multimodal inputs and outputs of these seq2seq models. Our experiments on multimodal sentiment analysis using the CMU-MOSI dataset indicate that our methods learn informative multimodal representations that outperform the baselines and achieve improved performance on multimodal sentiment analysis, specifically in the Bimodal case where our model is able to improve F1 Score by twelve points. We also discuss future directions for multimodal Seq2Seq methods.
Tasks	Multimodal Sentiment Analysis, Sentiment Analysis
Published	2018-07-11
URL	http://arxiv.org/abs/1807.03915v2
PDF	http://arxiv.org/pdf/1807.03915v2.pdf
PWC	https://paperswithcode.com/paper/seq2seq2sentiment-multimodal-sequence-to
Repo
Framework

Effective, Fast, and Memory-Efficient Compressed Multi-function Convolutional Neural Networks for More Accurate Medical Image Classification


Title	Effective, Fast, and Memory-Efficient Compressed Multi-function Convolutional Neural Networks for More Accurate Medical Image Classification
Authors	Luna M. Zhang
Abstract	Convolutional Neural Networks (CNNs) usually use the same activation function, such as ReLU, for all convolutional layers. Using only ReLU limits performance. In order to achieve better classification performance, reduce training and testing times, and reduce power consumption and memory usage, a new “Compressed Multi-function CNN” is developed. Google’s Inception-V4, for example, is a very deep CNN that consists of 4 Inception-A blocks, 7 Inception-B blocks, and 3 Inception-C blocks. ReLU is used for all convolutional layers. A new “Compressed Multi-function Inception-V4” (CMI) that can use different activation functions is created with k Inception-A blocks, m Inception-B blocks, and n Inception-C blocks where k in {1, 2, 3, 4}, m in {1, 2, 3, 4, 5, 6, 7}, n in {1, 2, 3}, and (k+m+n)<14. For performance analysis, a dataset for classifying brain MRI images into one of the four stages of Alzheimer’s disease is used to compare three CMI architectures with Inception-V4 in terms of F1-score, training and testing times (related to power consumption), and memory usage (model size). Overall, simulations show that the new CMI models can outperform both the commonly used Inception-V4 and Inception-V4 using different activation functions. In the future, other “Compressed Multi-function CNNs”, such as “Compressed Multi-function ResNets and DenseNets” that have a reduced number of convolutional blocks using different activation functions, will be developed to further increase classification accuracy, reduce training and testing times, reduce computational power, and reduce memory usage (model size) for building more effective healthcare systems, such as implementing accurate and convenient disease diagnosis systems on mobile devices that have limited battery power and memory.
Tasks	Image Classification
Published	2018-11-29
URL	http://arxiv.org/abs/1811.11996v1
PDF	http://arxiv.org/pdf/1811.11996v1.pdf
PWC	https://paperswithcode.com/paper/effective-fast-and-memory-efficient
Repo
Framework
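
The block-count constraint in the abstract above defines a small, enumerable space of candidate architectures. A minimal sketch that lists every (k, m, n) triple allowed by the stated ranges and the condition (k+m+n)<14 follows. The variable name `configs` is hypothetical, and the abstract itself evaluates only three such architectures.

```python
from itertools import product

# k Inception-A blocks in 1..4, m Inception-B blocks in 1..7,
# n Inception-C blocks in 1..3, with fewer than 14 blocks in total.
configs = [
    (k, m, n)
    for k, m, n in product(range(1, 5), range(1, 8), range(1, 4))
    if k + m + n < 14
]

print(len(configs))  # 83 of the 4 * 7 * 3 = 84 combinations remain
```

The only excluded triple is (4, 7, 3), which is the full Inception-V4 layout, so the constraint simply requires at least one block to be removed.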

On2Vec: Embedding-based Relation Prediction for Ontology Population


Title	On2Vec: Embedding-based Relation Prediction for Ontology Population
Authors	Muhao Chen, Yingtao Tian, Xuelu Chen, Zijun Xue, Carlo Zaniolo
Abstract	Populating ontology graphs represents a long-standing problem for the Semantic Web community. Recent advances in translation-based graph embedding methods for populating instance-level knowledge graphs have led to promising new approaches to the ontology population problem. However, unlike instance-level graphs, the majority of relation facts in ontology graphs come with comprehensive semantic relations, which often include the properties of transitivity and symmetry, as well as hierarchical relations. These comprehensive relations are often too complex for existing graph embedding methods, and direct application of such methods is not feasible. Hence, we propose On2Vec, a novel translation-based graph embedding method for ontology population. On2Vec integrates two model components that effectively characterize comprehensive relation facts in ontology graphs. The first is the Component-specific Model that encodes concepts and relations into low-dimensional embedding spaces without a loss of relational properties; the second is the Hierarchy Model that performs focused learning of hierarchical relation facts. Experiments on several well-known ontology graphs demonstrate the promising capabilities of On2Vec in predicting and verifying new relation facts. These promising results also make possible significant improvements in related methods.
Tasks	Graph Embedding, Knowledge Graphs
Published	2018-09-07
URL	http://arxiv.org/abs/1809.02382v1
PDF	http://arxiv.org/pdf/1809.02382v1.pdf
PWC	https://paperswithcode.com/paper/on2vec-embedding-based-relation-prediction
Repo
Framework

Deep Multimodal Clustering for Unsupervised Audiovisual Learning


Title	Deep Multimodal Clustering for Unsupervised Audiovisual Learning
Authors	Di Hu, Feiping Nie, Xuelong Li
Abstract	Birds that we see twitter, running cars are accompanied by noise, and so on. These natural audiovisual correspondences make it possible to explore and understand the outside world. However, mixtures of multiple objects and sounds make efficient matching intractable in unconstrained environments. To address this problem, we propose to adequately extract audio and visual components and perform elaborate correspondence learning among them. Concretely, we propose a novel unsupervised audiovisual learning model, named Deep Multimodal Clustering (DMC), that synchronously performs sets of clustering with multimodal vectors of convolutional maps in different shared spaces for capturing multiple audiovisual correspondences. Such an integrated multimodal clustering network can be effectively trained with max-margin loss in an end-to-end fashion. Extensive experiments on feature evaluation and audiovisual tasks are performed. The results demonstrate that DMC can learn effective unimodal representations, with which the classifier can even outperform human performance. Further, DMC shows noticeable performance in sound localization, multisource detection, and audiovisual understanding.
Tasks
Published	2018-07-09
URL	http://arxiv.org/abs/1807.03094v3
PDF	http://arxiv.org/pdf/1807.03094v3.pdf
PWC	https://paperswithcode.com/paper/deep-co-clustering-for-unsupervised
Repo
Framework

Multi-View Graph Embedding Using Randomized Shortest Paths


Title	Multi-View Graph Embedding Using Randomized Shortest Paths
Authors	Anuththari Gamage, Brian Rappaport, Shuchin Aeron, Xiaozhe Hu
Abstract	Real-world data sets often provide multiple types of information about the same set of entities. This data is well represented by multi-view graphs, which consist of several distinct sets of edges over the same nodes. These can be used to analyze how entities interact from different viewpoints. Combining multiple views improves the quality of inferences drawn from the underlying data, which has increased interest in developing efficient multi-view graph embedding methods. We propose an algorithm, C-RSP, that generates a common (C) embedding of a multi-view graph using Randomized Shortest Paths (RSP). This algorithm generates a dissimilarity measure between nodes by minimizing the expected cost of a random walk between any two nodes across all views of a multi-view graph, in doing so encoding both the local and global structure of the graph. We test C-RSP on both real and synthetic data and show that it outperforms benchmark algorithms at embedding and clustering tasks while remaining computationally efficient.
Tasks	Graph Embedding
Published	2018-08-20
URL	http://arxiv.org/abs/1808.06560v1
PDF	http://arxiv.org/pdf/1808.06560v1.pdf
PWC	https://paperswithcode.com/paper/multi-view-graph-embedding-using-randomized
Repo
Framework

Unsupervised Disambiguation of Syncretism in Inflected Lexicons


Title	Unsupervised Disambiguation of Syncretism in Inflected Lexicons
Authors	Ryan Cotterell, Christo Kirov, Sabrina J. Mielke, Jason Eisner
Abstract	Lexical ambiguity makes it difficult to compute various useful statistics of a corpus. A given word form might represent any of several morphological feature bundles. One can, however, use unsupervised learning (as in EM) to fit a model that probabilistically disambiguates word forms. We present such an approach, which employs a neural network to smoothly model a prior distribution over feature bundles (even rare ones). Although this basic model does not consider a token’s context, that very property allows it to operate on a simple list of unigram type counts, partitioning each count among different analyses of that unigram. We discuss evaluation metrics for this novel task and report results on 5 languages.
Tasks
Published	2018-06-10
URL	https://arxiv.org/abs/1806.03740v2
PDF	https://arxiv.org/pdf/1806.03740v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-disambiguation-of-syncretism-in
Repo
Framework
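
The abstract above describes using EM to partition each unigram type count among the possible analyses of that word form. A minimal sketch of that idea follows, under two loud simplifications that are not from the paper: the neural prior is replaced by a plain categorical prior over feature bundles, and every candidate bundle is assumed equally compatible with its form. The function name `em_disambiguate`, the toy lexicon, and the bundle labels are hypothetical.

```python
def em_disambiguate(counts, candidates, iterations=50):
    """Split each word form's count among its candidate feature bundles.

    counts:     dict mapping form -> corpus count
    candidates: dict mapping form -> list of possible feature bundles
    Returns (prior over bundles, dict form -> {bundle: fractional count}).
    """
    bundles = sorted({b for bs in candidates.values() for b in bs})
    prior = {b: 1.0 / len(bundles) for b in bundles}
    split = {}
    for _ in range(iterations):
        expected = {b: 0.0 for b in bundles}
        # E-step: partition each count in proportion to the current prior.
        for form, count in counts.items():
            z = sum(prior[b] for b in candidates[form])
            split[form] = {b: count * prior[b] / z for b in candidates[form]}
            for b, c in split[form].items():
                expected[b] += c
        # M-step: re-estimate the prior from the expected counts.
        total = sum(expected.values())
        prior = {b: expected[b] / total for b in bundles}
    return prior, split


# Toy lexicon: "put" is syncretic between past and present.
counts = {"walked": 90, "walks": 30, "put": 60}
candidates = {"walked": ["PST"], "walks": ["PRS"], "put": ["PST", "PRS"]}
prior, split = em_disambiguate(counts, candidates)
```

In this toy case the unambiguous forms pull the prior toward past tense, so the 60 occurrences of "put" end up split roughly 45 to 15 between the past and present analyses. As in the abstract, only type counts are used and no token context is consulted.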