Paper Group ANR 449
Model selection by minimum description length: Lower-bound sample sizes for the Fisher information approximation
Title | Model selection by minimum description length: Lower-bound sample sizes for the Fisher information approximation |
Authors | Daniel W. Heck, Morten Moshagen, Edgar Erdfelder |
Abstract | The Fisher information approximation (FIA) is an implementation of the minimum description length principle for model selection. Unlike information criteria such as AIC or BIC, it has the advantage of taking the functional form of a model into account. Unfortunately, FIA can be misleading in finite samples, resulting in an inversion of the correct rank order of complexity terms for competing models in the worst case. As a remedy, we propose a lower-bound $N'$ for the sample size that suffices to preclude such errors. We illustrate the approach using three examples from the family of multinomial processing tree models. |
Tasks | Model Selection |
Published | 2018-08-01 |
URL | http://arxiv.org/abs/1808.00212v1 |
http://arxiv.org/pdf/1808.00212v1.pdf | |
PWC | https://paperswithcode.com/paper/model-selection-by-minimum-description-length |
Repo | |
Framework | |
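For orientation, the FIA criterion named in the abstract above is commonly written in the MDL literature in the form below. The notation is an assumption and does not come from this listing: $k$ is the number of free parameters, $N$ the sample size, $\hat{\theta}$ the maximum-likelihood estimate, and $I(\theta)$ the Fisher information of a single observation.

```latex
\mathrm{FIA} = -\ln f(x \mid \hat{\theta})
             + \frac{k}{2} \ln \frac{N}{2\pi}
             + \ln \int_{\Theta} \sqrt{\det I(\theta)} \, d\theta
```

Read this way, the second term grows with $N$ while the third is constant in $N$. In small samples the constant term can therefore dominate, so a model with more parameters may receive the smaller total penalty. This appears to be the rank inversion that the lower bound $N'$ is meant to rule out.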
Convolutional Neural Networks combined with Runge-Kutta Methods
Title | Convolutional Neural Networks combined with Runge-Kutta Methods |
Authors | Mai Zhu, Bo Chang, Chong Fu |
Abstract | A convolutional neural network for image classification can be constructed mathematically since it can be regarded as a multi-period dynamical system. In this paper, a novel approach is proposed to construct network models from the dynamical systems view. Since a pre-activation residual network can be deemed an approximation of a time-dependent dynamical system using the forward Euler method, higher order Runge-Kutta methods (RK methods) can be utilized to build network models in order to achieve higher accuracy. The model constructed in such a way is referred to as the Runge-Kutta Convolutional Neural Network (RKNet). RK methods also provide an interpretation of Dense Convolutional Networks (DenseNets) and Convolutional Neural Networks with Alternately Updated Clique (CliqueNets) from the dynamical systems view. The proposed methods are evaluated on benchmark datasets: CIFAR-10/100, SVHN and ImageNet. The experimental results are consistent with the theoretical properties of RK methods and support the dynamical systems interpretation. Moreover, the experimental results show that the RKNets are superior to the state-of-the-art network models on CIFAR-10 and on par on CIFAR-100, SVHN and ImageNet. |
Tasks | Image Classification |
Published | 2018-02-24 |
URL | http://arxiv.org/abs/1802.08831v6 |
http://arxiv.org/pdf/1802.08831v6.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-combined-with |
Repo | |
Framework | |
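The correspondence the abstract above draws between a residual block and a forward Euler step can be illustrated on a toy problem. The sketch below is not the RKNet architecture: the scalar function `f` is a hypothetical stand-in for a learned residual branch, the dynamics are time-independent for simplicity, and Heun's method is chosen here only as one example of a second-order Runge-Kutta scheme.

```python
import math


def euler_step(f, x, h):
    # Forward Euler: same form as a residual block, x + h * f(x).
    return x + h * f(x)


def heun_step(f, x, h):
    # Heun's method (a second-order Runge-Kutta scheme):
    # average the slope at the start and at the Euler-predicted end point.
    k1 = f(x)
    k2 = f(x + h * k1)
    return x + 0.5 * h * (k1 + k2)


def integrate(step, f, x0, h, n_steps):
    # Apply the same update n_steps times, like stacking n_steps blocks.
    x = x0
    for _ in range(n_steps):
        x = step(f, x, h)
    return x


def f(x):
    # Toy dynamics dx/dt = -x (assumption: stands in for a learned function).
    return -x


exact = math.exp(-1.0)  # true solution at t = 1 for x(0) = 1
euler_err = abs(integrate(euler_step, f, 1.0, 0.1, 10) - exact)
heun_err = abs(integrate(heun_step, f, 1.0, 0.1, 10) - exact)
print(f"Euler error: {euler_err:.5f}, Heun error: {heun_err:.5f}")
```

With the same number of steps, the second-order scheme tracks the exact solution more closely. This is the numerical property the abstract appeals to when it motivates higher-order RK blocks.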
Learning Universal Sentence Representations with Mean-Max Attention Autoencoder
Title | Learning Universal Sentence Representations with Mean-Max Attention Autoencoder |
Authors | Minghua Zhang, Yunfang Wu, Weikang Li, Wei Li |
Abstract | In order to learn universal sentence representations, previous methods focus on complex recurrent neural networks or supervised learning. In this paper, we propose a mean-max attention autoencoder (mean-max AAE) within the encoder-decoder framework. Our autoencoder relies entirely on the MultiHead self-attention mechanism to reconstruct the input sequence. In the encoding, we propose a mean-max strategy that applies both mean and max pooling operations over the hidden vectors to capture diverse information of the input. To enable the information to steer the reconstruction process dynamically, the decoder performs attention over the mean-max representation. By training our model on a large collection of unlabelled data, we obtain high-quality representations of sentences. Experimental results on a broad range of 10 transfer tasks demonstrate that our model outperforms the state-of-the-art unsupervised single methods, including the classical skip-thoughts and the advanced skip-thoughts+LN model. Furthermore, compared with the traditional recurrent neural network, our mean-max AAE greatly reduces the training time. |
Tasks | |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06590v1 |
http://arxiv.org/pdf/1809.06590v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-universal-sentence-representations |
Repo | |
Framework | |
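The mean-max strategy described in the abstract above pools the encoder's hidden vectors in two ways. A minimal sketch of the pooling step alone is given below. The function name `mean_max_pool` and the list-of-lists input are hypothetical, and how the paper combines the two pooled vectors afterwards is not stated in this listing, so they are simply returned as a pair.

```python
def mean_max_pool(hidden):
    """Pool per-token hidden vectors into one mean vector and one max vector.

    hidden: non-empty list of equal-length lists of floats, one per token.
    """
    if not hidden:
        raise ValueError("need at least one hidden vector")
    # Transpose so each tuple holds one feature dimension across all tokens.
    dims = list(zip(*hidden))
    mean_vec = [sum(d) / len(d) for d in dims]
    max_vec = [max(d) for d in dims]
    return mean_vec, max_vec


# Three tokens with two-dimensional hidden states (made-up numbers).
hidden = [[1.0, -2.0], [3.0, 0.0], [2.0, 4.0]]
mean_vec, max_vec = mean_max_pool(hidden)
print(mean_vec, max_vec)
```

Mean pooling summarises the whole sequence, while max pooling keeps the strongest activation per dimension, which is one reading of the "diverse information" the abstract mentions.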
Learning Visual Knowledge Memory Networks for Visual Question Answering
Title | Learning Visual Knowledge Memory Networks for Visual Question Answering |
Authors | Zhou Su, Chen Zhu, Yinpeng Dong, Dongqi Cai, Yurong Chen, Jianguo Li |
Abstract | Visual question answering (VQA) requires joint comprehension of images and natural language questions, where many questions can’t be directly or clearly answered from visual content but require reasoning from structured human knowledge with confirmation from visual content. This paper proposes the visual knowledge memory network (VKMN) to address this issue, which seamlessly incorporates structured human knowledge and deep visual features into memory networks in an end-to-end learning framework. Compared to existing methods that leverage external knowledge to support VQA, this paper places more emphasis on two missing mechanisms. The first is the mechanism for integrating visual contents with knowledge facts. VKMN handles this issue by embedding knowledge triples (subject, relation, target) and deep visual features jointly into the visual knowledge features. The second is the mechanism for handling multiple knowledge facts expanding from question and answer pairs. VKMN stores the joint embedding using a key-value pair structure in the memory networks so that it is easy to handle multiple facts. Experiments show that the proposed method achieves promising results on both the VQA v1.0 and v2.0 benchmarks, while outperforming state-of-the-art methods on the knowledge-reasoning related questions. |
Tasks | Question Answering, Visual Question Answering |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.04860v1 |
http://arxiv.org/pdf/1806.04860v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-visual-knowledge-memory-networks-for |
Repo | |
Framework | |
Extreme Learning Machine with Local Connections
Title | Extreme Learning Machine with Local Connections |
Authors | Feng Li, Sibo Yang, Huanhuan Huang, Wei Wu |
Abstract | This paper is concerned with the sparsification of the input-hidden weights of ELM (Extreme Learning Machine). For ordinary feedforward neural networks, the sparsification is usually done by introducing a certain regularization technique into the learning process of the network. But this strategy cannot be applied to ELM, since the input-hidden weights of ELM are supposed to be randomly chosen rather than learned. To this end, we propose a modified ELM, called ELM-LC (ELM with local connections), which is designed for the sparsification of the input-hidden weights as follows: the hidden nodes and the input nodes are each divided into several corresponding groups, and an input node group is fully connected with its corresponding hidden node group, but is not connected with any other hidden node group. As in the usual ELM, the input-hidden weights are randomly given, and the hidden-output weights are obtained through least-squares learning. In the numerical simulations on some benchmark problems, the new ELM-LC behaves better than the traditional ELM. |
Tasks | |
Published | 2018-01-22 |
URL | http://arxiv.org/abs/1801.06975v1 |
http://arxiv.org/pdf/1801.06975v1.pdf | |
PWC | https://paperswithcode.com/paper/extreme-learning-machine-with-local |
Repo | |
Framework | |
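The grouped connection pattern described in the ELM-LC abstract above amounts to a block-diagonal input-hidden weight matrix. The sketch below builds only that matrix. The function name, the even split into groups, and the uniform range are assumptions for illustration, and the least-squares fit of the hidden-output weights is omitted.

```python
import random


def local_connection_weights(n_inputs, n_hidden, n_groups, seed=0):
    """Random input-hidden weights that are nonzero only within matching groups."""
    if n_inputs % n_groups or n_hidden % n_groups:
        # Assumption: groups are equal-sized; the abstract does not say so.
        raise ValueError("n_inputs and n_hidden must be divisible by n_groups")
    rng = random.Random(seed)
    in_size = n_inputs // n_groups
    hid_size = n_hidden // n_groups
    # Start fully disconnected, then fill in one block per group.
    weights = [[0.0] * n_hidden for _ in range(n_inputs)]
    for g in range(n_groups):
        for i in range(g * in_size, (g + 1) * in_size):
            for j in range(g * hid_size, (g + 1) * hid_size):
                weights[i][j] = rng.uniform(-1.0, 1.0)
    return weights


W = local_connection_weights(n_inputs=4, n_hidden=6, n_groups=2)
nonzero = sum(1 for row in W for w in row if w != 0.0)
print(f"{nonzero} of {4 * 6} input-hidden weights are active")
```

With two groups, half of the 24 possible connections are dropped, and the weights that remain are still drawn at random rather than learned, in keeping with the usual ELM setup.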
Dynamic Variational Autoencoders for Visual Process Modeling
Title | Dynamic Variational Autoencoders for Visual Process Modeling |
Authors | Alexander Sagel, Hao Shen |
Abstract | This work studies the problem of modeling visual processes by leveraging deep generative architectures for learning linear, Gaussian representations from observed sequences. We propose a joint learning framework, combining a vector autoregressive model and Variational Autoencoders. This results in an architecture that allows Variational Autoencoders to simultaneously learn a non-linear observation as well as a linear state model from sequences of frames. We validate our approach on artificial sequences and dynamic textures. |
Tasks | |
Published | 2018-03-20 |
URL | https://arxiv.org/abs/1803.07488v3 |
https://arxiv.org/pdf/1803.07488v3.pdf | |
PWC | https://paperswithcode.com/paper/linearizing-visual-processes-with |
Repo | |
Framework | |
PAC Ranking from Pairwise and Listwise Queries: Lower Bounds and Upper Bounds
Title | PAC Ranking from Pairwise and Listwise Queries: Lower Bounds and Upper Bounds |
Authors | Wenbo Ren, Jia Liu, Ness B. Shroff |
Abstract | This paper explores the adaptive (active) PAC (probably approximately correct) top-$k$ ranking (i.e., top-$k$ item selection) and total ranking problems from $l$-wise ($l\geq 2$) comparisons under the multinomial logit (MNL) model. By adaptively choosing sets to query and observing the noisy output of the most favored item of each query, we want to design ranking algorithms that recover the top-$k$ or total ranking using as few queries as possible. For the PAC top-$k$ ranking problem, we derive a lower bound on the sample complexity (aka number of queries), and propose an algorithm that is sample-complexity-optimal up to an $O(\log(k+l)/\log{k})$ factor. When $l=2$ (i.e., pairwise comparisons) or $l=O(poly(k))$, this algorithm matches the lower bound. For the PAC total ranking problem, we derive a tight lower bound, and propose an algorithm that matches the lower bound. When $l=2$, the MNL model reduces to the popular Plackett-Luce (PL) model. In this setting, our results still outperform the state-of-the-art both theoretically and numerically. We also compare our algorithms with the state-of-the-art using synthetic data as well as real-world data to verify the efficiency of our algorithms. |
Tasks | |
Published | 2018-06-08 |
URL | http://arxiv.org/abs/1806.02970v2 |
http://arxiv.org/pdf/1806.02970v2.pdf | |
PWC | https://paperswithcode.com/paper/pac-ranking-from-pairwise-and-listwise |
Repo | |
Framework | |
Unsupervised learning with GLRM feature selection reveals novel traumatic brain injury phenotypes
Title | Unsupervised learning with GLRM feature selection reveals novel traumatic brain injury phenotypes |
Authors | Aaron J. Masino, Kaitlin A. Folweiler |
Abstract | Baseline injury categorization is important to traumatic brain injury (TBI) research and treatment. Current categorization is dominated by symptom-based scores that insufficiently capture injury heterogeneity. In this work, we apply unsupervised clustering to identify novel TBI phenotypes. Our approach uses a generalized low-rank model (GLRM) for feature selection in a procedure analogous to wrapper methods. The resulting clusters reveal four novel TBI phenotypes that have distinct feature profiles and correlate with 90-day functional and cognitive status. |
Tasks | Feature Selection |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1812.00030v1 |
http://arxiv.org/pdf/1812.00030v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-with-glrm-feature |
Repo | |
Framework | |
Collaborative Planning for Mixed-Autonomy Lane Merging
Title | Collaborative Planning for Mixed-Autonomy Lane Merging |
Authors | Shray Bansal, Akansel Cosgun, Alireza Nakhaei, Kikuo Fujimura |
Abstract | Driving is a social activity: drivers often indicate their intent to change lanes via motion cues. We consider mixed-autonomy traffic where a Human-driven Vehicle (HV) and an Autonomous Vehicle (AV) drive together. We propose a planning framework where the degree to which the AV considers the other agent’s reward is controlled by a selfishness factor. We test our approach on a simulated two-lane highway where the AV and HV merge into each other’s lanes. In a user study with 21 subjects and 6 different selfishness factors, we found that our planning approach was sound and that both agents had shorter merging times when a factor that balances the rewards for the two agents was chosen. Our results on double lane merging suggest that it is a non-zero-sum game and encourage further investigation of collaborative decision-making algorithms for mixed-autonomy traffic. |
Tasks | Decision Making |
Published | 2018-08-07 |
URL | http://arxiv.org/abs/1808.02550v1 |
http://arxiv.org/pdf/1808.02550v1.pdf | |
PWC | https://paperswithcode.com/paper/collaborative-planning-for-mixed-autonomy |
Repo | |
Framework | |
A Latent Gaussian Mixture Model for Clustering Longitudinal Data
Title | A Latent Gaussian Mixture Model for Clustering Longitudinal Data |
Authors | Vanessa S. E. Bierling, Paul D. McNicholas |
Abstract | Finite mixture models have become a popular tool for clustering. Amongst other uses, they have been applied for clustering longitudinal data and clustering high-dimensional data. In the latter case, a latent Gaussian mixture model is sometimes used. Although there has been much work on clustering using latent variables and on clustering longitudinal data, respectively, there has been a paucity of work that combines these features. An approach is developed for clustering longitudinal data with many time points based on an extension of the mixture of common factor analyzers model. A variation of the expectation-maximization algorithm is used for parameter estimation and the Bayesian information criterion is used for model selection. The approach is illustrated using real and simulated data. |
Tasks | Model Selection |
Published | 2018-04-13 |
URL | http://arxiv.org/abs/1804.05133v1 |
http://arxiv.org/pdf/1804.05133v1.pdf | |
PWC | https://paperswithcode.com/paper/a-latent-gaussian-mixture-model-for |
Repo | |
Framework | |
Anytime Stochastic Gradient Descent: A Time to Hear from all the Workers
Title | Anytime Stochastic Gradient Descent: A Time to Hear from all the Workers |
Authors | Nuwan Ferdinand, Stark Draper |
Abstract | In this paper, we focus on approaches to parallelizing stochastic gradient descent (SGD) wherein data is farmed out to a set of workers, the results of which, after a number of updates, are then combined at a central master node. Although such synchronized SGD approaches parallelize well in idealized computing environments, they often fail to realize their promised computational acceleration in practical settings. One cause is slow workers, termed stragglers, who can cause the fusion step at the master node to stall, greatly slowing convergence. In many straggler mitigation approaches, work completed by these nodes, while only partial, is discarded completely. In this paper, we propose an approach to parallelizing synchronous SGD that exploits the work completed by all workers. The central idea is to fix the computation time of each worker and then to combine distinct contributions of all workers. We provide a convergence analysis and optimize the combination function. Our numerical results demonstrate an improvement of several factors of magnitude in comparison to existing methods. |
Tasks | |
Published | 2018-10-06 |
URL | http://arxiv.org/abs/1810.02976v1 |
http://arxiv.org/pdf/1810.02976v1.pdf | |
PWC | https://paperswithcode.com/paper/anytime-stochastic-gradient-descent-a-time-to |
Repo | |
Framework | |
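The central idea in the abstract above is to give every worker the same time budget and then combine whatever each one finished. It can be sketched on a toy objective. Everything concrete below is an assumption for illustration: the quadratic objective, the deterministic gradient steps standing in for SGD updates, the step counts, and in particular the weighting by completed steps. The abstract says the combination function is optimized but this listing does not give its form.

```python
def grad(x):
    # Gradient of the toy objective 0.5 * (x - 3)^2 (assumed, minimum at x = 3).
    return x - 3.0


def worker_run(x0, n_steps, lr=0.1):
    # One worker: take as many gradient steps as fit in its time budget.
    x = x0
    for _ in range(n_steps):
        x -= lr * grad(x)
    return x


def combine(results, steps):
    # Hypothetical combination rule: weight each result by the steps completed.
    total = sum(steps)
    if total == 0:
        raise ValueError("no worker completed any step")
    return sum(r * s for r, s in zip(results, steps)) / total


x0 = 0.0
steps_done = [10, 7, 2]  # the third worker is a straggler but still contributes
results = [worker_run(x0, s) for s in steps_done]
combined = combine(results, steps_done)
print(f"per-worker: {results}, combined: {combined:.3f}")
```

Unlike schemes that wait for or discard the straggler, the master here never stalls, and the slow worker's partial progress still enters the average with a small weight.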
Unsupervised Facial Geometry Learning for Sketch to Photo Synthesis
Title | Unsupervised Facial Geometry Learning for Sketch to Photo Synthesis |
Authors | Hadi Kazemi, Fariborz Taherkhani, Nasser M. Nasrabadi |
Abstract | Face sketch-photo synthesis is a critical application in law enforcement and the digital entertainment industry, where the goal is to learn the mapping between a face sketch image and its corresponding photo-realistic image. However, the limited number of paired sketch-photo training data usually prevents the current frameworks from learning a robust mapping between the geometry of sketches and their matching photo-realistic images. Consequently, in this work, we present an approach for learning to synthesize a photo-realistic image from a face sketch in an unsupervised fashion. In contrast to current unsupervised image-to-image translation techniques, our framework leverages a novel perceptual discriminator to learn the geometry of the human face. Learning facial prior information empowers the network to remove the geometrical artifacts in the face sketch. We demonstrate that a simultaneous optimization of the face photo generator network, employing the proposed perceptual discriminator in combination with a texture-wise discriminator, results in a significant improvement in quality and recognition rate of the synthesized photos. We evaluate the proposed network by conducting extensive experiments on multiple baseline sketch-photo datasets. |
Tasks | Image-to-Image Translation, Unsupervised Image-To-Image Translation |
Published | 2018-10-12 |
URL | http://arxiv.org/abs/1810.05361v1 |
http://arxiv.org/pdf/1810.05361v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-facial-geometry-learning-for |
Repo | |
Framework | |
Adversarial point set registration
Title | Adversarial point set registration |
Authors | Sergei Divakov, Ivan Oseledets |
Abstract | We present a novel approach to point set registration which is based on one-shot adversarial learning. The idea of the algorithm is inspired by recent successes of generative adversarial networks. Treating the point clouds as three-dimensional probability distributions, we develop a one-shot adversarial optimization procedure, in which we train a critic neural network to distinguish between source and target point sets, while simultaneously learning the parameters of the transformation to trick the critic into confusing the points. In contrast to most existing algorithms for point set registration, ours does not rely on any correspondences between the point clouds. We demonstrate the performance of the algorithm on several challenging benchmarks and compare it to the existing baselines. |
Tasks | |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08139v1 |
http://arxiv.org/pdf/1811.08139v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-point-set-registration |
Repo | |
Framework | |
BPE and computer-extracted parenchymal enhancement for breast cancer risk, response monitoring, and prognosis
Title | BPE and computer-extracted parenchymal enhancement for breast cancer risk, response monitoring, and prognosis |
Authors | Bas H. M. van der Velden |
Abstract | Functional behavior of breast cancer - representing underlying biology - can be analyzed using MRI. The most widely used breast MR imaging protocol is dynamic contrast-enhanced T1-weighted imaging. The cancer enhances on dynamic contrast-enhanced MR imaging because the contrast agent leaks from the leaky vessels into the interstitial space. The contrast agent subsequently leaks back into the vascular space, creating a washout effect. The normal parenchymal tissue of the breast can also enhance after contrast injection. This enhancement generally increases over time. Typically, a radiologist assesses this background parenchymal enhancement (BPE) using the Breast Imaging Reporting and Data System (BI-RADS). According to the BI-RADS, BPE refers to the volume of enhancement and the intensity of enhancement and is divided into four incremental categories: minimal, mild, moderate, and marked. Researchers have developed semi-automatic and automatic methods to extract properties of BPE from MR images. For clarity, in this syllabus the BI-RADS definition will be referred to as BPE, whereas the computer-extracted properties will not. Both BPE and computer-extracted parenchymal enhancement properties have been linked to screening and diagnosis, hormone status and age, risk of development of breast cancer, response monitoring, and prognosis. |
Tasks | |
Published | 2018-09-14 |
URL | http://arxiv.org/abs/1809.05510v1 |
http://arxiv.org/pdf/1809.05510v1.pdf | |
PWC | https://paperswithcode.com/paper/bpe-and-computer-extracted-parenchymal |
Repo | |
Framework | |
Thwarting Adversarial Examples: An $L_0$-Robust Sparse Fourier Transform
Title | Thwarting Adversarial Examples: An $L_0$-Robust Sparse Fourier Transform |
Authors | Mitali Bafna, Jack Murtagh, Nikhil Vyas |
Abstract | We give a new algorithm for approximating the Discrete Fourier transform of an approximately sparse signal that has been corrupted by worst-case $L_0$ noise, that is, a bounded number of coordinates of the signal have been corrupted arbitrarily. Our techniques generalize to a wide range of linear transformations that are used in data analysis, such as the Discrete Cosine and Sine transforms, the Hadamard transform, and their high-dimensional analogs. We use our algorithm to successfully defend against well-known $L_0$ adversaries in the setting of image classification. We give experimental results on the Jacobian-based Saliency Map Attack (JSMA) and the Carlini Wagner (CW) $L_0$ attack on the MNIST and Fashion-MNIST datasets as well as the Adversarial Patch on the ImageNet dataset. |
Tasks | Image Classification |
Published | 2018-12-12 |
URL | http://arxiv.org/abs/1812.05013v1 |
http://arxiv.org/pdf/1812.05013v1.pdf | |
PWC | https://paperswithcode.com/paper/thwarting-adversarial-examples-an-l_0 |
Repo | |
Framework | |