Paper Group ANR 330
On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo
Title | On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo |
Authors | Niladri S. Chatterji, Nicolas Flammarion, Yi-An Ma, Peter L. Bartlett, Michael I. Jordan |
Abstract | We provide convergence guarantees in Wasserstein distance for a variety of variance-reduction methods: SAGA Langevin diffusion, SVRG Langevin diffusion and control-variate underdamped Langevin diffusion. We analyze these methods under a uniform set of assumptions on the log-posterior distribution, assuming it to be smooth, strongly convex and Hessian Lipschitz. This is achieved by a new proof technique combining ideas from finite-sum optimization and the analysis of sampling methods. Our sharp theoretical bounds allow us to identify regimes of interest where each method performs better than the others. Our theory is verified with experiments on real-world and synthetic datasets. |
Tasks | |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1802.05431v1 |
http://arxiv.org/pdf/1802.05431v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-theory-of-variance-reduction-for |
Repo | |
Framework | |
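The SVRG Langevin idea named in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: the one-dimensional potential f(x) = sum_i 0.5 * c_i * (x - a_i)^2, the function names, and the step-size choice are all assumptions made for illustration. The sketch shows the variance-reduced gradient (full gradient at an anchor point plus a rescaled minibatch correction) plugged into a discretized overdamped Langevin update.

```python
import math
import random


def grad_i(x, datum):
    # Gradient of the hypothetical per-datum potential 0.5 * c * (x - a)^2.
    a, c = datum
    return c * (x - a)


def full_grad(x, data):
    # Gradient of the full potential f(x) = sum_i f_i(x).
    return sum(grad_i(x, d) for d in data)


def svrg_grad(x, anchor, anchor_grad, batch, n_total):
    # Variance-reduced estimate: full gradient at the anchor plus a rescaled
    # minibatch correction. Its expectation over batches is full_grad(x, data).
    correction = sum(grad_i(x, d) - grad_i(anchor, d) for d in batch)
    return anchor_grad + (n_total / len(batch)) * correction


def svrg_langevin(data, step, epochs, inner_steps, batch_size, seed=0):
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(epochs):
        anchor = x
        anchor_grad = full_grad(anchor, data)  # one full pass per epoch
        for _ in range(inner_steps):
            batch = rng.sample(data, batch_size)
            g = svrg_grad(x, anchor, anchor_grad, batch, len(data))
            # Euler-Maruyama step of overdamped Langevin dynamics.
            x = x - step * g + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
            samples.append(x)
    return samples
```

When the minibatch is the whole dataset the correction term is exact, so `svrg_grad` reduces to the full gradient. The step size must be small relative to the total curvature (here the sum of the `c` values) for the iteration to stay stable.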
MoE-SPNet: A Mixture-of-Experts Scene Parsing Network
Title | MoE-SPNet: A Mixture-of-Experts Scene Parsing Network |
Authors | Huan Fu, Mingming Gong, Chaohui Wang, Dacheng Tao |
Abstract | Scene parsing is an indispensable component in understanding the semantics within a scene. Traditional methods rely on handcrafted local features and probabilistic graphical models to incorporate local and global cues. Recently, methods based on fully convolutional neural networks have achieved new records on scene parsing. An important strategy common to these methods is the aggregation of hierarchical features yielded by a deep convolutional neural network. However, typical algorithms usually aggregate hierarchical convolutional features via concatenation or linear combination, which cannot sufficiently exploit the diversities of contextual information in multi-scale features and the spatial inhomogeneity of a scene. In this paper, we propose a mixture-of-experts scene parsing network (MoE-SPNet) that incorporates a convolutional mixture-of-experts layer to assess the importance of features from different levels and at different spatial locations. In addition, we propose a variant of mixture-of-experts called the adaptive hierarchical feature aggregation (AHFA) mechanism which can be incorporated into existing scene parsing networks that use skip-connections to fuse features layer-wisely. In the proposed networks, different levels of features at each spatial location are adaptively re-weighted according to the local structure and surrounding contextual information before aggregation. We demonstrate the effectiveness of the proposed methods on two scene parsing datasets including PASCAL VOC 2012 and SceneParse150 based on two kinds of baseline models FCN-8s and DeepLab-ASPP. |
Tasks | Scene Parsing |
Published | 2018-06-19 |
URL | http://arxiv.org/abs/1806.07049v1 |
http://arxiv.org/pdf/1806.07049v1.pdf | |
PWC | https://paperswithcode.com/paper/moe-spnet-a-mixture-of-experts-scene-parsing |
Repo | |
Framework | |
Convolutional neural network compression for natural language processing
Title | Convolutional neural network compression for natural language processing |
Authors | Krzysztof Wróbel, Marcin Pietroń, Maciej Wielgosz, Michał Karwatowski, Kazimierz Wiatr |
Abstract | Convolutional neural networks are modern models that are very efficient in many classification tasks. They were originally created for image processing purposes. Then some trials were performed to use them in different domains like natural language processing. The artificial intelligence systems (like humanoid robots) are very often based on embedded systems with constraints on memory, power consumption etc. Therefore convolutional neural network because of its memory capacity should be reduced to be mapped to given hardware. In this paper, results are presented of compressing the efficient convolutional neural networks for sentiment analysis. The main steps are quantization and pruning processes. The method responsible for mapping compressed network to FPGA and results of this implementation are presented. The described simulations showed that 5-bit width is enough to have no drop in accuracy from floating point version of the network. Additionally, significant memory footprint reduction was achieved (from 85% up to 93%). |
Tasks | Neural Network Compression, Quantization, Sentiment Analysis |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.10796v1 |
http://arxiv.org/pdf/1805.10796v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-network-compression-for |
Repo | |
Framework | |
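The abstract above reports that a 5-bit width was sufficient, but this listing does not give the quantization scheme itself. As a rough illustration only, the sketch below uses symmetric uniform quantization, which is an assumption and may differ from the paper's method; `quantize_uniform` and `dequantize` are hypothetical names.

```python
def quantize_uniform(weights, bits):
    # Symmetric uniform quantization to signed integer codes.
    # Returns the codes and the scale needed to map them back to floats.
    q_max = 2 ** (bits - 1) - 1  # 15 for a 5-bit signed code
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / q_max if max_abs > 0 else 1.0
    codes = [max(-q_max, min(q_max, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    # Map integer codes back to approximate floating-point weights.
    return [c * scale for c in codes]
```

With this scheme each restored weight differs from the original by at most half a quantization step (`scale / 2`), so the error shrinks as the bit width grows.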
Robotics Rights and Ethics Rules
Title | Robotics Rights and Ethics Rules |
Authors | Tuncay Yigit, Utku Kose, Nilgun Sengoz |
Abstract | It is very important to adhere strictly to ethical and social influences when delivering most of our life to artificial intelligence systems. With industry 4.0, the internet of things, data analysis and automation have begun to be of great importance in our lives. With the Japanese version of Industry 5.0, it has come to our attention that machine-human interaction and human intelligence are working in harmony with the cognitive computer. In this context, robots working on artificial intelligence algorithms co-ordinated with the development of technology have begun to enter our lives. But the consequences of the recent complaints of the Robots have been that important issues have arisen about how to be followed in terms of intellectual property and ethics. Although there are no laws regulating robots in our country at present, laws on robot ethics and rights abroad have entered into force. This means that it is important that we organize the necessary arrangements in the way that robots and artificial intelligence are so important in the new world order. In this study, it was aimed to examine the existing rules of machine and robot ethics and to set an example for the arrangements to be made in our country, and various discussions were given in this context. |
Tasks | |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.08885v1 |
http://arxiv.org/pdf/1809.08885v1.pdf | |
PWC | https://paperswithcode.com/paper/robotics-rights-and-ethics-rules |
Repo | |
Framework | |
Benefits of over-parameterization with EM
Title | Benefits of over-parameterization with EM |
Authors | Ji Xu, Daniel Hsu, Arian Maleki |
Abstract | Expectation Maximization (EM) is among the most popular algorithms for maximum likelihood estimation, but it is generally only guaranteed to find its stationary points of the log-likelihood objective. The goal of this article is to present theoretical and empirical evidence that over-parameterization can help EM avoid spurious local optima in the log-likelihood. We consider the problem of estimating the mean vectors of a Gaussian mixture model in a scenario where the mixing weights are known. Our study shows that the global behavior of EM, when one uses an over-parameterized model in which the mixing weights are treated as unknown, is better than that when one uses the (correct) model with the mixing weights fixed to the known values. For symmetric Gaussian mixtures with two components, we prove that introducing the (statistically redundant) weight parameters enables EM to find the global maximizer of the log-likelihood starting from almost any initial mean parameters, whereas EM without this over-parameterization may very often fail. For other Gaussian mixtures, we provide empirical evidence that shows similar behavior. Our results corroborate the value of over-parameterization in solving non-convex optimization problems, previously observed in other domains. |
Tasks | |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.11344v1 |
http://arxiv.org/pdf/1810.11344v1.pdf | |
PWC | https://paperswithcode.com/paper/benefits-of-over-parameterization-with-em |
Repo | |
Framework | |
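The two EM variants compared in the abstract above can be sketched for a one-dimensional toy case. The model w * N(theta, 1) + (1 - w) * N(-theta, 1), the unit variance, and all function names are assumptions chosen for illustration, not the authors' experimental setup. Setting `learn_weight=False` keeps the mixing weight fixed at its known value, and `learn_weight=True` is the over-parameterized variant that also updates it.

```python
import math
import random


def log_likelihood(xs, theta, w):
    # Log-likelihood of the mixture w * N(theta, 1) + (1 - w) * N(-theta, 1).
    total = 0.0
    for x in xs:
        p1 = w * math.exp(-0.5 * (x - theta) ** 2)
        p2 = (1.0 - w) * math.exp(-0.5 * (x + theta) ** 2)
        total += math.log((p1 + p2) / math.sqrt(2.0 * math.pi))
    return total


def em_step(xs, theta, w, learn_weight):
    # E-step: responsibility of the +theta component for each point.
    resp = []
    for x in xs:
        p1 = w * math.exp(-0.5 * (x - theta) ** 2)
        p2 = (1.0 - w) * math.exp(-0.5 * (x + theta) ** 2)
        resp.append(p1 / (p1 + p2))
    # M-step: closed-form maximizers of the expected complete-data log-likelihood.
    new_theta = sum((2.0 * r - 1.0) * x for r, x in zip(resp, xs)) / len(xs)
    new_w = sum(resp) / len(xs) if learn_weight else w
    return new_theta, new_w


def run_em(xs, theta0, w0, learn_weight, iters=50):
    # Runs EM and records the log-likelihood after every iteration.
    theta, w = theta0, w0
    history = [log_likelihood(xs, theta, w)]
    for _ in range(iters):
        theta, w = em_step(xs, theta, w, learn_weight)
        history.append(log_likelihood(xs, theta, w))
    return theta, w, history
```

Both variants never decrease the log-likelihood from one iteration to the next. Comparing the final `theta` and the last entry of `history` across the two runs, starting from a poor `theta0`, is one way to explore the behavior the abstract describes.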
Information-Weighted Neural Cache Language Models for ASR
Title | Information-Weighted Neural Cache Language Models for ASR |
Authors | Lyan Verwimp, Joris Pelemans, Hugo Van hamme, Patrick Wambacq |
Abstract | Neural cache language models (LMs) extend the idea of regular cache language models by making the cache probability dependent on the similarity between the current context and the context of the words in the cache. We make an extensive comparison of ‘regular’ cache models with neural cache models, both in terms of perplexity and WER after rescoring first-pass ASR results. Furthermore, we propose two extensions to this neural cache model that make use of the content value/information weight of the word: firstly, combining the cache probability and LM probability with an information-weighted interpolation and secondly, selectively adding only content words to the cache. We obtain a 29.9%/32.1% (validation/test set) relative improvement in perplexity with respect to a baseline LSTM LM on the WikiText-2 dataset, outperforming previous work on neural cache LMs. Additionally, we observe significant WER reductions with respect to the baseline model on the WSJ ASR task. |
Tasks | |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.08826v1 |
http://arxiv.org/pdf/1809.08826v1.pdf | |
PWC | https://paperswithcode.com/paper/information-weighted-neural-cache-language |
Repo | |
Framework | |
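As a rough illustration of the baseline neural cache described in the abstract above, the sketch below scores each cached word by the similarity between the current hidden state and the state stored with that word, then mixes the result with the LM distribution. It assumes a dot-product similarity and a fixed interpolation weight `lam`; the paper's information-weighted interpolation and content-word filtering are not reproduced here, and all names are hypothetical.

```python
import math


def cache_distribution(current_state, cache, theta):
    # cache: list of (hidden_state, word) pairs stored from earlier positions.
    # Each cached word gets weight exp(theta * <current_state, hidden_state>),
    # summed over all of its occurrences in the cache.
    scores = {}
    for state, word in cache:
        sim = sum(a * b for a, b in zip(current_state, state))
        scores[word] = scores.get(word, 0.0) + math.exp(theta * sim)
    total = sum(scores.values())
    return {word: s / total for word, s in scores.items()}


def interpolate(lm_probs, cache_probs, lam):
    # Linear interpolation with a fixed weight lam on the cache.
    # Assumes every cached word also appears in the LM vocabulary.
    return {w: (1.0 - lam) * p + lam * cache_probs.get(w, 0.0)
            for w, p in lm_probs.items()}
```

Words that occurred recently in a similar context gain probability mass, while words absent from the cache are scaled down by `1 - lam`.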
Spatially Constrained Location Prior for Scene Parsing
Title | Spatially Constrained Location Prior for Scene Parsing |
Authors | Ligang Zhang, Brijesh Verma, David Stockwell, Sujan Chowdhury |
Abstract | Semantic context is an important and useful cue for scene parsing in complicated natural images with a substantial amount of variations in objects and the environment. This paper proposes Spatially Constrained Location Prior (SCLP) for effective modelling of global and local semantic context in the scene in terms of inter-class spatial relationships. Unlike existing studies focusing on either relative or absolute location prior of objects, the SCLP effectively incorporates both relative and absolute location priors by calculating object co-occurrence frequencies in spatially constrained image blocks. The SCLP is general and can be used in conjunction with various visual feature-based prediction models, such as Artificial Neural Networks and Support Vector Machine (SVM), to enforce spatial contextual constraints on class labels. Using SVM classifiers and a linear regression model, we demonstrate that the incorporation of SCLP achieves superior performance compared to the state-of-the-art methods on the Stanford background and SIFT Flow datasets. |
Tasks | Scene Parsing |
Published | 2018-02-24 |
URL | http://arxiv.org/abs/1802.08790v1 |
http://arxiv.org/pdf/1802.08790v1.pdf | |
PWC | https://paperswithcode.com/paper/spatially-constrained-location-prior-for |
Repo | |
Framework | |
Moiré Photo Restoration Using Multiresolution Convolutional Neural Networks
Title | Moiré Photo Restoration Using Multiresolution Convolutional Neural Networks |
Authors | Yujing Sun, Yizhou Yu, Wenping Wang |
Abstract | Digital cameras and mobile phones enable us to conveniently record precious moments. While digital image quality is constantly being improved, taking high-quality photos of digital screens still remains challenging because the photos are often contaminated with moiré patterns, a result of the interference between the pixel grids of the camera sensor and the device screen. Moiré patterns can severely damage the visual quality of photos. However, few studies have aimed to solve this problem. In this paper, we introduce a novel multiresolution fully convolutional network for automatically removing moiré patterns from photos. Since a moiré pattern spans over a wide range of frequencies, our proposed network performs a nonlinear multiresolution analysis of the input image before computing how to cancel moiré artefacts within every frequency band. We also create a large-scale benchmark dataset with $100,000^+$ image pairs for investigating and evaluating moiré pattern removal algorithms. Our network achieves state-of-the-art performance on this dataset in comparison to existing learning architectures for image restoration problems. |
Tasks | Image Restoration |
Published | 2018-05-08 |
URL | http://arxiv.org/abs/1805.02996v1 |
http://arxiv.org/pdf/1805.02996v1.pdf | |
PWC | https://paperswithcode.com/paper/moire-photo-restoration-using-multiresolution |
Repo | |
Framework | |
Algorithms for metric learning via contrastive embeddings
Title | Algorithms for metric learning via contrastive embeddings |
Authors | Diego Ihara Centurion, Neshat Mohammadi, Anastasios Sidiropoulos |
Abstract | We study the problem of supervised learning a metric space under discriminative constraints. Given a universe $X$ and sets ${\cal S}, {\cal D}\subset {X \choose 2}$ of similar and dissimilar pairs, we seek to find a mapping $f:X\to Y$, into some target metric space $M=(Y,\rho)$, such that similar objects are mapped to points at distance at most $u$, and dissimilar objects are mapped to points at distance at least $\ell$. More generally, the goal is to find a mapping of maximum accuracy (that is, fraction of correctly classified pairs). We propose approximation algorithms for various versions of this problem, for the cases of Euclidean and tree metric spaces. For both of these target spaces, we obtain fully polynomial-time approximation schemes (FPTAS) for the case of perfect information. In the presence of imperfect information we present approximation algorithms that run in quasipolynomial time (QPTAS). Our algorithms use a combination of tools from metric embeddings and graph partitioning, that could be of independent interest. |
Tasks | graph partitioning, Metric Learning |
Published | 2018-07-13 |
URL | http://arxiv.org/abs/1807.04881v3 |
http://arxiv.org/pdf/1807.04881v3.pdf | |
PWC | https://paperswithcode.com/paper/algorithms-for-metric-learning-via |
Repo | |
Framework | |
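The accuracy objective defined in the abstract above (the fraction of correctly classified pairs) can be written down directly. This sketch only evaluates a given mapping; it does not implement the approximation algorithms. It assumes a Euclidean target space with the mapping supplied as a dictionary of coordinates, and `pair_accuracy` is a hypothetical name.

```python
import math


def pair_accuracy(embedding, similar, dissimilar, u, ell):
    # Fraction of constrained pairs that the embedding classifies correctly:
    # similar pairs must be at distance <= u, dissimilar pairs at distance >= ell.
    def dist(a, b):
        return math.dist(embedding[a], embedding[b])

    correct = sum(1 for a, b in similar if dist(a, b) <= u)
    correct += sum(1 for a, b in dissimilar if dist(a, b) >= ell)
    return correct / (len(similar) + len(dissimilar))
```

An accuracy of 1.0 corresponds to the perfect-information case, where every similar and dissimilar constraint is satisfied at once.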
Adaptive Diffusions for Scalable Learning over Graphs
Title | Adaptive Diffusions for Scalable Learning over Graphs |
Authors | Dimitris Berberidis, Athanasios N. Nikolakopoulos, Georgios B. Giannakis |
Abstract | Diffusion-based classifiers such as those relying on the Personalized PageRank and the Heat kernel, enjoy remarkable classification accuracy at modest computational requirements. Their performance however is affected by the extent to which the chosen diffusion captures a typically unknown label propagation mechanism, that can be specific to the underlying graph, and potentially different for each class. The present work introduces a disciplined, data-efficient approach to learning class-specific diffusion functions adapted to the underlying network topology. The novel learning approach leverages the notion of “landing probabilities” of class-specific random walks, which can be computed efficiently, thereby ensuring scalability to large graphs. This is supported by rigorous analysis of the properties of the model as well as the proposed algorithms. Furthermore, a robust version of the classifier facilitates learning even in noisy environments. Classification tests on real networks demonstrate that adapting the diffusion function to the given graph and observed labels, significantly improves the performance over fixed diffusions; reaching – and many times surpassing – the classification accuracy of computationally heavier state-of-the-art competing methods, that rely on node embeddings and deep neural networks. |
Tasks | |
Published | 2018-04-05 |
URL | http://arxiv.org/abs/1804.02081v3 |
http://arxiv.org/pdf/1804.02081v3.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-diffusions-for-scalable-learning |
Repo | |
Framework | |
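The "landing probabilities" mentioned in the abstract above are the per-step distributions of a random walk started from a class's labeled nodes. The sketch below computes them on a small undirected graph and combines them with fixed Personalized PageRank coefficients. It assumes the class score is a weighted sum of landing probabilities, and it does not include the paper's procedure for learning the coefficients; all function names are hypothetical.

```python
def landing_probabilities(adjacency, seeds, num_steps):
    # adjacency: dict mapping each node to a list of its neighbours.
    # Assumes every node has at least one neighbour.
    # Returns [p_0, ..., p_K], where p_k[v] is the probability that a walk
    # started uniformly from the seed nodes is at v after k steps.
    p = {v: 0.0 for v in adjacency}
    for s in seeds:
        p[s] = 1.0 / len(seeds)
    steps = [p]
    for _ in range(num_steps):
        nxt = {v: 0.0 for v in adjacency}
        for v, mass in p.items():
            for nb in adjacency[v]:
                nxt[nb] += mass / len(adjacency[v])
        p = nxt
        steps.append(p)
    return steps


def diffuse(steps, coeffs):
    # Class score f[v] = sum_k coeffs[k] * p_k[v], using steps p_1 .. p_K.
    return {v: sum(c * p[v] for c, p in zip(coeffs, steps[1:]))
            for v in steps[0]}


def ppr_coeffs(alpha, num_steps):
    # Fixed Personalized PageRank weights (1 - alpha) * alpha**k, with alpha
    # the continuation probability, renormalised over k = 1 .. K.
    raw = [(1.0 - alpha) * alpha ** k for k in range(1, num_steps + 1)]
    total = sum(raw)
    return [r / total for r in raw]
```

An adaptive diffusion would replace `ppr_coeffs` with coefficients fitted per class to the observed labels, while reusing the same landing probabilities.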
Accurate pedestrian localization in overhead depth images via Height-Augmented HOG
Title | Accurate pedestrian localization in overhead depth images via Height-Augmented HOG |
Authors | Werner Kroneman, Alessandro Corbetta, Federico Toschi |
Abstract | We tackle the challenge of reliably and automatically localizing pedestrians in real-life conditions through overhead depth imaging at unprecedented high-density conditions. Leveraging upon a combination of Histogram of Oriented Gradients-like feature descriptors, neural networks, data augmentation and custom data annotation strategies, this work contributes a robust and scalable machine learning-based localization algorithm, which delivers near-human localization performance in real-time, even with local pedestrian density of about 3 ped/m², a case in which most state-of-the-art algorithms degrade significantly in performance. |
Tasks | Data Augmentation |
Published | 2018-05-31 |
URL | http://arxiv.org/abs/1805.12510v1 |
http://arxiv.org/pdf/1805.12510v1.pdf | |
PWC | https://paperswithcode.com/paper/accurate-pedestrian-localization-in-overhead |
Repo | |
Framework | |
Moment Matching Training for Neural Machine Translation: A Preliminary Study
Title | Moment Matching Training for Neural Machine Translation: A Preliminary Study |
Authors | Cong Duy Vu Hoang, Ioan Calapodescu, Marc Dymetman |
Abstract | In previous works, neural sequence models have been shown to improve significantly if external prior knowledge can be provided, for instance by allowing the model to access the embeddings of explicit features during both training and inference. In this work, we propose a different point of view on how to incorporate prior knowledge in a principled way, using a moment matching framework. In this approach, the standard local cross-entropy training of the sequential model is combined with a moment matching training mode that encourages the equality of the expectations of certain predefined features between the model distribution and the empirical distribution. In particular, we show how to derive unbiased estimates of some stochastic gradients that are central to the training, and compare our framework with a formally related one: policy gradient training in reinforcement learning, pointing out some important differences in terms of the kinds of prior assumptions in both approaches. Our initial results are promising, showing the effectiveness of our proposed framework. |
Tasks | Machine Translation |
Published | 2018-12-24 |
URL | http://arxiv.org/abs/1812.09836v2 |
http://arxiv.org/pdf/1812.09836v2.pdf | |
PWC | https://paperswithcode.com/paper/moment-matching-training-for-neural-machine |
Repo | |
Framework | |
Application of End-to-End Deep Learning in Wireless Communications Systems
Title | Application of End-to-End Deep Learning in Wireless Communications Systems |
Authors | Woongsup Lee, Ohyun Jo, Minhoe Kim |
Abstract | Deep learning is a potential paradigm changer for the design of wireless communications systems (WCS), from conventional handcrafted schemes based on sophisticated mathematical models with assumptions to autonomous schemes based on the end-to-end deep learning using a large number of data. In this article, we present a basic concept of the deep learning and its application to WCS by investigating the resource allocation (RA) scheme based on a deep neural network (DNN) where multiple goals with various constraints can be satisfied through the end-to-end deep learning. Especially, the optimality and feasibility of the DNN based RA are verified through simulation. Then, we discuss the technical challenges regarding the application of deep learning in WCS. |
Tasks | |
Published | 2018-08-07 |
URL | http://arxiv.org/abs/1808.02394v1 |
http://arxiv.org/pdf/1808.02394v1.pdf | |
PWC | https://paperswithcode.com/paper/application-of-end-to-end-deep-learning-in |
Repo | |
Framework | |
Modeling Individual Differences in Game Behavior using HMM
Title | Modeling Individual Differences in Game Behavior using HMM |
Authors | Sara Bunian, Alessandro Canossa, Randy Colvin, Magy Seif El-Nasr |
Abstract | Player modeling is an important concept that has gained much attention in game research due to its utility in developing adaptive techniques to target better designs for engagement and retention. Previous work has explored modeling individual differences using machine learning algorithms performed on aggregated game actions. However, players’ individual differences may be better manifested through sequential patterns of the in-game player’s actions. While few works have explored sequential analysis of player data, none have explored the use of Hidden Markov Models (HMM) to model individual differences, which is the topic of this paper. In particular, we developed a modeling approach using data collected from players playing a Role-Playing Game (RPG). Our proposed approach is two fold: 1. We present a Hidden Markov Model (HMM) of player in-game behaviors to model individual differences, and 2. using the output of the HMM, we generate behavioral features used to classify real world players’ characteristics, including game expertise and the big five personality traits. Our results show predictive power for some of personality traits, such as game expertise and conscientiousness, but the most influential factor was game expertise. |
Tasks | |
Published | 2018-04-01 |
URL | http://arxiv.org/abs/1804.00245v1 |
http://arxiv.org/pdf/1804.00245v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-individual-differences-in-game |
Repo | |
Framework | |
Rich Character-Level Information for Korean Morphological Analysis and Part-of-Speech Tagging
Title | Rich Character-Level Information for Korean Morphological Analysis and Part-of-Speech Tagging |
Authors | Andrew Matteson, Chanhee Lee, Young-Bum Kim, Heuiseok Lim |
Abstract | Due to the fact that Korean is a highly agglutinative, character-rich language, previous work on Korean morphological analysis typically employs the use of sub-character features known as graphemes or otherwise utilizes comprehensive prior linguistic knowledge (i.e., a dictionary of known morphological transformation forms, or actions). These models have been created with the assumption that character-level, dictionary-less morphological analysis was intractable due to the number of actions required. We present, in this study, a multi-stage action-based model that can perform morphological transformation and part-of-speech tagging using arbitrary units of input and apply it to the case of character-level Korean morphological analysis. Among models that do not employ prior linguistic knowledge, we achieve state-of-the-art word and sentence-level tagging accuracy with the Sejong Korean corpus using our proposed data-driven Bi-LSTM model. |
Tasks | Morphological Analysis, Part-Of-Speech Tagging |
Published | 2018-06-28 |
URL | http://arxiv.org/abs/1806.10771v1 |
http://arxiv.org/pdf/1806.10771v1.pdf | |
PWC | https://paperswithcode.com/paper/rich-character-level-information-for-korean |
Repo | |
Framework | |