July 28, 2019

2794 words 14 mins read

Paper Group ANR 401

Multimodal Machine Learning: A Survey and Taxonomy

Title Multimodal Machine Learning: A Survey and Taxonomy
Authors Tadas Baltrušaitis, Chaitanya Ahuja, Louis-Philippe Morency
Abstract Our experience of the world is multimodal - we see objects, hear sounds, feel texture, smell odors, and taste flavors. Modality refers to the way in which something happens or is experienced and a research problem is characterized as multimodal when it includes multiple such modalities. In order for Artificial Intelligence to make progress in understanding the world around us, it needs to be able to interpret such multimodal signals together. Multimodal machine learning aims to build models that can process and relate information from multiple modalities. It is a vibrant multi-disciplinary field of increasing importance and with extraordinary potential. Instead of focusing on specific multimodal applications, this paper surveys the recent advances in multimodal machine learning itself and presents them in a common taxonomy. We go beyond the typical early and late fusion categorization and identify broader challenges that are faced by multimodal machine learning, namely: representation, translation, alignment, fusion, and co-learning. This new taxonomy will enable researchers to better understand the state of the field and identify directions for future research.
Tasks
Published 2017-05-26
URL http://arxiv.org/abs/1705.09406v2
PDF http://arxiv.org/pdf/1705.09406v2.pdf
PWC https://paperswithcode.com/paper/multimodal-machine-learning-a-survey-and
Repo
Framework

Chaining Identity Mapping Modules for Image Denoising

Title Chaining Identity Mapping Modules for Image Denoising
Authors Saeed Anwar, Cong Phouc Huynh, Fatih Porikli
Abstract We propose to learn a fully-convolutional network model that consists of a Chain of Identity Mapping Modules (CIMM) for image denoising. The CIMM structure possesses two distinctive features that are important for the noise removal task. Firstly, each residual unit employs identity mappings as the skip connections and receives pre-activated input in order to preserve the gradient magnitude propagated in both the forward and backward directions. Secondly, by utilizing dilated kernels for the convolution layers in the residual branch, in other words within an identity mapping module, each neuron in the last convolution layer can observe the full receptive field of the first layer. After being trained on the BSD400 dataset, the proposed network produces remarkably higher numerical accuracy and better visual image quality than the state-of-the-art when being evaluated on conventional benchmark images and the BSD68 dataset.
Tasks Denoising, Image Denoising
Published 2017-12-08
URL https://arxiv.org/abs/1712.02933v2
PDF https://arxiv.org/pdf/1712.02933v2.pdf
PWC https://paperswithcode.com/paper/chaining-identity-mapping-modules-for-image
Repo
Framework
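
For readers who want a concrete picture, here is a minimal PyTorch sketch of the structure the abstract describes: pre-activated residual units whose skip connection is the identity and whose residual branch uses dilated convolutions, chained into a denoiser. Channel counts, dilation rates, module depth, and the residual (noise-prediction) output are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class IdentityMappingModule(nn.Module):
    """Pre-activated residual unit: the skip connection is the identity,
    and the residual branch uses dilated convolutions so the last layer
    sees a wide receptive field."""
    def __init__(self, channels=64, num_layers=4, dilation=2):
        super().__init__()
        layers = []
        for _ in range(num_layers):
            layers += [
                nn.ReLU(inplace=True),                      # pre-activation
                nn.Conv2d(channels, channels, kernel_size=3,
                          padding=dilation, dilation=dilation),
            ]
        self.branch = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.branch(x)                           # identity skip

class CIMM(nn.Module):
    """Chain of identity mapping modules with head/tail convolutions;
    the network predicts the noise, which is subtracted from the input."""
    def __init__(self, channels=64, num_modules=5):
        super().__init__()
        self.head = nn.Conv2d(1, channels, 3, padding=1)
        self.chain = nn.Sequential(*[IdentityMappingModule(channels)
                                     for _ in range(num_modules)])
        self.tail = nn.Conv2d(channels, 1, 3, padding=1)

    def forward(self, noisy):
        features = self.chain(self.head(noisy))
        return noisy - self.tail(features)                  # subtract predicted noise

# Usage: denoise a batch of single-channel patches.
model = CIMM()
clean_estimate = model(torch.randn(4, 1, 64, 64))
```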

Experiential, Distributional and Dependency-based Word Embeddings have Complementary Roles in Decoding Brain Activity

Title Experiential, Distributional and Dependency-based Word Embeddings have Complementary Roles in Decoding Brain Activity
Authors Samira Abnar, Rasyan Ahmed, Max Mijnheer, Willem Zuidema
Abstract We evaluate 8 different word embedding models on their usefulness for predicting the neural activation patterns associated with concrete nouns. The models we consider include an experiential model, based on crowd-sourced association data, several popular neural and distributional models, and a model that reflects the syntactic context of words (based on dependency parses). Our goal is to assess the cognitive plausibility of these various embedding models, and understand how we can further improve our methods for interpreting brain imaging data. We show that neural word embedding models exhibit superior performance on the tasks we consider, beating the experiential word representation model. The syntactically informed model gives the overall best performance when predicting brain activation patterns from word embeddings, whereas the GloVe distributional method gives the overall best performance when predicting in the reverse direction (word vectors from brain images). Interestingly, however, the error patterns of these different models are markedly different. This may support the idea that the brain uses different systems for processing different kinds of words. Moreover, we suggest that taking the relative strengths of different embedding models into account will lead to better models of the brain activity associated with words.
Tasks Word Embeddings
Published 2017-11-25
URL http://arxiv.org/abs/1711.09285v1
PDF http://arxiv.org/pdf/1711.09285v1.pdf
PWC https://paperswithcode.com/paper/experiential-distributional-and-dependency
Repo
Framework
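
As a rough illustration of the decoding setup such studies use, the sketch below fits a linear (ridge) map from word embeddings to voxel activations and scores it with leave-one-word-out correlation. The synthetic arrays, the ridge regressor, and all dimensions are stand-in assumptions rather than the paper's exact pipeline.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
n_words, emb_dim, n_voxels = 60, 300, 500
embeddings = rng.normal(size=(n_words, emb_dim))     # word vectors (e.g. GloVe, dependency-based)
activations = rng.normal(size=(n_words, n_voxels))   # brain response per concrete noun

# Predict brain activation patterns from word embeddings,
# scored by leave-one-word-out correlation.
scores = []
for train_idx, test_idx in LeaveOneOut().split(embeddings):
    model = Ridge(alpha=1.0).fit(embeddings[train_idx], activations[train_idx])
    predicted = model.predict(embeddings[test_idx])[0]
    observed = activations[test_idx][0]
    scores.append(np.corrcoef(predicted, observed)[0, 1])

print("mean leave-one-out correlation:", np.mean(scores))
# The reverse direction (brain image -> word vector) just swaps X and y.
```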

Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation

Title Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation
Authors Nasrin Mostafazadeh, Chris Brockett, Bill Dolan, Michel Galley, Jianfeng Gao, Georgios P. Spithourakis, Lucy Vanderwende
Abstract The popularity of image sharing on social media and the engagement it creates between users reflects the important role that visual context plays in everyday conversations. We present a novel task, Image-Grounded Conversations (IGC), in which natural-sounding conversations are generated about a shared image. To benchmark progress, we introduce a new multiple-reference dataset of crowd-sourced, event-centric conversations on images. IGC falls on the continuum between chit-chat and goal-directed conversation models, where visual grounding constrains the topic of conversation to event-driven utterances. Experiments with models trained on social media data show that the combination of visual and textual context enhances the quality of generated conversational turns. In human evaluation, the gap between human performance and that of both neural and retrieval architectures suggests that multi-modal IGC presents an interesting challenge for dialogue research.
Tasks
Published 2017-01-28
URL http://arxiv.org/abs/1701.08251v2
PDF http://arxiv.org/pdf/1701.08251v2.pdf
PWC https://paperswithcode.com/paper/image-grounded-conversations-multimodal
Repo
Framework

Object Classification using Ensemble of Local and Deep Features

Title Object Classification using Ensemble of Local and Deep Features
Authors Siddharth Srivastava, Prerana Mukherjee, Brejesh Lall, Kamlesh Jaiswal
Abstract In this paper, we propose an ensemble of local and deep features for object classification. We also compare and contrast the feature representation capability of various layers of a convolutional neural network. We demonstrate with extensive experiments on object classification that the representation capability of features from deep networks can be complemented with information captured by local features. We also find that features from different deep convolutional networks encode distinctive characteristic information. We establish that, as opposed to conventional practice, intermediate layers of deep networks can augment the classification capabilities of features obtained from fully connected layers.
Tasks Object Classification
Published 2017-12-04
URL http://arxiv.org/abs/1712.04926v1
PDF http://arxiv.org/pdf/1712.04926v1.pdf
PWC https://paperswithcode.com/paper/object-classification-using-ensemble-of-local
Repo
Framework
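
A hedged sketch of the general recipe: extract features from a fully connected layer, an intermediate layer, and a local descriptor pipeline, normalize each block, concatenate, and train one classifier on the fused vector. The random arrays below are placeholders for real feature extractors; names and dimensions are purely illustrative.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)
n_images = 200

# Placeholders for features extracted elsewhere:
fc_features    = rng.normal(size=(n_images, 4096))   # e.g. a fully connected CNN layer
mid_features   = rng.normal(size=(n_images, 1024))   # e.g. pooled intermediate conv layer
local_features = rng.normal(size=(n_images, 500))    # e.g. bag-of-visual-words over local descriptors
labels = rng.integers(0, 10, size=n_images)

# L2-normalise each block so no single representation dominates, then concatenate.
fused = np.hstack([normalize(fc_features),
                   normalize(mid_features),
                   normalize(local_features)])

classifier = LinearSVC(C=1.0).fit(fused, labels)
print("training accuracy:", classifier.score(fused, labels))
```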

Inherent Biases of Recurrent Neural Networks for Phonological Assimilation and Dissimilation

Title Inherent Biases of Recurrent Neural Networks for Phonological Assimilation and Dissimilation
Authors Amanda Doucette
Abstract A recurrent neural network model of phonological pattern learning is proposed. The model is a relatively simple neural network with one recurrent layer, and displays biases in learning that mimic observed biases in human learning. Single-feature patterns are learned faster than two-feature patterns, and vowel or consonant-only patterns are learned faster than patterns involving vowels and consonants, mimicking the results of laboratory learning experiments. In non-recurrent models, capturing these biases requires the use of alpha features or some other representation of repeated features, but with a recurrent neural network, these elaborations are not necessary.
Tasks
Published 2017-02-23
URL http://arxiv.org/abs/1702.07324v1
PDF http://arxiv.org/pdf/1702.07324v1.pdf
PWC https://paperswithcode.com/paper/inherent-biases-of-recurrent-neural-networks
Repo
Framework
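
A minimal sketch of the kind of model the abstract describes, assuming a single recurrent layer over sequences of phonological feature bundles with a linear read-out for pattern membership. The feature dimension, hidden size, data, and training loop are toy assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

class PatternLearner(nn.Module):
    """One recurrent layer over phonological feature vectors, followed by a
    linear read-out that predicts whether a sequence fits the target pattern."""
    def __init__(self, n_features=12, hidden=32):
        super().__init__()
        self.rnn = nn.RNN(n_features, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, 2)

    def forward(self, x):
        _, h = self.rnn(x)                 # final hidden state
        return self.readout(h.squeeze(0))

# Toy training loop: each item is a sequence of binary feature bundles,
# and the label marks membership in the phonological pattern.
model = PatternLearner()
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
x = torch.randint(0, 2, (64, 5, 12)).float()   # batch, sequence length, features
y = torch.randint(0, 2, (64,))
for _ in range(100):
    optimiser.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimiser.step()
```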

Multiagent-based Participatory Urban Simulation through Inverse Reinforcement Learning

Title Multiagent-based Participatory Urban Simulation through Inverse Reinforcement Learning
Authors Soma Suzuki
Abstract Multiagent-based participatory simulation features prominently in urban planning, as the acquired model is considered a hybrid system of domain and local knowledge. However, the key problem of generating realistic agents for particular social phenomena invariably remains. Existing models have attempted to dictate the factors driving human behavior, which has proven intractable. In this paper, Inverse Reinforcement Learning (IRL) is introduced to address this problem. IRL was developed for the computational modeling of human behavior and has achieved great success in robotics, psychology and machine learning. The possibilities presented by this new style of modeling are drawn out as conclusions, and the challenges associated with it are highlighted.
Tasks
Published 2017-12-21
URL http://arxiv.org/abs/1712.07887v1
PDF http://arxiv.org/pdf/1712.07887v1.pdf
PWC https://paperswithcode.com/paper/multiagent-based-participatory-urban
Repo
Framework

Accelerated Distributed Dual Averaging over Evolving Networks of Growing Connectivity

Title Accelerated Distributed Dual Averaging over Evolving Networks of Growing Connectivity
Authors Sijia Liu, Pin-Yu Chen, Alfred O. Hero
Abstract We consider the problem of accelerating distributed optimization in multi-agent networks by sequentially adding edges. Specifically, we extend the distributed dual averaging (DDA) subgradient algorithm to evolving networks of growing connectivity and analyze the corresponding improvement in convergence rate. It is known that the convergence rate of DDA is influenced by the algebraic connectivity of the underlying network, where better connectivity leads to faster convergence. However, the impact of network topology design on the convergence rate of DDA has not been fully understood. In this paper, we begin by designing network topologies via edge selection and scheduling. For edge selection, we determine the best set of candidate edges that achieves the optimal tradeoff between the growth of network connectivity and the usage of network resources. The dynamics of network evolution are then induced by edge scheduling. Further, we provide a tractable approach to analyze the improvement in the convergence rate of DDA induced by the growth of network connectivity. Our analysis reveals the connection between network topology design and the convergence rate of DDA, and provides a quantitative evaluation of DDA acceleration for distributed optimization that is absent in the existing analysis. Lastly, numerical experiments show that DDA can be significantly accelerated using a sequence of well-designed networks, and our theoretical predictions are well matched to its empirical convergence behavior.
Tasks Distributed Optimization
Published 2017-04-18
URL http://arxiv.org/abs/1704.05193v2
PDF http://arxiv.org/pdf/1704.05193v2.pdf
PWC https://paperswithcode.com/paper/accelerated-distributed-dual-averaging-over
Repo
Framework
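
For orientation, here is a toy numpy sketch of the plain distributed dual averaging iteration that the paper builds on (without the edge selection and scheduling machinery): each agent mixes its dual variable with its neighbours' through a doubly stochastic matrix, adds its local subgradient, and takes a proximal step. The quadratic local objectives, the fixed ring topology, and the step-size schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim, iters = 8, 3, 3000

# Local objectives f_i(x) = 0.5 * ||x - b_i||^2; the network-wide minimiser
# of their average is the mean of the b_i.
b = rng.normal(size=(n_agents, dim))

# Doubly stochastic mixing matrix for a ring: self weight 0.5, each neighbour 0.25.
P = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    P[i, i] = 0.5
    P[i, (i - 1) % n_agents] = 0.25
    P[i, (i + 1) % n_agents] = 0.25

z = np.zeros((n_agents, dim))       # dual variables
x = np.zeros((n_agents, dim))       # primal iterates
x_avg = np.zeros((n_agents, dim))   # running averages (the quantity DDA bounds)

for t in range(1, iters + 1):
    grads = x - b                   # local subgradients evaluated at current iterates
    z = P @ z + grads               # mix duals with neighbours, add local subgradient
    alpha = 1.0 / np.sqrt(t)        # diminishing step size
    x = -alpha * z                  # proximal step with psi(x) = 0.5 * ||x||^2
    x_avg += (x - x_avg) / t        # update running average

print("max distance of agents' averaged iterates from the optimum:",
      np.max(np.abs(x_avg - b.mean(axis=0))))
```

The paper's contribution concerns how growing the connectivity encoded in P over time accelerates exactly this iteration.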

DOPE: Distributed Optimization for Pairwise Energies

Title DOPE: Distributed Optimization for Pairwise Energies
Authors Jose Dolz, Ismail Ben Ayed, Christian Desrosiers
Abstract We formulate an Alternating Direction Method of Multipliers (ADMM) that systematically distributes the computations of any technique for optimizing pairwise functions, including non-submodular potentials. Such discrete functions are very useful in segmentation and a breadth of other vision problems. Our method decomposes the problem into a large set of small sub-problems, each involving a sub-region of the image domain, which can be solved in parallel. We achieve consistency between the sub-problems through a novel constraint that can be used for a large class of pairwise functions. We give an iterative numerical solution that alternates between solving the sub-problems and updating consistency variables, until convergence. We report comprehensive experiments, which demonstrate the benefit of our general distributed solution in the case of the popular serial algorithm of Boykov and Kolmogorov (BK algorithm) and, also, in the context of non-submodular functions.
Tasks Distributed Optimization
Published 2017-04-11
URL http://arxiv.org/abs/1704.03116v1
PDF http://arxiv.org/pdf/1704.03116v1.pdf
PWC https://paperswithcode.com/paper/dope-distributed-optimization-for-pairwise
Repo
Framework

Worm-level Control through Search-based Reinforcement Learning

Title Worm-level Control through Search-based Reinforcement Learning
Authors Mathias Lechner, Radu Grosu, Ramin M. Hasani
Abstract Through natural evolution, nervous systems of organisms formed near-optimal structures to express behavior. Here, we propose an effective way to create control agents by \textit{re-purposing} the function of biological neural circuit models to govern similar real-world applications. We model the tap-withdrawal (TW) neural circuit of the nematode \textit{C. elegans}, a circuit responsible for the worm's reflexive response to external mechanical touch stimulation, and learn its synaptic and neural parameters as a policy for controlling the inverted pendulum problem. To repurpose the TW neural circuit, we employ a search-based reinforcement learning algorithm. We show that our neural policy performs as well as existing traditional control theory and machine learning approaches. A video demonstration of the performance of our method can be accessed at \url{https://youtu.be/o-Ia5IVyff8}.
Tasks
Published 2017-11-09
URL http://arxiv.org/abs/1711.03467v1
PDF http://arxiv.org/pdf/1711.03467v1.pdf
PWC https://paperswithcode.com/paper/worm-level-control-through-search-based
Repo
Framework
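
A toy sketch of search-based policy optimisation in the spirit of the abstract: random perturbation search over the parameters of a small linear policy, scored by episode return on a hand-rolled inverted pendulum. The dynamics, policy form, reward, and search schedule are simplifying assumptions and are not the TW circuit model itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def episode_return(params, steps=200, dt=0.02, g=9.8, length=1.0):
    """Return accumulated by a linear policy on a pendulum that we try to
    keep upright (theta = 0); gravity pushes it away from the upright position."""
    theta, omega = 0.05, 0.0          # start slightly off-balance
    total = 0.0
    for _ in range(steps):
        torque = np.clip(params @ np.array([theta, omega]), -2.0, 2.0)
        omega += (g / length * np.sin(theta) + torque) * dt
        theta += omega * dt
        total += np.cos(theta)        # reward is highest when upright
        if abs(theta) > np.pi / 2:    # fell over
            break
    return total

# Random search: keep a candidate parameter vector and accept perturbations
# that improve the episode return.
params = rng.normal(size=2)
best = episode_return(params)
for step in range(300):
    candidate = params + rng.normal(scale=0.5, size=2)
    score = episode_return(candidate)
    if score > best:
        params, best = candidate, score

print("best return:", best, "parameters:", params)
```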

The Conditional Analogy GAN: Swapping Fashion Articles on People Images

Title The Conditional Analogy GAN: Swapping Fashion Articles on People Images
Authors Nikolay Jetchev, Urs Bergmann
Abstract We present a novel method to solve image analogy problems: it learns the relation between paired images present in training data, and then generalizes to generate images that correspond to the relation but were never seen in the training set. Therefore, we call the method Conditional Analogy Generative Adversarial Network (CAGAN), as it is based on adversarial training and employs deep convolutional neural networks. An especially interesting application of this technique is the automatic swapping of clothing on fashion model photos. Our work makes the following contributions. First, the definition of the end-to-end trainable CAGAN architecture, which implicitly learns segmentation masks without expensive supervised labeling data. Second, experimental results show plausible segmentation masks and often convincing swapped images, given the target article. Finally, we discuss the next steps for this technique: neural network architecture improvements and more advanced applications.
Tasks
Published 2017-09-14
URL http://arxiv.org/abs/1709.04695v1
PDF http://arxiv.org/pdf/1709.04695v1.pdf
PWC https://paperswithcode.com/paper/the-conditional-analogy-gan-swapping-fashion
Repo
Framework

Hungarian Layer: Logics Empowered Neural Architecture

Title Hungarian Layer: Logics Empowered Neural Architecture
Authors Han Xiao, Yidong Chen, Xiaodong Shi
Abstract A neural architecture is a purely numeric framework that fits data as a continuous function. However, lacking logic flow (e.g. \textit{if, for, while}), traditional algorithms (e.g. \textit{the Hungarian algorithm, A$^*$ search, decision tree algorithms}) cannot be embedded into this paradigm, which limits both theory and applications. In this paper, we reformulate the computation graph as a dynamic process guided by logic flow. Within our methodology, traditional algorithms can empower numerical neural networks. Specifically, for the task of sentence matching, we reformulate the problem as one of task assignment, which is solved by the Hungarian algorithm. First, our model applies a BiLSTM to parse the sentences. Then a Hungarian layer aligns the matching positions. Last, we transform the matching results for softmax regression with another BiLSTM. Extensive experiments show that our model substantially outperforms state-of-the-art baselines.
Tasks
Published 2017-12-07
URL http://arxiv.org/abs/1712.02555v3
PDF http://arxiv.org/pdf/1712.02555v3.pdf
PWC https://paperswithcode.com/paper/hungarian-layer-logics-empowered-neural
Repo
Framework
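
The alignment step at the heart of this idea is the classical assignment problem, which scipy's linear_sum_assignment solves. The sketch below aligns two toy sets of token vectors by cosine cost; it only illustrates the task-assignment view of sentence matching, not the paper's trainable Hungarian layer.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)

# Toy "BiLSTM states" for two sentences: 5 and 4 tokens, 16-dim each.
sent_a = rng.normal(size=(5, 16))
sent_b = rng.normal(size=(4, 16))

# Cost of assigning token i of sentence A to token j of sentence B:
# one minus cosine similarity.
a_norm = sent_a / np.linalg.norm(sent_a, axis=1, keepdims=True)
b_norm = sent_b / np.linalg.norm(sent_b, axis=1, keepdims=True)
cost = 1.0 - a_norm @ b_norm.T

# The Hungarian algorithm finds the minimum-cost one-to-one matching
# (extra tokens of the longer sentence stay unaligned).
rows, cols = linear_sum_assignment(cost)
for i, j in zip(rows, cols):
    print(f"token {i} of A  <->  token {j} of B  (cost {cost[i, j]:.3f})")
```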

Building Machines that Learn and Think for Themselves: Commentary on Lake et al., Behavioral and Brain Sciences, 2017

Title Building Machines that Learn and Think for Themselves: Commentary on Lake et al., Behavioral and Brain Sciences, 2017
Authors M. Botvinick, D. G. T. Barrett, P. Battaglia, N. de Freitas, D. Kumaran, J. Z Leibo, T. Lillicrap, J. Modayil, S. Mohamed, N. C. Rabinowitz, D. J. Rezende, A. Santoro, T. Schaul, C. Summerfield, G. Wayne, T. Weber, D. Wierstra, S. Legg, D. Hassabis
Abstract We agree with Lake and colleagues on their list of key ingredients for building humanlike intelligence, including the idea that model-based reasoning is essential. However, we favor an approach that centers on one additional ingredient: autonomy. In particular, we aim toward agents that can both build and exploit their own internal models, with minimal human hand-engineering. We believe an approach centered on autonomous learning has the greatest chance of success as we scale toward real-world complexity, tackling domains for which ready-made formal models are not available. Here we survey several important examples of the progress that has been made toward building autonomous agents with humanlike abilities, and highlight some outstanding challenges.
Tasks
Published 2017-11-22
URL http://arxiv.org/abs/1711.08378v1
PDF http://arxiv.org/pdf/1711.08378v1.pdf
PWC https://paperswithcode.com/paper/building-machines-that-learn-and-think-for
Repo
Framework

Property Testing in High Dimensional Ising models

Title Property Testing in High Dimensional Ising models
Authors Matey Neykov, Han Liu
Abstract This paper explores the information-theoretic limitations of graph property testing in zero-field Ising models. Instead of learning the entire graph structure, sometimes testing a basic graph property such as connectivity, cycle presence or maximum clique size is a more relevant and attainable objective. Since property testing is more fundamental than graph recovery, any necessary conditions for property testing imply corresponding conditions for graph recovery, while custom property tests can be statistically and/or computationally more efficient than graph recovery based algorithms. Understanding the statistical complexity of property testing requires the distinction of ferromagnetic (i.e., positive interactions only) and general Ising models. Using combinatorial constructs such as graph packing and strong monotonicity, we characterize how target properties affect the corresponding minimax upper and lower bounds within the realm of ferromagnets. On the other hand, by studying the detection of an antiferromagnetic (i.e., negative interactions only) Curie-Weiss model buried in Rademacher noise, we show that property testing is strictly more challenging over general Ising models. In terms of methodological development, we propose two types of correlation based tests: computationally efficient screening for ferromagnets, and score type tests for general models, including a fast cycle presence test. Our correlation screening tests match the information-theoretic bounds for property testing in ferromagnets.
Tasks
Published 2017-09-20
URL http://arxiv.org/abs/1709.06688v2
PDF http://arxiv.org/pdf/1709.06688v2.pdf
PWC https://paperswithcode.com/paper/property-testing-in-high-dimensional-ising
Repo
Framework
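
To make the correlation-screening idea concrete, the sketch below Gibbs-samples a small ferromagnetic Ising model on a cycle, computes empirical pairwise correlations, and declares an edge wherever the correlation clears a threshold. The graph, inverse temperature, sample size, and threshold are toy choices, not the paper's tests or its information-theoretic bounds.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, beta, n_samples, burn_in = 6, 0.6, 3000, 200

# Ferromagnetic couplings on a cycle graph (zero field).
J = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    J[i, (i + 1) % n_nodes] = J[(i + 1) % n_nodes, i] = beta

def gibbs_samples(J, n_samples, burn_in):
    """Single-site Gibbs sampler for a zero-field Ising model on +/-1 spins."""
    x = rng.choice([-1, 1], size=n_nodes)
    out = []
    for t in range(burn_in + n_samples):
        for i in range(n_nodes):
            field = J[i] @ x
            p_plus = 1.0 / (1.0 + np.exp(-2.0 * field))
            x[i] = 1 if rng.random() < p_plus else -1
        if t >= burn_in:
            out.append(x.copy())
    return np.array(out)

samples = gibbs_samples(J, n_samples, burn_in)

# Correlation screening: threshold empirical pairwise correlations
# to decide edge presence.
corr = samples.T @ samples / n_samples
threshold = 0.4   # illustrative; chosen to separate direct edges from weaker indirect correlations
edges = [(i, j) for i in range(n_nodes) for j in range(i + 1, n_nodes)
         if corr[i, j] > threshold]
print("declared edges:", edges)
```

A property test such as cycle presence would then be read off the declared edge set rather than requiring full graph recovery.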

Training Deep Neural Networks via Optimization Over Graphs

Title Training Deep Neural Networks via Optimization Over Graphs
Authors Guoqiang Zhang, W. Bastiaan Kleijn
Abstract In this work, we propose to train a deep neural network by distributed optimization over a graph. Two nonlinear functions are considered: the rectified linear unit (ReLU) and a linear unit with both lower and upper cutoffs (DCutLU). The problem reformulation over a graph is realized by explicitly representing ReLU or DCutLU using a set of slack variables. We then apply the alternating direction method of multipliers (ADMM) to update the weights of the network layerwise by solving subproblems of the reformulated problem. Empirical results suggest that the ADMM-based method is less sensitive to overfitting than the stochastic gradient descent (SGD) and Adam methods.
Tasks Distributed Optimization
Published 2017-02-11
URL http://arxiv.org/abs/1702.03380v2
PDF http://arxiv.org/pdf/1702.03380v2.pdf
PWC https://paperswithcode.com/paper/training-deep-neural-networks-via
Repo
Framework
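
A tiny numpy illustration of the slack-variable view of ReLU that such reformulations rely on: the activation is the projection of the pre-activation onto the nonnegative orthant, which is the constraint that lets an ADMM scheme split the network into layerwise subproblems. This shows only the reformulation idea, not the paper's actual ADMM updates.

```python
import numpy as np

z = np.array([-1.5, -0.2, 0.0, 0.3, 2.0])   # pre-activations

# ReLU as a constrained problem: relu(z) = argmin_{a >= 0} (a - z)^2,
# i.e. the projection of z onto the nonnegative orthant. Introducing a
# as a slack variable with this constraint is what makes the layerwise
# splitting possible.
a = np.maximum(z, 0.0)                        # closed-form solution of the projection
assert np.allclose(a, np.clip(z, 0.0, None))
print(a)
```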