Paper Group ANR 401
Multimodal Machine Learning: A Survey and Taxonomy
Title | Multimodal Machine Learning: A Survey and Taxonomy |
Authors | Tadas Baltrušaitis, Chaitanya Ahuja, Louis-Philippe Morency |
Abstract | Our experience of the world is multimodal - we see objects, hear sounds, feel texture, smell odors, and taste flavors. Modality refers to the way in which something happens or is experienced and a research problem is characterized as multimodal when it includes multiple such modalities. In order for Artificial Intelligence to make progress in understanding the world around us, it needs to be able to interpret such multimodal signals together. Multimodal machine learning aims to build models that can process and relate information from multiple modalities. It is a vibrant multi-disciplinary field of increasing importance and with extraordinary potential. Instead of focusing on specific multimodal applications, this paper surveys the recent advances in multimodal machine learning itself and presents them in a common taxonomy. We go beyond the typical early and late fusion categorization and identify broader challenges that are faced by multimodal machine learning, namely: representation, translation, alignment, fusion, and co-learning. This new taxonomy will enable researchers to better understand the state of the field and identify directions for future research. |
Tasks | |
Published | 2017-05-26 |
URL | http://arxiv.org/abs/1705.09406v2, http://arxiv.org/pdf/1705.09406v2.pdf |
PWC | https://paperswithcode.com/paper/multimodal-machine-learning-a-survey-and |
Repo | |
Framework | |
Chaining Identity Mapping Modules for Image Denoising
Title | Chaining Identity Mapping Modules for Image Denoising |
Authors | Saeed Anwar, Cong Phouc Huynh, Fatih Porikli |
Abstract | We propose to learn a fully-convolutional network model that consists of a Chain of Identity Mapping Modules (CIMM) for image denoising. The CIMM structure possesses two distinctive features that are important for the noise removal task. Firstly, each residual unit employs identity mappings as the skip connections and receives pre-activated input in order to preserve the gradient magnitude propagated in both the forward and backward directions. Secondly, by utilizing dilated kernels for the convolution layers in the residual branch, in other words within an identity mapping module, each neuron in the last convolution layer can observe the full receptive field of the first layer. After being trained on the BSD400 dataset, the proposed network produces remarkably higher numerical accuracy and better visual image quality than the state-of-the-art when being evaluated on conventional benchmark images and the BSD68 dataset. |
Tasks | Denoising, Image Denoising |
Published | 2017-12-08 |
URL | https://arxiv.org/abs/1712.02933v2, https://arxiv.org/pdf/1712.02933v2.pdf |
PWC | https://paperswithcode.com/paper/chaining-identity-mapping-modules-for-image |
Repo | |
Framework | |
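The abstract's second point, that dilated kernels widen what the last convolution layer can observe, can be illustrated with the standard receptive-field arithmetic for stacked stride-1 convolutions. This is a minimal sketch and not the authors' code: the function name, the kernel size of 3, and the dilation rates are assumptions chosen only for illustration.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field along one axis of stacked stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    if len(kernel_sizes) != len(dilations):
        raise ValueError("one dilation rate per layer is required")
    field = 1
    for k, d in zip(kernel_sizes, dilations):
        field += (k - 1) * d
    return field


# Hypothetical three-layer residual branch with 3x3 kernels.
plain = receptive_field([3, 3, 3], [1, 1, 1])    # no dilation
dilated = receptive_field([3, 3, 3], [1, 2, 3])  # assumed dilation rates
print(plain, dilated)  # 7 13
```

With the same number of layers and parameters, the dilated stack in this sketch covers 13 pixels per axis instead of 7.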
Experiential, Distributional and Dependency-based Word Embeddings have Complementary Roles in Decoding Brain Activity
Title | Experiential, Distributional and Dependency-based Word Embeddings have Complementary Roles in Decoding Brain Activity |
Authors | Samira Abnar, Rasyan Ahmed, Max Mijnheer, Willem Zuidema |
Abstract | We evaluate 8 different word embedding models on their usefulness for predicting the neural activation patterns associated with concrete nouns. The models we consider include an experiential model, based on crowd-sourced association data, several popular neural and distributional models, and a model that reflects the syntactic context of words (based on dependency parses). Our goal is to assess the cognitive plausibility of these various embedding models, and understand how we can further improve our methods for interpreting brain imaging data. We show that neural word embedding models exhibit superior performance on the tasks we consider, beating the experiential word representation model. The syntactically informed model gives the overall best performance when predicting brain activation patterns from word embeddings, whereas the GloVe distributional method gives the overall best performance when predicting in the reverse direction (word vectors from brain images). Interestingly, however, the error patterns of these different models are markedly different. This may support the idea that the brain uses different systems for processing different kinds of words. Moreover, we suggest that taking the relative strengths of different embedding models into account will lead to better models of the brain activity associated with words. |
Tasks | Word Embeddings |
Published | 2017-11-25 |
URL | http://arxiv.org/abs/1711.09285v1, http://arxiv.org/pdf/1711.09285v1.pdf |
PWC | https://paperswithcode.com/paper/experiential-distributional-and-dependency |
Repo | |
Framework | |
Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation
Title | Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation |
Authors | Nasrin Mostafazadeh, Chris Brockett, Bill Dolan, Michel Galley, Jianfeng Gao, Georgios P. Spithourakis, Lucy Vanderwende |
Abstract | The popularity of image sharing on social media and the engagement it creates between users reflects the important role that visual context plays in everyday conversations. We present a novel task, Image-Grounded Conversations (IGC), in which natural-sounding conversations are generated about a shared image. To benchmark progress, we introduce a new multiple-reference dataset of crowd-sourced, event-centric conversations on images. IGC falls on the continuum between chit-chat and goal-directed conversation models, where visual grounding constrains the topic of conversation to event-driven utterances. Experiments with models trained on social media data show that the combination of visual and textual context enhances the quality of generated conversational turns. In human evaluation, the gap between human performance and that of both neural and retrieval architectures suggests that multi-modal IGC presents an interesting challenge for dialogue research. |
Tasks | |
Published | 2017-01-28 |
URL | http://arxiv.org/abs/1701.08251v2, http://arxiv.org/pdf/1701.08251v2.pdf |
PWC | https://paperswithcode.com/paper/image-grounded-conversations-multimodal |
Repo | |
Framework | |
Object Classification using Ensemble of Local and Deep Features
Title | Object Classification using Ensemble of Local and Deep Features |
Authors | Siddharth Srivastava, Prerana Mukherjee, Brejesh Lall, Kamlesh Jaiswal |
Abstract | In this paper we propose an ensemble of local and deep features for object classification. We also compare and contrast effectiveness of feature representation capability of various layers of convolutional neural network. We demonstrate with extensive experiments for object classification that the representation capability of features from deep networks can be complemented with information captured from local features. We also find out that features from various deep convolutional networks encode distinctive characteristic information. We establish that, as opposed to conventional practice, intermediate layers of deep networks can augment the classification capabilities of features obtained from fully connected layers. |
Tasks | Object Classification |
Published | 2017-12-04 |
URL | http://arxiv.org/abs/1712.04926v1, http://arxiv.org/pdf/1712.04926v1.pdf |
PWC | https://paperswithcode.com/paper/object-classification-using-ensemble-of-local |
Repo | |
Framework | |
Inherent Biases of Recurrent Neural Networks for Phonological Assimilation and Dissimilation
Title | Inherent Biases of Recurrent Neural Networks for Phonological Assimilation and Dissimilation |
Authors | Amanda Doucette |
Abstract | A recurrent neural network model of phonological pattern learning is proposed. The model is a relatively simple neural network with one recurrent layer, and displays biases in learning that mimic observed biases in human learning. Single-feature patterns are learned faster than two-feature patterns, and vowel or consonant-only patterns are learned faster than patterns involving vowels and consonants, mimicking the results of laboratory learning experiments. In non-recurrent models, capturing these biases requires the use of alpha features or some other representation of repeated features, but with a recurrent neural network, these elaborations are not necessary. |
Tasks | |
Published | 2017-02-23 |
URL | http://arxiv.org/abs/1702.07324v1, http://arxiv.org/pdf/1702.07324v1.pdf |
PWC | https://paperswithcode.com/paper/inherent-biases-of-recurrent-neural-networks |
Repo | |
Framework | |
Multiagent-based Participatory Urban Simulation through Inverse Reinforcement Learning
Title | Multiagent-based Participatory Urban Simulation through Inverse Reinforcement Learning |
Authors | Soma Suzuki |
Abstract | The multiagent-based participatory simulation features prominently in urban planning as the acquired model is considered as the hybrid system of the domain and the local knowledge. However, the key problem of generating realistic agents for particular social phenomena invariably remains. The existing models have attempted to dictate the factors involving human behavior, which appeared to be intractable. In this paper, Inverse Reinforcement Learning (IRL) is introduced to address this problem. IRL is developed for computational modeling of human behavior and has achieved great successes in robotics, psychology and machine learning. The possibilities presented by this new style of modeling are drawn out as conclusions, and the relative challenges with this modeling are highlighted. |
Tasks | |
Published | 2017-12-21 |
URL | http://arxiv.org/abs/1712.07887v1, http://arxiv.org/pdf/1712.07887v1.pdf |
PWC | https://paperswithcode.com/paper/multiagent-based-participatory-urban |
Repo | |
Framework | |
Accelerated Distributed Dual Averaging over Evolving Networks of Growing Connectivity
Title | Accelerated Distributed Dual Averaging over Evolving Networks of Growing Connectivity |
Authors | Sijia Liu, Pin-Yu Chen, Alfred O. Hero |
Abstract | We consider the problem of accelerating distributed optimization in multi-agent networks by sequentially adding edges. Specifically, we extend the distributed dual averaging (DDA) subgradient algorithm to evolving networks of growing connectivity and analyze the corresponding improvement in convergence rate. It is known that the convergence rate of DDA is influenced by the algebraic connectivity of the underlying network, where better connectivity leads to faster convergence. However, the impact of network topology design on the convergence rate of DDA has not been fully understood. In this paper, we begin by designing network topologies via edge selection and scheduling. For edge selection, we determine the best set of candidate edges that achieves the optimal tradeoff between the growth of network connectivity and the usage of network resources. The dynamics of network evolution is then incurred by edge scheduling. Further, we provide a tractable approach to analyze the improvement in the convergence rate of DDA induced by the growth of network connectivity. Our analysis reveals the connection between network topology design and the convergence rate of DDA, and provides quantitative evaluation of DDA acceleration for distributed optimization that is absent in the existing analysis. Lastly, numerical experiments show that DDA can be significantly accelerated using a sequence of well-designed networks, and our theoretical predictions are well matched to its empirical convergence behavior. |
Tasks | Distributed Optimization |
Published | 2017-04-18 |
URL | http://arxiv.org/abs/1704.05193v2, http://arxiv.org/pdf/1704.05193v2.pdf |
PWC | https://paperswithcode.com/paper/accelerated-distributed-dual-averaging-over |
Repo | |
Framework | |
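The claim that better connectivity leads to faster convergence can be seen in the neighbor-averaging (consensus) step that distributed averaging methods build on. The sketch below is not DDA itself and is not taken from the paper: it only mixes values over a fixed graph, and the Metropolis weighting rule, the function names, the tolerance, and the two example graphs are assumptions made for illustration.

```python
def metropolis_weights(n, edges):
    """Doubly stochastic mixing matrix for an undirected graph (Metropolis rule)."""
    neighbors = [set() for _ in range(n)]
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            weights[i][j] = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
        # The diagonal is still zero here, so this is 1 minus the neighbor weights.
        weights[i][i] = 1.0 - sum(weights[i])
    return weights


def rounds_to_consensus(n, edges, values, tol=1e-6, max_rounds=10_000):
    """Count mixing rounds until every node is within tol of the global average."""
    weights = metropolis_weights(n, edges)
    target = sum(values) / n
    x = list(values)
    for rounds in range(max_rounds + 1):
        if max(abs(v - target) for v in x) <= tol:
            return rounds
        # One round: every node replaces its value by a weighted neighbor average.
        x = [sum(weights[i][j] * x[j] for j in range(n)) for i in range(n)]
    raise RuntimeError("no consensus within max_rounds")


n = 6
values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
path = [(i, i + 1) for i in range(n - 1)]                     # sparse graph
complete = [(i, j) for i in range(n) for j in range(i + 1, n)]  # all edges added
print(rounds_to_consensus(n, path, values), rounds_to_consensus(n, complete, values))
```

On the complete graph every Metropolis weight equals 1/6, so a single round already yields the average, while the path graph needs many more rounds. Edge selection and scheduling as described in the abstract sit between these two extremes.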
DOPE: Distributed Optimization for Pairwise Energies
Title | DOPE: Distributed Optimization for Pairwise Energies |
Authors | Jose Dolz, Ismail Ben Ayed, Christian Desrosiers |
Abstract | We formulate an Alternating Direction Method of Multipliers (ADMM) that systematically distributes the computations of any technique for optimizing pairwise functions, including non-submodular potentials. Such discrete functions are very useful in segmentation and a breadth of other vision problems. Our method decomposes the problem into a large set of small sub-problems, each involving a sub-region of the image domain, which can be solved in parallel. We achieve consistency between the sub-problems through a novel constraint that can be used for a large class of pairwise functions. We give an iterative numerical solution that alternates between solving the sub-problems and updating consistency variables, until convergence. We report comprehensive experiments, which demonstrate the benefit of our general distributed solution in the case of the popular serial algorithm of Boykov and Kolmogorov (BK algorithm) and, also, in the context of non-submodular functions. |
Tasks | Distributed Optimization |
Published | 2017-04-11 |
URL | http://arxiv.org/abs/1704.03116v1, http://arxiv.org/pdf/1704.03116v1.pdf |
PWC | https://paperswithcode.com/paper/dope-distributed-optimization-for-pairwise |
Repo | |
Framework | |
Worm-level Control through Search-based Reinforcement Learning
Title | Worm-level Control through Search-based Reinforcement Learning |
Authors | Mathias Lechner, Radu Grosu, Ramin M. Hasani |
Abstract | Through natural evolution, nervous systems of organisms formed near-optimal structures to express behavior. Here, we propose an effective way to create control agents, by re-purposing the function of biological neural circuit models, to govern similar real world applications. We model the tap-withdrawal (TW) neural circuit of the nematode, C. elegans, a circuit responsible for the worm’s reflexive response to external mechanical touch stimulations, and learn its synaptic and neural parameters as a policy for controlling the inverted pendulum problem. To re-purpose the TW neural circuit, we apply search-based reinforcement learning. We show that our neural policy performs as well as existing traditional control theory and machine learning approaches. A video demonstration of the performance of our method can be accessed at https://youtu.be/o-Ia5IVyff8. |
Tasks | |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1711.03467v1, http://arxiv.org/pdf/1711.03467v1.pdf |
PWC | https://paperswithcode.com/paper/worm-level-control-through-search-based |
Repo | |
Framework | |
The Conditional Analogy GAN: Swapping Fashion Articles on People Images
Title | The Conditional Analogy GAN: Swapping Fashion Articles on People Images |
Authors | Nikolay Jetchev, Urs Bergmann |
Abstract | We present a novel method to solve image analogy problems: it learns the relation between paired images present in the training data, and then generalizes to generate images that correspond to the relation but were never seen in the training set. Therefore, we call the method Conditional Analogy Generative Adversarial Network (CAGAN), as it is based on adversarial training and employs deep convolutional neural networks. An especially interesting application of that technique is automatic swapping of clothing on fashion model photos. Our work has the following contributions. First, the definition of the end-to-end trainable CAGAN architecture, which implicitly learns segmentation masks without expensive supervised labeling data. Second, experimental results show plausible segmentation masks and often convincing swapped images, given the target article. Finally, we discuss the next steps for that technique: neural network architecture improvements and more advanced applications. |
Tasks | |
Published | 2017-09-14 |
URL | http://arxiv.org/abs/1709.04695v1, http://arxiv.org/pdf/1709.04695v1.pdf |
PWC | https://paperswithcode.com/paper/the-conditional-analogy-gan-swapping-fashion |
Repo | |
Framework | |
Hungarian Layer: Logics Empowered Neural Architecture
Title | Hungarian Layer: Logics Empowered Neural Architecture |
Authors | Han Xiao, Yidong Chen, Xiaodong Shi |
Abstract | Neural architecture is a purely numeric framework, which fits the data as a continuous function. However, because it lacks logic flow (e.g. if, for, while), traditional algorithms (e.g. the Hungarian algorithm, A* search, decision tree algorithms) cannot be embedded into this paradigm, which limits its theory and applications. In this paper, we reform the calculus graph as a dynamic process, which is guided by logic flow. Within our novel methodology, traditional algorithms can empower numerical neural networks. Specifically, for sentence matching, we reformulate the problem as a task assignment, which is solved by the Hungarian algorithm. First, our model applies a BiLSTM to parse the sentences. Then the Hungarian layer aligns the matching positions. Last, we transform the matching results for softmax regression with another BiLSTM. Extensive experiments show that our model outperforms other state-of-the-art baselines substantially. |
Tasks | |
Published | 2017-12-07 |
URL | http://arxiv.org/abs/1712.02555v3, http://arxiv.org/pdf/1712.02555v3.pdf |
PWC | https://paperswithcode.com/paper/hungarian-layer-logics-empowered-neural |
Repo | |
Framework | |
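The alignment step described in the abstract is a task-assignment problem: pick one position in the second sentence for each position in the first so that total similarity is maximal. The sketch below solves that same problem by exhaustive search, which is a stand-in for the Hungarian algorithm and only practical for tiny inputs; it is not the paper's layer. The function name and the similarity scores are hypothetical.

```python
from itertools import permutations


def best_alignment(similarity):
    """Return (total, pairs) for the one-to-one alignment with maximal similarity.

    Brute force over all assignments; the Hungarian algorithm finds the same
    optimum in polynomial time.
    """
    rows = len(similarity)
    cols = len(similarity[0])
    if rows > cols:
        raise ValueError("expects at most as many rows as columns")
    best_total, best_pairs = float("-inf"), []
    for chosen in permutations(range(cols), rows):
        total = sum(similarity[i][j] for i, j in enumerate(chosen))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(chosen))
    return best_total, best_pairs


# Hypothetical similarity scores between 3 positions of sentence A (rows)
# and 3 positions of sentence B (columns).
scores = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total, pairs = best_alignment(scores)
print(total, pairs)  # 11 [(0, 0), (1, 2), (2, 1)]
```

In this example row 2 gives up its best column (column 0) so that the total stays maximal, which is the difference between a global assignment and matching each row greedily on its own.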
Building Machines that Learn and Think for Themselves: Commentary on Lake et al., Behavioral and Brain Sciences, 2017
Title | Building Machines that Learn and Think for Themselves: Commentary on Lake et al., Behavioral and Brain Sciences, 2017 |
Authors | M. Botvinick, D. G. T. Barrett, P. Battaglia, N. de Freitas, D. Kumaran, J. Z. Leibo, T. Lillicrap, J. Modayil, S. Mohamed, N. C. Rabinowitz, D. J. Rezende, A. Santoro, T. Schaul, C. Summerfield, G. Wayne, T. Weber, D. Wierstra, S. Legg, D. Hassabis |
Abstract | We agree with Lake and colleagues on their list of key ingredients for building humanlike intelligence, including the idea that model-based reasoning is essential. However, we favor an approach that centers on one additional ingredient: autonomy. In particular, we aim toward agents that can both build and exploit their own internal models, with minimal human hand-engineering. We believe an approach centered on autonomous learning has the greatest chance of success as we scale toward real-world complexity, tackling domains for which ready-made formal models are not available. Here we survey several important examples of the progress that has been made toward building autonomous agents with humanlike abilities, and highlight some outstanding challenges. |
Tasks | |
Published | 2017-11-22 |
URL | http://arxiv.org/abs/1711.08378v1, http://arxiv.org/pdf/1711.08378v1.pdf |
PWC | https://paperswithcode.com/paper/building-machines-that-learn-and-think-for |
Repo | |
Framework | |
Property Testing in High Dimensional Ising models
Title | Property Testing in High Dimensional Ising models |
Authors | Matey Neykov, Han Liu |
Abstract | This paper explores the information-theoretic limitations of graph property testing in zero-field Ising models. Instead of learning the entire graph structure, sometimes testing a basic graph property such as connectivity, cycle presence or maximum clique size is a more relevant and attainable objective. Since property testing is more fundamental than graph recovery, any necessary conditions for property testing imply corresponding conditions for graph recovery, while custom property tests can be statistically and/or computationally more efficient than graph recovery based algorithms. Understanding the statistical complexity of property testing requires the distinction of ferromagnetic (i.e., positive interactions only) and general Ising models. Using combinatorial constructs such as graph packing and strong monotonicity, we characterize how target properties affect the corresponding minimax upper and lower bounds within the realm of ferromagnets. On the other hand, by studying the detection of an antiferromagnetic (i.e., negative interactions only) Curie-Weiss model buried in Rademacher noise, we show that property testing is strictly more challenging over general Ising models. In terms of methodological development, we propose two types of correlation based tests: computationally efficient screening for ferromagnets, and score type tests for general models, including a fast cycle presence test. Our correlation screening tests match the information-theoretic bounds for property testing in ferromagnets. |
Tasks | |
Published | 2017-09-20 |
URL | http://arxiv.org/abs/1709.06688v2, http://arxiv.org/pdf/1709.06688v2.pdf |
PWC | https://paperswithcode.com/paper/property-testing-in-high-dimensional-ising |
Repo | |
Framework | |
Training Deep Neural Networks via Optimization Over Graphs
Title | Training Deep Neural Networks via Optimization Over Graphs |
Authors | Guoqiang Zhang, W. Bastiaan Kleijn |
Abstract | In this work, we propose to train a deep neural network by distributed optimization over a graph. Two nonlinear functions are considered: the rectified linear unit (ReLU) and a linear unit with both lower and upper cutoffs (DCutLU). The problem reformulation over a graph is realized by explicitly representing ReLU or DCutLU using a set of slack variables. We then apply the alternating direction method of multipliers (ADMM) to update the weights of the network layerwise by solving subproblems of the reformulated problem. Empirical results suggest that the ADMM-based method is less sensitive to overfitting than the stochastic gradient descent (SGD) and Adam methods. |
Tasks | Distributed Optimization |
Published | 2017-02-11 |
URL | http://arxiv.org/abs/1702.03380v2, http://arxiv.org/pdf/1702.03380v2.pdf |
PWC | https://paperswithcode.com/paper/training-deep-neural-networks-via |
Repo | |
Framework | |