January 29, 2020

Paper Group ANR 654

Neural Text Summarization: A Critical Evaluation

Title Neural Text Summarization: A Critical Evaluation
Authors Wojciech Kryściński, Nitish Shirish Keskar, Bryan McCann, Caiming Xiong, Richard Socher
Abstract Text summarization aims at compressing long documents into a shorter form that conveys the most important parts of the original document. Despite increased interest in the community and notable research effort, progress on benchmark datasets has stagnated. We critically evaluate key ingredients of the current research setup: datasets, evaluation metrics, and models, and highlight three primary shortcomings: 1) automatically collected datasets leave the task underconstrained and may contain noise detrimental to training and evaluation, 2) the current evaluation protocol is weakly correlated with human judgment and does not account for important characteristics such as factual correctness, 3) models overfit to layout biases of current datasets and offer limited diversity in their outputs.
Tasks Text Summarization
Published 2019-08-23
URL https://arxiv.org/abs/1908.08960v1
PDF https://arxiv.org/pdf/1908.08960v1.pdf
PWC https://paperswithcode.com/paper/neural-text-summarization-a-critical
Repo
Framework

Symplectic Recurrent Neural Networks

Title Symplectic Recurrent Neural Networks
Authors Zhengdao Chen, Jianyu Zhang, Martin Arjovsky, Léon Bottou
Abstract We propose Symplectic Recurrent Neural Networks (SRNNs) as learning algorithms that capture the dynamics of physical systems from observed trajectories. An SRNN models the Hamiltonian function of the system by a neural network and furthermore leverages symplectic integration, multiple-step training and initial state optimization to address the challenging numerical issues associated with Hamiltonian systems. We show SRNNs succeed reliably on complex and noisy Hamiltonian systems. We also show how to augment the SRNN integration scheme in order to handle stiff dynamical systems such as bouncing billiards.
Tasks
Published 2019-09-29
URL https://arxiv.org/abs/1909.13334v1
PDF https://arxiv.org/pdf/1909.13334v1.pdf
PWC https://paperswithcode.com/paper/symplectic-recurrent-neural-networks
Repo
Framework
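
The symplectic integration mentioned in the abstract can be illustrated with a leapfrog step. This is a minimal sketch, not the authors' implementation: the SRNN learns the Hamiltonian with a neural network, whereas here the gradients of a separable Hamiltonian H(q, p) = K(p) + V(q) are hand-written for a unit harmonic oscillator. The names `leapfrog`, `grad_potential`, `grad_kinetic` and `energy` are hypothetical.

```python
def leapfrog(q, p, grad_potential, grad_kinetic, dt, n_steps):
    """Integrate a separable Hamiltonian H(q, p) = K(p) + V(q) with leapfrog."""
    trajectory = [(q, p)]
    for _ in range(n_steps):
        p_half = p - 0.5 * dt * grad_potential(q)   # half step in momentum
        q = q + dt * grad_kinetic(p_half)           # full step in position
        p = p_half - 0.5 * dt * grad_potential(q)   # second half step in momentum
        trajectory.append((q, p))
    return trajectory


def energy(q, p):
    """Energy of the unit harmonic oscillator (assumed toy system)."""
    return 0.5 * p * p + 0.5 * q * q


# In an SRNN these gradients would come from the learned Hamiltonian network.
traj = leapfrog(1.0, 0.0,
                grad_potential=lambda q: q,
                grad_kinetic=lambda p: p,
                dt=0.1, n_steps=1000)

# Largest deviation from the initial energy along the trajectory.
drift = max(abs(energy(q, p) - energy(1.0, 0.0)) for q, p in traj)
```

For this toy system the energy error stays bounded over long rollouts instead of growing steadily, which is the property that makes symplectic schemes attractive for Hamiltonian dynamics.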

Learning-based Model Predictive Control for Smart Building Thermal Management

Title Learning-based Model Predictive Control for Smart Building Thermal Management
Authors Roja Eini, Sherif Abdelwahed
Abstract This paper proposes a learning-based model predictive control (MPC) approach for the thermal control of a four-zone smart building. The objectives are to minimize energy consumption and maintain the residents’ comfort. The proposed control scheme incorporates learning into model-based control. The occupancy profile of the building zones is estimated over a long-term horizon by an artificial neural network (ANN), and this data is fed into the model-based predictor to obtain indoor temperature predictions. The Energy Plus software is used as the provider of the actual data (weather data, indoor temperature, energy consumption). The optimization problem, which includes the actual and predicted data, is solved at each step of the simulation, generating the input setpoint temperature for the heating/cooling system. A comparison with conventional MPC shows that the proposed method performs significantly better in energy savings (40.56% less cooling power consumption and 16.73% less heating power consumption) and in residents’ comfort.
Tasks
Published 2019-09-11
URL https://arxiv.org/abs/1909.05331v1
PDF https://arxiv.org/pdf/1909.05331v1.pdf
PWC https://paperswithcode.com/paper/learning-based-model-predictive-control-for-2
Repo
Framework

Understanding Troll Writing as a Linguistic Phenomenon

Title Understanding Troll Writing as a Linguistic Phenomenon
Authors Sergei Monakhov
Abstract The current study yielded a number of important findings. We managed to build a neural network that achieved an accuracy score of 91 per cent in classifying troll and genuine tweets. By means of regression analysis, we identified a number of features that make a tweet more susceptible to correct labelling and found that they are inherently present in troll tweets as a special type of discourse. We hypothesised that those features are grounded in the sociolinguistic limitations of troll writing, which can be best described as a combination of two factors: speaking with a purpose and trying to mask the purpose of speaking. Next, we contended that the orthogonal nature of these factors must necessarily result in the skewed distribution of many different language parameters of troll messages. Having chosen as an example distribution of the topics and vocabulary associated with those topics, we showed some very pronounced distributional anomalies, thus confirming our prediction.
Tasks
Published 2019-11-14
URL https://arxiv.org/abs/1911.08946v1
PDF https://arxiv.org/pdf/1911.08946v1.pdf
PWC https://paperswithcode.com/paper/understanding-troll-writing-as-a-linguistic
Repo
Framework

Sistema Sensor para el Monitoreo Ambiental Basado en Redes Neuronales

Title Sistema Sensor para el Monitoreo Ambiental Basado en Redes Neuronales
Authors Jose de Jesus Rubio, Jose Alberto Hernandez-Aguilar, Francisco Jacob Avila-Camacho, Juan Manuel Stein-Carrillo, Adolfo Melendez-Ramirez
Abstract In environmental monitoring, it is of great importance to have compact and portable systems that can identify environmental contaminants and thereby facilitate tasks related to waste management and environmental restoration. This paper describes a prototype sensor for identifying contaminants in the environment. The prototype consists of an array of tin oxide (SnO2) gas sensors used to identify chemical vapors, a data acquisition stage implemented on a low-cost ARM (Advanced RISC Machine) platform (Arduino), and a neural network able to identify environmental contaminants automatically. The neural network is used to identify the composition of contaminant census. In the computer system, the heavy computational load arises only during training; once the neural network has been trained, operation consists of propagating the data through the network with a much lighter computational load, which consists mainly of a vector-matrix multiplication and a lookup table that holds the activation function, so that unknown samples can be identified quickly.
Tasks
Published 2019-04-28
URL http://arxiv.org/abs/1904.12234v1
PDF http://arxiv.org/pdf/1904.12234v1.pdf
PWC https://paperswithcode.com/paper/sistema-sensor-para-el-monitoreo-ambiental
Repo
Framework
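
The inference path described in the abstract (a vector-matrix multiplication followed by an activation read from a table) can be sketched as follows. This is an illustrative sketch only: the sigmoid activation, the table range `LO`..`HI`, its resolution `SIZE`, and the helper names `sigmoid_lut` and `layer` are assumptions, not details taken from the paper.

```python
import math

# Precompute the activation once; range and resolution are assumed values.
LO, HI, SIZE = -8.0, 8.0, 1024
STEP = (HI - LO) / (SIZE - 1)
TABLE = [1.0 / (1.0 + math.exp(-(LO + i * STEP))) for i in range(SIZE)]


def sigmoid_lut(x):
    """Approximate the sigmoid by reading the nearest table entry."""
    x = min(max(x, LO), HI)                      # clamp to the table range
    index = round((x - LO) / STEP)
    return TABLE[min(SIZE - 1, max(0, index))]   # guard against rounding overflow


def layer(weights, bias, inputs):
    """One dense layer: matrix-vector product, then table-based activation."""
    return [
        sigmoid_lut(sum(w * v for w, v in zip(row, inputs)) + b)
        for row, b in zip(weights, bias)
    ]


# Toy usage with made-up weights: one neuron, two sensor readings.
out = layer([[1.0, -1.0]], [0.0], [2.0, 2.0])
```

On a microcontroller the table would typically be stored as constants, so that no exponential has to be evaluated at run time.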

Not All Attention Is Needed: Gated Attention Network for Sequence Data

Title Not All Attention Is Needed: Gated Attention Network for Sequence Data
Authors Lanqing Xue, Xiaopeng Li, Nevin L. Zhang
Abstract Although deep neural networks generally have fixed network structures, the concept of dynamic mechanisms has drawn more and more attention in recent years. Attention mechanisms compute input-dependent dynamic attention weights for aggregating a sequence of hidden states. Dynamic network configuration in convolutional neural networks (CNNs) selectively activates only part of the network at a time for different inputs. In this paper, we combine the two dynamic mechanisms for text classification tasks. Traditional attention mechanisms attend to the whole sequence of hidden states for an input sentence, while in most cases not all attention is needed, especially for long sequences. We propose a novel method called Gated Attention Network (GA-Net) to dynamically select a subset of elements to attend to using an auxiliary network, and compute attention weights to aggregate the selected elements. It avoids a significant amount of unnecessary computation on unattended elements, and allows the model to pay attention to important parts of the sequence. Experiments on various datasets show that the proposed method achieves better performance compared with all baseline models with global or local attention while requiring less computation and achieving better interpretability. It is also promising to extend the idea to more complex attention-based models, such as transformers and seq-to-seq models.
Tasks Text Classification
Published 2019-12-01
URL https://arxiv.org/abs/1912.00349v1
PDF https://arxiv.org/pdf/1912.00349v1.pdf
PWC https://paperswithcode.com/paper/not-all-attention-is-needed-gated-attention
Repo
Framework
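
The idea of attending only to a gated subset of hidden states can be sketched as below. This is a simplified inference-time illustration, not GA-Net itself: the gate probabilities and attention scores are passed in as plain lists rather than produced by networks, and the hard threshold and the fallback to the single most probable element are assumptions made for the sketch. The function name `gated_attention` is hypothetical.

```python
import math


def gated_attention(hidden, scores, gate_probs, threshold=0.5):
    """Softmax attention restricted to positions whose gate is open."""
    selected = [i for i, g in enumerate(gate_probs) if g >= threshold]
    if not selected:
        # Assumed fallback: keep the most probable position so the sum is defined.
        selected = [max(range(len(gate_probs)), key=gate_probs.__getitem__)]

    # Numerically stable softmax over the selected positions only.
    top = max(scores[i] for i in selected)
    exps = {i: math.exp(scores[i] - top) for i in selected}
    total = sum(exps.values())
    weights = [exps[i] / total if i in exps else 0.0 for i in range(len(hidden))]

    # Weighted sum of the selected hidden states.
    dim = len(hidden[0])
    context = [sum(weights[i] * hidden[i][d] for i in selected) for d in range(dim)]
    return weights, context


# Toy usage: the third state has the highest score but its gate is closed.
weights, context = gated_attention(
    hidden=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    scores=[0.0, 0.0, 5.0],
    gate_probs=[0.9, 0.8, 0.1],
)
```

Because closed positions never enter the softmax, their scores do not need to be computed at all, which is where the computational saving described in the abstract would come from.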

URNet : User-Resizable Residual Networks with Conditional Gating Module

Title URNet : User-Resizable Residual Networks with Conditional Gating Module
Authors Sang-ho Lee, Simyung Chang, Nojun Kwak
Abstract Convolutional Neural Networks are widely used to process spatial scenes, but their computational cost is fixed and depends on the structure of the network used. There are methods to reduce the cost by compressing networks or by varying their computational path dynamically according to the input image. However, since a user cannot control the size of the learned model, it is difficult to respond dynamically if the number of service requests suddenly increases. We propose User-Resizable Residual Networks (URNet), which allow users to adjust the scale of the network as needed during evaluation. URNet includes a Conditional Gating Module (CGM) that determines the use of each residual block according to the input image and the desired scale. CGM is trained in a supervised manner using the newly proposed scale loss and its corresponding training methods. URNet can control the amount of computation according to the user’s demand without degrading the accuracy significantly. It can also be used as a general compression method by fixing the scale size during training. In the experiments on ImageNet, URNet based on ResNet-101 maintains the accuracy of the baseline even when resized to approximately 80% of the original network, and shows only about 1% accuracy degradation when using about 65% of the computation.
Tasks
Published 2019-01-15
URL http://arxiv.org/abs/1901.04687v2
PDF http://arxiv.org/pdf/1901.04687v2.pdf
PWC https://paperswithcode.com/paper/urnet-user-resizable-residual-networks-with
Repo
Framework

Designing Deep Reinforcement Learning for Human Parameter Exploration

Title Designing Deep Reinforcement Learning for Human Parameter Exploration
Authors Hugo Scurto, Bavo Van Kerrebroeck, Baptiste Caramiaux, Frédéric Bevilacqua
Abstract Software tools for generating digital sound often present users with high-dimensional, parametric interfaces that may not facilitate exploration of diverse sound designs. In this paper, we propose to investigate artificial agents using deep reinforcement learning to explore parameter spaces in partnership with users for sound design. We describe a series of user-centred studies to probe the creative benefits of these agents and to adapt their design to exploration. Preliminary studies observing users’ exploration strategies with parametric interfaces and testing different agent exploration behaviours led to the design of a fully-functioning prototype, called Co-Explorer, that we evaluated in a workshop with professional sound designers. We found that the Co-Explorer enables a novel creative workflow centred on human-machine partnership, which has been positively received by practitioners. We also highlight varied user exploration behaviours while partnering with our system. Finally, we frame design guidelines for enabling such co-exploration workflows in creative digital applications.
Tasks
Published 2019-07-01
URL https://arxiv.org/abs/1907.00824v1
PDF https://arxiv.org/pdf/1907.00824v1.pdf
PWC https://paperswithcode.com/paper/designing-deep-reinforcement-learning-for
Repo
Framework

ShellNet: Efficient Point Cloud Convolutional Neural Networks using Concentric Shells Statistics

Title ShellNet: Efficient Point Cloud Convolutional Neural Networks using Concentric Shells Statistics
Authors Zhiyuan Zhang, Binh-Son Hua, Sai-Kit Yeung
Abstract Deep learning with 3D data has progressed significantly since the introduction of convolutional neural networks that can handle point order ambiguity in point cloud data. While achieving good accuracy in various scene understanding tasks, previous methods often have low training speed and complex network architectures. In this paper, we address these problems by proposing an efficient end-to-end permutation invariant convolution for point cloud deep learning. Our simple yet effective convolution operator, named ShellConv, uses statistics from concentric spherical shells to define representative features and resolve the point order ambiguity, allowing traditional convolution to be performed on such features. Based on ShellConv, we further build an efficient neural network named ShellNet to directly consume point clouds with larger receptive fields while using fewer layers. We demonstrate the efficacy of ShellNet by producing state-of-the-art results on object classification, object part segmentation, and semantic scene segmentation while keeping the network very fast to train.
Tasks Semantic Segmentation
Published 2019-08-17
URL https://arxiv.org/abs/1908.06295v1
PDF https://arxiv.org/pdf/1908.06295v1.pdf
PWC https://paperswithcode.com/paper/shellnet-efficient-point-cloud-convolutional
Repo
Framework
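
The use of per-shell statistics to remove point order ambiguity can be illustrated with a small sketch. It is a simplification under stated assumptions rather than the ShellConv operator: neighbours are split into shells of equal point count after sorting by distance, the statistic is a per-channel maximum, and leftover points that do not fill a shell are dropped. The function name `shell_features` is hypothetical.

```python
import math


def shell_features(center, points, feats, n_shells):
    """Max-pool neighbour features inside concentric shells around `center`."""
    # Sorting by distance makes the result independent of the input order
    # (as long as distances are distinct).
    order = sorted(range(len(points)), key=lambda i: math.dist(center, points[i]))
    per_shell = len(order) // n_shells   # assumed: equal point count per shell
    dim = len(feats[0])
    shells = []
    for s in range(n_shells):
        members = order[s * per_shell:(s + 1) * per_shell]
        shells.append([max(feats[i][d] for i in members) for d in range(dim)])
    return shells  # ordered from the innermost to the outermost shell


# Toy usage: four neighbours with two feature channels, grouped into two shells.
points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (4.0, 0.0, 0.0)]
feats = [[1.0, 5.0], [3.0, 2.0], [0.0, 7.0], [9.0, 1.0]]
shells = shell_features((0.0, 0.0, 0.0), points, feats, n_shells=2)

# Feeding the same neighbours in reverse order yields the same shell features.
shells_reversed = shell_features((0.0, 0.0, 0.0), points[::-1], feats[::-1], n_shells=2)
```

Since the shells have a natural inner-to-outer order, the resulting sequence of pooled features is something an ordinary 1D convolution could be applied to.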

Learning to Reason with Relational Video Representation for Question Answering

Title Learning to Reason with Relational Video Representation for Question Answering
Authors Thao Minh Le, Vuong Le, Svetha Venkatesh, Truyen Tran
Abstract How does a machine learn to reason about the content of a video in answering a question? A Video QA system must simultaneously understand language, represent visual content over space-time, iteratively transform these representations in response to lingual content in the query, and finally arrive at a sensible answer. While recent advances in textual and visual question answering have come up with sophisticated visual representation and neural reasoning mechanisms, major challenges in Video QA remain in the dynamic grounding of concepts, relations and actions to support the reasoning process. We present a new end-to-end layered architecture for Video QA, which is composed of a question-guided video representation layer and a generic reasoning layer to produce the answer. The video is represented using a hierarchical model that encodes visual information about objects, actions and relations in space-time given the textual cues from the question. The encoded representation is then passed to a reasoning module, which in this paper is implemented as a MAC net. The system is evaluated on the SVQA (synthetic) and TGIF-QA (real) datasets, demonstrating state-of-the-art results, with a large margin in the case of multi-step reasoning.
Tasks Question Answering, Visual Question Answering
Published 2019-07-10
URL https://arxiv.org/abs/1907.04553v1
PDF https://arxiv.org/pdf/1907.04553v1.pdf
PWC https://paperswithcode.com/paper/learning-to-reason-with-relational-video
Repo
Framework

Convex Programming for Estimation in Nonlinear Recurrent Models

Title Convex Programming for Estimation in Nonlinear Recurrent Models
Authors Sohail Bahmani, Justin Romberg
Abstract We propose a formulation for nonlinear recurrent models that includes simple parametric models of recurrent neural networks as a special case. The proposed formulation leads to a natural estimator in the form of a convex program. We provide a sample complexity for this estimator in the case of stable dynamics, where the nonlinear recursion has a certain contraction property, and under certain regularity conditions on the input distribution. We evaluate the performance of the estimator by simulation on synthetic data. These numerical experiments also suggest the extent to which the imposed theoretical assumptions may be relaxed.
Tasks
Published 2019-08-26
URL https://arxiv.org/abs/1908.09915v1
PDF https://arxiv.org/pdf/1908.09915v1.pdf
PWC https://paperswithcode.com/paper/convex-programming-for-estimation-in
Repo
Framework

Query-bag Matching with Mutual Coverage for Information-seeking Conversations in E-commerce

Title Query-bag Matching with Mutual Coverage for Information-seeking Conversations in E-commerce
Authors Zhenxin Fu, Feng Ji, Wenpeng Hu, Wei Zhou, Dongyan Zhao, Haiqing Chen, Rui Yan
Abstract Information-seeking conversation systems aim at satisfying the information needs of users through conversations. Text matching between a user query and a pre-collected question is an important part of information-seeking conversation in E-commerce. In practical scenarios, a group of questions often corresponds to the same answer. Naturally, these questions can form a bag. Learning the matching between the user query and the bag directly, denoted as query-bag matching, may improve the conversation performance. Inspired by this idea, we propose a query-bag matching model that mainly utilizes the mutual coverage between query and bag, measuring the degree to which the content of the query is mentioned by the bag, and vice versa. In addition, the learned word-level bag representation helps find the main points of a bag at a fine granularity and improves the query-bag matching performance. Experiments on two datasets show the effectiveness of our model.
Tasks Text Matching
Published 2019-11-07
URL https://arxiv.org/abs/1911.02747v1
PDF https://arxiv.org/pdf/1911.02747v1.pdf
PWC https://paperswithcode.com/paper/query-bag-matching-with-mutual-coverage-for
Repo
Framework

Integrated Triaging for Fast Reading Comprehension

Title Integrated Triaging for Fast Reading Comprehension
Authors Felix Wu, Boyi Li, Lequn Wang, Ni Lao, John Blitzer, Kilian Q. Weinberger
Abstract Although automatic machine reading comprehension (MRC) systems have recently reached super-human performance according to several benchmarks, less attention has been paid to their computational efficiency. However, efficiency is of crucial importance for training and deployment in real world applications. This paper introduces Integrated Triaging, a framework that prunes almost all context in early layers of a network, leaving the remaining (deep) layers to scan only a tiny fraction of the full corpus. This pruning drastically increases the efficiency of MRC models and further prevents the later layers from overfitting to prevalent short paragraphs in the training set. Our framework is extremely flexible and naturally applicable to a wide variety of models. Our experiments on the doc-SQuAD and TriviaQA tasks demonstrate its effectiveness in consistently improving both speed and quality of several diverse MRC models.
Tasks Machine Reading Comprehension, Reading Comprehension
Published 2019-09-28
URL https://arxiv.org/abs/1909.13128v1
PDF https://arxiv.org/pdf/1909.13128v1.pdf
PWC https://paperswithcode.com/paper/integrated-triaging-for-fast-reading
Repo
Framework

Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation

Title Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation
Authors Chao Wen, Yinda Zhang, Zhuwen Li, Yanwei Fu
Abstract We study the problem of shape generation in 3D mesh representation from a few color images with known camera poses. While many previous works learn to hallucinate the shape directly from priors, we resort to further improving the shape quality by leveraging cross-view information with a graph convolutional network. Instead of building a direct mapping function from images to 3D shape, our model learns to predict a series of deformations to improve a coarse shape iteratively. Inspired by traditional multiple view geometry methods, our network samples the area around the initial mesh’s vertex locations and reasons about an optimal deformation using perceptual feature statistics built from multiple input images. Extensive experiments show that our model produces accurate 3D shapes that are not only visually plausible from the input perspectives, but also well aligned to arbitrary viewpoints. With the help of a physically driven architecture, our model also exhibits generalization capability across different semantic categories, numbers of input images, and qualities of mesh initialization.
Tasks
Published 2019-08-05
URL https://arxiv.org/abs/1908.01491v2
PDF https://arxiv.org/pdf/1908.01491v2.pdf
PWC https://paperswithcode.com/paper/pixel2mesh-multi-view-3d-mesh-generation-via
Repo
Framework

Learning Alignment for Multimodal Emotion Recognition from Speech

Title Learning Alignment for Multimodal Emotion Recognition from Speech
Authors Haiyang Xu, Hui Zhang, Kun Han, Yun Wang, Yiping Peng, Xiangang Li
Abstract Speech emotion recognition is a challenging problem because humans convey emotions in subtle and complex ways. For emotion recognition on human speech, one can either extract emotion-related features from audio signals or employ speech recognition techniques to generate text from speech and then apply natural language processing to analyze the sentiment. Although emotion recognition would benefit from using audio-textual multimodal information, it is not trivial to build a system that learns from multiple modalities. One can build models for the two input sources separately and combine them at the decision level, but this method ignores the interaction between speech and text in the temporal domain. In this paper, we propose to use an attention mechanism to learn the alignment between speech frames and text words, aiming to produce more accurate multimodal feature representations. The aligned multimodal features are fed into a sequential model for emotion recognition. We evaluate the approach on the IEMOCAP dataset, and the experimental results show that the proposed approach achieves state-of-the-art performance on the dataset.
Tasks Emotion Recognition, Multimodal Emotion Recognition, Speech Emotion Recognition, Speech Recognition
Published 2019-09-06
URL https://arxiv.org/abs/1909.05645v1
PDF https://arxiv.org/pdf/1909.05645v1.pdf
PWC https://paperswithcode.com/paper/learning-alignment-for-multimodal-emotion
Repo
Framework
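
The attention-based alignment between speech frames and text words can be sketched as follows. This is a minimal illustration and not the paper's model: it uses plain dot-product scores, assumes word and frame vectors already share the same dimension (a real system would likely use learned projections), and fuses by simple concatenation. The names `softmax`, `dot` and `align` are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of scores."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def align(words, frames):
    """For each word, attend over all speech frames and fuse the result."""
    fused = []
    dim = len(frames[0])
    for word in words:
        weights = softmax([dot(word, frame) for frame in frames])
        # Attention-weighted average of the frames: the speech aligned to this word.
        aligned = [sum(w * frame[d] for w, frame in zip(weights, frames))
                   for d in range(dim)]
        fused.append(list(word) + aligned)   # assumed fusion: concatenation
    return fused


# Toy usage: two word vectors, three speech frames (made-up values).
fused = align(
    words=[[10.0, 0.0], [0.0, 10.0]],
    frames=[[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
)
```

The output has one fused vector per word, so it could be passed to a sequential model in the way the abstract describes.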