Paper Group ANR 654
Neural Text Summarization: A Critical Evaluation. Symplectic Recurrent Neural Networks. Learning-based Model Predictive Control for Smart Building Thermal Management. Understanding Troll Writing as a Linguistic Phenomenon. Sistema Sensor para el Monitoreo Ambiental Basado en Redes Neuronales. Not All Attention Is Needed: Gated Attention Network for …
Neural Text Summarization: A Critical Evaluation
Title | Neural Text Summarization: A Critical Evaluation |
Authors | Wojciech Kryściński, Nitish Shirish Keskar, Bryan McCann, Caiming Xiong, Richard Socher |
Abstract | Text summarization aims at compressing long documents into a shorter form that conveys the most important parts of the original document. Despite increased interest in the community and notable research effort, progress on benchmark datasets has stagnated. We critically evaluate key ingredients of the current research setup: datasets, evaluation metrics, and models, and highlight three primary shortcomings: 1) automatically collected datasets leave the task underconstrained and may contain noise detrimental to training and evaluation, 2) current evaluation protocol is weakly correlated with human judgment and does not account for important characteristics such as factual correctness, 3) models overfit to layout biases of current datasets and offer limited diversity in their outputs. |
Tasks | Text Summarization |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.08960v1 |
https://arxiv.org/pdf/1908.08960v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-text-summarization-a-critical |
Repo | |
Framework | |
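One of the paper's central criticisms is that automatic metrics such as ROUGE correlate only weakly with human judgment and ignore factual correctness. As a minimal illustration of that kind of analysis (not the authors' evaluation code), the sketch below implements unigram ROUGE-1 F1 from scratch and correlates it with hypothetical human ratings on made-up examples.

```python
from collections import Counter
from scipy.stats import pearsonr

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap ROUGE-1 F1 between a candidate and a reference summary."""
    cand, ref = Counter(candidate.lower().split()), Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / sum(cand.values()), overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Toy data: four system summaries, their references, and hypothetical human ratings.
references = [
    "the cat sat on the mat",
    "stocks fell sharply on monday",
    "the team won the final match",
    "heavy rain caused flooding in the city",
]
candidates = [
    "a cat was sitting on a mat",
    "markets dropped on monday",
    "the team lost the final match",          # factually wrong, yet high word overlap
    "the city saw flooding after heavy rain",
]
human_scores = [0.9, 0.7, 0.2, 0.8]           # hypothetical human judgments
metric_scores = [rouge1_f1(c, r) for c, r in zip(candidates, references)]
print(pearsonr(metric_scores, human_scores))  # weak correlation illustrates the critique
```

The third pair shows the factual-correctness problem the paper highlights: the candidate contradicts the reference but still scores highly on word overlap.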
Symplectic Recurrent Neural Networks
Title | Symplectic Recurrent Neural Networks |
Authors | Zhengdao Chen, Jianyu Zhang, Martin Arjovsky, Léon Bottou |
Abstract | We propose Symplectic Recurrent Neural Networks (SRNNs) as learning algorithms that capture the dynamics of physical systems from observed trajectories. An SRNN models the Hamiltonian function of the system by a neural network and furthermore leverages symplectic integration, multiple-step training and initial state optimization to address the challenging numerical issues associated with Hamiltonian systems. We show SRNNs succeed reliably on complex and noisy Hamiltonian systems. We also show how to augment the SRNN integration scheme in order to handle stiff dynamical systems such as bouncing billiards. |
Tasks | |
Published | 2019-09-29 |
URL | https://arxiv.org/abs/1909.13334v1 |
https://arxiv.org/pdf/1909.13334v1.pdf | |
PWC | https://paperswithcode.com/paper/symplectic-recurrent-neural-networks |
Repo | |
Framework | |
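To make the symplectic-integration idea concrete, here is a minimal PyTorch sketch of a separable neural Hamiltonian H(q, p) = T(p) + V(q) integrated with a leapfrog step, in the spirit of the description above. The module sizes and names are illustrative assumptions, not the authors' implementation, and the initial-state optimization and stiff-dynamics extensions are omitted.

```python
import torch
import torch.nn as nn

class SeparableHamiltonian(nn.Module):
    """H(q, p) = T(p) + V(q), with kinetic and potential terms as small MLPs."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.T = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))
        self.V = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))

def leapfrog_step(ham, q, p, dt):
    """One symplectic (leapfrog) step: half-kick, drift, half-kick."""
    dV = torch.autograd.grad(ham.V(q).sum(), q, create_graph=True)[0]
    p = p - 0.5 * dt * dV                       # half-step momentum update
    dT = torch.autograd.grad(ham.T(p).sum(), p, create_graph=True)[0]
    q = q + dt * dT                             # full-step position update
    dV = torch.autograd.grad(ham.V(q).sum(), q, create_graph=True)[0]
    p = p - 0.5 * dt * dV                       # second half-step momentum update
    return q, p

def rollout(ham, q0, p0, dt, steps):
    """Integrate a trajectory; multi-step training compares it to observed states."""
    q = q0.detach().clone().requires_grad_(True)   # differentiable copies of the
    p = p0.detach().clone().requires_grad_(True)   # observed initial state
    traj = [(q, p)]
    for _ in range(steps):
        q, p = leapfrog_step(ham, q, p, dt)
        traj.append((q, p))
    return traj
```

Training would roll this integrator out for several steps and backpropagate a loss between the predicted and observed trajectory; the augmented scheme for stiff systems such as billiards is not shown.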
Learning-based Model Predictive Control for Smart Building Thermal Management
Title | Learning-based Model Predictive Control for Smart Building Thermal Management |
Authors | Roja Eini, Sherif Abdelwahed |
Abstract | This paper proposes a learning-based model predictive control (MPC) approach for the thermal control of a four-zone smart building. The objectives are to minimize energy consumption and maintain the residents’ comfort. The proposed control scheme incorporates learning into model-based control. The occupancy profiles in the building zones are estimated over a long-term horizon through an artificial neural network (ANN), and these estimates are fed into the model-based predictor to obtain indoor temperature predictions. The EnergyPlus software is utilized as the provider of the actual data (weather, indoor temperature, energy consumption). The optimization problem, which incorporates both the actual and the predicted data, is solved at each simulation step to generate the setpoint temperature for the heating/cooling system. Compared with conventional MPC, the proposed method achieves significantly better performance in energy savings (40.56% less cooling power consumption and 16.73% less heating power consumption) and residents’ comfort. |
Tasks | |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.05331v1 |
https://arxiv.org/pdf/1909.05331v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-based-model-predictive-control-for-2 |
Repo | |
Framework | |
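As a toy illustration of the receding-horizon optimization described above (not the paper's controller), the sketch below assumes a hypothetical single-zone linear thermal model and a hard-coded occupancy forecast standing in for the ANN and the EnergyPlus data, and solves a single horizon with SciPy.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical single-zone thermal model: T[k+1] = a*T[k] + b*u[k] + c*T_out[k]
a, b, c = 0.9, 0.05, 0.1
horizon = 12                             # number of control steps in the horizon
T_out = np.full(horizon, 5.0)            # outdoor temperature forecast (deg C)
occupancy = np.array([0] * 4 + [1] * 8)  # stand-in for the ANN occupancy forecast
T0, T_comfort = 18.0, 21.0

def cost(u):
    """Energy use plus a comfort penalty that is active only when occupied."""
    T, J = T0, 0.0
    for k in range(horizon):
        T = a * T + b * u[k] + c * T_out[k]
        J += 0.01 * u[k] ** 2                      # energy term
        J += occupancy[k] * (T - T_comfort) ** 2   # comfort term
    return J

res = minimize(cost, x0=np.zeros(horizon), bounds=[(0.0, 100.0)] * horizon)
print("heating input to apply this step:", res.x[0])   # receding horizon: apply only u[0]
```

In a full MPC loop this solve would be repeated at every step with updated measurements and forecasts.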
Understanding Troll Writing as a Linguistic Phenomenon
Title | Understanding Troll Writing as a Linguistic Phenomenon |
Authors | Sergei Monakhov |
Abstract | The current study yielded a number of important findings. We managed to build a neural network that achieved an accuracy score of 91 per cent in classifying troll and genuine tweets. By means of regression analysis, we identified a number of features that make a tweet more likely to be labelled correctly and found that they are inherently present in troll tweets as a special type of discourse. We hypothesised that those features are grounded in the sociolinguistic limitations of troll writing, which can be best described as a combination of two factors: speaking with a purpose and trying to mask the purpose of speaking. Next, we contended that the orthogonal nature of these factors must necessarily result in skewed distributions of many different language parameters of troll messages. Taking as an example the distribution of topics and of the vocabulary associated with those topics, we observed some very pronounced distributional anomalies, thus confirming our prediction. |
Tasks | |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.08946v1 |
https://arxiv.org/pdf/1911.08946v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-troll-writing-as-a-linguistic |
Repo | |
Framework | |
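For readers who want a concrete starting point, the following is a generic text-classification sketch in scikit-learn with made-up tweets; it is not the author's network, features, or data, just a minimal example of labelling troll versus genuine messages.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Made-up examples; the study worked with real troll and genuine tweet corpora.
tweets = [
    "breaking: candidate secretly funded by foreign agents, share before it is deleted",
    "had a great run this morning, the park was beautiful",
    "wake up people, the media is hiding the truth from you",
    "my cat knocked my coffee over again, typical monday",
]
labels = [1, 0, 1, 0]  # 1 = troll, 0 = genuine (illustrative labels)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # word and bigram features
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
)
model.fit(tweets, labels)
print(model.predict(["they do not want you to know this, spread the word"]))
```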
Sistema Sensor para el Monitoreo Ambiental Basado en Redes Neuronales
Title | Sistema Sensor para el Monitoreo Ambiental Basado en Redes Neuronales |
Authors | Jose de Jesus Rubio, Jose Alberto Hernandez-Aguilar, Francisco Jacob Avila-Camacho, Juan Manuel Stein-Carrillo, Adolfo Melendez-Ramirez |
Abstract | In environmental monitoring tasks, it is of great importance to have compact, portable systems able to identify environmental contaminants, facilitating work related to waste management and environmental restoration. In this paper, a sensor prototype for identifying contaminants in the environment is described. The prototype consists of an array of tin oxide (SnO2) gas sensors used to detect chemical vapors, a data acquisition stage implemented on a low-cost ARM (Advanced RISC Machine) platform (Arduino), and a neural network able to identify environmental contaminants automatically. The neural network is used to identify the composition of the sensed contaminants. In the computer system, the heavy computational load appears only during training; once the neural network has been trained, operation consists of propagating the data through the network with a much lighter computational load, mainly a vector-matrix multiplication and a lookup table that holds the activation function, allowing unknown samples to be identified quickly. |
Tasks | |
Published | 2019-04-28 |
URL | http://arxiv.org/abs/1904.12234v1 |
http://arxiv.org/pdf/1904.12234v1.pdf | |
PWC | https://paperswithcode.com/paper/sistema-sensor-para-el-monitoreo-ambiental |
Repo | |
Framework | |
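The abstract's point that deployed inference reduces to a vector-matrix multiplication plus an activation lookup table can be sketched in a few lines of NumPy. The layer sizes and random weights below are illustrative assumptions; in practice the weights would come from the offline training stage.

```python
import numpy as np

# Precompute a sigmoid lookup table so the deployed device avoids exp() at run time.
TABLE_X = np.linspace(-8.0, 8.0, 1024)
TABLE_Y = 1.0 / (1.0 + np.exp(-TABLE_X))

def sigmoid_lut(z):
    """Activation via nearest-entry table lookup, as on the embedded platform."""
    idx = np.clip(np.searchsorted(TABLE_X, z), 0, len(TABLE_X) - 1)
    return TABLE_Y[idx]

def forward(x, W1, b1, W2, b2):
    """Inference = two vector-matrix multiplications plus table lookups."""
    h = sigmoid_lut(x @ W1 + b1)
    return sigmoid_lut(h @ W2 + b2)

# Illustrative shapes: 8 SnO2 sensor channels, 16 hidden units, 3 contaminant classes.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 16)), np.zeros(16)   # in practice, trained offline
W2, b2 = rng.normal(size=(16, 3)), np.zeros(3)
reading = rng.normal(size=8)                      # one vector of sensor readings
print(forward(reading, W1, b1, W2, b2))
```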
Not All Attention Is Needed: Gated Attention Network for Sequence Data
Title | Not All Attention Is Needed: Gated Attention Network for Sequence Data |
Authors | Lanqing Xue, Xiaopeng Li, Nevin L. Zhang |
Abstract | Although deep neural networks generally have fixed network structures, the concept of dynamic mechanisms has drawn more and more attention in recent years. Attention mechanisms compute input-dependent dynamic attention weights for aggregating a sequence of hidden states. Dynamic network configuration in convolutional neural networks (CNNs) selectively activates only part of the network at a time for different inputs. In this paper, we combine the two dynamic mechanisms for text classification tasks. Traditional attention mechanisms attend to the whole sequence of hidden states for an input sentence, while in most cases not all attention is needed, especially for long sequences. We propose a novel method called Gated Attention Network (GA-Net) that dynamically selects a subset of elements to attend to using an auxiliary network, and computes attention weights to aggregate the selected elements. It avoids a significant amount of unnecessary computation on unattended elements and allows the model to pay attention to the important parts of the sequence. Experiments on various datasets show that the proposed method achieves better performance than all baseline models with global or local attention, while requiring less computation and offering better interpretability. It is also promising to extend the idea to more complex attention-based models, such as transformers and seq-to-seq models. |
Tasks | Text Classification |
Published | 2019-12-01 |
URL | https://arxiv.org/abs/1912.00349v1 |
https://arxiv.org/pdf/1912.00349v1.pdf | |
PWC | https://paperswithcode.com/paper/not-all-attention-is-needed-gated-attention |
Repo | |
Framework | |
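A minimal PyTorch sketch of the gated-attention idea is given below: an auxiliary network scores each hidden state, a hard gate selects a subset, and attention weights are computed only over the selected elements. The straight-through sigmoid gate used here is an assumption for illustration; the paper's exact discrete relaxation and network details may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedAttention(nn.Module):
    """An auxiliary network selects elements; attention is computed only over them."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate_net = nn.Linear(dim, 1)   # auxiliary network scoring each element
        self.query = nn.Linear(dim, dim)

    def forward(self, h, context):
        # h: (batch, seq, dim) hidden states; context: (batch, dim) query vector
        soft = torch.sigmoid(self.gate_net(h).squeeze(-1))     # (batch, seq)
        hard = (soft > 0.5).float()
        gates = hard + soft - soft.detach()                    # straight-through gate
        scores = torch.einsum("bd,bsd->bs", self.query(context), h)
        scores = scores.masked_fill(gates == 0, -1e9)          # drop unselected elements
        weights = F.softmax(scores, dim=-1) * gates
        weights = weights / weights.sum(dim=-1, keepdim=True).clamp_min(1e-8)
        return torch.einsum("bs,bsd->bd", weights, h)          # aggregated representation
```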
URNet : User-Resizable Residual Networks with Conditional Gating Module
Title | URNet : User-Resizable Residual Networks with Conditional Gating Module |
Authors | Sang-ho Lee, Simyung Chang, Nojun Kwak |
Abstract | Convolutional Neural Networks are widely used to process spatial scenes, but their computational cost is fixed and depends on the structure of the network used. There are methods to reduce the cost by compressing networks or by varying their computational path dynamically according to the input image. However, since a user cannot control the size of the learned model, it is difficult to respond dynamically if the volume of service requests suddenly increases. We propose User-Resizable Residual Networks (URNet), which allows users to adjust the scale of the network as needed during evaluation. URNet includes a Conditional Gating Module (CGM) that determines the use of each residual block according to the input image and the desired scale. CGM is trained in a supervised manner using the newly proposed scale loss and its corresponding training methods. URNet can control the amount of computation according to the user’s demand without significantly degrading accuracy. It can also be used as a general compression method by fixing the scale size during training. In experiments on ImageNet, URNet based on ResNet-101 maintains the accuracy of the baseline even when resized to approximately 80% of the original network, and shows only about 1% accuracy degradation when using about 65% of the computation. |
Tasks | |
Published | 2019-01-15 |
URL | http://arxiv.org/abs/1901.04687v2 |
http://arxiv.org/pdf/1901.04687v2.pdf | |
PWC | https://paperswithcode.com/paper/urnet-user-resizable-residual-networks-with |
Repo | |
Framework | |
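The sketch below illustrates the conditional gating idea in PyTorch: a small module looks at the block's input and the user-specified scale and decides whether the residual body is executed. The gate parameterization and block layout are illustrative assumptions, and the scale loss used to train the CGM is omitted.

```python
import torch
import torch.nn as nn

class ConditionalGatingModule(nn.Module):
    """Decides whether to execute a residual block, given the input and a user scale."""
    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Linear(channels + 1, 1)   # +1 for the desired scale in [0, 1]

    def forward(self, x, scale):
        pooled = x.mean(dim=(2, 3))                              # (batch, channels)
        s = torch.full((x.size(0), 1), float(scale), device=x.device)
        soft = torch.sigmoid(self.fc(torch.cat([pooled, s], dim=1)))
        hard = (soft > 0.5).float()
        return hard + soft - soft.detach()                       # straight-through gate

class GatedResidualBlock(nn.Module):
    """Residual block whose body is skipped when the gate is closed."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.cgm = ConditionalGatingModule(channels)

    def forward(self, x, scale=1.0):
        gate = self.cgm(x, scale)                                # (batch, 1)
        return x + gate.view(-1, 1, 1, 1) * self.body(x)
```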
Designing Deep Reinforcement Learning for Human Parameter Exploration
Title | Designing Deep Reinforcement Learning for Human Parameter Exploration |
Authors | Hugo Scurto, Bavo Van Kerrebroeck, Baptiste Caramiaux, Frédéric Bevilacqua |
Abstract | Software tools for generating digital sound often present users with high-dimensional, parametric interfaces that may not facilitate exploration of diverse sound designs. In this paper, we investigate artificial agents that use deep reinforcement learning to explore parameter spaces in partnership with users for sound design. We describe a series of user-centred studies to probe the creative benefits of these agents and to adapt their design to exploration. Preliminary studies observing users’ exploration strategies with parametric interfaces and testing different agent exploration behaviours led to the design of a fully functioning prototype, called Co-Explorer, that we evaluated in a workshop with professional sound designers. We found that the Co-Explorer enables a novel creative workflow centred on human-machine partnership, which was positively received by practitioners. We also highlight the varied exploration behaviours users exhibited while partnering with our system. Finally, we frame design guidelines for enabling such a co-exploration workflow in creative digital applications. |
Tasks | |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00824v1 |
https://arxiv.org/pdf/1907.00824v1.pdf | |
PWC | https://paperswithcode.com/paper/designing-deep-reinforcement-learning-for |
Repo | |
Framework | |
ShellNet: Efficient Point Cloud Convolutional Neural Networks using Concentric Shells Statistics
Title | ShellNet: Efficient Point Cloud Convolutional Neural Networks using Concentric Shells Statistics |
Authors | Zhiyuan Zhang, Binh-Son Hua, Sai-Kit Yeung |
Abstract | Deep learning with 3D data has progressed significantly since the introduction of convolutional neural networks that can handle point order ambiguity in point cloud data. While able to achieve good accuracy in various scene understanding tasks, previous methods often have low training speed and complex network architectures. In this paper, we address these problems by proposing an efficient end-to-end permutation-invariant convolution for point cloud deep learning. Our simple yet effective convolution operator, named ShellConv, uses statistics from concentric spherical shells to define representative features and resolve the point order ambiguity, allowing traditional convolution to be performed on such features. Based on ShellConv, we further build an efficient neural network named ShellNet to directly consume point clouds with larger receptive fields while using fewer layers. We demonstrate the efficacy of ShellNet by producing state-of-the-art results on object classification, object part segmentation, and semantic scene segmentation while keeping the network very fast to train. |
Tasks | Semantic Segmentation |
Published | 2019-08-17 |
URL | https://arxiv.org/abs/1908.06295v1 |
https://arxiv.org/pdf/1908.06295v1.pdf | |
PWC | https://paperswithcode.com/paper/shellnet-efficient-point-cloud-convolutional |
Repo | |
Framework | |
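A simplified sketch of the shell-statistics idea is shown below: for each point, the k nearest neighbours are split by distance into concentric shells, features are max-pooled within each shell, and a 1D convolution then mixes the shells. This is a stripped-down illustration of the concept, not the paper's ShellConv operator, which works on local point sets around sampled representative points and uses richer per-shell statistics.

```python
import torch
import torch.nn as nn

def shell_features(points, feats, k=32, n_shells=4):
    """For each point, split its k nearest neighbours by distance into concentric
    shells and max-pool the features inside each shell."""
    dists = torch.cdist(points, points)            # (N, N) pairwise distances
    _, knn_i = dists.topk(k, largest=False)        # indices of k nearest neighbours
    per_shell = k // n_shells                      # neighbours come sorted by distance
    shells = []
    for s in range(n_shells):
        idx = knn_i[:, s * per_shell:(s + 1) * per_shell]   # (N, per_shell)
        shells.append(feats[idx].max(dim=1).values)         # max-pool within the shell
    return torch.stack(shells, dim=1)                       # (N, n_shells, C)

class ShellConvSketch(nn.Module):
    """A 1D convolution across the shell axis mixes information between shells."""
    def __init__(self, in_ch, out_ch, n_shells=4):
        super().__init__()
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size=n_shells)

    def forward(self, points, feats):
        shells = shell_features(points, feats, n_shells=self.conv.kernel_size[0])
        return self.conv(shells.transpose(1, 2)).squeeze(-1)   # (N, out_ch)

points, feats = torch.rand(256, 3), torch.rand(256, 8)   # a toy point cloud
print(ShellConvSketch(8, 32)(points, feats).shape)        # torch.Size([256, 32])
```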
Learning to Reason with Relational Video Representation for Question Answering
Title | Learning to Reason with Relational Video Representation for Question Answering |
Authors | Thao Minh Le, Vuong Le, Svetha Venkatesh, Truyen Tran |
Abstract | How does a machine learn to reason about the content of a video when answering a question? A Video QA system must simultaneously understand language, represent visual content over space-time, iteratively transform these representations in response to the linguistic content of the query, and finally arrive at a sensible answer. While recent advances in textual and visual question answering have come up with sophisticated visual representations and neural reasoning mechanisms, major challenges in Video QA remain in the dynamic grounding of concepts, relations and actions to support the reasoning process. We present a new end-to-end layered architecture for Video QA, which is composed of a question-guided video representation layer and a generic reasoning layer to produce an answer. The video is represented using a hierarchical model that encodes visual information about objects, actions and relations in space-time given the textual cues from the question. The encoded representation is then passed to a reasoning module, which, in this paper, is implemented as a MAC network. The system is evaluated on the SVQA (synthetic) and TGIF-QA (real) datasets, demonstrating state-of-the-art results, with a large margin in the case of multi-step reasoning. |
Tasks | Question Answering, Visual Question Answering |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04553v1 |
https://arxiv.org/pdf/1907.04553v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-reason-with-relational-video |
Repo | |
Framework | |
Convex Programming for Estimation in Nonlinear Recurrent Models
Title | Convex Programming for Estimation in Nonlinear Recurrent Models |
Authors | Sohail Bahmani, Justin Romberg |
Abstract | We propose a formulation for nonlinear recurrent models that includes simple parametric models of recurrent neural networks as a special case. The proposed formulation leads to a natural estimator in the form of a convex program. We provide a sample complexity bound for this estimator in the case of stable dynamics, where the nonlinear recursion has a certain contraction property, and under certain regularity conditions on the input distribution. We evaluate the performance of the estimator by simulation on synthetic data. These numerical experiments also suggest the extent to which the imposed theoretical assumptions may be relaxed. |
Tasks | |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1908.09915v1 |
https://arxiv.org/pdf/1908.09915v1.pdf | |
PWC | https://paperswithcode.com/paper/convex-programming-for-estimation-in |
Repo | |
Framework | |
Query-bag Matching with Mutual Coverage for Information-seeking Conversations in E-commerce
Title | Query-bag Matching with Mutual Coverage for Information-seeking Conversations in E-commerce |
Authors | Zhenxin Fu, Feng Ji, Wenpeng Hu, Wei Zhou, Dongyan Zhao, Haiqing Chen, Rui Yan |
Abstract | Information-seeking conversation systems aim to satisfy the information needs of users through conversation. Text matching between a user query and a pre-collected question is an important part of information-seeking conversation in E-commerce. In practical scenarios, a group of questions often corresponds to the same answer; naturally, these questions can form a bag. Learning the matching between a user query and a bag directly, which we denote query-bag matching, may improve conversation performance. Motivated by this idea, we propose a query-bag matching model that mainly utilizes the mutual coverage between the query and the bag, measuring the degree to which the content of the query is mentioned by the bag, and vice versa. In addition, the learned word-level bag representation helps find the main points of a bag at a fine granularity and improves query-bag matching performance. Experiments on two datasets show the effectiveness of our model. |
Tasks | Text Matching |
Published | 2019-11-07 |
URL | https://arxiv.org/abs/1911.02747v1 |
https://arxiv.org/pdf/1911.02747v1.pdf | |
PWC | https://paperswithcode.com/paper/query-bag-matching-with-mutual-coverage-for |
Repo | |
Framework | |
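A word-overlap stand-in for the mutual-coverage idea is easy to sketch: measure how much of the query is covered by the bag and how much of the bag is covered by the query. The learned coverage mechanism and bag representation in the paper are neural, so this is only a conceptual illustration, and the toy bag below is made up.

```python
def coverage(src_tokens, tgt_tokens):
    """Fraction of src tokens that also appear in tgt (simple word-level coverage)."""
    tgt = set(tgt_tokens)
    return sum(tok in tgt for tok in src_tokens) / max(len(src_tokens), 1)

def mutual_coverage(query, bag):
    """Coverage of the query by the bag and of the bag by the query."""
    q = query.lower().split()
    b = [tok for question in bag for tok in question.lower().split()]
    return coverage(q, b), coverage(b, q)

bag = [
    "how long does delivery take",
    "when will my order arrive",
    "what is the estimated shipping time",
]
print(mutual_coverage("how long until my order arrives", bag))
```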
Integrated Triaging for Fast Reading Comprehension
Title | Integrated Triaging for Fast Reading Comprehension |
Authors | Felix Wu, Boyi Li, Lequn Wang, Ni Lao, John Blitzer, Kilian Q. Weinberger |
Abstract | Although, according to several benchmarks, automatic machine reading comprehension (MRC) systems have recently reached super-human performance, less attention has been paid to their computational efficiency. However, efficiency is of crucial importance for training and deployment in real-world applications. This paper introduces Integrated Triaging, a framework that prunes almost all context in the early layers of a network, leaving the remaining (deep) layers to scan only a tiny fraction of the full corpus. This pruning drastically increases the efficiency of MRC models and further prevents the later layers from overfitting to the prevalent short paragraphs in the training set. Our framework is extremely flexible and naturally applicable to a wide variety of models. Our experiments on the doc-SQuAD and TriviaQA tasks demonstrate its effectiveness in consistently improving both the speed and the quality of several diverse MRC models. |
Tasks | Machine Reading Comprehension, Reading Comprehension |
Published | 2019-09-28 |
URL | https://arxiv.org/abs/1909.13128v1 |
https://arxiv.org/pdf/1909.13128v1.pdf | |
PWC | https://paperswithcode.com/paper/integrated-triaging-for-fast-reading |
Repo | |
Framework | |
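The triaging idea can be sketched generically: a cheap early component scores context chunks against the question, and only the top-scoring chunks reach the expensive deep layers. The scorer below is a hypothetical projection-based match over pre-computed chunk encodings, not the paper's architecture.

```python
import torch
import torch.nn as nn

class Triager(nn.Module):
    """Cheap early stage: score context chunks against the question, keep the top-k."""
    def __init__(self, dim: int, keep: int = 2):
        super().__init__()
        self.proj = nn.Linear(dim, dim)   # bilinear-style match via a projection
        self.keep = keep

    def forward(self, question_vec, chunk_vecs):
        # question_vec: (dim,); chunk_vecs: (num_chunks, dim)
        scores = chunk_vecs @ self.proj(question_vec)            # (num_chunks,)
        top = scores.topk(min(self.keep, chunk_vecs.size(0))).indices
        return chunk_vecs[top], top   # only these chunks reach the deep reader

# Usage sketch: the expensive deep layers now scan a tiny fraction of the context.
dim = 128
question = torch.randn(dim)
chunks = torch.randn(50, dim)      # e.g. 50 paragraph encodings from a shallow encoder
kept, idx = Triager(dim)(question, chunks)
print(kept.shape, idx)
```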
Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation
Title | Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation |
Authors | Chao Wen, Yinda Zhang, Zhuwen Li, Yanwei Fu |
Abstract | We study the problem of shape generation in 3D mesh representation from a few color images with known camera poses. While many previous works learn to hallucinate the shape directly from priors, we resort to further improving the shape quality by leveraging cross-view information with a graph convolutional network. Instead of building a direct mapping function from images to 3D shape, our model learns to predict a series of deformations that improve a coarse shape iteratively. Inspired by traditional multiple-view geometry methods, our network samples the area around the initial mesh’s vertex locations and infers an optimal deformation using perceptual feature statistics built from multiple input images. Extensive experiments show that our model produces accurate 3D shapes that are not only visually plausible from the input perspectives, but also well aligned to arbitrary viewpoints. With the help of a physically driven architecture, our model also exhibits generalization capability across different semantic categories, numbers of input images, and levels of mesh initialization quality. |
Tasks | |
Published | 2019-08-05 |
URL | https://arxiv.org/abs/1908.01491v2 |
https://arxiv.org/pdf/1908.01491v2.pdf | |
PWC | https://paperswithcode.com/paper/pixel2mesh-multi-view-3d-mesh-generation-via |
Repo | |
Framework | |
Learning Alignment for Multimodal Emotion Recognition from Speech
Title | Learning Alignment for Multimodal Emotion Recognition from Speech |
Authors | Haiyang Xu, Hui Zhang, Kun Han, Yun Wang, Yiping Peng, Xiangang Li |
Abstract | Speech emotion recognition is a challenging problem because humans convey emotions in subtle and complex ways. For emotion recognition from human speech, one can either extract emotion-related features from audio signals or employ speech recognition techniques to generate text from speech and then apply natural language processing to analyze the sentiment. Further, while emotion recognition can benefit from audio-textual multimodal information, it is not trivial to build a system that learns from multiple modalities. One can build models for the two input sources separately and combine them at the decision level, but this method ignores the interaction between speech and text in the temporal domain. In this paper, we propose to use an attention mechanism to learn the alignment between speech frames and text words, aiming to produce more accurate multimodal feature representations. The aligned multimodal features are fed into a sequential model for emotion recognition. We evaluate the approach on the IEMOCAP dataset, and the experimental results show that the proposed approach achieves state-of-the-art performance on this dataset. |
Tasks | Emotion Recognition, Multimodal Emotion Recognition, Speech Emotion Recognition, Speech Recognition |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.05645v1 |
https://arxiv.org/pdf/1909.05645v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-alignment-for-multimodal-emotion |
Repo | |
Framework | |
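A minimal PyTorch sketch of attention-based alignment between text words and speech frames is given below; the dimensions and the single-head dot-product form are illustrative assumptions, and the downstream sequential emotion classifier is not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalAlignment(nn.Module):
    """Attend from each text word to speech frames to build aligned multimodal features."""
    def __init__(self, text_dim: int, speech_dim: int, dim: int):
        super().__init__()
        self.q = nn.Linear(text_dim, dim)
        self.k = nn.Linear(speech_dim, dim)
        self.v = nn.Linear(speech_dim, dim)

    def forward(self, words, frames):
        # words: (batch, n_words, text_dim); frames: (batch, n_frames, speech_dim)
        attn = torch.matmul(self.q(words), self.k(frames).transpose(1, 2))  # (b, w, f)
        attn = F.softmax(attn / self.q.out_features ** 0.5, dim=-1)
        aligned_speech = torch.matmul(attn, self.v(frames))                 # (b, w, dim)
        # Concatenate each word with its aligned speech summary; a sequential
        # classifier (e.g. an LSTM) would consume this for emotion recognition.
        return torch.cat([words, aligned_speech], dim=-1)
```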