Paper Group ANR 1539
Improving Robustness of ReRAM-based Spiking Neural Network Accelerator with Stochastic Spike-timing-dependent-plasticity
Title | Improving Robustness of ReRAM-based Spiking Neural Network Accelerator with Stochastic Spike-timing-dependent-plasticity |
Authors | Xueyuan She, Yun Long, Saibal Mukhopadhyay |
Abstract | Spike-timing-dependent plasticity (STDP) is an unsupervised learning algorithm for spiking neural networks (SNNs) that promises both a deeper understanding of the human brain and more powerful artificial intelligence. While conventional computing systems fail to simulate SNNs efficiently, process-in-memory (PIM) architectures based on devices such as ReRAM can be used to design fast and efficient STDP-based SNN accelerators, as they operate in close resemblance to biological neural networks. However, real-life implementations of such designs still suffer from the impact of input noise and device variation. In this work, we present a novel stochastic STDP algorithm that uses spiking frequency information to dynamically adjust synaptic behavior. The algorithm is tested on a pattern recognition task with noisy input and shows an accuracy improvement over deterministic STDP. In addition, we show that the new algorithm can be used to design a robust ReRAM-based SNN accelerator with strong resilience to device variation. |
Tasks | |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.05401v1 |
PDF | https://arxiv.org/pdf/1909.05401v1.pdf |
PWC | https://paperswithcode.com/paper/improving-robustness-of-reram-based-spiking |
Repo | |
Framework | |
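The abstract above describes making the STDP update probabilistic and modulating it with spiking-frequency information. The snippet below is a minimal NumPy sketch of that idea; the exponential timing window, the frequency-to-probability mapping, and all hyperparameters (`a_plus`, `a_minus`, `tau`, `rate_ref`) are illustrative assumptions, not the paper's actual update rule.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_stdp_update(w, dt, post_rate, a_plus=0.01, a_minus=0.012,
                           tau=20.0, rate_ref=20.0, w_min=0.0, w_max=1.0):
    """Probabilistic STDP update for one synapse.

    dt        : t_post - t_pre (ms); positive means pre fired before post.
    post_rate : recent firing rate of the post-synaptic neuron (Hz), used to
                scale the switching probability (assumed mapping).
    """
    # Classic exponential STDP window gives the *probability* of an update
    # rather than a deterministic weight change.
    p = np.exp(-abs(dt) / tau)
    # Assumed frequency modulation: higher post-synaptic activity makes the
    # stochastic update more likely to fire.
    p *= np.clip(post_rate / rate_ref, 0.0, 1.0)
    if rng.random() < p:
        w += a_plus if dt > 0 else -a_minus
    return float(np.clip(w, w_min, w_max))

w = 0.5
for dt, rate in [(5.0, 30.0), (-8.0, 10.0), (2.0, 25.0)]:
    w = stochastic_stdp_update(w, dt, rate)
print(f"final weight: {w:.3f}")
```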
From Multi-modal Property Dataset to Robot-centric Conceptual Knowledge About Household Objects
Title | From Multi-modal Property Dataset to Robot-centric Conceptual Knowledge About Household Objects |
Authors | Madhura Thosar, Christian A. Mueller, Georg Jaeger, Johannes Schleiss, Narender Pulugu, Ravi Mallikarjun Chennaboina, Sai Vivek Jeevangekar, Andreas Birk, Max Pfingsthorn, Sebastian Zug |
Abstract | Tool-use applications in robotics require conceptual knowledge about objects for informed decision making and object interaction. State-of-the-art methods employ hand-crafted symbolic knowledge which is defined from a human perspective and grounded into sensory data afterwards. However, due to the different sensing and acting capabilities of robots, their conceptual understanding of objects must be generated entirely from the robot's perspective, which calls for robot-centric conceptual knowledge about objects. With this goal in mind, this article argues that such knowledge should be based on the physical and functional properties of objects. Consequently, a selection of ten properties is defined and corresponding extraction methods are proposed. This multi-modal property extraction forms the basis for our second contribution: robot-centric knowledge generation. It employs unsupervised clustering methods to transform numerical property data into symbols, and bivariate joint frequency distributions and sample proportions to generate conceptual knowledge about objects using the robot-centric symbols. A preliminary implementation of the proposed framework is employed to acquire a dataset comprising physical and functional property data of 110 household objects. This Robot-Centric dataSet (RoCS) is used to evaluate the framework with respect to the property extraction methods, the semantics of the considered properties within the dataset, and its usefulness in real-world applications such as tool substitution. |
Tasks | Decision Making |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.11114v1 |
PDF | https://arxiv.org/pdf/1906.11114v1.pdf |
PWC | https://paperswithcode.com/paper/from-multi-modal-property-dataset-to-robot |
Repo | |
Framework | |
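As a rough illustration of the pipeline described above (numeric property values → unsupervised clustering into symbols → bivariate joint frequency distribution and sample proportions), the sketch below uses k-means on synthetic data. The property names, the number of symbols, and the choice of clustering algorithm are assumptions for illustration, not taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Hypothetical numeric measurements for two properties across object observations.
rigidity = rng.normal(loc=[0.2, 0.5, 0.9], scale=0.05, size=(100, 3)).ravel()
hollowness = rng.uniform(0, 1, size=rigidity.size)

def to_symbols(values, n_symbols=3):
    """Unsupervised discretisation of a numeric property into symbols."""
    km = KMeans(n_clusters=n_symbols, n_init=10, random_state=0)
    labels = km.fit_predict(values.reshape(-1, 1))
    # Order cluster ids by their centre so symbol 0 = "low", 2 = "high".
    order = km.cluster_centers_.ravel().argsort()
    remap = {int(old): new for new, old in enumerate(order)}
    return np.array([remap[int(l)] for l in labels])

sym_r = to_symbols(rigidity)
sym_h = to_symbols(hollowness)

# Bivariate joint frequency distribution over the two symbolic properties,
# normalised to sample proportions.
joint = np.zeros((3, 3))
for r, h in zip(sym_r, sym_h):
    joint[r, h] += 1
print(joint / joint.sum())
```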
Generating Multi-Sentence Abstractive Summaries of Interleaved Texts
Title | Generating Multi-Sentence Abstractive Summaries of Interleaved Texts |
Authors | Sanjeev Kumar Karn, Francine Chen, Yan-Ying Chen, Ulli Waltinger, Hinrich Schütze |
Abstract | In multi-participant postings, as in online chat conversations, several conversations or topic threads may take place concurrently. This makes it difficult for readers reviewing the postings to follow the discussions and to quickly identify their essence. A two-step process, disentanglement of the interleaved posts followed by summarization of each thread, addresses the issue, but disentanglement errors propagate to the summarization step and degrade the overall performance. To address this, we propose an end-to-end trainable encoder-decoder network for summarizing interleaved posts. The interleaved posts are encoded hierarchically, i.e., word-to-word (words in a post) followed by post-to-post (posts in a channel). The decoder also generates summaries hierarchically, thread-to-thread (generating thread representations) followed by word-to-word (i.e., generating summary words). Additionally, we propose a hierarchical attention mechanism for interleaved text. Overall, our end-to-end trainable hierarchical framework improves performance over a sequence-to-sequence framework by 8% on a synthetic dataset of interleaved texts. |
Tasks | |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.01973v1 |
PDF | https://arxiv.org/pdf/1906.01973v1.pdf |
PWC | https://paperswithcode.com/paper/generating-multi-sentence-abstractive |
Repo | |
Framework | |
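A minimal PyTorch sketch of the word-to-word / post-to-post encoding described in the abstract. Dimensions, GRU cells, and the toy input are illustrative assumptions, and the thread-level decoder and hierarchical attention are omitted.

```python
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    """Word-to-word then post-to-post encoding of an interleaved channel
    (a sketch of the hierarchical encoding idea, not the paper's exact model)."""
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.word_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.post_rnn = nn.GRU(hid_dim, hid_dim, batch_first=True)

    def forward(self, channel):                # channel: (n_posts, n_words) token ids
        emb = self.embed(channel)              # (n_posts, n_words, emb_dim)
        _, post_vecs = self.word_rnn(emb)      # (1, n_posts, hid_dim): one vector per post
        post_states, channel_vec = self.post_rnn(post_vecs)  # posts as a sequence
        return post_states, channel_vec        # inputs for a thread-level decoder

enc = HierarchicalEncoder(vocab_size=1000)
posts = torch.randint(0, 1000, (6, 12))       # 6 interleaved posts, 12 tokens each
post_states, channel_vec = enc(posts)
print(post_states.shape, channel_vec.shape)   # torch.Size([1, 6, 128]) torch.Size([1, 1, 128])
```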
Advances in Machine Learning for the Behavioral Sciences
Title | Advances in Machine Learning for the Behavioral Sciences |
Authors | Tomáš Kliegr, Štěpán Bahník, Johannes Fürnkranz |
Abstract | The areas of machine learning and knowledge discovery in databases have considerably matured in recent years. In this article, we briefly review recent developments as well as classical algorithms that have stood the test of time. Our goal is to provide a general introduction to different tasks such as learning from tabular data, behavioral data, or textual data, with a particular focus on actual and potential applications in the behavioral sciences. The supplemental appendix to the article also provides practical guidance for using the methods by pointing the reader to proven software implementations. The focus is on R, but we also cover some libraries in other programming languages as well as systems with easy-to-use graphical interfaces. |
Tasks | |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03249v1 |
PDF | https://arxiv.org/pdf/1911.03249v1.pdf |
PWC | https://paperswithcode.com/paper/advances-in-machine-learning-for-the |
Repo | |
Framework | |
Direct Estimation of Differential Functional Graphical Models
Title | Direct Estimation of Differential Functional Graphical Models |
Authors | Boxin Zhao, Y. Samuel Wang, Mladen Kolar |
Abstract | We consider the problem of estimating the difference between two functional undirected graphical models with shared structures. In many applications, data are naturally regarded as high-dimensional random function vectors rather than multivariate scalars. For example, electroencephalography (EEG) data are more appropriately treated as functions of time. In these problems, not only can the number of functions measured per sample be large, but each function is itself an infinite dimensional object, making estimation of model parameters challenging. We develop a method that directly estimates the difference of graphs, avoiding separate estimation of each graph, and show it is consistent in certain high-dimensional settings. We illustrate finite sample properties of our method through simulation studies. Finally, we apply our method to EEG data to uncover differences in functional brain connectivity between alcoholics and control subjects. |
Tasks | EEG |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.09701v2 |
PDF | https://arxiv.org/pdf/1910.09701v2.pdf |
PWC | https://paperswithcode.com/paper/direct-estimation-of-differential-functional |
Repo | |
Framework | |
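A rough formalisation of the setting, with assumed notation: each group has a conditional-independence graph encoded by the sparsity pattern of a precision (inverse-covariance) operator, and the target is their difference,

$$ \Delta \;=\; \Theta^{X} - \Theta^{Y}, \qquad \Theta^{X} = (\Sigma^{X})^{-1}, \quad \Theta^{Y} = (\Sigma^{Y})^{-1}, $$

with an edge $(j,k)$ in the differential graph whenever the block $\Delta_{jk}$ is nonzero; "direct" estimation means targeting $\Delta$ without separately estimating $\Theta^{X}$ and $\Theta^{Y}$. In the functional case each node is itself a random function, so the blocks above refer to its finite-dimensional basis representation.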
Exploring Hierarchical Interaction Between Review and Summary for Better Sentiment Analysis
Title | Exploring Hierarchical Interaction Between Review and Summary for Better Sentiment Analysis |
Authors | Sen Yang, Leyang Cui, Yue Zhang |
Abstract | Sentiment analysis provides a useful overview of customer review content. Many review websites allow a user to enter a summary in addition to a full review. It has been shown that jointly predicting the review summary and the sentiment rating benefits both tasks. However, existing methods consider the integration of review and summary information only implicitly, which limits their performance to some extent. In this paper, we propose a hierarchically-refined attention network that better exploits the interaction between a review and its summary for sentiment analysis. In particular, the representation of a review is refined layer by layer through attention over the summary representation. Empirical results show that our model makes better use of user-written summaries for review sentiment analysis, and is also more effective than existing methods when the user summary is replaced with a summary generated by an automatic summarization system. |
Tasks | Sentiment Analysis |
Published | 2019-11-07 |
URL | https://arxiv.org/abs/1911.02711v1 |
PDF | https://arxiv.org/pdf/1911.02711v1.pdf |
PWC | https://paperswithcode.com/paper/exploring-hierarchical-interaction-between |
Repo | |
Framework | |
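A hedged PyTorch sketch of the layer-wise refinement idea: review token states repeatedly attend over the summary representation before being pooled for classification. Multi-head attention, the residual connection, and all sizes are stand-ins chosen for brevity, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SummaryRefinedLayer(nn.Module):
    """One refinement layer: review token states attend over summary states."""
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, review, summary):
        refined, _ = self.attn(query=review, key=summary, value=summary)
        return self.norm(review + refined)     # residual keeps the review content

class HierarchicallyRefinedClassifier(nn.Module):
    """Sketch of layer-wise refinement of a review by its summary, followed
    by pooling and a sentiment classifier (hyperparameters are illustrative)."""
    def __init__(self, vocab=5000, dim=128, n_layers=3, n_classes=5):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.layers = nn.ModuleList([SummaryRefinedLayer(dim) for _ in range(n_layers)])
        self.out = nn.Linear(dim, n_classes)

    def forward(self, review_ids, summary_ids):
        review, summary = self.embed(review_ids), self.embed(summary_ids)
        for layer in self.layers:
            review = layer(review, summary)
        return self.out(review.mean(dim=1))    # pooled representation -> rating

model = HierarchicallyRefinedClassifier()
logits = model(torch.randint(0, 5000, (2, 60)), torch.randint(0, 5000, (2, 12)))
print(logits.shape)                            # torch.Size([2, 5])
```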
Embedding Symbolic Knowledge into Deep Networks
Title | Embedding Symbolic Knowledge into Deep Networks |
Authors | Yaqi Xie, Ziwei Xu, Mohan S. Kankanhalli, Kuldeep S. Meel, Harold Soh |
Abstract | In this work, we aim to leverage prior symbolic knowledge to improve the performance of deep models. We propose a graph embedding network that projects propositional formulae (and assignments) onto a manifold via an augmented Graph Convolutional Network (GCN). To generate semantically faithful embeddings, we develop techniques to recognize node heterogeneity and a semantic regularization that incorporates structural constraints into the embedding. Experiments show that our approach improves the performance of models trained to perform entailment checking and visual relation prediction. Interestingly, we observe a connection between the tractability of the propositional theory representation and the ease of embedding. Future exploration of this connection may elucidate the relationship between knowledge compilation and vector representation learning. |
Tasks | Graph Embedding, Representation Learning |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.01161v4 |
PDF | https://arxiv.org/pdf/1909.01161v4.pdf |
PWC | https://paperswithcode.com/paper/semantically-regularized-logic-graph |
Repo | |
Framework | |
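A compact sketch of embedding a propositional formula with a GCN, where node heterogeneity is approximated by separate type embeddings for variable and operator nodes. The graph construction, type vocabulary, and pooling are assumptions for illustration; the paper's augmented GCN and semantic regularization are not reproduced here.

```python
import torch
import torch.nn as nn

def normalized_adjacency(adj):
    """Symmetric GCN normalisation: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + torch.eye(adj.size(0))
    d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
    return d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]

class FormulaGCN(nn.Module):
    """Embed a propositional formula graph; variable and operator nodes get
    separate type embeddings (a sketch, the paper's augmented GCN differs)."""
    def __init__(self, n_types=4, dim=32, emb_dim=16):
        super().__init__()
        self.type_emb = nn.Embedding(n_types, dim)   # e.g. VAR, AND, OR, NOT
        self.w1 = nn.Linear(dim, dim)
        self.w2 = nn.Linear(dim, emb_dim)

    def forward(self, adj, node_types):
        a = normalized_adjacency(adj)
        h = torch.relu(a @ self.w1(self.type_emb(node_types)))
        h = a @ self.w2(h)
        return h.mean(dim=0)                         # graph-level embedding

# Formula (x1 AND x2) as a tiny graph: nodes [AND, x1, x2]
adj = torch.tensor([[0., 1., 1.], [1., 0., 0.], [1., 0., 0.]])
types = torch.tensor([1, 0, 0])                      # 1 = AND, 0 = VAR
print(FormulaGCN()(adj, types).shape)                # torch.Size([16])
```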
Universal Invariant and Equivariant Graph Neural Networks
Title | Universal Invariant and Equivariant Graph Neural Networks |
Authors | Nicolas Keriven, Gabriel Peyré |
Abstract | Graph Neural Networks (GNN) come in many flavors, but should always be either invariant (permutation of the nodes of the input graph does not affect the output) or equivariant (permutation of the input permutes the output). In this paper, we consider a specific class of invariant and equivariant networks, for which we prove new universality theorems. More precisely, we consider networks with a single hidden layer, obtained by summing channels formed by applying an equivariant linear operator, a pointwise non-linearity and either an invariant or equivariant linear operator. Recently, Maron et al. (2019) showed that by allowing higher-order tensorization inside the network, universal invariant GNNs can be obtained. As a first contribution, we propose an alternative proof of this result, which relies on the Stone-Weierstrass theorem for algebra of real-valued functions. Our main contribution is then an extension of this result to the equivariant case, which appears in many practical applications but has been less studied from a theoretical point of view. The proof relies on a new generalized Stone-Weierstrass theorem for algebra of equivariant functions, which is of independent interest. Finally, unlike many previous settings that consider a fixed number of nodes, our results show that a GNN defined by a single set of parameters can approximate uniformly well a function defined on graphs of varying size. |
Tasks | |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.04943v2 |
PDF | https://arxiv.org/pdf/1905.04943v2.pdf |
PWC | https://paperswithcode.com/paper/universal-invariant-and-equivariant-graph |
Repo | |
Framework | |
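Under assumed notation, the one-hidden-layer networks considered in the abstract — sums of channels built from an equivariant linear operator, a pointwise non-linearity, and an invariant or equivariant linear operator — can be written as

$$ f(A) \;=\; \sum_{s=1}^{S} H_s\big[\rho\big(L_s[A] + b_s\big)\big], $$

where each $L_s$ is an equivariant linear operator, $\rho$ a pointwise non-linearity, and each $H_s$ is an invariant linear form (invariant case) or another equivariant linear operator (equivariant case). Roughly, the universality results say that such sums can uniformly approximate continuous invariant (respectively equivariant) functions of the input tensor $A$, with a single parameter set covering graphs of varying size.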
Creating A Neural Pedagogical Agent by Jointly Learning to Review and Assess
Title | Creating A Neural Pedagogical Agent by Jointly Learning to Review and Assess |
Authors | Youngnam Lee, Youngduck Choi, Junghyun Cho, Alexander R. Fabbri, Hyunbin Loh, Chanyou Hwang, Yongku Lee, Sang-Wook Kim, Dragomir Radev |
Abstract | Machine learning plays an increasing role in intelligent tutoring systems as both the amount of data available and specialization among students grow. Nowadays, these systems are frequently deployed on mobile applications. Users on such mobile education platforms are dynamic, frequently being added, accessing the application with varying levels of focus, and changing while using the service. The education material itself, on the other hand, is often static and is an exhaustible resource whose use in tasks such as problem recommendation must be optimized. The ability to update user models with respect to educational material in real-time is thus essential; however, existing approaches require time-consuming re-training of user features whenever new data is added. In this paper, we introduce a neural pedagogical agent for real-time user modeling in the task of predicting user response correctness, a central task for mobile education applications. Our model, inspired by work in natural language processing on sequence modeling and machine translation, updates user features in real-time via bidirectional recurrent neural networks with an attention mechanism over embedded question-response pairs. We experiment on the mobile education application SantaTOEIC, which has 559k users, 66M response data points as well as a set of 10k study problems each expert-annotated with topic tags and gathered since 2016. Our model outperforms existing approaches over several metrics in predicting user response correctness, notably out-performing other methods on new users without large question-response histories. Additionally, our attention mechanism and annotated tag set allow us to create an interpretable education platform, with a smart review system that addresses the aforementioned issue of varied user attention and problem exhaustion. |
Tasks | Machine Translation |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.10910v2 |
PDF | https://arxiv.org/pdf/1906.10910v2.pdf |
PWC | https://paperswithcode.com/paper/creating-a-neural-pedagogical-agent-by |
Repo | |
Framework | |
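A small PyTorch sketch of the described ingredients — embedded question-response pairs, a bidirectional recurrent encoder, and attention over the history — used to score the probability that the next response is correct. Embedding sizes, the attention form, and the output head are illustrative assumptions; this is not the SantaTOEIC production model.

```python
import torch
import torch.nn as nn

class ResponseCorrectnessModel(nn.Module):
    """Bidirectional RNN over embedded (question, response) pairs with simple
    attention pooling, predicting whether the next response will be correct."""
    def __init__(self, n_questions=10000, dim=64):
        super().__init__()
        self.q_emb = nn.Embedding(n_questions, dim)
        self.r_emb = nn.Embedding(2, dim)                 # 0 = wrong, 1 = correct
        self.rnn = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * dim, 1)
        self.out = nn.Linear(2 * dim + dim, 1)

    def forward(self, q_hist, r_hist, q_next):
        x = self.q_emb(q_hist) + self.r_emb(r_hist)       # embedded q-r pairs
        states, _ = self.rnn(x)                           # (B, T, 2*dim)
        weights = torch.softmax(self.attn(states), dim=1) # attention over history
        user_state = (weights * states).sum(dim=1)        # real-time user model
        logit = self.out(torch.cat([user_state, self.q_emb(q_next)], dim=-1))
        return torch.sigmoid(logit).squeeze(-1)

model = ResponseCorrectnessModel()
p = model(torch.randint(0, 10000, (4, 20)), torch.randint(0, 2, (4, 20)),
          torch.randint(0, 10000, (4,)))
print(p.shape)                                            # torch.Size([4])
```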
Deep Learning for System Trace Restoration
Title | Deep Learning for System Trace Restoration |
Authors | Ilia Sucholutsky, Apurva Narayan, Matthias Schonlau, Sebastian Fischmeister |
Abstract | Most real-world datasets, and particularly those collected from physical systems, are full of noise, packet loss, and other imperfections. However, most specification mining, anomaly detection, and other such algorithms assume, or even require, perfect data quality to function properly. Such algorithms may work in lab conditions when given clean, controlled data, but will fail in the field when given imperfect data. We propose a method for accurately reconstructing discrete temporal or sequential system traces affected by data loss, using Long Short-Term Memory networks (LSTMs). The model works by learning to predict the next event in a sequence of events, and uses its own output as an input to continue predicting future events. As a result, this method can be used for data restoration even with streamed data. Such a method can reconstruct even long sequences of missing events, and can also help validate and improve data quality for noisy data. The output of the model is a close reconstruction of the true data, and can be fed to algorithms that rely on clean data. We demonstrate our method by reconstructing automotive CAN traces consisting of long sequences of discrete events. We show that given even small parts of a CAN trace, our LSTM model can predict future events with an accuracy of almost 90%, and can successfully reconstruct large portions of the original trace, greatly outperforming a Markov model benchmark. We separately feed the original, lossy, and reconstructed traces into a specification mining framework to perform downstream analysis of the effect of our method on state-of-the-art models that use these traces to understand the behavior of complex systems. |
Tasks | Anomaly Detection |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05411v1 |
PDF | http://arxiv.org/pdf/1904.05411v1.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-for-system-trace-restoration |
Repo | |
Framework | |
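A minimal sketch of the approach as described: an LSTM is trained to predict the next discrete event, and gaps are filled autoregressively by feeding its own (here greedy) predictions back as input. Event vocabulary size, dimensions, and the untrained toy model are assumptions for illustration.

```python
import torch
import torch.nn as nn

class NextEventLSTM(nn.Module):
    """Predict the next discrete event id from the events seen so far."""
    def __init__(self, n_events=64, dim=64):
        super().__init__()
        self.emb = nn.Embedding(n_events, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, n_events)

    def forward(self, seq):                     # seq: (B, T) event ids
        h, _ = self.lstm(self.emb(seq))
        return self.out(h)                      # next-event logits at each step

@torch.no_grad()
def restore_gap(model, prefix, n_missing):
    """Autoregressively fill a gap: feed the model's own predictions back in
    as inputs, as described in the abstract (greedy decoding for simplicity)."""
    seq = prefix.clone()
    for _ in range(n_missing):
        logits = model(seq.unsqueeze(0))[0, -1]
        seq = torch.cat([seq, logits.argmax().view(1)])
    return seq[len(prefix):]                    # the reconstructed events

model = NextEventLSTM()
prefix = torch.randint(0, 64, (30,))            # observed part of a CAN-like trace
print(restore_gap(model, prefix, n_missing=5))  # untrained model: arbitrary ids
```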
Time Series Analysis of Big Data for Electricity Price and Demand to Find Cyber-Attacks part 2: Decomposition Analysis
Title | Time Series Analysis of Big Data for Electricity Price and Demand to Find Cyber-Attacks part 2: Decomposition Analysis |
Authors | Mohsen Rakhshandehroo, Mohammad Rajabdorri |
Abstract | In this paper, following the first part (in which ADF tests using ACI evaluation were conducted), time series (TSs) are analyzed using decomposition analysis. A TS is composed of four components: a trend (the long-term behavior or progression of the series), a cyclic component (non-periodic fluctuations that are usually long-term), a seasonal component (periodic fluctuations due to seasonal variations such as temperature and weather conditions), and an error term. For our case of cyber-attack detection, two common ways of decomposing a TS into its components are investigated: the additive method and the multiplicative method. After decomposition, the error term is tested using the Durbin-Watson and Breusch-Godfrey tests to see whether it follows any predictable pattern; if it does, it can be concluded that there is a chance of a cyber-attack on the system. |
Tasks | Cyber Attack Detection, Time Series, Time Series Analysis |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.13016v1 |
PDF | https://arxiv.org/pdf/1907.13016v1.pdf |
PWC | https://paperswithcode.com/paper/time-series-analysis-of-big-data-for |
Repo | |
Framework | |
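A short sketch of the described workflow using `statsmodels`: decompose a series additively (or multiplicatively) into trend, seasonal, and residual components, then test the residual for autocorrelation with the Durbin-Watson statistic. The synthetic series and period are assumptions; the Breusch-Godfrey test mentioned in the abstract additionally requires a fitted regression model and is omitted here.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.stats.stattools import durbin_watson

# Synthetic hourly "electricity price" series: trend + daily seasonality + noise.
rng = np.random.default_rng(3)
idx = pd.date_range("2019-01-01", periods=24 * 60, freq="H")
y = pd.Series(0.01 * np.arange(idx.size)
              + 5 * np.sin(2 * np.pi * np.arange(idx.size) / 24)
              + rng.normal(0, 0.5, idx.size), index=idx)

# Additive decomposition into trend, seasonal and residual components;
# pass model="multiplicative" for the multiplicative variant.
result = seasonal_decompose(y, model="additive", period=24)
resid = result.resid.dropna()

# A Durbin-Watson statistic near 2 suggests no autocorrelation in the residual;
# a strong departure hints at leftover structure, which in the paper's setting
# would flag a possible anomaly such as a cyber-attack.
print(f"Durbin-Watson: {durbin_watson(resid):.2f}")
```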
Predicting next shopping stage using Google Analytics data for E-commerce applications
Title | Predicting next shopping stage using Google Analytics data for E-commerce applications |
Authors | Mihai Cristian Pîrvu, Alexandra Anghel |
Abstract | E-commerce web applications are almost ubiquitous in our day-to-day life; however, as useful as they are, most of them have little to no adaptation to user needs, which in turn can cause both lower conversion rates and unsatisfied customers. We propose a machine learning system which learns user behaviour from multiple previous sessions and predicts useful metrics for the current session. In turn, these metrics can be used by the applications to customize and better target the customer, which can mean anything from offering better deals on specific products to targeted notifications or smart ad placement. The data used for the learning algorithm is extracted from Google Analytics Enhanced E-commerce, which is enabled by most e-commerce websites, so the system can be used by any such merchant. In order to learn the user patterns, only behavioural features were used, which do not include names, gender, or any other personal information that could identify the user. The learning model is a double recurrent neural network which learns both intra-session and inter-session features. The model predicts, for each session, a probability score for each of the defined target classes. |
Tasks | |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12595v1 |
PDF | https://arxiv.org/pdf/1905.12595v1.pdf |
PWC | https://paperswithcode.com/paper/predicting-next-shopping-stage-using-google |
Repo | |
Framework | |
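A hedged sketch of a "double recurrent" model: one GRU summarises hits within each session, a second GRU runs over the resulting session vectors, and a softmax head emits a probability per target class. Feature layout, dimensions, and the class count are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class DoubleRecurrentModel(nn.Module):
    """Intra-session GRU over hit-level features, inter-session GRU over the
    resulting session vectors, then a probability score per target class."""
    def __init__(self, n_features=16, dim=32, n_classes=4):
        super().__init__()
        self.intra = nn.GRU(n_features, dim, batch_first=True)
        self.inter = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, sessions):                 # (B, n_sessions, n_hits, n_features)
        b, s, h, f = sessions.shape
        _, sess_vec = self.intra(sessions.reshape(b * s, h, f))
        sess_vec = sess_vec.squeeze(0).reshape(b, s, -1)
        _, user_vec = self.inter(sess_vec)       # summarises the user's history
        return torch.softmax(self.head(user_vec.squeeze(0)), dim=-1)

model = DoubleRecurrentModel()
x = torch.randn(8, 5, 20, 16)                    # 8 users, 5 sessions, 20 hits each
print(model(x).shape)                            # torch.Size([8, 4])
```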
Predicting How to Distribute Work Between Algorithms and Humans to Segment an Image Batch
Title | Predicting How to Distribute Work Between Algorithms and Humans to Segment an Image Batch |
Authors | Danna Gurari, Yinan Zhao, Suyog Dutt Jain, Margrit Betke, Kristen Grauman |
Abstract | Foreground object segmentation is a critical step for many image analysis tasks. While automated methods can produce high-quality results, their failures disappoint users in need of practical solutions. We propose a resource allocation framework for predicting how best to allocate a fixed budget of human annotation effort in order to collect higher quality segmentations for a given batch of images and automated methods. The framework is based on a prediction module that estimates the quality of given algorithm-drawn segmentations. We demonstrate the value of the framework for two novel tasks related to predicting how to distribute annotation efforts between algorithms and humans. Specifically, we develop two systems that automatically decide, for a batch of images, when to recruit humans versus computers to create 1) coarse segmentations required to initialize segmentation tools and 2) final, fine-grained segmentations. Experiments demonstrate the advantage of relying on a mix of human and computer efforts over relying on either resource alone for segmenting objects in images coming from three diverse modalities (visible, phase contrast microscopy, and fluorescence microscopy). |
Tasks | Semantic Segmentation |
Published | 2019-04-30 |
URL | http://arxiv.org/abs/1905.00060v1 |
PDF | http://arxiv.org/pdf/1905.00060v1.pdf |
PWC | https://paperswithcode.com/paper/predicting-how-to-distribute-work-between |
Repo | |
Framework | |
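The allocation step described above can be sketched very simply: score each algorithm-drawn segmentation with a quality predictor, then spend the fixed human budget on the images the algorithm is predicted to handle worst. The quality predictor is mocked with random scores here; in the paper it is a learned module.

```python
import numpy as np

rng = np.random.default_rng(4)

def allocate_annotation(pred_quality, human_budget):
    """Route the lowest predicted-quality images to humans, the rest to the
    algorithm, until the fixed budget is spent (a sketch of the allocation idea)."""
    order = np.argsort(pred_quality)             # worst predicted quality first
    to_human = set(order[:human_budget].tolist())
    return ["human" if i in to_human else "algorithm"
            for i in range(len(pred_quality))]

pred_quality = rng.uniform(0, 1, size=10)        # stand-in for the learned predictor
print(allocate_annotation(pred_quality, human_budget=3))
```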
Enforcing Statistical Constraints in Generative Adversarial Networks for Modeling Chaotic Dynamical Systems
Title | Enforcing Statistical Constraints in Generative Adversarial Networks for Modeling Chaotic Dynamical Systems |
Authors | Jin-Long Wu, Karthik Kashinath, Adrian Albert, Dragos Chirila, Prabhat, Heng Xiao |
Abstract | Simulating complex physical systems often involves solving partial differential equations (PDEs) with some closures due to the presence of multi-scale physics that cannot be fully resolved. Therefore, reliable and accurate closure models for unresolved physics remain an important requirement for many computational physics problems, e.g., turbulence simulation. Recently, several researchers have adopted generative adversarial networks (GANs), a novel paradigm for training machine learning models, to generate solutions of PDE-governed complex systems without having to numerically solve these PDEs. However, GANs are known to be difficult to train and likely to converge to local minima, where the generated samples do not capture the true statistics of the training data. In this work, we present a statistically constrained generative adversarial network that enforces constraints on the covariance of the training data, which results in an improved machine-learning-based emulator that captures the statistics of training data generated by solving fully resolved PDEs. We show that such statistical regularization leads to better performance than standard GANs, measured by (1) the constrained model's ability to more faithfully emulate certain physical properties of the system and (2) a significantly reduced (by up to 80%) training time to reach the solution. We exemplify this approach on Rayleigh-Benard convection, a turbulent flow system that is an idealized model of the Earth's atmosphere. With the growth of high-fidelity simulation databases of physical systems, this work suggests great potential as an alternative to the explicit modeling of closures or parameterizations for unresolved physics, which are known to be a major source of uncertainty in simulating multi-scale physical systems, e.g., turbulence or Earth's climate. |
Tasks | |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.06841v1 |
PDF | https://arxiv.org/pdf/1905.06841v1.pdf |
PWC | https://paperswithcode.com/paper/enforcing-statistical-constraints-in |
Repo | |
Framework | |
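A hedged sketch of adding a statistical constraint to a GAN: alongside the usual adversarial loss, the generator is penalised for mismatching the sample covariance of the training data. The non-saturating adversarial loss, Frobenius-norm penalty, and weighting `lam` are assumptions chosen for illustration; the paper's exact constraint formulation may differ.

```python
import torch

def covariance(x):
    """Sample covariance of a batch of flattened fields, shape (B, D)."""
    xc = x - x.mean(dim=0, keepdim=True)
    return xc.T @ xc / (x.shape[0] - 1)

def constrained_generator_loss(d_fake, fake, real, lam=1.0):
    """Standard non-saturating GAN generator loss plus a penalty pushing the
    covariance of generated samples toward that of the training data."""
    adv = torch.nn.functional.binary_cross_entropy_with_logits(
        d_fake, torch.ones_like(d_fake))
    stat = torch.norm(covariance(fake.flatten(1)) - covariance(real.flatten(1)))
    return adv + lam * stat

# Toy check with random "velocity fields" of shape (batch, H*W)
fake, real = torch.randn(32, 64, requires_grad=True), torch.randn(32, 64)
d_fake = torch.randn(32, 1)
print(constrained_generator_loss(d_fake, fake, real).item())
```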
Meta-Learning without Memorization
Title | Meta-Learning without Memorization |
Authors | Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, Chelsea Finn |
Abstract | The ability to learn new concepts with small amounts of data is a critical aspect of intelligence that has proven challenging for deep learning methods. Meta-learning has emerged as a promising technique for leveraging data from previous tasks to enable efficient learning of new tasks. However, most meta-learning algorithms implicitly require that the meta-training tasks be mutually-exclusive, such that no single model can solve all of the tasks at once. For example, when creating tasks for few-shot image classification, prior work uses a per-task random assignment of image classes to N-way classification labels. If this is not done, the meta-learner can ignore the task training data and learn a single model that performs all of the meta-training tasks zero-shot, but does not adapt effectively to new image classes. This requirement means that the user must take great care in designing the tasks, for example by shuffling labels or removing task identifying information from the inputs. In some domains, this makes meta-learning entirely inapplicable. In this paper, we address this challenge by designing a meta-regularization objective using information theory that places precedence on data-driven adaptation. This causes the meta-learner to decide what must be learned from the task training data and what should be inferred from the task testing input. By doing so, our algorithm can successfully use data from non-mutually-exclusive tasks to efficiently adapt to novel tasks. We demonstrate its applicability to both contextual and gradient-based meta-learning algorithms, and apply it in practical settings where applying standard meta-learning has been difficult. Our approach substantially outperforms standard meta-learning algorithms in these settings. |
Tasks | Few-Shot Image Classification, Image Classification, Meta-Learning |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.03820v2 |
PDF | https://arxiv.org/pdf/1912.03820v2.pdf |
PWC | https://paperswithcode.com/paper/meta-learning-without-memorization-1 |
Repo | |
Framework | |
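A rough rendering of the meta-regularization idea for a contextual meta-learner: a stochastic bottleneck on the query-input pathway is penalised with a KL term toward a fixed prior, so that task-specific information has to flow through the task training data (here a given `task_context`) rather than being read off the query input alone. The architecture, prior, and weighting are illustrative assumptions, not the paper's objective.

```python
import torch
import torch.nn as nn

class BottleneckedPredictor(nn.Module):
    """Contextual meta-learner sketch with a stochastic bottleneck on the
    query-input pathway; the KL penalty discourages memorizing tasks from
    the input and pushes adaptation onto the task context."""
    def __init__(self, x_dim=8, z_dim=8, task_dim=8):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)       # outputs mean and log-variance
        self.head = nn.Linear(z_dim + task_dim, 1)

    def forward(self, x, task_context):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        # KL( q(z|x) || N(0, I) ), averaged over the batch.
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(dim=-1).mean()
        pred = self.head(torch.cat([z, task_context], dim=-1))
        return pred, kl

model = BottleneckedPredictor()
x, ctx, y = torch.randn(16, 8), torch.randn(16, 8), torch.randn(16, 1)
pred, kl = model(x, ctx)
loss = nn.functional.mse_loss(pred, y) + 0.01 * kl   # beta weights the regularizer
print(loss.item())
```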