Paper Group ANR 388
Deep Lagrangian Networks for end-to-end learning of energy-based control for under-actuated systems
Title | Deep Lagrangian Networks for end-to-end learning of energy-based control for under-actuated systems |
Authors | Michael Lutter, Kim Listmann, Jan Peters |
Abstract | Applying Deep Learning to control has a lot of potential for enabling the intelligent design of robot control laws. Unfortunately, common deep learning approaches to control, such as deep reinforcement learning, require an unrealistic amount of interaction with the real system, do not yield any performance guarantees, and do not make good use of extensive insights from model-based control. In particular, common black-box approaches, which abandon all insight from control, are not suitable for complex robot systems. We propose a deep control approach as a bridge between the solid theoretical foundations of energy-based control and the flexibility of deep learning. To accomplish this goal, we extend Deep Lagrangian Networks (DeLaN) to not only adhere to Lagrangian Mechanics but also ensure conservation of energy and passivity of the learned representation. This novel extension is embedded within generic model-based control laws to enable energy control of under-actuated systems. The resulting DeLaN for energy control (DeLaN 4EC) is the first model learning approach using generic function approximation that is capable of learning energy control. DeLaN 4EC exhibits excellent real-time control on the physical Furuta Pendulum and learns to swing up the pendulum, while the control law using system identification does not. |
Tasks | |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04489v2 |
https://arxiv.org/pdf/1907.04489v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-lagrangian-networks-for-end-to-end |
Repo | |
Framework | |
Distributed Streaming Analytics on Large-scale Oceanographic Data using Apache Spark
Title | Distributed Streaming Analytics on Large-scale Oceanographic Data using Apache Spark |
Authors | Janak Dahal, Elias Ioup, Shaikh Arifuzzaman, Mahdi Abdelguerfi |
Abstract | Real-world data from diverse domains require real-time scalable analysis. Large-scale data processing frameworks or engines such as Hadoop fall short when results are needed on-the-fly. Apache Spark’s streaming library is increasingly becoming a popular choice as it can stream and analyze a significant amount of data. In this paper, we analyze large-scale geo-temporal data collected from the USGODAE (United States Global Ocean Data Assimilation Experiment) data catalog, and showcase and assess the ability of Spark stream processing. We measure the latency of streaming and monitor scalability by adding and removing nodes in the middle of a streaming job. We also verify the fault tolerance by stopping nodes in the middle of a job and making sure that the job is rescheduled and completed on other nodes. We design a full-stack application that automates data collection, data processing and visualizing the results. We also use Google Maps API to visualize results by color coding the world map with values from various analytics. |
Tasks | |
Published | 2019-07-31 |
URL | https://arxiv.org/abs/1907.13264v2 |
https://arxiv.org/pdf/1907.13264v2.pdf | |
PWC | https://paperswithcode.com/paper/distributed-streaming-analytics-on-large |
Repo | |
Framework | |
Active Learning of Spin Network Models
Title | Active Learning of Spin Network Models |
Authors | Jialong Jiang, David A. Sivak, Matt Thomson |
Abstract | The inverse statistical problem of finding direct interactions in complex networks is difficult. In the natural sciences, well-controlled perturbation experiments are widely used to probe the structure of complex networks. However, our understanding of how and why perturbations aid inference remains heuristic, and we lack automated procedures that determine network structure by combining inference and perturbation. Therefore, we propose a general mathematical framework to study inference with iteratively applied perturbations. Using the formulation of information geometry, our framework quantifies the difficulty of inference and the information gain from perturbations through the curvature of the underlying parameter manifold, measured by Fisher information. We apply the framework to the inference of spin network models and find that designed perturbations can reduce the sampling complexity by $10^6$-fold across a variety of network architectures. Physically, our framework reveals that perturbations boost inference by causing a network to explore previously inaccessible states. Optimal perturbations break spin-spin correlations within a network, increasing the information available for inference and thus reducing sampling complexity by orders of magnitude. Our active learning framework could be powerful in the analysis of complex networks as well as in the rational design of experiments. |
Tasks | Active Learning |
Published | 2019-03-25 |
URL | https://arxiv.org/abs/1903.10474v3 |
https://arxiv.org/pdf/1903.10474v3.pdf | |
PWC | https://paperswithcode.com/paper/active-learning-of-spin-network-models |
Repo | |
Framework | |
Unsupervised Moving Object Detection via Contextual Information Separation
Title | Unsupervised Moving Object Detection via Contextual Information Separation |
Authors | Yanchao Yang, Antonio Loquercio, Davide Scaramuzza, Stefano Soatto |
Abstract | We propose an adversarial contextual model for detecting moving objects in images. A deep neural network is trained to predict the optical flow in a region using information from everywhere else but that region (context), while another network attempts to make such context as uninformative as possible. The result is a model where hypotheses naturally compete with no need for explicit regularization or hyper-parameter tuning. Although our method requires no supervision whatsoever, it outperforms several methods that are pre-trained on large annotated datasets. Our model can be thought of as a generalization of classical variational generative region-based segmentation, but in a way that avoids explicit regularization or solution of partial differential equations at run-time. |
Tasks | Object Detection, Optical Flow Estimation |
Published | 2019-01-10 |
URL | http://arxiv.org/abs/1901.03360v2 |
http://arxiv.org/pdf/1901.03360v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-moving-object-detection-via |
Repo | |
Framework | |
Relevance-Promoting Language Model for Short-Text Conversation
Title | Relevance-Promoting Language Model for Short-Text Conversation |
Authors | Xin Li, Piji Li, Wei Bi, Xiaojiang Liu, Wai Lam |
Abstract | Despite the effectiveness of the sequence-to-sequence framework on the task of Short-Text Conversation (STC), the issue of under-exploitation of training data (i.e., the supervision signals from the query text are ignored) remains unresolved. Also, the adopted maximization-based decoding strategies, which are inclined to generate generic responses or responses with repetition, are unsuited to the STC task. In this paper, we propose to formulate the STC task as a language modeling problem and tailor a training strategy to adapt a language model for response generation. To enhance generation performance, we design a relevance-promoting transformer language model, which performs additional supervised source attention after the self-attention to increase the importance of informative query tokens in calculating the token-level representation. The model further refines the query representation with relevance clues inferred from its multiple references during training. In testing, we adopt a randomization-over-maximization strategy to reduce the generation of generic responses. Experimental results on a large Chinese STC dataset demonstrate the superiority of the proposed model on relevance metrics and diversity metrics. Code is available at https://ai.tencent.com/ailab/nlp/dialogue/. |
Tasks | Language Modelling, Short-Text Conversation |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11489v1 |
https://arxiv.org/pdf/1911.11489v1.pdf | |
PWC | https://paperswithcode.com/paper/relevance-promoting-language-model-for-short |
Repo | |
Framework | |
Unsupervised and Supervised Principal Component Analysis: Tutorial
Title | Unsupervised and Supervised Principal Component Analysis: Tutorial |
Authors | Benyamin Ghojogh, Mark Crowley |
Abstract | This is a detailed tutorial paper which explains Principal Component Analysis (PCA), Supervised PCA (SPCA), kernel PCA, and kernel SPCA. We start with projection, PCA with eigen-decomposition, PCA with one and multiple projection directions, properties of the projection matrix, and reconstruction error minimization, and we connect these to the auto-encoder. Then, PCA with singular value decomposition, dual PCA, and kernel PCA are covered. SPCA using both scoring and the Hilbert-Schmidt independence criterion is explained. Kernel SPCA using both direct and dual approaches is then introduced. We cover all cases of projection and reconstruction of training and out-of-sample data. Finally, some simulations are provided on the Frey and AT&T face datasets for verifying the theory in practice. |
Tasks | |
Published | 2019-06-01 |
URL | https://arxiv.org/abs/1906.03148v1 |
https://arxiv.org/pdf/1906.03148v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-and-supervised-principal |
Repo | |
Framework | |
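The "PCA with one projection direction" case mentioned in the abstract above can be illustrated with a minimal sketch. It is not taken from the tutorial: the function name `first_principal_direction` and the use of power iteration (instead of a full eigen-decomposition) are assumptions chosen to keep the example dependency-free.

```python
import math

def first_principal_direction(rows, iterations=200):
    """Return (unit direction, variance along it) for equal-length rows.

    Hypothetical helper: uses power iteration on the sample covariance
    matrix rather than a full eigen-decomposition.
    """
    n, d = len(rows), len(rows[0])
    # Center the data: PCA works on deviations from the mean.
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly apply the covariance and renormalise.
    # Caveat: a start vector orthogonal to the leading direction stalls.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # The Rayleigh quotient v^T C v is the variance captured along v.
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                   for i in range(d))
    return v, variance

# Points lying exactly on the line y = 2x: the direction is (1, 2) / sqrt(5).
direction, variance = first_principal_direction([[0, 0], [1, 2], [2, 4], [3, 6]])
print(direction, variance)
```

Projecting a centered sample onto `direction` gives its one-dimensional PCA coordinate; multiplying that coordinate by `direction` and adding the mean back gives the reconstruction.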
Using Social Media for Word-of-Mouth Marketing
Title | Using Social Media for Word-of-Mouth Marketing |
Authors | Nagendra Kumar, Yash Chandarana, K. Anand, Manish Singh |
Abstract | Nowadays, online social networks are used extensively for personal and commercial purposes. This widespread popularity makes them an ideal platform for advertisements. Social media can be used for both direct and word-of-mouth (WoM) marketing. Although WoM marketing is considered more effective and requires less advertising cost, it is currently under-utilized. To do WoM marketing, we need to identify a set of people who can use their authoritative position in a social network to promote a given product. In this paper, we show how to do WoM marketing in a Facebook group, which is a question-answer type of social network. We also present the concept of reinforced WoM marketing, where multiple authorities can together promote a product to increase the effectiveness of marketing. We perform our experiments on a Facebook group dataset consisting of 0.3 million messages and 10 million user reactions. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08298v1 |
https://arxiv.org/pdf/1908.08298v1.pdf | |
PWC | https://paperswithcode.com/paper/using-social-media-for-word-of-mouth |
Repo | |
Framework | |
Deep Unsupervised Cardinality Estimation
Title | Deep Unsupervised Cardinality Estimation |
Authors | Zongheng Yang, Eric Liang, Amog Kamsetty, Chenggang Wu, Yan Duan, Xi Chen, Pieter Abbeel, Joseph M. Hellerstein, Sanjay Krishnan, Ion Stoica |
Abstract | Cardinality estimation has long been grounded in statistical tools for density estimation. To capture the rich multivariate distributions of relational tables, we propose the use of a new type of high-capacity statistical model: deep autoregressive models. However, direct application of these models leads to a limited estimator that is prohibitively expensive to evaluate for range or wildcard predicates. To produce a truly usable estimator, we develop a Monte Carlo integration scheme on top of autoregressive models that can efficiently handle range queries with dozens of dimensions or more. Like classical synopses, our estimator summarizes the data without supervision. Unlike previous solutions, we approximate the joint data distribution without any independence assumptions. Evaluated on real-world datasets and compared against real systems and dominant families of techniques, our estimator achieves single-digit multiplicative error at tail, an up to 90$\times$ accuracy improvement over the second best method, and is space- and runtime-efficient. |
Tasks | Density Estimation |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04278v2 |
https://arxiv.org/pdf/1905.04278v2.pdf | |
PWC | https://paperswithcode.com/paper/selectivity-estimation-with-deep-likelihood |
Repo | |
Framework | |
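The idea of answering a range query by Monte Carlo integration over an autoregressive factorization, as described in the abstract above, can be sketched on a toy two-column table. This is not the authors' implementation: an empirical frequency table stands in for the deep autoregressive model, and the names `fit_toy_autoregressive` and `estimate_selectivity` are hypothetical.

```python
import random
from collections import Counter, defaultdict

def fit_toy_autoregressive(rows):
    """Empirical P(a) and P(b | a); a stand-in for a learned deep model."""
    n = len(rows)
    count_a = Counter(a for a, _ in rows)
    p_a = {a: c / n for a, c in count_a.items()}
    count_b_given_a = defaultdict(Counter)
    for a, b in rows:
        count_b_given_a[a][b] += 1
    p_b_given_a = {a: {b: c / count_a[a] for b, c in cnt.items()}
                   for a, cnt in count_b_given_a.items()}
    return p_a, p_b_given_a

def estimate_selectivity(p_a, p_b_given_a, a_range, b_range,
                         samples=1000, seed=0):
    """Monte Carlo estimate of P(a in a_range, b in b_range)."""
    rng = random.Random(seed)
    # Probability mass of the first column inside its range.
    in_a = {a: p for a, p in p_a.items() if a_range[0] <= a <= a_range[1]}
    mass_a = sum(in_a.values())
    if mass_a == 0.0:
        return 0.0
    values, weights = zip(*sorted(in_a.items()))
    total = 0.0
    for _ in range(samples):
        # Sample the first column only from values inside the range ...
        a = rng.choices(values, weights=weights)[0]
        # ... then measure the in-range mass of the second column given it.
        mass_b = sum(p for b, p in p_b_given_a[a].items()
                     if b_range[0] <= b <= b_range[1])
        total += mass_a * mass_b
    return total / samples

# Toy table: all pairs (a, b) with 0 <= b <= a <= 9, so the columns are correlated.
table = [(a, b) for a in range(10) for b in range(10) if b <= a]
p_a, p_b_given_a = fit_toy_autoregressive(table)
estimate = estimate_selectivity(p_a, p_b_given_a, (3, 6), (0, 2))
print(estimate * len(table))  # estimated number of matching rows
```

Because each sample is drawn only from in-range values and reweighted by the in-range mass, no sample is wasted on rows that cannot match the predicate, and no independence between columns is assumed.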
Commit2Vec: Learning Distributed Representations of Code Changes
Title | Commit2Vec: Learning Distributed Representations of Code Changes |
Authors | Rocìo Cabrera Lozoya, Arnaud Baumann, Antonino Sabetta, Michele Bezzi |
Abstract | Deep learning methods, which have found successful applications in fields like image classification and natural language processing, have recently been applied to source code analysis too, due to the enormous amount of freely available source code (e.g., from open-source software repositories). In this work, we elaborate upon a state-of-the-art approach to the representation of source code that uses information about its syntactic structure, and we adapt it to represent source changes (i.e., commits). We use this representation to classify security-relevant commits. Because our method uses transfer learning (that is, we train a network on a “pretext task” for which abundant labeled data is available, and then we use such network for the target task of commit classification, for which fewer labeled instances are available), we studied the impact of pre-training the network using two different pretext tasks versus a randomly initialized model. Our results indicate that representations that leverage the structural information obtained through code syntax outperform token-based representations. Furthermore, the performance metrics obtained when pre-training on a loosely related pretext task with a very large dataset ($>10^6$ samples) were surpassed when pretraining on a smaller dataset ($>10^4$ samples) but for a pretext task that is more closely related to the target task. |
Tasks | Image Classification, Transfer Learning |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07605v3 |
https://arxiv.org/pdf/1911.07605v3.pdf | |
PWC | https://paperswithcode.com/paper/patch2vec-distributed-representation-of-code |
Repo | |
Framework | |
Short Text Conversation Based on Deep Neural Network and Analysis on Evaluation Measures
Title | Short Text Conversation Based on Deep Neural Network and Analysis on Evaluation Measures |
Authors | Hsiang-En Cherng, Chia-Hui Chang |
Abstract | With the development of Natural Language Processing, automatic question-answering systems such as Watson, Siri, and Alexa have become one of the most important NLP applications. Nowadays, enterprises try to build automatic customer service chatbots to save human resources and provide 24-hour customer service. Evaluation of chatbots currently relies heavily on human annotation, which costs plenty of time. Thus, a new Short Text Conversation subtask called Dialogue Quality (DQ) and Nugget Detection (ND) has been initiated, which aims to automatically evaluate dialogues generated by chatbots. In this paper, we solve the DQ and ND subtasks with deep neural networks. We propose two models for both DQ and ND subtasks, each constructed with a hierarchical structure (embedding layer, utterance layer, context layer, and memory layer) to hierarchically learn dialogue representations from the word level, sentence level, and context level up to the long-range context level. Furthermore, we apply gating and attention mechanisms at the utterance layer and context layer to improve performance. We also tried BERT to replace the embedding layer and utterance layer as the sentence representation. The results show that BERT produces a better utterance representation than a multi-stack CNN for both DQ and ND subtasks and outperforms models proposed in other research. The evaluation measures defined for the subtasks, namely NMD and RSNOD for DQ and JSD and RNSS for ND, are not traditional evaluation measures such as accuracy, precision, recall, and F1-score. Thus, we have run a series of experiments using traditional evaluation measures and analyze the performance and errors. |
Tasks | Question Answering, Short-Text Conversation |
Published | 2019-07-06 |
URL | https://arxiv.org/abs/1907.03070v1 |
https://arxiv.org/pdf/1907.03070v1.pdf | |
PWC | https://paperswithcode.com/paper/short-text-conversation-based-on-deep-neural |
Repo | |
Framework | |
Hierarchical Hidden Markov Jump Processes for Cancer Screening Modeling
Title | Hierarchical Hidden Markov Jump Processes for Cancer Screening Modeling |
Authors | Rui Meng, Soper Braden, Jan Nygard, Mari Nygrad, Herbert Lee |
Abstract | Hidden Markov jump processes are an attractive approach for modeling clinical disease progression data because they are explainable and capable of handling both irregularly sampled and noisy data. Most applications in this context consider time-homogeneous models due to their relative computational simplicity. However, the time homogeneous assumption is too strong to accurately model the natural history of many diseases. Moreover, the population at risk is not homogeneous either, since disease exposure and susceptibility can vary considerably. In this paper, we propose a piece-wise stationary transition matrix to explain the heterogeneity in time. We propose a hierarchical structure for the heterogeneity in population, where prior information is considered to deal with unbalanced data. Moreover, an efficient, scalable EM algorithm is proposed for inference. We demonstrate the feasibility and superiority of our model on a cervical cancer screening dataset from the Cancer Registry of Norway. Experiments show that our model outperforms state-of-the-art recurrent neural network models in terms of prediction accuracy and significantly outperforms a standard hidden Markov jump process in generating Kaplan-Meier estimators. |
Tasks | |
Published | 2019-10-13 |
URL | https://arxiv.org/abs/1910.05847v1 |
https://arxiv.org/pdf/1910.05847v1.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-hidden-markov-jump-processes-for |
Repo | |
Framework | |
Spacecraft design optimisation for demise and survivability
Title | Spacecraft design optimisation for demise and survivability |
Authors | Mirko Trisolini, Hugh G. Lewis, Camilla Colombo |
Abstract | Among the mitigation measures introduced to cope with the space debris issue is the de-orbiting of decommissioned satellites. Guidelines for re-entering objects call for a ground casualty risk no higher than 0.0001. To comply with this requirement, satellites can be designed through a design-for-demise philosophy. Still, a spacecraft designed to demise has to survive the debris-populated space environment for many years. The demisability and the survivability of a satellite can both be influenced by a set of common design choices such as the material selection, the geometry definition, and the position of the components. Within this context, two models have been developed to analyse the demise and the survivability of satellites. Given the competing nature of the demisability and the survivability, a multi-objective optimisation framework was developed, with the aim of identifying trade-off solutions for the preliminary design of satellites. As the problem is nonlinear and involves the combination of continuous and discrete variables, classical derivative-based approaches are unsuited and a genetic algorithm was selected instead. The genetic algorithm uses the developed demisability and survivability criteria as the fitness functions of the multi-objective algorithm. The paper presents a test case, which considers the preliminary optimisation of tanks in terms of material, geometry, location, and number of tanks for a representative Earth observation mission. The configuration of the external structure of the spacecraft is fixed. Tanks were selected because they are sensitive to both design requirements: they represent critical components in the demise process, and impact damage can cause the loss of the mission because of leaking and ruptures. The results present the possible trade-off solutions, constituting the Pareto front obtained from the multi-objective optimisation. |
Tasks | |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05091v1 |
https://arxiv.org/pdf/1910.05091v1.pdf | |
PWC | https://paperswithcode.com/paper/spacecraft-design-optimisation-for-demise-and |
Repo | |
Framework | |
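The Pareto front mentioned in the abstract above is the set of designs that no other design beats on every objective at once. A minimal sketch of that filtering step follows; it is not the paper's genetic algorithm, and the function name `pareto_front`, the two hypothetical cost scores, and the convention that both are minimised are all assumptions made for illustration.

```python
def pareto_front(points):
    """Return the non-dominated points when every objective is minimised."""
    def dominates(a, b):
        # a dominates b if it is no worse everywhere and strictly better somewhere.
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (demise cost, vulnerability cost) scores for five candidate designs.
designs = [(0.2, 0.9), (0.4, 0.4), (0.9, 0.1), (0.5, 0.5), (0.9, 0.9)]
print(pareto_front(designs))  # [(0.2, 0.9), (0.4, 0.4), (0.9, 0.1)]
```

The pairwise check is quadratic in the number of designs, which is acceptable for a small candidate set; a multi-objective genetic algorithm applies the same dominance test while evolving the population.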
Computational Approaches to Access Probabilistic Population Codes for Higher Cognition an Decision-Making
Title | Computational Approaches to Access Probabilistic Population Codes for Higher Cognition an Decision-Making |
Authors | Kevin Jasberg, Sergej Sizov |
Abstract | In recent years, research has unveiled more and more evidence for the so-called Bayesian Brain Paradigm, i.e., the human brain is interpreted as a probabilistic inference machine and Bayesian modelling approaches are hence used successfully. One of the many theories is that of Probabilistic Population Codes (PPC). Although this model has so far only been considered meaningful and useful for sensory perception and motor control, it has always been suggested that this mechanism also underlies higher cognition and decision-making. However, the adequacy of PPC in this regard cannot be confirmed by means of standard neurological measurement procedures. In this article we combine the parallel research branches of recommender systems and predictive data mining with theoretical neuroscience. The nexus of both fields is given by behavioural variability and resulting internal distributions. We adopt the latest experimental settings and measurement approaches from predictive data mining to obtain these internal distributions, to inform the theoretical PPC approach, and to deduce medical correlates which can indeed be measured in vivo. This is a strong hint for the applicability of the PPC approach and the Bayesian Brain Paradigm to higher cognition and human decision-making. |
Tasks | Decision Making, Recommendation Systems |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.12651v1 |
http://arxiv.org/pdf/1904.12651v1.pdf | |
PWC | https://paperswithcode.com/paper/computational-approaches-to-access |
Repo | |
Framework | |
Verbal Programming of Robot Behavior
Title | Verbal Programming of Robot Behavior |
Authors | Jonathan Connell |
Abstract | Home robots may come with many sophisticated built-in abilities; however, there will always be a degree of customization needed for each user and environment. Ideally this should be accomplished through one-shot learning, as collecting the large number of examples needed for statistical inference is tedious. A particularly appealing approach is to simply explain to the robot, via speech, what it should be doing. In this paper we describe the ALIA cognitive architecture, which is able to effectively incorporate user-supplied advice and prohibitions in this manner. The functioning of the implemented system on a small robot is illustrated by an associated video. |
Tasks | One-Shot Learning |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09782v1 |
https://arxiv.org/pdf/1911.09782v1.pdf | |
PWC | https://paperswithcode.com/paper/verbal-programming-of-robot-behavior |
Repo | |
Framework | |
Modeling Intelligent Decision Making Command And Control Agents: An Application to Air Defense
Title | Modeling Intelligent Decision Making Command And Control Agents: An Application to Air Defense |
Authors | Sumanta Kumar Das |
Abstract | This paper lies halfway between agent technology and mathematical reasoning in modeling tactical decision-making tasks. These models are applied to the air defense (AD) domain for command and control (C2). It also addresses issues related to the evaluation of agents. The agents are designed and implemented using the agent-programming paradigm. The agents are deployed in a simulated air combat environment to perform C2 tasks such as electronic counter-countermeasures, threat assessment, and weapon allocation. The simulated AD system runs without any human intervention and represents a state-of-the-art model for C2 autonomy. The use of agents as autonomous decision-making entities is particularly useful in view of futuristic network-centric warfare. |
Tasks | Decision Making |
Published | 2019-03-20 |
URL | http://arxiv.org/abs/1903.08412v1 |
http://arxiv.org/pdf/1903.08412v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-intelligent-decision-making-command |
Repo | |
Framework | |