January 26, 2020

3041 words 15 mins read

Paper Group ANR 1447

SWift – A SignWriting editor to bridge between deaf world and e-learning. Improving Question Generation With to the Point Context. Active online learning in the binary perceptron problem. MAE: Mutual Posterior-Divergence Regularization for Variational AutoEncoders. Redditors in Recovery: Text Mining Reddit to Investigate Transitions into Drug Addi …

SWift – A SignWriting editor to bridge between deaf world and e-learning

Title SWift – A SignWriting editor to bridge between deaf world and e-learning
Authors Claudia S. Bianchini, Fabrizio Borgia, Maria de Marsico
Abstract SWift (SignWriting improved fast transcriber) is an advanced editor for SignWriting (SW). At present, SW is a promising alternative to provide documents in an easy-to-grasp written form of (any) Sign Language, the gestural way of communication which is widely adopted by the deaf community. SWift was developed for SW users, either deaf or not, to support collaboration and exchange of ideas. The application allows composing and saving desired signs using elementary components, called glyphs. The procedure that was devised guides and simplifies the editing process. SWift aims at breaking the “electronic” barriers that keep the deaf community away from ICT in general, and from e-learning in particular. The editor can be contained in a pluggable module; therefore, it can be integrated everywhere the use of SW is an advisable alternative to written “verbal” language, which often hinders information grasping by deaf users.
Tasks
Published 2019-11-22
URL https://arxiv.org/abs/1911.09923v1
PDF https://arxiv.org/pdf/1911.09923v1.pdf
PWC https://paperswithcode.com/paper/swift-a-signwriting-editor-to-bridge-between
Repo
Framework

Improving Question Generation With to the Point Context

Title Improving Question Generation With to the Point Context
Authors Jingjing Li, Yifan Gao, Lidong Bing, Irwin King, Michael R. Lyu
Abstract Question generation (QG) is the task of generating a question from a reference sentence and a specified answer within the sentence. A major challenge in QG is to identify answer-relevant context words to finish the declarative-to-interrogative sentence transformation. Existing sequence-to-sequence neural models achieve this goal by proximity-based answer position encoding, under the intuition that neighboring words of the answer are likely to be answer-relevant. However, such intuition may not apply to all cases, especially for sentences with complex answer-relevant relations. Consequently, the performance of these models drops sharply when the relative distance increases between the answer fragment and other non-stop sentence words that also appear in the ground-truth question. To address this issue, we propose a method to jointly model the unstructured sentence and the structured answer-relevant relation (extracted from the sentence in advance) for question generation. Specifically, the structured answer-relevant relation acts as the “to the point” context and thus naturally helps keep the generated question to the point, while the unstructured sentence provides the full information. Extensive experiments show that the to-the-point context helps our question generation model achieve significant improvements on several automatic evaluation metrics. Furthermore, our model is capable of generating diverse questions for a sentence which conveys multiple relations of its answer fragment.
Tasks Question Generation
Published 2019-10-14
URL https://arxiv.org/abs/1910.06036v2
PDF https://arxiv.org/pdf/1910.06036v2.pdf
PWC https://paperswithcode.com/paper/improving-question-generation-with-to-the
Repo
Framework
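
For readers unfamiliar with the baseline the abstract refers to, here is a minimal sketch of proximity-based answer position encoding: each sentence token is tagged with its distance to the answer span, so neighboring words are implicitly treated as answer-relevant. The function name, tag format, and example sentence are illustrative, not taken from the paper.

```python
# Hypothetical sketch of proximity-based answer position encoding:
# every token receives its relative distance to the answer span (0 inside it).
def answer_position_encoding(tokens, answer_start, answer_end):
    positions = []
    for i, _ in enumerate(tokens):
        if answer_start <= i <= answer_end:
            positions.append(0)                  # inside the answer span
        elif i < answer_start:
            positions.append(answer_start - i)   # distance to the span start
        else:
            positions.append(i - answer_end)     # distance to the span end
    return positions

sentence = "the eiffel tower was completed in 1889 in paris".split()
print(answer_position_encoding(sentence, answer_start=6, answer_end=6))
# [6, 5, 4, 3, 2, 1, 0, 1, 2] -- "1889" is the answer token
```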

Active online learning in the binary perceptron problem

Title Active online learning in the binary perceptron problem
Authors Hai-Jun Zhou
Abstract The binary perceptron is the simplest artificial neural network formed by $N$ input units and one output unit, with the neural states and the synaptic weights all restricted to $\pm 1$ values. The task in the teacher–student scenario is to infer the hidden weight vector by training on a set of labeled patterns. Previous efforts on the passive learning mode have shown that learning from independent random patterns is quite inefficient. Here we consider the active online learning mode in which the student designs every new Ising training pattern. We demonstrate that it is mathematically possible to achieve perfect (error-free) inference using only $N$ designed training patterns, but this is computationally unfeasible for large systems. We then investigate two Bayesian statistical designing protocols, which require $2.3 N$ and $1.9 N$ training patterns, respectively, to achieve error-free inference. If the training patterns are instead designed through deductive reasoning, perfect inference is achieved using $N + \log_{2} N$ samples. The performance gap between Bayesian and deductive designing strategies may be shortened in future work by taking into account the possibility of ergodicity breaking in the version space of the binary perceptron.
Tasks
Published 2019-02-21
URL http://arxiv.org/abs/1902.08043v1
PDF http://arxiv.org/pdf/1902.08043v1.pdf
PWC https://paperswithcode.com/paper/active-online-learning-in-the-binary
Repo
Framework
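
The contrast drawn above between passive learning from random patterns and actively designed queries can be made concrete with a toy version-space experiment. The script below is not one of the paper's designing protocols; it merely illustrates, with made-up sizes, how a student that designs each Ising pattern to split the surviving candidate weight vectors as evenly as possible pins down a ±1 teacher with far fewer queries than random sampling typically allows.

```python
# Toy illustration (not the paper's protocol) of active pattern design for the
# teacher-student binary perceptron: query the Ising pattern on which the
# surviving candidate weight vectors disagree the most.
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
N = 11                                             # odd, so sign(w . x) is never 0
teacher = rng.choice([-1, 1], size=N)              # hidden +-1 weight vector
candidates = np.array(list(product([-1, 1], repeat=N)))   # full version space

for step in range(40):
    proposals = rng.choice([-1, 1], size=(200, N))  # candidate query patterns
    votes = np.sign(candidates @ proposals.T)       # predictions of all candidates
    pattern = proposals[np.argmin(np.abs(votes.mean(axis=0)))]  # most balanced split
    label = np.sign(teacher @ pattern)               # teacher answers the query
    candidates = candidates[np.sign(candidates @ pattern) == label]
    if len(candidates) == 1:
        print(f"teacher identified after {step + 1} designed patterns:",
              np.array_equal(candidates[0], teacher))
        break
```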

MAE: Mutual Posterior-Divergence Regularization for Variational AutoEncoders

Title MAE: Mutual Posterior-Divergence Regularization for Variational AutoEncoders
Authors Xuezhe Ma, Chunting Zhou, Eduard Hovy
Abstract Variational Autoencoder (VAE), a simple and effective deep generative model, has led to a number of impressive empirical successes and spawned many advanced variants and theoretical investigations. However, recent studies demonstrate that, when equipped with expressive generative distributions (a.k.a. decoders), VAE suffers from learning uninformative latent representations with the observation called KL vanishing, in which case VAE collapses into an unconditional generative model. In this work, we introduce mutual posterior-divergence regularization, a novel regularization that is able to control the geometry of the latent space to accomplish meaningful representation learning, while achieving comparable or superior capability of density estimation. Experiments on three image benchmark datasets demonstrate that, when equipped with powerful decoders, our model performs well both on density estimation and representation learning.
Tasks Density Estimation, Representation Learning
Published 2019-01-06
URL http://arxiv.org/abs/1901.01498v1
PDF http://arxiv.org/pdf/1901.01498v1.pdf
PWC https://paperswithcode.com/paper/mae-mutual-posterior-divergence
Repo
Framework
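
As a rough illustration of the idea (not the paper's exact regularizer), the snippet below computes the average symmetric KL divergence between the diagonal-Gaussian posteriors of different examples in a batch; encouraging this quantity to stay large is one way to keep the posteriors from collapsing onto each other. The shapes and averaging scheme are assumptions made for the sketch.

```python
# Sketch of a mutual posterior-divergence style quantity for a Gaussian-posterior
# VAE: average pairwise symmetric KL between q(z|x_i) and q(z|x_j).
import torch

def diag_gauss_kl(mu1, logvar1, mu2, logvar2):
    """KL( N(mu1, var1) || N(mu2, var2) ) for diagonal Gaussians, summed over dims."""
    var1, var2 = logvar1.exp(), logvar2.exp()
    return 0.5 * ((var1 + (mu1 - mu2) ** 2) / var2 - 1 + logvar2 - logvar1).sum(-1)

def mutual_posterior_divergence(mu, logvar):
    """Average symmetric KL over all ordered pairs of distinct examples."""
    B = mu.size(0)
    kl = diag_gauss_kl(mu[:, None], logvar[:, None], mu[None, :], logvar[None, :])
    sym = kl + kl.T                                   # symmetrize the pairwise matrix
    return (sym.sum() - sym.diagonal().sum()) / (B * (B - 1))

mu, logvar = torch.randn(8, 16), torch.zeros(8, 16)  # mock encoder outputs for a batch
print(mutual_posterior_divergence(mu, logvar))
```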

Redditors in Recovery: Text Mining Reddit to Investigate Transitions into Drug Addiction

Title Redditors in Recovery: Text Mining Reddit to Investigate Transitions into Drug Addiction
Authors John Lu, Sumati Sridhar, Ritika Pandey, Mohammad Al Hasan, George Mohler
Abstract Increasing rates of opioid drug abuse and heightened prevalence of online support communities underscore the necessity of employing data mining techniques to better understand drug addiction using these rapidly developing online resources. In this work, we obtain data from Reddit, an online collection of forums, to gather insight into drug use/misuse using text data from users themselves. Specifically, using user posts, we trained 1) a binary classifier which predicts transitions from casual drug discussion forums to drug recovery forums and 2) a Cox regression model that outputs likelihoods of such transitions. In doing so, we found that utterances of select drugs and certain linguistic features contained in one’s posts can help predict these transitions. Using unfiltered drug-related posts, our research delineates drugs that are associated with higher rates of transitions from recreational drug discussion to support/recovery discussion, offers insight into modern drug culture, and provides tools with potential applications in combating the opioid crisis.
Tasks
Published 2019-03-11
URL http://arxiv.org/abs/1903.04081v1
PDF http://arxiv.org/pdf/1903.04081v1.pdf
PWC https://paperswithcode.com/paper/redditors-in-recovery-text-mining-reddit-to
Repo
Framework
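
A toy sketch of the first of the two models described above, a binary classifier over a user's posts. The texts and labels here are fabricated placeholders; the actual study uses real Reddit posts, richer linguistic features, and a separate Cox regression model for transition likelihoods.

```python
# Toy bag-of-words classifier predicting whether a user later transitions from
# casual drug-discussion forums to recovery forums (placeholder data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

posts = [
    "tried it once at a party, felt fine",
    "cant stop thinking about using again, need help",
    "anyone know how strong this batch usually is",
    "day 3 clean and the cravings are brutal",
]
transitioned = [0, 1, 0, 1]   # toy labels: user later posted in a recovery forum

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(posts, transitioned)
print(model.predict(["thinking about quitting, cant keep doing this"]))
```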

Deep Supervised Hashing leveraging Quadratic Spherical Mutual Information for Content-based Image Retrieval

Title Deep Supervised Hashing leveraging Quadratic Spherical Mutual Information for Content-based Image Retrieval
Authors Nikolaos Passalis, Anastasios Tefas
Abstract Several deep supervised hashing techniques have been proposed to allow for efficiently querying large image databases. However, deep supervised image hashing techniques are developed, to a great extent, heuristically, often leading to suboptimal results. In contrast, we propose an efficient deep supervised hashing algorithm that optimizes the learned codes using an information-theoretic measure, the Quadratic Mutual Information (QMI). The proposed method is adapted to the needs of large-scale hashing and information retrieval, leading to a novel information-theoretic measure, the Quadratic Spherical Mutual Information (QSMI). Apart from demonstrating the effectiveness of the proposed method under different scenarios and outperforming existing state-of-the-art image hashing techniques, this paper provides a structured way to model the process of information retrieval and develop novel methods adapted to the needs of each application.
Tasks Content-Based Image Retrieval, Image Retrieval, Information Retrieval
Published 2019-01-16
URL http://arxiv.org/abs/1901.05135v1
PDF http://arxiv.org/pdf/1901.05135v1.pdf
PWC https://paperswithcode.com/paper/deep-supervised-hashing-leveraging-quadratic
Repo
Framework
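
To give a feel for the kind of objective the abstract describes, the sketch below estimates quadratic mutual information between a batch of codes and their class labels, with a cosine ("spherical") similarity standing in for the usual Gaussian kernel. The paper's exact QSMI formulation and training procedure differ; this is only an illustrative loss.

```python
# Simplified in-batch quadratic mutual information loss with a cosine kernel.
import torch

def qmi_loss(codes, labels):
    y = torch.nn.functional.normalize(codes, dim=1)
    K = 0.5 * (y @ y.T + 1)                           # cosine similarity mapped to [0, 1]
    N = labels.size(0)
    same = (labels[:, None] == labels[None, :]).float()
    _, counts = labels.unique(return_counts=True)
    class_priors = counts.float() / N                 # batch-level class priors
    sample_prior = same.mean(dim=1, keepdim=True)     # prior of each sample's own class
    v_in = (K * same).sum() / N**2                    # within-class pair term
    v_all = (class_priors**2).sum() * K.sum() / N**2  # all-pairs term, prior-weighted
    v_btw = (K * sample_prior).sum() / N**2           # between term
    return -(v_in + v_all - 2 * v_btw)                # maximize QMI = minimize -QMI

codes = torch.randn(16, 32, requires_grad=True)
labels = torch.randint(0, 4, (16,))
loss = qmi_loss(codes, labels)
loss.backward()
print(loss.item())
```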

Learning to Learn Relation for Important People Detection in Still Images

Title Learning to Learn Relation for Important People Detection in Still Images
Authors Wei-Hong Li, Fa-Ting Hong, Wei-Shi Zheng
Abstract Humans can easily recognize the importance of people in social event images, and they always focus on the most important individuals. However, learning to learn the relation between people in an image, and inferring the most important person based on this relation, remains undeveloped. In this work, we propose a deep imPOrtance relatIon NeTwork (POINT) that combines both relation modeling and feature learning. In particular, we infer two types of interaction modules: the person-person interaction module that learns the interaction between people and the event-person interaction module that learns to describe how a person is involved in the event occurring in an image. We then estimate the importance relations among people from both interactions and encode the relation feature from the importance relations. In this way, POINT automatically learns several types of relation features in parallel, and we aggregate these relation features and the person’s feature to form the importance feature for important people classification. Extensive experimental results show that our method is effective for important people detection and verify the efficacy of learning to learn relations for important people detection.
Tasks
Published 2019-04-07
URL http://arxiv.org/abs/1904.03632v1
PDF http://arxiv.org/pdf/1904.03632v1.pdf
PWC https://paperswithcode.com/paper/learning-to-learn-relation-for-important
Repo
Framework
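
As a generic analogue of the relation modelling described above (not POINT's actual interaction modules), the sketch below builds a pairwise relation feature for every pair of people and combines it with each person's own feature to score importance; the dimensions and module structure are assumptions.

```python
# Generic relation-network-style importance scorer over detected people.
import torch
import torch.nn as nn

class PairwiseImportance(nn.Module):
    def __init__(self, feat_dim, hidden=64):
        super().__init__()
        self.relation = nn.Sequential(nn.Linear(2 * feat_dim, hidden), nn.ReLU(),
                                      nn.Linear(hidden, hidden))
        self.score = nn.Linear(feat_dim + hidden, 1)

    def forward(self, person_feats):                  # (P, feat_dim), one row per person
        P = person_feats.size(0)
        a = person_feats[:, None].expand(P, P, -1)    # person i in pair (i, j)
        b = person_feats[None, :].expand(P, P, -1)    # person j in pair (i, j)
        rel = self.relation(torch.cat([a, b], dim=-1)).mean(dim=1)  # aggregate relations
        return self.score(torch.cat([person_feats, rel], dim=-1)).squeeze(-1)

scores = PairwiseImportance(feat_dim=128)(torch.randn(5, 128))
print(scores.softmax(dim=0))                          # relative importance of 5 people
```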

Towards Controllable and Personalized Review Generation

Title Towards Controllable and Personalized Review Generation
Authors Pan Li, Alexander Tuzhilin
Abstract In this paper, we propose a novel model RevGAN that automatically generates controllable and personalized user reviews based on arbitrarily given sentiment and stylistic information. RevGAN utilizes the combination of three novel components, including self-attentive recursive autoencoders, conditional discriminators, and personalized decoders. We test its performance on several real-world datasets, where our model significantly outperforms state-of-the-art generation models in terms of sentence quality, coherence, personalization and human evaluations. We also empirically show that the generated reviews could not be easily distinguished from the organically produced reviews and that they follow the same statistical linguistic laws.
Tasks
Published 2019-09-30
URL https://arxiv.org/abs/1910.03506v2
PDF https://arxiv.org/pdf/1910.03506v2.pdf
PWC https://paperswithcode.com/paper/towards-controllable-and-personalized-review
Repo
Framework

RouteNet: Leveraging Graph Neural Networks for network modeling and optimization in SDN

Title RouteNet: Leveraging Graph Neural Networks for network modeling and optimization in SDN
Authors Krzysztof Rusek, José Suárez-Varela, Paul Almasan, Pere Barlet-Ros, Albert Cabellos-Aparicio
Abstract Network modeling is a key enabler to achieve efficient network operation in future self-driving Software-Defined Networks. However, we still lack functional network models able to produce accurate predictions of Key Performance Indicators (KPI) such as delay, jitter or loss at limited cost. In this paper we propose RouteNet, a novel network model based on Graph Neural Networks (GNN) that is able to understand the complex relationship between topology, routing and input traffic to produce accurate estimates of the per-source/destination per-packet delay distribution and loss. RouteNet leverages the ability of GNNs to learn and model graph-structured information; as a result, our model is able to generalize over arbitrary topologies, routing schemes and traffic intensity. In our evaluation, we show that RouteNet is able to accurately predict the delay distribution (mean delay and jitter) and loss even for topologies, routing schemes and traffic unseen during training (worst case $R^{2}$ = 0.878). Also, we present several use cases where we leverage the KPI predictions of our GNN model to achieve efficient routing optimization and network planning.
Tasks
Published 2019-10-03
URL https://arxiv.org/abs/1910.01508v1
PDF https://arxiv.org/pdf/1910.01508v1.pdf
PWC https://paperswithcode.com/paper/routenet-leveraging-graph-neural-networks-for
Repo
Framework
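
A heavily simplified sketch of the link-path message passing behind RouteNet: per-link and per-path hidden states exchange messages for a few rounds, and a readout maps the final path states to per-path estimates. The real model uses recurrent (GRU) updates and trained readouts; the toy routing, dimensions, and linear updates below are assumptions.

```python
# Minimal link-path message-passing sketch in the spirit of RouteNet (untrained).
import torch
import torch.nn as nn

n_links, n_paths, hid = 6, 3, 16
# paths[p] lists the link indices traversed by path p (toy routing)
paths = [[0, 1], [1, 2, 3], [4, 5]]

link_update = nn.Linear(2 * hid, hid)
path_update = nn.Linear(2 * hid, hid)
readout = nn.Linear(hid, 1)

link_h = torch.zeros(n_links, hid)
path_h = torch.randn(n_paths, hid)            # e.g. initialized from traffic demands

for _ in range(4):                            # T rounds of message passing
    # links aggregate the states of the paths that cross them
    link_msg = torch.zeros(n_links, hid)
    for p, links in enumerate(paths):
        for l in links:
            link_msg[l] += path_h[p]
    link_h = torch.tanh(link_update(torch.cat([link_h, link_msg], dim=-1)))
    # paths aggregate the states of the links they traverse
    path_msg = torch.stack([link_h[links].sum(0) for links in paths])
    path_h = torch.tanh(path_update(torch.cat([path_h, path_msg], dim=-1)))

print(readout(path_h).squeeze(-1))            # per-path delay estimate (untrained)
```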

Review: Ordinary Differential Equations For Deep Learning

Title Review: Ordinary Differential Equations For Deep Learning
Authors Xinshi Chen
Abstract To better understand and improve the behavior of neural networks, a recent line of work has bridged the connection between ordinary differential equations (ODEs) and deep neural networks (DNNs). The connection is made in two ways: (1) viewing a DNN as an ODE discretization; (2) viewing the training of a DNN as solving an optimal control problem. The former connection motivates people either to design neural architectures based on ODE discretization schemes or to replace DNNs by a continuous model characterized by ODEs. Several works have demonstrated distinct advantages of using a continuous model instead of a traditional DNN in some specific applications. The latter connection is inspiring. Based on Pontryagin’s maximum principle, which is popular in the optimal control literature, some works developed new optimization methods for training neural networks and others developed algorithms to train the infinitely deep continuous model with low memory cost. This paper is organized as follows: In Section 2, the relation between neural architectures and ODE discretization is introduced. Some architectures are not motivated by ODEs but are later found to be associated with specific discretization schemes; others are designed based on ODE discretization and expected to achieve special properties. Section 3 formulates the optimization problem in which a traditional neural network is replaced by a continuous model (ODE). The formulated optimization problem is an optimal control problem, so two different types of controls are also discussed in this section. Section 4 discusses how optimization methods that are popular in the optimal control literature can help the training of machine learning problems. Finally, two applications of using a continuous model are shown in Sections 5 and 6 to demonstrate some of its advantages over traditional neural networks.
Tasks
Published 2019-11-01
URL https://arxiv.org/abs/1911.00502v1
PDF https://arxiv.org/pdf/1911.00502v1.pdf
PWC https://paperswithcode.com/paper/review-ordinary-differential-equations-for
Repo
Framework
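
The first connection surveyed above can be made concrete in a few lines: a residual block x_{k+1} = x_k + h * f(x_k) is exactly one forward-Euler step of the ODE dx/dt = f(x), so a stack of such blocks is a discretization of a continuous-depth model. The toy vector field below is illustrative.

```python
# ResNet forward pass as forward-Euler integration of dx/dt = f(x).
import numpy as np

def f(x):                       # a fixed toy "layer" / vector field
    return np.tanh(x)

def resnet_forward(x, depth, h):
    for _ in range(depth):      # stacked residual blocks = Euler steps
        x = x + h * f(x)
    return x

x0 = np.array([0.5, -1.0])
coarse = resnet_forward(x0, depth=10, h=0.1)     # 10 blocks, step size 0.1
fine = resnet_forward(x0, depth=1000, h=0.001)   # finer discretization of the same ODE
print(coarse, fine)             # both approximate the ODE solution at t = 1
```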

An Alternative Probabilistic Interpretation of the Huber Loss

Title An Alternative Probabilistic Interpretation of the Huber Loss
Authors Gregory P. Meyer
Abstract The Huber loss is a robust loss function used for a wide range of regression tasks. To utilize the Huber loss, a parameter that controls the transition from a quadratic function to an absolute-value function needs to be selected. We believe the standard probabilistic interpretation that relates the Huber loss to the so-called Huber density fails to provide adequate intuition for identifying the transition point. As a result, hyper-parameter search is often necessary to determine an appropriate value. In this work, we propose an alternative probabilistic interpretation of the Huber loss, which relates minimizing the Huber loss to minimizing an upper bound on the Kullback-Leibler divergence between Laplace distributions. Furthermore, we show that the parameters of the Laplace distributions are directly related to the transition point of the Huber loss. We demonstrate through a case study and experimentation on the Faster R-CNN object detector that our interpretation provides an intuitive way to select well-suited hyper-parameters.
Tasks
Published 2019-11-05
URL https://arxiv.org/abs/1911.02088v1
PDF https://arxiv.org/pdf/1911.02088v1.pdf
PWC https://paperswithcode.com/paper/an-alternative-probabilistic-interpretation
Repo
Framework
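
For reference, here is the Huber loss with its transition parameter delta written out explicitly; delta is the hyper-parameter whose selection the paper's interpretation is meant to guide.

```python
# Huber loss: quadratic for |r| <= delta, linear beyond it.
import numpy as np

def huber(residual, delta):
    r = np.abs(residual)
    return np.where(r <= delta,
                    0.5 * r**2,                     # quadratic near zero
                    delta * (r - 0.5 * delta))      # linear in the tails

residuals = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
print(huber(residuals, delta=1.0))
```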

Natural Actor-Critic Converges Globally for Hierarchical Linear Quadratic Regulator

Title Natural Actor-Critic Converges Globally for Hierarchical Linear Quadratic Regulator
Authors Yuwei Luo, Zhuoran Yang, Zhaoran Wang, Mladen Kolar
Abstract Multi-agent reinforcement learning has been successfully applied to a number of challenging problems. Despite these empirical successes, theoretical understanding of different algorithms is lacking, primarily due to the curse of dimensionality caused by the exponential growth of the state-action space with the number of agents. We study a fundamental problem of multi-agent linear quadratic regulator in a setting where the agents are partially exchangeable. In this setting, we develop a hierarchical actor-critic algorithm, whose computational complexity is independent of the total number of agents, and prove its global linear convergence to the optimal policy. As linear quadratic regulators are often used to approximate general dynamic systems, this paper provides an important step towards better understanding of general hierarchical mean-field multi-agent reinforcement learning.
Tasks Multi-agent Reinforcement Learning
Published 2019-12-14
URL https://arxiv.org/abs/1912.06875v1
PDF https://arxiv.org/pdf/1912.06875v1.pdf
PWC https://paperswithcode.com/paper/natural-actor-critic-converges-globally-for
Repo
Framework

Annotating Student Talk in Text-based Classroom Discussions

Title Annotating Student Talk in Text-based Classroom Discussions
Authors Luca Lugini, Diane Litman, Amanda Godley, Christopher Olshefski
Abstract Classroom discussions in English Language Arts have a positive effect on students’ reading, writing and reasoning skills. Although prior work has largely focused on teacher talk and student-teacher interactions, we focus on three theoretically-motivated aspects of high-quality student talk: argumentation, specificity, and knowledge domain. We introduce an annotation scheme, then show that the scheme can be used to produce reliable annotations and that the annotations are predictive of discussion quality. We also highlight opportunities provided by our scheme for education and natural language processing research.
Tasks
Published 2019-09-06
URL https://arxiv.org/abs/1909.03023v1
PDF https://arxiv.org/pdf/1909.03023v1.pdf
PWC https://paperswithcode.com/paper/annotating-student-talk-in-text-based-1
Repo
Framework

SleepNet: Automated Sleep Analysis via Dense Convolutional Neural Network Using Physiological Time Series

Title SleepNet: Automated Sleep Analysis via Dense Convolutional Neural Network Using Physiological Time Series
Authors Bahareh Pourbabaee, Matthew Howe-Patterson, Matthew Patterson, Frederic Benard
Abstract In this work, a dense recurrent convolutional neural network (DRCNN) was constructed to detect sleep disorders including arousal, apnea and hypopnea using Polysomnography (PSG) measurement channels provided in the 2018 Physionet challenge database. Our model structure is composed of multiple dense convolutional units (DCU) followed by a bidirectional long short-term memory (LSTM) layer and a softmax output layer. The sleep events including sleep stages, arousal regions and multiple types of apnea and hypopnea are manually annotated by experts, which enables us to train our proposed network using a multi-task learning mechanism. Three binary cross-entropy loss functions corresponding to sleep/wake, target arousal and apnea-hypopnea/normal detection tasks are summed up to generate our overall network loss function that is optimized using the Adam method. Our model performance was evaluated using two metrics: the area under the precision-recall curve (AUPRC) and the area under the receiver operating characteristic curve (AUROC). To measure our model generalization, 4-fold cross-validation was also performed. For training, our model was applied to full night recording data. Finally, the average AUPRC and AUROC values associated with the arousal detection task were 0.505 and 0.922, respectively on our testing dataset. An ensemble of four models trained on different data folds improved the AUPRC and AUROC to 0.543 and 0.931, respectively. Our proposed algorithm achieved the first place in the official stage of the 2018 Physionet challenge for detecting sleep arousals with AUPRC of 0.54 on the blind testing dataset.
Tasks Multi-Task Learning, Time Series
Published 2019-03-11
URL https://arxiv.org/abs/1903.04377v2
PDF https://arxiv.org/pdf/1903.04377v2.pdf
PWC https://paperswithcode.com/paper/sleepnet-automated-sleep-disorder-detection
Repo
Framework
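
A sketch of the multi-task objective described above: three per-time-step binary cross-entropy terms summed into one loss. The DRCNN that would produce the three probability streams is omitted here, and the tensor shapes and task names are placeholders; in training, Adam would step on the summed loss as stated in the abstract.

```python
# Multi-task loss: sum of three binary cross-entropy terms over time steps.
import torch
import torch.nn.functional as F

def sleepnet_loss(pred, target):
    """pred/target: dicts of (batch, time) tensors for the three detection tasks."""
    tasks = ["sleep_wake", "arousal", "apnea_hypopnea"]
    return sum(F.binary_cross_entropy(pred[t], target[t]) for t in tasks)

B, T = 2, 100
pred = {t: torch.rand(B, T, requires_grad=True) for t in
        ["sleep_wake", "arousal", "apnea_hypopnea"]}     # mock network outputs
target = {t: torch.randint(0, 2, (B, T)).float() for t in pred}  # mock annotations

loss = sleepnet_loss(pred, target)
loss.backward()
print(loss.item())
```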

Online Distributed Estimation of Principal Eigenspaces

Title Online Distributed Estimation of Principal Eigenspaces
Authors Davoud Ataee Tarzanagh, Mohamad Kazem Shirani Faradonbeh, George Michailidis
Abstract Principal components analysis (PCA) is a widely used dimension reduction technique with an extensive range of applications. In this paper, an online distributed algorithm is proposed for recovering the principal eigenspaces. We further establish its rate of convergence and show how it relates to the number of nodes employed in the distributed computation, the effective rank of the data matrix under consideration, and the gap in the spectrum of the underlying population covariance matrix. The proposed algorithm is illustrated on low-rank approximation and $\boldsymbol{k}$-means clustering tasks. The numerical results show a substantial computational speed-up vis-a-vis standard distributed PCA algorithms, without compromising learning accuracy.
Tasks Dimensionality Reduction
Published 2019-05-17
URL https://arxiv.org/abs/1905.07389v1
PDF https://arxiv.org/pdf/1905.07389v1.pdf
PWC https://paperswithcode.com/paper/online-distributed-estimation-of-principal
Repo
Framework
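
For intuition about what online estimation of a principal eigenspace involves (this is not the paper's distributed algorithm), the sketch below tracks a k-dimensional eigenspace from streaming samples using an Oja-style update with QR re-orthonormalization; the covariance model and step size are arbitrary choices for the demo.

```python
# Minimal online eigenspace tracker: Oja-style update + QR re-orthonormalization.
import numpy as np

rng = np.random.default_rng(0)
d, k, eta = 20, 3, 0.05
# ground-truth covariance with a clear spectral gap after the top 3 directions
U, _ = np.linalg.qr(rng.standard_normal((d, d)))
cov = U @ np.diag([10, 8, 6] + [0.1] * (d - 3)) @ U.T

Q, _ = np.linalg.qr(rng.standard_normal((d, k)))   # current eigenspace estimate
for _ in range(5000):
    x = rng.multivariate_normal(np.zeros(d), cov)  # one streaming sample
    Q = Q + eta * np.outer(x, x) @ Q               # Oja-style gradient step
    Q, _ = np.linalg.qr(Q)                         # keep the columns orthonormal

# principal-angle check against the true top-k eigenspace
true_topk = U[:, :3]
print(np.linalg.svd(Q.T @ true_topk, compute_uv=False))  # all near 1 when aligned
```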