Paper Group ANR 147
A Hybrid Approach to Dependency Parsing: Combining Rules and Morphology with Deep Learning
Title | A Hybrid Approach to Dependency Parsing: Combining Rules and Morphology with Deep Learning |
Authors | Şaziye Betül Özateş, Arzucan Özgür, Tunga Güngör, Balkız Öztürk |
Abstract | Fully data-driven, deep learning-based models are usually designed as language-independent and have been shown to be successful for many natural language processing tasks. However, when the studied language is low-resourced and the amount of training data is insufficient, these models can benefit from the integration of natural language grammar-based information. We propose two approaches to dependency parsing, especially for languages with a restricted amount of training data. Our first approach combines a state-of-the-art deep learning-based parser with a rule-based approach, and the second one incorporates morphological information into the parser. In the rule-based approach, the parsing decisions made by the rules are encoded and concatenated with the vector representations of the input words as additional information to the deep network. The morphology-based approach proposes different methods to include the morphological structure of words into the parser network. Experiments are conducted on the IMST-UD Treebank and the results suggest that integration of explicit knowledge about the target language into a neural parser through a rule-based parsing system and morphological analysis leads to more accurate annotations and hence increases the parsing performance in terms of attachment scores. The proposed methods are developed for Turkish, but can be adapted to other languages as well. |
Tasks | Dependency Parsing, Morphological Analysis |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10116v1 |
https://arxiv.org/pdf/2002.10116v1.pdf | |
PWC | https://paperswithcode.com/paper/a-hybrid-approach-to-dependency-parsing |
Repo | |
Framework | |
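The rule-integration idea in the abstract (encoding rule-based parsing decisions and concatenating them with word vectors before feeding them to the network) can be sketched as follows. The encoding scheme, dimensions, and function names here are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def encode_rule_decision(head_offset, label_id, num_labels, max_offset=5):
    """One-hot encode a rule-based parser's decision for a token: the
    predicted head offset (clipped to a window) and dependency label.
    This particular encoding scheme is a hypothetical stand-in."""
    offset_vec = np.zeros(2 * max_offset + 1)
    offset_vec[int(np.clip(head_offset, -max_offset, max_offset)) + max_offset] = 1.0
    label_vec = np.zeros(num_labels)
    label_vec[label_id] = 1.0
    return np.concatenate([offset_vec, label_vec])

def augment_embedding(word_vec, rule_vec):
    """Concatenate the word embedding with the encoded rule decision, so the
    downstream neural parser sees the rule output as extra input features."""
    return np.concatenate([word_vec, rule_vec])

word_vec = np.zeros(100)  # stand-in for a pretrained word embedding
rule_vec = encode_rule_decision(head_offset=-1, label_id=3, num_labels=10)
x = augment_embedding(word_vec, rule_vec)
assert x.shape == (121,)  # 100-dim embedding + 11 offset bins + 10 labels
```

The augmented vector then replaces the plain word embedding at the parser's input layer; the morphology-based variant would concatenate morphological-feature encodings in the same way.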
Utilizing Human Memory Processes to Model Genre Preferences for Personalized Music Recommendations
Title | Utilizing Human Memory Processes to Model Genre Preferences for Personalized Music Recommendations |
Authors | Dominik Kowald, Elisabeth Lex, Markus Schedl |
Abstract | In this paper, we introduce a psychology-inspired approach to model and predict the music genre preferences of different groups of users by utilizing human memory processes. These processes describe how humans access information units in their memory by considering the factors of (i) past usage frequency, (ii) past usage recency, and (iii) the current context. Using a publicly available dataset of more than a billion music listening records shared on the music streaming platform Last.fm, we find that our approach provides significantly better prediction accuracy than various baseline algorithms for all evaluated user groups, i.e., (i) low-mainstream music listeners, (ii) medium-mainstream music listeners, and (iii) high-mainstream music listeners. Furthermore, our approach is based on a simple psychological model, which contributes to the transparency and explainability of the calculated predictions. |
Tasks | |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.10699v1 |
https://arxiv.org/pdf/2003.10699v1.pdf | |
PWC | https://paperswithcode.com/paper/utilizing-human-memory-processes-to-model |
Repo | |
Framework | |
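The frequency and recency factors in such a memory model are commonly formalized with the base-level learning (BLL) equation from the ACT-R cognitive architecture. A minimal sketch, assuming that formulation (parameter values and the exact model used in the paper may differ):

```python
import math

def base_level_activation(listen_times, now, d=0.5):
    """ACT-R base-level activation: ln(sum_i (now - t_i)^(-d)).
    A genre's activation grows with how often (frequency) and how
    recently (recency) it was listened to; d is the power-law decay."""
    return math.log(sum((now - t) ** (-d) for t in listen_times))

# toy example: timestamps in days
frequent_recent = base_level_activation([9.0, 9.5, 9.9], now=10.0)
single_old = base_level_activation([1.0], now=10.0)
assert frequent_recent > single_old  # frequent + recent genres rank higher
```

Ranking a user's genres by this activation (optionally combined with a context signal) yields the kind of transparent, psychology-grounded predictor the abstract describes.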
Evolutionary algorithms for constructing an ensemble of decision trees
Title | Evolutionary algorithms for constructing an ensemble of decision trees |
Authors | Evgeny Dolotov, Nikolai Zolotykh |
Abstract | Most decision tree induction algorithms are based on a greedy top-down recursive partitioning strategy for tree growth. In this paper, we propose several methods for the induction of decision trees and their ensembles based on evolutionary algorithms. The main difference of our approach is the use of a real-valued vector representation of the decision tree, which allows a large number of different optimization algorithms to be used and the whole tree or ensemble to be optimized at once to avoid local optima. Differential evolution and evolution strategies were chosen as the optimization algorithms, as they have shown good results in reinforcement learning problems. We test the predictive performance of these methods on several public UCI data sets, and the proposed methods show better quality than classical methods. |
Tasks | |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00721v1 |
https://arxiv.org/pdf/2002.00721v1.pdf | |
PWC | https://paperswithcode.com/paper/evolutionary-algorithms-for-constructing-an |
Repo | |
Framework | |
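The core idea, encoding a whole tree as one real-valued vector and optimizing it with an evolutionary algorithm, can be sketched as follows. The encoding (a fixed depth-2 tree) and the simple (1+1) evolution strategy are illustrative assumptions, not the paper's exact scheme:

```python
import numpy as np

rng = np.random.default_rng(0)

def tree_predict(theta, X):
    """Decode a flat real vector as a depth-2 binary classification tree:
    theta[0:6:2] pick split features (rounded), theta[1:6:2] are the
    thresholds, theta[6:10] are the four leaf scores."""
    f = np.abs(theta[0:6:2]).astype(int) % X.shape[1]
    t = theta[1:6:2]
    leaves = theta[6:10]
    out = np.empty(len(X))
    for i, x in enumerate(X):
        node = 0 if x[f[0]] < t[0] else 1                  # root split
        leaf = 2 * node + (0 if x[f[1 + node]] < t[1 + node] else 1)
        out[i] = leaves[leaf]
    return (out > 0).astype(int)

def fitness(theta, X, y):
    return (tree_predict(theta, X) == y).mean()

# (1+1) evolution strategy mutating the *whole* tree vector at once
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 2] > 0).astype(int)
best = rng.normal(size=10)
init_fit = fitness(best, X, y)
for _ in range(300):
    cand = best + 0.3 * rng.normal(size=10)                # Gaussian mutation
    if fitness(cand, X, y) >= fitness(best, X, y):
        best = cand
```

Because the representation is just a real vector, differential evolution (e.g., `scipy.optimize.differential_evolution` over the same fitness) is a drop-in alternative to the mutation loop above.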
Combining SchNet and SHARC: The SchNarc machine learning approach for excited-state dynamics
Title | Combining SchNet and SHARC: The SchNarc machine learning approach for excited-state dynamics |
Authors | Julia Westermayr, Michael Gastegger, Philipp Marquetand |
Abstract | In recent years, deep learning has become a part of our everyday life and is revolutionizing quantum chemistry as well. In this work, we show how deep learning can be used to advance the research field of photochemistry by learning all important properties for photodynamics simulations. The properties are multiple energies, forces, nonadiabatic couplings and spin-orbit couplings. The nonadiabatic couplings are learned in a phase-free manner as derivatives of a virtually constructed property by the deep learning model, which guarantees rotational covariance. Additionally, an approximation for nonadiabatic couplings is introduced, based on the potentials, their gradients and Hessians. As the deep-learning method, we employ SchNet extended for multiple electronic states. In combination with the molecular dynamics program SHARC, our approach, termed SchNarc, is tested on a model system and two realistic polyatomic molecules and paves the way towards efficient photodynamics simulations of complex systems. |
Tasks | |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.07264v1 |
https://arxiv.org/pdf/2002.07264v1.pdf | |
PWC | https://paperswithcode.com/paper/combining-schnet-and-sharc-the-schnarc |
Repo | |
Framework | |
Controllable Descendant Face Synthesis
Title | Controllable Descendant Face Synthesis |
Authors | Yong Zhang, Le Li, Zhilei Liu, Baoyuan Wu, Yanbo Fan, Zhifeng Li |
Abstract | Kinship face synthesis is an interesting topic raised to answer questions like “what will your future children look like?”. Published approaches to this topic are limited. Most of the existing methods train models for one-versus-one kin relation, which only consider one parent face and one child face by directly using an auto-encoder without any explicit control over the resemblance of the synthesized face to the parent face. In this paper, we propose a novel method for controllable descendant face synthesis, which models two-versus-one kin relation between two parent faces and one child face. Our model consists of an inheritance module and an attribute enhancement module, where the former is designed for accurate control over the resemblance between the synthesized face and parent faces, and the latter is designed for control over age and gender. As there is no large scale database with father-mother-child kinship annotation, we propose an effective strategy to train the model without using the ground truth descendant faces. No carefully designed image pairs are required for learning except only age and gender labels of training faces. We conduct comprehensive experimental evaluations on three public benchmark databases, which demonstrates encouraging results. |
Tasks | Face Generation |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11376v1 |
https://arxiv.org/pdf/2002.11376v1.pdf | |
PWC | https://paperswithcode.com/paper/controllable-descendant-face-synthesis |
Repo | |
Framework | |
Quantum statistical query learning
Title | Quantum statistical query learning |
Authors | Srinivasan Arunachalam, Alex B. Grilo, Henry Yuen |
Abstract | We propose a learning model called the quantum statistical query (QSQ) model, which extends the SQ learning model introduced by Kearns to the quantum setting. Our model can also be seen as a restriction of the quantum PAC learning model: here, the learner does not have direct access to quantum examples, but can only obtain estimates of measurement statistics on them. Theoretically, this model provides a simple yet expressive setting to explore the power of quantum examples in machine learning. From a practical perspective, since simpler operations are required, learning algorithms in the QSQ model are more feasible for implementation on near-term quantum devices. We prove a number of results about the QSQ learning model. We first show that parity functions, (log n)-juntas and polynomial-sized DNF formulas are efficiently learnable in the QSQ model, in contrast to the classical setting where these problems are provably hard. This implies that many of the advantages of quantum PAC learning can be realized even in the more restricted quantum SQ learning model. It is well-known that the weak statistical query dimension, denoted by WSQDIM(C), characterizes the complexity of learning a concept class C in the classical SQ model. We show that log(WSQDIM(C)) is a lower bound on the complexity of QSQ learning, and furthermore it is tight for certain concept classes C. Additionally, we show that this quantity provides strong lower bounds for the small-bias quantum communication model under product distributions. Finally, we introduce the notion of private quantum PAC learning, in which a quantum PAC learner is required to be differentially private. We show that learnability in the QSQ model implies learnability in the quantum private PAC model. Additionally, we show that in the private PAC learning setting, the classical and quantum sample complexities are equal, up to constant factors. |
Tasks | |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.08240v1 |
https://arxiv.org/pdf/2002.08240v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-statistical-query-learning |
Repo | |
Framework | |
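The defining restriction of the (Q)SQ model, seeing only tolerance-bounded expectation estimates instead of raw examples, is easy to picture with a classical SQ oracle simulated by sampling. This is only an illustrative classical analogue, not a simulation of the quantum model:

```python
import random

random.seed(0)

def sq_oracle(concept, phi, sample, tau, n=2000):
    """Simulate a (classical) statistical query oracle: return an estimate
    of E_x[phi(x, concept(x))] within tolerance tau. In the QSQ model the
    learner similarly receives only estimates of measurement statistics on
    quantum examples, never the examples themselves."""
    xs = (sample() for _ in range(n))
    return sum(phi(x, concept(x)) for x in xs) / n  # n ~ 1/tau^2 suffices w.h.p.

# toy query: estimate the bias E[c(x)] of a 2-bit parity concept
concept = lambda x: x[0] ^ x[1]
sample = lambda: (random.randint(0, 1), random.randint(0, 1))
est = sq_oracle(concept, lambda x, y: y, sample, tau=0.05)
assert abs(est - 0.5) < 0.1  # parity of two uniform bits is unbiased
```

An SQ learner may issue only such queries; the paper's results concern which concept classes remain efficiently learnable when the oracle answers come from measurements on quantum examples.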
Modelling High-Order Social Relations for Item Recommendation
Title | Modelling High-Order Social Relations for Item Recommendation |
Authors | Yang Liu, Liang Chen, Xiangnan He, Jiaying Peng, Zibin Zheng, Jie Tang |
Abstract | The prevalence of online social networks makes it essential to study how social relations affect user choice. However, most existing methods leverage only first-order social relations, that is, the direct neighbors connected to the target user. High-order social relations, e.g., the friends of friends, which are very informative for revealing user preferences, have been largely ignored. In this work, we focus on modeling the indirect influence from high-order neighbors in social networks to improve the performance of item recommendation. Distinct from mainstream social recommenders that regularize the model learning with social relations, we instead propose to directly factor social relations into the predictive model, aiming to learn better user embeddings to improve recommendation. To address the challenge that the number of high-order neighbors grows dramatically with the order size, we propose to recursively “propagate” embeddings along the social network, effectively injecting the influence of high-order neighbors into user representations. We conduct experiments on two real datasets, Yelp and Douban, to verify our High-Order Social Recommender (HOSR) model. Empirical results show that HOSR significantly outperforms the recent graph regularization-based recommenders NSCR and IF-BPR+ and the graph convolutional network-based social influence prediction model DeepInf, achieving new state-of-the-art results on the task. |
Tasks | |
Published | 2020-03-23 |
URL | https://arxiv.org/abs/2003.10149v1 |
https://arxiv.org/pdf/2003.10149v1.pdf | |
PWC | https://paperswithcode.com/paper/modelling-high-order-social-relations-for |
Repo | |
Framework | |
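The recursive "propagate embeddings along the social network" step can be sketched with simple neighbor averaging; HOSR's actual aggregation and weighting may differ, so treat this as a minimal illustration:

```python
import numpy as np

def propagate(E, A, K=2):
    """Recursively propagate user embeddings along the social graph:
    e^(k+1) = (D^-1 A) e^(k) averages over direct friends, so after K
    steps each embedding mixes in information from K-hop neighbors."""
    P = A / np.maximum(A.sum(axis=1, keepdims=True), 1)   # row-normalize
    layers = [E]
    for _ in range(K):
        layers.append(P @ layers[-1])
    return np.concatenate(layers, axis=1)  # keep every order's signal

# path graph 0-1-2-3: user 0 is not directly connected to user 2...
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
Z = propagate(np.eye(4), A, K=2)
assert Z.shape == (4, 12)
assert Z[0, 10] == 0.5  # ...but 2-hop propagation injects friend-of-friend 2
```

Crucially, the cost grows with K matrix products rather than with the (exponentially growing) number of enumerated high-order neighbors, which is the scalability point the abstract makes.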
Multi-Task Learning with Auxiliary Speaker Identification for Conversational Emotion Recognition
Title | Multi-Task Learning with Auxiliary Speaker Identification for Conversational Emotion Recognition |
Authors | Jingye Li, Meishan Zhang, Donghong Ji, Yijiang Liu |
Abstract | Conversational emotion recognition (CER) has attracted increasing interest in the natural language processing (NLP) community. Different from vanilla emotion recognition, learning an effective speaker-sensitive utterance representation is a major challenge for CER. In this paper, we exploit speaker identification (SI) as an auxiliary task to enhance the utterance representation in conversations. By this method, we can learn better speaker-aware contextual representations from the additional SI corpus. Experiments on two benchmark datasets demonstrate that the proposed architecture is highly effective for CER, obtaining new state-of-the-art results on both datasets. |
Tasks | Emotion Recognition, Emotion Recognition in Conversation, Multi-Task Learning, Speaker Identification |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01478v2 |
https://arxiv.org/pdf/2003.01478v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-learning-network-for-emotion |
Repo | |
Framework | |
Explaining Knowledge Distillation by Quantifying the Knowledge
Title | Explaining Knowledge Distillation by Quantifying the Knowledge |
Authors | Xu Cheng, Zhefan Rao, Yilan Chen, Quanshi Zhang |
Abstract | This paper presents a method to interpret the success of knowledge distillation by quantifying and analyzing task-relevant and task-irrelevant visual concepts that are encoded in intermediate layers of a deep neural network (DNN). More specifically, three hypotheses are proposed as follows. 1. Knowledge distillation makes the DNN learn more visual concepts than learning from raw data. 2. Knowledge distillation makes the DNN prone to learning various visual concepts simultaneously, whereas, when learning from raw data, the DNN learns visual concepts sequentially. 3. Knowledge distillation yields more stable optimization directions than learning from raw data. Accordingly, we design three types of mathematical metrics to evaluate feature representations of the DNN. In experiments, we diagnosed various DNNs, and the above hypotheses were verified. |
Tasks | |
Published | 2020-03-07 |
URL | https://arxiv.org/abs/2003.03622v1 |
https://arxiv.org/pdf/2003.03622v1.pdf | |
PWC | https://paperswithcode.com/paper/explaining-knowledge-distillation-by |
Repo | |
Framework | |
Team O2AS at the World Robot Summit 2018: An Approach to Robotic Kitting and Assembly Tasks using General Purpose Grippers and Tools
Title | Team O2AS at the World Robot Summit 2018: An Approach to Robotic Kitting and Assembly Tasks using General Purpose Grippers and Tools |
Authors | Felix von Drigalski, Chisato Nakashima, Yoshiya Shibata, Yoshinori Konishi, Joshua C. Triyonoputro, Kaidi Nie, Damien Petit, Toshio Ueshiba, Ryuichi Takase, Yukiyasu Domae, Taku Yoshioka, Yoshihisa Ijiri, Ixchel G. Ramirez-Alpizar, Weiwei Wan, Kensuke Harada |
Abstract | We propose a versatile robotic system for kitting and assembly tasks which uses no jigs or commercial tool changers. Instead of specialized end effectors, it uses its two-finger grippers to grasp and hold tools to perform subtasks such as screwing and suctioning. A third gripper is used as a precision picking and centering tool, and uses in-built passive compliance to compensate for small position errors and uncertainty. For the kitting task, a novel grasp point detection method for bin picking is described, using a single depth map. Using the proposed system we competed in the Assembly Challenge of the Industrial Robotics Category of the World Robot Challenge at the World Robot Summit 2018, obtaining 4th place and the SICE award for lean design and versatile tool use. We show the effectiveness of our approach through experiments performed during the competition. |
Tasks | |
Published | 2020-03-05 |
URL | https://arxiv.org/abs/2003.02427v1 |
https://arxiv.org/pdf/2003.02427v1.pdf | |
PWC | https://paperswithcode.com/paper/team-o2as-at-the-world-robot-summit-2018-an |
Repo | |
Framework | |
Improving data-driven global weather prediction using deep convolutional neural networks on a cubed sphere
Title | Improving data-driven global weather prediction using deep convolutional neural networks on a cubed sphere |
Authors | Jonathan A. Weyn, Dale R. Durran, Rich Caruana |
Abstract | We present a significantly improved data-driven global weather forecasting framework using a deep convolutional neural network (CNN) to forecast several basic atmospheric variables on a global grid. New developments in this framework include an offline volume-conservative mapping to a cubed-sphere grid, improvements to the CNN architecture, and the minimization of the loss function over multiple steps in a prediction sequence. The cubed-sphere remapping minimizes the distortion on the cube faces on which convolution operations are performed and provides natural boundary conditions for padding in the CNN. Our improved model produces weather forecasts that are indefinitely stable and produce realistic weather patterns at lead times of several weeks and longer. For short- to medium-range forecasting, our model significantly outperforms persistence, climatology, and a coarse-resolution dynamical numerical weather prediction (NWP) model. Unsurprisingly, our forecasts are worse than those from a high-resolution state-of-the-art operational NWP system. Our data-driven model is able to learn to forecast complex surface temperature patterns from a few input atmospheric state variables. On annual time scales, our model produces a realistic seasonal cycle driven solely by the prescribed variation in top-of-atmosphere solar forcing. Although it is currently less accurate than operational weather forecasting models, our data-driven CNN executes much faster than those models, suggesting that machine learning could prove to be a valuable tool for large-ensemble forecasting. |
Tasks | Weather Forecasting |
Published | 2020-03-15 |
URL | https://arxiv.org/abs/2003.11927v1 |
https://arxiv.org/pdf/2003.11927v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-data-driven-global-weather |
Repo | |
Framework | |
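The "natural boundary conditions for padding" on a cubed-sphere grid amount to copying strips from neighboring cube faces instead of zero-padding. A simplified sketch; real cubed-sphere padding must also rotate neighbor faces into a consistent orientation, which this version assumes away:

```python
import numpy as np

def pad_face(face, left, right, top, bottom, w=1):
    """Pad one cube face with width-w strips copied from its four
    neighbouring faces, so a convolution sees real atmospheric data at
    the face edge rather than zeros. Assumes pre-aligned faces."""
    core = np.concatenate([top[-w:], face, bottom[:w]], axis=0)
    l = np.concatenate([top[-w:, :w], left[:, -w:], bottom[:w, :w]], axis=0)
    r = np.concatenate([top[-w:, -w:], right[:, :w], bottom[:w, -w:]], axis=0)
    return np.concatenate([l, core, r], axis=1)

n = 4
face, lf, rt, tp, bt = (np.full((n, n), v, float) for v in range(5))
out = pad_face(face, lf, rt, tp, bt)
assert out.shape == (6, 6)     # (n + 2w) on each side
assert out[0, 2] == 3.0        # top row comes from the 'top' neighbour
assert out[2, 0] == 1.0        # left column comes from the 'left' neighbour
```

Applying this before each convolutional layer lets a standard CNN run on each face while remaining consistent across the whole sphere.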
Model Extraction Attacks against Recurrent Neural Networks
Title | Model Extraction Attacks against Recurrent Neural Networks |
Authors | Tatsuya Takemura, Naoto Yanai, Toru Fujiwara |
Abstract | Model extraction attacks are a class of attacks in which an adversary uses query access to a target model to efficiently obtain a new model whose performance is equivalent to that of the target, i.e., using fewer data and less computation than were needed for the target model. Existing works have dealt only with simple deep neural networks (DNNs), e.g., with only three layers, as targets of model extraction attacks, and hence have not addressed recurrent neural networks (RNNs), which are effective for time-series data. In this work, we shed light on the threat of model extraction attacks against RNNs. We discuss whether a model with higher accuracy can be extracted with a simple RNN from a long short-term memory (LSTM) network, which is a more complicated and powerful RNN. Specifically, we tackle the following problems. First, for a classification problem such as image recognition, we present the extraction of an RNN model without final outputs from an LSTM model by utilizing outputs halfway through the sequence. Next, for a regression problem such as weather forecasting, we present a new attack based on a newly configured loss function. We conduct experiments on our model extraction attacks against an RNN and an LSTM trained on publicly available academic datasets. We then show that a model with higher accuracy can be extracted efficiently, especially by configuring the loss function and using a more complex architecture different from that of the target model. |
Tasks | Time Series, Weather Forecasting |
Published | 2020-02-01 |
URL | https://arxiv.org/abs/2002.00123v1 |
https://arxiv.org/pdf/2002.00123v1.pdf | |
PWC | https://paperswithcode.com/paper/model-extraction-attacks-against-recurrent |
Repo | |
Framework | |
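The threat model, training a substitute from the target's answers alone, can be illustrated with a toy linear classifier; this is not the paper's RNN/LSTM attack, only a minimal sketch of the query-access setting:

```python
import numpy as np

rng = np.random.default_rng(2)

# the "target": a black-box classifier the adversary can only query
w_secret = rng.normal(size=5)
def target(X):
    return (X @ w_secret > 0).astype(float)

# attacker: choose query inputs, collect the target's answers, and fit
# a substitute model on those answers (no access to the training data)
X_query = rng.normal(size=(500, 5))
y_query = target(X_query)
w_substitute = np.linalg.lstsq(X_query, 2 * y_query - 1, rcond=None)[0]

# agreement between substitute and target on fresh inputs
X_test = rng.normal(size=(1000, 5))
agreement = ((X_test @ w_substitute > 0) == (target(X_test) > 0.5)).mean()
assert agreement > 0.75
```

The paper's contributions replace the query strategy and the fitting step: halfway-through-the-sequence outputs stand in for final labels in the classification case, and a specially configured loss function drives the regression case.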
Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue
Title | Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue |
Authors | Byeongchang Kim, Jaewoo Ahn, Gunhee Kim |
Abstract | Knowledge-grounded dialogue is a task of generating an informative response based on both discourse context and external knowledge. Focusing on better modeling knowledge selection in multi-turn knowledge-grounded dialogue, we propose a sequential latent variable model as the first approach to this matter. The model, named sequential knowledge transformer (SKT), can keep track of the prior and posterior distributions over knowledge; as a result, it can not only reduce the ambiguity caused by the diversity in knowledge selection within a conversation but also better leverage the response information for the proper choice of knowledge. Our experimental results show that the proposed model improves the knowledge selection accuracy and subsequently the performance of utterance generation. We achieve new state-of-the-art performance on Wizard of Wikipedia (Dinan et al., 2019), one of the largest and most challenging benchmarks. We further validate the effectiveness of our model over existing conversation methods on another knowledge-grounded dialogue dataset, Holl-E (Moghe et al., 2018). |
Tasks | |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07510v1 |
https://arxiv.org/pdf/2002.07510v1.pdf | |
PWC | https://paperswithcode.com/paper/sequential-latent-knowledge-selection-for-1 |
Repo | |
Framework | |
Knowledge Graphs and Knowledge Networks: The Story in Brief
Title | Knowledge Graphs and Knowledge Networks: The Story in Brief |
Authors | Amit Sheth, Swati Padhee, Amelie Gyrard |
Abstract | Knowledge Graphs (KGs) represent real-world noisy raw information in a structured form, capturing relationships between entities. However, for dynamic real-world applications such as social networks, recommender systems, and computational biology, relational knowledge representation has emerged as a challenging research problem, where there is a need to represent changing nodes, attributes, and edges over time. The evolution of search engine responses to user queries in the last few years is partly due to the role of KGs such as the Google KG. KGs contribute significantly to various AI applications, from link prediction, entity relation prediction, and node classification to recommendation and question answering systems. This article is an attempt to summarize the journey of KGs for AI. |
Tasks | Knowledge Graphs, Link Prediction, Node Classification, Question Answering, Recommendation Systems |
Published | 2020-03-07 |
URL | https://arxiv.org/abs/2003.03623v1 |
https://arxiv.org/pdf/2003.03623v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-graphs-and-knowledge-networks-the |
Repo | |
Framework | |
Differentiable Bandit Exploration
Title | Differentiable Bandit Exploration |
Authors | Craig Boutilier, Chih-Wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer |
Abstract | We learn bandit policies that maximize the average reward over bandit instances drawn from an unknown distribution $\mathcal{P}$, from a sample from $\mathcal{P}$. Our approach is an instance of meta-learning and its appeal is that the properties of $\mathcal{P}$ can be exploited without restricting it. We parameterize our policies in a differentiable way and optimize them by policy gradients - an approach that is easy to implement and pleasantly general. Then the challenge is to design effective gradient estimators and good policy classes. To make policy gradients practical, we introduce novel variance reduction techniques. We experiment with various bandit policy classes, including neural networks and a novel soft-elimination policy. The latter has regret guarantees and is a natural starting point for our optimization. Our experiments highlight the versatility of our approach. We also observe that neural network policies can learn implicit biases, which are only expressed through sampled bandit instances during training. |
Tasks | Meta-Learning |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.06772v1 |
https://arxiv.org/pdf/2002.06772v1.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-bandit-exploration |
Repo | |
Framework | |
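The overall setup, sampling bandit instances from an unknown distribution and improving a parameterized exploration policy by policy gradients, can be sketched as follows. The softmax policy class, the instance distribution, and all hyperparameters here are illustrative assumptions; the paper's soft-elimination policy and variance-reduction techniques are not reproduced:

```python
import numpy as np

rng = np.random.default_rng(1)

def run_episode(theta, means, horizon=50):
    """Play one bandit instance with a softmax policy whose preferences are
    theta-weighted empirical arm means, and accumulate the REINFORCE
    (score-function) gradient of the chosen actions' log-probabilities."""
    K = len(means)
    counts, sums = np.zeros(K), np.zeros(K)
    grad, total = np.zeros(K), 0.0
    for _ in range(horizon):
        emp = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
        prefs = theta * emp
        p = np.exp(prefs - prefs.max()); p /= p.sum()     # softmax policy
        a = rng.choice(K, p=p)
        r = rng.normal(means[a], 0.5)
        grad += emp * (np.eye(K)[a] - p)                  # d log pi(a) / d theta
        counts[a] += 1; sums[a] += r; total += r
    return total, grad

# meta-learning: policy gradient over instances drawn from the (unknown
# to the learner) instance distribution P
theta = np.ones(3)
for _ in range(200):
    means = rng.uniform(0.0, 1.0, size=3)  # sample a bandit instance from P
    R, g = run_episode(theta, means)
    theta += 0.01 * (R / 50.0) * g         # REINFORCE update on episode reward
```

Everything here is differentiable in theta through the softmax, which is exactly what makes the average-reward objective over sampled instances amenable to plain policy-gradient optimization.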