Paper Group ANR 881
Multilingual Sentiment Analysis: An RNN-Based Framework for Limited Data
Title | Multilingual Sentiment Analysis: An RNN-Based Framework for Limited Data |
Authors | Ethem F. Can, Aysu Ezen-Can, Fazli Can |
Abstract | Sentiment analysis is a widely studied NLP task where the goal is to determine opinions, emotions, and evaluations of users towards a product, an entity or a service that they are reviewing. One of the biggest challenges for sentiment analysis is that it is highly language dependent. Word embeddings, sentiment lexicons, and even annotated data are language specific. Further, optimizing models for each language is very time consuming and labor intensive especially for recurrent neural network models. From a resource perspective, it is very challenging to collect data for different languages. In this paper, we look for an answer to the following research question: can a sentiment analysis model trained on a language be reused for sentiment analysis in other languages, Russian, Spanish, Turkish, and Dutch, where the data is more limited? Our goal is to build a single model in the language with the largest dataset available for the task, and reuse it for languages that have limited resources. For this purpose, we train a sentiment analysis model using recurrent neural networks with reviews in English. We then translate reviews in other languages and reuse this model to evaluate the sentiments. Experimental results show that our robust approach of single model trained on English reviews statistically significantly outperforms the baselines in several different languages. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2018-06-08 |
URL | http://arxiv.org/abs/1806.04511v1 |
http://arxiv.org/pdf/1806.04511v1.pdf | |
PWC | https://paperswithcode.com/paper/multilingual-sentiment-analysis-an-rnn-based |
Repo | |
Framework | |
Detecting Offensive Content in Open-domain Conversations using Two Stage Semi-supervision
Title | Detecting Offensive Content in Open-domain Conversations using Two Stage Semi-supervision |
Authors | Chandra Khatri, Behnam Hedayatnia, Rahul Goel, Anushree Venkatesh, Raefer Gabriel, Arindam Mandal |
Abstract | As open-ended human-chatbot interaction becomes commonplace, sensitive content detection gains importance. In this work, we propose a two stage semi-supervised approach to bootstrap large-scale data for automatic sensitive language detection from publicly available web resources. We explore various data selection methods including 1) using a blacklist to rank online discussion forums by the level of their sensitiveness followed by randomly sampling utterances and 2) training a weakly supervised model in conjunction with the blacklist for scoring sentences from online discussion forums to curate a dataset. Our data collection strategy is flexible and allows the models to detect implicit sensitive content for which manual annotations may be difficult. We train models using publicly available annotated datasets as well as using the proposed large-scale semi-supervised datasets. We evaluate the performance of all the models on Twitter and Toxic Wikipedia comments testsets as well as on a manually annotated spoken language dataset collected during a large scale chatbot competition. Results show that a model trained on this collected data outperforms the baseline models by a large margin on both in-domain and out-of-domain testsets, achieving an F1 score of 95.5% on an out-of-domain testset compared to a score of 75% for models trained on public datasets. We also showcase that large scale two stage semi-supervision generalizes well across multiple classes of sensitivities such as hate speech, racism, sexual and pornographic content, etc. without even providing explicit labels for these classes, leading to an average recall of 95.5% versus the models trained using annotated public datasets which achieve an average recall of 73.2% across seven sensitive classes on out-of-domain testsets. |
Tasks | Chatbot |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1811.12900v1 |
http://arxiv.org/pdf/1811.12900v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-offensive-content-in-open-domain |
Repo | |
Framework | |
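The first data selection method in the abstract above (ranking discussion forums by sensitiveness using a blacklist) can be illustrated with a minimal sketch. The scoring rule used here, the fraction of utterances containing at least one blacklisted token, along with whitespace tokenization and the names `forum_sensitivity` and `rank_forums`, are assumptions made for illustration; the abstract does not specify how the ranking is computed.

```python
def forum_sensitivity(utterances, blacklist):
    """Fraction of utterances containing at least one blacklisted token.

    Assumption: lowercase whitespace tokenization; the paper's actual
    scoring rule is not given in the abstract.
    """
    if not utterances:
        return 0.0
    flagged = sum(
        1 for u in utterances
        if any(token in blacklist for token in u.lower().split())
    )
    return flagged / len(utterances)


def rank_forums(forums, blacklist):
    """Return forum names ordered from most to least sensitive."""
    return sorted(
        forums,
        key=lambda name: forum_sensitivity(forums[name], blacklist),
        reverse=True,
    )


# Hypothetical toy data: forum name -> list of utterances.
forums = {
    "a": ["hello there", "nice day"],
    "b": ["you badword", "ok"],
    "c": ["badword one", "badword two"],
}
blacklist = {"badword"}
ranking = rank_forums(forums, blacklist)
```

Utterances would then be sampled at random from the highest-ranked forums, as the abstract describes.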
ReviewQA: a relational aspect-based opinion reading dataset
Title | ReviewQA: a relational aspect-based opinion reading dataset |
Authors | Quentin Grail, Julien Perez |
Abstract | Deep reading models for question-answering have demonstrated promising performance over the last couple of years. However, current systems tend to learn how to cleverly extract a span of the source document, based on its similarity with the question, instead of seeking the appropriate answer. Indeed, a reading machine should be able to detect relevant passages in a document regarding a question, but more importantly, it should be able to reason over the important pieces of the document in order to produce an answer when it is required. To motivate this purpose, we present ReviewQA, a question-answering dataset based on hotel reviews. The questions of this dataset are linked to a set of relational understanding competencies that we expect a model to master. Indeed, each question comes with an associated type that characterizes the required competency. With this framework, it is possible to benchmark the main families of models and to get an overview of the strengths and weaknesses of a given model on the set of tasks evaluated in this dataset. Our corpus contains more than 500,000 questions in natural language over 100,000 hotel reviews. Our setup is projective: the answer to a question does not need to be extracted from a document, as in most recent datasets, but is selected from a set of candidates that contains all the possible answers to the questions of the dataset. Finally, we present several baselines over this dataset. |
Tasks | Question Answering |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12196v1 |
http://arxiv.org/pdf/1810.12196v1.pdf | |
PWC | https://paperswithcode.com/paper/reviewqa-a-relational-aspect-based-opinion |
Repo | |
Framework | |
OIL: Observational Imitation Learning
Title | OIL: Observational Imitation Learning |
Authors | Guohao Li, Matthias Müller, Vincent Casser, Neil Smith, Dominik L. Michels, Bernard Ghanem |
Abstract | Recent work has explored the problem of autonomous navigation by imitating a teacher and learning an end-to-end policy, which directly predicts controls from raw images. However, these approaches tend to be sensitive to mistakes by the teacher and do not scale well to other environments or vehicles. To this end, we propose Observational Imitation Learning (OIL), a novel imitation learning variant that supports online training and automatic selection of optimal behavior by observing multiple imperfect teachers. We apply our proposed methodology to the challenging problems of autonomous driving and UAV racing. For both tasks, we utilize the Sim4CV simulator that enables the generation of large amounts of synthetic training data and also allows for online learning and evaluation. We train a perception network to predict waypoints from raw image data and use OIL to train another network to predict controls from these waypoints. Extensive experiments demonstrate that our trained network outperforms its teachers, conventional imitation learning (IL) and reinforcement learning (RL) baselines and even humans in simulation. The project website is available at https://sites.google.com/kaust.edu.sa/oil/ and a video at https://youtu.be/_rhq8a0qgeg |
Tasks | Autonomous Driving, Autonomous Navigation, Imitation Learning |
Published | 2018-03-03 |
URL | https://arxiv.org/abs/1803.01129v3 |
https://arxiv.org/pdf/1803.01129v3.pdf | |
PWC | https://paperswithcode.com/paper/oil-observational-imitation-learning |
Repo | |
Framework | |
Teaching machines to understand data science code by semantic enrichment of dataflow graphs
Title | Teaching machines to understand data science code by semantic enrichment of dataflow graphs |
Authors | Evan Patterson, Ioana Baldini, Aleksandra Mojsilovic, Kush R. Varshney |
Abstract | Your computer is continuously executing programs, but does it really understand them? Not in any meaningful sense. That burden falls upon human knowledge workers, who are increasingly asked to write and understand code. They deserve to have intelligent tools that reveal the connections between code and its subject matter. Towards this prospect, we develop an AI system that forms semantic representations of computer programs, using techniques from knowledge representation and program analysis. To create the representations, we introduce an algorithm for enriching dataflow graphs with semantic information. The semantic enrichment algorithm is undergirded by a new ontology language for modeling computer programs and a new ontology about data science, written in this language. Throughout the paper, we focus on code written by data scientists and we locate our work within a larger movement towards collaborative, open, and reproducible science. |
Tasks | |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.05691v2 |
http://arxiv.org/pdf/1807.05691v2.pdf | |
PWC | https://paperswithcode.com/paper/teaching-machines-to-understand-data-science |
Repo | |
Framework | |
Multi-atomic Annealing Heuristic for Static Dial-a-ride Problem
Title | Multi-atomic Annealing Heuristic for Static Dial-a-ride Problem |
Authors | Song Guang Ho, Ramesh Ramasamy Pandi, Sarat Chandra Nagavarapu, Justin Dauwels |
Abstract | The dial-a-ride problem (DARP) deals with the transportation of users between pickup and drop-off locations associated with specified time windows. This paper proposes a novel algorithm called multi-atomic annealing (MATA) to solve the static dial-a-ride problem. Two new local search operators (burn and reform), a new construction heuristic and two request sequencing mechanisms (Sorted List and Random List) are developed. Computational experiments conducted on various standard DARP test instances show that MATA is an expeditious meta-heuristic compared with other existing methods. In all experiments, MATA demonstrates the capability to obtain high quality solutions, faster convergence, and quicker attainment of a first feasible solution. It is observed that MATA attains a first feasible solution 29.8 to 65.1% faster, and obtains a final solution that is 3.9 to 5.2% better, when compared to other algorithms within 60 sec. |
Tasks | |
Published | 2018-06-29 |
URL | http://arxiv.org/abs/1807.02406v1 |
http://arxiv.org/pdf/1807.02406v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-atomic-annealing-heuristic-for-static |
Repo | |
Framework | |
Adaptive Blending Units: Trainable Activation Functions for Deep Neural Networks
Title | Adaptive Blending Units: Trainable Activation Functions for Deep Neural Networks |
Authors | Leon René Sütfeld, Flemming Brieger, Holger Finger, Sonja Füllhase, Gordon Pipa |
Abstract | The most widely used activation functions in current deep feed-forward neural networks are rectified linear units (ReLU), and many alternatives have been successfully applied, as well. However, none of the alternatives have managed to consistently outperform the rest and there is no unified theory connecting properties of the task and network with properties of activation functions for most efficient training. A possible solution is to have the network learn its preferred activation functions. In this work, we introduce Adaptive Blending Units (ABUs), a trainable linear combination of a set of activation functions. Since ABUs learn the shape, as well as the overall scaling of the activation function, we also analyze the effects of adaptive scaling in common activation functions. We experimentally demonstrate advantages of both adaptive scaling and ABUs over common activation functions across a set of systematically varied network specifications. We further show that adaptive scaling works by mitigating covariate shifts during training, and that the observed advantages in performance of ABUs likewise rely largely on the activation function’s ability to adapt over the course of training. |
Tasks | |
Published | 2018-06-26 |
URL | http://arxiv.org/abs/1806.10064v1 |
http://arxiv.org/pdf/1806.10064v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-blending-units-trainable-activation |
Repo | |
Framework | |
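The core idea in the abstract above, an activation defined as a trainable linear combination of a set of activation functions, can be sketched in a few lines. This is a minimal illustration only: the choice of base functions, the uniform initialization, and the class name `BlendedActivation` are assumptions, not details taken from the paper.

```python
import math


def relu(x):
    return max(0.0, x)


def identity(x):
    return x


class BlendedActivation:
    """Weighted sum of fixed base activations.

    The blending weights are the trainable parameters. Uniform
    initialization (1/n each) is an assumption for this sketch.
    """

    def __init__(self, functions, weights=None):
        self.functions = list(functions)
        n = len(self.functions)
        self.weights = list(weights) if weights is not None else [1.0 / n] * n

    def __call__(self, x):
        # output = sum_i w_i * f_i(x)
        return sum(w * f(x) for w, f in zip(self.weights, self.functions))

    def grad_weights(self, x):
        # d(output)/d(w_i) = f_i(x), which is what a gradient step
        # on the blending weights would use.
        return [f(x) for f in self.functions]


abu = BlendedActivation([relu, math.tanh, identity], weights=[0.5, 0.0, 0.5])
```

Because the output is linear in the weights, the gradient with respect to each weight is simply the corresponding base activation's value, so the blend can be trained jointly with the rest of the network by ordinary backpropagation.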
Ontology-based Fuzzy Markup Language Agent for Student and Robot Co-Learning
Title | Ontology-based Fuzzy Markup Language Agent for Student and Robot Co-Learning |
Authors | Chang-Shing Lee, Mei-Hui Wang, Tzong-Xiang Huang, Li-Chung Chen, Yung-Ching Huang, Sheng-Chi Yang, Chien-Hsun Tseng, Pi-Hsia Hung, Naoyuki Kubota |
Abstract | An intelligent robot agent based on domain ontology, machine learning mechanism, and Fuzzy Markup Language (FML) for students and robot co-learning is presented in this paper. The machine-human co-learning model is established to help various students learn the mathematical concepts based on their learning ability and performance. Meanwhile, the robot acts as a teacher’s assistant to co-learn with children in the class. The FML-based knowledge base and rule base are embedded in the robot so that the teachers can get feedback from the robot on whether students make progress or not. Next, we infer students’ learning performance based on the learning content’s difficulty and the students’ ability, concentration level, and teamwork spirit in the class. Experimental results show that learning with the robot is helpful for disadvantaged and below-basic children. Moreover, the accuracy of the intelligent FML-based agent for student learning increases after the machine learning mechanism is applied. |
Tasks | |
Published | 2018-01-26 |
URL | http://arxiv.org/abs/1801.08650v1 |
http://arxiv.org/pdf/1801.08650v1.pdf | |
PWC | https://paperswithcode.com/paper/ontology-based-fuzzy-markup-language-agent |
Repo | |
Framework | |
Inferring Remote Channel State Information: Cramér-Rao Lower Bound and Deep Learning Implementation
Title | Inferring Remote Channel State Information: Cramér-Rao Lower Bound and Deep Learning Implementation |
Authors | Zhiyuan Jiang, Ziyan He, Sheng Chen, Andreas F. Molisch, Sheng Zhou, Zhisheng Niu |
Abstract | Channel state information (CSI) is of vital importance in wireless communication systems. Existing CSI acquisition methods usually rely on pilot transmissions, and geographically separated base stations (BSs) with non-correlated CSI need to be assigned orthogonal pilots, which occupy excessive system resources. Our previous work adopts a data-driven deep learning based approach which leverages the CSI at a local BS to infer the CSI remotely; however, the relevance of CSI between separated BSs is not specified explicitly. In this paper, we exploit a model-based methodology to derive the Cramér-Rao lower bound (CRLB) of remote CSI inference given the local CSI. Although the model is simplified, the derived CRLB explicitly illustrates the relationship between the inference performance and several key system parameters, e.g., terminal distance and antenna array size. In particular, it shows that by leveraging multiple local BSs, the inference error exhibits a larger power-law decay rate (w.r.t. number of antennas), compared with a single local BS; this explains and validates our findings in evaluating the deep-neural-network-based (DNN-based) CSI inference. We further improve on the DNN-based method by employing dropout and deeper networks, and show an inference performance of approximately 90% accuracy in a realistic scenario with CSI generated by a ray-tracing simulator. |
Tasks | |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01223v1 |
http://arxiv.org/pdf/1812.01223v1.pdf | |
PWC | https://paperswithcode.com/paper/inferring-remote-channel-state-information |
Repo | |
Framework | |
Multi-Modal Coreference Resolution with the Correlation between Space Structures
Title | Multi-Modal Coreference Resolution with the Correlation between Space Structures |
Authors | Qibin Zheng, Xingchun Diao, Jianjun Cao, Xiaolei Zhou, Yi Liu, Hongmei Li |
Abstract | Multi-modal data is becoming more common in the big data setting. Finding semantically similar objects from different modalities is one of the core problems of multi-modal learning. Most of the current methods try to learn the inter-modal correlation with extrinsic supervised information, while the intrinsic structural information of each modality is neglected. The performance of these methods heavily depends on the richness of training samples. However, obtaining multi-modal training samples is still labor- and cost-intensive work. In this paper, we bring an extrinsic correlation between the space structures of each modality into coreference resolution. With this correlation, a semi-supervised learning model for multi-modal coreference resolution is proposed. We first extract high-level features of images and text, then compute the distances of each object from some reference points to build the space structure of each modality. With a shared reference point set, the space structures of each modality are correlated. We employ the correlation to build a commonly shared space in which the semantic distance between multi-modal objects can be computed directly. The experiments on two multi-modal datasets show that our model performs better than the existing methods with insufficient training data. |
Tasks | Coreference Resolution |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.08010v2 |
http://arxiv.org/pdf/1804.08010v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-modal-coreference-resolution-with-the |
Repo | |
Framework | |
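The reference-point construction described in the abstract above can be illustrated roughly as follows. Each object is re-described by its distances to a set of reference points in its own modality; because the reference set is shared (each reference is assumed to have both an image feature and a text feature), the resulting distance vectors have the same length and can be compared directly. The use of Euclidean distance and the function names are assumptions for this sketch; the paper's actual construction of the shared space may differ.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def structure_vector(obj, reference_points):
    """Describe an object by its distances to the reference points
    of its own modality."""
    return [euclidean(obj, r) for r in reference_points]


def cross_modal_distance(image_feat, text_feat, image_refs, text_refs):
    """Compare an image and a text through their structure vectors.

    image_refs[i] and text_refs[i] are assumed to be the two views
    of the same shared reference object.
    """
    return euclidean(
        structure_vector(image_feat, image_refs),
        structure_vector(text_feat, text_refs),
    )


# Hypothetical toy features: 2-D image features, 3-D text features.
image_refs = [(0.0, 0.0), (3.0, 4.0)]
text_refs = [(0.0, 0.0, 0.0), (0.0, 3.0, 4.0)]
same = cross_modal_distance((0.0, 0.0), (0.0, 0.0, 0.0), image_refs, text_refs)
different = cross_modal_distance((3.0, 4.0), (0.0, 0.0, 0.0), image_refs, text_refs)
```

Note that the raw image and text features have different dimensionality and are never compared directly; only their distance profiles relative to the shared references are.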
A theoretical guideline for designing an effective adaptive particle swarm
Title | A theoretical guideline for designing an effective adaptive particle swarm |
Authors | Mohammad Reza Bonyadi |
Abstract | In this paper we theoretically investigate underlying assumptions that have been used for designing adaptive particle swarm optimization algorithms in the past years. We relate these assumptions to the movement patterns of particles controlled by coefficient values (inertia weight and acceleration coefficients) and introduce three factors, namely the autocorrelation of the particle positions, the average movement distance of the particle in each iteration, and the focus of the search, that describe these movement patterns. We show how these factors represent movement patterns of a particle within a swarm and how they are affected by particle coefficients (i.e., inertia weight and acceleration coefficients). We derive equations that provide exact coefficient values to guarantee achieving a desired movement pattern defined by these three factors within a swarm. We then relate these movements to the searching capability of particles and provide a guideline for designing potentially successful adaptive methods to control coefficients in particle swarm. Finally, we propose a new simple time-adaptive particle swarm and compare its results with previous adaptive particle swarm approaches. Our experiments show that the theoretical findings indeed provide a beneficial guideline for successful adaptation of the coefficients in the particle swarm optimization algorithm. |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04855v1 |
http://arxiv.org/pdf/1802.04855v1.pdf | |
PWC | https://paperswithcode.com/paper/a-theoretical-guideline-for-designing-an |
Repo | |
Framework | |
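For readers unfamiliar with the coefficients named in the abstract above, the following sketch shows where the inertia weight and the two acceleration coefficients enter the standard particle swarm update. It illustrates the textbook update rule only, not the adaptive scheme proposed in the paper; the default coefficient values are illustrative assumptions.

```python
import random


def pso_step(position, velocity, personal_best, global_best,
             inertia=0.7, c_personal=1.5, c_global=1.5, rng=random):
    """One standard PSO update for a single particle.

    Per dimension:
        v <- inertia * v + c_personal * r1 * (pbest - x)
                         + c_global   * r2 * (gbest - x)
        x <- x + v
    with r1, r2 drawn uniformly from [0, 1).
    Default coefficient values are illustrative assumptions.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        v_next = (inertia * v
                  + c_personal * r1 * (p - x)
                  + c_global * r2 * (g - x))
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity


# When the particle already sits on both attractors, only the inertia
# term contributes, so the step is deterministic.
pos, vel = pso_step([0.0, 0.0], [1.0, -2.0], [0.0, 0.0], [0.0, 0.0],
                    inertia=0.5)
```

A larger inertia weight keeps the particle moving along its previous direction, while larger acceleration coefficients pull it more strongly toward the personal and global best positions; these are the knobs whose effect on movement patterns the paper analyzes.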
Security Theater: On the Vulnerability of Classifiers to Exploratory Attacks
Title | Security Theater: On the Vulnerability of Classifiers to Exploratory Attacks |
Authors | Tegjyot Singh Sethi, Mehmed Kantardzic, Joung Woo Ryu |
Abstract | The increasing scale and sophistication of cyberattacks has led to the adoption of machine learning based classification techniques, at the core of cybersecurity systems. These techniques promise scale and accuracy, which traditional rule or signature based methods cannot. However, classifiers operating in adversarial domains are vulnerable to evasion attacks by an adversary, who is capable of learning the behavior of the system by employing intelligently crafted probes. Classification accuracy in such domains provides a false sense of security, as detection can easily be evaded by carefully perturbing the input samples. In this paper, a generic data driven framework is presented, to analyze the vulnerability of classification systems to black box probing based attacks. The framework uses an exploration exploitation based strategy, to understand an adversary’s point of view of the attack defense cycle. The adversary assumes a black box model of the defender’s classifier and can launch indiscriminate attacks on it, without information of the defender’s model type, training data or the domain of application. Experimental evaluation on 10 real world datasets demonstrates that even models having high perceived accuracy (>90%), by a defender, can be effectively circumvented with a high evasion rate (>95%, on average). The detailed attack algorithms, adversarial model and empirical evaluation, serve. |
Tasks | |
Published | 2018-03-24 |
URL | http://arxiv.org/abs/1803.09163v1 |
http://arxiv.org/pdf/1803.09163v1.pdf | |
PWC | https://paperswithcode.com/paper/security-theater-on-the-vulnerability-of |
Repo | |
Framework | |
Rethinking Monocular Depth Estimation with Adversarial Training
Title | Rethinking Monocular Depth Estimation with Adversarial Training |
Authors | Richard Chen, Faisal Mahmood, Alan Yuille, Nicholas J. Durr |
Abstract | Monocular depth estimation is an extensively studied computer vision problem with a vast variety of applications. Deep learning-based methods have demonstrated promise for both supervised and unsupervised depth estimation from monocular images. Most existing approaches treat depth estimation as a regression problem with a local pixel-wise loss function. In this work, we innovate beyond existing approaches by using adversarial training to learn a context-aware, non-local loss function. Such an approach penalizes the joint configuration of predicted depth values at the patch-level instead of the pixel-level, which allows networks to incorporate more global information. In this framework, the generator learns a mapping between RGB images and its corresponding depth map, while the discriminator learns to distinguish depth map and RGB pairs from ground truth. This conditional GAN depth estimation framework is stabilized using spectral normalization to prevent mode collapse when learning from diverse datasets. We test this approach using a diverse set of generators that include U-Net and joint CNN-CRF. We benchmark this approach on the NYUv2, Make3D and KITTI datasets, and observe that adversarial training reduces relative error by several fold, achieving state-of-the-art performance. |
Tasks | Depth Estimation, Monocular Depth Estimation |
Published | 2018-08-22 |
URL | https://arxiv.org/abs/1808.07528v3 |
https://arxiv.org/pdf/1808.07528v3.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-monocular-depth-estimation-with |
Repo | |
Framework | |
TzK Flow - Conditional Generative Model
Title | TzK Flow - Conditional Generative Model |
Authors | Micha Livne, David J. Fleet |
Abstract | We introduce TzK (pronounced “task”), a conditional probability flow-based model that exploits attributes (e.g., style, class membership, or other side information) in order to learn tight conditional prior around manifolds of the target observations. The model is trained via approximated ML, and offers efficient approximation of arbitrary data sample distributions (similar to GAN and flow-based ML), and stable training (similar to VAE and ML), while avoiding variational approximations. TzK exploits meta-data to facilitate a bottleneck, similar to autoencoders, thereby producing a low-dimensional representation. Unlike autoencoders, the bottleneck does not limit model expressiveness, similar to flow-based ML. Supervised, unsupervised, and semi-supervised learning are supported by replacing missing observations with samples from learned priors. We demonstrate TzK by training jointly on MNIST and Omniglot datasets with minimal preprocessing, and weak supervision, with results comparable to state-of-the-art. |
Tasks | Omniglot |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.01837v4 |
http://arxiv.org/pdf/1811.01837v4.pdf | |
PWC | https://paperswithcode.com/paper/tzk-flow-conditional-generative-model |
Repo | |
Framework | |
Deep Bayesian Trust : A Dominant and Fair Incentive Mechanism for Crowd
Title | Deep Bayesian Trust : A Dominant and Fair Incentive Mechanism for Crowd |
Authors | Naman Goel, Boi Faltings |
Abstract | An important class of game-theoretic incentive mechanisms for eliciting effort from a crowd are the peer based mechanisms, in which workers are paid by matching their answers with one another. The other classic mechanism is to have the workers solve some gold standard tasks and pay them according to their accuracy on gold tasks. This mechanism ensures stronger incentive compatibility than the peer based mechanisms but assigning gold tasks to all workers becomes inefficient at large scale. We propose a novel mechanism that assigns gold tasks to only a few workers and exploits transitivity to derive accuracy of the rest of the workers from their peers’ accuracy. We show that the resulting mechanism ensures a dominant notion of incentive compatibility and fairness. |
Tasks | |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05560v2 |
http://arxiv.org/pdf/1804.05560v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-bayesian-trust-a-dominant-and-fair |
Repo | |
Framework | |