October 18, 2019


Paper Group ANR 652

An Automatic System for Unconstrained Video-Based Face Recognition. Machine Common Sense Concept Paper. A Knowledge Hunting Framework for Common Sense Reasoning. Real-time Context-aware Learning System for IoT Applications. Multi-class Classification Model Inspired by Quantum Detection Theory. Quantifying Generalization in Reinforcement Learning. N …

An Automatic System for Unconstrained Video-Based Face Recognition


Title	An Automatic System for Unconstrained Video-Based Face Recognition
Authors	Jingxiao Zheng, Rajeev Ranjan, Ching-Hui Chen, Jun-Cheng Chen, Carlos D. Castillo, Rama Chellappa
Abstract	Although deep learning approaches have achieved performance surpassing humans for still image-based face recognition, unconstrained video-based face recognition is still a challenging task due to the large volume of data to be processed and intra/inter-video variations in pose, illumination, occlusion, scene, blur, video quality, etc. In this work, we consider challenging scenarios for unconstrained video-based face recognition from multiple-shot videos and surveillance videos with low-quality frames. To handle these problems, we propose a robust and efficient system for unconstrained video-based face recognition, which is composed of modules for face/fiducial detection, face association, and face recognition. First, we use multi-scale single-shot face detectors to efficiently localize faces in videos. The detected faces are then grouped through carefully designed face association methods, especially for multi-shot videos. Finally, the faces are recognized by the proposed face matcher based on an unsupervised subspace learning approach and a subspace-to-subspace similarity metric. Extensive experiments on challenging video datasets, such as Multiple Biometric Grand Challenge (MBGC), Face and Ocular Challenge Series (FOCS), IARPA Janus Surveillance Video Benchmark (IJB-S) for low-quality surveillance videos and IARPA JANUS Benchmark B (IJB-B) for multiple-shot videos, demonstrate that the proposed system can accurately detect and associate faces from unconstrained videos and effectively learn robust and discriminative features for recognition.
Tasks	Face Recognition
Published	2018-12-10
URL	https://arxiv.org/abs/1812.04058v3
PDF	https://arxiv.org/pdf/1812.04058v3.pdf
PWC	https://paperswithcode.com/paper/an-automatic-system-for-unconstrained-video
Repo
Framework

Machine Common Sense Concept Paper


Title	Machine Common Sense Concept Paper
Authors	David Gunning
Abstract	This paper summarizes some of the technical background, research ideas, and possible development strategies for achieving machine common sense. Machine common sense has long been a critical-but-missing component of Artificial Intelligence (AI). Recent advances in machine learning have resulted in new AI capabilities, but in all of these applications, machine reasoning is narrow and highly specialized. Developers must carefully train or program systems for every situation. General commonsense reasoning remains elusive. The absence of common sense prevents intelligent systems from understanding their world, behaving reasonably in unforeseen situations, communicating naturally with people, and learning from new experiences. Its absence is perhaps the most significant barrier between the narrowly focused AI applications we have today and the more general, human-like AI systems we would like to build in the future. Machine common sense remains a broad, potentially unbounded problem in AI. There is a wide range of strategies that could be employed to make progress on this difficult challenge. This paper discusses two diverse strategies for focusing development on two different machine commonsense services: (1) a service that learns from experience, like a child, to construct computational models that mimic the core domains of child cognition for objects (intuitive physics), agents (intentional actors), and places (spatial navigation); and (2) a service that learns from reading the Web, like a research librarian, to construct a commonsense knowledge repository capable of answering natural language and image-based questions about commonsense phenomena.
Tasks	Common Sense Reasoning
Published	2018-10-17
URL	http://arxiv.org/abs/1810.07528v1
PDF	http://arxiv.org/pdf/1810.07528v1.pdf
PWC	https://paperswithcode.com/paper/machine-common-sense-concept-paper
Repo
Framework

A Knowledge Hunting Framework for Common Sense Reasoning


Title	A Knowledge Hunting Framework for Common Sense Reasoning
Authors	Ali Emami, Noelia De La Cruz, Adam Trischler, Kaheer Suleman, Jackie Chi Kit Cheung
Abstract	We introduce an automatic system that achieves state-of-the-art results on the Winograd Schema Challenge (WSC), a common sense reasoning task that requires diverse, complex forms of inference and knowledge. Our method uses a knowledge hunting module to gather text from the web, which serves as evidence for candidate problem resolutions. Given an input problem, our system generates relevant queries to send to a search engine, then extracts and classifies knowledge from the returned results and weighs them to make a resolution. Our approach improves F1 performance on the full WSC by 0.21 over the previous best and represents the first system to exceed 0.5 F1. We further demonstrate that the approach is competitive on the Choice of Plausible Alternatives (COPA) task, which suggests that it is generally applicable.
Tasks	Common Sense Reasoning
Published	2018-10-02
URL	http://arxiv.org/abs/1810.01375v1
PDF	http://arxiv.org/pdf/1810.01375v1.pdf
PWC	https://paperswithcode.com/paper/a-knowledge-hunting-framework-for-common
Repo
Framework

Real-time Context-aware Learning System for IoT Applications


Title	Real-time Context-aware Learning System for IoT Applications
Authors	Bhaskar Das, Jalal Almhana
Abstract	We propose a real-time context-aware learning system, along with its architecture, that runs on mobile devices, provides services to the user, and manages IoT devices. In this system, an application running on mobile devices collects data from the sensors, learns the user-defined context, makes predictions in real time, and manages IoT devices accordingly. However, the limited computational power of mobile devices makes it challenging to run machine learning algorithms with acceptable accuracy. To solve this issue, some authors have run machine learning algorithms on a server and transmitted the results to the mobile devices. Although the context-aware predictions made by the server are more accurate than their mobile counterparts, this approach depends heavily on the network connection for delivering the results to the devices, which negatively affects real-time context learning. Therefore, in this work, we describe a context-learning algorithm for mobile devices that is less demanding on computational resources and maintains prediction accuracy by periodically updating itself with learning parameters obtained from the server. Experimental results show that the proposed lightweight context-learning algorithm achieves a mean accuracy of up to 97.51% with a mean execution time of only 11 ms.
Tasks
Published	2018-10-26
URL	http://arxiv.org/abs/1810.11295v1
PDF	http://arxiv.org/pdf/1810.11295v1.pdf
PWC	https://paperswithcode.com/paper/real-time-context-aware-learning-system-for
Repo
Framework

Multi-class Classification Model Inspired by Quantum Detection Theory


Title	Multi-class Classification Model Inspired by Quantum Detection Theory
Authors	Prayag Tiwari, Massimo Melucci
Abstract	Machine Learning has become very popular and assists in identifying patterns in raw data. Technological advancement has led to substantial improvements in Machine Learning, thus helping to improve prediction. Current Machine Learning models are based on Classical Theory, which can be replaced by Quantum Theory to improve the effectiveness of the model. In previous work, we developed a binary classifier inspired by Quantum Detection Theory. In this extended abstract, our main goal is to develop a multi-class classifier. The terms multinomial classification and multi-class classification generally refer to problems in which observations or instances are classified into one of three or more classes.
Tasks
Published	2018-10-10
URL	http://arxiv.org/abs/1810.04491v1
PDF	http://arxiv.org/pdf/1810.04491v1.pdf
PWC	https://paperswithcode.com/paper/multi-class-classification-model-inspired-by
Repo
Framework

Quantifying Generalization in Reinforcement Learning


Title	Quantifying Generalization in Reinforcement Learning
Authors	Karl Cobbe, Oleg Klimov, Chris Hesse, Taehoon Kim, John Schulman
Abstract	In this paper, we investigate the problem of overfitting in deep reinforcement learning. Among the most common benchmarks in RL, it is customary to use the same environments for both training and testing. This practice offers relatively little insight into an agent’s ability to generalize. We address this issue by using procedurally generated environments to construct distinct training and test sets. Most notably, we introduce a new environment called CoinRun, designed as a benchmark for generalization in RL. Using CoinRun, we find that agents overfit to surprisingly large training sets. We then show that deeper convolutional architectures improve generalization, as do methods traditionally found in supervised learning, including L2 regularization, dropout, data augmentation and batch normalization.
Tasks	Data Augmentation, L2 Regularization
Published	2018-12-06
URL	https://arxiv.org/abs/1812.02341v3
PDF	https://arxiv.org/pdf/1812.02341v3.pdf
PWC	https://paperswithcode.com/paper/quantifying-generalization-in-reinforcement
Repo
Framework
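
As a rough illustration of the train/test separation described in the abstract above, the sketch below draws two disjoint sets of level seeds, one for training and one for testing. It is only a sketch under stated assumptions: `split_level_seeds` and its parameters are hypothetical names, and the code neither uses the CoinRun API nor reproduces the authors' setup.

```python
import random


def split_level_seeds(num_train, num_test, seed=0):
    """Return two disjoint lists of level seeds (hypothetical helper).

    Sampling without replacement guarantees that no test level
    was seen during training.
    """
    rng = random.Random(seed)
    seeds = rng.sample(range(2**31), num_train + num_test)
    return seeds[:num_train], seeds[num_train:]


train_seeds, test_seeds = split_level_seeds(num_train=500, num_test=100)
# An agent would be trained only on levels generated from train_seeds
# and evaluated only on levels generated from test_seeds.
```

Varying `num_train` in such a setup is one way to study how the size of the training set affects the gap between training and test performance.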

Normalization of Neural Networks using Analytic Variance Propagation


Title	Normalization of Neural Networks using Analytic Variance Propagation
Authors	Alexander Shekhovtsov, Boris Flach
Abstract	We address the problem of estimating statistics of hidden units in a neural network using a method of analytic moment propagation. These statistics are useful for approximate whitening of the inputs in front of saturating non-linearities such as a sigmoid function. This is important for initialization of training and for reducing the accumulated scale and bias dependencies (compensating covariate shift), which presumably eases the learning. In batch normalization, which is currently a very widely applied technique, sample estimates of statistics of hidden units over a batch are used. The proposed estimation uses an analytic propagation of mean and variance of the training set through the network. The result depends on the network structure and its current weights but not on the specific batch input. The estimates are suitable for initialization and normalization, efficient to compute and independent of the batch size. The experimental verification well supports these claims. However, the method does not share the generalization properties of BN, to which our experiments give some additional insight.
Tasks
Published	2018-03-28
URL	http://arxiv.org/abs/1803.10560v1
PDF	http://arxiv.org/pdf/1803.10560v1.pdf
PWC	https://paperswithcode.com/paper/normalization-of-neural-networks-using
Repo
Framework
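
To make the idea of analytic moment propagation more concrete, here is a minimal sketch for a single linear layer. It assumes the layer inputs are uncorrelated, which is a simplification made here for illustration; propagation through non-linearities, which the paper also needs, is not shown. `propagate_linear` is a hypothetical helper name.

```python
def propagate_linear(mean, var, weights, bias):
    """Propagate per-unit mean and variance through y = W x + b.

    Assumption: inputs are uncorrelated, so
    E[y_j]   = sum_i W[j][i] * mean[i] + b[j]
    Var[y_j] = sum_i W[j][i]**2 * var[i]
    """
    out_mean = [
        sum(w * m for w, m in zip(row, mean)) + b
        for row, b in zip(weights, bias)
    ]
    out_var = [sum(w * w * v for w, v in zip(row, var)) for row in weights]
    return out_mean, out_var


# Statistics of the layer input (e.g. estimated once over the training set).
in_mean, in_var = [1.0, 2.0], [0.5, 2.0]
W = [[1.0, -1.0], [2.0, 0.0]]
b = [0.5, 0.0]

out_mean, out_var = propagate_linear(in_mean, in_var, W, b)
```

Because the output statistics depend only on the input statistics and the current weights, they do not change with the batch that happens to be fed to the network, which is the property the abstract emphasizes.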

Affordance Extraction and Inference based on Semantic Role Labeling


Title	Affordance Extraction and Inference based on Semantic Role Labeling
Authors	Daniel Loureiro, Alípio Mário Jorge
Abstract	Common-sense reasoning is becoming increasingly important for the advancement of Natural Language Processing. While word embeddings have been very successful, they cannot explain which aspects of ‘coffee’ and ‘tea’ make them similar, or how they could be related to ‘shop’. In this paper, we propose an explicit word representation that builds upon the Distributional Hypothesis to represent meaning from semantic roles, and allow inference of relations from their meshing, as supported by the affordance-based Indexical Hypothesis. We find that our model improves the state-of-the-art on unsupervised word similarity tasks while allowing for direct inference of new relations from the same vector space.
Tasks	Common Sense Reasoning, Semantic Role Labeling, Word Embeddings
Published	2018-09-03
URL	http://arxiv.org/abs/1809.00589v1
PDF	http://arxiv.org/pdf/1809.00589v1.pdf
PWC	https://paperswithcode.com/paper/affordance-extraction-and-inference-based-on
Repo
Framework

Efficient and Robust Question Answering from Minimal Context over Documents


Title	Efficient and Robust Question Answering from Minimal Context over Documents
Authors	Sewon Min, Victor Zhong, Richard Socher, Caiming Xiong
Abstract	Neural models for question answering (QA) over documents have achieved significant performance improvements. Although effective, these models do not scale to large corpora due to their complex modeling of interactions between the document and the question. Moreover, recent work has shown that such models are sensitive to adversarial inputs. In this paper, we study the minimal context required to answer the question, and find that most questions in existing datasets can be answered with a small set of sentences. Inspired by this observation, we propose a simple sentence selector to select the minimal set of sentences to feed into the QA model. Our overall system achieves significant reductions in training (up to 15 times) and inference times (up to 13 times), with accuracy comparable to or better than the state-of-the-art on SQuAD, NewsQA, TriviaQA and SQuAD-Open. Furthermore, our experimental results and analyses show that our approach is more robust to adversarial inputs.
Tasks	Question Answering, Reading Comprehension
Published	2018-05-21
URL	http://arxiv.org/abs/1805.08092v1
PDF	http://arxiv.org/pdf/1805.08092v1.pdf
PWC	https://paperswithcode.com/paper/efficient-and-robust-question-answering-from
Repo
Framework
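
The sentence-selection step can be illustrated with a deliberately simple stand-in. The abstract does not say how the selector scores sentences, so the sketch below ranks them by word overlap with the question; this scoring rule is an assumption made purely for illustration, and `tokenize` and `select_sentences` are hypothetical names.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_sentences(question, sentences, k=1):
    """Return the k sentences sharing the most words with the question.

    Ties are broken in favor of earlier sentences, and the selected
    sentences are returned in their original document order.
    """
    q = tokenize(question)
    scored = [(len(q & tokenize(s)), -i, s) for i, s in enumerate(sentences)]
    top = sorted(scored, reverse=True)[:k]
    return [s for _, _, s in sorted(top, key=lambda t: -t[1])]


document = [
    "Paris is the capital of France.",
    "The Seine flows through Paris.",
    "Berlin is the capital of Germany.",
]
context = select_sentences("What is the capital of France?", document, k=1)
# Only `context`, not the full document, would be passed to the QA model.
```

Feeding the QA model a handful of sentences instead of the whole document is what would account for the reported reductions in training and inference time.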

Construction of the Literature Graph in Semantic Scholar


Title	Construction of the Literature Graph in Semantic Scholar
Authors	Waleed Ammar, Dirk Groeneveld, Chandra Bhagavatula, Iz Beltagy, Miles Crawford, Doug Downey, Jason Dunkelberger, Ahmed Elgohary, Sergey Feldman, Vu Ha, Rodney Kinney, Sebastian Kohlmeier, Kyle Lo, Tyler Murray, Hsu-Han Ooi, Matthew Peters, Joanna Power, Sam Skjonsberg, Lucy Lu Wang, Chris Wilhelm, Zheng Yuan, Madeleine van Zuylen, Oren Etzioni
Abstract	We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate algorithmic manipulation and discovery. The resulting literature graph consists of more than 280M nodes, representing papers, authors, entities and various interactions between them (e.g., authorships, citations, entity mentions). We reduce literature graph construction into familiar NLP tasks (e.g., entity extraction and linking), point out research challenges due to differences from standard formulations of these tasks, and report empirical results for each task. The methods described in this paper are used to enable semantic features in www.semanticscholar.org
Tasks	Entity Extraction, graph construction
Published	2018-05-06
URL	http://arxiv.org/abs/1805.02262v1
PDF	http://arxiv.org/pdf/1805.02262v1.pdf
PWC	https://paperswithcode.com/paper/construction-of-the-literature-graph-in
Repo
Framework

An Empirical Analysis of the Role of Amplifiers, Downtoners, and Negations in Emotion Classification in Microblogs


Title	An Empirical Analysis of the Role of Amplifiers, Downtoners, and Negations in Emotion Classification in Microblogs
Authors	Florian Strohm, Roman Klinger
Abstract	The effect of amplifiers, downtoners, and negations has been studied in general and particularly in the context of sentiment analysis. However, there is only limited work which aims at transferring the results and methods to discrete classes of emotions, e.g., joy, anger, fear, sadness, surprise, and disgust. For instance, it is not straightforward to interpret which emotion the phrase “not happy” expresses. With this paper, we aim at obtaining a better understanding of such modifiers in the context of emotion-bearing words and their impact on document-level emotion classification, namely, microposts on Twitter. We select an appropriate scope detection method for modifiers of emotion words, incorporate it in a document-level emotion classification model as an additional bag of words and show that this approach improves the performance of emotion classification. In addition, we build a term weighting approach based on the different modifiers into a lexical model for the analysis of the semantics of modifiers and their impact on emotion meaning. We show that amplifiers separate emotions expressed with an emotion-bearing word more clearly from other secondary connotations. Downtoners have the opposite effect. In addition, we discuss the meaning of negations of emotion-bearing words. For instance, we show empirically that “not happy” is closer to sadness than to anger and that fear-expressing words in the scope of downtoners often express surprise.
Tasks	Emotion Classification, Sentiment Analysis
Published	2018-08-31
URL	http://arxiv.org/abs/1808.10653v2
PDF	http://arxiv.org/pdf/1808.10653v2.pdf
PWC	https://paperswithcode.com/paper/an-empirical-analysis-of-the-role-of
Repo
Framework

A Multi-task Ensemble Framework for Emotion, Sentiment and Intensity Prediction


Title	A Multi-task Ensemble Framework for Emotion, Sentiment and Intensity Prediction
Authors	Md Shad Akhtar, Deepanway Ghosal, Asif Ekbal, Pushpak Bhattacharyya, Sadao Kurohashi
Abstract	In this paper, through a multi-task ensemble framework, we address three problems of emotion and sentiment analysis, i.e., “emotion classification & intensity”, “valence, arousal & dominance for emotion” and “valence & arousal for sentiment”. The underlying problems cover two granularities (i.e., coarse-grained and fine-grained) and a diverse range of domains (i.e., tweets, Facebook posts, news headlines, blogs, letters, etc.). The ensemble model aims to leverage the learned representations of three deep learning models (i.e., CNN, LSTM and GRU) and a hand-crafted feature representation for the predictions. Experimental results on the benchmark datasets show the efficacy of our proposed multi-task ensemble frameworks. We obtain an average performance improvement of 2-3 points over single-task systems for most of the problems and domains.
Tasks	Emotion Classification, Sentiment Analysis
Published	2018-08-03
URL	http://arxiv.org/abs/1808.01216v2
PDF	http://arxiv.org/pdf/1808.01216v2.pdf
PWC	https://paperswithcode.com/paper/a-multi-task-ensemble-framework-for-emotion
Repo
Framework

It’s all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data


Title	It’s all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data
Authors	Matteo Ruggero Ronchi, Oisin Mac Aodha, Robert Eng, Pietro Perona
Abstract	We address the problem of 3D human pose estimation from 2D input images using only weakly supervised training data. Despite showing considerable success for 2D pose estimation, the application of supervised machine learning to 3D pose estimation in real-world images is currently hampered by the lack of varied training images with corresponding 3D poses. Most existing 3D pose estimation algorithms train on data that has either been collected in carefully controlled studio settings or has been generated synthetically. Instead, we take a different approach, and propose a 3D human pose estimation algorithm that only requires relative estimates of depth at training time. Such a training signal, although noisy, can be easily collected from crowd annotators, and is of sufficient quality for enabling successful training and evaluation of 3D pose algorithms. Our results are competitive with fully supervised regression-based approaches on the Human3.6M dataset, despite using significantly weaker training data. Our proposed algorithm opens the door to using existing widespread 2D datasets for 3D pose estimation by allowing fine-tuning with noisy relative constraints, resulting in more accurate 3D poses.
Tasks	3D Human Pose Estimation, 3D Pose Estimation, Pose Estimation
Published	2018-05-17
URL	http://arxiv.org/abs/1805.06880v2
PDF	http://arxiv.org/pdf/1805.06880v2.pdf
PWC	https://paperswithcode.com/paper/its-all-relative-monocular-3d-human-pose
Repo
Framework
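
The abstract does not spell out the training loss. One common way to learn from relative depth labels is a pairwise ranking loss, sketched below as an assumption rather than as the authors' formulation. `relative_depth_loss`, `softplus`, and the label convention are hypothetical choices for this example.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def relative_depth_loss(z_a, z_b, label):
    """Pairwise ranking loss on the predicted depths of two joints.

    label = +1: joint a is annotated as closer to the camera than joint b
    label = -1: joint a is annotated as farther than joint b
    label =  0: the two joints are annotated as roughly equally deep
    """
    if label == 0:
        return (z_a - z_b) ** 2
    return softplus(label * (z_a - z_b))


# Prediction agrees with the annotation (a closer than b): small loss.
agree = relative_depth_loss(1.0, 3.0, +1)
# Prediction contradicts the annotation: larger loss.
disagree = relative_depth_loss(3.0, 1.0, +1)
```

A loss of this kind only needs the ordering of two joints, which is the sort of signal crowd annotators can provide without access to metric 3D ground truth.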

Falling Things: A Synthetic Dataset for 3D Object Detection and Pose Estimation


Title	Falling Things: A Synthetic Dataset for 3D Object Detection and Pose Estimation
Authors	Jonathan Tremblay, Thang To, Stan Birchfield
Abstract	We present a new dataset, called Falling Things (FAT), for advancing the state-of-the-art in object detection and 3D pose estimation in the context of robotics. By synthetically combining object models and backgrounds of complex composition and high graphical quality, we are able to generate photorealistic images with accurate 3D pose annotations for all objects in all images. Our dataset contains 60k annotated photos of 21 household objects taken from the YCB dataset. For each image, we provide the 3D poses, per-pixel class segmentation, and 2D/3D bounding box coordinates for all objects. To facilitate testing different input modalities, we provide mono and stereo RGB images, along with registered dense depth images. We describe in detail the generation process and statistical analysis of the data.
Tasks	3D Object Detection, 3D Pose Estimation, Object Detection, Pose Estimation
Published	2018-04-18
URL	http://arxiv.org/abs/1804.06534v2
PDF	http://arxiv.org/pdf/1804.06534v2.pdf
PWC	https://paperswithcode.com/paper/falling-things-a-synthetic-dataset-for-3d
Repo
Framework

Unsupervised Adversarial Learning of 3D Human Pose from 2D Joint Locations


Title	Unsupervised Adversarial Learning of 3D Human Pose from 2D Joint Locations
Authors	Yasunori Kudo, Keisuke Ogaki, Yusuke Matsui, Yuri Odagiri
Abstract	The task of three-dimensional (3D) human pose estimation from a single image can be divided into two parts: (1) two-dimensional (2D) human joint detection from the image and (2) 3D pose estimation from the 2D joints. Herein, we focus on the second part, i.e., 3D pose estimation from 2D joint locations. The problem with existing methods is that they require either (1) a 3D pose dataset or (2) 2D joint locations in consecutive frames taken from a video sequence. We aim to solve these problems. For the first time, we propose a method that learns a 3D human pose without any 3D datasets. Our method can predict a 3D pose from 2D joint locations in a single image. Our system is based on generative adversarial networks, and the networks are trained in an unsupervised manner. Our primary idea is that, if the network can predict a 3D human pose correctly, the 3D pose that is projected onto a 2D plane should not collapse even if it is rotated perpendicularly. We evaluated the performance of our method using Human3.6M and the MPII dataset and showed that our network can predict a 3D pose well even if the 3D dataset is not available during training.
Tasks	3D Human Pose Estimation, 3D Pose Estimation, Pose Estimation
Published	2018-03-22
URL	http://arxiv.org/abs/1803.08244v1
PDF	http://arxiv.org/pdf/1803.08244v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-adversarial-learning-of-3d-human
Repo
Framework
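
The rotate-then-project idea in the abstract can be sketched geometrically as follows. The sketch assumes an orthographic camera and a vertical y-axis, neither of which is stated in the abstract, and it omits the generator and discriminator networks entirely. `rotate_about_y` and `project_xy` are hypothetical helper names.

```python
import math


def rotate_about_y(pose3d, theta):
    """Rotate a list of (x, y, z) joints by theta radians about the y-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in pose3d]


def project_xy(pose3d):
    """Orthographic projection onto the image plane: drop the depth z."""
    return [(x, y) for x, y, _ in pose3d]


# A toy "pose" of two joints; a real pose would have one entry per joint.
predicted_pose = [(1.0, 2.0, 0.0), (0.0, 1.0, 1.0)]

# View the predicted 3D pose from the side and project it back to 2D.
side_view_2d = project_xy(rotate_about_y(predicted_pose, math.pi / 2))
```

In an adversarial setup of the kind the abstract describes, a 2D pose such as `side_view_2d` would be what a discriminator judges for plausibility, so that implausible depth predictions are penalized without any 3D ground truth.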
