Paper Group ANR 652
An Automatic System for Unconstrained Video-Based Face Recognition
Title | An Automatic System for Unconstrained Video-Based Face Recognition |
Authors | Jingxiao Zheng, Rajeev Ranjan, Ching-Hui Chen, Jun-Cheng Chen, Carlos D. Castillo, Rama Chellappa |
Abstract | Although deep learning approaches have achieved performance surpassing humans for still image-based face recognition, unconstrained video-based face recognition is still a challenging task due to the large volume of data to be processed and intra/inter-video variations in pose, illumination, occlusion, scene, blur, video quality, etc. In this work, we consider challenging scenarios for unconstrained video-based face recognition from multiple-shot videos and surveillance videos with low-quality frames. To handle these problems, we propose a robust and efficient system for unconstrained video-based face recognition, which is composed of modules for face/fiducial detection, face association, and face recognition. First, we use multi-scale single-shot face detectors to efficiently localize faces in videos. The detected faces are then grouped through carefully designed face association methods, especially for multi-shot videos. Finally, the faces are recognized by the proposed face matcher based on an unsupervised subspace learning approach and a subspace-to-subspace similarity metric. Extensive experiments on challenging video datasets, such as Multiple Biometric Grand Challenge (MBGC), Face and Ocular Challenge Series (FOCS), IARPA Janus Surveillance Video Benchmark (IJB-S) for low-quality surveillance videos and IARPA JANUS Benchmark B (IJB-B) for multiple-shot videos, demonstrate that the proposed system can accurately detect and associate faces from unconstrained videos and effectively learn robust and discriminative features for recognition. |
Tasks | Face Recognition |
Published | 2018-12-10 |
URL | https://arxiv.org/abs/1812.04058v3 |
https://arxiv.org/pdf/1812.04058v3.pdf | |
PWC | https://paperswithcode.com/paper/an-automatic-system-for-unconstrained-video |
Repo | |
Framework | |
Machine Common Sense Concept Paper
Title | Machine Common Sense Concept Paper |
Authors | David Gunning |
Abstract | This paper summarizes some of the technical background, research ideas, and possible development strategies for achieving machine common sense. Machine common sense has long been a critical-but-missing component of Artificial Intelligence (AI). Recent advances in machine learning have resulted in new AI capabilities, but in all of these applications, machine reasoning is narrow and highly specialized. Developers must carefully train or program systems for every situation. General commonsense reasoning remains elusive. The absence of common sense prevents intelligent systems from understanding their world, behaving reasonably in unforeseen situations, communicating naturally with people, and learning from new experiences. Its absence is perhaps the most significant barrier between the narrowly focused AI applications we have today and the more general, human-like AI systems we would like to build in the future. Machine common sense remains a broad, potentially unbounded problem in AI. There is a wide range of strategies that could be employed to make progress on this difficult challenge. This paper discusses two diverse strategies for focusing development on two different machine commonsense services: (1) a service that learns from experience, like a child, to construct computational models that mimic the core domains of child cognition for objects (intuitive physics), agents (intentional actors), and places (spatial navigation); and (2) a service that learns from reading the Web, like a research librarian, to construct a commonsense knowledge repository capable of answering natural language and image-based questions about commonsense phenomena. |
Tasks | Common Sense Reasoning |
Published | 2018-10-17 |
URL | http://arxiv.org/abs/1810.07528v1 |
http://arxiv.org/pdf/1810.07528v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-common-sense-concept-paper |
Repo | |
Framework | |
A Knowledge Hunting Framework for Common Sense Reasoning
Title | A Knowledge Hunting Framework for Common Sense Reasoning |
Authors | Ali Emami, Noelia De La Cruz, Adam Trischler, Kaheer Suleman, Jackie Chi Kit Cheung |
Abstract | We introduce an automatic system that achieves state-of-the-art results on the Winograd Schema Challenge (WSC), a common sense reasoning task that requires diverse, complex forms of inference and knowledge. Our method uses a knowledge hunting module to gather text from the web, which serves as evidence for candidate problem resolutions. Given an input problem, our system generates relevant queries to send to a search engine, then extracts and classifies knowledge from the returned results and weighs them to make a resolution. Our approach improves F1 performance on the full WSC by 0.21 over the previous best and represents the first system to exceed 0.5 F1. We further demonstrate that the approach is competitive on the Choice of Plausible Alternatives (COPA) task, which suggests that it is generally applicable. |
Tasks | Common Sense Reasoning |
Published | 2018-10-02 |
URL | http://arxiv.org/abs/1810.01375v1 |
http://arxiv.org/pdf/1810.01375v1.pdf | |
PWC | https://paperswithcode.com/paper/a-knowledge-hunting-framework-for-common |
Repo | |
Framework | |
Real-time Context-aware Learning System for IoT Applications
Title | Real-time Context-aware Learning System for IoT Applications |
Authors | Bhaskar Das, Jalal Almhana |
Abstract | We propose a real-time context-aware learning system, along with an architecture that runs on mobile devices, provides services to the user, and manages IoT devices. In this system, an application running on the mobile device collects data from the sensors, learns the user-defined context, makes predictions in real time, and manages IoT devices accordingly. However, the limited computational power of mobile devices makes it challenging to run machine learning algorithms with acceptable accuracy. To solve this issue, some authors have run machine learning algorithms on a server and transmitted the results to the mobile devices. Although the context-aware predictions made by the server are more accurate than their mobile counterparts, this approach depends heavily on the network connection for delivering results to the devices, which negatively affects real-time context learning. Therefore, in this work, we describe a context-learning algorithm for mobile devices that is less demanding on computational resources and maintains prediction accuracy by periodically updating itself with learning parameters obtained from the server. Experimental results show that the proposed lightweight context-learning algorithm can achieve a mean accuracy of up to 97.51% with a mean execution time of only 11 ms. |
Tasks | |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.11295v1 |
http://arxiv.org/pdf/1810.11295v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-context-aware-learning-system-for |
Repo | |
Framework | |
Multi-class Classification Model Inspired by Quantum Detection Theory
Title | Multi-class Classification Model Inspired by Quantum Detection Theory |
Authors | Prayag Tiwari, Massimo Melucci |
Abstract | Machine learning has become very popular as a means of identifying patterns in raw data. Technological advancement has led to substantial improvements in machine learning, which in turn help to improve prediction. Current machine learning models are based on classical theory, which can be replaced by quantum theory to improve the effectiveness of the model. In previous work, we developed a binary classifier inspired by Quantum Detection Theory. In this extended abstract, our main goal is to develop a multi-class classifier. The terms multinomial classification and multi-class classification generally refer to classifying observations or instances into one of three or more classes. |
Tasks | |
Published | 2018-10-10 |
URL | http://arxiv.org/abs/1810.04491v1 |
http://arxiv.org/pdf/1810.04491v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-class-classification-model-inspired-by |
Repo | |
Framework | |
Quantifying Generalization in Reinforcement Learning
Title | Quantifying Generalization in Reinforcement Learning |
Authors | Karl Cobbe, Oleg Klimov, Chris Hesse, Taehoon Kim, John Schulman |
Abstract | In this paper, we investigate the problem of overfitting in deep reinforcement learning. Among the most common benchmarks in RL, it is customary to use the same environments for both training and testing. This practice offers relatively little insight into an agent’s ability to generalize. We address this issue by using procedurally generated environments to construct distinct training and test sets. Most notably, we introduce a new environment called CoinRun, designed as a benchmark for generalization in RL. Using CoinRun, we find that agents overfit to surprisingly large training sets. We then show that deeper convolutional architectures improve generalization, as do methods traditionally found in supervised learning, including L2 regularization, dropout, data augmentation and batch normalization. |
Tasks | Data Augmentation, L2 Regularization |
Published | 2018-12-06 |
URL | https://arxiv.org/abs/1812.02341v3 |
https://arxiv.org/pdf/1812.02341v3.pdf | |
PWC | https://paperswithcode.com/paper/quantifying-generalization-in-reinforcement |
Repo | |
Framework | |
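The train/test protocol described in the abstract above, where procedurally generated environments are split into distinct training and test sets, can be sketched as follows. This is a toy illustration under stated assumptions, not the CoinRun API: `generate_level`, `make_split`, and `generalization_gap` are hypothetical names, and the "level" is just a list of random tile heights.

```python
import random


def generate_level(seed, width=8):
    """Build a toy level (a list of tile heights) deterministically from a seed.

    Hypothetical stand-in for a procedural level generator.
    """
    rng = random.Random(seed)
    return [rng.randint(0, 3) for _ in range(width)]


def make_split(num_train, num_test):
    """Return disjoint seed ranges for training and testing."""
    train_seeds = list(range(num_train))
    test_seeds = list(range(num_train, num_train + num_test))
    return train_seeds, test_seeds


def generalization_gap(train_returns, test_returns):
    """Mean training return minus mean test return (larger suggests more overfitting)."""
    mean_train = sum(train_returns) / len(train_returns)
    mean_test = sum(test_returns) / len(test_returns)
    return mean_train - mean_test


train_seeds, test_seeds = make_split(num_train=500, num_test=100)
train_levels = [generate_level(s) for s in train_seeds]
test_levels = [generate_level(s) for s in test_seeds]
```

An agent would be trained only on `train_levels` and evaluated on `test_levels`; the gap between the two mean returns is one simple way to express how much it overfits.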
Normalization of Neural Networks using Analytic Variance Propagation
Title | Normalization of Neural Networks using Analytic Variance Propagation |
Authors | Alexander Shekhovtsov, Boris Flach |
Abstract | We address the problem of estimating statistics of hidden units in a neural network using a method of analytic moment propagation. These statistics are useful for approximate whitening of the inputs in front of saturating non-linearities such as a sigmoid function. This is important for initialization of training and for reducing the accumulated scale and bias dependencies (compensating covariate shift), which presumably eases the learning. In batch normalization, which is currently a very widely applied technique, sample estimates of statistics of hidden units over a batch are used. The proposed estimation uses an analytic propagation of mean and variance of the training set through the network. The result depends on the network structure and its current weights but not on the specific batch input. The estimates are suitable for initialization and normalization, efficient to compute and independent of the batch size. Experimental verification supports these claims. However, the method does not share the generalization properties of BN, to which our experiments give some additional insight. |
Tasks | |
Published | 2018-03-28 |
URL | http://arxiv.org/abs/1803.10560v1 |
http://arxiv.org/pdf/1803.10560v1.pdf | |
PWC | https://paperswithcode.com/paper/normalization-of-neural-networks-using |
Repo | |
Framework | |
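To make the idea of analytic mean and variance propagation in the abstract above concrete, here is a minimal sketch for a single linear layer. It assumes the layer inputs are independent, so the output variance is a weighted sum of input variances; the paper's handling of non-linearities is not reproduced here, and the function names are hypothetical.

```python
def propagate_linear(weights, bias, mean_in, var_in):
    """Propagate per-unit mean and variance through y = W x + b.

    Assumes independent inputs, so Var[y_i] = sum_j W_ij^2 * Var[x_j].
    """
    mean_out = [
        sum(w * m for w, m in zip(row, mean_in)) + b
        for row, b in zip(weights, bias)
    ]
    var_out = [sum(w * w * v for w, v in zip(row, var_in)) for row in weights]
    return mean_out, var_out


def normalize_layer(weights, bias, mean_out, var_out, eps=1e-8):
    """Rescale weights and bias so each output has roughly zero mean and unit variance."""
    new_weights, new_bias = [], []
    for row, b, m, v in zip(weights, bias, mean_out, var_out):
        scale = (v + eps) ** 0.5
        new_weights.append([w / scale for w in row])
        new_bias.append((b - m) / scale)
    return new_weights, new_bias


W = [[1.0, 2.0], [0.5, -1.0]]
b = [0.0, 1.0]
mean_x, var_x = [1.0, 2.0], [1.0, 0.25]

mean_y, var_y = propagate_linear(W, b, mean_x, var_x)
W_n, b_n = normalize_layer(W, b, mean_y, var_y)
mean_n, var_n = propagate_linear(W_n, b_n, mean_x, var_x)
```

Note that the statistics depend only on the weights and the input moments, not on any particular batch, which is the property the abstract contrasts with batch normalization.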
Affordance Extraction and Inference based on Semantic Role Labeling
Title | Affordance Extraction and Inference based on Semantic Role Labeling |
Authors | Daniel Loureiro, Alípio Mário Jorge |
Abstract | Common-sense reasoning is becoming increasingly important for the advancement of Natural Language Processing. While word embeddings have been very successful, they cannot explain which aspects of ‘coffee’ and ‘tea’ make them similar, or how they could be related to ‘shop’. In this paper, we propose an explicit word representation that builds upon the Distributional Hypothesis to represent meaning from semantic roles, and allow inference of relations from their meshing, as supported by the affordance-based Indexical Hypothesis. We find that our model improves the state-of-the-art on unsupervised word similarity tasks while allowing for direct inference of new relations from the same vector space. |
Tasks | Common Sense Reasoning, Semantic Role Labeling, Word Embeddings |
Published | 2018-09-03 |
URL | http://arxiv.org/abs/1809.00589v1 |
http://arxiv.org/pdf/1809.00589v1.pdf | |
PWC | https://paperswithcode.com/paper/affordance-extraction-and-inference-based-on |
Repo | |
Framework | |
Efficient and Robust Question Answering from Minimal Context over Documents
Title | Efficient and Robust Question Answering from Minimal Context over Documents |
Authors | Sewon Min, Victor Zhong, Richard Socher, Caiming Xiong |
Abstract | Neural models for question answering (QA) over documents have achieved significant performance improvements. Although effective, these models do not scale to large corpora due to their complex modeling of interactions between the document and the question. Moreover, recent work has shown that such models are sensitive to adversarial inputs. In this paper, we study the minimal context required to answer the question, and find that most questions in existing datasets can be answered with a small set of sentences. Inspired by this observation, we propose a simple sentence selector to select the minimal set of sentences to feed into the QA model. Our overall system achieves significant reductions in training (up to 15 times) and inference times (up to 13 times), with accuracy comparable to or better than the state-of-the-art on SQuAD, NewsQA, TriviaQA and SQuAD-Open. Furthermore, our experimental results and analyses show that our approach is more robust to adversarial inputs. |
Tasks | Question Answering, Reading Comprehension |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.08092v1 |
http://arxiv.org/pdf/1805.08092v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-and-robust-question-answering-from |
Repo | |
Framework | |
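The sentence-selection step described in the abstract above can be illustrated with a small sketch. The abstract does not specify the selector's internals, so this example substitutes a plain word-overlap score for whatever model the authors use; `tokenize` and `select_sentences` are hypothetical names.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_sentences(question, document, k=1):
    """Keep the k sentences sharing the most words with the question.

    Word overlap is a stand-in scoring rule, not the paper's selector.
    Selected sentences are returned in their original document order.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    q_tokens = tokenize(question)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: len(q_tokens & tokenize(sentences[i])),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]


document = (
    "Paris is the capital of France. "
    "The Seine flows through the city. "
    "Berlin is the capital of Germany."
)
context = select_sentences("What is the capital of Germany?", document, k=1)
```

Only `context`, rather than the full document, would then be passed to the downstream QA model, which is where the reported training and inference savings would come from.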
Construction of the Literature Graph in Semantic Scholar
Title | Construction of the Literature Graph in Semantic Scholar |
Authors | Waleed Ammar, Dirk Groeneveld, Chandra Bhagavatula, Iz Beltagy, Miles Crawford, Doug Downey, Jason Dunkelberger, Ahmed Elgohary, Sergey Feldman, Vu Ha, Rodney Kinney, Sebastian Kohlmeier, Kyle Lo, Tyler Murray, Hsu-Han Ooi, Matthew Peters, Joanna Power, Sam Skjonsberg, Lucy Lu Wang, Chris Wilhelm, Zheng Yuan, Madeleine van Zuylen, Oren Etzioni |
Abstract | We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate algorithmic manipulation and discovery. The resulting literature graph consists of more than 280M nodes, representing papers, authors, entities and various interactions between them (e.g., authorships, citations, entity mentions). We reduce literature graph construction to familiar NLP tasks (e.g., entity extraction and linking), point out research challenges due to differences from standard formulations of these tasks, and report empirical results for each task. The methods described in this paper are used to enable semantic features in www.semanticscholar.org. |
Tasks | Entity Extraction, graph construction |
Published | 2018-05-06 |
URL | http://arxiv.org/abs/1805.02262v1 |
http://arxiv.org/pdf/1805.02262v1.pdf | |
PWC | https://paperswithcode.com/paper/construction-of-the-literature-graph-in |
Repo | |
Framework | |
An Empirical Analysis of the Role of Amplifiers, Downtoners, and Negations in Emotion Classification in Microblogs
Title | An Empirical Analysis of the Role of Amplifiers, Downtoners, and Negations in Emotion Classification in Microblogs |
Authors | Florian Strohm, Roman Klinger |
Abstract | The effect of amplifiers, downtoners, and negations has been studied in general and particularly in the context of sentiment analysis. However, there is only limited work which aims at transferring the results and methods to discrete classes of emotions, e.g., joy, anger, fear, sadness, surprise, and disgust. For instance, it is not straightforward to interpret which emotion the phrase “not happy” expresses. With this paper, we aim at obtaining a better understanding of such modifiers in the context of emotion-bearing words and their impact on document-level emotion classification, namely, microposts on Twitter. We select an appropriate scope detection method for modifiers of emotion words, incorporate it in a document-level emotion classification model as an additional bag of words, and show that this approach improves the performance of emotion classification. In addition, we build a term weighting approach based on the different modifiers into a lexical model for the analysis of the semantics of modifiers and their impact on emotion meaning. We show that amplifiers separate emotions expressed with an emotion-bearing word more clearly from other secondary connotations. Downtoners have the opposite effect. In addition, we discuss the meaning of negations of emotion-bearing words. For instance, we show empirically that “not happy” is closer to sadness than to anger and that fear-expressing words in the scope of downtoners often express surprise. |
Tasks | Emotion Classification, Sentiment Analysis |
Published | 2018-08-31 |
URL | http://arxiv.org/abs/1808.10653v2 |
http://arxiv.org/pdf/1808.10653v2.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-analysis-of-the-role-of |
Repo | |
Framework | |
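The idea in the abstract above of adding modifier scope to a document-level model as an additional bag of words can be sketched as follows. The abstract does not say which scope detection method was selected, so this example assumes a fixed window of tokens after each modifier; the word lists and function names are hypothetical.

```python
from collections import Counter

# Hypothetical, deliberately tiny modifier lexicons.
NEGATIONS = {"not", "never", "no"}
AMPLIFIERS = {"very", "really", "extremely"}
DOWNTONERS = {"slightly", "somewhat", "barely"}


def modifier_features(tokens, window=2):
    """Emit one tagged feature per token inside a modifier's scope.

    Scope is assumed to be the next `window` tokens after the modifier.
    """
    features = []
    for i, tok in enumerate(tokens):
        if tok in NEGATIONS:
            tag = "NEG"
        elif tok in AMPLIFIERS:
            tag = "AMP"
        elif tok in DOWNTONERS:
            tag = "DOWN"
        else:
            continue
        for scoped in tokens[i + 1 : i + 1 + window]:
            features.append(f"{tag}_{scoped}")
    return features


def bag_of_words(tokens, window=2):
    """Plain token counts plus the additional modifier-scope counts."""
    return Counter(tokens) + Counter(modifier_features(tokens, window))


features = bag_of_words("i am not happy today".split(), window=1)
```

With this representation, a classifier sees both `happy` and `NEG_happy`, so it can learn that the negated form points toward a different emotion than the bare word.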
A Multi-task Ensemble Framework for Emotion, Sentiment and Intensity Prediction
Title | A Multi-task Ensemble Framework for Emotion, Sentiment and Intensity Prediction |
Authors | Md Shad Akhtar, Deepanway Ghosal, Asif Ekbal, Pushpak Bhattacharyya, Sadao Kurohashi |
Abstract | In this paper, through a multi-task ensemble framework, we address three problems of emotion and sentiment analysis, i.e., “emotion classification & intensity”, “valence, arousal & dominance for emotion” and “valence & arousal for sentiment”. The underlying problems cover two granularities (i.e., coarse-grained and fine-grained) and a diverse range of domains (i.e., tweets, Facebook posts, news headlines, blogs, letters etc.). The ensemble model aims to leverage the learned representations of three deep learning models (i.e., CNN, LSTM and GRU) and a hand-crafted feature representation for the predictions. Experimental results on the benchmark datasets show the efficacy of our proposed multi-task ensemble framework. We obtain a performance improvement of 2-3 points on average over single-task systems for most of the problems and domains. |
Tasks | Emotion Classification, Sentiment Analysis |
Published | 2018-08-03 |
URL | http://arxiv.org/abs/1808.01216v2 |
http://arxiv.org/pdf/1808.01216v2.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-task-ensemble-framework-for-emotion |
Repo | |
Framework | |
It’s all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data
Title | It’s all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data |
Authors | Matteo Ruggero Ronchi, Oisin Mac Aodha, Robert Eng, Pietro Perona |
Abstract | We address the problem of 3D human pose estimation from 2D input images using only weakly supervised training data. Despite showing considerable success for 2D pose estimation, the application of supervised machine learning to 3D pose estimation in real world images is currently hampered by the lack of varied training images with corresponding 3D poses. Most existing 3D pose estimation algorithms train on data that has either been collected in carefully controlled studio settings or has been generated synthetically. Instead, we take a different approach, and propose a 3D human pose estimation algorithm that only requires relative estimates of depth at training time. Such a training signal, although noisy, can be easily collected from crowd annotators, and is of sufficient quality for enabling successful training and evaluation of 3D pose algorithms. Our results are competitive with fully supervised regression based approaches on the Human3.6M dataset, despite using significantly weaker training data. Our proposed algorithm opens the door to using existing widespread 2D datasets for 3D pose estimation by allowing fine-tuning with noisy relative constraints, resulting in more accurate 3D poses. |
Tasks | 3D Human Pose Estimation, 3D Pose Estimation, Pose Estimation |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06880v2 |
http://arxiv.org/pdf/1805.06880v2.pdf | |
PWC | https://paperswithcode.com/paper/its-all-relative-monocular-3d-human-pose |
Repo | |
Framework | |
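Training from relative depth estimates, as described in the abstract above, is often expressed as a pairwise ranking loss. The sketch below uses a common logistic ranking form as an assumption; the abstract does not state the loss the authors use, and `relative_depth_loss` and its `relation` encoding are hypothetical.

```python
import math


def relative_depth_loss(z_i, z_j, relation):
    """Pairwise loss on predicted depths z_i and z_j for two joints.

    relation is +1 if joint i is annotated as farther from the camera than
    joint j, -1 if it is annotated as closer, and 0 if roughly equal.
    """
    diff = z_i - z_j
    if relation == 0:
        # Penalize any depth difference when the joints are labeled as equal.
        return diff * diff
    # Logistic ranking loss: small when the predicted order matches the label.
    return math.log1p(math.exp(-relation * diff))


agree = relative_depth_loss(2.0, 1.0, relation=+1)
disagree = relative_depth_loss(1.0, 2.0, relation=+1)
```

Because the loss only needs an ordering between two joints, a single noisy crowd annotation per image pair is enough to provide a training signal.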
Falling Things: A Synthetic Dataset for 3D Object Detection and Pose Estimation
Title | Falling Things: A Synthetic Dataset for 3D Object Detection and Pose Estimation |
Authors | Jonathan Tremblay, Thang To, Stan Birchfield |
Abstract | We present a new dataset, called Falling Things (FAT), for advancing the state-of-the-art in object detection and 3D pose estimation in the context of robotics. By synthetically combining object models and backgrounds of complex composition and high graphical quality, we are able to generate photorealistic images with accurate 3D pose annotations for all objects in all images. Our dataset contains 60k annotated photos of 21 household objects taken from the YCB dataset. For each image, we provide the 3D poses, per-pixel class segmentation, and 2D/3D bounding box coordinates for all objects. To facilitate testing different input modalities, we provide mono and stereo RGB images, along with registered dense depth images. We describe in detail the generation process and statistical analysis of the data. |
Tasks | 3D Object Detection, 3D Pose Estimation, Object Detection, Pose Estimation |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.06534v2 |
http://arxiv.org/pdf/1804.06534v2.pdf | |
PWC | https://paperswithcode.com/paper/falling-things-a-synthetic-dataset-for-3d |
Repo | |
Framework | |
Unsupervised Adversarial Learning of 3D Human Pose from 2D Joint Locations
Title | Unsupervised Adversarial Learning of 3D Human Pose from 2D Joint Locations |
Authors | Yasunori Kudo, Keisuke Ogaki, Yusuke Matsui, Yuri Odagiri |
Abstract | The task of three-dimensional (3D) human pose estimation from a single image can be divided into two parts: (1) two-dimensional (2D) human joint detection from the image and (2) estimating a 3D pose from the 2D joints. Herein, we focus on the second part, i.e., 3D pose estimation from 2D joint locations. The problem with existing methods is that they require either (1) a 3D pose dataset or (2) 2D joint locations in consecutive frames taken from a video sequence. We aim to solve these problems. For the first time, we propose a method that learns a 3D human pose without any 3D datasets. Our method can predict a 3D pose from 2D joint locations in a single image. Our system is based on generative adversarial networks, and the networks are trained in an unsupervised manner. Our primary idea is that, if the network can predict a 3D human pose correctly, the 3D pose that is projected onto a 2D plane should not collapse even if it is rotated perpendicularly. We evaluated the performance of our method using Human3.6M and the MPII dataset and showed that our network can predict a 3D pose well even if the 3D dataset is not available during training. |
Tasks | 3D Human Pose Estimation, 3D Pose Estimation, Pose Estimation |
Published | 2018-03-22 |
URL | http://arxiv.org/abs/1803.08244v1 |
http://arxiv.org/pdf/1803.08244v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-adversarial-learning-of-3d-human |
Repo | |
Framework | |
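The geometric step behind the abstract above, rotating a predicted 3D pose and projecting it back onto a 2D plane, can be sketched as follows. This assumes rotation about the vertical (y) axis and an orthographic projection, neither of which is stated in the abstract; the generator and discriminator networks are omitted, and the function names are hypothetical.

```python
import math


def rotate_about_vertical(pose_3d, angle_rad):
    """Rotate every (x, y, z) joint about the y axis by angle_rad."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in pose_3d]


def project_to_2d(pose_3d):
    """Orthographic projection (an assumption): drop the depth coordinate."""
    return [(x, y) for x, y, _ in pose_3d]


# Two joints: a root at the origin and one joint above and behind it.
pose = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.5)]
rotated = rotate_about_vertical(pose, math.pi / 2)
side_view = project_to_2d(rotated)
```

In an adversarial setup of this kind, the rotated projection `side_view` is what a discriminator would compare against real 2D poses: a wrong depth estimate yields an implausible pose once viewed from the side.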