May 7, 2019


Paper Group AWR 12



Multi-Person Pose Estimation with Local Joint-to-Person Associations

Title Multi-Person Pose Estimation with Local Joint-to-Person Associations
Authors Umar Iqbal, Juergen Gall
Abstract Despite the recent success of neural networks for human pose estimation, current approaches are limited to pose estimation of a single person and cannot handle humans in groups or crowds. In this work, we propose a method that estimates the poses of multiple persons in an image in which a person can be occluded by another person or might be truncated. To this end, we consider multi-person pose estimation as a joint-to-person association problem. We construct a fully connected graph from a set of detected joint candidates in an image and resolve the joint-to-person association and outlier detection using integer linear programming. Since solving joint-to-person association jointly for all persons in an image is an NP-hard problem and even approximations are expensive, we solve the problem locally for each person. On the challenging MPII Human Pose Dataset for multiple persons, our approach achieves the accuracy of a state-of-the-art method, but it is 6,000 to 19,000 times faster.
Tasks Multi-Person Pose Estimation, Outlier Detection, Pose Estimation
Published 2016-08-30
URL http://arxiv.org/abs/1608.08526v2
PDF http://arxiv.org/pdf/1608.08526v2.pdf
PWC https://paperswithcode.com/paper/multi-person-pose-estimation-with-local-joint
Repo https://github.com/MVIG-SJTU/RMPE
Framework torch
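
As a loose illustration of the association step described above (not the paper's integer-linear-programming formulation), the sketch below assigns detected joint candidates of each type to nearby person hypotheses with the Hungarian algorithm; all names and thresholds are hypothetical.

```python
# Hypothetical sketch of joint-to-person association (not the paper's ILP):
# assign joint candidates of each type to person hypotheses by distance,
# at most one candidate per person, using the Hungarian algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate_joints(person_centers, joint_candidates, max_dist=80.0):
    """person_centers: (P, 2) rough person locations.
    joint_candidates: dict joint_type -> (C, 2) candidate coordinates.
    Returns dict joint_type -> {person_index: candidate_index}."""
    assignments = {}
    for joint_type, cands in joint_candidates.items():
        # cost = Euclidean distance between every person and every candidate
        cost = np.linalg.norm(
            person_centers[:, None, :] - cands[None, :, :], axis=-1)
        rows, cols = linear_sum_assignment(cost)
        assignments[joint_type] = {
            p: c for p, c in zip(rows, cols) if cost[p, c] <= max_dist}
    return assignments

persons = np.array([[100.0, 200.0], [300.0, 210.0]])
candidates = {"head": np.array([[105.0, 150.0], [310.0, 160.0], [600.0, 40.0]])}
print(associate_joints(persons, candidates))
```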

Whitening-Free Least-Squares Non-Gaussian Component Analysis

Title Whitening-Free Least-Squares Non-Gaussian Component Analysis
Authors Hiroaki Shiino, Hiroaki Sasaki, Gang Niu, Masashi Sugiyama
Abstract Non-Gaussian component analysis (NGCA) is an unsupervised linear dimension reduction method that extracts low-dimensional non-Gaussian “signals” from high-dimensional data contaminated with Gaussian noise. NGCA can be regarded as a generalization of projection pursuit (PP) and independent component analysis (ICA) to multi-dimensional and dependent non-Gaussian components. Indeed, seminal approaches to NGCA are based on PP and ICA. Recently, a novel NGCA approach called least-squares NGCA (LSNGCA) has been developed, which gives a solution analytically through least-squares estimation of log-density gradients and eigendecomposition. However, since pre-whitening of data is involved in LSNGCA, it performs unreliably when the data covariance matrix is ill-conditioned, which is often the case in high-dimensional data analysis. In this paper, we propose a whitening-free LSNGCA method and experimentally demonstrate its superiority.
Tasks Dimensionality Reduction
Published 2016-03-03
URL http://arxiv.org/abs/1603.01029v2
PDF http://arxiv.org/pdf/1603.01029v2.pdf
PWC https://paperswithcode.com/paper/whitening-free-least-squares-non-gaussian
Repo https://github.com/hgeno/WFLSNGCA
Framework none
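
LSNGCA recovers the non-Gaussian subspace from estimated log-density gradients followed by an eigendecomposition; the whitening-free variant proposed here avoids the pre-whitening that the original method requires. The rough numpy sketch below shows only the eigendecomposition stage of the whitened version, with a hand-coded toy score function standing in for the paper's least-squares log-density-gradient estimator.

```python
# Rough sketch of the eigendecomposition stage of (whitened) LSNGCA:
# for whitened data, the vectors grad log p(x) + x lie in the non-Gaussian
# subspace, so the top eigenvectors of their sample covariance span it.
import numpy as np

def non_gaussian_subspace(X, grad_log_p, dim):
    """X: (n, d) roughly whitened samples; grad_log_p: callable returning
    (n, d) estimated log-density gradients; dim: target subspace dimension."""
    H = grad_log_p(X) + X                 # informative vectors beta(x)
    C = H.T @ H / len(X)                  # their (uncentered) covariance
    eigvals, eigvecs = np.linalg.eigh(C)  # ascending eigenvalues
    return eigvecs[:, -dim:]              # span of the non-Gaussian part

def toy_score(Z):
    # Hand-coded scores for this toy: Laplace in column 0, Gaussian elsewhere.
    G = -Z.copy()
    G[:, 0] = -np.sign(Z[:, 0])
    return G

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
X[:, 0] = rng.laplace(size=2000)          # one non-Gaussian direction
print(non_gaussian_subspace(X, toy_score, dim=1).ravel())  # ~ (1, 0, 0, 0, 0)
```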

Training Deep Networks for Facial Expression Recognition with Crowd-Sourced Label Distribution

Title Training Deep Networks for Facial Expression Recognition with Crowd-Sourced Label Distribution
Authors Emad Barsoum, Cha Zhang, Cristian Canton Ferrer, Zhengyou Zhang
Abstract Crowd sourcing has become a widely adopted scheme to collect ground truth labels. However, it is a well-known problem that these labels can be very noisy. In this paper, we demonstrate how to learn a deep convolutional neural network (DCNN) from noisy labels, using facial expression recognition as an example. More specifically, we have 10 taggers to label each input image, and compare four different approaches to utilizing the multiple labels: majority voting, multi-label learning, probabilistic label drawing, and cross-entropy loss. We show that the traditional majority voting scheme does not perform as well as the last two approaches that fully leverage the label distribution. An enhanced FER+ data set with multiple labels for each face image will also be shared with the research community.
Tasks Facial Expression Recognition, Multi-Label Learning
Published 2016-08-03
URL http://arxiv.org/abs/1608.01041v2
PDF http://arxiv.org/pdf/1608.01041v2.pdf
PWC https://paperswithcode.com/paper/training-deep-networks-for-facial-expression
Repo https://github.com/Microsoft/FERPlus
Framework none
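
Of the four schemes compared, the cross-entropy-loss variant trains directly against the crowd-sourced label distribution rather than a single majority-vote label. A minimal numpy sketch of that soft-target cross-entropy (the names and numbers are illustrative, not the FER+ training code):

```python
# Minimal sketch: cross-entropy against a crowd-sourced label distribution
# (soft targets) versus a hard majority-vote label.
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def soft_cross_entropy(logits, label_dist):
    """logits: (n, k) network outputs; label_dist: (n, k) tagger votes
    normalised to sum to 1 per image."""
    log_p = np.log(softmax(logits) + 1e-12)
    return -(label_dist * log_p).sum(axis=1).mean()

votes = np.array([[7, 2, 1, 0], [3, 3, 4, 0]], dtype=float)  # 10 taggers/image
label_dist = votes / votes.sum(axis=1, keepdims=True)
hard = np.eye(4)[votes.argmax(axis=1)]                        # majority vote
logits = np.array([[2.0, 0.5, 0.1, -1.0], [0.2, 0.3, 0.8, -0.5]])
print(soft_cross_entropy(logits, label_dist), soft_cross_entropy(logits, hard))
```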

Equality of Opportunity in Supervised Learning

Title Equality of Opportunity in Supervised Learning
Authors Moritz Hardt, Eric Price, Nathan Srebro
Abstract We propose a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features. Assuming data about the predictor, target, and membership in the protected group are available, we show how to optimally adjust any learned predictor so as to remove discrimination according to our definition. Our framework also improves incentives by shifting the cost of poor classification from disadvantaged groups to the decision maker, who can respond by improving the classification accuracy. In line with other studies, our notion is oblivious: it depends only on the joint statistics of the predictor, the target, and the protected attribute, but not on the interpretation of individual features. We study the inherent limits of defining and identifying biases based on such oblivious measures, outlining what can and cannot be inferred from different oblivious tests. We illustrate our notion using a case study of FICO credit scores.
Tasks
Published 2016-10-07
URL http://arxiv.org/abs/1610.02413v1
PDF http://arxiv.org/pdf/1610.02413v1.pdf
PWC https://paperswithcode.com/paper/equality-of-opportunity-in-supervised
Repo https://github.com/stes/drk.ki-macht-schule
Framework tf
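
The post-processing step picks group-specific decision thresholds so that true positive rates match across protected groups (equal opportunity). A simplified numpy sketch, assuming score outputs and a binary protected attribute; the paper's full construction also allows randomized thresholds, which this omits.

```python
# Simplified sketch of equal-opportunity post-processing: choose one
# threshold per group so true positive rates are (approximately) equal.
import numpy as np

def tpr(scores, labels, thresh):
    pos = labels == 1
    return (scores[pos] >= thresh).mean()

def equal_opportunity_thresholds(scores, labels, groups, target_tpr=0.8):
    """Return a per-group threshold achieving roughly `target_tpr`."""
    thresholds = {}
    for g in np.unique(groups):
        s, y = scores[groups == g], labels[groups == g]
        # Threshold = the (1 - target_tpr) quantile of positive-class scores.
        thresholds[g] = np.quantile(s[y == 1], 1 - target_tpr)
    return thresholds

rng = np.random.default_rng(1)
groups = rng.integers(0, 2, size=5000)
labels = rng.integers(0, 2, size=5000)
# Scores are informative but biased: group 1 gets systematically lower scores.
scores = labels + rng.normal(0, 0.7, size=5000) - 0.3 * groups
th = equal_opportunity_thresholds(scores, labels, groups)
for g, t in th.items():
    print(g, t, tpr(scores[groups == g], labels[groups == g], t))  # TPR ~ 0.8
```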

vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design

Title vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design
Authors Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, Stephen W. Keckler
Abstract The most widely used machine learning frameworks require users to carefully tune their memory usage so that the deep neural network (DNN) fits into the DRAM capacity of a GPU. This restriction hampers a researcher’s flexibility to study different machine learning algorithms, forcing them to either use a less desirable network architecture or parallelize the processing across multiple GPUs. We propose a runtime memory manager that virtualizes the memory usage of DNNs such that both GPU and CPU memory can simultaneously be utilized for training larger DNNs. Our virtualized DNN (vDNN) reduces the average GPU memory usage of AlexNet by up to 89%, OverFeat by 91%, and GoogLeNet by 95%, a significant reduction in the memory requirements of DNNs. Similar experiments on VGG-16, one of the deepest and most memory-hungry DNNs to date, demonstrate the memory-efficiency of our proposal. vDNN enables VGG-16 with batch size 256 (requiring 28 GB of memory) to be trained on a single NVIDIA Titan X GPU card containing 12 GB of memory, with 18% performance loss compared to a hypothetical, oracular GPU with enough memory to hold the entire DNN.
Tasks
Published 2016-02-25
URL http://arxiv.org/abs/1602.08124v3
PDF http://arxiv.org/pdf/1602.08124v3.pdf
PWC https://paperswithcode.com/paper/vdnn-virtualized-deep-neural-networks-for
Repo https://github.com/adderbyte/MultiClassLabelSegTF
Framework tf
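
vDNN's core idea is to move feature maps that are not immediately needed out to host memory and prefetch them back before the backward pass. As a loose illustration of that offload trick (not the paper's runtime), recent PyTorch versions expose a `save_on_cpu` hook that stores activations saved for backward in CPU memory and copies them back when gradients are computed:

```python
# Loose illustration of vDNN-style activation offloading (not the paper's
# memory manager): PyTorch's save_on_cpu hook keeps tensors saved for the
# backward pass in host DRAM and brings them back during backward().
import torch
import torch.nn as nn
from torch.autograd.graph import save_on_cpu

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
).to(device)

x = torch.randn(8, 3, 224, 224, device=device, requires_grad=True)
with save_on_cpu(pin_memory=torch.cuda.is_available()):
    loss = model(x).mean()        # activations are parked in CPU memory
loss.backward()                   # and copied back for gradient computation
print(x.grad.shape)
```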

Optimization for Large-Scale Machine Learning with Distributed Features and Observations

Title Optimization for Large-Scale Machine Learning with Distributed Features and Observations
Authors Alexandros Nathan, Diego Klabjan
Abstract As the size of modern data sets exceeds the disk and memory capacities of a single computer, machine learning practitioners have resorted to parallel and distributed computing. Given that optimization is one of the pillars of machine learning and predictive modeling, distributed optimization methods have recently garnered ample attention in the literature. Although previous research has mostly focused on settings where either the observations or the features of the problem at hand are stored in a distributed fashion, the situation where both are partitioned across the nodes of a computer cluster (doubly distributed) has barely been studied. In this work we propose two doubly distributed optimization algorithms. The first one falls under the umbrella of distributed dual coordinate ascent methods, while the second one belongs to the class of stochastic gradient/coordinate descent hybrid methods. We conduct numerical experiments in Spark using real-world and simulated data sets and study the scaling properties of our methods. Our empirical evaluation shows that the proposed algorithms outperform a block-distributed ADMM method, which, to the best of our knowledge, is the only other existing doubly distributed optimization algorithm.
Tasks Distributed Optimization
Published 2016-10-31
URL http://arxiv.org/abs/1610.10060v2
PDF http://arxiv.org/pdf/1610.10060v2.pdf
PWC https://paperswithcode.com/paper/optimization-for-large-scale-machine-learning
Repo https://github.com/anathan90/RADiSA
Framework none
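
"Doubly distributed" means both the observations (rows) and the features (columns) of the data matrix are partitioned across workers, each holding one block. The toy single-process numpy simulation below shows that layout for least squares, with each block "worker" updating only its own slice of the weight vector; it sketches the data partitioning only, not the RADiSA algorithm itself.

```python
# Toy simulation of a doubly distributed layout for least squares: the data
# matrix is split into row blocks (observations) and column blocks (features);
# each "worker" owns one block and updates only its feature slice of w.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1200, 40
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.01 * rng.normal(size=n)

row_blocks = np.array_split(np.arange(n), 3)   # observation partitions
col_blocks = np.array_split(np.arange(d), 4)   # feature partitions
w = np.zeros(d)

for epoch in range(200):
    residual = X @ w - y                       # shared state in this toy
    for rows in row_blocks:
        for cols in col_blocks:
            Xb = X[np.ix_(rows, cols)]
            grad = Xb.T @ residual[rows] / len(rows)
            w[cols] -= 0.05 * grad             # block-local update
    # in a real doubly distributed run, residual updates would be exchanged here
print(np.linalg.norm(w - w_true))              # should be close to zero
```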

Accelerating Exact and Approximate Inference for (Distributed) Discrete Optimization with GPUs

Title Accelerating Exact and Approximate Inference for (Distributed) Discrete Optimization with GPUs
Authors Ferdinando Fioretto, Enrico Pontelli, William Yeoh, Rina Dechter
Abstract Discrete optimization is a central problem in artificial intelligence. The optimization of the aggregated cost of a network of cost functions arises in a variety of problems including (W)CSP, DCOP, as well as optimization in stochastic variants such as the tasks of finding the most probable explanation (MPE) in belief networks. Inference-based algorithms are powerful techniques for solving discrete optimization problems, which can be used independently or in combination with other techniques. However, their applicability is often limited by their compute-intensive nature and their space requirements. This paper proposes the design and implementation of a novel inference-based technique, which exploits modern massively parallel architectures, such as those found in Graphics Processing Units (GPUs), to speed up the resolution of exact and approximate inference-based algorithms for discrete optimization. The paper studies the proposed algorithm in both centralized and distributed optimization contexts. The paper demonstrates that the use of GPUs provides significant advantages in terms of runtime and scalability, achieving up to two orders of magnitude in speedups and showing a considerable reduction in execution time (up to 345 times faster) with respect to a sequential version.
Tasks Distributed Optimization
Published 2016-08-18
URL http://arxiv.org/abs/1608.05288v2
PDF http://arxiv.org/pdf/1608.05288v2.pdf
PWC https://paperswithcode.com/paper/accelerating-exact-and-approximate-inference
Repo https://github.com/nandofioretto/GpuBE
Framework none
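
The inference technique in question is bucket (variable) elimination, whose inner loop (combining cost tables and minimizing out a variable) is independent per table entry; that is the parallelism the paper maps onto a GPU. A small numpy sketch of one elimination step for a min-sum (WCSP-style) problem, with vectorization standing in for the GPU parallelism:

```python
# Sketch of one bucket-elimination step for a min-sum (WCSP-style) problem.
# Combining cost tables and minimizing out a variable is independent per
# entry; numpy vectorization stands in here for the GPU parallelism.
import numpy as np

# Binary cost tables over variables (x0, x1) and (x1, x2), domain size 4.
f01 = np.random.default_rng(0).integers(0, 10, size=(4, 4)).astype(float)
f12 = np.random.default_rng(1).integers(0, 10, size=(4, 4)).astype(float)

# Eliminate x1: combine the tables that mention x1, then minimize over its axis.
combined = f01[:, :, None] + f12[None, :, :]   # shape (x0, x1, x2)
g02 = combined.min(axis=1)                      # new table over (x0, x2)

# The optimal cost of this two-table problem follows by one more elimination.
print("optimal cost:", g02.min())
```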

Using the Output Embedding to Improve Language Models

Title Using the Output Embedding to Improve Language Models
Authors Ofir Press, Lior Wolf
Abstract We study the topmost weight matrix of neural network language models. We show that this matrix constitutes a valid word embedding. When training language models, we recommend tying the input embedding and this output embedding. We analyze the resulting update rules and show that the tied embedding evolves in a more similar way to the output embedding than to the input embedding in the untied model. We also offer a new method of regularizing the output embedding. Our methods lead to a significant reduction in perplexity, as we are able to show on a variety of neural network language models. Finally, we show that weight tying can reduce the size of neural translation models to less than half of their original size without harming their performance.
Tasks
Published 2016-08-20
URL http://arxiv.org/abs/1608.05859v3
PDF http://arxiv.org/pdf/1608.05859v3.pdf
PWC https://paperswithcode.com/paper/using-the-output-embedding-to-improve
Repo https://github.com/floydhub/word-language-model
Framework pytorch
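
The recommendation is to tie the input embedding and the topmost (output) projection of the language model so they share one weight matrix. In PyTorch that is a one-line parameter sharing, sketched below under the assumption that the hidden size equals the embedding size:

```python
# Sketch of input/output embedding weight tying in a PyTorch language model.
import torch.nn as nn

class TiedLM(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=256, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.decoder = nn.Linear(hidden, vocab_size)
        # Weight tying: the output projection reuses the input embedding
        # matrix (requires hidden == emb_dim so the shapes match).
        self.decoder.weight = self.embed.weight

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.decoder(h)            # logits over the vocabulary
```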

Incorporating Copying Mechanism in Sequence-to-Sequence Learning

Title Incorporating Copying Mechanism in Sequence-to-Sequence Learning
Authors Jiatao Gu, Zhengdong Lu, Hang Li, Victor O. K. Li
Abstract We address an important problem in sequence-to-sequence (Seq2Seq) learning referred to as copying, in which certain segments in the input sequence are selectively replicated in the output sequence. A similar phenomenon is observable in human language communication. For example, humans tend to repeat entity names or even long phrases in conversation. The challenge with regard to copying in Seq2Seq is that new machinery is needed to decide when to perform the operation. In this paper, we incorporate copying into neural network-based Seq2Seq learning and propose a new model called CopyNet with encoder-decoder structure. CopyNet can nicely integrate the regular way of word generation in the decoder with the new copying mechanism, which can choose sub-sequences in the input sequence and put them at proper places in the output sequence. Our empirical study on both synthetic data sets and real-world data sets demonstrates the efficacy of CopyNet. For example, CopyNet can outperform regular RNN-based models by remarkable margins on text summarization tasks.
Tasks Text Summarization
Published 2016-03-21
URL http://arxiv.org/abs/1603.06393v3
PDF http://arxiv.org/pdf/1603.06393v3.pdf
PWC https://paperswithcode.com/paper/incorporating-copying-mechanism-in-sequence
Repo https://github.com/adamklec/copynet
Framework pytorch
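
The decoder mixes a generation distribution over the vocabulary with a copy distribution over source positions, so tokens that appear in the source receive probability mass from both paths. A numpy sketch of only that mixing step (the scores below are placeholders, not CopyNet's generate/copy scoring functions):

```python
# Sketch of CopyNet-style mixing: one joint softmax over "generate word w"
# and "copy source position j", with copy mass scattered back onto the
# vocabulary ids of the source tokens. Scores are placeholders.
import numpy as np

vocab_size = 8
source_ids = np.array([3, 5, 3, 7])          # token ids of the input sequence
gen_scores = np.random.default_rng(0).normal(size=vocab_size)
copy_scores = np.random.default_rng(1).normal(size=len(source_ids))

# One softmax over both modes, following the paper's joint normalization.
all_scores = np.concatenate([gen_scores, copy_scores])
probs = np.exp(all_scores - all_scores.max())
probs /= probs.sum()
p_gen, p_copy = probs[:vocab_size], probs[vocab_size:]

# Scatter copy probabilities onto the vocabulary entries of the source tokens.
p_vocab = p_gen.copy()
np.add.at(p_vocab, source_ids, p_copy)       # handles repeated source tokens
print(p_vocab, p_vocab.sum())                # sums to 1
```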

We used Neural Networks to Detect Clickbaits: You won’t believe what happened Next!

Title We used Neural Networks to Detect Clickbaits: You won’t believe what happened Next!
Authors Ankesh Anand, Tanmoy Chakraborty, Noseong Park
Abstract Online content publishers often use catchy headlines for their articles in order to attract users to their websites. These headlines, popularly known as clickbaits, exploit users’ curiosity gap and lure them to click on links that often disappoint them. Existing methods for automatically detecting clickbaits rely on heavy feature engineering and domain knowledge. Here, we introduce a neural network architecture based on Recurrent Neural Networks for detecting clickbaits. Our model relies on distributed word representations learned from large unannotated corpora, and character embeddings learned via Convolutional Neural Networks. Experimental results on a dataset of news headlines show that our model outperforms existing techniques for clickbait detection, with an accuracy of 0.98, an F1-score of 0.98, and a ROC-AUC of 0.99.
Tasks Clickbait Detection, Feature Engineering
Published 2016-12-05
URL https://arxiv.org/abs/1612.01340v2
PDF https://arxiv.org/pdf/1612.01340v2.pdf
PWC https://paperswithcode.com/paper/we-used-neural-networks-to-detect-clickbaits
Repo https://github.com/ankeshanand/deep-clickbait-detection
Framework tf
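
The architecture combines pretrained word embeddings fed into a recurrent encoder with character embeddings learned through a CNN. A compact PyTorch sketch of that shape; the dimensions, layer choices, and pooling are illustrative, not the authors' exact configuration.

```python
# Compact sketch of a word-BiLSTM + char-CNN clickbait classifier
# (dimensions and layer choices are illustrative only).
import torch
import torch.nn as nn

class ClickbaitNet(nn.Module):
    def __init__(self, vocab=20000, chars=100, wdim=100, cdim=30, hidden=64):
        super().__init__()
        self.wemb = nn.Embedding(vocab, wdim)        # would be init from word2vec
        self.cemb = nn.Embedding(chars, cdim)
        self.char_cnn = nn.Conv1d(cdim, cdim, kernel_size=3, padding=1)
        self.rnn = nn.LSTM(wdim + cdim, hidden, batch_first=True,
                           bidirectional=True)
        self.out = nn.Linear(2 * hidden, 1)

    def forward(self, word_ids, char_ids):
        # word_ids: (B, T); char_ids: (B, T, L) characters per word
        B, T, L = char_ids.shape
        c = self.cemb(char_ids.view(B * T, L)).transpose(1, 2)   # (B*T, cdim, L)
        c = torch.relu(self.char_cnn(c)).max(dim=2).values       # (B*T, cdim)
        w = torch.cat([self.wemb(word_ids), c.view(B, T, -1)], dim=-1)
        h, _ = self.rnn(w)
        return torch.sigmoid(self.out(h[:, -1]))                 # clickbait prob

model = ClickbaitNet()
p = model(torch.randint(0, 20000, (2, 12)), torch.randint(0, 100, (2, 12, 8)))
print(p.shape)   # (2, 1)
```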

Human pose estimation via Convolutional Part Heatmap Regression

Title Human pose estimation via Convolutional Part Heatmap Regression
Authors Adrian Bulat, Georgios Tzimiropoulos
Abstract This paper is on human pose estimation using Convolutional Neural Networks. Our main contribution is a CNN cascaded architecture specifically designed for learning part relationships and spatial context, and robustly inferring pose even for the case of severe part occlusions. To this end, we propose a detection-followed-by-regression CNN cascade. The first part of our cascade outputs part detection heatmaps and the second part performs regression on these heatmaps. The benefits of the proposed architecture are multi-fold: It guides the network where to focus in the image and effectively encodes part constraints and context. More importantly, it can effectively cope with occlusions because part detection heatmaps for occluded parts provide low confidence scores which subsequently guide the regression part of our network to rely on contextual information in order to predict the location of these parts. Additionally, we show that the proposed cascade is flexible enough to readily allow the integration of various CNN architectures for both detection and regression, including recent ones based on residual learning. Finally, we illustrate that our cascade achieves top performance on the MPII and LSP data sets. Code can be downloaded from http://www.cs.nott.ac.uk/~psxab5/
Tasks Pose Estimation
Published 2016-09-06
URL http://arxiv.org/abs/1609.01743v1
PDF http://arxiv.org/pdf/1609.01743v1.pdf
PWC https://paperswithcode.com/paper/human-pose-estimation-via-convolutional-part
Repo https://github.com/1adrianb/human-pose-estimation
Framework torch
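
The cascade is detection followed by regression: a first CNN produces per-part heatmaps, and a second CNN regresses refined heatmaps from the image stacked with those detections. A minimal PyTorch sketch of that wiring only; the paper's actual subnetworks are far deeper, residual-style architectures.

```python
# Minimal sketch of a detection-followed-by-regression cascade: the
# regression subnetwork sees the image concatenated with the part
# detection heatmaps. The real networks are much deeper.
import torch
import torch.nn as nn

class HeatmapCascade(nn.Module):
    def __init__(self, parts=16):
        super().__init__()
        self.detect = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, parts, 3, padding=1), nn.Sigmoid())   # part heatmaps
        self.regress = nn.Sequential(
            nn.Conv2d(3 + parts, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, parts, 3, padding=1))                 # refined heatmaps

    def forward(self, img):
        det = self.detect(img)
        reg = self.regress(torch.cat([img, det], dim=1))
        return det, reg        # both stages supervised with heatmap losses

det, reg = HeatmapCascade()(torch.randn(1, 3, 256, 256))
print(det.shape, reg.shape)
```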

3D Fully Convolutional Network for Vehicle Detection in Point Cloud

Title 3D Fully Convolutional Network for Vehicle Detection in Point Cloud
Authors Bo Li
Abstract The 2D fully convolutional network has recently been applied successfully to object detection from images. In this paper, we extend fully convolutional network based detection techniques to 3D and apply them to point cloud data. The proposed approach is verified on the task of vehicle detection from lidar point clouds for autonomous driving. Experiments on the KITTI dataset show a significant performance improvement over previous point cloud based detection approaches.
Tasks Autonomous Driving, Object Detection
Published 2016-11-24
URL http://arxiv.org/abs/1611.08069v2
PDF http://arxiv.org/pdf/1611.08069v2.pdf
PWC https://paperswithcode.com/paper/3d-fully-convolutional-network-for-vehicle
Repo https://github.com/s10803926/3D-Object-detection-from-Pointcloud
Framework tf
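
The point cloud is discretized into a 3D grid and a fully convolutional 3D network slides over it, predicting per-cell objectness (and, in the paper, box offsets). A toy PyTorch sketch of the voxelization plus a tiny 3D FCN head; the grid resolution, channel counts, and single objectness output are illustrative only.

```python
# Toy sketch: voxelize a lidar point cloud into an occupancy grid and run
# a small fully convolutional 3D network over it.
import torch
import torch.nn as nn

def voxelize(points, grid=(32, 32, 8), extent=((-40, 40), (-40, 40), (-2, 2))):
    """points: (N, 3) xyz in metres -> (1, 1, X, Y, Z) occupancy tensor."""
    grid_t = torch.tensor(grid, dtype=torch.float32)
    lo = torch.tensor([e[0] for e in extent])
    hi = torch.tensor([e[1] for e in extent])
    idx = ((points - lo) / (hi - lo) * grid_t).long()
    keep = ((idx >= 0) & (idx < grid_t.long())).all(dim=1)   # drop out-of-range
    idx = idx[keep]
    vox = torch.zeros(grid)
    vox[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return vox[None, None]

fcn = nn.Sequential(
    nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv3d(16, 1, 3, padding=1))            # per-voxel objectness logits

cloud = torch.rand(5000, 3) * torch.tensor([80., 80., 4.]) - torch.tensor([40., 40., 2.])
print(fcn(voxelize(cloud)).shape)              # (1, 1, 32, 32, 8)
```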

DrMAD: Distilling Reverse-Mode Automatic Differentiation for Optimizing Hyperparameters of Deep Neural Networks

Title DrMAD: Distilling Reverse-Mode Automatic Differentiation for Optimizing Hyperparameters of Deep Neural Networks
Authors Jie Fu, Hongyin Luo, Jiashi Feng, Kian Hsiang Low, Tat-Seng Chua
Abstract The performance of deep neural networks is well-known to be sensitive to the setting of their hyperparameters. Recent advances in reverse-mode automatic differentiation allow for optimizing hyperparameters with gradients. The standard way of computing these gradients involves a forward and backward pass of computations. However, the backward pass usually needs to consume unaffordable memory to store all the intermediate variables to exactly reverse the forward training procedure. In this work we propose a simple but effective method, DrMAD, to distill the knowledge of the forward pass into a shortcut path, through which we approximately reverse the training trajectory. Experiments on several image benchmark datasets show that DrMAD is at least 45 times faster and consumes 100 times less memory compared to state-of-the-art methods for optimizing hyperparameters with minimal compromise to its effectiveness. To the best of our knowledge, DrMAD is the first research attempt to make it practical to automatically tune thousands of hyperparameters of deep neural networks. The code can be downloaded from https://github.com/bigaidream-projects/drmad
Tasks
Published 2016-01-05
URL http://arxiv.org/abs/1601.00917v5
PDF http://arxiv.org/pdf/1601.00917v5.pdf
PWC https://paperswithcode.com/paper/drmad-distilling-reverse-mode-automatic
Repo https://github.com/bigaidream-projects/drmad
Framework none
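
The memory saving comes from not storing the full weight trajectory: on the reverse pass, DrMAD replaces the checkpointed intermediate weights with a linear interpolation between the initial and final weights. The tiny numpy sketch below shows only that trajectory-distillation step on a toy quadratic problem; the surrounding reverse-mode hypergradient computation is omitted, and the comparison makes clear the interpolation is a coarse approximation.

```python
# Sketch of DrMAD's trajectory distillation: instead of checkpointing every
# intermediate weight vector w_t during training, keep only w_0 and w_T and
# reconstruct w_t on the reverse pass by linear interpolation.
import numpy as np

def train_sgd(w0, grad, lr=0.1, steps=100):
    w, trajectory = w0.copy(), [w0.copy()]
    for _ in range(steps):
        w -= lr * grad(w)
        trajectory.append(w.copy())
    return w, trajectory                      # exact (memory-hungry) history

def distilled_weights(w0, wT, t, steps):
    beta = t / steps
    return (1.0 - beta) * w0 + beta * wT      # DrMAD's shortcut path

grad = lambda w: w                            # toy quadratic loss 0.5 * ||w||^2
w0 = np.array([4.0, -2.0])
wT, traj = train_sgd(w0, grad)
for t in (25, 50, 75):                        # exact vs. distilled weights
    print(t, traj[t], distilled_weights(w0, wT, t, steps=100))
```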

Real-Time Anomaly Detection for Streaming Analytics

Title Real-Time Anomaly Detection for Streaming Analytics
Authors Subutai Ahmad, Scott Purdy
Abstract Much of the world’s data is streaming time-series data, where anomalies give significant information in critical situations. Yet detecting anomalies in streaming data is a difficult task, requiring detectors to process data in real time and learn while simultaneously making predictions. We present a novel anomaly detection technique based on an online sequence memory algorithm called Hierarchical Temporal Memory (HTM). We show results from a live application that detects anomalies in financial metrics in real time. We also test the algorithm on NAB, a published benchmark for real-time anomaly detection, where our algorithm achieves best-in-class results.
Tasks Anomaly Detection, Time Series
Published 2016-07-08
URL http://arxiv.org/abs/1607.02480v1
PDF http://arxiv.org/pdf/1607.02480v1.pdf
PWC https://paperswithcode.com/paper/real-time-anomaly-detection-for-streaming
Repo https://github.com/SudeepSarkar/matlabHTM
Framework none
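
Independently of the HTM sequence model itself, the published detector turns raw prediction errors into an anomaly likelihood by modeling a rolling window of recent errors as Gaussian and taking the tail probability of the latest short-term average. A numpy sketch of that likelihood step, assuming a stream of raw anomaly scores already exists; the window sizes and synthetic scores are illustrative.

```python
# Sketch of the anomaly-likelihood step applied on top of raw anomaly scores:
# model a rolling window of recent scores as Gaussian and report the tail
# probability of the latest short-term mean.
import numpy as np
from math import erf, sqrt

def anomaly_likelihood(raw_scores, long_win=200, short_win=10):
    likelihoods = []
    for t in range(len(raw_scores)):
        hist = raw_scores[max(0, t - long_win):t + 1]
        mu, sigma = hist.mean(), hist.std() + 1e-9
        recent = raw_scores[max(0, t - short_win):t + 1].mean()
        tail = 1.0 - 0.5 * (1.0 + erf((recent - mu) / (sigma * sqrt(2))))
        likelihoods.append(1.0 - tail)        # high value = anomalous
    return np.array(likelihoods)

rng = np.random.default_rng(0)
raw = rng.uniform(0.0, 0.2, size=500)
raw[400:410] = 0.9                            # injected burst of prediction error
like = anomaly_likelihood(raw)
print(like[395], like[405])                   # before vs. during the burst
```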

MOSI: Multimodal Corpus of Sentiment Intensity and Subjectivity Analysis in Online Opinion Videos

Title MOSI: Multimodal Corpus of Sentiment Intensity and Subjectivity Analysis in Online Opinion Videos
Authors Amir Zadeh, Rowan Zellers, Eli Pincus, Louis-Philippe Morency
Abstract People share their opinions, stories, and reviews through online video sharing websites every day. Studying sentiment and subjectivity in these opinion videos is attracting growing attention from academia and industry. While sentiment analysis has been successful for text, it remains an understudied research question for videos and multimedia content. The biggest setbacks for studies in this direction are the lack of a proper dataset, methodology, baselines, and statistical analysis of how information from different modality sources relates to each other. This paper introduces to the scientific community the first opinion-level annotated corpus of sentiment and subjectivity analysis in online videos, called the Multimodal Opinion-level Sentiment Intensity dataset (MOSI). The dataset is rigorously annotated with labels for subjectivity, sentiment intensity, per-frame and per-opinion annotated visual features, and per-millisecond annotated audio features. Furthermore, we present baselines for future studies in this direction as well as a new multimodal fusion approach that jointly models spoken words and visual gestures.
Tasks Sentiment Analysis, Subjectivity Analysis
Published 2016-06-20
URL http://arxiv.org/abs/1606.06259v2
PDF http://arxiv.org/pdf/1606.06259v2.pdf
PWC https://paperswithcode.com/paper/mosi-multimodal-corpus-of-sentiment-intensity
Repo https://github.com/soujanyaporia/multimodal-sentiment-analysis
Framework tf
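
The fusion approach jointly models spoken words and visual gestures; in its simplest form, per-opinion text and visual feature vectors are concatenated before a shared regressor. A small PyTorch sketch of such a fusion head (the feature dimensions are placeholders, not MOSI's actual feature extractors):

```python
# Small sketch of a multimodal fusion head for sentiment intensity:
# concatenate per-opinion text and visual feature vectors and regress a
# sentiment score. Feature dimensions are placeholders.
import torch
import torch.nn as nn

class FusionRegressor(nn.Module):
    def __init__(self, text_dim=300, visual_dim=50, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(text_dim + visual_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))             # sentiment intensity in [-3, 3]

    def forward(self, text_feats, visual_feats):
        return self.net(torch.cat([text_feats, visual_feats], dim=-1))

model = FusionRegressor()
pred = model(torch.randn(4, 300), torch.randn(4, 50))
print(pred.shape)    # (4, 1) predicted sentiment intensities
```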