Paper Group AWR 12
Multi-Person Pose Estimation with Local Joint-to-Person Associations
Title | Multi-Person Pose Estimation with Local Joint-to-Person Associations |
Authors | Umar Iqbal, Juergen Gall |
Abstract | Despite the recent success of neural networks for human pose estimation, current approaches are limited to pose estimation of a single person and cannot handle humans in groups or crowds. In this work, we propose a method that estimates the poses of multiple persons in an image in which a person can be occluded by another person or might be truncated. To this end, we consider multi-person pose estimation as a joint-to-person association problem. We construct a fully connected graph from a set of detected joint candidates in an image and resolve the joint-to-person association and outlier detection using integer linear programming. Since solving joint-to-person association jointly for all persons in an image is an NP-hard problem and even approximations are expensive, we solve the problem locally for each person. On the challenging MPII Human Pose Dataset for multiple persons, our approach achieves the accuracy of a state-of-the-art method, but it is 6,000 to 19,000 times faster. |
Tasks | Multi-Person Pose Estimation, Outlier Detection, Pose Estimation |
Published | 2016-08-30 |
URL | http://arxiv.org/abs/1608.08526v2 |
PDF | http://arxiv.org/pdf/1608.08526v2.pdf |
PWC | https://paperswithcode.com/paper/multi-person-pose-estimation-with-local-joint |
Repo | https://github.com/MVIG-SJTU/RMPE |
Framework | torch |
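
The paper resolves joint-to-person association with a local integer linear program. As a rough illustration only, the sketch below swaps in a plain assignment problem (SciPy's Hungarian solver) for the ILP; candidates whose association cost stays high are dropped as outliers. The cost values and the 0.5 rejection threshold are made up.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(costs, reject_above=0.5):
    """One-to-one joint-candidate/person assignment (a toy stand-in
    for the paper's ILP, which lets each person collect a full
    skeleton of joints).

    costs: (num_candidates, num_persons) association costs, e.g. a
    distance penalty minus detection confidence.
    """
    cand, person = linear_sum_assignment(costs)
    keep = costs[cand, person] < reject_above  # outlier rejection
    return list(zip(cand[keep], person[keep]))

# Toy example: 4 joint candidates, 2 persons.
rng = np.random.default_rng(0)
print(associate(rng.random((4, 2))))
```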
Whitening-Free Least-Squares Non-Gaussian Component Analysis
Title | Whitening-Free Least-Squares Non-Gaussian Component Analysis |
Authors | Hiroaki Shiino, Hiroaki Sasaki, Gang Niu, Masashi Sugiyama |
Abstract | Non-Gaussian component analysis (NGCA) is an unsupervised linear dimension reduction method that extracts low-dimensional non-Gaussian “signals” from high-dimensional data contaminated with Gaussian noise. NGCA can be regarded as a generalization of projection pursuit (PP) and independent component analysis (ICA) to multi-dimensional and dependent non-Gaussian components. Indeed, seminal approaches to NGCA are based on PP and ICA. Recently, a novel NGCA approach called least-squares NGCA (LSNGCA) has been developed, which gives a solution analytically through least-squares estimation of log-density gradients and eigendecomposition. However, since pre-whitening of data is involved in LSNGCA, it performs unreliably when the data covariance matrix is ill-conditioned, which is often the case in high-dimensional data analysis. In this paper, we propose a whitening-free LSNGCA method and experimentally demonstrate its superiority. |
Tasks | Dimensionality Reduction |
Published | 2016-03-03 |
URL | http://arxiv.org/abs/1603.01029v2 |
PDF | http://arxiv.org/pdf/1603.01029v2.pdf |
PWC | https://paperswithcode.com/paper/whitening-free-least-squares-non-gaussian |
Repo | https://github.com/hgeno/WFLSNGCA |
Framework | none |
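
The abstract's key complaint — pre-whitening becomes unreliable when the data covariance is ill-conditioned — is easy to reproduce. The numpy sketch below performs the classical whitening step X_w = X Σ^{-1/2} on data with a nearly degenerate direction; it illustrates the motivation, not the proposed whitening-free estimator.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 10

# Data with a nearly degenerate direction -> ill-conditioned covariance.
X = rng.normal(size=(n, d))
X[:, -1] = X[:, 0] + 1e-4 * rng.normal(size=n)

cov = np.cov(X, rowvar=False)
print("condition number:", np.linalg.cond(cov))

# Classical pre-whitening step used by LSNGCA: X_w = X @ cov^{-1/2}.
# Inverting an ill-conditioned covariance amplifies estimation noise,
# which is exactly the failure mode the whitening-free method avoids.
eigval, eigvec = np.linalg.eigh(cov)
whitener = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T
X_w = X @ whitener
```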
Training Deep Networks for Facial Expression Recognition with Crowd-Sourced Label Distribution
Title | Training Deep Networks for Facial Expression Recognition with Crowd-Sourced Label Distribution |
Authors | Emad Barsoum, Cha Zhang, Cristian Canton Ferrer, Zhengyou Zhang |
Abstract | Crowdsourcing has become a widely adopted scheme to collect ground truth labels. However, it is a well-known problem that these labels can be very noisy. In this paper, we demonstrate how to learn a deep convolutional neural network (DCNN) from noisy labels, using facial expression recognition as an example. More specifically, we have 10 taggers label each input image, and compare four different approaches to utilizing the multiple labels: majority voting, multi-label learning, probabilistic label drawing, and cross-entropy loss. We show that the traditional majority voting scheme does not perform as well as the last two approaches that fully leverage the label distribution. An enhanced FER+ data set with multiple labels for each face image will also be shared with the research community. |
Tasks | Facial Expression Recognition, Multi-Label Learning |
Published | 2016-08-03 |
URL | http://arxiv.org/abs/1608.01041v2 |
PDF | http://arxiv.org/pdf/1608.01041v2.pdf |
PWC | https://paperswithcode.com/paper/training-deep-networks-for-facial-expression |
Repo | https://github.com/Microsoft/FERPlus |
Framework | none |
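
The contrast between majority voting and training on the full label distribution comes down to how the target vector is built from the 10 tagger votes. A minimal sketch — the vote counts are invented; FER+ uses 8 emotion categories:

```python
import numpy as np

NUM_CLASSES = 8  # FER+ uses 8 emotion categories

def majority_target(votes):
    """One-hot target from the most-voted class (majority voting)."""
    counts = np.bincount(votes, minlength=NUM_CLASSES)
    target = np.zeros(NUM_CLASSES)
    target[counts.argmax()] = 1.0
    return target

def distribution_target(votes):
    """Full label-distribution target (used with cross-entropy loss)."""
    counts = np.bincount(votes, minlength=NUM_CLASSES)
    return counts / counts.sum()

# 10 taggers labeled one face image (class indices are made up):
votes = np.array([0, 0, 0, 2, 2, 0, 5, 0, 2, 0])
print(majority_target(votes))      # puts all mass on class 0
print(distribution_target(votes))  # keeps the annotator disagreement
```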
Equality of Opportunity in Supervised Learning
Title | Equality of Opportunity in Supervised Learning |
Authors | Moritz Hardt, Eric Price, Nathan Srebro |
Abstract | We propose a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features. Assuming data about the predictor, target, and membership in the protected group are available, we show how to optimally adjust any learned predictor so as to remove discrimination according to our definition. Our framework also improves incentives by shifting the cost of poor classification from disadvantaged groups to the decision maker, who can respond by improving the classification accuracy. In line with other studies, our notion is oblivious: it depends only on the joint statistics of the predictor, the target, and the protected attribute, but not on the interpretation of individual features. We study the inherent limits of defining and identifying biases based on such oblivious measures, outlining what can and cannot be inferred from different oblivious tests. We illustrate our notion using a case study of FICO credit scores. |
Tasks | |
Published | 2016-10-07 |
URL | http://arxiv.org/abs/1610.02413v1 |
PDF | http://arxiv.org/pdf/1610.02413v1.pdf |
PWC | https://paperswithcode.com/paper/equality-of-opportunity-in-supervised |
Repo | https://github.com/stes/drk.ki-macht-schule |
Framework | tf |
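
The paper's post-processing idea — adjust a learned predictor per group so that qualified members of every group are accepted at the same rate — can be sketched as picking per-group score thresholds that equalize the true positive rate. This is a simplification of the paper's optimization (which derives the operating point from ROC curves), and the target TPR below is made up:

```python
import numpy as np

def equal_opportunity_thresholds(scores, y, group, target_tpr=0.8):
    """Per-group thresholds giving (approximately) equal true positive
    rates -- a simplified version of the paper's post-processing step."""
    thresholds = {}
    for g in np.unique(group):
        pos = scores[(group == g) & (y == 1)]
        # Accepting everything above this quantile of the positives'
        # scores admits a target_tpr fraction of qualified members.
        thresholds[g] = np.quantile(pos, 1.0 - target_tpr)
    return thresholds

# Toy check: two groups whose score distributions differ.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(0.6, 0.1, 100),
                         rng.normal(0.4, 0.1, 100)])
y = np.ones(200, dtype=int)
group = np.repeat([0, 1], 100)
print(equal_opportunity_thresholds(scores, y, group))
```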
vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design
Title | vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design |
Authors | Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, Stephen W. Keckler |
Abstract | The most widely used machine learning frameworks require users to carefully tune their memory usage so that the deep neural network (DNN) fits into the DRAM capacity of a GPU. This restriction hampers a researcher’s flexibility to study different machine learning algorithms, forcing them to either use a less desirable network architecture or parallelize the processing across multiple GPUs. We propose a runtime memory manager that virtualizes the memory usage of DNNs such that both GPU and CPU memory can simultaneously be utilized for training larger DNNs. Our virtualized DNN (vDNN) reduces the average GPU memory usage of AlexNet by up to 89%, OverFeat by 91%, and GoogLeNet by 95%, a significant reduction in memory requirements of DNNs. Similar experiments on VGG-16, one of the deepest and most memory-hungry DNNs to date, demonstrate the memory-efficiency of our proposal. vDNN enables VGG-16 with batch size 256 (requiring 28 GB of memory) to be trained on a single NVIDIA Titan X GPU card containing 12 GB of memory, with 18% performance loss compared to a hypothetical, oracular GPU with enough memory to hold the entire DNN. |
Tasks | |
Published | 2016-02-25 |
URL | http://arxiv.org/abs/1602.08124v3 |
PDF | http://arxiv.org/pdf/1602.08124v3.pdf |
PWC | https://paperswithcode.com/paper/vdnn-virtualized-deep-neural-networks-for |
Repo | https://github.com/adderbyte/MultiClassLabelSegTF |
Framework | tf |
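
The core vDNN idea — park activations in host memory between the forward and backward passes — can be mimicked in modern PyTorch with saved-tensor hooks. This is a toy stand-in, not the paper's system (which overlaps transfers with computation on separate CUDA streams), and it assumes a CUDA device is available:

```python
import torch

def pack_to_cpu(t):
    # Offload the saved activation to host memory right after forward.
    return t.to("cpu", non_blocking=True)

def unpack_to_gpu(t):
    # Bring it back to the GPU when backward needs it.
    return t.to("cuda", non_blocking=True)

model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096), torch.nn.ReLU(),
    torch.nn.Linear(4096, 4096),
).cuda()
x = torch.randn(64, 4096, device="cuda")

# Activations saved for backward live on the CPU between the forward
# and backward passes, freeing GPU memory for other layers.
with torch.autograd.graph.saved_tensors_hooks(pack_to_cpu, unpack_to_gpu):
    loss = model(x).sum()
loss.backward()
```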
Optimization for Large-Scale Machine Learning with Distributed Features and Observations
Title | Optimization for Large-Scale Machine Learning with Distributed Features and Observations |
Authors | Alexandros Nathan, Diego Klabjan |
Abstract | As the size of modern data sets exceeds the disk and memory capacities of a single computer, machine learning practitioners have resorted to parallel and distributed computing. Given that optimization is one of the pillars of machine learning and predictive modeling, distributed optimization methods have recently garnered ample attention in the literature. Although previous research has mostly focused on settings where either the observations or the features of the problem at hand are stored in distributed fashion, the situation where both are partitioned across the nodes of a computer cluster (doubly distributed) has barely been studied. In this work we propose two doubly distributed optimization algorithms. The first one falls under the umbrella of distributed dual coordinate ascent methods, while the second one belongs to the class of stochastic gradient/coordinate descent hybrid methods. We conduct numerical experiments in Spark using real-world and simulated data sets and study the scaling properties of our methods. Our empirical evaluation of the proposed algorithms demonstrates that they outperform a block-distributed ADMM method, which, to the best of our knowledge, is the only other existing doubly distributed optimization algorithm. |
Tasks | Distributed Optimization |
Published | 2016-10-31 |
URL | http://arxiv.org/abs/1610.10060v2 |
PDF | http://arxiv.org/pdf/1610.10060v2.pdf |
PWC | https://paperswithcode.com/paper/optimization-for-large-scale-machine-learning |
Repo | https://github.com/anathan90/RADiSA |
Framework | none |
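
"Doubly distributed" means the data matrix is partitioned across both observations (rows) and features (columns), one block per node. The partitioning itself is simple to sketch; the proposed algorithms (dual coordinate ascent and the SGD/coordinate-descent hybrid) are not reproduced here:

```python
import numpy as np

def doubly_partition(X, row_blocks, col_blocks):
    """Split a data matrix across both observations and features.

    Returns a row_blocks x col_blocks grid of sub-matrices; in the
    doubly distributed setting each sub-matrix lives on its own node.
    """
    rows = np.array_split(np.arange(X.shape[0]), row_blocks)
    cols = np.array_split(np.arange(X.shape[1]), col_blocks)
    return [[X[np.ix_(r, c)] for c in cols] for r in rows]

X = np.arange(24).reshape(6, 4)
grid = doubly_partition(X, row_blocks=3, col_blocks=2)
print(grid[0][1])  # the block holding rows 0-1, columns 2-3
```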
Accelerating Exact and Approximate Inference for (Distributed) Discrete Optimization with GPUs
Title | Accelerating Exact and Approximate Inference for (Distributed) Discrete Optimization with GPUs |
Authors | Ferdinando Fioretto, Enrico Pontelli, William Yeoh, Rina Dechter |
Abstract | Discrete optimization is a central problem in artificial intelligence. The optimization of the aggregated cost of a network of cost functions arises in a variety of problems including (W)CSP, DCOP, as well as optimization in stochastic variants such as the tasks of finding the most probable explanation (MPE) in belief networks. Inference-based algorithms are powerful techniques for solving discrete optimization problems, which can be used independently or in combination with other techniques. However, their applicability is often limited by their compute-intensive nature and their space requirements. This paper proposes the design and implementation of a novel inference-based technique, which exploits modern massively parallel architectures, such as those found in Graphics Processing Units (GPUs), to speed up the resolution of exact and approximate inference-based algorithms for discrete optimization. The paper studies the proposed algorithm in both centralized and distributed optimization contexts. The paper demonstrates that the use of GPUs provides significant advantages in terms of runtime and scalability, achieving up to two orders of magnitude in speedups and showing a considerable reduction in execution time (up to 345 times faster) with respect to a sequential version. |
Tasks | Distributed Optimization |
Published | 2016-08-18 |
URL | http://arxiv.org/abs/1608.05288v2 |
PDF | http://arxiv.org/pdf/1608.05288v2.pdf |
PWC | https://paperswithcode.com/paper/accelerating-exact-and-approximate-inference |
Repo | https://github.com/nandofioretto/GpuBE |
Framework | none |
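
Inference-based discrete optimization boils down to repeatedly combining cost tables and minimizing out variables, an operation where every entry of the combined table is independent — exactly what a GPU parallelizes. A numpy sketch of one (min, +) elimination step, with made-up ternary cost functions:

```python
import numpy as np

def eliminate_variable(cost_tables, axis):
    """Combine cost tables and min-out one variable.

    cost_tables: arrays of identical shape, one dimension per variable
    in the bucket. Summing combines the cost functions; the min over
    `axis` eliminates that variable. On a GPU, every entry of the
    combined table can be computed in parallel.
    """
    combined = sum(cost_tables)
    return combined.min(axis=axis)

# Two cost functions over ternary variables (x, y); eliminate y.
rng = np.random.default_rng(0)
f1 = rng.integers(0, 10, size=(3, 3))
f2 = rng.integers(0, 10, size=(3, 3))
print(eliminate_variable([f1, f2], axis=1))  # optimal cost per x
```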
Using the Output Embedding to Improve Language Models
Title | Using the Output Embedding to Improve Language Models |
Authors | Ofir Press, Lior Wolf |
Abstract | We study the topmost weight matrix of neural network language models. We show that this matrix constitutes a valid word embedding. When training language models, we recommend tying the input embedding and this output embedding. We analyze the resulting update rules and show that the tied embedding evolves in a more similar way to the output embedding than to the input embedding in the untied model. We also offer a new method of regularizing the output embedding. Our methods lead to a significant reduction in perplexity, as we are able to show on a variety of neural network language models. Finally, we show that weight tying can reduce the size of neural translation models to less than half of their original size without harming their performance. |
Tasks | |
Published | 2016-08-20 |
URL | http://arxiv.org/abs/1608.05859v3 |
PDF | http://arxiv.org/pdf/1608.05859v3.pdf |
PWC | https://paperswithcode.com/paper/using-the-output-embedding-to-improve |
Repo | https://github.com/floydhub/word-language-model |
Framework | pytorch |
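
The recommended tying of input and output embeddings is a one-line change in a PyTorch language model: point the decoder's weight at the embedding matrix. This requires the embedding and hidden sizes to match (or a projection in between). A minimal sketch:

```python
import torch.nn as nn

class TiedLM(nn.Module):
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.decoder = nn.Linear(hidden_size, vocab_size, bias=False)
        # Weight tying: input and output embeddings share one matrix,
        # which regularizes the model and shrinks it substantially.
        self.decoder.weight = self.embed.weight

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.decoder(h)  # logits over the vocabulary
```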
Incorporating Copying Mechanism in Sequence-to-Sequence Learning
Title | Incorporating Copying Mechanism in Sequence-to-Sequence Learning |
Authors | Jiatao Gu, Zhengdong Lu, Hang Li, Victor O. K. Li |
Abstract | We address an important problem in sequence-to-sequence (Seq2Seq) learning referred to as copying, in which certain segments in the input sequence are selectively replicated in the output sequence. A similar phenomenon is observable in human language communication. For example, humans tend to repeat entity names or even long phrases in conversation. The challenge with regard to copying in Seq2Seq is that new machinery is needed to decide when to perform the operation. In this paper, we incorporate copying into neural network-based Seq2Seq learning and propose a new model called CopyNet with an encoder-decoder structure. CopyNet can nicely integrate the regular way of word generation in the decoder with the new copying mechanism which can choose sub-sequences in the input sequence and put them at proper places in the output sequence. Our empirical study on both synthetic data sets and real-world data sets demonstrates the efficacy of CopyNet. For example, CopyNet can outperform regular RNN-based models by remarkable margins on text summarization tasks. |
Tasks | Text Summarization |
Published | 2016-03-21 |
URL | http://arxiv.org/abs/1603.06393v3 |
PDF | http://arxiv.org/pdf/1603.06393v3.pdf |
PWC | https://paperswithcode.com/paper/incorporating-copying-mechanism-in-sequence |
Repo | https://github.com/adamklec/copynet |
Framework | pytorch |
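
The heart of the model is a decoder step that mixes a generation distribution over the vocabulary with a copy distribution over source positions. The sketch below is a simplified pointer-style mixture with a fixed 50/50 gate — CopyNet's actual scoring is richer and the gate is learned — and all tensor shapes are hypothetical:

```python
import torch

def mix_generate_and_copy(gen_logits, copy_scores, src_ids):
    """Combine generation and copying into one output distribution.

    gen_logits:  (batch, vocab_size) scores for generating each word
    copy_scores: (batch, src_len) scores for copying each source token
    src_ids:     (batch, src_len) vocabulary ids of the source tokens
    """
    p_gen = torch.softmax(gen_logits, dim=-1)
    p_copy = torch.softmax(copy_scores, dim=-1)
    # Scatter copy probabilities onto the vocabulary positions of the
    # source tokens, so copied words compete with generated ones.
    # A learned gate would replace the fixed 0.5 mixture weights.
    mixed = 0.5 * p_gen
    mixed = mixed.scatter_add(1, src_ids, 0.5 * p_copy)
    return mixed  # (batch, vocab_size), rows sum to 1
```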
We used Neural Networks to Detect Clickbaits: You won’t believe what happened Next!
Title | We used Neural Networks to Detect Clickbaits: You won’t believe what happened Next! |
Authors | Ankesh Anand, Tanmoy Chakraborty, Noseong Park |
Abstract | Online content publishers often use catchy headlines for their articles in order to attract users to their websites. These headlines, popularly known as clickbaits, exploit a user’s curiosity gap and lure them to click on links that often disappoint them. Existing methods for automatically detecting clickbaits rely on heavy feature engineering and domain knowledge. Here, we introduce a neural network architecture based on Recurrent Neural Networks for detecting clickbaits. Our model relies on distributed word representations learned from large unannotated corpora, and character embeddings learned via Convolutional Neural Networks. Experimental results on a dataset of news headlines show that our model outperforms existing techniques for clickbait detection with an accuracy of 0.98, an F1-score of 0.98, and a ROC-AUC of 0.99. |
Tasks | Clickbait Detection, Feature Engineering |
Published | 2016-12-05 |
URL | https://arxiv.org/abs/1612.01340v2 |
PDF | https://arxiv.org/pdf/1612.01340v2.pdf |
PWC | https://paperswithcode.com/paper/we-used-neural-networks-to-detect-clickbaits |
Repo | https://github.com/ankeshanand/deep-clickbait-detection |
Framework | tf |
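
Stripped to its essentials, the architecture is a recurrent network over pretrained word embeddings with a classification head. The sketch below is a minimal BiLSTM variant; the full model also feeds in character embeddings learned by a CNN, and all dimensions here are invented:

```python
import torch.nn as nn

class ClickbaitRNN(nn.Module):
    """Minimal BiLSTM headline classifier in the spirit of the paper."""
    def __init__(self, vocab_size, embed_dim=300, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.clf = nn.Linear(2 * hidden, 1)  # clickbait logit

    def forward(self, tokens):
        out, _ = self.lstm(self.embed(tokens))
        return self.clf(out[:, -1])  # logit from the last time step
```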
Human pose estimation via Convolutional Part Heatmap Regression
Title | Human pose estimation via Convolutional Part Heatmap Regression |
Authors | Adrian Bulat, Georgios Tzimiropoulos |
Abstract | This paper is on human pose estimation using Convolutional Neural Networks. Our main contribution is a CNN cascaded architecture specifically designed for learning part relationships and spatial context, and robustly inferring pose even for the case of severe part occlusions. To this end, we propose a detection-followed-by-regression CNN cascade. The first part of our cascade outputs part detection heatmaps and the second part performs regression on these heatmaps. The benefits of the proposed architecture are multi-fold: It guides the network where to focus in the image and effectively encodes part constraints and context. More importantly, it can effectively cope with occlusions because part detection heatmaps for occluded parts provide low confidence scores which subsequently guide the regression part of our network to rely on contextual information in order to predict the location of these parts. Additionally, we show that the proposed cascade is flexible enough to readily allow the integration of various CNN architectures for both detection and regression, including recent ones based on residual learning. Finally, we illustrate that our cascade achieves top performance on the MPII and LSP data sets. Code can be downloaded from http://www.cs.nott.ac.uk/~psxab5/ |
Tasks | Pose Estimation |
Published | 2016-09-06 |
URL | http://arxiv.org/abs/1609.01743v1 |
PDF | http://arxiv.org/pdf/1609.01743v1.pdf |
PWC | https://paperswithcode.com/paper/human-pose-estimation-via-convolutional-part |
Repo | https://github.com/1adrianb/human-pose-estimation |
Framework | torch |
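
The detection-followed-by-regression design can be sketched as two stacked subnetworks, where the regressor sees the image together with the detector's part heatmaps, so occluded parts can be filled in from context. A toy version with invented channel sizes — the paper uses much deeper, residual-style stages:

```python
import torch
import torch.nn as nn

class DetectThenRegress(nn.Module):
    """Toy detection-followed-by-regression cascade."""
    def __init__(self, num_parts=16):
        super().__init__()
        self.detector = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, num_parts, 1),      # part detection heatmaps
        )
        self.regressor = nn.Sequential(
            nn.Conv2d(3 + num_parts, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, num_parts, 1),      # regressed heatmaps
        )

    def forward(self, img):
        det = self.detector(img)
        # The regressor conditions on both the image and the detection
        # heatmaps; low-confidence (occluded) parts push it to rely on
        # the surrounding context.
        return self.regressor(torch.cat([img, det], dim=1))
```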
3D Fully Convolutional Network for Vehicle Detection in Point Cloud
Title | 3D Fully Convolutional Network for Vehicle Detection in Point Cloud |
Authors | Bo Li |
Abstract | 2D fully convolutional networks have recently been applied successfully to object detection from images. In this paper, we extend fully convolutional network based detection techniques to 3D and apply them to point cloud data. The proposed approach is verified on the task of vehicle detection from lidar point cloud for autonomous driving. Experiments on the KITTI dataset show a significant performance improvement over the previous point cloud based detection approaches. |
Tasks | Autonomous Driving, Object Detection |
Published | 2016-11-24 |
URL | http://arxiv.org/abs/1611.08069v2 |
PDF | http://arxiv.org/pdf/1611.08069v2.pdf |
PWC | https://paperswithcode.com/paper/3d-fully-convolutional-network-for-vehicle |
Repo | https://github.com/s10803926/3D-Object-detection-from-Pointcloud |
Framework | tf |
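
Moving FCN detection from 2D to 3D starts with turning the lidar point cloud into a dense grid that 3D convolutions can consume. A minimal occupancy-grid voxelizer; the grid resolution and range below are illustrative, not the paper's:

```python
import numpy as np

def voxelize(points, grid=(100, 100, 20),
             extent=((0, 70), (-40, 40), (-2, 2))):
    """Turn an (N, 3) lidar point cloud into a binary occupancy grid,
    the input representation for a 3D fully convolutional network."""
    vol = np.zeros(grid, dtype=np.float32)
    lo = np.array([e[0] for e in extent], dtype=np.float64)
    hi = np.array([e[1] for e in extent], dtype=np.float64)
    # Keep points inside the extent, then map them to voxel indices.
    keep = np.all((points >= lo) & (points < hi), axis=1)
    idx = ((points[keep] - lo) / (hi - lo) * np.array(grid)).astype(int)
    vol[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return vol
```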
DrMAD: Distilling Reverse-Mode Automatic Differentiation for Optimizing Hyperparameters of Deep Neural Networks
Title | DrMAD: Distilling Reverse-Mode Automatic Differentiation for Optimizing Hyperparameters of Deep Neural Networks |
Authors | Jie Fu, Hongyin Luo, Jiashi Feng, Kian Hsiang Low, Tat-Seng Chua |
Abstract | The performance of deep neural networks is well-known to be sensitive to the setting of their hyperparameters. Recent advances in reverse-mode automatic differentiation allow for optimizing hyperparameters with gradients. The standard way of computing these gradients involves a forward and backward pass of computations. However, the backward pass usually requires an unaffordable amount of memory to store all the intermediate variables needed to exactly reverse the forward training procedure. In this work we propose a simple but effective method, DrMAD, to distill the knowledge of the forward pass into a shortcut path, through which we approximately reverse the training trajectory. Experiments on several image benchmark datasets show that DrMAD is at least 45 times faster and consumes 100 times less memory compared to state-of-the-art methods for optimizing hyperparameters, with minimal compromise to its effectiveness. To the best of our knowledge, DrMAD is the first research attempt to make it practical to automatically tune thousands of hyperparameters of deep neural networks. The code can be downloaded from https://github.com/bigaidream-projects/drmad |
Tasks | |
Published | 2016-01-05 |
URL | http://arxiv.org/abs/1601.00917v5 |
PDF | http://arxiv.org/pdf/1601.00917v5.pdf |
PWC | https://paperswithcode.com/paper/drmad-distilling-reverse-mode-automatic |
Repo | https://github.com/bigaidream-projects/drmad |
Framework | none |
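
DrMAD's shortcut replaces the stored training trajectory with a straight line between the initial and final weights; the reverse pass for hypergradients walks this line instead of the true iterates. The interpolation itself fits in a few lines of numpy:

```python
import numpy as np

def shortcut_weights(theta0, thetaT, num_steps):
    """DrMAD's core approximation: instead of storing every iterate of
    training, reconstruct intermediate weights as points on a straight
    line between the initial (theta0) and final (thetaT) parameters.
    The reverse pass for hypergradients then walks this shortcut."""
    for t in range(num_steps, 0, -1):
        beta = t / num_steps
        yield (1.0 - beta) * theta0 + beta * thetaT

theta0 = np.zeros(5)
thetaT = np.ones(5)
for theta in shortcut_weights(theta0, thetaT, num_steps=4):
    print(theta)  # walks from thetaT back toward theta0
```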
Real-Time Anomaly Detection for Streaming Analytics
Title | Real-Time Anomaly Detection for Streaming Analytics |
Authors | Subutai Ahmad, Scott Purdy |
Abstract | Much of the world's data is streaming, time-series data, where anomalies give significant information in critical situations. Yet detecting anomalies in streaming data is a difficult task, requiring detectors to process data in real-time, and learn while simultaneously making predictions. We present a novel anomaly detection technique based on an on-line sequence memory algorithm called Hierarchical Temporal Memory (HTM). We show results from a live application that detects anomalies in financial metrics in real-time. We also test the algorithm on NAB, a published benchmark for real-time anomaly detection, where our algorithm achieves best-in-class results. |
Tasks | Anomaly Detection, Time Series |
Published | 2016-07-08 |
URL | http://arxiv.org/abs/1607.02480v1 |
PDF | http://arxiv.org/pdf/1607.02480v1.pdf |
PWC | https://paperswithcode.com/paper/real-time-anomaly-detection-for-streaming |
Repo | https://github.com/SudeepSarkar/matlabHTM |
Framework | none |
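
While HTM itself is involved, the step that turns raw prediction errors into a usable anomaly signal — modeling the recent error distribution and scoring new values by how far they sit in its tail — can be sketched directly. This is a simplification of the anomaly-likelihood computation from the authors' NAB work; the window size and warm-up length below are arbitrary:

```python
import math
from collections import deque

class AnomalyLikelihood:
    """Simplified anomaly likelihood: model the recent distribution of
    raw anomaly scores and flag values far in its tail. The published
    method uses windowed averages of the scores instead."""
    def __init__(self, window=100):
        self.scores = deque(maxlen=window)

    def update(self, raw_score):
        self.scores.append(raw_score)
        n = len(self.scores)
        if n < 10:
            return 0.0  # not enough history yet
        mean = sum(self.scores) / n
        var = sum((s - mean) ** 2 for s in self.scores) / n
        std = math.sqrt(var) or 1e-6
        # Gaussian CDF of the newest score: close to 1 when it sits
        # far above the recent mean, i.e. likely anomalous.
        z = (raw_score - mean) / std
        return 1.0 - 0.5 * math.erfc(z / math.sqrt(2))
```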
MOSI: Multimodal Corpus of Sentiment Intensity and Subjectivity Analysis in Online Opinion Videos
Title | MOSI: Multimodal Corpus of Sentiment Intensity and Subjectivity Analysis in Online Opinion Videos |
Authors | Amir Zadeh, Rowan Zellers, Eli Pincus, Louis-Philippe Morency |
Abstract | People are sharing their opinions, stories and reviews through online video sharing websites every day. Studying sentiment and subjectivity in these opinion videos is attracting growing attention from academia and industry. While sentiment analysis has been successful for text, it is an understudied research question for videos and multimedia content. The biggest setbacks for studies in this direction are the lack of a proper dataset, methodology, baselines, and statistical analysis of how information from different modality sources relates to each other. This paper introduces to the scientific community the first opinion-level annotated corpus of sentiment and subjectivity analysis in online videos called Multimodal Opinion-level Sentiment Intensity dataset (MOSI). The dataset is rigorously annotated with labels for subjectivity, sentiment intensity, per-frame and per-opinion annotated visual features, and per-millisecond annotated audio features. Furthermore, we present baselines for future studies in this direction as well as a new multimodal fusion approach that jointly models spoken words and visual gestures. |
Tasks | Sentiment Analysis, Subjectivity Analysis |
Published | 2016-06-20 |
URL | http://arxiv.org/abs/1606.06259v2 |
PDF | http://arxiv.org/pdf/1606.06259v2.pdf |
PWC | https://paperswithcode.com/paper/mosi-multimodal-corpus-of-sentiment-intensity |
Repo | https://github.com/soujanyaporia/multimodal-sentiment-analysis |
Framework | tf |
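
The paper's fusion approach jointly models spoken words and gestures; the simplest baseline in that spirit embeds each modality and predicts sentiment intensity from the concatenated features. A minimal late-fusion sketch with invented feature dimensions — MOSI's intensity labels range over [-3, 3]:

```python
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    """Minimal multimodal fusion baseline: embed each modality, then
    regress sentiment intensity from the concatenated features. The
    paper's fusion approach is more elaborate."""
    def __init__(self, text_dim=300, visual_dim=64, hidden=128):
        super().__init__()
        self.text = nn.Linear(text_dim, hidden)
        self.visual = nn.Linear(visual_dim, hidden)
        self.head = nn.Linear(2 * hidden, 1)  # intensity in [-3, 3]

    def forward(self, text_feat, visual_feat):
        h = torch.cat([torch.relu(self.text(text_feat)),
                       torch.relu(self.visual(visual_feat))], dim=-1)
        return self.head(h)
```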