May 6, 2019

Paper Group ANR 434

A Parallel Memory-efficient Epistemic Logic Program Solver: Harder, Better, Faster. Fast-AT: Fast Automatic Thumbnail Generation using Deep Neural Networks. Multi-way Particle Swarm Fusion. Predicting Counterfactuals from Large Historical Data and Small Randomized Trials. The Future of Data Analysis in the Neurosciences. Inverse Reinforcement Learn …

A Parallel Memory-efficient Epistemic Logic Program Solver: Harder, Better, Faster


Title	A Parallel Memory-efficient Epistemic Logic Program Solver: Harder, Better, Faster
Authors	Patrick Thor Kahl, Anthony P. Leclerc, Tran Cao Son
Abstract	As the practical use of answer set programming (ASP) has grown with the development of efficient solvers, we expect a growing interest in extensions of ASP as their semantics stabilize and solvers supporting them mature. Epistemic Specifications, which adds modal operators K and M to the language of ASP, is one such extension. We call a program in this language an epistemic logic program (ELP). Solvers have thus far been practical for only the simplest ELPs due to exponential growth of the search space. We describe a solver that is able to solve harder problems better (e.g., without exponentially-growing memory needs w.r.t. K and M occurrences) and faster than any other known ELP solver.
Tasks
Published	2016-08-24
URL	http://arxiv.org/abs/1608.06910v2
PDF	http://arxiv.org/pdf/1608.06910v2.pdf
PWC	https://paperswithcode.com/paper/a-parallel-memory-efficient-epistemic-logic
Repo
Framework

Fast-AT: Fast Automatic Thumbnail Generation using Deep Neural Networks


Title	Fast-AT: Fast Automatic Thumbnail Generation using Deep Neural Networks
Authors	Seyed A. Esmaeili, Bharat Singh, Larry S. Davis
Abstract	Fast-AT is an automatic thumbnail generation system based on deep neural networks. It is a fully-convolutional deep neural network, which learns specific filters for thumbnails of different sizes and aspect ratios. During inference, the appropriate filter is selected depending on the dimensions of the target thumbnail. Unlike most previous work, Fast-AT does not utilize saliency but addresses the problem directly. In addition, it eliminates the need to conduct region search on the saliency map. The model generalizes to thumbnails of different sizes including those with extreme aspect ratios and can generate thumbnails in real time. A data set of more than 70,000 thumbnail annotations was collected to train Fast-AT. We show competitive results in comparison to existing techniques.
Tasks
Published	2016-12-14
URL	http://arxiv.org/abs/1612.04811v2
PDF	http://arxiv.org/pdf/1612.04811v2.pdf
PWC	https://paperswithcode.com/paper/fast-at-fast-automatic-thumbnail-generation
Repo
Framework

Multi-way Particle Swarm Fusion


Title	Multi-way Particle Swarm Fusion
Authors	Chen Liu, Hang Yan, Pushmeet Kohli, Yasutaka Furukawa
Abstract	This paper proposes a novel MAP inference framework for Markov Random Fields (MRFs) in parallel computing environments. The inference framework, dubbed Swarm Fusion, is a natural generalization of the Fusion Move method. Every thread (in the case of multi-threading environments) maintains and updates a solution. At each iteration, a thread can generate an arbitrary number of solution proposals and take an arbitrary number of concurrent solutions from the other threads to perform multi-way fusion in updating its solution. The framework is general, making popular existing inference techniques such as alpha-expansion, fusion move, parallel alpha-expansion, and hierarchical fusion its special cases. We have evaluated the effectiveness of our approach against competing methods on three problems of varying difficulty: stereo, optical flow, and layered depthmap estimation.
Tasks	Optical Flow Estimation
Published	2016-12-05
URL	http://arxiv.org/abs/1612.01234v1
PDF	http://arxiv.org/pdf/1612.01234v1.pdf
PWC	https://paperswithcode.com/paper/multi-way-particle-swarm-fusion
Repo
Framework

Predicting Counterfactuals from Large Historical Data and Small Randomized Trials


Title	Predicting Counterfactuals from Large Historical Data and Small Randomized Trials
Authors	Nir Rosenfeld, Yishay Mansour, Elad Yom-Tov
Abstract	When a new treatment is considered for use, whether a pharmaceutical drug or a search engine ranking algorithm, a typical question that arises is, will its performance exceed that of the current treatment? The conventional way to answer this counterfactual question is to estimate the effect of the new treatment in comparison to that of the conventional treatment by running a controlled, randomized experiment. While this approach theoretically ensures an unbiased estimator, it suffers from several drawbacks, including the difficulty in finding representative experimental populations as well as the cost of running such trials. Moreover, such trials neglect the huge quantities of available control-condition data which are often completely ignored. In this paper we propose a discriminative framework for estimating the performance of a new treatment given a large dataset of the control condition and data from a small (and possibly unrepresentative) randomized trial comparing new and old treatments. Our objective, which requires minimal assumptions on the treatments, models the relation between the outcomes of the different conditions. This allows us to not only estimate mean effects but also to generate individual predictions for examples outside the randomized sample. We demonstrate the utility of our approach through experiments in three areas: Search engine operation, treatments to diabetes patients, and market value estimation for houses. Our results demonstrate that our approach can reduce the number and size of the currently performed randomized controlled experiments, thus saving significant time, money and effort on the part of practitioners.
Tasks
Published	2016-10-24
URL	http://arxiv.org/abs/1610.07667v2
PDF	http://arxiv.org/pdf/1610.07667v2.pdf
PWC	https://paperswithcode.com/paper/predicting-counterfactuals-from-large
Repo
Framework

The Future of Data Analysis in the Neurosciences


Title	The Future of Data Analysis in the Neurosciences
Authors	Danilo Bzdok, B. T. Thomas Yeo
Abstract	Neuroscience is undergoing faster changes than ever before. Over 100 years our field qualitatively described and invasively manipulated single or few organisms to gain anatomical, physiological, and pharmacological insights. In the last 10 years neuroscience spawned quantitative big-sample datasets on microanatomy, synaptic connections, optogenetic brain-behavior assays, and high-level cognition. While growing data availability and information granularity have been amply discussed, we direct attention to a routinely neglected question: How will the unprecedented data richness shape data analysis practices? Statistical reasoning is becoming more central to distill neurobiological knowledge from healthy and pathological brain recordings. We believe that large-scale data analysis will use more models that are non-parametric, generative, mixing frequentist and Bayesian aspects, and grounded in different statistical inferences.
Tasks
Published	2016-08-05
URL	http://arxiv.org/abs/1608.03465v1
PDF	http://arxiv.org/pdf/1608.03465v1.pdf
PWC	https://paperswithcode.com/paper/the-future-of-data-analysis-in-the
Repo
Framework

Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics


Title	Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics
Authors	Michael Herman, Tobias Gindele, Jörg Wagner, Felix Schmitt, Wolfram Burgard
Abstract	Inverse Reinforcement Learning (IRL) describes the problem of learning an unknown reward function of a Markov Decision Process (MDP) from observed behavior of an agent. Since the agent’s behavior originates in its policy and MDP policies depend on both the stochastic system dynamics as well as the reward function, the solution of the inverse problem is significantly influenced by both. Current IRL approaches assume that if the transition model is unknown, additional samples from the system’s dynamics are accessible, or the observed behavior provides enough samples of the system’s dynamics to solve the inverse problem accurately. These assumptions are often not satisfied. To overcome this, we present a gradient-based IRL approach that simultaneously estimates the system’s dynamics. By solving the combined optimization problem, our approach takes into account the bias of the demonstrations, which stems from the generating policy. The evaluation on a synthetic MDP and a transfer learning task shows improvements regarding the sample efficiency as well as the accuracy of the estimated reward functions and transition models.
Tasks	Transfer Learning
Published	2016-04-13
URL	http://arxiv.org/abs/1604.03912v1
PDF	http://arxiv.org/pdf/1604.03912v1.pdf
PWC	https://paperswithcode.com/paper/inverse-reinforcement-learning-with-1
Repo
Framework

An Enhanced Deep Feature Representation for Person Re-identification


Title	An Enhanced Deep Feature Representation for Person Re-identification
Authors	Shangxuan Wu, Ying-Cong Chen, Xiang Li, An-Cong Wu, Jin-Jie You, Wei-Shi Zheng
Abstract	Feature representation and metric learning are two critical components in person re-identification models. In this paper, we focus on the feature representation and claim that hand-crafted histogram features can be complementary to Convolutional Neural Network (CNN) features. We propose a novel feature extraction model called Feature Fusion Net (FFN) for pedestrian image representation. In FFN, back propagation makes CNN features constrained by the hand-crafted features. Utilizing color histogram features (RGB, HSV, YCbCr, Lab and YIQ) and texture features (multi-scale and multi-orientation Gabor features), we get a new deep feature representation that is more discriminative and compact. Experiments on three challenging datasets (VIPeR, CUHK01, PRID450s) validate the effectiveness of our proposal.
Tasks	Metric Learning, Person Re-Identification
Published	2016-04-26
URL	http://arxiv.org/abs/1604.07807v2
PDF	http://arxiv.org/pdf/1604.07807v2.pdf
PWC	https://paperswithcode.com/paper/an-enhanced-deep-feature-representation-for
Repo
Framework

Efficient Branching Cascaded Regression for Face Alignment under Significant Head Rotation


Title	Efficient Branching Cascaded Regression for Face Alignment under Significant Head Rotation
Authors	Brandon M. Smith, Charles R. Dyer
Abstract	Despite much interest in face alignment in recent years, the large majority of work has focused on near-frontal faces. Algorithms typically break down on profile faces, or are too slow for real-time applications. In this work we propose an efficient approach to face alignment that can handle 180 degrees of head rotation in a unified way (e.g., without resorting to view-based models) using 2D training data. The foundation of our approach is cascaded shape regression (CSR), which has emerged recently as the leading strategy. We propose a generalization of conventional CSRs that we call branching cascaded regression (BCR). Conventional CSRs are single-track; that is, they progress from one cascade level to the next in a straight line, with each regressor attempting to fit the entire dataset. We instead split the regression problem into two or more simpler ones after each cascade level. Intuitively, each regressor can then operate on a simpler objective function (i.e., with fewer conflicting gradient directions). Within the BCR framework, we model and infer pose-related landmark visibility and face shape simultaneously using Structured Point Distribution Models (SPDMs). We propose to learn task-specific feature mapping functions that are adaptive to landmark visibility, and that use SPDM parameters as regression targets instead of 2D landmark coordinates. Additionally, we introduce a new in-the-wild dataset of profile faces to validate our approach.
Tasks	Face Alignment
Published	2016-11-05
URL	http://arxiv.org/abs/1611.01584v2
PDF	http://arxiv.org/pdf/1611.01584v2.pdf
PWC	https://paperswithcode.com/paper/efficient-branching-cascaded-regression-for
Repo
Framework

A moment-matching Ferguson and Klass algorithm


Title	A moment-matching Ferguson and Klass algorithm
Authors	Julyan Arbel, Igor Prünster
Abstract	Completely random measures (CRMs) represent the key building block of a wide variety of popular stochastic models and play a pivotal role in modern Bayesian Nonparametrics. A popular representation of CRMs as a random series with decreasing jumps is due to Ferguson and Klass (1972). This can immediately be turned into an algorithm for sampling realizations of CRMs or more elaborate models involving transformed CRMs. However, concrete implementation requires truncating the random series at some threshold, resulting in an approximation error. The goal of this paper is to quantify the quality of the approximation by a moment-matching criterion, which consists in evaluating a measure of discrepancy between actual moments and moments based on the simulation output. Seen as a function of the truncation level, the methodology can be used to determine the truncation level needed to reach a certain level of precision. The resulting moment-matching Ferguson and Klass algorithm is then implemented and illustrated on several popular Bayesian nonparametric models.
Tasks
Published	2016-06-08
URL	http://arxiv.org/abs/1606.02566v1
PDF	http://arxiv.org/pdf/1606.02566v1.pdf
PWC	https://paperswithcode.com/paper/a-moment-matching-ferguson-and-klass
Repo
Framework

Analysis and Optimization of Loss Functions for Multiclass, Top-k, and Multilabel Classification


Title	Analysis and Optimization of Loss Functions for Multiclass, Top-k, and Multilabel Classification
Authors	Maksim Lapin, Matthias Hein, Bernt Schiele
Abstract	Top-k error is currently a popular performance measure on large scale image classification benchmarks such as ImageNet and Places. Despite its wide acceptance, our understanding of this metric is limited as most of the previous research is focused on its special case, the top-1 error. In this work, we explore two directions that shed more light on the top-k error. First, we provide an in-depth analysis of established and recently proposed single-label multiclass methods along with a detailed account of efficient optimization algorithms for them. Our results indicate that the softmax loss and the smooth multiclass SVM are surprisingly competitive in top-k error uniformly across all k, which can be explained by our analysis of multiclass top-k calibration. Further improvements for a specific k are possible with a number of proposed top-k loss functions. Second, we use the top-k methods to explore the transition from multiclass to multilabel learning. In particular, we find that it is possible to obtain effective multilabel classifiers on Pascal VOC using a single label per image for training, while the gap between multiclass and multilabel methods on MS COCO is more significant. Finally, our contribution of efficient algorithms for training with the considered top-k and multilabel loss functions is of independent interest.
Tasks	Calibration, Image Classification
Published	2016-12-12
URL	http://arxiv.org/abs/1612.03663v1
PDF	http://arxiv.org/pdf/1612.03663v1.pdf
PWC	https://paperswithcode.com/paper/analysis-and-optimization-of-loss-functions
Repo
Framework
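
The top-k error discussed in the abstract above can be illustrated with a small sketch. This is not the authors' code: the function name `top_k_error`, the tie-breaking rule (lower class index wins), and the toy scores are assumptions made only for illustration. An example counts as an error when its true label is not among the k highest-scoring classes, so top-1 error is the special case k = 1.

```python
from typing import Sequence


def top_k_error(scores: Sequence[Sequence[float]],
                labels: Sequence[int],
                k: int) -> float:
    """Fraction of examples whose true label is not among the k top-scoring classes."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and of equal length")

    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; ties go to the lower index
        # (an assumption of this sketch, not something the abstract specifies).
        ranked = sorted(range(len(row)), key=lambda c: (-row[c], c))
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Toy data: four examples, three classes (hypothetical values).
scores = [
    [0.1, 0.7, 0.2],  # true class 1: correct at k = 1
    [0.5, 0.3, 0.2],  # true class 1: wrong at k = 1, correct at k = 2
    [0.6, 0.3, 0.1],  # true class 2: wrong at k = 1 and k = 2
    [0.2, 0.2, 0.6],  # true class 2: correct at k = 1
]
labels = [1, 1, 2, 2]

for k in (1, 2, 3):
    print(k, top_k_error(scores, labels, k))
```

The error can only stay the same or fall as k grows, since each top-k set contains the previous one.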

Recurrent Human Pose Estimation


Title	Recurrent Human Pose Estimation
Authors	Vasileios Belagiannis, Andrew Zisserman
Abstract	We propose a novel ConvNet model for predicting 2D human body poses in an image. The model regresses a heatmap representation for each body keypoint, and is able to learn and represent both the part appearances and the context of the part configuration. We make the following three contributions: (i) an architecture combining a feed forward module with a recurrent module, where the recurrent module can be run iteratively to improve the performance, (ii) the model can be trained end-to-end and from scratch, with auxiliary losses incorporated to improve performance, (iii) we investigate whether keypoint visibility can also be predicted. The model is evaluated on two benchmark datasets. The result is a simple architecture that achieves performance on par with the state of the art, but without the complexity of a graphical model stage (or layers).
Tasks	Pose Estimation
Published	2016-05-10
URL	http://arxiv.org/abs/1605.02914v3
PDF	http://arxiv.org/pdf/1605.02914v3.pdf
PWC	https://paperswithcode.com/paper/recurrent-human-pose-estimation
Repo
Framework
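
As a rough illustration of the heatmap representation mentioned in the abstract above, the sketch below reads a keypoint location off a single heatmap by taking its highest-valued cell. The abstract does not say how coordinates are extracted from the heatmaps, so the argmax decoding, the function name `decode_keypoint`, and the toy values are all assumptions.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # rows indexed by y, columns by x


def decode_keypoint(heatmap: Heatmap) -> Tuple[int, int, float]:
    """Return (x, y, score) of the highest-valued cell in one keypoint heatmap."""
    if not heatmap or not heatmap[0]:
        raise ValueError("heatmap must be a non-empty 2D grid")

    best_x, best_y, best_score = 0, 0, heatmap[0][0]
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            # Strict comparison keeps the first maximum in scan order on ties.
            if value > best_score:
                best_x, best_y, best_score = x, y, value
    return best_x, best_y, best_score


# Hypothetical 3x3 heatmap for a single body keypoint.
toy_heatmap = [
    [0.0, 0.1, 0.0],
    [0.2, 0.9, 0.1],
    [0.0, 0.3, 0.0],
]
print(decode_keypoint(toy_heatmap))
```

In a multi-keypoint model one such heatmap would be produced per keypoint and decoded independently.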

Multi-task Recurrent Model for Speech and Speaker Recognition


Title	Multi-task Recurrent Model for Speech and Speaker Recognition
Authors	Zhiyuan Tang, Lantian Li, Dong Wang
Abstract	Although highly correlated, speech and speaker recognition have been regarded as two independent tasks and studied by two communities. This is certainly not the way that people behave: we decipher both speech content and speaker traits at the same time. This paper presents a unified model that performs speech and speaker recognition simultaneously. The model is based on a unified neural network where the output of one task is fed to the input of the other, leading to a multi-task recurrent network. Experiments show that the joint model outperforms the task-specific models on both tasks.
Tasks	Speaker Recognition
Published	2016-03-31
URL	http://arxiv.org/abs/1603.09643v4
PDF	http://arxiv.org/pdf/1603.09643v4.pdf
PWC	https://paperswithcode.com/paper/multi-task-recurrent-model-for-speech-and
Repo
Framework

Criteria of efficiency for conformal prediction


Title	Criteria of efficiency for conformal prediction
Authors	Vladimir Vovk, Ilia Nouretdinov, Valentina Fedorova, Ivan Petej, Alex Gammerman
Abstract	We study optimal conformity measures for various criteria of efficiency of classification in an idealised setting. This leads to an important class of criteria of efficiency that we call probabilistic; it turns out that the most standard criteria of efficiency used in literature on conformal prediction are not probabilistic unless the problem of classification is binary. We consider both unconditional and label-conditional conformal prediction.
Tasks
Published	2016-03-14
URL	http://arxiv.org/abs/1603.04416v2
PDF	http://arxiv.org/pdf/1603.04416v2.pdf
PWC	https://paperswithcode.com/paper/criteria-of-efficiency-for-conformal
Repo
Framework

Automatic Interpretation of Unordered Point Cloud Data for UAV Navigation in Construction


Title	Automatic Interpretation of Unordered Point Cloud Data for UAV Navigation in Construction
Authors	M. D. Phung, C. H. Quach, D. T. Chu, N. Q. Nguyen, T. H. Dinh, Q. P. Ha
Abstract	The objective of this work is to develop a data processing system that can automatically generate waypoints for navigation of an unmanned aerial vehicle (UAV) to inspect surfaces of structures like buildings and bridges. The input includes data recorded by two 2D laser scanners, orthogonally mounted on the UAV, and an inertial measurement unit (IMU). To achieve the goal, algorithms are developed to process the data collected. They are separated into three major groups: (i) the data registration and filtering to generate a 3D model of the structure and control the density of point clouds for data completeness enhancement; (ii) the surface and obstacle detection to assist the UAV in monitoring tasks; and (iii) the waypoint generation to set the flight path. Experiments on different data sets show that the developed system is able to reconstruct a 3D point cloud of the structure, extract its surfaces and objects, and generate waypoints for the UAV to accomplish inspection tasks.
Tasks
Published	2016-12-23
URL	http://arxiv.org/abs/1612.07850v2
PDF	http://arxiv.org/pdf/1612.07850v2.pdf
PWC	https://paperswithcode.com/paper/automatic-interpretation-of-unordered-point
Repo
Framework

ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA


Title	ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA
Authors	Song Han, Junlong Kang, Huizi Mao, Yiming Hu, Xin Li, Yubin Li, Dongliang Xie, Hong Luo, Song Yao, Yu Wang, Huazhong Yang, William J. Dally
Abstract	Long Short-Term Memory (LSTM) is widely used in speech recognition. In order to achieve higher prediction accuracy, machine learning scientists have built larger and larger models. Such large models are both computation intensive and memory intensive. Deploying such bulky models results in high power consumption and leads to high total cost of ownership (TCO) of a data center. In order to speed up the prediction and make it energy efficient, we first propose a load-balance-aware pruning method that can compress the LSTM model size by 20x (10x from pruning and 2x from quantization) with negligible loss of the prediction accuracy. The pruned model is friendly for parallel processing. Next, we propose a scheduler that encodes and partitions the compressed model to each processing element (PE) for parallelism, and schedules the complicated LSTM data flow. Finally, we design the hardware architecture, named Efficient Speech Recognition Engine (ESE), that works directly on the compressed model. Implemented on a Xilinx XCKU060 FPGA running at 200MHz, ESE has a performance of 282 GOPS working directly on the compressed LSTM network, corresponding to 2.52 TOPS on the uncompressed one, and processes a full LSTM for speech recognition with a power dissipation of 41 Watts. Evaluated on the LSTM for speech recognition benchmark, ESE is 43x and 3x faster than Core i7 5930k CPU and Pascal Titan X GPU implementations. It achieves 40x and 11.5x higher energy efficiency compared with the CPU and GPU respectively.
Tasks	Quantization, Speech Recognition
Published	2016-12-01
URL	http://arxiv.org/abs/1612.00694v2
PDF	http://arxiv.org/pdf/1612.00694v2.pdf
PWC	https://paperswithcode.com/paper/ese-efficient-speech-recognition-engine-with
Repo
Framework
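
The figures quoted in the abstract above combine in a simple way, which the short sketch below spells out. It only re-derives ratios from numbers stated in the abstract; the function names are hypothetical and nothing here reflects the ESE implementation itself. The two compression factors multiply (10x from pruning times 2x from quantization gives 20x), and 282 GOPS on the compressed network against 2.52 TOPS of equivalent uncompressed work implies that each operation on the compressed model stands in for roughly nine dense operations.

```python
def compression_ratio(pruning_factor: float, quantization_factor: float) -> float:
    """Overall model-size reduction when the two factors apply independently."""
    return pruning_factor * quantization_factor


def effective_throughput_ratio(compressed_gops: float,
                               uncompressed_equiv_gops: float) -> float:
    """Dense-equivalent operations represented by each compressed-model operation."""
    return uncompressed_equiv_gops / compressed_gops


# Numbers taken from the abstract: 10x pruning, 2x quantization,
# 282 GOPS compressed, 2.52 TOPS (= 2520 GOPS) uncompressed equivalent.
total = compression_ratio(10, 2)
ratio = effective_throughput_ratio(282.0, 2520.0)

print(f"overall compression: {total:.0f}x")
print(f"dense-equivalent work per compressed op: {ratio:.2f}x")
```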