May 6, 2019

3006 words, 15 min read

Paper Group ANR 175



Efficient Multi-view Performance Capture of Fine-Scale Surface Detail

Title Efficient Multi-view Performance Capture of Fine-Scale Surface Detail
Authors Nadia Robertini, Edilson De Aguiar, Thomas Helten, Christian Theobalt
Abstract We present a new and effective way to capture the performance of deforming meshes with fine-scale, time-varying surface detail from multi-view video. Our method builds on coarse 4D surface reconstructions obtained with commonly used template-based methods. Since these capture only coarse-to-medium-scale detail, fine-scale deformation detail is typically recovered in a second pass using stereo constraints, features, or shading-based refinement. In this paper, we propose a new, effective, and stable solution to this second step. Our framework creates an implicit representation of the deformable mesh using a dense collection of 3D Gaussian functions on the surface, and a set of 2D Gaussians for the images. The fine-scale deformation of all mesh vertices that maximizes photo-consistency can be found efficiently by densely optimizing a new model-to-image consistency energy over all vertex positions. A principal advantage is that our problem formulation yields a smooth, closed-form energy with implicit occlusion handling and analytic derivatives. It requires neither error-prone correspondence finding nor discrete sampling of surface displacement values. We show several reconstructions of human subjects wearing loose clothing, and we demonstrate qualitatively and quantitatively that we robustly capture more detail than related methods.
Tasks
Published 2016-02-05
URL http://arxiv.org/abs/1602.02023v1
PDF http://arxiv.org/pdf/1602.02023v1.pdf
PWC https://paperswithcode.com/paper/efficient-multi-view-performance-capture-of
Repo
Framework
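The closed-form consistency energy itself is not spelled out in the abstract, so the sketch below only illustrates the core idea: photo-consistency as the summed overlap of model-side and image-side Gaussians, which is smooth in the vertex positions. The function names, the isotropic `sigma`, and the toy 2D setting are my assumptions, not the paper's formulation.

```python
import numpy as np

def gaussian_overlap_energy(model_pts, model_colors, img_pts, img_colors, sigma=1.0):
    """Toy model-to-image consistency energy: the sum of pairwise Gaussian
    overlaps between (projected) model Gaussians and image Gaussians,
    weighted by color similarity. Being a sum of exponentials of squared
    distances, it is smooth in model_pts with closed-form derivatives."""
    d2 = ((model_pts[:, None, :] - img_pts[None, :, :]) ** 2).sum(-1)
    spatial = np.exp(-d2 / (2.0 * sigma ** 2))
    c2 = ((model_colors[:, None, :] - img_colors[None, :, :]) ** 2).sum(-1)
    color = np.exp(-c2 / (2.0 * sigma ** 2))
    return float((spatial * color).sum())
```

Because every term is differentiable, a dense gradient-based optimizer can move all vertices at once; this smoothness with analytic derivatives is the property the abstract highlights.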

Hierarchy through Composition with Linearly Solvable Markov Decision Processes

Title Hierarchy through Composition with Linearly Solvable Markov Decision Processes
Authors Andrew M. Saxe, Adam Earle, Benjamin Rosman
Abstract Hierarchical architectures are critical to the scalability of reinforcement learning methods. Current hierarchical frameworks execute actions serially, with macro-actions comprising sequences of primitive actions. We propose a novel alternative to these control hierarchies based on concurrent execution of many actions in parallel. Our scheme uses the concurrent compositionality provided by the linearly solvable Markov decision process (LMDP) framework, which naturally enables a learning agent to draw on several macro-actions simultaneously to solve new tasks. We introduce the Multitask LMDP module, which maintains a parallel distributed representation of tasks and may be stacked to form deep hierarchies abstracted in space and time.
Tasks
Published 2016-12-08
URL http://arxiv.org/abs/1612.02757v1
PDF http://arxiv.org/pdf/1612.02757v1.pdf
PWC https://paperswithcode.com/paper/hierarchy-through-composition-with-linearly
Repo
Framework
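The LMDP property the abstract relies on, that optimal solutions compose linearly, can be illustrated on a toy first-exit chain. This is a generic LMDP sketch under my own naming, not the authors' Multitask LMDP module: the desirability function z satisfies a linear fixed-point equation, so the solution for a weighted combination of terminal (exp-transformed) rewards is the same weighted combination of the individual solutions.

```python
import numpy as np

def solve_lmdp(P, q, is_boundary, z_boundary, iters=500):
    """First-exit LMDP: iterate z = exp(q) * (P z) on interior states while
    clamping boundary states to their exp-transformed terminal rewards.
    The fixed point is the optimal desirability function."""
    z = np.ones(len(q))
    z[is_boundary] = z_boundary
    for _ in range(iters):
        z = np.exp(q) * (P @ z)
        z[is_boundary] = z_boundary
    return z
```

Because the update is linear in z and the boundary condition enters linearly, desirability functions for different terminal rewards superpose exactly; this is the concurrent compositionality that lets an agent draw on several macro-actions simultaneously.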

Efficient Pairwise Learning Using Kernel Ridge Regression: an Exact Two-Step Method

Title Efficient Pairwise Learning Using Kernel Ridge Regression: an Exact Two-Step Method
Authors Michiel Stock, Tapio Pahikkala, Antti Airola, Bernard De Baets, Willem Waegeman
Abstract Pairwise learning or dyadic prediction concerns the prediction of properties for pairs of objects. It can be seen as an umbrella covering various machine learning problems such as matrix completion, collaborative filtering, multi-task learning, transfer learning, network prediction and zero-shot learning. In this work we analyze kernel-based methods for pairwise learning, with a particular focus on a recently-suggested two-step method. We show that this method offers an appealing alternative for commonly-applied Kronecker-based methods that model dyads by means of pairwise feature representations and pairwise kernels. In a series of theoretical results, we establish correspondences between the two types of methods in terms of linear algebra and spectral filtering, and we analyze their statistical consistency. In addition, the two-step method allows us to establish novel algorithmic shortcuts for efficient training and validation on very large datasets. Putting those properties together, we believe that this simple, yet powerful method can become a standard tool for many problems. Extensive experimental results for a range of practical settings are reported.
Tasks Matrix Completion, Multi-Task Learning, Transfer Learning, Zero-Shot Learning
Published 2016-06-14
URL http://arxiv.org/abs/1606.04275v1
PDF http://arxiv.org/pdf/1606.04275v1.pdf
PWC https://paperswithcode.com/paper/efficient-pairwise-learning-using-kernel
Repo
Framework
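The two-step method is easy to sketch: one kernel ridge regression over the first object set followed by one over the second, giving closed-form predictions for new dyads. The variable names and the tiny linear-kernel example are mine; see the paper for the exact estimators and the algorithmic shortcuts.

```python
import numpy as np

def two_step_krr(Ku, Kv, Y, lam_u=0.1, lam_v=0.1):
    """Two-step kernel ridge regression for pairwise (dyadic) prediction.
    Step 1 smooths the label matrix Y over the first object set (kernel Ku),
    step 2 over the second (kernel Kv). Returns the coefficient matrix W
    such that f(u, v) = ku_new @ W @ kv_new.T for new pairs."""
    n, m = Y.shape
    Au = np.linalg.solve(Ku + lam_u * np.eye(n), Y)       # (Ku + lam I)^-1 Y
    W = np.linalg.solve(Kv + lam_v * np.eye(m), Au.T).T   # ... (Kv + lam I)^-1
    return W

def predict(W, ku_new, kv_new):
    """Predict labels for all pairs of new rows (ku_new) and columns (kv_new)."""
    return ku_new @ W @ kv_new.T
```

With tiny regularization and full-rank kernels, the in-sample predictions reproduce Y, which makes the interpolation behavior of the two-step estimator easy to check.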

Hybrid Learning of Optical Flow and Next Frame Prediction to Boost Optical Flow in the Wild

Title Hybrid Learning of Optical Flow and Next Frame Prediction to Boost Optical Flow in the Wild
Authors Nima Sedaghat, Mohammadreza Zolfaghari, Thomas Brox
Abstract CNN-based optical flow estimation has attracted attention recently, mainly due to its impressively high frame rates. These networks perform well on synthetic datasets, but they are still far behind the classical methods in real-world videos. This is because there is no ground truth optical flow for training these networks on real data. In this paper, we boost CNN-based optical flow estimation in real scenes with the help of the freely available self-supervised task of next-frame prediction. To this end, we train the network in a hybrid way, providing it with a mixture of synthetic and real videos. With the help of a sample-variant multi-tasking architecture, the network is trained on different tasks depending on the availability of ground-truth. We also experiment with the prediction of “next-flow” instead of estimation of the current flow, which is intuitively closer to the task of next-frame prediction and yields favorable results. We demonstrate the improvement in optical flow estimation on the real-world KITTI benchmark. Additionally, we test the optical flow indirectly in an action classification scenario. As a side product of this work, we report significant improvements over state-of-the-art in the task of next-frame prediction.
Tasks Action Classification, Optical Flow Estimation
Published 2016-12-12
URL http://arxiv.org/abs/1612.03777v2
PDF http://arxiv.org/pdf/1612.03777v2.pdf
PWC https://paperswithcode.com/paper/hybrid-learning-of-optical-flow-and-next
Repo
Framework

Distance Metric Learning for Aspect Phrase Grouping

Title Distance Metric Learning for Aspect Phrase Grouping
Authors Shufeng Xiong, Yue Zhang, Donghong Ji, Yinxia Lou
Abstract Aspect phrase grouping is an important task in aspect-level sentiment analysis. It is a challenging problem due to polysemy and context dependency. We propose an Attention-based Deep Distance Metric Learning (ADDML) method that considers both aspect phrase representation and context representation. First, leveraging the characteristics of the review text, we automatically generate aspect phrase sample pairs for distant supervision. Second, we feed word embeddings of aspect phrases and their contexts into an attention-based neural network to learn feature representations of the contexts. Both the aspect phrase embedding and the context embedding are used to learn a deep feature subspace in which distances between aspect phrases are measured for K-means clustering. Experiments on four review datasets show that the proposed method outperforms strong state-of-the-art baseline methods.
Tasks Metric Learning, Sentiment Analysis, Word Embeddings
Published 2016-04-29
URL http://arxiv.org/abs/1604.08672v2
PDF http://arxiv.org/pdf/1604.08672v2.pdf
PWC https://paperswithcode.com/paper/distance-metric-learning-for-aspect-phrase
Repo
Framework

Sub-Optimal Multi-Phase Path Planning: A Method for Solving Rubik’s Revenge

Title Sub-Optimal Multi-Phase Path Planning: A Method for Solving Rubik’s Revenge
Authors Jared Weed
Abstract Rubik’s Revenge, a 4x4x4 variant of the Rubik’s puzzles, remains unsolved to date. That is to say, we have no method or categorization that optimally solves every one of its approximately $7.401 \times 10^{45}$ possible configurations. Rubik’s Cube, Rubik’s Revenge’s 3x3x3 predecessor, with its approximately $4.33 \times 10^{19}$ possible configurations, was only recently completely solved by Rokicki et al., who further showed that no configuration requires more than 20 moves. Given the sheer size of Rubik’s Revenge’s configuration space, a brute-force search for all optimal solutions would be futile. Similar to the methods used by Rokicki et al. on Rubik’s Cube, in this paper we develop a method for solving arbitrary configurations of Rubik’s Revenge in phases, using a combination of the powerful IDA* search algorithm and a useful definition of distance in the cube space. While time-series results were not successfully gathered, we show that this method far outperforms current human solving methods and can be used to determine loose upper bounds for the cube space. We suggest that the method can also be applied to other puzzles with the proper transformations.
Tasks Time Series
Published 2016-01-20
URL http://arxiv.org/abs/1601.05744v1
PDF http://arxiv.org/pdf/1601.05744v1.pdf
PWC https://paperswithcode.com/paper/sub-optimal-multi-phase-path-planning-a
Repo
Framework
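IDA* is a standard algorithm, so a compact sketch on a much smaller puzzle, the 8-puzzle with a Manhattan-distance heuristic, shows the iterative-deepening structure the paper applies (with a cube-specific distance) to Rubik's Revenge. The puzzle choice and state encoding here are mine, not the paper's.

```python
def manhattan(state, goal):
    """Sum of tile distances to their goal cells; state is a 9-tuple, 0 = blank."""
    dist = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        g = goal.index(tile)
        dist += abs(i // 3 - g // 3) + abs(i % 3 - g % 3)
    return dist

def neighbors(state):
    """All states reachable by sliding one tile into the blank."""
    b = state.index(0)
    r, c = divmod(b, 3)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            s = list(state)
            nb = nr * 3 + nc
            s[b], s[nb] = s[nb], s[b]
            yield tuple(s)

def ida_star(start, goal):
    """Iterative-deepening A*: depth-first search bounded by f = g + h,
    raising the bound to the smallest overrun after each failed pass."""
    path = [start]

    def search(g, bound):
        state = path[-1]
        f = g + manhattan(state, goal)
        if f > bound:
            return f
        if state == goal:
            return True
        minimum = float("inf")
        for nxt in neighbors(state):
            if nxt in path:           # avoid cycles on the current path
                continue
            path.append(nxt)
            t = search(g + 1, bound)
            if t is True:
                return True
            minimum = min(minimum, t)
            path.pop()
        return minimum

    bound = manhattan(start, goal)
    while True:
        t = search(0, bound)
        if t is True:
            return path
        if t == float("inf"):
            return None
        bound = t
```

Because memory use is linear in the solution depth, the same scheme scales to cube spaces where A*'s open list would not fit, which is why IDA* is the solver of choice for Rubik-style puzzles.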

BioSpaun: A large-scale behaving brain model with complex neurons

Title BioSpaun: A large-scale behaving brain model with complex neurons
Authors Chris Eliasmith, Jan Gosmann, Xuan Choo
Abstract We describe a large-scale functional brain model that includes detailed, conductance-based, compartmental models of individual neurons. We call the model BioSpaun, to indicate the increased biological plausibility of these neurons, and because it is a direct extension of the Spaun model \cite{Eliasmith2012b}. We demonstrate that including these detailed compartmental models does not adversely affect performance across a variety of tasks, including digit recognition, serial working memory, and counting. We then explore the effects of applying TTX, a sodium channel blocking drug, to the model. We characterize the behavioral changes that result from this molecular level intervention. We believe this is the first demonstration of a large-scale brain model that clearly links low-level molecular interventions and high-level behavior.
Tasks
Published 2016-02-16
URL http://arxiv.org/abs/1602.05220v1
PDF http://arxiv.org/pdf/1602.05220v1.pdf
PWC https://paperswithcode.com/paper/biospaun-a-large-scale-behaving-brain-model
Repo
Framework

Error Asymmetry in Causal and Anticausal Regression

Title Error Asymmetry in Causal and Anticausal Regression
Authors Patrick Blöbaum, Takashi Washio, Shohei Shimizu
Abstract It is generally difficult to make any statements about the expected prediction error in a univariate setting without further knowledge about how the data were generated. Recent work showed that knowledge about the real underlying causal structure of a data generation process has implications for various machine learning settings. Assuming additive noise and independence between the data-generating mechanism and its input, we draw a novel connection between the intrinsic causal relationship of two variables and the expected prediction error. We prove a theorem stating that the expected error, when the true data-generating function is used as the prediction model, is generally smaller when the effect is predicted from its cause and, conversely, greater when the cause is predicted from its effect. The theorem implies an asymmetry in the error depending on the prediction direction. This is further corroborated with empirical evaluations on artificial and real-world datasets.
Tasks
Published 2016-10-11
URL http://arxiv.org/abs/1610.03263v2
PDF http://arxiv.org/pdf/1610.03263v2.pdf
PWC https://paperswithcode.com/paper/error-asymmetry-in-causal-and-anticausal
Repo
Framework
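The asymmetry is easy to reproduce numerically. In the sketch below (my own toy setup, not the paper's experiments), the effect is y = x^3 + n with an invertible f: predicting the effect from the cause with the true f leaves only the noise variance, while predicting the cause from the effect with f^{-1} amplifies the noise wherever f is flat.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(-1, 1, n)          # cause
noise = rng.normal(0, 0.05, n)
y = x ** 3 + noise                 # effect: y = f(x) + n, f invertible

# Causal direction: predict the effect from the cause with the true f.
# The residual is exactly the noise, so the MSE is Var(n) = 0.05^2.
mse_causal = float(np.mean((y - x ** 3) ** 2))

# Anticausal direction: predict the cause from the effect with f^{-1}.
# Near x = 0 the inverse cube root has unbounded slope and blows the
# noise up, so the MSE is much larger than Var(n).
mse_anticausal = float(np.mean((x - np.cbrt(y)) ** 2))
```

The same true mechanism thus yields systematically different errors in the two directions, which is the asymmetry the theorem formalizes.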

Fully Convolutional Attention Networks for Fine-Grained Recognition

Title Fully Convolutional Attention Networks for Fine-Grained Recognition
Authors Xiao Liu, Tian Xia, Jiang Wang, Yi Yang, Feng Zhou, Yuanqing Lin
Abstract Fine-grained recognition is challenging due to its subtle local inter-class differences versus large intra-class variations such as poses. A key to address this problem is to localize discriminative parts to extract pose-invariant features. However, ground-truth part annotations can be expensive to acquire. Moreover, it is hard to define parts for many fine-grained classes. This work introduces Fully Convolutional Attention Networks (FCANs), a reinforcement learning framework to optimally glimpse local discriminative regions adaptive to different fine-grained domains. Compared to previous methods, our approach enjoys three advantages: 1) the weakly-supervised reinforcement learning procedure requires no expensive part annotations; 2) the fully-convolutional architecture speeds up both training and testing; 3) the greedy reward strategy accelerates the convergence of the learning. We demonstrate the effectiveness of our method with extensive experiments on four challenging fine-grained benchmark datasets, including CUB-200-2011, Stanford Dogs, Stanford Cars and Food-101.
Tasks
Published 2016-03-22
URL http://arxiv.org/abs/1603.06765v4
PDF http://arxiv.org/pdf/1603.06765v4.pdf
PWC https://paperswithcode.com/paper/fully-convolutional-attention-networks-for
Repo
Framework

Learning and Free Energies for Vector Approximate Message Passing

Title Learning and Free Energies for Vector Approximate Message Passing
Authors Alyson K. Fletcher, Philip Schniter
Abstract Vector approximate message passing (VAMP) is a computationally simple approach to the recovery of a signal $\mathbf{x}$ from noisy linear measurements $\mathbf{y}=\mathbf{Ax}+\mathbf{w}$. Like the AMP proposed by Donoho, Maleki, and Montanari in 2009, VAMP is characterized by a rigorous state evolution (SE) that holds for certain large random matrices and matches the replica prediction of optimality. But while AMP’s SE holds only for large i.i.d. sub-Gaussian $\mathbf{A}$, VAMP’s SE holds for the much larger class of right-rotationally invariant $\mathbf{A}$. To run VAMP, however, one must specify the statistical parameters of the signal and noise. This work combines VAMP with Expectation-Maximization to yield an algorithm, EM-VAMP, that can jointly recover $\mathbf{x}$ while learning those statistical parameters. The fixed points of the proposed EM-VAMP algorithm are shown to be stationary points of a certain constrained free energy, providing a variational interpretation of the algorithm. Numerical simulations show that EM-VAMP is robust to highly ill-conditioned $\mathbf{A}$, with performance nearly matching oracle-parameter VAMP.
Tasks
Published 2016-02-26
URL http://arxiv.org/abs/1602.08207v4
PDF http://arxiv.org/pdf/1602.08207v4.pdf
PWC https://paperswithcode.com/paper/learning-and-free-energies-for-vector
Repo
Framework

Procedural Generation of Videos to Train Deep Action Recognition Networks

Title Procedural Generation of Videos to Train Deep Action Recognition Networks
Authors César Roberto de Souza, Adrien Gaidon, Yohann Cabon, Antonio Manuel López Peña
Abstract Deep learning for human action recognition in videos is making significant progress, but is slowed down by its dependency on expensive manual labeling of large video collections. In this work, we investigate the generation of synthetic training data for action recognition, as it has recently shown promising results for a variety of other computer vision tasks. We propose an interpretable parametric generative model of human action videos that relies on procedural generation and other computer graphics techniques of modern game engines. We generate a diverse, realistic, and physically plausible dataset of human action videos, called PHAV for “Procedural Human Action Videos”. It contains a total of 39,982 videos, with more than 1,000 examples for each action of 35 categories. Our approach is not limited to existing motion capture sequences, and we procedurally define 14 synthetic actions. We introduce a deep multi-task representation learning architecture to mix synthetic and real videos, even if the action categories differ. Our experiments on the UCF101 and HMDB51 benchmarks suggest that combining our large set of synthetic videos with small real-world datasets can boost recognition performance, significantly outperforming fine-tuning state-of-the-art unsupervised generative models of videos.
Tasks Action Recognition In Videos, Motion Capture, Representation Learning, Temporal Action Localization
Published 2016-12-02
URL http://arxiv.org/abs/1612.00881v2
PDF http://arxiv.org/pdf/1612.00881v2.pdf
PWC https://paperswithcode.com/paper/procedural-generation-of-videos-to-train-deep
Repo
Framework

Combinatorial Aspects of the Distribution of Rough Objects

Title Combinatorial Aspects of the Distribution of Rough Objects
Authors A. Mani
Abstract The inverse problem of general rough sets, considered by the present author in some of her earlier papers, is, in one of its manifestations, essentially the question of when an agent’s view of crisp and non-crisp objects over a set of objects has a rough evolution. In this work, the nature of the problem is examined from number-theoretic and combinatorial perspectives under very few assumptions about the nature of the data, and some necessary conditions are proved.
Tasks
Published 2016-05-05
URL http://arxiv.org/abs/1605.01778v2
PDF http://arxiv.org/pdf/1605.01778v2.pdf
PWC https://paperswithcode.com/paper/combinatorial-aspects-of-the-distribution-of
Repo
Framework

A study on tuning parameter selection for the high-dimensional lasso

Title A study on tuning parameter selection for the high-dimensional lasso
Authors Darren Homrighausen, Daniel J. McDonald
Abstract High-dimensional predictive models, those with more measurements than observations, require regularization to be well defined, perform well empirically, and possess theoretical guarantees. The amount of regularization, often determined by tuning parameters, is integral to achieving good performance. One can choose the tuning parameter in a variety of ways, such as through resampling methods or generalized information criteria. However, the theory supporting many regularized procedures relies on an estimate for the variance parameter, which is complicated in high dimensions. We develop a suite of information criteria for choosing the tuning parameter in lasso regression by leveraging the literature on high-dimensional variance estimation. We derive intuition showing that existing information-theoretic approaches work poorly in this setting. We compare our risk estimators to existing methods with an extensive simulation and derive some theoretical justification. We find that our new estimators perform well across a wide range of simulation conditions and evaluation criteria.
Tasks
Published 2016-02-04
URL https://arxiv.org/abs/1602.01522v2
PDF https://arxiv.org/pdf/1602.01522v2.pdf
PWC https://paperswithcode.com/paper/risk-estimation-for-high-dimensional-lasso
Repo
Framework
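A minimal version of tuning-parameter selection by an information criterion over a lasso path can be sketched as follows. This is a generic BIC-style criterion with the degrees of freedom estimated by the number of nonzero coefficients, not the variance-estimation-based criteria the paper develops, and the ISTA solver and grid are my own simplifications.

```python
import numpy as np

def lasso_ista(X, y, lam, steps=500):
    """Lasso via proximal gradient (ISTA) for the objective
    (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    L = np.linalg.norm(X, 2) ** 2 / n          # Lipschitz constant of the gradient
    for _ in range(steps):
        grad = X.T @ (X @ beta - y) / n
        z = beta - grad / L
        beta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return beta

def bic_select(X, y, lams):
    """Choose lambda by a BIC-style criterion over a grid, using the
    number of nonzero coefficients as the lasso degrees of freedom."""
    n = len(y)
    best = None
    for lam in lams:
        beta = lasso_ista(X, y, lam)
        rss = np.sum((y - X @ beta) ** 2)
        df = np.count_nonzero(beta)
        crit = n * np.log(rss / n + 1e-12) + np.log(n) * df
        if best is None or crit < best[0]:
            best = (crit, lam, beta)
    return best[1], best[2]
```

The paper's point is that the variance term hidden inside such criteria is hard to estimate when p exceeds n; this sketch shows the selection machinery, not their proposed variance estimators.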

Morphological Constraints for Phrase Pivot Statistical Machine Translation

Title Morphological Constraints for Phrase Pivot Statistical Machine Translation
Authors Ahmed El Kholy, Nizar Habash
Abstract The lack of parallel data for many language pairs is an important challenge for statistical machine translation (SMT). One common solution is to pivot through a third language for which parallel corpora with the source and target languages exist. Although pivoting is a robust technique, it introduces some low-quality translations, especially when a morphologically poor language is used as the pivot between morphologically rich languages. In this paper, we examine the use of synchronous morphology constraint features to improve the quality of phrase-pivot SMT. We compare hand-crafted constraints to those learned from limited parallel data between the source and target languages. The learned morphology constraints are based on projected alignments between the source and target phrases in the pivot phrase table. We show positive results on Hebrew-Arabic SMT (pivoting on English): 1.5 BLEU points over a phrase-pivot baseline and 0.8 BLEU points over a system-combination baseline with a direct model built from parallel data.
Tasks Machine Translation
Published 2016-09-12
URL http://arxiv.org/abs/1609.03376v1
PDF http://arxiv.org/pdf/1609.03376v1.pdf
PWC https://paperswithcode.com/paper/morphological-constraints-for-phrase-pivot
Repo
Framework

Improved Spoken Document Summarization with Coverage Modeling Techniques

Title Improved Spoken Document Summarization with Coverage Modeling Techniques
Authors Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang
Abstract Extractive summarization aims to select a set of indicative sentences from a source document as a summary that expresses the major theme of the document. A general consensus on extractive summarization is that both relevance and coverage are critical issues to address. Existing methods designed to model coverage can be characterized as either reducing redundancy or increasing diversity in the summary. Maximal marginal relevance (MMR) is a widely cited method, since it takes both relevance and redundancy into account when generating a summary for a given document. Beyond MMR, there is, as far as we are aware, a dearth of research concentrating on reducing redundancy or increasing diversity for the spoken document summarization task. Motivated by these observations, this paper makes two major contributions. First, in contrast to MMR, which addresses coverage by reducing redundancy, we propose two novel coverage-based methods that directly increase diversity. With the proposed methods, a set of representative sentences, which are not only relevant to the given document but also cover most of its important sub-themes, can be selected automatically. Second, we go a step further and plug several document/sentence representation methods into the proposed framework to further enhance summarization performance. A series of empirical evaluations demonstrates the effectiveness of the proposed methods.
Tasks Document Summarization
Published 2016-01-20
URL http://arxiv.org/abs/1601.05194v1
PDF http://arxiv.org/pdf/1601.05194v1.pdf
PWC https://paperswithcode.com/paper/improved-spoken-document-summarization-with
Repo
Framework
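For reference, the MMR baseline the authors contrast against is short enough to sketch directly: greedy selection that trades relevance to the document against redundancy with what is already selected. The vector interface and the lambda default here are my choices, not the paper's setup.

```python
import numpy as np

def mmr_select(sent_vecs, doc_vec, k=3, lam=0.7):
    """Maximal marginal relevance: greedily pick up to k sentences that are
    relevant to the document vector but dissimilar to already-selected ones.
    lam = 1 reduces to pure relevance ranking; smaller lam penalizes redundancy."""
    def cos(a, b):
        na, nb = np.linalg.norm(a), np.linalg.norm(b)
        return float(a @ b / (na * nb)) if na and nb else 0.0

    selected = []
    candidates = list(range(len(sent_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max((cos(sent_vecs[i], sent_vecs[j]) for j in selected),
                             default=0.0)
            return lam * cos(sent_vecs[i], doc_vec) - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Given two near-duplicate relevant sentences and one off-topic but complementary one, MMR picks one duplicate and then the complementary sentence, which is exactly the redundancy-reduction view of coverage the paper argues should be complemented by diversity-increasing methods.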