October 20, 2019

Paper Group ANR 10

Incentivizing the Dynamic Workforce: Learning Contracts in the Gig-Economy. Learning to Attend Relevant Regions in Videos from Eye Fixations. Multimodal Emoji Prediction. The Exact Equivalence of Distance and Kernel Methods for Hypothesis Testing. Overcoming catastrophic forgetting problem by weight consolidation and long-term memory. Knowledge Int …

Incentivizing the Dynamic Workforce: Learning Contracts in the Gig-Economy


Title	Incentivizing the Dynamic Workforce: Learning Contracts in the Gig-Economy
Authors	Alon Cohen, Moran Koren, Argyrios Deligkas
Abstract	In principal-agent models, a principal offers a contract to an agent to perform a certain task. The agent exerts a level of effort that maximizes her utility. The principal is oblivious to the agent’s chosen level of effort, and conditions her wage only on possible outcomes. In this work, we consider a model in which the principal is unaware of the agent’s utility and action space. She sequentially offers contracts to identical agents, and observes the resulting outcomes. We present an algorithm for learning the optimal contract under mild assumptions. We bound the number of samples needed for the principal to obtain a contract that is within $\epsilon$ of her optimal net profit for every $\epsilon>0$.
Tasks
Published	2018-11-16
URL	http://arxiv.org/abs/1811.06736v1
PDF	http://arxiv.org/pdf/1811.06736v1.pdf
PWC	https://paperswithcode.com/paper/incentivizing-the-dynamic-workforce-learning
Repo
Framework
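
The principal-agent setting described in the abstract above can be illustrated with a minimal sketch. It shows only the underlying model (the agent's best response to a contract and the principal's resulting net profit), not the paper's learning algorithm. The effort levels, outcome probabilities, costs, rewards, and wages are all hypothetical values chosen for illustration, and in the paper's setting the principal would not know them.

```python
# Hypothetical model: two effort levels, two outcomes (failure, success).
# None of these numbers come from the paper.
EFFORTS = {
    "low": {"probs": [0.8, 0.2], "cost": 0.0},
    "high": {"probs": [0.3, 0.7], "cost": 1.0},
}
REWARDS = [0.0, 10.0]  # principal's reward for each outcome


def expected(values, probs):
    """Expected value of `values` under the outcome distribution `probs`."""
    return sum(v * p for v, p in zip(values, probs))


def best_response(contract):
    """Effort level that maximizes the agent's expected wage minus effort cost."""
    return max(
        EFFORTS,
        key=lambda e: expected(contract, EFFORTS[e]["probs"]) - EFFORTS[e]["cost"],
    )


def principal_profit(contract):
    """Principal's expected reward minus expected wage, given the agent's best response."""
    probs = EFFORTS[best_response(contract)]["probs"]
    return expected(REWARDS, probs) - expected(contract, probs)


generous = [0.0, 3.0]  # wage paid for each outcome
stingy = [0.0, 1.0]
print(best_response(generous), principal_profit(generous))
print(best_response(stingy), principal_profit(stingy))
```

In this toy instance the more generous contract induces high effort and leaves the principal better off, which is the kind of trade-off a contract-learning procedure has to discover from observed outcomes alone.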

Learning to Attend Relevant Regions in Videos from Eye Fixations


Title	Learning to Attend Relevant Regions in Videos from Eye Fixations
Authors	Thanh T. Nguyen, Dung Nguyen
Abstract	Attentively important regions in video frames account for most of the semantics in each frame. This information is helpful in many applications, not only for entertainment (such as automatically generating commentary and tourist guides) but also for robotic control, such as holding a laparoscope in support of laparoscopic surgery. However, it is not always straightforward to define and locate such semantic regions in videos. In this work, we attempt to address the problem of attending relevant regions in videos by leveraging eye fixation labels with an RNN-based visual attention model. Our experimental results suggest that this approach holds good potential to learn to attend semantic regions in videos, while its performance also relies heavily on the quality of the eye fixation labels.
Tasks
Published	2018-11-21
URL	http://arxiv.org/abs/1811.08594v2
PDF	http://arxiv.org/pdf/1811.08594v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-attend-relevant-regions-in-videos
Repo
Framework

Multimodal Emoji Prediction


Title	Multimodal Emoji Prediction
Authors	Francesco Barbieri, Miguel Ballesteros, Francesco Ronzano, Horacio Saggion
Abstract	Emojis are small images that are commonly included in social media text messages. The combination of visual and textual content in the same message builds up a modern way of communication that automatic systems are not used to dealing with. In this paper we extend recent advances in emoji prediction by putting forward a multimodal approach that is able to predict emojis in Instagram posts. Instagram posts are composed of pictures together with texts which sometimes include emojis. We show that these emojis can be predicted by using the text, but also using the picture. Our main finding is that incorporating the two synergistic modalities, in a combined model, improves accuracy in an emoji prediction task. This result demonstrates that these two modalities (text and images) encode different information on the use of emojis and therefore can complement each other.
Tasks
Published	2018-03-06
URL	http://arxiv.org/abs/1803.02392v2
PDF	http://arxiv.org/pdf/1803.02392v2.pdf
PWC	https://paperswithcode.com/paper/multimodal-emoji-prediction
Repo
Framework

The Exact Equivalence of Distance and Kernel Methods for Hypothesis Testing


Title	The Exact Equivalence of Distance and Kernel Methods for Hypothesis Testing
Authors	Cencheng Shen, Joshua T. Vogelstein
Abstract	Distance-based tests, also called “energy statistics”, are leading methods for two-sample and independence tests from the statistics community. Kernel-based tests, developed from “kernel mean embeddings”, are leading methods for two-sample and independence tests from the machine learning community. A fixed-point transformation was previously proposed to connect the distance methods and kernel methods for the population statistics. In this paper, we propose a new bijective transformation between metrics and kernels. It simplifies the fixed-point transformation, inherits similar theoretical properties, allows distance methods to be exactly the same as kernel methods for sample statistics and p-value, and better preserves the data structure upon transformation. Our results further advance the understanding in distance and kernel-based tests, streamline the code base for implementing these tests, and enable a rich literature of distance-based and kernel-based methodologies to directly communicate with each other.
Tasks
Published	2018-06-14
URL	https://arxiv.org/abs/1806.05514v4
PDF	https://arxiv.org/pdf/1806.05514v4.pdf
PWC	https://paperswithcode.com/paper/the-exact-equivalence-of-distance-and-kernel
Repo
Framework
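
The sample-level equivalence claimed in the abstract above can be checked numerically in a small sketch. It assumes a transformation of the form k(x, y) = c - d(x, y), with c the largest pairwise distance in the pooled sample; that form is an assumption made here for illustration and should be checked against the paper. The biased (V-statistic) estimators are used, and all function names are hypothetical.

```python
import math


def euclid(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def mean_pairwise(f, xs, ys):
    """Average of f(x, y) over all pairs, including identical pairs (V-statistic)."""
    return sum(f(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def energy_statistic(xs, ys, d=euclid):
    """Biased energy distance: 2 E d(X, Y) - E d(X, X') - E d(Y, Y')."""
    return (2 * mean_pairwise(d, xs, ys)
            - mean_pairwise(d, xs, xs)
            - mean_pairwise(d, ys, ys))


def induced_kernel(points, d=euclid):
    """Assumed transformation: k(x, y) = c - d(x, y), c = max pairwise distance."""
    c = max(d(p, q) for p in points for q in points)
    return lambda x, y: c - d(x, y)


def mmd_squared(xs, ys, k):
    """Biased squared MMD: E k(X, X') + E k(Y, Y') - 2 E k(X, Y)."""
    return (mean_pairwise(k, xs, xs)
            + mean_pairwise(k, ys, ys)
            - 2 * mean_pairwise(k, xs, ys))


xs = [(0.0, 0.0), (1.0, 0.5), (0.2, 1.1)]
ys = [(2.0, 2.0), (2.5, 1.0), (3.0, 2.2), (1.8, 2.9)]
energy = energy_statistic(xs, ys)
mmd2 = mmd_squared(xs, ys, induced_kernel(xs + ys))
print(energy, mmd2)
```

The constant c cancels in the MMD expression, so the two statistics agree up to floating-point rounding for any choice of c. Whether the induced function is a valid kernel for a given metric is a separate question that this sketch does not address.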

Overcoming catastrophic forgetting problem by weight consolidation and long-term memory


Title	Overcoming catastrophic forgetting problem by weight consolidation and long-term memory
Authors	Shixian Wen, Laurent Itti
Abstract	Sequential learning of multiple tasks in artificial neural networks using gradient descent leads to catastrophic forgetting, whereby previously learned knowledge is erased during learning of new, disjoint knowledge. Here, we propose a new approach to sequential learning which leverages the recent discovery of adversarial examples. We use adversarial subspaces from previous tasks to enable learning of new tasks with less interference. We apply our method to sequentially learning to classify digits 0, 1, 2 (task 1), 4, 5, 6 (task 2), and 7, 8, 9 (task 3) in MNIST (disjoint MNIST task). We compare and combine our Adversarial Direction (AD) method with the recently proposed Elastic Weight Consolidation (EWC) method for sequential learning. We train each task for 20 epochs, which yields good initial performance (99.24% correct task 1 performance). After training task 2, and then task 3, both plain gradient descent (PGD) and EWC largely forget task 1 (task 1 accuracy 32.95% for PGD and 41.02% for EWC), while our combined approach (AD+EWC) still achieves 94.53% correct on task 1. We obtain similar results with a much more difficult disjoint CIFAR10 task, which to our knowledge had not been attempted before (70.10% initial task 1 performance, 67.73% after learning tasks 2 and 3 for AD+EWC, while PGD and EWC both fall to chance level). Our results suggest that AD+EWC can provide better sequential learning performance than either PGD or EWC.
Tasks
Published	2018-05-18
URL	http://arxiv.org/abs/1805.07441v1
PDF	http://arxiv.org/pdf/1805.07441v1.pdf
PWC	https://paperswithcode.com/paper/overcoming-catastrophic-forgetting-problem-by
Repo
Framework
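
The EWC baseline named in the abstract above adds a quadratic penalty that anchors each weight to its value after the previous task, weighted by an importance estimate. The sketch below shows that standard penalty with a diagonal Fisher approximation; it does not reproduce the paper's Adversarial Direction method, and the parameter values, Fisher values, and the strength `lam` are hypothetical.

```python
def ewc_penalty(theta, theta_star, fisher, lam):
    """Standard EWC term: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    return 0.5 * lam * sum(
        f * (t - s) ** 2 for t, s, f in zip(theta, theta_star, fisher)
    )


def ewc_loss(new_task_loss, theta, theta_star, fisher, lam):
    """Loss on the new task plus the penalty protecting the old task."""
    return new_task_loss + ewc_penalty(theta, theta_star, fisher, lam)


theta_star = [1.0, -2.0]  # weights after task 1 (hypothetical)
fisher = [10.0, 0.1]      # diagonal Fisher estimates (hypothetical)
theta = [1.5, -1.0]       # current weights while training task 2
penalty = ewc_penalty(theta, theta_star, fisher, lam=2.0)
print(penalty)
```

The first weight moved half as far as the second but contributes most of the penalty, because its Fisher value marks it as important for the earlier task.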

Knowledge Integrated Classifier Design Based on Utility Optimization


Title	Knowledge Integrated Classifier Design Based on Utility Optimization
Authors	Shaohan Chen, Chuanhou Gao
Abstract	This paper proposes a systematic framework to design a classification model that yields a classifier which optimizes a utility function based on prior knowledge. Specifically, as the data size grows, we prove that the produced classifier asymptotically converges to the optimal classifier, an extended version of the Bayes rule, which maximizes the utility function. Therefore, we provide a meaningful theoretical interpretation for modeling with the knowledge incorporated. Our knowledge incorporation method allows domain experts to guide the classifier towards correctly classifying data that they think to be more significant.
Tasks
Published	2018-09-05
URL	http://arxiv.org/abs/1809.01571v1
PDF	http://arxiv.org/pdf/1809.01571v1.pdf
PWC	https://paperswithcode.com/paper/knowledge-integrated-classifier-design-based
Repo
Framework
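
The utility-maximizing extension of the Bayes rule mentioned in the abstract above can be sketched as a decision rule that picks the label with the highest expected utility under the class posterior. This is a generic illustration, not the paper's framework: the utility matrix and posterior values are hypothetical.

```python
def expected_utilities(posterior, utility):
    """utility[a][j] is the utility of predicting a when the true class is j."""
    return [sum(u * p for u, p in zip(row, posterior)) for row in utility]


def utility_optimal_label(posterior, utility):
    """Label with the highest expected utility under the posterior."""
    scores = expected_utilities(posterior, utility)
    return max(range(len(scores)), key=scores.__getitem__)


posterior = [0.7, 0.3]            # P(class | x), hypothetical
zero_one = [[1, 0], [0, 1]]       # plain Bayes rule: every class counts equally
expert = [[1, 0], [0, 5]]         # expert values correct class-1 predictions more
plain_label = utility_optimal_label(posterior, zero_one)
expert_label = utility_optimal_label(posterior, expert)
print(plain_label, expert_label)
```

With the 0-1 utility the rule reduces to the usual maximum-posterior decision, while the expert utility shifts the decision toward the class the expert considers more significant.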

Mixed-Order Spectral Clustering for Networks


Title	Mixed-Order Spectral Clustering for Networks
Authors	Yan Ge, Haiping Lu, Pan Peng
Abstract	Clustering is fundamental for gaining insights from complex networks, and spectral clustering (SC) is a popular approach. Conventional SC focuses on second-order structures (e.g., edges connecting two nodes) without direct consideration of higher-order structures (e.g., triangles and cliques). This has motivated SC extensions that directly consider higher-order structures. However, both approaches are limited to considering a single order. This paper proposes a new Mixed-Order Spectral Clustering (MOSC) approach to model both second-order and third-order structures simultaneously, with two MOSC methods developed based on Graph Laplacian (GL) and Random Walks (RW). MOSC-GL combines edge and triangle adjacency matrices, with theoretical performance guarantee. MOSC-RW combines first-order and second-order random walks for a probabilistic interpretation. We automatically determine the mixing parameter based on cut criteria or triangle density, and construct new structure-aware error metrics for performance evaluation. Experiments on real-world networks show 1) the superior performance of two MOSC methods over existing SC methods, 2) the effectiveness of the mixing parameter determination strategy, and 3) insights offered by the structure-aware error metrics.
Tasks
Published	2018-12-25
URL	http://arxiv.org/abs/1812.10140v1
PDF	http://arxiv.org/pdf/1812.10140v1.pdf
PWC	https://paperswithcode.com/paper/mixed-order-spectral-clustering-for-networks
Repo
Framework
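
The idea of combining edge and triangle adjacency matrices, described in the abstract above, can be sketched on a toy graph. The convex combination with a mixing parameter `lam` is an assumption made here for illustration; the paper's exact construction and normalization may differ. The triangle adjacency counts, for each edge, the triangles that contain it.

```python
def triangle_adjacency(adj):
    """Entry (i, j) counts common neighbours of i and j, kept only where an edge exists."""
    n = len(adj)
    return [
        [adj[i][j] * sum(adj[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mixed_adjacency(adj, lam):
    """Assumed mixing: (1 - lam) * edge adjacency + lam * triangle adjacency."""
    tri = triangle_adjacency(adj)
    n = len(adj)
    return [
        [(1 - lam) * adj[i][j] + lam * tri[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy graph: triangle 0-1-2 with a pendant edge 2-3.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
tri = triangle_adjacency(adj)
mixed = mixed_adjacency(adj, lam=0.5)
print(mixed)
```

The pendant edge 2-3 lies in no triangle, so its weight shrinks as `lam` grows, while edges inside the triangle keep their weight. A spectral method would then build a Laplacian from the mixed matrix.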

Learning-to-Ask: Knowledge Acquisition via 20 Questions


Title	Learning-to-Ask: Knowledge Acquisition via 20 Questions
Authors	Yihong Chen, Bei Chen, Xuguang Duan, Jian-Guang Lou, Yue Wang, Wenwu Zhu, Yong Cao
Abstract	Almost all knowledge-empowered applications rely upon accurate knowledge, which has to be either collected manually at high cost or extracted automatically with non-negligible errors. In this paper, we study 20 Questions, an online interactive game where each question-response pair corresponds to a fact of the target entity, to acquire highly accurate knowledge effectively with nearly zero labor cost. Knowledge acquisition via 20 Questions predominantly presents two challenges to the intelligent agent playing games with human players. The first one is to seek enough information and identify the target entity with as few questions as possible, while the second one is to leverage the remaining questioning opportunities to acquire valuable knowledge effectively, both of which count on good questioning strategies. To address these challenges, we propose the Learning-to-Ask (LA) framework, within which the agent learns smart questioning strategies for information seeking and knowledge acquisition by means of deep reinforcement learning and generalized matrix factorization respectively. In addition, a Bayesian approach to represent knowledge is adopted to ensure robustness to noisy user responses. Simulated experiments on real data show that LA is able to equip the agent with effective questioning strategies, which result in high winning rates and rapid knowledge acquisition. Moreover, the questioning strategies for information seeking and knowledge acquisition boost the performance of each other, allowing the agent to start with a relatively small knowledge set and quickly improve its knowledge base in the absence of constant human supervision.
Tasks
Published	2018-06-22
URL	http://arxiv.org/abs/1806.08554v1
PDF	http://arxiv.org/pdf/1806.08554v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-ask-knowledge-acquisition-via-20
Repo
Framework

(Self-Attentive) Autoencoder-based Universal Language Representation for Machine Translation


Title	(Self-Attentive) Autoencoder-based Universal Language Representation for Machine Translation
Authors	Carlos Escolano, Marta R. Costa-jussà, José A. R. Fonollosa
Abstract	Universal language representation is the holy grail in machine translation (MT). Thanks to the new neural MT approach, it seems that there are good prospects for reaching this goal. In this paper, we propose a new architecture based on combining variational autoencoders with encoder-decoders and introducing an interlingual loss as an additional training objective. By adding and forcing this interlingual loss, we are able to train multiple encoders and decoders for each language, sharing a common universal representation. Since the final objective of this universal representation is producing close results for similar input sentences (in any language), we propose to evaluate it by encoding the same sentence in two different languages, decoding both latent representations into the same language and comparing both outputs. Preliminary results on the WMT 2017 Turkish/English task show that the proposed architecture is capable of learning a universal language representation and simultaneously training both translation directions with state-of-the-art results.
Tasks	Machine Translation
Published	2018-10-15
URL	http://arxiv.org/abs/1810.06351v1
PDF	http://arxiv.org/pdf/1810.06351v1.pdf
PWC	https://paperswithcode.com/paper/self-attentive-autoencoder-based-universal
Repo
Framework

Randomized Optimal Transport on a Graph: framework and new distance measures


Title	Randomized Optimal Transport on a Graph: framework and new distance measures
Authors	Guillaume Guex, Ilkka Kivimäki, Marco Saerens
Abstract	The recently developed bag-of-paths (BoP) framework consists in setting a Gibbs-Boltzmann distribution on all feasible paths of a graph. This probability distribution favors short paths over long ones, with a free parameter (the temperature $T$) controlling the entropic level of the distribution. This formalism enables the computation of new distances or dissimilarities, interpolating between the shortest-path and the resistance distance, which have been shown to perform well in clustering and classification tasks. In this work, the bag-of-paths formalism is extended by adding two independent equality constraints fixing the distributions of the starting and ending nodes of paths (margins). When the temperature is low, this formalism is shown to be equivalent to a relaxation of the optimal transport problem on a network where paths carry a flow between two discrete distributions on nodes. The randomization is achieved by considering free energy minimization instead of traditional cost minimization. Algorithms computing the optimal free energy solution are developed for two types of paths: hitting (or absorbing) paths and non-hitting, regular, paths, and require the inversion of an $n \times n$ matrix with $n$ being the number of nodes. Interestingly, for regular paths on an undirected graph, the resulting optimal policy interpolates between the deterministic optimal transport policy ($T \rightarrow 0^{+}$) and the solution to the corresponding electrical circuit ($T \rightarrow \infty$). Two distance measures between nodes and a dissimilarity between groups of nodes, both integrating weights on nodes, are derived from this framework.
Tasks
Published	2018-06-07
URL	http://arxiv.org/abs/1806.03232v2
PDF	http://arxiv.org/pdf/1806.03232v2.pdf
PWC	https://paperswithcode.com/paper/randomized-optimal-transport-on-a-graph
Repo
Framework
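
The role of the temperature in the Gibbs-Boltzmann distribution described in the abstract above can be illustrated on a small, explicitly enumerated set of paths. This is a simplification: the bag-of-paths framework works over all feasible paths and is computed through matrix inversion, and the sketch also assumes a uniform reference weighting of paths. The path costs are hypothetical.

```python
import math


def gibbs_path_distribution(costs, temperature):
    """P(path) proportional to exp(-cost / T), for a finite list of path costs."""
    c_min = min(costs)  # shift by the minimum cost for numerical stability
    weights = [math.exp(-(c - c_min) / temperature) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


path_costs = [1.0, 2.0, 4.0]  # hypothetical total costs of three paths
cold = gibbs_path_distribution(path_costs, temperature=0.1)
hot = gibbs_path_distribution(path_costs, temperature=100.0)
print(cold)
print(hot)
```

At low temperature almost all probability mass sits on the cheapest path, which matches the deterministic, cost-minimizing regime; at high temperature the paths become nearly equally likely.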

Deep Textured 3D Reconstruction of Human Bodies


Title	Deep Textured 3D Reconstruction of Human Bodies
Authors	Abbhinav Venkat, Sai Sagar Jinka, Avinash Sharma
Abstract	Recovering textured 3D models of non-rigid human body shapes is challenging due to self-occlusions caused by complex body poses and shapes, clothing obstructions, lack of surface texture, background clutter, sparse set of cameras with non-overlapping fields of view, etc. Further, a calibration-free environment adds complexity to both reconstruction and texture recovery. In this paper, we propose a deep learning based solution for textured 3D reconstruction of human body shapes from a single view RGB image. This is achieved by first recovering the volumetric grid of the non-rigid human body given a single view RGB image followed by orthographic texture view synthesis using the respective depth projection of the reconstructed (volumetric) shape and input RGB image. We propose to co-learn the depth information readily available with affordable RGBD sensors (e.g., Kinect) while showing multiple views of the same object during the training phase. We show superior reconstruction performance in terms of quantitative and qualitative results, on both publicly available datasets (by simulating the depth channel with virtual Kinect) and real RGBD data collected with our calibrated multi Kinect setup.
Tasks	3D Reconstruction, Calibration
Published	2018-09-18
URL	http://arxiv.org/abs/1809.06547v1
PDF	http://arxiv.org/pdf/1809.06547v1.pdf
PWC	https://paperswithcode.com/paper/deep-textured-3d-reconstruction-of-human
Repo
Framework

Deep Learned Full-3D Object Completion from Single View


Title	Deep Learned Full-3D Object Completion from Single View
Authors	Dario Rethage, Federico Tombari, Felix Achilles, Nassir Navab
Abstract	3D geometry is a very informative cue when interacting with and navigating an environment. This paper proposes a new approach to 3D reconstruction and scene understanding, which implicitly learns 3D geometry from depth maps by pairing a deep convolutional neural network architecture with an auto-encoder. A data set of synthetic depth views and voxelized 3D representations is built based on ModelNet, a large-scale collection of CAD models, to train networks. The proposed method offers a significant advantage over current, explicit reconstruction methods in that it learns key geometric features offline and makes use of those to predict the most probable reconstruction of an unseen object. The relatively small network, consisting of roughly 4 million weights, achieves a 92.9% reconstruction accuracy at a 30x30x30 resolution through the use of a pre-trained decompression layer. This is roughly 1/4 the weights of the current leading network. The fast execution time of the model makes it suitable for real-time applications.
Tasks	3D Reconstruction, Scene Understanding
Published	2018-08-21
URL	http://arxiv.org/abs/1808.06843v1
PDF	http://arxiv.org/pdf/1808.06843v1.pdf
PWC	https://paperswithcode.com/paper/deep-learned-full-3d-object-completion-from
Repo
Framework

Hessian barrier algorithms for linearly constrained optimization problems


Title	Hessian barrier algorithms for linearly constrained optimization problems
Authors	Immanuel M. Bomze, Panayotis Mertikopoulos, Werner Schachinger, Mathias Staudigl
Abstract	In this paper, we propose an interior-point method for linearly constrained optimization problems (possibly nonconvex). The method, which we call the Hessian barrier algorithm (HBA), combines a forward Euler discretization of Hessian Riemannian gradient flows with an Armijo backtracking step-size policy. In this way, HBA can be seen as an alternative to mirror descent (MD), and contains as special cases the affine scaling algorithm, regularized Newton processes, and several other iterative solution methods. Our main result is that, modulo a non-degeneracy condition, the algorithm converges to the problem’s set of critical points; hence, in the convex case, the algorithm converges globally to the problem’s minimum set. In the case of linearly constrained quadratic programs (not necessarily convex), we also show that the method’s convergence rate is $\mathcal{O}(1/k^\rho)$ for some $\rho\in(0,1]$ that depends only on the choice of kernel function (i.e., not on the problem’s primitives). These theoretical results are validated by numerical experiments on standard non-convex test functions and large-scale traffic assignment problems.
Tasks
Published	2018-09-25
URL	https://arxiv.org/abs/1809.09449v2
PDF	https://arxiv.org/pdf/1809.09449v2.pdf
PWC	https://paperswithcode.com/paper/hessian-barrier-algorithms-for-linearly
Repo
Framework
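
The Armijo backtracking step-size policy mentioned in the abstract above can be sketched in its textbook form. The sketch applies it to plain, unconstrained gradient descent; it does not implement the Hessian Riemannian search direction or the linear constraints of HBA. The test function and the defaults `t0`, `beta`, and `c` are hypothetical choices.

```python
def armijo_step(f, grad, x, t0=1.0, beta=0.5, c=1e-4):
    """One gradient step with Armijo backtracking.

    Shrinks the step t by the factor beta until the sufficient-decrease
    condition f(x - t*g) <= f(x) - c * t * ||g||^2 holds.
    """
    g = grad(x)
    g_norm_sq = sum(gi * gi for gi in g)
    fx = f(x)
    t = t0
    while True:
        x_new = [xi - t * gi for xi, gi in zip(x, g)]
        if f(x_new) <= fx - c * t * g_norm_sq:
            return x_new, t
        t *= beta


def quad(x):
    """Ill-conditioned quadratic used as a toy objective."""
    return x[0] ** 2 + 10.0 * x[1] ** 2


def quad_grad(x):
    return [2.0 * x[0], 20.0 * x[1]]


x0 = [1.0, 1.0]
x1, step = armijo_step(quad, quad_grad, x0)
print(x1, step, quad(x1))
```

Starting from a unit step, the search halves the step four times before the decrease condition holds, so the accepted step adapts to the steep direction of the objective without any prior knowledge of its curvature.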

A Generic Multi-Projection-Center Model and Calibration Method for Light Field Cameras


Title	A Generic Multi-Projection-Center Model and Calibration Method for Light Field Cameras
Authors	Qi Zhang, Chunping Zhang, Jinbo Ling, Qing Wang, Jingyi Yu
Abstract	Light field cameras can capture both spatial and angular information of light rays, enabling 3D reconstruction by a single exposure. The geometry of 3D reconstruction is significantly affected by the intrinsic parameters of a light field camera. In this paper, we propose a multi-projection-center (MPC) model with 6 intrinsic parameters to characterize light field cameras based on traditional two-parallel-plane (TPP) representation. The MPC model can generally parameterize light field in different imaging formations, including conventional and focused light field cameras. By the constraints of 4D ray and 3D geometry, a 3D projective transformation is deduced to describe the relationship between geometric structure and the MPC coordinates. Based on the MPC model and projective transformation, we propose a calibration algorithm to verify our light field camera model. Our calibration method includes a closed-form solution and a non-linear optimization by minimizing re-projection errors. Experimental results on both simulated and real scene data have verified the performance of our algorithm.
Tasks	3D Reconstruction, Calibration
Published	2018-08-07
URL	http://arxiv.org/abs/1808.02244v1
PDF	http://arxiv.org/pdf/1808.02244v1.pdf
PWC	https://paperswithcode.com/paper/a-generic-multi-projection-center-model-and
Repo
Framework

mvn2vec: Preservation and Collaboration in Multi-View Network Embedding


Title	mvn2vec: Preservation and Collaboration in Multi-View Network Embedding
Authors	Yu Shi, Fangqiu Han, Xinwei He, Xinran He, Carl Yang, Jie Luo, Jiawei Han
Abstract	Multi-view networks are broadly present in real-world applications. In the meantime, network embedding has emerged as an effective representation learning approach for networked data. Therefore, we are motivated to study the problem of multi-view network embedding with a focus on the optimization objectives that are specific and important in embedding this type of network. In our practice of embedding real-world multi-view networks, we explicitly identify two such objectives, which we refer to as preservation and collaboration. The in-depth analysis of these two objectives is discussed throughout this paper. In addition, the mvn2vec algorithms are proposed to (i) study how varying extents of preservation and collaboration can impact embedding learning and (ii) explore the feasibility of achieving better embedding quality by modeling them simultaneously. With experiments on a series of synthetic datasets, a large-scale internal Snapchat dataset, and two public datasets, we confirm the validity and importance of preservation and collaboration as two objectives for multi-view network embedding. These experiments further demonstrate that better embedding can be obtained by simultaneously modeling the two objectives, while not over-complicating the model or requiring additional supervision. The code and the processed datasets are available at http://yushi2.web.engr.illinois.edu/.
Tasks	Network Embedding, Representation Learning
Published	2018-01-19
URL	https://arxiv.org/abs/1801.06597v3
PDF	https://arxiv.org/pdf/1801.06597v3.pdf
PWC	https://paperswithcode.com/paper/mvn2vec-preservation-and-collaboration-in
Repo
Framework