Paper Group ANR 1088
Understanding and Improving Kernel Local Descriptors. Recurrent Neural Networks with Pre-trained Language Model Embedding for Slot Filling Task. Contextual Stochastic Block Models. An Asymptotically Optimal Strategy for Constrained Multi-armed Bandit Problems. PriPeARL: A Framework for Privacy-Preserving Analytics and Reporting at LinkedIn. Imitati …
Understanding and Improving Kernel Local Descriptors
Title | Understanding and Improving Kernel Local Descriptors |
Authors | Arun Mukundan, Giorgos Tolias, Andrei Bursuc, Hervé Jégou, Ondřej Chum |
Abstract | We propose a multiple-kernel local-patch descriptor based on efficient match kernels from pixel gradients. It combines two parametrizations of gradient position and direction; each parametrization provides robustness to a different type of patch mis-registration: the polar parametrization for noise in the detection of the patch's dominant orientation, the Cartesian for imprecise location of the feature point. Combined with whitening of the descriptor space, learned with or without supervision, performance is significantly improved. We analyze the effect of the whitening on patch similarity and demonstrate its semantic meaning. Our unsupervised variant is the best-performing descriptor constructed without the need for labeled data. Despite its simplicity, the proposed descriptor competes well with deep learning approaches on a number of different tasks. |
Tasks | |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.11147v1 |
PDF | http://arxiv.org/pdf/1811.11147v1.pdf |
PWC | https://paperswithcode.com/paper/understanding-and-improving-kernel-local |
Repo | |
Framework | |
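The whitening step the abstract credits with a significant performance gain can be sketched as standard PCA whitening of the descriptor space followed by L2 normalization. This is a minimal numpy illustration of the unsupervised variant only; the shapes, the `eps` regularizer, and the synthetic descriptors are all chosen for illustration, not taken from the paper.

```python
import numpy as np

def learn_whitening(X, eps=1e-8):
    """Learn an unsupervised whitening transform from descriptors X (n x dim)."""
    mu = X.mean(axis=0)
    cov = np.cov(X - mu, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    # Rotate into the eigenbasis and rescale by inverse sqrt of the eigenvalues.
    W = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
    return mu, W

def whiten(X, mu, W):
    Z = (X - mu) @ W
    # L2-normalize so patch similarity reduces to a dot product.
    return Z / np.linalg.norm(Z, axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 16)) @ rng.normal(size=(16, 16))  # correlated "descriptors"
mu, W = learn_whitening(X)
Z = whiten(X, mu, W)
```

After the transform, descriptor dimensions are decorrelated with unit variance, which is what lets a plain dot product act as the match score.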
Recurrent Neural Networks with Pre-trained Language Model Embedding for Slot Filling Task
Title | Recurrent Neural Networks with Pre-trained Language Model Embedding for Slot Filling Task |
Authors | Liang Qiu, Yuanyi Ding, Lei He |
Abstract | In recent years, Recurrent Neural Network (RNN) based models have been applied to the Slot Filling problem of Spoken Language Understanding and achieved state-of-the-art performance. In this paper, we investigate the effect of incorporating pre-trained language models into RNN-based Slot Filling models. Our evaluation on the Airline Travel Information System (ATIS) corpus shows that we can significantly reduce the size of the labeled training data and achieve the same level of Slot Filling performance by incorporating extra word-embedding and language-model-embedding layers pre-trained on unlabeled corpora. |
Tasks | Language Modelling, Slot Filling, Spoken Language Understanding |
Published | 2018-12-12 |
URL | http://arxiv.org/abs/1812.05199v2 |
PDF | http://arxiv.org/pdf/1812.05199v2.pdf |
PWC | https://paperswithcode.com/paper/recurrent-neural-networks-with-pre-trained |
Repo | |
Framework | |
Contextual Stochastic Block Models
Title | Contextual Stochastic Block Models |
Authors | Yash Deshpande, Andrea Montanari, Elchanan Mossel, Subhabrata Sen |
Abstract | We provide the first information-theoretically tight analysis for inference of latent community structure given a sparse graph along with high-dimensional node covariates correlated with the same latent communities. Our work bridges recent theoretical breakthroughs in the detection of latent community structure without node covariates and a large body of empirical work using diverse heuristics for combining node covariates with graphs for inference. The tightness of our analysis implies, in particular, the information-theoretic necessity of combining the different sources of information. Our analysis holds for networks of large degrees as well as for a Gaussian version of the model. |
Tasks | |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.09596v1 |
PDF | http://arxiv.org/pdf/1807.09596v1.pdf |
PWC | https://paperswithcode.com/paper/contextual-stochastic-block-models |
Repo | |
Framework | |
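The model the abstract studies pairs a sparse two-community graph with node covariates that lean toward a community-dependent direction. A minimal generator makes the setup concrete; every parameter name (`c_in`, `c_out`, `mu`) and the exact covariate scaling here are illustrative choices, not the paper's normalization.

```python
import numpy as np

def sample_contextual_sbm(n=200, d=10, c_in=8.0, c_out=2.0, mu=1.0, seed=0):
    """Sample a two-community SBM with correlated Gaussian node covariates.

    Nodes get labels +/-1; edges appear with probability c_in/n inside a
    community and c_out/n across. Each node's covariate vector is its
    label times mu along a shared direction u, plus Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    labels = rng.choice([-1, 1], size=n)
    same = labels[:, None] == labels[None, :]
    P = np.where(same, c_in / n, c_out / n)
    A = (rng.random((n, n)) < P).astype(int)
    A = np.triu(A, 1)
    A = A + A.T                                # symmetric, no self-loops
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)
    B = labels[:, None] * mu * u[None, :] + rng.normal(size=(n, d)) / np.sqrt(d)
    return A, B, labels

A, B, labels = sample_contextual_sbm()
```

The point of the paper's analysis is that for some parameter regimes neither `A` nor `B` alone reveals `labels`, while the pair does.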
An Asymptotically Optimal Strategy for Constrained Multi-armed Bandit Problems
Title | An Asymptotically Optimal Strategy for Constrained Multi-armed Bandit Problems |
Authors | Hyeong Soo Chang |
Abstract | For the stochastic multi-armed bandit (MAB) problem from a constrained model that generalizes the classical one, we show that asymptotic optimality is achievable by a simple strategy extended from the $\epsilon_t$-greedy strategy. We provide a finite-time lower bound on the probability of correctly selecting an optimal near-feasible arm that holds for all time steps. Under some conditions, the bound approaches one as the time $t$ goes to infinity. A particular example sequence $\{\epsilon_t\}$ with an asymptotic convergence rate on the order of $(1-\frac{1}{t})^4$, holding for all sufficiently large $t$, is also discussed. |
Tasks | |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01237v1 |
PDF | http://arxiv.org/pdf/1805.01237v1.pdf |
PWC | https://paperswithcode.com/paper/an-asymptotically-optimal-strategy-for |
Repo | |
Framework | |
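To make the flavor of an $\epsilon_t$-greedy strategy under a cost constraint concrete, here is a heavily simplified sketch: exploration probability decays as $c/t$, and exploitation is restricted to arms whose running mean cost stays within a budget. The feasibility rule, the constant `c`, and the arm interface are all illustrative assumptions; this is not the paper's exact strategy or its $\{\epsilon_t\}$ sequence.

```python
import random

def epsilon_t_greedy(arms, horizon, c=5.0, budget=0.5, seed=0):
    """Simplified epsilon_t-greedy for a constrained bandit.

    `arms` is a list of (reward_fn, cost_fn) pairs returning samples in
    [0, 1]. An arm is treated as infeasible once its running mean cost
    exceeds `budget`. Illustrative sketch only.
    """
    rng = random.Random(seed)
    n = len(arms)
    counts, reward_sum, cost_sum = [0] * n, [0.0] * n, [0.0] * n

    def pull(i):
        counts[i] += 1
        reward_sum[i] += arms[i][0]()
        cost_sum[i] += arms[i][1]()

    for i in range(n):                       # initialise: pull each arm once
        pull(i)
    for t in range(n + 1, horizon + 1):
        eps = min(1.0, c / t)                # exploration decays like 1/t
        feasible = [i for i in range(n)
                    if cost_sum[i] / counts[i] <= budget]
        if rng.random() < eps or not feasible:
            pull(rng.randrange(n))           # explore uniformly
        else:                                # exploit the best feasible arm
            pull(max(feasible, key=lambda i: reward_sum[i] / counts[i]))
    feasible = [i for i in range(n) if cost_sum[i] / counts[i] <= budget]
    return max(feasible or range(n), key=lambda i: reward_sum[i] / counts[i])

# Three toy arms: the third has the best reward but violates the cost budget.
arms = [(lambda: 0.3, lambda: 0.2),
        (lambda: 0.9, lambda: 0.2),
        (lambda: 1.0, lambda: 0.9)]
best = epsilon_t_greedy(arms, horizon=200)
```

With these deterministic arms the strategy settles on the best near-feasible arm (index 1) rather than the unconstrained maximizer.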
PriPeARL: A Framework for Privacy-Preserving Analytics and Reporting at LinkedIn
Title | PriPeARL: A Framework for Privacy-Preserving Analytics and Reporting at LinkedIn |
Authors | Krishnaram Kenthapadi, Thanh T. L. Tran |
Abstract | Preserving privacy of users is a key requirement of web-scale analytics and reporting applications, and has witnessed a renewed focus in light of recent data breaches and new regulations such as GDPR. We focus on the problem of computing robust, reliable analytics in a privacy-preserving manner, while satisfying product requirements. We present PriPeARL, a framework for privacy-preserving analytics and reporting, inspired by differential privacy. We describe the overall design and architecture, and the key modeling components, focusing on the unique challenges associated with privacy, coverage, utility, and consistency. We perform an experimental study in the context of ads analytics and reporting at LinkedIn, thereby demonstrating the tradeoffs between privacy and utility needs, and the applicability of privacy-preserving mechanisms to real-world data. We also highlight the lessons learned from the production deployment of our system at LinkedIn. |
Tasks | |
Published | 2018-09-20 |
URL | http://arxiv.org/abs/1809.07754v1 |
PDF | http://arxiv.org/pdf/1809.07754v1.pdf |
PWC | https://paperswithcode.com/paper/pripearl-a-framework-for-privacy-preserving |
Repo | |
Framework | |
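The core primitive behind a differential-privacy-inspired analytics layer like the one described above is noise addition to aggregate counts. A minimal sketch of the Laplace mechanism for a count query follows; the rounding and clamping are plausible reporting-layer post-processing steps, not PriPeARL's exact pipeline.

```python
import numpy as np

def private_count(true_count, epsilon, seed=None):
    """Report a count under epsilon-differential privacy via the Laplace
    mechanism. A count query has sensitivity 1, so adding noise drawn
    from Laplace(1/epsilon) suffices; rounding and clamping at zero keep
    reported values plausible (post-processing preserves privacy)."""
    rng = np.random.default_rng(seed)
    noisy = true_count + rng.laplace(scale=1.0 / epsilon)
    return max(0, round(noisy))
```

Smaller `epsilon` means stronger privacy and noisier reports, which is exactly the privacy/utility trade-off the experimental study measures.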
Imitation Learning for End to End Vehicle Longitudinal Control with Forward Camera
Title | Imitation Learning for End to End Vehicle Longitudinal Control with Forward Camera |
Authors | Laurent George, Thibault Buhet, Emilie Wirbel, Gaetan Le-Gall, Xavier Perrotton |
Abstract | In this paper we present a complete study of an end-to-end imitation learning system for speed control of a real car, based on a neural network with Long Short-Term Memory (LSTM). To achieve robustness and generalization from expert demonstrations, we propose data augmentation and label augmentation that are relevant for imitation learning in the longitudinal control context. Based on the front camera image only, our system is able to correctly control the speed of a car in a simulation environment, and of a real car on a challenging test track. The system also shows promising results in an open-road context. |
Tasks | Data Augmentation, Imitation Learning |
Published | 2018-12-14 |
URL | http://arxiv.org/abs/1812.05841v1 |
PDF | http://arxiv.org/pdf/1812.05841v1.pdf |
PWC | https://paperswithcode.com/paper/imitation-learning-for-end-to-end-vehicle |
Repo | |
Framework | |
On the effectiveness of task granularity for transfer learning
Title | On the effectiveness of task granularity for transfer learning |
Authors | Farzaneh Mahdisoltani, Guillaume Berger, Waseem Gharbieh, David Fleet, Roland Memisevic |
Abstract | We describe a DNN for video classification and captioning, trained end-to-end with shared features, to solve tasks at different levels of granularity, exploring the link between granularity in a source task and the quality of learned features for transfer learning. To solve a new task domain in transfer learning, we freeze the trained encoder and fine-tune a neural net on the target domain. We train on the Something-Something dataset with over 220,000 videos and multiple levels of target granularity, including 50 action groups, 174 fine-grained action categories, and captions. Classification and captioning on Something-Something are challenging because of the subtle differences between actions, applied to thousands of different object classes, and the diversity of captions penned by crowd actors. Our model performs better than existing classification baselines for Something-Something, with impressive fine-grained results, and it yields a strong baseline on the new Something-Something captioning task. Experiments reveal that training with more fine-grained tasks tends to produce better features for transfer learning. |
Tasks | Transfer Learning, Video Classification |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.09235v2 |
PDF | http://arxiv.org/pdf/1804.09235v2.pdf |
PWC | https://paperswithcode.com/paper/on-the-effectiveness-of-task-granularity-for |
Repo | |
Framework | |
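The transfer protocol above (freeze the trained encoder, fine-tune a small net on the target domain) can be sketched with a stand-in encoder and a logistic head trained by gradient descent. Everything here is synthetic: the "encoder" is a fixed random projection rather than a trained video network, and the data is fabricated so the frozen features happen to transfer well.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "pre-trained" encoder: a fixed random projection with tanh.
# "Freezing" means these weights receive no gradient below.
W_enc = rng.normal(size=(20, 8))

def encode(X):
    return np.tanh(X @ W_enc)

# Synthetic target-domain task: the label is readable from the frozen
# features, as it would be if the source features transfer well.
X = rng.normal(size=(300, 20))
H = encode(X)                        # frozen features; W_enc never updated
y = (H[:, 0] > 0).astype(float)

# Fine-tune only a logistic-regression head on top of the frozen features.
w, b = np.zeros(8), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(H @ w + b)))
    g = p - y                        # gradient of cross-entropy wrt the logits
    w -= 0.1 * H.T @ g / len(y)
    b -= 0.1 * g.mean()

acc = (((1.0 / (1.0 + np.exp(-(H @ w + b)))) > 0.5) == y).mean()
```

Only `w` and `b` change during fine-tuning; how good `acc` can get is entirely a property of the frozen features, which is the quantity the paper uses to compare source-task granularities.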
Multi-distance Support Matrix Machines
Title | Multi-distance Support Matrix Machines |
Authors | Yunfei Ye, Dong Han |
Abstract | Real-world data such as digital images, MRI scans, and electroencephalography signals are naturally represented as matrices with structural information. Most existing classifiers aim to capture these structures by regularizing the regression matrix to be low-rank or sparse. Other methodologies introduce factorization techniques to explore nonlinear relationships of matrix data in kernel space. In this paper, we propose a multi-distance support matrix machine (MDSMM), which provides a principled way of solving matrix classification problems. The multi-distance is introduced to capture the correlation within matrix data by means of the intrinsic information in the rows and columns of the input data. A complex hyperplane is established upon these values to separate distinct classes. We further study generalization bounds for i.i.d. and non-i.i.d. processes based on both SVM and SMM classifiers. For typical hypothesis classes where matrix norms are constrained, MDSMM achieves a faster learning rate than traditional classifiers. We also provide a more general approach for samples without prior knowledge. We demonstrate the merits of the proposed method with exhaustive experiments on both a simulation study and a number of real-world datasets. |
Tasks | |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00451v2 |
PDF | http://arxiv.org/pdf/1807.00451v2.pdf |
PWC | https://paperswithcode.com/paper/multi-distance-support-matrix-machines |
Repo | |
Framework | |
Portfolio Optimization for Cointelated Pairs: SDEs vs. Machine Learning
Title | Portfolio Optimization for Cointelated Pairs: SDEs vs. Machine Learning |
Authors | Babak Mahdavi-Damghani, Konul Mustafayeva, Stephen Roberts, Cristin Buescu |
Abstract | With the recent rise of Machine Learning (ML) as a candidate to partially replace classic Financial Mathematics (FM) methodologies, we investigate the performance of both in solving the problem of dynamic portfolio optimization in a continuous-time, finite-horizon setting for a portfolio of two intertwined assets. In the FM approach we model the asset prices not via the common approaches used in pairs trading, such as high correlation or cointegration, but with the cointelation model, which aims to reconcile both short-term risk and long-term equilibrium. We maximize the overall P&L with an FM approach that dynamically switches between a mean-variance optimal strategy and a power-utility-maximizing strategy. We use a stochastic control formulation of the power utility maximization problem and numerically solve the resulting HJB equation with the Deep Galerkin method. We then turn to ML for the same P&L maximization problem and use cluster analysis to devise bands, combined with in-band optimization. Although this approach is model agnostic, results obtained with data simulated from the same cointelation model as in the FM approach give an edge to ML. |
Tasks | Portfolio Optimization |
Published | 2018-12-26 |
URL | https://arxiv.org/abs/1812.10183v2 |
PDF | https://arxiv.org/pdf/1812.10183v2.pdf |
PWC | https://paperswithcode.com/paper/portfolio-optimization-for-cointelated-pairs |
Repo | |
Framework | |
Improving Regression Performance with Distributional Losses
Title | Improving Regression Performance with Distributional Losses |
Authors | Ehsan Imani, Martha White |
Abstract | There is growing evidence that converting targets to soft targets in supervised learning can provide considerable gains in performance. Much of this work has considered classification, converting hard zero-one values to soft labels, such as by adding label noise, incorporating label ambiguity, or using distillation. In parallel, there is some evidence from a regression setting in reinforcement learning that learning distributions can improve performance. In this work, we investigate the reasons for this improvement in a regression setting. We introduce a novel distributional regression loss and similarly find that it significantly improves prediction accuracy. We investigate several common hypotheses around reduced overfitting and improved representations. We instead find evidence for an alternative hypothesis: this loss is easier to optimize, with better-behaved gradients, resulting in improved generalization. We provide theoretical support for this alternative hypothesis by characterizing the norm of the gradients of this loss. |
Tasks | |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04613v1 |
PDF | http://arxiv.org/pdf/1806.04613v1.pdf |
PWC | https://paperswithcode.com/paper/improving-regression-performance-with |
Repo | |
Framework | |
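A distributional regression loss in the spirit described above replaces the scalar target with a soft distribution over fixed bins and trains with cross-entropy. The sketch below follows the HL-Gaussian idea of smearing the target with a Gaussian; the bin layout and `sigma` are illustrative choices.

```python
import numpy as np

def soft_target(y, centers, sigma=0.1):
    """Soft histogram target: a Gaussian bump around scalar y, evaluated
    at fixed bin centers and normalised to a distribution."""
    w = np.exp(-0.5 * ((centers - y) / sigma) ** 2)
    return w / w.sum()

def hl_loss(logits, y, centers, sigma=0.1):
    """Cross-entropy between the predicted bin distribution (softmax of
    `logits`) and the soft target around y."""
    p = np.exp(logits - logits.max())
    p /= p.sum()
    t = soft_target(y, centers, sigma)
    return -(t * np.log(p + 1e-12)).sum()

centers = np.linspace(0.0, 1.0, 11)   # 11 bins on [0, 1], illustrative
```

At prediction time a scalar estimate is recovered as the distribution's mean, `(p * centers).sum()`, so the model is still usable as an ordinary regressor.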
Differentially Private Online Submodular Optimization
Title | Differentially Private Online Submodular Optimization |
Authors | Adrian Rivera Cardoso, Rachel Cummings |
Abstract | In this paper we develop the first algorithms for online submodular minimization that preserve differential privacy under full information feedback and bandit feedback. A sequence of $T$ submodular functions over a collection of $n$ elements arrive online, and at each timestep the algorithm must choose a subset of $[n]$ before seeing the function. The algorithm incurs a cost equal to the function evaluated on the chosen set, and seeks to choose a sequence of sets that achieves low expected regret. Our first result is in the full information setting, where the algorithm can observe the entire function after making its decision at each timestep. We give an algorithm in this setting that is $\epsilon$-differentially private and achieves expected regret $\tilde{O}\left(\frac{n^{3/2}\sqrt{T}}{\epsilon}\right)$. This algorithm works by relaxing the submodular function to a convex function using the Lovász extension, and then simulating an algorithm for differentially private online convex optimization. Our second result is in the bandit setting, where the algorithm can only see the cost incurred by its chosen set, and does not have access to the entire function. This setting is significantly more challenging because the algorithm does not receive enough information to compute the Lovász extension or its subgradients. Instead, we construct an unbiased estimate using a single-point estimation, and then simulate private online convex optimization using this estimate. Our algorithm using bandit feedback is $\epsilon$-differentially private and achieves expected regret $\tilde{O}\left(\frac{n^{3/2}T^{3/4}}{\epsilon}\right)$. |
Tasks | |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02290v1 |
PDF | http://arxiv.org/pdf/1807.02290v1.pdf |
PWC | https://paperswithcode.com/paper/differentially-private-online-submodular |
Repo | |
Framework | |
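The convex relaxation at the heart of the full-information algorithm is the Lovász extension: sort the coordinates of $x \in [0,1]^n$ in decreasing order and accumulate the marginal gains of the corresponding nested sets. A minimal sketch of just that evaluation (not the private optimization built on top of it):

```python
import numpy as np

def lovasz_extension(F, x):
    """Evaluate the Lovász extension of set function F at x in [0,1]^n:
    with coordinates sorted decreasingly and S_i the set of the top-i
    indices, f(x) = sum_i x_{pi(i)} * (F(S_i) - F(S_{i-1}))."""
    order = np.argsort(-x)
    val, prev = 0.0, F(frozenset())
    S = set()
    for i in order:
        S.add(int(i))
        cur = F(frozenset(S))
        val += x[i] * (cur - prev)
        prev = cur
    return val
```

Two sanity checks: for the modular function $F(S)=|S|$ the extension is simply $\sum_i x_i$, and for the coverage-style function $F(S)=\min(|S|,1)$ it is $\max_i x_i$.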
IKA: Independent Kernel Approximator
Title | IKA: Independent Kernel Approximator |
Authors | Matteo Ronchetti |
Abstract | This paper describes a new method for low-rank kernel approximation called IKA. The main advantage of IKA is that it produces a function $\psi(x)$ defined as a linear combination of arbitrarily chosen functions. In contrast, the approximation produced by the Nyström method is a linear combination of kernel evaluations. The proposed method consistently outperformed the Nyström method in a comparison on the STL-10 dataset. Numerical results are reproducible using the source code available at https://gitlab.com/matteo-ronchetti/IKA |
Tasks | |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01353v1 |
PDF | http://arxiv.org/pdf/1809.01353v1.pdf |
PWC | https://paperswithcode.com/paper/ika-independent-kernel-approximator |
Repo | |
Framework | |
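For context, the classical Nyström baseline that IKA is compared against builds a low-rank kernel approximation from evaluations at a landmark subset. A minimal numpy sketch of that baseline (not of IKA itself); the RBF kernel, bandwidth, and landmark count are illustrative.

```python
import numpy as np

def rbf(A, B, gamma=0.1):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom(kernel, X, landmarks):
    """Classical Nystrom approximation K ~= K_nm pinv(K_mm) K_nm^T.
    The resulting feature map is a linear combination of kernel
    evaluations at the landmarks, which is exactly the restriction
    IKA lifts by allowing arbitrarily chosen basis functions."""
    K_nm = kernel(X, landmarks)
    K_mm = kernel(landmarks, landmarks)
    return K_nm @ np.linalg.pinv(K_mm) @ K_nm.T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
K = rbf(X, X)
K_hat = nystrom(rbf, X, X[:20])                      # 20 landmark points
err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)  # relative Frobenius error
```

With a smooth kernel the Gram matrix's spectrum decays quickly, so even 20 landmarks recover most of a 100-point Gram matrix.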
DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild
Title | DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild |
Authors | Riza Alp Guler, Yuxiang Zhou, George Trigeorgis, Epameinondas Antonakos, Patrick Snape, Stefanos Zafeiriou, Iasonas Kokkinos |
Abstract | In this work we use deep learning to establish dense correspondences between a 3D object model and an image “in the wild”. We introduce “DenseReg”, a fully-convolutional neural network (F-CNN) that densely regresses at every foreground pixel a pair of U-V template coordinates in a single feedforward pass. To train DenseReg we construct a supervision signal by combining 3D deformable model fitting and 2D landmark annotations. We define the regression task in terms of the intrinsic, U-V coordinates of a 3D deformable model that is brought into correspondence with image instances at training time. A host of other object-related tasks (e.g. part segmentation, landmark localization) are shown to be by-products of this task, and to largely improve thanks to its introduction. We obtain highly-accurate regression results by combining ideas from semantic segmentation with regression networks, yielding a ‘quantized regression’ architecture that first obtains a quantized estimate of position through classification, and refines it through regression of the residual. We show that such networks can boost the performance of existing state-of-the-art systems for pose estimation. Firstly, we show that our system can serve as an initialization for Statistical Deformable Models, as well as an element of cascaded architectures that jointly localize landmarks and estimate dense correspondences. We also show that the obtained dense correspondence can act as a source of ‘privileged information’ that complements and extends the pure landmark-level annotations, accelerating and improving the training of pose estimation networks. We report state-of-the-art performance on the challenging 300W benchmark for facial landmark localization and on the MPII and LSP datasets for human pose estimation. |
Tasks | Face Alignment, Pose Estimation, Semantic Segmentation |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.02188v2 |
PDF | http://arxiv.org/pdf/1803.02188v2.pdf |
PWC | https://paperswithcode.com/paper/densereg-fully-convolutional-dense-shape |
Repo | |
Framework | |
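The 'quantized regression' idea described in the abstract (a coarse estimate by classification, refined by regressing the residual) reduces to simple encoding/decoding arithmetic on each U-V coordinate. A sketch of just that arithmetic, with the bin count `K` chosen for illustration:

```python
import numpy as np

def quantized_regress(t, K=10):
    """Split a coordinate t in [0, 1] into a classification target
    (which of K bins) and a regression target (the within-bin residual
    in [0, 1))."""
    t = np.asarray(t, float)
    bin_id = np.minimum((t * K).astype(int), K - 1)   # classification target
    residual = t * K - bin_id                         # regression target
    return bin_id, residual

def decode(bin_id, residual, K=10):
    """Recombine the quantized estimate and the residual refinement."""
    return (bin_id + residual) / K

t = np.array([0.03, 0.57, 0.99])
b, r = quantized_regress(t)
```

The classifier only has to get the coarse bin right; the regressor then only models a bounded residual, which is the accuracy benefit the architecture exploits.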
Memory-based Parameter Adaptation
Title | Memory-based Parameter Adaptation |
Authors | Pablo Sprechmann, Siddhant M. Jayakumar, Jack W. Rae, Alexander Pritzel, Adrià Puigdomènech Badia, Benigno Uria, Oriol Vinyals, Demis Hassabis, Razvan Pascanu, Charles Blundell |
Abstract | Deep neural networks have excelled on a wide range of problems, from vision to language and game playing. Neural networks very gradually incorporate information into weights as they process data, requiring very low learning rates. If the training distribution shifts, the network is slow to adapt, and when it does adapt, it typically performs badly on the training distribution before the shift. Our method, Memory-based Parameter Adaptation, stores examples in memory and then uses a context-based lookup to directly modify the weights of a neural network. Much higher learning rates can be used for this local adaptation, obviating the need for many iterations over similar data before good predictions can be made. As our method is memory-based, it alleviates several shortcomings of neural networks: it mitigates catastrophic forgetting, and it supports fast, stable acquisition of new knowledge, learning with imbalanced class labels, and fast learning during evaluation. We demonstrate this on a range of supervised tasks: large-scale image classification and language modelling. |
Tasks | Image Classification, Language Modelling |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1802.10542v1 |
PDF | http://arxiv.org/pdf/1802.10542v1.pdf |
PWC | https://paperswithcode.com/paper/memory-based-parameter-adaptation |
Repo | |
Framework | |
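The memory-plus-lookup mechanism can be illustrated with an episodic store of (key, value) pairs and a kernel-weighted nearest-neighbour blend at query time. Note this is only loosely in MbPA's spirit: the paper adapts the network's weights with gradient steps on the retrieved neighbours, whereas this sketch just blends the parametric prediction toward the local memory values.

```python
import numpy as np

class EpisodicMemory:
    """Store (key, value) pairs; at query time, pull the base prediction
    toward an inverse-distance-weighted average of the k nearest stored
    values. All names and the blending rule are illustrative."""

    def __init__(self):
        self.keys, self.values = [], []

    def write(self, key, value):
        self.keys.append(np.asarray(key, float))
        self.values.append(float(value))

    def adapted_prediction(self, key, base_pred, k=3, alpha=0.5):
        if not self.keys:
            return base_pred
        K = np.stack(self.keys)
        d = np.linalg.norm(K - np.asarray(key, float), axis=1)
        idx = np.argsort(d)[:k]                    # k nearest memories
        w = 1.0 / (d[idx] + 1e-8)                  # inverse-distance weights
        local = np.dot(w, np.asarray(self.values)[idx]) / w.sum()
        return (1 - alpha) * base_pred + alpha * local
```

Because the adaptation is local to the query's neighbourhood, recently written memories can override a stale parametric prediction without retraining the network.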
Walk-Steered Convolution for Graph Classification
Title | Walk-Steered Convolution for Graph Classification |
Authors | Jiatao Jiang, Chunyan Xu, Zhen Cui, Tong Zhang, Wenming Zheng, Jian Yang |
Abstract | Graph classification is a fundamental but challenging problem for numerous real-world applications. Despite recent great progress in image/video classification, convolutional neural networks (CNNs) cannot yet cater well to graphs because of their non-Euclidean topology. In this work, we propose a walk-steered convolutional (WSC) network that assembles the essential success of standard convolutional neural networks with the powerful representation ability of random walks. Instead of the deterministic neighbor searching used in previous graphical CNNs, we construct multi-scale walk fields (a.k.a. local receptive fields) from random walk paths to depict subgraph structures and promote graph scalability. To express the internal variations of a walk field, Gaussian mixture models are introduced to encode the principal components of the walk paths therein. As an analogy to a standard convolution kernel on images, the Gaussian models implicitly coordinate the unordered vertices/nodes and edges in a local receptive field after projection to the gradient space of the Gaussian parameters. We further stack graph coarsening upon the Gaussian encoding using dynamic clustering, such that high-level semantics of the graph can be learned, much like conventional pooling on images. Experimental results on several public datasets demonstrate the superiority of our WSC method over many state-of-the-art methods for graph classification. |
Tasks | Graph Classification, Video Classification |
Published | 2018-04-16 |
URL | https://arxiv.org/abs/1804.05837v2 |
PDF | https://arxiv.org/pdf/1804.05837v2.pdf |
PWC | https://paperswithcode.com/paper/walk-steered-convolution-for-graph |
Repo | |
Framework | |
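The walk-field construction above, random walks from a root node grouped by step depth into a multi-scale receptive field, can be sketched in a few lines. This shows only the walk-field sampling; WSC's Gaussian mixture encoding and graph coarsening are not reproduced here, and `num_walks`/`walk_len` are illustrative parameters.

```python
import random

def walk_field(adj, root, num_walks=4, walk_len=3, seed=0):
    """Sample random walks from `root` and collect, for each step depth,
    the set of nodes reached at that depth. `adj` maps each node to its
    neighbour list; field[0] is the root itself, field[k] the nodes
    reachable at walk step k."""
    rng = random.Random(seed)
    field = [{root}] + [set() for _ in range(walk_len)]
    for _ in range(num_walks):
        node = root
        for step in range(1, walk_len + 1):
            if not adj[node]:
                break                      # dead end: stop this walk
            node = rng.choice(adj[node])
            field[step].add(node)
    return field

# Toy triangle graph, for illustration.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
field = walk_field(adj, 0)
```

Each depth level plays the role of one "scale" of the receptive field around the root, which is what the Gaussian models then summarize.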