Paper Group ANR 1088
Understanding and Improving Kernel Local Descriptors. Recurrent Neural Networks with Pre-trained Language Model Embedding for Slot Filling Task. Contextual Stochastic Block Models. An Asymptotically Optimal Strategy for Constrained Multi-armed Bandit Problems. PriPeARL: A Framework for Privacy-Preserving Analytics and Reporting at LinkedIn. Imitati …
Understanding and Improving Kernel Local Descriptors
Title | Understanding and Improving Kernel Local Descriptors |
Authors | Arun Mukundan, Giorgos Tolias, Andrei Bursuc, Hervé Jégou, Ondřej Chum |
Abstract | We propose a multiple-kernel local-patch descriptor based on efficient match kernels from pixel gradients. It combines two parametrizations of gradient position and direction; each parametrization provides robustness to a different type of patch mis-registration: the polar parametrization for noise in the detection of the patch's dominant orientation, the Cartesian for imprecise location of the feature point. Combined with whitening of the descriptor space, learned with or without supervision, performance is significantly improved. We analyze the effect of the whitening on patch similarity and demonstrate its semantic meaning. Our unsupervised variant is the best-performing descriptor constructed without the need for labeled data. Despite its simplicity, the proposed descriptor competes well with deep learning approaches on a number of different tasks. |
Tasks | |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.11147v1 |
PDF | http://arxiv.org/pdf/1811.11147v1.pdf |
PWC | https://paperswithcode.com/paper/understanding-and-improving-kernel-local |
Repo | |
Framework | |
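The whitening step the abstract credits with a significant performance gain can be sketched as standard PCA whitening of the descriptor space followed by L2 normalization. This is a minimal numpy illustration of the unsupervised variant only; the shapes, the `eps` regularizer, and the synthetic descriptors are all chosen for illustration, not taken from the paper.

```python
import numpy as np

def learn_whitening(X, eps=1e-8):
    """Learn an unsupervised whitening transform from descriptors X (n x dim)."""
    mu = X.mean(axis=0)
    cov = np.cov(X - mu, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    # Rotate into the eigenbasis and rescale by inverse sqrt of the eigenvalues.
    W = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
    return mu, W

def whiten(X, mu, W):
    Z = (X - mu) @ W
    # L2-normalize so patch similarity reduces to a dot product.
    return Z / np.linalg.norm(Z, axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 16)) @ rng.normal(size=(16, 16))  # correlated "descriptors"
mu, W = learn_whitening(X)
Z = whiten(X, mu, W)
```

After the transform, descriptor dimensions are decorrelated with unit variance, which is what lets a plain dot product act as the match score.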
Recurrent Neural Networks with Pre-trained Language Model Embedding for Slot Filling Task
Title | Recurrent Neural Networks with Pre-trained Language Model Embedding for Slot Filling Task |
Authors | Liang Qiu, Yuanyi Ding, Lei He |
Abstract | In recent years, Recurrent Neural Network (RNN) based models have been applied to the Slot Filling problem of Spoken Language Understanding and achieved state-of-the-art performance. In this paper, we investigate the effect of incorporating pre-trained language models into RNN-based Slot Filling models. Our evaluation on the Airline Travel Information System (ATIS) corpus shows that we can significantly reduce the size of the labeled training data and achieve the same level of Slot Filling performance by incorporating extra word-embedding and language-model-embedding layers pre-trained on unlabeled corpora. |
Tasks | Language Modelling, Slot Filling, Spoken Language Understanding |
Published | 2018-12-12 |
URL | http://arxiv.org/abs/1812.05199v2 |
PDF | http://arxiv.org/pdf/1812.05199v2.pdf |
PWC | https://paperswithcode.com/paper/recurrent-neural-networks-with-pre-trained |
Repo | |
Framework | |
Contextual Stochastic Block Models
Title | Contextual Stochastic Block Models |
Authors | Yash Deshpande, Andrea Montanari, Elchanan Mossel, Subhabrata Sen |
Abstract | We provide the first information-theoretically tight analysis for inference of latent community structure given a sparse graph along with high-dimensional node covariates correlated with the same latent communities. Our work bridges recent theoretical breakthroughs in the detection of latent community structure without node covariates and a large body of empirical work using diverse heuristics for combining node covariates with graphs for inference. The tightness of our analysis implies, in particular, the information-theoretic necessity of combining the different sources of information. Our analysis holds for networks of large degrees as well as for a Gaussian version of the model. |
Tasks | |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.09596v1 |
PDF | http://arxiv.org/pdf/1807.09596v1.pdf |
PWC | https://paperswithcode.com/paper/contextual-stochastic-block-models |
Repo | |
Framework | |
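The model the abstract studies pairs a sparse two-community graph with node covariates that lean toward a community-dependent direction. A minimal generator makes the setup concrete; every parameter name (`c_in`, `c_out`, `mu`) and the exact covariate scaling here are illustrative choices, not the paper's normalization.

```python
import numpy as np

def sample_contextual_sbm(n=200, d=10, c_in=8.0, c_out=2.0, mu=1.0, seed=0):
    """Sample a two-community SBM with correlated Gaussian node covariates.

    Nodes get labels +/-1; edges appear with probability c_in/n inside a
    community and c_out/n across. Each node's covariate vector is its
    label times mu along a shared direction u, plus Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    labels = rng.choice([-1, 1], size=n)
    same = labels[:, None] == labels[None, :]
    P = np.where(same, c_in / n, c_out / n)
    A = (rng.random((n, n)) < P).astype(int)
    A = np.triu(A, 1)
    A = A + A.T                                # symmetric, no self-loops
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)
    B = labels[:, None] * mu * u[None, :] + rng.normal(size=(n, d)) / np.sqrt(d)
    return A, B, labels

A, B, labels = sample_contextual_sbm()
```

The point of the paper's analysis is that for some parameter regimes neither `A` nor `B` alone reveals `labels`, while the pair does.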
An Asymptotically Optimal Strategy for Constrained Multi-armed Bandit Problems
Title | An Asymptotically Optimal Strategy for Constrained Multi-armed Bandit Problems |
Authors | Hyeong Soo Chang |
Abstract | For the stochastic multi-armed bandit (MAB) problem from a constrained model that generalizes the classical one, we show that asymptotic optimality is achievable by a simple strategy extended from the $\epsilon_t$-greedy strategy. We provide a finite-time lower bound on the probability of correctly selecting an optimal near-feasible arm that holds for all time steps. Under some conditions, the bound approaches one as the time $t$ goes to infinity. A particular example sequence $\{\epsilon_t\}$ with an asymptotic convergence rate on the order of $(1-\frac{1}{t})^4$, holding for all sufficiently large $t$, is also discussed. |
Tasks | |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01237v1 |
PDF | http://arxiv.org/pdf/1805.01237v1.pdf |
PWC | https://paperswithcode.com/paper/an-asymptotically-optimal-strategy-for |
Repo | |
Framework | |
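To make the flavor of an $\epsilon_t$-greedy strategy under a cost constraint concrete, here is a heavily simplified sketch: exploration probability decays as $c/t$, and exploitation is restricted to arms whose running mean cost stays within a budget. The feasibility rule, the constant `c`, and the arm interface are all illustrative assumptions; this is not the paper's exact strategy or its $\{\epsilon_t\}$ sequence.

```python
import random

def epsilon_t_greedy(arms, horizon, c=5.0, budget=0.5, seed=0):
    """Simplified epsilon_t-greedy for a constrained bandit.

    `arms` is a list of (reward_fn, cost_fn) pairs returning samples in
    [0, 1]. An arm is treated as infeasible once its running mean cost
    exceeds `budget`. Illustrative sketch only.
    """
    rng = random.Random(seed)
    n = len(arms)
    counts, reward_sum, cost_sum = [0] * n, [0.0] * n, [0.0] * n

    def pull(i):
        counts[i] += 1
        reward_sum[i] += arms[i][0]()
        cost_sum[i] += arms[i][1]()

    for i in range(n):                       # initialise: pull each arm once
        pull(i)
    for t in range(n + 1, horizon + 1):
        eps = min(1.0, c / t)                # exploration decays like 1/t
        feasible = [i for i in range(n)
                    if cost_sum[i] / counts[i] <= budget]
        if rng.random() < eps or not feasible:
            pull(rng.randrange(n))           # explore uniformly
        else:                                # exploit the best feasible arm
            pull(max(feasible, key=lambda i: reward_sum[i] / counts[i]))
    feasible = [i for i in range(n) if cost_sum[i] / counts[i] <= budget]
    return max(feasible or range(n), key=lambda i: reward_sum[i] / counts[i])

# Three toy arms: the third has the best reward but violates the cost budget.
arms = [(lambda: 0.3, lambda: 0.2),
        (lambda: 0.9, lambda: 0.2),
        (lambda: 1.0, lambda: 0.9)]
best = epsilon_t_greedy(arms, horizon=200)
```

With these deterministic arms the strategy settles on the best near-feasible arm (index 1) rather than the unconstrained maximizer.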
PriPeARL: A Framework for Privacy-Preserving Analytics and Reporting at LinkedIn
Title | PriPeARL: A Framework for Privacy-Preserving Analytics and Reporting at LinkedIn |
Authors | Krishnaram Kenthapadi, Thanh T. L. Tran |
Abstract | Preserving privacy of users is a key requirement of web-scale analytics and reporting applications, and has witnessed a renewed focus in light of recent data breaches and new regulations such as GDPR. We focus on the problem of computing robust, reliable analytics in a privacy-preserving manner, while satisfying product requirements. We present PriPeARL, a framework for privacy-preserving analytics and reporting, inspired by differential privacy. We describe the overall design and architecture, and the key modeling components, focusing on the unique challenges associated with privacy, coverage, utility, and consistency. We perform an experimental study in the context of ads analytics and reporting at LinkedIn, thereby demonstrating the tradeoffs between privacy and utility needs, and the applicability of privacy-preserving mechanisms to real-world data. We also highlight the lessons learned from the production deployment of our system at LinkedIn. |
Tasks | |
Published | 2018-09-20 |
URL | http://arxiv.org/abs/1809.07754v1 |
PDF | http://arxiv.org/pdf/1809.07754v1.pdf |
PWC | https://paperswithcode.com/paper/pripearl-a-framework-for-privacy-preserving |
Repo | |
Framework | |
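The core primitive behind a differential-privacy-inspired analytics layer like the one described above is noise addition to aggregate counts. A minimal sketch of the Laplace mechanism for a count query follows; the rounding and clamping are plausible reporting-layer post-processing steps, not PriPeARL's exact pipeline.

```python
import numpy as np

def private_count(true_count, epsilon, seed=None):
    """Report a count under epsilon-differential privacy via the Laplace
    mechanism. A count query has sensitivity 1, so adding noise drawn
    from Laplace(1/epsilon) suffices; rounding and clamping at zero keep
    reported values plausible (post-processing preserves privacy)."""
    rng = np.random.default_rng(seed)
    noisy = true_count + rng.laplace(scale=1.0 / epsilon)
    return max(0, round(noisy))
```

Smaller `epsilon` means stronger privacy and noisier reports, which is exactly the privacy/utility trade-off the experimental study measures.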
Imitation Learning for End to End Vehicle Longitudinal Control with Forward Camera
Title | Imitation Learning for End to End Vehicle Longitudinal Control with Forward Camera |
Authors | Laurent George, Thibault Buhet, Emilie Wirbel, Gaetan Le-Gall, Xavier Perrotton |
Abstract | In this paper we present a complete study of an end-to-end imitation learning system for speed control of a real car, based on a neural network with Long Short-Term Memory (LSTM). To achieve robustness and generalization from expert demonstrations, we propose data augmentation and label augmentation that are relevant for imitation learning in the longitudinal control context. Based on the front camera image only, our system is able to correctly control the speed of a car in a simulation environment, and of a real car on a challenging test track. The system also shows promising results in an open-road context. |
Tasks | Data Augmentation, Imitation Learning |
Published | 2018-12-14 |
URL | http://arxiv.org/abs/1812.05841v1 |
PDF | http://arxiv.org/pdf/1812.05841v1.pdf |
PWC | https://paperswithcode.com/paper/imitation-learning-for-end-to-end-vehicle |
Repo | |
Framework | |
On the effectiveness of task granularity for transfer learning
Title | On the effectiveness of task granularity for transfer learning |
Authors | Farzaneh Mahdisoltani, Guillaume Berger, Waseem Gharbieh, David Fleet, Roland Memisevic |
Abstract | We describe a DNN for video classification and captioning, trained end-to-end with shared features, to solve tasks at different levels of granularity, exploring the link between granularity in a source task and the quality of learned features for transfer learning. To solve a new task domain in transfer learning, we freeze the trained encoder and fine-tune a neural net on the target domain. We train on the Something-Something dataset with over 220,000 videos and multiple levels of target granularity, including 50 action groups, 174 fine-grained action categories, and captions. Classification and captioning on Something-Something are challenging because of the subtle differences between actions, applied to thousands of different object classes, and the diversity of captions penned by crowd actors. Our model performs better than existing classification baselines for Something-Something, with impressive fine-grained results, and it yields a strong baseline on the new Something-Something captioning task. Experiments reveal that training with more fine-grained tasks tends to produce better features for transfer learning. |
Tasks | Transfer Learning, Video Classification |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.09235v2 |
PDF | http://arxiv.org/pdf/1804.09235v2.pdf |
PWC | https://paperswithcode.com/paper/on-the-effectiveness-of-task-granularity-for |
Repo | |
Framework | |
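The transfer protocol above (freeze the trained encoder, fine-tune a small net on the target domain) can be sketched with a stand-in encoder and a logistic head trained by gradient descent. Everything here is synthetic: the "encoder" is a fixed random projection rather than a trained video network, and the data is fabricated so the frozen features happen to transfer well.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "pre-trained" encoder: a fixed random projection with tanh.
# "Freezing" means these weights receive no gradient below.
W_enc = rng.normal(size=(20, 8))

def encode(X):
    return np.tanh(X @ W_enc)

# Synthetic target-domain task: the label is readable from the frozen
# features, as it would be if the source features transfer well.
X = rng.normal(size=(300, 20))
H = encode(X)                        # frozen features; W_enc never updated
y = (H[:, 0] > 0).astype(float)

# Fine-tune only a logistic-regression head on top of the frozen features.
w, b = np.zeros(8), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(H @ w + b)))
    g = p - y                        # gradient of cross-entropy wrt the logits
    w -= 0.1 * H.T @ g / len(y)
    b -= 0.1 * g.mean()

acc = (((1.0 / (1.0 + np.exp(-(H @ w + b)))) > 0.5) == y).mean()
```

Only `w` and `b` change during fine-tuning; how good `acc` can get is entirely a property of the frozen features, which is the quantity the paper uses to compare source-task granularities.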
Multi-distance Support Matrix Machines
Title | Multi-distance Support Matrix Machines |
Authors | Yunfei Ye, Dong Han |
Abstract | Real-world data such as digital images, MRI scans, and electroencephalography signals are naturally represented as matrices with structural information. Most existing classifiers aim to capture these structures by regularizing the regression matrix to be low-rank or sparse. Other methodologies introduce factorization techniques to explore nonlinear relationships of matrix data in kernel space. In this paper, we propose a multi-distance support matrix machine (MDSMM), which provides a principled way of solving matrix classification problems. The multi-distance is introduced to capture the correlation within matrix data by means of the intrinsic information in the rows and columns of the input data. A complex hyperplane is established upon these values to separate distinct classes. We further study generalization bounds for i.i.d. and non-i.i.d. processes based on both SVM and SMM classifiers. For typical hypothesis classes where matrix norms are constrained, MDSMM achieves a faster learning rate than traditional classifiers. We also provide a more general approach for samples without prior knowledge. We demonstrate the merits of the proposed method with exhaustive experiments on both a simulation study and a number of real-world datasets. |
Tasks | |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00451v2 |
PDF | http://arxiv.org/pdf/1807.00451v2.pdf |
PWC | https://paperswithcode.com/paper/multi-distance-support-matrix-machines |
Repo | |
Framework | |
Portfolio Optimization for Cointelated Pairs: SDEs vs. Machine Learning
Title | Portfolio Optimization for Cointelated Pairs: SDEs vs. Machine Learning |
Authors | Babak Mahdavi-Damghani, Konul Mustafayeva, Stephen Roberts, Cristin Buescu |
Abstract | With the recent rise of Machine Learning (ML) as a candidate to partially replace classic Financial Mathematics (FM) methodologies, we investigate the performance of both in solving the problem of dynamic portfolio optimization in a continuous-time, finite-horizon setting for a portfolio of two intertwined assets. In the FM approach we model the asset prices not via the common approaches used in pairs trading, such as high correlation or cointegration, but with the cointelation model, which aims to reconcile both short-term risk and long-term equilibrium. We maximize the overall P&L with an FM approach that dynamically switches between a mean-variance optimal strategy and a power-utility-maximizing strategy. We use a stochastic control formulation of the power utility maximization problem and numerically solve the resulting HJB equation with the Deep Galerkin method. We then turn to ML for the same P&L maximization problem and use cluster analysis to devise bands, combined with in-band optimization. Although this approach is model agnostic, results obtained with data simulated from the same cointelation model as in the FM approach give an edge to ML. |
Tasks | Portfolio Optimization |
Published | 2018-12-26 |
URL | https://arxiv.org/abs/1812.10183v2 |
PDF | https://arxiv.org/pdf/1812.10183v2.pdf |
PWC | https://paperswithcode.com/paper/portfolio-optimization-for-cointelated-pairs |
Repo | |
Framework | |
Improving Regression Performance with Distributional Losses
Title | Improving Regression Performance with Distributional Losses |
Authors | Ehsan Imani, Martha White |
Abstract | There is growing evidence that converting targets to soft targets in supervised learning can provide considerable gains in performance. Much of this work has considered classification, converting hard zero-one values to soft labels, such as by adding label noise, incorporating label ambiguity, or using distillation. In parallel, there is some evidence from a regression setting in reinforcement learning that learning distributions can improve performance. In this work, we investigate the reasons for this improvement in a regression setting. We introduce a novel distributional regression loss and similarly find that it significantly improves prediction accuracy. We investigate several common hypotheses around reduced overfitting and improved representations. We instead find evidence for an alternative hypothesis: this loss is easier to optimize, with better-behaved gradients, resulting in improved generalization. We provide theoretical support for this alternative hypothesis by characterizing the norm of the gradients of this loss. |
Tasks | |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04613v1 |
PDF | http://arxiv.org/pdf/1806.04613v1.pdf |
PWC | https://paperswithcode.com/paper/improving-regression-performance-with |
Repo | |
Framework | |
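A distributional regression loss in the spirit described above replaces the scalar target with a soft distribution over fixed bins and trains with cross-entropy. The sketch below follows the HL-Gaussian idea of smearing the target with a Gaussian; the bin layout and `sigma` are illustrative choices.

```python
import numpy as np

def soft_target(y, centers, sigma=0.1):
    """Soft histogram target: a Gaussian bump around scalar y, evaluated
    at fixed bin centers and normalised to a distribution."""
    w = np.exp(-0.5 * ((centers - y) / sigma) ** 2)
    return w / w.sum()

def hl_loss(logits, y, centers, sigma=0.1):
    """Cross-entropy between the predicted bin distribution (softmax of
    `logits`) and the soft target around y."""
    p = np.exp(logits - logits.max())
    p /= p.sum()
    t = soft_target(y, centers, sigma)
    return -(t * np.log(p + 1e-12)).sum()

centers = np.linspace(0.0, 1.0, 11)   # 11 bins on [0, 1], illustrative
```

At prediction time a scalar estimate is recovered as the distribution's mean, `(p * centers).sum()`, so the model is still usable as an ordinary regressor.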
Differentially Private Online Submodular Optimization
Title | Differentially Private Online Submodular Optimization |
Authors | Adrian Rivera Cardoso, Rachel Cummings |
Abstract | In this paper we develop the first algorithms for online submodular minimization that preserve differential privacy under full information feedback and bandit feedback. A sequence of $T$ submodular functions over a collection of $n$ elements arrive online, and at each timestep the algorithm must choose a subset of $[n]$ before seeing the function. The algorithm incurs a cost equal to the function evaluated on the chosen set, and seeks to choose a sequence of sets that achieves low expected regret. Our first result is in the full information setting, where the algorithm can observe the entire function after making its decision at each timestep. We give an algorithm in this setting that is $\epsilon$-differentially private and achieves expected regret $\tilde{O}\left(\frac{n^{3/2}\sqrt{T}}{\epsilon}\right)$. This algorithm works by relaxing the submodular function to a convex function using the Lovász extension, and then simulating an algorithm for differentially private online convex optimization. Our second result is in the bandit setting, where the algorithm can only see the cost incurred by its chosen set, and does not have access to the entire function. This setting is significantly more challenging because the algorithm does not receive enough information to compute the Lovász extension or its subgradients. Instead, we construct an unbiased estimate using a single-point estimation, and then simulate private online convex optimization using this estimate. Our algorithm using bandit feedback is $\epsilon$-differentially private and achieves expected regret $\tilde{O}\left(\frac{n^{3/2}T^{3/4}}{\epsilon}\right)$. |
Tasks | |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02290v1 |
PDF | http://arxiv.org/pdf/1807.02290v1.pdf |
PWC | https://paperswithcode.com/paper/differentially-private-online-submodular |
Repo | |
Framework | |
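The convex relaxation at the heart of the full-information algorithm is the Lovász extension: sort the coordinates of $x \in [0,1]^n$ in decreasing order and accumulate the marginal gains of the corresponding nested sets. A minimal sketch of just that evaluation (not the private optimization built on top of it):

```python
import numpy as np

def lovasz_extension(F, x):
    """Evaluate the Lovász extension of set function F at x in [0,1]^n:
    with coordinates sorted decreasingly and S_i the set of the top-i
    indices, f(x) = sum_i x_{pi(i)} * (F(S_i) - F(S_{i-1}))."""
    order = np.argsort(-x)
    val, prev = 0.0, F(frozenset())
    S = set()
    for i in order:
        S.add(int(i))
        cur = F(frozenset(S))
        val += x[i] * (cur - prev)
        prev = cur
    return val
```

Two sanity checks: for the modular function $F(S)=|S|$ the extension is simply $\sum_i x_i$, and for the coverage-style function $F(S)=\min(|S|,1)$ it is $\max_i x_i$.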
IKA: Independent Kernel Approximator
Title | IKA: Independent Kernel Approximator |
Authors | Matteo Ronchetti |
Abstract | This paper describes a new method for low-rank kernel approximation called IKA. The main advantage of IKA is that it produces a function $\psi(x)$ defined as a linear combination of arbitrarily chosen functions. In contrast, the approximation produced by the Nyström method is a linear combination of kernel evaluations. The proposed method consistently outperformed the Nyström method in a comparison on the STL-10 dataset. Numerical results are reproducible using the source code available at https://gitlab.com/matteo-ronchetti/IKA |
Tasks | |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01353v1 |
PDF | http://arxiv.org/pdf/1809.01353v1.pdf |
PWC | https://paperswithcode.com/paper/ika-independent-kernel-approximator |
Repo | |
Framework | |
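For context, the classical Nyström baseline that IKA is compared against builds a low-rank kernel approximation from evaluations at a landmark subset. A minimal numpy sketch of that baseline (not of IKA itself); the RBF kernel, bandwidth, and landmark count are illustrative.

```python
import numpy as np

def rbf(A, B, gamma=0.1):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom(kernel, X, landmarks):
    """Classical Nystrom approximation K ~= K_nm pinv(K_mm) K_nm^T.
    The resulting feature map is a linear combination of kernel
    evaluations at the landmarks, which is exactly the restriction
    IKA lifts by allowing arbitrarily chosen basis functions."""
    K_nm = kernel(X, landmarks)
    K_mm = kernel(landmarks, landmarks)
    return K_nm @ np.linalg.pinv(K_mm) @ K_nm.T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
K = rbf(X, X)
K_hat = nystrom(rbf, X, X[:20])                      # 20 landmark points
err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)  # relative Frobenius error
```

With a smooth kernel the Gram matrix's spectrum decays quickly, so even 20 landmarks recover most of a 100-point Gram matrix.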
DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild
Title | DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild |
Authors | Riza Alp Guler, Yuxiang Zhou, George Trigeorgis, Epameinondas Antonakos, Patrick Snape, Stefanos Zafeiriou, Iasonas Kokkinos |
Abstract | In this work we use deep learning to establish dense correspondences between a 3D object model and an image “in the wild”. We introduce “DenseReg”, a fully-convolutional neural network (F-CNN) that densely regresses at every foreground pixel a pair of U-V template coordinates in a single feedforward pass. To train DenseReg we construct a supervision signal by combining 3D deformable model fitting and 2D landmark annotations. We define the regression task in terms of the intrinsic, U-V coordinates of a 3D deformable model that is brought into correspondence with image instances at training time. A host of other object-related tasks (e.g. part segmentation, landmark localization) are shown to be by-products of this task, and to largely improve thanks to its introduction. We obtain highly-accurate regression results by combining ideas from semantic segmentation with regression networks, yielding a ‘quantized regression’ architecture that first obtains a quantized estimate of position through classification, and refines it through regression of the residual. We show that such networks can boost the performance of existing state-of-the-art systems for pose estimation. Firstly, we show that our system can serve as an initialization for Statistical Deformable Models, as well as an element of cascaded architectures that jointly localize landmarks and estimate dense correspondences. We also show that the obtained dense correspondence can act as a source of ‘privileged information’ that complements and extends the pure landmark-level annotations, accelerating and improving the training of pose estimation networks. We report state-of-the-art performance on the challenging 300W benchmark for facial landmark localization and on the MPII and LSP datasets for human pose estimation. |
Tasks | Face Alignment, Pose Estimation, Semantic Segmentation |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.02188v2 |
PDF | http://arxiv.org/pdf/1803.02188v2.pdf |
PWC | https://paperswithcode.com/paper/densereg-fully-convolutional-dense-shape |
Repo | |
Framework | |
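The 'quantized regression' idea described in the abstract (a coarse estimate by classification, refined by regressing the residual) reduces to simple encoding/decoding arithmetic on each U-V coordinate. A sketch of just that arithmetic, with the bin count `K` chosen for illustration:

```python
import numpy as np

def quantized_regress(t, K=10):
    """Split a coordinate t in [0, 1] into a classification target
    (which of K bins) and a regression target (the within-bin residual
    in [0, 1))."""
    t = np.asarray(t, float)
    bin_id = np.minimum((t * K).astype(int), K - 1)   # classification target
    residual = t * K - bin_id                         # regression target
    return bin_id, residual

def decode(bin_id, residual, K=10):
    """Recombine the quantized estimate and the residual refinement."""
    return (bin_id + residual) / K

t = np.array([0.03, 0.57, 0.99])
b, r = quantized_regress(t)
```

The classifier only has to get the coarse bin right; the regressor then only models a bounded residual, which is the accuracy benefit the architecture exploits.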
Memory-based Parameter Adaptation
Title | Memory-based Parameter Adaptation |
Authors | Pablo Sprechmann, Siddhant M. Jayakumar, Jack W. Rae, Alexander Pritzel, Adrià Puigdomènech Badia, Benigno Uria, Oriol Vinyals, Demis Hassabis, Razvan Pascanu, Charles Blundell |
Abstract | Deep neural networks have excelled on a wide range of problems, from vision to language and game playing. Neural networks very gradually incorporate information into weights as they process data, requiring very low learning rates. If the training distribution shifts, the network is slow to adapt, and when it does adapt, it typically performs badly on the training distribution before the shift. Our method, Memory-based Parameter Adaptation, stores examples in memory and then uses a context-based lookup to directly modify the weights of a neural network. Much higher learning rates can be used for this local adaptation, obviating the need for many iterations over similar data before good predictions can be made. As our method is memory-based, it alleviates several shortcomings of neural networks: it mitigates catastrophic forgetting, and it supports fast, stable acquisition of new knowledge, learning with imbalanced class labels, and fast learning during evaluation. We demonstrate this on a range of supervised tasks: large-scale image classification and language modelling. |
Tasks | Image Classification, Language Modelling |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1802.10542v1 |
PDF | http://arxiv.org/pdf/1802.10542v1.pdf |
PWC | https://paperswithcode.com/paper/memory-based-parameter-adaptation |
Repo | |
Framework | |
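The memory-plus-lookup mechanism can be illustrated with an episodic store of (key, value) pairs and a kernel-weighted nearest-neighbour blend at query time. Note this is only loosely in MbPA's spirit: the paper adapts the network's weights with gradient steps on the retrieved neighbours, whereas this sketch just blends the parametric prediction toward the local memory values.

```python
import numpy as np

class EpisodicMemory:
    """Store (key, value) pairs; at query time, pull the base prediction
    toward an inverse-distance-weighted average of the k nearest stored
    values. All names and the blending rule are illustrative."""

    def __init__(self):
        self.keys, self.values = [], []

    def write(self, key, value):
        self.keys.append(np.asarray(key, float))
        self.values.append(float(value))

    def adapted_prediction(self, key, base_pred, k=3, alpha=0.5):
        if not self.keys:
            return base_pred
        K = np.stack(self.keys)
        d = np.linalg.norm(K - np.asarray(key, float), axis=1)
        idx = np.argsort(d)[:k]                    # k nearest memories
        w = 1.0 / (d[idx] + 1e-8)                  # inverse-distance weights
        local = np.dot(w, np.asarray(self.values)[idx]) / w.sum()
        return (1 - alpha) * base_pred + alpha * local
```

Because the adaptation is local to the query's neighbourhood, recently written memories can override a stale parametric prediction without retraining the network.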
Walk-Steered Convolution for Graph Classification
Title | Walk-Steered Convolution for Graph Classification |
Authors | Jiatao Jiang, Chunyan Xu, Zhen Cui, Tong Zhang, Wenming Zheng, Jian Yang |
Abstract | Graph classification is a fundamental but challenging problem for numerous real-world applications. Despite recent great progress in image/video classification, convolutional neural networks (CNNs) cannot yet cater well to graphs because of their non-Euclidean topology. In this work, we propose a walk-steered convolutional (WSC) network that assembles the essential success of standard convolutional neural networks with the powerful representation ability of random walks. Instead of the deterministic neighbor searching used in previous graphical CNNs, we construct multi-scale walk fields (a.k.a. local receptive fields) from random walk paths to depict subgraph structures and promote graph scalability. To express the internal variations of a walk field, Gaussian mixture models are introduced to encode the principal components of the walk paths therein. As an analogy to a standard convolution kernel on images, the Gaussian models implicitly coordinate the unordered vertices/nodes and edges in a local receptive field after projection to the gradient space of the Gaussian parameters. We further stack graph coarsening upon the Gaussian encoding using dynamic clustering, such that high-level semantics of the graph can be learned, much like conventional pooling on images. Experimental results on several public datasets demonstrate the superiority of our WSC method over many state-of-the-art methods for graph classification. |
Tasks | Graph Classification, Video Classification |
Published | 2018-04-16 |
URL | https://arxiv.org/abs/1804.05837v2 |
PDF | https://arxiv.org/pdf/1804.05837v2.pdf |
PWC | https://paperswithcode.com/paper/walk-steered-convolution-for-graph |
Repo | |
Framework | |
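The walk-field construction above, random walks from a root node grouped by step depth into a multi-scale receptive field, can be sketched in a few lines. This shows only the walk-field sampling; WSC's Gaussian mixture encoding and graph coarsening are not reproduced here, and `num_walks`/`walk_len` are illustrative parameters.

```python
import random

def walk_field(adj, root, num_walks=4, walk_len=3, seed=0):
    """Sample random walks from `root` and collect, for each step depth,
    the set of nodes reached at that depth. `adj` maps each node to its
    neighbour list; field[0] is the root itself, field[k] the nodes
    reachable at walk step k."""
    rng = random.Random(seed)
    field = [{root}] + [set() for _ in range(walk_len)]
    for _ in range(num_walks):
        node = root
        for step in range(1, walk_len + 1):
            if not adj[node]:
                break                      # dead end: stop this walk
            node = rng.choice(adj[node])
            field[step].add(node)
    return field

# Toy triangle graph, for illustration.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
field = walk_field(adj, 0)
```

Each depth level plays the role of one "scale" of the receptive field around the root, which is what the Gaussian models then summarize.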