Paper Group ANR 404
Confidence Guided Stereo 3D Object Detection with Split Depth Estimation
| Title | Confidence Guided Stereo 3D Object Detection with Split Depth Estimation |
|---|---|
| Authors | Chengyao Li, Jason Ku, Steven L. Waslander |
| Abstract | Accurate and reliable 3D object detection is vital to safe autonomous driving. Despite recent developments, the performance gap between stereo-based methods and LiDAR-based methods is still considerable. Accurate depth estimation is crucial to the performance of stereo-based 3D object detection methods, particularly for those pixels associated with objects in the foreground. Moreover, stereo-based methods suffer from high variance in the depth estimation accuracy, which is often not considered in the object detection pipeline. To tackle these two issues, we propose CG-Stereo, a confidence-guided stereo 3D object detection pipeline that uses separate decoders for foreground and background pixels during depth estimation, and leverages the confidence estimation from the depth estimation network as a soft attention mechanism in the 3D object detector. Our approach outperforms all state-of-the-art stereo-based 3D detectors on the KITTI benchmark. |
| Tasks | 3D Object Detection, Autonomous Driving, Depth Estimation, Object Detection |
| Published | 2020-03-11 |
| URL | https://arxiv.org/abs/2003.05505v1 |
| PDF | https://arxiv.org/pdf/2003.05505v1.pdf |
| PWC | https://paperswithcode.com/paper/confidence-guided-stereo-3d-object-detection |
| Repo | |
| Framework | |
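The abstract's second contribution, using depth confidence as a soft attention signal in the 3D detector, lends itself to a tiny illustration. The sketch below simply gates per-point features by a confidence score in [0, 1]; the point count, feature width, and multiplicative gating are illustrative assumptions, not CG-Stereo's actual architecture.

```python
# Hypothetical sketch: per-point depth-confidence scores used as soft attention over
# pseudo-LiDAR point features, in the spirit of CG-Stereo's confidence guidance.
import numpy as np

rng = np.random.default_rng(0)
N, C = 1024, 64
point_features = rng.normal(size=(N, C))     # features lifted from the stereo depth map (placeholder)
depth_confidence = rng.uniform(size=(N, 1))  # per-pixel confidence from the depth network, in [0, 1]

# Soft attention: down-weight points whose depth estimate is unreliable, so the
# detector relies more heavily on points with trustworthy depth.
attended_features = depth_confidence * point_features
print(attended_features.shape)  # (1024, 64)
```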
Learning discrete state abstractions with deep variational inference
| Title | Learning discrete state abstractions with deep variational inference |
|---|---|
| Authors | Ondrej Biza, Robert Platt, Jan-Willem van de Meent, Lawson L. S. Wong |
| Abstract | Abstraction is crucial for effective sequential decision making in domains with large state spaces. In this work, we propose a variational information bottleneck method for learning approximate bisimulations, a type of state abstraction. We use a deep neural net encoder to map states onto continuous embeddings. The continuous latent space is then compressed into a discrete representation using an action-conditioned hidden Markov model, which is trained end-to-end with the neural network. Our method is suited for environments with high-dimensional states and learns from a stream of experience collected by an agent acting in a Markov decision process. Through a learned discrete abstract model, we can efficiently plan for unseen goals in a multi-goal Reinforcement Learning setting. We test our method in simplified robotic manipulation domains with image states. We also compare it against previous model-based approaches to finding bisimulations in discrete grid-world-like environments. |
| Tasks | Decision Making, Multi-Goal Reinforcement Learning |
| Published | 2020-03-09 |
| URL | https://arxiv.org/abs/2003.04300v1 |
| PDF | https://arxiv.org/pdf/2003.04300v1.pdf |
| PWC | https://paperswithcode.com/paper/learning-discrete-state-abstractions-with |
| Repo | |
| Framework | |
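As background for what a bisimulation abstraction groups together, the sketch below runs classical partition refinement on a tiny tabular MDP: states stay in the same block only if their rewards and their transition mass onto the current blocks agree for every action. The paper itself learns approximate bisimulations for image states with a neural encoder and an action-conditioned HMM; none of that machinery is reproduced here, and the random MDP is purely illustrative.

```python
# Illustrative only: exact bisimulation via partition refinement on a tiny tabular MDP.
import numpy as np

n_states, n_actions = 6, 2
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # P[a, s] = next-state distribution
R = np.array([0., 0., 1., 1., 0., 0.])                             # per-state reward

def refine(partition):
    """Split blocks whose states differ in reward or in transition mass onto current blocks."""
    new_partition = []
    for block in partition:
        signature = {}
        for s in block:
            sig = (R[s],) + tuple(
                round(P[a, s, list(other)].sum(), 6)
                for a in range(n_actions) for other in partition
            )
            signature.setdefault(sig, []).append(s)
        new_partition.extend(signature.values())
    return new_partition

partition = [list(range(n_states))]
while True:
    refined = refine(partition)
    if len(refined) == len(partition):
        break
    partition = refined
print(partition)  # blocks of bisimilar states, i.e. the abstract states
```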
MDLdroid: a ChainSGD-reduce Approach to Mobile Deep Learning for Personal Mobile Sensing
| Title | MDLdroid: a ChainSGD-reduce Approach to Mobile Deep Learning for Personal Mobile Sensing |
|---|---|
| Authors | Yu Zhang, Tao Gu, Xi Zhang |
| Abstract | Personal mobile sensing is fast permeating our daily lives to enable activity monitoring, healthcare and rehabilitation. Combined with deep learning, these applications have achieved significant success in recent years. Different from conventional cloud-based paradigms, running deep learning on devices offers several advantages, including data privacy preservation and low-latency response for both model inference and update. Since data collection is costly in reality, Google's Federated Learning offers not only complete data privacy but also better model robustness based on multiple users' data. However, personal mobile sensing applications are mostly user-specific and highly affected by the environment. As a result, continuous local changes may seriously affect the performance of a global model generated by Federated Learning. In addition, deploying Federated Learning on a local server, e.g., an edge server, may quickly reach a bottleneck due to resource constraints and is exposed to serious failures under attack. Towards pushing deep learning onto devices, we present MDLdroid, a novel decentralized mobile deep learning framework that enables resource-aware on-device collaborative learning for personal mobile sensing applications. To address resource limitations, we propose a ChainSGD-reduce approach, which includes a novel chain-directed Synchronous Stochastic Gradient Descent algorithm to effectively reduce overhead among multiple devices. We also design an agent-based multi-goal reinforcement learning mechanism to balance resources in a fair and efficient manner. Our evaluations show that model training on off-the-shelf mobile devices is 2x to 3.5x faster than single-device training, and 1.5x faster than the master-slave approach. |
| Tasks | Multi-Goal Reinforcement Learning |
| Published | 2020-02-07 |
| URL | https://arxiv.org/abs/2002.02897v2 |
| PDF | https://arxiv.org/pdf/2002.02897v2.pdf |
| PWC | https://paperswithcode.com/paper/mdldroid-a-chainsgd-reduce-approach-to-mobile |
| Repo | |
| Framework | |
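The core communication pattern named in the abstract, a chain-directed synchronous SGD reduction, can be illustrated in a few lines: gradients are summed along a chain of devices, averaged at the tail, and the averaged update is sent back along the chain. The sketch below is a toy single-process simulation with made-up shapes; MDLdroid's scheduling, fault handling, and RL-based resource balancing are not modeled.

```python
# Hedged sketch of a chain-style gradient reduction across devices.
import numpy as np

rng = np.random.default_rng(2)
n_devices, n_params = 4, 10
local_grads = [rng.normal(size=n_params) for _ in range(n_devices)]

# Forward pass along the chain: each device adds its local gradient to the running sum.
running = np.zeros(n_params)
for g in local_grads:
    running = running + g          # device i passes the partial sum to device i+1

avg_grad = running / n_devices     # the tail of the chain computes the average

# Backward pass along the chain: the averaged gradient is propagated to every device,
# and each device applies the same synchronous SGD step.
lr = 0.1
params = [np.zeros(n_params) for _ in range(n_devices)]
params = [p - lr * avg_grad for p in params]
print(params[0][:3])
```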
Graph Universal Adversarial Attacks: A Few Bad Actors Ruin Graph Learning Models
| Title | Graph Universal Adversarial Attacks: A Few Bad Actors Ruin Graph Learning Models |
|---|---|
| Authors | Xiao Zang, Yi Xie, Jie Chen, Bo Yuan |
| Abstract | Deep neural networks, while generalizing well, are known to be sensitive to small adversarial perturbations. This phenomenon poses a severe security threat and calls for in-depth investigation of the robustness of deep learning models. With the emergence of neural networks for graph-structured data, similar investigations are urged to understand their robustness. It has been found that adversarially perturbing the graph structure and/or node features may result in a significant degradation of the model performance. In this work, we show from a different angle that such fragility similarly occurs if the graph contains a few bad-actor nodes, which compromise a trained graph neural network by flipping their connections to any targeted victim. Worse, the bad actors found for one graph model severely compromise other models as well. We call the bad actors “anchor nodes” and propose an algorithm, named GUA, to identify them. Thorough empirical investigation yields an interesting finding: the anchor nodes often belong to the same class. It also corroborates the intuitive trade-off between the number of anchor nodes and the attack success rate. For the Cora dataset, which contains 2,708 nodes, as few as six anchor nodes result in an attack success rate higher than 80% for GCN and three other models. |
| Tasks | |
| Published | 2020-02-12 |
| URL | https://arxiv.org/abs/2002.04784v1 |
| PDF | https://arxiv.org/pdf/2002.04784v1.pdf |
| PWC | https://paperswithcode.com/paper/graph-universal-adversarial-attacks-a-few-bad |
| Repo | |
| Framework | |
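The attack surface described in the abstract, a handful of anchor nodes whose connections to an arbitrary victim are flipped, is easy to picture on an adjacency matrix. The sketch below performs only the flipping step on a toy graph; the GUA algorithm that actually selects good anchor nodes, and the GCN being attacked, are not reproduced, and the anchor indices are placeholders.

```python
# Illustration of the attack surface: flip the connections between fixed "anchor" nodes
# and a chosen victim node in the adjacency matrix.
import numpy as np

n_nodes = 8
A = np.zeros((n_nodes, n_nodes), dtype=int)   # toy undirected adjacency matrix
A[0, 1] = A[1, 0] = 1
A[2, 3] = A[3, 2] = 1

anchors = [5, 6]   # hypothetical anchor nodes (GUA would select these)
victim = 0         # any targeted node

A_attacked = A.copy()
for a in anchors:
    # Flip the anchor-victim connection: add the edge if absent, remove it if present.
    A_attacked[a, victim] ^= 1
    A_attacked[victim, a] ^= 1

print(A_attacked[victim])  # the victim's row now connects to the anchors
```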
Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression
| Title | Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression |
|---|---|
| Authors | Yawei Li, Shuhang Gu, Christoph Mayer, Luc Van Gool, Radu Timofte |
| Abstract | In this paper, we analyze two popular network compression techniques, i.e. filter pruning and low-rank decomposition, in a unified sense. By simply changing the way the sparsity regularization is enforced, filter pruning and low-rank decomposition can be derived accordingly. This provides another flexible choice for network compression because the techniques complement each other. For example, in popular network architectures with shortcut connections (e.g. ResNet), filter pruning cannot deal with the last convolutional layer in a ResBlock while the low-rank decomposition methods can. In addition, we propose to compress the whole network jointly instead of in a layer-wise manner. Our approach proves its potential as it compares favorably to the state-of-the-art on several benchmarks. |
| Tasks | |
| Published | 2020-03-19 |
| URL | https://arxiv.org/abs/2003.08935v1 |
| PDF | https://arxiv.org/pdf/2003.08935v1.pdf |
| PWC | https://paperswithcode.com/paper/group-sparsity-the-hinge-between-filter |
| Repo | |
| Framework | |
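The "hinge" in the title is a group-sparsity regularizer whose grouping decides whether zeroed groups correspond to prunable filters or to droppable low-rank components. The sketch below computes a plain group-lasso penalty over a convolutional weight tensor grouped per output filter, which corresponds to the pruning view; the grouping choice, shapes, and coefficient are illustrative assumptions rather than the paper's exact formulation.

```python
# Minimal sketch of group-sparsity regularization on a convolutional weight tensor.
import numpy as np

rng = np.random.default_rng(3)
out_ch, in_ch, k = 16, 8, 3
W = rng.normal(size=(out_ch, in_ch, k, k))      # conv filter bank

# Group lasso: sum of per-group L2 norms. Groups driven to zero can be pruned
# (when groups are filters) or dropped as low-rank components (when groups are
# columns of the reshaped weight matrix).
filter_groups = W.reshape(out_ch, -1)           # one group per output filter
group_lasso = np.linalg.norm(filter_groups, axis=1).sum()

lam = 1e-3
regularized_term = lam * group_lasso            # would be added to the task loss during training
print(regularized_term)
```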
Towards Backward-Compatible Representation Learning
| Title | Towards Backward-Compatible Representation Learning |
|---|---|
| Authors | Yantao Shen, Yuanjun Xiong, Wei Xia, Stefano Soatto |
| Abstract | We propose a way to learn visual features that are compatible with previously computed ones even when they have different dimensions and are learned via different neural network architectures and loss functions. Compatible means that, if such features are used to compare images, then “new” features can be compared directly to “old” features, so they can be used interchangeably. This enables visual search systems to bypass computing new features for all previously seen images when updating the embedding models, a process known as backfilling. Backward compatibility is critical to quickly deploy new embedding models that leverage ever-growing large-scale training datasets and improvements in deep learning architectures and training methods. We propose a framework to train embedding models, called backward-compatible training (BCT), as a first step towards backward compatible representation learning. In experiments on learning embeddings for face recognition, models trained with BCT successfully achieve backward compatibility without sacrificing accuracy, thus enabling backfill-free model updates of visual embeddings. |
| Tasks | Face Recognition, Representation Learning |
| Published | 2020-03-26 |
| URL | https://arxiv.org/abs/2003.11942v2 |
| PDF | https://arxiv.org/pdf/2003.11942v2.pdf |
| PWC | https://paperswithcode.com/paper/towards-backward-compatible-representation |
| Repo | |
| Framework | |
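What backward compatibility buys at deployment time is that a query embedded by the new model can be scored directly against a gallery indexed with the old model, so no backfilling is needed. The sketch below shows only that retrieval step with random placeholder embeddings; BCT's training objective, which is what makes such direct comparison meaningful, is not shown.

```python
# Sketch of backfill-free retrieval: a new-model query feature compared directly
# against gallery features computed by the old model.
import numpy as np

rng = np.random.default_rng(4)
d = 128
old_gallery = rng.normal(size=(1000, d))   # features computed by the old model (placeholder)
new_query = rng.normal(size=(d,))          # feature computed by the new, BCT-trained model (placeholder)

def cosine(a, B):
    return (B @ a) / (np.linalg.norm(B, axis=1) * np.linalg.norm(a) + 1e-12)

scores = cosine(new_query, old_gallery)    # valid only if the two models are compatible
best_match = int(np.argmax(scores))
print(best_match)
```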
Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability
| Title | Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability |
|---|---|
| Authors | Yikai Yan, Chaoyue Niu, Yucheng Ding, Zhenzhe Zheng, Fan Wu, Guihai Chen, Shaojie Tang, Zhihua Wu |
| Abstract | Federated learning is a new distributed machine learning framework, where a bunch of heterogeneous clients collaboratively train a model without sharing training data. In this work, we consider a practical and ubiquitous issue in federated learning: intermittent client availability, where the set of eligible clients may change during the training process. Such an intermittent client availability model would significantly deteriorate the performance of the classical Federated Averaging algorithm (FedAvg for short). We propose a simple distributed non-convex optimization algorithm, called Federated Latest Averaging (FedLaAvg for short), which leverages the latest gradients of all clients, even when the clients are not available, to jointly update the global model in each iteration. Our theoretical analysis shows that FedLaAvg attains the convergence rate of $O(1/(N^{1/4} T^{1/2}))$, achieving a sublinear speedup with respect to the total number of clients. We implement and evaluate FedLaAvg with the CIFAR-10 dataset. The evaluation results demonstrate that FedLaAvg indeed reaches a sublinear speedup and achieves 4.23% higher test accuracy than FedAvg. |
| Tasks | |
| Published | 2020-02-18 |
| URL | https://arxiv.org/abs/2002.07399v1 |
| PDF | https://arxiv.org/pdf/2002.07399v1.pdf |
| PWC | https://paperswithcode.com/paper/distributed-non-convex-optimization-with |
| Repo | |
| Framework | |
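The "latest averaging" idea is concrete enough to sketch: the server caches the most recent gradient from every client and averages the whole cache each round, so clients that are temporarily unavailable still contribute their last known direction. The toy loop below uses a synthetic quadratic objective and a Bernoulli availability model as stand-ins; step sizes and the paper's convergence analysis are not reflected.

```python
# Hedged sketch of latest-gradient averaging under intermittent client availability.
import numpy as np

rng = np.random.default_rng(5)
n_clients, dim, rounds, lr = 10, 5, 20, 0.1
w = np.zeros(dim)
latest_grads = np.zeros((n_clients, dim))       # last gradient seen from each client

def client_gradient(i, w):
    # Toy quadratic objective per client; stands in for local training.
    target = np.full(dim, fill_value=i / n_clients)
    return w - target

for t in range(rounds):
    available = rng.random(n_clients) < 0.5     # intermittent availability (placeholder model)
    for i in np.flatnonzero(available):
        latest_grads[i] = client_gradient(i, w) # only available clients refresh their entry
    w = w - lr * latest_grads.mean(axis=0)      # average over *all* clients' latest gradients

print(w)
```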
Tightly Robust Optimization via Empirical Domain Reduction
| Title | Tightly Robust Optimization via Empirical Domain Reduction |
|---|---|
| Authors | Akihiro Yabe, Takanori Maehara |
| Abstract | Data-driven decision-making is performed by solving a parameterized optimization problem, and the optimal decision is given by an optimal solution for unknown true parameters. We often need a solution that satisfies true constraints even though these are unknown. Robust optimization is employed to obtain such a solution, where the uncertainty of the parameter is represented by an ellipsoid, and the scale of robustness is controlled by a coefficient. In this study, we propose an algorithm to determine the scale such that the solution has a good objective value and satisfies the true constraints with a given confidence probability. Under some regularity conditions, the scale obtained by our algorithm is asymptotically $O(1/\sqrt{n})$, whereas the scale obtained by a standard approach is $O(\sqrt{d/n})$. This means that our algorithm is less affected by the dimensionality of the parameters. |
| Tasks | Decision Making |
| Published | 2020-02-29 |
| URL | https://arxiv.org/abs/2003.00248v1 |
| PDF | https://arxiv.org/pdf/2003.00248v1.pdf |
| PWC | https://paperswithcode.com/paper/tightly-robust-optimization-via-empirical |
| Repo | |
| Framework | |
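For context on the scale coefficient the paper tunes, recall that an ellipsoidal uncertainty set around the estimated constraint parameters turns the constraint into a deterministic one with an extra norm term scaled by a coefficient kappa. The sketch below only evaluates that robust constraint for a few values of kappa on made-up data; the paper's algorithm for choosing the scale from empirical data, which achieves the asymptotic $O(1/\sqrt{n})$ rate, is not implemented.

```python
# Illustration of the scale coefficient in ellipsoidal robust optimization:
# "a^T x <= b for every a in an ellipsoid around a_hat" is equivalent to
# a_hat^T x + kappa * ||Sigma_half @ x|| <= b.
import numpy as np

rng = np.random.default_rng(6)
d = 4
a_hat = rng.normal(size=d)                 # estimated constraint parameters (placeholder)
Sigma_half = np.eye(d) * 0.2               # square root of the ellipsoid's shape matrix (placeholder)
b = 1.0
x = rng.normal(size=d)

def robust_feasible(x, kappa):
    return a_hat @ x + kappa * np.linalg.norm(Sigma_half @ x) <= b

# A larger kappa is more conservative: it shrinks the feasible set but raises the
# probability that the true, unknown constraint is satisfied.
for kappa in (0.0, 0.5, 2.0):
    print(kappa, robust_feasible(x, kappa))
```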
Improving Convolutional Neural Networks Via Conservative Field Regularisation and Integration
| Title | Improving Convolutional Neural Networks Via Conservative Field Regularisation and Integration |
|---|---|
| Authors | Dominique Beaini, Sofiane Achiche, Maxime Raison |
| Abstract | Current research in convolutional neural networks (CNN) focuses mainly on changing the architecture of the networks, optimizing the hyper-parameters and improving the gradient descent. However, most work uses only three standard families of operations inside the CNN: the convolution, the activation function, and the pooling. In this work, we propose a new family of operations based on the Green's function of the Laplacian, which allows the network to solve the Laplacian, to integrate any vector field and to regularize the field by forcing it to be conservative. Hence, the Green's function (GF) is the first operation that regularizes the 2D or 3D feature space by forcing it to be conservative and physically interpretable, instead of regularizing the norm of the weights. Our results show that such regularization allows the network to learn faster, to have smoother training curves and to better generalize, without any additional parameter. The current manuscript presents early results; more work is required to benchmark the proposed method. |
| Tasks | |
| Published | 2020-03-11 |
| URL | https://arxiv.org/abs/2003.05182v1 |
| PDF | https://arxiv.org/pdf/2003.05182v1.pdf |
| PWC | https://paperswithcode.com/paper/improving-convolutional-neural-networks-via |
| Repo | |
| Framework | |
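One standard way to "integrate a vector field and force it to be conservative", broadly in the spirit of the Green's-function operation the abstract describes, is to solve a Poisson equation for a scalar potential from the field's divergence and then take the potential's gradient. The FFT-based sketch below does exactly that on a periodic toy grid; it is an independent illustration, not the layer proposed in the paper.

```python
# Independent illustration: project an arbitrary 2D vector field onto a conservative one
# by solving laplacian(phi) = div(u, v) in the Fourier domain and taking grad(phi).
import numpy as np

n = 64
rng = np.random.default_rng(7)
u, v = rng.normal(size=(2, n, n))                        # an arbitrary, non-conservative field

div = np.gradient(u, axis=1) + np.gradient(v, axis=0)    # divergence of (u, v)

# Solve the Poisson equation with a periodic-boundary assumption.
kx = np.fft.fftfreq(n) * 2 * np.pi
ky = np.fft.fftfreq(n) * 2 * np.pi
KX, KY = np.meshgrid(kx, ky)
denom = -(KX**2 + KY**2)
denom[0, 0] = 1.0                                        # avoid dividing by zero at the mean mode
phi = np.real(np.fft.ifft2(np.fft.fft2(div) / denom))

# The gradient of phi is, by construction, a conservative field close to (u, v).
u_c = np.gradient(phi, axis=1)
v_c = np.gradient(phi, axis=0)
print(u_c.shape, v_c.shape)
```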
Curriculum Labeling: Self-paced Pseudo-Labeling for Semi-Supervised Learning
| Title | Curriculum Labeling: Self-paced Pseudo-Labeling for Semi-Supervised Learning |
|---|---|
| Authors | Paola Cascante-Bonilla, Fuwen Tan, Yanjun Qi, Vicente Ordonez |
| Abstract | Semi-supervised learning aims to take advantage of a large amount of unlabeled data to improve the accuracy of a model that only has access to a small number of labeled examples. We propose curriculum labeling, an approach that exploits pseudo-labeling for propagating labels to unlabeled samples in an iterative and self-paced fashion. This approach is surprisingly simple and effective and surpasses or is comparable with the best methods proposed in the recent literature across all the standard benchmarks for image classification. Notably, we obtain 94.91% accuracy on CIFAR-10 using only 4,000 labeled samples, and 88.56% top-5 accuracy on Imagenet-ILSVRC using 128,000 labeled samples. In contrast to prior works, our approach shows improvements even in a more realistic scenario that leverages out-of-distribution unlabeled data samples. |
| Tasks | Image Classification |
| Published | 2020-01-16 |
| URL | https://arxiv.org/abs/2001.06001v1 |
| PDF | https://arxiv.org/pdf/2001.06001v1.pdf |
| PWC | https://paperswithcode.com/paper/curriculum-labeling-self-paced-pseudo |
| Repo | |
| Framework | |
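The curriculum-labeling loop itself is simple enough to sketch: train on the current labeled pool, pseudo-label only the most confident unlabeled samples according to a percentile threshold that is relaxed over rounds, and retrain from scratch. The toy version below uses scikit-learn on synthetic data; the threshold schedule, model, and data are placeholders rather than the paper's CIFAR-10/ImageNet setup.

```python
# Toy sketch of iterative, self-paced pseudo-labeling with a relaxing confidence percentile.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
labeled = np.zeros(len(X), dtype=bool)
labeled[:100] = True                                   # small labeled seed set

for percentile in [90, 70, 50, 30, 0]:                 # placeholder curriculum schedule
    model = LogisticRegression(max_iter=1000)
    model.fit(X[labeled], y[labeled])                  # retrain from scratch each round

    probs = model.predict_proba(X[~labeled])
    confidence = probs.max(axis=1)
    threshold = np.percentile(confidence, percentile)  # self-paced: keep only the most confident

    unlabeled_idx = np.flatnonzero(~labeled)
    accepted = unlabeled_idx[confidence >= threshold]
    y[accepted] = model.predict(X[accepted])           # assign pseudo-labels
    labeled[accepted] = True

print(labeled.mean())  # fraction of samples now carrying (pseudo-)labels
```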
Entrywise convergence of iterative methods for eigenproblems
| Title | Entrywise convergence of iterative methods for eigenproblems |
|---|---|
| Authors | Vasileios Charisopoulos, Austin R. Benson, Anil Damle |
| Abstract | Several problems in machine learning, statistics, and other fields rely on computing eigenvectors. For large scale problems, the computation of these eigenvectors is typically performed via iterative schemes such as subspace iteration or Krylov methods. While there is classical and comprehensive analysis for subspace convergence guarantees with respect to the spectral norm, in many modern applications other notions of subspace distance are more appropriate. Recent theoretical work has focused on perturbations of subspaces measured in the $\ell_{2 \to \infty}$ norm, but does not consider the actual computation of eigenvectors. Here we address the convergence of subspace iteration when distances are measured in the $\ell_{2 \to \infty}$ norm and provide deterministic bounds. We complement our analysis with a practical stopping criterion and demonstrate its applicability via numerical experiments. Our results show that one can get comparable performance on downstream tasks while requiring fewer iterations, thereby saving substantial computational time. |
| Tasks | |
| Published | 2020-02-19 |
| URL | https://arxiv.org/abs/2002.08491v1 |
| PDF | https://arxiv.org/pdf/2002.08491v1.pdf |
| PWC | https://paperswithcode.com/paper/entrywise-convergence-of-iterative-methods |
| Repo | |
| Framework | |
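To make the setting concrete, the sketch below runs plain subspace iteration and stops when the projector onto the iterate subspace stops changing in a 2-to-infinity sense, i.e. when the largest row-wise norm of the projector difference is small. This is only meant to convey the flavor of an entrywise stopping rule; the paper's deterministic bounds and its actual stopping criterion are more refined than this.

```python
# Subspace iteration with a simple 2->infinity style stopping rule (illustrative only).
import numpy as np

rng = np.random.default_rng(8)
n, k = 200, 5
M = rng.normal(size=(n, n))
A = M @ M.T                                      # symmetric PSD test matrix

Q, _ = np.linalg.qr(rng.normal(size=(n, k)))     # random orthonormal starting block
P_old = Q @ Q.T

for it in range(200):
    Q, _ = np.linalg.qr(A @ Q)                   # one step of subspace iteration
    P_new = Q @ Q.T
    # 2->infinity measure of the projector change: the largest row-wise Euclidean norm.
    row_change = np.linalg.norm(P_new - P_old, axis=1).max()
    P_old = P_new
    if row_change < 1e-8:
        break

print(it, row_change)
```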
Hallucinative Topological Memory for Zero-Shot Visual Planning
| Title | Hallucinative Topological Memory for Zero-Shot Visual Planning |
|---|---|
| Authors | Kara Liu, Thanard Kurutach, Christine Tung, Pieter Abbeel, Aviv Tamar |
| Abstract | In visual planning (VP), an agent learns to plan goal-directed behavior from observations of a dynamical system obtained offline, e.g., images obtained from self-supervised robot interaction. Most previous works on VP approached the problem by planning in a learned latent space, resulting in low-quality visual plans and difficult training algorithms. Here, instead, we propose a simple VP method that plans directly in image space and displays competitive performance. We build on the semi-parametric topological memory (SPTM) method: image samples are treated as nodes in a graph, the graph connectivity is learned from image sequence data, and planning can be performed using conventional graph search methods. We propose two modifications to SPTM. First, we train an energy-based graph connectivity function using contrastive predictive coding that admits stable training. Second, to allow zero-shot planning in new domains, we learn a conditional VAE model that generates images given a context of the domain, and use these hallucinated samples for building the connectivity graph and planning. We show that this simple approach significantly outperforms the state-of-the-art VP methods, in terms of both plan interpretability and success rate when using the plan to guide a trajectory-following controller. Interestingly, our method can pick up non-trivial visual properties of objects, such as their geometry, and account for them in the plans. |
| Tasks | |
| Published | 2020-02-27 |
| URL | https://arxiv.org/abs/2002.12336v1 |
| PDF | https://arxiv.org/pdf/2002.12336v1.pdf |
| PWC | https://paperswithcode.com/paper/hallucinative-topological-memory-for-zero-1 |
| Repo | |
| Framework | |
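The SPTM backbone the paper builds on reduces planning to graph search: score connectivity between image samples, keep edges above a threshold, and search for a path from the current observation to the goal. The sketch below replaces the learned energy-based connectivity function and the hallucinated CVAE samples with a placeholder distance score on toy embeddings, so it only illustrates the graph-construction and search steps.

```python
# SPTM-style planning sketch: threshold a connectivity score into a graph, then BFS.
import numpy as np
from collections import deque

rng = np.random.default_rng(9)
n = 30
embeddings = np.cumsum(rng.normal(size=(n, 8)), axis=0)   # stand-ins for image samples

def connectivity(i, j):
    # Placeholder score: nearby embeddings are considered reachable in a few steps.
    return -np.linalg.norm(embeddings[i] - embeddings[j])

threshold = -3.0
adj = {i: [j for j in range(n) if j != i and connectivity(i, j) > threshold] for i in range(n)}

def bfs_path(start, goal):
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None   # goal not reachable with this threshold

print(bfs_path(0, n - 1))   # sequence of sample indices forming the visual plan
```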
How to train your neural ODE
| Title | How to train your neural ODE |
|---|---|
| Authors | Chris Finlay, Jörn-Henrik Jacobsen, Levon Nurbekyan, Adam M Oberman |
| Abstract | Training neural ODEs on large datasets has not been tractable due to the necessity of allowing the adaptive numerical ODE solver to refine its step size to very small values. In practice this leads to dynamics equivalent to many hundreds or even thousands of layers. In this paper, we overcome this apparent difficulty by introducing a theoretically-grounded combination of both optimal transport and stability regularizations which encourage neural ODEs to prefer simpler dynamics out of all the dynamics that solve a problem well. Simpler dynamics lead to faster convergence and to fewer discretizations of the solver, considerably decreasing wall-clock time without loss in performance. Our approach allows us to train neural ODE based generative models to the same performance as the unregularized dynamics in just over a day on one GPU, whereas unregularized dynamics can take up to 4-6 days of training time on multiple GPUs. This brings neural ODEs significantly closer to practical relevance in large-scale applications. |
| Tasks | |
| Published | 2020-02-07 |
| URL | https://arxiv.org/abs/2002.02798v1 |
| PDF | https://arxiv.org/pdf/2002.02798v1.pdf |
| PWC | https://paperswithcode.com/paper/how-to-train-your-neural-ode |
| Repo | |
| Framework | |
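A minimal way to see the regularization idea is to penalize the "kinetic energy" of the learned dynamics, the integral of ||f(x, t)||^2 along the trajectory, so that the ODE prefers simple, nearly straight flows the adaptive solver can resolve in few steps. The sketch below computes that term with a fixed-step Euler rollout of a toy dynamics function; the paper combines an optimal-transport term with a stability (Jacobian-norm) term inside a real neural ODE training loop, which is not reproduced here.

```python
# Toy illustration: accumulate a kinetic-energy regularizer along an Euler rollout.
import numpy as np

rng = np.random.default_rng(10)
d = 4
W = rng.normal(size=(d, d)) * 0.1

def f(x, t):
    # Placeholder dynamics; in practice this is a neural network f_theta(x, t).
    return np.tanh(W @ x)

def rollout_with_kinetic_energy(x0, n_steps=50, t1=1.0):
    dt = t1 / n_steps
    x, energy = x0, 0.0
    for step in range(n_steps):
        v = f(x, step * dt)
        energy += dt * float(v @ v)       # discretized integral of ||f(x, t)||^2 dt
        x = x + dt * v                    # Euler step
    return x, energy

x1, kinetic = rollout_with_kinetic_energy(rng.normal(size=d))
lam = 0.01
# total_loss = task_loss(x1) + lam * kinetic   # the regularizer is added to the objective
print(kinetic)
```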
Multi-target regression via output space quantization
| Title | Multi-target regression via output space quantization |
|---|---|
| Authors | Eleftherios Spyromitros-Xioufis, Konstantinos Sechidis, Ioannis Vlahavas |
| Abstract | Multi-target regression is concerned with the prediction of multiple continuous target variables using a shared set of predictors. Two key challenges in multi-target regression are: (a) modelling target dependencies and (b) scalability to large output spaces. In this paper, a new multi-target regression method is proposed that tries to jointly address these challenges via a novel problem transformation approach. The proposed method, called MRQ, is based on the idea of quantizing the output space in order to transform the multiple continuous targets into one or more discrete ones. Learning on the transformed output space naturally enables modeling of target dependencies while the quantization strategy can be flexibly parameterized to control the trade-off between prediction accuracy and computational efficiency. Experiments on a large collection of benchmark datasets show that MRQ is both highly scalable and also competitive with the state-of-the-art in terms of accuracy. In particular, an ensemble version of MRQ obtains the best overall accuracy, while being an order of magnitude faster than the runner up method. |
| Tasks | Quantization |
| Published | 2020-03-22 |
| URL | https://arxiv.org/abs/2003.09896v1 |
| PDF | https://arxiv.org/pdf/2003.09896v1.pdf |
| PWC | https://paperswithcode.com/paper/multi-target-regression-via-output-space |
| Repo | |
| Framework | |
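The basic problem transformation behind MRQ can be sketched directly: quantize the vector of targets with k-means, train a single classifier to predict the cluster index, and output the cluster centroid as the joint multi-target prediction. The scikit-learn toy below shows that pipeline on synthetic data; MRQ's specific quantization strategies and its ensemble variant are not reproduced.

```python
# Toy sketch of output-space quantization for multi-target regression.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(11)
n, p, m = 1000, 10, 4                           # samples, input features, targets
X = rng.normal(size=(n, p))
Y = X[:, :m] + 0.1 * rng.normal(size=(n, m))    # toy correlated multi-target outputs

k = 32
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Y)   # quantize the output space
labels = kmeans.labels_

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
Y_pred = kmeans.cluster_centers_[clf.predict(X)]                  # centroid = joint prediction

print(np.mean((Y - Y_pred) ** 2))               # combined quantization + classification error
```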
Vector symbolic architectures for context-free grammars
| Title | Vector symbolic architectures for context-free grammars |
|---|---|
| Authors | Peter beim Graben, Markus Huber, Werner Meyer, Ronald Römer, Constanze Tschöpe, Matthias Wolff |
| Abstract | Background / introduction. Vector symbolic architectures (VSA) are a viable approach for the hyperdimensional representation of symbolic data, such as documents, syntactic structures, or semantic frames. Methods. We present a rigorous mathematical framework for the representation of phrase structure trees and parse-trees of context-free grammars (CFG) in Fock space, i.e. infinite-dimensional Hilbert space as being used in quantum field theory. We define a novel normal form for CFG by means of term algebras. Using a recently developed software toolbox, called FockBox, we construct Fock space representations for the trees built up by a CFG left-corner (LC) parser. Results. We prove a universal representation theorem for CFG term algebras in Fock space and illustrate our findings through a low-dimensional principal component projection of the LC parser states. Conclusions. Our approach could leverage the development of VSA for explainable artificial intelligence (XAI) by means of hyperdimensional deep neural computation. It could be of significance for the improvement of cognitive user interfaces and other applications of VSA in machine learning. |
| Tasks | |
| Published | 2020-03-11 |
| URL | https://arxiv.org/abs/2003.05171v1 |
| PDF | https://arxiv.org/pdf/2003.05171v1.pdf |
| PWC | https://paperswithcode.com/paper/vector-symbolic-architectures-for-context |
| Repo | |
| Framework | |
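As generic background on vector symbolic architectures, and not the paper's Fock-space construction or the FockBox toolbox, the sketch below binds fillers to syntactic roles with tensor products and superposes them to represent a tiny production S -> NP VP, then recovers a filler by probing with its role vector. All symbol vectors are random placeholders.

```python
# Generic VSA illustration: tensor-product role-filler binding for S -> NP VP.
import numpy as np

rng = np.random.default_rng(12)
d = 64
symbols = {name: rng.normal(size=d) / np.sqrt(d) for name in ["NP", "VP", "left", "right"]}

# Bind each filler to its role with an outer product and superpose the bindings.
S = np.outer(symbols["left"], symbols["NP"]) + np.outer(symbols["right"], symbols["VP"])

# Approximate unbinding: probe with a role vector to recover the filler in that role.
recovered = symbols["left"] @ S
similarity = {k: float(recovered @ v) for k, v in symbols.items() if k in ("NP", "VP")}
print(similarity)   # the NP filler should score highest for the "left" role
```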