Paper Group ANR 56
WRPN & Apprentice: Methods for Training and Inference using Low-Precision Numerics
Title | WRPN & Apprentice: Methods for Training and Inference using Low-Precision Numerics |
Authors | Asit Mishra, Debbie Marr |
Abstract | Today’s high performance deep learning architectures involve large models with numerous parameters. Low precision numerics has emerged as a popular technique to reduce both the compute and memory requirements of these large models. However, lowering precision often leads to accuracy degradation. We describe three schemes whereby one can both train and do efficient inference using low precision numerics without hurting accuracy. Finally, we describe an efficient hardware accelerator that can take advantage of the proposed low precision numerics. |
Tasks | |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00227v1 |
http://arxiv.org/pdf/1803.00227v1.pdf | |
PWC | https://paperswithcode.com/paper/wrpn-apprentice-methods-for-training-and |
Repo | |
Framework | |
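The abstract above does not spell out its three schemes, so the following is only a minimal sketch of the general idea of low-precision numerics: symmetric uniform quantization of weights to a given bit width. The function name `quantize_uniform`, the assumed value range of [-1, 1], and the sample weights are illustrative assumptions, not the WRPN or Apprentice method.

```python
def quantize_uniform(values, bits):
    """Symmetric uniform quantization (illustrative, not the paper's scheme).

    Values are clipped to [-1, 1] and snapped to the nearest of
    2 * (2 ** (bits - 1) - 1) + 1 evenly spaced levels.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2 (sign plus magnitude)")
    levels = 2 ** (bits - 1) - 1  # number of positive magnitude steps
    quantized = []
    for v in values:
        clipped = max(-1.0, min(1.0, v))  # assumed range; out-of-range values saturate
        quantized.append(round(clipped * levels) / levels)
    return quantized


# Hypothetical weights; 1.7 lies outside the assumed range and saturates at 1.0.
weights = [0.9, -0.26, 0.03, 1.7]
q4 = quantize_uniform(weights, bits=4)
```

With 4 bits there are 7 positive steps, so every in-range value moves by at most 1/14. That rounding error is the source of the accuracy degradation the abstract refers to.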
Fenchel Lifted Networks: A Lagrange Relaxation of Neural Network Training
Title | Fenchel Lifted Networks: A Lagrange Relaxation of Neural Network Training |
Authors | Fangda Gu, Armin Askari, Laurent El Ghaoui |
Abstract | Despite the recent successes of deep neural networks, the corresponding training problem remains highly non-convex and difficult to optimize. Classes of models have been proposed that introduce greater structure to the objective function at the cost of lifting the dimension of the problem. However, these lifted methods sometimes perform poorly compared to traditional neural networks. In this paper, we introduce a new class of lifted models, Fenchel lifted networks, that enjoy the same benefits as previous lifted models, without suffering a degradation in performance over classical networks. Our model represents activation functions as equivalent biconvex constraints and uses Lagrange Multipliers to arrive at a rigorous lower bound of the traditional neural network training problem. This model is efficiently trained using block-coordinate descent and is parallelizable across data points and/or layers. We compare our model against standard fully connected and convolutional networks and show that we are able to match or beat their performance. |
Tasks | |
Published | 2018-11-20 |
URL | https://arxiv.org/abs/1811.08039v3 |
https://arxiv.org/pdf/1811.08039v3.pdf | |
PWC | https://paperswithcode.com/paper/fenchel-lifted-networks-a-lagrange-relaxation |
Repo | |
Framework | |
A New Multi-vehicle Trajectory Generator to Simulate Vehicle-to-Vehicle Encounters
Title | A New Multi-vehicle Trajectory Generator to Simulate Vehicle-to-Vehicle Encounters |
Authors | Wenhao Ding, Wenshuo Wang, Ding Zhao |
Abstract | Generating multi-vehicle trajectories from existing limited data can provide rich resources for autonomous vehicle development and testing. This paper introduces a multi-vehicle trajectory generator (MTG) that can encode multi-vehicle interaction scenarios (called driving encounters) into an interpretable representation from which new driving encounter scenarios are generated by sampling. The MTG consists of a bi-directional encoder and a multi-branch decoder. A new disentanglement metric is then developed for model analyses and comparisons in terms of model robustness and the independence of the latent codes. Comparison of our proposed MTG with $\beta$-VAE and InfoGAN demonstrates that the MTG has stronger capability to purposely generate rational vehicle-to-vehicle encounters through operating the disentangled latent codes. Thus the MTG could provide more data for engineers and researchers to develop testing and evaluation scenarios for autonomous vehicles. |
Tasks | Autonomous Vehicles |
Published | 2018-09-15 |
URL | http://arxiv.org/abs/1809.05680v5 |
http://arxiv.org/pdf/1809.05680v5.pdf | |
PWC | https://paperswithcode.com/paper/a-new-multi-vehicle-trajectory-generator-to |
Repo | |
Framework | |
Unsupervised Domain Adaptation using Generative Models and Self-ensembling
Title | Unsupervised Domain Adaptation using Generative Models and Self-ensembling |
Authors | Eman T. Hassan, Xin Chen, David Crandall |
Abstract | Transferring knowledge across different datasets is an important approach to successfully train deep models with a small-scale target dataset or when few labeled instances are available. In this paper, we aim at developing a model that can generalize across multiple domain shifts, so that this model can adapt from a single source to multiple targets. This can be achieved by randomizing the generation of the data of various styles to mitigate the domain mismatch. First, we present a new adaptation to the CycleGAN model to produce stochastic style transfer between two image batches of different domains. Second, we enhance the classifier performance by using a self-ensembling technique with a teacher and student model to train on both original and generated data. Finally, we present experimental results on three datasets: Office-31, Office-Home, and Visual Domain Adaptation. The results suggest that self-ensembling is better than simple data augmentation with the newly generated data, and that a single model trained this way can have the best performance across all the different transfer tasks. |
Tasks | Data Augmentation, Domain Adaptation, Style Transfer, Unsupervised Domain Adaptation |
Published | 2018-12-02 |
URL | http://arxiv.org/abs/1812.00479v1 |
http://arxiv.org/pdf/1812.00479v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-domain-adaptation-using-1 |
Repo | |
Framework | |
Efficient Facial Representations for Age, Gender and Identity Recognition in Organizing Photo Albums using Multi-output CNN
Title | Efficient Facial Representations for Age, Gender and Identity Recognition in Organizing Photo Albums using Multi-output CNN |
Authors | Andrey V. Savchenko |
Abstract | This paper focuses on the automatic extraction of persons and their attributes (gender, year of birth) from an album of photos and videos. We propose a two-stage approach in which, first, a convolutional neural network simultaneously predicts age and gender from all photos and additionally extracts facial representations suitable for face identification. We modified MobileNet, which is pre-trained to perform face recognition, so that it additionally recognizes age and gender. In the second stage of our approach, the extracted faces are grouped using hierarchical agglomerative clustering techniques. The birth year and gender of the person in each cluster are estimated by aggregating the predictions for individual photos. We experimentally demonstrate that our facial clustering quality is competitive with state-of-the-art neural networks, though our implementation is computationally much cheaper. Moreover, our approach achieves more accurate video-based age/gender recognition than the publicly available models. |
Tasks | Face Identification, Face Recognition |
Published | 2018-07-20 |
URL | https://arxiv.org/abs/1807.07718v3 |
https://arxiv.org/pdf/1807.07718v3.pdf | |
PWC | https://paperswithcode.com/paper/efficient-facial-representations-for-age |
Repo | |
Framework | |
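The second stage described in the abstract above aggregates per-photo predictions within each face cluster. The abstract does not say which aggregation is used, so the sketch below assumes the median for birth year and a majority vote for gender. The function name `aggregate_cluster` and the tuple layout `(photo_year, predicted_age, predicted_gender)` are hypothetical.

```python
from collections import Counter
from statistics import median


def aggregate_cluster(predictions):
    """Estimate one birth year and gender for a cluster of photos.

    predictions: list of (photo_year, predicted_age, predicted_gender) tuples,
    one per photo assigned to the cluster (hypothetical layout).
    """
    # Each photo votes for a birth year: the year it was taken minus the predicted age.
    birth_years = [photo_year - age for photo_year, age, _ in predictions]
    gender_votes = Counter(gender for _, _, gender in predictions)
    # Median is an assumed choice that tolerates a few bad age predictions.
    return round(median(birth_years)), gender_votes.most_common(1)[0][0]


# Hypothetical per-photo predictions for one cluster.
cluster = [(2015, 30.2, "f"), (2018, 33.5, "f"), (2016, 29.0, "m")]
birth_year, gender = aggregate_cluster(cluster)
```

Aggregating over the cluster means a single misclassified photo, such as the third one here, does not change the final estimate.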
Leveraging Product as an Activation Function in Deep Networks
Title | Leveraging Product as an Activation Function in Deep Networks |
Authors | Luke B. Godfrey, Michael S. Gashler |
Abstract | Product unit neural networks (PUNNs) are powerful representational models with a strong theoretical basis, but have proven to be difficult to train with gradient-based optimizers. We present windowed product unit neural networks (WPUNNs), a simple method of leveraging product as a nonlinearity in a neural network. Windowing the product tames the complex gradient surface and enables WPUNNs to learn effectively, solving the problems faced by PUNNs. WPUNNs use product layers between traditional sum layers, capturing the representational power of product units and using the product itself as a nonlinearity. We find that this method works as well as traditional nonlinearities like ReLU on the MNIST dataset. We demonstrate that WPUNNs can also generalize gated units in recurrent neural networks, yielding results comparable to LSTM networks. |
Tasks | |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08578v1 |
http://arxiv.org/pdf/1810.08578v1.pdf | |
PWC | https://paperswithcode.com/paper/leveraging-product-as-an-activation-function |
Repo | |
Framework | |
Decision Tree Design for Classification in Crowdsourcing Systems
Title | Decision Tree Design for Classification in Crowdsourcing Systems |
Authors | Baocheng Geng, Qunwei Li, Pramod K. Varshney |
Abstract | In this paper, we present a novel sequential paradigm for classification in crowdsourcing systems. Considering that workers are unreliable and they perform the tests with errors, we study the construction of decision trees so as to minimize the probability of mis-classification. By exploiting the connection between probability of mis-classification and entropy at each level of the decision tree, we propose two algorithms for decision tree design. Furthermore, the worker assignment problem is studied when workers can be assigned to different tests of the decision tree to provide a trade-off between classification cost and resulting error performance. Numerical results are presented for illustration. |
Tasks | |
Published | 2018-05-01 |
URL | http://arxiv.org/abs/1805.00559v1 |
http://arxiv.org/pdf/1805.00559v1.pdf | |
PWC | https://paperswithcode.com/paper/decision-tree-design-for-classification-in |
Repo | |
Framework | |
Deep Learning-Aided Projected Gradient Detector for Massive Overloaded MIMO Channels
Title | Deep Learning-Aided Projected Gradient Detector for Massive Overloaded MIMO Channels |
Authors | Satoshi Takabe, Masayuki Imanishi, Tadashi Wadayama, Kazunori Hayashi |
Abstract | The paper presents a deep learning-aided iterative detection algorithm for massive overloaded MIMO systems. Since the proposed algorithm is based on the projected gradient descent method with trainable parameters, it is named the trainable projected gradient detector (TPG-detector). The trainable internal parameters can be optimized with standard deep learning techniques such as back propagation and stochastic gradient descent algorithms. This approach, referred to as data-driven tuning, brings notable advantages to the proposed scheme, such as fast convergence. The numerical experiments show that the TPG-detector achieves detection performance comparable to that of known algorithms for massive overloaded MIMO channels at lower computational cost. |
Tasks | |
Published | 2018-06-28 |
URL | http://arxiv.org/abs/1806.10827v2 |
http://arxiv.org/pdf/1806.10827v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-aided-projected-gradient |
Repo | |
Framework | |
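The abstract above builds on projected gradient descent with tunable per-iteration parameters. As a rough sketch of that underlying iteration only (not the TPG-detector itself), the code below alternates a gradient step on the squared residual with a soft tanh projection toward the symbols {-1, +1}. The step sizes are fixed by hand here rather than learned, and the function name, the `sharpness` parameter, and the toy channel are assumptions made for illustration.

```python
import math


def projected_gradient_detect(H, y, step_sizes, sharpness=2.0):
    """Detect a {-1, +1} vector x from y = H x with projected gradient steps.

    H is a list of rows; step_sizes holds one (hand-picked) step per iteration.
    In a trainable variant these steps would be learned from data.
    """
    n = len(H[0])
    s = [0.0] * n  # soft estimate of the transmitted symbols
    for gamma in step_sizes:
        # Residual y - H s for the current estimate.
        residual = [y_i - sum(h * s_j for h, s_j in zip(row, s))
                    for row, y_i in zip(H, y)]
        # Gradient step direction H^T (y - H s).
        direction = [sum(H[i][j] * residual[i] for i in range(len(H)))
                     for j in range(n)]
        r = [s_j + gamma * d for s_j, d in zip(s, direction)]
        # Soft projection onto [-1, 1], pulling entries toward +/-1.
        s = [math.tanh(sharpness * r_j) for r_j in r]
    return [1 if v >= 0 else -1 for v in s]


# Toy noiseless channel (assumed values): x = [1, -1] gives y = H x.
H = [[1.0, 0.5], [0.2, 1.0]]
y = [0.5, -0.8]
detected = projected_gradient_detect(H, y, step_sizes=[0.5] * 5)
```

The toy channel is square for brevity. In the overloaded setting of the abstract, H would have fewer rows than columns, and the same loop applies without modification.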
Two Can Play That Game: An Adversarial Evaluation of a Cyber-alert Inspection System
Title | Two Can Play That Game: An Adversarial Evaluation of a Cyber-alert Inspection System |
Authors | Ankit Shah, Arunesh Sinha, Rajesh Ganesan, Sushil Jajodia, Hasan Cam |
Abstract | Cyber-security is an important societal concern. Cyber-attacks have increased in number as well as in the extent of damage caused by each attack. Large organizations operate a Cyber Security Operation Center (CSOC), which forms the first line of cyber-defense. The inspection of cyber-alerts is a critical part of CSOC operations. A recent work, in collaboration with Army Research Lab, USA, proposed a reinforcement learning (RL) based approach to prevent the cyber-alert queue length from growing large and overwhelming the defender. Given the potential deployment of this approach to CSOCs run by US defense agencies, we perform a red team (adversarial) evaluation of this approach. Further, with the recent attacks on learning systems, it is even more important to test the limits of this RL approach. Towards that end, we learn an adversarial alert generation policy that is a best response to the defender inspection policy. Surprisingly, we find the defender policy to be quite robust to the best response of the attacker. In order to explain this observation, we extend the earlier RL model to a game model and show that there exist defender policies that can be robust against any adversarial policy. We also derive a competitive baseline from the game theory model and compare it to the RL approach. However, we go further to exploit assumptions made in the MDP in the RL model and discover an attacker policy that overwhelms the defender. We use a double oracle approach to retrain the defender with episodes from this discovered attacker policy. This made the defender robust to the discovered attacker policy, and no further harmful attacker policies were discovered. Overall, the adversarial RL and double oracle approach in RL are general techniques that are applicable to other RL usage in adversarial environments. |
Tasks | |
Published | 2018-10-13 |
URL | http://arxiv.org/abs/1810.05921v1 |
http://arxiv.org/pdf/1810.05921v1.pdf | |
PWC | https://paperswithcode.com/paper/two-can-play-that-game-an-adversarial |
Repo | |
Framework | |
Real-World Repetition Estimation by Div, Grad and Curl
Title | Real-World Repetition Estimation by Div, Grad and Curl |
Authors | Tom F. H. Runia, Cees G. M. Snoek, Arnold W. M. Smeulders |
Abstract | We consider the problem of estimating repetition in video, such as performing push-ups, cutting a melon or playing violin. Existing work shows good results under the assumption of static and stationary periodicity. As realistic video is rarely perfectly static and stationary, the often preferred Fourier-based measurement is inapt. Instead, we adopt the wavelet transform to better handle non-static and non-stationary video dynamics. From the flow field and its differentials, we derive three fundamental motion types and three motion continuities of intrinsic periodicity in 3D. On top of this, the 2D perception of 3D periodicity considers two extreme viewpoints. What follows are 18 fundamental cases of recurrent perception in 2D. In practice, to deal with the variety of repetitive appearance, our theory implies measuring time-varying flow and its differentials (gradient, divergence and curl) over segmented foreground motion. For experiments, we introduce the new QUVA Repetition dataset, reflecting reality by including non-static and non-stationary videos. On the task of counting repetitions in video, we obtain favorable results compared to a deep learning alternative. |
Tasks | |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09971v1 |
http://arxiv.org/pdf/1802.09971v1.pdf | |
PWC | https://paperswithcode.com/paper/real-world-repetition-estimation-by-div-grad |
Repo | |
Framework | |
Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective
Title | Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective |
Authors | Jing Zhang, Tong Zhang, Yuchao Dai, Mehrtash Harandi, Richard Hartley |
Abstract | The success of current deep saliency detection methods heavily depends on the availability of large-scale supervision in the form of per-pixel labeling. Such supervision, while labor-intensive and not always possible, tends to hinder the generalization ability of the learned models. By contrast, traditional unsupervised saliency detection methods based on handcrafted features, even though they have been surpassed by the deep supervised methods, are generally dataset-independent and could be applied in the wild. This raises a natural question: “Is it possible to learn saliency maps without using labeled data while improving the generalization ability?” To this end, we present a novel perspective on unsupervised saliency detection through learning from multiple noisy labelings generated by “weak” and “noisy” unsupervised handcrafted saliency methods. Our end-to-end deep learning framework for unsupervised saliency detection consists of a latent saliency prediction module and a noise modeling module that work collaboratively and are optimized jointly. Explicit noise modeling enables us to deal with noisy saliency maps in a probabilistic way. Extensive experimental results on various benchmarking datasets show that our model not only outperforms all the unsupervised saliency methods by a large margin but also achieves performance comparable to the recent state-of-the-art supervised deep saliency methods. |
Tasks | Saliency Detection, Saliency Prediction |
Published | 2018-03-29 |
URL | http://arxiv.org/abs/1803.10910v1 |
http://arxiv.org/pdf/1803.10910v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-unsupervised-saliency-detection-a |
Repo | |
Framework | |
Improved Generalization Bounds for Robust Learning
Title | Improved Generalization Bounds for Robust Learning |
Authors | Idan Attias, Aryeh Kontorovich, Yishay Mansour |
Abstract | We consider a model of robust learning in an adversarial environment. The learner gets uncorrupted training data with access to possible corruptions that may be affected by the adversary during testing. The learner’s goal is to build a robust classifier that would be tested on future adversarial examples. We use a zero-sum game between the learner and the adversary as our game theoretic framework. The adversary is limited to $k$ possible corruptions for each input. Our model is closely related to the adversarial examples model of Schmidt et al. (2018); Madry et al. (2017). Our main results consist of generalization bounds for the binary and multi-class classification, as well as the real-valued case (regression). For the binary classification setting, we tighten the generalization bound of Feige, Mansour, and Schapire (2015) and are also able to handle an infinite hypothesis class $H$. The sample complexity is improved from $O(\frac{1}{\epsilon^4}\log(\frac{\lvert H\rvert}{\delta}))$ to $O(\frac{1}{\epsilon^2}(k\log(k)\,\mathrm{VC}(H)+\log\frac{1}{\delta}))$. Additionally, we extend the algorithm and generalization bound from the binary to the multiclass and real-valued cases. Along the way, we obtain results on fat-shattering dimension and Rademacher complexity of $k$-fold maxima over function classes; these may be of independent interest. For binary classification, the algorithm of Feige et al. (2015) uses a regret minimization algorithm and an ERM oracle as a blackbox; we adapt it for the multi-class and regression settings. The algorithm provides us with near-optimal policies for the players on a given training sample. |
Tasks | |
Published | 2018-10-04 |
URL | http://arxiv.org/abs/1810.02180v2 |
http://arxiv.org/pdf/1810.02180v2.pdf | |
PWC | https://paperswithcode.com/paper/improved-generalization-bounds-for-robust |
Repo | |
Framework | |
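The setting in the abstract above, where the adversary picks among $k$ possible corruptions of each input, leads to a loss that takes the worst case over those corruptions (a $k$-fold maximum). The sketch below computes that empirical robust 0-1 loss for a toy problem. The threshold classifier, the corruption set, and all names are assumptions chosen for illustration, not taken from the paper.

```python
def robust_zero_one_loss(classifier, examples, corruptions):
    """Fraction of examples misclassified under their worst-case corruption.

    corruptions(x) returns the k inputs the adversary may substitute for x.
    """
    errors = 0
    for x, label in examples:
        # The adversary wins if any allowed corruption is misclassified.
        if any(classifier(z) != label for z in corruptions(x)):
            errors += 1
    return errors / len(examples)


def threshold_classifier(x):
    """Toy classifier: label 1 for inputs at or above 1, else 0."""
    return 1 if x >= 1 else 0


def shift_corruptions(x):
    """Toy corruption set with k = 3: shift the input by -1, 0, or +1."""
    return [x - 1, x, x + 1]


examples = [(3, 1), (0, 0), (1, 1), (-2, 0)]
clean_loss = robust_zero_one_loss(threshold_classifier, examples, lambda x: [x])
robust_loss = robust_zero_one_loss(threshold_classifier, examples, shift_corruptions)
```

On this toy data the classifier makes no mistakes on clean inputs, yet the two examples next to the threshold can each be pushed across it, so the robust loss is higher than the clean loss.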
Learning Policy Representations in Multiagent Systems
Title | Learning Policy Representations in Multiagent Systems |
Authors | Aditya Grover, Maruan Al-Shedivat, Jayesh K. Gupta, Yura Burda, Harrison Edwards |
Abstract | Modeling agent behavior is central to understanding the emergence of complex phenomena in multiagent systems. Prior work in agent modeling has largely been task-specific and driven by hand-engineering domain-specific prior knowledge. We propose a general learning framework for modeling agent behavior in any multiagent system using only a handful of interaction data. Our framework casts agent modeling as a representation learning problem. Consequently, we construct a novel objective inspired by imitation learning and agent identification and design an algorithm for unsupervised learning of representations of agent policies. We demonstrate empirically the utility of the proposed framework in (i) a challenging high-dimensional competitive environment for continuous control and (ii) a cooperative environment for communication, on supervised predictive tasks, unsupervised clustering, and policy optimization using deep reinforcement learning. |
Tasks | Continuous Control, Imitation Learning, Representation Learning |
Published | 2018-06-17 |
URL | http://arxiv.org/abs/1806.06464v2 |
http://arxiv.org/pdf/1806.06464v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-policy-representations-in-multiagent |
Repo | |
Framework | |
Automatic Rank Selection for High-Speed Convolutional Neural Network
Title | Automatic Rank Selection for High-Speed Convolutional Neural Network |
Authors | Hyeji Kim, Chong-Min Kyung |
Abstract | Low-rank decomposition plays a central role in accelerating convolutional neural networks (CNNs), and the rank of the decomposed kernel tensor is a key parameter that determines the complexity and accuracy of a neural network. In this paper, we define rank selection as a combinatorial optimization problem and propose a methodology to minimize network complexity while maintaining the desired accuracy. Combinatorial optimization is not feasible due to search space limitations. To restrict the search space and obtain the optimal rank, we define the space constraint parameters with a boundary condition. We also propose a linearly-approximated accuracy function to predict the fine-tuned accuracy of the optimized CNN model during the cost reduction. Experimental results on AlexNet and VGG-16 show that the proposed rank selection algorithm satisfies the accuracy constraint. Our method combined with truncated-SVD outperforms state-of-the-art methods in terms of inference and training time at almost the same accuracy. |
Tasks | Combinatorial Optimization |
Published | 2018-06-28 |
URL | http://arxiv.org/abs/1806.10821v2 |
http://arxiv.org/pdf/1806.10821v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-rank-selection-for-high-speed |
Repo | |
Framework | |
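To make the role of the rank in the abstract above concrete, the sketch below uses a common energy-threshold heuristic: keep the smallest rank whose singular values retain a chosen fraction of the squared spectrum, then compare parameter counts before and after factorizing a layer. This heuristic stands in for the paper's combinatorial rank selection, which the abstract does not detail. The function names, the 0.95 threshold, and the singular values are assumptions.

```python
def select_rank(singular_values, energy=0.95):
    """Smallest rank whose leading singular values keep `energy` of the
    total squared spectrum (a heuristic, not the paper's algorithm).

    singular_values must be sorted in descending order.
    """
    total = sum(s * s for s in singular_values)
    kept = 0.0
    for rank, s in enumerate(singular_values, start=1):
        kept += s * s
        if kept >= energy * total:
            return rank
    return len(singular_values)


def factorized_params(m, n, rank):
    """Parameters in a rank-`rank` factorization of an m x n weight matrix:
    an m x rank factor plus a rank x n factor."""
    return rank * (m + n)


# Hypothetical spectrum of a layer's weight matrix.
spectrum = [10.0, 5.0, 1.0, 0.5]
rank = select_rank(spectrum)

# A 512 x 512 layer truncated to rank 64 versus the dense layer.
dense = 512 * 512
compressed = factorized_params(512, 512, 64)
```

A lower rank shrinks the layer (here to a quarter of the dense parameter count) at the price of a coarser approximation, which is the complexity and accuracy trade-off the abstract describes.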
Inferring a Third Spatial Dimension from 2D Histological Images
Title | Inferring a Third Spatial Dimension from 2D Histological Images |
Authors | Maxime W. Lafarge, Josien P. W. Pluim, Koen A. J. Eppenhof, Pim Moeskops, Mitko Veta |
Abstract | Histological images are obtained by transmitting light through a tissue specimen that has been stained in order to produce contrast. This process results in 2D images of a specimen that has a three-dimensional structure. In this paper, we propose a method to infer how the stains are distributed in the direction perpendicular to the surface of the slide for a given 2D image in order to obtain a 3D representation of the tissue. This inference is achieved by decomposition of the staining concentration maps under constraints that ensure realistic decomposition and reconstruction of the original 2D images. Our study shows that it is possible to generate realistic 3D images, making this method a potential tool for data augmentation when training deep learning models. |
Tasks | Data Augmentation |
Published | 2018-01-10 |
URL | http://arxiv.org/abs/1801.03431v1 |
http://arxiv.org/pdf/1801.03431v1.pdf | |
PWC | https://paperswithcode.com/paper/inferring-a-third-spatial-dimension-from-2d |
Repo | |
Framework | |