April 2, 2020

Paper Group ANR 254

Approximation smooth and sparse functions by deep neural networks without saturation

Title Approximation smooth and sparse functions by deep neural networks without saturation
Authors Xia Liu
Abstract Constructing neural networks for function approximation is a classical and longstanding topic in approximation theory. In this paper, we aim at constructing deep neural networks (deep nets for short) with three hidden layers to approximate smooth and sparse functions. In particular, we prove that the constructed deep nets can reach the optimal approximation rate in approximating both smooth and sparse functions with controllable magnitude of free parameters. Since the saturation that describes the bottleneck of approximation is an insurmountable problem of constructive neural networks, we also prove that deepening the neural network with only one more hidden layer can avoid the saturation. The obtained results underlie advantages of deep nets and provide theoretical explanations for deep learning.
Published 2020-01-13
URL https://arxiv.org/abs/2001.04114v1
PDF https://arxiv.org/pdf/2001.04114v1.pdf
PWC https://paperswithcode.com/paper/approximation-smooth-and-sparse-functions-by

Attentional networks for music generation

Title Attentional networks for music generation
Authors Gullapalli Keerti, A N Vaishnavi, Prerana Mukherjee, A Sree Vidya, Gattineni Sai Sreenithya, Deeksha Nayab
Abstract Realistic music generation has long remained a challenging problem, as generated music may lack structure or rationality. In this work, we propose a deep learning based music generation method to produce old-style music, particularly jazz, with rehashed melodic structures, utilizing a Bi-directional Long Short Term Memory (Bi-LSTM) Neural Network with Attention. Owing to their success in modelling long-term temporal dependencies in sequential data and in the case of videos, Bi-LSTMs with attention serve as a natural choice for early utilization in music generation. We validate in our experiments that Bi-LSTMs with attention are able to preserve the richness and technical nuances of the music performed.
Tasks Music Generation
Published 2020-02-06
URL https://arxiv.org/abs/2002.03854v1
PDF https://arxiv.org/pdf/2002.03854v1.pdf
PWC https://paperswithcode.com/paper/attentional-networks-for-music-generation

PaRoT: A Practical Framework for Robust Deep Neural Network Training

Title PaRoT: A Practical Framework for Robust Deep Neural Network Training
Authors Edward Ayers, Francisco Eiras, Majd Hawasly, Iain Whiteside
Abstract Deep Neural Networks (DNNs) are finding important applications in safety-critical systems such as Autonomous Vehicles (AVs), where perceiving the environment correctly and robustly is necessary for safe operation. Raising unique challenges for assurance due to their black-box nature, DNNs pose a fundamental problem for regulatory acceptance of these types of systems. Robust training — training to minimize excessive sensitivity to small changes in input — has emerged as one promising technique to address this challenge. However, existing robust training tools are inconvenient to use or apply to existing codebases and models: they typically only support a small subset of model elements and require users to extensively rewrite the training code. In this paper we introduce a novel framework, PaRoT, developed on the popular TensorFlow platform, that greatly reduces the barrier to entry. Our framework enables robust training to be performed on arbitrary DNNs without any rewrites to the model. We demonstrate that our framework’s performance is comparable to prior art, and exemplify its ease of use on off-the-shelf, trained models and its testing capabilities on a real-world industrial application: a traffic light detection network.
Tasks Autonomous Vehicles
Published 2020-01-07
URL https://arxiv.org/abs/2001.02152v3
PDF https://arxiv.org/pdf/2001.02152v3.pdf
PWC https://paperswithcode.com/paper/parot-a-practical-framework-for-robust-deep

Taylor Expansion Policy Optimization

Title Taylor Expansion Policy Optimization
Authors Yunhao Tang, Michal Valko, Rémi Munos
Abstract In this work, we investigate the application of Taylor expansions in reinforcement learning. In particular, we propose Taylor expansion policy optimization, a policy optimization formalism that generalizes prior work (e.g., TRPO) as a first-order special case. We also show that Taylor expansions intimately relate to off-policy evaluation. Finally, we show that this new formulation entails modifications which improve the performance of several state-of-the-art distributed algorithms.
Published 2020-03-13
URL https://arxiv.org/abs/2003.06259v1
PDF https://arxiv.org/pdf/2003.06259v1.pdf
PWC https://paperswithcode.com/paper/taylor-expansion-policy-optimization

Enabling the Analysis of Personality Aspects in Recommender Systems

Title Enabling the Analysis of Personality Aspects in Recommender Systems
Authors Shahpar Yakhchi, Amin Beheshti, Seyed Mohssen Ghafari, Mehmet Orgun
Abstract Existing Recommender Systems mainly focus on exploiting users’ feedback, e.g., ratings and reviews on common items, to detect similar users. Thus, they might fail when there are no common items of interest among users. We call this problem the Data Sparsity With no Feedback on Common Items (DSW-n-FCI). Personality-based recommender systems have shown great success in identifying similar users based on their personality types. However, there are only a few personality-based recommender systems in the literature, and they either discover personality explicitly by having users fill in a questionnaire, which is a tedious task, or neglect the impact of users’ personal interests and level of knowledge as a key factor in increasing the acceptance of recommendations. In contrast, we identify users’ personality type implicitly, with no burden on users, and incorporate it along with users’ personal interests and their level of knowledge. Experimental results on a real-world dataset demonstrate the effectiveness of our model, especially in DSW-n-FCI situations.
Tasks Recommendation Systems
Published 2020-01-07
URL https://arxiv.org/abs/2001.04825v1
PDF https://arxiv.org/pdf/2001.04825v1.pdf
PWC https://paperswithcode.com/paper/enabling-the-analysis-of-personality-aspects

Learning to Simulate Human Movement

Title Learning to Simulate Human Movement
Authors Hua Wei, Zhenhui Li
Abstract Modeling how humans move through space is useful for policy-making in transportation, public safety, and public health. Human movement can be viewed as a dynamic process in which humans transition between states (e.g., locations) over time. In the human world, where intelligent agents such as humans or vehicles with human drivers play an important role, the states of agents mostly describe human activities, and the state transition is influenced by both human decisions and physical constraints from the real-world system (e.g., agents need to spend time to move over a certain distance). Therefore, the modeling of state transition should include the modeling of the agent’s decision process and the physical system dynamics. In this paper, we propose to model state transition in human movement by learning a decision model and integrating system dynamics. In experiments on real-world datasets, we demonstrate that the proposed method can achieve superior performance against the state-of-the-art methods in predicting the next state and generating long-term future states.
Published 2020-03-01
URL https://arxiv.org/abs/2003.00613v2
PDF https://arxiv.org/pdf/2003.00613v2.pdf
PWC https://paperswithcode.com/paper/how-do-we-move-learning-to-simulate-with

Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows

Title Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows
Authors Andrei Zanfir, Eduard Gabriel Bazavan, Hongyi Xu, Bill Freeman, Rahul Sukthankar, Cristian Sminchisescu
Abstract Monocular 3D human pose and shape estimation is challenging due to the many degrees of freedom of the human body and the difficulty of acquiring training data for large-scale supervised learning in complex visual scenes. In this paper we present practical semi-supervised and self-supervised models that support training and good generalization in real-world images and video. Our formulation is based on kinematic latent normalizing flow representations and dynamics, as well as differentiable, semantic body part alignment loss functions that support self-supervised learning. In extensive experiments using 3D motion capture datasets like CMU, Human3.6M, 3DPW, or AMASS, as well as image repositories like COCO, we show that the proposed methods outperform the state of the art, supporting the practical construction of an accurate family of models based on large-scale training with diverse and incompletely labeled image and video data.
Tasks Motion Capture
Published 2020-03-23
URL https://arxiv.org/abs/2003.10350v1
PDF https://arxiv.org/pdf/2003.10350v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-3d-human-pose-and-shape

DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures

Title DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures
Authors Yang Zhao, Chaojian Li, Yue Wang, Pengfei Xu, Yongan Zhang, Yingyan Lin
Abstract The recent breakthroughs in deep neural networks (DNNs) have spurred a tremendously increased demand for DNN accelerators. However, designing DNN accelerators is non-trivial as it often takes months/years and requires cross-disciplinary knowledge. To enable fast and effective DNN accelerator development, we propose DNN-Chip Predictor, an analytical performance predictor which can accurately predict DNN accelerators’ energy, throughput, and latency prior to their actual implementation. Our Predictor features two highlights: (1) its analytical performance formulation of DNN ASIC/FPGA accelerators facilitates fast design space exploration and optimization; and (2) it supports DNN accelerators with different algorithm-to-hardware mapping methods (i.e., dataflows) and hardware architectures. Experiment results based on 2 DNN models and 3 different ASIC/FPGA implementations show that our DNN-Chip Predictor’s predicted performance differs from those of chip measurements of FPGA/ASIC implementation by no more than 17.66% when using different DNN models, hardware architectures, and dataflows. We will release code upon acceptance.
Published 2020-02-26
URL https://arxiv.org/abs/2002.11270v1
PDF https://arxiv.org/pdf/2002.11270v1.pdf
PWC https://paperswithcode.com/paper/dnn-chip-predictor-an-analytical-performance

Mixed Integer Programming for Searching Maximum Quasi-Bicliques

Title Mixed Integer Programming for Searching Maximum Quasi-Bicliques
Authors Dmitry I. Ignatov, Polina Ivanova, Albina Zamaletdinova
Abstract This paper is related to the problem of finding the maximal quasi-bicliques in a bipartite graph (bigraph). A quasi-biclique in the bigraph is its “almost” complete subgraph. The relaxation of completeness can be understood variously; here, we assume that the subgraph is a $\gamma$-quasi-biclique if it lacks a certain number of edges to form a biclique such that its density is at least $\gamma \in (0,1]$. For a bigraph and fixed $\gamma$, the problem of searching for the maximal quasi-biclique consists of finding a subset of vertices of the bigraph such that the induced subgraph is a quasi-biclique and its size is maximal for a given graph. Several models based on Mixed Integer Programming (MIP) to search for a quasi-biclique are proposed and tested for working efficiency. An alternative model inspired by biclustering is formulated and tested; this model simultaneously maximizes both the size of the quasi-biclique and its density, using the least-square criterion similar to the one exploited by the triclustering method TriBox.
Published 2020-02-23
URL https://arxiv.org/abs/2002.09880v1
PDF https://arxiv.org/pdf/2002.09880v1.pdf
PWC https://paperswithcode.com/paper/mixed-integer-programming-for-searching
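
The density condition in the abstract above can be illustrated with a small check. This is only a sketch of the definition, not the paper's MIP models: the function names `biclique_density` and `is_quasi_biclique` and the toy graph are hypothetical, and density is assumed to mean the fraction of the |left| x |right| possible edges that are present.

```python
from itertools import product


def biclique_density(edges, left, right):
    """Fraction of the len(left) * len(right) possible edges that are present.

    Assumption: this is the density notion meant by the abstract.
    """
    if not left or not right:
        raise ValueError("both vertex sets must be non-empty")
    edge_set = set(edges)
    present = sum((u, v) in edge_set for u, v in product(left, right))
    return present / (len(left) * len(right))


def is_quasi_biclique(edges, left, right, gamma):
    """True if the induced subgraph has density at least gamma, gamma in (0, 1]."""
    if not 0 < gamma <= 1:
        raise ValueError("gamma must lie in (0, 1]")
    return biclique_density(edges, left, right) >= gamma


# Hypothetical toy bigraph: left vertices are letters, right vertices are integers.
edges = [("a", 1), ("a", 2), ("b", 1), ("b", 2), ("c", 1)]
left, right = ["a", "b", "c"], [1, 2]

# 5 of the 6 possible edges are present; only ("c", 2) is missing.
density = biclique_density(edges, left, right)
print(density, is_quasi_biclique(edges, left, right, gamma=0.8))
```

With $\gamma = 1$ the check reduces to the ordinary biclique condition, which here holds for the sub-selection `["a", "b"]` and `[1, 2]`.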

Cortical surface parcellation based on intra-subject white matter fiber clustering

Title Cortical surface parcellation based on intra-subject white matter fiber clustering
Authors Narciso López-López, Andrea Vázquez, Cyril Poupon, Jean-François Mangin, Pamela Guevara
Abstract We present a hybrid method that performs the complete parcellation of the cerebral cortex of an individual, based on the connectivity information of the white matter fibers from a whole-brain tractography dataset. The method consists of five steps. First, intra-subject clustering is performed on the brain tractography. The fibers that make up each cluster are then intersected with the cortical mesh and then filtered to discard outliers. In addition, the method resolves the overlapping between the different intersection regions (sub-parcels) throughout the cortex efficiently. Finally, a post-processing step is applied to achieve more uniform sub-parcels. The output is the complete labeling of cortical mesh vertices, representing the different cortex sub-parcels, with strong connections to other sub-parcels. We evaluated our method with measures of brain connectivity such as functional segregation (clustering coefficient), functional integration (characteristic path length) and small-world. Results in five subjects from the ARCHI database show a good individual cortical parcellation for each one, composed of about 200 sub-parcels per hemisphere and complying with these connectivity measures.
Published 2020-02-16
URL https://arxiv.org/abs/2002.09034v1
PDF https://arxiv.org/pdf/2002.09034v1.pdf
PWC https://paperswithcode.com/paper/cortical-surface-parcellation-based-on-intra
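
Two of the connectivity measures named in the abstract above, the clustering coefficient and the characteristic path length, have standard textbook definitions for unweighted, undirected graphs. The sketch below uses those definitions on a hypothetical toy graph. Whether the paper uses weighted variants is not stated here, and the small-world measure is not covered.

```python
from collections import deque
from itertools import combinations


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist among the neighbours of this node.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
clustering = average_clustering(adj)        # (1 + 1 + 1/3 + 0) / 4
path_length = characteristic_path_length(adj)  # 8 / 6 over unordered pairs
print(clustering, path_length)
```

In a parcellation setting the nodes would presumably be sub-parcels and the edges fiber-based connections between them, but that mapping is an assumption made for illustration.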

Autonomous discovery in the chemical sciences part II: Outlook

Title Autonomous discovery in the chemical sciences part II: Outlook
Authors Connor W. Coley, Natalie S. Eyke, Klavs F. Jensen
Abstract This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences. In this second part, we reflect on a selection of exemplary studies. It is increasingly important to articulate what the role of automation and computation has been in the scientific process and how that has or has not accelerated discovery. One can argue that even the best automated systems have yet to “discover” despite being incredibly useful as laboratory assistants. We must carefully consider how they have been and can be applied to future problems of chemical discovery in order to effectively design and interact with future autonomous platforms. The majority of this article defines a large set of open research directions, including improving our ability to work with complex data, build empirical models, automate both physical and computational experiments for validation, select experiments, and evaluate whether we are making progress toward the ultimate goal of autonomous discovery. Addressing these practical and methodological challenges will greatly advance the extent to which autonomous systems can make meaningful discoveries.
Published 2020-03-30
URL https://arxiv.org/abs/2003.13755v1
PDF https://arxiv.org/pdf/2003.13755v1.pdf
PWC https://paperswithcode.com/paper/autonomous-discovery-in-the-chemical-sciences

On the generalization of bayesian deep nets for multi-class classification

Title On the generalization of bayesian deep nets for multi-class classification
Authors Yossi Adi, Yaniv Nemcovsky, Alex Schwing, Tamir Hazan
Abstract Generalization bounds which assess the difference between the true risk and the empirical risk have been studied extensively. However, to obtain bounds, current techniques use strict assumptions such as a uniformly bounded or a Lipschitz loss function. To avoid these assumptions, in this paper, we propose a new generalization bound for Bayesian deep nets by exploiting the contractivity of the Log-Sobolev inequalities. Using these inequalities adds an additional loss-gradient norm term to the generalization bound, which is intuitively a surrogate of the model complexity. Empirically, we analyze the effect of this loss-gradient norm term using different deep nets.
Published 2020-02-23
URL https://arxiv.org/abs/2002.09866v1
PDF https://arxiv.org/pdf/2002.09866v1.pdf
PWC https://paperswithcode.com/paper/on-the-generalization-of-bayesian-deep-nets

Image Entropy for Classification and Analysis of Pathology Slides

Title Image Entropy for Classification and Analysis of Pathology Slides
Authors Steven J. Frank
Abstract Pathology slides of lung malignancies are classified using the “Salient Slices” technique described in Frank et al., 2020. A four-fold cross-validation study using a small image set (42 adenocarcinoma slides and 42 squamous cell carcinoma slides) produced fully correct classifications in each fold. Probability maps enable visualization of the underlying basis for a classification.
Published 2020-02-16
URL https://arxiv.org/abs/2002.07621v1
PDF https://arxiv.org/pdf/2002.07621v1.pdf
PWC https://paperswithcode.com/paper/image-entropy-for-classification-and-analysis
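
The evaluation setup in the abstract above (four folds over 42 + 42 slides) can be sketched as a class-balanced split. The abstract does not say how slides were assigned to folds, so the stratified round-robin assignment, the label strings, and the function name `stratified_folds` are all assumptions made for illustration.

```python
import random


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds, keeping classes balanced across folds.

    Assumption: stratification by class; the source does not state the protocol.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out round-robin so every fold gets a near-equal share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Hypothetical labels matching the slide counts in the abstract.
labels = ["adenocarcinoma"] * 42 + ["squamous"] * 42
folds = stratified_folds(labels, k=4)

for i, test_idx in enumerate(folds):
    held_out = set(test_idx)
    train_idx = [j for j in range(len(labels)) if j not in held_out]
    print(f"fold {i}: train={len(train_idx)} test={len(test_idx)}")
```

Because 42 is not divisible by 4, two folds hold 22 slides and two hold 20 under this scheme. Each slide is held out in exactly one fold.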

Image Hashing by Minimizing Independent Relaxed Wasserstein Distance

Title Image Hashing by Minimizing Independent Relaxed Wasserstein Distance
Authors Khoa D. Doan, Amir Kimiyaie, Saurav Manchanda, Chandan K. Reddy
Abstract Image hashing is a fundamental problem in the computer vision domain with various challenges, primarily in terms of efficiency and effectiveness. Existing hashing methods lack a principled characterization of the goodness of the hash codes and a principled approach to learn the discrete hash functions that are being optimized in the continuous space. Adversarial autoencoders are shown to be able to implicitly learn a robust hash function that generates hash codes which are balanced and have low quantization error. However, the existing adversarial autoencoders for hashing are too inefficient to be employed for large-scale image retrieval applications because of the minmax optimization procedure. In this paper, we propose an Independent Relaxed Wasserstein Autoencoder, which presents a novel, efficient hashing method that can implicitly learn the optimal hash function by directly training the adversarial autoencoder without any discriminator/critic. Our method is an order of magnitude more efficient and has a much lower sample complexity than the Optimal Transport formulation of the Wasserstein distance. The proposed method outperforms the current state-of-the-art image hashing methods for the retrieval task on several prominent image collections.
Tasks Image Retrieval, Quantization
Published 2020-02-29
URL https://arxiv.org/abs/2003.00134v2
PDF https://arxiv.org/pdf/2003.00134v2.pdf
PWC https://paperswithcode.com/paper/image-hashing-by-minimizing-independent

Cross-modal Learning for Multi-modal Video Categorization

Title Cross-modal Learning for Multi-modal Video Categorization
Authors Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee
Abstract Multi-modal machine learning (ML) models can process data in multiple modalities (e.g., video, audio, text) and are useful for video content analysis in a variety of problems (e.g., object detection, scene understanding, activity recognition). In this paper, we focus on the problem of video categorization using a multi-modal ML technique. In particular, we have developed a novel multi-modal ML approach that we call “cross-modal learning”, where one modality influences another but only when there is correlation between the modalities. To that end, we first train a correlation tower that guides the main multi-modal video categorization tower in the model. We show how this cross-modal principle can be applied to different types of models (e.g., RNN, Transformer, NetVLAD), and demonstrate through experiments how our proposed multi-modal video categorization models with cross-modal learning outperform strong state-of-the-art baseline models.
Tasks Activity Recognition, Object Detection, Scene Understanding
Published 2020-03-07
URL https://arxiv.org/abs/2003.03501v2
PDF https://arxiv.org/pdf/2003.03501v2.pdf
PWC https://paperswithcode.com/paper/cross-modal-learning-for-multi-modal-video