April 2, 2020

Paper Group ANR 254

Approximation smooth and sparse functions by deep neural networks without saturation. Attentional networks for music generation. PaRoT: A Practical Framework for Robust Deep Neural Network Training. Taylor Expansion Policy Optimization. Enabling the Analysis of Personality Aspects in Recommender Systems. Learning to Simulate Human Movement. Weakly …

Approximation smooth and sparse functions by deep neural networks without saturation


Title	Approximation smooth and sparse functions by deep neural networks without saturation
Authors	Xia Liu
Abstract	Constructing neural networks for function approximation is a classical and longstanding topic in approximation theory. In this paper, we aim at constructing deep neural networks (deep nets for short) with three hidden layers to approximate smooth and sparse functions. In particular, we prove that the constructed deep nets can reach the optimal approximation rate in approximating both smooth and sparse functions with controllable magnitude of free parameters. Since saturation, which describes the bottleneck of approximation, is an insurmountable problem of constructive neural networks, we also prove that deepening the neural network with only one more hidden layer can avoid the saturation. The obtained results underline the advantages of deep nets and provide theoretical explanations for deep learning.
Tasks
Published	2020-01-13
URL	https://arxiv.org/abs/2001.04114v1
PDF	https://arxiv.org/pdf/2001.04114v1.pdf
PWC	https://paperswithcode.com/paper/approximation-smooth-and-sparse-functions-by
Repo
Framework

Attentional networks for music generation


Title	Attentional networks for music generation
Authors	Gullapalli Keerti, A N Vaishnavi, Prerana Mukherjee, A Sree Vidya, Gattineni Sai Sreenithya, Deeksha Nayab
Abstract	Realistic music generation has always remained a challenging problem, as it may lack structure or rationality. In this work, we propose a deep learning based music generation method to produce old-style music, particularly jazz, with rehashed melodic structures, utilizing a Bi-directional Long Short Term Memory (Bi-LSTM) Neural Network with Attention. Owing to their success in modelling long-term temporal dependencies in sequential data and their success in the case of videos, Bi-LSTMs with attention serve as a natural choice for early use in music generation. We validate in our experiments that Bi-LSTMs with attention are able to preserve the richness and technical nuances of the music performed.
Tasks	Music Generation
Published	2020-02-06
URL	https://arxiv.org/abs/2002.03854v1
PDF	https://arxiv.org/pdf/2002.03854v1.pdf
PWC	https://paperswithcode.com/paper/attentional-networks-for-music-generation
Repo
Framework

PaRoT: A Practical Framework for Robust Deep Neural Network Training


Title	PaRoT: A Practical Framework for Robust Deep Neural Network Training
Authors	Edward Ayers, Francisco Eiras, Majd Hawasly, Iain Whiteside
Abstract	Deep Neural Networks (DNNs) are finding important applications in safety-critical systems such as Autonomous Vehicles (AVs), where perceiving the environment correctly and robustly is necessary for safe operation. Raising unique challenges for assurance due to their black-box nature, DNNs pose a fundamental problem for regulatory acceptance of these types of systems. Robust training — training to minimize excessive sensitivity to small changes in input — has emerged as one promising technique to address this challenge. However, existing robust training tools are inconvenient to use or apply to existing codebases and models: they typically only support a small subset of model elements and require users to extensively rewrite the training code. In this paper we introduce a novel framework, PaRoT, developed on the popular TensorFlow platform, that greatly reduces the barrier to entry. Our framework enables robust training to be performed on arbitrary DNNs without any rewrites to the model. We demonstrate that our framework’s performance is comparable to prior art, and exemplify its ease of use on off-the-shelf, trained models and its testing capabilities on a real-world industrial application: a traffic light detection network.
Tasks	Autonomous Vehicles
Published	2020-01-07
URL	https://arxiv.org/abs/2001.02152v3
PDF	https://arxiv.org/pdf/2001.02152v3.pdf
PWC	https://paperswithcode.com/paper/parot-a-practical-framework-for-robust-deep
Repo
Framework

Taylor Expansion Policy Optimization


Title	Taylor Expansion Policy Optimization
Authors	Yunhao Tang, Michal Valko, Rémi Munos
Abstract	In this work, we investigate the application of Taylor expansions in reinforcement learning. In particular, we propose Taylor expansion policy optimization, a policy optimization formalism that generalizes prior work (e.g., TRPO) as a first-order special case. We also show that Taylor expansions intimately relate to off-policy evaluation. Finally, we show that this new formulation entails modifications which improve the performance of several state-of-the-art distributed algorithms.
Tasks
Published	2020-03-13
URL	https://arxiv.org/abs/2003.06259v1
PDF	https://arxiv.org/pdf/2003.06259v1.pdf
PWC	https://paperswithcode.com/paper/taylor-expansion-policy-optimization
Repo
Framework

Enabling the Analysis of Personality Aspects in Recommender Systems


Title	Enabling the Analysis of Personality Aspects in Recommender Systems
Authors	Shahpar Yakhchi, Amin Beheshti, Seyed Mohssen Ghafari, Mehmet Orgun
Abstract	Existing Recommender Systems mainly focus on exploiting users’ feedback, e.g., ratings and reviews on common items, to detect similar users. Thus, they might fail when there are no common items of interest among users. We call this problem the Data Sparsity With no Feedback on Common Items (DSW-n-FCI). Personality-based recommender systems have shown great success in identifying similar users based on their personality types. However, there are only a few personality-based recommender systems in the literature, and they either discover personality explicitly through a questionnaire, which is a tedious task, or neglect the impact of users’ personal interests and level of knowledge, a key factor in increasing the acceptance of recommendations. In contrast, we identify users’ personality type implicitly, with no burden on users, and incorporate it along with users’ personal interests and their level of knowledge. Experimental results on a real-world dataset demonstrate the effectiveness of our model, especially in DSW-n-FCI situations.
Tasks	Recommendation Systems
Published	2020-01-07
URL	https://arxiv.org/abs/2001.04825v1
PDF	https://arxiv.org/pdf/2001.04825v1.pdf
PWC	https://paperswithcode.com/paper/enabling-the-analysis-of-personality-aspects
Repo
Framework

Learning to Simulate Human Movement


Title	Learning to Simulate Human Movement
Authors	Hua Wei, Zhenhui Li
Abstract	Modeling how humans move through space is useful for policy-making in transportation, public safety, and public health. Human movement can be viewed as a dynamic process in which a person transitions between states (e.g., locations) over time. In the human world, where intelligent agents such as humans or vehicles with human drivers play an important role, the states of agents mostly describe human activities, and the state transition is influenced by both human decisions and physical constraints from the real-world system (e.g., agents need to spend time to move over a certain distance). Therefore, the modeling of state transition should include the modeling of the agent’s decision process and the physical system dynamics. In this paper, we propose to model state transition in human movement by learning a decision model and integrating system dynamics. In experiments on real-world datasets, we demonstrate that the proposed method achieves superior performance over state-of-the-art methods in predicting the next state and generating long-term future states.
Tasks
Published	2020-03-01
URL	https://arxiv.org/abs/2003.00613v2
PDF	https://arxiv.org/pdf/2003.00613v2.pdf
PWC	https://paperswithcode.com/paper/how-do-we-move-learning-to-simulate-with
Repo
Framework

Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows


Title	Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows
Authors	Andrei Zanfir, Eduard Gabriel Bazavan, Hongyi Xu, Bill Freeman, Rahul Sukthankar, Cristian Sminchisescu
Abstract	Monocular 3D human pose and shape estimation is challenging due to the many degrees of freedom of the human body and the difficulty of acquiring training data for large-scale supervised learning in complex visual scenes. In this paper we present practical semi-supervised and self-supervised models that support training and good generalization in real-world images and video. Our formulation is based on kinematic latent normalizing flow representations and dynamics, as well as differentiable, semantic body part alignment loss functions that support self-supervised learning. In extensive experiments using 3D motion capture datasets like CMU, Human3.6M, 3DPW, or AMASS, as well as image repositories like COCO, we show that the proposed methods outperform the state of the art, supporting the practical construction of an accurate family of models based on large-scale training with diverse and incompletely labeled image and video data.
Tasks	Motion Capture
Published	2020-03-23
URL	https://arxiv.org/abs/2003.10350v1
PDF	https://arxiv.org/pdf/2003.10350v1.pdf
PWC	https://paperswithcode.com/paper/weakly-supervised-3d-human-pose-and-shape
Repo
Framework

DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures


Title	DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures
Authors	Yang Zhao, Chaojian Li, Yue Wang, Pengfei Xu, Yongan Zhang, Yingyan Lin
Abstract	The recent breakthroughs in deep neural networks (DNNs) have spurred a tremendously increased demand for DNN accelerators. However, designing DNN accelerators is non-trivial as it often takes months/years and requires cross-disciplinary knowledge. To enable fast and effective DNN accelerator development, we propose DNN-Chip Predictor, an analytical performance predictor which can accurately predict DNN accelerators’ energy, throughput, and latency prior to their actual implementation. Our Predictor features two highlights: (1) its analytical performance formulation of DNN ASIC/FPGA accelerators facilitates fast design space exploration and optimization; and (2) it supports DNN accelerators with different algorithm-to-hardware mapping methods (i.e., dataflows) and hardware architectures. Experiment results based on 2 DNN models and 3 different ASIC/FPGA implementations show that our DNN-Chip Predictor’s predicted performance differs from those of chip measurements of FPGA/ASIC implementation by no more than 17.66% when using different DNN models, hardware architectures, and dataflows. We will release code upon acceptance.
Tasks
Published	2020-02-26
URL	https://arxiv.org/abs/2002.11270v1
PDF	https://arxiv.org/pdf/2002.11270v1.pdf
PWC	https://paperswithcode.com/paper/dnn-chip-predictor-an-analytical-performance
Repo
Framework

Mixed Integer Programming for Searching Maximum Quasi-Bicliques


Title	Mixed Integer Programming for Searching Maximum Quasi-Bicliques
Authors	Dmitry I. Ignatov, Polina Ivanova, Albina Zamaletdinova
Abstract	This paper is related to the problem of finding the maximal quasi-bicliques in a bipartite graph (bigraph). A quasi-biclique in the bigraph is its “almost” complete subgraph. The relaxation of completeness can be understood variously; here, we assume that the subgraph is a $\gamma$-quasi-biclique if it lacks a certain number of edges to form a biclique such that its density is at least $\gamma \in (0,1]$. For a bigraph and fixed $\gamma$, the problem of searching for the maximal quasi-biclique consists of finding a subset of vertices of the bigraph such that the induced subgraph is a quasi-biclique and its size is maximal for a given graph. Several models based on Mixed Integer Programming (MIP) to search for a quasi-biclique are proposed and tested for working efficiency. An alternative model inspired by biclustering is formulated and tested; this model simultaneously maximizes both the size of the quasi-biclique and its density, using the least-square criterion similar to the one exploited by the triclustering algorithm TriBox.
Tasks
Published	2020-02-23
URL	https://arxiv.org/abs/2002.09880v1
PDF	https://arxiv.org/pdf/2002.09880v1.pdf
PWC	https://paperswithcode.com/paper/mixed-integer-programming-for-searching
Repo
Framework
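
The density condition in the abstract above can be illustrated with a small check. This is only a sketch, not the authors' MIP formulation: it assumes that density means the number of edges present divided by the number of edges a complete biclique on the same vertex sets would have, and the function names and toy graph are hypothetical.

```python
from itertools import product


def biclique_density(edges, left, right):
    """Fraction of the |left| * |right| possible edges that are present.

    Assumption: density is edges present / edges of a complete biclique.
    """
    if not left or not right:
        return 0.0
    present = sum((u, v) in edges for u, v in product(left, right))
    return present / (len(left) * len(right))


def is_quasi_biclique(edges, left, right, gamma):
    """True if the induced subgraph has density at least gamma in (0, 1]."""
    if not 0 < gamma <= 1:
        raise ValueError("gamma must lie in (0, 1]")
    return biclique_density(edges, left, right) >= gamma


# Hypothetical toy bigraph: left vertices a, b, c and right vertices 1, 2, 3.
edges = {("a", 1), ("a", 2), ("b", 1), ("b", 2), ("b", 3), ("c", 3)}

# {a, b} x {1, 2, 3} contains 5 of the 6 possible edges.
print(is_quasi_biclique(edges, {"a", "b"}, {1, 2, 3}, 0.8))
print(is_quasi_biclique(edges, {"a", "b"}, {1, 2, 3}, 0.9))
```

With gamma equal to 1 the check reduces to the ordinary biclique condition; searching for the largest vertex subsets that pass the check is the optimization problem the paper addresses.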

Cortical surface parcellation based on intra-subject white matter fiber clustering


Title	Cortical surface parcellation based on intra-subject white matter fiber clustering
Authors	Narciso López-López, Andrea Vázquez, Cyril Poupon, Jean-François Mangin, Pamela Guevara
Abstract	We present a hybrid method that performs the complete parcellation of the cerebral cortex of an individual, based on the connectivity information of the white matter fibers from a whole-brain tractography dataset. The method consists of five steps. First, intra-subject clustering is performed on the brain tractography. The fibers that make up each cluster are then intersected with the cortical mesh and filtered to discard outliers. In addition, the method efficiently resolves the overlap between the different intersection regions (sub-parcels) throughout the cortex. Finally, a post-processing step is applied to achieve more uniform sub-parcels. The output is the complete labeling of cortical mesh vertices, representing the different cortex sub-parcels, with strong connections to other sub-parcels. We evaluated our method with measures of brain connectivity such as functional segregation (clustering coefficient), functional integration (characteristic path length) and small-world. Results in five subjects from the ARCHI database show a good individual cortical parcellation for each one, composed of about 200 sub-parcels per hemisphere and complying with these connectivity measures.
Tasks
Published	2020-02-16
URL	https://arxiv.org/abs/2002.09034v1
PDF	https://arxiv.org/pdf/2002.09034v1.pdf
PWC	https://paperswithcode.com/paper/cortical-surface-parcellation-based-on-intra
Repo
Framework
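
The two connectivity measures named in the abstract above have standard graph-theoretic definitions, sketched here on a toy graph. The abstract does not say how the authors computed them, so this assumes an unweighted, undirected graph whose nodes stand for sub-parcels; the adjacency data is hypothetical.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj):
    """Average local clustering coefficient of an undirected graph."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        # Count pairs of neighbours that are themselves connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives hop distances from the source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(clustering_coefficient(adj))
print(characteristic_path_length(adj))
```

A high clustering coefficient together with a short characteristic path length is the usual signature of a small-world network.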

Autonomous discovery in the chemical sciences part II: Outlook


Title	Autonomous discovery in the chemical sciences part II: Outlook
Authors	Connor W. Coley, Natalie S. Eyke, Klavs F. Jensen
Abstract	This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences. In this second part, we reflect on a selection of exemplary studies. It is increasingly important to articulate what the role of automation and computation has been in the scientific process and how that has or has not accelerated discovery. One can argue that even the best automated systems have yet to “discover” despite being incredibly useful as laboratory assistants. We must carefully consider how they have been and can be applied to future problems of chemical discovery in order to effectively design and interact with future autonomous platforms. The majority of this article defines a large set of open research directions, including improving our ability to work with complex data, build empirical models, automate both physical and computational experiments for validation, select experiments, and evaluate whether we are making progress toward the ultimate goal of autonomous discovery. Addressing these practical and methodological challenges will greatly advance the extent to which autonomous systems can make meaningful discoveries.
Tasks
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13755v1
PDF	https://arxiv.org/pdf/2003.13755v1.pdf
PWC	https://paperswithcode.com/paper/autonomous-discovery-in-the-chemical-sciences
Repo
Framework

On the generalization of bayesian deep nets for multi-class classification


Title	On the generalization of bayesian deep nets for multi-class classification
Authors	Yossi Adi, Yaniv Nemcovsky, Alex Schwing, Tamir Hazan
Abstract	Generalization bounds which assess the difference between the true risk and the empirical risk have been studied extensively. However, to obtain bounds, current techniques use strict assumptions such as a uniformly bounded or a Lipschitz loss function. To avoid these assumptions, in this paper, we propose a new generalization bound for Bayesian deep nets by exploiting the contractivity of the Log-Sobolev inequalities. Using these inequalities adds an additional loss-gradient norm term to the generalization bound, which is intuitively a surrogate of the model complexity. Empirically, we analyze the effect of this loss-gradient norm term using different deep nets.
Tasks
Published	2020-02-23
URL	https://arxiv.org/abs/2002.09866v1
PDF	https://arxiv.org/pdf/2002.09866v1.pdf
PWC	https://paperswithcode.com/paper/on-the-generalization-of-bayesian-deep-nets
Repo
Framework

Image Entropy for Classification and Analysis of Pathology Slides


Title	Image Entropy for Classification and Analysis of Pathology Slides
Authors	Steven J. Frank
Abstract	Pathology slides of lung malignancies are classified using the “Salient Slices” technique described in Frank et al., 2020. A four-fold cross-validation study using a small image set (42 adenocarcinoma slides and 42 squamous cell carcinoma slides) produced fully correct classifications in each fold. Probability maps enable visualization of the underlying basis for a classification.
Tasks
Published	2020-02-16
URL	https://arxiv.org/abs/2002.07621v1
PDF	https://arxiv.org/pdf/2002.07621v1.pdf
PWC	https://paperswithcode.com/paper/image-entropy-for-classification-and-analysis
Repo
Framework
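
The four-fold cross-validation protocol mentioned in the abstract above can be sketched as follows. The abstract does not say how slides were assigned to folds, so the class-balanced round-robin split, the function name and the label strings are all assumptions made for illustration.

```python
def stratified_folds(labels, k):
    """Assign sample indices to k folds, round-robin within each class."""
    folds = [[] for _ in range(k)]
    seen = {}  # number of samples of each class assigned so far
    for idx, label in enumerate(labels):
        pos = seen.get(label, 0)
        folds[pos % k].append(idx)
        seen[label] = pos + 1
    return folds


# 42 slides per class, as in the abstract; the label names are placeholders.
labels = ["adenocarcinoma"] * 42 + ["squamous"] * 42
folds = stratified_folds(labels, 4)

for i, test_idx in enumerate(folds):
    # Each fold serves once as the test set; the other three form the training set.
    train_idx = [j for f in folds if f is not test_idx for j in f]
    print(f"fold {i}: train={len(train_idx)} test={len(test_idx)}")
```

Because 42 is not divisible by 4, the folds cannot all be the same size; under this split two folds hold 22 slides and two hold 20, with both classes equally represented in each.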

Image Hashing by Minimizing Independent Relaxed Wasserstein Distance


Title	Image Hashing by Minimizing Independent Relaxed Wasserstein Distance
Authors	Khoa D. Doan, Amir Kimiyaie, Saurav Manchanda, Chandan K. Reddy
Abstract	Image hashing is a fundamental problem in the computer vision domain with various challenges, primarily in terms of efficiency and effectiveness. Existing hashing methods lack a principled characterization of the goodness of the hash codes and a principled approach to learn the discrete hash functions that are being optimized in the continuous space. Adversarial autoencoders are shown to be able to implicitly learn a robust hash function that generates hash codes which are balanced and have low quantization error. However, the existing adversarial autoencoders for hashing are too inefficient to be employed for large-scale image retrieval applications because of the minmax optimization procedure. In this paper, we propose an Independent Relaxed Wasserstein Autoencoder, which presents a novel, efficient hashing method that can implicitly learn the optimal hash function by directly training the adversarial autoencoder without any discriminator/critic. Our method is an order of magnitude more efficient and has a much lower sample complexity than the Optimal Transport formulation of the Wasserstein distance. The proposed method outperforms the current state-of-the-art image hashing methods for the retrieval task on several prominent image collections.
Tasks	Image Retrieval, Quantization
Published	2020-02-29
URL	https://arxiv.org/abs/2003.00134v2
PDF	https://arxiv.org/pdf/2003.00134v2.pdf
PWC	https://paperswithcode.com/paper/image-hashing-by-minimizing-independent
Repo
Framework

Cross-modal Learning for Multi-modal Video Categorization


Title	Cross-modal Learning for Multi-modal Video Categorization
Authors	Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee
Abstract	Multi-modal machine learning (ML) models can process data in multiple modalities (e.g., video, audio, text) and are useful for video content analysis in a variety of problems (e.g., object detection, scene understanding, activity recognition). In this paper, we focus on the problem of video categorization using a multi-modal ML technique. In particular, we have developed a novel multi-modal ML approach that we call “cross-modal learning”, where one modality influences another but only when there is correlation between the modalities. For that, we first train a correlation tower that guides the main multi-modal video categorization tower in the model. We show how this cross-modal principle can be applied to different types of models (e.g., RNN, Transformer, NetVLAD), and demonstrate through experiments how our proposed multi-modal video categorization models with cross-modal learning outperform strong state-of-the-art baseline models.
Tasks	Activity Recognition, Object Detection, Scene Understanding
Published	2020-03-07
URL	https://arxiv.org/abs/2003.03501v2
PDF	https://arxiv.org/pdf/2003.03501v2.pdf
PWC	https://paperswithcode.com/paper/cross-modal-learning-for-multi-modal-video
Repo
Framework