Paper Group ANR 254
Approximation smooth and sparse functions by deep neural networks without saturation
Title | Approximation smooth and sparse functions by deep neural networks without saturation |
Authors | Xia Liu |
Abstract | Constructing neural networks for function approximation is a classical and longstanding topic in approximation theory. In this paper, we aim at constructing deep neural networks (deep nets for short) with three hidden layers to approximate smooth and sparse functions. In particular, we prove that the constructed deep nets can reach the optimal approximation rate in approximating both smooth and sparse functions with controllable magnitude of free parameters. Since the saturation that describes the bottleneck of approximation is an insurmountable problem of constructive neural networks, we also prove that deepening the neural network with only one more hidden layer can avoid the saturation. The obtained results underlie advantages of deep nets and provide theoretical explanations for deep learning. |
Tasks | |
Published | 2020-01-13 |
URL | https://arxiv.org/abs/2001.04114v1 |
https://arxiv.org/pdf/2001.04114v1.pdf | |
PWC | https://paperswithcode.com/paper/approximation-smooth-and-sparse-functions-by |
Repo | |
Framework | |
Attentional networks for music generation
Title | Attentional networks for music generation |
Authors | Gullapalli Keerti, A N Vaishnavi, Prerana Mukherjee, A Sree Vidya, Gattineni Sai Sreenithya, Deeksha Nayab |
Abstract | Realistic music generation has always remained a challenging problem, as generated music may lack structure or rationality. In this work, we propose a deep learning based music generation method to produce old-style music, particularly jazz, with rehashed melodic structures, utilizing a Bi-directional Long Short Term Memory (Bi-LSTM) Neural Network with Attention. Owing to their success in modelling long-term temporal dependencies in sequential data and in the case of videos, Bi-LSTMs with attention serve as a natural choice for early utilization in music generation. We validate in our experiments that Bi-LSTMs with attention are able to preserve the richness and technical nuances of the music performed. |
Tasks | Music Generation |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.03854v1 |
https://arxiv.org/pdf/2002.03854v1.pdf | |
PWC | https://paperswithcode.com/paper/attentional-networks-for-music-generation |
Repo | |
Framework | |
PaRoT: A Practical Framework for Robust Deep Neural Network Training
Title | PaRoT: A Practical Framework for Robust Deep Neural Network Training |
Authors | Edward Ayers, Francisco Eiras, Majd Hawasly, Iain Whiteside |
Abstract | Deep Neural Networks (DNNs) are finding important applications in safety-critical systems such as Autonomous Vehicles (AVs), where perceiving the environment correctly and robustly is necessary for safe operation. Raising unique challenges for assurance due to their black-box nature, DNNs pose a fundamental problem for regulatory acceptance of these types of systems. Robust training — training to minimize excessive sensitivity to small changes in input — has emerged as one promising technique to address this challenge. However, existing robust training tools are inconvenient to use or apply to existing codebases and models: they typically only support a small subset of model elements and require users to extensively rewrite the training code. In this paper we introduce a novel framework, PaRoT, developed on the popular TensorFlow platform, that greatly reduces the barrier to entry. Our framework enables robust training to be performed on arbitrary DNNs without any rewrites to the model. We demonstrate that our framework’s performance is comparable to prior art, and exemplify its ease of use on off-the-shelf, trained models and its testing capabilities on a real-world industrial application: a traffic light detection network. |
Tasks | Autonomous Vehicles |
Published | 2020-01-07 |
URL | https://arxiv.org/abs/2001.02152v3 |
https://arxiv.org/pdf/2001.02152v3.pdf | |
PWC | https://paperswithcode.com/paper/parot-a-practical-framework-for-robust-deep |
Repo | |
Framework | |
Taylor Expansion Policy Optimization
Title | Taylor Expansion Policy Optimization |
Authors | Yunhao Tang, Michal Valko, Rémi Munos |
Abstract | In this work, we investigate the application of Taylor expansions in reinforcement learning. In particular, we propose Taylor expansion policy optimization, a policy optimization formalism that generalizes prior work (e.g., TRPO) as a first-order special case. We also show that Taylor expansions intimately relate to off-policy evaluation. Finally, we show that this new formulation entails modifications which improve the performance of several state-of-the-art distributed algorithms. |
Tasks | |
Published | 2020-03-13 |
URL | https://arxiv.org/abs/2003.06259v1 |
https://arxiv.org/pdf/2003.06259v1.pdf | |
PWC | https://paperswithcode.com/paper/taylor-expansion-policy-optimization |
Repo | |
Framework | |
Enabling the Analysis of Personality Aspects in Recommender Systems
Title | Enabling the Analysis of Personality Aspects in Recommender Systems |
Authors | Shahpar Yakhchi, Amin Beheshti, Seyed Mohssen Ghafari, Mehmet Orgun |
Abstract | Existing Recommender Systems mainly focus on exploiting users’ feedback, e.g., ratings and reviews on common items, to detect similar users. Thus, they might fail when there are no common items of interest among users. We call this problem the Data Sparsity With no Feedback on Common Items (DSW-n-FCI). Personality-based recommender systems have shown great success in identifying similar users based on their personality types. However, there are only a few personality-based recommender systems in the literature, and they either discover personality explicitly through a questionnaire, which is a tedious task, or neglect the impact of users’ personal interests and level of knowledge as a key factor in increasing recommendations’ acceptance. In contrast, we identify users’ personality type implicitly, with no burden on users, and incorporate it along with users’ personal interests and their level of knowledge. Experimental results on a real-world dataset demonstrate the effectiveness of our model, especially in DSW-n-FCI situations. |
Tasks | Recommendation Systems |
Published | 2020-01-07 |
URL | https://arxiv.org/abs/2001.04825v1 |
https://arxiv.org/pdf/2001.04825v1.pdf | |
PWC | https://paperswithcode.com/paper/enabling-the-analysis-of-personality-aspects |
Repo | |
Framework | |
Learning to Simulate Human Movement
Title | Learning to Simulate Human Movement |
Authors | Hua Wei, Zhenhui Li |
Abstract | Modeling how humans move through space is useful for policy-making in transportation, public safety, and public health. Human movement can be viewed as a dynamic process in which humans transit between states (e.g., locations) over time. In the human world, where intelligent agents such as humans or vehicles with human drivers play an important role, the states of agents mostly describe human activities, and the state transition is influenced by both human decisions and physical constraints from the real-world system (e.g., agents need to spend time to move over a certain distance). Therefore, the modeling of state transition should include the modeling of the agent’s decision process and the physical system dynamics. In this paper, we propose to model state transition in human movement by learning a decision model and integrating system dynamics. In experiments on real-world datasets, we demonstrate that the proposed method achieves superior performance over state-of-the-art methods in predicting the next state and generating long-term future states. |
Tasks | |
Published | 2020-03-01 |
URL | https://arxiv.org/abs/2003.00613v2 |
https://arxiv.org/pdf/2003.00613v2.pdf | |
PWC | https://paperswithcode.com/paper/how-do-we-move-learning-to-simulate-with |
Repo | |
Framework | |
Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows
Title | Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows |
Authors | Andrei Zanfir, Eduard Gabriel Bazavan, Hongyi Xu, Bill Freeman, Rahul Sukthankar, Cristian Sminchisescu |
Abstract | Monocular 3D human pose and shape estimation is challenging due to the many degrees of freedom of the human body and the difficulty of acquiring training data for large-scale supervised learning in complex visual scenes. In this paper we present practical semi-supervised and self-supervised models that support training and good generalization in real-world images and video. Our formulation is based on kinematic latent normalizing flow representations and dynamics, as well as differentiable, semantic body part alignment loss functions that support self-supervised learning. In extensive experiments using 3D motion capture datasets like CMU, Human3.6M, 3DPW, or AMASS, as well as image repositories like COCO, we show that the proposed methods outperform the state of the art, supporting the practical construction of an accurate family of models based on large-scale training with diverse and incompletely labeled image and video data. |
Tasks | Motion Capture |
Published | 2020-03-23 |
URL | https://arxiv.org/abs/2003.10350v1 |
https://arxiv.org/pdf/2003.10350v1.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-3d-human-pose-and-shape |
Repo | |
Framework | |
DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures
Title | DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures |
Authors | Yang Zhao, Chaojian Li, Yue Wang, Pengfei Xu, Yongan Zhang, Yingyan Lin |
Abstract | The recent breakthroughs in deep neural networks (DNNs) have spurred a tremendously increased demand for DNN accelerators. However, designing DNN accelerators is non-trivial as it often takes months/years and requires cross-disciplinary knowledge. To enable fast and effective DNN accelerator development, we propose DNN-Chip Predictor, an analytical performance predictor which can accurately predict DNN accelerators’ energy, throughput, and latency prior to their actual implementation. Our Predictor features two highlights: (1) its analytical performance formulation of DNN ASIC/FPGA accelerators facilitates fast design space exploration and optimization; and (2) it supports DNN accelerators with different algorithm-to-hardware mapping methods (i.e., dataflows) and hardware architectures. Experimental results based on 2 DNN models and 3 different ASIC/FPGA implementations show that our DNN-Chip Predictor’s predicted performance differs from chip measurements of FPGA/ASIC implementations by no more than 17.66% when using different DNN models, hardware architectures, and dataflows. We will release code upon acceptance. |
Tasks | |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11270v1 |
https://arxiv.org/pdf/2002.11270v1.pdf | |
PWC | https://paperswithcode.com/paper/dnn-chip-predictor-an-analytical-performance |
Repo | |
Framework | |
Mixed Integer Programming for Searching Maximum Quasi-Bicliques
Title | Mixed Integer Programming for Searching Maximum Quasi-Bicliques |
Authors | Dmitry I. Ignatov, Polina Ivanova, Albina Zamaletdinova |
Abstract | This paper is related to the problem of finding the maximal quasi-bicliques in a bipartite graph (bigraph). A quasi-biclique in the bigraph is its “almost” complete subgraph. The relaxation of completeness can be understood variously; here, we assume that the subgraph is a $\gamma$-quasi-biclique if it lacks a certain number of edges to form a biclique such that its density is at least $\gamma \in (0,1]$. For a bigraph and fixed $\gamma$, the problem of searching for the maximal quasi-biclique consists of finding a subset of vertices of the bigraph such that the induced subgraph is a quasi-biclique and its size is maximal for a given graph. Several models based on Mixed Integer Programming (MIP) to search for a quasi-biclique are proposed and tested for working efficiency. An alternative model inspired by biclustering is formulated and tested; this model simultaneously maximizes both the size of the quasi-biclique and its density, using the least-square criterion similar to the one exploited by triclustering TriBox. |
Tasks | |
Published | 2020-02-23 |
URL | https://arxiv.org/abs/2002.09880v1 |
https://arxiv.org/pdf/2002.09880v1.pdf | |
PWC | https://paperswithcode.com/paper/mixed-integer-programming-for-searching |
Repo | |
Framework | |
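The density condition in the abstract above can be illustrated with a short sketch. It assumes density means the number of edges present divided by the number of possible left-right pairs, which the abstract does not spell out. The names `density`, `is_quasi_biclique`, and `EDGES` are hypothetical, and this is only a membership check, not the paper's MIP models.

```python
from itertools import product


def density(edges, left, right):
    """Fraction of possible left-right pairs that are actual edges.

    Assumption: density = |E(left, right)| / (|left| * |right|).
    """
    left, right = set(left), set(right)
    if not left or not right:
        return 0.0
    present = sum((u, v) in edges for u, v in product(left, right))
    return present / (len(left) * len(right))


def is_quasi_biclique(edges, left, right, gamma):
    """Check whether the induced subgraph has density at least gamma."""
    if not 0 < gamma <= 1:
        raise ValueError("gamma must lie in (0, 1]")
    return density(edges, left, right) >= gamma


# Toy bigraph (hypothetical data): left vertices are letters, right are integers.
EDGES = {("a", 1), ("a", 2), ("b", 1), ("b", 2), ("c", 1)}

if __name__ == "__main__":
    print(density(EDGES, {"a", "b", "c"}, {1, 2}))
    print(is_quasi_biclique(EDGES, {"a", "b", "c"}, {1, 2}, gamma=0.8))
```

With `gamma = 1` the check reduces to an ordinary biclique test, since every left-right pair must then be an edge.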
Cortical surface parcellation based on intra-subject white matter fiber clustering
Title | Cortical surface parcellation based on intra-subject white matter fiber clustering |
Authors | Narciso López-López, Andrea Vázquez, Cyril Poupon, Jean-François Mangin, Pamela Guevara |
Abstract | We present a hybrid method that performs the complete parcellation of the cerebral cortex of an individual, based on the connectivity information of the white matter fibers from a whole-brain tractography dataset. The method consists of five steps. First, intra-subject clustering is performed on the brain tractography. The fibers that make up each cluster are then intersected with the cortical mesh and then filtered to discard outliers. In addition, the method efficiently resolves the overlap between the different intersection regions (sub-parcels) throughout the cortex. Finally, post-processing is done to achieve more uniform sub-parcels. The output is the complete labeling of cortical mesh vertices, representing the different cortex sub-parcels, with strong connections to other sub-parcels. We evaluated our method with measures of brain connectivity such as functional segregation (clustering coefficient), functional integration (characteristic path length) and small-world. Results on five subjects from the ARCHI database show a good individual cortical parcellation for each one, composed of about 200 sub-parcels per hemisphere and complying with these connectivity measures. |
Tasks | |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.09034v1 |
https://arxiv.org/pdf/2002.09034v1.pdf | |
PWC | https://paperswithcode.com/paper/cortical-surface-parcellation-based-on-intra |
Repo | |
Framework | |
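The two graph measures named in the abstract above, clustering coefficient and characteristic path length, can be sketched for a small graph. This assumes an unweighted, undirected graph stored as a dictionary of neighbour sets. The paper's connectivity graphs may be weighted and its exact definitions may differ, so the function names and toy graph below are illustrative only.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj):
    """Average local clustering coefficient of an undirected graph.

    Nodes with fewer than two neighbours contribute 0 to the average.
    """
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist between pairs of neighbours.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all ordered reachable pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Toy graph (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to 2.
GRAPH = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

if __name__ == "__main__":
    print(clustering_coefficient(GRAPH))
    print(characteristic_path_length(GRAPH))
```

High clustering together with a short characteristic path length is the usual informal signature of a small-world graph.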
Autonomous discovery in the chemical sciences part II: Outlook
Title | Autonomous discovery in the chemical sciences part II: Outlook |
Authors | Connor W. Coley, Natalie S. Eyke, Klavs F. Jensen |
Abstract | This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences. In this second part, we reflect on a selection of exemplary studies. It is increasingly important to articulate what the role of automation and computation has been in the scientific process and how that has or has not accelerated discovery. One can argue that even the best automated systems have yet to “discover” despite being incredibly useful as laboratory assistants. We must carefully consider how they have been and can be applied to future problems of chemical discovery in order to effectively design and interact with future autonomous platforms. The majority of this article defines a large set of open research directions, including improving our ability to work with complex data, build empirical models, automate both physical and computational experiments for validation, select experiments, and evaluate whether we are making progress toward the ultimate goal of autonomous discovery. Addressing these practical and methodological challenges will greatly advance the extent to which autonomous systems can make meaningful discoveries. |
Tasks | |
Published | 2020-03-30 |
URL | https://arxiv.org/abs/2003.13755v1 |
https://arxiv.org/pdf/2003.13755v1.pdf | |
PWC | https://paperswithcode.com/paper/autonomous-discovery-in-the-chemical-sciences |
Repo | |
Framework | |
On the generalization of bayesian deep nets for multi-class classification
Title | On the generalization of bayesian deep nets for multi-class classification |
Authors | Yossi Adi, Yaniv Nemcovsky, Alex Schwing, Tamir Hazan |
Abstract | Generalization bounds which assess the difference between the true risk and the empirical risk have been studied extensively. However, to obtain bounds, current techniques use strict assumptions such as a uniformly bounded or a Lipschitz loss function. To avoid these assumptions, in this paper, we propose a new generalization bound for Bayesian deep nets by exploiting the contractivity of the Log-Sobolev inequalities. Using these inequalities adds an additional loss-gradient norm term to the generalization bound, which is intuitively a surrogate of the model complexity. Empirically, we analyze the effect of this loss-gradient norm term using different deep nets. |
Tasks | |
Published | 2020-02-23 |
URL | https://arxiv.org/abs/2002.09866v1 |
https://arxiv.org/pdf/2002.09866v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-generalization-of-bayesian-deep-nets |
Repo | |
Framework | |
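For readers unfamiliar with the quantities named in the abstract above, the standard definitions of true risk, empirical risk, and their difference can be written as follows. The symbols are assumptions introduced here for illustration: $w$ for the parameters, $\ell$ for the loss, $D$ for the data distribution, and $S = \{(x_i, y_i)\}_{i=1}^{m}$ for the training sample. The paper's own notation and its bound are not reproduced.

```latex
R(w) = \mathbb{E}_{(x,y)\sim D}\big[\ell(w; x, y)\big], \qquad
\hat{R}_S(w) = \frac{1}{m}\sum_{i=1}^{m} \ell(w; x_i, y_i), \qquad
\mathrm{gap}(w) = R(w) - \hat{R}_S(w).
```

A generalization bound is an upper bound on this gap that holds with high probability over the draw of $S$.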
Image Entropy for Classification and Analysis of Pathology Slides
Title | Image Entropy for Classification and Analysis of Pathology Slides |
Authors | Steven J. Frank |
Abstract | Pathology slides of lung malignancies are classified using the “Salient Slices” technique described in Frank et al., 2020. A four-fold cross-validation study using a small image set (42 adenocarcinoma slides and 42 squamous cell carcinoma slides) produced fully correct classifications in each fold. Probability maps enable visualization of the underlying basis for a classification. |
Tasks | |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.07621v1 |
https://arxiv.org/pdf/2002.07621v1.pdf | |
PWC | https://paperswithcode.com/paper/image-entropy-for-classification-and-analysis |
Repo | |
Framework | |
Image Hashing by Minimizing Independent Relaxed Wasserstein Distance
Title | Image Hashing by Minimizing Independent Relaxed Wasserstein Distance |
Authors | Khoa D. Doan, Amir Kimiyaie, Saurav Manchanda, Chandan K. Reddy |
Abstract | Image hashing is a fundamental problem in the computer vision domain with various challenges, primarily, in terms of efficiency and effectiveness. Existing hashing methods lack a principled characterization of the goodness of the hash codes and a principled approach to learn the discrete hash functions that are being optimized in the continuous space. Adversarial autoencoders are shown to be able to implicitly learn a robust hash function that generates hash codes which are balanced and have low-quantization error. However, the existing adversarial autoencoders for hashing are too inefficient to be employed for large-scale image retrieval applications because of the minmax optimization procedure. In this paper, we propose an Independent Relaxed Wasserstein Autoencoder, which presents a novel, efficient hashing method that can implicitly learn the optimal hash function by directly training the adversarial autoencoder without any discriminator/critic. Our method is an order-of-magnitude more efficient and has a much lower sample complexity than the Optimal Transport formulation of the Wasserstein distance. The proposed method outperforms the current state-of-the-art image hashing methods for the retrieval task on several prominent image collections. |
Tasks | Image Retrieval, Quantization |
Published | 2020-02-29 |
URL | https://arxiv.org/abs/2003.00134v2 |
https://arxiv.org/pdf/2003.00134v2.pdf | |
PWC | https://paperswithcode.com/paper/image-hashing-by-minimizing-independent |
Repo | |
Framework | |
Cross-modal Learning for Multi-modal Video Categorization
Title | Cross-modal Learning for Multi-modal Video Categorization |
Authors | Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee |
Abstract | Multi-modal machine learning (ML) models can process data in multiple modalities (e.g., video, audio, text) and are useful for video content analysis in a variety of problems (e.g., object detection, scene understanding, activity recognition). In this paper, we focus on the problem of video categorization using a multi-modal ML technique. In particular, we have developed a novel multi-modal ML approach that we call “cross-modal learning”, where one modality influences another but only when there is correlation between the modalities — for that, we first train a correlation tower that guides the main multi-modal video categorization tower in the model. We show how this cross-modal principle can be applied to different types of models (e.g., RNN, Transformer, NetVLAD), and demonstrate through experiments how our proposed multi-modal video categorization models with cross-modal learning out-perform strong state-of-the-art baseline models. |
Tasks | Activity Recognition, Object Detection, Scene Understanding |
Published | 2020-03-07 |
URL | https://arxiv.org/abs/2003.03501v2 |
https://arxiv.org/pdf/2003.03501v2.pdf | |
PWC | https://paperswithcode.com/paper/cross-modal-learning-for-multi-modal-video |
Repo | |
Framework | |