Paper Group ANR 176
Look, Listen, and Act: Towards Audio-Visual Embodied Navigation
Title | Look, Listen, and Act: Towards Audio-Visual Embodied Navigation |
Authors | Chuang Gan, Yiwei Zhang, Jiajun Wu, Boqing Gong, Joshua B. Tenenbaum |
Abstract | A crucial ability of mobile intelligent agents is to integrate the evidence from multiple sensory inputs in an environment and to make a sequence of actions to reach their goals. In this paper, we attempt to approach the problem of Audio-Visual Embodied Navigation, the task of planning the shortest path from a random starting location in a scene to the sound source in an indoor environment, given only raw egocentric visual and audio sensory data. To accomplish this task, the agent is required to learn from various modalities, i.e. relating the audio signal to the visual environment. Here we describe an approach to audio-visual embodied navigation that takes advantage of both visual and audio pieces of evidence. Our solution is based on three key ideas: a visual perception mapper module that constructs its spatial memory of the environment, a sound perception module that infers the relative location of the sound source from the agent, and a dynamic path planner that plans a sequence of actions based on the audio-visual observations and the spatial memory of the environment to navigate toward the goal. Experimental results on a newly collected Visual-Audio-Room dataset using the simulated multi-modal environment demonstrate the effectiveness of our approach over several competitive baselines. |
Tasks | |
Published | 2019-12-25 |
URL | https://arxiv.org/abs/1912.11684v2 |
https://arxiv.org/pdf/1912.11684v2.pdf | |
PWC | https://paperswithcode.com/paper/look-listen-and-act-towards-audio-visual |
Repo | |
Framework | |
Modeling Human Motion with Quaternion-based Neural Networks
Title | Modeling Human Motion with Quaternion-based Neural Networks |
Authors | Dario Pavllo, Christoph Feichtenhofer, Michael Auli, David Grangier |
Abstract | Previous work on predicting or generating 3D human pose sequences regresses either joint rotations or joint positions. The former strategy is prone to error accumulation along the kinematic chain, as well as discontinuities when using Euler angles or exponential maps as parameterizations. The latter requires re-projection onto skeleton constraints to avoid bone stretching and invalid configurations. This work addresses both limitations. QuaterNet represents rotations with quaternions and our loss function performs forward kinematics on a skeleton to penalize absolute position errors instead of angle errors. We investigate both recurrent and convolutional architectures and evaluate on short-term prediction and long-term generation. For the latter, our approach is qualitatively judged as realistic as recent neural strategies from the graphics literature. Our experiments compare quaternions to Euler angles as well as exponential maps and show that only a very short context is required to make reliable future predictions. Finally, we show that the standard evaluation protocol for Human3.6M produces high variance results and we propose a simple solution. |
Tasks | |
Published | 2019-01-21 |
URL | https://arxiv.org/abs/1901.07677v2 |
https://arxiv.org/pdf/1901.07677v2.pdf | |
PWC | https://paperswithcode.com/paper/modeling-human-motion-with-quaternion-based |
Repo | |
Framework | |
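The QuaterNet abstract above describes a loss that runs forward kinematics on a skeleton and penalizes absolute joint position errors rather than angle errors. A minimal pure-Python sketch of that idea follows. The function names, the `(w, x, y, z)` quaternion layout, and the toy three-joint chain are assumptions made for illustration, not the authors' implementation.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def quat_rotate(q, v):
    """Rotate the 3D vector v by the unit quaternion q via q * (0, v) * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = quat_mul(quat_mul(q, (0.0, *v)), conj)
    return (rx, ry, rz)

def forward_kinematics(rotations, offsets, parents):
    """Convert local joint rotations into absolute joint positions.

    parents[i] is the parent index of joint i (-1 for the root); joints are
    assumed to be ordered so that every parent precedes its children.
    """
    world_rot, world_pos = [], []
    for i, p in enumerate(parents):
        if p < 0:
            world_rot.append(rotations[i])
            world_pos.append(tuple(offsets[i]))
        else:
            # The bone offset is expressed in the parent's frame.
            rotated = quat_rotate(world_rot[p], offsets[i])
            world_pos.append(tuple(a + b for a, b in zip(world_pos[p], rotated)))
            world_rot.append(quat_mul(world_rot[p], rotations[i]))
    return world_pos

def position_loss(pred_rot, target_rot, offsets, parents):
    """Mean Euclidean distance between predicted and target joint positions."""
    pred = forward_kinematics(pred_rot, offsets, parents)
    target = forward_kinematics(target_rot, offsets, parents)
    return sum(math.dist(a, b) for a, b in zip(pred, target)) / len(pred)
```

On a straight chain with unit-length bones along x, turning the root by 90 degrees about z moves the two child joints by sqrt(2) and 2*sqrt(2), so the loss against the rest pose is sqrt(2). The same angular error at the root therefore costs more than it would at a leaf joint, which is the error-accumulation effect an angle-based loss does not capture.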
Two Way Adversarial Unsupervised Word Translation
Title | Two Way Adversarial Unsupervised Word Translation |
Authors | Blaine Cole |
Abstract | Word translation is a problem in machine translation that seeks to build models that recover word level correspondence between languages. Recent approaches to this problem have shown that word translation models can be learned with very small seeding dictionaries, and even without any starting supervision. In this paper we propose a method to jointly find translations between a pair of languages. Not only does our method learn translations in both directions but it improves accuracy of those translations over past methods. |
Tasks | Machine Translation |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.10168v1 |
https://arxiv.org/pdf/1912.10168v1.pdf | |
PWC | https://paperswithcode.com/paper/two-way-adversarial-unsupervised-word |
Repo | |
Framework | |
Meta-Learning of Neural Architectures for Few-Shot Learning
Title | Meta-Learning of Neural Architectures for Few-Shot Learning |
Authors | Thomas Elsken, Benedikt Staffler, Jan Hendrik Metzen, Frank Hutter |
Abstract | The recent progress in neural architecture search (NAS) has allowed scaling the automated design of neural architectures to real-world domains such as object detection and semantic segmentation. However, one prerequisite for applying NAS is access to large amounts of labeled data and compute resources. This renders its application challenging in few-shot learning scenarios, where many related tasks need to be learned, each with limited amounts of data and compute time. Thus, few-shot learning is typically done with a fixed neural architecture. To improve upon this, we propose MetaNAS, the first method which fully integrates NAS with gradient-based meta-learning. MetaNAS optimizes a meta-architecture along with the meta-weights during meta-training. During meta-testing, architectures can be adapted to a novel task with a few steps of the task optimizer; that is, task adaptation becomes computationally cheap and requires only little data per task. Moreover, MetaNAS is agnostic in that it can be used with arbitrary model-agnostic meta-learning algorithms and arbitrary gradient-based NAS methods. Empirical results on standard few-shot classification benchmarks show that MetaNAS with a combination of DARTS and REPTILE yields state-of-the-art results. |
Tasks | Few-Shot Learning, Meta-Learning, Object Detection, Semantic Segmentation |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.11090v1 |
https://arxiv.org/pdf/1911.11090v1.pdf | |
PWC | https://paperswithcode.com/paper/meta-learning-of-neural-architectures-for-few |
Repo | |
Framework | |
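The MetaNAS abstract above names REPTILE as one of the meta-learning algorithms it combines with NAS. The sketch below shows only the generic Reptile update (adapt a copy of the meta-parameters to each task with a few SGD steps, then move the meta-parameters toward the adapted copies) on a toy quadratic task family. The flat parameter list stands in for both meta-weights and architecture parameters; that simplification, the function names, and the hyperparameter values are all assumptions for illustration and are not taken from the paper.

```python
def inner_sgd(params, grad_fn, steps, lr):
    """Adapt a copy of the parameters to one task with plain gradient descent."""
    params = list(params)
    for _ in range(steps):
        grads = grad_fn(params)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

def reptile_step(meta_params, task_grad_fns, inner_steps=5, inner_lr=0.1, meta_lr=0.5):
    """One Reptile meta-update: interpolate toward the mean task-adapted parameters."""
    adapted = [
        inner_sgd(meta_params, grad_fn, inner_steps, inner_lr)
        for grad_fn in task_grad_fns
    ]
    n = len(adapted)
    return [
        m + meta_lr * (sum(a[i] for a in adapted) / n - m)
        for i, m in enumerate(meta_params)
    ]

def make_quadratic_task(center):
    """Hypothetical toy task: loss = sum((p - c)^2), so the gradient is 2 * (p - c)."""
    return lambda params: [2.0 * (p - c) for p, c in zip(params, center)]

# Two toy tasks whose optima sit at 1.0 and 3.0.
tasks = [make_quadratic_task([1.0]), make_quadratic_task([3.0])]
meta = [0.0]
for _ in range(50):
    meta = reptile_step(meta, tasks)
```

On this toy family the meta-parameters settle midway between the two task optima, an initialization from which either task is reachable in a few inner steps.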
HRGE-Net: Hierarchical Relational Graph Embedding Network for Multi-view 3D Shape Recognition
Title | HRGE-Net: Hierarchical Relational Graph Embedding Network for Multi-view 3D Shape Recognition |
Authors | Xin Wei, Ruixuan Yu, Jian Sun |
Abstract | View-based approaches, which recognize a 3D shape through its projected 2D images, have achieved state-of-the-art performance for 3D shape recognition. One essential challenge for view-based approaches is how to aggregate the multi-view features extracted from 2D images into a global 3D shape descriptor. In this work, we propose a novel feature aggregation network by fully investigating the relations among views. We construct a relational graph with multi-view images as nodes, and design relational graph embedding by modeling pairwise and neighboring relations among views. By gradually coarsening the graph, we build a hierarchical relational graph embedding network (HRGE-Net) to aggregate the multi-view features into a global shape descriptor. Extensive experiments show that HRGE-Net achieves state-of-the-art performance for 3D shape classification and retrieval on benchmark datasets. |
Tasks | 3D Shape Recognition, Graph Embedding |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10098v1 |
https://arxiv.org/pdf/1908.10098v1.pdf | |
PWC | https://paperswithcode.com/paper/hrge-net-hierarchical-relational-graph |
Repo | |
Framework | |
SGD on Neural Networks Learns Functions of Increasing Complexity
Title | SGD on Neural Networks Learns Functions of Increasing Complexity |
Authors | Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak |
Abstract | We perform an experimental study of the dynamics of Stochastic Gradient Descent (SGD) in learning deep neural networks for several real and synthetic classification tasks. We show that in the initial epochs, almost all of the performance improvement of the classifier obtained by SGD can be explained by a linear classifier. More generally, we give evidence for the hypothesis that, as iterations progress, SGD learns functions of increasing complexity. This hypothesis can be helpful in explaining why SGD-learned classifiers tend to generalize well even in the over-parameterized regime. We also show that the linear classifier learned in the initial stages is “retained” throughout the execution even if training is continued to the point of zero training error, and complement this with a theoretical result in a simplified model. Key to our work is a new measure of how well one classifier explains the performance of another, based on conditional mutual information. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11604v1 |
https://arxiv.org/pdf/1905.11604v1.pdf | |
PWC | https://paperswithcode.com/paper/sgd-on-neural-networks-learns-functions-of |
Repo | |
Framework | |
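The abstract above says its measure of how well one classifier explains another is based on conditional mutual information. The sketch below computes only the standard plug-in estimate of I(F; Y | L) from discrete samples, as a building block. It is not claimed to be the paper's exact measure, and the names `f`, `y`, and `l` (predictions of the network, true labels, and predictions of a simpler classifier) are assumptions for illustration.

```python
from collections import Counter
import math

def conditional_mutual_information(f, y, l):
    """Plug-in estimate of I(F; Y | L) in bits from three aligned label sequences.

    Uses I(F; Y | L) = sum over (f, y, l) of
        p(f, y, l) * log2( p(l) * p(f, y, l) / (p(f, l) * p(y, l)) ),
    with every probability replaced by its empirical frequency.
    """
    n = len(f)
    c_fyl = Counter(zip(f, y, l))
    c_fl = Counter(zip(f, l))
    c_yl = Counter(zip(y, l))
    c_l = Counter(l)
    total = 0.0
    for (fi, yi, li), c in c_fyl.items():
        # The 1/n normalizers cancel inside the logarithm, so raw counts suffice.
        ratio = c * c_l[li] / (c_fl[(fi, li)] * c_yl[(yi, li)])
        total += (c / n) * math.log2(ratio)
    return total
```

If `f` and `l` are identical sequences the estimate is 0 bits: once L is known, F carries no further information about Y. If `l` is constant, the estimate reduces to the ordinary mutual information between `f` and `y`.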
A COLD Approach to Generating Optimal Samples
Title | A COLD Approach to Generating Optimal Samples |
Authors | Omar Mahmood, José Miguel Hernández-Lobato |
Abstract | Optimising discrete data for a desired characteristic using gradient-based methods involves projecting the data into a continuous latent space and carrying out optimisation in this space. Carrying out global optimisation is difficult as optimisers are likely to follow gradients into regions of the latent space that the model has not been exposed to during training; samples generated from these regions are likely to be too dissimilar to the training data to be useful. We propose Constrained Optimisation with Latent Distributions (COLD), a constrained global optimisation procedure to find samples with high values of a desired property that are similar to yet distinct from the training data. We find that on MNIST, our procedure yields optima for each of three different objectives, and that enforcing tighter constraints improves the quality and increases the diversity of the generated images. On the ChEMBL molecular dataset, our method generates a diverse set of new molecules with drug-likeness scores similar to those of the highest-scoring molecules in the training data. We also demonstrate a computationally efficient way to approximate the constraint when evaluating it exactly is computationally expensive. |
Tasks | |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09885v1 |
https://arxiv.org/pdf/1905.09885v1.pdf | |
PWC | https://paperswithcode.com/paper/a-cold-approach-to-generating-optimal-samples |
Repo | |
Framework | |
Pixelation is NOT Done in Videos Yet
Title | Pixelation is NOT Done in Videos Yet |
Authors | Jizhe Zhou, Chi-Man Pun, YingYu Wang |
Abstract | This paper introduces an algorithm to protect the privacy of individuals in streaming video data by blurring faces such that faces cannot be reliably recognized. This thwarts any possible face recognition, but because all facial details are obscured, the result is of limited use. We propose a new clustering algorithm to create raw trajectories for detected faces. Associating faces across frames to form trajectories, it auto-generates cluster number and discovers new clusters through deep feature and position aggregated affinities. We introduce a Gaussian Process to refine the raw trajectories. We conducted an online experiment with 47 participants to evaluate the effectiveness of face blurring compared to the original photo (as-is), and users’ experience (satisfaction, information sufficiency, enjoyment, social presence, and filter likeability). |
Tasks | Face Detection, Face Recognition |
Published | 2019-03-26 |
URL | http://arxiv.org/abs/1903.10836v3 |
http://arxiv.org/pdf/1903.10836v3.pdf | |
PWC | https://paperswithcode.com/paper/personal-privacy-filtering-via-face |
Repo | |
Framework | |
Dectecting Invasive Ductal Carcinoma with Semi-Supervised Conditional GANs
Title | Dectecting Invasive Ductal Carcinoma with Semi-Supervised Conditional GANs |
Authors | Jeremiah W. Johnson |
Abstract | Invasive ductal carcinoma (IDC) comprises nearly 80% of all breast cancers. The detection of IDC is a necessary preprocessing step in determining the aggressiveness of the cancer, determining treatment protocols, and predicting patient outcomes, and is usually performed manually by an expert pathologist. Here, we describe a novel algorithm for automatically detecting IDC using semi-supervised conditional generative adversarial networks (cGANs). The framework is simple and effective at improving scores on a range of metrics over a baseline CNN. |
Tasks | Predicting Patient Outcomes |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.06216v1 |
https://arxiv.org/pdf/1911.06216v1.pdf | |
PWC | https://paperswithcode.com/paper/dectecting-invasive-ductal-carcinoma-with |
Repo | |
Framework | |
A Survey of Deep Learning Applications to Autonomous Vehicle Control
Title | A Survey of Deep Learning Applications to Autonomous Vehicle Control |
Authors | Sampo Kuutti, Richard Bowden, Yaochu Jin, Phil Barber, Saber Fallah |
Abstract | Designing a controller for autonomous vehicles capable of providing adequate performance in all driving scenarios is challenging due to the highly complex environment and inability to test the system in the wide variety of scenarios which it may encounter after deployment. However, deep learning methods have shown great promise in not only providing excellent performance for complex and non-linear control problems, but also in generalising previously learned rules to new scenarios. For these reasons, the use of deep learning for vehicle control is becoming increasingly popular. Although important advancements have been achieved in this field, these works have not been fully summarised. This paper surveys a wide range of research works reported in the literature which aim to control a vehicle through deep learning methods. Although there exists overlap between control and perception, the focus of this paper is on vehicle control, rather than the wider perception problem which includes tasks such as semantic segmentation and object detection. The paper identifies the strengths and limitations of available deep learning methods through comparative analysis and discusses the research challenges in terms of computation, architecture selection, goal specification, generalisation, verification and validation, as well as safety. Overall, this survey brings timely and topical information to a rapidly evolving field relevant to intelligent transportation systems. |
Tasks | Autonomous Vehicles, Object Detection, Semantic Segmentation |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.10773v1 |
https://arxiv.org/pdf/1912.10773v1.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-of-deep-learning-applications-to |
Repo | |
Framework | |
Modeling of negative protein-protein interactions: methods and experiments
Title | Modeling of negative protein-protein interactions: methods and experiments |
Authors | Andrea Moscatelli |
Abstract | Protein-protein interactions (PPIs) are of fundamental importance for the human body, and the knowledge of their existence can facilitate very important tasks like drug target development and therapy design. The high-throughput experiments for detecting new PPIs are costly and time-consuming, stressing the need for new computational systems able to generate high-quality PPI predictions. These systems have to face two main problems: the high incompleteness of the human interactome and the lack of high-quality negative protein-protein interactions (i.e. proteins that are known to not interact). The latter is usually overlooked by PPI prediction systems, causing a significant bias in the performances and metrics. In this work, we compare methods for simulating negative knowledge using highly reliable training and test sets. Moreover, we measure the performances of two state-of-the-art systems when very reliable settings are adopted. |
Tasks | |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04709v1 |
https://arxiv.org/pdf/1910.04709v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-of-negative-protein-protein |
Repo | |
Framework | |
Robust Training with Ensemble Consensus
Title | Robust Training with Ensemble Consensus |
Authors | Jisoo Lee, Sae-Young Chung |
Abstract | Since deep neural networks are over-parameterized, they can memorize noisy examples. We address this memorization issue in the presence of annotation noise. From the fact that deep neural networks cannot generalize neighborhoods of the features acquired via memorization, we hypothesize that noisy examples do not consistently incur small losses on the network under a certain perturbation. Based on this, we propose a novel training method called Learning with Ensemble Consensus (LEC) that prevents overfitting noisy examples by eliminating them using the consensus of an ensemble of perturbed networks. One of the proposed LECs, LTEC, outperforms the current state-of-the-art methods on noisy MNIST, CIFAR-10, and CIFAR-100 in an efficient manner. |
Tasks | |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.09792v2 |
https://arxiv.org/pdf/1910.09792v2.pdf | |
PWC | https://paperswithcode.com/paper/robust-training-with-ensemble-consensus |
Repo | |
Framework | |
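The LEC abstract above hypothesizes that noisy examples do not consistently incur small losses across perturbed networks, and it eliminates them by ensemble consensus. One way to picture that selection step is sketched below: take each perturbed network's small-loss set and keep only the examples that appear in all of them. The fixed keep fraction, the intersection rule, and the function names are assumptions for illustration; the abstract does not specify how the perturbations or thresholds are chosen.

```python
def small_loss_indices(losses, keep_fraction):
    """Indices of the examples with the smallest losses for one network."""
    k = max(1, int(len(losses) * keep_fraction))
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return set(order[:k])

def consensus_filter(ensemble_losses, keep_fraction=0.5):
    """Keep examples that fall in the small-loss set of every perturbed network.

    ensemble_losses[m][i] is the loss of example i under perturbed network m.
    """
    per_network = [small_loss_indices(l, keep_fraction) for l in ensemble_losses]
    return sorted(set.intersection(*per_network))

# Hypothetical per-example losses under three perturbed copies of a network.
ensemble_losses = [
    [0.10, 0.05, 0.15, 2.00],
    [0.12, 1.50, 0.10, 1.80],
    [0.20, 0.30, 0.10, 0.25],
]
kept = consensus_filter(ensemble_losses, keep_fraction=0.75)
```

In this toy data, example 1 has the smallest loss under the first network but a large loss under the second, and example 3 looks clean only under the third network, so neither survives the intersection. A single-network small-loss filter would have kept one or the other.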
A greedy anytime algorithm for sparse PCA
Title | A greedy anytime algorithm for sparse PCA |
Authors | Guy Holtzman, Adam Soffer, Dan Vilenchik |
Abstract | The taxing computational effort that is involved in solving some high-dimensional statistical problems, in particular problems involving non-convex optimization, has popularized the development and analysis of algorithms that run efficiently (polynomial-time) but with no general guarantee on statistical consistency. In light of the ever-increasing compute power and decreasing costs, a more useful characterization of algorithms is by their ability to calibrate the invested computational effort with various characteristics of the input at hand and with the available computational resources. For example, design an algorithm that always guarantees statistical consistency of its output by increasing the running time as the SNR weakens. We propose a new greedy algorithm for the $\ell_0$-sparse PCA problem which supports the calibration principle. We provide both a rigorous analysis of our algorithm in the spiked covariance model, as well as simulation results and comparison with other existing methods. Our findings show that our algorithm recovers the spike in SNR regimes where all polynomial-time algorithms fail while running in a reasonable parallel-time on a cluster. |
Tasks | Calibration |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06846v5 |
https://arxiv.org/pdf/1910.06846v5.pdf | |
PWC | https://paperswithcode.com/paper/a-greedy-anytime-algorithm-for-sparse-pca |
Repo | |
Framework | |
Deep Reinforcement Learning based Adaptive Moving Target Defense
Title | Deep Reinforcement Learning based Adaptive Moving Target Defense |
Authors | Taha Eghtesad, Yevgeniy Vorobeychik, Aron Laszka |
Abstract | Moving target defense (MTD) is a proactive defense approach that aims to thwart attacks by continuously changing the attack surface of a system (e.g., changing host or network configurations), thereby increasing the adversary’s uncertainty and attack cost. To maximize the impact of MTD, a defender must strategically choose when and what changes to make, taking into account both the characteristics of its system as well as the adversary’s observed activities. Finding an optimal strategy for MTD presents a significant challenge, especially when facing a resourceful and determined adversary who may respond to the defender’s actions. In this paper, we propose finding optimal MTD strategies using deep reinforcement learning. Based on an established model of adaptive MTD, we formulate finding an MTD strategy as finding a policy for a partially-observable Markov decision process. To significantly improve training performance, we introduce compact memory representations. To demonstrate our approach, we provide thorough numerical results, showing significant improvement over existing strategies. |
Tasks | |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.11972v1 |
https://arxiv.org/pdf/1911.11972v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-based-adaptive |
Repo | |
Framework | |
DeepCF: A Unified Framework of Representation Learning and Matching Function Learning in Recommender System
Title | DeepCF: A Unified Framework of Representation Learning and Matching Function Learning in Recommender System |
Authors | Zhi-Hong Deng, Ling Huang, Chang-Dong Wang, Jian-Huang Lai, Philip S. Yu |
Abstract | In general, recommendation can be viewed as a matching problem, i.e., match proper items for proper users. However, due to the huge semantic gap between users and items, it’s almost impossible to directly match users and items in their initial representation spaces. To solve this problem, many methods have been studied, which can be generally categorized into two types, i.e., representation learning-based CF methods and matching function learning-based CF methods. Representation learning-based CF methods try to map users and items into a common representation space. In this case, the higher similarity between a user and an item in that space implies they match better. Matching function learning-based CF methods try to directly learn the complex matching function that maps user-item pairs to matching scores. Although both methods are well developed, they suffer from two fundamental flaws, i.e., the limited expressiveness of dot product and the weakness in capturing low-rank relations respectively. To this end, we propose a general framework named DeepCF, short for Deep Collaborative Filtering, to combine the strengths of the two types of methods and overcome such flaws. Extensive experiments on four publicly available datasets demonstrate the effectiveness of the proposed DeepCF framework. |
Tasks | Recommendation Systems, Representation Learning |
Published | 2019-01-15 |
URL | http://arxiv.org/abs/1901.04704v1 |
http://arxiv.org/pdf/1901.04704v1.pdf | |
PWC | https://paperswithcode.com/paper/deepcf-a-unified-framework-of-representation |
Repo | |
Framework | |