February 2, 2020

Paper Group AWR 29


Meta-Curvature

Title Meta-Curvature
Authors Eunbyung Park, Junier B. Oliva
Abstract We propose meta-curvature (MC), a framework to learn curvature information for better generalization and fast model adaptation. MC expands on the model-agnostic meta-learner (MAML) by learning to transform the gradients in the inner optimization such that the transformed gradients achieve better generalization performance to a new task. For training large scale neural networks, we decompose the curvature matrix into smaller matrices in a novel scheme where we capture the dependencies of the model’s parameters with a series of tensor products. We demonstrate the effects of our proposed method on several few-shot learning tasks and datasets. Without any task specific techniques and architectures, the proposed method achieves substantial improvement upon previous MAML variants and outperforms the recent state-of-the-art methods. Furthermore, we observe faster convergence rates of the meta-training process. Finally, we present an analysis that explains better generalization performance with the meta-trained curvature.
Tasks Few-Shot Image Classification, Few-Shot Learning, Image Classification
Published 2019-02-09
URL https://arxiv.org/abs/1902.03356v3
PDF https://arxiv.org/pdf/1902.03356v3.pdf
PWC https://paperswithcode.com/paper/meta-curvature
Repo https://github.com/silverbottlep/meta_curvature
Framework none
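
The gradient transformation described in the abstract can be illustrated with a small sketch. It assumes a single weight matrix whose gradient is multiplied on each side by a fixed factor before the inner-loop step; the names `m_out` and `m_in` are hypothetical, the factors are fixed rather than meta-learned, and this two-factor form is a simplification of the tensor-product decomposition the abstract mentions, not the authors' implementation.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transformed_step(weights, grad, m_out, m_in, lr):
    """One inner-loop update: W <- W - lr * (m_out @ grad @ m_in).

    m_out and m_in stand in for meta-learned factors (hypothetical names);
    in meta-training they would be learned across tasks, here they are fixed.
    """
    precond = matmul(matmul(m_out, grad), m_in)
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, precond)]


identity2 = [[1.0, 0.0], [0.0, 1.0]]
W = [[1.0, 2.0], [3.0, 4.0]]
G = [[0.5, -0.5], [1.0, 0.0]]

# With identity factors the update is ordinary gradient descent.
plain = transformed_step(W, G, identity2, identity2, lr=0.1)

# A non-identity factor rescales gradient rows before the step.
scale_rows = [[2.0, 0.0], [0.0, 0.5]]
curved = transformed_step(W, G, scale_rows, identity2, lr=0.1)
```

Keeping two small factors instead of one matrix over all parameters is what makes this kind of transformation affordable for large layers.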

Analyzing machine-learned representations: A natural language case study

Title Analyzing machine-learned representations: A natural language case study
Authors Ishita Dasgupta, Demi Guo, Samuel J. Gershman, Noah D. Goodman
Abstract As modern deep networks become more complex, and get closer to human-like capabilities in certain domains, the question arises of how the representations and decision rules they learn compare to the ones in humans. In this work, we study representations of sentences in one such artificial system for natural language processing. We first present a diagnostic test dataset to examine the degree of abstract composable structure represented. Analyzing performance on these diagnostic tests indicates a lack of systematicity in the representations and decision rules, and reveals a set of heuristic strategies. We then investigate the effect of the training distribution on learning these heuristic strategies, and study changes in these representations with various augmentations to the training set. Our results reveal parallels to the analogous representations in people. We find that these systems can learn abstract rules and generalize them to new contexts under certain circumstances – similar to human zero-shot reasoning. However, we also note some shortcomings in this generalization behavior – similar to human judgment errors like belief bias. Studying these parallels suggests new ways to understand psychological phenomena in humans as well as informs best strategies for building artificial intelligence with human-like language understanding.
Published 2019-09-12
URL https://arxiv.org/abs/1909.05885v1
PDF https://arxiv.org/pdf/1909.05885v1.pdf
PWC https://paperswithcode.com/paper/analyzing-machine-learned-representations-a
Repo https://github.com/ishita-dg/ScrambleTests
Framework pytorch

Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions

Title Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions
Authors Matthew Faw, Rajat Sen, Karthikeyan Shanmugam, Constantine Caramanis, Sanjay Shakkottai
Abstract We consider a covariate shift problem where one has access to several different training datasets for the same learning problem and a small validation set which possibly differs from all the individual training distributions. This covariate shift is caused, in part, by unobserved features in the datasets. The objective, then, is to find the best mixture distribution over the training datasets (with only observed features) such that training a learning algorithm using this mixture has the best validation performance. Our proposed algorithm, Mix&Match, combines stochastic gradient descent (SGD) with optimistic tree search and model re-use (evolving partially trained models with samples from different mixture distributions) over the space of mixtures, for this task. We prove simple regret guarantees for our algorithm with respect to recovering the optimal mixture, given a total budget of SGD evaluations. Finally, we validate our algorithm on two real-world datasets.
Published 2019-07-23
URL https://arxiv.org/abs/1907.10154v4
PDF https://arxiv.org/pdf/1907.10154v4.pdf
PWC https://paperswithcode.com/paper/mix-and-match-an-optimistic-tree-search
Repo https://github.com/matthewfaw/mixnmatch-infrastructure
Framework none
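
As a rough illustration of what "training on a mixture over the training datasets" means, the sketch below draws an SGD batch whose examples come from several datasets in proportions given by mixture weights. The function name, the toy datasets and the weights are all hypothetical, and the tree search over mixtures described in the abstract is not shown.

```python
import random


def sample_mixture_batch(datasets, weights, batch_size, rng):
    """Draw a batch where each example's source dataset follows `weights`."""
    sources = rng.choices(range(len(datasets)), weights=weights, k=batch_size)
    return [rng.choice(datasets[i]) for i in sources]


rng = random.Random(0)
datasets = [["a1", "a2", "a3"], ["b1", "b2"], ["c1"]]  # toy stand-ins

# All mass on the second dataset: every drawn example comes from it.
only_b = sample_mixture_batch(datasets, [0.0, 1.0, 0.0], 8, rng)

# A mixed batch; a search procedure would score such mixtures on validation data.
mixed = sample_mixture_batch(datasets, [0.5, 0.25, 0.25], 8, rng)
```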

The Universal Decompositional Semantics Dataset and Decomp Toolkit

Title The Universal Decompositional Semantics Dataset and Decomp Toolkit
Authors Aaron Steven White, Elias Stengel-Eskin, Siddharth Vashishtha, Venkata Govindarajan, Dee Ann Reisinger, Tim Vieira, Keisuke Sakaguchi, Sheng Zhang, Francis Ferraro, Rachel Rudinger, Kyle Rawlins, Benjamin Van Durme
Abstract We present the Universal Decompositional Semantics (UDS) dataset (v1.0), which is bundled with the Decomp toolkit (v0.1). UDS1.0 unifies five high-quality, decompositional semantics-aligned annotation sets within a single semantic graph specification—with graph structures defined by the predicative patterns produced by the PredPatt tool and real-valued node and edge attributes constructed using sophisticated normalization procedures. The Decomp toolkit provides a suite of Python 3 tools for querying UDS graphs using SPARQL. Both UDS1.0 and Decomp0.1 are publicly available at http://decomp.io.
Published 2019-09-30
URL https://arxiv.org/abs/1909.13851v1
PDF https://arxiv.org/pdf/1909.13851v1.pdf
PWC https://paperswithcode.com/paper/the-universal-decompositional-semantics
Repo https://github.com/decompositional-semantics-initiative/decomp
Framework none

Compressed Indexes for Fast Search of Semantic Data

Title Compressed Indexes for Fast Search of Semantic Data
Authors Raffaele Perego, Giulio Ermanno Pibiri, Rossano Venturini
Abstract The sheer increase in volume of RDF data demands efficient solutions for the triple indexing problem, that is, devising a compressed data structure that compactly represents RDF triples while guaranteeing fast pattern matching operations. This problem lies at the heart of delivering good practical performance for the resolution of complex SPARQL queries on large RDF datasets. In this work, we propose a trie-based index layout to solve the problem and introduce two novel techniques to reduce its space of representation for improved effectiveness. The extensive experimental analysis conducted over a wide range of publicly available real-world datasets reveals that our best space/time trade-off configuration substantially outperforms existing state-of-the-art solutions, taking 30-60% less space and speeding up query execution by a factor of 2-81x.
Published 2019-04-16
URL https://arxiv.org/abs/1904.07619v3
PDF https://arxiv.org/pdf/1904.07619v3.pdf
PWC https://paperswithcode.com/paper/compressed-indexes-for-fast-search-of
Repo https://github.com/jermp/rdf_indexes
Framework none
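
To make the idea of a trie-based triple index concrete, here is a minimal, uncompressed sketch: a three-level trie over integer (subject, predicate, object) IDs with wildcard pattern matching. The class and method names are hypothetical, and none of the compression techniques from the paper are represented.

```python
from collections import defaultdict


class TripleTrie:
    """Three-level trie over (subject, predicate, object) IDs, uncompressed."""

    def __init__(self):
        # subject -> predicate -> set of objects
        self.root = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        self.root[s][p].add(o)

    def match(self, s=None, p=None, o=None):
        """Yield triples matching the pattern; None acts as a wildcard."""
        subjects = [s] if s is not None else sorted(self.root)
        for s_id in subjects:
            preds = self.root.get(s_id, {})
            pred_ids = [p] if p is not None else sorted(preds)
            for p_id in pred_ids:
                for o_id in sorted(preds.get(p_id, ())):
                    if o is None or o_id == o:
                        yield (s_id, p_id, o_id)


index = TripleTrie()
for triple in [(1, 10, 100), (1, 10, 101), (1, 11, 100), (2, 10, 100)]:
    index.add(*triple)

sp_matches = list(index.match(s=1, p=10))  # subject and predicate bound
o_matches = list(index.match(o=100))       # only the object bound
```

With a single subject-first ordering, a pattern that binds only the object has to walk the whole trie, which is one reason practical indexes tend to keep more than one ordering of the triple components.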

Deep Independently Recurrent Neural Network (IndRNN)

Title Deep Independently Recurrent Neural Network (IndRNN)
Authors Shuai Li, Wanqing Li, Chris Cook, Yanbo Gao, Ce Zhu
Abstract Recurrent neural networks (RNNs) are known to be difficult to train due to the gradient vanishing and exploding problems and thus difficult to learn long-term patterns. Long short-term memory (LSTM) was developed to address these problems, but the use of hyperbolic tangent and the sigmoid activation functions results in gradient decay over layers. Consequently, construction of an efficiently trainable deep RNN is challenging. Moreover, training of LSTM is very compute-intensive as the recurrent connection using matrix product is conducted at every time step. To address these problems, this paper proposes a new type of RNN with the recurrent connection formulated as a Hadamard product, referred to as independently recurrent neural network (IndRNN), where neurons in the same layer are independent of each other and connected across layers. The gradient vanishing and exploding problems are solved in IndRNN by simply regulating the recurrent weights, and thus long-term dependencies can be learned. Moreover, an IndRNN can work with non-saturated activation functions such as ReLU and can still be trained robustly. Different deeper IndRNN architectures, including the basic stacked IndRNN, residual IndRNN and densely connected IndRNN, have been investigated, all of which can be much deeper than the existing RNNs. Furthermore, IndRNN reduces the computation at each time step and can be over 10 times faster than the LSTM. The code is made publicly available at https://github.com/Sunnydreamrain/IndRNN_pytorch. Experimental results have shown that the proposed IndRNN is able to process very long sequences (over 5000 time steps) and can be used to construct very deep networks (for example, the 21-layer residual IndRNN and the deep densely connected IndRNN used in the experiments). Better performance has been achieved on various tasks with IndRNNs compared with the traditional RNN and LSTM.
Tasks Language Modelling, Sequential Image Classification, Skeleton Based Action Recognition
Published 2019-10-11
URL https://arxiv.org/abs/1910.06251v2
PDF https://arxiv.org/pdf/1910.06251v2.pdf
PWC https://paperswithcode.com/paper/deep-independently-recurrent-neural-network
Repo https://github.com/Sunnydreamrain/IndRNN_pytorch
Framework pytorch
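
The Hadamard-product recurrence in the abstract can be sketched in a few lines. The update below assumes the form h_t = relu(W x_t + u * h_{t-1} + b), where `u` is a vector applied element-wise so that each neuron only sees its own previous state; the function names and toy values are hypothetical and the linked repository remains the reference implementation.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def indrnn_step(x_t, h_prev, w_in, u, b):
    """h_t = relu(W x_t + u * h_prev + b), with u applied element-wise."""
    projected = [sum(w * x for w, x in zip(row, x_t)) for row in w_in]
    return relu([p + u_n * h + b_n
                 for p, u_n, h, b_n in zip(projected, u, h_prev, b)])


def run_indrnn(inputs, w_in, u, b):
    """Run the recurrence over a sequence, starting from a zero state."""
    h = [0.0] * len(u)
    for x_t in inputs:
        h = indrnn_step(x_t, h, w_in, u, b)
    return h


w_in = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
inputs = [[1.0, 1.0]] + [[0.0, 0.0]] * 3  # an impulse followed by silence

# Neuron 0 keeps its state (u = 1.0); neuron 1 halves it each step (u = 0.5).
final_h = run_indrnn(inputs, w_in, [1.0, 0.5], b)
```

Because each neuron's state is scaled by a single scalar per step, how quickly its memory decays or grows is governed by the magnitude of that one recurrent weight, which is what makes bounding the recurrent weights a direct way to control gradients through time.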

CityLearn: Diverse Real-World Environments for Sample-Efficient Navigation Policy Learning

Title CityLearn: Diverse Real-World Environments for Sample-Efficient Navigation Policy Learning
Authors Marvin Chancán, Michael Milford
Abstract Visual navigation tasks in real-world environments often require both self-motion and place recognition feedback. While deep reinforcement learning has shown success in solving these perception and decision-making problems in an end-to-end manner, these algorithms require large amounts of experience to learn navigation policies from high-dimensional data, which is generally impractical for real robots due to sample complexity. In this paper, we address these problems with two main contributions. We first leverage place recognition and deep learning techniques combined with goal destination feedback to generate compact, bimodal image representations that can then be used to effectively learn control policies from a small amount of experience. Second, we present an interactive framework, CityLearn, that enables for the first time training and deployment of navigation algorithms across city-sized, realistic environments with extreme visual appearance changes. CityLearn features more than 10 benchmark datasets, often used in visual place recognition and autonomous driving research, including over 100 recorded traversals across 60 cities around the world. We evaluate our approach on two CityLearn environments, training our navigation policy on a single traversal. Results show our method can be over 2 orders of magnitude faster than when using raw images, and can also generalize across extreme visual changes including day to night and summer to winter transitions.
Tasks Autonomous Driving, Decision Making, Visual Navigation, Visual Place Recognition
Published 2019-10-10
URL https://arxiv.org/abs/1910.04335v2
PDF https://arxiv.org/pdf/1910.04335v2.pdf
PWC https://paperswithcode.com/paper/from-visual-place-recognition-to-navigation
Repo https://github.com/mchancan/citylearn
Framework none

Pose-aware Multi-level Feature Network for Human Object Interaction Detection

Title Pose-aware Multi-level Feature Network for Human Object Interaction Detection
Authors Bo Wan, Desen Zhou, Yongfei Liu, Rongjie Li, Xuming He
Abstract Reasoning about human-object interactions is a core problem in human-centric scene understanding, and detecting such relations poses a unique challenge to vision systems due to large variations in human-object configurations, multiple co-occurring relation instances and subtle visual differences between relation categories. To address those challenges, we propose a multi-level relation detection strategy that utilizes human pose cues to capture global spatial configurations of relations and as an attention mechanism to dynamically zoom into relevant regions at human part level. Specifically, we develop a multi-branch deep network to learn a pose-augmented relation representation at three semantic levels, incorporating interaction context, object features and detailed semantic part cues. As a result, our approach is capable of generating robust predictions on fine-grained human-object interactions with interpretable outputs. Extensive experimental evaluations on public benchmarks show that our model outperforms prior methods by a considerable margin, demonstrating its efficacy in handling complex scenes.
Tasks Human-Object Interaction Detection, Scene Understanding
Published 2019-09-18
URL https://arxiv.org/abs/1909.08453v1
PDF https://arxiv.org/pdf/1909.08453v1.pdf
PWC https://paperswithcode.com/paper/pose-aware-multi-level-feature-network-for
Repo https://github.com/bobwan1995/PMFNet
Framework pytorch

Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression

Title Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression
Authors Xinyao Wang, Liefeng Bo, Li Fuxin
Abstract Heatmap regression with a deep network has become one of the mainstream approaches to localize facial landmarks. However, the loss function for heatmap regression is rarely studied. In this paper, we analyze the ideal loss function properties for heatmap regression in face alignment problems. Then we propose a novel loss function, named Adaptive Wing loss, that is able to adapt its shape to different types of ground truth heatmap pixels. This adaptability penalizes errors more on foreground pixels and less on background pixels. To address the imbalance between foreground and background pixels, we also propose Weighted Loss Map, which assigns high weights to foreground and difficult background pixels to help the training process focus more on pixels that are crucial to landmark localization. To further improve face alignment accuracy, we introduce boundary prediction and CoordConv with boundary coordinates. Extensive experiments on different benchmarks, including COFW, 300W and WFLW, show our approach outperforms the state-of-the-art by a significant margin on various evaluation metrics. Besides, the Adaptive Wing loss also helps other heatmap regression tasks. Code will be made publicly available at https://github.com/protossw512/AdaptiveWingLoss.
Tasks Face Alignment, Robust Face Alignment
Published 2019-04-16
URL https://arxiv.org/abs/1904.07399v2
PDF https://arxiv.org/pdf/1904.07399v2.pdf
PWC https://paperswithcode.com/paper/adaptive-wing-loss-for-robust-face-alignment
Repo https://github.com/SeungyounShin/Adaptive-Wing-Loss-for-Robust-Face-Alignment-via-Heatmap-Regression
Framework pytorch
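
A per-pixel sketch of a loss that adapts its shape to the ground-truth heatmap value is given below. The closed form and the default constants (`alpha`, `omega`, `epsilon`, `theta`) are recalled from the commonly cited formulation of Adaptive Wing loss and should be treated as assumptions to check against the paper; the Weighted Loss Map and boundary prediction are not shown.

```python
import math


def adaptive_wing(y_true, y_pred, alpha=2.1, omega=14.0, epsilon=1.0, theta=0.5):
    """Piecewise loss for one heatmap pixel; the exponent alpha - y_true adapts to the target."""
    diff = abs(y_true - y_pred)
    power = alpha - y_true
    if diff < theta:
        # Log-shaped region for small errors.
        return omega * math.log(1.0 + (diff / epsilon) ** power)
    ratio = theta / epsilon
    # Slope and offset chosen so the two pieces join smoothly at diff == theta.
    slope = omega * (1.0 / (1.0 + ratio ** power)) * power * ratio ** (power - 1.0) / epsilon
    offset = theta * slope - omega * math.log(1.0 + ratio ** power)
    return slope * diff - offset


# Same absolute error of 0.1 on a foreground pixel (target 1.0)
# and on a background pixel (target 0.0).
foreground_loss = adaptive_wing(1.0, 0.9)
background_loss = adaptive_wing(0.0, 0.1)
```

Under these assumptions the same small error costs more on a foreground pixel than on a background pixel, which matches the behaviour the abstract describes.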

Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels

Title Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels
Authors Simon S. Du, Kangcheng Hou, Barnabás Póczos, Ruslan Salakhutdinov, Ruosong Wang, Keyulu Xu
Abstract While graph kernels (GKs) are easy to train and enjoy provable theoretical guarantees, their practical performances are limited by their expressive power, as the kernel function often depends on hand-crafted combinatorial features of graphs. Compared to graph kernels, graph neural networks (GNNs) usually achieve better practical performance, as GNNs use multi-layer architectures and non-linear activation functions to extract high-order information of graphs as features. However, due to the large number of hyper-parameters and the non-convex nature of the training procedure, GNNs are harder to train. Theoretical guarantees of GNNs are also not well-understood. Furthermore, the expressive power of GNNs scales with the number of parameters, and thus it is hard to exploit the full power of GNNs when computing resources are limited. The current paper presents a new class of graph kernels, Graph Neural Tangent Kernels (GNTKs), which correspond to infinitely wide multi-layer GNNs trained by gradient descent. GNTKs enjoy the full expressive power of GNNs and inherit advantages of GKs. Theoretically, we show GNTKs provably learn a class of smooth functions on graphs. Empirically, we test GNTKs on graph classification datasets and show they achieve strong performance.
Tasks Graph Classification
Published 2019-05-30
URL https://arxiv.org/abs/1905.13192v2
PDF https://arxiv.org/pdf/1905.13192v2.pdf
PWC https://paperswithcode.com/paper/graph-neural-tangent-kernel-fusing-graph
Repo https://github.com/KangchengHou/gntk
Framework none

Face Alignment using a 3D Deeply-initialized Ensemble of Regression Trees

Title Face Alignment using a 3D Deeply-initialized Ensemble of Regression Trees
Authors Roberto Valle, José M. Buenaposada, Antonio Valdés, Luis Baumela
Abstract Face alignment algorithms locate a set of landmark points in images of faces taken in unrestricted situations. State-of-the-art approaches typically fail or lose accuracy in the presence of occlusions, strong deformations, large pose variations and ambiguous configurations. In this paper we present 3DDE, a robust and efficient face alignment algorithm based on a coarse-to-fine cascade of ensembles of regression trees. It is initialized by robustly fitting a 3D face model to the probability maps produced by a convolutional neural network. With this initialization we address self-occlusions and large face rotations. Further, the regressor implicitly imposes a prior face shape on the solution, addressing occlusions and ambiguous face configurations. Its coarse-to-fine structure tackles the combinatorial explosion of parts deformation. In the experiments performed, 3DDE improves the state-of-the-art in 300W, COFW, AFLW and WFLW data sets. Finally, we perform cross-dataset experiments that reveal the existence of a significant data set bias in these benchmarks.
Tasks Face Alignment, Facial Landmark Detection
Published 2019-02-05
URL https://arxiv.org/abs/1902.01831v2
PDF https://arxiv.org/pdf/1902.01831v2.pdf
PWC https://paperswithcode.com/paper/face-alignment-using-a-3d-deeply-initialized
Repo https://github.com/bobetocalo/bobetocalo_eccv18
Framework none

Defense Against Adversarial Attacks Using Feature Scattering-based Adversarial Training

Title Defense Against Adversarial Attacks Using Feature Scattering-based Adversarial Training
Authors Haichao Zhang, Jianyu Wang
Abstract We introduce a feature scattering-based adversarial training approach for improving model robustness against adversarial attacks. Conventional adversarial training approaches leverage a supervised scheme (either targeted or non-targeted) in generating attacks for training, which typically suffer from issues such as label leaking as noted in recent works. In contrast, the proposed approach generates adversarial images for training through feature scattering in the latent space, which is unsupervised in nature and avoids label leaking. More importantly, this new approach generates perturbed images in a collaborative fashion, taking the inter-sample relationships into consideration. We conduct analysis on model robustness and demonstrate the effectiveness of the proposed approach through extensive experiments on different datasets compared with state-of-the-art approaches.
Published 2019-07-24
URL https://arxiv.org/abs/1907.10764v4
PDF https://arxiv.org/pdf/1907.10764v4.pdf
PWC https://paperswithcode.com/paper/defense-against-adversarial-attacks-using-2
Repo https://github.com/Line290/FeatureAttack
Framework pytorch

Richly Activated Graph Convolutional Network for Action Recognition with Incomplete Skeletons

Title Richly Activated Graph Convolutional Network for Action Recognition with Incomplete Skeletons
Authors Yi-Fan Song, Zhang Zhang, Liang Wang
Abstract Current methods for skeleton-based human action recognition usually work with completely observed skeletons. However, in real scenarios, captured skeletons are often incomplete and noisy, which deteriorates the performance of traditional models. To enhance the robustness of action recognition models to incomplete skeletons, we propose a multi-stream graph convolutional network (GCN) for exploring sufficient discriminative features distributed over all skeleton joints. Here, each stream of the network is only responsible for learning features from currently unactivated joints, which are distinguished by the class activation maps (CAM) obtained by preceding streams, so that the proposed method activates clearly more joints than traditional methods. Thus, the proposed method is termed richly activated GCN (RA-GCN), where the richly discovered features will improve the robustness of the model. Compared to the state-of-the-art methods, the RA-GCN achieves comparable performance on the NTU RGB+D dataset. Moreover, on a synthetic occlusion dataset, the RA-GCN significantly alleviates the performance deterioration.
Tasks Skeleton Based Action Recognition, Temporal Action Localization
Published 2019-05-16
URL https://arxiv.org/abs/1905.06774v2
PDF https://arxiv.org/pdf/1905.06774v2.pdf
PWC https://paperswithcode.com/paper/richlt-activated-graph-convolutional-network
Repo https://github.com/yfsong0709/RA-GCNv1
Framework pytorch

Learning protein sequence embeddings using information from structure

Title Learning protein sequence embeddings using information from structure
Authors Tristan Bepler, Bonnie Berger
Abstract Inferring the structural properties of a protein from its amino acid sequence is a challenging yet important problem in biology. Structures are not known for the vast majority of protein sequences, but structure is critical for understanding function. Existing approaches for detecting structural similarity between proteins from sequence are unable to recognize and exploit structural patterns when sequences have diverged too far, limiting our ability to transfer knowledge between structurally related proteins. We take a new approach to this problem through the lens of representation learning. We introduce a framework that maps any protein sequence to a sequence of vector embeddings (one per amino acid position) that encode structural information. We train bidirectional long short-term memory (LSTM) models on protein sequences with a two-part feedback mechanism that incorporates information from (i) global structural similarity between proteins and (ii) pairwise residue contact maps for individual proteins. To enable learning from structural similarity information, we define a novel similarity measure between arbitrary-length sequences of vector embeddings based on a soft symmetric alignment (SSA) between them. Our method is able to learn useful position-specific embeddings despite lacking direct observations of position-level correspondence between sequences. We show empirically that our multi-task framework outperforms other sequence-based methods and even a top-performing structure-based alignment method when predicting structural similarity, our goal. Finally, we demonstrate that our learned embeddings can be transferred to other protein sequence problems, improving the state-of-the-art in transmembrane domain prediction.
Tasks Representation Learning
Published 2019-02-22
URL https://arxiv.org/abs/1902.08661v2
PDF https://arxiv.org/pdf/1902.08661v2.pdf
PWC https://paperswithcode.com/paper/learning-protein-sequence-embeddings-using
Repo https://github.com/cguerramain/protein-structure-prediction-models
Framework none
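
The soft symmetric alignment mentioned in the abstract can be sketched as follows for two sequences of embedding vectors of different lengths. The specific form used here (L1 distances, a row-wise and a column-wise softmax combined as a + b - a*b, and a weighted average of distances) is an assumption recalled from the usual description of SSA and should be checked against the paper; the embeddings are toy values rather than learned ones.

```python
import math


def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_symmetric_alignment(seq_a, seq_b):
    """Similarity (<= 0, closer to 0 means more similar) between two embedding sequences."""
    dist = [[l1(za, zb) for zb in seq_b] for za in seq_a]
    # Row-wise softmax: each position in seq_a softly picks partners in seq_b.
    alpha = [softmax([-d for d in row]) for row in dist]
    # Column-wise softmax: each position in seq_b softly picks partners in seq_a.
    beta_cols = [softmax([-dist[i][j] for i in range(len(seq_a))])
                 for j in range(len(seq_b))]
    weighted, total = 0.0, 0.0
    for i in range(len(seq_a)):
        for j in range(len(seq_b)):
            a_ij = alpha[i][j] + beta_cols[j][i] - alpha[i][j] * beta_cols[j][i]
            weighted += a_ij * dist[i][j]
            total += a_ij
    return -weighted / total


x = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
near = [[0.0, 0.1], [1.0, 0.9]]  # shorter sequence with similar vectors
far = [[5.0, 5.0], [6.0, 4.0]]   # vectors far from everything in x

score_near = soft_symmetric_alignment(x, near)
score_far = soft_symmetric_alignment(x, far)
```

No position-level correspondence is supplied: the alignment weights are computed from the embeddings themselves, which is how a sequence-level similarity signal can shape per-position embeddings.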

Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos

Title Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos
Authors Yitian Yuan, Lin Ma, Jingwen Wang, Wei Liu, Wenwu Zhu
Abstract Temporal sentence grounding in videos aims to detect and localize one target video segment, which semantically corresponds to a given sentence. Existing methods mainly tackle this task via matching and aligning semantics between a sentence and candidate video segments, while neglecting the fact that the sentence information plays an important role in temporally correlating and composing the described contents in videos. In this paper, we propose a novel semantic conditioned dynamic modulation (SCDM) mechanism, which relies on the sentence semantics to modulate the temporal convolution operations for better correlating and composing the sentence-related video contents over time. More importantly, the proposed SCDM performs dynamically with respect to the diverse video contents so as to establish a more precise matching relationship between sentence and video, thereby improving the temporal grounding accuracy. Extensive experiments on three public datasets demonstrate that our proposed model outperforms state-of-the-art methods by clear margins, illustrating the ability of SCDM to better associate and localize relevant video contents for temporal sentence grounding. Our code for this paper is available at https://github.com/yytzsy/SCDM.
Published 2019-10-31
URL https://arxiv.org/abs/1910.14303v1
PDF https://arxiv.org/pdf/1910.14303v1.pdf
PWC https://paperswithcode.com/paper/semantic-conditioned-dynamic-modulation-for
Repo https://github.com/yytzsy/SCDM
Framework tf