April 1, 2020

Paper Group NANR 78

The function of contextual illusions. PDP: A General Neural Framework for Learning SAT Solvers. Project and Forget: Solving Large Scale Metric Constrained Problems. Visual Explanation for Deep Metric Learning. The Frechet Distance of training and test distribution predicts the generalization gap. End to End Trainable Active Contours via Differentia …

The function of contextual illusions


Title	The function of contextual illusions
Authors	Anonymous
Abstract	Many visual illusions are contextual by nature. In the orientation-tilt illusion, the perceived orientation of a central grating is repulsed from or attracted towards the orientation of a surrounding grating. An open question in vision science is whether such illusions reflect basic limitations of the visual system, or if they correspond to corner cases of neural computations that are efficient in everyday settings. Here we develop deep recurrent network architectures that approximate neural circuits linked to contextual illusions. We show that these architectures, which we refer to as gamma-nets, are more sample efficient for learning contour detection than the state of the art, and exhibit an orientation-tilt illusion consistent with human data. Correcting this illusion significantly reduces gamma-net performance by driving it to prefer low-level edges over high-level object boundary contours. Overall, our study suggests that contextual illusions are a byproduct of neural circuits that help biological visual systems achieve robust and efficient perception, and that incorporating such circuits in artificial neural networks can improve computer vision.
Tasks	Contour Detection
Published	2020-01-01
URL	https://openreview.net/forum?id=H1gB4RVKvB
PDF	https://openreview.net/pdf?id=H1gB4RVKvB
PWC	https://paperswithcode.com/paper/the-function-of-contextual-illusions
Repo
Framework

PDP: A General Neural Framework for Learning SAT Solvers


Title	PDP: A General Neural Framework for Learning SAT Solvers
Authors	Anonymous
Abstract	There have been recent efforts to incorporate Graph Neural Network models for learning fully neural solvers for constraint satisfaction problems (CSP) and particularly Boolean satisfiability (SAT). Despite the unique representational power of these neural embedding models, it is not clear to what extent they actually learn a search strategy vs. statistical biases in the training data. On the other hand, by fixing the search strategy (e.g. greedy search), one would effectively prevent the neural models from learning better strategies than those given. In this paper, we propose a generic neural framework for learning SAT solvers (and in general any CSP solver) that can be described in terms of probabilistic inference and yet learn search strategies beyond greedy search. Our framework is based on the idea of propagation, decimation and prediction (hence the name PDP) in graphical models, and can be trained directly toward solving SAT in a fully unsupervised manner via energy minimization, as shown in the paper. Our experimental results demonstrate the effectiveness of our framework for SAT solving compared to both neural and industrial baselines.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=S1xaf6VFPB
PDF	https://openreview.net/pdf?id=S1xaf6VFPB
PWC	https://paperswithcode.com/paper/pdp-a-general-neural-framework-for-learning-1
Repo
Framework
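The abstract above mentions training toward SAT solving by energy minimization without giving the energy itself. As a rough illustration only, one differentiable stand-in is the expected number of unsatisfied clauses when each variable is set to true independently with some probability. This is an assumption made for the sketch, not necessarily the energy used in the paper; the function name and the DIMACS-style clause encoding are likewise hypothetical choices, and each variable is assumed to appear at most once per clause.

```python
from typing import Dict, List


def expected_unsat_clauses(clauses: List[List[int]], p_true: Dict[int, float]) -> float:
    """Expected number of unsatisfied clauses under independent variable marginals.

    clauses: DIMACS-style lists of non-zero ints; literal v means variable v,
             literal -v means its negation.
    p_true:  assumed marginal probability that each variable is True.
    """
    energy = 0.0
    for clause in clauses:
        # A clause is unsatisfied only if every one of its literals is false.
        p_all_false = 1.0
        for lit in clause:
            p_var = p_true[abs(lit)]
            p_lit_true = p_var if lit > 0 else 1.0 - p_var
            p_all_false *= 1.0 - p_lit_true
        energy += p_all_false
    return energy


# (x1 OR NOT x2) AND (x2 OR x3)
formula = [[1, -2], [2, 3]]
uninformed = expected_unsat_clauses(formula, {1: 0.5, 2: 0.5, 3: 0.5})
satisfying = expected_unsat_clauses(formula, {1: 1.0, 2: 1.0, 3: 0.0})
print(uninformed, satisfying)
```

The energy reaches zero exactly when the marginals describe a satisfying assignment, which is what makes a quantity of this kind usable as an unsupervised training signal.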

Project and Forget: Solving Large Scale Metric Constrained Problems


Title	Project and Forget: Solving Large Scale Metric Constrained Problems
Authors	Anonymous
Abstract	Given a set of distances amongst points, determining what metric representation is most “consistent” with the input distances or the metric that captures the relevant geometric features of the data is a key step in many machine learning algorithms. In this paper, we focus on metric constrained problems, a class of optimization problems with metric constraints. In particular, we identify three types of metric constrained problems: metric nearness (Brickell et al., 2008), weighted correlation clustering on general graphs (Bansal et al., 2004), and metric learning (Bellet et al., 2013; Davis et al., 2007). Because of the large number of constraints in these problems, however, researchers have been forced to restrict either the kinds of metrics learned or the size of the problem that can be solved. We provide an algorithm, PROJECT AND FORGET, that uses Bregman projections with cutting planes to solve metric constrained problems with many (possibly exponentially many) inequality constraints. We also prove that our algorithm converges to the global optimal solution. Additionally, we show that the optimality error (L2 distance of the current iterate to the optimal) asymptotically decays at an exponential rate. We show that using our method we can solve large problem instances of three types of metric constrained problems, outperforming all state-of-the-art methods with respect to CPU times and problem sizes.
Tasks	Metric Learning
Published	2020-01-01
URL	https://openreview.net/forum?id=SJeX2aVFwH
PDF	https://openreview.net/pdf?id=SJeX2aVFwH
PWC	https://paperswithcode.com/paper/project-and-forget-solving-large-scale-metric
Repo
Framework
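To make the notion of a metric constraint concrete, the sketch below repairs triangle-inequality violations in a small distance table by repeatedly projecting onto each violated inequality in the Euclidean sense. It is deliberately simpler than the method the abstract describes: there are no Bregman correction terms, no cutting-plane oracle and no forgetting step, so it returns some valid metric rather than the nearest one. All function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterator, Tuple

Edge = Tuple[int, int]


def edge(i: int, j: int) -> Edge:
    """Canonical key for the undirected pair {i, j}."""
    return (i, j) if i < j else (j, i)


def triangle_constraints(n: int) -> Iterator[Tuple[Edge, Edge, Edge]]:
    """Yield (long, short1, short2): the constraint d[long] <= d[short1] + d[short2]."""
    for a, b, c in combinations(range(n), 3):
        ab, bc, ac = edge(a, b), edge(b, c), edge(a, c)
        yield ab, bc, ac
        yield bc, ab, ac
        yield ac, ab, bc


def max_violation(d: Dict[Edge, float], n: int) -> float:
    """Largest amount by which any triangle inequality is violated (0 if none)."""
    worst = 0.0
    for long_e, s1, s2 in triangle_constraints(n):
        worst = max(worst, d[long_e] - d[s1] - d[s2])
    return worst


def repair_metric(d: Dict[Edge, float], n: int, tol: float = 1e-9,
                  max_sweeps: int = 1000) -> Dict[Edge, float]:
    """Cyclic Euclidean projections onto violated triangle inequalities."""
    d = dict(d)
    for _ in range(max_sweeps):
        changed = False
        for long_e, s1, s2 in triangle_constraints(n):
            v = d[long_e] - d[s1] - d[s2]
            if v > tol:
                # Projection onto the half-space x_long - x_s1 - x_s2 <= 0:
                # the constraint normal (1, -1, -1) has squared norm 3.
                d[long_e] -= v / 3.0
                d[s1] += v / 3.0
                d[s2] += v / 3.0
                changed = True
        if not changed:
            break
    return d


distances = {edge(0, 1): 10.0, edge(0, 2): 1.0, edge(1, 2): 1.0}
repaired = repair_metric(distances, n=3)
print(repaired, max_violation(repaired, 3))
```

Even this toy version shows why scale is the difficulty: the number of triangle constraints grows cubically with the number of points, which is the motivation for handling only a working subset of constraints at a time.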

Visual Explanation for Deep Metric Learning


Title	Visual Explanation for Deep Metric Learning
Authors	Anonymous
Abstract	This work explores visual explanation for deep metric learning and its applications. As an important problem in representation learning, metric learning has attracted much attention recently, while the interpretation of such models is not as well studied as it is for classification. To this end, we propose an intuitive idea: show which regions contribute the most to the overall similarity of two input images by decomposing the final activation. Instead of only providing the overall activation map of each image, we propose to generate point-to-point activation intensity between two images so that the relationship between different regions is uncovered. We show that the proposed framework can be directly deployed to a large range of metric learning applications and provides valuable information for understanding the model. Furthermore, our experiments show its effectiveness on two potential applications, i.e. cross-view pattern discovery and interactive retrieval.
Tasks	Metric Learning
Published	2020-01-01
URL	https://openreview.net/forum?id=S1xLuRVFvr
PDF	https://openreview.net/pdf?id=S1xLuRVFvr
PWC	https://paperswithcode.com/paper/visual-explanation-for-deep-metric-learning-1
Repo
Framework
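One way to read "decomposing the final activation" is sketched below, under two assumptions that the abstract does not state: image embeddings are obtained by mean-pooling per-location feature vectors, and similarity is a plain dot product. Under those assumptions the overall similarity equals the sum of a point-to-point map of scaled pairwise dot products. The function names are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def mean_pool(features: List[Vector]) -> Vector:
    """Average the per-location feature vectors into one embedding."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def point_to_point_map(feats_a: List[Vector], feats_b: List[Vector]) -> List[List[float]]:
    """Entry [i][j] is the contribution of location i in A and location j in B."""
    scale = 1.0 / (len(feats_a) * len(feats_b))
    return [[scale * dot(fa, fb) for fb in feats_b] for fa in feats_a]


# Two tiny "images": 2 and 3 spatial locations, 2 channels each.
feats_a = [[1.0, 0.0], [0.0, 2.0]]
feats_b = [[1.0, 1.0], [3.0, 0.0], [0.0, 1.0]]

overall = dot(mean_pool(feats_a), mean_pool(feats_b))
contributions = point_to_point_map(feats_a, feats_b)
print(overall, sum(sum(row) for row in contributions))
```

Because the entries sum to the overall similarity, the largest entries identify the pairs of regions that drive the match.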

The Frechet Distance of training and test distribution predicts the generalization gap


Title	The Frechet Distance of training and test distribution predicts the generalization gap
Authors	Anonymous
Abstract	Learning theory tells us that more data is better when minimizing the generalization error of identically distributed training and test sets. However, when training and test distribution differ, this distribution shift can have a significant effect. With a novel perspective on function transfer learning, we are able to lower bound the change of performance when transferring from training to test set with the Wasserstein distance between the embedded training and test set distribution. We find that there is a trade-off affecting performance between how invariant a function is to changes in training and test distribution and how large this shift in distribution is. Empirically across several data domains, we substantiate this viewpoint by showing that test performance correlates strongly with the distance in data distributions between training and test set. Complementary to the popular belief that more data is always better, our results highlight the utility of also choosing a training data distribution that is close to the test data distribution when the learned function is not invariant to such changes.
Tasks	Transfer Learning
Published	2020-01-01
URL	https://openreview.net/forum?id=SJgSflHKDr
PDF	https://openreview.net/pdf?id=SJgSflHKDr
PWC	https://paperswithcode.com/paper/the-frechet-distance-of-training-and-test
Repo
Framework
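For readers unfamiliar with the distance named in the title, the following is a minimal sketch of the squared Fréchet (2-Wasserstein) distance between two Gaussians fitted to embedded training and test points. It assumes diagonal covariances so that only the standard library is needed; the general formula involves a matrix square root, and the paper's exact estimator may differ. The function name and the toy data are hypothetical.

```python
import math
from statistics import fmean, pvariance
from typing import List, Sequence


def frechet_distance_diag(xs: Sequence[Sequence[float]], ys: Sequence[Sequence[float]]) -> float:
    """Squared Frechet distance between diagonal Gaussians fitted to xs and ys.

    For each dimension the contribution is (mean difference)^2 plus
    (standard deviation difference)^2.
    """
    total = 0.0
    for col_x, col_y in zip(zip(*xs), zip(*ys)):
        mu_x, mu_y = fmean(col_x), fmean(col_y)
        sd_x, sd_y = math.sqrt(pvariance(col_x)), math.sqrt(pvariance(col_y))
        total += (mu_x - mu_y) ** 2 + (sd_x - sd_y) ** 2
    return total


train: List[List[float]] = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
# Same shape of distribution, shifted by (3, 4).
test = [[x + 3.0, y + 4.0] for x, y in train]

print(frechet_distance_diag(train, train))
print(frechet_distance_diag(train, test))
```

A pure shift changes only the mean term, so the distance grows with the squared length of the shift, matching the intuition that a test set further from the training data is harder to generalize to.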

End to End Trainable Active Contours via Differentiable Rendering


Title	End to End Trainable Active Contours via Differentiable Rendering
Authors	Anonymous
Abstract	We present an image segmentation method that iteratively evolves a polygon. At each iteration, the vertices of the polygon are displaced based on the local value of a 2D shift map that is inferred from the input image via an encoder-decoder architecture. The main training loss that is used is the difference between the polygon shape and the ground truth segmentation mask. The network employs a neural renderer to create the polygon from its vertices, making the process fully differentiable. We demonstrate that our method outperforms the state of the art segmentation networks and deep active contour solutions in a variety of benchmarks, including medical imaging and aerial images.
Tasks	Semantic Segmentation
Published	2020-01-01
URL	https://openreview.net/forum?id=rkxawlHKDr
PDF	https://openreview.net/pdf?id=rkxawlHKDr
PWC	https://paperswithcode.com/paper/end-to-end-trainable-active-contours-via
Repo
Framework
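The vertex update described in the abstract can be sketched as follows. This is an illustration under simplifying assumptions: the shift map is a hand-made constant field rather than the output of an encoder-decoder, vertices sample the map at the nearest pixel, and the neural renderer and training loss are omitted. The function name is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]          # (x, y) in pixel coordinates
ShiftMap = List[List[Point]]         # shift_map[row][col] = (dx, dy)


def evolve_polygon(vertices: List[Point], shift_map: ShiftMap, steps: int = 1) -> List[Point]:
    """Displace each vertex by the shift stored at its nearest pixel, `steps` times."""
    height, width = len(shift_map), len(shift_map[0])
    points = list(vertices)
    for _ in range(steps):
        moved = []
        for x, y in points:
            # Clamp to the image so vertices that drift outside still get a shift.
            col = min(max(int(round(x)), 0), width - 1)
            row = min(max(int(round(y)), 0), height - 1)
            dx, dy = shift_map[row][col]
            moved.append((x + dx, y + dy))
        points = moved
    return points


# A 4x4 map that pushes every vertex right by 1 and up by 0.5.
constant_map: ShiftMap = [[(1.0, -0.5) for _ in range(4)] for _ in range(4)]
triangle = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]
print(evolve_polygon(triangle, constant_map, steps=1))
```

In the method the abstract describes, the shift map depends on the input image, so each vertex is pulled toward the object boundary rather than translated uniformly.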

Local Label Propagation for Large-Scale Semi-Supervised Learning


Title	Local Label Propagation for Large-Scale Semi-Supervised Learning
Authors	Anonymous
Abstract	A significant issue in training deep neural networks to solve supervised learning tasks is the need for large numbers of labeled datapoints. The goal of semi-supervised learning is to leverage ubiquitous unlabeled data, together with small quantities of labeled data, to achieve high task performance. Though substantial recent progress has been made in developing semi-supervised algorithms that are effective for comparatively small datasets, many of these techniques do not scale readily to the large (unlabeled) datasets characteristic of real-world applications. In this paper we introduce a novel approach to scalable semi-supervised learning, called Local Label Propagation (LLP). Extending ideas from recent work on unsupervised embedding learning, LLP first embeds datapoints, labeled and otherwise, in a common latent space using a deep neural network. It then propagates pseudolabels from known to unknown datapoints in a manner that depends on the local geometry of the embedding, taking into account both inter-point distance and local data density as a weighting on propagation likelihood. The parameters of the deep embedding are then trained to simultaneously maximize pseudolabel categorization performance as well as a metric of the clustering of datapoints within each pseudolabel group, iteratively alternating stages of network training and label propagation. We illustrate the utility of the LLP method on the ImageNet dataset, achieving results that outperform previous state-of-the-art scalable semi-supervised learning algorithms by large margins, consistently across a wide variety of training regimes. We also show that the feature representation learned with LLP transfers well to scene recognition in the Places 205 dataset.
Tasks	Scene Recognition
Published	2020-01-01
URL	https://openreview.net/forum?id=B1x2eCNFvH
PDF	https://openreview.net/pdf?id=B1x2eCNFvH
PWC	https://paperswithcode.com/paper/local-label-propagation-for-large-scale-semi-1
Repo
Framework

Lyceum: An efficient and scalable ecosystem for robot learning


Title	Lyceum: An efficient and scalable ecosystem for robot learning
Authors	Anonymous
Abstract	We introduce Lyceum, a high-performance computational ecosystem for robot learning. Lyceum is built on top of the Julia programming language and the MuJoCo physics simulator, combining the ease-of-use of a high-level programming language with the performance of native C. Lyceum is up to 10-20X faster compared to other popular abstractions like OpenAI’s Gym and DeepMind’s dm-control. This substantially reduces training time for various reinforcement learning algorithms; and is also fast enough to support real-time model predictive control with physics simulators. Lyceum has a straightforward API and supports parallel computation across multiple cores or machines. The code base, tutorials, and demonstration videos can be found at: https://sites.google.com/view/lyceum-anon.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=SyxytxBFDr
PDF	https://openreview.net/pdf?id=SyxytxBFDr
PWC	https://paperswithcode.com/paper/lyceum-an-efficient-and-scalable-ecosystem
Repo
Framework

NoiGAN: NOISE AWARE KNOWLEDGE GRAPH EMBEDDING WITH GAN


Title	NoiGAN: NOISE AWARE KNOWLEDGE GRAPH EMBEDDING WITH GAN
Authors	Anonymous
Abstract	Knowledge graphs have gained increasing attention in recent years for their successful application to numerous tasks. Despite the rapid growth of knowledge construction, knowledge graphs still suffer from severe incompleteness and inevitably contain various kinds of errors. Several attempts have been made to complete knowledge graphs as well as to detect noise. However, none of them considers unifying these two tasks even though they are interdependent and can mutually boost each other's performance. In this paper, we propose to combine these two tasks in a unified Generative Adversarial Network (GAN) framework to learn noise-aware knowledge graph embeddings. Extensive experiments demonstrate that our approach is superior to existing state-of-the-art algorithms in both knowledge graph completion and error detection.
Tasks	Graph Embedding, Knowledge Graph Completion, Knowledge Graph Embedding, Knowledge Graphs
Published	2020-01-01
URL	https://openreview.net/forum?id=rkgTdkrtPH
PDF	https://openreview.net/pdf?id=rkgTdkrtPH
PWC	https://paperswithcode.com/paper/noigan-noise-aware-knowledge-graph-embedding
Repo
Framework

Inductive representation learning on temporal graphs


Title	Inductive representation learning on temporal graphs
Authors	Anonymous
Abstract	Inductive representation learning on temporal graphs is an important step toward scalable machine learning on real-world dynamic networks. The evolving nature of temporal dynamic graphs requires handling new nodes while learning temporal patterns. The node embeddings, which become functions of time under the temporal setting, should capture both static node features and evolving topological structures. Moreover, node and topological features may exhibit temporal patterns that are informative for prediction, of which the temporal node embeddings should also be aware. We propose the temporal graph attention (TGAT) layer to effectively aggregate temporal-topological neighborhood features as well as to learn time-feature interactions. For TGAT, we use the self-attention mechanism as the building block and develop a novel functional time encoding technique based on the classical Bochner’s theorem from harmonic analysis. By stacking TGAT layers, the network learns node embeddings as functions of time and can inductively infer embeddings for both new and observed nodes whenever the graph evolves. The proposed approach handles both node classification and link prediction tasks, and can be naturally extended to aggregate edge features. We evaluate our method with transductive and inductive tasks under the temporal setting with two benchmark datasets and one industrial dataset. Our TGAT model compares favorably to state-of-the-art baselines and prior temporal graph embedding approaches.
Tasks	Graph Embedding, Link Prediction, Node Classification, Representation Learning
Published	2020-01-01
URL	https://openreview.net/forum?id=rJeW1yHYwH
PDF	https://openreview.net/pdf?id=rJeW1yHYwH
PWC	https://paperswithcode.com/paper/inductive-representation-learning-on-temporal
Repo
Framework
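The abstract does not spell out the functional time encoding, so the sketch below uses the standard Fourier-feature construction that Bochner's theorem motivates: a timestamp is mapped to scaled cosine and sine features whose inner product depends only on the time difference. The frequencies here are fixed, hypothetical values chosen for illustration, whereas a trained model would learn them; the function names are also assumptions.

```python
import math
from typing import List, Sequence


def time_encoding(t: float, freqs: Sequence[float]) -> List[float]:
    """Map a timestamp to [cos(w1 t), sin(w1 t), ..., cos(wd t), sin(wd t)] / sqrt(d)."""
    scale = math.sqrt(1.0 / len(freqs))
    features = []
    for w in freqs:
        features.append(scale * math.cos(w * t))
        features.append(scale * math.sin(w * t))
    return features


def time_kernel(t1: float, t2: float, freqs: Sequence[float]) -> float:
    """Inner product of two encodings; equals the mean of cos(w * (t1 - t2))."""
    e1, e2 = time_encoding(t1, freqs), time_encoding(t2, freqs)
    return sum(a * b for a, b in zip(e1, e2))


freqs = [0.1, 0.5, 1.0, 2.0]  # hypothetical fixed frequencies

# Both pairs are 2.5 time units apart, so their kernel values agree.
print(time_kernel(1.0, 3.5, freqs))
print(time_kernel(11.0, 13.5, freqs))
```

The translation invariance shown by the two printed values is the property that lets attention scores depend on elapsed time between interactions rather than on absolute timestamps.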

GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding


Title	GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding
Authors	Anonymous
Abstract	Graph embedding techniques have been increasingly deployed in a multitude of different applications that involve learning on non-Euclidean data. However, existing graph embedding models either fail to incorporate node attribute information during training or suffer from node attribute noise, which compromises the accuracy. Moreover, very few of them scale to large graphs due to their high computational complexity and memory usage. In this paper we propose GraphZoom, a multi-level framework for improving both accuracy and scalability of unsupervised graph embedding algorithms. GraphZoom first performs graph fusion to generate a new graph that effectively encodes the topology of the original graph and the node attribute information. This fused graph is then repeatedly coarsened into a much smaller graph by merging nodes with high spectral similarities. GraphZoom allows any existing embedding method to be applied to the coarsened graph, before it progressively refines the embeddings obtained at the coarsest level to increasingly finer graphs. We have evaluated our approach on a number of popular graph datasets for both transductive and inductive tasks. Our experiments show that GraphZoom increases the classification accuracy and significantly reduces the run time compared to state-of-the-art unsupervised embedding methods.
Tasks	Graph Embedding
Published	2020-01-01
URL	https://openreview.net/forum?id=r1lGO0EKDH
PDF	https://openreview.net/pdf?id=r1lGO0EKDH
PWC	https://paperswithcode.com/paper/graphzoom-a-multi-level-spectral-approach-for-1
Repo
Framework

Hardware-aware One-Shot Neural Architecture Search in Coordinate Ascent Framework


Title	Hardware-aware One-Shot Neural Architecture Search in Coordinate Ascent Framework
Authors	Anonymous
Abstract	Designing accurate and efficient convolutional neural architectures for a vast amount of hardware is challenging because hardware designs are complex and diverse. This paper addresses the hardware diversity challenge in Neural Architecture Search (NAS). Unlike previous approaches that apply search algorithms on a small, human-designed search space without considering hardware diversity, we propose HURRICANE, which explores automatic hardware-aware search over a much larger search space and a multistep search scheme in a coordinate ascent framework, to generate tailored models for different types of hardware. Extensive experiments on ImageNet show that our algorithm consistently achieves a much lower inference latency with a similar or better accuracy than state-of-the-art NAS methods on three types of hardware. Remarkably, HURRICANE achieves a 76.63% top-1 accuracy on ImageNet with an inference latency of only 16.5 ms for DSP, which is a 3.4% higher accuracy and a 6.35x inference speedup than FBNet-iPhoneX. For VPU, HURRICANE achieves a 0.53% higher top-1 accuracy than Proxyless-mobile with a 1.49x speedup. Even for well-studied mobile CPU, HURRICANE achieves a 1.63% higher top-1 accuracy than FBNet-iPhoneX with a comparable inference latency. HURRICANE also reduces the training time by 54.7% on average compared to SinglePath-Oneshot.
Tasks	Neural Architecture Search
Published	2020-01-01
URL	https://openreview.net/forum?id=BJe6BkHYDB
PDF	https://openreview.net/pdf?id=BJe6BkHYDB
PWC	https://paperswithcode.com/paper/hardware-aware-one-shot-neural-architecture
Repo
Framework

Graph Neural Networks For Multi-Image Matching


Title	Graph Neural Networks For Multi-Image Matching
Authors	Anonymous
Abstract	In geometric computer vision applications, multi-image feature matching gives more accurate and robust solutions compared to simple two-image matching. In this work, we formulate multi-image matching as a graph embedding problem, then use a Graph Neural Network to learn an appropriate embedding function for aligning image features. We use cycle consistency to train our network in an unsupervised fashion, since ground truth correspondence can be difficult or expensive to acquire. Geometric consistency losses are added to aid training, though unlike optimization based methods no geometric information is necessary at inference time. To the best of our knowledge, no other works have used graph neural networks for multi-image feature matching. Our experiments show that our method is competitive with other optimization based approaches.
Tasks	Graph Embedding
Published	2020-01-01
URL	https://openreview.net/forum?id=Hkgpnn4YvH
PDF	https://openreview.net/pdf?id=Hkgpnn4YvH
PWC	https://paperswithcode.com/paper/graph-neural-networks-for-multi-image
Repo
Framework
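The cycle consistency used as a training signal in the abstract can be illustrated with hard matches between three images. This sketch only shows the consistency condition itself: matches are given as index lists rather than produced by a learned embedding, and the function names are hypothetical.

```python
from typing import List

Match = List[int]  # match[i] = index of the feature matched to feature i


def compose(m_ab: Match, m_bc: Match) -> Match:
    """Follow A -> B and then B -> C to obtain an A -> C match."""
    return [m_bc[j] for j in m_ab]


def cycle_error(m_ab: Match, m_bc: Match, m_ca: Match) -> float:
    """Fraction of features in A that do not return to themselves after A -> B -> C -> A."""
    cycle = compose(compose(m_ab, m_bc), m_ca)
    return sum(1 for i, j in enumerate(cycle) if i != j) / len(cycle)


m_ab = [1, 2, 0]
m_bc = [2, 0, 1]
consistent_ca = [0, 1, 2]
inconsistent_ca = [1, 0, 2]

print(cycle_error(m_ab, m_bc, consistent_ca))
print(cycle_error(m_ab, m_bc, inconsistent_ca))
```

No ground-truth correspondence appears anywhere in the error, which is why a loss built on this idea can be used for unsupervised training.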

Interpretable Network Structure for Modeling Contextual Dependency


Title	Interpretable Network Structure for Modeling Contextual Dependency
Authors	Anonymous
Abstract	Neural language models have achieved great success in many NLP tasks, to a large extent due to their ability to capture contextual dependencies among terms in a text. While many efforts have been devoted to empirically explaining the connection between the network hyperparameters and the ability to represent contextual dependency, the theoretical analysis is relatively insufficient. Inspired by recent research on the use of tensor space to explain neural network architecture, we explore an interpretable mechanism for neural language models. Specifically, we define the concept of separation rank in the language modeling process, in order to theoretically measure the degree of contextual dependencies in a sentence. Then, we show that the lower bound of such a separation rank can reveal the quantitative relation between the network structure (e.g. depth/width) and the ability to model contextual dependency. In particular, increasing the depth of the neural network can be more effective for improving the ability to model contextual dependency. Therefore, it is important to design an adaptive network that computes the adaptive depth for a task. Inspired by Adaptive Computation Time (ACT), we design an adaptive recurrent network based on the separation rank to model contextual dependency. Experiments on various NLP tasks have verified the proposed theoretical analysis. We also test our adaptive recurrent neural network in the sentence classification task, and the experiments show that it can achieve better results than the traditional bidirectional LSTM.
Tasks	Language Modelling, Sentence Classification
Published	2020-01-01
URL	https://openreview.net/forum?id=BkgUB1SYPS
PDF	https://openreview.net/pdf?id=BkgUB1SYPS
PWC	https://paperswithcode.com/paper/interpretable-network-structure-for-modeling
Repo
Framework

Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation


Title	Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation
Authors	Anonymous
Abstract	Sequence generation models are commonly refined with reinforcement learning over user-defined metrics. However, high gradient variance hinders the practical use of this method. To stabilize this method for contextual generation of categorical sequences, we estimate the gradient by evaluating a set of correlated Monte Carlo rollouts. Due to the correlation, the number of unique rollouts is random and adaptive to model uncertainty; those rollouts naturally become baselines for each other, and hence are combined to effectively reduce gradient variance. We also demonstrate the use of correlated MC rollouts for binary-tree softmax models which reduce the high generation cost in large vocabulary scenarios, by decomposing each categorical action into a sequence of binary actions. We evaluate our methods on both neural program synthesis and image captioning. The proposed methods yield lower gradient variance and consistent improvement over related baselines.
Tasks	Image Captioning, Program Synthesis
Published	2020-01-01
URL	https://openreview.net/forum?id=r1lOgyrKDS
PDF	https://openreview.net/pdf?id=r1lOgyrKDS
PWC	https://paperswithcode.com/paper/adaptive-correlated-monte-carlo-for
Repo
Framework
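The binary-tree softmax mentioned in the abstract replaces one choice among many categories with a short sequence of binary choices. A minimal sketch follows, assuming a balanced tree over 2^depth categories with one logit per internal node stored in heap order; the layout and function names are hypothetical and the paper's parameterization may differ.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def leaf_probability(leaf: int, depth: int, node_logits: Sequence[float]) -> float:
    """Probability of category `leaf` as a product of binary decisions.

    node_logits holds one logit per internal node in heap order
    (root at 0, children of node i at 2i+1 and 2i+2), so its length is 2**depth - 1.
    sigmoid(logit) is the probability of taking the right branch.
    """
    node = 0
    prob = 1.0
    for level in reversed(range(depth)):
        go_right = (leaf >> level) & 1  # bits of `leaf`, most significant first
        p_right = sigmoid(node_logits[node])
        prob *= p_right if go_right else 1.0 - p_right
        node = 2 * node + 1 + go_right
    return prob


depth = 3  # 8 categories, 7 internal nodes
logits = [0.3, -1.2, 0.8, 2.0, -0.5, 0.0, 1.1]
probs = [leaf_probability(k, depth, logits) for k in range(2 ** depth)]
print(probs, sum(probs))
```

Sampling or scoring a category touches only `depth` nodes instead of the whole vocabulary, which is the source of the reduced generation cost for large vocabularies.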