Paper Group NANR 78
The function of contextual illusions. PDP: A General Neural Framework for Learning SAT Solvers. Project and Forget: Solving Large Scale Metric Constrained Problems. Visual Explanation for Deep Metric Learning. The Frechet Distance of training and test distribution predicts the generalization gap. End to End Trainable Active Contours via Differentia …
The function of contextual illusions
Title | The function of contextual illusions |
Authors | Anonymous |
Abstract | Many visual illusions are contextual by nature. In the orientation-tilt illusion, the perceived orientation of a central grating is repulsed from or attracted towards the orientation of a surrounding grating. An open question in vision science is whether such illusions reflect basic limitations of the visual system, or if they correspond to corner cases of neural computations that are efficient in everyday settings. Here we develop deep recurrent network architectures that approximate neural circuits linked to contextual illusions. We show that these architectures, which we refer to as gamma-nets, are more sample efficient for learning contour detection than the state of the art, and exhibit an orientation-tilt illusion consistent with human data. Correcting this illusion significantly reduces gamma-net performance by driving it to prefer low-level edges over high-level object boundary contours. Overall, our study suggests that contextual illusions are a byproduct of neural circuits that help biological visual systems achieve robust and efficient perception, and that incorporating such circuits in artificial neural networks can improve computer vision. |
Tasks | Contour Detection |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=H1gB4RVKvB |
https://openreview.net/pdf?id=H1gB4RVKvB | |
PWC | https://paperswithcode.com/paper/the-function-of-contextual-illusions |
Repo | |
Framework | |
PDP: A General Neural Framework for Learning SAT Solvers
Title | PDP: A General Neural Framework for Learning SAT Solvers |
Authors | Anonymous |
Abstract | There have been recent efforts to incorporate Graph Neural Network models for learning fully neural solvers for constraint satisfaction problems (CSP) and particularly Boolean satisfiability (SAT). Despite the unique representational power of these neural embedding models, it is not clear to what extent they actually learn a search strategy vs. statistical biases in the training data. On the other hand, by fixing the search strategy (e.g. greedy search), one would effectively deprive the neural models of learning better strategies than those given. In this paper, we propose a generic neural framework for learning SAT solvers (and in general any CSP solver) that can be described in terms of probabilistic inference and yet learn search strategies beyond greedy search. Our framework is based on the idea of propagation, decimation and prediction (and hence the name PDP) in graphical models, and can be trained directly toward solving SAT in a fully unsupervised manner via energy minimization, as shown in the paper. Our experimental results demonstrate the effectiveness of our framework for SAT solving compared to both neural and industrial baselines. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=S1xaf6VFPB |
https://openreview.net/pdf?id=S1xaf6VFPB | |
PWC | https://paperswithcode.com/paper/pdp-a-general-neural-framework-for-learning-1 |
Repo | |
Framework | |
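The PDP abstract above mentions unsupervised training via energy minimization. As a rough illustration only, one common differentiable surrogate is the expected number of unsatisfied clauses under independent soft assignments. This is an assumption made for illustration, not necessarily the paper's energy function; the name `expected_unsat` and the DIMACS-style literal encoding are likewise hypothetical.

```python
def expected_unsat(clauses, probs):
    """Expected number of unsatisfied clauses under independent soft assignments.

    clauses: list of clauses, each a list of non-zero ints (DIMACS-style literals,
             where -2 means "not x2"). This encoding is an assumption of the sketch.
    probs:   dict mapping variable index -> probability that the variable is True.
    """
    energy = 0.0
    for clause in clauses:
        p_unsat = 1.0
        for lit in clause:
            p_true = probs[abs(lit)]
            # Probability that this literal evaluates to True.
            p_lit = p_true if lit > 0 else 1.0 - p_true
            # A clause is unsatisfied only if every literal in it is False.
            p_unsat *= 1.0 - p_lit
        energy += p_unsat
    return energy


# (x1 or not x2) and (x2 or x3)
clauses = [[1, -2], [2, 3]]
hard = {1: 1.0, 2: 0.0, 3: 1.0}   # a satisfying 0/1 assignment
soft = {1: 0.5, 2: 0.5, 3: 0.5}   # maximally uncertain assignment
print(expected_unsat(clauses, hard), expected_unsat(clauses, soft))
```

The energy is zero exactly when a hard assignment satisfies every clause, which is why minimizing it needs no labeled solutions.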
Project and Forget: Solving Large Scale Metric Constrained Problems
Title | Project and Forget: Solving Large Scale Metric Constrained Problems |
Authors | Anonymous |
Abstract | Given a set of distances amongst points, determining what metric representation is most “consistent” with the input distances or the metric that captures the relevant geometric features of the data is a key step in many machine learning algorithms. In this paper, we focus on metric constrained problems, a class of optimization problems with metric constraints. In particular, we identify three types of metric constrained problems: metric nearness (Brickell et al., 2008), weighted correlation clustering on general graphs (Bansal et al., 2004), and metric learning (Bellet et al., 2013; Davis et al., 2007). Because of the large number of constraints in these problems, however, researchers have been forced to restrict either the kinds of metrics learned or the size of the problem that can be solved. We provide an algorithm, PROJECT AND FORGET, that uses Bregman projections with cutting planes to solve metric constrained problems with many (possibly exponentially many) inequality constraints. We also prove that our algorithm converges to the global optimal solution. Additionally, we show that the optimality error (L2 distance of the current iterate to the optimal) asymptotically decays at an exponential rate. We show that using our method we can solve large problem instances of three types of metric constrained problems, outperforming all state-of-the-art methods with respect to CPU times and problem sizes. |
Tasks | Metric Learning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SJeX2aVFwH |
https://openreview.net/pdf?id=SJeX2aVFwH | |
PWC | https://paperswithcode.com/paper/project-and-forget-solving-large-scale-metric |
Repo | |
Framework | |
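To make the idea of projecting onto a metric constraint concrete, here is a minimal sketch of a single Euclidean projection onto one violated triangle inequality, d_ij <= d_ik + d_kj. It assumes the squared L2 norm as the Bregman function and deliberately omits the dual-variable corrections, the cutting-plane oracle, and the step that forgets inactive constraints, so it is not the PROJECT AND FORGET algorithm itself. The function name `project_triangle` is hypothetical.

```python
def project_triangle(d_ij, d_ik, d_kj):
    """Project three distances onto the half-space d_ij <= d_ik + d_kj.

    The constraint is a . x <= 0 with a = (1, -1, -1), so the Euclidean
    projection moves x by (violation / ||a||^2) * a, and ||a||^2 = 3.
    """
    violation = d_ij - d_ik - d_kj
    if violation <= 0:
        # Constraint already satisfied: the projection is the identity.
        return d_ij, d_ik, d_kj
    step = violation / 3.0
    return d_ij - step, d_ik + step, d_kj + step


print(project_triangle(10.0, 2.0, 2.0))  # (8.0, 4.0, 4.0)
```

After the step the long edge equals the sum of the two short ones, so the point lies exactly on the boundary of the constraint.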
Visual Explanation for Deep Metric Learning
Title | Visual Explanation for Deep Metric Learning |
Authors | Anonymous |
Abstract | This work explores the visual explanation for deep metric learning and its applications. As an important problem for learning representation, metric learning has attracted much attention recently, while the interpretation of such models is not as well studied as classification. To this end, we propose an intuitive idea to show which regions contribute the most to the overall similarity of two input images by decomposing the final activation. Instead of only providing the overall activation map of each image, we propose to generate point-to-point activation intensity between two images so that the relationship between different regions is uncovered. We show that the proposed framework can be directly deployed to a large range of metric learning applications and provides valuable information for understanding the model. Furthermore, our experiments show its effectiveness on two potential applications, i.e. cross-view pattern discovery and interactive retrieval. |
Tasks | Metric Learning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=S1xLuRVFvr |
https://openreview.net/pdf?id=S1xLuRVFvr | |
PWC | https://paperswithcode.com/paper/visual-explanation-for-deep-metric-learning-1 |
Repo | |
Framework | |
The Frechet Distance of training and test distribution predicts the generalization gap
Title | The Frechet Distance of training and test distribution predicts the generalization gap |
Authors | Anonymous |
Abstract | Learning theory tells us that more data is better when minimizing the generalization error of identically distributed training and test sets. However, when training and test distribution differ, this distribution shift can have a significant effect. With a novel perspective on function transfer learning, we are able to lower bound the change of performance when transferring from training to test set with the Wasserstein distance between the embedded training and test set distribution. We find that there is a trade-off affecting performance between how invariant a function is to changes in training and test distribution and how large this shift in distribution is. Empirically across several data domains, we substantiate this viewpoint by showing that test performance correlates strongly with the distance in data distributions between training and test set. Complementary to the popular belief that more data is always better, our results highlight the utility of also choosing a training data distribution that is close to the test data distribution when the learned function is not invariant to such changes. |
Tasks | Transfer Learning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SJgSflHKDr |
https://openreview.net/pdf?id=SJgSflHKDr | |
PWC | https://paperswithcode.com/paper/the-frechet-distance-of-training-and-test |
Repo | |
Framework | |
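For readers unfamiliar with the distance named in the title above: between two one-dimensional Gaussians, the Fréchet distance coincides with the 2-Wasserstein distance and has the closed form sqrt((mu_x - mu_y)^2 + (sigma_x - sigma_y)^2). The sketch below fits a Gaussian to each sample and evaluates that formula. Reducing the embedded features to a single dimension is an assumption made here for brevity, and `frechet_distance_1d` is a hypothetical helper, not code from the paper.

```python
import statistics


def frechet_distance_1d(xs, ys):
    """Fréchet (2-Wasserstein) distance between Gaussians fitted to two 1-D samples."""
    mu_x, mu_y = statistics.fmean(xs), statistics.fmean(ys)
    # Population standard deviation of each sample.
    sd_x, sd_y = statistics.pstdev(xs), statistics.pstdev(ys)
    return ((mu_x - mu_y) ** 2 + (sd_x - sd_y) ** 2) ** 0.5


train_features = [0.0, 2.0]
test_features = [3.0, 5.0]   # same spread, mean shifted by 3
print(frechet_distance_1d(train_features, test_features))
```

A pure mean shift with unchanged spread yields a distance equal to the size of the shift.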
End to End Trainable Active Contours via Differentiable Rendering
Title | End to End Trainable Active Contours via Differentiable Rendering |
Authors | Anonymous |
Abstract | We present an image segmentation method that iteratively evolves a polygon. At each iteration, the vertices of the polygon are displaced based on the local value of a 2D shift map that is inferred from the input image via an encoder-decoder architecture. The main training loss that is used is the difference between the polygon shape and the ground truth segmentation mask. The network employs a neural renderer to create the polygon from its vertices, making the process fully differentiable. We demonstrate that our method outperforms the state of the art segmentation networks and deep active contour solutions in a variety of benchmarks, including medical imaging and aerial images. |
Tasks | Semantic Segmentation |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rkxawlHKDr |
https://openreview.net/pdf?id=rkxawlHKDr | |
PWC | https://paperswithcode.com/paper/end-to-end-trainable-active-contours-via |
Repo | |
Framework | |
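The vertex update described in the active contours abstract above can be sketched as follows. This is a simplified illustration: it looks up the shift at the nearest pixel, whereas a trainable version would need a differentiable (for example bilinear) lookup, and it leaves out the encoder-decoder network, the neural renderer, and the loss. The names `evolve_polygon` and `shift_map` are hypothetical.

```python
def evolve_polygon(vertices, shift_map, steps=1):
    """Displace each polygon vertex by the local value of a 2D shift map.

    vertices:  list of (x, y) floats in pixel coordinates.
    shift_map: shift_map[row][col] is a (dx, dy) tuple for that pixel.
    """
    height, width = len(shift_map), len(shift_map[0])
    for _ in range(steps):
        moved = []
        for x, y in vertices:
            # Nearest-pixel lookup, clamped to the image bounds.
            col = min(max(int(round(x)), 0), width - 1)
            row = min(max(int(round(y)), 0), height - 1)
            dx, dy = shift_map[row][col]
            moved.append((x + dx, y + dy))
        vertices = moved
    return vertices


# A 3x3 map that pushes every vertex one pixel to the right per iteration.
uniform_shift = [[(1.0, 0.0)] * 3 for _ in range(3)]
print(evolve_polygon([(0.0, 0.0), (1.0, 1.0)], uniform_shift, steps=2))
```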
Local Label Propagation for Large-Scale Semi-Supervised Learning
Title | Local Label Propagation for Large-Scale Semi-Supervised Learning |
Authors | Anonymous |
Abstract | A significant issue in training deep neural networks to solve supervised learning tasks is the need for large numbers of labeled datapoints. The goal of semi-supervised learning is to leverage ubiquitous unlabeled data, together with small quantities of labeled data, to achieve high task performance. Though substantial recent progress has been made in developing semi-supervised algorithms that are effective for comparatively small datasets, many of these techniques do not scale readily to the large (unlabeled) datasets characteristic of real-world applications. In this paper we introduce a novel approach to scalable semi-supervised learning, called Local Label Propagation (LLP). Extending ideas from recent work on unsupervised embedding learning, LLP first embeds datapoints, labeled and otherwise, in a common latent space using a deep neural network. It then propagates pseudolabels from known to unknown datapoints in a manner that depends on the local geometry of the embedding, taking into account both inter-point distance and local data density as a weighting on propagation likelihood. The parameters of the deep embedding are then trained to simultaneously maximize pseudolabel categorization performance as well as a metric of the clustering of datapoints within each pseudo-label group, iteratively alternating stages of network training and label propagation. We illustrate the utility of the LLP method on the ImageNet dataset, achieving results that outperform previous state-of-the-art scalable semi-supervised learning algorithms by large margins, consistently across a wide variety of training regimes. We also show that the feature representation learned with LLP transfers well to scene recognition in the Places 205 dataset. |
Tasks | Scene Recognition |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=B1x2eCNFvH |
https://openreview.net/pdf?id=B1x2eCNFvH | |
PWC | https://paperswithcode.com/paper/local-label-propagation-for-large-scale-semi-1 |
Repo | |
Framework | |
Lyceum: An efficient and scalable ecosystem for robot learning
Title | Lyceum: An efficient and scalable ecosystem for robot learning |
Authors | Anonymous |
Abstract | We introduce Lyceum, a high-performance computational ecosystem for robot learning. Lyceum is built on top of the Julia programming language and the MuJoCo physics simulator, combining the ease-of-use of a high-level programming language with the performance of native C. Lyceum is up to 10-20X faster compared to other popular abstractions like OpenAI’s Gym and DeepMind’s dm-control. This substantially reduces training time for various reinforcement learning algorithms; and is also fast enough to support real-time model predictive control with physics simulators. Lyceum has a straightforward API and supports parallel computation across multiple cores or machines. The code base, tutorials, and demonstration videos can be found at: https://sites.google.com/view/lyceum-anon. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SyxytxBFDr |
https://openreview.net/pdf?id=SyxytxBFDr | |
PWC | https://paperswithcode.com/paper/lyceum-an-efficient-and-scalable-ecosystem |
Repo | |
Framework | |
NoiGAN: NOISE AWARE KNOWLEDGE GRAPH EMBEDDING WITH GAN
Title | NoiGAN: NOISE AWARE KNOWLEDGE GRAPH EMBEDDING WITH GAN |
Authors | Anonymous |
Abstract | Knowledge graphs have gained increasing attention in recent years for their successful application to numerous tasks. Despite the rapid growth of knowledge construction, knowledge graphs still suffer from severe incompleteness and inevitably involve various kinds of errors. Several attempts have been made to complete knowledge graphs as well as to detect noise. However, none of them considers unifying these two tasks even though they are inter-dependent and can mutually boost the performance of each other. In this paper, we propose to jointly combine these two tasks with a unified Generative Adversarial Network (GAN) framework to learn noise-aware knowledge graph embeddings. Extensive experiments have demonstrated that our approach is superior to existing state-of-the-art algorithms both in regard to knowledge graph completion and error detection. |
Tasks | Graph Embedding, Knowledge Graph Completion, Knowledge Graph Embedding, Knowledge Graphs |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rkgTdkrtPH |
https://openreview.net/pdf?id=rkgTdkrtPH | |
PWC | https://paperswithcode.com/paper/noigan-noise-aware-knowledge-graph-embedding |
Repo | |
Framework | |
Inductive representation learning on temporal graphs
Title | Inductive representation learning on temporal graphs |
Authors | Anonymous |
Abstract | Inductive representation learning on temporal graphs is an important step toward scalable machine learning on real-world dynamic networks. The evolving nature of temporal dynamic graphs requires handling new nodes while learning temporal patterns. The node embeddings, which become functions of time under the temporal setting, should capture both static node features and evolving topological structures. Moreover, node and topological features may exhibit temporal patterns that are informative for prediction, of which the temporal node embeddings should also be aware. We propose the temporal graph attention (TGAT) layer to effectively aggregate temporal-topological neighborhood features as well as learning time-feature interactions. For TGAT, we use the self-attention mechanism as the building block and develop the novel functional time encoding technique based on the classical Bochner’s theorem from harmonic analysis. By stacking TGAT layers, the network learns node embeddings as functions of time and can inductively infer embeddings for both new and observed nodes whenever the graph evolves. The proposed approach handles both node classification and link prediction tasks, and can be naturally extended to aggregate edge features. We evaluate our method with transductive and inductive tasks under temporal setting with two benchmark datasets and one industrial dataset. Our TGAT model compares favorably to state-of-the-art baselines and prior temporal graph embedding approaches. |
Tasks | Graph Embedding, Link Prediction, Node Classification, Representation Learning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rJeW1yHYwH |
https://openreview.net/pdf?id=rJeW1yHYwH | |
PWC | https://paperswithcode.com/paper/inductive-representation-learning-on-temporal |
Repo | |
Framework | |
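The functional time encoding mentioned in the TGAT abstract above rests on Bochner's theorem, which says a continuous translation-invariant positive-definite kernel is the Fourier transform of a non-negative measure. A standard consequence is that cosine and sine features at a set of frequencies give an inner product that depends only on the time difference. The sketch below illustrates that property with fixed, hand-picked frequencies. In the paper the frequencies are presumably learned, so treat `freqs`, `time_encoding`, and `kernel` as assumptions of this illustration.

```python
import math


def time_encoding(t, freqs):
    """Map a timestamp to [cos(w1 t), sin(w1 t), ..., cos(wd t), sin(wd t)] / sqrt(d)."""
    scale = math.sqrt(1.0 / len(freqs))
    features = []
    for w in freqs:
        features.append(scale * math.cos(w * t))
        features.append(scale * math.sin(w * t))
    return features


def kernel(t1, t2, freqs):
    """Inner product of two encodings; equals the mean of cos(w * (t1 - t2))."""
    a, b = time_encoding(t1, freqs), time_encoding(t2, freqs)
    return sum(x * y for x, y in zip(a, b))


freqs = [0.5, 1.0, 2.0]  # illustrative values only
# Both pairs are two time units apart, so the kernel values agree.
print(kernel(1.0, 3.0, freqs), kernel(11.0, 13.0, freqs))
```

Because the kernel sees only time differences, the same encoding applies to nodes and interactions that appear at timestamps never seen in training.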
GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding
Title | GraphZoom: A Multi-level Spectral Approach for Accurate and Scalable Graph Embedding |
Authors | Anonymous |
Abstract | Graph embedding techniques have been increasingly deployed in a multitude of different applications that involve learning on non-Euclidean data. However, existing graph embedding models either fail to incorporate node attribute information during training or suffer from node attribute noise, which compromises the accuracy. Moreover, very few of them scale to large graphs due to their high computational complexity and memory usage. In this paper we propose GraphZoom, a multi-level framework for improving both accuracy and scalability of unsupervised graph embedding algorithms. GraphZoom first performs graph fusion to generate a new graph that effectively encodes the topology of the original graph and the node attribute information. This fused graph is then repeatedly coarsened into a much smaller graph by merging nodes with high spectral similarities. GraphZoom allows any existing embedding methods to be applied to the coarsened graph, before it progressively refines the embeddings obtained at the coarsest level to increasingly finer graphs. We have evaluated our approach on a number of popular graph datasets for both transductive and inductive tasks. Our experiments show that GraphZoom increases the classification accuracy and significantly reduces the run time compared to state-of-the-art unsupervised embedding methods. |
Tasks | Graph Embedding |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=r1lGO0EKDH |
https://openreview.net/pdf?id=r1lGO0EKDH | |
PWC | https://paperswithcode.com/paper/graphzoom-a-multi-level-spectral-approach-for-1 |
Repo | |
Framework | |
Hardware-aware One-Shot Neural Architecture Search in Coordinate Ascent Framework
Title | Hardware-aware One-Shot Neural Architecture Search in Coordinate Ascent Framework |
Authors | Anonymous |
Abstract | Designing accurate and efficient convolutional neural architectures for a vast amount of hardware is challenging because hardware designs are complex and diverse. This paper addresses the hardware diversity challenge in Neural Architecture Search (NAS). Unlike previous approaches that apply search algorithms on a small, human-designed search space without considering hardware diversity, we propose HURRICANE, which explores automatic hardware-aware search over a much larger search space and a multistep search scheme in a coordinate ascent framework, to generate tailored models for different types of hardware. Extensive experiments on ImageNet show that our algorithm consistently achieves a much lower inference latency with a similar or better accuracy than state-of-the-art NAS methods on three types of hardware. Remarkably, HURRICANE achieves a 76.63% top-1 accuracy on ImageNet with an inference latency of only 16.5 ms for DSP, which is a 3.4% higher accuracy and a 6.35x inference speedup than FBNet-iPhoneX. For VPU, HURRICANE achieves a 0.53% higher top-1 accuracy than Proxyless-mobile with a 1.49x speedup. Even for well-studied mobile CPU, HURRICANE achieves a 1.63% higher top-1 accuracy than FBNet-iPhoneX with a comparable inference latency. HURRICANE also reduces the training time by 54.7% on average compared to SinglePath-Oneshot. |
Tasks | Neural Architecture Search |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=BJe6BkHYDB |
https://openreview.net/pdf?id=BJe6BkHYDB | |
PWC | https://paperswithcode.com/paper/hardware-aware-one-shot-neural-architecture |
Repo | |
Framework | |
Graph Neural Networks For Multi-Image Matching
Title | Graph Neural Networks For Multi-Image Matching |
Authors | Anonymous |
Abstract | In geometric computer vision applications, multi-image feature matching gives more accurate and robust solutions compared to simple two-image matching. In this work, we formulate multi-image matching as a graph embedding problem, then use a Graph Neural Network to learn an appropriate embedding function for aligning image features. We use cycle consistency to train our network in an unsupervised fashion, since ground truth correspondence can be difficult or expensive to acquire. Geometric consistency losses are added to aid training, though unlike optimization based methods no geometric information is necessary at inference time. To the best of our knowledge, no other works have used graph neural networks for multi-image feature matching. Our experiments show that our method is competitive with other optimization based approaches. |
Tasks | Graph Embedding |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Hkgpnn4YvH |
https://openreview.net/pdf?id=Hkgpnn4YvH | |
PWC | https://paperswithcode.com/paper/graph-neural-networks-for-multi-image |
Repo | |
Framework | |
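The cycle consistency signal used for unsupervised training in the multi-image matching abstract above can be illustrated with hard matches: following correspondences from image A to B to C and back to A should return every feature to itself. The sketch below represents each pairwise matching as a permutation list. This is a simplification, since the paper works with learned embeddings rather than hard permutations, and `compose` and `cycle_error` are hypothetical helpers.

```python
def compose(match_ab, match_bc):
    """Chain two matchings: feature i in A maps to match_bc[match_ab[i]] in C."""
    return [match_bc[j] for j in match_ab]


def cycle_error(match_ab, match_bc, match_ca):
    """Fraction of features in A that do not return to themselves after A->B->C->A."""
    round_trip = compose(compose(match_ab, match_bc), match_ca)
    misses = sum(1 for i, j in enumerate(round_trip) if i != j)
    return misses / len(round_trip)


match_ab = [1, 2, 0]
match_bc = [2, 0, 1]
consistent_ca = [0, 1, 2]     # closes the cycle exactly
inconsistent_ca = [1, 0, 2]   # swaps two features, breaking the cycle
print(cycle_error(match_ab, match_bc, consistent_ca))
print(cycle_error(match_ab, match_bc, inconsistent_ca))
```

No ground-truth correspondences appear anywhere in the check, which is what makes it usable as an unsupervised training signal.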
Interpretable Network Structure for Modeling Contextual Dependency
Title | Interpretable Network Structure for Modeling Contextual Dependency |
Authors | Anonymous |
Abstract | Neural language models have achieved great success in many NLP tasks, to a large extent, due to the ability to capture contextual dependencies among terms in a text. While many efforts have been devoted to empirically explain the connection between the network hyperparameters and the ability to represent the contextual dependency, the theoretical analysis is relatively insufficient. Inspired by the recent research on the use of tensor space to explain the neural network architecture, we explore the interpretable mechanism for neural language models. Specifically, we define the concept of separation rank in the language modeling process, in order to theoretically measure the degree of contextual dependencies in a sentence. Then, we show that the lower bound of such a separation rank can reveal the quantitative relation between the network structure (e.g. depth/width) and the modeling ability for the contextual dependency. Especially, increasing the depth of the neural network can be more effective to improve the ability of modeling contextual dependency. Therefore, it is important to design an adaptive network to compute the adaptive depth in a task. Inspired by Adaptive Computation Time (ACT), we design an adaptive recurrent network based on the separation rank to model contextual dependency. Experiments on various NLP tasks have verified the proposed theoretical analysis. We also test our adaptive recurrent neural network in the sentence classification task, and the experiments show that it can achieve better results than the traditional bidirectional LSTM. |
Tasks | Language Modelling, Sentence Classification |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=BkgUB1SYPS |
https://openreview.net/pdf?id=BkgUB1SYPS | |
PWC | https://paperswithcode.com/paper/interpretable-network-structure-for-modeling |
Repo | |
Framework | |
Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation
Title | Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation |
Authors | Anonymous |
Abstract | Sequence generation models are commonly refined with reinforcement learning over user-defined metrics. However, high gradient variance hinders the practical use of this method. To stabilize this method for contextual generation of categorical sequences, we estimate the gradient by evaluating a set of correlated Monte Carlo rollouts. Due to the correlation, the number of unique rollouts is random and adaptive to model uncertainty; those rollouts naturally become baselines for each other, and hence are combined to effectively reduce gradient variance. We also demonstrate the use of correlated MC rollouts for binary-tree softmax models which reduce the high generation cost in large vocabulary scenarios, by decomposing each categorical action into a sequence of binary actions. We evaluate our methods on both neural program synthesis and image captioning. The proposed methods yield lower gradient variance and consistent improvement over related baselines. |
Tasks | Image Captioning, Program Synthesis |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=r1lOgyrKDS |
https://openreview.net/pdf?id=r1lOgyrKDS | |
PWC | https://paperswithcode.com/paper/adaptive-correlated-monte-carlo-for |
Repo | |
Framework | |
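The binary-tree softmax mentioned in the abstract above replaces one categorical choice over a large vocabulary with a short sequence of binary choices along a tree path. A minimal sketch, assuming a complete binary tree with 2**depth leaves and one logit per internal node stored in heap order, is given below. The tree layout and the names `token_prob` and `node_logits` are assumptions made for illustration. The correlated Monte Carlo gradient estimator itself is not shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def token_prob(token, depth, node_logits):
    """Probability of `token` as a product of binary decisions down the tree.

    node_logits[n] is the logit of branching right at internal node n
    (heap order, root = 0); there are 2**depth - 1 internal nodes.
    """
    node, prob = 0, 1.0
    # Read the token's bits from most to least significant: 1 means go right.
    for level in reversed(range(depth)):
        go_right = (token >> level) & 1
        p_right = sigmoid(node_logits[node])
        prob *= p_right if go_right else 1.0 - p_right
        node = 2 * node + 1 + go_right
    return prob


depth = 2
node_logits = [0.3, -1.2, 2.0]  # illustrative values for the 3 internal nodes
probs = [token_prob(t, depth, node_logits) for t in range(2 ** depth)]
print(probs, sum(probs))
```

Sampling or scoring a token touches only `depth` nodes instead of the whole vocabulary, which is where the reduction in generation cost comes from.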