Paper Group NANR 44
Transferable Recognition-Aware Image Processing
Title | Transferable Recognition-Aware Image Processing |
Authors | Anonymous |
Abstract | Recent progress in image recognition has stimulated the deployment of vision systems (e.g. image search engines) at an unprecedented scale. As a result, visual data are now often consumed not only by humans but also by machines. Meanwhile, existing image processing methods only optimize for better human perception, whereas the resulting images may not be accurately recognized by machines. This can be undesirable, e.g., the images can be improperly handled by search engines or recommendation systems. In this work, we propose simple approaches to improve machine interpretability of processed images: optimizing the recognition loss directly on the image processing network or through an intermediate transforming model, a process which we show can also be done in an unsupervised manner. Interestingly, the processing model’s ability to enhance the recognition performance can transfer when evaluated on different recognition models, even if they are of different architectures, trained on different object categories or even different recognition tasks. This makes the solutions applicable even when we do not have the knowledge about future downstream recognition models, e.g., if we are to upload the processed images to the Internet. We conduct comprehensive experiments on three image processing tasks with two downstream recognition tasks, and confirm our method brings substantial accuracy improvement on both the same recognition model and when transferring to a different one, with minimal or no loss in the image processing quality. |
Tasks | Image Retrieval, Recommendation Systems |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=r1xF7lSYDS |
https://openreview.net/pdf?id=r1xF7lSYDS | |
PWC | https://paperswithcode.com/paper/transferable-recognition-aware-image |
Repo | |
Framework | |
Towards a Speech Recognizer for Komi, an Endangered and Low-Resource Uralic Language
Title | Towards a Speech Recognizer for Komi, an Endangered and Low-Resource Uralic Language |
Authors | Nils Hjortnaes, Niko Partanen, Michael Rießler, Francis M. Tyers |
Abstract | |
Tasks | |
Published | 2020-10-01 |
URL | https://www.aclweb.org/anthology/2020.iwclul-1.5/ |
https://www.aclweb.org/anthology/2020.iwclul-1.5 | |
PWC | https://paperswithcode.com/paper/towards-a-speech-recognizer-for-komi-an |
Repo | |
Framework | |
PopSGD: Decentralized Stochastic Gradient Descent in the Population Model
Title | PopSGD: Decentralized Stochastic Gradient Descent in the Population Model |
Authors | Anonymous |
Abstract | The population model is a standard way to represent large-scale decentralized distributed systems, in which agents with limited computational power interact in randomly chosen pairs, in order to collectively solve global computational tasks. In contrast with synchronous gossip models, nodes are anonymous, lack a common notion of time, and have no control over their scheduling. In this paper, we examine whether large-scale distributed optimization can be performed in this extremely restrictive setting. We introduce and analyze a natural decentralized variant of stochastic gradient descent (SGD), called PopSGD, in which every node maintains a local parameter, and is able to compute stochastic gradients with respect to this parameter. Every pair-wise node interaction performs a stochastic gradient step at each agent, followed by averaging of the two models. We prove that, under standard assumptions, SGD can converge even in this extremely loose, decentralized setting, for both convex and non-convex objectives. Moreover, surprisingly, in the former case, the algorithm can achieve linear speedup in the number of nodes n. Our analysis leverages a new technical connection between decentralized SGD and randomized load balancing, which enables us to tightly bound the concentration of node parameters. We validate our analysis through experiments, showing that PopSGD can achieve convergence and speedup for large-scale distributed learning tasks in a supercomputing environment. |
Tasks | Distributed Optimization |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=BkgqExrYvS |
https://openreview.net/pdf?id=BkgqExrYvS | |
PWC | https://paperswithcode.com/paper/popsgd-decentralized-stochastic-gradient |
Repo | |
Framework | |
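The pairwise interaction described in the PopSGD abstract (a stochastic gradient step at each of the two agents, followed by averaging of their models) can be sketched as below. This is a minimal illustration only, not the authors' implementation: the one-dimensional quadratic objective, the Gaussian gradient noise, the step size, and the names `noisy_grad` and `pop_sgd` are all assumptions made for the example.

```python
import random


def noisy_grad(x, rng, target=3.0, noise=0.1):
    """Stochastic gradient of the assumed toy objective 0.5 * (x - target)**2."""
    return (x - target) + rng.gauss(0.0, noise)


def pop_sgd(n_nodes=16, interactions=4000, lr=0.05, seed=0):
    """Simulate random pairwise interactions between anonymous nodes."""
    rng = random.Random(seed)
    # Every node keeps its own local parameter.
    params = [rng.uniform(-5.0, 5.0) for _ in range(n_nodes)]
    for _ in range(interactions):
        # A scheduler the nodes do not control picks a random pair.
        i, j = rng.sample(range(n_nodes), 2)
        # Each agent takes one local stochastic gradient step ...
        xi = params[i] - lr * noisy_grad(params[i], rng)
        xj = params[j] - lr * noisy_grad(params[j], rng)
        # ... and then the two models are averaged.
        params[i] = params[j] = 0.5 * (xi + xj)
    return params


if __name__ == "__main__":
    final = pop_sgd()
    print(min(final), max(final))
```

In this toy setting all local parameters end up close to the assumed minimizer at 3.0; the sketch says nothing about the convergence rates or speedup claimed in the paper.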
Hierarchical Foresight: Self-Supervised Learning of Long-Horizon Tasks via Visual Subgoal Generation
Title | Hierarchical Foresight: Self-Supervised Learning of Long-Horizon Tasks via Visual Subgoal Generation |
Authors | Anonymous |
Abstract | Video prediction models combined with planning algorithms have shown promise in enabling robots to learn to perform many vision-based tasks through only self-supervision, reaching novel goals in cluttered scenes with unseen objects. However, due to the compounding uncertainty in long horizon video prediction and poor scalability of sampling-based planning optimizers, one significant limitation of these approaches is the ability to plan over long horizons to reach distant goals. To that end, we propose a framework for subgoal generation and planning, hierarchical visual foresight (HVF), which generates subgoal images conditioned on a goal image, and uses them for planning. The subgoal images are directly optimized to decompose the task into easy to plan segments, and as a result, we observe that the method naturally identifies semantically meaningful states as subgoals. Across three out of four simulated vision-based manipulation tasks, we find that our method achieves nearly a 200% performance improvement over planning without subgoals and model-free RL approaches. Further, our experiments illustrate that our approach extends to real, cluttered visual scenes. |
Tasks | Video Prediction |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=H1gzR2VKDH |
https://openreview.net/pdf?id=H1gzR2VKDH | |
PWC | https://paperswithcode.com/paper/hierarchical-foresight-self-supervised-1 |
Repo | |
Framework | |
COPHY: Counterfactual Learning of Physical Dynamics
Title | COPHY: Counterfactual Learning of Physical Dynamics |
Authors | Anonymous |
Abstract | Understanding causes and effects in mechanical systems is an essential component of reasoning in the physical world. This work poses a new problem of counterfactual learning of object mechanics from visual input. We develop the COPHY benchmark to assess the capacity of the state-of-the-art models for causal physical reasoning in a synthetic 3D environment and propose a model for learning the physical dynamics in a counterfactual setting. Having observed a mechanical experiment that involves, for example, a falling tower of blocks, a set of bouncing balls or colliding objects, we learn to predict how its outcome is affected by an arbitrary intervention on its initial conditions, such as displacing one of the objects in the scene. The alternative future is predicted given the altered past and a latent representation of the confounders learned by the model in an end-to-end fashion with no supervision. We compare against feedforward video prediction baselines and show how observing alternative experiences allows the network to capture latent physical properties of the environment, which results in significantly more accurate predictions at the level of super human performance. |
Tasks | Video Prediction |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SkeyppEFvS |
https://openreview.net/pdf?id=SkeyppEFvS | |
PWC | https://paperswithcode.com/paper/cophy-counterfactual-learning-of-physical-1 |
Repo | |
Framework | |
A closer look at the approximation capabilities of neural networks
Title | A closer look at the approximation capabilities of neural networks |
Authors | Anonymous |
Abstract | The universal approximation theorem, in one of its most general versions, says that if we consider only continuous activation functions σ, then a standard feedforward neural network with one hidden layer is able to approximate any continuous multivariate function f to any given approximation threshold ε, if and only if σ is non-polynomial. In this paper, we give a direct algebraic proof of the theorem. Furthermore we shall explicitly quantify the number of hidden units required for approximation. Specifically, if X in R^n is compact, then a neural network with n input units, m output units, and a single hidden layer with $\binom{n+d}{d}$ hidden units (independent of m and ε), can uniformly approximate any polynomial function f:X -> R^m whose total degree is at most d for each of its m coordinate functions. In the general case that f is any continuous function, we show there exists some N in O(ε^{-n}) (independent of m), such that N hidden units would suffice to approximate f. We also show that this uniform approximation property (UAP) still holds even under seemingly strong conditions imposed on the weights. We highlight several consequences: (i) For any δ > 0, the UAP still holds if we restrict all non-bias weights w in the last layer to satisfy $\lvert w \rvert < \delta$. (ii) There exists some λ > 0 (depending only on f and σ), such that the UAP still holds if we restrict all non-bias weights w in the first layer to satisfy $\lvert w \rvert > \lambda$. (iii) If the non-bias weights in the first layer are fixed and randomly chosen from a suitable range, then the UAP holds with probability 1. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rkevSgrtPr |
https://openreview.net/pdf?id=rkevSgrtPr | |
PWC | https://paperswithcode.com/paper/a-closer-look-at-the-approximation |
Repo | |
Framework | |
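The hidden-unit count stated in the abstract for polynomial targets ("n+d choose d", independent of the number of outputs m and of ε) is easy to evaluate. The helper below is only a convenience sketch; the function name is hypothetical and not taken from the paper.

```python
from math import comb


def hidden_units_for_polynomial(n_inputs, max_degree):
    """Binomial coefficient C(n + d, d): hidden units quoted in the abstract
    for approximating polynomials of total degree at most d in n variables."""
    return comb(n_inputs + max_degree, max_degree)


if __name__ == "__main__":
    for n, d in [(1, 3), (2, 3), (10, 2)]:
        print(n, d, hidden_units_for_polynomial(n, d))
```

The count equals the number of monomials of total degree at most d in n variables, which is why it grows quickly with both n and d.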
Keyword Spotter Model for Crop Pest and Disease Monitoring from Community Radio Data
Title | Keyword Spotter Model for Crop Pest and Disease Monitoring from Community Radio Data |
Authors | Anonymous |
Abstract | In societies with well-developed internet infrastructure, social media is the leading medium of communication for various social issues, especially for breaking news situations. In rural Uganda, however, public community radio is still a dominant means of news dissemination. Community radio gives audience to the general public, especially to individuals living in rural areas, and thus plays an important role in giving a voice to those living in the broadcast area. It is an avenue for participatory communication and a tool relevant in both economic and social development. This is supported by the rise to ubiquity of mobile phones providing access to phone-in or text-in talk shows. In this paper, we describe an approach to analysing the readily available community radio data with machine learning-based speech keyword spotting techniques. We identify the keywords of interest related to agriculture and build models to automatically identify these keywords from audio streams. Our contribution through these techniques is a cost-efficient and effective way to monitor food security concerns, particularly in rural areas. Through keyword spotting and radio talk show analysis, issues such as crop diseases, pests, drought and famine can be captured and fed into an early warning system for stakeholders and policy makers. |
Tasks | Keyword Spotting |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=r1gc3lBFPH |
https://openreview.net/pdf?id=r1gc3lBFPH | |
PWC | https://paperswithcode.com/paper/keyword-spotter-model-for-crop-pest-and |
Repo | |
Framework | |
Reinforcement Learning without Ground-Truth State
Title | Reinforcement Learning without Ground-Truth State |
Authors | Anonymous |
Abstract | To perform robot manipulation tasks, a low-dimensional state of the environment typically needs to be estimated. However, designing a state estimator can sometimes be difficult, especially in environments with deformable objects. An alternative is to learn an end-to-end policy that maps directly from high-dimensional sensor inputs to actions. However, if this policy is trained with reinforcement learning, then without a state estimator, it is hard to specify a reward function based on high-dimensional observations. To meet this challenge, we propose a simple indicator reward function for goal-conditioned reinforcement learning: we only give a positive reward when the robot’s observation exactly matches a target goal observation. We show that by relabeling the original goal with the achieved goal to obtain positive rewards (Andrychowicz et al., 2017), we can learn with the indicator reward function even in continuous state spaces. We propose two methods to further speed up convergence with indicator rewards: reward balancing and reward filtering. We show comparable performance between our method and an oracle which uses the ground-truth state for computing rewards. We show that our method can perform complex tasks in continuous state spaces such as rope manipulation from RGB-D images, without knowledge of the ground-truth state. |
Tasks | |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=HkeO104tPB |
https://openreview.net/pdf?id=HkeO104tPB | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-without-ground-truth-1 |
Repo | |
Framework | |
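The indicator reward and goal relabeling described in the abstract can be illustrated with a small sketch. It assumes observations are hashable values compared by exact equality and that the relabeled goal is the last observation of the trajectory; both choices, and the function names, are assumptions for illustration. The reward balancing and reward filtering steps mentioned in the abstract are not shown.

```python
def indicator_reward(observation, goal):
    """Positive reward only when the observation exactly matches the goal."""
    return 1.0 if observation == goal else 0.0


def relabel_with_achieved_goal(trajectory):
    """Replace the original goal with the observation actually reached.

    `trajectory` is a list of (obs, action, next_obs) tuples. The returned
    transitions are (obs, action, next_obs, goal, reward), where the goal is
    the final next_obs, so the last transition always earns a positive reward.
    """
    achieved = trajectory[-1][2]
    return [
        (obs, action, next_obs, achieved, indicator_reward(next_obs, achieved))
        for obs, action, next_obs in trajectory
    ]


if __name__ == "__main__":
    # Hypothetical grid-world style observations, for illustration only.
    traj = [((0, 0), "right", (1, 0)), ((1, 0), "up", (1, 1))]
    for transition in relabel_with_achieved_goal(traj):
        print(transition)
```

Without relabeling, an exact match with the original goal observation would almost never occur in a continuous space, so nearly every reward would be zero; relabeling guarantees at least one positive example per trajectory.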
Tensor Decompositions for Temporal Knowledge Base Completion
Title | Tensor Decompositions for Temporal Knowledge Base Completion |
Authors | Anonymous |
Abstract | Most algorithms for representation learning and link prediction in relational data have been designed for static data. However, the data they are applied to usually evolves with time, such as friend graphs in social networks or user interactions with items in recommender systems. This is also the case for knowledge bases, which contain facts such as (US, has president, B. Obama, [2009-2017]) that are valid only at certain points in time. For the problem of link prediction under temporal constraints, i.e., answering queries of the form (US, has president, ?, 2012), we propose a solution inspired by the canonical decomposition of tensors of order 4. We introduce new regularization schemes and present an extension of ComplEx that achieves state-of-the-art performance. Additionally, we propose a new dataset for knowledge base completion constructed from Wikidata, larger than previous benchmarks by an order of magnitude, as a new reference for evaluating temporal and non-temporal link prediction methods. |
Tasks | Knowledge Base Completion, Link Prediction, Recommendation Systems, Representation Learning |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rke2P1BFwS |
https://openreview.net/pdf?id=rke2P1BFwS | |
PWC | https://paperswithcode.com/paper/tensor-decompositions-for-temporal-knowledge |
Repo | |
Framework | |
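To make the "canonical decomposition of tensors of order 4" concrete, the sketch below scores a (subject, predicate, object, timestamp) tuple as a sum over rank components of the product of four embedding entries, then ranks candidate objects for a query with a missing object. It uses real-valued embeddings with hand-picked toy values; the paper's ComplEx extension and its regularizers are not reproduced, and all names and numbers are hypothetical.

```python
def cp4_score(subj, pred, obj, time):
    """Order-4 CP score: sum_r subj[r] * pred[r] * obj[r] * time[r]."""
    return sum(s * p * o * t for s, p, o, t in zip(subj, pred, obj, time))


def rank_objects(subj, pred, time, object_embeddings):
    """Answer a (subject, predicate, ?, timestamp) query by sorting candidates."""
    scores = {
        name: cp4_score(subj, pred, emb, time)
        for name, emb in object_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy rank-2 embeddings chosen by hand, not learned.
    subject = [1.0, 0.0]
    predicate = [1.0, 1.0]
    timestamp = [1.0, 0.0]
    candidates = {"candidate_a": [2.0, 0.0], "candidate_b": [0.5, 3.0]}
    print(rank_objects(subject, predicate, timestamp, candidates))
```

Because the timestamp embedding multiplies into every rank component, the same (subject, predicate) pair can rank objects differently at different times.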
Toward Controllable Text Content Manipulation
Title | Toward Controllable Text Content Manipulation |
Authors | Anonymous |
Abstract | Controlled generation of text is of high practical use. Recent efforts have made impressive progress in generating or editing sentences with given textual attributes (e.g., sentiment). This work studies a new practical setting of text content manipulation. Given a structured record, such as (PLAYER: Lebron, POINTS: 20, ASSISTS: 10), and a reference sentence, such as Kobe easily dropped 30 points, we aim to generate a sentence that accurately describes the full content in the record, with the same writing style (e.g., wording, transitions) of the reference. The problem combines the characteristics of data-to-text generation and style transfer, and is challenging to minimally yet effectively manipulate the text (by rewriting/adding/deleting text portions) to ensure fidelity to the structured content. We derive two datasets from the data-to-text task as our testbed, and develop a neural method with weakly supervised competing objectives and explicit content coverage constraints. Automatic and human evaluations show superiority of our approach over competitive methods including a strong rule-based baseline and prior approaches designed for style transfer. |
Tasks | Data-to-Text Generation, Style Transfer, Text Generation |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Skg7TerKPH |
https://openreview.net/pdf?id=Skg7TerKPH | |
PWC | https://paperswithcode.com/paper/toward-controllable-text-content-manipulation |
Repo | |
Framework | |
A Quality-Diversity Controllable GAN for Text Generation
Title | A Quality-Diversity Controllable GAN for Text Generation |
Authors | Anonymous |
Abstract | Text generation is a critical and difficult natural language processing task. Models based on maximum likelihood estimation (MLE) have arguably suffered from exposure bias at the inference stage, and a variety of language generative adversarial networks (GANs) that bypass this problem have therefore emerged. However, a recent study has demonstrated that MLE models can consistently outperform GAN models over the quality-diversity space under several metrics. In this paper, we propose a quality-diversity controllable language GAN. |
Tasks | Text Generation |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rJlTXxSFPr |
https://openreview.net/pdf?id=rJlTXxSFPr | |
PWC | https://paperswithcode.com/paper/a-quality-diversity-controllable-gan-for-text |
Repo | |
Framework | |
Regularly varying representation for sentence embedding
Title | Regularly varying representation for sentence embedding |
Authors | Anonymous |
Abstract | The dominant approaches to sentence representation in natural language rely on learning embeddings on massive corpora. The obtained embeddings have desirable properties such as compositionality and distance preservation (sentences with similar meanings have similar representations). In this paper, we develop a novel method for learning an embedding enjoying a dilation invariance property. We propose two algorithms. Orthrus, a classification algorithm, constrains the distribution of the embedded variable to be regularly varying, i.e. multivariate heavy-tailed, and uses Extreme Value Theory (EVT) to tackle the classification task on two separate regions: the tail and the bulk. Hydra, a text generation algorithm for dataset augmentation, leverages the invariance property of the embedding learnt by Orthrus to generate coherent sentences with a controllable attribute, e.g. positive or negative sentiment. Numerical experiments on synthetic and real text data demonstrate the relevance of the proposed framework. |
Tasks | Sentence Embedding, Text Generation |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=HyljzgHtwS |
https://openreview.net/pdf?id=HyljzgHtwS | |
PWC | https://paperswithcode.com/paper/regularly-varying-representation-for-sentence |
Repo | |
Framework | |
Learning Semantic Correspondences from Noisy Data-text Pairs by Local-to-Global Alignments
Title | Learning Semantic Correspondences from Noisy Data-text Pairs by Local-to-Global Alignments |
Authors | Anonymous |
Abstract | Learning semantic correspondences between structured data (e.g., slot-value pairs) and associated texts is a core problem for many downstream NLP applications, e.g., data-to-text generation. Recent neural generation methods require large-scale training data. However, the data-text pairs collected for training usually correspond only loosely: the texts contain additional or contradictory information compared to their paired input. In this paper, we propose a local-to-global alignment (L2GA) framework to learn semantic correspondences from loosely related data-text pairs. First, a local alignment model based on multi-instance learning is applied to build the semantic correspondences within a data-text pair. Then, a global alignment model built on top of a memory-guided conditional random field (CRF) layer is designed to exploit dependencies among alignments in the entire training corpus, where the memory is used to integrate the alignment clues provided by the local alignment model. Therefore, it is capable of inducing missing alignments for text spans that are not supported by their imperfect paired input. Experiments on a recent restaurant dataset show that our proposed method can improve the alignment accuracy and, as a by-product, can also be used to induce semantically equivalent training data-text pairs for neural generation models. |
Tasks | Data-to-Text Generation, Text Generation |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=Byx_GeSKPS |
https://openreview.net/pdf?id=Byx_GeSKPS | |
PWC | https://paperswithcode.com/paper/learning-semantic-correspondences-from-noisy |
Repo | |
Framework | |
Ordinary differential equations on graph networks
Title | Ordinary differential equations on graph networks |
Authors | Anonymous |
Abstract | Recently, various neural networks have been proposed for irregularly structured data such as graphs and manifolds. To our knowledge, all existing graph networks have discrete depth. Inspired by the neural ordinary differential equation (NODE) for data in the Euclidean domain, we extend the idea of continuous-depth models to graph data, and propose the graph ordinary differential equation (GODE). The derivative of the hidden node states is parameterized with a graph neural network, and the output states are the solution to this ordinary differential equation. We demonstrate two end-to-end methods for efficient training of GODE: (1) indirect back-propagation with the adjoint method; (2) direct back-propagation through the ODE solver, which accurately computes the gradient. We demonstrate that direct backprop outperforms the adjoint method in experiments. We then introduce a family of bijective blocks, which enables $\mathcal{O}(1)$ memory consumption. We demonstrate that GODE can be easily adapted to different existing graph neural networks and improve accuracy. We validate the performance of GODE in both semi-supervised node classification tasks and graph classification tasks. Our GODE model achieves a continuous model in time, memory efficiency, accurate gradient estimation, and generalizability with different graph networks. |
Tasks | Graph Classification, Node Classification |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=SJg9z6VFDr |
https://openreview.net/pdf?id=SJg9z6VFDr | |
PWC | https://paperswithcode.com/paper/ordinary-differential-equations-on-graph |
Repo | |
Framework | |
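The idea of a continuous-depth model on a graph, where the derivative of the hidden node states is given by a function of the graph and the output is the ODE solution, can be sketched with a fixed-step Euler solver. In the sketch below the derivative is a hand-written neighbour-averaging rule standing in for a learned graph neural network; that substitution, the solver choice, and all names are assumptions for illustration, and neither training method from the abstract is shown.

```python
def node_derivative(states, neighbours):
    """Assumed stand-in for a GNN: dh_i/dt = mean of neighbour states - h_i."""
    derivs = []
    for i, h in enumerate(states):
        nbrs = neighbours[i]
        mean_nbr = sum(states[j] for j in nbrs) / len(nbrs)
        derivs.append(mean_nbr - h)
    return derivs


def integrate_graph_ode(states, neighbours, t_end=10.0, steps=100):
    """Solve dh/dt = node_derivative(h) from t = 0 to t_end with Euler steps."""
    dt = t_end / steps
    h = list(states)
    for _ in range(steps):
        dh = node_derivative(h, neighbours)
        h = [x + dt * d for x, d in zip(h, dh)]
    return h


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 with scalar hidden states.
    adjacency = {0: [1], 1: [0, 2], 2: [1]}
    print(integrate_graph_ode([0.0, 0.0, 3.0], adjacency))
```

With this particular derivative the node states diffuse toward a common value as integration time grows, which plays the role that stacking more discrete layers plays in a conventional graph network.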
Understanding Isomorphism Bias in Graph Data Sets
Title | Understanding Isomorphism Bias in Graph Data Sets |
Authors | Anonymous |
Abstract | In recent years there has been a rapid increase in classification methods on graph-structured data. Both in graph kernels and graph neural networks, one of the implicit assumptions of successful state-of-the-art models was that incorporating graph isomorphism features into the architecture leads to better empirical performance. However, as we discover in this work, commonly used data sets for graph classification have repeating instances which cause the problem of isomorphism bias, i.e. artificially increasing the accuracy of the models by memorizing target information from the training set. This prevents fair competition of the algorithms and raises questions about the validity of the obtained results. We analyze 54 data sets, previously extensively used for graph-related tasks, on the existence of isomorphism bias, give a set of recommendations to machine learning practitioners to properly set up their models, and open source new data sets for future experiments. |
Tasks | Graph Classification |
Published | 2020-01-01 |
URL | https://openreview.net/forum?id=rJlUhhVYvS |
https://openreview.net/pdf?id=rJlUhhVYvS | |
PWC | https://paperswithcode.com/paper/understanding-isomorphism-bias-in-graph-data-1 |
Repo | |
Framework | |
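Repeating instances of the kind the abstract describes can be found by mapping each graph to a canonical form and grouping equal forms. The sketch below does this by brute force over all node permutations, which is exact but only practical for very small graphs; it ignores node and edge labels, and it is an illustrative assumption rather than the procedure used in the paper.

```python
from itertools import permutations


def canonical_form(n_nodes, edges):
    """Smallest sorted edge list over all relabelings of an undirected graph."""
    best = None
    for perm in permutations(range(n_nodes)):
        relabeled = tuple(
            sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges)
        )
        if best is None or relabeled < best:
            best = relabeled
    return best


def find_isomorphic_duplicates(graphs):
    """Return (first_index, duplicate_index) pairs for isomorphic graphs.

    `graphs` is a list of (n_nodes, edges) tuples.
    """
    seen = {}
    duplicates = []
    for idx, (n_nodes, edges) in enumerate(graphs):
        key = (n_nodes, canonical_form(n_nodes, edges))
        if key in seen:
            duplicates.append((seen[key], idx))
        else:
            seen[key] = idx
    return duplicates


if __name__ == "__main__":
    data = [
        (3, [(0, 1), (1, 2)]),          # path
        (3, [(0, 2), (2, 1)]),          # same path, relabeled
        (3, [(0, 1), (1, 2), (0, 2)]),  # triangle
    ]
    print(find_isomorphic_duplicates(data))
```

If a duplicate pair is split between the training and test sets, a model can score well on the test instance simply by memorizing its training twin, which is the bias the abstract warns about.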