Paper Group AWR 187
Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks. Soft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning. Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text. Event-based Vision: A Survey. Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Gen …
Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks
Title | Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks |
Authors | Eunwoo Kim, Chanho Ahn, Philip H. S. Torr, Songhwai Oh |
Abstract | Deep networks consume a large amount of memory by their nature. A natural question arises: can we reduce that memory requirement whilst maintaining performance? In particular, in this work we address the problem of memory efficient learning for multiple tasks. To this end, we propose a novel network architecture producing multiple networks of different configurations, termed deep virtual networks (DVNs), for different tasks. Each DVN is specialized for a single task and structured hierarchically. The hierarchical structure, which contains multiple levels of hierarchy corresponding to different numbers of parameters, enables multiple inference for different memory budgets. The building block of a deep virtual network is based on a disjoint collection of parameters of a network, which we call a unit. The lowest level of hierarchy in a deep virtual network is a unit, and higher levels of hierarchy contain lower levels’ units and other additional units. Given a budget on the number of parameters, a different level of a deep virtual network can be chosen to perform the task. A unit can be shared by different DVNs, allowing multiple DVNs in a single network. In addition, shared units provide assistance to the target task with additional knowledge learned from other tasks. This cooperative configuration of DVNs makes it possible to handle different tasks in a memory-aware manner. Our experiments show that the proposed method outperforms existing approaches for multiple tasks. Notably, ours is more efficient than others as it allows memory-aware inference for all tasks. |
Tasks | |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04562v1 |
http://arxiv.org/pdf/1904.04562v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-virtual-networks-for-memory-efficient |
Repo | https://github.com/niceday15/deep-virtual-network-cifar |
Framework | tf |
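The abstract above describes choosing a hierarchy level of a DVN to match a parameter budget: each level reuses the units of the levels below it and adds new ones. A minimal sketch of that budget-aware selection step follows; the `Unit` class, level layout, and parameter counts are hypothetical and not taken from the paper's code.

```python
from dataclasses import dataclass

@dataclass
class Unit:
    name: str
    num_params: int

def select_level(units_per_level, budget):
    """Pick the highest hierarchy level whose cumulative parameter count fits the budget.

    units_per_level[k] holds the extra units added at level k; level k reuses
    all units from levels 0..k, so parameter counts accumulate across levels.
    """
    total, best = 0, None
    for level, units in enumerate(units_per_level):
        total += sum(u.num_params for u in units)
        if total <= budget:
            best = level
        else:
            break
    return best

# Example: three levels, the lowest level being a single (possibly shared) unit.
levels = [
    [Unit("shared_base", 1_000_000)],   # level 0: unit shared with other DVNs
    [Unit("task_mid", 500_000)],        # level 1 adds task-specific capacity
    [Unit("task_top", 2_000_000)],      # level 2: full-capacity inference
]
print(select_level(levels, budget=1_800_000))  # -> 1
```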
Soft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning
Title | Soft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning |
Authors | Shichen Liu, Tianye Li, Weikai Chen, Hao Li |
Abstract | Rendering bridges the gap between 2D vision and 3D scenes by simulating the physical process of image formation. By inverting such a renderer, one can think of a learning approach to infer 3D information from 2D images. However, standard graphics renderers involve a fundamental discretization step called rasterization, which prevents the rendering process from being differentiable and hence from being learned. Unlike the state-of-the-art differentiable renderers, which only approximate the rendering gradient in back-propagation, we propose a truly differentiable rendering framework that is able to (1) directly render colorized mesh using differentiable functions and (2) back-propagate efficient supervision signals to mesh vertices and their attributes from various forms of image representations, including silhouette, shading and color images. The key to our framework is a novel formulation that views rendering as an aggregation function that fuses the probabilistic contributions of all mesh triangles with respect to the rendered pixels. Such a formulation enables our framework to flow gradients to the occluded and far-range vertices, which cannot be achieved by the previous state-of-the-art methods. We show that by using the proposed renderer, one can achieve significant improvement in 3D unsupervised single-view reconstruction both qualitatively and quantitatively. Experiments also demonstrate that our approach is able to handle the challenging tasks in image-based shape fitting, which remain nontrivial for existing differentiable renderers. |
Tasks | 3D Object Reconstruction, Single-View 3D Reconstruction |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.01786v1 |
http://arxiv.org/pdf/1904.01786v1.pdf | |
PWC | https://paperswithcode.com/paper/soft-rasterizer-a-differentiable-renderer-for |
Repo | https://github.com/ShichenLiu/SoftRas |
Framework | pytorch |
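The core idea in the abstract above is to treat rendering as an aggregation of probabilistic per-triangle contributions at each pixel. Below is a deliberately simplified, single-pixel silhouette sketch of that aggregation (sigmoid of a signed distance per triangle, fused with a product-style aggregate); it is not the paper's exact formulation, and the distances and sigma value are made-up inputs.

```python
import numpy as np

def soft_silhouette(signed_dists, sigma=1e-3):
    """Aggregate per-triangle probabilities into a soft silhouette value for one pixel.

    signed_dists: signed 2D distances from the pixel to each projected triangle
    (positive inside). The sigmoid turns each distance into a soft coverage
    probability; the aggregation fuses all triangles, so gradients also flow to
    triangles that do not strictly cover the pixel.
    """
    probs = 1.0 / (1.0 + np.exp(-signed_dists / sigma))
    return 1.0 - np.prod(1.0 - probs)

# One pixel near three triangles: inside the first, just outside the other two.
print(soft_silhouette(np.array([0.002, -0.001, -0.01])))
```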
Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text
Title | Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text |
Authors | Bidisha Samanta, Niloy Ganguly, Soumen Chakrabarti |
Abstract | Multilingual writers and speakers often alternate between two languages in a single discourse, a practice called “code-switching”. Existing sentiment detection methods are usually trained on sentiment-labeled monolingual text. Manually labeled code-switched text, especially involving minority languages, is extremely rare. Consequently, the best monolingual methods perform relatively poorly on code-switched text. We present an effective technique for synthesizing labeled code-switched text from labeled monolingual text, which is more readily available. The idea is to replace carefully selected subtrees of constituency parses of sentences in the resource-rich language with suitable token spans selected from automatic translations to the resource-poor language. By augmenting scarce human-labeled code-switched text with plentiful synthetic code-switched text, we achieve significant improvements in sentiment labeling accuracy (1.5%, 5.11%, 7.20%) for three different language pairs (English-Hindi, English-Spanish and English-Bengali). We also get significant gains for hate speech detection: 4% improvement using only synthetic text and 6% if augmented with real text. |
Tasks | Hate Speech Detection |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05725v1 |
https://arxiv.org/pdf/1906.05725v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-sentiment-detection-via-label |
Repo | https://github.com/bidishasamantakgp/2019_CSGen_ACL |
Framework | tf |
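The synthesis method in the abstract above replaces selected constituency subtrees with translated token spans while keeping the original sentiment label. The sketch below shows only the substitution step; the constituency parser, the subtree-selection heuristic, and the machine-translation system are assumed and not shown, and the span indices and Hindi tokens are hypothetical.

```python
def synthesize_code_switched(tokens, span, translated_span):
    """tokens: source-language tokens; span: (start, end) of the chosen subtree's yield;
    translated_span: tokens of the automatic translation of that yield."""
    start, end = span
    return tokens[:start] + translated_span + tokens[end:]

sent = "the movie was absolutely wonderful".split()
# Suppose the parser selected the phrase "absolutely wonderful" and MT produced Hindi tokens.
print(synthesize_code_switched(sent, (3, 5), ["bilkul", "shaandaar"]))
# -> ['the', 'movie', 'was', 'bilkul', 'shaandaar']
# The sentiment label of the original monolingual sentence is carried over unchanged.
```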
Event-based Vision: A Survey
Title | Event-based Vision: A Survey |
Authors | Guillermo Gallego, Tobi Delbruck, Garrick Orchard, Chiara Bartolozzi, Brian Taba, Andrea Censi, Stefan Leutenegger, Andrew Davison, Joerg Conradt, Kostas Daniilidis, Davide Scaramuzza |
Abstract | Event cameras are bio-inspired sensors that work radically differently from traditional cameras. Instead of capturing images at a fixed rate, they measure per-pixel brightness changes asynchronously. This results in a stream of events, which encode the time, location and sign of the brightness changes. Event cameras possess outstanding properties compared to traditional cameras: very high dynamic range (140 dB vs. 60 dB), high temporal resolution (on the order of microseconds), low power consumption, and no motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as high speed and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world. |
Tasks | Event-based vision |
Published | 2019-04-17 |
URL | https://arxiv.org/abs/1904.08405v2 |
https://arxiv.org/pdf/1904.08405v2.pdf | |
PWC | https://paperswithcode.com/paper/event-based-vision-a-survey |
Repo | https://github.com/uzh-rpg/event-based_vision_resources |
Framework | none |
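The survey abstract above describes the event stream as asynchronous tuples of time, pixel location, and polarity. A small sketch of that representation, plus one common processing step (accumulating events over a time window into a frame-like array), is shown below; the specific event values are illustrative.

```python
import numpy as np

def accumulate_events(events, height, width, t_start, t_end):
    """events: iterable of (t, x, y, polarity) tuples; polarity is +1 or -1.
    Returns a signed event-count frame for the window [t_start, t_end)."""
    frame = np.zeros((height, width), dtype=np.float32)
    for t, x, y, polarity in events:
        if t_start <= t < t_end:
            frame[y, x] += 1.0 if polarity > 0 else -1.0
    return frame

events = [(0.001, 3, 2, +1), (0.002, 3, 2, +1), (0.004, 1, 0, -1)]
print(accumulate_events(events, height=4, width=5, t_start=0.0, t_end=0.005))
```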
Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Generation
Title | Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Generation |
Authors | Amit Moryossef, Yoav Goldberg, Ido Dagan |
Abstract | Data-to-text generation can be conceptually divided into two parts: ordering and structuring the information (planning), and generating fluent language describing the information (realization). Modern neural generation systems conflate these two steps into a single end-to-end differentiable system. We propose to split the generation process into a symbolic text-planning stage that is faithful to the input, followed by a neural generation stage that focuses only on realization. For training a plan-to-text generator, we present a method for matching reference texts to their corresponding text plans. For inference time, we describe a method for selecting high-quality text plans for new inputs. We implement and evaluate our approach on the WebNLG benchmark. Our results demonstrate that decoupling text planning from neural realization indeed improves the system’s reliability and adequacy while maintaining fluent output. We observe improvements both in BLEU scores and in manual evaluations. Another benefit of our approach is the ability to output diverse realizations of the same input, paving the way to explicit control over the generated text structure. |
Tasks | Data-to-Text Generation, Graph-to-Sequence, Text Generation |
Published | 2019-04-06 |
URL | http://arxiv.org/abs/1904.03396v2 |
http://arxiv.org/pdf/1904.03396v2.pdf | |
PWC | https://paperswithcode.com/paper/step-by-step-separating-planning-from |
Repo | https://github.com/AmitMY/chimera |
Framework | none |
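The abstract above separates symbolic text planning from neural realization. The toy pipeline below mirrors that split under strong assumptions: the planner simply groups triples by subject, and the realizer is a template stub standing in for the neural generator; neither reflects the paper's actual planner or model.

```python
def plan(triples):
    """Toy planner: group input triples by subject and order groups by subject name."""
    groups = {}
    for subj, pred, obj in triples:
        groups.setdefault(subj, []).append((subj, pred, obj))
    return [groups[s] for s in sorted(groups)]

def realize(sentence_plan):
    """Stub realizer: template-based verbalization of one planned sentence."""
    subj = sentence_plan[0][0]
    facts = " and ".join(f"{pred.replace('_', ' ')} {obj}" for _, pred, obj in sentence_plan)
    return f"{subj} {facts}."

triples = [("John_Doe", "birth_place", "London"), ("John_Doe", "occupation", "engineer")]
print(" ".join(realize(s) for s in plan(triples)))
```

Keeping the plan symbolic is what makes it auditable: a bad ordering can be detected or swapped out before any text is generated.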
ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations
Title | ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations |
Authors | Ekagra Ranjan, Soumya Sanyal, Partha Pratim Talukdar |
Abstract | Graph Neural Networks (GNN) have been shown to work effectively for modeling graph structured data to solve tasks such as node classification, link prediction and graph classification. There has been some recent progress in defining the notion of pooling in graphs whereby the model tries to generate a graph-level representation by downsampling and summarizing the information present in the nodes. Existing pooling methods either fail to effectively capture the graph substructure or do not easily scale to large graphs. In this work, we propose ASAP (Adaptive Structure Aware Pooling), a sparse and differentiable pooling method that addresses the limitations of previous graph pooling architectures. ASAP utilizes a novel self-attention network along with a modified GNN formulation to capture the importance of each node in a given graph. It also learns a sparse soft cluster assignment for nodes at each layer to effectively pool the subgraphs to form the pooled graph. Through extensive experiments on multiple datasets and theoretical analysis, we motivate our choice of the components used in ASAP. Our experimental results show that combining existing GNN architectures with ASAP leads to state-of-the-art results on multiple graph classification benchmarks. ASAP has an average improvement of 4% compared to the current sparse hierarchical state-of-the-art method. |
Tasks | Graph Classification, Link Prediction, Node Classification |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07979v3 |
https://arxiv.org/pdf/1911.07979v3.pdf | |
PWC | https://paperswithcode.com/paper/asap-adaptive-structure-aware-pooling-for |
Repo | https://github.com/malllabiisc/ASAP |
Framework | pytorch |
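The abstract above combines attention-based node importance with a sparse soft cluster assignment, then keeps only the top-scoring clusters. The numpy sketch below illustrates those two ingredients on a tiny random graph; the scoring vector, assignment rule, and pooling ratio are simplifications and do not reproduce the authors' architecture.

```python
import numpy as np

def asap_style_pool(x, adj, ratio=0.5, rng=np.random.default_rng(0)):
    """x: (n, d) node features; adj: (n, n) adjacency. Returns pooled features/adjacency."""
    n, d = x.shape
    w = rng.normal(size=d)                       # stand-in for a learned attention vector
    scores = np.tanh(x @ w)                      # importance of each node / cluster centre
    k = max(1, int(ratio * n))
    keep = np.argsort(-scores)[:k]               # keep the top-k clusters
    assign = adj[:, keep] + np.eye(n)[:, keep]   # soft membership of every node in kept clusters
    assign /= assign.sum(axis=0, keepdims=True) + 1e-9
    x_pool = assign.T @ x * scores[keep, None]   # score-weighted pooled features
    adj_pool = assign.T @ adj @ assign           # coarsened adjacency of the pooled graph
    return x_pool, adj_pool

x = np.random.default_rng(1).normal(size=(6, 4))
adj = (np.random.default_rng(2).random((6, 6)) > 0.6).astype(float)
adj = np.maximum(adj, adj.T)
print(asap_style_pool(x, adj)[0].shape)          # -> (3, 4)
```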
SignalTrain: Profiling Audio Compressors with Deep Neural Networks
Title | SignalTrain: Profiling Audio Compressors with Deep Neural Networks |
Authors | Scott H. Hawley, Benjamin Colburn, Stylianos I. Mimilakis |
Abstract | In this work we present a data-driven approach for predicting the behavior of (i.e., profiling) a given non-linear audio signal processing effect (henceforth “audio effect”). Our objective is to learn a mapping function that maps the unprocessed audio to the audio processed by the effect to be profiled, using time-domain samples. To that aim, we employ a deep auto-encoder model that is conditioned on both time-domain samples and the control parameters of the target audio effect. As a test-case study, we focus on the offline profiling of two dynamic range compression audio effects, one software-based and the other analog. Compressors were chosen because they are a widely used and important set of effects and because their parameterized nonlinear time-dependent nature makes them a challenging problem for a system aiming to profile “general” audio effects. Results from our experimental procedure show that the primary functional and auditory characteristics of the compressors can be captured; however, there is still sufficient audible noise to merit further investigation before such methods are applied to real-world audio processing workflows. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11928v2 |
https://arxiv.org/pdf/1905.11928v2.pdf | |
PWC | https://paperswithcode.com/paper/signaltrain-profiling-audio-compressors-with |
Repo | https://github.com/drscotthawley/signaltrain |
Framework | pytorch |
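The abstract above trains an auto-encoder on dry/wet audio pairs, conditioned on the effect's control parameters. The PyTorch sketch below shows one way such conditioning can be wired up; the layer sizes, window length, and knob count are assumptions and this is not the paper's SignalTrain architecture.

```python
import torch
import torch.nn as nn

class ConditionedProfiler(nn.Module):
    """Maps a window of unprocessed audio to the effect-processed window,
    conditioned on the effect's control parameters (e.g. threshold and ratio)."""
    def __init__(self, window=1024, n_knobs=2, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(window, hidden), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(hidden + n_knobs, hidden), nn.ReLU(),
                                     nn.Linear(hidden, window))

    def forward(self, audio, knobs):
        z = self.encoder(audio)             # latent code of the dry audio window
        z = torch.cat([z, knobs], dim=-1)   # condition on control parameters
        return self.decoder(z)              # prediction of the processed (wet) window

model = ConditionedProfiler()
dry = torch.randn(8, 1024)                  # batch of dry audio windows
knobs = torch.rand(8, 2)                    # normalized compressor settings
wet_target = torch.randn(8, 1024)           # stand-in for the real processed audio
loss = nn.functional.mse_loss(model(dry, knobs), wet_target)
loss.backward()
```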
HighRES: Highlight-based Reference-less Evaluation of Summarization
Title | HighRES: Highlight-based Reference-less Evaluation of Summarization |
Authors | Hardy, Shashi Narayan, Andreas Vlachos |
Abstract | There has been substantial progress in summarization research enabled by the availability of novel, often large-scale, datasets and recent advances on neural network-based approaches. However, manual evaluation of the system-generated summaries is inconsistent due to the difficulty the task poses to human non-expert readers. To address this issue, we propose a novel approach for manual evaluation, Highlight-based Reference-less Evaluation of Summarization (HighRES), in which summaries are assessed by multiple annotators against the source document via manually highlighted salient content in the latter. Thus summary assessment on the source document by human judges is facilitated, while the highlights can be used for evaluating multiple systems. To validate our approach we employ crowd-workers to augment a recently proposed dataset with highlights and compare two state-of-the-art systems. We demonstrate that HighRES improves inter-annotator agreement in comparison to using the source document directly, while the highlights help emphasize differences among systems that would be ignored under other evaluation approaches. |
Tasks | |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01361v1 |
https://arxiv.org/pdf/1906.01361v1.pdf | |
PWC | https://paperswithcode.com/paper/highres-highlight-based-reference-less |
Repo | https://github.com/sheffieldnlp/highres |
Framework | none |
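The protocol in the abstract above collects per-token highlights that can then be reused to compare multiple systems. The toy function below computes a highlight-weighted unigram recall to illustrate how such highlights could weight content coverage; it is only loosely in that spirit and is not the paper's evaluation protocol or metric.

```python
from collections import Counter

def highlight_weighted_recall(summary_tokens, doc_tokens, highlight_counts):
    """highlight_counts[i]: how many annotators highlighted doc_tokens[i]."""
    weights = Counter()
    for tok, c in zip(doc_tokens, highlight_counts):
        weights[tok] += c
    total = sum(weights.values())
    if total == 0:
        return 0.0
    covered = sum(weights[t] for t in set(summary_tokens))
    return covered / total

doc = "the storm closed three major roads overnight".split()
counts = [0, 3, 2, 0, 2, 2, 1]   # annotator highlight counts per document token
print(highlight_weighted_recall("storm closed roads".split(), doc, counts))  # -> 0.7
```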
Fine-Grained Segmentation Networks: Self-Supervised Segmentation for Improved Long-Term Visual Localization
Title | Fine-Grained Segmentation Networks: Self-Supervised Segmentation for Improved Long-Term Visual Localization |
Authors | Måns Larsson, Erik Stenborg, Carl Toft, Lars Hammarstrand, Torsten Sattler, Fredrik Kahl |
Abstract | Long-term visual localization is the problem of estimating the camera pose of a given query image in a scene whose appearance changes over time. It is an important problem in practice, for example, encountered in autonomous driving. In order to gain robustness to such changes, long-term localization approaches often use semantic segmentations as an invariant scene representation, as the semantic meaning of each scene part should not be affected by seasonal and other changes. However, these representations are typically not very discriminative due to the limited number of available classes. In this paper, we propose a new neural network, the Fine-Grained Segmentation Network (FGSN), that can be used to provide image segmentations with a larger number of labels and can be trained in a self-supervised fashion. In addition, we show how FGSNs can be trained to output consistent labels across seasonal changes. We demonstrate through extensive experiments that integrating the fine-grained segmentations produced by our FGSNs into existing localization algorithms leads to substantial improvements in localization performance. |
Tasks | Autonomous Driving, Visual Localization |
Published | 2019-08-18 |
URL | https://arxiv.org/abs/1908.06387v1 |
https://arxiv.org/pdf/1908.06387v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-segmentation-networks-self |
Repo | https://github.com/maunzzz/fine-grained-segmentation-networks |
Framework | pytorch |
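The abstract above trains segmentation with many more labels than human-annotated classes by generating labels in a self-supervised way. A common realization of that idea, sketched below, is to cluster per-pixel CNN features and use the cluster indices as fine-grained pseudo-labels; the feature extractor and the cross-season consistency supervision are assumed and not shown, and this is not the authors' exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def make_fine_grained_labels(pixel_features, n_classes=100):
    """pixel_features: (num_pixels, feat_dim) array sampled from reference images.
    Returns per-pixel pseudo-labels plus the centroids used to label new pixels."""
    km = KMeans(n_clusters=n_classes, n_init=10, random_state=0).fit(pixel_features)
    return km.labels_, km.cluster_centers_

feats = np.random.default_rng(0).normal(size=(5000, 64))   # stand-in CNN features
labels, centers = make_fine_grained_labels(feats, n_classes=16)
print(labels.shape, centers.shape)                          # (5000,) (16, 64)
```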
TMIXT: A process flow for Transcribing MIXed handwritten and machine-printed Text
Title | TMIXT: A process flow for Transcribing MIXed handwritten and machine-printed Text |
Authors | Fady Medhat, Mahnaz Mohammadi, Sardar Jaf, Chris G. Willcocks, Toby P. Breckon, Peter Matthews, Andrew Stephen McGough, Georgios Theodoropoulos, Boguslaw Obara |
Abstract | Handling large corpuses of documents is of significant importance in many fields, no more so than in the areas of crime investigation and defence, where an organisation may be presented with a large volume of scanned documents which need to be processed in a finite time. However, this problem is exacerbated both by the volume of scanned documents and by the complexity of the pages, which often contain many different elements that each need to be processed and understood. Text recognition, which is a primary task of this process, is usually dependent upon the type of text, being either handwritten or machine-printed. Accordingly, the recognition involves prior classification of the text category, before deciding on the recognition method to be applied. This poses a more challenging task if a document contains both handwritten and machine-printed text. In this work, we present a generic process flow for text recognition in scanned documents containing mixed handwritten and machine-printed text without the need to classify text in advance. We realize the proposed process flow using several open-source image processing and text recognition packages. The evaluation is performed using a specially developed variant, presented in this work, of the IAM handwriting database, where we achieve an average transcription accuracy of nearly 80% for pages containing both printed and handwritten text. |
Tasks | |
Published | 2019-04-28 |
URL | http://arxiv.org/abs/1904.12387v1 |
http://arxiv.org/pdf/1904.12387v1.pdf | |
PWC | https://paperswithcode.com/paper/tmixt-a-process-flow-for-transcribing-mixed |
Repo | https://github.com/fadymedhat/TMIXT |
Framework | none |
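The process flow in the abstract above avoids classifying text as handwritten or printed up front. One way to realize that, sketched below under stated assumptions, is to transcribe each line with both a machine-print OCR engine and a handwriting recognizer and keep whichever output looks more like valid words; the engines here are stubs and the word-validity criterion is a simplification, not the paper's exact selection rule.

```python
VOCAB = {"the", "report", "was", "filed", "on", "monday"}   # stand-in dictionary

def word_validity(text):
    """Fraction of tokens that are dictionary words."""
    words = text.lower().split()
    return sum(w in VOCAB for w in words) / max(len(words), 1)

def transcribe_line(line_image, ocr_engine, htr_engine):
    """Run both recognizers and keep the more word-like transcription."""
    candidates = [ocr_engine(line_image), htr_engine(line_image)]
    return max(candidates, key=word_validity)

# Stub engines standing in for e.g. an open-source OCR package and an HTR model.
ocr = lambda img: "tne reporl was fiied on monday"
htr = lambda img: "the report was filed on monday"
print(transcribe_line(None, ocr, htr))
```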
ESA: Entity Summarization with Attention
Title | ESA: Entity Summarization with Attention |
Authors | Dongjun Wei, Yaxin Liu |
Abstract | Entity summarization aims at creating brief but informative descriptions of entities from knowledge graphs. While previous work mostly focused on traditional techniques such as clustering algorithms and graph models, we ask how to apply deep learning methods to this task. In this paper we propose ESA, a neural network with supervised attention mechanisms for entity summarization. Specifically, we calculate attention weights for facts in each entity, and rank facts to generate reliable summaries. We explore techniques to solve difficult learning problems presented by ESA, and demonstrate the effectiveness of our model in comparison with the state-of-the-art methods. Experimental results show that our model improves the quality of the entity summaries in both F-measure and MAP. |
Tasks | Knowledge Graphs |
Published | 2019-05-25 |
URL | https://arxiv.org/abs/1905.10625v2 |
https://arxiv.org/pdf/1905.10625v2.pdf | |
PWC | https://paperswithcode.com/paper/esa-entity-summarization-with-attention |
Repo | https://github.com/WeiDongjunGabriel/ESA |
Framework | pytorch |
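The abstract above scores an entity's facts with an attention mechanism and ranks them to form the summary. The PyTorch sketch below shows that scoring-and-ranking step with assumed layer sizes and fact embeddings; it is not the authors' released model.

```python
import torch
import torch.nn as nn

class FactAttention(nn.Module):
    """Assigns one attention weight to each (predicate, object) fact of an entity."""
    def __init__(self, fact_dim=64, hidden=32):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(fact_dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, 1))

    def forward(self, fact_embeddings):
        logits = self.score(fact_embeddings).squeeze(-1)
        return torch.softmax(logits, dim=-1)     # normalized weight per fact

facts = torch.randn(12, 64)                      # embeddings of 12 facts for one entity
weights = FactAttention()(facts)
summary_idx = torch.topk(weights, k=5).indices   # keep the 5 highest-weighted facts
print(summary_idx.tolist())
```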
Trinity of Pixel Enhancement: a Joint Solution for Demosaicking, Denoising and Super-Resolution
Title | Trinity of Pixel Enhancement: a Joint Solution for Demosaicking, Denoising and Super-Resolution |
Authors | Guocheng Qian, Jinjin Gu, Jimmy S. Ren, Chao Dong, Furong Zhao, Juan Lin |
Abstract | Demosaicing, denoising and super-resolution (SR) are of practical importance in digital image processing and have been studied independently in the past decades. Despite the recent improvement of learning-based image processing methods in image quality, there has been little analysis of their interactions and characteristics under a realistic setting of the mixture problem of demosaicing, denoising and SR. In existing solutions, these tasks are simply combined to obtain a high-resolution image from a low-resolution raw mosaic image, resulting in a performance drop of the final image quality. In this paper, we first rethink the mixture problem from a holistic perspective and then propose the Trinity Enhancement Network (TENet), a specially designed learning-based method for the mixture problem, which adopts a novel image processing pipeline order and a joint learning strategy. In order to obtain the correct color sampling for training, we also contribute a new dataset, namely PixelShift200, which consists of high-quality full color sampled real-world images captured using the advanced pixel shift technique. Experiments demonstrate that our TENet is superior to existing solutions from both quantitative and qualitative perspectives. Our experiments also show the necessity of the proposed PixelShift200 dataset. |
Tasks | Demosaicking, Denoising, Super-Resolution |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02538v1 |
https://arxiv.org/pdf/1905.02538v1.pdf | |
PWC | https://paperswithcode.com/paper/trinity-of-pixel-enhancement-a-joint-solution |
Repo | https://github.com/lutxyl/Deblur-Denoising-Hyperspectral |
Framework | pytorch |
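The abstract above stresses that the ordering of the three stages and their joint training matter. The minimal sketch below only shows composing configurable stages into one pipeline while keeping intermediate outputs for joint supervision; the stage modules are identity stubs, and the paper's specific ordering and losses are not asserted here.

```python
def run_pipeline(raw_mosaic, stages):
    """stages: list of (name, callable). Returns final output plus intermediates,
    so joint losses can supervise intermediate stages as well as the final image."""
    outputs = []
    x = raw_mosaic
    for name, stage in stages:
        x = stage(x)
        outputs.append((name, x))
    return x, outputs

stages = [("denoise", lambda x: x),        # stand-in learned sub-networks
          ("super_resolve", lambda x: x),
          ("demosaic", lambda x: x)]
final_rgb, intermediates = run_pipeline("raw bayer mosaic", stages)
print([name for name, _ in intermediates])
```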
MAGSAC++, a fast, reliable and accurate robust estimator
Title | MAGSAC++, a fast, reliable and accurate robust estimator |
Authors | Daniel Barath, Jana Noskova, Maksym Ivashechkin, Jiri Matas |
Abstract | A new method for robust estimation, MAGSAC++, is proposed. It introduces a new model quality (scoring) function that does not require the inlier-outlier decision, and a novel marginalization procedure formulated as an iteratively re-weighted least-squares approach. We also propose a new sampler, Progressive NAPSAC, for RANSAC-like robust estimators. Exploiting the fact that nearby points often originate from the same model in real-world data, it finds local structures earlier than global samplers. The progressive transition from local to global sampling does not suffer from the weaknesses of purely localized samplers. On six publicly available real-world datasets for homography and fundamental matrix fitting, MAGSAC++ produces results superior to state-of-the-art robust methods. It is faster, more geometrically accurate and fails less often. |
Tasks | |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05909v1 |
https://arxiv.org/pdf/1912.05909v1.pdf | |
PWC | https://paperswithcode.com/paper/magsac-a-fast-reliable-and-accurate-robust |
Repo | https://github.com/danini/magsac |
Framework | none |
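The abstract above formulates the marginalization as an iteratively re-weighted least-squares (IRLS) procedure. The toy 2D line-fitting sketch below illustrates plain IRLS only: MAGSAC++ derives its weights by marginalizing over the noise scale rather than from the simple residual-based weighting used here, so this is a stand-in, not the paper's estimator.

```python
import numpy as np

def irls_line_fit(x, y, iters=10, eps=1e-6):
    """Fit y = slope*x + intercept, re-weighting points by inverse residual each iteration."""
    w = np.ones_like(x)
    for _ in range(iters):
        A = np.stack([x, np.ones_like(x)], axis=1) * w[:, None]
        b = y * w
        slope, intercept = np.linalg.lstsq(A, b, rcond=None)[0]
        residuals = np.abs(y - (slope * x + intercept))
        w = 1.0 / (residuals + eps)      # down-weight points with large residuals
    return slope, intercept

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.05, 100)
y[:10] += 20.0                           # gross outliers
print(irls_line_fit(x, y))               # should end up close to (2.0, 1.0)
```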
Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages
Title | Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages |
Authors | Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Kai-Wei Chang, Nanyun Peng |
Abstract | Cross-lingual transfer learning has become an important weapon to battle the unavailability of annotated resources for low-resource languages. One of the fundamental techniques to transfer across languages is learning language-agnostic representations, in the form of word embeddings or contextual encodings. In this work, we propose to leverage unannotated sentences from auxiliary languages to help learn language-agnostic representations. Specifically, we explore adversarial training for learning contextual encoders that produce invariant representations across languages to facilitate cross-lingual transfer. We conduct experiments on cross-lingual dependency parsing where we train a dependency parser on a source language and transfer it to a wide range of target languages. Experiments on 28 target languages demonstrate that adversarial training significantly improves the overall transfer performance under several different settings. We conduct a careful analysis to evaluate the language-agnostic representations resulting from adversarial training. |
Tasks | Cross-Lingual Transfer, Dependency Parsing, Transfer Learning, Word Embeddings |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09265v1 |
https://arxiv.org/pdf/1909.09265v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-lingual-dependency-parsing-with-1 |
Repo | https://github.com/wasiahmad/cross_lingual_parsing |
Framework | pytorch |
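The adversarial ingredient in the abstract above trains the encoder so that a language discriminator cannot tell which language a sentence came from. The PyTorch sketch below shows one standard way to implement that, a gradient-reversal layer feeding a language classifier; the encoder and classifier are stand-in modules, and the paper's exact adversarial training scheme may differ.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips (and scales) gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

encoder = nn.Linear(100, 64)                  # stand-in contextual encoder (pooled output)
discriminator = nn.Linear(64, 3)              # predicts which of 3 languages produced the sentence

sentences = torch.randn(16, 100)              # pooled sentence representations
lang_labels = torch.randint(0, 3, (16,))
h = encoder(sentences)
lang_logits = discriminator(GradReverse.apply(h, 1.0))
adv_loss = nn.functional.cross_entropy(lang_logits, lang_labels)
adv_loss.backward()                           # encoder receives reversed gradients,
                                              # pushing it toward language-agnostic features
```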
Scalable sim-to-real transfer of soft robot designs
Title | Scalable sim-to-real transfer of soft robot designs |
Authors | Sam Kriegman, Amir Mohammadi Nasab, Dylan Shah, Hannah Steele, Gabrielle Branin, Michael Levin, Josh Bongard, Rebecca Kramer-Bottiglio |
Abstract | The manual design of soft robots and their controllers is notoriously challenging, but it could be augmented, or in some cases entirely replaced, by automated design tools. Machine learning algorithms can automatically propose, test, and refine designs in simulation, and the most promising ones can then be manufactured in reality (sim2real). However, it is currently not known how to guarantee that behavior generated in simulation can be preserved when deployed in reality. Although many previous studies have devised training protocols that facilitate sim2real transfer of control policies, little to no work has investigated the simulation-reality gap as a function of morphology. This is due in part to an overall lack of tools capable of systematically designing and rapidly manufacturing robots. Here we introduce a low-cost, open-source, and modular soft robot design and construction kit, and use it to simulate, fabricate, and measure the simulation-reality gap of minimally complex yet soft, locomoting machines. We prove the scalability of this approach by transferring an order of magnitude more robot designs from simulation to reality than any other method. The kit and its instructions can be found here: https://github.com/skriegman/sim2real4designs |
Tasks | |
Published | 2019-11-23 |
URL | https://arxiv.org/abs/1911.10290v1 |
https://arxiv.org/pdf/1911.10290v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-sim-to-real-transfer-of-soft-robot |
Repo | https://github.com/skriegman/sim2real4designs |
Framework | none |