Paper Group AWR 187
Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks. Soft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning. Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text. Event-based Vision: A Survey. Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Gen …
Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks
Title | Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks |
Authors | Eunwoo Kim, Chanho Ahn, Philip H. S. Torr, Songhwai Oh |
Abstract | Deep networks consume a large amount of memory by their nature. A natural question arises: can we reduce that memory requirement whilst maintaining performance? In particular, in this work we address the problem of memory efficient learning for multiple tasks. To this end, we propose a novel network architecture producing multiple networks of different configurations, termed deep virtual networks (DVNs), for different tasks. Each DVN is specialized for a single task and structured hierarchically. The hierarchical structure, which contains multiple levels of hierarchy corresponding to different numbers of parameters, enables multiple inference for different memory budgets. The building block of a deep virtual network is based on a disjoint collection of parameters of a network, which we call a unit. The lowest level of hierarchy in a deep virtual network is a unit, and higher levels of hierarchy contain lower levels’ units and other additional units. Given a budget on the number of parameters, a different level of a deep virtual network can be chosen to perform the task. A unit can be shared by different DVNs, allowing multiple DVNs in a single network. In addition, shared units provide assistance to the target task with additional knowledge learned from other tasks. This cooperative configuration of DVNs makes it possible to handle different tasks in a memory-aware manner. Our experiments show that the proposed method outperforms existing approaches for multiple tasks. Notably, ours is more efficient than others as it allows memory-aware inference for all tasks. |
Tasks | |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04562v1 |
http://arxiv.org/pdf/1904.04562v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-virtual-networks-for-memory-efficient |
Repo | https://github.com/niceday15/deep-virtual-network-cifar |
Framework | tf |
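The abstract above describes choosing a hierarchy level of a DVN to match a parameter budget: each level reuses the units of the levels below it and adds new ones. A minimal sketch of that budget-aware selection step follows; the `Unit` class, level layout, and parameter counts are hypothetical and not taken from the paper's code.

```python
from dataclasses import dataclass

@dataclass
class Unit:
    name: str
    num_params: int

def select_level(units_per_level, budget):
    """Pick the highest hierarchy level whose cumulative parameter count fits the budget.

    units_per_level[k] holds the extra units added at level k; level k reuses
    all units from levels 0..k, so parameter counts accumulate across levels.
    """
    total, best = 0, None
    for level, units in enumerate(units_per_level):
        total += sum(u.num_params for u in units)
        if total <= budget:
            best = level
        else:
            break
    return best

# Example: three levels, the lowest level being a single (possibly shared) unit.
levels = [
    [Unit("shared_base", 1_000_000)],   # level 0: unit shared with other DVNs
    [Unit("task_mid", 500_000)],        # level 1 adds task-specific capacity
    [Unit("task_top", 2_000_000)],      # level 2: full-capacity inference
]
print(select_level(levels, budget=1_800_000))  # -> 1
```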
Soft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning
Title | Soft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning |
Authors | Shichen Liu, Tianye Li, Weikai Chen, Hao Li |
Abstract | Rendering bridges the gap between 2D vision and 3D scenes by simulating the physical process of image formation. By inverting such a renderer, one can think of a learning approach to infer 3D information from 2D images. However, standard graphics renderers involve a fundamental discretization step called rasterization, which prevents the rendering process from being differentiable and hence from being learned. Unlike the state-of-the-art differentiable renderers, which only approximate the rendering gradient in back-propagation, we propose a truly differentiable rendering framework that is able to (1) directly render colorized mesh using differentiable functions and (2) back-propagate efficient supervision signals to mesh vertices and their attributes from various forms of image representations, including silhouette, shading and color images. The key to our framework is a novel formulation that views rendering as an aggregation function that fuses the probabilistic contributions of all mesh triangles with respect to the rendered pixels. Such a formulation enables our framework to flow gradients to the occluded and far-range vertices, which cannot be achieved by the previous state-of-the-art methods. We show that by using the proposed renderer, one can achieve significant improvement in 3D unsupervised single-view reconstruction both qualitatively and quantitatively. Experiments also demonstrate that our approach is able to handle the challenging tasks in image-based shape fitting, which remain nontrivial for existing differentiable renderers. |
Tasks | 3D Object Reconstruction, Single-View 3D Reconstruction |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.01786v1 |
http://arxiv.org/pdf/1904.01786v1.pdf | |
PWC | https://paperswithcode.com/paper/soft-rasterizer-a-differentiable-renderer-for |
Repo | https://github.com/ShichenLiu/SoftRas |
Framework | pytorch |
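The core idea in the abstract above is to treat rendering as an aggregation of probabilistic per-triangle contributions at each pixel. Below is a deliberately simplified, single-pixel silhouette sketch of that aggregation (sigmoid of a signed distance per triangle, fused with a product-style aggregate); it is not the paper's exact formulation, and the distances and sigma value are made-up inputs.

```python
import numpy as np

def soft_silhouette(signed_dists, sigma=1e-3):
    """Aggregate per-triangle probabilities into a soft silhouette value for one pixel.

    signed_dists: signed 2D distances from the pixel to each projected triangle
    (positive inside). The sigmoid turns each distance into a soft coverage
    probability; the aggregation fuses all triangles, so gradients also flow to
    triangles that do not strictly cover the pixel.
    """
    probs = 1.0 / (1.0 + np.exp(-signed_dists / sigma))
    return 1.0 - np.prod(1.0 - probs)

# One pixel near three triangles: inside the first, just outside the other two.
print(soft_silhouette(np.array([0.002, -0.001, -0.01])))
```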
Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text
Title | Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text |
Authors | Bidisha Samanta, Niloy Ganguly, Soumen Chakrabarti |
Abstract | Multilingual writers and speakers often alternate between two languages in a single discourse, a practice called “code-switching”. Existing sentiment detection methods are usually trained on sentiment-labeled monolingual text. Manually labeled code-switched text, especially involving minority languages, is extremely rare. Consequently, the best monolingual methods perform relatively poorly on code-switched text. We present an effective technique for synthesizing labeled code-switched text from labeled monolingual text, which is more readily available. The idea is to replace carefully selected subtrees of constituency parses of sentences in the resource-rich language with suitable token spans selected from automatic translations to the resource-poor language. By augmenting scarce human-labeled code-switched text with plentiful synthetic code-switched text, we achieve significant improvements in sentiment labeling accuracy (1.5%, 5.11%, 7.20%) for three different language pairs (English-Hindi, English-Spanish and English-Bengali). We also get significant gains for hate speech detection: 4% improvement using only synthetic text and 6% if augmented with real text. |
Tasks | Hate Speech Detection |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05725v1 |
https://arxiv.org/pdf/1906.05725v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-sentiment-detection-via-label |
Repo | https://github.com/bidishasamantakgp/2019_CSGen_ACL |
Framework | tf |
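The synthesis method in the abstract above replaces selected constituency subtrees with translated token spans while keeping the original sentiment label. The sketch below shows only the substitution step; the constituency parser, the subtree-selection heuristic, and the machine-translation system are assumed and not shown, and the span indices and Hindi tokens are hypothetical.

```python
def synthesize_code_switched(tokens, span, translated_span):
    """tokens: source-language tokens; span: (start, end) of the chosen subtree's yield;
    translated_span: tokens of the automatic translation of that yield."""
    start, end = span
    return tokens[:start] + translated_span + tokens[end:]

sent = "the movie was absolutely wonderful".split()
# Suppose the parser selected the phrase "absolutely wonderful" and MT produced Hindi tokens.
print(synthesize_code_switched(sent, (3, 5), ["bilkul", "shaandaar"]))
# -> ['the', 'movie', 'was', 'bilkul', 'shaandaar']
# The sentiment label of the original monolingual sentence is carried over unchanged.
```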
Event-based Vision: A Survey
Title | Event-based Vision: A Survey |
Authors | Guillermo Gallego, Tobi Delbruck, Garrick Orchard, Chiara Bartolozzi, Brian Taba, Andrea Censi, Stefan Leutenegger, Andrew Davison, Joerg Conradt, Kostas Daniilidis, Davide Scaramuzza |
Abstract | Event cameras are bio-inspired sensors that work radically differently from traditional cameras. Instead of capturing images at a fixed rate, they measure per-pixel brightness changes asynchronously. This results in a stream of events, which encode the time, location and sign of the brightness changes. Event cameras possess outstanding properties compared to traditional cameras: very high dynamic range (140 dB vs. 60 dB), high temporal resolution (on the order of microseconds), low power consumption, and no motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as high speed and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world. |
Tasks | Event-based vision |
Published | 2019-04-17 |
URL | https://arxiv.org/abs/1904.08405v2 |
https://arxiv.org/pdf/1904.08405v2.pdf | |
PWC | https://paperswithcode.com/paper/event-based-vision-a-survey |
Repo | https://github.com/uzh-rpg/event-based_vision_resources |
Framework | none |
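The survey abstract above describes the event stream as asynchronous tuples of time, pixel location, and polarity. A small sketch of that representation, plus one common processing step (accumulating events over a time window into a frame-like array), is shown below; the specific event values are illustrative.

```python
import numpy as np

def accumulate_events(events, height, width, t_start, t_end):
    """events: iterable of (t, x, y, polarity) tuples; polarity is +1 or -1.
    Returns a signed event-count frame for the window [t_start, t_end)."""
    frame = np.zeros((height, width), dtype=np.float32)
    for t, x, y, polarity in events:
        if t_start <= t < t_end:
            frame[y, x] += 1.0 if polarity > 0 else -1.0
    return frame

events = [(0.001, 3, 2, +1), (0.002, 3, 2, +1), (0.004, 1, 0, -1)]
print(accumulate_events(events, height=4, width=5, t_start=0.0, t_end=0.005))
```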
Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Generation
Title | Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Generation |
Authors | Amit Moryossef, Yoav Goldberg, Ido Dagan |
Abstract | Data-to-text generation can be conceptually divided into two parts: ordering and structuring the information (planning), and generating fluent language describing the information (realization). Modern neural generation systems conflate these two steps into a single end-to-end differentiable system. We propose to split the generation process into a symbolic text-planning stage that is faithful to the input, followed by a neural generation stage that focuses only on realization. For training a plan-to-text generator, we present a method for matching reference texts to their corresponding text plans. For inference time, we describe a method for selecting high-quality text plans for new inputs. We implement and evaluate our approach on the WebNLG benchmark. Our results demonstrate that decoupling text planning from neural realization indeed improves the system’s reliability and adequacy while maintaining fluent output. We observe improvements both in BLEU scores and in manual evaluations. Another benefit of our approach is the ability to output diverse realizations of the same input, paving the way to explicit control over the generated text structure. |
Tasks | Data-to-Text Generation, Graph-to-Sequence, Text Generation |
Published | 2019-04-06 |
URL | http://arxiv.org/abs/1904.03396v2 |
http://arxiv.org/pdf/1904.03396v2.pdf | |
PWC | https://paperswithcode.com/paper/step-by-step-separating-planning-from |
Repo | https://github.com/AmitMY/chimera |
Framework | none |
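The abstract above separates symbolic text planning from neural realization. The toy pipeline below mirrors that split under strong assumptions: the planner simply groups triples by subject, and the realizer is a template stub standing in for the neural generator; neither reflects the paper's actual planner or model.

```python
def plan(triples):
    """Toy planner: group input triples by subject and order groups by subject name."""
    groups = {}
    for subj, pred, obj in triples:
        groups.setdefault(subj, []).append((subj, pred, obj))
    return [groups[s] for s in sorted(groups)]

def realize(sentence_plan):
    """Stub realizer: template-based verbalization of one planned sentence."""
    subj = sentence_plan[0][0]
    facts = " and ".join(f"{pred.replace('_', ' ')} {obj}" for _, pred, obj in sentence_plan)
    return f"{subj} {facts}."

triples = [("John_Doe", "birth_place", "London"), ("John_Doe", "occupation", "engineer")]
print(" ".join(realize(s) for s in plan(triples)))
```

Keeping the plan symbolic is what makes it auditable: a bad ordering can be detected or swapped out before any text is generated.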
ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations
Title | ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations |
Authors | Ekagra Ranjan, Soumya Sanyal, Partha Pratim Talukdar |
Abstract | Graph Neural Networks (GNN) have been shown to work effectively for modeling graph structured data to solve tasks such as node classification, link prediction and graph classification. There has been some recent progress in defining the notion of pooling in graphs whereby the model tries to generate a graph-level representation by downsampling and summarizing the information present in the nodes. Existing pooling methods either fail to effectively capture the graph substructure or do not easily scale to large graphs. In this work, we propose ASAP (Adaptive Structure Aware Pooling), a sparse and differentiable pooling method that addresses the limitations of previous graph pooling architectures. ASAP utilizes a novel self-attention network along with a modified GNN formulation to capture the importance of each node in a given graph. It also learns a sparse soft cluster assignment for nodes at each layer to effectively pool the subgraphs to form the pooled graph. Through extensive experiments on multiple datasets and theoretical analysis, we motivate our choice of the components used in ASAP. Our experimental results show that combining existing GNN architectures with ASAP leads to state-of-the-art results on multiple graph classification benchmarks. ASAP has an average improvement of 4% compared to the current sparse hierarchical state-of-the-art method. |
Tasks | Graph Classification, Link Prediction, Node Classification |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07979v3 |
https://arxiv.org/pdf/1911.07979v3.pdf | |
PWC | https://paperswithcode.com/paper/asap-adaptive-structure-aware-pooling-for |
Repo | https://github.com/malllabiisc/ASAP |
Framework | pytorch |
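The abstract above combines attention-based node importance with a sparse soft cluster assignment, then keeps only the top-scoring clusters. The numpy sketch below illustrates those two ingredients on a tiny random graph; the scoring vector, assignment rule, and pooling ratio are simplifications and do not reproduce the authors' architecture.

```python
import numpy as np

def asap_style_pool(x, adj, ratio=0.5, rng=np.random.default_rng(0)):
    """x: (n, d) node features; adj: (n, n) adjacency. Returns pooled features/adjacency."""
    n, d = x.shape
    w = rng.normal(size=d)                       # stand-in for a learned attention vector
    scores = np.tanh(x @ w)                      # importance of each node / cluster centre
    k = max(1, int(ratio * n))
    keep = np.argsort(-scores)[:k]               # keep the top-k clusters
    assign = adj[:, keep] + np.eye(n)[:, keep]   # soft membership of every node in kept clusters
    assign /= assign.sum(axis=0, keepdims=True) + 1e-9
    x_pool = assign.T @ x * scores[keep, None]   # score-weighted pooled features
    adj_pool = assign.T @ adj @ assign           # coarsened adjacency of the pooled graph
    return x_pool, adj_pool

x = np.random.default_rng(1).normal(size=(6, 4))
adj = (np.random.default_rng(2).random((6, 6)) > 0.6).astype(float)
adj = np.maximum(adj, adj.T)
print(asap_style_pool(x, adj)[0].shape)          # -> (3, 4)
```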
SignalTrain: Profiling Audio Compressors with Deep Neural Networks
Title | SignalTrain: Profiling Audio Compressors with Deep Neural Networks |
Authors | Scott H. Hawley, Benjamin Colburn, Stylianos I. Mimilakis |
Abstract | In this work we present a data-driven approach for predicting the behavior of (i.e., profiling) a given non-linear audio signal processing effect (henceforth “audio effect”). Our objective is to learn a mapping function that maps the unprocessed audio to the audio processed by the effect to be profiled, using time-domain samples. To that aim, we employ a deep auto-encoder model that is conditioned on both time-domain samples and the control parameters of the target audio effect. As a test-case study, we focus on the offline profiling of two dynamic range compression audio effects, one software-based and the other analog. Compressors were chosen because they are a widely used and important set of effects and because their parameterized nonlinear time-dependent nature makes them a challenging problem for a system aiming to profile “general” audio effects. Results from our experimental procedure show that the primary functional and auditory characteristics of the compressors can be captured; however, there is still sufficient audible noise to merit further investigation before such methods are applied to real-world audio processing workflows. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11928v2 |
https://arxiv.org/pdf/1905.11928v2.pdf | |
PWC | https://paperswithcode.com/paper/signaltrain-profiling-audio-compressors-with |
Repo | https://github.com/drscotthawley/signaltrain |
Framework | pytorch |
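The abstract above trains an auto-encoder on dry/wet audio pairs, conditioned on the effect's control parameters. The PyTorch sketch below shows one way such conditioning can be wired up; the layer sizes, window length, and knob count are assumptions and this is not the paper's SignalTrain architecture.

```python
import torch
import torch.nn as nn

class ConditionedProfiler(nn.Module):
    """Maps a window of unprocessed audio to the effect-processed window,
    conditioned on the effect's control parameters (e.g. threshold and ratio)."""
    def __init__(self, window=1024, n_knobs=2, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(window, hidden), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(hidden + n_knobs, hidden), nn.ReLU(),
                                     nn.Linear(hidden, window))

    def forward(self, audio, knobs):
        z = self.encoder(audio)             # latent code of the dry audio window
        z = torch.cat([z, knobs], dim=-1)   # condition on control parameters
        return self.decoder(z)              # prediction of the processed (wet) window

model = ConditionedProfiler()
dry = torch.randn(8, 1024)                  # batch of dry audio windows
knobs = torch.rand(8, 2)                    # normalized compressor settings
wet_target = torch.randn(8, 1024)           # stand-in for the real processed audio
loss = nn.functional.mse_loss(model(dry, knobs), wet_target)
loss.backward()
```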
HighRES: Highlight-based Reference-less Evaluation of Summarization
Title | HighRES: Highlight-based Reference-less Evaluation of Summarization |
Authors | Hardy, Shashi Narayan, Andreas Vlachos |
Abstract | There has been substantial progress in summarization research enabled by the availability of novel, often large-scale, datasets and recent advances on neural network-based approaches. However, manual evaluation of the system-generated summaries is inconsistent due to the difficulty the task poses to human non-expert readers. To address this issue, we propose a novel approach for manual evaluation, Highlight-based Reference-less Evaluation of Summarization (HighRES), in which summaries are assessed by multiple annotators against the source document via manually highlighted salient content in the latter. Thus summary assessment on the source document by human judges is facilitated, while the highlights can be used for evaluating multiple systems. To validate our approach we employ crowd-workers to augment a recently proposed dataset with highlights and compare two state-of-the-art systems. We demonstrate that HighRES improves inter-annotator agreement in comparison to using the source document directly, while the highlights help emphasize differences among systems that would be ignored under other evaluation approaches. |
Tasks | |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01361v1 |
https://arxiv.org/pdf/1906.01361v1.pdf | |
PWC | https://paperswithcode.com/paper/highres-highlight-based-reference-less |
Repo | https://github.com/sheffieldnlp/highres |
Framework | none |
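The protocol in the abstract above collects per-token highlights that can then be reused to compare multiple systems. The toy function below computes a highlight-weighted unigram recall to illustrate how such highlights could weight content coverage; it is only loosely in that spirit and is not the paper's evaluation protocol or metric.

```python
from collections import Counter

def highlight_weighted_recall(summary_tokens, doc_tokens, highlight_counts):
    """highlight_counts[i]: how many annotators highlighted doc_tokens[i]."""
    weights = Counter()
    for tok, c in zip(doc_tokens, highlight_counts):
        weights[tok] += c
    total = sum(weights.values())
    if total == 0:
        return 0.0
    covered = sum(weights[t] for t in set(summary_tokens))
    return covered / total

doc = "the storm closed three major roads overnight".split()
counts = [0, 3, 2, 0, 2, 2, 1]   # annotator highlight counts per document token
print(highlight_weighted_recall("storm closed roads".split(), doc, counts))  # -> 0.7
```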
Fine-Grained Segmentation Networks: Self-Supervised Segmentation for Improved Long-Term Visual Localization
Title | Fine-Grained Segmentation Networks: Self-Supervised Segmentation for Improved Long-Term Visual Localization |
Authors | Måns Larsson, Erik Stenborg, Carl Toft, Lars Hammarstrand, Torsten Sattler, Fredrik Kahl |
Abstract | Long-term visual localization is the problem of estimating the camera pose of a given query image in a scene whose appearance changes over time. It is an important problem in practice, for example, encountered in autonomous driving. In order to gain robustness to such changes, long-term localization approaches often use semantic segmentations as an invariant scene representation, as the semantic meaning of each scene part should not be affected by seasonal and other changes. However, these representations are typically not very discriminative due to the limited number of available classes. In this paper, we propose a new neural network, the Fine-Grained Segmentation Network (FGSN), that can be used to provide image segmentations with a larger number of labels and can be trained in a self-supervised fashion. In addition, we show how FGSNs can be trained to output consistent labels across seasonal changes. We demonstrate through extensive experiments that integrating the fine-grained segmentations produced by our FGSNs into existing localization algorithms leads to substantial improvements in localization performance. |
Tasks | Autonomous Driving, Visual Localization |
Published | 2019-08-18 |
URL | https://arxiv.org/abs/1908.06387v1 |
https://arxiv.org/pdf/1908.06387v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-segmentation-networks-self |
Repo | https://github.com/maunzzz/fine-grained-segmentation-networks |
Framework | pytorch |
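The abstract above trains segmentation with many more labels than human-annotated classes by generating labels in a self-supervised way. A common realization of that idea, sketched below, is to cluster per-pixel CNN features and use the cluster indices as fine-grained pseudo-labels; the feature extractor and the cross-season consistency supervision are assumed and not shown, and this is not the authors' exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def make_fine_grained_labels(pixel_features, n_classes=100):
    """pixel_features: (num_pixels, feat_dim) array sampled from reference images.
    Returns per-pixel pseudo-labels plus the centroids used to label new pixels."""
    km = KMeans(n_clusters=n_classes, n_init=10, random_state=0).fit(pixel_features)
    return km.labels_, km.cluster_centers_

feats = np.random.default_rng(0).normal(size=(5000, 64))   # stand-in CNN features
labels, centers = make_fine_grained_labels(feats, n_classes=16)
print(labels.shape, centers.shape)                          # (5000,) (16, 64)
```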
TMIXT: A process flow for Transcribing MIXed handwritten and machine-printed Text
Title | TMIXT: A process flow for Transcribing MIXed handwritten and machine-printed Text |
Authors | Fady Medhat, Mahnaz Mohammadi, Sardar Jaf, Chris G. Willcocks, Toby P. Breckon, Peter Matthews, Andrew Stephen McGough, Georgios Theodoropoulos, Boguslaw Obara |
Abstract | Handling large corpuses of documents is of significant importance in many fields, no more so than in the areas of crime investigation and defence, where an organisation may be presented with a large volume of scanned documents which need to be processed in a finite time. However, this problem is exacerbated both by the volume of scanned documents and by the complexity of the pages, which often contain many different elements that each need to be processed and understood. Text recognition, which is a primary task of this process, is usually dependent upon the type of text, being either handwritten or machine-printed. Accordingly, the recognition involves prior classification of the text category, before deciding on the recognition method to be applied. This poses a more challenging task if a document contains both handwritten and machine-printed text. In this work, we present a generic process flow for text recognition in scanned documents containing mixed handwritten and machine-printed text without the need to classify text in advance. We realize the proposed process flow using several open-source image processing and text recognition packages. The evaluation is performed using a specially developed variant, presented in this work, of the IAM handwriting database, where we achieve an average transcription accuracy of nearly 80% for pages containing both printed and handwritten text. |
Tasks | |
Published | 2019-04-28 |
URL | http://arxiv.org/abs/1904.12387v1 |
http://arxiv.org/pdf/1904.12387v1.pdf | |
PWC | https://paperswithcode.com/paper/tmixt-a-process-flow-for-transcribing-mixed |
Repo | https://github.com/fadymedhat/TMIXT |
Framework | none |
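The process flow in the abstract above avoids classifying text as handwritten or printed up front. One way to realize that, sketched below under stated assumptions, is to transcribe each line with both a machine-print OCR engine and a handwriting recognizer and keep whichever output looks more like valid words; the engines here are stubs and the word-validity criterion is a simplification, not the paper's exact selection rule.

```python
VOCAB = {"the", "report", "was", "filed", "on", "monday"}   # stand-in dictionary

def word_validity(text):
    """Fraction of tokens that are dictionary words."""
    words = text.lower().split()
    return sum(w in VOCAB for w in words) / max(len(words), 1)

def transcribe_line(line_image, ocr_engine, htr_engine):
    """Run both recognizers and keep the more word-like transcription."""
    candidates = [ocr_engine(line_image), htr_engine(line_image)]
    return max(candidates, key=word_validity)

# Stub engines standing in for e.g. an open-source OCR package and an HTR model.
ocr = lambda img: "tne reporl was fiied on monday"
htr = lambda img: "the report was filed on monday"
print(transcribe_line(None, ocr, htr))
```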
ESA: Entity Summarization with Attention
Title | ESA: Entity Summarization with Attention |
Authors | Dongjun Wei, Yaxin Liu |
Abstract | Entity summarization aims at creating brief but informative descriptions of entities from knowledge graphs. While previous work mostly focused on traditional techniques such as clustering algorithms and graph models, we ask how to apply deep learning methods to this task. In this paper we propose ESA, a neural network with supervised attention mechanisms for entity summarization. Specifically, we calculate attention weights for facts in each entity, and rank facts to generate reliable summaries. We explore techniques to solve difficult learning problems presented by ESA, and demonstrate the effectiveness of our model in comparison with the state-of-the-art methods. Experimental results show that our model improves the quality of the entity summaries in both F-measure and MAP. |
Tasks | Knowledge Graphs |
Published | 2019-05-25 |
URL | https://arxiv.org/abs/1905.10625v2 |
https://arxiv.org/pdf/1905.10625v2.pdf | |
PWC | https://paperswithcode.com/paper/esa-entity-summarization-with-attention |
Repo | https://github.com/WeiDongjunGabriel/ESA |
Framework | pytorch |
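The abstract above scores an entity's facts with an attention mechanism and ranks them to form the summary. The PyTorch sketch below shows that scoring-and-ranking step with assumed layer sizes and fact embeddings; it is not the authors' released model.

```python
import torch
import torch.nn as nn

class FactAttention(nn.Module):
    """Assigns one attention weight to each (predicate, object) fact of an entity."""
    def __init__(self, fact_dim=64, hidden=32):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(fact_dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, 1))

    def forward(self, fact_embeddings):
        logits = self.score(fact_embeddings).squeeze(-1)
        return torch.softmax(logits, dim=-1)     # normalized weight per fact

facts = torch.randn(12, 64)                      # embeddings of 12 facts for one entity
weights = FactAttention()(facts)
summary_idx = torch.topk(weights, k=5).indices   # keep the 5 highest-weighted facts
print(summary_idx.tolist())
```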
Trinity of Pixel Enhancement: a Joint Solution for Demosaicking, Denoising and Super-Resolution
Title | Trinity of Pixel Enhancement: a Joint Solution for Demosaicking, Denoising and Super-Resolution |
Authors | Guocheng Qian, Jinjin Gu, Jimmy S. Ren, Chao Dong, Furong Zhao, Juan Lin |
Abstract | Demosaicing, denoising and super-resolution (SR) are of practical importance in digital image processing and have been studied independently in the past decades. Despite the recent improvement of learning-based image processing methods in image quality, there has been little analysis of their interactions and characteristics under a realistic setting of the mixture problem of demosaicing, denoising and SR. In existing solutions, these tasks are simply combined to obtain a high-resolution image from a low-resolution raw mosaic image, resulting in a performance drop of the final image quality. In this paper, we first rethink the mixture problem from a holistic perspective and then propose the Trinity Enhancement Network (TENet), a specially designed learning-based method for the mixture problem, which adopts a novel image processing pipeline order and a joint learning strategy. In order to obtain the correct color sampling for training, we also contribute a new dataset, namely PixelShift200, which consists of high-quality full color sampled real-world images captured using the advanced pixel shift technique. Experiments demonstrate that our TENet is superior to existing solutions from both quantitative and qualitative perspectives. Our experiments also show the necessity of the proposed PixelShift200 dataset. |
Tasks | Demosaicking, Denoising, Super-Resolution |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02538v1 |
https://arxiv.org/pdf/1905.02538v1.pdf | |
PWC | https://paperswithcode.com/paper/trinity-of-pixel-enhancement-a-joint-solution |
Repo | https://github.com/lutxyl/Deblur-Denoising-Hyperspectral |
Framework | pytorch |
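The abstract above stresses that the ordering of the three stages and their joint training matter. The minimal sketch below only shows composing configurable stages into one pipeline while keeping intermediate outputs for joint supervision; the stage modules are identity stubs, and the paper's specific ordering and losses are not asserted here.

```python
def run_pipeline(raw_mosaic, stages):
    """stages: list of (name, callable). Returns final output plus intermediates,
    so joint losses can supervise intermediate stages as well as the final image."""
    outputs = []
    x = raw_mosaic
    for name, stage in stages:
        x = stage(x)
        outputs.append((name, x))
    return x, outputs

stages = [("denoise", lambda x: x),        # stand-in learned sub-networks
          ("super_resolve", lambda x: x),
          ("demosaic", lambda x: x)]
final_rgb, intermediates = run_pipeline("raw bayer mosaic", stages)
print([name for name, _ in intermediates])
```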
MAGSAC++, a fast, reliable and accurate robust estimator
Title | MAGSAC++, a fast, reliable and accurate robust estimator |
Authors | Daniel Barath, Jana Noskova, Maksym Ivashechkin, Jiri Matas |
Abstract | A new method for robust estimation, MAGSAC++, is proposed. It introduces a new model quality (scoring) function that does not require the inlier-outlier decision, and a novel marginalization procedure formulated as an iteratively re-weighted least-squares approach. We also propose a new sampler, Progressive NAPSAC, for RANSAC-like robust estimators. Exploiting the fact that nearby points often originate from the same model in real-world data, it finds local structures earlier than global samplers. The progressive transition from local to global sampling does not suffer from the weaknesses of purely localized samplers. On six publicly available real-world datasets for homography and fundamental matrix fitting, MAGSAC++ produces results superior to state-of-the-art robust methods. It is faster, more geometrically accurate and fails less often. |
Tasks | |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05909v1 |
https://arxiv.org/pdf/1912.05909v1.pdf | |
PWC | https://paperswithcode.com/paper/magsac-a-fast-reliable-and-accurate-robust |
Repo | https://github.com/danini/magsac |
Framework | none |
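The abstract above formulates the marginalization as an iteratively re-weighted least-squares (IRLS) procedure. The toy 2D line-fitting sketch below illustrates plain IRLS only: MAGSAC++ derives its weights by marginalizing over the noise scale rather than from the simple residual-based weighting used here, so this is a stand-in, not the paper's estimator.

```python
import numpy as np

def irls_line_fit(x, y, iters=10, eps=1e-6):
    """Fit y = slope*x + intercept, re-weighting points by inverse residual each iteration."""
    w = np.ones_like(x)
    for _ in range(iters):
        A = np.stack([x, np.ones_like(x)], axis=1) * w[:, None]
        b = y * w
        slope, intercept = np.linalg.lstsq(A, b, rcond=None)[0]
        residuals = np.abs(y - (slope * x + intercept))
        w = 1.0 / (residuals + eps)      # down-weight points with large residuals
    return slope, intercept

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.05, 100)
y[:10] += 20.0                           # gross outliers
print(irls_line_fit(x, y))               # should end up close to (2.0, 1.0)
```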
Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages
Title | Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages |
Authors | Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Kai-Wei Chang, Nanyun Peng |
Abstract | Cross-lingual transfer learning has become an important weapon to battle the unavailability of annotated resources for low-resource languages. One of the fundamental techniques to transfer across languages is learning language-agnostic representations, in the form of word embeddings or contextual encodings. In this work, we propose to leverage unannotated sentences from auxiliary languages to help learn language-agnostic representations. Specifically, we explore adversarial training for learning contextual encoders that produce invariant representations across languages to facilitate cross-lingual transfer. We conduct experiments on cross-lingual dependency parsing where we train a dependency parser on a source language and transfer it to a wide range of target languages. Experiments on 28 target languages demonstrate that adversarial training significantly improves the overall transfer performance under several different settings. We conduct a careful analysis to evaluate the language-agnostic representations resulting from adversarial training. |
Tasks | Cross-Lingual Transfer, Dependency Parsing, Transfer Learning, Word Embeddings |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09265v1 |
https://arxiv.org/pdf/1909.09265v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-lingual-dependency-parsing-with-1 |
Repo | https://github.com/wasiahmad/cross_lingual_parsing |
Framework | pytorch |
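The adversarial ingredient in the abstract above trains the encoder so that a language discriminator cannot tell which language a sentence came from. The PyTorch sketch below shows one standard way to implement that, a gradient-reversal layer feeding a language classifier; the encoder and classifier are stand-in modules, and the paper's exact adversarial training scheme may differ.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips (and scales) gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

encoder = nn.Linear(100, 64)                  # stand-in contextual encoder (pooled output)
discriminator = nn.Linear(64, 3)              # predicts which of 3 languages produced the sentence

sentences = torch.randn(16, 100)              # pooled sentence representations
lang_labels = torch.randint(0, 3, (16,))
h = encoder(sentences)
lang_logits = discriminator(GradReverse.apply(h, 1.0))
adv_loss = nn.functional.cross_entropy(lang_logits, lang_labels)
adv_loss.backward()                           # encoder receives reversed gradients,
                                              # pushing it toward language-agnostic features
```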
Scalable sim-to-real transfer of soft robot designs
Title | Scalable sim-to-real transfer of soft robot designs |
Authors | Sam Kriegman, Amir Mohammadi Nasab, Dylan Shah, Hannah Steele, Gabrielle Branin, Michael Levin, Josh Bongard, Rebecca Kramer-Bottiglio |
Abstract | The manual design of soft robots and their controllers is notoriously challenging, but it could be augmented, or in some cases entirely replaced, by automated design tools. Machine learning algorithms can automatically propose, test, and refine designs in simulation, and the most promising ones can then be manufactured in reality (sim2real). However, it is currently not known how to guarantee that behavior generated in simulation can be preserved when deployed in reality. Although many previous studies have devised training protocols that facilitate sim2real transfer of control policies, little to no work has investigated the simulation-reality gap as a function of morphology. This is due in part to an overall lack of tools capable of systematically designing and rapidly manufacturing robots. Here we introduce a low-cost, open-source, and modular soft robot design and construction kit, and use it to simulate, fabricate, and measure the simulation-reality gap of minimally complex yet soft, locomoting machines. We prove the scalability of this approach by transferring an order of magnitude more robot designs from simulation to reality than any other method. The kit and its instructions can be found here: https://github.com/skriegman/sim2real4designs |
Tasks | |
Published | 2019-11-23 |
URL | https://arxiv.org/abs/1911.10290v1 |
https://arxiv.org/pdf/1911.10290v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-sim-to-real-transfer-of-soft-robot |
Repo | https://github.com/skriegman/sim2real4designs |
Framework | none |