January 28, 2020


Paper Group ANR 975


Serif or Sans: Visual Font Analytics on Book Covers and Online Advertisements


Title	Serif or Sans: Visual Font Analytics on Book Covers and Online Advertisements
Authors	Yuto Shinahara, Takuro Karamatsu, Daisuke Harada, Kota Yamaguchi, Seiichi Uchida
Abstract	In this paper, we conduct a large-scale study of font statistics in book covers and online advertisements. Through the statistical study, we try to understand how graphic designers relate fonts and content genres and identify the relationship between font styles, colors, and genres. We propose an automatic approach to extract font information from graphic designs by applying a sequence of character detection, style classification, and clustering techniques to the graphic designs. The extracted font information is accumulated together with genre information, such as romance or business, for further trend analysis. Through our unique empirical study, we show that the collected font statistics reveal interesting trends in terms of how typographic design represents the impression and the atmosphere of the content genres.
Tasks
Published	2019-06-24
URL	https://arxiv.org/abs/1906.10269v2
PDF	https://arxiv.org/pdf/1906.10269v2.pdf
PWC	https://paperswithcode.com/paper/serif-or-sans-visual-font-analytics-on-book
Repo
Framework

Robustifying deep networks for image segmentation


Title	Robustifying deep networks for image segmentation
Authors	Zheng Liu, Jinnian Zhang, Varun Jog, Po-Ling Loh, Alan B McMillan
Abstract	Purpose: The purpose of this study is to investigate the robustness of a commonly-used convolutional neural network for image segmentation with respect to visually-subtle adversarial perturbations, and suggest new methods to make these networks more robust to such perturbations. Materials and Methods: In this retrospective study, the accuracy of brain tumor segmentation was studied in subjects with low- and high-grade gliomas. A three-dimensional UNet model was implemented to segment four different MR series (T1-weighted, post-contrast T1-weighted, T2-weighted, and T2-weighted FLAIR) into four pixelwise labels (Gd-enhancing tumor, peritumoral edema, necrotic and non-enhancing tumor, and background). We developed attack strategies based on the Fast Gradient Sign Method (FGSM), iterative FGSM (i-FGSM), and targeted iterative FGSM (ti-FGSM) to produce effective attacks. Additionally, we explored the effectiveness of distillation and adversarial training via data augmentation to counteract adversarial attacks. Robustness was measured by comparing the Dice coefficient for each attack method using Wilcoxon signed-rank tests. Results: Attacks based on FGSM, i-FGSM, and ti-FGSM were effective in significantly reducing the quality of image segmentation with reductions in Dice coefficient by up to 65%. For attack defenses, distillation performed significantly better than adversarial training approaches. However, all defense approaches performed worse compared to unperturbed test images. Conclusion: Segmentation networks can be adversely affected by targeted attacks that introduce visually minor (and potentially undetectable) modifications to existing images. With an increasing interest in applying deep learning techniques to medical imaging data, it is important to quantify the ramifications of adversarial inputs (either intentional or unintentional).
Tasks	Brain Tumor Segmentation, Data Augmentation, Semantic Segmentation
Published	2019-08-01
URL	https://arxiv.org/abs/1908.00656v1
PDF	https://arxiv.org/pdf/1908.00656v1.pdf
PWC	https://paperswithcode.com/paper/robustifying-deep-networks-for-image
Repo
Framework
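
The attack and the metric named in the abstract above can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it swaps the 3D UNet for a hypothetical one-pixel logistic classifier over four input channels (standing in for the four MR series), applies a single FGSM step, and defines the Dice coefficient for binary masks. All weights, inputs, and the epsilon value are made-up illustration values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, pixel):
    """Foreground probability of a toy one-pixel logistic 'segmenter' (assumption: stands in for a UNet)."""
    return sigmoid(sum(w * x for w, x in zip(weights, pixel)) + bias)

def bce_loss(weights, bias, pixel, label):
    """Binary cross-entropy between the predicted probability and the 0/1 label."""
    p = predict(weights, bias, pixel)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def input_gradient(weights, bias, pixel, label):
    """Gradient of the loss with respect to the input channels; for this model it is (p - y) * w."""
    p = predict(weights, bias, pixel)
    return [(p - label) * w for w in weights]

def sign(value):
    return (value > 0) - (value < 0)

def fgsm(weights, bias, pixel, label, epsilon):
    """One FGSM step: move every input channel by epsilon in the direction that increases the loss."""
    grad = input_gradient(weights, bias, pixel, label)
    return [x + epsilon * sign(g) for x, g in zip(pixel, grad)]

def dice(pred_mask, true_mask):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for flat binary masks; defined as 1.0 when both are empty."""
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    return 1.0 if total == 0 else 2.0 * intersection / total

# Hypothetical illustration values.
weights, bias = [0.8, -0.5, 0.3, 1.1], -0.2
pixel, label = [0.6, 0.1, 0.4, 0.7], 1

clean_loss = bce_loss(weights, bias, pixel, label)
adv_pixel = fgsm(weights, bias, pixel, label, epsilon=0.1)
adv_loss = bce_loss(weights, bias, adv_pixel, label)
print(clean_loss, adv_loss)
print(dice([1, 1, 0, 0], [1, 0, 0, 0]))
```

The iterative variants mentioned in the abstract repeat this step with a smaller epsilon; that loop is omitted here.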

Single Image Super-resolution via Dense Blended Attention Generative Adversarial Network for Clinical Diagnosis


Title	Single Image Super-resolution via Dense Blended Attention Generative Adversarial Network for Clinical Diagnosis
Authors	Kewen Liu, Yuan Ma, Hongxia Xiong, Zejun Yan, Zhijun Zhou, Chaoyang Liu, Panpan Fang, Xiaojun Li, Yalei Chen
Abstract	During the training phase, more connections (e.g. channel concatenation in the last layer of DenseNet) mean more occupied GPU memory and lower GPU utilization, requiring more training time. The increase in training time also hinders the practical deployment of SR algorithms. This is why we abandoned DenseNet as the basic network. Furthermore, we abandoned this paper because it is limited to medical images. Please view our latest work, applied to general images, at arXiv:1911.03464.
Tasks	Image Super-Resolution, Super-Resolution
Published	2019-06-15
URL	https://arxiv.org/abs/1906.06575v4
PDF	https://arxiv.org/pdf/1906.06575v4.pdf
PWC	https://paperswithcode.com/paper/single-image-super-resolution-via-dense
Repo
Framework

Big Data Goes Small: Real-Time Spectrum-Driven Embedded Wireless Networking Through Deep Learning in the RF Loop


Title	Big Data Goes Small: Real-Time Spectrum-Driven Embedded Wireless Networking Through Deep Learning in the RF Loop
Authors	Francesco Restuccia, Tommaso Melodia
Abstract	The explosion of 5G networks and the Internet of Things will result in an exceptionally crowded RF environment, where techniques such as spectrum sharing and dynamic spectrum access will become essential components of the wireless communication process. In this vision, wireless devices must be able to (i) learn to autonomously extract knowledge from the spectrum on-the-fly; and (ii) react in real time to the inferred spectrum knowledge by appropriately changing communication parameters, including frequency band, symbol modulation, coding rate, among others. Traditional CPU-based machine learning suffers from high latency, and requires application-specific and computationally-intensive feature extraction/selection algorithms. In this paper, we present RFLearn, the first system enabling spectrum knowledge extraction from unprocessed I/Q samples by deep learning directly in the RF loop. RFLearn provides (i) a complete hardware/software architecture where the CPU, radio transceiver and learning/actuation circuits are tightly connected for maximum performance; and (ii) a learning circuit design framework where the latency vs. hardware resource consumption trade-off can be explored. We implement and evaluate the performance of RFLearn on a custom software-defined radio built on a system-on-chip (SoC) ZYNQ-7000 device mounting AD9361 radio transceivers and VERT2450 antennas. We showcase the capabilities of RFLearn by applying it to solving the fundamental problems of modulation and OFDM parameter recognition. Experimental results reveal that RFLearn decreases latency and power by about 17x and 15x with respect to a software-based solution, with a comparatively low hardware resource consumption.
Tasks
Published	2019-03-12
URL	http://arxiv.org/abs/1903.05460v1
PDF	http://arxiv.org/pdf/1903.05460v1.pdf
PWC	https://paperswithcode.com/paper/big-data-goes-small-real-time-spectrum-driven
Repo
Framework

Deep Set-to-Set Matching and Learning


Title	Deep Set-to-Set Matching and Learning
Authors	Yuki Saito, Takuma Nakamura, Hirotaka Hachiya, Kenji Fukumizu
Abstract	Matching two sets of items, called the set-to-set matching problem, has recently been raised. The difficulties of set-to-set matching over ordinary data matching lie in the exchangeability in 1) set-feature extraction and 2) set-matching score; the pair of sets and the items in each set should be exchangeable. In this paper, we propose a deep learning architecture for set-to-set matching that overcomes the above difficulties, including two novel modules: 1) a cross-set transformation and 2) a cross-similarity function. The former provides the exchangeable set-feature through interactions between two sets in intermediate layers, and the latter provides the exchangeable set matching through calculating the cross-feature similarity of items between two sets. We evaluate the methods through experiments with two industrial applications, fashion set recommendation and group re-identification. Through these experiments, we show that the proposed methods perform better than a baseline given by an extension of the Set Transformer, the state-of-the-art set-input function.
Tasks
Published	2019-10-22
URL	https://arxiv.org/abs/1910.09972v1
PDF	https://arxiv.org/pdf/1910.09972v1.pdf
PWC	https://paperswithcode.com/paper/deep-set-to-set-matching-and-learning
Repo
Framework
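
The exchangeability requirement in the abstract above (the score must not change when items are reordered within a set or when the two sets are swapped) can be checked on a very simple score. The sketch below is an illustration only and is not the paper's cross-similarity function: it assumes items are plain feature vectors and averages the cosine similarity over all cross-set pairs. The function names and example vectors are hypothetical.

```python
import math
from itertools import product

def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def set_matching_score(set_x, set_y):
    """Mean cosine similarity over every (x, y) pair drawn from the two sets.

    Averaging over all cross-set pairs makes the score independent of item
    order, and the symmetry of cosine similarity makes it independent of
    which set is passed first.
    """
    pairs = list(product(set_x, set_y))
    return sum(cosine(x, y) for x, y in pairs) / len(pairs)

# Hypothetical item features for two small sets.
outfit_a = [[1.0, 0.2, 0.0], [0.9, 0.1, 0.3], [0.4, 0.8, 0.1]]
outfit_b = [[0.8, 0.3, 0.1], [0.1, 0.9, 0.2]]

score = set_matching_score(outfit_a, outfit_b)
score_reordered = set_matching_score(outfit_a[::-1], outfit_b[::-1])
score_swapped = set_matching_score(outfit_b, outfit_a)
print(score, score_reordered, score_swapped)
```

The three printed values agree up to floating-point rounding, which is the property the abstract calls exchangeability.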

GECOR: An End-to-End Generative Ellipsis and Co-reference Resolution Model for Task-Oriented Dialogue


Title	GECOR: An End-to-End Generative Ellipsis and Co-reference Resolution Model for Task-Oriented Dialogue
Authors	Jun Quan, Deyi Xiong, Bonnie Webber, Changjian Hu
Abstract	Ellipsis and co-reference are common and ubiquitous, especially in multi-turn dialogues. In this paper, we treat the resolution of ellipsis and co-reference in dialogue as a problem of generating omitted or referred expressions from the dialogue context. We therefore propose a unified end-to-end Generative Ellipsis and CO-reference Resolution model (GECOR) in the context of dialogue. The model can generate a new pragmatically complete user utterance by alternating the generation and copy mode for each user utterance. A multi-task learning framework is further proposed to integrate the GECOR into an end-to-end task-oriented dialogue. In order to train both the GECOR and the multi-task learning framework, we manually construct a new dataset on the basis of the public dataset CamRest676 with both ellipsis and co-reference annotation. On this dataset, intrinsic evaluations on the resolution of ellipsis and co-reference show that the GECOR model significantly outperforms the sequence-to-sequence (seq2seq) baseline model in terms of EM, BLEU and F1, while extrinsic evaluations on the downstream dialogue task demonstrate that our multi-task learning framework with GECOR achieves a higher success rate of task completion than TSCP, a state-of-the-art end-to-end task-oriented dialogue model.
Tasks	Multi-Task Learning
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12086v1
PDF	https://arxiv.org/pdf/1909.12086v1.pdf
PWC	https://paperswithcode.com/paper/gecor-an-end-to-end-generative-ellipsis-and
Repo
Framework

Empirical Analysis of Multi-Task Learning for Reducing Model Bias in Toxic Comment Detection


Title	Empirical Analysis of Multi-Task Learning for Reducing Model Bias in Toxic Comment Detection
Authors	Ameya Vaidya, Feng Mai, Yue Ning
Abstract	With the recent rise of toxicity in online conversations on social media platforms, using modern machine learning algorithms for toxic comment detection has become a central focus of many online applications. Researchers and companies have developed a variety of models to identify toxicity in online conversations, reviews, or comments with mixed successes. However, many existing approaches have learned to incorrectly associate non-toxic comments that have certain trigger-words (e.g. gay, lesbian, black, muslim) as a potential source of toxicity. In this paper, we evaluate several state-of-the-art models with the specific focus of reducing model bias towards these commonly-attacked identity groups. We propose a multi-task learning model with an attention layer that jointly learns to predict the toxicity of a comment as well as the identities present in the comments in order to reduce this bias. We then compare our model to an array of shallow and deep-learning models using metrics designed especially to test for unintended model bias within these identity groups.
Tasks	Multi-Task Learning
Published	2019-09-21
URL	https://arxiv.org/abs/1909.09758v3
PDF	https://arxiv.org/pdf/1909.09758v3.pdf
PWC	https://paperswithcode.com/paper/190909758
Repo
Framework
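
The joint objective described in the abstract above (predicting toxicity together with the identities present in a comment) is commonly written as a weighted sum of per-task losses. The sketch below assumes that form; the abstract does not state the actual loss, so the averaging over identity labels and the `identity_weight` parameter are assumptions, and the attention layer is omitted entirely.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one prediction, clipped to avoid log(0)."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def multitask_loss(toxicity_prob, toxicity_label,
                   identity_probs, identity_labels, identity_weight=0.5):
    """Toxicity loss plus a weighted mean of the identity-detection losses.

    identity_weight is a hypothetical hyperparameter that trades off the
    auxiliary identity task against the main toxicity task.
    """
    toxicity_term = bce(toxicity_prob, toxicity_label)
    identity_term = sum(
        bce(p, y) for p, y in zip(identity_probs, identity_labels)
    ) / len(identity_probs)
    return toxicity_term + identity_weight * identity_term

# Hypothetical non-toxic comment that mentions the first of three identity groups.
identity_labels = [1, 0, 0]
good_identity_probs = [0.9, 0.1, 0.05]
poor_identity_probs = [0.2, 0.6, 0.7]

loss_good = multitask_loss(0.1, 0, good_identity_probs, identity_labels)
loss_poor = multitask_loss(0.1, 0, poor_identity_probs, identity_labels)
print(loss_good, loss_poor)
```

With the same toxicity prediction, the loss is lower when the identity mentions are recognised correctly, which is the signal the auxiliary task contributes during training.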

Multi-Dimensional Explanation of Ratings from Reviews


Title	Multi-Dimensional Explanation of Ratings from Reviews
Authors	Diego Antognini, Claudiu Musat, Boi Faltings
Abstract	Automated predictions require explanations to be interpretable by humans. However, neural methods generally offer little transparency, and interpretability often comes at the cost of performance. In this paper, we consider explaining multi-aspect sentiments with text snippets from reviews, which suffice to make the prediction. Earlier work used attention mechanisms as a way of finding words that predict the sentiment towards a specific aspect and improving recommendation or summarization models. In our work, we propose a neural model that generates, in an unsupervised manner, probabilistic multi-dimensional masks that are interpretable and predict multi-aspect sentiment ratings. We show how using multi-task learning improves both interpretability and F1 scores. Our evaluation shows that on two datasets in different domains, our model outperforms strong baselines and generates masks that are strong feature predictors and have a meaningful interpretation.
Tasks	Multi-Task Learning, Sentiment Analysis
Published	2019-09-25
URL	https://arxiv.org/abs/1909.11386v2
PDF	https://arxiv.org/pdf/1909.11386v2.pdf
PWC	https://paperswithcode.com/paper/multi-dimensional-explanation-of-reviews
Repo
Framework

Reinforcement Learning based Curriculum Optimization for Neural Machine Translation


Title	Reinforcement Learning based Curriculum Optimization for Neural Machine Translation
Authors	Gaurav Kumar, George Foster, Colin Cherry, Maxim Krikun
Abstract	We consider the problem of making efficient use of heterogeneous training data in neural machine translation (NMT). Specifically, given a training dataset with a sentence-level feature such as noise, we seek an optimal curriculum, or order for presenting examples to the system during training. Our curriculum framework allows examples to appear an arbitrary number of times, and thus generalizes data weighting, filtering, and fine-tuning schemes. Rather than relying on prior knowledge to design a curriculum, we use reinforcement learning to learn one automatically, jointly with the NMT system, in the course of a single training run. We show that this approach can beat uniform and filtering baselines on Paracrawl and WMT English-to-French datasets by up to +3.4 BLEU, and match the performance of a hand-designed, state-of-the-art curriculum.
Tasks	Machine Translation
Published	2019-02-28
URL	http://arxiv.org/abs/1903.00041v1
PDF	http://arxiv.org/pdf/1903.00041v1.pdf
PWC	https://paperswithcode.com/paper/reinforcement-learning-based-curriculum
Repo
Framework

Polystore++: Accelerated Polystore System for Heterogeneous Workloads


Title	Polystore++: Accelerated Polystore System for Heterogeneous Workloads
Authors	Rekha Singhal, Nathan Zhang, Luigi Nardi, Muhammad Shahbaz, Kunle Olukotun
Abstract	Modern real-time business analytics consist of heterogeneous workloads (e.g., database queries, graph processing, and machine learning). These analytic applications need programming environments that can capture all aspects of the constituent workloads (including data models they work on and movement of data across processing engines). Polystore systems suit such applications; however, these systems currently execute on CPUs and the slowdown of Moore’s Law means they cannot meet the performance and efficiency requirements of modern workloads. We envision Polystore++, an architecture to accelerate existing polystore systems using hardware accelerators (e.g., FPGAs, CGRAs, and GPUs). Polystore++ systems can achieve high performance at low power by identifying and offloading components of a polystore system that are amenable to acceleration using specialized hardware. Building a Polystore++ system is challenging and introduces new research problems motivated by the use of hardware accelerators (e.g., optimizing and mapping query plans across heterogeneous computing units and exploiting hardware pipelining and parallelism to improve performance). In this paper, we discuss these challenges in detail and list possible approaches to address these problems.
Tasks
Published	2019-05-24
URL	https://arxiv.org/abs/1905.10336v1
PDF	https://arxiv.org/pdf/1905.10336v1.pdf
PWC	https://paperswithcode.com/paper/polystore-accelerated-polystore-system-for
Repo
Framework

Random Forest as a Tumour Genetic Marker Extractor


Title	Random Forest as a Tumour Genetic Marker Extractor
Authors	Raquel Pérez-Arnal, Dario Garcia-Gasulla, David Torrents, Ferran Parés, Ulises Cortés, Jesús Labarta, Eduard Ayguadé
Abstract	Finding tumour genetic markers is essential to biomedicine due to their relevance for cancer detection and therapy development. In this paper, we explore a recently released dataset of chromosome rearrangements in 2,586 cancer patients, where different sorts of alterations have been detected. Using a Random Forest classifier, we evaluate the relevance of several features (some directly available in the original data, some engineered by us) related to chromosome rearrangements. This evaluation results in a set of potential tumour genetic markers, some of which are validated in the bibliography, while others are potentially novel.
Tasks
Published	2019-11-26
URL	https://arxiv.org/abs/1911.11471v1
PDF	https://arxiv.org/pdf/1911.11471v1.pdf
PWC	https://paperswithcode.com/paper/random-forest-as-a-tumour-genetic-marker
Repo
Framework

COCO: The Large Scale Black-Box Optimization Benchmarking (bbob-largescale) Test Suite


Title	COCO: The Large Scale Black-Box Optimization Benchmarking (bbob-largescale) Test Suite
Authors	Ouassim Elhara, Konstantinos Varelas, Duc Nguyen, Tea Tusar, Dimo Brockhoff, Nikolaus Hansen, Anne Auger
Abstract	The bbob-largescale test suite, containing 24 single-objective functions in continuous domain, extends the well-known single-objective noiseless bbob test suite, which has been used since 2009 in the BBOB workshop series, to large dimension. The core idea is to make the rotational transformations R, Q in search space that appear in the bbob test suite computationally cheaper while retaining some desired properties. This documentation presents an approach that replaces a full rotational transformation with a combination of a block-diagonal matrix and two permutation matrices in order to construct test functions whose computational and memory costs scale linearly in the dimension of the problem.
Tasks
Published	2019-03-15
URL	http://arxiv.org/abs/1903.06396v2
PDF	http://arxiv.org/pdf/1903.06396v2.pdf
PWC	https://paperswithcode.com/paper/coco-the-large-scale-black-box-optimization
Repo
Framework
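
The construction described in the abstract above, a block-diagonal matrix sandwiched between two permutations, can be applied to a vector without ever forming a dense d-by-d matrix. The sketch below illustrates that idea under simplifying assumptions: the blocks are hypothetical 2-by-2 rotations and the permutations are hand-picked, whereas the actual suite defines its own block sizes and permutation generation, which are not reproduced here.

```python
import math

def permute(vector, order):
    """Apply a permutation given as an index list: output[i] = vector[order[i]]."""
    return [vector[j] for j in order]

def apply_block_diagonal(blocks, vector):
    """Multiply a block-diagonal matrix by a vector, one small block at a time.

    The cost is the sum of the squared block sizes, so it grows linearly in
    the dimension when the block size is held fixed.
    """
    out = []
    start = 0
    for block in blocks:
        size = len(block)
        chunk = vector[start:start + size]
        out.extend(sum(row[k] * chunk[k] for k in range(size)) for row in block)
        start += size
    return out

def rotation_block(theta):
    """A 2-by-2 rotation, used here as a hypothetical orthogonal block."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def cheap_rotation(left_order, blocks, right_order, vector):
    """Compute P_left * B * P_right * x without building any dense matrix."""
    return permute(apply_block_diagonal(blocks, permute(vector, right_order)), left_order)

def norm(vector):
    return math.sqrt(sum(v * v for v in vector))

# Hypothetical 6-dimensional example with three 2-by-2 blocks.
blocks = [rotation_block(0.3), rotation_block(1.1), rotation_block(-0.7)]
left_order = [2, 0, 5, 1, 4, 3]
right_order = [3, 5, 0, 4, 1, 2]
x = [1.0, -2.0, 0.5, 3.0, -1.5, 2.5]

y = cheap_rotation(left_order, blocks, right_order, x)
print(norm(x), norm(y))
```

Because every factor is orthogonal, the transformed vector has the same Euclidean norm as the input, one of the properties a full rotation would also have.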

From low probability to high confidence in stochastic convex optimization


Title	From low probability to high confidence in stochastic convex optimization
Authors	Damek Davis, Dmitriy Drusvyatskiy, Lin Xiao, Junyu Zhang
Abstract	Standard results in stochastic convex optimization bound the number of samples that an algorithm needs to generate a point with small function value in expectation. More nuanced high probability guarantees are rare, and typically either rely on “light-tail” noise assumptions or exhibit worse sample complexity. In this work, we show that a wide class of stochastic optimization algorithms for strongly convex problems can be augmented with high confidence bounds at an overhead cost that is only logarithmic in the confidence level and polylogarithmic in the condition number. The procedure we propose, called proxBoost, is elementary and builds on two well-known ingredients: robust distance estimation and the proximal point method. We discuss consequences for both streaming (online) algorithms and offline algorithms based on empirical risk minimization.
Tasks	Stochastic Optimization
Published	2019-07-31
URL	https://arxiv.org/abs/1907.13307v3
PDF	https://arxiv.org/pdf/1907.13307v3.pdf
PWC	https://paperswithcode.com/paper/robust-stochastic-optimization-with-the
Repo
Framework
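
The abstract above names robust distance estimation as one of the two ingredients of proxBoost without describing it. The sketch below shows the technique as it is commonly stated: given several independent candidate solutions, most of which are assumed to be close to the true minimizer, return the candidate whose smallest ball containing more than half of all candidates has the least radius. This is an illustration of that selection rule only, not of the full proxBoost procedure, and the candidate points are made up.

```python
import math

def robust_distance_estimate(points):
    """Pick the candidate with the tightest ball covering a strict majority.

    For each candidate, the radius is the distance to its k-th nearest
    candidate (itself included), where k is the smallest integer strictly
    greater than half the number of candidates.
    """
    needed = len(points) // 2 + 1
    best_point, best_radius = None, math.inf
    for p in points:
        distances = sorted(math.dist(p, q) for q in points)
        radius = distances[needed - 1]
        if radius < best_radius:
            best_point, best_radius = p, radius
    return best_point, best_radius

# Hypothetical candidates: five runs landed near (1, 1), two runs failed badly.
candidates = [
    (1.0, 1.0), (1.1, 0.9), (0.9, 1.05), (1.05, 1.0),
    (9.0, -7.0), (-6.0, 8.0), (1.0, 0.95),
]

chosen, radius = robust_distance_estimate(candidates)
print(chosen, radius)
```

A plain average of the candidates would be dragged far away by the two failed runs, while the selected point stays inside the majority cluster.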

Physics-Informed Probabilistic Learning of Linear Embeddings of Non-linear Dynamics With Guaranteed Stability


Title	Physics-Informed Probabilistic Learning of Linear Embeddings of Non-linear Dynamics With Guaranteed Stability
Authors	Shaowu Pan, Karthik Duraisamy
Abstract	The Koopman operator has emerged as a powerful tool for the analysis of nonlinear dynamical systems as it provides coordinate transformations to globally linearize the dynamics. While recent deep learning approaches have been useful in extracting the Koopman operator from a data-driven perspective, several challenges remain. In this work, we formalize the problem of learning the continuous-time Koopman operator with deep neural networks in a measure-theoretic framework. Our approach induces two types of models: differential and recurrent form, the choice of which depends on the availability of the governing equations and data. We then enforce a structural parameterization that renders the realization of the Koopman operator provably stable. A new autoencoder architecture is constructed, such that only the residual of the dynamic mode decomposition is learned. Finally, we employ mean-field variational inference (MFVI) on the aforementioned framework in a hierarchical Bayesian setting to quantify uncertainties in the characterization and prediction of the dynamics of observables. The framework is evaluated on a simple polynomial system, the Duffing oscillator, and an unstable cylinder wake flow with noisy measurements.
Tasks
Published	2019-06-09
URL	https://arxiv.org/abs/1906.03663v4
PDF	https://arxiv.org/pdf/1906.03663v4.pdf
PWC	https://paperswithcode.com/paper/physics-informed-probabilistic-learning-of
Repo
Framework
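
The abstract above mentions a structural parameterization that makes the learned linear operator provably stable but does not give its form. One generic way to obtain such a guarantee, shown here purely as an assumption-labeled illustration and not necessarily the construction used in the paper, is to write the matrix as a skew-symmetric part minus a non-negative diagonal. For dx/dt = A x the quantity x^T A x is then never positive, so the norm of x cannot grow.

```python
def stable_matrix(skew_params, damping_params):
    """Build A = S - D with S skew-symmetric and D = diag(damping^2) >= 0.

    skew_params fills the strict upper triangle row by row; the lower
    triangle receives the negated values. Both argument names are
    hypothetical. Any real values are allowed, so the matrix can be trained
    without constraints while remaining stable by construction.
    """
    n = len(damping_params)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = -damping_params[i] ** 2
    index = 0
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = skew_params[index]
            a[j][i] = -skew_params[index]
            index += 1
    return a

def energy_rate(a, x):
    """Return x^T A x, which equals half of d/dt ||x||^2 along dx/dt = A x."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))

# Hypothetical unconstrained parameters for a 3-dimensional latent state.
A = stable_matrix(skew_params=[2.0, -1.5, 0.7], damping_params=[0.3, -0.8, 1.2])

test_states = [[1.0, 0.0, 0.0], [0.5, -2.0, 1.0], [-3.0, 1.5, 4.0]]
rates = [energy_rate(A, x) for x in test_states]
print(rates)
```

The skew-symmetric part contributes nothing to x^T A x, so each printed rate reduces to the negative damping term.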

MVX-Net: Multimodal VoxelNet for 3D Object Detection


Title	MVX-Net: Multimodal VoxelNet for 3D Object Detection
Authors	Vishwanath A. Sindagi, Yin Zhou, Oncel Tuzel
Abstract	Many recent works on 3D object detection have focused on designing neural network architectures that can consume point cloud data. While these approaches demonstrate encouraging performance, they are typically based on a single modality and are unable to leverage information from other modalities, such as a camera. Although a few approaches fuse data from different modalities, these methods either use a complicated pipeline to process the modalities sequentially, or perform late-fusion and are unable to learn interaction between different modalities at early stages. In this work, we present PointFusion and VoxelFusion: two simple yet effective early-fusion approaches to combine the RGB and point cloud modalities, by leveraging the recently introduced VoxelNet architecture. Evaluation on the KITTI dataset demonstrates significant improvements in performance over approaches which only use point cloud data. Furthermore, the proposed method provides results competitive with the state-of-the-art multimodal algorithms, achieving top-2 ranking in five of the six bird’s eye view and 3D detection categories on the KITTI benchmark, by using a simple single stage network.
Tasks	3D Object Detection, Object Detection
Published	2019-04-02
URL	http://arxiv.org/abs/1904.01649v1
PDF	http://arxiv.org/pdf/1904.01649v1.pdf
PWC	https://paperswithcode.com/paper/mvx-net-multimodal-voxelnet-for-3d-object
Repo
Framework
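
The early-fusion idea in the abstract above, attaching image information to individual 3D points, can be sketched in a few lines. The example below is a simplified illustration rather than the MVX-Net code: it assumes the points are already expressed in camera coordinates, uses a hypothetical pinhole camera with a single focal length, and reads features from a small made-up feature map by nearest-pixel lookup. A real pipeline would use the dataset's calibration matrices and a CNN feature map.

```python
def project_point(point, focal, cx, cy):
    """Pinhole projection of a camera-frame point (x, y, z) to pixel coordinates (u, v)."""
    x, y, z = point
    return focal * x / z + cx, focal * y / z + cy

def point_fusion(points, feature_map, focal, cx, cy):
    """Append the image feature under each projected point to that point.

    Points behind the camera or projecting outside the feature map are
    dropped. feature_map is indexed as feature_map[row][col].
    """
    height, width = len(feature_map), len(feature_map[0])
    fused = []
    for point in points:
        if point[2] <= 0:
            continue  # behind the camera, cannot be seen in the image
        u, v = project_point(point, focal, cx, cy)
        col, row = int(round(u)), int(round(v))
        if 0 <= row < height and 0 <= col < width:
            fused.append(list(point) + list(feature_map[row][col]))
    return fused

# Hypothetical 4x4 feature map whose feature at (row, col) is simply [row, col].
feature_map = [[[float(r), float(c)] for c in range(4)] for r in range(4)]

points = [
    (1.0, 0.0, 2.0),   # projects to u=3, v=1, inside the map
    (0.0, 0.0, -1.0),  # behind the camera
    (10.0, 0.0, 1.0),  # projects outside the map
]

fused_points = point_fusion(points, feature_map, focal=2.0, cx=2.0, cy=1.0)
print(fused_points)
```

Each surviving point now carries its 3D coordinates followed by the image feature, so a downstream point-cloud network can use both modalities from its first layer.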