Paper Group ANR 28
Photorealistic Image Reconstruction from Hybrid Intensity and Event based Sensor. Sharp Analysis of Learning with Discrete Losses. Algebraic Expression of Subjective Spatial and Temporal Patterns. The Enemy Among Us: Detecting Hate Speech with Threats Based ‘Othering’ Language Embeddings. Advanced Methods for the Optical Quality Assurance of Silicon Sensors …
Photorealistic Image Reconstruction from Hybrid Intensity and Event based Sensor
Title | Photorealistic Image Reconstruction from Hybrid Intensity and Event based Sensor |
Authors | Prasan A Shedligeri, Kaushik Mitra |
Abstract | Event sensors output a stream of asynchronous brightness changes (called “events”) at a very high temporal rate. Previous works on recovering the lost intensity information from the event sensor data have heavily relied on the event stream, which makes the reconstructed images non-photorealistic and also susceptible to noise in the event stream. We propose to reconstruct photorealistic intensity images from a hybrid sensor consisting of a low-frame-rate conventional camera, which has the scene texture information, along with the event sensor. To accomplish our task, we warp the low-frame-rate intensity images to temporally dense locations of the event data by estimating a spatially dense scene depth and temporally dense sensor ego-motion. The results obtained from our algorithm are more photorealistic than those of any of the previous state-of-the-art algorithms. We also demonstrate our algorithm’s robustness to abrupt camera motion and noise in the event sensor data. |
Tasks | Image Reconstruction |
Published | 2018-05-16 |
URL | http://arxiv.org/abs/1805.06140v4 |
http://arxiv.org/pdf/1805.06140v4.pdf | |
PWC | https://paperswithcode.com/paper/photorealistic-image-reconstruction-from |
Repo | |
Framework | |
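To make the warping step described in the abstract above concrete, here is a minimal sketch (not the authors' code) of forward-warping one low-frame-rate frame to an intermediate event timestamp under a pinhole model. The intrinsics `K`, the dense `depth` map, and a purely translational ego-motion `t_motion` are illustrative assumptions.

```python
import numpy as np

def warp_to_timestamp(frame, depth, K, t_motion, alpha):
    """Forward-warp `frame` by fraction `alpha` of the inter-frame ego-motion."""
    H, W = frame.shape
    K_inv = np.linalg.inv(K)
    ys, xs = np.mgrid[0:H, 0:W]
    pix = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)])  # homogeneous pixels
    pts = (K_inv @ pix) * depth.ravel()                       # back-project to 3-D
    pts = pts + alpha * t_motion[:, None]                     # apply partial ego-motion
    proj = K @ pts
    u, v = proj[0] / proj[2], proj[1] / proj[2]               # re-project to the new view
    warped = np.zeros_like(frame)
    u, v = np.round(u).astype(int), np.round(v).astype(int)
    ok = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    warped[v[ok], u[ok]] = frame.ravel()[ok]                  # nearest-neighbour splat
    return warped
```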
Sharp Analysis of Learning with Discrete Losses
Title | Sharp Analysis of Learning with Discrete Losses |
Authors | Alex Nowak-Vila, Francis Bach, Alessandro Rudi |
Abstract | The problem of devising learning strategies for discrete losses (e.g., multilabeling, ranking) is currently addressed with methods and theoretical analyses that are ad hoc for each loss. In this paper we study a least-squares framework to systematically design learning algorithms for discrete losses, with quantitative characterizations in terms of statistical and computational complexity. In particular, we improve existing results by providing explicit dependence on the number of labels for a wide class of losses and faster learning rates in low-noise conditions. The theoretical results are complemented with experiments on real datasets, showing the effectiveness of the proposed general approach. |
Tasks | |
Published | 2018-10-16 |
URL | http://arxiv.org/abs/1810.06839v1 |
http://arxiv.org/pdf/1810.06839v1.pdf | |
PWC | https://paperswithcode.com/paper/sharp-analysis-of-learning-with-discrete |
Repo | |
Framework | |
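A hedged sketch of the generic least-squares recipe for discrete losses that the abstract alludes to: write the loss as a matrix `L[z, y]` over a finite prediction set and label set, ridge-regress the one-hot label encoding on the features, and decode by minimizing the estimated expected loss. All names and the regularizer are illustrative assumptions.

```python
import numpy as np

def fit_and_decode(X_train, y_train, X_test, L, n_labels, lam=1e-3):
    Phi = np.eye(n_labels)[y_train]                     # one-hot label embedding
    A = X_train.T @ X_train + lam * np.eye(X_train.shape[1])
    W = np.linalg.solve(A, X_train.T @ Phi)             # least-squares weights
    g_hat = X_test @ W                                  # estimate of E[phi(y) | x]
    scores = g_hat @ L.T                                # expected loss of each prediction z
    return scores.argmin(axis=1)                        # loss-calibrated decoding
```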
Algebraic Expression of Subjective Spatial and Temporal Patterns
Title | Algebraic Expression of Subjective Spatial and Temporal Patterns |
Authors | Chuyu Xiong |
Abstract | Universal learning machine is a theory that studies machine learning from a mathematical point of view. The outside world is reflected inside a universal learning machine according to the pattern of the incoming data; this is the subjective pattern of the learning machine. In [2,4] we discussed subjective spatial patterns and established a powerful tool, the X-form, which is an algebraic expression for subjective spatial patterns. However, at that initial stage of the study, we only discussed spatial patterns. Here, we discuss spatial and temporal patterns together, and algebraic expressions for them. |
Tasks | |
Published | 2018-05-26 |
URL | http://arxiv.org/abs/1805.11959v2 |
http://arxiv.org/pdf/1805.11959v2.pdf | |
PWC | https://paperswithcode.com/paper/algebraic-expression-of-subjective-spatial |
Repo | |
Framework | |
The Enemy Among Us: Detecting Hate Speech with Threats Based ‘Othering’ Language Embeddings
Title | The Enemy Among Us: Detecting Hate Speech with Threats Based ‘Othering’ Language Embeddings |
Authors | Wafa Alorainy, Pete Burnap, Han Liu, Matthew Williams |
Abstract | Offensive or antagonistic language targeted at individuals and social groups based on their personal characteristics (also known as cyber hate speech or cyberhate) has been frequently posted and widely circulated via the World Wide Web. This can be considered a key risk factor for individual and societal tension linked to regional instability. Automated Web-based cyberhate detection is important for observing and understanding community and regional societal tension, especially in online social networks where posts can be rapidly and widely viewed and disseminated. While previous work has involved using lexicons, bags-of-words or probabilistic language parsing approaches, they often suffer from a similar issue: cyberhate can be subtle and indirect, so depending on the occurrence of individual words or phrases can lead to a significant number of false negatives, providing an inaccurate representation of the trends in cyberhate. This problem motivated us to challenge thinking around the representation of subtle language use, such as references to perceived threats from “the other”, including immigration or job prosperity in a hateful context. We propose a novel framework that utilises language use around the concept of “othering” and intergroup threat theory to identify these subtleties, and we implement a novel classification method using embedding learning to compute semantic distances between parts of speech considered to be part of an “othering” narrative. To validate our approach we conduct several experiments on different types of cyberhate, namely religion, disability, race and sexual orientation, obtaining F-measure scores for classifying hateful instances of 0.93, 0.86, 0.97 and 0.98 respectively, a significant improvement in classifier accuracy over the state-of-the-art. |
Tasks | |
Published | 2018-01-23 |
URL | http://arxiv.org/abs/1801.07495v3 |
http://arxiv.org/pdf/1801.07495v3.pdf | |
PWC | https://paperswithcode.com/paper/the-enemy-among-us-detecting-hate-speech-with |
Repo | |
Framework | |
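Purely as an illustration of the embedding-distance idea in the abstract above (not the authors' pipeline), the sketch below scores how much closer threat terms sit to "them" pronouns than to "us" pronouns in any pretrained embedding table `emb` (word → vector). The word lists are assumptions.

```python
import numpy as np

US, THEM = {"we", "us", "our"}, {"they", "them", "their"}
THREAT = {"invade", "threat", "take", "jobs"}  # illustrative threat lexicon

def group_mean(tokens, group, emb):
    vecs = [emb[t] for t in tokens if t in group and t in emb]
    return np.mean(vecs, axis=0) if vecs else None

def othering_feature(tokens, emb):
    """Semantic proximity of threat terms to 'them' relative to 'us'."""
    us, them, threat = (group_mean(tokens, g, emb) for g in (US, THEM, THREAT))
    if them is None or threat is None:
        return 0.0
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    return cos(them, threat) - (cos(us, threat) if us is not None else 0.0)
```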
Advanced Methods for the Optical Quality Assurance of Silicon Sensors
Title | Advanced Methods for the Optical Quality Assurance of Silicon Sensors |
Authors | E. Lavrik, I. Panasenko, H. R. Schmidt |
Abstract | We describe a setup for optical quality assurance of silicon microstrip sensors. Pattern recognition algorithms were developed to analyze microscopic scans of the sensors for defects. It is shown that the software has a recognition and classification rate of > 90% for defects like scratches, shorts, broken metal lines, etc. We have demonstrated that advanced image processing based on neural network techniques is able to further improve the recognition and defect classification rate. |
Tasks | |
Published | 2018-06-30 |
URL | http://arxiv.org/abs/1807.00211v2 |
http://arxiv.org/pdf/1807.00211v2.pdf | |
PWC | https://paperswithcode.com/paper/advanced-methods-for-the-optical-quality |
Repo | |
Framework | |
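As a minimal sketch of the kind of classical pattern recognition the abstract describes, the snippet below flags scratch-like defects on a grayscale microscope scan as long, straight edge segments. The thresholds are illustrative assumptions, not the paper's values.

```python
import cv2
import numpy as np

def find_scratch_candidates(gray_scan):
    edges = cv2.Canny(gray_scan, 50, 150)                      # edge map of the scan
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                            threshold=80, minLineLength=100, maxLineGap=5)
    return [] if lines is None else [l[0] for l in lines]      # (x1, y1, x2, y2) per candidate
```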
The Dreaming Variational Autoencoder for Reinforcement Learning Environments
Title | The Dreaming Variational Autoencoder for Reinforcement Learning Environments |
Authors | Per-Arne Andersen, Morten Goodwin, Ole-Christoffer Granmo |
Abstract | Reinforcement learning has shown great potential in generalizing over raw sensory data using only a single neural network for value optimization. Several challenges in the current state-of-the-art reinforcement learning algorithms prevent them from converging towards the global optima. It is likely that the solution to these problems lies in short- and long-term planning, exploration, and memory management for reinforcement learning algorithms. Games are often used to benchmark reinforcement learning algorithms, as they provide a flexible, reproducible, and easy-to-control environment. Regardless, few games feature a state-space in which results in exploration, memory, and planning are easily perceived. This paper presents the Dreaming Variational Autoencoder (DVAE), a neural-network-based generative modeling architecture for exploration in environments with sparse feedback. We further present Deep Maze, a novel and flexible maze engine that challenges DVAE with partially and fully observable state-spaces, long-horizon tasks, and deterministic and stochastic problems. We show initial findings and encourage further work in reinforcement learning driven by generative exploration. |
Tasks | |
Published | 2018-10-02 |
URL | http://arxiv.org/abs/1810.01112v1 |
http://arxiv.org/pdf/1810.01112v1.pdf | |
PWC | https://paperswithcode.com/paper/the-dreaming-variational-autoencoder-for |
Repo | |
Framework | |
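A hedged sketch of the "dreaming" idea: a VAE-style model that encodes the current state and action and decodes a predicted next state, so an agent can explore in imagined rollouts. The layer sizes and single-layer encoder/decoder are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class DreamModel(nn.Module):
    def __init__(self, state_dim, action_dim, latent_dim=32):
        super().__init__()
        self.enc = nn.Linear(state_dim + action_dim, 2 * latent_dim)
        self.dec = nn.Linear(latent_dim, state_dim)

    def forward(self, state, action):
        h = self.enc(torch.cat([state, action], dim=-1))
        mu, logvar = h.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return self.dec(z), mu, logvar                        # predicted next state + VAE stats
```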
Variance-based Gradient Compression for Efficient Distributed Deep Learning
Title | Variance-based Gradient Compression for Efficient Distributed Deep Learning |
Authors | Yusuke Tsuzuku, Hiroto Imachi, Takuya Akiba |
Abstract | Due to the substantial computational cost, training state-of-the-art deep neural networks on large-scale datasets often requires distributed training using multiple computation workers. However, by nature, workers need to frequently communicate gradients, causing severe bottlenecks, especially on lower-bandwidth connections. A few methods have been proposed to compress gradients for efficient communication, but they either suffer from a low compression ratio or significantly harm the resulting model accuracy, particularly when applied to convolutional neural networks. To address these issues, we propose a method to reduce the communication overhead of distributed deep learning. Our key observation is that gradient updates can be delayed until an unambiguous (high-amplitude, low-variance) gradient has been calculated. We also present an efficient algorithm to compute the variance at negligible additional cost. We experimentally show that our method can achieve a very high compression ratio while maintaining the resulting model accuracy. We also analyze the efficiency using computation and communication cost models and provide evidence that this method enables distributed deep learning in many scenarios with commodity environments. |
Tasks | |
Published | 2018-02-16 |
URL | http://arxiv.org/abs/1802.06058v2 |
http://arxiv.org/pdf/1802.06058v2.pdf | |
PWC | https://paperswithcode.com/paper/variance-based-gradient-compression-for |
Repo | |
Framework | |
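A toy version of the key observation above: over a window of recent gradients, transmit only the coordinates whose mean is large relative to its standard error (high amplitude, low variance) and keep the rest as a local residual for later steps. The threshold and windowing scheme are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def compress(grad_window, residual, thresh=2.0):
    """grad_window: (steps, params) array of recent gradients; residual: (params,)."""
    g = grad_window.mean(axis=0) + residual
    se = grad_window.std(axis=0) / np.sqrt(len(grad_window)) + 1e-12
    send = np.abs(g) / se > thresh          # unambiguous coordinates only
    message = np.where(send, g, 0.0)        # sparse update to communicate
    residual = np.where(send, 0.0, g)       # delay ambiguous coordinates locally
    return message, residual
```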
Multi-shot Person Re-identification through Set Distance with Visual Distributional Representation
Title | Multi-shot Person Re-identification through Set Distance with Visual Distributional Representation |
Authors | Ting-Yao Hu, Xiaojun Chang, Alexander G. Hauptmann |
Abstract | Person re-identification aims to identify a specific person at distinct times and locations. It is challenging because of occlusion, illumination, and viewpoint changes across camera views. Recently, the multi-shot person re-id task has received more attention, since it is closer to real-world applications. A key component of a good algorithm for multi-shot person re-id is the temporal aggregation of person appearance features. While most current approaches apply pooling strategies to obtain a fixed-size vector representation, these may lose the matching evidence between examples. In this work, we propose the idea of visual distributional representation, which interprets an image set as samples drawn from an unknown distribution in appearance feature space. Based on the supervision signals from a downstream task of interest, the method reshapes the appearance feature space and further learns the unknown distribution of each image set. In the context of multi-shot person re-id, we apply this novel concept along with the Wasserstein distance and learn a distributional set distance function between two image sets. In this way, the proper alignment between two image sets can be discovered naturally in a non-parametric manner. Our experimental results on two public datasets show the advantages of our proposed method compared to other state-of-the-art approaches. |
Tasks | Person Re-Identification |
Published | 2018-08-03 |
URL | http://arxiv.org/abs/1808.01119v2 |
http://arxiv.org/pdf/1808.01119v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-shot-person-re-identification-through |
Repo | |
Framework | |
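To make the distributional set distance concrete: treating each image set as an empirical distribution with uniform weights, the Wasserstein distance between two equal-sized feature sets reduces to an assignment problem. The sketch below uses that reduction; the learned feature space and the end-to-end training from the paper are omitted.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def set_distance(feats_a, feats_b):
    """Optimal-transport distance between two (n, d) appearance feature sets."""
    cost = np.linalg.norm(feats_a[:, None, :] - feats_b[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)   # optimal one-to-one alignment
    return cost[rows, cols].mean()
```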
Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects
Title | Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects |
Authors | Yusuke Niitani, Takuya Akiba, Tommi Kerola, Toru Ogawa, Shotaro Sano, Shuji Suzuki |
Abstract | Efficient and reliable methods for training object detectors are in higher demand than ever, and more and more data relevant to the field is becoming available. However, large datasets like Open Images Dataset v4 (OID) are sparsely annotated, and some measures must be taken to ensure the training of a reliable detector. To take the incompleteness of these datasets into account, one possibility is to use pretrained models to detect the presence of the unverified objects. However, the performance of such a strategy depends largely on the power of the pretrained model. In this study, we propose part-aware sampling, a method that uses human intuition about the hierarchical relations between objects. In terse terms, our method works by making assumptions like “a bounding box for a car should contain a bounding box for a tire”. We demonstrate the power of our method on OID and compare its performance against a method based on a pretrained model. Our method also won first and second place on the public and private test sets of the Google AI Open Images Competition 2018. |
Tasks | Object Detection |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.10862v2 |
http://arxiv.org/pdf/1811.10862v2.pdf | |
PWC | https://paperswithcode.com/paper/sampling-techniques-for-large-scale-object |
Repo | |
Framework | |
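A minimal sketch of the part-aware rule the abstract states: inside a verified "whole" box (e.g. a car), detector predictions for plausible parts (e.g. a tire) are not penalized as false positives, since the part annotation may simply be missing. The hierarchy table is an illustrative assumption.

```python
PART_OF = {"car": {"tire", "wheel", "door"}, "person": {"face", "hand"}}  # assumed hierarchy

def contains(outer, inner):
    ox1, oy1, ox2, oy2 = outer
    ix1, iy1, ix2, iy2 = inner
    return ox1 <= ix1 and oy1 <= iy1 and ix2 <= ox2 and iy2 <= oy2

def ignore_in_loss(pred_label, pred_box, gt_label, gt_box):
    """True if the prediction is a plausible unannotated part of a ground-truth object."""
    return pred_label in PART_OF.get(gt_label, set()) and contains(gt_box, pred_box)
```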
Decompose to manipulate: Manipulable Object Synthesis in 3D Medical Images with Structured Image Decomposition
Title | Decompose to manipulate: Manipulable Object Synthesis in 3D Medical Images with Structured Image Decomposition |
Authors | Siqi Liu, Eli Gibson, Sasa Grbic, Zhoubing Xu, Arnaud Arindra Adiyoso Setio, Jie Yang, Bogdan Georgescu, Dorin Comaniciu |
Abstract | The performance of medical image analysis systems is constrained by the quantity of high-quality image annotations. Such systems require data to be annotated by experts with years of training, especially when diagnostic decisions are involved. Such datasets are thus hard to scale up. In this context, it is hard for supervised learning systems to generalize to cases that are rare in the training set but would be present in real-world clinical practice. We believe that synthetic image samples generated by a system trained on the real data can be useful for improving supervised learning tasks in medical image analysis applications. Allowing the image synthesis to be manipulable could help synthetic images provide complementary information to the training data rather than simply duplicating the real-data manifold. In this paper, we propose a framework for synthesizing 3D objects, such as pulmonary nodules, in 3D medical images with manipulable properties. The manipulation is enabled by decomposing the object of interest into its segmentation mask and a 1D vector containing the residual information. The synthetic object is refined and blended into the image context with two adversarial discriminators. We evaluate the proposed framework on lung nodules in 3D chest CT images and show that it can generate realistic nodules with manipulable shapes, textures, locations, etc. By sampling from both the synthetic nodules and the real nodules from 2800 3D CT volumes during classifier training, we show that the synthetic patches improve the overall nodule detection performance by an average of 8.44% in the competition performance metric (CPM) score. |
Tasks | Image Generation |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01737v2 |
http://arxiv.org/pdf/1812.01737v2.pdf | |
PWC | https://paperswithcode.com/paper/decompose-to-manipulate-manipulable-object |
Repo | |
Framework | |
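A minimal sketch of the decomposition described above: a synthetic object is defined by a segmentation mask plus a residual appearance code and is pasted into the surrounding image context. `decoder` is a stand-in for the paper's generator, and the adversarial refinement by the two discriminators is omitted.

```python
import torch

def blend(context_patch, mask, residual_code, decoder):
    """Composite a decoded synthetic object into a real image patch via its mask."""
    synthetic = decoder(mask, residual_code)                 # render object appearance
    return mask * synthetic + (1.0 - mask) * context_patch   # paste into the context
```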
Deep Learning for Decoding of Linear Codes - A Syndrome-Based Approach
Title | Deep Learning for Decoding of Linear Codes - A Syndrome-Based Approach |
Authors | Amir Bennatan, Yoni Choukroun, Pavel Kisilev |
Abstract | We present a novel framework for applying deep neural networks (DNN) to soft decoding of linear codes at arbitrary block lengths. Unlike other approaches, our framework allows unconstrained DNN design, enabling the free application of powerful designs that were developed in other contexts. Our method is robust to the overfitting that inhibits many competing methods, which stems from the exponentially large number of codewords required for their training. We achieve this by transforming the channel output before feeding it to the network, extracting only the syndrome of the hard decisions and the channel output reliabilities. We prove analytically that this approach does not involve any intrinsic performance penalty and guarantees the generalization of the performance obtained during training. Our best results are obtained using a recurrent neural network (RNN) architecture combined with simple preprocessing by permutation. We provide simulation results that demonstrate performance that sometimes approaches that of the ordered statistics decoding (OSD) algorithm. |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04741v1 |
http://arxiv.org/pdf/1802.04741v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-decoding-of-linear-codes-a |
Repo | |
Framework | |
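The syndrome-based preprocessing lends itself to a short sketch: instead of feeding raw channel outputs to the network, feed the syndrome of the hard decisions (which depends only on the noise, not the transmitted codeword) together with the reliabilities. `H` is the code's parity-check matrix; the trained network itself is not shown.

```python
import numpy as np

def decoder_input(channel_llr, H):
    """Build the codeword-independent network input from channel LLRs."""
    hard = (channel_llr < 0).astype(np.uint8)       # hard bit decisions
    syndrome = H.dot(hard) % 2                      # function of the noise only
    reliability = np.abs(channel_llr)               # soft confidence per bit
    return np.concatenate([syndrome, reliability])  # input vector for the DNN
```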
Constrained Convolutional-Recurrent Networks to Improve Speech Quality with Low Impact on Recognition Accuracy
Title | Constrained Convolutional-Recurrent Networks to Improve Speech Quality with Low Impact on Recognition Accuracy |
Authors | Rasool Fakoor, Xiaodong He, Ivan Tashev, Shuayb Zarar |
Abstract | For a speech-enhancement algorithm, it is highly desirable to simultaneously improve perceptual quality and recognition rate. Due to computational costs and model complexities, it is challenging to train a model that effectively optimizes both metrics at the same time. In this paper, we propose a method for speech enhancement that combines local and global contextual structure information through convolutional-recurrent neural networks to improve perceptual quality. At the same time, we introduce a new constraint on the objective function using a language model/decoder that limits the impact on recognition rate. Based on experiments conducted with real user data, we demonstrate that our new context-augmented machine-learning approach for speech enhancement improves PESQ and WER by an additional 24.5% and 51.3%, respectively, when compared to the best-performing methods in the literature. |
Tasks | Language Modelling, Speech Enhancement |
Published | 2018-02-16 |
URL | http://arxiv.org/abs/1802.05874v1 |
http://arxiv.org/pdf/1802.05874v1.pdf | |
PWC | https://paperswithcode.com/paper/constrained-convolutional-recurrent-networks |
Repo | |
Framework | |
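A hedged sketch of the constrained objective: an enhancement loss plus a penalty that keeps the enhanced signal's recognizer features close to those of the clean signal. `recognizer` is a frozen stand-in for the language-model/decoder constraint in the abstract, and the weight `lam` is an assumption.

```python
import torch

def constrained_loss(enhanced, clean, recognizer, lam=0.1):
    quality = torch.mean((enhanced - clean) ** 2)             # enhancement term
    with_enh, with_clean = recognizer(enhanced), recognizer(clean)
    recognition = torch.mean((with_enh - with_clean) ** 2)    # recognition-drift penalty
    return quality + lam * recognition
```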
Learning How to Self-Learn: Enhancing Self-Training Using Neural Reinforcement Learning
Title | Learning How to Self-Learn: Enhancing Self-Training Using Neural Reinforcement Learning |
Authors | Chenhua Chen, Yue Zhang |
Abstract | Self-training is a useful strategy for semi-supervised learning, leveraging raw texts to enhance model performance. Traditional self-training methods depend on heuristics such as model confidence for instance selection, the manual adjustment of which can be expensive. To address these challenges, we propose a deep reinforcement learning method to learn the self-training strategy automatically. Based on neural-network representations of sentences, our model automatically learns an optimal policy for instance selection. Experimental results show that our approach outperforms the baseline solutions in terms of better tagging performance and stability. |
Tasks | |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05734v1 |
http://arxiv.org/pdf/1804.05734v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-how-to-self-learn-enhancing-self |
Repo | |
Framework | |
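A toy sketch of RL-driven self-training: a policy scores unlabeled instances from simple features (here just model confidence) and receives as reward the change in development-set accuracy after retraining. Every callable here (`policy.select_probs`, `model.confidence`, `retrain`, `dev_eval`) is an illustrative stand-in, not the paper's API.

```python
import numpy as np

def self_training_step(policy, model, unlabeled, dev_acc_before, retrain, dev_eval):
    feats = np.array([model.confidence(x) for x in unlabeled])[:, None]
    probs = policy.select_probs(feats)                   # P(select) per instance
    chosen = [x for x, p in zip(unlabeled, probs) if np.random.rand() < p]
    model = retrain(model, chosen)                       # add pseudo-labeled data
    reward = dev_eval(model) - dev_acc_before            # policy-gradient reward signal
    return model, reward
```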
PPD: Permutation Phase Defense Against Adversarial Examples in Deep Learning
Title | PPD: Permutation Phase Defense Against Adversarial Examples in Deep Learning |
Authors | Mehdi Jafarnia-Jahromi, Tasmin Chowdhury, Hsin-Tai Wu, Sayandev Mukherjee |
Abstract | Deep neural networks have demonstrated cutting-edge performance on various tasks including classification. However, it is well known that adversarially designed, imperceptible perturbations of the input can mislead advanced classifiers. In this paper, Permutation Phase Defense (PPD) is proposed as a novel method to resist adversarial attacks. PPD combines a random permutation of the image with the phase component of its Fourier transform. The basic idea behind this approach is to turn the adversarial defense problem into an analogue of symmetric cryptography, which relies solely on safekeeping of the keys for security. In PPD, safekeeping of the selected permutation ensures effectiveness against adversarial attacks. Testing PPD on the MNIST and CIFAR-10 datasets yielded state-of-the-art robustness against the most powerful adversarial attacks currently available. |
Tasks | Adversarial Defense |
Published | 2018-12-25 |
URL | https://arxiv.org/abs/1812.10049v2 |
https://arxiv.org/pdf/1812.10049v2.pdf | |
PWC | https://paperswithcode.com/paper/ppd-permutation-phase-defense-against |
Repo | |
Framework | |
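The two keyed transforms in the abstract are simple enough to sketch directly: permute the pixels with a secret seed, then keep only the phase of the 2-D Fourier transform; the classifier is then trained on this representation. Using a seed as the "key" is an illustrative simplification.

```python
import numpy as np

def ppd_transform(image, seed):
    """Secret permutation followed by Fourier phase extraction."""
    rng = np.random.RandomState(seed)                 # the secret key
    flat = image.ravel()
    permuted = flat[rng.permutation(flat.size)].reshape(image.shape)
    return np.angle(np.fft.fft2(permuted))            # phase component only
```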
What Goes Where: Predicting Object Distributions from Above
Title | What Goes Where: Predicting Object Distributions from Above |
Authors | Connor Greenwell, Scott Workman, Nathan Jacobs |
Abstract | In this work, we propose a cross-view learning approach, in which images captured from a ground-level view are used as weakly supervised annotations for interpreting overhead imagery. The outcome is a convolutional neural network for overhead imagery that is capable of predicting the type and count of objects that are likely to be seen from a ground-level perspective. We demonstrate our approach on a large dataset of geotagged ground-level and overhead imagery and find that our network captures semantically meaningful features, despite being trained without manual annotations. |
Tasks | |
Published | 2018-08-02 |
URL | http://arxiv.org/abs/1808.00995v1 |
http://arxiv.org/pdf/1808.00995v1.pdf | |
PWC | https://paperswithcode.com/paper/what-goes-where-predicting-object |
Repo | |
Framework | |
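A hedged sketch of the cross-view supervision described above: the overhead-image network predicts a distribution over object categories and is trained to match the weak label distribution derived from co-located ground-level photos. The KL-divergence objective and the count-normalization are assumptions about one reasonable instantiation.

```python
import torch
import torch.nn.functional as F

def cross_view_loss(overhead_logits, ground_level_counts):
    """Match predicted category distribution to ground-level object counts."""
    target = ground_level_counts / ground_level_counts.sum(dim=-1, keepdim=True)
    return F.kl_div(F.log_softmax(overhead_logits, dim=-1), target,
                    reduction="batchmean")
```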