Paper Group ANR 28
Photorealistic Image Reconstruction from Hybrid Intensity and Event based Sensor. Sharp Analysis of Learning with Discrete Losses. Algebraic Expression of Subjective Spatial and Temporal Patterns. The Enemy Among Us: Detecting Hate Speech with Threats Based ‘Othering’ Language Embeddings. Advanced Methods for the Optical Quality Assurance of Silicon Sensors …
Photorealistic Image Reconstruction from Hybrid Intensity and Event based Sensor
Title | Photorealistic Image Reconstruction from Hybrid Intensity and Event based Sensor |
Authors | Prasan A Shedligeri, Kaushik Mitra |
Abstract | Event sensors output a stream of asynchronous brightness changes (called “events”) at a very high temporal rate. Previous works on recovering the lost intensity information from the event sensor data have heavily relied on the event stream, which makes the reconstructed images non-photorealistic and also susceptible to noise in the event stream. We propose to reconstruct photorealistic intensity images from a hybrid sensor consisting of a low-frame-rate conventional camera, which has the scene texture information, along with the event sensor. To accomplish our task, we warp the low-frame-rate intensity images to temporally dense locations of the event data by estimating a spatially dense scene depth and temporally dense sensor ego-motion. The results obtained from our algorithm are more photorealistic than those of any of the previous state-of-the-art algorithms. We also demonstrate our algorithm’s robustness to abrupt camera motion and noise in the event sensor data. |
Tasks | Image Reconstruction |
Published | 2018-05-16 |
URL | http://arxiv.org/abs/1805.06140v4 |
http://arxiv.org/pdf/1805.06140v4.pdf | |
PWC | https://paperswithcode.com/paper/photorealistic-image-reconstruction-from |
Repo | |
Framework | |
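To make the warping step described in the abstract above concrete, here is a minimal sketch (not the authors' code) of forward-warping one low-frame-rate frame to an intermediate event timestamp under a pinhole model. The intrinsics `K`, the dense `depth` map, and a purely translational ego-motion `t_motion` are illustrative assumptions.

```python
import numpy as np

def warp_to_timestamp(frame, depth, K, t_motion, alpha):
    """Forward-warp `frame` by fraction `alpha` of the inter-frame ego-motion."""
    H, W = frame.shape
    K_inv = np.linalg.inv(K)
    ys, xs = np.mgrid[0:H, 0:W]
    pix = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)])  # homogeneous pixels
    pts = (K_inv @ pix) * depth.ravel()                       # back-project to 3-D
    pts = pts + alpha * t_motion[:, None]                     # apply partial ego-motion
    proj = K @ pts
    u, v = proj[0] / proj[2], proj[1] / proj[2]               # re-project to the new view
    warped = np.zeros_like(frame)
    u, v = np.round(u).astype(int), np.round(v).astype(int)
    ok = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    warped[v[ok], u[ok]] = frame.ravel()[ok]                  # nearest-neighbour splat
    return warped
```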
Sharp Analysis of Learning with Discrete Losses
Title | Sharp Analysis of Learning with Discrete Losses |
Authors | Alex Nowak-Vila, Francis Bach, Alessandro Rudi |
Abstract | The problem of devising learning strategies for discrete losses (e.g., multilabeling, ranking) is currently addressed with methods and theoretical analyses that are ad hoc for each loss. In this paper we study a least-squares framework to systematically design learning algorithms for discrete losses, with quantitative characterizations in terms of statistical and computational complexity. In particular, we improve existing results by providing explicit dependence on the number of labels for a wide class of losses and faster learning rates in low-noise conditions. The theoretical results are complemented with experiments on real datasets, showing the effectiveness of the proposed general approach. |
Tasks | |
Published | 2018-10-16 |
URL | http://arxiv.org/abs/1810.06839v1 |
http://arxiv.org/pdf/1810.06839v1.pdf | |
PWC | https://paperswithcode.com/paper/sharp-analysis-of-learning-with-discrete |
Repo | |
Framework | |
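A hedged sketch of the generic least-squares recipe for discrete losses that the abstract alludes to: write the loss as a matrix `L[z, y]` over a finite prediction set and label set, ridge-regress the one-hot label encoding on the features, and decode by minimizing the estimated expected loss. All names and the regularizer are illustrative assumptions.

```python
import numpy as np

def fit_and_decode(X_train, y_train, X_test, L, n_labels, lam=1e-3):
    Phi = np.eye(n_labels)[y_train]                     # one-hot label embedding
    A = X_train.T @ X_train + lam * np.eye(X_train.shape[1])
    W = np.linalg.solve(A, X_train.T @ Phi)             # least-squares weights
    g_hat = X_test @ W                                  # estimate of E[phi(y) | x]
    scores = g_hat @ L.T                                # expected loss of each prediction z
    return scores.argmin(axis=1)                        # loss-calibrated decoding
```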
Algebraic Expression of Subjective Spatial and Temporal Patterns
Title | Algebraic Expression of Subjective Spatial and Temporal Patterns |
Authors | Chuyu Xiong |
Abstract | Universal learning machine is a theory that studies machine learning from a mathematical point of view. The outside world is reflected inside a universal learning machine according to the pattern of the incoming data; this is the subjective pattern of the learning machine. In [2,4] we discussed subjective spatial patterns and established a powerful tool, the X-form, which is an algebraic expression for subjective spatial patterns. However, at that initial stage of the study, we only discussed spatial patterns. Here, we discuss spatial and temporal patterns together, and algebraic expressions for them. |
Tasks | |
Published | 2018-05-26 |
URL | http://arxiv.org/abs/1805.11959v2 |
http://arxiv.org/pdf/1805.11959v2.pdf | |
PWC | https://paperswithcode.com/paper/algebraic-expression-of-subjective-spatial |
Repo | |
Framework | |
The Enemy Among Us: Detecting Hate Speech with Threats Based ‘Othering’ Language Embeddings
Title | The Enemy Among Us: Detecting Hate Speech with Threats Based ‘Othering’ Language Embeddings |
Authors | Wafa Alorainy, Pete Burnap, Han Liu, Matthew Williams |
Abstract | Offensive or antagonistic language targeted at individuals and social groups based on their personal characteristics (also known as cyber hate speech or cyberhate) has been frequently posted and widely circulated via the World Wide Web. This can be considered a key risk factor for individual and societal tension linked to regional instability. Automated Web-based cyberhate detection is important for observing and understanding community and regional societal tension, especially in online social networks where posts can be rapidly and widely viewed and disseminated. While previous work has involved using lexicons, bags-of-words or probabilistic language parsing approaches, they often suffer from a similar issue: cyberhate can be subtle and indirect, so depending on the occurrence of individual words or phrases can lead to a significant number of false negatives, providing an inaccurate representation of the trends in cyberhate. This problem motivated us to challenge thinking around the representation of subtle language use, such as references to perceived threats from “the other”, including immigration or job prosperity in a hateful context. We propose a novel framework that utilises language use around the concept of “othering” and intergroup threat theory to identify these subtleties, and we implement a novel classification method using embedding learning to compute semantic distances between parts of speech considered to be part of an “othering” narrative. To validate our approach we conduct several experiments on different types of cyberhate, namely religion, disability, race and sexual orientation, obtaining F-measure scores for classifying hateful instances of 0.93, 0.86, 0.97 and 0.98 respectively, a significant improvement in classifier accuracy over the state-of-the-art. |
Tasks | |
Published | 2018-01-23 |
URL | http://arxiv.org/abs/1801.07495v3 |
http://arxiv.org/pdf/1801.07495v3.pdf | |
PWC | https://paperswithcode.com/paper/the-enemy-among-us-detecting-hate-speech-with |
Repo | |
Framework | |
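Purely as an illustration of the embedding-distance idea in the abstract above (not the authors' pipeline), the sketch below scores how much closer threat terms sit to "them" pronouns than to "us" pronouns in any pretrained embedding table `emb` (word → vector). The word lists are assumptions.

```python
import numpy as np

US, THEM = {"we", "us", "our"}, {"they", "them", "their"}
THREAT = {"invade", "threat", "take", "jobs"}  # illustrative threat lexicon

def group_mean(tokens, group, emb):
    vecs = [emb[t] for t in tokens if t in group and t in emb]
    return np.mean(vecs, axis=0) if vecs else None

def othering_feature(tokens, emb):
    """Semantic proximity of threat terms to 'them' relative to 'us'."""
    us, them, threat = (group_mean(tokens, g, emb) for g in (US, THEM, THREAT))
    if them is None or threat is None:
        return 0.0
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    return cos(them, threat) - (cos(us, threat) if us is not None else 0.0)
```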
Advanced Methods for the Optical Quality Assurance of Silicon Sensors
Title | Advanced Methods for the Optical Quality Assurance of Silicon Sensors |
Authors | E. Lavrik, I. Panasenko, H. R. Schmidt |
Abstract | We describe a setup for optical quality assurance of silicon microstrip sensors. Pattern recognition algorithms were developed to analyze microscopic scans of the sensors for defects. It is shown that the software has a recognition and classification rate of > 90% for defects like scratches, shorts, broken metal lines, etc. We have demonstrated that advanced image processing based on neural network techniques is able to further improve the recognition and defect classification rate. |
Tasks | |
Published | 2018-06-30 |
URL | http://arxiv.org/abs/1807.00211v2 |
http://arxiv.org/pdf/1807.00211v2.pdf | |
PWC | https://paperswithcode.com/paper/advanced-methods-for-the-optical-quality |
Repo | |
Framework | |
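As a minimal sketch of the kind of classical pattern recognition the abstract describes, the snippet below flags scratch-like defects on a grayscale microscope scan as long, straight edge segments. The thresholds are illustrative assumptions, not the paper's values.

```python
import cv2
import numpy as np

def find_scratch_candidates(gray_scan):
    edges = cv2.Canny(gray_scan, 50, 150)                      # edge map of the scan
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                            threshold=80, minLineLength=100, maxLineGap=5)
    return [] if lines is None else [l[0] for l in lines]      # (x1, y1, x2, y2) per candidate
```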
The Dreaming Variational Autoencoder for Reinforcement Learning Environments
Title | The Dreaming Variational Autoencoder for Reinforcement Learning Environments |
Authors | Per-Arne Andersen, Morten Goodwin, Ole-Christoffer Granmo |
Abstract | Reinforcement learning has shown great potential in generalizing over raw sensory data using only a single neural network for value optimization. Several challenges in the current state-of-the-art reinforcement learning algorithms prevent them from converging towards the global optima. It is likely that the solution to these problems lies in short- and long-term planning, exploration, and memory management for reinforcement learning algorithms. Games are often used to benchmark reinforcement learning algorithms, as they provide a flexible, reproducible, and easy-to-control environment. Regardless, few games feature a state-space in which results in exploration, memory, and planning are easily perceived. This paper presents the Dreaming Variational Autoencoder (DVAE), a neural-network-based generative modeling architecture for exploration in environments with sparse feedback. We further present Deep Maze, a novel and flexible maze engine that challenges DVAE with partially and fully observable state-spaces, long-horizon tasks, and deterministic and stochastic problems. We show initial findings and encourage further work in reinforcement learning driven by generative exploration. |
Tasks | |
Published | 2018-10-02 |
URL | http://arxiv.org/abs/1810.01112v1 |
http://arxiv.org/pdf/1810.01112v1.pdf | |
PWC | https://paperswithcode.com/paper/the-dreaming-variational-autoencoder-for |
Repo | |
Framework | |
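A hedged sketch of the "dreaming" idea: a VAE-style model that encodes the current state and action and decodes a predicted next state, so an agent can explore in imagined rollouts. The layer sizes and single-layer encoder/decoder are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class DreamModel(nn.Module):
    def __init__(self, state_dim, action_dim, latent_dim=32):
        super().__init__()
        self.enc = nn.Linear(state_dim + action_dim, 2 * latent_dim)
        self.dec = nn.Linear(latent_dim, state_dim)

    def forward(self, state, action):
        h = self.enc(torch.cat([state, action], dim=-1))
        mu, logvar = h.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return self.dec(z), mu, logvar                        # predicted next state + VAE stats
```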
Variance-based Gradient Compression for Efficient Distributed Deep Learning
Title | Variance-based Gradient Compression for Efficient Distributed Deep Learning |
Authors | Yusuke Tsuzuku, Hiroto Imachi, Takuya Akiba |
Abstract | Due to the substantial computational cost, training state-of-the-art deep neural networks on large-scale datasets often requires distributed training using multiple computation workers. However, by nature, workers need to frequently communicate gradients, causing severe bottlenecks, especially on lower-bandwidth connections. A few methods have been proposed to compress gradients for efficient communication, but they either suffer from a low compression ratio or significantly harm the resulting model accuracy, particularly when applied to convolutional neural networks. To address these issues, we propose a method to reduce the communication overhead of distributed deep learning. Our key observation is that gradient updates can be delayed until an unambiguous (high-amplitude, low-variance) gradient has been calculated. We also present an efficient algorithm to compute the variance at negligible additional cost. We experimentally show that our method can achieve a very high compression ratio while maintaining the resulting model accuracy. We also analyze the efficiency using computation and communication cost models and provide evidence that this method enables distributed deep learning in many scenarios with commodity environments. |
Tasks | |
Published | 2018-02-16 |
URL | http://arxiv.org/abs/1802.06058v2 |
http://arxiv.org/pdf/1802.06058v2.pdf | |
PWC | https://paperswithcode.com/paper/variance-based-gradient-compression-for |
Repo | |
Framework | |
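A toy version of the key observation above: over a window of recent gradients, transmit only the coordinates whose mean is large relative to its standard error (high amplitude, low variance) and keep the rest as a local residual for later steps. The threshold and windowing scheme are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def compress(grad_window, residual, thresh=2.0):
    """grad_window: (steps, params) array of recent gradients; residual: (params,)."""
    g = grad_window.mean(axis=0) + residual
    se = grad_window.std(axis=0) / np.sqrt(len(grad_window)) + 1e-12
    send = np.abs(g) / se > thresh          # unambiguous coordinates only
    message = np.where(send, g, 0.0)        # sparse update to communicate
    residual = np.where(send, 0.0, g)       # delay ambiguous coordinates locally
    return message, residual
```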
Multi-shot Person Re-identification through Set Distance with Visual Distributional Representation
Title | Multi-shot Person Re-identification through Set Distance with Visual Distributional Representation |
Authors | Ting-Yao Hu, Xiaojun Chang, Alexander G. Hauptmann |
Abstract | Person re-identification aims to identify a specific person at distinct times and locations. It is challenging because of occlusion, illumination, and viewpoint changes across camera views. Recently, the multi-shot person re-id task has received more attention, since it is closer to real-world applications. A key component of a good algorithm for multi-shot person re-id is the temporal aggregation of person appearance features. While most current approaches apply pooling strategies to obtain a fixed-size vector representation, these may lose the matching evidence between examples. In this work, we propose the idea of visual distributional representation, which interprets an image set as samples drawn from an unknown distribution in appearance feature space. Based on the supervision signals from a downstream task of interest, the method reshapes the appearance feature space and further learns the unknown distribution of each image set. In the context of multi-shot person re-id, we apply this novel concept along with the Wasserstein distance and learn a distributional set distance function between two image sets. In this way, the proper alignment between two image sets can be discovered naturally in a non-parametric manner. Our experimental results on two public datasets show the advantages of our proposed method compared to other state-of-the-art approaches. |
Tasks | Person Re-Identification |
Published | 2018-08-03 |
URL | http://arxiv.org/abs/1808.01119v2 |
http://arxiv.org/pdf/1808.01119v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-shot-person-re-identification-through |
Repo | |
Framework | |
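To make the distributional set distance concrete: treating each image set as an empirical distribution with uniform weights, the Wasserstein distance between two equal-sized feature sets reduces to an assignment problem. The sketch below uses that reduction; the learned feature space and the end-to-end training from the paper are omitted.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def set_distance(feats_a, feats_b):
    """Optimal-transport distance between two (n, d) appearance feature sets."""
    cost = np.linalg.norm(feats_a[:, None, :] - feats_b[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)   # optimal one-to-one alignment
    return cost[rows, cols].mean()
```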
Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects
Title | Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects |
Authors | Yusuke Niitani, Takuya Akiba, Tommi Kerola, Toru Ogawa, Shotaro Sano, Shuji Suzuki |
Abstract | Efficient and reliable methods for training object detectors are in higher demand than ever, and more and more data relevant to the field is becoming available. However, large datasets like Open Images Dataset v4 (OID) are sparsely annotated, and some measures must be taken to ensure the training of a reliable detector. To take the incompleteness of these datasets into account, one possibility is to use pretrained models to detect the presence of the unverified objects. However, the performance of such a strategy depends largely on the power of the pretrained model. In this study, we propose part-aware sampling, a method that uses human intuition about the hierarchical relations between objects. In terse terms, our method works by making assumptions like “a bounding box for a car should contain a bounding box for a tire”. We demonstrate the power of our method on OID and compare its performance against a method based on a pretrained model. Our method also won first and second place on the public and private test sets of the Google AI Open Images Competition 2018. |
Tasks | Object Detection |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.10862v2 |
http://arxiv.org/pdf/1811.10862v2.pdf | |
PWC | https://paperswithcode.com/paper/sampling-techniques-for-large-scale-object |
Repo | |
Framework | |
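A minimal sketch of the part-aware rule the abstract states: inside a verified "whole" box (e.g. a car), detector predictions for plausible parts (e.g. a tire) are not penalized as false positives, since the part annotation may simply be missing. The hierarchy table is an illustrative assumption.

```python
PART_OF = {"car": {"tire", "wheel", "door"}, "person": {"face", "hand"}}  # assumed hierarchy

def contains(outer, inner):
    ox1, oy1, ox2, oy2 = outer
    ix1, iy1, ix2, iy2 = inner
    return ox1 <= ix1 and oy1 <= iy1 and ix2 <= ox2 and iy2 <= oy2

def ignore_in_loss(pred_label, pred_box, gt_label, gt_box):
    """True if the prediction is a plausible unannotated part of a ground-truth object."""
    return pred_label in PART_OF.get(gt_label, set()) and contains(gt_box, pred_box)
```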
Decompose to manipulate: Manipulable Object Synthesis in 3D Medical Images with Structured Image Decomposition
Title | Decompose to manipulate: Manipulable Object Synthesis in 3D Medical Images with Structured Image Decomposition |
Authors | Siqi Liu, Eli Gibson, Sasa Grbic, Zhoubing Xu, Arnaud Arindra Adiyoso Setio, Jie Yang, Bogdan Georgescu, Dorin Comaniciu |
Abstract | The performance of medical image analysis systems is constrained by the quantity of high-quality image annotations. Such systems require data to be annotated by experts with years of training, especially when diagnostic decisions are involved. Such datasets are thus hard to scale up. In this context, it is hard for supervised learning systems to generalize to cases that are rare in the training set but would be present in real-world clinical practice. We believe that synthetic image samples generated by a system trained on the real data can be useful for improving supervised learning tasks in medical image analysis applications. Allowing the image synthesis to be manipulable could help synthetic images provide complementary information to the training data rather than simply duplicating the real-data manifold. In this paper, we propose a framework for synthesizing 3D objects, such as pulmonary nodules, in 3D medical images with manipulable properties. The manipulation is enabled by decomposing the object of interest into its segmentation mask and a 1D vector containing the residual information. The synthetic object is refined and blended into the image context with two adversarial discriminators. We evaluate the proposed framework on lung nodules in 3D chest CT images and show that it can generate realistic nodules with manipulable shapes, textures, locations, etc. By sampling from both the synthetic nodules and the real nodules from 2800 3D CT volumes during classifier training, we show that the synthetic patches improve the overall nodule detection performance by an average of 8.44% in the competition performance metric (CPM) score. |
Tasks | Image Generation |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01737v2 |
http://arxiv.org/pdf/1812.01737v2.pdf | |
PWC | https://paperswithcode.com/paper/decompose-to-manipulate-manipulable-object |
Repo | |
Framework | |
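A minimal sketch of the decomposition described above: a synthetic object is defined by a segmentation mask plus a residual appearance code and is pasted into the surrounding image context. `decoder` is a stand-in for the paper's generator, and the adversarial refinement by the two discriminators is omitted.

```python
import torch

def blend(context_patch, mask, residual_code, decoder):
    """Composite a decoded synthetic object into a real image patch via its mask."""
    synthetic = decoder(mask, residual_code)                 # render object appearance
    return mask * synthetic + (1.0 - mask) * context_patch   # paste into the context
```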
Deep Learning for Decoding of Linear Codes - A Syndrome-Based Approach
Title | Deep Learning for Decoding of Linear Codes - A Syndrome-Based Approach |
Authors | Amir Bennatan, Yoni Choukroun, Pavel Kisilev |
Abstract | We present a novel framework for applying deep neural networks (DNN) to soft decoding of linear codes at arbitrary block lengths. Unlike other approaches, our framework allows unconstrained DNN design, enabling the free application of powerful designs that were developed in other contexts. Our method is robust to the overfitting that inhibits many competing methods, which stems from the exponentially large number of codewords required for their training. We achieve this by transforming the channel output before feeding it to the network, extracting only the syndrome of the hard decisions and the channel output reliabilities. We prove analytically that this approach does not involve any intrinsic performance penalty and guarantees the generalization of the performance obtained during training. Our best results are obtained using a recurrent neural network (RNN) architecture combined with simple preprocessing by permutation. We provide simulation results that demonstrate performance that sometimes approaches that of the ordered statistics decoding (OSD) algorithm. |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04741v1 |
http://arxiv.org/pdf/1802.04741v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-decoding-of-linear-codes-a |
Repo | |
Framework | |
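The syndrome-based preprocessing lends itself to a short sketch: instead of feeding raw channel outputs to the network, feed the syndrome of the hard decisions (which depends only on the noise, not the transmitted codeword) together with the reliabilities. `H` is the code's parity-check matrix; the trained network itself is not shown.

```python
import numpy as np

def decoder_input(channel_llr, H):
    """Build the codeword-independent network input from channel LLRs."""
    hard = (channel_llr < 0).astype(np.uint8)       # hard bit decisions
    syndrome = H.dot(hard) % 2                      # function of the noise only
    reliability = np.abs(channel_llr)               # soft confidence per bit
    return np.concatenate([syndrome, reliability])  # input vector for the DNN
```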
Constrained Convolutional-Recurrent Networks to Improve Speech Quality with Low Impact on Recognition Accuracy
Title | Constrained Convolutional-Recurrent Networks to Improve Speech Quality with Low Impact on Recognition Accuracy |
Authors | Rasool Fakoor, Xiaodong He, Ivan Tashev, Shuayb Zarar |
Abstract | For a speech-enhancement algorithm, it is highly desirable to simultaneously improve perceptual quality and recognition rate. Due to computational costs and model complexities, it is challenging to train a model that effectively optimizes both metrics at the same time. In this paper, we propose a method for speech enhancement that combines local and global contextual structure information through convolutional-recurrent neural networks to improve perceptual quality. At the same time, we introduce a new constraint on the objective function using a language model/decoder that limits the impact on recognition rate. Based on experiments conducted with real user data, we demonstrate that our new context-augmented machine-learning approach for speech enhancement improves PESQ and WER by an additional 24.5% and 51.3%, respectively, when compared to the best-performing methods in the literature. |
Tasks | Language Modelling, Speech Enhancement |
Published | 2018-02-16 |
URL | http://arxiv.org/abs/1802.05874v1 |
http://arxiv.org/pdf/1802.05874v1.pdf | |
PWC | https://paperswithcode.com/paper/constrained-convolutional-recurrent-networks |
Repo | |
Framework | |
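A hedged sketch of the constrained objective: an enhancement loss plus a penalty that keeps the enhanced signal's recognizer features close to those of the clean signal. `recognizer` is a frozen stand-in for the language-model/decoder constraint in the abstract, and the weight `lam` is an assumption.

```python
import torch

def constrained_loss(enhanced, clean, recognizer, lam=0.1):
    quality = torch.mean((enhanced - clean) ** 2)             # enhancement term
    with_enh, with_clean = recognizer(enhanced), recognizer(clean)
    recognition = torch.mean((with_enh - with_clean) ** 2)    # recognition-drift penalty
    return quality + lam * recognition
```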
Learning How to Self-Learn: Enhancing Self-Training Using Neural Reinforcement Learning
Title | Learning How to Self-Learn: Enhancing Self-Training Using Neural Reinforcement Learning |
Authors | Chenhua Chen, Yue Zhang |
Abstract | Self-training is a useful strategy for semi-supervised learning, leveraging raw texts to enhance model performance. Traditional self-training methods depend on heuristics such as model confidence for instance selection, the manual adjustment of which can be expensive. To address these challenges, we propose a deep reinforcement learning method to learn the self-training strategy automatically. Based on neural-network representations of sentences, our model automatically learns an optimal policy for instance selection. Experimental results show that our approach outperforms the baseline solutions in terms of better tagging performance and stability. |
Tasks | |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05734v1 |
http://arxiv.org/pdf/1804.05734v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-how-to-self-learn-enhancing-self |
Repo | |
Framework | |
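A toy sketch of RL-driven self-training: a policy scores unlabeled instances from simple features (here just model confidence) and receives as reward the change in development-set accuracy after retraining. Every callable here (`policy.select_probs`, `model.confidence`, `retrain`, `dev_eval`) is an illustrative stand-in, not the paper's API.

```python
import numpy as np

def self_training_step(policy, model, unlabeled, dev_acc_before, retrain, dev_eval):
    feats = np.array([model.confidence(x) for x in unlabeled])[:, None]
    probs = policy.select_probs(feats)                   # P(select) per instance
    chosen = [x for x, p in zip(unlabeled, probs) if np.random.rand() < p]
    model = retrain(model, chosen)                       # add pseudo-labeled data
    reward = dev_eval(model) - dev_acc_before            # policy-gradient reward signal
    return model, reward
```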
PPD: Permutation Phase Defense Against Adversarial Examples in Deep Learning
Title | PPD: Permutation Phase Defense Against Adversarial Examples in Deep Learning |
Authors | Mehdi Jafarnia-Jahromi, Tasmin Chowdhury, Hsin-Tai Wu, Sayandev Mukherjee |
Abstract | Deep neural networks have demonstrated cutting-edge performance on various tasks including classification. However, it is well known that adversarially designed, imperceptible perturbations of the input can mislead advanced classifiers. In this paper, Permutation Phase Defense (PPD) is proposed as a novel method to resist adversarial attacks. PPD combines a random permutation of the image with the phase component of its Fourier transform. The basic idea behind this approach is to turn the adversarial defense problem into an analogue of symmetric cryptography, which relies solely on safekeeping of the keys for security. In PPD, safekeeping of the selected permutation ensures effectiveness against adversarial attacks. Testing PPD on the MNIST and CIFAR-10 datasets yielded state-of-the-art robustness against the most powerful adversarial attacks currently available. |
Tasks | Adversarial Defense |
Published | 2018-12-25 |
URL | https://arxiv.org/abs/1812.10049v2 |
https://arxiv.org/pdf/1812.10049v2.pdf | |
PWC | https://paperswithcode.com/paper/ppd-permutation-phase-defense-against |
Repo | |
Framework | |
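The two keyed transforms in the abstract are simple enough to sketch directly: permute the pixels with a secret seed, then keep only the phase of the 2-D Fourier transform; the classifier is then trained on this representation. Using a seed as the "key" is an illustrative simplification.

```python
import numpy as np

def ppd_transform(image, seed):
    """Secret permutation followed by Fourier phase extraction."""
    rng = np.random.RandomState(seed)                 # the secret key
    flat = image.ravel()
    permuted = flat[rng.permutation(flat.size)].reshape(image.shape)
    return np.angle(np.fft.fft2(permuted))            # phase component only
```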
What Goes Where: Predicting Object Distributions from Above
Title | What Goes Where: Predicting Object Distributions from Above |
Authors | Connor Greenwell, Scott Workman, Nathan Jacobs |
Abstract | In this work, we propose a cross-view learning approach, in which images captured from a ground-level view are used as weakly supervised annotations for interpreting overhead imagery. The outcome is a convolutional neural network for overhead imagery that is capable of predicting the type and count of objects that are likely to be seen from a ground-level perspective. We demonstrate our approach on a large dataset of geotagged ground-level and overhead imagery and find that our network captures semantically meaningful features, despite being trained without manual annotations. |
Tasks | |
Published | 2018-08-02 |
URL | http://arxiv.org/abs/1808.00995v1 |
http://arxiv.org/pdf/1808.00995v1.pdf | |
PWC | https://paperswithcode.com/paper/what-goes-where-predicting-object |
Repo | |
Framework | |
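A hedged sketch of the cross-view supervision described above: the overhead-image network predicts a distribution over object categories and is trained to match the weak label distribution derived from co-located ground-level photos. The KL-divergence objective and the count-normalization are assumptions about one reasonable instantiation.

```python
import torch
import torch.nn.functional as F

def cross_view_loss(overhead_logits, ground_level_counts):
    """Match predicted category distribution to ground-level object counts."""
    target = ground_level_counts / ground_level_counts.sum(dim=-1, keepdim=True)
    return F.kl_div(F.log_softmax(overhead_logits, dim=-1), target,
                    reduction="batchmean")
```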