October 17, 2019

Paper Group ANR 926

Semantic Video Segmentation: A Review on Recent Approaches. Non-local RoI for Cross-Object Perception. Lung Structures Enhancement in Chest Radiographs via CT based FCNN Training. CRED: A Deep Residual Network of Convolutional and Recurrent Units for Earthquake Signal Detection. Steering Social Activity: A Stochastic Optimal Control Point Of View. …

Semantic Video Segmentation: A Review on Recent Approaches

Title Semantic Video Segmentation: A Review on Recent Approaches
Authors Mohammad Hajizadeh Saffar, Mohsen Fayyaz, Mohammad Sabokrou, Mahmood Fathy
Abstract This paper gives an overview of semantic segmentation. It explains the field, its status and relation to other fundamental vision tasks, the available datasets, and the common evaluation parameters used by researchers. The survey also reviews a variety of recent approaches (RDF, MRF, CRF, etc.), along with their advantages and challenges, and shows the superiority of CNN-based semantic segmentation systems on the CamVid and NYUDv2 datasets. In addition, some areas that are ideal for future work are mentioned.
Tasks Semantic Segmentation, Video Semantic Segmentation
Published 2018-06-16
URL http://arxiv.org/abs/1806.06172v1
PDF http://arxiv.org/pdf/1806.06172v1.pdf
PWC https://paperswithcode.com/paper/semantic-video-segmentation-a-review-on
Repo
Framework

Non-local RoI for Cross-Object Perception

Title Non-local RoI for Cross-Object Perception
Authors Shou-Yao Roy Tseng, Hwann-Tzong Chen, Shao-Heng Tai, Tyng-Luh Liu
Abstract We present a generic and flexible module that encodes region proposals by both their intrinsic features and the extrinsic correlations to the others. The proposed non-local region of interest (NL-RoI) can be seamlessly adapted into different generalized R-CNN architectures to better address various perception tasks. Observe that existing techniques from R-CNN treat RoIs independently and perform the prediction solely based on image features within each region proposal. However, the pairwise relationships between proposals could further provide useful information for detection and segmentation. NL-RoI is thus formulated to enrich each RoI representation with the information from all other RoIs, and yield a simple, low-cost, yet effective module for region-based convolutional networks. Our experimental results show that NL-RoI can improve the performance of Faster/Mask R-CNN for object detection and instance segmentation.
Tasks Instance Segmentation, Object Detection, Semantic Segmentation
Published 2018-11-25
URL http://arxiv.org/abs/1811.10002v1
PDF http://arxiv.org/pdf/1811.10002v1.pdf
PWC https://paperswithcode.com/paper/non-local-roi-for-cross-object-perception
Repo
Framework

Lung Structures Enhancement in Chest Radiographs via CT based FCNN Training

Title Lung Structures Enhancement in Chest Radiographs via CT based FCNN Training
Authors Ophir Gozes, Hayit Greenspan
Abstract The abundance of overlapping anatomical structures appearing in chest radiographs can reduce the performance of lung pathology detection by automated algorithms (CAD) as well as the human reader. In this paper, we present a deep learning based image processing technique for enhancing the contrast of soft lung structures in chest radiographs using Fully Convolutional Neural Networks (FCNN). Two 2D FCNN architectures were trained to accomplish the task: The first performs 2D lung segmentation which is used for normalization of the lung area. The second FCNN is trained to extract lung structures. To create the training images, we employed Simulated X-Ray or Digitally Reconstructed Radiographs (DRR) derived from 516 scans belonging to the LIDC-IDRI dataset. By first segmenting the lungs in the CT domain, we are able to create a dataset of 2D lung masks to be used for training the segmentation FCNN. For training the extraction FCNN, we create DRR images of only voxels belonging to the 3D lung segmentation which we call “Lung X-ray” and use them as target images. Once the lung structures are extracted, the original image can be enhanced by fusing the original input x-ray and the synthesized “Lung X-ray”. We show that our enhancement technique is applicable to real x-ray data, and display our results on the recently released NIH Chest X-Ray-14 dataset. We see promising results when training a DenseNet-121 based architecture to work directly on the lung enhanced X-ray images.
Tasks
Published 2018-10-14
URL http://arxiv.org/abs/1810.05989v1
PDF http://arxiv.org/pdf/1810.05989v1.pdf
PWC https://paperswithcode.com/paper/lung-structures-enhancement-in-chest
Repo
Framework

CRED: A Deep Residual Network of Convolutional and Recurrent Units for Earthquake Signal Detection

Title CRED: A Deep Residual Network of Convolutional and Recurrent Units for Earthquake Signal Detection
Authors S. Mostafa Mousavi, Weiqiang Zhu, Yixiao Sheng, Gregory C. Beroza
Abstract Earthquake signal detection is at the core of observational seismology. A good detection algorithm should be sensitive to small and weak events with a variety of waveform shapes, robust to background noise and non-earthquake signals, and efficient for processing large data volumes. Here, we introduce the Cnn-Rnn Earthquake Detector (CRED), a detector based on deep neural networks. The network uses a combination of convolutional layers and bi-directional long-short-term memory units in a residual structure. It learns the time-frequency characteristics of the dominant phases in an earthquake signal from three-component data recorded on a single station. We train the network using 500,000 seismograms (250k associated with tectonic earthquakes and 250k identified as noise) recorded in Northern California and test it, obtaining an F-score of 99.95. The robustness of the trained model with respect to the noise level and non-earthquake signals is shown by applying it to a set of semi-synthetic signals. The model is applied to one month of continuous data recorded at Central Arkansas to demonstrate its efficiency, generalization, and sensitivity. Our model is able to detect more than 700 microearthquakes as small as -1.3 ML induced during hydraulic fracturing far away from the training region. The performance of the model is compared with STA/LTA, template matching, and FAST algorithms. Our results indicate an efficient and reliable performance of CRED. This framework holds great promise in lowering the detection threshold while minimizing false positive detection rates.
Tasks
Published 2018-10-03
URL http://arxiv.org/abs/1810.01965v1
PDF http://arxiv.org/pdf/1810.01965v1.pdf
PWC https://paperswithcode.com/paper/cred-a-deep-residual-network-of-convolutional
Repo
Framework
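
The abstract above compares CRED against STA/LTA. For readers unfamiliar with that baseline, here is a minimal sketch of a classic short-term-average over long-term-average energy trigger. It is not code from the paper: the function names, window lengths, threshold, and the synthetic trace are all assumptions chosen for illustration.

```python
def sta_lta(signal, n_sta, n_lta):
    """Ratio of short-term to long-term average energy at each sample.

    Samples without a full long-term window behind them get a ratio of 0.0.
    Assumes 1 <= n_sta <= n_lta (hypothetical parameter names).
    """
    # Prefix sums of squared amplitude make each window average O(1).
    prefix = [0.0]
    for x in signal:
        prefix.append(prefix[-1] + x * x)

    ratios = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = (prefix[i + 1] - prefix[i + 1 - n_sta]) / n_sta
        lta = (prefix[i + 1] - prefix[i + 1 - n_lta]) / n_lta
        ratios[i] = sta / lta if lta > 0 else 0.0
    return ratios


def detect(signal, n_sta, n_lta, threshold):
    """Indices where the STA/LTA ratio exceeds the threshold."""
    ratios = sta_lta(signal, n_sta, n_lta)
    return [i for i, r in enumerate(ratios) if r > threshold]


# Synthetic trace (an assumption): low-amplitude noise with a 20-sample burst.
noise_a = [0.1 * (-1) ** i for i in range(200)]
burst = [1.0 * (-1) ** i for i in range(20)]
noise_b = [0.1 * (-1) ** i for i in range(100)]
trace = noise_a + burst + noise_b

picks = detect(trace, n_sta=5, n_lta=50, threshold=3.0)
print(picks[:3])
```

A trigger of this kind reacts to any sudden energy increase, which is one reason the abstract stresses robustness to non-earthquake signals as a requirement for a good detector.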

Steering Social Activity: A Stochastic Optimal Control Point Of View

Title Steering Social Activity: A Stochastic Optimal Control Point Of View
Authors Ali Zarezade, Abir De, Utkarsh Upadhyay, Hamid R. Rabiee, Manuel Gomez-Rodriguez
Abstract User engagement in online social networking depends critically on the level of social activity in the corresponding platform, that is, the number of online actions, such as posts, shares or replies, taken by its users. Can we design data-driven algorithms to increase social activity? At a user level, such algorithms may increase activity by helping users decide when to take an action to be more likely to be noticed by their peers. At a network level, they may increase activity by incentivizing a few influential users to take more actions, which in turn will trigger additional actions by other users. In this paper, we model social activity using the framework of marked temporal point processes, derive an alternate representation of these processes using stochastic differential equations (SDEs) with jumps and, exploiting this alternate representation, develop two efficient online algorithms with provable guarantees to steer social activity both at a user and at a network level. In doing so, we establish a previously unexplored connection between optimal control of jump SDEs and doubly stochastic marked temporal point processes, which is of independent interest. Finally, we experiment both with synthetic and real data gathered from Twitter and show that our algorithms consistently steer social activity more effectively than the state of the art.
Tasks Point Processes
Published 2018-02-19
URL http://arxiv.org/abs/1802.07244v1
PDF http://arxiv.org/pdf/1802.07244v1.pdf
PWC https://paperswithcode.com/paper/steering-social-activity-a-stochastic-optimal
Repo
Framework

Improving the Universality and Learnability of Neural Programmer-Interpreters with Combinator Abstraction

Title Improving the Universality and Learnability of Neural Programmer-Interpreters with Combinator Abstraction
Authors Da Xiao, Jo-Yu Liao, Xingyuan Yuan
Abstract To overcome the limitations of Neural Programmer-Interpreters (NPI) in its universality and learnability, we propose the incorporation of combinator abstraction into neural programming and a new NPI architecture to support this abstraction, which we call Combinatory Neural Programmer-Interpreter (CNPI). Combinator abstraction dramatically reduces the number and complexity of programs that need to be interpreted by the core controller of CNPI, while still allowing the CNPI to represent and interpret arbitrarily complex programs by the collaboration of the core with the other components. We propose a small set of four combinators to capture the most pervasive programming patterns. Due to the finiteness and simplicity of this combinator set and the offloading of some burden of interpretation from the core, we are able to construct a CNPI that is universal with respect to the set of all combinatorizable programs, which is adequate for solving most algorithmic tasks. Moreover, besides supervised training on execution traces, CNPI can be trained by policy gradient reinforcement learning with appropriately designed curricula.
Tasks
Published 2018-02-08
URL http://arxiv.org/abs/1802.02696v1
PDF http://arxiv.org/pdf/1802.02696v1.pdf
PWC https://paperswithcode.com/paper/improving-the-universality-and-learnability
Repo
Framework

Spectral Analysis of Jet Substructure with Neural Networks: Boosted Higgs Case

Title Spectral Analysis of Jet Substructure with Neural Networks: Boosted Higgs Case
Authors Sung Hak Lim, Mihoko M. Nojiri
Abstract Jets from boosted heavy particles have a typical angular scale which can be used to distinguish them from QCD jets. We introduce a machine learning strategy for jet substructure analysis using a spectral function on the angular scale. The angular spectrum allows us to scan energy deposits over the angle between a pair of particles in a highly visual way. We set up an artificial neural network (ANN) to find characteristic shapes of the spectra of the jets from heavy particle decays. By taking the Higgs jets and QCD jets as examples, we show that the ANN with the angular spectrum as input has similar performance to existing taggers. In addition, some improvement is seen when extra radiation occurs. Notably, the new algorithm automatically combines the information of the multi-point correlations in the jet.
Tasks
Published 2018-07-09
URL http://arxiv.org/abs/1807.03312v2
PDF http://arxiv.org/pdf/1807.03312v2.pdf
PWC https://paperswithcode.com/paper/spectral-analysis-of-jet-substructure-with
Repo
Framework

On the achievability of blind source separation for high-dimensional nonlinear source mixtures

Title On the achievability of blind source separation for high-dimensional nonlinear source mixtures
Authors Takuya Isomura, Taro Toyoizumi
Abstract For many years, a combination of principal component analysis (PCA) and independent component analysis (ICA) has been used as a blind source separation (BSS) technique to separate hidden sources of natural data. However, it is unclear why these linear methods work well because most real-world data involve nonlinear mixtures of sources. We show that a cascade of PCA and ICA can solve this nonlinear BSS problem accurately as the variety of input signals increases. Specifically, we present two theorems that guarantee asymptotically zero-error BSS when sources are mixed by a feedforward network with two processing layers. Our first theorem analytically quantifies the performance of an optimal linear encoder that reconstructs independent sources. Zero-error is asymptotically reached when the number of sources is large and the numbers of inputs and nonlinear bases are large relative to the number of sources. The next question involves finding an optimal linear encoder without observing the underlying sources. Our second theorem guarantees that PCA can reliably extract all the subspace represented by the optimal linear encoder, so that a subsequent application of ICA can separate all sources. Thereby, for almost all nonlinear generative processes with sufficient variety, the cascade of PCA and ICA performs asymptotically zero-error BSS in an unsupervised manner. We analytically and numerically validate the theorems. These results highlight the utility of linear BSS techniques for accurately recovering nonlinearly mixed sources when observations are sufficiently diverse. We also discuss a possible biological BSS implementation.
Tasks
Published 2018-08-02
URL http://arxiv.org/abs/1808.00668v1
PDF http://arxiv.org/pdf/1808.00668v1.pdf
PWC https://paperswithcode.com/paper/on-the-achievability-of-blind-source
Repo
Framework

How to Blend a Robot within a Group of Zebrafish: Achieving Social Acceptance through Real-time Calibration of a Multi-level Behavioural Model

Title How to Blend a Robot within a Group of Zebrafish: Achieving Social Acceptance through Real-time Calibration of a Multi-level Behavioural Model
Authors Leo Cazenille, Yohann Chemtob, Frank Bonnet, Alexey Gribovskiy, Francesco Mondada, Nicolas Bredeche, Jose Halloy
Abstract We have previously shown how to socially integrate a fish robot into a group of zebrafish thanks to biomimetic behavioural models. The models have to be calibrated on experimental data to present correct behavioural features. This calibration is essential to enhance the social integration of the robot into the group. When calibrated, the behavioural model of fish behaviour is implemented to drive a robot with closed-loop control of social interactions into a group of zebrafish. This approach can be useful to form mixed groups, and to study individual and collective animal behaviour by using biomimetic autonomous robots capable of responding to the animals in long-standing experiments. Here, we show a methodology for continuous real-time calibration and refinement of a multi-level behavioural model. The real-time calibration, by an evolutionary algorithm, is based on simulation of the model to correspond to the observed fish behaviour in real-time. The calibrated model is updated on the robot and tested during the experiments. This method makes it possible to cope with changes of dynamics in fish behaviour. Moreover, each fish presents individual behavioural differences. Thus, each trial is done with naive fish groups that display behavioural variability. This real-time calibration methodology can optimise the robot behaviours during the experiments. Our implementation of this methodology runs on three different computers that perform individual tracking, data-analysis, multi-objective evolutionary algorithms, simulation of the fish robot and adaptation of the robot behavioural models, all in real-time.
Tasks Calibration
Published 2018-05-29
URL http://arxiv.org/abs/1805.11371v1
PDF http://arxiv.org/pdf/1805.11371v1.pdf
PWC https://paperswithcode.com/paper/how-to-blend-a-robot-within-a-group-of
Repo
Framework

Enhancing Evolutionary Conversion Rate Optimization via Multi-armed Bandit Algorithms

Title Enhancing Evolutionary Conversion Rate Optimization via Multi-armed Bandit Algorithms
Authors Xin Qiu, Risto Miikkulainen
Abstract Conversion rate optimization means designing web interfaces such that more visitors perform a desired action (such as register or purchase) on the site. One promising approach, implemented in Sentient Ascend, is to optimize the design using evolutionary algorithms, evaluating each candidate design online with actual visitors. Because such evaluations are costly and noisy, several challenges emerge: How can available visitor traffic be used most efficiently? How can good solutions be identified most reliably? How can a high conversion rate be maintained during optimization? This paper proposes a new technique to address these issues. Traffic is allocated to candidate solutions using a multi-armed bandit algorithm, using more traffic on those evaluations that are most useful. In a best-arm identification mode, the best candidate can be identified reliably at the end of evolution, and in a campaign mode, the overall conversion rate can be optimized throughout the entire evolution process. Multi-armed bandit algorithms thus improve performance and reliability of machine discovery in noisy real-world environments.
Tasks
Published 2018-03-10
URL http://arxiv.org/abs/1803.03737v3
PDF http://arxiv.org/pdf/1803.03737v3.pdf
PWC https://paperswithcode.com/paper/enhancing-evolutionary-conversion-rate
Repo
Framework
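
The abstract above describes allocating visitor traffic to candidate designs with a multi-armed bandit. The sketch below illustrates the general idea with Thompson sampling over Beta posteriors. This is an assumption: the abstract does not say which bandit algorithm the authors use. The conversion rates, function name, and visitor count are hypothetical, and the evolutionary loop itself is omitted.

```python
import random


def thompson_allocate(true_rates, n_visitors, seed=0):
    """Route simulated visitors to candidate designs via Thompson sampling.

    true_rates: hidden conversion probability of each candidate (simulation only).
    Returns (successes, failures) per candidate.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k
    failures = [0] * k

    for _ in range(n_visitors):
        # Draw one plausible conversion rate per candidate from its
        # Beta(successes + 1, failures + 1) posterior (uniform prior).
        draws = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                 for i in range(k)]
        arm = max(range(k), key=draws.__getitem__)

        # Simulate whether this visitor converts on the chosen design.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


# Hypothetical candidates with very different conversion rates.
rates = [0.1, 0.2, 0.6]
wins, losses = thompson_allocate(rates, n_visitors=2000)
pulls = [w + l for w, l in zip(wins, losses)]
print(pulls)
```

Because weaker candidates are sampled less often as evidence accumulates, most traffic ends up on the strongest design, which matches the abstract's goal of keeping the conversion rate high during optimization.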

Driving Policy Transfer via Modularity and Abstraction

Title Driving Policy Transfer via Modularity and Abstraction
Authors Matthias Müller, Alexey Dosovitskiy, Bernard Ghanem, Vladlen Koltun
Abstract End-to-end approaches to autonomous driving have high sample complexity and are difficult to scale to realistic urban driving. Simulation can help end-to-end driving systems by providing a cheap, safe, and diverse training environment. Yet training driving policies in simulation brings up the problem of transferring such policies to the real world. We present an approach to transferring driving policies from simulation to reality via modularity and abstraction. Our approach is inspired by classic driving systems and aims to combine the benefits of modular architectures and end-to-end deep learning approaches. The key idea is to encapsulate the driving policy such that it is not directly exposed to raw perceptual input or low-level vehicle dynamics. We evaluate the presented approach in simulated urban environments and in the real world. In particular, we transfer a driving policy trained in simulation to a 1/5-scale robotic truck that is deployed in a variety of conditions, with no finetuning, on two continents. The supplementary video can be viewed at https://youtu.be/BrMDJqI6H5U
Tasks Autonomous Driving
Published 2018-04-25
URL http://arxiv.org/abs/1804.09364v3
PDF http://arxiv.org/pdf/1804.09364v3.pdf
PWC https://paperswithcode.com/paper/driving-policy-transfer-via-modularity-and
Repo
Framework

Balancing Shared Autonomy with Human-Robot Communication

Title Balancing Shared Autonomy with Human-Robot Communication
Authors Rosario Scalise, Yonatan Bisk, Maxwell Forbes, Daqing Yi, Yejin Choi, Siddhartha Srinivasa
Abstract Robotic agents that share autonomy with a human should leverage human domain knowledge and account for their preferences when completing a task. This extra knowledge can dramatically improve plan efficiency and user-satisfaction, but these gains are lost if communicating with a robot is taxing and unnatural. In this paper, we show how viewing human-robot language through the lens of shared autonomy explains the efficiency versus cognitive load trade-offs humans make when deciding how cooperative and explicit to make their instructions.
Tasks
Published 2018-05-20
URL http://arxiv.org/abs/1805.07719v1
PDF http://arxiv.org/pdf/1805.07719v1.pdf
PWC https://paperswithcode.com/paper/balancing-shared-autonomy-with-human-robot
Repo
Framework

Facial Action Unit Detection Using Attention and Relation Learning

Title Facial Action Unit Detection Using Attention and Relation Learning
Authors Zhiwen Shao, Zhilei Liu, Jianfei Cai, Yunsheng Wu, Lizhuang Ma
Abstract The attention mechanism has recently attracted increasing attention in the field of facial action unit (AU) detection. By finding the region of interest of each AU with the attention mechanism, AU-related local features can be captured. Most of the existing attention based AU detection works use prior knowledge to predefine fixed attentions or refine the predefined attentions within a small range, which limits their capacity to model various AUs. In this paper, we propose an end-to-end deep learning based attention and relation learning framework for AU detection with only AU labels, which has not been explored before. In particular, multi-scale features shared by each AU are learned first, and then both channel-wise and spatial attentions are adaptively learned to select and extract AU-related local features. Moreover, pixel-level relations for AUs are further captured to refine spatial attentions so as to extract more relevant local features. Without changing the network architecture, our framework can be easily extended for AU intensity estimation. Extensive experiments show that our framework (i) soundly outperforms the state-of-the-art methods for both AU detection and AU intensity estimation on the challenging BP4D, DISFA, FERA 2015 and BP4D+ benchmarks, (ii) can adaptively capture the correlated regions of each AU, and (iii) also works well under severe occlusions and large poses.
Tasks Action Unit Detection, Facial Action Unit Detection
Published 2018-08-10
URL https://arxiv.org/abs/1808.03457v3
PDF https://arxiv.org/pdf/1808.03457v3.pdf
PWC https://paperswithcode.com/paper/facial-action-unit-detection-using-attention
Repo
Framework

Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization

Title Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization
Authors Constantinos Daskalakis, Ioannis Panageas
Abstract Motivated by applications in Game Theory, Optimization, and Generative Adversarial Networks, recent work of Daskalakis et al. [DISZ17] and follow-up work of Liang and Stokes [LiangS18] have established that a variant of the widely used Gradient Descent/Ascent procedure, called “Optimistic Gradient Descent/Ascent (OGDA)”, exhibits last-iterate convergence to saddle points in unconstrained convex-concave min-max optimization problems. We show that the same holds true in the more general problem of constrained min-max optimization under a variant of the no-regret Multiplicative-Weights-Update method called “Optimistic Multiplicative-Weights Update (OMWU)”. This answers an open question of Syrgkanis et al. [SALS15]. The proof of our result requires fundamentally different techniques from those that exist in no-regret learning literature and the aforementioned papers. We show that OMWU monotonically improves the Kullback-Leibler divergence of the current iterate to the (appropriately normalized) min-max solution until it enters a neighborhood of the solution. Inside that neighborhood we show that OMWU becomes a contracting map converging to the exact solution. We believe that our techniques will be useful in the analysis of the last iterate of other learning algorithms.
Tasks
Published 2018-07-11
URL https://arxiv.org/abs/1807.04252v3
PDF https://arxiv.org/pdf/1807.04252v3.pdf
PWC https://paperswithcode.com/paper/last-iterate-convergence-zero-sum-games-and
Repo
Framework
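
To make the OMWU update in the abstract above concrete, here is a minimal sketch on a small zero-sum matrix game min_x max_y x^T A y over probability simplices. It uses the commonly stated optimistic form, in which each player takes a multiplicative-weights step with the gradient estimate 2 g_t - g_{t-1}. The payoff matrix, step size, iteration count, and function names are assumptions chosen for illustration, not values from the paper.

```python
import math


def normalize(weights):
    """Rescale positive weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def omwu(A, eta=0.1, steps=5000):
    """Optimistic multiplicative weights for min_x max_y x^T A y.

    Returns the last iterates (x, y), not their time averages.
    """
    n, m = len(A), len(A[0])
    x = [1.0 / n] * n
    y = [1.0 / m] * m

    def grad_x(y_vec):  # loss vector for the min player: A y
        return [sum(A[i][j] * y_vec[j] for j in range(m)) for i in range(n)]

    def grad_y(x_vec):  # payoff vector for the max player: A^T x
        return [sum(A[i][j] * x_vec[i] for i in range(n)) for j in range(m)]

    # With prev == current gradients, the first step is a plain MWU step.
    gx_prev, gy_prev = grad_x(y), grad_y(x)
    for _ in range(steps):
        gx, gy = grad_x(y), grad_y(x)  # both computed before either update
        x = normalize([x[i] * math.exp(-eta * (2 * gx[i] - gx_prev[i]))
                       for i in range(n)])
        y = normalize([y[j] * math.exp(eta * (2 * gy[j] - gy_prev[j]))
                       for j in range(m)])
        gx_prev, gy_prev = gx, gy
    return x, y


# Example game (an assumption) whose only equilibrium is
# x* = y* = (0.4, 0.6), obtained by making the opponent indifferent.
A = [[2.0, -1.0], [-1.0, 1.0]]
x_last, y_last = omwu(A)
print(x_last, y_last)
```

The point of the example is that the final iterate itself approaches the equilibrium. Plain multiplicative weights on the same game is generally only guaranteed to converge in its time average.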

Understanding Unequal Gender Classification Accuracy from Face Images

Title Understanding Unequal Gender Classification Accuracy from Face Images
Authors Vidya Muthukumar, Tejaswini Pedapati, Nalini Ratha, Prasanna Sattigeri, Chai-Wah Wu, Brian Kingsbury, Abhishek Kumar, Samuel Thomas, Aleksandra Mojsilovic, Kush R. Varshney
Abstract Recent work shows unequal performance of commercial face classification services in the gender classification task across intersectional groups defined by skin type and gender. Accuracy on dark-skinned females is significantly worse than on any other group. In this paper, we conduct several analyses to try to uncover the reason for this gap. The main finding, perhaps surprisingly, is that skin type is not the driver. This conclusion is reached via stability experiments that vary an image’s skin type via color-theoretic methods, namely luminance mode-shift and optimal transport. A second suspect, hair length, is also shown not to be the driver via experiments on face images cropped to exclude the hair. Finally, using contrastive post-hoc explanation techniques for neural networks, we bring forth evidence suggesting that differences in lip, eye and cheek structure across ethnicity lead to the differences. Further, lip and eye makeup are seen as strong predictors for a female face, which is a troubling propagation of a gender stereotype.
Tasks
Published 2018-11-30
URL http://arxiv.org/abs/1812.00099v1
PDF http://arxiv.org/pdf/1812.00099v1.pdf
PWC https://paperswithcode.com/paper/understanding-unequal-gender-classification
Repo
Framework