Paper Group ANR 926
Semantic Video Segmentation: A Review on Recent Approaches. Non-local RoI for Cross-Object Perception. Lung Structures Enhancement in Chest Radiographs via CT based FCNN Training. CRED: A Deep Residual Network of Convolutional and Recurrent Units for Earthquake Signal Detection. Steering Social Activity: A Stochastic Optimal Control Point Of View. …
Semantic Video Segmentation: A Review on Recent Approaches
Title | Semantic Video Segmentation: A Review on Recent Approaches |
Authors | Mohammad Hajizadeh Saffar, Mohsen Fayyaz, Mohammad Sabokrou, Mahmood Fathy |
Abstract | This paper gives an overview of semantic segmentation, consisting of an explanation of the field, its status and relation to other fundamental vision tasks, and the different datasets and common evaluation parameters that have been used by researchers. The survey also includes an overall review of a variety of recent approaches (RDF, MRF, CRF, etc.), their advantages and challenges, and shows the superiority of CNN-based semantic segmentation systems on the CamVid and NYUDv2 datasets. In addition, some areas that are promising for future work are mentioned. |
Tasks | Semantic Segmentation, Video Semantic Segmentation |
Published | 2018-06-16 |
URL | http://arxiv.org/abs/1806.06172v1 |
http://arxiv.org/pdf/1806.06172v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-video-segmentation-a-review-on |
Repo | |
Framework | |
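Among the common evaluation parameters the survey covers, mean Intersection-over-Union (mIoU) is the most widely reported one for semantic segmentation. Below is a minimal NumPy sketch of the metric; the label layout and class count are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union for dense label maps (classes absent from both maps are skipped)."""
    ious = []
    for c in range(num_classes):
        pred_c, target_c = pred == c, target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:                      # class not present in prediction or ground truth
            continue
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Toy usage on 2x3 label maps with 3 classes
pred   = np.array([[0, 1, 1], [2, 2, 0]])
target = np.array([[0, 1, 2], [2, 2, 0]])
print(mean_iou(pred, target, num_classes=3))
```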
Non-local RoI for Cross-Object Perception
Title | Non-local RoI for Cross-Object Perception |
Authors | Shou-Yao Roy Tseng, Hwann-Tzong Chen, Shao-Heng Tai, Tyng-Luh Liu |
Abstract | We present a generic and flexible module that encodes region proposals by both their intrinsic features and the extrinsic correlations to the others. The proposed non-local region of interest (NL-RoI) can be seamlessly adapted into different generalized R-CNN architectures to better address various perception tasks. Observe that existing techniques from R-CNN treat RoIs independently and perform the prediction solely based on image features within each region proposal. However, the pairwise relationships between proposals could further provide useful information for detection and segmentation. NL-RoI is thus formulated to enrich each RoI representation with the information from all other RoIs, and yield a simple, low-cost, yet effective module for region-based convolutional networks. Our experimental results show that NL-RoI can improve the performance of Faster/Mask R-CNN for object detection and instance segmentation. |
Tasks | Instance Segmentation, Object Detection, Semantic Segmentation |
Published | 2018-11-25 |
URL | http://arxiv.org/abs/1811.10002v1 |
http://arxiv.org/pdf/1811.10002v1.pdf | |
PWC | https://paperswithcode.com/paper/non-local-roi-for-cross-object-perception |
Repo | |
Framework | |
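The core of NL-RoI is letting every region proposal attend to all the others, so each RoI representation is enriched by a weighted sum of its peers. The PyTorch sketch below shows this generic non-local pattern over pooled RoI feature vectors; the embedding sizes and the exact parameterization are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class NonLocalRoI(nn.Module):
    """Sketch of a non-local block over RoI feature vectors (N RoIs x C dims)."""
    def __init__(self, channels, embed_dim=256):
        super().__init__()
        self.query = nn.Linear(channels, embed_dim)
        self.key   = nn.Linear(channels, embed_dim)
        self.value = nn.Linear(channels, channels)

    def forward(self, roi_feats):                               # roi_feats: (N, C)
        q, k, v = self.query(roi_feats), self.key(roi_feats), self.value(roi_feats)
        attn = torch.softmax(q @ k.t() / q.shape[-1] ** 0.5, dim=-1)   # (N, N) pairwise weights
        return roi_feats + attn @ v                             # residual: each RoI sees all others

feats = torch.randn(8, 1024)                                    # e.g. 8 proposals with 1024-d pooled features
print(NonLocalRoI(1024)(feats).shape)                           # torch.Size([8, 1024])
```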
Lung Structures Enhancement in Chest Radiographs via CT based FCNN Training
Title | Lung Structures Enhancement in Chest Radiographs via CT based FCNN Training |
Authors | Ophir Gozes, Hayit Greenspan |
Abstract | The abundance of overlapping anatomical structures appearing in chest radiographs can reduce the performance of lung pathology detection by automated algorithms (CAD) as well as the human reader. In this paper, we present a deep learning based image processing technique for enhancing the contrast of soft lung structures in chest radiographs using Fully Convolutional Neural Networks (FCNN). Two 2D FCNN architectures were trained to accomplish the task: The first performs 2D lung segmentation which is used for normalization of the lung area. The second FCNN is trained to extract lung structures. To create the training images, we employed Simulated X-Ray or Digitally Reconstructed Radiographs (DRR) derived from 516 scans belonging to the LIDC-IDRI dataset. By first segmenting the lungs in the CT domain, we are able to create a dataset of 2D lung masks to be used for training the segmentation FCNN. For training the extraction FCNN, we create DRR images of only voxels belonging to the 3D lung segmentation which we call “Lung X-ray” and use them as target images. Once the lung structures are extracted, the original image can be enhanced by fusing the original input x-ray and the synthesized “Lung X-ray”. We show that our enhancement technique is applicable to real x-ray data, and display our results on the recently released NIH Chest X-Ray-14 dataset. We see promising results when training a DenseNet-121 based architecture to work directly on the lung enhanced X-ray images. |
Tasks | |
Published | 2018-10-14 |
URL | http://arxiv.org/abs/1810.05989v1 |
http://arxiv.org/pdf/1810.05989v1.pdf | |
PWC | https://paperswithcode.com/paper/lung-structures-enhancement-in-chest |
Repo | |
Framework | |
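The training pipeline hinges on projecting CT volumes into DRRs, projecting only lung-masked voxels to obtain the "Lung X-ray" target, and finally fusing the synthesized lung image with the input radiograph. A toy NumPy sketch of those three steps follows, using a naive parallel sum projection and a fixed blending weight; both are assumptions, and the paper's DRR generation and fusion are more involved.

```python
import numpy as np

def drr(ct_volume, axis=1):
    """Toy parallel-projection DRR: integrate attenuation along one axis."""
    return ct_volume.sum(axis=axis)

def lung_xray(ct_volume, lung_mask, axis=1):
    """Project only voxels inside the lung segmentation (the 'Lung X-ray' target)."""
    return (ct_volume * lung_mask).sum(axis=axis)

def enhance(xray, lung_projection, alpha=0.5):
    """Blend the original radiograph with the (predicted) lung projection."""
    norm = lambda im: (im - im.min()) / (im.max() - im.min() + 1e-8)
    return (1 - alpha) * norm(xray) + alpha * norm(lung_projection)

ct = np.random.rand(64, 64, 64)                    # stand-in CT volume
mask = np.zeros_like(ct)
mask[16:48, :, 16:48] = 1.0                        # stand-in 3D lung segmentation
print(enhance(drr(ct), lung_xray(ct, mask)).shape) # (64, 64)
```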
CRED: A Deep Residual Network of Convolutional and Recurrent Units for Earthquake Signal Detection
Title | CRED: A Deep Residual Network of Convolutional and Recurrent Units for Earthquake Signal Detection |
Authors | S. Mostafa Mousavi, Weiqiang Zhu, Yixiao Sheng, Gregory C. Beroza |
Abstract | Earthquake signal detection is at the core of observational seismology. A good detection algorithm should be sensitive to small and weak events with a variety of waveform shapes, robust to background noise and non-earthquake signals, and efficient for processing large data volumes. Here, we introduce the Cnn-Rnn Earthquake Detector (CRED), a detector based on deep neural networks. The network uses a combination of convolutional layers and bi-directional long short-term memory units in a residual structure. It learns the time-frequency characteristics of the dominant phases in an earthquake signal from three-component data recorded on a single station. We trained the network using 500,000 seismograms (250k associated with tectonic earthquakes and 250k identified as noise) recorded in Northern California and tested it, obtaining an F-score of 99.95. The robustness of the trained model with respect to the noise level and non-earthquake signals is shown by applying it to a set of semi-synthetic signals. The model is applied to one month of continuous data recorded in Central Arkansas to demonstrate its efficiency, generalization, and sensitivity. Our model is able to detect more than 700 microearthquakes as small as -1.3 ML induced during hydraulic fracturing far from the training region. The performance of the model is compared with STA/LTA, template matching, and FAST algorithms. Our results indicate efficient and reliable performance of CRED. This framework holds great promise in lowering the detection threshold while minimizing false positive detection rates. |
Tasks | |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01965v1 |
http://arxiv.org/pdf/1810.01965v1.pdf | |
PWC | https://paperswithcode.com/paper/cred-a-deep-residual-network-of-convolutional |
Repo | |
Framework | |
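CRED turns three-component spectrograms into a per-frame detection probability by stacking convolutional layers followed by bidirectional LSTM units. The PyTorch sketch below mirrors that conv-then-recurrent layout; the layer sizes are arbitrary and the paper's residual connections are omitted for brevity.

```python
import torch
import torch.nn as nn

class CREDSketch(nn.Module):
    """Rough analogue of CRED: conv blocks -> bidirectional LSTM -> per-frame detection."""
    def __init__(self, in_channels=3, hidden=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 8, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),                    # pool frequency, keep time resolution
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),
        )
        self.lstm = nn.LSTM(input_size=16 * 16, hidden_size=hidden,
                            bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, spec):                         # spec: (B, 3, freq=64, time=T)
        f = self.conv(spec)                          # (B, 16, 16, T)
        b, c, fr, t = f.shape
        seq = f.permute(0, 3, 1, 2).reshape(b, t, c * fr)
        out, _ = self.lstm(seq)
        return torch.sigmoid(self.head(out)).squeeze(-1)   # (B, T) detection probability

x = torch.randn(2, 3, 64, 100)                       # three-component spectrograms
print(CREDSketch()(x).shape)                         # torch.Size([2, 100])
```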
Steering Social Activity: A Stochastic Optimal Control Point Of View
Title | Steering Social Activity: A Stochastic Optimal Control Point Of View |
Authors | Ali Zarezade, Abir De, Utkarsh Upadhyay, Hamid R. Rabiee, Manuel Gomez-Rodriguez |
Abstract | User engagement in online social networking depends critically on the level of social activity in the corresponding platform–the number of online actions, such as posts, shares or replies, taken by their users. Can we design data-driven algorithms to increase social activity? At a user level, such algorithms may increase activity by helping users decide when to take an action to be more likely to be noticed by their peers. At a network level, they may increase activity by incentivizing a few influential users to take more actions, which in turn will trigger additional actions by other users. In this paper, we model social activity using the framework of marked temporal point processes, derive an alternate representation of these processes using stochastic differential equations (SDEs) with jumps and, exploiting this alternate representation, develop two efficient online algorithms with provable guarantees to steer social activity both at a user and at a network level. In doing so, we establish a previously unexplored connection between optimal control of jump SDEs and doubly stochastic marked temporal point processes, which is of independent interest. Finally, we experiment both with synthetic and real data gathered from Twitter and show that our algorithms consistently steer social activity more effectively than the state of the art. |
Tasks | Point Processes |
Published | 2018-02-19 |
URL | http://arxiv.org/abs/1802.07244v1 |
http://arxiv.org/pdf/1802.07244v1.pdf | |
PWC | https://paperswithcode.com/paper/steering-social-activity-a-stochastic-optimal |
Repo | |
Framework | |
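The control algorithms act on marked temporal point processes whose intensity is itself stochastic; a Hawkes process with an exponential kernel is the canonical self-exciting example of such activity. Below is a minimal Ogata-thinning simulator of that process; the parameters are illustrative and the paper's jump-SDE control policy is not reproduced here.

```python
import numpy as np

def intensity(t, events, mu, alpha, beta):
    """Conditional intensity of a Hawkes process with an exponential kernel."""
    if not events:
        return mu
    return mu + alpha * np.sum(np.exp(-beta * (t - np.asarray(events))))

def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Ogata thinning: propose with a dominating rate, accept with prob lambda(t)/lambda_bar."""
    rng = np.random.default_rng(seed)
    events, t = [], 0.0
    while True:
        lam_bar = intensity(t, events, mu, alpha, beta)   # dominates until the next accepted jump
        t += rng.exponential(1.0 / lam_bar)
        if t >= horizon:
            return events
        if rng.uniform() <= intensity(t, events, mu, alpha, beta) / lam_bar:
            events.append(t)

events = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.0, horizon=100.0)
print(len(events), "actions simulated over the horizon")
```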
Improving the Universality and Learnability of Neural Programmer-Interpreters with Combinator Abstraction
Title | Improving the Universality and Learnability of Neural Programmer-Interpreters with Combinator Abstraction |
Authors | Da Xiao, Jo-Yu Liao, Xingyuan Yuan |
Abstract | To overcome the limitations of Neural Programmer-Interpreters (NPI) in universality and learnability, we propose the incorporation of combinator abstraction into neural programming and a new NPI architecture to support this abstraction, which we call Combinatory Neural Programmer-Interpreter (CNPI). Combinator abstraction dramatically reduces the number and complexity of programs that need to be interpreted by the core controller of CNPI, while still allowing the CNPI to represent and interpret arbitrarily complex programs by the collaboration of the core with the other components. We propose a small set of four combinators to capture the most pervasive programming patterns. Due to the finiteness and simplicity of this combinator set and the offloading of some burden of interpretation from the core, we are able to construct a CNPI that is universal with respect to the set of all combinatorizable programs, which is adequate for solving most algorithmic tasks. Moreover, besides supervised training on execution traces, CNPI can be trained by policy gradient reinforcement learning with appropriately designed curricula. |
Tasks | |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.02696v1 |
http://arxiv.org/pdf/1802.02696v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-the-universality-and-learnability |
Repo | |
Framework | |
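Combinator abstraction, in its conventional functional-programming sense, means composing arbitrary behaviours from a small, fixed set of higher-order "glue" programs. The toy Python sketch below illustrates that general idea with two hand-written combinators; the names and the set itself are illustrative and are not the paper's four combinators.

```python
# Toy illustration of combinator abstraction (NOT the paper's combinator set):
# a fixed set of higher-order programs composes behaviours from primitive operations.

def seq(*programs):
    """Run sub-programs one after another, threading the state through."""
    def run(state):
        for p in programs:
            state = p(state)
        return state
    return run

def repeat_until(cond, program):
    """Apply a sub-program until a condition on the state holds."""
    def run(state):
        while not cond(state):
            state = program(state)
        return state
    return run

inc = lambda s: s + 1          # primitive operations
double = lambda s: 2 * s

add_three = seq(inc, inc, inc)
grow_to_100 = repeat_until(lambda s: s >= 100, double)
print(add_three(4))            # 7
print(grow_to_100(3))          # 192
```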
Spectral Analysis of Jet Substructure with Neural Networks: Boosted Higgs Case
Title | Spectral Analysis of Jet Substructure with Neural Networks: Boosted Higgs Case |
Authors | Sung Hak Lim, Mihoko M. Nojiri |
Abstract | Jets from boosted heavy particles have a typical angular scale which can be used to distinguish them from QCD jets. We introduce a machine learning strategy for jet substructure analysis using a spectral function on the angular scale. The angular spectrum allows us to scan energy deposits over the angle between a pair of particles in a highly visual way. We set up an artificial neural network (ANN) to identify characteristic shapes of the spectra of jets from heavy-particle decays. Taking Higgs jets and QCD jets as examples, we show that an ANN taking the angular spectrum as input has performance similar to that of existing taggers. In addition, some improvement is seen when extra radiation is present. Notably, the new algorithm automatically combines the information from multi-point correlations in the jet. |
Tasks | |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03312v2 |
http://arxiv.org/pdf/1807.03312v2.pdf | |
PWC | https://paperswithcode.com/paper/spectral-analysis-of-jet-substructure-with |
Repo | |
Framework | |
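The angular spectrum scans energy deposits over the angle between pairs of constituents; a standard concrete realization is a pT-weighted two-point correlation binned in angular separation ΔR. The NumPy sketch below computes such a spectrum for one jet; the binning, weighting, and normalization are assumptions, not the paper's exact definition.

```python
import numpy as np

def angular_spectrum(pt, eta, phi, bins=20, r_max=1.0):
    """pT-weighted two-point correlation of jet constituents vs. angular distance (includes i=j self-pairs)."""
    d_eta = eta[:, None] - eta[None, :]
    d_phi = np.angle(np.exp(1j * (phi[:, None] - phi[None, :])))   # wrap to (-pi, pi]
    dr = np.sqrt(d_eta ** 2 + d_phi ** 2)
    w = pt[:, None] * pt[None, :]
    hist, edges = np.histogram(dr.ravel(), bins=bins, range=(0.0, r_max), weights=w.ravel())
    return hist / (pt.sum() ** 2), edges

rng = np.random.default_rng(1)
pt, eta, phi = rng.exponential(10, 50), rng.normal(0, 0.3, 50), rng.normal(0, 0.3, 50)
spectrum, edges = angular_spectrum(pt, eta, phi)
print(spectrum.round(3))
```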
On the achievability of blind source separation for high-dimensional nonlinear source mixtures
Title | On the achievability of blind source separation for high-dimensional nonlinear source mixtures |
Authors | Takuya Isomura, Taro Toyoizumi |
Abstract | For many years, a combination of principal component analysis (PCA) and independent component analysis (ICA) has been used as a blind source separation (BSS) technique to separate hidden sources of natural data. However, it is unclear why these linear methods work well because most real-world data involve nonlinear mixtures of sources. We show that a cascade of PCA and ICA can solve this nonlinear BSS problem accurately as the variety of input signals increases. Specifically, we present two theorems that guarantee asymptotically zero-error BSS when sources are mixed by a feedforward network with two processing layers. Our first theorem analytically quantifies the performance of an optimal linear encoder that reconstructs independent sources. Zero-error is asymptotically reached when the number of sources is large and the numbers of inputs and nonlinear bases are large relative to the number of sources. The next question involves finding an optimal linear encoder without observing the underlying sources. Our second theorem guarantees that PCA can reliably extract all the subspace represented by the optimal linear encoder, so that a subsequent application of ICA can separate all sources. Thereby, for almost all nonlinear generative processes with sufficient variety, the cascade of PCA and ICA performs asymptotically zero-error BSS in an unsupervised manner. We analytically and numerically validate the theorems. These results highlight the utility of linear BSS techniques for accurately recovering nonlinearly mixed sources when observations are sufficiently diverse. We also discuss a possible biological BSS implementation. |
Tasks | |
Published | 2018-08-02 |
URL | http://arxiv.org/abs/1808.00668v1 |
http://arxiv.org/pdf/1808.00668v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-achievability-of-blind-source |
Repo | |
Framework | |
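The cascade analysed in the theorems is literally PCA followed by ICA applied to a two-layer nonlinear mixture of independent sources. A small scikit-learn sketch of that setup follows; the dimensions and the tanh mixing network are illustrative, and the theorems concern the asymptotic regime where the numbers of sources, inputs, and nonlinear bases are all large.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)
n_sources, n_inputs, n_samples = 4, 200, 5000

S = rng.laplace(size=(n_samples, n_sources))              # independent hidden sources
W1 = rng.normal(size=(n_sources, n_inputs))               # layer 1: linear mixing into nonlinear bases
W2 = rng.normal(size=(n_inputs, n_inputs)) / np.sqrt(n_inputs)
X = np.tanh(S @ W1) @ W2                                   # two-layer nonlinear mixture (observations)

pca = PCA(n_components=n_sources)                          # step 1: compress to the source subspace
ica = FastICA(n_components=n_sources, max_iter=1000)       # step 2: rotate to independent components
recovered = ica.fit_transform(pca.fit_transform(X))

# Absolute correlation between recovered components and true sources;
# per the paper's theorems this should approach 1 as the dimensions grow.
corr = np.abs(np.corrcoef(recovered.T, S.T)[:n_sources, n_sources:])
print(corr.max(axis=1).round(2))
```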
How to Blend a Robot within a Group of Zebrafish: Achieving Social Acceptance through Real-time Calibration of a Multi-level Behavioural Model
Title | How to Blend a Robot within a Group of Zebrafish: Achieving Social Acceptance through Real-time Calibration of a Multi-level Behavioural Model |
Authors | Leo Cazenille, Yohann Chemtob, Frank Bonnet, Alexey Gribovskiy, Francesco Mondada, Nicolas Bredeche, Jose Halloy |
Abstract | We have previously shown how to socially integrate a fish robot into a group of zebrafish thanks to biomimetic behavioural models. The models have to be calibrated on experimental data to present correct behavioural features. This calibration is essential to enhance the social integration of the robot into the group. When calibrated, the behavioural model of fish behaviour is implemented to drive a robot with closed-loop control of social interactions into a group of zebrafish. This approach can be useful for forming mixed groups and for studying individual and collective animal behaviour by using biomimetic autonomous robots capable of responding to the animals in long-lasting experiments. Here, we present a methodology for continuous real-time calibration and refinement of a multi-level behavioural model. The real-time calibration, performed by an evolutionary algorithm, is based on simulating the model so that it matches the fish behaviour observed in real time. The calibrated model is updated on the robot and tested during the experiments. This method makes it possible to cope with changes in the dynamics of fish behaviour. Moreover, each fish presents individual behavioural differences. Thus, each trial is done with naive fish groups that display behavioural variability. This real-time calibration methodology can optimise the robot behaviours during the experiments. Our implementation of this methodology runs on three different computers that perform individual tracking, data analysis, multi-objective evolutionary algorithms, simulation of the fish robot and adaptation of the robot behavioural models, all in real time. |
Tasks | Calibration |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11371v1 |
http://arxiv.org/pdf/1805.11371v1.pdf | |
PWC | https://paperswithcode.com/paper/how-to-blend-a-robot-within-a-group-of |
Repo | |
Framework | |
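At its core the methodology is an evolutionary loop that keeps re-fitting the behavioural model's parameters so that simulated behaviour matches the fish behaviour observed so far during the experiment. The sketch below shows such a loop in schematic form; the stand-in behavioural model, fitness measure, and (mu+lambda)-style selection are illustrative assumptions rather than the authors' multi-objective setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_model(params, n_steps=200):
    """Stand-in behavioural model: returns a summary statistic of simulated motion."""
    speed, turn_bias = params
    headings = np.cumsum(rng.normal(turn_bias, 0.3, n_steps))
    steps = speed * np.stack([np.cos(headings), np.sin(headings)], axis=1)
    return steps.std(axis=0)                      # toy behavioural signature

def fitness(params, observed_signature):
    """Negative distance between simulated and observed behavioural signatures."""
    return -np.linalg.norm(simulate_model(params) - observed_signature)

def calibrate(observed_signature, pop_size=20, generations=30, sigma=0.1):
    """(mu+lambda)-style loop: mutate, evaluate against observations, keep the best."""
    pop = rng.uniform(0.1, 2.0, size=(pop_size, 2))
    for _ in range(generations):
        children = pop + rng.normal(0, sigma, pop.shape)
        both = np.vstack([pop, children])
        scores = np.array([fitness(p, observed_signature) for p in both])
        pop = both[np.argsort(scores)[-pop_size:]]
    return pop[np.argmax([fitness(p, observed_signature) for p in pop])]

observed = np.array([0.8, 0.8])                   # would come from real-time fish tracking
print(calibrate(observed).round(2))
```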
Enhancing Evolutionary Conversion Rate Optimization via Multi-armed Bandit Algorithms
Title | Enhancing Evolutionary Conversion Rate Optimization via Multi-armed Bandit Algorithms |
Authors | Xin Qiu, Risto Miikkulainen |
Abstract | Conversion rate optimization means designing web interfaces such that more visitors perform a desired action (such as register or purchase) on the site. One promising approach, implemented in Sentient Ascend, is to optimize the design using evolutionary algorithms, evaluating each candidate design online with actual visitors. Because such evaluations are costly and noisy, several challenges emerge: How can available visitor traffic be used most efficiently? How can good solutions be identified most reliably? How can a high conversion rate be maintained during optimization? This paper proposes a new technique to address these issues. Traffic is allocated to candidate solutions using a multi-armed bandit algorithm, using more traffic on those evaluations that are most useful. In a best-arm identification mode, the best candidate can be identified reliably at the end of evolution, and in a campaign mode, the overall conversion rate can be optimized throughout the entire evolution process. Multi-armed bandit algorithms thus improve performance and reliability of machine discovery in noisy real-world environments. |
Tasks | |
Published | 2018-03-10 |
URL | http://arxiv.org/abs/1803.03737v3 |
http://arxiv.org/pdf/1803.03737v3.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-evolutionary-conversion-rate |
Repo | |
Framework | |
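The key mechanism is routing visitor traffic to candidate designs with a multi-armed bandit so that more traffic goes to the most informative or best-performing candidates. The sketch below uses Thompson sampling over Bernoulli conversion rates as one standard allocator; the paper's specific best-arm-identification and campaign-mode algorithms are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
true_rates = np.array([0.030, 0.034, 0.041, 0.028])   # unknown conversion rates of candidate designs
alpha = np.ones(4)                                     # Beta(1, 1) prior per arm
beta = np.ones(4)

conversions = 0
for visitor in range(20000):
    arm = int(np.argmax(rng.beta(alpha, beta)))        # Thompson sampling: route visitor to sampled best arm
    reward = rng.uniform() < true_rates[arm]
    conversions += reward
    alpha[arm] += reward
    beta[arm] += 1 - reward

print("overall conversion rate:", conversions / 20000)
print("posterior means:", (alpha / (alpha + beta)).round(4))
print("identified best design:", int(np.argmax(alpha / (alpha + beta))))
```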
Driving Policy Transfer via Modularity and Abstraction
Title | Driving Policy Transfer via Modularity and Abstraction |
Authors | Matthias Müller, Alexey Dosovitskiy, Bernard Ghanem, Vladlen Koltun |
Abstract | End-to-end approaches to autonomous driving have high sample complexity and are difficult to scale to realistic urban driving. Simulation can help end-to-end driving systems by providing a cheap, safe, and diverse training environment. Yet training driving policies in simulation brings up the problem of transferring such policies to the real world. We present an approach to transferring driving policies from simulation to reality via modularity and abstraction. Our approach is inspired by classic driving systems and aims to combine the benefits of modular architectures and end-to-end deep learning approaches. The key idea is to encapsulate the driving policy such that it is not directly exposed to raw perceptual input or low-level vehicle dynamics. We evaluate the presented approach in simulated urban environments and in the real world. In particular, we transfer a driving policy trained in simulation to a 1/5-scale robotic truck that is deployed in a variety of conditions, with no finetuning, on two continents. The supplementary video can be viewed at https://youtu.be/BrMDJqI6H5U |
Tasks | Autonomous Driving |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09364v3 |
http://arxiv.org/pdf/1804.09364v3.pdf | |
PWC | https://paperswithcode.com/paper/driving-policy-transfer-via-modularity-and |
Repo | |
Framework | |
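The transferable piece is the interface: the driving policy never sees raw pixels or low-level vehicle dynamics, only a perception output (e.g., a segmentation map) on one side and a high-level plan handed to a separate low-level controller on the other. The sketch below shows those module boundaries with stand-in implementations; the class names, waypoint format, and proportional controller are illustrative assumptions.

```python
import numpy as np

class PerceptionModule:
    """Image -> semantic segmentation (stand-in; a trained network in practice)."""
    def forward(self, image):
        return np.zeros(image.shape[:2], dtype=np.int64)      # per-pixel class ids

class DrivingPolicy:
    """Segmentation + command -> waypoints in the vehicle frame (never sees raw pixels)."""
    def forward(self, segmentation, command="follow_lane"):
        return np.array([[1.0, 0.0], [2.0, 0.1]])             # stand-in waypoints (x, y) in metres

class LowLevelController:
    """Waypoints -> steering/throttle via a simple proportional rule."""
    def control(self, waypoints, gain=0.5):
        steering = gain * np.arctan2(waypoints[0, 1], waypoints[0, 0])
        return {"steering": float(steering), "throttle": 0.4}

image = np.zeros((88, 200, 3))
seg = PerceptionModule().forward(image)
wps = DrivingPolicy().forward(seg, command="turn_left")
print(LowLevelController().control(wps))
```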
Balancing Shared Autonomy with Human-Robot Communication
Title | Balancing Shared Autonomy with Human-Robot Communication |
Authors | Rosario Scalise, Yonatan Bisk, Maxwell Forbes, Daqing Yi, Yejin Choi, Siddhartha Srinivasa |
Abstract | Robotic agents that share autonomy with a human should leverage human domain knowledge and account for their preferences when completing a task. This extra knowledge can dramatically improve plan efficiency and user satisfaction, but these gains are lost if communicating with a robot is taxing and unnatural. In this paper, we show how viewing human-robot language through the lens of shared autonomy explains the efficiency versus cognitive load trade-offs humans make when deciding how cooperative and explicit to make their instructions. |
Tasks | |
Published | 2018-05-20 |
URL | http://arxiv.org/abs/1805.07719v1 |
http://arxiv.org/pdf/1805.07719v1.pdf | |
PWC | https://paperswithcode.com/paper/balancing-shared-autonomy-with-human-robot |
Repo | |
Framework | |
Facial Action Unit Detection Using Attention and Relation Learning
Title | Facial Action Unit Detection Using Attention and Relation Learning |
Authors | Zhiwen Shao, Zhilei Liu, Jianfei Cai, Yunsheng Wu, Lizhuang Ma |
Abstract | Attention mechanisms have recently attracted increasing attention in the field of facial action unit (AU) detection. By finding the region of interest of each AU with the attention mechanism, AU-related local features can be captured. Most of the existing attention-based AU detection works use prior knowledge to predefine fixed attentions or refine the predefined attentions within a small range, which limits their capacity to model various AUs. In this paper, we propose an end-to-end deep learning based attention and relation learning framework for AU detection with only AU labels, which has not been explored before. In particular, multi-scale features shared by each AU are learned first, and then both channel-wise and spatial attentions are adaptively learned to select and extract AU-related local features. Moreover, pixel-level relations for AUs are further captured to refine spatial attentions so as to extract more relevant local features. Without changing the network architecture, our framework can be easily extended for AU intensity estimation. Extensive experiments show that our framework (i) soundly outperforms the state-of-the-art methods for both AU detection and AU intensity estimation on the challenging BP4D, DISFA, FERA 2015 and BP4D+ benchmarks, (ii) can adaptively capture the correlated regions of each AU, and (iii) also works well under severe occlusions and large poses. |
Tasks | Action Unit Detection, Facial Action Unit Detection |
Published | 2018-08-10 |
URL | https://arxiv.org/abs/1808.03457v3 |
https://arxiv.org/pdf/1808.03457v3.pdf | |
PWC | https://paperswithcode.com/paper/facial-action-unit-detection-using-attention |
Repo | |
Framework | |
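The framework adaptively learns both channel-wise and spatial attention over multi-scale features shared across AUs. The PyTorch sketch below shows that generic channel-plus-spatial attention pattern; it omits the paper's pixel-level relation learning and AU-specific branches, and the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """Learned channel-wise and spatial attention over a shared feature map (sketch)."""
    def __init__(self, channels):
        super().__init__()
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, channels // 4), nn.ReLU(),
            nn.Linear(channels // 4, channels), nn.Sigmoid())
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, feats):                                # feats: (B, C, H, W)
        ch = self.channel_fc(feats.mean(dim=(2, 3)))          # (B, C) channel weights
        feats = feats * ch[:, :, None, None]                  # select AU-relevant channels
        sp = self.spatial_conv(feats)                         # (B, 1, H, W) spatial attention map
        return feats * sp                                     # focus on AU-related regions

x = torch.randn(2, 64, 44, 44)
print(ChannelSpatialAttention(64)(x).shape)                   # torch.Size([2, 64, 44, 44])
```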
Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization
Title | Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization |
Authors | Constantinos Daskalakis, Ioannis Panageas |
Abstract | Motivated by applications in Game Theory, Optimization, and Generative Adversarial Networks, recent work of Daskalakis et al. [DISZ17] and follow-up work of Liang and Stokes [LiangS18] have established that a variant of the widely used Gradient Descent/Ascent procedure, called "Optimistic Gradient Descent/Ascent (OGDA)", exhibits last-iterate convergence to saddle points in unconstrained convex-concave min-max optimization problems. We show that the same holds true in the more general problem of constrained min-max optimization under a variant of the no-regret Multiplicative-Weights-Update method called "Optimistic Multiplicative-Weights Update (OMWU)". This answers an open question of Syrgkanis et al. [SALS15]. The proof of our result requires fundamentally different techniques from those that exist in the no-regret learning literature and the aforementioned papers. We show that OMWU monotonically improves the Kullback-Leibler divergence of the current iterate to the (appropriately normalized) min-max solution until it enters a neighborhood of the solution. Inside that neighborhood we show that OMWU becomes a contracting map converging to the exact solution. We believe that our techniques will be useful in the analysis of the last iterate of other learning algorithms. |
Tasks | |
Published | 2018-07-11 |
URL | https://arxiv.org/abs/1807.04252v3 |
https://arxiv.org/pdf/1807.04252v3.pdf | |
PWC | https://paperswithcode.com/paper/last-iterate-convergence-zero-sum-games-and |
Repo | |
Framework | |
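OMWU keeps each player's strategy on the simplex and replaces the plain multiplicative-weights gradient with an optimistic "twice current minus previous" one. The NumPy sketch below runs one common form of this update on rock-paper-scissors, where the last iterate (not just the time average) should approach the uniform equilibrium; the step size and horizon are illustrative.

```python
import numpy as np

def omwu(A, eta=0.1, steps=2000):
    """Optimistic Multiplicative-Weights Update for the zero-sum game min_x max_y x^T A y."""
    n, m = A.shape
    x, y = np.ones(n) / n, np.ones(m) / m
    g_x_prev, g_y_prev = A @ y, A.T @ x              # previous-round gradients
    for _ in range(steps):
        g_x, g_y = A @ y, A.T @ x
        x = x * np.exp(-eta * (2 * g_x - g_x_prev))  # optimistic gradient for the minimizer
        y = y * np.exp( eta * (2 * g_y - g_y_prev))  # optimistic gradient for the maximizer
        x, y = x / x.sum(), y / y.sum()              # project back to the simplex (renormalize)
        g_x_prev, g_y_prev = g_x, g_y
    return x, y

A = np.array([[0., 1., -1.],                         # rock-paper-scissors payoff (row player minimizes)
              [-1., 0., 1.],
              [1., -1., 0.]])
x, y = omwu(A)
print(x.round(3), y.round(3))                        # last iterate approaches (1/3, 1/3, 1/3)
```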
Understanding Unequal Gender Classification Accuracy from Face Images
Title | Understanding Unequal Gender Classification Accuracy from Face Images |
Authors | Vidya Muthukumar, Tejaswini Pedapati, Nalini Ratha, Prasanna Sattigeri, Chai-Wah Wu, Brian Kingsbury, Abhishek Kumar, Samuel Thomas, Aleksandra Mojsilovic, Kush R. Varshney |
Abstract | Recent work shows unequal performance of commercial face classification services in the gender classification task across intersectional groups defined by skin type and gender. Accuracy on dark-skinned females is significantly worse than on any other group. In this paper, we conduct several analyses to try to uncover the reason for this gap. The main finding, perhaps surprisingly, is that skin type is not the driver. This conclusion is reached via stability experiments that vary an image’s skin type via color-theoretic methods, namely luminance mode-shift and optimal transport. A second suspect, hair length, is also shown not to be the driver via experiments on face images cropped to exclude the hair. Finally, using contrastive post-hoc explanation techniques for neural networks, we bring forth evidence suggesting that differences in lip, eye and cheek structure across ethnicity lead to the differences. Further, lip and eye makeup are seen as strong predictors for a female face, which is a troubling propagation of a gender stereotype. |
Tasks | |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1812.00099v1 |
http://arxiv.org/pdf/1812.00099v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-unequal-gender-classification |
Repo | |
Framework | |