January 30, 2020

Paper Group ANR 269

Topological Machine Learning with Persistence Indicator Functions


Title	Topological Machine Learning with Persistence Indicator Functions
Authors	Bastian Rieck, Filip Sadlo, Heike Leitte
Abstract	Techniques from computational topology, in particular persistent homology, are becoming increasingly relevant for data analysis. Their stable metrics permit the use of many distance-based data analysis methods, such as multidimensional scaling, while providing a firm theoretical ground. Many modern machine learning algorithms, however, are based on kernels. This paper presents persistence indicator functions (PIFs), which summarize persistence diagrams, i.e., feature descriptors in topological data analysis. PIFs can be calculated and compared in linear time and have many beneficial properties, such as the availability of a kernel-based similarity measure. We demonstrate their usage in common data analysis scenarios, such as confidence set estimation and classification of complex structured data.
Tasks	Topological Data Analysis
Published	2019-07-31
URL	https://arxiv.org/abs/1907.13496v1
PDF	https://arxiv.org/pdf/1907.13496v1.pdf
PWC	https://paperswithcode.com/paper/topological-machine-learning-with-persistence
Repo
Framework
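
The abstract above describes PIFs as summaries of persistence diagrams. As a rough illustration only, the sketch below assumes that a PIF evaluated at a threshold counts the persistence intervals containing that threshold (closed intervals are an assumption here), and compares two PIFs by integrating their absolute difference. The names `pif` and `pif_l1_distance` are hypothetical, and this naive version favors clarity over the linear-time evaluation mentioned in the abstract.

```python
from typing import Iterable, List, Tuple

# One topological feature: (creation threshold, destruction threshold).
Interval = Tuple[float, float]


def pif(diagram: Iterable[Interval], eps: float) -> int:
    """Count the intervals of the diagram that contain the threshold eps."""
    return sum(1 for birth, death in diagram if birth <= eps <= death)


def pif_l1_distance(d1: List[Interval], d2: List[Interval]) -> float:
    """Integrate |pif(d1) - pif(d2)| over the real line.

    Both functions are piecewise constant and only change at interval
    endpoints, so evaluating them at the midpoint of each segment between
    consecutive endpoints gives the exact integral.
    """
    points = sorted({p for birth, death in d1 + d2 for p in (birth, death)})
    total = 0.0
    for left, right in zip(points, points[1:]):
        mid = 0.5 * (left + right)
        total += abs(pif(d1, mid) - pif(d2, mid)) * (right - left)
    return total
```

Because the result is an ordinary step function, it can be sampled on a fixed grid and passed to standard distance- or kernel-based methods; the kernel itself is not reproduced here.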

Transient-evoked otoacoustic emission signals predicting outcomes of acute sensorineural hearing loss in patients with Meniere’s Disease


Title	Transient-evoked otoacoustic emission signals predicting outcomes of acute sensorineural hearing loss in patients with Meniere’s Disease
Authors	Yi-Wen Liu, Sheng-Lun Kao, Hau-Tieng Wu, Tzu-Chi Liu, Te-Yung Fang, Pa-Chun Wang
Abstract	Background: Fluctuating hearing loss is characteristic of Meniere’s Disease (MD) during acute episodes. However, no reliable audiometric hallmarks are available for counselling the hearing recovery possibility. Aims/Objectives: To find parameters for predicting MD hearing outcomes. Material and Methods: We applied machine learning techniques to analyse transient-evoked otoacoustic emission (TEOAE) signals recorded from patients with MD. Thirty unilateral MD patients were recruited prospectively after onset of acute cochleo-vestibular symptoms. Serial TEOAE and pure-tone audiogram (PTA) data were recorded longitudinally. Denoised TEOAE signals were projected onto the three most prominent principal directions through a linear transformation. Binary classification was performed using a support vector machine (SVM). TEOAE signal parameters, including signal energy and group delay, were compared between improved and nonimproved groups using Welch’s t-test. Results: Signal energy did not differ (p = 0.64) but a significant difference in 1-kHz group delay (p = 0.045) was recorded between improved and nonimproved groups. The SVM achieved a cross-validated accuracy of >80% in predicting hearing outcomes. Conclusions and Significance: This study revealed that baseline TEOAE parameters obtained during acute MD episodes, when processed through machine learning technology, may provide information on outer hair cell function to predict hearing recovery.
Tasks
Published	2019-05-30
URL	https://arxiv.org/abs/1905.13573v2
PDF	https://arxiv.org/pdf/1905.13573v2.pdf
PWC	https://paperswithcode.com/paper/menieres-disease-prognosis-by-learning-from
Repo
Framework
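
The group comparison in the abstract above uses Welch's t-test, which does not assume equal variances in the two groups. The sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom with the standard library only; the function name `welch_t` is hypothetical, and the p-value step is left out because it needs a Student t distribution (for example, SciPy's `scipy.stats.ttest_ind` with `equal_var=False`).

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom.

    Each group needs at least two observations so that the sample
    variance is defined.
    """
    na, nb = len(a), len(b)
    # Squared standard errors of the two group means.
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df
```

Applied to, say, per-patient group-delay values for the improved and nonimproved groups, the returned pair would then be looked up against a t distribution to obtain the p-value.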

A Dataset of Multi-Illumination Images in the Wild


Title	A Dataset of Multi-Illumination Images in the Wild
Authors	Lukas Murmann, Michael Gharbi, Miika Aittala, Fredo Durand
Abstract	Collections of images under a single, uncontrolled illumination have enabled the rapid advancement of core computer vision tasks like classification, detection, and segmentation. But even with modern learning techniques, many inverse problems involving lighting and material understanding remain too severely ill-posed to be solved with single-illumination datasets. To fill this gap, we introduce a new multi-illumination dataset of more than 1000 real scenes, each captured under 25 lighting conditions. We demonstrate the richness of this dataset by training state-of-the-art models for three challenging applications: single-image illumination estimation, image relighting, and mixed-illuminant white balance.
Tasks
Published	2019-10-17
URL	https://arxiv.org/abs/1910.08131v1
PDF	https://arxiv.org/pdf/1910.08131v1.pdf
PWC	https://paperswithcode.com/paper/a-dataset-of-multi-illumination-images-in-the
Repo
Framework

Weakly-supervised Action Localization with Background Modeling


Title	Weakly-supervised Action Localization with Background Modeling
Authors	Phuc Xuan Nguyen, Deva Ramanan, Charless C. Fowlkes
Abstract	We describe a latent approach that learns to detect actions in long sequences given training videos with only whole-video class labels. Our approach makes use of two innovations to attention-modeling in weakly-supervised learning. First, and most notably, our framework uses an attention model to extract both foreground and background frames whose appearance is explicitly modeled. Most prior works ignore the background, but we show that modeling it allows our system to learn a richer notion of actions and their temporal extents. Second, we combine bottom-up, class-agnostic attention modules with top-down, class-specific activation maps, using the latter as a form of self-supervision for the former. Doing so allows our model to learn a more accurate model of attention without explicit temporal supervision. These modifications lead to 10% AP@IoU=0.5 improvement over existing systems on THUMOS14. Our proposed weakly-supervised system outperforms recent state-of-the-art methods by at least 4.3% AP@IoU=0.5. Finally, we demonstrate that weakly-supervised learning can be used to aggressively scale up learning to in-the-wild, uncurated Instagram videos. The addition of these videos significantly improves localization performance of our weakly-supervised model.
Tasks	Action Localization, Weakly Supervised Action Localization
Published	2019-08-19
URL	https://arxiv.org/abs/1908.06552v1
PDF	https://arxiv.org/pdf/1908.06552v1.pdf
PWC	https://paperswithcode.com/paper/weakly-supervised-action-localization-with
Repo
Framework
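
The results above are reported as AP@IoU=0.5. As a small illustration of the matching criterion behind that metric, the sketch below computes the temporal intersection-over-union of a predicted and a ground-truth segment. The function names are hypothetical, and treating an IoU exactly equal to the threshold as a match is an assumption; benchmark toolkits may differ on that detail and also require the class labels to agree.

```python
from typing import Tuple

# A temporal segment as (start, end) in seconds, with start <= end.
Segment = Tuple[float, float]


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Intersection-over-union of two time intervals."""
    intersection = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(pred: Segment, gt: Segment, threshold: float = 0.5) -> bool:
    """Assumed convention: a detection matches when IoU >= threshold."""
    return temporal_iou(pred, gt) >= threshold
```

Average precision is then computed from the ranked list of detections labeled as matches or misses; that aggregation step is not shown here.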

A PolSAR Scattering Power Factorization Framework and Novel Roll-Invariant Parameters Based Unsupervised Classification Scheme Using a Geodesic Distance


Title	A PolSAR Scattering Power Factorization Framework and Novel Roll-Invariant Parameters Based Unsupervised Classification Scheme Using a Geodesic Distance
Authors	Debanshu Ratha, Eric Pottier, Avik Bhattacharya, Alejandro C. Frery
Abstract	We propose a generic Scattering Power Factorization Framework (SPFF) for Polarimetric Synthetic Aperture Radar (PolSAR) data to directly obtain $N$ scattering power components along with a residue power component for each pixel. Each scattering power component is factorized into similarity (or dissimilarity) using elementary targets and a generalized random volume model. The similarity measure is derived using a geodesic distance between pairs of $4\times4$ real Kennaugh matrices. In standard model-based decomposition schemes, the $3\times3$ Hermitian positive semi-definite covariance (or coherency) matrix is expressed as a weighted linear combination of scattering targets following a fixed hierarchical process. In contrast, under the proposed framework, a convex splitting of unity is performed to obtain the weights while preserving the dominance of the scattering components. The product of the total power (Span) with these weights provides the non-negative scattering power components. Furthermore, the framework, along with the geodesic distance, is effectively used to obtain specific roll-invariant parameters which are then utilized to design an unsupervised classification scheme. The SPFF, the roll-invariant parameters, and the classification results are assessed using C-band RADARSAT-2 and L-band ALOS-2 images of San Francisco.
Tasks
Published	2019-06-27
URL	https://arxiv.org/abs/1906.11577v1
PDF	https://arxiv.org/pdf/1906.11577v1.pdf
PWC	https://paperswithcode.com/paper/a-polsar-scattering-power-factorization
Repo
Framework

Instance Segmentation as Image Segmentation Annotation


Title	Instance Segmentation as Image Segmentation Annotation
Authors	Thomio Watanabe, Denis Wolf
Abstract	The instance segmentation problem intends to precisely detect and delineate objects in images. Most of the current solutions rely on deep convolutional neural networks but, despite this fact, the proposed solutions are very diverse. Some solutions approach the problem as a network problem, where they use several networks or specialize a single network to solve several tasks. A different approach tries to solve the problem as an annotation problem, where the instance information is encoded in a mathematical representation. This work proposes a solution based on the DCME technique to solve instance segmentation with a single segmentation network. Different from others, the segmentation network decoder is not specialized in a multi-task network. Instead, the network encoder is repurposed to classify image objects, reducing the computational cost of the solution.
Tasks	Instance Segmentation, Semantic Segmentation
Published	2019-02-01
URL	http://arxiv.org/abs/1902.05498v1
PDF	http://arxiv.org/pdf/1902.05498v1.pdf
PWC	https://paperswithcode.com/paper/instance-segmentation-as-image-segmentation
Repo
Framework

Learning STRIPS Action Models with Classical Planning


Title	Learning STRIPS Action Models with Classical Planning
Authors	Diego Aineto, Sergio Jiménez, Eva Onaindia
Abstract	This paper presents a novel approach for learning STRIPS action models from examples that compiles this inductive learning task into a classical planning task. Interestingly, the compilation approach is flexible to different amounts of available input knowledge; the learning examples can range from a set of plans (with their corresponding initial and final states) to just a pair of initial and final states (no intermediate action or state is given). Moreover, the compilation accepts partially specified action models and it can be used to validate whether the observation of a plan execution follows a given STRIPS action model, even if this model is not fully specified.
Tasks
Published	2019-03-04
URL	http://arxiv.org/abs/1903.01153v1
PDF	http://arxiv.org/pdf/1903.01153v1.pdf
PWC	https://paperswithcode.com/paper/learning-strips-action-models-with-classical
Repo
Framework

PrecoderNet: Hybrid Beamforming for Millimeter Wave Systems Using Deep Reinforcement Learning


Title	PrecoderNet: Hybrid Beamforming for Millimeter Wave Systems Using Deep Reinforcement Learning
Authors	Qisheng Wang, Keming Feng
Abstract	Millimeter wave (mmWave) with large-scale antenna arrays is a promising solution to resolve the frequency resource shortage in next generation wireless communication. However, a fully digital beamforming structure becomes infeasible due to its prohibitively high hardware cost and unacceptable energy consumption, while traditional hybrid beamforming algorithms have a non-negligible gap to the optimal upper bound. In this paper, we consider a mmWave point-to-point massive multiple-input-multiple-output (MIMO) system and propose a new hybrid analog and digital beamforming (HBF) scheme based on deep reinforcement learning (DRL) to improve the spectral efficiency and reduce system bit error rate (BER). At the base station (BS) side, we propose a novel DRL-based HBF design method called PrecoderNet to design the hybrid precoding matrix. The DRL agent denotes the system sum rate as state and the real/imaginary part of the digital beamformer as actions. For the user side, the minimum mean-square-error (MMSE) criterion is used to design the receiving hybrid precoders, which minimizes the distance between the processed signals and the transmitted signals. Furthermore, HBF design algorithms such as weighted MMSE and orthogonal matching pursuit (OMP) are regarded as benchmarks to verify the performance of our algorithm. Finally, simulation results demonstrate that our proposed PrecoderNet outperforms the benchmarks in terms of spectral efficiency and BER while being more tractable in practical implementation.
Tasks
Published	2019-07-31
URL	https://arxiv.org/abs/1907.13266v1
PDF	https://arxiv.org/pdf/1907.13266v1.pdf
PWC	https://paperswithcode.com/paper/precodernet-hybrid-beamforming-for-millimeter
Repo
Framework

A Survey of Pruning Methods for Efficient Person Re-identification Across Domains


Title	A Survey of Pruning Methods for Efficient Person Re-identification Across Domains
Authors	Hugo Masson, Amran Bhuiyan, Le Thanh Nguyen-Meidine, Mehrsan Javan, Parthipan Siva, Ismail Ben Ayed, Eric Granger
Abstract	Recent years have witnessed a substantial increase in the deep learning architectures proposed for visual recognition tasks like person re-identification, where individuals must be recognized over multiple distributed cameras. Although deep Siamese networks have greatly improved the state-of-the-art accuracy, the computational complexity of the CNNs used for feature extraction remains an issue, hindering their deployment on platforms with limited resources, or in applications with real-time constraints. Thus, there is an obvious advantage to compressing these architectures without significantly decreasing their accuracy. This paper provides a survey of state-of-the-art pruning techniques that are suitable for compressing deep Siamese networks applied to person re-identification. These techniques are analysed according to their pruning criteria and strategy, and according to different design scenarios for exploiting pruning methods to fine-tune networks for target applications. Experimental results obtained using Siamese networks with ResNet feature extractors, and multiple benchmark re-identification datasets, indicate that pruning can considerably reduce network complexity while maintaining a high level of accuracy. In scenarios where pruning is performed with large pre-training or fine-tuning datasets, the number of FLOPS required by the ResNet feature extractor is reduced by half, while maintaining a comparable rank-1 accuracy (within 1% of the original model). Pruning while training larger CNNs can also provide a significantly better performance than fine-tuning smaller ones.
Tasks	Person Re-Identification
Published	2019-07-04
URL	https://arxiv.org/abs/1907.02547v1
PDF	https://arxiv.org/pdf/1907.02547v1.pdf
PWC	https://paperswithcode.com/paper/a-survey-of-pruning-methods-for-efficient
Repo
Framework

Metric Learning from Imbalanced Data


Title	Metric Learning from Imbalanced Data
Authors	Léo Gautheron, Emilie Morvant, Amaury Habrard, Marc Sebban
Abstract	A key element of any machine learning algorithm is the use of a function that measures the dis/similarity between data points. Given a task, such a function can be optimized with a metric learning algorithm. Although this research field has received a lot of attention during the past decade, very few approaches have focused on learning a metric in an imbalanced scenario where the number of positive examples is much smaller than the negatives. Here, we address this challenging task by designing a new Mahalanobis metric learning algorithm (IML) which deals with class imbalance. The empirical study performed shows the efficiency of IML.
Tasks	Metric Learning
Published	2019-09-04
URL	https://arxiv.org/abs/1909.01651v1
PDF	https://arxiv.org/pdf/1909.01651v1.pdf
PWC	https://paperswithcode.com/paper/metric-learning-from-imbalanced-data
Repo
Framework
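
For readers unfamiliar with the object that Mahalanobis metric learning optimizes, the sketch below evaluates the distance $d_M(x, y) = \sqrt{(x - y)^\top M (x - y)}$ for a given matrix $M$. It does not reproduce the IML algorithm from the abstract; `mahalanobis` is a hypothetical helper, and `M` is assumed to be symmetric positive semi-definite so that the value under the square root is non-negative.

```python
import math
from typing import Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def mahalanobis(x: Vector, y: Vector, M: Matrix) -> float:
    """Distance sqrt((x - y)^T M (x - y)) for a PSD matrix M (assumed)."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    # Matrix-vector product M @ diff, one row at a time.
    m_diff = [sum(m_ij * d_j for m_ij, d_j in zip(row, diff)) for row in M]
    squared = sum(d_i * md_i for d_i, md_i in zip(diff, m_diff))
    return math.sqrt(squared)
```

With the identity matrix this reduces to the Euclidean distance; a learned `M` stretches or shrinks directions of the feature space, which is the degree of freedom a metric learning algorithm tunes.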

Query Generation for Patent Retrieval with Keyword Extraction based on Syntactic Features


Title	Query Generation for Patent Retrieval with Keyword Extraction based on Syntactic Features
Authors	Julien Rossi, Matthias Wirth, Evangelos Kanoulas
Abstract	This paper describes a new method to extract relevant keywords from patent claims, as part of the task of retrieving other patents with similar claims (search for prior art). The method combines a qualitative analysis of the writing style of the claims with NLP methods to parse text, in order to represent a legal text as a specialization arborescence of terms. In this setting, the set of extracted keywords yields better search results than keywords extracted with traditional methods such as tf-idf. The performance is measured on the search results of a query consisting of the extracted keywords.
Tasks	Keyword Extraction
Published	2019-06-18
URL	https://arxiv.org/abs/1906.07591v1
PDF	https://arxiv.org/pdf/1906.07591v1.pdf
PWC	https://paperswithcode.com/paper/query-generation-for-patent-retrieval-with
Repo
Framework
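
The abstract above uses tf-idf keyword extraction as its baseline. The sketch below shows one common variant of that baseline, not the paper's syntactic method: term frequency normalized by document length, multiplied by `log(N / df)`. The weighting variant, the function name `tfidf_keywords`, and the pre-tokenized input are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List


def tfidf_keywords(docs: List[List[str]], doc_index: int, top_k: int = 3) -> List[str]:
    """Return the top_k terms of docs[doc_index] ranked by tf-idf.

    docs is a list of already tokenized documents. Ties are broken
    alphabetically so the ranking is deterministic.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    term_counts = Counter(docs[doc_index])
    length = len(docs[doc_index])
    scores = {
        term: (count / length) * math.log(n_docs / doc_freq[term])
        for term, count in term_counts.items()
    }
    return sorted(scores, key=lambda term: (-scores[term], term))[:top_k]
```

Terms that occur in every document receive a score of zero under this weighting, which is why generic claim vocabulary tends to drop out of a tf-idf query.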

A Three-dimensional Convolutional-Recurrent Network for Convective Storm Nowcasting


Title	A Three-dimensional Convolutional-Recurrent Network for Convective Storm Nowcasting
Authors	Wei Zhang, Wei Li, Lei Han
Abstract	Very short-term convective storm forecasting, termed nowcasting, has long been an important issue and has attracted substantial interest. Existing nowcasting methods rely principally on radar images and are limited in terms of nowcasting storm initiation and growth. Real-time re-analysis of meteorological data supplied by numerical models provides valuable information about three-dimensional (3D), atmospheric, boundary layer thermal dynamics, such as temperature and wind. To mine such data, we here develop a convolution-recurrent, hybrid deep-learning method with the following characteristics: (1) the use of cell-based oversampling to increase the number of training samples; this mitigates the class imbalance issue; (2) the use of both raw 3D radar data and 3D meteorological data re-analyzed via multi-source 3D convolution without any need for handcrafted feature engineering; and (3) the stacking of convolutional neural networks on a long short-term memory encoder/decoder that learns the spatiotemporal patterns of convective processes. Experimental results demonstrated that our method performs better than other extrapolation methods. Qualitative analysis yielded encouraging nowcasting results.
Tasks	Feature Engineering
Published	2019-10-01
URL	https://arxiv.org/abs/1910.00527v2
PDF	https://arxiv.org/pdf/1910.00527v2.pdf
PWC	https://paperswithcode.com/paper/a-three-dimensional-convolutional-recurrent
Repo
Framework

Better Future through AI: Avoiding Pitfalls and Guiding AI Towards its Full Potential


Title	Better Future through AI: Avoiding Pitfalls and Guiding AI Towards its Full Potential
Authors	Risto Miikkulainen, Bret Greenstein, Babak Hodjat, Jerry Smith
Abstract	Artificial Intelligence (AI) technology is rapidly changing many areas of society. While there is tremendous potential in this transition, there are several pitfalls as well. Using the history of computing and the world-wide web as a guide, in this article we identify those pitfalls and actions that lead AI development to its full potential. If done right, AI will be instrumental in achieving the goals we set for economy, society, and the world in general.
Tasks
Published	2019-05-30
URL	https://arxiv.org/abs/1905.13178v1
PDF	https://arxiv.org/pdf/1905.13178v1.pdf
PWC	https://paperswithcode.com/paper/better-future-through-ai-avoiding-pitfalls
Repo
Framework

Complexity, Statistical Risk, and Metric Entropy of Deep Nets Using Total Path Variation


Title	Complexity, Statistical Risk, and Metric Entropy of Deep Nets Using Total Path Variation
Authors	Andrew R. Barron, Jason M. Klusowski
Abstract	For any ReLU network there is a representation in which the sum of the absolute values of the weights into each node is exactly $1$, and the input layer variables are multiplied by a value $V$ coinciding with the total variation of the path weights. Implications are given for Gaussian complexity, Rademacher complexity, statistical risk, and metric entropy, all of which are shown to be proportional to $V$. There is no dependence on the number of nodes per layer, except for the number of inputs $d$. For estimation with sub-Gaussian noise, the mean square generalization error bounds that can be obtained are of order $V \sqrt{L + \log d}/\sqrt{n}$, where $L$ is the number of layers and $n$ is the sample size.
Tasks
Published	2019-02-02
URL	http://arxiv.org/abs/1902.00800v2
PDF	http://arxiv.org/pdf/1902.00800v2.pdf
PWC	https://paperswithcode.com/paper/complexity-statistical-risk-and-metric
Repo
Framework
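
To get a feel for how the stated rate $V \sqrt{L + \log d}/\sqrt{n}$ scales, the sketch below evaluates that expression directly. It only computes the order of the bound: constants and any additional terms from the paper are omitted, the natural logarithm is assumed, and `risk_bound_order` is a hypothetical name.

```python
import math


def risk_bound_order(V: float, L: int, d: int, n: int) -> float:
    """Evaluate V * sqrt(L + log d) / sqrt(n), constants omitted.

    V: total path variation, L: number of layers,
    d: number of inputs, n: sample size.
    """
    return V * math.sqrt(L + math.log(d)) / math.sqrt(n)
```

The expression depends on the input dimension only through `log d` and on depth only through `sqrt(L)`, while quadrupling the sample size halves the value.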

Noise Contrastive Variational Autoencoders


Title	Noise Contrastive Variational Autoencoders
Authors	Octavian-Eugen Ganea, Yashas Annadani, Gary Bécigneul
Abstract	We take steps towards understanding the “posterior collapse (PC)” difficulty in variational autoencoders (VAEs), i.e., a degenerate optimum in which the latent codes become independent of their corresponding inputs. We rely on calculus of variations and theoretically explore a few popular VAE models, showing that PC always occurs for non-parametric encoders and decoders. Inspired by the popular noise contrastive estimation algorithm, we propose NC-VAE where the encoder discriminates between the latent codes of real data and of some artificially generated noise, in addition to encouraging good data reconstruction abilities. Theoretically, we prove that our model cannot reach PC and provide novel lower bounds. Our method is straightforward to implement and has the same run-time as vanilla VAE. Empirically, we showcase its benefits on popular image and text datasets.
Tasks
Published	2019-07-23
URL	https://arxiv.org/abs/1907.10430v2
PDF	https://arxiv.org/pdf/1907.10430v2.pdf
PWC	https://paperswithcode.com/paper/noise-contrastive-variational-autoencoders
Repo
Framework