October 18, 2019


Paper Group ANR 413


Generalizing semi-supervised generative adversarial networks to regression using feature contrasting


Title	Generalizing semi-supervised generative adversarial networks to regression using feature contrasting
Authors	Greg Olmschenk, Zhigang Zhu, Hao Tang
Abstract	In this work, we generalize semi-supervised generative adversarial networks (GANs) from classification problems to regression problems. In the last few years, the importance of improving the training of neural networks using semi-supervised training has been demonstrated for classification problems. We present a novel loss function, called feature contrasting, resulting in a discriminator which can distinguish between fake and real data based on feature statistics. This method avoids potential biases and limitations of alternative approaches. The generalization of semi-supervised GANs to the regime of regression problems opens their use to countless applications and provides an avenue for a deeper understanding of how GANs function. We first demonstrate the capabilities of semi-supervised regression GANs on a toy dataset which allows for a detailed understanding of how they operate in various circumstances. This toy dataset is used to provide a theoretical basis of the semi-supervised regression GAN. We then apply the semi-supervised regression GANs to a number of real-world computer vision applications: age estimation, driving steering angle prediction, and crowd counting from single images. We perform extensive tests of what accuracy can be achieved with significantly reduced annotated data. Through the combination of the theoretical example and real-world scenarios, we demonstrate how semi-supervised GANs can be generalized to regression problems.
Tasks	Age Estimation, Crowd Counting
Published	2018-11-27
URL	https://arxiv.org/abs/1811.11269v3
PDF	https://arxiv.org/pdf/1811.11269v3.pdf
PWC	https://paperswithcode.com/paper/generalizing-semi-supervised-generative
Repo
Framework

Adaptive Cost-sensitive Online Classification


Title	Adaptive Cost-sensitive Online Classification
Authors	Peilin Zhao, Yifan Zhang, Min Wu, Steven C. H. Hoi, Mingkui Tan, Junzhou Huang
Abstract	Cost-Sensitive Online Classification has drawn extensive attention in recent years, where the main approach is to directly optimize, in an online manner, two well-known cost-sensitive metrics: (i) weighted sum of sensitivity and specificity; (ii) weighted misclassification cost. However, existing methods only considered first-order information of the data stream. This is insufficient in practice, since many recent studies have shown that incorporating second-order information enhances the prediction performance of classification models. Thus, we propose a family of cost-sensitive online classification algorithms with adaptive regularization in this paper. We theoretically analyze the proposed algorithms and empirically validate their effectiveness and properties in extensive experiments. Then, for a better trade-off between performance and efficiency, we further introduce the sketching technique into our algorithms, which significantly accelerates the computational speed with only a slight performance loss. Finally, we apply our algorithms to tackle several online anomaly detection tasks from the real world. Promising results show that the proposed algorithms are effective and efficient in solving cost-sensitive online classification problems in various real-world domains.
Tasks	Anomaly Detection
Published	2018-04-06
URL	http://arxiv.org/abs/1804.02246v1
PDF	http://arxiv.org/pdf/1804.02246v1.pdf
PWC	https://paperswithcode.com/paper/adaptive-cost-sensitive-online-classification
Repo
Framework
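
The two metrics named in the abstract can be made concrete with a small sketch. This is an illustration only, not the authors' implementation: the function name `cost_sensitive_metrics`, the weight parameters `eta_p` and `c_p`, and the +1/-1 label encoding are assumptions made for the example.

```python
def cost_sensitive_metrics(y_true, y_pred, eta_p=0.5, c_p=0.5):
    """Return (weighted sum of sensitivity and specificity, weighted cost).

    Labels are assumed to be +1 (positive) or -1 (negative).
    eta_p weights sensitivity (1 - eta_p weights specificity);
    c_p weights false negatives (1 - c_p weights false positives).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)

    # Guard against empty classes so the sketch never divides by zero.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0

    weighted_sum = eta_p * sensitivity + (1 - eta_p) * specificity
    weighted_cost = c_p * fn + (1 - c_p) * fp
    return weighted_sum, weighted_cost


if __name__ == "__main__":
    truth = [1, 1, -1, -1, -1, -1]
    preds = [1, -1, -1, -1, -1, 1]
    print(cost_sensitive_metrics(truth, preds))
```

Raising `c_p` above 0.5 penalizes missed positives more heavily, which is the usual choice when the positive class (for example, anomalies) is rare.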

A Stochastic Decoder for Neural Machine Translation


Title	A Stochastic Decoder for Neural Machine Translation
Authors	Philip Schulz, Wilker Aziz, Trevor Cohn
Abstract	The process of translation is ambiguous, in that there are typically many valid translations for a given sentence. This gives rise to significant variation in parallel corpora; however, most current models of machine translation do not account for this variation, instead treating the problem as a deterministic process. To this end, we present a deep generative model of machine translation which incorporates a chain of latent variables, in order to account for local lexical and syntactic variation in parallel corpora. We provide an in-depth analysis of the pitfalls encountered in variational inference for training deep generative models. Experiments on several different language pairs demonstrate that the model consistently improves over strong baselines.
Tasks	Machine Translation
Published	2018-05-28
URL	http://arxiv.org/abs/1805.10844v1
PDF	http://arxiv.org/pdf/1805.10844v1.pdf
PWC	https://paperswithcode.com/paper/a-stochastic-decoder-for-neural-machine
Repo
Framework

Negative Momentum for Improved Game Dynamics


Title	Negative Momentum for Improved Game Dynamics
Authors	Gauthier Gidel, Reyhane Askari Hemmat, Mohammad Pezeshki, Remi Lepriol, Gabriel Huang, Simon Lacoste-Julien, Ioannis Mitliagkas
Abstract	Games generalize the single-objective optimization paradigm by introducing different objective functions for different players. Differentiable games often proceed by simultaneous or alternating gradient updates. In machine learning, games are gaining new importance through formulations like generative adversarial networks (GANs) and actor-critic systems. However, compared to single-objective optimization, game dynamics are more complex and less understood. In this paper, we analyze gradient-based methods with momentum on simple games. We prove that alternating updates are more stable than simultaneous updates. Next, we show both theoretically and empirically that alternating gradient updates with a negative momentum term achieve convergence not only in a difficult toy adversarial problem, but also on the notoriously difficult-to-train saturating GANs.
Tasks
Published	2018-07-12
URL	https://arxiv.org/abs/1807.04740v4
PDF	https://arxiv.org/pdf/1807.04740v4.pdf
PWC	https://paperswithcode.com/paper/negative-momentum-for-improved-game-dynamics
Repo
Framework
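
The contrast between simultaneous and alternating updates, and the effect of a negative momentum term, can be seen on the scalar bilinear game min over x, max over y of x*y. The sketch below is a toy illustration under stated assumptions, not the paper's experimental setup: the function name `run_game`, the step size of 1.0, the momentum value of -0.5, and the choice to apply momentum to the first player only are all picked for the example.

```python
import math


def run_game(steps, eta, beta, alternating):
    """Gradient play on f(x, y) = x * y; returns the final distance to (0, 0).

    x descends on f, y ascends on f. Heavy-ball momentum with coefficient
    beta is applied to x only (an assumption made for this sketch).
    """
    x, y = 1.0, 1.0
    x_prev = x  # no momentum contribution on the first step
    for _ in range(steps):
        # df/dx = y, so x takes a descent step plus the momentum term.
        x_new = x - eta * y + beta * (x - x_prev)
        # Alternating play lets y react to the updated x;
        # simultaneous play uses the old x.
        x_seen = x_new if alternating else x
        # df/dy = x, so y takes an ascent step.
        y_new = y + eta * x_seen
        x_prev, x, y = x, x_new, y_new
    return math.hypot(x, y)


if __name__ == "__main__":
    print("simultaneous, no momentum:", run_game(200, 1.0, 0.0, False))
    print("alternating, no momentum: ", run_game(200, 1.0, 0.0, True))
    print("alternating, beta = -0.5: ", run_game(200, 1.0, -0.5, True))
```

In this toy setting the simultaneous iterates spiral outward, the alternating iterates stay on a bounded cycle, and only the negative-momentum variant shrinks toward the equilibrium at the origin.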

Knowledge Compilation in Multi-Agent Epistemic Logics


Title	Knowledge Compilation in Multi-Agent Epistemic Logics
Authors	Liangda Fang, Kewen Wang, Zhe Wang, Ximing Wen
Abstract	Epistemic logics are a primary formalism for multi-agent systems but major reasoning tasks in such epistemic logics are intractable, which impedes applications of multi-agent epistemic logics in automatic planning. Knowledge compilation provides a promising way of resolving the intractability by identifying expressive fragments of epistemic logics that are tractable for important reasoning tasks such as satisfiability and forgetting. The property of logical separability allows a formula to be decomposed into some of its subformulas, and thus modular algorithms for various reasoning tasks can be developed. In this paper, by employing logical separability, we propose an approach to knowledge compilation for the logic Kn by defining a normal form SDNF. Among several novel results, we show that every epistemic formula can be equivalently compiled into a formula in SDNF, major reasoning tasks in SDNF are tractable, and formulas in SDNF enjoy logical separability. Our results shed some light on modular approaches to knowledge compilation. Furthermore, we apply our results to multi-agent epistemic planning. Finally, we extend the above result to the logic K45n, that is, Kn extended by introspection axioms 4 and 5.
Tasks
Published	2018-06-27
URL	http://arxiv.org/abs/1806.10561v2
PDF	http://arxiv.org/pdf/1806.10561v2.pdf
PWC	https://paperswithcode.com/paper/knowledge-compilation-in-multi-agent
Repo
Framework

Person re-identification with fusion of hand-crafted and deep pose-based body region features


Title	Person re-identification with fusion of hand-crafted and deep pose-based body region features
Authors	Jubin Johnson, Shunsuke Yasugi, Yoichi Sugino, Sugiri Pranata, Shengmei Shen
Abstract	Person re-identification (re-ID) aims to accurately retrieve a person from a large-scale database of images captured across multiple cameras. Existing works learn deep representations using a large training subset of unique persons. However, identifying unseen persons is critical for a good re-ID algorithm. Moreover, the misalignment between person crops due to detection errors or pose variations leads to poor feature matching. In this work, we present a fusion of handcrafted features and deep feature representation learned using multiple body parts to complement the global body features that achieves high performance on unseen test images. Pose information is used to detect body regions that are passed through Convolutional Neural Networks (CNN) to guide feature learning. Finally, a metric learning step enables robust distance matching on a discriminative subspace. Experimental results on 4 popular re-ID benchmark datasets, namely VIPeR, DukeMTMC-reID, Market-1501 and CUHK03, show that the proposed method achieves state-of-the-art performance in image-based person re-identification.
Tasks	Metric Learning, Person Re-Identification
Published	2018-03-27
URL	http://arxiv.org/abs/1803.10630v1
PDF	http://arxiv.org/pdf/1803.10630v1.pdf
PWC	https://paperswithcode.com/paper/person-re-identification-with-fusion-of-hand
Repo
Framework

Efficient and Deep Person Re-Identification using Multi-Level Similarity


Title	Efficient and Deep Person Re-Identification using Multi-Level Similarity
Authors	Yiluan Guo, Ngai-Man Cheung
Abstract	Person Re-Identification (ReID) requires comparing two images of a person captured under different conditions. Existing work based on neural networks often computes the similarity of feature maps from one single convolutional layer. In this work, we propose an efficient, end-to-end fully convolutional Siamese network that computes the similarities at multiple levels. We demonstrate that multi-level similarity can improve the accuracy considerably using low-complexity network structures in the ReID problem. Specifically, first, we use several convolutional layers to extract the features of two input images. Then, we propose a Convolution Similarity Network to compute the similarity score maps for the inputs. We use spatial transformer networks (STNs) to determine spatial attention. We propose to apply efficient depth-wise convolution to compute the similarity. The proposed Convolution Similarity Networks can be inserted into different convolutional layers to extract visual similarities at different levels. Furthermore, we use an improved ranking loss to further improve the performance. Our work is the first to propose to compute visual similarities at low, middle and high levels for ReID. With extensive experiments and analysis, we demonstrate that our system, compact yet effective, can achieve competitive results with much smaller model size and computational complexity.
Tasks	Person Re-Identification
Published	2018-03-30
URL	http://arxiv.org/abs/1803.11353v2
PDF	http://arxiv.org/pdf/1803.11353v2.pdf
PWC	https://paperswithcode.com/paper/efficient-and-deep-person-re-identification
Repo
Framework

Integrating Recurrence Dynamics for Speech Emotion Recognition


Title	Integrating Recurrence Dynamics for Speech Emotion Recognition
Authors	Efthymios Tzinis, Georgios Paraskevopoulos, Christos Baziotis, Alexandros Potamianos
Abstract	We investigate the performance of features that can capture nonlinear recurrence dynamics embedded in the speech signal for the task of Speech Emotion Recognition (SER). Reconstruction of the phase space of each speech frame and the computation of its respective Recurrence Plot (RP) reveals complex structures which can be measured by performing Recurrence Quantification Analysis (RQA). These measures are aggregated by using statistical functionals over segment and utterance periods. We report SER results for the proposed feature set on three databases using different classification methods. When fusing the proposed features with traditional feature sets, we show an improvement in unweighted accuracy of up to 5.7% and 10.7% on Speaker-Dependent (SD) and Speaker-Independent (SI) SER tasks, respectively, over the baseline. Following a segment-based approach we demonstrate state-of-the-art performance on IEMOCAP using a Bidirectional Recurrent Neural Network.
Tasks	Emotion Recognition, Speech Emotion Recognition
Published	2018-11-09
URL	http://arxiv.org/abs/1811.04133v1
PDF	http://arxiv.org/pdf/1811.04133v1.pdf
PWC	https://paperswithcode.com/paper/integrating-recurrence-dynamics-for-speech
Repo
Framework
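
The pipeline the abstract describes (phase-space reconstruction, recurrence plot, then a quantification measure) can be sketched in a few lines. This is a minimal illustration under assumptions: the function names, the embedding dimension, the delay, and the radius are chosen for the example, and the recurrence rate shown is only one of the standard RQA measures, not necessarily the set used in the paper.

```python
import math


def delay_embed(signal, dim, delay):
    """Reconstruct phase-space points by time-delay embedding."""
    count = len(signal) - (dim - 1) * delay
    return [[signal[i + k * delay] for k in range(dim)] for i in range(count)]


def recurrence_plot(points, radius):
    """Binary matrix: entry (i, j) is 1 when points i and j lie within radius."""
    return [
        [1 if math.dist(p, q) <= radius else 0 for q in points]
        for p in points
    ]


def recurrence_rate(plot):
    """Fraction of recurrent entries, the simplest RQA measure."""
    size = len(plot)
    return sum(sum(row) for row in plot) / (size * size)


if __name__ == "__main__":
    frame = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # toy periodic "speech frame"
    points = delay_embed(frame, dim=2, delay=1)
    plot = recurrence_plot(points, radius=0.1)
    print(recurrence_rate(plot))
```

For a periodic signal like the toy frame, the recurrence plot shows regular diagonal stripes; more irregular dynamics produce less structured plots, which is what the RQA measures summarize.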

Mammography Dual View Mass Correspondence


Title	Mammography Dual View Mass Correspondence
Authors	Shaked Perek, Alon Hazan, Ella Barkan, Ayelet Akselrod-Ballin
Abstract	Standard breast cancer screening involves the acquisition of two mammography X-ray projections for each breast. Typically, a comparison of both views supports the challenging task of tumor detection and localization. We introduce a deep learning, patch-based Siamese network for lesion matching in dual-view mammography. Our locally-fitted approach generates a joint patch pair representation and comparison with a shared configuration between the two views. We performed a comprehensive set of experiments with the network on standard datasets, among them the large Digital Database for Screening Mammography (DDSM). We analyzed the effect of transfer learning with the network between different types of datasets and compared the network-based matching to using Euclidean distance by template matching. Finally, we evaluated the contribution of the matching network in a full detection pipeline. Experimental results demonstrate the promise of improved detection accuracy using our approach.
Tasks	Transfer Learning
Published	2018-07-02
URL	http://arxiv.org/abs/1807.00637v1
PDF	http://arxiv.org/pdf/1807.00637v1.pdf
PWC	https://paperswithcode.com/paper/mammography-dual-view-mass-correspondence
Repo
Framework

Successive Convex Approximation Algorithms for Sparse Signal Estimation with Nonconvex Regularizations


Title	Successive Convex Approximation Algorithms for Sparse Signal Estimation with Nonconvex Regularizations
Authors	Yang Yang, Marius Pesavento, Symeon Chatzinotas, Björn Ottersten
Abstract	In this paper, we propose a successive convex approximation framework for sparse optimization where the nonsmooth regularization function in the objective function is nonconvex and it can be written as the difference of two convex functions. The proposed framework is based on a nontrivial combination of the majorization-minimization framework and the successive convex approximation framework proposed in the literature for a convex regularization function. The proposed framework has several attractive features, namely, i) flexibility, as different choices of the approximate function lead to different types of algorithms; ii) fast convergence, as the problem structure can be better exploited by a proper choice of the approximate function and the stepsize is calculated by the line search; iii) low complexity, as the approximate function is convex and the line search scheme is carried out over a differentiable function; iv) guaranteed convergence to a stationary point. We demonstrate these features by two example applications in subspace learning, namely, the network anomaly detection problem and the sparse subspace clustering problem. Customizing the proposed framework by adopting the best-response type approximation, we obtain soft-thresholding with exact line search algorithms for which all elements of the unknown parameter are updated in parallel according to closed-form expressions. The attractive features of the proposed algorithms are illustrated numerically.
Tasks	Anomaly Detection
Published	2018-06-28
URL	http://arxiv.org/abs/1806.10773v1
PDF	http://arxiv.org/pdf/1806.10773v1.pdf
PWC	https://paperswithcode.com/paper/successive-convex-approximation-algorithms
Repo
Framework
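
The closed-form, element-wise update the abstract refers to is built on the soft-thresholding operator. The sketch below shows only that standard operator, not the paper's full algorithm or its line search; the function names and the threshold value are assumptions for the example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def soft_threshold_all(values, threshold):
    """Apply the operator element-wise; each entry is independent of the others."""
    return [soft_threshold(v, threshold) for v in values]


if __name__ == "__main__":
    print(soft_threshold_all([3.0, -0.5, -2.0, 0.2], 1.0))
```

Because each entry is processed independently, the update can run in parallel across all elements, which matches the parallel closed-form updates mentioned in the abstract.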

The Structure of Optimal Private Tests for Simple Hypotheses


Title	The Structure of Optimal Private Tests for Simple Hypotheses
Authors	Clément L. Canonne, Gautam Kamath, Audra McMillan, Adam Smith, Jonathan Ullman
Abstract	Hypothesis testing plays a central role in statistical inference, and is used in many settings where privacy concerns are paramount. This work answers a basic question about privately testing simple hypotheses: given two distributions $P$ and $Q$, and a privacy level $\varepsilon$, how many i.i.d. samples are needed to distinguish $P$ from $Q$ subject to $\varepsilon$-differential privacy, and what sort of tests have optimal sample complexity? Specifically, we characterize this sample complexity up to constant factors in terms of the structure of $P$ and $Q$ and the privacy level $\varepsilon$, and show that this sample complexity is achieved by a certain randomized and clamped variant of the log-likelihood ratio test. Our result is an analogue of the classical Neyman-Pearson lemma in the setting of private hypothesis testing. We also give an application of our result to private change-point detection. Our characterization applies more generally to hypothesis tests satisfying essentially any notion of algorithmic stability, which is known to imply strong generalization bounds in adaptive data analysis, and thus our results have applications even when privacy is not a primary concern.
Tasks	Change Point Detection
Published	2018-11-27
URL	http://arxiv.org/abs/1811.11148v2
PDF	http://arxiv.org/pdf/1811.11148v2.pdf
PWC	https://paperswithcode.com/paper/the-structure-of-optimal-private-tests-for
Repo
Framework
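
One way to picture a clamped, randomized log-likelihood ratio statistic is to clamp each per-sample log ratio to a fixed interval and add Laplace noise scaled to the interval width. The sketch below uses the standard Laplace mechanism and is not claimed to be the exact test analyzed in the paper: the function names, the clamping bounds `low` and `high`, and the toy distributions are assumptions for the example.

```python
import math
import random


def clamped_llr(x, p, q, low, high):
    """Log-likelihood ratio log(P(x)/Q(x)) clamped to [low, high]."""
    return min(max(math.log(p[x] / q[x]), low), high)


def private_llr_statistic(samples, p, q, low, high, epsilon, rng):
    """Sum of clamped log ratios plus Laplace noise.

    Changing one sample moves the sum by at most (high - low), so Laplace
    noise with scale (high - low) / epsilon gives epsilon-differential privacy.
    """
    total = sum(clamped_llr(x, p, q, low, high) for x in samples)
    scale = (high - low) / epsilon
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return total + noise


if __name__ == "__main__":
    p = {0: 0.9, 1: 0.1}  # toy distribution P over {0, 1}
    q = {0: 0.1, 1: 0.9}  # toy distribution Q over {0, 1}
    rng = random.Random(0)
    samples = [0] * 100   # data that looks like it came from P
    stat = private_llr_statistic(samples, p, q, -1.0, 1.0, 1.0, rng)
    print("decide P" if stat > 0 else "decide Q")
```

Clamping is what keeps the sensitivity bounded: without it, a single sample with an extreme likelihood ratio could shift the statistic arbitrarily, and no finite noise scale would suffice.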

Segmentation of Bleeding Regions in Wireless Capsule Endoscopy Images an Approach for inside Capsule Video Summarization


Title	Segmentation of Bleeding Regions in Wireless Capsule Endoscopy Images an Approach for inside Capsule Video Summarization
Authors	Mohsen Hajabdollahi, Reza Esfandiarpoor, S. M. Reza Soroushmehr, Nader Karimi, Shadrokh Samavi, Kayvan Najarian
Abstract	Wireless capsule endoscopy (WCE) is an effective means of diagnosis of gastrointestinal disorders. Detection of informative scenes by WCE could reduce the length of transmitted videos and can help with the diagnosis. In this paper we propose a simple and efficient method for segmentation of the bleeding regions in WCE captured images. Suitable color channels are selected and classified by a multi-layer perceptron (MLP) structure. The MLP structure is quantized such that the implementation does not require multiplications. The proposed method is tested by simulation on a WCE bleeding image dataset. The proposed structure is designed considering the hardware resource constraints that exist in WCE systems.
Tasks	Video Summarization
Published	2018-02-21
URL	http://arxiv.org/abs/1802.07788v1
PDF	http://arxiv.org/pdf/1802.07788v1.pdf
PWC	https://paperswithcode.com/paper/segmentation-of-bleeding-regions-in-wireless-1
Repo
Framework

Quantification of Local Metabolic Tumor Volume Changes by Registering Blended PET-CT Images for Prediction of Pathologic Tumor Response


Title	Quantification of Local Metabolic Tumor Volume Changes by Registering Blended PET-CT Images for Prediction of Pathologic Tumor Response
Authors	Sadegh Riyahi, Wookjin Choi, Chia-Ju Liu, Saad Nadeem, Shan Tan, Hualiang Zhong, Wengen Chen, Abraham J. Wu, James G. Mechalakos, Joseph O. Deasy, Wei Lu
Abstract	Quantification of local metabolic tumor volume (MTV) changes after chemo-radiotherapy would allow accurate tumor response evaluation. Currently, local MTV changes in esophageal (soft-tissue) cancer are measured by registering follow-up PET to baseline PET using the same transformation obtained by deformable registration of follow-up CT to baseline CT. Such an approach is suboptimal because PET and CT capture fundamentally different properties (metabolism vs. anatomy) of a tumor. In this work we combined PET and CT images into a single blended PET-CT image and registered the follow-up blended PET-CT image to the baseline blended PET-CT image. B-spline regularized diffeomorphic registration was used to characterize the large MTV shrinkage. The Jacobian of the resulting transformation was computed to measure the local MTV changes. Radiomic features (intensity and texture) were then extracted from the Jacobian map to predict pathologic tumor response. Local MTV changes calculated using blended PET-CT registration achieved the highest correlation with ground truth segmentation (R=0.88) compared to PET-PET (R=0.80) and CT-CT (R=0.67) registrations. Moreover, using blended PET-CT registration, the multivariate prediction model achieved the highest accuracy with only one Jacobian co-occurrence texture feature (accuracy=82.3%). This novel framework can replace the conventional approach that applies CT-CT transformation to the PET data for longitudinal evaluation of tumor response.
Tasks
Published	2018-08-24
URL	http://arxiv.org/abs/1808.08312v1
PDF	http://arxiv.org/pdf/1808.08312v1.pdf
PWC	https://paperswithcode.com/paper/quantification-of-local-metabolic-tumor
Repo
Framework
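
The reason a Jacobian map measures local volume change is that the determinant of a transformation's Jacobian gives the local scaling factor: values below 1 indicate shrinkage, values above 1 indicate expansion. The sketch below illustrates this in 2D with finite differences. It is a toy example, not the registration pipeline from the paper; the function name, the step size `h`, and the example transformation are assumptions.

```python
def jacobian_determinant(transform, x, y, h=1e-4):
    """Estimate det(J) of a 2D map (x, y) -> (u, v) by central differences."""
    u_xp, v_xp = transform(x + h, y)
    u_xm, v_xm = transform(x - h, y)
    u_yp, v_yp = transform(x, y + h)
    u_ym, v_ym = transform(x, y - h)

    du_dx = (u_xp - u_xm) / (2 * h)
    dv_dx = (v_xp - v_xm) / (2 * h)
    du_dy = (u_yp - u_ym) / (2 * h)
    dv_dy = (v_yp - v_ym) / (2 * h)

    # det [[du/dx, du/dy], [dv/dx, dv/dy]] is the local area scaling factor.
    return du_dx * dv_dy - du_dy * dv_dx


if __name__ == "__main__":
    def shrink(x, y):
        # Toy transformation: uniform shrinkage by half along each axis.
        return 0.5 * x, 0.5 * y

    print(jacobian_determinant(shrink, 1.0, 2.0))
```

Evaluating the determinant at every voxel of a registered image yields the kind of map from which intensity and texture features can then be extracted.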

Face Aging with Contextual Generative Adversarial Nets


Title	Face Aging with Contextual Generative Adversarial Nets
Authors	Si Liu, Yao Sun, Defa Zhu, Renda Bao, Wei Wang, Xiangbo Shu, Shuicheng Yan
Abstract	Face aging, which renders aging faces for an input face, has attracted extensive attention in multimedia research. Recently, several conditional Generative Adversarial Nets (GANs) based methods have achieved great success. They can generate images fitting the real face distributions conditioned on each individual age group. However, these methods fail to capture the transition patterns, e.g., the gradual shape and texture changes between adjacent age groups. In this paper, we propose novel Contextual Generative Adversarial Nets (C-GANs) to specifically take these patterns into consideration. The C-GANs consist of a conditional transformation network and two discriminative networks. The conditional transformation network imitates the aging procedure with several specially designed residual blocks. The age discriminative network guides the synthesized face to fit the real conditional distribution. The transition pattern discriminative network is novel, aiming to distinguish the real transition patterns from the fake ones. It serves as an extra regularization term for the conditional transformation network, ensuring that the generated image pairs fit the corresponding real transition pattern distribution. Experimental results demonstrate that the proposed framework produces appealing results in comparison with the state of the art and the ground truth. We also observe a performance gain for cross-age face verification.
Tasks	Face Verification
Published	2018-02-01
URL	http://arxiv.org/abs/1802.00237v1
PDF	http://arxiv.org/pdf/1802.00237v1.pdf
PWC	https://paperswithcode.com/paper/face-aging-with-contextual-generative
Repo
Framework

PIMKL: Pathway Induced Multiple Kernel Learning


Title	PIMKL: Pathway Induced Multiple Kernel Learning
Authors	Matteo Manica, Joris Cadow, Roland Mathis, María Rodríguez Martínez
Abstract	Reliable identification of molecular biomarkers is essential for accurate patient stratification. While state-of-the-art machine learning approaches for sample classification continue to push boundaries in terms of performance, most of these methods are not able to integrate different data types and lack generalization power, limiting their application in a clinical setting. Furthermore, many methods behave as black boxes, and we have very little understanding about the mechanisms that lead to the prediction. While opaqueness concerning machine behaviour might not be a problem in deterministic domains, in health care, providing explanations about the molecular factors and phenotypes that are driving the classification is crucial to build trust in the performance of the predictive system. We propose Pathway Induced Multiple Kernel Learning (PIMKL), a novel methodology to reliably classify samples that can also help gain insights into the molecular mechanisms that underlie the classification. PIMKL exploits prior knowledge in the form of a molecular interaction network and annotated gene sets, by optimizing a mixture of pathway-induced kernels using a Multiple Kernel Learning (MKL) algorithm, an approach that has demonstrated excellent performance in different machine learning applications. After optimizing the combination of kernels for prediction of a specific phenotype, the model provides a stable molecular signature that can be interpreted in the light of the ingested prior knowledge and that can be used in transfer learning tasks.
Tasks	Transfer Learning
Published	2018-03-29
URL	http://arxiv.org/abs/1803.11274v3
PDF	http://arxiv.org/pdf/1803.11274v3.pdf
PWC	https://paperswithcode.com/paper/pimkl-pathway-induced-multiple-kernel
Repo
Framework
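
The "mixture of kernels" at the core of Multiple Kernel Learning is a non-negative weighted sum of Gram matrices, which is itself a valid kernel. The sketch below shows only that combination step with fixed weights. It does not reproduce how PIMKL builds pathway-induced kernels or optimizes the weights; the function name and the toy matrices are assumptions for the example.

```python
def mix_kernels(kernels, weights):
    """Combine Gram matrices with non-negative weights normalized to sum to 1."""
    if len(kernels) != len(weights):
        raise ValueError("need one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")

    size = len(kernels[0])
    return [
        [
            sum((w / total) * k[i][j] for w, k in zip(weights, kernels))
            for j in range(size)
        ]
        for i in range(size)
    ]


if __name__ == "__main__":
    k_first = [[1.0, 0.0], [0.0, 1.0]]   # toy Gram matrix for one gene set
    k_second = [[2.0, 2.0], [2.0, 2.0]]  # toy Gram matrix for another gene set
    print(mix_kernels([k_first, k_second], [1.0, 3.0]))
```

After training, the normalized weights indicate how much each kernel contributes to the prediction, which is the sense in which such a model can yield an interpretable signature.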