July 28, 2019


Paper Group ANR 404

A Review on Bilevel Optimization: From Classical to Evolutionary Approaches and Applications


Title	A Review on Bilevel Optimization: From Classical to Evolutionary Approaches and Applications
Authors	Ankur Sinha, Pekka Malo, Kalyanmoy Deb
Abstract	Bilevel optimization is defined as a mathematical program, where an optimization problem contains another optimization problem as a constraint. These problems have received significant attention from the mathematical programming community. Only limited work exists on bilevel problems using evolutionary computation techniques; however, recently there has been an increasing interest due to the proliferation of practical applications and the potential of evolutionary algorithms in tackling these problems. This paper provides a comprehensive review on bilevel optimization from the basic principles to solution strategies; both classical and evolutionary. A number of potential application problems are also discussed. To offer the readers insights on the prominent developments in the field of bilevel optimization, we have performed an automated text-analysis of an extended list of papers published on bilevel optimization to date. This paper should motivate evolutionary computation researchers to pay more attention to this practical yet challenging area.
Tasks	bilevel optimization
Published	2017-05-17
URL	http://arxiv.org/abs/1705.06270v1
PDF	http://arxiv.org/pdf/1705.06270v1.pdf
PWC	https://paperswithcode.com/paper/a-review-on-bilevel-optimization-from
Repo
Framework
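
The nested structure described in the abstract, where the upper-level problem is evaluated only at the lower-level optimum, can be sketched with a toy example. The two objectives, the grid, and all function names below are illustrative assumptions and do not come from the paper; nested grid search is used only because it makes the structure visible, not because it is a recommended solver.

```python
# Toy bilevel problem (objectives are illustrative assumptions, not from the paper).
# Lower level: for a given x, the follower picks y to minimize (y - x)^2 + y.
# Upper level: the leader picks x to minimize (x - 3)^2 + y^2,
#              where y is the follower's best response to x.

def frange(lo, hi, step):
    """Evenly spaced grid from lo to hi inclusive."""
    n = int(round((hi - lo) / step))
    return [lo + i * step for i in range(n + 1)]

def lower_level(x, grid):
    """Follower's best response: the grid point minimizing the lower objective."""
    return min(grid, key=lambda y: (y - x) ** 2 + y)

def upper_objective(x, y):
    return (x - 3.0) ** 2 + y ** 2

def solve_bilevel(grid):
    """Nested search: every upper-level candidate triggers a lower-level solve."""
    best = None
    for x in grid:
        y = lower_level(x, grid)          # the inner problem acts as a constraint
        value = upper_objective(x, y)
        if best is None or value < best[2]:
            best = (x, y, value)
    return best

grid = frange(0.0, 5.0, 0.25)
x_star, y_star, value = solve_bilevel(grid)
print(x_star, y_star, value)
```

For this toy problem the follower's response is y = x - 0.5 in closed form, so the leader's optimum is x = 1.75, y = 1.25, which the grid search recovers. The cost of one inner solve per outer candidate is the reason dedicated classical and evolutionary methods exist.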

Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts


Title	Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts
Authors	Chenhao Tan, Dallas Card, Noah A. Smith
Abstract	Understanding how ideas relate to each other is a fundamental question in many domains, ranging from intellectual history to public communication. Because ideas are naturally embedded in texts, we propose the first framework to systematically characterize the relations between ideas based on their occurrence in a corpus of documents, independent of how these ideas are represented. Combining two statistics — cooccurrence within documents and prevalence correlation over time — our approach reveals a number of different ways in which ideas can cooperate and compete. For instance, two ideas can closely track each other’s prevalence over time, and yet rarely cooccur, almost like a “cold war” scenario. We observe that pairwise cooccurrence and prevalence correlation exhibit different distributions. We further demonstrate that our approach is able to uncover intriguing relations between ideas through in-depth case studies on news articles and research papers.
Tasks
Published	2017-04-25
URL	http://arxiv.org/abs/1704.07828v2
PDF	http://arxiv.org/pdf/1704.07828v2.pdf
PWC	https://paperswithcode.com/paper/friendships-rivalries-and-trysts
Repo
Framework
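
The two statistics named in the abstract can be sketched as follows. The abstract does not say how each statistic is computed, so this sketch assumes pointwise mutual information for document-level cooccurrence and Pearson correlation for prevalence over time; the toy corpus and all function names are hypothetical.

```python
import math
from collections import defaultdict

def cooccurrence_pmi(docs, a, b):
    """PMI of ideas a and b appearing in the same document (assumed measure)."""
    n = len(docs)
    p_a = sum(a in ideas for _, ideas in docs) / n
    p_b = sum(b in ideas for _, ideas in docs) / n
    p_ab = sum(a in ideas and b in ideas for _, ideas in docs) / n
    if p_ab == 0:
        return float("-inf")              # the ideas never cooccur
    return math.log(p_ab / (p_a * p_b))

def prevalence_by_period(docs, idea):
    """Fraction of documents mentioning the idea in each time period."""
    totals, hits = defaultdict(int), defaultdict(int)
    for period, ideas in docs:
        totals[period] += 1
        hits[period] += idea in ideas
    return [hits[p] / totals[p] for p in sorted(totals)]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical corpus of (period, set of ideas) pairs.
docs = [
    (0, {"a"}), (0, {"b"}), (0, {"c"}), (0, {"c"}),
    (1, {"a"}), (1, {"a"}), (1, {"b"}), (1, {"b"}),
    (2, {"a"}), (2, {"b"}), (2, {"c"}), (2, {"c"}),
]
pmi_ab = cooccurrence_pmi(docs, "a", "b")
corr_ab = pearson(prevalence_by_period(docs, "a"), prevalence_by_period(docs, "b"))
```

In this toy corpus, ideas "a" and "b" rise and fall together over time but never share a document, which is the pattern the abstract compares to a "cold war".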

Bridging between Computer and Robot Vision through Data Augmentation: a Case Study on Object Recognition


Title	Bridging between Computer and Robot Vision through Data Augmentation: a Case Study on Object Recognition
Authors	Antonio D’Innocente, Fabio Maria Carlucci, Mirco Colosi, Barbara Caputo
Abstract	Despite the impressive progress brought by deep networks in visual object recognition, robot vision is still far from being a solved problem. The most successful convolutional architectures are developed starting from ImageNet, a large scale collection of images of object categories downloaded from the Web. Such images are very different from the situated and embodied visual experience of robots deployed in unconstrained settings. To reduce the gap between these two visual experiences, this paper proposes a simple yet effective data augmentation layer that zooms on the object of interest and simulates the object detection outcome of a robot vision system. The layer, which can be used with any convolutional deep architecture, increases object recognition performance by up to 7% in experiments performed over three different benchmark databases. Upon acceptance of the paper, our robot data augmentation layer will be made publicly available.
Tasks	Data Augmentation, Object Detection, Object Recognition
Published	2017-05-05
URL	http://arxiv.org/abs/1705.02139v1
PDF	http://arxiv.org/pdf/1705.02139v1.pdf
PWC	https://paperswithcode.com/paper/bridging-between-computer-and-robot-vision
Repo
Framework

A Classifying Variational Autoencoder with Application to Polyphonic Music Generation


Title	A Classifying Variational Autoencoder with Application to Polyphonic Music Generation
Authors	Jay A. Hennig, Akash Umakantha, Ryan C. Williamson
Abstract	The variational autoencoder (VAE) is a popular probabilistic generative model. However, one shortcoming of VAEs is that the latent variables cannot be discrete, which makes it difficult to generate data from different modes of a distribution. Here, we propose an extension of the VAE framework that incorporates a classifier to infer the discrete class of the modeled data. To model sequential data, we can combine our Classifying VAE with a recurrent neural network such as an LSTM. We apply this model to algorithmic music generation, where our model learns to generate musical sequences in different keys. Most previous work in this area avoids modeling key by transposing data into only one or two keys, as opposed to the 10+ different keys in the original music. We show that our Classifying VAE and Classifying VAE+LSTM models outperform the corresponding non-classifying models in generating musical samples that stay in key. This benefit is especially apparent when trained on untransposed music data in the original keys.
Tasks	Music Generation
Published	2017-11-19
URL	http://arxiv.org/abs/1711.07050v1
PDF	http://arxiv.org/pdf/1711.07050v1.pdf
PWC	https://paperswithcode.com/paper/a-classifying-variational-autoencoder-with
Repo
Framework

Speaker Diarization using Deep Recurrent Convolutional Neural Networks for Speaker Embeddings


Title	Speaker Diarization using Deep Recurrent Convolutional Neural Networks for Speaker Embeddings
Authors	Pawel Cyrta, Tomasz Trzciński, Wojciech Stokowiec
Abstract	In this paper we propose a new method of speaker diarization that employs a deep learning architecture to learn speaker embeddings. In contrast to the traditional approaches that build their speaker embeddings using manually hand-crafted spectral features, we propose to train for this purpose a recurrent convolutional neural network applied directly on magnitude spectrograms. To compare our approach with the state of the art, we collect and release for the public an additional dataset of over 6 hours of fully annotated broadcast material. The results of our evaluation on the new dataset and three other benchmark datasets show that our proposed method significantly outperforms the competitors and reduces diarization error rate by a large margin of over 30% with respect to the baseline.
Tasks	Speaker Diarization
Published	2017-08-09
URL	http://arxiv.org/abs/1708.02840v2
PDF	http://arxiv.org/pdf/1708.02840v2.pdf
PWC	https://paperswithcode.com/paper/speaker-diarization-using-deep-recurrent
Repo
Framework

Towards balanced clustering - part 1 (preliminaries)


Title	Towards balanced clustering - part 1 (preliminaries)
Authors	Mark Sh. Levin
Abstract	The article contains a preliminary glance at balanced clustering problems. Basic balanced structures and combinatorial balanced problems are briefly described. Special attention is given to various balance/unbalance indices (including some new versions of the indices): by cluster cardinality, by cluster weights, by inter-cluster edge/arc weights, by cluster element structure (for element multi-type clustering). Further, versions of optimization clustering problems are suggested (including multicriteria problem formulations). Illustrative numerical examples describe the calculation of balance indices and element multi-type balance clustering problems (including an example for the design of student teams).
Tasks
Published	2017-06-09
URL	http://arxiv.org/abs/1706.03065v1
PDF	http://arxiv.org/pdf/1706.03065v1.pdf
PWC	https://paperswithcode.com/paper/towards-balanced-clustering-part-1
Repo
Framework

Augmented Robust PCA For Foreground-Background Separation on Noisy, Moving Camera Video


Title	Augmented Robust PCA For Foreground-Background Separation on Noisy, Moving Camera Video
Authors	Chen Gao, Brian E. Moore, Raj Rao Nadakuditi
Abstract	This work presents a novel approach for robust PCA with total variation regularization for foreground-background separation and denoising on noisy, moving camera video. Our proposed algorithm registers the raw (possibly corrupted) frames of a video and then jointly processes the registered frames to produce a decomposition of the scene into a low-rank background component that captures the static components of the scene, a smooth foreground component that captures the dynamic components of the scene, and a sparse component that can isolate corruptions and other non-idealities. Unlike existing methods, our proposed algorithm produces a panoramic low-rank component that spans the entire field of view, automatically stitching together corrupted data from partially overlapping scenes. The low-rank portion of our robust PCA model is based on a recently discovered optimal low-rank matrix estimator (OptShrink) that requires no parameter tuning. We demonstrate the performance of our algorithm on both static and moving camera videos corrupted by noise and outliers.
Tasks	Denoising
Published	2017-09-27
URL	http://arxiv.org/abs/1709.09328v1
PDF	http://arxiv.org/pdf/1709.09328v1.pdf
PWC	https://paperswithcode.com/paper/augmented-robust-pca-for-foreground
Repo
Framework

A Simple Language Model based on PMI Matrix Approximations


Title	A Simple Language Model based on PMI Matrix Approximations
Authors	Oren Melamud, Ido Dagan, Jacob Goldberger
Abstract	In this study, we introduce a new approach for learning language models by training them to estimate word-context pointwise mutual information (PMI), and then deriving the desired conditional probabilities from PMI at test time. Specifically, we show that with minor modifications to word2vec’s algorithm, we get principled language models that are closely related to the well-established Noise Contrastive Estimation (NCE) based language models. A compelling aspect of our approach is that our models are trained with the same simple negative sampling objective function that is commonly used in word2vec to learn word embeddings.
Tasks	Language Modelling, Word Embeddings
Published	2017-07-17
URL	http://arxiv.org/abs/1707.05266v1
PDF	http://arxiv.org/pdf/1707.05266v1.pdf
PWC	https://paperswithcode.com/paper/a-simple-language-model-based-on-pmi-matrix
Repo
Framework
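
The step of deriving conditional probabilities from PMI follows from the definition PMI(w, c) = log(p(w | c) / p(w)), which rearranges to p(w | c) = p(w) · exp(PMI(w, c)). A minimal sketch of that step is given below, assuming a model has already produced PMI estimates for one fixed context. The function name, the toy vocabulary, and the renormalization over the vocabulary are assumptions made for illustration, not details taken from the abstract.

```python
import math

def conditional_from_pmi(unigram, pmi_scores):
    """Turn PMI estimates for a fixed context into p(word | context).

    unigram:    dict word -> p(word)
    pmi_scores: dict word -> estimated PMI(word, context)
    Renormalizing guards against estimates that do not sum exactly to one.
    """
    unnorm = {w: unigram[w] * math.exp(pmi_scores[w]) for w in unigram}
    z = sum(unnorm.values())
    return {w: v / z for w, v in unnorm.items()}

# Hypothetical three-word vocabulary and a known conditional distribution.
unigram = {"cat": 0.5, "dog": 0.25, "fish": 0.25}
true_cond = {"cat": 0.25, "dog": 0.5, "fish": 0.25}

# Exact PMI values stand in here for what a trained model would estimate.
pmi = {w: math.log(true_cond[w] / unigram[w]) for w in unigram}
recovered = conditional_from_pmi(unigram, pmi)
```

With exact PMI values the original conditional distribution is recovered; with learned estimates the renormalization keeps the output a valid distribution.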

Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration


Title	Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration
Authors	Jeng-Hau Lin, Tianwei Xing, Ritchie Zhao, Zhiru Zhang, Mani Srivastava, Zhuowen Tu, Rajesh K. Gupta
Abstract	State-of-the-art convolutional neural networks are enormously costly in both compute and memory, demanding massively parallel GPUs for execution. Such networks strain the computational capabilities and energy available to embedded and mobile processing platforms, restricting their use in many important applications. In this paper, we push the boundaries of hardware-effective CNN design by proposing BCNN with Separable Filters (BCNNw/SF), which applies Singular Value Decomposition (SVD) on BCNN kernels to further reduce computational and storage complexity. To enable its implementation, we provide a closed form of the gradient over SVD to calculate the exact gradient with respect to every binarized weight in backward propagation. We verify BCNNw/SF on the MNIST, CIFAR-10, and SVHN datasets, and implement an accelerator for CIFAR-10 on FPGA hardware. Our BCNNw/SF accelerator realizes memory savings of 17% and execution time reduction of 31.3% compared to BCNN with only minor accuracy sacrifices.
Tasks
Published	2017-07-15
URL	http://arxiv.org/abs/1707.04693v1
PDF	http://arxiv.org/pdf/1707.04693v1.pdf
PWC	https://paperswithcode.com/paper/binarized-convolutional-neural-networks-with
Repo
Framework

Predicting the Popularity of Online Videos via Deep Neural Networks


Title	Predicting the Popularity of Online Videos via Deep Neural Networks
Authors	Yue Mao, Yi Shen, Gang Qin, Longjun Cai
Abstract	Predicting the popularity of online videos is important for video streaming content providers. This is a challenging problem for two reasons. First, the problem is both “wide” and “deep”. That is, it not only depends on a wide range of features, but is also highly non-linear and complex. Second, multiple competitors may be involved. In this paper, we propose a general prediction model using the multi-task learning (MTL) module and the relation network (RN) module, where MTL can reduce over-fitting and RN can model the relations of multiple competitors. Experimental results show that our proposed approach, with the RN and MTL modules, significantly increases the accuracy of predicting the total view counts of TV series.
Tasks	Multi-Task Learning
Published	2017-11-29
URL	http://arxiv.org/abs/1711.10718v2
PDF	http://arxiv.org/pdf/1711.10718v2.pdf
PWC	https://paperswithcode.com/paper/predicting-the-popularity-of-online-videos
Repo
Framework

Skin Lesion Classification Using Hybrid Deep Neural Networks


Title	Skin Lesion Classification Using Hybrid Deep Neural Networks
Authors	Amirreza Mahbod, Gerald Schaefer, Chunliang Wang, Rupert Ecker, Isabella Ellinger
Abstract	Skin cancer is one of the major types of cancers with an increasing incidence over the past decades. Accurately diagnosing skin lesions to discriminate between benign and malignant skin lesions is crucial to ensure appropriate patient treatment. While there are many computerised methods for skin lesion classification, convolutional neural networks (CNNs) have been shown to be superior over classical methods. In this work, we propose a fully automatic computerised method for skin lesion classification which employs optimised deep features from a number of well-established CNNs and from different abstraction levels. We use three pre-trained deep models, namely AlexNet, VGG16 and ResNet-18, as deep feature generators. The extracted features then are used to train support vector machine classifiers. In the final stage, the classifier outputs are fused to obtain a classification. Evaluated on the 150 validation images from the ISIC 2017 classification challenge, the proposed method is shown to achieve very good classification performance, yielding an area under receiver operating characteristic curve of 83.83% for melanoma classification and of 97.55% for seborrheic keratosis classification.
Tasks	Object Detection, Skin Lesion Classification
Published	2017-02-27
URL	http://arxiv.org/abs/1702.08434v2
PDF	http://arxiv.org/pdf/1702.08434v2.pdf
PWC	https://paperswithcode.com/paper/skin-lesion-classification-using-hybrid-deep
Repo
Framework

Improved Stability of Whole Brain Surface Parcellation with Multi-Atlas Segmentation


Title	Improved Stability of Whole Brain Surface Parcellation with Multi-Atlas Segmentation
Authors	Yuankai Huo, Shunxing Bao, Prasanna Parvathaneni, Bennett A. Landman
Abstract	Whole brain segmentation and cortical surface parcellation are essential in understanding the anatomical-functional relationships of the brain. Multi-atlas segmentation has been regarded as one of the leading segmentation methods for whole brain segmentation. In our recent work, the multi-atlas technique has been adapted to surface reconstruction using a method called Multi-atlas CRUISE (MaCRUISE). The MaCRUISE method not only performed consistent volume-surface analyses but also showed advantages in robustness compared with the FreeSurfer method. However, a detailed surface parcellation was not provided by MaCRUISE, which hindered the region of interest (ROI) based analyses on surfaces. Herein, the MaCRUISE surface parcellation (MaCRUISEsp) method is proposed to perform the surface parcellation upon the inner, central and outer surfaces that are reconstructed from MaCRUISE. MaCRUISEsp parcellates inner, central and outer surfaces with 98 cortical labels respectively using a volume segmentation based surface parcellation (VSBSP), following a topological correction step. To validate the performance of MaCRUISEsp, 21 scan-rescan magnetic resonance imaging (MRI) T1 volume pairs from the Kirby21 dataset were used to perform a reproducibility analysis. MaCRUISEsp achieved 0.948 on median Dice Similarity Coefficient (DSC) for central surfaces. Meanwhile, FreeSurfer achieved 0.905 DSC for inner surfaces and 0.881 DSC for outer surfaces, while the proposed method achieved 0.929 DSC for inner surfaces and 0.835 DSC for outer surfaces. Qualitatively, the results are encouraging, but are not directly comparable as the two approaches use different definitions of cortical labels.
Tasks	Brain Segmentation
Published	2017-12-02
URL	http://arxiv.org/abs/1712.00543v1
PDF	http://arxiv.org/pdf/1712.00543v1.pdf
PWC	https://paperswithcode.com/paper/improved-stability-of-whole-brain-surface
Repo
Framework
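
The Dice Similarity Coefficient reported in the abstract is defined for two sets A and B as DSC = 2|A ∩ B| / (|A| + |B|). A minimal sketch of a scan-rescan comparison is given below, assuming each surface is represented as a list of per-vertex labels with vertex correspondence between the two scans. That representation, the per-label median, and the function names are assumptions; the abstract does not describe how the surfaces are compared.

```python
from statistics import median

def dice(labels_a, labels_b, label):
    """Dice overlap of the vertex sets carrying `label` in two labelings."""
    in_a = {i for i, l in enumerate(labels_a) if l == label}
    in_b = {i for i, l in enumerate(labels_b) if l == label}
    if not in_a and not in_b:
        return 1.0                        # label absent in both: treat as agreement
    return 2 * len(in_a & in_b) / (len(in_a) + len(in_b))

def median_dice(labels_a, labels_b):
    """Median of the per-label Dice scores over all labels present."""
    labels = sorted(set(labels_a) | set(labels_b))
    return median([dice(labels_a, labels_b, l) for l in labels])

# Hypothetical six-vertex surface labeled in a scan and its rescan.
scan = [1, 1, 1, 2, 2, 3]
rescan = [1, 1, 2, 2, 2, 3]
score = median_dice(scan, rescan)
```

Here one vertex changes from label 1 to label 2 between scans, giving a Dice score of 0.8 for each of those labels and 1.0 for the unaffected label 3.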

A Compressive Sensing Approach to Community Detection with Applications


Title	A Compressive Sensing Approach to Community Detection with Applications
Authors	Ming-Jun Lai, Daniel Mckenzie
Abstract	The community detection problem for graphs asks one to partition the n vertices V of a graph G into k communities, or clusters, such that there are many intracluster edges and few intercluster edges. Of course this is equivalent to finding a permutation matrix P such that, if A denotes the adjacency matrix of G, then PAP^T is approximately block diagonal. As there are k^n possible partitions of n vertices into k subsets, directly determining the optimal clustering is clearly infeasible. Instead one seeks to solve a more tractable approximation to the clustering problem. In this paper we reformulate the community detection problem via sparse solution of a linear system associated with the Laplacian of a graph G and then develop a two-stage approach based on a thresholding technique and a compressive sensing algorithm to find a sparse solution which corresponds to the community containing a vertex of interest in G. Crucially, our approach results in an algorithm which is able to find a single cluster of size n_0 in O(nlog(n)n_0) operations and all k clusters in fewer than O(n^2ln(n)) operations. This is a marked improvement over the classic spectral clustering algorithm, which is unable to find a single cluster at a time and takes approximately O(n^3) operations to find all k clusters. Moreover, we are able to provide robust guarantees of success for the case where G is drawn at random from the Stochastic Block Model, a popular model for graphs with clusters. Extensive numerical results are also provided, showing the efficacy of our algorithm on both synthetic and real-world data sets.
Tasks	Community Detection, Compressive Sensing
Published	2017-08-30
URL	http://arxiv.org/abs/1708.09477v3
PDF	http://arxiv.org/pdf/1708.09477v3.pdf
PWC	https://paperswithcode.com/paper/a-compressive-sensing-approach-to-community
Repo
Framework
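
The block-diagonal view of clustering mentioned in the abstract can be illustrated with a small sketch. Given a cluster assignment, reordering the vertices so that cluster members are adjacent corresponds to forming PAP^T, and a good assignment leaves few entries outside the diagonal blocks. This only illustrates the problem statement; it is not the compressive sensing algorithm from the paper, and the toy graph and function names are hypothetical.

```python
def permute_by_cluster(adj, assignment):
    """Reorder rows and columns so vertices of the same cluster are contiguous.

    Equivalent to P A P^T for the permutation that sorts vertices by cluster.
    """
    order = sorted(range(len(adj)), key=lambda v: assignment[v])
    return [[adj[i][j] for j in order] for i in order]

def count_cut_edges(adj, assignment):
    """Number of undirected edges whose endpoints lie in different clusters."""
    n = len(adj)
    return sum(
        adj[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if assignment[i] != assignment[j]
    )

# Hypothetical 4-vertex graph with edges 0-2 and 1-3.
adj = [
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
assignment = [0, 1, 0, 1]                 # communities {0, 2} and {1, 3}

blocked = permute_by_cluster(adj, assignment)
cut = count_cut_edges(adj, assignment)

# Number of ways to label n vertices with k cluster ids, as in the abstract.
n_labelings = 2 ** len(adj)
```

Even for this 4-vertex graph with k = 2 there are 16 labelings to search exhaustively, and the count k^n grows exponentially in n, which is why tractable approximations are sought.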

Information-Propogation-Enhanced Neural Machine Translation by Relation Model


Title	Information-Propogation-Enhanced Neural Machine Translation by Relation Model
Authors	Wen Zhang, Jiawei Hu, Yang Feng, Qun Liu
Abstract	Even though sequence-to-sequence neural machine translation (NMT) models have achieved state-of-the-art performance in recent years, it is a widespread concern that recurrent neural network (RNN) units struggle to capture long-distance state information, which means an RNN can hardly find features with long-term dependencies as the sequence becomes longer. Similarly, convolutional neural networks (CNNs) have recently been introduced into NMT to improve speed; however, CNNs focus on capturing local features of the sequence. To relieve this issue, we incorporate a relation network into the standard encoder-decoder framework to enhance information propagation in the neural network, ensuring that the information of the source sentence can flow into the decoder adequately. Experiments show that the proposed framework significantly outperforms the statistical MT model and the state-of-the-art NMT model on two data sets with different scales.
Tasks	Machine Translation
Published	2017-09-06
URL	http://arxiv.org/abs/1709.01766v3
PDF	http://arxiv.org/pdf/1709.01766v3.pdf
PWC	https://paperswithcode.com/paper/information-propogation-enhanced-neural
Repo
Framework

4-DoF Tracking for Robot Fine Manipulation Tasks


Title	4-DoF Tracking for Robot Fine Manipulation Tasks
Authors	Mennatullah Siam, Abhineet Singh, Camilo Perez, Martin Jagersand
Abstract	This paper presents two visual trackers from the different paradigms of learning and registration based tracking and evaluates their application in image based visual servoing. They can track object motion with four degrees of freedom (DoF) which, as we will show here, is sufficient for many fine manipulation tasks. One of these trackers is a newly developed learning based tracker that relies on learning discriminative correlation filters while the other is a refinement of a recent 8 DoF RANSAC based tracker adapted with a new appearance model for tracking 4 DoF motion. Both trackers are shown to provide superior performance to several state of the art trackers on an existing dataset for manipulation tasks. Further, a new dataset with challenging sequences for fine manipulation tasks captured from robot mounted eye-in-hand (EIH) cameras is also presented. These sequences have a variety of challenges encountered during real tasks including jittery camera movement, motion blur, drastic scale changes and partial occlusions. Quantitative and qualitative results on these sequences are used to show that these two trackers are robust to failures while providing high precision that makes them suitable for such fine manipulation tasks.
Tasks
Published	2017-03-06
URL	http://arxiv.org/abs/1703.01698v2
PDF	http://arxiv.org/pdf/1703.01698v2.pdf
PWC	https://paperswithcode.com/paper/4-dof-tracking-for-robot-fine-manipulation
Repo
Framework