October 18, 2019

Paper Group ANR 516

Utilizing Imbalanced Data and Classification Cost Matrix to Predict Movie Preferences. Hierarchical community detection by recursive partitioning. Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis. Neural Classification of Malicious Scripts: A study with JavaScript and VBScript. DYAN: A Dynamical Atoms-Based Net …

Utilizing Imbalanced Data and Classification Cost Matrix to Predict Movie Preferences


Title	Utilizing Imbalanced Data and Classification Cost Matrix to Predict Movie Preferences
Authors	Haifeng Wang
Abstract	In this paper, we propose a movie genre recommendation system based on imbalanced survey data and unequal classification costs for small and medium-sized enterprises (SMEs) that need a data-based and analytical approach to stock favored movies and target marketing to young people. The dataset maintains a detailed personal profile as predictors, including demographic, behavioral, and preference information for each user, as well as imbalanced genre preferences. These predictors do not include information such as actors or directors. The paper applies GentleBoost, AdaBoost, and bagged tree ensembles as well as SVM machine learning algorithms to learn classification from one thousand observations and predict movie genre preferences with adjusted classification costs. The proposed recommendation system also selects important predictors to avoid overfitting and to shorten training time. This paper compares the test error among the above-mentioned algorithms that are used to recommend different movie genres. The prediction power is also indicated in a comparison of precision and recall with other state-of-the-art recommendation systems. The proposed movie genre recommendation system addresses problems such as a small dataset, an imbalanced response, and unequal classification costs.
Tasks	Movie Genre Recommendation System, Recommendation Systems
Published	2018-12-04
URL	http://arxiv.org/abs/1812.02529v1
PDF	http://arxiv.org/pdf/1812.02529v1.pdf
PWC	https://paperswithcode.com/paper/utilizing-imbalanced-data-and-classification
Repo
Framework
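
The abstract's "adjusted classification costs" can be illustrated with the standard minimum-expected-cost decision rule. The sketch below is not the author's implementation; the function name, the two-class setup, and the cost values are assumptions chosen for illustration.

```python
def min_expected_cost_class(probs, cost):
    """Return (best_class, expected_costs) under a misclassification cost matrix.

    probs[i]   : estimated probability that the true class is i
    cost[i][j] : cost of predicting class j when the true class is i
    """
    n_classes = len(probs)
    # Expected cost of predicting j, averaged over the possible true classes i.
    expected = [
        sum(probs[i] * cost[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    best = min(range(n_classes), key=expected.__getitem__)
    return best, expected


# Hypothetical example: class 1 is the rare genre preference, and missing it
# (true 1, predicted 0) is five times as costly as a false alarm.
probs = [0.7, 0.3]
cost = [[0.0, 1.0],
        [5.0, 0.0]]
label, expected = min_expected_cost_class(probs, cost)
```

With equal off-diagonal costs the rule reduces to picking the most probable class; raising the cost of missing the minority class shifts the decision toward it even when its probability is below 0.5.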

Hierarchical community detection by recursive partitioning


Title	Hierarchical community detection by recursive partitioning
Authors	Tianxi Li, Lihua Lei, Sharmodeep Bhattacharyya, Purnamrita Sarkar, Peter J. Bickel, Elizaveta Levina
Abstract	The problem of community detection in networks is usually formulated as finding a single partition of the network into some “correct” number of communities. We argue that it is more interpretable and in some regimes more accurate to construct a hierarchical tree of communities instead. This can be done with a simple top-down recursive partitioning algorithm, starting with a single community and separating the nodes into two communities by spectral clustering repeatedly, until a stopping rule suggests there are no further communities. This class of algorithms is model-free, computationally efficient, and requires no tuning other than selecting a stopping rule. We show that there are regimes where this approach outperforms K-way spectral clustering, and propose a natural framework for analyzing the algorithm’s theoretical performance, the binary tree stochastic block model. Under this model, we prove that the algorithm correctly recovers the entire community tree under relatively mild assumptions. We also apply the algorithm to a dataset of statistics papers to construct a hierarchical tree of statistical research communities.
Tasks	Community Detection
Published	2018-10-02
URL	https://arxiv.org/abs/1810.01509v5
PDF	https://arxiv.org/pdf/1810.01509v5.pdf
PWC	https://paperswithcode.com/paper/hierarchical-community-detection-by-recursive
Repo
Framework
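
The top-down procedure described in the abstract (split into two by spectral clustering, recurse, stop when a rule says so) can be sketched as follows. This is a minimal sketch under stated assumptions: it splits on the sign of the Fiedler vector of the unnormalized Laplacian, and the stopping rule (`stop_ratio`, `min_size`) is a hypothetical placeholder, not the rule proposed in the paper.

```python
import numpy as np


def recursive_bipartition(adj, nodes=None, min_size=2, stop_ratio=0.5):
    """Return the leaf communities as a list of sorted node-index lists."""
    if nodes is None:
        nodes = list(range(adj.shape[0]))
    n = len(nodes)
    if n < 2 * min_size:
        return [sorted(nodes)]

    sub = adj[np.ix_(nodes, nodes)]
    laplacian = np.diag(sub.sum(axis=1)) - sub
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order

    # Hypothetical stopping rule: a large second eigenvalue relative to the
    # subgraph size means the subgraph is well connected, so stop splitting.
    if eigvals[1] >= stop_ratio * n:
        return [sorted(nodes)]

    # Split on the sign of the eigenvector of the second-smallest eigenvalue.
    fiedler = eigvecs[:, 1]
    left = [v for v, x in zip(nodes, fiedler) if x < 0]
    right = [v for v, x in zip(nodes, fiedler) if x >= 0]
    if len(left) < min_size or len(right) < min_size:
        return [sorted(nodes)]

    return (recursive_bipartition(adj, left, min_size, stop_ratio)
            + recursive_bipartition(adj, right, min_size, stop_ratio))


# Toy graph: two 4-node cliques joined by the single edge (3, 4).
adj = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                adj[i, j] = 1.0
adj[3, 4] = adj[4, 3] = 1.0
communities = sorted(recursive_bipartition(adj))
```

On the toy graph the first split separates the two cliques, and the stopping rule then halts because each clique is densely connected. The sequence of splits itself forms the community tree; only the leaves are returned here to keep the sketch short.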

Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis


Title	Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis
Authors	Gustav Eje Henter, Jaime Lorenzo-Trueba, Xin Wang, Junichi Yamagishi
Abstract	Generating versatile and appropriate synthetic speech requires control over the output expression separate from the spoken text. Important non-textual speech variation is seldom annotated, in which case output control must be learned in an unsupervised fashion. In this paper, we perform an in-depth study of methods for unsupervised learning of control in statistical speech synthesis. For example, we show that popular unsupervised training heuristics can be interpreted as variational inference in certain autoencoder models. We additionally connect these models to VQ-VAEs, another, recently-proposed class of deep variational autoencoders, which we show can be derived from a very similar mathematical argument. The implications of these new probabilistic interpretations are discussed. We illustrate the utility of the various approaches with an application to acoustic modelling for emotional speech synthesis, where the unsupervised methods for learning expression control (without access to emotional labels) are found to give results that in many aspects match or surpass the previous best supervised approach.
Tasks	Acoustic Modelling, Speech Synthesis
Published	2018-07-30
URL	http://arxiv.org/abs/1807.11470v3
PDF	http://arxiv.org/pdf/1807.11470v3.pdf
PWC	https://paperswithcode.com/paper/deep-encoder-decoder-models-for-unsupervised
Repo
Framework

Neural Classification of Malicious Scripts: A study with JavaScript and VBScript


Title	Neural Classification of Malicious Scripts: A study with JavaScript and VBScript
Authors	Jack W. Stokes, Rakshit Agrawal, Geoff McDonald
Abstract	Malicious scripts are an important computer infection threat vector. Our analysis reveals that the two most prevalent types of malicious scripts are JavaScript and VBScript. The percentage of detected JavaScript attacks is on the rise. To address these threats, we investigate two deep recurrent models, LaMP (LSTM and Max Pooling) and CPoLS (Convoluted Partitioning of Long Sequences), which process JavaScript and VBScript as byte sequences. Lower layers capture the sequential nature of these byte sequences while higher layers classify the resulting embedding as malicious or benign. Unlike previously proposed solutions, our models are trained in an end-to-end fashion allowing discriminative training even for the sequential processing layers. Evaluating these models on a large corpus of 296,274 JavaScript files indicates that the best performing LaMP model has a 65.9% true positive rate (TPR) at a false positive rate (FPR) of 1.0%. Similarly, the best CPoLS model has a TPR of 45.3% at an FPR of 1.0%. LaMP and CPoLS yield a TPR of 69.3% and 67.9%, respectively, at an FPR of 1.0% on a collection of 240,504 VBScript files.
Tasks
Published	2018-05-15
URL	http://arxiv.org/abs/1805.05603v1
PDF	http://arxiv.org/pdf/1805.05603v1.pdf
PWC	https://paperswithcode.com/paper/neural-classification-of-malicious-scripts-a
Repo
Framework
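
The results above are reported as a true positive rate at a fixed false positive rate. A minimal sketch of how such a figure can be computed from classifier scores is shown below; the function name and the toy scores are assumptions for illustration and are unrelated to the paper's data.

```python
def tpr_at_fpr(scores, labels, max_fpr):
    """Best true positive rate achievable with false positive rate <= max_fpr.

    scores : higher means "more likely malicious"
    labels : 1 for malicious, 0 for benign
    A file is flagged when its score is >= the chosen threshold.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one malicious and one benign example")

    # Raising the threshold can only lower both rates, so the first (lowest)
    # threshold that satisfies the FPR budget gives the highest TPR.
    for threshold in sorted(set(scores)):
        fpr = sum(s >= threshold for s in negatives) / len(negatives)
        if fpr <= max_fpr:
            return sum(s >= threshold for s in positives) / len(positives)
    return 0.0  # only a threshold above every score meets the budget


# Hypothetical toy scores: four malicious and four benign files.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
tpr_loose = tpr_at_fpr(scores, labels, max_fpr=0.25)
tpr_strict = tpr_at_fpr(scores, labels, max_fpr=0.0)
```

Tightening the false positive budget forces a higher threshold, which is why the strict setting recovers fewer malicious files than the loose one.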

DYAN: A Dynamical Atoms-Based Network for Video Prediction


Title	DYAN: A Dynamical Atoms-Based Network for Video Prediction
Authors	Wenqian Liu, Abhishek Sharma, Octavia Camps, Mario Sznaier
Abstract	The ability to anticipate the future is essential when making real time critical decisions, provides valuable information to understand dynamic natural scenes, and can help unsupervised video representation learning. State-of-the-art video prediction is based on LSTM recursive networks and/or generative adversarial network learning. These are complex architectures that need to learn large numbers of parameters, are potentially hard to train, slow to run, and may produce blurry predictions. In this paper, we introduce DYAN, a novel network that has very few parameters and is easy to train, which produces accurate, high quality frame predictions, significantly faster than previous approaches. DYAN owes its good qualities to its encoder and decoder, which are designed following concepts from systems identification theory and exploit the dynamics-based invariants of the data. Extensive experiments using several standard video datasets show that DYAN is superior at generating frames and that it generalizes well across domains.
Tasks	Representation Learning, Video Prediction
Published	2018-03-20
URL	http://arxiv.org/abs/1803.07201v2
PDF	http://arxiv.org/pdf/1803.07201v2.pdf
PWC	https://paperswithcode.com/paper/dyan-a-dynamical-atoms-based-network-for
Repo
Framework

Person Search in Videos with One Portrait Through Visual and Temporal Links


Title	Person Search in Videos with One Portrait Through Visual and Temporal Links
Authors	Qingqiu Huang, Wentao Liu, Dahua Lin
Abstract	In real-world applications, e.g. law enforcement and video retrieval, one often needs to search for a certain person in long videos with just one portrait. This is much more challenging than the conventional settings for person re-identification, as the search may need to be carried out in environments different from where the portrait was taken. In this paper, we aim to tackle this challenge and propose a novel framework, which takes into account the identity invariance along a tracklet, thus allowing person identities to be propagated via both the visual and the temporal links. We also develop a novel scheme called Progressive Propagation via Competitive Consensus, which significantly improves the reliability of the propagation process. To promote the study of person search, we construct a large-scale benchmark, which contains 127K manually annotated tracklets from 192 movies. Experiments show that our approach remarkably outperforms mainstream person re-id methods, raising the mAP from 42.16% to 62.27%.
Tasks	Person Re-Identification, Person Search, Video Retrieval
Published	2018-07-27
URL	http://arxiv.org/abs/1807.10510v1
PDF	http://arxiv.org/pdf/1807.10510v1.pdf
PWC	https://paperswithcode.com/paper/person-search-in-videos-with-one-portrait
Repo
Framework

On the Rates of Convergence from Surrogate Risk Minimizers to the Bayes Optimal Classifier


Title	On the Rates of Convergence from Surrogate Risk Minimizers to the Bayes Optimal Classifier
Authors	Jingwei Zhang, Tongliang Liu, Dacheng Tao
Abstract	We study the rates of convergence from empirical surrogate risk minimizers to the Bayes optimal classifier. Specifically, we introduce the notion of consistency intensity to characterize a surrogate loss function and exploit this notion to obtain the rate of convergence from an empirical surrogate risk minimizer to the Bayes optimal classifier, enabling fair comparisons of the excess risks of different surrogate risk minimizers. The main result of the paper has practical implications including (1) showing that hinge loss is superior to logistic and exponential loss in the sense that its empirical minimizer converges faster to the Bayes optimal classifier and (2) guiding the modification of surrogate loss functions to accelerate the convergence to the Bayes optimal classifier.
Tasks
Published	2018-02-11
URL	http://arxiv.org/abs/1802.03688v1
PDF	http://arxiv.org/pdf/1802.03688v1.pdf
PWC	https://paperswithcode.com/paper/on-the-rates-of-convergence-from-surrogate
Repo
Framework
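
For reference, the three surrogate losses compared in the abstract have the standard forms below, written as functions of the margin m = y * f(x) with labels y in {-1, +1}. This is only a sketch of the well-known definitions; it does not implement the paper's consistency intensity or its convergence analysis.

```python
import math


def zero_one_loss(margin):
    """The target loss: 1 for a misclassification (margin <= 0), else 0."""
    return 1.0 if margin <= 0 else 0.0


def hinge_loss(margin):
    """Hinge loss, used by support vector machines."""
    return max(0.0, 1.0 - margin)


def logistic_loss(margin):
    """Logistic loss, used by logistic regression."""
    return math.log1p(math.exp(-margin))


def exponential_loss(margin):
    """Exponential loss, used by AdaBoost."""
    return math.exp(-margin)


# Evaluate each surrogate on a small grid of margins for side-by-side comparison.
margins = [-1.0, 0.0, 0.5, 1.0, 2.0]
table = [
    (m, hinge_loss(m), logistic_loss(m), exponential_loss(m))
    for m in margins
]
```

Hinge loss is exactly zero once the margin reaches 1, whereas the logistic and exponential losses stay positive for every finite margin.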

Improving Predictive Uncertainty Estimation using Dropout – Hamiltonian Monte Carlo


Title	Improving Predictive Uncertainty Estimation using Dropout – Hamiltonian Monte Carlo
Authors	Diego Vergara, Sergio Hernández, Matias Valdenegro-Toro, Felipe Jorquera
Abstract	Estimating predictive uncertainty is crucial for many computer vision tasks, from image classification to autonomous driving systems. Hamiltonian Monte Carlo (HMC) is a sampling method for performing Bayesian inference. On the other hand, Dropout regularization has been proposed as an approximate model averaging technique that tends to improve generalization in large scale models such as deep neural networks. Although HMC provides convergence guarantees for most standard Bayesian models, it does not handle discrete parameters arising from Dropout regularization. In this paper, we present a robust methodology for improving predictive uncertainty in classification problems, based on Dropout and Hamiltonian Monte Carlo. Even though Dropout induces a non-smooth energy function with no such convergence guarantees, the resulting discretization of the Hamiltonian proves empirically successful. The proposed method makes it possible to effectively estimate the predictive accuracy and to provide better generalization for difficult test examples.
Tasks	Autonomous Driving, Bayesian Inference, Image Classification
Published	2018-05-12
URL	https://arxiv.org/abs/1805.04756v3
PDF	https://arxiv.org/pdf/1805.04756v3.pdf
PWC	https://paperswithcode.com/paper/predictive-uncertainty-in-large-scale
Repo
Framework
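
The "discretization of the Hamiltonian" mentioned in the abstract is usually carried out with the leapfrog integrator. The sketch below shows plain leapfrog on a one-dimensional standard normal target; it is standard HMC machinery, not the paper's Dropout variant, and the target, step size, and number of steps are assumptions for illustration.

```python
def leapfrog(q, p, grad_u, step_size, n_steps):
    """Simulate Hamiltonian dynamics for H(q, p) = U(q) + p**2 / 2."""
    p = p - 0.5 * step_size * grad_u(q)      # initial half step for momentum
    for i in range(n_steps):
        q = q + step_size * p                # full step for position
        if i < n_steps - 1:
            p = p - step_size * grad_u(q)    # full step for momentum
    p = p - 0.5 * step_size * grad_u(q)      # final half step for momentum
    return q, p


def hamiltonian(q, p):
    """Total energy for U(q) = q**2 / 2, i.e. a standard normal target."""
    return 0.5 * q * q + 0.5 * p * p


q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, grad_u=lambda q: q, step_size=0.1, n_steps=20)
energy_error = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

In a full HMC sampler the proposal `(q1, p1)` would be accepted with probability `min(1, exp(-(H_new - H_old)))`, so a small energy error translates into a high acceptance rate.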

Multi-View Stereo with Asymmetric Checkerboard Propagation and Multi-Hypothesis Joint View Selection


Title	Multi-View Stereo with Asymmetric Checkerboard Propagation and Multi-Hypothesis Joint View Selection
Authors	Qingshan Xu, Wenbing Tao
Abstract	In the computer vision domain, performing multi-view stereo (MVS) quickly and accurately is still a challenging problem. In this paper we present a fast yet accurate method for 3D dense reconstruction, called AMHMVS, built on the PatchMatch based stereo algorithm. Different from the regular symmetric propagation scheme, our approach adopts an asymmetric checkerboard propagation strategy, which can adaptively make effective hypotheses expand further according to the confidence of current neighbor hypotheses. In order to better aggregate visual information from multiple images, we propose the multi-hypothesis joint view selection for each pixel, which leverages a cost matrix based on the multiple propagated hypotheses to robustly infer an appropriate aggregation subset in parallel. By combining the above two steps, our approach not only has the capacity of massively parallel computation, but also obtains high accuracy and completeness. Experiments on extensive datasets show that our method achieves more accurate and robust results, and runs faster than the competing methods.
Tasks
Published	2018-05-21
URL	http://arxiv.org/abs/1805.07920v1
PDF	http://arxiv.org/pdf/1805.07920v1.pdf
PWC	https://paperswithcode.com/paper/multi-view-stereo-with-asymmetric
Repo
Framework

Multi-modal Non-line-of-sight Passive Imaging


Title	Multi-modal Non-line-of-sight Passive Imaging
Authors	Andre Beckus, Alexandru Tamasan, George K. Atia
Abstract	We consider the non-line-of-sight (NLOS) imaging of an object using the light reflected off a diffusive wall. The wall scatters incident light such that a lens is no longer useful to form an image. Instead, we exploit the 4D spatial coherence function to reconstruct a 2D projection of the obscured object. The approach is completely passive in the sense that no control over the light illuminating the object is assumed and is compatible with the partially coherent fields ubiquitous in both the indoor and outdoor environments. We formulate a multi-criteria convex optimization problem for reconstruction, which fuses the reflected field’s intensity and spatial coherence information at different scales. Our formulation leverages established optics models of light propagation and scattering and exploits the sparsity common to many images in different bases. We also develop an algorithm based on the alternating direction method of multipliers to efficiently solve the convex program proposed. A means for analyzing the null space of the measurement matrices is provided as well as a means for weighting the contribution of individual measurements to the reconstruction. This paper holds promise to advance passive imaging in the challenging NLOS regimes in which the intensity does not necessarily retain distinguishable features and provides a framework for multi-modal information fusion for efficient scene reconstruction.
Tasks
Published	2018-07-06
URL	http://arxiv.org/abs/1807.02444v2
PDF	http://arxiv.org/pdf/1807.02444v2.pdf
PWC	https://paperswithcode.com/paper/multi-modal-non-line-of-sight-passive-imaging
Repo
Framework

Tuplemax Loss for Language Identification


Title	Tuplemax Loss for Language Identification
Authors	Li Wan, Prashant Sridhar, Yang Yu, Quan Wang, Ignacio Lopez Moreno
Abstract	In many scenarios of a language identification task, the user will specify a small set of languages which he/she can speak instead of a large set of all possible languages. We want to model such prior knowledge into the way we train our neural networks, by replacing the commonly used softmax loss function with a novel loss function named tuplemax loss. As a matter of fact, in a typical language identification system launched in North America, about 95% of users speak no more than two languages. Using the tuplemax loss, our system achieved a 2.33% error rate, which is a relative 39.4% improvement over the 3.85% error rate of the standard softmax loss method.
Tasks	Language Identification
Published	2018-11-29
URL	http://arxiv.org/abs/1811.12290v2
PDF	http://arxiv.org/pdf/1811.12290v2.pdf
PWC	https://paperswithcode.com/paper/tuplemax-loss-for-language-identification
Repo
Framework
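
The abstract does not give the formula for the tuplemax loss. The sketch below assumes a pairwise form, in which the loss is the average two-way softmax cross-entropy between the target language and each other candidate; this form and the function names are assumptions, so the paper should be consulted for the exact definition.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def softmax_loss(logits, target):
    """Standard softmax cross-entropy over all classes, for comparison."""
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum_exp - logits[target]


def tuplemax_loss(logits, target):
    """Assumed pairwise form: mean two-way cross-entropy of target vs. each rival."""
    z_y = logits[target]
    rivals = [z for k, z in enumerate(logits) if k != target]
    if not rivals:
        raise ValueError("need at least two classes")
    # -log(exp(z_y) / (exp(z_y) + exp(z_k))) equals log(1 + exp(z_k - z_y)).
    return sum(softplus(z_k - z_y) for z_k in rivals) / len(rivals)


# Hypothetical logits for three languages, with the first as the true language.
logits = [2.0, 0.5, -1.0]
pairwise = tuplemax_loss(logits, target=0)
full = softmax_loss(logits, target=0)
```

Under this assumed form, each term only compares the target against one rival at a time, which mirrors the setting where a user speaks a small known set of languages; with exactly two classes it coincides with the softmax loss.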

Learning Neural Models for End-to-End Clustering


Title	Learning Neural Models for End-to-End Clustering
Authors	Benjamin Bruno Meier, Ismail Elezi, Mohammadreza Amirian, Oliver Durr, Thilo Stadelmann
Abstract	We propose a novel end-to-end neural network architecture that, once trained, directly outputs a probabilistic clustering of a batch of input examples in one pass. It estimates a distribution over the number of clusters $k$, and for each $1 \leq k \leq k_\mathrm{max}$, a distribution over the individual cluster assignment for each data point. The network is trained in advance in a supervised fashion on separate data to learn grouping by any perceptual similarity criterion based on pairwise labels (same/different group). It can then be applied to different data containing different groups. We demonstrate promising performance on high-dimensional data like images (COIL-100) and speech (TIMIT). We call this “learning to cluster” and show its conceptual difference to deep metric learning, semi-supervised clustering and other related approaches while having the advantage of performing learnable clustering fully end-to-end.
Tasks	Metric Learning
Published	2018-07-11
URL	http://arxiv.org/abs/1807.04001v1
PDF	http://arxiv.org/pdf/1807.04001v1.pdf
PWC	https://paperswithcode.com/paper/learning-neural-models-for-end-to-end
Repo
Framework

ATPboost: Learning Premise Selection in Binary Setting with ATP Feedback


Title	ATPboost: Learning Premise Selection in Binary Setting with ATP Feedback
Authors	Bartosz Piotrowski, Josef Urban
Abstract	ATPboost is a system for solving sets of large-theory problems by interleaving ATP runs with state-of-the-art machine learning of premise selection from the proofs. Unlike many previous approaches that use a multi-label setting, the learning is implemented as binary classification that estimates the pairwise-relevance of (theorem, premise) pairs. For this, ATPboost uses the XGBoost gradient boosting algorithm, which is fast and has state-of-the-art performance on many tasks. Learning in the binary setting, however, requires negative examples, which is nontrivial due to many alternative proofs. We discuss and implement several solutions in the context of the ATP/ML feedback loop, and show that ATPboost with such methods significantly outperforms the k-nearest neighbors multilabel classifier.
Tasks
Published	2018-02-09
URL	http://arxiv.org/abs/1802.03375v1
PDF	http://arxiv.org/pdf/1802.03375v1.pdf
PWC	https://paperswithcode.com/paper/atpboost-learning-premise-selection-in-binary
Repo
Framework
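
The binary setting described above needs labeled (theorem, premise) pairs. The sketch below shows one naive way to build them: premises that appear in any known proof are positives, and all other available premises are negatives. The abstract notes that choosing negatives is nontrivial and that the paper implements several solutions, so this rule, the function name, and the toy data are assumptions for illustration only.

```python
def build_training_pairs(proofs, available_premises):
    """Turn known proofs into labeled (theorem, premise, label) triples.

    proofs             : dict mapping theorem -> list of proofs (sets of premises)
    available_premises : dict mapping theorem -> premises it is allowed to use
    """
    pairs = []
    for theorem, theorem_proofs in proofs.items():
        # A premise is positive if it occurs in at least one known proof.
        used = set().union(*theorem_proofs)
        for premise in sorted(available_premises[theorem]):
            label = 1 if premise in used else 0
            pairs.append((theorem, premise, label))
    return pairs


# Hypothetical toy library: theorem "t1" has two alternative proofs.
proofs = {
    "t1": [{"a", "b"}, {"a", "c"}],
    "t2": [{"d"}],
}
available_premises = {
    "t1": {"a", "b", "c", "d"},
    "t2": {"a", "d"},
}
pairs = build_training_pairs(proofs, available_premises)
```

Each triple would then be turned into a feature vector and passed to a binary classifier such as a gradient boosting model. As new ATP runs find more proofs, the pairs can be rebuilt, which is the feedback loop the abstract refers to.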

Augment and Reduce: Stochastic Inference for Large Categorical Distributions


Title	Augment and Reduce: Stochastic Inference for Large Categorical Distributions
Authors	Francisco J. R. Ruiz, Michalis K. Titsias, Adji B. Dieng, David M. Blei
Abstract	Categorical distributions are ubiquitous in machine learning, e.g., in classification, language models, and recommendation systems. However, when the number of possible outcomes is very large, using categorical distributions becomes computationally expensive, as the complexity scales linearly with the number of outcomes. To address this problem, we propose augment and reduce (A&R), a method to alleviate the computational complexity. A&R uses two ideas: latent variable augmentation and stochastic variational inference. It maximizes a lower bound on the marginal likelihood of the data. Unlike existing methods which are specific to softmax, A&R is more general and is amenable to other categorical models, such as multinomial probit. On several large-scale classification problems, we show that A&R provides a tighter bound on the marginal likelihood and has better predictive performance than existing approaches.
Tasks	Recommendation Systems
Published	2018-02-12
URL	http://arxiv.org/abs/1802.04220v3
PDF	http://arxiv.org/pdf/1802.04220v3.pdf
PWC	https://paperswithcode.com/paper/augment-and-reduce-stochastic-inference-for-1
Repo
Framework

Efficient Exploration of Gradient Space for Online Learning to Rank


Title	Efficient Exploration of Gradient Space for Online Learning to Rank
Authors	Huazheng Wang, Ramsey Langley, Sonwoo Kim, Eric McCord-Snook, Hongning Wang
Abstract	Online learning to rank (OL2R) optimizes the utility of returned search results based on implicit feedback gathered directly from users. To improve the estimates, OL2R algorithms examine one or more exploratory gradient directions and update the current ranker if a proposed one is preferred by users via an interleaved test. In this paper, we accelerate the online learning process by efficient exploration in the gradient space. Our algorithm, named Null Space Gradient Descent, reduces the exploration space to only the null space of recent poorly performing gradients. This prevents the algorithm from repeatedly exploring directions that have been discouraged by the most recent interactions with users. To improve sensitivity of the resulting interleaved test, we selectively construct candidate rankers to maximize the chance that they can be differentiated by candidate ranking documents in the current query; and we use historically difficult queries to identify the best ranker when a tie occurs in comparing the rankers. Extensive experimental comparisons with the state-of-the-art OL2R algorithms on several public benchmarks confirmed the effectiveness of our proposed algorithm, especially in its fast learning convergence and promising ranking quality at an early stage.
Tasks	Efficient Exploration, Learning-To-Rank
Published	2018-05-18
URL	http://arxiv.org/abs/1805.07317v1
PDF	http://arxiv.org/pdf/1805.07317v1.pdf
PWC	https://paperswithcode.com/paper/efficient-exploration-of-gradient-space-for
Repo
Framework
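
The core idea in the abstract, restricting exploration to the null space of recent poorly performing gradients, can be sketched with a singular value decomposition. This is a minimal sketch, not the authors' implementation: the function name, tolerance, and toy gradients are assumptions, and the candidate selection and tie-breaking steps are omitted.

```python
import numpy as np


def null_space_direction(bad_gradients, rng, tol=1e-10):
    """Sample a unit vector orthogonal to every row of bad_gradients."""
    g = np.atleast_2d(np.asarray(bad_gradients, dtype=float))
    # Rows of vt beyond the numerical rank form an orthonormal null-space basis.
    _, singular_values, vt = np.linalg.svd(g, full_matrices=True)
    rank = int(np.sum(singular_values > tol))
    basis = vt[rank:]
    if basis.shape[0] == 0:
        raise ValueError("recent gradients span the whole space")
    # Random combination of basis vectors, normalized to unit length.
    coeffs = rng.standard_normal(basis.shape[0])
    direction = coeffs @ basis
    return direction / np.linalg.norm(direction)


# Hypothetical example: two recently rejected directions in a 4-dimensional
# ranker parameter space.
bad = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0]])
rng = np.random.default_rng(0)
direction = null_space_direction(bad, rng)
```

Because the sampled direction has zero projection onto each stored gradient, the next exploratory ranker cannot repeat a direction that users have just rejected.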