April 2, 2020

Paper Group ANR 271

Progressive Multi-Stage Learning for Discriminative Tracking. Adversarial Transfer Learning for Punctuation Restoration. Hybrid Models for Open Set Recognition. Real-World Airline Crew Pairing Optimization: Customized Genetic Algorithm versus Column Generation Method. Dropout Prediction over Weeks in MOOCs by Learning Representations of Clicks and …

Progressive Multi-Stage Learning for Discriminative Tracking


Title	Progressive Multi-Stage Learning for Discriminative Tracking
Authors	Weichao Li, Xi Li, Omar Elfarouk Bourahla, Fuxian Huang, Fei Wu, Wei Liu, Zhiheng Wang, Hongmin Liu
Abstract	Visual tracking is typically solved as a discriminative learning problem that usually requires high-quality samples for online model adaptation. It is a critical and challenging problem to evaluate the training samples collected from previous predictions and employ sample selection by their quality to train the model. To tackle the above problem, we propose a joint discriminative learning scheme with the progressive multi-stage optimization policy of sample selection for robust visual tracking. The proposed scheme presents a novel time-weighted and detection-guided self-paced learning strategy for easy-to-hard sample selection, which is capable of tolerating relatively large intra-class variations while maintaining inter-class separability. Such a self-paced learning strategy is jointly optimized in conjunction with the discriminative tracking process, resulting in robust tracking results. Experiments on the benchmark datasets demonstrate the effectiveness of the proposed learning framework.
Tasks	Visual Tracking
Published	2020-04-01
URL	https://arxiv.org/abs/2004.00255v1
PDF	https://arxiv.org/pdf/2004.00255v1.pdf
PWC	https://paperswithcode.com/paper/progressive-multi-stage-learning-for
Repo
Framework
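
The easy-to-hard sample selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the hard-threshold self-paced rule, the exponential time weighting, and the names `select_samples`, `progressive_stages`, `threshold`, and `age_decay` are all assumptions made for illustration, and the detection-guided part of the strategy is left out.

```python
import math
from typing import List, Sequence


def select_samples(losses: Sequence[float],
                   ages: Sequence[int],
                   threshold: float,
                   age_decay: float = 0.1) -> List[float]:
    """Return one weight per sample; 0.0 means the sample is skipped.

    Hypothetical rule: a sample is kept when its loss is below `threshold`
    (hard self-paced selection), and kept samples are down-weighted
    exponentially with their age in frames (time weighting).
    """
    if len(losses) != len(ages):
        raise ValueError("losses and ages must have the same length")
    weights = []
    for loss, age in zip(losses, ages):
        if loss < threshold:
            weights.append(math.exp(-age_decay * age))
        else:
            weights.append(0.0)
    return weights


def progressive_stages(losses: Sequence[float],
                       ages: Sequence[int],
                       thresholds: Sequence[float]) -> List[List[float]]:
    """Apply the selection once per threshold; increasing thresholds admit harder samples."""
    return [select_samples(losses, ages, t) for t in thresholds]
```

Passing an increasing list of thresholds to `progressive_stages` admits only low-loss samples in the first stage and progressively harder ones later, which is the general idea behind self-paced learning.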

Adversarial Transfer Learning for Punctuation Restoration


Title	Adversarial Transfer Learning for Punctuation Restoration
Authors	Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengkun Tian, Cunhang Fan
Abstract	Previous studies demonstrate that word embeddings and part-of-speech (POS) tags are helpful for punctuation restoration tasks. However, two drawbacks still exist. One is that word embeddings are pre-trained by unidirectional language modeling objectives. Thus the word embeddings only contain left-to-right context information. The other is that POS tags are provided by an external POS tagger. This increases computation cost, and incorrectly predicted tags may affect the performance of restoring punctuation marks during decoding. This paper proposes adversarial transfer learning to address these problems. A pre-trained bidirectional encoder representations from transformers (BERT) model is used to initialize a punctuation model. Thus the transferred model parameters carry both left-to-right and right-to-left representations. Furthermore, adversarial multi-task learning is introduced to learn task-invariant knowledge for punctuation prediction. We use an extra POS tagging task to help the training of the punctuation predicting task. Adversarial training is utilized to prevent the shared parameters from containing task-specific information. We only use the punctuation predicting task to restore marks during the decoding stage. Therefore, it needs no extra computation and introduces no incorrect tags from the POS tagger. Experiments are conducted on IWSLT2011 datasets. The results demonstrate that the punctuation predicting models obtain further performance improvement with task-invariant knowledge from the POS tagging task. Our best model outperforms the previous state-of-the-art model trained only with lexical features by up to 9.2% absolute overall F_1-score on the test set.
Tasks	Language Modelling, Multi-Task Learning, Transfer Learning, Word Embeddings
Published	2020-04-01
URL	https://arxiv.org/abs/2004.00248v1
PDF	https://arxiv.org/pdf/2004.00248v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-transfer-learning-for-punctuation
Repo
Framework

Hybrid Models for Open Set Recognition


Title	Hybrid Models for Open Set Recognition
Authors	Hongjie Zhang, Ang Li, Jie Guo, Yanwen Guo
Abstract	Open set recognition requires a classifier to detect samples not belonging to any of the classes in its training set. Existing methods fit a probability distribution to the training samples on their embedding space and detect outliers according to this distribution. The embedding space is often obtained from a discriminative classifier. However, such discriminative representation focuses only on known classes, which may not be critical for distinguishing the unknown classes. We argue that the representation space should be jointly learned from the inlier classifier and the density estimator (serving as an outlier detector). We propose the OpenHybrid framework, which is composed of an encoder to encode the input data into a joint embedding space, a classifier to classify samples to inlier classes, and a flow-based density estimator to detect whether a sample belongs to the unknown category. A typical problem of existing flow-based models is that they may assign a higher likelihood to outliers. However, we empirically observe that such an issue does not occur in our experiments when learning a joint representation for discriminative and generative components. Experiments on standard open set benchmarks also reveal that an end-to-end trained OpenHybrid model significantly outperforms state-of-the-art methods and flow-based baselines.
Tasks	Open Set Learning
Published	2020-03-27
URL	https://arxiv.org/abs/2003.12506v1
PDF	https://arxiv.org/pdf/2003.12506v1.pdf
PWC	https://paperswithcode.com/paper/hybrid-models-for-open-set-recognition
Repo
Framework

Real-World Airline Crew Pairing Optimization: Customized Genetic Algorithm versus Column Generation Method


Title	Real-World Airline Crew Pairing Optimization: Customized Genetic Algorithm versus Column Generation Method
Authors	Divyam Aggarwal, Dhish Kumar Saxena, Thomas Back, Michael Emmerich
Abstract	Airline crew cost is the second-largest operating cost component and its marginal improvement may translate to millions of dollars annually. Further, its highly constrained, combinatorial nature gives it high research and commercial value. The airline crew pairing optimization problem (CPOP) is aimed at generating a set of crew pairings, covering all flights from its timetable, with minimum cost, while satisfying multiple legality constraints laid down by federations, etc. Depending upon CPOP’s scale, several Genetic Algorithm and Column Generation based approaches have been proposed in the literature. However, these approaches have been validated either on small-scale flight datasets (a handful of pairings) or for smaller airlines (operating in low-demand regions) such as Turkish Airlines, etc. Their search efficiency is drastically impaired when scaled to the networks of bigger airlines. The contributions of this paper relate to the proposition of a customized genetic algorithm, with improved initialization and genetic operators, developed by exploiting domain knowledge; and its comparison with a column generation based large-scale optimizer (developed by the authors). To demonstrate the utility of the above-cited contributions, a real-world test case (839 flights), provided by GE Aviation, is used, which has been extracted from the networks of larger airlines (operating up to 33,000 monthly flights in the US).
Tasks
Published	2020-03-08
URL	https://arxiv.org/abs/2003.03792v1
PDF	https://arxiv.org/pdf/2003.03792v1.pdf
PWC	https://paperswithcode.com/paper/real-world-airline-crew-pairing-optimization
Repo
Framework
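
The core of the CPOP as stated in the abstract (choose pairings that cover every flight at minimum cost) is a set-cover structure, which the following minimal sketch makes concrete. It is an illustration only: it ignores the legality constraints, models a pairing simply as the set of flight IDs it covers, and uses a plain greedy heuristic as a stand-in baseline, which is neither the customized genetic algorithm nor the column generation method compared in the paper. The names `uncovered_flights`, `total_cost`, and `greedy_cover` are hypothetical.

```python
import math
from typing import Dict, FrozenSet, Iterable, List, Set

# Assumption: a pairing is modeled as the set of flight IDs it covers.
Pairing = FrozenSet[str]


def uncovered_flights(flights: Set[str], chosen: Iterable[Pairing]) -> Set[str]:
    """Return the flights that no chosen pairing covers."""
    covered: Set[str] = set()
    for pairing in chosen:
        covered |= pairing
    return set(flights) - covered


def total_cost(chosen: Iterable[Pairing], cost: Dict[Pairing, float]) -> float:
    """Sum the cost of the chosen pairings."""
    return sum(cost[p] for p in chosen)


def greedy_cover(flights: Set[str], cost: Dict[Pairing, float]) -> List[Pairing]:
    """Greedy baseline: repeatedly pick the pairing with the lowest cost
    per newly covered flight until every flight is covered."""
    remaining = set(flights)
    chosen: List[Pairing] = []
    while remaining:
        best, best_ratio = None, math.inf
        for pairing, c in cost.items():
            gain = len(pairing & remaining)
            if gain > 0 and c / gain < best_ratio:
                best, best_ratio = pairing, c / gain
        if best is None:
            raise ValueError("some flights cannot be covered by any pairing")
        chosen.append(best)
        remaining -= best
    return chosen
```

A candidate solution from any optimizer can be checked with `uncovered_flights` (it should return an empty set) and scored with `total_cost`.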

Dropout Prediction over Weeks in MOOCs by Learning Representations of Clicks and Videos


Title	Dropout Prediction over Weeks in MOOCs by Learning Representations of Clicks and Videos
Authors	Byungsoo Jeon, Namyong Park
Abstract	This paper addresses a key challenge in MOOC dropout prediction, namely to build meaningful representations from clickstream data. While a variety of feature extraction techniques have been explored extensively for such purposes, to our knowledge, no prior works have explored modeling of educational content (e.g. video) and its correlation with the learner’s behavior (e.g. clickstream) in this context. We bridge this gap by devising a method to learn representations for videos and the correlation between videos and clicks. The results indicate that modeling videos and their correlation with clicks brings statistically significant improvements in predicting dropout.
Tasks
Published	2020-02-05
URL	https://arxiv.org/abs/2002.01955v1
PDF	https://arxiv.org/pdf/2002.01955v1.pdf
PWC	https://paperswithcode.com/paper/dropout-prediction-over-weeks-in-moocs-by
Repo
Framework

Hierarchical Models: Intrinsic Separability in High Dimensions


Title	Hierarchical Models: Intrinsic Separability in High Dimensions
Authors	Wen-Yan Lin
Abstract	It has long been noticed that high-dimensional data exhibits strange patterns. This has been variously interpreted as either a “blessing” or a “curse”, causing uncomfortable inconsistencies in the literature. We propose that these patterns arise from an intrinsically hierarchical generative process. Modeling the process creates a web of constraints that reconcile many different theories and results. The model also implies that high-dimensional data possesses an innate separability that can be exploited for machine learning. We demonstrate how this permits the open-set learning problem to be defined mathematically, leading to qualitative and quantitative improvements in performance.
Tasks	Open Set Learning
Published	2020-03-15
URL	https://arxiv.org/abs/2003.07770v1
PDF	https://arxiv.org/pdf/2003.07770v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-models-intrinsic-separability-in
Repo
Framework

Depth Selection for Deep ReLU Nets in Feature Extraction and Generalization


Title	Depth Selection for Deep ReLU Nets in Feature Extraction and Generalization
Authors	Zhi Han, Siquan Yu, Shao-Bo Lin, Ding-Xuan Zhou
Abstract	Deep learning is recognized to be capable of discovering deep features for representation learning and pattern recognition without requiring elegant feature engineering techniques by taking advantage of human ingenuity and prior knowledge. Thus it has triggered enormous research activities in machine learning and pattern recognition. One of the most important challenges of deep learning is to figure out relations between a feature and the depth of deep neural networks (deep nets for short) to reflect the necessity of depth. Our purpose is to quantify this feature-depth correspondence in feature extraction and generalization. We present the adaptivity of features to depths and vice versa by showing a depth-parameter trade-off in extracting both single and composite features. Based on these results, we prove that implementing the classical empirical risk minimization on deep nets can achieve the optimal generalization performance for numerous learning tasks. Our theoretical results are verified by a series of numerical experiments including toy simulations and a real application of earthquake seismic intensity prediction.
Tasks	Feature Engineering, Representation Learning
Published	2020-04-01
URL	https://arxiv.org/abs/2004.00245v1
PDF	https://arxiv.org/pdf/2004.00245v1.pdf
PWC	https://paperswithcode.com/paper/depth-selection-for-deep-relu-nets-in-feature
Repo
Framework

You can do RLAs for IRV


Title	You can do RLAs for IRV
Authors	Michelle Blom, Andrew Conway, Dan King, Laurent Sandrolini, Philip B. Stark, Peter J. Stuckey, Vanessa Teague
Abstract	The City and County of San Francisco, CA, has used Instant Runoff Voting (IRV) for some elections since 2004. This report describes the first ever process pilot of Risk Limiting Audits for IRV, for the San Francisco District Attorney’s race in November, 2019. We found that the vote-by-mail outcome could be efficiently audited to well under the 0.05 risk limit given a sample of only 200 ballots. All the software we developed for the pilot is open source.
Tasks
Published	2020-04-01
URL	https://arxiv.org/abs/2004.00235v1
PDF	https://arxiv.org/pdf/2004.00235v1.pdf
PWC	https://paperswithcode.com/paper/you-can-do-rlas-for-irv
Repo
Framework

Botnet Detection Using Recurrent Variational Autoencoder


Title	Botnet Detection Using Recurrent Variational Autoencoder
Authors	Jeeyung Kim, Alex Sim, Jinoh Kim, Kesheng Wu
Abstract	Botnets are increasingly used by malicious actors, posing an increasing threat to a large number of internet users. To address this growing danger, we propose to study methods to detect botnets, especially those that are hard to capture with the commonly used methods, such as the signature-based ones and the existing anomaly-based ones. More specifically, we propose a novel machine learning based method, named Recurrent Variational Autoencoder (RVAE), for detecting botnets through sequential characteristics of network traffic flow data including attacks by botnets. We validate the robustness of our method with the CTU-13 dataset, where we have chosen the testing dataset to have different types of botnets than those in the training dataset. Tests show that RVAE is able to detect botnets with the same accuracy as the best known results published in the literature. In addition, we propose an approach to assign anomaly scores based on probability distributions, which allows us to detect botnets in streaming mode as new networking statistics become available. This on-line detection capability would enable real-time detection of unknown botnets.
Tasks
Published	2020-04-01
URL	https://arxiv.org/abs/2004.00234v1
PDF	https://arxiv.org/pdf/2004.00234v1.pdf
PWC	https://paperswithcode.com/paper/botnet-detection-using-recurrent-variational
Repo
Framework


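Constrained Nonnegative Matrix Factorization for Blind Hyperspectral Unmixing incorporating Endmember Independence


Title	Constrained Nonnegative Matrix Factorization for Blind Hyperspectral Unmixing incorporating Endmember Independence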
Authors	E. M. M. B. Ekanayake, Bhathiya Rathnayake, G. M. R. I. Godaliyadda, H. M. V. R. Herath, M. P. B. Ekanayake
Abstract	Hyperspectral image (HSI) analysis has become a key area in the field of remote sensing as a result of its ability to exploit richer information in the form of multiple spectral bands. The study of hyperspectral unmixing (HU) is important in HSI analysis due to the insufficient spatial resolution of customary imaging spectrometers. The endmembers of an HSI are more likely to be generated by independent sources and be mixed to a macroscopic degree before arriving at the sensor element of the imaging spectrometer as mixed spectra. Over the past few decades, many attempts have focused on imposing auxiliary constraints on the conventional nonnegative matrix factorization (NMF) framework in order to effectively unmix these mixed spectra. Signifying a step forward toward finding an optimum constraint to extract endmembers, this paper presents a novel blind HU algorithm, referred to as Kurtosis-based Smooth Nonnegative Matrix Factorization (KbSNMF), which incorporates a novel constraint based on the statistical independence of the probability density functions of endmember spectra. Imposing this constraint on the conventional NMF framework promotes the extraction of independent endmembers while further enhancing the parts-based representation of data. The proposed algorithm manages to outperform several state-of-the-art NMF-based algorithms in terms of extracting endmember spectra from hyperspectral remote sensing data; therefore, it could uplift the performance of recent deep learning HU methods which utilize the endmember spectra as supervisory input data for abundance extraction. We release all code utilized to implement KbSNMF.
Tasks	Hyperspectral Unmixing
Published	2020-03-02
URL	https://arxiv.org/abs/2003.01041v2
PDF	https://arxiv.org/pdf/2003.01041v2.pdf
PWC	https://paperswithcode.com/paper/constrained-nonnegative-matrix-factorization
Repo
Framework
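
Kurtosis, which gives KbSNMF its name, is a standard measure of how far a distribution departs from Gaussianity. The sketch below only shows how the population excess kurtosis of a single spectrum could be computed; how the paper turns kurtosis into an NMF constraint is not stated in the abstract, so nothing here should be read as the authors' formulation. The function name `excess_kurtosis` is hypothetical.

```python
from typing import Sequence


def excess_kurtosis(values: Sequence[float]) -> float:
    """Population excess kurtosis: m4 / m2**2 - 3,
    where mk is the k-th central moment of `values`."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0.0:
        raise ValueError("kurtosis is undefined for a constant sequence")
    return m4 / (m2 * m2) - 3.0
```

A Gaussian has excess kurtosis 0, so values far from 0 indicate a strongly non-Gaussian spectrum.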

On the Role of Receptive Field in Unsupervised Sim-to-Real Image Translation


Title	On the Role of Receptive Field in Unsupervised Sim-to-Real Image Translation
Authors	Nikita Jaipuria, Shubh Gupta, Praveen Narayanan, Vidya N. Murali
Abstract	Generative Adversarial Networks (GANs) are now widely used for photo-realistic image synthesis. In applications where a simulated image needs to be translated into a realistic image (sim-to-real), GANs trained on unpaired data from the two domains are susceptible to failure in semantic content retention as the image is translated from one domain to the other. This failure mode is more pronounced in cases where the real data lacks content diversity, resulting in a content mismatch between the two domains - a situation often encountered in real-world deployment. In this paper, we investigate the role of the discriminator’s receptive field in GANs for unsupervised image-to-image translation with mismatched data, and study its effect on semantic content retention. Experiments with the discriminator architecture of a state-of-the-art coupled Variational Auto-Encoder (VAE) - GAN model on diverse, mismatched datasets show that the discriminator receptive field is directly correlated with semantic content discrepancy of the generated image.
Tasks	Image Generation, Image-to-Image Translation, Unsupervised Image-To-Image Translation
Published	2020-01-25
URL	https://arxiv.org/abs/2001.09257v1
PDF	https://arxiv.org/pdf/2001.09257v1.pdf
PWC	https://paperswithcode.com/paper/on-the-role-of-receptive-field-in
Repo
Framework

Communication-Channel Optimized Partition


Title	Communication-Channel Optimized Partition
Authors	Thuan Nguyen, Thinh Nguyen
Abstract	Consider an original discrete source X with distribution p_X that is corrupted by noise to produce the noisy data Y, with a given joint distribution p(X, Y). A quantizer/classifier Q : Y -> Z is then used to classify/quantize the data Y to the discrete partitioned output Z with probability distribution p_Z. Next, Z is transmitted over a deterministic channel with a given channel matrix A that produces the final discrete output T. One wants to design the optimal quantizer/classifier Q^* such that the cost function F(X; T) between the input X and the final output T is minimized while the probability of the partitioned output Z satisfies a concave constraint G(p_Z) < C. Our results generalize some well-known previous results. First, an iterative linear-time algorithm is proposed to find a locally optimal quantizer. Second, we show that the optimal partition should be a hard partition that is equivalent to cuts by hyperplanes in the probability space of the posterior probability p(X|Y). This result finally provides a polynomial-time algorithm to find the globally optimal quantizer.
Tasks
Published	2020-01-06
URL	https://arxiv.org/abs/2001.01708v1
PDF	https://arxiv.org/pdf/2001.01708v1.pdf
PWC	https://paperswithcode.com/paper/communication-channel-optimized-partition
Repo
Framework
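
The chain X -> Y -> Z -> T from the abstract can be made concrete with a small sketch that, for a fixed hard quantizer, computes p_Z and the joint distribution of X and T. Several things here are assumptions for illustration: the joint distribution is a nested list `p_xy[x][y]`, the quantizer and the deterministic channel are index maps (`quantizer[y] = z`, `channel[z] = t`), and mutual information I(X; T) is used as one example of a quantity a cost F(X; T) might be built from. The abstract does not fix F, and the paper's algorithms for finding the optimal quantizer are not reproduced.

```python
import math
from typing import List, Sequence


def output_distribution(p_xy: Sequence[Sequence[float]],
                        quantizer: Sequence[int],
                        num_outputs: int) -> List[float]:
    """p_Z[z] = sum of p(x, y) over all x and all y with Q(y) = z."""
    p_z = [0.0] * num_outputs
    for row in p_xy:
        for y, p in enumerate(row):
            p_z[quantizer[y]] += p
    return p_z


def joint_after_channel(p_xy: Sequence[Sequence[float]],
                        quantizer: Sequence[int],
                        channel: Sequence[int],
                        num_final: int) -> List[List[float]]:
    """Joint p(X, T) when Z = Q(Y) passes through a deterministic map z -> t."""
    p_xt = [[0.0] * num_final for _ in p_xy]
    for x, row in enumerate(p_xy):
        for y, p in enumerate(row):
            p_xt[x][channel[quantizer[y]]] += p
    return p_xt


def mutual_information(p_ab: Sequence[Sequence[float]]) -> float:
    """I(A; B) in bits for a joint distribution given as a nested list."""
    p_a = [sum(row) for row in p_ab]
    p_b = [sum(col) for col in zip(*p_ab)]
    total = 0.0
    for a, row in enumerate(p_ab):
        for b, p in enumerate(row):
            if p > 0.0:
                total += p * math.log2(p / (p_a[a] * p_b[b]))
    return total
```

Comparing `mutual_information(joint_after_channel(...))` across candidate quantizers gives a simple way to see how the choice of partition changes what T retains about X.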


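Meta-modal Information Flow: A Method for Capturing Multimodal Modular Disconnectivity in Schizophrenia


Title	Meta-modal Information Flow: A Method for Capturing Multimodal Modular Disconnectivity in Schizophrenia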
Authors	Haleh Falakshahi, Victor M. Vergara, Jingyu Liu, Daniel H. Mathalon, Judith M. Ford, James Voyvodic, Bryon A. Mueller, Aysenil Belger, Sarah McEwen, Steven G. Potkin, Adrian Preda, Hooman Rokham, Jing Sui, Jessica A. Turner, Sergey Plis, Vince D. Calhoun
Abstract	Objective: Multimodal measurements of the same phenomena provide complementary information and highlight different perspectives, albeit each with their own limitations. A focus on a single modality may lead to incorrect inferences, which is especially important when a studied phenomenon is a disease. In this paper, we introduce a method that takes advantage of multimodal data in addressing the hypotheses of disconnectivity and dysfunction within schizophrenia (SZ). Methods: We start with estimating and visualizing links within and among extracted multimodal data features using a Gaussian graphical model (GGM). We then propose a modularity-based method that can be applied to the GGM to identify links that are associated with mental illness across a multimodal data set. Through simulation and real data, we show our approach reveals important information about disease-related network disruptions that are missed with a focus on a single modality. We use functional MRI (fMRI), diffusion MRI (dMRI), and structural MRI (sMRI) to compute the fractional amplitude of low frequency fluctuations (fALFF), fractional anisotropy (FA), and gray matter (GM) concentration maps. These three modalities are analyzed using our modularity method. Results: Our results show missing links that are only captured by the cross-modal information that may play an important role in disconnectivity between the components. Conclusion: We identified multimodal (fALFF, FA and GM) disconnectivity in the default mode network area in patients with SZ, which would not have been detectable in a single modality. Significance: The proposed approach provides an important new tool for capturing information that is distributed among multiple imaging modalities.
Tasks
Published	2020-01-06
URL	https://arxiv.org/abs/2001.01707v1
PDF	https://arxiv.org/pdf/2001.01707v1.pdf
PWC	https://paperswithcode.com/paper/meta-modal-information-flow-a-method-for
Repo
Framework

Uncertainty quantification in imaging and automatic horizon tracking: a Bayesian deep-prior based approach


Title	Uncertainty quantification in imaging and automatic horizon tracking: a Bayesian deep-prior based approach
Authors	Ali Siahkoohi, Gabrio Rizzuti, Felix J. Herrmann
Abstract	In inverse problems, uncertainty quantification (UQ) deals with a probabilistic description of the solution nonuniqueness and data noise sensitivity. Setting seismic imaging into a Bayesian framework allows for a principled way of studying uncertainty by solving for the model posterior distribution. Imaging, however, typically constitutes only the first stage of a sequential workflow, and UQ becomes even more relevant when applied to subsequent tasks that are highly sensitive to the inversion outcome. In this paper, we focus on how UQ trickles down to horizon tracking for the determination of stratigraphic models and investigate its sensitivity with respect to the imaging result. As such, the main contribution of this work consists in a data-guided approach to horizon tracking uncertainty analysis. This work is fundamentally based on a special reparameterization of reflectivity, known as “deep prior”. Feasible models are restricted to the output of a convolutional neural network with a fixed input, while weights and biases are Gaussian random variables. Given a deep prior model, the network parameters are sampled from the posterior distribution via a Markov chain Monte Carlo method, from which the conditional mean and point-wise standard deviation of the inferred reflectivities are approximated. For each sample of the posterior distribution, a reflectivity is generated, and the horizons are tracked automatically. In this way, uncertainty on model parameters naturally translates to horizon tracking. As part of the validation for the proposed approach, we verified that the estimated confidence intervals for the horizon tracking coincide with geologically complex regions, such as faults.
Tasks
Published	2020-04-01
URL	https://arxiv.org/abs/2004.00227v1
PDF	https://arxiv.org/pdf/2004.00227v1.pdf
PWC	https://paperswithcode.com/paper/uncertainty-quantification-in-imaging-and
Repo
Framework
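
The last steps described in the abstract (approximating the conditional mean and point-wise standard deviation from posterior samples, then pushing each sample through a downstream task) can be sketched as follows. The sketch assumes the posterior samples are already available as equal-length lists of floats, for example flattened reflectivity images; the MCMC sampler, the deep prior network, and the horizon tracker are not implemented. The names `pointwise_mean_std` and `propagate`, and the `task` callable, are hypothetical.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence, Tuple


def pointwise_mean_std(samples: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Approximate the mean and the point-wise (population) standard deviation
    of a quantity from a set of equal-length samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    length = len(samples[0])
    if any(len(s) != length for s in samples):
        raise ValueError("all samples must have the same length")
    columns = list(zip(*samples))  # one tuple of sampled values per point
    return [mean(col) for col in columns], [pstdev(col) for col in columns]


def propagate(samples: Sequence[Sequence[float]],
              task: Callable[[Sequence[float]], Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Apply a downstream task (a stand-in for horizon tracking) to every
    sample and summarize the spread of its outputs."""
    return pointwise_mean_std([task(s) for s in samples])
```

Because `propagate` runs the task once per sample, the spread of the task outputs directly reflects the spread of the posterior samples, which is how the abstract describes uncertainty carrying over to horizon tracking.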

Trajectory Forecasts in Unknown Environments Conditioned on Grid-Based Plans


Title	Trajectory Forecasts in Unknown Environments Conditioned on Grid-Based Plans
Authors	Nachiket Deo, Mohan M. Trivedi
Abstract	In this paper, we address the problem of forecasting agent trajectories in unknown environments, conditioned on their past motion and scene structure. Trajectory forecasting is a challenging problem due to the large variation in scene structure, and the multi-modal nature of the distribution of future trajectories. Unlike prior approaches that directly learn one-to-many mappings from observed context to multiple future trajectories, we propose to condition trajectory forecasts on plans sampled from a grid-based policy learned using maximum entropy inverse reinforcement learning (MaxEnt IRL). We reformulate MaxEnt IRL to allow the policy to jointly infer plausible agent goals and paths to those goals on a coarse 2-D grid defined over an unknown scene. We propose an attention-based trajectory generator that generates continuous-valued future trajectories conditioned on state sequences sampled from the MaxEnt policy. Quantitative and qualitative evaluation on the publicly available Stanford drone dataset (SDD) shows that our model generates trajectories that are (1) diverse, representing the multi-modal predictive distribution, and (2) precise, conforming to the underlying scene structure over long prediction horizons, achieving state-of-the-art results on the TrajNet benchmark split of SDD.
Tasks
Published	2020-01-03
URL	https://arxiv.org/abs/2001.00735v1
PDF	https://arxiv.org/pdf/2001.00735v1.pdf
PWC	https://paperswithcode.com/paper/trajectory-forecasts-in-unknown-environments
Repo
Framework