January 27, 2020

Paper Group ANR 1171

CUDA: Contradistinguisher for Unsupervised Domain Adaptation. Importance-Aware Learning for Neural Headline Editing. Would a File by Any Other Name Seem as Malicious?. COPYCAT: Practical Adversarial Attacks on Visualization-Based Malware Detection. Multi-scale Deep Neural Networks for Solving High Dimensional PDEs. How Does Batch Normalization Help …

CUDA: Contradistinguisher for Unsupervised Domain Adaptation


Title	CUDA: Contradistinguisher for Unsupervised Domain Adaptation
Authors	Sourabh Balgi, Ambedkar Dukkipati
Abstract	In this paper, we propose a simple model referred to as Contradistinguisher (CTDR) for unsupervised domain adaptation, whose objective is to jointly learn to contradistinguish on an unlabeled target domain in a fully unsupervised manner along with prior knowledge acquired by supervised learning on an entirely different domain. Most recent works in domain adaptation rely on an indirect way of first aligning the source and target domain distributions and then learning a classifier on a labeled source domain to classify the target domain. This indirect way of addressing the real task of unlabeled target domain classification has three main drawbacks. (i) The sub-task of obtaining a perfect alignment of the domains might in itself be impossible due to large domain shift (e.g., language domains). (ii) The use of multiple classifiers to align the distributions unnecessarily increases the complexity of the neural networks, leading to over-fitting in many cases. (iii) Due to distribution alignment, the domain-specific information is lost as the domains get morphed. In this work, we propose a simple and direct approach that does not require domain alignment. We jointly learn CTDR on both source and target distributions for the unsupervised domain adaptation task, using a contradistinguish loss for the unlabeled target domain in conjunction with a supervised loss for the labeled source domain. Our experiments show that avoiding domain alignment by directly addressing the task of unlabeled target domain classification using CTDR achieves state-of-the-art results on eight visual and four language benchmark domain adaptation datasets.
Tasks	Domain Adaptation, Unsupervised Domain Adaptation
Published	2019-09-08
URL	https://arxiv.org/abs/1909.03442v1
PDF	https://arxiv.org/pdf/1909.03442v1.pdf
PWC	https://paperswithcode.com/paper/cuda-contradistinguisher-for-unsupervised
Repo
Framework

Importance-Aware Learning for Neural Headline Editing


Title	Importance-Aware Learning for Neural Headline Editing
Authors	Qingyang Wu, Lei Li, Hao Zhou, Ying Zeng, Zhou Yu
Abstract	Many social media news writers are not professionally trained. Therefore, social media platforms have to hire professional editors to adjust amateur headlines to attract more readers. We propose to automate this headline editing process through neural network models to provide more immediate writing support for these social media news writers. To train such a neural headline editing model, we collected a dataset which contains articles with original headlines and professionally edited headlines. However, it is expensive to collect a large number of professionally edited headlines. To solve this low-resource problem, we design an encoder-decoder model which leverages large-scale pre-trained language models. We further improve the pre-trained model’s quality by introducing a headline generation task as an intermediate task before the headline editing task. Also, we propose a Self Importance-Aware (SIA) loss to address the different levels of editing in the dataset by down-weighting the importance of easily classified tokens and sentences. With the help of Pre-training, Adaptation, and SIA, the model learns to generate headlines in the professional editor’s style. Experimental results show that our method significantly improves the quality of headline editing compared with previous methods.
Tasks
Published	2019-11-25
URL	https://arxiv.org/abs/1912.01114v1
PDF	https://arxiv.org/pdf/1912.01114v1.pdf
PWC	https://paperswithcode.com/paper/importance-aware-learning-for-neural-headline
Repo
Framework
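
The abstract describes the SIA loss only as down-weighting easily classified tokens; the formula itself is not given on this page. The following is a minimal sketch of that general idea, assuming a focal-style weight `(1 - p) ** gamma` on each token's negative log-likelihood. The weight form, the `gamma` parameter, and the function name are illustrative assumptions, not the authors' definition.

```python
import math


def down_weighted_token_losses(token_probs, gamma=1.0):
    """Per-token negative log-likelihood scaled by (1 - p) ** gamma.

    token_probs: model probabilities assigned to the reference tokens,
                 each in the interval (0, 1].
    gamma:       hypothetical strength of the down-weighting; gamma = 0
                 recovers the plain negative log-likelihood.
    """
    return [((1.0 - p) ** gamma) * -math.log(p) for p in token_probs]


# A confidently predicted token (p = 0.9) contributes far less
# than a poorly predicted one (p = 0.1).
easy, hard = down_weighted_token_losses([0.9, 0.1], gamma=1.0)
print(easy < hard)
```

With `gamma = 0` every token counts equally, so the parameter controls how strongly training focuses on the tokens the model still gets wrong.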

Would a File by Any Other Name Seem as Malicious?


Title	Would a File by Any Other Name Seem as Malicious?
Authors	Andre T. Nguyen, Edward Raff, Aaron Sant-Miller
Abstract	Successful malware attacks on information technology systems can cause millions of dollars in damage, the exposure of sensitive and private information, and the irreversible destruction of data. Anti-virus systems that analyze a file’s contents use a combination of static and dynamic analysis to detect and remove/remediate such malware. However, examining a file’s entire contents is not always possible in practice, as the volume and velocity of incoming data may be too high, or access to the underlying file contents may be restricted or unavailable. If it were possible to obtain estimates of a file’s relative likelihood of being malicious without looking at the file contents, we could better prioritize file processing order and aid analysts in situations where a file is unavailable. In this work, we demonstrate that file names can contain information predictive of the presence of malware in a file. In particular, we show the effectiveness of a character-level convolutional neural network at predicting malware status using file names on Endgame’s EMBER malware detection benchmark dataset.
Tasks	Malware Detection
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04753v1
PDF	https://arxiv.org/pdf/1910.04753v1.pdf
PWC	https://paperswithcode.com/paper/would-a-file-by-any-other-name-seem-as
Repo
Framework
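
A character-level model needs each file name turned into a fixed-length sequence of integer ids before any convolution is applied. The sketch below shows only that encoding step. The vocabulary, the padding and unknown ids, and the maximum length are hypothetical choices for illustration and are not taken from the paper.

```python
import string

# Hypothetical vocabulary: lowercase letters, digits and a few separators.
VOCAB = string.ascii_lowercase + string.digits + "._-"
PAD_ID, UNK_ID = 0, 1
CHAR_TO_ID = {ch: i + 2 for i, ch in enumerate(VOCAB)}


def encode_file_name(name, max_len=16):
    """Map a file name to max_len integer ids (truncate, then right-pad)."""
    ids = [CHAR_TO_ID.get(ch, UNK_ID) for ch in name.lower()[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


print(encode_file_name("a.exe", max_len=8))  # [2, 38, 6, 25, 6, 0, 0, 0]
```

The resulting id sequences would typically be fed to an embedding layer followed by 1-D convolutions.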

COPYCAT: Practical Adversarial Attacks on Visualization-Based Malware Detection


Title	COPYCAT: Practical Adversarial Attacks on Visualization-Based Malware Detection
Authors	Aminollah Khormali, Ahmed Abusnaina, Songqing Chen, DaeHun Nyang, Aziz Mohaisen
Abstract	Despite many attempts, state-of-the-art adversarial machine learning attacks on malware detection systems generally yield unexecutable samples. In this work, we set out to examine the robustness of visualization-based malware detection systems against adversarial examples (AEs) that are not only able to fool the model, but also maintain the executability of the original input. As such, we first investigate the application of existing off-the-shelf adversarial attack approaches to malware detection systems, through which we found that those approaches do not necessarily maintain the functionality of the original inputs. Therefore, we propose an approach to generate adversarial examples, COPYCAT, which is specifically designed for malware detection systems with two main goals: achieving a high misclassification rate and maintaining the executability and functionality of the original input. We designed two main configurations for COPYCAT, namely AE padding and sample injection. While the first configuration results in untargeted misclassification attacks, the sample injection configuration is able to force the model to generate a targeted output, which is highly desirable in the malware attribution setting. We evaluate the performance of COPYCAT through an extensive set of experiments on two malware datasets, and report that we were able to generate adversarial samples that are misclassified at a rate of 98.9% and 96.5% with Windows and IoT binary datasets, respectively, outperforming the misclassification rates in the literature. Most importantly, we report that those AEs were executable, unlike AEs generated by off-the-shelf approaches. Our transferability study demonstrates that the AEs generated through our proposed method generalize to other models.
Tasks	Adversarial Attack, Malware Detection
Published	2019-09-20
URL	https://arxiv.org/abs/1909.09735v1
PDF	https://arxiv.org/pdf/1909.09735v1.pdf
PWC	https://paperswithcode.com/paper/190909735
Repo
Framework

Multi-scale Deep Neural Networks for Solving High Dimensional PDEs


Title	Multi-scale Deep Neural Networks for Solving High Dimensional PDEs
Authors	Wei Cai, Zhi-Qin John Xu
Abstract	In this paper, we propose the idea of radial scaling in frequency domain and activation functions with compact support to produce a multi-scale DNN (MscaleDNN), which will have the multi-scale capability in approximating high frequency and high dimensional functions and speeding up the solution of high dimensional PDEs. Numerical results on high dimensional function fitting and solutions of high dimensional PDEs, using loss functions with either Ritz energy or least squared PDE residuals, have validated the increased power of multi-scale resolution and high frequency capturing of the proposed MscaleDNN.
Tasks
Published	2019-10-25
URL	https://arxiv.org/abs/1910.11710v1
PDF	https://arxiv.org/pdf/1910.11710v1.pdf
PWC	https://paperswithcode.com/paper/multi-scale-deep-neural-networks-for-solving
Repo
Framework

How Does Batch Normalization Help Binary Training?


Title	How Does Batch Normalization Help Binary Training?
Authors	Eyyüb Sari, Mouloud Belbahri, Vahid Partovi Nia
Abstract	Binary Neural Networks (BNNs) are difficult to train and suffer from a drop in accuracy. It appears in practice that BNNs fail to train in the absence of a Batch Normalization (BatchNorm) layer. We find the main role of BatchNorm is to avoid exploding gradients in the case of BNNs. This finding suggests that the common initialization methods developed for full-precision networks are irrelevant to BNNs. We build a theoretical study on the role of BatchNorm in binary training, backed up by numerical experiments.
Tasks	Quantization
Published	2019-09-18
URL	https://arxiv.org/abs/1909.09139v2
PDF	https://arxiv.org/pdf/1909.09139v2.pdf
PWC	https://paperswithcode.com/paper/a-study-on-binary-neural-networks
Repo
Framework

Stochastic Proximal Algorithms with SON Regularization: Towards Efficient Optimal Transport for Domain Adaptation


Title	Stochastic Proximal Algorithms with SON Regularization: Towards Efficient Optimal Transport for Domain Adaptation
Authors	Ashkan Panahi, Arman Rahbar, Morteza Haghir Chehreghani, Devdatt Dubhashi
Abstract	We propose a new regularizer for optimal transport (OT) which is tailored to better preserve the class structure of the subjected process. Accordingly, we provide the first theoretical guarantees for an OT scheme that respects class structure. We derive an accelerated proximal algorithm with a closed-form projection and proximal operator scheme, thereby affording a highly scalable algorithm for computing optimal transport plans. We provide a novel argument for the uniqueness of the optimum even in the absence of strong convexity. Our experiments show that the new regularizer results not only in better preservation of the class structure but also in additional robustness relative to previous regularizers.
Tasks	Domain Adaptation
Published	2019-03-09
URL	https://arxiv.org/abs/1903.03850v2
PDF	https://arxiv.org/pdf/1903.03850v2.pdf
PWC	https://paperswithcode.com/paper/stochastic-incremental-algorithms-for-optimal
Repo
Framework

Improving Minimal Gated Unit for Sequential Data


Title	Improving Minimal Gated Unit for Sequential Data
Authors	Kazuki Takamura, Satoshi Yamane
Abstract	In order to obtain a model which can process sequential data related to machine translation and speech recognition faster and more accurately, we propose adopting Chrono Initializer as the initialization method of Minimal Gated Unit. We evaluated the method with two tasks: adding task and copy task. As a result of the experiment, the effectiveness of the proposed method was confirmed.
Tasks	Machine Translation, Speech Recognition
Published	2019-05-21
URL	https://arxiv.org/abs/1906.00748v1
PDF	https://arxiv.org/pdf/1906.00748v1.pdf
PWC	https://paperswithcode.com/paper/190600748
Repo
Framework
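
As a rough illustration of chrono initialization, the sketch below draws forget-gate biases as the logarithm of a uniform sample on [1, T_max - 1], which is the rule chrono initialization uses for the forget gate. How the authors wire it into the Minimal Gated Unit is not described on this page, so the function name, the `t_max` value, and the use of Python's `random` module are assumptions.

```python
import math
import random


def chrono_forget_bias(hidden_size, t_max, rng=None):
    """Sample forget-gate biases b = log(u), with u ~ Uniform[1, t_max - 1].

    t_max is the longest dependency length (in time steps) the unit is
    expected to remember; larger biases keep the gate open for longer.
    """
    if t_max < 2:
        raise ValueError("t_max must be at least 2")
    if rng is None:
        rng = random.Random()
    return [math.log(rng.uniform(1.0, t_max - 1.0)) for _ in range(hidden_size)]


biases = chrono_forget_bias(hidden_size=4, t_max=100, rng=random.Random(0))
print(all(0.0 <= b <= math.log(99.0) for b in biases))
```

All other weights would be initialized as usual; only the gate bias is affected.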

Efficient Ridge Solution for the Incremental Broad Learning System on Added Nodes by Inverse Cholesky Factorization of a Partitioned Matrix


Title	Efficient Ridge Solution for the Incremental Broad Learning System on Added Nodes by Inverse Cholesky Factorization of a Partitioned Matrix
Authors	Hufei Zhu, Chenghao Wei
Abstract	To accelerate the existing Broad Learning System (BLS) for newly added nodes in [7], we extend the inverse Cholesky factorization in [10] to deduce an efficient inverse Cholesky factorization for a Hermitian matrix partitioned into 2 × 2 blocks, which is utilized to develop the proposed BLS algorithm 1. The proposed BLS algorithm 1 computes the ridge solution (i.e., the output weights) from the inverse Cholesky factor of the Hermitian matrix in the ridge inverse, and updates the inverse Cholesky factor efficiently. From the proposed BLS algorithm 1, we deduce the proposed ridge inverse, which can be obtained from the generalized inverse in [7] by changing just one matrix in the equation that computes the newly added sub-matrix. We also modify the proposed algorithm 1 into the proposed algorithm 2, which is equivalent to the existing BLS algorithm [7] in terms of numerical computations. The proposed algorithms 1 and 2 can reduce the computational complexity, since usually the Hermitian matrix in the ridge inverse is smaller than the ridge inverse. With respect to the existing BLS algorithm, the proposed algorithms 1 and 2 usually require about 1/3 and 2/3 of the complexity, respectively, while in numerical experiments they achieve speedups (in each additional training time) of 2.40 - 2.91 and 1.36 - 1.60, respectively. Numerical experiments also show that the proposed algorithm 1 and the standard ridge solution always bear the same testing accuracy, and usually so do the proposed algorithm 2 and the existing BLS algorithm. The existing BLS assumes the ridge parameter lambda -> 0, since it is based on the generalized inverse with the ridge regression approximation. When the assumption lambda -> 0 is not satisfied, the standard ridge solution achieves a clearly better testing accuracy than the existing BLS algorithm in numerical experiments.
Tasks
Published	2019-11-12
URL	https://arxiv.org/abs/1911.04872v1
PDF	https://arxiv.org/pdf/1911.04872v1.pdf
PWC	https://paperswithcode.com/paper/efficient-ridge-solution-for-the-incremental
Repo
Framework
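
For orientation, the standard ridge solution mentioned in the abstract solves (AᵀA + lambda·I) w = Aᵀy. The sketch below computes it with an ordinary Cholesky factorization in plain Python. It is not the paper's incremental inverse Cholesky update for a partitioned matrix; the function names and the small example data are illustrative assumptions.

```python
import math


def cholesky(M):
    """Lower-triangular L with L * L^T = M, for symmetric positive definite M."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(M[i][i] - s)
            else:
                L[i][j] = (M[i][j] - s) / L[j][j]
    return L


def solve_with_cholesky(L, b):
    """Solve (L * L^T) x = b by forward then backward substitution."""
    n = len(L)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def ridge_solution(A, y, lam):
    """Weights w minimizing ||A w - y||^2 + lam * ||w||^2."""
    d = len(A[0])
    gram = [[sum(r[i] * r[j] for r in A) + (lam if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(r[i] * t for r, t in zip(A, y)) for i in range(d)]
    return solve_with_cholesky(cholesky(gram), rhs)


print(ridge_solution([[1.0, 0.0], [0.0, 1.0]], [2.0, 4.0], lam=1.0))
```

A positive `lam` keeps the Gram matrix positive definite, which is what makes the Cholesky factorization applicable even when the columns of `A` are nearly collinear.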

Shapley Homology: Topological Analysis of Sample Influence for Neural Networks


Title	Shapley Homology: Topological Analysis of Sample Influence for Neural Networks
Authors	Kaixuan Zhang, Qinglong Wang, Xue Liu, C. Lee Giles
Abstract	Data samples collected for training machine learning models are typically assumed to be independent and identically distributed (iid). Recent research has demonstrated that this assumption can be problematic as it simplifies the manifold of structured data. This has motivated different research areas such as data poisoning, model improvement, and explanation of machine learning models. In this work, we study the influence of a sample on determining the intrinsic topological features of its underlying manifold. We propose the Shapley Homology framework, which provides a quantitative metric for the influence of a sample on the homology of a simplicial complex. By interpreting the influence as a probability measure, we further define an entropy which reflects the complexity of the data manifold. Our empirical studies show that when using the 0-dimensional homology, on neighboring graphs, samples with higher influence scores have more impact on the accuracy of neural networks for determining the graph connectivity and on several regular grammars whose higher entropy values imply more difficulty in being learned.
Tasks	Data Poisoning
Published	2019-10-15
URL	https://arxiv.org/abs/1910.06509v1
PDF	https://arxiv.org/pdf/1910.06509v1.pdf
PWC	https://paperswithcode.com/paper/shapley-homology-topological-analysis-of
Repo
Framework

Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification


Title	Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification
Authors	T M Feroz Ali, Subhasis Chaudhuri
Abstract	Person re-identification is the task of matching pedestrian images across non-overlapping cameras. In this paper, we propose a non-linear cross-view similarity metric learning method for handling small-size training data in practical re-ID systems. The method employs non-linear mappings combined with cross-view discriminative subspace learning and cross-view distance metric learning based on pairwise similarity constraints. It is a natural extension of XQDA from linear to non-linear mappings using kernels, and learns non-linear transformations for efficiently handling complex non-linearity of person appearance across camera views. Importantly, the proposed method is very computationally efficient. Extensive experiments on four challenging datasets show that our method attains competitive performance against state-of-the-art methods.
Tasks	Metric Learning, Person Re-Identification
Published	2019-09-25
URL	https://arxiv.org/abs/1909.11316v1
PDF	https://arxiv.org/pdf/1909.11316v1.pdf
PWC	https://paperswithcode.com/paper/cross-view-kernel-similarity-metric-learning
Repo
Framework


Multi-modal Deep Analysis for Multimedia


Title	Multi-modal Deep Analysis for Multimedia
Authors	Wenwu Zhu, Xin Wang, Hongzhi Li
Abstract	With the rapid development of Internet and multimedia services in the past decade, a huge amount of user-generated and service provider-generated multimedia data has become available. These data are heterogeneous and multi-modal in nature, posing great challenges for processing and analysis. Multi-modal data consist of a mixture of data from different modalities such as text, images, video, and audio. In this article, we present a deep and comprehensive overview of multi-modal analysis in multimedia. We introduce two scientific research problems: data-driven correlational representation and knowledge-guided fusion for multimedia analysis. To address the two scientific problems, we investigate them from the following aspects: 1) multi-modal correlational representation: multi-modal fusion of data across different modalities, and 2) multi-modal data and knowledge fusion: multi-modal fusion of data with domain knowledge. More specifically, on data-driven correlational representation, we highlight three important categories of methods: multi-modal deep representation, multi-modal transfer learning, and multi-modal hashing. On knowledge-guided fusion, we discuss the approaches for fusing knowledge with data and four exemplar applications that require various kinds of domain knowledge, including multi-modal visual question answering, multi-modal video summarization, multi-modal visual pattern mining and multi-modal recommendation. Finally, we bring forward our insights and future research directions.
Tasks	Question Answering, Transfer Learning, Video Summarization, Visual Question Answering
Published	2019-10-11
URL	https://arxiv.org/abs/1910.04964v2
PDF	https://arxiv.org/pdf/1910.04964v2.pdf
PWC	https://paperswithcode.com/paper/multi-modal-deep-analysis-for-multimedia
Repo
Framework


Re-ID Driven Localization Refinement for Person Search


Title	Re-ID Driven Localization Refinement for Person Search
Authors	Chuchu Han, Jiacheng Ye, Yunshan Zhong, Xin Tan, Chi Zhang, Changxin Gao, Nong Sang
Abstract	Person search aims at localizing and identifying a query person from a gallery of uncropped scene images. Different from person re-identification (re-ID), its performance also depends on the localization accuracy of a pedestrian detector. The state-of-the-art methods train the detector individually, and the detected bounding boxes may be sub-optimal for the following re-ID task. To alleviate this issue, we propose a re-ID driven localization refinement framework for providing the refined detection boxes for person search. Specifically, we develop a differentiable ROI transform layer to effectively transform the bounding boxes from the original images. Thus, the box coordinates can be supervised by the re-ID training other than the original detection task. With this supervision, the detector can generate more reliable bounding boxes, and the downstream re-ID model can produce more discriminative embeddings based on the refined person localizations. Extensive experimental results on the widely used benchmarks demonstrate that our proposed method performs favorably against the state-of-the-art person search methods.
Tasks	Person Re-Identification, Person Search
Published	2019-09-18
URL	https://arxiv.org/abs/1909.08580v1
PDF	https://arxiv.org/pdf/1909.08580v1.pdf
PWC	https://paperswithcode.com/paper/re-id-driven-localization-refinement-for
Repo
Framework

Synthesis of Provably Correct Autonomy Protocols for Shared Control


Title	Synthesis of Provably Correct Autonomy Protocols for Shared Control
Authors	Murat Cubuktepe, Nils Jansen, Mohammed Alsiekh, Ufuk Topcu
Abstract	We synthesize shared control protocols subject to probabilistic temporal logic specifications. More specifically, we develop a framework in which a human and an autonomy protocol can issue commands to carry out a certain task. We blend these commands into a joint input to a robot. We model the interaction between the human and the robot as a Markov decision process (MDP) that represents the shared control scenario. Using inverse reinforcement learning, we obtain an abstraction of the human’s behavior and decisions. We use randomized strategies to account for randomness in the human’s decisions, caused by factors such as complexity of the task specifications or imperfect interfaces. We design the autonomy protocol to ensure that the resulting robot behavior satisfies given safety and performance specifications in probabilistic temporal logic. Additionally, the resulting strategies generate behavior as similar to the behavior induced by the human’s commands as possible. We solve the underlying problem efficiently using quasiconvex programming. Case studies involving autonomous wheelchair navigation and unmanned aerial vehicle mission planning showcase the applicability of our approach.
Tasks
Published	2019-05-15
URL	https://arxiv.org/abs/1905.06471v1
PDF	https://arxiv.org/pdf/1905.06471v1.pdf
PWC	https://paperswithcode.com/paper/synthesis-of-provably-correct-autonomy
Repo
Framework

Robust and Computationally-Efficient Anomaly Detection using Powers-of-Two Networks


Title	Robust and Computationally-Efficient Anomaly Detection using Powers-of-Two Networks
Authors	Usama Muneeb, Erdem Koyuncu, Yasaman Keshtkarjahromi, Hulya Seferoglu, Mehmet Fatih Erden, Ahmet Enis Cetin
Abstract	Robust and computationally efficient anomaly detection in videos is a problem in video surveillance systems. We propose a technique to increase robustness and reduce computational complexity in a Convolutional Neural Network (CNN) based anomaly detector that utilizes the optical flow information of video data. We reduce the complexity of the network by denoising the intermediate layer outputs of the CNN and by using powers-of-two weights, which replace the computationally expensive multiplication operations with bit-shift operations. The denoising operation during inference forces small-valued intermediate layer outputs to zero. Because the number of zeros in the network increases significantly as a result of denoising, we can implement the CNN about 10% faster than a comparable network while detecting all the anomalies in the testing set. It turns out that the denoising operation also provides robustness because the contribution of small intermediate values to the final result is negligible. During training we also generate motion vector images by a Generative Adversarial Network (GAN) to improve the robustness of the overall system. We experimentally observe that the resulting system is robust to background motion.
Tasks	Anomaly Detection, Denoising, Optical Flow Estimation
Published	2019-10-30
URL	https://arxiv.org/abs/1910.14096v1
PDF	https://arxiv.org/pdf/1910.14096v1.pdf
PWC	https://paperswithcode.com/paper/robust-and-computationally-efficient-anomaly
Repo
Framework
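
The two ideas in the abstract above, powers-of-two weights and zeroing small activations, can be illustrated in a few lines. This is a sketch under stated assumptions rather than the authors' implementation: rounding in the log domain, the exponent range, integer activations, and all function names are hypothetical choices.

```python
import math


def quantize_pow2(w, min_exp=-8, max_exp=0):
    """Approximate w by sign * 2**k and return (sign, k).

    k is the rounded base-2 logarithm of |w|, clipped to [min_exp, max_exp].
    A zero weight is returned as (0, 0).
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    k = round(math.log2(abs(w)))
    return sign, max(min_exp, min(max_exp, k))


def shift_multiply(x, sign, k):
    """Multiply the integer activation x by sign * 2**k using shifts only."""
    shifted = x << k if k >= 0 else x >> -k
    return sign * shifted


def denoise(values, threshold):
    """Force activations with magnitude below threshold to zero."""
    return [v if abs(v) >= threshold else 0.0 for v in values]


sign, k = quantize_pow2(0.3)          # 0.3 is approximated by 2**-2 = 0.25
print(shift_multiply(64, sign, k))    # 16, i.e. 64 * 0.25
print(denoise([0.01, -0.5, 0.2], 0.1))
```

Zeroed activations can be skipped entirely in the following layer, which is where the speedup from denoising would come from.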