Paper Group ANR 1171
CUDA: Contradistinguisher for Unsupervised Domain Adaptation. Importance-Aware Learning for Neural Headline Editing. Would a File by Any Other Name Seem as Malicious?. COPYCAT: Practical Adversarial Attacks on Visualization-Based Malware Detection. Multi-scale Deep Neural Networks for Solving High Dimensional PDEs. How Does Batch Normalization Help …
CUDA: Contradistinguisher for Unsupervised Domain Adaptation
Title | CUDA: Contradistinguisher for Unsupervised Domain Adaptation |
Authors | Sourabh Balgi, Ambedkar Dukkipati |
Abstract | In this paper, we propose a simple model referred to as Contradistinguisher (CTDR) for unsupervised domain adaptation whose objective is to jointly learn to contradistinguish on the unlabeled target domain in a fully unsupervised manner along with prior knowledge acquired by supervised learning on an entirely different domain. Most recent works in domain adaptation rely on an indirect way of first aligning the source and target domain distributions and then learning a classifier on a labeled source domain to classify the target domain. This indirect way of addressing the real task of unlabeled target domain classification has three main drawbacks. (i) The sub-task of obtaining a perfect alignment of the domains might in itself be impossible due to large domain shift (e.g., language domains). (ii) The use of multiple classifiers to align the distributions unnecessarily increases the complexity of the neural networks, leading to over-fitting in many cases. (iii) Due to distribution alignment, the domain-specific information is lost as the domains get morphed. In this work, we propose a simple and direct approach that does not require domain alignment. We jointly learn CTDR on both source and target distributions for the unsupervised domain adaptation task using a contradistinguish loss for the unlabeled target domain in conjunction with a supervised loss for the labeled source domain. Our experiments show that avoiding domain alignment by directly addressing the task of unlabeled target domain classification using CTDR achieves state-of-the-art results on eight visual and four language benchmark domain adaptation datasets. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2019-09-08 |
URL | https://arxiv.org/abs/1909.03442v1 |
https://arxiv.org/pdf/1909.03442v1.pdf | |
PWC | https://paperswithcode.com/paper/cuda-contradistinguisher-for-unsupervised |
Repo | |
Framework | |
Importance-Aware Learning for Neural Headline Editing
Title | Importance-Aware Learning for Neural Headline Editing |
Authors | Qingyang Wu, Lei Li, Hao Zhou, Ying Zeng, Zhou Yu |
Abstract | Many social media news writers are not professionally trained. Therefore, social media platforms have to hire professional editors to adjust amateur headlines to attract more readers. We propose to automate this headline editing process through neural network models to provide more immediate writing support for these social media news writers. To train such a neural headline editing model, we collected a dataset which contains articles with original headlines and professionally edited headlines. However, it is expensive to collect a large number of professionally edited headlines. To solve this low-resource problem, we design an encoder-decoder model which leverages large-scale pre-trained language models. We further improve the pre-trained model’s quality by introducing a headline generation task as an intermediate task before the headline editing task. Also, we propose a Self Importance-Aware (SIA) loss to address the different levels of editing in the dataset by down-weighting the importance of easily classified tokens and sentences. With the help of Pre-training, Adaptation, and SIA, the model learns to generate headlines in the professional editor’s style. Experimental results show that our method significantly improves the quality of headline editing compared with previous methods. |
Tasks | |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1912.01114v1 |
https://arxiv.org/pdf/1912.01114v1.pdf | |
PWC | https://paperswithcode.com/paper/importance-aware-learning-for-neural-headline |
Repo | |
Framework | |
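The abstract above describes the SIA loss only as down-weighting easily classified tokens. A minimal sketch of that idea follows, using a focal-style factor (1 - p)^gamma as a stand-in. The weighting form, the function name `importance_weighted_nll`, and the `gamma` parameter are assumptions for illustration and are not taken from the paper.

```python
import math

def importance_weighted_nll(token_probs, gamma=2.0):
    """Mean negative log-likelihood with a focal-style down-weighting.

    token_probs: model probability assigned to each reference token.
    gamma: hypothetical focusing parameter; gamma = 0 gives the plain NLL.
    Tokens the model already predicts confidently (p close to 1) receive
    a small weight (1 - p) ** gamma, so they contribute little to the loss.
    """
    losses = [((1.0 - p) ** gamma) * -math.log(p) for p in token_probs]
    return sum(losses) / len(losses)

# An easy token (p = 0.9) contributes far less than a hard one (p = 0.2).
easy = importance_weighted_nll([0.9])
hard = importance_weighted_nll([0.2])
```

With `gamma=0.0` the weights are all 1, which makes it easy to compare against an unweighted baseline.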
Would a File by Any Other Name Seem as Malicious?
Title | Would a File by Any Other Name Seem as Malicious? |
Authors | Andre T. Nguyen, Edward Raff, Aaron Sant-Miller |
Abstract | Successful malware attacks on information technology systems can cause millions of dollars in damage, the exposure of sensitive and private information, and the irreversible destruction of data. Anti-virus systems that analyze a file’s contents use a combination of static and dynamic analysis to detect and remove/remediate such malware. However, examining a file’s entire contents is not always possible in practice, as the volume and velocity of incoming data may be too high, or access to the underlying file contents may be restricted or unavailable. If it were possible to obtain estimates of a file’s relative likelihood of being malicious without looking at the file contents, we could better prioritize file processing order and aid analysts in situations where a file is unavailable. In this work, we demonstrate that file names can contain information predictive of the presence of malware in a file. In particular, we show the effectiveness of a character-level convolutional neural network at predicting malware status using file names on Endgame’s EMBER malware detection benchmark dataset. |
Tasks | Malware Detection |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04753v1 |
https://arxiv.org/pdf/1910.04753v1.pdf | |
PWC | https://paperswithcode.com/paper/would-a-file-by-any-other-name-seem-as |
Repo | |
Framework | |
COPYCAT: Practical Adversarial Attacks on Visualization-Based Malware Detection
Title | COPYCAT: Practical Adversarial Attacks on Visualization-Based Malware Detection |
Authors | Aminollah Khormali, Ahmed Abusnaina, Songqing Chen, DaeHun Nyang, Aziz Mohaisen |
Abstract | Despite many attempts, the state of the art in adversarial machine learning on malware detection systems generally yields unexecutable samples. In this work, we set out to examine the robustness of visualization-based malware detection systems against adversarial examples (AEs) that not only are able to fool the model, but also maintain the executability of the original input. As such, we first investigate the application of existing off-the-shelf adversarial attack approaches on malware detection systems, through which we found that those approaches do not necessarily maintain the functionality of the original inputs. Therefore, we propose an approach to generate adversarial examples, COPYCAT, which is specifically designed for malware detection systems considering two main goals: achieving a high misclassification rate and maintaining the executability and functionality of the original input. We designed two main configurations for COPYCAT, namely AE padding and sample injection. While the first configuration results in untargeted misclassification attacks, the sample injection configuration is able to force the model to generate a targeted output, which is highly desirable in the malware attribution setting. We evaluate the performance of COPYCAT through an extensive set of experiments on two malware datasets, and report that we were able to generate adversarial samples that are misclassified at a rate of 98.9% and 96.5% with Windows and IoT binary datasets, respectively, outperforming the misclassification rates in the literature. Most importantly, we report that those AEs were executable, unlike AEs generated by off-the-shelf approaches. Our transferability study demonstrates that the AEs generated through our proposed method can be generalized to other models. |
Tasks | Adversarial Attack, Malware Detection |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09735v1 |
https://arxiv.org/pdf/1909.09735v1.pdf | |
PWC | https://paperswithcode.com/paper/190909735 |
Repo | |
Framework | |
Multi-scale Deep Neural Networks for Solving High Dimensional PDEs
Title | Multi-scale Deep Neural Networks for Solving High Dimensional PDEs |
Authors | Wei Cai, Zhi-Qin John Xu |
Abstract | In this paper, we propose the idea of radial scaling in frequency domain and activation functions with compact support to produce a multi-scale DNN (MscaleDNN), which will have the multi-scale capability in approximating high frequency and high dimensional functions and speeding up the solution of high dimensional PDEs. Numerical results on high dimensional function fitting and solutions of high dimensional PDEs, using loss functions with either Ritz energy or least squared PDE residuals, have validated the increased power of multi-scale resolution and high frequency capturing of the proposed MscaleDNN. |
Tasks | |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11710v1 |
https://arxiv.org/pdf/1910.11710v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-scale-deep-neural-networks-for-solving |
Repo | |
Framework | |
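The MscaleDNN abstract above combines two ingredients: scaling the input so that different sub-networks see different frequency ranges, and an activation with compact support. A rough sketch of both pieces follows. The particular activation `srelu(x) = relu(x) * relu(1 - x)` and the scale factors `(1, 2, 4, 8)` are assumptions chosen for illustration; the abstract does not specify either.

```python
def srelu(x):
    """Compact-support activation: nonzero only for 0 < x < 1 (assumed form)."""
    return max(x, 0.0) * max(1.0 - x, 0.0)

def multiscale_inputs(x, scales=(1, 2, 4, 8)):
    """Return one scaled copy of the input vector per sub-network.

    Feeding a * x to a sub-network lets it represent content at a
    frequency roughly a times higher than it could with x alone.
    The scale factors here are hypothetical.
    """
    return [[a * xi for xi in x] for a in scales]

# Each sub-network would receive one of these scaled copies.
copies = multiscale_inputs([0.5, 1.0], scales=(1, 2, 4))
```

In a full model, each scaled copy would pass through its own sub-network and the outputs would be combined; that part is omitted here.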
How Does Batch Normalization Help Binary Training?
Title | How Does Batch Normalization Help Binary Training? |
Authors | Eyyüb Sari, Mouloud Belbahri, Vahid Partovi Nia |
Abstract | Binary Neural Networks (BNNs) are difficult to train and suffer from a drop in accuracy. It appears in practice that BNNs fail to train in the absence of a Batch Normalization (BatchNorm) layer. We find that the main role of BatchNorm is to avoid exploding gradients in the case of BNNs. This finding suggests that the common initialization methods developed for full-precision networks are irrelevant to BNNs. We build a theoretical study on the role of BatchNorm in binary training, backed up by numerical experiments. |
Tasks | Quantization |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.09139v2 |
https://arxiv.org/pdf/1909.09139v2.pdf | |
PWC | https://paperswithcode.com/paper/a-study-on-binary-neural-networks |
Repo | |
Framework | |
Stochastic Proximal Algorithms with SON Regularization: Towards Efficient Optimal Transport for Domain Adaptation
Title | Stochastic Proximal Algorithms with SON Regularization: Towards Efficient Optimal Transport for Domain Adaptation |
Authors | Ashkan Panahi, Arman Rahbar, Morteza Haghir Chehreghani, Devdatt Dubhashi |
Abstract | We propose a new regularizer for optimal transport (OT) which is tailored to better preserve the class structure of the subjected process. Accordingly, we provide the first theoretical guarantees for an OT scheme that respects class structure. We derive an accelerated proximal algorithm with a closed-form projection and proximal operator scheme, thereby affording a highly scalable algorithm for computing optimal transport plans. We provide a novel argument for the uniqueness of the optimum even in the absence of strong convexity. Our experiments show that the new regularizer results not only in a better preservation of the class structure but also in additional robustness relative to previous regularizers. |
Tasks | Domain Adaptation |
Published | 2019-03-09 |
URL | https://arxiv.org/abs/1903.03850v2 |
https://arxiv.org/pdf/1903.03850v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-incremental-algorithms-for-optimal |
Repo | |
Framework | |
Improving Minimal Gated Unit for Sequential Data
Title | Improving Minimal Gated Unit for Sequential Data |
Authors | Kazuki Takamura, Satoshi Yamane |
Abstract | In order to obtain a model which can process sequential data related to machine translation and speech recognition faster and more accurately, we propose adopting Chrono Initializer as the initialization method of Minimal Gated Unit. We evaluated the method with two tasks: adding task and copy task. As a result of the experiment, the effectiveness of the proposed method was confirmed. |
Tasks | Machine Translation, Speech Recognition |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1906.00748v1 |
https://arxiv.org/pdf/1906.00748v1.pdf | |
PWC | https://paperswithcode.com/paper/190600748 |
Repo | |
Framework | |
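The abstract above names the Chrono Initializer but does not describe it. As a hedged sketch: chrono initialization is usually described as drawing a gate bias as log(u) with u uniform on [1, T_max - 1], where T_max is the longest dependency the model is expected to capture. The function names below are hypothetical, and how the bias maps onto the Minimal Gated Unit's single gate (including its sign convention) is not stated in the abstract, so that step is left out.

```python
import math
import random

def chrono_gate_bias(hidden_size, t_max, rng=None):
    """Sample gate biases as log(U[1, t_max - 1]), one per hidden unit.

    t_max is an assumed upper bound on dependency length (must exceed 2).
    """
    rng = rng or random.Random()
    return [math.log(rng.uniform(1.0, t_max - 1.0)) for _ in range(hidden_size)]

def sigmoid(z):
    """Logistic function, used to read a bias as a gate value."""
    return 1.0 / (1.0 + math.exp(-z))

# A bias of log(T - 1) gives a gate value of 1 - 1/T, i.e. a memory
# time scale of about T steps when the gate controls retention.
biases = chrono_gate_bias(hidden_size=8, t_max=100, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the initialization reproducible across runs.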
Efficient Ridge Solution for the Incremental Broad Learning System on Added Nodes by Inverse Cholesky Factorization of a Partitioned Matrix
Title | Efficient Ridge Solution for the Incremental Broad Learning System on Added Nodes by Inverse Cholesky Factorization of a Partitioned Matrix |
Authors | Hufei Zhu, Chenghao Wei |
Abstract | To accelerate the existing Broad Learning System (BLS) for newly added nodes in [7], we extend the inverse Cholesky factorization in [10] to deduce an efficient inverse Cholesky factorization for a Hermitian matrix partitioned into 2 × 2 blocks, which is utilized to develop the proposed BLS algorithm 1. The proposed BLS algorithm 1 computes the ridge solution (i.e., the output weights) from the inverse Cholesky factor of the Hermitian matrix in the ridge inverse, and updates the inverse Cholesky factor efficiently. From the proposed BLS algorithm 1, we deduce the proposed ridge inverse, which can be obtained from the generalized inverse in [7] by just changing one matrix in the equation to compute the newly added sub-matrix. We also modify the proposed algorithm 1 into the proposed algorithm 2, which is equivalent to the existing BLS algorithm [7] in terms of numerical computations. The proposed algorithms 1 and 2 can reduce the computational complexity, since usually the Hermitian matrix in the ridge inverse is smaller than the ridge inverse. With respect to the existing BLS algorithm, the proposed algorithms 1 and 2 usually require about 1/3 and 2/3 of the complexity, respectively, while in numerical experiments they achieve speedups (in each additional training time) of 2.40 - 2.91 and 1.36 - 1.60, respectively. Numerical experiments also show that the proposed algorithm 1 and the standard ridge solution always bear the same testing accuracy, and usually so do the proposed algorithm 2 and the existing BLS algorithm. The existing BLS assumes the ridge parameter λ → 0, since it is based on the generalized inverse with the ridge regression approximation. When the assumption λ → 0 is not satisfied, the standard ridge solution obviously achieves a better testing accuracy than the existing BLS algorithm in numerical experiments. |
Tasks | |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.04872v1 |
https://arxiv.org/pdf/1911.04872v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-ridge-solution-for-the-incremental |
Repo | |
Framework | |
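For orientation, the standard ridge solution that the abstract compares against can be computed from a Cholesky factorization of the Hermitian matrix AᵀA + λI. The sketch below shows only that baseline for a real-valued matrix and a single output column. It is not the paper's incremental inverse-Cholesky update, and the function names are hypothetical.

```python
import math

def cholesky(M):
    """Lower-triangular L with L @ L.T == M, for a symmetric positive definite M."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(M[i][i] - s)
            else:
                L[i][j] = (M[i][j] - s) / L[j][j]
    return L

def ridge_solve(A, y, lam):
    """Solve (A^T A + lam * I) w = A^T y via Cholesky plus two triangular solves."""
    n = len(A[0])
    G = [[sum(r[i] * r[j] for r in A) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(n)]
    L = cholesky(G)
    z = []  # forward substitution: L z = b
    for i in range(n):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    w = [0.0] * n  # back substitution: L^T w = z
    for i in reversed(range(n)):
        w[i] = (z[i] - sum(L[k][i] * w[k] for k in range(i + 1, n))) / L[i][i]
    return w

# Output weights for a tiny system with ridge parameter lam = 1.
w = ridge_solve([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0], lam=1.0)
```

The factorized matrix has one row and column per node, which is why adding nodes motivates updating the factor rather than recomputing it.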
Shapley Homology: Topological Analysis of Sample Influence for Neural Networks
Title | Shapley Homology: Topological Analysis of Sample Influence for Neural Networks |
Authors | Kaixuan Zhang, Qinglong Wang, Xue Liu, C. Lee Giles |
Abstract | Data samples collected for training machine learning models are typically assumed to be independent and identically distributed (iid). Recent research has demonstrated that this assumption can be problematic as it simplifies the manifold of structured data. This has motivated different research areas such as data poisoning, model improvement, and explanation of machine learning models. In this work, we study the influence of a sample on determining the intrinsic topological features of its underlying manifold. We propose the Shapley Homology framework, which provides a quantitative metric for the influence of a sample on the homology of a simplicial complex. By interpreting the influence as a probability measure, we further define an entropy which reflects the complexity of the data manifold. Our empirical studies show that when using the 0-dimensional homology, on neighboring graphs, samples with higher influence scores have more impact on the accuracy of neural networks for determining the graph connectivity and on several regular grammars whose higher entropy values imply more difficulty in being learned. |
Tasks | Data Poisoning |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06509v1 |
https://arxiv.org/pdf/1910.06509v1.pdf | |
PWC | https://paperswithcode.com/paper/shapley-homology-topological-analysis-of |
Repo | |
Framework | |
Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification
Title | Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification |
Authors | T M Feroz Ali, Subhasis Chaudhuri |
Abstract | Person re-identification is the task of matching pedestrian images across non-overlapping cameras. In this paper, we propose a non-linear cross-view similarity metric learning method for handling small-size training data in practical re-ID systems. The method employs non-linear mappings combined with cross-view discriminative subspace learning and cross-view distance metric learning based on pairwise similarity constraints. It is a natural extension of XQDA from linear to non-linear mappings using kernels, and learns non-linear transformations for efficiently handling complex non-linearity of person appearance across camera views. Importantly, the proposed method is very computationally efficient. Extensive experiments on four challenging datasets show that our method attains competitive performance against state-of-the-art methods. |
Tasks | Metric Learning, Person Re-Identification |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11316v1 |
https://arxiv.org/pdf/1909.11316v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-view-kernel-similarity-metric-learning |
Repo | |
Framework | |
Multi-modal Deep Analysis for Multimedia
Title | Multi-modal Deep Analysis for Multimedia |
Authors | Wenwu Zhu, Xin Wang, Hongzhi Li |
Abstract | With the rapid development of Internet and multimedia services in the past decade, a huge amount of user-generated and service provider-generated multimedia data has become available. These data are heterogeneous and multi-modal in nature, imposing great challenges for processing and analyzing them. Multi-modal data consist of a mixture of various types of data from different modalities such as text, images, video, and audio. In this article, we present a deep and comprehensive overview of multi-modal analysis in multimedia. We introduce two scientific research problems, data-driven correlational representation and knowledge-guided fusion for multimedia analysis. To address the two scientific problems, we investigate them from the following aspects: 1) multi-modal correlational representation: multi-modal fusion of data across different modalities, and 2) multi-modal data and knowledge fusion: multi-modal fusion of data with domain knowledge. More specifically, on data-driven correlational representation, we highlight three important categories of methods, namely multi-modal deep representation, multi-modal transfer learning, and multi-modal hashing. On knowledge-guided fusion, we discuss the approaches for fusing knowledge with data and four exemplar applications that require various kinds of domain knowledge, including multi-modal visual question answering, multi-modal video summarization, multi-modal visual pattern mining and multi-modal recommendation. Finally, we bring forward our insights and future research directions. |
Tasks | Question Answering, Transfer Learning, Video Summarization, Visual Question Answering |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.04964v2 |
https://arxiv.org/pdf/1910.04964v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-modal-deep-analysis-for-multimedia |
Repo | |
Framework | |
Re-ID Driven Localization Refinement for Person Search
Title | Re-ID Driven Localization Refinement for Person Search |
Authors | Chuchu Han, Jiacheng Ye, Yunshan Zhong, Xin Tan, Chi Zhang, Changxin Gao, Nong Sang |
Abstract | Person search aims at localizing and identifying a query person from a gallery of uncropped scene images. Different from person re-identification (re-ID), its performance also depends on the localization accuracy of a pedestrian detector. The state-of-the-art methods train the detector individually, and the detected bounding boxes may be sub-optimal for the following re-ID task. To alleviate this issue, we propose a re-ID driven localization refinement framework for providing the refined detection boxes for person search. Specifically, we develop a differentiable ROI transform layer to effectively transform the bounding boxes from the original images. Thus, the box coordinates can be supervised by the re-ID training other than the original detection task. With this supervision, the detector can generate more reliable bounding boxes, and the downstream re-ID model can produce more discriminative embeddings based on the refined person localizations. Extensive experimental results on the widely used benchmarks demonstrate that our proposed method performs favorably against the state-of-the-art person search methods. |
Tasks | Person Re-Identification, Person Search |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08580v1 |
https://arxiv.org/pdf/1909.08580v1.pdf | |
PWC | https://paperswithcode.com/paper/re-id-driven-localization-refinement-for |
Repo | |
Framework | |
Synthesis of Provably Correct Autonomy Protocols for Shared Control
Title | Synthesis of Provably Correct Autonomy Protocols for Shared Control |
Authors | Murat Cubuktepe, Nils Jansen, Mohammed Alsiekh, Ufuk Topcu |
Abstract | We synthesize shared control protocols subject to probabilistic temporal logic specifications. More specifically, we develop a framework in which a human and an autonomy protocol can issue commands to carry out a certain task. We blend these commands into a joint input to a robot. We model the interaction between the human and the robot as a Markov decision process (MDP) that represents the shared control scenario. Using inverse reinforcement learning, we obtain an abstraction of the human’s behavior and decisions. We use randomized strategies to account for randomness in the human’s decisions, caused by factors such as the complexity of the task specifications or imperfect interfaces. We design the autonomy protocol to ensure that the resulting robot behavior satisfies given safety and performance specifications in probabilistic temporal logic. Additionally, the resulting strategies generate behavior as similar to the behavior induced by the human’s commands as possible. We solve the underlying problem efficiently using quasiconvex programming. Case studies involving autonomous wheelchair navigation and unmanned aerial vehicle mission planning showcase the applicability of our approach. |
Tasks | |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06471v1 |
https://arxiv.org/pdf/1905.06471v1.pdf | |
PWC | https://paperswithcode.com/paper/synthesis-of-provably-correct-autonomy |
Repo | |
Framework | |
Robust and Computationally-Efficient Anomaly Detection using Powers-of-Two Networks
Title | Robust and Computationally-Efficient Anomaly Detection using Powers-of-Two Networks |
Authors | Usama Muneeb, Erdem Koyuncu, Yasaman Keshtkarjahromi, Hulya Seferoglu, Mehmet Fatih Erden, Ahmet Enis Cetin |
Abstract | Robust and computationally efficient anomaly detection in videos is a problem in video surveillance systems. We propose a technique to increase robustness and reduce computational complexity in a Convolutional Neural Network (CNN) based anomaly detector that utilizes the optical flow information of video data. We reduce the complexity of the network by denoising the intermediate layer outputs of the CNN and by using powers-of-two weights, which replace the computationally expensive multiplication operations with bit-shift operations. The denoising operation during inference forces small-valued intermediate layer outputs to zero. Because the number of zeros in the network significantly increases as a result of denoising, we can implement the CNN about 10% faster than a comparable network while detecting all the anomalies in the testing set. It turns out that the denoising operation also provides robustness because the contribution of small intermediate values to the final result is negligible. During training we also generate motion vector images with a Generative Adversarial Network (GAN) to improve the robustness of the overall system. We experimentally observe that the resulting system is robust to background motion. |
Tasks | Anomaly Detection, Denoising, Optical Flow Estimation |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.14096v1 |
https://arxiv.org/pdf/1910.14096v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-and-computationally-efficient-anomaly |
Repo | |
Framework | |
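The two complexity reductions described in the abstract above, powers-of-two weights and denoising of intermediate outputs, can be illustrated with a short sketch. The rounding rule (nearest exponent in the log domain), the integer activations, the threshold value, and all function names are assumptions for illustration; the abstract does not give these details.

```python
import math

def quantize_to_power_of_two(w):
    """Return (sign, exponent) so that sign * 2**exponent approximates w.

    The exponent is the nearest integer to log2(|w|) (assumed rounding rule).
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    return sign, round(math.log2(abs(w)))

def shift_multiply(x, sign, exponent):
    """Multiply a non-negative integer activation x by sign * 2**exponent.

    Only bit shifts are used; a negative exponent becomes a right shift,
    which truncates the fractional part.
    """
    if sign == 0:
        return 0
    shifted = x << exponent if exponent >= 0 else x >> -exponent
    return sign * shifted

def denoise(values, threshold):
    """Force intermediate outputs with magnitude below threshold to zero."""
    return [0 if abs(v) < threshold else v for v in values]

# A weight of 0.23 is stored as +2**-2, so 64 * 0.23 is approximated by 64 >> 2.
sign, exp = quantize_to_power_of_two(0.23)
approx = shift_multiply(64, sign, exp)
```

Zeros produced by `denoise` can be skipped entirely in later layers, which is one plausible source of the speedup the abstract reports.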