April 3, 2020

Paper Group ANR 72

BUT Opensat 2019 Speech Recognition System. Cost-Sensitive BERT for Generalisable Sentence Classification with Imbalanced Data. Equivariant flow-based sampling for lattice gauge theory. Neural Networks are Surprisingly Modular. A Job-Assignment Heuristic for Lifelong Multi-Agent Path Finding Problem with Multiple Delivery Locations. Bipartite Link …

BUT Opensat 2019 Speech Recognition System


Title	BUT Opensat 2019 Speech Recognition System
Authors	Martin Karafiát, Murali Karthick Baskar, Igor Szöke, Hari Krishna Vydana, Karel Veselý, Jan “Honza” Černocký
Abstract	The paper describes the BUT Automatic Speech Recognition (ASR) systems submitted to the OpenSAT evaluations in two domain categories: low-resourced languages and public safety communications. The first was challenging due to the lack of training data, so various architectures and multilingual approaches were employed; their combination led to superior performance. The second was challenging due to recordings made in extreme conditions, such as a specific channel, speakers under stress, and high levels of noise. Data augmentation was essential to reach reasonably good performance.
Tasks	Data Augmentation, Speech Recognition
Published	2020-01-30
URL	https://arxiv.org/abs/2001.11360v1
PDF	https://arxiv.org/pdf/2001.11360v1.pdf
PWC	https://paperswithcode.com/paper/but-opensat-2019-speech-recognition-system
Repo
Framework

Cost-Sensitive BERT for Generalisable Sentence Classification with Imbalanced Data


Title	Cost-Sensitive BERT for Generalisable Sentence Classification with Imbalanced Data
Authors	Harish Tayyar Madabushi, Elena Kochkina, Michael Castelle
Abstract	The automatic identification of propaganda has gained significance in recent years due to technological and social changes in the way news is generated and consumed. That this task can be addressed effectively using BERT, a powerful new architecture which can be fine-tuned for text classification tasks, is not surprising. However, propaganda detection, like other tasks that deal with news documents and other forms of decontextualized social communication (e.g. sentiment analysis), inherently deals with data whose categories are simultaneously imbalanced and dissimilar. We show that BERT, while capable of handling imbalanced classes with no additional data augmentation, does not generalise well when the training and test data are sufficiently dissimilar (as is often the case with news sources, whose topics evolve over time). We show how to address this problem by providing a statistical measure of similarity between datasets and a method of incorporating cost-weighting into BERT when the training and test sets are dissimilar. We test these methods on the Propaganda Techniques Corpus (PTC) and achieve the second-highest score on sentence-level propaganda classification.
Tasks	Data Augmentation, Sentence Classification, Sentiment Analysis, Text Classification
Published	2020-03-16
URL	https://arxiv.org/abs/2003.11563v1
PDF	https://arxiv.org/pdf/2003.11563v1.pdf
PWC	https://paperswithcode.com/paper/cost-sensitive-bert-for-generalisable-1
Repo
Framework
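
The abstract above mentions incorporating cost-weighting into BERT. As a rough illustration only, the sketch below shows a generic class-weighted cross-entropy in plain Python; the weights, logits, and the normalisation by total weight are assumptions made for the example, not the authors' published configuration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_cross_entropy(batch_logits, labels, class_weights):
    """Mean cross-entropy where each example is scaled by its class weight.

    Dividing by the sum of the applied weights keeps the loss on a
    comparable scale whatever the class mix of the batch (an assumption
    of this sketch, not something stated in the abstract).
    """
    total_loss = 0.0
    total_weight = 0.0
    for logits, label in zip(batch_logits, labels):
        probs = softmax(logits)
        w = class_weights[label]
        total_loss += -w * math.log(probs[label])
        total_weight += w
    return total_loss / total_weight

# Hypothetical two-class batch: class 1 stands in for the rare class.
logits = [[2.0, 0.0], [0.0, 2.0], [1.5, 0.5]]
labels = [0, 1, 1]
uniform = weighted_cross_entropy(logits, labels, [1.0, 1.0])
costly = weighted_cross_entropy(logits, labels, [1.0, 4.0])
```

With the larger weight on class 1, the misclassified rare-class example dominates the loss, which is the effect cost-weighting is meant to have.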

Equivariant flow-based sampling for lattice gauge theory


Title	Equivariant flow-based sampling for lattice gauge theory
Authors	Gurtej Kanwar, Michael S. Albergo, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Sébastien Racanière, Danilo Jimenez Rezende, Phiala E. Shanahan
Abstract	We define a class of machine-learned flow-based sampling algorithms for lattice gauge theories that are gauge-invariant by construction. We demonstrate the application of this framework to U(1) gauge theory in two spacetime dimensions, and find that near critical points in parameter space the approach is orders of magnitude more efficient at sampling topological quantities than more traditional sampling procedures such as Hybrid Monte Carlo and Heat Bath.
Tasks
Published	2020-03-13
URL	https://arxiv.org/abs/2003.06413v1
PDF	https://arxiv.org/pdf/2003.06413v1.pdf
PWC	https://paperswithcode.com/paper/equivariant-flow-based-sampling-for-lattice
Repo
Framework

Neural Networks are Surprisingly Modular


Title	Neural Networks are Surprisingly Modular
Authors	Daniel Filan, Shlomi Hod, Cody Wild, Andrew Critch, Stuart Russell
Abstract	The learned weights of a neural network are often considered devoid of scrutable internal structure. In order to attempt to discern structure in these weights, we introduce a measurable notion of modularity for multi-layer perceptrons (MLPs), and investigate the modular structure of MLPs trained on datasets of small images. Our notion of modularity comes from the graph clustering literature: a “module” is a set of neurons with strong internal connectivity but weak external connectivity. We find that MLPs that undergo training and weight pruning are often significantly more modular than random networks with the same distribution of weights. Interestingly, they are much more modular when trained with dropout. Further analysis shows that this modularity seems to arise mostly for networks trained on learnable datasets. We also present exploratory analyses of the importance of different modules for performance and how modules depend on each other. Understanding the modular structure of neural networks, when such structure exists, will hopefully render their inner workings more interpretable to engineers.
Tasks	Graph Clustering
Published	2020-03-10
URL	https://arxiv.org/abs/2003.04881v2
PDF	https://arxiv.org/pdf/2003.04881v2.pdf
PWC	https://paperswithcode.com/paper/neural-networks-are-surprisingly-modular
Repo
Framework
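
The abstract defines a module as a set of neurons with strong internal and weak external connectivity. One common way to score such a partition in the graph clustering literature is the normalised cut; the sketch below computes it for a small weighted graph. Treating a network as an undirected graph with non-negative edge weights is an assumption of this example, and the toy matrix is made up.

```python
def ncut(weights, clusters):
    """Normalised cut of a partition of an undirected weighted graph.

    weights:  symmetric matrix (list of lists) of non-negative edge weights
    clusters: list of lists of node indices forming a partition
    Every cluster is assumed to contain at least one node with an edge.
    """
    n = len(weights)
    degree = [sum(weights[i]) for i in range(n)]
    total = 0.0
    for cluster in clusters:
        inside = set(cluster)
        volume = sum(degree[i] for i in cluster)
        # Weight of edges leaving the cluster.
        cut = sum(weights[i][j] for i in cluster for j in range(n) if j not in inside)
        total += cut / volume
    return total

# Toy graph: two tightly connected pairs joined by one weak edge.
W = [
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
good = ncut(W, [[0, 1], [2, 3]])  # partition that respects the two pairs
bad = ncut(W, [[0, 2], [1, 3]])   # partition that splits both pairs
```

A lower value indicates a more modular partition, so `good` comes out far smaller than `bad` here.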

A Job-Assignment Heuristic for Lifelong Multi-Agent Path Finding Problem with Multiple Delivery Locations


Title	A Job-Assignment Heuristic for Lifelong Multi-Agent Path Finding Problem with Multiple Delivery Locations
Authors	Fatih Semiz, Faruk Polat
Abstract	Multi-agent path finding (MAPF) algorithms are offline methods intended to find conflict-free paths for more than one agent. However, for many real-life applications, this problem description is inadequate for representing the needs of the domain. To address this issue we worked on a lifelong variation in which agents can have more than one ordered destination. New destinations can be inserted into the system at any time after the initial job assignment has been made; these new destinations must also be assigned to agents, and the time of visiting each new destination must be determined. We called this Lifelong Multi-Agent Path Finding with Multiple Delivery Locations (MAPF-MD). To solve this problem we introduced the Multiple Delivery Conflict-Based Search algorithm (MD-DCBS). We used D-lite in the low-level search of CBS to benefit from D-lite’s incremental nature and achieve a performance increase in the CBS search. To handle multiple delivery locations we define multiple D-lite instances for each agent. The aggregation of all the paths produced by an agent's D-lite instances constitutes the path of that agent. We then run CBS on the aggregated paths. We have shown that this version solves MAPF-MD instances correctly. We also proposed multiple job-assignment heuristics to generate low-total-cost solutions and determined the best-performing method among them.
Tasks	Multi-Agent Path Finding
Published	2020-03-16
URL	https://arxiv.org/abs/2003.07108v1
PDF	https://arxiv.org/pdf/2003.07108v1.pdf
PWC	https://paperswithcode.com/paper/a-job-assignment-heuristic-for-lifelong-multi
Repo
Framework
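
The abstract describes joining the per-destination paths of an agent into one path and then running CBS on the aggregated paths. The sketch below illustrates only those two ingredients under simplifying assumptions: grid cells as tuples, one move per timestep, finished agents waiting at their last cell, and vertex conflicts only (edge swaps are not checked). The function names are hypothetical.

```python
def aggregate(segments):
    """Concatenate consecutive path segments into one path.

    Each segment must start at the cell where the previous one ended;
    the duplicated junction cell is dropped when joining.
    """
    path = list(segments[0])
    for segment in segments[1:]:
        if segment[0] != path[-1]:
            raise ValueError("segments must be contiguous")
        path.extend(segment[1:])
    return path

def first_vertex_conflict(path_a, path_b):
    """Return (timestep, cell) of the first vertex conflict, or None."""
    horizon = max(len(path_a), len(path_b))
    for t in range(horizon):
        # An agent past the end of its path is assumed to wait in place.
        a = path_a[min(t, len(path_a) - 1)]
        b = path_b[min(t, len(path_b) - 1)]
        if a == b:
            return t, a
    return None

# Two agents, each with two delivery legs on a small grid.
path_a = aggregate([[(0, 0), (0, 1), (0, 2)], [(0, 2), (1, 2)]])
path_b = aggregate([[(1, 0), (1, 1)], [(1, 1), (1, 2)]])
conflict = first_vertex_conflict(path_a, path_b)
```

In a full CBS implementation, a conflict like this would be turned into constraints on the two agents and the low-level search would be re-run.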

Bipartite Link Prediction based on Topological Features via 2-hop Path


Title	Bipartite Link Prediction based on Topological Features via 2-hop Path
Authors	Jungwoon Shin
Abstract	A variety of real-world systems can be modeled as bipartite networks. One of the most powerful and simple link prediction methods is the Linear Graph Autoencoder (LGAE), which has promising performance on challenging tasks such as link prediction and node clustering. LGAE relies on a simple linear model w.r.t. the adjacency matrix of the graph to learn vector space representations of nodes. In this paper, we consider the case of bipartite link prediction where node attributes are unavailable. When using LGAE, we propose to multiply the reconstructed adjacency matrix with a symmetrically normalized training adjacency matrix. As a result, 2-hop paths are formed, which we use as the predicted adjacency matrix to evaluate the performance of our model. Experimental results on both synthetic and real-world datasets show that our approach consistently outperforms the Graph Autoencoder and Linear Graph Autoencoder models on 10 out of 12 bipartite datasets and reaches competitive performance on the other 2.
Tasks	Link Prediction
Published	2020-03-19
URL	https://arxiv.org/abs/2003.08572v1
PDF	https://arxiv.org/pdf/2003.08572v1.pdf
PWC	https://paperswithcode.com/paper/bipartite-link-prediction-based-on
Repo
Framework
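
The 2-hop step in the abstract, multiplying the reconstructed adjacency matrix by a symmetrically normalized training adjacency matrix, can be sketched as follows. This is a minimal illustration: the normalisation D^{-1/2} A D^{-1/2} without self-loops, the multiplication order, and the hand-written "reconstructed" matrix are all assumptions, since the abstract does not spell them out.

```python
import math

def sym_normalize(adj):
    """Return D^{-1/2} A D^{-1/2}; isolated nodes keep all-zero rows."""
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    n = len(adj)
    return [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product for small list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def two_hop_scores(reconstructed, train_adj):
    """Score links by one reconstructed hop followed by one training edge."""
    return matmul(reconstructed, sym_normalize(train_adj))

# Nodes 0 and 1 form one side of the bipartite graph, nodes 2 and 3 the other.
# Training edges: 0-2, 1-2, 1-3. The edge 0-3 is held out.
train_adj = [
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
# Hypothetical autoencoder output in which nodes 0 and 1 look similar.
reconstructed = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
scores = two_hop_scores(reconstructed, train_adj)
```

Here the held-out pair (0, 3) receives a positive score because node 0 resembles node 1, which has a training edge to node 3.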

Approximability of Monotone Submodular Function Maximization under Cardinality and Matroid Constraints in the Streaming Model


Title	Approximability of Monotone Submodular Function Maximization under Cardinality and Matroid Constraints in the Streaming Model
Authors	Chien-Chung Huang, Naonori Kakimura, Simon Mauras, Yuichi Yoshida
Abstract	Maximizing a monotone submodular function under various constraints is a classical and intensively studied problem. However, in the single-pass streaming model, where the elements arrive one by one and an algorithm can store only a small fraction of input elements, there is a large gap in our knowledge, even though several approximation algorithms have been proposed in the literature. In this work, we present the first lower bounds on the approximation ratios for cardinality and matroid constraints that beat $1-\frac{1}{e}$ in the single-pass streaming model. Let $n$ be the number of elements in the stream. Then, we prove that any (randomized) streaming algorithm for a cardinality constraint with approximation ratio $\frac{2}{2+\sqrt{2}}+\varepsilon$ requires $\Omega\left(\frac{n}{K^2}\right)$ space for any $\varepsilon>0$, where $K$ is the size limit of the output set. We also prove that any (randomized) streaming algorithm for a (partition) matroid constraint with approximation ratio $\frac{K}{2K-1}+\varepsilon$ requires $\Omega\left(\frac{n}{K}\right)$ space for any $\varepsilon>0$, where $K$ is the rank of the given matroid. In addition, we give streaming algorithms when we only have a weak oracle with which we can only evaluate function values on feasible sets. Specifically, we show weak-oracle streaming algorithms for cardinality and matroid constraints with approximation ratios $\frac{K}{2K-1}$ and $\frac{1}{2}$, respectively, whose space complexity is exponential in $K$ but is independent of $n$. The former one exactly matches the known inapproximability result for a cardinality constraint in the weak oracle model. The latter one almost matches our lower bound of $\frac{K}{2K-1}$ for a matroid constraint, which almost settles the approximation ratio for a matroid constraint that can be obtained by a streaming algorithm whose space complexity is independent of $n$.
Tasks
Published	2020-02-13
URL	https://arxiv.org/abs/2002.05477v1
PDF	https://arxiv.org/pdf/2002.05477v1.pdf
PWC	https://paperswithcode.com/paper/approximability-of-monotone-submodular
Repo
Framework
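
For orientation, the thresholds quoted in the abstract above can be evaluated numerically. This is plain arithmetic on the stated ratios, not an additional result:

$$\frac{2}{2+\sqrt{2}} = \frac{2(2-\sqrt{2})}{(2+\sqrt{2})(2-\sqrt{2})} = 2-\sqrt{2} \approx 0.586, \qquad 1-\frac{1}{e} \approx 0.632, \qquad \frac{K}{2K-1} = \frac{1}{2} + \frac{1}{2(2K-1)} \xrightarrow{K\to\infty} \frac{1}{2}.$$

So the cardinality threshold $2-\sqrt{2}$ lies below $1-\frac{1}{e}$, and the matroid threshold $\frac{K}{2K-1}$ drops below $1-\frac{1}{e}$ once $K \ge 3$ (it equals $\frac{2}{3}$ at $K=2$ and $0.6$ at $K=3$).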

A Simple Fix for Convolutional Neural Network via Coordinate Embedding


Title	A Simple Fix for Convolutional Neural Network via Coordinate Embedding
Authors	Liliang Ren, Zhuonan Hao
Abstract	Convolutional Neural Networks (CNNs) have been widely applied in the realm of computer vision. However, because CNN models are translation invariant, they are not aware of the coordinate information of each pixel. This limits the generalization ability of CNNs, since coordinate information is crucial for a model to learn affine transformations, which operate directly on the coordinates of each pixel. In this project, we propose a simple approach to incorporate coordinate information into the CNN model through coordinate embedding. Our approach does not change the downstream model architecture and can be easily applied to pre-trained models for tasks like object detection. Our experiments on the German Traffic Sign Detection Benchmark show that our approach not only significantly improves model performance but is also more robust to affine transformations.
Tasks	Object Detection
Published	2020-03-24
URL	https://arxiv.org/abs/2003.10589v1
PDF	https://arxiv.org/pdf/2003.10589v1.pdf
PWC	https://paperswithcode.com/paper/a-simple-fix-for-convolutional-neural-network
Repo
Framework

Cross-domain Object Detection through Coarse-to-Fine Feature Adaptation


Title	Cross-domain Object Detection through Coarse-to-Fine Feature Adaptation
Authors	Yangtao Zheng, Di Huang, Songtao Liu, Yunhong Wang
Abstract	Recent years have witnessed great progress in deep learning based object detection. However, due to the domain shift problem, applying off-the-shelf detectors to an unseen domain leads to a significant performance drop. To address this issue, this paper proposes a novel coarse-to-fine feature adaptation approach to cross-domain object detection. At the coarse-grained stage, different from the rough image-level or instance-level feature alignment used in the literature, foreground regions are extracted by adopting the attention mechanism, and aligned according to their marginal distributions via multi-layer adversarial learning in the common feature space. At the fine-grained stage, we conduct conditional distribution alignment of foregrounds by minimizing the distance of global prototypes with the same category but from different domains. Thanks to this coarse-to-fine feature adaptation, domain knowledge in foreground regions can be effectively transferred. Extensive experiments are carried out in various cross-domain detection scenarios. The results are state-of-the-art, demonstrating the broad applicability and effectiveness of the proposed approach.
Tasks	Object Detection
Published	2020-03-23
URL	https://arxiv.org/abs/2003.10275v1
PDF	https://arxiv.org/pdf/2003.10275v1.pdf
PWC	https://paperswithcode.com/paper/cross-domain-object-detection-through-coarse
Repo
Framework

Detecting Face2Face Facial Reenactment in Videos


Title	Detecting Face2Face Facial Reenactment in Videos
Authors	Prabhat Kumar, Mayank Vatsa, Richa Singh
Abstract	Visual content has become the primary source of information, as evident in the billions of images and videos shared and uploaded on the Internet every single day. This has led to an increase in alterations in images and videos to make them more informative and eye-catching for viewers worldwide. Some of these alterations are simple, like copy-move, and are easily detectable, while other sophisticated alterations like reenactment based DeepFakes are hard to detect. Reenactment alterations allow the source to change the target expressions and create photo-realistic images and videos. While this technology can potentially be used for several applications, the malicious usage of automatic reenactment has very large social implications. It is therefore important to develop detection techniques to distinguish real images and videos from the altered ones. This research proposes a learning-based algorithm for detecting reenactment based alterations. The proposed algorithm uses a multi-stream network that learns regional artifacts and provides a robust performance at various compression levels. We also propose a loss function for the balanced learning of the streams for the proposed network. The performance is evaluated on the publicly available FaceForensics dataset. The results show state-of-the-art classification accuracy of 99.96%, 99.10%, and 91.20% for no, easy, and hard compression factors, respectively.
Tasks
Published	2020-01-21
URL	https://arxiv.org/abs/2001.07444v1
PDF	https://arxiv.org/pdf/2001.07444v1.pdf
PWC	https://paperswithcode.com/paper/detecting-face2face-facial-reenactment-in
Repo
Framework

Robust Unsupervised Neural Machine Translation with Adversarial Training


Title	Robust Unsupervised Neural Machine Translation with Adversarial Training
Authors	Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao
Abstract	Unsupervised neural machine translation (UNMT) has recently attracted great interest in the machine translation community, achieving only slightly worse results than supervised neural machine translation. However, in real-world scenarios, there usually exists minor noise in the input sentence, and the neural translation system is sensitive to small perturbations in the input, leading to poor performance. In this paper, we first define two types of noise and empirically show the effect of such noisy data on UNMT performance. Moreover, we propose adversarial training methods to improve the robustness of UNMT in the noisy scenario. To the best of our knowledge, this paper is the first work to explore the robustness of UNMT. Experimental results on several language pairs show that our proposed methods substantially outperform conventional UNMT systems in the noisy scenario.
Tasks	Machine Translation
Published	2020-02-28
URL	https://arxiv.org/abs/2002.12549v1
PDF	https://arxiv.org/pdf/2002.12549v1.pdf
PWC	https://paperswithcode.com/paper/robust-unsupervised-neural-machine
Repo
Framework

InfDetect: a Large Scale Graph-based Fraud Detection System for E-Commerce Insurance


Title	InfDetect: a Large Scale Graph-based Fraud Detection System for E-Commerce Insurance
Authors	Cen Chen, Chen Liang, Jianbin Lin, Li Wang, Ziqi Liu, Xinxing Yang, Xiukun Wang, Jun Zhou, Yang Shuang, Yuan Qi
Abstract	The insurance industry has been creating innovative products around the emerging online shopping activities. Such e-commerce insurance is designed to protect buyers from potential risks such as impulse purchases and counterfeits. Fraudulent claims towards online insurance typically involve multiple parties such as buyers, sellers, and express companies, and they could lead to heavy financial losses. In order to uncover the relations behind organized fraudsters and detect fraudulent claims, we developed a large-scale insurance fraud detection system, i.e., InfDetect, which provides interfaces for commonly used graphs, standard data processing procedures, and a uniform graph learning platform. InfDetect is able to process big graphs containing up to 100 million nodes and billions of edges. In this paper, we investigate different graphs to facilitate fraudster mining, such as a device-sharing graph, a transaction graph, a friendship graph, and a buyer-seller graph. These graphs are fed to a uniform graph learning platform containing supervised and unsupervised graph learning algorithms. Cases on widely applied e-commerce insurance are described to demonstrate the usage and capability of our system. InfDetect has successfully detected thousands of fraudulent claims and saved tens of thousands of dollars daily.
Tasks	Fraud Detection
Published	2020-03-05
URL	https://arxiv.org/abs/2003.02833v3
PDF	https://arxiv.org/pdf/2003.02833v3.pdf
PWC	https://paperswithcode.com/paper/infdetect-a-large-scale-graph-based-fraud
Repo
Framework

Towards A Controllable Disentanglement Network


Title	Towards A Controllable Disentanglement Network
Authors	Zengjie Song, Oluwasanmi Koyejo, Jiangshe Zhang
Abstract	This paper addresses two crucial problems of learning disentangled image representations, namely controlling the degree of disentanglement during image editing, and balancing the disentanglement strength and the reconstruction quality. To encourage disentanglement, we devise a distance covariance based decorrelation regularization. Further, for the reconstruction step, our model leverages a soft target representation combined with the latent image code. By exploring the real-valued space of the soft target representation, we are able to synthesize novel images with the designated properties. To improve the perceptual quality of images generated by autoencoder (AE)-based models, we extend the encoder-decoder architecture with the generative adversarial network (GAN) by collapsing the AE decoder and the GAN generator into one. We also design a classification based protocol to quantitatively evaluate the disentanglement strength of our model. Experimental results showcase the benefits of the proposed model.
Tasks
Published	2020-01-22
URL	https://arxiv.org/abs/2001.08572v1
PDF	https://arxiv.org/pdf/2001.08572v1.pdf
PWC	https://paperswithcode.com/paper/towards-a-controllable-disentanglement
Repo
Framework
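
The abstract relies on distance covariance as a decorrelation measure. For readers unfamiliar with it, the sketch below computes the standard sample distance covariance for two scalar samples in plain Python. How the paper turns this statistic into a regulariser on latent codes is not stated in the abstract, so this shows the statistic only; the scalar restriction and the toy data are assumptions of the example.

```python
import math

def _centered_distances(xs):
    """Pairwise |x_i - x_j| matrix, double-centred (row, column, grand mean)."""
    n = len(xs)
    d = [[abs(a - b) for b in xs] for a in xs]
    row_mean = [sum(row) / n for row in d]
    grand = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]

def distance_covariance(xs, ys):
    """Sample distance covariance of two equal-length scalar samples."""
    n = len(xs)
    a = _centered_distances(xs)
    b = _centered_distances(ys)
    squared = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_quadratic = [v * v for v in x]  # depends on x, yet is linearly uncorrelated with it
y_constant = [5.0] * len(x)       # carries no information about x
dcov_quadratic = distance_covariance(x, y_quadratic)
dcov_constant = distance_covariance(x, y_constant)
```

Unlike Pearson correlation, the statistic is clearly positive for the quadratic relationship and zero for the constant sample, which is why it is attractive as a dependence penalty.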

Hyperbolic Minesweeper is in P


Title	Hyperbolic Minesweeper is in P
Authors	Eryk Kopczyński
Abstract	We show that, while Minesweeper is NP-complete, its hyperbolic variant is in P. Our proof does not rely on the rules of Minesweeper, but is valid for any puzzle based on satisfying local constraints on a graph embedded in the hyperbolic plane.
Tasks
Published	2020-02-21
URL	https://arxiv.org/abs/2002.09534v1
PDF	https://arxiv.org/pdf/2002.09534v1.pdf
PWC	https://paperswithcode.com/paper/hyperbolic-minesweeper-is-in-p
Repo
Framework

Privacy Preserving PCA for Multiparty Modeling


Title	Privacy Preserving PCA for Multiparty Modeling
Authors	Yingting Liu, Chaochao Chen, Longfei Zheng, Li Wang, Jun Zhou, Guiquan Liu, Shuang Yang
Abstract	In this paper, we present a general multiparty modeling paradigm with Privacy Preserving Principal Component Analysis (PPPCA) for horizontally partitioned data. PPPCA can accomplish multiparty cooperative execution of PCA under the premise of keeping plaintext data local. We also propose implementations using two techniques, i.e., homomorphic encryption and secret sharing. The output of PPPCA can be sent directly to the data consumer to build any machine learning model. We conduct experiments on three UCI benchmark datasets and a real-world fraud detection dataset. Results show that the accuracy of the model built upon PPPCA is the same as that of the model with PCA built on centralized plaintext data.
Tasks	Fraud Detection
Published	2020-02-06
URL	https://arxiv.org/abs/2002.02091v3
PDF	https://arxiv.org/pdf/2002.02091v3.pdf
PWC	https://paperswithcode.com/paper/privacy-preserving-pca-for-multiparty
Repo
Framework
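
The abstract names secret sharing as one implementation technique. The sketch below shows generic additive secret sharing used to sum one private number per party; for horizontally partitioned data, the same routine could be applied entry by entry to each party's local second-moment matrix before an eigendecomposition. This is an illustration of the general technique, not the PPPCA protocol itself, and the modulus, fixed-point scale, and function names are assumptions.

```python
import secrets

MODULUS = 2**61 - 1  # assumed modulus for the share arithmetic
SCALE = 10**6        # assumed fixed-point scale for real-valued statistics

def encode(value):
    """Map a float to a fixed-point residue (negative values wrap around)."""
    return round(value * SCALE) % MODULUS

def decode(residue):
    """Inverse of encode; residues above MODULUS // 2 are read as negative."""
    if residue > MODULUS // 2:
        residue -= MODULUS
    return residue / SCALE

def share(value, n_parties):
    """Split one value into n additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((encode(value) - sum(shares)) % MODULUS)
    return shares

def secure_sum(local_values):
    """Sum one private value per party; only random-looking shares are exchanged."""
    n = len(local_values)
    all_shares = [share(v, n) for v in local_values]            # row i: shares made by party i
    partial = [sum(col) % MODULUS for col in zip(*all_shares)]  # party j adds the shares it received
    return decode(sum(partial) % MODULUS)

# Three parties each hold one entry of their local statistic.
total = secure_sum([1.5, -2.25, 4.0])
```

Each individual share is uniformly random on its own, so a party learns nothing about another party's value from the shares it receives; only the final aggregate is revealed.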