February 1, 2020

Paper Group AWR 170

End to end learning and optimization on graphs


Title	End to end learning and optimization on graphs
Authors	Bryan Wilder, Eric Ewing, Bistra Dilkina, Milind Tambe
Abstract	Real-world applications often combine learning and optimization problems on graphs. For instance, our objective may be to cluster the graph in order to detect meaningful communities (or solve other common graph optimization problems such as facility location, maxcut, and so on). However, graphs or related attributes are often only partially observed, introducing learning problems such as link prediction which must be solved prior to optimization. Standard approaches treat learning and optimization entirely separately, while recent machine learning work aims to predict the optimal solution directly from the inputs. Here, we propose an alternative decision-focused learning approach that integrates a differentiable proxy for common graph optimization problems as a layer in learned systems. The main idea is to learn a representation that maps the original optimization problem onto a simpler proxy problem that can be efficiently differentiated through. Experimental results show that our ClusterNet system outperforms both pure end-to-end approaches (that directly predict the optimal solution) and standard approaches that entirely separate learning and optimization. Code for our system is available at https://github.com/bwilder0/clusternet.
Tasks	Link Prediction
Published	2019-05-31
URL	https://arxiv.org/abs/1905.13732v3
PDF	https://arxiv.org/pdf/1905.13732v3.pdf
PWC	https://paperswithcode.com/paper/end-to-end-learning-and-optimization-on
Repo	https://github.com/bwilder0/clusternet
Framework	pytorch

Gravity-Inspired Graph Autoencoders for Directed Link Prediction


Title	Gravity-Inspired Graph Autoencoders for Directed Link Prediction
Authors	Guillaume Salha, Stratis Limnios, Romain Hennequin, Viet Anh Tran, Michalis Vazirgiannis
Abstract	Graph autoencoders (AE) and variational autoencoders (VAE) recently emerged as powerful node embedding methods. In particular, graph AE and VAE were successfully leveraged to tackle the challenging link prediction problem, aiming at figuring out whether some pairs of nodes from a graph are connected by unobserved edges. However, these models focus on undirected graphs and therefore ignore the potential direction of the link, which is limiting for numerous real-life applications. In this paper, we extend the graph AE and VAE frameworks to address link prediction in directed graphs. We present a new gravity-inspired decoder scheme that can effectively reconstruct directed graphs from a node embedding. We empirically evaluate our method on three different directed link prediction tasks, for which standard graph AE and VAE perform poorly. We achieve competitive results on three real-world graphs, outperforming several popular baselines.
Tasks	Link Prediction
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09570v4
PDF	https://arxiv.org/pdf/1905.09570v4.pdf
PWC	https://paperswithcode.com/paper/gravity-inspired-graph-autoencoders-for
Repo	https://github.com/deezer/gravity_graph_autoencoders
Framework	tf
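
The abstract above names a gravity-inspired decoder without stating its form. The sketch below assumes the commonly cited formulation, in which the score of a directed edge i -> j is a sigmoid of a learned "mass" of the target node minus a scaled log squared distance between embeddings. The function names, the `lam` parameter, and the `eps` smoothing term are illustrative choices, not taken from the authors' repository.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gravity_decoder(z_i, z_j, mass_j, lam=1.0, eps=1e-12):
    """Hypothetical score for the directed edge i -> j.

    z_i, z_j : embedding vectors of the source and target nodes
    mass_j   : scalar "mass" of the target node (assumed to be learned)
    lam      : weight of the distance term (assumed hyperparameter)
    eps      : avoids log(0) when the two embeddings coincide
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(z_i, z_j))
    return sigmoid(mass_j - lam * math.log(sq_dist + eps))


# The distance term is symmetric, but the mass term depends only on the
# target node, so the two directions of a pair can receive different scores.
z_a, z_b = [0.0, 0.0], [1.0, 0.0]
score_a_to_b = gravity_decoder(z_a, z_b, mass_j=-1.0)
score_b_to_a = gravity_decoder(z_b, z_a, mass_j=2.0)
```

Because only the target's mass enters the score, a node with a large mass attracts incoming edges from nearby nodes, which is what lets a single embedding per node reconstruct asymmetric adjacency.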

Context-Aware Cross-Lingual Mapping


Title	Context-Aware Cross-Lingual Mapping
Authors	Hanan Aldarmaki, Mona Diab
Abstract	Cross-lingual word vectors are typically obtained by fitting an orthogonal matrix that maps the entries of a bilingual dictionary from a source to a target vector space. Word vectors, however, are most commonly used for sentence or document-level representations that are calculated as the weighted average of word embeddings. In this paper, we propose an alternative to word-level mapping that better reflects sentence-level cross-lingual similarity. We incorporate context in the transformation matrix by directly mapping the averaged embeddings of aligned sentences in a parallel corpus. We also implement cross-lingual mapping of deep contextualized word embeddings using parallel sentences with word alignments. In our experiments, both approaches resulted in cross-lingual sentence embeddings that outperformed context-independent word mapping in sentence translation retrieval. Furthermore, the sentence-level transformation could be used for word-level mapping without loss in word translation quality.
Tasks	Sentence Embeddings, Word Embeddings
Published	2019-03-08
URL	http://arxiv.org/abs/1903.03243v2
PDF	http://arxiv.org/pdf/1903.03243v2.pdf
PWC	https://paperswithcode.com/paper/context-aware-crosslingual-mapping
Repo	https://github.com/h-aldarmaki/sent_translation_retrieval
Framework	tf

Robust Aggregation for Federated Learning


Title	Robust Aggregation for Federated Learning
Authors	Krishna Pillutla, Sham M. Kakade, Zaid Harchaoui
Abstract	We present a robust aggregation approach to make federated learning robust to settings when a fraction of the devices may be sending corrupted updates to the server. The proposed approach relies on a robust secure aggregation oracle based on the geometric median, which returns a robust aggregate using a constant number of calls to a regular non-robust secure average oracle. The robust aggregation oracle is privacy-preserving, similar to the secure average oracle it builds upon. We provide experimental results of the proposed approach with linear models and deep networks for two tasks in computer vision and natural language processing. The robust aggregation approach is agnostic to the level of corruption; it outperforms the classical aggregation approach in terms of robustness when the level of corruption is high, while being competitive in the regime of low corruption.
Tasks
Published	2019-12-31
URL	https://arxiv.org/abs/1912.13445v1
PDF	https://arxiv.org/pdf/1912.13445v1.pdf
PWC	https://paperswithcode.com/paper/robust-aggregation-for-federated-learning
Repo	https://github.com/krishnap25/RFA
Framework	tf
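
The abstract above describes a geometric-median aggregate obtained through repeated calls to an ordinary averaging oracle. A minimal local sketch of that idea is a smoothed Weiszfeld iteration, where each step is just a weighted average of the client updates. This is an illustration under assumptions, not the authors' implementation: there is no secure aggregation here, and `iters` and `nu` are hypothetical parameter names.

```python
import math


def weighted_average(points, weights):
    """Stand-in for the (non-robust) averaging oracle."""
    total = sum(weights)
    dim = len(points[0])
    return [sum(w * p[d] for p, w in zip(points, weights)) / total
            for d in range(dim)]


def geometric_median(points, iters=100, nu=1e-6):
    """Approximate geometric median via smoothed Weiszfeld iterations.

    Each iteration reweights every point by the inverse of its distance
    to the current estimate and takes one weighted average.
    nu bounds the weights when the estimate sits on a data point.
    """
    v = weighted_average(points, [1.0] * len(points))  # start at the mean
    for _ in range(iters):
        weights = [1.0 / max(nu, math.dist(v, p)) for p in points]
        v = weighted_average(points, weights)
    return v


# Four well-behaved updates and one corrupted update far away.
updates = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [100.0, 100.0]]
mean = weighted_average(updates, [1.0] * len(updates))
median = geometric_median(updates)
```

In this toy example the plain mean is dragged far toward the corrupted update, while the geometric median stays among the four honest ones, which is the robustness property the abstract refers to.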

ThunderNet: Towards Real-time Generic Object Detection


Title	ThunderNet: Towards Real-time Generic Object Detection
Authors	Zheng Qin, Zeming Li, Zhaoning Zhang, Yiping Bao, Gang Yu, Yuxing Peng, Jian Sun
Abstract	Real-time generic object detection on mobile platforms is a crucial but challenging computer vision task. However, previous CNN-based detectors suffer from enormous computational cost, which hinders them from real-time inference in computation-constrained scenarios. In this paper, we investigate the effectiveness of two-stage detectors in real-time generic detection and propose a lightweight two-stage detector named ThunderNet. In the backbone part, we analyze the drawbacks in previous lightweight backbones and present a lightweight backbone designed for object detection. In the detection part, we exploit an extremely efficient RPN and detection head design. To generate more discriminative feature representation, we design two efficient architecture blocks, Context Enhancement Module and Spatial Attention Module. At last, we investigate the balance between the input resolution, the backbone, and the detection head. Compared with lightweight one-stage detectors, ThunderNet achieves superior performance with only 40% of the computational cost on PASCAL VOC and COCO benchmarks. Without bells and whistles, our model runs at 24.1 fps on an ARM-based device. To the best of our knowledge, this is the first real-time detector reported on ARM platforms. Code will be released for paper reproduction.
Tasks	Object Detection
Published	2019-03-28
URL	https://arxiv.org/abs/1903.11752v2
PDF	https://arxiv.org/pdf/1903.11752v2.pdf
PWC	https://paperswithcode.com/paper/thundernet-towards-real-time-generic-object
Repo	https://github.com/zhousy1993/paper
Framework	none

Comparison of Neuronal Attention Models


Title	Comparison of Neuronal Attention Models
Authors	Mohamed Karim Belaid
Abstract	Recent models for image processing are using the Convolutional neural network (CNN) which requires a pixel per pixel analysis of the input image. This method works well. However, it is time-consuming if we have large images. To increase the performance, by improving the training time or the accuracy, we need a size-independent method. As a solution, we can add a Neuronal Attention model (NAM). The power of this new approach is that it can efficiently choose several small regions from the initial image to focus on. The purpose of this paper is to explain and also test each of the NAM’s parameters.
Tasks
Published	2019-12-07
URL	https://arxiv.org/abs/1912.03467v1
PDF	https://arxiv.org/pdf/1912.03467v1.pdf
PWC	https://paperswithcode.com/paper/comparison-of-neuronal-attention-models
Repo	https://github.com/Karim-53/Comparison-of-Neuronal-Attention-Models
Framework	none

Large-Scale Characterization and Segmentation of Internet Path Delays with Infinite HMMs


Title	Large-Scale Characterization and Segmentation of Internet Path Delays with Infinite HMMs
Authors	Maxime Mouchet, Sandrine Vaton, Thierry Chonavel, Emile Aben, Jasper den Hertog
Abstract	Round-Trip Times are one of the most commonly collected performance metrics in computer networks. Measurement platforms such as RIPE Atlas provide researchers and network operators with an unprecedented amount of historical Internet delay measurements. It would be very useful to automate the processing of these measurements (statistical characterization of paths performance, change detection, recognition of recurring patterns, etc.). Humans are pretty good at finding patterns in network measurements but it can be difficult to automate this to enable many time series being processed at the same time. In this article we introduce a new model, the HDP-HMM or infinite hidden Markov model, whose performance in trace segmentation is very close to human cognition. This is obtained at the cost of a greater complexity and the ambition of this article is to make the theory accessible to network monitoring and management researchers. We demonstrate that this model provides very accurate results on a labeled dataset and on RIPE Atlas and CAIDA MANIC data. This method has been implemented in Atlas and we introduce the publicly accessible Web API.
Tasks	Time Series
Published	2019-10-28
URL	https://arxiv.org/abs/1910.12714v1
PDF	https://arxiv.org/pdf/1910.12714v1.pdf
PWC	https://paperswithcode.com/paper/large-scale-characterization-and-segmentation
Repo	https://github.com/maxmouchet/atlas-trends-demo
Framework	none

TensorMask: A Foundation for Dense Object Segmentation


Title	TensorMask: A Foundation for Dense Object Segmentation
Authors	Xinlei Chen, Ross Girshick, Kaiming He, Piotr Dollár
Abstract	Sliding-window object detectors that generate bounding-box object predictions over a dense, regular grid have advanced rapidly and proven popular. In contrast, modern instance segmentation approaches are dominated by methods that first detect object bounding boxes, and then crop and segment these regions, as popularized by Mask R-CNN. In this work, we investigate the paradigm of dense sliding-window instance segmentation, which is surprisingly under-explored. Our core observation is that this task is fundamentally different than other dense prediction tasks such as semantic segmentation or bounding-box object detection, as the output at every spatial location is itself a geometric structure with its own spatial dimensions. To formalize this, we treat dense instance segmentation as a prediction task over 4D tensors and present a general framework called TensorMask that explicitly captures this geometry and enables novel operators on 4D tensors. We demonstrate that the tensor view leads to large gains over baselines that ignore this structure, and leads to results comparable to Mask R-CNN. These promising results suggest that TensorMask can serve as a foundation for novel advances in dense mask prediction and a more complete understanding of the task. Code will be made available.
Tasks	Instance Segmentation, Object Detection, Semantic Segmentation
Published	2019-03-28
URL	https://arxiv.org/abs/1903.12174v2
PDF	https://arxiv.org/pdf/1903.12174v2.pdf
PWC	https://paperswithcode.com/paper/tensormask-a-foundation-for-dense-object
Repo	https://github.com/youngwanLEE/detectron2
Framework	pytorch

Input-Cell Attention Reduces Vanishing Saliency of Recurrent Neural Networks


Title	Input-Cell Attention Reduces Vanishing Saliency of Recurrent Neural Networks
Authors	Aya Abdelsalam Ismail, Mohamed Gunady, Luiz Pessoa, Héctor Corrada Bravo, Soheil Feizi
Abstract	Recent efforts to improve the interpretability of deep neural networks use saliency to characterize the importance of input features to predictions made by models. Work on interpretability using saliency-based methods on Recurrent Neural Networks (RNNs) has mostly targeted language tasks, and their applicability to time series data is less understood. In this work we analyze saliency-based methods for RNNs, both classical and gated cell architectures. We show that RNN saliency vanishes over time, biasing detection of salient features only to later time steps and are, therefore, incapable of reliably detecting important features at arbitrary time intervals. To address this vanishing saliency problem, we propose a novel RNN cell structure (input-cell attention), which can extend any RNN cell architecture. At each time step, instead of only looking at the current input vector, input-cell attention uses a fixed-size matrix embedding, each row of the matrix attending to different inputs from current or previous time steps. Using synthetic data, we show that the saliency map produced by the input-cell attention RNN is able to faithfully detect important features regardless of their occurrence in time. We also apply the input-cell attention RNN on a neuroscience task analyzing functional Magnetic Resonance Imaging (fMRI) data for human subjects performing a variety of tasks. In this case, we use saliency to characterize brain regions (input features) for which activity is important to distinguish between tasks. We show that standard RNN architectures are only capable of detecting important brain regions in the last few time steps of the fMRI data, while the input-cell attention model is able to detect important brain region activity across time without latter time step biases.
Tasks	Time Series
Published	2019-10-27
URL	https://arxiv.org/abs/1910.12370v1
PDF	https://arxiv.org/pdf/1910.12370v1.pdf
PWC	https://paperswithcode.com/paper/input-cell-attention-reduces-vanishing
Repo	https://github.com/ayaabdelsalam91/Input-Cell-Attention
Framework	pytorch

Statistical Significance Testing in Information Retrieval: An Empirical Analysis of Type I, Type II and Type III Errors


Title	Statistical Significance Testing in Information Retrieval: An Empirical Analysis of Type I, Type II and Type III Errors
Authors	Julián Urbano, Harlley Lima, Alan Hanjalic
Abstract	Statistical significance testing is widely accepted as a means to assess how well a difference in effectiveness reflects an actual difference between systems, as opposed to random noise because of the selection of topics. According to recent surveys on SIGIR, CIKM, ECIR and TOIS papers, the t-test is the most popular choice among IR researchers. However, previous work has suggested computer intensive tests like the bootstrap or the permutation test, based mainly on theoretical arguments. On empirical grounds, others have suggested non-parametric alternatives such as the Wilcoxon test. Indeed, the question of which tests we should use has accompanied IR and related fields for decades now. Previous theoretical studies on this matter were limited in that we know that test assumptions are not met in IR experiments, and empirical studies were limited in that we do not have the necessary control over the null hypotheses to compute actual Type I and Type II error rates under realistic conditions. Therefore, not only is it unclear which test to use, but also how much trust we should put in them. In contrast to past studies, in this paper we employ a recent simulation methodology from TREC data to go around these limitations. Our study comprises over 500 million p-values computed for a range of tests, systems, effectiveness measures, topic set sizes and effect sizes, and for both the 2-tail and 1-tail cases. Having such a large supply of IR evaluation data with full knowledge of the null hypotheses, we are finally in a position to evaluate how well statistical significance tests really behave with IR data, and make sound recommendations for practitioners.
Tasks	Information Retrieval
Published	2019-05-27
URL	https://arxiv.org/abs/1905.11096v2
PDF	https://arxiv.org/pdf/1905.11096v2.pdf
PWC	https://paperswithcode.com/paper/statistical-significance-testing-in
Repo	https://github.com/julian-urbano/sigir2019-statistical
Framework	none
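
Among the tests mentioned in the abstract above, the permutation test is the easiest to write out by hand. The sketch below is a generic two-tailed paired (sign-flip) permutation test on per-topic score differences between two systems. It is a textbook version under assumed names (`scores_a`, `scores_b`, `n_resamples`), not the code used in the paper.

```python
import random


def paired_permutation_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-tailed paired permutation test on the mean score difference.

    Under the null hypothesis the two systems are exchangeable on each
    topic, so the sign of every per-topic difference can be flipped at
    random. The p-value is the share of resamples whose absolute mean
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    hits = 0
    for _ in range(n_resamples):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / n) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical per-topic effectiveness scores for two systems.
system_a = [0.50 + 0.01 * i for i in range(20)]
system_b = [s - 0.2 for s in system_a]
p_value = paired_permutation_test(system_a, system_b)
```

A one-tailed variant would compare the signed mean difference instead of its absolute value.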

Bottom-up Broadcast Neural Network For Music Genre Classification


Title	Bottom-up Broadcast Neural Network For Music Genre Classification
Authors	Caifeng Liu, Lin Feng, Guochao Liu, Huibing Wang, Shenglan Liu
Abstract	Music genre recognition based on visual representation has been successfully explored over the last years. Recently, there has been increasing interest in attempting convolutional neural networks (CNNs) to achieve the task. However, most of existing methods employ the mature CNN structures proposed in image recognition without any modification, which results in the learning features that are not adequate for music genre classification. Faced with the challenge of this issue, we fully exploit the low-level information from spectrograms of audios and develop a novel CNN architecture in this paper. The proposed CNN architecture takes the long contextual information into considerations, which transfers more suitable information for the decision-making layer. Various experiments on several benchmark datasets, including GTZAN, Ballroom, and Extended Ballroom, have verified the excellent performances of the proposed neural network. Codes and model will be available at “https://github.com/CaifengLiu/music-genre-classification”.
Tasks	Decision Making, Music Genre Recognition
Published	2019-01-24
URL	http://arxiv.org/abs/1901.08928v1
PDF	http://arxiv.org/pdf/1901.08928v1.pdf
PWC	https://paperswithcode.com/paper/bottom-up-broadcast-neural-network-for-music
Repo	https://github.com/CaifengLiu/music-genre-classification
Framework	none

A Generalized and Robust Method Towards Practical Gaze Estimation on Smart Phone


Title	A Generalized and Robust Method Towards Practical Gaze Estimation on Smart Phone
Authors	Tianchu Guo, Yongchao Liu, Hui Zhang, Xiabing Liu, Youngjun Kwak, Byung In Yoo, Jae-Joon Han, Changkyu Choi
Abstract	Gaze estimation for ordinary smart phone, e.g. estimating where the user is looking at on the phone screen, can be applied in various applications. However, the widely used appearance-based CNN methods still have two issues for practical adoption. First, due to the limited dataset, gaze estimation is very likely to suffer from over-fitting, leading to poor accuracy at run time. Second, the current methods are usually not robust, i.e. their prediction results having notable jitters even when the user is performing gaze fixation, which degrades user experience greatly. For the first issue, we propose a new tolerant and talented (TAT) training scheme, which is an iterative random knowledge distillation framework enhanced with cosine similarity pruning and aligned orthogonal initialization. The knowledge distillation is a tolerant teaching process providing diverse and informative supervision. The enhanced pruning and initialization is a talented learning process prompting the network to escape from the local minima and re-born from a better start. For the second issue, we define a new metric to measure the robustness of gaze estimator, and propose an adversarial training based Disturbance with Ordinal loss (DwO) method to improve it. The experimental results show that our TAT method achieves state-of-the-art performance on GazeCapture dataset, and that our DwO method improves the robustness while keeping comparable accuracy.
Tasks	Gaze Estimation
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07331v1
PDF	https://arxiv.org/pdf/1910.07331v1.pdf
PWC	https://paperswithcode.com/paper/a-generalized-and-robust-method-towards
Repo	https://github.com/antarestcguo/GazeEstimation-TAT-DwO
Framework	none

Introducing a Generative Adversarial Network Model for Lagrangian Trajectory Simulation


Title	Introducing a Generative Adversarial Network Model for Lagrangian Trajectory Simulation
Authors	Jingwei Gan, Pai Liu, Rajan K. Chakrabarty
Abstract	We introduce a generative adversarial network (GAN) model to simulate the 3-dimensional Lagrangian motion of particles trapped in the recirculation zone of a buoyancy-opposed flame. The GAN model comprises a stochastic recurrent neural network, serving as a generator, and a convoluted neural network, serving as a discriminator. Adversarial training was performed to the point where the best-trained discriminator failed to distinguish the ground truth from the trajectory produced by the best-trained generator. The model performance was then benchmarked against a statistical analysis performed on both the simulated trajectories and the ground truth, with regard to the accuracy and generalization criteria.
Tasks
Published	2019-01-13
URL	http://arxiv.org/abs/1901.03960v1
PDF	http://arxiv.org/pdf/1901.03960v1.pdf
PWC	https://paperswithcode.com/paper/introducing-a-generative-adversarial-network
Repo	https://github.com/deadzombie2333/Lagrangian_simulation_GAN
Framework	tf

Daedalus: Breaking Non-Maximum Suppression in Object Detection via Adversarial Examples


Title	Daedalus: Breaking Non-Maximum Suppression in Object Detection via Adversarial Examples
Authors	Derui Wang, Chaoran Li, Sheng Wen, Xiaojun Chang, Surya Nepal, Yang Xiang
Abstract	We demonstrate that Non-Maximum Suppression (NMS), which is commonly used in Object Detection (OD) tasks to filter redundant detection results, is no longer secure. Considering that NMS has been an integral part of OD systems, thwarting the functionality of NMS can result in unexpected or even lethal consequences for such systems. In this paper, we propose an adversarial example attack which triggers malfunctioning of NMS in end-to-end OD models. Our attack, namely Daedalus, compresses the dimensions of detection boxes to evade NMS. As a result, the final detection output contains extremely dense false positives. This can be fatal for many OD applications such as autonomous vehicle and surveillance system. Our attack can be generalised to different end-to-end OD models, such that the attack cripples various OD applications. Furthermore, we propose a way to craft robust adversarial examples by using an ensemble of popular detection models as the substitutes. Considering the pervasive nature of model reusing in real-world OD scenarios, Daedalus examples crafted based on an ensemble of substitutes can launch attacks without knowing the parameters of the victim models. Our experiments demonstrate that the attack effectively stops NMS from filtering redundant bounding boxes. As the evaluation results suggest, Daedalus increases the false positive rate in detection results to 99.9% and reduces the mean average precision scores to 0, while maintaining a low cost of distortion on the original inputs. With the widespread applications of OD, our work shows that there are serious vulnerabilities in the fundamental components of such systems and further investigation on them is required in this area.
Tasks	Object Detection
Published	2019-02-06
URL	https://arxiv.org/abs/1902.02067v2
PDF	https://arxiv.org/pdf/1902.02067v2.pdf
PWC	https://paperswithcode.com/paper/daedalus-breaking-non-maximum-suppression-in
Repo	https://github.com/NeuralSec/Daedalus-attack
Framework	tf
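
For readers unfamiliar with the component under attack in the abstract above, here is a minimal sketch of greedy Non-Maximum Suppression. It is a generic version with an assumed box format of (x1, y1, x2, y2) and an assumed IoU threshold. It is not the Daedalus attack itself and not code from the linked repository.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only
    if it does not overlap an already kept box by more than the threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping detections plus one separate detection:
# the lower-scoring duplicate is filtered out.
normal = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
kept_normal = nms(normal, [0.9, 0.8, 0.7])

# Boxes shrunk so that they no longer overlap all survive the filter.
shrunk = [(0, 0, 1, 1), (2, 2, 3, 3), (4, 4, 5, 5)]
kept_shrunk = nms(shrunk, [0.9, 0.8, 0.7])
```

The second call illustrates why compressing box dimensions, as the abstract describes, can defeat the filter: NMS only removes a box when it overlaps a higher-scoring one, so many small non-overlapping boxes pass through untouched.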

Generating Classification Weights with GNN Denoising Autoencoders for Few-Shot Learning


Title	Generating Classification Weights with GNN Denoising Autoencoders for Few-Shot Learning
Authors	Spyros Gidaris, Nikos Komodakis
Abstract	Given an initial recognition model already trained on a set of base classes, the goal of this work is to develop a meta-model for few-shot learning. The meta-model, given as input some novel classes with few training examples per class, must properly adapt the existing recognition model into a new model that can correctly classify in a unified way both the novel and the base classes. To accomplish this goal it must learn to output the appropriate classification weight vectors for those two types of classes. To build our meta-model we make use of two main innovations: we propose the use of a Denoising Autoencoder network (DAE) that (during training) takes as input a set of classification weights corrupted with Gaussian noise and learns to reconstruct the target-discriminative classification weights. In this case, the injected noise on the classification weights serves the role of regularizing the weight generating meta-model. Furthermore, in order to capture the co-dependencies between different classes in a given task instance of our meta-model, we propose to implement the DAE model as a Graph Neural Network (GNN). In order to verify the efficacy of our approach, we extensively evaluate it on ImageNet based few-shot benchmarks and we report strong results that surpass prior approaches. The code and models of our paper will be published on: https://github.com/gidariss/wDAE_GNN_FewShot
Tasks	Denoising, Few-Shot Learning
Published	2019-05-03
URL	https://arxiv.org/abs/1905.01102v1
PDF	https://arxiv.org/pdf/1905.01102v1.pdf
PWC	https://paperswithcode.com/paper/generating-classification-weights-with-gnn
Repo	https://github.com/gidariss/wDAE_GNN_FewShot
Framework	pytorch