Paper Group ANR 408
An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices. Backdooring and Poisoning Neural Networks with Image-Scaling Attacks. Cluster-Based Social Reinforcement Learning. Detection of Pitt-Hopkins Syndrome based on morphological facial features. Towards Evaluating Plan Generation Approaches with Instructional Texts. Fol …
An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices
Title | An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices |
Authors | Xiaolong Ma, Wei Niu, Tianyun Zhang, Sijia Liu, Sheng Lin, Hongjia Li, Xiang Chen, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang |
Abstract | Weight pruning has been widely acknowledged as a straightforward and effective method to eliminate redundancy in Deep Neural Networks (DNN), thereby achieving acceleration on various platforms. However, most pruning techniques are essentially trade-offs between model accuracy and regularity, which lead to impaired inference accuracy and limited on-device acceleration performance. To solve this problem, we introduce a new sparsity dimension, namely pattern-based sparsity, which comprises pattern and connectivity sparsity and is both highly accurate and hardware friendly. With carefully designed patterns, the proposed pruning consistently improves accuracy and feature extraction ability on different DNN structures and datasets, and our pattern-aware pruning framework performs pattern library extraction, pattern selection, pattern and connectivity pruning, and weight training simultaneously. The new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms. To the best of our knowledge, this is the first time mobile devices achieve real-time inference for large-scale DNN models, thanks to the unique spatial property of pattern-based sparsity and the code generation capability of compilers. |
Tasks | Code Generation |
Published | 2020-01-20 |
URL | https://arxiv.org/abs/2001.07710v2 |
https://arxiv.org/pdf/2001.07710v2.pdf | |
PWC | https://paperswithcode.com/paper/an-image-enhancing-pattern-based-sparsity-for |
Repo | |
Framework | |
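The pattern-plus-connectivity decomposition described in this abstract can be illustrated with a short sketch: each 3x3 kernel is masked with the pattern from a small library that preserves the most weight magnitude, and whole kernels with small L1 norm are then removed. The four-entry pattern library, the magnitude-based selection rule, and the 75% keep ratio below are illustrative assumptions, not the paper's learned patterns or its joint training framework.

```python
# Pattern + connectivity pruning sketch on a conv weight tensor (numpy only).
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(16, 8, 3, 3))   # (out_ch, in_ch, 3, 3) conv layer

# A tiny hypothetical pattern library: each pattern keeps 4 of the 9 positions.
patterns = np.array([
    [[0, 1, 0], [1, 1, 1], [0, 0, 0]],
    [[0, 0, 0], [1, 1, 1], [0, 1, 0]],
    [[0, 1, 0], [1, 1, 0], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 1], [0, 1, 0]],
], dtype=float)

def pattern_prune(w, patterns):
    # Pattern pruning: for every kernel, pick the pattern that preserves the most
    # weight magnitude and zero out the remaining positions.
    out = np.empty_like(w)
    for o in range(w.shape[0]):
        for i in range(w.shape[1]):
            k = w[o, i]
            scores = [(np.abs(k) * p).sum() for p in patterns]
            out[o, i] = k * patterns[int(np.argmax(scores))]
    return out

def connectivity_prune(w, keep_ratio=0.75):
    # Connectivity pruning: drop whole kernels with the smallest L1 norm.
    norms = np.abs(w).sum(axis=(2, 3))                  # (out_ch, in_ch)
    thresh = np.quantile(norms, 1.0 - keep_ratio)
    mask = (norms > thresh).astype(w.dtype)[:, :, None, None]
    return w * mask

pruned = connectivity_prune(pattern_prune(weights, patterns))
print("sparsity:", 1.0 - np.count_nonzero(pruned) / pruned.size)
```

In the paper's setting the surviving pattern/connectivity structure is what the compiler exploits for code generation; the sketch only shows how such a sparse weight tensor could be produced.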
Backdooring and Poisoning Neural Networks with Image-Scaling Attacks
Title | Backdooring and Poisoning Neural Networks with Image-Scaling Attacks |
Authors | Erwin Quiring, Konrad Rieck |
Abstract | Backdoors and poisoning attacks are a major threat to the security of machine-learning and vision systems. Often, however, these attacks leave visible artifacts in the images that can be visually detected and weaken the efficacy of the attacks. In this paper, we propose a novel strategy for hiding backdoor and poisoning attacks. Our approach builds on a recent class of attacks against image scaling. These attacks enable manipulating images such that they change their content when scaled to a specific resolution. By combining poisoning and image-scaling attacks, we can conceal the trigger of backdoors as well as hide the overlays of clean-label poisoning. Furthermore, we consider the detection of image-scaling attacks and derive an adaptive attack. In an empirical evaluation, we demonstrate the effectiveness of our strategy. First, we show that backdoors and poisoning work equally well when combined with image-scaling attacks. Second, we demonstrate that current detection defenses against image-scaling attacks are insufficient to uncover our manipulations. Overall, our work provides a novel means for hiding traces of manipulation that is applicable to different poisoning approaches. |
Tasks | |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.08633v1 |
https://arxiv.org/pdf/2003.08633v1.pdf | |
PWC | https://paperswithcode.com/paper/backdooring-and-poisoning-neural-networks |
Repo | |
Framework | |
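The image-scaling attacks that this paper builds on can be sketched in a few lines: only the pixels that a nearest-neighbor downscaler actually samples are overwritten, so the full-resolution image is barely changed while its downscaled version becomes an attacker-chosen payload. The hand-rolled nearest-neighbor resize below is an assumption standing in for whatever scaler a real preprocessing pipeline uses; real attacks also target smoother interpolation kernels by solving a small optimization problem.

```python
# Nearest-neighbor image-scaling attack sketch (numpy only).
import numpy as np

def nn_indices(n_src, n_dst):
    # Sampling positions of a simple nearest-neighbor downscale.
    return np.floor((np.arange(n_dst) + 0.5) * n_src / n_dst).astype(int)

def nn_downscale(img, h_dst, w_dst):
    rows = nn_indices(img.shape[0], h_dst)
    cols = nn_indices(img.shape[1], w_dst)
    return img[np.ix_(rows, cols)]

rng = np.random.default_rng(0)
source = rng.integers(0, 256, size=(512, 512), dtype=np.uint8)  # benign image
target = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)    # hidden payload

# Craft the attack image: copy the source, then overwrite only the sampled pixels.
attack = source.copy()
rows = nn_indices(512, 64)
cols = nn_indices(512, 64)
attack[np.ix_(rows, cols)] = target

changed = np.mean(attack != source)
print(f"pixels modified: {changed:.2%}")                  # only a small fraction
print("downscale equals payload:",
      np.array_equal(nn_downscale(attack, 64, 64), target))
```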
Cluster-Based Social Reinforcement Learning
Title | Cluster-Based Social Reinforcement Learning |
Authors | Mahak Goindani, Jennifer Neville |
Abstract | Social Reinforcement Learning methods, which model agents in large networks, are useful for fake news mitigation, personalized teaching/healthcare, and viral marketing, but it is challenging to incorporate inter-agent dependencies into the models effectively due to network size and sparse interaction data. Previous social RL approaches either ignore agent dependencies or model them in a computationally intensive manner. In this work, we incorporate agent dependencies efficiently in a compact model by clustering users (based on their payoff and contribution to the goal) and combining this with a method to easily derive personalized agent-level policies from cluster-level policies. We also propose a dynamic clustering approach that captures changing user behavior. Experiments on real-world datasets illustrate that our proposed approach learns more accurate policy estimates and converges more quickly, compared to several baselines that do not use agent correlations or only use static clusters. |
Tasks | |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00627v2 |
https://arxiv.org/pdf/2003.00627v2.pdf | |
PWC | https://paperswithcode.com/paper/cluster-based-social-reinforcement-learning |
Repo | |
Framework | |
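A minimal sketch of the clustering idea in this entry: agents are grouped by payoff/contribution features, one small policy is kept per cluster, and each agent's policy is read off from its cluster, so experience from similar agents is pooled. The feature definition, policy parameterization, and update rule below are illustrative assumptions, not the paper's formulation.

```python
# Cluster-level policies shared across agents (numpy + scikit-learn).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_agents, n_states, n_actions, n_clusters = 1000, 5, 3, 4

# Per-agent features: (average payoff, contribution to the global objective).
features = rng.normal(size=(n_agents, 2))
cluster_of = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(features)

# One tabular softmax policy per cluster instead of one per agent.
cluster_logits = np.zeros((n_clusters, n_states, n_actions))

def agent_policy(agent_id, state):
    logits = cluster_logits[cluster_of[agent_id], state]
    p = np.exp(logits - logits.max())
    return p / p.sum()

def update(agent_id, state, action, advantage, lr=0.1):
    # Toy policy-gradient-style update applied at the cluster level: every agent's
    # experience updates its cluster's logits, pooling data from similar agents.
    c = cluster_of[agent_id]
    grad = -agent_policy(agent_id, state)
    grad[action] += 1.0
    cluster_logits[c, state] += lr * advantage * grad

update(agent_id=7, state=2, action=1, advantage=0.5)
print(agent_policy(7, 2))
```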
Detection of Pitt-Hopkins Syndrome based on morphological facial features
Title | Detection of Pitt-Hopkins Syndrome based on morphological facial features |
Authors | Elena D’Amato, Constantino Carlos Reyes-Aldasoro, Maria Felicia Faienza, Marcella Zollino |
Abstract | This work describes an automatic methodology to discriminate between individuals with the genetic disorder Pitt-Hopkins syndrome (PTHS) and healthy individuals. As input data, the methodology accepts unconstrained frontal facial photographs, from which faces are located with Histogram of Oriented Gradients (HOG) feature descriptors. Pre-processing steps of the methodology consist of colour normalisation, scaling down, rotation, and cropping in order to produce a series of face images with consistent dimensions. Sixty-eight facial landmarks are automatically located on each face through a cascade of regression functions learnt via gradient boosting to estimate the shape from an initial approximation. The intensities of a sparse set of pixels indexed relative to this initial estimate are used to determine the landmarks. A set of carefully selected geometric features, for example the relative width of the mouth or the angle of the nose, is extracted from the landmarks. The features are used to investigate the statistical differences between the two populations of PTHS and healthy controls. The methodology was tested on 71 individuals with PTHS and 55 healthy controls. Two geometric features related to the nose and mouth showed a statistical difference between the two populations. |
Tasks | |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08229v2 |
https://arxiv.org/pdf/2003.08229v2.pdf | |
PWC | https://paperswithcode.com/paper/detection-of-pitt-hopkins-syndrome-based-on |
Repo | |
Framework | |
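A minimal sketch of the landmark-based feature extraction and group comparison described above. Landmark indices follow the common 68-point iBUG annotation, and the two features (relative mouth width, nose angle) are plausible stand-ins for the paper's hand-crafted features rather than its exact definitions; the landmark arrays here are random placeholders.

```python
# Geometric features from 68 facial landmarks and a simple two-sample test.
import numpy as np
from scipy import stats

def relative_mouth_width(lm):
    mouth = np.linalg.norm(lm[54] - lm[48])   # outer mouth corners
    face = np.linalg.norm(lm[16] - lm[0])     # jaw extremes
    return mouth / face

def nose_angle_deg(lm):
    # Angle at the nose tip between the two nostril-edge landmarks.
    v1, v2 = lm[31] - lm[33], lm[35] - lm[33]
    cosang = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

def features(landmarks):                       # landmarks: (n_faces, 68, 2)
    return np.array([[relative_mouth_width(l), nose_angle_deg(l)] for l in landmarks])

# Toy landmark sets standing in for the PTHS and control groups.
rng = np.random.default_rng(0)
pths = rng.normal(100, 10, size=(71, 68, 2))
controls = rng.normal(100, 10, size=(55, 68, 2))

f_pths, f_ctrl = features(pths), features(controls)
for name, i in [("relative mouth width", 0), ("nose angle", 1)]:
    t, p = stats.ttest_ind(f_pths[:, i], f_ctrl[:, i], equal_var=False)
    print(f"{name}: t={t:.2f}, p={p:.3f}")
```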
Towards Evaluating Plan Generation Approaches with Instructional Texts
Title | Towards Evaluating Plan Generation Approaches with Instructional Texts |
Authors | Debajyoti Paul Chowdhury, Arghya Biswas, Tomasz Sosnowski, Kristina Yordanova |
Abstract | Recent research in behaviour understanding through language grounding has shown that it is possible to automatically generate behaviour models from textual instructions. These models usually have a goal-oriented structure and are modelled with different formalisms from the planning domain, such as the Planning Domain Definition Language. One major problem that still remains is that there are no benchmark datasets for comparing the different model generation approaches, as each approach is usually evaluated on a domain-specific application. To allow an objective comparison of different methods for model generation from textual instructions, in this report we introduce a dataset consisting of 83 textual instructions in English, their refinement in a more structured form, as well as manually developed plans for each of the instructions. The dataset is publicly available to the community. |
Tasks | |
Published | 2020-01-13 |
URL | https://arxiv.org/abs/2001.04186v1 |
https://arxiv.org/pdf/2001.04186v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-evaluating-plan-generation-approaches |
Repo | |
Framework | |
Foldover Features for Dynamic Object Behavior Description in Microscopic Videos
Title | Foldover Features for Dynamic Object Behavior Description in Microscopic Videos |
Authors | Xialin Li, Chen Li, Wenwei Zhao |
Abstract | Behavior description is conducive to the analysis of tiny objects, similar objects, objects with weak visual information and objects with similar visual information, and plays a fundamental role in the identification and classification of dynamic objects in microscopic videos. To this end, we propose foldover features to describe the behavior of dynamic objects. First, we generate a foldover for each object in the microscopic videos in the X, Y and Z directions, respectively. Then, we extract foldover features along the X, Y and Z directions with statistical methods. Finally, we use four different classifiers to test the effectiveness of the proposed foldover features. In the experiments, we evaluate the proposed foldover features on a sperm microscopic video dataset containing 1374 sperm cells of three types, and obtain a highest classification accuracy of 96.5%. |
Tasks | |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.08628v2 |
https://arxiv.org/pdf/2003.08628v2.pdf | |
PWC | https://paperswithcode.com/paper/foldover-features-for-dynamic-object-behavior |
Repo | |
Framework | |
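A heavily simplified sketch of the evaluation pipeline in this entry: per-object, per-axis signals are summarised with basic statistics and compared across four off-the-shelf classifiers. How the foldover representation itself is built from a microscopic video is not reproduced; the per-axis signals and labels below are stand-in assumptions.

```python
# Statistical descriptors per axis + comparison of four classifiers (scikit-learn).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_objects, n_frames = 300, 60

# Stand-in per-axis signals (e.g. an object's extent along X, Y, Z over time).
signals = rng.normal(size=(n_objects, 3, n_frames))
labels = rng.integers(0, 3, size=n_objects)            # three object types

def stats_features(sig):
    # Simple statistical descriptors per axis: mean, std, min, max, range.
    return np.concatenate([sig.mean(-1), sig.std(-1), sig.min(-1), sig.max(-1),
                           sig.max(-1) - sig.min(-1)], axis=-1)

X = np.array([stats_features(s) for s in signals])      # (n_objects, 15)

classifiers = {
    "SVM": SVC(),
    "RandomForest": RandomForestClassifier(random_state=0),
    "kNN": KNeighborsClassifier(),
    "LogReg": LogisticRegression(max_iter=1000),
}
for name, clf in classifiers.items():
    acc = cross_val_score(clf, X, labels, cv=5).mean()
    print(f"{name}: {acc:.3f}")
```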
MINA: Convex Mixed-Integer Programming for Non-Rigid Shape Alignment
Title | MINA: Convex Mixed-Integer Programming for Non-Rigid Shape Alignment |
Authors | Florian Bernard, Zeeshan Khan Suri, Christian Theobalt |
Abstract | We present a convex mixed-integer programming formulation for non-rigid shape matching. To this end, we propose a novel shape deformation model based on an efficient low-dimensional discrete model, so that finding a globally optimal solution is tractable in (most) practical cases. Our approach combines several favourable properties: it is independent of the initialisation, it is much more efficient to solve to global optimality compared to analogous quadratic assignment problem formulations, and it is highly flexible in terms of the variants of matching problems it can handle. Experimentally we demonstrate that our approach outperforms existing methods for sparse shape matching, that it can be used for initialising dense shape matching methods, and we showcase its flexibility on several examples. |
Tasks | |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12623v1 |
https://arxiv.org/pdf/2002.12623v1.pdf | |
PWC | https://paperswithcode.com/paper/mina-convex-mixed-integer-programming-for-non |
Repo | |
Framework | |
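To make the mixed-integer machinery concrete, the sketch below casts sparse correspondence as a generic assignment problem with binary matching variables and a linear cost, modelled with the PuLP library. This is only the assignment skeleton; the paper's low-dimensional deformation model and its convex relaxation are not reproduced here.

```python
# Binary assignment ILP for sparse point matching (PuLP + CBC).
import numpy as np
import pulp

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))           # sparse landmarks on shape 1
Y = rng.normal(size=(6, 3))           # sparse landmarks on shape 2
cost = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)   # matching costs

prob = pulp.LpProblem("sparse_matching", pulp.LpMinimize)
z = [[pulp.LpVariable(f"z_{i}_{j}", cat="Binary") for j in range(6)] for i in range(6)]

# Objective: total matching cost.
prob += pulp.lpSum(float(cost[i, j]) * z[i][j] for i in range(6) for j in range(6))
# Each point on either shape is matched exactly once.
for i in range(6):
    prob += pulp.lpSum(z[i][j] for j in range(6)) == 1
for j in range(6):
    prob += pulp.lpSum(z[i][j] for i in range(6)) == 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
matches = [(i, j) for i in range(6) for j in range(6) if pulp.value(z[i][j]) > 0.5]
print(matches)
```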
Omni-sourced Webly-supervised Learning for Video Recognition
Title | Omni-sourced Webly-supervised Learning for Video Recognition |
Authors | Haodong Duan, Yue Zhao, Yuanjun Xiong, Wentao Liu, Dahua Lin |
Abstract | We introduce OmniSource, a novel framework for leveraging web data to train video recognition models. OmniSource overcomes the barriers between data formats, such as images, short videos, and long untrimmed videos, for webly-supervised learning. First, data samples with multiple formats, curated by task-specific data collection and automatically filtered by a teacher model, are transformed into a unified form. Then a joint-training strategy is proposed to deal with the domain gaps between multiple data sources and formats in webly-supervised learning. Several good practices, including data balancing, resampling, and cross-dataset mixup, are adopted in joint training. Experiments show that by utilizing data from multiple sources and formats, OmniSource is more data-efficient in training. With only 3.5M images and 800K minutes of video crawled from the internet without human labeling (less than 2% of prior works), our models learned with OmniSource improve the Top-1 accuracy of 2D- and 3D-ConvNet baseline models by 3.0% and 3.9%, respectively, on the Kinetics-400 benchmark. With OmniSource, we establish new records with different pretraining strategies for video recognition. Our best models achieve 80.4%, 80.5%, and 83.6% Top-1 accuracy on the Kinetics-400 benchmark for training from scratch, ImageNet pre-training, and IG-65M pre-training, respectively. |
Tasks | Action Classification, Video Recognition |
Published | 2020-03-29 |
URL | https://arxiv.org/abs/2003.13042v1 |
https://arxiv.org/pdf/2003.13042v1.pdf | |
PWC | https://paperswithcode.com/paper/omni-sourced-webly-supervised-learning-for |
Repo | |
Framework | |
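A minimal sketch of the joint-training step described above: batches are drawn from a curated target set and a noisier, teacher-filtered web set, and their losses are combined with a balancing weight. The tiny linear classifier, random tensors, batch sizes, and the 0.5 web-loss weight are assumptions for illustration, not OmniSource's actual models or hyperparameters.

```python
# Joint training over two data sources with loss balancing (PyTorch).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from itertools import cycle

torch.manual_seed(0)
n_cls, feat = 10, 128
target_set = TensorDataset(torch.randn(256, feat), torch.randint(0, n_cls, (256,)))
web_set    = TensorDataset(torch.randn(1024, feat), torch.randint(0, n_cls, (1024,)))

target_loader = DataLoader(target_set, batch_size=16, shuffle=True)
web_loader    = DataLoader(web_set, batch_size=16, shuffle=True)

model = nn.Linear(feat, n_cls)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()
web_weight = 0.5            # down-weight the noisier web source (assumed value)

for (xt, yt), (xw, yw) in zip(target_loader, cycle(web_loader)):
    loss = criterion(model(xt), yt) + web_weight * criterion(model(xw), yw)
    opt.zero_grad()
    loss.backward()
    opt.step()
print("final joint loss:", loss.item())
```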
Do We Need Depth in State-Of-The-Art Face Authentication?
Title | Do We Need Depth in State-Of-The-Art Face Authentication? |
Authors | Amir Livne, Alex Bronstein, Ron Kimmel, Ziv Aviv, Shahaf Grofit |
Abstract | Some face recognition methods are designed to utilize geometric features extracted from depth sensors to handle the challenges of single-image based recognition technologies. However, calculating the geometrical data is an expensive and challenging process. Here, we introduce a novel method that learns distinctive geometric features from stereo camera systems without the need to explicitly compute the facial surface or depth map. The raw face stereo images along with coordinate maps allow a CNN to learn geometric features. This way, we keep the simplicity and cost efficiency of recognition from a single image, while enjoying the benefits of geometric data without explicitly reconstructing it. We demonstrate that the suggested method outperforms both existing single-image and explicit depth based methods on large-scale benchmarks. We also provide an ablation study to show that the suggested method uses the coordinate maps to encode more informative features. |
Tasks | Face Recognition |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.10895v1 |
https://arxiv.org/pdf/2003.10895v1.pdf | |
PWC | https://paperswithcode.com/paper/do-we-need-depth-in-state-of-the-art-face |
Repo | |
Framework | |
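A minimal sketch of the input construction suggested by this abstract: the stereo pair is concatenated with per-pixel normalized coordinate maps so a CNN can learn geometry-aware features without explicit depth reconstruction. The exact channel layout and the toy network are assumptions for illustration.

```python
# Stereo pair + coordinate maps as CNN input (PyTorch).
import torch
import torch.nn as nn

def stereo_with_coords(left, right):
    # left, right: (B, 1, H, W) grayscale views from the stereo rig.
    b, _, h, w = left.shape
    ys = torch.linspace(-1, 1, h).view(1, 1, h, 1).expand(b, 1, h, w)
    xs = torch.linspace(-1, 1, w).view(1, 1, 1, w).expand(b, 1, h, w)
    return torch.cat([left, right, xs, ys], dim=1)       # (B, 4, H, W)

net = nn.Sequential(
    nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 128),                                   # face embedding
)

left = torch.rand(2, 1, 112, 112)
right = torch.rand(2, 1, 112, 112)
print(net(stereo_with_coords(left, right)).shape)         # torch.Size([2, 128])
```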
On the Convergence of Stochastic Gradient Descent with Low-Rank Projections for Convex Low-Rank Matrix Problems
Title | On the Convergence of Stochastic Gradient Descent with Low-Rank Projections for Convex Low-Rank Matrix Problems |
Authors | Dan Garber |
Abstract | We revisit the use of Stochastic Gradient Descent (SGD) for solving convex optimization problems that serve as highly popular convex relaxations for many important low-rank matrix recovery problems such as matrix completion, phase retrieval, and more. The computational limitation of applying SGD to these relaxations at large scale is the need to compute a potentially high-rank singular value decomposition (SVD) on each iteration in order to enforce the low-rank-promoting constraint. We begin by considering a simple and natural sufficient condition under which these relaxations indeed admit low-rank solutions. This condition is also necessary for a certain notion of low-rank-robustness to hold. Our main result shows that under this condition, which involves the eigenvalues of the gradient vector at optimal points, SGD with mini-batches, when initialized with a “warm-start” point, produces iterates that are low-rank with high probability, and hence only a low-rank SVD computation is required on each iteration. This suggests that SGD may indeed be practically applicable to solving large-scale convex relaxations of low-rank matrix recovery problems. Our theoretical results are accompanied by supporting preliminary empirical evidence. As a side benefit, our analysis is quite simple and short. |
Tasks | Matrix Completion |
Published | 2020-01-31 |
URL | https://arxiv.org/abs/2001.11668v1 |
https://arxiv.org/pdf/2001.11668v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-convergence-of-stochastic-gradient-3 |
Repo | |
Framework | |
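A minimal sketch of SGD with a low-rank-friendly projection for a convex matrix problem, here matrix completion over a nuclear-norm ball: each projection is an SVD followed by projecting the singular values onto an l1-ball. Step size, batch size, and the ball radius are illustrative choices; the paper's warm-start analysis and rank guarantees are not reproduced.

```python
# Projected mini-batch SGD for matrix completion over a nuclear-norm ball (numpy).
import numpy as np

rng = np.random.default_rng(0)
n, r_true = 50, 3
M = rng.normal(size=(n, r_true)) @ rng.normal(size=(r_true, n))   # ground truth
mask = rng.random((n, n)) < 0.3                                    # observed entries
obs = np.argwhere(mask)

def project_l1(v, tau):
    # Euclidean projection of a nonnegative vector onto the l1-ball of radius tau.
    if v.sum() <= tau:
        return v
    u = np.sort(v)[::-1]
    cssv = np.cumsum(u) - tau
    rho = np.nonzero(u > cssv / (np.arange(len(u)) + 1))[0][-1]
    return np.maximum(v - cssv[rho] / (rho + 1), 0)

def project_nuclear_ball(X, tau):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(project_l1(s, tau)) @ Vt

tau = np.linalg.svd(M, compute_uv=False).sum()      # radius of the feasible set
X = np.zeros((n, n))
for t in range(500):
    batch = obs[rng.choice(len(obs), size=64)]
    i, j = batch[:, 0], batch[:, 1]
    grad = np.zeros_like(X)
    grad[i, j] = X[i, j] - M[i, j]                   # stochastic gradient of the squared loss
    X = project_nuclear_ball(X - 0.5 * grad, tau)

print("relative error:", np.linalg.norm(X - M) / np.linalg.norm(M))
```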
A Generative Learning Approach for Spatio-temporal Modeling in Connected Vehicular Network
Title | A Generative Learning Approach for Spatio-temporal Modeling in Connected Vehicular Network |
Authors | Rong Xia, Yong Xiao, Yingyu Li, Marwan Krunz, Dusit Niyato |
Abstract | Spatio-temporal modeling of wireless access latency is of great importance for connected-vehicular systems. The quality of the modeled results relies heavily on the number and quality of samples, which can vary significantly with sensor deployment density as well as traffic volume and density. This paper proposes LaMI (Latency Model Inpainting), a novel framework to generate a comprehensive spatio-temporal model of the wireless access latency of connected vehicles across a wide geographical area. LaMI adopts ideas from image inpainting and synthesis and reconstructs the missing latency samples in a two-step procedure. In particular, it first discovers the spatial correlation between samples collected in various regions using a patching-based approach and then feeds the original and highly correlated samples into a Variational Autoencoder (VAE), a deep generative model, to create latency samples whose probability distribution is similar to that of the original samples. Finally, LaMI establishes the empirical PDF of latency performance and maps the PDFs into the confidence levels of different vehicular service requirements. Extensive performance evaluation has been conducted using real traces collected from a commercial LTE network on a university campus. Simulation results show that our proposed model can significantly improve the accuracy of latency modeling, especially compared to popular existing solutions such as interpolation and nearest-neighbor-based methods. |
Tasks | Image Inpainting |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.07004v1 |
https://arxiv.org/pdf/2003.07004v1.pdf | |
PWC | https://paperswithcode.com/paper/a-generative-learning-approach-for-spatio |
Repo | |
Framework | |
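A minimal sketch of the generative step in LaMI: a small VAE is fit to latency samples and then used to draw synthetic samples, from which an empirical probability of meeting a latency requirement can be read off. The patching-based spatial-correlation step and the real LTE traces are not reproduced; the data, network sizes, and the 30 ms requirement below are synthetic assumptions.

```python
# Small VAE fit to (synthetic) latency samples, then sampled to estimate a PDF (PyTorch).
import torch
import torch.nn as nn

torch.manual_seed(0)
d = 16                                          # latency samples per region (assumed)
data = 20 + 5 * torch.randn(2048, d).abs()      # synthetic latency measurements (ms)
mu_d, sd_d = data.mean(), data.std()
x = (data - mu_d) / sd_d                        # standardize before fitting

class VAE(nn.Module):
    def __init__(self, d, z=4):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 2 * z))
        self.dec = nn.Sequential(nn.Linear(z, 64), nn.ReLU(), nn.Linear(64, d))
        self.z = z
    def forward(self, xb):
        mu, logvar = self.enc(xb).chunk(2, dim=-1)
        zs = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return self.dec(zs), mu, logvar

model = VAE(d)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(500):
    recon, mu, logvar = model(x)
    rec_loss = ((recon - x) ** 2).mean()
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
    loss = rec_loss + 0.01 * kl
    opt.zero_grad(); loss.backward(); opt.step()

# Draw synthetic latency samples and estimate P(latency <= 30 ms).
with torch.no_grad():
    samples = model.dec(torch.randn(4096, model.z)) * sd_d + mu_d
print("estimated P(latency <= 30 ms):", (samples <= 30).float().mean().item())
```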
Reinforcement-Learning based Portfolio Management with Augmented Asset Movement Prediction States
Title | Reinforcement-Learning based Portfolio Management with Augmented Asset Movement Prediction States |
Authors | Yunan Ye, Hengzhi Pei, Boxin Wang, Pin-Yu Chen, Yada Zhu, Jun Xiao, Bo Li |
Abstract | Portfolio management (PM) is a fundamental financial planning task that aims to achieve investment goals such as maximal profits or minimal risks. Its decision process involves continuous derivation of valuable information from various data sources and sequential decision optimization, which is a prospective research direction for reinforcement learning (RL). In this paper, we propose SARL, a novel State-Augmented RL framework for PM. Our framework aims to address two unique challenges in financial PM: (1) data heterogeneity – the collected information for each asset is usually diverse, noisy and imbalanced (e.g., news articles); and (2) environment uncertainty – the financial market is versatile and non-stationary. To incorporate heterogeneous data and enhance robustness against environment uncertainty, our SARL augments the asset information with their price movement prediction as additional states, where the prediction can be solely based on financial data (e.g., asset prices) or derived from alternative sources such as news. Experiments on two real-world datasets, (i) Bitcoin market and (ii) HighTech stock market with 7-year Reuters news articles, validate the effectiveness of SARL over existing PM approaches, both in terms of accumulated profits and risk-adjusted profits. Moreover, extensive simulations are conducted to demonstrate the importance of our proposed state augmentation, providing new insights and boosting performance significantly over standard RL-based PM method and other baselines. |
Tasks | |
Published | 2020-02-09 |
URL | https://arxiv.org/abs/2002.05780v1 |
https://arxiv.org/pdf/2002.05780v1.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-based-portfolio |
Repo | |
Framework | |
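A minimal sketch of the state-augmentation idea in SARL: the RL observation is a normalized window of recent prices concatenated with a per-asset movement prediction. The naive momentum-sign predictor below is a placeholder for the paper's financial or news-based predictors.

```python
# Augmenting the portfolio-management state with movement predictions (numpy).
import numpy as np

rng = np.random.default_rng(0)
n_assets, window = 4, 10
prices = np.cumprod(1 + 0.01 * rng.normal(size=(500, n_assets)), axis=0)

def movement_prediction(price_window):
    # Placeholder predictor: +1 if the asset rose over the window, else -1.
    return np.sign(price_window[-1] - price_window[0])

def augmented_state(t):
    window_prices = prices[t - window:t]                  # (window, n_assets)
    relative = window_prices / window_prices[-1]          # normalized price tensor
    pred = movement_prediction(window_prices)             # (n_assets,)
    return np.concatenate([relative.ravel(), pred])       # state fed to the RL policy

s = augmented_state(100)
print(s.shape)     # (window * n_assets + n_assets,) = (44,)
```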
Do all Roads Lead to Rome? Understanding the Role of Initialization in Iterative Back-Translation
Title | Do all Roads Lead to Rome? Understanding the Role of Initialization in Iterative Back-Translation |
Authors | Mikel Artetxe, Gorka Labaka, Noe Casas, Eneko Agirre |
Abstract | Back-translation provides a simple yet effective approach to exploit monolingual corpora in Neural Machine Translation (NMT). Its iterative variant, where two opposite NMT models are jointly trained by alternately using a synthetic parallel corpus generated by the reverse model, plays a central role in unsupervised machine translation. In order to start producing sound translations and provide a meaningful training signal to each other, existing approaches rely on either a separate machine translation system to warm up the iterative procedure, or some form of pre-training to initialize the weights of the model. In this paper, we analyze the role that such initialization plays in iterative back-translation. Is the behavior of the final system heavily dependent on it? Or does iterative back-translation converge to a similar solution given any reasonable initialization? Through a series of empirical experiments over a diverse set of warmup systems, we show that, although the quality of the initial system does affect final performance, its effect is relatively small, as iterative back-translation has a strong tendency to converge to a similar solution. As such, the margin of improvement left for the initialization method is narrow, suggesting that future research should focus more on improving the iterative mechanism itself. |
Tasks | Machine Translation, Unsupervised Machine Translation |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12867v1 |
https://arxiv.org/pdf/2002.12867v1.pdf | |
PWC | https://paperswithcode.com/paper/do-all-roads-lead-to-rome-understanding-the |
Repo | |
Framework | |
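A minimal sketch of the iterative back-translation loop analyzed in this paper. The "models" are trivial token-level dictionaries trained by co-occurrence counting on position-aligned pairs, a deliberately tiny stand-in for real NMT systems, so only the alternation structure and the role of a (noisy) warm-up dictionary are illustrated, not translation quality.

```python
# Iterative back-translation loop with toy token-level "translation models" (numpy).
import numpy as np

rng = np.random.default_rng(0)
V, n_sents, sent_len = 20, 500, 8
true_map = rng.permutation(V)                         # ground-truth src->tgt dictionary
inv_map = np.argsort(true_map)                        # ground-truth tgt->src dictionary

mono_src = rng.integers(0, V, size=(n_sents, sent_len))             # source monolingual data
mono_tgt = true_map[rng.integers(0, V, size=(n_sents, sent_len))]   # target monolingual data

def train(parallel_src, parallel_tgt):
    # "Train" one direction: map each source token to its most co-occurring target token.
    counts = np.zeros((V, V))
    np.add.at(counts, (parallel_src.ravel(), parallel_tgt.ravel()), 1)
    return counts.argmax(axis=1)

def translate(model, sents):
    return model[sents]

# Warm-up systems (the initialization whose role the paper studies): noisy dictionaries.
src2tgt = np.where(rng.random(V) < 0.5, true_map, rng.integers(0, V, size=V))
tgt2src = np.where(rng.random(V) < 0.5, inv_map, rng.integers(0, V, size=V))

for it in range(5):
    # Back-translate target monolingual data, retrain the src->tgt direction on it.
    src2tgt = train(translate(tgt2src, mono_tgt), mono_tgt)
    # Back-translate source monolingual data, retrain the tgt->src direction on it.
    tgt2src = train(translate(src2tgt, mono_src), mono_src)
    # With these trivial models accuracy need not improve; the two directions mainly
    # become consistent with each other, which is all this sketch illustrates.
    acc = (src2tgt == true_map).mean()
    roundtrip = (tgt2src[src2tgt] == np.arange(V)).mean()
    print(f"iteration {it}: dictionary accuracy {acc:.2f}, round-trip consistency {roundtrip:.2f}")
```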
Learning Constraints from Locally-Optimal Demonstrations under Cost Function Uncertainty
Title | Learning Constraints from Locally-Optimal Demonstrations under Cost Function Uncertainty |
Authors | Glen Chou, Necmiye Ozay, Dmitry Berenson |
Abstract | We present an algorithm for learning parametric constraints from locally-optimal demonstrations, where the cost function being optimized is uncertain to the learner. Our method uses the Karush-Kuhn-Tucker (KKT) optimality conditions of the demonstrations within a mixed integer linear program (MILP) to learn constraints which are consistent with the local optimality of the demonstrations, by either using a known constraint parameterization or by incrementally growing a parameterization that is consistent with the demonstrations. We provide theoretical guarantees on the conservativeness of the recovered safe/unsafe sets and analyze the limits of constraint learnability when using locally-optimal demonstrations. We evaluate our method on high-dimensional constraints and systems by learning constraints for 7-DOF arm and quadrotor examples, show that it outperforms competing constraint-learning approaches, and can be effectively used to plan new constraint-satisfying trajectories in the environment. |
Tasks | |
Published | 2020-01-25 |
URL | https://arxiv.org/abs/2001.09336v1 |
https://arxiv.org/pdf/2001.09336v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-constraints-from-locally-optimal |
Repo | |
Framework | |
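A minimal sketch of embedding the KKT conditions of a demonstration in a MILP to recover an unknown constraint parameter, on a 1-D toy problem (cost x^2, unknown constraint x >= theta, locally optimal demonstration x* = 1) modelled with the PuLP library. The big-M encoding of complementary slackness is the standard trick; the paper's constraint parameterizations, incremental growth, and guarantees are not reproduced.

```python
# KKT conditions of a demonstration encoded as a MILP (PuLP + CBC).
import pulp

x_star = 1.0            # locally optimal demonstration
grad_cost = 2 * x_star  # d/dx of x^2 at the demonstration
M = 100.0               # big-M constant (assumed large enough)

prob = pulp.LpProblem("kkt_constraint_recovery", pulp.LpMaximize)
theta = pulp.LpVariable("theta", lowBound=-M, upBound=M)      # unknown constraint parameter
lam = pulp.LpVariable("lambda", lowBound=0)                   # KKT multiplier
z = pulp.LpVariable("z", cat="Binary")                        # "constraint active" indicator

prob += theta                                # recover the tightest consistent constraint
# Stationarity: grad_cost + lam * d/dx(theta - x) = 0  ->  2*x* - lam = 0
prob += grad_cost - lam == 0
# Primal feasibility of the demonstration: theta - x* <= 0
prob += theta - x_star <= 0
# Complementary slackness via big-M: lam > 0 forces the constraint to be active.
prob += lam <= M * z
prob += x_star - theta <= M * (1 - z)

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("recovered theta:", pulp.value(theta))   # 1.0: the demonstration sits on the constraint
```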
Echo State Neural Machine Translation
Title | Echo State Neural Machine Translation |
Authors | Ankush Garg, Yuan Cao, Qi Ge |
Abstract | We present neural machine translation (NMT) models inspired by echo state networks (ESN), named Echo State NMT (ESNMT), in which the encoder and decoder layer weights are randomly generated and then fixed throughout training. We show that even with this extremely simple model construction and training procedure, ESNMT can already reach 70-80% of the quality of fully trainable baselines. We examine how the spectral radius of the reservoir, a key quantity that characterizes the model, determines model behavior. Our findings indicate that randomized networks can work well even for complicated sequence-to-sequence prediction NLP tasks. |
Tasks | Machine Translation |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.11847v1 |
https://arxiv.org/pdf/2002.11847v1.pdf | |
PWC | https://paperswithcode.com/paper/echo-state-neural-machine-translation |
Repo | |
Framework | |
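A minimal sketch of the echo-state idea applied to a seq2seq model: the recurrent encoder/decoder weights are randomly initialized and frozen, so only the embeddings and the output projection receive gradients. The model sizes and the 0.9 spectral-radius rescaling of each recurrent gate block are illustrative choices, not the paper's exact setup.

```python
# Echo-state seq2seq: frozen random encoder/decoder, trainable embeddings and output (PyTorch).
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab, emb, hid = 1000, 64, 128

class ESNMT(nn.Module):
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(vocab, emb)
        self.tgt_emb = nn.Embedding(vocab, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)
        for rnn in (self.encoder, self.decoder):
            for name, p in rnn.named_parameters():
                if "weight_hh" in name:                  # recurrent (reservoir) weights
                    # Rescale each gate's square recurrent block to a target
                    # spectral radius, a standard echo-state practice.
                    for block in p.data.chunk(3, dim=0):
                        radius = torch.linalg.eigvals(block).abs().max()
                        block.mul_(0.9 / radius)
                p.requires_grad_(False)                  # freeze encoder/decoder weights

    def forward(self, src, tgt):
        _, h = self.encoder(self.src_emb(src))
        dec_out, _ = self.decoder(self.tgt_emb(tgt), h)
        return self.out(dec_out)

model = ESNMT()
trainable = [p for p in model.parameters() if p.requires_grad]
print("trainable tensors:", len(trainable))              # embeddings + output layer only
logits = model(torch.randint(0, vocab, (4, 12)), torch.randint(0, vocab, (4, 10)))
print(logits.shape)                                      # torch.Size([4, 10, 1000])
```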