Paper Group ANR 408
An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices. Backdooring and Poisoning Neural Networks with Image-Scaling Attacks. Cluster-Based Social Reinforcement Learning. Detection of Pitt-Hopkins Syndrome based on morphological facial features. Towards Evaluating Plan Generation Approaches with Instructional Texts. Fol …
An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices
Title | An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices |
Authors | Xiaolong Ma, Wei Niu, Tianyun Zhang, Sijia Liu, Sheng Lin, Hongjia Li, Xiang Chen, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang |
Abstract | Weight pruning has been widely acknowledged as a straightforward and effective method to eliminate redundancy in Deep Neural Networks (DNN), thereby achieving acceleration on various platforms. However, most pruning techniques are essentially trade-offs between model accuracy and regularity, which lead to impaired inference accuracy and limited on-device acceleration performance. To solve this problem, we introduce a new sparsity dimension, namely pattern-based sparsity, which comprises pattern and connectivity sparsity and is both highly accurate and hardware friendly. With carefully designed patterns, the proposed pruning consistently improves accuracy and feature extraction ability on different DNN structures and datasets, and our pattern-aware pruning framework performs pattern library extraction, pattern selection, pattern and connectivity pruning, and weight training simultaneously. The new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms. To the best of our knowledge, this is the first time mobile devices achieve real-time inference for large-scale DNN models, thanks to the unique spatial property of pattern-based sparsity and the code generation capability of compilers. |
Tasks | Code Generation |
Published | 2020-01-20 |
URL | https://arxiv.org/abs/2001.07710v2 |
https://arxiv.org/pdf/2001.07710v2.pdf | |
PWC | https://paperswithcode.com/paper/an-image-enhancing-pattern-based-sparsity-for |
Repo | |
Framework | |
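The pattern-plus-connectivity decomposition described in this abstract can be illustrated with a short sketch: each 3x3 kernel is masked with the pattern from a small library that preserves the most weight magnitude, and whole kernels with small L1 norm are then removed. The four-entry pattern library, the magnitude-based selection rule, and the 75% keep ratio below are illustrative assumptions, not the paper's learned patterns or its joint training framework.

```python
# Pattern + connectivity pruning sketch on a conv weight tensor (numpy only).
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(16, 8, 3, 3))   # (out_ch, in_ch, 3, 3) conv layer

# A tiny hypothetical pattern library: each pattern keeps 4 of the 9 positions.
patterns = np.array([
    [[0, 1, 0], [1, 1, 1], [0, 0, 0]],
    [[0, 0, 0], [1, 1, 1], [0, 1, 0]],
    [[0, 1, 0], [1, 1, 0], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 1], [0, 1, 0]],
], dtype=float)

def pattern_prune(w, patterns):
    # Pattern pruning: for every kernel, pick the pattern that preserves the most
    # weight magnitude and zero out the remaining positions.
    out = np.empty_like(w)
    for o in range(w.shape[0]):
        for i in range(w.shape[1]):
            k = w[o, i]
            scores = [(np.abs(k) * p).sum() for p in patterns]
            out[o, i] = k * patterns[int(np.argmax(scores))]
    return out

def connectivity_prune(w, keep_ratio=0.75):
    # Connectivity pruning: drop whole kernels with the smallest L1 norm.
    norms = np.abs(w).sum(axis=(2, 3))                  # (out_ch, in_ch)
    thresh = np.quantile(norms, 1.0 - keep_ratio)
    mask = (norms > thresh).astype(w.dtype)[:, :, None, None]
    return w * mask

pruned = connectivity_prune(pattern_prune(weights, patterns))
print("sparsity:", 1.0 - np.count_nonzero(pruned) / pruned.size)
```

In the paper's setting the surviving pattern/connectivity structure is what the compiler exploits for code generation; the sketch only shows how such a sparse weight tensor could be produced.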
Backdooring and Poisoning Neural Networks with Image-Scaling Attacks
Title | Backdooring and Poisoning Neural Networks with Image-Scaling Attacks |
Authors | Erwin Quiring, Konrad Rieck |
Abstract | Backdoors and poisoning attacks are a major threat to the security of machine-learning and vision systems. Often, however, these attacks leave visible artifacts in the images that can be visually detected and weaken the efficacy of the attacks. In this paper, we propose a novel strategy for hiding backdoor and poisoning attacks. Our approach builds on a recent class of attacks against image scaling. These attacks enable manipulating images such that they change their content when scaled to a specific resolution. By combining poisoning and image-scaling attacks, we can conceal the trigger of backdoors as well as hide the overlays of clean-label poisoning. Furthermore, we consider the detection of image-scaling attacks and derive an adaptive attack. In an empirical evaluation, we demonstrate the effectiveness of our strategy. First, we show that backdoors and poisoning work equally well when combined with image-scaling attacks. Second, we demonstrate that current detection defenses against image-scaling attacks are insufficient to uncover our manipulations. Overall, our work provides a novel means for hiding traces of manipulation that is applicable to different poisoning approaches. |
Tasks | |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.08633v1 |
https://arxiv.org/pdf/2003.08633v1.pdf | |
PWC | https://paperswithcode.com/paper/backdooring-and-poisoning-neural-networks |
Repo | |
Framework | |
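The image-scaling attacks that this paper builds on can be sketched in a few lines: only the pixels that a nearest-neighbor downscaler actually samples are overwritten, so the full-resolution image is barely changed while its downscaled version becomes an attacker-chosen payload. The hand-rolled nearest-neighbor resize below is an assumption standing in for whatever scaler a real preprocessing pipeline uses; real attacks also target smoother interpolation kernels by solving a small optimization problem.

```python
# Nearest-neighbor image-scaling attack sketch (numpy only).
import numpy as np

def nn_indices(n_src, n_dst):
    # Sampling positions of a simple nearest-neighbor downscale.
    return np.floor((np.arange(n_dst) + 0.5) * n_src / n_dst).astype(int)

def nn_downscale(img, h_dst, w_dst):
    rows = nn_indices(img.shape[0], h_dst)
    cols = nn_indices(img.shape[1], w_dst)
    return img[np.ix_(rows, cols)]

rng = np.random.default_rng(0)
source = rng.integers(0, 256, size=(512, 512), dtype=np.uint8)  # benign image
target = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)    # hidden payload

# Craft the attack image: copy the source, then overwrite only the sampled pixels.
attack = source.copy()
rows = nn_indices(512, 64)
cols = nn_indices(512, 64)
attack[np.ix_(rows, cols)] = target

changed = np.mean(attack != source)
print(f"pixels modified: {changed:.2%}")                  # only a small fraction
print("downscale equals payload:",
      np.array_equal(nn_downscale(attack, 64, 64), target))
```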
Cluster-Based Social Reinforcement Learning
Title | Cluster-Based Social Reinforcement Learning |
Authors | Mahak Goindani, Jennifer Neville |
Abstract | Social Reinforcement Learning methods, which model agents in large networks, are useful for fake news mitigation, personalized teaching/healthcare, and viral marketing, but it is challenging to incorporate inter-agent dependencies into the models effectively due to network size and sparse interaction data. Previous social RL approaches either ignore agent dependencies or model them in a computationally intensive manner. In this work, we incorporate agent dependencies efficiently in a compact model by clustering users (based on their payoff and contribution to the goal) and combining this with a method to easily derive personalized agent-level policies from cluster-level policies. We also propose a dynamic clustering approach that captures changing user behavior. Experiments on real-world datasets illustrate that our proposed approach learns more accurate policy estimates and converges more quickly, compared to several baselines that do not use agent correlations or only use static clusters. |
Tasks | |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00627v2 |
https://arxiv.org/pdf/2003.00627v2.pdf | |
PWC | https://paperswithcode.com/paper/cluster-based-social-reinforcement-learning |
Repo | |
Framework | |
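A minimal sketch of the clustering idea in this entry: agents are grouped by payoff/contribution features, one small policy is kept per cluster, and each agent's policy is read off from its cluster, so experience from similar agents is pooled. The feature definition, policy parameterization, and update rule below are illustrative assumptions, not the paper's formulation.

```python
# Cluster-level policies shared across agents (numpy + scikit-learn).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_agents, n_states, n_actions, n_clusters = 1000, 5, 3, 4

# Per-agent features: (average payoff, contribution to the global objective).
features = rng.normal(size=(n_agents, 2))
cluster_of = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(features)

# One tabular softmax policy per cluster instead of one per agent.
cluster_logits = np.zeros((n_clusters, n_states, n_actions))

def agent_policy(agent_id, state):
    logits = cluster_logits[cluster_of[agent_id], state]
    p = np.exp(logits - logits.max())
    return p / p.sum()

def update(agent_id, state, action, advantage, lr=0.1):
    # Toy policy-gradient-style update applied at the cluster level: every agent's
    # experience updates its cluster's logits, pooling data from similar agents.
    c = cluster_of[agent_id]
    grad = -agent_policy(agent_id, state)
    grad[action] += 1.0
    cluster_logits[c, state] += lr * advantage * grad

update(agent_id=7, state=2, action=1, advantage=0.5)
print(agent_policy(7, 2))
```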
Detection of Pitt-Hopkins Syndrome based on morphological facial features
Title | Detection of Pitt-Hopkins Syndrome based on morphological facial features |
Authors | Elena D’Amato, Constantino Carlos Reyes-Aldasoro, Maria Felicia Faienza, Marcella Zollino |
Abstract | This work describes an automatic methodology to discriminate between individuals with the genetic disorder Pitt-Hopkins syndrome (PTHS) and healthy individuals. As input data, the methodology accepts unconstrained frontal facial photographs, from which faces are located with Histogram of Oriented Gradients (HOG) feature descriptors. Pre-processing steps of the methodology consist of colour normalisation, scaling down, rotation, and cropping in order to produce a series of face images with consistent dimensions. Sixty-eight facial landmarks are automatically located on each face through a cascade of regression functions learnt via gradient boosting to estimate the shape from an initial approximation. The intensities of a sparse set of pixels indexed relative to this initial estimate are used to determine the landmarks. A set of carefully selected geometric features, for example the relative width of the mouth or the angle of the nose, is extracted from the landmarks. The features are used to investigate the statistical differences between the two populations of PTHS and healthy controls. The methodology was tested on 71 individuals with PTHS and 55 healthy controls. Two geometric features related to the nose and mouth showed a statistical difference between the two populations. |
Tasks | |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08229v2 |
https://arxiv.org/pdf/2003.08229v2.pdf | |
PWC | https://paperswithcode.com/paper/detection-of-pitt-hopkins-syndrome-based-on |
Repo | |
Framework | |
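A minimal sketch of the landmark-based feature extraction and group comparison described above. Landmark indices follow the common 68-point iBUG annotation, and the two features (relative mouth width, nose angle) are plausible stand-ins for the paper's hand-crafted features rather than its exact definitions; the landmark arrays here are random placeholders.

```python
# Geometric features from 68 facial landmarks and a simple two-sample test.
import numpy as np
from scipy import stats

def relative_mouth_width(lm):
    mouth = np.linalg.norm(lm[54] - lm[48])   # outer mouth corners
    face = np.linalg.norm(lm[16] - lm[0])     # jaw extremes
    return mouth / face

def nose_angle_deg(lm):
    # Angle at the nose tip between the two nostril-edge landmarks.
    v1, v2 = lm[31] - lm[33], lm[35] - lm[33]
    cosang = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

def features(landmarks):                       # landmarks: (n_faces, 68, 2)
    return np.array([[relative_mouth_width(l), nose_angle_deg(l)] for l in landmarks])

# Toy landmark sets standing in for the PTHS and control groups.
rng = np.random.default_rng(0)
pths = rng.normal(100, 10, size=(71, 68, 2))
controls = rng.normal(100, 10, size=(55, 68, 2))

f_pths, f_ctrl = features(pths), features(controls)
for name, i in [("relative mouth width", 0), ("nose angle", 1)]:
    t, p = stats.ttest_ind(f_pths[:, i], f_ctrl[:, i], equal_var=False)
    print(f"{name}: t={t:.2f}, p={p:.3f}")
```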
Towards Evaluating Plan Generation Approaches with Instructional Texts
Title | Towards Evaluating Plan Generation Approaches with Instructional Texts |
Authors | Debajyoti Paul Chowdhury, Arghya Biswas, Tomasz Sosnowski, Kristina Yordanova |
Abstract | Recent research in behaviour understanding through language grounding has shown that it is possible to automatically generate behaviour models from textual instructions. These models usually have a goal-oriented structure and are modelled with different formalisms from the planning domain, such as the Planning Domain Definition Language. One major problem that still remains is that there are no benchmark datasets for comparing the different model generation approaches, as each approach is usually evaluated on a domain-specific application. To allow an objective comparison of different methods for model generation from textual instructions, in this report we introduce a dataset consisting of 83 textual instructions in English, their refinement in a more structured form, as well as manually developed plans for each of the instructions. The dataset is publicly available to the community. |
Tasks | |
Published | 2020-01-13 |
URL | https://arxiv.org/abs/2001.04186v1 |
https://arxiv.org/pdf/2001.04186v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-evaluating-plan-generation-approaches |
Repo | |
Framework | |
Foldover Features for Dynamic Object Behavior Description in Microscopic Videos
Title | Foldover Features for Dynamic Object Behavior Description in Microscopic Videos |
Authors | Xialin Li, Chen Li, Wenwei Zhao |
Abstract | Behavior description is conducive to the analysis of tiny objects, similar objects, objects with weak visual information and objects with similar visual information, and plays a fundamental role in the identification and classification of dynamic objects in microscopic videos. To this end, we propose foldover features to describe the behavior of dynamic objects. First, we generate a foldover for each object in the microscopic videos in the X, Y and Z directions, respectively. Then, we extract foldover features along the X, Y and Z directions with statistical methods. Finally, we use four different classifiers to test the effectiveness of the proposed foldover features. In the experiments, we evaluate the proposed foldover features on a sperm microscopic video dataset containing 1374 sperm cells of three types, and obtain a highest classification accuracy of 96.5%. |
Tasks | |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.08628v2 |
https://arxiv.org/pdf/2003.08628v2.pdf | |
PWC | https://paperswithcode.com/paper/foldover-features-for-dynamic-object-behavior |
Repo | |
Framework | |
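A heavily simplified sketch of the evaluation pipeline in this entry: per-object, per-axis signals are summarised with basic statistics and compared across four off-the-shelf classifiers. How the foldover representation itself is built from a microscopic video is not reproduced; the per-axis signals and labels below are stand-in assumptions.

```python
# Statistical descriptors per axis + comparison of four classifiers (scikit-learn).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_objects, n_frames = 300, 60

# Stand-in per-axis signals (e.g. an object's extent along X, Y, Z over time).
signals = rng.normal(size=(n_objects, 3, n_frames))
labels = rng.integers(0, 3, size=n_objects)            # three object types

def stats_features(sig):
    # Simple statistical descriptors per axis: mean, std, min, max, range.
    return np.concatenate([sig.mean(-1), sig.std(-1), sig.min(-1), sig.max(-1),
                           sig.max(-1) - sig.min(-1)], axis=-1)

X = np.array([stats_features(s) for s in signals])      # (n_objects, 15)

classifiers = {
    "SVM": SVC(),
    "RandomForest": RandomForestClassifier(random_state=0),
    "kNN": KNeighborsClassifier(),
    "LogReg": LogisticRegression(max_iter=1000),
}
for name, clf in classifiers.items():
    acc = cross_val_score(clf, X, labels, cv=5).mean()
    print(f"{name}: {acc:.3f}")
```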
MINA: Convex Mixed-Integer Programming for Non-Rigid Shape Alignment
Title | MINA: Convex Mixed-Integer Programming for Non-Rigid Shape Alignment |
Authors | Florian Bernard, Zeeshan Khan Suri, Christian Theobalt |
Abstract | We present a convex mixed-integer programming formulation for non-rigid shape matching. To this end, we propose a novel shape deformation model based on an efficient low-dimensional discrete model, so that finding a globally optimal solution is tractable in (most) practical cases. Our approach combines several favourable properties: it is independent of the initialisation, it is much more efficient to solve to global optimality compared to analogous quadratic assignment problem formulations, and it is highly flexible in terms of the variants of matching problems it can handle. Experimentally we demonstrate that our approach outperforms existing methods for sparse shape matching, that it can be used for initialising dense shape matching methods, and we showcase its flexibility on several examples. |
Tasks | |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12623v1 |
https://arxiv.org/pdf/2002.12623v1.pdf | |
PWC | https://paperswithcode.com/paper/mina-convex-mixed-integer-programming-for-non |
Repo | |
Framework | |
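To make the mixed-integer machinery concrete, the sketch below casts sparse correspondence as a generic assignment problem with binary matching variables and a linear cost, modelled with the PuLP library. This is only the assignment skeleton; the paper's low-dimensional deformation model and its convex relaxation are not reproduced here.

```python
# Binary assignment ILP for sparse point matching (PuLP + CBC).
import numpy as np
import pulp

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))           # sparse landmarks on shape 1
Y = rng.normal(size=(6, 3))           # sparse landmarks on shape 2
cost = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)   # matching costs

prob = pulp.LpProblem("sparse_matching", pulp.LpMinimize)
z = [[pulp.LpVariable(f"z_{i}_{j}", cat="Binary") for j in range(6)] for i in range(6)]

# Objective: total matching cost.
prob += pulp.lpSum(float(cost[i, j]) * z[i][j] for i in range(6) for j in range(6))
# Each point on either shape is matched exactly once.
for i in range(6):
    prob += pulp.lpSum(z[i][j] for j in range(6)) == 1
for j in range(6):
    prob += pulp.lpSum(z[i][j] for i in range(6)) == 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
matches = [(i, j) for i in range(6) for j in range(6) if pulp.value(z[i][j]) > 0.5]
print(matches)
```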
Omni-sourced Webly-supervised Learning for Video Recognition
Title | Omni-sourced Webly-supervised Learning for Video Recognition |
Authors | Haodong Duan, Yue Zhao, Yuanjun Xiong, Wentao Liu, Dahua Lin |
Abstract | We introduce OmniSource, a novel framework for leveraging web data to train video recognition models. OmniSource overcomes the barriers between data formats, such as images, short videos, and long untrimmed videos, for webly-supervised learning. First, data samples with multiple formats, curated by task-specific data collection and automatically filtered by a teacher model, are transformed into a unified form. Then a joint-training strategy is proposed to deal with the domain gaps between multiple data sources and formats in webly-supervised learning. Several good practices, including data balancing, resampling, and cross-dataset mixup, are adopted in joint training. Experiments show that by utilizing data from multiple sources and formats, OmniSource is more data-efficient in training. With only 3.5M images and 800K minutes of video crawled from the internet without human labeling (less than 2% of prior works), our models learned with OmniSource improve the Top-1 accuracy of 2D- and 3D-ConvNet baseline models by 3.0% and 3.9%, respectively, on the Kinetics-400 benchmark. With OmniSource, we establish new records with different pretraining strategies for video recognition. Our best models achieve 80.4%, 80.5%, and 83.6% Top-1 accuracy on the Kinetics-400 benchmark for training from scratch, ImageNet pre-training, and IG-65M pre-training, respectively. |
Tasks | Action Classification, Video Recognition |
Published | 2020-03-29 |
URL | https://arxiv.org/abs/2003.13042v1 |
https://arxiv.org/pdf/2003.13042v1.pdf | |
PWC | https://paperswithcode.com/paper/omni-sourced-webly-supervised-learning-for |
Repo | |
Framework | |
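A minimal sketch of the joint-training step described above: batches are drawn from a curated target set and a noisier, teacher-filtered web set, and their losses are combined with a balancing weight. The tiny linear classifier, random tensors, batch sizes, and the 0.5 web-loss weight are assumptions for illustration, not OmniSource's actual models or hyperparameters.

```python
# Joint training over two data sources with loss balancing (PyTorch).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from itertools import cycle

torch.manual_seed(0)
n_cls, feat = 10, 128
target_set = TensorDataset(torch.randn(256, feat), torch.randint(0, n_cls, (256,)))
web_set    = TensorDataset(torch.randn(1024, feat), torch.randint(0, n_cls, (1024,)))

target_loader = DataLoader(target_set, batch_size=16, shuffle=True)
web_loader    = DataLoader(web_set, batch_size=16, shuffle=True)

model = nn.Linear(feat, n_cls)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()
web_weight = 0.5            # down-weight the noisier web source (assumed value)

for (xt, yt), (xw, yw) in zip(target_loader, cycle(web_loader)):
    loss = criterion(model(xt), yt) + web_weight * criterion(model(xw), yw)
    opt.zero_grad()
    loss.backward()
    opt.step()
print("final joint loss:", loss.item())
```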
Do We Need Depth in State-Of-The-Art Face Authentication?
Title | Do We Need Depth in State-Of-The-Art Face Authentication? |
Authors | Amir Livne, Alex Bronstein, Ron Kimmel, Ziv Aviv, Shahaf Grofit |
Abstract | Some face recognition methods are designed to utilize geometric features extracted from depth sensors to handle the challenges of single-image based recognition technologies. However, calculating the geometrical data is an expensive and challenging process. Here, we introduce a novel method that learns distinctive geometric features from stereo camera systems without the need to explicitly compute the facial surface or depth map. The raw face stereo images along with coordinate maps allow a CNN to learn geometric features. This way, we keep the simplicity and cost efficiency of recognition from a single image, while enjoying the benefits of geometric data without explicitly reconstructing it. We demonstrate that the suggested method outperforms both existing single-image and explicit depth based methods on large-scale benchmarks. We also provide an ablation study to show that the suggested method uses the coordinate maps to encode more informative features. |
Tasks | Face Recognition |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.10895v1 |
https://arxiv.org/pdf/2003.10895v1.pdf | |
PWC | https://paperswithcode.com/paper/do-we-need-depth-in-state-of-the-art-face |
Repo | |
Framework | |
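A minimal sketch of the input construction suggested by this abstract: the stereo pair is concatenated with per-pixel normalized coordinate maps so a CNN can learn geometry-aware features without explicit depth reconstruction. The exact channel layout and the toy network are assumptions for illustration.

```python
# Stereo pair + coordinate maps as CNN input (PyTorch).
import torch
import torch.nn as nn

def stereo_with_coords(left, right):
    # left, right: (B, 1, H, W) grayscale views from the stereo rig.
    b, _, h, w = left.shape
    ys = torch.linspace(-1, 1, h).view(1, 1, h, 1).expand(b, 1, h, w)
    xs = torch.linspace(-1, 1, w).view(1, 1, 1, w).expand(b, 1, h, w)
    return torch.cat([left, right, xs, ys], dim=1)       # (B, 4, H, W)

net = nn.Sequential(
    nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 128),                                   # face embedding
)

left = torch.rand(2, 1, 112, 112)
right = torch.rand(2, 1, 112, 112)
print(net(stereo_with_coords(left, right)).shape)         # torch.Size([2, 128])
```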
On the Convergence of Stochastic Gradient Descent with Low-Rank Projections for Convex Low-Rank Matrix Problems
Title | On the Convergence of Stochastic Gradient Descent with Low-Rank Projections for Convex Low-Rank Matrix Problems |
Authors | Dan Garber |
Abstract | We revisit the use of Stochastic Gradient Descent (SGD) for solving convex optimization problems that serve as highly popular convex relaxations for many important low-rank matrix recovery problems such as matrix completion, phase retrieval, and more. The computational limitation of applying SGD to these relaxations at large scale is the need to compute a potentially high-rank singular value decomposition (SVD) on each iteration in order to enforce the low-rank-promoting constraint. We begin by considering a simple and natural sufficient condition under which these relaxations indeed admit low-rank solutions. This condition is also necessary for a certain notion of low-rank-robustness to hold. Our main result shows that under this condition, which involves the eigenvalues of the gradient vector at optimal points, SGD with mini-batches, when initialized with a “warm-start” point, produces iterates that are low-rank with high probability, and hence only a low-rank SVD computation is required on each iteration. This suggests that SGD may indeed be practically applicable to solving large-scale convex relaxations of low-rank matrix recovery problems. Our theoretical results are accompanied by supporting preliminary empirical evidence. As a side benefit, our analysis is quite simple and short. |
Tasks | Matrix Completion |
Published | 2020-01-31 |
URL | https://arxiv.org/abs/2001.11668v1 |
https://arxiv.org/pdf/2001.11668v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-convergence-of-stochastic-gradient-3 |
Repo | |
Framework | |
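A minimal sketch of SGD with a low-rank-friendly projection for a convex matrix problem, here matrix completion over a nuclear-norm ball: each projection is an SVD followed by projecting the singular values onto an l1-ball. Step size, batch size, and the ball radius are illustrative choices; the paper's warm-start analysis and rank guarantees are not reproduced.

```python
# Projected mini-batch SGD for matrix completion over a nuclear-norm ball (numpy).
import numpy as np

rng = np.random.default_rng(0)
n, r_true = 50, 3
M = rng.normal(size=(n, r_true)) @ rng.normal(size=(r_true, n))   # ground truth
mask = rng.random((n, n)) < 0.3                                    # observed entries
obs = np.argwhere(mask)

def project_l1(v, tau):
    # Euclidean projection of a nonnegative vector onto the l1-ball of radius tau.
    if v.sum() <= tau:
        return v
    u = np.sort(v)[::-1]
    cssv = np.cumsum(u) - tau
    rho = np.nonzero(u > cssv / (np.arange(len(u)) + 1))[0][-1]
    return np.maximum(v - cssv[rho] / (rho + 1), 0)

def project_nuclear_ball(X, tau):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(project_l1(s, tau)) @ Vt

tau = np.linalg.svd(M, compute_uv=False).sum()      # radius of the feasible set
X = np.zeros((n, n))
for t in range(500):
    batch = obs[rng.choice(len(obs), size=64)]
    i, j = batch[:, 0], batch[:, 1]
    grad = np.zeros_like(X)
    grad[i, j] = X[i, j] - M[i, j]                   # stochastic gradient of the squared loss
    X = project_nuclear_ball(X - 0.5 * grad, tau)

print("relative error:", np.linalg.norm(X - M) / np.linalg.norm(M))
```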
A Generative Learning Approach for Spatio-temporal Modeling in Connected Vehicular Network
Title | A Generative Learning Approach for Spatio-temporal Modeling in Connected Vehicular Network |
Authors | Rong Xia, Yong Xiao, Yingyu Li, Marwan Krunz, Dusit Niyato |
Abstract | Spatio-temporal modeling of wireless access latency is of great importance for connected-vehicular systems. The quality of the modeled results relies heavily on the number and quality of samples, which can vary significantly with sensor deployment density as well as traffic volume and density. This paper proposes LaMI (Latency Model Inpainting), a novel framework to generate a comprehensive spatio-temporal model of the wireless access latency of connected vehicles across a wide geographical area. LaMI adopts ideas from image inpainting and synthesis and reconstructs the missing latency samples in a two-step procedure. In particular, it first discovers the spatial correlation between samples collected in various regions using a patching-based approach and then feeds the original and highly correlated samples into a Variational Autoencoder (VAE), a deep generative model, to create latency samples whose probability distribution is similar to that of the original samples. Finally, LaMI establishes the empirical PDF of latency performance and maps the PDFs into the confidence levels of different vehicular service requirements. Extensive performance evaluation has been conducted using real traces collected from a commercial LTE network on a university campus. Simulation results show that our proposed model can significantly improve the accuracy of latency modeling, especially compared to popular existing solutions such as interpolation and nearest-neighbor-based methods. |
Tasks | Image Inpainting |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.07004v1 |
https://arxiv.org/pdf/2003.07004v1.pdf | |
PWC | https://paperswithcode.com/paper/a-generative-learning-approach-for-spatio |
Repo | |
Framework | |
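A minimal sketch of the generative step in LaMI: a small VAE is fit to latency samples and then used to draw synthetic samples, from which an empirical probability of meeting a latency requirement can be read off. The patching-based spatial-correlation step and the real LTE traces are not reproduced; the data, network sizes, and the 30 ms requirement below are synthetic assumptions.

```python
# Small VAE fit to (synthetic) latency samples, then sampled to estimate a PDF (PyTorch).
import torch
import torch.nn as nn

torch.manual_seed(0)
d = 16                                          # latency samples per region (assumed)
data = 20 + 5 * torch.randn(2048, d).abs()      # synthetic latency measurements (ms)
mu_d, sd_d = data.mean(), data.std()
x = (data - mu_d) / sd_d                        # standardize before fitting

class VAE(nn.Module):
    def __init__(self, d, z=4):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 2 * z))
        self.dec = nn.Sequential(nn.Linear(z, 64), nn.ReLU(), nn.Linear(64, d))
        self.z = z
    def forward(self, xb):
        mu, logvar = self.enc(xb).chunk(2, dim=-1)
        zs = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return self.dec(zs), mu, logvar

model = VAE(d)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(500):
    recon, mu, logvar = model(x)
    rec_loss = ((recon - x) ** 2).mean()
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
    loss = rec_loss + 0.01 * kl
    opt.zero_grad(); loss.backward(); opt.step()

# Draw synthetic latency samples and estimate P(latency <= 30 ms).
with torch.no_grad():
    samples = model.dec(torch.randn(4096, model.z)) * sd_d + mu_d
print("estimated P(latency <= 30 ms):", (samples <= 30).float().mean().item())
```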
Reinforcement-Learning based Portfolio Management with Augmented Asset Movement Prediction States
Title | Reinforcement-Learning based Portfolio Management with Augmented Asset Movement Prediction States |
Authors | Yunan Ye, Hengzhi Pei, Boxin Wang, Pin-Yu Chen, Yada Zhu, Jun Xiao, Bo Li |
Abstract | Portfolio management (PM) is a fundamental financial planning task that aims to achieve investment goals such as maximal profits or minimal risks. Its decision process involves continuous derivation of valuable information from various data sources and sequential decision optimization, which is a prospective research direction for reinforcement learning (RL). In this paper, we propose SARL, a novel State-Augmented RL framework for PM. Our framework aims to address two unique challenges in financial PM: (1) data heterogeneity – the collected information for each asset is usually diverse, noisy and imbalanced (e.g., news articles); and (2) environment uncertainty – the financial market is versatile and non-stationary. To incorporate heterogeneous data and enhance robustness against environment uncertainty, our SARL augments the asset information with their price movement prediction as additional states, where the prediction can be solely based on financial data (e.g., asset prices) or derived from alternative sources such as news. Experiments on two real-world datasets, (i) Bitcoin market and (ii) HighTech stock market with 7-year Reuters news articles, validate the effectiveness of SARL over existing PM approaches, both in terms of accumulated profits and risk-adjusted profits. Moreover, extensive simulations are conducted to demonstrate the importance of our proposed state augmentation, providing new insights and boosting performance significantly over standard RL-based PM method and other baselines. |
Tasks | |
Published | 2020-02-09 |
URL | https://arxiv.org/abs/2002.05780v1 |
https://arxiv.org/pdf/2002.05780v1.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-based-portfolio |
Repo | |
Framework | |
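A minimal sketch of the state-augmentation idea in SARL: the RL observation is a normalized window of recent prices concatenated with a per-asset movement prediction. The naive momentum-sign predictor below is a placeholder for the paper's financial or news-based predictors.

```python
# Augmenting the portfolio-management state with movement predictions (numpy).
import numpy as np

rng = np.random.default_rng(0)
n_assets, window = 4, 10
prices = np.cumprod(1 + 0.01 * rng.normal(size=(500, n_assets)), axis=0)

def movement_prediction(price_window):
    # Placeholder predictor: +1 if the asset rose over the window, else -1.
    return np.sign(price_window[-1] - price_window[0])

def augmented_state(t):
    window_prices = prices[t - window:t]                  # (window, n_assets)
    relative = window_prices / window_prices[-1]          # normalized price tensor
    pred = movement_prediction(window_prices)             # (n_assets,)
    return np.concatenate([relative.ravel(), pred])       # state fed to the RL policy

s = augmented_state(100)
print(s.shape)     # (window * n_assets + n_assets,) = (44,)
```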
Do all Roads Lead to Rome? Understanding the Role of Initialization in Iterative Back-Translation
Title | Do all Roads Lead to Rome? Understanding the Role of Initialization in Iterative Back-Translation |
Authors | Mikel Artetxe, Gorka Labaka, Noe Casas, Eneko Agirre |
Abstract | Back-translation provides a simple yet effective approach to exploit monolingual corpora in Neural Machine Translation (NMT). Its iterative variant, where two opposite NMT models are jointly trained by alternately using a synthetic parallel corpus generated by the reverse model, plays a central role in unsupervised machine translation. In order to start producing sound translations and provide a meaningful training signal to each other, existing approaches rely on either a separate machine translation system to warm up the iterative procedure, or some form of pre-training to initialize the weights of the model. In this paper, we analyze the role that such initialization plays in iterative back-translation. Is the behavior of the final system heavily dependent on it? Or does iterative back-translation converge to a similar solution given any reasonable initialization? Through a series of empirical experiments over a diverse set of warmup systems, we show that, although the quality of the initial system does affect final performance, its effect is relatively small, as iterative back-translation has a strong tendency to converge to a similar solution. As such, the margin of improvement left for the initialization method is narrow, suggesting that future research should focus more on improving the iterative mechanism itself. |
Tasks | Machine Translation, Unsupervised Machine Translation |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12867v1 |
https://arxiv.org/pdf/2002.12867v1.pdf | |
PWC | https://paperswithcode.com/paper/do-all-roads-lead-to-rome-understanding-the |
Repo | |
Framework | |
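A minimal sketch of the iterative back-translation loop analyzed in this paper. The "models" are trivial token-level dictionaries trained by co-occurrence counting on position-aligned pairs, a deliberately tiny stand-in for real NMT systems, so only the alternation structure and the role of a (noisy) warm-up dictionary are illustrated, not translation quality.

```python
# Iterative back-translation loop with toy token-level "translation models" (numpy).
import numpy as np

rng = np.random.default_rng(0)
V, n_sents, sent_len = 20, 500, 8
true_map = rng.permutation(V)                         # ground-truth src->tgt dictionary
inv_map = np.argsort(true_map)                        # ground-truth tgt->src dictionary

mono_src = rng.integers(0, V, size=(n_sents, sent_len))             # source monolingual data
mono_tgt = true_map[rng.integers(0, V, size=(n_sents, sent_len))]   # target monolingual data

def train(parallel_src, parallel_tgt):
    # "Train" one direction: map each source token to its most co-occurring target token.
    counts = np.zeros((V, V))
    np.add.at(counts, (parallel_src.ravel(), parallel_tgt.ravel()), 1)
    return counts.argmax(axis=1)

def translate(model, sents):
    return model[sents]

# Warm-up systems (the initialization whose role the paper studies): noisy dictionaries.
src2tgt = np.where(rng.random(V) < 0.5, true_map, rng.integers(0, V, size=V))
tgt2src = np.where(rng.random(V) < 0.5, inv_map, rng.integers(0, V, size=V))

for it in range(5):
    # Back-translate target monolingual data, retrain the src->tgt direction on it.
    src2tgt = train(translate(tgt2src, mono_tgt), mono_tgt)
    # Back-translate source monolingual data, retrain the tgt->src direction on it.
    tgt2src = train(translate(src2tgt, mono_src), mono_src)
    # With these trivial models accuracy need not improve; the two directions mainly
    # become consistent with each other, which is all this sketch illustrates.
    acc = (src2tgt == true_map).mean()
    roundtrip = (tgt2src[src2tgt] == np.arange(V)).mean()
    print(f"iteration {it}: dictionary accuracy {acc:.2f}, round-trip consistency {roundtrip:.2f}")
```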
Learning Constraints from Locally-Optimal Demonstrations under Cost Function Uncertainty
Title | Learning Constraints from Locally-Optimal Demonstrations under Cost Function Uncertainty |
Authors | Glen Chou, Necmiye Ozay, Dmitry Berenson |
Abstract | We present an algorithm for learning parametric constraints from locally-optimal demonstrations, where the cost function being optimized is uncertain to the learner. Our method uses the Karush-Kuhn-Tucker (KKT) optimality conditions of the demonstrations within a mixed integer linear program (MILP) to learn constraints which are consistent with the local optimality of the demonstrations, by either using a known constraint parameterization or by incrementally growing a parameterization that is consistent with the demonstrations. We provide theoretical guarantees on the conservativeness of the recovered safe/unsafe sets and analyze the limits of constraint learnability when using locally-optimal demonstrations. We evaluate our method on high-dimensional constraints and systems by learning constraints for 7-DOF arm and quadrotor examples, show that it outperforms competing constraint-learning approaches, and can be effectively used to plan new constraint-satisfying trajectories in the environment. |
Tasks | |
Published | 2020-01-25 |
URL | https://arxiv.org/abs/2001.09336v1 |
https://arxiv.org/pdf/2001.09336v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-constraints-from-locally-optimal |
Repo | |
Framework | |
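A minimal sketch of embedding the KKT conditions of a demonstration in a MILP to recover an unknown constraint parameter, on a 1-D toy problem (cost x^2, unknown constraint x >= theta, locally optimal demonstration x* = 1) modelled with the PuLP library. The big-M encoding of complementary slackness is the standard trick; the paper's constraint parameterizations, incremental growth, and guarantees are not reproduced.

```python
# KKT conditions of a demonstration encoded as a MILP (PuLP + CBC).
import pulp

x_star = 1.0            # locally optimal demonstration
grad_cost = 2 * x_star  # d/dx of x^2 at the demonstration
M = 100.0               # big-M constant (assumed large enough)

prob = pulp.LpProblem("kkt_constraint_recovery", pulp.LpMaximize)
theta = pulp.LpVariable("theta", lowBound=-M, upBound=M)      # unknown constraint parameter
lam = pulp.LpVariable("lambda", lowBound=0)                   # KKT multiplier
z = pulp.LpVariable("z", cat="Binary")                        # "constraint active" indicator

prob += theta                                # recover the tightest consistent constraint
# Stationarity: grad_cost + lam * d/dx(theta - x) = 0  ->  2*x* - lam = 0
prob += grad_cost - lam == 0
# Primal feasibility of the demonstration: theta - x* <= 0
prob += theta - x_star <= 0
# Complementary slackness via big-M: lam > 0 forces the constraint to be active.
prob += lam <= M * z
prob += x_star - theta <= M * (1 - z)

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("recovered theta:", pulp.value(theta))   # 1.0: the demonstration sits on the constraint
```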
Echo State Neural Machine Translation
Title | Echo State Neural Machine Translation |
Authors | Ankush Garg, Yuan Cao, Qi Ge |
Abstract | We present neural machine translation (NMT) models inspired by echo state networks (ESN), named Echo State NMT (ESNMT), in which the encoder and decoder layer weights are randomly generated and then fixed throughout training. We show that even with this extremely simple model construction and training procedure, ESNMT can already reach 70-80% of the quality of fully trainable baselines. We examine how the spectral radius of the reservoir, a key quantity that characterizes the model, determines model behavior. Our findings indicate that randomized networks can work well even for complicated sequence-to-sequence prediction NLP tasks. |
Tasks | Machine Translation |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.11847v1 |
https://arxiv.org/pdf/2002.11847v1.pdf | |
PWC | https://paperswithcode.com/paper/echo-state-neural-machine-translation |
Repo | |
Framework | |
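A minimal sketch of the echo-state idea applied to a seq2seq model: the recurrent encoder/decoder weights are randomly initialized and frozen, so only the embeddings and the output projection receive gradients. The model sizes and the 0.9 spectral-radius rescaling of each recurrent gate block are illustrative choices, not the paper's exact setup.

```python
# Echo-state seq2seq: frozen random encoder/decoder, trainable embeddings and output (PyTorch).
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab, emb, hid = 1000, 64, 128

class ESNMT(nn.Module):
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(vocab, emb)
        self.tgt_emb = nn.Embedding(vocab, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)
        for rnn in (self.encoder, self.decoder):
            for name, p in rnn.named_parameters():
                if "weight_hh" in name:                  # recurrent (reservoir) weights
                    # Rescale each gate's square recurrent block to a target
                    # spectral radius, a standard echo-state practice.
                    for block in p.data.chunk(3, dim=0):
                        radius = torch.linalg.eigvals(block).abs().max()
                        block.mul_(0.9 / radius)
                p.requires_grad_(False)                  # freeze encoder/decoder weights

    def forward(self, src, tgt):
        _, h = self.encoder(self.src_emb(src))
        dec_out, _ = self.decoder(self.tgt_emb(tgt), h)
        return self.out(dec_out)

model = ESNMT()
trainable = [p for p in model.parameters() if p.requires_grad]
print("trainable tensors:", len(trainable))              # embeddings + output layer only
logits = model(torch.randint(0, vocab, (4, 12)), torch.randint(0, vocab, (4, 10)))
print(logits.shape)                                      # torch.Size([4, 10, 1000])
```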