January 28, 2020


Paper Group ANR 953


Boosting Generative Models by Leveraging Cascaded Meta-Models


Title	Boosting Generative Models by Leveraging Cascaded Meta-Models
Authors	Fan Bao, Hang Su, Jun Zhu
Abstract	Deep generative models are effective methods of modeling data. However, it is not easy for a single generative model to faithfully capture the distributions of complex data such as images. In this paper, we propose an approach for boosting generative models, which cascades meta-models together to produce a stronger model. Any hidden variable meta-model (e.g., RBM and VAE) which supports likelihood evaluation can be leveraged. We derive a decomposable variational lower bound of the boosted model, which allows each meta-model to be trained separately and greedily. Besides, our framework can be extended to semi-supervised boosting, where the boosted model learns a joint distribution of data and labels. Finally, we combine our boosting framework with the multiplicative boosting framework, which further improves the learning power of generative models.
Tasks
Published	2019-05-11
URL	https://arxiv.org/abs/1905.04534v1
PDF	https://arxiv.org/pdf/1905.04534v1.pdf
PWC	https://paperswithcode.com/paper/boosting-generative-models-by-leveraging
Repo
Framework

Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition


Title	Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition
Authors	Yao Qin, Nicholas Carlini, Ian Goodfellow, Garrison Cottrell, Colin Raffel
Abstract	Adversarial examples are inputs to machine learning models designed by an adversary to cause an incorrect output. So far, adversarial examples have been studied most extensively in the image domain. In this domain, adversarial examples can be constructed by imperceptibly modifying images to cause misclassification, and are practical in the physical world. In contrast, current targeted adversarial examples applied to speech recognition systems have neither of these properties: humans can easily identify the adversarial perturbations, and they are not effective when played over-the-air. This paper makes advances on both of these fronts. First, we develop effectively imperceptible audio adversarial examples (verified through a human study) by leveraging the psychoacoustic principle of auditory masking, while retaining 100% targeted success rate on arbitrary full-sentence targets. Next, we make progress towards physical-world over-the-air audio adversarial examples by constructing perturbations which remain effective even after applying realistic simulated environmental distortions.
Tasks	Speech Recognition
Published	2019-03-22
URL	https://arxiv.org/abs/1903.10346v2
PDF	https://arxiv.org/pdf/1903.10346v2.pdf
PWC	https://paperswithcode.com/paper/imperceptible-robust-and-targeted-adversarial
Repo
Framework

A Lightweight Deep Learning Model for Human Activity Recognition on Edge Devices


Title	A Lightweight Deep Learning Model for Human Activity Recognition on Edge Devices
Authors	Preeti Agarwal, Mansaf Alam
Abstract	Human Activity Recognition (HAR) using wearable and mobile sensors has gained momentum in the last few years in various fields, such as healthcare, surveillance, education, and entertainment. Edge computing has recently emerged to reduce communication latency and network traffic. Edge devices are resource-constrained and cannot support heavy computation. In the literature, various models have been developed for HAR. In recent years, deep learning algorithms have shown high performance in HAR, but these algorithms require a lot of computation, making them inefficient to deploy on edge devices. This paper proposes a Lightweight Deep Learning Model for HAR that requires less computational power, making it suitable for deployment on edge devices. The performance of the proposed model is tested on data from six daily activities of the participants. Results show that the proposed model outperforms many of the existing machine learning and deep learning techniques.
Tasks	Activity Recognition, Human Activity Recognition
Published	2019-09-20
URL	https://arxiv.org/abs/1909.12917v1
PDF	https://arxiv.org/pdf/1909.12917v1.pdf
PWC	https://paperswithcode.com/paper/a-lightweight-deep-learning-model-for-human
Repo
Framework

Machine Translation with Cross-lingual Word Embeddings


Title	Machine Translation with Cross-lingual Word Embeddings
Authors	Marco Berlot, Evan Kaplan
Abstract	Learning word embeddings from distributional information is a task that has been studied by many researchers, and many studies are reported in the literature. In contrast, fewer studies have addressed the case of multiple languages. The idea is to focus on a single representation for a pair of languages such that semantically similar words are closer to one another in the induced representation irrespective of the language. In this way, when data are missing for a particular language, classifiers from another language can be used.
Tasks	Learning Word Embeddings, Machine Translation, Word Embeddings
Published	2019-12-10
URL	https://arxiv.org/abs/1912.10167v1
PDF	https://arxiv.org/pdf/1912.10167v1.pdf
PWC	https://paperswithcode.com/paper/machine-translation-with-cross-lingual-word
Repo
Framework
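
The shared-representation idea in the abstract above can be illustrated with a toy nearest-neighbour lookup. This is only a sketch: the vocabularies and vectors are invented for illustration, and it assumes that the embeddings of both languages have already been mapped into one common space (the abstract does not say how that mapping is learned).

```python
import math

# Hypothetical toy embeddings, assumed to already live in one shared space.
EN = {"dog": [0.9, 0.1, 0.0], "house": [0.0, 0.8, 0.6]}
IT = {"cane": [0.85, 0.15, 0.05], "casa": [0.05, 0.75, 0.65], "libro": [0.1, 0.1, 0.9]}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def translate(word, source, target):
    """Return the target-language word whose vector is closest to `word`."""
    query = source[word]
    return max(target, key=lambda candidate: cosine(query, target[candidate]))
```

With these toy vectors, `translate("dog", EN, IT)` picks the Italian entry that lies closest to the English query, which is the sense in which similar words end up close "irrespective of the language".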

TentacleNet: A Pseudo-Ensemble Template for Accurate Binary Convolutional Neural Networks


Title	TentacleNet: A Pseudo-Ensemble Template for Accurate Binary Convolutional Neural Networks
Authors	Luca Mocerino, Andrea Calimera
Abstract	Binarization is an attractive strategy for implementing lightweight Deep Convolutional Neural Networks (CNNs). Despite the unquestionable savings it offers, memory footprint above all, it may induce an excessive accuracy loss that prevents widespread use. This work elaborates on this aspect by introducing TentacleNet, a new template designed to improve the predictive performance of binarized CNNs via parallelization. Inspired by ensemble learning theory, it consists of a compact topology that is end-to-end trainable and organized to minimize memory utilization. Experimental results collected over three realistic benchmarks show that TentacleNet fills the gap left by classical binary models, ensuring substantial memory savings w.r.t. state-of-the-art binary ensemble methods.
Tasks
Published	2019-12-20
URL	https://arxiv.org/abs/1912.10103v2
PDF	https://arxiv.org/pdf/1912.10103v2.pdf
PWC	https://paperswithcode.com/paper/tentaclenet-a-pseudo-ensemble-template-for
Repo
Framework
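
To make the memory argument in the abstract above concrete, here is a minimal sketch of weight binarization. It is not the TentacleNet architecture: it only shows the generic idea of replacing each real-valued weight by its sign, and the single mean-absolute-value scale factor and the bit-packing layout are assumptions chosen for illustration.

```python
def binarize(weights):
    """Map real-valued weights to {-1, +1} plus one shared scale factor."""
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return scale, signs


def pack_bits(signs):
    """Store each sign as one bit (+1 -> 1, -1 -> 0), eight signs per byte."""
    packed = bytearray((len(signs) + 7) // 8)
    for i, s in enumerate(signs):
        if s > 0:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


weights = [0.4, -1.2, 0.05, -0.3, 0.9, -0.7, 0.2, -0.1]
scale, signs = binarize(weights)
packed = pack_bits(signs)
# Eight 32-bit floats would occupy 32 bytes; the packed signs occupy 1 byte,
# plus one float for the shared scale.
```

The accuracy loss the abstract mentions comes from exactly this step: every weight magnitude is collapsed to the same value `scale`.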

An IoT Based Framework For Activity Recognition Using Deep Learning Technique


Title	An IoT Based Framework For Activity Recognition Using Deep Learning Technique
Authors	Ashwin Geet D’Sa, B. G. Prasad
Abstract	Activity recognition is the ability to identify and recognize the actions or goals of an agent. The agent can be any object or entity that performs actions with end goals. It can be a single agent performing an action, or a group of agents performing actions or interacting with one another. Human activity recognition has gained popularity due to demand from many practical applications such as entertainment, healthcare, simulations and surveillance systems. Vision-based activity recognition is gaining an advantage because it does not require any human intervention or physical contact with humans. Moreover, sets of cameras can be networked to track and recognize the activities of the agent. Traditional applications that tracked or recognized human activities made use of wearable devices. However, such applications require physical contact with the person. To overcome such challenges, a vision-based activity recognition system can be used, which uses a camera to record the video and a processor that performs the task of recognition. The work is implemented in two stages. In the first stage, an approach for implementing activity recognition is proposed using background subtraction of images, followed by 3D Convolutional Neural Networks. The impact of using background subtraction prior to 3D Convolutional Neural Networks is reported. In the second stage, the work is extended and implemented on a Raspberry Pi, which can be used to record a stream of video and then recognize the activity in the video. Thus, a proof of concept for activity recognition using a small, IoT-based device is provided, which can enhance the system and extend its applications in various ways, such as increased portability, networking, and other capabilities of the device.
Tasks	Activity Recognition, Human Activity Recognition
Published	2019-06-17
URL	https://arxiv.org/abs/1906.07247v1
PDF	https://arxiv.org/pdf/1906.07247v1.pdf
PWC	https://paperswithcode.com/paper/an-iot-based-framework-for-activity
Repo
Framework
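
The first stage described in the abstract above (background subtraction before a 3D CNN) can be sketched as follows. The abstract does not name the background subtraction algorithm, so this example assumes the simplest variant: thresholded differencing against a static background frame. The function names and the threshold value are hypothetical.

```python
def subtract_background(frame, background, threshold=30):
    """Return a binary foreground mask: 1 where a pixel differs from the background."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def mask_clip(frames, background, threshold=30):
    """Zero out background pixels in every frame of a clip of 2-D grayscale frames."""
    clip = []
    for frame in frames:
        mask = subtract_background(frame, background, threshold)
        clip.append([[p * m for p, m in zip(fr, mr)] for fr, mr in zip(frame, mask)])
    return clip
```

The masked frames, stacked along the time axis, would then form the volume that a 3D convolutional network consumes.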

Investigating Decision Boundaries of Trained Neural Networks


Title	Investigating Decision Boundaries of Trained Neural Networks
Authors	Roozbeh Yousefzadeh, Dianne P O’Leary
Abstract	Deep learning models have been the subject of study from various perspectives, for example, their training process, interpretation, generalization error, robustness to adversarial attacks, etc. A trained model is defined by its decision boundaries, and therefore, many of the studies about deep learning models speculate about the decision boundaries, and sometimes make simplifying assumptions about them. So far, finding exact points on the decision boundaries of trained deep models has been considered an intractable problem. Here, we compute exact points on the decision boundaries of these models and provide mathematical tools to investigate the surfaces that define the decision boundaries. Through numerical results, we confirm that some of the speculations about the decision boundaries are accurate, some of the computational methods can be improved, and some of the simplifying assumptions may be unreliable, for models with nonlinear activation functions. We advocate for verification of simplifying assumptions and approximation methods, wherever they are used. Finally, we demonstrate that the computational practices used for finding adversarial examples can be improved and computing the closest point on the decision boundary reveals the weakest vulnerability of a model against adversarial attack.
Tasks	Adversarial Attack
Published	2019-08-07
URL	https://arxiv.org/abs/1908.02802v1
PDF	https://arxiv.org/pdf/1908.02802v1.pdf
PWC	https://paperswithcode.com/paper/investigating-decision-boundaries-of-trained
Repo
Framework

From feature selection to continuous optimization


Title	From feature selection to continuous optimization
Authors	Hojjat Rakhshani, Lhassane Idoumghar, Julien Lepagnot, Mathieu Brevilliers
Abstract	Metaheuristic algorithms (MAs) have seen unprecedented growth thanks to their successful applications in fields including engineering and health sciences. In this work, we investigate the use of a deep learning (DL) model as an alternative optimization tool. The proposed method, called MaNet, is motivated by the fact that most DL models need to solve massive, hard optimization problems consisting of millions of parameters. Feature selection is the main concept adopted in MaNet; it helps the algorithm skip irrelevant or partially relevant evolutionary information and use the information that contributes most to the overall performance. The introduced model is applied to several unimodal and multimodal continuous problems. The experiments indicate that MaNet is able to yield competitive results compared to one of the best hand-designed algorithms for the aforementioned problems, in terms of solution accuracy and scalability.
Tasks	Feature Selection
Published	2019-09-20
URL	https://arxiv.org/abs/1909.09444v2
PDF	https://arxiv.org/pdf/1909.09444v2.pdf
PWC	https://paperswithcode.com/paper/from-feature-selection-to-continues
Repo
Framework

Deep SR-ITM: Joint Learning of Super-Resolution and Inverse Tone-Mapping for 4K UHD HDR Applications


Title	Deep SR-ITM: Joint Learning of Super-Resolution and Inverse Tone-Mapping for 4K UHD HDR Applications
Authors	Soo Ye Kim, Jihyong Oh, Munchurl Kim
Abstract	Recent modern displays are now able to render high dynamic range (HDR), high resolution (HR) videos of up to 8K UHD (Ultra High Definition). Consequently, UHD HDR broadcasting and streaming have emerged as high quality premium services. However, due to the lack of original UHD HDR video content, appropriate conversion technologies are urgently needed to transform the legacy low resolution (LR) standard dynamic range (SDR) videos into UHD HDR versions. In this paper, we propose a joint super-resolution (SR) and inverse tone-mapping (ITM) framework, called Deep SR-ITM, which learns the direct mapping from LR SDR video to their HR HDR version. Joint SR and ITM is an intricate task, where high frequency details must be restored for SR, jointly with the local contrast, for ITM. Our network is able to restore fine details by decomposing the input image and focusing on the separate base (low frequency) and detail (high frequency) layers. Moreover, the proposed modulation blocks apply location-variant operations to enhance local contrast. The Deep SR-ITM shows good subjective quality with increased contrast and details, outperforming the previous joint SR-ITM method.
Tasks	Super-Resolution
Published	2019-04-25
URL	https://arxiv.org/abs/1904.11176v3
PDF	https://arxiv.org/pdf/1904.11176v3.pdf
PWC	https://paperswithcode.com/paper/deep-sr-itm-joint-learning-of-super
Repo
Framework
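
The base/detail decomposition mentioned in the abstract above can be sketched on a 1-D signal. This is an illustration under stated assumptions, not the paper's pipeline: the box blur as the low-pass filter and the subtractive detail layer are choices made here for simplicity, and the paper's actual filter and combination rule may differ.

```python
def box_blur(signal, radius=1):
    """Low-pass filter: average each sample with its neighbours (window truncated at the edges)."""
    n = len(signal)
    out = []
    for i in range(n):
        window = signal[max(0, i - radius): min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def decompose(signal, radius=1):
    """Split a signal into base (low frequency) and detail (high frequency) layers."""
    base = box_blur(signal, radius)
    detail = [s - b for s, b in zip(signal, base)]
    return base, detail
```

By construction, adding the two layers back together recovers the input, so a network can process each layer separately without losing information.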

Automated Detection of Pre-Disaster Building Images from Google Street View


Title	Automated Detection of Pre-Disaster Building Images from Google Street View
Authors	Chul Min Yeum, Ali Lenjani, Shirley J. Dyke, Ilias Bilionis
Abstract	After a disaster, teams of structural engineers collect vast amounts of images from damaged buildings to obtain lessons and gain knowledge from the event. Images of damaged buildings and components provide valuable evidence to understand the consequences on our structures. However, images of damaged buildings are often captured without sufficient spatial context. Also, the buildings may be hard to recognize in cases with severe damage. Incorporating past images showing a pre-disaster condition of such buildings is helpful to accurately evaluate possible circumstances related to a building’s failure. One of the best resources to observe the pre-disaster condition of the buildings is Google Street View. A sequence of 360 panorama images which are captured along streets enables all-around views at each location on the street. Once a user knows the GPS information near the building, all external views of the building can be made available. In this study, we develop an automated technique to extract past building images from 360 panorama images provided by Google Street View. Users only need to provide a geo-tagged image, collected near the target building, and the rest of the process is fully automated. High-quality and undistorted building images are extracted from past panoramas. Since the panoramas are collected from various locations near the building along the street, the user can identify its pre-disaster conditions from the full set of external views.
Tasks
Published	2019-02-13
URL	http://arxiv.org/abs/1902.10816v1
PDF	http://arxiv.org/pdf/1902.10816v1.pdf
PWC	https://paperswithcode.com/paper/automated-detection-of-pre-disaster-building
Repo
Framework

Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity


Title	Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity
Authors	Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang
Abstract	Zeroth-order (gradient-free) methods are a class of powerful optimization tools for many machine learning problems because they need only function values (not gradients) in the optimization. In particular, zeroth-order methods are well suited to many complex problems such as black-box attacks and bandit feedback, whose explicit gradients are difficult or infeasible to obtain. Although many zeroth-order methods have been developed recently, these approaches still have two main drawbacks: 1) high function query complexity; 2) poor suitability for problems with complex penalties and constraints. To address these drawbacks, in this paper we propose a novel fast zeroth-order stochastic alternating direction method of multipliers (ADMM) method (i.e., ZO-SPIDER-ADMM) with lower function query complexity for solving nonconvex problems with multiple nonsmooth penalties. Moreover, we prove that our ZO-SPIDER-ADMM has the optimal function query complexity of $O(dn + dn^{\frac{1}{2}}\epsilon^{-1})$ for finding an $\epsilon$-approximate local solution, where $n$ and $d$ denote the sample size and dimension of data, respectively. In particular, the ZO-SPIDER-ADMM improves the existing best nonconvex zeroth-order ADMM methods by a factor of $O(d^{\frac{1}{3}}n^{\frac{1}{6}})$. Moreover, we propose a fast online ZO-SPIDER-ADMM (i.e., ZOO-SPIDER-ADMM). Our theoretical analysis shows that the ZOO-SPIDER-ADMM has the function query complexity of $O(d\epsilon^{-\frac{3}{2}})$, which improves the existing best result by a factor of $O(\epsilon^{-\frac{1}{2}})$. Finally, we utilize a task of structured adversarial attack on black-box deep neural networks to demonstrate the efficiency of our algorithms.
Tasks	Adversarial Attack
Published	2019-07-30
URL	https://arxiv.org/abs/1907.13463v1
PDF	https://arxiv.org/pdf/1907.13463v1.pdf
PWC	https://paperswithcode.com/paper/nonconvex-zeroth-order-stochastic-admm
Repo
Framework
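
The phrase "only needs function values" in the abstract above can be made concrete with a generic zeroth-order gradient estimator. This sketch is not ZO-SPIDER-ADMM: it is a plain coordinate-wise forward-difference estimator driving ordinary gradient descent, and the smoothing parameter `mu`, the step size, and the toy objective `sphere` are assumptions made for illustration.

```python
def zo_gradient(f, x, mu=1e-4):
    """Estimate the gradient of f at x from function values only.

    Uses coordinate-wise forward differences, costing len(x) + 1 queries of f.
    """
    fx = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += mu
        grad.append((f(shifted) - fx) / mu)
    return grad


def zo_descent(f, x, step=0.1, iterations=200):
    """Plain gradient descent driven by the zeroth-order estimate."""
    for _ in range(iterations):
        g = zo_gradient(f, x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def sphere(x):
    """Toy black-box objective with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
```

Each gradient estimate here costs one query per coordinate plus one, which gives some intuition for why the dimension $d$ appears as a factor in the query complexities quoted in the abstract.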

Neural eliminators and classifiers


Title	Neural eliminators and classifiers
Authors	Włodzisław Duch, Rafał Adamczak, Yoichi Hayashi
Abstract	Classification may not be reliable for several reasons: noise in the data, insufficient input information, overlapping distributions and sharp definition of classes. In such cases, a neural network faced with several possibilities may still be useful if, instead of classifying, it eliminates improbable classes. Eliminators may be constructed using classifiers that assign new cases to a pool of several classes instead of just one winning class. Elimination may be done with the help of several classifiers using modified error functions. A real-life medical application of a neural network is presented, illustrating the usefulness of elimination.
Tasks
Published	2019-01-28
URL	http://arxiv.org/abs/1901.09632v1
PDF	http://arxiv.org/pdf/1901.09632v1.pdf
PWC	https://paperswithcode.com/paper/neural-eliminators-and-classifiers
Repo
Framework
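
The difference between classifying and eliminating, as described in the abstract above, can be shown in a few lines. This is a simplified sketch: thresholding a classifier's output probabilities is an assumption made here, whereas the abstract speaks of several classifiers trained with modified error functions. The threshold value and function name are hypothetical.

```python
def eliminate(probabilities, threshold=0.1):
    """Return the pool of classes that survive elimination.

    Classes whose estimated probability falls below `threshold` are ruled out;
    the most probable class is always kept so the pool is never empty.
    """
    best = max(probabilities, key=probabilities.get)
    return {c for c, p in probabilities.items() if p >= threshold or c == best}
```

For a case where two classes are nearly tied, a classifier must pick one winner, while the eliminator returns both and only rules out the classes that are clearly improbable.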

Optimal, Truthful, and Private Securities Lending


Title	Optimal, Truthful, and Private Securities Lending
Authors	Emily Diana, Michael Kearns, Seth Neel, Aaron Roth
Abstract	We consider a fundamental dynamic allocation problem motivated by the problem of securities lending in financial markets, the mechanism underlying the short selling of stocks. A lender would like to distribute a finite number of identical copies of some scarce resource to $n$ clients, each of whom has a private demand that is unknown to the lender. The lender would like to maximize the usage of the resource (avoiding allocating more to a client than her true demand) but is constrained to sell the resource at a pre-specified price per unit, and thus cannot use prices to incentivize truthful reporting. We first show that the Bayesian optimal algorithm for the one-shot problem (which maximizes the resource’s expected usage according to the posterior expectation of demand, given reports) actually incentivizes truthful reporting as a dominant strategy. Because true demands in the securities lending problem are often sensitive information that the client would like to hide from competitors, we then consider the problem under the additional desideratum of (joint) differential privacy. We give an algorithm, based on simple dynamics for computing market equilibria, that is simultaneously private, approximately optimal, and approximately dominant-strategy truthful. Finally, we leverage this private algorithm to construct an approximately truthful, optimal mechanism for the extensive form multi-round auction where the lender does not have access to the true joint distributions between clients’ requests and demands.
Tasks
Published	2019-12-12
URL	https://arxiv.org/abs/1912.06202v1
PDF	https://arxiv.org/pdf/1912.06202v1.pdf
PWC	https://paperswithcode.com/paper/optimal-truthful-and-private-securities
Repo
Framework

Color Constancy Convolutional Autoencoder


Title	Color Constancy Convolutional Autoencoder
Authors	Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis, Jarno Nikkanen, Moncef Gabbouj
Abstract	In this paper, we study the importance of pre-training for the generalization capability in the color constancy problem. We propose two novel approaches based on convolutional autoencoders: an unsupervised pre-training algorithm using a fine-tuned encoder and a semi-supervised pre-training algorithm using a novel composite-loss function. This enables us to solve the data scarcity problem and achieve results competitive with the state of the art on the ColorChecker RECommended dataset while requiring far fewer parameters. We further study the over-fitting phenomenon on the recently introduced version of the INTEL-TUT Dataset for Camera Invariant Color Constancy Research, which has both field and non-field scenes acquired by three different camera models.
Tasks	Color Constancy
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01340v1
PDF	https://arxiv.org/pdf/1906.01340v1.pdf
PWC	https://paperswithcode.com/paper/color-constancy-convolutional-autoencoder
Repo
Framework

On the Design of Black-box Adversarial Examples by Leveraging Gradient-free Optimization and Operator Splitting Method


Title	On the Design of Black-box Adversarial Examples by Leveraging Gradient-free Optimization and Operator Splitting Method
Authors	Pu Zhao, Sijia Liu, Pin-Yu Chen, Nghia Hoang, Kaidi Xu, Bhavya Kailkhura, Xue Lin
Abstract	Robust machine learning is currently one of the most prominent topics which could potentially help shape a future of advanced AI platforms that perform well not only in average cases but also in worst cases or adverse situations. Despite the long-term vision, however, existing studies on black-box adversarial attacks are still restricted to very specific settings of threat models (e.g., a single distortion metric and restrictive assumptions on the target model’s feedback to queries) and/or suffer from prohibitively high query complexity. To push for further advances in this field, we introduce a general framework based on an operator splitting method, the alternating direction method of multipliers (ADMM), to devise efficient, robust black-box attacks that work with various distortion metrics and feedback settings without incurring high query complexity. Due to the black-box nature of the threat model, the proposed ADMM solution framework is integrated with zeroth-order (ZO) optimization and Bayesian optimization (BO), and thus is applicable to the gradient-free regime. This results in two new black-box adversarial attack generation methods, ZO-ADMM and BO-ADMM. Our empirical evaluations on image classification datasets show that our proposed approaches have much lower function query complexities compared to state-of-the-art attack methods, but achieve very competitive attack success rates.
Tasks	Adversarial Attack, Image Classification
Published	2019-07-26
URL	https://arxiv.org/abs/1907.11684v4
PDF	https://arxiv.org/pdf/1907.11684v4.pdf
PWC	https://paperswithcode.com/paper/on-the-design-of-black-box-adversarial
Repo
Framework