Paper Group AWR 115
High-Resolution Deep Convolutional Generative Adversarial Networks. Probabilistic programs for inferring the goals of autonomous agents. Receptive Field Block Net for Accurate and Fast Object Detection. Cost-complexity pruning of random forests. Borrowing Treasures from the Wealthy: Deep Transfer Learning through Selective Joint Fine-tuning. Advers …
High-Resolution Deep Convolutional Generative Adversarial Networks
Title | High-Resolution Deep Convolutional Generative Adversarial Networks |
Authors | Joachim D. Curtó, Irene C. Zarza, Fernando De La Torre, Irwin King, Michael R. Lyu |
Abstract | Convergence of Generative Adversarial Networks (GANs) in a high-resolution setting with a computational constraint of GPU memory capacity (from 12 GB to 24 GB) has been beset with difficulty due to the known lack of convergence rate stability. In order to boost network convergence of DCGAN (Deep Convolutional Generative Adversarial Networks) and achieve good-looking high-resolution results we propose a new layered network structure, HDCGAN, that incorporates current state-of-the-art techniques for this effect. A novel dataset, Curtó & Zarza, containing human faces from different ethnic groups in a wide variety of illumination conditions and image resolutions is introduced. Curtó is enhanced with HDCGAN synthetic images, thus being the first GAN-augmented face dataset. We conduct extensive experiments on CelebA (MS-SSIM 0.1978 and Fréchet Distance 8.77) and Curtó. |
Tasks | |
Published | 2017-11-17 |
URL | http://arxiv.org/abs/1711.06491v12 |
http://arxiv.org/pdf/1711.06491v12.pdf | |
PWC | https://paperswithcode.com/paper/high-resolution-deep-convolutional-generative |
Repo | https://github.com/curto2/cz |
Framework | none |
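The abstract does not spell out the HDCGAN layer arrangement, so the following is only a minimal DCGAN-style generator sketch in PyTorch; the layer sizes and the 64x64 output resolution are illustrative assumptions, not the paper's high-resolution configuration.

```python
import torch
import torch.nn as nn

# Minimal DCGAN-style generator sketch (not the exact HDCGAN layering).
# Latent vectors of size `nz` are upsampled to 64x64 RGB images through
# strided transposed convolutions, as in the original DCGAN recipe.
class Generator(nn.Module):
    def __init__(self, nz=100, ngf=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf), nn.ReLU(True),
            nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

z = torch.randn(8, 100, 1, 1)
fake = Generator()(z)          # shape: (8, 3, 64, 64)
```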
Probabilistic programs for inferring the goals of autonomous agents
Title | Probabilistic programs for inferring the goals of autonomous agents |
Authors | Marco F. Cusumano-Towner, Alexey Radul, David Wingate, Vikash K. Mansinghka |
Abstract | Intelligent systems sometimes need to infer the probable goals of people, cars, and robots, based on partial observations of their motion. This paper introduces a class of probabilistic programs for formulating and solving these problems. The formulation uses randomized path planning algorithms as the basis for probabilistic models of the process by which autonomous agents plan to achieve their goals. Because these path planning algorithms do not have tractable likelihood functions, new inference algorithms are needed. This paper proposes two Monte Carlo techniques for these “likelihood-free” models, one of which can use likelihood estimates from neural networks to accelerate inference. The paper demonstrates efficacy on three simple examples, each using under 50 lines of probabilistic code. |
Tasks | |
Published | 2017-04-17 |
URL | http://arxiv.org/abs/1704.04977v2 |
http://arxiv.org/pdf/1704.04977v2.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-programs-for-inferring-the |
Repo | https://github.com/cimat-ris/GoalInferenceGAN |
Framework | none |
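As a rough illustration of the "likelihood-free" setting, the sketch below weights candidate goals by how closely paths from a noisy toy planner match a partial observation. The planner, kernel bandwidth, and goal set are made-up stand-ins; this is not the paper's probabilistic programs or its Monte Carlo algorithms.

```python
import numpy as np

# Conceptual likelihood-free goal inference sketch (not the paper's exact
# algorithm). A stochastic "planner" simulates a path toward a candidate
# goal; goals whose simulated paths stay close to the partial observation
# receive more posterior weight (ABC-style importance weighting).
def simulate_path(start, goal, steps=20, noise=0.1, rng=None):
    rng = rng or np.random.default_rng()
    path, pos = [np.asarray(start, dtype=float)], np.asarray(start, dtype=float)
    for _ in range(steps):
        pos = pos + 0.1 * (np.asarray(goal) - pos) + noise * rng.normal(size=2)
        path.append(pos.copy())
    return np.array(path)

def infer_goal(observed, candidate_goals, n_sims=200, bandwidth=0.5):
    rng = np.random.default_rng(0)
    weights = []
    for goal in candidate_goals:
        dists = [np.linalg.norm(
                     simulate_path(observed[0], goal, rng=rng)[:len(observed)] - observed)
                 for _ in range(n_sims)]
        # A kernelized distance acts as a likelihood surrogate.
        weights.append(np.mean(np.exp(-np.square(dists) / bandwidth**2)))
    weights = np.array(weights)
    return weights / weights.sum()

observed = np.array([[0.0, 0.0], [0.2, 0.1], [0.4, 0.2]])
goals = [(2.0, 1.0), (0.0, 2.0), (-1.0, -1.0)]
print(infer_goal(observed, goals))   # posterior over candidate goals
```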
Receptive Field Block Net for Accurate and Fast Object Detection
Title | Receptive Field Block Net for Accurate and Fast Object Detection |
Authors | Songtao Liu, Di Huang, Yunhong Wang |
Abstract | Current top-performing object detectors depend on deep CNN backbones, such as ResNet-101 and Inception, benefiting from their powerful feature representations but suffering from high computational costs. Conversely, some detectors based on lightweight models achieve real-time processing, but their accuracy is often criticized. In this paper, we explore an alternative to build a fast and accurate detector by strengthening lightweight features using a hand-crafted mechanism. Inspired by the structure of Receptive Fields (RFs) in human visual systems, we propose a novel RF Block (RFB) module, which takes the relationship between the size and eccentricity of RFs into account, to enhance the feature discriminability and robustness. We further assemble RFB on top of SSD, constructing the RFB Net detector. To evaluate its effectiveness, experiments are conducted on two major benchmarks and the results show that RFB Net is able to reach the performance of advanced very deep detectors while keeping real-time speed. Code is available at https://github.com/ruinmessi/RFBNet. |
Tasks | Object Detection, Real-Time Object Detection |
Published | 2017-11-21 |
URL | http://arxiv.org/abs/1711.07767v3 |
http://arxiv.org/pdf/1711.07767v3.pdf | |
PWC | https://paperswithcode.com/paper/receptive-field-block-net-for-accurate-and |
Repo | https://github.com/ZTao-z/multiflow-resnet-ssd |
Framework | pytorch |
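The RFB design pairs kernel size with dilation rate to mimic receptive-field eccentricity. The PyTorch sketch below imitates that pairing in a simplified multi-branch block; the branch layout, channel splits, and shortcut are illustrative assumptions, not the authors' exact module from the linked repository.

```python
import torch
import torch.nn as nn

# Simplified multi-branch block in the spirit of RFB: each branch pairs a
# different kernel size with a matching dilation rate, and the branches are
# concatenated and fused with a residual shortcut.
class SimpleRFB(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        branch_ch = out_ch // 4

        def branch(k, d):
            pad = d * (k // 2)
            return nn.Sequential(
                nn.Conv2d(in_ch, branch_ch, 1, bias=False),
                nn.Conv2d(branch_ch, branch_ch, k, padding=pad, dilation=d, bias=False),
                nn.BatchNorm2d(branch_ch), nn.ReLU(inplace=True),
            )

        self.branches = nn.ModuleList([branch(1, 1), branch(3, 1), branch(3, 3), branch(5, 5)])
        self.fuse = nn.Conv2d(branch_ch * 4, out_ch, 1)
        self.shortcut = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        out = torch.cat([b(x) for b in self.branches], dim=1)
        return torch.relu(self.fuse(out) + self.shortcut(x))

x = torch.randn(1, 64, 38, 38)
print(SimpleRFB(64, 256)(x).shape)   # torch.Size([1, 256, 38, 38])
```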
Cost-complexity pruning of random forests
Title | Cost-complexity pruning of random forests |
Authors | Kiran Bangalore Ravi, Jean Serra |
Abstract | Random forests perform bootstrap-aggregation by sampling the training samples with replacement. This enables the evaluation of the out-of-bag error, which serves as an internal cross-validation mechanism. Our motivation lies in using the unsampled training samples to improve each decision tree in the ensemble. We study the effect of using the out-of-bag samples to improve the generalization error, first of the decision trees and second of the random forest, by post-pruning. A preliminary empirical study on four UCI repository datasets shows a consistent decrease in the size of the forests without considerable loss in accuracy. |
Tasks | |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.05430v2 |
http://arxiv.org/pdf/1703.05430v2.pdf | |
PWC | https://paperswithcode.com/paper/cost-complexity-pruning-of-random-forests |
Repo | https://github.com/beedotkiran/randomforestpruning-ismm-2017 |
Framework | none |
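A minimal scikit-learn sketch of the general idea: use the built-in minimal cost-complexity pruning strength (`ccp_alpha`) and the out-of-bag score to trade forest size against accuracy. This stands in for, and is not identical to, the authors' OOB-based post-pruning procedure; the dataset and alpha grid are arbitrary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# OOB-guided choice of a cost-complexity pruning strength for a forest.
X, y = load_breast_cancer(return_X_y=True)

for alpha in [0.0, 0.001, 0.005, 0.01, 0.02]:
    rf = RandomForestClassifier(n_estimators=100, oob_score=True,
                                ccp_alpha=alpha, random_state=0).fit(X, y)
    n_nodes = sum(t.tree_.node_count for t in rf.estimators_)
    print(f"alpha={alpha:<6} oob={rf.oob_score_:.3f} total_nodes={n_nodes}")

# Pick the largest alpha (smallest forest) whose OOB score stays close to
# the unpruned forest's score.
```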
Borrowing Treasures from the Wealthy: Deep Transfer Learning through Selective Joint Fine-tuning
Title | Borrowing Treasures from the Wealthy: Deep Transfer Learning through Selective Joint Fine-tuning |
Authors | Weifeng Ge, Yizhou Yu |
Abstract | Deep neural networks require a large amount of labeled training data during supervised learning. However, collecting and labeling so much data might be infeasible in many cases. In this paper, we introduce a source-target selective joint fine-tuning scheme for improving the performance of deep learning tasks with insufficient training data. In this scheme, a target learning task with insufficient training data is carried out simultaneously with another source learning task with abundant training data. However, the source learning task does not use all existing training data. Our core idea is to identify and use a subset of training images from the original source learning task whose low-level characteristics are similar to those from the target learning task, and jointly fine-tune shared convolutional layers for both tasks. Specifically, we compute descriptors from linear or nonlinear filter bank responses on training images from both tasks, and use such descriptors to search for a desired subset of training samples for the source learning task. Experiments demonstrate that our selective joint fine-tuning scheme achieves state-of-the-art performance on multiple visual classification tasks with insufficient training data for deep learning. Such tasks include Caltech 256, MIT Indoor 67, Oxford Flowers 102 and Stanford Dogs 120. In comparison to fine-tuning without a source domain, the proposed method can improve the classification accuracy by 2% - 10% using a single model. |
Tasks | Transfer Learning |
Published | 2017-02-28 |
URL | http://arxiv.org/abs/1702.08690v2 |
http://arxiv.org/pdf/1702.08690v2.pdf | |
PWC | https://paperswithcode.com/paper/borrowing-treasures-from-the-wealthy-deep |
Repo | https://github.com/ZYYSzj/Selective-Joint-Fine-tuning |
Framework | none |
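A hedged sketch of the source-sample selection step: describe each image by a histogram of low-level filter responses and keep the source images nearest to the target-domain images. The descriptor below (gradient magnitudes of Gaussian-smoothed grayscale images) is a stand-in for the paper's filter-bank features, and all sizes are toy values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.neighbors import NearestNeighbors

# Select source images whose low-level statistics resemble the target set.
def low_level_descriptor(img, bins=32):
    g = gaussian_filter(img.astype(float), sigma=1.0)
    gy, gx = np.gradient(g)
    mag = np.hypot(gx, gy)
    hist, _ = np.histogram(mag, bins=bins, range=(0, mag.max() + 1e-8), density=True)
    return hist

def select_source_subset(source_imgs, target_imgs, k=50):
    src = np.stack([low_level_descriptor(im) for im in source_imgs])
    tgt = np.stack([low_level_descriptor(im) for im in target_imgs])
    nn = NearestNeighbors(n_neighbors=min(k, len(src))).fit(src)
    _, idx = nn.kneighbors(tgt)
    return np.unique(idx.ravel())       # indices of retained source images

rng = np.random.default_rng(0)
source = rng.random((200, 64, 64))      # placeholder grayscale images
target = rng.random((20, 64, 64))
print(len(select_source_subset(source, target, k=10)))
```

The retained subset would then be mixed with the target data for joint fine-tuning of the shared convolutional layers.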
Adversary Detection in Neural Networks via Persistent Homology
Title | Adversary Detection in Neural Networks via Persistent Homology |
Authors | Thomas Gebhart, Paul Schrater |
Abstract | We outline a detection method for adversarial inputs to deep neural networks. By viewing neural network computations as graphs upon which information flows from input space to output distribution, we compare the differences in graphs induced by different inputs. Specifically, by applying persistent homology to these induced graphs, we observe that the structure of the most persistent subgraphs which generate the first homology group differs between adversarial and unperturbed inputs. Based on this observation, we build a detection algorithm that depends only on the topological information extracted during training. We test our algorithm on MNIST and achieve 98% adversary detection accuracy with an F1-score of 0.98. |
Tasks | |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10056v1 |
http://arxiv.org/pdf/1711.10056v1.pdf | |
PWC | https://paperswithcode.com/paper/adversary-detection-in-neural-networks-via |
Repo | https://github.com/tgebhart/tf_activation |
Framework | tf |
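A toy sketch of the "induced graph" construction only: for a given input, each edge of a small fully connected network is weighted by |activation × weight|. A persistent-homology library (e.g. ripser or gudhi) would then be applied to a filtration of this graph; that step is omitted here, and the sizes and weights are random stand-ins rather than the paper's trained MNIST network.

```python
import numpy as np
import networkx as nx

# Build the input-induced weighted computation graph of a tiny two-layer net.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(20, 16))   # input -> hidden weights (toy)
W2 = rng.normal(size=(16, 5))    # hidden -> output weights (toy)

def induced_graph(x):
    h = np.maximum(x @ W1, 0)                            # ReLU activations
    G = nx.Graph()
    for (i, j), w in np.ndenumerate(W1):
        G.add_edge(("in", i), ("hid", j), weight=abs(x[i] * w))
    for (i, j), w in np.ndenumerate(W2):
        G.add_edge(("hid", i), ("out", j), weight=abs(h[i] * w))
    return G

G = induced_graph(rng.random(20))
print(G.number_of_nodes(), G.number_of_edges())          # 41 nodes, 400 edges
```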
Foolbox: A Python toolbox to benchmark the robustness of machine learning models
Title | Foolbox: A Python toolbox to benchmark the robustness of machine learning models |
Authors | Jonas Rauber, Wieland Brendel, Matthias Bethge |
Abstract | Even today's most advanced machine learning models are easily fooled by almost imperceptible perturbations of their inputs. Foolbox is a new Python package to generate such adversarial perturbations and to quantify and compare the robustness of machine learning models. It is built around the idea that the most comparable robustness measure is the minimum perturbation needed to craft an adversarial example. To this end, Foolbox provides reference implementations of most published adversarial attack methods alongside some new ones, all of which perform internal hyperparameter tuning to find the minimum adversarial perturbation. Additionally, Foolbox interfaces with most popular deep learning frameworks such as PyTorch, Keras, TensorFlow, Theano and MXNet and allows different adversarial criteria such as targeted misclassification and top-k misclassification as well as different distance measures. The code is licensed under the MIT license and is openly available at https://github.com/bethgelab/foolbox . The most up-to-date documentation can be found at http://foolbox.readthedocs.io . |
Tasks | Adversarial Attack |
Published | 2017-07-13 |
URL | http://arxiv.org/abs/1707.04131v3 |
http://arxiv.org/pdf/1707.04131v3.pdf | |
PWC | https://paperswithcode.com/paper/foolbox-a-python-toolbox-to-benchmark-the |
Repo | https://github.com/bethgelab/foolbox |
Framework | tf |
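The snippet below follows the newer Foolbox 3.x quickstart (PyTorch model wrapper plus an L-infinity PGD attack). Foolbox's interface has changed since the 2017 release this paper describes, so the exact names may differ for older versions.

```python
import torchvision.models as models
import foolbox as fb

# Wrap a pretrained ImageNet classifier and run an L-inf PGD attack on a
# small batch of bundled sample images (Foolbox 3.x style API).
model = models.resnet18(pretrained=True).eval()
preprocessing = dict(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], axis=-3)
fmodel = fb.PyTorchModel(model, bounds=(0, 1), preprocessing=preprocessing)

images, labels = fb.utils.samples(fmodel, dataset="imagenet", batchsize=8)
attack = fb.attacks.LinfPGD()
raw, clipped, is_adv = attack(fmodel, images, labels, epsilons=0.03)
print(is_adv.float().mean())        # fraction of successful adversarial examples
```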
Translation-based Recommendation
Title | Translation-based Recommendation |
Authors | Ruining He, Wang-Cheng Kang, Julian McAuley |
Abstract | Modeling the complex interactions between users and items as well as amongst items themselves is at the core of designing successful recommender systems. One classical setting is predicting users’ personalized sequential behavior (or ‘next-item’ recommendation), where the challenges mainly lie in modeling ‘third-order’ interactions between a user, her previously visited item(s), and the next item to consume. Existing methods typically decompose these higher-order interactions into a combination of pairwise relationships, by way of which user preferences (user-item interactions) and sequential patterns (item-item interactions) are captured by separate components. In this paper, we propose a unified method, TransRec, to model such third-order relationships for large-scale sequential prediction. Methodologically, we embed items into a ‘transition space’ where users are modeled as translation vectors operating on item sequences. Empirically, this approach outperforms the state-of-the-art on a wide spectrum of real-world datasets. Data and code are available at https://sites.google.com/a/eng.ucsd.edu/ruining-he/. |
Tasks | Recommendation Systems |
Published | 2017-07-08 |
URL | https://arxiv.org/abs/1707.02410v1 |
https://arxiv.org/pdf/1707.02410v1.pdf | |
PWC | https://paperswithcode.com/paper/translation-based-recommendation |
Repo | https://github.com/YifanZhou95/Translation-based-Recommendation |
Framework | none |
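A minimal TransRec-style scoring sketch, illustrative rather than the released code: items live in a shared transition space, each user adds a translation vector to the previous item's embedding, and the next item is scored by proximity plus an item bias. All parameters below are random stand-ins, and BPR-style training is omitted.

```python
import numpy as np

# Score candidate next items as bias_j - || gamma_prev + t_global + t_u - gamma_j ||.
rng = np.random.default_rng(0)
n_users, n_items, dim = 100, 500, 16
item_emb = rng.normal(scale=0.1, size=(n_items, dim))
user_trans = rng.normal(scale=0.1, size=(n_users, dim))
global_trans = rng.normal(scale=0.1, size=dim)   # shared translation component
item_bias = np.zeros(n_items)

def score_next(u, prev_item):
    pred = item_emb[prev_item] + global_trans + user_trans[u]
    dists = np.linalg.norm(item_emb - pred, axis=1)
    return item_bias - dists          # higher is better

u, prev = 7, 42
top5 = np.argsort(-score_next(u, prev))[:5]
print(top5)                            # indices of the 5 most likely next items
```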
Adversarial Examples: Attacks and Defenses for Deep Learning
Title | Adversarial Examples: Attacks and Defenses for Deep Learning |
Authors | Xiaoyong Yuan, Pan He, Qile Zhu, Xiaolin Li |
Abstract | With rapid progress and significant successes in a wide spectrum of applications, deep learning is being applied in many safety-critical environments. However, deep neural networks have recently been found vulnerable to well-designed input samples, called adversarial examples. Adversarial examples are imperceptible to humans but can easily fool deep neural networks in the testing/deploying stage. The vulnerability to adversarial examples has become one of the major risks for applying deep neural networks in safety-critical environments. Therefore, attacks and defenses on adversarial examples draw great attention. In this paper, we review recent findings on adversarial examples for deep neural networks, summarize the methods for generating adversarial examples, and propose a taxonomy of these methods. Under the taxonomy, applications for adversarial examples are investigated. We further elaborate on countermeasures for adversarial examples and explore the challenges and the potential solutions. |
Tasks | |
Published | 2017-12-19 |
URL | http://arxiv.org/abs/1712.07107v3 |
http://arxiv.org/pdf/1712.07107v3.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-examples-attacks-and-defenses-for |
Repo | https://github.com/revbucket/mister_ed |
Framework | pytorch |
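One of the generation methods such surveys cover is the Fast Gradient Sign Method (FGSM), shown below as a minimal PyTorch sketch for an arbitrary classifier. The tiny model and random inputs are placeholders, not anything from the paper or the linked repository.

```python
import torch
import torch.nn.functional as F

# FGSM: perturb the input by eps in the direction of the sign of the loss gradient.
def fgsm(model, x, y, eps=0.03):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()

# Example with a tiny throwaway classifier on random "images".
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(4, 1, 28, 28)
y = torch.randint(0, 10, (4,))
x_adv = fgsm(model, x, y)
print((x_adv - x).abs().max())             # perturbation is bounded by eps
```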
Analyzing Cloud Optical Properties Using Sky Cameras
Title | Analyzing Cloud Optical Properties Using Sky Cameras |
Authors | Shilpa Manandhar, Soumyabrata Dev, Yee Hui Lee, Yu Song Meng |
Abstract | Clouds play a significant role in the fluctuation of solar radiation received by the earth’s surface. It is important to study the various cloud properties, as they impact the total solar irradiance falling on the earth’s surface. One such important optical property of the cloud is the Cloud Optical Thickness (COT). It is defined by the amount of light that can pass through the clouds. The COT values are generally obtained from satellite images. However, satellite images have low temporal and spatial resolutions, and are not suitable for applications such as solar energy generation and forecasting. Therefore, ground-based sky cameras are now getting popular in such fields. In this paper, we analyze the cloud optical thickness value from ground-based sky cameras, and provide future research directions. |
Tasks | |
Published | 2017-08-24 |
URL | http://arxiv.org/abs/1708.08995v1 |
http://arxiv.org/pdf/1708.08995v1.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-cloud-optical-properties-using-sky |
Repo | https://github.com/Soumyabrata/cloud-optical-thickness |
Framework | none |
Monaural Audio Speaker Separation with Source Contrastive Estimation
Title | Monaural Audio Speaker Separation with Source Contrastive Estimation |
Authors | Cory Stephenson, Patrick Callier, Abhinav Ganesh, Karl Ni |
Abstract | We propose an algorithm to separate simultaneously speaking persons from each other, the “cocktail party problem”, using a single microphone. Our approach involves a deep recurrent neural network regression to a vector space that is descriptive of independent speakers. Such a vector space can embed empirically determined speaker characteristics and is optimized by distinguishing between speaker masks. We call this technique source-contrastive estimation. The methodology is inspired by negative sampling, which has seen success in natural language processing, where an embedding is learned by correlating and de-correlating a given input vector with output weights. Although the matrix determined by the output weights is dependent on a set of known speakers, we only use the input vectors during inference. Doing so will ensure that source separation is explicitly speaker-independent. Our approach is similar to recent deep neural network clustering and permutation-invariant training research; we use weighted spectral features and masks to augment individual speaker frequencies while filtering out other speakers. We avoid, however, the severe computational burden of other approaches with our technique. Furthermore, by training a vector space rather than combinations of different speakers or differences thereof, we avoid the so-called permutation problem during training. Our algorithm offers an intuitive, computationally efficient response to the cocktail party problem, and most importantly boasts better empirical performance than other current techniques. |
Tasks | Speaker Separation |
Published | 2017-05-12 |
URL | http://arxiv.org/abs/1705.04662v1 |
http://arxiv.org/pdf/1705.04662v1.pdf | |
PWC | https://paperswithcode.com/paper/monaural-audio-speaker-separation-with-source |
Repo | https://github.com/lab41/magnolia |
Framework | none |
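A hedged PyTorch sketch of a source-contrastive / negative-sampling style objective for time-frequency bin embeddings: each bin's embedding is pulled toward the output weight vector of the speaker that dominates that bin and pushed away from the other speakers' vectors. Shapes, names, and details are illustrative only, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def source_contrastive_loss(embeddings, speaker_weights, dominant_speaker):
    # embeddings:        (batch, bins, d)   embedding per time-frequency bin
    # speaker_weights:   (n_speakers, d)    output weight vector per speaker
    # dominant_speaker:  (batch, bins)      index of the loudest speaker per bin
    logits = embeddings @ speaker_weights.t()              # (batch, bins, n_speakers)
    pos = torch.gather(logits, 2, dominant_speaker.unsqueeze(-1)).squeeze(-1)
    loss_pos = F.logsigmoid(pos)
    # Treat every other speaker as a negative sample.
    neg_mask = torch.ones_like(logits).scatter_(2, dominant_speaker.unsqueeze(-1), 0.0)
    loss_neg = (F.logsigmoid(-logits) * neg_mask).sum(-1)
    return -(loss_pos + loss_neg).mean()

emb = torch.randn(2, 100, 40)
spk = torch.randn(3, 40)
dom = torch.randint(0, 3, (2, 100))
print(source_contrastive_loss(emb, spk, dom))
```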
Neural Vector Spaces for Unsupervised Information Retrieval
Title | Neural Vector Spaces for Unsupervised Information Retrieval |
Authors | Christophe Van Gysel, Maarten de Rijke, Evangelos Kanoulas |
Abstract | We propose the Neural Vector Space Model (NVSM), a method that learns representations of documents in an unsupervised manner for news article retrieval. In the NVSM paradigm, we learn low-dimensional representations of words and documents from scratch using gradient descent and rank documents according to their similarity with query representations that are composed from word representations. We show that NVSM performs better at document ranking than existing latent semantic vector space methods. The addition of NVSM to a mixture of lexical language models and a state-of-the-art baseline vector space model yields a statistically significant increase in retrieval effectiveness. Consequently, NVSM adds a complementary relevance signal. Next to semantic matching, we find that NVSM performs well in cases where lexical matching is needed. NVSM learns a notion of term specificity directly from the document collection without feature engineering. We also show that NVSM learns regularities related to Luhn significance. Finally, we give advice on how to deploy NVSM in situations where model selection (e.g., cross-validation) is infeasible. We find that an unsupervised ensemble of multiple models trained with different hyperparameter values performs better than a single cross-validated model. Therefore, NVSM can safely be used for ranking documents without supervised relevance judgments. |
Tasks | Document Ranking, Feature Engineering, Information Retrieval, Model Selection |
Published | 2017-08-09 |
URL | http://arxiv.org/abs/1708.02702v4 |
http://arxiv.org/pdf/1708.02702v4.pdf | |
PWC | https://paperswithcode.com/paper/neural-vector-spaces-for-unsupervised |
Repo | https://github.com/rodgzilla/NVSM_pytorch |
Framework | pytorch |
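A conceptual NVSM-style ranking sketch: documents have learned low-dimensional representations, a query is composed by averaging its word representations and projecting into document space, and documents are ranked by cosine similarity. The parameters below are random stand-ins for learned ones, and the training objective is omitted.

```python
import numpy as np

# Rank documents against a query composed from word representations.
rng = np.random.default_rng(0)
vocab = {"neural": 0, "vector": 1, "space": 2, "retrieval": 3}
word_emb = rng.normal(size=(len(vocab), 64))       # word space
doc_emb = rng.normal(size=(1000, 128))             # document space
proj = rng.normal(size=(64, 128))                  # word-to-document projection

def rank(query, top_k=10):
    q = np.mean([word_emb[vocab[w]] for w in query if w in vocab], axis=0) @ proj
    sims = doc_emb @ q / (np.linalg.norm(doc_emb, axis=1) * np.linalg.norm(q) + 1e-9)
    return np.argsort(-sims)[:top_k]

print(rank(["neural", "retrieval"]))               # indices of top-ranked documents
```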
RNN-based counterfactual prediction
Title | RNN-based counterfactual prediction |
Authors | Jason Poulos |
Abstract | This paper proposes an alternative to the synthetic control method (SCM) for estimating the effect of a policy intervention on an outcome over time. Recurrent neural networks (RNNs) are used to predict the counterfactual outcomes of treated units using only the outcomes of control units as predictors. This approach is less susceptible to $p$-hacking because it does not require the researcher to choose predictors or pre-intervention covariates to construct the synthetic control. RNNs do not assume a functional form, can learn nonconvex combinations of control units, and are specifically structured to exploit temporal dependencies in sequential data. I apply the approach to the problem of estimating the long-run impacts of U.S. homestead policy on public school spending. |
Tasks | Causal Inference, Counterfactual Inference, Time Series, Time Series Prediction |
Published | 2017-12-10 |
URL | http://arxiv.org/abs/1712.03553v5 |
http://arxiv.org/pdf/1712.03553v5.pdf | |
PWC | https://paperswithcode.com/paper/rnn-based-counterfactual-time-series |
Repo | https://github.com/jvpoulos/rnns-causal |
Framework | tf |
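A sketch of the counterfactual-prediction setup as described in the abstract (an illustrative reading, not the paper's exact architecture): an LSTM maps the control units' outcomes at each time step to the treated unit's outcome, is fit on the pre-intervention period only, and then extrapolates the counterfactual over the post-intervention period. The synthetic data and layer sizes are assumptions.

```python
import numpy as np
import torch
import torch.nn as nn

T_pre, T_post, n_controls = 80, 20, 30
rng = np.random.default_rng(0)
controls = torch.tensor(rng.normal(size=(1, T_pre + T_post, n_controls)), dtype=torch.float32)
treated = controls.mean(dim=2, keepdim=True) + 0.1 * torch.randn(1, T_pre + T_post, 1)

class Seq2One(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(n_controls, 32, batch_first=True)
        self.head = nn.Linear(32, 1)
    def forward(self, x):
        out, _ = self.rnn(x)
        return self.head(out)

model = Seq2One()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):                      # fit on the pre-treatment window only
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(controls[:, :T_pre]), treated[:, :T_pre])
    loss.backward()
    opt.step()

with torch.no_grad():
    counterfactual = model(controls)[:, T_pre:]          # predicted untreated outcomes
effect = treated[:, T_pre:] - counterfactual             # estimated treatment effect
print(effect.mean().item())
```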
SketchParse : Towards Rich Descriptions for Poorly Drawn Sketches using Multi-Task Hierarchical Deep Networks
Title | SketchParse : Towards Rich Descriptions for Poorly Drawn Sketches using Multi-Task Hierarchical Deep Networks |
Authors | Ravi Kiran Sarvadevabhatla, Isht Dwivedi, Abhijat Biswas, Sahil Manocha, R. Venkatesh Babu |
Abstract | The ability to semantically interpret hand-drawn line sketches, although very challenging, can pave the way for novel applications in multimedia. We propose SketchParse, the first deep-network architecture for fully automatic parsing of freehand object sketches. SketchParse is configured as a two-level fully convolutional network. The first level contains shared layers common to all object categories. The second level contains a number of expert sub-networks. Each expert specializes in parsing sketches from object categories which contain structurally similar parts. Effectively, the two-level configuration enables our architecture to scale up efficiently as additional categories are added. We introduce a router layer which (i) relays sketch features from shared layers to the correct expert and (ii) eliminates the need to manually specify the object category during inference. To bypass laborious part-level annotation, we sketchify photos from semantic object-part image datasets and use them for training. Our architecture also incorporates object pose prediction as a novel auxiliary task which boosts overall performance while providing supplementary information regarding the sketch. We demonstrate SketchParse’s abilities (i) on two challenging large-scale sketch datasets, (ii) in parsing unseen, semantically related object categories and (iii) in improving fine-grained sketch-based image retrieval. As a novel application, we also outline how SketchParse’s output can be used to generate caption-style descriptions for hand-drawn sketches. |
Tasks | Image Retrieval, Pose Prediction, Sketch-Based Image Retrieval |
Published | 2017-09-05 |
URL | http://arxiv.org/abs/1709.01295v1 |
http://arxiv.org/pdf/1709.01295v1.pdf | |
PWC | https://paperswithcode.com/paper/sketchparse-towards-rich-descriptions-for |
Repo | https://github.com/val-iisc/sketch-parse |
Framework | pytorch |
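A toy sketch of the two-level idea: shared convolutional layers feed a router that picks one expert sub-network per input, so inference needs no manually specified category. Layer sizes, the hard argmax routing, and the per-pixel part head are illustrative assumptions, not the released SketchParse architecture.

```python
import torch
import torch.nn as nn

class TwoLevelNet(nn.Module):
    def __init__(self, n_experts=5, n_parts=8):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.router = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_experts))
        self.experts = nn.ModuleList([nn.Conv2d(32, n_parts, 1) for _ in range(n_experts)])

    def forward(self, x):
        feats = self.shared(x)
        expert_id = self.router(feats).argmax(dim=1)         # hard routing per sample
        return torch.stack([self.experts[int(e)](f.unsqueeze(0)).squeeze(0)
                            for e, f in zip(expert_id, feats)])

out = TwoLevelNet()(torch.randn(4, 1, 128, 128))
print(out.shape)     # per-pixel part logits: (4, 8, 128, 128)
```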
Semi-Supervised Active Clustering with Weak Oracles
Title | Semi-Supervised Active Clustering with Weak Oracles |
Authors | Taewan Kim, Joydeep Ghosh |
Abstract | Semi-supervised active clustering (SSAC) utilizes the knowledge of a domain expert to cluster data points by interactively making pairwise “same-cluster” queries. However, it is impractical to ask human oracles to answer every pairwise query. In this paper, we study the influence of allowing “not-sure” answers from a weak oracle and propose algorithms to efficiently handle uncertainties. Different types of model assumptions are analyzed to cover realistic scenarios of oracle abstention. In the first model, random-weak oracle, an oracle randomly abstains with a certain probability. We also propose two distance-weak oracle models which simulate the case of getting confused based on the distance between two points in a pairwise query. For each weak oracle model, we show that a small query complexity is adequate for effective $k$-means clustering with high probability. Sufficient conditions for the guarantee include a $\gamma$-margin property of the data, and the existence of a point close to each cluster center. Furthermore, we provide a sample complexity with a reduced effect of the cluster’s margin and only a logarithmic dependency on the data dimension. Our results allow a significantly smaller number of same-cluster queries if the margin of the clusters is tight, i.e. $\gamma \approx 1$. Experimental results on synthetic data show the effective performance of our approach in overcoming uncertainties. |
Tasks | |
Published | 2017-09-11 |
URL | http://arxiv.org/abs/1709.03202v1 |
http://arxiv.org/pdf/1709.03202v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-active-clustering-with-weak |
Repo | https://github.com/twankim/weaksemi |
Framework | none |
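A minimal sketch of the "random-weak" same-cluster oracle analyzed above: it answers whether two points share a cluster but abstains ("not sure") with probability p. The clustering algorithm itself is not reproduced; this only illustrates the query interface, with made-up labels and parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_weak_oracle(true_labels, p_abstain=0.2):
    # Same-cluster oracle that randomly abstains with probability p_abstain.
    def oracle(i, j):
        if rng.random() < p_abstain:
            return None                       # "not-sure" answer
        return bool(true_labels[i] == true_labels[j])
    return oracle

labels = rng.integers(0, 3, size=100)         # hidden ground-truth clusters
oracle = make_weak_oracle(labels)

answers = [oracle(i, j) for i, j in rng.integers(0, 100, size=(20, 2))]
print(answers)          # mix of True / False / None (abstentions)
```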