Paper Group AWR 115
High-Resolution Deep Convolutional Generative Adversarial Networks. Probabilistic programs for inferring the goals of autonomous agents. Receptive Field Block Net for Accurate and Fast Object Detection. Cost-complexity pruning of random forests. Borrowing Treasures from the Wealthy: Deep Transfer Learning through Selective Joint Fine-tuning. Advers …
High-Resolution Deep Convolutional Generative Adversarial Networks
Title | High-Resolution Deep Convolutional Generative Adversarial Networks |
Authors | Joachim D. Curtó, Irene C. Zarza, Fernando De La Torre, Irwin King, Michael R. Lyu |
Abstract | Convergence of Generative Adversarial Networks (GANs) in a high-resolution setting with a computational constraint of GPU memory capacity (from 12 GB to 24 GB) has been beset with difficulty due to the known lack of convergence rate stability. In order to boost network convergence of DCGAN (Deep Convolutional Generative Adversarial Networks) and achieve good-looking high-resolution results we propose a new layered network structure, HDCGAN, that incorporates current state-of-the-art techniques for this effect. A novel dataset, Curtó & Zarza, containing human faces from different ethnic groups in a wide variety of illumination conditions and image resolutions is introduced. Curtó is enhanced with HDCGAN synthetic images, thus being the first GAN-augmented face dataset. We conduct extensive experiments on CelebA (MS-SSIM 0.1978 and Fréchet Distance 8.77) and Curtó. |
Tasks | |
Published | 2017-11-17 |
URL | http://arxiv.org/abs/1711.06491v12 |
http://arxiv.org/pdf/1711.06491v12.pdf | |
PWC | https://paperswithcode.com/paper/high-resolution-deep-convolutional-generative |
Repo | https://github.com/curto2/cz |
Framework | none |
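The abstract does not spell out the HDCGAN layer arrangement, so the following is only a minimal DCGAN-style generator sketch in PyTorch; the layer sizes and the 64x64 output resolution are illustrative assumptions, not the paper's high-resolution configuration.

```python
import torch
import torch.nn as nn

# Minimal DCGAN-style generator sketch (not the exact HDCGAN layering).
# Latent vectors of size `nz` are upsampled to 64x64 RGB images through
# strided transposed convolutions, as in the original DCGAN recipe.
class Generator(nn.Module):
    def __init__(self, nz=100, ngf=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf), nn.ReLU(True),
            nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

z = torch.randn(8, 100, 1, 1)
fake = Generator()(z)          # shape: (8, 3, 64, 64)
```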
Probabilistic programs for inferring the goals of autonomous agents
Title | Probabilistic programs for inferring the goals of autonomous agents |
Authors | Marco F. Cusumano-Towner, Alexey Radul, David Wingate, Vikash K. Mansinghka |
Abstract | Intelligent systems sometimes need to infer the probable goals of people, cars, and robots, based on partial observations of their motion. This paper introduces a class of probabilistic programs for formulating and solving these problems. The formulation uses randomized path planning algorithms as the basis for probabilistic models of the process by which autonomous agents plan to achieve their goals. Because these path planning algorithms do not have tractable likelihood functions, new inference algorithms are needed. This paper proposes two Monte Carlo techniques for these “likelihood-free” models, one of which can use likelihood estimates from neural networks to accelerate inference. The paper demonstrates efficacy on three simple examples, each using under 50 lines of probabilistic code. |
Tasks | |
Published | 2017-04-17 |
URL | http://arxiv.org/abs/1704.04977v2 |
http://arxiv.org/pdf/1704.04977v2.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-programs-for-inferring-the |
Repo | https://github.com/cimat-ris/GoalInferenceGAN |
Framework | none |
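As a rough illustration of the "likelihood-free" setting, the sketch below weights candidate goals by how closely paths from a noisy toy planner match a partial observation. The planner, kernel bandwidth, and goal set are made-up stand-ins; this is not the paper's probabilistic programs or its Monte Carlo algorithms.

```python
import numpy as np

# Conceptual likelihood-free goal inference sketch (not the paper's exact
# algorithm). A stochastic "planner" simulates a path toward a candidate
# goal; goals whose simulated paths stay close to the partial observation
# receive more posterior weight (ABC-style importance weighting).
def simulate_path(start, goal, steps=20, noise=0.1, rng=None):
    rng = rng or np.random.default_rng()
    path, pos = [np.asarray(start, dtype=float)], np.asarray(start, dtype=float)
    for _ in range(steps):
        pos = pos + 0.1 * (np.asarray(goal) - pos) + noise * rng.normal(size=2)
        path.append(pos.copy())
    return np.array(path)

def infer_goal(observed, candidate_goals, n_sims=200, bandwidth=0.5):
    rng = np.random.default_rng(0)
    weights = []
    for goal in candidate_goals:
        dists = [np.linalg.norm(
                     simulate_path(observed[0], goal, rng=rng)[:len(observed)] - observed)
                 for _ in range(n_sims)]
        # A kernelized distance acts as a likelihood surrogate.
        weights.append(np.mean(np.exp(-np.square(dists) / bandwidth**2)))
    weights = np.array(weights)
    return weights / weights.sum()

observed = np.array([[0.0, 0.0], [0.2, 0.1], [0.4, 0.2]])
goals = [(2.0, 1.0), (0.0, 2.0), (-1.0, -1.0)]
print(infer_goal(observed, goals))   # posterior over candidate goals
```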
Receptive Field Block Net for Accurate and Fast Object Detection
Title | Receptive Field Block Net for Accurate and Fast Object Detection |
Authors | Songtao Liu, Di Huang, Yunhong Wang |
Abstract | Current top-performing object detectors depend on deep CNN backbones, such as ResNet-101 and Inception, benefiting from their powerful feature representations but suffering from high computational costs. Conversely, some detectors based on lightweight models achieve real-time processing, but their accuracy is often criticized. In this paper, we explore an alternative to build a fast and accurate detector by strengthening lightweight features using a hand-crafted mechanism. Inspired by the structure of Receptive Fields (RFs) in human visual systems, we propose a novel RF Block (RFB) module, which takes the relationship between the size and eccentricity of RFs into account, to enhance the feature discriminability and robustness. We further assemble RFB on top of SSD, constructing the RFB Net detector. To evaluate its effectiveness, experiments are conducted on two major benchmarks and the results show that RFB Net is able to reach the performance of advanced very deep detectors while keeping real-time speed. Code is available at https://github.com/ruinmessi/RFBNet. |
Tasks | Object Detection, Real-Time Object Detection |
Published | 2017-11-21 |
URL | http://arxiv.org/abs/1711.07767v3 |
http://arxiv.org/pdf/1711.07767v3.pdf | |
PWC | https://paperswithcode.com/paper/receptive-field-block-net-for-accurate-and |
Repo | https://github.com/ZTao-z/multiflow-resnet-ssd |
Framework | pytorch |
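The RFB design pairs kernel size with dilation rate to mimic receptive-field eccentricity. The PyTorch sketch below imitates that pairing in a simplified multi-branch block; the branch layout, channel splits, and shortcut are illustrative assumptions, not the authors' exact module from the linked repository.

```python
import torch
import torch.nn as nn

# Simplified multi-branch block in the spirit of RFB: each branch pairs a
# different kernel size with a matching dilation rate, and the branches are
# concatenated and fused with a residual shortcut.
class SimpleRFB(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        branch_ch = out_ch // 4

        def branch(k, d):
            pad = d * (k // 2)
            return nn.Sequential(
                nn.Conv2d(in_ch, branch_ch, 1, bias=False),
                nn.Conv2d(branch_ch, branch_ch, k, padding=pad, dilation=d, bias=False),
                nn.BatchNorm2d(branch_ch), nn.ReLU(inplace=True),
            )

        self.branches = nn.ModuleList([branch(1, 1), branch(3, 1), branch(3, 3), branch(5, 5)])
        self.fuse = nn.Conv2d(branch_ch * 4, out_ch, 1)
        self.shortcut = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        out = torch.cat([b(x) for b in self.branches], dim=1)
        return torch.relu(self.fuse(out) + self.shortcut(x))

x = torch.randn(1, 64, 38, 38)
print(SimpleRFB(64, 256)(x).shape)   # torch.Size([1, 256, 38, 38])
```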
Cost-complexity pruning of random forests
Title | Cost-complexity pruning of random forests |
Authors | Kiran Bangalore Ravi, Jean Serra |
Abstract | Random forests perform bootstrap-aggregation by sampling the training samples with replacement. This enables the evaluation of the out-of-bag error, which serves as an internal cross-validation mechanism. Our motivation lies in using the unsampled training samples to improve each decision tree in the ensemble. We study the effect of using the out-of-bag samples to improve the generalization error, first of the decision trees and second of the random forest, by post-pruning. A preliminary empirical study on four UCI repository datasets shows a consistent decrease in the size of the forests without considerable loss in accuracy. |
Tasks | |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.05430v2 |
http://arxiv.org/pdf/1703.05430v2.pdf | |
PWC | https://paperswithcode.com/paper/cost-complexity-pruning-of-random-forests |
Repo | https://github.com/beedotkiran/randomforestpruning-ismm-2017 |
Framework | none |
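A minimal scikit-learn sketch of the general idea: use the built-in minimal cost-complexity pruning strength (`ccp_alpha`) and the out-of-bag score to trade forest size against accuracy. This stands in for, and is not identical to, the authors' OOB-based post-pruning procedure; the dataset and alpha grid are arbitrary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# OOB-guided choice of a cost-complexity pruning strength for a forest.
X, y = load_breast_cancer(return_X_y=True)

for alpha in [0.0, 0.001, 0.005, 0.01, 0.02]:
    rf = RandomForestClassifier(n_estimators=100, oob_score=True,
                                ccp_alpha=alpha, random_state=0).fit(X, y)
    n_nodes = sum(t.tree_.node_count for t in rf.estimators_)
    print(f"alpha={alpha:<6} oob={rf.oob_score_:.3f} total_nodes={n_nodes}")

# Pick the largest alpha (smallest forest) whose OOB score stays close to
# the unpruned forest's score.
```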
Borrowing Treasures from the Wealthy: Deep Transfer Learning through Selective Joint Fine-tuning
Title | Borrowing Treasures from the Wealthy: Deep Transfer Learning through Selective Joint Fine-tuning |
Authors | Weifeng Ge, Yizhou Yu |
Abstract | Deep neural networks require a large amount of labeled training data during supervised learning. However, collecting and labeling so much data might be infeasible in many cases. In this paper, we introduce a source-target selective joint fine-tuning scheme for improving the performance of deep learning tasks with insufficient training data. In this scheme, a target learning task with insufficient training data is carried out simultaneously with another source learning task with abundant training data. However, the source learning task does not use all existing training data. Our core idea is to identify and use a subset of training images from the original source learning task whose low-level characteristics are similar to those from the target learning task, and jointly fine-tune shared convolutional layers for both tasks. Specifically, we compute descriptors from linear or nonlinear filter bank responses on training images from both tasks, and use such descriptors to search for a desired subset of training samples for the source learning task. Experiments demonstrate that our selective joint fine-tuning scheme achieves state-of-the-art performance on multiple visual classification tasks with insufficient training data for deep learning. Such tasks include Caltech 256, MIT Indoor 67, Oxford Flowers 102 and Stanford Dogs 120. In comparison to fine-tuning without a source domain, the proposed method can improve the classification accuracy by 2% - 10% using a single model. |
Tasks | Transfer Learning |
Published | 2017-02-28 |
URL | http://arxiv.org/abs/1702.08690v2 |
http://arxiv.org/pdf/1702.08690v2.pdf | |
PWC | https://paperswithcode.com/paper/borrowing-treasures-from-the-wealthy-deep |
Repo | https://github.com/ZYYSzj/Selective-Joint-Fine-tuning |
Framework | none |
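A hedged sketch of the source-sample selection step: describe each image by a histogram of low-level filter responses and keep the source images nearest to the target-domain images. The descriptor below (gradient magnitudes of Gaussian-smoothed grayscale images) is a stand-in for the paper's filter-bank features, and all sizes are toy values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.neighbors import NearestNeighbors

# Select source images whose low-level statistics resemble the target set.
def low_level_descriptor(img, bins=32):
    g = gaussian_filter(img.astype(float), sigma=1.0)
    gy, gx = np.gradient(g)
    mag = np.hypot(gx, gy)
    hist, _ = np.histogram(mag, bins=bins, range=(0, mag.max() + 1e-8), density=True)
    return hist

def select_source_subset(source_imgs, target_imgs, k=50):
    src = np.stack([low_level_descriptor(im) for im in source_imgs])
    tgt = np.stack([low_level_descriptor(im) for im in target_imgs])
    nn = NearestNeighbors(n_neighbors=min(k, len(src))).fit(src)
    _, idx = nn.kneighbors(tgt)
    return np.unique(idx.ravel())       # indices of retained source images

rng = np.random.default_rng(0)
source = rng.random((200, 64, 64))      # placeholder grayscale images
target = rng.random((20, 64, 64))
print(len(select_source_subset(source, target, k=10)))
```

The retained subset would then be mixed with the target data for joint fine-tuning of the shared convolutional layers.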
Adversary Detection in Neural Networks via Persistent Homology
Title | Adversary Detection in Neural Networks via Persistent Homology |
Authors | Thomas Gebhart, Paul Schrater |
Abstract | We outline a detection method for adversarial inputs to deep neural networks. By viewing neural network computations as graphs upon which information flows from input space to output distribution, we compare the differences in graphs induced by different inputs. Specifically, by applying persistent homology to these induced graphs, we observe that the structure of the most persistent subgraphs which generate the first homology group differs between adversarial and unperturbed inputs. Based on this observation, we build a detection algorithm that depends only on the topological information extracted during training. We test our algorithm on MNIST and achieve 98% adversary detection accuracy with an F1-score of 0.98. |
Tasks | |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10056v1 |
http://arxiv.org/pdf/1711.10056v1.pdf | |
PWC | https://paperswithcode.com/paper/adversary-detection-in-neural-networks-via |
Repo | https://github.com/tgebhart/tf_activation |
Framework | tf |
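A toy sketch of the "induced graph" construction only: for a given input, each edge of a small fully connected network is weighted by |activation × weight|. A persistent-homology library (e.g. ripser or gudhi) would then be applied to a filtration of this graph; that step is omitted here, and the sizes and weights are random stand-ins rather than the paper's trained MNIST network.

```python
import numpy as np
import networkx as nx

# Build the input-induced weighted computation graph of a tiny two-layer net.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(20, 16))   # input -> hidden weights (toy)
W2 = rng.normal(size=(16, 5))    # hidden -> output weights (toy)

def induced_graph(x):
    h = np.maximum(x @ W1, 0)                            # ReLU activations
    G = nx.Graph()
    for (i, j), w in np.ndenumerate(W1):
        G.add_edge(("in", i), ("hid", j), weight=abs(x[i] * w))
    for (i, j), w in np.ndenumerate(W2):
        G.add_edge(("hid", i), ("out", j), weight=abs(h[i] * w))
    return G

G = induced_graph(rng.random(20))
print(G.number_of_nodes(), G.number_of_edges())          # 41 nodes, 400 edges
```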
Foolbox: A Python toolbox to benchmark the robustness of machine learning models
Title | Foolbox: A Python toolbox to benchmark the robustness of machine learning models |
Authors | Jonas Rauber, Wieland Brendel, Matthias Bethge |
Abstract | Even today's most advanced machine learning models are easily fooled by almost imperceptible perturbations of their inputs. Foolbox is a new Python package to generate such adversarial perturbations and to quantify and compare the robustness of machine learning models. It is built around the idea that the most comparable robustness measure is the minimum perturbation needed to craft an adversarial example. To this end, Foolbox provides reference implementations of most published adversarial attack methods alongside some new ones, all of which perform internal hyperparameter tuning to find the minimum adversarial perturbation. Additionally, Foolbox interfaces with most popular deep learning frameworks such as PyTorch, Keras, TensorFlow, Theano and MXNet and allows different adversarial criteria such as targeted misclassification and top-k misclassification as well as different distance measures. The code is licensed under the MIT license and is openly available at https://github.com/bethgelab/foolbox . The most up-to-date documentation can be found at http://foolbox.readthedocs.io . |
Tasks | Adversarial Attack |
Published | 2017-07-13 |
URL | http://arxiv.org/abs/1707.04131v3 |
http://arxiv.org/pdf/1707.04131v3.pdf | |
PWC | https://paperswithcode.com/paper/foolbox-a-python-toolbox-to-benchmark-the |
Repo | https://github.com/bethgelab/foolbox |
Framework | tf |
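The snippet below follows the newer Foolbox 3.x quickstart (PyTorch model wrapper plus an L-infinity PGD attack). Foolbox's interface has changed since the 2017 release this paper describes, so the exact names may differ for older versions.

```python
import torchvision.models as models
import foolbox as fb

# Wrap a pretrained ImageNet classifier and run an L-inf PGD attack on a
# small batch of bundled sample images (Foolbox 3.x style API).
model = models.resnet18(pretrained=True).eval()
preprocessing = dict(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], axis=-3)
fmodel = fb.PyTorchModel(model, bounds=(0, 1), preprocessing=preprocessing)

images, labels = fb.utils.samples(fmodel, dataset="imagenet", batchsize=8)
attack = fb.attacks.LinfPGD()
raw, clipped, is_adv = attack(fmodel, images, labels, epsilons=0.03)
print(is_adv.float().mean())        # fraction of successful adversarial examples
```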
Translation-based Recommendation
Title | Translation-based Recommendation |
Authors | Ruining He, Wang-Cheng Kang, Julian McAuley |
Abstract | Modeling the complex interactions between users and items as well as amongst items themselves is at the core of designing successful recommender systems. One classical setting is predicting users’ personalized sequential behavior (or ‘next-item’ recommendation), where the challenges mainly lie in modeling ‘third-order’ interactions between a user, her previously visited item(s), and the next item to consume. Existing methods typically decompose these higher-order interactions into a combination of pairwise relationships, by way of which user preferences (user-item interactions) and sequential patterns (item-item interactions) are captured by separate components. In this paper, we propose a unified method, TransRec, to model such third-order relationships for large-scale sequential prediction. Methodologically, we embed items into a ‘transition space’ where users are modeled as translation vectors operating on item sequences. Empirically, this approach outperforms the state-of-the-art on a wide spectrum of real-world datasets. Data and code are available at https://sites.google.com/a/eng.ucsd.edu/ruining-he/. |
Tasks | Recommendation Systems |
Published | 2017-07-08 |
URL | https://arxiv.org/abs/1707.02410v1 |
https://arxiv.org/pdf/1707.02410v1.pdf | |
PWC | https://paperswithcode.com/paper/translation-based-recommendation |
Repo | https://github.com/YifanZhou95/Translation-based-Recommendation |
Framework | none |
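A minimal TransRec-style scoring sketch, illustrative rather than the released code: items live in a shared transition space, each user adds a translation vector to the previous item's embedding, and the next item is scored by proximity plus an item bias. All parameters below are random stand-ins, and BPR-style training is omitted.

```python
import numpy as np

# Score candidate next items as bias_j - || gamma_prev + t_global + t_u - gamma_j ||.
rng = np.random.default_rng(0)
n_users, n_items, dim = 100, 500, 16
item_emb = rng.normal(scale=0.1, size=(n_items, dim))
user_trans = rng.normal(scale=0.1, size=(n_users, dim))
global_trans = rng.normal(scale=0.1, size=dim)   # shared translation component
item_bias = np.zeros(n_items)

def score_next(u, prev_item):
    pred = item_emb[prev_item] + global_trans + user_trans[u]
    dists = np.linalg.norm(item_emb - pred, axis=1)
    return item_bias - dists          # higher is better

u, prev = 7, 42
top5 = np.argsort(-score_next(u, prev))[:5]
print(top5)                            # indices of the 5 most likely next items
```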
Adversarial Examples: Attacks and Defenses for Deep Learning
Title | Adversarial Examples: Attacks and Defenses for Deep Learning |
Authors | Xiaoyong Yuan, Pan He, Qile Zhu, Xiaolin Li |
Abstract | With rapid progress and significant successes in a wide spectrum of applications, deep learning is being applied in many safety-critical environments. However, deep neural networks have recently been found vulnerable to well-designed input samples, called adversarial examples. Adversarial examples are imperceptible to humans but can easily fool deep neural networks in the testing/deploying stage. The vulnerability to adversarial examples has become one of the major risks for applying deep neural networks in safety-critical environments. Therefore, attacks and defenses on adversarial examples draw great attention. In this paper, we review recent findings on adversarial examples for deep neural networks, summarize the methods for generating adversarial examples, and propose a taxonomy of these methods. Under the taxonomy, applications for adversarial examples are investigated. We further elaborate on countermeasures for adversarial examples and explore the challenges and the potential solutions. |
Tasks | |
Published | 2017-12-19 |
URL | http://arxiv.org/abs/1712.07107v3 |
http://arxiv.org/pdf/1712.07107v3.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-examples-attacks-and-defenses-for |
Repo | https://github.com/revbucket/mister_ed |
Framework | pytorch |
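One of the generation methods such surveys cover is the Fast Gradient Sign Method (FGSM), shown below as a minimal PyTorch sketch for an arbitrary classifier. The tiny model and random inputs are placeholders, not anything from the paper or the linked repository.

```python
import torch
import torch.nn.functional as F

# FGSM: perturb the input by eps in the direction of the sign of the loss gradient.
def fgsm(model, x, y, eps=0.03):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()

# Example with a tiny throwaway classifier on random "images".
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(4, 1, 28, 28)
y = torch.randint(0, 10, (4,))
x_adv = fgsm(model, x, y)
print((x_adv - x).abs().max())             # perturbation is bounded by eps
```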
Analyzing Cloud Optical Properties Using Sky Cameras
Title | Analyzing Cloud Optical Properties Using Sky Cameras |
Authors | Shilpa Manandhar, Soumyabrata Dev, Yee Hui Lee, Yu Song Meng |
Abstract | Clouds play a significant role in the fluctuation of solar radiation received by the earth’s surface. It is important to study the various cloud properties, as they impact the total solar irradiance falling on the earth’s surface. One such important optical property of the cloud is the Cloud Optical Thickness (COT). It is defined by the amount of light that can pass through the clouds. The COT values are generally obtained from satellite images. However, satellite images have low temporal and spatial resolutions, and are not suitable for applications such as solar energy generation and forecasting. Therefore, ground-based sky cameras are now getting popular in such fields. In this paper, we analyze the cloud optical thickness value from ground-based sky cameras, and provide future research directions. |
Tasks | |
Published | 2017-08-24 |
URL | http://arxiv.org/abs/1708.08995v1 |
http://arxiv.org/pdf/1708.08995v1.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-cloud-optical-properties-using-sky |
Repo | https://github.com/Soumyabrata/cloud-optical-thickness |
Framework | none |
Monaural Audio Speaker Separation with Source Contrastive Estimation
Title | Monaural Audio Speaker Separation with Source Contrastive Estimation |
Authors | Cory Stephenson, Patrick Callier, Abhinav Ganesh, Karl Ni |
Abstract | We propose an algorithm to separate simultaneously speaking persons from each other, the “cocktail party problem”, using a single microphone. Our approach involves a deep recurrent neural network regression to a vector space that is descriptive of independent speakers. Such a vector space can embed empirically determined speaker characteristics and is optimized by distinguishing between speaker masks. We call this technique source-contrastive estimation. The methodology is inspired by negative sampling, which has seen success in natural language processing, where an embedding is learned by correlating and de-correlating a given input vector with output weights. Although the matrix determined by the output weights is dependent on a set of known speakers, we only use the input vectors during inference. Doing so will ensure that source separation is explicitly speaker-independent. Our approach is similar to recent deep neural network clustering and permutation-invariant training research; we use weighted spectral features and masks to augment individual speaker frequencies while filtering out other speakers. We avoid, however, the severe computational burden of other approaches with our technique. Furthermore, by training a vector space rather than combinations of different speakers or differences thereof, we avoid the so-called permutation problem during training. Our algorithm offers an intuitive, computationally efficient response to the cocktail party problem, and most importantly boasts better empirical performance than other current techniques. |
Tasks | Speaker Separation |
Published | 2017-05-12 |
URL | http://arxiv.org/abs/1705.04662v1 |
http://arxiv.org/pdf/1705.04662v1.pdf | |
PWC | https://paperswithcode.com/paper/monaural-audio-speaker-separation-with-source |
Repo | https://github.com/lab41/magnolia |
Framework | none |
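A hedged PyTorch sketch of a source-contrastive / negative-sampling style objective for time-frequency bin embeddings: each bin's embedding is pulled toward the output weight vector of the speaker that dominates that bin and pushed away from the other speakers' vectors. Shapes, names, and details are illustrative only, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def source_contrastive_loss(embeddings, speaker_weights, dominant_speaker):
    # embeddings:        (batch, bins, d)   embedding per time-frequency bin
    # speaker_weights:   (n_speakers, d)    output weight vector per speaker
    # dominant_speaker:  (batch, bins)      index of the loudest speaker per bin
    logits = embeddings @ speaker_weights.t()              # (batch, bins, n_speakers)
    pos = torch.gather(logits, 2, dominant_speaker.unsqueeze(-1)).squeeze(-1)
    loss_pos = F.logsigmoid(pos)
    # Treat every other speaker as a negative sample.
    neg_mask = torch.ones_like(logits).scatter_(2, dominant_speaker.unsqueeze(-1), 0.0)
    loss_neg = (F.logsigmoid(-logits) * neg_mask).sum(-1)
    return -(loss_pos + loss_neg).mean()

emb = torch.randn(2, 100, 40)
spk = torch.randn(3, 40)
dom = torch.randint(0, 3, (2, 100))
print(source_contrastive_loss(emb, spk, dom))
```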
Neural Vector Spaces for Unsupervised Information Retrieval
Title | Neural Vector Spaces for Unsupervised Information Retrieval |
Authors | Christophe Van Gysel, Maarten de Rijke, Evangelos Kanoulas |
Abstract | We propose the Neural Vector Space Model (NVSM), a method that learns representations of documents in an unsupervised manner for news article retrieval. In the NVSM paradigm, we learn low-dimensional representations of words and documents from scratch using gradient descent and rank documents according to their similarity with query representations that are composed from word representations. We show that NVSM performs better at document ranking than existing latent semantic vector space methods. The addition of NVSM to a mixture of lexical language models and a state-of-the-art baseline vector space model yields a statistically significant increase in retrieval effectiveness. Consequently, NVSM adds a complementary relevance signal. Next to semantic matching, we find that NVSM performs well in cases where lexical matching is needed. NVSM learns a notion of term specificity directly from the document collection without feature engineering. We also show that NVSM learns regularities related to Luhn significance. Finally, we give advice on how to deploy NVSM in situations where model selection (e.g., cross-validation) is infeasible. We find that an unsupervised ensemble of multiple models trained with different hyperparameter values performs better than a single cross-validated model. Therefore, NVSM can safely be used for ranking documents without supervised relevance judgments. |
Tasks | Document Ranking, Feature Engineering, Information Retrieval, Model Selection |
Published | 2017-08-09 |
URL | http://arxiv.org/abs/1708.02702v4 |
http://arxiv.org/pdf/1708.02702v4.pdf | |
PWC | https://paperswithcode.com/paper/neural-vector-spaces-for-unsupervised |
Repo | https://github.com/rodgzilla/NVSM_pytorch |
Framework | pytorch |
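A conceptual NVSM-style ranking sketch: documents have learned low-dimensional representations, a query is composed by averaging its word representations and projecting into document space, and documents are ranked by cosine similarity. The parameters below are random stand-ins for learned ones, and the training objective is omitted.

```python
import numpy as np

# Rank documents against a query composed from word representations.
rng = np.random.default_rng(0)
vocab = {"neural": 0, "vector": 1, "space": 2, "retrieval": 3}
word_emb = rng.normal(size=(len(vocab), 64))       # word space
doc_emb = rng.normal(size=(1000, 128))             # document space
proj = rng.normal(size=(64, 128))                  # word-to-document projection

def rank(query, top_k=10):
    q = np.mean([word_emb[vocab[w]] for w in query if w in vocab], axis=0) @ proj
    sims = doc_emb @ q / (np.linalg.norm(doc_emb, axis=1) * np.linalg.norm(q) + 1e-9)
    return np.argsort(-sims)[:top_k]

print(rank(["neural", "retrieval"]))               # indices of top-ranked documents
```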
RNN-based counterfactual prediction
Title | RNN-based counterfactual prediction |
Authors | Jason Poulos |
Abstract | This paper proposes an alternative to the synthetic control method (SCM) for estimating the effect of a policy intervention on an outcome over time. Recurrent neural networks (RNNs) are used to predict the counterfactual outcomes of treated units using only the outcomes of control units as predictors. This approach is less susceptible to $p$-hacking because it does not require the researcher to choose predictors or pre-intervention covariates to construct the synthetic control. RNNs do not assume a functional form, can learn nonconvex combinations of control units, and are specifically structured to exploit temporal dependencies in sequential data. I apply the approach to the problem of estimating the long-run impacts of U.S. homestead policy on public school spending. |
Tasks | Causal Inference, Counterfactual Inference, Time Series, Time Series Prediction |
Published | 2017-12-10 |
URL | http://arxiv.org/abs/1712.03553v5 |
http://arxiv.org/pdf/1712.03553v5.pdf | |
PWC | https://paperswithcode.com/paper/rnn-based-counterfactual-time-series |
Repo | https://github.com/jvpoulos/rnns-causal |
Framework | tf |
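A sketch of the counterfactual-prediction setup as described in the abstract (an illustrative reading, not the paper's exact architecture): an LSTM maps the control units' outcomes at each time step to the treated unit's outcome, is fit on the pre-intervention period only, and then extrapolates the counterfactual over the post-intervention period. The synthetic data and layer sizes are assumptions.

```python
import numpy as np
import torch
import torch.nn as nn

T_pre, T_post, n_controls = 80, 20, 30
rng = np.random.default_rng(0)
controls = torch.tensor(rng.normal(size=(1, T_pre + T_post, n_controls)), dtype=torch.float32)
treated = controls.mean(dim=2, keepdim=True) + 0.1 * torch.randn(1, T_pre + T_post, 1)

class Seq2One(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(n_controls, 32, batch_first=True)
        self.head = nn.Linear(32, 1)
    def forward(self, x):
        out, _ = self.rnn(x)
        return self.head(out)

model = Seq2One()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):                      # fit on the pre-treatment window only
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(controls[:, :T_pre]), treated[:, :T_pre])
    loss.backward()
    opt.step()

with torch.no_grad():
    counterfactual = model(controls)[:, T_pre:]          # predicted untreated outcomes
effect = treated[:, T_pre:] - counterfactual             # estimated treatment effect
print(effect.mean().item())
```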
SketchParse : Towards Rich Descriptions for Poorly Drawn Sketches using Multi-Task Hierarchical Deep Networks
Title | SketchParse : Towards Rich Descriptions for Poorly Drawn Sketches using Multi-Task Hierarchical Deep Networks |
Authors | Ravi Kiran Sarvadevabhatla, Isht Dwivedi, Abhijat Biswas, Sahil Manocha, R. Venkatesh Babu |
Abstract | The ability to semantically interpret hand-drawn line sketches, although very challenging, can pave the way for novel applications in multimedia. We propose SketchParse, the first deep-network architecture for fully automatic parsing of freehand object sketches. SketchParse is configured as a two-level fully convolutional network. The first level contains shared layers common to all object categories. The second level contains a number of expert sub-networks. Each expert specializes in parsing sketches from object categories which contain structurally similar parts. Effectively, the two-level configuration enables our architecture to scale up efficiently as additional categories are added. We introduce a router layer which (i) relays sketch features from shared layers to the correct expert and (ii) eliminates the need to manually specify the object category during inference. To bypass laborious part-level annotation, we sketchify photos from semantic object-part image datasets and use them for training. Our architecture also incorporates object pose prediction as a novel auxiliary task which boosts overall performance while providing supplementary information regarding the sketch. We demonstrate SketchParse’s abilities (i) on two challenging large-scale sketch datasets, (ii) in parsing unseen, semantically related object categories and (iii) in improving fine-grained sketch-based image retrieval. As a novel application, we also outline how SketchParse’s output can be used to generate caption-style descriptions for hand-drawn sketches. |
Tasks | Image Retrieval, Pose Prediction, Sketch-Based Image Retrieval |
Published | 2017-09-05 |
URL | http://arxiv.org/abs/1709.01295v1 |
http://arxiv.org/pdf/1709.01295v1.pdf | |
PWC | https://paperswithcode.com/paper/sketchparse-towards-rich-descriptions-for |
Repo | https://github.com/val-iisc/sketch-parse |
Framework | pytorch |
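A toy sketch of the two-level idea: shared convolutional layers feed a router that picks one expert sub-network per input, so inference needs no manually specified category. Layer sizes, the hard argmax routing, and the per-pixel part head are illustrative assumptions, not the released SketchParse architecture.

```python
import torch
import torch.nn as nn

class TwoLevelNet(nn.Module):
    def __init__(self, n_experts=5, n_parts=8):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.router = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_experts))
        self.experts = nn.ModuleList([nn.Conv2d(32, n_parts, 1) for _ in range(n_experts)])

    def forward(self, x):
        feats = self.shared(x)
        expert_id = self.router(feats).argmax(dim=1)         # hard routing per sample
        return torch.stack([self.experts[int(e)](f.unsqueeze(0)).squeeze(0)
                            for e, f in zip(expert_id, feats)])

out = TwoLevelNet()(torch.randn(4, 1, 128, 128))
print(out.shape)     # per-pixel part logits: (4, 8, 128, 128)
```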
Semi-Supervised Active Clustering with Weak Oracles
Title | Semi-Supervised Active Clustering with Weak Oracles |
Authors | Taewan Kim, Joydeep Ghosh |
Abstract | Semi-supervised active clustering (SSAC) utilizes the knowledge of a domain expert to cluster data points by interactively making pairwise “same-cluster” queries. However, it is impractical to ask human oracles to answer every pairwise query. In this paper, we study the influence of allowing “not-sure” answers from a weak oracle and propose algorithms to efficiently handle uncertainties. Different types of model assumptions are analyzed to cover realistic scenarios of oracle abstention. In the first model, random-weak oracle, an oracle randomly abstains with a certain probability. We also propose two distance-weak oracle models which simulate the case of getting confused based on the distance between two points in a pairwise query. For each weak oracle model, we show that a small query complexity is adequate for effective $k$-means clustering with high probability. Sufficient conditions for the guarantee include a $\gamma$-margin property of the data, and the existence of a point close to each cluster center. Furthermore, we provide a sample complexity with a reduced effect of the cluster’s margin and only a logarithmic dependency on the data dimension. Our results allow a significantly smaller number of same-cluster queries if the margin of the clusters is tight, i.e. $\gamma \approx 1$. Experimental results on synthetic data show the effective performance of our approach in overcoming uncertainties. |
Tasks | |
Published | 2017-09-11 |
URL | http://arxiv.org/abs/1709.03202v1 |
http://arxiv.org/pdf/1709.03202v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-active-clustering-with-weak |
Repo | https://github.com/twankim/weaksemi |
Framework | none |
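A minimal sketch of the "random-weak" same-cluster oracle analyzed above: it answers whether two points share a cluster but abstains ("not sure") with probability p. The clustering algorithm itself is not reproduced; this only illustrates the query interface, with made-up labels and parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_weak_oracle(true_labels, p_abstain=0.2):
    # Same-cluster oracle that randomly abstains with probability p_abstain.
    def oracle(i, j):
        if rng.random() < p_abstain:
            return None                       # "not-sure" answer
        return bool(true_labels[i] == true_labels[j])
    return oracle

labels = rng.integers(0, 3, size=100)         # hidden ground-truth clusters
oracle = make_weak_oracle(labels)

answers = [oracle(i, j) for i, j in rng.integers(0, 100, size=(20, 2))]
print(answers)          # mix of True / False / None (abstentions)
```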