Paper Group ANR 1197
Detecting Colorized Images via Convolutional Neural Networks: Toward High Accuracy and Good Generalization
Title | Detecting Colorized Images via Convolutional Neural Networks: Toward High Accuracy and Good Generalization |
Authors | Weize Quan, Dong-Ming Yan, Kai Wang, Xiaopeng Zhang, Denis Pellerin |
Abstract | Image colorization achieves more and more realistic results with the increasing computation power of recent deep learning techniques. It is becoming more difficult for human eyes to identify fake colorized images. In this work, we propose a novel forensic method to distinguish between natural images (NIs) and colorized images (CIs) based on a convolutional neural network (CNN). Our method is able to achieve high classification accuracy and cope with the challenging scenario of blind detection, i.e., no training sample is available from the “unknown” colorization algorithms that we may encounter during the testing phase. This blind detection performance can be regarded as a generalization performance. First, we design and implement a base network, which can attain better performance in terms of classification accuracy and generalization (in most cases) compared with state-of-the-art methods. Furthermore, we design a new branch, which analyzes smaller regions of extracted features, and insert it into the above base network. Consequently, our network can not only improve the classification accuracy, but also enhance the generalization in the vast majority of cases. To further improve the performance of blind detection, we propose to automatically construct negative samples through linear interpolation of paired natural and colorized images. Then, we progressively insert these negative samples into the original training dataset and continue to train the network. Experimental results demonstrate that our method can achieve stable and high generalization performance when tested against different state-of-the-art colorization algorithms. |
Tasks | Colorization |
Published | 2019-02-17 |
URL | http://arxiv.org/abs/1902.06222v1 |
PDF | http://arxiv.org/pdf/1902.06222v1.pdf |
PWC | https://paperswithcode.com/paper/detecting-colorized-images-via-convolutional |
Repo | |
Framework | |
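The abstract above mentions constructing negative samples by linear interpolation of paired natural and colorized images. A minimal sketch of that idea follows. The flat-list image representation, the function names, and the interpolation weights are all assumptions made for illustration; the abstract does not specify them.

```python
def interpolate_pair(natural, colorized, alpha):
    """Blend two equally sized images value by value:
    (1 - alpha) * natural + alpha * colorized."""
    if len(natural) != len(colorized):
        raise ValueError("images must have the same number of pixel values")
    return [(1.0 - alpha) * n + alpha * c for n, c in zip(natural, colorized)]


def make_negative_samples(natural, colorized, alphas=(0.25, 0.5, 0.75)):
    """Return one interpolated image per weight.
    The default weights are hypothetical, not taken from the paper."""
    return [interpolate_pair(natural, colorized, a) for a in alphas]


# Toy 3-value "images" standing in for flattened pixel arrays.
natural = [0.0, 0.5, 1.0]
colorized = [1.0, 0.5, 0.0]
samples = make_negative_samples(natural, colorized)
```

With `alpha = 0` the blend reduces to the natural image and with `alpha = 1` to the colorized one, so intermediate weights produce images that lie between the two classes.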
Pixelated Semantic Colorization
Title | Pixelated Semantic Colorization |
Authors | Jiaojiao Zhao, Jungong Han, Ling Shao, Cees G. M. Snoek |
Abstract | While many image colorization algorithms have recently shown the capability of producing plausible color versions from gray-scale photographs, they still suffer from limited semantic understanding. To address this shortcoming, we propose to exploit pixelated object semantics to guide image colorization. The rationale is that human beings perceive and distinguish colors based on the semantic categories of objects. Starting from an autoregressive model, we generate image color distributions, from which diverse colored results are sampled. We propose two ways to incorporate object semantics into the colorization model: through a pixelated semantic embedding and a pixelated semantic generator. Specifically, the proposed convolutional neural network includes two branches. One branch learns what the object is, while the other branch learns the object colors. The network jointly optimizes a color embedding loss, a semantic segmentation loss and a color generation loss, in an end-to-end fashion. Experiments on PASCAL VOC2012 and COCO-stuff reveal that our network, when trained with semantic segmentation labels, produces more realistic and finer results compared to the colorization state-of-the-art. |
Tasks | Colorization, Semantic Segmentation |
Published | 2019-01-27 |
URL | http://arxiv.org/abs/1901.10889v2 |
PDF | http://arxiv.org/pdf/1901.10889v2.pdf |
PWC | https://paperswithcode.com/paper/pixelated-semantic-colorization |
Repo | |
Framework | |
Conservative set valued fields, automatic differentiation, stochastic gradient method and deep learning
Title | Conservative set valued fields, automatic differentiation, stochastic gradient method and deep learning |
Authors | Jérôme Bolte, Edouard Pauwels |
Abstract | Modern problems in AI or in numerical analysis require nonsmooth approaches with a flexible calculus. We introduce generalized derivatives called conservative fields for which we develop a calculus and provide representation formulas. Functions having a conservative field are called path differentiable: convex, concave, Clarke regular and any semialgebraic Lipschitz continuous functions are path differentiable. Using Whitney stratification techniques for semialgebraic and definable sets, our model provides variational formulas for nonsmooth automatic differentiation oracles, as for instance the famous backpropagation algorithm in deep learning. Our differential model is applied to establish the convergence in values of nonsmooth stochastic gradient methods as they are implemented in practice. |
Tasks | |
Published | 2019-09-23 |
URL | https://arxiv.org/abs/1909.10300v3 |
PDF | https://arxiv.org/pdf/1909.10300v3.pdf |
PWC | https://paperswithcode.com/paper/190910300 |
Repo | |
Framework | |
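To illustrate why nonsmooth automatic differentiation needs a theory of its own, here is a small sketch of a well-known phenomenon: an AD oracle can return a value that is not the true derivative at a nonsmooth point. The example is not taken from the abstract. It uses forward-mode dual numbers rather than backpropagation for brevity, and it assumes the common convention `relu'(0) = 0`.

```python
class Dual:
    """A value paired with a derivative, propagated by the chain rule."""

    def __init__(self, value, deriv):
        self.value = value
        self.deriv = deriv

    def __neg__(self):
        return Dual(-self.value, -self.deriv)

    def __sub__(self, other):
        return Dual(self.value - other.value, self.deriv - other.deriv)


def relu(x):
    # Assumed convention: the slope at exactly 0 is taken to be 0.
    slope = 1.0 if x.value > 0 else 0.0
    return Dual(max(x.value, 0.0), slope * x.deriv)


def f(x):
    # relu(x) - relu(-x) equals x for every real x, so f'(x) = 1 everywhere.
    return relu(x) - relu(-x)


def ad_derivative(point):
    """Derivative of f at `point` as computed by the chain rule."""
    return f(Dual(point, 1.0)).deriv
```

Away from the origin the oracle agrees with the true derivative, but at `0` the chain rule combines the two conventions and reports a slope of 0 for a function that is the identity.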
Detecting Repeating Objects using Patch Correlation Analysis
Title | Detecting Repeating Objects using Patch Correlation Analysis |
Authors | Inbar Huberman, Raanan Fattal |
Abstract | In this paper we describe a new method for detecting and counting a repeating object in an image. While the method relies on a fairly sophisticated deformable part model, unlike existing techniques it estimates the model parameters in an unsupervised fashion, thus alleviating the need for user-annotated training data and avoiding the associated specificity. This automatic fitting process is carried out by exploiting the recurrence of small image patches associated with the repeating object and analyzing their spatial correlation. The analysis allows us to reject outlier patches, recover the visual and shape parameters of the part model, and detect the object instances efficiently. In order to achieve a practical system which is able to cope with diverse images, we describe a simple and intuitive active-learning procedure that updates the object classification by querying the user on very few carefully chosen marginal classifications. Evaluation of the new method against the state-of-the-art techniques demonstrates its ability to achieve higher accuracy through a better user experience. |
Tasks | Active Learning, Object Classification |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05629v1 |
PDF | http://arxiv.org/pdf/1904.05629v1.pdf |
PWC | https://paperswithcode.com/paper/detecting-repeating-objects-using-patch-1 |
Repo | |
Framework | |
Sum-of-squares meets square loss: Fast rates for agnostic tensor completion
Title | Sum-of-squares meets square loss: Fast rates for agnostic tensor completion |
Authors | Dylan J. Foster, Andrej Risteski |
Abstract | We study tensor completion in the agnostic setting. In the classical tensor completion problem, we receive $n$ entries of an unknown rank-$r$ tensor and wish to exactly complete the remaining entries. In agnostic tensor completion, we make no assumption on the rank of the unknown tensor, but attempt to predict unknown entries as well as the best rank-$r$ tensor. For agnostic learning of third-order tensors with the square loss, we give the first polynomial time algorithm that obtains a “fast” (i.e., $O(1/n)$-type) rate improving over the rate obtained by reduction to matrix completion. Our prediction error rate to compete with the best $d\times{}d\times{}d$ tensor of rank-$r$ is $\tilde{O}(r^{2}d^{3/2}/n)$. We also obtain an exact oracle inequality that trades off estimation and approximation error. Our algorithm is based on the degree-six sum-of-squares relaxation of the tensor nuclear norm. The key feature of our analysis is to show that a certain characterization for the subgradient of the tensor nuclear norm can be encoded in the sum-of-squares proof system. This unlocks the standard toolbox for localization of empirical processes under the square loss, and allows us to establish restricted eigenvalue-type guarantees for various tensor regression models, with tensor completion as a special case. The new analysis of the relaxation complements Barak and Moitra (2016), who gave slow rates for agnostic tensor completion, and Potechin and Steurer (2017), who gave exact recovery guarantees for the noiseless setting. Our techniques are user-friendly, and we anticipate that they will find use elsewhere. |
Tasks | Matrix Completion |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13283v1 |
PDF | https://arxiv.org/pdf/1905.13283v1.pdf |
PWC | https://paperswithcode.com/paper/sum-of-squares-meets-square-loss-fast-rates |
Repo | |
Framework | |
Towards Understanding Generalization in Gradient-Based Meta-Learning
Title | Towards Understanding Generalization in Gradient-Based Meta-Learning |
Authors | Simon Guiroy, Vikas Verma, Christopher Pal |
Abstract | In this work we study generalization of neural networks in gradient-based meta-learning by analyzing various properties of the objective landscapes. We experimentally demonstrate that as meta-training progresses, the meta-test solutions, obtained after adapting the meta-train solution of the model to new tasks via a few steps of gradient-based fine-tuning, become flatter, lower in loss, and further away from the meta-train solution. We also show that those meta-test solutions become flatter even as generalization starts to degrade, thus providing experimental evidence against the correlation between generalization and flat minima in the paradigm of gradient-based meta-learning. Furthermore, we provide empirical evidence that generalization to new tasks is correlated with the coherence between their adaptation trajectories in parameter space, measured by the average cosine similarity between task-specific trajectory directions, starting from the same meta-train solution. We also show that coherence of meta-test gradients, measured by the average inner product between the task-specific gradient vectors evaluated at the meta-train solution, is also correlated with generalization. Based on these observations, we propose a novel regularizer for MAML and provide experimental evidence for its effectiveness. |
Tasks | Meta-Learning |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.07287v1 |
PDF | https://arxiv.org/pdf/1907.07287v1.pdf |
PWC | https://paperswithcode.com/paper/towards-understanding-generalization-in |
Repo | |
Framework | |
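The two coherence measures described in the abstract above can be sketched as follows. This is an illustrative reading, not the authors' code: parameters and gradients are represented as flat lists, the function names are hypothetical, and averaging over all distinct task pairs is an assumption about how the average is taken.

```python
import math
from itertools import combinations


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    # Assumes neither vector is all zeros.
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def trajectory_coherence(start, adapted_solutions):
    """Average cosine similarity between the directions that lead from a
    shared meta-train solution `start` to each task-adapted solution."""
    directions = [[a - s for a, s in zip(sol, start)] for sol in adapted_solutions]
    pairs = list(combinations(directions, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def gradient_coherence(gradients):
    """Average inner product between task-specific gradient vectors."""
    pairs = list(combinations(gradients, 2))
    return sum(dot(u, v) for u, v in pairs) / len(pairs)


# Two tasks that move the parameters in the same direction are fully coherent.
start = [0.0, 0.0]
adapted = [[1.0, 0.0], [2.0, 0.0]]
coherence = trajectory_coherence(start, adapted)
```

Tasks whose adaptation directions are orthogonal contribute a cosine of 0 to the average, so the score drops as tasks pull the shared solution in unrelated directions.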
Collaborative Self-Attention for Recommender Systems
Title | Collaborative Self-Attention for Recommender Systems |
Authors | Kai-Lang Yao, Wu-Jun Li |
Abstract | Recommender systems (RS), which have been an essential part of a wide range of applications, can be formulated as a matrix completion (MC) problem. To boost the performance of MC, matrix completion with side information, called inductive matrix completion (IMC), was further proposed. In real applications, the factorized version of IMC is more favored due to its efficiency of optimization and implementation. Regarding the factorized version, the traditional IMC method can be interpreted as learning an individual representation for each feature, independently of the other features. Moreover, representations for the same features are shared across all users/items. However, the independent characteristic for features and shared characteristic for the same features across all users/items may limit the expressiveness of the model. The limitation also exists in variants of IMC, such as deep learning based IMC models. To break the limitation, we generalize recent advances in self-attention mechanisms to IMC and propose a context-aware model called collaborative self-attention (CSA), which can jointly learn context-aware representations for features and perform the inductive matrix completion process. Extensive experiments on three large-scale datasets from real RS applications demonstrate the effectiveness of CSA. |
Tasks | Matrix Completion, Recommendation Systems |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.13133v2 |
PDF | https://arxiv.org/pdf/1905.13133v2.pdf |
PWC | https://paperswithcode.com/paper/collaborative-self-attention-for-recommender |
Repo | |
Framework | |
Automated Polysomnography Analysis for Detection of Non-Apneic and Non-Hypopneic Arousals using Feature Engineering and a Bidirectional LSTM Network
Title | Automated Polysomnography Analysis for Detection of Non-Apneic and Non-Hypopneic Arousals using Feature Engineering and a Bidirectional LSTM Network |
Authors | Ali Bahrami Rad, Morteza Zabihi, Zheng Zhao, Moncef Gabbouj, Aggelos K. Katsaggelos, Simo Särkkä |
Abstract | Objective: The aim of this study is to develop an automated classification algorithm for polysomnography (PSG) recordings to detect non-apneic and non-hypopneic arousals. Our particular focus is on detecting the respiratory effort-related arousals (RERAs) which are very subtle respiratory events that do not meet the criteria for apnea or hypopnea, and are more challenging to detect. Methods: The proposed algorithm is based on a bidirectional long short-term memory (BiLSTM) classifier and 465 multi-domain features, extracted from multimodal clinical time series. The features consist of a set of physiology-inspired features (n = 75), obtained by multiple steps of feature selection and expert analysis, and a set of physiology-agnostic features (n = 390), derived from scattering transform. Results: The proposed algorithm is validated on the 2018 PhysioNet challenge dataset. The overall performance in terms of the area under the precision-recall curve (AUPRC) is 0.50 on the hidden test dataset. This result is tied for the second-best score during the follow-up and official phases of the 2018 PhysioNet challenge. Conclusions: The results demonstrate that it is possible to automatically detect subtle non-apneic/non-hypopneic arousal events from PSG recordings. Significance: Automatic detection of subtle respiratory events such as RERAs together with other non-apneic/non-hypopneic arousals will allow detailed annotations of large PSG databases. This contributes to a better retrospective analysis of sleep data, which may also improve the quality of treatment. |
Tasks | Feature Engineering, Feature Selection, Time Series |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.02971v1 |
PDF | https://arxiv.org/pdf/1909.02971v1.pdf |
PWC | https://paperswithcode.com/paper/automated-polysomnography-analysis-for |
Repo | |
Framework | |
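The headline metric in the abstract above is the area under the precision-recall curve (AUPRC). One common way to estimate it is average precision, sketched below. This is a generic estimator chosen for illustration; the challenge's official scoring procedure may differ (for example in how ties and interpolation are handled), and the function name is hypothetical.

```python
def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    `labels` holds 0/1 ground truth and `scores` the predicted confidences.
    Samples are ranked by descending score, and precision is accumulated
    at every rank where a positive sample is found. Tied scores are not
    treated specially. Assumes at least one positive label.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_positives = sum(labels)
    true_positives = 0
    accumulated = 0.0
    for rank, index in enumerate(order, start=1):
        if labels[index] == 1:
            true_positives += 1
            accumulated += true_positives / rank
    return accumulated / total_positives


# A perfect ranking places every positive ahead of every negative.
perfect = average_precision([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
```

Unlike accuracy, this score ignores true negatives, which is why it is often preferred when the events of interest are rare.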
Geometry-Aware Video Object Detection for Static Cameras
Title | Geometry-Aware Video Object Detection for Static Cameras |
Authors | Dan Xu, Weidi Xie, Andrew Zisserman |
Abstract | In this paper we propose a geometry-aware model for video object detection. Specifically, we consider the setting where cameras can be well approximated as static, e.g. in video surveillance scenarios, and scene pseudo depth maps can therefore be inferred easily from the object scale on the image plane. We make the following contributions: First, we extend the recent anchor-free detector (CornerNet [17]) to video object detection. In order to exploit the spatial-temporal information while maintaining high efficiency, the proposed model accepts video clips as input, and only makes predictions for the starting and the ending frames, i.e. heatmaps of object bounding box corners and the corresponding embeddings for grouping. Second, to tackle the challenge from scale variations in object detection, scene geometry information, e.g. derived depth maps, is explicitly incorporated into deep networks for multi-scale feature selection and for the network prediction. Third, we validate the proposed architectures on an autonomous driving dataset generated from the Carla simulator [5], and on a real dataset for human detection (DukeMTMC dataset [28]). When comparing with the existing competitive single-stage or two-stage detectors, the proposed geometry-aware spatio-temporal network achieves significantly better results. |
Tasks | Autonomous Driving, Feature Selection, Human Detection, Object Detection, Video Object Detection |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.03140v1 |
PDF | https://arxiv.org/pdf/1909.03140v1.pdf |
PWC | https://paperswithcode.com/paper/geometry-aware-video-object-detection-for |
Repo | |
Framework | |
Barron Spaces and the Compositional Function Spaces for Neural Network Models
Title | Barron Spaces and the Compositional Function Spaces for Neural Network Models |
Authors | Weinan E, Chao Ma, Lei Wu |
Abstract | One of the key issues in the analysis of machine learning models is to identify the appropriate function space for the model. This is the space of functions that the particular machine learning model can approximate with good accuracy, endowed with a natural norm associated with the approximation process. In this paper, we address this issue for two representative neural network models: the two-layer networks and the residual neural networks. We define Barron space and show that it is the right space for two-layer neural network models in the sense that optimal direct and inverse approximation theorems hold for functions in the Barron space. For residual neural network models, we construct the so-called compositional function space, and prove direct and inverse approximation theorems for this space. In addition, we show that the Rademacher complexity has the optimal upper bounds for these spaces. |
Tasks | |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.08039v1 |
PDF | https://arxiv.org/pdf/1906.08039v1.pdf |
PWC | https://paperswithcode.com/paper/barron-spaces-and-the-compositional-function |
Repo | |
Framework | |
On Low-rank Trace Regression under General Sampling Distribution
Title | On Low-rank Trace Regression under General Sampling Distribution |
Authors | Nima Hamidi, Mohsen Bayati |
Abstract | A growing number of modern statistical learning problems involve estimating a large number of parameters from a (smaller) number of noisy observations. In a subset of these problems (matrix completion, matrix compressed sensing, and multi-task learning) the unknown parameters form a high-dimensional matrix B*, and two popular approaches for the estimation are convex relaxation of rank-penalized regression or non-convex optimization. It is also known that these estimators satisfy near optimal error bounds under assumptions on rank, coherence, or spikiness of the unknown matrix. In this paper, we introduce a unifying technique for analyzing all of these problems via both estimators that leads to short proofs for the existing results as well as new results. Specifically, first we introduce a general notion of spikiness for B* and consider a general family of estimators and prove non-asymptotic error bounds for their estimation error. Our approach relies on a generic recipe to prove restricted strong convexity for the sampling operator of the trace regression. Second, and most notably, we prove similar error bounds when the regularization parameter is chosen via K-fold cross-validation. This result is significant in that existing theory on cross-validated estimators does not apply to our setting since our estimators are not known to satisfy their required notion of stability. Third, we study applications of our general results to four subproblems of (1) matrix completion, (2) multi-task learning, (3) compressed sensing with Gaussian ensembles, and (4) compressed sensing with factored measurements. For (1), (3), and (4) we recover matching error bounds as those found in the literature, and for (2) we obtain (to the best of our knowledge) the first such error bound. We also demonstrate how our framework applies to the exact recovery problem in (3) and (4). |
Tasks | Matrix Completion, Multi-Task Learning |
Published | 2019-04-18 |
URL | https://arxiv.org/abs/1904.08576v2 |
PDF | https://arxiv.org/pdf/1904.08576v2.pdf |
PWC | https://paperswithcode.com/paper/on-low-rank-trace-regression-under-general |
Repo | |
Framework | |
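For readers unfamiliar with the setting, the trace regression model and a typical convex relaxation can be written as below. This is the standard textbook formulation, given here as an assumption about the setup; the abstract itself does not state the model, and the symbols $X_i$, $\varepsilon_i$, $n$, and $\lambda$ as well as the $1/n$ normalization are conventions chosen for this sketch.

```latex
% Observation model: each response is a noisy linear measurement of B^*.
y_i = \langle X_i, B^\ast \rangle + \varepsilon_i
    = \operatorname{tr}\!\left(X_i^\top B^\ast\right) + \varepsilon_i,
\qquad i = 1, \dots, n.

% Convex relaxation: the rank penalty is replaced by the nuclear norm
% \|B\|_\ast (the sum of the singular values of B).
\hat{B} \in \arg\min_{B} \;
  \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - \langle X_i, B \rangle \bigr)^2
  + \lambda \, \|B\|_\ast .
```

Matrix completion is the special case in which each $X_i$ has a single nonzero entry, so that $\langle X_i, B^\ast \rangle$ reads off one entry of $B^\ast$.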
Feature Gradients: Scalable Feature Selection via Discrete Relaxation
Title | Feature Gradients: Scalable Feature Selection via Discrete Relaxation |
Authors | Rishit Sheth, Nicolo Fusi |
Abstract | In this paper we introduce Feature Gradients, a gradient-based search algorithm for feature selection. Our approach extends a recent result on the estimation of learnability in the sublinear data regime by showing that the calculation can be performed iteratively (i.e., in mini-batches) and in linear time and space with respect to both the number of features D and the sample size N. This, along with a discrete-to-continuous relaxation of the search domain, allows for an efficient, gradient-based search algorithm among feature subsets for very large datasets. Crucially, our algorithm is capable of finding higher-order correlations between features and targets for both the N > D and N < D regimes, as opposed to approaches that do not consider such interactions and/or only consider one regime. We provide experimental demonstration of the algorithm in small and large sample- and feature-size settings. |
Tasks | Feature Selection |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10382v1 |
PDF | https://arxiv.org/pdf/1908.10382v1.pdf |
PWC | https://paperswithcode.com/paper/feature-gradients-scalable-feature-selection |
Repo | |
Framework | |
A Posteriori Probabilistic Bounds of Convex Scenario Programs with Validation Tests
Title | A Posteriori Probabilistic Bounds of Convex Scenario Programs with Validation Tests |
Authors | Chao Shang, Fengqi You |
Abstract | Scenario programs have established themselves as efficient tools towards decision-making under uncertainty. To assess the quality of scenario-based solutions a posteriori, validation tests based on Bernoulli trials have been widely adopted in practice. However, to reach a theoretically reliable judgement of risk, one typically needs to collect massive validation samples. In this work, we propose new a posteriori bounds for convex scenario programs with validation tests, which are dependent on both realizations of support constraints and performance on out-of-sample validation data. The proposed bounds enjoy wide generality in that many existing theoretical results can be incorporated as particular cases. To facilitate practical use, a systematic approach for parameterizing a posteriori probability bounds is also developed, which is shown to possess a variety of desirable properties allowing for easy implementations and clear interpretations. By synthesizing comprehensive information about support constraints and validation tests, improved risk evaluation can be achieved for randomized solutions in comparison with existing a posteriori bounds. Case studies on controller design of aircraft lateral motion are presented to validate the effectiveness of the proposed a posteriori bounds. |
Tasks | Decision Making, Decision Making Under Uncertainty |
Published | 2019-03-27 |
URL | https://arxiv.org/abs/1903.11734v2 |
PDF | https://arxiv.org/pdf/1903.11734v2.pdf |
PWC | https://paperswithcode.com/paper/posteriori-probabilistic-bounds-of-convex |
Repo | |
Framework | |
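The abstract above notes that validation tests based on Bernoulli trials typically need massive validation samples. The sketch below illustrates why, using the classical one-sided Hoeffding inequality. This is a generic textbook bound chosen for illustration and is not one of the bounds proposed in the paper; the function and parameter names are hypothetical.

```python
import math


def hoeffding_upper_bound(violations, n_samples, beta):
    """Upper bound on the true violation probability that holds with
    confidence at least 1 - beta, given `violations` constraint violations
    observed in `n_samples` independent validation samples."""
    empirical = violations / n_samples
    margin = math.sqrt(math.log(1.0 / beta) / (2.0 * n_samples))
    return min(1.0, empirical + margin)


def samples_needed(margin, beta):
    """Number of validation samples needed for the Hoeffding margin
    to shrink to at most `margin` at confidence 1 - beta."""
    return math.ceil(math.log(1.0 / beta) / (2.0 * margin ** 2))


# Certifying the violation rate to within 1% at confidence 1 - 1e-6
# already calls for tens of thousands of validation samples.
required = samples_needed(0.01, 1e-6)
```

Because the margin shrinks only with the square root of the sample count, halving it requires four times as many validation samples, which motivates bounds that also exploit information from the scenario program itself.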
Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation
Title | Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation |
Authors | Zih-Siou Hung, Arun Mallya, Svetlana Lazebnik |
Abstract | Relations amongst entities play a central role in image understanding. Due to the complexity of modeling (subject, predicate, object) relation triplets, it is crucial to develop a method that can not only recognize seen relations, but also generalize to unseen cases. Inspired by a previously proposed visual translation embedding model, or VTransE, we propose a context-augmented translation embedding model that can capture both common and rare relations. The previous VTransE model maps entities and predicates into a low-dimensional embedding vector space where the predicate is interpreted as a translation vector between the embedded features of the bounding box regions of the subject and the object. Our model additionally incorporates the contextual information captured by the bounding box of the union of the subject and the object, and learns the embeddings guided by the constraint predicate $\approx$ union (subject, object) $-$ subject $-$ object. In a comprehensive evaluation on multiple challenging benchmarks, our approach outperforms previous translation-based models and comes close to or exceeds the state of the art across a range of settings, from small-scale to large-scale datasets, from common to previously unseen relations. It also achieves promising results for the recently introduced task of scene graph generation. |
Tasks | Graph Generation, Scene Graph Generation |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11624v3 |
PDF | https://arxiv.org/pdf/1905.11624v3.pdf |
PWC | https://paperswithcode.com/paper/union-visual-translation-embedding-for-visual |
Repo | |
Framework | |
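The constraint stated in the abstract above, predicate ≈ union(subject, object) − subject − object, can be sketched with plain vectors. The snippet assumes the three region features have already been embedded into a common space; the function names, the toy embeddings, and the use of squared Euclidean distance to pick a predicate are all hypothetical choices for illustration, not details from the paper.

```python
def predicted_translation(union_vec, subject_vec, object_vec):
    """Vector that the constraint asks the predicate embedding to approximate:
    union - subject - object, computed element by element."""
    return [u - s - o for u, s, o in zip(union_vec, subject_vec, object_vec)]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_predicate(union_vec, subject_vec, object_vec, predicate_embeddings):
    """Return the predicate whose embedding is closest to the predicted
    translation (nearest-neighbor lookup is an assumed decoding rule)."""
    target = predicted_translation(union_vec, subject_vec, object_vec)
    return min(
        predicate_embeddings,
        key=lambda name: squared_distance(predicate_embeddings[name], target),
    )


# Toy 2-dimensional embeddings, invented for this example.
predicates = {"rides": [1.0, 1.0], "holds": [-1.0, 0.0]}
best = nearest_predicate([3.0, 1.0], [1.0, 0.0], [1.0, 0.0], predicates)
```

Compared with a translation model that uses only the subject and object boxes, the union term lets the surrounding context of the pair influence which predicate is selected.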
Multi-Label Adversarial Perturbations
Title | Multi-Label Adversarial Perturbations |
Authors | Qingquan Song, Haifeng Jin, Xiao Huang, Xia Hu |
Abstract | Adversarial examples are delicately perturbed inputs, which aim to mislead machine learning models towards incorrect outputs. While most of the existing work focuses on generating adversarial perturbations in multi-class classification problems, many real-world applications fall into the multi-label setting in which one instance could be associated with more than one label. For example, a spammer may generate adversarial spams with malicious advertising while maintaining the other labels such as topic labels unchanged. To analyze the vulnerability and robustness of multi-label learning models, we investigate the generation of multi-label adversarial perturbations. This is a challenging task due to the uncertain number of positive labels associated with one instance, as well as the fact that multiple labels are usually not mutually exclusive. To bridge this gap, in this paper, we propose a general attacking framework targeting the multi-label classification problem and conduct a premier analysis on the perturbations for deep neural networks. Leveraging the ranking relationships among labels, we further design a ranking-based framework to attack multi-label ranking algorithms. We specify the connection between the two proposed frameworks and separately design two specific methods grounded on each of them to generate targeted multi-label perturbations. Experiments on real-world multi-label image classification and ranking problems demonstrate the effectiveness of our proposed frameworks and provide insights into the vulnerability of multi-label deep learning models under diverse targeted attacking strategies. Several interesting findings including an unpolished defensive strategy, which could potentially enhance the interpretability and robustness of multi-label deep learning models, are further presented and discussed at the end. |
Tasks | Image Classification, Multi-Label Classification, Multi-Label Learning |
Published | 2019-01-02 |
URL | http://arxiv.org/abs/1901.00546v1 |
PDF | http://arxiv.org/pdf/1901.00546v1.pdf |
PWC | https://paperswithcode.com/paper/multi-label-adversarial-perturbations |
Repo | |
Framework | |