Paper Group ANR 135
On the regularization of Wasserstein GANs
Title | On the regularization of Wasserstein GANs |
Authors | Henning Petzka, Asja Fischer, Denis Lukovnicov |
Abstract | Since their invention, generative adversarial networks (GANs) have become a popular approach for learning to model a distribution of real (unlabeled) data. Convergence problems during training are overcome by Wasserstein GANs which minimize the distance between the model and the empirical distribution in terms of a different metric, but thereby introduce a Lipschitz constraint into the optimization problem. A simple way to enforce the Lipschitz constraint on the class of functions, which can be modeled by the neural network, is weight clipping. It was proposed that training can be improved by instead augmenting the loss by a regularization term that penalizes the deviation of the gradient of the critic (as a function of the network’s input) from one. We present theoretical arguments why using a weaker regularization term enforcing the Lipschitz constraint is preferable. These arguments are supported by experimental results on toy data sets. |
Tasks | |
Published | 2017-09-26 |
URL | http://arxiv.org/abs/1709.08894v2 |
http://arxiv.org/pdf/1709.08894v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-regularization-of-wasserstein-gans |
Repo | |
Framework | |
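The abstract above contrasts a regularizer that penalizes any deviation of the critic's gradient norm from one with a weaker term that only enforces the Lipschitz constraint. A minimal sketch of that difference follows. It assumes the weaker term is a one-sided penalty that is zero whenever the gradient norm is at most one; the abstract does not give the formula, so this form and the function names `two_sided_penalty` and `one_sided_penalty` are illustrative assumptions, not the authors' code.

```python
import math


def grad_norm(grad):
    """Euclidean norm of a critic gradient given as a list of floats."""
    return math.sqrt(sum(g * g for g in grad))


def two_sided_penalty(grad):
    """Penalize any deviation of the gradient norm from one."""
    return (grad_norm(grad) - 1.0) ** 2


def one_sided_penalty(grad):
    """Assumed weaker term: penalize only gradient norms above one."""
    return max(0.0, grad_norm(grad) - 1.0) ** 2


# A gradient with norm 0.5 already satisfies a 1-Lipschitz bound:
# the two-sided term still penalizes it, the one-sided term does not.
small = [0.3, 0.4]
# A gradient with norm 5 violates the bound; both terms agree.
large = [3.0, 4.0]
```

In a training loop, either term would be averaged over sampled points and added to the critic loss with a weighting coefficient; that coefficient and the sampling scheme are left out here.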
A representer theorem for deep kernel learning
Title | A representer theorem for deep kernel learning |
Authors | Bastian Bohn, Michael Griebel, Christian Rieger |
Abstract | In this paper we provide a finite-sample and an infinite-sample representer theorem for the concatenation of (linear combinations of) kernel functions of reproducing kernel Hilbert spaces. These results serve as mathematical foundation for the analysis of machine learning algorithms based on compositions of functions. As a direct consequence in the finite-sample case, the corresponding infinite-dimensional minimization problems can be recast into (nonlinear) finite-dimensional minimization problems, which can be tackled with nonlinear optimization algorithms. Moreover, we show how concatenated machine learning problems can be reformulated as neural networks and how our representer theorem applies to a broad class of state-of-the-art deep learning methods. |
Tasks | |
Published | 2017-09-29 |
URL | http://arxiv.org/abs/1709.10441v3 |
http://arxiv.org/pdf/1709.10441v3.pdf | |
PWC | https://paperswithcode.com/paper/a-representer-theorem-for-deep-kernel |
Repo | |
Framework | |
Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers
Title | Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers |
Authors | Jacob Steinhardt, Moses Charikar, Gregory Valiant |
Abstract | We introduce a criterion, resilience, which allows properties of a dataset (such as its mean or best low rank approximation) to be robustly computed, even in the presence of a large fraction of arbitrary additional data. Resilience is a weaker condition than most other properties considered so far in the literature, and yet enables robust estimation in a broader variety of settings. We provide new information-theoretic results on robust distribution learning, robust estimation of stochastic block models, and robust mean estimation under bounded $k$th moments. We also provide new algorithmic results on robust distribution learning, as well as robust mean estimation in $\ell_p$-norms. Among our proof techniques is a method for pruning a high-dimensional distribution with bounded $1$st moments to a stable “core” with bounded $2$nd moments, which may be of independent interest. |
Tasks | |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.04940v3 |
http://arxiv.org/pdf/1703.04940v3.pdf | |
PWC | https://paperswithcode.com/paper/resilience-a-criterion-for-learning-in-the |
Repo | |
Framework | |
Robust Decentralized Learning Using ADMM with Unreliable Agents
Title | Robust Decentralized Learning Using ADMM with Unreliable Agents |
Authors | Qunwei Li, Bhavya Kailkhura, Ryan Goldhahn, Priyadip Ray, Pramod K. Varshney |
Abstract | Many machine learning problems can be formulated as consensus optimization problems which can be solved efficiently via a cooperative multi-agent system. However, the agents in the system can be unreliable due to a variety of reasons: noise, faults and attacks. Providing erroneous updates leads the optimization process in a wrong direction, and degrades the performance of distributed machine learning algorithms. This paper considers the problem of decentralized learning using ADMM in the presence of unreliable agents. First, we rigorously analyze the effect of erroneous updates (in ADMM learning iterations) on the convergence behavior of the multi-agent system. We show that the algorithm linearly converges to a neighborhood of the optimal solution under certain conditions and characterize the neighborhood size analytically. Next, we provide guidelines for network design to achieve a faster convergence. We also provide conditions on the erroneous updates for exact convergence to the optimal solution. Finally, to mitigate the influence of unreliable agents, we propose ROAD, a robust variant of ADMM, and show its resilience to unreliable agents with an exact convergence to the optimum. |
Tasks | |
Published | 2017-10-14 |
URL | http://arxiv.org/abs/1710.05241v3 |
http://arxiv.org/pdf/1710.05241v3.pdf | |
PWC | https://paperswithcode.com/paper/robust-decentralized-learning-using-admm-with |
Repo | |
Framework | |
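To make the claim about convergence to a neighborhood of the optimum concrete, here is a small sketch of consensus ADMM on a scalar least-squares problem. It is a simplification under stated assumptions: it uses the textbook global-consensus form with a shared averaging step rather than the fully decentralized setting of the paper, each agent's local cost is the hypothetical quadratic (x - a_i)^2 / 2, and an unreliable agent is modeled as adding a constant error to every update. ROAD itself is not implemented.

```python
def consensus_admm(local_targets, rho=1.0, iterations=200, errors=None):
    """Scalar consensus ADMM for minimizing sum_i (x - a_i)^2 / 2.

    errors[i] is a constant perturbation that agent i adds to each of its
    updates (an assumed, simplified model of an unreliable agent).
    Returns the final consensus variable z.
    """
    n = len(local_targets)
    errors = errors or [0.0] * n
    z = 0.0              # consensus variable
    u = [0.0] * n        # scaled dual variables, one per agent
    for _ in range(iterations):
        # Local step: closed-form minimizer of the quadratic local cost
        # plus the proximal term, then corrupted by the agent's error.
        x = [(a + rho * (z - u_i)) / (1.0 + rho) + e
             for a, u_i, e in zip(local_targets, u, errors)]
        # Averaging step over the (possibly corrupted) local updates.
        z = sum(x_i + u_i for x_i, u_i in zip(x, u)) / n
        # Dual step.
        u = [u_i + x_i - z for u_i, x_i in zip(u, x)]
    return z


targets = [1.0, 2.0, 3.0, 6.0]                      # optimum is the mean, 3.0
z_clean = consensus_admm(targets)
z_faulty = consensus_admm(targets, errors=[0.0, 0.0, 0.0, 0.4])
```

With reliable agents the iterate settles at the mean of the targets. With one agent adding a constant error e, this toy model settles at a nearby point offset by (1 + rho) * e / n, which illustrates the neighborhood behavior without reproducing the paper's analysis.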
Learning to Refine Object Contours with a Top-Down Fully Convolutional Encoder-Decoder Network
Title | Learning to Refine Object Contours with a Top-Down Fully Convolutional Encoder-Decoder Network |
Authors | Yahui Liu, Jian Yao, Li Li, Xiaohu Lu, Jing Han |
Abstract | We develop a novel deep contour detection algorithm with a top-down fully convolutional encoder-decoder network. Our proposed method, named TD-CEDN, solves two important issues in this low-level vision problem: (1) learning multi-scale and multi-level features; and (2) applying an effective top-down refined approach in the networks. TD-CEDN performs the pixel-wise prediction by means of leveraging features at all layers of the net. Unlike skip connections and previous encoder-decoder methods, we first learn a coarse feature map after the encoder stage in a feedforward pass, and then refine this feature map in a top-down strategy during the decoder stage utilizing features at successively lower layers. Therefore, the deconvolutional process is conducted stepwise, which is guided by Deeply-Supervision Net providing the integrated direct supervision. The above proposed technologies lead to a more precise and clearer prediction. Our proposed algorithm achieved the state-of-the-art on the BSDS500 dataset (ODS F-score of 0.788), the PASCAL VOC2012 dataset (ODS F-score of 0.588), and the NYU Depth dataset (ODS F-score of 0.735). |
Tasks | Contour Detection |
Published | 2017-05-12 |
URL | http://arxiv.org/abs/1705.04456v1 |
http://arxiv.org/pdf/1705.04456v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-refine-object-contours-with-a-top |
Repo | |
Framework | |
EVE: Explainable Vector Based Embedding Technique Using Wikipedia
Title | EVE: Explainable Vector Based Embedding Technique Using Wikipedia |
Authors | M. Atif Qureshi, Derek Greene |
Abstract | We present an unsupervised explainable word embedding technique, called EVE, which is built upon the structure of Wikipedia. The proposed model defines the dimensions of a semantic vector representing a word using human-readable labels, thereby making it readily interpretable. Specifically, each vector is constructed using the Wikipedia category graph structure together with the Wikipedia article link structure. To test the effectiveness of the proposed word embedding model, we consider its usefulness in three fundamental tasks: 1) intruder detection - to evaluate its ability to identify a non-coherent vector from a list of coherent vectors, 2) ability to cluster - to evaluate its tendency to group related vectors together while keeping unrelated vectors in separate clusters, and 3) sorting relevant items first - to evaluate its ability to rank vectors (items) relevant to the query in the top order of the result. For each task, we also propose a strategy to generate a task-specific human-interpretable explanation from the model. These demonstrate the overall effectiveness of the explainable embeddings generated by EVE. Finally, we compare EVE with the Word2Vec, FastText, and GloVe embedding techniques across the three tasks, and report improvements over the state-of-the-art. |
Tasks | |
Published | 2017-02-22 |
URL | http://arxiv.org/abs/1702.06891v1 |
http://arxiv.org/pdf/1702.06891v1.pdf | |
PWC | https://paperswithcode.com/paper/eve-explainable-vector-based-embedding |
Repo | |
Framework | |
The Matrix Hilbert Space and Its Application to Matrix Learning
Title | The Matrix Hilbert Space and Its Application to Matrix Learning |
Authors | Yunfei Ye |
Abstract | Theoretical studies have proven that the Hilbert space has remarkable performance in many fields of applications. Frames in tensor product of Hilbert spaces were introduced to generalize the inner product to high-order tensors. However, these techniques require tensor decomposition, which could lead to the loss of information, and it is an NP-hard problem to determine the rank of tensors. Here, we present a new framework, namely matrix Hilbert space, to perform a matrix inner product space when data observations are represented as matrices. We preserve the structure of initial data, and multi-way correlation among them is captured in the process. In addition, we extend the reproducing kernel Hilbert space (RKHS) to reproducing kernel matrix Hilbert space (RKMHS) and propose an equivalent condition of the space uses of the certain kernel function. A new family of kernels is introduced in our framework to apply the classifier of Support Tensor Machine (STM), and comparative experiments are performed on a number of real-world datasets to support our contributions. |
Tasks | |
Published | 2017-06-25 |
URL | http://arxiv.org/abs/1706.08110v2 |
http://arxiv.org/pdf/1706.08110v2.pdf | |
PWC | https://paperswithcode.com/paper/the-matrix-hilbert-space-and-its-application |
Repo | |
Framework | |
Joint Semi-supervised RSS Dimensionality Reduction and Fingerprint Based Algorithm for Indoor Localization
Title | Joint Semi-supervised RSS Dimensionality Reduction and Fingerprint Based Algorithm for Indoor Localization |
Authors | Caifa Zhou, Lin Ma, Xuezhi Tan |
Abstract | With recent developments in mobile computing devices and the ubiquitous deployment of access points (APs) of Wireless Local Area Networks (WLANs), WLAN based indoor localization systems (WILSs) are attracting growing attention and becoming more prevalent because they do not require additional infrastructure. Because the approaches used for localization in satellite based global positioning systems are difficult to apply in indoor environments, fingerprint based localization algorithms (FLAs) are predominant among the RSS based schemes. However, the performance of FLAs is closely related to the number of APs and the number of reference points (RPs) in WILSs, especially when APs and RPs are deployed redundantly. Two critical problems, the curse of dimensionality (CoD) and asymmetric matching (AM), are caused by the increasing number of APs and by APs breaking down during the online stage. In this paper, a semi-supervised RSS dimensionality reduction algorithm is proposed to solve these two problems at the same time, and the theoretical realization of the proposed method is analyzed in detail. Another significant contribution of this paper is combining the fingerprint based algorithm with the CM-SDE algorithm to improve the accuracy of indoor localization. |
Tasks | Dimensionality Reduction |
Published | 2017-04-12 |
URL | http://arxiv.org/abs/1704.03639v1 |
http://arxiv.org/pdf/1704.03639v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-semi-supervised-rss-dimensionality |
Repo | |
Framework | |
Multi-Resolution Fully Convolutional Neural Networks for Monaural Audio Source Separation
Title | Multi-Resolution Fully Convolutional Neural Networks for Monaural Audio Source Separation |
Authors | Emad M. Grais, Hagen Wierstorf, Dominic Ward, Mark D. Plumbley |
Abstract | In deep neural networks with convolutional layers, each layer typically has a fixed-size/single-resolution receptive field (RF). Convolutional layers with a large RF capture global information from the input features, while layers with a small RF capture local details with high resolution from the input features. In this work, we introduce novel deep multi-resolution fully convolutional neural networks (MR-FCNN), where each layer has different RF sizes to extract multi-resolution features that capture both the global information and the local details of its input features. The proposed MR-FCNN is applied to separate a target audio source from a mixture of many audio sources. Experimental results show that using MR-FCNN improves the performance compared to feedforward deep neural networks (DNNs) and single resolution deep fully convolutional neural networks (FCNNs) on the audio source separation problem. |
Tasks | |
Published | 2017-10-28 |
URL | http://arxiv.org/abs/1710.11473v1 |
http://arxiv.org/pdf/1710.11473v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-resolution-fully-convolutional-neural |
Repo | |
Framework | |
Parallel Markov Chain Monte Carlo for the Indian Buffet Process
Title | Parallel Markov Chain Monte Carlo for the Indian Buffet Process |
Authors | Michael M. Zhang, Avinava Dubey, Sinead A. Williamson |
Abstract | Indian Buffet Process based models are an elegant way for discovering underlying features within a data set, but inference in such models can be slow. Inferring underlying features using Markov chain Monte Carlo either relies on an uncollapsed representation, which leads to poor mixing, or on a collapsed representation, which leads to a quadratic increase in computational complexity. Existing attempts at distributing inference have introduced additional approximation within the inference procedure. In this paper we present a novel algorithm to perform asymptotically exact parallel Markov chain Monte Carlo inference for Indian Buffet Process models. We take advantage of the fact that the features are conditionally independent under the beta-Bernoulli process. Because of this conditional independence, we can partition the features into two parts: one part containing only the finitely many instantiated features and the other part containing the infinite tail of uninstantiated features. For the finite partition, parallel inference is simple given the instantiation of features. But for the infinite tail, performing uncollapsed MCMC leads to poor mixing and hence we collapse out the features. The resulting hybrid sampler, while being parallel, produces samples asymptotically from the true posterior. |
Tasks | |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03457v1 |
http://arxiv.org/pdf/1703.03457v1.pdf | |
PWC | https://paperswithcode.com/paper/parallel-markov-chain-monte-carlo-for-the |
Repo | |
Framework | |
PKU-MMD: A Large Scale Benchmark for Continuous Multi-Modal Human Action Understanding
Title | PKU-MMD: A Large Scale Benchmark for Continuous Multi-Modal Human Action Understanding |
Authors | Chunhui Liu, Yueyu Hu, Yanghao Li, Sijie Song, Jiaying Liu |
Abstract | Although many 3D human activity benchmarks have been proposed, most existing action datasets focus on action recognition tasks for segmented videos. There is a lack of standard large-scale benchmarks, especially for currently popular data-hungry deep learning based methods. In this paper, we introduce a new large scale benchmark (PKU-MMD) for continuous multi-modality 3D human action understanding and cover a wide range of complex human activities with well annotated information. PKU-MMD contains 1076 long video sequences in 51 action categories, performed by 66 subjects in three camera views. It contains almost 20,000 action instances and 5.4 million frames in total. Our dataset also provides multi-modality data sources, including RGB, depth, Infrared Radiation and Skeleton. With different modalities, we conduct extensive experiments on our dataset in terms of two scenarios and evaluate different methods by various metrics, including a newly proposed evaluation protocol, 2D-AP. We believe this large-scale dataset will benefit future research on action detection for the community. |
Tasks | Action Detection, Temporal Action Localization |
Published | 2017-03-22 |
URL | http://arxiv.org/abs/1703.07475v2 |
http://arxiv.org/pdf/1703.07475v2.pdf | |
PWC | https://paperswithcode.com/paper/pku-mmd-a-large-scale-benchmark-for |
Repo | |
Framework | |
Co-salient Object Detection Based on Deep Saliency Networks and Seed Propagation over an Integrated Graph
Title | Co-salient Object Detection Based on Deep Saliency Networks and Seed Propagation over an Integrated Graph |
Authors | Dong-ju Jeong, Insung Hwang, Nam Ik Cho |
Abstract | This paper presents a co-salient object detection method to find common salient regions in a set of images. We utilize deep saliency networks to transfer co-saliency prior knowledge and better capture high-level semantic information, and the resulting initial co-saliency maps are enhanced by seed propagation steps over an integrated graph. The deep saliency networks are trained in a supervised manner to avoid online weakly supervised learning and exploit them not only to extract high-level features but also to produce both intra- and inter-image saliency maps. Through a refinement step, the initial co-saliency maps can uniformly highlight co-salient regions and locate accurate object boundaries. To handle input image groups inconsistent in size, we propose to pool multi-regional descriptors including both within-segment and within-group information. In addition, the integrated multilayer graph is constructed to find the regions that the previous steps may not detect by seed propagation with low-level descriptors. In this work, we utilize the useful complementary components of high-, low-level information, and several learning-based steps. Our experiments have demonstrated that the proposed approach outperforms comparable co-saliency detection methods on widely used public databases and can also be directly applied to co-segmentation tasks. |
Tasks | Co-Saliency Detection, Object Detection, Saliency Detection, Salient Object Detection |
Published | 2017-06-29 |
URL | http://arxiv.org/abs/1706.09650v1 |
http://arxiv.org/pdf/1706.09650v1.pdf | |
PWC | https://paperswithcode.com/paper/co-salient-object-detection-based-on-deep |
Repo | |
Framework | |
Naturally Combined Shape-Color Moment Invariants under Affine Transformations
Title | Naturally Combined Shape-Color Moment Invariants under Affine Transformations |
Authors | Ming Gong, You Hao, Hanlin Mo, Hua Li |
Abstract | We propose naturally combined shape-color affine moment invariants (SCAMI), which consider both shape and color affine transformations simultaneously in one single system. In real scenes, color and shape deformations always exist in images simultaneously. Simple shape invariants or color invariants alone are not adequate for this situation. The conventional method is just to make a simple linear combination of the two factors. Meanwhile, the manual selection of weights is a complex issue. Our construction method is based on the multiple integration framework. The integral kernel is assigned as the continued product of the shape and color invariant cores. This is the first time that an invariant to dual affine transformations of shape and color has been derived directly. The manual selection of weights is no longer necessary, and both the shape and color transformations are extended to the affine transformation group. With the various invariant cores, a set of lower-order invariants is constructed, and their completeness and independence are discussed in detail. A set of SCAMIs, called SCAMI24, is recommended, and its effectiveness and robustness have been evaluated on both synthetic and real datasets. |
Tasks | |
Published | 2017-05-31 |
URL | http://arxiv.org/abs/1705.10928v2 |
http://arxiv.org/pdf/1705.10928v2.pdf | |
PWC | https://paperswithcode.com/paper/naturally-combined-shape-color-moment |
Repo | |
Framework | |
Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds
Title | Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds |
Authors | Shusen Wang, Alex Gittens, Michael W. Mahoney |
Abstract | Kernel $k$-means clustering can correctly identify and extract a far more varied collection of cluster structures than the linear $k$-means clustering algorithm. However, kernel $k$-means clustering is computationally expensive when the non-linear feature map is high-dimensional and there are many input points. Kernel approximation, e.g., the Nyström method, has been applied in previous works to approximately solve kernel learning problems when both of the above conditions are present. This work analyzes the application of this paradigm to kernel $k$-means clustering, and shows that applying the linear $k$-means clustering algorithm to $\frac{k}{\epsilon} (1 + o(1))$ features constructed using a so-called rank-restricted Nyström approximation results in cluster assignments that satisfy a $1 + \epsilon$ approximation ratio in terms of the kernel $k$-means cost function, relative to the guarantee provided by the same algorithm without the use of the Nyström method. As part of the analysis, this work establishes a novel $1 + \epsilon$ relative-error trace norm guarantee for low-rank approximation using the rank-restricted Nyström approximation. Empirical evaluations on the $8.1$ million instance MNIST8M dataset demonstrate the scalability and usefulness of kernel $k$-means clustering with Nyström approximation. This work argues that spectral clustering using Nyström approximation (a popular and computationally efficient, but theoretically unsound approach to non-linear clustering) should be replaced with the efficient and theoretically sound combination of kernel $k$-means clustering with Nyström approximation. The superior performance of the latter approach is empirically verified. |
Tasks | |
Published | 2017-06-09 |
URL | http://arxiv.org/abs/1706.02803v4 |
http://arxiv.org/pdf/1706.02803v4.pdf | |
PWC | https://paperswithcode.com/paper/scalable-kernel-k-means-clustering-with |
Repo | |
Framework | |
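The pipeline described in the abstract above builds explicit features from a Nyström approximation and then runs linear $k$-means on them. The sketch below covers only the feature-construction step. It uses the plain Nyström map (via a Cholesky factor of the landmark kernel matrix), not the rank-restricted variant the paper analyzes. The RBF kernel, the `gamma` value, the choice of landmarks, and all function names are illustrative assumptions.

```python
import math


def rbf(x, y, gamma=0.5):
    """RBF kernel between two points given as tuples of floats."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def cholesky(w):
    """Lower-triangular factor low with low * low^T = w (w positive definite)."""
    n = len(w)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = w[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def nystrom_features(points, landmarks, gamma=0.5):
    """Map each point to a vector f with f_x . f_y = c_x^T W^{-1} c_y,
    where W is the landmark kernel matrix and c_x holds the kernel values
    between x and the landmarks."""
    w = [[rbf(p, q, gamma) for q in landmarks] for p in landmarks]
    low = cholesky(w)
    feats = []
    for x in points:
        c = [rbf(x, q, gamma) for q in landmarks]
        f = []
        # Forward substitution: solve low * f = c.
        for i in range(len(c)):
            f.append((c[i] - sum(low[i][k] * f[k] for k in range(i))) / low[i][i])
        feats.append(f)
    return feats


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
landmarks = points[:3]
feats = nystrom_features(points, landmarks)
```

Inner products of these features reproduce the kernel exactly on the landmark points and approximate it elsewhere, so any linear $k$-means routine could be run on `feats` as a stand-in for kernel $k$-means.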
High-Performance FPGA Implementation of Equivariant Adaptive Separation via Independence Algorithm for Independent Component Analysis
Title | High-Performance FPGA Implementation of Equivariant Adaptive Separation via Independence Algorithm for Independent Component Analysis |
Authors | Mahdi Nazemi, Shahin Nazarian, Massoud Pedram |
Abstract | Independent Component Analysis (ICA) is a dimensionality reduction technique that can boost efficiency of machine learning models that deal with probability density functions, e.g. Bayesian neural networks. Algorithms that implement adaptive ICA converge slower than their nonadaptive counterparts; however, they are capable of tracking changes in underlying distributions of input features. This intrinsically slow convergence of adaptive methods, combined with existing hardware implementations that operate at very low clock frequencies, necessitates fundamental improvements in both algorithm and hardware design. This paper presents an algorithm that allows efficient hardware implementation of ICA. Compared to previous work, our FPGA implementation of adaptive ICA improves clock frequency by at least one order of magnitude and throughput by at least two orders of magnitude. Our proposed algorithm is not limited to ICA and can be used in various machine learning problems that use stochastic gradient descent optimization. |
Tasks | Dimensionality Reduction |
Published | 2017-07-06 |
URL | http://arxiv.org/abs/1707.01939v1 |
http://arxiv.org/pdf/1707.01939v1.pdf | |
PWC | https://paperswithcode.com/paper/high-performance-fpga-implementation-of |
Repo | |
Framework | |