Paper Group ANR 1053
Neural tangent kernels, transportation mappings, and universal approximation. An enhanced KNN-based twin support vector machine with stable learning rules. Learning Compact Models for Planning with Exogenous Processes. RGB-D Individual Segmentation. Distance Geometry and Data Science. Robustness Of Saak Transform Against Adversarial Attacks. Ultra- …
Neural tangent kernels, transportation mappings, and universal approximation
Title | Neural tangent kernels, transportation mappings, and universal approximation |
Authors | Ziwei Ji, Matus Telgarsky, Ruicheng Xian |
Abstract | This paper establishes rates of universal approximation for the shallow neural tangent kernel (NTK): network weights are only allowed microscopic changes from random initialization, which entails that activations are mostly unchanged, and the network is nearly equivalent to its linearization. Concretely, the paper has two main contributions: a generic scheme to approximate functions with the NTK by sampling from transport mappings between the initial weights and their desired values, and the construction of transport mappings via Fourier transforms. Regarding the first contribution, the proof scheme provides another perspective on how the NTK regime arises from rescaling: redundancy in the weights due to resampling allows individual weights to be scaled down. Regarding the second contribution, the most notable transport mapping asserts that roughly $1 / \delta^{10d}$ nodes are sufficient to approximate continuous functions, where $\delta$ depends on the continuity properties of the target function. By contrast, nearly the same proof yields a bound of $1 / \delta^{2d}$ for shallow ReLU networks; this gap suggests a tantalizing direction for future work, separating shallow ReLU networks and their linearization. |
Tasks | |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06956v2 |
https://arxiv.org/pdf/1910.06956v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-tangent-kernels-transportation-1 |
Repo | |
Framework | |
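The paper's core claim, that under microscopic weight changes a shallow ReLU network is nearly its own linearization, can be checked numerically. Below is a minimal numpy sketch, not from the paper: a wide shallow ReLU network with fixed outer weights, a perturbation of size ~1/m per weight, and a comparison against the first-order Taylor expansion. The width, input dimension, and perturbation scale are illustrative choices, not the paper's constants.

```python
# Sketch: a microscopic weight change leaves a wide shallow ReLU network
# nearly equal to its linearization around random initialization (NTK regime).
import numpy as np

rng = np.random.default_rng(0)
m, d = 50_000, 5                                   # width, input dimension
W0 = rng.normal(size=(m, d))                       # random initialization
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)   # fixed outer weights

def f(W, x):                                       # shallow ReLU network
    return a @ np.maximum(W @ x, 0.0)

x = rng.normal(size=d)
dW = rng.normal(size=(m, d)) / m                   # "microscopic" change ~ 1/m

# Linearization: f(W0 + dW, x) ≈ f(W0, x) + <grad_W f(W0, x), dW>.
# ReLU is piecewise linear, so the error comes only from flipped activations,
# and with perturbations this small almost no activation flips.
grad = (a * (W0 @ x > 0))[:, None] * x[None, :]
lin = f(W0, x) + np.sum(grad * dW)
print(abs(f(W0 + dW, x) - lin))                    # tiny
```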
An enhanced KNN-based twin support vector machine with stable learning rules
Title | An enhanced KNN-based twin support vector machine with stable learning rules |
Authors | A. Mir, Jalal A. Nasiri |
Abstract | Among the extensions of twin support vector machine (TSVM), some scholars have utilized the K-nearest neighbor (KNN) graph to enhance TSVM’s classification accuracy. However, these KNN-based TSVM classifiers have two major issues: high computational cost and overfitting. To address these issues, this paper presents an enhanced regularized K-nearest neighbor based twin support vector machine (RKNN-TSVM). It has three additional advantages: (1) A weight is given to each sample based on the distance from its nearest neighbors, which further reduces the effect of noise and outliers on the output model. (2) An extra stabilizer term is added to each objective function, making the learning rules of the proposed method stable. (3) To reduce the computational cost of finding KNNs for all the samples, the location difference of multiple distances based k-nearest neighbors algorithm (LDMDBA) is embedded into the learning process. Extensive experimental results on several synthetic and benchmark datasets show the effectiveness of the proposed RKNN-TSVM in both classification accuracy and computational time. Moreover, the largest speedup achieved by the proposed method reaches 14 times. |
Tasks | |
Published | 2019-06-22 |
URL | https://arxiv.org/abs/1906.09443v1 |
https://arxiv.org/pdf/1906.09443v1.pdf | |
PWC | https://paperswithcode.com/paper/an-enhanced-knn-based-twin-support-vector |
Repo | |
Framework | |
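The distance-based sample weighting described in advantage (1) can be illustrated in a few lines. The sketch below is a hedged stand-in, not the paper's exact rule: each sample's weight decays with its mean distance to its k nearest neighbors, so outliers contribute less. The Gaussian weighting, the choice of k, and the brute-force neighbor search (which LDMDBA is designed to avoid) are all illustrative assumptions.

```python
# Sketch: down-weight samples that are far from their k nearest neighbors.
import numpy as np

def knn_weights(X, k=5):
    # brute-force pairwise distances (the paper uses LDMDBA to avoid this)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    knn_dist = np.sort(D, axis=1)[:, :k].mean(axis=1)   # mean distance to k NNs
    return np.exp(-knn_dist**2 / (2 * knn_dist.mean()**2))  # outliers -> small

X = np.vstack([np.random.randn(50, 2), [[8.0, 8.0]]])   # one obvious outlier
w = knn_weights(X)
print(w[-1], w[:-1].mean())   # the outlier's weight is far below the rest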
Learning Compact Models for Planning with Exogenous Processes
Title | Learning Compact Models for Planning with Exogenous Processes |
Authors | Rohan Chitnis, Tomás Lozano-Pérez |
Abstract | We address the problem of approximate model minimization for MDPs in which the state is partitioned into endogenous and (much larger) exogenous components. An exogenous state variable is one whose dynamics are independent of the agent’s actions. We formalize the mask-learning problem, in which the agent must choose a subset of exogenous state variables to reason about when planning; doing planning in such a reduced state space can often be significantly more efficient than planning in the full model. We then explore the various value functions at play within this setting, and describe conditions under which a policy for a reduced model will be optimal for the full MDP. The analysis leads us to a tractable approximate algorithm that draws upon the notion of mutual information among exogenous state variables. We validate our approach in simulated robotic manipulation domains where a robot is placed in a busy environment, in which there are many other agents also interacting with the objects. Visit http://tinyurl.com/chitnis-exogenous for a supplementary video. |
Tasks | |
Published | 2019-09-30 |
URL | https://arxiv.org/abs/1909.13870v1 |
https://arxiv.org/pdf/1909.13870v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-compact-models-for-planning-with |
Repo | |
Framework | |
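The approximate algorithm "draws upon the notion of mutual information among exogenous state variables"; a minimal version of that ingredient is sketched below. The histogram MI estimator and the toy variables are illustrative assumptions, not the paper's algorithm: the idea is only that variables with high MI to reward-relevant ones are worth keeping in the mask, while independent noise variables can be dropped.

```python
# Sketch: estimate pairwise mutual information among (discretized) exogenous
# state variables to decide which ones to keep in the reduced model.
import numpy as np

def mutual_info(x, y, bins=8):
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])).sum())

rng = np.random.default_rng(1)
relevant = rng.normal(size=1000)                  # variable the reward depends on
coupled = relevant + 0.3 * rng.normal(size=1000)  # correlated: worth keeping
noise = rng.normal(size=1000)                     # independent: safe to drop
print(mutual_info(relevant, coupled), mutual_info(relevant, noise))
```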
RGB-D Individual Segmentation
Title | RGB-D Individual Segmentation |
Authors | Wenqiang Xu, Yanjun Fu, Yuchen Luo, Chang Liu, Cewu Lu |
Abstract | The fine-grained recognition task deals with sub-category classification, which is important for real-world applications. In this work, we are particularly interested in segmentation at the \emph{finest-grained} level, which we name “individual segmentation”; in other words, the individual-level category has no sub-category under it. Segmentation at the individual level raises new challenges: limited training data for a single individual object, unknown backgrounds, and difficulty in exploiting depth. To address these problems, we propose a “Context Less-Aware” (CoLA) pipeline, which produces RGB-D object-predominated images with less background context and enables scale-aware training and testing with 3D information. Extensive experiments show that the proposed CoLA strategy largely outperforms baseline methods on the YCB-Video dataset and our proposed Supermarket-10K dataset. Code, trained models, and the new dataset will be published with this paper. |
Tasks | |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07641v2 |
https://arxiv.org/pdf/1910.07641v2.pdf | |
PWC | https://paperswithcode.com/paper/rgb-d-individual-segmentation |
Repo | |
Framework | |
Distance Geometry and Data Science
Title | Distance Geometry and Data Science |
Authors | Leo Liberti |
Abstract | Data are often represented as graphs. Many common tasks in data science are based on distances between entities. While some data science methodologies natively take graphs as their input, there are many more that take their input in vectorial form. In this survey we discuss the fundamental problem of mapping graphs to vectors, and its relation with mathematical programming. We discuss applications, solution methods, dimensional reduction techniques and some of their limits. We then present an application of some of these ideas to neural networks, showing that distance geometry techniques can give competitive performance with respect to more traditional graph-to-vector mappings. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08544v1 |
https://arxiv.org/pdf/1909.08544v1.pdf | |
PWC | https://paperswithcode.com/paper/distance-geometry-and-data-science |
Repo | |
Framework | |
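The "fundamental problem of mapping graphs to vectors" that the survey discusses has a classical instance that fits in a few lines: realize a graph in R^k so that Euclidean distances approximate shortest-path distances, via classical multidimensional scaling on the graph metric. The sketch below uses scipy for the graph distances; the 4-node path graph and k=2 are arbitrary illustrative choices.

```python
# Sketch: embed a graph in R^2 so Euclidean distances approximate hop distances
# (classical MDS on the shortest-path metric).
import numpy as np
from scipy.sparse.csgraph import shortest_path

A = np.array([[0, 1, 0, 0],          # a 4-node path graph
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = shortest_path(A, method="D", unweighted=True)   # hop distances

n, k = len(D), 2
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D**2) @ J                 # double centering (Gram matrix)
vals, vecs = np.linalg.eigh(B)
idx = np.argsort(vals)[::-1][:k]
X = vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))  # vertex coordinates
print(np.round(X, 2))
```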
Robustness Of Saak Transform Against Adversarial Attacks
Title | Robustness Of Saak Transform Against Adversarial Attacks |
Authors | Thiyagarajan Ramanathan, Abinaya Manimaran, Suya You, C-C Jay Kuo |
Abstract | Image classification is vulnerable to adversarial attacks. This work investigates the robustness of the Saak transform against adversarial attacks for high-performance image classification. We develop a complete image classification system based on the multi-stage Saak transform. In the Saak transform domain, clean and adversarial images exhibit different distributions at different spectral dimensions, so selecting the spectral dimensions at every stage can be viewed as an automatic denoising process. Motivated by this observation, we carefully design feature extraction, representation, and classification strategies that increase adversarial robustness. Performance on well-known datasets and attacks is demonstrated through extensive experimental evaluations. |
Tasks | Denoising, Image Classification |
Published | 2019-02-07 |
URL | http://arxiv.org/abs/1902.02826v1 |
http://arxiv.org/pdf/1902.02826v1.pdf | |
PWC | https://paperswithcode.com/paper/robustness-of-saak-transform-against |
Repo | |
Framework | |
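The observation that spectral-dimension selection acts as automatic denoising has a simple analogue in an easier transform. The sketch below uses PCA rather than the Saak transform (an explicit substitution) and illustrates only the mechanism: keeping the leading spectral dimensions discards most of the energy of a perturbation spread across all dimensions. Data dimensions and noise scale are illustrative.

```python
# Sketch: projecting onto a few leading spectral dimensions shrinks a
# full-dimensional perturbation (PCA stands in for the Saak transform here).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64)) @ rng.normal(size=(64, 64)) * 0.1
X[:, :8] += rng.normal(size=(500, 8)) * 3.0   # signal lives in few directions

U, S, Vt = np.linalg.svd(X - X.mean(0), full_matrices=False)
P = Vt[:10]                                   # keep 10 spectral dimensions

x = X[0]
x_adv = x + 0.5 * rng.normal(size=64)         # perturbation over all 64 dims
recon = lambda v: (v - X.mean(0)) @ P.T @ P + X.mean(0)
print(np.linalg.norm(recon(x_adv) - recon(x)),  # small after projection
      np.linalg.norm(x_adv - x))                # vs. original perturbation
```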
Ultra-Scalable Spectral Clustering and Ensemble Clustering
Title | Ultra-Scalable Spectral Clustering and Ensemble Clustering |
Authors | Dong Huang, Chang-Dong Wang, Jian-Sheng Wu, Jian-Huang Lai, Chee-Keong Kwoh |
Abstract | This paper focuses on scalability and robustness of spectral clustering for extremely large-scale datasets with limited resources. Two novel algorithms are proposed, namely, ultra-scalable spectral clustering (U-SPEC) and ultra-scalable ensemble clustering (U-SENC). In U-SPEC, a hybrid representative selection strategy and a fast approximation method for K-nearest representatives are proposed for the construction of a sparse affinity sub-matrix. By interpreting the sparse sub-matrix as a bipartite graph, the transfer cut is then utilized to efficiently partition the graph and obtain the clustering result. In U-SENC, multiple U-SPEC clusterers are further integrated into an ensemble clustering framework to enhance the robustness of U-SPEC while maintaining high efficiency. Based on the ensemble generated by multiple U-SPEC clusterers, a new bipartite graph is constructed between objects and base clusters and then efficiently partitioned to achieve the consensus clustering result. It is noteworthy that both U-SPEC and U-SENC have nearly linear time and space complexity, and are capable of robustly and efficiently partitioning ten-million-level nonlinearly-separable datasets on a PC with 64GB memory. Experiments on various large-scale datasets have demonstrated the scalability and robustness of our algorithms. The MATLAB code and experimental data are available at https://www.researchgate.net/publication/330760669. |
Tasks | |
Published | 2019-03-04 |
URL | http://arxiv.org/abs/1903.01057v2 |
http://arxiv.org/pdf/1903.01057v2.pdf | |
PWC | https://paperswithcode.com/paper/ultra-scalable-spectral-clustering-and |
Repo | |
Framework | |
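The efficiency gain in U-SPEC comes from building an n × p affinity to p ≪ n representatives instead of a dense n × n one. The sketch below is a deliberately simplified stand-in: random sampling replaces the paper's hybrid representative selection, brute-force nearest-representative search replaces its fast approximation, and an SVD of the normalized bipartite affinity replaces the transfer cut.

```python
# Sketch: connect each object to its K nearest of p << n representatives and
# spectrally partition the resulting bipartite affinity matrix.
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (300, 2)), rng.normal(2, 0.5, (300, 2))])
p, K = 20, 3                                    # representatives, nearest reps

reps = X[rng.choice(len(X), p, replace=False)]
D = np.linalg.norm(X[:, None] - reps[None], axis=-1)
B = np.zeros((len(X), p))
nn = np.argsort(D, axis=1)[:, :K]               # K nearest representatives
rows = np.repeat(np.arange(len(X)), K)
B[rows, nn.ravel()] = np.exp(-D[rows, nn.ravel()]**2)

# normalized bipartite affinity; left singular vectors give the embedding
d1, d2 = B.sum(1), B.sum(0)
Bn = B / np.sqrt(d1)[:, None] / np.sqrt(d2 + 1e-12)[None, :]
U, _, _ = np.linalg.svd(Bn, full_matrices=False)
labels = (U[:, 1] > 0).astype(int)              # 2-way split, 2nd vector
print(np.bincount(labels[:300]), np.bincount(labels[300:]))  # near-pure split
```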
Integrating Tensor Similarity to Enhance Clustering Performance
Title | Integrating Tensor Similarity to Enhance Clustering Performance |
Authors | Hong Peng, Jiazhou Chen, Haiyan Wang, Yu Hu, Hongmin Cai |
Abstract | Clustering aims to separate observed data into different categories. The performance of popular clustering models relies on sample-to-sample similarity. However, pairwise similarity is prone to corruption by noise or outliers, which deteriorates the subsequent clustering. A high-order relationship among samples may better capture the local manifold of the data and thus provide complementary information to guide the clustering. However, few studies have investigated the connection between high-order similarity and the usual pairwise similarity. To fill this gap, we first define a high-order tensor similarity to exploit the sample-to-sample affinity relationship. We then establish the connection between tensor similarity and pairwise similarity, proving that decomposable tensor similarity is the Kronecker product of the usual pairwise similarity, while non-decomposable tensor similarity generalizes it to provide complementary information that pairwise similarity fails to capture. Finally, the high-order tensor similarity and pairwise similarity are integrated collaboratively (IPS2) to enhance clustering performance. The proposed IPS2 is shown to perform on par with or better than state-of-the-art methods on synthetic and real-world datasets. Extensive experiments demonstrate that tensor similarity is capable of boosting the performance of classical clustering methods. |
Tasks | |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.03920v1 |
https://arxiv.org/pdf/1905.03920v1.pdf | |
PWC | https://paperswithcode.com/paper/integrating-tensor-similarity-to-enhance |
Repo | |
Framework | |
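The structural claim that decomposable tensor similarity is the Kronecker product of the pairwise similarity can be verified directly. In the sketch below, the RBF pairwise similarity and the 4-sample toy data are illustrative choices; the check itself is just index bookkeeping.

```python
# Sketch: a decomposable fourth-order similarity T[i,j,k,l] = S[i,j] * S[k,l]
# flattens exactly to the Kronecker product of S with itself.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
D2 = ((X[:, None] - X[None]) ** 2).sum(-1)
S = np.exp(-D2)                                  # usual pairwise similarity

T = S[:, :, None, None] * S[None, None, :, :]    # T[i,j,k,l] = S[i,j] * S[k,l]
flat = T.transpose(0, 2, 1, 3).reshape(16, 16)   # pair (i,k) rows, (j,l) cols
print(np.allclose(flat, np.kron(S, S)))          # True
```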
Open-ended Learning in Symmetric Zero-sum Games
Title | Open-ended Learning in Symmetric Zero-sum Games |
Authors | David Balduzzi, Marta Garnelo, Yoram Bachrach, Wojciech M. Czarnecki, Julien Perolat, Max Jaderberg, Thore Graepel |
Abstract | Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them ‘winner’ and ‘loser’. If the game is approximately transitive, then self-play generates sequences of agents of increasing strength. However, nontransitive games, such as rock-paper-scissors, can exhibit strategic cycles, and there is no longer a clear objective: we want agents to increase in strength, but against whom is unclear. In this paper, we introduce a geometric framework for formulating agent objectives in zero-sum games, in order to construct adaptive sequences of objectives that yield open-ended learning. The framework allows us to reason about population performance in nontransitive games, and enables the development of a new algorithm (rectified Nash response, PSRO_rN) that uses game-theoretic niching to construct diverse populations of effective agents, producing a stronger set of agents than existing algorithms. We apply PSRO_rN to two highly nontransitive resource allocation games and find that it consistently outperforms the existing alternatives. |
Tasks | |
Published | 2019-01-23 |
URL | https://arxiv.org/abs/1901.08106v2 |
https://arxiv.org/pdf/1901.08106v2.pdf | |
PWC | https://paperswithcode.com/paper/open-ended-learning-in-symmetric-zero-sum |
Repo | |
Framework | |
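The rectified-response idea can be made concrete on rock-paper-scissors. The toy sketch below is a schematic of the objective only, not the full PSRO_rN algorithm: each agent's training signal weights only the opponents it already beats (payoff rectified at zero), scaled by the Nash distribution over the population. The uniform Nash used here is known in closed form for RPS; in general PSRO_rN must compute it.

```python
# Sketch: Nash-weighted rectified payoffs on rock-paper-scissors, the niching
# signal behind PSRO_rN ("amplify what you do well").
import numpy as np

A = np.array([[ 0, -1,  1],    # antisymmetric payoff: rock, paper, scissors
              [ 1,  0, -1],
              [-1,  1,  0]], dtype=float)
nash = np.ones(3) / 3          # Nash equilibrium of RPS is uniform

for i in range(3):
    rectified = np.maximum(A[i], 0.0)    # keep only opponents agent i beats
    weights = rectified * nash           # Nash-weighted rectified payoff
    print(f"agent {i} trains against opponents {np.nonzero(weights)[0]}")
```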
EDEN: Enabling Energy-Efficient, High-Performance Deep Neural Network Inference Using Approximate DRAM
Title | EDEN: Enabling Energy-Efficient, High-Performance Deep Neural Network Inference Using Approximate DRAM |
Authors | Skanda Koppula, Lois Orosa, Abdullah Giray Yağlıkçı, Roknoddin Azizi, Taha Shahroodi, Konstantinos Kanellopoulos, Onur Mutlu |
Abstract | The effectiveness of deep neural networks (DNN) in vision, speech, and language processing has prompted a tremendous demand for energy-efficient high-performance DNN inference systems. Due to the increasing memory intensity of most DNN workloads, main memory can dominate the system’s energy consumption and stall time. One effective way to reduce the energy consumption and increase the performance of DNN inference systems is by using approximate memory, which operates with reduced supply voltage and reduced access latency parameters that violate standard specifications. Using approximate memory reduces reliability, leading to higher bit error rates. Fortunately, neural networks have an intrinsic capacity to tolerate increased bit errors. This can enable energy-efficient and high-performance neural network inference using approximate DRAM devices. Based on this observation, we propose EDEN, a general framework that reduces DNN energy consumption and DNN evaluation latency by using approximate DRAM devices, while strictly meeting a user-specified target DNN accuracy. EDEN relies on two key ideas: 1) retraining the DNN for a target approximate DRAM device to increase the DNN’s error tolerance, and 2) efficient mapping of the error tolerance of each individual DNN data type to a corresponding approximate DRAM partition in a way that meets the user-specified DNN accuracy requirements. We evaluate EDEN on multi-core CPUs, GPUs, and DNN accelerators with error models obtained from real approximate DRAM devices. For a target accuracy within 1% of the original DNN, our results show that EDEN enables 1) an average DRAM energy reduction of 21%, 37%, 31%, and 32% in CPU, GPU, and two DNN accelerator architectures, respectively, across a variety of DNNs, and 2) an average (maximum) speedup of 8% (17%) and 2.7% (5.5%) in CPU and GPU architectures, respectively, when evaluating latency-bound DNNs. |
Tasks | |
Published | 2019-10-12 |
URL | https://arxiv.org/abs/1910.05340v1 |
https://arxiv.org/pdf/1910.05340v1.pdf | |
PWC | https://paperswithcode.com/paper/eden-enabling-energy-efficient-high |
Repo | |
Framework | |
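EDEN's retraining step needs an error model for the approximate DRAM device; the sketch below shows the generic shape of such a model under simplifying assumptions. It flips each bit of the IEEE-754 representation of the weights independently at a given bit error rate (BER). The uniform-random flip model, the BER, and the tensor size are illustrative; the paper's error models are profiled from real devices.

```python
# Sketch: inject independent random bit flips into float32 weights at rate BER.
import numpy as np

def inject_bit_errors(w, ber, rng):
    bits = w.astype(np.float32).view(np.uint32)
    for b in range(32):                            # each bit flips independently
        mask = rng.random(bits.shape) < ber
        bits = np.where(mask, bits ^ np.uint32(1 << b), bits)
    return bits.view(np.float32)

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)
w_err = inject_bit_errors(w, ber=1e-4, rng=rng)
print(np.mean(w != w_err))      # fraction of corrupted weights ≈ 32 * BER
```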
How to Write High-quality News on Social Network? Predicting News Quality by Mining Writing Style
Title | How to Write High-quality News on Social Network? Predicting News Quality by Mining Writing Style |
Authors | Yuting Yang, Juan Cao, Mingyan Lu, Jintao Li, Chia-Wen Lin |
Abstract | The rapid development of Internet technologies has pushed traditional newspapers to report news on social networks. However, people on social networks may have different needs, which naturally raises the question: can we automatically analyze the influence of writing style on news quality and assist writers in improving it? This is challenging because writing style and ‘quality’ are hard to measure. First, we use ‘popularity’ as the measure of ‘quality’. This is natural on social networks but brings new problems: popularity is also influenced by the event and the publisher, so we design two methods to alleviate their influence. Then, we propose eight types of linguistic features (53 features in all) according to eight writing guidelines and analyze their relationship with news quality. The experimental results show that these linguistic features greatly influence news quality. Based on this, we design a news quality assessment model for social networks (SNQAM). SNQAM performs excellently at predicting quality, presents an interpretable quality score, and gives accessible suggestions on how to improve it according to the writing guidelines we refer to. |
Tasks | |
Published | 2019-02-02 |
URL | http://arxiv.org/abs/1902.00750v1 |
http://arxiv.org/pdf/1902.00750v1.pdf | |
PWC | https://paperswithcode.com/paper/how-to-write-high-quality-news-on-social |
Repo | |
Framework | |
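The listing does not spell out the 53 linguistic features, so the sketch below uses four hypothetical stand-ins of the same flavor (surface writing-style statistics computable from raw text): average sentence length, type-token ratio, question marks, and personal-pronoun counts.

```python
# Sketch: simple writing-style features of the kind derived from writing
# guidelines (illustrative stand-ins, not the paper's feature set).
import re

def style_features(text):
    words = re.findall(r"[A-Za-z']+", text.lower())
    sents = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "avg_sentence_len": len(words) / max(len(sents), 1),
        "type_token_ratio": len(set(words)) / max(len(words), 1),
        "question_marks": text.count("?"),
        "personal_pronouns": sum(w in {"i", "we", "you"} for w in words),
    }

print(style_features("Breaking news! Can you believe it? We were there."))
```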
Ranking and Classification driven Feature Learning for Person Re-identification
Title | Ranking and Classification driven Feature Learning for Person Re-identification |
Authors | Zhiguang Zhang |
Abstract | Person re-identification has attracted much research attention for its wide applications, but it remains a very challenging task because only part of the image information can be used for matching people. Most current methods use CNNs to learn embeddings that capture semantic similarity among data points. Many state-of-the-art methods use complex network structures with multiple branches that fuse multiple features during training or testing, with classification loss, triplet loss, or a combination of the two as the loss function. However, methods using triplet loss converge slowly, and pulling features of the same class as close as possible in feature space leads to poor feature stability. Building on ranking-motivated structured losses, this paper proposes a new metric learning loss function under which features of the same class are sparsely distributed within small hyperspheres and features of different classes are uniformly separated at clear angles. It also adopts a new single-branch network structure that achieves strong performance using only global features. The validity of our method is verified on the Market1501 and DukeMTMC-ReID person re-identification datasets; it achieves 90.9% rank-1 accuracy and 80.8% mAP on DukeMTMC-reID, and 95.3% rank-1 accuracy and 88.7% mAP on Market1501. Code and models are available at https://github.com/Qidian213/Ranked_Person_ReID. |
Tasks | Metric Learning, Person Re-Identification, Semantic Similarity, Semantic Textual Similarity |
Published | 2019-12-25 |
URL | https://arxiv.org/abs/1912.11630v1 |
https://arxiv.org/pdf/1912.11630v1.pdf | |
PWC | https://paperswithcode.com/paper/ranking-and-classification-driven-feature |
Repo | |
Framework | |
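The loss geometry described above (same-class features pulled only until they fit inside a small hypersphere, rather than collapsed to a point, and different-class features pushed past a margin) can be sketched directly. The radius, margin, and hinge form below are illustrative assumptions, a schematic of the objective rather than the paper's exact formulation.

```python
# Sketch: a ranking-motivated structured loss with a same-class hypersphere
# radius r (no pull once inside) and a cross-class margin (no push once past).
import numpy as np

def ranked_structured_loss(feats, labels, r=0.5, margin=1.2):
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    D = np.linalg.norm(feats[:, None] - feats[None], axis=-1)
    same = (labels[:, None] == labels[None, :]) & ~np.eye(len(labels), dtype=bool)
    diff = labels[:, None] != labels[None, :]
    pos = np.maximum(D[same] - r, 0.0)        # inside the hypersphere: no pull
    neg = np.maximum(margin - D[diff], 0.0)   # beyond the margin: no push
    return pos.mean() + neg.mean()

feats = np.random.randn(8, 16)
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(ranked_structured_loss(feats, labels))
```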
Benefits and Pitfalls of the Exponential Mechanism with Applications to Hilbert Spaces and Functional PCA
Title | Benefits and Pitfalls of the Exponential Mechanism with Applications to Hilbert Spaces and Functional PCA |
Authors | Jordan Awan, Ana Kenney, Matthew Reimherr, Aleksandra Slavković |
Abstract | The exponential mechanism is a fundamental tool of Differential Privacy (DP) due to its strong privacy guarantees and flexibility. We study its extension to settings with summaries based on infinite-dimensional outputs, such as functional data analysis, shape analysis, and nonparametric statistics. We show that one can design the mechanism with respect to a specific base measure over the output space, such as a Gaussian process. We provide a positive result that establishes a Central Limit Theorem for the exponential mechanism quite broadly. We also provide an apparent negative result, showing that the magnitude of the noise introduced for privacy is asymptotically non-negligible relative to the statistical estimation error. We develop an $\epsilon$-DP mechanism for functional principal component analysis, applicable in separable Hilbert spaces, and demonstrate its performance via simulations and applications to two datasets. |
Tasks | |
Published | 2019-01-30 |
URL | http://arxiv.org/abs/1901.10864v1 |
http://arxiv.org/pdf/1901.10864v1.pdf | |
PWC | https://paperswithcode.com/paper/benefits-and-pitfalls-of-the-exponential |
Repo | |
Framework | |
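As background for the paper's infinite-dimensional extension, the exponential mechanism in its simplest (finite, discrete) setting fits in a few lines: sample a candidate output with probability proportional to exp(ε · utility / (2 · sensitivity)). The candidates and utilities below are toy values chosen for illustration.

```python
# Sketch: the discrete exponential mechanism for eps-differential privacy.
import numpy as np

def exponential_mechanism(candidates, utility, eps, sensitivity, rng):
    scores = eps * np.asarray(utility) / (2 * sensitivity)
    probs = np.exp(scores - scores.max())   # numerically stabilized softmax
    probs /= probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]

rng = np.random.default_rng(0)
candidates = np.array([0.0, 1.0, 2.0, 3.0])
utility = [-3.0, -1.0, 0.0, -2.0]           # 2.0 is the best non-private answer
print(exponential_mechanism(candidates, utility, eps=1.0, sensitivity=1.0, rng=rng))
```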
A Frobenius norm regularization method for convolutional kernels to avoid unstable gradient problem
Title | A Frobenius norm regularization method for convolutional kernels to avoid unstable gradient problem |
Authors | Pei-Chang Guo |
Abstract | The convolutional neural network is a very important model in deep learning. If the singular values of each layer’s Jacobian are bounded around $1$ during training, the exploding/vanishing gradient problem can be avoided and the generalizability of the network improved. We propose a new penalty function for a convolutional kernel that keeps the singular values of the corresponding transformation matrix bounded around $1$, and we show how to carry out gradient-type methods with it. Since the penalty is defined on the structured transformation matrix corresponding to a convolutional kernel, this provides a new regularization method for the weights of convolutional layers. |
Tasks | |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11235v1 |
https://arxiv.org/pdf/1907.11235v1.pdf | |
PWC | https://paperswithcode.com/paper/a-frobenius-norm-regularization-method-for |
Repo | |
Framework | |
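The structure the penalty exploits is easiest to see in one dimension: a circular convolution is a linear map whose matrix is circulant in the kernel, so penalizing the Frobenius norm of MᵀM − I pushes its singular values toward 1. The paper works with the 2-D doubly-blocked structure; the 1-D case, circular padding, and signal length below are simplifying assumptions.

```python
# Sketch: Frobenius penalty ||M^T M - I||_F^2 on the circulant matrix of a
# 1-D circular convolution, bounding its singular values around 1.
import numpy as np

def circulant(kernel, n):
    k = np.zeros(n); k[:len(kernel)] = kernel
    return np.stack([np.roll(k, i) for i in range(n)])

def penalty(kernel, n=16):
    M = circulant(kernel, n)
    G = M.T @ M - np.eye(n)
    return np.sum(G**2)                     # squared Frobenius norm

kernel = np.array([0.9, 0.1, 0.0])
svals = np.linalg.svd(circulant(kernel, 16), compute_uv=False)
print(penalty(kernel), svals[[0, -1]])      # small penalty <=> svals near 1
```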
Expert-guided Regularization via Distance Metric Learning
Title | Expert-guided Regularization via Distance Metric Learning |
Authors | Shouvik Mani, Mehdi Maasoumy, Sina Pakazad, Henrik Ohlsson |
Abstract | High-dimensional prediction is a challenging problem setting for traditional statistical models. Although regularization improves model performance in high dimensions, it does not sufficiently leverage knowledge on feature importances held by domain experts. As an alternative to standard regularization techniques, we propose Distance Metric Learning Regularization (DMLreg), an approach for eliciting prior knowledge from domain experts and integrating that knowledge into a regularized linear model. First, we learn a Mahalanobis distance metric between observations from pairwise similarity comparisons provided by an expert. Then, we use the learned distance metric to place prior distributions on coefficients in a linear model. Through experimental results on a simulated high-dimensional prediction problem, we show that DMLreg leads to improvements in model performance when the domain expert is knowledgeable. |
Tasks | Metric Learning |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.03984v1 |
https://arxiv.org/pdf/1912.03984v1.pdf | |
PWC | https://paperswithcode.com/paper/expert-guided-regularization-via-distance |
Repo | |
Framework | |
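The second half of the DMLreg pipeline, turning a learned metric into a prior on coefficients, reduces to a generalized ridge regression. The sketch below is a schematic under simplifying assumptions: the Mahalanobis matrix M is hand-specified rather than learned from expert pairwise comparisons, and the closed-form solve stands in for the paper's full procedure.

```python
# Sketch: use a Mahalanobis matrix M as a prior precision on linear-model
# coefficients, i.e. solve min_w ||y - Xw||^2 + lam * w^T M w.
import numpy as np

rng = np.random.default_rng(0)
n, d = 40, 10
X = rng.normal(size=(n, d))
w_true = np.zeros(d); w_true[:2] = [2.0, -1.5]   # only two features matter
y = X @ w_true + 0.1 * rng.normal(size=n)

# the "expert" believes features 0-1 are important: penalize them less
M = np.diag([0.1, 0.1] + [10.0] * (d - 2))
lam = 1.0
w_hat = np.linalg.solve(X.T @ X + lam * M, X.T @ y)  # generalized ridge
print(np.round(w_hat, 2))   # recovers the two informative coefficients
```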