Paper Group ANR 311
TensiStrength: Stress and relaxation magnitude detection for social media texts. The Digital Synaptic Neural Substrate: Size and Quality Matters. Faster Asynchronous SGD. Elastic Net Hypergraph Learning for Image Clustering and Semi-supervised Classification. Deep Blind Compressed Sensing. Reliable Evaluation of Neural Network for Multiclass Classification of Real-world Data. Unsupervised High-level Feature Learning by Ensemble Projection for Semi-supervised Image Classification and Image Clustering. Theoretical Analysis of the $k$-Means Algorithm - A Survey. Regret Bounds for Lifelong Learning. A Residual Bootstrap for High-Dimensional Regression with Near Low-Rank Designs. NeRD: a Neural Response Divergence Approach to Visual Salience Detection. On Projected Stochastic Gradient Descent Algorithm with Weighted Averaging for Least Squares Regression. An Evolving Neuro-Fuzzy System with Online Learning/Self-learning. An extended Perona-Malik model based on probabilistic models. Quantization and Training of Low Bit-Width Convolutional Neural Networks for Object Detection.
TensiStrength: Stress and relaxation magnitude detection for social media texts
Title | TensiStrength: Stress and relaxation magnitude detection for social media texts |
Authors | Mike Thelwall |
Abstract | Computer systems need to be able to react to stress in order to perform optimally on some tasks. This article describes TensiStrength, a system to detect the strength of stress and relaxation expressed in social media text messages. TensiStrength uses a lexical approach and a set of rules to detect direct and indirect expressions of stress or relaxation, particularly in the context of transportation. It is slightly more effective than a comparable sentiment analysis program, although their similar performances occur despite differences on almost half of the tweets gathered. The effectiveness of TensiStrength depends on the nature of the tweets classified, with tweets that are rich in stress-related terms being particularly problematic. Although generic machine learning methods can give better performance than TensiStrength overall, they exploit topic-related terms in a way that may be undesirable in practical applications and that may not work as well in more focused contexts. In conclusion, TensiStrength and generic machine learning approaches work well enough to be practical choices for intelligent applications that need to take advantage of stress information, and the decision about which to use depends on the nature of the texts analysed and the purpose of the task. |
Tasks | Sentiment Analysis |
Published | 2016-07-01 |
URL | http://arxiv.org/abs/1607.00139v2 |
PDF | http://arxiv.org/pdf/1607.00139v2.pdf |
PWC | https://paperswithcode.com/paper/tensistrength-stress-and-relaxation-magnitude |
Repo | |
Framework | |
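TensiStrength's lexicon and rules are not reproduced in this listing, so the following is only a minimal sketch of the general lexicon-plus-rules scoring idea the abstract describes. The tiny term lists, booster words, and weights below are invented for illustration; the real system uses curated lexicons, more rules, and the -1..-5 / 1..5 stress and relaxation scales.

```python
import re

# Invented miniature lexicons; real TensiStrength term lists are far larger.
STRESS_LEXICON = {"stressed": -4, "delayed": -3, "angry": -4, "stuck": -3}
RELAX_LEXICON = {"relaxed": 4, "calm": 3, "smooth": 2, "finally": 2}
BOOSTERS = {"very": 1, "so": 1, "really": 1}  # intensify the following term

def tensi_score(text):
    """Return (stress, relaxation) magnitudes on -1..-5 and 1..5 scales."""
    words = re.findall(r"[a-z']+", text.lower())
    stress, relax = -1, 1  # neutral baselines
    boost = 0
    for w in words:
        if w in BOOSTERS:
            boost = BOOSTERS[w]
            continue
        if w in STRESS_LEXICON:
            stress = min(stress, max(-5, STRESS_LEXICON[w] - boost))
        if w in RELAX_LEXICON:
            relax = max(relax, min(5, RELAX_LEXICON[w] + boost))
        boost = 0
    return stress, relax

print(tensi_score("so stressed, train delayed again"))    # -> (-5, 1)
print(tensi_score("finally home, feeling very relaxed"))  # -> (-1, 5)
```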
The Digital Synaptic Neural Substrate: Size and Quality Matters
Title | The Digital Synaptic Neural Substrate: Size and Quality Matters |
Authors | Azlan Iqbal |
Abstract | We investigate the ‘Digital Synaptic Neural Substrate’ (DSNS) computational creativity approach further with respect to the size and quality of images that can be used to seed the process. In previous work we demonstrated how combining photographs of people and sequences taken from chess games between weak players can be used to generate chess problems or puzzles of higher aesthetic quality, on average, compared to alternative approaches. In this work we show experimentally that using larger images as opposed to smaller ones improves the output quality even further. The same is also true for using clearer or less corrupted images. The reasons why these things influence the DSNS process are presently not well understood and debatable, but the findings are nevertheless immediately applicable for obtaining better results. |
Tasks | |
Published | 2016-09-20 |
URL | http://arxiv.org/abs/1609.06953v1 |
PDF | http://arxiv.org/pdf/1609.06953v1.pdf |
PWC | https://paperswithcode.com/paper/the-digital-synaptic-neural-substrate-size |
Repo | |
Framework | |
Faster Asynchronous SGD
Title | Faster Asynchronous SGD |
Authors | Augustus Odena |
Abstract | Asynchronous distributed stochastic gradient descent methods have trouble converging because of stale gradients. A gradient update sent to a parameter server by a client is stale if the parameters used to calculate that gradient have since been updated on the server. Approaches have been proposed to circumvent this problem that quantify staleness in terms of the number of elapsed updates. In this work, we propose a novel method that quantifies staleness in terms of moving averages of gradient statistics. We show that this method outperforms previous methods with respect to convergence speed and scalability to many clients. We also discuss how an extension to this method can be used to dramatically reduce bandwidth costs in a distributed training context. In particular, our method allows reduction of total bandwidth usage by a factor of 5 with little impact on cost convergence. We also describe (and link to) a software library that we have used to simulate these algorithms deterministically on a single machine. |
Tasks | |
Published | 2016-01-15 |
URL | http://arxiv.org/abs/1601.04033v1 |
PDF | http://arxiv.org/pdf/1601.04033v1.pdf |
PWC | https://paperswithcode.com/paper/faster-asynchronous-sgd |
Repo | |
Framework | |
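One plausible reading of "staleness in terms of moving averages of gradient statistics" is sketched below: the server damps each incoming update by how well it agrees with an exponential moving average of recent gradients, rather than by an elapsed-update count. The agreement measure (cosine similarity against the EMA, clipped to [0, 1]) is our illustrative choice, not necessarily the paper's exact statistic.

```python
import numpy as np

class StalenessAwareServer:
    """Sketch of a parameter server that damps stale (disagreeing) updates."""

    def __init__(self, dim, lr=0.1, beta=0.9):
        self.theta = np.zeros(dim)
        self.ema_grad = np.zeros(dim)  # moving average of received gradients
        self.lr, self.beta = lr, beta

    def apply(self, grad):
        denom = np.linalg.norm(grad) * np.linalg.norm(self.ema_grad)
        agree = 1.0 if denom == 0 else max(0.0, grad @ self.ema_grad / denom)
        self.theta -= self.lr * agree * grad  # disagreeing updates are damped
        self.ema_grad = self.beta * self.ema_grad + (1 - self.beta) * grad

# Toy usage: clients send noisy gradients of f(theta) = 0.5*||theta - t*||^2.
rng = np.random.default_rng(0)
server = StalenessAwareServer(dim=2)
for _ in range(200):
    g = server.theta - np.array([1.0, -2.0]) + 0.1 * rng.standard_normal(2)
    server.apply(g)
print(server.theta)  # approaches the minimizer [1, -2]
```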
Elastic Net Hypergraph Learning for Image Clustering and Semi-supervised Classification
Title | Elastic Net Hypergraph Learning for Image Clustering and Semi-supervised Classification |
Authors | Qingshan Liu, Yubao Sun, Cantian Wang, Tongliang Liu, Dacheng Tao |
Abstract | Graph models are emerging as very effective tools for learning the complex structures and relationships hidden in data. Generally, the critical purpose of graph-oriented learning algorithms is to construct an informative graph for image clustering and classification tasks. In addition to the classical $K$-nearest-neighbor and $r$-neighborhood methods for graph construction, $l_1$-graph and its variants are emerging methods for finding the neighboring samples of a center datum, where the corresponding ingoing edge weights are simultaneously derived by the sparse reconstruction coefficients of the remaining samples. However, the pairwise links of $l_1$-graph are not capable of capturing the high-order relationships between the center datum and its prominent data in sparse reconstruction. Meanwhile, from the perspective of variable selection, the $l_1$ norm sparse constraint, regarded as a LASSO model, tends to select only one datum from a group of data that are highly correlated and to ignore the others. To simultaneously cope with these drawbacks, we propose a new elastic net hypergraph learning model, which consists of two steps. In the first step, the Robust Matrix Elastic Net model is constructed to find the canonically related samples in a somewhat greedy way, achieving the grouping effect by adding the $l_2$ penalty to the $l_1$ constraint. In the second step, a hypergraph is used to represent the high-order relationships between each datum and its prominent samples by regarding them as a hyperedge. Subsequently, the hypergraph Laplacian matrix is constructed for further analysis. New hypergraph learning algorithms, including unsupervised clustering and multi-class semi-supervised classification, are then derived. Extensive experiments on face and handwriting databases demonstrate the effectiveness of the proposed method. |
Tasks | graph construction, Image Clustering |
Published | 2016-03-03 |
URL | http://arxiv.org/abs/1603.01096v1 |
PDF | http://arxiv.org/pdf/1603.01096v1.pdf |
PWC | https://paperswithcode.com/paper/elastic-net-hypergraph-learning-for-image |
Repo | |
Framework | |
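A hedged sketch of the two-step pipeline follows: (1) per-sample elastic net coding to find each datum's group of related samples, and (2) one hyperedge per sample, combined into the normalized hypergraph Laplacian of Zhou et al. The plain scikit-learn `ElasticNet` used here is a simplified stand-in for the paper's Robust Matrix Elastic Net solver, and the hyperparameters are illustrative.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def elastic_net_hypergraph(X, alpha=0.05, l1_ratio=0.7):
    """X: (n_samples, n_features). Returns the hypergraph Laplacian (n x n)."""
    n = X.shape[0]
    H = np.zeros((n, n))  # vertex-hyperedge incidence, one hyperedge per datum
    for i in range(n):
        idx = [j for j in range(n) if j != i]
        enet = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, fit_intercept=False)
        enet.fit(X[idx].T, X[i])           # code x_i over the remaining samples
        coef = np.zeros(n)
        coef[idx] = np.abs(enet.coef_)
        H[:, i] = coef > 1e-8              # hyperedge: the selected group ...
        H[i, i] = 1.0                      # ... plus the center datum itself
    w = np.ones(n)                         # unit hyperedge weights
    Dv = H @ w                             # vertex degrees
    De = H.sum(axis=0)                     # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(Dv, 1e-12)))
    # Normalized hypergraph Laplacian (Zhou et al.), usable for spectral
    # clustering or semi-supervised label propagation.
    return np.eye(n) - Dv_inv_sqrt @ H @ np.diag(w / De) @ H.T @ Dv_inv_sqrt
```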
Deep Blind Compressed Sensing
Title | Deep Blind Compressed Sensing |
Authors | Shikha Singh, Vanika Singhal, Angshul Majumdar |
Abstract | This work addresses the problem of extracting deeply learned features directly from compressive measurements. There has been no prior work in this area. Existing deep learning tools give good results only when applied to the full signal, and usually only after preprocessing; these techniques require the signal to be reconstructed first. In this work we show that considerably better results can be obtained by learning directly in the compressed domain. This work extends the recently proposed framework of deep matrix factorization in combination with blind compressed sensing; hence the term deep blind compressed sensing. Simulation experiments have been carried out on imaging via single-pixel camera, under-sampled biomedical signals arising in wireless body area networks, and compressive hyperspectral imaging. In all cases, the superiority of the proposed deep blind compressed sensing can be observed. |
Tasks | |
Published | 2016-12-22 |
URL | http://arxiv.org/abs/1612.07453v1 |
PDF | http://arxiv.org/pdf/1612.07453v1.pdf |
PWC | https://paperswithcode.com/paper/deep-blind-compressed-sensing |
Repo | |
Framework | |
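The core idea can be sketched as fitting a two-level ("deep") matrix factorization $X \approx W_1 W_2 H$ directly to the compressive measurements $Y = AX$, so that features $H$ are learned without ever reconstructing $X$. Plain gradient descent below stands in for the paper's optimization scheme; dimensions, step size, and initialization are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, n, k1, k2 = 64, 16, 200, 32, 8
A = rng.standard_normal((m, d)) / np.sqrt(m)  # known sensing matrix
X_true = rng.standard_normal((d, n))
Y = A @ X_true                                 # only Y and A are observed

W1 = 0.1 * rng.standard_normal((d, k1))        # two "deep" dictionary levels
W2 = 0.1 * rng.standard_normal((k1, k2))
H = 0.1 * rng.standard_normal((k2, n))         # the learned deep features
lr = 1e-2
for step in range(2000):
    R = Y - A @ W1 @ W2 @ H                    # residual in measurement space
    # Gradient ascent on -0.5*||R||_F^2 (the factor 2 is absorbed into lr).
    W1 += lr * A.T @ R @ (W2 @ H).T
    W2 += lr * (A @ W1).T @ R @ H.T
    H += lr * (A @ W1 @ W2).T @ R

print("relative residual:",
      np.linalg.norm(Y - A @ W1 @ W2 @ H) / np.linalg.norm(Y))
```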
Reliable Evaluation of Neural Network for Multiclass Classification of Real-world Data
Title | Reliable Evaluation of Neural Network for Multiclass Classification of Real-world Data |
Authors | Siddharth Dinesh, Tirtharaj Dash |
Abstract | This paper presents a systematic evaluation of a Neural Network (NN) for the classification of real-world data. In the field of machine learning, it is often seen that a single parameter, namely predictive accuracy, is used to evaluate the performance of a classifier model. However, this parameter might not be reliable given a dataset with a very high level of skewness. To demonstrate such behavior, seven different types of datasets have been used to evaluate a Multilayer Perceptron (MLP) using twelve (12) different parameters, including micro- and macro-level estimation. In the present study, the most common prediction problem, ‘multiclass’ classification, has been considered. The results obtained for the different parameters on each of the datasets demonstrate interesting findings that support the usability of this set of performance evaluation parameters. |
Tasks | |
Published | 2016-11-30 |
URL | http://arxiv.org/abs/1612.00671v1 |
PDF | http://arxiv.org/pdf/1612.00671v1.pdf |
PWC | https://paperswithcode.com/paper/reliable-evaluation-of-neural-network-for |
Repo | |
Framework | |
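The skewness argument is easy to demonstrate: on an imbalanced multiclass problem, accuracy can look excellent while macro-averaged metrics expose a classifier that ignores the minority classes. The labels below are invented; the paper itself evaluates an MLP on seven real datasets with twelve such parameters.

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

y_true = [0] * 90 + [1] * 8 + [2] * 2   # heavily skewed class distribution
y_pred = [0] * 100                      # a classifier that only predicts class 0

print("accuracy :", accuracy_score(y_true, y_pred))                      # 0.90
print("macro P  :", precision_score(y_true, y_pred, average="macro",
                                    zero_division=0))                    # 0.30
print("macro R  :", recall_score(y_true, y_pred, average="macro"))       # 0.33
print("micro F1 :", f1_score(y_true, y_pred, average="micro"))           # 0.90
```

Micro-averaged scores pool all decisions and so track accuracy, while macro averages weight every class equally; reporting both is exactly the kind of multi-parameter evaluation the paper advocates.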
Unsupervised High-level Feature Learning by Ensemble Projection for Semi-supervised Image Classification and Image Clustering
Title | Unsupervised High-level Feature Learning by Ensemble Projection for Semi-supervised Image Classification and Image Clustering |
Authors | Dengxin Dai, Luc Van Gool |
Abstract | This paper investigates the problem of image classification with limited or no annotations, but abundant unlabeled data. The setting exists in many tasks such as semi-supervised image classification, image clustering, and image retrieval. Unlike previous methods, which develop or learn sophisticated regularizers for classifiers, our method learns a new image representation by exploiting the distribution patterns of all available data for the task at hand. Particularly, a rich set of visual prototypes are sampled from all available data, and are taken as surrogate classes to train discriminative classifiers; images are projected via the classifiers; the projected values, similarities to the prototypes, are stacked to build the new feature vector. The training set is noisy. Hence, in the spirit of ensemble learning we create a set of such training sets which are all diverse, leading to diverse classifiers. The method is dubbed Ensemble Projection (EP). EP captures not only the characteristics of individual images, but also the relationships among images. It is conceptually simple and computationally efficient, yet effective and flexible. Experiments on eight standard datasets show that: (1) EP outperforms previous methods for semi-supervised image classification; (2) EP produces promising results for self-taught image classification, where unlabeled samples are a random collection of images rather than being from the same distribution as the labeled ones; and (3) EP improves over the original features for image clustering. The code of the method is available on the project page. |
Tasks | Image Classification, Image Clustering, Image Retrieval, Semi-Supervised Image Classification |
Published | 2016-02-02 |
URL | http://arxiv.org/abs/1602.00955v2 |
PDF | http://arxiv.org/pdf/1602.00955v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-high-level-feature-learning-by |
Repo | |
Framework | |
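A minimal sketch of the Ensemble Projection loop: repeatedly sample a set of prototypes as surrogate classes, train a discriminative classifier on each set, and stack the class-probability responses into the new feature vector. Uniform random prototype sampling and logistic regression are simplified stand-ins here; the paper's sampling scheme and classifier choice differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ensemble_projection(X, n_ensembles=10, n_protos=5, seed=0):
    """X: (n_samples, n_features). Returns stacked EP features."""
    rng = np.random.default_rng(seed)
    feats = []
    for _ in range(n_ensembles):
        protos = rng.choice(len(X), size=n_protos, replace=False)
        # Each sampled prototype acts as one surrogate class.
        clf = LogisticRegression(max_iter=1000)
        clf.fit(X[protos], np.arange(n_protos))
        feats.append(clf.predict_proba(X))  # similarities to the prototypes
    return np.hstack(feats)                 # the new EP representation

X = np.random.default_rng(1).standard_normal((100, 20))
print(ensemble_projection(X).shape)         # (100, 50)
```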
Theoretical Analysis of the $k$-Means Algorithm - A Survey
Title | Theoretical Analysis of the $k$-Means Algorithm - A Survey |
Authors | Johannes Blömer, Christiane Lammersen, Melanie Schmidt, Christian Sohler |
Abstract | The $k$-means algorithm is one of the most widely used clustering heuristics. Despite its simplicity, analyzing its running time and quality of approximation is surprisingly difficult and can lead to deep insights that can be used to improve the algorithm. In this paper we survey the recent results in this direction as well as several extensions of the basic $k$-means method. |
Tasks | |
Published | 2016-02-26 |
URL | http://arxiv.org/abs/1602.08254v1 |
PDF | http://arxiv.org/pdf/1602.08254v1.pdf |
PWC | https://paperswithcode.com/paper/theoretical-analysis-of-the-k-means-algorithm |
Repo | |
Framework | |
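For reference, Lloyd's algorithm, the heuristic whose running time and approximation quality the survey analyzes, fits in a few lines. Random initialization is used in this sketch; the survey also covers seeding methods such as $k$-means++.

```python
import numpy as np

def lloyd(X, k, iters=100, seed=0):
    """Lloyd's k-means heuristic on X (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # Update step: each center moves to the mean of its points.
        new = np.array([X[labels == j].mean(0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):  # converged to a local optimum
            break
        centers = new
    return centers, labels
```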
Regret Bounds for Lifelong Learning
Title | Regret Bounds for Lifelong Learning |
Authors | Pierre Alquier, The Tien Mai, Massimiliano Pontil |
Abstract | We consider the problem of transfer learning in an online setting. Different tasks are presented sequentially and processed by a within-task algorithm. We propose a lifelong learning strategy which refines the underlying data representation used by the within-task algorithm, thereby transferring information from one task to the next. We show that when the within-task algorithm comes with some regret bound, our strategy inherits this good property. Our bounds are in expectation for a general loss function, and uniform for a convex loss. We discuss applications to dictionary learning and to finite sets of predictors. In the latter case, we improve previous $O(1/\sqrt{m})$ bounds to $O(1/m)$, where $m$ is the per-task sample size. |
Tasks | Dictionary Learning, Transfer Learning |
Published | 2016-10-27 |
URL | http://arxiv.org/abs/1610.08628v1 |
PDF | http://arxiv.org/pdf/1610.08628v1.pdf |
PWC | https://paperswithcode.com/paper/regret-bounds-for-lifelong-learning |
Repo | |
Framework | |
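The structure of the strategy can be sketched as two nested loops: an inner within-task online learner that fits weights on top of a shared linear representation $D$, and an outer meta-update that refines $D$ between updates. The gradient-based refinement below is an illustrative stand-in for the paper's meta-algorithm and carries none of its guarantees.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, m = 20, 5, 100                    # input dim, representation dim, samples/task
D_true = rng.standard_normal((k, d)) / np.sqrt(d)  # unknown shared representation
D = rng.standard_normal((k, d)) / np.sqrt(d)       # learner's evolving estimate

for task in range(50):                  # tasks arrive sequentially
    w_star = rng.standard_normal(k)     # task-specific weights on D_true
    w = np.zeros(k)
    for t in range(m):                  # within-task online gradient descent
        x = rng.standard_normal(d)
        y = w_star @ (D_true @ x) + 0.1 * rng.standard_normal()
        err = w @ (D @ x) - y
        w -= (0.5 / np.sqrt(t + 1)) * err * (D @ x)      # within-task step
        D -= (0.01 / (task + 1)) * err * np.outer(w, x)  # slow meta-update of D
```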
A Residual Bootstrap for High-Dimensional Regression with Near Low-Rank Designs
Title | A Residual Bootstrap for High-Dimensional Regression with Near Low-Rank Designs |
Authors | Miles E. Lopes |
Abstract | We study the residual bootstrap (RB) method in the context of high-dimensional linear regression. Specifically, we analyze the distributional approximation of linear contrasts $c^{\top}(\hat{\beta}_{\rho}-\beta)$, where $\hat{\beta}_{\rho}$ is a ridge-regression estimator. When regression coefficients are estimated via least squares, classical results show that RB consistently approximates the laws of contrasts, provided that $p\ll n$, where the design matrix is of size $n\times p$. Up to now, relatively little work has considered how additional structure in the linear model may extend the validity of RB to the setting where $p/n\asymp 1$. In this setting, we propose a version of RB that resamples residuals obtained from ridge regression. Our main structural assumption on the design matrix is that it is nearly low rank, in the sense that its singular values decay according to a power-law profile. Under a few extra technical assumptions, we derive a simple criterion for ensuring that RB consistently approximates the law of a given contrast. We then specialize this result to study confidence intervals for mean response values $X_i^{\top} \beta$, where $X_i^{\top}$ is the $i$th row of the design. More precisely, we show that conditionally on a Gaussian design with near low-rank structure, RB simultaneously approximates all of the laws $X_i^{\top}(\hat{\beta}_{\rho}-\beta)$, $i=1,\dots,n$. This result is also notable as it imposes no sparsity assumptions on $\beta$. Furthermore, since our consistency results are formulated in terms of the Mallows (Kantorovich) metric, the existence of a limiting distribution is not required. |
Tasks | |
Published | 2016-07-04 |
URL | http://arxiv.org/abs/1607.00743v1 |
PDF | http://arxiv.org/pdf/1607.00743v1.pdf |
PWC | https://paperswithcode.com/paper/a-residual-bootstrap-for-high-dimensional |
Repo | |
Framework | |
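A hedged sketch of the ridge-based residual bootstrap for a single contrast $c^{\top}\hat{\beta}_{\rho}$: fit ridge, resample centered residuals, refit on synthetic responses, and read off the bootstrap law of $c^{\top}(\hat{\beta}_{\rho}-\beta)$. The power-law column scaling is a crude proxy for the paper's near low-rank design assumption, and all sizes and the penalty $\rho$ are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, rho = 300, 250, 1.0                     # p/n close to 1, as in the paper
X = rng.standard_normal((n, p)) * (np.arange(1, p + 1) ** -0.6)  # decaying spectrum
beta = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta + rng.standard_normal(n)

def ridge(X, y, rho):
    return np.linalg.solve(X.T @ X + rho * np.eye(X.shape[1]), X.T @ y)

beta_hat = ridge(X, y, rho)
res = y - X @ beta_hat
res -= res.mean()                             # center residuals before resampling
c = X[0]                                      # contrast: mean response x_1' beta

draws = []
for _ in range(500):                          # residual bootstrap replicates
    y_star = X @ beta_hat + rng.choice(res, size=n, replace=True)
    draws.append(c @ (ridge(X, y_star, rho) - beta_hat))
print("bootstrap sd of the contrast:", np.std(draws))
```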
NeRD: a Neural Response Divergence Approach to Visual Salience Detection
Title | NeRD: a Neural Response Divergence Approach to Visual Salience Detection |
Authors | M. J. Shafiee, P. Siva, C. Scharfenberger, P. Fieguth, A. Wong |
Abstract | In this paper, a novel approach to visual salience detection via Neural Response Divergence (NeRD) is proposed, where synaptic portions of deep neural networks, previously trained for complex object recognition, are leveraged to compute low-level cues that can be used to compute image region distinctiveness. Based on this concept, an efficient visual salience detection framework is proposed using deep convolutional StochasticNets. Experimental results using the CSSD and MSRA10k natural image datasets show that the proposed NeRD approach can achieve improved performance when compared to state-of-the-art image saliency approaches, while attaining the low computational complexity necessary for near-real-time computer vision applications. |
Tasks | Object Recognition |
Published | 2016-02-04 |
URL | http://arxiv.org/abs/1602.01728v1 |
PDF | http://arxiv.org/pdf/1602.01728v1.pdf |
PWC | https://paperswithcode.com/paper/nerd-a-neural-response-divergence-approach-to |
Repo | |
Framework | |
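The response-divergence idea can be sketched as follows: push image patches through a fixed bank of convolutional filters and score each patch by how far its response vector diverges from the image-wide average response. Random filters and a squared-distance divergence stand in for the pre-trained StochasticNet responses and the paper's divergence measure.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((64, 64))                      # toy grayscale image
filters = rng.standard_normal((8, 5, 5))        # stand-in for synaptic portions

def responses(patch):
    """Filter-bank responses for one 5x5 patch."""
    return np.array([(patch * f).sum() for f in filters])

ps = 5
resp = np.array([[responses(img[i:i + ps, j:j + ps])
                  for j in range(0, 60, ps)] for i in range(0, 60, ps)])
mean_resp = resp.reshape(-1, 8).mean(0)         # image-wide average response

def divergence(r, m):                            # squared distance stands in
    return ((r - m) ** 2).sum()                  # for the paper's divergence

salience = np.array([[divergence(resp[i, j], mean_resp)
                      for j in range(resp.shape[1])]
                     for i in range(resp.shape[0])])
print(salience.shape)                            # coarse 12x12 salience map
```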
On Projected Stochastic Gradient Descent Algorithm with Weighted Averaging for Least Squares Regression
Title | On Projected Stochastic Gradient Descent Algorithm with Weighted Averaging for Least Squares Regression |
Authors | Kobi Cohen, Angelia Nedic, R. Srikant |
Abstract | The problem of least squares regression of a $d$-dimensional unknown parameter is considered. A stochastic gradient descent based algorithm with weighted iterate-averaging that uses a single pass over the data is studied and its convergence rate is analyzed. We first consider a bounded constraint set of the unknown parameter. Under some standard regularity assumptions, we provide an explicit $O(1/k)$ upper bound on the convergence rate, depending on the variance (due to the additive noise in the measurements) and the size of the constraint set. We show that the variance term dominates the error and decreases with rate $1/k$, while the term which is related to the size of the constraint set decreases with rate $\log k/k^2$. We then compare the asymptotic ratio $\rho$ between the convergence rate of the proposed scheme and the empirical risk minimizer (ERM) as the number of iterations approaches infinity. We show that $\rho\leq 4$ under some mild conditions for all $d\geq 1$. We further improve the upper bound by showing that $\rho\leq 4/3$ for the case of $d=1$ and unbounded parameter set. Simulation results demonstrate strong performance of the algorithm as compared to existing methods, and coincide with $\rho\leq 4/3$ even for large $d$ in practice. |
Tasks | |
Published | 2016-06-09 |
URL | http://arxiv.org/abs/1606.03000v1 |
PDF | http://arxiv.org/pdf/1606.03000v1.pdf |
PWC | https://paperswithcode.com/paper/on-projected-stochastic-gradient-descent |
Repo | |
Framework | |
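The algorithm's structure is easy to sketch: a single pass over the data, a diminishing step size, a Euclidean projection onto the bounded constraint set after each step, and a weighted average in which later iterates receive linearly growing weights. The step-size constant, ball radius, and noise level below are illustrative, not the paper's tuned values.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, radius = 10, 5000, 5.0
theta_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = X @ theta_true + 0.5 * rng.standard_normal(n)

theta = np.zeros(d)
avg, wsum = np.zeros(d), 0.0
for k in range(n):                          # single pass over the data
    g = (theta @ X[k] - y[k]) * X[k]        # stochastic gradient of the square loss
    theta -= (1.0 / (k + 1)) * g            # diminishing O(1/k) step size
    norm = np.linalg.norm(theta)
    if norm > radius:
        theta *= radius / norm              # projection onto the constraint ball
    w = k + 1                               # linear weights favor late iterates
    avg = (wsum * avg + w * theta) / (wsum + w)
    wsum += w
print("error of weighted average:", np.linalg.norm(avg - theta_true))
```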
An Evolving Neuro-Fuzzy System with Online Learning/Self-learning
Title | An Evolving Neuro-Fuzzy System with Online Learning/Self-learning |
Authors | Yevgeniy V. Bodyanskiy, Oleksii K. Tyshchenko, Anastasiia O. Deineko |
Abstract | An architecture of a new neuro-fuzzy system is proposed. The basic idea of this approach is to tune both synaptic weights and membership functions with the help of the supervised learning and self-learning paradigms. The approach to solving the problem has to do with evolving online neuro-fuzzy systems that can process data under uncertainty conditions. The results prove the effectiveness of the developed architecture and the learning procedure. |
Tasks | |
Published | 2016-10-20 |
URL | http://arxiv.org/abs/1610.06488v1 |
PDF | http://arxiv.org/pdf/1610.06488v1.pdf |
PWC | https://paperswithcode.com/paper/an-evolving-neuro-fuzzy-system-with-online |
Repo | |
Framework | |
An extended Perona-Malik model based on probabilistic models
Title | An extended Perona-Malik model based on probabilistic models |
Authors | Lars M. Mescheder, Dirk A. Lorenz |
Abstract | The Perona-Malik model has been very successful at restoring images from noisy input. In this paper, we reinterpret the Perona-Malik model in the language of Gaussian scale mixtures and derive some extensions of the model. Specifically, we show that the expectation-maximization (EM) algorithm applied to Gaussian scale mixtures leads to the lagged-diffusivity algorithm for computing stationary points of the Perona-Malik diffusion equations. Moreover, we show how mean field approximations to these Gaussian scale mixtures lead to a modification of the lagged-diffusivity algorithm that better captures the uncertainties in the restoration. Since this modification can be hard to compute in practice, we propose relaxations to the mean field objective to make the algorithm computationally feasible. Our numerical experiments show that this modified lagged-diffusivity algorithm often performs better at restoring textured areas and fuzzy edges than the unmodified algorithm. As a second application of the Gaussian scale mixture framework, we show how an efficient sampling procedure can be obtained for the probabilistic model, making the computation of the conditional mean and other expectations algorithmically feasible. Again, the resulting algorithm has a strong resemblance to the lagged-diffusivity algorithm. Finally, we show that a probabilistic version of the Mumford-Shah segmentation model can be obtained in the same framework with a discrete edge-prior. |
Tasks | |
Published | 2016-12-19 |
URL | http://arxiv.org/abs/1612.06176v1 |
PDF | http://arxiv.org/pdf/1612.06176v1.pdf |
PWC | https://paperswithcode.com/paper/an-extended-perona-malik-model-based-on |
Repo | |
Framework | |
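For orientation, the lagged-diffusivity scheme that the EM view recovers looks roughly as follows: freeze the diffusivity $g(|\nabla u|^2)$ at the current iterate, take a diffusion step with that frozen coefficient, and repeat. An explicit update per relinearization stands in here for the full linear solve, and the parameters are illustrative.

```python
import numpy as np

def lagged_diffusivity(f, lam=0.15, K=0.1, iters=50, tau=0.2):
    """Denoise f by Perona-Malik diffusion with lagged (frozen) diffusivity."""
    u = f.copy()
    for _ in range(iters):
        ux = np.gradient(u, axis=1)
        uy = np.gradient(u, axis=0)
        g = 1.0 / (1.0 + (ux**2 + uy**2) / K**2)  # diffusivity frozen at u
        div = np.gradient(g * ux, axis=1) + np.gradient(g * uy, axis=0)
        u = u + tau * (div - lam * (u - f))       # diffusion + data fidelity
    return u

rng = np.random.default_rng(0)
clean = np.zeros((64, 64)); clean[16:48, 16:48] = 1.0
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
print("mean abs error:", np.abs(lagged_diffusivity(noisy) - clean).mean())
```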
Quantization and Training of Low Bit-Width Convolutional Neural Networks for Object Detection
Title | Quantization and Training of Low Bit-Width Convolutional Neural Networks for Object Detection |
Authors | Penghang Yin, Shuai Zhang, Yingyong Qi, Jack Xin |
Abstract | We present LBW-Net, an efficient optimization based method for quantization and training of the low bit-width convolutional neural networks (CNNs). Specifically, we quantize the weights to zero or powers of two by minimizing the Euclidean distance between full-precision weights and quantized weights during backpropagation. We characterize the combinatorial nature of the low bit-width quantization problem. For 2-bit (ternary) CNNs, the quantization of $N$ weights can be done by an exact formula in $O(N\log N)$ complexity. When the bit-width is three and above, we further propose a semi-analytical thresholding scheme with a single free parameter for quantization that is computationally inexpensive. The free parameter is further determined by network retraining and object detection tests. LBW-Net has several desirable advantages over full-precision CNNs, including considerable memory savings, energy efficiency, and faster deployment. Our experiments on the PASCAL VOC dataset show that compared with its 32-bit floating-point counterpart, the performance of the 6-bit LBW-Net is nearly lossless in object detection tasks, and can even be better in some real-world visual scenes, while empirically enjoying more than 4$\times$ faster deployment. |
Tasks | Object Detection, Quantization |
Published | 2016-12-19 |
URL | http://arxiv.org/abs/1612.06052v2 |
PDF | http://arxiv.org/pdf/1612.06052v2.pdf |
PWC | https://paperswithcode.com/paper/quantization-and-training-of-low-bit-width |
Repo | |
Framework | |
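The exact $O(N\log N)$ ternary case admits a short sketch: for a weight vector $w$ and quantized values in $\{-\delta, 0, +\delta\}$, the optimal support of size $t$ consists of the $t$ largest magnitudes and the optimal level is their mean, so sorting once and scanning all $t$ solves the projection exactly. This reading of the formula is ours; the paper's derivation and notation may differ in detail.

```python
import numpy as np

def ternary_quantize(w):
    """Exact Euclidean projection of w onto ternary weights {-d, 0, +d}."""
    a = np.sort(np.abs(w))[::-1]         # magnitudes, descending
    csum = np.cumsum(a)
    t = np.arange(1, len(a) + 1)
    # Residual for keeping the top-t entries is ||w||^2 - csum[t]^2 / t,
    # so minimizing it means maximizing csum^2 / t over t.
    best = np.argmax(csum**2 / t)
    delta = csum[best] / (best + 1)      # optimal level: mean of kept magnitudes
    q = np.where(np.abs(w) >= a[best], np.sign(w) * delta, 0.0)
    return q, delta

w = np.random.default_rng(0).standard_normal(1000)
q, delta = ternary_quantize(w)
print("level:", delta, " distance:", np.linalg.norm(w - q))
```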