Paper Group ANR 361
cltorch: a Hardware-Agnostic Backend for the Torch Deep Neural Network Library, Based on OpenCL
Title | cltorch: a Hardware-Agnostic Backend for the Torch Deep Neural Network Library, Based on OpenCL |
Authors | Hugh Perkins |
Abstract | This paper presents cltorch, a hardware-agnostic backend for the Torch neural network framework. cltorch enables training of deep neural networks on GPUs from diverse hardware vendors, including AMD, NVIDIA, and Intel. cltorch contains sufficient implementation to run models such as AlexNet, VGG, Overfeat, and GoogleNet. It is written using the OpenCL language, a portable compute language, governed by the Khronos Group. cltorch is the top-ranked hardware-agnostic machine learning framework on Chintala’s convnet-benchmarks page. This paper presents the technical challenges encountered whilst creating the cltorch backend for Torch, and looks in detail at the challenges related to obtaining a fast hardware-agnostic implementation. The convolutional layers are identified as the key area of focus for accelerating hardware-agnostic frameworks. Possible approaches to accelerating the convolutional implementation are identified including: implementation of the convolutions using the implicitgemm or winograd algorithm, using a GEMM implementation adapted to the geometries associated with the convolutional algorithm, or using a pluggable hardware-specific convolutional implementation. |
Tasks | |
Published | 2016-06-15 |
URL | http://arxiv.org/abs/1606.04884v1 |
http://arxiv.org/pdf/1606.04884v1.pdf | |
PWC | https://paperswithcode.com/paper/cltorch-a-hardware-agnostic-backend-for-the |
Repo | |
Framework | |
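The abstract singles out a GEMM adapted to convolutional geometries as one route to faster hardware-agnostic convolutions. As a point of reference, here is a minimal NumPy sketch of the standard im2col-plus-GEMM lowering that such a kernel would accelerate; it is illustrative only, not cltorch's actual OpenCL implementation.

```python
import numpy as np

def im2col(x, kh, kw):
    """Unroll a (C, H, W) input into a (C*kh*kw, out_h*out_w) matrix so that
    convolution becomes a single GEMM (valid padding, stride 1)."""
    c, h, w = x.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = np.empty((c * kh * kw, out_h * out_w))
    idx = 0
    for ci in range(c):
        for i in range(kh):
            for j in range(kw):
                cols[idx] = x[ci, i:i + out_h, j:j + out_w].reshape(-1)
                idx += 1
    return cols, out_h, out_w

def conv2d_gemm(x, weights):
    """weights: (n_filters, C, kh, kw) -> output: (n_filters, out_h, out_w)."""
    n_f, c, kh, kw = weights.shape
    cols, out_h, out_w = im2col(x, kh, kw)
    w_mat = weights.reshape(n_f, -1)   # (n_f, C*kh*kw)
    out = w_mat @ cols                 # the GEMM whose geometry matters
    return out.reshape(n_f, out_h, out_w)

x = np.random.randn(3, 8, 8)
w = np.random.randn(4, 3, 3, 3)
print(conv2d_gemm(x, w).shape)         # (4, 6, 6)
```

The resulting GEMM has tall-skinny shapes tied to the image and kernel sizes, which is why a generic GEMM tuned for large square matrices leaves performance on the table.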
N-opcode Analysis for Android Malware Classification and Categorization
Title | N-opcode Analysis for Android Malware Classification and Categorization |
Authors | BooJoong Kang, Suleiman Y. Yerima, Kieran McLaughlin, Sakir Sezer |
Abstract | Malware detection is a growing problem particularly on the Android mobile platform due to its increasing popularity and accessibility to numerous third party app markets. This has also been made worse by the increasingly sophisticated detection avoidance techniques employed by emerging malware families. This calls for more effective techniques for detection and classification of Android malware. Hence, in this paper we present an n-opcode analysis based approach that utilizes machine learning to classify and categorize Android malware. This approach enables automated feature discovery that eliminates the need for applying expert or domain knowledge to define the needed features. Our experiments on 2520 samples that were performed using up to 10-gram opcode features showed that an f-measure of 98% is achievable using this approach. |
Tasks | Malware Classification, Malware Detection |
Published | 2016-07-27 |
URL | http://arxiv.org/abs/1607.08149v1 |
http://arxiv.org/pdf/1607.08149v1.pdf | |
PWC | https://paperswithcode.com/paper/n-opcode-analysis-for-android-malware |
Repo | |
Framework | |
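A minimal sketch of the n-opcode featurization pipeline the abstract describes, assuming opcode sequences have already been extracted from disassembled APKs; the opcode traces and labels below are toy placeholders, not data from the paper's 2520-sample corpus.

```python
from collections import Counter

from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC

def opcode_ngrams(opcodes, n=2):
    """Bag of opcode n-grams for one app, keyed by space-joined mnemonics;
    the features are discovered automatically, with no expert-defined list."""
    grams = zip(*(opcodes[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)

# Hypothetical opcode traces; real ones come from disassembled .dex files.
apps = [
    ["invoke-virtual", "move-result", "const-string", "return-void"],
    ["const-string", "invoke-static", "invoke-static", "return-void"],
]
labels = [0, 1]   # 0 = benign, 1 = malware (toy labels)

X = DictVectorizer().fit_transform(opcode_ngrams(a) for a in apps)
clf = LinearSVC().fit(X, labels)
print(clf.predict(X))
```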
Ontology Verbalization using Semantic-Refinement
Title | Ontology Verbalization using Semantic-Refinement |
Authors | Vinu E. V, P Sreenivasa Kumar |
Abstract | We propose a rule-based technique to generate redundancy-free NL descriptions of OWL entities. The existing approaches which address the problem of verbalizing OWL ontologies generate NL text segments which are close to their counterpart OWL statements. Some of these approaches also perform grouping and aggregating of these NL text segments to generate a more fluent and comprehensive form of the content. Restricting our attention to descriptions of individuals and concepts, we find that the approach currently followed in the available tools is to determine the set of all logical conditions that are satisfied by the given individual/concept name and translate these conditions verbatim into corresponding NL descriptions. Human-understandability of such descriptions is affected by the presence of repetitions and redundancies, as they have high fidelity to their OWL representation. In the literature, no efforts had been taken to remove redundancies and repetitions at the logical level before generating the NL descriptions of entities, and we find this to be the main reason for the lack of readability of the generated text. Herein, we propose a technique called semantic-refinement (SR) to generate meaningful and easily understandable descriptions of individuals and concepts of a given OWL ontology. We identify the combinations of OWL/DL constructs that lead to repetitive/redundant descriptions and propose a series of refinement rules to rewrite the conditions that are satisfied by an individual/concept in a meaning-preserving manner. The reduced set of conditions is then employed for generating NL descriptions. Our experiments show that SR leads to significantly improved descriptions of ontology entities. We also test the effectiveness and usefulness of the generated descriptions for the purpose of validating the ontologies and find that the proposed technique is indeed helpful in this context. |
Tasks | |
Published | 2016-10-31 |
URL | http://arxiv.org/abs/1610.09964v1 |
http://arxiv.org/pdf/1610.09964v1.pdf | |
PWC | https://paperswithcode.com/paper/ontology-verbalization-using-semantic |
Repo | |
Framework | |
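To make the idea of a meaning-preserving refinement rule concrete, here is a toy sketch of one plausible rewrite: dropping an existential condition when a strictly more specific condition on the same role is also present. The encoding and the rule are illustrative assumptions; the paper's SR rules target specific OWL/DL construct combinations not reproduced here.

```python
def refine(conditions):
    """Drop a condition ('role', fillers) when another condition on the same
    role carries a strict superset of fillers. A toy analogue of one
    meaning-preserving rewrite: the dropped conjunct is logically entailed
    by the more specific one, so the NL description loses a redundancy,
    not information."""
    conds = set(conditions)
    redundant = {c for c in conds
                 for d in conds
                 if c != d and c[0] == d[0] and c[1] < d[1]}
    return conds - redundant

# ('hasPart', frozenset({...})) encodes an existential restriction on hasPart
# whose filler is the conjunction of the named concepts.
conds = {("hasPart", frozenset({"Engine"})),
         ("hasPart", frozenset({"Engine", "Diesel"}))}
print(refine(conds))   # keeps only the more specific condition
```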
A statistical framework for fair predictive algorithms
Title | A statistical framework for fair predictive algorithms |
Authors | Kristian Lum, James Johndrow |
Abstract | Predictive modeling is increasingly being employed to assist human decision-makers. One purported advantage of replacing human judgment with computer models in high-stakes settings, such as sentencing, hiring, policing, college admissions, and parole decisions, is the perceived “neutrality” of computers. It is argued that because computer models do not hold personal prejudice, the predictions they produce will be equally free from prejudice. There is growing recognition that employing algorithms does not remove the potential for bias, and can even amplify it, since training data were inevitably generated by a process that is itself biased. In this paper, we provide a probabilistic definition of algorithmic bias. We propose a method to remove bias from predictive models by removing all information regarding protected variables from the permitted training data. Unlike previous work in this area, our framework is general enough to accommodate arbitrary data types, e.g. binary, continuous, etc. Motivated by models currently in use in the criminal justice system that inform decisions on pre-trial release and paroling, we apply our proposed method to a dataset on the criminal histories of individuals at the time of sentencing to produce “race-neutral” predictions of re-arrest. In the process, we demonstrate that the most common approach to creating “race-neutral” models, omitting race as a covariate, still results in racially disparate predictions. We then demonstrate that the application of our proposed method to these data removes racial disparities from predictions with minimal impact on predictive accuracy. |
Tasks | |
Published | 2016-10-25 |
URL | http://arxiv.org/abs/1610.08077v1 |
http://arxiv.org/pdf/1610.08077v1.pdf | |
PWC | https://paperswithcode.com/paper/a-statistical-framework-for-fair-predictive |
Repo | |
Framework | |
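One simple way to approximate "removing all information regarding protected variables" is to replace each covariate by its residual after regressing on the protected attributes. The sketch below shows that linear special case; the paper's framework is more general (it accommodates arbitrary data types), so treat this as an illustration rather than the authors' exact procedure.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def strip_protected(X, Z):
    """Replace each covariate by its residual after regressing on the
    protected variables Z, a linear approximation of removing protected
    information before training a downstream predictor."""
    resid = np.empty_like(X, dtype=float)
    for j in range(X.shape[1]):
        model = LinearRegression().fit(Z, X[:, j])
        resid[:, j] = X[:, j] - model.predict(Z)
    return resid

rng = np.random.default_rng(0)
Z = rng.integers(0, 2, size=(200, 1)).astype(float)    # protected attribute
X = np.hstack([Z + rng.normal(size=(200, 1)),          # covariate correlated with Z
               rng.normal(size=(200, 1))])             # independent covariate
X_fair = strip_protected(X, Z)
print(np.corrcoef(X_fair[:, 0], Z[:, 0])[0, 1])        # near 0 after stripping
```

Note how the first covariate would act as a proxy for the protected attribute if left in place, which is exactly why merely omitting the protected variable as a covariate is insufficient.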
Noisy Parallel Approximate Decoding for Conditional Recurrent Language Model
Title | Noisy Parallel Approximate Decoding for Conditional Recurrent Language Model |
Authors | Kyunghyun Cho |
Abstract | Recent advances in conditional recurrent language modelling have mainly focused on network architectures (e.g., attention mechanism), learning algorithms (e.g., scheduled sampling and sequence-level training) and novel applications (e.g., image/video description generation, speech recognition, etc.). On the other hand, we notice that decoding algorithms/strategies have not been investigated as much, and it has become standard to use greedy or beam search. In this paper, we propose a novel decoding strategy motivated by an earlier observation that nonlinear hidden layers of a deep neural network stretch the data manifold. The proposed strategy is embarrassingly parallelizable without any communication overhead, while improving an existing decoding algorithm. We extensively evaluate it with attention-based neural machine translation on the task of En->Cz translation. |
Tasks | Language Modelling, Machine Translation, Speech Recognition, Video Description |
Published | 2016-05-12 |
URL | http://arxiv.org/abs/1605.03835v1 |
http://arxiv.org/pdf/1605.03835v1.pdf | |
PWC | https://paperswithcode.com/paper/noisy-parallel-approximate-decoding-for |
Repo | |
Framework | |
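A schematic rendering of the decoding strategy, assuming a generic recurrent decoder exposed as a `step(hidden, prev_token)` function (a hypothetical interface): several noisy greedy chains are decoded independently, which is why the method is embarrassingly parallelizable, and the best-scoring chain is returned. The annealing schedule below is an illustrative choice, not necessarily the paper's exact one.

```python
import numpy as np

def npad(step, h0, n_chains=8, sigma0=1.0, max_len=20, eos=0):
    """Noisy parallel approximate decoding, sketched for a generic recurrent
    decoder. step(h, y) returns (logits, new_hidden) and must accept y=None
    as the start symbol. Each chain perturbs the hidden state with annealed
    noise and decodes greedily; the best-scoring chain wins. Chains are
    independent, hence embarrassingly parallel (run sequentially here)."""
    best_seq, best_score = None, -np.inf
    for _ in range(n_chains):
        h, y, seq, score = h0.copy(), None, [], 0.0
        for t in range(max_len):
            h = h + np.random.normal(0.0, sigma0 / (t + 1), size=h.shape)
            logits, h = step(h, y)
            probs = np.exp(logits - logits.max())
            probs /= probs.sum()
            y = int(np.argmax(probs))          # greedy within a noisy chain
            score += np.log(probs[y])
            seq.append(y)
            if y == eos:
                break
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq
```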
Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates
Title | Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates |
Authors | Yuanzhi Li, Yingyu Liang, Andrej Risteski |
Abstract | Non-negative matrix factorization is a popular tool for decomposing data into feature and weight matrices under non-negativity constraints. It enjoys practical success but is poorly understood theoretically. This paper proposes an algorithm that alternates between decoding the weights and updating the features, and shows that assuming a generative model of the data, it provably recovers the ground-truth under fairly mild conditions. In particular, its only essential requirement on features is linear independence. Furthermore, the algorithm uses ReLU to exploit the non-negativity for decoding the weights, and thus can tolerate adversarial noise that can potentially be as large as the signal, and can tolerate unbiased noise much larger than the signal. The analysis relies on a carefully designed coupling between two potential functions, which we believe is of independent interest. |
Tasks | |
Published | 2016-11-11 |
URL | http://arxiv.org/abs/1611.03819v1 |
http://arxiv.org/pdf/1611.03819v1.pdf | |
PWC | https://paperswithcode.com/paper/recovery-guarantee-of-non-negative-matrix |
Repo | |
Framework | |
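The decode-then-update loop can be sketched in a few lines: weights are decoded by applying a ReLU to a linear estimate, then the features move toward the data. This is a schematic rendition under simplifying assumptions (pseudo-inverse decoding, a plain gradient step); the paper's provable variant and its conditions are more delicate.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def alternating_nmf(Y, k, iters=100, lr=0.1):
    """Alternate between ReLU decoding of the weights and a gradient-style
    feature update. The ReLU exploits non-negativity of the true weights,
    which is what lets the decoding step suppress large noise."""
    d, n = Y.shape
    A = np.abs(np.random.randn(d, k))           # feature matrix estimate
    for _ in range(iters):
        W = relu(np.linalg.pinv(A) @ Y)         # decode weights with ReLU
        A += lr * (Y - A @ W) @ W.T / n         # move features toward the data
        A = np.maximum(A, 0.0)
    return A, W
```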
A Convolutional Autoencoder for Multi-Subject fMRI Data Aggregation
Title | A Convolutional Autoencoder for Multi-Subject fMRI Data Aggregation |
Authors | Po-Hsuan Chen, Xia Zhu, Hejia Zhang, Javier S. Turek, Janice Chen, Theodore L. Willke, Uri Hasson, Peter J. Ramadge |
Abstract | Finding the most effective way to aggregate multi-subject fMRI data is a long-standing and challenging problem. It is of increasing interest in contemporary fMRI studies of human cognition due to the scarcity of data per subject and the variability of brain anatomy and functional response across subjects. Recent work on latent factor models shows promising results in this task but this approach does not preserve spatial locality in the brain. We examine two ways to combine the ideas of a factor model and a searchlight based analysis to aggregate multi-subject fMRI data while preserving spatial locality. We first do this directly by combining a recent factor method known as a shared response model with searchlight analysis. Then we design a multi-view convolutional autoencoder for the same task. Both approaches preserve spatial locality and have competitive or better performance compared with standard searchlight analysis and the shared response model applied across the whole brain. We also report a system design to handle the computational challenge of training the convolutional autoencoder. |
Tasks | |
Published | 2016-08-17 |
URL | http://arxiv.org/abs/1608.04846v1 |
http://arxiv.org/pdf/1608.04846v1.pdf | |
PWC | https://paperswithcode.com/paper/a-convolutional-autoencoder-for-multi-subject |
Repo | |
Framework | |
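A compressed illustration of the searchlight-plus-factor-model idea: within each searchlight, stack all subjects' time series and keep a small shared basis. The rank-k SVD here is a stand-in for the shared response model, so this is only a sketch of the aggregation pattern, not the paper's estimator or its convolutional autoencoder.

```python
import numpy as np

def searchlight_factor(subjects, searchlights, k=3):
    """Schematic searchlight aggregation: for each searchlight (an array of
    voxel indices), concatenate every subject's time-by-voxel data and keep
    a rank-k basis of shared time courses. Spatial locality is preserved
    because each factor model sees only one searchlight's voxels."""
    out = []
    for voxels in searchlights:
        stacked = np.hstack([s[:, voxels] for s in subjects])  # (T, n_subj*v)
        u, _, _ = np.linalg.svd(stacked, full_matrices=False)
        out.append(u[:, :k])                                   # shared time courses
    return out

T, V = 50, 100
subjects = [np.random.randn(T, V) for _ in range(4)]           # toy fMRI data
searchlights = [np.arange(i, i + 10) for i in range(0, V, 10)]
print(len(searchlight_factor(subjects, searchlights)))          # 10 searchlights
```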
Comparison of Several Sparse Recovery Methods for Low Rank Matrices with Random Samples
Title | Comparison of Several Sparse Recovery Methods for Low Rank Matrices with Random Samples |
Authors | Ashkan Esmaeili, Farokh Marvasti |
Abstract | In this paper, we will investigate the efficacy of IMAT (Iterative Method of Adaptive Thresholding) in recovering the sparse signal (parameters) for linear models with missing data. Sparse recovery arises in compressed sensing and machine learning problems and has various applications necessitating viable reconstruction methods, specifically when we work with big data. This paper will focus on comparing the power of IMAT in reconstruction of the desired sparse signal with LASSO. Additionally, we will assume the model has random missing information. Missing data has recently been of interest in big data and machine learning problems since it appears in many cases, including but not limited to medical imaging datasets, hospital datasets, and massive MIMO. The dominance of IMAT over the well-known LASSO is examined in different scenarios. Simulations and numerical results are also provided to verify the arguments. |
Tasks | |
Published | 2016-06-12 |
URL | http://arxiv.org/abs/1606.03672v1 |
http://arxiv.org/pdf/1606.03672v1.pdf | |
PWC | https://paperswithcode.com/paper/comparison-of-several-sparse-recovery-methods |
Repo | |
Framework | |
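A minimal sketch of IMAT as described: alternate a step toward consistency with the observed data with hard thresholding under an exponentially decaying threshold. Step size and threshold schedule below are illustrative choices, not tuned values from the paper.

```python
import numpy as np

def imat(y, A, iters=50, lam=0.1, t0=1.0, alpha=0.1):
    """Iterative Method of Adaptive Thresholding, sketched: a gradient step
    toward data consistency followed by hard thresholding with a decaying
    threshold t0 * exp(-alpha * k), so early iterations keep only the
    largest coefficients and later ones admit finer detail."""
    x = np.zeros(A.shape[1])
    for k in range(iters):
        x = x + lam * A.T @ (y - A @ x)          # enforce data consistency
        thr = t0 * np.exp(-alpha * k)            # adaptive (decaying) threshold
        x = np.where(np.abs(x) >= thr, x, 0.0)   # hard thresholding
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(80, 200)) / np.sqrt(80)     # rows observed at random
x_true = np.zeros(200)
x_true[:5] = 3.0 * rng.normal(size=5)            # sparse ground truth
x_hat = imat(A @ x_true, A)
print(np.linalg.norm(x_hat - x_true))            # small residual on this toy case
```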
Contravening Esotery: Cryptanalysis of Knapsack Cipher using Genetic Algorithms
Title | Contravening Esotery: Cryptanalysis of Knapsack Cipher using Genetic Algorithms |
Authors | Harmeet Singh |
Abstract | Cryptanalysis of knapsack cipher is a fascinating problem which has eluded the computing fraternity for decades. However, in most of the cases either the time complexity of the proposed algorithm is colossal or an insufficient number of samples have been taken for verification. The present work proposes a Genetic Algorithm based technique for cryptanalysis of knapsack cipher. The experiments conducted validate the technique, and the results show that it outperforms existing techniques. An extensive review has been carried out in order to find the gaps in the existing techniques. The work paves the way for the application of computational intelligence techniques to the discipline of cryptanalysis. |
Tasks | Cryptanalysis |
Published | 2016-06-20 |
URL | http://arxiv.org/abs/1606.06047v1 |
http://arxiv.org/pdf/1606.06047v1.pdf | |
PWC | https://paperswithcode.com/paper/contravening-esotery-cryptanalysis-of |
Repo | |
Framework | |
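The attack reduces to searching for a bit string whose knapsack sum matches the ciphertext, which a GA handles with fitness equal to the distance from the target sum. Below is a toy sketch with a made-up public key; the population size, crossover, and mutation settings are illustrative, not the paper's.

```python
import random

def ga_knapsack(public_key, target, pop=60, gens=200, pmut=0.02):
    """Toy GA attack on a knapsack (subset-sum) ciphertext: evolve bit
    strings whose weighted sum approaches the target ciphertext value."""
    n = len(public_key)
    fitness = lambda bits: abs(target - sum(b * k for b, k in zip(bits, public_key)))
    popu = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        popu.sort(key=fitness)
        if fitness(popu[0]) == 0:          # exact plaintext candidate found
            break
        elite = popu[: pop // 2]           # truncation selection
        children = []
        while len(children) < pop - len(elite):
            a, b = random.sample(elite, 2)
            cut = random.randrange(1, n)   # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([1 - g if random.random() < pmut else g for g in child])
        popu = elite + children
    return min(popu, key=fitness)

key = [2, 7, 11, 21, 42, 89, 180, 354]     # toy public knapsack
msg = [1, 0, 1, 1, 0, 0, 1, 0]
cipher = sum(b * k for b, k in zip(msg, key))
print(ga_knapsack(key, cipher), msg)       # recovered bits vs. true plaintext
```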
Discovering containment: from infants to machines
Title | Discovering containment: from infants to machines |
Authors | Shimon Ullman, Nimrod Dorfman, Daniel Harari |
Abstract | Current artificial learning systems can recognize thousands of visual categories, or play Go at a champion's level, but cannot explain infants' learning, in particular the ability to learn complex concepts without guidance, in a specific order. A notable example is the category of ‘containers’ and the notion of containment, one of the earliest spatial relations to be learned, starting already at 2.5 months, and preceding other common relations (e.g., support). Such spontaneous unsupervised learning stands in contrast with current highly successful computational models, which learn in a supervised manner, that is, by using large data sets of labeled examples. How can meaningful concepts be learned without guidance, and what determines the trajectory of infant learning, making some notions appear consistently earlier than others? |
Tasks | |
Published | 2016-10-30 |
URL | http://arxiv.org/abs/1610.09625v1 |
http://arxiv.org/pdf/1610.09625v1.pdf | |
PWC | https://paperswithcode.com/paper/discovering-containment-from-infants-to |
Repo | |
Framework | |
Exponentially fast convergence to (strict) equilibrium via hedging
Title | Exponentially fast convergence to (strict) equilibrium via hedging |
Authors | Johanne Cohen, Amélie Héliou, Panayotis Mertikopoulos |
Abstract | Motivated by applications to data networks where fast convergence is essential, we analyze the problem of learning in generic N-person games that admit a Nash equilibrium in pure strategies. Specifically, we consider a scenario where players interact repeatedly and try to learn from past experience by small adjustments based on local - and possibly imperfect - payoff information. For concreteness, we focus on the so-called “hedge” variant of the exponential weights algorithm where players select an action with probability proportional to the exponential of the action’s cumulative payoff over time. When players have perfect information on their mixed payoffs, the algorithm converges locally to a strict equilibrium and the rate of convergence is exponentially fast - of the order of $\mathcal{O}(\exp(-a\sum_{j=1}^{t}\gamma_{j}))$ where $a>0$ is a constant and $\gamma_{j}$ is the algorithm’s step-size. In the presence of uncertainty, convergence requires a more conservative step-size policy, but with high probability, the algorithm remains locally convergent and achieves an exponential convergence rate. |
Tasks | |
Published | 2016-07-29 |
URL | http://arxiv.org/abs/1607.08863v1 |
http://arxiv.org/pdf/1607.08863v1.pdf | |
PWC | https://paperswithcode.com/paper/exponentially-fast-convergence-to-strict |
Repo | |
Framework | |
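For reference, the hedge update the abstract describes: play each action with probability proportional to the exponential of its cumulative payoff, scaled by the step size. The sketch below uses a constant step and a dominance-solvable toy game to show the concentration on the strict equilibrium; the paper's analysis also covers decreasing step sizes and imperfect payoff information.

```python
import numpy as np

def hedge(payoff_fn, n_actions, T=1000, gamma=1.0):
    """Hedge / exponential weights: play action i with probability
    proportional to exp(gamma * cumulative payoff of i). payoff_fn maps the
    current mixed strategy to this player's payoff vector; a constant step
    size is used here for simplicity."""
    scores = np.zeros(n_actions)
    x = np.full(n_actions, 1.0 / n_actions)
    for _ in range(T):
        w = np.exp(gamma * scores - np.max(gamma * scores))  # stable softmax
        x = w / w.sum()                                      # mixed strategy
        scores += payoff_fn(x)                               # accumulate payoffs
    return x

# Toy game: action 0 strictly dominates, so the strategy should concentrate
# on it exponentially fast, mirroring local convergence to strict equilibria.
print(hedge(lambda x: np.array([1.0, 0.0]), 2))
```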
Graph based manifold regularized deep neural networks for automatic speech recognition
Title | Graph based manifold regularized deep neural networks for automatic speech recognition |
Authors | Vikrant Singh Tomar, Richard C. Rose |
Abstract | Deep neural networks (DNNs) have been successfully applied to a wide variety of acoustic modeling tasks in recent years. These include the applications of DNNs either in a discriminative feature extraction or in a hybrid acoustic modeling scenario. Despite the rapid progress in this area, a number of challenges remain in training DNNs. This paper presents an effective way of training DNNs using a manifold learning based regularization framework. In this framework, the parameters of the network are optimized to preserve underlying manifold based relationships between speech feature vectors while minimizing a measure of loss between network outputs and targets. This is achieved by incorporating manifold based locality constraints in the objective criterion of DNNs. Empirical evidence is provided to demonstrate that training a network with manifold constraints preserves structural compactness in the hidden layers of the network. Manifold regularization is applied to train bottleneck DNNs for feature extraction in hidden Markov model (HMM) based speech recognition. The experiments in this work are conducted on the Aurora-2 spoken digits and the Aurora-4 read news large vocabulary continuous speech recognition tasks. The performance is measured in terms of word error rate (WER) on these tasks. It is shown that the manifold regularized DNNs result in up to 37% reduction in WER relative to standard DNNs. |
Tasks | Large Vocabulary Continuous Speech Recognition, Speech Recognition |
Published | 2016-06-19 |
URL | http://arxiv.org/abs/1606.05925v1 |
http://arxiv.org/pdf/1606.05925v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-based-manifold-regularized-deep-neural |
Repo | |
Framework | |
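The objective the abstract describes combines the usual training loss with a locality penalty over a precomputed neighborhood graph of speech feature vectors. A minimal PyTorch sketch of such a combined loss follows; it includes only the attractive (intrinsic-graph) term, whereas the paper's discriminative variant also uses a penalty graph.

```python
import torch

def manifold_regularized_loss(outputs, targets, hidden, neighbors, weights, rho=0.1):
    """Cross-entropy plus a graph-based locality penalty: hidden
    representations of feature vectors that are neighbors on the data
    manifold are pulled together. `neighbors` holds index pairs (i, j)
    within the batch; `weights` holds their graph affinities."""
    ce = torch.nn.functional.cross_entropy(outputs, targets)
    i, j = neighbors[:, 0], neighbors[:, 1]
    locality = (weights * (hidden[i] - hidden[j]).pow(2).sum(dim=1)).mean()
    return ce + rho * locality

# Toy batch: 4 samples, 3 classes, 5-dim hidden layer, two neighbor pairs.
outputs = torch.randn(4, 3)
hidden = torch.randn(4, 5, requires_grad=True)
targets = torch.tensor([0, 1, 2, 0])
neighbors = torch.tensor([[0, 1], [2, 3]])
weights = torch.tensor([1.0, 0.5])
manifold_regularized_loss(outputs, targets, hidden, neighbors, weights).backward()
```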
Multilinear Low-Rank Tensors on Graphs & Applications
Title | Multilinear Low-Rank Tensors on Graphs & Applications |
Authors | Nauman Shahid, Francesco Grassi, Pierre Vandergheynst |
Abstract | We propose a new framework for the analysis of low-rank tensors which lies at the intersection of spectral graph theory and signal processing. As a first step, we present a new graph based low-rank decomposition which approximates the classical low-rank SVD for matrices and multi-linear SVD for tensors. Then, building on this novel decomposition we construct a general class of convex optimization problems for approximately solving low-rank tensor inverse problems, such as tensor Robust PCA. The whole framework is named ‘Multilinear Low-rank tensors on Graphs (MLRTG)'. Our theoretical analysis shows: 1) MLRTG stands on the notion of approximate stationarity of multi-dimensional signals on graphs and 2) the approximation error depends on the eigen gaps of the graphs. We demonstrate applications for a wide variety of 4 artificial and 12 real tensor datasets, such as EEG, fMRI, BCI, surveillance videos and hyperspectral images. Generalization of the tensor concepts to non-Euclidean domains, orders of magnitude speed-up, low memory requirements and significantly enhanced performance at low SNR are the key aspects of our framework. |
Tasks | EEG |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04835v1 |
http://arxiv.org/pdf/1611.04835v1.pdf | |
PWC | https://paperswithcode.com/paper/multilinear-low-rank-tensors-on-graphs |
Repo | |
Framework | |
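The core construction, projecting each tensor mode onto the low-frequency eigenvectors of that mode's graph Laplacian, can be sketched directly. The toy path-graph Laplacians below are placeholders, and the paper's full framework adds the convex recovery programs built on top of this decomposition.

```python
import numpy as np

def graph_lowrank(X, laplacians, k=5):
    """Schematic core of a graph-based low-rank decomposition: contract each
    mode of the tensor X with the first k eigenvectors of that mode's graph
    Laplacian (the low-frequency basis), yielding a small core tensor in
    place of the multilinear SVD core."""
    core = X
    for L in laplacians:
        _, vecs = np.linalg.eigh(L)       # eigenvectors, ascending eigenvalues
        P = vecs[:, :k]                   # low-frequency graph basis
        core = np.tensordot(core, P, axes=([0], [0]))  # contracted mode moves to the back
    return core

def path_laplacian(n):
    """Laplacian of a path graph on n nodes (a stand-in for real mode graphs)."""
    L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    L[0, 0] = L[-1, -1] = 1
    return L

X = np.random.randn(20, 20, 20)           # toy 3-mode tensor
core = graph_lowrank(X, [path_laplacian(20)] * 3)
print(core.shape)                          # (5, 5, 5)
```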
Prediction with a Short Memory
Title | Prediction with a Short Memory |
Authors | Vatsal Sharan, Sham Kakade, Percy Liang, Gregory Valiant |
Abstract | We consider the problem of predicting the next observation given a sequence of past observations, and consider the extent to which accurate prediction requires complex algorithms that explicitly leverage long-range dependencies. Perhaps surprisingly, our positive results show that for a broad class of sequences, there is an algorithm that predicts well on average, and bases its predictions only on the most recent few observations together with a set of simple summary statistics of the past observations. Specifically, we show that for any distribution over observations, if the mutual information between past observations and future observations is upper bounded by $I$, then a simple Markov model over the most recent $I/\epsilon$ observations obtains expected KL error $\epsilon$—and hence $\ell_1$ error $\sqrt{\epsilon}$—with respect to the optimal predictor that has access to the entire past and knows the data generating distribution. For a Hidden Markov Model with $n$ hidden states, $I$ is bounded by $\log n$, a quantity that does not depend on the mixing time, and we show that the trivial prediction algorithm based on the empirical frequencies of length $O(\log n/\epsilon)$ windows of observations achieves this error, provided the length of the sequence is $d^{\Omega(\log n/\epsilon)}$, where $d$ is the size of the observation alphabet. We also establish that this result cannot be improved upon, even for the class of HMMs, in the following two senses: First, for HMMs with $n$ hidden states, a window length of $\log n/\epsilon$ is information-theoretically necessary to achieve expected $\ell_1$ error $\sqrt{\epsilon}$. Second, the $d^{\Theta(\log n/\epsilon)}$ samples required to estimate the Markov model for an observation alphabet of size $d$ is necessary for any computationally tractable learning algorithm, assuming the hardness of strongly refuting a certain class of CSPs. |
Tasks | |
Published | 2016-12-08 |
URL | http://arxiv.org/abs/1612.02526v5 |
http://arxiv.org/pdf/1612.02526v5.pdf | |
PWC | https://paperswithcode.com/paper/prediction-with-a-short-memory |
Repo | |
Framework | |
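The "trivial prediction algorithm" in the upper bound is just an empirical Markov model over fixed-length windows, which is easy to write down; a minimal sketch follows (the toy string stands in for an observation sequence).

```python
from collections import Counter, defaultdict

def window_markov(sequence, window):
    """Estimate P(next | last `window` observations) by empirical frequencies
    of length-(window + 1) blocks in the training sequence; no long-range
    state is kept beyond the window."""
    counts = defaultdict(Counter)
    for i in range(len(sequence) - window):
        ctx = tuple(sequence[i:i + window])
        counts[ctx][sequence[i + window]] += 1
    def predict(context):
        c = counts[tuple(context[-window:])]
        total = sum(c.values())
        return {sym: n / total for sym, n in c.items()} if total else {}
    return predict

seq = "abcabcabdabcabc"
predict = window_markov(seq, window=2)
print(predict("ab"))   # mostly 'c', occasionally 'd'
```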
Facial Expression Recognition from World Wild Web
Title | Facial Expression Recognition from World Wild Web |
Authors | Ali Mollahosseini, Behzad Hassani, Michelle J. Salvador, Hojjat Abdollahi, David Chan, Mohammad H. Mahoor |
Abstract | Recognizing facial expression in a wild setting has remained a challenging task in computer vision. The World Wide Web is a good source of facial images, most of which are captured in uncontrolled conditions. In fact, the Internet is a Word Wild Web of facial images with expressions. This paper presents the results of a new study on collecting, annotating, and analyzing wild facial expressions from the web. Three search engines were queried using 1250 emotion-related keywords in six different languages and the retrieved images were mapped by two annotators to six basic expressions and neutral. Deep neural networks and noise modeling were used in three different training scenarios to determine how accurately facial expressions can be recognized when networks are trained on noisy images collected from the web using query terms (e.g., happy face, laughing man, etc.). The results of our experiments show that deep neural networks can recognize wild facial expressions with an accuracy of 82.12%. |
Tasks | Facial Expression Recognition |
Published | 2016-05-11 |
URL | http://arxiv.org/abs/1605.03639v3 |
http://arxiv.org/pdf/1605.03639v3.pdf | |
PWC | https://paperswithcode.com/paper/facial-expression-recognition-from-world-wild |
Repo | |
Framework | |