Paper Group ANR 176
A Cluster Elastic Net for Multivariate Regression. An Asynchronous Parallel Approach to Sparse Recovery. Malware Detection by Eating a Whole EXE. Universal Joint Image Clustering and Registration using Partition Information. Distributed Bayesian Piecewise Sparse Linear Models. Solving Distributed Constraint Optimization Problems Using Logic Program …
A Cluster Elastic Net for Multivariate Regression
Title | A Cluster Elastic Net for Multivariate Regression |
Authors | Bradley S. Price, Ben Sherwood |
Abstract | We propose a method for estimating coefficients in multivariate regression when there is a clustering structure to the response variables. The proposed method includes a fusion penalty, to shrink the difference in fitted values from responses in the same cluster, and an L1 penalty for simultaneous variable selection and estimation. The method can be used when the grouping structure of the response variables is known or unknown. When the clustering structure is unknown the method will simultaneously estimate the clusters of the response and the regression coefficients. Theoretical results are presented for the penalized least squares case, including asymptotic results allowing for $p \gg n$. We extend our method to the setting where the responses are binomial variables. We propose a coordinate descent algorithm for both the normal and binomial likelihood, which can easily be extended to other generalized linear model (GLM) settings. Simulations and data examples from business operations and genomics are presented to show the merits of both the least squares and binomial methods. |
Tasks | |
Published | 2017-07-12 |
URL | http://arxiv.org/abs/1707.03530v2 |
http://arxiv.org/pdf/1707.03530v2.pdf | |
PWC | https://paperswithcode.com/paper/a-cluster-elastic-net-for-multivariate |
Repo | |
Framework | |
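The abstract above describes a least squares loss combined with an L1 penalty and a within-cluster fusion penalty on fitted values. One plausible way to write such an objective is sketched below. All symbols are assumptions introduced for illustration and are not taken from the paper: $X$ is the $n \times p$ design matrix, $y_k$ and $\beta_k$ are the $k$-th response and its coefficient vector ($k = 1, \dots, r$), $D_1, \dots, D_Q$ are the response clusters, and $\lambda, \gamma \ge 0$ are tuning parameters. The scaling constants in particular may differ from the authors' formulation.

$$
\hat{B} = \arg\min_{\beta_1, \dots, \beta_r} \; \frac{1}{2n} \sum_{k=1}^{r} \lVert y_k - X\beta_k \rVert_2^2 \; + \; \lambda \sum_{k=1}^{r} \lVert \beta_k \rVert_1 \; + \; \frac{\gamma}{2} \sum_{q=1}^{Q} \sum_{l, m \in D_q} \lVert X\beta_l - X\beta_m \rVert_2^2
$$

In this sketch, setting $\gamma = 0$ reduces the problem to $r$ separate lasso regressions, while a large $\gamma$ pulls the fitted values of responses in the same cluster toward each other.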
An Asynchronous Parallel Approach to Sparse Recovery
Title | An Asynchronous Parallel Approach to Sparse Recovery |
Authors | Deanna Needell, Tina Woolf |
Abstract | Asynchronous parallel computing and sparse recovery are two areas that have received recent interest. Asynchronous algorithms are often studied to solve optimization problems where the cost function takes the form $\sum_{i=1}^M f_i(x)$, with a common assumption that each $f_i$ is sparse; that is, each $f_i$ acts only on a small number of components of $x\in\mathbb{R}^n$. Sparse recovery problems, such as compressed sensing, can be formulated as optimization problems; however, the cost functions $f_i$ are dense with respect to the components of $x$, and instead the signal $x$ is assumed to be sparse, meaning that it has only $s$ non-zeros where $s\ll n$. Here we address how one may use an asynchronous parallel architecture when the cost functions $f_i$ are not sparse in $x$, but rather the signal $x$ is sparse. We propose an asynchronous parallel approach to sparse recovery via a stochastic greedy algorithm, where multiple processors asynchronously update a vector in shared memory containing information on the estimated signal support. We include numerical simulations that illustrate the potential benefits of our proposed asynchronous method. |
Tasks | |
Published | 2017-01-12 |
URL | http://arxiv.org/abs/1701.03458v1 |
http://arxiv.org/pdf/1701.03458v1.pdf | |
PWC | https://paperswithcode.com/paper/an-asynchronous-parallel-approach-to-sparse |
Repo | |
Framework | |
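The abstract above defines a sparse signal as one with only $s$ non-zeros and refers to an estimated signal support. As a minimal sketch of those two notions, the snippet below shows the hard-thresholding step that greedy sparse-recovery methods commonly use to obtain an $s$-sparse estimate. It is not the authors' asynchronous algorithm, and the function names `hard_threshold` and `support_of` are hypothetical names chosen for this illustration.

```python
def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    if s <= 0:
        return [0.0] * len(x)
    # Indices ordered by decreasing magnitude; sorted() is stable, so ties
    # are resolved in favour of the lower index.
    order = sorted(range(len(x)), key=lambda i: -abs(x[i]))
    keep = set(order[:s])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def support_of(x):
    """Return the sorted indices of the non-zero entries of x."""
    return [i for i, v in enumerate(x) if v != 0.0]


if __name__ == "__main__":
    estimate = hard_threshold([0.1, -3.0, 0.5, 2.0], s=2)
    print(estimate, support_of(estimate))
```

In a shared-memory setting such as the one the abstract describes, the support indices returned by `support_of` are the kind of information several workers could contribute to a common vector; how that vector is updated and combined is specific to the paper and is not modelled here.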
Malware Detection by Eating a Whole EXE
Title | Malware Detection by Eating a Whole EXE |
Authors | Edward Raff, Jon Barker, Jared Sylvester, Robert Brandon, Bryan Catanzaro, Charles Nicholas |
Abstract | In this work we introduce malware detection from raw byte sequences as a fruitful research area to the larger machine learning community. Building a neural network for such a problem presents a number of interesting challenges that have not occurred in tasks such as image processing or NLP. In particular, we note that detection from raw bytes presents a sequence problem with over two million time steps and a problem where batch normalization appears to hinder the learning process. We present our initial work in building a solution to tackle this problem, which has linear complexity dependence on the sequence length, and allows for interpretable sub-regions of the binary to be identified. In doing so we will discuss the many challenges in building a neural network to process data at this scale, and the methods we used to work around them. |
Tasks | Malware Detection |
Published | 2017-10-25 |
URL | http://arxiv.org/abs/1710.09435v1 |
http://arxiv.org/pdf/1710.09435v1.pdf | |
PWC | https://paperswithcode.com/paper/malware-detection-by-eating-a-whole-exe |
Repo | |
Framework | |
Universal Joint Image Clustering and Registration using Partition Information
Title | Universal Joint Image Clustering and Registration using Partition Information |
Authors | Ravi Kiran Raman, Lav R. Varshney |
Abstract | We consider the problem of universal joint clustering and registration of images and define algorithms using multivariate information functionals. We first study registering two images using maximum mutual information and prove its asymptotic optimality. We then show the shortcomings of pairwise registration in multi-image registration, and design an asymptotically optimal algorithm based on multiinformation. Further, we define a novel multivariate information functional to perform joint clustering and registration of images, and prove consistency of the algorithm. Finally, we consider registration and clustering of numerous limited-resolution images, defining algorithms that are order-optimal in scaling of number of pixels in each image with the number of images. |
Tasks | Image Clustering, Image Registration |
Published | 2017-01-10 |
URL | http://arxiv.org/abs/1701.02776v2 |
http://arxiv.org/pdf/1701.02776v2.pdf | |
PWC | https://paperswithcode.com/paper/universal-joint-image-clustering-and |
Repo | |
Framework | |
Distributed Bayesian Piecewise Sparse Linear Models
Title | Distributed Bayesian Piecewise Sparse Linear Models |
Authors | Masato Asahara, Ryohei Fujimaki |
Abstract | The importance of interpretability of machine learning models has been increasing due to emerging enterprise predictive analytics, threats to data privacy, accountability of artificial intelligence in society, and so on. Piecewise linear models have been actively studied to achieve both accuracy and interpretability. They often produce competitive accuracy against state-of-the-art non-linear methods. In addition, their representations (i.e., rule-based segmentation plus sparse linear formula) are often preferred by domain experts. A disadvantage of such models, however, is high computational cost for simultaneous determinations of the number of “pieces” and cardinality of each linear predictor, which has restricted their applicability to middle-scale data sets. This paper proposes a distributed factorized asymptotic Bayesian (FAB) inference of learning piecewise sparse linear models on distributed memory architectures. The distributed FAB inference solves the simultaneous model selection issue without communicating $O(N)$ data, where $N$ is the number of training samples, and achieves linear scale-out against the number of CPU cores. Experimental results demonstrate that the distributed FAB inference achieves high prediction accuracy and performance scalability with both synthetic and benchmark data. |
Tasks | Model Selection |
Published | 2017-11-07 |
URL | http://arxiv.org/abs/1711.02368v1 |
http://arxiv.org/pdf/1711.02368v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-bayesian-piecewise-sparse-linear |
Repo | |
Framework | |
Solving Distributed Constraint Optimization Problems Using Logic Programming
Title | Solving Distributed Constraint Optimization Problems Using Logic Programming |
Authors | Tiep Le, Tran Cao Son, Enrico Pontelli, William Yeoh |
Abstract | This paper explores the use of Answer Set Programming (ASP) in solving Distributed Constraint Optimization Problems (DCOPs). The paper provides the following novel contributions: (1) It shows how one can formulate DCOPs as logic programs; (2) It introduces ASP-DPOP, the first DCOP algorithm that is based on logic programming; (3) It experimentally shows that ASP-DPOP can be up to two orders of magnitude faster than DPOP (its imperative programming counterpart) as well as solve some problems that DPOP fails to solve, due to memory limitations; and (4) It demonstrates the applicability of ASP in a wide array of multi-agent problems currently modeled as DCOPs. Under consideration in Theory and Practice of Logic Programming (TPLP). |
Tasks | |
Published | 2017-05-10 |
URL | http://arxiv.org/abs/1705.03916v1 |
http://arxiv.org/pdf/1705.03916v1.pdf | |
PWC | https://paperswithcode.com/paper/solving-distributed-constraint-optimization |
Repo | |
Framework | |
Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian Nonparametric Approach
Title | Extracting Hierarchies of Search Tasks & Subtasks via a Bayesian Nonparametric Approach |
Authors | Rishabh Mehrotra, Emine Yilmaz |
Abstract | A significant number of search queries originate from some real-world information need or task. In order to improve the search experience of the end users, it is important to have accurate representations of tasks. As a result, a significant amount of research has been devoted to extracting proper representations of tasks in order to enable search systems to help users complete their tasks, as well as to provide the end user with better query suggestions, better recommendations, satisfaction prediction, and improved personalization in terms of tasks. Most existing task extraction methodologies focus on representing tasks as flat structures. However, tasks often tend to have multiple subtasks associated with them and a more naturalistic representation of tasks would be in terms of a hierarchy, where each task can be composed of multiple (sub)tasks. To this end, we propose an efficient Bayesian nonparametric model for extracting hierarchies of such tasks & subtasks. We evaluate our method based on real-world query log data both through quantitative and crowdsourced experiments and highlight the importance of considering task/subtask hierarchies. |
Tasks | |
Published | 2017-06-06 |
URL | http://arxiv.org/abs/1706.01574v2 |
http://arxiv.org/pdf/1706.01574v2.pdf | |
PWC | https://paperswithcode.com/paper/extracting-hierarchies-of-search-tasks |
Repo | |
Framework | |
Visual Speech Enhancement
Title | Visual Speech Enhancement |
Authors | Aviv Gabbay, Asaph Shamir, Shmuel Peleg |
Abstract | When video is shot in a noisy environment, the voice of a speaker seen in the video can be enhanced using the visible mouth movements, reducing background noise. While most existing methods use audio-only inputs, improved performance is obtained with our visual speech enhancement, based on an audio-visual neural network. We include in the training data videos to which we added the voice of the target speaker as background noise. Since the audio input is not sufficient to separate the voice of a speaker from his own voice, the trained model better exploits the visual input and generalizes well to different noise types. The proposed model outperforms prior audio-visual methods on two public lipreading datasets. It is also the first to be demonstrated on a dataset not designed for lipreading, such as the weekly addresses of Barack Obama. |
Tasks | Lipreading, Speech Enhancement |
Published | 2017-11-23 |
URL | http://arxiv.org/abs/1711.08789v3 |
http://arxiv.org/pdf/1711.08789v3.pdf | |
PWC | https://paperswithcode.com/paper/visual-speech-enhancement |
Repo | |
Framework | |
Sampling High Throughput Data for Anomaly Detection of Data-Base Activity
Title | Sampling High Throughput Data for Anomaly Detection of Data-Base Activity |
Authors | Hagit Grushka-Cohen, Oded Sofer, Ofer Biller, Michael Dymshits, Lior Rokach, Bracha Shapira |
Abstract | Data leakage and theft from databases are dangerous threats to organizations. Data Security and Data Privacy protection systems (DSDP) monitor data access and usage to identify leakage or suspicious activities that should be investigated. Because of the high-velocity nature of database systems, such systems audit only a portion of the vast number of transactions that take place. Anomalies are investigated by a Security Officer (SO) in order to choose the proper response. In this paper we investigate the effect of sampling methods based on the risk the transaction poses and propose a new method for “combined sampling” for capturing a more varied sample. |
Tasks | Anomaly Detection |
Published | 2017-08-14 |
URL | http://arxiv.org/abs/1708.04278v1 |
http://arxiv.org/pdf/1708.04278v1.pdf | |
PWC | https://paperswithcode.com/paper/sampling-high-throughput-data-for-anomaly |
Repo | |
Framework | |
Deep Metric Learning and Image Classification with Nearest Neighbour Gaussian Kernels
Title | Deep Metric Learning and Image Classification with Nearest Neighbour Gaussian Kernels |
Authors | Benjamin J. Meyer, Ben Harwood, Tom Drummond |
Abstract | We present a Gaussian kernel loss function and training algorithm for convolutional neural networks that can be directly applied to both distance metric learning and image classification problems. Our method treats all training features from a deep neural network as Gaussian kernel centres and computes loss by summing the influence of a feature’s nearby centres in the feature embedding space. Our approach is made scalable by treating it as an approximate nearest neighbour search problem. We show how to make end-to-end learning feasible, resulting in a well formed embedding space, in which semantically related instances are likely to be located near one another, regardless of whether or not the network was trained on those classes. Our approach outperforms state-of-the-art deep metric learning approaches on embedding learning challenges, as well as conventional softmax classification on several datasets. |
Tasks | Image Classification, Metric Learning |
Published | 2017-05-27 |
URL | http://arxiv.org/abs/1705.09780v3 |
http://arxiv.org/pdf/1705.09780v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-metric-learning-and-image-classification |
Repo | |
Framework | |
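The abstract above describes a loss computed by summing the influence of Gaussian kernels centred on nearby training features. The snippet below is a minimal sketch of that idea under stated assumptions, not the authors' implementation: the function name `gaussian_kernel_loss`, the fixed bandwidth `sigma`, and the `eps` stabiliser are all hypothetical choices, and `centres` is assumed to already hold the nearby centres that a nearest-neighbour search would return.

```python
import math


def gaussian_kernel_loss(feature, centres, labels, target, sigma=1.0):
    """Negative log of the share of kernel mass contributed by same-class centres."""
    same_class = 0.0
    total = 0.0
    for centre, label in zip(centres, labels):
        # Squared Euclidean distance between the feature and this centre.
        sq_dist = sum((f - c) ** 2 for f, c in zip(feature, centre))
        weight = math.exp(-sq_dist / (2.0 * sigma ** 2))
        total += weight
        if label == target:
            same_class += weight
    eps = 1e-12  # assumed stabiliser to avoid log(0) when all weights underflow
    return -math.log((same_class + eps) / (total + eps))


if __name__ == "__main__":
    centres = [[0.0, 0.0], [5.0, 5.0]]
    labels = [0, 1]
    print(gaussian_kernel_loss([0.1, 0.0], centres, labels, target=0))
    print(gaussian_kernel_loss([0.1, 0.0], centres, labels, target=1))
```

With these inputs the loss is close to zero when the feature sits next to a centre of its own class and large when its only nearby centre belongs to another class, which is the behaviour an embedding loss of this kind is meant to encourage.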
Monitoring tool usage in surgery videos using boosted convolutional and recurrent neural networks
Title | Monitoring tool usage in surgery videos using boosted convolutional and recurrent neural networks |
Authors | Hassan Al Hajj, Mathieu Lamard, Pierre-Henri Conze, Béatrice Cochener, Gwenolé Quellec |
Abstract | This paper investigates the automatic monitoring of tool usage during a surgery, with potential applications in report generation, surgical training and real-time decision support. Two surgeries are considered: cataract surgery, the most common surgical procedure, and cholecystectomy, one of the most common digestive surgeries. Tool usage is monitored in videos recorded either through a microscope (cataract surgery) or an endoscope (cholecystectomy). Following state-of-the-art video analysis solutions, each frame of the video is analyzed by convolutional neural networks (CNNs) whose outputs are fed to recurrent neural networks (RNNs) in order to take temporal relationships between events into account. Novelty lies in the way those CNNs and RNNs are trained. Computational complexity prevents the end-to-end training of “CNN+RNN” systems. Therefore, CNNs are usually trained first, independently from the RNNs. This approach is clearly suboptimal for surgical tool analysis: many tools are very similar to one another, but they can generally be differentiated based on past events. CNNs should be trained to extract the most useful visual features in combination with the temporal context. A novel boosting strategy is proposed to achieve this goal: the CNN and RNN parts of the system are simultaneously enriched by progressively adding weak classifiers (either CNNs or RNNs) trained to improve the overall classification accuracy. Experiments were performed on a dataset of 50 cataract surgery videos and a dataset of 80 cholecystectomy videos. Very good classification performance is achieved on both datasets: tool usage could be labeled with an average area under the ROC curve of $A_z = 0.9961$ and $A_z = 0.9939$, respectively, in offline mode (using past, present and future information), and $A_z = 0.9957$ and $A_z = 0.9936$, respectively, in online mode (using past and present information only). |
Tasks | |
Published | 2017-10-04 |
URL | http://arxiv.org/abs/1710.01559v2 |
http://arxiv.org/pdf/1710.01559v2.pdf | |
PWC | https://paperswithcode.com/paper/monitoring-tool-usage-in-surgery-videos-using |
Repo | |
Framework | |
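The abstract above reports results as an area under the ROC curve ($A_z$). As a reminder of what that number measures, the following minimal sketch computes it from per-frame scores using the pairwise (Mann-Whitney) definition: the fraction of positive/negative pairs in which the positive frame receives the higher score, with ties counted as half. The function name `roc_auc` and the toy inputs are hypothetical and unrelated to the paper's data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

The quadratic pairwise loop is chosen for clarity; for long videos a rank-based computation gives the same value in $O(n \log n)$ time.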
NEXT: A Neural Network Framework for Next POI Recommendation
Title | NEXT: A Neural Network Framework for Next POI Recommendation |
Authors | Zhiqian Zhang, Chenliang Li, Zhiyong Wu, Aixin Sun, Dengpan Ye, Xiangyang Luo |
Abstract | The task of next POI recommendation has been studied extensively in recent years. However, developing a unified recommendation framework to incorporate multiple factors associated with both POIs and users remains challenging, because of the heterogeneous nature of this information. Further, effective mechanisms to handle cold-start and endow the system with interpretability are also difficult topics. Inspired by the recent success of neural networks in many areas, in this paper, we present a simple but effective neural network framework for next POI recommendation, named NEXT. NEXT is a unified framework to learn the hidden intent regarding a user’s next move, by incorporating different factors in a unified manner. Specifically, in NEXT, we incorporate meta-data information and two kinds of temporal contexts (i.e., time interval and visit time). To leverage sequential relations and geographical influence, we propose to adopt DeepWalk, a network representation learning technique, to encode such knowledge. We evaluate the effectiveness of NEXT against state-of-the-art alternatives and neural-network-based solutions. Experimental results over three publicly available datasets demonstrate that NEXT significantly outperforms baselines in real-time next POI recommendation. Further experiments demonstrate the superiority of NEXT in handling cold-start. More importantly, we show that NEXT provides meaningful explanation of the dimensions in hidden intent space. |
Tasks | Representation Learning |
Published | 2017-04-15 |
URL | http://arxiv.org/abs/1704.04576v1 |
http://arxiv.org/pdf/1704.04576v1.pdf | |
PWC | https://paperswithcode.com/paper/170404576 |
Repo | |
Framework | |
Deep Representation Learning with Part Loss for Person Re-Identification
Title | Deep Representation Learning with Part Loss for Person Re-Identification |
Authors | Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian |
Abstract | Learning discriminative representations for unseen person images is critical for person Re-Identification (ReID). Most current approaches learn deep representations in classification tasks, which essentially minimize the empirical classification risk on the training set. As shown in our experiments, such representations commonly focus on several body parts discriminative to the training set, rather than the entire human body. Inspired by the structural risk minimization principle in SVM, we revise the traditional deep representation learning procedure to minimize both the empirical classification risk and the representation learning risk. The representation learning risk is evaluated by the proposed part loss, which automatically generates several parts for an image, and computes the person classification loss on each part separately. Compared with traditional global classification loss, simultaneously considering multiple part losses forces the deep network to focus on the entire human body and learn discriminative representations for different parts. Experimental results on three datasets, i.e., Market1501, CUHK03, VIPeR, show that our representation outperforms the existing deep representations. |
Tasks | Person Re-Identification, Representation Learning |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.00798v2 |
http://arxiv.org/pdf/1707.00798v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-representation-learning-with-part-loss |
Repo | |
Framework | |
Semantic Composition via Probabilistic Model Theory
Title | Semantic Composition via Probabilistic Model Theory |
Authors | Guy Emerson, Ann Copestake |
Abstract | Semantic composition remains an open problem for vector space models of semantics. In this paper, we explain how the probabilistic graphical model used in the framework of Functional Distributional Semantics can be interpreted as a probabilistic version of model theory. Building on this, we explain how various semantic phenomena can be recast in terms of conditional probabilities in the graphical model. This connection between formal semantics and machine learning is helpful in both directions: it gives us an explicit mechanism for modelling context-dependent meanings (a challenge for formal semantics), and also gives us well-motivated techniques for composing distributed representations (a challenge for distributional semantics). We present results on two datasets that go beyond word similarity, showing how these semantically-motivated techniques improve on the performance of vector models. |
Tasks | Semantic Composition |
Published | 2017-09-01 |
URL | http://arxiv.org/abs/1709.00226v1 |
http://arxiv.org/pdf/1709.00226v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-composition-via-probabilistic-model |
Repo | |
Framework | |
Classification-based RNN machine translation using GRUs
Title | Classification-based RNN machine translation using GRUs |
Authors | Ri Wang, Maysum Panju, Mahmood Gohari |
Abstract | We report the results of our classification-based machine translation model, built upon the framework of a recurrent neural network using gated recurrent units. Unlike other RNN models that attempt to maximize the overall conditional log probability of sentences against sentences, our model focuses on a classification approach of estimating the conditional probability of the next word given the input sequence. This simpler approach using GRUs was hoped to be comparable with more complicated RNN models, but achievements in this implementation were modest and there remains a lot of room for improving this classification approach. |
Tasks | Machine Translation |
Published | 2017-03-22 |
URL | http://arxiv.org/abs/1703.07841v1 |
http://arxiv.org/pdf/1703.07841v1.pdf | |
PWC | https://paperswithcode.com/paper/classification-based-rnn-machine-translation |
Repo | |
Framework | |