Paper Group AWR 77
Bayesian Cluster Enumeration Criterion for Unsupervised Learning
Title | Bayesian Cluster Enumeration Criterion for Unsupervised Learning |
Authors | Freweyni K. Teklehaymanot, Michael Muma, Abdelhak M. Zoubir |
Abstract | We derive a new Bayesian Information Criterion (BIC) by formulating the problem of estimating the number of clusters in an observed data set as maximization of the posterior probability of the candidate models. Given that some mild assumptions are satisfied, we provide a general BIC expression for a broad class of data distributions. This serves as a starting point when deriving the BIC for specific distributions. Along this line, we provide a closed-form BIC expression for multivariate Gaussian distributed variables. We show that incorporating the data structure of the clustering problem into the derivation of the BIC results in an expression whose penalty term is different from that of the original BIC. We propose a two-step cluster enumeration algorithm. First, a model-based unsupervised learning algorithm partitions the data according to a given set of candidate models. Subsequently, the number of clusters is determined as the one associated with the model for which the proposed BIC is maximal. The performance of the proposed two-step algorithm is tested using synthetic and real data sets. |
Tasks | |
Published | 2017-10-22 |
URL | http://arxiv.org/abs/1710.07954v3 |
http://arxiv.org/pdf/1710.07954v3.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-cluster-enumeration-criterion-for |
Repo | https://github.com/FreTekle/Bayesian-Cluster-Enumeration |
Framework | none |
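The two-step procedure in the abstract maps naturally onto existing tooling. Below is a minimal sketch using scikit-learn's `GaussianMixture`, with the caveat that sklearn's `bic()` implements the classic BIC penalty rather than the cluster-aware penalty the paper derives, so this only illustrates the enumeration structure, not the proposed criterion itself.

```python
# Sketch of the two-step enumeration loop: (1) fit a model per candidate
# cluster count, (2) pick the count whose criterion is maximal.
# NOTE: sklearn's bic() is the classic BIC, not the paper's variant.
import numpy as np
from sklearn.mixture import GaussianMixture

def enumerate_clusters(X, k_min=1, k_max=10, seed=0):
    scores = {}
    for k in range(k_min, k_max + 1):
        gmm = GaussianMixture(n_components=k, random_state=seed).fit(X)
        scores[k] = -gmm.bic(X)  # sklearn's BIC is minimized; flip the sign
    return max(scores, key=scores.get), scores

X = np.vstack([np.random.randn(100, 2) + c for c in ([0, 0], [5, 5], [0, 5])])
k_hat, _ = enumerate_clusters(X)
print(k_hat)  # expected: 3
```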
BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet
Title | BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet |
Authors | Haojin Yang, Martin Fritzsche, Christian Bartz, Christoph Meinel |
Abstract | Binary Neural Networks (BNNs) can drastically reduce memory size and accesses by applying bit-wise operations instead of standard arithmetic operations. They can therefore significantly improve efficiency and lower energy consumption at runtime, which enables the application of state-of-the-art deep learning models on low-power devices. BMXNet is an open-source BNN library based on MXNet, which supports both XNOR-Networks and Quantized Neural Networks. The developed BNN layers can be seamlessly applied with other standard library components and work in both GPU and CPU mode. BMXNet is maintained and developed by the multimedia research group at Hasso Plattner Institute and released under the Apache license. Extensive experiments validate the efficiency and effectiveness of our implementation. The BMXNet library, several sample projects, and a collection of pre-trained binary deep models are available for download at https://github.com/hpi-xnor |
Tasks | |
Published | 2017-05-27 |
URL | http://arxiv.org/abs/1705.09864v1 |
http://arxiv.org/pdf/1705.09864v1.pdf | |
PWC | https://paperswithcode.com/paper/bmxnet-an-open-source-binary-neural-network |
Repo | https://github.com/hpi-xnor/BMXNet |
Framework | mxnet |
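The bit-wise trick BMXNet builds on can be shown in a few lines: a dot product between two {-1, +1} vectors reduces to XNOR plus popcount on their bit encodings. A conceptual numpy sketch follows; it illustrates the arithmetic identity, not BMXNet's optimized GPU/CPU kernels.

```python
# Illustration of the xnor-popcount dot product that underlies BNNs.
import numpy as np

def binarize(x):
    """Map real values to {0, 1} bits representing {-1, +1}."""
    return (x >= 0).astype(np.uint8)

def xnor_dot(a_bits, b_bits):
    """Dot product of two {-1, +1} vectors given their bit encodings."""
    n = a_bits.size
    matches = np.count_nonzero(~(a_bits ^ b_bits) & 1)  # xnor + popcount
    return 2 * matches - n  # each match contributes +1, each mismatch -1

a, b = np.random.randn(64), np.random.randn(64)
assert xnor_dot(binarize(a), binarize(b)) == int(np.sign(a) @ np.sign(b))
```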
Developing a comprehensive framework for multimodal feature extraction
Title | Developing a comprehensive framework for multimodal feature extraction |
Authors | Quinten McNamara, Alejandro de la Vega, Tal Yarkoni |
Abstract | Feature extraction is a critical component of many applied data science workflows. In recent years, rapid advances in artificial intelligence and machine learning have led to an explosion of feature extraction tools and services that allow data scientists to cheaply and effectively annotate their data along a vast array of dimensions, ranging from detecting faces in images to analyzing the sentiment expressed in coherent text. Unfortunately, the proliferation of powerful feature extraction services has been mirrored by a corresponding expansion in the number of distinct interfaces to feature extraction services. In a world where nearly every new service has its own API, documentation, and/or client library, data scientists who need to combine diverse features obtained from multiple sources are often forced to write and maintain ever more elaborate feature extraction pipelines. To address this challenge, we introduce a new open-source framework for comprehensive multimodal feature extraction. Pliers is an open-source Python package that supports standardized annotation of diverse data types (video, images, audio, and text), and is expressly designed with both ease-of-use and extensibility in mind. Users can apply a wide range of pre-existing feature extraction tools to their data in just a few lines of Python code, and can also easily add their own custom extractors by writing modular classes. A graph-based API enables rapid development of complex feature extraction pipelines that output results in a single, standardized format. We describe the package’s architecture, detail its major advantages over previous feature extraction toolboxes, and use a sample application to a large functional MRI dataset to illustrate how pliers can significantly reduce the time and effort required to construct sophisticated feature extraction workflows while increasing code clarity and maintainability. |
Tasks | |
Published | 2017-02-20 |
URL | http://arxiv.org/abs/1702.06151v1 |
http://arxiv.org/pdf/1702.06151v1.pdf | |
PWC | https://paperswithcode.com/paper/developing-a-comprehensive-framework-for |
Repo | https://github.com/tyarkoni/pliers |
Framework | none |
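A hypothetical usage sketch of the "few lines of Python" workflow the abstract describes. The stimulus and extractor class names below are drawn from pliers' documented API, but exact names vary across versions, so treat the specifics as illustrative rather than authoritative.

```python
# Hedged sketch of the pliers workflow: wrap a stimulus, run an extractor,
# and merge results into the package's single standardized tabular format.
from pliers.stimuli import ImageStim
from pliers.extractors import BrightnessExtractor, merge_results

stim = ImageStim('frame.jpg')                     # any local image file
results = [BrightnessExtractor().transform(stim)] # one extractor of many
df = merge_results(results)                       # standardized DataFrame
print(df.head())
```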
A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing
Title | A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing |
Authors | Dat Quoc Nguyen, Mark Dras, Mark Johnson |
Abstract | We present a novel neural network model that learns POS tagging and graph-based dependency parsing jointly. Our model uses bidirectional LSTMs to learn feature representations shared for both POS tagging and dependency parsing tasks, thus handling the feature-engineering problem. Our extensive experiments, on 19 languages from the Universal Dependencies project, show that our model outperforms the state-of-the-art neural network-based Stack-propagation model for joint POS tagging and transition-based dependency parsing, resulting in a new state of the art. Our code is open-source and available together with pre-trained models at: https://github.com/datquocnguyen/jPTDP |
Tasks | Dependency Parsing, Feature Engineering, Part-Of-Speech Tagging, Transition-Based Dependency Parsing |
Published | 2017-05-16 |
URL | http://arxiv.org/abs/1705.05952v2 |
http://arxiv.org/pdf/1705.05952v2.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-neural-network-model-for-joint-pos |
Repo | https://github.com/datquocnguyen/jPTDP |
Framework | none |
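A schematic PyTorch re-sketch of the shared-representation idea in the abstract (the authors' released implementation differs): one BiLSTM produces features consumed by both a POS tagging head and a pairwise arc scorer for graph-based parsing. Dimensions and the bilinear arc scorer here are illustrative stand-ins.

```python
# Shared BiLSTM feeding two task heads, as the abstract describes.
import torch
import torch.nn as nn

class JointTaggerParser(nn.Module):
    def __init__(self, vocab, emb=100, hid=128, n_tags=17):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.bilstm = nn.LSTM(emb, hid, bidirectional=True, batch_first=True)
        self.tag_head = nn.Linear(2 * hid, n_tags)   # POS tagging head
        self.head_mlp = nn.Linear(2 * hid, hid)      # "head word" view
        self.dep_mlp = nn.Linear(2 * hid, hid)       # "dependent" view

    def forward(self, words):                        # words: (batch, seq)
        h, _ = self.bilstm(self.embed(words))        # shared features
        tag_scores = self.tag_head(h)                # (batch, seq, n_tags)
        arc_scores = self.dep_mlp(h) @ self.head_mlp(h).transpose(1, 2)
        return tag_scores, arc_scores                # arc[b, i, j]: j heads i

model = JointTaggerParser(vocab=10000)
tags, arcs = model(torch.randint(0, 10000, (2, 15)))
```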
Towards Alzheimer’s Disease Classification through Transfer Learning
Title | Towards Alzheimer’s Disease Classification through Transfer Learning |
Authors | Marcia Hon, Naimul Khan |
Abstract | Detection of Alzheimer’s Disease (AD) from neuroimaging data such as MRI through machine learning has been a subject of intense research in recent years. The recent success of deep learning in computer vision has progressed such research further. However, common limitations of such algorithms are reliance on a large number of training images and the requirement of careful optimization of the architecture of deep networks. In this paper, we attempt to solve these issues with transfer learning, where state-of-the-art architectures such as VGG and Inception are initialized with pre-trained weights from large benchmark datasets consisting of natural images, and the fully-connected layer is re-trained with only a small number of MRI images. We employ image entropy to select the most informative slices for training. Through experimentation on the OASIS MRI dataset, we show that with a training size almost 10 times smaller than the state-of-the-art, we reach comparable or even better performance than current deep-learning based methods. |
Tasks | Transfer Learning |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1711.11117v1 |
http://arxiv.org/pdf/1711.11117v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-alzheimers-disease-classification |
Repo | https://github.com/marciahon29/Ryerson_MRP |
Framework | tf |
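The entropy-based slice selection the abstract mentions is straightforward to sketch: rank slices of a volume by the Shannon entropy of their intensity histogram and keep the highest-scoring ones for fine-tuning. The bin count and the number of retained slices below are illustrative choices, not the paper's reported settings.

```python
# Rank MRI slices by image entropy and keep the most informative ones.
import numpy as np

def slice_entropy(img, bins=256):
    hist, _ = np.histogram(img, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))  # Shannon entropy of intensities

volume = np.random.rand(64, 128, 128)         # stand-in for an MRI volume
scores = [slice_entropy(volume[i]) for i in range(volume.shape[0])]
top_slices = np.argsort(scores)[::-1][:16]    # e.g., 16 most informative
```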
Detecting Online Hate Speech Using Context Aware Models
Title | Detecting Online Hate Speech Using Context Aware Models |
Authors | Lei Gao, Ruihong Huang |
Abstract | In the wake of a polarizing election, the cyber world is laden with hate speech. Context accompanying a hate speech text is useful for identifying hate speech, but it has been largely overlooked in existing datasets and hate speech detection models. In this paper, we provide an annotated corpus of hate speech with context information well kept. We then propose two types of hate speech detection models that incorporate context information: a logistic regression model with context features and a neural network model with learning components for context. Our evaluation shows that both models outperform a strong baseline by around 3% to 4% in F1 score, and combining the two models further improves the performance by another 7% in F1 score. |
Tasks | Hate Speech Detection |
Published | 2017-10-20 |
URL | http://arxiv.org/abs/1710.07395v2 |
http://arxiv.org/pdf/1710.07395v2.pdf | |
PWC | https://paperswithcode.com/paper/detecting-online-hate-speech-using-context |
Repo | https://github.com/sjtuprog/fox-news-comments |
Framework | none |
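A minimal sketch of the context-feature logistic regression idea from the abstract: concatenate TF-IDF features of the comment with TF-IDF features of its context (e.g., the news title) before classification. The feature choice and toy data are illustrative, not the paper's exact feature set.

```python
# Comment features + context features -> one logistic regression.
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

comments = ["example comment one", "another example comment"]
contexts = ["article title one", "article title two"]   # context per comment
labels = [0, 1]

vec_c, vec_t = TfidfVectorizer(), TfidfVectorizer()
X = hstack([vec_c.fit_transform(comments), vec_t.fit_transform(contexts)])
clf = LogisticRegression().fit(X, labels)
```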
Towards the Automatic Anime Characters Creation with Generative Adversarial Networks
Title | Towards the Automatic Anime Characters Creation with Generative Adversarial Networks |
Authors | Yanghua Jin, Jiakai Zhang, Minjun Li, Yingtao Tian, Huachun Zhu, Zhihao Fang |
Abstract | Automatic generation of facial images has been well studied since the Generative Adversarial Network (GAN) came out. There have been some attempts to apply the GAN model to the problem of generating facial images of anime characters, but none of the existing work gives a promising result. In this work, we explore the training of GAN models specialized on an anime facial image dataset. We address the issue from both the data and the model aspect, by collecting a cleaner, better-suited dataset and leveraging a proper, empirical application of DRAGAN. With quantitative analysis and case studies, we demonstrate that our efforts lead to a stable and high-quality model. Moreover, to assist people with anime character design, we build a website (http://make.girls.moe) with our pre-trained model available online, which makes the model easily accessible to the general public. |
Tasks | |
Published | 2017-08-18 |
URL | http://arxiv.org/abs/1708.05509v1 |
http://arxiv.org/pdf/1708.05509v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-the-automatic-anime-characters |
Repo | https://github.com/ctwxdd/Tensorflow-ACGAN-Anime-Generation |
Framework | tf |
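The stabilizing ingredient the abstract credits is DRAGAN's gradient penalty. Below is a simplified PyTorch sketch of that penalty (the linked repo is TensorFlow; perturbation scale and hyperparameters here are illustrative): perturb real samples within a local neighborhood and penalize discriminator gradient norms that deviate from 1.

```python
# Simplified DRAGAN gradient penalty, applied around perturbed real data.
import torch

def dragan_penalty(discriminator, real, lambda_=10.0, k=1.0):
    noise = 0.5 * real.std() * torch.rand_like(real)   # local perturbation
    x_hat = (real + noise).requires_grad_(True)
    d_out = discriminator(x_hat).sum()
    grads, = torch.autograd.grad(d_out, x_hat, create_graph=True)
    norms = grads.view(grads.size(0), -1).norm(2, dim=1)
    return lambda_ * ((norms - k) ** 2).mean()

D = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 64 * 3, 1))
loss = dragan_penalty(D, torch.randn(8, 3, 64, 64))
```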
Comparing Dataset Characteristics that Favor the Apriori, Eclat or FP-Growth Frequent Itemset Mining Algorithms
Title | Comparing Dataset Characteristics that Favor the Apriori, Eclat or FP-Growth Frequent Itemset Mining Algorithms |
Authors | Jeff Heaton |
Abstract | Frequent itemset mining is a popular data mining technique. Apriori, Eclat, and FP-Growth are among the most common algorithms for frequent itemset mining. Considerable research has been performed to compare the relative performance of these three algorithms by evaluating the scalability of each algorithm as the dataset size increases. While scalability as data size increases is important, previous papers have not examined the performance impact of similarly sized datasets that contain different itemset characteristics. This paper explores the effects that two dataset characteristics can have on the performance of these three frequent itemset algorithms. To perform this empirical analysis, a dataset generator is created to measure the effects of frequent item density and maximum transaction size on performance. The generated datasets contain the same number of rows. This provides some insight into dataset characteristics that are conducive to each algorithm. The results of this paper’s research demonstrate that Eclat and FP-Growth both handle increases in maximum transaction size and frequent itemset density considerably better than the Apriori algorithm. |
Tasks | |
Published | 2017-01-30 |
URL | http://arxiv.org/abs/1701.09042v1 |
http://arxiv.org/pdf/1701.09042v1.pdf | |
PWC | https://paperswithcode.com/paper/comparing-dataset-characteristics-that-favor |
Repo | https://github.com/alextanhongpin/affinity-analysis |
Framework | none |
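A sketch of a transaction generator in the spirit of the paper's empirical setup: the row count is fixed while maximum transaction size and the density of frequent items vary. The sampling scheme and the 0.8 hot-pool probability below are illustrative, not the paper's exact generator.

```python
# Generate synthetic transactions with controllable density and size.
import random

def generate_transactions(n_rows, n_items, max_size, freq_density, seed=0):
    rng = random.Random(seed)
    frequent = list(range(int(n_items * freq_density)))  # dense "hot" items
    rows = []
    for _ in range(n_rows):
        size = rng.randint(1, max_size)
        pool = frequent if rng.random() < 0.8 else list(range(n_items))
        rows.append(set(rng.sample(pool, min(size, len(pool)))))
    return rows

transactions = generate_transactions(10000, 500, max_size=20, freq_density=0.1)
```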
Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching
Title | Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching |
Authors | Andy Zeng, Shuran Song, Kuan-Ting Yu, Elliott Donlon, Francois R. Hogan, Maria Bauza, Daolin Ma, Orion Taylor, Melody Liu, Eudald Romo, Nima Fazeli, Ferran Alet, Nikhil Chavan Dafle, Rachel Holladay, Isabella Morona, Prem Qu Nair, Druck Green, Ian Taylor, Weber Liu, Thomas Funkhouser, Alberto Rodriguez |
Abstract | This paper presents a robotic pick-and-place system that is capable of grasping and recognizing both known and novel objects in cluttered environments. The key new feature of the system is that it handles a wide range of object categories without needing any task-specific training data for novel objects. To achieve this, it first uses a category-agnostic affordance prediction algorithm to select and execute among four different grasping primitive behaviors. It then recognizes picked objects with a cross-domain image classification framework that matches observed images to product images. Since product images are readily available for a wide range of objects (e.g., from the web), the system works out-of-the-box for novel objects without requiring any additional training data. Exhaustive experimental results demonstrate that our multi-affordance grasping achieves high success rates for a wide variety of objects in clutter, and our recognition algorithm achieves high accuracy for both known and novel grasped objects. The approach was part of the MIT-Princeton Team system that took 1st place in the stowing task at the 2017 Amazon Robotics Challenge. All code, datasets, and pre-trained models are available online at http://arc.cs.princeton.edu |
Tasks | Image Classification, Robotic Grasping |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01330v4 |
http://arxiv.org/pdf/1710.01330v4.pdf | |
PWC | https://paperswithcode.com/paper/robotic-pick-and-place-of-novel-objects-in |
Repo | https://github.com/andyzeng/arc-robot-vision |
Framework | torch |
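The recognition step the abstract describes is a cross-domain nearest-neighbor match: embed the observed crop and all catalog product images into a shared feature space, then pick the closest product. A conceptual numpy sketch, with random vectors standing in for the learned embedding network:

```python
# Nearest-neighbor matching of an observed crop against product embeddings.
import numpy as np

def recognize(observed_feat, product_feats, product_names):
    # Cosine similarity against every catalog embedding.
    sims = product_feats @ observed_feat / (
        np.linalg.norm(product_feats, axis=1) * np.linalg.norm(observed_feat))
    return product_names[int(np.argmax(sims))]

catalog = np.random.randn(40, 128)            # 40 product-image embeddings
names = [f"item_{i}" for i in range(40)]
print(recognize(np.random.randn(128), catalog, names))
```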
Semi-Supervised Deep Learning for Fully Convolutional Networks
Title | Semi-Supervised Deep Learning for Fully Convolutional Networks |
Authors | Christoph Baur, Shadi Albarqouni, Nassir Navab |
Abstract | Deep learning usually requires large amounts of labeled training data, but annotating data is costly and tedious. The framework of semi-supervised learning provides the means to use both labeled data and arbitrary amounts of unlabeled data for training. Recently, semi-supervised deep learning has been intensively studied for standard CNN architectures. However, Fully Convolutional Networks (FCNs) set the state-of-the-art for many image segmentation tasks. To the best of our knowledge, there is no existing semi-supervised learning method for such FCNs yet. We lift the concept of auxiliary manifold embedding for semi-supervised learning to FCNs with the help of Random Feature Embedding. In our experiments on the challenging task of MS Lesion Segmentation, we leverage the proposed framework for the purpose of domain adaptation and report substantial improvements over the baseline model. |
Tasks | Domain Adaptation, Lesion Segmentation, Semantic Segmentation |
Published | 2017-03-17 |
URL | http://arxiv.org/abs/1703.06000v2 |
http://arxiv.org/pdf/1703.06000v2.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-deep-learning-for-fully |
Repo | https://github.com/bumuckl/SemiSupervisedDLForFCNs |
Framework | none |
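The auxiliary manifold embedding the abstract lifts to FCNs can be illustrated with a standard contrastive penalty on pairs of embedded pixels: pull same-label pairs together, push different-label pairs apart. The margin and pair sampling here are illustrative, not the paper's exact Random Feature Embedding scheme.

```python
# Contrastive penalty on pairs of pixel embeddings.
import torch

def embedding_loss(emb_a, emb_b, same_label, margin=1.0):
    d = (emb_a - emb_b).norm(2, dim=1)
    pull = same_label * d ** 2                               # attract pairs
    push = (1 - same_label) * torch.clamp(margin - d, min=0) ** 2
    return (pull + push).mean()

a, b = torch.randn(32, 64), torch.randn(32, 64)
same = (torch.rand(32) > 0.5).float()
print(embedding_loss(a, b, same))
```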
Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
Title | Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates |
Authors | Leslie N. Smith, Nicholay Topin |
Abstract | In this paper, we describe a phenomenon, which we named “super-convergence”, where neural networks can be trained an order of magnitude faster than with standard training methods. The existence of super-convergence is relevant to understanding why deep networks generalize well. One of the key elements of super-convergence is training with one learning rate cycle and a large maximum learning rate. A primary insight that allows super-convergence training is that large learning rates regularize the training, hence requiring a reduction of all other forms of regularization in order to preserve an optimal regularization balance. We also derive a simplification of the Hessian Free optimization method to compute an estimate of the optimal learning rate. Experiments demonstrate super-convergence for Cifar-10/100, MNIST and Imagenet datasets, and resnet, wide-resnet, densenet, and inception architectures. In addition, we show that super-convergence provides a greater boost in performance relative to standard training when the amount of labeled training data is limited. The architectures and code to replicate the figures in this paper are available at github.com/lnsmith54/super-convergence. See http://www.fast.ai/2018/04/30/dawnbench-fastai/ for an application of super-convergence to win the DAWNBench challenge (see https://dawn.cs.stanford.edu/benchmark/). |
Tasks | |
Published | 2017-08-23 |
URL | http://arxiv.org/abs/1708.07120v3 |
http://arxiv.org/pdf/1708.07120v3.pdf | |
PWC | https://paperswithcode.com/paper/super-convergence-very-fast-training-of-1 |
Repo | https://github.com/coxy1989/superconv |
Framework | tf |
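The schedule at the heart of super-convergence is simple to sketch: ramp the learning rate linearly up to a large maximum, then back down, over one cycle spanning training. The endpoint values below are illustrative, not the paper's reported settings.

```python
# Minimal one-cycle learning-rate schedule.
def one_cycle_lr(step, total_steps, lr_max=3.0, lr_min=0.1):
    half = total_steps / 2
    if step <= half:
        return lr_min + (lr_max - lr_min) * step / half       # warm-up leg
    return lr_max - (lr_max - lr_min) * (step - half) / half  # cool-down leg

schedule = [one_cycle_lr(s, 1000) for s in range(1001)]
```

PyTorch later shipped `torch.optim.lr_scheduler.OneCycleLR`, a refined built-in version of this schedule.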
Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach
Title | Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach |
Authors | Yuhang Song, Mai Xu, Jianyi Wang, Minglang Qiao, Liangyu Huo, Zulin Wang |
Abstract | Panoramic video provides immersive and interactive experience by enabling humans to control the field of view (FoV) through head movement (HM). Thus, HM plays a key role in modeling human attention on panoramic video. This paper establishes a database collecting subjects’ HM in panoramic video sequences. From this database, we find that the HM data are highly consistent across subjects. Furthermore, we find that deep reinforcement learning (DRL) can be applied to predict HM positions, via maximizing the reward of imitating human HM scanpaths through the agent’s actions. Based on our findings, we propose a DRL-based HM prediction (DHP) approach with offline and online versions, called offline-DHP and online-DHP. In offline-DHP, multiple DRL workflows are run to determine potential HM positions at each panoramic frame. Then, a heat map of the potential HM positions, named the HM map, is generated as the output of offline-DHP. In online-DHP, the next HM position of one subject is estimated given the currently observed HM position, which is achieved by developing a DRL algorithm upon the learned offline-DHP model. Finally, the experiments validate that our approach is effective in both offline and online prediction of HM positions for panoramic video, and that the learned offline-DHP model can improve the performance of online-DHP. |
Tasks | |
Published | 2017-10-30 |
URL | https://arxiv.org/abs/1710.10755v5 |
https://arxiv.org/pdf/1710.10755v5.pdf | |
PWC | https://paperswithcode.com/paper/predicting-head-movement-in-panoramic-video-a |
Repo | https://github.com/YuhangSong/DHP |
Framework | tf |
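The offline-DHP output described in the abstract is a per-frame heat map aggregated from the head-movement positions proposed by multiple DRL workflows. A numpy/scipy sketch of that aggregation step, with grid size and smoothing as illustrative choices (wrap-around smoothing reflects the 360° panorama):

```python
# Aggregate HM proposals from multiple workflows into a smoothed heat map.
import numpy as np
from scipy.ndimage import gaussian_filter

def hm_heatmap(positions, shape=(36, 72), sigma=2.0):
    """positions: (N, 2) array of (lat_bin, lon_bin) HM proposals."""
    grid = np.zeros(shape)
    for i, j in positions:
        grid[i % shape[0], j % shape[1]] += 1.0
    return gaussian_filter(grid, sigma=sigma, mode='wrap')  # 360-degree wrap

heat = hm_heatmap(np.random.randint(0, 36, size=(50, 2)))
```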
SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements
Title | SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements |
Authors | Francisco J. R. Ruiz, Susan Athey, David M. Blei |
Abstract | We develop SHOPPER, a sequential probabilistic model of shopping data. SHOPPER uses interpretable components to model the forces that drive how a customer chooses products; in particular, we designed SHOPPER to capture how items interact with other items. We develop an efficient posterior inference algorithm to estimate these forces from large-scale data, and we analyze a large dataset from a major chain grocery store. We are interested in answering counterfactual queries about changes in prices. We found that SHOPPER provides accurate predictions even under price interventions, and that it helps identify complementary and substitutable pairs of products. |
Tasks | |
Published | 2017-11-09 |
URL | https://arxiv.org/abs/1711.03560v3 |
https://arxiv.org/pdf/1711.03560v3.pdf | |
PWC | https://paperswithcode.com/paper/shopper-a-probabilistic-model-of-consumer |
Repo | https://github.com/franrruiz/shopper-src |
Framework | none |
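The interpretable components the abstract mentions can be sketched as a per-item utility combining a popularity intercept, a price effect, and an interaction term between the candidate item's embedding and the basket so far, fed through a softmax. This is a heavily simplified stand-in for SHOPPER's full model; all parameter names and shapes below are illustrative.

```python
# Simplified SHOPPER-style choice probabilities for one shopping step.
import numpy as np

def choice_probs(lam, price_sens, prices, alpha, rho, basket):
    # lam: (I,) intercepts; alpha, rho: (I, K) embeddings; basket: item ids
    interaction = rho @ alpha[basket].mean(axis=0) if basket else 0.0
    utility = lam - price_sens * np.log(prices) + interaction
    expu = np.exp(utility - utility.max())   # numerically stable softmax
    return expu / expu.sum()

I, K = 100, 16
p = choice_probs(np.zeros(I), 1.0, np.random.rand(I) + 0.5,
                 np.random.randn(I, K), np.random.randn(I, K), basket=[3, 17])
```

Raising `prices` for one item and recomputing `p` mimics the counterfactual price queries the paper is interested in.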
Predicting Deeper into the Future of Semantic Segmentation
Title | Predicting Deeper into the Future of Semantic Segmentation |
Authors | Pauline Luc, Natalia Neverova, Camille Couprie, Jakob Verbeek, Yann LeCun |
Abstract | The ability to predict and therefore to anticipate the future is an important attribute of intelligence. It is also of utmost importance in real-time systems, e.g. in robotics or autonomous driving, which depend on visual scene understanding for decision making. While prediction of the raw RGB pixel values in future video frames has been studied in previous work, here we introduce the novel task of predicting semantic segmentations of future frames. Given a sequence of video frames, our goal is to predict segmentation maps of not yet observed video frames that lie up to a second or further in the future. We develop an autoregressive convolutional neural network that learns to iteratively generate multiple frames. Our results on the Cityscapes dataset show that directly predicting future segmentations is substantially better than predicting and then segmenting future RGB frames. Prediction results up to half a second in the future are visually convincing and are much more accurate than those of a baseline based on warping semantic segmentations using optical flow. |
Tasks | Autonomous Driving, Decision Making, Optical Flow Estimation, Scene Understanding, Semantic Segmentation, Video Prediction |
Published | 2017-03-22 |
URL | http://arxiv.org/abs/1703.07684v3 |
http://arxiv.org/pdf/1703.07684v3.pdf | |
PWC | https://paperswithcode.com/paper/predicting-deeper-into-the-future-of-semantic |
Repo | https://github.com/m-serra/action-inference-for-video-prediction-benchmarking |
Framework | tf |
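The autoregressive rollout the abstract describes can be sketched independently of the network itself: a model that maps the last k segmentation maps to the next one is applied repeatedly, feeding its own predictions back in to reach further into the future. The window size and trivial stand-in model below are illustrative.

```python
# Autoregressive rollout of future segmentation maps.
import numpy as np

def rollout(model, past_segs, n_future):
    """past_segs: list of (H, W, C) soft segmentation maps."""
    window = list(past_segs)
    preds = []
    for _ in range(n_future):
        nxt = model(np.stack(window[-len(past_segs):]))  # predict one step
        preds.append(nxt)
        window.append(nxt)                               # feed prediction back
    return preds

identity_model = lambda seq: seq[-1]                     # trivial stand-in
future = rollout(identity_model, [np.zeros((64, 64, 19))] * 4, n_future=8)
```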
Self-Taught Convolutional Neural Networks for Short Text Clustering
Title | Self-Taught Convolutional Neural Networks for Short Text Clustering |
Authors | Jiaming Xu, Bo Xu, Peng Wang, Suncong Zheng, Guanhua Tian, Jun Zhao, Bo Xu |
Abstract | Short text clustering is a challenging problem due to the sparseness of text representations. Here we propose a flexible Self-Taught Convolutional neural network framework for Short Text Clustering (dubbed STC^2), which can flexibly and successfully incorporate more useful semantic features and learn non-biased deep text representations in an unsupervised manner. In our framework, the original raw text features are first embedded into compact binary codes using an existing unsupervised dimensionality reduction method. Then, word embeddings are explored and fed into convolutional neural networks to learn deep feature representations, while the output units are used to fit the pre-trained binary codes during training. Finally, we obtain the optimal clusters by employing K-means to cluster the learned representations. Extensive experimental results demonstrate that the proposed framework is effective and flexible and outperforms several popular clustering methods when tested on three public short text datasets. |
Tasks | Dimensionality Reduction, Text Clustering, Word Embeddings |
Published | 2017-01-01 |
URL | http://arxiv.org/abs/1701.00185v1 |
http://arxiv.org/pdf/1701.00185v1.pdf | |
PWC | https://paperswithcode.com/paper/self-taught-convolutional-neural-networks-for |
Repo | https://github.com/jacoxu/STC2 |
Framework | tf |
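A structural sketch of the STC^2 pipeline from the abstract, with the convolutional network replaced by a stand-in: (1) compress raw text features and binarize them into target codes, (2) learn a deep representation that fits those codes (elided here), (3) run K-means on the resulting features. The tiny corpus and dimensions are illustrative only.

```python
# STC^2 pipeline skeleton: codes from dimensionality reduction -> clustering.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

texts = ["short text one", "another short text", "totally different topic"]
X = TfidfVectorizer().fit_transform(texts)
codes = (TruncatedSVD(n_components=2).fit_transform(X) > 0).astype(float)
# The paper trains a CNN on word embeddings to regress `codes`; as a
# stand-in, we cluster the codes directly in place of learned features.
labels = KMeans(n_clusters=2, n_init=10).fit_predict(codes)
```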