July 29, 2019

3232 words 16 mins read

Paper Group AWR 77

Bayesian Cluster Enumeration Criterion for Unsupervised Learning

Title Bayesian Cluster Enumeration Criterion for Unsupervised Learning
Authors Freweyni K. Teklehaymanot, Michael Muma, Abdelhak M. Zoubir
Abstract We derive a new Bayesian Information Criterion (BIC) by formulating the problem of estimating the number of clusters in an observed data set as maximization of the posterior probability of the candidate models. Given that some mild assumptions are satisfied, we provide a general BIC expression for a broad class of data distributions. This serves as a starting point when deriving the BIC for specific distributions. Along this line, we provide a closed-form BIC expression for multivariate Gaussian distributed variables. We show that incorporating the data structure of the clustering problem into the derivation of the BIC results in an expression whose penalty term is different from that of the original BIC. We propose a two-step cluster enumeration algorithm. First, a model-based unsupervised learning algorithm partitions the data according to a given set of candidate models. Subsequently, the number of clusters is determined as the one associated with the model for which the proposed BIC is maximal. The performance of the proposed two-step algorithm is tested using synthetic and real data sets.
Tasks
Published 2017-10-22
URL http://arxiv.org/abs/1710.07954v3
PDF http://arxiv.org/pdf/1710.07954v3.pdf
PWC https://paperswithcode.com/paper/bayesian-cluster-enumeration-criterion-for
Repo https://github.com/FreTekle/Bayesian-Cluster-Enumeration
Framework none
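A minimal sketch of the two-step enumeration idea described above, assuming multivariate Gaussian clusters. Note it uses scikit-learn's standard BIC (lower is better) as a stand-in; the paper derives a criterion whose penalty term differs from the original BIC.

```python
# Two-step cluster enumeration: fit one model per candidate cluster count,
# then pick the count whose model scores best under an information criterion.
import numpy as np
from sklearn.mixture import GaussianMixture

def enumerate_clusters(X, candidates=range(1, 11), seed=0):
    """Step 1: fit a GMM per candidate model. Step 2: select by BIC."""
    best_k, best_bic = None, np.inf
    for k in candidates:
        gmm = GaussianMixture(n_components=k, random_state=seed).fit(X)
        bic = gmm.bic(X)  # sklearn's standard BIC, not the paper's criterion
        if bic < best_bic:
            best_k, best_bic = k, bic
    return best_k

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.5, size=(100, 2)) for m in (0, 4, 8)])
print(enumerate_clusters(X))  # expected: 3
```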

BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet

Title BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet
Authors Haojin Yang, Martin Fritzsche, Christian Bartz, Christoph Meinel
Abstract Binary Neural Networks (BNNs) can drastically reduce memory size and memory accesses by applying bit-wise operations instead of standard arithmetic operations. They can therefore significantly improve efficiency and lower energy consumption at runtime, which enables the application of state-of-the-art deep learning models on low-power devices. BMXNet is an open-source BNN library based on MXNet that supports both XNOR-Networks and Quantized Neural Networks. The developed BNN layers can be seamlessly combined with other standard library components and work in both GPU and CPU mode. BMXNet is maintained and developed by the multimedia research group at Hasso Plattner Institute and released under the Apache license. Extensive experiments validate the efficiency and effectiveness of our implementation. The BMXNet library, several sample projects, and a collection of pre-trained binary deep models are available for download at https://github.com/hpi-xnor
Tasks
Published 2017-05-27
URL http://arxiv.org/abs/1705.09864v1
PDF http://arxiv.org/pdf/1705.09864v1.pdf
PWC https://paperswithcode.com/paper/bmxnet-an-open-source-binary-neural-network
Repo https://github.com/hpi-xnor/BMXNet
Framework mxnet
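A rough sketch of why binarization saves work: with weights and activations constrained to {-1, +1}, a dot product reduces to an XNOR followed by a popcount. This illustrates the arithmetic only; BMXNet's actual kernels operate on packed bit words in optimized C++/CUDA.

```python
import numpy as np

def binarize(x):
    """Quantize real values to {-1, +1} by sign."""
    return np.where(x >= 0, 1, -1).astype(np.int8)

def binary_dot(a, b):
    """Dot product of two {-1,+1} vectors via XNOR and a popcount."""
    pa = (a > 0).astype(np.uint8)   # encode +1 as bit 1, -1 as bit 0
    pb = (b > 0).astype(np.uint8)
    xnor = ~(pa ^ pb) & 1           # 1 wherever the signs agree
    matches = int(xnor.sum())       # the "popcount"
    return 2 * matches - a.size     # agreements minus disagreements

a = binarize(np.random.randn(64))
b = binarize(np.random.randn(64))
assert binary_dot(a, b) == int(a @ b)
```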

Developing a comprehensive framework for multimodal feature extraction

Title Developing a comprehensive framework for multimodal feature extraction
Authors Quinten McNamara, Alejandro de la Vega, Tal Yarkoni
Abstract Feature extraction is a critical component of many applied data science workflows. In recent years, rapid advances in artificial intelligence and machine learning have led to an explosion of feature extraction tools and services that allow data scientists to cheaply and effectively annotate their data along a vast array of dimensions—ranging from detecting faces in images to analyzing the sentiment expressed in coherent text. Unfortunately, the proliferation of powerful feature extraction services has been mirrored by a corresponding expansion in the number of distinct interfaces to feature extraction services. In a world where nearly every new service has its own API, documentation, and/or client library, data scientists who need to combine diverse features obtained from multiple sources are often forced to write and maintain ever more elaborate feature extraction pipelines. To address this challenge, we introduce a new open-source framework for comprehensive multimodal feature extraction. Pliers is an open-source Python package that supports standardized annotation of diverse data types (video, images, audio, and text), and is expressly designed with both ease-of-use and extensibility in mind. Users can apply a wide range of pre-existing feature extraction tools to their data in just a few lines of Python code, and can also easily add their own custom extractors by writing modular classes. A graph-based API enables rapid development of complex feature extraction pipelines that output results in a single, standardized format. We describe the package’s architecture, detail its major advantages over previous feature extraction toolboxes, and use a sample application to a large functional MRI dataset to illustrate how pliers can significantly reduce the time and effort required to construct sophisticated feature extraction workflows while increasing code clarity and maintainability.
Tasks
Published 2017-02-20
URL http://arxiv.org/abs/1702.06151v1
PDF http://arxiv.org/pdf/1702.06151v1.pdf
PWC https://paperswithcode.com/paper/developing-a-comprehensive-framework-for
Repo https://github.com/tyarkoni/pliers
Framework none
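The design the abstract describes—modular extractor classes behind one interface, composable into a pipeline that emits a standardized result format—is easy to sketch. The classes and names below are illustrative, not the actual pliers API; consult the repo for the real extractors and the graph-based interface.

```python
from dataclasses import dataclass

@dataclass
class FeatureResult:
    """One standardized output record, whatever the extractor."""
    extractor: str
    features: dict

class Extractor:
    def extract(self, stimulus) -> FeatureResult:
        raise NotImplementedError

class LengthExtractor(Extractor):
    """Toy text extractor: character and word counts."""
    def extract(self, text):
        return FeatureResult("length", {"chars": len(text), "words": len(text.split())})

class UppercaseRatioExtractor(Extractor):
    def extract(self, text):
        ratio = sum(c.isupper() for c in text) / max(len(text), 1)
        return FeatureResult("uppercase_ratio", {"ratio": ratio})

def run_pipeline(stimulus, extractors):
    """Apply every extractor; results share one format regardless of source."""
    return [e.extract(stimulus) for e in extractors]

for r in run_pipeline("Feature extraction made EASY", [LengthExtractor(), UppercaseRatioExtractor()]):
    print(r)
```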

A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing

Title A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing
Authors Dat Quoc Nguyen, Mark Dras, Mark Johnson
Abstract We present a novel neural network model that learns POS tagging and graph-based dependency parsing jointly. Our model uses bidirectional LSTMs to learn feature representations shared for both POS tagging and dependency parsing tasks, thus handling the feature-engineering problem. Our extensive experiments, on 19 languages from the Universal Dependencies project, show that our model outperforms the state-of-the-art neural network-based Stack-propagation model for joint POS tagging and transition-based dependency parsing, resulting in a new state of the art. Our code is open-source and available together with pre-trained models at: https://github.com/datquocnguyen/jPTDP
Tasks Dependency Parsing, Feature Engineering, Part-Of-Speech Tagging, Transition-Based Dependency Parsing
Published 2017-05-16
URL http://arxiv.org/abs/1705.05952v2
PDF http://arxiv.org/pdf/1705.05952v2.pdf
PWC https://paperswithcode.com/paper/a-novel-neural-network-model-for-joint-pos
Repo https://github.com/datquocnguyen/jPTDP
Framework none
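A minimal PyTorch sketch of the shared-encoder idea: one BiLSTM feeds both a POS-tagging head and a bilinear arc scorer. The dimensions and the scoring details are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class JointTaggerParser(nn.Module):
    def __init__(self, vocab, n_tags, d_emb=100, d_hid=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, d_emb)
        self.bilstm = nn.LSTM(d_emb, d_hid, batch_first=True, bidirectional=True)
        self.tagger = nn.Linear(2 * d_hid, n_tags)       # POS head
        self.arc = nn.Bilinear(2 * d_hid, 2 * d_hid, 1)  # head-dependent score

    def forward(self, words):
        h, _ = self.bilstm(self.emb(words))              # shared representations
        tag_logits = self.tagger(h)                      # (B, T, n_tags)
        T = h.size(1)
        heads = h.unsqueeze(2).expand(-1, -1, T, -1).contiguous()
        deps = h.unsqueeze(1).expand(-1, T, -1, -1).contiguous()
        arc_scores = self.arc(heads, deps).squeeze(-1)   # (B, T, T)
        return tag_logits, arc_scores

model = JointTaggerParser(vocab=1000, n_tags=17)
tags, arcs = model(torch.randint(0, 1000, (2, 8)))
print(tags.shape, arcs.shape)  # (2, 8, 17) and (2, 8, 8)
```

Because both heads backpropagate into the same BiLSTM, the tagging and parsing objectives shape one shared feature space, which is the paper's answer to the feature-engineering problem.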

Towards Alzheimer’s Disease Classification through Transfer Learning

Title Towards Alzheimer’s Disease Classification through Transfer Learning
Authors Marcia Hon, Naimul Khan
Abstract Detection of Alzheimer’s Disease (AD) from neuroimaging data such as MRI through machine learning has been a subject of intense research in recent years. The recent success of deep learning in computer vision has progressed such research further. However, common limitations of such algorithms are a reliance on a large number of training images and the requirement of careful optimization of the architecture of deep networks. In this paper, we attempt to solve these issues with transfer learning, where state-of-the-art architectures such as VGG and Inception are initialized with pre-trained weights from large benchmark datasets consisting of natural images, and the fully-connected layer is re-trained with only a small number of MRI images. We employ image entropy to select the most informative slices for training. Through experimentation on the OASIS MRI dataset, we show that with a training size almost 10 times smaller than the state-of-the-art, we reach comparable or even better performance than current deep-learning based methods.
Tasks Transfer Learning
Published 2017-11-29
URL http://arxiv.org/abs/1711.11117v1
PDF http://arxiv.org/pdf/1711.11117v1.pdf
PWC https://paperswithcode.com/paper/towards-alzheimers-disease-classification
Repo https://github.com/marciahon29/Ryerson_MRP
Framework tf
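A sketch of the entropy-based slice selection step: score each MRI slice by the Shannon entropy of its intensity histogram and keep the top-k most informative slices. The bin count and k are illustrative choices, not values from the paper.

```python
import numpy as np

def slice_entropy(img, bins=256):
    """Shannon entropy of a slice's intensity histogram, in bits."""
    hist, _ = np.histogram(img, bins=bins)
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def top_k_slices(volume, k=32):
    """volume: (n_slices, H, W); returns indices of the k highest-entropy slices."""
    scores = np.array([slice_entropy(s) for s in volume])
    return np.argsort(scores)[::-1][:k]

vol = np.random.rand(64, 96, 96)  # stand-in for a loaded MRI volume
print(top_k_slices(vol, k=5))
```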

Detecting Online Hate Speech Using Context Aware Models

Title Detecting Online Hate Speech Using Context Aware Models
Authors Lei Gao, Ruihong Huang
Abstract In the wake of a polarizing election, the cyber world is laden with hate speech. Context accompanying a hate speech text is useful for identifying hate speech, yet it has been largely overlooked in existing datasets and hate speech detection models. In this paper, we provide an annotated corpus of hate speech with its context information preserved. We then propose two types of hate speech detection models that incorporate context information: a logistic regression model with context features and a neural network model with learning components for context. Our evaluation shows that both models outperform a strong baseline by around 3% to 4% in F1 score, and that combining these two models further improves the performance by another 7% in F1 score.
Tasks Hate Speech Detection
Published 2017-10-20
URL http://arxiv.org/abs/1710.07395v2
PDF http://arxiv.org/pdf/1710.07395v2.pdf
PWC https://paperswithcode.com/paper/detecting-online-hate-speech-using-context
Repo https://github.com/sjtuprog/fox-news-comments
Framework none
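A minimal sketch of the context-feature variant: TF-IDF features from the comment and from its accompanying context (for the paper's corpus, the news title would be a natural choice) are concatenated before logistic regression. The feature choices here are illustrative, not the paper's exact feature set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from scipy.sparse import hstack

comments = ["example comment one", "another example comment"]
contexts = ["article title a", "article title b"]
labels = [0, 1]

# Separate vectorizers so comment and context vocabularies stay distinct.
vec_text, vec_ctx = TfidfVectorizer(), TfidfVectorizer()
X = hstack([vec_text.fit_transform(comments), vec_ctx.fit_transform(contexts)])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```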

Towards the Automatic Anime Characters Creation with Generative Adversarial Networks

Title Towards the Automatic Anime Characters Creation with Generative Adversarial Networks
Authors Yanghua Jin, Jiakai Zhang, Minjun Li, Yingtao Tian, Huachun Zhu, Zhihao Fang
Abstract Automatic generation of facial images has been well studied since the Generative Adversarial Network (GAN) came out. There have been some attempts to apply the GAN model to the problem of generating facial images of anime characters, but none of the existing work gives a promising result. In this work, we explore the training of GAN models specialized on an anime facial image dataset. We address the issue from both the data and the model aspect, by collecting a cleaner, better-suited dataset and leveraging a proper, empirical application of DRAGAN. With quantitative analysis and case studies we demonstrate that our efforts lead to a stable and high-quality model. Moreover, to assist people with anime character design, we build a website (http://make.girls.moe) with our pre-trained model available online, making the model easily accessible to the general public.
Tasks
Published 2017-08-18
URL http://arxiv.org/abs/1708.05509v1
PDF http://arxiv.org/pdf/1708.05509v1.pdf
PWC https://paperswithcode.com/paper/towards-the-automatic-anime-characters
Repo https://github.com/ctwxdd/Tensorflow-ACGAN-Anime-Generation
Framework tf
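A sketch of the DRAGAN gradient penalty the paper leverages for stability: penalize the discriminator's gradient norm on points perturbed around the real data. The coefficient and noise scale below follow common DRAGAN defaults and are assumptions, not values confirmed by the paper.

```python
import torch

def dragan_penalty(discriminator, real, lam=10.0):
    """Gradient penalty on perturbations of real samples (DRAGAN-style)."""
    noise = 0.5 * real.std() * torch.rand_like(real)
    x_hat = (real + noise).requires_grad_(True)
    d_out = discriminator(x_hat)
    grads = torch.autograd.grad(d_out.sum(), x_hat, create_graph=True)[0]
    grad_norm = grads.flatten(1).norm(2, dim=1)
    return lam * ((grad_norm - 1) ** 2).mean()

# Tiny placeholder discriminator just to exercise the function.
D = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 1))
real = torch.randn(4, 3, 8, 8)
print(dragan_penalty(D, real).item())
```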

Comparing Dataset Characteristics that Favor the Apriori, Eclat or FP-Growth Frequent Itemset Mining Algorithms

Title Comparing Dataset Characteristics that Favor the Apriori, Eclat or FP-Growth Frequent Itemset Mining Algorithms
Authors Jeff Heaton
Abstract Frequent itemset mining is a popular data mining technique. Apriori, Eclat, and FP-Growth are among the most common algorithms for frequent itemset mining. Considerable research has been performed to compare the relative performance of these three algorithms by evaluating the scalability of each algorithm as the dataset size increases. While scalability as data size increases is important, previous papers have not examined the performance impact of similarly sized datasets that contain different itemset characteristics. This paper explores the effects that two dataset characteristics can have on the performance of these three frequent itemset algorithms. To perform this empirical analysis, a dataset generator is created to measure the effects of frequent item density and maximum transaction size on performance. The generated datasets contain the same number of rows. This provides some insight into dataset characteristics that are conducive to each algorithm. The results of this paper’s research demonstrate that Eclat and FP-Growth both handle increases in maximum transaction size and frequent itemset density considerably better than the Apriori algorithm.
Tasks
Published 2017-01-30
URL http://arxiv.org/abs/1701.09042v1
PDF http://arxiv.org/pdf/1701.09042v1.pdf
PWC https://paperswithcode.com/paper/comparing-dataset-characteristics-that-favor
Repo https://github.com/alextanhongpin/affinity-analysis
Framework none
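A sketch of the kind of dataset generator the paper describes: hold the number of transactions fixed while varying the maximum transaction size and the density of frequent items. The parameter names and the 10% "hot set" are illustrative assumptions, not the paper's generator.

```python
import random

def generate_transactions(n_rows=1000, n_items=50, max_size=10, density=0.3, seed=0):
    """density: probability an item is drawn from a small 'frequent' pool."""
    rng = random.Random(seed)
    frequent = list(range(int(n_items * 0.1)))  # small pool of hot items
    rest = list(range(len(frequent), n_items))
    rows = []
    for _ in range(n_rows):
        size = rng.randint(1, max_size)
        row = {rng.choice(frequent) if rng.random() < density else rng.choice(rest)
               for _ in range(size)}
        rows.append(sorted(row))
    return rows

print(generate_transactions(n_rows=3, max_size=6))
```

Sweeping `max_size` and `density` over such fixed-row datasets is what isolates the two characteristics the paper studies, independent of raw dataset size.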

Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching

Title Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching
Authors Andy Zeng, Shuran Song, Kuan-Ting Yu, Elliott Donlon, Francois R. Hogan, Maria Bauza, Daolin Ma, Orion Taylor, Melody Liu, Eudald Romo, Nima Fazeli, Ferran Alet, Nikhil Chavan Dafle, Rachel Holladay, Isabella Morona, Prem Qu Nair, Druck Green, Ian Taylor, Weber Liu, Thomas Funkhouser, Alberto Rodriguez
Abstract This paper presents a robotic pick-and-place system that is capable of grasping and recognizing both known and novel objects in cluttered environments. The key new feature of the system is that it handles a wide range of object categories without needing any task-specific training data for novel objects. To achieve this, it first uses a category-agnostic affordance prediction algorithm to select and execute among four different grasping primitive behaviors. It then recognizes picked objects with a cross-domain image classification framework that matches observed images to product images. Since product images are readily available for a wide range of objects (e.g., from the web), the system works out-of-the-box for novel objects without requiring any additional training data. Exhaustive experimental results demonstrate that our multi-affordance grasping achieves high success rates for a wide variety of objects in clutter, and our recognition algorithm achieves high accuracy for both known and novel grasped objects. The approach was part of the MIT-Princeton Team system that took 1st place in the stowing task at the 2017 Amazon Robotics Challenge. All code, datasets, and pre-trained models are available online at http://arc.cs.princeton.edu
Tasks Image Classification, Robotic Grasping
Published 2017-10-03
URL http://arxiv.org/abs/1710.01330v4
PDF http://arxiv.org/pdf/1710.01330v4.pdf
PWC https://paperswithcode.com/paper/robotic-pick-and-place-of-novel-objects-in
Repo https://github.com/andyzeng/arc-robot-vision
Framework torch
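The recognition step's core idea reduces to matching in a shared feature space: embed the observed image and every product image, then take the nearest neighbor. The random embeddings below are a stand-in for the paper's trained cross-domain network.

```python
import numpy as np

def match_product(observed_feat, product_feats):
    """Cosine-similarity nearest neighbor over product-image embeddings."""
    obs = observed_feat / np.linalg.norm(observed_feat)
    prods = product_feats / np.linalg.norm(product_feats, axis=1, keepdims=True)
    return int(np.argmax(prods @ obs))

rng = np.random.default_rng(0)
catalog = rng.normal(size=(100, 128))          # 100 product-image embeddings
observed = catalog[42] + 0.05 * rng.normal(size=128)  # noisy observation
print(match_product(observed, catalog))        # expected: 42
```

Because the catalog side needs only product images, new objects can be added without retraining, which is what makes the system work out-of-the-box on novel items.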

Semi-Supervised Deep Learning for Fully Convolutional Networks

Title Semi-Supervised Deep Learning for Fully Convolutional Networks
Authors Christoph Baur, Shadi Albarqouni, Nassir Navab
Abstract Deep learning usually requires large amounts of labeled training data, but annotating data is costly and tedious. The framework of semi-supervised learning provides the means to use both labeled data and arbitrary amounts of unlabeled data for training. Recently, semi-supervised deep learning has been intensively studied for standard CNN architectures. However, Fully Convolutional Networks (FCNs) set the state-of-the-art for many image segmentation tasks. To the best of our knowledge, there is no existing semi-supervised learning method for such FCNs yet. We lift the concept of auxiliary manifold embedding for semi-supervised learning to FCNs with the help of Random Feature Embedding. In our experiments on the challenging task of MS Lesion Segmentation, we leverage the proposed framework for the purpose of domain adaptation and report substantial improvements over the baseline model.
Tasks Domain Adaptation, Lesion Segmentation, Semantic Segmentation
Published 2017-03-17
URL http://arxiv.org/abs/1703.06000v2
PDF http://arxiv.org/pdf/1703.06000v2.pdf
PWC https://paperswithcode.com/paper/semi-supervised-deep-learning-for-fully
Repo https://github.com/bumuckl/SemiSupervisedDLForFCNs
Framework none
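A simplified sketch of the auxiliary embedding idea: alongside the segmentation loss, pull feature vectors of same-label pixels together and push different-label pixels apart on randomly sampled pairs. The contrastive form, margin, and sampling here are simplifications; the paper's Random Feature Embedding is what makes this tractable on FCN feature maps.

```python
import torch

def embedding_loss(feats, labels, margin=1.0, n_pairs=256):
    """feats: (N, D) sampled pixel features; labels: (N,) their labels."""
    idx = torch.randint(0, feats.size(0), (n_pairs, 2))
    a, b = feats[idx[:, 0]], feats[idx[:, 1]]
    same = (labels[idx[:, 0]] == labels[idx[:, 1]]).float()
    dist = (a - b).norm(dim=1)
    # Attract same-label pairs, repel different-label pairs up to the margin.
    return (same * dist ** 2
            + (1 - same) * torch.clamp(margin - dist, min=0) ** 2).mean()

feats = torch.randn(1024, 64)            # stand-in for sampled FCN features
labels = torch.randint(0, 2, (1024,))    # labeled or pseudo-labeled pixels
print(embedding_loss(feats, labels).item())
```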

Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates

Title Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
Authors Leslie N. Smith, Nicholay Topin
Abstract In this paper, we describe a phenomenon, which we named “super-convergence”, where neural networks can be trained an order of magnitude faster than with standard training methods. The existence of super-convergence is relevant to understanding why deep networks generalize well. One of the key elements of super-convergence is training with one learning rate cycle and a large maximum learning rate. A primary insight that allows super-convergence training is that large learning rates regularize the training, hence requiring a reduction of all other forms of regularization in order to preserve an optimal regularization balance. We also derive a simplification of the Hessian Free optimization method to compute an estimate of the optimal learning rate. Experiments demonstrate super-convergence for Cifar-10/100, MNIST and Imagenet datasets, and resnet, wide-resnet, densenet, and inception architectures. In addition, we show that super-convergence provides a greater boost in performance relative to standard training when the amount of labeled training data is limited. The architectures and code to replicate the figures in this paper are available at github.com/lnsmith54/super-convergence. See http://www.fast.ai/2018/04/30/dawnbench-fastai/ for an application of super-convergence to win the DAWNBench challenge (see https://dawn.cs.stanford.edu/benchmark/).
Tasks
Published 2017-08-23
URL http://arxiv.org/abs/1708.07120v3
PDF http://arxiv.org/pdf/1708.07120v3.pdf
PWC https://paperswithcode.com/paper/super-convergence-very-fast-training-of-1
Repo https://github.com/coxy1989/superconv
Framework tf
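A sketch of the single learning-rate cycle behind super-convergence: ramp the learning rate linearly up to a large maximum, back down, then anneal further for the final steps. The exact phase lengths and the final divisor follow common one-cycle implementations and are assumptions, not the paper's prescribed values.

```python
def one_cycle_lr(step, total_steps, max_lr=3.0, base_lr=0.1, final_div=100):
    """Piecewise-linear one-cycle schedule: up, down, then a short anneal."""
    up = int(0.45 * total_steps)
    down = int(0.45 * total_steps)
    if step < up:                                    # linear warm-up
        return base_lr + (max_lr - base_lr) * step / up
    if step < up + down:                             # linear decay
        return max_lr - (max_lr - base_lr) * (step - up) / down
    tail = total_steps - up - down                   # final annealing phase
    return base_lr - (base_lr - base_lr / final_div) * (step - up - down) / tail

for s in (0, 450, 900, 999):
    print(s, round(one_cycle_lr(s, 1000), 3))
```

Per the paper's insight, the large mid-cycle learning rate acts as a regularizer, so other regularization (weight decay, dropout) is typically reduced to compensate.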

Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach

Title Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach
Authors Yuhang Song, Mai Xu, Jianyi Wang, Minglang Qiao, Liangyu Huo, Zulin Wang
Abstract Panoramic video provides immersive and interactive experience by enabling humans to control the field of view (FoV) through head movement (HM). Thus, HM plays a key role in modeling human attention on panoramic video. This paper establishes a database collecting subjects’ HM in panoramic video sequences. From this database, we find that the HM data are highly consistent across subjects. Furthermore, we find that deep reinforcement learning (DRL) can be applied to predict HM positions, via maximizing the reward of imitating human HM scanpaths through the agent’s actions. Based on our findings, we propose a DRL-based HM prediction (DHP) approach with offline and online versions, called offline-DHP and online-DHP. In offline-DHP, multiple DRL workflows are run to determine potential HM positions at each panoramic frame. Then, a heat map of the potential HM positions, named the HM map, is generated as the output of offline-DHP. In online-DHP, the next HM position of one subject is estimated given the currently observed HM position, which is achieved by developing a DRL algorithm upon the learned offline-DHP model. Finally, the experiments validate that our approach is effective in both offline and online prediction of HM positions for panoramic video, and that the learned offline-DHP model can improve the performance of online-DHP.
Tasks
Published 2017-10-30
URL https://arxiv.org/abs/1710.10755v5
PDF https://arxiv.org/pdf/1710.10755v5.pdf
PWC https://paperswithcode.com/paper/predicting-head-movement-in-panoramic-video-a
Repo https://github.com/YuhangSong/DHP
Framework tf
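A sketch of the offline-DHP output stage as described: aggregate the head-movement positions proposed by multiple DRL workflows into a per-frame heat map, here by placing a Gaussian at each predicted position. The grid size and sigma are illustrative assumptions.

```python
import numpy as np

def hm_heatmap(positions, h=90, w=180, sigma=5.0):
    """positions: (row, col) HM predictions on an equirectangular grid."""
    ys, xs = np.mgrid[0:h, 0:w]
    heat = np.zeros((h, w))
    for r, c in positions:
        heat += np.exp(-((ys - r) ** 2 + (xs - c) ** 2) / (2 * sigma ** 2))
    return heat / max(heat.max(), 1e-8)   # normalize to [0, 1]

# Positions from three hypothetical DRL workflows for one frame.
heat = hm_heatmap([(45, 60), (48, 65), (20, 150)])
print(heat.shape, float(heat.max()))
```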

SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements

Title SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements
Authors Francisco J. R. Ruiz, Susan Athey, David M. Blei
Abstract We develop SHOPPER, a sequential probabilistic model of shopping data. SHOPPER uses interpretable components to model the forces that drive how a customer chooses products; in particular, we designed SHOPPER to capture how items interact with other items. We develop an efficient posterior inference algorithm to estimate these forces from large-scale data, and we analyze a large dataset from a major chain grocery store. We are interested in answering counterfactual queries about changes in prices. We found that SHOPPER provides accurate predictions even under price interventions, and that it helps identify complementary and substitutable pairs of products.
Tasks
Published 2017-11-09
URL https://arxiv.org/abs/1711.03560v3
PDF https://arxiv.org/pdf/1711.03560v3.pdf
PWC https://paperswithcode.com/paper/shopper-a-probabilistic-model-of-consumer
Repo https://github.com/franrruiz/shopper-src
Framework none
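A sketch of SHOPPER's central modeling idea: an item's appeal given a basket is its own latent utility plus an interaction term with the items already chosen, converted to choice probabilities with a softmax. The latent dimensions and the mean-pooled interaction below are simplifications of the paper's full sequential model (which also handles prices and preferences).

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, d = 20, 5
alpha = rng.normal(size=n_items)          # per-item base utilities
rho = rng.normal(size=(n_items, d))       # how an item reacts to the basket
alpha_v = rng.normal(size=(n_items, d))   # how an item contributes to the basket

def choice_probs(basket):
    """P(next item | basket) under a simplified SHOPPER-style utility."""
    basket_vec = alpha_v[basket].mean(axis=0) if basket else np.zeros(d)
    utility = alpha + rho @ basket_vec
    utility[basket] = -np.inf             # items already chosen are excluded
    m = utility[np.isfinite(utility)].max()
    e = np.exp(utility - m)
    return e / e.sum()

p = choice_probs([3, 7])
print(p.argmax(), round(float(p.sum()), 6))
```

Complements show up as pairs whose interaction raises each other's utility; substitutes as pairs that lower it, which is what the counterfactual price queries exploit.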

Predicting Deeper into the Future of Semantic Segmentation

Title Predicting Deeper into the Future of Semantic Segmentation
Authors Pauline Luc, Natalia Neverova, Camille Couprie, Jakob Verbeek, Yann LeCun
Abstract The ability to predict and therefore to anticipate the future is an important attribute of intelligence. It is also of utmost importance in real-time systems, e.g. in robotics or autonomous driving, which depend on visual scene understanding for decision making. While prediction of the raw RGB pixel values in future video frames has been studied in previous work, here we introduce the novel task of predicting semantic segmentations of future frames. Given a sequence of video frames, our goal is to predict segmentation maps of not yet observed video frames that lie up to a second or further in the future. We develop an autoregressive convolutional neural network that learns to iteratively generate multiple frames. Our results on the Cityscapes dataset show that directly predicting future segmentations is substantially better than predicting and then segmenting future RGB frames. Prediction results up to half a second in the future are visually convincing and are much more accurate than those of a baseline based on warping semantic segmentations using optical flow.
Tasks Autonomous Driving, Decision Making, Optical Flow Estimation, Scene Understanding, Semantic Segmentation, Video Prediction
Published 2017-03-22
URL http://arxiv.org/abs/1703.07684v3
PDF http://arxiv.org/pdf/1703.07684v3.pdf
PWC https://paperswithcode.com/paper/predicting-deeper-into-the-future-of-semantic
Repo https://github.com/m-serra/action-inference-for-video-prediction-benchmarking
Framework tf
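A sketch of the autoregressive prediction loop: a convolutional model maps the last few segmentation maps to the next one, and its own outputs are fed back in to reach further into the future. The single-convolution model is a placeholder, not the paper's architecture.

```python
import torch
import torch.nn as nn

n_classes, context = 5, 4
model = nn.Conv2d(context * n_classes, n_classes, kernel_size=3, padding=1)

def predict_future(seg_history, steps):
    """seg_history: list of (1, n_classes, H, W) soft segmentation maps."""
    history = list(seg_history)
    for _ in range(steps):
        x = torch.cat(history[-context:], dim=1)   # stack recent maps
        nxt = model(x).softmax(dim=1)              # predict the next map
        history.append(nxt)                        # feed prediction back in
    return history[-steps:]

frames = [torch.rand(1, n_classes, 32, 64).softmax(dim=1) for _ in range(context)]
future = predict_future(frames, steps=3)
print(len(future), future[0].shape)
```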

Self-Taught Convolutional Neural Networks for Short Text Clustering

Title Self-Taught Convolutional Neural Networks for Short Text Clustering
Authors Jiaming Xu, Bo Xu, Peng Wang, Suncong Zheng, Guanhua Tian, Jun Zhao, Bo Xu
Abstract Short text clustering is a challenging problem due to the sparseness of its text representation. Here we propose a flexible Self-Taught Convolutional neural network framework for Short Text Clustering (dubbed STC^2), which can flexibly and successfully incorporate more useful semantic features and learn non-biased deep text representations in an unsupervised manner. In our framework, the original raw text features are first embedded into compact binary codes using an existing unsupervised dimensionality reduction method. Then, word embeddings are explored and fed into convolutional neural networks to learn deep feature representations, while the output units are used to fit the pre-trained binary codes during training. Finally, we obtain the optimal clusters by employing K-means to cluster the learned representations. Extensive experimental results demonstrate that the proposed framework is effective and flexible, and outperforms several popular clustering methods when tested on three public short text datasets.
Tasks Dimensionality Reduction, Text Clustering, Word Embeddings
Published 2017-01-01
URL http://arxiv.org/abs/1701.00185v1
PDF http://arxiv.org/pdf/1701.00185v1.pdf
PWC https://paperswithcode.com/paper/self-taught-convolutional-neural-networks-for
Repo https://github.com/jacoxu/STC2
Framework tf
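A sketch of the pipeline's bookends: compress raw text features into binary codes (here via truncated SVD and a median threshold, standing in for the paper's unsupervised dimensionality reduction step) and cluster the learned representations with K-means. The CNN that is trained to fit the codes is omitted; its hidden representations would replace `low` in the clustering step.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

texts = ["cheap flights to rome", "rome flight deals",
         "python list comprehension", "sorting a python list"]

tfidf = TfidfVectorizer().fit_transform(texts)
low = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)
codes = (low > np.median(low, axis=0)).astype(int)   # binary target codes
print("codes:\n", codes)

# In STC^2, a CNN over word embeddings is trained to predict these codes;
# here we cluster the low-dimensional features directly as a stand-in.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(low)
print("clusters:", labels)
```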