Paper Group AWR 77
Bayesian Cluster Enumeration Criterion for Unsupervised Learning
Title | Bayesian Cluster Enumeration Criterion for Unsupervised Learning |
Authors | Freweyni K. Teklehaymanot, Michael Muma, Abdelhak M. Zoubir |
Abstract | We derive a new Bayesian Information Criterion (BIC) by formulating the problem of estimating the number of clusters in an observed data set as maximization of the posterior probability of the candidate models. Given that some mild assumptions are satisfied, we provide a general BIC expression for a broad class of data distributions. This serves as a starting point when deriving the BIC for specific distributions. Along this line, we provide a closed-form BIC expression for multivariate Gaussian distributed variables. We show that incorporating the data structure of the clustering problem into the derivation of the BIC results in an expression whose penalty term is different from that of the original BIC. We propose a two-step cluster enumeration algorithm. First, a model-based unsupervised learning algorithm partitions the data according to a given set of candidate models. Subsequently, the number of clusters is determined as the one associated with the model for which the proposed BIC is maximal. The performance of the proposed two-step algorithm is tested using synthetic and real data sets. |
Tasks | |
Published | 2017-10-22 |
URL | http://arxiv.org/abs/1710.07954v3 |
http://arxiv.org/pdf/1710.07954v3.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-cluster-enumeration-criterion-for |
Repo | https://github.com/FreTekle/Bayesian-Cluster-Enumeration |
Framework | none |
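The two-step procedure in the abstract maps naturally onto existing tooling. Below is a minimal sketch using scikit-learn's `GaussianMixture`, with the caveat that sklearn's `bic()` implements the classic BIC penalty rather than the cluster-aware penalty the paper derives, so this only illustrates the enumeration structure, not the proposed criterion itself.

```python
# Sketch of the two-step enumeration loop: (1) fit a model per candidate
# cluster count, (2) pick the count whose criterion is maximal.
# NOTE: sklearn's bic() is the classic BIC, not the paper's variant.
import numpy as np
from sklearn.mixture import GaussianMixture

def enumerate_clusters(X, k_min=1, k_max=10, seed=0):
    scores = {}
    for k in range(k_min, k_max + 1):
        gmm = GaussianMixture(n_components=k, random_state=seed).fit(X)
        scores[k] = -gmm.bic(X)  # sklearn's BIC is minimized; flip the sign
    return max(scores, key=scores.get), scores

X = np.vstack([np.random.randn(100, 2) + c for c in ([0, 0], [5, 5], [0, 5])])
k_hat, _ = enumerate_clusters(X)
print(k_hat)  # expected: 3
```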
BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet
Title | BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet |
Authors | Haojin Yang, Martin Fritzsche, Christian Bartz, Christoph Meinel |
Abstract | Binary Neural Networks (BNNs) can drastically reduce memory size and accesses by applying bit-wise operations instead of standard arithmetic operations. They can therefore significantly improve efficiency and lower energy consumption at runtime, which enables the application of state-of-the-art deep learning models on low-power devices. BMXNet is an open-source BNN library based on MXNet, which supports both XNOR-Networks and Quantized Neural Networks. The developed BNN layers can be seamlessly applied with other standard library components and work in both GPU and CPU mode. BMXNet is maintained and developed by the multimedia research group at Hasso Plattner Institute and released under the Apache license. Extensive experiments validate the efficiency and effectiveness of our implementation. The BMXNet library, several sample projects, and a collection of pre-trained binary deep models are available for download at https://github.com/hpi-xnor |
Tasks | |
Published | 2017-05-27 |
URL | http://arxiv.org/abs/1705.09864v1 |
http://arxiv.org/pdf/1705.09864v1.pdf | |
PWC | https://paperswithcode.com/paper/bmxnet-an-open-source-binary-neural-network |
Repo | https://github.com/hpi-xnor/BMXNet |
Framework | mxnet |
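The bit-wise trick BMXNet builds on can be shown in a few lines: a dot product between two {-1, +1} vectors reduces to XNOR plus popcount on their bit encodings. A conceptual numpy sketch follows; it illustrates the arithmetic identity, not BMXNet's optimized GPU/CPU kernels.

```python
# Illustration of the xnor-popcount dot product that underlies BNNs.
import numpy as np

def binarize(x):
    """Map real values to {0, 1} bits representing {-1, +1}."""
    return (x >= 0).astype(np.uint8)

def xnor_dot(a_bits, b_bits):
    """Dot product of two {-1, +1} vectors given their bit encodings."""
    n = a_bits.size
    matches = np.count_nonzero(~(a_bits ^ b_bits) & 1)  # xnor + popcount
    return 2 * matches - n  # each match contributes +1, each mismatch -1

a, b = np.random.randn(64), np.random.randn(64)
assert xnor_dot(binarize(a), binarize(b)) == int(np.sign(a) @ np.sign(b))
```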
Developing a comprehensive framework for multimodal feature extraction
Title | Developing a comprehensive framework for multimodal feature extraction |
Authors | Quinten McNamara, Alejandro de la Vega, Tal Yarkoni |
Abstract | Feature extraction is a critical component of many applied data science workflows. In recent years, rapid advances in artificial intelligence and machine learning have led to an explosion of feature extraction tools and services that allow data scientists to cheaply and effectively annotate their data along a vast array of dimensions, ranging from detecting faces in images to analyzing the sentiment expressed in coherent text. Unfortunately, the proliferation of powerful feature extraction services has been mirrored by a corresponding expansion in the number of distinct interfaces to feature extraction services. In a world where nearly every new service has its own API, documentation, and/or client library, data scientists who need to combine diverse features obtained from multiple sources are often forced to write and maintain ever more elaborate feature extraction pipelines. To address this challenge, we introduce a new open-source framework for comprehensive multimodal feature extraction. Pliers is an open-source Python package that supports standardized annotation of diverse data types (video, images, audio, and text), and is expressly designed with both ease-of-use and extensibility in mind. Users can apply a wide range of pre-existing feature extraction tools to their data in just a few lines of Python code, and can also easily add their own custom extractors by writing modular classes. A graph-based API enables rapid development of complex feature extraction pipelines that output results in a single, standardized format. We describe the package’s architecture, detail its major advantages over previous feature extraction toolboxes, and use a sample application to a large functional MRI dataset to illustrate how pliers can significantly reduce the time and effort required to construct sophisticated feature extraction workflows while increasing code clarity and maintainability. |
Tasks | |
Published | 2017-02-20 |
URL | http://arxiv.org/abs/1702.06151v1 |
http://arxiv.org/pdf/1702.06151v1.pdf | |
PWC | https://paperswithcode.com/paper/developing-a-comprehensive-framework-for |
Repo | https://github.com/tyarkoni/pliers |
Framework | none |
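A hypothetical usage sketch of the "few lines of Python" workflow the abstract describes. The stimulus and extractor class names below are drawn from pliers' documented API, but exact names vary across versions, so treat the specifics as illustrative rather than authoritative.

```python
# Hedged sketch of the pliers workflow: wrap a stimulus, run an extractor,
# and merge results into the package's single standardized tabular format.
from pliers.stimuli import ImageStim
from pliers.extractors import BrightnessExtractor, merge_results

stim = ImageStim('frame.jpg')                     # any local image file
results = [BrightnessExtractor().transform(stim)] # one extractor of many
df = merge_results(results)                       # standardized DataFrame
print(df.head())
```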
A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing
Title | A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing |
Authors | Dat Quoc Nguyen, Mark Dras, Mark Johnson |
Abstract | We present a novel neural network model that learns POS tagging and graph-based dependency parsing jointly. Our model uses bidirectional LSTMs to learn feature representations shared for both POS tagging and dependency parsing tasks, thus handling the feature-engineering problem. Our extensive experiments, on 19 languages from the Universal Dependencies project, show that our model outperforms the state-of-the-art neural network-based Stack-propagation model for joint POS tagging and transition-based dependency parsing, resulting in a new state of the art. Our code is open-source and available together with pre-trained models at: https://github.com/datquocnguyen/jPTDP |
Tasks | Dependency Parsing, Feature Engineering, Part-Of-Speech Tagging, Transition-Based Dependency Parsing |
Published | 2017-05-16 |
URL | http://arxiv.org/abs/1705.05952v2 |
http://arxiv.org/pdf/1705.05952v2.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-neural-network-model-for-joint-pos |
Repo | https://github.com/datquocnguyen/jPTDP |
Framework | none |
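A schematic PyTorch re-sketch of the shared-representation idea in the abstract (the authors' released implementation differs): one BiLSTM produces features consumed by both a POS tagging head and a pairwise arc scorer for graph-based parsing. Dimensions and the bilinear arc scorer here are illustrative stand-ins.

```python
# Shared BiLSTM feeding two task heads, as the abstract describes.
import torch
import torch.nn as nn

class JointTaggerParser(nn.Module):
    def __init__(self, vocab, emb=100, hid=128, n_tags=17):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.bilstm = nn.LSTM(emb, hid, bidirectional=True, batch_first=True)
        self.tag_head = nn.Linear(2 * hid, n_tags)   # POS tagging head
        self.head_mlp = nn.Linear(2 * hid, hid)      # "head word" view
        self.dep_mlp = nn.Linear(2 * hid, hid)       # "dependent" view

    def forward(self, words):                        # words: (batch, seq)
        h, _ = self.bilstm(self.embed(words))        # shared features
        tag_scores = self.tag_head(h)                # (batch, seq, n_tags)
        arc_scores = self.dep_mlp(h) @ self.head_mlp(h).transpose(1, 2)
        return tag_scores, arc_scores                # arc[b, i, j]: j heads i

model = JointTaggerParser(vocab=10000)
tags, arcs = model(torch.randint(0, 10000, (2, 15)))
```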
Towards Alzheimer’s Disease Classification through Transfer Learning
Title | Towards Alzheimer’s Disease Classification through Transfer Learning |
Authors | Marcia Hon, Naimul Khan |
Abstract | Detection of Alzheimer’s Disease (AD) from neuroimaging data such as MRI through machine learning has been a subject of intense research in recent years. The recent success of deep learning in computer vision has progressed such research further. However, common limitations of such algorithms are reliance on a large number of training images and the requirement of careful optimization of the architecture of deep networks. In this paper, we attempt to solve these issues with transfer learning, where state-of-the-art architectures such as VGG and Inception are initialized with pre-trained weights from large benchmark datasets consisting of natural images, and the fully-connected layer is re-trained with only a small number of MRI images. We employ image entropy to select the most informative slices for training. Through experimentation on the OASIS MRI dataset, we show that with a training size almost 10 times smaller than the state-of-the-art, we reach comparable or even better performance than current deep-learning based methods. |
Tasks | Transfer Learning |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1711.11117v1 |
http://arxiv.org/pdf/1711.11117v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-alzheimers-disease-classification |
Repo | https://github.com/marciahon29/Ryerson_MRP |
Framework | tf |
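The entropy-based slice selection the abstract mentions is straightforward to sketch: rank slices of a volume by the Shannon entropy of their intensity histogram and keep the highest-scoring ones for fine-tuning. The bin count and the number of retained slices below are illustrative choices, not the paper's reported settings.

```python
# Rank MRI slices by image entropy and keep the most informative ones.
import numpy as np

def slice_entropy(img, bins=256):
    hist, _ = np.histogram(img, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))  # Shannon entropy of intensities

volume = np.random.rand(64, 128, 128)         # stand-in for an MRI volume
scores = [slice_entropy(volume[i]) for i in range(volume.shape[0])]
top_slices = np.argsort(scores)[::-1][:16]    # e.g., 16 most informative
```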
Detecting Online Hate Speech Using Context Aware Models
Title | Detecting Online Hate Speech Using Context Aware Models |
Authors | Lei Gao, Ruihong Huang |
Abstract | In the wake of a polarizing election, the cyber world is laden with hate speech. Context accompanying a hate speech text is useful for identifying hate speech, but it has been largely overlooked in existing datasets and hate speech detection models. In this paper, we provide an annotated corpus of hate speech with context information well kept. We then propose two types of hate speech detection models that incorporate context information: a logistic regression model with context features and a neural network model with learning components for context. Our evaluation shows that both models outperform a strong baseline by around 3% to 4% in F1 score, and combining the two models further improves the performance by another 7% in F1 score. |
Tasks | Hate Speech Detection |
Published | 2017-10-20 |
URL | http://arxiv.org/abs/1710.07395v2 |
http://arxiv.org/pdf/1710.07395v2.pdf | |
PWC | https://paperswithcode.com/paper/detecting-online-hate-speech-using-context |
Repo | https://github.com/sjtuprog/fox-news-comments |
Framework | none |
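A minimal sketch of the context-feature logistic regression idea from the abstract: concatenate TF-IDF features of the comment with TF-IDF features of its context (e.g., the news title) before classification. The feature choice and toy data are illustrative, not the paper's exact feature set.

```python
# Comment features + context features -> one logistic regression.
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

comments = ["example comment one", "another example comment"]
contexts = ["article title one", "article title two"]   # context per comment
labels = [0, 1]

vec_c, vec_t = TfidfVectorizer(), TfidfVectorizer()
X = hstack([vec_c.fit_transform(comments), vec_t.fit_transform(contexts)])
clf = LogisticRegression().fit(X, labels)
```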
Towards the Automatic Anime Characters Creation with Generative Adversarial Networks
Title | Towards the Automatic Anime Characters Creation with Generative Adversarial Networks |
Authors | Yanghua Jin, Jiakai Zhang, Minjun Li, Yingtao Tian, Huachun Zhu, Zhihao Fang |
Abstract | Automatic generation of facial images has been well studied since the Generative Adversarial Network (GAN) came out. There have been some attempts to apply the GAN model to the problem of generating facial images of anime characters, but none of the existing work gives a promising result. In this work, we explore the training of GAN models specialized on an anime facial image dataset. We address the issue from both the data and the model aspect, by collecting a cleaner, better-suited dataset and leveraging a proper, empirical application of DRAGAN. With quantitative analysis and case studies, we demonstrate that our efforts lead to a stable and high-quality model. Moreover, to assist people with anime character design, we build a website (http://make.girls.moe) with our pre-trained model available online, which makes the model easily accessible to the general public. |
Tasks | |
Published | 2017-08-18 |
URL | http://arxiv.org/abs/1708.05509v1 |
http://arxiv.org/pdf/1708.05509v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-the-automatic-anime-characters |
Repo | https://github.com/ctwxdd/Tensorflow-ACGAN-Anime-Generation |
Framework | tf |
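The stabilizing ingredient the abstract credits is DRAGAN's gradient penalty. Below is a simplified PyTorch sketch of that penalty (the linked repo is TensorFlow; perturbation scale and hyperparameters here are illustrative): perturb real samples within a local neighborhood and penalize discriminator gradient norms that deviate from 1.

```python
# Simplified DRAGAN gradient penalty, applied around perturbed real data.
import torch

def dragan_penalty(discriminator, real, lambda_=10.0, k=1.0):
    noise = 0.5 * real.std() * torch.rand_like(real)   # local perturbation
    x_hat = (real + noise).requires_grad_(True)
    d_out = discriminator(x_hat).sum()
    grads, = torch.autograd.grad(d_out, x_hat, create_graph=True)
    norms = grads.view(grads.size(0), -1).norm(2, dim=1)
    return lambda_ * ((norms - k) ** 2).mean()

D = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 64 * 3, 1))
loss = dragan_penalty(D, torch.randn(8, 3, 64, 64))
```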
Comparing Dataset Characteristics that Favor the Apriori, Eclat or FP-Growth Frequent Itemset Mining Algorithms
Title | Comparing Dataset Characteristics that Favor the Apriori, Eclat or FP-Growth Frequent Itemset Mining Algorithms |
Authors | Jeff Heaton |
Abstract | Frequent itemset mining is a popular data mining technique. Apriori, Eclat, and FP-Growth are among the most common algorithms for frequent itemset mining. Considerable research has been performed to compare the relative performance of these three algorithms by evaluating the scalability of each algorithm as the dataset size increases. While scalability as data size increases is important, previous papers have not examined the performance impact of similarly sized datasets that contain different itemset characteristics. This paper explores the effects that two dataset characteristics can have on the performance of these three frequent itemset algorithms. To perform this empirical analysis, a dataset generator is created to measure the effects of frequent item density and maximum transaction size on performance. The generated datasets contain the same number of rows. This provides some insight into dataset characteristics that are conducive to each algorithm. The results of this paper’s research demonstrate that Eclat and FP-Growth both handle increases in maximum transaction size and frequent itemset density considerably better than the Apriori algorithm. |
Tasks | |
Published | 2017-01-30 |
URL | http://arxiv.org/abs/1701.09042v1 |
http://arxiv.org/pdf/1701.09042v1.pdf | |
PWC | https://paperswithcode.com/paper/comparing-dataset-characteristics-that-favor |
Repo | https://github.com/alextanhongpin/affinity-analysis |
Framework | none |
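A sketch of a transaction generator in the spirit of the paper's empirical setup: the row count is fixed while maximum transaction size and the density of frequent items vary. The sampling scheme and the 0.8 hot-pool probability below are illustrative, not the paper's exact generator.

```python
# Generate synthetic transactions with controllable density and size.
import random

def generate_transactions(n_rows, n_items, max_size, freq_density, seed=0):
    rng = random.Random(seed)
    frequent = list(range(int(n_items * freq_density)))  # dense "hot" items
    rows = []
    for _ in range(n_rows):
        size = rng.randint(1, max_size)
        pool = frequent if rng.random() < 0.8 else list(range(n_items))
        rows.append(set(rng.sample(pool, min(size, len(pool)))))
    return rows

transactions = generate_transactions(10000, 500, max_size=20, freq_density=0.1)
```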
Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching
Title | Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching |
Authors | Andy Zeng, Shuran Song, Kuan-Ting Yu, Elliott Donlon, Francois R. Hogan, Maria Bauza, Daolin Ma, Orion Taylor, Melody Liu, Eudald Romo, Nima Fazeli, Ferran Alet, Nikhil Chavan Dafle, Rachel Holladay, Isabella Morona, Prem Qu Nair, Druck Green, Ian Taylor, Weber Liu, Thomas Funkhouser, Alberto Rodriguez |
Abstract | This paper presents a robotic pick-and-place system that is capable of grasping and recognizing both known and novel objects in cluttered environments. The key new feature of the system is that it handles a wide range of object categories without needing any task-specific training data for novel objects. To achieve this, it first uses a category-agnostic affordance prediction algorithm to select and execute among four different grasping primitive behaviors. It then recognizes picked objects with a cross-domain image classification framework that matches observed images to product images. Since product images are readily available for a wide range of objects (e.g., from the web), the system works out-of-the-box for novel objects without requiring any additional training data. Exhaustive experimental results demonstrate that our multi-affordance grasping achieves high success rates for a wide variety of objects in clutter, and our recognition algorithm achieves high accuracy for both known and novel grasped objects. The approach was part of the MIT-Princeton Team system that took 1st place in the stowing task at the 2017 Amazon Robotics Challenge. All code, datasets, and pre-trained models are available online at http://arc.cs.princeton.edu |
Tasks | Image Classification, Robotic Grasping |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01330v4 |
http://arxiv.org/pdf/1710.01330v4.pdf | |
PWC | https://paperswithcode.com/paper/robotic-pick-and-place-of-novel-objects-in |
Repo | https://github.com/andyzeng/arc-robot-vision |
Framework | torch |
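The recognition step the abstract describes is a cross-domain nearest-neighbor match: embed the observed crop and all catalog product images into a shared feature space, then pick the closest product. A conceptual numpy sketch, with random vectors standing in for the learned embedding network:

```python
# Nearest-neighbor matching of an observed crop against product embeddings.
import numpy as np

def recognize(observed_feat, product_feats, product_names):
    # Cosine similarity against every catalog embedding.
    sims = product_feats @ observed_feat / (
        np.linalg.norm(product_feats, axis=1) * np.linalg.norm(observed_feat))
    return product_names[int(np.argmax(sims))]

catalog = np.random.randn(40, 128)            # 40 product-image embeddings
names = [f"item_{i}" for i in range(40)]
print(recognize(np.random.randn(128), catalog, names))
```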
Semi-Supervised Deep Learning for Fully Convolutional Networks
Title | Semi-Supervised Deep Learning for Fully Convolutional Networks |
Authors | Christoph Baur, Shadi Albarqouni, Nassir Navab |
Abstract | Deep learning usually requires large amounts of labeled training data, but annotating data is costly and tedious. The framework of semi-supervised learning provides the means to use both labeled data and arbitrary amounts of unlabeled data for training. Recently, semi-supervised deep learning has been intensively studied for standard CNN architectures. However, Fully Convolutional Networks (FCNs) set the state-of-the-art for many image segmentation tasks. To the best of our knowledge, there is no existing semi-supervised learning method for such FCNs yet. We lift the concept of auxiliary manifold embedding for semi-supervised learning to FCNs with the help of Random Feature Embedding. In our experiments on the challenging task of MS Lesion Segmentation, we leverage the proposed framework for the purpose of domain adaptation and report substantial improvements over the baseline model. |
Tasks | Domain Adaptation, Lesion Segmentation, Semantic Segmentation |
Published | 2017-03-17 |
URL | http://arxiv.org/abs/1703.06000v2 |
http://arxiv.org/pdf/1703.06000v2.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-deep-learning-for-fully |
Repo | https://github.com/bumuckl/SemiSupervisedDLForFCNs |
Framework | none |
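The auxiliary manifold embedding the abstract lifts to FCNs can be illustrated with a standard contrastive penalty on pairs of embedded pixels: pull same-label pairs together, push different-label pairs apart. The margin and pair sampling here are illustrative, not the paper's exact Random Feature Embedding scheme.

```python
# Contrastive penalty on pairs of pixel embeddings.
import torch

def embedding_loss(emb_a, emb_b, same_label, margin=1.0):
    d = (emb_a - emb_b).norm(2, dim=1)
    pull = same_label * d ** 2                               # attract pairs
    push = (1 - same_label) * torch.clamp(margin - d, min=0) ** 2
    return (pull + push).mean()

a, b = torch.randn(32, 64), torch.randn(32, 64)
same = (torch.rand(32) > 0.5).float()
print(embedding_loss(a, b, same))
```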
Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
Title | Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates |
Authors | Leslie N. Smith, Nicholay Topin |
Abstract | In this paper, we describe a phenomenon, which we named “super-convergence”, where neural networks can be trained an order of magnitude faster than with standard training methods. The existence of super-convergence is relevant to understanding why deep networks generalize well. One of the key elements of super-convergence is training with one learning rate cycle and a large maximum learning rate. A primary insight that allows super-convergence training is that large learning rates regularize the training, hence requiring a reduction of all other forms of regularization in order to preserve an optimal regularization balance. We also derive a simplification of the Hessian Free optimization method to compute an estimate of the optimal learning rate. Experiments demonstrate super-convergence for Cifar-10/100, MNIST and Imagenet datasets, and resnet, wide-resnet, densenet, and inception architectures. In addition, we show that super-convergence provides a greater boost in performance relative to standard training when the amount of labeled training data is limited. The architectures and code to replicate the figures in this paper are available at github.com/lnsmith54/super-convergence. See http://www.fast.ai/2018/04/30/dawnbench-fastai/ for an application of super-convergence to win the DAWNBench challenge (see https://dawn.cs.stanford.edu/benchmark/). |
Tasks | |
Published | 2017-08-23 |
URL | http://arxiv.org/abs/1708.07120v3 |
http://arxiv.org/pdf/1708.07120v3.pdf | |
PWC | https://paperswithcode.com/paper/super-convergence-very-fast-training-of-1 |
Repo | https://github.com/coxy1989/superconv |
Framework | tf |
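The schedule at the heart of super-convergence is simple to sketch: ramp the learning rate linearly up to a large maximum, then back down, over one cycle spanning training. The endpoint values below are illustrative, not the paper's reported settings.

```python
# Minimal one-cycle learning-rate schedule.
def one_cycle_lr(step, total_steps, lr_max=3.0, lr_min=0.1):
    half = total_steps / 2
    if step <= half:
        return lr_min + (lr_max - lr_min) * step / half       # warm-up leg
    return lr_max - (lr_max - lr_min) * (step - half) / half  # cool-down leg

schedule = [one_cycle_lr(s, 1000) for s in range(1001)]
```

PyTorch later shipped `torch.optim.lr_scheduler.OneCycleLR`, a refined built-in version of this schedule.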
Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach
Title | Predicting Head Movement in Panoramic Video: A Deep Reinforcement Learning Approach |
Authors | Yuhang Song, Mai Xu, Jianyi Wang, Minglang Qiao, Liangyu Huo, Zulin Wang |
Abstract | Panoramic video provides immersive and interactive experience by enabling humans to control the field of view (FoV) through head movement (HM). Thus, HM plays a key role in modeling human attention on panoramic video. This paper establishes a database collecting subjects’ HM in panoramic video sequences. From this database, we find that the HM data are highly consistent across subjects. Furthermore, we find that deep reinforcement learning (DRL) can be applied to predict HM positions, via maximizing the reward of imitating human HM scanpaths through the agent’s actions. Based on our findings, we propose a DRL-based HM prediction (DHP) approach with offline and online versions, called offline-DHP and online-DHP. In offline-DHP, multiple DRL workflows are run to determine potential HM positions at each panoramic frame. Then, a heat map of the potential HM positions, named the HM map, is generated as the output of offline-DHP. In online-DHP, the next HM position of one subject is estimated given the currently observed HM position, which is achieved by developing a DRL algorithm upon the learned offline-DHP model. Finally, the experiments validate that our approach is effective in both offline and online prediction of HM positions for panoramic video, and that the learned offline-DHP model can improve the performance of online-DHP. |
Tasks | |
Published | 2017-10-30 |
URL | https://arxiv.org/abs/1710.10755v5 |
https://arxiv.org/pdf/1710.10755v5.pdf | |
PWC | https://paperswithcode.com/paper/predicting-head-movement-in-panoramic-video-a |
Repo | https://github.com/YuhangSong/DHP |
Framework | tf |
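The offline-DHP output described in the abstract is a per-frame heat map aggregated from the head-movement positions proposed by multiple DRL workflows. A numpy/scipy sketch of that aggregation step, with grid size and smoothing as illustrative choices (wrap-around smoothing reflects the 360° panorama):

```python
# Aggregate HM proposals from multiple workflows into a smoothed heat map.
import numpy as np
from scipy.ndimage import gaussian_filter

def hm_heatmap(positions, shape=(36, 72), sigma=2.0):
    """positions: (N, 2) array of (lat_bin, lon_bin) HM proposals."""
    grid = np.zeros(shape)
    for i, j in positions:
        grid[i % shape[0], j % shape[1]] += 1.0
    return gaussian_filter(grid, sigma=sigma, mode='wrap')  # 360-degree wrap

heat = hm_heatmap(np.random.randint(0, 36, size=(50, 2)))
```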
SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements
Title | SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements |
Authors | Francisco J. R. Ruiz, Susan Athey, David M. Blei |
Abstract | We develop SHOPPER, a sequential probabilistic model of shopping data. SHOPPER uses interpretable components to model the forces that drive how a customer chooses products; in particular, we designed SHOPPER to capture how items interact with other items. We develop an efficient posterior inference algorithm to estimate these forces from large-scale data, and we analyze a large dataset from a major chain grocery store. We are interested in answering counterfactual queries about changes in prices. We found that SHOPPER provides accurate predictions even under price interventions, and that it helps identify complementary and substitutable pairs of products. |
Tasks | |
Published | 2017-11-09 |
URL | https://arxiv.org/abs/1711.03560v3 |
https://arxiv.org/pdf/1711.03560v3.pdf | |
PWC | https://paperswithcode.com/paper/shopper-a-probabilistic-model-of-consumer |
Repo | https://github.com/franrruiz/shopper-src |
Framework | none |
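The interpretable components the abstract mentions can be sketched as a per-item utility combining a popularity intercept, a price effect, and an interaction term between the candidate item's embedding and the basket so far, fed through a softmax. This is a heavily simplified stand-in for SHOPPER's full model; all parameter names and shapes below are illustrative.

```python
# Simplified SHOPPER-style choice probabilities for one shopping step.
import numpy as np

def choice_probs(lam, price_sens, prices, alpha, rho, basket):
    # lam: (I,) intercepts; alpha, rho: (I, K) embeddings; basket: item ids
    interaction = rho @ alpha[basket].mean(axis=0) if basket else 0.0
    utility = lam - price_sens * np.log(prices) + interaction
    expu = np.exp(utility - utility.max())   # numerically stable softmax
    return expu / expu.sum()

I, K = 100, 16
p = choice_probs(np.zeros(I), 1.0, np.random.rand(I) + 0.5,
                 np.random.randn(I, K), np.random.randn(I, K), basket=[3, 17])
```

Raising `prices` for one item and recomputing `p` mimics the counterfactual price queries the paper is interested in.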
Predicting Deeper into the Future of Semantic Segmentation
Title | Predicting Deeper into the Future of Semantic Segmentation |
Authors | Pauline Luc, Natalia Neverova, Camille Couprie, Jakob Verbeek, Yann LeCun |
Abstract | The ability to predict and therefore to anticipate the future is an important attribute of intelligence. It is also of utmost importance in real-time systems, e.g. in robotics or autonomous driving, which depend on visual scene understanding for decision making. While prediction of the raw RGB pixel values in future video frames has been studied in previous work, here we introduce the novel task of predicting semantic segmentations of future frames. Given a sequence of video frames, our goal is to predict segmentation maps of not yet observed video frames that lie up to a second or further in the future. We develop an autoregressive convolutional neural network that learns to iteratively generate multiple frames. Our results on the Cityscapes dataset show that directly predicting future segmentations is substantially better than predicting and then segmenting future RGB frames. Prediction results up to half a second in the future are visually convincing and are much more accurate than those of a baseline based on warping semantic segmentations using optical flow. |
Tasks | Autonomous Driving, Decision Making, Optical Flow Estimation, Scene Understanding, Semantic Segmentation, Video Prediction |
Published | 2017-03-22 |
URL | http://arxiv.org/abs/1703.07684v3 |
http://arxiv.org/pdf/1703.07684v3.pdf | |
PWC | https://paperswithcode.com/paper/predicting-deeper-into-the-future-of-semantic |
Repo | https://github.com/m-serra/action-inference-for-video-prediction-benchmarking |
Framework | tf |
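The autoregressive rollout the abstract describes can be sketched independently of the network itself: a model that maps the last k segmentation maps to the next one is applied repeatedly, feeding its own predictions back in to reach further into the future. The window size and trivial stand-in model below are illustrative.

```python
# Autoregressive rollout of future segmentation maps.
import numpy as np

def rollout(model, past_segs, n_future):
    """past_segs: list of (H, W, C) soft segmentation maps."""
    window = list(past_segs)
    preds = []
    for _ in range(n_future):
        nxt = model(np.stack(window[-len(past_segs):]))  # predict one step
        preds.append(nxt)
        window.append(nxt)                               # feed prediction back
    return preds

identity_model = lambda seq: seq[-1]                     # trivial stand-in
future = rollout(identity_model, [np.zeros((64, 64, 19))] * 4, n_future=8)
```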
Self-Taught Convolutional Neural Networks for Short Text Clustering
Title | Self-Taught Convolutional Neural Networks for Short Text Clustering |
Authors | Jiaming Xu, Bo Xu, Peng Wang, Suncong Zheng, Guanhua Tian, Jun Zhao, Bo Xu |
Abstract | Short text clustering is a challenging problem due to the sparseness of text representations. Here we propose a flexible Self-Taught Convolutional neural network framework for Short Text Clustering (dubbed STC^2), which can flexibly and successfully incorporate more useful semantic features and learn non-biased deep text representations in an unsupervised manner. In our framework, the original raw text features are first embedded into compact binary codes using an existing unsupervised dimensionality reduction method. Then, word embeddings are explored and fed into convolutional neural networks to learn deep feature representations, while the output units are used to fit the pre-trained binary codes during training. Finally, we obtain the optimal clusters by employing K-means to cluster the learned representations. Extensive experimental results demonstrate that the proposed framework is effective and flexible and outperforms several popular clustering methods when tested on three public short text datasets. |
Tasks | Dimensionality Reduction, Text Clustering, Word Embeddings |
Published | 2017-01-01 |
URL | http://arxiv.org/abs/1701.00185v1 |
http://arxiv.org/pdf/1701.00185v1.pdf | |
PWC | https://paperswithcode.com/paper/self-taught-convolutional-neural-networks-for |
Repo | https://github.com/jacoxu/STC2 |
Framework | tf |
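A structural sketch of the STC^2 pipeline from the abstract, with the convolutional network replaced by a stand-in: (1) compress raw text features and binarize them into target codes, (2) learn a deep representation that fits those codes (elided here), (3) run K-means on the resulting features. The tiny corpus and dimensions are illustrative only.

```python
# STC^2 pipeline skeleton: codes from dimensionality reduction -> clustering.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

texts = ["short text one", "another short text", "totally different topic"]
X = TfidfVectorizer().fit_transform(texts)
codes = (TruncatedSVD(n_components=2).fit_transform(X) > 0).astype(float)
# The paper trains a CNN on word embeddings to regress `codes`; as a
# stand-in, we cluster the codes directly in place of learned features.
labels = KMeans(n_clusters=2, n_init=10).fit_predict(codes)
```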