October 21, 2019

Paper Group AWR 84

Fast and Accurate Tensor Completion with Total Variation Regularized Tensor Trains. Complex Unitary Recurrent Neural Networks using Scaled Cayley Transform. ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. Toward Abstractive Summarization Using Semantic Representations. Familia: A Configurable Topic Modeling Framework for …

Fast and Accurate Tensor Completion with Total Variation Regularized Tensor Trains


Title	Fast and Accurate Tensor Completion with Total Variation Regularized Tensor Trains
Authors	Ching-Yun Ko, Kim Batselier, Wenjian Yu, Ngai Wong
Abstract	We propose a new tensor completion method based on tensor trains. The to-be-completed tensor is modeled as a low-rank tensor train, where we use the known tensor entries and their coordinates to update the tensor train. A novel tensor train initialization procedure is proposed specifically for image and video completion, which is demonstrated to ensure fast convergence of the completion algorithm. The tensor train framework is also shown to easily accommodate Total Variation and Tikhonov regularization due to their low-rank tensor train representations. Image and video inpainting experiments verify the superiority of the proposed scheme in terms of both speed and scalability, where a speedup of up to 155X is observed compared to state-of-the-art tensor completion methods at a similar accuracy. Moreover, we demonstrate the proposed scheme is especially advantageous over existing algorithms when only tiny portions (say, 1%) of the to-be-completed images/videos are known.
Tasks	Video Inpainting
Published	2018-04-17
URL	http://arxiv.org/abs/1804.06128v3
PDF	http://arxiv.org/pdf/1804.06128v3.pdf
PWC	https://paperswithcode.com/paper/fast-and-accurate-tensor-completion-with
Repo	https://github.com/IRENEKO/TTC
Framework	none
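
To make the tensor-train model in the abstract concrete, here is a minimal sketch of how a single entry of a tensor is read off its tensor-train cores: one matrix slice is selected per mode and the slices are multiplied in order. The function name `tt_entry`, the nested-list core layout, and the example values are assumptions for illustration and are not taken from the TTC repository.

```python
def tt_entry(cores, index):
    """Return the tensor entry T[index] from tensor-train cores.

    cores[k][i] is the r_{k-1} x r_k matrix (a list of rows) selected by
    index i in mode k; the boundary ranks r_0 and r_d are both 1.
    """
    row = [1.0]  # running 1 x r_k row vector, starting with r_0 = 1
    for core, i in zip(cores, index):
        mat = core[i]
        # Multiply the running row vector by the selected slice.
        row = [sum(row[a] * mat[a][b] for a in range(len(row)))
               for b in range(len(mat[0]))]
    return row[0]


# Hypothetical rank-2 train for a 2 x 3 x 2 tensor (values are made up).
cores = [
    [[[1.0, 0.0]], [[0.0, 1.0]]],          # mode 1: two 1 x 2 slices
    [[[1.0, 2.0], [0.0, 1.0]],
     [[0.5, 0.0], [1.0, 1.0]],
     [[2.0, 1.0], [1.0, 0.0]]],            # mode 2: three 2 x 2 slices
    [[[1.0], [1.0]], [[1.0], [-1.0]]],     # mode 3: two 2 x 1 slices
]

print(tt_entry(cores, (0, 0, 0)))
print(tt_entry(cores, (1, 2, 1)))
```

Completion methods of this kind fit the cores so that `tt_entry` matches the known entries at their coordinates; the fitting step itself is not shown here.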

Complex Unitary Recurrent Neural Networks using Scaled Cayley Transform


Title	Complex Unitary Recurrent Neural Networks using Scaled Cayley Transform
Authors	Kehelwala D. G. Maduranga, Kyle E. Helfrich, Qiang Ye
Abstract	Recurrent neural networks (RNNs) have been successfully used on a wide range of sequential data problems. A well-known difficulty in using RNNs is the vanishing or exploding gradient problem. Recently, there have been several different RNN architectures that try to mitigate this issue by maintaining an orthogonal or unitary recurrent weight matrix. One such architecture is the scaled Cayley orthogonal recurrent neural network (scoRNN), which parameterizes the orthogonal recurrent weight matrix through a scaled Cayley transform. This parametrization contains a diagonal scaling matrix consisting of positive or negative one entries that cannot be optimized by gradient descent. Thus the scaling matrix is fixed before training and a hyperparameter is introduced to tune the matrix for each particular task. In this paper, we develop a unitary RNN architecture based on a complex scaled Cayley transform. Unlike the real orthogonal case, the transformation uses a diagonal scaling matrix consisting of entries on the complex unit circle which can be optimized using gradient descent and no longer requires the tuning of a hyperparameter. We also provide an analysis of a potential issue of the modReLU activation function which is used in our work and several other unitary RNNs. In the experiments conducted, the scaled Cayley unitary recurrent neural network (scuRNN) achieves comparable or better results than scoRNN and other unitary RNNs without fixing the scaling matrix.
Tasks
Published	2018-11-09
URL	http://arxiv.org/abs/1811.04142v2
PDF	http://arxiv.org/pdf/1811.04142v2.pdf
PWC	https://paperswithcode.com/paper/complex-unitary-recurrent-neural-networks
Repo	https://github.com/Gayan225/scuRNN
Framework	tf
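
As a small illustration of the complex scaled Cayley transform mentioned in the abstract, the sketch below builds W = (I + A)^{-1} (I - A) D for a 2 x 2 skew-Hermitian A and a diagonal D with unit-circle entries, then checks numerically that W is unitary. The 2 x 2 size, the helper names, and the parameter values are assumptions chosen for readability; this is not the code from the scuRNN repository.

```python
import cmath


def matmul2(x, y):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def scaled_cayley(a, b, c, thetas):
    """W = (I + A)^{-1} (I - A) D with A = [[ia, b], [-conj(b), ic]]."""
    A = [[1j * a, b], [-b.conjugate(), 1j * c]]   # skew-Hermitian: A^H = -A
    i_plus = [[1 + A[0][0], A[0][1]], [A[1][0], 1 + A[1][1]]]
    i_minus = [[1 - A[0][0], -A[0][1]], [-A[1][0], 1 - A[1][1]]]
    # Diagonal scaling matrix with entries e^{i theta} on the unit circle.
    D = [[cmath.exp(1j * thetas[0]), 0], [0, cmath.exp(1j * thetas[1])]]
    return matmul2(matmul2(inverse2(i_plus), i_minus), D)


def unitarity_error(w):
    """Largest absolute entry of W^H W - I."""
    wh = [[w[j][i].conjugate() for j in range(2)] for i in range(2)]
    p = matmul2(wh, w)
    return max(abs(p[i][j] - (1 if i == j else 0))
               for i in range(2) for j in range(2))


W = scaled_cayley(0.3, 0.5 + 0.2j, -0.7, (0.4, 2.1))
print(unitarity_error(W))  # close to zero up to rounding
```

Because the angles in `thetas` are continuous, they can in principle be updated by gradient descent, which is the contrast with the fixed plus/minus one entries that the abstract draws.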

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design


Title	ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
Authors	Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, Jian Sun
Abstract	Datasets, Transforms and Models specific to Computer Vision
Tasks	Image Classification
Published	2018-07-30
URL	http://arxiv.org/abs/1807.11164v1
PDF	http://arxiv.org/pdf/1807.11164v1.pdf
PWC	https://paperswithcode.com/paper/shufflenet-v2-practical-guidelines-for
Repo	https://github.com/Qengineering/ShuffleNetV2-ncnn
Framework	none

Toward Abstractive Summarization Using Semantic Representations


Title	Toward Abstractive Summarization Using Semantic Representations
Authors	Fei Liu, Jeffrey Flanigan, Sam Thomson, Norman Sadeh, Noah A. Smith
Abstract	We present a novel abstractive summarization framework that draws on the recent development of a treebank for the Abstract Meaning Representation (AMR). In this framework, the source text is parsed to a set of AMR graphs, the graphs are transformed into a summary graph, and then text is generated from the summary graph. We focus on the graph-to-graph transformation that reduces the source semantic graph into a summary graph, making use of an existing AMR parser and assuming the eventual availability of an AMR-to-text generator. The framework is data-driven, trainable, and not specifically designed for a particular domain. Experiments on gold-standard AMR annotations and system parses show promising results. Code is available at: https://github.com/summarization
Tasks	Abstractive Text Summarization
Published	2018-05-25
URL	http://arxiv.org/abs/1805.10399v1
PDF	http://arxiv.org/pdf/1805.10399v1.pdf
PWC	https://paperswithcode.com/paper/toward-abstractive-summarization-using
Repo	https://github.com/summarization/semantic_summ
Framework	none

Familia: A Configurable Topic Modeling Framework for Industrial Text Engineering


Title	Familia: A Configurable Topic Modeling Framework for Industrial Text Engineering
Authors	Di Jiang, Yuanfeng Song, Rongzhong Lian, Siqi Bao, Jinhua Peng, Huang He, Hua Wu
Abstract	In the last decade, a variety of topic models have been proposed for text engineering. However, except for Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA), most existing topic models are seldom applied or considered in industrial scenarios. This phenomenon is caused by the fact that there are very few convenient tools to support these topic models so far. Intimidated by the demanding expertise and labor of designing and implementing parameter inference algorithms, software engineers are prone to simply resort to PLSA/LDA, without considering whether it is proper for their problem at hand or not. In this paper, we propose a configurable topic modeling framework named Familia, in order to bridge the huge gap between academic research fruits and current industrial practice. Familia supports an important line of topic models that are widely applicable in text engineering scenarios. In order to relieve the burden on software engineers without knowledge of Bayesian networks, Familia is able to conduct automatic parameter inference for a variety of topic models. Simply through changing the data organization of Familia, software engineers are able to easily explore a broad spectrum of existing topic models or even design their own topic models, and find the one that best suits the problem at hand. With its superior extendability, Familia has a novel sampling mechanism that strikes a balance between effectiveness and efficiency of parameter inference. Furthermore, Familia is essentially a big topic modeling framework that supports parallel parameter inference and distributed parameter storage. The utilities and necessity of Familia are demonstrated in real-life industrial applications. Familia would significantly enlarge software engineers’ arsenal of topic models and pave the way for utilizing highly customized topic models in real-life problems.
Tasks	Topic Models
Published	2018-08-11
URL	http://arxiv.org/abs/1808.03733v2
PDF	http://arxiv.org/pdf/1808.03733v2.pdf
PWC	https://paperswithcode.com/paper/familia-a-configurable-topic-modeling
Repo	https://github.com/baidu/Familia
Framework	none

Gotta Learn Fast: A New Benchmark for Generalization in RL


Title	Gotta Learn Fast: A New Benchmark for Generalization in RL
Authors	Alex Nichol, Vicki Pfau, Christopher Hesse, Oleg Klimov, John Schulman
Abstract	In this report, we present a new reinforcement learning (RL) benchmark based on the Sonic the Hedgehog (TM) video game franchise. This benchmark is intended to measure the performance of transfer learning and few-shot learning algorithms in the RL domain. We also present and evaluate some baseline algorithms on the new benchmark.
Tasks	Few-Shot Learning, Transfer Learning
Published	2018-04-10
URL	http://arxiv.org/abs/1804.03720v2
PDF	http://arxiv.org/pdf/1804.03720v2.pdf
PWC	https://paperswithcode.com/paper/gotta-learn-fast-a-new-benchmark-for
Repo	https://github.com/goolulusaurs/WorldModels
Framework	pytorch

A biconvex analysis for Lasso l1 reweighting


Title	A biconvex analysis for Lasso l1 reweighting
Authors	Sophie M. Fosson
Abstract	l1 reweighting algorithms are very popular in sparse signal recovery and compressed sensing, since in practice they have been observed to outperform classical l1 methods. Nevertheless, the theoretical analysis of their convergence is a critical point, and generally is limited to the convergence of the functional to a local minimum or to subsequence convergence. In this letter, we propose a new convergence analysis of a Lasso l1 reweighting method, based on the observation that the algorithm is an alternated convex search for a biconvex problem. Based on that, we are able to prove the numerical convergence of the sequence of the iterates generated by the algorithm. This is not yet the convergence of the sequence, but it is close enough for practical and numerical purposes. Furthermore, we propose an alternative iterative soft thresholding procedure, which is faster than the main algorithm.
Tasks
Published	2018-12-07
URL	http://arxiv.org/abs/1812.02990v1
PDF	http://arxiv.org/pdf/1812.02990v1.pdf
PWC	https://paperswithcode.com/paper/a-biconvex-analysis-for-lasso-l1-reweighting
Repo	https://github.com/sophie27/Lasso-l1-reweigthing
Framework	none
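
The alternation described in the abstract (solve a weighted Lasso in x, then update the weights) can be sketched as follows. This is a generic illustration, not the algorithm analysed in the paper: the inner solver is plain iterative soft thresholding, and the weight update w_i = 1 / (|x_i| + eps) is a common reweighting rule assumed here. The function names and default parameters are likewise hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def reweighted_ista(A, y, lam, step, eps=0.1, outer=10, inner=50):
    """Alternate weighted-Lasso updates in x with weight updates in w.

    Minimises 0.5 * ||A x - y||^2 + lam * sum_i w_i * |x_i| over x for
    fixed w (by iterative soft thresholding), then recomputes w from x.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    w = [1.0] * n
    for _ in range(outer):
        for _ in range(inner):
            resid = [sum(A[i][j] * x[j] for j in range(n)) - y[i]
                     for i in range(m)]
            grad = [sum(A[i][j] * resid[i] for i in range(m))
                    for j in range(n)]
            x = [soft_threshold(x[j] - step * grad[j], step * lam * w[j])
                 for j in range(n)]
        # Small entries get large weights and are pushed harder toward zero.
        w = [1.0 / (abs(xj) + eps) for xj in x]
    return x


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(reweighted_ista(identity, [2.0, 0.05, -1.0], lam=0.1, step=1.0))
```

The step size should not exceed the reciprocal of the largest eigenvalue of A^T A for the inner iteration to be stable; with the identity matrix above, a step of 1.0 satisfies this.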

Panoptic Segmentation


Title	Panoptic Segmentation
Authors	Alexander Kirillov, Kaiming He, Ross Girshick, Carsten Rother, Piotr Dollár
Abstract	We propose and study a task we name panoptic segmentation (PS). Panoptic segmentation unifies the typically distinct tasks of semantic segmentation (assign a class label to each pixel) and instance segmentation (detect and segment each object instance). The proposed task requires generating a coherent scene segmentation that is rich and complete, an important step toward real-world vision systems. While early work in computer vision addressed related image/scene parsing tasks, these are not currently popular, possibly due to lack of appropriate metrics or associated recognition challenges. To address this, we propose a novel panoptic quality (PQ) metric that captures performance for all classes (stuff and things) in an interpretable and unified manner. Using the proposed metric, we perform a rigorous study of both human and machine performance for PS on three existing datasets, revealing interesting insights about the task. The aim of our work is to revive the interest of the community in a more unified view of image segmentation.
Tasks	Instance Segmentation, Panoptic Segmentation, Scene Parsing, Scene Segmentation, Semantic Segmentation
Published	2018-01-03
URL	http://arxiv.org/abs/1801.00868v3
PDF	http://arxiv.org/pdf/1801.00868v3.pdf
PWC	https://paperswithcode.com/paper/panoptic-segmentation
Repo	https://github.com/kdethoor/panoptictorch
Framework	pytorch
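
For readers who want to see the panoptic quality (PQ) metric in action, the sketch below computes it for a single class as the sum of IoUs over matched segment pairs divided by |TP| + 0.5 |FP| + 0.5 |FN|, where a predicted and a ground-truth segment match when their IoU exceeds 0.5. Representing segments as sets of pixel ids and the function names are assumptions for illustration; real evaluations average PQ over classes and handle void regions, which is omitted here.

```python
def iou(a, b):
    """Intersection over union of two non-empty pixel sets."""
    return len(a & b) / len(a | b)


def panoptic_quality(pred_segments, gt_segments):
    """PQ for one class; segments are non-overlapping sets of pixel ids."""
    tp_iou = []
    matched_pred = set()
    for g in gt_segments:
        for pi, p in enumerate(pred_segments):
            # With non-overlapping segments, IoU > 0.5 makes the match unique.
            if pi not in matched_pred and iou(p, g) > 0.5:
                tp_iou.append(iou(p, g))
                matched_pred.add(pi)
                break
    tp = len(tp_iou)
    fp = len(pred_segments) - tp   # predictions with no matching ground truth
    fn = len(gt_segments) - tp     # ground-truth segments left unmatched
    denom = tp + 0.5 * fp + 0.5 * fn
    return sum(tp_iou) / denom if denom else 0.0


gt = [{0, 1, 2, 3}, {4, 5}]
pred = [{0, 1, 2}, {6, 7}]
print(panoptic_quality(pred, gt))  # 0.75 / (1 + 0.5 + 0.5) = 0.375
```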

Classification of crystallization outcomes using deep convolutional neural networks


Title	Classification of crystallization outcomes using deep convolutional neural networks
Authors	Andrew E. Bruno, Patrick Charbonneau, Janet Newman, Edward H. Snell, David R. So, Vincent Vanhoucke, Christopher J. Watkins, Shawn Williams, Julie Wilson
Abstract	The Machine Recognition of Crystallization Outcomes (MARCO) initiative has assembled roughly half a million annotated images of macromolecular crystallization experiments from various sources and setups. Here, state-of-the-art machine learning algorithms are trained and tested on different parts of this data set. We find that more than 94% of the test images can be correctly labeled, irrespective of their experimental origin. Because crystal recognition is key to high-density screening and the systematic analysis of crystallization experiments, this approach opens the door to both industrial and fundamental research applications.
Tasks
Published	2018-03-27
URL	http://arxiv.org/abs/1803.10342v2
PDF	http://arxiv.org/pdf/1803.10342v2.pdf
PWC	https://paperswithcode.com/paper/classification-of-crystallization-outcomes
Repo	https://github.com/kavanp/MARCO
Framework	pytorch

Link and code: Fast indexing with graphs and compact regression codes


Title	Link and code: Fast indexing with graphs and compact regression codes
Authors	Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou
Abstract	Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, setting aside the memory requirements. In this paper, we revisit these approaches by considering, additionally, the memory constraint required to index billions of images on a single server. This leads us to propose a method based both on graph traversal and compact representations. We encode the indexed vectors using quantization and exploit the graph structure to refine the similarity estimation. In essence, our method takes the best of these two worlds: the search strategy is based on nested graphs, thereby providing high precision with a relatively small set of comparisons. At the same time it offers a significant memory compression. As a result, our approach outperforms the state of the art on operating points considering 64-128 bytes per vector, as demonstrated by our results on two billion-scale public benchmarks.
Tasks	Image Similarity Search, Quantization
Published	2018-04-26
URL	http://arxiv.org/abs/1804.09996v2
PDF	http://arxiv.org/pdf/1804.09996v2.pdf
PWC	https://paperswithcode.com/paper/link-and-code-fast-indexing-with-graphs-and
Repo	https://github.com/x-x-p/faiss_facebook
Framework	none

Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks


Title	Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks
Authors	Jason Kuen, Xiangfei Kong, Zhe Lin, Gang Wang, Jianxiong Yin, Simon See, Yap-Peng Tan
Abstract	It is desirable to train convolutional networks (CNNs) to run more efficiently during inference. In many cases however, the computational budget that the system has for inference cannot be known beforehand during training, or the inference budget is dependent on the changing real-time resource availability. Thus, it is inadequate to train just inference-efficient CNNs, whose inference costs are not adjustable and cannot adapt to varied inference budgets. We propose a novel approach for cost-adjustable inference in CNNs - Stochastic Downsampling Point (SDPoint). During training, SDPoint applies feature map downsampling to a random point in the layer hierarchy, with a random downsampling ratio. The different stochastic downsampling configurations known as SDPoint instances (of the same model) have computational costs different from each other, while being trained to minimize the same prediction loss. Sharing network parameters across different instances provides significant regularization boost. During inference, one may handpick a SDPoint instance that best fits the inference budget. The effectiveness of SDPoint, as both a cost-adjustable inference approach and a regularizer, is validated through extensive experiments on image classification.
Tasks	Image Classification, Object Recognition
Published	2018-01-29
URL	http://arxiv.org/abs/1801.09335v1
PDF	http://arxiv.org/pdf/1801.09335v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-downsampling-for-cost-adjustable
Repo	https://github.com/xternalz/SDPoint
Framework	pytorch
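
The training-time sampling described in the abstract can be sketched as below: a random point in the layer hierarchy and a random downsampling ratio are drawn, and the layers from that point on see smaller feature maps. The cost model (each layer's cost proportional to the spatial area of its input), the inclusion of a "no downsampling" option, and all names are assumptions for illustration rather than details of the SDPoint repository.

```python
import random


def sample_sdpoint(num_layers, ratios, rng):
    """Pick a random downsampling point and ratio.

    Returns None for the "no downsampling" instance, otherwise a tuple
    (point, ratio) meaning the input to layer `point` is downsampled so
    that each spatial side is scaled by `ratio`.
    """
    point = rng.randrange(num_layers + 1)  # the extra index means "none"
    if point == num_layers:
        return None
    return point, rng.choice(ratios)


def relative_cost(layer_costs, config):
    """Cost of an instance relative to the full model.

    Assumes each layer's cost scales with the spatial area of its input,
    so layers at or after the point cost ratio**2 of their original cost.
    """
    if config is None:
        return 1.0
    point, ratio = config
    reduced = sum(c * (ratio ** 2 if i >= point else 1.0)
                  for i, c in enumerate(layer_costs))
    return reduced / sum(layer_costs)


rng = random.Random(0)
config = sample_sdpoint(4, [0.5, 0.75], rng)
print(config, relative_cost([1.0, 1.0, 1.0, 1.0], config))
```

At inference, one would evaluate `relative_cost` for each candidate instance and pick the one that fits the available budget.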

Harmonic Networks: Integrating Spectral Information into CNNs


Title	Harmonic Networks: Integrating Spectral Information into CNNs
Authors	Matej Ulicny, Vladimir A. Krylov, Rozenn Dahyot
Abstract	Convolutional neural networks (CNNs) learn filters in order to capture local correlation patterns in feature space. In contrast, in this paper we propose harmonic blocks that produce features by learning optimal combinations of spectral filters defined by the Discrete Cosine Transform. The harmonic blocks are used to replace conventional convolutional layers to construct partial or fully harmonic CNNs. We extensively validate our approach and show that the introduction of harmonic blocks into state-of-the-art CNN baseline architectures results in comparable or better performance in classification tasks on small NORB, CIFAR10 and CIFAR100 datasets.
Tasks
Published	2018-12-07
URL	http://arxiv.org/abs/1812.03205v1
PDF	http://arxiv.org/pdf/1812.03205v1.pdf
PWC	https://paperswithcode.com/paper/harmonic-networks-integrating-spectral
Repo	https://github.com/matej-ulicny/harmonic-networks
Framework	pytorch
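
To illustrate the idea of producing features from learned combinations of fixed spectral filters, the sketch below generates the k x k orthonormal DCT-II basis filters and computes a weighted sum of one patch's DCT coefficients. The orthonormal scaling, the single-patch setting, and the names `dct_filters` and `harmonic_response` are assumptions for illustration; the repository's harmonic block may be organised differently.

```python
import math


def dct_filters(k=3):
    """Return the k*k orthonormal 2D DCT-II basis filters, keyed by (u, v)."""
    def c(u):
        return math.sqrt((1.0 if u == 0 else 2.0) / k)

    filters = {}
    for u in range(k):
        for v in range(k):
            filters[(u, v)] = [
                [c(u) * c(v)
                 * math.cos(math.pi * (2 * x + 1) * u / (2 * k))
                 * math.cos(math.pi * (2 * y + 1) * v / (2 * k))
                 for y in range(k)]
                for x in range(k)
            ]
    return filters


def harmonic_response(patch, filters, weights):
    """Weighted sum of the DCT coefficients of one k x k patch.

    `weights` plays the role of the learned combination; missing keys
    are treated as zero.
    """
    total = 0.0
    for key, f in filters.items():
        k = len(f)
        coeff = sum(f[x][y] * patch[x][y] for x in range(k) for y in range(k))
        total += weights.get(key, 0.0) * coeff
    return total


filters = dct_filters(3)
ones = [[1.0] * 3 for _ in range(3)]
# A constant patch has energy only in the (0, 0) (DC) coefficient.
print(harmonic_response(ones, filters, {(0, 0): 1.0}))
print(harmonic_response(ones, filters, {(1, 2): 1.0}))
```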

FMHash: Deep Hashing of In-Air-Handwriting for User Identification


Title	FMHash: Deep Hashing of In-Air-Handwriting for User Identification
Authors	Duo Lu, Dijiang Huang, Anshul Rai
Abstract	Many mobile systems and wearable devices, such as Virtual Reality (VR) or Augmented Reality (AR) headsets, lack a keyboard or touchscreen to type an ID and password for signing into a virtual website. However, they are usually equipped with gesture capture interfaces to allow the user to interact with the system directly with hand gestures. Although gesture-based authentication has been well-studied, less attention is paid to the gesture-based user identification problem, which is essentially an input method of account ID and an efficient searching and indexing method of a database of gesture signals. In this paper, we propose FMHash (i.e., Finger Motion Hash), a user identification framework that can generate a compact binary hash code from a piece of in-air-handwriting of an ID string. This hash code enables indexing and fast search of a large account database using the in-air-handwriting by a hash table. To demonstrate the effectiveness of the framework, we implemented a prototype and achieved >99.5% precision and >92.6% recall with exact hash code match on a dataset of 200 accounts collected by us. The ability of hashing in-air-handwriting pattern to binary code can be used to achieve convenient sign-in and sign-up with in-air-handwriting gesture ID on future mobile and wearable systems connected to the Internet.
Tasks
Published	2018-06-10
URL	https://arxiv.org/abs/1806.03574v2
PDF	https://arxiv.org/pdf/1806.03574v2.pdf
PWC	https://paperswithcode.com/paper/fmhash-deep-hashing-of-in-air-handwriting-for
Repo	https://github.com/duolu/fmkit
Framework	none
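
The lookup side of the scheme in the abstract (a compact binary code used as a hash-table key) can be sketched as follows. The network that maps an in-air-handwriting signal to a feature vector is not shown; binarising by sign, the `AccountIndex` class, and the example vectors are assumptions for illustration and are not taken from the fmkit repository.

```python
def to_hash_code(features):
    """Binarise a real-valued feature vector by sign into a bit string."""
    return "".join("1" if f > 0 else "0" for f in features)


class AccountIndex:
    """Hash table from binary codes to the account ids enrolled under them."""

    def __init__(self):
        self._table = {}

    def enroll(self, account_id, features):
        self._table.setdefault(to_hash_code(features), []).append(account_id)

    def identify(self, features):
        """Return accounts whose code matches exactly (possibly none)."""
        return self._table.get(to_hash_code(features), [])


index = AccountIndex()
index.enroll("alice", [0.9, -0.8, 0.7, -0.2])
# A second writing of the same ID should land on the same side of zero
# in every dimension, so it hashes to the same code.
print(index.identify([0.5, -0.3, 0.1, -0.9]))
print(index.identify([-0.5, -0.3, 0.1, -0.9]))
```

Exact-match lookup is constant time in the number of accounts, which is why the reported precision and recall are stated for exact hash code match.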

Multivariate Arrival Times with Recurrent Neural Networks for Personalized Demand Forecasting


Title	Multivariate Arrival Times with Recurrent Neural Networks for Personalized Demand Forecasting
Authors	Tianle Chen, Brian Keng, Javier Moreno
Abstract	Access to a large variety of data across a massive population has made it possible to predict customer purchase patterns and responses to marketing campaigns. In particular, accurate demand forecasts for popular products with frequent repeat purchases are essential since these products are one of the main drivers of profits. However, buyer purchase patterns are extremely diverse and sparse on a per-product level due to population heterogeneity as well as dependence in purchase patterns across product categories. Traditional methods in survival analysis have proven effective in dealing with censored data by assuming parametric distributions on inter-arrival times. Distributional parameters are then fitted, typically in a regression framework. On the other hand, neural-network based models take a non-parametric approach to learn relations from a larger functional class. However, the lack of distributional assumptions makes it difficult to model partially observed data. In this paper, we model directly the inter-arrival times as well as the partially observed information at each time step in a survival-based approach using Recurrent Neural Networks (RNN) to model purchase times jointly over several products. Instead of predicting a point estimate for inter-arrival times, the RNN outputs parameters that define a distributional estimate. The loss function is the negative log-likelihood of these parameters given partially observed data. This approach allows one to leverage both fully observed data as well as partial information. By externalizing the censoring problem through a log-likelihood loss function, we show that substantial improvements over state-of-the-art machine learning methods can be achieved. We present experimental results based on two open datasets as well as a study on a real dataset from a large retailer.
Tasks	Survival Analysis
Published	2018-12-29
URL	http://arxiv.org/abs/1812.11444v1
PDF	http://arxiv.org/pdf/1812.11444v1.pdf
PWC	https://paperswithcode.com/paper/multivariate-arrival-times-with-recurrent
Repo	https://github.com/rubikloud/matrnn
Framework	tf
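
The loss described in the abstract, a negative log-likelihood that uses both fully observed and censored inter-arrival times, can be illustrated with the simplest parametric choice. The sketch assumes an exponential distribution with one rate per sample (standing in for the parameters the RNN would output); the paper's actual distribution is not stated in the abstract, so this choice and the function name are assumptions.

```python
import math


def censored_nll(rates, times, observed):
    """Negative log-likelihood under an exponential model with censoring.

    For an observed arrival the density rate * exp(-rate * t) is used;
    for a censored one only the survival probability exp(-rate * t),
    i.e. the event is known not to have happened before time t.
    """
    total = 0.0
    for rate, t, obs in zip(rates, times, observed):
        total += rate * t              # -log S(t), shared by both cases
        if obs:
            total -= math.log(rate)    # extra -log(hazard) for observed events
    return total


# One observed purchase after 2 time units, one still pending after 2.
print(censored_nll([0.5, 1.0], [2.0, 2.0], [True, False]))
```

Censored samples still contribute a gradient through the survival term, which is how partial information enters training.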

Learn to Combine Modalities in Multimodal Deep Learning


Title	Learn to Combine Modalities in Multimodal Deep Learning
Authors	Kuan Liu, Yanen Li, Ning Xu, Prem Natarajan
Abstract	Combining complementary information from multiple modalities is intuitively appealing for improving the performance of learning-based approaches. However, it is challenging to fully leverage different modalities due to practical challenges such as varying levels of noise and conflicts between modalities. Existing methods do not adopt a joint approach to capturing synergies between the modalities while simultaneously filtering noise and resolving conflicts on a per-sample basis. In this work we propose a novel deep neural network based technique that multiplicatively combines information from different source modalities. Thus the model training process automatically focuses on information from more reliable modalities while reducing emphasis on the less reliable modalities. Furthermore, we propose an extension that multiplicatively combines not only the single-source modalities, but also a set of mixed source modalities to better capture cross-modal signal correlations. We demonstrate the effectiveness of our proposed technique by presenting empirical results on three multimodal classification tasks from different domains. The results show consistent accuracy improvements on all three tasks.
Tasks
Published	2018-05-29
URL	http://arxiv.org/abs/1805.11730v1
PDF	http://arxiv.org/pdf/1805.11730v1.pdf
PWC	https://paperswithcode.com/paper/learn-to-combine-modalities-in-multimodal
Repo	https://github.com/skywaLKer518/MultiplicativeMultimodal
Framework	tf
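
One way to picture a multiplicative combination of modalities is a loss in which each modality's cross-entropy term is down-weighted when the other modalities are already confident about the correct class. The sketch below uses the weight [prod over j != i of (1 - p_j)]^(beta / (M - 1)); this particular form and the parameter `beta` are an assumption for illustration and should be checked against the paper before being relied on.

```python
import math


def multiplicative_loss(probs, beta=1.0):
    """Per-sample loss over M >= 2 modalities.

    probs[i] is the probability modality i assigns to the correct class.
    Each modality's -log p_i is scaled by how unsure the others are.
    """
    m = len(probs)
    loss = 0.0
    for i, p_i in enumerate(probs):
        others = 1.0
        for j, p_j in enumerate(probs):
            if j != i:
                others *= 1.0 - p_j
        weight = others ** (beta / (m - 1))
        loss += -weight * math.log(p_i)
    return loss


# The weak modality (0.5) is barely penalised because the other is confident.
print(multiplicative_loss([0.9, 0.5], beta=1.0))
# With beta = 0 every weight is 1 and the loss is the plain sum of
# cross-entropies.
print(multiplicative_loss([0.9, 0.5], beta=0.0))
```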