Paper Group AWR 42
megaman: Manifold Learning with Millions of points. Food Image Recognition by Using Convolutional Neural Networks (CNNs). Neural Semantic Encoders. The Neural Hawkes Process: A Neurally Self-Modulating Multivariate Point Process. The minimal hitting set generation problem: algorithms and computation. A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection. Fundamental principles of cortical computation: unsupervised learning with prediction, compression and feedback. De-Conflated Semantic Representations. Learning long-term dependencies for action recognition with a biologically-inspired deep network. Chinese Poetry Generation with Planning based Neural Network. Local Binary Convolutional Neural Networks. Incorporating Discrete Translation Lexicons into Neural Machine Translation. Learning to Generate Compositional Color Descriptions. Deeper Depth Prediction with Fully Convolutional Residual Networks. Learning Convolutional Neural Networks for Graphs.
megaman: Manifold Learning with Millions of points
Title | megaman: Manifold Learning with Millions of points |
Authors | James McQueen, Marina Meila, Jacob VanderPlas, Zhongyue Zhang |
Abstract | Manifold Learning is a class of algorithms seeking a low-dimensional non-linear representation of high-dimensional data. Thus manifold learning algorithms are, at least in theory, most applicable to high-dimensional data and large sample sizes, which enable accurate estimation of the manifold. Despite this, most existing manifold learning implementations are not particularly scalable. Here we present a Python package that implements a variety of manifold learning algorithms in a modular and scalable fashion, using fast approximate neighbor searches and fast sparse eigendecompositions. The package incorporates theoretical advances in manifold learning, such as the unbiased Laplacian estimator and the estimation of the embedding distortion by the Riemannian metric method. In benchmarks, even on a single-core desktop computer, our code embeds millions of data points in minutes, and takes just 200 minutes to embed the main sample of galaxy spectra from the Sloan Digital Sky Survey (0.6 million samples in 3750 dimensions), a task which has not previously been possible. |
Tasks | |
Published | 2016-03-09 |
URL | http://arxiv.org/abs/1603.02763v1 |
http://arxiv.org/pdf/1603.02763v1.pdf | |
PWC | https://paperswithcode.com/paper/megaman-manifold-learning-with-millions-of |
Repo | https://github.com/mmp2/megaman |
Framework | none |
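As a rough illustration of the recipe the abstract describes (a sparse neighbor graph feeding a sparse eigensolver), here is a minimal Laplacian-eigenmaps sketch in plain scipy/scikit-learn. It is not megaman's API; the function name and parameters are illustrative.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigsh
from sklearn.neighbors import kneighbors_graph

def spectral_embed(X, n_components=2, n_neighbors=15):
    # Sparse k-NN graph; for millions of points megaman swaps this for an
    # approximate-neighbor index, but the overall structure is the same.
    W = kneighbors_graph(X, n_neighbors, mode='distance', include_self=False)
    W.data = np.exp(-W.data ** 2)        # Gaussian weights on the sparse edges
    W = 0.5 * (W + W.T)                  # symmetrize
    L = diags(np.asarray(W.sum(axis=1)).ravel()) - W   # unnormalized Laplacian
    # Sparse eigendecomposition: smallest eigenvectors give the embedding
    # ('SM' is slow but simple; a shift-invert solver scales better).
    vals, vecs = eigsh(L, k=n_components + 1, which='SM')
    return vecs[:, 1:]                   # drop the constant eigenvector

embedding = spectral_embed(np.random.rand(2000, 50))
```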
Food Image Recognition by Using Convolutional Neural Networks (CNNs)
Title | Food Image Recognition by Using Convolutional Neural Networks (CNNs) |
Authors | Yuzhen Lu |
Abstract | Food image recognition is one of the promising applications of visual object recognition in computer vision. In this study, a small-scale dataset consisting of 5822 images across ten categories was assembled, and a five-layer CNN was constructed to recognize these images. The bag-of-features (BoF) model coupled with a support vector machine (SVM) was first evaluated for image classification, yielding an overall accuracy of 56%, while the CNN model performed much better with an overall accuracy of 74%. Data augmentation based on geometric transformations was applied to increase the number of training images, which yielded a significantly improved accuracy of more than 90% while preventing the overfitting that occurred when the CNN was trained on the raw data. Further improvements can be expected by collecting more images and optimizing the network architecture and hyper-parameters. |
Tasks | Data Augmentation, Image Classification, Object Recognition |
Published | 2016-12-03 |
URL | http://arxiv.org/abs/1612.00983v2 |
http://arxiv.org/pdf/1612.00983v2.pdf | |
PWC | https://paperswithcode.com/paper/food-image-recognition-by-using-convolutional |
Repo | https://github.com/jingweimo/food-image-classification- |
Framework | none |
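The accuracy jump the abstract attributes to geometric augmentation is easy to reproduce in outline. A hedged tf.keras sketch follows; the augmentation ranges and layer sizes are assumptions, not the paper's exact configuration.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Geometric augmentation of the kind the paper credits for the >90% accuracy
# (ranges here are illustrative guesses).
aug = ImageDataGenerator(rotation_range=30, width_shift_range=0.1,
                         height_shift_range=0.1, shear_range=0.2,
                         zoom_range=0.2, horizontal_flip=True)

# A small five-layer-style CNN for ten food categories.
model = models.Sequential([
    layers.Conv2D(32, 3, activation='relu', input_shape=(128, 128, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(aug.flow(x_train, y_train, batch_size=32), epochs=50)
```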
Neural Semantic Encoders
Title | Neural Semantic Encoders |
Authors | Tsendsuren Munkhdalai, Hong Yu |
Abstract | We present a memory-augmented neural network for natural language understanding: Neural Semantic Encoders (NSE). NSE is equipped with a novel memory update rule and has a variable-sized encoding memory that evolves over time and maintains the understanding of input sequences through read, compose and write operations. NSE can also access multiple and shared memories. In this paper, we demonstrate the effectiveness and flexibility of NSE on five different natural language tasks: natural language inference, question answering, sentence classification, document sentiment analysis and machine translation, where NSE achieved state-of-the-art performance when evaluated on publicly available benchmarks. For example, our shared-memory model showed an encouraging result on neural machine translation, improving an attention-based baseline by approximately 1.0 BLEU. |
Tasks | Machine Translation, Natural Language Inference, Question Answering, Sentence Classification, Sentiment Analysis |
Published | 2016-07-14 |
URL | http://arxiv.org/abs/1607.04315v3 |
http://arxiv.org/pdf/1607.04315v3.pdf | |
PWC | https://paperswithcode.com/paper/neural-semantic-encoders |
Repo | https://github.com/Smerity/keras_snli |
Framework | none |
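A simplified sketch of the read/compose/write cycle on a variable-sized memory, assuming dot-product attention for the read and an erase-and-add write; the paper's exact update rule and multi-memory access are richer than this.

```python
import torch
import torch.nn.functional as F

def nse_step(o, M, compose):
    """One NSE-style step (simplified): read by attending over memory slots,
    compose the input with the retrieved slot, write the result back."""
    z = F.softmax(M @ o, dim=0)                    # read: attention over slots
    m = z @ M                                      # retrieved memory vector
    h = torch.tanh(compose(torch.cat([o, m])))     # compose
    M = (1 - z).unsqueeze(1) * M + z.unsqueeze(1) * h   # write: erase-and-add
    return h, M

d, slots = 64, 10
compose = torch.nn.Linear(2 * d, d)
h, M = nse_step(torch.randn(d), torch.randn(slots, d), compose)
```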
The Neural Hawkes Process: A Neurally Self-Modulating Multivariate Point Process
Title | The Neural Hawkes Process: A Neurally Self-Modulating Multivariate Point Process |
Authors | Hongyuan Mei, Jason Eisner |
Abstract | Many events occur in the world. Some event types are stochastically excited or inhibited—in the sense of having their probabilities elevated or decreased—by patterns in the sequence of previous events. Discovering such patterns can help us predict which type of event will happen next and when. We model streams of discrete events in continuous time, by constructing a neurally self-modulating multivariate point process in which the intensities of multiple event types evolve according to a novel continuous-time LSTM. This generative model allows past events to influence the future in complex and realistic ways, by conditioning future event intensities on the hidden state of a recurrent neural network that has consumed the stream of past events. Our model has desirable qualitative properties. It achieves competitive likelihood and predictive accuracy on real and synthetic datasets, including under missing-data conditions. |
Tasks | |
Published | 2016-12-29 |
URL | http://arxiv.org/abs/1612.09328v3 |
http://arxiv.org/pdf/1612.09328v3.pdf | |
PWC | https://paperswithcode.com/paper/the-neural-hawkes-process-a-neurally-self |
Repo | https://github.com/HMEIatJHU/neurawkes |
Framework | none |
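The core mechanism, event intensities driven by a hidden state that decays between events, can be sketched directly. This follows the paper's construction (exponentially decaying cell state, softplus to keep intensities positive), though the paper's scaled-softplus parameter is omitted here.

```python
import numpy as np

def intensity(t, t_last, c, c_bar, delta, o, w):
    """Intensity of one event type under a continuous-time LSTM (sketch).
    Between events the cell state decays from c toward a baseline c_bar;
    a softplus keeps the resulting intensity positive."""
    c_t = c_bar + (c - c_bar) * np.exp(-delta * (t - t_last))  # decaying cell
    h_t = o * np.tanh(c_t)                                     # decayed hidden state
    return np.log1p(np.exp(w @ h_t))                           # softplus intensity

d = 8
lam = intensity(t=1.5, t_last=1.0, c=np.random.randn(d), c_bar=np.zeros(d),
                delta=np.ones(d), o=np.ones(d), w=np.random.randn(d))
```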
The minimal hitting set generation problem: algorithms and computation
Title | The minimal hitting set generation problem: algorithms and computation |
Authors | Andrew Gainer-Dewar, Paola Vera-Licona |
Abstract | Finding inclusion-minimal “hitting sets” for a given collection of sets is a fundamental combinatorial problem with applications in domains as diverse as Boolean algebra, computational biology, and data mining. Much of the algorithmic literature focuses on the problem of recognizing the collection of minimal hitting sets; however, in many of the applications, it is more important to generate these hitting sets. We survey twenty algorithms from across a variety of domains, considering their history, classification, useful features, and computational performance on a variety of synthetic and real-world inputs. We also provide a suite of implementations of these algorithms with a ready-to-use, platform-agnostic interface based on Docker containers and the AlgoRun framework, so that interested computational scientists can easily perform similar tests with inputs from their own research areas on their own computers or through a convenient Web interface. |
Tasks | |
Published | 2016-01-05 |
URL | http://arxiv.org/abs/1601.02939v1 |
http://arxiv.org/pdf/1601.02939v1.pdf | |
PWC | https://paperswithcode.com/paper/the-minimal-hitting-set-generation-problem |
Repo | https://github.com/VeraLiconaResearchGroup/MHSGenerationAlgorithms |
Framework | none |
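Among the family of algorithms the paper surveys, Berge's classical incremental method is the simplest to state: process the input sets one at a time, maintaining all inclusion-minimal hitting sets of the sets seen so far. A short Python version (exponential in the worst case):

```python
def minimal_hitting_sets(family):
    """Berge's algorithm: grow each current minimal hitting set that misses
    the new set S by one element of S, then keep only the minimal results."""
    mhs = {frozenset()}
    for S in map(frozenset, family):
        grown = ({H for H in mhs if H & S} |
                 {H | {e} for H in mhs if not H & S for e in S})
        mhs = {H for H in grown if not any(G < H for G in grown)}
    return mhs

print(minimal_hitting_sets([{1, 2}, {2, 3}, {1, 3}]))
# -> {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})}
```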
A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection
Title | A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection |
Authors | Nian Liu, Junwei Han |
Abstract | Traditional saliency models usually adopt hand-crafted image features and human-designed mechanisms to calculate local or global contrast. In this paper, we propose a novel computational saliency model, the deep spatial contextual long-term recurrent convolutional network (DSCLRCN), to predict where people look in natural scenes. DSCLRCN first automatically learns saliency-related local features at each image location in parallel. Then, in contrast with most other deep-network-based saliency models, which infer saliency in local contexts, DSCLRCN mimics the cortical lateral inhibition mechanisms of the human visual system, incorporating global context to assess the saliency of each image location via a deep spatial long short-term memory (DSLSTM) model. Moreover, we integrate scene context modulation into DSLSTM for saliency inference, leading to a novel deep spatial contextual LSTM (DSCLSTM) model. The whole network can be trained end-to-end and runs efficiently at test time. Experimental results on two benchmark datasets show that DSCLRCN achieves state-of-the-art performance on saliency detection. Furthermore, the proposed DSCLSTM model significantly boosts saliency detection performance by incorporating both global spatial interconnections and scene context modulation, which may inspire further study of these mechanisms in computational saliency models. |
Tasks | Saliency Detection |
Published | 2016-10-06 |
URL | http://arxiv.org/abs/1610.01708v1 |
http://arxiv.org/pdf/1610.01708v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-spatial-contextual-long-term-recurrent |
Repo | https://github.com/AAshqar/DSCLRCN-PyTorch |
Framework | pytorch |
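A loose sketch of the spatial-LSTM idea: a bidirectional sweep along rows and then columns of a feature map, so every location receives context from the whole image. This is a generic ReNet-style layer under stated assumptions, not the authors' exact DSLSTM/DSCLSTM architecture.

```python
import torch
import torch.nn as nn

class SpatialLSTM(nn.Module):
    """Simplified row-then-column bidirectional LSTM sweep over a feature map."""
    def __init__(self, c, hidden):
        super().__init__()
        self.row = nn.LSTM(c, hidden, bidirectional=True, batch_first=True)
        self.col = nn.LSTM(2 * hidden, hidden, bidirectional=True, batch_first=True)

    def forward(self, x):                 # x: (B, C, H, W)
        b, c, h, w = x.shape
        t = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        t, _ = self.row(t)                # sweep along each row
        t = t.reshape(b, h, w, -1).permute(0, 2, 1, 3).reshape(b * w, h, -1)
        t, _ = self.col(t)                # sweep along each column
        return t.reshape(b, w, h, -1).permute(0, 3, 2, 1)  # (B, 2*hidden, H, W)

y = SpatialLSTM(64, 32)(torch.randn(2, 64, 14, 14))
```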
Fundamental principles of cortical computation: unsupervised learning with prediction, compression and feedback
Title | Fundamental principles of cortical computation: unsupervised learning with prediction, compression and feedback |
Authors | Micah Richert, Dimitry Fisher, Filip Piekniewski, Eugene M. Izhikevich, Todd L. Hylton |
Abstract | There has been great progress in understanding the anatomical and functional microcircuitry of the primate cortex. However, the fundamental principles of cortical computation, the principles that allow the visual cortex to bind retinal spikes into representations of objects, scenes and scenarios, have so far remained elusive. In an attempt to come closer to these principles, here we present a functional, phenomenological model of the primate visual cortex. The core of the model describes four hierarchical cortical areas with feedforward, lateral, and recurrent connections. The three main principles implemented in the model are information compression, unsupervised learning by prediction, and the use of lateral and top-down context. We show that the model reproduces key aspects of the primate ventral stream of visual processing, including simple and complex cells in V1, increasingly complex feature encoding, and increased separability of object representations in higher cortical areas. The model learns representations of the visual environment that allow for accurate classification and state-of-the-art visual tracking performance on novel objects. |
Tasks | Visual Tracking |
Published | 2016-08-19 |
URL | http://arxiv.org/abs/1608.06277v1 |
http://arxiv.org/pdf/1608.06277v1.pdf | |
PWC | https://paperswithcode.com/paper/fundamental-principles-of-cortical |
Repo | https://github.com/braincorp/ASC |
Framework | none |
De-Conflated Semantic Representations
Title | De-Conflated Semantic Representations |
Authors | Mohammad Taher Pilehvar, Nigel Collier |
Abstract | One major deficiency of most semantic representation techniques is that they usually model a word type as a single point in the semantic space, conflating all the meanings that the word can have. Addressing this issue by learning distinct representations for individual meanings of words has been the subject of several research studies in the past few years. However, the generated sense representations are either not linked to any sense inventory or are unreliable for infrequent word senses. We propose a technique that tackles these problems by de-conflating the representations of words based on the deep knowledge that can be derived from a semantic network. Our approach provides multiple advantages over past work, including high coverage and the ability to generate accurate representations even for infrequent word senses. We carry out evaluations on six datasets across two semantic similarity tasks and report state-of-the-art results on most of them. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2016-08-05 |
URL | http://arxiv.org/abs/1608.01961v1 |
http://arxiv.org/pdf/1608.01961v1.pdf | |
PWC | https://paperswithcode.com/paper/de-conflated-semantic-representations |
Repo | https://github.com/pilehvar/deconf |
Framework | none |
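The de-conflation idea can be sketched as a least-squares pull between the word's vector and its sense-biasing words. The objective below matches the general form described; the rank-decay weights are an assumption, since the paper derives its weights from a semantic network (WordNet).

```python
import numpy as np

def deconflated_sense(w, biasers, alpha=1.0):
    """Sketch: the sense vector is the minimizer of
    alpha * ||v - w||^2 + sum_i d_i * ||v - b_i||^2,
    pulling the word's vector w toward its sense-biasing words b_i."""
    d = 1.0 / np.arange(1, len(biasers) + 1)   # assumed rank-decayed weights
    B = np.stack(biasers)
    return (alpha * w + d @ B) / (alpha + d.sum())

sense = deconflated_sense(np.random.rand(50), [np.random.rand(50) for _ in range(5)])
```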
Learning long-term dependencies for action recognition with a biologically-inspired deep network
Title | Learning long-term dependencies for action recognition with a biologically-inspired deep network |
Authors | Yemin Shi, Yonghong Tian, Yaowei Wang, Tiejun Huang |
Abstract | Despite extensive research efforts in recent years, efficiently learning long-term dependencies from sequences remains a challenging task. As one of the key models for sequence learning, the recurrent neural network (RNN) and its variants, such as the long short-term memory (LSTM) and the gated recurrent unit (GRU), are still not powerful enough in practice. One possible reason is that they have only feedforward connections, unlike the biological neural system, which is typically composed of both feedforward and feedback connections. To address this problem, this paper proposes a biologically-inspired deep network called shuttleNet (code available at https://github.com/shiyemin/shuttlenet). Technically, shuttleNet consists of several processors, each of which is a GRU associated with multiple groups of cells and states. Unlike traditional RNNs, all processors inside shuttleNet are loop-connected to mimic the brain's feedforward and feedback connections, and they are shared across multiple pathways in the loop connection. An attention mechanism is then employed to select the best information-flow pathway. Extensive experiments conducted on two benchmark datasets (UCF101 and HMDB51) show that we can beat state-of-the-art methods by simply embedding shuttleNet into a CNN-RNN framework. |
Tasks | Temporal Action Localization |
Published | 2016-11-16 |
URL | http://arxiv.org/abs/1611.05216v3 |
http://arxiv.org/pdf/1611.05216v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-long-term-dependencies-for-action |
Repo | https://github.com/shiyemin/shuttlenet |
Framework | tf |
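A toy sketch of the loop-connected, shared-processor idea with attention over pathways; sizes, the ring traversal and the attention scoring are illustrative, not the paper's exact specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShuttleCell(nn.Module):
    """A ring of GRU processors shared across several pathways; attention
    weights the pathway outputs to pick the most useful information flow."""
    def __init__(self, dim, n=3):
        super().__init__()
        self.procs = nn.ModuleList(nn.GRUCell(dim, dim) for _ in range(n))
        self.attn = nn.Linear(dim, 1)
        self.n = n

    def forward(self, x, states):          # x: (B, dim); states: list of n (B, dim)
        outs = []
        for start in range(self.n):        # one pathway per starting processor
            s = states[start]
            for k in range(self.n):        # traverse the shared ring
                s = self.procs[(start + k) % self.n](x, s)
            outs.append(s)
        w = F.softmax(torch.cat([self.attn(o) for o in outs], dim=1), dim=1)
        out = sum(w[:, i:i + 1] * outs[i] for i in range(self.n))
        return out, outs

cell = ShuttleCell(dim=32)
out, new_states = cell(torch.randn(4, 32), [torch.randn(4, 32) for _ in range(3)])
```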
Chinese Poetry Generation with Planning based Neural Network
Title | Chinese Poetry Generation with Planning based Neural Network |
Authors | Zhe Wang, Wei He, Hua Wu, Haiyang Wu, Wei Li, Haifeng Wang, Enhong Chen |
Abstract | Chinese poetry generation is a very challenging task in natural language processing. In this paper, we propose a novel two-stage poetry generation method which first plans the sub-topics of the poem according to the user's writing intent, and then generates each line of the poem sequentially, using a modified recurrent neural network encoder-decoder framework. The proposed planning-based method can ensure that the generated poem is coherent and semantically consistent with the user's intent. A comprehensive evaluation with human judgments demonstrates that our proposed approach outperforms state-of-the-art poetry generation methods and that the quality of the generated poems is somewhat comparable to that of human poets. |
Tasks | |
Published | 2016-10-31 |
URL | http://arxiv.org/abs/1610.09889v2 |
http://arxiv.org/pdf/1610.09889v2.pdf | |
PWC | https://paperswithcode.com/paper/chinese-poetry-generation-with-planning-based |
Repo | https://github.com/Epoch-Mengying/Generating-Poetry-with-Chatbot |
Framework | none |
Local Binary Convolutional Neural Networks
Title | Local Binary Convolutional Neural Networks |
Authors | Felix Juefei-Xu, Vishnu Naresh Boddeti, Marios Savvides |
Abstract | We propose local binary convolution (LBC), an efficient alternative to the convolutional layers in standard convolutional neural networks (CNNs). The design principles of LBC are motivated by local binary patterns (LBP). The LBC layer comprises a set of fixed, sparse, pre-defined binary convolutional filters that are not updated during training, a non-linear activation function, and a set of learnable linear weights. The linear weights combine the activated filter responses to approximate the activated filter responses of a corresponding standard convolutional layer. The LBC layer affords significant parameter savings, 9x to 169x fewer learnable parameters than a standard convolutional layer. Furthermore, the sparse and binary nature of the weights also yields 9x to 169x savings in model size compared to a standard convolutional layer. We demonstrate both theoretically and experimentally that our local binary convolution layer is a good approximation of a standard convolutional layer. Empirically, CNNs with LBC layers, called local binary convolutional neural networks (LBCNNs), achieve performance parity with regular CNNs on a range of visual datasets (MNIST, SVHN, CIFAR-10, and ImageNet) while enjoying significant computational savings. |
Tasks | |
Published | 2016-08-22 |
URL | http://arxiv.org/abs/1608.06049v2 |
http://arxiv.org/pdf/1608.06049v2.pdf | |
PWC | https://paperswithcode.com/paper/local-binary-convolutional-neural-networks |
Repo | https://github.com/juefeix/pnn.pytorch |
Framework | pytorch |
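The layer decomposes into two pieces that are easy to write down: fixed sparse {-1, 0, +1} filters that never receive gradients, and a learnable 1x1 convolution that linearly recombines their activated responses. A hedged PyTorch sketch (the paper's activation choice and sparsity settings may differ):

```python
import torch
import torch.nn as nn

class LBConv(nn.Module):
    """Local-binary-convolution-style layer: fixed random sparse binary
    filters followed by a learnable 1x1 linear combination."""
    def __init__(self, in_ch, mid_ch, out_ch, k=3, sparsity=0.5):
        super().__init__()
        self.fixed = nn.Conv2d(in_ch, mid_ch, k, padding=k // 2, bias=False)
        w = torch.sign(torch.randn(self.fixed.weight.shape))   # {-1, +1}
        w[torch.rand_like(w) < sparsity] = 0.0                 # sparsify
        self.fixed.weight = nn.Parameter(w, requires_grad=False)  # never trained
        self.act = nn.ReLU()
        self.learn = nn.Conv2d(mid_ch, out_ch, 1)              # learnable weights

    def forward(self, x):
        return self.learn(self.act(self.fixed(x)))

y = LBConv(3, 64, 32)(torch.randn(1, 3, 28, 28))
```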
Incorporating Discrete Translation Lexicons into Neural Machine Translation
Title | Incorporating Discrete Translation Lexicons into Neural Machine Translation |
Authors | Philip Arthur, Graham Neubig, Satoshi Nakamura |
Abstract | Neural machine translation (NMT) often makes mistakes in translating low-frequency content words that are essential to understanding the meaning of the sentence. We propose a method to alleviate this problem by augmenting NMT systems with discrete translation lexicons that efficiently encode translations of these low-frequency words. We describe a method to calculate the lexicon probability of the next word in the translation candidate by using the attention vector of the NMT model to select which source word lexical probabilities the model should focus on. We test two methods to combine this probability with the standard NMT probability: (1) using it as a bias, and (2) linear interpolation. Experiments on two corpora show an improvement of 2.0-2.3 BLEU and 0.13-0.44 NIST score, and faster convergence time. |
Tasks | Machine Translation |
Published | 2016-06-07 |
URL | http://arxiv.org/abs/1606.02006v2 |
http://arxiv.org/pdf/1606.02006v2.pdf | |
PWC | https://paperswithcode.com/paper/incorporating-discrete-translation-lexicons |
Repo | https://github.com/duyvuleo/Transformer-DyNet |
Framework | tf |
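The two combination methods named in the abstract are compact enough to sketch directly; the epsilon and interpolation coefficient below are illustrative placeholders (the paper tunes or learns these).

```python
import numpy as np

def lexicon_prob(attention, lex_table):
    """Lexicon probability of the next target word: the attention weights
    select which source words' lexical translation probabilities to trust."""
    return attention @ lex_table      # (src_len,) @ (src_len, vocab) -> (vocab,)

def combine_bias(nmt_logits, p_lex, eps=1e-6):
    """Method (1): use the lexicon as a bias inside the softmax."""
    z = nmt_logits + np.log(p_lex + eps)
    z -= z.max()
    return np.exp(z) / np.exp(z).sum()

def combine_interp(p_nmt, p_lex, lam=0.5):
    """Method (2): linear interpolation of the two distributions."""
    return lam * p_nmt + (1 - lam) * p_lex
```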
Learning to Generate Compositional Color Descriptions
Title | Learning to Generate Compositional Color Descriptions |
Authors | Will Monroe, Noah D. Goodman, Christopher Potts |
Abstract | The production of color language is essential for grounded language generation. Color descriptions have many challenging properties: they can be vague, compositionally complex, and denotationally rich. We present an effective approach to generating color descriptions using recurrent neural networks and a Fourier-transformed color representation. Our model outperforms previous work on a conditional language modeling task over a large corpus of naturalistic color descriptions. In addition, probing the model’s output reveals that it can accurately produce not only basic color terms but also descriptors with non-convex denotations (“greenish”), bare modifiers (“bright”, “dull”), and compositional phrases (“faded teal”) not seen in training. |
Tasks | Language Modelling, Text Generation |
Published | 2016-06-13 |
URL | http://arxiv.org/abs/1606.03821v2 |
http://arxiv.org/pdf/1606.03821v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-generate-compositional-color |
Repo | https://github.com/stanfordnlp/color-describer |
Framework | none |
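A hedged sketch of a Fourier-transformed color representation: low-order sine/cosine features of the HSV coordinates. The exact basis and order used in the paper may differ.

```python
import numpy as np

def fourier_color_features(h, s, v, order=3):
    """Low-order Fourier features of a normalized HSV color (h, s, v in [0, 1])."""
    feats = []
    for j in range(order):
        for k in range(order):
            for l in range(order):
                phase = 2 * np.pi * (j * h + k * s + l * v)
                feats.extend([np.cos(phase), np.sin(phase)])
    return np.array(feats)              # 2 * order**3 features (54 here)

x = fourier_color_features(0.33, 0.8, 0.9)
```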
Deeper Depth Prediction with Fully Convolutional Residual Networks
Title | Deeper Depth Prediction with Fully Convolutional Residual Networks |
Authors | Iro Laina, Christian Rupprecht, Vasileios Belagiannis, Federico Tombari, Nassir Navab |
Abstract | This paper addresses the problem of estimating the depth map of a scene given a single RGB image. We propose a fully convolutional architecture, encompassing residual learning, to model the ambiguous mapping between monocular images and depth maps. In order to improve the output resolution, we present a novel way to efficiently learn feature map up-sampling within the network. For optimization, we introduce the reverse Huber loss, which is particularly suited to the task at hand and driven by the value distributions commonly present in depth maps. Our model is composed of a single architecture that is trained end-to-end and does not rely on post-processing techniques, such as CRFs or other additional refinement steps. As a result, it runs in real time on images or videos. In the evaluation, we show that the proposed model contains fewer parameters and requires less training data than the current state of the art, while outperforming all approaches on depth estimation. Code and models are publicly available. |
Tasks | Depth Estimation |
Published | 2016-06-01 |
URL | http://arxiv.org/abs/1606.00373v2 |
http://arxiv.org/pdf/1606.00373v2.pdf | |
PWC | https://paperswithcode.com/paper/deeper-depth-prediction-with-fully |
Repo | https://github.com/LeonSun0101/CD-SD |
Framework | pytorch |
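The reverse Huber (berHu) loss is simple to state: L1 for residuals below a threshold c and a scaled L2 above it, with c taken from the current batch. A short PyTorch sketch, using the c = max-error/5 rule from the paper:

```python
import torch

def berhu_loss(pred, target):
    """Reverse Huber (berHu) loss: |x| if |x| <= c, else (x^2 + c^2) / (2c)."""
    diff = (pred - target).abs()
    c = (diff.max() / 5).clamp(min=1e-6)   # guard against a zero threshold
    return torch.where(diff <= c, diff, (diff ** 2 + c ** 2) / (2 * c)).mean()

loss = berhu_loss(torch.randn(4, 1, 8, 8), torch.randn(4, 1, 8, 8))
```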
Learning Convolutional Neural Networks for Graphs
Title | Learning Convolutional Neural Networks for Graphs |
Authors | Mathias Niepert, Mohamed Ahmed, Konstantin Kutzkov |
Abstract | Numerous important problems can be framed as learning from graph data. We propose a framework for learning convolutional neural networks for arbitrary graphs. These graphs may be undirected or directed, with both discrete and continuous node and edge attributes. Analogous to image-based convolutional networks that operate on locally connected regions of the input, we present a general approach to extracting locally connected regions from graphs. Using established benchmark data sets, we demonstrate that the learned feature representations are competitive with state-of-the-art graph kernels and that their computation is highly efficient. |
Tasks | Graph Classification |
Published | 2016-05-17 |
URL | http://arxiv.org/abs/1605.05273v4 |
http://arxiv.org/pdf/1605.05273v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-convolutional-neural-networks-for |
Repo | https://github.com/TibiG97/Part2Project |
Framework | none |
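The locally connected regions the abstract mentions come from a neighborhood-assembly step that can be sketched with a breadth-first search; the canonical node ordering the paper uses for graph normalization is replaced here by a crude sorted() stand-in.

```python
from collections import deque

def receptive_field(adj, root, k, pad=None):
    """Collect a fixed-size receptive field of k nodes around a root via BFS,
    truncating or padding to exactly k (padding plays the role of zero nodes)."""
    seen, order, queue = {root}, [root], deque([root])
    while queue and len(order) < k:
        for nb in sorted(adj[queue.popleft()]):
            if nb not in seen:
                seen.add(nb)
                order.append(nb)
                queue.append(nb)
    order = order[:k]
    return order + [pad] * (k - len(order))

adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
print(receptive_field(adj, 0, 3))   # -> [0, 1, 2]
```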