May 7, 2019


Paper Group AWR 73



Deep neural networks can be improved using human-derived contextual expectations

Title Deep neural networks can be improved using human-derived contextual expectations
Authors Harish Katti, Marius V. Peelen, S. P. Arun
Abstract Real-world objects occur in specific contexts. Such context has been shown to facilitate detection by constraining the locations to search. But can context directly benefit object detection? To do so, context needs to be learned independently from target features. This is impossible in traditional object detection where classifiers are trained on images containing both target features and surrounding context. In contrast, humans can learn context and target features separately, such as when we see highways without cars. Here we show for the first time that human-derived scene expectations can be used to improve object detection performance in machines. To measure contextual expectations, we asked human subjects to indicate the scale, location and likelihood at which cars or people might occur in scenes without these objects. Humans showed highly systematic expectations that we could accurately predict using scene features. This allowed us to predict human expectations on novel scenes without requiring manual annotation. On augmenting deep neural networks with predicted human expectations, we obtained substantial gains in accuracy for detecting cars and people (1-3%) as well as on detecting associated objects (3-20%). In contrast, augmenting deep networks with other conventional features yielded far smaller gains. This improvement was due to relatively poor matches at highly likely locations being correctly labelled as target and conversely strong matches at unlikely locations being correctly rejected as false alarms. Taken together, our results show that augmenting deep neural networks with human-derived context features improves their performance, suggesting that humans learn scene context separately unlike deep networks.
Tasks Object Detection
Published 2016-11-22
URL http://arxiv.org/abs/1611.07218v4
PDF http://arxiv.org/pdf/1611.07218v4.pdf
PWC https://paperswithcode.com/paper/deep-neural-networks-can-be-improved-using
Repo https://github.com/harish2006/cntxt_likelihood
Framework none

An Integrated Recommender Algorithm for Rating Prediction

Title An Integrated Recommender Algorithm for Rating Prediction
Authors Yefeng Ruan, Tzu-Chun Lin
Abstract Recommender systems are currently widely used in many e-commerce systems, such as Amazon and eBay. They aim to help users find items that may interest them. In the literature, neighborhood-based collaborative filtering and matrix factorization are two common methods used in recommender systems. In this paper, we combine these two methods with personalized weights. Rather than using fixed weights for these two methods, we assume each user has her/his own preference over them. Our results show that our algorithm outperforms the neighborhood-based collaborative filtering algorithm, the matrix factorization algorithm, and their combination with fixed weights.
Tasks Recommendation Systems
Published 2016-08-05
URL https://arxiv.org/abs/1608.02021v1
PDF https://arxiv.org/pdf/1608.02021v1.pdf
PWC https://paperswithcode.com/paper/an-integrated-recommender-algorithm-for
Repo https://github.com/sidooms/MovieTweetings
Framework none
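
The personalized blend described in the abstract can be sketched as a per-user convex combination of the two predictions. This is a minimal illustration, not the authors' implementation: the abstract does not say how the weights are learned, so the clipped least-squares fit in `fit_user_weight` is an assumption, and both function names are hypothetical.

```python
def blend_prediction(knn_pred, mf_pred, user_weight):
    """Blend two rating predictions with a per-user weight in [0, 1]."""
    if not 0.0 <= user_weight <= 1.0:
        raise ValueError("user_weight must lie in [0, 1]")
    return user_weight * knn_pred + (1.0 - user_weight) * mf_pred


def fit_user_weight(knn_preds, mf_preds, true_ratings):
    """Assumed fitting rule: least-squares weight for one user, clipped to [0, 1].

    Minimises sum((w*k + (1-w)*m - r)^2), which equals sum((w*(k-m) - (r-m))^2).
    """
    num = sum((k - m) * (r - m)
              for k, m, r in zip(knn_preds, mf_preds, true_ratings))
    den = sum((k - m) ** 2 for k, m in zip(knn_preds, mf_preds))
    if den == 0.0:
        return 0.5  # both models agree, so every weight gives the same blend
    return min(1.0, max(0.0, num / den))


# Toy user whose held-out ratings match the neighborhood model exactly.
w = fit_user_weight([4.0, 2.0], [2.0, 4.0], [4.0, 2.0])
print(w, blend_prediction(4.0, 2.0, w))
```

A fixed-weight baseline corresponds to passing the same `user_weight` for every user.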

Kronecker Determinantal Point Processes

Title Kronecker Determinantal Point Processes
Authors Zelda Mariet, Suvrit Sra
Abstract Determinantal Point Processes (DPPs) are probabilistic models over all subsets of a ground set of $N$ items. They have recently gained prominence in several applications that rely on “diverse” subsets. However, their applicability to large problems is still limited due to the $\mathcal O(N^3)$ complexity of core tasks such as sampling and learning. We enable efficient sampling and learning for DPPs by introducing KronDPP, a DPP model whose kernel matrix decomposes as a tensor product of multiple smaller kernel matrices. This decomposition immediately enables fast exact sampling. But contrary to what one may expect, leveraging the Kronecker product structure for speeding up DPP learning turns out to be more difficult. We overcome this challenge, and derive batch and stochastic optimization algorithms for efficiently learning the parameters of a KronDPP.
Tasks Point Processes, Stochastic Optimization
Published 2016-05-26
URL http://arxiv.org/abs/1605.08374v1
PDF http://arxiv.org/pdf/1605.08374v1.pdf
PWC https://paperswithcode.com/paper/kronecker-determinantal-point-processes
Repo https://github.com/alshedivat/DeterminantalPointProcesses.jl
Framework none
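
To make the kernel structure concrete, the sketch below builds a kernel as the Kronecker product of two small matrices and scores a subset by the determinant of the corresponding principal submatrix, which is the unnormalised probability in an L-ensemble DPP. The helper names and the toy matrices are illustrative assumptions. It does not reproduce the paper's sampling or learning algorithms.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]  # work on a copy
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def subset_score(kernel, subset):
    """Unnormalised DPP probability of `subset`: det of the principal submatrix."""
    return det([[kernel[i][j] for j in subset] for i in subset])


# Toy factors (assumed values): a 4-item kernel from two 2x2 kernels.
L1 = [[2.0, 1.0], [1.0, 2.0]]
L2 = [[1.0, 0.5], [0.5, 1.0]]
L = kron(L1, L2)
print(subset_score(L, [0, 1]), det(L))
```

Only the two small factors need to be stored; the full kernel is materialised here purely for illustration.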

FusionNet: A deep fully residual convolutional neural network for image segmentation in connectomics

Title FusionNet: A deep fully residual convolutional neural network for image segmentation in connectomics
Authors Tran Minh Quan, David G. C. Hildebrand, Won-Ki Jeong
Abstract Electron microscopic connectomics is an ambitious research direction with the goal of studying comprehensive brain connectivity maps by using high-throughput, nano-scale microscopy. One of the main challenges in connectomics research is developing scalable image analysis algorithms that require minimal user intervention. Recently, deep learning has drawn much attention in computer vision because of its exceptional performance in image classification tasks. For this reason, its application to connectomic analyses holds great promise, as well. In this paper, we introduce a novel deep neural network architecture, FusionNet, for the automatic segmentation of neuronal structures in connectomics data. FusionNet leverages the latest advances in machine learning, such as semantic segmentation and residual neural networks, with the novel introduction of summation-based skip connections to allow a much deeper network architecture for a more accurate segmentation. We demonstrate the performance of the proposed method by comparing it with state-of-the-art electron microscopy (EM) segmentation methods from the ISBI EM segmentation challenge. We also show the segmentation results on two different tasks including cell membrane and cell body segmentation and a statistical analysis of cell morphology.
Tasks Brain Image Segmentation, Image Classification, Semantic Segmentation
Published 2016-12-16
URL http://arxiv.org/abs/1612.05360v2
PDF http://arxiv.org/pdf/1612.05360v2.pdf
PWC https://paperswithcode.com/paper/fusionnet-a-deep-fully-residual-convolutional
Repo https://github.com/Jeongseungwoo/Fusion-net
Framework tf
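
The abstract highlights summation-based skip connections. As a rough illustration only, the sketch below contrasts element-wise summation with channel concatenation at a single pixel, where each feature is a plain list of channel values. Concatenation is used here as an assumed point of comparison, and none of this is taken from the FusionNet code.

```python
def concat_skip(encoder_feat, decoder_feat):
    """Concatenation skip: channel count grows to len(enc) + len(dec)."""
    return encoder_feat + decoder_feat  # list concatenation along channels


def sum_skip(encoder_feat, decoder_feat):
    """Summation skip: element-wise add, channel count unchanged."""
    if len(encoder_feat) != len(decoder_feat):
        raise ValueError("summation needs matching channel counts")
    return [e + d for e, d in zip(encoder_feat, decoder_feat)]


enc = [1.0, 2.0]   # toy encoder features at one pixel
dec = [0.5, 0.5]   # toy decoder features at the same pixel
print(sum_skip(enc, dec), concat_skip(enc, dec))
```

Because summation keeps the channel count fixed, the layers after the merge do not need wider inputs as the network gets deeper.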

$\ell_1$ Adaptive Trend Filter via Fast Coordinate Descent

Title $\ell_1$ Adaptive Trend Filter via Fast Coordinate Descent
Authors Mario Souto, Joaquim D. Garcia, Gustavo C. Amaral
Abstract Identifying the unknown underlying trend of a given noisy signal is extremely useful for a wide range of applications. The number of potential trends might be exponential, which can be computationally expensive even for short signals. Another challenge is the presence of abrupt changes and outliers at unknown times, which carry useful information regarding the signal’s characteristics. In this paper, we present the $\ell_1$ Adaptive Trend Filter, which can consistently identify the components in the underlying trend and multiple level-shifts, even in the presence of outliers. Additionally, an enhanced coordinate descent algorithm that exploits the filter design is presented. Some implementation details are discussed and a version in the Julia language is presented along with two distinct applications to illustrate the filter’s potential.
Tasks
Published 2016-03-11
URL https://arxiv.org/abs/1603.03799v2
PDF https://arxiv.org/pdf/1603.03799v2.pdf
PWC https://paperswithcode.com/paper/ell_1-adaptive-trend-filter-via-fast
Repo https://github.com/joaquimg/L1AdaptiveTrendFilter.jl
Framework none
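
For readers unfamiliar with coordinate descent under an $\ell_1$ penalty, the sketch below shows the textbook lasso update built on soft-thresholding. It is a generic version written in Python for illustration, not the enhanced algorithm or the Julia implementation from the paper. The function names and the choice of `columns` as basis signals (for example, step functions for level-shifts) are assumptions.

```python
def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def lasso_coordinate_descent(columns, y, lam, n_sweeps=100):
    """Minimise 0.5*||y - sum_j beta_j*columns[j]||^2 + lam*||beta||_1."""
    n = len(y)
    beta = [0.0] * len(columns)
    residual = list(y)  # y minus the current fit, kept up to date
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            norm_sq = sum(c * c for c in col)
            if norm_sq == 0.0:
                continue
            # Correlation of column j with the residual that excludes
            # column j's own current contribution.
            rho = sum(col[i] * residual[i] for i in range(n)) + norm_sq * beta[j]
            new_beta = soft_threshold(rho, lam) / norm_sq
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * col[i]
                beta[j] = new_beta
    return beta


# Toy problem: one constant component fitted to a flat signal.
print(lasso_coordinate_descent([[1.0, 1.0]], [3.0, 3.0], lam=2.0))
```

Each update touches a single coefficient and keeps the residual current, so one sweep costs time proportional to the total number of basis entries.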

Hierarchical Memory Networks for Answer Selection on Unknown Words

Title Hierarchical Memory Networks for Answer Selection on Unknown Words
Authors Jiaming Xu, Jing Shi, Yiqun Yao, Suncong Zheng, Bo Xu, Bo Xu
Abstract Recently, end-to-end memory networks have shown promising results on the question answering task; they encode past facts into an explicit memory and perform reasoning by making multiple computational steps on the memory. However, memory networks conduct the reasoning on sentence-level memory to output coarse semantic vectors and do not use any attention mechanism to focus on words, which may cause the model to lose detailed information, especially when the answers are rare or unknown words. In this paper, we propose novel Hierarchical Memory Networks, dubbed HMN. First, we encode the past facts into sentence-level memory and word-level memory respectively. Then, $k$-max pooling is applied after the reasoning module on the sentence-level memory to sample the $k$ sentences most relevant to a question, and these sentences are fed into an attention mechanism on the word-level memory to focus on the words in the selected sentences. Finally, the prediction is jointly learned over the outputs of the sentence-level reasoning module and the word-level attention mechanism. The experimental results demonstrate that our approach successfully conducts answer selection on unknown words and achieves better performance than memory networks.
Tasks Answer Selection, Question Answering
Published 2016-09-28
URL http://arxiv.org/abs/1609.08843v1
PDF http://arxiv.org/pdf/1609.08843v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-memory-networks-for-answer
Repo https://github.com/jacoxu/HMN4QA
Framework none
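
The two-stage selection in the abstract ($k$-max pooling over sentences, then attention over the words of the selected sentences) can be sketched as follows. The scores are plain numbers supplied by the caller; how HMN actually computes them is not covered here, and all function names are hypothetical.

```python
import math


def k_max_indices(scores, k):
    """Indices of the k largest scores, returned in original order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def softmax(values):
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def word_attention(sentences, sentence_scores, word_scores, k):
    """Attend over the words of the k best-scoring sentences only."""
    keep = k_max_indices(sentence_scores, k)
    words = [w for i in keep for w in sentences[i]]
    logits = [s for i in keep for s in word_scores[i]]
    return list(zip(words, softmax(logits)))


# Toy memory: three sentences with assumed relevance scores.
sentences = [["a", "b"], ["c"], ["d", "e"]]
attended = word_attention(sentences, [0.2, 0.9, 0.5],
                          [[0.0, 0.0], [1.0], [1.0, 1.0]], k=2)
print(attended)
```

Restricting attention to the selected sentences means the softmax runs over far fewer words than the full memory holds.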

UnrealCV: Connecting Computer Vision to Unreal Engine

Title UnrealCV: Connecting Computer Vision to Unreal Engine
Authors Weichao Qiu, Alan Yuille
Abstract Computer graphics can not only generate synthetic images and ground truth but it also offers the possibility of constructing virtual worlds in which: (i) an agent can perceive, navigate, and take actions guided by AI algorithms, (ii) properties of the worlds can be modified (e.g., material and reflectance), (iii) physical simulations can be performed, and (iv) algorithms can be learnt and evaluated. But creating realistic virtual worlds is not easy. The game industry, however, has spent a lot of effort creating 3D worlds, which a player can interact with. So researchers can build on these resources to create virtual worlds, provided we can access and modify the internal data structures of the games. To enable this we created an open-source plugin UnrealCV (http://unrealcv.github.io) for a popular game engine Unreal Engine 4 (UE4). We show two applications: (i) a proof of concept image dataset, and (ii) linking Caffe with the virtual world to test deep network algorithms.
Tasks Physical Simulations
Published 2016-09-05
URL http://arxiv.org/abs/1609.01326v1
PDF http://arxiv.org/pdf/1609.01326v1.pdf
PWC https://paperswithcode.com/paper/unrealcv-connecting-computer-vision-to-unreal
Repo https://github.com/unrealcv/unrealcv
Framework none

SSHMT: Semi-supervised Hierarchical Merge Tree for Electron Microscopy Image Segmentation

Title SSHMT: Semi-supervised Hierarchical Merge Tree for Electron Microscopy Image Segmentation
Authors Ting Liu, Miaomiao Zhang, Mehran Javanmardi, Nisha Ramesh, Tolga Tasdizen
Abstract Region-based methods have proven necessary for improving segmentation accuracy of neuronal structures in electron microscopy (EM) images. Most region-based segmentation methods use a scoring function to determine region merging. Such functions are usually learned with supervised algorithms that demand considerable ground truth data, which are costly to collect. We propose a semi-supervised approach that reduces this demand. Based on a merge tree structure, we develop a differentiable unsupervised loss term that enforces consistent predictions from the learned function. We then propose a Bayesian model that combines the supervised and the unsupervised information for probabilistic learning. The experimental results on three EM data sets demonstrate that by using a subset of only 3% to 7% of the entire ground truth data, our approach consistently performs close to the state-of-the-art supervised method with the full labeled data set, and significantly outperforms the supervised method with the same labeled subset.
Tasks Electron Microscopy Image Segmentation, Semantic Segmentation
Published 2016-08-14
URL http://arxiv.org/abs/1608.04051v1
PDF http://arxiv.org/pdf/1608.04051v1.pdf
PWC https://paperswithcode.com/paper/sshmt-semi-supervised-hierarchical-merge-tree
Repo https://github.com/tingliu/glia
Framework none

DeepWriter: A Multi-Stream Deep CNN for Text-independent Writer Identification

Title DeepWriter: A Multi-Stream Deep CNN for Text-independent Writer Identification
Authors Linjie Xing, Yu Qiao
Abstract Text-independent writer identification is challenging due to the huge variation of written contents and the ambiguous written styles of different writers. This paper proposes DeepWriter, a deep multi-stream CNN to learn a powerful deep representation for recognizing writers. DeepWriter takes local handwritten patches as input and is trained with softmax classification loss. The main contributions are: 1) we design and optimize a multi-stream structure for the writer identification task; 2) we introduce data augmentation learning to enhance the performance of DeepWriter; 3) we introduce a patch scanning strategy to handle text images of different lengths. In addition, we find that different languages such as English and Chinese may share common features for writer identification, and joint training can yield better performance. Experimental results on IAM and HWDB datasets show that our models achieve high identification accuracy: 99.01% on 301 writers and 97.03% on 657 writers with one English sentence input, 93.85% on 300 writers with one Chinese character input, which outperform previous methods by a large margin. Moreover, our models obtain accuracy of 98.01% on 301 writers with only 4 English letters as input.
Tasks Data Augmentation
Published 2016-06-21
URL http://arxiv.org/abs/1606.06472v2
PDF http://arxiv.org/pdf/1606.06472v2.pdf
PWC https://paperswithcode.com/paper/deepwriter-a-multi-stream-deep-cnn-for-text
Repo https://github.com/Nihhaar/Connected-Handwriting-Recognition
Framework none

CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

Title CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Authors Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick
Abstract When building artificial intelligence systems that can reason and answer questions about visual data, we need diagnostic tests to analyze our progress and discover shortcomings. Existing benchmarks for visual question answering can help, but have strong biases that models can exploit to correctly answer questions without reasoning. They also conflate multiple sources of error, making it hard to pinpoint model weaknesses. We present a diagnostic dataset that tests a range of visual reasoning abilities. It contains minimal biases and has detailed annotations describing the kind of reasoning each question requires. We use this dataset to analyze a variety of modern visual reasoning systems, providing novel insights into their abilities and limitations.
Tasks Question Answering, Visual Question Answering, Visual Reasoning
Published 2016-12-20
URL http://arxiv.org/abs/1612.06890v1
PDF http://arxiv.org/pdf/1612.06890v1.pdf
PWC https://paperswithcode.com/paper/clevr-a-diagnostic-dataset-for-compositional
Repo https://github.com/ethanjperez/film
Framework pytorch

Binarized Neural Networks

Title Binarized Neural Networks
Authors Itay Hubara, Daniel Soudry, Ran El Yaniv
Abstract We introduce a method to train Binarized Neural Networks (BNNs) - neural networks with binary weights and activations at run-time and when computing the parameters’ gradient at train-time. We conduct two sets of experiments, each based on a different framework, namely Torch7 and Theano, where we train BNNs on MNIST, CIFAR-10 and SVHN, and achieve nearly state-of-the-art results. During the forward pass, BNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations, which might lead to a great increase in power-efficiency. Last but not least, we wrote a binary matrix multiplication GPU kernel with which it is possible to run our MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy. The code for training and running our BNNs is available.
Tasks
Published 2016-02-08
URL http://arxiv.org/abs/1602.02505v3
PDF http://arxiv.org/pdf/1602.02505v3.pdf
PWC https://paperswithcode.com/paper/binarized-neural-networks
Repo https://github.com/ryuz/BinaryBrain
Framework none
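
The claim that most arithmetic can be replaced by bit-wise operations rests on a simple identity: for two vectors of length $n$ with entries in {+1, -1}, the dot product equals $n$ minus twice the number of positions where they disagree. The sketch below demonstrates that identity with Python integers as bit containers. The sign convention (zero maps to +1) and the function names are assumptions, and this is not the authors' GPU kernel.

```python
def binarize(values):
    """Deterministic sign binarisation: >= 0 maps to +1, otherwise -1."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into an int, one bit per entry (+1 -> 1, -1 -> 0)."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n via XOR + popcount."""
    disagreements = bin(a_bits ^ b_bits).count("1")
    return n - 2 * disagreements


a = binarize([0.3, -1.2, 0.0, 2.0])    # [1, -1, 1, 1]
b = binarize([-0.5, -0.1, 0.7, -3.0])  # [-1, -1, 1, -1]
plain = sum(x * y for x, y in zip(a, b))
packed = binary_dot(pack_bits(a), pack_bits(b), len(a))
print(plain, packed)
```

On hardware, the XOR and popcount operate on whole machine words at once, which is where the memory and arithmetic savings come from.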

Automated Generation of Multilingual Clusters for the Evaluation of Distributed Representations

Title Automated Generation of Multilingual Clusters for the Evaluation of Distributed Representations
Authors Philip Blair, Yuval Merhav, Joel Barry
Abstract We propose a language-agnostic way of automatically generating sets of semantically similar clusters of entities along with sets of “outlier” elements, which may then be used to perform an intrinsic evaluation of word embeddings in the outlier detection task. We used our methodology to create a gold-standard dataset, which we call WikiSem500, and evaluated multiple state-of-the-art embeddings. The results show a correlation between performance on this dataset and performance on sentiment analysis.
Tasks Outlier Detection, Sentiment Analysis, Word Embeddings
Published 2016-11-04
URL http://arxiv.org/abs/1611.01547v5
PDF http://arxiv.org/pdf/1611.01547v5.pdf
PWC https://paperswithcode.com/paper/automated-generation-of-multilingual-clusters
Repo https://github.com/belph/wiki-sem-500
Framework none
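
An outlier detection test of this kind can be scored by checking whether an embedding ranks the planted outlier as the least similar element of the group. The sketch below uses mean cosine similarity to the other elements as the ranking rule. That rule, the function names and the toy vectors are assumptions for illustration, not details taken from WikiSem500.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predicted_outlier(words, vectors):
    """Return the word whose mean similarity to the rest is lowest."""
    def compactness(w):
        others = [o for o in words if o != w]
        return sum(cosine(vectors[w], vectors[o]) for o in others) / len(others)
    return min(words, key=compactness)


# Toy 2-d embeddings (assumed values): three fruits and one planted outlier.
vectors = {
    "apple": [1.0, 0.1],
    "pear": [0.9, 0.2],
    "plum": [1.0, 0.0],
    "car": [0.0, 1.0],
}
print(predicted_outlier(list(vectors), vectors))
```

Accuracy over many such groups gives a single intrinsic score per embedding.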

Word2Vec vs DBnary: Augmenting METEOR using Vector Representations or Lexical Resources?

Title Word2Vec vs DBnary: Augmenting METEOR using Vector Representations or Lexical Resources?
Authors Christophe Servan, Alexandre Berard, Zied Elloumi, Hervé Blanchon, Laurent Besacier
Abstract This paper presents an approach combining lexico-semantic resources and distributed representations of words applied to the evaluation in machine translation (MT). This study is made through the enrichment of a well-known MT evaluation metric: METEOR. This metric enables an approximate match (synonymy or morphological similarity) between an automatic and a reference translation. Our experiments are made in the framework of the Metrics task of WMT 2014. We show that distributed representations are a good alternative to lexico-semantic resources for MT evaluation and they can even bring interesting additional information. The augmented versions of METEOR, using vector representations, are made available on our Github page.
Tasks Machine Translation
Published 2016-10-05
URL http://arxiv.org/abs/1610.01291v1
PDF http://arxiv.org/pdf/1610.01291v1.pdf
PWC https://paperswithcode.com/paper/word2vec-vs-dbnary-augmenting-meteor-using
Repo https://github.com/cservan/METEOR-E
Framework none
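
One way to picture an embedding-based approximate match is to accept a hypothesis word when it is identical to the reference word or when the two word vectors are close enough. The sketch below is a hypothetical matcher and not the METEOR-E implementation: the cosine criterion, the `threshold` value and the toy vectors are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def words_match(hyp_word, ref_word, vectors, threshold=0.8):
    """Exact match, or embedding similarity at or above an assumed threshold."""
    if hyp_word == ref_word:
        return True
    if hyp_word not in vectors or ref_word not in vectors:
        return False  # no vector available, so fall back to exact match only
    return cosine(vectors[hyp_word], vectors[ref_word]) >= threshold


# Toy 2-d embeddings (assumed values).
vectors = {"big": [1.0, 0.1], "large": [0.9, 0.2], "cat": [0.0, 1.0]}
print(words_match("big", "large", vectors), words_match("big", "cat", vectors))
```

A lexical resource would play the same role by replacing the vector lookup with a synonym lookup.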

Fully Convolutional Instance-aware Semantic Segmentation

Title Fully Convolutional Instance-aware Semantic Segmentation
Authors Yi Li, Haozhi Qi, Jifeng Dai, Xiangyang Ji, Yichen Wei
Abstract We present the first fully convolutional end-to-end solution for the instance-aware semantic segmentation task. It inherits all the merits of FCNs for semantic segmentation and instance mask proposal. It performs instance mask prediction and classification jointly. The underlying convolutional representation is fully shared between the two sub-tasks, as well as between all regions of interest. The proposed network is highly integrated and achieves state-of-the-art performance in both accuracy and efficiency. It wins the COCO 2016 segmentation competition by a large margin. Code will be released at https://github.com/daijifeng001/TA-FCN.
Tasks Semantic Segmentation
Published 2016-11-23
URL http://arxiv.org/abs/1611.07709v2
PDF http://arxiv.org/pdf/1611.07709v2.pdf
PWC https://paperswithcode.com/paper/fully-convolutional-instance-aware-semantic
Repo https://github.com/daijifeng001/TA-FCN
Framework mxnet

Generalized Deep Image to Image Regression

Title Generalized Deep Image to Image Regression
Authors Venkataraman Santhanam, Vlad I. Morariu, Larry S. Davis
Abstract We present a Deep Convolutional Neural Network architecture which serves as a generic image-to-image regressor that can be trained end-to-end without any further machinery. Our proposed architecture, the Recursively Branched Deconvolutional Network (RBDN), develops a cheap multi-context image representation very early on using an efficient recursive branching scheme with extensive parameter sharing and learnable upsampling. This multi-context representation is subjected to a highly non-linear locality preserving transformation by the remainder of our network, comprising a series of convolutions/deconvolutions without any spatial downsampling. The RBDN architecture is fully convolutional and can handle variable sized images during inference. We provide qualitative/quantitative results on three diverse tasks: relighting, denoising and colorization, and show that our proposed RBDN architecture obtains comparable results to the state-of-the-art on each of these tasks when used off-the-shelf without any post processing or task-specific architectural modifications.
Tasks Colorization, Denoising
Published 2016-12-10
URL http://arxiv.org/abs/1612.03268v1
PDF http://arxiv.org/pdf/1612.03268v1.pdf
PWC https://paperswithcode.com/paper/generalized-deep-image-to-image-regression
Repo https://github.com/venkai/RBDN
Framework none