May 7, 2019


Paper Group AWR 73


Deep neural networks can be improved using human-derived contextual expectations


Title	Deep neural networks can be improved using human-derived contextual expectations
Authors	Harish Katti, Marius V. Peelen, S. P. Arun
Abstract	Real-world objects occur in specific contexts. Such context has been shown to facilitate detection by constraining the locations to search. But can context directly benefit object detection? To do so, context needs to be learned independently from target features. This is impossible in traditional object detection where classifiers are trained on images containing both target features and surrounding context. In contrast, humans can learn context and target features separately, such as when we see highways without cars. Here we show for the first time that human-derived scene expectations can be used to improve object detection performance in machines. To measure contextual expectations, we asked human subjects to indicate the scale, location and likelihood at which cars or people might occur in scenes without these objects. Humans showed highly systematic expectations that we could accurately predict using scene features. This allowed us to predict human expectations on novel scenes without requiring manual annotation. On augmenting deep neural networks with predicted human expectations, we obtained substantial gains in accuracy for detecting cars and people (1-3%) as well as on detecting associated objects (3-20%). In contrast, augmenting deep networks with other conventional features yielded far smaller gains. This improvement was due to relatively poor matches at highly likely locations being correctly labelled as target and conversely strong matches at unlikely locations being correctly rejected as false alarms. Taken together, our results show that augmenting deep neural networks with human-derived context features improves their performance, suggesting that humans learn scene context separately unlike deep networks.
Tasks	Object Detection
Published	2016-11-22
URL	http://arxiv.org/abs/1611.07218v4
PDF	http://arxiv.org/pdf/1611.07218v4.pdf
PWC	https://paperswithcode.com/paper/deep-neural-networks-can-be-improved-using
Repo	https://github.com/harish2006/cntxt_likelihood
Framework	none

An Integrated Recommender Algorithm for Rating Prediction


Title	An Integrated Recommender Algorithm for Rating Prediction
Authors	Yefeng Ruan, Tzu-Chun Lin
Abstract	Recommender systems are widely used in e-commerce platforms such as Amazon and eBay. They aim to help users find items they may be interested in. In the literature, neighborhood-based collaborative filtering and matrix factorization are two common methods used in recommender systems. In this paper, we combine these two methods with personalized weights. Rather than using fixed weights for the two methods, we assume each user has her/his own preference over them. Our results show that our algorithm outperforms the neighborhood-based collaborative filtering algorithm, the matrix factorization algorithm, and their combination with fixed weights.
Tasks	Recommendation Systems
Published	2016-08-05
URL	https://arxiv.org/abs/1608.02021v1
PDF	https://arxiv.org/pdf/1608.02021v1.pdf
PWC	https://paperswithcode.com/paper/an-integrated-recommender-algorithm-for
Repo	https://github.com/sidooms/MovieTweetings
Framework	none
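
The per-user blending idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes the two base predictions are already computed, and the function names (`blend_predictions`, `fit_user_weights`) and the least-squares fitting rule are hypothetical choices made for this example.

```python
import numpy as np


def blend_predictions(r_knn, r_mf, alpha):
    """Blend two rating predictions with one weight per user.

    r_knn, r_mf: arrays of shape (n_users, n_items) with predicted ratings.
    alpha: array of shape (n_users,), weight in [0, 1] on the neighborhood model.
    """
    r_knn = np.asarray(r_knn, dtype=float)
    r_mf = np.asarray(r_mf, dtype=float)
    alpha = np.clip(np.asarray(alpha, dtype=float), 0.0, 1.0)[:, None]
    return alpha * r_knn + (1.0 - alpha) * r_mf


def fit_user_weights(r_knn, r_mf, r_true, observed):
    """Pick each user's weight by least squares on that user's observed ratings.

    Hypothetical fitting rule for illustration: minimizes, per user,
    sum over observed items of (alpha * r_knn + (1 - alpha) * r_mf - r_true)^2.
    """
    r_knn = np.asarray(r_knn, dtype=float)
    r_mf = np.asarray(r_mf, dtype=float)
    r_true = np.asarray(r_true, dtype=float)
    observed = np.asarray(observed, dtype=bool)

    diff = (r_knn - r_mf) * observed          # how much the two models disagree
    target = (r_true - r_mf) * observed       # what the blend has to explain
    num = (diff * target).sum(axis=1)
    den = (diff * diff).sum(axis=1)
    # Users for whom both models agree everywhere get a neutral weight of 0.5.
    alpha = np.where(den > 0, num / np.where(den > 0, den, 1.0), 0.5)
    return np.clip(alpha, 0.0, 1.0)
```

A fixed-weight combination is the special case where every entry of `alpha` is the same value.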

Kronecker Determinantal Point Processes


Title	Kronecker Determinantal Point Processes
Authors	Zelda Mariet, Suvrit Sra
Abstract	Determinantal Point Processes (DPPs) are probabilistic models over all subsets of a ground set of $N$ items. They have recently gained prominence in several applications that rely on “diverse” subsets. However, their applicability to large problems is still limited due to the $\mathcal O(N^3)$ complexity of core tasks such as sampling and learning. We enable efficient sampling and learning for DPPs by introducing KronDPP, a DPP model whose kernel matrix decomposes as a tensor product of multiple smaller kernel matrices. This decomposition immediately enables fast exact sampling. But contrary to what one may expect, leveraging the Kronecker product structure for speeding up DPP learning turns out to be more difficult. We overcome this challenge, and derive batch and stochastic optimization algorithms for efficiently learning the parameters of a KronDPP.
Tasks	Point Processes, Stochastic Optimization
Published	2016-05-26
URL	http://arxiv.org/abs/1605.08374v1
PDF	http://arxiv.org/pdf/1605.08374v1.pdf
PWC	https://paperswithcode.com/paper/kronecker-determinantal-point-processes
Repo	https://github.com/alshedivat/DeterminantalPointProcesses.jl
Framework	none
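
One reason a Kronecker-structured kernel helps is that its eigenvalues are the pairwise products of the factors' eigenvalues, so the large matrix never needs to be decomposed directly. The sketch below checks this numerically for two factors; it only illustrates that linear-algebra fact and is not the paper's sampling or learning algorithm. The helper names and matrix sizes are assumptions made for this example.

```python
import numpy as np


def random_psd(n, rng):
    """Random symmetric positive definite matrix (illustrative kernel factor)."""
    a = rng.standard_normal((n, n))
    return a @ a.T + 1e-3 * np.eye(n)


def kron_eigvals(l1, l2):
    """Eigenvalues of kron(l1, l2) computed from the two small factors only."""
    w1 = np.linalg.eigvalsh(l1)
    w2 = np.linalg.eigvalsh(l2)
    return np.sort(np.outer(w1, w2).ravel())


def log_normalizer(l1, l2):
    """log det(L + I) for L = kron(l1, l2), without forming L."""
    return float(np.sum(np.log1p(kron_eigvals(l1, l2))))


rng = np.random.default_rng(0)
l1, l2 = random_psd(3, rng), random_psd(4, rng)

# Reference values computed the expensive way, on the full 12 x 12 kernel.
full_kernel = np.kron(l1, l2)
full_eigvals = np.linalg.eigvalsh(full_kernel)
full_logdet = np.linalg.slogdet(full_kernel + np.eye(12))[1]
```

Here the factors cost $O(N_1^3 + N_2^3)$ to decompose, versus $O((N_1 N_2)^3)$ for the full kernel.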

FusionNet: A deep fully residual convolutional neural network for image segmentation in connectomics


Title	FusionNet: A deep fully residual convolutional neural network for image segmentation in connectomics
Authors	Tran Minh Quan, David G. C. Hildebrand, Won-Ki Jeong
Abstract	Electron microscopic connectomics is an ambitious research direction with the goal of studying comprehensive brain connectivity maps by using high-throughput, nano-scale microscopy. One of the main challenges in connectomics research is developing scalable image analysis algorithms that require minimal user intervention. Recently, deep learning has drawn much attention in computer vision because of its exceptional performance in image classification tasks. For this reason, its application to connectomic analyses holds great promise, as well. In this paper, we introduce a novel deep neural network architecture, FusionNet, for the automatic segmentation of neuronal structures in connectomics data. FusionNet leverages the latest advances in machine learning, such as semantic segmentation and residual neural networks, with the novel introduction of summation-based skip connections to allow a much deeper network architecture for a more accurate segmentation. We demonstrate the performance of the proposed method by comparing it with state-of-the-art electron microscopy (EM) segmentation methods from the ISBI EM segmentation challenge. We also show the segmentation results on two different tasks including cell membrane and cell body segmentation and a statistical analysis of cell morphology.
Tasks	Brain Image Segmentation, Image Classification, Semantic Segmentation
Published	2016-12-16
URL	http://arxiv.org/abs/1612.05360v2
PDF	http://arxiv.org/pdf/1612.05360v2.pdf
PWC	https://paperswithcode.com/paper/fusionnet-a-deep-fully-residual-convolutional
Repo	https://github.com/Jeongseungwoo/Fusion-net
Framework	tf

$\ell_1$ Adaptive Trend Filter via Fast Coordinate Descent


Title	$\ell_1$ Adaptive Trend Filter via Fast Coordinate Descent
Authors	Mario Souto, Joaquim D. Garcia, Gustavo C. Amaral
Abstract	Identifying the unknown underlying trend of a given noisy signal is extremely useful for a wide range of applications. The number of potential trends might be exponential, which can be computationally exhaustive even for short signals. Another challenge is the presence of abrupt changes and outliers at unknown times, which impart useful information regarding the signal’s characteristics. In this paper, we present the $\ell_1$ Adaptive Trend Filter, which can consistently identify the components in the underlying trend and multiple level-shifts, even in the presence of outliers. Additionally, an enhanced coordinate descent algorithm which exploits the filter design is presented. Some implementation details are discussed and a version in the Julia language is presented along with two distinct applications to illustrate the filter’s potential.
Tasks
Published	2016-03-11
URL	https://arxiv.org/abs/1603.03799v2
PDF	https://arxiv.org/pdf/1603.03799v2.pdf
PWC	https://paperswithcode.com/paper/ell_1-adaptive-trend-filter-via-fast
Repo	https://github.com/joaquimg/L1AdaptiveTrendFilter.jl
Framework	none
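
For readers unfamiliar with coordinate descent on $\ell_1$-penalized problems, the sketch below shows the generic textbook version: each coefficient is updated in turn with a soft-thresholding step. It is not the enhanced algorithm from the paper (which exploits the filter's structure and is implemented in Julia); the function names and the plain design matrix `X` are assumptions for illustration. In a trend-filtering setting the columns of `X` would be candidate components such as step functions for level shifts.

```python
import numpy as np


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become zero."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    b = np.zeros(n_features)
    col_sq = (X ** 2).sum(axis=0)   # squared norm of each column
    residual = y - X @ b
    for _ in range(n_iter):
        for j in range(n_features):
            if col_sq[j] == 0.0:
                continue            # an all-zero column cannot explain anything
            residual += X[:, j] * b[j]          # remove coordinate j's contribution
            rho = X[:, j] @ residual            # correlation with the partial residual
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            residual -= X[:, j] * b[j]          # add the updated contribution back
    return b
```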

Hierarchical Memory Networks for Answer Selection on Unknown Words


Title	Hierarchical Memory Networks for Answer Selection on Unknown Words
Authors	Jiaming Xu, Jing Shi, Yiqun Yao, Suncong Zheng, Bo Xu, Bo Xu
Abstract	Recently, end-to-end memory networks have shown promising results on the Question Answering task; they encode past facts into an explicit memory and perform reasoning by making multiple computational steps on the memory. However, memory networks conduct the reasoning on sentence-level memory to output coarse semantic vectors and do not use any attention mechanism to focus on words, which may cause the model to lose some detailed information, especially when the answers are rare or unknown words. In this paper, we propose a novel Hierarchical Memory Network, dubbed HMN. First, we encode the past facts into sentence-level memory and word-level memory respectively. Then, $k$-max pooling is applied after the reasoning module on the sentence-level memory to sample the $k$ most relevant sentences to a question, and these sentences are fed into an attention mechanism on the word-level memory to focus on the words in the selected sentences. Finally, the prediction is jointly learned over the outputs of the sentence-level reasoning module and the word-level attention mechanism. The experimental results demonstrate that our approach successfully conducts answer selection on unknown words and achieves better performance than memory networks.
Tasks	Answer Selection, Question Answering
Published	2016-09-28
URL	http://arxiv.org/abs/1609.08843v1
PDF	http://arxiv.org/pdf/1609.08843v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-memory-networks-for-answer
Repo	https://github.com/jacoxu/HMN4QA
Framework	none
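
The two-stage selection described in the abstract above (pick the $k$ most relevant sentences, then attend over their words) can be sketched as below. This is a simplified illustration that assumes relevance scores are already available as plain numbers; the function names are hypothetical and the learned encoders and reasoning module from the paper are not modeled.

```python
import numpy as np


def k_max_sentences(sentence_scores, k):
    """Return the indices of the k highest-scoring sentences, in original order."""
    scores = np.asarray(sentence_scores, dtype=float)
    k = max(0, min(k, scores.size))
    top = np.argsort(-scores, kind="stable")[:k]
    return np.sort(top)


def word_attention(word_scores):
    """Softmax over word-level scores, giving attention weights that sum to 1."""
    s = np.asarray(word_scores, dtype=float)
    e = np.exp(s - s.max())     # subtract the max for numerical stability
    return e / e.sum()
```

For example, with sentence scores `[0.1, 0.7, 0.05, 0.9, 0.3]` and `k=2`, sentences 1 and 3 are kept, and only their words would be passed to the attention step.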

UnrealCV: Connecting Computer Vision to Unreal Engine


Title	UnrealCV: Connecting Computer Vision to Unreal Engine
Authors	Weichao Qiu, Alan Yuille
Abstract	Computer graphics can not only generate synthetic images and ground truth but it also offers the possibility of constructing virtual worlds in which: (i) an agent can perceive, navigate, and take actions guided by AI algorithms, (ii) properties of the worlds can be modified (e.g., material and reflectance), (iii) physical simulations can be performed, and (iv) algorithms can be learnt and evaluated. But creating realistic virtual worlds is not easy. The game industry, however, has spent a lot of effort creating 3D worlds, which a player can interact with. So researchers can build on these resources to create virtual worlds, provided we can access and modify the internal data structures of the games. To enable this we created an open-source plugin UnrealCV (http://unrealcv.github.io) for a popular game engine Unreal Engine 4 (UE4). We show two applications: (i) a proof of concept image dataset, and (ii) linking Caffe with the virtual world to test deep network algorithms.
Tasks	Physical Simulations
Published	2016-09-05
URL	http://arxiv.org/abs/1609.01326v1
PDF	http://arxiv.org/pdf/1609.01326v1.pdf
PWC	https://paperswithcode.com/paper/unrealcv-connecting-computer-vision-to-unreal
Repo	https://github.com/unrealcv/unrealcv
Framework	none

SSHMT: Semi-supervised Hierarchical Merge Tree for Electron Microscopy Image Segmentation


Title	SSHMT: Semi-supervised Hierarchical Merge Tree for Electron Microscopy Image Segmentation
Authors	Ting Liu, Miaomiao Zhang, Mehran Javanmardi, Nisha Ramesh, Tolga Tasdizen
Abstract	Region-based methods have proven necessary for improving segmentation accuracy of neuronal structures in electron microscopy (EM) images. Most region-based segmentation methods use a scoring function to determine region merging. Such functions are usually learned with supervised algorithms that demand considerable ground truth data, which are costly to collect. We propose a semi-supervised approach that reduces this demand. Based on a merge tree structure, we develop a differentiable unsupervised loss term that enforces consistent predictions from the learned function. We then propose a Bayesian model that combines the supervised and the unsupervised information for probabilistic learning. The experimental results on three EM data sets demonstrate that by using a subset of only 3% to 7% of the entire ground truth data, our approach consistently performs close to the state-of-the-art supervised method with the full labeled data set, and significantly outperforms the supervised method with the same labeled subset.
Tasks	Electron Microscopy Image Segmentation, Semantic Segmentation
Published	2016-08-14
URL	http://arxiv.org/abs/1608.04051v1
PDF	http://arxiv.org/pdf/1608.04051v1.pdf
PWC	https://paperswithcode.com/paper/sshmt-semi-supervised-hierarchical-merge-tree
Repo	https://github.com/tingliu/glia
Framework	none

DeepWriter: A Multi-Stream Deep CNN for Text-independent Writer Identification


Title	DeepWriter: A Multi-Stream Deep CNN for Text-independent Writer Identification
Authors	Linjie Xing, Yu Qiao
Abstract	Text-independent writer identification is challenging due to the huge variation of written contents and the ambiguous writing styles of different writers. This paper proposes DeepWriter, a deep multi-stream CNN to learn powerful deep representations for recognizing writers. DeepWriter takes local handwritten patches as input and is trained with softmax classification loss. The main contributions are: 1) we design and optimize a multi-stream structure for the writer identification task; 2) we introduce data augmentation learning to enhance the performance of DeepWriter; 3) we introduce a patch scanning strategy to handle text images of different lengths. In addition, we find that different languages such as English and Chinese may share common features for writer identification, and joint training can yield better performance. Experimental results on IAM and HWDB datasets show that our models achieve high identification accuracy: 99.01% on 301 writers and 97.03% on 657 writers with one English sentence input, 93.85% on 300 writers with one Chinese character input, which outperform previous methods by a large margin. Moreover, our models obtain accuracy of 98.01% on 301 writers with only 4 English letters as input.
Tasks	Data Augmentation
Published	2016-06-21
URL	http://arxiv.org/abs/1606.06472v2
PDF	http://arxiv.org/pdf/1606.06472v2.pdf
PWC	https://paperswithcode.com/paper/deepwriter-a-multi-stream-deep-cnn-for-text
Repo	https://github.com/Nihhaar/Connected-Handwriting-Recognition
Framework	none

CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning


Title	CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Authors	Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick
Abstract	When building artificial intelligence systems that can reason and answer questions about visual data, we need diagnostic tests to analyze our progress and discover shortcomings. Existing benchmarks for visual question answering can help, but have strong biases that models can exploit to correctly answer questions without reasoning. They also conflate multiple sources of error, making it hard to pinpoint model weaknesses. We present a diagnostic dataset that tests a range of visual reasoning abilities. It contains minimal biases and has detailed annotations describing the kind of reasoning each question requires. We use this dataset to analyze a variety of modern visual reasoning systems, providing novel insights into their abilities and limitations.
Tasks	Question Answering, Visual Question Answering, Visual Reasoning
Published	2016-12-20
URL	http://arxiv.org/abs/1612.06890v1
PDF	http://arxiv.org/pdf/1612.06890v1.pdf
PWC	https://paperswithcode.com/paper/clevr-a-diagnostic-dataset-for-compositional
Repo	https://github.com/ethanjperez/film
Framework	pytorch

Binarized Neural Networks


Title	Binarized Neural Networks
Authors	Itay Hubara, Daniel Soudry, Ran El Yaniv
Abstract	We introduce a method to train Binarized Neural Networks (BNNs) - neural networks with binary weights and activations at run-time and when computing the parameters’ gradient at train-time. We conduct two sets of experiments, each based on a different framework, namely Torch7 and Theano, where we train BNNs on MNIST, CIFAR-10 and SVHN, and achieve nearly state-of-the-art results. During the forward pass, BNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations, which might lead to a great increase in power-efficiency. Last but not least, we wrote a binary matrix multiplication GPU kernel with which it is possible to run our MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy. The code for training and running our BNNs is available.
Tasks
Published	2016-02-08
URL	http://arxiv.org/abs/1602.02505v3
PDF	http://arxiv.org/pdf/1602.02505v3.pdf
PWC	https://paperswithcode.com/paper/binarized-neural-networks
Repo	https://github.com/ryuz/BinaryBrain
Framework	none
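
To make the "bit-wise operations" claim in the abstract above concrete, the sketch below shows how a dot product of two ±1 vectors reduces to XNOR plus a population count. It assumes deterministic sign binarization and uses plain Python integers as bit vectors; the function names are hypothetical and this is not the paper's GPU kernel.

```python
import numpy as np


def binarize(x):
    """Map real values to +1 / -1 with the sign function (0 maps to +1)."""
    return np.where(np.asarray(x) >= 0, 1, -1).astype(np.int8)


def pack_bits(b):
    """Pack a +1/-1 vector into a Python int; bit i is set when b[i] == +1."""
    bits = 0
    for i, v in enumerate(b):
        if v > 0:
            bits |= 1 << i
    return bits


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks the positions where the signs agree; each agreement adds +1
    and each disagreement adds -1, so the result is 2 * agreements - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask
    return 2 * bin(agree).count("1") - n


a = binarize([0.3, -1.2, 0.0, 2.0, -0.1])
b = binarize([-0.5, -0.7, 1.0, 0.2, 0.4])
reference = int(np.dot(a.astype(int), b.astype(int)))
bitwise = xnor_popcount_dot(pack_bits(a), pack_bits(b), len(a))
```

Both `reference` and `bitwise` hold the same value, but the second is computed without any multiplications.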

Automated Generation of Multilingual Clusters for the Evaluation of Distributed Representations


Title	Automated Generation of Multilingual Clusters for the Evaluation of Distributed Representations
Authors	Philip Blair, Yuval Merhav, Joel Barry
Abstract	We propose a language-agnostic way of automatically generating sets of semantically similar clusters of entities along with sets of “outlier” elements, which may then be used to perform an intrinsic evaluation of word embeddings in the outlier detection task. We used our methodology to create a gold-standard dataset, which we call WikiSem500, and evaluated multiple state-of-the-art embeddings. The results show a correlation between performance on this dataset and performance on sentiment analysis.
Tasks	Outlier Detection, Sentiment Analysis, Word Embeddings
Published	2016-11-04
URL	http://arxiv.org/abs/1611.01547v5
PDF	http://arxiv.org/pdf/1611.01547v5.pdf
PWC	https://paperswithcode.com/paper/automated-generation-of-multilingual-clusters
Repo	https://github.com/belph/wiki-sem-500
Framework	none
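
The outlier detection task mentioned in the abstract above can be illustrated with a simple scoring rule: given embeddings for a cluster plus one intruder, flag the item whose average cosine similarity to the others is lowest. This is a minimal sketch of one common way to score the task, not necessarily the exact metric used with WikiSem500; the function name is hypothetical and the input vectors are assumed to be nonzero.

```python
import numpy as np


def detect_outlier(vectors):
    """Return the index of the item least similar, on average, to the rest.

    vectors: array of shape (n_items, dim) with one nonzero embedding per item.
    """
    v = np.asarray(vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)   # unit-normalize each row
    sim = v @ v.T                                      # pairwise cosine similarity
    np.fill_diagonal(sim, 0.0)                         # ignore self-similarity
    mean_sim = sim.sum(axis=1) / (len(v) - 1)
    return int(np.argmin(mean_sim))
```

An embedding model does well on this task when the flagged index matches the known intruder across many clusters.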

Word2Vec vs DBnary: Augmenting METEOR using Vector Representations or Lexical Resources?


Title	Word2Vec vs DBnary: Augmenting METEOR using Vector Representations or Lexical Resources?
Authors	Christophe Servan, Alexandre Berard, Zied Elloumi, Hervé Blanchon, Laurent Besacier
Abstract	This paper presents an approach combining lexico-semantic resources and distributed representations of words applied to the evaluation in machine translation (MT). This study is made through the enrichment of a well-known MT evaluation metric: METEOR. This metric enables an approximate match (synonymy or morphological similarity) between an automatic and a reference translation. Our experiments are made in the framework of the Metrics task of WMT 2014. We show that distributed representations are a good alternative to lexico-semantic resources for MT evaluation and they can even bring interesting additional information. The augmented versions of METEOR, using vector representations, are made available on our Github page.
Tasks	Machine Translation
Published	2016-10-05
URL	http://arxiv.org/abs/1610.01291v1
PDF	http://arxiv.org/pdf/1610.01291v1.pdf
PWC	https://paperswithcode.com/paper/word2vec-vs-dbnary-augmenting-meteor-using
Repo	https://github.com/cservan/METEOR-E
Framework	none

Fully Convolutional Instance-aware Semantic Segmentation


Title	Fully Convolutional Instance-aware Semantic Segmentation
Authors	Yi Li, Haozhi Qi, Jifeng Dai, Xiangyang Ji, Yichen Wei
Abstract	We present the first fully convolutional end-to-end solution for the instance-aware semantic segmentation task. It inherits all the merits of FCNs for semantic segmentation and instance mask proposal. It performs instance mask prediction and classification jointly. The underlying convolutional representation is fully shared between the two sub-tasks, as well as between all regions of interest. The proposed network is highly integrated and achieves state-of-the-art performance in both accuracy and efficiency. It wins the COCO 2016 segmentation competition by a large margin. Code will be released at https://github.com/daijifeng001/TA-FCN.
Tasks	Semantic Segmentation
Published	2016-11-23
URL	http://arxiv.org/abs/1611.07709v2
PDF	http://arxiv.org/pdf/1611.07709v2.pdf
PWC	https://paperswithcode.com/paper/fully-convolutional-instance-aware-semantic
Repo	https://github.com/daijifeng001/TA-FCN
Framework	mxnet

Generalized Deep Image to Image Regression


Title	Generalized Deep Image to Image Regression
Authors	Venkataraman Santhanam, Vlad I. Morariu, Larry S. Davis
Abstract	We present a Deep Convolutional Neural Network architecture which serves as a generic image-to-image regressor that can be trained end-to-end without any further machinery. Our proposed architecture, the Recursively Branched Deconvolutional Network (RBDN), develops a cheap multi-context image representation very early on using an efficient recursive branching scheme with extensive parameter sharing and learnable upsampling. This multi-context representation is subjected to a highly non-linear locality preserving transformation by the remainder of our network, comprising a series of convolutions/deconvolutions without any spatial downsampling. The RBDN architecture is fully convolutional and can handle variable sized images during inference. We provide qualitative/quantitative results on three diverse tasks (relighting, denoising and colorization) and show that our proposed RBDN architecture obtains comparable results to the state-of-the-art on each of these tasks when used off-the-shelf without any post processing or task-specific architectural modifications.
Tasks	Colorization, Denoising
Published	2016-12-10
URL	http://arxiv.org/abs/1612.03268v1
PDF	http://arxiv.org/pdf/1612.03268v1.pdf
PWC	https://paperswithcode.com/paper/generalized-deep-image-to-image-regression
Repo	https://github.com/venkai/RBDN
Framework	none