Paper Group AWR 73
Deep neural networks can be improved using human-derived contextual expectations. An Integrated Recommender Algorithm for Rating Prediction. Kronecker Determinantal Point Processes. FusionNet: A deep fully residual convolutional neural network for image segmentation in connectomics. $\ell_1$ Adaptive Trend Filter via Fast Coordinate Descent. Hierarchical Memory Networks for Answer Selection on Unknown Words. UnrealCV: Connecting Computer Vision to Unreal Engine. SSHMT: Semi-supervised Hierarchical Merge Tree for Electron Microscopy Image Segmentation. DeepWriter: A Multi-Stream Deep CNN for Text-independent Writer Identification. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning. Binarized Neural Networks. Automated Generation of Multilingual Clusters for the Evaluation of Distributed Representations. Word2Vec vs DBnary: Augmenting METEOR using Vector Representations or Lexical Resources? Fully Convolutional Instance-aware Semantic Segmentation. Generalized Deep Image to Image Regression.
Deep neural networks can be improved using human-derived contextual expectations
Title | Deep neural networks can be improved using human-derived contextual expectations |
Authors | Harish Katti, Marius V. Peelen, S. P. Arun |
Abstract | Real-world objects occur in specific contexts. Such context has been shown to facilitate detection by constraining the locations to search. But can context directly benefit object detection? To do so, context needs to be learned independently from target features. This is impossible in traditional object detection where classifiers are trained on images containing both target features and surrounding context. In contrast, humans can learn context and target features separately, such as when we see highways without cars. Here we show for the first time that human-derived scene expectations can be used to improve object detection performance in machines. To measure contextual expectations, we asked human subjects to indicate the scale, location and likelihood at which cars or people might occur in scenes without these objects. Humans showed highly systematic expectations that we could accurately predict using scene features. This allowed us to predict human expectations on novel scenes without requiring manual annotation. On augmenting deep neural networks with predicted human expectations, we obtained substantial gains in accuracy for detecting cars and people (1-3%) as well as on detecting associated objects (3-20%). In contrast, augmenting deep networks with other conventional features yielded far smaller gains. This improvement was due to relatively poor matches at highly likely locations being correctly labelled as target and conversely strong matches at unlikely locations being correctly rejected as false alarms. Taken together, our results show that augmenting deep neural networks with human-derived context features improves their performance, suggesting that humans learn scene context separately unlike deep networks. |
Tasks | Object Detection |
Published | 2016-11-22 |
URL | http://arxiv.org/abs/1611.07218v4 |
http://arxiv.org/pdf/1611.07218v4.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-networks-can-be-improved-using |
Repo | https://github.com/harish2006/cntxt_likelihood |
Framework | none |
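The core idea above is a re-scoring step: detector outputs at contextually likely locations are boosted, while strong detections at unlikely locations are suppressed. The snippet below is a minimal sketch of that idea, assuming a pixel-wise context-likelihood map and a simple linear fusion rule (the `alpha` blend is an illustrative assumption, not the paper's exact formulation).

```python
import numpy as np

def rescore_detections(detections, context_map, alpha=0.5):
    """Re-score detector outputs with a predicted context-likelihood map.

    detections  : list of (x, y, w, h, score) boxes from any detector
    context_map : 2-D array in [0, 1], predicted likelihood that the target
                  occurs at each pixel (e.g. regressed from scene features)
    alpha       : assumed mixing weight between detector score and context
    """
    rescored = []
    for (x, y, w, h, score) in detections:
        cx, cy = int(x + w / 2), int(y + h / 2)              # box centre
        likelihood = context_map[min(cy, context_map.shape[0] - 1),
                                 min(cx, context_map.shape[1] - 1)]
        new_score = (1 - alpha) * score + alpha * likelihood  # linear fusion
        rescored.append((x, y, w, h, new_score))
    return rescored
```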
An Integrated Recommender Algorithm for Rating Prediction
Title | An Integrated Recommender Algorithm for Rating Prediction |
Authors | Yefeng Ruan, Tzu-Chun Lin |
Abstract | Recommender systems are currently widely used in many e-commerce platforms, such as Amazon and eBay. They aim to help users find items they may be interested in. In the literature, neighborhood-based collaborative filtering and matrix factorization are two common methods used in recommender systems. In this paper, we combine these two methods with personalized weights. Rather than using fixed weights for the two methods, we assume each user has her/his own preference over them. Our results show that our algorithm outperforms the neighborhood-based collaborative filtering algorithm, the matrix factorization algorithm, and their combination with fixed weights. |
Tasks | Recommendation Systems |
Published | 2016-08-05 |
URL | https://arxiv.org/abs/1608.02021v1 |
https://arxiv.org/pdf/1608.02021v1.pdf | |
PWC | https://paperswithcode.com/paper/an-integrated-recommender-algorithm-for |
Repo | https://github.com/sidooms/MovieTweetings |
Framework | none |
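A minimal sketch of the personalized-weight combination described above: each user gets one mixing weight between a neighborhood CF prediction and a matrix-factorization prediction, learned by gradient descent on squared error. The objective, learning rate and regularizer are illustrative assumptions, not the paper's exact training procedure.

```python
import numpy as np

def fit_personal_weights(ratings, cf_pred, mf_pred, lr=0.01, reg=0.05, epochs=20):
    """Learn one blending weight per user between two rating predictors.

    ratings : list of (user, item, rating) training triples
    cf_pred, mf_pred : callables (user, item) -> predicted rating
    """
    users = {u for u, _, _ in ratings}
    w = {u: 0.5 for u in users}                      # start with an even blend
    for _ in range(epochs):
        for u, i, r in ratings:
            p_cf, p_mf = cf_pred(u, i), mf_pred(u, i)
            pred = w[u] * p_cf + (1 - w[u]) * p_mf
            err = pred - r
            # gradient of 0.5*err^2 + 0.5*reg*(w-0.5)^2 with respect to w[u]
            w[u] -= lr * (err * (p_cf - p_mf) + reg * (w[u] - 0.5))
            w[u] = float(np.clip(w[u], 0.0, 1.0))    # keep weights in [0, 1]
    return w
```

Prediction for a test pair (u, i) is then simply `w[u] * cf_pred(u, i) + (1 - w[u]) * mf_pred(u, i)`.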
Kronecker Determinantal Point Processes
Title | Kronecker Determinantal Point Processes |
Authors | Zelda Mariet, Suvrit Sra |
Abstract | Determinantal Point Processes (DPPs) are probabilistic models over all subsets of a ground set of $N$ items. They have recently gained prominence in several applications that rely on “diverse” subsets. However, their applicability to large problems is still limited due to the $\mathcal O(N^3)$ complexity of core tasks such as sampling and learning. We enable efficient sampling and learning for DPPs by introducing KronDPP, a DPP model whose kernel matrix decomposes as a tensor product of multiple smaller kernel matrices. This decomposition immediately enables fast exact sampling. But contrary to what one may expect, leveraging the Kronecker product structure for speeding up DPP learning turns out to be more difficult. We overcome this challenge, and derive batch and stochastic optimization algorithms for efficiently learning the parameters of a KronDPP. |
Tasks | Point Processes, Stochastic Optimization |
Published | 2016-05-26 |
URL | http://arxiv.org/abs/1605.08374v1 |
http://arxiv.org/pdf/1605.08374v1.pdf | |
PWC | https://paperswithcode.com/paper/kronecker-determinantal-point-processes |
Repo | https://github.com/alshedivat/DeterminantalPointProcesses.jl |
Framework | none |
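The property that makes Kronecker-structured kernels cheap to sample from is that the eigendecomposition of $L = L_1 \otimes L_2$ is obtained from the (much smaller) decompositions of the factors. The sketch below illustrates just that step plus the first phase of standard DPP sampling; it is not the paper's full algorithm.

```python
import numpy as np

def krondpp_eigen(L1, L2):
    """Eigendecomposition of a Kronecker-structured DPP kernel L = L1 (x) L2.

    Eigenvalues of a Kronecker product are all pairwise products of the
    factors' eigenvalues, and eigenvectors are Kronecker products of the
    factors' eigenvectors, so the O(N^3) decomposition of the full kernel
    is never formed.
    """
    lam1, V1 = np.linalg.eigh(L1)
    lam2, V2 = np.linalg.eigh(L2)
    eigvals = np.kron(lam1, lam2)            # N1*N2 eigenvalues from N1 + N2 work
    return eigvals, (V1, V2)                 # eigenvectors kept in factored form

def sample_dpp_support(eigvals, rng=None):
    """Phase 1 of standard DPP sampling: choose which eigenvectors are active.

    Each eigenvector enters independently with probability lambda/(1+lambda);
    phase 2 (sampling items from the chosen elementary DPP) is omitted here.
    """
    if rng is None:
        rng = np.random.default_rng()
    return np.flatnonzero(rng.random(eigvals.shape) < eigvals / (1.0 + eigvals))
```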
FusionNet: A deep fully residual convolutional neural network for image segmentation in connectomics
Title | FusionNet: A deep fully residual convolutional neural network for image segmentation in connectomics |
Authors | Tran Minh Quan, David G. C. Hildebrand, Won-Ki Jeong |
Abstract | Electron microscopic connectomics is an ambitious research direction with the goal of studying comprehensive brain connectivity maps by using high-throughput, nano-scale microscopy. One of the main challenges in connectomics research is developing scalable image analysis algorithms that require minimal user intervention. Recently, deep learning has drawn much attention in computer vision because of its exceptional performance in image classification tasks. For this reason, its application to connectomic analyses holds great promise, as well. In this paper, we introduce a novel deep neural network architecture, FusionNet, for the automatic segmentation of neuronal structures in connectomics data. FusionNet leverages the latest advances in machine learning, such as semantic segmentation and residual neural networks, with the novel introduction of summation-based skip connections to allow a much deeper network architecture for a more accurate segmentation. We demonstrate the performance of the proposed method by comparing it with state-of-the-art electron microscopy (EM) segmentation methods from the ISBI EM segmentation challenge. We also show the segmentation results on two different tasks including cell membrane and cell body segmentation and a statistical analysis of cell morphology. |
Tasks | Brain Image Segmentation, Image Classification, Semantic Segmentation |
Published | 2016-12-16 |
URL | http://arxiv.org/abs/1612.05360v2 |
http://arxiv.org/pdf/1612.05360v2.pdf | |
PWC | https://paperswithcode.com/paper/fusionnet-a-deep-fully-residual-convolutional |
Repo | https://github.com/Jeongseungwoo/Fusion-net |
Framework | tf |
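To make the summation-based skip connections concrete, here is a minimal PyTorch sketch: short residual skips inside each block, and a long encoder-to-decoder skip implemented as an element-wise sum rather than channel concatenation. The depth and channel counts are illustrative assumptions, not the published FusionNet configuration.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual building block with a short summation skip."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
        )
    def forward(self, x):
        return torch.relu(x + self.body(x))

class TinyFusionNet(nn.Module):
    """One encoder level, one decoder level, and a long summation skip."""
    def __init__(self, in_ch=1, ch=32, out_ch=1):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, ch, 3, padding=1), ResBlock(ch))
        self.down = nn.MaxPool2d(2)
        self.bottleneck = ResBlock(ch)
        self.up = nn.ConvTranspose2d(ch, ch, 2, stride=2)
        self.dec = ResBlock(ch)
        self.head = nn.Conv2d(ch, out_ch, 1)
    def forward(self, x):
        e = self.enc(x)
        b = self.bottleneck(self.down(e))
        d = self.up(b)
        d = self.dec(d + e)          # long skip by summation, not concatenation
        return self.head(d)
```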
$\ell_1$ Adaptive Trend Filter via Fast Coordinate Descent
Title | $\ell_1$ Adaptive Trend Filter via Fast Coordinate Descent |
Authors | Mario Souto, Joaquim D. Garcia, Gustavo C. Amaral |
Abstract | Identifying the unknown underlying trend of a given noisy signal is extremely useful for a wide range of applications. The number of potential trends can grow exponentially, making an exhaustive search computationally prohibitive even for short signals. Another challenge is the presence of abrupt changes and outliers at unknown times, which nevertheless carry valuable information about the signal's characteristics. In this paper, we present the $\ell_1$ Adaptive Trend Filter, which can consistently identify the components in the underlying trend and multiple level-shifts, even in the presence of outliers. Additionally, an enhanced coordinate descent algorithm which exploits the filter design is presented. Some implementation details are discussed and a version in the Julia language is presented along with two distinct applications to illustrate the filter's potential. |
Tasks | |
Published | 2016-03-11 |
URL | https://arxiv.org/abs/1603.03799v2 |
https://arxiv.org/pdf/1603.03799v2.pdf | |
PWC | https://paperswithcode.com/paper/ell_1-adaptive-trend-filter-via-fast |
Repo | https://github.com/joaquimg/L1AdaptiveTrendFilter.jl |
Framework | none |
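The filter is a lasso-type problem, so coordinate descent reduces to soft-thresholding updates. Below is a generic Python sketch of that idea with a design matrix of step (level-shift) and ramp (trend-change) columns; it illustrates the mechanism only and is not the paper's optimized Julia implementation.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def l1_trend_coord_descent(y, lam=1.0, n_iter=100):
    """Cyclic coordinate descent for min 0.5*||y - X b||^2 + lam*||b||_1,
    where X contains a level-shift and a trend-change column for every time point."""
    n = len(y)
    steps = np.tril(np.ones((n, n)))                       # column j: level shift at time j
    ramps = np.maximum(np.arange(n)[:, None] - np.arange(n)[None, :] + 1, 0.0)
    X = np.hstack([steps, ramps])                          # assumed basis, for illustration
    b = np.zeros(X.shape[1])
    col_sq = (X ** 2).sum(axis=0)
    r = y - X @ b                                          # residual
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            r += X[:, j] * b[j]                            # remove coordinate j from the fit
            b[j] = soft_threshold(X[:, j] @ r, lam) / col_sq[j]
            r -= X[:, j] * b[j]                            # add the updated coordinate back
    return X @ b, b                                        # fitted trend, sparse coefficients
```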
Hierarchical Memory Networks for Answer Selection on Unknown Words
Title | Hierarchical Memory Networks for Answer Selection on Unknown Words |
Authors | Jiaming Xu, Jing Shi, Yiqun Yao, Suncong Zheng, Bo Xu, Bo Xu |
Abstract | Recently, end-to-end memory networks have shown promising results on the Question Answering task; they encode past facts into an explicit memory and perform reasoning by making multiple computational steps over that memory. However, memory networks reason over sentence-level memory to produce coarse semantic vectors and do not apply any attention mechanism to focus on words, which may cause the model to lose detailed information, especially when the answers are rare or unknown words. In this paper, we propose a novel Hierarchical Memory Network, dubbed HMN. First, we encode past facts into a sentence-level memory and a word-level memory, respectively. Then, $k$-max pooling is applied after the reasoning module on the sentence-level memory to sample the $k$ sentences most relevant to a question, and these sentences are fed into an attention mechanism over the word-level memory to focus on the words in the selected sentences. Finally, the prediction is jointly learned over the outputs of the sentence-level reasoning module and the word-level attention mechanism. The experimental results demonstrate that our approach successfully conducts answer selection on unknown words and achieves better performance than memory networks. |
Tasks | Answer Selection, Question Answering |
Published | 2016-09-28 |
URL | http://arxiv.org/abs/1609.08843v1 |
http://arxiv.org/pdf/1609.08843v1.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-memory-networks-for-answer |
Repo | https://github.com/jacoxu/HMN4QA |
Framework | none |
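A small numpy sketch of the two-level read described above: score sentences against the question, keep the $k$ best ($k$-max pooling), then attend over the words of those sentences. How the two read vectors are combined for prediction is a modelling choice; the sum used here is an assumption.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def hierarchical_memory_read(question, sent_mem, word_mem, k=3):
    """question : (d,) question embedding
    sent_mem : (num_sents, d) sentence-level memory
    word_mem : list of (num_words_i, d) word-level memories, one per sentence
    """
    sent_scores = sent_mem @ question                    # sentence-level reasoning
    sent_read = softmax(sent_scores) @ sent_mem
    top_k = np.argsort(sent_scores)[-k:]                 # k-max pooling over sentences
    words = np.vstack([word_mem[i] for i in top_k])      # words of the k best sentences
    word_attn = softmax(words @ question)                # word-level attention
    word_read = word_attn @ words
    return sent_read + word_read                         # assumed combination rule
```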
UnrealCV: Connecting Computer Vision to Unreal Engine
Title | UnrealCV: Connecting Computer Vision to Unreal Engine |
Authors | Weichao Qiu, Alan Yuille |
Abstract | Computer graphics can not only generate synthetic images and ground truth, but also offers the possibility of constructing virtual worlds in which: (i) an agent can perceive, navigate, and take actions guided by AI algorithms, (ii) properties of the worlds can be modified (e.g., material and reflectance), (iii) physical simulations can be performed, and (iv) algorithms can be learnt and evaluated. But creating realistic virtual worlds is not easy. The game industry, however, has spent a lot of effort creating 3D worlds with which a player can interact. Researchers can therefore build on these resources to create virtual worlds, provided we can access and modify the internal data structures of the games. To enable this, we created UnrealCV (http://unrealcv.github.io), an open-source plugin for the popular game engine Unreal Engine 4 (UE4). We show two applications: (i) a proof-of-concept image dataset, and (ii) linking Caffe with the virtual world to test deep network algorithms. |
Tasks | Physical Simulations |
Published | 2016-09-05 |
URL | http://arxiv.org/abs/1609.01326v1 |
http://arxiv.org/pdf/1609.01326v1.pdf | |
PWC | https://paperswithcode.com/paper/unrealcv-connecting-computer-vision-to-unreal |
Repo | https://github.com/unrealcv/unrealcv |
Framework | none |
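A typical session with the Python client looks roughly like the sketch below. The command strings and client calls follow the UnrealCV documentation at the time of writing and may differ between plugin versions, so treat this as an assumed usage pattern rather than a guaranteed API.

```python
# Hedged sketch of a UnrealCV client session (API per the project docs).
from unrealcv import client

client.connect()                                   # connect to a running UE4 game
if not client.isconnected():
    raise RuntimeError('UnrealCV server is not running inside the game')

client.request('vset /camera/0/location 0 0 200')                  # move the camera (x y z)
lit_path = client.request('vget /camera/0/lit lit.png')            # RGB render
mask_path = client.request('vget /camera/0/object_mask mask.png')  # per-object ground truth
print('saved', lit_path, mask_path)

client.disconnect()
```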
SSHMT: Semi-supervised Hierarchical Merge Tree for Electron Microscopy Image Segmentation
Title | SSHMT: Semi-supervised Hierarchical Merge Tree for Electron Microscopy Image Segmentation |
Authors | Ting Liu, Miaomiao Zhang, Mehran Javanmardi, Nisha Ramesh, Tolga Tasdizen |
Abstract | Region-based methods have proven necessary for improving segmentation accuracy of neuronal structures in electron microscopy (EM) images. Most region-based segmentation methods use a scoring function to determine region merging. Such functions are usually learned with supervised algorithms that demand considerable ground truth data, which are costly to collect. We propose a semi-supervised approach that reduces this demand. Based on a merge tree structure, we develop a differentiable unsupervised loss term that enforces consistent predictions from the learned function. We then propose a Bayesian model that combines the supervised and the unsupervised information for probabilistic learning. The experimental results on three EM data sets demonstrate that by using a subset of only 3% to 7% of the entire ground truth data, our approach consistently performs close to the state-of-the-art supervised method with the full labeled data set, and significantly outperforms the supervised method with the same labeled subset. |
Tasks | Electron Microscopy Image Segmentation, Semantic Segmentation |
Published | 2016-08-14 |
URL | http://arxiv.org/abs/1608.04051v1 |
http://arxiv.org/pdf/1608.04051v1.pdf | |
PWC | https://paperswithcode.com/paper/sshmt-semi-supervised-hierarchical-merge-tree |
Repo | https://github.com/tingliu/glia |
Framework | none |
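The semi-supervised idea above combines a supervised loss on the few labeled merge-tree nodes with an unsupervised consistency term on all nodes. The PyTorch sketch below is only a generic illustration of that combination: the particular consistency penalty (a parent's merge score should not exceed its child's) is an assumed stand-in, not the paper's Bayesian formulation.

```python
import torch
import torch.nn.functional as F

def semi_supervised_merge_loss(scores, parent_of, labeled_idx, labels, w_unsup=1.0):
    """scores      : (num_nodes,) sigmoid merge scores predicted for every tree node
    parent_of   : dict child_index -> parent_index in the merge tree
    labeled_idx : indices of the few nodes with ground-truth labels
    labels      : float tensor of ground-truth merge labels in {0, 1}
    """
    sup = F.binary_cross_entropy(scores[labeled_idx], labels)       # supervised term
    child = torch.tensor(list(parent_of.keys()))
    parent = torch.tensor(list(parent_of.values()))
    unsup = torch.relu(scores[parent] - scores[child]).mean()       # assumed consistency penalty
    return sup + w_unsup * unsup
```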
DeepWriter: A Multi-Stream Deep CNN for Text-independent Writer Identification
Title | DeepWriter: A Multi-Stream Deep CNN for Text-independent Writer Identification |
Authors | Linjie Xing, Yu Qiao |
Abstract | Text-independent writer identification is challenging due to the huge variation of written contents and the ambiguous written styles of different writers. This paper proposes DeepWriter, a deep multi-stream CNN that learns a powerful representation for recognizing writers. DeepWriter takes local handwritten patches as input and is trained with a softmax classification loss. The main contributions are: 1) we design and optimize a multi-stream structure for the writer identification task; 2) we introduce data augmentation learning to enhance the performance of DeepWriter; 3) we introduce a patch scanning strategy to handle text images of different lengths. In addition, we find that different languages such as English and Chinese may share common features for writer identification, and that joint training can yield better performance. Experimental results on the IAM and HWDB datasets show that our models achieve high identification accuracy: 99.01% on 301 writers and 97.03% on 657 writers with one English sentence as input, and 93.85% on 300 writers with one Chinese character as input, outperforming previous methods by a large margin. Moreover, our models reach an accuracy of 98.01% on 301 writers with only 4 English letters as input. |
Tasks | Data Augmentation |
Published | 2016-06-21 |
URL | http://arxiv.org/abs/1606.06472v2 |
http://arxiv.org/pdf/1606.06472v2.pdf | |
PWC | https://paperswithcode.com/paper/deepwriter-a-multi-stream-deep-cnn-for-text |
Repo | https://github.com/Nihhaar/Connected-Handwriting-Recognition |
Framework | none |
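A minimal PyTorch sketch of the multi-stream idea: two streams process neighbouring handwritten patches, their features are concatenated, and a softmax classifier predicts the writer. Layer sizes and the use of exactly two streams are illustrative assumptions, not the published DeepWriter configuration.

```python
import torch
import torch.nn as nn

class TwoStreamWriterNet(nn.Module):
    def __init__(self, num_writers):
        super().__init__()
        def stream():
            return nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
                nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            )
        self.stream_a, self.stream_b = stream(), stream()
        self.classifier = nn.Linear(2 * 64 * 4 * 4, num_writers)

    def forward(self, patch_a, patch_b):
        feats = torch.cat([self.stream_a(patch_a), self.stream_b(patch_b)], dim=1)
        return self.classifier(feats)        # train with nn.CrossEntropyLoss

# At test time, logits from many scanned patches of a page can be averaged
# before taking the argmax over writers (one way to realize patch scanning).
```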
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Title | CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning |
Authors | Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick |
Abstract | When building artificial intelligence systems that can reason and answer questions about visual data, we need diagnostic tests to analyze our progress and discover shortcomings. Existing benchmarks for visual question answering can help, but have strong biases that models can exploit to correctly answer questions without reasoning. They also conflate multiple sources of error, making it hard to pinpoint model weaknesses. We present a diagnostic dataset that tests a range of visual reasoning abilities. It contains minimal biases and has detailed annotations describing the kind of reasoning each question requires. We use this dataset to analyze a variety of modern visual reasoning systems, providing novel insights into their abilities and limitations. |
Tasks | Question Answering, Visual Question Answering, Visual Reasoning |
Published | 2016-12-20 |
URL | http://arxiv.org/abs/1612.06890v1 |
http://arxiv.org/pdf/1612.06890v1.pdf | |
PWC | https://paperswithcode.com/paper/clevr-a-diagnostic-dataset-for-compositional |
Repo | https://github.com/ethanjperez/film |
Framework | pytorch |
Binarized Neural Networks
Title | Binarized Neural Networks |
Authors | Itay Hubara, Daniel Soudry, Ran El Yaniv |
Abstract | We introduce a method to train Binarized Neural Networks (BNNs) - neural networks with binary weights and activations at run-time and when computing the parameters’ gradient at train-time. We conduct two sets of experiments, each based on a different framework, namely Torch7 and Theano, where we train BNNs on MNIST, CIFAR-10 and SVHN, and achieve nearly state-of-the-art results. During the forward pass, BNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations, which might lead to a great increase in power-efficiency. Last but not least, we wrote a binary matrix multiplication GPU kernel with which it is possible to run our MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy. The code for training and running our BNNs is available. |
Tasks | |
Published | 2016-02-08 |
URL | http://arxiv.org/abs/1602.02505v3 |
http://arxiv.org/pdf/1602.02505v3.pdf | |
PWC | https://paperswithcode.com/paper/binarized-neural-networks |
Repo | https://github.com/ryuz/BinaryBrain |
Framework | none |
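The key training trick is binarizing weights (and activations) on the forward pass while passing gradients through with a straight-through estimator, so real-valued weights can still be updated. Below is a minimal PyTorch sketch of that mechanism; the bit-packed GPU kernel mentioned in the abstract is not shown.

```python
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through gradient estimator."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)                      # values become +-1 (0 stays 0)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()  # pass gradient only where |x| <= 1

class BinaryLinear(nn.Linear):
    """Linear layer whose real-valued weights are binarized on the forward pass."""
    def forward(self, x):
        return nn.functional.linear(x, BinarizeSTE.apply(self.weight), self.bias)
```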
Automated Generation of Multilingual Clusters for the Evaluation of Distributed Representations
Title | Automated Generation of Multilingual Clusters for the Evaluation of Distributed Representations |
Authors | Philip Blair, Yuval Merhav, Joel Barry |
Abstract | We propose a language-agnostic way of automatically generating sets of semantically similar clusters of entities along with sets of “outlier” elements, which may then be used to perform an intrinsic evaluation of word embeddings in the outlier detection task. We used our methodology to create a gold-standard dataset, which we call WikiSem500, and evaluated multiple state-of-the-art embeddings. The results show a correlation between performance on this dataset and performance on sentiment analysis. |
Tasks | Outlier Detection, Sentiment Analysis, Word Embeddings |
Published | 2016-11-04 |
URL | http://arxiv.org/abs/1611.01547v5 |
http://arxiv.org/pdf/1611.01547v5.pdf | |
PWC | https://paperswithcode.com/paper/automated-generation-of-multilingual-clusters |
Repo | https://github.com/belph/wiki-sem-500 |
Framework | none |
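The intrinsic evaluation described above boils down to: given embeddings for a semantic cluster plus an injected outlier, check whether the outlier is the least similar member. The numpy sketch below uses average cosine similarity as the compactness score; the exact score used for WikiSem500 may differ.

```python
import numpy as np

def predict_outlier(vectors):
    """Return the index of the word predicted to be the outlier of the set."""
    V = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sim = V @ V.T                               # pairwise cosine similarities
    np.fill_diagonal(sim, 0.0)
    mean_sim = sim.sum(axis=1) / (len(V) - 1)   # average similarity to the others
    return int(np.argmin(mean_sim))

def outlier_detection_accuracy(groups):
    """groups: list of (matrix_of_vectors, true_outlier_index) pairs."""
    hits = [predict_outlier(vecs) == idx for vecs, idx in groups]
    return float(np.mean(hits))
```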
Word2Vec vs DBnary: Augmenting METEOR using Vector Representations or Lexical Resources?
Title | Word2Vec vs DBnary: Augmenting METEOR using Vector Representations or Lexical Resources? |
Authors | Christophe Servan, Alexandre Berard, Zied Elloumi, Hervé Blanchon, Laurent Besacier |
Abstract | This paper presents an approach that combines lexico-semantic resources and distributed word representations for evaluation in machine translation (MT). The study is carried out by enriching a well-known MT evaluation metric, METEOR. This metric enables approximate matches (synonymy or morphological similarity) between an automatic translation and a reference translation. Our experiments are conducted within the framework of the Metrics task of WMT 2014. We show that distributed representations are a good alternative to lexico-semantic resources for MT evaluation and can even bring interesting additional information. The augmented versions of METEOR, using vector representations, are made available on our Github page. |
Tasks | Machine Translation |
Published | 2016-10-05 |
URL | http://arxiv.org/abs/1610.01291v1 |
http://arxiv.org/pdf/1610.01291v1.pdf | |
PWC | https://paperswithcode.com/paper/word2vec-vs-dbnary-augmenting-meteor-using |
Repo | https://github.com/cservan/METEOR-E |
Framework | none |
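In the spirit of the extension above, embedding similarity can supply the approximate-match stage that METEOR otherwise takes from lexical resources. The sketch below greedily aligns hypothesis and reference words whose cosine similarity exceeds a threshold; the greedy one-to-one alignment and the 0.8 threshold are illustrative assumptions, not the METEOR-E implementation.

```python
import numpy as np

def embedding_matches(hyp_tokens, ref_tokens, emb, threshold=0.8):
    """emb maps a token to its word2vec vector; returns (hyp_idx, ref_idx, sim) matches."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    used, matches = set(), []
    for i, h in enumerate(hyp_tokens):
        best_j, best_sim = None, threshold
        for j, r in enumerate(ref_tokens):
            if j in used or h not in emb or r not in emb:
                continue
            sim = cos(emb[h], emb[r])
            if sim >= best_sim:
                best_j, best_sim = j, sim
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j, best_sim))
    return matches   # these matches then feed METEOR's precision/recall/fragmentation
```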
Fully Convolutional Instance-aware Semantic Segmentation
Title | Fully Convolutional Instance-aware Semantic Segmentation |
Authors | Yi Li, Haozhi Qi, Jifeng Dai, Xiangyang Ji, Yichen Wei |
Abstract | We present the first fully convolutional end-to-end solution for the instance-aware semantic segmentation task. It inherits all the merits of FCNs for semantic segmentation and instance mask proposal. It performs instance mask prediction and classification jointly. The underlying convolutional representation is fully shared between the two sub-tasks, as well as between all regions of interest. The proposed network is highly integrated and achieves state-of-the-art performance in both accuracy and efficiency. It won the COCO 2016 segmentation competition by a large margin. Code will be released at \url{https://github.com/daijifeng001/TA-FCN}. |
Tasks | Semantic Segmentation |
Published | 2016-11-23 |
URL | http://arxiv.org/abs/1611.07709v2 |
http://arxiv.org/pdf/1611.07709v2.pdf | |
PWC | https://paperswithcode.com/paper/fully-convolutional-instance-aware-semantic |
Repo | https://github.com/daijifeng001/TA-FCN |
Framework | mxnet |
Generalized Deep Image to Image Regression
Title | Generalized Deep Image to Image Regression |
Authors | Venkataraman Santhanam, Vlad I. Morariu, Larry S. Davis |
Abstract | We present a Deep Convolutional Neural Network architecture which serves as a generic image-to-image regressor that can be trained end-to-end without any further machinery. Our proposed architecture, the Recursively Branched Deconvolutional Network (RBDN), develops a cheap multi-context image representation very early on using an efficient recursive branching scheme with extensive parameter sharing and learnable upsampling. This multi-context representation is subjected to a highly non-linear, locality-preserving transformation by the remainder of the network, comprising a series of convolutions/deconvolutions without any spatial downsampling. The RBDN architecture is fully convolutional and can handle variable-sized images during inference. We provide qualitative/quantitative results on $3$ diverse tasks: relighting, denoising and colorization, and show that our proposed RBDN architecture obtains results comparable to the state-of-the-art on each of these tasks when used off-the-shelf, without any post-processing or task-specific architectural modifications. |
Tasks | Colorization, Denoising |
Published | 2016-12-10 |
URL | http://arxiv.org/abs/1612.03268v1 |
http://arxiv.org/pdf/1612.03268v1.pdf | |
PWC | https://paperswithcode.com/paper/generalized-deep-image-to-image-regression |
Repo | https://github.com/venkai/RBDN |
Framework | none |
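A minimal PyTorch sketch of the recursive branching idea: each branch downsamples, optionally recurses into a deeper branch, upsamples with a learnable deconvolution, and concatenates its output with the incoming features to build the multi-context representation. Channel counts and depth are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class Branch(nn.Module):
    """One recursive branch of an RBDN-style network (illustrative sketch)."""
    def __init__(self, ch, depth):
        super().__init__()
        self.down = nn.Sequential(nn.Conv2d(ch, ch, 3, stride=2, padding=1),
                                  nn.ReLU(inplace=True))
        self.inner = Branch(ch, depth - 1) if depth > 1 else None
        inner_ch = 2 * ch if self.inner is not None else ch
        self.up = nn.ConvTranspose2d(inner_ch, ch, 4, stride=2, padding=1)  # learnable upsampling

    def forward(self, x):
        y = self.down(x)
        if self.inner is not None:
            y = self.inner(y)
        return torch.cat([x, self.up(y)], dim=1)   # multi-context: input + upsampled context
```

In the full architecture, this concatenated multi-context map would then be processed by a stack of convolutions with no further spatial downsampling, as the abstract describes.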