May 7, 2019


Paper Group AWR 98



Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora

Title Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora
Authors William L. Hamilton, Kevin Clark, Jure Leskovec, Dan Jurafsky
Abstract A word’s sentiment depends on the domain in which it is used. Computational social science research thus requires sentiment lexicons that are specific to the domains being studied. We combine domain-specific word embeddings with a label propagation framework to induce accurate domain-specific sentiment lexicons using small sets of seed words, achieving state-of-the-art performance competitive with approaches that rely on hand-curated resources. Using our framework we perform two large-scale empirical studies to quantify the extent to which sentiment varies across time and between communities. We induce and release historical sentiment lexicons for 150 years of English and community-specific sentiment lexicons for 250 online communities from the social media forum Reddit. The historical lexicons show that more than 5% of sentiment-bearing (non-neutral) English words completely switched polarity during the last 150 years, and the community-specific lexicons highlight how sentiment varies drastically between different communities.
Tasks Word Embeddings
Published 2016-06-09
URL http://arxiv.org/abs/1606.02820v2
PDF http://arxiv.org/pdf/1606.02820v2.pdf
PWC https://paperswithcode.com/paper/inducing-domain-specific-sentiment-lexicons
Repo https://github.com/williamleif/socialsent
Framework tf
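
The abstract above describes inducing a lexicon by propagating sentiment from a handful of seed words over a word-embedding similarity graph. Below is a minimal, self-contained sketch of that idea using a random walk with restarts over a k-nearest-neighbour cosine-similarity graph; the toy vocabulary, 2-d "embeddings", and all hyperparameters are illustrative assumptions, not taken from the released socialsent code.

```python
# Sketch: seed-based label propagation over a word-embedding similarity graph.
import numpy as np

def induce_lexicon(vectors, vocab, pos_seeds, neg_seeds, k=2, beta=0.9, iters=50):
    """Return a sentiment polarity score for every word in `vocab`."""
    X = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sim = X @ X.T                                    # cosine similarities
    np.fill_diagonal(sim, -np.inf)
    W = np.zeros_like(sim)
    for i in range(len(vocab)):                      # keep k nearest neighbours
        nn = np.argsort(sim[i])[-k:]
        W[i, nn] = np.clip(sim[i, nn], 0, None)
    T = W / (W.sum(axis=1, keepdims=True) + 1e-12)   # row-stochastic transitions

    def walk(seeds):
        s = np.array([1.0 if w in seeds else 0.0 for w in vocab])
        s /= s.sum()
        p = np.full(len(vocab), 1.0 / len(vocab))
        for _ in range(iters):                       # random walk with restart
            p = beta * (T.T @ p) + (1 - beta) * s
        return p

    p_pos, p_neg = walk(set(pos_seeds)), walk(set(neg_seeds))
    return dict(zip(vocab, p_pos - p_neg))           # >0 leans positive

# Toy usage with made-up 2-d "embeddings":
vocab = ["good", "great", "bad", "awful", "okay"]
vecs = np.array([[1, .9], [.9, 1], [-1, -.8], [-.9, -1], [.1, 0]], float)
print(induce_lexicon(vecs, vocab, pos_seeds=["good"], neg_seeds=["bad"]))
```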

Harnessing the Power of the Crowd to Increase Capacity for Data Science in the Social Sector

Title Harnessing the Power of the Crowd to Increase Capacity for Data Science in the Social Sector
Authors Peter Bull, Isaac Slavitt, Greg Lipstein
Abstract We present three case studies of organizations using a data science competition to answer a pressing question. The first is in education where a nonprofit that creates smart school budgets wanted to automatically tag budget line items. The second is in public health, where a low-cost, nonprofit women’s health care provider wanted to understand the effect of demographic and behavioral questions on predicting which services a woman would need. The third and final example is in government innovation: using online restaurant reviews from Yelp, competitors built models to forecast which restaurants were most likely to have hygiene violations when visited by health inspectors. Finally, we reflect on the unique benefits of the open, public competition model.
Tasks
Published 2016-06-24
URL http://arxiv.org/abs/1606.07781v1
PDF http://arxiv.org/pdf/1606.07781v1.pdf
PWC https://paperswithcode.com/paper/harnessing-the-power-of-the-crowd-to-increase
Repo https://github.com/jsiloto/dengAI
Framework tf

Deep Reconstruction-Classification Networks for Unsupervised Domain Adaptation

Title Deep Reconstruction-Classification Networks for Unsupervised Domain Adaptation
Authors Muhammad Ghifary, W. Bastiaan Kleijn, Mengjie Zhang, David Balduzzi, Wen Li
Abstract In this paper, we propose a novel unsupervised domain adaptation algorithm based on deep learning for visual object recognition. Specifically, we design a new model called Deep Reconstruction-Classification Network (DRCN), which jointly learns a shared encoding representation for two tasks: i) supervised classification of labeled source data, and ii) unsupervised reconstruction of unlabeled target data. In this way, the learnt representation not only preserves discriminability, but also encodes useful information from the target domain. Our new DRCN model can be optimized by backpropagation, similarly to standard neural networks. We evaluate the performance of DRCN on a series of cross-domain object recognition tasks, where DRCN provides a considerable improvement (up to ~8% in accuracy) over the prior state-of-the-art algorithms. Interestingly, we also observe that the reconstruction pipeline of DRCN transforms images from the source domain into images whose appearance resembles the target dataset. This suggests that DRCN’s performance is due to constructing a single composite representation that encodes information about both the structure of target images and the classification of source images. Finally, we provide a formal analysis to justify the algorithm’s objective in the domain adaptation context.
Tasks Domain Adaptation, Object Recognition, Unsupervised Domain Adaptation
Published 2016-07-12
URL http://arxiv.org/abs/1607.03516v2
PDF http://arxiv.org/pdf/1607.03516v2.pdf
PWC https://paperswithcode.com/paper/deep-reconstruction-classification-networks
Repo https://github.com/ghif/drcn
Framework tf
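
As a rough illustration of the joint objective described above, here is a minimal PyTorch sketch that shares an encoder between a source-domain classifier and a target-domain reconstruction decoder and mixes the two losses with a trade-off weight. The architecture sizes and the weight `lam` are illustrative assumptions, not the paper's exact DRCN configuration.

```python
# Sketch: joint source classification + target reconstruction with a shared encoder.
import torch
import torch.nn as nn

class DRCNSketch(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(                # shared representation
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.classifier = nn.Sequential(             # labeled source branch
            nn.Flatten(), nn.Linear(32 * 32 * 32, num_classes))
        self.decoder = nn.Sequential(                # unlabeled target branch
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1))

    def forward(self, x):
        z = self.encoder(x)
        return self.classifier(z), self.decoder(z)

def joint_loss(model, x_src, y_src, x_tgt, lam=0.7):
    logits, _ = model(x_src)
    _, recon = model(x_tgt)
    cls = nn.functional.cross_entropy(logits, y_src)   # supervised source loss
    rec = nn.functional.mse_loss(recon, x_tgt)         # unsupervised target loss
    return lam * cls + (1.0 - lam) * rec

model = DRCNSketch()
x_src, y_src = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
x_tgt = torch.randn(4, 3, 32, 32)
joint_loss(model, x_src, y_src, x_tgt).backward()
```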

Every Filter Extracts A Specific Texture In Convolutional Neural Networks

Title Every Filter Extracts A Specific Texture In Convolutional Neural Networks
Authors Zhiqiang Xia, Ce Zhu, Zhengtao Wang, Qi Guo, Yipeng Liu
Abstract Many works have concentrated on visualizing and understanding the inner mechanism of convolutional neural networks (CNNs) by generating images that activate some specific neurons, an approach called deep visualization. However, it is still unclear what the filters intuitively extract from images. In this paper, we propose a modified code inversion algorithm, called feature map inversion, to understand the function of a filter of interest in CNNs. We reveal that every filter extracts a specific texture. The textures from higher layers contain more colours and more intricate structures. We also demonstrate that the style of an image can be seen as a combination of these texture primitives. Two methods are proposed to reallocate the energy distribution of feature maps randomly and purposefully. We then invert the modified code and generate images of diverse styles. With these results, we provide an explanation of why the Gram matrix of feature maps \cite{Gatys_2016_CVPR} can represent image style.
Tasks
Published 2016-08-15
URL http://arxiv.org/abs/1608.04170v2
PDF http://arxiv.org/pdf/1608.04170v2.pdf
PWC https://paperswithcode.com/paper/every-filter-extracts-a-specific-texture-in
Repo https://github.com/xzqjack/FeatureMapInversion
Framework mxnet
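
The core idea above—reallocating the energy of a feature map and inverting the modified code back into image space—can be sketched in a few lines of PyTorch. A tiny, randomly initialized conv stack stands in for a pretrained CNN so the example stays self-contained; the filter index, scaling factors, and optimizer settings are illustrative.

```python
# Sketch: boost one filter's feature map and optimize an image to match the modified code.
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU()).eval()
for p in cnn.parameters():
    p.requires_grad_(False)

content = torch.rand(1, 3, 64, 64)
with torch.no_grad():
    code = cnn(content)

# Reallocate energy: amplify the filter of interest, damp the rest.
filter_of_interest = 3
target = code * 0.1
target[:, filter_of_interest] = code[:, filter_of_interest] * 10.0

x = content.clone().requires_grad_(True)            # image being optimized
opt = torch.optim.Adam([x], lr=0.05)
for step in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(cnn(x), target)    # match the modified code
    loss.backward()
    opt.step()
# For a real pretrained CNN, `x` now emphasizes the chosen filter's texture.
```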

Technical Report: Towards a Universal Code Formatter through Machine Learning

Title Technical Report: Towards a Universal Code Formatter through Machine Learning
Authors Terence Parr, Jurgen Vinju
Abstract There are many declarative frameworks that allow us to implement code formatters relatively easily for any specific language, but constructing them is cumbersome. The first problem is that “everybody” wants to format their code differently, leading to either many formatter variants or a ridiculous number of configuration options. Second, the size of each implementation scales with a language’s grammar size, leading to hundreds of rules. In this paper, we solve the formatter construction problem using a novel approach, one that automatically derives formatters for any given language without intervention from a language expert. We introduce a code formatter called CodeBuff that uses machine learning to abstract formatting rules from a representative corpus, using a carefully designed feature set. Our experiments on Java, SQL, and ANTLR grammars show that CodeBuff is efficient, has excellent accuracy, and is grammar invariant for a given language. It also generalizes to a 4th language tested during manuscript preparation.
Tasks
Published 2016-06-28
URL http://arxiv.org/abs/1606.08866v1
PDF http://arxiv.org/pdf/1606.08866v1.pdf
PWC https://paperswithcode.com/paper/technical-report-towards-a-universal-code
Repo https://github.com/antlr/codebuff
Framework none
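
CodeBuff learns formatting from a corpus rather than from hand-written rules; a toy way to picture this is a nearest-neighbour lookup from token-context features to whitespace decisions. The sketch below is only that cartoon: the feature tuple, the distance function, and the tiny "corpus" are made-up illustrations, not CodeBuff's actual feature set or algorithm.

```python
# Sketch: predict the whitespace before a token from the k nearest corpus examples.
from collections import Counter

def features(prev_tok, cur_tok, depth):
    return (prev_tok, cur_tok, depth)

def distance(a, b):
    return sum(x != y for x, y in zip(a, b))         # simple mismatch count

# (features, whitespace decision) pairs harvested from a hand-formatted corpus.
corpus = [
    (features("{", "int", 1), "\n    "),
    (features(";", "int", 1), "\n    "),
    (features("int", "x", 1), " "),
    (features("x", "=", 1), " "),
    (features("=", "0", 1), " "),
    (features("0", ";", 1), ""),
]

def predict_ws(feat, k=3):
    nearest = sorted(corpus, key=lambda ex: distance(ex[0], feat))[:k]
    return Counter(ws for _, ws in nearest).most_common(1)[0][0]

print(repr(predict_ws(features(";", "int", 1))))     # likely a newline + indent
```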

A MultiPath Network for Object Detection

Title A MultiPath Network for Object Detection
Authors Sergey Zagoruyko, Adam Lerer, Tsung-Yi Lin, Pedro O. Pinheiro, Sam Gross, Soumith Chintala, Piotr Dollár
Abstract The recent COCO object detection dataset presents several new challenges for object detection. In particular, it contains objects at a broad range of scales, less prototypical images, and requires more precise localization. To address these challenges, we test three modifications to the standard Fast R-CNN object detector: (1) skip connections that give the detector access to features at multiple network layers, (2) a foveal structure to exploit object context at multiple object resolutions, and (3) an integral loss function and corresponding network adjustment that improve localization. The result of these modifications is that information can flow along multiple paths in our network, including through features from multiple network layers and from multiple object views. We refer to our modified classifier as a “MultiPath” network. We couple our MultiPath network with DeepMask object proposals, which are well suited for localization and small objects, and adapt our pipeline to predict segmentation masks in addition to bounding boxes. The combined system improves results over the baseline Fast R-CNN detector with Selective Search by 66% overall and by 4x on small objects. It placed second in both the COCO 2015 detection and segmentation challenges.
Tasks Instance Segmentation, Object Detection
Published 2016-04-07
URL http://arxiv.org/abs/1604.02135v2
PDF http://arxiv.org/pdf/1604.02135v2.pdf
PWC https://paperswithcode.com/paper/a-multipath-network-for-object-detection
Repo https://github.com/facebookresearch/multipathnet
Framework torch
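
One of the three modifications above is the "foveal" structure, which classifies each proposal using several progressively larger context regions. The sketch below shows just that cropping step in PyTorch; the scale factors and output size are illustrative, and in the real network each crop feeds its own classifier stream.

```python
# Sketch: crop progressively larger context regions around a proposal box.
import torch
import torch.nn.functional as F

def foveal_crops(image, box, scales=(1.0, 1.5, 2.0, 4.0), out=64):
    """image: (C, H, W) tensor; box: (x1, y1, x2, y2) in pixels."""
    _, H, W = image.shape
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w, h = x2 - x1, y2 - y1
    crops = []
    for s in scales:
        nx1 = max(int(cx - s * w / 2), 0)
        ny1 = max(int(cy - s * h / 2), 0)
        nx2 = min(int(cx + s * w / 2), W)
        ny2 = min(int(cy + s * h / 2), H)
        patch = image[:, ny1:ny2, nx1:nx2].unsqueeze(0)
        crops.append(F.interpolate(patch, size=(out, out), mode="bilinear",
                                   align_corners=False))
    return torch.cat(crops)                           # (len(scales), C, out, out)

img = torch.rand(3, 240, 320)
print(foveal_crops(img, (100, 80, 180, 160)).shape)   # torch.Size([4, 3, 64, 64])
```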

Structured illumination microscopy image reconstruction algorithm

Title Structured illumination microscopy image reconstruction algorithm
Authors Amit Lal, Chunyan Shan, Peng Xi
Abstract Structured illumination microscopy (SIM) is a very important super-resolution microscopy technique, which provides high-speed super-resolution with about two-fold spatial resolution enhancement. Several attempts aimed at improving the performance of the SIM reconstruction algorithm have been reported. However, most of these highlight only one specific aspect of SIM reconstruction – such as accurately determining the illumination pattern phase shift – whereas other key elements – such as determination of the modulation factor, estimation of the object power spectrum, Wiener filtering of frequency components with inclusion of object power spectrum information, and translocating and merging the overlapping frequency components – are usually glossed over superficially. In addition, most of the reported work lies scattered throughout the literature, and a comprehensive review of the theoretical background is lacking. The purpose of the present work is two-fold: 1) to collect the essential theoretical details of the SIM algorithm in one place, thereby making them readily accessible to readers for the first time; and 2) to provide an open-source SIM reconstruction code (named OpenSIM), which enables users to interactively vary the code parameters and study their effect on the reconstructed SIM image.
Tasks Image Reconstruction, Super-Resolution
Published 2016-02-19
URL http://arxiv.org/abs/1602.06904v1
PDF http://arxiv.org/pdf/1602.06904v1.pdf
PWC https://paperswithcode.com/paper/structured-illumination-microscopy-image
Repo https://github.com/iandobbie/CUDA_SIMrecon
Framework none
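
Among the reconstruction steps listed above, the Wiener filtering of separated frequency components is easy to show in isolation. The numpy sketch below applies a Wiener filter to a single blurred band using a toy Gaussian OTF and an assumed noise constant `w`; it is not the full OpenSIM pipeline, which also estimates pattern phases, the modulation factor, and the object power spectrum.

```python
# Sketch: Wiener-filter one frequency component given the system OTF.
import numpy as np

def wiener_filter(component_ft, otf, w=0.1):
    """Divide out the OTF with a regularizing noise constant."""
    return component_ft * np.conj(otf) / (np.abs(otf) ** 2 + w)

n = 256
fy, fx = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
otf = np.exp(-(fx ** 2 + fy ** 2) / (2 * 0.1 ** 2))        # toy Gaussian OTF

rng = np.random.default_rng(0)
obj = rng.random((n, n))                                    # toy object
raw_ft = np.fft.fft2(obj) * otf                             # blurred acquisition
restored = np.fft.ifft2(wiener_filter(raw_ft, otf)).real    # one filtered band
print(restored.shape)
```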

Eye Tracking for Everyone

Title Eye Tracking for Everyone
Authors Kyle Krafka, Aditya Khosla, Petr Kellnhofer, Harini Kannan, Suchendra Bhandarkar, Wojciech Matusik, Antonio Torralba
Abstract From scientific research to commercial applications, eye tracking is an important tool across many domains. Despite its range of applications, eye tracking has yet to become a pervasive technology. We believe that we can put the power of eye tracking in everyone’s palm by building eye tracking software that works on commodity hardware such as mobile phones and tablets, without the need for additional sensors or devices. We tackle this problem by introducing GazeCapture, the first large-scale dataset for eye tracking, containing data from over 1450 people consisting of almost 2.5M frames. Using GazeCapture, we train iTracker, a convolutional neural network for eye tracking, which achieves a significant reduction in error over previous approaches while running in real time (10-15fps) on a modern mobile device. Our model achieves a prediction error of 1.71cm and 2.53cm without calibration on mobile phones and tablets respectively. With calibration, this is reduced to 1.34cm and 2.12cm. Further, we demonstrate that the features learned by iTracker generalize well to other datasets, achieving state-of-the-art results. The code, data, and models are available at http://gazecapture.csail.mit.edu.
Tasks Calibration, Eye Tracking, Gaze Estimation
Published 2016-06-18
URL http://arxiv.org/abs/1606.05814v1
PDF http://arxiv.org/pdf/1606.05814v1.pdf
PWC https://paperswithcode.com/paper/eye-tracking-for-everyone
Repo https://github.com/CSAILVision/GazeCapture
Framework none
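
The abstract above describes a CNN that combines face and eye crops with a coarse face-position grid to regress a 2-d gaze point. The PyTorch sketch below mirrors that multi-input layout with small illustrative streams; the layer sizes, crop resolutions, and 25x25 grid here are assumptions rather than the published iTracker architecture.

```python
# Sketch: separate conv streams for face and eyes plus a face-grid branch, fused into (x, y).
import torch
import torch.nn as nn

def conv_stream():
    return nn.Sequential(nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                         nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(4), nn.Flatten())    # -> 512 features

class GazeNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.face, self.left, self.right = conv_stream(), conv_stream(), conv_stream()
        self.grid_fc = nn.Sequential(nn.Flatten(), nn.Linear(25 * 25, 128), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(3 * 512 + 128, 128), nn.ReLU(),
                                  nn.Linear(128, 2))               # (x, y) gaze point

    def forward(self, face, left_eye, right_eye, face_grid):
        feats = torch.cat([self.face(face), self.left(left_eye),
                           self.right(right_eye), self.grid_fc(face_grid)], dim=1)
        return self.head(feats)

net = GazeNetSketch()
out = net(torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64),
          torch.rand(2, 3, 64, 64), torch.rand(2, 1, 25, 25))
print(out.shape)   # torch.Size([2, 2])
```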

Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs

Title Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs
Authors Siddhartha Chandra, Iasonas Kokkinos
Abstract In this work we propose a structured prediction technique that combines the virtues of Gaussian Conditional Random Fields (G-CRF) with Deep Learning: (a) our structured prediction task has a unique global optimum that is obtained exactly from the solution of a linear system, (b) the gradients of our model parameters are computed analytically using closed-form expressions, in contrast to the memory-demanding contemporary deep structured prediction approaches that rely on back-propagation-through-time, (c) our pairwise terms do not have to be simple hand-crafted expressions, as in the line of works building on the DenseCRF, but can rather be ‘discovered’ from data through deep architectures, and (d) our system can be trained in an end-to-end manner. Building on standard tools from numerical analysis we develop very efficient algorithms for inference and learning, as well as a customized technique adapted to the semantic segmentation task. This efficiency allows us to explore more sophisticated architectures for structured prediction in deep learning: we introduce multi-resolution architectures to couple information across scales in a joint optimization framework, yielding systematic improvements. We demonstrate the utility of our approach on the challenging VOC PASCAL 2012 image segmentation benchmark, showing substantial improvements over strong baselines. We make all of our code and experiments available at https://github.com/siddharthachandra/gcrf
Tasks Semantic Segmentation, Structured Prediction
Published 2016-03-28
URL http://arxiv.org/abs/1603.08358v4
PDF http://arxiv.org/pdf/1603.08358v4.pdf
PWC https://paperswithcode.com/paper/fast-exact-and-multi-scale-inference-for
Repo https://github.com/siddharthachandra/gcrf
Framework none
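
The key property claimed in (a) above is that inference reduces to solving a positive-definite linear system exactly. The numpy sketch below builds a toy system of that form, with A standing in for the pairwise terms and b for the CNN's unary scores, and solves it with a plain conjugate-gradient loop; the problem size and regularization are illustrative.

```python
# Sketch: exact G-CRF-style inference as a positive-definite linear solve (A + lam*I) x = b.
import numpy as np

def conjugate_gradient(A, b, iters=100, tol=1e-8):
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    for _ in range(iters):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) < tol:
            break
        p = r_new + (r_new @ r_new) / (r @ r) * p
        r = r_new
    return x

rng = np.random.default_rng(0)
n, lam = 50, 1.0                       # n would be pixels * labels in the real model
M = rng.standard_normal((n, n))
A = M @ M.T / n + lam * np.eye(n)      # pairwise terms, made positive definite
b = rng.standard_normal(n)             # unary (CNN) scores
x = conjugate_gradient(A, b)
print(np.allclose(A @ x, b, atol=1e-5))
```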

Design of Efficient Convolutional Layers using Single Intra-channel Convolution, Topological Subdivisioning and Spatial “Bottleneck” Structure

Title Design of Efficient Convolutional Layers using Single Intra-channel Convolution, Topological Subdivisioning and Spatial “Bottleneck” Structure
Authors Min Wang, Baoyuan Liu, Hassan Foroosh
Abstract Deep convolutional neural networks achieve remarkable visual recognition performance at the cost of high computational complexity. In this paper, we present a new design of efficient convolutional layers based on three schemes. The 3D convolution operation in a convolutional layer can be considered as performing spatial convolution in each channel and linear projection across channels simultaneously. By unravelling them and arranging the spatial convolution sequentially, the proposed layer is composed of a single intra-channel convolution, whose computation is negligible, and a linear channel projection. A topological subdivisioning is adopted to reduce the connections between the input channels and output channels. Additionally, we also introduce a spatial “bottleneck” structure that utilizes a convolution-projection-deconvolution pipeline to take advantage of the correlation between adjacent pixels in the input. Our experiments demonstrate that the proposed layers remarkably outperform standard convolutional layers with regard to the accuracy/complexity ratio. Our models achieve similar accuracy to VGG, ResNet-50, and ResNet-101 while requiring 42, 4.5, and 6.5 times less computation, respectively.
Tasks
Published 2016-08-15
URL http://arxiv.org/abs/1608.04337v2
PDF http://arxiv.org/pdf/1608.04337v2.pdf
PWC https://paperswithcode.com/paper/design-of-efficient-convolutional-layers
Repo https://github.com/asfathermou/human-computer-interaction
Framework tf
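
The first of the three schemes above factorizes a dense 3D convolution into a single intra-channel spatial convolution followed by a linear channel projection. A minimal PyTorch sketch of that factorization (implemented here with a grouped convolution plus a 1x1 convolution) is shown below, with a parameter-count comparison against a dense 3x3 layer; the channel counts are illustrative.

```python
# Sketch: single intra-channel spatial convolution + 1x1 linear channel projection.
import torch
import torch.nn as nn

class IntraChannelConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        # groups=in_ch -> each spatial filter sees exactly one input channel.
        self.spatial = nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch)
        self.project = nn.Conv2d(in_ch, out_ch, 1)   # linear channel projection
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.project(self.spatial(x)))

x = torch.rand(1, 64, 56, 56)
block = IntraChannelConvBlock(64, 128)
dense = nn.Conv2d(64, 128, 3, padding=1)
params = lambda m: sum(p.numel() for p in m.parameters())
print(block(x).shape, params(block), params(dense))  # far fewer parameters than dense 3x3
```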

Recursive nonlinear-system identification using latent variables

Title Recursive nonlinear-system identification using latent variables
Authors Per Mattsson, Dave Zachariah, Petre Stoica
Abstract In this paper we develop a method for learning nonlinear systems with multiple outputs and inputs. We begin by modelling the errors of a nominal predictor of the system using a latent variable framework. Then using the maximum likelihood principle we derive a criterion for learning the model. The resulting optimization problem is tackled using a majorization-minimization approach. Finally, we develop a convex majorization technique and show that it enables a recursive identification method. The method learns parsimonious predictive models and is tested on both synthetic and real nonlinear systems.
Tasks
Published 2016-06-14
URL http://arxiv.org/abs/1606.04366v3
PDF http://arxiv.org/pdf/1606.04366v3.pdf
PWC https://paperswithcode.com/paper/recursive-nonlinear-system-identification
Repo https://github.com/magni84/lava
Framework none

Deep Biaffine Attention for Neural Dependency Parsing

Title Deep Biaffine Attention for Neural Dependency Parsing
Authors Timothy Dozat, Christopher D. Manning
Abstract This paper builds off recent work from Kiperwasser & Goldberg (2016) using neural attention in a simple graph-based dependency parser. We use a larger but more thoroughly regularized parser than other recent BiLSTM-based approaches, with biaffine classifiers to predict arcs and labels. Our parser gets state-of-the-art or near state-of-the-art performance on standard treebanks for six different languages, achieving 95.7% UAS and 94.1% LAS on the most popular English PTB dataset. This makes it the highest-performing graph-based parser on this benchmark—outperforming Kiperwasser & Goldberg (2016) by 1.8% and 2.2%—and comparable to the highest-performing transition-based parser (Kuncoro et al., 2016), which achieves 95.8% UAS and 94.6% LAS. We also show which hyperparameter choices had a significant effect on parsing accuracy, allowing us to achieve large gains over other graph-based approaches.
Tasks Dependency Parsing
Published 2016-11-06
URL http://arxiv.org/abs/1611.01734v3
PDF http://arxiv.org/pdf/1611.01734v3.pdf
PWC https://paperswithcode.com/paper/deep-biaffine-attention-for-neural-dependency
Repo https://github.com/ITUnlp/UniParse
Framework none
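
The biaffine classifier mentioned above scores every (dependent, head) word pair with a bilinear term plus a head-only bias. The PyTorch sketch below implements just that arc scorer over precomputed "dependent" and "head" vectors; the hidden dimension and random inputs are illustrative, and the real parser produces these vectors with MLPs over BiLSTM states and adds a separate label scorer.

```python
# Sketch: biaffine arc scoring s_ij = h_dep_i^T U h_head_j + h_head_j^T u.
import torch
import torch.nn as nn

class BiaffineArcScorer(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.U = nn.Parameter(torch.randn(dim, dim) * 0.01)   # bilinear weight
        self.u = nn.Parameter(torch.zeros(dim))                # head-only bias weight

    def forward(self, h_dep, h_head):
        """h_dep, h_head: (batch, seq_len, dim) -> scores (batch, dep, head)."""
        bilinear = torch.einsum("bid,de,bje->bij", h_dep, self.U, h_head)
        head_bias = torch.einsum("bjd,d->bj", h_head, self.u).unsqueeze(1)
        return bilinear + head_bias     # row i: score of every candidate head for word i

scorer = BiaffineArcScorer()
h_dep, h_head = torch.rand(2, 10, 128), torch.rand(2, 10, 128)
scores = scorer(h_dep, h_head)
pred_heads = scores.argmax(dim=-1)      # greedy head choice per word
print(scores.shape, pred_heads.shape)   # (2, 10, 10) and (2, 10)
```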

Speed/accuracy trade-offs for modern convolutional object detectors

Title Speed/accuracy trade-offs for modern convolutional object detectors
Authors Jonathan Huang, Vivek Rathod, Chen Sun, Menglong Zhu, Anoop Korattikara, Alireza Fathi, Ian Fischer, Zbigniew Wojna, Yang Song, Sergio Guadarrama, Kevin Murphy
Abstract The goal of this paper is to serve as a guide for selecting a detection architecture that achieves the right speed/memory/accuracy balance for a given application and platform. To this end, we investigate various ways to trade accuracy for speed and memory usage in modern convolutional object detection systems. A number of successful systems have been proposed in recent years, but apples-to-apples comparisons are difficult due to different base feature extractors (e.g., VGG, Residual Networks), different default image resolutions, as well as different hardware and software platforms. We present a unified implementation of the Faster R-CNN [Ren et al., 2015], R-FCN [Dai et al., 2016] and SSD [Liu et al., 2015] systems, which we view as “meta-architectures” and trace out the speed/accuracy trade-off curve created by using alternative feature extractors and varying other critical parameters such as image size within each of these meta-architectures. On one extreme end of this spectrum where speed and memory are critical, we present a detector that achieves real time speeds and can be deployed on a mobile device. On the opposite end in which accuracy is critical, we present a detector that achieves state-of-the-art performance measured on the COCO detection task.
Tasks Object Detection
Published 2016-11-30
URL http://arxiv.org/abs/1611.10012v3
PDF http://arxiv.org/pdf/1611.10012v3.pdf
PWC https://paperswithcode.com/paper/speedaccuracy-trade-offs-for-modern
Repo https://github.com/Qengineering/TensorFlow_Lite_RPi_64-bits
Framework tf

End-to-end Learning of Deep Visual Representations for Image Retrieval

Title End-to-end Learning of Deep Visual Representations for Image Retrieval
Authors Albert Gordo, Jon Almazan, Jerome Revaud, Diane Larlus
Abstract While deep learning has become a key ingredient in the top performing methods for many computer vision tasks, it has failed so far to bring similar improvements to instance-level image retrieval. In this article, we argue that reasons for the underwhelming results of deep methods on image retrieval are threefold: i) noisy training data, ii) inappropriate deep architecture, and iii) suboptimal training procedure. We address all three issues. First, we leverage a large-scale but noisy landmark dataset and develop an automatic cleaning method that produces a suitable training set for deep retrieval. Second, we build on the recent R-MAC descriptor, show that it can be interpreted as a deep and differentiable architecture, and present improvements to enhance it. Last, we train this network with a siamese architecture that combines three streams with a triplet loss. At the end of the training process, the proposed architecture produces a global image representation in a single forward pass that is well suited for image retrieval. Extensive experiments show that our approach significantly outperforms previous retrieval approaches, including state-of-the-art methods based on costly local descriptor indexing and spatial verification. On Oxford 5k, Paris 6k and Holidays, we respectively report 94.7, 96.6, and 94.8 mean average precision. Our representations can also be heavily compressed using product quantization with little loss in accuracy. For additional material, please see www.xrce.xerox.com/Deep-Image-Retrieval.
Tasks Image Retrieval, Quantization
Published 2016-10-25
URL http://arxiv.org/abs/1610.07940v2
PDF http://arxiv.org/pdf/1610.07940v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-learning-of-deep-visual
Repo https://github.com/almazan/deep-image-retrieval
Framework pytorch
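
The descriptor described above builds on R-MAC: regional max-pooling of convolutional activations, per-region l2-normalization, summation, and a final l2-normalization. The PyTorch sketch below implements that aggregation over a simple fixed grid of regions; the multi-scale region sampling and PCA/whitening used in practice are omitted, and the feature maps here are random stand-ins.

```python
# Sketch: R-MAC-style aggregation of regional max-pooled conv features into one descriptor.
import torch
import torch.nn.functional as F

def rmac_descriptor(feat, grid=3):
    """feat: (C, H, W) conv feature map -> (C,) global descriptor."""
    C, H, W = feat.shape
    regions = []
    for i in range(grid):
        for j in range(grid):
            y0, y1 = i * H // grid, (i + 1) * H // grid
            x0, x1 = j * W // grid, (j + 1) * W // grid
            r = feat[:, y0:y1, x0:x1].amax(dim=(1, 2))      # regional max-pool
            regions.append(F.normalize(r, dim=0))           # per-region l2 norm
    return F.normalize(torch.stack(regions).sum(dim=0), dim=0)

feat_a, feat_b = torch.rand(256, 14, 14), torch.rand(256, 14, 14)
d_a, d_b = rmac_descriptor(feat_a), rmac_descriptor(feat_b)
print(float(d_a @ d_b))   # cosine similarity used for retrieval ranking
```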

YOLO9000: Better, Faster, Stronger

Title YOLO9000: Better, Faster, Stronger
Authors Joseph Redmon, Ali Farhadi
Abstract We introduce YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. First we propose various improvements to the YOLO detection method, both novel and drawn from prior work. The improved model, YOLOv2, is state-of-the-art on standard detection tasks like PASCAL VOC and COCO. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like Faster RCNN with ResNet and SSD while still running significantly faster. Finally we propose a method to jointly train on object detection and classification. Using this method we train YOLO9000 simultaneously on the COCO detection dataset and the ImageNet classification dataset. Our joint training allows YOLO9000 to predict detections for object classes that don’t have labelled detection data. We validate our approach on the ImageNet detection task. YOLO9000 gets 19.7 mAP on the ImageNet detection validation set despite only having detection data for 44 of the 200 classes. On the 156 classes not in COCO, YOLO9000 gets 16.0 mAP. But YOLO can detect more than just 200 classes; it predicts detections for more than 9000 different object categories. And it still runs in real-time.
Tasks Object Detection, Real-Time Object Detection
Published 2016-12-25
URL http://arxiv.org/abs/1612.08242v1
PDF http://arxiv.org/pdf/1612.08242v1.pdf
PWC https://paperswithcode.com/paper/yolo9000-better-faster-stronger
Repo https://github.com/yuliani29/yolotraining
Framework none
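
A concrete piece of the YOLOv2 improvements above is the anchor-based box parameterization, where each grid cell predicts sigmoid-bounded center offsets and exponential scalings of prior box sizes. The sketch below decodes raw predictions with that parameterization; the 13x13 grid and the three anchor shapes are illustrative.

```python
# Sketch: decode YOLOv2-style raw predictions (tx, ty, tw, th) into boxes in grid-cell units.
import torch

def decode_boxes(pred, anchors, grid=13):
    """pred: (grid, grid, num_anchors, 4) raw tx, ty, tw, th."""
    cy, cx = torch.meshgrid(torch.arange(grid), torch.arange(grid), indexing="ij")
    cx = cx[..., None].float()
    cy = cy[..., None].float()
    pw, ph = anchors[:, 0], anchors[:, 1]
    bx = torch.sigmoid(pred[..., 0]) + cx      # center stays inside its cell
    by = torch.sigmoid(pred[..., 1]) + cy
    bw = pw * torch.exp(pred[..., 2])          # size scales the anchor prior
    bh = ph * torch.exp(pred[..., 3])
    return torch.stack([bx, by, bw, bh], dim=-1)

anchors = torch.tensor([[1.3, 1.7], [3.2, 4.0], [9.5, 9.7]])
pred = torch.randn(13, 13, 3, 4)
print(decode_boxes(pred, anchors).shape)       # torch.Size([13, 13, 3, 4])
```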