May 7, 2019


Paper Group AWR 98



Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora

Title Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora
Authors William L. Hamilton, Kevin Clark, Jure Leskovec, Dan Jurafsky
Abstract A word’s sentiment depends on the domain in which it is used. Computational social science research thus requires sentiment lexicons that are specific to the domains being studied. We combine domain-specific word embeddings with a label propagation framework to induce accurate domain-specific sentiment lexicons using small sets of seed words, achieving state-of-the-art performance competitive with approaches that rely on hand-curated resources. Using our framework we perform two large-scale empirical studies to quantify the extent to which sentiment varies across time and between communities. We induce and release historical sentiment lexicons for 150 years of English and community-specific sentiment lexicons for 250 online communities from the social media forum Reddit. The historical lexicons show that more than 5% of sentiment-bearing (non-neutral) English words completely switched polarity during the last 150 years, and the community-specific lexicons highlight how sentiment varies drastically between different communities.
Tasks Word Embeddings
Published 2016-06-09
URL http://arxiv.org/abs/1606.02820v2
PDF http://arxiv.org/pdf/1606.02820v2.pdf
PWC https://paperswithcode.com/paper/inducing-domain-specific-sentiment-lexicons
Repo https://github.com/williamleif/socialsent
Framework tf
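
The abstract above describes inducing a lexicon by propagating sentiment from a handful of seed words over a word-embedding similarity graph. Below is a minimal, self-contained sketch of that idea using a random walk with restarts over a k-nearest-neighbour cosine-similarity graph; the toy vocabulary, 2-d "embeddings", and all hyperparameters are illustrative assumptions, not taken from the released socialsent code.

```python
# Sketch: seed-based label propagation over a word-embedding similarity graph.
import numpy as np

def induce_lexicon(vectors, vocab, pos_seeds, neg_seeds, k=2, beta=0.9, iters=50):
    """Return a sentiment polarity score for every word in `vocab`."""
    X = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sim = X @ X.T                                    # cosine similarities
    np.fill_diagonal(sim, -np.inf)
    W = np.zeros_like(sim)
    for i in range(len(vocab)):                      # keep k nearest neighbours
        nn = np.argsort(sim[i])[-k:]
        W[i, nn] = np.clip(sim[i, nn], 0, None)
    T = W / (W.sum(axis=1, keepdims=True) + 1e-12)   # row-stochastic transitions

    def walk(seeds):
        s = np.array([1.0 if w in seeds else 0.0 for w in vocab])
        s /= s.sum()
        p = np.full(len(vocab), 1.0 / len(vocab))
        for _ in range(iters):                       # random walk with restart
            p = beta * (T.T @ p) + (1 - beta) * s
        return p

    p_pos, p_neg = walk(set(pos_seeds)), walk(set(neg_seeds))
    return dict(zip(vocab, p_pos - p_neg))           # >0 leans positive

# Toy usage with made-up 2-d "embeddings":
vocab = ["good", "great", "bad", "awful", "okay"]
vecs = np.array([[1, .9], [.9, 1], [-1, -.8], [-.9, -1], [.1, 0]], float)
print(induce_lexicon(vecs, vocab, pos_seeds=["good"], neg_seeds=["bad"]))
```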

Harnessing the Power of the Crowd to Increase Capacity for Data Science in the Social Sector

Title Harnessing the Power of the Crowd to Increase Capacity for Data Science in the Social Sector
Authors Peter Bull, Isaac Slavitt, Greg Lipstein
Abstract We present three case studies of organizations using a data science competition to answer a pressing question. The first is in education where a nonprofit that creates smart school budgets wanted to automatically tag budget line items. The second is in public health, where a low-cost, nonprofit women’s health care provider wanted to understand the effect of demographic and behavioral questions on predicting which services a woman would need. The third and final example is in government innovation: using online restaurant reviews from Yelp, competitors built models to forecast which restaurants were most likely to have hygiene violations when visited by health inspectors. Finally, we reflect on the unique benefits of the open, public competition model.
Tasks
Published 2016-06-24
URL http://arxiv.org/abs/1606.07781v1
PDF http://arxiv.org/pdf/1606.07781v1.pdf
PWC https://paperswithcode.com/paper/harnessing-the-power-of-the-crowd-to-increase
Repo https://github.com/jsiloto/dengAI
Framework tf

Deep Reconstruction-Classification Networks for Unsupervised Domain Adaptation

Title Deep Reconstruction-Classification Networks for Unsupervised Domain Adaptation
Authors Muhammad Ghifary, W. Bastiaan Kleijn, Mengjie Zhang, David Balduzzi, Wen Li
Abstract In this paper, we propose a novel unsupervised domain adaptation algorithm based on deep learning for visual object recognition. Specifically, we design a new model called Deep Reconstruction-Classification Network (DRCN), which jointly learns a shared encoding representation for two tasks: i) supervised classification of labeled source data, and ii) unsupervised reconstruction of unlabeled target data. In this way, the learnt representation not only preserves discriminability, but also encodes useful information from the target domain. Our new DRCN model can be optimized by backpropagation, similarly to standard neural networks. We evaluate the performance of DRCN on a series of cross-domain object recognition tasks, where DRCN provides a considerable improvement (up to ~8% in accuracy) over the prior state-of-the-art algorithms. Interestingly, we also observe that the reconstruction pipeline of DRCN transforms images from the source domain into images whose appearance resembles the target dataset. This suggests that DRCN’s performance is due to constructing a single composite representation that encodes information about both the structure of target images and the classification of source images. Finally, we provide a formal analysis to justify the algorithm’s objective in the domain adaptation context.
Tasks Domain Adaptation, Object Recognition, Unsupervised Domain Adaptation
Published 2016-07-12
URL http://arxiv.org/abs/1607.03516v2
PDF http://arxiv.org/pdf/1607.03516v2.pdf
PWC https://paperswithcode.com/paper/deep-reconstruction-classification-networks
Repo https://github.com/ghif/drcn
Framework tf
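
As a rough illustration of the joint objective described above, here is a minimal PyTorch sketch that shares an encoder between a source-domain classifier and a target-domain reconstruction decoder and mixes the two losses with a trade-off weight. The architecture sizes and the weight `lam` are illustrative assumptions, not the paper's exact DRCN configuration.

```python
# Sketch: joint source classification + target reconstruction with a shared encoder.
import torch
import torch.nn as nn

class DRCNSketch(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(                # shared representation
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.classifier = nn.Sequential(             # labeled source branch
            nn.Flatten(), nn.Linear(32 * 32 * 32, num_classes))
        self.decoder = nn.Sequential(                # unlabeled target branch
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1))

    def forward(self, x):
        z = self.encoder(x)
        return self.classifier(z), self.decoder(z)

def joint_loss(model, x_src, y_src, x_tgt, lam=0.7):
    logits, _ = model(x_src)
    _, recon = model(x_tgt)
    cls = nn.functional.cross_entropy(logits, y_src)   # supervised source loss
    rec = nn.functional.mse_loss(recon, x_tgt)         # unsupervised target loss
    return lam * cls + (1.0 - lam) * rec

model = DRCNSketch()
x_src, y_src = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
x_tgt = torch.randn(4, 3, 32, 32)
joint_loss(model, x_src, y_src, x_tgt).backward()
```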

Every Filter Extracts A Specific Texture In Convolutional Neural Networks

Title Every Filter Extracts A Specific Texture In Convolutional Neural Networks
Authors Zhiqiang Xia, Ce Zhu, Zhengtao Wang, Qi Guo, Yipeng Liu
Abstract Many works have concentrated on visualizing and understanding the inner mechanism of convolutional neural networks (CNNs) by generating images that activate some specific neurons, an approach called deep visualization. However, it is still unclear what the filters intuitively extract from images. In this paper, we propose a modified code inversion algorithm, called feature map inversion, to understand the function of a filter of interest in CNNs. We reveal that every filter extracts a specific texture. The textures from higher layers contain more colours and more intricate structures. We also demonstrate that the style of an image can be seen as a combination of these texture primitives. Two methods are proposed to reallocate the energy distribution of feature maps randomly and purposefully. We then invert the modified code and generate images of diverse styles. With these results, we provide an explanation of why the Gram matrix of feature maps \cite{Gatys_2016_CVPR} can represent image style.
Tasks
Published 2016-08-15
URL http://arxiv.org/abs/1608.04170v2
PDF http://arxiv.org/pdf/1608.04170v2.pdf
PWC https://paperswithcode.com/paper/every-filter-extracts-a-specific-texture-in
Repo https://github.com/xzqjack/FeatureMapInversion
Framework mxnet
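
The core idea above—reallocating the energy of a feature map and inverting the modified code back into image space—can be sketched in a few lines of PyTorch. A tiny, randomly initialized conv stack stands in for a pretrained CNN so the example stays self-contained; the filter index, scaling factors, and optimizer settings are illustrative.

```python
# Sketch: boost one filter's feature map and optimize an image to match the modified code.
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU()).eval()
for p in cnn.parameters():
    p.requires_grad_(False)

content = torch.rand(1, 3, 64, 64)
with torch.no_grad():
    code = cnn(content)

# Reallocate energy: amplify the filter of interest, damp the rest.
filter_of_interest = 3
target = code * 0.1
target[:, filter_of_interest] = code[:, filter_of_interest] * 10.0

x = content.clone().requires_grad_(True)            # image being optimized
opt = torch.optim.Adam([x], lr=0.05)
for step in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(cnn(x), target)    # match the modified code
    loss.backward()
    opt.step()
# For a real pretrained CNN, `x` now emphasizes the chosen filter's texture.
```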

Technical Report: Towards a Universal Code Formatter through Machine Learning

Title Technical Report: Towards a Universal Code Formatter through Machine Learning
Authors Terence Parr, Jurgen Vinju
Abstract There are many declarative frameworks that allow us to implement code formatters relatively easily for any specific language, but constructing them is cumbersome. The first problem is that “everybody” wants to format their code differently, leading to either many formatter variants or a ridiculous number of configuration options. Second, the size of each implementation scales with a language’s grammar size, leading to hundreds of rules. In this paper, we solve the formatter construction problem using a novel approach, one that automatically derives formatters for any given language without intervention from a language expert. We introduce a code formatter called CodeBuff that uses machine learning to abstract formatting rules from a representative corpus, using a carefully designed feature set. Our experiments on Java, SQL, and ANTLR grammars show that CodeBuff is efficient, has excellent accuracy, and is grammar invariant for a given language. It also generalizes to a 4th language tested during manuscript preparation.
Tasks
Published 2016-06-28
URL http://arxiv.org/abs/1606.08866v1
PDF http://arxiv.org/pdf/1606.08866v1.pdf
PWC https://paperswithcode.com/paper/technical-report-towards-a-universal-code
Repo https://github.com/antlr/codebuff
Framework none
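
CodeBuff learns formatting from a corpus rather than from hand-written rules; a toy way to picture this is a nearest-neighbour lookup from token-context features to whitespace decisions. The sketch below is only that cartoon: the feature tuple, the distance function, and the tiny "corpus" are made-up illustrations, not CodeBuff's actual feature set or algorithm.

```python
# Sketch: predict the whitespace before a token from the k nearest corpus examples.
from collections import Counter

def features(prev_tok, cur_tok, depth):
    return (prev_tok, cur_tok, depth)

def distance(a, b):
    return sum(x != y for x, y in zip(a, b))         # simple mismatch count

# (features, whitespace decision) pairs harvested from a hand-formatted corpus.
corpus = [
    (features("{", "int", 1), "\n    "),
    (features(";", "int", 1), "\n    "),
    (features("int", "x", 1), " "),
    (features("x", "=", 1), " "),
    (features("=", "0", 1), " "),
    (features("0", ";", 1), ""),
]

def predict_ws(feat, k=3):
    nearest = sorted(corpus, key=lambda ex: distance(ex[0], feat))[:k]
    return Counter(ws for _, ws in nearest).most_common(1)[0][0]

print(repr(predict_ws(features(";", "int", 1))))     # likely a newline + indent
```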

A MultiPath Network for Object Detection

Title A MultiPath Network for Object Detection
Authors Sergey Zagoruyko, Adam Lerer, Tsung-Yi Lin, Pedro O. Pinheiro, Sam Gross, Soumith Chintala, Piotr Dollár
Abstract The recent COCO object detection dataset presents several new challenges for object detection. In particular, it contains objects at a broad range of scales, less prototypical images, and requires more precise localization. To address these challenges, we test three modifications to the standard Fast R-CNN object detector: (1) skip connections that give the detector access to features at multiple network layers, (2) a foveal structure to exploit object context at multiple object resolutions, and (3) an integral loss function and corresponding network adjustment that improve localization. The result of these modifications is that information can flow along multiple paths in our network, including through features from multiple network layers and from multiple object views. We refer to our modified classifier as a “MultiPath” network. We couple our MultiPath network with DeepMask object proposals, which are well suited for localization and small objects, and adapt our pipeline to predict segmentation masks in addition to bounding boxes. The combined system improves results over the baseline Fast R-CNN detector with Selective Search by 66% overall and by 4x on small objects. It placed second in both the COCO 2015 detection and segmentation challenges.
Tasks Instance Segmentation, Object Detection
Published 2016-04-07
URL http://arxiv.org/abs/1604.02135v2
PDF http://arxiv.org/pdf/1604.02135v2.pdf
PWC https://paperswithcode.com/paper/a-multipath-network-for-object-detection
Repo https://github.com/facebookresearch/multipathnet
Framework torch
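
One of the three modifications above is the "foveal" structure, which classifies each proposal using several progressively larger context regions. The sketch below shows just that cropping step in PyTorch; the scale factors and output size are illustrative, and in the real network each crop feeds its own classifier stream.

```python
# Sketch: crop progressively larger context regions around a proposal box.
import torch
import torch.nn.functional as F

def foveal_crops(image, box, scales=(1.0, 1.5, 2.0, 4.0), out=64):
    """image: (C, H, W) tensor; box: (x1, y1, x2, y2) in pixels."""
    _, H, W = image.shape
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w, h = x2 - x1, y2 - y1
    crops = []
    for s in scales:
        nx1 = max(int(cx - s * w / 2), 0)
        ny1 = max(int(cy - s * h / 2), 0)
        nx2 = min(int(cx + s * w / 2), W)
        ny2 = min(int(cy + s * h / 2), H)
        patch = image[:, ny1:ny2, nx1:nx2].unsqueeze(0)
        crops.append(F.interpolate(patch, size=(out, out), mode="bilinear",
                                   align_corners=False))
    return torch.cat(crops)                           # (len(scales), C, out, out)

img = torch.rand(3, 240, 320)
print(foveal_crops(img, (100, 80, 180, 160)).shape)   # torch.Size([4, 3, 64, 64])
```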

Structured illumination microscopy image reconstruction algorithm

Title Structured illumination microscopy image reconstruction algorithm
Authors Amit Lal, Chunyan Shan, Peng Xi
Abstract Structured illumination microscopy (SIM) is a very important super-resolution microscopy technique, which provides high-speed super-resolution with about two-fold spatial resolution enhancement. Several attempts aimed at improving the performance of the SIM reconstruction algorithm have been reported. However, most of these highlight only one specific aspect of SIM reconstruction – such as accurately determining the illumination pattern phase shift – whereas other key elements – such as determination of the modulation factor, estimation of the object power spectrum, Wiener filtering of frequency components with inclusion of object power spectrum information, and translocating and merging the overlapping frequency components – are usually glossed over superficially. In addition, most of the reported work lies scattered throughout the literature, and a comprehensive review of the theoretical background is lacking. The purpose of the present work is two-fold: 1) to collect the essential theoretical details of the SIM algorithm in one place, thereby making them readily accessible to readers for the first time; and 2) to provide an open-source SIM reconstruction code (named OpenSIM), which enables users to interactively vary the code parameters and study their effect on the reconstructed SIM image.
Tasks Image Reconstruction, Super-Resolution
Published 2016-02-19
URL http://arxiv.org/abs/1602.06904v1
PDF http://arxiv.org/pdf/1602.06904v1.pdf
PWC https://paperswithcode.com/paper/structured-illumination-microscopy-image
Repo https://github.com/iandobbie/CUDA_SIMrecon
Framework none
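
Among the reconstruction steps listed above, the Wiener filtering of separated frequency components is easy to show in isolation. The numpy sketch below applies a Wiener filter to a single blurred band using a toy Gaussian OTF and an assumed noise constant `w`; it is not the full OpenSIM pipeline, which also estimates pattern phases, the modulation factor, and the object power spectrum.

```python
# Sketch: Wiener-filter one frequency component given the system OTF.
import numpy as np

def wiener_filter(component_ft, otf, w=0.1):
    """Divide out the OTF with a regularizing noise constant."""
    return component_ft * np.conj(otf) / (np.abs(otf) ** 2 + w)

n = 256
fy, fx = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
otf = np.exp(-(fx ** 2 + fy ** 2) / (2 * 0.1 ** 2))        # toy Gaussian OTF

rng = np.random.default_rng(0)
obj = rng.random((n, n))                                    # toy object
raw_ft = np.fft.fft2(obj) * otf                             # blurred acquisition
restored = np.fft.ifft2(wiener_filter(raw_ft, otf)).real    # one filtered band
print(restored.shape)
```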

Eye Tracking for Everyone

Title Eye Tracking for Everyone
Authors Kyle Krafka, Aditya Khosla, Petr Kellnhofer, Harini Kannan, Suchendra Bhandarkar, Wojciech Matusik, Antonio Torralba
Abstract From scientific research to commercial applications, eye tracking is an important tool across many domains. Despite its range of applications, eye tracking has yet to become a pervasive technology. We believe that we can put the power of eye tracking in everyone’s palm by building eye tracking software that works on commodity hardware such as mobile phones and tablets, without the need for additional sensors or devices. We tackle this problem by introducing GazeCapture, the first large-scale dataset for eye tracking, containing data from over 1450 people consisting of almost 2.5M frames. Using GazeCapture, we train iTracker, a convolutional neural network for eye tracking, which achieves a significant reduction in error over previous approaches while running in real time (10-15fps) on a modern mobile device. Our model achieves a prediction error of 1.71cm and 2.53cm without calibration on mobile phones and tablets respectively. With calibration, this is reduced to 1.34cm and 2.12cm. Further, we demonstrate that the features learned by iTracker generalize well to other datasets, achieving state-of-the-art results. The code, data, and models are available at http://gazecapture.csail.mit.edu.
Tasks Calibration, Eye Tracking, Gaze Estimation
Published 2016-06-18
URL http://arxiv.org/abs/1606.05814v1
PDF http://arxiv.org/pdf/1606.05814v1.pdf
PWC https://paperswithcode.com/paper/eye-tracking-for-everyone
Repo https://github.com/CSAILVision/GazeCapture
Framework none
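
The abstract above describes a CNN that combines face and eye crops with a coarse face-position grid to regress a 2-d gaze point. The PyTorch sketch below mirrors that multi-input layout with small illustrative streams; the layer sizes, crop resolutions, and 25x25 grid here are assumptions rather than the published iTracker architecture.

```python
# Sketch: separate conv streams for face and eyes plus a face-grid branch, fused into (x, y).
import torch
import torch.nn as nn

def conv_stream():
    return nn.Sequential(nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                         nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(4), nn.Flatten())    # -> 512 features

class GazeNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.face, self.left, self.right = conv_stream(), conv_stream(), conv_stream()
        self.grid_fc = nn.Sequential(nn.Flatten(), nn.Linear(25 * 25, 128), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(3 * 512 + 128, 128), nn.ReLU(),
                                  nn.Linear(128, 2))               # (x, y) gaze point

    def forward(self, face, left_eye, right_eye, face_grid):
        feats = torch.cat([self.face(face), self.left(left_eye),
                           self.right(right_eye), self.grid_fc(face_grid)], dim=1)
        return self.head(feats)

net = GazeNetSketch()
out = net(torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64),
          torch.rand(2, 3, 64, 64), torch.rand(2, 1, 25, 25))
print(out.shape)   # torch.Size([2, 2])
```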

Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs

Title Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs
Authors Siddhartha Chandra, Iasonas Kokkinos
Abstract In this work we propose a structured prediction technique that combines the virtues of Gaussian Conditional Random Fields (G-CRF) with Deep Learning: (a) our structured prediction task has a unique global optimum that is obtained exactly from the solution of a linear system, (b) the gradients of our model parameters are computed analytically using closed-form expressions, in contrast to the memory-demanding contemporary deep structured prediction approaches that rely on back-propagation-through-time, (c) our pairwise terms do not have to be simple hand-crafted expressions, as in the line of works building on the DenseCRF, but can rather be ‘discovered’ from data through deep architectures, and (d) our system can be trained in an end-to-end manner. Building on standard tools from numerical analysis we develop very efficient algorithms for inference and learning, as well as a customized technique adapted to the semantic segmentation task. This efficiency allows us to explore more sophisticated architectures for structured prediction in deep learning: we introduce multi-resolution architectures to couple information across scales in a joint optimization framework, yielding systematic improvements. We demonstrate the utility of our approach on the challenging VOC PASCAL 2012 image segmentation benchmark, showing substantial improvements over strong baselines. We make all of our code and experiments available at https://github.com/siddharthachandra/gcrf
Tasks Semantic Segmentation, Structured Prediction
Published 2016-03-28
URL http://arxiv.org/abs/1603.08358v4
PDF http://arxiv.org/pdf/1603.08358v4.pdf
PWC https://paperswithcode.com/paper/fast-exact-and-multi-scale-inference-for
Repo https://github.com/siddharthachandra/gcrf
Framework none
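
The key property claimed in (a) above is that inference reduces to solving a positive-definite linear system exactly. The numpy sketch below builds a toy system of that form, with A standing in for the pairwise terms and b for the CNN's unary scores, and solves it with a plain conjugate-gradient loop; the problem size and regularization are illustrative.

```python
# Sketch: exact G-CRF-style inference as a positive-definite linear solve (A + lam*I) x = b.
import numpy as np

def conjugate_gradient(A, b, iters=100, tol=1e-8):
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    for _ in range(iters):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) < tol:
            break
        p = r_new + (r_new @ r_new) / (r @ r) * p
        r = r_new
    return x

rng = np.random.default_rng(0)
n, lam = 50, 1.0                       # n would be pixels * labels in the real model
M = rng.standard_normal((n, n))
A = M @ M.T / n + lam * np.eye(n)      # pairwise terms, made positive definite
b = rng.standard_normal(n)             # unary (CNN) scores
x = conjugate_gradient(A, b)
print(np.allclose(A @ x, b, atol=1e-5))
```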

Design of Efficient Convolutional Layers using Single Intra-channel Convolution, Topological Subdivisioning and Spatial “Bottleneck” Structure

Title Design of Efficient Convolutional Layers using Single Intra-channel Convolution, Topological Subdivisioning and Spatial “Bottleneck” Structure
Authors Min Wang, Baoyuan Liu, Hassan Foroosh
Abstract Deep convolutional neural networks achieve remarkable visual recognition performance at the cost of high computational complexity. In this paper, we present a new design of efficient convolutional layers based on three schemes. The 3D convolution operation in a convolutional layer can be considered as performing spatial convolution in each channel and linear projection across channels simultaneously. By unravelling them and arranging the spatial convolution sequentially, the proposed layer is composed of a single intra-channel convolution, whose computation is negligible, and a linear channel projection. A topological subdivisioning is adopted to reduce the connections between the input channels and output channels. Additionally, we also introduce a spatial “bottleneck” structure that utilizes a convolution-projection-deconvolution pipeline to take advantage of the correlation between adjacent pixels in the input. Our experiments demonstrate that the proposed layers remarkably outperform standard convolutional layers with regard to the accuracy/complexity ratio. Our models achieve similar accuracy to VGG, ResNet-50, and ResNet-101 while requiring 42, 4.5, and 6.5 times less computation, respectively.
Tasks
Published 2016-08-15
URL http://arxiv.org/abs/1608.04337v2
PDF http://arxiv.org/pdf/1608.04337v2.pdf
PWC https://paperswithcode.com/paper/design-of-efficient-convolutional-layers
Repo https://github.com/asfathermou/human-computer-interaction
Framework tf
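
The first of the three schemes above factorizes a dense 3D convolution into a single intra-channel spatial convolution followed by a linear channel projection. A minimal PyTorch sketch of that factorization (implemented here with a grouped convolution plus a 1x1 convolution) is shown below, with a parameter-count comparison against a dense 3x3 layer; the channel counts are illustrative.

```python
# Sketch: single intra-channel spatial convolution + 1x1 linear channel projection.
import torch
import torch.nn as nn

class IntraChannelConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        # groups=in_ch -> each spatial filter sees exactly one input channel.
        self.spatial = nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch)
        self.project = nn.Conv2d(in_ch, out_ch, 1)   # linear channel projection
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.project(self.spatial(x)))

x = torch.rand(1, 64, 56, 56)
block = IntraChannelConvBlock(64, 128)
dense = nn.Conv2d(64, 128, 3, padding=1)
params = lambda m: sum(p.numel() for p in m.parameters())
print(block(x).shape, params(block), params(dense))  # far fewer parameters than dense 3x3
```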

Recursive nonlinear-system identification using latent variables

Title Recursive nonlinear-system identification using latent variables
Authors Per Mattsson, Dave Zachariah, Petre Stoica
Abstract In this paper we develop a method for learning nonlinear systems with multiple outputs and inputs. We begin by modelling the errors of a nominal predictor of the system using a latent variable framework. Then using the maximum likelihood principle we derive a criterion for learning the model. The resulting optimization problem is tackled using a majorization-minimization approach. Finally, we develop a convex majorization technique and show that it enables a recursive identification method. The method learns parsimonious predictive models and is tested on both synthetic and real nonlinear systems.
Tasks
Published 2016-06-14
URL http://arxiv.org/abs/1606.04366v3
PDF http://arxiv.org/pdf/1606.04366v3.pdf
PWC https://paperswithcode.com/paper/recursive-nonlinear-system-identification
Repo https://github.com/magni84/lava
Framework none

Deep Biaffine Attention for Neural Dependency Parsing

Title Deep Biaffine Attention for Neural Dependency Parsing
Authors Timothy Dozat, Christopher D. Manning
Abstract This paper builds off recent work from Kiperwasser & Goldberg (2016) using neural attention in a simple graph-based dependency parser. We use a larger but more thoroughly regularized parser than other recent BiLSTM-based approaches, with biaffine classifiers to predict arcs and labels. Our parser gets state-of-the-art or near state-of-the-art performance on standard treebanks for six different languages, achieving 95.7% UAS and 94.1% LAS on the most popular English PTB dataset. This makes it the highest-performing graph-based parser on this benchmark—outperforming Kiperwasser & Goldberg (2016) by 1.8% and 2.2%—and comparable to the highest-performing transition-based parser (Kuncoro et al., 2016), which achieves 95.8% UAS and 94.6% LAS. We also show which hyperparameter choices had a significant effect on parsing accuracy, allowing us to achieve large gains over other graph-based approaches.
Tasks Dependency Parsing
Published 2016-11-06
URL http://arxiv.org/abs/1611.01734v3
PDF http://arxiv.org/pdf/1611.01734v3.pdf
PWC https://paperswithcode.com/paper/deep-biaffine-attention-for-neural-dependency
Repo https://github.com/ITUnlp/UniParse
Framework none
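
The biaffine classifier mentioned above scores every (dependent, head) word pair with a bilinear term plus a head-only bias. The PyTorch sketch below implements just that arc scorer over precomputed "dependent" and "head" vectors; the hidden dimension and random inputs are illustrative, and the real parser produces these vectors with MLPs over BiLSTM states and adds a separate label scorer.

```python
# Sketch: biaffine arc scoring s_ij = h_dep_i^T U h_head_j + h_head_j^T u.
import torch
import torch.nn as nn

class BiaffineArcScorer(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.U = nn.Parameter(torch.randn(dim, dim) * 0.01)   # bilinear weight
        self.u = nn.Parameter(torch.zeros(dim))                # head-only bias weight

    def forward(self, h_dep, h_head):
        """h_dep, h_head: (batch, seq_len, dim) -> scores (batch, dep, head)."""
        bilinear = torch.einsum("bid,de,bje->bij", h_dep, self.U, h_head)
        head_bias = torch.einsum("bjd,d->bj", h_head, self.u).unsqueeze(1)
        return bilinear + head_bias     # row i: score of every candidate head for word i

scorer = BiaffineArcScorer()
h_dep, h_head = torch.rand(2, 10, 128), torch.rand(2, 10, 128)
scores = scorer(h_dep, h_head)
pred_heads = scores.argmax(dim=-1)      # greedy head choice per word
print(scores.shape, pred_heads.shape)   # (2, 10, 10) and (2, 10)
```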

Speed/accuracy trade-offs for modern convolutional object detectors

Title Speed/accuracy trade-offs for modern convolutional object detectors
Authors Jonathan Huang, Vivek Rathod, Chen Sun, Menglong Zhu, Anoop Korattikara, Alireza Fathi, Ian Fischer, Zbigniew Wojna, Yang Song, Sergio Guadarrama, Kevin Murphy
Abstract The goal of this paper is to serve as a guide for selecting a detection architecture that achieves the right speed/memory/accuracy balance for a given application and platform. To this end, we investigate various ways to trade accuracy for speed and memory usage in modern convolutional object detection systems. A number of successful systems have been proposed in recent years, but apples-to-apples comparisons are difficult due to different base feature extractors (e.g., VGG, Residual Networks), different default image resolutions, as well as different hardware and software platforms. We present a unified implementation of the Faster R-CNN [Ren et al., 2015], R-FCN [Dai et al., 2016] and SSD [Liu et al., 2015] systems, which we view as “meta-architectures” and trace out the speed/accuracy trade-off curve created by using alternative feature extractors and varying other critical parameters such as image size within each of these meta-architectures. On one extreme end of this spectrum where speed and memory are critical, we present a detector that achieves real time speeds and can be deployed on a mobile device. On the opposite end in which accuracy is critical, we present a detector that achieves state-of-the-art performance measured on the COCO detection task.
Tasks Object Detection
Published 2016-11-30
URL http://arxiv.org/abs/1611.10012v3
PDF http://arxiv.org/pdf/1611.10012v3.pdf
PWC https://paperswithcode.com/paper/speedaccuracy-trade-offs-for-modern
Repo https://github.com/Qengineering/TensorFlow_Lite_RPi_64-bits
Framework tf

End-to-end Learning of Deep Visual Representations for Image Retrieval

Title End-to-end Learning of Deep Visual Representations for Image Retrieval
Authors Albert Gordo, Jon Almazan, Jerome Revaud, Diane Larlus
Abstract While deep learning has become a key ingredient in the top performing methods for many computer vision tasks, it has failed so far to bring similar improvements to instance-level image retrieval. In this article, we argue that reasons for the underwhelming results of deep methods on image retrieval are threefold: i) noisy training data, ii) inappropriate deep architecture, and iii) suboptimal training procedure. We address all three issues. First, we leverage a large-scale but noisy landmark dataset and develop an automatic cleaning method that produces a suitable training set for deep retrieval. Second, we build on the recent R-MAC descriptor, show that it can be interpreted as a deep and differentiable architecture, and present improvements to enhance it. Last, we train this network with a siamese architecture that combines three streams with a triplet loss. At the end of the training process, the proposed architecture produces a global image representation in a single forward pass that is well suited for image retrieval. Extensive experiments show that our approach significantly outperforms previous retrieval approaches, including state-of-the-art methods based on costly local descriptor indexing and spatial verification. On Oxford 5k, Paris 6k and Holidays, we respectively report 94.7, 96.6, and 94.8 mean average precision. Our representations can also be heavily compressed using product quantization with little loss in accuracy. For additional material, please see www.xrce.xerox.com/Deep-Image-Retrieval.
Tasks Image Retrieval, Quantization
Published 2016-10-25
URL http://arxiv.org/abs/1610.07940v2
PDF http://arxiv.org/pdf/1610.07940v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-learning-of-deep-visual
Repo https://github.com/almazan/deep-image-retrieval
Framework pytorch
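
The descriptor described above builds on R-MAC: regional max-pooling of convolutional activations, per-region l2-normalization, summation, and a final l2-normalization. The PyTorch sketch below implements that aggregation over a simple fixed grid of regions; the multi-scale region sampling and PCA/whitening used in practice are omitted, and the feature maps here are random stand-ins.

```python
# Sketch: R-MAC-style aggregation of regional max-pooled conv features into one descriptor.
import torch
import torch.nn.functional as F

def rmac_descriptor(feat, grid=3):
    """feat: (C, H, W) conv feature map -> (C,) global descriptor."""
    C, H, W = feat.shape
    regions = []
    for i in range(grid):
        for j in range(grid):
            y0, y1 = i * H // grid, (i + 1) * H // grid
            x0, x1 = j * W // grid, (j + 1) * W // grid
            r = feat[:, y0:y1, x0:x1].amax(dim=(1, 2))      # regional max-pool
            regions.append(F.normalize(r, dim=0))           # per-region l2 norm
    return F.normalize(torch.stack(regions).sum(dim=0), dim=0)

feat_a, feat_b = torch.rand(256, 14, 14), torch.rand(256, 14, 14)
d_a, d_b = rmac_descriptor(feat_a), rmac_descriptor(feat_b)
print(float(d_a @ d_b))   # cosine similarity used for retrieval ranking
```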

YOLO9000: Better, Faster, Stronger

Title YOLO9000: Better, Faster, Stronger
Authors Joseph Redmon, Ali Farhadi
Abstract We introduce YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. First we propose various improvements to the YOLO detection method, both novel and drawn from prior work. The improved model, YOLOv2, is state-of-the-art on standard detection tasks like PASCAL VOC and COCO. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like Faster RCNN with ResNet and SSD while still running significantly faster. Finally we propose a method to jointly train on object detection and classification. Using this method we train YOLO9000 simultaneously on the COCO detection dataset and the ImageNet classification dataset. Our joint training allows YOLO9000 to predict detections for object classes that don’t have labelled detection data. We validate our approach on the ImageNet detection task. YOLO9000 gets 19.7 mAP on the ImageNet detection validation set despite only having detection data for 44 of the 200 classes. On the 156 classes not in COCO, YOLO9000 gets 16.0 mAP. But YOLO can detect more than just 200 classes; it predicts detections for more than 9000 different object categories. And it still runs in real-time.
Tasks Object Detection, Real-Time Object Detection
Published 2016-12-25
URL http://arxiv.org/abs/1612.08242v1
PDF http://arxiv.org/pdf/1612.08242v1.pdf
PWC https://paperswithcode.com/paper/yolo9000-better-faster-stronger
Repo https://github.com/yuliani29/yolotraining
Framework none
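
A concrete piece of the YOLOv2 improvements above is the anchor-based box parameterization, where each grid cell predicts sigmoid-bounded center offsets and exponential scalings of prior box sizes. The sketch below decodes raw predictions with that parameterization; the 13x13 grid and the three anchor shapes are illustrative.

```python
# Sketch: decode YOLOv2-style raw predictions (tx, ty, tw, th) into boxes in grid-cell units.
import torch

def decode_boxes(pred, anchors, grid=13):
    """pred: (grid, grid, num_anchors, 4) raw tx, ty, tw, th."""
    cy, cx = torch.meshgrid(torch.arange(grid), torch.arange(grid), indexing="ij")
    cx = cx[..., None].float()
    cy = cy[..., None].float()
    pw, ph = anchors[:, 0], anchors[:, 1]
    bx = torch.sigmoid(pred[..., 0]) + cx      # center stays inside its cell
    by = torch.sigmoid(pred[..., 1]) + cy
    bw = pw * torch.exp(pred[..., 2])          # size scales the anchor prior
    bh = ph * torch.exp(pred[..., 3])
    return torch.stack([bx, by, bw, bh], dim=-1)

anchors = torch.tensor([[1.3, 1.7], [3.2, 4.0], [9.5, 9.7]])
pred = torch.randn(13, 13, 3, 4)
print(decode_boxes(pred, anchors).shape)       # torch.Size([13, 13, 3, 4])
```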