May 7, 2019

Paper Group AWR 21

Strategyproof Peer Selection using Randomization, Partitioning, and Apportionment


Title	Strategyproof Peer Selection using Randomization, Partitioning, and Apportionment
Authors	Haris Aziz, Omer Lev, Nicholas Mattei, Jeffrey S. Rosenschein, Toby Walsh
Abstract	Peer reviews, evaluations, and selections are a fundamental aspect of modern science. Funding bodies the world over employ experts to review and select the best proposals from those submitted for funding. The problem of peer selection, however, is much more general: a professional society may want to give a subset of its members awards based on the opinions of all members; an instructor for a Massive Open Online Course (MOOC) or an online course may want to crowdsource grading; or a marketing company may select ideas from group brainstorming sessions based on peer evaluation. We make three fundamental contributions to the study of peer selection, a specific type of group decision-making problem, studied in computer science, economics, and political science. First, we propose a novel mechanism that is strategyproof, i.e., agents cannot benefit by reporting insincere valuations. Second, we demonstrate the effectiveness of our mechanism by a comprehensive simulation-based comparison with a suite of mechanisms found in the literature. Finally, our mechanism employs a randomized rounding technique that is of independent interest, as it solves the apportionment problem that arises in various settings where discrete resources such as parliamentary representation slots need to be divided proportionally.
Tasks	Decision Making
Published	2016-04-13
URL	http://arxiv.org/abs/1604.03632v4
PDF	http://arxiv.org/pdf/1604.03632v4.pdf
PWC	https://paperswithcode.com/paper/strategyproof-peer-selection-using
Repo	https://github.com/nmattei/peerselection
Framework	none
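
The abstract mentions a randomized rounding technique for apportionment but does not describe it. As a rough illustration of the problem only, the sketch below uses systematic sampling, a standard technique that is an assumption here and not necessarily the authors' method. The function name `randomized_apportionment` and the example quotas are hypothetical.

```python
import math
import random
from fractions import Fraction


def randomized_apportionment(quotas, rng=random):
    """Round fractional quotas to whole seats without changing the total.

    Every entry ends at floor(q) or ceil(q), and the chance of rounding up
    equals the fractional part, so the expected seat count equals the quota.
    """
    quotas = [Fraction(q) for q in quotas]
    if sum(quotas).denominator != 1:
        raise ValueError("quotas must sum to a whole number")

    seats = [math.floor(q) for q in quotas]
    remainders = [q - s for q, s in zip(quotas, seats)]

    # Systematic sampling (an assumed technique): lay the remainders end to
    # end on a line and mark the points u, u + 1, u + 2, ... for one uniform
    # draw u in [0, 1). Each remainder is shorter than 1, so it catches at
    # most one point.
    u = Fraction(rng.random())
    start = Fraction(0)
    for i, r in enumerate(remainders):
        end = start + r
        first_point = math.ceil(start - u)  # smallest j with u + j >= start
        if first_point < end - u:           # is u + j still before end?
            seats[i] += 1
        start = end
    return seats


# Hypothetical example: 5 seats split among shares of 7/3, 5/3 and 1.
quotas = [Fraction(7, 3), Fraction(5, 3), Fraction(1)]
print(randomized_apportionment(quotas, random.Random(0)))
```

Exact `Fraction` arithmetic is used so that the remainders sum to a whole number without floating-point drift; passing floats would generally trip the sum check.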

Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning


Title	Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
Authors	Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi
Abstract	Very deep convolutional networks have been central to the largest advances in image recognition performance in recent years. One example is the Inception architecture that has been shown to achieve very good performance at relatively low computational cost. Recently, the introduction of residual connections in conjunction with a more traditional architecture has yielded state-of-the-art performance in the 2015 ILSVRC challenge; its performance was similar to the latest generation Inception-v3 network. This raises the question of whether there is any benefit in combining the Inception architecture with residual connections. Here we give clear empirical evidence that training with residual connections accelerates the training of Inception networks significantly. There is also some evidence of residual Inception networks outperforming similarly expensive Inception networks without residual connections by a thin margin. We also present several new streamlined architectures for both residual and non-residual Inception networks. These variations improve the single-frame recognition performance on the ILSVRC 2012 classification task significantly. We further demonstrate how proper activation scaling stabilizes the training of very wide residual Inception networks. With an ensemble of three residual and one Inception-v4, we achieve 3.08 percent top-5 error on the test set of the ImageNet classification (CLS) challenge.
Tasks	Image Classification
Published	2016-02-23
URL	http://arxiv.org/abs/1602.07261v2
PDF	http://arxiv.org/pdf/1602.07261v2.pdf
PWC	https://paperswithcode.com/paper/inception-v4-inception-resnet-and-the-impact
Repo	https://github.com/nknytk/ml-study
Framework	none

Adversarial Machine Learning at Scale


Title	Adversarial Machine Learning at Scale
Authors	Alexey Kurakin, Ian Goodfellow, Samy Bengio
Abstract	Adversarial examples are malicious inputs designed to fool machine learning models. They often transfer from one model to another, allowing attackers to mount black box attacks without knowledge of the target model’s parameters. Adversarial training is the process of explicitly training a model on adversarial examples, in order to make it more robust to attack or to reduce its test error on clean inputs. So far, adversarial training has primarily been applied to small problems. In this research, we apply adversarial training to ImageNet. Our contributions include: (1) recommendations for how to successfully scale adversarial training to large models and datasets, (2) the observation that adversarial training confers robustness to single-step attack methods, (3) the finding that multi-step attack methods are somewhat less transferable than single-step attack methods, so single-step attacks are the best for mounting black-box attacks, and (4) resolution of a “label leaking” effect that causes adversarially trained models to perform better on adversarial examples than on clean examples, because the adversarial example construction process uses the true label and the model can learn to exploit regularities in the construction process.
Tasks
Published	2016-11-04
URL	http://arxiv.org/abs/1611.01236v2
PDF	http://arxiv.org/pdf/1611.01236v2.pdf
PWC	https://paperswithcode.com/paper/adversarial-machine-learning-at-scale
Repo	https://github.com/cs-giung/course-dl-TP
Framework	pytorch
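
To make "single-step attack" concrete, here is a minimal sketch of one well-known single-step method, the fast gradient sign method. The abstract does not name a specific attack, so that choice is an assumption, as are the toy logistic-regression model, the weights, and the step size `eps`.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss(w, b, x, y):
    """Binary cross-entropy of a logistic model on a single example."""
    p = sigmoid(x @ w + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))


def fgsm(w, b, x, y, eps):
    """Single-step attack: shift x by eps along the sign of the loss gradient.

    Note that the true label y is used to build the perturbation, which is
    the ingredient the abstract connects to the "label leaking" effect.
    """
    p = sigmoid(x @ w + b)
    grad_x = (p - y) * w  # gradient of the loss with respect to x
    return x + eps * np.sign(grad_x)


# Hypothetical toy model and input.
w = np.array([2.0, -1.0, 0.5])
b = 0.1
x = np.array([0.3, -0.2, 0.8])
y = 1.0

x_adv = fgsm(w, b, x, y, eps=0.1)
print("clean loss:", loss(w, b, x, y))
print("adversarial loss:", loss(w, b, x_adv, y))
```

Adversarial training, in this picture, would mix examples such as `x_adv` into each training batch; that loop is omitted here.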

Understanding How Image Quality Affects Deep Neural Networks


Title	Understanding How Image Quality Affects Deep Neural Networks
Authors	Samuel Dodge, Lina Karam
Abstract	Image quality is an important practical challenge that is often overlooked in the design of machine vision systems. Commonly, machine vision systems are trained and tested on high quality image datasets, yet in practical applications the input images cannot be assumed to be of high quality. Recently, deep neural networks have obtained state-of-the-art performance on many machine vision tasks. In this paper we provide an evaluation of 4 state-of-the-art deep neural network models for image classification under quality distortions. We consider five types of quality distortions: blur, noise, contrast, JPEG, and JPEG2000 compression. We show that the existing networks are susceptible to these quality distortions, particularly to blur and noise. These results enable future work in developing deep neural networks that are more invariant to quality distortions.
Tasks	Image Classification
Published	2016-04-14
URL	http://arxiv.org/abs/1604.04004v2
PDF	http://arxiv.org/pdf/1604.04004v2.pdf
PWC	https://paperswithcode.com/paper/understanding-how-image-quality-affects-deep
Repo	https://github.com/premthomas/keras-image-classification
Framework	tf
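
The two distortions the abstract singles out, blur and noise, can be sketched in a few lines of NumPy. This is only an illustration of what such distortions look like on a grayscale image in the range [0, 1]; the kernel radius, the `sigma` values, and the function names are assumptions, not the settings used in the paper.

```python
import numpy as np


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clip back to the valid [0, 1] range."""
    noisy = img + rng.normal(0.0, sigma, size=img.shape)
    return np.clip(noisy, 0.0, 1.0)


def gaussian_blur(img, sigma):
    """Separable Gaussian blur of a 2D image with edge padding."""
    radius = max(1, int(3 * sigma))
    xs = np.arange(-radius, radius + 1)
    kernel = np.exp(-(xs ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()  # normalize so overall brightness is preserved

    padded = np.pad(img, radius, mode="edge")
    # Blur along rows first, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


# Hypothetical test image: a 16x16 checkerboard.
rng = np.random.default_rng(0)
img = (np.indices((16, 16)).sum(axis=0) % 2).astype(float)
blurred = gaussian_blur(img, sigma=1.5)
noisy = add_gaussian_noise(img, sigma=0.2, rng=rng)
print(img.var(), blurred.var())
```

An evaluation in the spirit of the abstract would sweep `sigma` and record classifier accuracy at each level.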

LightNet: A Versatile, Standalone Matlab-based Environment for Deep Learning


Title	LightNet: A Versatile, Standalone Matlab-based Environment for Deep Learning
Authors	Chengxi Ye, Chen Zhao, Yezhou Yang, Cornelia Fermuller, Yiannis Aloimonos
Abstract	LightNet is a lightweight, versatile and purely Matlab-based deep learning framework. The idea underlying its design is to provide an easy-to-understand, easy-to-use and efficient computational platform for deep learning research. The implemented framework supports major deep learning architectures such as Multilayer Perceptron Networks (MLP), Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). The framework also supports both CPU and GPU computation, and the switch between them is straightforward. Different applications in computer vision, natural language processing and robotics are demonstrated as experiments.
Tasks
Published	2016-05-09
URL	http://arxiv.org/abs/1605.02766v3
PDF	http://arxiv.org/pdf/1605.02766v3.pdf
PWC	https://paperswithcode.com/paper/lightnet-a-versatile-standalone-matlab-based
Repo	https://github.com/yechengxi/LightNet
Framework	none

Detecting People in Artwork with CNNs


Title	Detecting People in Artwork with CNNs
Authors	Nicholas Westlake, Hongping Cai, Peter Hall
Abstract	CNNs have massively improved performance in object detection in photographs. However, research into object detection in artwork remains limited. We show state-of-the-art performance on a challenging dataset, People-Art, which contains people from photos, cartoons and 41 different artwork movements. We achieve this high performance by fine-tuning a CNN for this task, thus also demonstrating that training CNNs on photos results in overfitting for photos: only the first three or four layers transfer from photos to artwork. Although the CNN’s performance is the highest yet, it remains less than 60% AP, suggesting further work is needed for the cross-depiction problem. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-46604-0_57
Tasks	Object Detection
Published	2016-10-27
URL	http://arxiv.org/abs/1610.08871v1
PDF	http://arxiv.org/pdf/1610.08871v1.pdf
PWC	https://paperswithcode.com/paper/detecting-people-in-artwork-with-cnns
Repo	https://github.com/BathVisArtData/PeopleArt
Framework	none

Pairwise Choice Markov Chains


Title	Pairwise Choice Markov Chains
Authors	Stephen Ragain, Johan Ugander
Abstract	As datasets capturing human choices grow in richness and scale—particularly in online domains—there is an increasing need for choice models that escape traditional choice-theoretic axioms such as regularity, stochastic transitivity, and Luce’s choice axiom. In this work we introduce the Pairwise Choice Markov Chain (PCMC) model of discrete choice, an inferentially tractable model that does not assume any of the above axioms while still satisfying the foundational axiom of uniform expansion, a considerably weaker assumption than Luce’s choice axiom. We show that the PCMC model significantly outperforms the Multinomial Logit (MNL) model in prediction tasks on both synthetic and empirical datasets known to exhibit violations of Luce’s axiom. Our analysis also synthesizes several recent observations connecting the Multinomial Logit model and Markov chains; the PCMC model retains the Multinomial Logit model as a special case.
Tasks
Published	2016-03-08
URL	http://arxiv.org/abs/1603.02740v3
PDF	http://arxiv.org/pdf/1603.02740v3.pdf
PWC	https://paperswithcode.com/paper/pairwise-choice-markov-chains
Repo	https://github.com/sragain/pcmc-nips
Framework	none
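
The abstract does not spell out how choice probabilities are obtained from the chain. The sketch below assumes they are the stationary distribution of a continuous-time Markov chain restricted to the offered choice set, and it assumes the rate parameterization `gamma[j] / (gamma[i] + gamma[j])` to illustrate how Multinomial Logit probabilities can arise as a special case. Function names and the `gamma` values are hypothetical.

```python
import numpy as np


def stationary_distribution(rates):
    """Stationary distribution of a continuous-time Markov chain.

    rates[i, j] is the transition rate from state i to state j (i != j);
    diagonal entries are ignored.
    """
    q = np.array(rates, dtype=float)
    np.fill_diagonal(q, 0.0)
    np.fill_diagonal(q, -q.sum(axis=1))  # rows of a rate matrix sum to zero
    n = q.shape[0]
    # Solve pi @ q = 0 together with sum(pi) = 1 as one linear system.
    a = np.vstack([q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(a, b, rcond=None)
    return pi


def choice_probabilities(rates, choice_set):
    """Restrict the chain to the offered items and return its stationary law."""
    idx = np.array(choice_set)
    return stationary_distribution(rates[np.ix_(idx, idx)])


# Hypothetical item qualities; these rates satisfy detailed balance with
# pi proportional to gamma, so the result matches a logit-style model.
gamma = np.array([1.0, 2.0, 3.0, 4.0])
rates = gamma[None, :] / (gamma[:, None] + gamma[None, :])
probs = choice_probabilities(rates, [0, 2, 3])
print(probs)
```

With arbitrary rates in place of the `gamma`-based ones, the same code yields choice probabilities that need not be proportional to any fixed item scores.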

Identity Mappings in Deep Residual Networks


Title	Identity Mappings in Deep Residual Networks
Authors	Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Abstract	Deep residual networks have emerged as a family of extremely deep architectures showing compelling accuracy and nice convergence behaviors. In this paper, we analyze the propagation formulations behind the residual building blocks, which suggest that the forward and backward signals can be directly propagated from one block to any other block, when using identity mappings as the skip connections and after-addition activation. A series of ablation experiments support the importance of these identity mappings. This motivates us to propose a new residual unit, which makes training easier and improves generalization. We report improved results using a 1001-layer ResNet on CIFAR-10 (4.62% error) and CIFAR-100, and a 200-layer ResNet on ImageNet. Code is available at: https://github.com/KaimingHe/resnet-1k-layers
Tasks	Image Classification
Published	2016-03-16
URL	http://arxiv.org/abs/1603.05027v3
PDF	http://arxiv.org/pdf/1603.05027v3.pdf
PWC	https://paperswithcode.com/paper/identity-mappings-in-deep-residual-networks
Repo	https://github.com/alrojo/lasagne_residual_network
Framework	none
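
The propagation claim in the abstract can be checked numerically: with identity skip connections and no activation after the addition, the output of a stack of units equals the input plus the sum of all residual branches. The sketch below is a simplification under stated assumptions: dense layers stand in for convolutions, batch normalization is omitted, and the branch applies its activation before each weight layer. All names and sizes are hypothetical.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def residual_branch(x, w1, w2):
    """F(x), with the activation applied before each weight layer."""
    return relu(relu(x) @ w1) @ w2


def forward(x, weights):
    """Run a stack of units x_{l+1} = x_l + F(x_l).

    Returns the final output and the list of branch outputs F(x_l).
    """
    branches = []
    for w1, w2 in weights:
        f = residual_branch(x, w1, w2)
        branches.append(f)
        x = x + f  # identity skip, nothing applied after the addition
    return x, branches


rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 16))
weights = [
    (0.1 * rng.normal(size=(16, 16)), 0.1 * rng.normal(size=(16, 16)))
    for _ in range(5)
]
x_out, branches = forward(x0, weights)

# The signal from the first block reaches the last one unchanged.
print(np.allclose(x_out, x0 + sum(branches)))
```

Placing a ReLU after the addition would break this additive form, since the input would then be transformed at every unit.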

Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields


Title	Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Authors	Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh
Abstract	We present an approach to efficiently detect the 2D pose of multiple people in an image. The approach uses a nonparametric representation, which we refer to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image. The architecture encodes global context, allowing a greedy bottom-up parsing step that maintains high accuracy while achieving realtime performance, irrespective of the number of people in the image. The architecture is designed to jointly learn part locations and their association via two branches of the same sequential prediction process. Our method placed first in the inaugural COCO 2016 keypoints challenge, and significantly exceeds the previous state-of-the-art result on the MPII Multi-Person benchmark, both in performance and efficiency.
Tasks	Keypoint Detection, Multi-Person Pose Estimation, Pose Estimation
Published	2016-11-24
URL	http://arxiv.org/abs/1611.08050v2
PDF	http://arxiv.org/pdf/1611.08050v2.pdf
PWC	https://paperswithcode.com/paper/realtime-multi-person-2d-pose-estimation
Repo	https://github.com/CMU-Perceptual-Computing-Lab/openpose_unity_plugin
Framework	none

Controlling Output Length in Neural Encoder-Decoders


Title	Controlling Output Length in Neural Encoder-Decoders
Authors	Yuta Kikuchi, Graham Neubig, Ryohei Sasano, Hiroya Takamura, Manabu Okumura
Abstract	Neural encoder-decoder models have shown great success in many sequence generation tasks. However, previous work has not investigated situations in which we would like to control the length of encoder-decoder outputs. This capability is crucial for applications such as text summarization, in which we have to generate concise summaries with a desired length. In this paper, we propose methods for controlling the output sequence length for neural encoder-decoder models: two decoding-based methods and two learning-based methods. Results show that our learning-based methods have the capability to control length without degrading summary quality in a summarization task.
Tasks	Text Summarization
Published	2016-09-30
URL	http://arxiv.org/abs/1609.09552v1
PDF	http://arxiv.org/pdf/1609.09552v1.pdf
PWC	https://paperswithcode.com/paper/controlling-output-length-in-neural-encoder
Repo	https://github.com/kiyukuta/lencon
Framework	none

Elastic Functional Coding of Riemannian Trajectories


Title	Elastic Functional Coding of Riemannian Trajectories
Authors	Rushil Anirudh, Pavan Turaga, Jingyong Su, Anuj Srivastava
Abstract	Visual observations of dynamic phenomena, such as human actions, are often represented as sequences of smoothly-varying features. In cases where the feature spaces can be structured as Riemannian manifolds, the corresponding representations become trajectories on manifolds. Analysis of these trajectories is challenging due to non-linearity of underlying spaces and high-dimensionality of trajectories. In vision problems, given the nature of physical systems involved, these phenomena are better characterized on a low-dimensional manifold compared to the space of Riemannian trajectories. For instance, if one does not impose physical constraints of the human body, in data involving human action analysis, the resulting representation space will have highly redundant features. Learning an effective, low-dimensional embedding for action representations will have a huge impact in the areas of search and retrieval, visualization, learning, and recognition. The difficulty lies in inherent non-linearity of the domain and temporal variability of actions that can distort any traditional metric between trajectories. To overcome these issues, we use the framework based on transported square-root velocity fields (TSRVF); this framework has several desirable properties, including a rate-invariant metric and vector space representations. We propose to learn an embedding such that each action trajectory is mapped to a single point in a low-dimensional Euclidean space, and the trajectories that differ only in temporal rates map to the same point. We utilize the TSRVF representation, and accompanying statistical summaries of Riemannian trajectories, to extend existing coding methods such as PCA, KSVD and Label Consistent KSVD to Riemannian trajectories or more generally to Riemannian functions.
Tasks
Published	2016-03-07
URL	http://arxiv.org/abs/1603.02200v1
PDF	http://arxiv.org/pdf/1603.02200v1.pdf
PWC	https://paperswithcode.com/paper/elastic-functional-coding-of-riemannian
Repo	https://github.com/rushilanirudh/tsrvf
Framework	none

Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures


Title	Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures
Authors	Hengyuan Hu, Rui Peng, Yu-Wing Tai, Chi-Keung Tang
Abstract	State-of-the-art neural networks are getting deeper and wider. While their performance increases with the increasing number of layers and neurons, it is crucial to design an efficient deep architecture in order to reduce computational and memory costs. Designing an efficient neural network, however, is labor-intensive, requiring many experiments and fine-tuning runs. In this paper, we introduce network trimming, which iteratively optimizes the network by pruning unimportant neurons based on analysis of their outputs on a large dataset. Our algorithm is inspired by the observation that the outputs of a significant portion of neurons in a large network are mostly zero, regardless of what inputs the network received. These zero-activation neurons are redundant, and can be removed without affecting the overall accuracy of the network. After pruning the zero-activation neurons, we retrain the network using the weights before pruning as initialization. We alternate the pruning and retraining to further reduce zero activations in a network. Our experiments on LeNet and VGG-16 show that we can achieve a high compression ratio of parameters without losing accuracy, or even while achieving higher accuracy than the original network.
Tasks
Published	2016-07-12
URL	http://arxiv.org/abs/1607.03250v1
PDF	http://arxiv.org/pdf/1607.03250v1.pdf
PWC	https://paperswithcode.com/paper/network-trimming-a-data-driven-neuron-pruning
Repo	https://github.com/marcoancona/TorchPruner
Framework	pytorch
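
The pruning criterion described in the abstract, removing neurons whose outputs are mostly zero across a dataset, can be sketched as follows. This is a toy illustration under assumptions: a single hidden ReLU layer, a hypothetical threshold of 0.9, and two neurons forced to be inactive through large negative biases. The retraining step that the abstract alternates with pruning is omitted.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def zero_fraction(activations):
    """Share of inputs on which each neuron outputs exactly zero.

    activations has shape (num_samples, num_neurons) and holds post-ReLU
    values.
    """
    return np.mean(activations == 0, axis=0)


def keep_mask(activations, threshold):
    """Keep neurons whose zero fraction does not exceed the threshold."""
    return zero_fraction(activations) <= threshold


rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))          # hypothetical dataset
W1 = rng.normal(size=(8, 6))
b1 = np.zeros(6)
b1[[1, 4]] = -100.0                    # two neurons that never fire
W2 = rng.normal(size=(6, 3))

hidden = relu(X @ W1 + b1)
keep = keep_mask(hidden, threshold=0.9)

# Drop the pruned neurons' incoming columns and outgoing rows.
out_full = hidden @ W2
out_pruned = relu(X @ W1[:, keep] + b1[keep]) @ W2[keep]
print(keep, np.allclose(out_full, out_pruned))
```

Because the removed neurons output zero on every sample here, the pruned layer reproduces the original outputs exactly; with neurons that are only mostly zero, the outputs would differ slightly, which is where retraining comes in.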

The Cityscapes Dataset for Semantic Urban Scene Understanding


Title	The Cityscapes Dataset for Semantic Urban Scene Understanding
Authors	Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, Bernt Schiele
Abstract	Visual understanding of complex urban street scenes is an enabling factor for a wide range of applications. Object detection has benefited enormously from large-scale datasets, especially in the context of deep learning. For semantic urban scene understanding, however, no current dataset adequately captures the complexity of real-world urban scenes. To address this, we introduce Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling. Cityscapes is comprised of a large, diverse set of stereo video sequences recorded in streets from 50 different cities. 5000 of these images have high quality pixel-level annotations; 20000 additional images have coarse annotations to enable methods that leverage large volumes of weakly-labeled data. Crucially, our effort exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity. Our accompanying empirical study provides an in-depth analysis of the dataset characteristics, as well as a performance evaluation of several state-of-the-art approaches based on our benchmark.
Tasks	Object Detection, Scene Understanding
Published	2016-04-06
URL	http://arxiv.org/abs/1604.01685v2
PDF	http://arxiv.org/pdf/1604.01685v2.pdf
PWC	https://paperswithcode.com/paper/the-cityscapes-dataset-for-semantic-urban
Repo	https://github.com/smitheric95/domain_stylization
Framework	pytorch

Neural Machine Translation with Characters and Hierarchical Encoding


Title	Neural Machine Translation with Characters and Hierarchical Encoding
Authors	Alexander Rosenberg Johansen, Jonas Meinertz Hansen, Elias Khazen Obeid, Casper Kaae Sønderby, Ole Winther
Abstract	Most existing Neural Machine Translation models use groups of characters or whole words as their unit of input and output. We propose a model with a hierarchical char2word encoder, that takes individual characters both as input and output. We first argue that this hierarchical representation of the character encoder reduces computational complexity, and show that it improves translation performance. Secondly, by qualitatively studying attention plots from the decoder we find that the model learns to compress common words into a single embedding whereas rare words, such as names and places, are represented character by character.
Tasks	Machine Translation
Published	2016-10-20
URL	http://arxiv.org/abs/1610.06550v1
PDF	http://arxiv.org/pdf/1610.06550v1.pdf
PWC	https://paperswithcode.com/paper/neural-machine-translation-with-characters
Repo	https://github.com/Styrke/master-code
Framework	tf

Inductive Bias of Deep Convolutional Networks through Pooling Geometry


Title	Inductive Bias of Deep Convolutional Networks through Pooling Geometry
Authors	Nadav Cohen, Amnon Shashua
Abstract	Our formal understanding of the inductive bias that drives the success of convolutional networks on computer vision tasks is limited. In particular, it is unclear what makes hypotheses spaces born from convolution and pooling operations so suitable for natural images. In this paper we study the ability of convolutional networks to model correlations among regions of their input. We theoretically analyze convolutional arithmetic circuits, and empirically validate our findings on other types of convolutional networks as well. Correlations are formalized through the notion of separation rank, which for a given partition of the input, measures how far a function is from being separable. We show that a polynomially sized deep network supports exponentially high separation ranks for certain input partitions, while being limited to polynomial separation ranks for others. The network’s pooling geometry effectively determines which input partitions are favored, thus serves as a means for controlling the inductive bias. Contiguous pooling windows as commonly employed in practice favor interleaved partitions over coarse ones, orienting the inductive bias towards the statistics of natural images. Other pooling schemes lead to different preferences, and this allows tailoring the network to data that departs from the usual domain of natural imagery. In addition to analyzing deep networks, we show that shallow ones support only linear separation ranks, and by this gain insight into the benefit of functions brought forth by depth - they are able to efficiently model strong correlation under favored partitions of the input.
Tasks
Published	2016-05-22
URL	http://arxiv.org/abs/1605.06743v4
PDF	http://arxiv.org/pdf/1605.06743v4.pdf
PWC	https://paperswithcode.com/paper/inductive-bias-of-deep-convolutional-networks
Repo	https://github.com/HUJI-Deep/inductive-pooling
Framework	none