July 30, 2019

Paper Group AWR 25

Semantic Instance Segmentation with a Discriminative Loss Function


Title	Semantic Instance Segmentation with a Discriminative Loss Function
Authors	Bert De Brabandere, Davy Neven, Luc Van Gool
Abstract	Semantic instance segmentation remains a challenging task. In this work we propose to tackle the problem with a discriminative loss function, operating at the pixel level, that encourages a convolutional network to produce a representation of the image that can easily be clustered into instances with a simple post-processing step. The loss function encourages the network to map each pixel to a point in feature space so that pixels belonging to the same instance lie close together while different instances are separated by a wide margin. Our approach of combining an off-the-shelf network with a principled loss function inspired by a metric learning objective is conceptually simple and distinct from recent efforts in instance segmentation. In contrast to previous works, our method does not rely on object proposals or recurrent mechanisms. A key contribution of our work is to demonstrate that such a simple setup without bells and whistles is effective and can perform on par with more complex methods. Moreover, we show that it does not suffer from some of the limitations of the popular detect-and-segment approaches. We achieve competitive performance on the Cityscapes and CVPPP leaf segmentation benchmarks.
Tasks	Instance Segmentation, Lane Detection, Metric Learning, Multi-Human Parsing, Semantic Segmentation
Published	2017-08-08
URL	http://arxiv.org/abs/1708.02551v1
PDF	http://arxiv.org/pdf/1708.02551v1.pdf
PWC	https://paperswithcode.com/paper/semantic-instance-segmentation-with-a
Repo	https://github.com/alicranck/instance-seg
Framework	pytorch
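
The pull/push idea in the abstract above (pixels of one instance close together, different instances separated by a margin) can be illustrated with a small sketch. This is not the authors' implementation: the function name, the hinge form, and the margins `delta_pull` and `delta_push` are assumptions chosen for illustration, and plain Python lists stand in for network outputs.

```python
import math


def discriminative_loss(embeddings, labels, delta_pull=0.5, delta_push=1.5):
    """Toy pull/push loss over pixel embeddings (lists of floats).

    delta_pull and delta_push are hypothetical margins, not values
    taken from the paper.
    """
    # Group embeddings by instance label.
    groups = {}
    for vec, lab in zip(embeddings, labels):
        groups.setdefault(lab, []).append(vec)

    # Mean embedding (cluster centre) of each instance.
    centres = {
        lab: [sum(col) / len(vecs) for col in zip(*vecs)]
        for lab, vecs in groups.items()
    }

    # Pull term: penalise pixels farther than delta_pull from their centre.
    pull = 0.0
    for lab, vecs in groups.items():
        hinge = [max(0.0, math.dist(v, centres[lab]) - delta_pull) ** 2 for v in vecs]
        pull += sum(hinge) / len(vecs)
    pull /= len(groups)

    # Push term: penalise pairs of centres closer than 2 * delta_push.
    labs = sorted(centres)
    pairs = [(a, b) for i, a in enumerate(labs) for b in labs[i + 1:]]
    push = 0.0
    for a, b in pairs:
        push += max(0.0, 2 * delta_push - math.dist(centres[a], centres[b])) ** 2
    if pairs:
        push /= len(pairs)

    return pull + push
```

With two tight, well-separated instances the loss is zero; when the centres of two instances coincide, only the push term contributes.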

Large-Scale Plant Classification with Deep Neural Networks


Title	Large-Scale Plant Classification with Deep Neural Networks
Authors	Ignacio Heredia
Abstract	This paper discusses the potential of applying deep learning techniques for plant classification and its usage for citizen science in large-scale biodiversity monitoring. We show that plant classification using near state-of-the-art convolutional network architectures like ResNet50 achieves significant improvements in accuracy compared to the most widespread plant classification application in test sets composed of thousands of different species labels. We find that the predictions can be confidently used as a baseline classification in citizen science communities like iNaturalist (or its Spanish fork, Natusfera) which in turn can share their data with biodiversity portals like GBIF.
Tasks
Published	2017-06-12
URL	http://arxiv.org/abs/1706.03736v1
PDF	http://arxiv.org/pdf/1706.03736v1.pdf
PWC	https://paperswithcode.com/paper/large-scale-plant-classification-with-deep
Repo	https://github.com/indigo-dc/seeds-classification-theano
Framework	none

Coulomb GANs: Provably Optimal Nash Equilibria via Potential Fields


Title	Coulomb GANs: Provably Optimal Nash Equilibria via Potential Fields
Authors	Thomas Unterthiner, Bernhard Nessler, Calvin Seward, Günter Klambauer, Martin Heusel, Hubert Ramsauer, Sepp Hochreiter
Abstract	Generative adversarial networks (GANs) evolved into one of the most successful unsupervised techniques for generating realistic images. Even though it has recently been shown that GAN training converges, GAN models often end up in local Nash equilibria that are associated with mode collapse or otherwise fail to model the target distribution. We introduce Coulomb GANs, which pose the GAN learning problem as a potential field of charged particles, where generated samples are attracted to training set samples but repel each other. The discriminator learns a potential field while the generator decreases the energy by moving its samples along the vector (force) field determined by the gradient of the potential field. Through decreasing the energy, the GAN model learns to generate samples according to the whole target distribution and does not only cover some of its modes. We prove that Coulomb GANs possess only one Nash equilibrium which is optimal in the sense that the model distribution equals the target distribution. We show the efficacy of Coulomb GANs on a variety of image datasets. On LSUN and celebA, Coulomb GANs set a new state of the art and produce a previously unseen variety of different samples.
Tasks
Published	2017-08-29
URL	http://arxiv.org/abs/1708.08819v3
PDF	http://arxiv.org/pdf/1708.08819v3.pdf
PWC	https://paperswithcode.com/paper/coulomb-gans-provably-optimal-nash-equilibria
Repo	https://github.com/bioinf-jku/coulomb_gan
Framework	tf

The Perception-Distortion Tradeoff


Title	The Perception-Distortion Tradeoff
Authors	Yochai Blau, Tomer Michaeli
Abstract	Image restoration algorithms are typically evaluated by some distortion measure (e.g. PSNR, SSIM, IFC, VIF) or by human opinion scores that quantify perceived perceptual quality. In this paper, we prove mathematically that distortion and perceptual quality are at odds with each other. Specifically, we study the optimal probability for correctly discriminating the outputs of an image restoration algorithm from real images. We show that as the mean distortion decreases, this probability must increase (indicating worse perceptual quality). As opposed to the common belief, this result holds true for any distortion measure, and is not only a problem of the PSNR or SSIM criteria. We also show that generative-adversarial-nets (GANs) provide a principled way to approach the perception-distortion bound. This constitutes theoretical support to their observed success in low-level vision tasks. Based on our analysis, we propose a new methodology for evaluating image restoration methods, and use it to perform an extensive comparison between recent super-resolution algorithms.
Tasks	Image Restoration, Super-Resolution
Published	2017-11-16
URL	https://arxiv.org/abs/1711.06077v3
PDF	https://arxiv.org/pdf/1711.06077v3.pdf
PWC	https://paperswithcode.com/paper/the-perception-distortion-tradeoff
Repo	https://github.com/roimehrez/PIRM2018
Framework	none

Deep Potential Molecular Dynamics: a scalable model with the accuracy of quantum mechanics


Title	Deep Potential Molecular Dynamics: a scalable model with the accuracy of quantum mechanics
Authors	Linfeng Zhang, Jiequn Han, Han Wang, Roberto Car, Weinan E
Abstract	We introduce a scheme for molecular simulations, the Deep Potential Molecular Dynamics (DeePMD) method, based on a many-body potential and interatomic forces generated by a carefully crafted deep neural network trained with ab initio data. The neural network model preserves all the natural symmetries in the problem. It is “first principle-based” in the sense that there are no ad hoc components aside from the network model. We show that the proposed scheme provides an efficient and accurate protocol in a variety of systems, including bulk materials and molecules. In all these cases, DeePMD gives results that are essentially indistinguishable from the original data, at a cost that scales linearly with system size.
Tasks
Published	2017-07-30
URL	http://arxiv.org/abs/1707.09571v2
PDF	http://arxiv.org/pdf/1707.09571v2.pdf
PWC	https://paperswithcode.com/paper/deep-potential-molecular-dynamics-a-scalable
Repo	https://github.com/google/differentiable-atomistic-potentials
Framework	tf

Improving text classification with vectors of reduced precision


Title	Improving text classification with vectors of reduced precision
Authors	Krzysztof Wróbel, Maciej Wielgosz, Marcin Pietroń, Michał Karwatowski, Aleksander Smywiński-Pohl
Abstract	This paper analyzes the impact of reducing floating-point precision on the quality of text classification. Reducing the precision of the vectors representing the data (e.g. the TF-IDF representation in our case) decreases computing time and memory footprint on dedicated hardware platforms. The impact of precision reduction on classification quality was evaluated on 5 corpora, using 4 different classifiers. Dimensionality reduction was also taken into account. Results indicate that precision reduction improves classification accuracy in most cases (up to 25% error reduction). In general, the reduction from 64 to 4 bits gives the best scores and ensures that the results will not be worse than with the full floating-point representation.
Tasks	Dimensionality Reduction, Text Classification
Published	2017-06-20
URL	http://arxiv.org/abs/1706.06363v1
PDF	http://arxiv.org/pdf/1706.06363v1.pdf
PWC	https://paperswithcode.com/paper/improving-text-classification-with-vectors-of
Repo	https://github.com/kwrobel-nlp/precision-reduction
Framework	none
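
To make the idea of precision reduction concrete, the sketch below rounds feature values to a small number of evenly spaced levels. The abstract does not state which number format the authors use, so uniform quantization of values in [0, 1] is an assumption made here, and `reduce_precision` is a hypothetical helper.

```python
def reduce_precision(values, bits):
    """Round values in [0, 1] to the nearest of 2**bits evenly spaced levels.

    Assumes the inputs are already scaled to [0, 1] (for example,
    normalised TF-IDF weights); this is an illustrative scheme only.
    """
    levels = (1 << bits) - 1  # number of steps between 0 and 1
    return [round(v * levels) / levels for v in values]
```

At 4 bits, any input vector is mapped onto at most 16 distinct values, which is what allows the compact storage the abstract refers to.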

An attentive neural architecture for joint segmentation and parsing and its application to real estate ads


Title	An attentive neural architecture for joint segmentation and parsing and its application to real estate ads
Authors	Giannis Bekoulis, Johannes Deleu, Thomas Demeester, Chris Develder
Abstract	In processing human-produced text using natural language processing (NLP) techniques, two fundamental subtasks that arise are (i) segmentation of the plain text into meaningful subunits (e.g., entities), and (ii) dependency parsing, to establish relations between subunits. In this paper, we develop a relatively simple and effective neural joint model that performs both segmentation and dependency parsing together, instead of one after the other as in most state-of-the-art works. We focus in particular on the real estate ad setting, aiming to convert an ad to a structured description, which we name a property tree, comprising the tasks of (1) identifying important entities of a property (e.g., rooms) from classifieds and (2) structuring them into a tree format. In this work, we propose a new joint model that is able to tackle the two tasks simultaneously and construct the property tree by (i) avoiding the error propagation that would arise from performing the subtasks one after the other in a pipelined fashion, and (ii) exploiting the interactions between the subtasks. For this purpose, we perform an extensive comparative study of the pipeline methods and the newly proposed joint model, reporting an improvement of over three percentage points in the overall edge F1 score of the property tree. We also propose attention methods to encourage our model to focus on salient tokens during the construction of the property tree. We thus experimentally demonstrate the usefulness of attentive neural architectures for the proposed joint model, showcasing a further improvement of two percentage points in edge F1 score for our application.
Tasks	Dependency Parsing
Published	2017-09-27
URL	http://arxiv.org/abs/1709.09590v2
PDF	http://arxiv.org/pdf/1709.09590v2.pdf
PWC	https://paperswithcode.com/paper/an-attentive-neural-architecture-for-joint
Repo	https://github.com/bekou/ad_data
Framework	none

Logic Programming Petri Nets


Title	Logic Programming Petri Nets
Authors	Giovanni Sileno
Abstract	With the purpose of modeling, specifying and reasoning in an integrated fashion with procedural and declarative aspects (both commonly present in cases or scenarios), the paper introduces Logic Programming Petri Nets (LPPN), an extension to the Petri Net notation providing an interface to logic programming constructs. Two semantics are presented. First, a hybrid operational semantics that separates the process component, treated with Petri nets, from the constraint/terminological component, treated with Answer Set Programming (ASP). Second, a denotational semantics maps the notation to ASP fully, via Event Calculus. These two alternative specifications enable a preliminary evaluation in terms of reasoning efficiency.
Tasks
Published	2017-01-26
URL	http://arxiv.org/abs/1701.07657v1
PDF	http://arxiv.org/pdf/1701.07657v1.pdf
PWC	https://paperswithcode.com/paper/logic-programming-petri-nets
Repo	https://github.com/s1l3n0/pypneu
Framework	none

Learning Combinatorial Optimization Algorithms over Graphs


Title	Learning Combinatorial Optimization Algorithms over Graphs
Authors	Hanjun Dai, Elias B. Khalil, Yuyu Zhang, Bistra Dilkina, Le Song
Abstract	The design of good heuristics or approximation algorithms for NP-hard combinatorial optimization problems often requires significant specialized knowledge and trial-and-error. Can we automate this challenging, tedious process, and learn the algorithms instead? In many real-world applications, it is typically the case that the same optimization problem is solved again and again on a regular basis, maintaining the same problem structure but differing in the data. This provides an opportunity for learning heuristic algorithms that exploit the structure of such recurring problems. In this paper, we propose a unique combination of reinforcement learning and graph embedding to address this challenge. The learned greedy policy behaves like a meta-algorithm that incrementally constructs a solution, and the action is determined by the output of a graph embedding network capturing the current state of the solution. We show that our framework can be applied to a diverse range of optimization problems over graphs, and learns effective algorithms for the Minimum Vertex Cover, Maximum Cut and Traveling Salesman problems.
Tasks	Combinatorial Optimization, Graph Embedding
Published	2017-04-05
URL	http://arxiv.org/abs/1704.01665v4
PDF	http://arxiv.org/pdf/1704.01665v4.pdf
PWC	https://paperswithcode.com/paper/learning-combinatorial-optimization
Repo	https://github.com/HENTAIBOY/S2V-DQN_pytorch
Framework	pytorch
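
The abstract above describes a greedy meta-algorithm that builds a solution one node at a time, choosing each node from a score produced by a graph embedding network. The sketch below shows that greedy loop for Minimum Vertex Cover. The learned network is replaced by a hand-written stand-in, `degree_score`, which is purely an assumption for illustration; both function names are hypothetical.

```python
def greedy_vertex_cover(edges, score):
    """Build a vertex cover by repeatedly adding the highest-scoring node.

    `score(node, uncovered_edges)` plays the role of the learned
    evaluation function; any callable with that shape can be plugged in.
    """
    uncovered = set(map(frozenset, edges))
    cover = []
    while uncovered:
        # Only nodes touching an uncovered edge are worth considering.
        candidates = sorted(set().union(*uncovered))
        best = max(candidates, key=lambda v: score(v, uncovered))
        cover.append(best)
        uncovered = {e for e in uncovered if best not in e}
    return cover


def degree_score(node, uncovered):
    """Stand-in for a learned score: number of uncovered edges at node."""
    return sum(node in e for e in uncovered)
```

For example, `greedy_vertex_cover([(0, 1), (0, 2), (0, 3)], degree_score)` picks the hub of the star and returns `[0]`. Swapping `degree_score` for a trained model changes the policy without changing the loop.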

Oversampling for Imbalanced Learning Based on K-Means and SMOTE


Title	Oversampling for Imbalanced Learning Based on K-Means and SMOTE
Authors	Felix Last, Georgios Douzas, Fernando Bacao
Abstract	Learning from class-imbalanced data continues to be a common and challenging problem in supervised learning, as standard classification algorithms are designed to handle balanced class distributions. While different strategies exist to tackle this problem, methods that generate artificial data to achieve a balanced class distribution are more versatile than modifications to the classification algorithm. Such techniques, called oversamplers, modify the training data, allowing any classifier to be used with class-imbalanced datasets. Many algorithms have been proposed for this task, but most are complex and tend to generate unnecessary noise. This work presents a simple and effective oversampling method based on k-means clustering and SMOTE oversampling, which avoids the generation of noise and effectively overcomes imbalances between and within classes. Empirical results of extensive experiments with 71 datasets show that training data oversampled with the proposed method improves classification results. Moreover, k-means SMOTE consistently outperforms other popular oversampling methods. An implementation is made available in the Python programming language.
Tasks	Data-to-Text Generation
Published	2017-11-02
URL	http://arxiv.org/abs/1711.00837v2
PDF	http://arxiv.org/pdf/1711.00837v2.pdf
PWC	https://paperswithcode.com/paper/oversampling-for-imbalanced-learning-based-on
Repo	https://github.com/AlgoWit/publications
Framework	none
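
As a rough illustration of combining clustering with SMOTE-style oversampling, the sketch below generates synthetic minority points by interpolating only between points that share a cluster. It assumes the cluster assignments have already been computed (for instance by k-means) and it pairs points at random inside a cluster instead of using nearest neighbours, so it is a simplification and not the published algorithm. The function name and parameters are hypothetical.

```python
import random


def oversample_within_clusters(minority, cluster_ids, n_new, seed=0):
    """Create n_new synthetic points by interpolating inside clusters.

    minority    : list of minority-class points (lists of floats)
    cluster_ids : precomputed cluster label for each point (assumed input)
    """
    rng = random.Random(seed)
    clusters = {}
    for point, cid in zip(minority, cluster_ids):
        clusters.setdefault(cid, []).append(point)

    # Only clusters with at least two points can be interpolated.
    usable = [pts for pts in clusters.values() if len(pts) >= 2]
    if not usable:
        raise ValueError("need a cluster with at least two minority points")

    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(rng.choice(usable), 2)
        t = rng.random()  # position along the segment from a to b
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic
```

Because every synthetic point lies on a segment between two members of the same cluster, no samples are placed in the empty space between clusters, which is the intuition behind avoiding noise.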

Class-Weighted Convolutional Features for Visual Instance Search


Title	Class-Weighted Convolutional Features for Visual Instance Search
Authors	Albert Jimenez, Jose M. Alvarez, Xavier Giro-i-Nieto
Abstract	Image retrieval in realistic scenarios targets large dynamic datasets of unlabeled images. In these cases, training or fine-tuning a model every time new images are added to the database is neither efficient nor scalable. Convolutional neural networks trained for image classification over large datasets have been proven effective feature extractors for image retrieval. The most successful approaches are based on encoding the activations of convolutional layers, as they convey the image's spatial information. In this paper, we go beyond this spatial information and propose a local-aware encoding of convolutional features based on semantic information predicted in the target image. To this end, we obtain the most discriminative regions of an image using Class Activation Maps (CAMs). CAMs are based on the knowledge contained in the network, and therefore our approach has the additional advantage of not requiring external information. In addition, we use CAMs to generate object proposals during an unsupervised re-ranking stage after a first fast search. Our experiments on two publicly available datasets for instance retrieval, Oxford5k and Paris6k, demonstrate the competitiveness of our approach, which outperforms the current state of the art when using off-the-shelf models trained on ImageNet. The source code and model used in this paper are publicly available at http://imatge-upc.github.io/retrieval-2017-cam/.
Tasks	Image Retrieval, Instance Search
Published	2017-07-09
URL	http://arxiv.org/abs/1707.02581v1
PDF	http://arxiv.org/pdf/1707.02581v1.pdf
PWC	https://paperswithcode.com/paper/class-weighted-convolutional-features-for
Repo	https://github.com/zxy14120448/Summary
Framework	pytorch

Beating the World’s Best at Super Smash Bros. with Deep Reinforcement Learning


Title	Beating the World’s Best at Super Smash Bros. with Deep Reinforcement Learning
Authors	Vlad Firoiu, William F. Whitney, Joshua B. Tenenbaum
Abstract	There has been a recent explosion in the capabilities of game-playing artificial intelligence. Many classes of RL tasks, from Atari games to motor control to board games, are now solvable by fairly generic algorithms, based on deep learning, that learn to play from experience with minimal knowledge of the specific domain of interest. In this work, we will investigate the performance of these methods on Super Smash Bros. Melee (SSBM), a popular console fighting game. The SSBM environment has complex dynamics and partial observability, making it challenging for human and machine alike. The multi-player aspect poses an additional challenge, as the vast majority of recent advances in RL have focused on single-agent environments. Nonetheless, we will show that it is possible to train agents that are competitive against and even surpass human professionals, a new result for the multi-player video game setting.
Tasks	Atari Games, Board Games
Published	2017-02-21
URL	http://arxiv.org/abs/1702.06230v3
PDF	http://arxiv.org/pdf/1702.06230v3.pdf
PWC	https://paperswithcode.com/paper/beating-the-worlds-best-at-super-smash-bros
Repo	https://github.com/shaneallcroft/SSB64RLBot
Framework	none

Accurate Lung Segmentation via Network-Wise Training of Convolutional Networks


Title	Accurate Lung Segmentation via Network-Wise Training of Convolutional Networks
Authors	Sangheum Hwang, Sunggyun Park
Abstract	We introduce an accurate lung segmentation model for chest radiographs based on deep convolutional neural networks. Our model is based on atrous convolutional layers to increase the field-of-view of filters efficiently. To improve segmentation performance further, we also propose a multi-stage training strategy, network-wise training, in which the current-stage network is fed both the input images and the outputs of the previous-stage network. This strategy is shown to reduce falsely predicted labels and produce smooth boundaries of lung fields. We evaluate the proposed model on a common benchmark dataset, JSRT, and achieve state-of-the-art segmentation performance with far fewer model parameters.
Tasks
Published	2017-08-02
URL	http://arxiv.org/abs/1708.00710v1
PDF	http://arxiv.org/pdf/1708.00710v1.pdf
PWC	https://paperswithcode.com/paper/accurate-lung-segmentation-via-network-wise
Repo	https://github.com/IlliaOvcharenko/lung-segmentation
Framework	pytorch
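
The abstract above relies on atrous (dilated) convolution to widen the field of view without adding weights. A minimal 1-D sketch of that operation follows; the function name is hypothetical, and it is a generic illustration of dilation, not the 2-D layers used in the paper.

```python
def atrous_conv1d(signal, kernel, rate):
    """Valid 1-D convolution (cross-correlation form) with dilation `rate`.

    With rate=1 this is an ordinary sliding-window sum; larger rates
    skip samples, so the same kernel covers a wider span.
    """
    span = (len(kernel) - 1) * rate + 1  # field of view of one output
    return [
        sum(w * signal[start + i * rate] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]
```

A three-tap kernel sees 3 samples at `rate=1` and 5 samples at `rate=2`, using the same three weights in both cases.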

PQk-means: Billion-scale Clustering for Product-quantized Codes


Title	PQk-means: Billion-scale Clustering for Product-quantized Codes
Authors	Yusuke Matsui, Keisuke Ogaki, Toshihiko Yamasaki, Kiyoharu Aizawa
Abstract	Data clustering is a fundamental operation in data analysis. For handling large-scale data, the standard k-means clustering method is not only slow, but also memory-inefficient. We propose an efficient clustering method for billion-scale feature vectors, called PQk-means. By first compressing input vectors into short product-quantized (PQ) codes, PQk-means achieves fast and memory-efficient clustering, even for high-dimensional vectors. Similar to k-means, PQk-means repeats the assignment and update steps, both of which can be performed in the PQ-code domain. Experimental results show that even short-length (32 bit) PQ-codes can produce competitive results compared with k-means. This result is of practical importance for clustering in memory-restricted environments. Using the proposed PQk-means scheme, the clustering of one billion 128D SIFT features with K = 10^5 is achieved within 14 hours, using just 32 GB of memory consumption on a single computer.
Tasks
Published	2017-09-12
URL	http://arxiv.org/abs/1709.03708v1
PDF	http://arxiv.org/pdf/1709.03708v1.pdf
PWC	https://paperswithcode.com/paper/pqk-means-billion-scale-clustering-for
Repo	https://github.com/DwangoMediaVillage/pqkmeans
Framework	none
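
Product quantization, which the abstract above uses to compress input vectors, splits each vector into sub-vectors and stores only the index of the nearest codeword for each one. The sketch below shows encoding and a code-to-code distance. The codebooks are assumed to be given (in practice they would be trained, for example with k-means per sub-space), and the function names are hypothetical and unrelated to the API of the linked repository.

```python
import math


def pq_encode(vector, codebooks):
    """Encode a vector as one codeword index per sub-vector.

    codebooks[j] is the list of codewords for the j-th sub-space;
    the vector length is assumed to be divisible by len(codebooks).
    """
    m = len(codebooks)
    sub_len = len(vector) // m
    code = []
    for j, codebook in enumerate(codebooks):
        sub = vector[j * sub_len:(j + 1) * sub_len]
        # Index of the nearest codeword for this sub-vector.
        code.append(min(range(len(codebook)), key=lambda k: math.dist(sub, codebook[k])))
    return code


def pq_distance(code_a, code_b, codebooks):
    """Approximate distance between two codes using their codewords."""
    sq = sum(
        math.dist(cb[a], cb[b]) ** 2
        for a, b, cb in zip(code_a, code_b, codebooks)
    )
    return math.sqrt(sq)
```

Since `pq_distance` needs only the codes and the codebooks, assignment-style comparisons can be carried out without the original vectors, which is what makes clustering in the code domain memory-efficient.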

Learning how to explain neural networks: PatternNet and PatternAttribution


Title	Learning how to explain neural networks: PatternNet and PatternAttribution
Authors	Pieter-Jan Kindermans, Kristof T. Schütt, Maximilian Alber, Klaus-Robert Müller, Dumitru Erhan, Been Kim, Sven Dähne
Abstract	DeConvNet, Guided BackProp, and LRP were invented to better understand deep neural networks. We show that these methods do not produce the theoretically correct explanation for a linear model. Yet they are used on multi-layer networks with millions of parameters. This is a cause for concern since linear models are simple neural networks. We argue that explanation methods for neural nets should work reliably in the limit of simplicity, that is, for linear models. Based on our analysis of linear models we propose a generalization that yields two explanation techniques (PatternNet and PatternAttribution) that are theoretically sound for linear models and produce improved explanations for deep networks.
Tasks
Published	2017-05-16
URL	http://arxiv.org/abs/1705.05598v2
PDF	http://arxiv.org/pdf/1705.05598v2.pdf
PWC	https://paperswithcode.com/paper/learning-how-to-explain-neural-networks
Repo	https://github.com/pikinder/nn-patterns
Framework	none