Paper Group AWR 25
Semantic Instance Segmentation with a Discriminative Loss Function
Title | Semantic Instance Segmentation with a Discriminative Loss Function |
Authors | Bert De Brabandere, Davy Neven, Luc Van Gool |
Abstract | Semantic instance segmentation remains a challenging task. In this work we propose to tackle the problem with a discriminative loss function, operating at the pixel level, that encourages a convolutional network to produce a representation of the image that can easily be clustered into instances with a simple post-processing step. The loss function encourages the network to map each pixel to a point in feature space so that pixels belonging to the same instance lie close together while different instances are separated by a wide margin. Our approach of combining an off-the-shelf network with a principled loss function inspired by a metric learning objective is conceptually simple and distinct from recent efforts in instance segmentation. In contrast to previous works, our method does not rely on object proposals or recurrent mechanisms. A key contribution of our work is to demonstrate that such a simple setup without bells and whistles is effective and can perform on par with more complex methods. Moreover, we show that it does not suffer from some of the limitations of the popular detect-and-segment approaches. We achieve competitive performance on the Cityscapes and CVPPP leaf segmentation benchmarks. |
Tasks | Instance Segmentation, Lane Detection, Metric Learning, Multi-Human Parsing, Semantic Segmentation |
Published | 2017-08-08 |
URL | http://arxiv.org/abs/1708.02551v1 |
PDF | http://arxiv.org/pdf/1708.02551v1.pdf |
PWC | https://paperswithcode.com/paper/semantic-instance-segmentation-with-a |
Repo | https://github.com/alicranck/instance-seg |
Framework | pytorch |
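The loss has three terms: a pull term that draws pixel embeddings toward their instance mean once they are farther than a margin delta_v, a push term that repels pairs of instance means lying closer than 2*delta_d, and a small regularizer pulling means toward the origin. A minimal NumPy sketch of that computation follows; the margin and weight values match the paper's reported settings, but a training implementation would express this in PyTorch for autograd:

```python
import numpy as np

def discriminative_loss(embeddings, labels, delta_v=0.5, delta_d=1.5,
                        alpha=1.0, beta=1.0, gamma=0.001):
    """Pull/push loss over per-pixel embeddings.

    embeddings: (N, D) feature vectors, labels: (N,) instance ids.
    """
    ids = np.unique(labels)
    means = np.stack([embeddings[labels == i].mean(axis=0) for i in ids])

    # Pull term: penalize pixels farther than delta_v from their mean.
    l_var = np.mean([
        np.mean(np.maximum(0.0, np.linalg.norm(
            embeddings[labels == i] - means[k], axis=1) - delta_v) ** 2)
        for k, i in enumerate(ids)])

    # Push term: penalize pairs of means closer than 2 * delta_d.
    l_dist = 0.0
    if len(ids) > 1:
        for a in range(len(ids)):
            for b in range(a + 1, len(ids)):
                gap = np.linalg.norm(means[a] - means[b])
                l_dist += np.maximum(0.0, 2 * delta_d - gap) ** 2
        l_dist *= 2.0 / (len(ids) * (len(ids) - 1))

    # Regularizer: keep cluster means near the origin.
    l_reg = np.mean(np.linalg.norm(means, axis=1))
    return alpha * l_var + beta * l_dist + gamma * l_reg
```

At test time, instances are recovered by clustering the embeddings (e.g. mean-shift), which the margins make trivial: same-instance pixels sit within delta_v of their mean, different instances at least 2*delta_d apart.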
Large-Scale Plant Classification with Deep Neural Networks
Title | Large-Scale Plant Classification with Deep Neural Networks |
Authors | Ignacio Heredia |
Abstract | This paper discusses the potential of applying deep learning techniques for plant classification and its usage for citizen science in large-scale biodiversity monitoring. We show that plant classification using near state-of-the-art convolutional network architectures like ResNet50 achieves significant improvements in accuracy compared to the most widespread plant classification application in test sets composed of thousands of different species labels. We find that the predictions can be confidently used as a baseline classification in citizen science communities like iNaturalist (or its Spanish fork, Natusfera) which in turn can share their data with biodiversity portals like GBIF. |
Tasks | |
Published | 2017-06-12 |
URL | http://arxiv.org/abs/1706.03736v1 |
PDF | http://arxiv.org/pdf/1706.03736v1.pdf |
PWC | https://paperswithcode.com/paper/large-scale-plant-classification-with-deep |
Repo | https://github.com/indigo-dc/seeds-classification-theano |
Framework | none |
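The model itself is standard transfer learning: an ImageNet-pretrained ResNet50 whose final layer is retrained for the plant species labels. The paper's own code is Theano-based, so the torchvision sketch below is only an illustration of the setup; `num_species` and the frozen backbone are assumptions:

```python
import torch
from torchvision import models

def build_plant_classifier(num_species):
    # ImageNet-pretrained ResNet50 with a new species-classification head.
    model = models.resnet50(pretrained=True)
    for p in model.parameters():   # freeze the backbone (one possible choice)
        p.requires_grad = False
    model.fc = torch.nn.Linear(model.fc.in_features, num_species)
    return model
```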
Coulomb GANs: Provably Optimal Nash Equilibria via Potential Fields
Title | Coulomb GANs: Provably Optimal Nash Equilibria via Potential Fields |
Authors | Thomas Unterthiner, Bernhard Nessler, Calvin Seward, Günter Klambauer, Martin Heusel, Hubert Ramsauer, Sepp Hochreiter |
Abstract | Generative adversarial networks (GANs) evolved into one of the most successful unsupervised techniques for generating realistic images. Even though it has recently been shown that GAN training converges, GAN models often end up in local Nash equilibria that are associated with mode collapse or otherwise fail to model the target distribution. We introduce Coulomb GANs, which pose the GAN learning problem as a potential field of charged particles, where generated samples are attracted to training set samples but repel each other. The discriminator learns a potential field while the generator decreases the energy by moving its samples along the vector (force) field determined by the gradient of the potential field. Through decreasing the energy, the GAN model learns to generate samples according to the whole target distribution and does not only cover some of its modes. We prove that Coulomb GANs possess only one Nash equilibrium which is optimal in the sense that the model distribution equals the target distribution. We show the efficacy of Coulomb GANs on a variety of image datasets. On LSUN and celebA, Coulomb GANs set a new state of the art and produce a previously unseen variety of different samples. |
Tasks | |
Published | 2017-08-29 |
URL | http://arxiv.org/abs/1708.08819v3 |
PDF | http://arxiv.org/pdf/1708.08819v3.pdf |
PWC | https://paperswithcode.com/paper/coulomb-gans-provably-optimal-nash-equilibria |
Repo | https://github.com/bioinf-jku/coulomb_gan |
Framework | tf |
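The physical analogy is direct: real samples carry positive charge, generated samples negative charge, and both act on the field through the paper's Plummer kernel. A NumPy sketch of the resulting potential, which the discriminator is trained to approximate (the kernel dimension `dim` and smoothing `eps` are hyperparameters; the defaults here are illustrative):

```python
import numpy as np

def plummer_kernel(a, b, dim=3, eps=1.0):
    # k(a, b) = 1 / (||a - b||^2 + eps^2)^(dim / 2)
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return 1.0 / (d2 + eps ** 2) ** (dim / 2.0)

def coulomb_potential(x, real, fake, dim=3, eps=1.0):
    """Field at points x: attraction toward real samples (positive
    charge), repulsion from generated samples (negative charge)."""
    return (plummer_kernel(x, real, dim, eps).mean(axis=1)
            - plummer_kernel(x, fake, dim, eps).mean(axis=1))
```

The generator then lowers the energy by moving its samples along the gradient of this field, which vanishes everywhere only when the model distribution matches the target.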
The Perception-Distortion Tradeoff
Title | The Perception-Distortion Tradeoff |
Authors | Yochai Blau, Tomer Michaeli |
Abstract | Image restoration algorithms are typically evaluated by some distortion measure (e.g. PSNR, SSIM, IFC, VIF) or by human opinion scores that quantify perceived quality. In this paper, we prove mathematically that distortion and perceptual quality are at odds with each other. Specifically, we study the optimal probability for correctly discriminating the outputs of an image restoration algorithm from real images. We show that as the mean distortion decreases, this probability must increase (indicating worse perceptual quality). As opposed to the common belief, this result holds true for any distortion measure, and is not only a problem of the PSNR or SSIM criteria. We also show that generative-adversarial-nets (GANs) provide a principled way to approach the perception-distortion bound. This constitutes theoretical support for their observed success in low-level vision tasks. Based on our analysis, we propose a new methodology for evaluating image restoration methods, and use it to perform an extensive comparison between recent super-resolution algorithms. |
Tasks | Image Restoration, Super-Resolution |
Published | 2017-11-16 |
URL | https://arxiv.org/abs/1711.06077v3 |
PDF | https://arxiv.org/pdf/1711.06077v3.pdf |
PWC | https://paperswithcode.com/paper/the-perception-distortion-tradeoff |
Repo | https://github.com/roimehrez/PIRM2018 |
Framework | none |
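The result is captured by the perception-distortion function: the best attainable divergence between the distributions of restored and natural images under a distortion budget. In the paper's notation, with Δ a full-reference distortion measure and d(·,·) a divergence between distributions:

```latex
P(D) \;=\; \min_{p_{\hat{X} \mid Y}} \, d\bigl(p_{X}, p_{\hat{X}}\bigr)
\quad \text{subject to} \quad
\mathbb{E}\bigl[\Delta(X, \hat{X})\bigr] \le D
```

The paper shows P(D) is non-increasing and convex, so below some distortion level any further decrease in D necessarily raises the attainable d, i.e. worsens the best achievable perceptual quality.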
Deep Potential Molecular Dynamics: a scalable model with the accuracy of quantum mechanics
Title | Deep Potential Molecular Dynamics: a scalable model with the accuracy of quantum mechanics |
Authors | Linfeng Zhang, Jiequn Han, Han Wang, Roberto Car, Weinan E |
Abstract | We introduce a scheme for molecular simulations, the Deep Potential Molecular Dynamics (DeePMD) method, based on a many-body potential and interatomic forces generated by a carefully crafted deep neural network trained with ab initio data. The neural network model preserves all the natural symmetries in the problem. It is “first principle-based” in the sense that there are no ad hoc components aside from the network model. We show that the proposed scheme provides an efficient and accurate protocol in a variety of systems, including bulk materials and molecules. In all these cases, DeePMD gives results that are essentially indistinguishable from the original data, at a cost that scales linearly with system size. |
Tasks | |
Published | 2017-07-30 |
URL | http://arxiv.org/abs/1707.09571v2 |
PDF | http://arxiv.org/pdf/1707.09571v2.pdf |
PWC | https://paperswithcode.com/paper/deep-potential-molecular-dynamics-a-scalable |
Repo | https://github.com/google/differentiable-atomistic-potentials |
Framework | tf |
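The core idea can be sketched compactly: total energy is a sum of per-atom network outputs over local-environment descriptors, and interatomic forces come from differentiating that energy with respect to the coordinates. The PyTorch sketch below uses a crude placeholder descriptor (the eight largest inverse distances, assuming at least nine atoms) rather than the paper's full symmetry-preserving construction:

```python
import torch

per_atom_net = torch.nn.Sequential(   # maps a descriptor to an atomic energy
    torch.nn.Linear(8, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))

def energy_and_forces(coords):        # coords: (n_atoms, 3)
    coords = coords.clone().requires_grad_(True)
    dists = torch.cdist(coords, coords)
    inv = 1.0 / (dists + 1e6 * torch.eye(len(coords)))  # mask self-distances
    desc = torch.sort(inv, dim=1, descending=True).values[:, :8]
    energy = per_atom_net(desc).sum()                   # E = sum_i E_i
    forces = -torch.autograd.grad(energy, coords)[0]    # F = -dE/dR
    return energy, forces
```

Because the energy decomposes over atoms, evaluation cost scales linearly with system size, which is the property the abstract emphasizes.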
Improving text classification with vectors of reduced precision
Title | Improving text classification with vectors of reduced precision |
Authors | Krzysztof Wróbel, Maciej Wielgosz, Marcin Pietroń, Michał Karwatowski, Aleksander Smywiński-Pohl |
Abstract | This paper presents an analysis of the impact of floating-point precision reduction on the quality of text classification. Reducing the precision of the vectors representing the data (e.g. the TF-IDF representation in our case) allows for a decrease in computing time and memory footprint on dedicated hardware platforms. The impact of precision reduction on classification quality was evaluated on 5 corpora, using 4 different classifiers. Dimensionality reduction was also taken into account. Results indicate that precision reduction improves classification accuracy in most cases (up to 25% error reduction). In general, the reduction from 64 to 4 bits gives the best scores and ensures that the results will not be worse than with the full floating-point representation. |
Tasks | Dimensionality Reduction, Text Classification |
Published | 2017-06-20 |
URL | http://arxiv.org/abs/1706.06363v1 |
PDF | http://arxiv.org/pdf/1706.06363v1.pdf |
PWC | https://paperswithcode.com/paper/improving-text-classification-with-vectors-of |
Repo | https://github.com/kwrobel-nlp/precision-reduction |
Framework | none |
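The mechanism is uniform quantization of the feature vectors: rescale each TF-IDF vector onto the grid of values representable with b bits, then map back. A NumPy sketch of one way to do this (the paper's exact quantization scheme may differ in detail):

```python
import numpy as np

def reduce_precision(x, bits=4):
    """Quantize a vector to 2**bits levels spanning its value range."""
    levels = 2 ** bits - 1
    lo, hi = x.min(), x.max()
    if hi == lo:                      # constant vector: nothing to quantize
        return x.copy()
    q = np.round((x - lo) / (hi - lo) * levels)
    return q / levels * (hi - lo) + lo
```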
An attentive neural architecture for joint segmentation and parsing and its application to real estate ads
Title | An attentive neural architecture for joint segmentation and parsing and its application to real estate ads |
Authors | Giannis Bekoulis, Johannes Deleu, Thomas Demeester, Chris Develder |
Abstract | In processing human-produced text using natural language processing (NLP) techniques, two fundamental subtasks that arise are (i) segmentation of the plain text into meaningful subunits (e.g., entities), and (ii) dependency parsing, to establish relations between subunits. In this paper, we develop a relatively simple and effective neural joint model that performs both segmentation and dependency parsing together, instead of one after the other as in most state-of-the-art works. We will focus in particular on the real estate ad setting, aiming to convert an ad to a structured description, which we name property tree, comprising the tasks of (1) identifying important entities of a property (e.g., rooms) from classifieds and (2) structuring them into a tree format. In this work, we propose a new joint model that is able to tackle the two tasks simultaneously and construct the property tree by (i) avoiding the error propagation that would arise from performing the subtasks one after the other in a pipelined fashion, and (ii) exploiting the interactions between the subtasks. For this purpose, we perform an extensive comparative study of the pipeline methods and the newly proposed joint model, reporting an improvement of over three percentage points in the overall edge F1 score of the property tree. We also propose attention methods to encourage our model to focus on salient tokens during the construction of the property tree. Thus we experimentally demonstrate the usefulness of attentive neural architectures for the proposed joint model, showcasing a further improvement of two percentage points in edge F1 score for our application. |
Tasks | Dependency Parsing |
Published | 2017-09-27 |
URL | http://arxiv.org/abs/1709.09590v2 |
PDF | http://arxiv.org/pdf/1709.09590v2.pdf |
PWC | https://paperswithcode.com/paper/an-attentive-neural-architecture-for-joint |
Repo | https://github.com/bekou/ad_data |
Framework | none |
Logic Programming Petri Nets
Title | Logic Programming Petri Nets |
Authors | Giovanni Sileno |
Abstract | With the purpose of modeling, specifying and reasoning in an integrated fashion with procedural and declarative aspects (both commonly present in cases or scenarios), the paper introduces Logic Programming Petri Nets (LPPN), an extension to the Petri Net notation providing an interface to logic programming constructs. Two semantics are presented. First, a hybrid operational semantics that separates the process component, treated with Petri nets, from the constraint/terminological component, treated with Answer Set Programming (ASP). Second, a denotational semantics maps the notation to ASP fully, via Event Calculus. These two alternative specifications enable a preliminary evaluation in terms of reasoning efficiency. |
Tasks | |
Published | 2017-01-26 |
URL | http://arxiv.org/abs/1701.07657v1 |
PDF | http://arxiv.org/pdf/1701.07657v1.pdf |
PWC | https://paperswithcode.com/paper/logic-programming-petri-nets |
Repo | https://github.com/s1l3n0/pypneu |
Framework | none |
Learning Combinatorial Optimization Algorithms over Graphs
Title | Learning Combinatorial Optimization Algorithms over Graphs |
Authors | Hanjun Dai, Elias B. Khalil, Yuyu Zhang, Bistra Dilkina, Le Song |
Abstract | The design of good heuristics or approximation algorithms for NP-hard combinatorial optimization problems often requires significant specialized knowledge and trial-and-error. Can we automate this challenging, tedious process, and learn the algorithms instead? In many real-world applications, it is typically the case that the same optimization problem is solved again and again on a regular basis, maintaining the same problem structure but differing in the data. This provides an opportunity for learning heuristic algorithms that exploit the structure of such recurring problems. In this paper, we propose a unique combination of reinforcement learning and graph embedding to address this challenge. The learned greedy policy behaves like a meta-algorithm that incrementally constructs a solution, and the action is determined by the output of a graph embedding network capturing the current state of the solution. We show that our framework can be applied to a diverse range of optimization problems over graphs, and learns effective algorithms for the Minimum Vertex Cover, Maximum Cut and Traveling Salesman problems. |
Tasks | Combinatorial Optimization, Graph Embedding |
Published | 2017-04-05 |
URL | http://arxiv.org/abs/1704.01665v4 |
PDF | http://arxiv.org/pdf/1704.01665v4.pdf |
PWC | https://paperswithcode.com/paper/learning-combinatorial-optimization |
Repo | https://github.com/HENTAIBOY/S2V-DQN_pytorch |
Framework | pytorch |
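Stripped of the learning machinery, the learned policy is a greedy construction loop: embed the current partial solution, score every remaining node, and add the best one. In the sketch below, `q_value` and the `done` termination check are placeholders standing in for the paper's structure2vec embedding network and per-problem stopping criterion:

```python
def greedy_construct(nodes, q_value, done=lambda solution: False):
    """Build a solution by repeatedly adding the highest-Q node.

    q_value(partial_solution, node) -> float is assumed to be the
    trained embedding network; done() stops early for problems like
    Minimum Vertex Cover where not all nodes are needed.
    """
    solution, remaining = [], set(nodes)
    while remaining and not done(solution):
        best = max(remaining, key=lambda v: q_value(solution, v))
        solution.append(best)
        remaining.remove(best)
    return solution
```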
Oversampling for Imbalanced Learning Based on K-Means and SMOTE
Title | Oversampling for Imbalanced Learning Based on K-Means and SMOTE |
Authors | Felix Last, Georgios Douzas, Fernando Bacao |
Abstract | Learning from class-imbalanced data continues to be a common and challenging problem in supervised learning as standard classification algorithms are designed to handle balanced class distributions. While different strategies exist to tackle this problem, methods which generate artificial data to achieve a balanced class distribution are more versatile than modifications to the classification algorithm. Such techniques, called oversamplers, modify the training data, allowing any classifier to be used with class-imbalanced datasets. Many algorithms have been proposed for this task, but most are complex and tend to generate unnecessary noise. This work presents a simple and effective oversampling method based on k-means clustering and SMOTE oversampling, which avoids the generation of noise and effectively overcomes imbalances between and within classes. Empirical results of extensive experiments with 71 datasets show that training data oversampled with the proposed method improves classification results. Moreover, k-means SMOTE consistently outperforms other popular oversampling methods. An implementation is made available in the Python programming language. |
Tasks | Data-to-Text Generation |
Published | 2017-11-02 |
URL | http://arxiv.org/abs/1711.00837v2 |
PDF | http://arxiv.org/pdf/1711.00837v2.pdf |
PWC | https://paperswithcode.com/paper/oversampling-for-imbalanced-learning-based-on |
Repo | https://github.com/AlgoWit/publications |
Framework | none |
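The method composes two standard pieces: k-means finds coherent minority regions, and SMOTE interpolates only between points inside the same cluster, so no synthetic samples land in the empty space between distant minority groups. A NumPy/scikit-learn sketch of that core step; the paper additionally filters clusters by imbalance ratio and allocates the sampling budget by cluster sparsity, both omitted here:

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_smote(X_min, n_new, n_clusters=3, seed=0):
    """Generate up to n_new synthetic minority samples within clusters.

    X_min: (N, D) minority-class samples.
    """
    rng = np.random.default_rng(seed)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X_min)
    synthetic = []
    for _ in range(n_new):
        pts = X_min[labels == rng.integers(n_clusters)]
        if len(pts) < 2:              # skip degenerate clusters
            continue
        a, b = pts[rng.choice(len(pts), size=2, replace=False)]
        synthetic.append(a + rng.random() * (b - a))  # SMOTE interpolation
    return np.array(synthetic)
```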
Class-Weighted Convolutional Features for Visual Instance Search
Title | Class-Weighted Convolutional Features for Visual Instance Search |
Authors | Albert Jimenez, Jose M. Alvarez, Xavier Giro-i-Nieto |
Abstract | Image retrieval in realistic scenarios targets large dynamic datasets of unlabeled images. In these cases, training or fine-tuning a model every time new images are added to the database is neither efficient nor scalable. Convolutional neural networks trained for image classification over large datasets have been proven effective feature extractors for image retrieval. The most successful approaches are based on encoding the activations of convolutional layers, as they convey the image spatial information. In this paper, we go beyond this spatial information and propose a local-aware encoding of convolutional features based on semantic information predicted in the target image. To this end, we obtain the most discriminative regions of an image using Class Activation Maps (CAMs). CAMs are based on the knowledge contained in the network and therefore our approach has the additional advantage of not requiring external information. In addition, we use CAMs to generate object proposals during an unsupervised re-ranking stage after a first fast search. Our experiments on two publicly available datasets for instance retrieval, Oxford5k and Paris6k, demonstrate the competitiveness of our approach, which outperforms the current state-of-the-art when using off-the-shelf models trained on ImageNet. The source code and model used in this paper are publicly available at http://imatge-upc.github.io/retrieval-2017-cam/. |
Tasks | Image Retrieval, Instance Search |
Published | 2017-07-09 |
URL | http://arxiv.org/abs/1707.02581v1 |
PDF | http://arxiv.org/pdf/1707.02581v1.pdf |
PWC | https://paperswithcode.com/paper/class-weighted-convolutional-features-for |
Repo | https://github.com/zxy14120448/Summary |
Framework | pytorch |
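The encoding step is simple to state: weight the last convolutional activations spatially by a class activation map, then sum-pool into a single descriptor. A NumPy sketch of that weighting (aggregation across the top predicted classes and the CAM-based re-ranking stage are omitted):

```python
import numpy as np

def cam_weighted_descriptor(features, cam):
    """features: (C, H, W) conv activations; cam: (H, W) map in [0, 1]."""
    weighted = features * cam[None, :, :]         # spatial class weighting
    desc = weighted.sum(axis=(1, 2))              # sum-pool to a C-dim vector
    return desc / (np.linalg.norm(desc) + 1e-8)   # L2-normalize for retrieval
```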
Beating the World’s Best at Super Smash Bros. with Deep Reinforcement Learning
Title | Beating the World’s Best at Super Smash Bros. with Deep Reinforcement Learning |
Authors | Vlad Firoiu, William F. Whitney, Joshua B. Tenenbaum |
Abstract | There has been a recent explosion in the capabilities of game-playing artificial intelligence. Many classes of RL tasks, from Atari games to motor control to board games, are now solvable by fairly generic algorithms, based on deep learning, that learn to play from experience with minimal knowledge of the specific domain of interest. In this work, we will investigate the performance of these methods on Super Smash Bros. Melee (SSBM), a popular console fighting game. The SSBM environment has complex dynamics and partial observability, making it challenging for human and machine alike. The multi-player aspect poses an additional challenge, as the vast majority of recent advances in RL have focused on single-agent environments. Nonetheless, we will show that it is possible to train agents that are competitive against and even surpass human professionals, a new result for the multi-player video game setting. |
Tasks | Atari Games, Board Games |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06230v3 |
PDF | http://arxiv.org/pdf/1702.06230v3.pdf |
PWC | https://paperswithcode.com/paper/beating-the-worlds-best-at-super-smash-bros |
Repo | https://github.com/shaneallcroft/SSB64RLBot |
Framework | none |
Accurate Lung Segmentation via Network-Wise Training of Convolutional Networks
Title | Accurate Lung Segmentation via Network-Wise Training of Convolutional Networks |
Authors | Sangheum Hwang, Sunggyun Park |
Abstract | We introduce an accurate lung segmentation model for chest radiographs based on deep convolutional neural networks. Our model uses atrous convolutional layers to increase the field-of-view of filters efficiently. To further improve segmentation performance, we also propose a multi-stage training strategy, network-wise training, in which the current-stage network is fed with both the input images and the outputs of the pre-stage network. We show that this strategy reduces falsely predicted labels and produces smooth boundaries of lung fields. We evaluate the proposed model on a common benchmark dataset, JSRT, and achieve state-of-the-art segmentation performance with far fewer model parameters. |
Tasks | |
Published | 2017-08-02 |
URL | http://arxiv.org/abs/1708.00710v1 |
PDF | http://arxiv.org/pdf/1708.00710v1.pdf |
PWC | https://paperswithcode.com/paper/accurate-lung-segmentation-via-network-wise |
Repo | https://github.com/IlliaOvcharenko/lung-segmentation |
Framework | pytorch |
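Network-wise training is easiest to see as stage composition: stage t takes the input image concatenated with stage t-1's prediction, so later stages can suppress false labels and smooth boundaries. A toy PyTorch sketch of two stages; the tiny plain-convolution networks here merely stand in for the paper's atrous-convolution model:

```python
import torch

def make_stage(in_ch):                 # placeholder segmentation network
    return torch.nn.Sequential(
        torch.nn.Conv2d(in_ch, 16, 3, padding=1), torch.nn.ReLU(),
        torch.nn.Conv2d(16, 1, 1), torch.nn.Sigmoid())

stage1, stage2 = make_stage(1), make_stage(2)

def forward(image):                    # image: (B, 1, H, W) chest radiograph
    p1 = stage1(image)                                # first-pass lung mask
    p2 = stage2(torch.cat([image, p1], dim=1))        # refined using p1
    return p1, p2              # each stage is trained with its own loss
```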
PQk-means: Billion-scale Clustering for Product-quantized Codes
Title | PQk-means: Billion-scale Clustering for Product-quantized Codes |
Authors | Yusuke Matsui, Keisuke Ogaki, Toshihiko Yamasaki, Kiyoharu Aizawa |
Abstract | Data clustering is a fundamental operation in data analysis. For handling large-scale data, the standard k-means clustering method is not only slow, but also memory-inefficient. We propose an efficient clustering method for billion-scale feature vectors, called PQk-means. By first compressing input vectors into short product-quantized (PQ) codes, PQk-means achieves fast and memory-efficient clustering, even for high-dimensional vectors. Similar to k-means, PQk-means repeats the assignment and update steps, both of which can be performed in the PQ-code domain. Experimental results show that even short (32-bit) PQ codes can produce results competitive with standard k-means. This result is of practical importance for clustering in memory-restricted environments. Using the proposed PQk-means scheme, the clustering of one billion 128D SIFT features with K = 10^5 is achieved within 14 hours, using just 32 GB of memory on a single computer. |
Tasks | |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.03708v1 |
PDF | http://arxiv.org/pdf/1709.03708v1.pdf |
PWC | https://paperswithcode.com/paper/pqk-means-billion-scale-clustering-for |
Repo | https://github.com/DwangoMediaVillage/pqkmeans |
Framework | none |
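The trick that keeps both k-means steps in the compressed domain is precomputation: per-subspace tables of squared distances between all pairs of sub-codewords reduce a code-to-code comparison to M table lookups. A NumPy sketch of that symmetric distance (codebook training and the paper's sparse-voting center update are omitted):

```python
import numpy as np

def build_tables(codebooks):
    """codebooks: (M, 256, D/M) -> (M, 256, 256) sub-codeword distances."""
    return np.stack([
        ((cb[:, None, :] - cb[None, :, :]) ** 2).sum(-1) for cb in codebooks])

def pq_code_distance(code_a, code_b, tables):
    """code_a, code_b: (M,) uint8 PQ codes; M lookups instead of a
    D-dimensional distance computation."""
    return sum(tables[m, code_a[m], code_b[m]] for m in range(len(tables)))
```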
Learning how to explain neural networks: PatternNet and PatternAttribution
Title | Learning how to explain neural networks: PatternNet and PatternAttribution |
Authors | Pieter-Jan Kindermans, Kristof T. Schütt, Maximilian Alber, Klaus-Robert Müller, Dumitru Erhan, Been Kim, Sven Dähne |
Abstract | DeConvNet, Guided BackProp, and LRP were invented to better understand deep neural networks. We show that these methods do not produce the theoretically correct explanation for a linear model. Yet they are used on multi-layer networks with millions of parameters. This is a cause for concern since linear models are simple neural networks. We argue that explanation methods for neural nets should work reliably in the limit of simplicity: the linear model. Based on our analysis of linear models, we propose a generalization that yields two explanation techniques (PatternNet and PatternAttribution) that are theoretically sound for linear models and produce improved explanations for deep networks. |
Tasks | |
Published | 2017-05-16 |
URL | http://arxiv.org/abs/1705.05598v2 |
PDF | http://arxiv.org/pdf/1705.05598v2.pdf |
PWC | https://paperswithcode.com/paper/learning-how-to-explain-neural-networks |
Repo | https://github.com/pikinder/nn-patterns |
Framework | none |
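For a linear model y = w^T x the paper's argument is concrete: the weight vector w must cancel distractor directions, so it is not the direction to visualize; the signal direction is instead the pattern a = cov(x, y) / var(y), which satisfies w^T a = 1. A NumPy sketch of that estimator (PatternNet applies it per neuron throughout the network and backpropagates the patterns instead of the weights):

```python
import numpy as np

def linear_pattern(X, w):
    """Estimate the signal pattern of a linear model y = w^T x.

    X: (N, D) data, w: (D,) weights; returns the (D,) pattern a.
    """
    y = X @ w
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    return (Xc * yc[:, None]).mean(axis=0) / yc.var()  # cov(x, y) / var(y)
```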