April 3, 2020

Paper Group AWR 5

Gryffin: An algorithm for Bayesian optimization for categorical variables informed by physical intuition with applications to chemistry

Title Gryffin: An algorithm for Bayesian optimization for categorical variables informed by physical intuition with applications to chemistry
Authors Florian Häse, Loïc M. Roch, Alán Aspuru-Guzik
Abstract Designing functional molecules and advanced materials requires complex interdependent design choices: tuning continuous process parameters such as temperatures or flow rates, while simultaneously selecting categorical variables like catalysts or solvents. To date, the development of data-driven experiment planning strategies for autonomous experimentation has largely focused on continuous process parameters despite the urge to devise efficient strategies for the selection of categorical variables to substantially accelerate scientific discovery. We introduce Gryffin, a general purpose optimization framework for the autonomous selection of categorical variables driven by expert knowledge. Gryffin augments Bayesian optimization with kernel density estimation using smooth approximations to categorical distributions. Leveraging domain knowledge from physicochemical descriptors to characterize categorical options, Gryffin can significantly accelerate the search for promising molecules and materials. Gryffin can further highlight relevant correlations between the provided descriptors to inspire physical insights and foster scientific intuition. In addition to comprehensive benchmarks, we demonstrate the capabilities and performance of Gryffin on three examples in materials science and chemistry: (i) the discovery of non-fullerene acceptors for organic solar cells, (ii) the design of hybrid organic-inorganic perovskites for light-harvesting, and (iii) the identification of ligands and process parameters for Suzuki-Miyaura reactions. Our observations suggest that Gryffin, in its simplest form without descriptors, constitutes a competitive categorical optimizer compared to state-of-the-art approaches. However, when leveraging domain knowledge provided via descriptors, Gryffin can optimize at considerably higher rates and refine this domain knowledge to spark scientific understanding.
Tasks Density Estimation
Published 2020-03-26
URL https://arxiv.org/abs/2003.12127v1
PDF https://arxiv.org/pdf/2003.12127v1.pdf
PWC https://paperswithcode.com/paper/gryffin-an-algorithm-for-bayesian
Repo https://github.com/aspuru-guzik-group/gryffin
Framework none

Training Binary Neural Networks with Real-to-Binary Convolutions

Title Training Binary Neural Networks with Real-to-Binary Convolutions
Authors Brais Martinez, Jing Yang, Adrian Bulat, Georgios Tzimiropoulos
Abstract This paper shows how to train binary networks to within a few percent points (~3-5%) of the full precision counterpart. We first show how to build a strong baseline, which already achieves state-of-the-art accuracy, by combining recently proposed advances and carefully adjusting the optimization procedure. Secondly, we show that by attempting to minimize the discrepancy between the output of the binary and the corresponding real-valued convolution, additional significant accuracy gains can be obtained. We materialize this idea in two complementary ways: (1) with a loss function, during training, by matching the spatial attention maps computed at the output of the binary and real-valued convolutions, and (2) in a data-driven manner, by using the real-valued activations, available during inference prior to the binarization process, for re-scaling the activations right after the binary convolution. Finally, we show that, when putting all of our improvements together, the proposed model beats the current state of the art by more than 5% top-1 accuracy on ImageNet and reduces the gap to its real-valued counterpart to less than 3% and 5% top-1 accuracy on CIFAR-100 and ImageNet respectively when using a ResNet-18 architecture. Code available at https://github.com/brais-martinez/real2binary.
Published 2020-03-25
URL https://arxiv.org/abs/2003.11535v1
PDF https://arxiv.org/pdf/2003.11535v1.pdf
PWC https://paperswithcode.com/paper/training-binary-neural-networks-with-real-to-1
Repo https://github.com/brais-martinez/real2binary
Framework none
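
The second idea in the abstract, re-scaling the output of a binary convolution with information taken from the real-valued activations, can be illustrated on a single dot product. This is only a minimal sketch under loud assumptions: the function names are hypothetical, and the scale used here (the mean absolute value of the real-valued inputs, in the style of XNOR-Net) is a simple stand-in, not the data-driven re-scaling the paper itself proposes.

```python
def sign(x):
    """Binarize one value to +1 or -1 (zero maps to +1)."""
    return 1.0 if x >= 0 else -1.0


def binary_dot(inputs, weights):
    """Dot product after binarizing both inputs and weights."""
    return sum(sign(a) * sign(w) for a, w in zip(inputs, weights))


def rescaled_binary_dot(inputs, weights):
    """Binary dot product re-scaled by a factor computed from the real inputs.

    Assumption: the scale is the mean absolute input value, a stand-in
    for whatever scaling function a real model would use.
    """
    scale = sum(abs(a) for a in inputs) / len(inputs)
    return scale * binary_dot(inputs, weights)


inputs = [0.5, -1.5, 2.0, -1.0]   # real-valued activations before binarization
weights = [0.3, -0.2, 0.7, 0.1]   # real-valued weights before binarization

plain = binary_dot(inputs, weights)
rescaled = rescaled_binary_dot(inputs, weights)
```

The point of the sketch is only that the real-valued activations are still available before binarization, so a magnitude signal that the sign function discards can be reintroduced after the binary operation.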

ForecastNet: A Time-Variant Deep Feed-Forward Neural Network Architecture for Multi-Step-Ahead Time-Series Forecasting

Title ForecastNet: A Time-Variant Deep Feed-Forward Neural Network Architecture for Multi-Step-Ahead Time-Series Forecasting
Authors Joel Janek Dabrowski, YiFan Zhang, Ashfaqur Rahman
Abstract Recurrent and convolutional neural networks are the most common architectures used for time series forecasting in deep learning literature. These networks use parameter sharing by repeating a set of fixed architectures with fixed parameters over time or space. The result is that the overall architecture is time-invariant (shift-invariant in the spatial domain) or stationary. We argue that time-invariance can reduce the capacity to perform multi-step-ahead forecasting, where modelling the dynamics at a range of scales and resolutions is required. We propose ForecastNet which uses a deep feed-forward architecture to provide a time-variant model. An additional novelty of ForecastNet is interleaved outputs, which we show assist in mitigating vanishing gradients. ForecastNet is demonstrated to outperform statistical and deep learning benchmark models on several datasets.
Tasks Time Series, Time Series Forecasting
Published 2020-02-11
URL https://arxiv.org/abs/2002.04155v1
PDF https://arxiv.org/pdf/2002.04155v1.pdf
PWC https://paperswithcode.com/paper/forecastnet-a-time-variant-deep-feed-forward
Repo https://github.com/jjdabr/forecastNet
Framework pytorch

TACO: Trash Annotations in Context for Litter Detection

Title TACO: Trash Annotations in Context for Litter Detection
Authors Pedro F Proença, Pedro Simões
Abstract TACO is an open image dataset for litter detection and segmentation, which is growing through crowdsourcing. Firstly, this paper describes this dataset and the tools developed to support it. Secondly, we report instance segmentation performance using Mask R-CNN on the current version of TACO. Despite its small size (1500 images and 4784 annotations), our results are promising on this challenging problem. However, to achieve satisfactory trash detection in the wild for deployment, TACO still needs much more manual annotations. These can be contributed using: http://tacodataset.org/
Tasks Instance Segmentation, Semantic Segmentation
Published 2020-03-16
URL https://arxiv.org/abs/2003.06975v2
PDF https://arxiv.org/pdf/2003.06975v2.pdf
PWC https://paperswithcode.com/paper/taco-trash-annotations-in-context-for-litter
Repo https://github.com/pedropro/TACO
Framework none

Hybrid Deep Embedding for Recommendations with Dynamic Aspect-Level Explanations

Title Hybrid Deep Embedding for Recommendations with Dynamic Aspect-Level Explanations
Authors Huanrui Luo, Ning Yang, Philip S. Yu
Abstract Explainable recommendation is far from being well solved partly due to three challenges. The first is the personalization of preference learning, which requires that different items/users have different contributions to the learning of user preference or item quality. The second one is dynamic explanation, which is crucial for the timeliness of recommendation explanations. The last one is the granularity of explanations. In practice, aspect-level explanations are more persuasive than item-level or user-level ones. In this paper, to address these challenges simultaneously, we propose a novel model called Hybrid Deep Embedding (HDE) for aspect-based explainable recommendations, which can make recommendations with dynamic aspect-level explanations. The main idea of HDE is to learn the dynamic embeddings of users and items for rating prediction and the dynamic latent aspect preference/quality vectors for the generation of aspect-level explanations, through fusion of the dynamic implicit feedbacks extracted from reviews and the attentive user-item interactions. Particularly, as the aspect preference/quality of users/items is learned automatically, HDE is able to capture the impact of aspects that are not mentioned in reviews of a user or an item. The extensive experiments conducted on real datasets verify the recommending performance and explainability of HDE. The source code of our work is available at https://github.com/lola63/HDE-Python.
Published 2020-01-18
URL https://arxiv.org/abs/2001.10341v1
PDF https://arxiv.org/pdf/2001.10341v1.pdf
PWC https://paperswithcode.com/paper/hybrid-deep-embedding-for-recommendations
Repo https://github.com/lola63/HDE-Python
Framework none

Deep Graph Matching via Blackbox Differentiation of Combinatorial Solvers

Title Deep Graph Matching via Blackbox Differentiation of Combinatorial Solvers
Authors Michal Rolínek, Paul Swoboda, Dominik Zietlow, Anselm Paulus, Vít Musil, Georg Martius
Abstract Building on recent progress at the intersection of combinatorial optimization and deep learning, we propose an end-to-end trainable architecture for deep graph matching that contains unmodified combinatorial solvers. Using the presence of heavily optimized combinatorial solvers together with some improvements in architecture design, we advance state-of-the-art on deep graph matching benchmarks for keypoint correspondence. In addition, we highlight the conceptual advantages of incorporating solvers into deep learning architectures, such as the possibility of post-processing with a strong multi-graph matching solver or the indifference to changes in the training setting. Finally, we propose two new challenging experimental setups.
Tasks Combinatorial Optimization, Graph Matching
Published 2020-03-25
URL https://arxiv.org/abs/2003.11657v1
PDF https://arxiv.org/pdf/2003.11657v1.pdf
PWC https://paperswithcode.com/paper/deep-graph-matching-via-blackbox
Repo https://github.com/martius-lab/blackbox-backprop
Framework pytorch

A Unified Object Motion and Affinity Model for Online Multi-Object Tracking

Title A Unified Object Motion and Affinity Model for Online Multi-Object Tracking
Authors Junbo Yin, Wenguan Wang, Qinghao Meng, Ruigang Yang, Jianbing Shen
Abstract Current popular online multi-object tracking (MOT) solutions apply single object trackers (SOTs) to capture object motions, while often requiring an extra affinity network to associate objects, especially for the occluded ones. This brings extra computational overhead due to repetitive feature extraction for SOT and affinity computation. Meanwhile, the model size of the sophisticated affinity network is usually non-trivial. In this paper, we propose a novel MOT framework that unifies object motion and affinity model into a single network, named UMA, in order to learn a compact feature that is discriminative for both object motion and affinity measure. In particular, UMA integrates single object tracking and metric learning into a unified triplet network by means of multi-task learning. Such design brings advantages of improved computation efficiency, low memory requirement and simplified training procedure. In addition, we equip our model with a task-specific attention module, which is used to boost task-aware feature learning. The proposed UMA can be easily trained end-to-end, and is elegant - requiring only one training stage. Experimental results show that it achieves promising performance on several MOT Challenge benchmarks.
Tasks Metric Learning, Multi-Object Tracking, Multi-Task Learning, Object Tracking, Online Multi-Object Tracking
Published 2020-03-25
URL https://arxiv.org/abs/2003.11291v1
PDF https://arxiv.org/pdf/2003.11291v1.pdf
PWC https://paperswithcode.com/paper/a-unified-object-motion-and-affinity-model
Repo https://github.com/yinjunbo/UMA-MOT
Framework none

AutoFIS: Automatic Feature Interaction Selection in Factorization Models for Click-Through Rate Prediction

Title AutoFIS: Automatic Feature Interaction Selection in Factorization Models for Click-Through Rate Prediction
Authors Bin Liu, Chenxu Zhu, Guilin Li, Weinan Zhang, Jincai Lai, Ruiming Tang, Xiuqiang He, Zhenguo Li, Yong Yu
Abstract Learning effective feature interactions is crucial for click-through rate (CTR) prediction tasks in recommender systems. In most of the existing deep learning models, feature interactions are either manually designed or simply enumerated. However, enumerating all feature interactions brings large memory and computation cost. Even worse, useless interactions may introduce unnecessary noise and complicate the training process. In this work, we propose a two-stage algorithm called Automatic Feature Interaction Selection (AutoFIS). AutoFIS can automatically identify all the important feature interactions for factorization models with just the computational cost equivalent to training the target model to convergence. In the search stage, instead of searching over a discrete set of candidate feature interactions, we relax the choices to be continuous by introducing the architecture parameters. By implementing a regularized optimizer over the architecture parameters, the model can automatically identify and remove the redundant feature interactions during the training process of the model. In the re-train stage, we keep the architecture parameters serving as an attention unit to further boost the performance. Offline experiments on three large-scale datasets (two public benchmarks, one private) demonstrate that the proposed AutoFIS can significantly improve various FM based models. AutoFIS has been deployed onto the training platform of Huawei App Store recommendation service, where a 10-day online A/B test demonstrated that AutoFIS improved the DeepFM model by 20.3% and 20.1% in terms of CTR and CVR respectively.
Tasks Click-Through Rate Prediction, Recommendation Systems
Published 2020-03-25
URL https://arxiv.org/abs/2003.11235v2
PDF https://arxiv.org/pdf/2003.11235v2.pdf
PWC https://paperswithcode.com/paper/autofis-automatic-feature-interaction
Repo https://github.com/zhuchenxv/AutoFIS
Framework none
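
The continuous relaxation described in the abstract can be pictured as one gate per candidate feature interaction. The sketch below is illustrative only and makes several assumptions: the names `gated_interactions` and `gates` are hypothetical, each interaction is taken to be the inner product of two feature embeddings (as in a factorization machine), and the gate values are set by hand rather than learned by a regularized optimizer.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gated_interactions(embeddings, gates):
    """Sum of pairwise embedding inner products, each scaled by its gate.

    A gate of 0.0 removes the interaction entirely; a non-zero gate
    acts as a weight on the interaction it keeps.
    """
    total = 0.0
    for i, j in combinations(range(len(embeddings)), 2):
        total += gates[(i, j)] * dot(embeddings[i], embeddings[j])
    return total


# Three feature fields with 2-dimensional embeddings (made-up values).
embeddings = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]

# Hand-set gates: the (0, 2) interaction has been pruned to zero.
gates = {(0, 1): 1.0, (0, 2): 0.0, (1, 2): 0.5}

score = gated_interactions(embeddings, gates)
kept = sorted(pair for pair, g in gates.items() if g != 0.0)
```

In this picture, the search stage corresponds to learning the gate values, and the re-train stage corresponds to keeping only the pairs in `kept` while continuing to use their gates as weights.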

Spatio-Temporal Handwriting Imitation

Title Spatio-Temporal Handwriting Imitation
Authors Martin Mayr, Martin Stumpf, Anguelos Nikolaou, Mathias Seuret, Andreas Maier, Vincent Christlein
Abstract Most people think that their handwriting is unique and cannot be imitated by machines, especially not using completely new content. Current cursive handwriting synthesis is visually limited or needs user interaction. We show that subdividing the process into smaller subtasks makes it possible to imitate someone’s handwriting with a high chance to be visually indistinguishable for humans. Therefore, a given handwritten sample will be used as the target style. This sample is transferred to an online sequence. Then, a method for online handwriting synthesis is used to produce a new realistic-looking text primed with the online input sequence. This new text is then rendered and style-adapted to the input pen. We show the effectiveness of the pipeline by generating in- and out-of-vocabulary handwritten samples that are validated in a comprehensive user study. Additionally, we show that also a typical writer identification system can partially be fooled by the created fake handwritings.
Published 2020-03-24
URL https://arxiv.org/abs/2003.10593v1
PDF https://arxiv.org/pdf/2003.10593v1.pdf
PWC https://paperswithcode.com/paper/spatio-temporal-handwriting-imitation
Repo https://github.com/M4rt1nM4yr/spatio-temporal_handwriting_imitation
Framework pytorch

Motion-Attentive Transition for Zero-Shot Video Object Segmentation

Title Motion-Attentive Transition for Zero-Shot Video Object Segmentation
Authors Tianfei Zhou, Shunzhou Wang, Yi Zhou, Yazhou Yao, Jianwu Li, Ling Shao
Abstract In this paper, we present a novel Motion-Attentive Transition Network (MATNet) for zero-shot video object segmentation, which provides a new way of leveraging motion information to reinforce spatio-temporal object representation. An asymmetric attention block, called Motion-Attentive Transition (MAT), is designed within a two-stream encoder, which transforms appearance features into motion-attentive representations at each convolutional stage. In this way, the encoder becomes deeply interleaved, allowing for closely hierarchical interactions between object motion and appearance. This is superior to the typical two-stream architecture, which treats motion and appearance separately in each stream and often suffers from overfitting to appearance information. Additionally, a bridge network is proposed to obtain a compact, discriminative and scale-sensitive representation for multi-level encoder features, which is further fed into a decoder to achieve segmentation results. Extensive experiments on three challenging public benchmarks (i.e. DAVIS-16, FBMS and Youtube-Objects) show that our model achieves compelling performance against the state-of-the-arts.
Tasks Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published 2020-03-09
URL https://arxiv.org/abs/2003.04253v2
PDF https://arxiv.org/pdf/2003.04253v2.pdf
PWC https://paperswithcode.com/paper/motion-attentive-transition-for-zero-shot
Repo https://github.com/tfzhou/MATNet
Framework pytorch

A Probabilistic Formulation of Unsupervised Text Style Transfer

Title A Probabilistic Formulation of Unsupervised Text Style Transfer
Authors Junxian He, Xinyi Wang, Graham Neubig, Taylor Berg-Kirkpatrick
Abstract We present a deep generative model for unsupervised text style transfer that unifies previously proposed non-generative techniques. Our probabilistic approach models non-parallel data from two domains as a partially observed parallel corpus. By hypothesizing a parallel latent sequence that generates each observed sequence, our model learns to transform sequences from one domain to another in a completely unsupervised fashion. In contrast with traditional generative sequence models (e.g. the HMM), our model makes few assumptions about the data it generates: it uses a recurrent language model as a prior and an encoder-decoder as a transduction distribution. While computation of marginal data likelihood is intractable in this model class, we show that amortized variational inference admits a practical surrogate. Further, by drawing connections between our variational objective and other recent unsupervised style transfer and machine translation techniques, we show how our probabilistic view can unify some known non-generative objectives such as backtranslation and adversarial loss. Finally, we demonstrate the effectiveness of our method on a wide range of unsupervised style transfer tasks, including sentiment transfer, formality transfer, word decipherment, author imitation, and related language translation. Across all style transfer tasks, our approach yields substantial gains over state-of-the-art non-generative baselines, including the state-of-the-art unsupervised machine translation techniques that our approach generalizes. Further, we conduct experiments on a standard unsupervised machine translation task and find that our unified approach matches the current state-of-the-art.
Tasks Language Modelling, Machine Translation, Style Transfer, Text Style Transfer, Unsupervised Machine Translation
Published 2020-02-10
URL https://arxiv.org/abs/2002.03912v2
PDF https://arxiv.org/pdf/2002.03912v2.pdf
PWC https://paperswithcode.com/paper/a-probabilistic-formulation-of-unsupervised-1
Repo https://github.com/cindyxinyiwang/deep-latent-sequence-model
Framework pytorch

Learning@home: Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts

Title Learning@home: Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts
Authors Maksim Riabinin, Anton Gusev
Abstract Many recent breakthroughs in deep learning were achieved by training increasingly larger models on massive datasets. However, training such models can be prohibitively expensive. For instance, Megatron Language Model with 8.3B parameters was trained on a GPU cluster worth $25 million. As a result, most researchers cannot afford to train state of the art models and contribute to their development. Hypothetically, a researcher could crowdsource the training of large neural networks with thousands of regular PCs provided by volunteers. The raw computing power of ten thousand $2500 desktops dwarfs that of a $25M server pod, but one cannot utilize that power efficiently with conventional distributed training methods. In this work, we propose Learning@home: a neural network training paradigm designed to handle millions of poorly connected participants. We analyze the performance, reliability, and architectural constraints of this paradigm and compare it against existing distributed training techniques.
Tasks Language Modelling
Published 2020-02-10
URL https://arxiv.org/abs/2002.04013v1
PDF https://arxiv.org/pdf/2002.04013v1.pdf
PWC https://paperswithcode.com/paper/learninghome-crowdsourced-training-of-large
Repo https://github.com/mryab/learning-at-home
Framework pytorch

Adapted Center and Scale Prediction: More Stable and More Accurate

Title Adapted Center and Scale Prediction: More Stable and More Accurate
Authors Wenhao Wang
Abstract Pedestrian detection has benefited from deep learning technology and has developed rapidly in recent years. Most detectors follow the general object detection framework, i.e. default boxes and a two-stage process. Recently, anchor-free and one-stage detectors have been introduced into this area. However, their accuracies are unsatisfactory. Therefore, in order to enjoy the simplicity of anchor-free detectors and the accuracy of two-stage ones simultaneously, we propose some adaptations based on a detector, Center and Scale Prediction (CSP). The main contributions of our paper are: (1) We improve the robustness of CSP and make it easier to train. (2) We propose a novel method to predict width, namely compressing width. (3) We achieve the second best performance on the CityPersons benchmark, i.e. 9.3% log-average miss rate (MR) on the reasonable set, 8.7% MR on the partial set and 5.6% MR on the bare set, which shows that an anchor-free and one-stage detector can still have high accuracy. (4) We explore some capabilities of Switchable Normalization which are not mentioned in its original paper.
Tasks Object Detection, Pedestrian Detection
Published 2020-02-20
URL https://arxiv.org/abs/2002.09053v2
PDF https://arxiv.org/pdf/2002.09053v2.pdf
PWC https://paperswithcode.com/paper/adapted-center-and-scale-prediction-more
Repo https://github.com/WangWenhao0716/Adapted-Center-and-Scale-Prediction
Framework pytorch

Efficient Sentence Embedding via Semantic Subspace Analysis

Title Efficient Sentence Embedding via Semantic Subspace Analysis
Authors Bin Wang, Fenxiao Chen, Yuncheng Wang, C. -C. Jay Kuo
Abstract A novel sentence embedding method built upon semantic subspace analysis, called semantic subspace sentence embedding (S3E), is proposed in this work. Given the fact that word embeddings can capture semantic relationship while semantically similar words tend to form semantic groups in a high-dimensional embedding space, we develop a sentence representation scheme by analyzing semantic subspaces of its constituent words. Specifically, we construct a sentence model from two aspects. First, we represent words that lie in the same semantic group using the intra-group descriptor. Second, we characterize the interaction between multiple semantic groups with the inter-group descriptor. The proposed S3E method is evaluated on both textual similarity tasks and supervised tasks. Experimental results show that it offers comparable or better performance than the state-of-the-art. The complexity of our S3E method is also much lower than other parameterized models.
Tasks Sentence Embedding, Word Embeddings
Published 2020-02-22
URL https://arxiv.org/abs/2002.09620v2
PDF https://arxiv.org/pdf/2002.09620v2.pdf
PWC https://paperswithcode.com/paper/efficient-sentence-embedding-via-semantic
Repo https://github.com/BinWang28/Sentence-Embedding-S3E
Framework none

R2DE: a NLP approach to estimating IRT parameters of newly generated questions

Title R2DE: a NLP approach to estimating IRT parameters of newly generated questions
Authors Luca Benedetto, Andrea Cappelli, Roberto Turrin, Paolo Cremonesi
Abstract The main objective of exams consists in performing an assessment of students’ expertise on a specific subject. Such expertise, also referred to as skill or knowledge level, can then be leveraged in different ways (e.g., to assign a grade to the students, to understand whether a student might need some support, etc.). Similarly, the questions appearing in the exams have to be assessed in some way before being used to evaluate students. Standard approaches to questions’ assessment are either subjective (e.g., assessment by human experts) or introduce a long delay in the process of question generation (e.g., pretesting with real students). In this work we introduce R2DE (which is a Regressor for Difficulty and Discrimination Estimation), a model capable of assessing newly generated multiple-choice questions by looking at the text of the question and the text of the possible choices. In particular, it can estimate the difficulty and the discrimination of each question, as they are defined in Item Response Theory. We also present the results of extensive experiments we carried out on a real world large scale dataset coming from an e-learning platform, showing that our model can be used to perform an initial assessment of newly created questions and ease some of the problems that arise in question generation.
Tasks Question Generation
Published 2020-01-21
URL https://arxiv.org/abs/2001.07569v1
PDF https://arxiv.org/pdf/2001.07569v1.pdf
PWC https://paperswithcode.com/paper/r2de-a-nlp-approach-to-estimating-irt
Repo https://github.com/lucabenedetto/r2de-nlp-to-estimating-irt-parameters
Framework none
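
To make the two estimated quantities concrete, the sketch below shows how difficulty and discrimination enter the two-parameter logistic (2PL) model, the standard Item Response Theory model that uses exactly these two parameters. The abstract does not say which IRT model R2DE targets, so the choice of 2PL is an assumption, and the function name and numeric values are purely illustrative.

```python
import math


def prob_correct(theta, difficulty, discrimination):
    """2PL item response function.

    Returns the probability that a student with skill level `theta`
    answers an item with the given difficulty and discrimination correctly.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# A student whose skill equals the item's difficulty has a 50% chance.
p_at_difficulty = prob_correct(theta=0.5, difficulty=0.5, discrimination=1.7)

# Higher discrimination separates students above and below the
# difficulty level more sharply.
gap_low = prob_correct(1.0, 0.5, 0.5) - prob_correct(0.0, 0.5, 0.5)
gap_high = prob_correct(1.0, 0.5, 2.0) - prob_correct(0.0, 0.5, 2.0)
```

Under this model, difficulty shifts the curve along the skill axis and discrimination controls its slope, which is why both need to be known before a question can be used to evaluate students.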