Paper Group AWR 77
Review-guided Helpful Answer Identification in E-commerce. Review, Analyze, and Design a Comprehensive Deep Reinforcement Learning Framework. Inference for Batched Bandits. Transformation Importance with Applications to Cosmology. Rethinking 1D-CNN for Time Series Classification: A Stronger Baseline. Pixel-Level Self-Paced Learning for Super-Resolu …
Review-guided Helpful Answer Identification in E-commerce
Title | Review-guided Helpful Answer Identification in E-commerce |
Authors | Wenxuan Zhang, Wai Lam, Yang Deng, Jing Ma |
Abstract | Product-specific community question answering platforms can greatly help address the concerns of potential customers. However, the user-provided answers on such platforms often vary greatly in quality. Helpfulness votes from the community can indicate the overall quality of an answer, but they are often missing. Accurately predicting the helpfulness of an answer to a given question, and thus identifying helpful answers, is therefore increasingly needed. Since the helpfulness of an answer depends on multiple perspectives beyond the topical relevance investigated in typical QA tasks, common answer selection algorithms are insufficient for this task. In this paper, we propose the Review-guided Answer Helpfulness Prediction (RAHP) model, which not only considers the interactions between QA pairs but also investigates the opinion coherence between the answer and the crowd's opinions reflected in the reviews, another important factor for identifying helpful answers. Moreover, we cast the task of determining opinion coherence as a language inference problem and explore a pre-training strategy to transfer textual inference knowledge obtained from a specifically designed network. Extensive experiments conducted on real-world data across seven product categories show that our proposed model achieves superior performance on the prediction task. |
Tasks | Answer Selection, Community Question Answering, Question Answering |
Published | 2020-03-13 |
URL | https://arxiv.org/abs/2003.06209v1 |
https://arxiv.org/pdf/2003.06209v1.pdf | |
PWC | https://paperswithcode.com/paper/review-guided-helpful-answer-identification |
Repo | https://github.com/isakzhang/answer-helpfulness-prediction |
Framework | none |
Review, Analyze, and Design a Comprehensive Deep Reinforcement Learning Framework
Title | Review, Analyze, and Design a Comprehensive Deep Reinforcement Learning Framework |
Authors | Ngoc Duy Nguyen, Thanh Thi Nguyen, Hai Nguyen, Saeid Nahavandi |
Abstract | Reinforcement learning (RL) has emerged as a standard approach for building intelligent systems in which multiple self-operated agents collectively accomplish a designated task. More importantly, RL has attracted great attention since the introduction of deep learning, which essentially makes RL feasible in high-dimensional environments. However, current research interests have diverged into different directions, such as multi-agent and multi-objective learning, and human-machine interaction. Therefore, in this paper, we propose a comprehensive software architecture that not only plays a vital role in designing a connect-the-dots deep RL architecture but also provides a guideline for developing a realistic RL application in a short time span. By inheriting the proposed architecture, software managers can foresee challenges when designing a deep RL-based system. As a result, they can expedite the design process and actively control every stage of software development, which is especially critical in agile development environments. For this reason, we designed a deep RL-based framework that strictly ensures flexibility, robustness, and scalability. Finally, to enforce generalization, the proposed architecture does not depend on a specific RL algorithm, network configuration, number of agents, or type of agent. |
Tasks | |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.11883v1 |
https://arxiv.org/pdf/2002.11883v1.pdf | |
PWC | https://paperswithcode.com/paper/review-analyze-and-design-a-comprehensive |
Repo | https://github.com/garlicdevs/Fruit-API |
Framework | tf |
Inference for Batched Bandits
Title | Inference for Batched Bandits |
Authors | Kelly W. Zhang, Lucas Janson, Susan A. Murphy |
Abstract | As bandit algorithms are increasingly utilized in scientific studies, there is an associated increasing need for reliable inference methods based on the resulting adaptively-collected data. In this work, we develop methods for inference regarding the treatment effect on data collected in batches using a bandit algorithm. We focus on the setting in which the total number of batches is fixed and develop approximate inference methods based on the asymptotic distribution as the size of the batches goes to infinity. We first prove that the ordinary least squares estimator (OLS), which is asymptotically normal on independently sampled data, is not asymptotically normal on data collected using standard bandit algorithms when the treatment effect is zero. This asymptotic non-normality result implies that the naive assumption that the OLS estimator is approximately normal can lead to Type-1 error inflation and confidence intervals with below-nominal coverage probabilities. Second, we introduce the Batched OLS estimator (BOLS) that we prove is asymptotically normal—even in the zero treatment effect case—on data collected from both multi-arm and contextual bandits. Moreover, BOLS is robust to changes in the baseline reward and can be used for obtaining simultaneous confidence intervals for the treatment effect from all batches in non-stationary bandits. We demonstrate in simulations that BOLS can be used reliably for hypothesis testing and obtaining a confidence interval for the treatment effect, even in small sample settings. |
Tasks | Multi-Armed Bandits |
Published | 2020-02-08 |
URL | https://arxiv.org/abs/2002.03217v1 |
https://arxiv.org/pdf/2002.03217v1.pdf | |
PWC | https://paperswithcode.com/paper/inference-for-batched-bandits |
Repo | https://github.com/kellywzhang/inference_batched_bandits |
Framework | none |
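To make the batched-inference idea above concrete, the sketch below computes a per-batch difference in means (the OLS treatment-effect estimate within a batch), standardizes it, and averages the resulting z-statistics across batches. It is a minimal numpy illustration of the batching idea, not the paper's exact BOLS construction or its contextual and non-stationary extensions; the simulated data and the simple averaging rule are assumptions for demonstration.

```python
import numpy as np

def batched_ols_zstats(batches):
    """Per-batch treatment-effect z-statistics, in the spirit of BOLS.

    `batches` is a list of (actions, rewards) arrays for each batch;
    actions are 0/1 arm indicators. Within a batch, actions are fixed
    before rewards are observed, so the batch-wise OLS estimate can be
    standardized and treated as approximately normal.
    """
    zstats = []
    for actions, rewards in batches:
        r1, r0 = rewards[actions == 1], rewards[actions == 0]
        effect = r1.mean() - r0.mean()                      # per-batch OLS estimate
        pooled_var = ((r1 - r1.mean()) ** 2).sum() + ((r0 - r0.mean()) ** 2).sum()
        pooled_var /= (len(rewards) - 2)
        se = np.sqrt(pooled_var * (1 / len(r1) + 1 / len(r0)))
        zstats.append(effect / se)
    # A simple combined statistic: average of per-batch z-scores, rescaled.
    return np.array(zstats), np.mean(zstats) * np.sqrt(len(zstats))

# Example with simulated data and a zero treatment effect.
rng = np.random.default_rng(0)
batches = []
for _ in range(5):
    actions = rng.integers(0, 2, size=100)
    rewards = 0.0 * actions + rng.normal(size=100)
    batches.append((actions, rewards))
z_per_batch, z_combined = batched_ols_zstats(batches)
print(z_per_batch, z_combined)
```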
Transformation Importance with Applications to Cosmology
Title | Transformation Importance with Applications to Cosmology |
Authors | Chandan Singh, Wooseok Ha, Francois Lanusse, Vanessa Boehm, Jia Liu, Bin Yu |
Abstract | Machine learning lies at the heart of new possibilities for scientific discovery, knowledge generation, and artificial intelligence. Its potential benefits to these fields require going beyond predictive accuracy and focusing on interpretability. In particular, many scientific problems require interpretations in a domain-specific interpretable feature space (e.g. the frequency domain) whereas attributions to the raw features (e.g. the pixel space) may be unintelligible or even misleading. To address this challenge, we propose TRIM (TRansformation IMportance), a novel approach which attributes importances to features in a transformed space and can be applied post-hoc to a fully trained model. TRIM is motivated by a cosmological parameter estimation problem using deep neural networks (DNNs) on simulated data, but it is generally applicable across domains/models and can be combined with any local interpretation method. In our cosmology example, combining TRIM with contextual decomposition shows promising results for identifying which frequencies a DNN uses, helping cosmologists to understand and validate that the model learns appropriate physical features rather than simulation artifacts. |
Tasks | |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.01926v1 |
https://arxiv.org/pdf/2003.01926v1.pdf | |
PWC | https://paperswithcode.com/paper/transformation-importance-with-applications |
Repo | https://github.com/csinva/transformation-importance |
Framework | pytorch |
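The core idea described above, attributing importance in a transformed feature space rather than pixel space, can be illustrated with a crude occlusion-style proxy: transform the image with a 2D FFT, zero out one frequency band at a time, invert, and measure how much the model output drops. This is only a hedged sketch of the concept; TRIM itself wraps a local interpretation method such as contextual decomposition, and the band definitions and toy model below are assumptions.

```python
import numpy as np

def frequency_band_importance(model, image, bands):
    """Occlusion-style proxy for attribution in a transformed (frequency) space.

    Not the contextual-decomposition variant used in the paper; this only
    illustrates scoring features of a transform (radial frequency bands of
    a 2D FFT) instead of raw pixels.
    """
    base = model(image)
    freq = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.mgrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2)
    scores = []
    for lo, hi in bands:
        masked = freq.copy()
        masked[(radius >= lo) & (radius < hi)] = 0          # remove one band
        recon = np.real(np.fft.ifft2(np.fft.ifftshift(masked)))
        scores.append(base - model(recon))                  # drop in model output
    return np.array(scores)

# Toy usage: "model" is any callable mapping an image to a scalar.
toy_model = lambda img: float(img.std())
img = np.random.default_rng(0).normal(size=(64, 64))
print(frequency_band_importance(toy_model, img, [(0, 8), (8, 16), (16, 32)]))
```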
Rethinking 1D-CNN for Time Series Classification: A Stronger Baseline
Title | Rethinking 1D-CNN for Time Series Classification: A Stronger Baseline |
Authors | Wensi Tang, Guodong Long, Lu Liu, Tianyi Zhou, Jing Jiang, Michael Blumenstein |
Abstract | For the time series classification task using a 1D-CNN, the selection of kernel size is critically important to ensure the model can capture salient signals at the right scale from a long time series. Most existing work on 1D-CNNs treats the kernel size as a hyper-parameter and tries to find a proper kernel size through a grid search, which is time-consuming and inefficient. This paper theoretically analyses how kernel size impacts the performance of a 1D-CNN. Considering the importance of kernel size, we propose a novel Omni-Scale 1D-CNN (OS-CNN) architecture to capture the proper kernel size during model learning. A specific design for kernel size configuration is developed, which enables us to assemble very few kernel-size options to cover a wide range of receptive fields. The proposed OS-CNN method is evaluated on the UCR archive with 85 datasets. The experimental results demonstrate that our method is a stronger baseline on multiple performance indicators, including the critical difference diagram, counts of wins, and average accuracy. We have also published the experimental source code at GitHub (https://github.com/Wensi-Tang/OS-CNN/). |
Tasks | Time Series, Time Series Classification |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10061v1 |
https://arxiv.org/pdf/2002.10061v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-1d-cnn-for-time-series |
Repo | https://github.com/Wensi-Tang/OS-CNN |
Framework | pytorch |
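A rough PyTorch sketch of assembling a few kernel sizes in parallel within one 1D-CNN block is shown below. The actual OS-CNN derives its kernel-size set from a principled configuration rule so that stacked layers cover all receptive-field sizes; the hard-coded sizes and layer layout here are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class MultiScaleConv1d(nn.Module):
    """Parallel 1D convolutions with several kernel sizes, concatenated.

    A simplified stand-in for the Omni-Scale idea: the real OS-CNN chooses
    its kernel-size set so stacked layers cover all receptive fields; here
    the sizes are simply hard-coded.
    """
    def __init__(self, in_ch, out_ch_per_branch, kernel_sizes=(1, 2, 3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(in_ch, out_ch_per_branch, k, padding=k // 2),
                nn.BatchNorm1d(out_ch_per_branch),
                nn.ReLU(),
            )
            for k in kernel_sizes
        ])

    def forward(self, x):                          # x: (batch, channels, time)
        outs = [branch(x) for branch in self.branches]
        t = min(o.shape[-1] for o in outs)         # even kernels change length by one
        return torch.cat([o[..., :t] for o in outs], dim=1)

x = torch.randn(8, 1, 128)                         # 8 univariate series of length 128
block = MultiScaleConv1d(1, 16)
print(block(x).shape)                              # torch.Size([8, 80, 128])
```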
Pixel-Level Self-Paced Learning for Super-Resolution
Title | Pixel-Level Self-Paced Learning for Super-Resolution |
Authors | Wei Lin, Junyu Gao, Qi Wang, Xuelong Li |
Abstract | Recently, many deep networks have been proposed to improve the quality of predicted super-resolution (SR) images, owing to their widespread use in several image-based fields. However, as these networks are constructed deeper and deeper, they also take much longer to train, which may lead the learners to poor local optima. To tackle this problem, this paper designs a training strategy named Pixel-level Self-Paced Learning (PSPL) to accelerate the convergence of single-image super-resolution (SISR) models. Imitating self-paced learning, PSPL assigns an attention weight to each pixel in the predicted SR image and its corresponding pixel in the ground truth, to guide the model to a better region in parameter space. Extensive experiments show that PSPL speeds up the training of SISR models and prompts several existing models to obtain new, better results. Furthermore, the source code is available at https://github.com/Elin24/PSPL. |
Tasks | Super-Resolution |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03113v2 |
https://arxiv.org/pdf/2003.03113v2.pdf | |
PWC | https://paperswithcode.com/paper/pixel-level-self-paced-learning-for-super |
Repo | https://github.com/Elin24/PSPL |
Framework | pytorch |
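The per-pixel attention weighting described above can be sketched as a weighted L1 loss in which each pixel's weight follows a self-paced schedule. The Gaussian easiness weight and the linear annealing `age` parameter below are assumptions; they illustrate the mechanism rather than the exact PSPL weighting function.

```python
import torch

def pixel_self_paced_l1(sr, hr, age, sigma=0.2):
    """Per-pixel weighted L1 loss as a self-paced-learning sketch.

    Not the exact PSPL formulation: the per-pixel weight is a Gaussian of
    the current residual, so "easy" pixels (small error) dominate early
    training, and the weighting is annealed toward plain L1 as `age`
    grows toward 1.
    """
    residual = (sr - hr).abs().detach()
    easy_weight = torch.exp(-(residual ** 2) / (2 * sigma ** 2))
    weight = (1.0 - age) * easy_weight + age          # anneal to uniform weights
    return (weight * (sr - hr).abs()).mean()

sr = torch.rand(4, 3, 64, 64, requires_grad=True)     # predicted SR patches
hr = torch.rand(4, 3, 64, 64)                          # ground-truth patches
loss = pixel_self_paced_l1(sr, hr, age=0.1)
loss.backward()
print(float(loss))
```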
Towards High Performance Human Keypoint Detection
Title | Towards High Performance Human Keypoint Detection |
Authors | Jing Zhang, Zhe Chen, Dacheng Tao |
Abstract | Human keypoint detection from a single image is very challenging due to occlusion, blur, illumination, and scale variance. In this paper, we address this problem from three aspects: devising an efficient network structure, proposing three effective training strategies, and exploiting four useful postprocessing techniques. First, we find that context information plays an important role in reasoning about human body configuration and invisible keypoints. Inspired by this, we propose a cascaded context mixer (CCM), which efficiently integrates spatial and channel context information and progressively refines them. Second, to maximize CCM's representation capability, we develop a hard-negative person detection mining strategy and a joint-training strategy that exploits abundant unlabeled data. These enable CCM to learn discriminative features from massive, diverse poses. Third, we present several sub-pixel refinement techniques for postprocessing keypoint predictions to improve detection accuracy. Extensive experiments on the MS COCO keypoint detection benchmark demonstrate the superiority of the proposed method over representative state-of-the-art (SOTA) methods. Our single model achieves comparable performance with the winner of the 2018 COCO Keypoint Detection Challenge, and the final ensemble model sets a new SOTA on this benchmark. |
Tasks | Human Detection, Keypoint Detection |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00537v1 |
https://arxiv.org/pdf/2002.00537v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-high-performance-human-keypoint |
Repo | https://github.com/chaimi2013/CCM |
Framework | none |
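One of the "sub-pixel refinement techniques" mentioned in the abstract can be illustrated with the common quarter-offset trick used in many COCO keypoint pipelines: shift the integer argmax of the predicted heatmap toward its larger neighbour on each axis. The paper exploits several such refinements; this sketch covers only the simplest and is not claimed to match their implementation.

```python
import numpy as np

def subpixel_keypoint(heatmap):
    """Quarter-offset sub-pixel refinement of a keypoint heatmap peak.

    Shifts the integer argmax by 0.25 pixel toward the larger neighbouring
    response along each axis, a standard postprocessing trick.
    """
    h, w = heatmap.shape
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    fx, fy = float(x), float(y)
    if 0 < x < w - 1:
        fx += 0.25 * np.sign(heatmap[y, x + 1] - heatmap[y, x - 1])
    if 0 < y < h - 1:
        fy += 0.25 * np.sign(heatmap[y + 1, x] - heatmap[y - 1, x])
    return fx, fy

hm = np.zeros((64, 48))
hm[30, 20], hm[30, 21], hm[29, 20] = 1.0, 0.6, 0.4
print(subpixel_keypoint(hm))    # (20.25, 29.75): nudged toward the larger neighbours
```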
Fiber: A Platform for Efficient Development and Distributed Training for Reinforcement Learning and Population-Based Methods
Title | Fiber: A Platform for Efficient Development and Distributed Training for Reinforcement Learning and Population-Based Methods |
Authors | Jiale Zhi, Rui Wang, Jeff Clune, Kenneth O. Stanley |
Abstract | Recent advances in machine learning are consistently enabled by increasing amounts of computation. Reinforcement learning (RL) and population-based methods in particular pose unique challenges for efficiency and flexibility to the underlying distributed computing frameworks. These challenges include frequent interaction with simulations, the need for dynamic scaling, and the need for a user interface with low adoption cost and consistency across different backends. In this paper we address these challenges while still retaining development efficiency and flexibility for both research and practical applications by introducing Fiber, a scalable distributed computing framework for RL and population-based methods. Fiber aims to significantly expand the accessibility of large-scale parallel computation to users of otherwise complicated RL and population-based approaches without the need for specialized computational expertise. |
Tasks | |
Published | 2020-03-25 |
URL | https://arxiv.org/abs/2003.11164v1 |
https://arxiv.org/pdf/2003.11164v1.pdf | |
PWC | https://paperswithcode.com/paper/fiber-a-platform-for-efficient-development |
Repo | https://github.com/uber/fiber |
Framework | none |
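Fiber is documented as mirroring Python's multiprocessing API, so a usage sketch looks roughly like the snippet below. The `Pool` import and its arguments follow that advertised compatibility and should be checked against the repo's README; the rollout stand-in function is an assumption.

```python
# Usage sketch only: Fiber is described as a drop-in replacement for Python's
# multiprocessing, so this mirrors that API; consult the repo README at
# github.com/uber/fiber for the exact, current interface.
from fiber import Pool

def simulate(seed):
    """Stand-in for an expensive RL rollout or fitness evaluation."""
    import random
    random.seed(seed)
    return sum(random.random() for _ in range(1000))

if __name__ == "__main__":
    pool = Pool(processes=4)            # workers may run as containers/pods
    returns = pool.map(simulate, range(16))
    print(max(returns))
```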
Anatomy-aware 3D Human Pose Estimation in Videos
Title | Anatomy-aware 3D Human Pose Estimation in Videos |
Authors | Tianlang Chen, Chen Fang, Xiaohui Shen, Yiheng Zhu, Zhili Chen, Jiebo Luo |
Abstract | In this work, we propose a new solution for 3D human pose estimation in videos. Instead of directly regressing the 3D joint locations, we draw inspiration from the human skeleton anatomy and decompose the task into bone direction prediction and bone length prediction, from which the 3D joint locations can be completely derived. Our motivation is the fact that the bone lengths of a human skeleton remain consistent across time. This prompts us to develop effective techniques to utilize global information across all the frames in a video for high-accuracy bone length prediction. Moreover, for the bone direction prediction network, we propose a fully-convolutional propagating architecture with long skip connections. Essentially, it predicts the directions of different bones hierarchically without using any time-consuming memory units (e.g. LSTM). A novel joint shift loss is further introduced to bridge the training of the bone length and bone direction prediction networks. Finally, we employ an implicit attention mechanism to feed the 2D keypoint visibility scores into the model as extra guidance, which significantly mitigates the depth ambiguity in many challenging poses. Our full model outperforms the previous best results on the Human3.6M and MPI-INF-3DHP datasets, and a comprehensive evaluation validates the effectiveness of our model. |
Tasks | 3D Human Pose Estimation, Pose Estimation |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10322v3 |
https://arxiv.org/pdf/2002.10322v3.pdf | |
PWC | https://paperswithcode.com/paper/anatomy-aware-3d-human-pose-estimation-in |
Repo | https://github.com/sunnychencool/Anatomy3D |
Framework | pytorch |
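The bone decomposition described in the abstract implies that joint locations can be recovered by walking the kinematic tree with child = parent + length * direction. The sketch below shows just that reconstruction step on a hypothetical five-joint skeleton; the parent table, directions, and lengths are made-up inputs, and the networks that predict these quantities are not modelled.

```python
import numpy as np

# Toy 5-joint skeleton: each bone connects a child joint to its parent.
PARENTS = {1: 0, 2: 1, 3: 0, 4: 3}          # hypothetical tree rooted at joint 0

def joints_from_bones(root, bone_dirs, bone_lens):
    """Recover 3D joint locations from bone directions and lengths.

    Given a root joint, unit bone directions, and per-bone lengths, every
    joint follows by walking the kinematic tree.
    """
    joints = {0: np.asarray(root, dtype=float)}
    for child in sorted(PARENTS):            # parents precede children here
        parent = PARENTS[child]
        d = np.asarray(bone_dirs[child], dtype=float)
        d = d / np.linalg.norm(d)
        joints[child] = joints[parent] + bone_lens[child] * d
    return joints

dirs = {1: [0, 0, 1], 2: [1, 0, 0], 3: [0, 1, 0], 4: [0, 0, -1]}
lens = {1: 0.5, 2: 0.3, 3: 0.4, 4: 0.45}
print(joints_from_bones([0.0, 0.0, 1.0], dirs, lens))
```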
FairRec: Two-Sided Fairness for Personalized Recommendations in Two-Sided Platforms
Title | FairRec: Two-Sided Fairness for Personalized Recommendations in Two-Sided Platforms |
Authors | Gourab K. Patro, Arpita Biswas, Niloy Ganguly, Krishna P. Gummadi, Abhijnan Chakraborty |
Abstract | We investigate the problem of fair recommendation in the context of two-sided online platforms, comprising customers on one side and producers on the other. Traditionally, recommendation services in these platforms have focused on maximizing customer satisfaction by tailoring the results according to the personalized preferences of individual customers. However, our investigation reveals that such customer-centric design may lead to unfair distribution of exposure among the producers, which may adversely impact their well-being. On the other hand, a producer-centric design might become unfair to the customers. Thus, we consider fairness issues that span both customers and producers. Our approach involves a novel mapping of the fair recommendation problem to a constrained version of the problem of fairly allocating indivisible goods. Our proposed FairRec algorithm guarantees at least Maximin Share (MMS) of exposure for most of the producers and Envy-Free up to One item (EF1) fairness for every customer. Extensive evaluations over multiple real-world datasets show the effectiveness of FairRec in ensuring two-sided fairness while incurring a marginal loss in the overall recommendation quality. |
Tasks | |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10764v1 |
https://arxiv.org/pdf/2002.10764v1.pdf | |
PWC | https://paperswithcode.com/paper/fairrec-two-sided-fairness-for-personalized |
Repo | https://github.com/gourabkumarpatro/FairRec_www_2020 |
Framework | none |
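A much-simplified sketch of the two-sided idea is given below: a round-robin phase in which each customer greedily takes their most relevant product that is still below a producer exposure floor, followed by a relevance-only phase that fills the remaining slots. This is not the FairRec algorithm itself, which reduces the problem to fair allocation of indivisible goods with MMS/EF1 guarantees; the floor parameter and the greedy rule are assumptions for illustration.

```python
def greedy_two_sided_rec(relevance, k, exposure_floor):
    """Simplified sketch of two-sided fair recommendation (not FairRec itself).

    Phase 1: round-robin where each customer takes their most relevant
    product still below a producer exposure floor. Phase 2: fill remaining
    slots purely by relevance. `relevance[u][p]` is customer u's score for p.
    """
    customers, products = list(relevance), list(next(iter(relevance.values())))
    exposure = {p: 0 for p in products}
    rec = {u: [] for u in customers}

    # Phase 1: guarantee producers a minimum exposure.
    for _ in range(k):
        for u in customers:
            under = [p for p in products
                     if exposure[p] < exposure_floor and p not in rec[u]]
            if not under:
                continue
            best = max(under, key=lambda p: relevance[u][p])
            rec[u].append(best)
            exposure[best] += 1

    # Phase 2: top up each list to k items by pure relevance.
    for u in customers:
        remaining = sorted((p for p in products if p not in rec[u]),
                           key=lambda p: -relevance[u][p])
        rec[u].extend(remaining[: k - len(rec[u])])
    return rec

rel = {"u1": {"a": 0.9, "b": 0.2, "c": 0.5},
       "u2": {"a": 0.8, "b": 0.7, "c": 0.1},
       "u3": {"a": 0.6, "b": 0.4, "c": 0.3}}
print(greedy_two_sided_rec(rel, k=2, exposure_floor=1))
```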
A hypergeometric test interpretation of a common tf-idf variant
Title | A hypergeometric test interpretation of a common tf-idf variant |
Authors | Paul Sheridan, Mikael Onsjö |
Abstract | Term frequency-inverse document frequency, or tf-idf for short, is a numerical measure that is widely used in information retrieval to quantify the importance of a term of interest in one out of many documents. While tf-idf was originally proposed as a heuristic, much work has been devoted over the years to placing it on a solid theoretical foundation. Following in this tradition, we here advance the first justification for tf-idf that is grounded in statistical hypothesis testing. More precisely, we first show that the hypergeometric test from classical statistics corresponds well with a common tf-idf variant on selected real-data information retrieval tasks. Then we set forth a mathematical argument that suggests the tf-idf variant functions as an approximation to the hypergeometric test (and vice versa). The hypergeometric test interpretation of this common tf-idf variant equips the working statistician with a ready explanation of tf-idf’s long-established effectiveness. |
Tasks | Information Retrieval |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11844v1 |
https://arxiv.org/pdf/2002.11844v1.pdf | |
PWC | https://paperswithcode.com/paper/a-hypergeometric-test-interpretation-of-a |
Repo | https://github.com/paul-sheridan/hgt-tfidf |
Framework | tf |
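The claimed correspondence can be poked at numerically: compute a common tf-idf variant for a term in a document and compare it with the negative log p-value of a hypergeometric tail test on the same counts. The specific variant used here (raw tf times log(N/df)) and the toy counts are assumptions; the paper studies its own chosen variant on real retrieval tasks.

```python
import numpy as np
from scipy.stats import hypergeom

def tfidf_and_hypergeom(tf, doc_len, corpus_tf, corpus_len, n_docs, df):
    """Compare a common tf-idf variant with a hypergeometric tail test.

    tf-idf here is raw term frequency times log(N/df); the test asks how
    surprising `tf` occurrences in a document of `doc_len` tokens are,
    drawing without replacement from a corpus with `corpus_tf` occurrences
    among `corpus_len` tokens.
    """
    tfidf = tf * np.log(n_docs / df)
    # P(X >= tf) for X ~ Hypergeometric(corpus_len, corpus_tf, doc_len)
    p_value = hypergeom.sf(tf - 1, corpus_len, corpus_tf, doc_len)
    return tfidf, -np.log(p_value)

score, neg_log_p = tfidf_and_hypergeom(
    tf=7, doc_len=300, corpus_tf=150, corpus_len=1_000_000, n_docs=5_000, df=90)
print(score, neg_log_p)   # both are large for a term that is rare corpus-wide
```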
NeurIPS 2019 Disentanglement Challenge: Improved Disentanglement through Learned Aggregation of Convolutional Feature Maps
Title | NeurIPS 2019 Disentanglement Challenge: Improved Disentanglement through Learned Aggregation of Convolutional Feature Maps |
Authors | Maximilian Seitzer, Andreas Foltyn, Felix P. Kemeth |
Abstract | This report on our stage 2 submission to the NeurIPS 2019 disentanglement challenge presents a simple image preprocessing method for learning disentangled latent factors. We propose to train a variational autoencoder on regionally aggregated feature maps obtained from networks pretrained on the ImageNet database, utilizing the implicit inductive bias contained in those features for disentanglement. This bias can be further enhanced by explicitly fine-tuning the feature maps on auxiliary tasks useful for the challenge, such as angle and position estimation or color classification. Our approach achieved 2nd place in stage 2 of the challenge. Code is available at https://github.com/mseitzer/neurips2019-disentanglement-challenge. |
Tasks | |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.12356v1 |
https://arxiv.org/pdf/2002.12356v1.pdf | |
PWC | https://paperswithcode.com/paper/neurips-2019-disentanglement-challenge-1 |
Repo | https://github.com/mseitzer/neurips2019-disentanglement-challenge |
Framework | pytorch |
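The preprocessing described above, regionally aggregated feature maps from an ImageNet-pretrained network fed to a VAE, can be sketched as below. The choice of backbone (ResNet-18 conv stages), the cut-off layer, and the pooling grid are assumptions; the submission's exact configuration may differ, and the VAE itself is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

# Sketch of "regionally aggregated feature maps" used as VAE inputs; the exact
# backbone, cut-off layer, and pooling grid in the submission may differ.
resnet = models.resnet18(pretrained=True)
backbone = nn.Sequential(*list(resnet.children())[:-3]).eval()   # conv stages up to layer3

@torch.no_grad()
def aggregated_features(images, grid=2):
    """Pretrained conv features, average-pooled over a coarse spatial grid."""
    fmap = backbone(images)                       # (B, 256, H/16, W/16)
    pooled = F.adaptive_avg_pool2d(fmap, grid)    # (B, 256, grid, grid)
    return pooled.flatten(1)                      # the VAE encoder sees this vector

x = torch.randn(2, 3, 64, 64)                     # challenge-style 64x64 images
print(aggregated_features(x).shape)               # e.g. torch.Size([2, 1024])
```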
Memory-Based Graph Networks
Title | Memory-Based Graph Networks |
Authors | Amir Hosein Khasahmadi, Kaveh Hassani, Parsa Moradi, Leo Lee, Quaid Morris |
Abstract | Graph neural networks (GNNs) are a class of deep models that operate on data with arbitrary topology represented as graphs. We introduce an efficient memory layer for GNNs that can jointly learn node representations and coarsen the graph. We also introduce two new networks based on this layer: memory-based GNN (MemGNN) and graph memory network (GMN), which can learn hierarchical graph representations. The experimental results show that the proposed models achieve state-of-the-art results in eight out of nine graph classification and regression benchmarks. We also show that the learned representations could correspond to chemical features in the molecule data. Code and reference implementations are released at: https://github.com/amirkhas/GraphMemoryNet |
Tasks | Graph Classification |
Published | 2020-02-21 |
URL | https://arxiv.org/abs/2002.09518v1 |
https://arxiv.org/pdf/2002.09518v1.pdf | |
PWC | https://paperswithcode.com/paper/memory-based-graph-networks-1 |
Repo | https://github.com/amirkhas/GraphMemoryNet |
Framework | none |
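A minimal version of a memory layer that jointly computes node-to-key assignments and coarsens the graph might look like the PyTorch sketch below. The real MemGNN/GMN layers use multi-head memories and a Student-t style assignment kernel; the single-head dot-product attention here is an assumption that only conveys the clustering-and-pooling mechanism.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryPoolLayer(nn.Module):
    """Minimal memory-based coarsening layer, loosely following the abstract.

    Nodes are softly assigned to learned memory keys (a clustering), and the
    graph is coarsened by aggregating node features per key.
    """
    def __init__(self, in_dim, out_dim, n_keys):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(n_keys, in_dim))
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x):                                 # x: (n_nodes, in_dim)
        assign = F.softmax(x @ self.keys.t(), dim=-1)     # (n_nodes, n_keys)
        coarsened = assign.t() @ x                        # (n_keys, in_dim)
        return self.proj(coarsened), assign

x = torch.randn(30, 16)                          # 30 node embeddings
layer = MemoryPoolLayer(16, 32, n_keys=5)
pooled, assign = layer(x)
print(pooled.shape, assign.shape)                # (5, 32) and (30, 5)
```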
TEDL: A Text Encryption Method Based on Deep Learning
Title | TEDL: A Text Encryption Method Based on Deep Learning |
Authors | Xiang Li, Peng Wang |
Abstract | Recent years have seen an increasing emphasis on information security, and various encryption methods have been proposed. However, for symmetric encryption, well-known techniques still rely on the key space to guarantee security and suffer from frequent key updating. Aiming to solve these problems, this paper proposes TEDL, a novel text encryption method based on deep learning, where the secret key consists of the hyperparameters of a deep learning model and the core step of encryption is transforming input data into weights trained under those hyperparameters. First, both communication parties establish a word vector table by training a deep learning model according to the specified hyperparameters. Then, a self-updating codebook is constructed from the word vector table using the SHA-256 function and other tricks. When communication starts, encryption and decryption are equivalent to indexing and inverted indexing on the codebook, respectively, thus achieving the transformation between plaintext and ciphertext. Experimental results and the accompanying analyses show that TEDL performs well in terms of security, efficiency, and generality, and requires less frequent key redistribution. In particular, as a supplement to current encryption methods, the time-consuming process of constructing a codebook increases the difficulty of brute-force attacks without degrading communication efficiency. |
Tasks | |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.04038v2 |
https://arxiv.org/pdf/2003.04038v2.pdf | |
PWC | https://paperswithcode.com/paper/tedl-a-text-encryption-method-based-on-deep |
Repo | https://github.com/AmbitionXiang/TEDL |
Framework | none |
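The codebook-indexing step can be illustrated with a toy (and deliberately insecure) sketch: order tokens by a SHA-256 digest of their word vectors, then encrypt by index lookup and decrypt via the inverse table. The real scheme derives the vectors from a model trained with the secret hyperparameters and includes a self-updating mechanism; everything below is an assumption used only to show the indexing idea.

```python
import hashlib

def build_codebook(word_vectors):
    """Toy codebook in the spirit of TEDL (NOT a secure cipher).

    Order tokens by a SHA-256 digest of their vectors, then encrypt by
    replacing each token with its index and decrypt by inverted lookup.
    """
    digest = lambda w: hashlib.sha256(
        (w + ",".join(f"{v:.4f}" for v in word_vectors[w])).encode()).hexdigest()
    ordered = sorted(word_vectors, key=digest)
    encode = {w: i for i, w in enumerate(ordered)}
    decode = {i: w for w, i in encode.items()}
    return encode, decode

vectors = {"hello": [0.12, -0.5], "world": [0.33, 0.9], "secret": [-0.7, 0.1]}
enc, dec = build_codebook(vectors)
ciphertext = [enc[w] for w in "hello secret world".split()]
plaintext = " ".join(dec[i] for i in ciphertext)
print(ciphertext, plaintext)
```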
Equalization Loss for Long-Tailed Object Recognition
Title | Equalization Loss for Long-Tailed Object Recognition |
Authors | Jingru Tan, Changbao Wang, Buyu Li, Quanquan Li, Wanli Ouyang, Changqing Yin, Junjie Yan |
Abstract | Object recognition techniques using convolutional neural networks (CNNs) have achieved great success. However, state-of-the-art object detection methods still perform poorly on large-vocabulary, long-tailed datasets, e.g. LVIS. In this work, we analyze this problem from a novel perspective: each positive sample of one category can be seen as a negative sample for other categories, making the tail categories receive more discouraging gradients. Based on this observation, we propose a simple but effective loss, named equalization loss, to tackle the problem of long-tailed rare categories by simply ignoring those gradients for rare categories. The equalization loss protects the learning of rare categories from being at a disadvantage during the network parameter updating. Thus the model is capable of learning better discriminative features for objects of rare classes. Without any bells and whistles, our method achieves AP gains of 4.1% and 4.8% for the rare and common categories on the challenging LVIS benchmark, compared to the Mask R-CNN baseline. With the utilization of the effective equalization loss, we finally won 1st place in the LVIS Challenge 2019. Code has been made available at: https://github.com/tztztztztz/eql.detectron2 |
Tasks | Object Detection, Object Recognition |
Published | 2020-03-11 |
URL | https://arxiv.org/abs/2003.05176v1 |
https://arxiv.org/pdf/2003.05176v1.pdf | |
PWC | https://paperswithcode.com/paper/equalization-loss-for-long-tailed-object |
Repo | https://github.com/tztztztztz/eql.detectron2 |
Framework | pytorch |
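The loss idea, masking the negative-sample gradient for rare categories, translates into a short weighting rule on a sigmoid cross-entropy loss. The sketch below follows the weight structure described in the abstract but omits background handling and the paper's exact frequency threshold function, so treat the parameter names and defaults as assumptions.

```python
import torch
import torch.nn.functional as F

def equalization_loss(logits, targets, class_freq, lam=1e-3, is_foreground=None):
    """Sketch of the equalization-loss idea for sigmoid classification.

    Simplified from the paper: the usual binary cross-entropy term is kept,
    but the negative-sample term is masked out for rare classes (frequency
    below `lam`) so tail categories stop receiving discouraging gradients
    from other classes' positives. `targets` is one-hot; background handling
    is omitted.
    """
    rare = (class_freq < lam).float()                    # (C,)
    if is_foreground is None:
        is_foreground = torch.ones(logits.size(0))
    # weight is 0 only where: foreground proposal, rare class, not the GT class
    w = 1.0 - is_foreground[:, None] * rare[None, :] * (1.0 - targets)
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    return (w * bce).sum() / targets.size(0)

logits = torch.randn(4, 6)                               # 4 proposals, 6 classes
targets = F.one_hot(torch.tensor([0, 2, 5, 1]), num_classes=6).float()
freq = torch.tensor([0.3, 0.2, 0.25, 0.0005, 0.0004, 0.2445])   # two rare classes
print(float(equalization_loss(logits, targets, freq)))
```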