Paper Group ANR 676
Context-aware Deep Model for Entity Recommendation in Search Engine at Alibaba
Title | Context-aware Deep Model for Entity Recommendation in Search Engine at Alibaba |
Authors | Qianghuai Jia, Ningyu Zhang, Nengwei Hua |
Abstract | Entity recommendation, providing search users with an improved experience via assisting them in finding related entities for a given query, has become an indispensable feature of today’s search engines. Existing studies typically only consider the queries with explicit entities. They usually fail to handle complex queries without entities, such as “what food is good for cold weather”, because their models cannot infer the underlying meaning of the input text. In this work, we believe that contexts convey valuable evidence that could facilitate the semantic modeling of queries, and take them into consideration for entity recommendation. In order to better model the semantics of queries and entities, we learn the representation of queries and entities jointly with attentive deep neural networks. We evaluate our approach using large-scale, real-world search logs from a widely used commercial Chinese search engine. Our system has been deployed in the ShenMa Search Engine and is available in Alibaba’s UC Browser. Results from an online A/B test suggest that the impression efficiency of click-through rate increased by 5.1% and page views increased by 5.5%. |
Tasks | |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.04493v1 |
https://arxiv.org/pdf/1909.04493v1.pdf | |
PWC | https://paperswithcode.com/paper/context-aware-deep-model-for-entity |
Repo | |
Framework | |
Low-Memory Neural Network Training: A Technical Report
Title | Low-Memory Neural Network Training: A Technical Report |
Authors | Nimit Sharad Sohoni, Christopher Richard Aberger, Megan Leszczynski, Jian Zhang, Christopher Ré |
Abstract | Memory is increasingly often the bottleneck when training neural network models. Despite this, techniques to lower the overall memory requirements of training have been less widely studied compared to the extensive literature on reducing the memory requirements of inference. In this paper we study a fundamental question: How much memory is actually needed to train a neural network? To answer this question, we profile the overall memory usage of training on two representative deep learning benchmarks – the WideResNet model for image classification and the DynamicConv Transformer model for machine translation – and comprehensively evaluate four standard techniques for reducing the training memory requirements: (1) imposing sparsity on the model, (2) using low precision, (3) microbatching, and (4) gradient checkpointing. We explore how each of these techniques in isolation affects both the peak memory usage of training and the quality of the end model, and explore the memory, accuracy, and computation tradeoffs incurred when combining these techniques. Using appropriate combinations of these techniques, we show that it is possible to reduce the memory required to train a WideResNet-28-2 on CIFAR-10 by up to 60.7x with a 0.4% loss in accuracy, and reduce the memory required to train a DynamicConv model on IWSLT’14 German to English translation by up to 8.7x with a BLEU score drop of 0.15. |
Tasks | Image Classification, Machine Translation |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.10631v1 |
http://arxiv.org/pdf/1904.10631v1.pdf | |
PWC | https://paperswithcode.com/paper/low-memory-neural-network-training-a |
Repo | |
Framework | |
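Of the four techniques listed in the abstract above, microbatching is the easiest to illustrate in a few lines. The following is a minimal sketch, not the report's implementation: it assumes a toy linear model with a mean-squared-error loss, and the function names `mse_gradient` and `microbatched_gradient` are hypothetical. It shows that processing a minibatch as several smaller microbatches, and accumulating their gradients weighted by size, reproduces the full-batch gradient while only one microbatch needs to be held in memory at a time.

```python
def mse_gradient(weights, batch):
    """Gradient of the mean squared error of a linear model over `batch`.

    `batch` is a list of (features, target) pairs (a toy setup assumed here).
    """
    grad = [0.0] * len(weights)
    for features, target in batch:
        residual = sum(w * x for w, x in zip(weights, features)) - target
        for i, x in enumerate(features):
            grad[i] += 2.0 * residual * x / len(batch)
    return grad


def microbatched_gradient(weights, batch, micro_size):
    """Accumulate per-microbatch gradients, each weighted by its share of the batch."""
    total = [0.0] * len(weights)
    for start in range(0, len(batch), micro_size):
        micro = batch[start:start + micro_size]
        micro_grad = mse_gradient(weights, micro)
        scale = len(micro) / len(batch)  # the last microbatch may be smaller
        total = [t + scale * g for t, g in zip(total, micro_grad)]
    return total


weights = [0.5, -1.0]
batch = [([1.0, 2.0], 1.0), ([0.0, 1.0], -1.0), ([3.0, 1.0], 2.0),
         ([2.0, 2.0], 0.0), ([1.0, -1.0], 1.5)]
full = mse_gradient(weights, batch)
accumulated = microbatched_gradient(weights, batch, micro_size=2)
```

The equivalence holds here because the loss is an average of independent per-example terms. Layers whose output depends on batch statistics, such as batch normalization, would not satisfy this exactly.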
GRAPHENE: A Precise Biomedical Literature Retrieval Engine with Graph Augmented Deep Learning and External Knowledge Empowerment
Title | GRAPHENE: A Precise Biomedical Literature Retrieval Engine with Graph Augmented Deep Learning and External Knowledge Empowerment |
Authors | Sendong Zhao, Chang Su, Andrea Sboner, Fei Wang |
Abstract | Effective biomedical literature retrieval (BLR) plays a central role in precision medicine informatics. In this paper, we propose GRAPHENE, which is a deep learning based framework for precise BLR. GRAPHENE consists of three main modules: 1) graph-augmented document representation learning; 2) query expansion and representation learning; and 3) learning to rank biomedical articles. The graph-augmented document representation learning module constructs a document-concept graph containing biomedical concept nodes and document nodes so that global biomedical concepts from external knowledge sources can be captured; the graph is further connected to a BiLSTM so that both local and global topics can be explored. The query expansion and representation learning module expands the query with abbreviations and different names, and then builds a CNN-based model to convolve the expanded query and obtain a vector representation for each query. Learning to rank minimizes a ranking loss between biomedical articles and the query to learn the retrieval function. Experimental results on applying our system to TREC Precision Medicine track data are provided to demonstrate its effectiveness. |
Tasks | Learning-To-Rank, Representation Learning |
Published | 2019-11-02 |
URL | https://arxiv.org/abs/1911.00760v2 |
https://arxiv.org/pdf/1911.00760v2.pdf | |
PWC | https://paperswithcode.com/paper/graphene-a-precise-biomedical-literature |
Repo | |
Framework | |
VolMap: A Real-time Model for Semantic Segmentation of a LiDAR surrounding view
Title | VolMap: A Real-time Model for Semantic Segmentation of a LiDAR surrounding view |
Authors | Hager Radi, Waleed Ali |
Abstract | This paper introduces VolMap, a real-time approach for the semantic segmentation of a 3D LiDAR surrounding view system in autonomous vehicles. We designed an optimized deep convolutional neural network that can accurately segment the point cloud produced by a 360° LiDAR setup, where the input consists of a volumetric bird's-eye view with LiDAR height layers used as input channels. We further investigated the use of a multi-LiDAR setup and its effect on the performance of the semantic segmentation task. Our evaluations are carried out on a large-scale 3D object detection benchmark containing a LiDAR cocoon setup, along with the KITTI dataset, where the per-point segmentation labels are derived from 3D bounding boxes. We show that VolMap achieves an excellent balance between high accuracy and real-time running on CPU. |
Tasks | 3D Object Detection, Autonomous Vehicles, Object Detection, Semantic Segmentation |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.11873v1 |
https://arxiv.org/pdf/1906.11873v1.pdf | |
PWC | https://paperswithcode.com/paper/volmap-a-real-time-model-for-semantic |
Repo | |
Framework | |
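The VolMap abstract describes its input as a volumetric bird's-eye view in which height layers act as channels. A rough sketch of such an encoding is given below. It is an illustration only: the grid ranges, the cell size, the binary occupancy encoding, and the function name `birds_eye_view` are all assumptions made here, not details taken from the paper.

```python
def birds_eye_view(points, x_range, y_range, z_range, cell, layers):
    """Bin (x, y, z) points into a layers x rows x cols binary occupancy grid.

    Each height layer plays the role of one input channel of a 2D network.
    """
    cols = int((x_range[1] - x_range[0]) / cell)
    rows = int((y_range[1] - y_range[0]) / cell)
    layer_height = (z_range[1] - z_range[0]) / layers
    grid = [[[0] * cols for _ in range(rows)] for _ in range(layers)]
    for x, y, z in points:
        col = int((x - x_range[0]) // cell)
        row = int((y - y_range[0]) // cell)
        layer = int((z - z_range[0]) // layer_height)
        # Points outside the configured volume are dropped.
        if 0 <= col < cols and 0 <= row < rows and 0 <= layer < layers:
            grid[layer][row][col] = 1
    return grid


points = [(-1.5, 0.5, 0.2), (1.2, -1.7, 1.6), (5.0, 0.0, 0.0)]
grid = birds_eye_view(points, x_range=(-2.0, 2.0), y_range=(-2.0, 2.0),
                      z_range=(0.0, 2.0), cell=1.0, layers=2)
```

In this toy call the first two points land in different height layers and the third falls outside the grid, so it is ignored.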
Implicit Regularization of Accelerated Methods in Hilbert Spaces
Title | Implicit Regularization of Accelerated Methods in Hilbert Spaces |
Authors | Nicolò Pagliana, Lorenzo Rosasco |
Abstract | We study learning properties of accelerated gradient descent methods for linear least-squares in Hilbert spaces. We analyze the implicit regularization properties of Nesterov acceleration and a variant of heavy-ball in terms of corresponding learning error bounds. Our results show that acceleration can provide faster bias decay than gradient descent, but also suffers from more unstable behavior. As a result, acceleration cannot in general be expected to improve learning accuracy with respect to gradient descent, but rather to achieve the same accuracy with reduced computations. Our theoretical results are validated by numerical simulations. Our analysis is based on studying suitable polynomials induced by the accelerated dynamics and combining spectral techniques with concentration inequalities. |
Tasks | |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13000v4 |
https://arxiv.org/pdf/1905.13000v4.pdf | |
PWC | https://paperswithcode.com/paper/implicit-regularization-of-accelerated |
Repo | |
Framework | |
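The "faster bias decay" claim in the abstract above can be seen on a toy problem. The sketch below is not the paper's analysis: it assumes a noiseless, diagonal least-squares objective f(w) = 0.5 * sum(eig_i * w_i^2) with minimizer at zero, so each iterate is itself the remaining bias, and it uses the common (k - 1) / (k + 2) momentum schedule for Nesterov's method. The function names are hypothetical.

```python
def gradient_descent(eigs, start, steps, step_size):
    """Plain gradient descent on f(w) = 0.5 * sum(eig_i * w_i**2)."""
    w = list(start)
    for _ in range(steps):
        w = [wi - step_size * lam * wi for wi, lam in zip(w, eigs)]
    return w


def nesterov(eigs, start, steps, step_size):
    """Nesterov's accelerated gradient with momentum (k - 1) / (k + 2)."""
    w = list(start)
    w_prev = list(start)
    for k in range(1, steps + 1):
        beta = (k - 1) / (k + 2)
        # Extrapolate along the previous displacement, then take a gradient step.
        y = [wi + beta * (wi - wp) for wi, wp in zip(w, w_prev)]
        w_prev = w
        w = [yi - step_size * lam * yi for yi, lam in zip(y, eigs)]
    return w


eigs = [1.0, 0.01]      # one well-conditioned and one poorly conditioned direction
start = [1.0, 1.0]
gd_bias = gradient_descent(eigs, start, steps=100, step_size=1.0)
nes_bias = nesterov(eigs, start, steps=100, step_size=1.0)
```

After the same number of iterations, the accelerated iterate has a smaller remaining error along the poorly conditioned direction. The sketch contains no noise, so it says nothing about the instability side of the trade-off that the abstract also reports.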
Location Attention for Extrapolation to Longer Sequences
Title | Location Attention for Extrapolation to Longer Sequences |
Authors | Yann Dubois, Gautier Dagan, Dieuwke Hupkes, Elia Bruni |
Abstract | Neural networks are surprisingly good at interpolating and perform remarkably well when the training set examples resemble those in the test set. However, they are often unable to extrapolate patterns beyond the seen data, even when the abstractions required for such patterns are simple. In this paper, we first review the notion of extrapolation, why it is important and how one could hope to tackle it. We then focus on a specific type of extrapolation which is especially useful for natural language processing: generalization to sequences that are longer than the training ones. We hypothesize that models with separate content- and location-based attention are more likely to extrapolate than those with common attention mechanisms. We empirically support our claim for recurrent seq2seq models with our proposed attention on variants of the Lookup Table task. This sheds light on some striking failures of neural models for sequences and on possible methods for approaching such issues. |
Tasks | |
Published | 2019-11-10 |
URL | https://arxiv.org/abs/1911.03872v1 |
https://arxiv.org/pdf/1911.03872v1.pdf | |
PWC | https://paperswithcode.com/paper/location-attention-for-extrapolation-to |
Repo | |
Framework | |
Semiparametric Wavelet-based JPEG IV Estimator for endogenously truncated data
Title | Semiparametric Wavelet-based JPEG IV Estimator for endogenously truncated data |
Authors | Nir Billfeld, Moshe Kim |
Abstract | A new and enriched JPEG algorithm is provided for identifying redundancies in a sequence of irregular noisy data points, which also accommodates a reference-free criterion function. Our main contribution is formulating analytically (instead of approximating) the inverse of the transpose of the JPEG wavelet transform without involving matrices, which are computationally cumbersome. The algorithm is suitable for the widespread situations where the original data distribution is unobservable, such as cases where there is deficient representation of the entire population in the training data (in machine learning) and thus the covariate shift assumption is violated. The proposed estimator corrects for both biases, the one generated by endogenous truncation and the one generated by endogenous covariates. Results from utilizing 2,000,000 different distribution functions verify the applicability and high accuracy of our procedure in cases in which the disturbances are neither jointly nor marginally normally distributed. |
Tasks | |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.02166v1 |
https://arxiv.org/pdf/1908.02166v1.pdf | |
PWC | https://paperswithcode.com/paper/semiparametric-wavelet-based-jpeg-iv |
Repo | |
Framework | |
A Method of Fluorescent Fibers Detection on Identity Documents under Ultraviolet Light
Title | A Method of Fluorescent Fibers Detection on Identity Documents under Ultraviolet Light |
Authors | Kunina I. A., Aliev M. A., Arlazarov N. V., Polevoy D. V |
Abstract | In this work we consider the problem of detecting fluorescent security fibers in images of identity documents captured under ultraviolet light. As an example we use images of the second and third pages of the Russian passport and show features that render known methods and approaches based on image binarization inapplicable. We propose a solution based on ridge detection in the gray-scale image of the document with a preliminarily normalized background. The algorithm was tested on a private dataset consisting of both authentic and model passports. Abandoning binarization made it possible to achieve reliable and stable functioning of the proposed detector on the target dataset. |
Tasks | |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.01916v1 |
https://arxiv.org/pdf/1912.01916v1.pdf | |
PWC | https://paperswithcode.com/paper/a-method-of-fluorescent-fibers-detection-on |
Repo | |
Framework | |
A* Tree Search for Portfolio Management
Title | A* Tree Search for Portfolio Management |
Authors | Xiaojie Gao, Shikui Tu, Lei Xu |
Abstract | We propose a planning-based method to teach an agent to manage a portfolio from scratch. Our approach combines deep reinforcement learning techniques with search techniques, as in AlphaGo. By uniting the advantages of the A* search algorithm with Monte Carlo tree search, we come up with a new algorithm named A* tree search in which the best information is returned to guide the next search. Also, the expansion mode of the Monte Carlo tree is improved for a higher utilization of the neural network. The suggested algorithm can also optimize a non-differentiable utility function by combinatorial search. This technique is then used in our trading system. The major component is a neural network that is trained on trading experiences from tree search and outputs prior probabilities that in turn guide the search by pruning away branches. Experimental results on simulated and real financial data verify the robustness of the proposed trading system, which produces better strategies than several approaches based on reinforcement learning. |
Tasks | |
Published | 2019-01-07 |
URL | http://arxiv.org/abs/1901.01855v2 |
http://arxiv.org/pdf/1901.01855v2.pdf | |
PWC | https://paperswithcode.com/paper/a-tree-search-for-portfolio-management |
Repo | |
Framework | |
ViWi: A Deep Learning Dataset Framework for Vision-Aided Wireless Communications
Title | ViWi: A Deep Learning Dataset Framework for Vision-Aided Wireless Communications |
Authors | Muhammad Alrabeiah, Andrew Hredzak, Zhenhao Liu, Ahmed Alkhateeb |
Abstract | The growing role that artificial intelligence, and specifically machine learning, is playing in shaping the future of wireless communications has opened up many new and intriguing research directions. This paper motivates research in the novel direction of vision-aided wireless communications, which aims at leveraging visual sensory information in tackling wireless communication problems. As with any new research direction driven by machine learning, obtaining a development dataset poses the first and most important challenge to vision-aided wireless communications. This paper addresses this issue by introducing the Vision-Wireless (ViWi) dataset framework. It is developed to be a parametric, systematic, and scalable data generation framework. It utilizes advanced 3D-modeling and ray-tracing software to generate high-fidelity synthetic wireless and vision data samples for the same scenes. The result is a framework that not only offers a way to generate training and testing datasets but also helps provide a common ground on which the quality of different machine learning-powered solutions can be assessed. |
Tasks | |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.06257v1 |
https://arxiv.org/pdf/1911.06257v1.pdf | |
PWC | https://paperswithcode.com/paper/viwi-a-deep-learning-dataset-framework-for |
Repo | |
Framework | |
Optimizing Collision Avoidance in Dense Airspace using Deep Reinforcement Learning
Title | Optimizing Collision Avoidance in Dense Airspace using Deep Reinforcement Learning |
Authors | Sheng Li, Maxim Egorov, Mykel Kochenderfer |
Abstract | New methodologies will be needed to ensure the airspace remains safe and efficient as traffic densities rise to accommodate new unmanned operations. This paper explores how unmanned free-flight traffic may operate in dense airspace. We develop and analyze autonomous collision avoidance systems for aircraft operating in dense airspace where traditional collision avoidance systems fail. We propose a metric for quantifying the decision burden on a collision avoidance system as well as a metric for measuring the impact of the collision avoidance system on airspace. We use deep reinforcement learning to compute corrections for an existing collision avoidance approach to account for dense airspace. The results show that a corrected collision avoidance system can operate more efficiently than traditional methods in dense airspace while maintaining high levels of safety. |
Tasks | |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.10146v1 |
https://arxiv.org/pdf/1912.10146v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-collision-avoidance-in-dense |
Repo | |
Framework | |
Self-Learned Formula Synthesis in Set Theory
Title | Self-Learned Formula Synthesis in Set Theory |
Authors | Chad E. Brown, Thibault Gauthier |
Abstract | A reinforcement learning algorithm accomplishes the task of synthesizing a set-theoretical formula that evaluates to given truth values for given assignments. |
Tasks | |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01525v1 |
https://arxiv.org/pdf/1912.01525v1.pdf | |
PWC | https://paperswithcode.com/paper/self-learned-formula-synthesis-in-set-theory |
Repo | |
Framework | |
A Reference Vector based Many-Objective Evolutionary Algorithm with Feasibility-aware Adaptation
Title | A Reference Vector based Many-Objective Evolutionary Algorithm with Feasibility-aware Adaptation |
Authors | Mingde Zhao, Hongwei Ge, Kai Zhang, Yaqing Hou |
Abstract | The infeasible parts of the objective space in difficult many-objective optimization problems cause trouble for evolutionary algorithms. This paper proposes a reference vector based algorithm which uses two interacting engines to adapt the reference vectors and to evolve the population towards the true Pareto Front (PF), such that the reference vectors are always evenly distributed within the current PF to provide appropriate guidance for selection. The current PF is tracked by maintaining an archive of undominated individuals, and adaptation of reference vectors is conducted with the help of another archive that contains layers of reference vectors corresponding to different densities. Experimental results show the expected characteristics and competitive performance of the proposed algorithm TEEA. |
Tasks | |
Published | 2019-04-12 |
URL | http://arxiv.org/abs/1904.06302v1 |
http://arxiv.org/pdf/1904.06302v1.pdf | |
PWC | https://paperswithcode.com/paper/a-reference-vector-based-many-objective |
Repo | |
Framework | |
Social Choice Methods for Database Aggregation
Title | Social Choice Methods for Database Aggregation |
Authors | Francesco Belardinelli, Umberto Grandi |
Abstract | Knowledge can be represented compactly in multiple ways, from a set of propositional formulas, to a Kripke model, to a database. In this paper we study the aggregation of information coming from multiple sources, each source submitting a database modelled as a first-order relational structure. In the presence of integrity constraints, we identify classes of aggregators that respect them in the aggregated database, provided these are satisfied in all individual databases. We also characterise languages for first-order queries on which the answer to a query on the aggregated database coincides with the aggregation of the answers to the query obtained on each individual database. This contribution is meant to be a first step toward the application of techniques from social choice theory to knowledge representation in databases. |
Tasks | |
Published | 2019-07-22 |
URL | https://arxiv.org/abs/1907.10492v1 |
https://arxiv.org/pdf/1907.10492v1.pdf | |
PWC | https://paperswithcode.com/paper/social-choice-methods-for-database |
Repo | |
Framework | |
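The question of which aggregators preserve integrity constraints can be made concrete with a small example. The sketch below is an assumption-laden illustration rather than the paper's formal framework: each source database is reduced to a single binary relation stored as a set of tuples, the aggregator is a simple quota rule (`quota_aggregate`, a hypothetical name), and the integrity constraint is a functional dependency from the first column to the second.

```python
from collections import Counter


def quota_aggregate(databases, quota):
    """Keep a tuple when at least `quota` of the source relations contain it.

    quota=1 gives the union and quota=len(databases) gives the intersection.
    """
    counts = Counter(row for db in databases for row in set(db))
    return {row for row, n in counts.items() if n >= quota}


def satisfies_key(relation):
    """Functional dependency check: the first column determines the second."""
    seen = {}
    for key, value in relation:
        if seen.setdefault(key, value) != value:
            return False
    return True


# Three sources; each one satisfies the functional dependency on its own.
databases = [
    {("alice", "rome"), ("bob", "paris")},
    {("alice", "rome"), ("bob", "london")},
    {("alice", "rome"), ("bob", "paris")},
]
union = quota_aggregate(databases, quota=1)
majority = quota_aggregate(databases, quota=2)
intersection = quota_aggregate(databases, quota=3)
```

In this instance the union breaks the constraint because it keeps two conflicting values for `bob`, while the majority and intersection results still satisfy it.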
SIM: A Slot-Independent Neural Model for Dialogue State Tracking
Title | SIM: A Slot-Independent Neural Model for Dialogue State Tracking |
Authors | Chenguang Zhu, Michael Zeng, Xuedong Huang |
Abstract | Dialogue state tracking is an important component in task-oriented dialogue systems to identify users’ goals and requests as a dialogue proceeds. However, as most previous models are dependent on dialogue slots, the model complexity soars when the number of slots increases. In this paper, we put forward a slot-independent neural model (SIM) to track dialogue states while keeping the model complexity invariant to the number of dialogue slots. The model utilizes attention mechanisms between user utterance and system actions. SIM achieves state-of-the-art results on WoZ and DSTC2 tasks, with only 20% of the model size of previous models. |
Tasks | Dialogue State Tracking, Task-Oriented Dialogue Systems |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.11833v1 |
https://arxiv.org/pdf/1909.11833v1.pdf | |
PWC | https://paperswithcode.com/paper/sim-a-slot-independent-neural-model-for |
Repo | |
Framework | |