Paper Group ANR 1059
Stochastic Variance-Reduced Hamilton Monte Carlo Methods
Title | Stochastic Variance-Reduced Hamilton Monte Carlo Methods |
Authors | Difan Zou, Pan Xu, Quanquan Gu |
Abstract | We propose a fast stochastic Hamilton Monte Carlo (HMC) method for sampling from a smooth and strongly log-concave distribution. At the core of our proposed method is a variance reduction technique inspired by recent advances in stochastic optimization. We show that, to achieve $\epsilon$ accuracy in 2-Wasserstein distance, our algorithm achieves $\tilde O\big(n+\kappa^{2}d^{1/2}/\epsilon+\kappa^{4/3}d^{1/3}n^{2/3}/\epsilon^{2/3}\big)$ gradient complexity (i.e., number of component gradient evaluations), which outperforms the state-of-the-art HMC and stochastic gradient HMC methods in a wide regime. We also extend our algorithm to sampling from smooth and general log-concave distributions, and prove the corresponding gradient complexity as well. Experiments on both synthetic and real data demonstrate the superior performance of our algorithm. |
Tasks | Stochastic Optimization |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04791v1, http://arxiv.org/pdf/1802.04791v1.pdf |
PWC | https://paperswithcode.com/paper/stochastic-variance-reduced-hamilton-monte |
Repo | |
Framework | |
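The abstract names a variance reduction technique borrowed from stochastic optimization but does not spell it out. A minimal sketch of one common estimator of that kind (SVRG-style) is shown below; the function names, the toy quadratic components, and the choice of SVRG itself are assumptions for illustration, not the authors' algorithm.

```python
import random

def svrg_gradient(component_grad, x, snapshot, snapshot_full_grad, n, rng=random):
    """SVRG-style estimate: g_i(x) - g_i(snapshot) + full gradient at the snapshot."""
    i = rng.randrange(n)                 # pick one component uniformly at random
    g_x = component_grad(i, x)           # component gradient at the current point
    g_s = component_grad(i, snapshot)    # same component at the stored snapshot
    return [a - b + mu for a, b, mu in zip(g_x, g_s, snapshot_full_grad)]

# Toy components (hypothetical): f_i(x) = 0.5 * (x - a_i)^2 in one dimension.
data = [1.0, 2.0, 6.0]

def component_grad(i, x):
    return [x[0] - data[i]]

def full_grad(x):
    n = len(data)
    return [sum(component_grad(i, x)[0] for i in range(n)) / n]

snapshot = [4.0]
snapshot_full_grad = full_grad(snapshot)   # computed once per snapshot
estimate = svrg_gradient(component_grad, [5.0], snapshot, snapshot_full_grad,
                         len(data), random.Random(0))
```

For these toy components every choice of `i` yields the exact full gradient, which illustrates why the correction term reduces variance; in general the estimate is only unbiased.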
Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition
Title | Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition |
Authors | Zhenghui Wang, Yanru Qu, Liheng Chen, Jian Shen, Weinan Zhang, Shaodian Zhang, Yimei Gao, Gen Gu, Ken Chen, Yong Yu |
Abstract | We study the problem of named entity recognition (NER) from electronic medical records, which is one of the most fundamental and critical problems for medical text mining. Medical records written by clinicians from different specialties usually contain quite different terminologies and writing styles. The difference between specialties and the cost of human annotation make it particularly difficult to train a universal medical NER system. In this paper, we propose a label-aware double transfer learning framework (La-DTL) for cross-specialty NER, so that a medical NER system designed for one specialty can be conveniently applied to another with minimal annotation effort. The transferability is guaranteed by two components: (i) we propose label-aware MMD for feature representation transfer, and (ii) we perform parameter transfer with a theoretical upper bound which is also label-aware. We conduct extensive experiments on 12 cross-specialty NER tasks. The experimental results demonstrate that La-DTL provides consistent accuracy improvement over strong baselines. In addition, the promising experimental results on non-medical NER scenarios indicate that La-DTL has the potential to be seamlessly adapted to a wide range of NER tasks. |
Tasks | Medical Named Entity Recognition, Named Entity Recognition, Transfer Learning |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.09021v2, http://arxiv.org/pdf/1804.09021v2.pdf |
PWC | https://paperswithcode.com/paper/label-aware-double-transfer-learning-for |
Repo | |
Framework | |
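The abstract mentions a label-aware MMD (maximum mean discrepancy) for feature transfer without giving its form. The sketch below shows a standard biased squared-MMD estimate with an RBF kernel, plus one plausible label-aware variant that sums the estimate over labels shared by source and target. The kernel choice, the `gamma` parameter, and the per-label summation are assumptions, not the paper's definition.

```python
import math

def rbf(u, v, gamma=1.0):
    """RBF kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between two samples."""
    kxx = sum(rbf(a, b, gamma) for a in xs for b in xs) / (len(xs) * len(xs))
    kyy = sum(rbf(a, b, gamma) for a in ys for b in ys) / (len(ys) * len(ys))
    kxy = sum(rbf(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy

def label_aware_mmd2(source, target, gamma=1.0):
    """Hypothetical variant: sum MMD^2 over labels present in both domains.

    source and target are lists of (feature_vector, label) pairs.
    """
    shared = {l for _, l in source} & {l for _, l in target}
    total = 0.0
    for label in shared:
        xs = [f for f, l in source if l == label]
        ys = [f for f, l in target if l == label]
        total += mmd2(xs, ys, gamma)
    return total
```

Matching feature distributions label by label, rather than over the pooled data, keeps the alignment from mixing representations of different entity types.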
Comparison-Based Random Forests
Title | Comparison-Based Random Forests |
Authors | Siavash Haghiri, Damien Garreau, Ulrike von Luxburg |
Abstract | Assume we are given a set of items from a general metric space, but we neither have access to the representation of the data nor to the distances between data points. Instead, suppose that we can actively choose a triplet of items (A,B,C) and ask an oracle whether item A is closer to item B or to item C. In this paper, we propose a novel random forest algorithm for regression and classification that relies only on such triplet comparisons. In the theory part of this paper, we establish sufficient conditions for the consistency of such a forest. In a set of comprehensive experiments, we then demonstrate that the proposed random forest is efficient both for classification and regression. In particular, it is even competitive with other methods that have direct access to the metric representation of the data. |
Tasks | |
Published | 2018-06-18 |
URL | http://arxiv.org/abs/1806.06616v1, http://arxiv.org/pdf/1806.06616v1.pdf |
PWC | https://paperswithcode.com/paper/comparison-based-random-forests |
Repo | |
Framework | |
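The triplet queries described in the abstract can be illustrated with a small sketch. Here the oracle is simulated from hidden points and a distance function, and a node is split by asking, for each item, which of two pivot items it is closer to. The pivot-based split and all names are assumptions for illustration; the abstract does not specify how the trees are built.

```python
def make_oracle(points, dist):
    """Simulated oracle: answers whether item a is closer to item b than to item c.

    The learner only sees the boolean answer, never the points or distances.
    """
    def closer_to_b(a, b, c):
        return dist(points[a], points[b]) <= dist(points[a], points[c])
    return closer_to_b

def split_by_pivots(items, pivot_b, pivot_c, oracle):
    """Partition item indices by which pivot each item is closer to."""
    left = [i for i in items if oracle(i, pivot_b, pivot_c)]
    right = [i for i in items if not oracle(i, pivot_b, pivot_c)]
    return left, right

# Hidden 1-D points (hypothetical data) used only inside the oracle.
hidden = [0.0, 1.0, 2.0, 9.0, 10.0]
oracle = make_oracle(hidden, lambda u, v: abs(u - v))
left, right = split_by_pivots(range(len(hidden)), 0, 4, oracle)
```

Applying such splits recursively yields a tree that uses comparisons only, and a forest averages many trees built from randomly chosen pivots.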
A fast quasi-Newton-type method for large-scale stochastic optimisation
Title | A fast quasi-Newton-type method for large-scale stochastic optimisation |
Authors | Adrian Wills, Carl Jidling, Thomas Schon |
Abstract | During recent years there has been an increased interest in stochastic adaptations of limited memory quasi-Newton methods, which compared to pure gradient-based routines can improve the convergence by incorporating second order information. In this work we propose a direct least-squares approach conceptually similar to the limited memory quasi-Newton methods, but that computes the search direction in a slightly different way. This is achieved in a fast and numerically robust manner by maintaining a Cholesky factor of low dimension. This is combined with a stochastic line search relying upon fulfilment of the Wolfe condition in a backtracking manner, where the step length is adaptively modified with respect to the optimisation progress. We support our new algorithm by providing several theoretical results guaranteeing its performance. The performance is demonstrated on real-world benchmark problems which shows improved results in comparison with already established methods. |
Tasks | |
Published | 2018-09-29 |
URL | http://arxiv.org/abs/1810.01269v1, http://arxiv.org/pdf/1810.01269v1.pdf |
PWC | https://paperswithcode.com/paper/a-fast-quasi-newton-type-method-for-large |
Repo | |
Framework | |
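The backtracking line search mentioned in the abstract can be sketched in its simplest deterministic form: shrink the step until the sufficient-decrease (first Wolfe, or Armijo) condition holds. The constants `c1` and `shrink`, the function names, and the deterministic setting are assumptions; the paper's search is stochastic and adapts the step length to the optimisation progress.

```python
def backtracking_step(f, grad, x, direction, step=1.0, c1=1e-4, shrink=0.5, max_iter=50):
    """Return (step, new_x) satisfying f(new_x) <= f(x) + c1 * step * grad(x).direction."""
    fx = f(x)
    slope = sum(g * d for g, d in zip(grad(x), direction))  # directional derivative
    for _ in range(max_iter):
        candidate = [xi + step * di for xi, di in zip(x, direction)]
        if f(candidate) <= fx + c1 * step * slope:
            return step, candidate
        step *= shrink  # step too long: back off and try again
    return 0.0, list(x)  # no acceptable step found; stay put

# Toy objective (hypothetical): f(x) = x^2, searched along the negative gradient.
f = lambda x: x[0] ** 2
grad = lambda x: [2.0 * x[0]]
step, new_x = backtracking_step(f, grad, [1.0], [-2.0])
```

In a quasi-Newton setting the `direction` argument would be the computed search direction rather than the raw negative gradient.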
Future-Prediction-Based Model for Neural Machine Translation
Title | Future-Prediction-Based Model for Neural Machine Translation |
Authors | Bingzhen Wei, Junyang Lin |
Abstract | We propose a novel model for Neural Machine Translation (NMT). Different from the conventional method, our model can predict the future text length and words at each decoding time step so that the generation can be helped with the information from the future prediction. With such information, the model does not stop generation without having translated enough content. Experimental results demonstrate that our model can significantly outperform the baseline models. Besides, our analysis reflects that our model is effective in the prediction of the length and words of the untranslated content. |
Tasks | Future prediction, Machine Translation |
Published | 2018-09-02 |
URL | http://arxiv.org/abs/1809.00336v1, http://arxiv.org/pdf/1809.00336v1.pdf |
PWC | https://paperswithcode.com/paper/future-prediction-based-model-for-neural |
Repo | |
Framework | |
Improving Massive MIMO Belief Propagation Detector with Deep Neural Network
Title | Improving Massive MIMO Belief Propagation Detector with Deep Neural Network |
Authors | Xiaosi Tan, Weihong Xu, Yair Be’ery, Zaichen Zhang, Xiaohu You, Chuan Zhang |
Abstract | In this paper, a deep neural network (DNN) is utilized to improve belief propagation (BP) detection for massive multiple-input multiple-output (MIMO) systems. A neural network architecture suitable for the detection task is first introduced by unfolding BP algorithms. DNN MIMO detectors are then proposed based on two modified BP detectors, damped BP and max-sum BP. The correction factors in these algorithms are optimized through deep learning techniques, aiming at improved detection performance. Numerical results are presented to demonstrate the performance of the DNN detectors in comparison with various BP modifications. The neural network is trained once and can be used for multiple online detections. The results show that, compared to other state-of-the-art detectors, the DNN detectors can achieve lower bit error rate (BER) with improved robustness against various antenna configurations and channel conditions at the same level of complexity. |
Tasks | |
Published | 2018-04-02 |
URL | http://arxiv.org/abs/1804.01002v2, http://arxiv.org/pdf/1804.01002v2.pdf |
PWC | https://paperswithcode.com/paper/improving-massive-mimo-belief-propagation |
Repo | |
Framework | |
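Damped BP, one of the two detectors named in the abstract, typically blends each newly computed message with the message from the previous iteration. A minimal sketch of that blending step follows; the convention that `delta` weights the old message, and the idea of one factor per unfolded layer, are assumptions rather than details taken from the paper.

```python
def damped_update(old_messages, new_messages, delta):
    """Blend messages: (1 - delta) * new + delta * old, element by element."""
    return [(1.0 - delta) * new + delta * old
            for old, new in zip(old_messages, new_messages)]

# Hypothetical per-iteration damping factors; in an unfolded network each
# iteration becomes a layer and these values would be trained, not hand-set.
layer_deltas = [0.5, 0.25]
messages = [0.0, 4.0]
for delta in layer_deltas:
    proposed = [2.0, 0.0]   # stand-in for the messages BP would compute here
    messages = damped_update(messages, proposed, delta)
```

Treating the damping factors as learnable parameters is what lets training tune the correction factors instead of fixing them by hand.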
Characters Detection on Namecard with faster RCNN
Title | Characters Detection on Namecard with faster RCNN |
Authors | Weitong Zhang |
Abstract | We apply Faster R-CNN to the detection of characters on namecards. To address the small amount of data and the imbalance between classes, we designed data augmentation and a ‘fake’ data generalizer to generate more training data for the network. Without data augmentation, the average IoU on correct samples is no less than 80%, and an mAP of 80% is also achieved with Faster R-CNN. With data augmentation, the variance of the mAP decreases and both the IoU and mAP scores increase slightly. |
Tasks | Data Augmentation |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10417v1, http://arxiv.org/pdf/1807.10417v1.pdf |
PWC | https://paperswithcode.com/paper/characters-detection-on-namecard-with-faster |
Repo | |
Framework | |
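The IoU (intersection over union) figure quoted in the abstract measures how well a predicted box overlaps a ground-truth box. A minimal sketch of the standard computation for axis-aligned boxes is given below; the `(x1, y1, x2, y2)` corner format is an assumption about the box representation.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

A detection is usually counted as correct when its IoU with a ground-truth box exceeds a chosen threshold, and mAP is computed from those matches.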
An Interpretable Model for Scene Graph Generation
Title | An Interpretable Model for Scene Graph Generation |
Authors | Ji Zhang, Kevin Shih, Andrew Tao, Bryan Catanzaro, Ahmed Elgammal |
Abstract | We propose an efficient and interpretable scene graph generator. We consider three types of features: visual, spatial and semantic, and we use a late fusion strategy such that each feature’s contribution can be explicitly investigated. We study the key factors about these features that have the most impact on the performance, and also visualize the learned visual features for relationships and investigate the efficacy of our model. We won first place in the OpenImages Visual Relationship Detection Challenge on Kaggle, where we outperformed the second-place entry by 5% (20% relative). We believe an accurate scene graph generator is a fundamental stepping stone for higher-level vision-language tasks such as image captioning and visual QA, since it provides a semantic, structured comprehension of an image that is beyond pixels and objects. |
Tasks | Graph Generation, Image Captioning, Scene Graph Generation |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.09543v1, http://arxiv.org/pdf/1811.09543v1.pdf |
PWC | https://paperswithcode.com/paper/an-interpretable-model-for-scene-graph |
Repo | |
Framework | |
Learning an internal representation of the end-effector configuration space
Title | Learning an internal representation of the end-effector configuration space |
Authors | Alban Laflaquière, Alexander V. Terekhov, Bruno Gas, J. Kevin O’Regan |
Abstract | Current machine learning techniques proposed to automatically discover a robot’s kinematics usually rely on a priori information about the robot’s structure, sensor properties or end-effector position. This paper proposes a method to estimate a certain aspect of the forward kinematics model with no such information. An internal representation of the end-effector configuration is generated from an unstructured proprioceptive and exteroceptive data flow under very limited assumptions. A mapping from the proprioceptive space to this representational space can then be used to control the robot. |
Tasks | |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01866v1, http://arxiv.org/pdf/1810.01866v1.pdf |
PWC | https://paperswithcode.com/paper/learning-an-internal-representation-of-the |
Repo | |
Framework | |
Aesthetic Features for Personalized Photo Recommendation
Title | Aesthetic Features for Personalized Photo Recommendation |
Authors | Yu Qing Zhou, Ga Wu, Scott Sanner, Putra Manggala |
Abstract | Many photography websites such as Flickr, 500px, Unsplash, and Adobe Behance are used by amateur and professional photography enthusiasts. Unlike content-based image search, such users of photography websites are not just looking for photos with certain content, but more generally for photos with a certain photographic “aesthetic”. In this context, we explore personalized photo recommendation and propose two aesthetic feature extraction methods based on (i) color space and (ii) deep style transfer embeddings. Using a dataset from 500px, we evaluate how these features can be best leveraged by collaborative filtering methods and show that (ii) provides a significant boost in photo recommendation performance. |
Tasks | Image Retrieval, Style Transfer |
Published | 2018-08-31 |
URL | http://arxiv.org/abs/1809.00060v1, http://arxiv.org/pdf/1809.00060v1.pdf |
PWC | https://paperswithcode.com/paper/aesthetic-features-for-personalized-photo |
Repo | |
Framework | |
Deep Reinforcement Learning based Optimal Control of Hot Water Systems
Title | Deep Reinforcement Learning based Optimal Control of Hot Water Systems |
Authors | Hussain Kazmi, Fahad Mehmood, Stefan Lodeweyckx, Johan Driesen |
Abstract | Energy consumption for hot water production is a major draw in high efficiency buildings. Optimizing this has typically been approached from a thermodynamics perspective, decoupled from occupant influence. Furthermore, optimization usually presupposes the existence of a detailed dynamics model for the hot water system. These assumptions lead to suboptimal energy efficiency in the real world. In this paper, we present a novel reinforcement learning based methodology which optimizes hot water production. The proposed methodology is completely generalizable, and does not require an offline step or human domain knowledge to build a model for the hot water vessel or the heating element. Occupant preferences too are learnt on the fly. The proposed system is applied to a set of 32 houses in the Netherlands where it reduces energy consumption for hot water production by roughly 20% with no loss of occupant comfort. Extrapolating, this translates to absolute savings of roughly 200 kWh for a single household on an annual basis. This performance can be replicated for any domestic hot water system and optimization objective, given that the fairly minimal requirements on sensor data are met. With millions of hot water systems operational worldwide, the proposed framework has the potential to reduce energy consumption in existing and new systems on a multi-gigawatt-hour scale in the years to come. |
Tasks | |
Published | 2018-01-04 |
URL | http://arxiv.org/abs/1801.01467v1, http://arxiv.org/pdf/1801.01467v1.pdf |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-based-optimal |
Repo | |
Framework | |
DeepDownscale: a Deep Learning Strategy for High-Resolution Weather Forecast
Title | DeepDownscale: a Deep Learning Strategy for High-Resolution Weather Forecast |
Authors | Eduardo R. Rodrigues, Igor Oliveira, Renato L. F. Cunha, Marco A. S. Netto |
Abstract | Running high-resolution physical models is computationally expensive and essential for many disciplines. Agriculture, transportation, and energy are sectors that depend on high-resolution weather models, which typically consume many hours of large High Performance Computing (HPC) systems to deliver timely results. Many users cannot afford to run the desired resolution and are forced to use low resolution output. One simple solution is to interpolate results for visualization. It is also possible to combine an ensemble of low resolution models to obtain a better prediction. However, these approaches fail to capture the redundant information and patterns in the low-resolution input that could help improve the quality of prediction. In this paper, we propose and evaluate a strategy based on a deep neural network to learn a high-resolution representation from low-resolution predictions using weather forecast as a practical use case. We take a supervised learning approach, since obtaining labeled data can be done automatically. Our results show significant improvement when compared with standard practices and the strategy is still lightweight enough to run on modest computer systems. |
Tasks | |
Published | 2018-08-15 |
URL | http://arxiv.org/abs/1808.05264v1, http://arxiv.org/pdf/1808.05264v1.pdf |
PWC | https://paperswithcode.com/paper/deepdownscale-a-deep-learning-strategy-for |
Repo | |
Framework | |
Applying the Closed World Assumption to SUMO-based FOL Ontologies for Effective Commonsense Reasoning
Title | Applying the Closed World Assumption to SUMO-based FOL Ontologies for Effective Commonsense Reasoning |
Authors | Javier Álvez, Itziar Gonzalez-Dios, German Rigau |
Abstract | Most commonly, the Open World Assumption is adopted as a standard strategy for the design, construction and use of ontologies. This strategy limits the inferencing capabilities of any system because non-asserted statements (missing knowledge) could be assumed to be either true or false. As we will demonstrate, this is especially the case for first-order logic (FOL) ontologies, where non-asserted statements are nowadays one of the main obstacles to their practical application in automated commonsense reasoning tasks. In this paper, we investigate the application of the Closed World Assumption (CWA) to enable a better exploitation of FOL ontologies by using state-of-the-art automated theorem provers. To that end, we explore different CWA formulations for the structural knowledge encoded in a FOL translation of the SUMO ontology, discovering that almost 30% of the structural knowledge is missing. We evaluate these formulations in a practical experiment using a very large commonsense benchmark obtained from WordNet through its mapping to SUMO. The results show that the competency of the ontology improves by more than 50% when reasoning under the CWA. Thus, applying the CWA automatically to FOL ontologies reduces their ambiguity, and more commonsense questions can be answered. |
Tasks | |
Published | 2018-08-14 |
URL | https://arxiv.org/abs/1808.04620v2, https://arxiv.org/pdf/1808.04620v2.pdf |
PWC | https://paperswithcode.com/paper/applying-the-closed-world-assumption-to-sumo |
Repo | |
Framework | |
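The difference between the two assumptions in the abstract can be shown with a toy fact base: under the Open World Assumption a non-asserted statement is unknown, while under the Closed World Assumption it is taken to be false. The tuple encoding of facts and the example predicates are hypothetical and far simpler than the FOL formulations the paper studies.

```python
def holds(fact, asserted, closed_world):
    """Truth value of a fact: True, False, or None (unknown)."""
    if fact in asserted:
        return True
    # Non-asserted: false under the CWA, undetermined under the OWA.
    return False if closed_world else None

# Toy structural knowledge (hypothetical), encoded as (predicate, arg1, arg2).
asserted = {
    ("subclass", "Dog", "Mammal"),
    ("subclass", "Mammal", "Animal"),
}

query = ("subclass", "Dog", "Plant")
open_answer = holds(query, asserted, closed_world=False)
closed_answer = holds(query, asserted, closed_world=True)
```

With the closed-world reading, a reasoner can answer negative questions such as the query above, which is the kind of gain in competency the abstract reports.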
A Disease Diagnosis and Treatment Recommendation System Based on Big Data Mining and Cloud Computing
Title | A Disease Diagnosis and Treatment Recommendation System Based on Big Data Mining and Cloud Computing |
Authors | Jianguo Chen, Kenli Li, Huigui Rong, Kashif Bilal, Nan Yang, Keqin Li |
Abstract | It is crucial to provide compatible treatment schemes for a disease according to various symptoms at different stages. However, most classification methods might be ineffective in accurately classifying a disease that holds the characteristics of multiple treatment stages, various symptoms, and multi-pathogenesis. Moreover, there are limited exchanges and cooperative actions in disease diagnoses and treatments between different departments and hospitals. Thus, when new diseases occur with atypical symptoms, inexperienced doctors might have difficulty in identifying them promptly and accurately. Therefore, to maximize the utilization of the advanced medical technology of developed hospitals and the rich medical knowledge of experienced doctors, a Disease Diagnosis and Treatment Recommendation System (DDTRS) is proposed in this paper. First, to effectively identify disease symptoms more accurately, a Density-Peaked Clustering Analysis (DPCA) algorithm is introduced for disease-symptom clustering. In addition, association analyses on Disease-Diagnosis (D-D) rules and Disease-Treatment (D-T) rules are conducted by the Apriori algorithm separately. The appropriate diagnosis and treatment schemes are recommended for patients and inexperienced doctors, even if they are in a limited therapeutic environment. Moreover, to reach the goals of high performance and low latency response, we implement a parallel solution for DDTRS using the Apache Spark cloud platform. Extensive experimental results demonstrate that the proposed DDTRS realizes disease-symptom clustering effectively and derives disease treatment recommendations intelligently and accurately. |
Tasks | |
Published | 2018-10-17 |
URL | http://arxiv.org/abs/1810.07762v1, http://arxiv.org/pdf/1810.07762v1.pdf |
PWC | https://paperswithcode.com/paper/a-disease-diagnosis-and-treatment |
Repo | |
Framework | |
Population-aware Hierarchical Bayesian Domain Adaptation
Title | Population-aware Hierarchical Bayesian Domain Adaptation |
Authors | Vishwali Mhasawade, Nabeel Abdur Rehman, Rumi Chunara |
Abstract | Population attributes are essential in health, both for understanding who the data represents and for precision medicine efforts. Even within disease infection labels, patients can exhibit significant variability; “fever” may mean something different when reported in a doctor’s office versus from an online app, precluding directly learning across different datasets for the same prediction task. This problem falls into the domain adaptation paradigm. However, research in this area has to date not considered who generates the data; symptoms reported by a woman versus a man, for example, could also have different implications. We propose a novel population-aware domain adaptation approach by formulating the domain adaptation task as a multi-source hierarchical Bayesian framework. The model improves prediction in the case of largely unlabelled target data by harnessing both domain- and population-invariant information. |
Tasks | Domain Adaptation |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08579v1, http://arxiv.org/pdf/1811.08579v1.pdf |
PWC | https://paperswithcode.com/paper/population-aware-hierarchical-bayesian-domain |
Repo | |
Framework | |