Paper Group ANR 1089
Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path
Title | Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path |
Authors | Jiacheng Chen, Chen Liu, Jiaye Wu, Yasutaka Furukawa |
Abstract | This paper proposes a new approach for automated floorplan reconstruction from RGBD scans, a major milestone in indoor mapping research. The approach, dubbed Floor-SP, formulates a novel optimization problem, where room-wise coordinate descent sequentially solves dynamic programming to optimize the floorplan graph structure. The objective function consists of data terms guided by deep neural networks, consistency terms encouraging adjacent rooms to share corners and walls, and the model complexity term. The approach does not require corner/edge detection with thresholds, unlike most other methods. We have evaluated our system on production-quality RGBD scans of 527 apartments or houses, including many units with non-Manhattan structures. Qualitative and quantitative evaluations demonstrate a significant performance boost over the current state-of-the-art. Please refer to our project website http://jcchen.me/floor-sp/ for code and data. |
Tasks | Edge Detection |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.06702v1 |
https://arxiv.org/pdf/1908.06702v1.pdf | |
PWC | https://paperswithcode.com/paper/floor-sp-inverse-cad-for-floorplans-by |
Repo | |
Framework | |
Learning Modular Safe Policies in the Bandit Setting with Application to Adaptive Clinical Trials
Title | Learning Modular Safe Policies in the Bandit Setting with Application to Adaptive Clinical Trials |
Authors | Hossein Aboutalebi, Doina Precup, Tibor Schuster |
Abstract | The stochastic multi-armed bandit problem is a well-known model for studying the exploration-exploitation trade-off. It has significant possible applications in adaptive clinical trials, which allow for dynamic changes in the treatment allocation probabilities of patients. However, most bandit learning algorithms are designed with the goal of minimizing the expected regret. While this approach is useful in many areas, in clinical trials, it can be sensitive to outlier data, especially when the sample size is small. In this paper, we define and study a new robustness criterion for bandit problems. Specifically, we consider optimizing a function of the distribution of returns as a regret measure. This provides practitioners more flexibility to define an appropriate regret measure. The learning algorithm we propose to solve this type of problem is a modification of the BESA algorithm [Baransi et al., 2014], which considers a more general version of regret. We present a regret bound for our approach and evaluate it empirically both on synthetic problems and on a dataset from the clinical trial literature. Our approach compares favorably to a suite of standard bandit algorithms. |
Tasks | |
Published | 2019-03-04 |
URL | https://arxiv.org/abs/1903.01026v3 |
https://arxiv.org/pdf/1903.01026v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-modular-safe-policies-in-the-bandit |
Repo | |
Framework | |
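The abstract above builds on BESA (Best Empirical Sampled Average). As a rough illustration only, the following sketches the baseline two-armed BESA comparison on plain empirical means, not the modified algorithm or the generalized regret measure proposed in the paper. The function names, the Bernoulli reward model and the tie-breaking rule are assumptions made for this sketch.

```python
import random


def besa_choose(rewards_a, rewards_b, rng):
    """Return 0 or 1: the arm chosen by a two-armed BESA-style comparison."""
    # Play each arm once before any comparison is possible.
    if not rewards_a:
        return 0
    if not rewards_b:
        return 1
    n = min(len(rewards_a), len(rewards_b))
    # Subsample without replacement so both means use n observations.
    # For the shorter history this simply returns all of its rewards.
    sub_a = rng.sample(rewards_a, n)
    sub_b = rng.sample(rewards_b, n)
    mean_a = sum(sub_a) / n
    mean_b = sum(sub_b) / n
    if mean_a == mean_b:
        # Assumed tie-break: favour the arm that has been played less.
        return 0 if len(rewards_a) <= len(rewards_b) else 1
    return 0 if mean_a > mean_b else 1


def run_bandit(probs, horizon, seed=0):
    """Simulate two Bernoulli arms (hypothetical reward model); return pull counts."""
    rng = random.Random(seed)
    history = [[], []]
    for _ in range(horizon):
        arm = besa_choose(history[0], history[1], rng)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        history[arm].append(reward)
    return [len(h) for h in history]


if __name__ == "__main__":
    print(run_bandit([0.2, 0.8], horizon=500))
```

Because the comparison is made on equally sized subsamples, an arm with a long history cannot win merely by having a tighter estimate. A different criterion could in principle be substituted for the empirical mean, but how the paper does this is not shown here.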
Exploiting Sentential Context for Neural Machine Translation
Title | Exploiting Sentential Context for Neural Machine Translation |
Authors | Xing Wang, Zhaopeng Tu, Longyue Wang, Shuming Shi |
Abstract | In this work, we present novel approaches to exploit sentential context for neural machine translation (NMT). Specifically, we first show that a shallow sentential context, extracted from the top encoder layer only, can improve translation performance via contextualizing the encoding representations of individual words. Next, we introduce a deep sentential context, which aggregates the sentential context representations from all the internal layers of the encoder to form a more comprehensive context representation. Experimental results on the WMT14 English-to-German and English-to-French benchmarks show that our model consistently improves performance over the strong TRANSFORMER model (Vaswani et al., 2017), demonstrating the necessity and effectiveness of exploiting sentential context for NMT. |
Tasks | Machine Translation |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01268v1 |
https://arxiv.org/pdf/1906.01268v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-sentential-context-for-neural |
Repo | |
Framework | |
A unified view on differential privacy and robustness to adversarial examples
Title | A unified view on differential privacy and robustness to adversarial examples |
Authors | Rafael Pinot, Florian Yger, Cédric Gouy-Pailler, Jamal Atif |
Abstract | This short note highlights some links between two lines of research within the emerging topic of trustworthy machine learning: differential privacy and robustness to adversarial examples. By abstracting the definitions of both notions, we show that they build upon the same theoretical ground and hence results obtained so far in one domain can be transferred to the other. More precisely, our analysis is based on two key elements: probabilistic mappings (also called randomized algorithms in the differential privacy community), and the Renyi divergence, which subsumes a large family of divergences. We first generalize the definition of robustness against adversarial examples to encompass probabilistic mappings. Then we observe that Renyi differential privacy (a generalization of differential privacy recently proposed by Mironov, 2017) and our definition of robustness share several similarities. We finally discuss how both communities can benefit from this connection to transfer technical tools from one research field to the other. |
Tasks | |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.07982v1 |
https://arxiv.org/pdf/1906.07982v1.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-view-on-differential-privacy-and |
Repo | |
Framework | |
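The Renyi divergence mentioned in the abstract above has a standard closed form for discrete distributions, D_alpha(P || Q) = log(sum_i p_i^alpha * q_i^(1 - alpha)) / (alpha - 1). The sketch below evaluates it directly. The function names are illustrative, and the code assumes that Q puts positive mass wherever P does.

```python
import math


def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(P || Q) in nats for discrete distributions.

    Assumes alpha > 0, alpha != 1, and q[i] > 0 wherever p[i] > 0.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(pi ** alpha * qi ** (1 - alpha)
                for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1)


def kl_divergence(p, q):
    """Kullback-Leibler divergence, the limit of D_alpha as alpha -> 1."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.9, 0.1]
    for alpha in (1.0001, 1.5, 2.0):
        print(alpha, renyi_divergence(p, q, alpha))
    print("KL", kl_divergence(p, q))
```

The divergence is non-decreasing in alpha and approaches the KL divergence as alpha tends to 1, which is one sense in which it covers a family of divergences.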
A Review on Quantile Regression for Stochastic Computer Experiments
Title | A Review on Quantile Regression for Stochastic Computer Experiments |
Authors | Léonard Torossian, Victor Picheny, Robert Faivre, Aurélien Garivier |
Abstract | We report on an empirical study of the main strategies for quantile regression in the context of stochastic computer experiments. To ensure adequate diversity, six metamodels are presented, divided into three categories based on order statistics, functional approaches, and those of Bayesian inspiration. The metamodels are tested on several problems characterized by the size of the training set, the input dimension, the signal-to-noise ratio and the value of the probability density function at the targeted quantile. The metamodels studied reveal good contrasts in our set of experiments, enabling several patterns to be extracted. Based on our results, guidelines are proposed to allow users to select the best method for a given problem. |
Tasks | |
Published | 2019-01-23 |
URL | https://arxiv.org/abs/1901.07874v4 |
https://arxiv.org/pdf/1901.07874v4.pdf | |
PWC | https://paperswithcode.com/paper/a-review-on-quantile-regression-for |
Repo | |
Framework | |
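Quantile regression methods such as those reviewed above are commonly built on the pinball (check) loss, whose minimizer over a constant prediction is the empirical quantile. The following is a minimal sketch of that fact on a one-dimensional sample. It is not one of the six metamodels from the paper, and the helper names are made up for illustration.

```python
def pinball_loss(y, prediction, tau):
    """Pinball loss of predicting `prediction` for target y at quantile level tau."""
    diff = y - prediction
    # Under-predictions are weighted by tau, over-predictions by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1) * diff


def quantile_by_pinball(samples, tau):
    """Return the sample value that minimizes the total pinball loss.

    The total loss is piecewise linear in the prediction, so a minimizer
    can always be found among the sample values themselves.
    """
    def total_loss(candidate):
        return sum(pinball_loss(y, candidate, tau) for y in samples)

    return min(sorted(samples), key=total_loss)


if __name__ == "__main__":
    samples = list(range(1, 100))  # 1, 2, ..., 99
    print(quantile_by_pinball(samples, 0.5))
    print(quantile_by_pinball(samples, 0.9))
```

In a regression setting the constant prediction is replaced by a model of the input, and the same loss is minimized over the model's parameters.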
Introduction to Coresets: Accurate Coresets
Title | Introduction to Coresets: Accurate Coresets |
Authors | Ibrahim Jubran, Alaa Maalouf, Dan Feldman |
Abstract | A coreset (or core-set) of an input set is a small summary of it, such that solving a problem on the coreset provably yields the same result as solving the same problem on the original (full) set, for a given family of problems (models, classifiers, loss functions). Over the past decade, coreset construction algorithms have been suggested for many fundamental problems in, e.g., machine/deep learning, computer vision, graphics, databases, and theoretical computer science. This introductory paper was written following requests (usually from non-experts, but also from colleagues) regarding the many inconsistent coreset definitions, the lack of available source code, the deep theoretical background required from different fields, and the dense papers that make it hard for beginners to apply coresets and develop new ones. The paper provides folklore, classic and simple results, including step-by-step proofs and figures, for the simplest (accurate) coresets of very basic problems, such as sum of vectors, minimum enclosing ball, SVD/PCA and linear regression. Nevertheless, we did not find most of their constructions in the literature. Moreover, we expect that putting them together in a retrospective context would help the reader to grasp modern results that usually extend and generalize these fundamental observations. Experts might appreciate the unified notation and the comparison table that links existing results. Open source code with example scripts is provided for all the presented algorithms, to demonstrate their practical usage and to support readers who are more familiar with programming than math. |
Tasks | |
Published | 2019-10-19 |
URL | https://arxiv.org/abs/1910.08707v1 |
https://arxiv.org/pdf/1910.08707v1.pdf | |
PWC | https://paperswithcode.com/paper/introduction-to-coresets-accurate-coresets |
Repo | |
Framework | |
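To make the idea of an exact summary concrete, here is a small sketch for one textbook case: the sum of squared distances from a point set to any query point can be recovered exactly from three statistics (count, vector sum, sum of squared norms). This example is chosen for illustration and is not taken from the paper's code. All function names are hypothetical.

```python
def summarize(points):
    """Compress points into (n, vector sum, sum of squared norms)."""
    n = len(points)
    dim = len(points[0])
    vector_sum = [sum(p[j] for p in points) for j in range(dim)]
    squared_norm_sum = sum(c * c for p in points for c in p)
    return n, vector_sum, squared_norm_sum


def cost_from_summary(summary, x):
    """Sum of squared distances to x, computed from the summary alone.

    Uses sum ||p - x||^2 = sum ||p||^2 - 2 x . (sum p) + n ||x||^2.
    """
    n, vector_sum, squared_norm_sum = summary
    dot = sum(s * c for s, c in zip(vector_sum, x))
    return squared_norm_sum - 2.0 * dot + n * sum(c * c for c in x)


def cost_full(points, x):
    """Same cost, computed directly from all points for comparison."""
    return sum(sum((pc - xc) ** 2 for pc, xc in zip(p, x)) for p in points)


if __name__ == "__main__":
    points = [(1.0, 2.0), (3.0, -1.0), (0.5, 4.0)]
    query = (2.0, 1.0)
    print(cost_full(points, query), cost_from_summary(summarize(points), query))
```

The summary has size independent of the number of points, and the two costs agree for every query up to floating-point rounding.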
Unsupervised Abbreviation Disambiguation Contextual disambiguation using word embeddings
Title | Unsupervised Abbreviation Disambiguation Contextual disambiguation using word embeddings |
Authors | Manuel Ciosici, Tobias Sommer, Ira Assent |
Abstract | Abbreviations often have several distinct meanings, which makes their use in text ambiguous. Expanding them to their intended meaning in context is important for Machine Reading tasks such as document search, recommendation and question answering. Existing approaches mostly rely on manually labeled examples of abbreviations and their correct long-forms. Such data sets are costly to create and result in trained models with limited applicability and flexibility. Importantly, most current methods must be subjected to a full empirical evaluation in order to understand their limitations, which is cumbersome in practice. In this paper, we present an entirely unsupervised abbreviation disambiguation method (called UAD) that picks up abbreviation definitions from unstructured text. Creating distinct tokens per meaning, we learn context representations as word vectors. We demonstrate how to further boost abbreviation disambiguation performance by obtaining better context representations using additional unstructured text. Our method is the first abbreviation disambiguation approach with a transparent model that allows performance analysis without requiring full-scale evaluation, making it highly relevant for real-world deployments. In our thorough empirical evaluation, UAD achieves high performance on large real-world data sets from different domains and outperforms both baseline and state-of-the-art methods. UAD scales well and supports thousands of abbreviations with multiple different meanings within a single model. In order to spur more research into abbreviation disambiguation, we publish a new data set that we also use in our experiments. |
Tasks | Question Answering, Reading Comprehension, Word Embeddings |
Published | 2019-04-01 |
URL | https://arxiv.org/abs/1904.00929v2 |
https://arxiv.org/pdf/1904.00929v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-abbreviation-disambiguation |
Repo | |
Framework | |
Differentiable Meta-learning Model for Few-shot Semantic Segmentation
Title | Differentiable Meta-learning Model for Few-shot Semantic Segmentation |
Authors | Pinzhuo Tian, Zhangkai Wu, Lei Qi, Lei Wang, Yinghuan Shi, Yang Gao |
Abstract | To address the annotation scarcity issue in some cases of semantic segmentation, there have been a few attempts to develop the segmentation model in the few-shot learning paradigm. However, most existing methods only focus on the traditional 1-way segmentation setting (i.e., one image only contains a single object). This is far from practical semantic segmentation tasks, where the K-way setting (K>1) is usually required for accurate multi-object segmentation. To deal with this issue, we formulate the few-shot semantic segmentation task as a learning-based pixel classification problem and propose a novel framework called MetaSegNet based on meta-learning. In MetaSegNet, an embedding module consisting of global and local feature branches is developed to extract the appropriate meta-knowledge for few-shot segmentation. Moreover, we incorporate a linear model into MetaSegNet as a base learner to directly predict the label of each pixel for multi-object segmentation. Furthermore, our MetaSegNet can be trained by the episodic training mechanism in an end-to-end manner from scratch. Experiments on two popular semantic segmentation datasets, i.e., PASCAL VOC and COCO, reveal the effectiveness of the proposed MetaSegNet in the K-way few-shot semantic segmentation task. |
Tasks | Few-Shot Learning, Few-Shot Semantic Segmentation, Meta-Learning, Semantic Segmentation |
Published | 2019-11-23 |
URL | https://arxiv.org/abs/1911.10371v1 |
https://arxiv.org/pdf/1911.10371v1.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-meta-learning-model-for-few |
Repo | |
Framework | |
ClassyTune: A Performance Auto-Tuner for Systems in the Cloud
Title | ClassyTune: A Performance Auto-Tuner for Systems in the Cloud |
Authors | Yuqing Zhu, Jianxun Liu |
Abstract | Performance tuning can improve system performance and thus reduce the cloud computing resources needed to support an application. Due to the ever increasing number of parameters and complexity of systems, there is a need to automate performance tuning for complicated systems in the cloud. State-of-the-art tuning methods adopt either the experience-driven tuning approach or the data-driven one. Data-driven tuning is attracting increasing attention, as it has wider applicability. But existing data-driven methods cannot fully address the challenges of sample scarcity and high dimensionality simultaneously. We present ClassyTune, a data-driven automatic configuration tuning tool for cloud systems. ClassyTune exploits the machine learning model of classification for auto-tuning. This exploitation enables the induction of more training samples without increasing the input dimension. Experiments on seven popular systems in the cloud show that ClassyTune can effectively tune system performance to seven times higher for high-dimensional configuration space, outperforming expert tuning and the state-of-the-art auto-tuning solutions. We also describe a use case in which performance tuning enables a 33% reduction in the computing resources needed to run an online stateless service. |
Tasks | |
Published | 2019-10-12 |
URL | https://arxiv.org/abs/1910.05482v1 |
https://arxiv.org/pdf/1910.05482v1.pdf | |
PWC | https://paperswithcode.com/paper/classytune-a-performance-auto-tuner-for |
Repo | |
Framework | |
PROFET: Construction and Inference of DBNs Based on Mathematical Models
Title | PROFET: Construction and Inference of DBNs Based on Mathematical Models |
Authors | Hamda Ajmal, Michael Madden, Catherine Enright |
Abstract | This paper presents, evaluates, and discusses a new software tool to automatically build Dynamic Bayesian Networks (DBNs) from ordinary differential equations (ODEs) entered by the user. The DBNs generated from ODE models can handle both data uncertainty and model uncertainty in a principled manner. The application, named PROFET, can be used for temporal data mining with noisy or missing variables. It enables automatic re-estimation of model parameters using temporal evidence in the form of data streams. For temporal inference, PROFET includes both standard fixed time step particle filtering and its extension, adaptive-time particle filtering algorithms. Adaptive-time particle filtering enables the DBN to automatically adapt its time step length to match the dynamics of the model. We demonstrate PROFET's functionality by using it to infer the model variables by estimating the model parameters of four benchmark ODE systems. From the generation of the DBN model to temporal inference, the entire process is automated and is delivered as an open-source platform-independent software application with a comprehensive user interface. PROFET is released under the Apache License 2.0. Its source code, executable and documentation are available at http://profet.it.nuigalway.ie. |
Tasks | |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04895v2 |
https://arxiv.org/pdf/1910.04895v2.pdf | |
PWC | https://paperswithcode.com/paper/profet-construction-and-inference-of-dbns |
Repo | |
Framework | |
Knowledge Graph Transfer Network for Few-Shot Recognition
Title | Knowledge Graph Transfer Network for Few-Shot Recognition |
Authors | Riquan Chen, Tianshui Chen, Xiaolu Hui, Hefeng Wu, Guanbin Li, Liang Lin |
Abstract | Few-shot learning aims to learn novel categories from very few samples given some base categories with sufficient training samples. The main challenge of this task is that the novel categories are prone to be dominated by the color, texture, and shape of the object or the background context (namely specificity), which are distinct for the given few training samples but not common for the corresponding categories (see Figure 1). Fortunately, we find that transferring information from the correlated base categories can help learn the novel concepts and thus avoid the novel concept being dominated by the specificity. Besides, incorporating semantic correlations among different categories can effectively regularize this information transfer. In this work, we represent the semantic correlations in the form of a structured knowledge graph and integrate this graph into deep neural networks to promote few-shot learning by a novel Knowledge Graph Transfer Network (KGTN). Specifically, by initializing each node with the classifier weight of the corresponding category, a propagation mechanism is learned to adaptively propagate node messages through the graph to explore node interaction and transfer classifier information of the base categories to those of the novel ones. Extensive experiments on the ImageNet dataset show significant performance improvement compared with current leading competitors. Furthermore, we construct an ImageNet-6K dataset that covers larger scale categories, i.e., 6,000 categories, and experiments on this dataset further demonstrate the effectiveness of our proposed model. |
Tasks | Few-Shot Learning |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09579v1 |
https://arxiv.org/pdf/1911.09579v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-graph-transfer-network-for-few-shot |
Repo | |
Framework | |
An Analysis of Pre-Training on Object Detection
Title | An Analysis of Pre-Training on Object Detection |
Authors | Hengduo Li, Bharat Singh, Mahyar Najibi, Zuxuan Wu, Larry S. Davis |
Abstract | We provide a detailed analysis of convolutional neural networks which are pre-trained on the task of object detection. To this end, we train detectors on large datasets like OpenImagesV4, ImageNet Localization and COCO. We analyze how well their features generalize to tasks like image classification, semantic segmentation and object detection on small datasets like PASCAL-VOC, Caltech-256, SUN-397, Flowers-102 etc. Some important conclusions from our analysis are: 1) Pre-training on large detection datasets is crucial for fine-tuning on small detection datasets, especially when precise localization is needed. For example, we obtain 81.1% mAP on the PASCAL-VOC dataset at 0.7 IoU after pre-training on OpenImagesV4, which is 7.6% better than the recently proposed DeformableConvNetsV2 which uses ImageNet pre-training. 2) Detection pre-training also benefits other localization tasks like semantic segmentation but adversely affects image classification. 3) Features for images (like avg. pooled Conv5) which are similar in the object detection feature space are likely to be similar in the image classification feature space but the converse is not true. 4) Visualization of features reveals that detection neurons have activations over an entire object, while activations for classification networks typically focus on parts. Therefore, detection networks are poor at classification when multiple instances are present in an image or when an instance only covers a small fraction of an image. |
Tasks | Image Classification, Object Detection, Semantic Segmentation |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05871v1 |
http://arxiv.org/pdf/1904.05871v1.pdf | |
PWC | https://paperswithcode.com/paper/an-analysis-of-pre-training-on-object |
Repo | |
Framework | |
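The 0.7 IoU figure quoted in the abstract above refers to the intersection-over-union overlap a predicted box must reach to count as a correct detection. A minimal sketch of that computation for axis-aligned boxes follows. The (x1, y1, x2, y2) corner convention and the function name are assumptions of this example.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0.0, 0.0, 10.0, 8.0)
    ground_truth = (0.0, 0.0, 10.0, 10.0)
    iou = box_iou(predicted, ground_truth)
    print(iou, iou >= 0.7)
```

A higher threshold such as 0.7 demands tighter localization than the more common 0.5, which is why it is used above to probe precise localization.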
New methods to assess and improve LIGO detector duty cycle
Title | New methods to assess and improve LIGO detector duty cycle |
Authors | Ayon Biswas, Jess McIver, Ashish Mahabal |
Abstract | A network of three or more gravitational wave detectors simultaneously taking data is required to generate a well-localized sky map for gravitational wave sources, such as GW170817. Local seismic disturbances often cause the LIGO and Virgo detectors to lose light resonance in one or more of their component optic cavities, and the affected detector is unable to take data until resonance is recovered. In this paper, we use machine learning techniques to gain insight into the predictive behavior of the LIGO detector optic cavities during the second LIGO-Virgo observing run. We identify a minimal set of optic cavity control signals and data features which capture interferometer behavior leading to a loss of light resonance, or lockloss. We use these channels to accurately distinguish between lockloss events and quiet interferometer operating times via both supervised and unsupervised machine learning methods. This analysis yields new insights into how components of the LIGO detectors contribute to lockloss events, which could inform detector commissioning efforts to mitigate the associated loss of uptime. In particular, we find that the state of the component optical cavities is a better predictor of lockloss than ground motion trends. We report prediction accuracies of 98% for times just prior to lockloss, and 90% for times up to 30 seconds prior to lockloss, which shows promise for this method to be applied in near-real time to trigger preventative detector state changes. This method can be extended to target other auxiliary subsystems or times of interest, such as transient noise or loss in detector sensitivity. Application of these techniques during the third LIGO-Virgo observing run and beyond would maximize the potential of the global detector network for multi-messenger astronomy with gravitational waves. |
Tasks | |
Published | 2019-10-26 |
URL | https://arxiv.org/abs/1910.12143v1 |
https://arxiv.org/pdf/1910.12143v1.pdf | |
PWC | https://paperswithcode.com/paper/new-methods-to-assess-and-improve-ligo |
Repo | |
Framework | |
Variable Metric Proximal Gradient Method with Diagonal Barzilai-Borwein Stepsize
Title | Variable Metric Proximal Gradient Method with Diagonal Barzilai-Borwein Stepsize |
Authors | Youngsuk Park, Sauptik Dhar, Stephen Boyd, Mohak Shah |
Abstract | Variable metric proximal gradient (VM-PG) is a widely used class of convex optimization methods. Lately, there has been a lot of research on the theoretical guarantees of VM-PG with different metric selections. However, most such metric selections are dependent on (an expensive) Hessian, or limited to scalar stepsizes like the Barzilai-Borwein (BB) stepsize with lots of safeguarding. Instead, in this paper we propose an adaptive metric selection strategy called the diagonal Barzilai-Borwein (BB) stepsize. The proposed diagonal selection better captures the local geometry of the problem while keeping the per-step computation cost similar to that of the scalar BB stepsize, i.e., $O(n)$. Under this metric selection for VM-PG, the theoretical convergence is analyzed. Our empirical studies illustrate the improved convergence results under the proposed diagonal BB stepsize, specifically for ill-conditioned machine learning problems for both synthetic and real-world datasets. |
Tasks | |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.07056v1 |
https://arxiv.org/pdf/1910.07056v1.pdf | |
PWC | https://paperswithcode.com/paper/variable-metric-proximal-gradient-method-with |
Repo | |
Framework | |
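For context, the scalar Barzilai-Borwein stepsize that the abstract above takes as its baseline can be sketched as plain gradient descent with step s^T s / s^T y, where s is the change in the iterate and y the change in the gradient. The code below shows only this scalar rule on a smooth problem with no proximal term. It is not the diagonal variant proposed in the paper, and the function names, initial step and skip-on-nonpositive-curvature safeguard are assumptions of the sketch.

```python
def bb_gradient_descent(grad, x0, initial_step=1e-3, iterations=100):
    """Gradient descent with the scalar Barzilai-Borwein (BB1) stepsize."""
    x = list(x0)
    g = grad(x)
    step = initial_step
    for _ in range(iterations):
        x_new = [xi - step * gi for xi, gi in zip(x, g)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]  # change in iterate
        y = [a - b for a, b in zip(g_new, g)]  # change in gradient
        sy = sum(si * yi for si, yi in zip(s, y))
        ss = sum(si * si for si in s)
        # Update the step only when the curvature estimate is positive;
        # otherwise keep the previous step (a simple assumed safeguard).
        if sy > 0:
            step = ss / sy
        x, g = x_new, g_new
    return x


def ill_conditioned_grad(x):
    """Gradient of f(x) = 0.5 * (x0^2 + 100 * x1^2), condition number 100."""
    return [1.0 * x[0], 100.0 * x[1]]


if __name__ == "__main__":
    print(bb_gradient_descent(ill_conditioned_grad, [1.0, 1.0]))
```

Each iteration costs O(n) beyond the gradient evaluation, which is the per-step budget the abstract says the diagonal variant aims to match.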
Uncrowded Hypervolume Improvement: COMO-CMA-ES and the Sofomore framework
Title | Uncrowded Hypervolume Improvement: COMO-CMA-ES and the Sofomore framework |
Authors | Cheikh Touré, Nikolaus Hansen, Anne Auger, Dimo Brockhoff |
Abstract | We present a framework to build a multiobjective algorithm from single-objective ones. This framework addresses the $p \times n$-dimensional problem of finding $p$ solutions in an $n$-dimensional search space, maximizing an indicator by dynamic subspace optimization. Each single-objective algorithm optimizes the indicator function given $p - 1$ fixed solutions. Crucially, dominated solutions minimize their distance to the empirical Pareto front defined by these $p - 1$ solutions. We instantiate the framework with CMA-ES as single-objective optimizer. The new algorithm, COMO-CMA-ES, is empirically shown to converge linearly on bi-objective convex-quadratic problems and is compared to MO-CMA-ES, NSGA-II and SMS-EMOA. |
Tasks | |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.08823v1 |
http://arxiv.org/pdf/1904.08823v1.pdf | |
PWC | https://paperswithcode.com/paper/uncrowded-hypervolume-improvement-como-cma-es |
Repo | |
Framework | |
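The indicator in the title above is based on the hypervolume, which for two minimized objectives is the area dominated by a set of points and bounded by a reference point. The sketch below computes that area and the plain hypervolume improvement of adding one candidate given the other fixed points. It does not implement the "uncrowded" extension for dominated candidates described in the paper, and the function names are illustrative.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` (both objectives minimized) up to `reference`."""
    rx, ry = reference
    # Keep only points that strictly dominate the reference point,
    # sorted by the first objective (ties broken by the second).
    inside = sorted(p for p in points if p[0] < rx and p[1] < ry)
    area = 0.0
    best_y = ry
    for x, y in inside:
        if y < best_y:
            # Add the new horizontal strip this point contributes.
            area += (rx - x) * (best_y - y)
            best_y = y
    return area


def hypervolume_improvement(candidate, fixed_points, reference):
    """Gain in hypervolume from adding `candidate` to `fixed_points`."""
    before = hypervolume_2d(fixed_points, reference)
    after = hypervolume_2d(list(fixed_points) + [candidate], reference)
    return after - before


if __name__ == "__main__":
    front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
    ref = (4.0, 4.0)
    print(hypervolume_2d(front, ref))
    print(hypervolume_improvement((1.5, 2.5), front, ref))
```

A dominated candidate yields zero improvement under this plain definition, which gives a search algorithm no direction to move in. The abstract's distance-to-front term for dominated solutions addresses that case.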