January 28, 2020

Paper Group ANR 884

Improved Causal Discovery from Longitudinal Data Using a Mixture of DAGs. On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement. Federated Reinforcement Distillation with Proxy Experience Memory. Coarse-to-fine Optimization for Speech Enhancement. Ontology of Card Sleights. MLSys: The New Frontier of Machine Learning Systems. Con …

Improved Causal Discovery from Longitudinal Data Using a Mixture of DAGs

Title Improved Causal Discovery from Longitudinal Data Using a Mixture of DAGs
Authors Eric V. Strobl
Abstract Many causal processes in biomedicine contain cycles and evolve. However, most causal discovery algorithms assume that the underlying causal process follows a single directed acyclic graph (DAG) that does not change over time. The algorithms can therefore infer erroneous causal relations with high confidence when run on real biomedical data. In this paper, I relax the single DAG assumption by modeling causal processes using a mixture of DAGs so that the graph can change over time. I then describe a causal discovery algorithm called Causal Inference over Mixtures (CIM) to infer causal structure from a mixture of DAGs using longitudinal data. CIM improves the accuracy of causal discovery on both real and synthetic clinical datasets even when cycles, non-stationarity, non-linearity, latent variables and selection bias exist simultaneously.
Tasks Causal Discovery, Causal Inference
Published 2019-01-28
URL http://arxiv.org/abs/1901.09475v1
PDF http://arxiv.org/pdf/1901.09475v1.pdf
PWC https://paperswithcode.com/paper/improved-causal-discovery-from-longitudinal
Repo
Framework

On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement

Title On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement
Authors Morten Kolbæk, Zheng-Hua Tan, Søren Holdt Jensen, Jesper Jensen
Abstract Many deep learning-based speech enhancement algorithms are designed to minimize the mean-square error (MSE) in some transform domain between a predicted and a target speech signal. However, optimizing for MSE does not necessarily guarantee high speech quality or intelligibility, which is the ultimate goal of many speech enhancement algorithms. Additionally, little is known about the impact of the loss function on the emerging class of time-domain deep learning-based speech enhancement systems. We study how popular loss functions influence the performance of deep learning-based speech enhancement systems. First, we demonstrate that perceptually inspired loss functions might be advantageous if the receiver is the human auditory system. Furthermore, we show that the learning rate is a crucial design parameter even for adaptive gradient-based optimizers, which has been generally overlooked in the literature. Also, we found that waveform matching performance metrics must be used with caution, as they can fail completely in certain situations. Finally, we show that a loss function based on scale-invariant signal-to-distortion ratio (SI-SDR) achieves good general performance across a range of popular speech enhancement evaluation metrics, which suggests that SI-SDR is a good candidate as a general-purpose loss function for speech enhancement systems.
Tasks Speech Enhancement
Published 2019-09-03
URL https://arxiv.org/abs/1909.01019v2
PDF https://arxiv.org/pdf/1909.01019v2.pdf
PWC https://paperswithcode.com/paper/on-loss-functions-for-supervised-monaural
Repo
Framework
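
The abstract above recommends SI-SDR as a general-purpose loss. As a minimal sketch of the commonly used SI-SDR definition (not the authors' code), the target is rescaled by the projection coefficient of the estimate onto it, and the energy ratio of that scaled target to the residual is reported in dB. The function names and the `eps` stabilizer are assumptions made for this illustration.

```python
import math


def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB for two equal-length sample sequences."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    # Optimal scaling of the target: alpha = <estimate, target> / ||target||^2
    dot = sum(e * t for e, t in zip(estimate, target))
    target_energy = sum(t * t for t in target)
    alpha = dot / (target_energy + eps)
    # Scaled target and the residual left after removing it from the estimate
    projection = [alpha * t for t in target]
    residual = [e - p for e, p in zip(estimate, projection)]
    signal_energy = sum(p * p for p in projection)
    residual_energy = sum(r * r for r in residual)
    return 10.0 * math.log10((signal_energy + eps) / (residual_energy + eps))


def si_sdr_loss(estimate, target):
    """Negated SI-SDR, so that minimizing the loss maximizes SI-SDR."""
    return -si_sdr(estimate, target)
```

Because the target is rescaled before the comparison, multiplying the estimate by a constant gain leaves the value essentially unchanged, which is the property the name refers to.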

Federated Reinforcement Distillation with Proxy Experience Memory

Title Federated Reinforcement Distillation with Proxy Experience Memory
Authors Han Cha, Jihong Park, Hyesung Kim, Seong-Lyun Kim, Mehdi Bennis
Abstract In distributed reinforcement learning, it is common to exchange the experience memory of each agent and thereby collectively train their local models. The experience memory, however, contains all the preceding state observations and their corresponding policies of the host agent, which may violate the privacy of the agent. To avoid this problem, in this work, we propose a privacy-preserving distributed reinforcement learning (RL) framework, termed federated reinforcement distillation (FRD). The key idea is to exchange a proxy experience memory comprising a pre-arranged set of states and time-averaged policies, thereby preserving the privacy of actual experiences. Based on an advantage actor-critic RL architecture, we numerically evaluate the effectiveness of FRD and investigate how the performance of FRD is affected by the proxy memory structure and different memory exchanging rules.
Tasks
Published 2019-07-15
URL https://arxiv.org/abs/1907.06536v1
PDF https://arxiv.org/pdf/1907.06536v1.pdf
PWC https://paperswithcode.com/paper/federated-reinforcement-distillation-with
Repo
Framework
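
The proxy experience memory described in the abstract above pairs a pre-arranged set of states with time-averaged policies. The sketch below illustrates one way this could look, under assumptions the abstract does not confirm: each observed state is mapped to its nearest proxy state by Euclidean distance, and the policies (action-probability vectors) assigned to the same proxy state are averaged. All names here are hypothetical.

```python
import math
from collections import defaultdict


def nearest_proxy(state, proxy_states):
    """Index of the proxy state closest to `state` (Euclidean distance; an assumption)."""
    return min(range(len(proxy_states)),
               key=lambda i: math.dist(state, proxy_states[i]))


def build_proxy_memory(experiences, proxy_states):
    """Average the policies observed near each proxy state.

    `experiences` is an iterable of (state, policy) pairs, where `policy`
    is a list of action probabilities. Only proxy-state indices and averaged
    policies are returned, so the raw states never leave the agent.
    """
    sums = {}
    counts = defaultdict(int)
    for state, policy in experiences:
        idx = nearest_proxy(state, proxy_states)
        if idx not in sums:
            sums[idx] = [0.0] * len(policy)
        sums[idx] = [s + p for s, p in zip(sums[idx], policy)]
        counts[idx] += 1
    return {idx: [s / counts[idx] for s in total] for idx, total in sums.items()}
```

In this sketch, agents would exchange the returned dictionary rather than their raw experience memory.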

Coarse-to-fine Optimization for Speech Enhancement

Title Coarse-to-fine Optimization for Speech Enhancement
Authors Jian Yao, Ahmad Al-Dahle
Abstract In this paper, we propose the coarse-to-fine optimization for the task of speech enhancement. Cosine similarity loss [1] has proven to be an effective metric to measure similarity of speech signals. However, due to the large variance of the enhanced speech with even the same cosine similarity loss in high dimensional space, a deep neural network learnt with this loss might not be able to predict enhanced speech with good quality. Our coarse-to-fine strategy optimizes the cosine similarity loss for different granularities so that more constraints are added to the prediction from high dimension to relatively low dimension. In this way, the enhanced speech will better resemble the clean speech. Experimental results show the effectiveness of our proposed coarse-to-fine optimization in both discriminative models and generative models. Moreover, we apply the coarse-to-fine strategy to the adversarial loss in generative adversarial network (GAN) and propose dynamic perceptual loss, which dynamically computes the adversarial loss from coarse resolution to fine resolution. Dynamic perceptual loss further improves the accuracy and achieves state-of-the-art results compared with other generative models.
Tasks Speech Enhancement
Published 2019-08-21
URL https://arxiv.org/abs/1908.08044v1
PDF https://arxiv.org/pdf/1908.08044v1.pdf
PWC https://paperswithcode.com/paper/coarse-to-fine-optimization-for-speech
Repo
Framework
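
To illustrate why a cosine similarity loss at a single granularity can under-constrain the prediction, the following sketch sums the loss over progressively finer segmentations of the signal. The equal-length segmentation and the `splits` parameter are assumptions made for this example; the abstract above does not specify how granularities are defined.

```python
import math


def cosine_loss(a, b, eps=1e-8):
    """1 - cosine similarity between two equal-length sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b + eps)


def coarse_to_fine_loss(enhanced, clean, splits=(1, 2, 4)):
    """Sum of mean per-segment cosine losses, from whole signal to short segments.

    Trailing samples that do not fill a segment are ignored, and an
    all-zero segment contributes a loss of 1 in this simple sketch.
    """
    n = len(clean)
    total = 0.0
    for k in splits:
        seg = n // k  # segment length at this granularity
        losses = [
            cosine_loss(enhanced[i * seg:(i + 1) * seg], clean[i * seg:(i + 1) * seg])
            for i in range(k)
        ]
        total += sum(losses) / k
    return total
```

For example, with `clean = [10, 0, 0, 0, 0, 1, 0, 0]` and `enhanced = [10, 0, 0, 0, 0, 0, 1, 0]`, the whole-signal loss is small because the first sample dominates, while splitting into halves exposes the mismatch in the second half.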

Ontology of Card Sleights

Title Ontology of Card Sleights
Authors Aaron Sterling
Abstract We present a machine-readable movement writing for sleight-of-hand moves with cards – a “Labanotation of card magic.” This scheme of movement writing contains 440 categories of motion, and appears to taxonomize all card sleights that have appeared in over 1500 publications. The movement writing is axiomatized in $\mathcal{SROIQ}(D)$ Description Logic, and collected formally as an Ontology of Card Sleights, a computational ontology that extends the Basic Formal Ontology and the Information Artifact Ontology. The Ontology of Card Sleights is implemented in OWL DL, a Description Logic fragment of the Web Ontology Language. While ontologies have historically been used to classify at a less granular level, the algorithmic nature of card tricks allows us to transcribe a performer’s actions step by step. We conclude by discussing design criteria we have used to ensure the ontology can be accessed and modified with a simple click-and-drag interface. This may allow database searches and performance transcriptions by users with card magic knowledge, but no ontology background.
Tasks
Published 2019-03-20
URL http://arxiv.org/abs/1903.08523v1
PDF http://arxiv.org/pdf/1903.08523v1.pdf
PWC https://paperswithcode.com/paper/ontology-of-card-sleights
Repo
Framework

MLSys: The New Frontier of Machine Learning Systems

Title MLSys: The New Frontier of Machine Learning Systems
Authors Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood, Furong Huang, Martin Jaggi, Kevin Jamieson, Michael I. Jordan, Gauri Joshi, Rania Khalaf, Jason Knight, Jakub Konečný, Tim Kraska, Arun Kumar, Anastasios Kyrillidis, Aparna Lakshmiratan, Jing Li, Samuel Madden, H. Brendan McMahan, Erik Meijer, Ioannis Mitliagkas, Rajat Monga, Derek Murray, Kunle Olukotun, Dimitris Papailiopoulos, Gennady Pekhimenko, Theodoros Rekatsinas, Afshin Rostamizadeh, Christopher Ré, Christopher De Sa, Hanie Sedghi, Siddhartha Sen, Virginia Smith, Alex Smola, Dawn Song, Evan Sparks, Ion Stoica, Vivienne Sze, Madeleine Udell, Joaquin Vanschoren, Shivaram Venkataraman, Rashmi Vinayak, Markus Weimer, Andrew Gordon Wilson, Eric Xing, Matei Zaharia, Ce Zhang, Ameet Talwalkar
Abstract Machine learning (ML) techniques are enjoying rapidly increasing adoption. However, designing and implementing the systems that support ML models in real-world deployments remains a significant obstacle, in large part due to the radically different development and deployment profile of modern ML methods, and the range of practical concerns that come with broader adoption. We propose to foster a new systems machine learning research community at the intersection of the traditional systems and ML communities, focused on topics such as hardware systems for ML, software systems for ML, and ML optimized for metrics beyond predictive accuracy. To do this, we describe a new conference, MLSys, that explicitly targets research at the intersection of systems and machine learning with a program committee split evenly between experts in systems and ML, and an explicit focus on topics at the intersection of the two.
Tasks
Published 2019-03-29
URL https://arxiv.org/abs/1904.03257v3
PDF https://arxiv.org/pdf/1904.03257v3.pdf
PWC https://paperswithcode.com/paper/sysml-the-new-frontier-of-machine-learning
Repo
Framework

Consensus Feature Network for Scene Parsing

Title Consensus Feature Network for Scene Parsing
Authors Tianyi Wu, Sheng Tang, Rui Zhang, Guodong Guo, Yongdong Zhang
Abstract Scene parsing is challenging as it aims to assign one of the semantic categories to each pixel in scene images. Thus, pixel-level features are desired for scene parsing. However, classification networks are dominated by the discriminative portion, so directly applying classification networks to scene parsing will result in inconsistent parsing predictions within one instance and among instances of the same category. To address this problem, we propose two transform units to learn pixel-level consensus features. One is an Instance Consensus Transform (ICT) unit to learn the instance-level consensus features by aggregating features within the same instance. The other is a Category Consensus Transform (CCT) unit to pursue category-level consensus features through keeping the consensus of features among instances of the same category in scene images. The proposed ICT and CCT units are lightweight, data-driven and end-to-end trainable. The features learned by the two units are more coherent in both instance-level and category-level. Furthermore, we present the Consensus Feature Network (CFNet) based on the proposed ICT and CCT units, and demonstrate the effectiveness of each component in our method by performing extensive ablation experiments. Finally, our proposed CFNet achieves competitive performance on four datasets, including Cityscapes, Pascal Context, CamVid, and COCO Stuff.
Tasks Scene Parsing
Published 2019-07-29
URL https://arxiv.org/abs/1907.12411v2
PDF https://arxiv.org/pdf/1907.12411v2.pdf
PWC https://paperswithcode.com/paper/consensus-feature-network-for-scene-parsing
Repo
Framework

ELKPPNet: An Edge-aware Neural Network with Large Kernel Pyramid Pooling for Learning Discriminative Features in Semantic Segmentation

Title ELKPPNet: An Edge-aware Neural Network with Large Kernel Pyramid Pooling for Learning Discriminative Features in Semantic Segmentation
Authors Xianwei Zheng, Linxi Huan, Hanjiang Xiong, Jianya Gong
Abstract Semantic segmentation has been a hot topic across diverse research fields. Along with the success of deep convolutional neural networks, semantic segmentation has made great achievements and improvements, in terms of both urban scene parsing and indoor semantic segmentation. However, most of the state-of-the-art models are still faced with a challenge in discriminative feature learning, which limits the ability of a model to detect multi-scale objects and to guarantee semantic consistency inside one object or distinguish different adjacent objects with similar appearance. In this paper, a practical and efficient edge-aware neural network is presented for semantic segmentation. This end-to-end trainable engine consists of a new encoder-decoder network, a large kernel spatial pyramid pooling (LKPP) block, and an edge-aware loss function. The encoder-decoder network was designed as a balanced structure to narrow the semantic and resolution gaps in multi-level feature aggregation, while the LKPP block was constructed with a densely expanding receptive field for multi-scale feature extraction and fusion. Furthermore, the new powerful edge-aware loss function is proposed to refine the boundaries directly from the semantic segmentation prediction for more robust and discriminative features. The effectiveness of the proposed model was demonstrated using Cityscapes, CamVid, and NYUDv2 benchmark datasets. The performance of the two structures and the edge-aware loss function in ELKPPNet was validated on the Cityscapes dataset, while the complete ELKPPNet was evaluated on the CamVid and NYUDv2 datasets. A comparative analysis with the state-of-the-art methods under the same conditions confirmed the superiority of the proposed algorithm.
Tasks Scene Parsing, Semantic Segmentation
Published 2019-06-27
URL https://arxiv.org/abs/1906.11428v1
PDF https://arxiv.org/pdf/1906.11428v1.pdf
PWC https://paperswithcode.com/paper/elkppnet-an-edge-aware-neural-network-with
Repo
Framework

Im2Pencil: Controllable Pencil Illustration from Photographs

Title Im2Pencil: Controllable Pencil Illustration from Photographs
Authors Yijun Li, Chen Fang, Aaron Hertzmann, Eli Shechtman, Ming-Hsuan Yang
Abstract We propose a high-quality photo-to-pencil translation method with fine-grained control over the drawing style. This is a challenging task due to multiple stroke types (e.g., outline and shading), structural complexity of pencil shading (e.g., hatching), and the lack of aligned training data pairs. To address these challenges, we develop a two-branch model that learns separate filters for generating sketchy outlines and tonal shading from a collection of pencil drawings. We create training data pairs by extracting clean outlines and tonal illustrations from original pencil drawings using image filtering techniques, and we manually label the drawing styles. In addition, our model creates different pencil styles (e.g., line sketchiness and shading style) in a user-controllable manner. Experimental results on different types of pencil drawings show that the proposed algorithm performs favorably against existing methods in terms of quality, diversity and user evaluations.
Tasks
Published 2019-03-20
URL http://arxiv.org/abs/1903.08682v1
PDF http://arxiv.org/pdf/1903.08682v1.pdf
PWC https://paperswithcode.com/paper/im2pencil-controllable-pencil-illustration
Repo
Framework

On mechanisms for transfer using landmark value functions in multi-task lifelong reinforcement learning

Title On mechanisms for transfer using landmark value functions in multi-task lifelong reinforcement learning
Authors Nick Denis
Abstract Transfer learning across different reinforcement learning (RL) tasks is becoming an increasingly valuable area of research. We consider a goal-based multi-task RL framework and mechanisms by which previously solved tasks can reduce sample complexity and regret when the agent is faced with a new task. Specifically, we introduce two metrics on the state space that encode notions of traversability of the state space for an agent. Using these metrics, a topological covering is constructed by way of a set of landmark states in a fully self-supervised manner. We show that these landmark coverings confer theoretical advantages for transfer learning within the goal-based multi-task RL setting. Specifically, we demonstrate three mechanisms by which landmark coverings can be used for successful transfer learning. First, we extend the Landmark Options Via Reflection (LOVR) framework to this new topological covering; second, we use the landmark-centric value functions themselves as features and define a greedy zombie policy that achieves near oracle performance on a sequence of zero-shot transfer tasks; finally, motivated by the second transfer mechanism, we introduce a learned reward function that provides a more dense reward signal for goal-based RL. Our novel topological landmark covering confers beneficial theoretical results, bounding the Q values at each state-action pair. In doing so, we introduce a mechanism that performs action-pruning at infeasible actions which cannot possibly be part of an optimal policy for the current goal.
Tasks Transfer Learning
Published 2019-07-01
URL https://arxiv.org/abs/1907.00884v1
PDF https://arxiv.org/pdf/1907.00884v1.pdf
PWC https://paperswithcode.com/paper/on-mechanisms-for-transfer-using-landmark
Repo
Framework

Towards Reducing Bias in Gender Classification

Title Towards Reducing Bias in Gender Classification
Authors Komal K. Teru, Aishik Chakraborty
Abstract Societal bias towards certain communities is a big problem that affects a lot of machine learning systems. This work aims at addressing the racial bias present in many modern gender recognition systems. We learn race invariant representations of human faces with an adversarially trained autoencoder model. We show that such representations help us achieve less biased performance in gender classification. We use variance in classification accuracy across different races as a surrogate for the racial bias of the model and achieve a drop of over 40% in variance with race invariant representations.
Tasks
Published 2019-11-16
URL https://arxiv.org/abs/1911.08556v1
PDF https://arxiv.org/pdf/1911.08556v1.pdf
PWC https://paperswithcode.com/paper/towards-reducing-bias-in-gender
Repo
Framework
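
The bias surrogate described in the abstract above, variance in classification accuracy across groups, can be sketched as follows. The function names are hypothetical, and the use of population variance (rather than sample variance) is an assumption, since the abstract does not say which one is used.

```python
from collections import defaultdict
from statistics import pvariance


def accuracy_by_group(labels, predictions, groups):
    """Classification accuracy computed separately for each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p, g in zip(labels, predictions, groups):
        total[g] += 1
        correct[g] += int(y == p)
    return {g: correct[g] / total[g] for g in total}


def accuracy_variance(labels, predictions, groups):
    """Population variance of per-group accuracies; lower means more even performance."""
    accuracies = list(accuracy_by_group(labels, predictions, groups).values())
    return pvariance(accuracies)
```

A model that is equally accurate on every group scores 0 on this surrogate regardless of its overall accuracy, so it should be read alongside the accuracy itself.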

Accelerating Physics-Based Simulations Using Neural Network Proxies: An Application in Oil Reservoir Modeling

Title Accelerating Physics-Based Simulations Using Neural Network Proxies: An Application in Oil Reservoir Modeling
Authors Jiri Navratil, Alan King, Jesus Rios, Georgios Kollias, Ruben Torrado, Andres Codas
Abstract We develop a proxy model based on deep learning methods to accelerate the simulations of oil reservoirs by three orders of magnitude compared to industry-strength physics-based PDE solvers. This paper describes a new architectural approach to this task, accompanied by a thorough experimental evaluation on a publicly available reservoir model. We demonstrate that in a practical setting a speedup of more than 2000X can be achieved with an average sequence error of about 10% relative to the oil-field simulator. The proxy model is contrasted with a high-quality physics-based acceleration baseline and is shown to outperform it by several orders of magnitude. We believe the outcomes presented here are extremely promising and offer a valuable benchmark for continuing research in oil field development optimization. Due to its domain-agnostic architecture, the presented approach can be extended to many applications beyond the field of oil and gas exploration.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1906.01510v1
PDF https://arxiv.org/pdf/1906.01510v1.pdf
PWC https://paperswithcode.com/paper/190601510
Repo
Framework

Semi-Supervised Learning for Cancer Detection of Lymph Node Metastases

Title Semi-Supervised Learning for Cancer Detection of Lymph Node Metastases
Authors Amit Kumar Jaiswal, Ivan Panshin, Dimitrij Shulkin, Nagender Aneja, Samuel Abramov
Abstract Pathologists find it tedious to examine the status of the sentinel lymph node on a large number of pathological scans. The examination of such lymph nodes, which encompass metastasized cancer cells, is organized histopathologically. However, finding metastatic tissue is a gradual and often challenging task. In this work, we present our deep convolutional neural network-based model, validated on the PatchCamelyon (PCam) benchmark dataset for fundamental machine learning research in histopathology diagnosis. We find that our proposed model, trained with a semi-supervised learning approach using pseudo labels at the PCam level, performs significantly better than a strong CNN baseline on the AUC metric.
Tasks
Published 2019-06-23
URL https://arxiv.org/abs/1906.09587v1
PDF https://arxiv.org/pdf/1906.09587v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-learning-for-cancer-detection
Repo
Framework
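
Pseudo-labeling, the semi-supervised approach named in the abstract above, generally means assigning hard labels to unlabeled samples on which the current model is confident and then adding them to the training set. The sketch below shows only the selection step for a binary classifier; the confidence threshold of 0.9 and the function name are assumptions, not values taken from the paper.

```python
def pseudo_label(probabilities, threshold=0.9):
    """Select confident unlabeled samples and assign them hard labels.

    `probabilities` holds the model's predicted probability of the positive
    class for each unlabeled sample. Returns (index, label) pairs for the
    samples whose prediction is at least `threshold` confident either way.
    """
    selected = []
    for i, p in enumerate(probabilities):
        if p >= threshold:
            selected.append((i, 1))  # confidently positive
        elif p <= 1.0 - threshold:
            selected.append((i, 0))  # confidently negative
    return selected
```

Samples in the uncertain middle range are left unlabeled and can be reconsidered after the model is retrained.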

Careful Selection of Knowledge to solve Open Book Question Answering

Title Careful Selection of Knowledge to solve Open Book Question Answering
Authors Pratyay Banerjee, Kuntal Kumar Pal, Arindam Mitra, Chitta Baral
Abstract Open book question answering is a type of natural language based QA (NLQA) where questions are expected to be answered with respect to a given set of open book facts, and common knowledge about a topic. Recently a challenge involving such QA, OpenBookQA, has been proposed. Unlike most other NLQA tasks that focus on linguistic understanding, OpenBookQA requires deeper reasoning involving linguistic understanding as well as reasoning with common knowledge. In this paper we address QA with respect to the OpenBookQA dataset and combine state of the art language models with abductive information retrieval (IR), information gain based re-ranking, passage selection and weighted scoring to achieve 72.0% accuracy, an 11.6% improvement over the current state of the art.
Tasks Information Retrieval, Question Answering
Published 2019-07-24
URL https://arxiv.org/abs/1907.10738v1
PDF https://arxiv.org/pdf/1907.10738v1.pdf
PWC https://paperswithcode.com/paper/careful-selection-of-knowledge-to-solve-open
Repo
Framework

Robust Lane Marking Detection Algorithm Using Drivable Area Segmentation and Extended SLT

Title Robust Lane Marking Detection Algorithm Using Drivable Area Segmentation and Extended SLT
Authors Umar Ozgunalp, Rui Fan, Shanshan Cheng, Yuxiang Sun, Weixun Zuo, Yilong Zhu, Bohuan Xue, Linwei Zheng, Qing Liang, Ming Liu
Abstract In this paper, a robust lane detection algorithm is proposed, where the vertical profile of the road is estimated using dynamic programming from the v-disparity map and, based on the estimated profile, the road area is segmented. Since the lane markings are on the road area and any feature point above the ground will be a noise source for the lane detection, a mask is created for the road area to remove some of the noise for lane detection. The estimated mask is multiplied by the lane feature map in a bird’s eye view (BEV). The lane feature points are extracted by using an extended version of symmetrical local threshold (SLT), which not only considers the dark-light-dark (DLD) transition of the lane markings, like SLT, but also considers parallelism on the lane marking borders. The segmentation then uses only the feature points that are on the road area. A maximum of two linear lane markings are detected using an efficient 1D Hough transform. Then, the detected linear lane markings are used to create a region of interest (ROI) for parabolic lane detection. Finally, based on the estimated region of interest, parabolic lane models are fitted using robust fitting. Due to the robust lane feature extraction and road area segmentation, the proposed algorithm robustly detects lane markings and achieves lane marking detection with an accuracy of 91% when tested on a sequence from the KITTI dataset.
Tasks Lane Detection
Published 2019-11-20
URL https://arxiv.org/abs/1911.09054v1
PDF https://arxiv.org/pdf/1911.09054v1.pdf
PWC https://paperswithcode.com/paper/robust-lane-marking-detection-algorithm-using
Repo
Framework
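
The dark-light-dark (DLD) test mentioned in the abstract above can be illustrated on a single image row. This is a simplified sketch of a symmetric local threshold, not the extended version the paper proposes: a pixel is kept only if it is brighter than the mean of its left neighbours and the mean of its right neighbours by more than a threshold. The `window` and `threshold` values are assumptions chosen for the example.

```python
def dld_features(row, window=3, threshold=20):
    """Indices in one grayscale row that pass a simple dark-light-dark test.

    A pixel qualifies when it exceeds both the mean of the `window` pixels
    to its left and the mean of the `window` pixels to its right by more
    than `threshold`. Pixels too close to the row borders are skipped.
    """
    features = []
    for x in range(window, len(row) - window):
        left_mean = sum(row[x - window:x]) / window
        right_mean = sum(row[x + 1:x + 1 + window]) / window
        if row[x] - left_mean > threshold and row[x] - right_mean > threshold:
            features.append(x)
    return features
```

Requiring both sides to be darker is what rejects plain intensity steps, such as the edge of a large bright region, which pass a one-sided gradient test but are not lane markings.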