January 28, 2020

Paper Group ANR 802

Kernel Optimal Orthogonality Weighting: A Balancing Approach to Estimating Effects of Continuous Treatments

Title Kernel Optimal Orthogonality Weighting: A Balancing Approach to Estimating Effects of Continuous Treatments
Authors Nathan Kallus, Michele Santacatterina
Abstract Many scientific questions require estimating the effects of continuous treatments. Outcome modeling and weighted regression based on the generalized propensity score are the most commonly used methods to evaluate continuous effects. However, these techniques may be sensitive to model misspecification, extreme weights or both. In this paper, we propose Kernel Optimal Orthogonality Weighting (KOOW), a convex optimization-based method, for estimating the effects of continuous treatments. KOOW finds weights that minimize the worst-case penalized functional covariance between the continuous treatment and the confounders. By minimizing this quantity, KOOW successfully provides weights that orthogonalize confounders and the continuous treatment, thus providing optimal covariate balance, while controlling for extreme weights. We evaluate its comparative performance in a simulation study. Using data from the Women’s Health Initiative observational study, we apply KOOW to evaluate the effect of red meat consumption on blood pressure.
Tasks
Published 2019-10-26
URL https://arxiv.org/abs/1910.11972v1
PDF https://arxiv.org/pdf/1910.11972v1.pdf
PWC https://paperswithcode.com/paper/kernel-optimal-orthogonality-weighting-a
Repo
Framework

Learning Shared Encoding Representation for End-to-End Speech Recognition Models

Title Learning Shared Encoding Representation for End-to-End Speech Recognition Models
Authors Thai-Son Nguyen, Sebastian Stueker, Alex Waibel
Abstract In this work, we learn a shared encoding representation for a multi-task neural network model optimized with connectionist temporal classification (CTC) and conventional framewise cross-entropy training criteria. Our experiments show that the multi-task training not only tackles the complexity of optimizing CTC models such as acoustic-to-word but also results in a significant improvement over plain-task training with an optimal setup. Furthermore, we propose to use the encoding representation learned by the multi-task network to initialize the encoder of attention-based models. In this way, we train a deep attention-based end-to-end model with a 10-layer long short-term memory (LSTM) encoder, which achieves word error rates of 12.2% and 22.6% on the Switchboard and CallHome subsets of the Hub5 2000 evaluation.
Tasks Deep Attention, End-To-End Speech Recognition, Speech Recognition
Published 2019-03-31
URL http://arxiv.org/abs/1904.02147v1
PDF http://arxiv.org/pdf/1904.02147v1.pdf
PWC https://paperswithcode.com/paper/learning-shared-encoding-representation-for
Repo
Framework

Fixed-price Diffusion Mechanism Design

Title Fixed-price Diffusion Mechanism Design
Authors Tianyi Zhang, Dengji Zhao, Wen Zhang, Xuming He
Abstract We consider a fixed-price mechanism design setting where a seller sells one item via a social network, but the seller can only directly communicate with her neighbours initially. Each other node in the network is a potential buyer with a valuation derived from a common distribution. With a standard fixed-price mechanism, the seller can only sell the item among her neighbours. To improve her revenue, she needs more buyers to join in the sale. To achieve this, we propose the very first fixed-price mechanism to incentivize the seller’s neighbours to inform their neighbours about the sale and to eventually inform all buyers in the network to improve the seller’s revenue. Compared with the existing mechanisms for the same purpose, our mechanism does not require the buyers to reveal their valuations and it is computationally easy. More importantly, it guarantees that the improved revenue is at least 1/2 of the optimal.
Tasks
Published 2019-05-14
URL https://arxiv.org/abs/1905.05450v1
PDF https://arxiv.org/pdf/1905.05450v1.pdf
PWC https://paperswithcode.com/paper/fixed-price-diffusion-mechanism-design
Repo
Framework

Generating a Common Question from Multiple Documents using Multi-source Encoder-Decoder Models

Title Generating a Common Question from Multiple Documents using Multi-source Encoder-Decoder Models
Authors Woon Sang Cho, Yizhe Zhang, Sudha Rao, Chris Brockett, Sungjin Lee
Abstract Ambiguous user queries in search engines result in the retrieval of documents that often span multiple topics. One potential solution is for the search engine to generate multiple refined queries, each of which relates to a subset of the documents spanning the same topic. A preliminary step towards this goal is to generate a question that captures common concepts of multiple documents. We propose a new task of generating a common question from multiple documents and present a simple variant of an existing multi-source encoder-decoder framework, called the Multi-Source Question Generator (MSQG). We first train an RNN-based single encoder-decoder generator from (single document, question) pairs. At test time, given multiple documents, the ‘Distribute’ step of our MSQG model predicts target word distributions for each document using the trained model. The ‘Aggregate’ step aggregates these distributions to generate a common question. This simple yet effective strategy significantly outperforms several existing baseline models applied to the new task when evaluated using automated metrics and human judgments on the MS-MARCO-QA dataset.
Tasks
Published 2019-10-25
URL https://arxiv.org/abs/1910.11483v1
PDF https://arxiv.org/pdf/1910.11483v1.pdf
PWC https://paperswithcode.com/paper/generating-a-common-question-from-multiple
Repo
Framework
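
The 'Distribute' and 'Aggregate' steps described in the abstract above can be illustrated with a minimal sketch. It assumes that each document has already produced a next-word distribution (a plain dict from word to probability) and that aggregation is a simple average; the abstract does not say which aggregation rule MSQG uses, so the averaging, the function names and the toy vocabulary are all assumptions made for illustration.

```python
def aggregate_distributions(per_doc_dists):
    """Average per-document next-word distributions into one distribution.

    per_doc_dists: list of dicts mapping word -> probability, one per document.
    Averaging is an assumed aggregation rule, not necessarily the paper's.
    """
    vocab = set()
    for dist in per_doc_dists:
        vocab.update(dist)
    n = len(per_doc_dists)
    # Words missing from a document's distribution count as probability 0.
    return {w: sum(d.get(w, 0.0) for d in per_doc_dists) / n for w in vocab}


def pick_word(dist):
    """Greedy choice: the most probable word (ties broken alphabetically)."""
    return max(sorted(dist), key=dist.get)


# 'Distribute': hypothetical outputs of one trained generator on two documents.
doc_dists = [
    {"what": 0.6, "how": 0.4},
    {"what": 0.5, "why": 0.5},
]
# 'Aggregate': combine them and pick the next word of the common question.
combined = aggregate_distributions(doc_dists)
next_word = pick_word(combined)
```

In a full decoder this step would be repeated once per output position, feeding the chosen word back to every per-document decoder.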

Improving noise robustness of automatic speech recognition via parallel data and teacher-student learning

Title Improving noise robustness of automatic speech recognition via parallel data and teacher-student learning
Authors Ladislav Mošner, Minhua Wu, Anirudh Raju, Sree Hari Krishnan Parthasarathi, Kenichi Kumatani, Shiva Sundaram, Roland Maas, Björn Hoffmeister
Abstract For real-world speech recognition applications, noise robustness is still a challenge. In this work, we adopt the teacher-student (T/S) learning technique using a parallel clean and noisy corpus for improving automatic speech recognition (ASR) performance under multimedia noise. On top of that, we apply a logits selection method which only preserves the k highest values to prevent wrong emphasis of knowledge from the teacher and to reduce bandwidth needed for transferring data. We incorporate up to 8000 hours of untranscribed data for training and present our results on sequence-trained models apart from cross-entropy-trained ones. The best sequence-trained student model yields relative word error rate (WER) reductions of approximately 10.1%, 28.7% and 19.6% on our clean, simulated noisy and real test sets respectively, compared to a sequence-trained teacher.
Tasks Speech Recognition
Published 2019-01-05
URL http://arxiv.org/abs/1901.02348v3
PDF http://arxiv.org/pdf/1901.02348v3.pdf
PWC https://paperswithcode.com/paper/improving-noise-robustness-of-automatic
Repo
Framework
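
The logits selection idea in the abstract above (keep only the k highest teacher values) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: renormalizing the kept logits with a softmax and using cross-entropy as the student loss are assumptions, and all function names are hypothetical.

```python
import math


def top_k_soft_targets(teacher_logits, k):
    """Keep the k highest teacher logits and renormalize them with a softmax.

    Returns a dict {class_index: probability}; all other classes are dropped,
    so only k (index, value) pairs need to be stored or transferred.
    """
    order = sorted(range(len(teacher_logits)),
                   key=lambda i: teacher_logits[i], reverse=True)
    top = order[:k]
    m = max(teacher_logits[i] for i in top)  # subtract max for stability
    exps = {i: math.exp(teacher_logits[i] - m) for i in top}
    z = sum(exps.values())
    return {i: e / z for i, e in exps.items()}


def student_loss(student_logits, targets):
    """Cross-entropy between the sparse teacher targets and the student softmax."""
    m = max(student_logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in student_logits))
    return -sum(p * (student_logits[i] - log_z) for i, p in targets.items())


# Teacher sees the clean utterance, student sees the parallel noisy one.
teacher_logits = [2.0, 1.0, 0.1, -1.0]   # hypothetical values
student_logits = [1.5, 1.2, 0.3, -0.5]   # hypothetical values
targets = top_k_soft_targets(teacher_logits, k=2)
loss = student_loss(student_logits, targets)
```

Because the targets are sparse, the bandwidth per frame scales with k rather than with the number of output classes, which matches the motivation given in the abstract.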

On Regularization Properties of Artificial Datasets for Deep Learning

Title On Regularization Properties of Artificial Datasets for Deep Learning
Authors Karol Antczak
Abstract The paper discusses regularization properties of artificial data for deep learning. Artificial datasets make it possible to train neural networks when real data is scarce. It is demonstrated that the artificial data generation process, described as injecting noise into high-level features, bears several similarities to existing regularization methods for deep neural networks. One can treat this property of artificial data as a kind of “deep” regularization. It is thus possible to regularize hidden layers of the network by generating the training data in a certain way.
Tasks
Published 2019-08-19
URL https://arxiv.org/abs/1908.07005v1
PDF https://arxiv.org/pdf/1908.07005v1.pdf
PWC https://paperswithcode.com/paper/on-regularization-properties-of-artificial
Repo
Framework

Provable Filter Pruning for Efficient Neural Networks

Title Provable Filter Pruning for Efficient Neural Networks
Authors Lucas Liebenwein, Cenk Baykal, Harry Lang, Dan Feldman, Daniela Rus
Abstract We present a provable, sampling-based approach for generating compact Convolutional Neural Networks (CNNs) by identifying and removing redundant filters from an over-parameterized network. Our algorithm uses a small batch of input data points to assign a saliency score to each filter and constructs an importance sampling distribution where filters that highly affect the output are sampled with correspondingly high probability. In contrast to existing filter pruning approaches, our method is simultaneously data-informed, exhibits provable guarantees on the size and performance of the pruned network, and is widely applicable to varying network architectures and data sets. Our analytical bounds bridge the notions of compressibility and importance of network structures, which gives rise to a fully-automated procedure for identifying and preserving filters in layers that are essential to the network’s performance. Our experimental evaluations on popular architectures and data sets show that our algorithm consistently generates sparser and more efficient models than those constructed by existing filter pruning approaches.
Tasks
Published 2019-11-18
URL https://arxiv.org/abs/1911.07412v2
PDF https://arxiv.org/pdf/1911.07412v2.pdf
PWC https://paperswithcode.com/paper/provable-filter-pruning-for-efficient-neural-1
Repo
Framework
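
The sampling step in the abstract above (filters that strongly affect the output are sampled with correspondingly high probability) can be sketched in a few lines. This sketch takes the per-filter saliency scores as given and samples with replacement in proportion to them; how the scores are computed, how many samples are drawn, and the paper's guarantees are not reproduced here, and the function names are hypothetical.

```python
import random


def sampling_distribution(saliency):
    """Turn non-negative saliency scores into a probability distribution."""
    total = sum(saliency)
    return [s / total for s in saliency]


def sample_filters(saliency, num_samples, seed=0):
    """Draw filter indices with probability proportional to saliency.

    Sampling is with replacement; the distinct indices drawn are the filters
    to keep, and every filter that was never drawn is pruned.
    """
    rng = random.Random(seed)
    probs = sampling_distribution(saliency)
    drawn = rng.choices(range(len(saliency)), weights=probs, k=num_samples)
    return sorted(set(drawn))


# Hypothetical saliency scores for three filters in one layer.
scores = [5.0, 0.0, 1.0]
probs = sampling_distribution(scores)
kept = sample_filters(scores, num_samples=20)
```

A filter with zero saliency has zero sampling probability and is therefore always pruned in this sketch, while high-saliency filters are kept with high probability.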

Geometric fluid approximation for general continuous-time Markov chains

Title Geometric fluid approximation for general continuous-time Markov chains
Authors Michalis Michaelides, Jane Hillston, Guido Sanguinetti
Abstract Fluid approximations have seen great success in approximating the macro-scale behaviour of Markov systems with a large number of discrete states. However, these methods rely on the continuous-time Markov chain (CTMC) having a particular population structure which suggests a natural continuous state-space endowed with a dynamics for the approximating process. We construct here a general method based on spectral analysis of the transition matrix of the CTMC, without the need for a population structure. Specifically, we use the popular manifold learning method of diffusion maps to analyse the transition matrix as the operator of a hidden continuous process. An embedding of states in a continuous space is recovered, and the space is endowed with a drift vector field inferred via Gaussian process regression. In this manner, we construct an ODE whose solution approximates the evolution of the CTMC mean, mapped onto the continuous space (known as the fluid limit).
Tasks
Published 2019-01-31
URL https://arxiv.org/abs/1901.11417v2
PDF https://arxiv.org/pdf/1901.11417v2.pdf
PWC https://paperswithcode.com/paper/geometric-fluid-approximation-for-general
Repo
Framework

Visualizing Uncertainty and Saliency Maps of Deep Convolutional Neural Networks for Medical Imaging Applications

Title Visualizing Uncertainty and Saliency Maps of Deep Convolutional Neural Networks for Medical Imaging Applications
Authors Jae Duk Seo
Abstract Deep learning models are now used in many different industries. While safety is not a critical issue in certain domains, in the medical field it is a huge concern. Not only do we want the models to generalize well, but we also want to know the model’s confidence with respect to its decision and which features matter the most. Our team aims to develop a full pipeline that displays not only the uncertainty of the model’s decision but also the saliency map showing which sets of pixels of the input image contribute most to the predictions.
Tasks
Published 2019-07-05
URL https://arxiv.org/abs/1907.02940v1
PDF https://arxiv.org/pdf/1907.02940v1.pdf
PWC https://paperswithcode.com/paper/visualizing-uncertainty-and-saliency-maps-of
Repo
Framework

Stack-VS: Stacked Visual-Semantic Attention for Image Caption Generation

Title Stack-VS: Stacked Visual-Semantic Attention for Image Caption Generation
Authors Wei Wei, Ling Cheng, Xianling Mao, Guangyou Zhou, Feida Zhu
Abstract Recently, automatic image caption generation has been an important focus of work on multimodal translation tasks. Existing approaches can be roughly categorized into two classes, top-down and bottom-up: the former transfers the image information (called the visual-level feature) directly into a caption, and the latter uses extracted words (called semantic-level attributes) to generate a description. However, previous methods are typically either based on a one-stage decoder or utilize only part of the visual-level or semantic-level information for image caption generation. In this paper, we address the problem and propose an innovative multi-stage architecture (called Stack-VS) for rich fine-grained image caption generation, combining bottom-up and top-down attention models to effectively handle both visual-level and semantic-level information of an input image. Specifically, we also propose a novel well-designed stack decoder model, constituted by a sequence of decoder cells, each of which contains two LSTM layers that work interactively to re-optimize attention weights on both visual-level feature vectors and semantic-level attribute embeddings for generating a fine-grained image caption. Extensive experiments on the popular benchmark dataset MSCOCO show significant improvements on different evaluation metrics, i.e., the improvements on BLEU-4/CIDEr/SPICE scores are 0.372, 1.226 and 0.216, respectively, as compared to the state of the art.
Tasks
Published 2019-09-05
URL https://arxiv.org/abs/1909.02489v1
PDF https://arxiv.org/pdf/1909.02489v1.pdf
PWC https://paperswithcode.com/paper/stack-vs-stacked-visual-semantic-attention
Repo
Framework

Optimizing Pipelined Computation and Communication for Latency-Constrained Edge Learning

Title Optimizing Pipelined Computation and Communication for Latency-Constrained Edge Learning
Authors Nicolas Skatchkovsky, Osvaldo Simeone
Abstract Consider a device that is connected to an edge processor via a communication channel. The device holds local data that is to be offloaded to the edge processor so as to train a machine learning model, e.g., for regression or classification. Transmission of the data to the learning processor, as well as training based on Stochastic Gradient Descent (SGD), must both be completed within a time limit. Assuming that communication and computation can be pipelined, this letter investigates the optimal choice for the packet payload size, given the overhead of each data packet transmission and the ratio between the computation and the communication rates. This amounts to a tradeoff between bias and variance, since communicating the entire data set first reduces the bias of the training process but it may not leave sufficient time for learning. Analytical bounds on the expected optimality gap are derived so as to enable an effective optimization, which is validated in numerical results.
Tasks
Published 2019-06-11
URL https://arxiv.org/abs/1906.04488v2
PDF https://arxiv.org/pdf/1906.04488v2.pdf
PWC https://paperswithcode.com/paper/optimizing-pipelined-computation-and
Repo
Framework
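
The payload-size tradeoff in the abstract above can be made concrete with a toy timeline. The sketch below is not the paper's analysis: it assumes a fixed overhead per packet, a constant transmission time per sample, a constant time per SGD update, and that training starts when the first packet arrives and runs until the deadline. All parameter names and numbers are hypothetical.

```python
def pipeline_schedule(num_samples, payload, overhead,
                      time_per_sample, time_per_update, deadline):
    """Return (samples_received, sgd_updates) achievable before the deadline.

    Each packet carries `payload` samples and costs
    overhead + payload * time_per_sample time units to transmit.
    """
    packet_time = overhead + payload * time_per_sample
    packets_needed = -(-num_samples // payload)  # ceiling division
    packets_sent = min(packets_needed, int(deadline // packet_time))
    samples_received = min(num_samples, packets_sent * payload)
    if packets_sent == 0:
        return 0, 0  # nothing arrives in time, so no training is possible
    # Assumed pipelining: SGD runs from the first arrival until the deadline.
    updates = int((deadline - packet_time) // time_per_update)
    return samples_received, updates


# Small packets start training early but waste time on overhead;
# large packets deliver data efficiently but delay the first update.
small = pipeline_schedule(100, 10, 1.0, 0.5, 0.5, 40.0)
large = pipeline_schedule(100, 50, 1.0, 0.5, 0.5, 40.0)
```

Under these assumed numbers the small payload delivers more samples and allows more updates, but shrinking the payload further would eventually lose samples to overhead, which is the tension the letter optimizes over.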

Events-to-Video: Bringing Modern Computer Vision to Event Cameras

Title Events-to-Video: Bringing Modern Computer Vision to Event Cameras
Authors Henri Rebecq, René Ranftl, Vladlen Koltun, Davide Scaramuzza
Abstract Event cameras are novel sensors that report brightness changes in the form of asynchronous “events” instead of intensity frames. They have significant advantages over conventional cameras: high temporal resolution, high dynamic range, and no motion blur. Since the output of event cameras is fundamentally different from conventional cameras, it is commonly accepted that they require the development of specialized algorithms to accommodate the particular nature of events. In this work, we take a different view and propose to apply existing, mature computer vision techniques to videos reconstructed from event data. We propose a novel recurrent network to reconstruct videos from a stream of events, and train it on a large amount of simulated event data. Our experiments show that our approach surpasses state-of-the-art reconstruction methods by a large margin (> 20%) in terms of image quality. We further apply off-the-shelf computer vision algorithms to videos reconstructed from event data on tasks such as object classification and visual-inertial odometry, and show that this strategy consistently outperforms algorithms that were specifically designed for event data. We believe that our approach opens the door to bringing the outstanding properties of event cameras to an entirely new range of tasks. A video of the experiments is available at https://youtu.be/IdYrC4cUO0I
Tasks Object Classification
Published 2019-04-17
URL http://arxiv.org/abs/1904.08298v1
PDF http://arxiv.org/pdf/1904.08298v1.pdf
PWC https://paperswithcode.com/paper/190408298
Repo
Framework

Formality Style Transfer with Hybrid Textual Annotations

Title Formality Style Transfer with Hybrid Textual Annotations
Authors Ruochen Xu, Tao Ge, Furu Wei
Abstract Formality style transformation is the task of modifying the formality of a given sentence without changing its content. Its challenge is the lack of large-scale sentence-aligned parallel data. In this paper, we propose an omnivorous model that takes parallel data and formality-classified data jointly to alleviate the data sparsity issue. We empirically demonstrate the effectiveness of our approach by achieving state-of-the-art performance on a recently proposed benchmark dataset of formality transfer. Furthermore, our model can be readily adapted to other unsupervised text style transfer tasks like unsupervised sentiment transfer and achieves competitive results on three widely recognized benchmarks.
Tasks Style Transfer, Text Style Transfer
Published 2019-03-15
URL http://arxiv.org/abs/1903.06353v1
PDF http://arxiv.org/pdf/1903.06353v1.pdf
PWC https://paperswithcode.com/paper/formality-style-transfer-with-hybrid-textual
Repo
Framework

2D Car Detection in Radar Data with PointNets

Title 2D Car Detection in Radar Data with PointNets
Authors Andreas Danzer, Thomas Griebel, Martin Bach, Klaus Dietmayer
Abstract For many automated driving functions, a highly accurate perception of the vehicle environment is a crucial prerequisite. Modern high-resolution radar sensors generate multiple radar targets per object, which makes these sensors particularly suitable for the 2D object detection task. This work presents an approach to detect 2D objects solely depending on sparse radar data using PointNets. The literature so far presents only methods that perform either object classification or bounding box estimation for objects. In contrast, this method facilitates a classification together with a bounding box estimation of objects using a single radar sensor. To this end, PointNets are adjusted for radar data performing 2D object classification with segmentation, and 2D bounding box regression in order to estimate an amodal 2D bounding box. The algorithm is evaluated using an automatically created dataset which consists of various realistic driving maneuvers. The results show the great potential of object detection in high-resolution radar data using PointNets.
Tasks Object Classification, Object Detection, Object Detection in High Resolution
Published 2019-04-17
URL https://arxiv.org/abs/1904.08414v3
PDF https://arxiv.org/pdf/1904.08414v3.pdf
PWC https://paperswithcode.com/paper/2d-car-detection-in-radar-data-with-pointnets
Repo
Framework

Divide, Conquer, and Combine: a New Inference Strategy for Probabilistic Programs with Stochastic Support

Title Divide, Conquer, and Combine: a New Inference Strategy for Probabilistic Programs with Stochastic Support
Authors Yuan Zhou, Hongseok Yang, Yee Whye Teh, Tom Rainforth
Abstract Universal probabilistic programming systems (PPSs) provide a powerful framework for specifying rich and complex probabilistic models. They further attempt to automate the process of drawing inferences from these models, but doing this successfully is severely hampered by the wide range of non-standard models they can express. As a result, although one can specify complex models in a universal PPS, the provided inference engines often fall far short of what is required. In particular, we show they produce surprisingly unsatisfactory performance for models where the support may vary between executions, often doing no better than importance sampling from the prior. To address this, we introduce a new inference framework: Divide, Conquer, and Combine, which remains efficient for such models, and show how it can be implemented as an automated and general-purpose PPS inference engine. We empirically demonstrate substantial performance improvements over existing approaches on two examples.
Tasks Probabilistic Programming
Published 2019-10-29
URL https://arxiv.org/abs/1910.13324v2
PDF https://arxiv.org/pdf/1910.13324v2.pdf
PWC https://paperswithcode.com/paper/191013324
Repo
Framework