February 1, 2020

3172 words 15 mins read

Paper Group AWR 158



A Systematic Comparison of English Noun Compound Representations

Title A Systematic Comparison of English Noun Compound Representations
Authors Vered Shwartz
Abstract Building meaningful representations of noun compounds is not trivial since many of them scarcely appear in the corpus. To that end, composition functions approximate the distributional representation of a noun compound by combining its constituent distributional vectors. In the more general case, phrase embeddings have been trained by minimizing the distance between the vectors representing paraphrases. We compare various types of noun compound representations, including distributional, compositional, and paraphrase-based representations, through a series of tasks and analyses, and with an extensive number of underlying word embeddings. We find that indeed, in most cases, composition functions produce higher quality representations than distributional ones, and they improve with computational power. No single function performs best in all scenarios, suggesting that a joint training objective may produce improved representations.
Tasks Word Embeddings
Published 2019-06-11
URL https://arxiv.org/abs/1906.04772v1
PDF https://arxiv.org/pdf/1906.04772v1.pdf
PWC https://paperswithcode.com/paper/a-systematic-comparison-of-english-noun
Repo https://github.com/vered1986/NC_Embeddings
Framework none
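To make the idea of composition functions concrete, here is a minimal numpy sketch of two of the simpler compositions the paper compares: element-wise addition and a trained "full-additive" model that learns matrices A and B so that A·u + B·v approximates the compound's observed distributional vector. The embeddings, the toy compound, and the single-objective training loop are illustrative stand-ins, not the code from the linked repository.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50

# Hypothetical pretrained constituent embeddings for "olive oil".
w_olive = rng.normal(size=dim)
w_oil = rng.normal(size=dim)

def compose_add(u, v):
    """Simplest composition: element-wise average of the constituents."""
    return 0.5 * (u + v)

class FullAdd:
    """Full-additive composition: comp(u, v) = A @ u + B @ v.

    A and B would normally be trained to bring comp(u, v) close to the
    observed (distributional) vector of the compound; here we only take
    gradient steps on the squared distance for a single toy compound.
    """
    def __init__(self, dim):
        self.A = np.eye(dim)
        self.B = np.eye(dim)

    def __call__(self, u, v):
        return self.A @ u + self.B @ v

    def step(self, u, v, target, lr=0.01):
        diff = self(u, v) - target          # gradient of 0.5*||comp - target||^2
        self.A -= lr * np.outer(diff, u)
        self.B -= lr * np.outer(diff, v)

# Toy target: an observed distributional vector for the compound.
observed = rng.normal(size=dim)
model = FullAdd(dim)
for _ in range(100):
    model.step(w_olive, w_oil, observed)

print(np.linalg.norm(compose_add(w_olive, w_oil) - observed))  # large
print(np.linalg.norm(model(w_olive, w_oil) - observed))        # near zero
```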

Star-convex Polyhedra for 3D Object Detection and Segmentation in Microscopy

Title Star-convex Polyhedra for 3D Object Detection and Segmentation in Microscopy
Authors Martin Weigert, Uwe Schmidt, Robert Haase, Ko Sugawara, Gene Myers
Abstract Accurate detection and segmentation of cell nuclei in volumetric (3D) fluorescence microscopy datasets is an important step in many biomedical research projects. Although many automated methods for these tasks exist, they often struggle for images with low signal-to-noise ratios and/or dense packing of nuclei. It was recently shown for 2D microscopy images that these issues can be alleviated by training a neural network to directly predict a suitable shape representation (star-convex polygon) for cell nuclei. In this paper, we adopt and extend this approach to 3D volumes by using star-convex polyhedra to represent cell nuclei and similar shapes. To that end, we overcome the challenges of 1) finding parameter-efficient star-convex polyhedra representations that can faithfully describe cell nuclei shapes, 2) adapting to anisotropic voxel sizes often found in fluorescence microscopy datasets, and 3) efficiently computing intersections between pairs of star-convex polyhedra (required for non-maximum suppression). Although our approach is quite general, since star-convex polyhedra subsume common shapes like bounding boxes and spheres as special cases, our focus is on accurate detection and segmentation of cell nuclei. To that end, we demonstrate on two challenging datasets that our approach (StarDist-3D) leads to superior results when compared to classical and deep-learning based methods.
Tasks 3D Object Detection, Object Detection
Published 2019-08-09
URL https://arxiv.org/abs/1908.03636v1
PDF https://arxiv.org/pdf/1908.03636v1.pdf
PWC https://paperswithcode.com/paper/star-convex-polyhedra-for-3d-object-detection
Repo https://github.com/mpicbg-csbd/stardist
Framework tf
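The core of the star-convex representation is easy to state: for a pixel (or voxel) inside an object, store the distance to the object boundary along a fixed set of ray directions. The sketch below computes these distances for a 2D mask; StarDist-3D does the analogous thing with rays on the unit sphere, plus the anisotropy handling and efficient polyhedron intersection described in the abstract, which are omitted here.

```python
import numpy as np

def star_distances(mask, center, n_rays=16, step=0.5):
    """Distances from `center` to the object boundary along fixed ray
    directions -- the star-convex shape representation (2D for brevity;
    the 3D version uses rays on the unit sphere instead of the unit circle).
    `mask` is a boolean array, `center` a (row, col) point inside the object.
    """
    h, w = mask.shape
    angles = np.linspace(0.0, 2.0 * np.pi, n_rays, endpoint=False)
    dists = np.zeros(n_rays)
    for k, a in enumerate(angles):
        dr, dc = np.sin(a), np.cos(a)
        r = 0.0
        while True:
            rr = int(round(center[0] + r * dr))
            cc = int(round(center[1] + r * dc))
            if rr < 0 or rr >= h or cc < 0 or cc >= w or not mask[rr, cc]:
                break
            r += step
        dists[k] = r
    return dists

# Toy example: a filled disc of radius 10 -> all ray distances are ~10.
yy, xx = np.mgrid[0:64, 0:64]
disc = (yy - 32) ** 2 + (xx - 32) ** 2 <= 10 ** 2
print(star_distances(disc, (32, 32)).round(1))
```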

Deep Learning architectures for generalized immunofluorescence based nuclear image segmentation

Title Deep Learning architectures for generalized immunofluorescence based nuclear image segmentation
Authors Florian Kromp, Lukas Fischer, Eva Bozsaky, Inge Ambros, Wolfgang Doerr, Sabine Taschner-Mandl, Peter Ambros, Allan Hanbury
Abstract Separating and labeling each instance of a nucleus (instance-aware segmentation) is the key challenge in segmenting single cell nuclei on fluorescence microscopy images. Deep Neural Networks can learn the implicit transformation of a nuclear image into a probability map indicating the class membership of each pixel (nucleus or background), but the use of post-processing steps to turn the probability map into a labeled object mask is error-prone. This is especially true for nuclear images of tissue sections and nuclear images across varying tissue preparations. In this work, we aim to evaluate the performance of state-of-the-art deep learning architectures to segment nuclei in fluorescence images of various tissue origins and sample preparation types without post-processing. We compare architectures that operate on pixel-to-pixel translation and an architecture that operates on object detection and subsequent locally applied segmentation. In addition, we propose a novel strategy to create artificial images to extend the training set. We evaluate the influence of ground truth annotation quality, image scale and segmentation complexity on segmentation performance. Results show that three out of four deep learning architectures (U-Net, U-Net with ResNet34 backbone, Mask R-CNN) can segment fluorescent nuclear images on most of the sample preparation types and tissue origins with satisfactory segmentation performance. Mask R-CNN, an architecture designed to address instance-aware segmentation tasks, outperforms other architectures. Equal nuclear mean size, consistent nuclear annotations and the use of artificially generated images result in overall acceptable precision and recall across different tissues and sample preparation types.
Tasks Object Detection, Semantic Segmentation
Published 2019-07-30
URL https://arxiv.org/abs/1907.12975v1
PDF https://arxiv.org/pdf/1907.12975v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-architectures-for-generalized
Repo https://github.com/perlfloccri/NuclearSegmentationPipeline
Framework tf
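The abstract's point about error-prone post-processing is easy to demonstrate: thresholding a probability map and taking connected components merges touching nuclei into a single instance, which is exactly what instance-aware architectures avoid. A small sketch with a toy probability map (not the paper's data or pipeline):

```python
import numpy as np
from scipy import ndimage

def naive_instances(prob_map, threshold=0.5):
    """Naive post-processing: threshold the pixel-wise probability map and
    take connected components as instances.  Touching nuclei end up in a
    single component -- the failure mode discussed in the abstract.
    """
    binary = prob_map >= threshold
    labels, n = ndimage.label(binary)
    return labels, n

# Toy probability map with two blobs that touch each other.
prob = np.zeros((32, 32))
prob[8:16, 8:16] = 0.9      # nucleus A
prob[15:23, 14:22] = 0.9    # nucleus B, sharing a row with A
labels, n = naive_instances(prob)
print("instances found:", n)   # 1, although two nuclei were drawn
```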

Local SGD with Periodic Averaging: Tighter Analysis and Adaptive Synchronization

Title Local SGD with Periodic Averaging: Tighter Analysis and Adaptive Synchronization
Authors Farzin Haddadpour, Mohammad Mahdi Kamani, Mehrdad Mahdavi, Viveck R. Cadambe
Abstract Communication overhead is one of the key challenges that hinders the scalability of distributed optimization algorithms. In this paper, we study local distributed SGD, where data is partitioned among computation nodes, and the computation nodes perform local updates, periodically exchanging the model among the workers for averaging. While local SGD is empirically shown to provide promising results, a theoretical understanding of its performance remains open. We strengthen convergence analysis for local SGD, and show that local SGD can be far less expensive and applied far more generally than current theory suggests. Specifically, we show that for loss functions that satisfy the Polyak-{\L}ojasiewicz condition, $O((pT)^{1/3})$ rounds of communication suffice to achieve a linear speed up, that is, an error of $O(1/pT)$, where $T$ is the total number of model updates at each worker. This is in contrast with previous work, which required a higher number of communication rounds and was limited to strongly convex loss functions to achieve similar asymptotic performance. We also develop an adaptive synchronization scheme that provides a general condition for linear speed up. Finally, we validate the theory with experimental results, running over AWS EC2 clouds and an internal GPU cluster.
Tasks Distributed Optimization
Published 2019-10-30
URL https://arxiv.org/abs/1910.13598v1
PDF https://arxiv.org/pdf/1910.13598v1.pdf
PWC https://paperswithcode.com/paper/local-sgd-with-periodic-averaging-tighter
Repo https://github.com/mmkamani7/LUPA-SGD
Framework tf
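A minimal simulation of local SGD with periodic averaging, the algorithm analyzed in the paper: each worker takes several local stochastic gradient steps, then all models are averaged. The toy quadratic objective (which satisfies the PL condition) and the fixed averaging period are assumptions for illustration; the paper's adaptive scheme varies the period over time.

```python
import numpy as np

def local_sgd(grad, x0, workers=4, local_steps=10, rounds=20, lr=0.05, seed=0):
    """Local SGD sketch: each worker runs `local_steps` SGD updates on its
    own (noisy) gradients, then all workers average their models once per
    communication round.
    """
    rng = np.random.default_rng(seed)
    models = [x0.copy() for _ in range(workers)]
    for _ in range(rounds):
        for w in range(workers):
            for _ in range(local_steps):
                noise = rng.normal(scale=0.1, size=x0.shape)  # stochastic gradient noise
                models[w] -= lr * (grad(models[w]) + noise)
        avg = np.mean(models, axis=0)          # periodic model averaging
        models = [avg.copy() for _ in range(workers)]
    return models[0]

# Toy objective f(x) = 0.5 * ||x||^2, which satisfies the PL condition.
grad = lambda x: x
x_final = local_sgd(grad, x0=np.ones(5))
print(np.linalg.norm(x_final))   # close to 0
```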

Sampling from Stochastic Finite Automata with Applications to CTC Decoding

Title Sampling from Stochastic Finite Automata with Applications to CTC Decoding
Authors Martin Jansche, Alexander Gutkin
Abstract Stochastic finite automata arise naturally in many language and speech processing tasks. They include stochastic acceptors, which represent certain probability distributions over random strings. We consider the problem of efficient sampling: drawing random string variates from the probability distribution represented by stochastic automata and transformations of those. We show that path-sampling is effective and can be efficient if the epsilon-graph of a finite automaton is acyclic. We provide an algorithm that ensures this by conflating epsilon-cycles within strongly connected components. Sampling is also effective in the presence of non-injective transformations of strings. We illustrate this in the context of decoding for Connectionist Temporal Classification (CTC), where the predictive probabilities yield auxiliary sequences which are transformed into shorter labeling strings. We can sample efficiently from the transformed labeling distribution and use this in two different strategies for finding the most probable CTC labeling.
Tasks
Published 2019-05-21
URL https://arxiv.org/abs/1905.08760v1
PDF https://arxiv.org/pdf/1905.08760v1.pdf
PWC https://paperswithcode.com/paper/sampling-from-stochastic-finite-automata-with
Repo https://github.com/vadimkantorov/ctc
Framework pytorch
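The CTC side of the paper rests on the label-collapsing map (merge repeats, then drop blanks) and on sampling frame-wise paths from the network's output distribution. The brute-force sampler below illustrates the transformed labeling distribution on a toy three-frame example; the paper does this properly with stochastic automata and epsilon-cycle conflation rather than by enumerating sampled paths.

```python
import numpy as np

def collapse(path, blank=0):
    """CTC label transformation: merge repeated symbols, then drop blanks."""
    out, prev = [], None
    for s in path:
        if s != prev and s != blank:
            out.append(s)
        prev = s
    return tuple(out)

def sample_labelings(probs, n=10000, seed=0):
    """Sample frame-wise paths from the per-frame output distribution and
    push them through the collapse map -- a Monte Carlo view of the
    transformed labeling distribution."""
    rng = np.random.default_rng(seed)
    T, V = probs.shape
    counts = {}
    for _ in range(n):
        path = [rng.choice(V, p=probs[t]) for t in range(T)]
        lab = collapse(path)
        counts[lab] = counts.get(lab, 0) + 1
    return sorted(counts.items(), key=lambda kv: -kv[1])

# Toy posteriors over {blank, 'a', 'b'} for 3 frames (rows sum to 1).
probs = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.5, 0.2, 0.3]])
for lab, c in sample_labelings(probs)[:5]:
    print(lab, c / 10000)
```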

OverSketched Newton: Fast Convex Optimization for Serverless Systems

Title OverSketched Newton: Fast Convex Optimization for Serverless Systems
Authors Vipul Gupta, Swanand Kadhe, Thomas Courtade, Michael W. Mahoney, Kannan Ramchandran
Abstract Motivated by recent developments in serverless systems for large-scale computation as well as improvements in scalable randomized matrix algorithms, we develop OverSketched Newton, a randomized Hessian-based optimization algorithm to solve large-scale convex optimization problems in serverless systems. OverSketched Newton leverages matrix sketching ideas from Randomized Numerical Linear Algebra to compute the Hessian approximately. These sketching methods lead to inbuilt resiliency against stragglers that are a characteristic of serverless architectures. Depending on whether the problem is strongly convex or not, we propose different iteration updates using the approximate Hessian. For both cases, we establish convergence guarantees for OverSketched Newton and empirically validate our results by solving large-scale supervised learning problems on real-world datasets. Experiments demonstrate a reduction of ${\sim}50%$ in total running time on AWS Lambda, compared to state-of-the-art distributed optimization schemes.
Tasks Distributed Optimization
Published 2019-03-21
URL https://arxiv.org/abs/1903.08857v2
PDF https://arxiv.org/pdf/1903.08857v2.pdf
PWC https://paperswithcode.com/paper/oversketched-newton-fast-convex-optimization
Repo https://github.com/vvipgupta/OverSketchedNewton
Framework none
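To see the mathematical idea in miniature, here is a sketched-Hessian Newton step for l2-regularized logistic regression: the Gram-matrix form of the Hessian is approximated via a Gaussian sketch S with E[S^T S] = I. The serverless execution, coded OverSketch matrix multiplication, and straggler resilience that give the paper its name are not represented in this sketch.

```python
import numpy as np

def sketched_newton_logreg(X, y, reg=1e-3, sketch_rows=200, iters=20, seed=0):
    """Newton's method for l2-regularized logistic regression where the
    Hessian H = A^T A + reg*I (with A = sqrt(d) * X) is replaced by the
    sketched version (S A)^T (S A) + reg*I."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(iters):
        z = X @ w
        prob = 1.0 / (1.0 + np.exp(-z))
        grad = X.T @ (prob - y) / n + reg * w
        d = prob * (1 - prob) / n
        A = X * np.sqrt(d)[:, None]
        S = rng.normal(size=(sketch_rows, n)) / np.sqrt(sketch_rows)
        SA = S @ A                              # sketched matrix product
        H_hat = SA.T @ SA + reg * np.eye(p)
        w -= np.linalg.solve(H_hat, grad)
    return w

# Toy data.
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 10))
y = (X @ rng.normal(size=10) + 0.1 * rng.normal(size=2000) > 0).astype(float)
w = sketched_newton_logreg(X, y)
print(np.mean(((X @ w) > 0) == y))   # training accuracy, should be high
```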

Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations

Title Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations
Authors Ramaravind Kommiya Mothilal, Amit Sharma, Chenhao Tan
Abstract Post-hoc explanations of machine learning models are crucial for people to understand and act on algorithmic predictions. An intriguing class of explanations is through counterfactuals, hypothetical examples that show people how to obtain a different prediction. We posit that effective counterfactual explanations should satisfy two properties: feasibility of the counterfactual actions given user context and constraints, and diversity among the counterfactuals presented. To this end, we propose a framework for generating and evaluating a diverse set of counterfactual explanations based on determinantal point processes. To evaluate the actionability of counterfactuals, we provide metrics that enable comparison of counterfactual-based methods to other local explanation methods. We further address necessary tradeoffs and point to causal implications in optimizing for counterfactuals. Our experiments on four real-world datasets show that our framework can generate a set of counterfactuals that are diverse and well approximate local decision boundaries, outperforming prior approaches to generating diverse counterfactuals. We provide an implementation of the framework at https://github.com/microsoft/DiCE.
Tasks Point Processes
Published 2019-05-19
URL https://arxiv.org/abs/1905.07697v2
PDF https://arxiv.org/pdf/1905.07697v2.pdf
PWC https://paperswithcode.com/paper/explaining-machine-learning-classifiers
Repo https://github.com/microsoft/DiCE
Framework tf
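A rough illustration of the objective: counterfactuals should flip the prediction, stay close to the input, and be mutually diverse, with diversity scored by the determinant of a kernel matrix built from pairwise distances (the determinantal point process term; the paper uses K_ij = 1/(1 + dist(c_i, c_j))). The random-search sampler and the toy linear classifier below are assumptions for brevity; DiCE itself optimizes a differentiable version of this objective, so see the repository above for the real API.

```python
import numpy as np

def dpp_diversity(cfs):
    """Determinant of the kernel K_ij = 1 / (1 + dist(c_i, c_j))."""
    d = np.linalg.norm(cfs[:, None, :] - cfs[None, :, :], axis=-1)
    return np.linalg.det(1.0 / (1.0 + d))

def diverse_counterfactuals(predict, x, k=3, n_candidates=5000, seed=0):
    """Tiny random-search sketch: sample perturbations of x, keep those that
    flip the classifier's decision, then greedily pick k that trade off
    proximity to x against DPP diversity."""
    rng = np.random.default_rng(seed)
    cands = x + rng.normal(scale=1.0, size=(n_candidates, x.shape[0]))
    valid = cands[predict(cands) != predict(x[None, :])[0]]
    chosen = []
    for _ in range(min(k, len(valid))):
        best, best_score = None, -np.inf
        for c in valid:
            trial = np.array(chosen + [c])
            score = dpp_diversity(trial) - 0.5 * np.linalg.norm(c - x)
            if score > best_score:
                best, best_score = c, score
        chosen.append(best)
    return np.array(chosen)

# Toy linear classifier.
w_true = np.array([1.0, -2.0, 0.5])
predict = lambda X: (X @ w_true > 0).astype(int)
x = np.array([-1.0, 1.0, 0.0])           # predicted class 0
cfs = diverse_counterfactuals(predict, x)
print(predict(cfs))                      # all 1s: each counterfactual flips the prediction
```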

Hybrid coarse-fine classification for head pose estimation

Title Hybrid coarse-fine classification for head pose estimation
Authors Haofan Wang, Zhenghua Chen, Yi Zhou
Abstract Head pose estimation, which computes the intrinsic Euler angles (yaw, pitch, roll) of the human head, is crucial for gaze estimation, face alignment, and 3D reconstruction. Traditional approaches rely heavily on the accuracy of facial landmarks, which limits their performance, especially when the face is poorly visible. In this paper, to perform the estimation without facial landmarks, we combine coarse and fine regression outputs in a single deep network. Using finer quantization units for the angles, a fine classifier is trained with the help of auxiliary coarse units, and an integrated regression is adopted to obtain the final prediction. The proposed approach is evaluated on three challenging benchmarks. It achieves the state of the art on AFLW2000 and BIWI and performs favorably on AFLW. The code has been released on GitHub.
Tasks 3D Reconstruction, Face Alignment, Gaze Estimation, Head Pose Estimation, Pose Estimation, Quantization
Published 2019-01-21
URL https://arxiv.org/abs/1901.06778v2
PDF https://arxiv.org/pdf/1901.06778v2.pdf
PWC https://paperswithcode.com/paper/hybrid-coarse-fine-classification-for-head
Repo https://github.com/haofanwang/accurate-head-pose
Framework pytorch
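A small sketch of the classification-then-regression idea: each head outputs a distribution over angle bins at a different granularity, an angle is recovered as the expectation over bin centres, and the two granularities are fused. The bin widths, the bin counts, and the simple averaging fusion below are assumptions for illustration, not the paper's exact configuration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def angle_from_logits(fine_logits, coarse_logits, fine_width=3.0,
                      coarse_width=15.0, angle_min=-99.0):
    """Recover an angle as the expectation over bin centres for each
    granularity, then average the fine and coarse estimates."""
    def expected_angle(logits, width):
        centres = angle_min + width * (np.arange(len(logits)) + 0.5)
        return float(softmax(logits) @ centres)
    return 0.5 * (expected_angle(fine_logits, fine_width)
                  + expected_angle(coarse_logits, coarse_width))

# Toy logits: 66 fine bins (3 degrees each) and 13 coarse bins (15 degrees each).
rng = np.random.default_rng(0)
fine = rng.normal(size=66);   fine[40] += 6.0     # peak near +22.5 degrees
coarse = rng.normal(size=13); coarse[8] += 6.0    # peak near +28.5 degrees
print(round(angle_from_logits(fine, coarse), 1))
```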

Amortized Monte Carlo Integration

Title Amortized Monte Carlo Integration
Authors Adam Goliński, Frank Wood, Tom Rainforth
Abstract Current approaches to amortizing Bayesian inference focus solely on approximating the posterior distribution. Typically, this approximation is, in turn, used to calculate expectations for one or more target functions - a computational pipeline which is inefficient when the target function(s) are known upfront. In this paper, we address this inefficiency by introducing AMCI, a method for amortizing Monte Carlo integration directly. AMCI operates similarly to amortized inference but produces three distinct amortized proposals, each tailored to a different component of the overall expectation calculation. At runtime, samples are produced separately from each amortized proposal, before being combined into an overall estimate of the expectation. We show that while existing approaches are fundamentally limited in the level of accuracy they can achieve, AMCI can theoretically produce arbitrarily small errors for any integrable target function using only a single sample from each proposal at runtime. We further show that it is able to empirically outperform the theoretically optimal self-normalized importance sampler on a number of example problems. Furthermore, AMCI allows not only for amortizing over datasets but also amortizing over target functions.
Tasks Bayesian Inference
Published 2019-07-18
URL https://arxiv.org/abs/1907.08082v1
PDF https://arxiv.org/pdf/1907.08082v1.pdf
PWC https://paperswithcode.com/paper/amortized-monte-carlo-integration
Repo https://github.com/talesa/amci
Framework none
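The backbone of this kind of estimator is a ratio of two importance-sampling estimates with separately tailored proposals. The sketch below is only in the spirit of AMCI: it uses hand-picked Gaussian proposals and many samples instead of amortized neural proposals with a single sample each, and it omits the third proposal for the negative part of the target function.

```python
import numpy as np
from scipy import stats

def ratio_estimate(f, log_joint, q_num, q_den, n=5000, seed=0):
    """Two-proposal ratio estimator:
        E_p[f] = E_{q_num}[f(x) p~(x) / q_num(x)] / E_{q_den}[p~(x) / q_den(x)],
    where p~ (log_joint) is the target known only up to a constant.
    """
    xs_num = q_num.rvs(size=n, random_state=seed)
    xs_den = q_den.rvs(size=n, random_state=seed + 1)
    num = np.mean(f(xs_num) * np.exp(log_joint(xs_num) - q_num.logpdf(xs_num)))
    den = np.mean(np.exp(log_joint(xs_den) - q_den.logpdf(xs_den)))
    return num / den

# Target: standard normal posterior (unnormalized), f(x) = x^2, so E_p[f] = 1.
log_joint = lambda x: -0.5 * x ** 2            # unnormalized log density
f = lambda x: x ** 2
q_num = stats.norm(0.0, 1.8)   # heavier-tailed proposal, roughly matched to |f|*p
q_den = stats.norm(0.0, 1.0)   # proposal matched to p itself
print(ratio_estimate(f, log_joint, q_num, q_den))   # close to 1
```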

GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation

Title GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation
Authors Marc Brockschmidt
Abstract This paper presents a new Graph Neural Network (GNN) type using feature-wise linear modulation (FiLM). Many standard GNN variants propagate information along the edges of a graph by computing “messages” based only on the representation of the source of each edge. In GNN-FiLM, the representation of the target node of an edge is additionally used to compute a transformation that can be applied to all incoming messages, allowing feature-wise modulation of the passed information. Results of experiments comparing different GNN architectures on three tasks from the literature are presented, based on re-implementations of baseline methods. Hyperparameters for all methods were found using extensive search, yielding somewhat surprising results: differences between baseline models are smaller than reported in the literature. Nonetheless, GNN-FiLM outperforms baseline methods on a regression task on molecular graphs and performs competitively on other tasks.
Tasks
Published 2019-06-28
URL https://arxiv.org/abs/1906.12192v4
PDF https://arxiv.org/pdf/1906.12192v4.pdf
PWC https://paperswithcode.com/paper/gnn-film-graph-neural-networks-with-feature
Repo https://github.com/microsoft/tf-gnn-samples
Framework tf
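A reduced sketch of the FiLM message-passing step: the target node of each edge predicts a feature-wise scale (gamma) and shift (beta) that modulate the message coming from the source node before aggregation. A single edge type, no bias terms, and the exact placement of the nonlinearity are simplifications of the layer described in the abstract.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def gnn_film_layer(H, edges, W_msg, W_gamma, W_beta):
    """One FiLM-style message-passing layer over node features H (n x d)."""
    out = np.zeros_like(H)
    for (src, dst) in edges:
        msg = W_msg @ H[src]                 # message from the source node
        gamma = W_gamma @ H[dst]             # modulation from the target node
        beta = W_beta @ H[dst]
        out[dst] += relu(gamma * msg + beta) # feature-wise linear modulation
    return out

# Tiny graph: 3 nodes, 4 directed edges, 4-dimensional features.
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))
edges = [(0, 1), (1, 2), (2, 0), (0, 2)]
W_msg, W_gamma, W_beta = (rng.normal(size=(4, 4)) * 0.3 for _ in range(3))
print(gnn_film_layer(H, edges, W_msg, W_gamma, W_beta))
```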

Kimera: an Open-Source Library for Real-Time Metric-Semantic Localization and Mapping

Title Kimera: an Open-Source Library for Real-Time Metric-Semantic Localization and Mapping
Authors Antoni Rosinol, Marcus Abate, Yun Chang, Luca Carlone
Abstract We provide an open-source C++ library for real-time metric-semantic visual-inertial Simultaneous Localization And Mapping (SLAM). The library goes beyond existing visual and visual-inertial SLAM libraries (e.g., ORB-SLAM, VINS-Mono, OKVIS, ROVIO) by enabling mesh reconstruction and semantic labeling in 3D. Kimera is designed with modularity in mind and has four key components: a visual-inertial odometry (VIO) module for fast and accurate state estimation, a robust pose graph optimizer for global trajectory estimation, a lightweight 3D mesher module for fast mesh reconstruction, and a dense 3D metric-semantic reconstruction module. The modules can be run in isolation or in combination, hence Kimera can easily fall back to a state-of-the-art VIO or a full SLAM system. Kimera runs in real-time on a CPU and produces a 3D metric-semantic mesh from semantically labeled images, which can be obtained by modern deep learning methods. We hope that the flexibility, computational efficiency, robustness, and accuracy afforded by Kimera will build a solid basis for future metric-semantic SLAM and perception research, and will allow researchers across multiple areas (e.g., VIO, SLAM, 3D reconstruction, segmentation) to benchmark and prototype their own efforts without having to start from scratch.
Tasks 3D Reconstruction, Simultaneous Localization and Mapping
Published 2019-10-06
URL https://arxiv.org/abs/1910.02490v3
PDF https://arxiv.org/pdf/1910.02490v3.pdf
PWC https://paperswithcode.com/paper/kimera-an-open-source-library-for-real-time
Repo https://github.com/MIT-SPARK/Kimera-VIO-ROS
Framework none

Analyzing and Improving the Image Quality of StyleGAN

Title Analyzing and Improving the Image Quality of StyleGAN
Authors Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, Timo Aila
Abstract The style-based GAN architecture (StyleGAN) yields state-of-the-art results in data-driven unconditional generative image modeling. We expose and analyze several of its characteristic artifacts, and propose changes in both model architecture and training methods to address them. In particular, we redesign the generator normalization, revisit progressive growing, and regularize the generator to encourage good conditioning in the mapping from latent codes to images. In addition to improving image quality, this path length regularizer yields the additional benefit that the generator becomes significantly easier to invert. This makes it possible to reliably attribute a generated image to a particular network. We furthermore visualize how well the generator utilizes its output resolution, and identify a capacity problem, motivating us to train larger models for additional quality improvements. Overall, our improved model redefines the state of the art in unconditional image modeling, both in terms of existing distribution quality metrics as well as perceived image quality.
Tasks Image Generation
Published 2019-12-03
URL https://arxiv.org/abs/1912.04958v2
PDF https://arxiv.org/pdf/1912.04958v2.pdf
PWC https://paperswithcode.com/paper/analyzing-and-improving-the-image-quality-of
Repo https://github.com/jkcracker/stylegan2
Framework tf
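One of the concrete changes in the paper, the redesigned generator normalization, replaces instance normalization with weight demodulation: scale the convolution weights by the style per input channel, then rescale each output filter to unit norm. A numpy sketch for a single sample (the grouped-convolution batching trick from the official implementation is omitted):

```python
import numpy as np

def modulated_weights(w, styles, demodulate=True, eps=1e-8):
    """Weight (de)modulation.
    w: (out_ch, in_ch, kh, kw) conv weights, styles: (in_ch,) per-sample style.
    """
    w = w * styles[None, :, None, None]                 # modulate
    if demodulate:
        sigma = np.sqrt((w ** 2).sum(axis=(1, 2, 3)) + eps)
        w = w / sigma[:, None, None, None]              # demodulate
    return w

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4, 3, 3))
styles = rng.lognormal(size=4)
w_mod = modulated_weights(w, styles)
print(np.sqrt((w_mod ** 2).sum(axis=(1, 2, 3))))        # ~1 per output filter
```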

FCSR-GAN: Joint Face Completion and Super-resolution via Multi-task Learning

Title FCSR-GAN: Joint Face Completion and Super-resolution via Multi-task Learning
Authors Jiancheng Cai, Hu Han, Shiguang Shan, Xilin Chen
Abstract Combined variations containing low resolution and occlusion are often present in face images in the wild, e.g., under the scenario of video surveillance. While most of the existing face image recovery approaches can handle only one type of variation per model, in this work, we propose a deep generative adversarial network (FCSR-GAN) for performing joint face completion and face super-resolution via multi-task learning. The generator of FCSR-GAN aims to recover a high-resolution face image without occlusion given an input low-resolution face image with occlusion. The discriminator of FCSR-GAN uses a set of carefully designed losses (an adversarial loss, a perceptual loss, a pixel loss, a smooth loss, a style loss, and a face prior loss) to assure the high quality of the recovered high-resolution face images without occlusion. The whole network of FCSR-GAN can be trained end-to-end using our two-stage training strategy. Experimental results on the public-domain CelebA and Helen databases show that the proposed approach outperforms the state-of-the-art methods in jointly performing face super-resolution (up to 8 $\times$) and face completion, and shows good generalization ability in cross-database testing. Our FCSR-GAN is also useful for improving face identification performance when there are low-resolution and occlusion in face images.
Tasks Face Identification, Facial Inpainting, Multi-Task Learning, Super-Resolution
Published 2019-11-04
URL https://arxiv.org/abs/1911.01045v1
PDF https://arxiv.org/pdf/1911.01045v1.pdf
PWC https://paperswithcode.com/paper/fcsr-gan-joint-face-completion-and-super
Repo https://github.com/swordcheng/FCSR-GAN
Framework pytorch

CAT: Compression-Aware Training for bandwidth reduction

Title CAT: Compression-Aware Training for bandwidth reduction
Authors Chaim Baskin, Brian Chmiel, Evgenii Zheltonozhskii, Ron Banner, Alex M. Bronstein, Avi Mendelson
Abstract Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving visual processing tasks. One of the major obstacles hindering the ubiquitous use of CNNs for inference is their relatively high memory bandwidth requirements, which can be a main energy consumer and throughput bottleneck in hardware accelerators. Accordingly, an efficient feature map compression method can result in substantial performance gains. Inspired by quantization-aware training approaches, we propose a compression-aware training (CAT) method that involves training the model in a way that allows better compression of feature maps during inference. Our method trains the model to achieve low-entropy feature maps, which enables efficient compression at inference time using classical transform coding methods. CAT significantly improves the state-of-the-art results reported for quantization. For example, on ResNet-34 we achieve 73.1% accuracy (0.2% degradation from the baseline) with an average representation of only 1.79 bits per value. A reference implementation accompanies the paper at https://github.com/CAT-teams/CAT
Tasks Quantization
Published 2019-09-25
URL https://arxiv.org/abs/1909.11481v1
PDF https://arxiv.org/pdf/1909.11481v1.pdf
PWC https://paperswithcode.com/paper/cat-compression-aware-training-for-bandwidth-1
Repo https://github.com/CAT-teams/CAT
Framework pytorch
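The quantity CAT drives down is the entropy of the (quantized) feature maps, since that bounds the bits an entropy coder needs at inference time. Below is the hard, non-differentiable measurement on toy activations; the training-time regularizer in the paper uses a differentiable soft-entropy surrogate instead, which is not shown here.

```python
import numpy as np

def empirical_entropy_bits(feature_map, n_levels=16):
    """Average bits per value of a uniformly quantized feature map."""
    x = feature_map.ravel()
    lo, hi = x.min(), x.max()
    q = np.clip(((x - lo) / (hi - lo + 1e-12) * n_levels).astype(int),
                0, n_levels - 1)
    counts = np.bincount(q, minlength=n_levels).astype(float)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
spread = rng.normal(size=(64, 16, 16))                        # high-entropy activations
peaky = np.maximum(rng.normal(-2, 1, size=(64, 16, 16)), 0)   # ReLU-like, mostly zeros
print(empirical_entropy_bits(spread), empirical_entropy_bits(peaky))
```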

Exploiting Multiple Embeddings for Chinese Named Entity Recognition

Title Exploiting Multiple Embeddings for Chinese Named Entity Recognition
Authors Canwen Xu, Feiyang Wang, Jialong Han, Chenliang Li
Abstract Identifying the named entities mentioned in text would enrich many semantic applications at the downstream level. However, due to the predominant usage of colloquial language in microblogs, named entity recognition (NER) in Chinese microblogs experiences significant performance deterioration compared with performing NER on formal Chinese corpora. In this paper, we propose a simple yet effective neural framework to derive the character-level embeddings for NER in Chinese text, named ME-CNER. A character embedding is derived with rich semantic information harnessed at multiple granularities, ranging from the radical and character levels to the word level. The experimental results demonstrate that the proposed approach achieves a large performance improvement on the Weibo dataset and comparable performance on the MSRA news dataset, at a lower computational cost than the existing state-of-the-art alternatives.
Tasks Chinese Named Entity Recognition, Named Entity Recognition
Published 2019-08-28
URL https://arxiv.org/abs/1908.10657v1
PDF https://arxiv.org/pdf/1908.10657v1.pdf
PWC https://paperswithcode.com/paper/exploiting-multiple-embeddings-for-chinese
Repo https://github.com/WHUIR/ME-CNER
Framework none
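The multi-granularity idea reduces to building a character representation from radical-, character-, and word-level embeddings. A toy sketch with hypothetical lookup tables (the real model learns these representations jointly with convolutional and recurrent layers, which are omitted here):

```python
import numpy as np

def multi_granularity_embedding(char, word, radical_of, rad_emb, char_emb, word_emb):
    """Concatenate the radical, character, and word-level embeddings for a
    character, in the spirit of ME-CNER.  The lookup tables and the radical
    mapping are toy stand-ins, not the paper's resources."""
    return np.concatenate([rad_emb[radical_of[char]], char_emb[char], word_emb[word]])

rng = np.random.default_rng(0)
rad_emb = {"water": rng.normal(size=4)}
char_emb = {"湖": rng.normal(size=8)}
word_emb = {"湖北": rng.normal(size=16)}
radical_of = {"湖": "water"}              # 湖 carries the water radical 氵

vec = multi_granularity_embedding("湖", "湖北", radical_of, rad_emb, char_emb, word_emb)
print(vec.shape)                           # (28,)
```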