January 31, 2020


Paper Group AWR 404


Chunkflow: Distributed Hybrid Cloud Processing of Large 3D Images by Convolutional Nets


Title	Chunkflow: Distributed Hybrid Cloud Processing of Large 3D Images by Convolutional Nets
Authors	Jingpeng Wu, William M. Silversmith, Kisuk Lee, H. Sebastian Seung
Abstract	It is now common to process volumetric biomedical images using 3D Convolutional Networks (ConvNets). This can be challenging for the teravoxel and even petavoxel images that are being acquired today by light or electron microscopy. Here we introduce chunkflow, a software framework for distributing ConvNet processing over local and cloud GPUs and CPUs. The image volume is divided into overlapping chunks, each chunk is processed by a ConvNet, and the results are blended together to yield the output image. The frontend submits ConvNet tasks to a cloud queue. The tasks are executed by local and cloud GPUs and CPUs. Thanks to the fault-tolerant architecture of Chunkflow, cost can be greatly reduced by utilizing cheap unstable cloud instances. Chunkflow currently supports PyTorch for GPUs and PZnet for CPUs. To illustrate its usage, a large 3D brain image from serial section electron microscopy was processed by a 3D ConvNet with a U-Net style architecture. Chunkflow provides some chunk operations for general use, and the operations can be composed flexibly in a command line interface.
Tasks
Published	2019-04-23
URL	https://arxiv.org/abs/1904.10489v3
PDF	https://arxiv.org/pdf/1904.10489v3.pdf
PWC	https://paperswithcode.com/paper/chunkflow-distributed-hybrid-cloud-processing
Repo	https://github.com/seung-lab/chunkflow
Framework	pytorch
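
The chunk-and-blend step described in the abstract can be illustrated in one dimension. This is a minimal sketch, not chunkflow's implementation: the function names, the triangular blending weights, and the stand-in `fake_convnet` are assumptions made for illustration only.

```python
def chunk_starts(length, chunk, overlap):
    """Start offsets of overlapping chunks that cover [0, length)."""
    if chunk > length or overlap >= chunk:
        raise ValueError("need overlap < chunk <= length")
    stride = chunk - overlap
    starts = list(range(0, length - chunk + 1, stride))
    if starts[-1] + chunk < length:
        starts.append(length - chunk)  # last chunk is shifted back to fit
    return starts


def blend_weights(chunk):
    """Triangular weights: largest mid-chunk, smallest at the chunk edges."""
    return [min(i + 1, chunk - i) for i in range(chunk)]


def process_in_chunks(signal, chunk, overlap, model):
    """Run `model` on each chunk and blend outputs by weighted average."""
    total = [0.0] * len(signal)
    norm = [0.0] * len(signal)
    weights = blend_weights(chunk)
    for start in chunk_starts(len(signal), chunk, overlap):
        out = model(signal[start:start + chunk])
        for i, (value, w) in enumerate(zip(out, weights)):
            total[start + i] += w * value
            norm[start + i] += w
    return [t / n for t, n in zip(total, norm)]


def fake_convnet(values):
    """Hypothetical stand-in for a ConvNet: doubles every input value."""
    return [2.0 * v for v in values]


signal = [float(i) for i in range(10)]
blended = process_in_chunks(signal, chunk=4, overlap=2, model=fake_convnet)
```

Because the stand-in model acts on each value independently, the blended result equals the model applied to the whole signal. With a real ConvNet the outputs near chunk borders differ between neighbouring chunks, which is why the overlap regions are down-weighted.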

Vertebrae Detection and Localization in CT with Two-Stage CNNs and Dense Annotations


Title	Vertebrae Detection and Localization in CT with Two-Stage CNNs and Dense Annotations
Authors	James McCouat, Ben Glocker
Abstract	We propose a new, two-stage approach to the vertebrae centroid detection and localization problem. The first stage detects where the vertebrae appear in the scan using 3D samples, the second identifies the specific vertebrae within that region-of-interest using 2D slices. Our solution utilizes new techniques to improve the accuracy of the algorithm such as a revised approach to dense labelling from sparse centroid annotations and usage of large anisotropic kernels in the base level of a U-net architecture to maximize the receptive field. Our method improves the state-of-the-art’s mean localization accuracy by 0.87mm on a publicly available spine CT benchmark.
Tasks
Published	2019-10-14
URL	https://arxiv.org/abs/1910.05911v1
PDF	https://arxiv.org/pdf/1910.05911v1.pdf
PWC	https://paperswithcode.com/paper/vertebrae-detection-and-localization-in-ct
Repo	https://github.com/jfm15/SpineFinder
Framework	tf

MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning


Title	MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning
Authors	Zechun Liu, Haoyuan Mu, Xiangyu Zhang, Zichao Guo, Xin Yang, Tim Kwang-Ting Cheng, Jian Sun
Abstract	In this paper, we propose a novel meta learning approach for automatic channel pruning of very deep neural networks. We first train a PruningNet, a kind of meta network, which is able to generate weight parameters for any pruned structure given the target network. We use a simple stochastic structure sampling method for training the PruningNet. Then, we apply an evolutionary procedure to search for good-performing pruned networks. The search is highly efficient because the weights are directly generated by the trained PruningNet and we do not need any finetuning at search time. With a single PruningNet trained for the target network, we can search for various Pruned Networks under different constraints with little human participation. Compared to the state-of-the-art pruning methods, we have demonstrated superior performances on MobileNet V1/V2 and ResNet. Codes are available on https://github.com/liuzechun/MetaPruning.
Tasks	AutoML, Meta-Learning
Published	2019-03-25
URL	https://arxiv.org/abs/1903.10258v3
PDF	https://arxiv.org/pdf/1903.10258v3.pdf
PWC	https://paperswithcode.com/paper/metapruning-meta-learning-for-automatic
Repo	https://github.com/liuzechun/MetaPruning
Framework	pytorch

Searching to Exploit Memorization Effect in Learning from Corrupted Labels


Title	Searching to Exploit Memorization Effect in Learning from Corrupted Labels
Authors	Hansi Yang, Quanming Yao, Bo Han, Gang Niu
Abstract	Sample-selection approaches, which attempt to pick clean instances out of a noisy training data set, have become a promising direction for robust learning from corrupted labels. These methods all build on the memorization effect: deep networks learn easy patterns first and then gradually over-fit the training data set. In this paper, we show that properly selecting instances so that training benefits the most from the memorization effect is a hard problem. Specifically, memorization can depend heavily on many factors, e.g., the data set and the network architecture. Nonetheless, there still exist general patterns in how memorization occurs. These facts motivate us to exploit memorization with automated machine learning (AutoML) techniques. First, we design an expressive but compact search space based on the observed general patterns. Then, we propose to use a natural gradient-based search algorithm to search through this space efficiently. Finally, extensive experiments on both synthetic and benchmark data sets demonstrate that the proposed method is not only much more efficient than existing AutoML algorithms but also achieves much better performance than state-of-the-art approaches for learning from corrupted labels.
Tasks	AutoML
Published	2019-11-06
URL	https://arxiv.org/abs/1911.02377v1
PDF	https://arxiv.org/pdf/1911.02377v1.pdf
PWC	https://paperswithcode.com/paper/searching-to-exploit-memorization-effect-in
Repo	https://github.com/bhanML/Co-teaching
Framework	pytorch
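
Sample selection of the kind the abstract refers to can be sketched as follows. This assumes the small-loss criterion used by Co-teaching-style methods (keep the samples with the lowest loss as the presumed clean ones). The linear `keep_fraction` schedule and all names are hypothetical; the paper searches for such a schedule rather than fixing one by hand.

```python
def keep_fraction(epoch, noise_rate, warmup_epochs):
    """Hypothetical schedule: keep all samples at first, then drop up to noise_rate."""
    return 1.0 - noise_rate * min(epoch / warmup_epochs, 1.0)


def select_small_loss(losses, fraction):
    """Indices (ascending) of the `fraction` of samples with the smallest loss."""
    count = max(1, int(round(fraction * len(losses))))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(ranked[:count])


# Made-up per-sample losses for one mini-batch.
losses = [0.1, 2.3, 0.4, 1.9, 0.2, 0.3, 3.1, 0.5, 0.6, 0.7]
kept_early = select_small_loss(losses, keep_fraction(0, noise_rate=0.3, warmup_epochs=10))
kept_late = select_small_loss(losses, keep_fraction(10, noise_rate=0.3, warmup_epochs=10))
```

Early in training every sample is kept, since the network has not yet memorized noisy labels. Later, the three highest-loss samples (indices 1, 3, and 6 here) are excluded from the update.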

ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning


Title	ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning
Authors	Qianwen Wang, Yao Ming, Zhihua Jin, Qiaomu Shen, Dongyu Liu, Micah J. Smith, Kalyan Veeramachaneni, Huamin Qu
Abstract	To relieve the pain of manually selecting machine learning algorithms and tuning hyperparameters, automated machine learning (AutoML) methods have been developed to automatically search for good models. Due to the huge model search space, it is impossible to try all models. Users tend to distrust automatic results and increase the search budget as much as they can, thereby undermining the efficiency of AutoML. To address these issues, we design and implement ATMSeer, an interactive visualization tool that supports users in refining the search space of AutoML and analyzing the results. To guide the design of ATMSeer, we derive a workflow of using AutoML based on interviews with machine learning experts. A multi-granularity visualization is proposed to enable users to monitor the AutoML process, analyze the searched models, and refine the search space in real time. We demonstrate the utility and usability of ATMSeer through two case studies, expert interviews, and a user study with 13 end users.
Tasks	AutoML
Published	2019-02-13
URL	http://arxiv.org/abs/1902.05009v1
PDF	http://arxiv.org/pdf/1902.05009v1.pdf
PWC	https://paperswithcode.com/paper/atmseer-increasing-transparency-and
Repo	https://github.com/HDI-Project/ATMSeer
Framework	none

A Hierarchical Probabilistic U-Net for Modeling Multi-Scale Ambiguities


Title	A Hierarchical Probabilistic U-Net for Modeling Multi-Scale Ambiguities
Authors	Simon A. A. Kohl, Bernardino Romera-Paredes, Klaus H. Maier-Hein, Danilo Jimenez Rezende, S. M. Ali Eslami, Pushmeet Kohli, Andrew Zisserman, Olaf Ronneberger
Abstract	Medical imaging only indirectly measures the molecular identity of the tissue within each voxel, which often produces only ambiguous image evidence for target measures of interest, like semantic segmentation. This diversity and the variations of plausible interpretations are often specific to given image regions and may thus manifest on various scales, spanning all the way from the pixel to the image level. In order to learn a flexible distribution that can account for multiple scales of variations, we propose the Hierarchical Probabilistic U-Net, a segmentation network with a conditional variational auto-encoder (cVAE) that uses a hierarchical latent space decomposition. We show that this model formulation enables sampling and reconstruction of segmentations with high fidelity, i.e. with finely resolved detail, while providing the flexibility to learn complex structured distributions across scales. We demonstrate these abilities on the task of segmenting ambiguous medical scans as well as on instance segmentation of neurobiological and natural images. Our model automatically separates independent factors across scales, an inductive bias that we deem beneficial in structured output prediction tasks beyond segmentation.
Tasks	Instance Segmentation, Semantic Segmentation
Published	2019-05-30
URL	https://arxiv.org/abs/1905.13077v1
PDF	https://arxiv.org/pdf/1905.13077v1.pdf
PWC	https://paperswithcode.com/paper/190513077
Repo	https://github.com/SimonKohl/probabilistic_unet
Framework	none


A Nonconvex Approach for Exact and Efficient Multichannel Sparse Blind Deconvolution


Title	A Nonconvex Approach for Exact and Efficient Multichannel Sparse Blind Deconvolution
Authors	Qing Qu, Xiao Li, Zhihui Zhu
Abstract	We study the multi-channel sparse blind deconvolution (MCS-BD) problem, whose task is to simultaneously recover a kernel $\mathbf a$ and multiple sparse inputs $\{\mathbf x_i\}_{i=1}^p$ from their circulant convolution $\mathbf y_i = \mathbf a \circledast \mathbf x_i$ ($i=1,\cdots,p$). We formulate the task as a nonconvex optimization problem over the sphere. Under mild statistical assumptions of the data, we prove that the vanilla Riemannian gradient descent (RGD) method, with random initializations, provably recovers both the kernel $\mathbf a$ and the signals $\{\mathbf x_i\}_{i=1}^p$ up to a signed shift ambiguity. In comparison with state-of-the-art results, our work shows significant improvements in terms of sample complexity and computational efficiency. Our theoretical results are corroborated by numerical experiments, which demonstrate superior performance of the proposed approach over the previous methods on both synthetic and real datasets.
Tasks
Published	2019-08-28
URL	https://arxiv.org/abs/1908.10776v3
PDF	https://arxiv.org/pdf/1908.10776v3.pdf
PWC	https://paperswithcode.com/paper/a-nonconvex-approach-for-exact-and-efficient
Repo	https://github.com/qingqu06/MCS-BD
Framework	none
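
The circulant convolution and the signed shift ambiguity mentioned in the abstract can be checked numerically. The sketch below is illustrative only: it does not implement the paper's Riemannian gradient descent, and the kernel and input values are made up.

```python
def circular_convolve(a, x):
    """Circulant convolution: y[n] = sum_k a[k] * x[(n - k) mod m]."""
    m = len(a)
    if len(x) != m:
        raise ValueError("kernel and input must have the same length")
    return [sum(a[k] * x[(n - k) % m] for k in range(m)) for n in range(m)]


def shift(v, s):
    """Cyclic shift by s positions, so that result[(i + s) mod m] == v[i]."""
    m = len(v)
    return [v[(i - s) % m] for i in range(m)]


kernel = [1, 2, 0, -1]
sparse_input = [0, 3, 0, 0]
y = circular_convolve(kernel, sparse_input)

# Shift the kernel forward, shift the input back, and flip both signs:
# the observation is unchanged, so (a, x) is only identifiable up to
# this signed shift.
y_ambiguous = circular_convolve(
    [-c for c in shift(kernel, 1)],
    [-c for c in shift(sparse_input, -1)],
)
```

Both calls yield the same observation, which is why any recovery guarantee for this problem can only hold up to a signed shift.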

MATCHA: Speeding Up Decentralized SGD via Matching Decomposition Sampling


Title	MATCHA: Speeding Up Decentralized SGD via Matching Decomposition Sampling
Authors	Jianyu Wang, Anit Kumar Sahu, Zhouyi Yang, Gauri Joshi, Soummya Kar
Abstract	This paper studies the problem of error-runtime trade-off, typically encountered in decentralized training based on stochastic gradient descent (SGD) using a given network. While a denser (sparser) network topology results in faster (slower) error convergence in terms of iterations, it incurs more (less) communication time/delay per iteration. In this paper, we propose MATCHA, an algorithm that can achieve a win-win in this error-runtime trade-off for any arbitrary network topology. The main idea of MATCHA is to parallelize inter-node communication by decomposing the topology into matchings. To preserve fast error convergence speed, it identifies and communicates more frequently over critical links, and saves communication time by using other links less frequently. Experiments on a suite of datasets and deep neural networks validate the theoretical analyses and demonstrate that MATCHA takes up to $5\times$ less time than vanilla decentralized SGD to reach the same training loss.
Tasks
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09435v3
PDF	https://arxiv.org/pdf/1905.09435v3.pdf
PWC	https://paperswithcode.com/paper/matcha-speeding-up-decentralized-sgd-via
Repo	https://github.com/JYWa/MATCHA
Framework	pytorch
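
The idea of splitting a topology into matchings, so that all links inside one matching can communicate in parallel, can be sketched as below. The greedy decomposition, the example graph, and the activation probabilities are assumptions for illustration; the abstract does not specify how MATCHA computes the decomposition or chooses the probabilities.

```python
import random


def decompose_into_matchings(edges):
    """Greedily split an undirected edge list into disjoint matchings.

    Edges within one matching share no node, so they can all be used
    for communication at the same time.
    """
    remaining = list(edges)
    matchings = []
    while remaining:
        used_nodes = set()
        matching, leftover = [], []
        for u, v in remaining:
            if u in used_nodes or v in used_nodes:
                leftover.append((u, v))
            else:
                matching.append((u, v))
                used_nodes.update((u, v))
        matchings.append(matching)
        remaining = leftover
    return matchings


def sample_active_edges(matchings, probabilities, rng):
    """Independently activate each matching with its own probability."""
    active = []
    for matching, p in zip(matchings, probabilities):
        if rng.random() < p:
            active.extend(matching)
    return active


# Hypothetical 5-node topology, chosen only for illustration.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
matchings = decompose_into_matchings(edges)
active = sample_active_edges(matchings, [1.0, 0.0, 1.0], random.Random(0))
```

Giving a matching a higher activation probability makes its links communicate more often, which is the knob the abstract describes for favouring critical links.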

Estimating 3D Motion and Forces of Person-Object Interactions from Monocular Video


Title	Estimating 3D Motion and Forces of Person-Object Interactions from Monocular Video
Authors	Zongmian Li, Jiri Sedlar, Justin Carpentier, Ivan Laptev, Nicolas Mansard, Josef Sivic
Abstract	In this paper, we introduce a method to automatically reconstruct the 3D motion of a person interacting with an object from a single RGB video. Our method estimates the 3D poses of the person and the object, contact positions, and forces and torques actuated by the human limbs. The main contributions of this work are three-fold. First, we introduce an approach to jointly estimate the motion and the actuation forces of the person on the manipulated object by modeling contacts and the dynamics of their interactions. This is cast as a large-scale trajectory optimization problem. Second, we develop a method to automatically recognize from the input video the position and timing of contacts between the person and the object or the ground, thereby significantly simplifying the complexity of the optimization. Third, we validate our approach on a recent MoCap dataset with ground truth contact forces and demonstrate its performance on a new dataset of Internet videos showing people manipulating a variety of tools in unconstrained environments.
Tasks
Published	2019-04-04
URL	https://arxiv.org/abs/1904.02683v2
PDF	https://arxiv.org/pdf/1904.02683v2.pdf
PWC	https://paperswithcode.com/paper/estimating-3d-motion-and-forces-of-person
Repo	https://github.com/ManifoldFR/recvis-project
Framework	tf

Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning


Title	Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning
Authors	Hengyuan Hu, Jakob N Foerster
Abstract	In recent years we have seen fast progress on a number of benchmark problems in AI, with modern methods achieving near or super human performance in Go, Poker and Dota. One common aspect of all of these challenges is that they are by design adversarial or, technically speaking, zero-sum. In contrast to these settings, success in the real world commonly requires humans to collaborate and communicate with others, in settings that are, at least partially, cooperative. In the last year, the card game Hanabi has been established as a new benchmark environment for AI to fill this gap. In particular, Hanabi is interesting to humans since it is entirely focused on theory of mind, i.e., the ability to effectively reason over the intentions, beliefs and point of view of other agents when observing their actions. Learning to be informative when observed by others is an interesting challenge for Reinforcement Learning (RL): Fundamentally, RL requires agents to explore in order to discover good policies. However, when done naively, this randomness will inherently make their actions less informative to others during training. We present a new deep multi-agent RL method, the Simplified Action Decoder (SAD), which resolves this contradiction by exploiting the centralized training phase. During training, SAD allows other agents to observe not only the (exploratory) action chosen but also the greedy action of their teammates. By combining this simple intuition with best practices for multi-agent learning, SAD establishes a new SOTA for learning methods for 2-5 players on the self-play part of the Hanabi challenge. Our ablations show the contributions of SAD compared with the best practice components. All of our code and trained agents are available at https://github.com/facebookresearch/Hanabi_SAD.
Tasks	Multi-agent Reinforcement Learning
Published	2019-12-04
URL	https://arxiv.org/abs/1912.02288v1
PDF	https://arxiv.org/pdf/1912.02288v1.pdf
PWC	https://paperswithcode.com/paper/191202288
Repo	https://github.com/facebookresearch/Hanabi_SAD
Framework	pytorch
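
The core idea in the abstract, exposing the greedy action to teammates during training alongside the exploratory action that is actually played, can be sketched as follows. Epsilon-greedy exploration, the Q-values, and the dictionary layout of the observation are assumptions for illustration and are not taken from the released code.

```python
import random


def act(q_values, epsilon, rng):
    """Return (executed_action, greedy_action) under epsilon-greedy exploration."""
    greedy = max(range(len(q_values)), key=lambda a: q_values[a])
    if rng.random() < epsilon:
        executed = rng.randrange(len(q_values))  # exploratory action
    else:
        executed = greedy
    return executed, greedy


def training_observation(executed, greedy):
    """What teammates see during centralized training in this sketch."""
    return {"action": executed, "greedy_action": greedy}


rng = random.Random(0)
q_values = [0.1, 0.9, 0.3]  # made-up action values
executed, greedy = act(q_values, epsilon=1.0, rng=rng)
observation = training_observation(executed, greedy)
```

Even when the executed action is random, `greedy_action` still tells teammates what the agent would have done without exploration, so the signal they learn from stays informative.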


Simultaneous Mapping and Target Driven Navigation


Title	Simultaneous Mapping and Target Driven Navigation
Authors	Georgios Georgakis, Yimeng Li, Jana Kosecka
Abstract	This work presents a modular architecture for simultaneous mapping and target driven navigation in indoor environments. The semantic and appearance information stored in a 2.5D map is distilled from RGB images, semantic segmentation, and the outputs of object detectors by convolutional neural networks. Given this representation, the mapping module learns to localize the agent and register consecutive observations in the map. The navigation task is then formulated as a problem of learning a policy for reaching semantic targets using current observations and the up-to-date map. We demonstrate that the use of semantic information improves localization accuracy and that the ability to store a spatial semantic map aids the target driven navigation policy. The two modules are evaluated separately and jointly on the Active Vision Dataset and Matterport3D environments, demonstrating improved performance on both localization and navigation tasks.
Tasks	Semantic Segmentation
Published	2019-11-18
URL	https://arxiv.org/abs/1911.07980v1
PDF	https://arxiv.org/pdf/1911.07980v1.pdf
PWC	https://paperswithcode.com/paper/simultaneous-mapping-and-target-driven
Repo	https://github.com/ggeorgak11/mapping_navigation
Framework	pytorch

Content Enhanced BERT-based Text-to-SQL Generation


Title	Content Enhanced BERT-based Text-to-SQL Generation
Authors	Tong Guo, Huilin Gao
Abstract	We present a simple method to leverage table content in a BERT-based model for the text-to-SQL problem. Based on the observation that some of the table content and some of the table headers match words in the question string, we encode two additional feature vectors for the deep model. Our method also benefits model inference at test time, as the tables are almost the same at training and test time. We test our model on the WikiSQL dataset and outperform the BERT-based baseline by 3.7% in logic form accuracy and 3.7% in execution accuracy, achieving state-of-the-art results.
Tasks	Semantic Parsing, Text-To-Sql
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07179v4
PDF	https://arxiv.org/pdf/1910.07179v4.pdf
PWC	https://paperswithcode.com/paper/content-enhanced-bert-based-text-to-sql
Repo	https://github.com/guotong1988/NL2SQL-BERT
Framework	pytorch
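
The two match-based feature vectors mentioned in the abstract could look roughly like this. The exact matching rule and vector layout are assumptions: here one vector marks question tokens that appear in any table cell, and the other marks headers that share a word with the question. The table and question are made up.

```python
def match_features(question_tokens, headers, rows):
    """Binary match vectors over question tokens and over table headers."""
    cell_tokens = {
        tok.lower() for row in rows for cell in row for tok in str(cell).split()
    }
    question_set = {tok.lower() for tok in question_tokens}
    # 1 where a question token occurs somewhere in the table content.
    question_vector = [int(tok.lower() in cell_tokens) for tok in question_tokens]
    # 1 where a header shares at least one word with the question.
    header_vector = [
        int(any(tok.lower() in question_set for tok in header.split()))
        for header in headers
    ]
    return question_vector, header_vector


question = "what year did Alice win".split()
headers = ["Player", "Year", "Result"]
rows = [["Alice", 2018, "win"], ["Bob", 2019, "loss"]]
question_vector, header_vector = match_features(question, headers, rows)
```

In this toy case the tokens "Alice" and "win" are flagged in the question vector and the "Year" column is flagged in the header vector. Such vectors would then be fed to the model as extra inputs next to the BERT encoding.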

Earlier Isn’t Always Better: Sub-aspect Analysis on Corpus and System Biases in Summarization


Title	Earlier Isn’t Always Better: Sub-aspect Analysis on Corpus and System Biases in Summarization
Authors	Taehee Jung, Dongyeop Kang, Lucas Mentch, Eduard Hovy
Abstract	Despite the recent developments on neural summarization systems, the underlying logic behind the improvements from the systems and its corpus-dependency remains largely unexplored. Position of sentences in the original text, for example, is a well known bias for news summarization. Following in the spirit of the claim that summarization is a combination of sub-functions, we define three sub-aspects of summarization: position, importance, and diversity and conduct an extensive analysis of the biases of each sub-aspect with respect to the domain of nine different summarization corpora (e.g., news, academic papers, meeting minutes, movie script, books, posts). We find that while position exhibits substantial bias in news articles, this is not the case, for example, with academic papers and meeting minutes. Furthermore, our empirical study shows that different types of summarization systems (e.g., neural-based) are composed of different degrees of the sub-aspects. Our study provides useful lessons regarding consideration of underlying sub-aspects when collecting a new summarization dataset or developing a new system.
Tasks
Published	2019-08-30
URL	https://arxiv.org/abs/1908.11723v1
PDF	https://arxiv.org/pdf/1908.11723v1.pdf
PWC	https://paperswithcode.com/paper/earlier-isnt-always-better-sub-aspect
Repo	https://github.com/dykang/biassum
Framework	none

Interpretation of machine learning predictions for patient outcomes in electronic health records


Title	Interpretation of machine learning predictions for patient outcomes in electronic health records
Authors	William La Cava, Christopher Bauer, Jason H. Moore, Sarah A Pendergrass
Abstract	Electronic health records are an increasingly important resource for understanding the interactions between patient health, environment, and clinical decisions. In this paper we report an empirical study of predictive modeling of several patient outcomes using three state-of-the-art machine learning methods. Our primary goal is to validate the models by interpreting the importance of predictors in the final models. Central to interpretation is the use of feature importance scores, which vary depending on the underlying methodology. In order to assess feature importance, we compared univariate statistical tests, information-theoretic measures, permutation testing, and normalized coefficients from multivariate logistic regression models. In general we found poor correlation between methods in their assessment of feature importance, even when their performance is comparable and relatively good. However, permutation tests applied to random forest and gradient boosting models showed the most agreement, and the importance scores matched the clinical interpretation most frequently.
Tasks	Feature Importance
Published	2019-03-14
URL	http://arxiv.org/abs/1903.12074v1
PDF	http://arxiv.org/pdf/1903.12074v1.pdf
PWC	https://paperswithcode.com/paper/interpretation-of-machine-learning
Repo	https://github.com/EpistasisLab/interpret_ehr
Framework	none
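
Permutation-based feature importance, one of the measures compared in the abstract, can be sketched in a few lines. The toy threshold model and synthetic data below are stand-ins chosen for illustration; the study itself applies permutation tests to random forest and gradient boosting models on health-record data.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the given label."""
    hits = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return hits / len(rows)


def permutation_importance(model, rows, labels, feature, rng, repeats=20):
    """Mean drop in accuracy after shuffling one feature column."""
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(repeats):
        column = [row[feature] for row in rows]
        rng.shuffle(column)
        shuffled = [
            row[:feature] + [value] + row[feature + 1:]
            for row, value in zip(rows, column)
        ]
        drops.append(baseline - accuracy(model, shuffled, labels))
    return sum(drops) / repeats


def toy_model(row):
    """Hypothetical classifier that only looks at feature 0."""
    return int(row[0] > 0.5)


rng = random.Random(0)
rows = [[rng.random(), rng.random()] for _ in range(200)]
labels = [toy_model(row) for row in rows]
importance = [
    permutation_importance(toy_model, rows, labels, f, rng) for f in range(2)
]
```

Shuffling feature 0 breaks the link between input and label, so accuracy drops sharply. Shuffling feature 1 changes nothing, because the toy model never reads it, and its importance is exactly zero.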

AD3: Attentive Deep Document Dater


Title	AD3: Attentive Deep Document Dater
Authors	Swayambhu Nath Ray, Shib Sankar Dasgupta, Partha Talukdar
Abstract	Knowledge of the creation date of documents facilitates several tasks such as summarization, event extraction, temporally focused information extraction etc. Unfortunately, for most of the documents on the Web, the time-stamp metadata is either missing or can’t be trusted. Thus, predicting creation time from document content itself is an important task. In this paper, we propose Attentive Deep Document Dater (AD3), an attention-based neural document dating system which utilizes both context and temporal information in documents in a flexible and principled manner. We perform extensive experimentation on multiple real-world datasets to demonstrate the effectiveness of AD3 over neural and non-neural baselines.
Tasks
Published	2019-01-21
URL	http://arxiv.org/abs/1902.02161v1
PDF	http://arxiv.org/pdf/1902.02161v1.pdf
PWC	https://paperswithcode.com/paper/ad3-attentive-deep-document-dater
Repo	https://github.com/malllabiisc/AD3
Framework	tf