January 31, 2020

3520 words 17 mins read

Paper Group AWR 406


KLT Picker: Particle Picking Using Data-Driven Optimal Templates

Title KLT Picker: Particle Picking Using Data-Driven Optimal Templates
Authors Amitay Eldar, Boris Landa, Yoel Shkolnisky
Abstract Particle picking is currently a critical step in the cryo-EM single particle reconstruction pipeline. Despite extensive work on this problem, for many data sets it is still challenging, especially for low-SNR micrographs. We present the KLT (Karhunen-Loève Transform) picker, which is fully automatic and requires as input only the approximate particle size. In particular, it does not require any manual picking. Our method is designed especially to handle low-SNR micrographs. It is based on learning a set of optimal templates through multivariate statistical analysis via the Karhunen-Loève Transform. We evaluate the KLT picker on publicly available data sets and present high-quality results with minimal manual effort.
Tasks
Published 2019-12-12
URL https://arxiv.org/abs/1912.06500v1
PDF https://arxiv.org/pdf/1912.06500v1.pdf
PWC https://paperswithcode.com/paper/klt-picker-particle-picking-using-data-driven
Repo https://github.com/amitayeldar/KLTpicker
Framework none
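
The core idea, data-driven templates obtained as leading eigenvectors of the patch covariance, can be illustrated with a small NumPy sketch. This is a toy illustration of the KLT/PCA principle only; the patch size, patch count, and number of templates here are arbitrary, and the paper's treatment of noise statistics and template optimality is considerably more involved:

```python
import numpy as np

def klt_templates(micrograph, patch_size=32, n_templates=10,
                  n_patches=2000, seed=0):
    """Estimate templates as leading eigenvectors (KLT basis) of the
    empirical covariance of randomly sampled micrograph patches."""
    rng = np.random.default_rng(seed)
    h, w = micrograph.shape
    ys = rng.integers(0, h - patch_size, n_patches)
    xs = rng.integers(0, w - patch_size, n_patches)
    patches = np.stack([
        micrograph[y:y + patch_size, x:x + patch_size].ravel()
        for y, x in zip(ys, xs)
    ]).astype(np.float64)
    patches = patches - patches.mean(axis=0)    # center the data
    cov = patches.T @ patches / len(patches)    # empirical covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # ascending eigenvalues
    top = eigvecs[:, ::-1][:, :n_templates]     # leading KLT components
    return top.T.reshape(n_templates, patch_size, patch_size)

# Usage: templates = klt_templates(np.random.randn(512, 512))
```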

ALET (Automated Labeling of Equipment and Tools): A Dataset, a Baseline and a Usecase for Tool Detection in the Wild

Title ALET (Automated Labeling of Equipment and Tools): A Dataset, a Baseline and a Usecase for Tool Detection in the Wild
Authors Fatih Can Kurnaz, Burak Hocaoğlu, Mert Kaan Yılmaz, İdil Sülo, Sinan Kalkan
Abstract Robots collaborating with humans in realistic environments will need to be able to detect the tools that can be used and manipulated. However, there is no available dataset or study that addresses this challenge in real settings. In this paper, we fill this gap by providing an extensive dataset (METU-ALET) for detecting farming, gardening, office, stonemasonry, vehicle, woodworking and workshop tools. The scenes correspond to sophisticated environments with or without humans using the tools. The scenes we consider introduce several challenges for object detection, including the small scale of the tools, their articulated nature, occlusion, inter-class invariance, etc. Moreover, we train and compare several state-of-the-art deep object detectors (including Faster R-CNN, YOLO and RetinaNet) on our dataset. We observe that the detectors have difficulty detecting small-scale tools in particular, as well as tools that are visually similar to parts of other tools. This in turn underscores the importance of our dataset and paper. With the dataset, the code and the trained models, our work provides a basis for further research into tools and their use in robotics applications.
Tasks Object Detection
Published 2019-10-25
URL https://arxiv.org/abs/1910.11713v1
PDF https://arxiv.org/pdf/1910.11713v1.pdf
PWC https://paperswithcode.com/paper/alet-automated-labeling-of-equipment-and
Repo https://github.com/metu-kovan/METU-ALET
Framework none
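
Since the paper benchmarks off-the-shelf detectors, a natural starting point for reproducing a baseline is fine-tuning a pretrained detector on the tool categories. A minimal torchvision sketch follows; the class count is hypothetical and the paper's exact training setup lives in the repo (newer torchvision versions use a `weights=` argument instead of `pretrained=`):

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_tool_detector(num_classes):
    """Faster R-CNN with a ResNet-50 FPN backbone, its box head replaced
    for the tool categories (num_classes includes the background class)."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

# e.g. 49 tool categories + background (hypothetical count):
model = build_tool_detector(num_classes=50)
```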

Towards Compact ConvNets via Structure-Sparsity Regularized Filter Pruning

Title Towards Compact ConvNets via Structure-Sparsity Regularized Filter Pruning
Authors Shaohui Lin, Rongrong Ji, Yuchao Li, Cheng Deng, Xuelong Li
Abstract The success of convolutional neural networks (CNNs) in computer vision applications has been accompanied by a significant increase in computation and memory costs, which prohibits their use in resource-limited environments such as mobile or embedded devices. To this end, research on CNN compression has recently emerged. In this paper, we propose a novel filter pruning scheme, termed structured sparsity regularization (SSR), to simultaneously speed up the computation and reduce the memory overhead of CNNs, which can be well supported by various off-the-shelf deep learning libraries. Concretely, the proposed scheme incorporates two different regularizers of structured sparsity into the original objective function of filter pruning, which fully coordinates the global outputs and local pruning operations to adaptively prune filters. We further propose an Alternative Updating with Lagrange Multipliers (AULM) scheme to efficiently solve its optimization. AULM follows the principle of ADMM and alternates between promoting the structured sparsity of CNNs and optimizing the recognition loss, which leads to a very efficient solver (2.5x faster than the most recent work that directly solves the group sparsity-based regularization). Moreover, by imposing the structured sparsity, the online inference is extremely memory-light, since the number of filters and the output feature maps are simultaneously reduced. The proposed scheme has been deployed to a variety of state-of-the-art CNN structures including LeNet, AlexNet, VGG, ResNet and GoogLeNet over different datasets. Quantitative results demonstrate that the proposed scheme achieves superior performance over the state-of-the-art methods. We further demonstrate the proposed compression scheme for the task of transfer learning, including domain adaptation and object detection, which also shows exciting performance gains over the state of the art.
Tasks Domain Adaptation, Object Detection, Transfer Learning
Published 2019-01-23
URL http://arxiv.org/abs/1901.07827v2
PDF http://arxiv.org/pdf/1901.07827v2.pdf
PWC https://paperswithcode.com/paper/towards-compact-convnets-via-structure
Repo https://github.com/ShaohuiLin/SSR
Framework tf
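
The structured-sparsity idea, a group penalty over whole filters so that entire output channels can be driven to zero and pruned, can be sketched as a group-Lasso term in PyTorch. This is a simplified stand-in for the paper's two regularizers and its AULM solver, not the authors' implementation:

```python
import torch

def filter_group_lasso(conv_weight, eps=1e-12):
    """Group-Lasso over output filters: each filter (one output channel,
    shape in x kH x kW) is a group; the penalty zeroes whole filters."""
    flat = conv_weight.flatten(start_dim=1)          # (out_channels, in*kH*kW)
    return torch.sqrt((flat ** 2).sum(dim=1) + eps).sum()

# total_loss = task_loss + lam * sum(
#     filter_group_lasso(m.weight)
#     for m in model.modules() if isinstance(m, torch.nn.Conv2d))
```

Filters whose group norm falls below a small threshold after training can then be removed, which shrinks both the weight tensor and the corresponding output feature maps, matching the memory-light inference the abstract describes.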

Learning Robust Options by Conditional Value at Risk Optimization

Title Learning Robust Options by Conditional Value at Risk Optimization
Authors Takuya Hiraoka, Takahisa Imagawa, Tatsuya Mori, Takashi Onishi, Yoshimasa Tsuruoka
Abstract Options are generally learned by using an inaccurate environment model (or simulator), which contains uncertain model parameters. While there are several methods to learn options that are robust against the uncertainty of model parameters, these methods only consider either the worst case or the average (ordinary) case for learning options. This limited consideration of the cases often produces options that do not work well in the unconsidered case. In this paper, we propose a conditional value at risk (CVaR)-based method to learn options that work well in both the average and worst cases. We extend the CVaR-based policy gradient method proposed by Chow and Ghavamzadeh (2014) to deal with robust Markov decision processes and then apply the extended method to learning robust options. We conduct experiments to evaluate our method in multi-joint robot control tasks (HopperIceBlock, Half-Cheetah, and Walker2D). Experimental results show that our method produces options that 1) give better worst-case performance than the options learned only to minimize the average-case loss, and 2) give better average-case performance than the options learned only to minimize the worst-case loss.
Tasks
Published 2019-05-22
URL https://arxiv.org/abs/1905.09191v4
PDF https://arxiv.org/pdf/1905.09191v4.pdf
PWC https://paperswithcode.com/paper/learning-robust-options-by-conditional-value
Repo https://github.com/TakuyaHiraoka/Learning-Robust-Options-by-Conditional-Value-at-Risk-Optimization
Framework none
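
CVaR at level alpha is the expected loss over the worst alpha-fraction of outcomes, which is how the method interpolates between average-case and worst-case training. A small NumPy estimator shows the quantity being optimized; sampling policy losses over uncertain model parameters is assumed:

```python
import numpy as np

def cvar(losses, alpha=0.1):
    """Empirical CVaR_alpha: mean of the worst alpha-fraction of losses."""
    losses = np.sort(np.asarray(losses))[::-1]   # descending: worst first
    k = max(1, int(np.ceil(alpha * len(losses))))
    return losses[:k].mean()

# Example: losses of an option policy under sampled model parameters
rollout_losses = np.random.randn(1000) * 2.0 + 5.0
print(cvar(rollout_losses, alpha=0.05))  # expected loss in the worst 5% of models
```

Note that alpha=1 recovers the average-case loss and alpha near 0 approaches the worst case, which is the spectrum the paper's options are trained to cover.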

End-to-end Named Entity Recognition and Relation Extraction using Pre-trained Language Models

Title End-to-end Named Entity Recognition and Relation Extraction using Pre-trained Language Models
Authors John Giorgi, Xindi Wang, Nicola Sahar, Won Young Shin, Gary D. Bader, Bo Wang
Abstract Named entity recognition (NER) and relation extraction (RE) are two important tasks in information extraction and retrieval (IE & IR). Recent work has demonstrated that it is beneficial to learn these tasks jointly, which avoids the propagation of error inherent in pipeline-based systems and improves performance. However, state-of-the-art joint models typically rely on external natural language processing (NLP) tools, such as dependency parsers, limiting their usefulness to domains (e.g. news) where those tools perform well. The few neural, end-to-end models that have been proposed are trained almost completely from scratch. In this paper, we propose a neural, end-to-end model for jointly extracting entities and their relations which does not rely on external NLP tools and which integrates a large, pre-trained language model. Because the bulk of our model’s parameters are pre-trained and we eschew recurrence for self-attention, our model is fast to train. On 5 datasets across 3 domains, our model matches or exceeds state-of-the-art performance, sometimes by a large margin.
Tasks Language Modelling, Named Entity Recognition, Relation Extraction
Published 2019-12-20
URL https://arxiv.org/abs/1912.13415v1
PDF https://arxiv.org/pdf/1912.13415v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-named-entity-recognition-and-1
Repo https://github.com/bowang-lab/joint-ner-and-re
Framework none
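
A minimal sketch of the joint architecture the abstract describes: a pretrained transformer encodes the text once, and small task heads score entities and relations on the shared representations. The head sizes, label counts, and the simple pairwise-scoring scheme below are assumptions for illustration, not the authors' exact design:

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class JointNerRe(nn.Module):
    def __init__(self, encoder_name="bert-base-cased", n_ent=9, n_rel=5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.ner_head = nn.Linear(hidden, n_ent)       # per-token entity tags
        self.rel_head = nn.Linear(2 * hidden, n_rel)   # scores a token pair

    def forward(self, input_ids, attention_mask, head_idx, tail_idx):
        h = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state
        ner_logits = self.ner_head(h)                  # (batch, seq, n_ent)
        pair = torch.cat([h[:, head_idx], h[:, tail_idx]], dim=-1)
        rel_logits = self.rel_head(pair)               # (batch, n_rel)
        return ner_logits, rel_logits
```

Because the encoder is pretrained and attention-based, only the two small heads are trained from scratch, which is the source of the fast training the abstract claims.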

TreyNet: A Neural Model for Text Localization, Transcription and Named Entity Recognition in Full Pages

Title TreyNet: A Neural Model for Text Localization, Transcription and Named Entity Recognition in Full Pages
Authors Manuel Carbonell, Alicia Fornés, Mauricio Villegas, Josep Lladós
Abstract In the last years, the consolidation of deep neural network architectures for information extraction in document images has brought big improvements in the performance of each of the tasks involved in this process, consisting of text localization, transcription, and named entity recognition. However, this process is traditionally performed with separate methods for each task. In this work we propose an end-to-end model that jointly performs handwritten text detection, transcription, and named entity recognition at page level, capable of benefiting from shared features for these tasks. We exhaustively evaluate our approach on different datasets, discussing its advantages and limitations compared to sequential approaches.
Tasks Named Entity Recognition
Published 2019-12-20
URL https://arxiv.org/abs/1912.10016v1
PDF https://arxiv.org/pdf/1912.10016v1.pdf
PWC https://paperswithcode.com/paper/treynet-a-neural-model-for-text-localization
Repo https://github.com/omni-us/research-e2e-pagereader
Framework pytorch

MedCAT – Medical Concept Annotation Tool

Title MedCAT – Medical Concept Annotation Tool
Authors Zeljko Kraljevic, Daniel Bean, Aurelie Mascio, Lukasz Roguski, Amos Folarin, Angus Roberts, Rebecca Bendayan, Richard Dobson
Abstract Biomedical documents such as Electronic Health Records (EHRs) contain a large amount of information in an unstructured format. The data in EHRs is a hugely valuable resource documenting clinical narratives and decisions, but whilst the text can be easily understood by human doctors it is challenging to use in research and clinical applications. To uncover the potential of biomedical documents we need to extract and structure the information they contain. The task at hand is Named Entity Recognition and Linking (NER+L). The number of entities, the ambiguity of words, and overlapping and nested entities make the biomedical domain significantly more difficult than many others. To overcome these difficulties, we have developed the Medical Concept Annotation Tool (MedCAT), an open-source unsupervised approach to NER+L. MedCAT uses unsupervised machine learning to disambiguate entities. It was validated on MIMIC-III (a freely accessible critical care database) and MedMentions (biomedical papers annotated with mentions from the Unified Medical Language System). In the case of NER+L, the comparison with existing tools shows that MedCAT improves on the previous best using only unsupervised learning (F1=0.848 vs 0.691 for disease detection; F1=0.710 vs 0.222 for general concept detection). A qualitative analysis of the vector embeddings learnt by MedCAT shows that it captures latent medical knowledge available in EHRs (MIMIC-III). Unsupervised learning can improve the performance of large-scale entity extraction, but it has some limitations when working with only a couple of entities and a small dataset. In such cases the options are supervised learning or active learning, both of which are supported in MedCAT via the MedCATtrainer extension. Our approach can detect and link millions of different biomedical concepts with state-of-the-art performance, whilst being lightweight, fast and easy to use.
Tasks Active Learning, Entity Extraction, Named Entity Recognition
Published 2019-12-18
URL https://arxiv.org/abs/1912.10166v1
PDF https://arxiv.org/pdf/1912.10166v1.pdf
PWC https://paperswithcode.com/paper/medcat-medical-concept-annotation-tool
Repo https://github.com/CogStack/MedCAT
Framework none
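
The unsupervised disambiguation idea, linking a mention to the concept whose learned context vector best matches the mention's context, can be sketched with cosine similarity. The concept vectors, dimensionality, and threshold below are placeholders; MedCAT's actual context training and linking logic live in the repo:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def disambiguate(context_vec, candidate_concepts, min_sim=0.3):
    """Pick the candidate concept whose learned context vector is most
    similar to the mention's context; abstain below a threshold."""
    best_cui, best_sim = None, min_sim
    for cui, concept_vec in candidate_concepts.items():
        sim = cosine(context_vec, concept_vec)
        if sim > best_sim:
            best_cui, best_sim = cui, sim
    return best_cui

# Toy usage: two UMLS candidates for the ambiguous mention "cold"
rng = np.random.default_rng(0)
candidates = {"C0009443": rng.normal(size=300),   # common cold
              "C0009264": rng.normal(size=300)}   # cold temperature
print(disambiguate(rng.normal(size=300), candidates))
```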

ADDIS: an adaptive discarding algorithm for online FDR control with conservative nulls

Title ADDIS: an adaptive discarding algorithm for online FDR control with conservative nulls
Authors Jinjin Tian, Aaditya Ramdas
Abstract Major internet companies routinely perform tens of thousands of A/B tests each year. Such large-scale sequential experimentation has resulted in a recent spurt of new algorithms that can provably control the false discovery rate (FDR) in a fully online fashion. However, current state-of-the-art adaptive algorithms can suffer from a significant loss in power if null p-values are conservative (stochastically larger than the uniform distribution), a situation that occurs frequently in practice. In this work, we introduce a new adaptive discarding method called ADDIS that provably controls the FDR and achieves the best of both worlds: it enjoys appreciable power increase over all existing methods if nulls are conservative (the practical case), and rarely loses power if nulls are exactly uniformly distributed (the ideal case). We provide several practical insights on robust choices of tuning parameters, and extend the idea to asynchronous and offline settings as well.
Tasks
Published 2019-05-27
URL https://arxiv.org/abs/1905.11465v3
PDF https://arxiv.org/pdf/1905.11465v3.pdf
PWC https://paperswithcode.com/paper/addis-adaptive-algorithms-for-online-fdr
Repo https://github.com/JINJINT/ADDIS
Framework none
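
For reference, the two definitions at play above can be written out. With R(t) rejections and V(t) false rejections among the first t hypotheses, online FDR control requires the first quantity below to stay at or below a target level for every t, and a null p-value is called conservative when it is stochastically larger than uniform:

```latex
\mathrm{FDR}(t) = \mathbb{E}\!\left[\frac{V(t)}{\max(R(t),\,1)}\right],
\qquad
\Pr(p \le x) \le x \;\; \text{for all } x \in [0,1] \quad \text{(conservative null)}.
```

ADDIS exploits the second property by discarding very large p-values, which under conservativeness carry little evidence either way, so the testing budget is concentrated where discoveries are possible.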

Characterizing and Forecasting User Engagement with In-app Action Graph: A Case Study of Snapchat

Title Characterizing and Forecasting User Engagement with In-app Action Graph: A Case Study of Snapchat
Authors Yozen Liu, Xiaolin Shi, Lucas Pierce, Xiang Ren
Abstract While mobile social apps have become increasingly important in people’s daily life, we have limited understanding of what motivates users to engage with these apps. In this paper, we ask whether users’ in-app activity patterns help inform their future app engagement (e.g., active days in a future time window). Previous studies on predicting user app engagement mainly focus on various macroscopic features (e.g., time-series of activity frequency), while ignoring fine-grained inter-dependencies between different in-app actions at the microscopic level. Here we propose to formalize an individual user’s in-app action transition patterns as a temporally evolving action graph, and analyze its characteristics in terms of informing future user engagement. Our analysis suggests that action graphs are able to characterize user behavior patterns and inform future engagement. We derive a number of high-order graph features to capture in-app usage patterns and construct interpretable models for predicting trends of engagement changes and active rates. To further enhance predictive power, we design an end-to-end, multi-channel neural model to encode temporal action graphs, activity sequences, and other macroscopic features. Experiments on predicting user engagement for 150k Snapchat new users over a 28-day period demonstrate the effectiveness of the proposed models. The prediction framework is deployed at Snapchat to deliver real-world business insights. Our proposed framework is also general and can be applied to other social app platforms.
Tasks Time Series
Published 2019-06-02
URL https://arxiv.org/abs/1906.00355v1
PDF https://arxiv.org/pdf/1906.00355v1.pdf
PWC https://paperswithcode.com/paper/190600355
Repo https://github.com/INK-USC/temporal-gcn-lstm
Framework pytorch
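
The core representation, a user's in-app actions as a weighted transition graph, is simple to construct. A sketch with hypothetical action names; the graph features shown are illustrative, not the paper's exact feature set:

```python
from collections import Counter
from itertools import pairwise  # Python 3.10+

def action_graph(events):
    """Weighted transition graph: edge (a, b) counts how often action b
    immediately followed action a in the user's event sequence."""
    return Counter(pairwise(events))

events = ["open", "chat", "snap", "chat", "story", "chat", "snap"]
graph = action_graph(events)
print(graph[("chat", "snap")])   # 2: "snap" followed "chat" twice

# A simple higher-order feature for the engagement model:
out_degree = Counter(a for a, _ in graph)
print(out_degree["chat"])        # number of distinct successors of "chat"
```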

Learning Pairwise Relationship for Multi-object Detection in Crowded Scenes

Title Learning Pairwise Relationship for Multi-object Detection in Crowded Scenes
Authors Yu Liu, Lingqiao Liu, Hamid Rezatofighi, Thanh-Toan Do, Qinfeng Shi, Ian Reid
Abstract As the post-processing step for object detection, non-maximum suppression (GreedyNMS) has been widely used in most detectors for many years. It is efficient and accurate for sparse scenes, but suffers from an inevitable trade-off between precision and recall in crowded scenes. To overcome this drawback, we propose a Pairwise-NMS to cure GreedyNMS. Specifically, a deep pairwise-relationship network is trained to predict whether two overlapping proposal boxes contain two objects or zero/one object, which can handle multiple overlapping objects effectively. Through neatly coupling with GreedyNMS without losing efficiency, consistent improvements have been achieved on heavily occluded datasets including MOT15, TUD-Crossing and PETS. In addition, Pairwise-NMS can be integrated into any learning-based detector (both Faster-RCNN and DPM detectors are tested in this paper), thus building a bridge between GreedyNMS and end-to-end learning detectors.
Tasks Object Detection
Published 2019-01-12
URL http://arxiv.org/abs/1901.03796v1
PDF http://arxiv.org/pdf/1901.03796v1.pdf
PWC https://paperswithcode.com/paper/learning-pairwise-relationship-for-multi
Repo https://github.com/UniLauX/Pairwise-NMS
Framework none
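
How the pairwise predictor plugs into GreedyNMS: instead of unconditionally suppressing a high-IoU neighbor of a kept box, the suppression decision is delegated to the learned pairwise relation. In the sketch below, `pairwise_two_objects` is a stand-in predicate for the paper's trained network:

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def pairwise_nms(boxes, scores, pairwise_two_objects, iou_thr=0.5):
    """GreedyNMS, except overlapping pairs predicted to contain two
    distinct objects are both kept."""
    order = np.argsort(scores)[::-1]      # highest score first
    keep = []
    for i in order:
        suppressed = any(
            iou(boxes[i], boxes[j]) > iou_thr
            and not pairwise_two_objects(boxes[i], boxes[j])
            for j in keep
        )
        if not suppressed:
            keep.append(i)
    return keep
```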

Fair Regression: Quantitative Definitions and Reduction-based Algorithms

Title Fair Regression: Quantitative Definitions and Reduction-based Algorithms
Authors Alekh Agarwal, Miroslav Dudík, Zhiwei Steven Wu
Abstract In this paper, we study the prediction of a real-valued target, such as a risk score or recidivism rate, while guaranteeing a quantitative notion of fairness with respect to a protected attribute such as gender or race. We call this class of problems \emph{fair regression}. We propose general schemes for fair regression under two notions of fairness: (1) statistical parity, which asks that the prediction be statistically independent of the protected attribute, and (2) bounded group loss, which asks that the prediction error restricted to any protected group remain below some pre-determined level. While we only study these two notions of fairness, our schemes are applicable to arbitrary Lipschitz-continuous losses, and so they encompass least-squares regression, logistic regression, quantile regression, and many other tasks. Our schemes only require access to standard risk minimization algorithms (such as standard classification or least-squares regression) while providing theoretical guarantees on the optimality and fairness of the obtained solutions. In addition to analyzing theoretical properties of our schemes, we empirically demonstrate their ability to uncover fairness–accuracy frontiers on several standard datasets.
Tasks
Published 2019-05-30
URL https://arxiv.org/abs/1905.12843v1
PDF https://arxiv.org/pdf/1905.12843v1.pdf
PWC https://paperswithcode.com/paper/fair-regression-quantitative-definitions-and
Repo https://github.com/Microsoft/fairlearn
Framework none
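
The bounded-group-loss notion is easy to state in code: the average loss restricted to every protected group must stay below a preset level zeta. A NumPy sketch, where zeta and the squared loss are illustrative choices rather than the paper's fixed ones:

```python
import numpy as np

def bounded_group_loss_ok(y_true, y_pred, groups, zeta=0.1):
    """Check the bounded-group-loss constraint: mean squared error
    within every protected group must be <= zeta."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    for g in np.unique(groups):
        mask = groups == g
        group_mse = np.mean((y_true[mask] - y_pred[mask]) ** 2)
        if group_mse > zeta:
            return False
    return True

# A reductions-style scheme searches over candidate predictors produced
# by a standard regression oracle, keeping only those satisfying this check.
```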

Information Competing Process for Learning Diversified Representations

Title Information Competing Process for Learning Diversified Representations
Authors Jie Hu, Rongrong Ji, ShengChuan Zhang, Xiaoshuai Sun, Qixiang Ye, Chia-Wen Lin, Qi Tian
Abstract Learning representations with diversified information remains an open problem. Towards learning diversified representations, a new approach, termed Information Competing Process (ICP), is proposed in this paper. Aiming to enrich the information carried by feature representations, ICP separates a representation into two parts with different mutual information constraints. The separated parts are forced to accomplish the downstream task independently in a competitive environment, which prevents each part from learning what the other has learned for the downstream task. Such competing parts are then combined synergistically to complete the task. By fusing representation parts learned competitively under different conditions, ICP facilitates obtaining diversified representations which contain rich information. Experiments on image classification and image reconstruction tasks demonstrate the great potential of ICP to learn discriminative and disentangled representations in both supervised and self-supervised learning settings.
Tasks Image Classification, Image Reconstruction
Published 2019-06-04
URL https://arxiv.org/abs/1906.01288v3
PDF https://arxiv.org/pdf/1906.01288v3.pdf
PWC https://paperswithcode.com/paper/information-competing-process-for-learning
Repo https://github.com/hujiecpp/InformationCompetingProcess
Framework pytorch
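
The competing-then-combining structure can be sketched in a few lines of PyTorch: the representation is split in two, each half must solve the task alone, and their concatenation solves it again. Layer sizes and the equal loss weighting are assumptions, and the paper's mutual-information constraints and competition mechanism are omitted here:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICPSketch(nn.Module):
    def __init__(self, in_dim=784, half=64, n_classes=10):
        super().__init__()
        self.enc1 = nn.Linear(in_dim, half)      # representation part 1
        self.enc2 = nn.Linear(in_dim, half)      # representation part 2
        self.head1 = nn.Linear(half, n_classes)
        self.head2 = nn.Linear(half, n_classes)
        self.head_joint = nn.Linear(2 * half, n_classes)

    def losses(self, x, y):
        z1, z2 = torch.relu(self.enc1(x)), torch.relu(self.enc2(x))
        # Competition: each part must solve the task on its own.
        l1 = F.cross_entropy(self.head1(z1), y)
        l2 = F.cross_entropy(self.head2(z2), y)
        # Synergy: the combined parts solve the task together.
        lj = F.cross_entropy(self.head_joint(torch.cat([z1, z2], dim=1)), y)
        return l1 + l2 + lj
```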

Network connectivity dynamics affect the evolution of culturally transmitted variants

Title Network connectivity dynamics affect the evolution of culturally transmitted variants
Authors José Segovia Martín, Bradley Walker, Nicolas Fay, Monica Tamariz
Abstract The distribution of cultural variants in a population is shaped by both neutral evolutionary dynamics and by selection pressures, which include several individual cognitive biases, demographic factors and social network structures. The temporal dynamics of social network connectivity, i.e. the order in which individuals in a population interact with each other, has been largely unexplored. In this paper we investigate how, in a fully connected social network, connectivity dynamics, alone and in interaction with different cognitive biases, affect the evolution of cultural variants. Using agent-based computer simulations, we manipulate population connectivity dynamics (early, middle and late full-population connectivity); content bias, or a preference for high-quality variants; coordination bias, or whether agents tend to use self-produced variants (egocentric bias), or to switch to variants observed in others (allocentric bias); and memory size, or the number of items that agents can store in their memory. We show that connectivity dynamics affect the time-course of variant spread, with lower connectivity slowing down convergence of the population onto a single cultural variant. We also show that, compared to a neutral evolutionary model, content bias accelerates convergence and amplifies the effects of connectivity dynamics, whilst larger memory size and coordination bias, especially egocentric bias, slow down convergence. Furthermore, connectivity dynamics affect the frequency of high quality variants (adaptiveness), with late connectivity populations showing bursts of rapid change in adaptiveness followed by periods of relatively slower change, and early connectivity populations following a single-peak evolutionary dynamic. In this way, we provide for the first time a direct connection between the order of agents’ interactions and punctuational evolution.
Tasks
Published 2019-02-09
URL http://arxiv.org/abs/1902.06598v1
PDF http://arxiv.org/pdf/1902.06598v1.pdf
PWC https://paperswithcode.com/paper/network-connectivity-dynamics-affect-the
Repo https://github.com/jsegoviamartin/network_connectivity_dynamics_model
Framework none
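
A toy agent-based version of the setup: agents hold variants in a bounded memory, and at each interaction either keep a self-produced variant or adopt an observed one, with an optional preference for higher-quality variants (content bias). All parameters here are illustrative, not the paper's:

```python
import random

def simulate(n_agents=20, steps=2000, memory=5, content_bias=0.7, seed=1):
    """Fully connected population; each step one agent observes a random
    partner's variant and, under content bias, adopts it only when its
    quality is higher; otherwise adopts at random."""
    rng = random.Random(seed)
    quality = {v: rng.random() for v in range(50)}   # variant -> quality
    mem = [[rng.randrange(50) for _ in range(memory)]
           for _ in range(n_agents)]
    for _ in range(steps):
        a, b = rng.sample(range(n_agents), 2)
        own, seen = rng.choice(mem[a]), rng.choice(mem[b])
        if rng.random() < content_bias:
            adopted = seen if quality[seen] > quality[own] else own
        else:
            adopted = rng.choice([own, seen])
        mem[a] = (mem[a] + [adopted])[-memory:]      # bounded memory
    # Convergence: share of agents whose modal variant is the population mode
    modes = [max(set(m), key=m.count) for m in mem]
    top = max(set(modes), key=modes.count)
    return modes.count(top) / n_agents

print(simulate())
```

Manipulating when subgroups of agents become mutually reachable (the paper's early/middle/late connectivity conditions) and sweeping `content_bias` and `memory` reproduces the kind of convergence-time comparisons described above.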

AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data

Title AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data
Authors Liheng Zhang, Guo-Jun Qi, Liqiang Wang, Jiebo Luo
Abstract The success of deep neural networks often relies on a large amount of labeled examples, which can be difficult to obtain in many real scenarios. To address this challenge, unsupervised methods are strongly preferred for training neural networks without using any labeled data. In this paper, we present a novel paradigm of unsupervised representation learning by Auto-Encoding Transformation (AET) in contrast to the conventional Auto-Encoding Data (AED) approach. Given a randomly sampled transformation, AET seeks to predict it merely from the encoded features as accurately as possible at the output end. The idea is the following: as long as the unsupervised features successfully encode the essential information about the visual structures of original and transformed images, the transformation can be well predicted. We will show that this AET paradigm allows us to instantiate a large variety of transformations, from parameterized to non-parameterized and GAN-induced ones. Our experiments show that AET greatly improves over existing unsupervised approaches, setting new state-of-the-art performance that is substantially closer to the upper bounds set by fully supervised counterparts on the CIFAR-10, ImageNet and Places datasets.
Tasks Representation Learning, Unsupervised Representation Learning
Published 2019-01-14
URL http://arxiv.org/abs/1901.04596v2
PDF http://arxiv.org/pdf/1901.04596v2.pdf
PWC https://paperswithcode.com/paper/aet-vs-aed-unsupervised-representation
Repo https://github.com/maple-research-lab/AET
Framework pytorch
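
The AET objective in miniature: sample a transformation, encode both the original and the transformed image, and regress the transformation parameters from the pair of encodings. The sketch below uses a single rotation angle and a trivial encoder; the paper covers much richer transformation families and architectures:

```python
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU())
decoder = nn.Linear(2 * 128, 1)   # predicts the (normalized) rotation angle

def aet_loss(images):
    """Auto-encode the transformation: predict the sampled angle from the
    encodings of the original and transformed images."""
    angle = torch.empty(images.size(0)).uniform_(-180, 180)
    transformed = torch.stack([TF.rotate(img, a.item())
                               for img, a in zip(images, angle)])
    z = torch.cat([encoder(images), encoder(transformed)], dim=1)
    return nn.functional.mse_loss(decoder(z).squeeze(1), angle / 180.0)

loss = aet_loss(torch.randn(8, 3, 32, 32))
loss.backward()   # gradients flow into the encoder, no labels needed
```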

deepCR: Cosmic Ray Rejection with Deep Learning

Title deepCR: Cosmic Ray Rejection with Deep Learning
Authors Keming Zhang, Joshua S. Bloom
Abstract Cosmic ray (CR) identification and replacement are critical components of imaging and spectroscopic reduction pipelines involving solid-state detectors. We present deepCR, a deep learning based framework for CR identification and subsequent image inpainting based on the predicted CR mask. To demonstrate the effectiveness of this framework, we train and evaluate models on Hubble Space Telescope ACS/WFC images of sparse extragalactic fields, globular clusters, and resolved galaxies. We demonstrate that at a false positive rate of 0.5%, deepCR achieves close to 100% detection rates in both extragalactic and globular cluster fields, and 91% in resolved galaxy fields, which is a significant improvement over the current state-of-the-art method LACosmic. Compared to a multicore CPU implementation of LACosmic, deepCR CR mask predictions run up to 6.5 times faster on CPU and 90 times faster on a single GPU. For image inpainting, the mean squared errors of deepCR predictions are 20 times lower in globular cluster fields, 5 times lower in resolved galaxy fields, and 2.5 times lower in extragalactic fields, compared to the best performing non-neural technique tested. We present our framework and the trained models as an open-source Python project, with a simple-to-use API. To facilitate reproducibility of the results we also provide a benchmarking codebase.
Tasks Image Inpainting
Published 2019-07-22
URL https://arxiv.org/abs/1907.09500v2
PDF https://arxiv.org/pdf/1907.09500v2.pdf
PWC https://paperswithcode.com/paper/deepcr-cosmic-ray-rejection-with-deep
Repo https://github.com/kmzzhang/deepCR-paper
Framework pytorch
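
Typical usage follows the two-stage design described in the abstract: predict a cosmic-ray mask, then inpaint the masked pixels. The call pattern below follows my reading of the project README at the time; treat the model identifiers and exact signature as assumptions and verify against the repository:

```python
import numpy as np
from deepCR import deepCR

image = np.random.rand(256, 256).astype("float32")  # stand-in for an ACS/WFC frame

# Mask and inpainting models trained on HST ACS/WFC F606W data
# (model names per the README; check the current repo before use).
mdl = deepCR(mask="ACS-WFC-F606W-2-32", inpaint="ACS-WFC-F606W-2-32",
             device="CPU")

# threshold binarizes the predicted cosmic-ray probability map
mask, cleaned_image = mdl.clean(image, threshold=0.5)
```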