Paper Group AWR 22
Detect-to-Retrieve: Efficient Regional Aggregation for Image Search
Title | Detect-to-Retrieve: Efficient Regional Aggregation for Image Search |
Authors | Marvin Teichmann, Andre Araujo, Menglong Zhu, Jack Sim |
Abstract | Retrieving object instances among cluttered scenes efficiently requires compact yet comprehensive regional image representations. Intuitively, object semantics can help build the index that focuses on the most relevant regions. However, due to the lack of bounding-box datasets for objects of interest among retrieval benchmarks, most recent work on regional representations has focused on either uniform or class-agnostic region selection. In this paper, we first fill the void by providing a new dataset of landmark bounding boxes, based on the Google Landmarks dataset, that includes $86k$ images with manually curated boxes from $15k$ unique landmarks. Then, we demonstrate how a trained landmark detector, using our new dataset, can be leveraged to index image regions and improve retrieval accuracy while being much more efficient than existing regional methods. In addition, we introduce a novel regional aggregated selective match kernel (R-ASMK) to effectively combine information from detected regions into an improved holistic image representation. R-ASMK boosts image retrieval accuracy substantially with no dimensionality increase, while even outperforming systems that index image regions independently. Our complete image retrieval system improves upon the previous state-of-the-art by significant margins on the Revisited Oxford and Paris datasets. Code and data available at the project webpage: https://github.com/tensorflow/models/tree/master/research/delf. |
Tasks | Image Retrieval |
Published | 2018-12-04 |
URL | https://arxiv.org/abs/1812.01584v2 |
https://arxiv.org/pdf/1812.01584v2.pdf | |
PWC | https://paperswithcode.com/paper/detect-to-retrieve-efficient-regional |
Repo | https://github.com/tensorflow/models/tree/master/research/delf |
Framework | tf |
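The regional aggregation idea above lends itself to a compact illustration. Below is a minimal, hypothetical NumPy sketch of an R-ASMK-flavored pipeline: local descriptors from each detected region are assigned to a small visual codebook, residuals are summed across regions into a single per-word representation (no dimensionality increase), and two images are compared with a selective match kernel. The codebook size, descriptor dimension, and the `alpha` exponent are illustrative assumptions, not the paper's settings.

```python
# Toy NumPy sketch of R-ASMK-style regional aggregation (not the paper's code).
import numpy as np

def aggregate_regions(region_descriptors, codebook):
    """Aggregate descriptors from all detected regions into one per-visual-word
    residual representation (so the final size never grows with #regions)."""
    k, d = codebook.shape
    agg = np.zeros((k, d))
    for descs in region_descriptors:                  # (n_i, d) per region
        dists = ((descs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        words = dists.argmin(axis=1)                  # hard assignment
        for w in np.unique(words):
            agg[w] += (descs[words == w] - codebook[w]).sum(axis=0)
    norms = np.linalg.norm(agg, axis=1, keepdims=True)
    return agg / np.maximum(norms, 1e-12)             # per-word L2 normalization

def match_kernel(a, b, alpha=3.0):
    """Selective match kernel: power-law similarity over shared visual words."""
    sims = (a * b).sum(axis=1)
    return float(np.sum(np.sign(sims) * np.abs(sims) ** alpha))

rng = np.random.default_rng(0)
codebook = rng.normal(size=(32, 64))
img1 = [rng.normal(size=(50, 64)) for _ in range(3)]  # 3 detected regions
img2 = [rng.normal(size=(40, 64)) for _ in range(2)]
print(match_kernel(aggregate_regions(img1, codebook),
                   aggregate_regions(img2, codebook)))
```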
Bayesian CycleGAN via Marginalizing Latent Sampling
Title | Bayesian CycleGAN via Marginalizing Latent Sampling |
Authors | Haoran You, Yu Cheng, Tianheng Cheng, Chunliang Li, Pan Zhou |
Abstract | Recent techniques built on Generative Adversarial Networks (GANs), such as CycleGAN, are able to learn mappings between domains from unpaired datasets through min-max optimization games between generators and discriminators. However, it remains challenging to stabilize the training process and diversify the generated results. To address these problems, we present a Bayesian extension of the cyclic model and an integrated cyclic framework for inter-domain mappings. The proposed method, inspired by Bayesian GAN, explores the full posteriors of the Bayesian cyclic model (with latent sampling) and optimizes the model with maximum a posteriori (MAP) estimation. Hence, we name it Bayesian CycleGAN. We evaluate the proposed Bayesian CycleGAN on multiple benchmark datasets, including Cityscapes, Maps, and Monet2photo. The quantitative and qualitative evaluations demonstrate that the proposed method achieves more stable training, superior performance, and more diversified image generation. |
Tasks | Image-to-Image Translation |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07465v2 |
http://arxiv.org/pdf/1811.07465v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-cyclegan-via-marginalizing-latent |
Repo | https://github.com/ranery/Bayesian-CycleGAN |
Framework | pytorch |
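As a rough illustration of the MAP objective sketched in the abstract, the following hypothetical PyTorch snippet adds a Gaussian log-prior over generator weights to a cycle-consistency term and feeds a sampled latent map alongside the input image. The tiny generators, the prior weight `1e-4`, and the omission of the adversarial term are simplifications for brevity, not the authors' configuration.

```python
# Minimal MAP-flavored sketch: cycle loss minus a Gaussian log-prior over
# generator weights, with a sampled latent map concatenated to the input.
import torch
import torch.nn as nn

def gaussian_log_prior(module, sigma=1.0):
    """log p(theta) under an isotropic Gaussian prior (up to a constant)."""
    return -0.5 * sum((p ** 2).sum() for p in module.parameters()) / sigma ** 2

class LatentGenerator(nn.Module):
    """Generator that consumes the image concatenated with a sampled latent map."""
    def __init__(self, in_ch=3, z_ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch + z_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, in_ch, 3, padding=1), nn.Tanh())
    def forward(self, x):
        z = torch.randn(x.size(0), 1, x.size(2), x.size(3), device=x.device)
        return self.net(torch.cat([x, z], dim=1))

G, F = LatentGenerator(), LatentGenerator()
x = torch.rand(4, 3, 64, 64)                 # batch from domain A
cycle = nn.functional.l1_loss(F(G(x)), x)    # A -> B -> A reconstruction
# MAP estimation: minimize data terms minus the (weighted) log-prior
loss = cycle - 1e-4 * (gaussian_log_prior(G) + gaussian_log_prior(F))
loss.backward()
print(float(loss))
```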
A Review of Different Word Embeddings for Sentiment Classification using Deep Learning
Title | A Review of Different Word Embeddings for Sentiment Classification using Deep Learning |
Authors | Debadri Dutta |
Abstract | The web is loaded with textual content, and Natural Language Processing is one of the most vital fields in Machine Learning. But when the data is huge, simple Machine Learning algorithms cannot handle it, and that is where Deep Learning, which is based on Neural Networks, comes into play. However, since neural networks cannot process raw text, we have to convert the text through various word embedding strategies. This paper demonstrates those different word embedding strategies, implemented on an Amazon Review Dataset with two sentiments to be classified, Happy and Unhappy, based on numerous customer reviews. Moreover, we demonstrate the differences in accuracy, with a discussion of which word embedding to apply when. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2018-07-05 |
URL | http://arxiv.org/abs/1807.02471v1 |
http://arxiv.org/pdf/1807.02471v1.pdf | |
PWC | https://paperswithcode.com/paper/a-review-of-different-word-embeddings-for |
Repo | https://github.com/debadridtt/A-Review-of-Different-Word-Embeddings-for-Sentiment-Classification-using-Deep-Learning |
Framework | none |
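To make the comparison concrete, here is a small, hypothetical sketch contrasting two embedding strategies on toy review data: averaged Word2Vec vectors (learned with gensim) versus a plain bag-of-words baseline, each fed to a logistic regression classifier. The corpus, hyperparameters, and classifier choice are illustrative only, not the paper's experimental setup.

```python
# Hedged sketch: averaged Word2Vec vs. count-based features on toy review data
# (assumes gensim >= 4 and scikit-learn).
import numpy as np
from gensim.models import Word2Vec
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

reviews = ["great product works well", "terrible quality broke fast",
           "very happy with purchase", "unhappy waste of money"]
labels = [1, 0, 1, 0]                      # 1 = Happy, 0 = Unhappy
tokens = [r.split() for r in reviews]

# Strategy A: learn word vectors, represent a review by the mean vector.
w2v = Word2Vec(sentences=tokens, vector_size=25, window=3, min_count=1, epochs=100)
X_w2v = np.array([np.mean([w2v.wv[t] for t in toks], axis=0) for toks in tokens])

# Strategy B: plain bag-of-words counts as a baseline representation.
X_bow = CountVectorizer().fit_transform(reviews).toarray()

for name, X in [("word2vec-mean", X_w2v), ("bag-of-words", X_bow)]:
    acc = LogisticRegression(max_iter=1000).fit(X, labels).score(X, labels)
    print(f"{name}: train accuracy {acc:.2f}")
```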
Do-It-Yourself Single Camera 3D Pointer Input Device
Title | Do-It-Yourself Single Camera 3D Pointer Input Device |
Authors | Bernard Llanos, Yee-Hong Yang |
Abstract | We present a new algorithm for single camera 3D reconstruction, or 3D input for human-computer interfaces, based on precise tracking of an elongated object, such as a pen, having a pattern of colored bands. To configure the system, the user provides no more than one labelled image of a handmade pointer, measurements of its colored bands, and the camera’s pinhole projection matrix. Other systems are of much higher cost and complexity, requiring combinations of multiple cameras, stereo cameras, and pointers with sensors and lights. Instead of relying on information from multiple devices, we examine our single view more closely, integrating geometric and appearance constraints to robustly track the pointer in the presence of occlusion and distractor objects. By probing objects of known geometry with the pointer, we demonstrate acceptable accuracy of 3D localization. |
Tasks | 3D Reconstruction |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04704v1 |
http://arxiv.org/pdf/1809.04704v1.pdf | |
PWC | https://paperswithcode.com/paper/do-it-yourself-single-camera-3d-pointer-input |
Repo | https://github.com/bllanos/linear-probe |
Framework | none |
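The system's only camera calibration input is the pinhole projection matrix, so the core geometric primitive is back-projecting a detected pixel to a 3D viewing ray. A minimal sketch, assuming a generic 3x4 projection matrix `P` (the example intrinsics are made up, not the paper's calibration):

```python
# Back-project a pixel into a 3D viewing ray from the 3x4 projection matrix P.
import numpy as np

def backproject(P, u, v):
    """Return (camera_center, unit_ray_direction) for pixel (u, v)."""
    M, p4 = P[:, :3], P[:, 3]
    center = -np.linalg.solve(M, p4)        # camera center C satisfies P @ [C;1] = 0
    direction = np.linalg.solve(M, np.array([u, v, 1.0]))
    return center, direction / np.linalg.norm(direction)

# Example with an identity-rotation camera: P = K [I | 0].
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
C, d = backproject(P, 400, 300)
print(C, d)   # band edges on the pointer lie somewhere along C + t * d, t > 0
```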
Building a Word Segmenter for Sanskrit Overnight
Title | Building a Word Segmenter for Sanskrit Overnight |
Authors | Vikas Reddy, Amrith Krishna, Vishnu Dutt Sharma, Prateek Gupta, Vineeth M R, Pawan Goyal |
Abstract | There is an abundance of digitised texts available in Sanskrit. However, the word segmentation task in such texts is challenging due to the issue of ‘Sandhi’. In Sandhi, words in a sentence often fuse together to form a single chunk of text, where the word delimiter vanishes and sounds at the word boundaries undergo transformations, which is also reflected in the written text. Here, we propose an approach that uses a deep sequence-to-sequence (seq2seq) model that takes only the sandhied string as input and predicts the unsandhied string. The state-of-the-art models are linguistically involved and have external dependencies for the lexical and morphological analysis of the input. Our model can be trained “overnight” and used in production. In spite of this knowledge-lean approach, our system performs better than the current state of the art, with a 16.79% improvement over it. |
Tasks | Morphological Analysis |
Published | 2018-02-17 |
URL | http://arxiv.org/abs/1802.06185v1 |
http://arxiv.org/pdf/1802.06185v1.pdf | |
PWC | https://paperswithcode.com/paper/building-a-word-segmenter-for-sanskrit |
Repo | https://github.com/cvikasreddy/skt |
Framework | tf |
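A minimal, hypothetical PyTorch sketch of the character-level seq2seq setup described above: an LSTM encoder reads the sandhied string and an LSTM decoder emits the unsandhied string with teacher forcing. Vocabulary size, hidden sizes, and the random toy batch are placeholders; the paper's actual architecture and training details may differ.

```python
# Minimal character-level seq2seq sketch (illustrative, not the released model).
import torch
import torch.nn as nn

class CharSeq2Seq(nn.Module):
    def __init__(self, vocab, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, 64)
        self.enc = nn.LSTM(64, hidden, batch_first=True)
        self.dec = nn.LSTM(64, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, sandhied, unsandhied_in):
        _, state = self.enc(self.emb(sandhied))           # encode fused string
        dec_out, _ = self.dec(self.emb(unsandhied_in), state)  # teacher-forced
        return self.out(dec_out)

vocab = 60                                  # toy character inventory
model = CharSeq2Seq(vocab)
src = torch.randint(0, vocab, (8, 20))      # sandhied batch
tgt = torch.randint(0, vocab, (8, 24))      # unsandhied batch
logits = model(src, tgt[:, :-1])            # predict each next character
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab), tgt[:, 1:].reshape(-1))
loss.backward()
print(float(loss))
```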
Dealing with Unknown Unknowns: Identification and Selection of Minimal Sensing for Fractional Dynamics with Unknown Inputs
Title | Dealing with Unknown Unknowns: Identification and Selection of Minimal Sensing for Fractional Dynamics with Unknown Inputs |
Authors | Gaurav Gupta, Sergio Pequito, Paul Bogdan |
Abstract | This paper focuses on the analysis and design of time-varying complex networks having fractional order dynamics. These systems are key in modeling the complex dynamical processes arising in several natural and man-made systems. Notably, examples include neurophysiological signals such as the electroencephalogram (EEG), which captures the variation in potential fields, and the blood oxygenation level dependent (BOLD) signal, which serves as a proxy for neuronal activity. Notwithstanding, the complex networks obtained by locally measuring EEG and BOLD are often treated as isolated networks and do not capture the dependency on external stimuli, e.g., those originating in subcortical structures such as the thalamus and the brain stem. Therefore, we propose a paradigm shift towards the analysis of such complex networks under unknown unknowns (i.e., excitations). Consequently, the main contributions of the present paper are threefold: (i) we present an alternating scheme that enables us to determine the best estimate of the model parameters and unknown stimuli; (ii) we provide necessary and sufficient conditions to ensure that it is possible to retrieve the state and unknown stimuli; and (iii) upon these conditions, we determine a small subset of variables that need to be measured to ensure that both state and input can be recovered, while establishing sub-optimality guarantees with respect to the smallest possible subset. Finally, we present several pedagogical examples of the main results using real data collected from an EEG wearable device. |
Tasks | EEG |
Published | 2018-03-10 |
URL | http://arxiv.org/abs/1803.04866v2 |
http://arxiv.org/pdf/1803.04866v2.pdf | |
PWC | https://paperswithcode.com/paper/dealing-with-unknown-unknowns-identification |
Repo | https://github.com/gaurav71531/UUknowns |
Framework | none |
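To give a feel for contribution (i), here is a toy NumPy sketch of alternating estimation on a scalar fractional-order system with an unknown input, using Grünwald-Letnikov fractional differences. The ridge shrinkage on the input, the system order, and the signals are illustrative assumptions; the paper's algorithm, identifiability conditions, and guarantees are more general.

```python
# Toy alternating estimation for Delta^a x[k+1] = a x[k] + u[k] with unknown u
# (illustrative only; not the authors' algorithm or its guarantees).
import numpy as np

def gl_coeffs(alpha, n):
    """Grunwald-Letnikov fractional-difference coefficients psi_0..psi_{n-1}."""
    psi = np.ones(n)
    for j in range(1, n):
        psi[j] = psi[j - 1] * (j - 1 - alpha) / j
    return psi

def frac_diff(x, alpha):
    """Delta^alpha x[k] = sum_{j<=k} psi_j x[k-j]."""
    psi = gl_coeffs(alpha, len(x))
    return np.array([psi[: k + 1] @ x[k::-1] for k in range(len(x))])

alpha, a_true, T = 0.7, -0.5, 200
u_true = 0.3 * np.sin(0.1 * np.arange(T))            # the "unknown unknown"
x = np.zeros(T)
for k in range(T - 1):                                # simulate the system
    psi = gl_coeffs(alpha, k + 2)
    x[k + 1] = a_true * x[k] + u_true[k] - psi[1:] @ x[k::-1]

d = frac_diff(x, alpha)[1:]                           # Delta^a x[k+1] for all k
a_hat, u_hat, lam = 0.0, np.zeros(T - 1), 1.0
for _ in range(25):                                   # alternate: a-step, u-step
    a_hat = ((d - u_hat) @ x[:-1]) / (x[:-1] @ x[:-1])
    # a small-energy prior (ridge shrinkage) on u is needed for identifiability
    u_hat = (d - a_hat * x[:-1]) / (1.0 + lam)
print(f"a_true = {a_true}, a_hat = {a_hat:.3f}")
```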
Achievable Information Rates for Nonlinear Fiber Communication via End-to-end Autoencoder Learning
Title | Achievable Information Rates for Nonlinear Fiber Communication via End-to-end Autoencoder Learning |
Authors | Shen Li, Christian Häger, Nil Garcia, Henk Wymeersch |
Abstract | Machine learning is used to compute achievable information rates (AIRs) for a simplified fiber channel. The approach jointly optimizes the input distribution (constellation shaping) and the auxiliary channel distribution to compute AIRs without explicit channel knowledge in an end-to-end fashion. |
Tasks | |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07675v2 |
http://arxiv.org/pdf/1804.07675v2.pdf | |
PWC | https://paperswithcode.com/paper/achievable-information-rates-for-nonlinear |
Repo | https://github.com/henkwymeersch/AutoencoderFiber |
Framework | none |
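A hedged PyTorch sketch of the end-to-end idea: an encoder learns the constellation (shaping), a decoder acts as the auxiliary channel demapper, and the training cross-entropy yields an AIR estimate of roughly $\log_2 M - \mathrm{CE}/\ln 2$ bits per channel use. An AWGN stand-in replaces the paper's simplified fiber channel, and all sizes and hyperparameters are toy values.

```python
# End-to-end autoencoder sketch: learned constellation + demapper over an AWGN
# stand-in channel; cross-entropy gives an achievable-rate estimate.
import torch
import torch.nn as nn

M, n_sym = 16, 2                         # 16 messages, 2 real channel uses (I/Q)
enc = nn.Sequential(nn.Linear(M, 32), nn.ReLU(), nn.Linear(32, n_sym))
dec = nn.Sequential(nn.Linear(n_sym, 32), nn.ReLU(), nn.Linear(32, M))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

for step in range(2000):
    msgs = torch.randint(0, M, (256,))
    x = enc(nn.functional.one_hot(msgs, M).float())
    x = x / x.pow(2).sum(1, keepdim=True).sqrt().mean()   # average power ~ 1
    y = x + 0.1 * torch.randn_like(x)                     # AWGN stand-in channel
    ce = nn.functional.cross_entropy(dec(y), msgs)        # nats per symbol
    opt.zero_grad(); ce.backward(); opt.step()

air = torch.log2(torch.tensor(float(M))) - ce / torch.log(torch.tensor(2.0))
print(f"AIR estimate: {air.item():.2f} bits / channel use")
```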
DeepTract: A Probabilistic Deep Learning Framework for White Matter Fiber Tractography
Title | DeepTract: A Probabilistic Deep Learning Framework for White Matter Fiber Tractography |
Authors | Itay Benou, Tammy Riklin-Raviv |
Abstract | We present DeepTract, a deep-learning framework for estimating white matter fiber orientations and streamline tractography. We adopt a data-driven approach for fiber reconstruction from diffusion weighted images (DWI), which does not assume a specific diffusion model. We use a recurrent neural network for mapping sequences of DWI values into probabilistic fiber orientation distributions. Based on these estimations, our model facilitates both deterministic and probabilistic streamline tractography. We quantitatively evaluate our method using the Tractometer tool, demonstrating competitive performance with state-of-the-art classical and machine-learning-based tractography algorithms. We further present qualitative results of bundle-specific probabilistic tractography obtained using our method. The code is publicly available at: https://github.com/itaybenou/DeepTract.git. |
Tasks | White Matter Fiber Tractography |
Published | 2018-12-12 |
URL | https://arxiv.org/abs/1812.05129v3 |
https://arxiv.org/pdf/1812.05129v3.pdf | |
PWC | https://paperswithcode.com/paper/deeptract-a-probabilistic-deep-learning |
Repo | https://github.com/itaybenou/DeepTract |
Framework | none |
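A minimal, hypothetical sketch of the mapping described above: a GRU consumes the sequence of DWI values along a streamline and outputs a distribution over discrete orientation bins, from which the next tracking step can be taken deterministically (the mode) or probabilistically (a sample). Layer sizes and the number of orientation bins are placeholders, not the released DeepTract configuration.

```python
# GRU mapping DWI sequences to per-step fiber orientation distributions.
import torch
import torch.nn as nn

n_dwi, n_orient = 64, 724                 # e.g. 64 gradient directions, 724 bins
class TractRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(n_dwi, 256, batch_first=True)
        self.head = nn.Linear(256, n_orient)
    def forward(self, dwi_seq, h=None):
        out, h = self.rnn(dwi_seq, h)
        return self.head(out), h          # logits over orientation bins per step

model = TractRNN()
dwi_seq = torch.rand(1, 10, n_dwi)        # DWI values along a partial streamline
logits, h = model(dwi_seq)
probs = torch.softmax(logits[:, -1], dim=-1)
# deterministic tracking would take probs.argmax(); probabilistic samples a bin
next_dir = torch.multinomial(probs, 1)
print(next_dir.item())
```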
Complex-YOLO: Real-time 3D Object Detection on Point Clouds
Title | Complex-YOLO: Real-time 3D Object Detection on Point Clouds |
Authors | Martin Simon, Stefan Milz, Karl Amende, Horst-Michael Gross |
Abstract | Lidar-based 3D object detection is indispensable for autonomous driving, because it directly links to environmental understanding and therefore builds the base for prediction and motion planning. The capacity to infer highly sparse 3D data in real time is an ill-posed problem for lots of other application areas besides automated vehicles, e.g. augmented reality, personal robotics or industrial automation. We introduce Complex-YOLO, a state-of-the-art real-time 3D object detection network on point clouds only. In this work, we describe a network that expands YOLOv2, a fast 2D standard object detector for RGB images, by a specific complex regression strategy to estimate multi-class 3D boxes in Cartesian space. To this end, we propose a specific Euler-Region-Proposal Network (E-RPN) that estimates the pose of the object by adding an imaginary and a real fraction to the regression network. This yields a closed complex space and avoids the singularities that occur with single-angle estimation. The E-RPN also helps the network generalize well during training. Our experiments on the KITTI benchmark suite show that we outperform current leading methods for 3D object detection, specifically in terms of efficiency. We achieve state-of-the-art results for cars, pedestrians and cyclists while being more than five times faster than the fastest competitor. Further, our model is capable of estimating all eight KITTI classes, including vans, trucks and sitting pedestrians, simultaneously with high accuracy. |
Tasks | 3D Object Detection, Autonomous Driving, Motion Planning, Object Detection |
Published | 2018-03-16 |
URL | http://arxiv.org/abs/1803.06199v2 |
http://arxiv.org/pdf/1803.06199v2.pdf | |
PWC | https://paperswithcode.com/paper/complex-yolo-real-time-3d-object-detection-on |
Repo | https://github.com/ghimiredhikura/Complex-YOLO-V3 |
Framework | pytorch |
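The E-RPN's angle trick is easy to isolate: regress an imaginary and a real fraction per box and decode the yaw with `atan2`, which avoids the wrap-around singularity of direct single-angle regression. A small, hypothetical PyTorch sketch (the loss weighting and the remaining box terms of the full model are omitted):

```python
# Sketch of the E-RPN angle idea: regress (t_im, t_re) and decode yaw via atan2.
import torch

def angle_loss(t_im, t_re, gt_yaw):
    """Match the predicted complex number against (sin(yaw), cos(yaw));
    the decoded orientation is arg(t_re + i * t_im)."""
    pred_yaw = torch.atan2(t_im, t_re)
    loss = (t_im - torch.sin(gt_yaw)) ** 2 + (t_re - torch.cos(gt_yaw)) ** 2
    return loss.mean(), pred_yaw

t_im = torch.tensor([0.9, -0.1], requires_grad=True)
t_re = torch.tensor([0.1, -0.9], requires_grad=True)
gt = torch.tensor([1.47, -3.0])              # ground-truth yaws in radians
loss, yaw = angle_loss(t_im, t_re, gt)
loss.backward()
print(loss.item(), yaw.detach())
```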
Adversarial Robustness Toolbox v1.0.0
Title | Adversarial Robustness Toolbox v1.0.0 |
Authors | Maria-Irina Nicolae, Mathieu Sinn, Minh Ngoc Tran, Beat Buesser, Ambrish Rawat, Martin Wistuba, Valentina Zantedeschi, Nathalie Baracaldo, Bryant Chen, Heiko Ludwig, Ian M. Molloy, Ben Edwards |
Abstract | Adversarial Robustness Toolbox (ART) is a Python library supporting developers and researchers in defending Machine Learning models (Deep Neural Networks, Gradient Boosted Decision Trees, Support Vector Machines, Random Forests, Logistic Regression, Gaussian Processes, Decision Trees, Scikit-learn Pipelines, etc.) against adversarial threats, and it helps make AI systems more secure and trustworthy. Machine Learning models are vulnerable to adversarial examples, which are inputs (images, texts, tabular data, etc.) deliberately modified to produce a desired response by the Machine Learning model. ART provides the tools to build and deploy defences and test them with adversarial attacks. Defending Machine Learning models involves certifying and verifying model robustness and model hardening with approaches such as pre-processing inputs, augmenting training data with adversarial samples, and leveraging runtime detection methods to flag any inputs that might have been modified by an adversary. The attacks implemented in ART allow crafting adversarial examples against Machine Learning models, which is required to test defenses with state-of-the-art threat models. Supported Machine Learning libraries include TensorFlow (v1 and v2), Keras, PyTorch, MXNet, Scikit-learn, XGBoost, LightGBM, CatBoost, and GPy. The source code of ART is released with the MIT license at https://github.com/IBM/adversarial-robustness-toolbox. The release includes code examples, notebooks with tutorials and documentation (http://adversarial-robustness-toolbox.readthedocs.io). |
Tasks | Gaussian Processes, Time Series |
Published | 2018-07-03 |
URL | https://arxiv.org/abs/1807.01069v4 |
https://arxiv.org/pdf/1807.01069v4.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-robustness-toolbox-v040 |
Repo | https://github.com/sgxcj777/Adversarial-testing-toolbox-with-CLEVER |
Framework | none |
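A hedged usage sketch of the attack-then-evaluate workflow ART is built for, wrapping a toy PyTorch model and running the Fast Gradient Method. Module paths follow the ART >= 1.3 layout (earlier 1.x releases used `art.classifiers` and `art.attacks`); the model and data are stand-ins, not a real benchmark.

```python
# Wrap a toy classifier with ART and measure how FGM perturbations flip predictions.
import numpy as np
import torch.nn as nn, torch.optim as optim
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import FastGradientMethod

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
classifier = PyTorchClassifier(
    model=model,
    loss=nn.CrossEntropyLoss(),
    optimizer=optim.Adam(model.parameters(), lr=1e-3),
    input_shape=(1, 28, 28),
    nb_classes=10,
)

x = np.random.rand(8, 1, 28, 28).astype(np.float32)   # stand-in image batch
y = np.eye(10)[np.random.randint(0, 10, 8)].astype(np.float32)
classifier.fit(x, y, batch_size=8, nb_epochs=1)

attack = FastGradientMethod(estimator=classifier, eps=0.1)
x_adv = attack.generate(x=x)
print((classifier.predict(x).argmax(1) != classifier.predict(x_adv).argmax(1)).mean())
```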
Norm-Range Partition: A Universal Catalyst for LSH based Maximum Inner Product Search (MIPS)
Title | Norm-Range Partition: A Universal Catalyst for LSH based Maximum Inner Product Search (MIPS) |
Authors | Xiao Yan, Xinyan Dai, Jie Liu, Kaiwen Zhou, James Cheng |
Abstract | Recently, locality sensitive hashing (LSH) was shown to be effective for MIPS, and several algorithms, including $L_2$-ALSH, Sign-ALSH and Simple-LSH, have been proposed. In this paper, we introduce the norm-range partition technique, which partitions the original dataset into sub-datasets containing items with similar 2-norms and builds a hash index independently for each sub-dataset. We prove that norm-range partition reduces the query processing complexity for all existing LSH based MIPS algorithms under mild conditions. The key to the performance improvement is that norm-range partition allows using a smaller normalization factor for most sub-datasets. For efficient query processing, we also formulate a unified framework to rank the buckets from the hash indexes of different sub-datasets. Experiments on real datasets show that norm-range partition significantly reduces the number of probed items for LSH based MIPS algorithms when achieving the same recall. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09104v2 |
http://arxiv.org/pdf/1810.09104v2.pdf | |
PWC | https://paperswithcode.com/paper/norm-range-partition-a-universal-catalyst-for |
Repo | https://github.com/xinyandai/similarity-search |
Framework | none |
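The key mechanism is easy to sketch: sort items by 2-norm, split into norm ranges, and apply the Simple-LSH transform within each sub-dataset using its own, smaller maximum norm $M_i$. The hypothetical NumPy snippet below brute-forces the angular search that an LSH index would perform, purely to show that per-partition scores can be rescaled into a unified ranking:

```python
# Norm-range partitioning with a per-partition Simple-LSH transform.
import numpy as np

def simple_lsh_transform(X, M):
    """Map x -> [x/M ; sqrt(1 - ||x/M||^2)] onto the unit sphere."""
    Xs = X / M
    pad = np.sqrt(np.maximum(0.0, 1.0 - (Xs ** 2).sum(1)))
    return np.hstack([Xs, pad[:, None]])

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 32))
order = np.argsort(np.linalg.norm(X, axis=1))          # sort items by 2-norm
parts = np.array_split(order, 8)                       # 8 norm ranges

q = rng.normal(size=32)
q_unit = np.append(q / np.linalg.norm(q), 0.0)         # query transform [q/||q||; 0]
best_ip, best_idx = -np.inf, -1
for idx in parts:
    M_i = np.linalg.norm(X[idx], axis=1).max()         # local (smaller) max norm
    T = simple_lsh_transform(X[idx], M_i)
    # an angular LSH index would search T; rescaling by M_i * ||q|| makes the
    # cosine scores comparable across partitions (a unified ranking)
    scores = M_i * np.linalg.norm(q) * (T @ q_unit)
    j = scores.argmax()
    if scores[j] > best_ip:
        best_ip, best_idx = scores[j], idx[j]
print(best_idx, np.isclose(best_ip, (X @ q).max()))    # recovers the true MIPS item
```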
GraKeL: A Graph Kernel Library in Python
Title | GraKeL: A Graph Kernel Library in Python |
Authors | Giannis Siglidis, Giannis Nikolentzos, Stratis Limnios, Christos Giatsidis, Konstantinos Skianis, Michalis Vazirgiannis |
Abstract | The problem of accurately measuring the similarity between graphs is at the core of many applications in a variety of disciplines. Graph kernels have recently emerged as a promising approach to this problem. There are now many kernels, each focusing on different structural aspects of graphs. Here, we present GraKeL, a library that unifies several graph kernels into a common framework. The library is written in Python and adheres to the scikit-learn interface. It is simple to use and can be naturally combined with scikit-learn’s modules to build a complete machine learning pipeline for tasks such as graph classification and clustering. The code is BSD licensed and is available at: https://github.com/ysig/GraKeL . |
Tasks | Graph Classification |
Published | 2018-06-06 |
URL | https://arxiv.org/abs/1806.02193v2 |
https://arxiv.org/pdf/1806.02193v2.pdf | |
PWC | https://paperswithcode.com/paper/grakel-a-graph-kernel-library-in-python |
Repo | https://github.com/ysig/GraKeL |
Framework | none |
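A short usage sketch following GraKeL's documented scikit-learn-style workflow: compute a shortest-path kernel matrix on the MUTAG benchmark and feed it to a precomputed-kernel SVM. This assumes the dataset can be fetched at runtime; consult the library's documentation for the exact options of each kernel.

```python
# Graph classification with a GraKeL kernel plugged into scikit-learn.
from grakel.datasets import fetch_dataset
from grakel.kernels import ShortestPath
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

MUTAG = fetch_dataset("MUTAG", verbose=False)
G, y = MUTAG.data, MUTAG.target
G_tr, G_te, y_tr, y_te = train_test_split(G, y, test_size=0.2, random_state=0)

gk = ShortestPath(normalize=True)      # one of the library's unified kernels
K_tr = gk.fit_transform(G_tr)          # kernel matrix between training graphs
K_te = gk.transform(G_te)              # kernel between test and training graphs

clf = SVC(kernel="precomputed").fit(K_tr, y_tr)
print(accuracy_score(y_te, clf.predict(K_te)))
```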
Uncertainty in Neural Networks: Approximately Bayesian Ensembling
Title | Uncertainty in Neural Networks: Approximately Bayesian Ensembling |
Authors | Tim Pearce, Felix Leibfried, Alexandra Brintrup, Mohamed Zaki, Andy Neely |
Abstract | Understanding the uncertainty of a neural network’s (NN) predictions is essential for many purposes. The Bayesian framework provides a principled approach to this; however, applying it to NNs is challenging due to the large numbers of parameters and data. Ensembling NNs provides an easily implementable, scalable method for uncertainty quantification; however, it has been criticised for not being Bayesian. This work proposes one modification to the usual process that we argue does result in approximate Bayesian inference: regularising parameters about values drawn from a distribution which can be set equal to the prior. A theoretical analysis of the procedure in a simplified setting suggests the recovered posterior is centred correctly but tends to have an underestimated marginal variance and an overestimated correlation. However, two conditions can lead to exact recovery. We argue that these conditions are partially present in NNs. Empirical evaluations demonstrate it has an advantage over standard ensembling and is competitive with variational methods. |
Tasks | Bayesian Inference, Image Classification |
Published | 2018-10-12 |
URL | https://arxiv.org/abs/1810.05546v5 |
https://arxiv.org/pdf/1810.05546v5.pdf | |
PWC | https://paperswithcode.com/paper/uncertainty-in-neural-networks-bayesian |
Repo | https://github.com/TeaPearce/Bayesian_NN_Ensembles |
Framework | tf |
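The proposed modification is compact enough to sketch directly: each ensemble member draws anchor parameters from the prior and is regularised towards its own anchor rather than towards zero. A minimal, hypothetical PyTorch version on a 1D regression toy (the regularisation strength follows a $\sigma^2_{noise}/\sigma^2_{prior}$ scaling; the network and other settings are illustrative, not the paper's):

```python
# Anchored ensembling sketch: regularise each member towards prior-drawn anchors.
import torch
import torch.nn as nn

def make_member(prior_std=1.0):
    net = nn.Sequential(nn.Linear(1, 50), nn.ReLU(), nn.Linear(50, 1))
    for p in net.parameters():
        nn.init.normal_(p, std=prior_std)            # initialise from the prior
    anchor = [p.detach().clone() for p in net.parameters()]
    return net, anchor

def anchored_loss(net, anchor, x, y, data_noise=0.1, prior_std=1.0):
    lam = data_noise ** 2 / prior_std ** 2           # regularisation strength
    mse = nn.functional.mse_loss(net(x), y, reduction="sum")
    reg = sum(((p - a) ** 2).sum() for p, a in zip(net.parameters(), anchor))
    return (mse + lam * reg) / x.shape[0]

x = torch.linspace(-3, 3, 40).unsqueeze(1)
y = torch.sin(x) + 0.1 * torch.randn_like(x)
ensemble = [make_member() for _ in range(5)]
for net, anchor in ensemble:
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(500):
        opt.zero_grad()
        anchored_loss(net, anchor, x, y).backward()
        opt.step()
preds = torch.stack([net(x) for net, _ in ensemble])
print(preds.mean(0).shape, preds.std(0).mean())      # predictive mean and spread
```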
Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation
Title | Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation |
Authors | Mengyu Chu, You Xie, Jonas Mayer, Laura Leal-Taixé, Nils Thuerey |
Abstract | We focus on temporal self-supervision for GAN-based video generation tasks. While adversarial training successfully yields generative models for a variety of areas, temporal relationships in the generated data are much less explored. These are crucial for sequential generation tasks, e.g. video super-resolution and unpaired video translation. For the former, state-of-the-art methods often favor simpler norm losses such as $L^2$ over adversarial training. However, their averaging nature easily leads to temporally smooth results with an undesirable lack of spatial detail. For unpaired video translation, existing approaches modify the generator networks to form spatio-temporal cycle consistencies. In contrast, we focus on improving the learning objectives and propose a temporally self-supervised algorithm. For both tasks, we show that temporal adversarial learning is key to achieving temporally coherent solutions without sacrificing spatial detail. We also propose a novel Ping-Pong loss to improve long-term temporal consistency. It effectively prevents recurrent networks from accumulating artifacts temporally without suppressing detailed features. We also propose a first set of metrics to quantitatively evaluate the accuracy as well as the perceptual quality of the temporal evolution. A series of user studies confirm the rankings computed with these metrics. |
Tasks | Image Super-Resolution, Motion Compensation, Super-Resolution, Video Generation, Video Super-Resolution |
Published | 2018-11-23 |
URL | https://arxiv.org/abs/1811.09393v3 |
https://arxiv.org/pdf/1811.09393v3.pdf | |
PWC | https://paperswithcode.com/paper/temporally-coherent-gans-for-video-super |
Repo | https://github.com/zhusiling/TecoGAN |
Framework | pytorch |
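The Ping-Pong loss is simple to sketch: run a recurrent generator over the frame sequence forward and again over the reversed sequence, then penalise any mismatch between the two passes so that recurrently accumulated artifacts are suppressed. A toy, hypothetical PyTorch version with a stand-in one-layer generator:

```python
# Ping-Pong loss sketch: forward pass vs. reversed pass must agree frame-wise.
import torch
import torch.nn as nn

class RecurrentG(nn.Module):
    """Stand-in recurrent generator: output depends on the current frame and
    the previously generated frame (the source of accumulating artifacts)."""
    def __init__(self, ch=3):
        super().__init__()
        self.net = nn.Conv2d(2 * ch, ch, 3, padding=1)
    def forward(self, frames):                  # frames: (T, C, H, W)
        outs, prev = [], torch.zeros_like(frames[0])
        for f in frames:
            prev = self.net(torch.cat([f, prev], dim=0).unsqueeze(0)).squeeze(0)
            outs.append(prev)
        return torch.stack(outs)

G = RecurrentG()
frames = torch.rand(8, 3, 32, 32)               # a short toy clip
fwd = G(frames)
bwd = G(torch.flip(frames, dims=[0]))           # ping-pong: reversed ordering
pp_loss = nn.functional.l1_loss(fwd, torch.flip(bwd, dims=[0]))
pp_loss.backward()
print(float(pp_loss))
```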
Explainable Prediction of Medical Codes from Clinical Text
Title | Explainable Prediction of Medical Codes from Clinical Text |
Authors | James Mullenbach, Sarah Wiegreffe, Jon Duke, Jimeng Sun, Jacob Eisenstein |
Abstract | Clinical notes are text documents that are created by clinicians for each patient encounter. They are typically accompanied by medical codes, which describe the diagnosis and treatment. Annotating these codes is labor intensive and error prone; furthermore, the connection between the codes and the text is not annotated, obscuring the reasons and details behind specific diagnoses and treatments. We present an attentional convolutional network that predicts medical codes from clinical text. Our method aggregates information across the document using a convolutional neural network, and uses an attention mechanism to select the most relevant segments for each of the thousands of possible codes. The method is accurate, achieving precision@8 of 0.71 and a Micro-F1 of 0.54, which are both better than the prior state of the art. Furthermore, through an interpretability evaluation by a physician, we show that the attention mechanism identifies meaningful explanations for each code assignment. |
Tasks | |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1802.05695v2 |
http://arxiv.org/pdf/1802.05695v2.pdf | |
PWC | https://paperswithcode.com/paper/explainable-prediction-of-medical-codes-from |
Repo | https://github.com/jamesmullenbach/caml-mimic |
Framework | pytorch |
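A minimal, hypothetical PyTorch sketch of the described mechanism: a 1D convolution over token embeddings produces position-wise features, each label computes its own attention over positions, and a per-label linear layer turns the attended vector into a code probability. Sizes are toy values (real ICD label spaces run to thousands of codes).

```python
# Per-label attention over convolutional features for multi-label code prediction.
import torch
import torch.nn as nn

class LabelAttentionConv(nn.Module):
    def __init__(self, vocab=5000, emb=100, filters=50, n_labels=50):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, filters, kernel_size=3, padding=1)
        self.U = nn.Linear(filters, n_labels, bias=False)   # attention queries
        self.final = nn.Linear(filters, n_labels)           # per-label output

    def forward(self, tokens):                 # tokens: (B, N)
        H = torch.tanh(self.conv(self.emb(tokens).transpose(1, 2)))  # (B, F, N)
        alpha = torch.softmax(self.U(H.transpose(1, 2)), dim=1)      # (B, N, L)
        V = alpha.transpose(1, 2) @ H.transpose(1, 2)                # (B, L, F)
        logits = (self.final.weight * V).sum(-1) + self.final.bias   # (B, L)
        return torch.sigmoid(logits)

model = LabelAttentionConv()
probs = model(torch.randint(0, 5000, (2, 400)))
print(probs.shape)                             # (2, 50): one probability per code
```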