October 21, 2019

3011 words 15 mins read

Paper Group AWR 22

Detect-to-Retrieve: Efficient Regional Aggregation for Image Search. Bayesian CycleGAN via Marginalizing Latent Sampling. A Review of Different Word Embeddings for Sentiment Classification using Deep Learning. Do-It-Yourself Single Camera 3D Pointer Input Device. Building a Word Segmenter for Sanskrit Overnight. Dealing with Unknown Unknowns: Ident …

Detect-to-Retrieve: Efficient Regional Aggregation for Image Search

Title Detect-to-Retrieve: Efficient Regional Aggregation for Image Search
Authors Marvin Teichmann, Andre Araujo, Menglong Zhu, Jack Sim
Abstract Retrieving object instances among cluttered scenes efficiently requires compact yet comprehensive regional image representations. Intuitively, object semantics can help build the index that focuses on the most relevant regions. However, due to the lack of bounding-box datasets for objects of interest among retrieval benchmarks, most recent work on regional representations has focused on either uniform or class-agnostic region selection. In this paper, we first fill the void by providing a new dataset of landmark bounding boxes, based on the Google Landmarks dataset, that includes $86k$ images with manually curated boxes from $15k$ unique landmarks. Then, we demonstrate how a trained landmark detector, using our new dataset, can be leveraged to index image regions and improve retrieval accuracy while being much more efficient than existing regional methods. In addition, we introduce a novel regional aggregated selective match kernel (R-ASMK) to effectively combine information from detected regions into an improved holistic image representation. R-ASMK boosts image retrieval accuracy substantially with no dimensionality increase, while even outperforming systems that index image regions independently. Our complete image retrieval system improves upon the previous state-of-the-art by significant margins on the Revisited Oxford and Paris datasets. Code and data available at the project webpage: https://github.com/tensorflow/models/tree/master/research/delf.
Tasks Image Retrieval
Published 2018-12-04
URL https://arxiv.org/abs/1812.01584v2
PDF https://arxiv.org/pdf/1812.01584v2.pdf
PWC https://paperswithcode.com/paper/detect-to-retrieve-efficient-regional
Repo https://github.com/tensorflow/models/tree/master/research/delf
Framework tf
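
To make the regional aggregation idea concrete, here is a minimal numpy sketch of R-ASMK-style pooling: local descriptors falling inside each detected landmark box are aggregated as VLAD-style residuals against a codebook, summed across regions, and binarized. The function name is hypothetical, and ASMK's per-word normalization and selective matching are omitted, so this is a simplification rather than the paper's exact kernel (see the DELF repo for that).

```python
import numpy as np
from scipy.spatial.distance import cdist

def regional_asmk(local_feats, locations, boxes, codebook):
    """Simplified R-ASMK-style aggregation (illustrative only).

    local_feats: (N, D) local descriptors; locations: (N, 2) their x/y positions;
    boxes: list of (x0, y0, x1, y1) detected landmark boxes; codebook: (K, D) visual words.
    """
    agg = np.zeros_like(codebook)
    for x0, y0, x1, y1 in boxes:
        inside = ((locations[:, 0] >= x0) & (locations[:, 0] <= x1) &
                  (locations[:, 1] >= y0) & (locations[:, 1] <= y1))
        feats = local_feats[inside]
        if len(feats) == 0:
            continue
        words = cdist(feats, codebook).argmin(axis=1)   # assign to nearest visual word
        for f, w in zip(feats, words):
            agg[w] += f - codebook[w]                   # VLAD-style residual
    # Binarize the summed residuals, as in ASMK; note the image-level
    # representation stays (K, D), i.e. no dimensionality increase.
    return np.sign(agg)
```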

Bayesian CycleGAN via Marginalizing Latent Sampling

Title Bayesian CycleGAN via Marginalizing Latent Sampling
Authors Haoran You, Yu Cheng, Tianheng Cheng, Chunliang Li, Pan Zhou
Abstract Recent techniques built on Generative Adversarial Networks (GANs), like CycleGAN, are able to learn mappings between domains from unpaired datasets through min-max optimization games between generators and discriminators. However, it remains challenging to stabilize the training process and diversify generated results. To address these problems, we present a Bayesian extension of the cyclic model and an integrated cyclic framework for inter-domain mappings. The proposed method, inspired by Bayesian GAN, explores the full posteriors of the Bayesian cyclic model (with latent sampling) and optimizes the model with maximum a posteriori (MAP) estimation. Hence, we name it {\tt Bayesian CycleGAN}. We evaluate the proposed Bayesian CycleGAN on multiple benchmark datasets, including Cityscapes, Maps, and Monet2photo. The quantitative and qualitative evaluations demonstrate that the proposed method achieves more stable training, superior performance, and more diversified image generation.
Tasks Image-to-Image Translation
Published 2018-11-19
URL http://arxiv.org/abs/1811.07465v2
PDF http://arxiv.org/pdf/1811.07465v2.pdf
PWC https://paperswithcode.com/paper/bayesian-cyclegan-via-marginalizing-latent
Repo https://github.com/ranery/Bayesian-CycleGAN
Framework pytorch
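
A hedged sketch of the core objective: the cycle-consistency loss is marginalized over latent samples and combined with a Gaussian prior on the generator weights, giving a MAP estimate. The generator interface G(x, z) and the weighting constants are assumptions for illustration, not the authors' exact training loop.

```python
import torch

def map_cycle_loss(G_xy, G_yx, x, n_samples=3, prior_std=1.0):
    """Illustrative MAP objective: Monte Carlo marginalization of the cycle loss
    over latent samples z, plus a Gaussian log-prior on generator weights.
    G_xy / G_yx are assumed to take an extra noise tensor z."""
    recon = 0.0
    for _ in range(n_samples):                  # marginalize over latent samples
        z = torch.randn(x.size(0), 8, device=x.device)
        recon = recon + (G_yx(G_xy(x, z), z) - x).abs().mean()
    recon = recon / n_samples
    # Gaussian prior on weights: MAP = likelihood term + log-prior term.
    prior = sum(p.pow(2).sum() for p in G_xy.parameters()) / (2 * prior_std ** 2)
    return recon + 1e-4 * prior                 # prior weight is an assumption
```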

A Review of Different Word Embeddings for Sentiment Classification using Deep Learning

Title A Review of Different Word Embeddings for Sentiment Classification using Deep Learning
Authors Debadri Dutta
Abstract The web is loaded with textual content, and Natural Language Processing is one of the most important fields in Machine Learning. But when data is huge, simple Machine Learning algorithms are not able to handle it, and that is when Deep Learning, which is based on Neural Networks, comes into play. However, since neural networks cannot process raw text, we have to convert it through various word embedding strategies. This paper demonstrates those different word embedding strategies implemented on an Amazon Review Dataset, which has two sentiments to be classified, Happy and Unhappy, based on numerous customer reviews. Moreover, we compare the resulting accuracies and discuss which word embedding to apply when.
Tasks Sentiment Analysis, Word Embeddings
Published 2018-07-05
URL http://arxiv.org/abs/1807.02471v1
PDF http://arxiv.org/pdf/1807.02471v1.pdf
PWC https://paperswithcode.com/paper/a-review-of-different-word-embeddings-for
Repo https://github.com/debadridtt/A-Review-of-Different-Word-Embeddings-for-Sentiment-Classification-using-Deep-Learning
Framework none
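
As a toy illustration of the kind of strategies such a review compares, the sketch below contrasts a bag-of-words baseline with averaged word vectors feeding the same classifier. The data and the random embeddings are stand-ins; real experiments would load pretrained word2vec/GloVe/fastText vectors.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

reviews = ["great product, very happy", "terrible, broke in a day"]  # toy stand-in data
labels = [1, 0]                                                      # 1 = Happy, 0 = Unhappy

# Strategy 1: bag-of-words baseline.
bow = CountVectorizer().fit_transform(reviews)
clf_bow = LogisticRegression().fit(bow, labels)

# Strategy 2: averaged word vectors (random 5-d toys here instead of pretrained ones).
emb = {w: np.random.rand(5) for w in "great product very happy terrible broke in a day".split()}

def doc_vector(text):
    vecs = [emb[w] for w in text.replace(",", "").split() if w in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(5)

X = np.stack([doc_vector(r) for r in reviews])
clf_emb = LogisticRegression().fit(X, labels)
```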

Do-It-Yourself Single Camera 3D Pointer Input Device

Title Do-It-Yourself Single Camera 3D Pointer Input Device
Authors Bernard Llanos, Yee-Hong Yang
Abstract We present a new algorithm for single camera 3D reconstruction, or 3D input for human-computer interfaces, based on precise tracking of an elongated object, such as a pen, having a pattern of colored bands. To configure the system, the user provides no more than one labelled image of a handmade pointer, measurements of its colored bands, and the camera’s pinhole projection matrix. Other systems are of much higher cost and complexity, requiring combinations of multiple cameras, stereocameras, and pointers with sensors and lights. Instead of relying on information from multiple devices, we examine our single view more closely, integrating geometric and appearance constraints to robustly track the pointer in the presence of occlusion and distractor objects. By probing objects of known geometry with the pointer, we demonstrate acceptable accuracy of 3D localization.
Tasks 3D Reconstruction
Published 2018-09-12
URL http://arxiv.org/abs/1809.04704v1
PDF http://arxiv.org/pdf/1809.04704v1.pdf
PWC https://paperswithcode.com/paper/do-it-yourself-single-camera-3d-pointer-input
Repo https://github.com/bllanos/linear-probe
Framework none
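
The geometric core of the method is back-projecting detected band centroids through the user-supplied pinhole matrix. A minimal sketch, assuming a standard 3x4 projection matrix:

```python
import numpy as np

def pixel_to_ray(P, u, v):
    """Back-project pixel (u, v) to a 3D ray given a pinhole projection matrix P (3x4).
    This mirrors the geometric core of the paper (colored bands are tracked along
    such rays), not its full detection-and-tracking pipeline."""
    M, p4 = P[:, :3], P[:, 3]
    center = -np.linalg.solve(M, p4)                      # camera center C = -M^{-1} p4
    direction = np.linalg.solve(M, np.array([u, v, 1.0]))
    return center, direction / np.linalg.norm(direction)

# With one ray per detected colored-band centroid, the pointer axis can then be
# recovered by fitting a 3D line whose intersections with the rays respect the
# user-measured inter-band distances.
```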

Building a Word Segmenter for Sanskrit Overnight

Title Building a Word Segmenter for Sanskrit Overnight
Authors Vikas Reddy, Amrith Krishna, Vishnu Dutt Sharma, Prateek Gupta, Vineeth M R, Pawan Goyal
Abstract There is an abundance of digitised texts available in Sanskrit. However, the word segmentation task in such texts is challenging due to the issue of ‘Sandhi’. In Sandhi, words in a sentence often fuse together to form a single chunk of text, where the word delimiter vanishes and sounds at the word boundaries undergo transformations, which is also reflected in the written text. Here, we propose an approach that uses a deep sequence to sequence (seq2seq) model that takes only the sandhied string as the input and predicts the unsandhied string. The state of the art models are linguistically involved and have external dependencies for the lexical and morphological analysis of the input. Our model can be trained “overnight” and be used for production. In spite of this knowledge-lean approach, our system performs better than the current state of the art, with a 16.79% relative improvement.
Tasks Morphological Analysis
Published 2018-02-17
URL http://arxiv.org/abs/1802.06185v1
PDF http://arxiv.org/pdf/1802.06185v1.pdf
PWC https://paperswithcode.com/paper/building-a-word-segmenter-for-sanskrit
Repo https://github.com/cvikasreddy/skt
Framework tf
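
A minimal PyTorch sketch of the knowledge-lean setup: a character-level encoder-decoder that reads the sandhied string and emits the unsandhied one with teacher forcing. Vocabulary size, hidden size, and the use of GRUs are assumptions for illustration (the paper's repo uses TensorFlow).

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)
        self.enc = nn.GRU(hidden, hidden, batch_first=True)
        self.dec = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, src, tgt):
        _, h = self.enc(self.emb(src))            # encode sandhied character sequence
        y, _ = self.dec(self.emb(tgt[:, :-1]), h) # teacher-forced decoding (tgt starts with BOS)
        return self.out(y)

model = Seq2Seq(vocab=128)
src = torch.randint(0, 128, (8, 40))              # sandhied input characters
tgt = torch.randint(0, 128, (8, 45))              # gold unsandhied characters
logits = model(src, tgt)
loss = nn.functional.cross_entropy(logits.reshape(-1, 128), tgt[:, 1:].reshape(-1))
loss.backward()
```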

Dealing with Unknown Unknowns: Identification and Selection of Minimal Sensing for Fractional Dynamics with Unknown Inputs

Title Dealing with Unknown Unknowns: Identification and Selection of Minimal Sensing for Fractional Dynamics with Unknown Inputs
Authors Gaurav Gupta, Sergio Pequito, Paul Bogdan
Abstract This paper focuses on analysis and design of time-varying complex networks having fractional order dynamics. These systems are key in modeling the complex dynamical processes arising in several natural and man-made systems. Notably, examples include neurophysiological signals such as electroencephalogram (EEG) that captures the variation in potential fields, and blood oxygenation level dependent (BOLD) signal, which serves as a proxy for neuronal activity. Notwithstanding, the complex networks originated by locally measuring EEG and BOLD are often treated as isolated networks and do not capture the dependency from external stimuli, e.g., originated in subcortical structures such as the thalamus and the brain stem. Therefore, we propose a paradigm-shift towards the analysis of such complex networks under unknown unknowns (i.e., excitations). Consequently, the main contributions of the present paper are threefold: (i) we present an alternating scheme that enables us to determine the best estimate of the model parameters and unknown stimuli; (ii) we provide necessary and sufficient conditions to ensure that it is possible to retrieve the state and unknown stimuli; and (iii) upon these conditions we determine a small subset of variables that need to be measured to ensure that both state and input can be recovered, while establishing sub-optimality guarantees with respect to the smallest possible subset. Finally, we present several pedagogical examples of the main results using real data collected from an EEG wearable device.
Tasks EEG
Published 2018-03-10
URL http://arxiv.org/abs/1803.04866v2
PDF http://arxiv.org/pdf/1803.04866v2.pdf
PWC https://paperswithcode.com/paper/dealing-with-unknown-unknowns-identification
Repo https://github.com/gaurav71531/UUknowns
Framework none
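
Contribution (i), the alternating scheme, can be sketched as alternating least squares on a fractional-order model Δ^α x[k+1] = A x[k] + B u[k], with the fractional difference expanded via Grünwald–Letnikov coefficients. This is a simplified stand-in for the paper's estimator, with α and B assumed known:

```python
import numpy as np
from scipy.special import binom

def fractional_diff(X, alpha, memory=10):
    """Grünwald–Letnikov fractional difference of a state sequence X (T, n),
    truncated to a finite memory for brevity."""
    T, n = X.shape
    Z = np.zeros((T, n))
    for k in range(T):
        for j in range(min(k + 1, memory)):
            Z[k] += (-1) ** j * binom(alpha, j) * X[k - j]
    return Z

def alternating_estimate(X, alpha, B, iters=20):
    """Alternate between estimating A (fixing the unknown inputs U) and
    estimating U (fixing A), each step a linear least-squares problem."""
    Z = fractional_diff(X, alpha)[1:]       # left-hand sides, shape (T-1, n)
    Xk = X[:-1]
    U = np.zeros((len(Xk), B.shape[1]))
    for _ in range(iters):
        A = np.linalg.lstsq(Xk, Z - U @ B.T, rcond=None)[0].T          # fix U, solve A
        U = np.linalg.lstsq(B, (Z - Xk @ A.T).T, rcond=None)[0].T      # fix A, solve U
    return A, U
```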

Achievable Information Rates for Nonlinear Fiber Communication via End-to-end Autoencoder Learning

Title Achievable Information Rates for Nonlinear Fiber Communication via End-to-end Autoencoder Learning
Authors Shen Li, Christian Häger, Nil Garcia, Henk Wymeersch
Abstract Machine learning is used to compute achievable information rates (AIRs) for a simplified fiber channel. The approach jointly optimizes the input distribution (constellation shaping) and the auxiliary channel distribution to compute AIRs without explicit channel knowledge in an end-to-end fashion.
Tasks
Published 2018-04-20
URL http://arxiv.org/abs/1804.07675v2
PDF http://arxiv.org/pdf/1804.07675v2.pdf
PWC https://paperswithcode.com/paper/achievable-information-rates-for-nonlinear
Repo https://github.com/henkwymeersch/AutoencoderFiber
Framework none
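
The end-to-end setup fits in a few lines: an encoder shapes the constellation, a decoder approximates the auxiliary channel posterior, and the cross-entropy directly lower-bounds the AIR as log2(M) - CE/ln 2. The sketch below swaps the simplified fiber model for plain AWGN and assumes M = 16, so it illustrates the mechanism rather than reproducing the paper's channel:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

M = 16                                                                # assumed constellation size
enc = nn.Sequential(nn.Linear(M, 32), nn.ReLU(), nn.Linear(32, 2))    # message -> 2D symbol
dec = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, M))    # received -> posterior

msgs = torch.randint(0, M, (1024,))
x = enc(F.one_hot(msgs, M).float())
x = x / x.pow(2).sum(1).mean().sqrt()             # normalize to unit average power
y = x + 0.1 * torch.randn_like(x)                 # AWGN stand-in for the fiber channel
ce = F.cross_entropy(dec(y), msgs)                # training loss, in nats
air_bits = torch.log2(torch.tensor(float(M))) - ce / torch.log(torch.tensor(2.0))
ce.backward()                                     # minimizing CE maximizes the AIR bound
```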

DeepTract: A Probabilistic Deep Learning Framework for White Matter Fiber Tractography

Title DeepTract: A Probabilistic Deep Learning Framework for White Matter Fiber Tractography
Authors Itay Benou, Tammy Riklin-Raviv
Abstract We present DeepTract, a deep-learning framework for estimating white matter fiber orientation and streamline tractography. We adopt a data-driven approach for fiber reconstruction from diffusion weighted images (DWI), which does not assume a specific diffusion model. We use a recurrent neural network for mapping sequences of DWI values into probabilistic fiber orientation distributions. Based on these estimations, our model facilitates both deterministic and probabilistic streamline tractography. We quantitatively evaluate our method using the Tractometer tool, demonstrating competitive performance with state-of-the-art classical and machine-learning-based tractography algorithms. We further present qualitative results of bundle-specific probabilistic tractography obtained using our method. The code is publicly available at: https://github.com/itaybenou/DeepTract.git.
Tasks White Matter Fiber Tractography
Published 2018-12-12
URL https://arxiv.org/abs/1812.05129v3
PDF https://arxiv.org/pdf/1812.05129v3.pdf
PWC https://paperswithcode.com/paper/deeptract-a-probabilistic-deep-learning
Repo https://github.com/itaybenou/DeepTract
Framework none
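
A sketch of the estimation core: a GRU maps the DWI signal sequence along a candidate streamline to a discretized fiber orientation distribution, from which deterministic (argmax) or probabilistic (sampling) tracking follows. All dimensions below are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class DeepTractSketch(nn.Module):
    """Illustrative RNN mapping DWI sequences to discretized fiber
    orientation distributions (FODs)."""
    def __init__(self, n_gradients=100, n_directions=725, hidden=512):
        super().__init__()
        self.rnn = nn.GRU(n_gradients, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_directions)

    def forward(self, dwi_seq):                   # (batch, steps, n_gradients)
        h, _ = self.rnn(dwi_seq)
        return torch.softmax(self.head(h), -1)    # per-step orientation distribution

model = DeepTractSketch()
fod = model(torch.randn(4, 20, 100))
# Deterministic tracking follows the argmax direction; probabilistic tracking
# samples the next step from the distribution instead.
next_dir = torch.distributions.Categorical(fod[:, -1]).sample()
```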

Complex-YOLO: Real-time 3D Object Detection on Point Clouds

Title Complex-YOLO: Real-time 3D Object Detection on Point Clouds
Authors Martin Simon, Stefan Milz, Karl Amende, Horst-Michael Gross
Abstract Lidar-based 3D object detection is indispensable for autonomous driving, because it directly links to environmental understanding and therefore builds the base for prediction and motion planning. The capacity to infer from highly sparse 3D data in real time is an ill-posed problem for many other application areas besides automated vehicles, e.g. augmented reality, personal robotics or industrial automation. We introduce Complex-YOLO, a state-of-the-art real-time 3D object detection network on point clouds only. In this work, we describe a network that expands YOLOv2, a fast 2D standard object detector for RGB images, by a specific complex regression strategy to estimate multi-class 3D boxes in Cartesian space. To this end, we propose a specific Euler-Region-Proposal Network (E-RPN) that estimates the pose of the object by adding an imaginary and a real fraction to the regression network. This yields a closed complex space and avoids the singularities that occur with single-angle estimation. The E-RPN also helps the network generalize well during training. Our experiments on the KITTI benchmark suite show that we outperform current leading methods for 3D object detection, specifically in terms of efficiency. We achieve state-of-the-art results for cars, pedestrians and cyclists while being more than five times faster than the fastest competitor. Further, our model is capable of estimating all eight KITTI classes, including vans, trucks and sitting pedestrians, simultaneously with high accuracy.
Tasks 3D Object Detection, Autonomous Driving, Motion Planning, Object Detection
Published 2018-03-16
URL http://arxiv.org/abs/1803.06199v2
PDF http://arxiv.org/pdf/1803.06199v2.pdf
PWC https://paperswithcode.com/paper/complex-yolo-real-time-3d-object-detection-on
Repo https://github.com/ghimiredhikura/Complex-YOLO-V3
Framework pytorch
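
The E-RPN's angle handling reduces to regressing the real and imaginary parts of a unit complex number and recovering the yaw with atan2, which sidesteps the wrap-around singularity of regressing a single angle. A minimal sketch (the loss is a simplified stand-in for the paper's full regression target):

```python
import torch

def decode_yaw(t_im, t_re):
    # Recover the yaw angle from the regressed complex components.
    return torch.atan2(t_im, t_re)

def angle_loss(t_im, t_re, gt_yaw):
    # Match the predicted complex number to (sin, cos) of the ground-truth angle;
    # the target space is closed, so there is no discontinuity at +/- pi.
    return ((t_im - torch.sin(gt_yaw)) ** 2 + (t_re - torch.cos(gt_yaw)) ** 2).mean()
```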

Adversarial Robustness Toolbox v1.0.0

Title Adversarial Robustness Toolbox v1.0.0
Authors Maria-Irina Nicolae, Mathieu Sinn, Minh Ngoc Tran, Beat Buesser, Ambrish Rawat, Martin Wistuba, Valentina Zantedeschi, Nathalie Baracaldo, Bryant Chen, Heiko Ludwig, Ian M. Molloy, Ben Edwards
Abstract Adversarial Robustness Toolbox (ART) is a Python library supporting developers and researchers in defending Machine Learning models (Deep Neural Networks, Gradient Boosted Decision Trees, Support Vector Machines, Random Forests, Logistic Regression, Gaussian Processes, Decision Trees, Scikit-learn Pipelines, etc.) against adversarial threats, and helps make AI systems more secure and trustworthy. Machine Learning models are vulnerable to adversarial examples, which are inputs (images, texts, tabular data, etc.) deliberately modified to produce a desired response by the Machine Learning model. ART provides the tools to build and deploy defences and test them with adversarial attacks. Defending Machine Learning models involves certifying and verifying model robustness and model hardening with approaches such as pre-processing inputs, augmenting training data with adversarial samples, and leveraging runtime detection methods to flag any inputs that might have been modified by an adversary. The attacks implemented in ART allow crafting adversarial examples against Machine Learning models, which is required to test defenses with state-of-the-art threat models. Supported Machine Learning libraries include TensorFlow (v1 and v2), Keras, PyTorch, MXNet, Scikit-learn, XGBoost, LightGBM, CatBoost, and GPy. The source code of ART is released with the MIT license at https://github.com/IBM/adversarial-robustness-toolbox. The release includes code examples, notebooks with tutorials, and documentation (http://adversarial-robustness-toolbox.readthedocs.io).
Tasks Gaussian Processes, Time Series
Published 2018-07-03
URL https://arxiv.org/abs/1807.01069v4
PDF https://arxiv.org/pdf/1807.01069v4.pdf
PWC https://paperswithcode.com/paper/adversarial-robustness-toolbox-v040
Repo https://github.com/sgxcj777/Adversarial-testing-toolbox-with-CLEVER
Framework none
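
A minimal usage sketch against a scikit-learn model, wrapping it in an ART estimator and attacking it with FGSM. Module paths follow recent ART 1.x releases and may differ in earlier versions; the data is random filler.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from art.estimators.classification import SklearnClassifier  # module paths per ART 1.x
from art.attacks.evasion import FastGradientMethod

X = np.random.rand(200, 20).astype(np.float32)
y = np.random.randint(0, 2, 200)

# Wrap the fitted sklearn model so ART can query predictions and gradients.
clf = SklearnClassifier(model=LogisticRegression().fit(X, y), clip_values=(0.0, 1.0))

# Craft adversarial examples and measure the accuracy drop.
X_adv = FastGradientMethod(estimator=clf, eps=0.1).generate(x=X)
print("clean acc:", (clf.predict(X).argmax(1) == y).mean())
print("adversarial acc:", (clf.predict(X_adv).argmax(1) == y).mean())
```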

Norm-Range Partition: A Universal Catalyst for LSH based Maximum Inner Product Search (MIPS)

Title Norm-Range Partition: A Universal Catalyst for LSH based Maximum Inner Product Search (MIPS)
Authors Xiao Yan, Xinyan Dai, Jie Liu, Kaiwen Zhou, James Cheng
Abstract Recently, locality sensitive hashing (LSH) was shown to be effective for MIPS, and several algorithms including $L_2$-ALSH, Sign-ALSH and Simple-LSH have been proposed. In this paper, we introduce the norm-range partition technique, which partitions the original dataset into sub-datasets containing items with similar 2-norms and builds a hash index independently for each sub-dataset. We prove that norm-range partition reduces the query processing complexity for all existing LSH based MIPS algorithms under mild conditions. The key to the performance improvement is that norm-range partition allows the use of a smaller normalization factor for most sub-datasets. For efficient query processing, we also formulate a unified framework to rank the buckets from the hash indexes of different sub-datasets. Experiments on real datasets show that norm-range partition significantly reduces the number of probed buckets for LSH based MIPS algorithms when achieving the same recall.
Tasks
Published 2018-10-22
URL http://arxiv.org/abs/1810.09104v2
PDF http://arxiv.org/pdf/1810.09104v2.pdf
PWC https://paperswithcode.com/paper/norm-range-partition-a-universal-catalyst-for
Repo https://github.com/xinyandai/similarity-search
Framework none
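
A numpy sketch of the partition plus the Simple-LSH transform applied per sub-dataset; the smaller local normalization factor U is the source of the speedup. This is a sketch of the idea, not the paper's indexing code:

```python
import numpy as np

def norm_range_partition(data, n_parts=4):
    """Split items into sub-datasets of similar 2-norm, then apply the Simple-LSH
    transform per sub-dataset with a local maximum norm U."""
    norms = np.linalg.norm(data, axis=1)
    sub_indices = np.array_split(np.argsort(norms), n_parts)
    transformed = []
    for idx in sub_indices:
        sub = data[idx]
        U = np.linalg.norm(sub, axis=1).max()       # local, not global, max norm
        scaled = sub / U
        pad = np.sqrt(np.clip(1.0 - np.linalg.norm(scaled, axis=1) ** 2, 0.0, None))
        # Appending the padding coordinate turns MIPS into cosine/L2 search,
        # as in Simple-LSH, but with a tighter normalization per sub-dataset.
        transformed.append(np.hstack([scaled, pad[:, None]]))
    return sub_indices, transformed
```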

GraKeL: A Graph Kernel Library in Python

Title GraKeL: A Graph Kernel Library in Python
Authors Giannis Siglidis, Giannis Nikolentzos, Stratis Limnios, Christos Giatsidis, Konstantinos Skianis, Michalis Vazirgiannis
Abstract The problem of accurately measuring the similarity between graphs is at the core of many applications in a variety of disciplines. Graph kernels have recently emerged as a promising approach to this problem. There are now many kernels, each focusing on different structural aspects of graphs. Here, we present GraKeL, a library that unifies several graph kernels into a common framework. The library is written in Python and adheres to the scikit-learn interface. It is simple to use and can be naturally combined with scikit-learn’s modules to build a complete machine learning pipeline for tasks such as graph classification and clustering. The code is BSD licensed and is available at: https://github.com/ysig/GraKeL .
Tasks Graph Classification
Published 2018-06-06
URL https://arxiv.org/abs/1806.02193v2
PDF https://arxiv.org/pdf/1806.02193v2.pdf
PWC https://paperswithcode.com/paper/grakel-a-graph-kernel-library-in-python
Repo https://github.com/ysig/GraKeL
Framework none
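
Since the library follows the scikit-learn interface, a typical pipeline is short. Parameter names below follow recent GraKeL documentation and may vary across versions:

```python
from sklearn.svm import SVC
from grakel import Graph
from grakel.kernels import WeisfeilerLehman, VertexHistogram

# Two toy labeled graphs: adjacency matrix plus positional node labels.
g1 = Graph([[0, 1, 1], [1, 0, 0], [1, 0, 0]], {0: 'O', 1: 'H', 2: 'H'})
g2 = Graph([[0, 1], [1, 0]], {0: 'O', 1: 'H'})

gk = WeisfeilerLehman(n_iter=3, base_graph_kernel=VertexHistogram, normalize=True)
K = gk.fit_transform([g1, g2])                  # kernel matrix, scikit-learn style
clf = SVC(kernel="precomputed").fit(K, [0, 1])  # graph classification downstream
# New graphs are scored with gk.transform([...]) followed by clf.predict.
```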

Uncertainty in Neural Networks: Approximately Bayesian Ensembling

Title Uncertainty in Neural Networks: Approximately Bayesian Ensembling
Authors Tim Pearce, Felix Leibfried, Alexandra Brintrup, Mohamed Zaki, Andy Neely
Abstract Understanding the uncertainty of a neural network’s (NN) predictions is essential for many purposes. The Bayesian framework provides a principled approach to this; however, applying it to NNs is challenging due to large numbers of parameters and data. Ensembling NNs provides an easily implementable, scalable method for uncertainty quantification; however, it has been criticised for not being Bayesian. This work proposes one modification to the usual process that we argue does result in approximate Bayesian inference: regularising parameters about values drawn from a distribution which can be set equal to the prior. A theoretical analysis of the procedure in a simplified setting suggests the recovered posterior is centred correctly but tends to have an underestimated marginal variance, and overestimated correlation. However, two conditions can lead to exact recovery. We argue that these conditions are partially present in NNs. Empirical evaluations demonstrate it has an advantage over standard ensembling, and is competitive with variational methods.
Tasks Bayesian Inference, Image Classification
Published 2018-10-12
URL https://arxiv.org/abs/1810.05546v5
PDF https://arxiv.org/pdf/1810.05546v5.pdf
PWC https://paperswithcode.com/paper/uncertainty-in-neural-networks-bayesian
Repo https://github.com/TeaPearce/Bayesian_NN_Ensembles
Framework tf
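
The proposed modification, often called anchored ensembling, is essentially a one-line change to the loss: each ensemble member regularizes towards its own draw from the prior instead of towards zero. A PyTorch sketch, with the regularization weight a hedged simplification of the paper's per-layer scaling:

```python
import torch
import torch.nn as nn

def make_member(prior_std=1.0):
    net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
    # Draw anchor values from the (assumed Gaussian) prior, one set per member.
    anchors = [p.detach().clone().normal_(0.0, prior_std) for p in net.parameters()]
    return net, anchors

def anchored_loss(net, anchors, x, y, data_noise=0.1, prior_std=1.0):
    # Regularize towards the member's own anchor draw rather than towards zero;
    # this is what makes the ensemble approximately Bayesian.
    mse = ((net(x) - y) ** 2).mean()
    reg = sum(((p - a) ** 2).sum() for p, a in zip(net.parameters(), anchors))
    return mse + (data_noise ** 2 / (prior_std ** 2 * len(x))) * reg
```

Training several members independently with this loss and treating their predictions as posterior samples gives the uncertainty estimate.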

Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation

Title Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation
Authors Mengyu Chu, You Xie, Jonas Mayer, Laura Leal-Taixé, Nils Thuerey
Abstract We focus on temporal self-supervision for GAN-based video generation tasks. While adversarial training successfully yields generative models for a variety of areas, temporal relationships in the generated data are much less explored. This is crucial for sequential generation tasks, e.g. video super-resolution and unpaired video translation. For the former, state-of-the-art methods often favor simpler norm losses such as $L^2$ over adversarial training. However, their averaging nature easily leads to temporally smooth results with an undesirable lack of spatial detail. For unpaired video translation, existing approaches modify the generator networks to form spatio-temporal cycle consistencies. In contrast, we focus on improving the learning objectives and propose a temporally self-supervised algorithm. For both tasks, we show that temporal adversarial learning is key to achieving temporally coherent solutions without sacrificing spatial detail. We also propose a novel Ping-Pong loss to improve the long-term temporal consistency. It effectively prevents recurrent networks from accumulating artifacts temporally without suppressing detailed features. We also propose a first set of metrics to quantitatively evaluate the accuracy as well as the perceptual quality of the temporal evolution. A series of user studies confirm the rankings computed with these metrics.
Tasks Image Super-Resolution, Motion Compensation, Super-Resolution, Video Generation, Video Super-Resolution
Published 2018-11-23
URL https://arxiv.org/abs/1811.09393v3
PDF https://arxiv.org/pdf/1811.09393v3.pdf
PWC https://paperswithcode.com/paper/temporally-coherent-gans-for-video-super
Repo https://github.com/zhusiling/TecoGAN
Framework pytorch
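
The Ping-Pong loss can be sketched independently of the full TecoGAN system: run the recurrent generator over the frame sequence forward and then backward, and penalize the mismatch between the two passes so long-term artifacts cannot accumulate. The generate_step interface is an assumption for illustration:

```python
import torch

def ping_pong_loss(generate_step, frames):
    """Sketch of the Ping-Pong loss.

    generate_step(prev_out, frame): assumed recurrent generator interface,
    returning an output with the same shape as prev_out.
    frames: list of (C, H, W) tensors forming the input sequence.
    """
    def rollout(seq):
        outs, prev = [], torch.zeros_like(seq[0])
        for f in seq:
            prev = generate_step(prev, f)
            outs.append(prev)
        return outs

    fwd = rollout(frames)
    bwd = rollout(frames[::-1])[::-1]   # reversed pass, re-aligned in time
    return sum(((a - b) ** 2).mean() for a, b in zip(fwd, bwd)) / len(frames)
```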

Explainable Prediction of Medical Codes from Clinical Text

Title Explainable Prediction of Medical Codes from Clinical Text
Authors James Mullenbach, Sarah Wiegreffe, Jon Duke, Jimeng Sun, Jacob Eisenstein
Abstract Clinical notes are text documents that are created by clinicians for each patient encounter. They are typically accompanied by medical codes, which describe the diagnosis and treatment. Annotating these codes is labor intensive and error prone; furthermore, the connection between the codes and the text is not annotated, obscuring the reasons and details behind specific diagnoses and treatments. We present an attentional convolutional network that predicts medical codes from clinical text. Our method aggregates information across the document using a convolutional neural network, and uses an attention mechanism to select the most relevant segments for each of the thousands of possible codes. The method is accurate, achieving precision@8 of 0.71 and a Micro-F1 of 0.54, which are both better than the prior state of the art. Furthermore, through an interpretability evaluation by a physician, we show that the attention mechanism identifies meaningful explanations for each code assignment.
Tasks
Published 2018-02-15
URL http://arxiv.org/abs/1802.05695v2
PDF http://arxiv.org/pdf/1802.05695v2.pdf
PWC https://paperswithcode.com/paper/explainable-prediction-of-medical-codes-from
Repo https://github.com/jamesmullenbach/caml-mimic
Framework pytorch
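
The per-label attention mechanism, in the spirit of the paper's CAML model, is compact enough to sketch in full: a CNN encodes the note, and each code gets its own attention vector over the word positions. Dimensions below are illustrative.

```python
import torch
import torch.nn as nn

class LabelAttentionCNN(nn.Module):
    """CNN encoder with per-label attention over word positions."""
    def __init__(self, vocab=5000, emb=100, filters=50, n_codes=8922, kernel=10):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, filters, kernel, padding=kernel // 2)
        self.U = nn.Linear(filters, n_codes)      # one attention vector per code
        self.final = nn.Linear(filters, n_codes)  # one output vector per code

    def forward(self, tokens):                    # (batch, words)
        H = torch.tanh(self.conv(self.embed(tokens).transpose(1, 2)))  # (b, filters, w)
        alpha = torch.softmax(self.U.weight @ H, dim=2)                # (b, codes, w)
        V = alpha @ H.transpose(1, 2)                                  # (b, codes, filters)
        return (V * self.final.weight).sum(2) + self.final.bias        # (b, codes) logits

logits = LabelAttentionCNN()(torch.randint(0, 5000, (2, 100)))  # per-code scores
```

The attention weights alpha are what the interpretability evaluation inspects: for each predicted code, they highlight the text segments driving the score.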