Paper Group AWR 22
Detect-to-Retrieve: Efficient Regional Aggregation for Image Search
Title | Detect-to-Retrieve: Efficient Regional Aggregation for Image Search |
Authors | Marvin Teichmann, Andre Araujo, Menglong Zhu, Jack Sim |
Abstract | Retrieving object instances among cluttered scenes efficiently requires compact yet comprehensive regional image representations. Intuitively, object semantics can help build the index that focuses on the most relevant regions. However, due to the lack of bounding-box datasets for objects of interest among retrieval benchmarks, most recent work on regional representations has focused on either uniform or class-agnostic region selection. In this paper, we first fill the void by providing a new dataset of landmark bounding boxes, based on the Google Landmarks dataset, that includes $86k$ images with manually curated boxes from $15k$ unique landmarks. Then, we demonstrate how a trained landmark detector, using our new dataset, can be leveraged to index image regions and improve retrieval accuracy while being much more efficient than existing regional methods. In addition, we introduce a novel regional aggregated selective match kernel (R-ASMK) to effectively combine information from detected regions into an improved holistic image representation. R-ASMK boosts image retrieval accuracy substantially with no dimensionality increase, while even outperforming systems that index image regions independently. Our complete image retrieval system improves upon the previous state-of-the-art by significant margins on the Revisited Oxford and Paris datasets. Code and data available at the project webpage: https://github.com/tensorflow/models/tree/master/research/delf. |
Tasks | Image Retrieval |
Published | 2018-12-04 |
URL | https://arxiv.org/abs/1812.01584v2 |
https://arxiv.org/pdf/1812.01584v2.pdf | |
PWC | https://paperswithcode.com/paper/detect-to-retrieve-efficient-regional |
Repo | https://github.com/tensorflow/models/tree/master/research/delf |
Framework | tf |
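The regional aggregation idea above lends itself to a compact illustration. Below is a minimal, hypothetical NumPy sketch of an R-ASMK-flavored pipeline: local descriptors from each detected region are assigned to a small visual codebook, residuals are summed across regions into a single per-word representation (no dimensionality increase), and two images are compared with a selective match kernel. The codebook size, descriptor dimension, and the `alpha` exponent are illustrative assumptions, not the paper's settings.

```python
# Toy NumPy sketch of R-ASMK-style regional aggregation (not the paper's code).
import numpy as np

def aggregate_regions(region_descriptors, codebook):
    """Aggregate descriptors from all detected regions into one per-visual-word
    residual representation (so the final size never grows with #regions)."""
    k, d = codebook.shape
    agg = np.zeros((k, d))
    for descs in region_descriptors:                  # (n_i, d) per region
        dists = ((descs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        words = dists.argmin(axis=1)                  # hard assignment
        for w in np.unique(words):
            agg[w] += (descs[words == w] - codebook[w]).sum(axis=0)
    norms = np.linalg.norm(agg, axis=1, keepdims=True)
    return agg / np.maximum(norms, 1e-12)             # per-word L2 normalization

def match_kernel(a, b, alpha=3.0):
    """Selective match kernel: power-law similarity over shared visual words."""
    sims = (a * b).sum(axis=1)
    return float(np.sum(np.sign(sims) * np.abs(sims) ** alpha))

rng = np.random.default_rng(0)
codebook = rng.normal(size=(32, 64))
img1 = [rng.normal(size=(50, 64)) for _ in range(3)]  # 3 detected regions
img2 = [rng.normal(size=(40, 64)) for _ in range(2)]
print(match_kernel(aggregate_regions(img1, codebook),
                   aggregate_regions(img2, codebook)))
```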
Bayesian CycleGAN via Marginalizing Latent Sampling
Title | Bayesian CycleGAN via Marginalizing Latent Sampling |
Authors | Haoran You, Yu Cheng, Tianheng Cheng, Chunliang Li, Pan Zhou |
Abstract | Recent techniques built on Generative Adversarial Networks (GANs), such as CycleGAN, are able to learn mappings between domains from unpaired datasets through min-max optimization games between generators and discriminators. However, it remains challenging to stabilize the training process and diversify the generated results. To address these problems, we present a Bayesian extension of the cyclic model and an integrated cyclic framework for inter-domain mappings. The proposed method, inspired by Bayesian GAN, explores the full posteriors of the Bayesian cyclic model (with latent sampling) and optimizes the model with maximum a posteriori (MAP) estimation. Hence, we name it Bayesian CycleGAN. We evaluate the proposed Bayesian CycleGAN on multiple benchmark datasets, including Cityscapes, Maps, and Monet2photo. The quantitative and qualitative evaluations demonstrate that the proposed method achieves more stable training, superior performance, and more diversified image generation. |
Tasks | Image-to-Image Translation |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07465v2 |
http://arxiv.org/pdf/1811.07465v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-cyclegan-via-marginalizing-latent |
Repo | https://github.com/ranery/Bayesian-CycleGAN |
Framework | pytorch |
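As a rough illustration of the MAP objective sketched in the abstract, the following hypothetical PyTorch snippet adds a Gaussian log-prior over generator weights to a cycle-consistency term and feeds a sampled latent map alongside the input image. The tiny generators, the prior weight `1e-4`, and the omission of the adversarial term are simplifications for brevity, not the authors' configuration.

```python
# Minimal MAP-flavored sketch: cycle loss minus a Gaussian log-prior over
# generator weights, with a sampled latent map concatenated to the input.
import torch
import torch.nn as nn

def gaussian_log_prior(module, sigma=1.0):
    """log p(theta) under an isotropic Gaussian prior (up to a constant)."""
    return -0.5 * sum((p ** 2).sum() for p in module.parameters()) / sigma ** 2

class LatentGenerator(nn.Module):
    """Generator that consumes the image concatenated with a sampled latent map."""
    def __init__(self, in_ch=3, z_ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch + z_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, in_ch, 3, padding=1), nn.Tanh())
    def forward(self, x):
        z = torch.randn(x.size(0), 1, x.size(2), x.size(3), device=x.device)
        return self.net(torch.cat([x, z], dim=1))

G, F = LatentGenerator(), LatentGenerator()
x = torch.rand(4, 3, 64, 64)                 # batch from domain A
cycle = nn.functional.l1_loss(F(G(x)), x)    # A -> B -> A reconstruction
# MAP estimation: minimize data terms minus the (weighted) log-prior
loss = cycle - 1e-4 * (gaussian_log_prior(G) + gaussian_log_prior(F))
loss.backward()
print(float(loss))
```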
A Review of Different Word Embeddings for Sentiment Classification using Deep Learning
Title | A Review of Different Word Embeddings for Sentiment Classification using Deep Learning |
Authors | Debadri Dutta |
Abstract | The web is loaded with textual content, and Natural Language Processing is one of the most vital fields in Machine Learning. But when the data is huge, simple Machine Learning algorithms cannot handle it, and that is where Deep Learning, which is based on Neural Networks, comes into play. However, since neural networks cannot process raw text, we have to convert the text through various word embedding strategies. This paper demonstrates those different word embedding strategies, implemented on an Amazon Review Dataset with two sentiments to be classified, Happy and Unhappy, based on numerous customer reviews. Moreover, we demonstrate the differences in accuracy, with a discussion of which word embedding to apply when. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2018-07-05 |
URL | http://arxiv.org/abs/1807.02471v1 |
http://arxiv.org/pdf/1807.02471v1.pdf | |
PWC | https://paperswithcode.com/paper/a-review-of-different-word-embeddings-for |
Repo | https://github.com/debadridtt/A-Review-of-Different-Word-Embeddings-for-Sentiment-Classification-using-Deep-Learning |
Framework | none |
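To make the comparison concrete, here is a small, hypothetical sketch contrasting two embedding strategies on toy review data: averaged Word2Vec vectors (learned with gensim) versus a plain bag-of-words baseline, each fed to a logistic regression classifier. The corpus, hyperparameters, and classifier choice are illustrative only, not the paper's experimental setup.

```python
# Hedged sketch: averaged Word2Vec vs. count-based features on toy review data
# (assumes gensim >= 4 and scikit-learn).
import numpy as np
from gensim.models import Word2Vec
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

reviews = ["great product works well", "terrible quality broke fast",
           "very happy with purchase", "unhappy waste of money"]
labels = [1, 0, 1, 0]                      # 1 = Happy, 0 = Unhappy
tokens = [r.split() for r in reviews]

# Strategy A: learn word vectors, represent a review by the mean vector.
w2v = Word2Vec(sentences=tokens, vector_size=25, window=3, min_count=1, epochs=100)
X_w2v = np.array([np.mean([w2v.wv[t] for t in toks], axis=0) for toks in tokens])

# Strategy B: plain bag-of-words counts as a baseline representation.
X_bow = CountVectorizer().fit_transform(reviews).toarray()

for name, X in [("word2vec-mean", X_w2v), ("bag-of-words", X_bow)]:
    acc = LogisticRegression(max_iter=1000).fit(X, labels).score(X, labels)
    print(f"{name}: train accuracy {acc:.2f}")
```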
Do-It-Yourself Single Camera 3D Pointer Input Device
Title | Do-It-Yourself Single Camera 3D Pointer Input Device |
Authors | Bernard Llanos, Yee-Hong Yang |
Abstract | We present a new algorithm for single camera 3D reconstruction, or 3D input for human-computer interfaces, based on precise tracking of an elongated object, such as a pen, having a pattern of colored bands. To configure the system, the user provides no more than one labelled image of a handmade pointer, measurements of its colored bands, and the camera’s pinhole projection matrix. Other systems are of much higher cost and complexity, requiring combinations of multiple cameras, stereo cameras, and pointers with sensors and lights. Instead of relying on information from multiple devices, we examine our single view more closely, integrating geometric and appearance constraints to robustly track the pointer in the presence of occlusion and distractor objects. By probing objects of known geometry with the pointer, we demonstrate acceptable accuracy of 3D localization. |
Tasks | 3D Reconstruction |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04704v1 |
http://arxiv.org/pdf/1809.04704v1.pdf | |
PWC | https://paperswithcode.com/paper/do-it-yourself-single-camera-3d-pointer-input |
Repo | https://github.com/bllanos/linear-probe |
Framework | none |
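The system's only camera calibration input is the pinhole projection matrix, so the core geometric primitive is back-projecting a detected pixel to a 3D viewing ray. A minimal sketch, assuming a generic 3x4 projection matrix `P` (the example intrinsics are made up, not the paper's calibration):

```python
# Back-project a pixel into a 3D viewing ray from the 3x4 projection matrix P.
import numpy as np

def backproject(P, u, v):
    """Return (camera_center, unit_ray_direction) for pixel (u, v)."""
    M, p4 = P[:, :3], P[:, 3]
    center = -np.linalg.solve(M, p4)        # camera center C satisfies P @ [C;1] = 0
    direction = np.linalg.solve(M, np.array([u, v, 1.0]))
    return center, direction / np.linalg.norm(direction)

# Example with an identity-rotation camera: P = K [I | 0].
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
C, d = backproject(P, 400, 300)
print(C, d)   # band edges on the pointer lie somewhere along C + t * d, t > 0
```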
Building a Word Segmenter for Sanskrit Overnight
Title | Building a Word Segmenter for Sanskrit Overnight |
Authors | Vikas Reddy, Amrith Krishna, Vishnu Dutt Sharma, Prateek Gupta, Vineeth M R, Pawan Goyal |
Abstract | There is an abundance of digitised texts available in Sanskrit. However, the word segmentation task in such texts is challenging due to the issue of ‘Sandhi’. In Sandhi, words in a sentence often fuse together to form a single chunk of text, where the word delimiter vanishes and sounds at the word boundaries undergo transformations, which is also reflected in the written text. Here, we propose an approach that uses a deep sequence-to-sequence (seq2seq) model that takes only the sandhied string as input and predicts the unsandhied string. The state-of-the-art models are linguistically involved and have external dependencies for the lexical and morphological analysis of the input. Our model can be trained “overnight” and used in production. In spite of this knowledge-lean approach, our system performs better than the current state of the art, with a 16.79% improvement over it. |
Tasks | Morphological Analysis |
Published | 2018-02-17 |
URL | http://arxiv.org/abs/1802.06185v1 |
http://arxiv.org/pdf/1802.06185v1.pdf | |
PWC | https://paperswithcode.com/paper/building-a-word-segmenter-for-sanskrit |
Repo | https://github.com/cvikasreddy/skt |
Framework | tf |
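A minimal, hypothetical PyTorch sketch of the character-level seq2seq setup described above: an LSTM encoder reads the sandhied string and an LSTM decoder emits the unsandhied string with teacher forcing. Vocabulary size, hidden sizes, and the random toy batch are placeholders; the paper's actual architecture and training details may differ.

```python
# Minimal character-level seq2seq sketch (illustrative, not the released model).
import torch
import torch.nn as nn

class CharSeq2Seq(nn.Module):
    def __init__(self, vocab, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, 64)
        self.enc = nn.LSTM(64, hidden, batch_first=True)
        self.dec = nn.LSTM(64, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, sandhied, unsandhied_in):
        _, state = self.enc(self.emb(sandhied))           # encode fused string
        dec_out, _ = self.dec(self.emb(unsandhied_in), state)  # teacher-forced
        return self.out(dec_out)

vocab = 60                                  # toy character inventory
model = CharSeq2Seq(vocab)
src = torch.randint(0, vocab, (8, 20))      # sandhied batch
tgt = torch.randint(0, vocab, (8, 24))      # unsandhied batch
logits = model(src, tgt[:, :-1])            # predict each next character
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab), tgt[:, 1:].reshape(-1))
loss.backward()
print(float(loss))
```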
Dealing with Unknown Unknowns: Identification and Selection of Minimal Sensing for Fractional Dynamics with Unknown Inputs
Title | Dealing with Unknown Unknowns: Identification and Selection of Minimal Sensing for Fractional Dynamics with Unknown Inputs |
Authors | Gaurav Gupta, Sergio Pequito, Paul Bogdan |
Abstract | This paper focuses on the analysis and design of time-varying complex networks having fractional order dynamics. These systems are key in modeling the complex dynamical processes arising in several natural and man-made systems. Notably, examples include neurophysiological signals such as the electroencephalogram (EEG), which captures the variation in potential fields, and the blood oxygenation level dependent (BOLD) signal, which serves as a proxy for neuronal activity. Notwithstanding, the complex networks obtained by locally measuring EEG and BOLD are often treated as isolated networks and do not capture the dependency on external stimuli, e.g., those originating in subcortical structures such as the thalamus and the brain stem. Therefore, we propose a paradigm shift towards the analysis of such complex networks under unknown unknowns (i.e., excitations). Consequently, the main contributions of the present paper are threefold: (i) we present an alternating scheme that enables us to determine the best estimate of the model parameters and unknown stimuli; (ii) we provide necessary and sufficient conditions to ensure that it is possible to retrieve the state and unknown stimuli; and (iii) upon these conditions, we determine a small subset of variables that need to be measured to ensure that both state and input can be recovered, while establishing sub-optimality guarantees with respect to the smallest possible subset. Finally, we present several pedagogical examples of the main results using real data collected from an EEG wearable device. |
Tasks | EEG |
Published | 2018-03-10 |
URL | http://arxiv.org/abs/1803.04866v2 |
http://arxiv.org/pdf/1803.04866v2.pdf | |
PWC | https://paperswithcode.com/paper/dealing-with-unknown-unknowns-identification |
Repo | https://github.com/gaurav71531/UUknowns |
Framework | none |
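To give a feel for contribution (i), here is a toy NumPy sketch of alternating estimation on a scalar fractional-order system with an unknown input, using Grünwald-Letnikov fractional differences. The ridge shrinkage on the input, the system order, and the signals are illustrative assumptions; the paper's algorithm, identifiability conditions, and guarantees are more general.

```python
# Toy alternating estimation for Delta^a x[k+1] = a x[k] + u[k] with unknown u
# (illustrative only; not the authors' algorithm or its guarantees).
import numpy as np

def gl_coeffs(alpha, n):
    """Grunwald-Letnikov fractional-difference coefficients psi_0..psi_{n-1}."""
    psi = np.ones(n)
    for j in range(1, n):
        psi[j] = psi[j - 1] * (j - 1 - alpha) / j
    return psi

def frac_diff(x, alpha):
    """Delta^alpha x[k] = sum_{j<=k} psi_j x[k-j]."""
    psi = gl_coeffs(alpha, len(x))
    return np.array([psi[: k + 1] @ x[k::-1] for k in range(len(x))])

alpha, a_true, T = 0.7, -0.5, 200
u_true = 0.3 * np.sin(0.1 * np.arange(T))            # the "unknown unknown"
x = np.zeros(T)
for k in range(T - 1):                                # simulate the system
    psi = gl_coeffs(alpha, k + 2)
    x[k + 1] = a_true * x[k] + u_true[k] - psi[1:] @ x[k::-1]

d = frac_diff(x, alpha)[1:]                           # Delta^a x[k+1] for all k
a_hat, u_hat, lam = 0.0, np.zeros(T - 1), 1.0
for _ in range(25):                                   # alternate: a-step, u-step
    a_hat = ((d - u_hat) @ x[:-1]) / (x[:-1] @ x[:-1])
    # a small-energy prior (ridge shrinkage) on u is needed for identifiability
    u_hat = (d - a_hat * x[:-1]) / (1.0 + lam)
print(f"a_true = {a_true}, a_hat = {a_hat:.3f}")
```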
Achievable Information Rates for Nonlinear Fiber Communication via End-to-end Autoencoder Learning
Title | Achievable Information Rates for Nonlinear Fiber Communication via End-to-end Autoencoder Learning |
Authors | Shen Li, Christian Häger, Nil Garcia, Henk Wymeersch |
Abstract | Machine learning is used to compute achievable information rates (AIRs) for a simplified fiber channel. The approach jointly optimizes the input distribution (constellation shaping) and the auxiliary channel distribution to compute AIRs without explicit channel knowledge in an end-to-end fashion. |
Tasks | |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07675v2 |
http://arxiv.org/pdf/1804.07675v2.pdf | |
PWC | https://paperswithcode.com/paper/achievable-information-rates-for-nonlinear |
Repo | https://github.com/henkwymeersch/AutoencoderFiber |
Framework | none |
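A hedged PyTorch sketch of the end-to-end idea: an encoder learns the constellation (shaping), a decoder acts as the auxiliary channel demapper, and the training cross-entropy yields an AIR estimate of roughly $\log_2 M - \mathrm{CE}/\ln 2$ bits per channel use. An AWGN stand-in replaces the paper's simplified fiber channel, and all sizes and hyperparameters are toy values.

```python
# End-to-end autoencoder sketch: learned constellation + demapper over an AWGN
# stand-in channel; cross-entropy gives an achievable-rate estimate.
import torch
import torch.nn as nn

M, n_sym = 16, 2                         # 16 messages, 2 real channel uses (I/Q)
enc = nn.Sequential(nn.Linear(M, 32), nn.ReLU(), nn.Linear(32, n_sym))
dec = nn.Sequential(nn.Linear(n_sym, 32), nn.ReLU(), nn.Linear(32, M))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

for step in range(2000):
    msgs = torch.randint(0, M, (256,))
    x = enc(nn.functional.one_hot(msgs, M).float())
    x = x / x.pow(2).sum(1, keepdim=True).sqrt().mean()   # average power ~ 1
    y = x + 0.1 * torch.randn_like(x)                     # AWGN stand-in channel
    ce = nn.functional.cross_entropy(dec(y), msgs)        # nats per symbol
    opt.zero_grad(); ce.backward(); opt.step()

air = torch.log2(torch.tensor(float(M))) - ce / torch.log(torch.tensor(2.0))
print(f"AIR estimate: {air.item():.2f} bits / channel use")
```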
DeepTract: A Probabilistic Deep Learning Framework for White Matter Fiber Tractography
Title | DeepTract: A Probabilistic Deep Learning Framework for White Matter Fiber Tractography |
Authors | Itay Benou, Tammy Riklin-Raviv |
Abstract | We present DeepTract, a deep-learning framework for estimating white matter fiber orientations and streamline tractography. We adopt a data-driven approach for fiber reconstruction from diffusion weighted images (DWI), which does not assume a specific diffusion model. We use a recurrent neural network for mapping sequences of DWI values into probabilistic fiber orientation distributions. Based on these estimations, our model facilitates both deterministic and probabilistic streamline tractography. We quantitatively evaluate our method using the Tractometer tool, demonstrating competitive performance with state-of-the-art classical and machine-learning-based tractography algorithms. We further present qualitative results of bundle-specific probabilistic tractography obtained using our method. The code is publicly available at: https://github.com/itaybenou/DeepTract.git. |
Tasks | White Matter Fiber Tractography |
Published | 2018-12-12 |
URL | https://arxiv.org/abs/1812.05129v3 |
https://arxiv.org/pdf/1812.05129v3.pdf | |
PWC | https://paperswithcode.com/paper/deeptract-a-probabilistic-deep-learning |
Repo | https://github.com/itaybenou/DeepTract |
Framework | none |
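A minimal, hypothetical sketch of the mapping described above: a GRU consumes the sequence of DWI values along a streamline and outputs a distribution over discrete orientation bins, from which the next tracking step can be taken deterministically (the mode) or probabilistically (a sample). Layer sizes and the number of orientation bins are placeholders, not the released DeepTract configuration.

```python
# GRU mapping DWI sequences to per-step fiber orientation distributions.
import torch
import torch.nn as nn

n_dwi, n_orient = 64, 724                 # e.g. 64 gradient directions, 724 bins
class TractRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(n_dwi, 256, batch_first=True)
        self.head = nn.Linear(256, n_orient)
    def forward(self, dwi_seq, h=None):
        out, h = self.rnn(dwi_seq, h)
        return self.head(out), h          # logits over orientation bins per step

model = TractRNN()
dwi_seq = torch.rand(1, 10, n_dwi)        # DWI values along a partial streamline
logits, h = model(dwi_seq)
probs = torch.softmax(logits[:, -1], dim=-1)
# deterministic tracking would take probs.argmax(); probabilistic samples a bin
next_dir = torch.multinomial(probs, 1)
print(next_dir.item())
```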
Complex-YOLO: Real-time 3D Object Detection on Point Clouds
Title | Complex-YOLO: Real-time 3D Object Detection on Point Clouds |
Authors | Martin Simon, Stefan Milz, Karl Amende, Horst-Michael Gross |
Abstract | Lidar-based 3D object detection is indispensable for autonomous driving, because it directly links to environmental understanding and therefore builds the base for prediction and motion planning. The capacity to infer highly sparse 3D data in real time is an ill-posed problem for lots of other application areas besides automated vehicles, e.g. augmented reality, personal robotics or industrial automation. We introduce Complex-YOLO, a state-of-the-art real-time 3D object detection network on point clouds only. In this work, we describe a network that expands YOLOv2, a fast 2D standard object detector for RGB images, by a specific complex regression strategy to estimate multi-class 3D boxes in Cartesian space. To this end, we propose a specific Euler-Region-Proposal Network (E-RPN) that estimates the pose of the object by adding an imaginary and a real fraction to the regression network. This yields a closed complex space and avoids the singularities that occur with single-angle estimation. The E-RPN also helps the network generalize well during training. Our experiments on the KITTI benchmark suite show that we outperform current leading methods for 3D object detection, specifically in terms of efficiency. We achieve state-of-the-art results for cars, pedestrians and cyclists while being more than five times faster than the fastest competitor. Further, our model is capable of estimating all eight KITTI classes, including vans, trucks and sitting pedestrians, simultaneously with high accuracy. |
Tasks | 3D Object Detection, Autonomous Driving, Motion Planning, Object Detection |
Published | 2018-03-16 |
URL | http://arxiv.org/abs/1803.06199v2 |
http://arxiv.org/pdf/1803.06199v2.pdf | |
PWC | https://paperswithcode.com/paper/complex-yolo-real-time-3d-object-detection-on |
Repo | https://github.com/ghimiredhikura/Complex-YOLO-V3 |
Framework | pytorch |
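The E-RPN's angle trick is easy to isolate: regress an imaginary and a real fraction per box and decode the yaw with `atan2`, which avoids the wrap-around singularity of direct single-angle regression. A small, hypothetical PyTorch sketch (the loss weighting and the remaining box terms of the full model are omitted):

```python
# Sketch of the E-RPN angle idea: regress (t_im, t_re) and decode yaw via atan2.
import torch

def angle_loss(t_im, t_re, gt_yaw):
    """Match the predicted complex number against (sin(yaw), cos(yaw));
    the decoded orientation is arg(t_re + i * t_im)."""
    pred_yaw = torch.atan2(t_im, t_re)
    loss = (t_im - torch.sin(gt_yaw)) ** 2 + (t_re - torch.cos(gt_yaw)) ** 2
    return loss.mean(), pred_yaw

t_im = torch.tensor([0.9, -0.1], requires_grad=True)
t_re = torch.tensor([0.1, -0.9], requires_grad=True)
gt = torch.tensor([1.47, -3.0])              # ground-truth yaws in radians
loss, yaw = angle_loss(t_im, t_re, gt)
loss.backward()
print(loss.item(), yaw.detach())
```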
Adversarial Robustness Toolbox v1.0.0
Title | Adversarial Robustness Toolbox v1.0.0 |
Authors | Maria-Irina Nicolae, Mathieu Sinn, Minh Ngoc Tran, Beat Buesser, Ambrish Rawat, Martin Wistuba, Valentina Zantedeschi, Nathalie Baracaldo, Bryant Chen, Heiko Ludwig, Ian M. Molloy, Ben Edwards |
Abstract | Adversarial Robustness Toolbox (ART) is a Python library supporting developers and researchers in defending Machine Learning models (Deep Neural Networks, Gradient Boosted Decision Trees, Support Vector Machines, Random Forests, Logistic Regression, Gaussian Processes, Decision Trees, Scikit-learn Pipelines, etc.) against adversarial threats, and it helps make AI systems more secure and trustworthy. Machine Learning models are vulnerable to adversarial examples, which are inputs (images, texts, tabular data, etc.) deliberately modified to produce a desired response by the Machine Learning model. ART provides the tools to build and deploy defences and test them with adversarial attacks. Defending Machine Learning models involves certifying and verifying model robustness and model hardening with approaches such as pre-processing inputs, augmenting training data with adversarial samples, and leveraging runtime detection methods to flag any inputs that might have been modified by an adversary. The attacks implemented in ART allow crafting adversarial examples against Machine Learning models, which is required to test defenses with state-of-the-art threat models. Supported Machine Learning libraries include TensorFlow (v1 and v2), Keras, PyTorch, MXNet, Scikit-learn, XGBoost, LightGBM, CatBoost, and GPy. The source code of ART is released with the MIT license at https://github.com/IBM/adversarial-robustness-toolbox. The release includes code examples, notebooks with tutorials and documentation (http://adversarial-robustness-toolbox.readthedocs.io). |
Tasks | Gaussian Processes, Time Series |
Published | 2018-07-03 |
URL | https://arxiv.org/abs/1807.01069v4 |
https://arxiv.org/pdf/1807.01069v4.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-robustness-toolbox-v040 |
Repo | https://github.com/sgxcj777/Adversarial-testing-toolbox-with-CLEVER |
Framework | none |
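A hedged usage sketch of the attack-then-evaluate workflow ART is built for, wrapping a toy PyTorch model and running the Fast Gradient Method. Module paths follow the ART >= 1.3 layout (earlier 1.x releases used `art.classifiers` and `art.attacks`); the model and data are stand-ins, not a real benchmark.

```python
# Wrap a toy classifier with ART and measure how FGM perturbations flip predictions.
import numpy as np
import torch.nn as nn, torch.optim as optim
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import FastGradientMethod

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
classifier = PyTorchClassifier(
    model=model,
    loss=nn.CrossEntropyLoss(),
    optimizer=optim.Adam(model.parameters(), lr=1e-3),
    input_shape=(1, 28, 28),
    nb_classes=10,
)

x = np.random.rand(8, 1, 28, 28).astype(np.float32)   # stand-in image batch
y = np.eye(10)[np.random.randint(0, 10, 8)].astype(np.float32)
classifier.fit(x, y, batch_size=8, nb_epochs=1)

attack = FastGradientMethod(estimator=classifier, eps=0.1)
x_adv = attack.generate(x=x)
print((classifier.predict(x).argmax(1) != classifier.predict(x_adv).argmax(1)).mean())
```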
Norm-Range Partition: A Universal Catalyst for LSH based Maximum Inner Product Search (MIPS)
Title | Norm-Range Partition: A Universal Catalyst for LSH based Maximum Inner Product Search (MIPS) |
Authors | Xiao Yan, Xinyan Dai, Jie Liu, Kaiwen Zhou, James Cheng |
Abstract | Recently, locality sensitive hashing (LSH) was shown to be effective for MIPS, and several algorithms, including $L_2$-ALSH, Sign-ALSH and Simple-LSH, have been proposed. In this paper, we introduce the norm-range partition technique, which partitions the original dataset into sub-datasets containing items with similar 2-norms and builds a hash index independently for each sub-dataset. We prove that norm-range partition reduces the query processing complexity for all existing LSH based MIPS algorithms under mild conditions. The key to the performance improvement is that norm-range partition allows using a smaller normalization factor for most sub-datasets. For efficient query processing, we also formulate a unified framework to rank the buckets from the hash indexes of different sub-datasets. Experiments on real datasets show that norm-range partition significantly reduces the number of probed items for LSH based MIPS algorithms when achieving the same recall. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09104v2 |
http://arxiv.org/pdf/1810.09104v2.pdf | |
PWC | https://paperswithcode.com/paper/norm-range-partition-a-universal-catalyst-for |
Repo | https://github.com/xinyandai/similarity-search |
Framework | none |
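The key mechanism is easy to sketch: sort items by 2-norm, split into norm ranges, and apply the Simple-LSH transform within each sub-dataset using its own, smaller maximum norm $M_i$. The hypothetical NumPy snippet below brute-forces the angular search that an LSH index would perform, purely to show that per-partition scores can be rescaled into a unified ranking:

```python
# Norm-range partitioning with a per-partition Simple-LSH transform.
import numpy as np

def simple_lsh_transform(X, M):
    """Map x -> [x/M ; sqrt(1 - ||x/M||^2)] onto the unit sphere."""
    Xs = X / M
    pad = np.sqrt(np.maximum(0.0, 1.0 - (Xs ** 2).sum(1)))
    return np.hstack([Xs, pad[:, None]])

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 32))
order = np.argsort(np.linalg.norm(X, axis=1))          # sort items by 2-norm
parts = np.array_split(order, 8)                       # 8 norm ranges

q = rng.normal(size=32)
q_unit = np.append(q / np.linalg.norm(q), 0.0)         # query transform [q/||q||; 0]
best_ip, best_idx = -np.inf, -1
for idx in parts:
    M_i = np.linalg.norm(X[idx], axis=1).max()         # local (smaller) max norm
    T = simple_lsh_transform(X[idx], M_i)
    # an angular LSH index would search T; rescaling by M_i * ||q|| makes the
    # cosine scores comparable across partitions (a unified ranking)
    scores = M_i * np.linalg.norm(q) * (T @ q_unit)
    j = scores.argmax()
    if scores[j] > best_ip:
        best_ip, best_idx = scores[j], idx[j]
print(best_idx, np.isclose(best_ip, (X @ q).max()))    # recovers the true MIPS item
```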
GraKeL: A Graph Kernel Library in Python
Title | GraKeL: A Graph Kernel Library in Python |
Authors | Giannis Siglidis, Giannis Nikolentzos, Stratis Limnios, Christos Giatsidis, Konstantinos Skianis, Michalis Vazirgiannis |
Abstract | The problem of accurately measuring the similarity between graphs is at the core of many applications in a variety of disciplines. Graph kernels have recently emerged as a promising approach to this problem. There are now many kernels, each focusing on different structural aspects of graphs. Here, we present GraKeL, a library that unifies several graph kernels into a common framework. The library is written in Python and adheres to the scikit-learn interface. It is simple to use and can be naturally combined with scikit-learn’s modules to build a complete machine learning pipeline for tasks such as graph classification and clustering. The code is BSD licensed and is available at: https://github.com/ysig/GraKeL . |
Tasks | Graph Classification |
Published | 2018-06-06 |
URL | https://arxiv.org/abs/1806.02193v2 |
https://arxiv.org/pdf/1806.02193v2.pdf | |
PWC | https://paperswithcode.com/paper/grakel-a-graph-kernel-library-in-python |
Repo | https://github.com/ysig/GraKeL |
Framework | none |
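A short usage sketch following GraKeL's documented scikit-learn-style workflow: compute a shortest-path kernel matrix on the MUTAG benchmark and feed it to a precomputed-kernel SVM. This assumes the dataset can be fetched at runtime; consult the library's documentation for the exact options of each kernel.

```python
# Graph classification with a GraKeL kernel plugged into scikit-learn.
from grakel.datasets import fetch_dataset
from grakel.kernels import ShortestPath
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

MUTAG = fetch_dataset("MUTAG", verbose=False)
G, y = MUTAG.data, MUTAG.target
G_tr, G_te, y_tr, y_te = train_test_split(G, y, test_size=0.2, random_state=0)

gk = ShortestPath(normalize=True)      # one of the library's unified kernels
K_tr = gk.fit_transform(G_tr)          # kernel matrix between training graphs
K_te = gk.transform(G_te)              # kernel between test and training graphs

clf = SVC(kernel="precomputed").fit(K_tr, y_tr)
print(accuracy_score(y_te, clf.predict(K_te)))
```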
Uncertainty in Neural Networks: Approximately Bayesian Ensembling
Title | Uncertainty in Neural Networks: Approximately Bayesian Ensembling |
Authors | Tim Pearce, Felix Leibfried, Alexandra Brintrup, Mohamed Zaki, Andy Neely |
Abstract | Understanding the uncertainty of a neural network’s (NN) predictions is essential for many purposes. The Bayesian framework provides a principled approach to this; however, applying it to NNs is challenging due to the large numbers of parameters and data. Ensembling NNs provides an easily implementable, scalable method for uncertainty quantification; however, it has been criticised for not being Bayesian. This work proposes one modification to the usual process that we argue does result in approximate Bayesian inference: regularising parameters about values drawn from a distribution which can be set equal to the prior. A theoretical analysis of the procedure in a simplified setting suggests the recovered posterior is centred correctly but tends to have an underestimated marginal variance and an overestimated correlation. However, two conditions can lead to exact recovery. We argue that these conditions are partially present in NNs. Empirical evaluations demonstrate it has an advantage over standard ensembling and is competitive with variational methods. |
Tasks | Bayesian Inference, Image Classification |
Published | 2018-10-12 |
URL | https://arxiv.org/abs/1810.05546v5 |
https://arxiv.org/pdf/1810.05546v5.pdf | |
PWC | https://paperswithcode.com/paper/uncertainty-in-neural-networks-bayesian |
Repo | https://github.com/TeaPearce/Bayesian_NN_Ensembles |
Framework | tf |
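The proposed modification is compact enough to sketch directly: each ensemble member draws anchor parameters from the prior and is regularised towards its own anchor rather than towards zero. A minimal, hypothetical PyTorch version on a 1D regression toy (the regularisation strength follows a $\sigma^2_{noise}/\sigma^2_{prior}$ scaling; the network and other settings are illustrative, not the paper's):

```python
# Anchored ensembling sketch: regularise each member towards prior-drawn anchors.
import torch
import torch.nn as nn

def make_member(prior_std=1.0):
    net = nn.Sequential(nn.Linear(1, 50), nn.ReLU(), nn.Linear(50, 1))
    for p in net.parameters():
        nn.init.normal_(p, std=prior_std)            # initialise from the prior
    anchor = [p.detach().clone() for p in net.parameters()]
    return net, anchor

def anchored_loss(net, anchor, x, y, data_noise=0.1, prior_std=1.0):
    lam = data_noise ** 2 / prior_std ** 2           # regularisation strength
    mse = nn.functional.mse_loss(net(x), y, reduction="sum")
    reg = sum(((p - a) ** 2).sum() for p, a in zip(net.parameters(), anchor))
    return (mse + lam * reg) / x.shape[0]

x = torch.linspace(-3, 3, 40).unsqueeze(1)
y = torch.sin(x) + 0.1 * torch.randn_like(x)
ensemble = [make_member() for _ in range(5)]
for net, anchor in ensemble:
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(500):
        opt.zero_grad()
        anchored_loss(net, anchor, x, y).backward()
        opt.step()
preds = torch.stack([net(x) for net, _ in ensemble])
print(preds.mean(0).shape, preds.std(0).mean())      # predictive mean and spread
```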
Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation
Title | Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation |
Authors | Mengyu Chu, You Xie, Jonas Mayer, Laura Leal-Taixé, Nils Thuerey |
Abstract | We focus on temporal self-supervision for GAN-based video generation tasks. While adversarial training successfully yields generative models for a variety of areas, temporal relationships in the generated data are much less explored. These are crucial for sequential generation tasks, e.g. video super-resolution and unpaired video translation. For the former, state-of-the-art methods often favor simpler norm losses such as $L^2$ over adversarial training. However, their averaging nature easily leads to temporally smooth results with an undesirable lack of spatial detail. For unpaired video translation, existing approaches modify the generator networks to form spatio-temporal cycle consistencies. In contrast, we focus on improving the learning objectives and propose a temporally self-supervised algorithm. For both tasks, we show that temporal adversarial learning is key to achieving temporally coherent solutions without sacrificing spatial detail. We also propose a novel Ping-Pong loss to improve long-term temporal consistency. It effectively prevents recurrent networks from accumulating artifacts temporally without suppressing detailed features. We also propose a first set of metrics to quantitatively evaluate the accuracy as well as the perceptual quality of the temporal evolution. A series of user studies confirm the rankings computed with these metrics. |
Tasks | Image Super-Resolution, Motion Compensation, Super-Resolution, Video Generation, Video Super-Resolution |
Published | 2018-11-23 |
URL | https://arxiv.org/abs/1811.09393v3 |
https://arxiv.org/pdf/1811.09393v3.pdf | |
PWC | https://paperswithcode.com/paper/temporally-coherent-gans-for-video-super |
Repo | https://github.com/zhusiling/TecoGAN |
Framework | pytorch |
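The Ping-Pong loss is simple to sketch: run a recurrent generator over the frame sequence forward and again over the reversed sequence, then penalise any mismatch between the two passes so that recurrently accumulated artifacts are suppressed. A toy, hypothetical PyTorch version with a stand-in one-layer generator:

```python
# Ping-Pong loss sketch: forward pass vs. reversed pass must agree frame-wise.
import torch
import torch.nn as nn

class RecurrentG(nn.Module):
    """Stand-in recurrent generator: output depends on the current frame and
    the previously generated frame (the source of accumulating artifacts)."""
    def __init__(self, ch=3):
        super().__init__()
        self.net = nn.Conv2d(2 * ch, ch, 3, padding=1)
    def forward(self, frames):                  # frames: (T, C, H, W)
        outs, prev = [], torch.zeros_like(frames[0])
        for f in frames:
            prev = self.net(torch.cat([f, prev], dim=0).unsqueeze(0)).squeeze(0)
            outs.append(prev)
        return torch.stack(outs)

G = RecurrentG()
frames = torch.rand(8, 3, 32, 32)               # a short toy clip
fwd = G(frames)
bwd = G(torch.flip(frames, dims=[0]))           # ping-pong: reversed ordering
pp_loss = nn.functional.l1_loss(fwd, torch.flip(bwd, dims=[0]))
pp_loss.backward()
print(float(pp_loss))
```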
Explainable Prediction of Medical Codes from Clinical Text
Title | Explainable Prediction of Medical Codes from Clinical Text |
Authors | James Mullenbach, Sarah Wiegreffe, Jon Duke, Jimeng Sun, Jacob Eisenstein |
Abstract | Clinical notes are text documents that are created by clinicians for each patient encounter. They are typically accompanied by medical codes, which describe the diagnosis and treatment. Annotating these codes is labor intensive and error prone; furthermore, the connection between the codes and the text is not annotated, obscuring the reasons and details behind specific diagnoses and treatments. We present an attentional convolutional network that predicts medical codes from clinical text. Our method aggregates information across the document using a convolutional neural network, and uses an attention mechanism to select the most relevant segments for each of the thousands of possible codes. The method is accurate, achieving precision@8 of 0.71 and a Micro-F1 of 0.54, which are both better than the prior state of the art. Furthermore, through an interpretability evaluation by a physician, we show that the attention mechanism identifies meaningful explanations for each code assignment. |
Tasks | |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1802.05695v2 |
http://arxiv.org/pdf/1802.05695v2.pdf | |
PWC | https://paperswithcode.com/paper/explainable-prediction-of-medical-codes-from |
Repo | https://github.com/jamesmullenbach/caml-mimic |
Framework | pytorch |
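A minimal, hypothetical PyTorch sketch of the described mechanism: a 1D convolution over token embeddings produces position-wise features, each label computes its own attention over positions, and a per-label linear layer turns the attended vector into a code probability. Sizes are toy values (real ICD label spaces run to thousands of codes).

```python
# Per-label attention over convolutional features for multi-label code prediction.
import torch
import torch.nn as nn

class LabelAttentionConv(nn.Module):
    def __init__(self, vocab=5000, emb=100, filters=50, n_labels=50):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, filters, kernel_size=3, padding=1)
        self.U = nn.Linear(filters, n_labels, bias=False)   # attention queries
        self.final = nn.Linear(filters, n_labels)           # per-label output

    def forward(self, tokens):                 # tokens: (B, N)
        H = torch.tanh(self.conv(self.emb(tokens).transpose(1, 2)))  # (B, F, N)
        alpha = torch.softmax(self.U(H.transpose(1, 2)), dim=1)      # (B, N, L)
        V = alpha.transpose(1, 2) @ H.transpose(1, 2)                # (B, L, F)
        logits = (self.final.weight * V).sum(-1) + self.final.bias   # (B, L)
        return torch.sigmoid(logits)

model = LabelAttentionConv()
probs = model(torch.randint(0, 5000, (2, 400)))
print(probs.shape)                             # (2, 50): one probability per code
```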