October 18, 2019

Paper Group ANR 611

An Improved Learning Framework for Covariant Local Feature Detection. Generating Triples with Adversarial Networks for Scene Graph Construction. Comparing Generative Adversarial Network Techniques for Image Creation and Modification. How convolutional neural network see the world - A survey of convolutional neural network visualization methods. Rob …

An Improved Learning Framework for Covariant Local Feature Detection

Title An Improved Learning Framework for Covariant Local Feature Detection
Authors Nehal Doiphode, Rahul Mitra, Shuaib Ahmed, Arjun Jain
Abstract Learning feature detection has been a largely unexplored area compared to handcrafted feature detection. Recent learning formulations use the covariant constraint in their loss function to learn covariant detectors. However, learning from the covariant constraint alone can lead to the detection of unstable features. To impart further stability, detectors are trained to extract pre-determined features obtained by handcrafted detectors. However, in the process they lose the ability to detect novel features. In an attempt to overcome the above limitations, we propose an improved scheme by incorporating covariant constraints in the form of triplets, in addition to an affine covariant constraint. We show that using these additional constraints one can learn to detect novel and stable features without using pre-determined features for training. Extensive experiments show our model achieves state-of-the-art performance in repeatability score on well-known datasets such as Vgg-Affine, EF, and Webcam.
Tasks
Published 2018-11-01
URL http://arxiv.org/abs/1811.00438v1
PDF http://arxiv.org/pdf/1811.00438v1.pdf
PWC https://paperswithcode.com/paper/an-improved-learning-framework-for-covariant
Repo
Framework

Generating Triples with Adversarial Networks for Scene Graph Construction

Title Generating Triples with Adversarial Networks for Scene Graph Construction
Authors Matthew Klawonn, Eric Heim
Abstract Driven by successes in deep learning, computer vision research has begun to move beyond object detection and image classification to more sophisticated tasks like image captioning or visual question answering. Motivating such endeavors is the desire for models to capture not only objects present in an image, but more fine-grained aspects of a scene such as relationships between objects and their attributes. Scene graphs provide a formal construct for capturing these aspects of an image. Despite this, there have been only a few recent efforts to generate scene graphs from imagery. Previous works limit themselves to settings where bounding box information is available at train time and do not attempt to generate scene graphs with attributes. In this paper we propose a method, based on recent advancements in Generative Adversarial Networks, to overcome these deficiencies. We take the approach of first generating small subgraphs, each describing a single statement about a scene from a specific region of the input image chosen using an attention mechanism. By doing so, our method is able to produce portions of the scene graphs with attribute information without the need for bounding box labels. Then, the complete scene graph is constructed from these subgraphs. We show that our model improves upon prior work in scene graph generation on state-of-the-art data sets and accepted metrics. Further, we demonstrate that our model is capable of handling a larger vocabulary size than prior work has attempted.
Tasks graph construction, Graph Generation, Image Captioning, Image Classification, Object Detection, Question Answering, Scene Graph Generation, Visual Question Answering
Published 2018-02-07
URL http://arxiv.org/abs/1802.02598v1
PDF http://arxiv.org/pdf/1802.02598v1.pdf
PWC https://paperswithcode.com/paper/generating-triples-with-adversarial-networks
Repo
Framework

Comparing Generative Adversarial Network Techniques for Image Creation and Modification

Title Comparing Generative Adversarial Network Techniques for Image Creation and Modification
Authors Mathijs Pieters, Marco Wiering
Abstract Generative adversarial networks (GANs) have proven successful at generating realistic real-world images. In this paper we compare various GAN techniques, both supervised and unsupervised. The effects on training stability of different objective functions are compared. We add an encoder to the network, making it possible to encode images to the latent space of the GAN. The generator, discriminator and encoder are parameterized by deep convolutional neural networks. For the discriminator network we experimented with using the novel Capsule Network, a state-of-the-art technique for detecting global features in images. Experiments are performed using a digit and face dataset, with various visualizations illustrating the results. The results show that using the encoder network it is possible to reconstruct images. With the conditional GAN we can alter visual attributes of generated or encoded images. The experiments with the Capsule Network as discriminator result in generated images of a lower quality, compared to a standard convolutional neural network.
Tasks
Published 2018-03-24
URL http://arxiv.org/abs/1803.09093v1
PDF http://arxiv.org/pdf/1803.09093v1.pdf
PWC https://paperswithcode.com/paper/comparing-generative-adversarial-network
Repo
Framework

How convolutional neural network see the world - A survey of convolutional neural network visualization methods

Title How convolutional neural network see the world - A survey of convolutional neural network visualization methods
Authors Zhuwei Qin, Fuxun Yu, Chenchen Liu, Xiang Chen
Abstract Nowadays, Convolutional Neural Networks (CNNs) have achieved impressive performance on many computer vision related tasks, such as object detection, image recognition, image retrieval, etc. These achievements benefit from the CNNs' outstanding capability to learn the input features with deep layers of neuron structures and an iterative training process. However, these learned features are hard to identify and interpret from a human vision perspective, causing a lack of understanding of the CNNs' internal working mechanism. To improve CNN interpretability, CNN visualization is widely used as a qualitative analysis method, which translates the internal features into visually perceptible patterns. Many CNN visualization works have been proposed in the literature to interpret the CNN from the perspectives of network structure, operation, and semantic concept. In this paper, we aim to provide a comprehensive survey of several representative CNN visualization methods, including Activation Maximization, Network Inversion, Deconvolutional Neural Networks (DeconvNet), and Network Dissection based visualization. These methods are presented in terms of motivations, algorithms, and experiment results. Based on these visualization methods, we also discuss their practical applications to demonstrate the significance of CNN interpretability in areas of network design, optimization, security enhancement, etc.
Tasks Image Retrieval, Object Detection
Published 2018-04-30
URL http://arxiv.org/abs/1804.11191v2
PDF http://arxiv.org/pdf/1804.11191v2.pdf
PWC https://paperswithcode.com/paper/how-convolutional-neural-network-see-the
Repo
Framework

Robust inference on the average treatment effect using the outcome highly adaptive lasso

Title Robust inference on the average treatment effect using the outcome highly adaptive lasso
Authors Cheng Ju, David Benkeser, Mark J. van der Laan
Abstract Many estimators of the average effect of a treatment on an outcome require estimation of the propensity score, the outcome regression, or both. It is often beneficial to utilize flexible techniques such as semiparametric regression or machine learning to estimate these quantities. However, optimal estimation of these regressions does not necessarily lead to optimal estimation of the average treatment effect, particularly in settings with strong instrumental variables. A recent proposal addressed these issues via the outcome-adaptive lasso, a penalized regression technique for estimating the propensity score that seeks to minimize the impact of instrumental variables on treatment effect estimators. However, a notable limitation of this approach is that its application is restricted to parametric models. We propose a more flexible alternative that we call the outcome highly adaptive lasso. We discuss large sample theory for this estimator and propose closed form confidence intervals based on the proposed estimator. We show via simulation that our method offers benefits over several popular approaches.
Tasks
Published 2018-06-18
URL https://arxiv.org/abs/1806.06784v3
PDF https://arxiv.org/pdf/1806.06784v3.pdf
PWC https://paperswithcode.com/paper/flexible-collaborative-estimation-of-the
Repo
Framework

A note on solving nonlinear optimization problems in variable precision

Title A note on solving nonlinear optimization problems in variable precision
Authors S. Gratton, Ph. L. Toint
Abstract This short note considers an efficient variant of the trust-region algorithm with dynamic accuracy proposed by Carter (1993) and Conn, Gould and Toint (2000) as a tool for very high-performance computing, an area where it is critical to allow multi-precision computations for keeping the energy dissipation under control. Numerical experiments are presented indicating that the use of the considered method can bring substantial savings in objective function’s and gradient’s evaluation “energy costs” by efficiently exploiting multi-precision computations.
Tasks
Published 2018-12-09
URL http://arxiv.org/abs/1812.03467v3
PDF http://arxiv.org/pdf/1812.03467v3.pdf
PWC https://paperswithcode.com/paper/a-note-on-solving-nonlinear-optimization
Repo
Framework

CAESAR: Context Awareness Enabled Summary-Attentive Reader

Title CAESAR: Context Awareness Enabled Summary-Attentive Reader
Authors Long-Huei Chen, Kshitiz Tripathi
Abstract Comprehending meaning from natural language is a primary objective of Natural Language Processing (NLP), and text comprehension is the cornerstone for achieving this objective, upon which all other problems like chat bots, language translation and others can be solved. We report a Summary-Attentive Reader we designed to better emulate the human reading process, along with a dictionary-based solution regarding out-of-vocabulary (OOV) words in the data, to generate answers based on machine comprehension of reading passages and questions from the SQuAD benchmark. Our implementation of these features with two popular models (Match LSTM and Dynamic Coattention) was able to reach close to matching the results obtained from humans.
Tasks Reading Comprehension
Published 2018-03-04
URL http://arxiv.org/abs/1803.01335v1
PDF http://arxiv.org/pdf/1803.01335v1.pdf
PWC https://paperswithcode.com/paper/caesar-context-awareness-enabled-summary
Repo
Framework

Accelerating Imitation Learning with Predictive Models

Title Accelerating Imitation Learning with Predictive Models
Authors Ching-An Cheng, Xinyan Yan, Evangelos A. Theodorou, Byron Boots
Abstract Sample efficiency is critical in solving real-world reinforcement learning problems, where agent-environment interactions can be costly. Imitation learning from expert advice has proved to be an effective strategy for reducing the number of interactions required to train a policy. Online imitation learning, which interleaves policy evaluation and policy optimization, is a particularly effective technique with provable performance guarantees. In this work, we seek to further accelerate the convergence rate of online imitation learning, thereby making it more sample efficient. We propose two model-based algorithms inspired by Follow-the-Leader (FTL) with prediction: MoBIL-VI based on solving variational inequalities and MoBIL-Prox based on stochastic first-order updates. These two methods leverage a model to predict future gradients to speed up policy learning. When the model oracle is learned online, these algorithms can provably accelerate the best known convergence rate up to an order. Our algorithms can be viewed as a generalization of stochastic Mirror-Prox (Juditsky et al., 2011), and admit a simple constructive FTL-style analysis of performance.
Tasks Imitation Learning
Published 2018-06-12
URL http://arxiv.org/abs/1806.04642v4
PDF http://arxiv.org/pdf/1806.04642v4.pdf
PWC https://paperswithcode.com/paper/accelerating-imitation-learning-with
Repo
Framework

Reverse iterative volume sampling for linear regression

Title Reverse iterative volume sampling for linear regression
Authors Michał Dereziński, Manfred K. Warmuth
Abstract We study the following basic machine learning task: Given a fixed set of $d$-dimensional input points for a linear regression problem, we wish to predict a hidden response value for each of the points. We can only afford to attain the responses for a small subset of the points that are then used to construct linear predictions for all points in the dataset. The performance of the predictions is evaluated by the total square loss on all responses (the attained as well as the hidden ones). We show that a good approximate solution to this least squares problem can be obtained from just dimension $d$ many responses by using a joint sampling technique called volume sampling. Moreover, the least squares solution obtained for the volume sampled subproblem is an unbiased estimator of the optimal solution based on all $n$ responses. This unbiasedness is a desirable property that is not shared by other common subset selection techniques. Motivated by these basic properties, we develop a theoretical framework for studying volume sampling, resulting in a number of new matrix expectation equalities and statistical guarantees which are of importance not only to least squares regression but also to numerical linear algebra in general. Our methods also lead to a regularized variant of volume sampling, and we propose the first efficient algorithms for volume sampling which make this technique a practical tool in the machine learning toolbox. Finally, we provide experimental evidence which confirms our theoretical findings.
Tasks
Published 2018-06-06
URL http://arxiv.org/abs/1806.01969v1
PDF http://arxiv.org/pdf/1806.01969v1.pdf
PWC https://paperswithcode.com/paper/reverse-iterative-volume-sampling-for-linear
Repo
Framework
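
The unbiasedness property stated in the abstract above can be checked on a toy problem. The following is a minimal sketch, not the authors' algorithm: it assumes the standard definition of size-$d$ volume sampling (a subset S is drawn with probability proportional to the squared determinant of the corresponding rows) and simply enumerates all subsets for d = 2 instead of sampling efficiently. The data values and the helper names `det2` and `solve2` are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations

def det2(a, b):
    """Determinant of the 2x2 matrix whose rows are a and b."""
    return a[0] * b[1] - a[1] * b[0]

def solve2(a, b, rhs):
    """Solve the 2x2 linear system [a; b] w = rhs with Cramer's rule."""
    d = det2(a, b)
    w0 = (rhs[0] * b[1] - a[1] * rhs[1]) / d
    w1 = (a[0] * rhs[1] - rhs[0] * b[0]) / d
    return (w0, w1)

# Toy data (hypothetical): n = 5 points in d = 2 dimensions, exact rationals.
X = [(Fraction(1), Fraction(0)), (Fraction(1), Fraction(1)),
     (Fraction(1), Fraction(2)), (Fraction(1), Fraction(3)),
     (Fraction(2), Fraction(1))]
y = [Fraction(v) for v in (1, 3, 2, 5, 4)]

# Least squares solution on all n responses via the normal equations.
s00 = sum(x[0] * x[0] for x in X)
s01 = sum(x[0] * x[1] for x in X)
s11 = sum(x[1] * x[1] for x in X)
t0 = sum(x[0] * yi for x, yi in zip(X, y))
t1 = sum(x[1] * yi for x, yi in zip(X, y))
w_full = solve2((s00, s01), (s01, s11), (t0, t1))

# Expectation of the subset solution under volume sampling of size d = 2:
# each pair S = {i, j} has weight det(X_S)^2; singular pairs have weight 0.
pairs = [(i, j) for i, j in combinations(range(len(X)), 2)
         if det2(X[i], X[j]) != 0]
total = sum(det2(X[i], X[j]) ** 2 for i, j in pairs)
acc0, acc1 = Fraction(0), Fraction(0)
for i, j in pairs:
    prob = det2(X[i], X[j]) ** 2 / total
    w_s = solve2(X[i], X[j], (y[i], y[j]))
    acc0 += prob * w_s[0]
    acc1 += prob * w_s[1]
w_expected = (acc0, acc1)

print(w_full == w_expected)
```

Exact rational arithmetic is used so that the comparison is an equality rather than a floating-point tolerance check. Enumerating all subsets is only feasible for tiny inputs; the efficient sampling algorithms the abstract refers to are not reproduced here.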

Scalable photonic reinforcement learning by time-division multiplexing of laser chaos

Title Scalable photonic reinforcement learning by time-division multiplexing of laser chaos
Authors Makoto Naruse, Takatomo Mihana, Hirokazu Hori, Hayato Saigo, Kazuya Okamura, Mikio Hasegawa, Atsushi Uchida
Abstract Reinforcement learning involves decision making in dynamic and uncertain environments and constitutes a crucial element of artificial intelligence. In our previous work, we experimentally demonstrated that the ultrafast chaotic oscillatory dynamics of lasers can be used to solve the two-armed bandit problem efficiently, which requires decision making concerning a class of difficult trade-offs called the exploration-exploitation dilemma. However, only two selections were employed in that research; thus, the scalability of the laser-chaos-based reinforcement learning should be clarified. In this study, we demonstrated a scalable, pipelined principle of resolving the multi-armed bandit problem by introducing time-division multiplexing of chaotically oscillated ultrafast time-series. The experimental demonstrations in which bandit problems with up to 64 arms were successfully solved are presented in this report. Detailed analyses are also provided that include performance comparisons among laser chaos signals generated in different physical conditions, which coincide with the diffusivity inherent in the time series. This study paves the way for ultrafast reinforcement learning by taking advantage of the ultrahigh bandwidths of light wave and practical enabling technologies.
Tasks Decision Making, Time Series
Published 2018-03-26
URL http://arxiv.org/abs/1803.09425v1
PDF http://arxiv.org/pdf/1803.09425v1.pdf
PWC https://paperswithcode.com/paper/scalable-photonic-reinforcement-learning-by
Repo
Framework
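
For readers unfamiliar with the multi-armed bandit problem mentioned in the abstract above, the exploration-exploitation dilemma can be illustrated in software. The sketch below uses a plain epsilon-greedy strategy, which is a swapped-in textbook method and not the laser-chaos, time-division-multiplexing scheme of the paper. The reward probabilities, step count, and the function name `epsilon_greedy` are assumptions chosen for illustration.

```python
import random

def epsilon_greedy(probs, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy policy.

    probs: hypothetical success probability of each arm.
    Returns per-arm pull counts and estimated mean rewards.
    """
    rng = random.Random(seed)
    n_arms = len(probs)
    counts = [0] * n_arms
    values = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = int(rng.random() * n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: values[a])
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values

counts, values = epsilon_greedy([0.2, 0.4, 0.6, 0.9])
print(counts)
```

With a small `epsilon` the policy mostly pulls the arm it currently believes is best while still occasionally sampling the others, which is the trade-off the abstract describes.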

Language Identification of Bengali-English Code-Mixed data using Character & Phonetic based LSTM Models

Title Language Identification of Bengali-English Code-Mixed data using Character & Phonetic based LSTM Models
Authors Soumil Mandal, Sourya Dipta Das, Dipankar Das
Abstract Language identification of social media text still remains a challenging task due to properties like code-mixing and inconsistent phonetic transliterations. In this paper, we present a supervised learning approach for language identification at the word level of low resource Bengali-English code-mixed data taken from social media. We employ two methods of word encoding, namely character based and root phone based to train our deep LSTM models. Utilizing these two models we created two ensemble models using stacking and threshold technique which gave 91.78% and 92.35% accuracies respectively on our testing data.
Tasks Language Identification
Published 2018-03-10
URL http://arxiv.org/abs/1803.03859v2
PDF http://arxiv.org/pdf/1803.03859v2.pdf
PWC https://paperswithcode.com/paper/language-identification-of-bengali-english
Repo
Framework

Multiple Sclerosis Lesion Inpainting Using Non-Local Partial Convolutions

Title Multiple Sclerosis Lesion Inpainting Using Non-Local Partial Convolutions
Authors Hao Xiong, Chaoyue Wang, Dacheng Tao, Michael Barnett, Chenyu Wang
Abstract Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system (CNS) that results in focal injury to the grey and white matter. The presence of white matter lesions biases morphometric analyses such as registration, individual longitudinal measurements and tissue segmentation for brain volume measurements. Lesion-inpainting with intensities derived from surrounding healthy tissue represents one approach to alleviate such problems. However, existing methods inpaint lesions based on texture information derived from local surrounding tissue, often leading to inconsistent inpainting and the generation of artifacts such as intensity discrepancy and blurriness. Based on these observations, we propose non-local partial convolutions (NLPC), which integrate a Unet-like network with the non-local module. The non-local module is exploited to capture long range dependencies between the lesion area and remaining normal-appearing brain regions. Then, the lesion area is filled by referring to normal-appearing regions with more similar features. This method generates inpainted regions that appear more realistic and natural. Our quantitative experimental results also demonstrate the superiority of this technique over existing state-of-the-art inpainting methods.
Tasks
Published 2018-12-24
URL https://arxiv.org/abs/1901.00055v3
PDF https://arxiv.org/pdf/1901.00055v3.pdf
PWC https://paperswithcode.com/paper/multiple-sclerosis-lesion-inpainting-using
Repo
Framework

Simultaneous Measurement Imputation and Outcome Prediction for Achilles Tendon Rupture Rehabilitation

Title Simultaneous Measurement Imputation and Outcome Prediction for Achilles Tendon Rupture Rehabilitation
Authors Charles Hamesse, Ruibo Tu, Paul Ackermann, Hedvig Kjellström, Cheng Zhang
Abstract Achilles Tendon Rupture (ATR) is one of the typical soft tissue injuries. Rehabilitation after such a musculoskeletal injury remains a prolonged process with a very variable outcome. Accurately predicting rehabilitation outcome is crucial for treatment decision support. However, it is challenging to train an automatic method for predicting the ATR rehabilitation outcome from treatment data, due to a massive amount of missing entries in the data recorded from ATR patients, as well as complex nonlinear relations between measurements and outcomes. In this work, we design an end-to-end probabilistic framework to impute missing data entries and predict rehabilitation outcomes simultaneously. We evaluate our model on a real-life ATR clinical cohort, comparing with various baselines. The proposed method demonstrates its clear superiority over traditional methods which typically perform imputation and prediction in two separate stages.
Tasks Imputation
Published 2018-09-08
URL https://arxiv.org/abs/1810.03435v2
PDF https://arxiv.org/pdf/1810.03435v2.pdf
PWC https://paperswithcode.com/paper/simultaneous-measurement-imputation-and
Repo
Framework

Temporal coherence-based self-supervised learning for laparoscopic workflow analysis

Title Temporal coherence-based self-supervised learning for laparoscopic workflow analysis
Authors Isabel Funke, Alexander Jenke, Sören Torge Mees, Jürgen Weitz, Stefanie Speidel, Sebastian Bodenstedt
Abstract In order to provide the right type of assistance at the right time, computer-assisted surgery systems need context awareness. To achieve this, methods for surgical workflow analysis are crucial. Currently, convolutional neural networks provide the best performance for video-based workflow analysis tasks. For training such networks, large amounts of annotated data are necessary. However, collecting a sufficient amount of data is often costly, time-consuming, and not always feasible. In this paper, we address this problem by presenting and comparing different approaches for self-supervised pretraining of neural networks on unlabeled laparoscopic videos using temporal coherence. We evaluate our pretrained networks on Cholec80, a publicly available dataset for surgical phase segmentation, on which a maximum F1 score of 84.6 was reached. Furthermore, we were able to achieve an increase of the F1 score of up to 10 points when compared to a non-pretrained neural network.
Tasks
Published 2018-06-18
URL http://arxiv.org/abs/1806.06811v2
PDF http://arxiv.org/pdf/1806.06811v2.pdf
PWC https://paperswithcode.com/paper/temporal-coherence-based-self-supervised
Repo
Framework

Hu-Fu: Hardware and Software Collaborative Attack Framework against Neural Networks

Title Hu-Fu: Hardware and Software Collaborative Attack Framework against Neural Networks
Authors Wenshuo Li, Jincheng Yu, Xuefei Ning, Pengjun Wang, Qi Wei, Yu Wang, Huazhong Yang
Abstract Recently, Deep Learning (DL), especially the Convolutional Neural Network (CNN), has developed rapidly and been applied to many tasks, such as image classification, face recognition, image segmentation, and human detection. Due to their superior performance, DL-based models have a wide range of applications in many areas, some of which are extremely safety-critical, e.g. intelligent surveillance and autonomous driving. Due to the latency and privacy problems of cloud computing, embedded accelerators are popular in these safety-critical areas. However, the robustness of the embedded DL system might be harmed by inserting hardware/software Trojans into the accelerator and the neural network model, since the accelerator and deploy tool (or neural network model) are usually provided by third-party companies. Fortunately, inserting hardware Trojans can only achieve inflexible attacks, which means that hardware Trojans can easily break down the whole system or exchange two outputs, but can’t make a CNN recognize unknown pictures as targets. Though inserting software Trojans allows more freedom of attack, it often requires tampering with input images, which is not easy for attackers. So, in this paper, we propose a hardware-software collaborative attack framework to inject hidden neural network Trojans, which works as a back-door without requiring manipulation of input images and is flexible for different scenarios. We test our attack framework on image classification and face recognition tasks, and obtain attack success rates of 92.6% and 100% on CIFAR10 and YouTube Faces, respectively, while keeping almost the same accuracy as the unattacked model in the normal mode. In addition, we show a specific attack scenario in which a face recognition system is attacked and gives a specific wrong answer.
Tasks Autonomous Driving, Face Recognition, Human Detection, Image Classification, Semantic Segmentation
Published 2018-05-14
URL http://arxiv.org/abs/1805.05098v2
PDF http://arxiv.org/pdf/1805.05098v2.pdf
PWC https://paperswithcode.com/paper/hu-fu-hardware-and-software-collaborative
Repo
Framework