October 18, 2019


Paper Group ANR 679


Types, Tokens, and Hapaxes: A New Heap’s Law


Title	Types, Tokens, and Hapaxes: A New Heap’s Law
Authors	Victor Davis
Abstract	Heap’s Law states that in a large enough text corpus, the number of types as a function of tokens grows as $N=KM^\beta$ for some free parameters $K,\beta$. Much has been written about how this result and various generalizations can be derived from Zipf’s Law. Here we derive from first principles a completely novel expression of the type-token curve and prove its superior accuracy on real text. This expression naturally generalizes to equally accurate estimates for counting hapaxes and higher $n$-legomena.
Tasks
Published	2018-12-31
URL	http://arxiv.org/abs/1901.00521v1
PDF	http://arxiv.org/pdf/1901.00521v1.pdf
PWC	https://paperswithcode.com/paper/types-tokens-and-hapaxes-a-new-heaps-law
Repo
Framework
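
The classic form quoted in the abstract, $N=KM^\beta$, can be fitted to a corpus with a few lines of code. The following is a minimal sketch of that classic power-law fit only, not the new expression derived in the paper (the abstract does not state it). The function names `type_token_curve`, `fit_heaps`, and `count_hapaxes` are illustrative assumptions, and the fit is an ordinary least-squares line in log-log space.

```python
import math
from collections import Counter


def type_token_curve(tokens):
    """Return (M, N) pairs: tokens read so far and distinct types seen so far."""
    seen = set()
    curve = []
    for m, tok in enumerate(tokens, start=1):
        seen.add(tok)
        curve.append((m, len(seen)))
    return curve


def fit_heaps(curve):
    """Fit log N = log K + beta * log M by least squares; return (K, beta)."""
    xs = [math.log(m) for m, _ in curve]
    ys = [math.log(n) for _, n in curve]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    K = math.exp(mean_y - beta * mean_x)
    return K, beta


def count_hapaxes(tokens):
    """Count types that occur exactly once (hapax legomena)."""
    counts = Counter(tokens)
    return sum(1 for c in counts.values() if c == 1)
```

On a real corpus, `fit_heaps(type_token_curve(tokens))` gives the free parameters $K,\beta$ of the classic law, which is the baseline the paper claims to improve on.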

Trajectory Factory: Tracklet Cleaving and Re-connection by Deep Siamese Bi-GRU for Multiple Object Tracking


Title	Trajectory Factory: Tracklet Cleaving and Re-connection by Deep Siamese Bi-GRU for Multiple Object Tracking
Authors	Cong Ma, Changshui Yang, Fan Yang, Yueqing Zhuang, Ziwei Zhang, Huizhu Jia, Xiaodong Xie
Abstract	Multi-Object Tracking (MOT) is a challenging task in complex scenes such as surveillance and autonomous driving. In this paper, we propose a novel tracklet processing method that cleaves and re-connects tracklets under crowding or long-term occlusion using a Siamese Bi-Gated Recurrent Unit (GRU). The tracklet generation step uses object features extracted by a CNN and an RNN to create high-confidence tracklet candidates in sparse scenarios. Because of mis-tracking during generation, tracklets that mix different objects are split into several sub-tracklets by a bidirectional GRU. After that, a Siamese GRU based tracklet re-connection method links the sub-tracklets that belong to the same object to form a whole trajectory. In addition, we extract tracklet images from existing MOT datasets and propose a novel dataset to train our networks. The proposed dataset contains more than 95160 pedestrian images of 793 different persons. On average, there are 120 images for each person, with positions and sizes. Experimental results demonstrate the advantages of our model over the state-of-the-art methods on MOT16.
Tasks	Autonomous Driving, Multi-Object Tracking, Multiple Object Tracking, Object Tracking
Published	2018-04-12
URL	http://arxiv.org/abs/1804.04555v1
PDF	http://arxiv.org/pdf/1804.04555v1.pdf
PWC	https://paperswithcode.com/paper/trajectory-factory-tracklet-cleaving-and-re
Repo
Framework

Paying Attention to Attention: Highlighting Influential Samples in Sequential Analysis


Title	Paying Attention to Attention: Highlighting Influential Samples in Sequential Analysis
Authors	Cynthia Freeman, Jonathan Merriman, Abhinav Aggarwal, Ian Beaver, Abdullah Mueen
Abstract	In (Yang et al. 2016), a hierarchical attention network (HAN) is created for document classification. The attention layer can be used to visualize text influential in classifying the document, thereby explaining the model’s prediction. We successfully applied HAN to a sequential analysis task in the form of real-time monitoring of turn taking in conversations. However, we discovered instances where the attention weights were uniform at the stopping point (indicating all turns were equivalently influential to the classifier), preventing meaningful visualization for real-time human review or classifier improvement. We observed that attention weights for turns fluctuated as the conversations progressed, indicating turns had varying influence based on conversation state. Leveraging this observation, we develop a method to create more informative real-time visuals (as confirmed by human reviewers) in cases of uniform attention weights using the changes in turn importance as a conversation progresses over time.
Tasks	Document Classification
Published	2018-08-06
URL	http://arxiv.org/abs/1808.02113v1
PDF	http://arxiv.org/pdf/1808.02113v1.pdf
PWC	https://paperswithcode.com/paper/paying-attention-to-attention-highlighting
Repo
Framework

Automatic Liver Segmentation with Adversarial Loss and Convolutional Neural Network


Title	Automatic Liver Segmentation with Adversarial Loss and Convolutional Neural Network
Authors	Bora Baydar, Savas Ozkan, Gozde Bozdagi Akar
Abstract	Automatic segmentation of medical images is among the most in-demand tasks in the medical information field, since it saves experts' time and avoids human error. In this work, a method based on Conditional Adversarial Networks and Fully Convolutional Networks is proposed for the automatic segmentation of liver MRIs. The proposed method, without any post-processing, achieved second place in the SIU Liver Segmentation Challenge 2018, the data for which was provided by Dokuz Eylül University. In this paper, some improvements for the post-processing step are also proposed, and it is shown that with these additions the method outperforms other baseline methods.
Tasks	Liver Segmentation
Published	2018-11-28
URL	http://arxiv.org/abs/1811.11566v1
PDF	http://arxiv.org/pdf/1811.11566v1.pdf
PWC	https://paperswithcode.com/paper/automatic-liver-segmentation-with-adversarial
Repo
Framework

PCNNA: A Photonic Convolutional Neural Network Accelerator


Title	PCNNA: A Photonic Convolutional Neural Network Accelerator
Authors	Armin Mehrabian, Yousra Al-Kabani, Volker J Sorger, Tarek El-Ghazawi
Abstract	Convolutional Neural Networks (CNNs) have been the centerpiece of many applications including but not limited to computer vision, speech processing, and Natural Language Processing (NLP). However, the computationally expensive convolution operations pose many challenges to the performance and scalability of CNNs. In parallel, photonic systems, which are traditionally employed for data communication, have enjoyed recent popularity for data processing due to their high bandwidth, low power consumption, and reconfigurability. Here we propose a Photonic Convolutional Neural Network Accelerator (PCNNA) as a proof-of-concept design to speed up the convolution operation for CNNs. Our design is based on the recently introduced silicon photonic microring weight banks, which use the broadcast-and-weight protocol to perform Multiply And Accumulate (MAC) operations and move data through the layers of a neural network. We aim to exploit the synergy between the inherent parallelism of photonics, in the form of Wavelength Division Multiplexing (WDM), and the sparsity of connections between input feature maps and kernels in CNNs. While our full system design offers a speedup of more than 3 orders of magnitude in execution time, its optical core potentially offers a speedup of more than 5 orders of magnitude compared to state-of-the-art electronic counterparts.
Tasks
Published	2018-07-23
URL	http://arxiv.org/abs/1807.08792v1
PDF	http://arxiv.org/pdf/1807.08792v1.pdf
PWC	https://paperswithcode.com/paper/pcnna-a-photonic-convolutional-neural-network
Repo
Framework

Toward Designing Convergent Deep Operator Splitting Methods for Task-specific Nonconvex Optimization


Title	Toward Designing Convergent Deep Operator Splitting Methods for Task-specific Nonconvex Optimization
Authors	Risheng Liu, Shichao Cheng, Yi He, Xin Fan, Zhongxuan Luo
Abstract	Operator splitting methods have been successfully used in computational sciences, statistics, learning and vision to reduce complex problems into a series of simpler subproblems. However, prevalent splitting schemes are mostly established based only on the mathematical properties of some general optimization models. Obtaining practical, task-specific optimal solutions is therefore a laborious process that often requires many iterations of ideation and validation, especially for nonconvex problems in real-world scenarios. To overcome these limits, we introduce a new algorithmic framework, called Learnable Bregman Splitting (LBS), to perform deep-architecture-based operator splitting for nonconvex optimization based on a specific task model. Thanks to its data-dependent (i.e., learnable) nature, LBS can not only speed up convergence but also avoid unwanted trivial solutions for real-world tasks. Even with inexact deep iterations, we can still establish global convergence and estimate the asymptotic convergence rate of LBS by enforcing only some fairly loose assumptions. Extensive experiments on different applications (e.g., image completion and deblurring) verify our theoretical results and show the superiority of LBS over existing methods.
Tasks	Deblurring
Published	2018-04-28
URL	http://arxiv.org/abs/1804.10798v1
PDF	http://arxiv.org/pdf/1804.10798v1.pdf
PWC	https://paperswithcode.com/paper/toward-designing-convergent-deep-operator
Repo
Framework

A Self-paced Regularization Framework for Partial-Label Learning


Title	A Self-paced Regularization Framework for Partial-Label Learning
Authors	Gengyu Lyu, Songhe Feng, Congyang Lang
Abstract	Partial label learning (PLL) aims to solve the problem where each training instance is associated with a set of candidate labels, one of which is the correct label. Most PLL algorithms try to disambiguate the candidate label set, either by simply treating each candidate label equally or by iteratively identifying the true label. Nonetheless, existing algorithms usually treat all labels and instances equally, and the complexities of both labels and instances are not taken into consideration during the learning stage. Inspired by the successful application of the self-paced learning strategy in the machine learning field, we integrate the self-paced regime into the partial label learning framework and propose a novel Self-Paced Partial-Label Learning (SP-PLL) algorithm, which controls the learning process to alleviate this problem by ranking the priorities of the training examples together with their candidate labels during each learning iteration. Extensive experiments and comparisons with other baseline methods demonstrate the effectiveness and robustness of the proposed method.
Tasks
Published	2018-04-20
URL	http://arxiv.org/abs/1804.07759v2
PDF	http://arxiv.org/pdf/1804.07759v2.pdf
PWC	https://paperswithcode.com/paper/a-self-paced-regularization-framework-for-1
Repo
Framework

Stochastic natural gradient descent draws posterior samples in function space


Title	Stochastic natural gradient descent draws posterior samples in function space
Authors	Samuel L. Smith, Daniel Duckworth, Semon Rezchikov, Quoc V. Le, Jascha Sohl-Dickstein
Abstract	Recent work has argued that stochastic gradient descent can approximate the Bayesian uncertainty in model parameters near local minima. In this work we develop a similar correspondence for minibatch natural gradient descent (NGD). We prove that for sufficiently small learning rates, if the model predictions on the training set approach the true conditional distribution of labels given inputs, the stationary distribution of minibatch NGD approaches a Bayesian posterior near local minima. The temperature $T = \epsilon N / (2B)$ is controlled by the learning rate $\epsilon$, training set size $N$ and batch size $B$. However, minibatch NGD is not parameterisation invariant and it does not sample a valid posterior away from local minima. We therefore propose a novel optimiser, “stochastic NGD”, which introduces the additional correction terms required to preserve both properties.
Tasks
Published	2018-06-25
URL	http://arxiv.org/abs/1806.09597v4
PDF	http://arxiv.org/pdf/1806.09597v4.pdf
PWC	https://paperswithcode.com/paper/stochastic-natural-gradient-descent-draws
Repo
Framework
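
The temperature formula in the abstract, $T = \epsilon N / (2B)$, is easy to explore numerically. The sketch below only evaluates that formula; the function name `ngd_temperature` and the example values are illustrative assumptions, not taken from the paper.

```python
def ngd_temperature(learning_rate, train_size, batch_size):
    """Evaluate T = epsilon * N / (2 * B) from the abstract."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return learning_rate * train_size / (2 * batch_size)


# Hypothetical settings, chosen only to show how T scales.
base = ngd_temperature(learning_rate=0.01, train_size=50_000, batch_size=100)
# Doubling the batch size halves the temperature.
bigger_batch = ngd_temperature(learning_rate=0.01, train_size=50_000, batch_size=200)
# Scaling the learning rate and batch size together leaves T unchanged.
scaled_both = ngd_temperature(learning_rate=0.02, train_size=50_000, batch_size=200)
```

Under this formula, the ratio $\epsilon / B$ is what sets the temperature for a fixed training set, so raising the batch size without raising the learning rate lowers $T$.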


Learning a Discriminative Prior for Blind Image Deblurring


Title	Learning a Discriminative Prior for Blind Image Deblurring
Authors	Lerenhan Li, Jinshan Pan, Wei-Sheng Lai, Changxin Gao, Nong Sang, Ming-Hsuan Yang
Abstract	We present an effective blind image deblurring method based on a data-driven discriminative prior. Our work is motivated by the fact that a good image prior should favor clear images over blurred images. In this work, we formulate the image prior as a binary classifier, which can be realized by a deep convolutional neural network (CNN). The learned prior is able to distinguish whether an input image is clear or not. Embedded into the maximum a posteriori (MAP) framework, it helps blind deblurring in various scenarios, including natural, face, text, and low-illumination images. However, it is difficult to optimize the deblurring method with the learned image prior as it involves a non-linear CNN. Therefore, we develop an efficient numerical approach based on the half-quadratic splitting method and the gradient descent algorithm to solve the proposed model. Furthermore, the proposed model can be easily extended to non-uniform deblurring. Both qualitative and quantitative experimental results show that our method performs favorably against state-of-the-art algorithms as well as domain-specific image deblurring approaches.
Tasks	Blind Image Deblurring, Deblurring
Published	2018-03-09
URL	http://arxiv.org/abs/1803.03363v2
PDF	http://arxiv.org/pdf/1803.03363v2.pdf
PWC	https://paperswithcode.com/paper/learning-a-discriminative-prior-for-blind
Repo
Framework

Adversarial Network Compression


Title	Adversarial Network Compression
Authors	Vasileios Belagiannis, Azade Farshad, Fabio Galasso
Abstract	Neural network compression has recently received much attention due to the computational requirements of modern deep models. In this work, our objective is to transfer knowledge from a deep and accurate model to a smaller one. Our contributions are threefold: (i) we propose an adversarial network compression approach to train the small student network to mimic the large teacher, without the need for labels during training; (ii) we introduce a regularization scheme to prevent a trivially-strong discriminator without reducing the network capacity; and (iii) our approach generalizes to different teacher-student models. In an extensive evaluation on five standard datasets, we show that our student has a small accuracy drop, achieves better performance than other knowledge transfer approaches, and surpasses the performance of the same network trained with labels. In addition, we demonstrate state-of-the-art results compared to other compression strategies.
Tasks	Neural Network Compression, Transfer Learning
Published	2018-03-28
URL	http://arxiv.org/abs/1803.10750v2
PDF	http://arxiv.org/pdf/1803.10750v2.pdf
PWC	https://paperswithcode.com/paper/adversarial-network-compression
Repo
Framework

Thompson Sampling for Noncompliant Bandits


Title	Thompson Sampling for Noncompliant Bandits
Authors	Andrew Stirn, Tony Jebara
Abstract	Thompson sampling, a Bayesian method for balancing exploration and exploitation in bandit problems, has theoretical guarantees and exhibits strong empirical performance in many domains. Traditional Thompson sampling, however, assumes perfect compliance, where an agent’s chosen action is treated as the implemented action. This article introduces a stochastic noncompliance model that relaxes this assumption. We prove that any noncompliance in a 2-armed Bernoulli bandit increases existing regret bounds. With our noncompliance model, we derive Thompson sampling variants that explicitly handle both observed and latent noncompliance. With extensive empirical analysis, we demonstrate that our algorithms either match or outperform traditional Thompson sampling in both compliant and noncompliant environments.
Tasks
Published	2018-12-03
URL	http://arxiv.org/abs/1812.00856v1
PDF	http://arxiv.org/pdf/1812.00856v1.pdf
PWC	https://paperswithcode.com/paper/thompson-sampling-for-noncompliant-bandits
Repo
Framework
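
For readers unfamiliar with the baseline, the following is a minimal sketch of traditional Beta-Bernoulli Thompson sampling, the method the abstract says assumes perfect compliance. The `compliance_prob` parameter is a hypothetical stand-in used here only to simulate an environment where the chosen arm is sometimes not the implemented arm; it is not the stochastic noncompliance model from the paper, and the function name is an assumption as well.

```python
import random


def thompson_sampling(true_means, horizon, compliance_prob=1.0, seed=0):
    """Run naive Beta-Bernoulli Thompson sampling; return (pulls, total_reward)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    alpha = [1.0] * n_arms  # Beta(1, 1) prior: success pseudo-counts
    beta = [1.0] * n_arms   # Beta(1, 1) prior: failure pseudo-counts
    pulls = [0] * n_arms
    total_reward = 0
    for _ in range(horizon):
        # Sample a plausible mean for each arm and choose the best sample.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        chosen = max(range(n_arms), key=lambda a: samples[a])
        # Hypothetical noncompliance: the implemented arm may differ.
        if rng.random() < compliance_prob:
            implemented = chosen
        else:
            implemented = rng.randrange(n_arms)
        reward = 1 if rng.random() < true_means[implemented] else 0
        # The naive agent credits the chosen arm, as if compliance were perfect.
        alpha[chosen] += reward
        beta[chosen] += 1 - reward
        pulls[chosen] += 1
        total_reward += reward
    return pulls, total_reward
```

Calling `thompson_sampling([0.2, 0.8], horizon=2000)` simulates the fully compliant case, while a lower `compliance_prob` shows how the naive update attributes rewards to arms that were not actually played, which is the kind of mismatch the paper's variants are designed to handle.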


Representing Social Media Users for Sarcasm Detection


Title	Representing Social Media Users for Sarcasm Detection
Authors	Y. Alex Kolchinski, Christopher Potts
Abstract	We explore two methods for representing authors in the context of textual sarcasm detection: a Bayesian approach that directly represents authors’ propensities to be sarcastic, and a dense embedding approach that can learn interactions between the author and the text. Using the SARC dataset of Reddit comments, we show that augmenting a bidirectional RNN with these representations improves performance; the Bayesian approach suffices in homogeneous contexts, whereas the added power of the dense embeddings proves valuable in more diverse ones.
Tasks	Sarcasm Detection
Published	2018-08-25
URL	http://arxiv.org/abs/1808.08470v1
PDF	http://arxiv.org/pdf/1808.08470v1.pdf
PWC	https://paperswithcode.com/paper/representing-social-media-users-for-sarcasm
Repo
Framework

Deep Reinforcement Learning for Join Order Enumeration


Title	Deep Reinforcement Learning for Join Order Enumeration
Authors	Ryan Marcus, Olga Papaemmanouil
Abstract	Join order selection plays a significant role in query performance. However, modern query optimizers typically employ static join enumeration algorithms that do not receive any feedback about the quality of the resulting plan. Hence, optimizers often repeatedly choose the same bad plan, as they do not have a mechanism for “learning from their mistakes”. In this paper, we argue that existing deep reinforcement learning techniques can be applied to address this challenge. These techniques, powered by artificial neural networks, can automatically improve decision making by incorporating feedback from their successes and failures. Towards this goal, we present ReJOIN, a proof-of-concept join enumerator, and present preliminary results indicating that ReJOIN can match or outperform the PostgreSQL optimizer in terms of plan quality and join enumeration efficiency.
Tasks	Decision Making
Published	2018-02-28
URL	http://arxiv.org/abs/1803.00055v2
PDF	http://arxiv.org/pdf/1803.00055v2.pdf
PWC	https://paperswithcode.com/paper/deep-reinforcement-learning-for-join-order
Repo
Framework

VPE: Variational Policy Embedding for Transfer Reinforcement Learning


Title	VPE: Variational Policy Embedding for Transfer Reinforcement Learning
Authors	Isac Arnekvist, Danica Kragic, Johannes A. Stork
Abstract	Reinforcement Learning methods are capable of solving complex problems, but the resulting policies might perform poorly in environments that are even slightly different. In robotics especially, training and deployment conditions often vary and data collection is expensive, making retraining undesirable. Simulation training allows for feasible training times, but suffers from a reality gap when applied in real-world settings. This raises the need for efficient adaptation of policies acting in new environments. We consider this as a problem of transferring knowledge within a family of similar Markov decision processes. For this purpose we assume that Q-functions are generated by some low-dimensional latent variable. Given such a Q-function, we can find a master policy that can adapt given different values of this latent variable. Our method learns both the generative mapping and an approximate posterior of the latent variables, enabling identification of policies for new tasks by searching only in the latent space, rather than the space of all policies. The low-dimensional space and master policy found by our method enable policies to quickly adapt to new environments. We demonstrate the method on both a pendulum swing-up task in simulation and simulation-to-real transfer on a pushing task.
Tasks	Transfer Reinforcement Learning
Published	2018-09-10
URL	http://arxiv.org/abs/1809.03548v2
PDF	http://arxiv.org/pdf/1809.03548v2.pdf
PWC	https://paperswithcode.com/paper/vpe-variational-policy-embedding-for-transfer
Repo
Framework

Nonisometric Surface Registration via Conformal Laplace-Beltrami Basis Pursuit


Title	Nonisometric Surface Registration via Conformal Laplace-Beltrami Basis Pursuit
Authors	Stefan C. Schonsheck, Michael M. Bronstein, Rongjie Lai
Abstract	Surface registration is one of the most fundamental problems in geometry processing. Many approaches have been developed to tackle this problem in cases where the surfaces are nearly isometric. However, it is much more challenging to compute correspondence between surfaces which are intrinsically less similar. In this paper, we propose a variational model to align the Laplace-Beltrami (LB) eigensystems of two non-isometric genus zero shapes via conformal deformations. This method enables us to compute geometrically meaningful point-to-point maps between non-isometric shapes. Our model is based on a novel basis pursuit scheme whereby we simultaneously compute a conformal deformation of a ‘target shape’ and its deformed LB eigensystem. We solve the model using a proximal alternating minimization algorithm hybridized with the augmented Lagrangian method, which produces accurate correspondences given only a few landmark points. We also propose a reinitialization scheme to overcome some of the difficulties caused by the non-convexity of the variational problem. Intensive numerical experiments illustrate the effectiveness and robustness of the proposed method in handling non-isometric surfaces with large deformation, with respect to both noise on the underlying manifolds and errors within the given landmarks.
Tasks
Published	2018-09-19
URL	http://arxiv.org/abs/1809.07399v1
PDF	http://arxiv.org/pdf/1809.07399v1.pdf
PWC	https://paperswithcode.com/paper/nonisometric-surface-registration-via
Repo
Framework