Paper Group ANR 679
Types, Tokens, and Hapaxes: A New Heap’s Law. Trajectory Factory: Tracklet Cleaving and Re-connection by Deep Siamese Bi-GRU for Multiple Object Tracking. Paying Attention to Attention: Highlighting Influential Samples in Sequential Analysis. Automatic Liver Segmentation with Adversarial Loss and Convolutional Neural Network. PCNNA: A Photonic Conv …
Types, Tokens, and Hapaxes: A New Heap’s Law
Title | Types, Tokens, and Hapaxes: A New Heap’s Law |
Authors | Victor Davis |
Abstract | Heap’s Law states that in a large enough text corpus, the number of types as a function of tokens grows as $N=KM^\beta$ for some free parameters $K,\beta$. Much has been written about how this result and various generalizations can be derived from Zipf’s Law. Here we derive from first principles a completely novel expression of the type-token curve and prove its superior accuracy on real text. This expression naturally generalizes to equally accurate estimates for counting hapaxes and higher $n$-legomena. |
Tasks | |
Published | 2018-12-31 |
URL | http://arxiv.org/abs/1901.00521v1 |
http://arxiv.org/pdf/1901.00521v1.pdf | |
PWC | https://paperswithcode.com/paper/types-tokens-and-hapaxes-a-new-heaps-law |
Repo | |
Framework | |
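The classical form quoted in the abstract above, $N = K M^\beta$, can be checked against any tokenized text. The following is a minimal sketch, not the paper's new expression: it builds the empirical type-token curve, fits $K$ and $\beta$ by ordinary least squares in log-log space, and counts hapaxes (types occurring exactly once). The function names are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def type_token_curve(tokens):
    """Return [(M, N)]: N distinct types among the first M tokens."""
    seen = set()
    curve = []
    for m, tok in enumerate(tokens, start=1):
        seen.add(tok)
        curve.append((m, len(seen)))
    return curve


def fit_heaps(curve):
    """Least-squares fit of log N = log K + beta * log M; returns (K, beta)."""
    xs = [math.log(m) for m, _ in curve]
    ys = [math.log(n) for _, n in curve]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    K = math.exp(mean_y - beta * mean_x)
    return K, beta


def count_hapaxes(tokens):
    """Number of types that occur exactly once in the token sequence."""
    return sum(1 for c in Counter(tokens).values() if c == 1)


if __name__ == "__main__":
    toy = "a b a c b d".split()
    print(type_token_curve(toy))
    print(count_hapaxes(toy))
    # Synthetic curve that follows the power law exactly (assumed values).
    print(fit_heaps([(m, 2.0 * m ** 0.5) for m in range(1, 200)]))
```

A log-log regression is only one common way to estimate the two free parameters; it weights small corpora sizes heavily, so a real comparison against the paper's expression would need a more careful fit.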
Trajectory Factory: Tracklet Cleaving and Re-connection by Deep Siamese Bi-GRU for Multiple Object Tracking
Title | Trajectory Factory: Tracklet Cleaving and Re-connection by Deep Siamese Bi-GRU for Multiple Object Tracking |
Authors | Cong Ma, Changshui Yang, Fan Yang, Yueqing Zhuang, Ziwei Zhang, Huizhu Jia, Xiaodong Xie |
Abstract | Multi-Object Tracking (MOT) is a challenging task in complex scenes such as surveillance and autonomous driving. In this paper, we propose a novel tracklet processing method that uses a Siamese Bi-Gated Recurrent Unit (GRU) to cleave and re-connect tracklets under crowding or long-term occlusion. The tracklet generation utilizes object features extracted by a CNN and an RNN to create high-confidence tracklet candidates in sparse scenarios. Because of mis-tracking in the generation process, tracklets that contain different objects are split into several sub-tracklets by a bidirectional GRU. After that, a Siamese GRU based tracklet re-connection method links the sub-tracklets that belong to the same object to form a whole trajectory. In addition, we extract the tracklet images from existing MOT datasets and propose a novel dataset to train our networks. The proposed dataset contains more than 95160 pedestrian images of 793 different persons, with an average of 120 images per person annotated with positions and sizes. Experimental results demonstrate the advantages of our model over the state-of-the-art methods on MOT16. |
Tasks | Autonomous Driving, Multi-Object Tracking, Multiple Object Tracking, Object Tracking |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04555v1 |
http://arxiv.org/pdf/1804.04555v1.pdf | |
PWC | https://paperswithcode.com/paper/trajectory-factory-tracklet-cleaving-and-re |
Repo | |
Framework | |
Paying Attention to Attention: Highlighting Influential Samples in Sequential Analysis
Title | Paying Attention to Attention: Highlighting Influential Samples in Sequential Analysis |
Authors | Cynthia Freeman, Jonathan Merriman, Abhinav Aggarwal, Ian Beaver, Abdullah Mueen |
Abstract | In (Yang et al. 2016), a hierarchical attention network (HAN) is created for document classification. The attention layer can be used to visualize text influential in classifying the document, thereby explaining the model’s prediction. We successfully applied HAN to a sequential analysis task in the form of real-time monitoring of turn taking in conversations. However, we discovered instances where the attention weights were uniform at the stopping point (indicating all turns were equivalently influential to the classifier), preventing meaningful visualization for real-time human review or classifier improvement. We observed that attention weights for turns fluctuated as the conversations progressed, indicating turns had varying influence based on conversation state. Leveraging this observation, we develop a method to create more informative real-time visuals (as confirmed by human reviewers) in cases of uniform attention weights using the changes in turn importance as a conversation progresses over time. |
Tasks | Document Classification |
Published | 2018-08-06 |
URL | http://arxiv.org/abs/1808.02113v1 |
http://arxiv.org/pdf/1808.02113v1.pdf | |
PWC | https://paperswithcode.com/paper/paying-attention-to-attention-highlighting |
Repo | |
Framework | |
Automatic Liver Segmentation with Adversarial Loss and Convolutional Neural Network
Title | Automatic Liver Segmentation with Adversarial Loss and Convolutional Neural Network |
Authors | Bora Baydar, Savas Ozkan, Gozde Bozdagi Akar |
Abstract | Automatic segmentation of medical images is among the most in-demand tasks in the medical information field, since it saves experts' time and avoids human error. In this work, a method based on Conditional Adversarial Networks and Fully Convolutional Networks is proposed for the automatic segmentation of liver MRIs. The proposed method, without any post-processing, achieved second place in the SIU Liver Segmentation Challenge 2018, the data for which was provided by Dokuz Eylül University. In this paper, some improvements to the post-processing step are also proposed, and it is shown that with these additions the method outperforms other baseline methods. |
Tasks | Liver Segmentation |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1811.11566v1 |
http://arxiv.org/pdf/1811.11566v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-liver-segmentation-with-adversarial |
Repo | |
Framework | |
PCNNA: A Photonic Convolutional Neural Network Accelerator
Title | PCNNA: A Photonic Convolutional Neural Network Accelerator |
Authors | Armin Mehrabian, Yousra Al-Kabani, Volker J Sorger, Tarek El-Ghazawi |
Abstract | Convolutional Neural Networks (CNNs) have been the centerpiece of many applications including but not limited to computer vision, speech processing, and Natural Language Processing (NLP). However, the computationally expensive convolution operations pose many challenges to the performance and scalability of CNNs. In parallel, photonic systems, which are traditionally employed for data communication, have enjoyed recent popularity for data processing due to their high bandwidth, low power consumption, and reconfigurability. Here we propose a Photonic Convolutional Neural Network Accelerator (PCNNA) as a proof-of-concept design to speed up the convolution operation for CNNs. Our design is based on the recently introduced silicon photonic microring weight banks, which use the broadcast-and-weight protocol to perform Multiply And Accumulate (MAC) operations and move data through the layers of a neural network. We aim to exploit the synergy between the inherent parallelism of photonics, in the form of Wavelength Division Multiplexing (WDM), and the sparsity of connections between input feature maps and kernels in CNNs. While our full system design offers a speedup of more than 3 orders of magnitude in execution time, its optical core potentially offers a speedup of more than 5 orders of magnitude compared to state-of-the-art electronic counterparts. |
Tasks | |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.08792v1 |
http://arxiv.org/pdf/1807.08792v1.pdf | |
PWC | https://paperswithcode.com/paper/pcnna-a-photonic-convolutional-neural-network |
Repo | |
Framework | |
Toward Designing Convergent Deep Operator Splitting Methods for Task-specific Nonconvex Optimization
Title | Toward Designing Convergent Deep Operator Splitting Methods for Task-specific Nonconvex Optimization |
Authors | Risheng Liu, Shichao Cheng, Yi He, Xin Fan, Zhongxuan Luo |
Abstract | Operator splitting methods have been successfully used in computational sciences, statistics, learning and vision areas to reduce complex problems into a series of simpler subproblems. However, prevalent splitting schemes are mostly established based only on the mathematical properties of some general optimization models. Obtaining practical, task-specific optimal solutions is therefore a laborious process that often requires many iterations of ideation and validation, especially for nonconvex problems in real-world scenarios. To break through the above limits, we introduce a new algorithmic framework, called Learnable Bregman Splitting (LBS), to perform deep-architecture-based operator splitting for nonconvex optimization based on a specific task model. Thanks to its data-dependent (i.e., learnable) nature, our LBS can not only speed up the convergence, but also avoid unwanted trivial solutions for real-world tasks. Even with inexact deep iterations, we can still establish the global convergence and estimate the asymptotic convergence rate of LBS by enforcing only some fairly loose assumptions. Extensive experiments on different applications (e.g., image completion and deblurring) verify our theoretical results and show the superiority of LBS against existing methods. |
Tasks | Deblurring |
Published | 2018-04-28 |
URL | http://arxiv.org/abs/1804.10798v1 |
http://arxiv.org/pdf/1804.10798v1.pdf | |
PWC | https://paperswithcode.com/paper/toward-designing-convergent-deep-operator |
Repo | |
Framework | |
A Self-paced Regularization Framework for Partial-Label Learning
Title | A Self-paced Regularization Framework for Partial-Label Learning |
Authors | Gengyu Lyu, Songhe Feng, Congyang Lang |
Abstract | Partial label learning (PLL) aims to solve the problem where each training instance is associated with a set of candidate labels, one of which is the correct label. Most PLL algorithms try to disambiguate the candidate label set, by either simply treating each candidate label equally or iteratively identifying the true label. Nonetheless, existing algorithms usually treat all labels and instances equally, and the complexities of both labels and instances are not taken into consideration during the learning stage. Inspired by the successful application of the self-paced learning strategy in the machine learning field, we integrate the self-paced regime into the partial label learning framework and propose a novel Self-Paced Partial-Label Learning (SP-PLL) algorithm, which controls the learning process to alleviate this problem by ranking the priorities of the training examples together with their candidate labels during each learning iteration. Extensive experiments and comparisons with other baseline methods demonstrate the effectiveness and robustness of the proposed method. |
Tasks | |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07759v2 |
http://arxiv.org/pdf/1804.07759v2.pdf | |
PWC | https://paperswithcode.com/paper/a-self-paced-regularization-framework-for-1 |
Repo | |
Framework | |
Stochastic natural gradient descent draws posterior samples in function space
Title | Stochastic natural gradient descent draws posterior samples in function space |
Authors | Samuel L. Smith, Daniel Duckworth, Semon Rezchikov, Quoc V. Le, Jascha Sohl-Dickstein |
Abstract | Recent work has argued that stochastic gradient descent can approximate the Bayesian uncertainty in model parameters near local minima. In this work we develop a similar correspondence for minibatch natural gradient descent (NGD). We prove that for sufficiently small learning rates, if the model predictions on the training set approach the true conditional distribution of labels given inputs, the stationary distribution of minibatch NGD approaches a Bayesian posterior near local minima. The temperature $T = \epsilon N / (2B)$ is controlled by the learning rate $\epsilon$, training set size $N$ and batch size $B$. However minibatch NGD is not parameterisation invariant and it does not sample a valid posterior away from local minima. We therefore propose a novel optimiser, “stochastic NGD”, which introduces the additional correction terms required to preserve both properties. |
Tasks | |
Published | 2018-06-25 |
URL | http://arxiv.org/abs/1806.09597v4 |
http://arxiv.org/pdf/1806.09597v4.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-natural-gradient-descent-draws |
Repo | |
Framework | |
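As a quick numerical illustration of the temperature formula quoted in the abstract above, $T = \epsilon N / (2B)$, the sketch below evaluates it for a few settings. The function name and the specific learning rate, training set size, and batch size are illustrative assumptions, not values from the paper.

```python
def sgd_temperature(learning_rate, train_size, batch_size):
    """Temperature T = epsilon * N / (2 * B) as stated in the abstract."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return learning_rate * train_size / (2 * batch_size)


if __name__ == "__main__":
    # Assumed example values: epsilon = 0.01, N = 50000, B = 128.
    base = sgd_temperature(0.01, 50000, 128)
    # Doubling both the learning rate and the batch size leaves T unchanged.
    scaled = sgd_temperature(0.02, 50000, 256)
    print(base, scaled)
```

Under this formula the temperature depends on the learning rate and batch size only through the ratio $\epsilon / B$, which is why the two calls above agree.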
Learning a Discriminative Prior for Blind Image Deblurring
Title | Learning a Discriminative Prior for Blind Image Deblurring |
Authors | Lerenhan Li, Jinshan Pan, Wei-Sheng Lai, Changxin Gao, Nong Sang, Ming-Hsuan Yang |
Abstract | We present an effective blind image deblurring method based on a data-driven discriminative prior. Our work is motivated by the fact that a good image prior should favor clear images over blurred images. In this work, we formulate the image prior as a binary classifier which can be achieved by a deep convolutional neural network (CNN). The learned prior is able to distinguish whether an input image is clear or not. Embedded into the maximum a posteriori (MAP) framework, it helps blind deblurring in various scenarios, including natural, face, text, and low-illumination images. However, it is difficult to optimize the deblurring method with the learned image prior as it involves a non-linear CNN. Therefore, we develop an efficient numerical approach based on the half-quadratic splitting method and gradient descent algorithm to solve the proposed model. Furthermore, the proposed model can be easily extended to non-uniform deblurring. Both qualitative and quantitative experimental results show that our method performs favorably against state-of-the-art algorithms as well as domain-specific image deblurring approaches. |
Tasks | Blind Image Deblurring, Deblurring |
Published | 2018-03-09 |
URL | http://arxiv.org/abs/1803.03363v2 |
http://arxiv.org/pdf/1803.03363v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-a-discriminative-prior-for-blind |
Repo | |
Framework | |
Adversarial Network Compression
Title | Adversarial Network Compression |
Authors | Vasileios Belagiannis, Azade Farshad, Fabio Galasso |
Abstract | Neural network compression has recently received much attention due to the computational requirements of modern deep models. In this work, our objective is to transfer knowledge from a deep and accurate model to a smaller one. Our contributions are threefold: (i) we propose an adversarial network compression approach to train the small student network to mimic the large teacher, without the need for labels during training; (ii) we introduce a regularization scheme to prevent a trivially-strong discriminator without reducing the network capacity; and (iii) our approach generalizes to different teacher-student models. In an extensive evaluation on five standard datasets, we show that our student has a small accuracy drop, achieves better performance than other knowledge transfer approaches, and surpasses the performance of the same network trained with labels. In addition, we demonstrate state-of-the-art results compared to other compression strategies. |
Tasks | Neural Network Compression, Transfer Learning |
Published | 2018-03-28 |
URL | http://arxiv.org/abs/1803.10750v2 |
http://arxiv.org/pdf/1803.10750v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-network-compression |
Repo | |
Framework | |
Thompson Sampling for Noncompliant Bandits
Title | Thompson Sampling for Noncompliant Bandits |
Authors | Andrew Stirn, Tony Jebara |
Abstract | Thompson sampling, a Bayesian method for balancing exploration and exploitation in bandit problems, has theoretical guarantees and exhibits strong empirical performance in many domains. Traditional Thompson sampling, however, assumes perfect compliance, where an agent’s chosen action is treated as the implemented action. This article introduces a stochastic noncompliance model that relaxes this assumption. We prove that any noncompliance in a 2-armed Bernoulli bandit increases existing regret bounds. With our noncompliance model, we derive Thompson sampling variants that explicitly handle both observed and latent noncompliance. With extensive empirical analysis, we demonstrate that our algorithms either match or outperform traditional Thompson sampling in both compliant and noncompliant environments. |
Tasks | |
Published | 2018-12-03 |
URL | http://arxiv.org/abs/1812.00856v1 |
http://arxiv.org/pdf/1812.00856v1.pdf | |
PWC | https://paperswithcode.com/paper/thompson-sampling-for-noncompliant-bandits |
Repo | |
Framework | |
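To make the compliance assumption described in the abstract above concrete, here is a minimal simulation sketch of Beta-Bernoulli Thompson sampling in which the implemented arm may differ from the chosen arm. This is not the paper's algorithm: the noncompliance model (with probability `1 - comply_prob` a uniformly random arm is implemented), the assumption that the implemented arm is observed, and all names and parameters are hypothetical choices made for illustration.

```python
import random


def thompson_noncompliant(true_means, comply_prob, steps, seed=0):
    """Beta-Bernoulli Thompson sampling with observed noncompliance.

    Assumed model: the agent's chosen arm is implemented with probability
    comply_prob; otherwise a uniformly random arm is implemented instead.
    The posterior of the arm that was actually implemented is updated.
    """
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1.0] * k  # Beta(1, 1) uniform prior for every arm
    beta = [1.0] * k
    total_reward = 0
    for _ in range(steps):
        # Sample a plausible mean for each arm and pick the best sample.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        chosen = max(range(k), key=samples.__getitem__)
        # Noncompliance: the implemented arm may differ from the chosen one.
        actual = chosen if rng.random() < comply_prob else rng.randrange(k)
        reward = 1 if rng.random() < true_means[actual] else 0
        # Credit the reward to the arm that was actually pulled.
        alpha[actual] += reward
        beta[actual] += 1 - reward
        total_reward += reward
    return alpha, beta, total_reward


if __name__ == "__main__":
    a, b, total = thompson_noncompliant([0.2, 0.8], comply_prob=0.9, steps=500)
    print(a, b, total)
```

Updating the implemented arm rather than the chosen one is only sensible when noncompliance is observed; the latent case mentioned in the abstract would need a different treatment that this sketch does not attempt.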
Representing Social Media Users for Sarcasm Detection
Title | Representing Social Media Users for Sarcasm Detection |
Authors | Y. Alex Kolchinski, Christopher Potts |
Abstract | We explore two methods for representing authors in the context of textual sarcasm detection: a Bayesian approach that directly represents authors’ propensities to be sarcastic, and a dense embedding approach that can learn interactions between the author and the text. Using the SARC dataset of Reddit comments, we show that augmenting a bidirectional RNN with these representations improves performance; the Bayesian approach suffices in homogeneous contexts, whereas the added power of the dense embeddings proves valuable in more diverse ones. |
Tasks | Sarcasm Detection |
Published | 2018-08-25 |
URL | http://arxiv.org/abs/1808.08470v1 |
http://arxiv.org/pdf/1808.08470v1.pdf | |
PWC | https://paperswithcode.com/paper/representing-social-media-users-for-sarcasm |
Repo | |
Framework | |
Deep Reinforcement Learning for Join Order Enumeration
Title | Deep Reinforcement Learning for Join Order Enumeration |
Authors | Ryan Marcus, Olga Papaemmanouil |
Abstract | Join order selection plays a significant role in query performance. However, modern query optimizers typically employ static join enumeration algorithms that do not receive any feedback about the quality of the resulting plan. Hence, optimizers often repeatedly choose the same bad plan, as they do not have a mechanism for “learning from their mistakes”. In this paper, we argue that existing deep reinforcement learning techniques can be applied to address this challenge. These techniques, powered by artificial neural networks, can automatically improve decision making by incorporating feedback from their successes and failures. Towards this goal, we present ReJOIN, a proof-of-concept join enumerator, and present preliminary results indicating that ReJOIN can match or outperform the PostgreSQL optimizer in terms of plan quality and join enumeration efficiency. |
Tasks | Decision Making |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1803.00055v2 |
http://arxiv.org/pdf/1803.00055v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-for-join-order |
Repo | |
Framework | |
VPE: Variational Policy Embedding for Transfer Reinforcement Learning
Title | VPE: Variational Policy Embedding for Transfer Reinforcement Learning |
Authors | Isac Arnekvist, Danica Kragic, Johannes A. Stork |
Abstract | Reinforcement Learning methods are capable of solving complex problems, but resulting policies might perform poorly in environments that are even slightly different. In robotics especially, training and deployment conditions often vary and data collection is expensive, making retraining undesirable. Simulation training allows for feasible training times, but on the other hand suffers from a reality gap when applied in real-world settings. This raises the need for efficient adaptation of policies acting in new environments. We consider this as a problem of transferring knowledge within a family of similar Markov decision processes. For this purpose we assume that Q-functions are generated by some low-dimensional latent variable. Given such a Q-function, we can find a master policy that can adapt given different values of this latent variable. Our method learns both the generative mapping and an approximate posterior of the latent variables, enabling identification of policies for new tasks by searching only in the latent space, rather than the space of all policies. The low-dimensional space and master policy found by our method enable policies to quickly adapt to new environments. We demonstrate the method on both a pendulum swing-up task in simulation and simulation-to-real transfer on a pushing task. |
Tasks | Transfer Reinforcement Learning |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03548v2 |
http://arxiv.org/pdf/1809.03548v2.pdf | |
PWC | https://paperswithcode.com/paper/vpe-variational-policy-embedding-for-transfer |
Repo | |
Framework | |
Nonisometric Surface Registration via Conformal Laplace-Beltrami Basis Pursuit
Title | Nonisometric Surface Registration via Conformal Laplace-Beltrami Basis Pursuit |
Authors | Stefan C. Schonsheck, Michael M. Bronstein, Rongjie Lai |
Abstract | Surface registration is one of the most fundamental problems in geometry processing. Many approaches have been developed to tackle this problem in cases where the surfaces are nearly isometric. However, it is much more challenging to compute correspondence between surfaces which are intrinsically less similar. In this paper, we propose a variational model to align the Laplace-Beltrami (LB) eigensystems of two non-isometric genus zero shapes via conformal deformations. This method enables us to compute geometrically meaningful point-to-point maps between non-isometric shapes. Our model is based on a novel basis pursuit scheme whereby we simultaneously compute a conformal deformation of a ‘target shape’ and its deformed LB eigensystem. We solve the model using a proximal alternating minimization algorithm hybridized with the augmented Lagrangian method, which produces accurate correspondences given only a few landmark points. We also propose a reinitialization scheme to overcome some of the difficulties caused by the non-convexity of the variational problem. Intensive numerical experiments illustrate the effectiveness and robustness of the proposed method in handling non-isometric surfaces with large deformation, with respect to both noise on the underlying manifolds and errors within the given landmarks. |
Tasks | |
Published | 2018-09-19 |
URL | http://arxiv.org/abs/1809.07399v1 |
http://arxiv.org/pdf/1809.07399v1.pdf | |
PWC | https://paperswithcode.com/paper/nonisometric-surface-registration-via |
Repo | |
Framework | |