Paper Group ANR 524
GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution. The ACRV Picking Benchmark (APB): A Robotic Shelf Picking Benchmark to Foster Reproducible Research. A Local-Global Approach to Semantic Segmentation in Aerial Images. Learning an Invariant Hilbert Space for Domain Adaptation. Learning Social Circles in Ego Networks based …
GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution
Title | GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution |
Authors | Matt J. Kusner, José Miguel Hernández-Lobato |
Abstract | Generative Adversarial Networks (GANs) have limitations when the goal is to generate sequences of discrete elements. The reason for this is that samples from a distribution on discrete objects such as the multinomial are not differentiable with respect to the distribution parameters. This problem can be avoided by using the Gumbel-softmax distribution, which is a continuous approximation to a multinomial distribution parameterized in terms of the softmax function. In this work, we evaluate the performance of GANs based on recurrent neural networks with Gumbel-softmax output distributions in the task of generating sequences of discrete elements. |
Tasks | |
Published | 2016-11-12 |
URL | http://arxiv.org/abs/1611.04051v1 |
http://arxiv.org/pdf/1611.04051v1.pdf | |
PWC | https://paperswithcode.com/paper/gans-for-sequences-of-discrete-elements-with |
Repo | |
Framework | |
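The core trick here is replacing non-differentiable multinomial samples with Gumbel-softmax samples. Below is a minimal NumPy sketch of the relaxation; the temperature and vocabulary size are illustrative choices, not values from the paper.

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature=0.5, rng=np.random.default_rng()):
    """Draw one relaxed one-hot sample from a categorical distribution.

    Adding Gumbel noise to the logits and applying a softmax yields a
    continuous approximation of a one-hot multinomial sample that is
    differentiable with respect to the logits.
    """
    gumbel_noise = -np.log(-np.log(rng.uniform(size=logits.shape) + 1e-20) + 1e-20)
    y = (logits + gumbel_noise) / temperature
    y = y - y.max()                      # numerical stability before exponentiation
    expy = np.exp(y)
    return expy / expy.sum()

# Example: relaxed sample over a 5-symbol vocabulary.
logits = np.log(np.array([0.1, 0.2, 0.3, 0.25, 0.15]))
print(gumbel_softmax_sample(logits, temperature=0.5))
```

Lowering the temperature makes the sample closer to a hard one-hot vector at the cost of higher-variance gradients.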
The ACRV Picking Benchmark (APB): A Robotic Shelf Picking Benchmark to Foster Reproducible Research
Title | The ACRV Picking Benchmark (APB): A Robotic Shelf Picking Benchmark to Foster Reproducible Research |
Authors | Jürgen Leitner, Adam W. Tow, Jake E. Dean, Niko Suenderhauf, Joseph W. Durham, Matthew Cooper, Markus Eich, Christopher Lehnert, Ruben Mangels, Christopher McCool, Peter Kujala, Lachlan Nicholson, Trung Pham, James Sergeant, Liao Wu, Fangyi Zhang, Ben Upcroft, Peter Corke |
Abstract | Robotic challenges like the Amazon Picking Challenge (APC) or the DARPA Challenges are an established and important way to drive scientific progress. They make research comparable on a well-defined benchmark with equal test conditions for all participants. However, such challenge events occur only occasionally, are limited to a small number of contestants, and the test conditions are very difficult to replicate after the main event. We present a new physical benchmark challenge for robotic picking: the ACRV Picking Benchmark (APB). Designed to be reproducible, it consists of a set of 42 common objects, a widely available shelf, and exact guidelines for object arrangement using stencils. A well-defined evaluation protocol enables the comparison of *complete* robotic systems – including perception and manipulation – instead of sub-systems only. Our paper also describes and reports results achieved by an open baseline system based on a Baxter robot. |
Tasks | |
Published | 2016-09-17 |
URL | http://arxiv.org/abs/1609.05258v2 |
http://arxiv.org/pdf/1609.05258v2.pdf | |
PWC | https://paperswithcode.com/paper/the-acrv-picking-benchmark-apb-a-robotic |
Repo | |
Framework | |
A Local-Global Approach to Semantic Segmentation in Aerial Images
Title | A Local-Global Approach to Semantic Segmentation in Aerial Images |
Authors | Alina Elena Marcu |
Abstract | Aerial images are often taken under poor lighting conditions and contain low-resolution objects, many times occluded by other objects. In this domain, visual context could be of great help, but very few papers consider context in aerial image understanding, and it remains an open problem in computer vision. We propose a dual-stream deep neural network that processes information along two independent pathways. Our model learns to combine local and global appearance in a complementary way, such that together they form a powerful classifier. We test our dual-stream network on the task of building segmentation in aerial images and obtain state-of-the-art results on the Massachusetts Buildings Dataset. We study the relative importance of local appearance versus the larger scene, as well as their performance in combination, on three new buildings datasets. We clearly demonstrate the effectiveness of visual context in conjunction with deep neural networks for aerial image understanding. |
Tasks | Semantic Segmentation |
Published | 2016-07-19 |
URL | http://arxiv.org/abs/1607.05620v1 |
http://arxiv.org/pdf/1607.05620v1.pdf | |
PWC | https://paperswithcode.com/paper/a-local-global-approach-to-semantic |
Repo | |
Framework | |
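The dual-stream idea — one pathway over a high-resolution local patch, another over a larger down-sampled context window, fused for per-patch classification — can be sketched roughly as below. This is an illustrative PyTorch layout under assumed patch sizes and layer widths, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class DualStreamNet(nn.Module):
    """Illustrative local-global network for patch-wise building segmentation."""
    def __init__(self):
        super().__init__()
        # Local stream: high-resolution patch centred on the location of interest.
        self.local = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Global stream: larger, down-sampled context around the same location.
        self.glob = nn.Sequential(
            nn.Conv2d(3, 32, 5, padding=2), nn.ReLU(),
            nn.MaxPool2d(4),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Fusion head combines local and global appearance into one building score.
        self.head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, local_patch, context_patch):
        feats = torch.cat([self.local(local_patch), self.glob(context_patch)], dim=1)
        return self.head(feats)

net = DualStreamNet()
score = net(torch.randn(2, 3, 64, 64), torch.randn(2, 3, 256, 256))  # (2, 1) logits
```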
Learning an Invariant Hilbert Space for Domain Adaptation
Title | Learning an Invariant Hilbert Space for Domain Adaptation |
Authors | Samitha Herath, Mehrtash Harandi, Fatih Porikli |
Abstract | This paper introduces a learning scheme to construct a Hilbert space (i.e., a vector space along with its inner product) to address both unsupervised and semi-supervised domain adaptation problems. This is achieved by learning projections from each domain to a latent space, along with the Mahalanobis metric of the latent space, so as to simultaneously minimize a notion of domain variance while maximizing a measure of discriminatory power. In particular, we make use of Riemannian optimization techniques to match statistical properties (e.g., first and second order statistics) between samples projected into the latent space from different domains. Upon availability of class labels, we further deem samples sharing the same label to form more compact clusters while pulling away samples coming from different classes. We extensively evaluate and contrast our proposal against state-of-the-art methods for the task of visual domain adaptation using both handcrafted and deep-net features. Our experiments show that even with a simple nearest neighbor classifier, the proposed method can outperform several state-of-the-art methods benefiting from more involved classification schemes. |
Tasks | Domain Adaptation |
Published | 2016-11-25 |
URL | http://arxiv.org/abs/1611.08350v2 |
http://arxiv.org/pdf/1611.08350v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-an-invariant-hilbert-space-for |
Repo | |
Framework | |
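The core objective — project both domains into a shared latent space and match their first- and second-order statistics — can be illustrated with a simple NumPy loss. The projection matrices and weighting below are placeholders; the paper optimizes them with Riemannian techniques and adds a discriminative term, both of which this sketch omits.

```python
import numpy as np

def domain_variance_loss(Xs, Xt, Ws, Wt):
    """Mean and covariance discrepancy between projected source/target samples."""
    Zs, Zt = Xs @ Ws, Xt @ Wt              # project each domain into the latent space
    mean_gap = np.linalg.norm(Zs.mean(0) - Zt.mean(0)) ** 2
    cov_gap = np.linalg.norm(np.cov(Zs, rowvar=False) - np.cov(Zt, rowvar=False), 'fro') ** 2
    return mean_gap + cov_gap

rng = np.random.default_rng(0)
Xs = rng.normal(size=(100, 50))                  # source-domain features
Xt = rng.normal(loc=0.5, size=(80, 50))          # shifted target-domain features
Ws, Wt = rng.normal(size=(50, 10)), rng.normal(size=(50, 10))
print(domain_variance_loss(Xs, Xt, Ws, Wt))      # value a learner would drive down
```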
Learning Social Circles in Ego Networks based on Multi-View Social Graphs
Title | Learning Social Circles in Ego Networks based on Multi-View Social Graphs |
Authors | Chao Lan, Yuhao Yang, Xiaoli Li, Bo Luo, Jun Huan |
Abstract | In social network analysis, automatic social circle detection in ego-networks is becoming a fundamental and important task, with many potential applications such as user privacy protection or interest group recommendation. So far, most studies have focused on addressing two questions, namely, how to detect overlapping circles and how to detect circles using a combination of network structure and network node attributes. This paper asks an orthogonal research question: how to detect circles based on network structures that are (usually) described by multiple views? Our investigation begins with crawling ego-networks from Twitter and employing classic techniques to model their structures by six views, including user relationships, user interactions and user content. We then apply both standard and our modified multi-view spectral clustering techniques to detect social circles in these ego-networks. Based on extensive automatic and manual experimental evaluations, we deliver two major findings: first, multi-view clustering techniques perform better than common single-view clustering techniques, which only use one view or naively integrate all views for detection; second, the standard multi-view clustering technique is less robust than our modified technique, which selectively transfers information across views based on an assumption that sparse network structures are (potentially) incomplete. In particular, the second finding makes us believe that a direct application of standard clustering on potentially incomplete networks may yield biased results. We lightly examine this issue in theory, where we derive an upper bound for such bias by integrating theories of spectral clustering and matrix perturbation, and discuss how it may be affected by several network characteristics. |
Tasks | |
Published | 2016-07-16 |
URL | http://arxiv.org/abs/1607.04747v2 |
http://arxiv.org/pdf/1607.04747v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-social-circles-in-ego-networks-based |
Repo | |
Framework | |
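A minimal baseline for the single-view versus multi-view comparison above is to run spectral clustering on one affinity view versus a naive average of all views. The sketch below uses scikit-learn and random stand-in adjacency matrices; the authors' modified technique additionally transfers information across views to handle incomplete structures, which is not shown here.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(1)
n_nodes, n_views = 60, 6
# Six symmetric affinity views of the same ego network (random stand-ins here).
views = []
for _ in range(n_views):
    A = rng.random((n_nodes, n_nodes))
    views.append((A + A.T) / 2)

single_view = views[0]
multi_view = np.mean(views, axis=0)          # naive integration of all views

sc = SpectralClustering(n_clusters=4, affinity='precomputed', random_state=0)
circles_single = sc.fit_predict(single_view)  # circles from one view only
circles_multi = sc.fit_predict(multi_view)    # circles from the integrated views
```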
Architectural Complexity Measures of Recurrent Neural Networks
Title | Architectural Complexity Measures of Recurrent Neural Networks |
Authors | Saizheng Zhang, Yuhuai Wu, Tong Che, Zhouhan Lin, Roland Memisevic, Ruslan Salakhutdinov, Yoshua Bengio |
Abstract | In this paper, we systematically analyze the connecting architectures of recurrent neural networks (RNNs). Our main contribution is twofold: first, we present a rigorous graph-theoretic framework describing the connecting architectures of RNNs in general. Second, we propose three architecture complexity measures of RNNs: (a) the recurrent depth, which captures the RNN’s over-time nonlinear complexity, (b) the feedforward depth, which captures the local input-output nonlinearity (similar to the “depth” in feedforward neural networks (FNNs)), and (c) the recurrent skip coefficient, which captures how rapidly the information propagates over time. We rigorously prove each measure’s existence and computability. Our experimental results show that RNNs might benefit from larger recurrent depth and feedforward depth. We further demonstrate that increasing the recurrent skip coefficient offers performance boosts on long term dependency problems. |
Tasks | |
Published | 2016-02-26 |
URL | http://arxiv.org/abs/1602.08210v3 |
http://arxiv.org/pdf/1602.08210v3.pdf | |
PWC | https://paperswithcode.com/paper/architectural-complexity-measures-of |
Repo | |
Framework | |
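Intuitively, the recurrent skip coefficient measures how many time steps information can jump per recurrent transition. Below is an illustrative PyTorch cell whose hidden state also receives a connection from s steps back, which is one way to raise that coefficient; the paper's exact graph-theoretic definitions are not reproduced here, and the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

class SkipRNN(nn.Module):
    """Plain tanh RNN whose hidden state also receives input from `skip` steps back."""
    def __init__(self, input_size, hidden_size, skip=3):
        super().__init__()
        self.skip = skip
        self.w_in = nn.Linear(input_size, hidden_size)
        self.w_hh = nn.Linear(hidden_size, hidden_size, bias=False)   # h_{t-1} -> h_t
        self.w_skip = nn.Linear(hidden_size, hidden_size, bias=False)  # h_{t-skip} -> h_t

    def forward(self, x):                      # x: (seq_len, batch, input_size)
        seq_len, batch, _ = x.shape
        h = [torch.zeros(batch, self.w_hh.in_features)]
        for t in range(seq_len):
            # Hidden state `skip` steps back, or the initial state early in the sequence.
            h_skip = h[t - self.skip + 1] if t + 1 > self.skip else h[0]
            h.append(torch.tanh(self.w_in(x[t]) + self.w_hh(h[-1]) + self.w_skip(h_skip)))
        return torch.stack(h[1:])              # hidden states for every time step

out = SkipRNN(8, 16)(torch.randn(20, 4, 8))   # (20, 4, 16)
```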
Voronoi-based compact image descriptors: Efficient Region-of-Interest retrieval with VLAD and deep-learning-based descriptors
Title | Voronoi-based compact image descriptors: Efficient Region-of-Interest retrieval with VLAD and deep-learning-based descriptors |
Authors | Aaron Chadha, Yiannis Andreopoulos |
Abstract | We investigate the problem of image retrieval based on visual queries when the latter comprise arbitrary regions-of-interest (ROI) rather than entire images. Our proposal is a compact image descriptor that combines the state-of-the-art in content-based descriptor extraction with a multi-level, Voronoi-based spatial partitioning of each dataset image. The proposed multi-level Voronoi-based encoding uses a spatial hierarchical K-means over interest-point locations, and computes a content-based descriptor over each cell. In order to reduce the matching complexity with minimal or no sacrifice in retrieval performance: (i) we utilize the tree structure of the spatial hierarchical K-means to perform a top-to-bottom pruning for local similarity maxima; (ii) we propose a new image similarity score that combines relevant information from all partition levels into a single measure for similarity; (iii) we combine our proposal with a novel and efficient approach for optimal bit allocation within quantized descriptor representations. By deriving both a Voronoi-based VLAD descriptor (termed as Fast-VVLAD) and a Voronoi-based deep convolutional neural network (CNN) descriptor (termed as Fast-VDCNN), we demonstrate that our Voronoi-based framework is agnostic to the descriptor basis, and can easily be slotted into existing frameworks. Via a range of ROI queries in two standard datasets, it is shown that the Voronoi-based descriptors achieve comparable or higher mean Average Precision against conventional grid-based spatial search, while offering more than two-fold reduction in complexity. Finally, beyond ROI queries, we show that Voronoi partitioning improves the geometric invariance of compact CNN descriptors, thereby resulting in competitive performance to the current state-of-the-art on whole image retrieval. |
Tasks | Image Retrieval |
Published | 2016-11-27 |
URL | http://arxiv.org/abs/1611.08906v2 |
http://arxiv.org/pdf/1611.08906v2.pdf | |
PWC | https://paperswithcode.com/paper/voronoi-based-compact-image-descriptors |
Repo | |
Framework | |
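The encoding pipeline — spatially partition interest points with K-means, then compute a VLAD descriptor per Voronoi cell — can be sketched as follows. This uses scikit-learn K-means for both the spatial partition and the visual codebook with arbitrary sizes; the hierarchical tree, pruning, and bit allocation of the paper are omitted.

```python
import numpy as np
from sklearn.cluster import KMeans

def vlad(descriptors, codebook):
    """Standard VLAD: sum of residuals to the nearest visual word, L2-normalised."""
    if len(descriptors) == 0:
        return np.zeros(codebook.n_clusters * codebook.cluster_centers_.shape[1])
    assignments = codebook.predict(descriptors)
    v = np.zeros((codebook.n_clusters, descriptors.shape[1]))
    for k in range(codebook.n_clusters):
        members = descriptors[assignments == k]
        if len(members):
            v[k] = (members - codebook.cluster_centers_[k]).sum(axis=0)
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)

rng = np.random.default_rng(0)
locations = rng.random((500, 2))          # interest-point (x, y) positions in one image
descriptors = rng.random((500, 64))       # local descriptors at those points

# One level of the spatial partition: Voronoi cells over interest-point locations.
spatial = KMeans(n_clusters=4, n_init=10, random_state=0).fit(locations)
codebook = KMeans(n_clusters=16, n_init=10, random_state=0).fit(descriptors)

cell_descriptors = [vlad(descriptors[spatial.labels_ == c], codebook)
                    for c in range(spatial.n_clusters)]   # one VLAD vector per cell
```

At query time, an ROI only needs to be matched against the cells it overlaps, which is where the complexity reduction comes from.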
Neurogenesis Deep Learning
Title | Neurogenesis Deep Learning |
Authors | Timothy J. Draelos, Nadine E. Miner, Christopher C. Lamb, Jonathan A. Cox, Craig M. Vineyard, Kristofor D. Carlson, William M. Severa, Conrad D. James, James B. Aimone |
Abstract | Neural machine learning methods, such as deep neural networks (DNN), have achieved remarkable success in a number of complex data processing tasks. These methods have arguably had their strongest impact on tasks such as image and audio processing - data processing domains in which humans have long held clear advantages over conventional algorithms. In contrast to biological neural systems, which are capable of learning continuously, deep artificial networks have a limited ability for incorporating new information in an already trained network. As a result, methods for continuous learning are potentially highly impactful in enabling the application of deep networks to dynamic data sets. Here, inspired by the process of adult neurogenesis in the hippocampus, we explore the potential for adding new neurons to deep layers of artificial neural networks in order to facilitate their acquisition of novel information while preserving previously trained data representations. Our results on the MNIST handwritten digit dataset and the NIST SD 19 dataset, which includes lower and upper case letters and digits, demonstrate that neurogenesis is well suited for addressing the stability-plasticity dilemma that has long challenged adaptive machine learning algorithms. |
Tasks | |
Published | 2016-12-12 |
URL | http://arxiv.org/abs/1612.03770v2 |
http://arxiv.org/pdf/1612.03770v2.pdf | |
PWC | https://paperswithcode.com/paper/neurogenesis-deep-learning |
Repo | |
Framework | |
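The mechanical core of neurogenesis — growing a hidden layer of an already-trained network while keeping its existing weights intact — looks roughly like the PyTorch sketch below. When to add neurons and how to replay old data to preserve prior representations are the paper's contributions and are not modeled here; the zero-initialisation of outgoing weights is an assumption chosen so the network's outputs are initially unchanged.

```python
import torch
import torch.nn as nn

def add_neurons(layer_in: nn.Linear, layer_out: nn.Linear, n_new: int):
    """Widen the hidden layer between two Linear layers, preserving trained weights.

    New rows of `layer_in` (incoming weights of the new neurons) keep their random
    initialisation; new columns of `layer_out` start at zero so existing outputs
    are unchanged until the new neurons are trained.
    """
    new_in = nn.Linear(layer_in.in_features, layer_in.out_features + n_new)
    new_out = nn.Linear(layer_out.in_features + n_new, layer_out.out_features)
    with torch.no_grad():
        new_in.weight[: layer_in.out_features] = layer_in.weight
        new_in.bias[: layer_in.out_features] = layer_in.bias
        new_out.weight[:, : layer_out.in_features] = layer_out.weight
        new_out.weight[:, layer_out.in_features:] = 0.0
        new_out.bias.copy_(layer_out.bias)
    return new_in, new_out

fc1, fc2 = nn.Linear(784, 100), nn.Linear(100, 10)
fc1, fc2 = add_neurons(fc1, fc2, n_new=20)   # hidden layer grows from 100 to 120 units
```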
Semantic Video Trailers
Title | Semantic Video Trailers |
Authors | Harrie Oosterhuis, Sujith Ravi, Michael Bendersky |
Abstract | Query-based video summarization is the task of creating a brief visual trailer, which captures the parts of the video (or a collection of videos) that are most relevant to the user-issued query. In this paper, we propose an unsupervised label propagation approach for this task. Our approach effectively captures the multimodal semantics of queries and videos using state-of-the-art deep neural networks and creates a summary that is both semantically coherent and visually attractive. We describe the theoretical framework of our graph-based approach and empirically evaluate its effectiveness in creating relevant and attractive trailers. Finally, we showcase example video trailers generated by our system. |
Tasks | Video Summarization |
Published | 2016-09-07 |
URL | http://arxiv.org/abs/1609.01819v1 |
http://arxiv.org/pdf/1609.01819v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-video-trailers |
Repo | |
Framework | |
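Graph-based propagation of query relevance over video shots can be sketched as below: shots similar to the query seed the scores, which then diffuse over a shot-similarity graph. The features, affinity, and propagation weight are illustrative placeholders; the paper's multimodal deep representations and exact propagation scheme are not reproduced.

```python
import numpy as np

def propagate_relevance(shot_feats, query_feat, alpha=0.85, n_iter=50):
    """Diffuse query-relevance scores over a shot-similarity graph."""
    sim = shot_feats @ shot_feats.T                       # shot-to-shot affinity
    np.fill_diagonal(sim, 0.0)
    W = sim / (sim.sum(axis=1, keepdims=True) + 1e-12)    # row-normalised transition matrix
    seed = shot_feats @ query_feat                         # initial query relevance per shot
    scores = seed.copy()
    for _ in range(n_iter):
        scores = alpha * W @ scores + (1 - alpha) * seed   # propagate, keep pull to the seed
    return scores

rng = np.random.default_rng(0)
shots = rng.random((30, 128))                 # one feature vector per video shot
query = rng.random(128)                       # query embedding in the same space
top_shots = np.argsort(propagate_relevance(shots, query))[::-1][:5]   # trailer candidates
```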
Query-Focused Extractive Video Summarization
Title | Query-Focused Extractive Video Summarization |
Authors | Aidean Sharghi, Boqing Gong, Mubarak Shah |
Abstract | Video data is growing explosively. As a result of this “big video data”, intelligent algorithms for automatic video summarization have re-emerged as a pressing need. We develop a probabilistic model, Sequential and Hierarchical Determinantal Point Process (SH-DPP), for query-focused extractive video summarization. Given a user query and a long video sequence, our algorithm returns a summary by selecting key shots from the video. The decision to include a shot in the summary depends jointly on the shot’s relevance to the user query and its importance in the context of the video. We verify our approach on two densely annotated video datasets. Query-focused video summarization is particularly useful for search engines, e.g., to display snippets of videos. |
Tasks | Video Summarization |
Published | 2016-07-18 |
URL | http://arxiv.org/abs/1607.05177v1 |
http://arxiv.org/pdf/1607.05177v1.pdf | |
PWC | https://paperswithcode.com/paper/query-focused-extractive-video-summarization |
Repo | |
Framework | |
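Determinantal point processes select subsets that trade per-item quality against diversity. A simplified greedy MAP sketch over a single DPP kernel is shown below; the sequential and hierarchical structure of SH-DPP (conditioning each segment's selection on earlier ones, with a separate query-relevance layer) is omitted, and the quality-weighted kernel is an assumed construction.

```python
import numpy as np

def greedy_dpp_map(L, k):
    """Greedy MAP for a DPP: repeatedly add the item that most increases log det(L_S)."""
    selected = []
    for _ in range(k):
        gains = []
        for i in range(len(L)):
            if i in selected:
                gains.append(-np.inf)
                continue
            idx = np.ix_(selected + [i], selected + [i])
            sign, logdet = np.linalg.slogdet(L[idx])
            gains.append(logdet if sign > 0 else -np.inf)
        selected.append(int(np.argmax(gains)))
    return selected

rng = np.random.default_rng(0)
feats = rng.random((30, 64))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)        # shot similarity features
quality = rng.random(30) + 0.1                               # e.g. per-shot query relevance
L = quality[:, None] * (feats @ feats.T) * quality[None, :]  # quality-weighted DPP kernel
print(greedy_dpp_map(L, k=5))                                # indices of the summary shots
```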
A recurrent neural network without chaos
Title | A recurrent neural network without chaos |
Authors | Thomas Laurent, James von Brecht |
Abstract | We introduce an exceptionally simple gated recurrent neural network (RNN) that achieves performance comparable to well-known gated architectures, such as LSTMs and GRUs, on the word-level language modeling task. We prove that our model has simple, predictable and non-chaotic dynamics. This stands in stark contrast to more standard gated architectures, whose underlying dynamical systems exhibit chaotic behavior. |
Tasks | Language Modelling |
Published | 2016-12-19 |
URL | http://arxiv.org/abs/1612.06212v1 |
http://arxiv.org/pdf/1612.06212v1.pdf | |
PWC | https://paperswithcode.com/paper/a-recurrent-neural-network-without-chaos |
Repo | |
Framework | |
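For contrast with LSTMs and GRUs, one way to picture the kind of minimal gated cell the abstract describes is sketched below: element-wise gates act on a decayed previous state and on the transformed input, with no hidden-to-hidden matrix inside the nonlinearity. This is a guess at the flavour of the model, not the paper's exact update equations.

```python
import torch
import torch.nn as nn

class SimpleGatedCell(nn.Module):
    """Illustrative minimal gated RNN cell (assumed form, not the authors' equations)."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.w_x = nn.Linear(input_size, hidden_size)
        self.forget_gate = nn.Linear(input_size + hidden_size, hidden_size)
        self.input_gate = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x_t, h_prev):
        z = torch.cat([x_t, h_prev], dim=-1)
        f = torch.sigmoid(self.forget_gate(z))      # how much of the old state survives
        i = torch.sigmoid(self.input_gate(z))       # how much of the new input enters
        return f * torch.tanh(h_prev) + i * torch.tanh(self.w_x(x_t))

cell = SimpleGatedCell(10, 32)
h = torch.zeros(4, 32)
for x_t in torch.randn(15, 4, 10):     # unroll over a length-15 sequence, batch of 4
    h = cell(x_t, h)
```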
Neural Networks and Chaos: Construction, Evaluation of Chaotic Networks, and Prediction of Chaos with Multilayer Feedforward Networks
Title | Neural Networks and Chaos: Construction, Evaluation of Chaotic Networks, and Prediction of Chaos with Multilayer Feedforward Networks |
Authors | Jacques M. Bahi, Jean-François Couchot, Christophe Guyeux, Michel Salomon |
Abstract | Many research works deal with chaotic neural networks for various fields of application. Unfortunately, up to now these networks are usually claimed to be chaotic without any mathematical proof. The purpose of this paper is to establish, based on a rigorous theoretical framework, an equivalence between chaotic iterations according to Devaney and a particular class of neural networks. On the one hand we show how to build such a network; on the other hand we provide a method to check whether a given neural network is chaotic. Finally, the ability of classical feedforward multilayer perceptrons to learn sets of data obtained from a dynamical system is investigated. Various Boolean functions are iterated on finite states, and iterations of some of them are proven to be chaotic in the sense of Devaney. In that context, important differences occur in the training process, establishing across various neural networks that chaotic behaviors are far more difficult to learn. |
Tasks | |
Published | 2016-08-21 |
URL | http://arxiv.org/abs/1608.05916v1 |
http://arxiv.org/pdf/1608.05916v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-networks-and-chaos-construction |
Repo | |
Framework | |
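The chaotic iterations studied here update one Boolean component at a time according to a strategy sequence. A small sketch using the vectorial negation — a classic example whose chaotic iterations satisfy Devaney's definition — is given below; the equivalence proof and the training experiments are the paper's contribution and are not reflected in this snippet.

```python
import random

def chaotic_iterations(f, x, strategy):
    """Chaotic iterations: at step t, only component strategy[t] of x is updated by f."""
    trajectory = [tuple(x)]
    for i in strategy:
        x = list(x)
        x[i] = f(x)[i]          # update a single component, keep the others unchanged
        trajectory.append(tuple(x))
    return trajectory

def negation(x):
    """Vectorial negation on {0, 1}^n."""
    return [1 - b for b in x]

random.seed(0)
x0 = [0, 1, 1, 0]
strategy = [random.randrange(4) for _ in range(10)]   # which component to update each step
for state in chaotic_iterations(negation, x0, strategy):
    print(state)
```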
A Simple Approach to Multilingual Polarity Classification in Twitter
Title | A Simple Approach to Multilingual Polarity Classification in Twitter |
Authors | Eric S. Tellez, Sabino Miranda Jiménez, Mario Graff, Daniela Moctezuma, Ranyart R. Suárez, Oscar S. Siordia |
Abstract | Recently, sentiment analysis has received a lot of attention due to the interest in mining opinions of social media users. Sentiment analysis consists in determining the polarity of a given text, i.e., its degree of positiveness or negativeness. Traditionally, sentiment analysis algorithms have been tailored to a specific language given the complexity of handling the lexical variations and errors introduced by the people generating content. In this contribution, our aim is to provide a simple-to-implement and easy-to-use multilingual framework that can serve as a baseline for sentiment analysis contests and as a starting point for building new sentiment analysis systems. We compare our approach in eight different languages, three of which have important international contests, namely, SemEval (English), TASS (Spanish), and SENTIPOLC (Italian). Within the competitions, our approach reaches medium to high positions in the rankings, whereas in the remaining languages our approach outperforms the reported results. |
Tasks | Sentiment Analysis |
Published | 2016-12-15 |
URL | http://arxiv.org/abs/1612.05270v1 |
http://arxiv.org/pdf/1612.05270v1.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-approach-to-multilingual-polarity |
Repo | |
Framework | |
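A language-agnostic baseline in the same spirit — character n-gram features, which tolerate misspellings and lexical variation without a language-specific tokenizer, fed to a linear classifier — can be assembled in a few lines of scikit-learn. This is a generic stand-in rather than the authors' exact pipeline, and the tiny inline dataset is purely illustrative.

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

texts = ["I love this phone", "horrible service, never again",
         "me encanta esta película", "qué producto tan malo"]   # mixed-language toy data
labels = ["pos", "neg", "pos", "neg"]

# Character n-grams work across languages without language-specific preprocessing.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LinearSVC(),
)
clf.fit(texts, labels)
print(clf.predict(["this is great", "terrible"]))
```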
Learning activation functions from data using cubic spline interpolation
Title | Learning activation functions from data using cubic spline interpolation |
Authors | Simone Scardapane, Michele Scarpiniti, Danilo Comminiello, Aurelio Uncini |
Abstract | Neural networks require a careful design in order to perform properly on a given task. In particular, selecting a good activation function (possibly in a data-dependent fashion) is a crucial step, which remains an open problem in the research community. Despite a large amount of investigation, most current implementations simply select one fixed function from a small set of candidates, which is not adapted during training, and is shared among all neurons throughout the different layers. However, neither of these two assumptions can be considered optimal in practice. In this paper, we present a principled way to have data-dependent adaptation of the activation functions, which is performed independently for each neuron. This is achieved by leveraging past and present advances in cubic spline interpolation, allowing for local adaptation of the functions around their regions of use. The resulting algorithm is relatively cheap to implement, and overfitting is counterbalanced by the inclusion of a novel damping criterion, which penalizes unwanted oscillations away from a predefined shape. Experimental results validate the proposal over two well-known benchmarks. |
Tasks | |
Published | 2016-05-18 |
URL | http://arxiv.org/abs/1605.05509v2 |
http://arxiv.org/pdf/1605.05509v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-activation-functions-from-data-using |
Repo | |
Framework | |
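The idea of a per-neuron activation parameterised by control points on a cubic spline can be sketched with SciPy. In this sketch the control values stand in for the trainable parameters (the training loop that would adapt them is omitted), and the damping term penalises deviation from a reference shape such as tanh; the grid, reference, and weight are assumptions.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Control points sampled on a fixed grid; these would be the trainable parameters.
knots = np.linspace(-3, 3, 15)
control_values = np.tanh(knots) + 0.05 * np.random.default_rng(0).normal(size=knots.size)

activation = CubicSpline(knots, control_values)        # the (adapted) activation function

def damping_penalty(values, reference=np.tanh(knots), lam=0.1):
    """Penalise oscillations away from a predefined shape to counteract overfitting."""
    return lam * np.sum((values - reference) ** 2)

x = np.linspace(-3, 3, 7)
print(activation(x))                       # spline activation applied to pre-activations
print(damping_penalty(control_values))     # regulariser added to the training loss
```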
Two-sample testing in non-sparse high-dimensional linear models
Title | Two-sample testing in non-sparse high-dimensional linear models |
Authors | Yinchu Zhu, Jelena Bradic |
Abstract | In analyzing high-dimensional models, sparsity of the model parameter is a common but often undesirable assumption. In this paper, we study the following two-sample testing problem: given two samples generated by two high-dimensional linear models, we aim to test whether the regression coefficients of the two linear models are identical. We propose a framework named TIERS (short for TestIng Equality of Regression Slopes), which solves the two-sample testing problem without making any assumptions on the sparsity of the regression parameters. TIERS builds a new model by convolving the two samples in such a way that the original hypothesis translates into a new moment condition. A self-normalization construction is then developed to form a moment test. We provide rigorous theory for the developed framework. Under very weak conditions of the feature covariance, we show that the accuracy of the proposed test in controlling Type I errors is robust both to the lack of sparsity in the features and to the heavy tails in the error distribution, even when the sample size is much smaller than the feature dimension. Moreover, we discuss minimax optimality and efficiency properties of the proposed test. Simulation analysis demonstrates excellent finite-sample performance of our test. In deriving the test, we also develop tools that are of independent interest. The test is built upon a novel estimator, called Auto-aDaptive Dantzig Selector (ADDS), which not only automatically chooses an appropriate scale of the error term but also incorporates prior information. To effectively approximate the critical value of the test statistic, we develop a novel high-dimensional plug-in approach that complements the recent advances in Gaussian approximation theory. |
Tasks | |
Published | 2016-10-14 |
URL | http://arxiv.org/abs/1610.04580v1 |
http://arxiv.org/pdf/1610.04580v1.pdf | |
PWC | https://paperswithcode.com/paper/two-sample-testing-in-non-sparse-high |
Repo | |
Framework | |