Paper Group ANR 582
Online Learning Without Prior Information
Title | Online Learning Without Prior Information |
Authors | Ashok Cutkosky, Kwabena Boahen |
Abstract | The vast majority of optimization and online learning algorithms today require some prior information about the data (often in the form of bounds on gradients or on the optimal parameter value). When this information is not available, these algorithms require laborious manual tuning of various hyperparameters, motivating the search for algorithms that can adapt to the data with no prior information. We describe a frontier of new lower bounds on the performance of such algorithms, reflecting a tradeoff between a term that depends on the optimal parameter value and a term that depends on the gradients’ rate of growth. Further, we construct a family of algorithms whose performance matches any desired point on this frontier, which no previous algorithm reaches. |
Tasks | |
Published | 2017-03-07 |
URL | http://arxiv.org/abs/1703.02629v2 |
PDF | http://arxiv.org/pdf/1703.02629v2.pdf |
PWC | https://paperswithcode.com/paper/online-learning-without-prior-information |
Repo | |
Framework | |
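For context on the kind of prior information at issue in the abstract above, here is a minimal sketch (not the paper's algorithm) of plain projected online gradient descent, whose step size requires a gradient-norm bound G and a comparator-radius bound D up front; the parameter-free algorithms in the paper are designed to remove exactly this dependence. The function name, constants, and data are illustrative assumptions.

```python
import numpy as np

def online_gradient_descent(loss_grads, G, D):
    """Plain projected online gradient descent on a ball of radius D.

    Illustrates the prior information (gradient bound G, comparator bound D)
    that parameter-free algorithms aim to remove: the classical step size
    eta = D / (G * sqrt(T)) needs both constants before seeing any data.
    """
    T = len(loss_grads)
    eta = D / (G * np.sqrt(T))          # requires knowing G and D in advance
    w = np.zeros_like(loss_grads[0])
    for g in loss_grads:
        w = w - eta * g
        norm = np.linalg.norm(w)
        if norm > D:                    # project back onto the ball of radius D
            w = w * (D / norm)
    return w

grads = [np.random.default_rng(i).normal(size=5) for i in range(100)]
w_final = online_gradient_descent(grads, G=3.0, D=1.0)
```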
Agnostic Learning by Refuting
Title | Agnostic Learning by Refuting |
Authors | Pravesh K. Kothari, Roi Livni |
Abstract | The sample complexity of learning a Boolean-valued function class is precisely characterized by its Rademacher complexity. This has little bearing, however, on the sample complexity of \emph{efficient} agnostic learning. We introduce \emph{refutation complexity}, a natural computational analog of Rademacher complexity of a Boolean concept class and show that it exactly characterizes the sample complexity of \emph{efficient} agnostic learning. Informally, refutation complexity of a class $\mathcal{C}$ is the minimum number of example-label pairs required to efficiently distinguish between the case that the labels correlate with the evaluation of some member of $\mathcal{C}$ (\emph{structure}) and the case where the labels are i.i.d. Rademacher random variables (\emph{noise}). The easy direction of this relationship was implicitly used in the recent framework for improper PAC learning lower bounds of Daniely and co-authors via connections to the hardness of refuting random constraint satisfaction problems. Our work can be seen as making the relationship between agnostic learning and refutation implicit in their work into an explicit equivalence. In a recent, independent work, Salil Vadhan discovered a similar relationship between refutation and PAC-learning in the realizable (i.e. noiseless) case. |
Tasks | |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.03871v2 |
PDF | http://arxiv.org/pdf/1709.03871v2.pdf |
PWC | https://paperswithcode.com/paper/agnostic-learning-by-refuting |
Repo | |
Framework | |
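To make the refutation task concrete, the toy sketch below distinguishes "structure" from "noise" for a small finite concept class by brute-force correlation search; the single-coordinate class, the threshold, and the synthetic data are illustrative assumptions, and the brute-force search deliberately ignores the efficiency question that the paper is actually about.

```python
import numpy as np

def refute(X, y, hypotheses, threshold):
    """Toy refuter for a finite concept class, by brute force.

    Returns "structure" if some hypothesis h achieves empirical correlation
    (1/m) * sum_i y_i * h(x_i) above the threshold, and "noise" otherwise.
    For i.i.d. Rademacher labels every fixed h has correlation concentrating
    around 0, so a threshold of order sqrt(log|C| / m) separates the cases
    with high probability.
    """
    best = max(np.mean(y * h(X)) for h in hypotheses)
    return "structure" if best >= threshold else "noise"

# toy usage: class of single-coordinate sign functions on {-1,+1}^d
rng = np.random.default_rng(0)
d, m = 10, 2000
X = rng.choice([-1.0, 1.0], size=(m, d))
hyps = [(lambda X, j=j: X[:, j]) for j in range(d)]
y_noise = rng.choice([-1.0, 1.0], size=m)   # i.i.d. Rademacher labels
y_struct = X[:, 3]                          # labels produced by a class member
tau = 3 * np.sqrt(np.log(d) / m)
print(refute(X, y_noise, hyps, tau), refute(X, y_struct, hyps, tau))
```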
Multiscale Residual Mixture of PCA: Dynamic Dictionaries for Optimal Basis Learning
Title | Multiscale Residual Mixture of PCA: Dynamic Dictionaries for Optimal Basis Learning |
Authors | Randall Balestriero |
Abstract | In this paper we are interested in the problem of learning an over-complete basis and a methodology such that the reconstruction or inverse problem does not need optimization. We analyze the optimality of the presented approaches and their link to popular, already known techniques such as Artificial Neural Networks, k-means, and Oja’s learning rule. Finally, we see that one approach to reaching the optimal dictionary is a factorial and hierarchical approach. The derived approach leads to a formulation of a Deep Oja Network. We present results on different tasks and present the resulting very efficient learning algorithm, which brings a new vision to the training of deep nets. Finally, the theoretical work shows that deep frameworks are one way to efficiently obtain an over-complete (combinatorially large) dictionary while still allowing easy reconstruction. We thus present the Deep Residual Oja Network (DRON). We demonstrate that a recursive deep approach working on the residuals allows an exponential decrease of the error w.r.t. the depth. |
Tasks | |
Published | 2017-07-18 |
URL | http://arxiv.org/abs/1707.05840v1 |
PDF | http://arxiv.org/pdf/1707.05840v1.pdf |
PWC | https://paperswithcode.com/paper/multiscale-residual-mixture-of-pca-dynamic |
Repo | |
Framework | |
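A minimal numerical sketch of the residual idea, assuming plain PCA at each level rather than the paper's Oja-rule-based Deep Residual Oja Network: each level fits a small basis to whatever the previous level failed to reconstruct, and the reconstruction error shrinks with depth.

```python
import numpy as np

def residual_pca_stack(X, k, depth):
    """Fit a k-component PCA to the residual left by the previous level.

    This is only an illustration of the residual/hierarchical idea, not the
    paper's exact construction.
    """
    residual = X - X.mean(axis=0)
    bases, errors = [], []
    for _ in range(depth):
        _, _, Vt = np.linalg.svd(residual, full_matrices=False)
        B = Vt[:k]                       # (k, d) basis for this level
        recon = residual @ B.T @ B       # project residual onto the basis
        residual = residual - recon      # pass what is left to the next level
        bases.append(B)
        errors.append(np.linalg.norm(residual))
    return bases, errors

X = np.random.default_rng(1).normal(size=(500, 32))
_, errs = residual_pca_stack(X, k=4, depth=5)
print(errs)   # monotonically decreasing reconstruction error
```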
Efficient Online Inference for Infinite Evolutionary Cluster models with Applications to Latent Social Event Discovery
Title | Efficient Online Inference for Infinite Evolutionary Cluster models with Applications to Latent Social Event Discovery |
Authors | Wei Wei, Kenneth Joseph, Kathleen Carley |
Abstract | The Recurrent Chinese Restaurant Process (RCRP) is a powerful statistical method for modeling evolving clusters in large scale social media data. With the RCRP, one can allow both the number of clusters and the cluster parameters in a model to change over time. However, application of the RCRP has largely been limited due to the non-conjugacy between the cluster evolutionary priors and the Multinomial likelihood. This non-conjugacy makes inference difficult and restricts the scalability of models which use the RCRP, leading to the RCRP being applied only in simple problems, such as those that can be approximated by a single Gaussian emission. In this paper, we provide a novel solution for the non-conjugacy issues for the RCRP and an example of how to leverage our solution for one specific problem - the social event discovery problem. By utilizing Sequential Monte Carlo methods in inference, our approach can be massively parallelized and is highly scalable, to the extent it can work on tens of millions of documents. We are able to generate high quality topical and location distributions of the clusters that can be directly interpreted as real social events, and our experimental results suggest that the approaches proposed achieve much better predictive performance than techniques reported in prior work. We also demonstrate how the techniques we develop can be used in much more general ways toward similar problems. |
Tasks | |
Published | 2017-08-20 |
URL | http://arxiv.org/abs/1708.06000v1 |
PDF | http://arxiv.org/pdf/1708.06000v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-online-inference-for-infinite |
Repo | |
Framework | |
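As a hedged illustration of the prior involved, the snippet below sketches a Recurrent-CRP-style assignment probability in which an item joins an existing cluster in proportion to its decayed previous-epoch count plus its current-epoch count, or opens a new cluster in proportion to a concentration parameter gamma. The decay kernel and parameter names are assumptions, not details taken from this paper's abstract.

```python
import numpy as np

def rcrp_prior_probs(prev_counts, curr_counts, gamma, decay):
    """Sketch of a Recurrent CRP prior over cluster assignments at one epoch.

    Existing cluster k: probability proportional to decay * n_{k, t-1} + n_{k, t}.
    New cluster: probability proportional to gamma. The exact decay scheme is
    an assumption made for illustration.
    """
    weights = decay * prev_counts + curr_counts   # one entry per existing cluster
    weights = np.append(weights, gamma)           # last slot = open a new cluster
    return weights / weights.sum()

probs = rcrp_prior_probs(np.array([5.0, 2.0]), np.array([1.0, 0.0]), gamma=1.0, decay=0.5)
k = np.random.default_rng(2).choice(len(probs), p=probs)   # sampled cluster assignment
```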
Peduncle Detection of Sweet Pepper for Autonomous Crop Harvesting - Combined Colour and 3D Information
Title | Peduncle Detection of Sweet Pepper for Autonomous Crop Harvesting - Combined Colour and 3D Information |
Authors | Inkyu Sa, Chris Lehnert, Andrew English, Chris McCool, Feras Dayoub, Ben Upcroft, Tristan Perez |
Abstract | This paper presents a 3D visual detection method for the challenging task of detecting peduncles of sweet peppers (Capsicum annuum) in the field. Cutting the peduncle cleanly is one of the most difficult stages of the harvesting process, where the peduncle is the part of the crop that attaches it to the main stem of the plant. Accurate peduncle detection in 3D space is therefore a vital step in reliable autonomous harvesting of sweet peppers, as this can lead to precise cutting while avoiding damage to the surrounding plant. This paper makes use of both colour and geometry information acquired from an RGB-D sensor and utilises a supervised-learning approach for the peduncle detection task. The performance of the proposed method is demonstrated and evaluated using qualitative and quantitative results (the Area-Under-the-Curve (AUC) of the detection precision-recall curve). We are able to achieve an AUC of 0.71 for peduncle detection on field-grown sweet peppers. We release a set of manually annotated 3D sweet pepper and peduncle images to assist the research community in performing further research on this topic. |
Tasks | |
Published | 2017-01-30 |
URL | http://arxiv.org/abs/1701.08608v1 |
PDF | http://arxiv.org/pdf/1701.08608v1.pdf |
PWC | https://paperswithcode.com/paper/peduncle-detection-of-sweet-pepper-for |
Repo | |
Framework | |
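The sketch below illustrates the general supervised pipeline the abstract describes: per-point colour and geometry features from an RGB-D sensor, a generic classifier, and a precision-recall AUC metric. The specific features, the random-forest classifier, and the synthetic data are assumptions for illustration, not the paper's exact setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

# Hypothetical per-point feature matrix: a few colour channels (e.g. HSV)
# concatenated with a few geometry descriptors computed from the depth data.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 8))                 # 3 colour + 5 geometry features per 3D point
y = (X[:, 0] + 0.5 * X[:, 4] + rng.normal(scale=0.5, size=5000) > 1.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]         # per-point peduncle score
print("PR-AUC:", average_precision_score(y_te, scores))
```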
One-Shot Imitation Learning
Title | One-Shot Imitation Learning |
Authors | Yan Duan, Marcin Andrychowicz, Bradly C. Stadie, Jonathan Ho, Jonas Schneider, Ilya Sutskever, Pieter Abbeel, Wojciech Zaremba |
Abstract | Imitation learning has been commonly applied to solve different tasks in isolation. This usually requires either careful feature engineering, or a significant number of samples. This is far from what we desire: ideally, robots should be able to learn from very few demonstrations of any given task, and instantly generalize to new situations of the same task, without requiring task-specific engineering. In this paper, we propose a meta-learning framework for achieving such capability, which we call one-shot imitation learning. Specifically, we consider the setting where there is a very large set of tasks, and each task has many instantiations. For example, a task could be to stack all blocks on a table into a single tower, another task could be to place all blocks on a table into two-block towers, etc. In each case, different instances of the task would consist of different sets of blocks with different initial states. At training time, our algorithm is presented with pairs of demonstrations for a subset of all tasks. A neural net is trained that takes as input one demonstration and the current state (which initially is the initial state of the other demonstration of the pair), and outputs an action with the goal that the resulting sequence of states and actions matches as closely as possible with the second demonstration. At test time, a demonstration of a single instance of a new task is presented, and the neural net is expected to perform well on new instances of this new task. The use of soft attention allows the model to generalize to conditions and tasks unseen in the training data. We anticipate that by training this model on a much greater variety of tasks and settings, we will obtain a general system that can turn any demonstrations into robust policies that can accomplish an overwhelming variety of tasks. Videos available at https://bit.ly/nips2017-oneshot . |
Tasks | Feature Engineering, Imitation Learning, Meta-Learning |
Published | 2017-03-21 |
URL | http://arxiv.org/abs/1703.07326v3 |
PDF | http://arxiv.org/pdf/1703.07326v3.pdf |
PWC | https://paperswithcode.com/paper/one-shot-imitation-learning |
Repo | |
Framework | |
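A minimal sketch of the training setup described in the abstract, assuming PyTorch and a simple mean-pooled demonstration encoder in place of the paper's soft-attention architecture: the policy is conditioned on one demonstration and trained by behavioral cloning to reproduce the actions of a second demonstration of the same task. All sizes and names are placeholders.

```python
import torch
import torch.nn as nn

class DemoConditionedPolicy(nn.Module):
    """Policy that consumes one demonstration plus the current state and
    predicts an action; a stand-in for the paper's attention-based model."""

    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.demo_encoder = nn.Sequential(nn.Linear(state_dim + action_dim, hidden), nn.ReLU())
        self.policy = nn.Sequential(
            nn.Linear(hidden + state_dim, hidden), nn.ReLU(), nn.Linear(hidden, action_dim)
        )

    def forward(self, demo, state):
        # demo: (T, state_dim + action_dim) state-action pairs; state: (state_dim,)
        context = self.demo_encoder(demo).mean(dim=0)        # pool over demo timesteps
        return self.policy(torch.cat([context, state], dim=-1))

# one behavioral-cloning step on a (demo_a, demo_b) pair from the same task
state_dim, action_dim = 10, 4
net = DemoConditionedPolicy(state_dim, action_dim)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
demo_a = torch.randn(50, state_dim + action_dim)                               # conditioning demo
states_b, actions_b = torch.randn(50, state_dim), torch.randn(50, action_dim)  # target demo
pred = torch.stack([net(demo_a, s) for s in states_b])
loss = nn.functional.mse_loss(pred, actions_b)
opt.zero_grad(); loss.backward(); opt.step()
```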
Indoor Localization by Fusing a Group of Fingerprints Based on Random Forests
Title | Indoor Localization by Fusing a Group of Fingerprints Based on Random Forests |
Authors | Xiansheng Guo, Nirwan Ansari, Huiyong Li |
Abstract | Indoor localization based on SIngle Of Fingerprint (SIOF) is rather susceptible to the changing environment, multipath, and non-line-of-sight (NLOS) propagation. Building SIOF is also a very time-consuming process. Recently, we first proposed a GrOup Of Fingerprints (GOOF) to improve the localization accuracy and reduce the burden of building fingerprints. However, the main drawback is the timeliness. In this paper, we propose a novel localization framework by Fusing A Group Of fingerprinTs (FAGOT) based on random forests. In the offline phase, we first build a GOOF from different transformations of the received signals of multiple antennas. Then, we design multiple GOOF strong classifiers based on Random Forests (GOOF-RF) by training each fingerprint in the GOOF. In the online phase, we input the corresponding transformations of the real measurements into these strong classifiers to obtain multiple independent decisions. Finally, we propose a Sliding Window aIded Mode-based (SWIM) fusion algorithm to balance the localization accuracy and time. Our proposed approaches can work better in an unknown indoor scenario. The burden of building fingerprints can also be reduced drastically. We demonstrate the performance of our algorithms through simulations and real experimental data using two Universal Software Radio Peripheral (USRP) platforms. |
Tasks | |
Published | 2017-03-07 |
URL | http://arxiv.org/abs/1703.02185v1 |
PDF | http://arxiv.org/pdf/1703.02185v1.pdf |
PWC | https://paperswithcode.com/paper/indoor-localization-by-fusing-a-group-of |
Repo | |
Framework | |
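A rough sketch of the GOOF-RF idea under stated assumptions: one random forest per signal transformation, with decisions fused by taking the mode over a small window of estimates. The particular transformations, window handling, and synthetic data are placeholders rather than the paper's SWIM algorithm.

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
raw = rng.normal(size=(1000, 16))                                   # received-signal features
labels = (raw[:, 0] > 0).astype(int) + 2 * (raw[:, 1] > 0).astype(int)  # 4 synthetic "locations"

# A group of fingerprints: one transformation of the raw signal per forest.
transforms = [lambda s: s, lambda s: np.abs(np.fft.fft(s, axis=1)), lambda s: s ** 2]
forests = [RandomForestClassifier(n_estimators=50, random_state=0).fit(t(raw), labels)
           for t in transforms]

def localize(window_of_measurements):
    """Fuse the forests' decisions over a window of measurements by majority vote."""
    votes = np.concatenate([f.predict(t(window_of_measurements))
                            for f, t in zip(forests, transforms)])
    return stats.mode(votes, keepdims=False).mode    # most frequent location label

print(localize(raw[:5]))
```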
Image Classification of Melanoma, Nevus and Seborrheic Keratosis by Deep Neural Network Ensemble
Title | Image Classification of Melanoma, Nevus and Seborrheic Keratosis by Deep Neural Network Ensemble |
Authors | Kazuhisa Matsunaga, Akira Hamada, Akane Minagawa, Hiroshi Koga |
Abstract | This short paper reports the method and the evaluation results of Casio and Shinshu University joint team for the ISBI Challenge 2017 - Skin Lesion Analysis Towards Melanoma Detection - Part 3: Lesion Classification hosted by ISIC. Our online validation score was 0.958 with melanoma classifier AUC 0.924 and seborrheic keratosis classifier AUC 0.993. |
Tasks | Image Classification |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03108v1 |
PDF | http://arxiv.org/pdf/1703.03108v1.pdf |
PWC | https://paperswithcode.com/paper/image-classification-of-melanoma-nevus-and |
Repo | |
Framework | |
Compact Neural Networks based on the Multiscale Entanglement Renormalization Ansatz
Title | Compact Neural Networks based on the Multiscale Entanglement Renormalization Ansatz |
Authors | Andrew Hallam, Edward Grant, Vid Stojevic, Simone Severini, Andrew G. Green |
Abstract | This paper demonstrates a method for tensorizing neural networks based upon an efficient way of approximating scale invariant quantum states, the Multi-scale Entanglement Renormalization Ansatz (MERA). We employ MERA as a replacement for the fully connected layers in a convolutional neural network and test this implementation on the CIFAR-10 and CIFAR-100 datasets. The proposed method outperforms factorization using tensor trains, providing greater compression for the same level of accuracy and greater accuracy for the same level of compression. We demonstrate MERA layers with 14000 times fewer parameters and a reduction in accuracy of less than 1% compared to the equivalent fully connected layers, scaling like O(N). |
Tasks | |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1711.03357v3 |
PDF | http://arxiv.org/pdf/1711.03357v3.pdf |
PWC | https://paperswithcode.com/paper/compact-neural-networks-based-on-the |
Repo | |
Framework | |
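For a concrete sense of how tensorized replacements for fully connected layers cut parameters, here is a two-core tensor-train linear layer, i.e. the baseline family the paper compares against rather than the MERA construction itself; shapes and ranks are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TTLinear2(nn.Module):
    """Two-core tensor-train factorization of a dense layer.

    Maps (m1*m2) -> (n1*n2) with roughly (m1*n1 + m2*n2) * rank parameters
    instead of m1*m2*n1*n2 for the dense weight matrix.
    """

    def __init__(self, m1, m2, n1, n2, rank):
        super().__init__()
        self.m1, self.m2 = m1, m2
        self.core1 = nn.Parameter(torch.randn(m1, n1, rank) * 0.1)
        self.core2 = nn.Parameter(torch.randn(rank, m2, n2) * 0.1)

    def forward(self, x):                       # x: (batch, m1*m2)
        x = x.view(-1, self.m1, self.m2)
        # contract the input with both cores: b=batch, a=m1, c=m2, r=rank
        y = torch.einsum("bac,air,rcj->bij", x, self.core1, self.core2)
        return y.reshape(x.shape[0], -1)        # (batch, n1*n2)

layer = TTLinear2(m1=32, m2=32, n1=32, n2=32, rank=4)   # ~8k parameters vs ~1M dense
out = layer(torch.randn(8, 1024))
print(out.shape)   # torch.Size([8, 1024])
```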
Exploiting Linguistic Resources for Neural Machine Translation Using Multi-task Learning
Title | Exploiting Linguistic Resources for Neural Machine Translation Using Multi-task Learning |
Authors | Jan Niehues, Eunah Cho |
Abstract | Linguistic resources such as part-of-speech (POS) tags have been extensively used in statistical machine translation (SMT) frameworks and have yielded better performances. However, usage of such linguistic annotations in neural machine translation (NMT) systems has been left under-explored. In this work, we show that multi-task learning is a successful and easy approach to introducing additional knowledge into an end-to-end neural attentional model. By jointly training several natural language processing (NLP) tasks in one system, we are able to leverage common information and improve the performance of the individual tasks. We analyze the impact of three design decisions in multi-task learning: the tasks used in training, the training schedule, and the degree of parameter sharing across the tasks, which is defined by the network architecture. The experiments are conducted for a German-to-English translation task. As additional linguistic resources, we exploit POS information and named entities (NE). Experiments show that the translation quality can be improved by up to 1.5 BLEU points under the low-resource condition. The performance of the POS tagger is also improved using the multi-task learning scheme. |
Tasks | Machine Translation, Multi-Task Learning |
Published | 2017-08-03 |
URL | http://arxiv.org/abs/1708.00993v1 |
PDF | http://arxiv.org/pdf/1708.00993v1.pdf |
PWC | https://paperswithcode.com/paper/exploiting-linguistic-resources-for-neural |
Repo | |
Framework | |
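A minimal sketch of the parameter-sharing structure, assuming PyTorch: one shared encoder with a task-specific head per task and a simple alternating training schedule. The real system is an attentional encoder-decoder NMT model; the toy "translation head" below is only a stand-in, and all sizes are placeholders.

```python
import torch
import torch.nn as nn

class SharedEncoderMultiTask(nn.Module):
    """One shared encoder, one output head per task (tagging stands in for
    POS/NE, a linear projection stands in for the translation decoder)."""

    def __init__(self, vocab, hidden, n_tags, tgt_vocab):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)   # shared across tasks
        self.tag_head = nn.Linear(hidden, n_tags)                 # POS/NE tagging task
        self.translate_head = nn.Linear(hidden, tgt_vocab)        # stand-in for the decoder

    def forward(self, src, task):
        enc, _ = self.encoder(self.embed(src))                    # (batch, len, hidden)
        return self.tag_head(enc) if task == "tag" else self.translate_head(enc)

model = SharedEncoderMultiTask(vocab=8000, hidden=256, n_tags=17, tgt_vocab=8000)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for task in ["translate", "tag", "translate"]:                    # simple alternating schedule
    src = torch.randint(0, 8000, (16, 20))
    gold = torch.randint(0, 17 if task == "tag" else 8000, (16, 20))
    logits = model(src, task)
    loss = loss_fn(logits.reshape(-1, logits.size(-1)), gold.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```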
Joint Named Entity Recognition and Stance Detection in Tweets
Title | Joint Named Entity Recognition and Stance Detection in Tweets |
Authors | Dilek Küçük |
Abstract | Named entity recognition (NER) is a well-established task of information extraction which has been studied for decades. More recently, studies reporting NER experiments on social media texts have emerged. On the other hand, stance detection is a relatively new research topic usually considered within the scope of sentiment analysis. Stance detection studies are mostly applied to texts of online debates where the stance of the text owner for a particular target, either explicitly or implicitly mentioned in the text, is explored. In this study, we investigate the possible contribution of named entities to the stance detection task in tweets. We report the evaluation results of NER experiments as well as those of the subsequent stance detection experiments using named entities, on a publicly-available stance-annotated data set of tweets. Our results indicate that named entities obtained with a high-performance NER system can contribute to stance detection performance on tweets. |
Tasks | Named Entity Recognition, Sentiment Analysis, Stance Detection |
Published | 2017-07-30 |
URL | http://arxiv.org/abs/1707.09611v1 |
PDF | http://arxiv.org/pdf/1707.09611v1.pdf |
PWC | https://paperswithcode.com/paper/joint-named-entity-recognition-and-stance |
Repo | |
Framework | |
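A small sketch of the feature-combination idea, with the NER output mocked as a string of entity tokens per tweet: tweet text and recognized entities are vectorized separately and concatenated before training a stance classifier. The vectorizers, classifier, and toy data are assumptions, not the paper's pipeline.

```python
from scipy.sparse import hstack
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical inputs: tweets, NER output per tweet, and gold stance labels.
tweets = ["glad the new policy passed", "this policy will ruin everything", "great win today"]
entities = ["policy", "policy", ""]          # entity tokens produced by an NER system
stance = ["favor", "against", "none"]

text_vec = TfidfVectorizer().fit(tweets)
ent_vec = CountVectorizer(binary=True).fit(entities)
X = hstack([text_vec.transform(tweets), ent_vec.transform(entities)])  # text + entity features
clf = LogisticRegression(max_iter=1000).fit(X, stance)
print(clf.predict(X))
```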
Model-based Iterative Restoration for Binary Document Image Compression with Dictionary Learning
Title | Model-based Iterative Restoration for Binary Document Image Compression with Dictionary Learning |
Authors | Yandong Guo, Cheng Lu, Jan P. Allebach, Charles A. Bouman |
Abstract | The inherent noise in the observed (e.g., scanned) binary document image degrades the image quality and harms the compression ratio by breaking the pattern repetition and adding entropy to the document images. In this paper, we design a cost function in a Bayesian framework with dictionary learning. Minimizing our cost function produces a restored image which has better quality than that of the observed noisy image, and a dictionary for representing and encoding the image. After the restoration, we use this dictionary (from the same cost function) to encode the restored image following the symbol-dictionary framework of the JBIG2 standard in lossless mode. Experimental results with a variety of document images demonstrate that our method improves the image quality compared with the observed image, and simultaneously improves the compression ratio. For the test images with synthetic noise, our method reduces the number of flipped pixels by 48.2% and improves the compression ratio by 36.36% as compared with the best encoding methods. For the test images with real noise, our method visually improves the image quality, and outperforms the cutting-edge method by 28.27% in terms of the compression ratio. |
Tasks | Dictionary Learning, Image Compression |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07019v1 |
PDF | http://arxiv.org/pdf/1704.07019v1.pdf |
PWC | https://paperswithcode.com/paper/model-based-iterative-restoration-for-binary |
Repo | |
Framework | |
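A very rough stand-in for the joint restoration/encoding idea, assuming a k-means patch dictionary instead of the paper's Bayesian cost function: noisy binary patches are replaced by their nearest dictionary atom, so repeated symbols become identical (less noisy and cheaper to encode). The JBIG2 integration is not reproduced, and all sizes are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

def restore_with_patch_dictionary(binary_image, patch=8, n_atoms=16):
    """Learn a small dictionary of binary patches and replace each noisy patch
    by its (thresholded) nearest atom."""
    h, w = binary_image.shape
    h, w = h - h % patch, w - w % patch
    blocks = (binary_image[:h, :w]
              .reshape(h // patch, patch, w // patch, patch)
              .swapaxes(1, 2).reshape(-1, patch * patch))
    km = KMeans(n_clusters=n_atoms, n_init=5, random_state=0).fit(blocks)
    atoms = (km.cluster_centers_ > 0.5).astype(blocks.dtype)     # binarized dictionary
    restored_blocks = atoms[km.labels_]                          # nearest-atom replacement
    return (restored_blocks.reshape(h // patch, w // patch, patch, patch)
            .swapaxes(1, 2).reshape(h, w))

noisy = (np.random.default_rng(0).random((64, 64)) > 0.5).astype(np.uint8)
print(restore_with_patch_dictionary(noisy).shape)    # (64, 64)
```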
An initialization method for the k-means using the concept of useful nearest centers
Title | An initialization method for the k-means using the concept of useful nearest centers |
Authors | Hassan Ismkhan |
Abstract | The aim of the k-means is to minimize the squared sum of Euclidean distances from the mean (SSEDM) of each cluster. The k-means can effectively optimize this function, but it is too sensitive to the initial centers (seeds). This paper proposes a method for initialization of the k-means using the concept of a useful nearest center for each data point. |
Tasks | |
Published | 2017-05-10 |
URL | http://arxiv.org/abs/1705.03613v1 |
PDF | http://arxiv.org/pdf/1705.03613v1.pdf |
PWC | https://paperswithcode.com/paper/an-initialization-method-for-the-k-means |
Repo | |
Framework | |
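For reference, the sketch below computes the SSEDM objective and runs standard Lloyd iterations from a given seed set, which is enough to observe the seed sensitivity the abstract mentions; the paper's "useful nearest center" seeding rule itself is not reproduced here.

```python
import numpy as np

def ssedm(X, centers, labels):
    """Squared sum of Euclidean distances from each point to its cluster mean."""
    return sum(np.sum((X[labels == k] - c) ** 2) for k, c in enumerate(centers))

def lloyd(X, seeds, iters=20):
    """Standard k-means (Lloyd) iterations from a given seed set."""
    centers = seeds.copy()
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == k].mean(0) if np.any(labels == k) else centers[k]
                            for k in range(len(centers))])
    return ssedm(X, centers, labels)

rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(loc=m, size=(100, 2)) for m in (0, 5, 10)])
print(lloyd(X, X[rng.choice(len(X), 3, replace=False)]))   # final SSEDM under random seeding
```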
Characterizing Types of Convolution in Deep Convolutional Recurrent Neural Networks for Robust Speech Emotion Recognition
Title | Characterizing Types of Convolution in Deep Convolutional Recurrent Neural Networks for Robust Speech Emotion Recognition |
Authors | Che-Wei Huang, Shrikanth. S. Narayanan |
Abstract | Deep convolutional neural networks are being actively investigated in a wide range of speech and audio processing applications including speech recognition, audio event detection and computational paralinguistics, owing to their ability to reduce factors of variation when learning from speech. However, studies have suggested favoring a certain type of convolutional operation when building a deep convolutional neural network for speech applications, although there have been promising results using different types of convolutional operations. In this work, we study four types of convolutional operations on different input features for speech emotion recognition under noisy and clean conditions in order to derive a comprehensive understanding. Since affective behavioral information has been shown to reflect temporal variation of mental state and convolutional operations are applied locally in time, all deep neural networks share a deep recurrent sub-network architecture for further temporal modeling. We present a detailed quantitative module-wise performance analysis to gain insights into information flows within the proposed architectures. In particular, we demonstrate the interplay of affective information and other irrelevant information during the progression from one module to another. Finally, we show that all of our deep neural networks provide state-of-the-art performance on the eNTERFACE’05 corpus. |
Tasks | Emotion Recognition, Speech Emotion Recognition, Speech Recognition |
Published | 2017-06-07 |
URL | http://arxiv.org/abs/1706.02901v2 |
PDF | http://arxiv.org/pdf/1706.02901v2.pdf |
PWC | https://paperswithcode.com/paper/characterizing-types-of-convolution-in-deep |
Repo | |
Framework | |
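A minimal sketch of the shared convolutional-recurrent pattern the paper studies, assuming PyTorch: a convolutional front-end over acoustic features feeding a recurrent sub-network and an utterance-level classifier. Only one convolution type (1-D across time) is shown, and all layer sizes are placeholders.

```python
import torch
import torch.nn as nn

class ConvRecurrentSER(nn.Module):
    """Convolutional front-end + recurrent temporal modeling + emotion classifier."""

    def __init__(self, n_features=40, n_emotions=6, hidden=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.rnn = nn.GRU(64, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_emotions)

    def forward(self, feats):                 # feats: (batch, time, n_features)
        x = self.conv(feats.transpose(1, 2))  # Conv1d expects (batch, channels, time)
        _, h = self.rnn(x.transpose(1, 2))    # recurrent sub-network over time
        return self.out(h[-1])                # utterance-level emotion logits

logits = ConvRecurrentSER()(torch.randn(4, 300, 40))
print(logits.shape)                           # torch.Size([4, 6])
```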
Novel Ranking-Based Lexical Similarity Measure for Word Embedding
Title | Novel Ranking-Based Lexical Similarity Measure for Word Embedding |
Authors | Jakub Dutkiewicz, Czesław Jędrzejek |
Abstract | Distributional semantics models derive a word space from linguistic items in context. Meaning is obtained by defining a distance measure between vectors corresponding to lexical entities. Such vectors present several problems. In this paper we provide a guideline for post-process improvements to the baseline vectors. We focus on refining the similarity aspect, address imperfections of the model by applying the hubness reduction method, implementing relational knowledge into the model, and providing a new ranking similarity definition that gives maximum weight to the top-1 component value. This feature ranking is similar to the one used in information retrieval. All these enrichments outperform current literature results for the joint ESL and TOEFL sets comparison. Since a single word embedding is a basic element of any semantic task, one can expect a significant improvement of results for these tasks. Moreover, our improved method of text processing can be translated to continuous distributed representations of biological sequences for deep proteomics and genomics. |
Tasks | Information Retrieval |
Published | 2017-12-22 |
URL | http://arxiv.org/abs/1712.08439v1 |
PDF | http://arxiv.org/pdf/1712.08439v1.pdf |
PWC | https://paperswithcode.com/paper/novel-ranking-based-lexical-similarity |
Repo | |
Framework | |
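As a hedged illustration of a ranking-based similarity that gives the top-ranked component the largest weight, the sketch below ranks each vector's components, keeps the top few per word, and scores shared components with a rank-decaying weight; this follows the general idea described in the abstract, not the paper's exact definition.

```python
import numpy as np

def rank_weighted_similarity(u, v, top_n=50):
    """Rank-based similarity: shared top components are rewarded with a weight
    that decays with rank, so the top-1 component dominates. Illustrative only."""
    ranks_u = {i: r for r, i in enumerate(np.argsort(-u)[:top_n], start=1)}
    ranks_v = {i: r for r, i in enumerate(np.argsort(-v)[:top_n], start=1)}
    shared = ranks_u.keys() & ranks_v.keys()
    return sum(2.0 / (ranks_u[i] + ranks_v[i]) for i in shared)

rng = np.random.default_rng(0)
a = rng.normal(size=300)
print(rank_weighted_similarity(a, a + 0.1 * rng.normal(size=300)))   # near-duplicate vectors
print(rank_weighted_similarity(a, rng.normal(size=300)))             # unrelated vectors
```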