Paper Group ANR 122
Support Spinor Machine
Title | Support Spinor Machine |
Authors | Kabin Kanjamapornkul, Richard Pinčák, Sanphet Chunithpaisan, Erik Bartoš |
Abstract | We generalize a support vector machine to a support spinor machine by using the mathematical structure of the wedge product over the vector machine in order to extend the field from a vector field to a spinor field. The separated hyperplane is extended to Kolmogorov space in time series data, which allows us to extend the structure of a support vector machine to a support tensor machine and a support tensor machine moduli space. Our performance test of the support spinor machine is done on one-class classification of the end point in the physiology state of time series data after empirical mode analysis, and is compared with a support vector machine test. We implement the support spinor machine algorithm using Holo-Hilbert amplitude modulation for fully nonlinear and nonstationary time series data analysis. |
Tasks | Time Series |
Published | 2017-09-11 |
URL | http://arxiv.org/abs/1709.03943v1 |
http://arxiv.org/pdf/1709.03943v1.pdf | |
PWC | https://paperswithcode.com/paper/support-spinor-machine |
Repo | |
Framework | |
Learning to Learn from Noisy Web Videos
Title | Learning to Learn from Noisy Web Videos |
Authors | Serena Yeung, Vignesh Ramanathan, Olga Russakovsky, Liyue Shen, Greg Mori, Li Fei-Fei |
Abstract | Understanding the simultaneously very diverse and intricately fine-grained set of possible human actions is a critical open problem in computer vision. Manually labeling training videos is feasible for some action classes but doesn’t scale to the full long-tailed distribution of actions. A promising way to address this is to leverage noisy data from web queries to learn new actions, using semi-supervised or “webly-supervised” approaches. However, these methods typically do not learn domain-specific knowledge, or rely on iterative hand-tuned data labeling policies. In this work, we instead propose a reinforcement learning-based formulation for selecting the right examples for training a classifier from noisy web search results. Our method uses Q-learning to learn a data labeling policy on a small labeled training dataset, and then uses this to automatically label noisy web data for new visual concepts. Experiments on the challenging Sports-1M action recognition benchmark as well as on additional fine-grained and newly emerging action classes demonstrate that our method is able to learn good labeling policies for noisy data and use this to learn accurate visual concept classifiers. |
Tasks | Q-Learning, Temporal Action Localization |
Published | 2017-06-09 |
URL | http://arxiv.org/abs/1706.02884v1 |
http://arxiv.org/pdf/1706.02884v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-learn-from-noisy-web-videos |
Repo | |
Framework | |
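The abstract above names Q-learning as the mechanism for learning a labeling policy. As a rough illustration only, the following is a minimal tabular Q-learning update in Python. The paper's actual state representation, action set, and function approximator are not described in this listing, so the state names, the action names (`label_positive`, `skip`), and the `alpha` and `gamma` values below are all hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Sequence, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def q_update(
    q: QTable,
    state: Hashable,
    action: Hashable,
    reward: float,
    next_state: Hashable,
    next_actions: Sequence[Hashable],
    alpha: float = 0.1,   # learning rate (assumed value)
    gamma: float = 0.9,   # discount factor (assumed value)
) -> float:
    """Apply one standard Q-learning update and return the new Q-value."""
    # Value of the best action available in the next state (0.0 if terminal).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: reward 1.0 for a labeling decision that helped the classifier.
q: QTable = defaultdict(float)
actions = ["label_positive", "skip"]
first = q_update(q, "s0", "label_positive", 1.0, "s1", actions)
second = q_update(q, "s0", "label_positive", 1.0, "s1", actions)
```

In a data-labeling setting the reward would typically come from classifier performance on held-out labeled data, but that wiring is an assumption here and is not taken from the abstract.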
Checkpoint Ensembles: Ensemble Methods from a Single Training Process
Title | Checkpoint Ensembles: Ensemble Methods from a Single Training Process |
Authors | Hugh Chen, Scott Lundberg, Su-In Lee |
Abstract | We present the checkpoint ensembles method that can learn ensemble models in a single training process. Although checkpoint ensembles can be applied to any parametric iterative learning technique, here we focus on neural networks. Neural networks’ composable and simple neurons make it possible to capture many individual and interaction effects among features. However, small sample sizes and sampling noise may result in patterns in the training data that are not representative of the true relationship between the features and the outcome. As a solution, regularization during training is often used (e.g. dropout). However, regularization is no panacea – it does not perfectly address overfitting. Even with methods like dropout, two methodologies are commonly used in practice. The first is to use a validation set independent of the training set as a way to decide when to stop training. The second is to use ensemble methods to further reduce overfitting and take advantage of local optima (i.e. averaging over the predictions of several models). In this paper, we explore checkpoint ensembles – a simple technique that combines these two ideas in one training process. Checkpoint ensembles improve performance by averaging the predictions from “checkpoints” of the best models within a single training process. We use three real-world data sets – text, image, and electronic health record data – using three prediction models: a vanilla neural network, a convolutional neural network, and a long short term memory network to show that checkpoint ensembles outperform existing methods: a method that selects a model by minimum validation score, and two methods that average models by weights. Our results also show that checkpoint ensembles capture a portion of the performance gains that traditional ensembles provide. |
Tasks | |
Published | 2017-10-09 |
URL | http://arxiv.org/abs/1710.03282v1 |
http://arxiv.org/pdf/1710.03282v1.pdf | |
PWC | https://paperswithcode.com/paper/checkpoint-ensembles-ensemble-methods-from-a |
Repo | |
Framework | |
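The checkpoint-ensemble idea in the abstract above (average the predictions of the best checkpoints from one training run) can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the function name, the choice of selecting the top `k` checkpoints by lowest validation loss, and the toy numbers are all illustrative and are not taken from the paper.

```python
from typing import List, Sequence


def checkpoint_ensemble(
    checkpoint_preds: Sequence[Sequence[float]],
    val_scores: Sequence[float],
    k: int = 3,
) -> List[float]:
    """Average the predictions of the k checkpoints with the lowest validation score."""
    if len(checkpoint_preds) != len(val_scores):
        raise ValueError("one validation score per checkpoint is required")
    if not 1 <= k <= len(val_scores):
        raise ValueError("k must be between 1 and the number of checkpoints")
    # Indices of the k best checkpoints (assumption: lower score is better).
    best = sorted(range(len(val_scores)), key=lambda i: val_scores[i])[:k]
    n_examples = len(checkpoint_preds[0])
    return [
        sum(checkpoint_preds[i][j] for i in best) / k
        for j in range(n_examples)
    ]


# Hypothetical predictions from four checkpoints on three test examples.
preds = [
    [0.9, 0.2, 0.4],
    [0.8, 0.1, 0.6],
    [0.7, 0.3, 0.5],
    [0.1, 0.9, 0.9],  # a poor checkpoint, excluded by its validation loss
]
val_losses = [0.30, 0.25, 0.28, 0.70]
ensemble = checkpoint_ensemble(preds, val_losses, k=3)
```

Setting `k=1` reduces this to selecting the single checkpoint with the minimum validation score, which is one of the baselines the abstract mentions.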
Towards Instance Segmentation with Object Priority: Prominent Object Detection and Recognition
Title | Towards Instance Segmentation with Object Priority: Prominent Object Detection and Recognition |
Authors | Hamed R. Tavakoli, Jorma Laaksonen |
Abstract | This manuscript introduces the problem of prominent object detection and recognition, inspired by the fact that humans seem to prioritize the perception of scene elements. The problem deals with finding the most important region of interest, segmenting the relevant item/object in that area, and assigning it an object class label. In other words, we are solving the three problems of saliency modeling, saliency detection, and object recognition under one umbrella. The motivation behind such a problem formulation is (1) the benefits to knowledge representation-based vision pipelines, and (2) the potential improvements in emulating bio-inspired vision systems by solving these three problems together. We foresee extending this problem formulation to fully semantically segmented scenes with instance object priority for high-level inferences in various applications, including assistive vision. Along with a new problem definition, we also propose a method to achieve such a task. The proposed model predicts the most important area in the image, segments the associated objects, and labels them. The proposed problem and method are evaluated against human fixations, annotated segmentation masks, and object class categories. We define a chance level for each evaluation criterion against which to compare the proposed algorithm. Despite the good performance of the proposed baseline, the overall evaluations indicate that the problem of prominent object detection and recognition is a challenging task that is still worth investigating further. |
Tasks | Instance Segmentation, Object Detection, Object Recognition, Saliency Detection, Semantic Segmentation |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07402v2 |
http://arxiv.org/pdf/1704.07402v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-instance-segmentation-with-object |
Repo | |
Framework | |
Benchmarking Decoupled Neural Interfaces with Synthetic Gradients
Title | Benchmarking Decoupled Neural Interfaces with Synthetic Gradients |
Authors | Ekaba Bisong |
Abstract | Artificial Neural Networks are a particular class of learning systems modeled after biological neural functions with an interesting penchant for Hebbian learning, that is “neurons that wire together, fire together”. However, unlike their natural counterparts, artificial neural networks have a close and stringent coupling between the modules of neurons in the network. This coupling or locking imposes upon the network a strict and inflexible structure that prevents layers in the network from updating their weights until a full feed-forward and backward pass has occurred. Such a constraint, though it may have sufficed for a while, is no longer feasible in the era of very-large-scale machine learning, coupled with the increased desire for parallelization of the learning process across multiple computing infrastructures. To solve this problem, synthetic gradients (SG) with decoupled neural interfaces (DNI) are introduced as a viable alternative to the backpropagation algorithm. This paper performs a benchmark to compare the speed and accuracy of SG-DNI against a standard neural interface using a multilayer perceptron (MLP). SG-DNI shows good promise, in that it not only captures the learning problem, it is also over 3-fold faster due to its asynchronous learning capabilities. |
Tasks | |
Published | 2017-12-22 |
URL | http://arxiv.org/abs/1712.08314v3 |
http://arxiv.org/pdf/1712.08314v3.pdf | |
PWC | https://paperswithcode.com/paper/benchmarking-decoupled-neural-interfaces-with |
Repo | |
Framework | |
Document Image Binarization with Fully Convolutional Neural Networks
Title | Document Image Binarization with Fully Convolutional Neural Networks |
Authors | Chris Tensmeyer, Tony Martinez |
Abstract | Binarization of degraded historical manuscript images is an important pre-processing step for many document processing tasks. We formulate binarization as a pixel classification learning task and apply a novel Fully Convolutional Network (FCN) architecture that operates at multiple image scales, including full resolution. The FCN is trained to optimize a continuous version of the Pseudo F-measure metric, and an ensemble of FCNs outperforms the competition winners on 4 of 7 DIBCO competitions. This same binarization technique can also be applied to different domains such as Palm Leaf Manuscripts with good performance. We analyze the performance of the proposed model w.r.t. the architectural hyperparameters, size and diversity of training data, and the input features chosen. |
Tasks | |
Published | 2017-08-10 |
URL | http://arxiv.org/abs/1708.03276v1 |
http://arxiv.org/pdf/1708.03276v1.pdf | |
PWC | https://paperswithcode.com/paper/document-image-binarization-with-fully |
Repo | |
Framework | |
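The abstract above mentions training on a continuous version of an F-measure metric. To illustrate the general idea of making an F-measure continuous, here is a minimal Python sketch of a plain soft F-measure in which hard 0/1 predictions are replaced by per-pixel probabilities. Note the swap: this is the ordinary F-measure with uniform pixel weights, not the Pseudo F-measure, which applies per-pixel weights that this listing does not specify. The function name and the `eps` constant are assumptions.

```python
from typing import Sequence


def soft_f_measure(
    probs: Sequence[float],
    targets: Sequence[int],
    eps: float = 1e-8,  # small constant to avoid division by zero (assumed)
) -> float:
    """Continuous F-measure: counts are replaced by sums of probabilities."""
    if len(probs) != len(targets):
        raise ValueError("probs and targets must have the same length")
    # Soft true positives: probability mass placed on foreground pixels.
    tp = sum(p * t for p, t in zip(probs, targets))
    predicted_pos = sum(probs)
    actual_pos = sum(targets)
    precision = tp / (predicted_pos + eps)
    recall = tp / (actual_pos + eps)
    return 2.0 * precision * recall / (precision + recall + eps)


# Hypothetical flattened 2x2 image: 1 marks a foreground (ink) pixel.
perfect = soft_f_measure([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0])
uncertain = soft_f_measure([0.5, 0.5], [1, 0])
```

Because every term is a sum or ratio of probabilities, the score varies smoothly with the network outputs, so `1 - soft_f_measure(...)` could serve as a training loss in an autodiff framework.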
On Optimistic versus Randomized Exploration in Reinforcement Learning
Title | On Optimistic versus Randomized Exploration in Reinforcement Learning |
Authors | Ian Osband, Benjamin Van Roy |
Abstract | We discuss the relative merits of optimistic and randomized approaches to exploration in reinforcement learning. Optimistic approaches presented in the literature apply an optimistic boost to the value estimate at each state-action pair and select actions that are greedy with respect to the resulting optimistic value function. Randomized approaches sample from among statistically plausible value functions and select actions that are greedy with respect to the random sample. Prior computational experience suggests that randomized approaches can lead to far more statistically efficient learning. We present two simple analytic examples that elucidate why this is the case. In principle, there should be optimistic approaches that fare well relative to randomized approaches, but that would require intractable computation. Optimistic approaches that have been proposed in the literature sacrifice statistical efficiency for the sake of computational efficiency. Randomized approaches, on the other hand, may enable simultaneous statistical and computational efficiency. |
Tasks | |
Published | 2017-06-13 |
URL | http://arxiv.org/abs/1706.04241v1 |
http://arxiv.org/pdf/1706.04241v1.pdf | |
PWC | https://paperswithcode.com/paper/on-optimistic-versus-randomized-exploration |
Repo | |
Framework | |
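The contrast drawn in the abstract above can be illustrated in the simplest possible setting, a Bernoulli bandit, which is a simplification and not one of the paper's analytic examples. In the Python sketch below, the optimistic rule adds a UCB1-style bonus to each value estimate and acts greedily, while the randomized rule samples one plausible value per arm from a Beta posterior and acts greedily on the sample. The specific bonus form, the uniform Beta(1, 1) prior, and all names are assumptions chosen for illustration.

```python
import math
import random
from typing import List


def optimistic_action(successes: List[int], pulls: List[int], t: int) -> int:
    """Greedy action with respect to value estimates plus an optimistic bonus."""
    def boosted(a: int) -> float:
        if pulls[a] == 0:
            return float("inf")  # untried arms are treated as maximally promising
        mean = successes[a] / pulls[a]
        return mean + math.sqrt(2.0 * math.log(t) / pulls[a])
    return max(range(len(pulls)), key=boosted)


def randomized_action(successes: List[int], pulls: List[int], rng: random.Random) -> int:
    """Greedy action with respect to one posterior sample per arm."""
    samples = [
        rng.betavariate(1 + successes[a], 1 + pulls[a] - successes[a])
        for a in range(len(pulls))
    ]
    return max(range(len(pulls)), key=lambda a: samples[a])


def simulate(policy: str, true_means: List[float], steps: int, seed: int = 0) -> List[int]:
    """Run one policy on a Bernoulli bandit and return the pull count per arm."""
    rng = random.Random(seed)
    successes = [0] * len(true_means)
    pulls = [0] * len(true_means)
    for t in range(1, steps + 1):
        if policy == "optimistic":
            a = optimistic_action(successes, pulls, t)
        else:
            a = randomized_action(successes, pulls, rng)
        reward = 1 if rng.random() < true_means[a] else 0
        successes[a] += reward
        pulls[a] += 1
    return pulls


# Hypothetical two-armed bandit with success probabilities 0.3 and 0.7.
optimistic_pulls = simulate("optimistic", [0.3, 0.7], steps=200)
randomized_pulls = simulate("randomized", [0.3, 0.7], steps=200)
```

Both rules end with a greedy `max`; they differ only in whether the values being maximized are deterministically inflated or randomly sampled.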
Fast Hough Transform and approximation properties of dyadic patterns
Title | Fast Hough Transform and approximation properties of dyadic patterns |
Authors | E. I. Ershov, S. M. Karpenko |
Abstract | The Hough transform is a popular low-level computer vision algorithm. Its computationally efficient modification, the Fast Hough transform (FHT), makes use of special subsets of the image matrix to approximate geometric lines on it. Because of their special structure, these subsets are called dyadic patterns. In this paper various properties of dyadic patterns are investigated. Exact upper bounds on approximation error are derived. In the simplest case, this error proves to be equal to $\frac{1}{6} \log(n)$ for $n \times n$ sized images, as was conjectured previously by Goetz et al. |
Tasks | |
Published | 2017-12-15 |
URL | http://arxiv.org/abs/1712.05615v1 |
http://arxiv.org/pdf/1712.05615v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-hough-transform-and-approximation |
Repo | |
Framework | |
Drug-drug Interaction Extraction via Recurrent Neural Network with Multiple Attention Layers
Title | Drug-drug Interaction Extraction via Recurrent Neural Network with Multiple Attention Layers |
Authors | Zibo Yi, Shasha Li, Jie Yu, Qingbo Wu |
Abstract | Drug-drug interaction (DDI) is vital information when physicians and pharmacists intend to co-administer two or more drugs. Thus, several DDI databases have been constructed to avoid mistakenly combined use. In recent years, automatically extracting DDIs from biomedical text has drawn researchers’ attention. However, existing work utilizes either complex feature engineering or NLP tools, both of which are insufficient for sentence comprehension. Inspired by deep learning approaches in natural language processing, we propose a recurrent neural network model with multiple attention layers for DDI classification. We evaluate our model on the 2013 SemEval DDIExtraction dataset. The experiments show that our model classifies most of the drug pairs into the correct DDI categories and outperforms the existing NLP or deep learning methods. |
Tasks | Feature Engineering |
Published | 2017-05-09 |
URL | http://arxiv.org/abs/1705.03261v2 |
http://arxiv.org/pdf/1705.03261v2.pdf | |
PWC | https://paperswithcode.com/paper/drug-drug-interaction-extraction-via |
Repo | |
Framework | |
Affective Neural Response Generation
Title | Affective Neural Response Generation |
Authors | Nabiha Asghar, Pascal Poupart, Jesse Hoey, Xin Jiang, Lili Mou |
Abstract | Existing neural conversational models process natural language primarily on a lexico-syntactic level, thereby ignoring one of the most crucial components of human-to-human dialogue: its affective content. We take a step in this direction by proposing three novel ways to incorporate affective/emotional aspects into long short term memory (LSTM) encoder-decoder neural conversation models: (1) affective word embeddings, which are cognitively engineered, (2) affect-based objective functions that augment the standard cross-entropy loss, and (3) affectively diverse beam search for decoding. Experiments show that these techniques improve the open-domain conversational prowess of encoder-decoder networks by enabling them to produce emotionally rich responses that are more interesting and natural. |
Tasks | Word Embeddings |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.03968v1 |
http://arxiv.org/pdf/1709.03968v1.pdf | |
PWC | https://paperswithcode.com/paper/affective-neural-response-generation |
Repo | |
Framework | |
Adaptive Feature Representation for Visual Tracking
Title | Adaptive Feature Representation for Visual Tracking |
Authors | Yuqi Han, Chenwei Deng, Zengshuo Zhang, Jiatong Li, Baojun Zhao |
Abstract | Robust feature representation plays a significant role in visual tracking. However, it remains a challenging issue, since many factors may affect the experimental performance. Existing methods, which combine different features by assigning them equal, fixed weights, can hardly solve the issue, because the statistical properties of different features vary across scenarios and attributes. In this paper, by exploiting the internal relationship among these features, we develop a robust method to construct a more stable feature representation. More specifically, we utilize a co-training paradigm to formulate the intrinsic complementary information of a multi-feature template into the efficient correlation filter framework. We test our approach on challenging sequences with illumination variation, scale variation, deformation, etc. Experimental results demonstrate that the proposed method compares favorably with state-of-the-art methods. |
Tasks | Visual Tracking |
Published | 2017-05-12 |
URL | http://arxiv.org/abs/1705.04442v1 |
http://arxiv.org/pdf/1705.04442v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-feature-representation-for-visual |
Repo | |
Framework | |
Statistical Vs Rule Based Machine Translation; A Case Study on Indian Language Perspective
Title | Statistical Vs Rule Based Machine Translation; A Case Study on Indian Language Perspective |
Authors | Sreelekha S |
Abstract | In this paper we present a case study comparing Statistical Machine Translation (SMT) and Rule-Based Machine Translation (RBMT) systems from the English-Indian language and Indian-to-Indian language perspectives. The main objective of our study is to make a five-way performance comparison: a) SMT and RBMT, b) SMT on English-Indian language, c) RBMT on English-Indian language, d) SMT from the Indian-to-Indian language perspective, and e) RBMT from the Indian-to-Indian language perspective. Through a detailed analysis we describe the development and evaluation of the rule-based and statistical machine translation systems. Through a detailed error analysis, we point out the relative strengths and weaknesses of both systems. The observations based on our study are: a) SMT systems outperform RBMT; b) in the case of SMT, English-to-Indian language MT systems perform better than Indian-to-English language MT systems; c) in the case of RBMT, English-to-Indian language MT systems perform better than Indian-to-English language MT systems; d) SMT systems perform better than RBMT for Indian-to-Indian language MT. Effectively, we shall see that even with a small training corpus, a statistical machine translation system has many advantages for high-quality domain-specific machine translation over a rule-based counterpart. |
Tasks | Machine Translation |
Published | 2017-08-12 |
URL | http://arxiv.org/abs/1708.04559v1 |
http://arxiv.org/pdf/1708.04559v1.pdf | |
PWC | https://paperswithcode.com/paper/statistical-vs-rule-based-machine-translation |
Repo | |
Framework | |
A Kind of Affine Weighted Moment Invariants
Title | A Kind of Affine Weighted Moment Invariants |
Authors | Hanlin Mo, You Hao, Shirui Li, Hua Li |
Abstract | A new kind of geometric invariant, called the affine weighted moment invariant (AWMI), is proposed in this paper. By combining local affine differential invariants with a framework of global integrals, AWMIs can extract image features more effectively, increase the number of low-order invariants, and decrease the computational cost. The experimental results show that AWMIs have good stability and distinguishability and achieve better results in image retrieval than traditional moment invariants. An extension to 3D is straightforward. |
Tasks | Image Retrieval |
Published | 2017-06-05 |
URL | http://arxiv.org/abs/1706.01209v2 |
http://arxiv.org/pdf/1706.01209v2.pdf | |
PWC | https://paperswithcode.com/paper/a-kind-of-affine-weighted-moment-invariants |
Repo | |
Framework | |
Prolongation of SMAP to Spatio-temporally Seamless Coverage of Continental US Using a Deep Learning Neural Network
Title | Prolongation of SMAP to Spatio-temporally Seamless Coverage of Continental US Using a Deep Learning Neural Network |
Authors | Kuai Fang, Chaopeng Shen, Daniel Kifer, Xiao Yang |
Abstract | The Soil Moisture Active Passive (SMAP) mission has delivered valuable sensing of surface soil moisture since 2015. However, it has a short time span and an irregular revisit schedule. Utilizing a state-of-the-art time-series deep learning neural network, Long Short-Term Memory (LSTM), we created a system that predicts SMAP level-3 soil moisture data with atmospheric forcing, model-simulated moisture, and static physiographic attributes as inputs. The system removes most of the bias with model simulations and improves predicted moisture climatology, achieving a small test root-mean-squared error (<0.035) and a high correlation coefficient (>0.87) for over 75% of the continental United States, including the forested Southeast. As the first application of LSTM in hydrology, we show the proposed network avoids overfitting and is robust for both temporal and spatial extrapolation tests. LSTM generalizes well across regions with distinct climates and physiography. With high fidelity to SMAP, LSTM shows great potential for hindcasting, data assimilation, and weather forecasting. |
Tasks | Time Series, Weather Forecasting |
Published | 2017-07-20 |
URL | http://arxiv.org/abs/1707.06611v3 |
http://arxiv.org/pdf/1707.06611v3.pdf | |
PWC | https://paperswithcode.com/paper/prolongation-of-smap-to-spatio-temporally |
Repo | |
Framework | |
Remedies against the Vocabulary Gap in Information Retrieval
Title | Remedies against the Vocabulary Gap in Information Retrieval |
Authors | Christophe Van Gysel |
Abstract | Search engines rely heavily on term-based approaches that represent queries and documents as bags of words. Text (a document or a query) is represented by a bag of its words that ignores grammar and word order, but retains word frequency counts. When presented with a search query, the engine then ranks documents according to their relevance scores by computing, among other things, the matching degrees between query and document terms. While term-based approaches are intuitive and effective in practice, they are based on the hypothesis that documents that exactly contain the query terms are highly relevant regardless of query semantics. Conversely, term-based approaches assume that documents that do not contain the query terms are irrelevant. However, it is known that a high matching degree at the term level does not necessarily mean high relevance and, vice versa, documents that match no query terms may still be relevant. Consequently, there exists a vocabulary gap between queries and documents that occurs when both use different words to describe the same concepts. The alleviation of the effect of this vocabulary gap is the topic of this dissertation. More specifically, we propose (1) methods to formulate an effective query from complex textual structures and (2) latent vector space models that circumvent the vocabulary gap in information retrieval. |
Tasks | Information Retrieval |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.06004v1 |
http://arxiv.org/pdf/1711.06004v1.pdf | |
PWC | https://paperswithcode.com/paper/remedies-against-the-vocabulary-gap-in |
Repo | |
Framework | |
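The vocabulary gap described in the abstract above is easy to reproduce with a toy bag-of-words matcher. The Python sketch below is illustrative only: the scoring function (a dot product of raw term counts), the whitespace tokenizer, and the example texts are assumptions, not the retrieval models studied in the dissertation.

```python
from collections import Counter
from typing import List


def bag_of_words(text: str) -> Counter:
    """Lowercase, split on whitespace, and keep only term frequency counts."""
    return Counter(text.lower().split())


def term_match_score(query: str, document: str) -> int:
    """Dot product of query and document term counts (exact term matching)."""
    q, d = bag_of_words(query), bag_of_words(document)
    # Counter returns 0 for terms the document does not contain.
    return sum(q[term] * d[term] for term in q)


# Two hypothetical documents describing the same concept in different words.
docs: List[str] = [
    "cheap car for sale",
    "affordable automobile on offer",
]
query = "cheap car"
scores = [term_match_score(query, doc) for doc in docs]
# scores == [2, 0]: the second document is relevant but shares no query terms.
```

The second document scores zero despite being on topic, which is the failure mode that latent vector space models aim to avoid by comparing texts in a learned representation space rather than by exact term overlap.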