Paper Group ANR 216
Exploring Segment Representations for Neural Segmentation Models
Title | Exploring Segment Representations for Neural Segmentation Models |
Authors | Yijia Liu, Wanxiang Che, Jiang Guo, Bing Qin, Ting Liu |
Abstract | Many natural language processing (NLP) tasks can be cast as segmentation problems. In this paper, we combine a semi-CRF with a neural network to solve NLP segmentation tasks. Our model represents a segment both by composing its input units and by embedding the entire segment. We thoroughly study different composition functions and different segment embeddings. We conduct extensive experiments on two typical segmentation tasks: named entity recognition (NER) and Chinese word segmentation (CWS). Experimental results show that our neural semi-CRF model benefits from representing the entire segment and achieves state-of-the-art performance on the CWS benchmark dataset and competitive results on the CoNLL03 dataset. |
Tasks | Chinese Word Segmentation, Named Entity Recognition |
Published | 2016-04-19 |
URL | http://arxiv.org/abs/1604.05499v1 |
http://arxiv.org/pdf/1604.05499v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-segment-representations-for-neural |
Repo | |
Framework | |
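The abstract above scores each candidate segment both by composing its input units and by embedding the entire segment. The following is a minimal, illustrative Python sketch of one such composition-plus-embedding scorer; the dimensions, the mean composition, the lookup table, and the linear scoring layer are simplifying assumptions, not the authors' exact model.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8                        # toy hidden size (assumption)
W = rng.normal(size=2 * DIM)   # toy linear scoring weights (assumption)

# Hypothetical table of whole-segment embeddings; unseen segments fall back to zeros.
segment_table = {"New York": rng.normal(size=DIM)}

def score_segment(unit_vectors, segment_text):
    """Score one candidate segment for a semi-CRF.

    unit_vectors: per-token vectors inside the segment. The composition
    function here is a plain mean; the paper studies richer alternatives.
    """
    composed = np.mean(unit_vectors, axis=0)
    seg_emb = segment_table.get(segment_text, np.zeros(DIM))
    return float(W @ np.concatenate([composed, seg_emb]))

tokens = [rng.normal(size=DIM), rng.normal(size=DIM)]
print(score_segment(tokens, "New York"))
```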
A Theory of Generative ConvNet
Title | A Theory of Generative ConvNet |
Authors | Jianwen Xie, Yang Lu, Song-Chun Zhu, Ying Nian Wu |
Abstract | We show that a generative random field model, which we call generative ConvNet, can be derived from the commonly used discriminative ConvNet, by assuming a ConvNet for multi-category classification and assuming one of the categories is a base category generated by a reference distribution. If we further assume that the non-linearity in the ConvNet is Rectified Linear Unit (ReLU) and the reference distribution is Gaussian white noise, then we obtain a generative ConvNet model that is unique among energy-based models: The model is piecewise Gaussian, and the means of the Gaussian pieces are defined by an auto-encoder, where the filters in the bottom-up encoding become the basis functions in the top-down decoding, and the binary activation variables detected by the filters in the bottom-up convolution process become the coefficients of the basis functions in the top-down deconvolution process. The Langevin dynamics for sampling the generative ConvNet is driven by the reconstruction error of this auto-encoder. The contrastive divergence learning of the generative ConvNet reconstructs the training images by the auto-encoder. The maximum likelihood learning algorithm can synthesize realistic natural image patterns. |
Tasks | |
Published | 2016-02-10 |
URL | http://arxiv.org/abs/1602.03264v3 |
http://arxiv.org/pdf/1602.03264v3.pdf | |
PWC | https://paperswithcode.com/paper/a-theory-of-generative-convnet |
Repo | |
Framework | |
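The abstract notes that the generative ConvNet is sampled with Langevin dynamics driven by an auto-encoder reconstruction error. As a hedged illustration, here is the generic Langevin update applied to a toy quadratic energy; the energy function, step size, and number of steps are placeholders, not the paper's ConvNet energy.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(x):
    # Toy quadratic energy standing in for the learned ConvNet energy (assumption).
    return 0.5 * np.sum(x ** 2)

def grad_energy(x):
    return x  # gradient of the toy energy above

def langevin_step(x, step=0.05):
    """One Langevin update: a gradient step on the energy plus Gaussian noise."""
    noise = rng.normal(size=x.shape)
    return x - 0.5 * step ** 2 * grad_energy(x) + step * noise

x = rng.normal(size=(4, 4))    # toy "image"
for _ in range(200):
    x = langevin_step(x)
print(round(energy(x), 3))
```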
Optimal Margin Distribution Machine
Title | Optimal Margin Distribution Machine |
Authors | Teng Zhang, Zhi-Hua Zhou |
Abstract | The support vector machine (SVM) has been one of the most popular learning algorithms, with the central idea of maximizing the minimum margin, i.e., the smallest distance from the instances to the classification boundary. Recent theoretical results, however, disclosed that maximizing the minimum margin does not necessarily lead to better generalization performance; instead, the margin distribution has been proven to be more crucial. Based on this idea, we propose a new method, named Optimal margin Distribution Machine (ODM), which tries to achieve better generalization performance by optimizing the margin distribution. We characterize the margin distribution by its first- and second-order statistics, i.e., the margin mean and variance. The proposed method is a general learning approach that can be used wherever SVM can be applied, and its superiority is verified both theoretically and empirically in this paper. |
Tasks | |
Published | 2016-04-12 |
URL | http://arxiv.org/abs/1604.03348v1 |
http://arxiv.org/pdf/1604.03348v1.pdf | |
PWC | https://paperswithcode.com/paper/optimal-margin-distribution-machine |
Repo | |
Framework | |
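ODM characterizes the margin distribution through its first- and second-order statistics. The toy sketch below only computes the margin mean and variance for a fixed linear classifier; the actual ODM optimizes these statistics inside a regularized objective, which is not reproduced here.

```python
import numpy as np

def margin_stats(w, b, X, y):
    """Mean and variance of the functional margins y_i * (w·x_i + b)."""
    margins = y * (X @ w + b)
    return margins.mean(), margins.var()

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)   # toy linearly separable labels
w, b = np.array([1.0, 1.0]), 0.0
mean, var = margin_stats(w, b, X, y)
print(f"margin mean={mean:.3f}, variance={var:.3f}")
```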
Asymmetric Move Selection Strategies in Monte-Carlo Tree Search: Minimizing the Simple Regret at Max Nodes
Title | Asymmetric Move Selection Strategies in Monte-Carlo Tree Search: Minimizing the Simple Regret at Max Nodes |
Authors | Yun-Ching Liu, Yoshimasa Tsuruoka |
Abstract | The combination of multi-armed bandit (MAB) algorithms with Monte-Carlo tree search (MCTS) has made a significant impact in various research fields. The UCT algorithm, which combines the UCB bandit algorithm with MCTS, is a good example of the success of this combination. The recent breakthrough made by AlphaGo, which incorporates convolutional neural networks with bandit algorithms in MCTS, also highlights the necessity of bandit algorithms in MCTS. However, despite the various investigations carried out on MCTS, nearly all of them still follow the paradigm of treating every node as an independent instance of the MAB problem, and applying the same bandit algorithm and heuristics on every node. As a result, this paradigm may leave some properties of the game tree unexploited. In this work, we propose that max nodes and min nodes have different concerns regarding their value estimation, and different bandit algorithms should be applied accordingly. We develop the Asymmetric-MCTS algorithm, which is an MCTS variant that applies a simple regret algorithm on max nodes, and the UCB algorithm on min nodes. We will demonstrate the performance of the Asymmetric-MCTS algorithm on the game of $9\times 9$ Go, $9\times 9$ NoGo, and Othello. |
Tasks | |
Published | 2016-05-08 |
URL | http://arxiv.org/abs/1605.02321v1 |
http://arxiv.org/pdf/1605.02321v1.pdf | |
PWC | https://paperswithcode.com/paper/asymmetric-move-selection-strategies-in-monte |
Repo | |
Framework | |
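The key idea is to apply a simple-regret-oriented selection rule at max nodes and standard UCB at min nodes. The sketch below contrasts UCB1 with a sqrt(N)-style exploration term often used for simple-regret minimization; the exact formula assumed here for max nodes is illustrative, not necessarily the one used in the paper.

```python
import math

def ucb_select(children, total_visits, c=1.414):
    """Cumulative-regret (UCB1) selection, as used at min nodes."""
    return max(children,
               key=lambda ch: ch["value"] / ch["visits"]
               + c * math.sqrt(math.log(total_visits) / ch["visits"]))

def simple_regret_select(children, total_visits, c=1.414):
    """Selection with a sqrt(N)/n exploration term, a common
    simple-regret-minimizing heuristic assumed here for max nodes."""
    return max(children,
               key=lambda ch: ch["value"] / ch["visits"]
               + c * math.sqrt(math.sqrt(total_visits) / ch["visits"]))

children = [{"value": 3.0, "visits": 5}, {"value": 1.0, "visits": 1}]
print(ucb_select(children, 6), simple_regret_select(children, 6))
```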
Different approaches for identifying important concepts in probabilistic biomedical text summarization
Title | Different approaches for identifying important concepts in probabilistic biomedical text summarization |
Authors | Milad Moradi, Nasser Ghadiri |
Abstract | Automatic text summarization tools help users in the biomedical domain to acquire their intended information from various textual resources more efficiently. Some biomedical text summarization systems base their sentence selection approach on the frequency of concepts extracted from the input text. However, exploring measures other than frequency for identifying the valuable content of the input document, and considering the correlations between concepts, may be more useful for this type of summarization. In this paper, we describe a Bayesian summarizer for biomedical text documents. The Bayesian summarizer initially maps the input text to Unified Medical Language System (UMLS) concepts, then selects the important ones to be used as classification features. We introduce different feature selection approaches to identify the most important concepts of the text and to select the most informative content according to the distribution of these concepts. We show that with an appropriate feature selection approach, the Bayesian biomedical summarizer can improve summarization performance. We perform extensive evaluations on a corpus of scientific papers in the biomedical domain. The results show that the Bayesian summarizer outperforms biomedical summarizers that rely on the frequency of concepts, as well as domain-independent and baseline methods, according to the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. Moreover, the results suggest that using the meaningfulness measure and considering the correlations of concepts in the feature selection step lead to a significant increase in summarization performance. |
Tasks | Feature Selection, Text Summarization |
Published | 2016-05-10 |
URL | http://arxiv.org/abs/1605.02948v3 |
http://arxiv.org/pdf/1605.02948v3.pdf | |
PWC | https://paperswithcode.com/paper/different-approaches-for-identifying |
Repo | |
Framework | |
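The summarizer maps sentences to UMLS concepts, selects informative concepts as features, and then scores sentences with a Bayesian model. The sketch below is a heavily simplified stand-in: concept extraction is assumed to be done already, feature selection is approximated by raw frequency, and the scoring rule is a plain coverage ratio rather than the paper's Bayesian classifier.

```python
from collections import Counter

def select_concepts(sentence_concepts, k=2):
    """Keep the k most frequent concepts as features (the paper studies
    richer selection measures, e.g. meaningfulness and correlations)."""
    counts = Counter(c for sent in sentence_concepts for c in sent)
    return {c for c, _ in counts.most_common(k)}

def sentence_score(concepts, selected):
    """Toy score: fraction of selected feature concepts the sentence covers."""
    return len(concepts & selected) / max(len(selected), 1)

# Hypothetical UMLS concept IDs per sentence.
sents = [{"C0027051", "C0018787"}, {"C0018787", "C0038454"}, {"C0038454"}]
selected = select_concepts(sents)
ranked = sorted(sents, key=lambda s: sentence_score(s, selected), reverse=True)
print(ranked[0])
```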
A non-extensive entropy feature and its application to texture classification
Title | A non-extensive entropy feature and its application to texture classification |
Authors | Seba Susan, Madasu Hanmandlu |
Abstract | This paper proposes a new probabilistic non-extensive entropy feature for texture characterization, based on a Gaussian information measure. The highlights of the new entropy are that it is bounded by finite limits and that it is non-additive in nature. The non-additive property of the proposed entropy makes it useful for representing the information content of non-extensive systems containing some degree of regularity or correlation. The effectiveness of the proposed entropy in representing correlated random variables is demonstrated by applying it to the texture classification problem, since textures found in nature are random and at the same time contain some degree of correlation or regularity at some scale. The gray level co-occurrence probabilities (GLCP) are used for computing the entropy function. The experimental results indicate a high degree of classification accuracy. The performance of the new entropy function is found to be superior to other forms of entropy, such as the Shannon, Renyi, Tsallis, and Pal and Pal entropies. Using feature-based polar interaction maps (FBIM), the proposed entropy is shown to be the best of the compared measures for representing correlated textures. |
Tasks | Texture Classification |
Published | 2016-03-08 |
URL | http://arxiv.org/abs/1603.02466v1 |
http://arxiv.org/pdf/1603.02466v1.pdf | |
PWC | https://paperswithcode.com/paper/a-non-extensive-entropy-feature-and-its |
Repo | |
Framework | |
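The entropy feature is computed from gray-level co-occurrence probabilities (GLCP). The sketch below builds the co-occurrence probabilities and evaluates a non-extensive entropy on them; because the paper's Gaussian-measure entropy is not reproduced here, the Tsallis entropy is used purely as a stand-in example of a non-additive measure.

```python
import numpy as np

def glcp(image, levels=8, dx=1, dy=0):
    """Gray-level co-occurrence probabilities for one pixel displacement."""
    img = (image * (levels - 1)).astype(int)
    counts = np.zeros((levels, levels))
    h, w = img.shape
    for i in range(h - dy):
        for j in range(w - dx):
            counts[img[i, j], img[i + dy, j + dx]] += 1
    return counts / counts.sum()

def tsallis_entropy(p, q=2.0):
    """Stand-in non-extensive (non-additive) entropy; q tunes non-extensivity."""
    p = p[p > 0]
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

rng = np.random.default_rng(0)
texture = rng.random((32, 32))     # toy texture with values in [0, 1)
print(round(tsallis_entropy(glcp(texture)), 4))
```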
Quantification of Ultrasonic Texture heterogeneity via Volumetric Stochastic Modeling for Tissue Characterization
Title | Quantification of Ultrasonic Texture heterogeneity via Volumetric Stochastic Modeling for Tissue Characterization |
Authors | O. S. Al-Kadi, Daniel Y. F. Chung, Robert C. Carlisle, Constantin C. Coussios, J. Alison Noble |
Abstract | Intensity variations in image texture can provide powerful quantitative information about physical properties of biological tissue. However, tissue patterns can vary according to the utilized imaging system and are intrinsically correlated to the scale of analysis. In the case of ultrasound, the Nakagami distribution is a general model of the ultrasonic backscattering envelope under various scattering conditions and densities, where it can be employed for characterizing image texture; however, the subtle intra-heterogeneities within a given mass are difficult to capture via this model, as it works at a single spatial scale. This paper proposes a locally adaptive 3D multi-resolution Nakagami-based fractal feature descriptor that extends Nakagami-based texture analysis to accommodate subtle speckle spatial frequency tissue intensity variability in volumetric scans. Local textural fractal descriptors - which are invariant to affine intensity changes - are extracted from volumetric patches at different spatial resolutions from voxel lattice-based generated shape and scale Nakagami parameters. Using ultrasound radio-frequency datasets, we found that after applying an adaptive fractal decomposition label transfer approach on top of the generated Nakagami voxels, tissue characterization results were superior to the state of the art. Experimental results on real 3D ultrasonic pre-clinical and clinical datasets suggest that describing tumor intra-heterogeneity via this descriptor may facilitate improved prediction of therapy response and disease characterization. |
Tasks | Texture Classification |
Published | 2016-01-14 |
URL | http://arxiv.org/abs/1601.03531v1 |
http://arxiv.org/pdf/1601.03531v1.pdf | |
PWC | https://paperswithcode.com/paper/quantification-of-ultrasonic-texture |
Repo | |
Framework | |
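The descriptor starts from voxel-wise Nakagami shape and scale parameters estimated from the backscattered envelope. A small sketch of the standard moments-based Nakagami estimates for a single patch is given below; the multi-resolution lattice, the fractal features, and the label-transfer step are omitted.

```python
import numpy as np

def nakagami_params(envelope_patch):
    """Method-of-moments Nakagami estimates from envelope samples.

    scale (omega): E[r^2]
    shape (m):     E[r^2]^2 / Var[r^2]
    """
    r2 = envelope_patch.astype(float) ** 2
    omega = r2.mean()
    m = omega ** 2 / r2.var()
    return m, omega

rng = np.random.default_rng(0)
patch = np.abs(rng.normal(size=(8, 8, 8)))   # toy 3D envelope patch
m, omega = nakagami_params(patch)
print(f"shape m={m:.3f}, scale omega={omega:.3f}")
```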
Cross-Domain Face Verification: Matching ID Document and Self-Portrait Photographs
Title | Cross-Domain Face Verification: Matching ID Document and Self-Portrait Photographs |
Authors | Guilherme Folego, Marcus A. Angeloni, José Augusto Stuchi, Alan Godoy, Anderson Rocha |
Abstract | Cross-domain biometrics has been emerging as a new necessity, which poses several additional challenges, including harsh illumination changes, noise, pose variation, among others. In this paper, we explore approaches to cross-domain face verification, comparing self-portrait photographs (“selfies”) to ID documents. We approach the problem with proper image photometric adjustment and data standardization techniques, along with deep learning methods to extract the most prominent features from the data, reducing the effects of domain shift in this problem. We validate the methods using a novel dataset comprising 50 individuals. The obtained results are promising and indicate that the adopted path is worth further investigation. |
Tasks | Face Verification |
Published | 2016-11-17 |
URL | http://arxiv.org/abs/1611.05755v1 |
http://arxiv.org/pdf/1611.05755v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-domain-face-verification-matching-id |
Repo | |
Framework | |
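The verification pipeline photometrically adjusts the images, extracts deep features, and compares the selfie to the ID-document photo. The sketch below covers only the comparison stage, using cosine similarity over hypothetical embeddings; the embedding network, the preprocessing, and the threshold are placeholders.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(selfie_emb, document_emb, threshold=0.5):
    """Declare a match when the deep-feature similarity clears the threshold.

    Both embeddings are assumed to come from the same (hypothetical) CNN
    applied after photometric adjustment of the input photographs.
    """
    return cosine_similarity(selfie_emb, document_emb) >= threshold

rng = np.random.default_rng(0)
selfie, document = rng.normal(size=128), rng.normal(size=128)
print(verify(selfie, document))
```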
ExtremeWeather: A large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events
Title | ExtremeWeather: A large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events |
Authors | Evan Racah, Christopher Beckham, Tegan Maharaj, Samira Ebrahimi Kahou, Prabhat, Christopher Pal |
Abstract | The detection and identification of extreme weather events in large-scale climate simulations is an important problem for risk management, informing governmental policy decisions and advancing our basic understanding of the climate system. Recent work has shown that fully supervised convolutional neural networks (CNNs) can yield acceptable accuracy for classifying well-known types of extreme weather events when large amounts of labeled data are available. However, many different types of spatially localized climate patterns are of interest, including hurricanes, extra-tropical cyclones, weather fronts, and blocking events, among others. Existing labeled data for these patterns can be incomplete in various ways, such as covering only certain years or geographic areas and having false negatives. This type of climate data therefore poses a number of interesting machine learning challenges. We present a multichannel spatiotemporal CNN architecture for semi-supervised bounding box prediction and exploratory data analysis. We demonstrate that our approach is able to leverage temporal information and unlabeled data to improve the localization of extreme weather events. Further, we explore the representations learned by our model in order to better understand this important data. We present a dataset, ExtremeWeather, to encourage machine learning research in this area and to help facilitate further work in understanding and mitigating the effects of climate change. The dataset is available at extremeweatherdataset.github.io and the code is available at https://github.com/eracah/hur-detect. |
Tasks | |
Published | 2016-12-07 |
URL | http://arxiv.org/abs/1612.02095v2 |
http://arxiv.org/pdf/1612.02095v2.pdf | |
PWC | https://paperswithcode.com/paper/extremeweather-a-large-scale-climate-dataset |
Repo | |
Framework | |
Interferences in match kernels
Title | Interferences in match kernels |
Authors | Naila Murray, Hervé Jégou, Florent Perronnin, Andrew Zisserman |
Abstract | We consider the design of an image representation that embeds and aggregates a set of local descriptors into a single vector. Popular representations of this kind include the bag-of-visual-words, the Fisher vector and the VLAD. When two such image representations are compared with the dot-product, the image-to-image similarity can be interpreted as a match kernel. In match kernels, one has to deal with interference, i.e. with the fact that even if two descriptors are unrelated, their matching score may contribute to the overall similarity. We formalise this problem and propose two related solutions, both aimed at equalising the individual contributions of the local descriptors in the final representation. These methods modify the aggregation stage by including a set of per-descriptor weights. They differ by the objective function that is optimised to compute those weights. The first is a “democratisation” strategy that aims at equalising the relative importance of each descriptor in the set comparison metric. The second one involves equalising the match of a single descriptor to the aggregated vector. These concurrent methods give a substantial performance boost over the state of the art in image search with short or mid-size vectors, as demonstrated by our experiments on standard public image retrieval benchmarks. |
Tasks | Image Retrieval |
Published | 2016-11-24 |
URL | http://arxiv.org/abs/1611.08194v1 |
http://arxiv.org/pdf/1611.08194v1.pdf | |
PWC | https://paperswithcode.com/paper/interferences-in-match-kernels |
Repo | |
Framework | |
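Both solutions insert per-descriptor weights into the aggregation stage so that each local descriptor contributes more evenly to the final vector. The sketch below uses an illustrative fixed-point reweighting in that spirit; the update rule is an assumption and not the paper's democratisation optimisation.

```python
import numpy as np

def weighted_aggregate(X, n_iter=10):
    """Aggregate descriptors X (n x d) with weights that damp descriptors
    whose match to the current aggregate is disproportionately strong."""
    w = np.ones(len(X))
    for _ in range(n_iter):
        agg = (w[:, None] * X).sum(axis=0)
        match = X @ agg                      # each descriptor's match to the aggregate
        w *= 1.0 / np.sqrt(np.maximum(match, 1e-8))
        w /= w.sum()
    return (w[:, None] * X).sum(axis=0), w

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 16))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # l2-normalised local descriptors
vec, weights = weighted_aggregate(X)
print(weights.round(3))
```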
Deviant Learning Algorithm: Learning Sparse Mismatch Representations through Time and Space
Title | Deviant Learning Algorithm: Learning Sparse Mismatch Representations through Time and Space |
Authors | Emmanuel Ndidi Osegi, Vincent Ike Anireh |
Abstract | Predictive coding (PDC) has recently attracted attention in the neuroscience and computing community as a candidate unifying paradigm for neuronal studies and artificial neural network implementations, particularly targeted at unsupervised learning systems. The Mismatch Negativity (MMN) has also recently been studied in relation to PDC and found to be a useful ingredient in neural predictive coding systems. Backed by the behavior of living organisms, such networks are particularly useful in forming spatio-temporal transitions and invariant representations of the input world. However, most neural systems still do not account for a large number of synapses, even though a few machine learning researchers have shown this to be an effective and very important component of any neural system if such a system is to behave properly. Our major point here is that PDC systems with the MMN effect, in addition to a large number of synapses, can greatly improve any neural learning system’s performance and ability to make decisions in the machine world. In this paper, we propose a novel bio-mimetic computational intelligence algorithm – the Deviant Learning Algorithm, inspired by these key ideas and functional properties of recent brain-cognitive discoveries and theories. We also show, through numerical experiments guided by theoretical insights, how our bio-mimetic algorithm can achieve competitive predictions even with very small problem-specific data. |
Tasks | |
Published | 2016-09-06 |
URL | http://arxiv.org/abs/1609.01459v6 |
http://arxiv.org/pdf/1609.01459v6.pdf | |
PWC | https://paperswithcode.com/paper/deviant-learning-algorithm-learning-sparse |
Repo | |
Framework | |
Fast nonlinear embeddings via structured matrices
Title | Fast nonlinear embeddings via structured matrices |
Authors | Krzysztof Choromanski, Francois Fagan |
Abstract | We present a new paradigm for speeding up randomized computations of several frequently used functions in machine learning. In particular, our paradigm can be applied to improve computations of kernels based on random embeddings. Beyond that, the presented framework covers multivariate randomized functions. As a byproduct, we propose an algorithmic approach that also leads to a significant reduction in space complexity. Our method is based on carefully recycling Gaussian vectors into structured matrices that share properties of fully random matrices. The quality of the proposed structured approach follows from combinatorial properties of the graphs encoding correlations between rows of these structured matrices. Our framework covers as special cases already-known structured approaches such as the Fast Johnson-Lindenstrauss Transform, but is much more general since it can also be applied to highly nonlinear embeddings. We provide strong concentration results showing the quality of the presented paradigm. |
Tasks | |
Published | 2016-04-25 |
URL | http://arxiv.org/abs/1604.07356v1 |
http://arxiv.org/pdf/1604.07356v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-nonlinear-embeddings-via-structured |
Repo | |
Framework | |
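The approach recycles a single Gaussian vector into a structured matrix that behaves like a fully random one, reducing time and space. The sketch below uses a circulant matrix with random sign flipping as one standard structured construction, applied as a Johnson-Lindenstrauss-style projection followed by a nonlinearity; treating this as representative of the paper's family of constructions is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64
g = rng.normal(size=d)                           # one Gaussian vector, recycled below
C = np.stack([np.roll(g, i) for i in range(d)])  # circulant matrix built from g
D = np.diag(rng.choice([-1.0, 1.0], size=d))     # random sign flips decorrelate rows
M = C @ D                                        # structured stand-in for a dense Gaussian matrix

x = rng.normal(size=d)
projection = M @ x / np.sqrt(d)    # linear structured embedding
features = np.cos(projection)      # a nonlinearity, as in random-feature kernel maps
print(features[:5].round(3))
```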
Automatic Detection and Decoding of Photogrammetric Coded Targets
Title | Automatic Detection and Decoding of Photogrammetric Coded Targets |
Authors | Udaya Wijenayake, Sung-In Choi, Soon-Yong Park |
Abstract | Close-range photogrammetry is widely used in many industries because of the cost-effectiveness and efficiency of the technique. In this research, we introduce an automated coded target detection method that can be used to enhance the efficiency of photogrammetry. |
Tasks | |
Published | 2016-01-04 |
URL | http://arxiv.org/abs/1601.00396v1 |
http://arxiv.org/pdf/1601.00396v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-detection-and-decoding-of |
Repo | |
Framework | |
When to Reset Your Keys: Optimal Timing of Security Updates via Learning
Title | When to Reset Your Keys: Optimal Timing of Security Updates via Learning |
Authors | Zizhan Zheng, Ness B. Shroff, Prasant Mohapatra |
Abstract | Cybersecurity is increasingly threatened by advanced and persistent attacks. As these attacks are often designed to disable a system (or a critical resource, e.g., a user account) repeatedly, it is crucial for the defender to keep updating its security measures to strike a balance between the risk of being compromised and the cost of security updates. Moreover, these decisions often need to be made with limited and delayed feedback due to the stealthy nature of advanced attacks. In addition to targeted attacks, such an optimal timing policy under incomplete information has broad applications in cybersecurity. Examples include key rotation, password change, application of patches, and virtual machine refreshing. However, rigorous studies of optimal timing are rare. Further, existing solutions typically rely on a pre-defined attack model that is known to the defender, which is often not the case in practice. In this work, we make an initial effort towards achieving optimal timing of security updates in the face of unknown stealthy attacks. We consider a variant of the influential FlipIt game model with asymmetric feedback and unknown attack time distribution, which provides a general model for consecutive security updates. The defender’s problem is then modeled as a time-associative bandit problem with dependent arms. We derive upper confidence bound based learning policies that achieve low regret compared with optimal periodic defense strategies that can only be derived when attack time distributions are known. |
Tasks | |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00108v2 |
http://arxiv.org/pdf/1612.00108v2.pdf | |
PWC | https://paperswithcode.com/paper/when-to-reset-your-keys-optimal-timing-of |
Repo | |
Framework | |
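The timing problem is cast as a bandit over update policies and addressed with upper-confidence-bound learning. The sketch below runs plain UCB1 over a few candidate update periods with a toy reward model; the candidate periods, the reward function, and the use of vanilla UCB1 instead of the paper's time-associative variant are all simplifying assumptions.

```python
import math
import random

random.seed(0)
periods = [1.0, 2.0, 4.0, 8.0]          # hypothetical candidate update intervals
counts = [0] * len(periods)
totals = [0.0] * len(periods)

def simulated_reward(period):
    """Toy stand-in: value of staying safe minus the cost of frequent updates."""
    attack_time = random.expovariate(0.5)       # unknown attack-time distribution
    safe_fraction = min(period, attack_time) / period
    update_cost = 0.2 / period
    return safe_fraction - update_cost

for t in range(1, 1001):
    if t <= len(periods):
        arm = t - 1                              # play each arm once first
    else:
        arm = max(range(len(periods)),
                  key=lambda i: totals[i] / counts[i]
                  + math.sqrt(2 * math.log(t) / counts[i]))
    counts[arm] += 1
    totals[arm] += simulated_reward(periods[arm])

best = max(range(len(periods)), key=lambda i: counts[i])
print("most-played update period:", periods[best])
```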
Human-Algorithm Interaction Biases in the Big Data Cycle: A Markov Chain Iterated Learning Framework
Title | Human-Algorithm Interaction Biases in the Big Data Cycle: A Markov Chain Iterated Learning Framework |
Authors | Olfa Nasraoui, Patrick Shafto |
Abstract | Early supervised machine learning algorithms have relied on reliable expert labels to build predictive models. However, the gates of data generation have recently been opened to a wider base of users who started participating increasingly with casual labeling, rating, annotating, etc. The increased online presence and participation of humans have led not only to a democratization of unchecked inputs to algorithms, but also to a wide democratization of the “consumption” of machine learning algorithms’ outputs by general users. Hence, these algorithms, many of which are becoming essential building blocks of recommender systems and other information filters, started interacting with users at unprecedented rates. The result is machine learning algorithms that consume more and more data that is unchecked, or at the very least, does not fit the conventional assumptions made by various machine learning algorithms. These include biased samples, biased labels, diverging training and testing sets, and cyclical interaction between algorithms, humans, information consumed by humans, and data consumed by algorithms. Yet, the continuous interaction between humans and algorithms is rarely taken into account in machine learning algorithm design and analysis. In this paper, we present a preliminary theoretical model and analysis of the mutual interaction between humans and algorithms, based on an iterated learning framework inspired by the study of human language evolution. We also define the concepts of human and algorithm blind spots and outline machine learning approaches to mend iterated bias through two novel notions: antidotes and reactive learning. |
Tasks | Recommendation Systems |
Published | 2016-08-29 |
URL | http://arxiv.org/abs/1608.07895v1 |
http://arxiv.org/pdf/1608.07895v1.pdf | |
PWC | https://paperswithcode.com/paper/human-algorithm-interaction-biases-in-the-big |
Repo | |
Framework | |