Paper Group ANR 373
Elastic Solver: Balancing Solution Time and Energy Consumption. Minimum Conditional Description Length Estimation for Markov Random Fields. Learning a Tree-Structured Ising Model in Order to Make Predictions. Zero Shot Hashing. Machine Learning, Linear and Bayesian Models for Logistic Regression in Failure Detection Problems. A Novel Approach to Dr …
Elastic Solver: Balancing Solution Time and Energy Consumption
Title | Elastic Solver: Balancing Solution Time and Energy Consumption |
Authors | Barry Hurley, Deepak Mehta, Barry O’Sullivan |
Abstract | Combinatorial decision problems arise in many different domains such as scheduling, routing, packing, bioinformatics, and many more. Despite recent advances in developing scalable solvers, many problems remain very hard to solve. The most advanced solvers typically include elements which are stochastic in nature. If the same instance is solved many times with different seeds then, depending on the inherent characteristics of the problem instance and the solver, one can observe a highly variable distribution of runtimes spanning multiple orders of magnitude. Therefore, to solve a problem instance efficiently it is often useful to solve the same instance in parallel with different seeds. With the proliferation of cloud computing, it is natural to think of an elastic solver which can scale up by launching searches in parallel on thousands of machines (or cores). However, this can consume a great deal of energy, and not every instance requires thousands of machines. The challenge is to resolve the tradeoff between solution time and energy consumption optimally for a given problem instance. We analyse the impact of the number of machines (or cores) not only on solution time but also on energy consumption. We highlight that although solution time always drops as the number of machines increases, the relation between the number of machines and energy consumption is more complicated: in many cases the optimal energy consumption is achieved by a middle ground, and we analyse this relationship in detail. The tradeoff between solution time and energy consumption is studied further, showing that the energy consumption of a solver can be reduced drastically if the solution time is increased marginally. We also develop a prediction model, demonstrating that such insights can be exploited to achieve faster solution times in a more energy-efficient manner. (A toy simulation of this tradeoff follows this entry.) |
Tasks | |
Published | 2016-05-23 |
URL | http://arxiv.org/abs/1605.06940v1 |
http://arxiv.org/pdf/1605.06940v1.pdf | |
PWC | https://paperswithcode.com/paper/elastic-solver-balancing-solution-time-and |
Repo | |
Framework | |
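The time/energy tradeoff described above can be illustrated with a toy model (an assumption for illustration, not the authors' experimental setup): each randomized run draws a heavy-tailed runtime, the parallel solution time is the minimum over k runs, and all k machines stay busy until the first one finishes, so energy is roughly k times that minimum.

```python
# Toy simulation of racing k identical randomized solvers. The log-normal
# runtime model and the energy ~ k * time-to-first-solution rule are
# assumptions for illustration, not the paper's model.
import random
import statistics

def simulate(k, trials=5000, seed=0):
    """Return (mean solution time, mean energy) for a race of k machines."""
    rng = random.Random(seed)
    times, energies = [], []
    for _ in range(trials):
        runs = [rng.lognormvariate(0.0, 1.5) for _ in range(k)]
        t = min(runs)            # the first machine to finish wins
        times.append(t)
        energies.append(k * t)   # all k machines run until then
    return statistics.mean(times), statistics.mean(energies)

if __name__ == "__main__":
    for k in (1, 2, 4, 8, 16, 32, 64):
        t, e = simulate(k)
        print(f"k={k:3d}  mean time={t:7.3f}  mean energy={e:7.3f}")
```

Under this assumed model the mean time falls monotonically with k, while the mean energy bottoms out at a moderate k before rising again — the "middle ground" behaviour the abstract refers to.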
Minimum Conditional Description Length Estimation for Markov Random Fields
Title | Minimum Conditional Description Length Estimation for Markov Random Fields |
Authors | Matthew G. Reyes, David L. Neuhoff |
Abstract | In this paper we discuss a method, which we call Minimum Conditional Description Length (MCDL), for estimating the parameters of a subset of sites within a Markov random field. We assume that the edges are known for the entire graph $G=(V,E)$. Then, for a subset $U\subset V$, we estimate the parameters for nodes and edges in $U$, as well as for edges incident to a node in $U$, by finding the exponential parameter for that subset that yields the best compression conditioned on the values on the boundary $\partial U$. Our estimate is derived from a temporally stationary sequence of observations on the set $U$. We discuss how this method can also be applied to estimate a spatially invariant parameter from a single configuration, and in so doing, derive the Maximum Pseudo-Likelihood (MPL) estimate. (A formal sketch of the criterion follows this entry.) |
Tasks | |
Published | 2016-02-09 |
URL | http://arxiv.org/abs/1602.03061v2 |
http://arxiv.org/pdf/1602.03061v2.pdf | |
PWC | https://paperswithcode.com/paper/minimum-conditional-description-length |
Repo | |
Framework | |
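A plausible formalisation of the criterion, with notation assumed here rather than taken from the paper: given a stationary sequence of observations $x_U^{(1)},\dots,x_U^{(T)}$ on $U$ with boundary configurations $x_{\partial U}^{(t)}$, the MCDL estimate picks the exponential-family parameter that minimises the conditional code length,

$$
\hat{\theta}_U \;=\; \arg\min_{\theta} \sum_{t=1}^{T} -\log p_{\theta}\!\left(x_U^{(t)} \,\middle|\, x_{\partial U}^{(t)}\right),
$$

i.e. it maximises the conditional likelihood of the observed values given their boundary; applied site by site with a single configuration and a spatially invariant parameter, the same objective yields the Maximum Pseudo-Likelihood estimate mentioned in the abstract.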
Learning a Tree-Structured Ising Model in Order to Make Predictions
Title | Learning a Tree-Structured Ising Model in Order to Make Predictions |
Authors | Guy Bresler, Mina Karzand |
Abstract | We study the problem of learning a tree Ising model from samples such that subsequent predictions made using the model are accurate. The prediction task considered in this paper is that of predicting the values of a subset of variables given values of some other subset of variables. Virtually all previous work on graphical model learning has focused on recovering the true underlying graph. We define a distance (“small set TV” or ssTV) between distributions $P$ and $Q$ by taking the maximum, over all subsets $\mathcal{S}$ of a given size, of the total variation between the marginals of $P$ and $Q$ on $\mathcal{S}$; this distance captures the accuracy of the prediction task of interest. We derive non-asymptotic bounds on the number of samples needed to get a distribution (from the same class) with small ssTV relative to the one generating the samples. One of the main messages of this paper is that far fewer samples are needed than for recovering the underlying tree, which means that accurate predictions are possible using the wrong tree. (The ssTV definition is written out after this entry.) |
Tasks | |
Published | 2016-04-22 |
URL | http://arxiv.org/abs/1604.06749v3 |
http://arxiv.org/pdf/1604.06749v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-a-tree-structured-ising-model-in |
Repo | |
Framework | |
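The ssTV distance described in the abstract can be written out explicitly (notation assumed): for distributions $P$ and $Q$ over variables indexed by $V$ and a given set size $k$,

$$
\mathrm{ssTV}_k(P, Q) \;=\; \max_{\mathcal{S} \subseteq V,\; |\mathcal{S}| = k} \; \big\lVert P_{\mathcal{S}} - Q_{\mathcal{S}} \big\rVert_{\mathrm{TV}},
$$

where $P_{\mathcal{S}}$ and $Q_{\mathcal{S}}$ are the marginals on $\mathcal{S}$; per the abstract, this is the quantity that captures the accuracy of predicting one small subset of variables from another.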
Zero Shot Hashing
Title | Zero Shot Hashing |
Authors | Shubham Pachori, Shanmuganathan Raman |
Abstract | This paper provides a framework to hash images containing instances of unknown object classes. In many object recognition problems we may have access to a huge amount of data, yet even this data may not cover the object classes that we encounter in daily life. Zero-shot learning exploits auxiliary information (also called signatures) in order to predict the labels corresponding to unknown classes. In this work, we attempt to generate hash codes for images belonging to unseen classes, for which information is available only through a textual corpus. We formulate this as an unsupervised hashing problem, as exact labels are not available for instances of unseen classes. We show that the proposed solution is able to generate hash codes which can predict labels corresponding to unseen classes with appreciably good precision. (A generic hashing baseline is sketched after this entry.) |
Tasks | Object Recognition, Zero-Shot Learning |
Published | 2016-10-09 |
URL | http://arxiv.org/abs/1610.02651v1 |
http://arxiv.org/pdf/1610.02651v1.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-hashing |
Repo | |
Framework | |
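As a generic point of reference only — not the authors' signature-based formulation — unsupervised binary codes are often obtained by thresholding a projection of image features; the sketch below uses a random sign projection in the LSH style, with placeholder feature dimensions.

```python
# Generic LSH-style unsupervised hashing baseline (illustrative only; the
# paper's zero-shot, signature-driven objective is not reproduced here).
import numpy as np

def fit_hash(features, n_bits=64, seed=0):
    """Return a mean offset and a random projection for sign hashing."""
    rng = np.random.default_rng(seed)
    mean = features.mean(axis=0)
    proj = rng.standard_normal((features.shape[1], n_bits))
    return mean, proj

def hash_codes(features, mean, proj):
    """Binary codes: sign of the zero-centred, projected features."""
    return ((features - mean) @ proj > 0).astype(np.uint8)

if __name__ == "__main__":
    X = np.random.default_rng(1).standard_normal((100, 512))  # stand-in features
    mean, proj = fit_hash(X)
    print(hash_codes(X, mean, proj).shape)  # (100, 64)
```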
Machine Learning, Linear and Bayesian Models for Logistic Regression in Failure Detection Problems
Title | Machine Learning, Linear and Bayesian Models for Logistic Regression in Failure Detection Problems |
Authors | B. Pavlyshenko |
Abstract | In this work, we study the use of logistic regression for manufacturing failure detection. As the data set for the analysis, we used the data from the Kaggle competition Bosch Production Line Performance. We considered machine learning, linear, and Bayesian models. For the machine learning approach, we analyzed an XGBoost tree-based classifier to obtain a high classification score. Using a generalized linear model for logistic regression makes it possible to analyze the influence of the factors under study. The Bayesian approach to logistic regression gives the statistical distribution of the model parameters, which can be useful in probabilistic analysis, e.g. risk assessment. (A minimal modelling sketch follows this entry.) |
Tasks | |
Published | 2016-12-17 |
URL | http://arxiv.org/abs/1612.05740v1 |
http://arxiv.org/pdf/1612.05740v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-linear-and-bayesian-models |
Repo | |
Framework | |
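A minimal sketch of the three modelling routes named in the abstract, on synthetic stand-in data (the Bosch competition data are not bundled here; GradientBoostingClassifier stands in for XGBoost, and a simple coefficient bootstrap stands in for full Bayesian posterior inference).

```python
# Sketch of the three modelling routes on synthetic, imbalanced data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)  # rare "failure" class
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Tree-based classifier (stand-in for XGBoost).
gbm = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
print("GBM AUC:", roc_auc_score(y_te, gbm.predict_proba(X_te)[:, 1]))

# Generalized linear model: coefficients expose each factor's influence.
glm = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("GLM AUC:", roc_auc_score(y_te, glm.predict_proba(X_te)[:, 1]))

# Rough coefficient uncertainty via bootstrap resampling (Bayesian stand-in).
rng = np.random.default_rng(0)
coefs = []
for _ in range(50):
    idx = rng.integers(0, len(X_tr), len(X_tr))
    coefs.append(LogisticRegression(max_iter=1000).fit(X_tr[idx], y_tr[idx]).coef_[0])
print("coef std (first 5):", np.std(coefs, axis=0)[:5])
```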
A Novel Approach to Dropped Pronoun Translation
Title | A Novel Approach to Dropped Pronoun Translation |
Authors | Longyue Wang, Zhaopeng Tu, Xiaojun Zhang, Hang Li, Andy Way, Qun Liu |
Abstract | Dropped pronouns (DPs), which are frequently omitted in the source language but must be retained in the target language, are a challenge in machine translation. In response to this problem, we propose a semi-supervised approach to recall possibly missing pronouns in the translation. Firstly, we build training data for DP generation in which the DPs are automatically labelled according to the alignment information from a parallel corpus. Secondly, we build a deep learning-based DP generator for input sentences in decoding when no corresponding references exist. More specifically, the generation is two-phase: (1) DP position detection, which is modelled as a sequence labelling task with recurrent neural networks; and (2) DP prediction, which employs a multilayer perceptron with rich features. Finally, we integrate the above outputs into our translation system to recall missing pronouns, both by extracting rules from the DP-labelled training data and by translating the DP-generated input sentences. Experimental results show that our approach achieves a significant improvement of 1.58 BLEU points in translation performance, with a 66% F-score for DP generation accuracy. (A sketch of the alignment-based DP labelling follows this entry.) |
Tasks | Machine Translation |
Published | 2016-04-21 |
URL | http://arxiv.org/abs/1604.06285v1 |
http://arxiv.org/pdf/1604.06285v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-approach-to-dropped-pronoun |
Repo | |
Framework | |
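The first step in the abstract — automatically labelling dropped-pronoun positions from word alignments — might look roughly like this. It is a simplified illustration; the pronoun list, alignment format, and insertion heuristic are all assumptions, not the paper's exact procedure.

```python
# Illustrative labelling of dropped-pronoun (DP) positions from a word
# alignment: target pronouns with no aligned source token are treated as
# dropped, and the DP is attached to the source position of the nearest
# aligned target neighbour. All heuristics here are assumptions.
TARGET_PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}

def label_dropped_pronouns(src_tokens, tgt_tokens, alignment):
    """alignment: set of (src_idx, tgt_idx) pairs."""
    aligned_tgt = {t: s for s, t in alignment}
    labels = ["O"] * len(src_tokens)
    for t, word in enumerate(tgt_tokens):
        if word.lower() in TARGET_PRONOUNS and t not in aligned_tgt:
            # Reuse the source index of the closest aligned target token.
            neighbours = sorted(aligned_tgt, key=lambda x: abs(x - t))
            if neighbours:
                labels[aligned_tgt[neighbours[0]]] = "DP:" + word.lower()
    return labels

if __name__ == "__main__":
    src = ["喜欢", "这本", "书"]          # pro-drop source sentence
    tgt = ["I", "like", "this", "book"]
    align = {(0, 1), (1, 2), (2, 3)}      # "I" (index 0) is unaligned
    print(label_dropped_pronouns(src, tgt, align))  # ['DP:i', 'O', 'O']
```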
Person Re-Identification by Unsupervised Video Matching
Title | Person Re-Identification by Unsupervised Video Matching |
Authors | Xiaolong Ma, Xiatian Zhu, Shaogang Gong, Xudong Xie, Jianming Hu, Kin-Man Lam, Yisheng Zhong |
Abstract | Most existing person re-identification (ReID) methods rely only on spatial appearance information from one or multiple person images, whilst ignoring the space-time cues readily available in video or image-sequence data. Moreover, they often assume the availability of exhaustively labelled cross-view pairwise data for every camera pair, making them non-scalable to ReID applications in real-world large-scale camera networks. In this work, we introduce a novel video-based person ReID method capable of accurately matching people across views from arbitrary unaligned image sequences without any labelled pairwise data. Specifically, we introduce a new space-time person representation by encoding multiple granularities of spatio-temporal dynamics in the form of time series. Moreover, a Time Shift Dynamic Time Warping (TS-DTW) model is derived for performing automatic alignment whilst achieving data selection and matching between inherently inaccurate and incomplete sequences in a unified way. We further extend the TS-DTW model to accommodate multiple feature sequences of an image sequence in order to fuse information from different descriptions. Crucially, this model does not require pairwise labelled training data (i.e. it is unsupervised) and is therefore readily scalable to large-scale camera networks with arbitrary camera pairs, without the need for exhaustive data annotation for every camera pair. We show the effectiveness and advantages of the proposed method by extensive comparisons with related state-of-the-art approaches on two benchmark ReID datasets, PRID2011 and iLIDS-VID. (A plain-DTW reference sketch follows this entry.) |
Tasks | Person Re-Identification, Time Series |
Published | 2016-11-25 |
URL | http://arxiv.org/abs/1611.08512v2 |
http://arxiv.org/pdf/1611.08512v2.pdf | |
PWC | https://paperswithcode.com/paper/person-re-identification-by-unsupervised |
Repo | |
Framework | |
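For reference, the classical dynamic time warping recursion that TS-DTW builds on; this sketch is plain DTW between two feature sequences, not the paper's time-shift or multi-feature extensions.

```python
# Plain dynamic time warping between two feature sequences, as a point of
# reference for the TS-DTW model (the time-shift and multi-feature
# extensions from the paper are not reproduced here).
import numpy as np

def dtw_distance(seq_a, seq_b):
    """seq_a: (m, d) array, seq_b: (n, d) array -> DTW alignment cost."""
    m, n = len(seq_a), len(seq_b)
    cost = np.full((m + 1, n + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[m, n]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((40, 8))   # per-frame descriptors, camera A
    b = rng.standard_normal((55, 8))   # unaligned sequence, camera B
    print(dtw_distance(a, b))
```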
A Bi-LSTM-RNN Model for Relation Classification Using Low-Cost Sequence Features
Title | A Bi-LSTM-RNN Model for Relation Classification Using Low-Cost Sequence Features |
Authors | Fei Li, Meishan Zhang, Guohong Fu, Tao Qian, Donghong Ji |
Abstract | Relation classification is associated with many potential applications in artificial intelligence. Recent approaches usually leverage neural networks based on structure features, such as syntactic or dependency features, to solve this problem. However, the high cost of obtaining structure features makes such approaches inconvenient to use directly; in addition, structure features are probably domain-dependent. Therefore, this paper proposes a bi-directional long short-term memory recurrent neural network (Bi-LSTM-RNN) model based on low-cost sequence features to address relation classification. The model divides a sentence or text segment into five parts, namely two target entities and their three contexts; it learns representations of the entities and their contexts, and uses them to classify relations. We evaluate our model on two standard benchmark datasets in different domains, namely SemEval-2010 Task 8 and BioNLP-ST 2016 Task BB3. On the former dataset, our model achieves performance comparable to other models using sequence features; on the latter, it obtains the third-best results in the official evaluation. Moreover, we find that the context between the two target entities plays the most important role in relation classification, and statistical experiments show that this context can be used as an approximate replacement for the shortest dependency path when dependency parsing is not used. (A sketch of the five-part segmentation follows this entry.) |
Tasks | Dependency Parsing, Relation Classification |
Published | 2016-08-27 |
URL | http://arxiv.org/abs/1608.07720v1 |
http://arxiv.org/pdf/1608.07720v1.pdf | |
PWC | https://paperswithcode.com/paper/a-bi-lstm-rnn-model-for-relation |
Repo | |
Framework | |
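The five-part segmentation described above is easy to reproduce; the sketch assumes entity spans are given as half-open token index ranges, with the first entity preceding the second.

```python
# Split a tokenized sentence into the five segments used as model input:
# before / entity1 / between / entity2 / after (entity spans assumed to be
# half-open token index ranges, entity1 preceding entity2).
def five_segments(tokens, e1_span, e2_span):
    (s1, t1), (s2, t2) = e1_span, e2_span
    return {
        "before":  tokens[:s1],
        "entity1": tokens[s1:t1],
        "between": tokens[t1:s2],
        "entity2": tokens[s2:t2],
        "after":   tokens[t2:],
    }

if __name__ == "__main__":
    sent = "The burst has been caused by water hammer pressure".split()
    print(five_segments(sent, (1, 2), (8, 9)))
```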
Social Media Argumentation Mining: The Quest for Deliberateness in Raucousness
Title | Social Media Argumentation Mining: The Quest for Deliberateness in Raucousness |
Authors | Jan Šnajder |
Abstract | Argumentation mining from social media content has attracted increasing attention. The task is both challenging and rewarding. The informal nature of user-generated content makes the task dauntingly difficult. On the other hand, the insights that could be gained by a large-scale analysis of social media argumentation make it a very worthwhile task. In this position paper I discuss the motivation for social media argumentation mining, as well as the tasks and challenges involved. |
Tasks | |
Published | 2016-12-31 |
URL | http://arxiv.org/abs/1701.00168v1 |
http://arxiv.org/pdf/1701.00168v1.pdf | |
PWC | https://paperswithcode.com/paper/social-media-argumentation-mining-the-quest |
Repo | |
Framework | |
INSIGHT-1 at SemEval-2016 Task 5: Deep Learning for Multilingual Aspect-based Sentiment Analysis
Title | INSIGHT-1 at SemEval-2016 Task 5: Deep Learning for Multilingual Aspect-based Sentiment Analysis |
Authors | Sebastian Ruder, Parsa Ghaffari, John G. Breslin |
Abstract | This paper describes our deep learning-based approach to multilingual aspect-based sentiment analysis as part of SemEval 2016 Task 5. We use a convolutional neural network (CNN) for both aspect extraction and aspect-based sentiment analysis. We cast aspect extraction as a multi-label classification problem, outputting probabilities over aspects parameterized by a threshold. To determine the sentiment towards an aspect, we concatenate an aspect vector with every word embedding and apply a convolution over it. Our constrained system (unconstrained for English) achieves competitive results across all languages and domains, placing first or second in 5 and 7 out of 11 language-domain pairs for aspect category detection (slot 1) and sentiment polarity (slot 3) respectively, thereby demonstrating the viability of a deep learning-based approach for multilingual aspect-based sentiment analysis. (A sketch of the thresholded multi-label decision follows this entry.) |
Tasks | Aspect-Based Sentiment Analysis, Aspect Extraction, Multi-Label Classification, Sentiment Analysis |
Published | 2016-09-09 |
URL | http://arxiv.org/abs/1609.02748v2 |
http://arxiv.org/pdf/1609.02748v2.pdf | |
PWC | https://paperswithcode.com/paper/insight-1-at-semeval-2016-task-5-deep |
Repo | |
Framework | |
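The thresholded multi-label decision described in the abstract — per-aspect probabilities cut at a threshold — reduces to something like the following; the aspect names and the 0.3 threshold are placeholders, and the CNN that produces the probabilities is not shown.

```python
# Multi-label aspect decision from per-aspect sigmoid outputs, cut at a
# threshold (aspect names and threshold value are placeholders).
import numpy as np

ASPECTS = ["FOOD#QUALITY", "SERVICE#GENERAL", "AMBIENCE#GENERAL"]

def predict_aspects(probs, threshold=0.3):
    """probs: array of per-aspect probabilities, one per entry in ASPECTS."""
    return [a for a, p in zip(ASPECTS, probs) if p >= threshold]

if __name__ == "__main__":
    print(predict_aspects(np.array([0.82, 0.10, 0.44])))
    # -> ['FOOD#QUALITY', 'AMBIENCE#GENERAL']
```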
Efficient Estimation of k for the Nearest Neighbors Class of Methods
Title | Efficient Estimation of k for the Nearest Neighbors Class of Methods |
Authors | Aleksander Lodwich, Faisal Shafait, Thomas Breuel |
Abstract | The k Nearest Neighbors (kNN) method has received much attention in past decades: theoretical bounds on its performance have been identified, and practical optimizations have been proposed to make it work fairly well in high-dimensional spaces and on large datasets. From countless past experiments it has become widely accepted that the value of k has a significant impact on the performance of this method. However, the efficient optimization of this parameter has received far less attention in the literature. Today, the most common approach is to cross-validate or bootstrap over all candidate values of k. This forces distances to be recomputed many times, even if efficient methods are used, so estimating the optimal k can become expensive even on modern systems; frequently, this leads to a sparse manual search over k. In this paper we point out that a systematic and thorough estimation of the parameter k can be performed efficiently. The discussed approach relies on large matrices, but we argue that in practice a higher space complexity is often much less of a problem than repetitive distance computations. (A sketch of this single-distance-matrix approach follows this entry.) |
Tasks | |
Published | 2016-06-08 |
URL | http://arxiv.org/abs/1606.02617v2 |
http://arxiv.org/pdf/1606.02617v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-estimation-of-k-for-the-nearest |
Repo | |
Framework | |
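The core idea — compute and sort the distance matrix once, then score every candidate k from the same sorted neighbour lists — can be sketched as follows (a leave-one-out illustration, not the authors' exact procedure).

```python
# Evaluate every candidate k from a single distance computation: the
# pairwise distances are computed and argsorted once, then leave-one-out
# accuracy is read off for each k.
import numpy as np

def loo_accuracy_for_all_k(X, y, max_k):
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)          # exclude the point itself
    order = np.argsort(dists, axis=1)        # sorted neighbours, computed once
    accs = {}
    for k in range(1, max_k + 1):
        votes = y[order[:, :k]]              # labels of the k nearest
        pred = np.array([np.bincount(v).argmax() for v in votes])
        accs[k] = float(np.mean(pred == y))
    return accs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(2, 1, (100, 5))])
    y = np.array([0] * 100 + [1] * 100)
    for k, acc in loo_accuracy_for_all_k(X, y, 15).items():
        print(f"k={k:2d}  LOO accuracy={acc:.3f}")
```

The memory cost is the full n-by-n distance matrix, which is exactly the space-versus-recomputation tradeoff the abstract argues in favour of.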
Characterization of an inconsistency ranking for pairwise comparison matrices
Title | Characterization of an inconsistency ranking for pairwise comparison matrices |
Authors | László Csató |
Abstract | Pairwise comparisons between alternatives are a well-known method for measuring the preferences of a decision-maker. Since these often do not exhibit consistency, a number of inconsistency indices have been introduced in order to measure the deviation from this ideal case. We axiomatically characterize the inconsistency ranking induced by the Koczkodaj inconsistency index: six independent properties are presented such that they determine a unique linear order on the set of all pairwise comparison matrices. (A sketch of the Koczkodaj index computation follows this entry.) |
Tasks | |
Published | 2016-10-24 |
URL | http://arxiv.org/abs/1610.07388v3 |
http://arxiv.org/pdf/1610.07388v3.pdf | |
PWC | https://paperswithcode.com/paper/characterization-of-an-inconsistency-ranking |
Repo | |
Framework | |
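The Koczkodaj index that induces the characterized ranking is commonly computed triad by triad; the sketch below follows the standard definition from the literature (not quoted from this paper): a positive reciprocal matrix is consistent exactly when $a_{ij} = a_{ik}a_{kj}$ for every triad, and the index records the worst relative violation.

```python
# Triad-wise computation of the Koczkodaj inconsistency index for a
# positive reciprocal pairwise comparison matrix (definition taken from
# the standard literature, not quoted from this paper).
from itertools import combinations
import numpy as np

def koczkodaj_index(A):
    worst = 0.0
    for i, j, k in combinations(range(len(A)), 3):
        ratio = A[i, j] / (A[i, k] * A[k, j])
        worst = max(worst, min(abs(1 - ratio), abs(1 - 1 / ratio)))
    return worst

if __name__ == "__main__":
    A = np.array([[1.0,  2.0,   4.0],
                  [0.5,  1.0,   3.0],
                  [0.25, 1 / 3, 1.0]])
    print(koczkodaj_index(A))   # 0 only for a fully consistent matrix
```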
Enhanced high dynamic range 3D shape measurement based on generalized phase-shifting algorithm
Title | Enhanced high dynamic range 3D shape measurement based on generalized phase-shifting algorithm |
Authors | Minmin Wang, Guangliang Du, Canlin Zhou, Chaorui Zhang, Shuchun Si, Hui Li, Zhenkun Lei, YanJie Li |
Abstract | It is a challenge for Phase Measurement Profilometry (PMP) to measure objects with a large range of reflectivity variation across the surface. Saturated or dark pixels in the deformed fringe patterns captured by the camera lead to phase fluctuations and errors. Jiang et al. proposed a high dynamic range real-time 3D shape measurement method that does not change camera exposures: three inverted phase-shifted fringe patterns are used to complement three regular phase-shifted fringe patterns for phase retrieval when any of the regular fringe patterns are saturated. However, Jiang’s method still has drawbacks: (1) the phases at saturated pixels are estimated by different formulas for different cases, and a universal formula is lacking; (2) it cannot be extended to the four-step phase-shifting algorithm because the inverted fringe patterns are repetitions of the regular ones; and (3) only three unsaturated intensity values at each pixel are chosen for phase demodulation, leaving the other unsaturated values idle. We propose a method for enhanced high dynamic range 3D shape measurement based on a generalized phase-shifting algorithm, which combines the complementary technique of inverted and regular fringe patterns with generalized phase shifting. Firstly, two complementary sets of phase-shifted fringe patterns, namely regular and inverted fringe patterns, are projected and collected. Then all unsaturated intensity values at the same camera pixel from the two sets of fringe patterns are selected and employed to retrieve the phase by the generalized phase-shifting algorithm. Finally, simulations and experiments are conducted to prove the validity of the proposed method. The results are analyzed and compared with Jiang’s method, demonstrating that the proposed method not only expands the scope of Jiang’s method but also improves the measurement accuracy. (The least-squares form of generalized phase shifting is given after this entry.) |
Tasks | |
Published | 2016-06-07 |
URL | http://arxiv.org/abs/1606.02288v1 |
http://arxiv.org/pdf/1606.02288v1.pdf | |
PWC | https://paperswithcode.com/paper/enhanced-high-dynamic-range-3d-shape |
Repo | |
Framework | |
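For orientation, a common least-squares form of generalized phase shifting (a standard formulation, not quoted from the paper): each unsaturated sample at a pixel with phase shift $\delta_n$ satisfies

$$
I_n \;=\; a_0 + a_1\cos\delta_n + a_2\sin\delta_n,
\qquad a_1 = B\cos\varphi,\;\; a_2 = -B\sin\varphi,
$$

so any three or more unsaturated samples can be pooled into one linear least-squares fit for $(a_0, a_1, a_2)$, after which $\varphi = \operatorname{atan2}(-a_2,\, a_1)$; being able to use all unsaturated samples at a pixel in a single fit is the flexibility the proposed method exploits.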
A Stochastic Approach to STDP
Title | A Stochastic Approach to STDP |
Authors | Runchun Wang, Chetan Singh Thakur, Tara Julia Hamilton, Jonathan Tapson, André van Schaik |
Abstract | We present a digital implementation of the Spike Timing Dependent Plasticity (STDP) learning rule. The proposed implementation consists of an exponential decay generator array and an STDP adaptor array. On the arrival of a pre- or post-synaptic spike, the STDP adaptor sends a digital spike to the decay generator; the decay generator then produces an exponential decay, which the STDP adaptor uses to perform the weight adaptation. The exponential decay, which is computationally expensive, is implemented efficiently using a novel stochastic approach, which we analyse and characterise here. We use a time-multiplexing approach to achieve 8192 (8k) virtual STDP adaptors and decay generators with only one physical implementation of each. We have validated our stochastic STDP approach with measurement results from a balanced excitation/inhibition experiment. Our stochastic approach is ideal for implementing the STDP learning rule in large-scale spiking neural networks running in real time. (The standard STDP window is written out after this entry.) |
Tasks | |
Published | 2016-03-13 |
URL | http://arxiv.org/abs/1603.04080v1 |
http://arxiv.org/pdf/1603.04080v1.pdf | |
PWC | https://paperswithcode.com/paper/a-stochastic-approach-to-stdp |
Repo | |
Framework | |
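For context, the exponential STDP window that such decay generators typically realise (the standard pair-based rule; the paper's specific parameters and stochastic decay circuit are not reproduced here):

$$
\Delta w \;=\;
\begin{cases}
A_{+}\, e^{-\Delta t/\tau_{+}}, & \Delta t > 0 \quad (\text{pre before post}),\\[4pt]
-A_{-}\, e^{\,\Delta t/\tau_{-}}, & \Delta t < 0 \quad (\text{post before pre}),
\end{cases}
\qquad \Delta t = t_{\text{post}} - t_{\text{pre}},
$$

and it is the two exponential decays in this window that the decay-generator array approximates stochastically.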
Uniform Generalization, Concentration, and Adaptive Learning
Title | Uniform Generalization, Concentration, and Adaptive Learning |
Authors | Ibrahim Alabdulmohsin |
Abstract | One fundamental goal of any learning algorithm is to mitigate the risk of overfitting. Mathematically, this requires that the learning algorithm enjoys a small generalization risk, which is defined either in expectation or in probability. Both types of generalization are commonly used in the literature. For instance, generalization in expectation has been used to analyze algorithms such as ridge regression and SGD, whereas generalization in probability is used in the VC theory, among others. Recently, a third notion of generalization has been studied, called uniform generalization, which requires that the generalization risk vanishes uniformly in expectation across all bounded parametric losses. It has been shown that uniform generalization is, in fact, equivalent to an information-theoretic stability constraint, and that it recovers classical results in learning theory. It is achievable under various settings, such as sample compression schemes, finite hypothesis spaces, finite domains, and differential privacy. However, the relationship between uniform generalization and concentration remained unknown. In this paper, we answer this question by proving that, while generalization in expectation does not imply generalization in probability, a uniform generalization in expectation does imply concentration. We establish a chain rule for the uniform generalization risk of the composition of hypotheses and use it to derive a large deviation bound. Finally, we prove that the bound is tight. (The two notions are written out after this entry.) |
Tasks | |
Published | 2016-08-22 |
URL | http://arxiv.org/abs/1608.06072v2 |
http://arxiv.org/pdf/1608.06072v2.pdf | |
PWC | https://paperswithcode.com/paper/uniform-generalization-concentration-and |
Repo | |
Framework | |
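For orientation, the two notions contrasted in the abstract can be stated in standard (assumed) notation: writing $L_{\mathcal{D}}$ for the population risk, $\hat{L}_S$ for the empirical risk on a sample $S \sim \mathcal{D}^m$, and $h_S$ for the learned hypothesis,

$$
\big|\,\mathbb{E}_{S}\big[L_{\mathcal{D}}(h_S) - \hat{L}_{S}(h_S)\big]\big| \;\le\; \epsilon
\qquad \text{(generalization in expectation)},
$$
$$
\Pr_{S}\!\Big[\big|L_{\mathcal{D}}(h_S) - \hat{L}_{S}(h_S)\big| > \epsilon\Big] \;\le\; \delta
\qquad \text{(generalization in probability)}.
$$

Uniform generalization, as described in the abstract, asks the first quantity to vanish uniformly over all bounded parametric losses; the paper's result is that this stronger expectation-based property already implies concentration, whereas plain generalization in expectation does not.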