Paper Group ANR 1433
Fast Convergence of Belief Propagation to Global Optima: Beyond Correlation Decay
Title | Fast Convergence of Belief Propagation to Global Optima: Beyond Correlation Decay |
Authors | Frederic Koehler |
Abstract | Belief propagation is a fundamental message-passing algorithm for probabilistic reasoning and inference in graphical models. While it is known to be exact on trees, in most applications belief propagation is run on graphs with cycles. Understanding the behavior of “loopy” belief propagation has been a major challenge for researchers in machine learning, and positive convergence results for BP are known under strong assumptions which imply the underlying graphical model exhibits decay of correlations. We show that under a natural initialization, BP converges quickly to the global optimum of the Bethe free energy for Ising models on arbitrary graphs, as long as the Ising model is ferromagnetic (i.e., neighbors prefer to be aligned). This holds even though such models can exhibit long range correlations and may have multiple suboptimal BP fixed points. We also show an analogous result for iterating the (naive) mean-field equations; perhaps surprisingly, both results are dimension-free in the sense that a constant number of iterations already provides a good estimate to the Bethe/mean-field free energy. |
Tasks | |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.09992v1 |
https://arxiv.org/pdf/1905.09992v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-convergence-of-belief-propagation-to |
Repo | |
Framework | |
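The message-passing iteration described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the paper's code: it assumes the standard parametrization of Ising BP messages as cavity magnetizations, and it assumes that the "natural initialization" means setting every message to +1 with non-negative external fields. The function name `bp_ising` and its arguments are hypothetical, and the sketch returns node magnetizations only; it does not evaluate the Bethe free energy.

```python
import math

def bp_ising(J, h, iters=50):
    """Loopy BP for an Ising model with spins in {-1, +1}.

    J: dict mapping an undirected edge (i, j) to its coupling (>= 0 if ferromagnetic).
    h: list of external fields, one per node.
    Returns the BP estimate of each node's magnetization E[x_i].
    """
    n = len(h)
    nbrs = {i: {} for i in range(n)}
    for (i, j), w in J.items():
        nbrs[i][j] = w
        nbrs[j][i] = w

    # nu[(i, j)] is the message from i to j, stored as a magnetization in [-1, 1].
    # Assumption: all messages start at +1.
    nu = {(i, j): 1.0 for i in nbrs for j in nbrs[i]}

    def incoming(i, skip=None):
        # Total effective field at node i from all neighbors except `skip`.
        return sum(
            math.atanh(math.tanh(w) * nu[(k, i)])
            for k, w in nbrs[i].items()
            if k != skip
        )

    for _ in range(iters):
        # Synchronous update: all new messages are computed from the old ones.
        nu = {(i, j): math.tanh(h[i] + incoming(i, skip=j)) for (i, j) in nu}

    return [math.tanh(h[i] + incoming(i)) for i in range(n)]
```

On a tree the result matches the exact marginals, so a single edge makes a convenient sanity check. Very large couplings make `math.tanh(w)` round to 1.0, in which case `math.atanh` raises an error at the all-ones initialization; the sketch does not guard against that.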
Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-precision Learning (Technical Report)
Title | Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-precision Learning (Technical Report) |
Authors | Zeke Wang, Kaan Kara, Hantian Zhang, Gustavo Alonso, Onur Mutlu, Ce Zhang |
Abstract | Learning from the data stored in a database is an important function increasingly available in relational engines. Methods using lower precision input data are of special interest given their overall higher efficiency but, in databases, these methods have a hidden cost: the quantization of the real value into a smaller number is an expensive step. To address the issue, in this paper we present MLWeaving, a data structure and hardware acceleration technique intended to speed up learning of generalized linear models in databases. MLWeaving provides a compact, in-memory representation enabling the retrieval of data at any level of precision. MLWeaving also takes advantage of the increasing availability of FPGA-based accelerators to provide a highly efficient implementation of stochastic gradient descent. The solution adopted in MLWeaving is more efficient than existing designs in terms of space (since it can process any resolution on the same design) and resources (via the use of bit-serial multipliers). MLWeaving also enables the runtime tuning of precision, instead of a fixed precision level during the training. We illustrate this using a simple, dynamic precision schedule. Experimental results show MLWeaving achieves up to 16× performance improvement over low-precision CPU implementations of first-order methods. |
Tasks | Quantization |
Published | 2019-03-08 |
URL | http://arxiv.org/abs/1903.03404v2 |
http://arxiv.org/pdf/1903.03404v2.pdf | |
PWC | https://paperswithcode.com/paper/accelerating-generalized-linear-models-with |
Repo | |
Framework | |
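The idea of a single representation that can be read at any precision, as described in the abstract above, can be illustrated with a simple bit-plane layout. This is only a sketch under stated assumptions: the abstract does not specify MLWeaving's actual memory layout, so the most-significant-bit-first planes below, the 8-bit quantization of values in [0, 1), and the names `weave` and `read` are all hypothetical.

```python
def weave(values, bits=8):
    """Quantize values in [0, 1) to `bits`-bit integers and store them as bit planes.

    planes[b][i] holds bit b of value i, with b = 0 the most significant bit.
    """
    top = (1 << bits) - 1
    quantized = [min(int(v * (1 << bits)), top) for v in values]
    return [[(q >> (bits - 1 - b)) & 1 for q in quantized] for b in range(bits)]

def read(planes, k):
    """Rebuild every value using only the first k bit planes (k-bit precision)."""
    out = [0] * len(planes[0])
    for b in range(k):
        for i, bit in enumerate(planes[b]):
            out[i] = (out[i] << 1) | bit
    return [x / (1 << k) for x in out]
```

Reading fewer planes touches less memory and yields a coarser approximation of the same stored data, for example `read(weave([0.8, 0.3, 0.55]), 2)` gives `[0.75, 0.25, 0.5]`. Changing `k` between calls is the kind of runtime precision tuning the abstract refers to.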
Simulation of Turbulent Flow around a Generic High-Speed Train using Hybrid Models of RANS Numerical Method with Machine Learning
Title | Simulation of Turbulent Flow around a Generic High-Speed Train using Hybrid Models of RANS Numerical Method with Machine Learning |
Authors | Alireza Hajipour, Arash Mirabdolah Lavasani, Mohammad Eftekhari Yazdi, Amir Mosavi, Shahaboddin Shamshirband, Kwok-Wing Chau |
Abstract | In the present paper, an aerodynamic investigation of a high-speed train is performed. In the first section of this article, a generic high-speed train against a turbulent flow is simulated numerically. The Reynolds-Averaged Navier-Stokes (RANS) equations combined with the turbulence model are applied to solve incompressible turbulent flow around a high-speed train. Flow structure, velocity and pressure contours, and streamlines at some typical wind directions are the most important results of this simulation. The maximum and minimum values are specified and discussed. Also, the pressure coefficient for some critical points on the train surface is evaluated. Next, the influence of wind direction on the key aerodynamic parameters (drag, lift, and side forces) is analyzed and compared at the mentioned wind directions. Moreover, the effects of velocity changes (50, 60, 70, 80 and 90 m/s) on the above flow and aerodynamic parameters are estimated and compared. In the second section of the paper, various data-driven methods, including Gene Expression Programming (GEP), Gaussian Process Regression (GPR), and random forest (RF), are applied for predicting output parameters. Drag, lift, and side forces, as well as the minimum and maximum pressure coefficients for the mentioned wind directions and velocities, are predicted and compared using statistical parameters. The results indicate that RF provided the most accurate predictions for all coefficients of wind direction and most coefficients of free stream velocity. In conclusion, RF may be recommended for the prediction of aerodynamic coefficients. |
Tasks | |
Published | 2019-12-25 |
URL | https://arxiv.org/abs/2001.01569v1 |
https://arxiv.org/pdf/2001.01569v1.pdf | |
PWC | https://paperswithcode.com/paper/simulation-of-turbulent-flow-around-a-generic |
Repo | |
Framework | |
ELIMINATION from Design to Analysis
Title | ELIMINATION from Design to Analysis |
Authors | Ahmed Khalifa, Dan Gopstein, Julian Togelius |
Abstract | Elimination is a word puzzle game for browsers and mobile devices, where all levels are generated by a constrained evolutionary algorithm with no human intervention. This paper describes the design of the game and its level generation methods, and analysis of playtraces from almost a thousand users who played the game since its release. The analysis corroborates that the level generator creates a sawtooth-shaped difficulty curve, as intended. The analysis also offers insights into player behavior in this game. |
Tasks | |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06379v1 |
https://arxiv.org/pdf/1905.06379v1.pdf | |
PWC | https://paperswithcode.com/paper/elimination-from-design-to-analysis |
Repo | |
Framework | |
Self-Knowledge Distillation in Natural Language Processing
Title | Self-Knowledge Distillation in Natural Language Processing |
Authors | Sangchul Hahn, Heeyoul Choi |
Abstract | Since deep learning became a key player in natural language processing (NLP), many deep learning models have shown remarkable performance in a variety of NLP tasks, and in some cases they even outperform humans. Such high performance can be explained by the efficient knowledge representation of deep learning models. While many methods have been proposed to learn more efficient representations, knowledge distillation from pretrained deep networks suggests that we can use more information from the soft target probability to train other neural networks. In this paper, we propose a new knowledge distillation method, self-knowledge distillation, based on the soft target probabilities of the training model itself, where multimode information is distilled from the word embedding space right below the softmax layer. Due to the time complexity, our method approximates the soft target probabilities. In experiments, we applied the proposed method to two different and fundamental NLP tasks: language modeling and neural machine translation. The experimental results show that our proposed method improves performance on both tasks. |
Tasks | Language Modelling, Machine Translation |
Published | 2019-08-02 |
URL | https://arxiv.org/abs/1908.01851v1 |
https://arxiv.org/pdf/1908.01851v1.pdf | |
PWC | https://paperswithcode.com/paper/self-knowledge-distillation-in-natural |
Repo | |
Framework | |
Relevant features for Gender Classification in NIR Periocular Images
Title | Relevant features for Gender Classification in NIR Periocular Images |
Authors | Ignacio Viedma, Juan Tapia, Andres Iturriaga, Christoph Busch |
Abstract | Most gender classification methods from NIR images have used iris information. Recent work has explored the use of the whole periocular iris region, which has surprisingly achieved better results. This suggests the most relevant information for gender classification is not located in the iris as expected. In this work, we analyze and demonstrate the location of the most relevant features that describe gender in periocular NIR images and evaluate their influence on classification. Experiments show that the periocular region contains more gender information than the iris region. We extracted several features (intensity, texture, and shape) and classified them according to their relevance using the XGBoost algorithm. Support Vector Machine and nine ensemble classifiers were used for testing gender accuracy when using the most relevant features. The best classification results were obtained when 4,000 features located on the periocular region were used (89.22%). Additional experiments with the full periocular iris images versus the iris-occluded images were performed. The gender classification rates obtained were 84.35% and 85.75% respectively. We also contribute to the state of the art with a new database (UNAB-Gender). From these results, we suggest focussing only on the surrounding area of the iris. This allows us to realize a faster classification of gender from NIR periocular images. |
Tasks | |
Published | 2019-04-26 |
URL | http://arxiv.org/abs/1904.12007v1 |
http://arxiv.org/pdf/1904.12007v1.pdf | |
PWC | https://paperswithcode.com/paper/relevant-features-for-gender-classification |
Repo | |
Framework | |
Does Learning Require Memorization? A Short Tale about a Long Tail
Title | Does Learning Require Memorization? A Short Tale about a Long Tail |
Authors | Vitaly Feldman |
Abstract | State-of-the-art results on image recognition tasks are achieved using over-parameterized learning algorithms that (nearly) perfectly fit the training set and are known to fit even random labels well. This tendency to memorize the labels of the training data is not explained by existing theoretical analyses. Memorization of the training data also presents significant privacy risks when the training data contains sensitive personal information and thus it is important to understand whether such memorization is necessary for accurate learning. We provide a simple conceptual explanation and a theoretical model demonstrating that for natural data distributions memorization of labels is necessary for achieving close-to-optimal generalization error. The model is motivated and supported by the results of several recent empirical works. In our model, data is sampled from a mixture of subpopulations and the frequencies of these subpopulations are chosen from some prior. The model allows us to quantify the effect of not fitting the training data on the generalization performance of the learned classifier and demonstrates that memorization is necessary whenever frequencies are long-tailed. Image and text data are known to follow such distributions and therefore our results establish a formal link between these empirical phenomena. Our results also have concrete implications for the cost of ensuring differential privacy in learning. |
Tasks | |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05271v3 |
https://arxiv.org/pdf/1906.05271v3.pdf | |
PWC | https://paperswithcode.com/paper/does-learning-require-memorization-a-short |
Repo | |
Framework | |
Feature Extraction in Augmented Reality
Title | Feature Extraction in Augmented Reality |
Authors | Jekishan K. Parmar, Ankit Desai |
Abstract | Augmented Reality (AR) is used for various applications associated with the real world. In this paper, we first describe the characteristics and essential services of AR. A brief history of Virtual Reality (VR) and AR is also given in the introductory section. Then, AR technologies and their workflow are described, covering the complete AR process, which consists of the stages of Image Acquisition, Feature Extraction, Feature Matching, Geometric Verification, and Associated Information Retrieval. Feature extraction is the essence of AR; hence, its details are provided in the paper. |
Tasks | Information Retrieval |
Published | 2019-11-09 |
URL | https://arxiv.org/abs/1911.09177v1 |
https://arxiv.org/pdf/1911.09177v1.pdf | |
PWC | https://paperswithcode.com/paper/feature-extraction-in-augmented-reality |
Repo | |
Framework | |
Towards Neural Mixture Recommender for Long Range Dependent User Sequences
Title | Towards Neural Mixture Recommender for Long Range Dependent User Sequences |
Authors | Jiaxi Tang, Francois Belletti, Sagar Jain, Minmin Chen, Alex Beutel, Can Xu, Ed H. Chi |
Abstract | Understanding temporal dynamics has proved to be highly valuable for accurate recommendation. Sequential recommenders have been successful in modeling the dynamics of users and items over time. However, while different model architectures excel at capturing various temporal ranges or dynamics, distinct application contexts require adapting to diverse behaviors. In this paper we examine how to build a model that can make use of different temporal ranges and dynamics depending on the request context. We begin with the analysis of an anonymized YouTube dataset comprising millions of user sequences. We quantify the degree of long-range dependence in these sequences and demonstrate that both short-term and long-term dependent behavioral patterns co-exist. We then propose a neural Multi-temporal-range Mixture Model (M3) as a tailored solution to deal with both short-term and long-term dependencies. Our approach employs a mixture of models, each with a different temporal range. These models are combined by a learned gating mechanism capable of exerting different model combinations given different contextual information. In empirical evaluations on a public dataset and our own anonymized YouTube dataset, M3 consistently outperforms state-of-the-art sequential recommendation methods. |
Tasks | |
Published | 2019-02-22 |
URL | http://arxiv.org/abs/1902.08588v1 |
http://arxiv.org/pdf/1902.08588v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-neural-mixture-recommender-for-long |
Repo | |
Framework | |
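The gating mechanism mentioned in the abstract above, which combines several models according to context, can be sketched as a softmax-weighted mixture. This is a generic mixture-of-experts illustration rather than the M3 architecture itself: the linear gate, the function names, and the argument shapes are assumptions, and in practice the gate weights would be learned jointly with the component models.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mixture(expert_outputs, context, gate_weights):
    """Combine K expert output vectors with context-dependent weights.

    expert_outputs: list of K vectors of equal length (one per temporal-range model).
    context: feature vector describing the request context.
    gate_weights: K rows, each the same length as `context` (hypothetical linear gate).
    Returns (combined_vector, gate_probabilities).
    """
    logits = [sum(w * c for w, c in zip(row, context)) for row in gate_weights]
    gate = softmax(logits)
    dim = len(expert_outputs[0])
    combined = [
        sum(g * out[d] for g, out in zip(gate, expert_outputs)) for d in range(dim)
    ]
    return combined, gate
```

With all gate weights at zero the gate is uniform and the result is the plain average of the experts; a context that pushes one logit up shifts the mixture toward that expert.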
Optimistic Optimization for Statistical Model Checking with Regret Bounds
Title | Optimistic Optimization for Statistical Model Checking with Regret Bounds |
Authors | Negin Musavi, Dawei Sun, Sayan Mitra, Geir Dullerud, Sanjay Shakkottai |
Abstract | We explore the application of multi-armed bandit algorithms to statistical model checking (SMC) of Markov chains initialized to a set of states. We observe that model checking problems requiring maximization of probabilities of sets of execution over all choices of the initial states can be formulated as a multi-armed bandit problem, for appropriate costs and rewards. Therefore, the problem can be solved using multi-fidelity hierarchical optimistic optimization (MFHOO). Bandit algorithms, and MFHOO in particular, give (regret) bounds on the sample efficiency which rely on the smoothness and the near-optimality dimension of the objective function, and are a new addition to the existing types of bounds in the SMC literature. We present a new SMC tool, HooVer, built on these principles, and our experiments suggest that, compared with exact probabilistic model checking tools like Storm, HooVer scales better; compared with the statistical model checking tool PlasmaLab, HooVer can require much less data to achieve comparable results. |
Tasks | |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01537v1 |
https://arxiv.org/pdf/1911.01537v1.pdf | |
PWC | https://paperswithcode.com/paper/optimistic-optimization-for-statistical-model |
Repo | |
Framework | |
Feature-Attention Graph Convolutional Networks for Noise Resilient Learning
Title | Feature-Attention Graph Convolutional Networks for Noise Resilient Learning |
Authors | Min Shi, Yufei Tang, Xingquan Zhu, Jianxun Liu |
Abstract | Noise and inconsistency commonly exist in real-world information networks, due to the inherently error-prone nature of humans or to user privacy concerns. To date, tremendous efforts have been made to advance feature learning from networks, including the most recent Graph Convolutional Networks (GCN) or attention GCN, by integrating node content and topology structures. However, all existing methods consider networks as error-free sources and treat feature content in each node as independent and equally important to model node relations. The erroneous node content, combined with sparse features, poses essential challenges for existing methods to be used on real-world noisy networks. In this paper, we propose FA-GCN, a feature-attention graph convolution learning framework, to handle networks with noisy and sparse node content. To tackle noise and sparse content in each node, FA-GCN first employs a long short-term memory (LSTM) network to learn a dense representation for each feature. To model interactions between neighboring nodes, a feature-attention mechanism is introduced to allow neighboring nodes to learn and vary feature importance, with respect to their connections. By using a spectral-based graph convolution aggregation process, each node is allowed to concentrate more on the most determining neighborhood features aligned with the corresponding learning task. Experiments and validations, w.r.t. different noise levels, demonstrate that FA-GCN achieves better performance than state-of-the-art methods on both noise-free and noisy networks. |
Tasks | Feature Importance |
Published | 2019-12-26 |
URL | https://arxiv.org/abs/1912.11755v1 |
https://arxiv.org/pdf/1912.11755v1.pdf | |
PWC | https://paperswithcode.com/paper/feature-attention-graph-convolutional |
Repo | |
Framework | |
Random Projections of Mel-Spectrograms as Low-Level Features for Automatic Music Genre Classification
Title | Random Projections of Mel-Spectrograms as Low-Level Features for Automatic Music Genre Classification |
Authors | Juliano Henrique Foleiss, Tiago Fernandes Tavares |
Abstract | In this work, we analyse the random projections of Mel-spectrograms as low-level features for music genre classification. This approach was compared to handcrafted features, features learned using an auto-encoder and features obtained from a transfer learning setting. Tests on five different well-known, publicly available datasets show that random projections lead to results comparable to learned features and outperform features obtained via transfer learning in a shallow learning scenario. Random projections do not require extensive specialist knowledge and, at the same time, require less computational power for training than other projection-based low-level features. Therefore, they are a viable choice for shallow learning content-based music genre classification. |
Tasks | Transfer Learning |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.04660v1 |
https://arxiv.org/pdf/1911.04660v1.pdf | |
PWC | https://paperswithcode.com/paper/random-projections-of-mel-spectrograms-as-low |
Repo | |
Framework | |
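The random-projection features described in the abstract above amount to multiplying each spectrogram frame by a fixed random matrix that is never trained. The sketch below assumes a Gaussian matrix scaled by 1/sqrt(output dimension) and a Mel-spectrogram that has already been computed as a list of frames; both choices, and the function names, are illustrative assumptions rather than the authors' exact setup.

```python
import random

def random_projection_matrix(n_in, n_out, seed=0):
    """Fixed Gaussian projection matrix with n_out rows and n_in columns.

    The seed makes the projection reproducible; nothing here is learned.
    """
    rng = random.Random(seed)
    scale = 1.0 / (n_out ** 0.5)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_in)] for _ in range(n_out)]

def project(frames, matrix):
    """Project each frame (a list of Mel-band energies) onto the rows of `matrix`."""
    return [
        [sum(w * x for w, x in zip(row, frame)) for row in matrix]
        for frame in frames
    ]
```

A typical use would be to project every frame of a track, for example from 128 Mel bands down to 32 dimensions, and then feed summary statistics of the projected frames to a shallow classifier. Because the matrix is fixed, the only training cost is that of the classifier.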
An Ensemble Rate Adaptation Framework for Dynamic Adaptive Streaming Over HTTP
Title | An Ensemble Rate Adaptation Framework for Dynamic Adaptive Streaming Over HTTP |
Authors | Hui Yuan, Xiaoqian Hu, Junhui Hou, Xuekai Wei, Sam Kwong |
Abstract | Rate adaptation is one of the most important issues in dynamic adaptive streaming over HTTP (DASH). Due to the frequent fluctuations of the network bandwidth and complex variations of video content, it is difficult to deal with the varying network conditions and video content perfectly by using a single rate adaptation method. In this paper, we propose an ensemble rate adaptation framework for DASH, which aims to leverage the advantages of multiple methods involved in the framework to improve the quality of experience (QoE) of users. The proposed framework is simple yet very effective. Specifically, the proposed framework is composed of two modules, i.e., the method pool and method controller. In the method pool, several rate adaptation methods are integrated. At each decision time, only the method that can achieve the best QoE is chosen to determine the bitrate of the requested video segment. Besides, we also propose two strategies for switching methods, i.e., InstAnt Method Switching and InterMittent Method Switching, for the method controller to determine which method can provide the best QoE. Simulation results demonstrate that the proposed framework always achieves the highest QoE under changes in channel environment and video complexity, compared with state-of-the-art rate adaptation methods. |
Tasks | |
Published | 2019-12-26 |
URL | https://arxiv.org/abs/1912.11822v1 |
https://arxiv.org/pdf/1912.11822v1.pdf | |
PWC | https://paperswithcode.com/paper/an-ensemble-rate-adaptation-framework-for |
Repo | |
Framework | |
Adversarial Edit Attacks for Tree Data
Title | Adversarial Edit Attacks for Tree Data |
Authors | Benjamin Paaßen |
Abstract | Many machine learning models can be attacked with adversarial examples, i.e. inputs close to correctly classified examples that are classified incorrectly. However, most research on adversarial attacks to date is limited to vectorial data, in particular image data. In this contribution, we extend the field by introducing adversarial edit attacks for tree-structured data with potential applications in medicine and automated program analysis. Our approach solely relies on the tree edit distance and a logarithmic number of black-box queries to the attacked classifier without any need for gradient information. We evaluate our approach on two programming and two biomedical data sets and show that many established tree classifiers, like tree-kernel-SVMs and recursive neural networks, can be attacked effectively. |
Tasks | |
Published | 2019-08-25 |
URL | https://arxiv.org/abs/1908.09364v2 |
https://arxiv.org/pdf/1908.09364v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-edit-attacks-for-tree-data |
Repo | |
Framework | |
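The claim of a logarithmic number of black-box queries in the abstract above suggests a binary search over an edit script. The sketch below shows one way that could work, under assumptions the abstract does not confirm: an edit script from the input to a tree of another class is already available, applying the full script changes the predicted label, and edits are modeled as opaque functions. The name `shortest_flipping_prefix` is hypothetical, and computing the tree edit script itself is out of scope here.

```python
def shortest_flipping_prefix(x, script, classify):
    """Binary-search an edit script for a short prefix that changes the label.

    x: the original input (any object the edits understand).
    script: list of functions, each mapping an input to an edited input.
    classify: black-box classifier returning a label.
    Assumes applying the whole script yields a differently classified input.
    Returns (prefix_length, number_of_classifier_queries).
    """
    def apply_prefix(k):
        t = x
        for edit in script[:k]:
            t = edit(t)
        return t

    original_label = classify(x)
    queries = 1
    # Invariant: the prefix of length lo keeps the label, the prefix of length hi flips it.
    lo, hi = 0, len(script)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        queries += 1
        if classify(apply_prefix(mid)) == original_label:
            lo = mid
        else:
            hi = mid
    return hi, queries
```

The search does not need the label to change monotonically along the script: the invariant only guarantees that the returned prefix flips the label while the prefix one edit shorter does not. With a script of n edits the classifier is queried about log2(n) times, plus one query for the original input.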
The Importance of Socio-Cultural Differences for Annotating and Detecting the Affective States of Students
Title | The Importance of Socio-Cultural Differences for Annotating and Detecting the Affective States of Students |
Authors | Eda Okur, Sinem Aslan, Nese Alyuz, Asli Arslan Esme, Ryan S. Baker |
Abstract | The development of real-time affect detection models often depends upon obtaining annotated data for supervised learning by employing human experts to label the student data. One open question in annotating affective data for affect detection is whether the labelers (i.e., human experts) need to be socio-culturally similar to the students being labeled, as this impacts the cost feasibility of obtaining the labels. In this study, we investigate the following research questions: For affective state annotation, how does the socio-cultural background of human expert labelers, compared to the subjects, impact the degree of consensus and distribution of affective states obtained? Secondly, how do differences in labeler background impact the performance of affect detection models that are trained using these labels? |
Tasks | |
Published | 2019-01-12 |
URL | http://arxiv.org/abs/1901.03793v1 |
http://arxiv.org/pdf/1901.03793v1.pdf | |
PWC | https://paperswithcode.com/paper/the-importance-of-socio-cultural-differences |
Repo | |
Framework | |