Paper Group ANR 336
Photo-Quality Evaluation based on Computational Aesthetics: Review of Feature Extraction Techniques
Title | Photo-Quality Evaluation based on Computational Aesthetics: Review of Feature Extraction Techniques |
Authors | Dimitris Spathis |
Abstract | Researchers try to model the aesthetic quality of photographs into low- and high-level features, drawing inspiration from art theory, psychology and marketing. We attempt to describe every feature extraction measure employed in the above process. The contribution of this literature review is the taxonomy of each feature by its implementation complexity, considering real-world applications and integration in mobile apps and digital cameras. Also, we discuss the machine learning results along with some unexplored research areas as future work. |
Tasks | |
Published | 2016-12-19 |
URL | http://arxiv.org/abs/1612.06259v1 |
http://arxiv.org/pdf/1612.06259v1.pdf | |
PWC | https://paperswithcode.com/paper/photo-quality-evaluation-based-on |
Repo | |
Framework | |
Space-Filling Curves as a Novel Crystal Structure Representation for Machine Learning Models
Title | Space-Filling Curves as a Novel Crystal Structure Representation for Machine Learning Models |
Authors | Dipti Jasrasaria, Edward O. Pyzer-Knapp, Dmitrij Rappoport, Alan Aspuru-Guzik |
Abstract | A fundamental problem in applying machine learning techniques for chemical problems is to find suitable representations for molecular and crystal structures. While the structure representations based on atom connectivities are prevalent for molecules, two-dimensional descriptors are not suitable for describing molecular crystals. In this work, we introduce the SFC-M family of feature representations, which are based on Morton space-filling curves, as an alternative means of representing crystal structures. Latent Semantic Indexing (LSI) was employed in a novel setting to reduce sparsity of feature representations. The quality of the SFC-M representations was assessed by using them in combination with artificial neural networks to predict Density Functional Theory (DFT) single point, Ewald summed, lattice, and many-body dispersion energies of 839 organic molecular crystal unit cells from the Cambridge Structural Database that consist of the elements C, H, N, and O. Promising initial results suggest that the SFC-M representations merit further exploration to improve their ability to predict solid-state properties of organic crystal structures. |
Tasks | |
Published | 2016-08-19 |
URL | http://arxiv.org/abs/1608.05747v1 |
http://arxiv.org/pdf/1608.05747v1.pdf | |
PWC | https://paperswithcode.com/paper/space-filling-curves-as-a-novel-crystal |
Repo | |
Framework | |
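The Morton space-filling curve mentioned in the abstract above can be illustrated with a short sketch. This is only the generic Morton (Z-order) encoding of integer 3D grid coordinates, not the paper's SFC-M feature construction; the function names and the 10-bit default are assumptions made for illustration.

```python
def part1by2(n: int, bits: int = 10) -> int:
    """Spread the low `bits` bits of n so that two zero bits follow each one."""
    result = 0
    for i in range(bits):
        result |= ((n >> i) & 1) << (3 * i)
    return result


def morton3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the bits of (x, y, z) into a single Morton index."""
    return (
        part1by2(x, bits)
        | (part1by2(y, bits) << 1)
        | (part1by2(z, bits) << 2)
    )


def morton_order(points, bits: int = 10):
    """Sort integer grid points along the Morton curve (hypothetical helper)."""
    return sorted(points, key=lambda p: morton3d(p[0], p[1], p[2], bits))
```

Sorting voxelised coordinates by this index maps a 3D grid onto a one-dimensional sequence in which cells that are close on the curve are also close in space, which is the property that makes such curves usable as a structure representation.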
Ms. Pac-Man Versus Ghost Team CIG 2016 Competition
Title | Ms. Pac-Man Versus Ghost Team CIG 2016 Competition |
Authors | Piers R. Williams, Diego Perez-Liebana, Simon M. Lucas |
Abstract | This paper introduces the revival of the popular Ms. Pac-Man Versus Ghost Team competition. We present an updated game engine with Partial Observability constraints, a new Multi-Agent Systems approach to developing Ghost agents and several sample controllers to ease the development of entries. A restricted communication protocol is provided for the Ghosts, providing a more challenging environment than before. The competition will debut at the IEEE Computational Intelligence and Games Conference 2016. Some preliminary results showing the effects of Partial Observability and the benefits of simple communication are also presented. |
Tasks | |
Published | 2016-09-08 |
URL | http://arxiv.org/abs/1609.02316v1 |
http://arxiv.org/pdf/1609.02316v1.pdf | |
PWC | https://paperswithcode.com/paper/ms-pac-man-versus-ghost-team-cig-2016 |
Repo | |
Framework | |
A First Attempt to Cloud-Based User Verification in Distributed System
Title | A First Attempt to Cloud-Based User Verification in Distributed System |
Authors | Marcin Wozniak, Dawid Polap, Grzegorz Borowik, Christian Napoli |
Abstract | In this paper, the idea of client verification in distributed systems is presented. The proposed solution presents a sample system where client verification through cloud resources using input signature is discussed. For different signatures the proposed method has been examined. Research results are presented and discussed to show potential advantages. |
Tasks | |
Published | 2016-01-27 |
URL | http://arxiv.org/abs/1601.07446v1 |
http://arxiv.org/pdf/1601.07446v1.pdf | |
PWC | https://paperswithcode.com/paper/a-first-attempt-to-cloud-based-user |
Repo | |
Framework | |
Heart Beat Characterization from Ballistocardiogram Signals using Extended Functions of Multiple Instances
Title | Heart Beat Characterization from Ballistocardiogram Signals using Extended Functions of Multiple Instances |
Authors | Changzhe Jiao, Princess Lyons, Alina Zare, Licet Rosales, Marjorie Skubic |
Abstract | A multiple instance learning (MIL) method, extended Function of Multiple Instances ($e$FUMI), is applied to ballistocardiogram (BCG) signals produced by a hydraulic bed sensor. The goal of this approach is to learn a personalized heartbeat “concept” for an individual. This heartbeat concept is a prototype (or “signature”) that characterizes the heartbeat pattern for an individual in ballistocardiogram data. The $e$FUMI method models the problem of learning a heartbeat concept from a BCG signal as a MIL problem. This approach elegantly addresses the uncertainty inherent in a BCG signal, e.g., misalignment between training data and ground truth, mis-collection of heartbeat by some transducers, etc. Given a BCG training signal coupled with a ground truth signal (e.g., a pulse finger sensor), training “bags” labeled with only binary labels denoting if a training bag contains a heartbeat signal or not can be generated. Then, using these bags, $e$FUMI learns a personalized concept of heartbeat for a subject as well as several non-heartbeat background concepts. After learning the heartbeat concept, heartbeat detection and heart rate estimation can be applied to test data. Experimental results show that the estimated heartbeat concept found by $e$FUMI is more representative and a more discriminative prototype of the heartbeat signals than those found by comparison MIL methods in the literature. |
Tasks | Heart rate estimation, Multiple Instance Learning |
Published | 2016-05-16 |
URL | http://arxiv.org/abs/1605.04634v1 |
http://arxiv.org/pdf/1605.04634v1.pdf | |
PWC | https://paperswithcode.com/paper/heart-beat-characterization-from |
Repo | |
Framework | |
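The bag-labelling step described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: bags are taken to be fixed-size, non-overlapping windows, and the ground truth is assumed to be a list of sample indices where a beat occurs. The paper's actual bag construction may differ, and `make_bags` is a hypothetical name.

```python
def make_bags(signal, beat_indices, bag_size):
    """Split a signal into windows and label each window 1 if it contains
    at least one ground-truth beat index, else 0 (assumed bag scheme)."""
    beats = set(beat_indices)
    bags = []
    # Step through the signal in non-overlapping windows of bag_size samples.
    for start in range(0, len(signal) - bag_size + 1, bag_size):
        window = signal[start:start + bag_size]
        label = int(any(start <= i < start + bag_size for i in beats))
        bags.append((window, label))
    return bags
```

Only the bag-level label is kept, so the learner never sees which sample inside a positive bag is the heartbeat; that is what lets a MIL method tolerate misalignment between the BCG signal and the reference sensor.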
Faceless Person Recognition; Privacy Implications in Social Media
Title | Faceless Person Recognition; Privacy Implications in Social Media |
Authors | Seong Joon Oh, Rodrigo Benenson, Mario Fritz, Bernt Schiele |
Abstract | As we shift more of our lives into the virtual domain, the volume of data shared on the web keeps increasing and presents a threat to our privacy. This work contributes to the understanding of privacy implications of such data sharing by analysing how well people are recognisable in social media data. To facilitate a systematic study we define a number of scenarios considering factors such as how many heads of a person are tagged and if those heads are obfuscated or not. We propose a robust person recognition system that can handle large variations in pose and clothing, and can be trained with few training samples. Our results indicate that a handful of images is enough to threaten users’ privacy, even in the presence of obfuscation. We show detailed experimental results, and discuss their implications. |
Tasks | Person Recognition |
Published | 2016-07-28 |
URL | http://arxiv.org/abs/1607.08438v1 |
http://arxiv.org/pdf/1607.08438v1.pdf | |
PWC | https://paperswithcode.com/paper/faceless-person-recognition-privacy |
Repo | |
Framework | |
Complex Matrix Factorization for Face Recognition
Title | Complex Matrix Factorization for Face Recognition |
Authors | Viet-Hang Duong, Yuan-Shan Lee, Bach-Tung Pham, Seksan Mathulaprangsan, Pham The Bao, Jia-Ching Wang |
Abstract | This work developed novel complex matrix factorization methods for face recognition; the methods were complex matrix factorization (CMF), sparse complex matrix factorization (SpaCMF), and graph complex matrix factorization (GraCMF). After real-valued data are transformed into a complex field, the complex-valued matrix will be decomposed into two matrices of bases and coefficients, which are derived from solutions to an optimization problem in a complex domain. The generated objective function is the real-valued function of the reconstruction error, which produces a parametric description. Factorizing the matrix of complex entries directly transformed the constrained optimization problem into an unconstrained optimization problem. Additionally, a complex vector space with N dimensions can be regarded as a 2N-dimensional real vector space. Accordingly, all real analytic properties can be exploited in the complex field. The ability to exploit these important characteristics motivated the development herein of a simpler framework that can provide better recognition results. The effectiveness of this framework will be clearly elucidated in later sections in this paper. |
Tasks | Face Recognition |
Published | 2016-12-08 |
URL | http://arxiv.org/abs/1612.02513v1 |
http://arxiv.org/pdf/1612.02513v1.pdf | |
PWC | https://paperswithcode.com/paper/complex-matrix-factorization-for-face |
Repo | |
Framework | |
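The remark in the abstract above that an N-dimensional complex vector space can be regarded as a 2N-dimensional real one can be made concrete with a small sketch. The stacking order (all real parts, then all imaginary parts) is an arbitrary choice made here for illustration, and the function names are hypothetical.

```python
def to_real(z):
    """Map a length-N complex vector to a length-2N real vector
    by stacking real parts followed by imaginary parts."""
    return [c.real for c in z] + [c.imag for c in z]


def from_real(v):
    """Inverse of to_real: rebuild the length-N complex vector."""
    n = len(v) // 2
    return [complex(a, b) for a, b in zip(v[:n], v[n:])]
```

The map is invertible and preserves the squared Euclidean norm, so a real-valued reconstruction error computed in the complex domain equals the one computed on the stacked real vectors.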
Multi-Bias Non-linear Activation in Deep Neural Networks
Title | Multi-Bias Non-linear Activation in Deep Neural Networks |
Authors | Hongyang Li, Wanli Ouyang, Xiaogang Wang |
Abstract | As a widely used non-linear activation, Rectified Linear Unit (ReLU) separates noise and signal in a feature map by learning a threshold or bias. However, we argue that the classification of noise and signal not only depends on the magnitude of responses, but also the context of how the feature responses would be used to detect more abstract patterns in higher layers. In order to output multiple response maps with magnitude in different ranges for a particular visual pattern, existing networks employing ReLU and its variants have to learn a large number of redundant filters. In this paper, we propose a multi-bias non-linear activation (MBA) layer to explore the information hidden in the magnitudes of responses. It is placed after the convolution layer to decouple the responses to a convolution kernel into multiple maps by multi-thresholding magnitudes, thus generating more patterns in the feature space at a low computational cost. It provides great flexibility of selecting responses to different visual patterns in different magnitude ranges to form rich representations in higher layers. Such a simple and yet effective scheme achieves the state-of-the-art performance on several benchmarks. |
Tasks | |
Published | 2016-04-03 |
URL | http://arxiv.org/abs/1604.00676v1 |
http://arxiv.org/pdf/1604.00676v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-bias-non-linear-activation-in-deep |
Repo | |
Framework | |
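The multi-thresholding idea in the abstract above can be sketched in a few lines. This is a minimal illustration, assuming each output map is obtained by adding one bias to the convolution response and applying ReLU; the biases are hand-picked here, whereas in a trained network they would presumably be learned. The function name is hypothetical and plain nested lists stand in for a feature map.

```python
def multi_bias_activation(feature_map, biases):
    """Decouple one 2D response map into len(biases) maps,
    each computed as ReLU(response + bias)."""
    return [
        [[max(0.0, value + bias) for value in row] for row in feature_map]
        for bias in biases
    ]
```

For example, with biases `[0.0, 1.0]` a response of `-1.0` is suppressed in both maps, while a response of `0.5` yields `0.5` in the first map and `1.5` in the second, so different magnitude ranges are exposed to the next layer without extra convolution filters.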
On the efficient representation and execution of deep acoustic models
Title | On the efficient representation and execution of deep acoustic models |
Authors | Raziel Alvarez, Rohit Prabhavalkar, Anton Bakhtin |
Abstract | In this paper we present a simple and computationally efficient quantization scheme that enables us to reduce the resolution of the parameters of a neural network from 32-bit floating point values to 8-bit integer values. The proposed quantization scheme leads to significant memory savings and enables the use of optimized hardware instructions for integer arithmetic, thus significantly reducing the cost of inference. Finally, we propose a “quantization aware” training process that applies the proposed scheme during network training and find that it allows us to recover most of the loss in accuracy introduced by quantization. We validate the proposed techniques by applying them to a long short-term memory-based acoustic model on an open-ended large vocabulary speech recognition task. |
Tasks | Large Vocabulary Continuous Speech Recognition, Quantization, Speech Recognition |
Published | 2016-07-15 |
URL | http://arxiv.org/abs/1607.04683v2 |
http://arxiv.org/pdf/1607.04683v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-efficient-representation-and-execution |
Repo | |
Framework | |
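Reducing 32-bit floating point parameters to 8-bit integers, as described in the abstract above, can be illustrated with a generic uniform linear quantizer. This is a sketch under assumptions, not the paper's exact scheme: the range is taken from the minimum and maximum of the values, and the function names are hypothetical.

```python
def quantize(values, num_bits=8):
    """Map floats to integers in [0, 2**num_bits - 1] with a uniform step.
    Returns the integer codes plus the scale and offset needed to invert."""
    lo, hi = min(values), max(values)
    levels = (1 << num_bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [min(levels, max(0, int(round((v - lo) / scale)))) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float values from the integer codes."""
    return [lo + scale * c for c in codes]
```

Each recovered value differs from the original by at most half a quantization step, which is the accuracy loss that a quantization-aware training procedure aims to compensate for.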
Dynamic Action Recognition: A convolutional neural network model for temporally organized joint location data
Title | Dynamic Action Recognition: A convolutional neural network model for temporally organized joint location data |
Authors | Adhavan Jayabalan, Harish Karunakaran, Shravan Murlidharan, Tesia Shizume |
Abstract | Motivation: Recognizing human actions in a video is a challenging task which has applications in various fields. Previous works in this area have used images from either a 2D or a 3D camera. Few have used the idea that human actions can be easily identified by the movement of the joints in the 3D space and instead used a Recurrent Neural Network (RNN) for modeling. Convolutional neural networks (CNN) have the ability to recognise even the complex patterns in data, which makes them suitable for detecting human actions. Thus, we modeled a CNN which can predict the human activity using the joint data. Furthermore, using the joint data representation has the benefit of lower dimensionality than image or video representations. This makes our model simpler and faster than the RNN models. In this study, we have developed a six-layer convolutional network, which reduces each input feature vector of the form 15x1961x4 to a one-dimensional binary vector which gives us the predicted activity. Results: Our model is able to recognise an activity correctly with up to 87% accuracy. Joint data is taken from the Cornell Activity Datasets, which have day-to-day activities like talking, relaxing, eating, cooking, etc. |
Tasks | Temporal Action Localization |
Published | 2016-12-20 |
URL | http://arxiv.org/abs/1612.06703v1 |
http://arxiv.org/pdf/1612.06703v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-action-recognition-a-convolutional |
Repo | |
Framework | |
Random Fourier Features for Operator-Valued Kernels
Title | Random Fourier Features for Operator-Valued Kernels |
Authors | Romain Brault, Florence d’Alché-Buc, Markus Heinonen |
Abstract | Devoted to multi-task learning and structured output learning, operator-valued kernels provide a flexible tool to build vector-valued functions in the context of Reproducing Kernel Hilbert Spaces. To scale up these methods, we extend the celebrated Random Fourier Feature methodology to get an approximation of operator-valued kernels. We propose a general principle for Operator-valued Random Fourier Feature construction relying on a generalization of Bochner’s theorem for translation-invariant operator-valued Mercer kernels. We prove the uniform convergence of the kernel approximation for bounded and unbounded operator random Fourier features using an appropriate Bernstein matrix concentration inequality. An experimental proof-of-concept shows the quality of the approximation and the efficiency of the corresponding linear models on example datasets. |
Tasks | Multi-Task Learning |
Published | 2016-05-09 |
URL | http://arxiv.org/abs/1605.02536v3 |
http://arxiv.org/pdf/1605.02536v3.pdf | |
PWC | https://paperswithcode.com/paper/random-fourier-features-for-operator-valued |
Repo | |
Framework | |
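The Random Fourier Feature methodology that the abstract above extends can be sketched for the ordinary scalar-valued Gaussian kernel. This covers only the classical scalar case, not the operator-valued construction proposed in the paper; the function names, the bandwidth parameter `sigma`, and the fixed seed are assumptions made for illustration.

```python
import math
import random


def make_rff(dim, num_features, sigma=1.0, seed=0):
    """Return a feature map phi such that phi(x) . phi(y) approximates
    the Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    rng = random.Random(seed)
    # Frequencies are drawn from the kernel's spectral density (Bochner's theorem).
    ws = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
          for _ in range(num_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    norm = math.sqrt(2.0 / num_features)

    def phi(x):
        return [norm * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(ws, bs)]

    return phi


def approx_kernel(phi, x, y):
    """Inner product of the random features of x and y."""
    return sum(a * b for a, b in zip(phi(x), phi(y)))


def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian kernel, used here as the reference value."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))
```

With a few thousand features the inner product is typically close to the exact kernel value, and a linear model trained on `phi(x)` stands in for the kernel method at a cost that grows linearly with the number of samples.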
Agreement-based Learning of Parallel Lexicons and Phrases from Non-Parallel Corpora
Title | Agreement-based Learning of Parallel Lexicons and Phrases from Non-Parallel Corpora |
Authors | Chunyang Liu, Yang Liu, Huanbo Luan, Maosong Sun, Heng Yu |
Abstract | We introduce an agreement-based approach to learning parallel lexicons and phrases from non-parallel corpora. The basic idea is to encourage two asymmetric latent-variable translation models (i.e., source-to-target and target-to-source) to agree on identifying latent phrase and word alignments. The agreement is defined at both word and phrase levels. We develop a Viterbi EM algorithm for jointly training the two unidirectional models efficiently. Experiments on the Chinese-English dataset show that agreement-based learning significantly improves both alignment and translation performance. |
Tasks | |
Published | 2016-06-15 |
URL | http://arxiv.org/abs/1606.04597v1 |
http://arxiv.org/pdf/1606.04597v1.pdf | |
PWC | https://paperswithcode.com/paper/agreement-based-learning-of-parallel-lexicons |
Repo | |
Framework | |
Online Learning of Combinatorial Objects via Extended Formulation
Title | Online Learning of Combinatorial Objects via Extended Formulation |
Authors | Holakou Rahmanian, David P. Helmbold, S. V. N. Vishwanathan |
Abstract | The standard techniques for online learning of combinatorial objects perform multiplicative updates followed by projections into the convex hull of all the objects. However, this methodology can be expensive if the convex hull contains many facets. For example, the convex hull of $n$-symbol Huffman trees is known to have exponentially many facets (Maurras et al., 2010). We get around this difficulty by exploiting extended formulations (Kaibel, 2011), which encode the polytope of combinatorial objects in a higher dimensional “extended” space with only polynomially many facets. We develop a general framework for converting extended formulations into efficient online algorithms with good relative loss bounds. We present applications of our framework to online learning of Huffman trees and permutations. The regret bounds of the resulting algorithms are within a factor of $O(\sqrt{\log(n)})$ of the state-of-the-art specialized algorithms for permutations, and depending on the loss regimes, improve on or match the state-of-the-art for Huffman trees. Our method is general and can be applied to other combinatorial objects. |
Tasks | |
Published | 2016-09-17 |
URL | http://arxiv.org/abs/1609.05374v5 |
http://arxiv.org/pdf/1609.05374v5.pdf | |
PWC | https://paperswithcode.com/paper/online-learning-of-combinatorial-objects-via |
Repo | |
Framework | |
Adaptive Visualisation System for Construction Building Information Models Using Saliency
Title | Adaptive Visualisation System for Construction Building Information Models Using Saliency |
Authors | Hugo Martin, Sylvain Chevallier, Eric Monacelli |
Abstract | Building Information Modeling (BIM) is a recent construction process based on a 3D model, containing every component related to the building achievement. Architects, structure engineers, method engineers, and other participants in the building process work on this model through the design-to-construction cycle. The high complexity and the large amount of information included in these models raise several issues, delaying its wide adoption in the industrial world. One of the most important is visualization: professionals have difficulty finding the information relevant to their job. Current solutions suffer from two limitations: BIM model information is processed manually, and insignificant information is simply hidden, leading to inconsistencies in the building model. This paper describes a system relying on an ontological representation of the building information to automatically label the building elements. Depending on the user’s department, the visualization is modified according to these labels by automatically adjusting the colors and image properties based on a saliency model. The proposed saliency model incorporates several adaptations to fit the specificities of architectural images. |
Tasks | |
Published | 2016-03-07 |
URL | http://arxiv.org/abs/1603.02028v1 |
http://arxiv.org/pdf/1603.02028v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-visualisation-system-for |
Repo | |
Framework | |
Large-Scale Shape Retrieval with Sparse 3D Convolutional Neural Networks
Title | Large-Scale Shape Retrieval with Sparse 3D Convolutional Neural Networks |
Authors | Alexandr Notchenko, Ermek Kapushev, Evgeny Burnaev |
Abstract | In this paper we present results of performance evaluation of S3DCNN - a Sparse 3D Convolutional Neural Network - on a large-scale 3D Shape benchmark ModelNet40, and measure how it is impacted by voxel resolution of input shape. We demonstrate comparable classification and retrieval performance to state-of-the-art models, but with much lower computational cost in training and inference phases. We also notice that the benefits of higher input resolution can be limited by the ability of a neural network to generalize high-level features. |
Tasks | |
Published | 2016-11-28 |
URL | http://arxiv.org/abs/1611.09159v2 |
http://arxiv.org/pdf/1611.09159v2.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-shape-retrieval-with-sparse-3d |
Repo | |
Framework | |