July 26, 2019

Paper Group ANR 777

An Event-based Fast Movement Detection Algorithm for a Positioning Robot Using POWERLINK Communication. A Novel Space-Time Representation on the Positive Semidefinite Cone for Facial Expression Recognition. Information-theoretic interpretation of tuning curves for multiple motion directions. Improving classification accuracy of feedforward neural ne …

Title An Event-based Fast Movement Detection Algorithm for a Positioning Robot Using POWERLINK Communication
Authors Juan Barrios-Avilés, Taras Iakymchuk, Jorge Samaniego, Alfredo Rosado-Muñoz
Abstract This work develops a tracking system based on an event-based camera. A bioinspired filtering algorithm that reduces noise and transmitted data while keeping the main features of the scene is implemented in an FPGA, which also serves as a network node. A POWERLINK IEEE 61158 industrial network is used to connect the FPGA to a controller attached to a self-developed two-axis servo-controlled robot. The FPGA includes the network protocol, so the event-based camera is integrated like any other existing network node. The inverse kinematics for the robot is included in the controller. In addition, another network node is used to control pneumatic valves that blow the ball at different speeds and along different trajectories. To complete the system and provide a comparison, a traditional frame-based camera is also connected to the controller. The imaging data for the tracking system are obtained either from the event-based or the frame-based camera. Results show that the robot can accurately follow the ball using fast image recognition, with the intrinsic advantages of the event-based system (size, price, power). This work shows how new equipment and algorithms can be efficiently integrated into an industrial system, merging commercial industrial equipment with the new devices so that new technologies can rapidly enter the industrial field.
Tasks
Published 2017-07-22
URL http://arxiv.org/abs/1707.07188v1
PDF http://arxiv.org/pdf/1707.07188v1.pdf
PWC https://paperswithcode.com/paper/an-event-based-fast-movement-detection
Repo
Framework

A Novel Space-Time Representation on the Positive Semidefinite Cone for Facial Expression Recognition

Title A Novel Space-Time Representation on the Positive Semidefinite Cone for Facial Expression Recognition
Authors Anis Kacem, Mohamed Daoudi, Boulbaba Ben Amor, Juan Carlos Alvarez-Paiva
Abstract In this paper, we study the problem of facial expression recognition using a novel space-time geometric representation. We describe the temporal evolution of facial landmarks as parametrized trajectories on the Riemannian manifold of positive semidefinite matrices of fixed rank. Our representation has the advantage of naturally bringing in a second desirable quantity when comparing shapes, the spatial covariance, in addition to the conventional affine-shape representation. We then derive geometric and computational tools for rate-invariant analysis and adaptive re-sampling of trajectories, grounded in the Riemannian geometry of the manifold. Specifically, our approach involves three steps: 1) facial landmarks are first mapped into the Riemannian manifold of positive semidefinite matrices of rank 2, to build time-parameterized trajectories; 2) a temporal alignment is performed on the trajectories, providing a geometry-aware (dis-)similarity measure between them; 3) finally, a pairwise proximity function SVM (ppfSVM) is used to classify them, incorporating the latter (dis-)similarity measure into the kernel function. We show the effectiveness of the proposed approach on four publicly available benchmarks (CK+, MMI, Oulu-CASIA, and AFEW). The results of the proposed approach are comparable to or better than the state-of-the-art methods when involving only facial landmarks.
Tasks Facial Expression Recognition
Published 2017-07-20
URL http://arxiv.org/abs/1707.06440v1
PDF http://arxiv.org/pdf/1707.06440v1.pdf
PWC https://paperswithcode.com/paper/a-novel-space-time-representation-on-the
Repo
Framework
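
Step 1 of the abstract above (mapping facial landmarks to positive semidefinite matrices of rank 2) can be illustrated with a small sketch. The abstract does not spell out the construction, so this example assumes the common choice of a Gram matrix G = Z Zᵀ built from the centered n×2 landmark matrix Z; the function names `gram_matrix` and `trajectory` are hypothetical and not taken from the paper.

```python
def gram_matrix(landmarks):
    """Map n 2-D landmarks to an n x n Gram matrix G = Z Z^T.

    Assumption: Z is the centered n x 2 landmark matrix, so G is
    symmetric positive semidefinite with rank at most 2.
    """
    n = len(landmarks)
    # Center the configuration so the result ignores translation.
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    z = [(x - cx, y - cy) for x, y in landmarks]
    # Entry (i, j) is the inner product of centered landmarks i and j.
    return [[zi[0] * zj[0] + zi[1] * zj[1] for zj in z] for zi in z]


def trajectory(frames):
    """Turn a sequence of landmark sets into a time-indexed list of Gram matrices."""
    return [gram_matrix(frame) for frame in frames]


if __name__ == "__main__":
    # Hypothetical three-landmark "face" observed in two frames.
    frames = [
        [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)],
        [(0.0, 0.0), (2.0, 0.0), (1.0, 3.5)],
    ]
    for g in trajectory(frames):
        print(g)
```

Because the landmarks are centered first, translating the whole face leaves the matrix unchanged; the temporal alignment and ppfSVM steps are not covered by this sketch.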

Information-theoretic interpretation of tuning curves for multiple motion directions

Title Information-theoretic interpretation of tuning curves for multiple motion directions
Authors Wentao Huang, Xin Huang, Kechen Zhang
Abstract We have developed an efficient information-maximization method for computing the optimal shapes of tuning curves of sensory neurons by optimizing the parameters of the underlying feedforward network model. When applied to the problem of population coding of visual motion with multiple directions, our method yields several types of tuning curves with both symmetric and asymmetric shapes that resemble those found in the visual cortex. Our result suggests that the diversity or heterogeneity of tuning curve shapes observed in neurophysiological experiments might actually constitute an optimal population representation of visual motions with multiple components.
Tasks
Published 2017-02-01
URL http://arxiv.org/abs/1702.00493v1
PDF http://arxiv.org/pdf/1702.00493v1.pdf
PWC https://paperswithcode.com/paper/information-theoretic-interpretation-of
Repo
Framework

Improving classification accuracy of feedforward neural networks for spiking neuromorphic chips

Title Improving classification accuracy of feedforward neural networks for spiking neuromorphic chips
Authors Antonio Jimeno Yepes, Jianbin Tang, Benjamin Scott Mashford
Abstract Deep Neural Networks (DNNs) achieve human-level performance in many image analytics tasks, but DNNs are mostly deployed to GPU platforms that consume a considerable amount of power. New hardware platforms using lower-precision arithmetic achieve drastic reductions in power consumption. More recently, brain-inspired spiking neuromorphic chips have achieved even lower power consumption, on the order of milliwatts, while still offering real-time processing. However, to deploy DNNs to energy-efficient neuromorphic chips, the incompatibility between the continuous neurons and synaptic weights of traditional DNNs and the discrete spiking neurons and synapses of neuromorphic chips needs to be overcome. Previous work has achieved this by training a network to learn continuous probabilities before it is deployed to a neuromorphic architecture, such as the IBM TrueNorth Neurosynaptic System, by randomly sampling these probabilities. The main contribution of this paper is a new learning algorithm that learns a TrueNorth configuration ready for deployment. We achieve this by directly training a binary hardware crossbar that accommodates the TrueNorth axon configuration constraints, and we propose a different neuron model. Results of our approach trained on electroencephalogram (EEG) data show a significant improvement over previous work (76% vs 86% accuracy) while maintaining state-of-the-art performance on the MNIST handwritten data set.
Tasks EEG
Published 2017-05-19
URL http://arxiv.org/abs/1705.07755v1
PDF http://arxiv.org/pdf/1705.07755v1.pdf
PWC https://paperswithcode.com/paper/improving-classification-accuracy-of
Repo
Framework
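
The "previous work" described in the abstract above deploys a network by randomly sampling learned connection probabilities. A minimal sketch of that sampling step follows, assuming each crossbar entry is an independent Bernoulli draw; the function name `sample_binary_crossbar` and the probability values are hypothetical, and this is not the paper's new training algorithm.

```python
import random


def sample_binary_crossbar(probabilities, seed=0):
    """Draw a 0/1 connectivity matrix from per-synapse probabilities.

    Assumption: every entry is sampled independently (Bernoulli).
    A fixed seed keeps the sampled configuration reproducible.
    """
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so p = 0 never fires and p = 1 always fires.
    return [[1 if rng.random() < p else 0 for p in row] for row in probabilities]


if __name__ == "__main__":
    # Hypothetical learned probabilities for a 2 x 3 crossbar.
    learned = [[0.0, 1.0, 0.5], [1.0, 0.0, 0.25]]
    print(sample_binary_crossbar(learned))
```

Each sampled crossbar is one possible deployment of the trained probabilities, so accuracy can vary from draw to draw; the abstract's contribution is to train the binary crossbar directly instead.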

Stacked Convolutional and Recurrent Neural Networks for Music Emotion Recognition

Title Stacked Convolutional and Recurrent Neural Networks for Music Emotion Recognition
Authors Miroslav Malik, Sharath Adavanne, Konstantinos Drossos, Tuomas Virtanen, Dasa Ticha, Roman Jarina
Abstract This paper studies emotion recognition from musical tracks in the 2-dimensional valence-arousal (V-A) emotional space. We propose a method based on convolutional (CNN) and recurrent neural networks (RNN) that has significantly fewer parameters than the state-of-the-art method for the same task. We utilize one CNN layer followed by two branches of RNNs trained separately for arousal and valence. The method was evaluated using the ‘MediaEval2015 emotion in music’ dataset. We achieved an RMSE of 0.202 for arousal and 0.268 for valence, which is the best result reported on this dataset.
Tasks Emotion Recognition, Music Emotion Recognition
Published 2017-06-07
URL http://arxiv.org/abs/1706.02292v1
PDF http://arxiv.org/pdf/1706.02292v1.pdf
PWC https://paperswithcode.com/paper/stacked-convolutional-and-recurrent-neural
Repo
Framework
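
The abstract above reports results as RMSE, computed separately for arousal and valence. As a reminder of what that metric measures, here is a minimal sketch of root-mean-square error over a sequence of per-frame predictions; the helper name `rmse` and the sample values are hypothetical, not data from the paper.

```python
import math


def rmse(predicted, target):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - t) ** 2 for p, t in zip(predicted, target))
    return math.sqrt(squared / len(predicted))


if __name__ == "__main__":
    # Hypothetical arousal annotations and predictions for four frames.
    arousal_true = [0.10, 0.20, 0.15, -0.05]
    arousal_pred = [0.05, 0.25, 0.10, 0.00]
    print(rmse(arousal_pred, arousal_true))
```

Lower values are better, and the error is expressed in the same units as the valence-arousal annotations.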

Scale out for large minibatch SGD: Residual network training on ImageNet-1K with improved accuracy and reduced time to train

Title Scale out for large minibatch SGD: Residual network training on ImageNet-1K with improved accuracy and reduced time to train
Authors Valeriu Codreanu, Damian Podareanu, Vikram Saletore
Abstract For the past 5 years, the ILSVRC competition and the ImageNet dataset have attracted a lot of interest from the Computer Vision community, allowing for state-of-the-art accuracy to grow tremendously. This should be credited to the use of deep artificial neural network designs. As these became more complex, the storage, bandwidth, and compute requirements increased. This means that with a non-distributed approach, even when using the most high-density server available, the training process may take weeks, making it prohibitive. Furthermore, as datasets grow, the representation learning potential of deep networks grows as well by using more complex models. This synchronicity triggers a sharp increase in the computational requirements and motivates us to explore the scaling behaviour on petaflop-scale supercomputers. In this paper we describe the challenges and novel solutions needed in order to train ResNet-50 in this large-scale environment. We demonstrate above 90% scaling efficiency and a training time of 28 minutes using up to 104K x86 cores. This is supported by software tools from Intel’s ecosystem. Moreover, we show that with regular 90-120 epoch training runs we can achieve a top-1 accuracy as high as 77% for the unmodified ResNet-50 topology. We also introduce the novel Collapsed Ensemble (CE) technique that allows us to obtain a 77.5% top-1 accuracy, similar to that of a ResNet-152, while training an unmodified ResNet-50 topology for the same fixed training budget. All ResNet-50 models as well as the scripts needed to replicate them will be posted shortly.
Tasks Representation Learning
Published 2017-11-12
URL http://arxiv.org/abs/1711.04291v2
PDF http://arxiv.org/pdf/1711.04291v2.pdf
PWC https://paperswithcode.com/paper/scale-out-for-large-minibatch-sgd-residual
Repo
Framework
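
The "above 90% scaling efficiency" figure in the abstract above can be read as measured speedup divided by ideal (linear) speedup. The sketch below assumes that conventional definition, which the abstract does not state explicitly; the function name `scaling_efficiency` and the throughput numbers are hypothetical.

```python
def scaling_efficiency(base_nodes, base_throughput, nodes, throughput):
    """Measured speedup divided by ideal linear speedup.

    Throughput can be any rate, e.g. images per second; only the
    ratios matter.
    """
    speedup = throughput / base_throughput
    ideal_speedup = nodes / base_nodes
    return speedup / ideal_speedup


if __name__ == "__main__":
    # Hypothetical numbers: 8 nodes deliver 7.2x the single-node rate.
    print(scaling_efficiency(1, 100.0, 8, 720.0))
```

A value of 1.0 means perfectly linear scaling; communication and synchronization costs in large-minibatch SGD typically push it below that.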

Speaker Role Contextual Modeling for Language Understanding and Dialogue Policy Learning

Title Speaker Role Contextual Modeling for Language Understanding and Dialogue Policy Learning
Authors Ta-Chung Chi, Po-Chun Chen, Shang-Yu Su, Yun-Nung Chen
Abstract Language understanding (LU) and dialogue policy learning are two essential components in conversational systems. Human-human dialogues are not well-controlled and are often random and unpredictable due to the speakers’ own goals and speaking habits. This paper proposes a role-based contextual model to consider different speaker roles independently based on the various speaking patterns in multi-turn dialogues. The experiments on the benchmark dataset show that the proposed role-based model successfully learns role-specific behavioral patterns for contextual encoding and then significantly improves language understanding and dialogue policy learning tasks.
Tasks
Published 2017-09-30
URL http://arxiv.org/abs/1710.00164v1
PDF http://arxiv.org/pdf/1710.00164v1.pdf
PWC https://paperswithcode.com/paper/speaker-role-contextual-modeling-for-language
Repo
Framework

Bridging the Gap Between Neural Networks and Neuromorphic Hardware with A Neural Network Compiler

Title Bridging the Gap Between Neural Networks and Neuromorphic Hardware with A Neural Network Compiler
Authors Yu Ji, YouHui Zhang, WenGuang Chen, Yuan Xie
Abstract Unlike developing neural networks (NNs) for general-purpose processors, development for NN chips usually faces hardware-specific restrictions, such as limited precision of network signals and parameters, constrained computation scale, and limited types of non-linear functions. This paper proposes a general methodology to address these challenges. We decouple the NN applications from the target hardware by introducing a compiler that can transform an existing trained, unrestricted NN into an equivalent network that meets the given hardware’s constraints. We propose multiple techniques to make the transformation adaptable to different kinds of NN chips and reliable under strict hardware constraints. We have built such a software tool that supports both spiking neural networks (SNNs) and traditional artificial neural networks (ANNs). We have demonstrated its effectiveness with a fabricated neuromorphic chip and a processing-in-memory (PIM) design. Tests show that the inference error caused by this solution is insignificant and the transformation time is much shorter than the retraining time. We have also carried out parameter-sensitivity evaluations to explore the tradeoffs between network error and resource utilization for different transformation strategies, which could provide insights for co-design optimization of neuromorphic hardware and software.
Tasks
Published 2017-11-15
URL http://arxiv.org/abs/1801.00746v3
PDF http://arxiv.org/pdf/1801.00746v3.pdf
PWC https://paperswithcode.com/paper/bridging-the-gap-between-neural-networks-and
Repo
Framework

From Imitation to Prediction, Data Compression vs Recurrent Neural Networks for Natural Language Processing

Title From Imitation to Prediction, Data Compression vs Recurrent Neural Networks for Natural Language Processing
Authors Juan Andrés Laura, Gabriel Masi, Luis Argerich
Abstract In recent studies [1][13][12] Recurrent Neural Networks were used for generative processes, and their surprising performance can be explained by their ability to create good predictions. Data compression is also based on predictions. The question is whether a data compressor could perform as well as recurrent neural networks in natural language processing tasks. If this is possible, then the problem comes down to determining whether a compression algorithm is even more intelligent than a neural network in specific tasks related to human language. In our journey we discovered what we think is the fundamental difference between a Data Compression Algorithm and a Recurrent Neural Network.
Tasks
Published 2017-05-01
URL http://arxiv.org/abs/1705.00697v1
PDF http://arxiv.org/pdf/1705.00697v1.pdf
PWC https://paperswithcode.com/paper/from-imitation-to-prediction-data-compression
Repo
Framework

SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization

Title SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization
Authors Jeff Kinnison, Nathaniel Kremer-Herman, Douglas Thain, Walter Scheirer
Abstract Computer vision is experiencing an AI renaissance, in which machine learning models are expediting important breakthroughs in academic research and commercial applications. Effectively training these models, however, is not trivial due in part to hyperparameters: user-configured values that control a model’s ability to learn from data. Existing hyperparameter optimization methods are highly parallel but make no effort to balance the search across heterogeneous hardware or to prioritize searching high-impact spaces. In this paper, we introduce a framework for massively Scalable Hardware-Aware Distributed Hyperparameter Optimization (SHADHO). Our framework calculates the relative complexity of each search space and monitors performance on the learning task over all trials. These metrics are then used as heuristics to assign hyperparameters to distributed workers based on their hardware. We first demonstrate that our framework achieves double the throughput of a standard distributed hyperparameter optimization framework by optimizing SVM for MNIST using 150 distributed workers. We then conduct model search with SHADHO over the course of one week using 74 GPUs across two compute clusters to optimize U-Net for a cell segmentation task, discovering 515 models that achieve a lower validation loss than standard U-Net.
Tasks Cell Segmentation, Hyperparameter Optimization
Published 2017-07-05
URL http://arxiv.org/abs/1707.01428v2
PDF http://arxiv.org/pdf/1707.01428v2.pdf
PWC https://paperswithcode.com/paper/shadho-massively-scalable-hardware-aware
Repo
Framework

Spatio-temporal Human Action Localisation and Instance Segmentation in Temporally Untrimmed Videos

Title Spatio-temporal Human Action Localisation and Instance Segmentation in Temporally Untrimmed Videos
Authors Suman Saha, Gurkirt Singh, Michael Sapienza, Philip H. S. Torr, Fabio Cuzzolin
Abstract Current state-of-the-art human action recognition is focused on the classification of temporally trimmed videos in which only one action occurs per frame. In this work we address the problem of action localisation and instance segmentation in which multiple concurrent actions of the same class may be segmented out of an image sequence. We cast the action tube extraction as an energy maximisation problem in which configurations of region proposals in each frame are assigned a cost and the best action tubes are selected via two passes of dynamic programming. One pass associates region proposals in space and time for each action category, and another pass is used to solve for the tube’s temporal extent and to enforce a smooth label sequence through the video. In addition, by taking advantage of recent work on action foreground-background segmentation, we are able to associate each tube with class-specific segmentations. We demonstrate the performance of our algorithm on the challenging LIRIS-HARL dataset and achieve a new state-of-the-art result which is 14.3 times better than previous methods.
Tasks Instance Segmentation, Semantic Segmentation, Temporal Action Localization
Published 2017-07-22
URL http://arxiv.org/abs/1707.07213v2
PDF http://arxiv.org/pdf/1707.07213v2.pdf
PWC https://paperswithcode.com/paper/spatio-temporal-human-action-localisation-and
Repo
Framework
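
The abstract above selects action tubes with dynamic programming over per-frame region proposals. The sketch below illustrates only the general idea of the first pass (linking one proposal per frame into a path), assuming an energy made of per-box class scores plus the overlap (IoU) between consecutive boxes. That energy, the helper names `iou` and `link_proposals`, and the sample boxes are assumptions for illustration, not the paper's formulation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def link_proposals(frames, overlap_weight=1.0):
    """Viterbi-style pass: pick one proposal per frame maximising
    sum(scores) + overlap_weight * sum(IoU of consecutive boxes).

    frames: list (one entry per frame) of lists of (box, score).
    Returns (path of proposal indices, total energy).
    """
    best = [score for _, score in frames[0]]
    back = []  # back[t-1][j] = best predecessor of proposal j in frame t
    for t in range(1, len(frames)):
        new_best, pointers = [], []
        for box, score in frames[t]:
            cands = [best[k] + overlap_weight * iou(prev_box, box)
                     for k, (prev_box, _) in enumerate(frames[t - 1])]
            k_star = max(range(len(cands)), key=cands.__getitem__)
            new_best.append(cands[k_star] + score)
            pointers.append(k_star)
        best = new_best
        back.append(pointers)
    # Trace the best path backwards from the highest-energy final proposal.
    idx = max(range(len(best)), key=best.__getitem__)
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return path, max(best)


if __name__ == "__main__":
    frames = [
        [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)],
        [((1, 1, 11, 11), 0.5), ((50, 50, 60, 60), 0.6)],
        [((2, 2, 12, 12), 0.9), ((80, 80, 90, 90), 0.95)],
    ]
    print(link_proposals(frames))
```

In this toy input, picking the highest-scoring box in each frame independently would jump between distant regions, whereas the linked path stays on the spatially consistent boxes. The second pass described in the abstract (temporal extent and label smoothing) is not sketched here.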

Learning to Associate Words and Images Using a Large-scale Graph

Title Learning to Associate Words and Images Using a Large-scale Graph
Authors Heqing Ya, Haonan Sun, Jeffrey Helt, Tai Sing Lee
Abstract We develop an approach for unsupervised learning of associations between co-occurring perceptual events using a large graph. We applied this approach to successfully solve the image captcha of China’s railroad system. The approach is based on the principle of suspicious coincidence. In this particular problem, a user is presented with a deformed picture of a Chinese phrase and eight low-resolution images. They must quickly select the relevant images in order to purchase their train tickets. This problem presents several challenges: (1) the teaching labels for both the Chinese phrases and the images were not available for supervised learning, (2) no pre-trained deep convolutional neural networks are available for recognizing these Chinese phrases or the presented images, and (3) each captcha must be solved within a few seconds. We collected 2.6 million captchas, with 2.6 million deformed Chinese phrases and over 21 million images. From these data, we constructed an association graph, composed of over 6 million vertices, and linked these vertices based on co-occurrence information and feature similarity between pairs of images. We then trained a deep convolutional neural network to learn a projection of the Chinese phrases onto a 230-dimensional latent space. Using label propagation, we computed the likelihood of each of the eight images conditioned on the latent space projection of the deformed phrase for each captcha. The resulting system solved captchas with 77% accuracy in 2 seconds on average. Our work, in answering this practical challenge, illustrates the power of this class of unsupervised association learning techniques, which may be related to the brain’s general strategy for associating language stimuli with visual objects on the principle of suspicious coincidence.
Tasks
Published 2017-05-22
URL http://arxiv.org/abs/1705.07768v1
PDF http://arxiv.org/pdf/1705.07768v1.pdf
PWC https://paperswithcode.com/paper/learning-to-associate-words-and-images-using
Repo
Framework
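
The abstract above scores candidate images by label propagation over an association graph. A minimal sketch of iterative label propagation with clamped seed nodes follows. The update rule (weighted average of neighbour scores), the function name `propagate_labels`, and the toy graph are assumptions for illustration; the paper's graph construction and exact propagation scheme are not given in the abstract.

```python
def propagate_labels(edges, seeds, iterations=50):
    """Spread seed scores through a weighted graph.

    edges: dict node -> {neighbour: weight}; every neighbour must
           also appear as a key.
    seeds: dict node -> fixed score (clamped on every iteration).
    Non-seed nodes repeatedly take the weighted mean of their neighbours.
    """
    scores = {node: seeds.get(node, 0.0) for node in edges}
    for _ in range(iterations):
        updated = {}
        for node, neighbours in edges.items():
            if node in seeds:
                updated[node] = seeds[node]
                continue
            total = sum(neighbours.values())
            updated[node] = (
                sum(w * scores[n] for n, w in neighbours.items()) / total
                if total else 0.0
            )
        scores = updated
    return scores


if __name__ == "__main__":
    # Hypothetical chain: phrase - img_a - img_b - other.
    graph = {
        "phrase": {"img_a": 1.0},
        "img_a": {"phrase": 1.0, "img_b": 1.0},
        "img_b": {"img_a": 1.0, "other": 1.0},
        "other": {"img_b": 1.0},
    }
    print(propagate_labels(graph, {"phrase": 1.0, "other": 0.0}))
```

Nodes closer to the positively labelled seed end up with higher scores, which is the property a captcha solver would use to rank the candidate images against the recognised phrase.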

An Automated Text Categorization Framework based on Hyperparameter Optimization

Title An Automated Text Categorization Framework based on Hyperparameter Optimization
Authors Eric S. Tellez, Daniela Moctezuma, Sabino Miranda-Jímenez, Mario Graff
Abstract A great variety of text tasks such as topic or spam identification, user profiling, and sentiment analysis can be posed as a supervised learning problem and tackled using a text classifier. A text classifier consists of several subprocesses; some of them are general enough to be applied to any supervised learning problem, whereas others are specifically designed to tackle a particular task, using complex and computationally expensive processes such as lemmatization, syntactic analysis, etc. Contrary to traditional approaches, we propose a minimalistic and wide system able to tackle text classification tasks independent of domain and language, namely microTC. It is composed of some easy-to-implement text transformations, text representations, and a supervised learning algorithm. These pieces produce a competitive classifier even in the domain of informally written text. We provide a detailed description of microTC along with an extensive experimental comparison with relevant state-of-the-art methods. microTC was compared on 30 different datasets. Regarding accuracy, microTC obtained the best performance in 20 datasets while achieving competitive results in the remaining 10. The compared datasets include several problems like topic and polarity classification, spam detection, user profiling and authorship attribution. Furthermore, it is important to state that our approach allows the usage of the technology even without knowledge of machine learning and natural language processing.
Tasks Hyperparameter Optimization, Lemmatization, Sentiment Analysis, Text Categorization, Text Classification
Published 2017-04-06
URL http://arxiv.org/abs/1704.01975v2
PDF http://arxiv.org/pdf/1704.01975v2.pdf
PWC https://paperswithcode.com/paper/an-automated-text-categorization-framework
Repo
Framework

Introspective Classification with Convolutional Nets

Title Introspective Classification with Convolutional Nets
Authors Long Jin, Justin Lazarow, Zhuowen Tu
Abstract We propose introspective convolutional networks (ICN) that emphasize the importance of having convolutional neural networks empowered with generative capabilities. We employ a reclassification-by-synthesis algorithm to perform training using a formulation stemming from Bayes' theory. Our ICN tries to iteratively: (1) synthesize pseudo-negative samples; and (2) enhance itself by improving the classification. The single CNN classifier learned is at the same time generative, being able to directly synthesize new samples within its own discriminative model. We conduct experiments on benchmark datasets including MNIST, CIFAR-10, and SVHN using state-of-the-art CNN architectures, and observe improved classification results.
Tasks
Published 2017-04-25
URL http://arxiv.org/abs/1704.07816v2
PDF http://arxiv.org/pdf/1704.07816v2.pdf
PWC https://paperswithcode.com/paper/introspective-classification-with
Repo
Framework

Multi-Labelled Value Networks for Computer Go

Title Multi-Labelled Value Networks for Computer Go
Authors Ti-Rong Wu, I-Chen Wu, Guan-Wun Chen, Ting-han Wei, Tung-Yi Lai, Hung-Chun Wu, Li-Cheng Lan
Abstract This paper proposes a new approach to a novel value network architecture for the game Go, called a multi-labelled (ML) value network. In the ML value network, different values (win rates) are trained simultaneously for different settings of komi, a compensation given to balance the initiative of playing first. The ML value network has three advantages: (a) it outputs values for different komi, (b) it supports dynamic komi, and (c) it lowers the mean squared error (MSE). This paper also proposes a new dynamic komi method to improve game-playing strength, and performs experiments to demonstrate the merits of the architecture. First, the MSE of the ML value network is generally lower than that of the value network alone. Second, the program based on the ML value network wins at a rate of 67.6% against the program based on the value network alone. Third, the program with the proposed dynamic komi method significantly improves the playing strength over the baseline that does not use dynamic komi, especially for handicap games. To our knowledge, to date, no handicap games have been played openly by programs using value networks. This paper provides these programs with a useful approach to playing handicap games.
Tasks
Published 2017-05-30
URL http://arxiv.org/abs/1705.10701v1
PDF http://arxiv.org/pdf/1705.10701v1.pdf
PWC https://paperswithcode.com/paper/multi-labelled-value-networks-for-computer-go
Repo
Framework