October 17, 2019

Paper Group ANR 920

Multi-objective evolution for 3D RTS Micro. Copy Move Forgery using Hus Invariant Moments and Log Polar Transformations. Robust Multi-subspace Analysis Using Novel Column L0-norm Constrained Matrix Factorization. A Fog Robotic System for Dynamic Visual Servoing. High-resolution medical image synthesis using progressively grown generative adversaria …

Multi-objective evolution for 3D RTS Micro

Title Multi-objective evolution for 3D RTS Micro
Authors Sushil J. Louis, Siming Liu
Abstract We attack the problem of controlling teams of autonomous units during skirmishes in real-time strategy games. Earlier work had shown promise in evolving control algorithm parameters that lead to high performance team behaviors similar to those favored by good human players in real-time strategy games like Starcraft. This algorithm specifically encoded parameterized kiting and fleeing behaviors and the genetic algorithm evolved these parameter values. In this paper we investigate using influence maps and potential fields alone to compactly represent and control real-time team behavior for entities that can maneuver in three dimensions. A two-objective fitness function that maximizes damage done and minimizes damage taken guides our multi-objective evolutionary algorithm. Preliminary results indicate that evolving friend and enemy unit potential field parameters for distance, weapon characteristics, and entity health suffice to produce complex, high performing, three-dimensional, team tactics.
Tasks Real-Time Strategy Games, Starcraft
Published 2018-03-08
URL http://arxiv.org/abs/1803.02943v1
PDF http://arxiv.org/pdf/1803.02943v1.pdf
PWC https://paperswithcode.com/paper/multi-objective-evolution-for-3d-rts-micro
Repo
Framework
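
The two-objective fitness described in the abstract (maximize damage done, minimize damage taken) can be illustrated with a plain Pareto-dominance filter. This is a minimal sketch, not the authors' evolutionary algorithm: the names `dominates` and `pareto_front` and the sample scores are hypothetical, chosen only to show how candidates are compared when no single scalar fitness exists.

```python
from typing import List, Tuple

# Each candidate is scored as (damage_done, damage_taken); values are made up.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(scores: List[Score]) -> List[Score]:
    """Keep only the candidates that no other candidate dominates."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]


candidates = [(90.0, 60.0), (70.0, 20.0), (50.0, 40.0), (90.0, 80.0)]
front = pareto_front(candidates)
print(front)
```

A multi-objective evolutionary algorithm would typically prefer members of this non-dominated set when selecting parents, instead of collapsing both objectives into one weighted sum.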

Copy Move Forgery using Hus Invariant Moments and Log Polar Transformations

Title Copy Move Forgery using Hus Invariant Moments and Log Polar Transformations
Authors Tejas K, Swathi C, Rajesh Kumar M
Abstract As data interchange grows, so does the need for security: the large volumes of digital data being transmitted must be protected. Among the many possible forms of tampering, one widespread technique is Copy Move Forgery (CMF), in which parts of an image are copied and duplicated elsewhere in the same image. A number of algorithms exist to detect such forgeries, and their primary step is feature extraction. The feature extraction technique should have low time and space complexity so that media can be processed efficiently. In addition, most existing state-of-the-art techniques tend to falsely flag similar genuine objects as copy-move forged during detection. To tackle these problems, the paper proposes a novel algorithm that uses Hu's Invariant Moments and Log Polar Transformations to reduce the feature vector dimension to one feature per block while still distinguishing CMF from genuinely similar objects in an image. The qualitative and quantitative results obtained demonstrate the effectiveness of this algorithm.
Tasks
Published 2018-06-07
URL http://arxiv.org/abs/1806.02907v1
PDF http://arxiv.org/pdf/1806.02907v1.pdf
PWC https://paperswithcode.com/paper/copy-move-forgery-using-hus-invariant-moments
Repo
Framework
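
To make the idea of a moment-based block feature concrete, the sketch below computes the first two of Hu's invariant moments for a small grayscale block and checks that they do not change under a 90-degree rotation. It assumes the standard textbook definitions of central and normalized moments; the function names and the sample block are hypothetical, and the abstract does not say which moment (or combination) the paper keeps as its single per-block feature.

```python
def hu_first_two(img):
    """Return Hu's first two invariants for a 2D list of non-negative intensities."""
    m00 = sum(sum(row) for row in img)
    # Intensity-weighted centroid (x is the column index, y the row index).
    cx = sum(x * v for row in img for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(img) for v in row) / m00

    def mu(p, q):
        # Central moment of order (p, q): translation invariant.
        return sum(((x - cx) ** p) * ((y - cy) ** q) * v
                   for y, row in enumerate(img) for x, v in enumerate(row))

    def eta(p, q):
        # Normalized central moment: additionally scale invariant.
        return mu(p, q) / m00 ** (1 + (p + q) / 2)

    h1 = eta(2, 0) + eta(0, 2)
    h2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return h1, h2


def rot90(img):
    """Rotate a 2D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


block = [[0, 1, 2, 0],
         [0, 3, 4, 0],
         [0, 0, 5, 0]]
print(hu_first_two(block))
print(hu_first_two(rot90(block)))
```

Both printed pairs agree up to floating-point rounding, which is the property that lets a copied and rotated region match its source.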

Robust Multi-subspace Analysis Using Novel Column L0-norm Constrained Matrix Factorization

Title Robust Multi-subspace Analysis Using Novel Column L0-norm Constrained Matrix Factorization
Authors Binghui Wang, Chuang Lin
Abstract We study the underlying structure of data (approximately) generated from a union of independent subspaces. Traditional methods learn only one subspace, failing to discover the multi-subspace structure, while state-of-the-art methods analyze the multi-subspace structure using the data themselves as the dictionary, which cannot offer an explicit basis to span each subspace and is sensitive to errors because the representation is indirect. They also suffer from high computational complexity, which is quadratic or cubic in the sample size. To tackle all these problems, we propose a method, called Matrix Factorization with Column L0-norm constraint (MFC0), that can simultaneously learn the basis for each subspace, generate a direct sparse representation for each data sample, and remove errors in the data in an efficient way. Furthermore, we develop a first-order alternating direction algorithm, whose computational complexity is linear in the sample size, to stably and effectively solve the nonconvex objective function and non-smooth l0-norm constraint of MFC0. Experimental results on both synthetic and real-world datasets demonstrate that, besides its superiority over traditional and state-of-the-art methods for subspace clustering, data reconstruction, and error correction, MFC0 also shows its uniqueness for multi-subspace basis learning and direct sparse representation.
Tasks
Published 2018-01-27
URL http://arxiv.org/abs/1801.09111v1
PDF http://arxiv.org/pdf/1801.09111v1.pdf
PWC https://paperswithcode.com/paper/robust-multi-subspace-analysis-using-novel
Repo
Framework

A Fog Robotic System for Dynamic Visual Servoing

Title A Fog Robotic System for Dynamic Visual Servoing
Authors Nan Tian, Jinfa Chen, Mas Ma, Robert Zhang, Bill Huang, Ken Goldberg, Somayeh Sojoudi
Abstract Cloud Robotics is a paradigm where distributed robots are connected to cloud services via networks to access unlimited computation power, at the cost of network communication. However, due to limitations such as network latency and variability, it is difficult to control dynamic, human compliant service robots directly from the cloud. In this work, by leveraging an asynchronous protocol with a heartbeat signal, we combine cloud robotics with a smart edge device to build a Fog Robotic system. We use the system to enable robust teleoperation of a dynamic self-balancing robot from the cloud. We first use the system to pick up boxes from static locations, a task commonly performed in warehouse logistics. To make cloud teleoperation more efficient, we deploy image based visual servoing (IBVS) to perform box pickups automatically. Visual feedback, including AprilTag recognition and tracking, is performed in the cloud to emulate a Fog Robotic object recognition system for IBVS. We demonstrate the feasibility of a real-time dynamic automation system using this cloud-edge hybrid, which opens up possibilities of deploying dynamic robotic control with deep-learning recognition systems in Fog Robotics. Finally, we show that Fog Robotics enables the self-balancing service robot to pick up a box automatically from a person in unstructured environments.
Tasks Object Recognition
Published 2018-09-16
URL http://arxiv.org/abs/1809.06716v1
PDF http://arxiv.org/pdf/1809.06716v1.pdf
PWC https://paperswithcode.com/paper/a-fog-robotic-system-for-dynamic-visual
Repo
Framework
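
The abstract mentions an asynchronous protocol with a heartbeat signal but gives no details. The sketch below shows one common way such a heartbeat can be used on an edge device: cloud commands are trusted only while heartbeats are fresh, and a local command is used otherwise. The class `HeartbeatWatchdog`, the function `select_command`, the timeout value, and the fallback policy are all assumptions made for illustration, not the authors' design.

```python
from typing import Optional


class HeartbeatWatchdog:
    """Edge-side monitor that tracks when the last cloud heartbeat arrived."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_beat: Optional[float] = None

    def beat(self, now: float) -> None:
        # Called whenever a heartbeat message is received from the cloud.
        self.last_beat = now

    def cloud_alive(self, now: float) -> bool:
        # The link counts as alive only if a beat arrived within the timeout.
        return self.last_beat is not None and (now - self.last_beat) <= self.timeout_s


def select_command(watchdog: HeartbeatWatchdog, cloud_cmd: str, local_cmd: str, now: float) -> str:
    """Use the cloud command while the link is alive, else fall back to the local one."""
    return cloud_cmd if watchdog.cloud_alive(now) else local_cmd


wd = HeartbeatWatchdog(timeout_s=0.5)
print(select_command(wd, "move", "hold", now=0.0))  # no heartbeat seen yet
wd.beat(now=1.0)
print(select_command(wd, "move", "hold", now=1.2))  # heartbeat is fresh
print(select_command(wd, "move", "hold", now=2.0))  # heartbeat is stale
```

Passing the time in explicitly keeps the sketch deterministic; a real device would read a monotonic clock instead.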

High-resolution medical image synthesis using progressively grown generative adversarial networks

Title High-resolution medical image synthesis using progressively grown generative adversarial networks
Authors Andrew Beers, James Brown, Ken Chang, J. Peter Campbell, Susan Ostmo, Michael F. Chiang, Jayashree Kalpathy-Cramer
Abstract Generative adversarial networks (GANs) are a class of unsupervised machine learning algorithms that can produce realistic images from randomly-sampled vectors in a multi-dimensional space. Until recently, it was not possible to generate realistic high-resolution images using GANs, which has limited their applicability to medical images that contain biomarkers only detectable at native resolution. Progressive growing of GANs is an approach wherein an image generator is trained to initially synthesize low resolution synthetic images (8x8 pixels), which are then fed to a discriminator that distinguishes these synthetic images from real downsampled images. Additional convolutional layers are then iteratively introduced to produce images at twice the previous resolution until the desired resolution is reached. In this work, we demonstrate that this approach can produce realistic medical images in two different domains: fundus photographs exhibiting vascular pathology associated with retinopathy of prematurity (ROP), and multi-modal magnetic resonance images of glioma. We also show that fine-grained details associated with pathology, such as retinal vessels or tumor heterogeneity, can be preserved and enhanced by including segmentation maps as additional channels. We envisage several applications of the approach, including image augmentation and unsupervised classification of pathology.
Tasks Image Augmentation, Image Generation
Published 2018-05-08
URL http://arxiv.org/abs/1805.03144v2
PDF http://arxiv.org/pdf/1805.03144v2.pdf
PWC https://paperswithcode.com/paper/high-resolution-medical-image-synthesis-using
Repo
Framework
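
The growth rule in the abstract (start at 8x8 and double the resolution until the target is reached) can be written down as a short schedule. This is only a sketch of that rule: the function name `resolution_schedule` and the target of 512 are hypothetical, and no network layers are built here.

```python
from typing import List


def resolution_schedule(target: int, start: int = 8) -> List[int]:
    """List the square image sizes visited when doubling from start up to target."""
    ratio = target // start
    # The target must be start times a power of two for doubling to land on it.
    if target < start or target % start != 0 or ratio & (ratio - 1) != 0:
        raise ValueError("target must equal start times a power of two")
    sizes = [start]
    while sizes[-1] < target:
        sizes.append(sizes[-1] * 2)
    return sizes


print(resolution_schedule(512))
```

Each entry in the returned list would correspond to one training stage, with new convolutional layers added to the generator and discriminator when moving to the next size.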

Segmentation Analysis in Human Centric Cyber-Physical Systems using Graphical Lasso

Title Segmentation Analysis in Human Centric Cyber-Physical Systems using Graphical Lasso
Authors Hari Prasanna Das, Ioannis C. Konstantakopoulos, Aummul Baneen Manasawala, Tanya Veeravalli, Huihan Liu, Costas J. Spanos
Abstract A generalized gamification framework is introduced as a form of smart infrastructure with the potential to improve sustainability and energy efficiency by leveraging a human-in-the-loop strategy. The proposed framework enables a Human-Centric Cyber-Physical System using an interface that allows building managers to interact with occupants. The interface is designed for occupant engagement and integration, supporting the learning of occupants' preferences over resources as well as an understanding of how those preferences change as a function of external stimuli such as physical control, time, or incentives. Towards intelligent and autonomous incentive design, a novel statistical learning algorithm that segments occupants' energy usage behavior is proposed. We apply the proposed algorithm, Graphical Lasso, to the occupants' energy resource usage data to obtain feature correlations and dependencies. Segmentation analysis results in characteristic clusters demonstrating different energy usage behaviors. The features and factors characterizing human decision-making are made explainable.
Tasks Decision Making
Published 2018-10-24
URL http://arxiv.org/abs/1810.10533v2
PDF http://arxiv.org/pdf/1810.10533v2.pdf
PWC https://paperswithcode.com/paper/segmentation-analysis-in-human-centric-cyber
Repo
Framework

LGLG-WPCA: An Effective Texture-based Method for Face Recognition

Title LGLG-WPCA: An Effective Texture-based Method for Face Recognition
Authors Chaorong Li, Huang Wei, Huafu Chen
Abstract In this paper, we propose an effective face feature extraction method based on Learning Gabor Log-Euclidean Gaussian with Whitening Principal Component Analysis (WPCA), called LGLG-WPCA. The proposed method learns face features from the embedded multivariate Gaussian in the Gabor wavelet domain; it is robust to adverse conditions such as varying poses, skin aging, and uneven illumination. Because the space of Gaussians is a Riemannian manifold, it is difficult to incorporate a learning mechanism in the model. To address this issue, we use L2EMG to map the multidimensional Gaussian model to a linear space, and then use WPCA to learn face features. We also implemented a key-point-based version of LGLG-WPCA, called LGLG(KP)-WPCA. Experiments show that the proposed methods are effective and promising for face texture feature extraction, and that combining their features with the features of a Deep Convolutional Network (DCNN) achieved the best recognition accuracies on the FERET database compared to the state-of-the-art methods. In the next version of this paper, we will test the performance of the proposed methods on databases with large pose variations.
Tasks Face Recognition
Published 2018-11-20
URL https://arxiv.org/abs/1811.08345v4
PDF https://arxiv.org/pdf/1811.08345v4.pdf
PWC https://paperswithcode.com/paper/lglg-wpca-an-effective-texture-based-method
Repo
Framework

Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning

Title Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning
Authors Guillaume Rabusseau, Tianyu Li, Doina Precup
Abstract In this paper, we unravel a fundamental connection between weighted finite automata (WFAs) and second-order recurrent neural networks (2-RNNs): in the case of sequences of discrete symbols, WFAs and 2-RNNs with linear activation functions are expressively equivalent. Motivated by this result, we build upon a recent extension of the spectral learning algorithm to vector-valued WFAs and propose the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors. This algorithm relies on estimating low rank sub-blocks of the so-called Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performances of the proposed method are assessed in a simulation study.
Tasks
Published 2018-07-04
URL http://arxiv.org/abs/1807.01406v2
PDF http://arxiv.org/pdf/1807.01406v2.pdf
PWC https://paperswithcode.com/paper/connecting-weighted-automata-and-recurrent
Repo
Framework
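
The equivalence stated in the abstract can be checked numerically on a toy example. The sketch below evaluates a small weighted automaton as a product of transition matrices, then evaluates a linear second-order recurrence whose weight tensor stacks those matrices and whose inputs are one-hot encodings of the symbols. The automaton weights and all function names are hypothetical; the construction follows the usual textbook definitions and is not taken from the paper's code.

```python
def wfa_value(alpha, mats, omega, word):
    """Compute alpha^T A_{x1} ... A_{xk} omega for a word of symbol indices."""
    h = list(alpha)
    for sym in word:
        a = mats[sym]
        h = [sum(h[i] * a[i][j] for i in range(len(h))) for j in range(len(a[0]))]
    return sum(hj * wj for hj, wj in zip(h, omega))


def linear_2rnn_value(h0, tensor, omega, inputs):
    """Linear 2-RNN: h_t[j] = sum_{i,s} tensor[i][s][j] * h_{t-1}[i] * x_t[s]."""
    h = list(h0)
    for x in inputs:
        n = len(h)
        h = [sum(tensor[i][s][j] * h[i] * x[s] for i in range(n) for s in range(len(x)))
             for j in range(n)]
    return sum(hj * wj for hj, wj in zip(h, omega))


def one_hot(sym, size):
    return [1.0 if k == sym else 0.0 for k in range(size)]


# Toy two-state automaton over the alphabet {0, 1}; the weights are made up.
alpha = [1.0, 0.0]
omega = [0.0, 1.0]
mats = [
    [[0.5, 0.5], [0.0, 1.0]],  # transition matrix for symbol 0
    [[1.0, 0.0], [0.5, 0.5]],  # transition matrix for symbol 1
]
# Stack the matrices into a third-order tensor: tensor[i][s][j] = mats[s][i][j].
tensor = [[mats[s][i] for s in range(2)] for i in range(2)]

word = [0, 1]
print(wfa_value(alpha, mats, omega, word))
print(linear_2rnn_value(alpha, tensor, omega, [one_hot(s, 2) for s in word]))
```

Both calls print the same value because a one-hot input selects exactly one slice of the tensor at each step, which reduces the bilinear update to the matrix product used by the automaton.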

WaveCycleGAN: Synthetic-to-natural speech waveform conversion using cycle-consistent adversarial networks

Title WaveCycleGAN: Synthetic-to-natural speech waveform conversion using cycle-consistent adversarial networks
Authors Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Hirokazu Kameoka
Abstract We propose a learning-based filter that allows us to directly modify a synthetic speech waveform into a natural speech waveform. Speech-processing systems using a vocoder framework, such as statistical parametric speech synthesis and voice conversion, are convenient especially when only a limited amount of data is available, because it is possible to represent and process interpretable acoustic features over a compact space, such as the fundamental frequency (F0) and mel-cepstrum. However, a well-known problem that leads to quality degradation of the generated speech is an over-smoothing effect that eliminates some detailed structure of the generated/converted acoustic features. To address this issue, we propose a synthetic-to-natural speech waveform conversion technique that uses cycle-consistent adversarial networks and does not require any explicit assumption about the speech waveform in adversarial learning. In contrast to current techniques, since our modification is performed at the waveform level, we expect that the proposed method will also make it possible to generate 'vocoder-less' sounding speech even if the input speech is synthesized using a vocoder framework. The experimental results demonstrate that our proposed method can 1) alleviate the over-smoothing effect of the acoustic features despite the direct modification method used for the waveform and 2) greatly improve the naturalness of the generated speech sounds.
Tasks Speech Synthesis, Voice Conversion
Published 2018-09-25
URL http://arxiv.org/abs/1809.10288v2
PDF http://arxiv.org/pdf/1809.10288v2.pdf
PWC https://paperswithcode.com/paper/wavecyclegan-synthetic-to-natural-speech
Repo
Framework

MVPNet: Multi-View Point Regression Networks for 3D Object Reconstruction from A Single Image

Title MVPNet: Multi-View Point Regression Networks for 3D Object Reconstruction from A Single Image
Authors Jinglu Wang, Bo Sun, Yan Lu
Abstract In this paper, we address the problem of reconstructing an object's surface from a single image using generative networks. First, we represent a 3D surface with an aggregation of dense point clouds from multiple views. Each point cloud is embedded in a regular 2D grid aligned on an image plane of a viewpoint, making the point cloud convolution-favored and ordered so as to fit into deep network architectures. The point clouds can be easily triangulated by exploiting the connectivity of the 2D grids to form mesh-based surfaces. Second, we propose an encoder-decoder network that generates such multiple view-dependent point clouds from a single image by regressing their 3D coordinates and visibilities. We also introduce a novel geometric loss that is able to interpret discrepancy over 3D surfaces as opposed to 2D projective planes, resorting to the surface discretization on the constructed meshes. We demonstrate that the multi-view point regression network outperforms state-of-the-art methods with a significant improvement on challenging datasets.
Tasks 3D Object Reconstruction, 3D Object Reconstruction From A Single Image, Object Reconstruction
Published 2018-11-23
URL http://arxiv.org/abs/1811.09410v1
PDF http://arxiv.org/pdf/1811.09410v1.pdf
PWC https://paperswithcode.com/paper/mvpnet-multi-view-point-regression-networks
Repo
Framework
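
The abstract notes that a point cloud stored on a regular 2D grid can be triangulated directly from the grid connectivity. The sketch below shows one simple way to do that: each grid cell is split into two triangles, and a triangle is kept only if all three of its corners are marked visible. The function name `triangulate_grid`, the diagonal used to split each cell, and the visibility rule are assumptions for illustration and may differ from the paper's procedure.

```python
from typing import List, Tuple


def triangulate_grid(visible: List[List[bool]]) -> List[Tuple[int, int, int]]:
    """Return triangles as index triples into the row-major flattened grid."""
    rows, cols = len(visible), len(visible[0])
    triangles = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            # Corners of the current cell as (row, col) pairs.
            top_left, top_right = (r, c), (r, c + 1)
            bottom_left, bottom_right = (r + 1, c), (r + 1, c + 1)
            for tri in ((top_left, top_right, bottom_left),
                        (top_right, bottom_right, bottom_left)):
                # Skip triangles that touch a point predicted as not visible.
                if all(visible[y][x] for y, x in tri):
                    triangles.append(tuple(y * cols + x for y, x in tri))
    return triangles


mask = [[True, True],
        [True, False]]
print(triangulate_grid(mask))
```

Because the connectivity comes from the grid alone, the regressed 3D coordinates never need to be searched for neighbors; the indices returned here would simply be looked up in the flattened coordinate array.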

Near-drowning Early Prediction Technique Using Novel Equations (NEPTUNE) for Swimming Pools

Title Near-drowning Early Prediction Technique Using Novel Equations (NEPTUNE) for Swimming Pools
Authors Bhaskaran David Prakash
Abstract Safety is a critical aspect in all swimming pools. This paper describes a near-drowning early prediction technique using novel equations (NEPTUNE). NEPTUNE uses equations or rules that would be able to detect near-drowning using at least 1 but not more than 5 seconds of video sequence with no false positives. The backbone of NEPTUNE encompasses a mix of statistical image processing to merge images for a video sequence, followed by K-means clustering to extract segments in the merged image, and finally a revisit to statistical image processing to derive variables for every segment. These variables would be used by the equations to identify near-drowning. NEPTUNE has the potential to be integrated into a swimming pool camera system that would send an alarm to the lifeguards for early response so that the likelihood of recovery is high.
Tasks
Published 2018-05-07
URL http://arxiv.org/abs/1805.02530v3
PDF http://arxiv.org/pdf/1805.02530v3.pdf
PWC https://paperswithcode.com/paper/near-drowning-early-prediction-technique
Repo
Framework

Dynamic Social Interaction Mechanics CrossAnt

Title Dynamic Social Interaction Mechanics CrossAnt
Authors Samuel Gomes, Carlos Martinho, João Dias
Abstract Nowadays, considerable effort is being put into studying gamification and how game elements can be used to engage players. In this scope, we believe there is a growing need to explore the impact game mechanics have on players' interactions and perception. This work focuses on the application of game mechanics to lead players to achieve certain types of social interaction (we named this type of mechanics social interaction mechanics). A word matching game called CrossAnt was modified so that it could dynamically generate different social interaction mechanics. These mechanics consisted of different key combinations needed to play the game and were aimed at promoting what we think are three important types of social interaction: cooperation, competition, and individual exploration. Our evaluation consisted of several sessions in which two players interacted with the game for several levels and had to find out for themselves how to perform the actions needed to succeed. While some of the levels required input from both players in order to be completed, others could be completed by each player independently. Our results show that cooperation was perceived when both players had to intervene to perform the game actions. However, longer interactions may still be needed so that the other types of interaction are promoted.
Tasks
Published 2018-11-17
URL http://arxiv.org/abs/1811.07243v2
PDF http://arxiv.org/pdf/1811.07243v2.pdf
PWC https://paperswithcode.com/paper/dynamic-interaction-mechanics-crossant
Repo
Framework

LDOP: Local Directional Order Pattern for Robust Face Retrieval

Title LDOP: Local Directional Order Pattern for Robust Face Retrieval
Authors Shiv Ram Dubey, Snehasis Mukherjee
Abstract Local descriptors have gained a wide range of attention due to their enhanced discriminative abilities. It has been shown that considering a multi-scale local neighborhood improves the performance of the descriptor, though at the cost of increased dimension. This paper proposes a novel method to construct a local descriptor using a multi-scale neighborhood by finding the local directional order among the intensity values at different scales in a particular direction. Local directional order is the multi-radius relationship factor in a particular direction. The proposed local directional order pattern (LDOP) for a particular pixel is computed by finding the relationship between the center pixel and the local directional order indexes. This requires transforming the center value into the range of the neighboring orders. Finally, the histogram of LDOP is computed over the whole image to construct the descriptor. In contrast to the state-of-the-art descriptors, the dimension of the proposed descriptor does not depend upon the number of neighbors involved in computing the order; it only depends upon the number of directions. The introduced descriptor is evaluated over the image retrieval framework and compared with the state-of-the-art descriptors over challenging face databases such as PaSC, LFW, PubFig, FERET, AR, AT&T, and ExtendedYale. The experimental results confirm the superiority and robustness of the LDOP descriptor.
Tasks Image Retrieval
Published 2018-02-28
URL https://arxiv.org/abs/1803.07441v3
PDF https://arxiv.org/pdf/1803.07441v3.pdf
PWC https://paperswithcode.com/paper/ldop-local-directional-order-pattern-for
Repo
Framework

Classification of EEG Signal based on non-Gaussian Neutral Vector

Title Classification of EEG Signal based on non-Gaussian Neutral Vector
Authors Zhanyu Ma
Abstract In the design of brain-computer interface systems, classification of Electroencephalogram (EEG) signals is an essential part and a challenging task. Recently, because marginalized discrete wavelet transform (mDWT) representations can reveal features related to the transient nature of EEG signals, mDWT coefficients have been frequently used in EEG signal classification. In our previous work, we proposed a super-Dirichlet distribution-based classifier, which utilized the nonnegative and sum-to-one properties of the mDWT coefficients. The proposed classifier performed better than the state-of-the-art support vector machine-based classifier. In this paper, we further study the neutrality of the mDWT coefficients. Assuming the mDWT coefficient vector to be a neutral vector, we transform it non-linearly into a set of independent scalar coefficients. A feature selection strategy is proposed on the transformed feature domain. Experimental results show that the feature selection strategy helps improve the classification accuracy.
Tasks EEG, Feature Selection
Published 2018-08-02
URL https://arxiv.org/abs/1808.00814v2
PDF https://arxiv.org/pdf/1808.00814v2.pdf
PWC https://paperswithcode.com/paper/classification-of-eeg-signal-based-on-non
Repo
Framework
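
The non-linear transformation mentioned in the abstract is not spelled out there. One classical transform with the stated effect, for a nonnegative vector that sums to one, divides each entry by the mass not yet used by earlier entries; for a completely neutral vector the resulting ratios are mutually independent by definition. The sketch below implements that transform and its inverse. Whether the paper uses exactly this form is an assumption, and the function names and sample vector are hypothetical.

```python
from typing import List


def neutral_transform(x: List[float]) -> List[float]:
    """Map x (nonnegative, sums to one) to ratios u_k = x_k / (1 - sum of earlier x_j)."""
    remaining = 1.0
    ratios = []
    for xk in x[:-1]:  # the last ratio is always 1, so it carries no information
        if remaining <= 0.0:
            raise ValueError("no mass left before reaching the last coefficient")
        ratios.append(xk / remaining)
        remaining -= xk
    return ratios


def inverse_neutral_transform(u: List[float]) -> List[float]:
    """Rebuild the sum-to-one vector from its ratios."""
    remaining = 1.0
    x = []
    for uk in u:
        xk = uk * remaining
        x.append(xk)
        remaining -= xk
    x.append(remaining)  # whatever mass is left belongs to the last entry
    return x


coeffs = [0.5, 0.25, 0.125, 0.125]
ratios = neutral_transform(coeffs)
print(ratios)
print(inverse_neutral_transform(ratios))
```

The transform removes the sum-to-one coupling between entries, so per-feature selection can be applied to the ratios without the constraint tying the features together.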

Sources of Complexity in Semantic Frame Parsing for Information Extraction

Title Sources of Complexity in Semantic Frame Parsing for Information Extraction
Authors Gabriel Marzinotto, Frédéric Béchet, Géraldine Damnati, Alexis Nasr
Abstract This paper describes a Semantic Frame parsing System based on sequence labeling methods, precisely BiLSTM models with highway connections, for performing information extraction on a corpus of French encyclopedic history texts annotated according to the Berkeley FrameNet formalism. The approach proposed in this study relies on an integrated sequence labeling model which jointly optimizes frame identification and semantic role segmentation and identification. The purpose of this study is to analyze the task complexity, to highlight the factors that make Semantic Frame parsing a difficult task and to provide detailed evaluations of the performance on different types of frames and sentences.
Tasks
Published 2018-12-21
URL http://arxiv.org/abs/1812.09193v1
PDF http://arxiv.org/pdf/1812.09193v1.pdf
PWC https://paperswithcode.com/paper/sources-of-complexity-in-semantic-frame
Repo
Framework