January 30, 2020

Paper Group ANR 414

Analyzing Periodicity and Saliency for Adult Video Detection. A Document-grounded Matching Network for Response Selection in Retrieval-based Chatbots. Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models. A Hybrid Cooperative Co-evolution Algorithm Framework for Opti …

Analyzing Periodicity and Saliency for Adult Video Detection

Title Analyzing Periodicity and Saliency for Adult Video Detection
Authors Yizhi Liu, Xiaoyan Gu, Lei Huang, Junlin Ouyang, Miao Liao, Liangran Wu
Abstract Content-based adult video detection plays an important role in preventing pornography. However, existing methods usually rely on a single modality and seldom focus on multi-modality semantics representation. To address this problem, we put forward an approach of analyzing periodicity and saliency for adult video detection. First, periodic patterns and salient regions are analyzed in audio frames and visual frames, respectively. Next, the multi-modal co-occurrence semantics is described by combining audio periodicity with visual saliency. Moreover, the performance of our approach is evaluated step by step. Experimental results show that our approach clearly outperforms some state-of-the-art methods.
Tasks
Published 2019-01-11
URL http://arxiv.org/abs/1901.03462v1
PDF http://arxiv.org/pdf/1901.03462v1.pdf
PWC https://paperswithcode.com/paper/analyzing-periodicity-and-saliency-for-adult
Repo
Framework

A Document-grounded Matching Network for Response Selection in Retrieval-based Chatbots

Title A Document-grounded Matching Network for Response Selection in Retrieval-based Chatbots
Authors Xueliang Zhao, Chongyang Tao, Wei Wu, Can Xu, Dongyan Zhao, Rui Yan
Abstract We present a document-grounded matching network (DGMN) for response selection that can power a knowledge-aware retrieval-based chatbot system. The challenges of building such a model lie in how to ground conversation contexts with background documents and how to recognize important information in the documents for matching. To overcome these challenges, DGMN fuses information in a document and a context into representations of each other, and dynamically determines whether grounding is necessary and how important different parts of the document and the context are, through hierarchical interaction with a response at the matching step. Empirical studies on two public data sets indicate that DGMN can significantly improve upon state-of-the-art methods while also offering good interpretability.
Tasks Chatbot
Published 2019-06-11
URL https://arxiv.org/abs/1906.04362v1
PDF https://arxiv.org/pdf/1906.04362v1.pdf
PWC https://paperswithcode.com/paper/a-document-grounded-matching-network-for
Repo
Framework

Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models

Title Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models
Authors Kuan-Yu Chen, Che-Ping Tsai, Da-Rong Liu, Hung-Yi Lee, Lin-shan Lee
Abstract Producing a large annotated speech corpus for training ASR systems remains difficult for the more than 95% of the world's languages that are low-resourced, but collecting a relatively big unlabeled data set for such languages is more achievable. This is why some initial efforts have been reported on completely unsupervised speech recognition learned from unlabeled data only, although with relatively high error rates. In this paper, we develop a Generative Adversarial Network (GAN) to achieve this purpose, in which a Generator and a Discriminator learn from each other iteratively to improve the performance. We further use a set of Hidden Markov Models (HMMs) iteratively refined from the machine-generated labels to work in harmony with the GAN. The initial experiments on the TIMIT data set achieve a phone error rate of 33.1%, which is 8.5% lower than the previous state-of-the-art.
Tasks Speech Recognition
Published 2019-04-08
URL https://arxiv.org/abs/1904.04100v3
PDF https://arxiv.org/pdf/1904.04100v3.pdf
PWC https://paperswithcode.com/paper/completely-unsupervised-phoneme-recognition-1
Repo
Framework

A Hybrid Cooperative Co-evolution Algorithm Framework for Optimising Power Take Off and Placements of Wave Energy Converters

Title A Hybrid Cooperative Co-evolution Algorithm Framework for Optimising Power Take Off and Placements of Wave Energy Converters
Authors Mehdi Neshat, Bradley Alexander, Markus Wagner
Abstract Wave energy technologies have the potential to play a significant role in the supply of renewable energy on a world scale. One of the most promising designs for wave energy converters (WECs) is the fully submerged buoy. In this work, we explore the optimisation of WEC arrays consisting of a three-tether buoy model called CETO. Such arrays can be optimised for total energy output by adjusting both the relative positions of buoys in farms and the power-take-off (PTO) parameters for each buoy. The search space for these parameters is complex and multi-modal. Moreover, the evaluation of each parameter setting is computationally expensive, limiting the number of full model evaluations that can be made. To handle this problem, we propose a new hybrid cooperative co-evolution algorithm (HCCA). HCCA consists of a symmetric local search plus Nelder-Mead and a cooperative co-evolution algorithm (CC) with a backtracking strategy for optimising the positions and PTO settings of WECs, respectively. Moreover, a new adaptive scenario is proposed for tuning the hyper-parameter of the grey wolf optimiser (AGWO). AGWO participates notably with other applied optimisers in HCCA. To assess the effectiveness of the proposed approach, five popular Evolutionary Algorithms (EAs), four alternating optimisation methods and two modern hybrid ideas (LS-NM and SLS-NM-B) are carefully compared in four real wave situations (Adelaide, Tasmania, Sydney and Perth) with two wave farm sizes (4 and 16). According to the experimental outcomes, the hybrid cooperative framework exhibits better performance in terms of both runtime and quality of obtained solutions.
Tasks
Published 2019-10-03
URL https://arxiv.org/abs/1910.01280v1
PDF https://arxiv.org/pdf/1910.01280v1.pdf
PWC https://paperswithcode.com/paper/a-hybrid-cooperative-co-evolution-algorithm
Repo
Framework

Bilingual Lexicon Induction through Unsupervised Machine Translation

Title Bilingual Lexicon Induction through Unsupervised Machine Translation
Authors Mikel Artetxe, Gorka Labaka, Eneko Agirre
Abstract A recent research line has obtained strong results on bilingual lexicon induction by aligning independently trained word embeddings in two languages and using the resulting cross-lingual embeddings to induce word translation pairs through nearest neighbor or related retrieval methods. In this paper, we propose an alternative approach to this problem that builds on the recent work on unsupervised machine translation. This way, instead of directly inducing a bilingual lexicon from cross-lingual embeddings, we use them to build a phrase-table, combine it with a language model, and use the resulting machine translation system to generate a synthetic parallel corpus, from which we extract the bilingual lexicon using statistical word alignment techniques. As such, our method can work with any word embedding and cross-lingual mapping technique, and it does not require any additional resource besides the monolingual corpus used to train the embeddings. When evaluated on the exact same cross-lingual embeddings, our proposed method obtains an average improvement of 6 accuracy points over nearest neighbor and 4 points over CSLS retrieval, establishing a new state-of-the-art in the standard MUSE dataset.
Tasks Language Modelling, Machine Translation, Unsupervised Machine Translation, Word Alignment, Word Embeddings
Published 2019-07-24
URL https://arxiv.org/abs/1907.10761v1
PDF https://arxiv.org/pdf/1907.10761v1.pdf
PWC https://paperswithcode.com/paper/bilingual-lexicon-induction-through
Repo
Framework
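
The abstract above compares against nearest neighbor and CSLS retrieval. As a rough illustration of those two baselines (not of the paper's proposed translation-based method), the sketch below rescores a cosine similarity matrix with the standard CSLS formula. The function names and the toy similarity values are hypothetical, and the code assumes `k` is no larger than the number of words on either side.

```python
def top_k_mean(values, k):
    """Mean of the k largest values in a sequence (assumes k <= len(values))."""
    return sum(sorted(values, reverse=True)[:k]) / k


def csls_scores(sim, k):
    """Rescore a source-by-target cosine similarity matrix with CSLS.

    csls[i][j] = 2 * sim[i][j] - r_src[i] - r_tgt[j], where r_src[i] is the
    mean similarity of source word i to its k nearest targets and r_tgt[j]
    is the mean similarity of target word j to its k nearest sources.
    """
    r_src = [top_k_mean(row, k) for row in sim]
    r_tgt = [top_k_mean(col, k) for col in zip(*sim)]
    return [
        [2 * s - r_src[i] - r_tgt[j] for j, s in enumerate(row)]
        for i, row in enumerate(sim)
    ]


def argmax_rows(matrix):
    """Index of the best-scoring target for every source word."""
    return [max(range(len(row)), key=row.__getitem__) for row in matrix]


# Hypothetical toy similarities: target 1 is a "hub" close to both sources.
sim = [[0.6, 0.7],
       [0.1, 0.7]]
nn_choice = argmax_rows(sim)                      # plain nearest neighbor
csls_choice = argmax_rows(csls_scores(sim, k=2))  # hub-penalised retrieval
```

In this toy case plain nearest neighbor maps both source words to the hub, while CSLS penalises the hub enough for source word 0 to pick target 0 instead.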

Generating the support with extreme value losses

Title Generating the support with extreme value losses
Authors Nicholas Guttenberg
Abstract When optimizing against the mean loss over a distribution of predictions in the context of a regression task, then even if there is a distribution of targets the optimal prediction distribution is always a delta function at a single value. Methods of constructing generative models need to overcome this tendency. We consider a simple method of summarizing the prediction error, such that the optimal strategy corresponds to outputting a distribution of predictions with a support that matches the support of the distribution of targets — optimizing against the minimal value of the loss given a set of samples from the prediction distribution, rather than the mean. We show that models trained against this loss learn to capture the support of the target distribution and, when combined with an auxiliary classifier-like prediction task, can be projected via rejection sampling to reproduce the full distribution of targets. The resulting method works well compared to other generative modeling approaches particularly in low dimensional spaces with highly non-trivial distributions, due to mode collapse solutions being globally suboptimal with respect to the extreme value loss. However, the method is less suited to high-dimensional spaces such as images due to the scaling of the number of samples needed in order to accurately estimate the extreme value loss when the dimension of the data manifold becomes large.
Tasks
Published 2019-02-08
URL http://arxiv.org/abs/1902.02940v1
PDF http://arxiv.org/pdf/1902.02940v1.pdf
PWC https://paperswithcode.com/paper/generating-the-support-with-extreme-value
Repo
Framework
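
The contrast drawn in the abstract above, between optimizing the mean loss and optimizing the minimal loss over a set of predicted samples, can be made concrete with a small sketch. It assumes a squared-error per-sample loss and a hypothetical two-valued target distribution; the function names are illustrative and are not taken from the paper.

```python
def squared_error(pred, target):
    return (pred - target) ** 2


def mean_loss(preds, target):
    """Average loss over all samples drawn from the prediction distribution."""
    return sum(squared_error(p, target) for p in preds) / len(preds)


def extreme_value_loss(preds, target):
    """Loss of the single best sample: the minimum over the prediction set."""
    return min(squared_error(p, target) for p in preds)


def average_over_targets(loss_fn, preds, targets):
    """Expected loss when the target is drawn uniformly from `targets`."""
    return sum(loss_fn(preds, t) for t in targets) / len(targets)


# Hypothetical bimodal targets and two candidate prediction sets.
targets = [-1.0, 1.0]
collapsed = [0.0, 0.0]   # every sample sits at the mean of the targets
spread = [-1.0, 1.0]     # samples cover the support of the targets

mean_collapsed = average_over_targets(mean_loss, collapsed, targets)
mean_spread = average_over_targets(mean_loss, spread, targets)
min_collapsed = average_over_targets(extreme_value_loss, collapsed, targets)
min_spread = average_over_targets(extreme_value_loss, spread, targets)
```

Under the mean loss the collapsed predictions score better (1.0 versus 2.0), whereas under the minimum-over-samples loss the spread predictions score better (0.0 versus 1.0), which is the behaviour the abstract describes.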

Differentiable programming and its applications to dynamical systems

Title Differentiable programming and its applications to dynamical systems
Authors Adrián Hernández, José M. Amigó
Abstract Differentiable programming is the combination of classical neural network modules with algorithmic ones in an end-to-end differentiable model. These new models, which use automatic differentiation to calculate gradients, have new learning capabilities (reasoning, attention and memory). In this tutorial, aimed at researchers in nonlinear systems with prior knowledge of deep learning, we present this new programming paradigm, describe some of its new features such as attention mechanisms, and highlight the benefits they bring. Then, we analyse the uses and limitations of traditional deep learning models in the modeling and prediction of dynamical systems. Here, a dynamical system is meant to be a set of state variables that evolve in time under general internal and external interactions. Finally, we review the advantages and applications of differentiable programming to dynamical systems.
Tasks
Published 2019-12-17
URL https://arxiv.org/abs/1912.08168v1
PDF https://arxiv.org/pdf/1912.08168v1.pdf
PWC https://paperswithcode.com/paper/differentiable-programming-and-its
Repo
Framework

Locomotion and gesture tracking in mice and small animals for neuroscience applications: A survey

Title Locomotion and gesture tracking in mice and small animals for neuroscience applications: A survey
Authors Waseem Abbas, David Masip Rodo
Abstract Neuroscience has traditionally relied on manually observing lab animals in controlled environments. Researchers usually record animals behaving in a free or restrained manner and then annotate the data manually. Manual annotation is undesirable for three reasons: first, it is time-consuming; second, it is prone to human error; and third, no two human annotators will agree 100% on an annotation, so it is not reproducible. Consequently, automated annotation of such data has gained traction because it is efficient and replicable. Usually, the automatic annotation of neuroscience data relies on computer vision and machine learning techniques. In this article, we cover most of the approaches taken by researchers for locomotion and gesture tracking of lab animals. We divide these papers into categories based on the hardware they use and the software approach they take. We also summarize their strengths and weaknesses.
Tasks
Published 2019-03-25
URL http://arxiv.org/abs/1903.10422v1
PDF http://arxiv.org/pdf/1903.10422v1.pdf
PWC https://paperswithcode.com/paper/locomotion-and-gesture-tracking-in-mice-and
Repo
Framework

A Quest for Structure: Jointly Learning the Graph Structure and Semi-Supervised Classification

Title A Quest for Structure: Jointly Learning the Graph Structure and Semi-Supervised Classification
Authors Xuan Wu, Lingxiao Zhao, Leman Akoglu
Abstract Semi-supervised learning (SSL) is effectively used for numerous classification problems, thanks to its ability to make use of abundant unlabeled data. The main assumption of various SSL algorithms is that the nearby points on the data manifold are likely to share a label. Graph-based SSL constructs a graph from point-cloud data as an approximation to the underlying manifold, followed by label inference. It is no surprise that the quality of the constructed graph in capturing the essential structure of the data is critical to the accuracy of the subsequent inference step [6]. How should one construct a graph from the input point-cloud data for graph-based SSL? In this work we introduce a new, parallel graph learning framework (called PG-learn) for the graph construction step of SSL. Our solution has two main ingredients: (1) a gradient-based optimization of the edge weights (more specifically, different kernel bandwidths in each dimension) based on a validation loss function, and (2) a parallel hyperparameter search algorithm with an adaptive resource allocation scheme. In essence, (1) allows us to search around a (random) initial hyperparameter configuration for a better one with lower validation loss. Since the search space of hyperparameters is huge for high-dimensional problems, (2) empowers our gradient-based search to go through as many different initial configurations as possible, where runs for relatively unpromising starting configurations are terminated early to allocate the time for others. As such, PG-learn is a carefully-designed hybrid of random and adaptive search. Through experiments on multi-class classification problems, we show that PG-learn significantly outperforms a variety of existing graph construction schemes in accuracy (per fixed time budget for hyperparameter tuning), and scales more effectively to high dimensional problems.
Tasks graph construction
Published 2019-09-26
URL https://arxiv.org/abs/1909.12385v1
PDF https://arxiv.org/pdf/1909.12385v1.pdf
PWC https://paperswithcode.com/paper/a-quest-for-structure-jointly-learning-the
Repo
Framework
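
The first ingredient in the abstract above tunes edge weights through "different kernel bandwidths in each dimension". A minimal sketch of such a graph construction is shown below, assuming a Gaussian kernel and a dense graph. The kernel form, the function names, and the example points are assumptions for illustration; the gradient-based tuning and the parallel search described in the abstract are not shown.

```python
import math


def edge_weight(x, y, bandwidths):
    """Gaussian kernel similarity with one bandwidth per feature dimension."""
    scaled_sq_dist = sum(
        ((xi - yi) / s) ** 2 for xi, yi, s in zip(x, y, bandwidths)
    )
    return math.exp(-0.5 * scaled_sq_dist)


def build_graph(points, bandwidths):
    """Dense symmetric weight matrix with zeros on the diagonal."""
    n = len(points)
    return [
        [0.0 if i == j else edge_weight(points[i], points[j], bandwidths)
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 2-D points: point 1 differs from point 0 along dimension 0,
# point 2 differs from point 0 along dimension 1.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
isotropic = build_graph(points, bandwidths=[1.0, 1.0])
# A large bandwidth on dimension 1 makes differences along it matter less.
anisotropic = build_graph(points, bandwidths=[1.0, 10.0])
```

With equal bandwidths, points 1 and 2 are equally connected to point 0. Widening the bandwidth of dimension 1 strengthens the edge between points 0 and 2 while leaving the edge between points 0 and 1 unchanged, which is the kind of per-dimension control a learned bandwidth vector provides.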

Deep Convolutional Generative Adversarial Networks Based Flame Detection in Video

Title Deep Convolutional Generative Adversarial Networks Based Flame Detection in Video
Authors Süleyman Aslan, Uğur Güdükbay, B. Uğur Töreyin, A. Enis Çetin
Abstract Real-time flame detection is crucial in video-based surveillance systems. We propose a vision-based method to detect flames using Deep Convolutional Generative Adversarial Neural Networks (DCGANs). Many existing supervised learning approaches using convolutional neural networks do not take temporal information into account and require a substantial amount of labeled data. In order to have a robust representation of sequences with and without flame, we propose a two-stage training of a DCGAN exploiting spatio-temporal flame evolution. Our training framework includes the regular training of a DCGAN with real spatio-temporal images, namely temporal slice images, and noise vectors, followed by training the discriminator separately using the temporal flame images without the generator. Experimental results show that the proposed method effectively detects flame in video in real time with negligible false positive rates.
Tasks
Published 2019-02-05
URL http://arxiv.org/abs/1902.01824v1
PDF http://arxiv.org/pdf/1902.01824v1.pdf
PWC https://paperswithcode.com/paper/deep-convolutional-generative-adversarial-1
Repo
Framework
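
The abstract above trains on "temporal slice images" but does not spell out how they are built. One common construction, assumed here purely for illustration, takes the same pixel column from every frame and stacks those columns along the time axis. The function name and the toy frames are hypothetical.

```python
def temporal_slice(frames, column):
    """Stack one pixel column from each frame into a height-by-time image."""
    height = len(frames[0])
    return [[frame[row][column] for frame in frames] for row in range(height)]


# Three hypothetical 2x3 grayscale frames (rows x columns).
frames = [
    [[0, 1, 2], [3, 4, 5]],
    [[10, 11, 12], [13, 14, 15]],
    [[20, 21, 22], [23, 24, 25]],
]
slice_image = temporal_slice(frames, column=1)
```

Each row of `slice_image` traces one pixel position through time, so flicker along that column shows up as horizontal texture in the slice.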

InSphereNet: a Concise Representation and Classification Method for 3D Object

Title InSphereNet: a Concise Representation and Classification Method for 3D Object
Authors Hui Cao, Haikuan Du, Siyu Zhang, Shen Cai
Abstract In this paper, we present the InSphereNet method for the problem of 3D object classification. Unlike previous methods that use points, voxels, or multi-view images as inputs to a deep neural network (DNN), the proposed method constructs a class of more representative features, named infilling spheres, from the signed distance field (SDF). Because infilling spheres represent space well, we can not only use far fewer spheres to accomplish the classification task, but also design a lightweight InSphereNet with fewer layers and parameters than previous methods. Experiments on ModelNet40 show that the proposed method achieves higher accuracy than PointNet and PointNet++. In particular, with only a few dozen sphere inputs or about 100000 DNN parameters, the accuracy of our method remains at a very high level (over 88%). This further validates the conciseness and effectiveness of the proposed InSphere 3D representation. Keywords: 3D object classification, signed distance field, deep learning, infilling sphere
Tasks 3D Object Classification, Object Classification
Published 2019-12-25
URL https://arxiv.org/abs/1912.11606v2
PDF https://arxiv.org/pdf/1912.11606v2.pdf
PWC https://paperswithcode.com/paper/inspherenet-a-concise-representation-and
Repo
Framework

Addressing the Sim2Real Gap in Robotic 3D Object Classification

Title Addressing the Sim2Real Gap in Robotic 3D Object Classification
Authors Jean-Baptiste Weibel, Timothy Patten, Markus Vincze
Abstract Object classification with 3D data is an essential component of any scene understanding method. It has gained significant interest in a variety of communities, most notably in robotics and computer graphics. While the advent of deep learning has advanced the field of 3D object classification, most work using this data type is evaluated solely on CAD model datasets. Consequently, current work does not address the discrepancies existing between real and artificial data. In this work, we examine this gap in a robotic context by specifically addressing the problem of classification when transferring from artificial CAD models to real reconstructed objects. This is performed by training on ModelNet (CAD models) and evaluating on ScanNet (reconstructed objects). We show that standard methods do not perform well in this task. We thus introduce a method that carefully samples object parts that are reproducible under various transformations and hence robust. Using graph convolution to classify the composed graph of parts, our method significantly improves upon the baseline.
Tasks 3D Object Classification, Object Classification, Scene Understanding
Published 2019-10-28
URL https://arxiv.org/abs/1910.12585v1
PDF https://arxiv.org/pdf/1910.12585v1.pdf
PWC https://paperswithcode.com/paper/addressing-the-sim2real-gap-in-robotic-3d
Repo
Framework

Automated identification of neural cells in the multi-photon images using deep-neural networks

Title Automated identification of neural cells in the multi-photon images using deep-neural networks
Authors Si-Baek Seong, Hae-Jeong Park
Abstract The advancement of neuroscientific imaging techniques has produced an unprecedented volume of neural cell imaging data, which calls for automated processing. In particular, identification of cells from two-photon images demands segmentation of neural cells out of various materials and classification of the segmented cells according to their cell types. To automatically segment neural cells, we used a U-Net model, followed by classification of excitatory neurons, inhibitory neurons and glia cells using a transfer learning technique. For transfer learning, we tested three public models, resnet18, resnet50 and inceptionv3, after replacing the fully connected layer with one for three classes. The best classification performance was found for the model with inceptionv3. The proposed application of deep learning is expected to provide a critical tool for cell identification in the era of big neuroscience data.
Tasks Transfer Learning
Published 2019-09-25
URL https://arxiv.org/abs/1909.11269v1
PDF https://arxiv.org/pdf/1909.11269v1.pdf
PWC https://paperswithcode.com/paper/automated-identification-of-neural-cells-in
Repo
Framework

Indirect Local Attacks for Context-aware Semantic Segmentation Networks

Title Indirect Local Attacks for Context-aware Semantic Segmentation Networks
Authors Krishna Kanth Nakka, Mathieu Salzmann
Abstract Recently, deep networks have achieved impressive semantic segmentation performance, in particular thanks to their use of larger contextual information. In this paper, we show that the resulting networks are sensitive not only to global attacks, where perturbations affect the entire input image, but also to indirect local attacks where perturbations are confined to a small image region that does not overlap with the area that we aim to fool. To this end, we introduce several indirect attack strategies, including adaptive local attacks, aiming to find the best image location to perturb, and universal local attacks. Furthermore, we propose attack detection techniques both for the global image level and to obtain a pixel-wise localization of the fooled regions. Our results are unsettling: Because they exploit a larger context, more accurate semantic segmentation networks are more sensitive to indirect local attacks.
Tasks Semantic Segmentation
Published 2019-11-29
URL https://arxiv.org/abs/1911.13038v2
PDF https://arxiv.org/pdf/1911.13038v2.pdf
PWC https://paperswithcode.com/paper/indirect-local-attacks-for-context-aware
Repo
Framework

Paracoherent Answer Set Semantics meets Argumentation Frameworks

Title Paracoherent Answer Set Semantics meets Argumentation Frameworks
Authors Giovanni Amendola, Francesco Ricca
Abstract In recent years, abstract argumentation has met with great success in AI, since it has served to capture several non-monotonic logics for AI. Relations between argumentation framework (AF) semantics and logic programming semantics are being investigated more and more. In particular, great attention has been given to the well-known stable extensions of an AF, which are closely related to the answer sets of a logic program. However, if a framework admits a small incoherent part, no stable extension can be provided. To overcome this shortcoming, two semantics generalizing stable extensions have been studied, namely semi-stable and stage. In this paper, we show that another perspective is possible on incoherent AFs, called paracoherent extensions, as they have a counterpart in paracoherent answer set semantics. We compare this perspective with semi-stable and stage semantics, showing that computational costs remain unchanged and that an interesting symmetric behaviour is maintained. Under consideration for acceptance in TPLP.
Tasks Abstract Argumentation
Published 2019-07-22
URL https://arxiv.org/abs/1907.09426v1
PDF https://arxiv.org/pdf/1907.09426v1.pdf
PWC https://paperswithcode.com/paper/paracoherent-answer-set-semantics-meets
Repo
Framework