January 30, 2020


Paper Group ANR 414


Analyzing Periodicity and Saliency for Adult Video Detection


Title	Analyzing Periodicity and Saliency for Adult Video Detection
Authors	Yizhi Liu, Xiaoyan Gu, Lei Huang, Junlin Ouyang, Miao Liao, Liangran Wu
Abstract	Content-based adult video detection plays an important role in preventing pornography. However, existing methods usually rely on a single modality and seldom focus on multi-modal semantic representation. To address this problem, we put forward an approach that analyzes periodicity and saliency for adult video detection. First, periodic patterns and salient regions are analyzed in audio frames and visual frames, respectively. Next, the multi-modal co-occurrence semantics is described by combining audio periodicity with visual saliency. Moreover, the performance of our approach is evaluated step by step. Experimental results show that our approach clearly outperforms some state-of-the-art methods.
Tasks
Published	2019-01-11
URL	http://arxiv.org/abs/1901.03462v1
PDF	http://arxiv.org/pdf/1901.03462v1.pdf
PWC	https://paperswithcode.com/paper/analyzing-periodicity-and-saliency-for-adult
Repo
Framework

A Document-grounded Matching Network for Response Selection in Retrieval-based Chatbots


Title	A Document-grounded Matching Network for Response Selection in Retrieval-based Chatbots
Authors	Xueliang Zhao, Chongyang Tao, Wei Wu, Can Xu, Dongyan Zhao, Rui Yan
Abstract	We present a document-grounded matching network (DGMN) for response selection that can power a knowledge-aware retrieval-based chatbot system. The challenges of building such a model lie in how to ground conversation contexts with background documents and how to recognize important information in the documents for matching. To overcome these challenges, DGMN fuses information in a document and a context into representations of each other, and dynamically determines whether grounding is necessary, as well as the importance of different parts of the document and the context, through hierarchical interaction with a response at the matching step. Empirical studies on two public data sets indicate that DGMN can significantly improve upon state-of-the-art methods and at the same time enjoys good interpretability.
Tasks	Chatbot
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04362v1
PDF	https://arxiv.org/pdf/1906.04362v1.pdf
PWC	https://paperswithcode.com/paper/a-document-grounded-matching-network-for
Repo
Framework

Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models


Title	Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models
Authors	Kuan-Yu Chen, Che-Ping Tsai, Da-Rong Liu, Hung-Yi Lee, Lin-shan Lee
Abstract	Producing a large annotated speech corpus for training ASR systems remains difficult for more than 95% of languages all over the world which are low-resourced, but collecting a relatively big unlabeled data set for such languages is more achievable. This is why some initial efforts have been reported on completely unsupervised speech recognition learned from unlabeled data only, although with relatively high error rates. In this paper, we develop a Generative Adversarial Network (GAN) to achieve this purpose, in which a Generator and a Discriminator learn from each other iteratively to improve the performance. We further use a set of Hidden Markov Models (HMMs) iteratively refined from the machine generated labels to work in harmony with the GAN. The initial experiments on the TIMIT data set achieve a phone error rate of 33.1%, which is 8.5% lower than the previous state-of-the-art.
Tasks	Speech Recognition
Published	2019-04-08
URL	https://arxiv.org/abs/1904.04100v3
PDF	https://arxiv.org/pdf/1904.04100v3.pdf
PWC	https://paperswithcode.com/paper/completely-unsupervised-phoneme-recognition-1
Repo
Framework

A Hybrid Cooperative Co-evolution Algorithm Framework for Optimising Power Take Off and Placements of Wave Energy Converters


Title	A Hybrid Cooperative Co-evolution Algorithm Framework for Optimising Power Take Off and Placements of Wave Energy Converters
Authors	Mehdi Neshat, Bradley Alexander, Markus Wagner
Abstract	Wave energy technologies have the potential to play a significant role in the supply of renewable energy on a world scale. One of the most promising designs for wave energy converters (WECs) is the fully submerged buoy. In this work, we explore the optimisation of WEC arrays consisting of a three-tether buoy model called CETO. Such arrays can be optimised for total energy output by adjusting both the relative positions of buoys in farms and also the power-take-off (PTO) parameters for each buoy. The search space for these parameters is complex and multi-modal. Moreover, the evaluation of each parameter setting is computationally expensive, limiting the number of full model evaluations that can be made. To handle this problem, we propose a new hybrid cooperative co-evolution algorithm (HCCA). HCCA consists of a symmetric local search plus Nelder-Mead and a cooperative co-evolution algorithm (CC) with a backtracking strategy for optimising the positions and PTO settings of WECs, respectively. Moreover, a new adaptive scenario is proposed for tuning the hyper-parameters of the grey wolf optimiser (AGWO). AGWO participates notably with other applied optimisers in HCCA. To assess the effectiveness of the proposed approach, five popular Evolutionary Algorithms (EAs), four alternating optimisation methods and two modern hybrid ideas (LS-NM and SLS-NM-B) are carefully compared in four real wave situations (Adelaide, Tasmania, Sydney and Perth) with two wave farm sizes (4 and 16). According to the experimental outcomes, the hybrid cooperative framework exhibits better performance in terms of both runtime and quality of obtained solutions.
Tasks
Published	2019-10-03
URL	https://arxiv.org/abs/1910.01280v1
PDF	https://arxiv.org/pdf/1910.01280v1.pdf
PWC	https://paperswithcode.com/paper/a-hybrid-cooperative-co-evolution-algorithm
Repo
Framework

Bilingual Lexicon Induction through Unsupervised Machine Translation


Title	Bilingual Lexicon Induction through Unsupervised Machine Translation
Authors	Mikel Artetxe, Gorka Labaka, Eneko Agirre
Abstract	A recent research line has obtained strong results on bilingual lexicon induction by aligning independently trained word embeddings in two languages and using the resulting cross-lingual embeddings to induce word translation pairs through nearest neighbor or related retrieval methods. In this paper, we propose an alternative approach to this problem that builds on the recent work on unsupervised machine translation. This way, instead of directly inducing a bilingual lexicon from cross-lingual embeddings, we use them to build a phrase-table, combine it with a language model, and use the resulting machine translation system to generate a synthetic parallel corpus, from which we extract the bilingual lexicon using statistical word alignment techniques. As such, our method can work with any word embedding and cross-lingual mapping technique, and it does not require any additional resource besides the monolingual corpus used to train the embeddings. When evaluated on the exact same cross-lingual embeddings, our proposed method obtains an average improvement of 6 accuracy points over nearest neighbor and 4 points over CSLS retrieval, establishing a new state-of-the-art in the standard MUSE dataset.
Tasks	Language Modelling, Machine Translation, Unsupervised Machine Translation, Word Alignment, Word Embeddings
Published	2019-07-24
URL	https://arxiv.org/abs/1907.10761v1
PDF	https://arxiv.org/pdf/1907.10761v1.pdf
PWC	https://paperswithcode.com/paper/bilingual-lexicon-induction-through
Repo
Framework
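
The abstract above compares against two retrieval baselines, nearest neighbor and CSLS. As a rough illustration of those baselines only (not of the paper's machine-translation pipeline), the sketch below scores a small, made-up cosine-similarity matrix both ways. The matrix values, the function names and the choice of k = 1 are assumptions for illustration; the CSLS formula used is the usual one, 2·cos(x, y) minus the mean similarity of x to its k nearest targets and of y to its k nearest sources.

```python
def mean_top_k(values, k):
    """Mean of the k largest similarities in a list."""
    return sum(sorted(values, reverse=True)[:k]) / k


def csls_scores(sim, k=1):
    """CSLS-adjusted scores for a source-by-target cosine-similarity matrix."""
    n_src, n_tgt = len(sim), len(sim[0])
    # Average similarity of each source word to its k nearest targets.
    r_src = [mean_top_k(sim[i], k) for i in range(n_src)]
    # Average similarity of each target word to its k nearest sources.
    r_tgt = [mean_top_k([sim[i][j] for i in range(n_src)], k) for j in range(n_tgt)]
    return [
        [2 * sim[i][j] - r_src[i] - r_tgt[j] for j in range(n_tgt)]
        for i in range(n_src)
    ]


def retrieve(scores):
    """Index of the best-scoring target for every source row."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Hypothetical similarities: target 0 is a "hub" that is close to both sources.
sim = [
    [0.90, 0.85],
    [0.99, 0.10],
]

nn_pairs = retrieve(sim)                     # plain nearest neighbor
csls_pairs = retrieve(csls_scores(sim, k=1)) # hub-penalised retrieval

print(nn_pairs)    # [0, 0]
print(csls_pairs)  # [1, 0]
```

In this toy case nearest neighbor maps both source words to the hub target, while CSLS penalises the hub and assigns source 0 to target 1.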

Generating the support with extreme value losses


Title	Generating the support with extreme value losses
Authors	Nicholas Guttenberg
Abstract	When optimizing against the mean loss over a distribution of predictions in a regression task, the optimal prediction distribution is always a delta function at a single value, even if there is a distribution of targets. Methods of constructing generative models need to overcome this tendency. We consider a simple method of summarizing the prediction error such that the optimal strategy corresponds to outputting a distribution of predictions with a support that matches the support of the distribution of targets: optimizing against the minimal value of the loss given a set of samples from the prediction distribution, rather than the mean. We show that models trained against this loss learn to capture the support of the target distribution and, when combined with an auxiliary classifier-like prediction task, can be projected via rejection sampling to reproduce the full distribution of targets. The resulting method works well compared to other generative modeling approaches, particularly in low-dimensional spaces with highly non-trivial distributions, because mode collapse solutions are globally suboptimal with respect to the extreme value loss. However, the method is less suited to high-dimensional spaces such as images, because the number of samples needed to accurately estimate the extreme value loss grows as the dimension of the data manifold becomes large.
Tasks
Published	2019-02-08
URL	http://arxiv.org/abs/1902.02940v1
PDF	http://arxiv.org/pdf/1902.02940v1.pdf
PWC	https://paperswithcode.com/paper/generating-the-support-with-extreme-value
Repo
Framework
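
The contrast the abstract above draws between the mean loss and the minimum ("extreme value") loss over a set of prediction samples can be checked on a two-point toy problem. This is a minimal sketch under stated assumptions: squared error as the per-sample loss, targets at -1 and +1 with equal probability, and hypothetical function names; none of these details come from the paper itself.

```python
def mean_loss(samples, target):
    """Average squared error over all prediction samples."""
    return sum((p - target) ** 2 for p in samples) / len(samples)


def min_loss(samples, target):
    """Squared error of the single best prediction sample."""
    return min((p - target) ** 2 for p in samples)


def expected(loss_fn, samples, targets):
    """Average a loss over equally likely targets."""
    return sum(loss_fn(samples, t) for t in targets) / len(targets)


targets = [-1.0, 1.0]    # assumed bimodal target distribution
collapsed = [0.0, 0.0]   # every prediction sample sits at the target mean
spread = [-1.0, 1.0]     # prediction samples cover the target support

print(expected(mean_loss, collapsed, targets))  # 1.0
print(expected(mean_loss, spread, targets))     # 2.0
print(expected(min_loss, collapsed, targets))   # 1.0
print(expected(min_loss, spread, targets))      # 0.0
```

Under the mean loss the collapsed predictor scores better (1.0 versus 2.0), whereas under the minimum loss the predictor that covers both targets scores better (0.0 versus 1.0), which matches the behaviour the abstract describes.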

Differentiable programming and its applications to dynamical systems


Title	Differentiable programming and its applications to dynamical systems
Authors	Adrián Hernández, José M. Amigó
Abstract	Differentiable programming is the combination of classical neural network modules with algorithmic ones in an end-to-end differentiable model. These new models, which use automatic differentiation to calculate gradients, have new learning capabilities (reasoning, attention and memory). In this tutorial, aimed at researchers in nonlinear systems with prior knowledge of deep learning, we present this new programming paradigm, describe some of its new features such as attention mechanisms, and highlight the benefits they bring. Then, we analyse the uses and limitations of traditional deep learning models in the modeling and prediction of dynamical systems. Here, a dynamical system is meant to be a set of state variables that evolve in time under general internal and external interactions. Finally, we review the advantages and applications of differentiable programming to dynamical systems.
Tasks
Published	2019-12-17
URL	https://arxiv.org/abs/1912.08168v1
PDF	https://arxiv.org/pdf/1912.08168v1.pdf
PWC	https://paperswithcode.com/paper/differentiable-programming-and-its
Repo
Framework

Locomotion and gesture tracking in mice and small animals for neurosceince applications: A survey


Title	Locomotion and gesture tracking in mice and small animals for neurosceince applications: A survey
Authors	Waseem Abbas, David Masip Rodo
Abstract	Neuroscience has traditionally relied on manually observing lab animals in controlled environments. Researchers usually record animals behaving in a free or restrained manner and then annotate the data manually. Manual annotation is not desirable for three reasons: first, it is time-consuming; second, it is prone to human error; and third, no two human annotators will agree 100% on an annotation, so it is not reproducible. Consequently, automated annotation of such data has gained traction because it is efficient and replicable. Usually, the automatic annotation of neuroscience data relies on computer vision and machine learning techniques. In this article, we cover most of the approaches taken by researchers for locomotion and gesture tracking of lab animals. We divide these papers into categories based upon the hardware they use and the software approach they take. We also summarize their strengths and weaknesses.
Tasks
Published	2019-03-25
URL	http://arxiv.org/abs/1903.10422v1
PDF	http://arxiv.org/pdf/1903.10422v1.pdf
PWC	https://paperswithcode.com/paper/locomotion-and-gesture-tracking-in-mice-and
Repo
Framework

A Quest for Structure: Jointly Learning the Graph Structure and Semi-Supervised Classification


Title	A Quest for Structure: Jointly Learning the Graph Structure and Semi-Supervised Classification
Authors	Xuan Wu, Lingxiao Zhao, Leman Akoglu
Abstract	Semi-supervised learning (SSL) is effectively used for numerous classification problems, thanks to its ability to make use of abundant unlabeled data. The main assumption of various SSL algorithms is that the nearby points on the data manifold are likely to share a label. Graph-based SSL constructs a graph from point-cloud data as an approximation to the underlying manifold, followed by label inference. It is no surprise that the quality of the constructed graph in capturing the essential structure of the data is critical to the accuracy of the subsequent inference step [6]. How should one construct a graph from the input point-cloud data for graph-based SSL? In this work we introduce a new, parallel graph learning framework (called PG-learn) for the graph construction step of SSL. Our solution has two main ingredients: (1) a gradient-based optimization of the edge weights (more specifically, different kernel bandwidths in each dimension) based on a validation loss function, and (2) a parallel hyperparameter search algorithm with an adaptive resource allocation scheme. In essence, (1) allows us to search around a (random) initial hyperparameter configuration for a better one with lower validation loss. Since the search space of hyperparameters is huge for high-dimensional problems, (2) empowers our gradient-based search to go through as many different initial configurations as possible, where runs for relatively unpromising starting configurations are terminated early to allocate the time for others. As such, PG-learn is a carefully-designed hybrid of random and adaptive search. Through experiments on multi-class classification problems, we show that PG-learn significantly outperforms a variety of existing graph construction schemes in accuracy (per fixed time budget for hyperparameter tuning), and scales more effectively to high dimensional problems.
Tasks	graph construction
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12385v1
PDF	https://arxiv.org/pdf/1909.12385v1.pdf
PWC	https://paperswithcode.com/paper/a-quest-for-structure-jointly-learning-the
Repo
Framework
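
The two ingredients named in the abstract above, gradient-based refinement from random starting configurations and early termination of unpromising runs, can be sketched with a generic successive-halving loop. This is not PG-learn itself: the quadratic `val_loss` is a toy stand-in for the validation loss over per-dimension kernel bandwidths, the halving schedule is an assumed substitute for the paper's adaptive resource allocation, the runs are sequential rather than parallel, and all names and constants are hypothetical.

```python
import random


def val_loss(a):
    """Toy stand-in for a validation loss over per-dimension bandwidths."""
    return sum((x - 1.0) ** 2 for x in a)


def grad(a):
    """Gradient of the toy loss."""
    return [2.0 * (x - 1.0) for x in a]


def gd_steps(a, steps, lr=0.1):
    """Run a few plain gradient-descent steps from configuration a."""
    for _ in range(steps):
        a = [x - lr * g for x, g in zip(a, grad(a))]
    return a


def halving_search(n_configs=8, dim=3, steps_per_round=5, seed=0):
    """Refine random starts by gradient descent, dropping the worse half each round."""
    rng = random.Random(seed)
    configs = [[rng.uniform(-3.0, 3.0) for _ in range(dim)] for _ in range(n_configs)]
    initial_best = min(val_loss(c) for c in configs)
    while len(configs) > 1:
        # Give every surviving configuration the same small budget.
        configs = [gd_steps(c, steps_per_round) for c in configs]
        # Terminate the less promising half early; the budget goes to the rest.
        configs.sort(key=val_loss)
        configs = configs[: max(1, len(configs) // 2)]
    return configs[0], initial_best


best, initial_best = halving_search()
print(val_loss(best) < initial_best)
```

The design point is that cheap partial runs are used to rank starting configurations, so most of the gradient steps are spent on the few that look promising.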

Deep Convolutional Generative Adversarial Networks Based Flame Detection in Video


Title	Deep Convolutional Generative Adversarial Networks Based Flame Detection in Video
Authors	Süleyman Aslan, Uğur Güdükbay, B. Uğur Töreyin, A. Enis Çetin
Abstract	Real-time flame detection is crucial in video-based surveillance systems. We propose a vision-based method to detect flames using Deep Convolutional Generative Adversarial Neural Networks (DCGANs). Many existing supervised learning approaches using convolutional neural networks do not take temporal information into account and require a substantial amount of labeled data. In order to have a robust representation of sequences with and without flame, we propose a two-stage training of a DCGAN exploiting spatio-temporal flame evolution. Our training framework includes the regular training of a DCGAN with real spatio-temporal images, namely, temporal slice images, and noise vectors, and training the discriminator separately using the temporal flame images without the generator. Experimental results show that the proposed method effectively detects flame in video with negligible false positive rates in real-time.
Tasks
Published	2019-02-05
URL	http://arxiv.org/abs/1902.01824v1
PDF	http://arxiv.org/pdf/1902.01824v1.pdf
PWC	https://paperswithcode.com/paper/deep-convolutional-generative-adversarial-1
Repo
Framework

InSphereNet: a Concise Representation and Classification Method for 3D Object


Title	InSphereNet: a Concise Representation and Classification Method for 3D Object
Authors	Hui Cao, Haikuan Du, Siyu Zhang, Shen Cai
Abstract	In this paper, we present an InSphereNet method for the problem of 3D object classification. Unlike previous methods that use points, voxels, or multi-view images as inputs of a deep neural network (DNN), the proposed method constructs a class of more representative features, named infilling spheres, from the signed distance field (SDF). Because of the strong spatial representation of infilling spheres, we can not only use far fewer spheres to accomplish the classification task, but also design a lightweight InSphereNet with fewer layers and parameters than previous methods. Experiments on ModelNet40 show that the proposed method achieves higher accuracy than PointNet and PointNet++. In particular, if there are only a few dozen sphere inputs or about 100000 DNN parameters, the accuracy of our method remains at a very high level (over 88%). This further validates the conciseness and effectiveness of the proposed InSphere 3D representation. Keywords: 3D object classification, signed distance field, deep learning, infilling sphere
Tasks	3D Object Classification, Object Classification
Published	2019-12-25
URL	https://arxiv.org/abs/1912.11606v2
PDF	https://arxiv.org/pdf/1912.11606v2.pdf
PWC	https://paperswithcode.com/paper/inspherenet-a-concise-representation-and
Repo
Framework

Addressing the Sim2Real Gap in Robotic 3D Object Classification


Title	Addressing the Sim2Real Gap in Robotic 3D Object Classification
Authors	Jean-Baptiste Weibel, Timothy Patten, Markus Vincze
Abstract	Object classification with 3D data is an essential component of any scene understanding method. It has gained significant interest in a variety of communities, most notably in robotics and computer graphics. While the advent of deep learning has advanced the field of 3D object classification, most work using this data type is evaluated solely on CAD model datasets. Consequently, current work does not address the discrepancies existing between real and artificial data. In this work, we examine this gap in a robotic context by specifically addressing the problem of classification when transferring from artificial CAD models to real reconstructed objects. This is performed by training on ModelNet (CAD models) and evaluating on ScanNet (reconstructed objects). We show that standard methods do not perform well in this task. We thus introduce a method that carefully samples object parts that are reproducible under various transformations and hence robust. Using graph convolution to classify the composed graph of parts, our method significantly improves upon the baseline.
Tasks	3D Object Classification, Object Classification, Scene Understanding
Published	2019-10-28
URL	https://arxiv.org/abs/1910.12585v1
PDF	https://arxiv.org/pdf/1910.12585v1.pdf
PWC	https://paperswithcode.com/paper/addressing-the-sim2real-gap-in-robotic-3d
Repo
Framework

Automated identification of neural cells in the multi-photon images using deep-neural networks


Title	Automated identification of neural cells in the multi-photon images using deep-neural networks
Authors	Si-Baek Seong, Hae-Jeong Park
Abstract	The advancement of neuroscientific imaging techniques has produced an unprecedented volume of neural cell imaging data, which calls for automated processing. In particular, identification of cells from two-photon images demands segmentation of neural cells out of various materials and classification of the segmented cells according to their cell types. To automatically segment neural cells, we used a U-Net model, followed by classification of excitatory and inhibitory neurons and glia cells using a transfer learning technique. For transfer learning, we tested three public models, resnet18, resnet50 and inceptionv3, after replacing the fully connected layer with one for three classes. The best classification performance was found for the model with inceptionv3. The proposed application of deep learning is expected to provide a critical route to cell identification in the era of big neuroscience data.
Tasks	Transfer Learning
Published	2019-09-25
URL	https://arxiv.org/abs/1909.11269v1
PDF	https://arxiv.org/pdf/1909.11269v1.pdf
PWC	https://paperswithcode.com/paper/automated-identification-of-neural-cells-in
Repo
Framework

Indirect Local Attacks for Context-aware Semantic Segmentation Networks


Title	Indirect Local Attacks for Context-aware Semantic Segmentation Networks
Authors	Krishna Kanth Nakka, Mathieu Salzmann
Abstract	Recently, deep networks have achieved impressive semantic segmentation performance, in particular thanks to their use of larger contextual information. In this paper, we show that the resulting networks are sensitive not only to global attacks, where perturbations affect the entire input image, but also to indirect local attacks where perturbations are confined to a small image region that does not overlap with the area that we aim to fool. To this end, we introduce several indirect attack strategies, including adaptive local attacks, aiming to find the best image location to perturb, and universal local attacks. Furthermore, we propose attack detection techniques both for the global image level and to obtain a pixel-wise localization of the fooled regions. Our results are unsettling: Because they exploit a larger context, more accurate semantic segmentation networks are more sensitive to indirect local attacks.
Tasks	Semantic Segmentation
Published	2019-11-29
URL	https://arxiv.org/abs/1911.13038v2
PDF	https://arxiv.org/pdf/1911.13038v2.pdf
PWC	https://paperswithcode.com/paper/indirect-local-attacks-for-context-aware
Repo
Framework

Paracoherent Answer Set Semantics meets Argumentation Frameworks


Title	Paracoherent Answer Set Semantics meets Argumentation Frameworks
Authors	Giovanni Amendola, Francesco Ricca
Abstract	In recent years, abstract argumentation has met with great success in AI, since it has served to capture several non-monotonic logics for AI. Relations between argumentation framework (AF) semantics and logic programming ones are being investigated more and more. In particular, great attention has been given to the well-known stable extensions of an AF, which are closely related to the answer sets of a logic program. However, if a framework admits a small incoherent part, no stable extension can be provided. To overcome this shortcoming, two semantics generalizing stable extensions have been studied, namely semi-stable and stage. In this paper, we show that another perspective is possible on incoherent AFs, called paracoherent extensions, as they have a counterpart in paracoherent answer set semantics. We compare this perspective with semi-stable and stage semantics, by showing that computational costs remain unchanged, and moreover an interesting symmetric behaviour is maintained. Under consideration for acceptance in TPLP.
Tasks	Abstract Argumentation
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09426v1
PDF	https://arxiv.org/pdf/1907.09426v1.pdf
PWC	https://paperswithcode.com/paper/paracoherent-answer-set-semantics-meets
Repo
Framework