Paper Group ANR 414
Analyzing Periodicity and Saliency for Adult Video Detection
Title | Analyzing Periodicity and Saliency for Adult Video Detection |
Authors | Yizhi Liu, Xiaoyan Gu, Lei Huang, Junlin Ouyang, Miao Liao, Liangran Wu |
Abstract | Content-based adult video detection plays an important role in preventing pornography. However, existing methods usually rely on a single modality and seldom focus on multi-modality semantics representation. To address this problem, we put forward an approach that analyzes periodicity and saliency for adult video detection. First, periodic patterns and salient regions are analyzed in audio frames and visual frames, respectively. Next, the multi-modal co-occurrence semantics is described by combining audio periodicity with visual saliency. Moreover, the performance of our approach is evaluated step by step. Experimental results show that our approach clearly outperforms some state-of-the-art methods. |
Tasks | |
Published | 2019-01-11 |
URL | http://arxiv.org/abs/1901.03462v1 |
http://arxiv.org/pdf/1901.03462v1.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-periodicity-and-saliency-for-adult |
Repo | |
Framework | |
A Document-grounded Matching Network for Response Selection in Retrieval-based Chatbots
Title | A Document-grounded Matching Network for Response Selection in Retrieval-based Chatbots |
Authors | Xueliang Zhao, Chongyang Tao, Wei Wu, Can Xu, Dongyan Zhao, Rui Yan |
Abstract | We present a document-grounded matching network (DGMN) for response selection that can power a knowledge-aware retrieval-based chatbot system. The challenges of building such a model lie in how to ground conversation contexts with background documents and how to recognize important information in the documents for matching. To overcome the challenges, DGMN fuses information in a document and a context into representations of each other, and dynamically determines if grounding is necessary and importance of different parts of the document and the context through hierarchical interaction with a response at the matching step. Empirical studies on two public data sets indicate that DGMN can significantly improve upon state-of-the-art methods and at the same time enjoys good interpretability. |
Tasks | Chatbot |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04362v1 |
https://arxiv.org/pdf/1906.04362v1.pdf | |
PWC | https://paperswithcode.com/paper/a-document-grounded-matching-network-for |
Repo | |
Framework | |
Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models
Title | Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models |
Authors | Kuan-Yu Chen, Che-Ping Tsai, Da-Rong Liu, Hung-Yi Lee, Lin-shan Lee |
Abstract | Producing a large annotated speech corpus for training ASR systems remains difficult for the more than 95% of the world's languages that are low-resourced, but collecting a relatively large unlabeled data set for such languages is more achievable. This is why some initial efforts have been reported on completely unsupervised speech recognition learned from unlabeled data only, although with relatively high error rates. In this paper, we develop a Generative Adversarial Network (GAN) to achieve this purpose, in which a Generator and a Discriminator learn from each other iteratively to improve the performance. We further use a set of Hidden Markov Models (HMMs) iteratively refined from the machine generated labels to work in harmony with the GAN. The initial experiments on the TIMIT data set achieve a phone error rate of 33.1%, which is 8.5% lower than the previous state-of-the-art. |
Tasks | Speech Recognition |
Published | 2019-04-08 |
URL | https://arxiv.org/abs/1904.04100v3 |
https://arxiv.org/pdf/1904.04100v3.pdf | |
PWC | https://paperswithcode.com/paper/completely-unsupervised-phoneme-recognition-1 |
Repo | |
Framework | |
A Hybrid Cooperative Co-evolution Algorithm Framework for Optimising Power Take Off and Placements of Wave Energy Converters
Title | A Hybrid Cooperative Co-evolution Algorithm Framework for Optimising Power Take Off and Placements of Wave Energy Converters |
Authors | Mehdi Neshat, Bradley Alexander, Markus Wagner |
Abstract | Wave energy technologies have the potential to play a significant role in the supply of renewable energy on a world scale. One of the most promising designs for wave energy converters (WECs) is the fully submerged buoy. In this work, we explore the optimisation of WEC arrays consisting of a three-tether buoy model called CETO. Such arrays can be optimised for total energy output by adjusting both the relative positions of buoys in farms and the power-take-off (PTO) parameters for each buoy. The search space for these parameters is complex and multi-modal. Moreover, the evaluation of each parameter setting is computationally expensive, which limits the number of full model evaluations that can be made. To handle this problem, we propose a new hybrid cooperative co-evolution algorithm (HCCA). HCCA consists of a symmetric local search plus Nelder-Mead and a cooperative co-evolution algorithm (CC) with a backtracking strategy for optimising the positions and PTO settings of WECs, respectively. Moreover, a new adaptive scenario is proposed for tuning the hyper-parameter of the grey wolf optimiser (AGWO). AGWO participates notably with other applied optimisers in HCCA. To assess the effectiveness of the proposed approach, five popular Evolutionary Algorithms (EAs), four alternating optimisation methods and two modern hybrid ideas (LS-NM and SLS-NM-B) are carefully compared in four real wave situations (Adelaide, Tasmania, Sydney and Perth) with two wave farm sizes (4 and 16). According to the experimental outcomes, the hybrid cooperative framework exhibits better performance in terms of both runtime and quality of obtained solutions. |
Tasks | |
Published | 2019-10-03 |
URL | https://arxiv.org/abs/1910.01280v1 |
https://arxiv.org/pdf/1910.01280v1.pdf | |
PWC | https://paperswithcode.com/paper/a-hybrid-cooperative-co-evolution-algorithm |
Repo | |
Framework | |
Bilingual Lexicon Induction through Unsupervised Machine Translation
Title | Bilingual Lexicon Induction through Unsupervised Machine Translation |
Authors | Mikel Artetxe, Gorka Labaka, Eneko Agirre |
Abstract | A recent line of research has obtained strong results on bilingual lexicon induction by aligning independently trained word embeddings in two languages and using the resulting cross-lingual embeddings to induce word translation pairs through nearest neighbor or related retrieval methods. In this paper, we propose an alternative approach to this problem that builds on the recent work on unsupervised machine translation. This way, instead of directly inducing a bilingual lexicon from cross-lingual embeddings, we use them to build a phrase-table, combine it with a language model, and use the resulting machine translation system to generate a synthetic parallel corpus, from which we extract the bilingual lexicon using statistical word alignment techniques. As such, our method can work with any word embedding and cross-lingual mapping technique, and it does not require any additional resource besides the monolingual corpus used to train the embeddings. When evaluated on the exact same cross-lingual embeddings, our proposed method obtains an average improvement of 6 accuracy points over nearest neighbor and 4 points over CSLS retrieval, establishing a new state of the art on the standard MUSE dataset. |
Tasks | Language Modelling, Machine Translation, Unsupervised Machine Translation, Word Alignment, Word Embeddings |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10761v1 |
https://arxiv.org/pdf/1907.10761v1.pdf | |
PWC | https://paperswithcode.com/paper/bilingual-lexicon-induction-through |
Repo | |
Framework | |
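The abstract above names nearest-neighbor and CSLS retrieval as the baselines the proposed method is compared against. As a rough illustration of those two baselines only (not of the paper's machine translation pipeline), the sketch below retrieves a translation for each source vector from a small target set. The CSLS score used here, 2·cos(x, y) minus the mean similarity of x and of y to their k nearest cross-lingual neighbors, is the commonly cited definition and is not stated in the abstract; the function names and toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_top_k(values, k):
    """Mean of the k largest values (or of all values if fewer than k)."""
    k = min(k, len(values))
    return sum(sorted(values, reverse=True)[:k]) / k


def retrieve(src, tgt, k=2, use_csls=True):
    """Return, for each source vector, the index of its retrieved target vector."""
    sim = [[cosine(s, t) for t in tgt] for s in src]
    if not use_csls:
        # Plain nearest neighbor: pick the most similar target vector.
        return [max(range(len(tgt)), key=lambda j: row[j]) for row in sim]
    # Average similarity of each word to its k nearest cross-lingual neighbors.
    r_src = [mean_top_k(row, k) for row in sim]
    r_tgt = [mean_top_k([sim[i][j] for i in range(len(src))], k)
             for j in range(len(tgt))]
    # CSLS penalizes "hub" targets that are close to many source words.
    return [
        max(range(len(tgt)), key=lambda j: 2 * sim[i][j] - r_src[i] - r_tgt[j])
        for i in range(len(src))
    ]


# Toy embeddings (hypothetical): two source and two target vectors.
source_vectors = [[1.0, 0.0], [0.0, 1.0]]
target_vectors = [[1.0, 0.0], [0.0, 1.0]]
nn_pairs = retrieve(source_vectors, target_vectors, k=1, use_csls=False)
csls_pairs = retrieve(source_vectors, target_vectors, k=1, use_csls=True)
```

On this trivially aligned toy input both retrieval rules agree; the two differ only when some target vectors are near many source vectors at once.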
Generating the support with extreme value losses
Title | Generating the support with extreme value losses |
Authors | Nicholas Guttenberg |
Abstract | When optimizing against the mean loss over a distribution of predictions in the context of a regression task, even if there is a distribution of targets, the optimal prediction distribution is always a delta function at a single value. Methods of constructing generative models need to overcome this tendency. We consider a simple method of summarizing the prediction error, such that the optimal strategy corresponds to outputting a distribution of predictions with a support that matches the support of the distribution of targets: optimizing against the minimal value of the loss given a set of samples from the prediction distribution, rather than the mean. We show that models trained against this loss learn to capture the support of the target distribution and, when combined with an auxiliary classifier-like prediction task, can be projected via rejection sampling to reproduce the full distribution of targets. The resulting method works well compared to other generative modeling approaches, particularly in low-dimensional spaces with highly non-trivial distributions, because mode-collapse solutions are globally suboptimal with respect to the extreme value loss. However, the method is less suited to high-dimensional spaces such as images, because the number of samples needed to accurately estimate the extreme value loss grows when the dimension of the data manifold becomes large. |
Tasks | |
Published | 2019-02-08 |
URL | http://arxiv.org/abs/1902.02940v1 |
http://arxiv.org/pdf/1902.02940v1.pdf | |
PWC | https://paperswithcode.com/paper/generating-the-support-with-extreme-value |
Repo | |
Framework | |
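The contrast the abstract draws between the mean loss and the minimal loss over a set of predicted samples can be checked on a two-point toy problem. The sketch below is only an illustration under assumptions that are not in the abstract: squared error as the per-sample loss, targets fixed at -1 and +1, and hand-picked prediction sets in place of a trained model.

```python
def squared_error(pred, target):
    """Per-sample loss; squared error is an assumption made for this sketch."""
    return (pred - target) ** 2


def mean_loss(preds, targets):
    """Average over targets of the MEAN loss across predicted samples."""
    return sum(
        sum(squared_error(p, t) for p in preds) / len(preds) for t in targets
    ) / len(targets)


def min_loss(preds, targets):
    """Average over targets of the MINIMAL loss across predicted samples."""
    return sum(
        min(squared_error(p, t) for p in preds) for t in targets
    ) / len(targets)


targets = [-1.0, 1.0]     # two equally likely target values
collapsed = [0.0, 0.0]    # every sample sits at the mean (a delta function)
spread = [-1.0, 1.0]      # samples cover the support of the targets

mean_collapsed = mean_loss(collapsed, targets)
mean_spread = mean_loss(spread, targets)
min_collapsed = min_loss(collapsed, targets)
min_spread = min_loss(spread, targets)
```

Here the mean loss favors the collapsed predictions (1.0 against 2.0), while the minimal loss favors the predictions that cover both targets (0.0 against 1.0), which matches the behavior the abstract describes.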
Differentiable programming and its applications to dynamical systems
Title | Differentiable programming and its applications to dynamical systems |
Authors | Adrián Hernández, José M. Amigó |
Abstract | Differentiable programming is the combination of classical neural network modules with algorithmic ones in an end-to-end differentiable model. These new models, which use automatic differentiation to calculate gradients, have new learning capabilities (reasoning, attention and memory). In this tutorial, aimed at researchers in nonlinear systems with prior knowledge of deep learning, we present this new programming paradigm, describe some of its new features such as attention mechanisms, and highlight the benefits they bring. Then, we analyse the uses and limitations of traditional deep learning models in the modeling and prediction of dynamical systems. Here, a dynamical system is meant to be a set of state variables that evolve in time under general internal and external interactions. Finally, we review the advantages and applications of differentiable programming to dynamical systems. |
Tasks | |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.08168v1 |
https://arxiv.org/pdf/1912.08168v1.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-programming-and-its |
Repo | |
Framework | |
Locomotion and gesture tracking in mice and small animals for neurosceince applications: A survey
Title | Locomotion and gesture tracking in mice and small animals for neurosceince applications: A survey |
Authors | Waseem Abbas, David Masip Rodo |
Abstract | Neuroscience has traditionally relied on manually observing lab animals in controlled environments. Researchers usually record animals behaving in a free or restrained manner and then annotate the data manually. Manual annotation is undesirable for three reasons: first, it is time consuming; second, it is prone to human error; and third, no two human annotators will agree 100% on an annotation, so it is not reproducible. Consequently, automated annotation of such data has gained traction because it is efficient and replicable. Usually, the automatic annotation of neuroscience data relies on computer vision and machine learning techniques. In this article, we cover most of the approaches taken by researchers for locomotion and gesture tracking of lab animals. We divide these papers into categories based on the hardware they use and the software approach they take. We also summarize their strengths and weaknesses. |
Tasks | |
Published | 2019-03-25 |
URL | http://arxiv.org/abs/1903.10422v1 |
http://arxiv.org/pdf/1903.10422v1.pdf | |
PWC | https://paperswithcode.com/paper/locomotion-and-gesture-tracking-in-mice-and |
Repo | |
Framework | |
A Quest for Structure: Jointly Learning the Graph Structure and Semi-Supervised Classification
Title | A Quest for Structure: Jointly Learning the Graph Structure and Semi-Supervised Classification |
Authors | Xuan Wu, Lingxiao Zhao, Leman Akoglu |
Abstract | Semi-supervised learning (SSL) is effectively used for numerous classification problems, thanks to its ability to make use of abundant unlabeled data. The main assumption of various SSL algorithms is that the nearby points on the data manifold are likely to share a label. Graph-based SSL constructs a graph from point-cloud data as an approximation to the underlying manifold, followed by label inference. It is no surprise that the quality of the constructed graph in capturing the essential structure of the data is critical to the accuracy of the subsequent inference step [6]. How should one construct a graph from the input point-cloud data for graph-based SSL? In this work we introduce a new, parallel graph learning framework (called PG-learn) for the graph construction step of SSL. Our solution has two main ingredients: (1) a gradient-based optimization of the edge weights (more specifically, different kernel bandwidths in each dimension) based on a validation loss function, and (2) a parallel hyperparameter search algorithm with an adaptive resource allocation scheme. In essence, (1) allows us to search around a (random) initial hyperparameter configuration for a better one with lower validation loss. Since the search space of hyperparameters is huge for high-dimensional problems, (2) empowers our gradient-based search to go through as many different initial configurations as possible, where runs for relatively unpromising starting configurations are terminated early to allocate the time for others. As such, PG-learn is a carefully-designed hybrid of random and adaptive search. Through experiments on multi-class classification problems, we show that PG-learn significantly outperforms a variety of existing graph construction schemes in accuracy (per fixed time budget for hyperparameter tuning), and scales more effectively to high dimensional problems. |
Tasks | graph construction |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12385v1 |
https://arxiv.org/pdf/1909.12385v1.pdf | |
PWC | https://paperswithcode.com/paper/a-quest-for-structure-jointly-learning-the |
Repo | |
Framework | |
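The two ingredients listed in the PG-learn abstract, edge weights with a separate kernel bandwidth per dimension and a search that stops unpromising starting configurations early, can be pictured with a small sketch. Everything concrete below is an assumption rather than the paper's implementation: the Gaussian form of the kernel, the use of plain successive halving as a stand-in for the adaptive resource allocation scheme, and the one-dimensional toy loss that replaces the validation loss.

```python
import math


def edge_weight(x, y, coefficients):
    """Gaussian-style edge weight with one coefficient per dimension.

    Each coefficient acts as an inverse bandwidth: a coefficient of 0
    makes the weight ignore that dimension entirely.
    """
    return math.exp(
        -sum(a * (xi - yi) ** 2 for a, xi, yi in zip(coefficients, x, y))
    )


def successive_halving(configs, refine, loss, rounds=3):
    """Refine every surviving configuration, then keep the better half."""
    survivors = list(configs)
    for _ in range(rounds):
        survivors = [refine(c) for c in survivors]   # one local-search step each
        survivors.sort(key=loss)                     # lowest loss first
        survivors = survivors[: max(1, len(survivors) // 2)]
    return survivors[0]


# Toy stand-ins (hypothetical): a scalar "hyperparameter" with optimum at 3.
def toy_loss(c):
    return (c - 3.0) ** 2


def toy_refine(c):
    # One gradient step on toy_loss with step size 0.25.
    return c - 0.25 * 2.0 * (c - 3.0)


best = successive_halving([0.0, 10.0, 2.0, 5.0], toy_refine, toy_loss)
same_point = edge_weight([0.0, 0.0], [0.0, 0.0], [1.0, 1.0])
ignored_dim = edge_weight([0.0, 0.0], [5.0, 0.0], [0.0, 1.0])
```

Starting points far from the optimum are dropped after the first round, so the remaining budget goes to the configurations that were already improving fastest.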
Deep Convolutional Generative Adversarial Networks Based Flame Detection in Video
Title | Deep Convolutional Generative Adversarial Networks Based Flame Detection in Video |
Authors | Süleyman Aslan, Uğur Güdükbay, B. Uğur Töreyin, A. Enis Çetin |
Abstract | Real-time flame detection is crucial in video-based surveillance systems. We propose a vision-based method to detect flames using Deep Convolutional Generative Adversarial Neural Networks (DCGANs). Many existing supervised learning approaches using convolutional neural networks do not take temporal information into account and require a substantial amount of labeled data. In order to have a robust representation of sequences with and without flame, we propose a two-stage training of a DCGAN exploiting spatio-temporal flame evolution. Our training framework includes the regular training of a DCGAN with real spatio-temporal images, namely, temporal slice images, and noise vectors, and training the discriminator separately using the temporal flame images without the generator. Experimental results show that the proposed method effectively detects flame in video with negligible false positive rates in real-time. |
Tasks | |
Published | 2019-02-05 |
URL | http://arxiv.org/abs/1902.01824v1 |
http://arxiv.org/pdf/1902.01824v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-convolutional-generative-adversarial-1 |
Repo | |
Framework | |
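The abstract refers to temporal slice images as the spatio-temporal input to the DCGAN but does not define how they are built. One common construction, assumed here purely for illustration, takes the same pixel column from every frame of a clip and stacks those columns side by side, so that one image axis is space and the other is time. The function name and the tiny integer "frames" are hypothetical.

```python
def temporal_slice(frames, column):
    """Stack one pixel column from every frame into a height x time image.

    frames: list of 2D lists (rows x columns), all of the same size.
    column: index of the pixel column sampled from each frame.
    """
    height = len(frames[0])
    return [[frame[row][column] for frame in frames] for row in range(height)]


# Three 2x2 toy frames with distinct values so the layout is easy to follow.
frames = [
    [[0, 1], [2, 3]],
    [[4, 5], [6, 7]],
    [[8, 9], [10, 11]],
]
slice_image = temporal_slice(frames, column=1)
```

Each row of `slice_image` follows one pixel position through time, which is how flicker and other temporal changes end up as spatial texture in a single image.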
InSphereNet: a Concise Representation and Classification Method for 3D Object
Title | InSphereNet: a Concise Representation and Classification Method for 3D Object |
Authors | Hui Cao, Haikuan Du, Siyu Zhang, Shen Cai |
Abstract | In this paper, we present the InSphereNet method for the problem of 3D object classification. Unlike previous methods that use points, voxels, or multi-view images as inputs to a deep neural network (DNN), the proposed method constructs a class of more representative features, named infilling spheres, from a signed distance field (SDF). Because of the strong spatial representation of infilling spheres, we can not only use far fewer spheres to accomplish the classification task, but also design a lightweight InSphereNet with fewer layers and parameters than previous methods. Experiments on ModelNet40 show that the proposed method outperforms PointNet and PointNet++ in accuracy. In particular, if there are only a few dozen sphere inputs or about 100,000 DNN parameters, the accuracy of our method remains at a very high level (over 88%). This further validates the conciseness and effectiveness of the proposed InSphere 3D representation. Keywords: 3D object classification, signed distance field, deep learning, infilling sphere |
Tasks | 3D Object Classification, Object Classification |
Published | 2019-12-25 |
URL | https://arxiv.org/abs/1912.11606v2 |
https://arxiv.org/pdf/1912.11606v2.pdf | |
PWC | https://paperswithcode.com/paper/inspherenet-a-concise-representation-and |
Repo | |
Framework | |
Addressing the Sim2Real Gap in Robotic 3D Object Classification
Title | Addressing the Sim2Real Gap in Robotic 3D Object Classification |
Authors | Jean-Baptiste Weibel, Timothy Patten, Markus Vincze |
Abstract | Object classification with 3D data is an essential component of any scene understanding method. It has gained significant interest in a variety of communities, most notably in robotics and computer graphics. While the advent of deep learning has advanced the field of 3D object classification, most work using this data type is evaluated solely on CAD model datasets. Consequently, current work does not address the discrepancies existing between real and artificial data. In this work, we examine this gap in a robotic context by specifically addressing the problem of classification when transferring from artificial CAD models to real reconstructed objects. This is performed by training on ModelNet (CAD models) and evaluating on ScanNet (reconstructed objects). We show that standard methods do not perform well in this task. We thus introduce a method that carefully samples object parts that are reproducible under various transformations and hence robust. Using graph convolution to classify the composed graph of parts, our method significantly improves upon the baseline. |
Tasks | 3D Object Classification, Object Classification, Scene Understanding |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12585v1 |
https://arxiv.org/pdf/1910.12585v1.pdf | |
PWC | https://paperswithcode.com/paper/addressing-the-sim2real-gap-in-robotic-3d |
Repo | |
Framework | |
Automated identification of neural cells in the multi-photon images using deep-neural networks
Title | Automated identification of neural cells in the multi-photon images using deep-neural networks |
Authors | Si-Baek Seong, Hae-Jeong Park |
Abstract | The advancement of neuroscientific imaging techniques has produced an unprecedented volume of neural cell imaging data, which calls for automated processing. In particular, identification of cells from two-photon images demands segmentation of neural cells out of various materials and classification of the segmented cells according to their cell types. To automatically segment neural cells, we used a U-Net model, followed by classification of excitatory and inhibitory neurons and glia cells using a transfer learning technique. For transfer learning, we tested three public models, resnet18, resnet50 and inceptionv3, after replacing the fully connected layer with one for three classes. The best classification performance was found for the model with inceptionv3. The proposed application of deep learning techniques is expected to provide a critical route to cell identification in the era of big neuroscience data. |
Tasks | Transfer Learning |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11269v1 |
https://arxiv.org/pdf/1909.11269v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-identification-of-neural-cells-in |
Repo | |
Framework | |
Indirect Local Attacks for Context-aware Semantic Segmentation Networks
Title | Indirect Local Attacks for Context-aware Semantic Segmentation Networks |
Authors | Krishna Kanth Nakka, Mathieu Salzmann |
Abstract | Recently, deep networks have achieved impressive semantic segmentation performance, in particular thanks to their use of larger contextual information. In this paper, we show that the resulting networks are sensitive not only to global attacks, where perturbations affect the entire input image, but also to indirect local attacks where perturbations are confined to a small image region that does not overlap with the area that we aim to fool. To this end, we introduce several indirect attack strategies, including adaptive local attacks, aiming to find the best image location to perturb, and universal local attacks. Furthermore, we propose attack detection techniques both for the global image level and to obtain a pixel-wise localization of the fooled regions. Our results are unsettling: Because they exploit a larger context, more accurate semantic segmentation networks are more sensitive to indirect local attacks. |
Tasks | Semantic Segmentation |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1911.13038v2 |
https://arxiv.org/pdf/1911.13038v2.pdf | |
PWC | https://paperswithcode.com/paper/indirect-local-attacks-for-context-aware |
Repo | |
Framework | |
Paracoherent Answer Set Semantics meets Argumentation Frameworks
Title | Paracoherent Answer Set Semantics meets Argumentation Frameworks |
Authors | Giovanni Amendola, Francesco Ricca |
Abstract | In recent years, abstract argumentation has met with great success in AI, since it has served to capture several non-monotonic logics for AI. Relations between argumentation framework (AF) semantics and logic programming ones are being investigated more and more. In particular, great attention has been given to the well-known stable extensions of an AF, which are closely related to the answer sets of a logic program. However, if a framework admits a small incoherent part, no stable extension can be provided. To overcome this shortcoming, two semantics generalizing stable extensions have been studied, namely semi-stable and stage. In this paper, we show that another perspective is possible on incoherent AFs, called paracoherent extensions, as they have a counterpart in paracoherent answer set semantics. We compare this perspective with semi-stable and stage semantics, by showing that computational costs remain unchanged, and moreover an interesting symmetric behaviour is maintained. Under consideration for acceptance in TPLP. |
Tasks | Abstract Argumentation |
Published | 2019-07-22 |
URL | https://arxiv.org/abs/1907.09426v1 |
https://arxiv.org/pdf/1907.09426v1.pdf | |
PWC | https://paperswithcode.com/paper/paracoherent-answer-set-semantics-meets |
Repo | |
Framework | |