October 18, 2019

Paper Group ANR 538

SARN: Relational Reasoning through Sequential Attention

Title SARN: Relational Reasoning through Sequential Attention
Authors Jinwon An, Sungwon Lyu, Sungzoon Cho
Abstract This paper proposes an attention-module-augmented relational network called SARN (Sequential Attention Relational Network) that can carry out relational reasoning by extracting reference objects and pairing objects efficiently. SARN greatly reduces the computational and memory requirements of the relational network, which computes all object pairs. It also shows high accuracy on the Sort-of-CLEVR dataset compared to other models, especially on relational questions.
Tasks Relational Reasoning
Published 2018-11-01
URL http://arxiv.org/abs/1811.00246v1
PDF http://arxiv.org/pdf/1811.00246v1.pdf
PWC https://paperswithcode.com/paper/sarn-relational-reasoning-through-sequential
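
To make the cost claim in the abstract concrete, here is a minimal sketch comparing the number of pairs a relational network processes (all object pairs) with a reference-based pairing. It assumes one reference object is selected and paired with every object; the hard argmax over scores stands in for the paper's attention module, and the function names `all_pairs` and `reference_pairs` are hypothetical, not taken from the paper.

```python
from itertools import product


def all_pairs(objects):
    """Relational-network style: every ordered pair of objects."""
    return list(product(objects, repeat=2))


def reference_pairs(objects, attention_scores):
    """Assumed SARN-style pairing: pick the highest-scoring object as the
    reference (a hard argmax, used here only for illustration) and pair it
    with every object."""
    ref_index = max(range(len(objects)), key=lambda i: attention_scores[i])
    reference = objects[ref_index]
    return [(reference, obj) for obj in objects]


objects = [f"obj{i}" for i in range(6)]
scores = [0.1, 0.05, 0.6, 0.1, 0.1, 0.05]  # made-up attention scores

print(len(all_pairs(objects)))                # grows as n * n
print(len(reference_pairs(objects, scores)))  # grows as n
```

With n objects the first function returns n * n pairs and the second returns n, which is the kind of reduction the abstract describes.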

Learning Local Metrics and Influential Regions for Classification

Title Learning Local Metrics and Influential Regions for Classification
Authors Mingzhi Dong, Yujiang Wang, Xiaochen Yang, Jing-Hao Xue
Abstract The performance of distance-based classifiers heavily depends on the underlying distance metric, so it is valuable to learn a suitable metric from the data. To address the problem of multimodality, it is desirable to learn local metrics. In this short paper, we define a new intuitive distance with local metrics and influential regions, and subsequently propose a novel local metric learning method for distance-based classification. Our key intuition is to partition the metric space into influential regions and a background region, and then regulate the effectiveness of each local metric to be within the related influential regions. We learn local metrics and influential regions to reduce the empirical hinge loss, and regularize the parameters on the basis of a resultant learning bound. Encouraging experimental results are obtained from various public and popular data sets.
Tasks Metric Learning
Published 2018-02-09
URL http://arxiv.org/abs/1802.03452v1
PDF http://arxiv.org/pdf/1802.03452v1.pdf
PWC https://paperswithcode.com/paper/learning-local-metrics-and-influential

Spatiotemporal CNNs for Pornography Detection in Videos

Title Spatiotemporal CNNs for Pornography Detection in Videos
Authors Murilo Varges da Silva, Aparecido Nilceu Marana
Abstract With the increasing use of social networks and mobile devices, the number of videos posted on the Internet is growing exponentially. Among the inappropriate content published on the Internet, pornography is one of the most worrying, as it can be accessed by teens and children. Two spatiotemporal CNNs, VGG-C3D CNN and ResNet R(2+1)D CNN, were assessed for pornography detection in videos in the present study. Experimental results using the Pornography-800 dataset showed that these spatiotemporal CNNs performed better than some state-of-the-art methods based on bag of visual words and are competitive with other CNN-based approaches, reaching an accuracy of 95.1%.
Tasks Pornography Detection, Pornography Detection In Videos
Published 2018-10-24
URL http://arxiv.org/abs/1810.10519v1
PDF http://arxiv.org/pdf/1810.10519v1.pdf
PWC https://paperswithcode.com/paper/spatiotemporal-cnns-for-pornography-detection

Fighting Offensive Language on Social Media with Unsupervised Text Style Transfer

Title Fighting Offensive Language on Social Media with Unsupervised Text Style Transfer
Authors Cicero Nogueira dos Santos, Igor Melnyk, Inkit Padhi
Abstract We introduce a new approach to tackle the problem of offensive language in online social media. Our approach uses unsupervised text style transfer to translate offensive sentences into non-offensive ones. We propose a new method for training encoder-decoders using non-parallel data that combines a collaborative classifier, attention and the cycle consistency loss. Experimental results on data from Twitter and Reddit show that our method outperforms a state-of-the-art text style transfer system in two out of three quantitative metrics and produces reliable non-offensive transferred sentences.
Tasks Style Transfer, Text Style Transfer
Published 2018-05-20
URL http://arxiv.org/abs/1805.07685v1
PDF http://arxiv.org/pdf/1805.07685v1.pdf
PWC https://paperswithcode.com/paper/fighting-offensive-language-on-social-media

Equivalent Lipschitz surrogates for zero-norm and rank optimization problems

Title Equivalent Lipschitz surrogates for zero-norm and rank optimization problems
Authors Yulan Liu, Shujun Bi, Shaohua Pan
Abstract This paper proposes a mechanism to produce equivalent Lipschitz surrogates for zero-norm and rank optimization problems by means of the global exact penalty for their equivalent mathematical programs with an equilibrium constraint (MPECs). Specifically, we reformulate these combinatorial problems as equivalent MPECs by the variational characterization of the zero-norm and rank function, show that their penalized problems, yielded by moving the equilibrium constraint into the objective, are the global exact penalization, and obtain the equivalent Lipschitz surrogates by eliminating the dual variable in the global exact penalty. These surrogates, including the popular SCAD function in statistics, are also difference of two convex functions (D.C.) if the function and constraint set involved in zero-norm and rank optimization problems are convex. We illustrate an application by designing a multi-stage convex relaxation approach to the rank plus zero-norm regularized minimization problem.
Published 2018-04-30
URL http://arxiv.org/abs/1804.11062v1
PDF http://arxiv.org/pdf/1804.11062v1.pdf
PWC https://paperswithcode.com/paper/equivalent-lipschitz-surrogates-for-zero-norm
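
The abstract names the SCAD function as one of the surrogates. For reference, the sketch below evaluates the SCAD penalty in its standard piecewise form from the statistics literature, with the conventional default a = 3.7. It is not the paper's derivation, the paper's surrogate may differ by a scaling, and the function name `scad_penalty` is ours.

```python
def scad_penalty(t, lam, a=3.7):
    """Standard SCAD penalty for a scalar t (assumes lam > 0 and a > 2).

    Linear near zero, quadratic in the middle, constant for large |t|.
    """
    if lam <= 0 or a <= 2:
        raise ValueError("SCAD requires lam > 0 and a > 2")
    x = abs(t)
    if x <= lam:
        return lam * x
    if x <= a * lam:
        return (2 * a * lam * x - x * x - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


for value in (0.5, 1.0, 2.0, 10.0):
    print(value, scad_penalty(value, lam=1.0))
```

The three pieces meet continuously at |t| = lam and |t| = a * lam, and the penalty is flat beyond a * lam, so large entries are not penalized further.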

Designing for Democratization: Introducing Novices to Artificial Intelligence Via Maker Kits

Title Designing for Democratization: Introducing Novices to Artificial Intelligence Via Maker Kits
Authors Victor Dibia, Aaron Cox, Justin Weisz
Abstract Existing research highlights the myriad benefits realized when technology is sufficiently democratized and made accessible to non-technical or novice users. However, democratizing complex technologies such as artificial intelligence (AI) remains hard. In this work, we draw on theoretical underpinnings from the democratization of innovation to explore the design of maker kits that help introduce novice users to complex technologies. We report on our work designing TJBot: an open source cardboard robot that can be programmed using pre-built AI services. We highlight principles we adopted in this process (approachable design, simplicity, extensibility and accessibility), insights we learned from showing the kit at workshops (66 participants) and how users interacted with the project on GitHub over a 12-month period (Nov 2016 - Nov 2017). We find that the project succeeds in attracting novice users (40% of users who forked the project are new to GitHub) and that a variety of demographics are interested in prototyping use cases such as home automation, task delegation, teaching and learning.
Published 2018-05-28
URL http://arxiv.org/abs/1805.10723v3
PDF http://arxiv.org/pdf/1805.10723v3.pdf
PWC https://paperswithcode.com/paper/designing-for-democratization-introducing

The structure of evolved representations across different substrates for artificial intelligence

Title The structure of evolved representations across different substrates for artificial intelligence
Authors Arend Hintze, Douglas Kirkpatrick, Christoph Adami
Abstract Artificial neural networks (ANNs), while exceptionally useful for classification, are vulnerable to misdirection. Small amounts of noise can significantly affect their ability to correctly complete a task. Instead of generalizing concepts, ANNs seem to focus on surface statistical regularities in a given task. Here we compare how recurrent artificial neural networks, long short-term memory units, and Markov Brains sense and remember their environments. We show that information in Markov Brains is localized and sparsely distributed, while the other neural network substrates “smear” information about the environment across all nodes, which makes them vulnerable to noise.
Published 2018-04-05
URL http://arxiv.org/abs/1804.01660v1
PDF http://arxiv.org/pdf/1804.01660v1.pdf
PWC https://paperswithcode.com/paper/the-structure-of-evolved-representations

Training verified learners with learned verifiers

Title Training verified learners with learned verifiers
Authors Krishnamurthy Dvijotham, Sven Gowal, Robert Stanforth, Relja Arandjelovic, Brendan O’Donoghue, Jonathan Uesato, Pushmeet Kohli
Abstract This paper proposes a new algorithmic framework, predictor-verifier training, to train neural networks that are verifiable, i.e., networks that provably satisfy some desired input-output properties. The key idea is to simultaneously train two networks: a predictor network that performs the task at hand, e.g., predicting labels given inputs, and a verifier network that computes a bound on how well the predictor satisfies the properties being verified. Both networks can be trained simultaneously to optimize a weighted combination of the standard data-fitting loss and a term that bounds the maximum violation of the property. Experiments show that not only is the predictor-verifier architecture able to train networks to achieve state-of-the-art verified robustness to adversarial examples with much shorter training times (outperforming previous algorithms on small datasets like MNIST and SVHN), but it can also be scaled to produce the first known (to the best of our knowledge) verifiably robust networks for CIFAR-10.
Published 2018-05-25
URL http://arxiv.org/abs/1805.10265v2
PDF http://arxiv.org/pdf/1805.10265v2.pdf
PWC https://paperswithcode.com/paper/training-verified-learners-with-learned
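
The training objective described in the abstract is a weighted combination of a data-fitting loss and a bound on the worst-case property violation. A minimal sketch of that combination follows, assuming a single convex weight `kappa` in [0, 1]; the weighting scheme, the name `kappa`, and the function name are assumptions for illustration, and the verifier that would produce the bound is not modelled here.

```python
def predictor_verifier_objective(data_loss, violation_bound, kappa):
    """Blend the data-fitting loss with the verifier's violation bound.

    kappa = 1.0 recovers ordinary training; kappa = 0.0 optimizes only the
    verified bound. Both inputs are plain floats in this sketch.
    """
    if not 0.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [0, 1]")
    return kappa * data_loss + (1.0 - kappa) * violation_bound


# Made-up numbers for one training step.
print(predictor_verifier_objective(data_loss=2.0, violation_bound=4.0, kappa=0.75))
```

In a real training loop both terms would be differentiable tensors, and gradients would flow into the predictor and the verifier together.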

The Kanerva Machine: A Generative Distributed Memory

Title The Kanerva Machine: A Generative Distributed Memory
Authors Yan Wu, Greg Wayne, Alex Graves, Timothy Lillicrap
Abstract We present an end-to-end trained memory system that quickly adapts to new data and generates samples like them. Inspired by Kanerva’s sparse distributed memory, it has a robust distributed reading and writing mechanism. The memory is analytically tractable, which enables optimal on-line compression via a Bayesian update-rule. We formulate it as a hierarchical conditional generative model, where memory provides a rich data-dependent prior distribution. Consequently, the top-down memory and bottom-up perception are combined to produce the code representing an observation. Empirically, we demonstrate that the adaptive memory significantly improves generative models trained on both the Omniglot and CIFAR datasets. Compared with the Differentiable Neural Computer (DNC) and its variants, our memory model has greater capacity and is significantly easier to train.
Tasks Omniglot
Published 2018-04-05
URL http://arxiv.org/abs/1804.01756v3
PDF http://arxiv.org/pdf/1804.01756v3.pdf
PWC https://paperswithcode.com/paper/the-kanerva-machine-a-generative-distributed

Stagewise Safe Bayesian Optimization with Gaussian Processes

Title Stagewise Safe Bayesian Optimization with Gaussian Processes
Authors Yanan Sui, Vincent Zhuang, Joel W. Burdick, Yisong Yue
Abstract Enforcing safety is a key aspect of many problems pertaining to sequential decision making under uncertainty, which require the decisions made at every step to be both informative of the optimal decision and also safe. For example, we value both efficacy and comfort in medical therapy, and efficiency and safety in robotic control. We consider this problem of optimizing an unknown utility function with absolute feedback or preference feedback subject to unknown safety constraints. We develop an efficient safe Bayesian optimization algorithm, StageOpt, that separates safe region expansion and utility function maximization into two distinct stages. Compared to existing approaches that interleave expansion and optimization, we show that StageOpt is more efficient and naturally applicable to a broader class of problems. We provide theoretical guarantees for both the satisfaction of safety constraints and convergence to the optimal utility value. We evaluate StageOpt both on a variety of synthetic experiments and in clinical practice. We demonstrate that StageOpt is more effective than existing safe optimization approaches, and is able to safely and effectively optimize spinal cord stimulation therapy in our clinical experiments.
Tasks Decision Making, Decision Making Under Uncertainty, Gaussian Processes
Published 2018-06-20
URL https://arxiv.org/abs/1806.07555v2
PDF https://arxiv.org/pdf/1806.07555v2.pdf
PWC https://paperswithcode.com/paper/stagewise-safe-bayesian-optimization-with

DeepFakes: a New Threat to Face Recognition? Assessment and Detection

Title DeepFakes: a New Threat to Face Recognition? Assessment and Detection
Authors Pavel Korshunov, Sebastien Marcel
Abstract It is becoming increasingly easy to automatically replace the face of one person in a video with the face of another person by using a pre-trained generative adversarial network (GAN). Recent public scandals, e.g., the faces of celebrities being swapped onto pornographic videos, call for automated ways to detect these Deepfake videos. To help develop such methods, in this paper, we present the first publicly available set of Deepfake videos generated from videos of the VidTIMIT database. We used open source software based on GANs to create the Deepfakes, and we emphasize that training and blending parameters can significantly impact the quality of the resulting videos. To demonstrate this impact, we generated videos with low and high visual quality (320 videos each) using differently tuned parameter sets. We showed that state-of-the-art face recognition systems based on VGG and Facenet neural networks are vulnerable to Deepfake videos, with 85.62% and 95.00% false acceptance rates respectively, which means methods for detecting Deepfake videos are necessary. By considering several baseline approaches, we found that an audio-visual approach based on lip-sync inconsistency detection was not able to distinguish Deepfake videos. The best performing method, which is based on visual quality metrics and is often used in the presentation attack detection domain, resulted in an 8.97% equal error rate on high quality Deepfakes. Our experiments demonstrate that GAN-generated Deepfake videos are challenging for both face recognition systems and existing detection methods, and the further development of face swapping technology will make it even more so.
Tasks Face Recognition, Face Swapping
Published 2018-12-20
URL http://arxiv.org/abs/1812.08685v1
PDF http://arxiv.org/pdf/1812.08685v1.pdf
PWC https://paperswithcode.com/paper/deepfakes-a-new-threat-to-face-recognition
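
The abstract reports false acceptance rates and an equal error rate (EER). As a rough illustration of how these two metrics are commonly computed from similarity or detection scores, here is a minimal sketch. The scores are made up, higher scores are assumed to mean "more likely genuine", and the EER is approximated by scanning the observed scores as thresholds rather than by the evaluation protocol used in the paper.

```python
def false_acceptance_rate(impostor_scores, threshold):
    """Fraction of impostor samples (e.g. Deepfakes) scored at or above the threshold."""
    accepted = sum(1 for s in impostor_scores if s >= threshold)
    return accepted / len(impostor_scores)


def false_rejection_rate(genuine_scores, threshold):
    """Fraction of genuine samples scored below the threshold."""
    rejected = sum(1 for s in genuine_scores if s < threshold)
    return rejected / len(genuine_scores)


def equal_error_rate(genuine_scores, impostor_scores):
    """Return (approximate EER, threshold) where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        far = false_acceptance_rate(impostor_scores, t)
        frr = false_rejection_rate(genuine_scores, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


genuine = [0.9, 0.8, 0.7, 0.4]    # made-up scores for real videos
impostor = [0.6, 0.3, 0.2, 0.1]   # made-up scores for Deepfake videos

print(false_acceptance_rate(impostor, threshold=0.5))
print(equal_error_rate(genuine, impostor))
```

A high false acceptance rate means the recognition system is fooled often; a low EER means a detector separates real from fake well at its balanced operating point.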

Building Instance Classification Using Street View Images

Title Building Instance Classification Using Street View Images
Authors Jian Kang, Marco Körner, Yuanyuan Wang, Hannes Taubenböck, Xiao Xiang Zhu
Abstract Land-use classification based on spaceborne or aerial remote sensing images has been extensively studied over the past decades. Such classification is usually a patch-wise or pixel-wise labeling over the whole image. But for many applications, such as urban population density mapping or urban utility planning, a classification map based on individual buildings is much more informative. However, such semantic classification still poses some fundamental challenges, for example, how to retrieve fine boundaries of individual buildings. In this paper, we proposed a general framework for classifying the functionality of individual buildings. The proposed method is based on Convolutional Neural Networks (CNNs) which classify facade structures from street view images, such as Google StreetView, in addition to remote sensing images which usually only show roof structures. Geographic information was utilized to mask out individual buildings, and to associate the corresponding street view images. We created a benchmark dataset which was used for training and evaluating CNNs. In addition, the method was applied to generate building classification maps on both region and city scales of several cities in Canada and the US. Keywords: CNN, Building instance classification, Street view images, OpenStreetMap
Published 2018-02-25
URL http://arxiv.org/abs/1802.09026v1
PDF http://arxiv.org/pdf/1802.09026v1.pdf
PWC https://paperswithcode.com/paper/building-instance-classification-using-street

MoCoNet: Motion Correction in 3D MPRAGE images using a Convolutional Neural Network approach

Title MoCoNet: Motion Correction in 3D MPRAGE images using a Convolutional Neural Network approach
Authors Kamlesh Pawar, Zhaolin Chen, N. Jon Shah, Gary F. Egan
Abstract Purpose: The suppression of motion artefacts from MR images is a challenging task. The purpose of this paper is to develop a standalone novel technique to suppress motion artefacts from MR images using a data-driven deep learning approach. Methods: A deep learning convolutional neural network (CNN) was developed to remove motion artefacts in brain MR images. A CNN was trained on simulated motion corrupted images to identify and suppress artefacts due to the motion. The network was an encoder-decoder CNN architecture where the encoder decomposed the motion corrupted images into a set of feature maps. The feature maps were then combined by the decoder network to generate a motion-corrected image. The network was tested on an unseen simulated dataset and an experimental, motion corrupted in vivo brain dataset. Results: The trained network was able to suppress the motion artefacts in the simulated motion corrupted images, and the mean percentage error in the motion corrected images was 2.69 % with a standard deviation of 0.95 %. The network was able to effectively suppress the motion artefacts from the experimental dataset, demonstrating the generalisation capability of the trained network. Conclusion: A novel and generic motion correction technique has been developed that can suppress motion artefacts from motion corrupted MR images. The proposed technique is a standalone post-processing method that does not interfere with data acquisition or reconstruction parameters, thus making it suitable for a multitude of MR sequences.
Published 2018-07-29
URL http://arxiv.org/abs/1807.10831v1
PDF http://arxiv.org/pdf/1807.10831v1.pdf
PWC https://paperswithcode.com/paper/moconet-motion-correction-in-3d-mprage-images

Aggression-annotated Corpus of Hindi-English Code-mixed Data

Title Aggression-annotated Corpus of Hindi-English Code-mixed Data
Authors Ritesh Kumar, Aishwarya N. Reganti, Akshit Bhatia, Tushar Maheshwari
Abstract As interaction over the web has increased, incidents of aggression and related events like trolling, cyberbullying, flaming, hate speech, etc. have also increased manifold across the globe. While most of these behaviours, such as bullying or hate speech, predate the Internet, the reach and extent of the Internet have given them unprecedented power and influence to affect the lives of billions of people. It is therefore of utmost importance that preventive measures be taken to safeguard the people using the web, so that the web remains a viable medium of communication and connection in general. In this paper, we discuss the development of an aggression tagset and an annotated corpus of Hindi-English code-mixed data from two of the most popular social networking and social media platforms in India, Twitter and Facebook. The corpus is annotated using a hierarchical tagset of 3 top-level tags and 10 level 2 tags. The final dataset contains approximately 18k tweets and 21k Facebook comments and is being released for further research in the field.
Published 2018-03-26
URL http://arxiv.org/abs/1803.09402v1
PDF http://arxiv.org/pdf/1803.09402v1.pdf
PWC https://paperswithcode.com/paper/aggression-annotated-corpus-of-hindi-english

Multiple Subspace Alignment Improves Domain Adaptation

Title Multiple Subspace Alignment Improves Domain Adaptation
Authors Kowshik Thopalli, Rushil Anirudh, Jayaraman J. Thiagarajan, Pavan Turaga
Abstract We present a novel unsupervised domain adaptation (DA) method for cross-domain visual recognition. Though subspace methods have found success in DA, their performance is often limited due to the assumption of approximating an entire dataset using a single low-dimensional subspace. Instead, we develop a method to effectively represent the source and target datasets via a collection of low-dimensional subspaces, and subsequently align them by exploiting the natural geometry of the space of subspaces, on the Grassmann manifold. We demonstrate the effectiveness of this approach, using empirical studies on two widely used benchmarks, with state-of-the-art domain adaptation performance.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2018-11-11
URL http://arxiv.org/abs/1811.04491v1
PDF http://arxiv.org/pdf/1811.04491v1.pdf
PWC https://paperswithcode.com/paper/multiple-subspace-alignment-improves-domain