October 18, 2019


Paper Group ANR 551

Non-local NetVLAD Encoding for Video Classification. Learning Bone Suppression from Dual Energy Chest X-rays using Adversarial Networks. Kappa Learning: A New Method for Measuring Similarity Between Educational Items Using Performance Data. Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks. An Entropic Optimal Transp …

Non-local NetVLAD Encoding for Video Classification


Title	Non-local NetVLAD Encoding for Video Classification
Authors	Yongyi Tang, Xing Zhang, Jingwen Wang, Shaoxiang Chen, Lin Ma, Yu-Gang Jiang
Abstract	This paper describes our solution for the 2$^\text{nd}$ YouTube-8M video understanding challenge organized by Google AI. Unlike video recognition benchmarks such as Kinetics and Moments, the YouTube-8M challenge provides pre-extracted visual and audio features instead of raw videos. In this challenge, the submitted model is restricted to 1GB, which encourages participants to focus on constructing one powerful single model rather than combining the results of many models. Our system fuses six different sub-models, categorized into three families, into a single computational graph. More specifically, the most effective family is the model with non-local operations following the NetVLAD encoding. The other two families are Soft-BoF and GRU. To further boost single-model performance, the model parameters of different checkpoints are averaged. Experimental results demonstrate that our proposed system can effectively perform the video classification task, achieving 0.88763 on the public test set and 0.88704 on the private set in terms of GAP@20. We finally ranked fourth in the YouTube-8M video understanding challenge.
Tasks	Video Classification, Video Recognition, Video Understanding
Published	2018-09-29
URL	http://arxiv.org/abs/1810.00207v1
PDF	http://arxiv.org/pdf/1810.00207v1.pdf
PWC	https://paperswithcode.com/paper/non-local-netvlad-encoding-for-video
Repo
Framework
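
The abstract above mentions averaging the model parameters of different checkpoints. A minimal sketch of that idea follows, assuming each checkpoint is a plain dictionary from a parameter name to a flat list of floats; the function name `average_checkpoints` and the toy data are hypothetical and not taken from the paper.

```python
def average_checkpoints(checkpoints):
    """Average parameters element-wise across checkpoints.

    Assumption: every checkpoint is a dict mapping a parameter name to a
    flat list of floats, and all checkpoints share the same names and shapes.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    averaged = {}
    for name in checkpoints[0]:
        # Group the i-th value of this parameter from every checkpoint.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        averaged[name] = [sum(col) / len(checkpoints) for col in columns]
    return averaged


# Two toy checkpoints with one weight vector and one bias each.
ckpts = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
avg = average_checkpoints(ckpts)
```

Which checkpoints to average, and how many, is a design choice the abstract does not specify.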

Learning Bone Suppression from Dual Energy Chest X-rays using Adversarial Networks


Title	Learning Bone Suppression from Dual Energy Chest X-rays using Adversarial Networks
Authors	Dong Yul Oh, Il Dong Yun
Abstract	Suppressing bones such as the ribs and clavicle on chest X-rays is often expected to improve pathology classification. These bones can interfere with a broad range of diagnostic tasks on pulmonary disease, except those concerning the musculoskeletal system. The current conventional method for acquiring bone-suppressed X-rays is dual energy imaging, which captures two radiographs at a very short interval with different energy levels; however, the patient is exposed to radiation twice and artifacts arise from heartbeats between the two shots. In this paper, we introduce a deep generative model trained to predict bone-suppressed images from single energy chest X-rays by analyzing a finite set of previously acquired dual energy chest X-rays. Since only a relatively small amount of data is available, such an approach relies on a methodology that maximizes data utilization. Here we integrate the following two approaches. First, we use a conditional generative adversarial network that complements the traditional regression method of minimizing the pairwise image difference. Second, we use Haar 2D wavelet decomposition to offer a perceptual guideline in frequency details, allowing the model to converge quickly and efficiently. As a result, we achieve state-of-the-art performance on bone suppression compared to the existing approaches with dual energy chest X-rays.
Tasks	Bone Suppression From Dual Energy Chest X-Rays
Published	2018-11-05
URL	http://arxiv.org/abs/1811.02628v1
PDF	http://arxiv.org/pdf/1811.02628v1.pdf
PWC	https://paperswithcode.com/paper/learning-bone-suppression-from-dual-energy
Repo
Framework
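
The abstract above relies on Haar 2D wavelet decomposition. As a rough illustration of what one decomposition level computes, here is a sketch of the orthonormal 2D Haar transform on a small image held as a list of rows. The function name `haar2d_level` and the sub-band names are illustrative choices; how the paper feeds the sub-bands into its model is not shown here.

```python
def haar2d_level(image):
    """One level of the orthonormal 2D Haar transform.

    `image` is a list of rows with even height and width. Returns four
    half-size sub-bands: approximation, horizontal, vertical and diagonal
    detail (sub-band naming conventions vary between libraries).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block [[a, b], [d, e]] produces one coefficient per band.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((a + b + d + e) / 2.0)  # local average
            h_row.append((a - b + d - e) / 2.0)  # left minus right
            v_row.append((a + b - d - e) / 2.0)  # top minus bottom
            d_row.append((a - b - d + e) / 2.0)  # diagonal difference
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag


approx, horiz, vert, diag = haar2d_level([[1, 2], [3, 4]])
```

With this normalisation the transform preserves the sum of squared pixel values, which makes it easy to sanity-check.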

Kappa Learning: A New Method for Measuring Similarity Between Educational Items Using Performance Data


Title	Kappa Learning: A New Method for Measuring Similarity Between Educational Items Using Performance Data
Authors	Tanya Nazaretsky, Sara Hershkovitz, Giora Alexandron
Abstract	Sequencing items in adaptive learning systems typically relies on a large pool of interactive assessment items (questions) that are analyzed into a hierarchy of skills or Knowledge Components (KCs). Educational data mining techniques can be used to analyze students' performance data in order to optimize the mapping of items to KCs. Standard methods that map items into KCs using item-similarity measures make the implicit assumption that students' performance on items that depend on the same skill should be similar. This assumption holds if the latent trait (mastery of the underlying skill) is relatively fixed during students' activity, as in the context of testing, which is the primary context in which these measures were developed and applied. However, in adaptive learning systems that aim for learning, and address subject matters such as K6 Math that consist of multiple sub-skills, this assumption does not hold. In this paper we propose a new item-similarity measure, termed Kappa Learning (KL), which aims to address this gap. KL identifies similarity between items under the assumption of learning, namely, that learners' mastery of the underlying skills changes as they progress through the items. We evaluate Kappa Learning on data from a computerized tutor that teaches Fractions for 4th grade, with experts' tagging as ground truth, and on simulated data. Our results show that clustering that is based on Kappa Learning outperforms clustering that is based on commonly used similarity measures (Cohen Kappa, Yule, and Pearson).
Tasks
Published	2018-12-20
URL	http://arxiv.org/abs/1812.08390v1
PDF	http://arxiv.org/pdf/1812.08390v1.pdf
PWC	https://paperswithcode.com/paper/kappa-learning-a-new-method-for-measuring
Repo
Framework
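
Cohen's kappa is one of the baseline similarity measures the abstract above compares against. The sketch below computes it for two items from binary correct/incorrect responses of the same students. This is the standard baseline only, not the Kappa Learning measure itself, whose definition the abstract does not give; the function name and the toy data are hypothetical.

```python
def cohen_kappa(item_a, item_b):
    """Cohen's kappa between two equal-length binary (0/1) response vectors."""
    if len(item_a) != len(item_b) or not item_a:
        raise ValueError("need two non-empty vectors of equal length")
    n = len(item_a)
    # Observed agreement: fraction of students with the same outcome on both items.
    observed = sum(1 for x, y in zip(item_a, item_b) if x == y) / n
    # Chance agreement from each item's marginal success rate.
    p_a = sum(item_a) / n
    p_b = sum(item_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both items are constant and identical
    return (observed - expected) / (1 - expected)


# Responses of six students to two items (1 = correct, 0 = incorrect).
item_1 = [1, 1, 0, 0, 1, 0]
item_2 = [1, 0, 0, 0, 1, 1]
kappa = cohen_kappa(item_1, item_2)
```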

Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks


Title	Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks
Authors	Xiaodong Cui, Wei Zhang, Zoltán Tüske, Michael Picheny
Abstract	We propose a population-based Evolutionary Stochastic Gradient Descent (ESGD) framework for optimizing deep neural networks. ESGD combines SGD and gradient-free evolutionary algorithms as complementary algorithms in one framework in which the optimization alternates between the SGD step and evolution step to improve the average fitness of the population. With a back-off strategy in the SGD step and an elitist strategy in the evolution step, it guarantees that the best fitness in the population will never degrade. In addition, individuals in the population optimized with various SGD-based optimizers using distinct hyper-parameters in the SGD step are considered as competing species in a coevolution setting such that the complementarity of the optimizers is also taken into account. The effectiveness of ESGD is demonstrated across multiple applications including speech recognition, image recognition and language modeling, using networks with a variety of deep architectures.
Tasks	Language Modelling, Speech Recognition
Published	2018-10-16
URL	http://arxiv.org/abs/1810.06773v1
PDF	http://arxiv.org/pdf/1810.06773v1.pdf
PWC	https://paperswithcode.com/paper/evolutionary-stochastic-gradient-descent-for
Repo
Framework
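
The alternation described in the abstract above, an SGD step with back-off followed by an elitist evolution step, can be illustrated on a toy problem. The sketch below minimises a one-dimensional quadratic with a population whose members carry different learning rates. All names, the mutation scheme and the selection rule are simplifying assumptions for illustration; the paper's actual operators are not described in the abstract.

```python
import random


def loss(x):
    return (x - 3.0) ** 2


def grad(x):
    return 2.0 * (x - 3.0)


def sgd_step(x, lr, steps=5):
    """Run a few gradient steps; back off to the start if the loss got worse."""
    candidate = x
    for _ in range(steps):
        candidate -= lr * grad(candidate)
    return candidate if loss(candidate) <= loss(x) else x


def evolution_step(population, rng, n_keep, sigma=0.1):
    """Mutate every individual, then keep the fittest of parents and offspring."""
    offspring = [(x + rng.gauss(0.0, sigma), lr) for x, lr in population]
    pool = population + offspring
    pool.sort(key=lambda individual: loss(individual[0]))
    return pool[:n_keep]  # elitist: the best parent can never be lost


def esgd(generations=10, seed=0):
    rng = random.Random(seed)
    # Each individual is (parameter, learning rate); lr=1.5 diverges on this
    # problem, so the back-off in sgd_step is exercised.
    population = [(rng.uniform(-10.0, 10.0), lr) for lr in (0.01, 0.05, 0.2, 1.5)]
    best_history = []
    for _ in range(generations):
        population = [(sgd_step(x, lr), lr) for x, lr in population]
        population = evolution_step(population, rng, n_keep=len(population))
        best_history.append(min(loss(x) for x, _ in population))
    return best_history


history = esgd()
```

Because neither step can make the best individual worse, the recorded best loss is non-increasing from one generation to the next, which mirrors the guarantee stated in the abstract.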

An Entropic Optimal Transport Loss for Learning Deep Neural Networks under Label Noise in Remote Sensing Images


Title	An Entropic Optimal Transport Loss for Learning Deep Neural Networks under Label Noise in Remote Sensing Images
Authors	Bharath Bhushan Damodaran, Rémi Flamary, Viven Seguy, Nicolas Courty
Abstract	Deep neural networks have been established as a powerful tool for large scale supervised classification tasks. The state-of-the-art performance of deep neural networks is conditioned on the availability of a large number of accurately labeled samples. In practice, collecting large scale accurately labeled datasets is a challenging and tedious task in most scenarios of remote sensing image analysis, thus cheap surrogate procedures are employed to label the dataset. Training deep neural networks on such datasets with inaccurate labels easily overfits to the noisy training labels and drastically degrades classification performance. To mitigate this effect, we propose an original solution based on entropic optimal transportation. It allows end-to-end learning of deep neural networks that are, to some extent, robust to inaccurately labeled samples. We demonstrate this empirically on several remote sensing datasets, where both scene and pixel-based hyperspectral images are considered for classification. Our method proves to be highly tolerant to significant amounts of label noise and achieves favorable results against state-of-the-art methods.
Tasks
Published	2018-10-02
URL	http://arxiv.org/abs/1810.01163v1
PDF	http://arxiv.org/pdf/1810.01163v1.pdf
PWC	https://paperswithcode.com/paper/an-entropic-optimal-transport-loss-for
Repo
Framework
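
Entropic optimal transport, on which the loss in the abstract above is built, is commonly solved with Sinkhorn iterations. The sketch below computes an entropy-regularised transport plan between two small histograms in plain Python. It illustrates the general building block only; how the paper turns such a plan into a training loss is not stated in the abstract, and the function name, `epsilon` value and toy inputs are assumptions.

```python
import math


def sinkhorn(a, b, cost, epsilon=0.5, n_iters=200):
    """Entropy-regularised transport plan between histograms a and b.

    a, b: lists of positive weights with equal total mass.
    cost: len(a) x len(b) matrix of transport costs.
    """
    n, m = len(a), len(b)
    # Gibbs kernel derived from the cost matrix.
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


source = [0.5, 0.5]
target = [0.3, 0.7]
cost_matrix = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(source, target, cost_matrix)
```

A smaller `epsilon` gives a plan closer to unregularised optimal transport at the price of slower convergence and possible numerical underflow in the kernel.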

Towards Formula Translation using Recursive Neural Networks


Title	Towards Formula Translation using Recursive Neural Networks
Authors	Felix Petersen, Moritz Schubotz, Bela Gipp
Abstract	While it has become common to perform automated translations on natural language, performing translations between different representations of mathematical formulae has thus far not been possible. We implemented the first translator for mathematical formulae based on recursive neural networks. We chose recursive neural networks because mathematical formulae inherently include a structural encoding. In our implementation, we developed new techniques and topologies for recursive tree-to-tree neural networks based on multi-variate multi-valued Long Short-Term Memory cells. We propose a novel approach for mini-batch training that utilizes clustering and tree traversal. We evaluate our translator and analyze the behavior of our proposed topologies and techniques based on a translation from generic LaTeX to the semantic LaTeX notation. We use the semantic LaTeX notation from the Digital Library for Mathematical Formulae and the Digital Repository for Mathematical Formulae at the National Institute for Standards and Technology. We find that a simple heuristics-based clustering algorithm outperforms the conventional clustering algorithms on the task of clustering binary trees of mathematical formulae with respect to their topology. Furthermore, we find a mask for the loss function, which can prevent the neural network from finding a local minimum of the loss function. Given our preliminary results, a complete translation from formula to formula is not yet possible. However, we achieved a prediction accuracy of 47.05% for predicting symbols at the correct position and an accuracy of 92.3% when ignoring the predicted position. Concluding, our work advances the field of recursive neural networks by improving the training speed and quality of training. In the future, we will work towards a complete translation allowing a machine-interpretation of LaTeX formulae.
Tasks
Published	2018-11-10
URL	http://arxiv.org/abs/1811.04234v1
PDF	http://arxiv.org/pdf/1811.04234v1.pdf
PWC	https://paperswithcode.com/paper/towards-formula-translation-using-recursive
Repo
Framework

Two-Stream Neural Networks for Tampered Face Detection


Title	Two-Stream Neural Networks for Tampered Face Detection
Authors	Peng Zhou, Xintong Han, Vlad I. Morariu, Larry S. Davis
Abstract	We propose a two-stream network for face tampering detection. We train GoogLeNet to detect tampering artifacts in a face classification stream, and train a patch based triplet network to leverage features capturing local noise residuals and camera characteristics as a second stream. In addition, we use two different online face swapping applications to create a new dataset that consists of 2010 tampered images, each of which contains a tampered face. We evaluate the proposed two-stream network on our newly collected dataset. Experimental results demonstrate the effectiveness of our method.
Tasks	Face Detection, Face Swapping
Published	2018-03-29
URL	http://arxiv.org/abs/1803.11276v1
PDF	http://arxiv.org/pdf/1803.11276v1.pdf
PWC	https://paperswithcode.com/paper/two-stream-neural-networks-for-tampered-face
Repo
Framework

Natural measures of alignment


Title	Natural measures of alignment
Authors	R. A. Kycia, Z. Tabor
Abstract	A natural coordinate system is proposed. In this coordinate system, the alignment procedure of a device and a detector can be easily performed. This approach is a generalization of previous specific formulas in the field of calibration and provides a top-level description of the procedure. A basic example application to a linac therapy plan is also provided.
Tasks	Calibration
Published	2018-10-01
URL	http://arxiv.org/abs/1810.00965v1
PDF	http://arxiv.org/pdf/1810.00965v1.pdf
PWC	https://paperswithcode.com/paper/natural-measures-of-alignment
Repo
Framework

Unsupervised Trajectory Segmentation and Promoting of Multi-Modal Surgical Demonstrations


Title	Unsupervised Trajectory Segmentation and Promoting of Multi-Modal Surgical Demonstrations
Authors	Zhenzhou Shao, Hongfa Zhao, Jiexin Xie, Ying Qu, Yong Guan, Jindong Tan
Abstract	To improve the efficiency of surgical trajectory segmentation for robot learning in robot-assisted minimally invasive surgery, this paper presents a fast unsupervised method using video and kinematic data, followed by a promoting procedure to address the over-segmentation issue. An unsupervised deep learning network, a stacked convolutional auto-encoder, is employed to extract more discriminative features from videos effectively. To further improve the accuracy of segmentation, on the one hand, a wavelet transform is used to filter out the noise present in the features from video and kinematic data. On the other hand, the segmentation result is promoted by identifying adjacent segments with no state transition based on predefined similarity measurements. Extensive experiments on the public dataset JIGSAWS show that our method achieves much higher segmentation accuracy than state-of-the-art methods in a shorter time.
Tasks
Published	2018-10-01
URL	https://arxiv.org/abs/1810.00599v1
PDF	https://arxiv.org/pdf/1810.00599v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-trajectory-segmentation-and
Repo
Framework

Hybrid Active Inference


Title	Hybrid Active Inference
Authors	André Ofner, Sebastian Stober
Abstract	We describe a framework of hybrid cognition by formulating a hybrid cognitive agent that performs hierarchical active inference across a human and a machine part. We suggest that, in addition to enhancing human cognitive functions with an intelligent and adaptive interface, integrated cognitive processing could accelerate emergent properties within artificial intelligence. To establish this, a machine learning part learns to integrate into human cognition by explaining away multi-modal sensory measurements from the environment and physiology simultaneously with the brain signal. With ongoing training, the amount of predictable brain signal increases. This lends the agent the ability to self-supervise on increasingly high levels of cognitive processing in order to further minimize surprise in predicting the brain signal. Furthermore, with an increasing level of integration, the access to sensory information about environment and physiology is substituted with access to their representation in the brain. While integrating into a joint embodiment of human and machine, human action and perception are treated as the machine’s own. The framework can be implemented with invasive as well as non-invasive sensors for environment, body and brain interfacing. Online and offline training with different machine learning approaches are conceivable. Building on previous research on shared representation learning, we suggest a first implementation leading towards hybrid active inference with non-invasive brain interfacing and state of the art probabilistic deep learning methods. We further discuss how implementation might affect the meta-cognitive abilities of the described agent and suggest that with adequate implementation the machine part can continue to execute and build upon the learned cognitive processes autonomously.
Tasks	Representation Learning
Published	2018-10-05
URL	http://arxiv.org/abs/1810.02647v1
PDF	http://arxiv.org/pdf/1810.02647v1.pdf
PWC	https://paperswithcode.com/paper/hybrid-active-inference
Repo
Framework

Speaker Naming in Movies


Title	Speaker Naming in Movies
Authors	Mahmoud Azab, Mingzhe Wang, Max Smith, Noriyuki Kojima, Jia Deng, Rada Mihalcea
Abstract	We propose a new model for speaker naming in movies that leverages visual, textual, and acoustic modalities in a unified optimization framework. To evaluate the performance of our model, we introduce a new dataset consisting of six episodes of the Big Bang Theory TV show and eighteen full movies covering different genres. Our experiments show that our multimodal model significantly outperforms several competitive baselines on the average weighted F-score metric. To demonstrate the effectiveness of our framework, we design an end-to-end memory network model that leverages our speaker naming model and achieves state-of-the-art results on the subtitles task of the MovieQA 2017 Challenge.
Tasks
Published	2018-09-24
URL	http://arxiv.org/abs/1809.08761v1
PDF	http://arxiv.org/pdf/1809.08761v1.pdf
PWC	https://paperswithcode.com/paper/speaker-naming-in-movies
Repo
Framework

Semi-automated Annotation of Signal Events in Clinical EEG Data


Title	Semi-automated Annotation of Signal Events in Clinical EEG Data
Authors	Scott Yang, Silvia Lopez, Meysam Golmohammadi, Iyad Obeid, Joseph Picone
Abstract	To be effective, state of the art machine learning technology needs large amounts of annotated data. There are numerous compelling applications in healthcare that can benefit from high performance automated decision support systems provided by deep learning technology, but they lack the comprehensive data resources required to apply sophisticated machine learning models. Further, for economic reasons, it is very difficult to justify the creation of large annotated corpora for these applications. Hence, automated annotation techniques become increasingly important. In this study, we investigated the effectiveness of using an active learning algorithm to automatically annotate a large EEG corpus. The algorithm is designed to annotate six types of EEG events. Two model training schemes, namely threshold-based and volume-based, are evaluated. In the threshold-based scheme the threshold of confidence scores is optimized in the initial training iteration, whereas for the volume-based scheme only a certain amount of data is preserved after each iteration. Recognition performance is improved by 2% absolute and the system is capable of automatically annotating previously unlabeled data. Given that the interpretation of clinical EEG data is an exceedingly difficult task, this study provides some evidence that the proposed method is a viable alternative to expensive manual annotation.
Tasks	Active Learning, EEG
Published	2018-01-03
URL	http://arxiv.org/abs/1801.02476v1
PDF	http://arxiv.org/pdf/1801.02476v1.pdf
PWC	https://paperswithcode.com/paper/semi-automated-annotation-of-signal-events-in
Repo
Framework

Fake News Identification on Twitter with Hybrid CNN and RNN Models


Title	Fake News Identification on Twitter with Hybrid CNN and RNN Models
Authors	Oluwaseun Ajao, Deepayan Bhowmik, Shahrzad Zargari
Abstract	The problem associated with the propagation of fake news continues to grow at an alarming scale. This trend has generated much interest from politics to academia and industry alike. We propose a framework that detects and classifies fake news messages from Twitter posts using a hybrid of convolutional neural networks and long short-term recurrent neural network models. The proposed work using this deep learning approach achieves 82% accuracy. Our approach intuitively identifies relevant features associated with fake news stories without previous knowledge of the domain.
Tasks
Published	2018-06-29
URL	http://arxiv.org/abs/1806.11316v1
PDF	http://arxiv.org/pdf/1806.11316v1.pdf
PWC	https://paperswithcode.com/paper/fake-news-identification-on-twitter-with
Repo
Framework

Myocardial Segmentation of Contrast Echocardiograms Using Random Forests Guided by Shape Model


Title	Myocardial Segmentation of Contrast Echocardiograms Using Random Forests Guided by Shape Model
Authors	Yuanwei Li, Chin Pang Ho, Navtej Chahal, Roxy Senior, Meng-Xing Tang
Abstract	Myocardial Contrast Echocardiography (MCE) with micro-bubble contrast agent enables myocardial perfusion quantification, which is invaluable for the early detection of coronary artery diseases. In this paper, we propose a new segmentation method called Shape Model guided Random Forests (SMRF) for the analysis of MCE data. The proposed method utilizes a statistical shape model of the myocardium to guide the Random Forest (RF) segmentation in two ways. First, we introduce a novel Shape Model (SM) feature which captures the global structure and shape of the myocardium to produce a more accurate RF probability map. Second, the shape model is fitted to the RF probability map to further refine and constrain the final segmentation to plausible myocardial shapes. Evaluated on clinical MCE images from 15 patients, our method obtained promising results (Dice=0.81, Jaccard=0.70, MAD=1.68 mm, HD=6.53 mm) and showed a notable improvement in segmentation accuracy over the classic RF and its variants.
Tasks
Published	2018-06-19
URL	http://arxiv.org/abs/1806.07490v1
PDF	http://arxiv.org/pdf/1806.07490v1.pdf
PWC	https://paperswithcode.com/paper/myocardial-segmentation-of-contrast
Repo
Framework
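
The abstract above reports Dice and Jaccard scores. For readers unfamiliar with these overlap metrics, here is a small sketch that computes both for two binary masks stored as flat lists of 0/1 values. The function name and the toy masks are illustrative and unrelated to the paper's data.

```python
def dice_and_jaccard(pred, truth):
    """Dice and Jaccard overlap between two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        return 1.0, 1.0  # both masks empty: treat as perfect agreement
    dice = 2.0 * intersection / (pred_size + truth_size)
    jaccard = intersection / union
    return dice, jaccard


predicted = [1, 1, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0]
dice, jaccard = dice_and_jaccard(predicted, reference)
```

For a single pair of masks the two scores are tied by Dice = 2J / (1 + J); scores averaged over many images need not satisfy this identity exactly.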

In-network Neural Networks


Title	In-network Neural Networks
Authors	Giuseppe Siracusano, Roberto Bifulco
Abstract	We present N2Net, a system that implements binary neural networks using commodity switching chips deployed in network switches and routers. Our system shows that these devices can run simple neural network models, whose input is encoded in the network packets’ header, at packet processing speeds (billions of packets per second). Furthermore, our experience highlights that switching chips could support even more complex models, provided that some minor and cheap modifications to the chip’s design are applied. We believe N2Net provides an interesting building block for future end-to-end networked systems.
Tasks
Published	2018-01-17
URL	http://arxiv.org/abs/1801.05731v1
PDF	http://arxiv.org/pdf/1801.05731v1.pdf
PWC	https://paperswithcode.com/paper/in-network-neural-networks
Repo
Framework
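
Binary neural networks, as mentioned in the abstract above, are typically evaluated with bitwise operations: with inputs and weights restricted to +1/-1 and packed into bits, a dot product reduces to XNOR followed by a population count. The sketch below shows that arithmetic for a single neuron in Python. It is a generic illustration under that encoding assumption, not N2Net's switching-chip implementation, and the function name is hypothetical.

```python
def binary_neuron(x_bits, w_bits, n_bits):
    """Evaluate one binarised neuron on packed inputs.

    Assumed encoding: bit 1 stands for +1 and bit 0 stands for -1.
    Returns (dot, activation), where activation is 1 when dot >= 0.
    """
    mask = (1 << n_bits) - 1
    # XNOR marks the positions where input and weight agree.
    agree = ~(x_bits ^ w_bits) & mask
    matches = bin(agree).count("1")  # population count
    # Each agreement contributes +1 and each disagreement -1.
    dot = 2 * matches - n_bits
    return dot, 1 if dot >= 0 else 0


dot_a, out_a = binary_neuron(0b1011, 0b1001, 4)  # three of four bits agree
dot_b, out_b = binary_neuron(0b1011, 0b0100, 4)  # no bits agree
```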