October 17, 2019

Paper Group ANR 689

Joint Shape Representation and Classification for Detecting PDAC


Title	Joint Shape Representation and Classification for Detecting PDAC
Authors	Fengze Liu, Lingxi Xie, Yingda Xia, Elliot K. Fishman, Alan L. Yuille
Abstract	We aim to detect pancreatic ductal adenocarcinoma (PDAC) in abdominal CT scans, which sheds light on early diagnosis of pancreatic cancer. This is a 3D volume classification task with little training data. We propose a two-stage framework, which first segments the pancreas into a binary mask, then compresses the mask into a shape vector and performs abnormality classification. Shape representation and classification are performed in a joint manner, both to exploit the knowledge that PDAC often changes the shape of the pancreas and to prevent over-fitting. Experiments are performed on 300 normal scans and 136 PDAC cases. We achieve a specificity of 90.2% (false alarms occur in fewer than 1 in 10 normal cases) at a sensitivity of 80.2% (fewer than 1 in 5 PDAC cases are missed), which shows promise for clinical applications.
Tasks
Published	2018-04-27
URL	https://arxiv.org/abs/1804.10684v2
PDF	https://arxiv.org/pdf/1804.10684v2.pdf
PWC	https://paperswithcode.com/paper/joint-shape-representation-and-classification
Repo
Framework
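
The parenthetical claims in the abstract above follow directly from the reported rates. A minimal sketch of that arithmetic, using only the two percentages quoted in the abstract (the variable names are illustrative, not from the paper):

```python
# Rates as reported in the abstract.
specificity = 0.902  # fraction of normal scans correctly classified as normal
sensitivity = 0.802  # fraction of PDAC cases correctly detected

# False-alarm rate: fraction of normal scans flagged as abnormal.
false_alarm_rate = 1.0 - specificity
# Miss rate: fraction of PDAC cases that go undetected.
miss_rate = 1.0 - sensitivity

print(f"false-alarm rate: {false_alarm_rate:.3f} (below 1/10: {false_alarm_rate < 0.1})")
print(f"miss rate:        {miss_rate:.3f} (below 1/5:  {miss_rate < 0.2})")
```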

Bag-of-Words as Target for Neural Machine Translation


Title	Bag-of-Words as Target for Neural Machine Translation
Authors	Shuming Ma, Xu Sun, Yizhong Wang, Junyang Lin
Abstract	A sentence can be translated into more than one correct sentence. However, most existing neural machine translation models use only one of the correct translations as the target, and the other correct sentences are penalized as incorrect in the training stage. Since most of the correct translations of a sentence share a similar bag-of-words, it is possible to distinguish the correct translations from the incorrect ones by the bag-of-words. In this paper, we propose an approach that uses both the sentences and the bag-of-words as targets in the training stage, in order to encourage the model to generate potentially correct sentences that do not appear in the training set. We evaluate our model on a Chinese-English translation dataset, and experiments show that our model outperforms strong baselines by 4.55 BLEU points.
Tasks	Machine Translation
Published	2018-05-13
URL	http://arxiv.org/abs/1805.04871v1
PDF	http://arxiv.org/pdf/1805.04871v1.pdf
PWC	https://paperswithcode.com/paper/bag-of-words-as-target-for-neural-machine
Repo
Framework
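
The premise of the abstract above is that alternative correct translations tend to share a bag-of-words while incorrect ones do not. The sketch below illustrates only that premise with made-up English sentences; it is not the paper's training objective, and `bag_of_words` and `bow_overlap` are hypothetical helper names.

```python
from collections import Counter

def bag_of_words(sentence):
    """Multiset of lowercased whitespace tokens; word order is discarded."""
    return Counter(sentence.lower().split())

def bow_overlap(a, b):
    """Multiset Jaccard similarity between two bags of words (0.0 to 1.0)."""
    intersection = sum((a & b).values())
    union = sum((a | b).values())
    return intersection / union if union else 1.0

# Hypothetical sentences: a reference, a reordered variant, and an unrelated one.
reference = bag_of_words("the cat sat on the mat")
reordered = bag_of_words("on the mat the cat sat")
unrelated = bag_of_words("stock prices fell sharply today")

same_bag = bow_overlap(reference, reordered)
different_bag = bow_overlap(reference, unrelated)
print(same_bag, different_bag)
```

A sequence-level target would treat the reordered variant as wrong at almost every position, whereas its bag-of-words matches the reference exactly.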

Heuristic Approaches for Goal Recognition in Incomplete Domain Models


Title	Heuristic Approaches for Goal Recognition in Incomplete Domain Models
Authors	Ramon Fraga Pereira, Felipe Meneguzzi
Abstract	Recent approaches to goal recognition have progressively relaxed the assumptions about the amount and correctness of domain knowledge and available observations, yielding accurate and efficient algorithms. These approaches, however, assume completeness and correctness of the domain theory against which their algorithms match observations: this is too strong for most real-world domains. In this paper, we develop goal recognition techniques that are capable of recognizing goals using incomplete (and possibly incorrect) domain theories. We show the efficiency and accuracy of our approaches empirically against a large dataset of goal and plan recognition problems with incomplete domains.
Tasks
Published	2018-04-16
URL	http://arxiv.org/abs/1804.05917v1
PDF	http://arxiv.org/pdf/1804.05917v1.pdf
PWC	https://paperswithcode.com/paper/heuristic-approaches-for-goal-recognition-in
Repo
Framework

Crowd-Assisted Polyp Annotation of Virtual Colonoscopy Videos


Title	Crowd-Assisted Polyp Annotation of Virtual Colonoscopy Videos
Authors	Ji Hwan Park, Saad Nadeem, Joseph Marino, Kevin Baker, Matthew Barish, Arie Kaufman
Abstract	Virtual colonoscopy (VC) allows a radiologist to navigate through a 3D colon model reconstructed from a computed tomography scan of the abdomen, looking for polyps, the precursors of colon cancer. Polyps are seen as protrusions on the colon wall and haustral folds, visible in the VC fly-through videos. A complete review of the colon surface requires full navigation from the rectum to the cecum in antegrade and retrograde directions, which is a tedious task that takes an average of 30 minutes. Crowdsourcing is a technique for non-expert users to perform certain tasks, such as image or video annotation. In this work, we use crowdsourcing for the examination of complete VC fly-through videos for polyp annotation by non-experts. The motivation for this is to potentially help the radiologist reach a diagnosis in a shorter period of time, and provide a stronger confirmation of the eventual diagnosis. The crowdsourcing interface includes an interactive tool for the crowd to annotate suspected polyps in the video with an enclosing box. Using our workflow, we achieve an overall polyps-per-patient sensitivity of 87.88% (95.65% for polyps ≥5 mm and 70% for polyps <5 mm). We also demonstrate the efficacy and effectiveness of a non-expert user in detecting and annotating polyps and discuss the potential of non-expert users to aid radiologists in VC examinations.
Tasks
Published	2018-09-17
URL	http://arxiv.org/abs/1809.06408v1
PDF	http://arxiv.org/pdf/1809.06408v1.pdf
PWC	https://paperswithcode.com/paper/crowd-assisted-polyp-annotation-of-virtual
Repo
Framework

Development of deep learning algorithms to categorize free-text notes pertaining to diabetes: convolution neural networks achieve higher accuracy than support vector machines


Title	Development of deep learning algorithms to categorize free-text notes pertaining to diabetes: convolution neural networks achieve higher accuracy than support vector machines
Authors	Boyi Yang, Adam Wright
Abstract	Health professionals can use natural language processing (NLP) technologies when reviewing electronic health records (EHR). Machine learning free-text classifiers can help them identify problems and make critical decisions. We aim to develop deep learning neural network algorithms that identify EHR progress notes pertaining to diabetes and validate the algorithms at two institutions. The data used are 2,000 EHR progress notes retrieved from patients with diabetes; all notes were annotated manually as diabetic or non-diabetic. Several deep learning classifiers were developed, and their performance was evaluated with the area under the ROC curve (AUC). The convolutional neural network (CNN) model with a separable convolution layer accurately identified diabetes-related notes in the Brigham and Women's Hospital testing set with the highest AUC of 0.975. Deep learning classifiers can be used to identify EHR progress notes pertaining to diabetes. In particular, the CNN-based classifier can achieve a higher AUC than an SVM-based classifier.
Tasks
Published	2018-09-16
URL	http://arxiv.org/abs/1809.05814v1
PDF	http://arxiv.org/pdf/1809.05814v1.pdf
PWC	https://paperswithcode.com/paper/development-of-deep-learning-algorithms-to
Repo
Framework

Efficient Algorithms for Outlier-Robust Regression


Title	Efficient Algorithms for Outlier-Robust Regression
Authors	Adam Klivans, Pravesh K. Kothari, Raghu Meka
Abstract	We give the first polynomial-time algorithm for performing linear or polynomial regression resilient to adversarial corruptions in both examples and labels. Given a sufficiently large (polynomial-size) training set drawn i.i.d. from distribution D and subsequently corrupted on some fraction of points, our algorithm outputs a linear function whose squared error is close to the squared error of the best-fitting linear function with respect to D, assuming that the marginal distribution of D over the input space is certifiably hypercontractive. This natural property is satisfied by many well-studied distributions such as Gaussian distributions, strongly log-concave distributions, and the uniform distribution on the hypercube, among others. We also give a simple statistical lower bound showing that some distributional assumption is necessary to succeed in this setting. These results are the first of their kind and were not known to be even information-theoretically possible prior to our work. Our approach is based on the sum-of-squares (SoS) method and is inspired by the recent applications of the method for parameter recovery problems in unsupervised learning. Our algorithm can be seen as a natural convex relaxation of the following conceptually simple non-convex optimization problem: find a linear function and a large subset of the input corrupted sample such that the least squares loss of the function over the subset is minimized over all possible large subsets.
Tasks
Published	2018-03-08
URL	http://arxiv.org/abs/1803.03241v2
PDF	http://arxiv.org/pdf/1803.03241v2.pdf
PWC	https://paperswithcode.com/paper/efficient-algorithms-for-outlier-robust
Repo
Framework
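
The abstract above closes by describing a conceptually simple non-convex problem: choose a linear function and a large subset of the corrupted sample so that the least-squares loss over the subset is as small as possible. The sketch below solves that problem by brute force on a tiny one-dimensional example. It is not the paper's sum-of-squares relaxation, which exists precisely because this enumeration is exponential in the sample size; the data points and the names `fit_line` and `best_subset_fit` are made up for illustration.

```python
from itertools import combinations

def fit_line(points):
    """Ordinary least squares for y = a*x + b over (x, y) pairs; returns (a, b, loss)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = cov_xy / var_x  # assumes the x values in the subset are not all equal
    b = mean_y - a * mean_x
    loss = sum((a * x + b - y) ** 2 for x, y in points)
    return a, b, loss

def best_subset_fit(points, keep):
    """Try every subset of size `keep` and return the fit with the smallest loss."""
    best = None
    for subset in combinations(points, keep):
        a, b, loss = fit_line(subset)
        if best is None or loss < best[2]:
            best = (a, b, loss, subset)
    return best

# Hypothetical sample: points on y = 2x + 1, with the label at x = 3 corrupted.
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 40.0), (4.0, 9.0), (5.0, 11.0)]
a, b, loss, kept = best_subset_fit(points, keep=5)
print(f"slope={a:.3f} intercept={b:.3f} loss={loss:.3e}")
```

On this toy sample the only zero-loss subset is the one that drops the corrupted point, so the search recovers the underlying line.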

Anime Style Space Exploration Using Metric Learning and Generative Adversarial Networks


Title	Anime Style Space Exploration Using Metric Learning and Generative Adversarial Networks
Authors	Sitao Xiang, Hao Li
Abstract	Deep learning-based style transfer between images has recently become a popular area of research. A common way of encoding “style” is through a feature representation based on the Gram matrix of features extracted by some pre-trained neural network or some other form of feature statistics. Such a definition is based on an arbitrary human decision and may not best capture what a style really is. In trying to gain a better understanding of “style”, we propose a metric learning-based method to explicitly encode the style of an artwork. In particular, our definition of style captures the differences between artists, as shown by classification performance, and the style representation can be interpreted, manipulated and visualized through style-conditioned image generation with a Generative Adversarial Network. We employ this method to explore the style space of anime portrait illustrations.
Tasks	Image Generation, Metric Learning, Style Transfer
Published	2018-05-21
URL	http://arxiv.org/abs/1805.07997v1
PDF	http://arxiv.org/pdf/1805.07997v1.pdf
PWC	https://paperswithcode.com/paper/anime-style-space-exploration-using-metric
Repo
Framework
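
The abstract above contrasts its learned style encoding with the common Gram-matrix representation. As a point of reference, the sketch below computes a Gram matrix from a small feature map in plain Python. It shows the baseline the abstract mentions, not the authors' metric-learning method, and the feature values are made up.

```python
def gram_matrix(features):
    """Gram matrix of a feature map.

    `features` is a list of C channels, each a flat list of N activations
    (one per spatial position). Entry (i, j) is the inner product of
    channel i with channel j, so spatial layout is discarded.
    """
    return [
        [sum(a * b for a, b in zip(channel_i, channel_j)) for channel_j in features]
        for channel_i in features
    ]

# Hypothetical 2-channel feature map with 2 spatial positions.
features = [[1, 2], [3, 4]]
style = gram_matrix(features)
print(style)
```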

A Simplified Active Calibration algorithm for Focal Length Estimation


Title	A Simplified Active Calibration algorithm for Focal Length Estimation
Authors	Mehdi Faraji, Anup Basu
Abstract	We introduce new linear mathematical formulations to calculate the focal length of a camera in an active platform. Through mathematical derivations, we show that the focal lengths in each direction can be estimated using only one point correspondence that relates images taken before and after a degenerate rotation of the camera. The new formulations will be beneficial in robotic and dynamic surveillance environments when the camera needs to be calibrated while it freely moves and zooms. By establishing a correspondence between only two images taken after slightly panning and tilting the camera and a reference image, our proposed Simplified Calibration Method is able to calculate the focal length of the camera. We extensively evaluate the derived formulations on a simulated camera, 3D scenes and real-world images. Our error analysis over simulated and real images indicates that the proposed Simplified Active Calibration formulation estimates the parameters of a camera with low error rates.
Tasks	Calibration
Published	2018-06-10
URL	http://arxiv.org/abs/1806.03584v1
PDF	http://arxiv.org/pdf/1806.03584v1.pdf
PWC	https://paperswithcode.com/paper/a-simplified-active-calibration-algorithm-for
Repo
Framework

Top-Down Tree Structured Text Generation


Title	Top-Down Tree Structured Text Generation
Authors	Qipeng Guo, Xipeng Qiu, Xiangyang Xue, Zheng Zhang
Abstract	Text generation is a fundamental building block in natural language processing tasks. Existing sequential models perform autoregression directly over the text sequence and have difficulty generating long sentences with complex structures. This paper advocates a simple approach that treats sentence generation as a tree-generation task. By explicitly modelling syntactic structures in a constituent syntactic tree and performing top-down, breadth-first tree generation, our model fixes dependencies appropriately and performs implicit global planning. This is in contrast to the transition-based depth-first generation process, which has difficulty dealing with incomplete texts when parsing and also does not incorporate future contexts in planning. Our preliminary results on two generation tasks and one parsing task demonstrate that this is an effective strategy.
Tasks	Text Generation
Published	2018-08-14
URL	http://arxiv.org/abs/1808.04865v1
PDF	http://arxiv.org/pdf/1808.04865v1.pdf
PWC	https://paperswithcode.com/paper/top-down-tree-structured-text-generation
Repo
Framework

Using Inter-Sentence Diverse Beam Search to Reduce Redundancy in Visual Storytelling


Title	Using Inter-Sentence Diverse Beam Search to Reduce Redundancy in Visual Storytelling
Authors	Chao-Chun Hsu, Szu-Min Chen, Ming-Hsun Hsieh, Lun-Wei Ku
Abstract	Visual storytelling includes two important parts: coherence between the story and images as well as the story structure. For image-to-text neural network models, similar images in the sequence provide similar information to the story generator, which then produces almost identical sentences. However, repeatedly narrating the same objects or events undermines a good story structure. In this paper, we propose an inter-sentence diverse beam search to generate a more expressive story. Compared to some recent models for the visual storytelling task, which generate a story without considering the sentence generated for the previous picture, our proposed method can avoid generating identical sentences even given a sequence of similar pictures.
Tasks	Visual Storytelling
Published	2018-05-30
URL	http://arxiv.org/abs/1805.11867v1
PDF	http://arxiv.org/pdf/1805.11867v1.pdf
PWC	https://paperswithcode.com/paper/using-inter-sentence-diverse-beam-search-to
Repo
Framework
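
The idea in the abstract above, taking earlier sentences into account so that similar pictures do not yield identical sentences, can be illustrated with a much simpler stand-in: re-ranking finished candidates with a penalty for words already used in the story. This is a simplification and not the paper's inter-sentence diverse beam search, whose exact scoring is not given here; the function name, the `penalty` parameter, the sentences, and the log-probabilities are all hypothetical.

```python
def rerank_with_redundancy_penalty(candidates, previous_sentences, penalty=1.0):
    """Pick the candidate with the best score after penalizing repeated words.

    `candidates` is a list of (sentence, log_prob) pairs for the current image.
    Each word that already appeared in `previous_sentences` costs `penalty`.
    """
    seen = set()
    for sentence in previous_sentences:
        seen.update(sentence.lower().split())

    def adjusted_score(item):
        sentence, log_prob = item
        repeated = sum(1 for word in sentence.lower().split() if word in seen)
        return log_prob - penalty * repeated

    return max(candidates, key=adjusted_score)[0]

# Hypothetical story so far and candidates for the next, visually similar image.
story = ["a man rides a bike"]
candidates = [("a man rides a bike", -1.0), ("he stops near the park", -2.5)]

with_penalty = rerank_with_redundancy_penalty(candidates, story, penalty=1.0)
without_penalty = rerank_with_redundancy_penalty(candidates, story, penalty=0.0)
print(with_penalty)
print(without_penalty)
```

With the penalty switched off, the most probable candidate repeats the previous sentence; with it on, the less probable but novel sentence wins.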

Enhancing Identification of Causal Effects by Pruning


Title	Enhancing Identification of Causal Effects by Pruning
Authors	Santtu Tikka, Juha Karvanen
Abstract	Causal models communicate our assumptions about causes and effects in real-world phenomena. Often the interest lies in the identification of the effect of an action which means deriving an expression from the observed probability distribution for the interventional distribution resulting from the action. In many cases an identifiability algorithm may return a complicated expression that contains variables that are in fact unnecessary. In practice this can lead to additional computational burden and increased bias or inefficiency of estimates when dealing with measurement error or missing data. We present graphical criteria to detect variables which are redundant in identifying causal effects. We also provide an improved version of a well-known identifiability algorithm that implements these criteria.
Tasks
Published	2018-06-19
URL	http://arxiv.org/abs/1806.07085v1
PDF	http://arxiv.org/pdf/1806.07085v1.pdf
PWC	https://paperswithcode.com/paper/enhancing-identification-of-causal-effects-by
Repo
Framework

Facelet-Bank for Fast Portrait Manipulation


Title	Facelet-Bank for Fast Portrait Manipulation
Authors	Ying-Cong Chen, Huaijia Lin, Michelle Shu, Ruiyu Li, Xin Tao, Yangang Ye, Xiaoyong Shen, Jiaya Jia
Abstract	Digital face manipulation has become a popular and fascinating way to touch images with the prevalence of smartphones and social networks. With a wide variety of user preferences, facial expressions, and accessories, a general and flexible model is necessary to accommodate different types of facial editing. In this paper, we propose a model to achieve this goal based on an end-to-end convolutional neural network that supports fast inference, edit-effect control, and quick partial-model update. In addition, this model learns from unpaired image sets with different attributes. Experimental results show that our framework can handle a wide range of expressions, accessories, and makeup effects. It produces high-resolution, high-quality results at high speed.
Tasks
Published	2018-03-15
URL	http://arxiv.org/abs/1803.05576v3
PDF	http://arxiv.org/pdf/1803.05576v3.pdf
PWC	https://paperswithcode.com/paper/facelet-bank-for-fast-portrait-manipulation
Repo
Framework

Scaling Configuration of Energy Harvesting Sensors with Reinforcement Learning


Title	Scaling Configuration of Energy Harvesting Sensors with Reinforcement Learning
Authors	Francesco Fraternali, Bharathan Balaji, Rajesh Gupta
Abstract	With the advent of the Internet of Things (IoT), an increasing number of energy harvesting methods are being used to supplement or supplant battery based sensors. Energy harvesting sensors need to be configured according to the application, hardware, and environmental conditions to maximize their usefulness. As of today, the configuration of sensors is either manual or heuristics based, requiring valuable domain expertise. Reinforcement learning (RL) is a promising approach to automate configuration and efficiently scale IoT deployments, but it is not yet adopted in practice. We propose solutions to bridge this gap: reduce the training phase of RL so that nodes are operational within a short time after deployment and reduce the computational requirements to scale to large deployments. We focus on configuration of the sampling rate of indoor solar panel based energy harvesting sensors. We created a simulator based on 3 months of data collected from 5 sensor nodes subject to different lighting conditions. Our simulation results show that RL can effectively learn energy availability patterns and configure the sampling rate of the sensor nodes to maximize the sensing data while ensuring that energy storage is not depleted. The nodes can be operational within the first day by using our methods. We show that it is possible to reduce the number of RL policies by using a single policy for nodes that share similar lighting conditions.
Tasks
Published	2018-11-27
URL	http://arxiv.org/abs/1811.11259v1
PDF	http://arxiv.org/pdf/1811.11259v1.pdf
PWC	https://paperswithcode.com/paper/scaling-configuration-of-energy-harvesting
Repo
Framework

Deep Learning of Nonnegativity-Constrained Autoencoders for Enhanced Understanding of Data


Title	Deep Learning of Nonnegativity-Constrained Autoencoders for Enhanced Understanding of Data
Authors	Babajide O. Ayinde, Jacek M. Zurada
Abstract	Unsupervised feature extractors are known to perform an efficient and discriminative representation of data. Insight into the mappings they perform and human ability to understand them, however, remain very limited. This is especially prominent when multilayer deep learning architectures are used. This paper demonstrates how to remove these bottlenecks within the architecture of Nonnegativity Constrained Autoencoder (NCSAE). It is shown that by using both L1 and L2 regularization that induce nonnegativity of weights, most of the weights in the network become constrained to be nonnegative, thereby resulting in a more understandable structure with minute deterioration in classification accuracy. Also, this proposed approach extracts features that are more sparse and produces additional output layer sparsification. The method is analyzed for accuracy and feature interpretation on the MNIST data, the NORB normalized uniform object data, and the Reuters text categorization dataset.
Tasks	L2 Regularization, Text Categorization
Published	2018-01-31
URL	http://arxiv.org/abs/1802.00003v3
PDF	http://arxiv.org/pdf/1802.00003v3.pdf
PWC	https://paperswithcode.com/paper/deep-learning-of-nonnegativity-constrained
Repo
Framework

Learned Video Compression


Title	Learned Video Compression
Authors	Oren Rippel, Sanjay Nair, Carissa Lew, Steve Branson, Alexander G. Anderson, Lubomir Bourdev
Abstract	We present a new algorithm for video coding, learned end-to-end for the low-latency mode. In this setting, our approach outperforms all existing video codecs across nearly the entire bitrate range. To our knowledge, this is the first ML-based method to do so. We evaluate our approach on standard video compression test sets of varying resolutions, and benchmark against all mainstream commercial codecs, in the low-latency mode. On standard-definition videos, relative to our algorithm, HEVC/H.265, AVC/H.264 and VP9 typically produce codes up to 60% larger. On high-definition 1080p videos, H.265 and VP9 typically produce codes up to 20% larger, and H.264 up to 35% larger. Furthermore, our approach does not suffer from blocking artifacts and pixelation, and thus produces videos that are more visually pleasing. We propose two main contributions. The first is a novel architecture for video compression, which (1) generalizes motion estimation to perform any learned compensation beyond simple translations, (2) rather than strictly relying on previously transmitted reference frames, maintains a state of arbitrary information learned by the model, and (3) enables jointly compressing all transmitted signals (such as optical flow and residual). Secondly, we present a framework for ML-based spatial rate control: namely, a mechanism for assigning variable bitrates across space for each frame. This is a critical component for video coding, which to our knowledge had not been developed within a machine learning setting.
Tasks	Motion Estimation, Optical Flow Estimation, Video Compression
Published	2018-11-16
URL	http://arxiv.org/abs/1811.06981v1
PDF	http://arxiv.org/pdf/1811.06981v1.pdf
PWC	https://paperswithcode.com/paper/learned-video-compression
Repo
Framework