October 19, 2019

Paper Group ANR 327

Challenges and Characteristics of Intelligent Autonomy for Internet of Battle Things in Highly Adversarial Environments


Title	Challenges and Characteristics of Intelligent Autonomy for Internet of Battle Things in Highly Adversarial Environments
Authors	Alexander Kott
Abstract	Numerous artificially intelligent, networked things will populate the battlefield of the future, operating in close collaboration with human warfighters and fighting as teams in highly adversarial environments. This paper explores the characteristics, capabilities and intelligence required of such a network of intelligent things and humans, the Internet of Battle Things (IOBT). The IOBT will face unique challenges that are not yet well addressed by the current generation of AI and machine learning.
Tasks
Published	2018-03-20
URL	http://arxiv.org/abs/1803.11256v2
PDF	http://arxiv.org/pdf/1803.11256v2.pdf
PWC	https://paperswithcode.com/paper/challenges-and-characteristics-of-intelligent
Repo
Framework

Preference-Guided Planning: An Active Elicitation Approach


Title	Preference-Guided Planning: An Active Elicitation Approach
Authors	Mayukh Das, Phillip Odom, Md. Rakibul Islam, Janardhan Rao Doppa, Dan Roth, Sriraam Natarajan
Abstract	Planning with preferences has been employed extensively to quickly generate high-quality plans. However, it may be difficult for the human expert to supply this information without knowledge of the reasoning employed by the planner and the distribution of planning problems. We consider the problem of actively eliciting preferences from a human expert during the planning process. Specifically, we study this problem in the context of the Hierarchical Task Network (HTN) planning framework as it allows easy interaction with the human. Our experimental results on several diverse planning domains show that the preferences gathered using the proposed approach improve the quality and speed of the planner, while reducing the burden on the human expert.
Tasks
Published	2018-04-19
URL	http://arxiv.org/abs/1804.07404v1
PDF	http://arxiv.org/pdf/1804.07404v1.pdf
PWC	https://paperswithcode.com/paper/preference-guided-planning-an-active
Repo
Framework

A Distribution Similarity Based Regularizer for Learning Bayesian Networks


Title	A Distribution Similarity Based Regularizer for Learning Bayesian Networks
Authors	Weirui Kong, Wenyi Wang
Abstract	Probabilistic graphical models compactly represent joint distributions by decomposing them into factors over subsets of random variables. In Bayesian networks, the factors are conditional probability distributions. For many problems, common information exists among those factors. Adding similarity restrictions can be viewed as imposing prior knowledge for model regularization. With proper restrictions, learned models usually generalize better. In this work, we study methods that exploit such high-level similarities to regularize the learning process and apply them to the task of modeling wave propagation in inhomogeneous media. We propose a novel distribution-based penalization approach that encourages similar conditional probability distributions rather than forcing the parameters to be similar explicitly. We show in experiments that our proposed algorithm solves the wave propagation modeling problem, which other baseline methods are not able to solve.
Tasks
Published	2018-08-20
URL	http://arxiv.org/abs/1808.06347v1
PDF	http://arxiv.org/pdf/1808.06347v1.pdf
PWC	https://paperswithcode.com/paper/a-distribution-similarity-based-regularizer
Repo
Framework
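
The abstract above contrasts two ways of tying factors together: forcing parameters to be close, or encouraging the distributions themselves to be close. The following is a minimal sketch of that difference, not the paper's method. It assumes categorical conditional distributions parameterized by logits and uses a symmetric KL divergence as the distance; both choices, and all function names, are illustrative assumptions.

```python
import math

def softmax(logits):
    """Turn a list of logits into a categorical distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def parameter_penalty(logits_a, logits_b):
    """Squared L2 distance between the raw parameters."""
    return sum((a - b) ** 2 for a, b in zip(logits_a, logits_b))

def kl(p, q):
    """KL(p || q) for categorical distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distribution_penalty(logits_a, logits_b):
    """Symmetric KL between the distributions the parameters induce (assumed distance)."""
    p, q = softmax(logits_a), softmax(logits_b)
    return kl(p, q) + kl(q, p)

# Logits that differ by a constant shift define the same distribution:
# the parameter penalty is large, the distribution penalty vanishes.
a = [1.0, 2.0, 3.0]
b = [11.0, 12.0, 13.0]
param_gap = parameter_penalty(a, b)
dist_gap = distribution_penalty(a, b)
```

In this toy case a parameter-based penalty would push two already identical distributions toward each other, while the distribution-based penalty leaves them alone.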

Hierarchical RNN for Information Extraction from Lawsuit Documents


Title	Hierarchical RNN for Information Extraction from Lawsuit Documents
Authors	Xi Rao, Zhenxing Ke
Abstract	Every lawsuit document contains information about the party’s claim, the court’s analysis, the decision, and more, all of which helps readers understand the case better and predict a judge’s decision on similar cases in the future. However, extracting this information from the document is difficult because the language is complicated and sentence lengths vary widely. We treat this problem as a sequence labeling task, and this paper presents the first research to extract relevant information from civil lawsuit documents in China with a hierarchical RNN framework.
Tasks
Published	2018-04-25
URL	http://arxiv.org/abs/1804.09321v1
PDF	http://arxiv.org/pdf/1804.09321v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-rnn-for-information-extraction
Repo
Framework

Multiple Convolutional Neural Network for Skin Dermoscopic Image Classification


Title	Multiple Convolutional Neural Network for Skin Dermoscopic Image Classification
Authors	Yanhui Guo, Amira S. Ashour
Abstract	Melanoma classification is a critical step in identifying skin disease. It is challenging because of the intra-class variation among melanomas, the low contrast of skin lesions, artifacts in dermoscopy images (including noise, hair, and air bubbles), and the similarity between melanoma and non-melanoma cases. To address these problems, we propose a novel multiple convolutional neural network model (MCNN) to classify seven different disease types in dermoscopic images, where several models were trained separately using an additive sample learning strategy. The MCNN model is trained and tested using the training and validation sets from the International Skin Imaging Collaboration (ISIC 2018), respectively. The performance of the MCNN is evaluated using the receiver operating characteristic (ROC) curve and the area under that curve (AUC).
Tasks	Image Classification
Published	2018-07-21
URL	http://arxiv.org/abs/1807.08114v2
PDF	http://arxiv.org/pdf/1807.08114v2.pdf
PWC	https://paperswithcode.com/paper/multiple-convolutional-neural-network-for
Repo
Framework
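
The MCNN entry above reports performance as the area under the ROC curve. As a rough illustration of what that number measures, the sketch below computes AUC as the fraction of (positive, negative) pairs that a scorer ranks correctly, with ties counted as half. The one-vs-rest helper for a multi-class problem is an assumption for illustration; the abstract does not say how the seven classes were handled.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def one_vs_rest_auc(class_scores, true_classes, target):
    """AUC for one class against all others (hypothetical helper, assumed protocol)."""
    labels = [1 if c == target else 0 for c in true_classes]
    scores = [row[target] for row in class_scores]
    return roc_auc(scores, labels)
```

The pairwise form is quadratic in the number of examples, which is fine for a sketch; sorting-based implementations give the same value faster.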

Brute-Force Facial Landmark Analysis With A 140,000-Way Classifier


Title	Brute-Force Facial Landmark Analysis With A 140,000-Way Classifier
Authors	Mengtian Li, Laszlo Jeni, Deva Ramanan
Abstract	We propose a simple approach to visual alignment, focusing on the illustrative task of facial landmark estimation. While most prior work treats this as a regression problem, we instead formulate it as a discrete $K$-way classification task, where a classifier is trained to return one of $K$ discrete alignments. One crucial benefit of a classifier is the ability to report back a (softmax) distribution over putative alignments. We demonstrate that this distribution is a rich representation that can be marginalized (to generate uncertainty estimates over groups of landmarks) and conditioned on (to incorporate top-down context, provided by temporal constraints in a video stream or an interactive human user). Such capabilities are difficult to integrate into classic regression-based approaches. We study performance as a function of the number of classes $K$, including the extreme “exemplar class” setting where $K$ is equal to the number of training examples (140K in our setting). Perhaps surprisingly, we show that classifiers can still be learned in this setting. When compared to prior work in classification, our $K$ is unprecedentedly large, including many “fine-grained” classes that are very similar. We address these issues by using a multi-label loss function that allows for training examples to be non-uniformly shared across discrete classes. We perform a comprehensive experimental analysis of our method on standard benchmarks, demonstrating state-of-the-art results for facial alignment in videos.
Tasks
Published	2018-02-06
URL	http://arxiv.org/abs/1802.01777v2
PDF	http://arxiv.org/pdf/1802.01777v2.pdf
PWC	https://paperswithcode.com/paper/brute-force-facial-landmark-analysis-with-a
Repo
Framework
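
The facial landmark abstract above argues that a softmax over $K$ discrete alignments can be marginalized (for uncertainty estimates) and conditioned on (for top-down context). The sketch below illustrates both operations on a toy problem. It assumes each class stores a single 1-D landmark coordinate and that context arrives as a boolean mask over classes; these simplifications and the function names are illustrative, not taken from the paper.

```python
import math

def softmax(logits):
    """Turn classifier logits into a distribution over the K alignments."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def landmark_mean_and_variance(probs, coords):
    """Marginalize over classes: coords[k] is the landmark position in alignment k."""
    mean = sum(p * x for p, x in zip(probs, coords))
    var = sum(p * (x - mean) ** 2 for p, x in zip(probs, coords))
    return mean, var

def condition(probs, allowed):
    """Keep only the classes consistent with external context, then renormalize."""
    masked = [p if ok else 0.0 for p, ok in zip(probs, allowed)]
    total = sum(masked)
    if total == 0.0:
        raise ValueError("context rules out every alignment")
    return [m / total for m in masked]

# Toy example: four alignments, an uninformative classifier, and a user hint
# that the landmark lies left of x = 20.
probs = softmax([0.0, 0.0, 0.0, 0.0])
coords = [10.0, 12.0, 30.0, 32.0]
before = landmark_mean_and_variance(probs, coords)
after = landmark_mean_and_variance(condition(probs, [x < 20 for x in coords]), coords)
```

Here the variance drops sharply once the hint removes the two right-hand alignments, which is the kind of update a point-estimate regressor cannot express directly.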

A System for Automated Image Editing from Natural Language Commands


Title	A System for Automated Image Editing from Natural Language Commands
Authors	Jacqueline Brixey, Ramesh Manuvinakurike, Nham Le, Tuan Lai, Walter Chang, Trung Bui
Abstract	This work presents the task of modifying images in an image editing program using written natural language commands. We utilize a corpus of over 6000 image edit text requests to alter real-world images, collected via crowdsourcing. A novel framework composed of actions and entities to map a user’s natural language request to executable commands in an image editing program is described. We resolve previously labeled annotator disagreement through a voting process and complete annotation of the corpus. We experimented with different machine learning models and found that the LSTM, the SVM, and the bidirectional LSTM-CRF joint models are the best to detect image editing actions and associated entities in a given utterance.
Tasks
Published	2018-12-03
URL	http://arxiv.org/abs/1812.01083v1
PDF	http://arxiv.org/pdf/1812.01083v1.pdf
PWC	https://paperswithcode.com/paper/a-system-for-automated-image-editing-from
Repo
Framework

Improving End-of-turn Detection in Spoken Dialogues by Detecting Speaker Intentions as a Secondary Task


Title	Improving End-of-turn Detection in Spoken Dialogues by Detecting Speaker Intentions as a Secondary Task
Authors	Zakaria Aldeneh, Dimitrios Dimitriadis, Emily Mower Provost
Abstract	This work focuses on the use of acoustic cues for modeling turn-taking in dyadic spoken dialogues. Previous work has shown that speaker intentions (e.g., asking a question, uttering a backchannel, etc.) can influence turn-taking behavior and are good predictors of turn-transitions in spoken dialogues. However, speaker intentions are not readily available for use by automated systems at run-time, making it difficult to use this information to anticipate a turn-transition. To this end, we propose a multi-task neural approach for predicting turn-transitions and speaker intentions simultaneously. Our results show that adding the auxiliary task of speaker intention prediction improves the performance of turn-transition prediction in spoken dialogues, without relying on additional input features during run-time.
Tasks
Published	2018-05-09
URL	http://arxiv.org/abs/1805.06511v1
PDF	http://arxiv.org/pdf/1805.06511v1.pdf
PWC	https://paperswithcode.com/paper/improving-end-of-turn-detection-in-spoken
Repo
Framework
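
The end-of-turn abstract above describes training one model on a main task (turn-transition prediction) and an auxiliary task (speaker intention prediction). A common way to set this up is a weighted sum of per-task losses, sketched below. The cross-entropy losses, the `aux_weight` parameter, and the function names are assumptions for illustration; the abstract does not specify the loss.

```python
import math

def cross_entropy(probs, target_index):
    """Negative log-likelihood of the correct class under predicted probabilities."""
    return -math.log(probs[target_index])

def multitask_loss(turn_probs, turn_target, intent_probs, intent_target, aux_weight=0.5):
    """Main-task loss plus a weighted auxiliary loss (aux_weight is a hypothetical knob)."""
    main = cross_entropy(turn_probs, turn_target)
    aux = cross_entropy(intent_probs, intent_target)
    return main + aux_weight * aux
```

Because intention labels enter only through the loss, they are needed at training time but not at run-time, which matches the claim that no additional input features are required during inference.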

A Deep Learning Approach to Denoise Optical Coherence Tomography Images of the Optic Nerve Head


Title	A Deep Learning Approach to Denoise Optical Coherence Tomography Images of the Optic Nerve Head
Authors	Sripad Krishna Devalla, Giridhar Subramanian, Tan Hung Pham, Xiaofei Wang, Shamira Perera, Tin A. Tun, Tin Aung, Leopold Schmetterer, Alexandre H. Thiery, Michael J. A. Girard
Abstract	Purpose: To develop a deep learning approach to de-noise optical coherence tomography (OCT) B-scans of the optic nerve head (ONH). Methods: Volume scans consisting of 97 horizontal B-scans were acquired through the center of the ONH using a commercial OCT device (Spectralis) for both eyes of 20 subjects. For each eye, single-frame (without signal averaging) and multi-frame (75x signal averaging) volume scans were obtained. A custom deep learning network was then designed and trained with 2,328 “clean B-scans” (multi-frame B-scans) and their corresponding “noisy B-scans” (clean B-scans + Gaussian noise) to de-noise the single-frame B-scans. The performance of the de-noising algorithm was assessed qualitatively, and quantitatively on 1,552 B-scans using the signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR), and mean structural similarity index metrics (MSSIM). Results: The proposed algorithm successfully denoised unseen single-frame OCT B-scans. The denoised B-scans were qualitatively similar to their corresponding multi-frame B-scans, with enhanced visibility of the ONH tissues. The mean SNR increased from $4.02 \pm 0.68$ dB (single-frame) to $8.14 \pm 1.03$ dB (denoised). For all the ONH tissues, the mean CNR increased from $3.50 \pm 0.56$ (single-frame) to $7.63 \pm 1.81$ (denoised). The MSSIM increased from $0.13 \pm 0.02$ (single-frame) to $0.65 \pm 0.03$ (denoised) when compared with the corresponding multi-frame B-scans. Conclusions: Our deep learning algorithm can denoise a single-frame OCT B-scan of the ONH in under 20 ms, thus offering a framework to obtain superior quality OCT B-scans with reduced scanning times and minimal patient discomfort.
Tasks
Published	2018-09-27
URL	http://arxiv.org/abs/1809.10589v1
PDF	http://arxiv.org/pdf/1809.10589v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-learning-approach-to-denoise-optical
Repo
Framework
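
The B-scan counts quoted in the OCT abstract above can be checked against each other with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract. Reading the split as whole volumes (24 for training, 16 for evaluation) is an inference from the fact that both counts are exact multiples of 97; the abstract does not state it.

```python
# Figures taken from the abstract.
B_SCANS_PER_VOLUME = 97
EYES_PER_SUBJECT = 2
SUBJECTS = 20
TRAIN_B_SCANS = 2328
TEST_B_SCANS = 1552

# One volume per eye gives the total number of B-scans of each type.
volumes = EYES_PER_SUBJECT * SUBJECTS
total_b_scans = B_SCANS_PER_VOLUME * volumes

# The training and evaluation counts should account for every B-scan.
assert TRAIN_B_SCANS + TEST_B_SCANS == total_b_scans

# Assumption: the split was made by whole volume, since both counts divide evenly.
train_volumes, train_remainder = divmod(TRAIN_B_SCANS, B_SCANS_PER_VOLUME)
test_volumes, test_remainder = divmod(TEST_B_SCANS, B_SCANS_PER_VOLUME)

print(volumes, total_b_scans, train_volumes, test_volumes)
```

Under that reading, the data are divided 60/40 between training and evaluation.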

Tractable and Scalable Schatten Quasi-Norm Approximations for Rank Minimization


Title	Tractable and Scalable Schatten Quasi-Norm Approximations for Rank Minimization
Authors	Fanhua Shang, Yuanyuan Liu, James Cheng
Abstract	The Schatten quasi-norm was introduced to bridge the gap between the trace norm and rank function. However, existing algorithms are too slow or even impractical for large-scale problems. Motivated by the equivalence relation between the trace norm and its bilinear spectral penalty, we define two tractable Schatten norms, i.e., the bi-trace and tri-trace norms, and prove that they are in essence the Schatten-$1/2$ and $1/3$ quasi-norms, respectively. By applying the two defined Schatten quasi-norms to various rank minimization problems such as MC and RPCA, we only need to solve for much smaller factor matrices. We design two efficient linearized alternating minimization algorithms to solve our problems and establish that each bounded sequence generated by our algorithms converges to a critical point. We also provide the restricted strong convexity (RSC) based and MC error bounds for our algorithms. Our experimental results verified both the efficiency and effectiveness of our algorithms compared with the state-of-the-art methods.
Tasks
Published	2018-02-28
URL	http://arxiv.org/abs/1803.00420v1
PDF	http://arxiv.org/pdf/1803.00420v1.pdf
PWC	https://paperswithcode.com/paper/tractable-and-scalable-schatten-quasi-norm
Repo
Framework
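
For readers unfamiliar with the terms in the Schatten quasi-norm abstract above, the two standard definitions it builds on are written out below. These are textbook facts, not reproduced from the paper: $\sigma_i(X)$ denotes the $i$-th singular value of $X \in \mathbb{R}^{m \times n}$, and the paper's own bi-trace and tri-trace definitions are not given in the abstract, so they are not stated here.

```latex
% Schatten-p quasi-norm (a norm for p >= 1, a quasi-norm for 0 < p < 1).
% p = 1 gives the trace norm; as p -> 0 the sum of sigma_i^p approaches rank(X).
\|X\|_{S_p} = \Bigl( \sum_{i=1}^{\min(m,n)} \sigma_i(X)^{p} \Bigr)^{1/p}

% Bilinear (factored) form of the trace norm, the "equivalence relation"
% the abstract refers to:
\|X\|_{*} = \min_{U, V \,:\, X = U V^{\top}} \tfrac{1}{2} \bigl( \|U\|_F^{2} + \|V\|_F^{2} \bigr)
```

The second identity is what makes factored formulations attractive: the penalty is applied to the smaller factors $U$ and $V$ rather than to the full matrix $X$.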

Unsupervised Learning of Interpretable Dialog Models


Title	Unsupervised Learning of Interpretable Dialog Models
Authors	Dhiraj Madan, Dinesh Raghu, Gaurav Pandey, Sachindra Joshi
Abstract	Recently, several deep learning based models have been proposed for end-to-end learning of dialogs. While these models can be trained from data without the need for any additional annotations, it is hard to interpret them. On the other hand, there exist traditional state-based dialog systems, where the states of the dialog are discrete and hence easy to interpret. However, these states need to be handcrafted and annotated in the data. To achieve the best of both worlds, we propose the Latent State Tracking Network (LSTN), with which we learn an interpretable model in an unsupervised manner. The model defines a discrete latent variable at each turn of the conversation which can take a finite set of values. Since these discrete variables are not present in the training data, we use the EM algorithm to train our model in an unsupervised manner. In the experiments, we show that LSTN can help achieve interpretability in dialog models without much decrease in performance compared to end-to-end approaches.
Tasks
Published	2018-11-02
URL	http://arxiv.org/abs/1811.01012v1
PDF	http://arxiv.org/pdf/1811.01012v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-learning-of-interpretable-dialog
Repo
Framework

Goal Inference Improves Objective and Perceived Performance in Human-Robot Collaboration


Title	Goal Inference Improves Objective and Perceived Performance in Human-Robot Collaboration
Authors	Chang Liu, Jessica B. Hamrick, Jaime F. Fisac, Anca D. Dragan, J. Karl Hedrick, S. Shankar Sastry, Thomas L. Griffiths
Abstract	The study of human-robot interaction is fundamental to the design and use of robotics in real-world applications. Robots will need to predict and adapt to the actions of human collaborators in order to achieve good performance and improve safety and end-user adoption. This paper evaluates a human-robot collaboration scheme that combines the task allocation and motion levels of reasoning: the robotic agent uses Bayesian inference to predict the next goal of its human partner from his or her ongoing motion, and re-plans its own actions in real time. This anticipative adaptation is desirable in many practical scenarios, where humans are unable or unwilling to take on the cognitive overhead required to explicitly communicate their intent to the robot. A behavioral experiment indicates that the combination of goal inference and dynamic task planning significantly improves both objective and perceived performance of the human-robot team. Participants were highly sensitive to the differences between robot behaviors, preferring to work with a robot that adapted to their actions over one that did not.
Tasks	Bayesian Inference
Published	2018-02-06
URL	http://arxiv.org/abs/1802.01780v1
PDF	http://arxiv.org/pdf/1802.01780v1.pdf
PWC	https://paperswithcode.com/paper/goal-inference-improves-objective-and
Repo
Framework
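
The human-robot collaboration abstract above says the robot uses Bayesian inference to predict its partner's next goal from ongoing motion. The sketch below shows one way such an update could look in 2-D. The likelihood model (goals aligned with the observed displacement are exponentially more likely, scaled by a hypothetical `beta`) and all names are assumptions for illustration; the abstract does not describe the paper's observation model.

```python
import math

def goal_posterior(position, displacement, goals, prior, beta=5.0):
    """Posterior over candidate goals given one observed motion step.

    Bayes' rule: P(goal | motion) is proportional to P(motion | goal) * P(goal).
    Assumed likelihood: exp(beta * cosine of the angle between the observed
    displacement and the direction from the current position to the goal).
    """
    weights = []
    for goal, p in zip(goals, prior):
        to_goal = (goal[0] - position[0], goal[1] - position[1])
        norm = math.hypot(*to_goal) * math.hypot(*displacement)
        if norm > 0:
            cosine = (to_goal[0] * displacement[0] + to_goal[1] * displacement[1]) / norm
        else:
            cosine = 0.0  # no direction information available
        weights.append(p * math.exp(beta * cosine))
    total = sum(weights)
    return [w / total for w in weights]

# A hand at the origin moving along +x, with three candidate goals.
posterior = goal_posterior(
    position=(0.0, 0.0),
    displacement=(1.0, 0.0),
    goals=[(5.0, 0.0), (0.0, 5.0), (-5.0, 0.0)],
    prior=[1 / 3, 1 / 3, 1 / 3],
)
```

Feeding each posterior back in as the next step's prior gives a running estimate that a planner could use to re-plan in real time.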

SqueezeJet: High-level Synthesis Accelerator Design for Deep Convolutional Neural Networks


Title	SqueezeJet: High-level Synthesis Accelerator Design for Deep Convolutional Neural Networks
Authors	Panagiotis G. Mousouliotis, Loukas P. Petrou
Abstract	Deep convolutional neural networks have dominated the pattern recognition scene by providing much more accurate solutions in computer vision problems such as object recognition and object detection. Most of these solutions come at a huge computational cost, requiring billions of multiply-accumulate operations and, thus, making their use quite challenging in real-time applications that run on embedded mobile (resource-power constrained) hardware. This work presents the architecture, the high-level synthesis design, and the implementation of SqueezeJet, an FPGA accelerator for the inference phase of the SqueezeNet DCNN architecture, which is designed specifically for use in embedded systems. Results show that SqueezeJet can achieve 15.16 times speed-up compared to the software implementation of SqueezeNet running on an embedded mobile processor with less than 1% drop in top-5 accuracy.
Tasks	Object Detection, Object Recognition
Published	2018-05-06
URL	http://arxiv.org/abs/1805.08695v1
PDF	http://arxiv.org/pdf/1805.08695v1.pdf
PWC	https://paperswithcode.com/paper/squeezejet-high-level-synthesis-accelerator
Repo
Framework

Fusing Hierarchical Convolutional Features for Human Body Segmentation and Clothing Fashion Classification


Title	Fusing Hierarchical Convolutional Features for Human Body Segmentation and Clothing Fashion Classification
Authors	Zheng Zhang, Chengfang Song, Qin Zou
Abstract	Clothing fashion reflects the common aesthetics that people share in how they dress. Recognizing the fashion time of a piece of clothing is meaningful for both individuals and the industry. In this paper, under the assumption that clothing fashion changes year by year, the fashion-time recognition problem is mapped into a clothing-fashion classification problem. Specifically, a novel deep neural network is proposed which achieves accurate human body segmentation by fusing multi-scale convolutional features in a fully convolutional network, and then feature learning and fashion classification are performed on the segmented parts, avoiding the influence of the image background. In the experiments, 9,339 fashion images from 8 consecutive years are collected for performance evaluation. The results demonstrate the effectiveness of the proposed body segmentation and fashion classification methods.
Tasks
Published	2018-03-09
URL	http://arxiv.org/abs/1803.03415v2
PDF	http://arxiv.org/pdf/1803.03415v2.pdf
PWC	https://paperswithcode.com/paper/fusing-hierarchical-convolutional-features
Repo
Framework

Stereo Computation for a Single Mixture Image


Title	Stereo Computation for a Single Mixture Image
Authors	Yiran Zhong, Yuchao Dai, Hongdong Li
Abstract	This paper proposes an original problem of stereo computation from a single mixture image, a challenging problem that had not been researched before. The goal is to separate (i.e., unmix) a single mixture image into two constituent image layers, such that the two layers form a left-right stereo image pair, from which a valid disparity map can be recovered. This is a severely ill-posed problem: from one input image, one effectively aims to recover three (i.e., left image, right image and a disparity map). In this work we give a novel deep-learning based solution, by jointly solving the two subtasks of image layer separation as well as stereo matching. Training our deep net is a simple task, as it does not need to have disparity maps. Extensive experiments demonstrate the efficacy of our method.
Tasks	Stereo Matching, Stereo Matching Hand
Published	2018-08-27
URL	http://arxiv.org/abs/1808.08690v1
PDF	http://arxiv.org/pdf/1808.08690v1.pdf
PWC	https://paperswithcode.com/paper/stereo-computation-for-a-single-mixture-image
Repo
Framework