October 18, 2019

3156 words 15 mins read

Paper Group ANR 420

Deep Trustworthy Knowledge Tracing. How do Convolutional Neural Networks Learn Design? Stochastic (Approximate) Proximal Point Methods: Convergence, Optimality, and Adaptivity. Data Augmentation of Railway Images for Track Inspection. Deep Smoke Segmentation. Unsupervised Post-processing of Word Vectors via Conceptor Negation. Solving Fourier ptyc …

Deep Trustworthy Knowledge Tracing

Title Deep Trustworthy Knowledge Tracing
Authors Heonseok Ha, Uiwon Hwang, Yongjun Hong, Jahee Jang, Sungroh Yoon
Abstract Knowledge tracing (KT), a key component of an intelligent tutoring system, is a machine learning technique that estimates the mastery level of a student based on his/her past performance. The objective of KT is to predict a student’s response to the next question. Compared with traditional KT models, deep learning-based KT (DLKT) models show better predictive performance because of the representation power of deep neural networks. Various methods have been proposed to improve the performance of DLKT, but few studies have been conducted on the reliability of DLKT. In this work, we claim that existing DLKT models are not reliable in real education environments. To substantiate this claim, we show the limitations of DLKT from various perspectives, such as knowledge state update failure, catastrophic forgetting, and non-interpretability. We then propose a novel regularization to address these problems. The proposed method allows us to achieve trustworthy DLKT. In addition, the proposed model, which is trained on scenarios with forgetting, can also be easily extended to scenarios without forgetting.
Tasks Knowledge Tracing
Published 2018-05-28
URL https://arxiv.org/abs/1805.10768v3
PDF https://arxiv.org/pdf/1805.10768v3.pdf
PWC https://paperswithcode.com/paper/memory-augmented-neural-networks-for-1
Repo
Framework
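
As context for the entry above, here is a minimal PyTorch sketch of a DKT-style knowledge tracer. The abstract does not spell out the proposed regularization, so the penalty on consecutive knowledge-state changes below is an assumed stand-in for illustration, not the paper's method.

```python
# Minimal DKT-style knowledge tracer (sketch, not the paper's exact model).
# Input at step t: one-hot of (skill, correctness); output: P(correct) per skill.
import torch
import torch.nn as nn

class DKT(nn.Module):
    def __init__(self, n_skills, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(2 * n_skills, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_skills)

    def forward(self, x):                      # x: (batch, T, 2*n_skills)
        h, _ = self.lstm(x)
        return torch.sigmoid(self.out(h))      # knowledge state per step

def loss_fn(pred, next_skill, next_correct, lam=0.1):
    # next_skill: (batch, T) long; next_correct: (batch, T) float in {0, 1}.
    # Gather the prediction for the skill actually attempted at each step.
    p = pred.gather(2, next_skill.unsqueeze(-1)).squeeze(-1)
    bce = nn.functional.binary_cross_entropy(p, next_correct)
    # Assumed stability term: discourage abrupt jumps of the whole
    # knowledge state between consecutive steps.
    stability = (pred[:, 1:] - pred[:, :-1]).pow(2).mean()
    return bce + lam * stability
```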

How do Convolutional Neural Networks Learn Design?

Title How do Convolutional Neural Networks Learn Design?
Authors Shailza Jolly, Brian Kenji Iwana, Ryohei Kuroki, Seiichi Uchida
Abstract In this paper, we aim to understand the design principles of book cover images, which are carefully crafted by experts. Book covers are designed in unique, genre-specific ways that convey important information to their readers. By using Convolutional Neural Networks (CNN) to predict book genres from cover images, visual cues which distinguish genres can be highlighted and analyzed. In order to understand these visual cues contributing towards the decision of a genre, we present the application of Layer-wise Relevance Propagation (LRP) on the book cover image classification results. We use LRP to explain the pixel-wise contributions of book cover design and highlight the design elements contributing towards particular genres. In addition, with the use of state-of-the-art object and text detection methods, insights about genre-specific book cover designs are discovered.
Tasks Image Classification
Published 2018-08-25
URL http://arxiv.org/abs/1808.08402v1
PDF http://arxiv.org/pdf/1808.08402v1.pdf
PWC https://paperswithcode.com/paper/how-do-convolutional-neural-networks-learn
Repo
Framework
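
Layer-wise Relevance Propagation redistributes the classifier's output relevance backwards, layer by layer, in proportion to each input's contribution. Here is a minimal NumPy sketch of the epsilon rule for a single dense layer; chaining this rule from the predicted genre logit back through every layer yields the pixel-wise relevance maps the paper analyzes.

```python
# LRP-epsilon relevance redistribution through one dense layer (sketch).
# Each input receives relevance in proportion to its contribution
# z_ij = a_i * w_ij to the layer's pre-activations.
import numpy as np

def lrp_dense(a, W, b, R_out, eps=1e-6):
    """a: (d_in,) activations, W: (d_in, d_out), b: (d_out,), R_out: (d_out,)."""
    z = a[:, None] * W                      # contributions z_ij
    zj = z.sum(axis=0) + b                  # pre-activations
    zj = zj + eps * np.sign(zj)             # epsilon stabiliser
    return (z * (R_out / zj)[None, :]).sum(axis=1)   # R_in: (d_in,)
```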

Stochastic (Approximate) Proximal Point Methods: Convergence, Optimality, and Adaptivity

Title Stochastic (Approximate) Proximal Point Methods: Convergence, Optimality, and Adaptivity
Authors Hilal Asi, John C. Duchi
Abstract We develop model-based methods for solving stochastic convex optimization problems, introducing the approximate-proximal point, or aProx, family, which includes stochastic subgradient, proximal point, and bundle methods. When the modeling approaches we propose are appropriately accurate, the methods enjoy stronger convergence and robustness guarantees than classical approaches, even though the model-based methods typically add little to no computational overhead over stochastic subgradient methods. For example, we show that improved models converge with probability 1 and enjoy optimal asymptotic normality results under weak assumptions; these methods are also adaptive to a natural class of what we term easy optimization problems, achieving linear convergence under appropriate strong growth conditions on the objective. Our substantial experimental investigation shows the advantages of more accurate modeling over standard subgradient methods across many smooth and non-smooth optimization problems.
Tasks
Published 2018-10-12
URL https://arxiv.org/abs/1810.05633v2
PDF https://arxiv.org/pdf/1810.05633v2.pdf
PWC https://paperswithcode.com/paper/stochastic-approximate-proximal-point-methods
Repo
Framework
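
For intuition, the truncated model in the aProx family replaces the linear model underlying SGD with max(f(x_k) + <g_k, y - x_k>, 0), valid for nonnegative losses; its proximal step then has a closed form. The sketch below implements that single member of the family under the stated nonnegativity assumption.

```python
# Truncated-model aProx step for a nonnegative loss f (sketch).
# The proximal step on max(f(x) + <g, y - x>, 0) caps the step size so the
# model value never crosses zero; with the raw linear model it is plain SGD.
import numpy as np

def aprox_truncated_step(x, f_val, grad, alpha):
    gnorm2 = np.dot(grad, grad)
    if gnorm2 == 0.0:
        return x
    step = min(alpha, f_val / gnorm2)   # truncation guards against overshoot
    return x - step * grad

# Example: least absolute deviation on one sample (f >= 0 by construction).
rng = np.random.default_rng(0)
a, b = rng.normal(size=5), 1.0
x = np.zeros(5)
for k in range(1, 200):
    r = np.dot(a, x) - b
    f_val, grad = abs(r), np.sign(r) * a
    x = aprox_truncated_step(x, f_val, grad, alpha=1.0 / np.sqrt(k))
```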

Data Augmentation of Railway Images for Track Inspection

Title Data Augmentation of Railway Images for Track Inspection
Authors S Ritika, Dattaraj Rao
Abstract Regular maintenance of all assets is pivotal for the proper functioning of a railway. Manual maintenance can be very cumbersome and leaves room for error. Track anomalies like vegetation overgrowth and sun kinks affect the track structure and result in unequal load transfer and imbalanced lateral forces on the track, which cause further deterioration and can ultimately result in derailment of the locomotive. Hence there is a need to continuously monitor rail track health. Track anomalies are rare, with skew as high as one anomaly in millions of good images. We propose a method to build training data that makes our algorithms more robust and helps us detect real-world track issues. The data augmentation has a direct effect on detecting anomalies better and hence reduces the time railroads spend on manual inspection. This paper describes a real-world use case of detecting railway track defects from a camera mounted on a moving locomotive and tracking their locations. The camera is engineered to withstand the environmental factors on a moving train and provide a consistent, steady image at around 30 frames per second. An image simulation pipeline of track detection, region-of-interest selection, and anomaly augmentation is implemented. Training images are simulated for sun kinks and vegetation overgrowth. An Inception V3 model pretrained on the ImageNet dataset is fine-tuned for two-class classification. For the case of vegetation overgrowth, the model generalizes well to actual vegetation images, even though it was trained and validated solely on simulated images, which may have a different distribution than actual vegetation. The sun-kink classifier can classify professionally simulated sun-kink videos with a precision of 97.5%.
Tasks Data Augmentation
Published 2018-02-05
URL http://arxiv.org/abs/1802.01286v1
PDF http://arxiv.org/pdf/1802.01286v1.pdf
PWC https://paperswithcode.com/paper/data-augmentation-of-railway-images-for-track
Repo
Framework
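
A minimal sketch of the overlay step in such a simulation pipeline: blend an anomaly texture (e.g., vegetation) into a detected track region and label the result as anomalous. The paper's track-detection and ROI-selection stages are not reproduced here; the alpha blending below is an assumed, simplified version.

```python
# Simulated vegetation-overgrowth augmentation (sketch).
import numpy as np

def overlay_patch(frame, patch, mask, top_left):
    """Alpha-blend a texture patch (e.g. vegetation) into a track ROI."""
    out = frame.copy()
    y, x = top_left
    h, w = patch.shape[:2]
    roi = out[y:y + h, x:x + w].astype(float)
    m = mask[..., None].astype(float)          # (h, w, 1) in [0, 1]
    out[y:y + h, x:x + w] = (m * patch + (1 - m) * roi).astype(frame.dtype)
    return out

# Hypothetical usage: paste a vegetation texture at an offset inside a
# previously detected track region, then label the result "anomaly".
frame = np.zeros((480, 640, 3), dtype=np.uint8)
patch = np.full((64, 64, 3), 90, dtype=np.uint8)
mask = np.ones((64, 64))
augmented = overlay_patch(frame, patch, mask, (200, 300))
```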

Deep Smoke Segmentation

Title Deep Smoke Segmentation
Authors Feiniu Yuan, Lin Zhang, Xue Xia, Boyang Wan, Qinghua Huang, Xuelong Li
Abstract Inspired by the recent success of fully convolutional networks (FCN) in semantic segmentation, we propose a deep smoke segmentation network to infer high quality segmentation masks from blurry smoke images. To overcome large variations in texture, color and shape of smoke appearance, we divide the proposed network into a coarse path and a fine path. The first path is an encoder-decoder FCN with skip structures, which extracts global context information of smoke and accordingly generates a coarse segmentation mask. To retain fine spatial details of smoke, the second path is also designed as an encoder-decoder FCN with skip structures, but it is shallower than the first path network. Finally, we propose a very small network containing only add, convolution and activation layers to fuse the results of the two paths. Thus, we can easily train the proposed network end to end for simultaneous optimization of network parameters. To avoid the difficulty in manually labelling fuzzy smoke objects, we propose a method to generate synthetic smoke images. According to results of our deep segmentation method, we can easily and accurately perform smoke detection from videos. Experiments on three synthetic smoke datasets and a realistic smoke dataset show that our method achieves much better performance than state-of-the-art segmentation algorithms based on FCNs. Test results of our method on videos are also appealing.
Tasks Semantic Segmentation
Published 2018-09-04
URL http://arxiv.org/abs/1809.00774v1
PDF http://arxiv.org/pdf/1809.00774v1.pdf
PWC https://paperswithcode.com/paper/deep-smoke-segmentation
Repo
Framework
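
A toy PyTorch rendering of the two-path idea described above: a deeper encoder-decoder for global context, a shallower one for spatial detail, and a fusion head built only from add, convolution and activation layers. Skip structures are omitted and all layer sizes are assumptions, so this is a sketch of the architecture's shape, not the paper's configuration.

```python
# Two-path smoke segmentation sketch (skip connections omitted).
import torch
import torch.nn as nn

def enc_dec(depth, ch=32):
    """A toy encoder-decoder FCN; the fine path uses a smaller depth."""
    enc = [nn.Conv2d(3 if i == 0 else ch, ch, 3, stride=2, padding=1)
           for i in range(depth)]
    dec = [nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1)
           for _ in range(depth)]
    layers = []
    for m in enc + dec:
        layers += [m, nn.ReLU()]
    return nn.Sequential(*layers, nn.Conv2d(ch, 1, 1))

class TwoPathSmokeSeg(nn.Module):
    def __init__(self):
        super().__init__()
        self.coarse = enc_dec(depth=4)   # deeper: global context
        self.fine = enc_dec(depth=2)     # shallower: spatial detail
        # Fusion: add the two masks, then convolution + activation only.
        self.fuse = nn.Sequential(nn.Conv2d(1, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, x):
        return self.fuse(self.coarse(x) + self.fine(x))

mask = TwoPathSmokeSeg()(torch.randn(1, 3, 64, 64))   # (1, 1, 64, 64)
```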

Unsupervised Post-processing of Word Vectors via Conceptor Negation

Title Unsupervised Post-processing of Word Vectors via Conceptor Negation
Authors Tianlin Liu, Lyle Ungar, João Sedoc
Abstract Word vectors are at the core of many natural language processing tasks. Recently, there has been interest in post-processing word vectors to enrich their semantic information. In this paper, we introduce a novel word vector post-processing technique based on matrix conceptors (Jaeger2014), a family of regularized identity maps. More concretely, we propose to use conceptors to suppress those latent features of word vectors having high variances. The proposed method is purely unsupervised: it does not rely on any corpus or external linguistic database. We evaluate the post-processed word vectors on a battery of intrinsic lexical evaluation tasks, showing that the proposed method consistently outperforms existing state-of-the-art alternatives. We also show that post-processed word vectors can be used for the downstream natural language processing task of dialogue state tracking, yielding improved results in different dialogue domains.
Tasks Dialogue State Tracking
Published 2018-11-17
URL http://arxiv.org/abs/1811.11001v2
PDF http://arxiv.org/pdf/1811.11001v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-post-processing-of-word-vectors
Repo
Framework
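
The conceptor-negation step is compact enough to sketch directly from the abstract's description: compute the correlation matrix R of the word vectors, form the conceptor C = R(R + alpha^-2 I)^-1, and apply its negation I - C to damp the high-variance latent directions. The aperture alpha below is an assumed value.

```python
# Conceptor-negation post-processing of word vectors (sketch).
import numpy as np

def conceptor_negate(X, alpha=2.0):
    """X: (n_words, dim). Suppress high-variance latent directions."""
    R = X.T @ X / X.shape[0]                                 # correlation matrix
    d = X.shape[1]
    C = R @ np.linalg.inv(R + alpha ** (-2) * np.eye(d))     # conceptor
    not_C = np.eye(d) - C                                    # negated conceptor
    return X @ not_C.T                                       # damp dominant directions

vectors = np.random.default_rng(0).normal(size=(1000, 50))
post = conceptor_negate(vectors)
```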

Solving Fourier ptychographic imaging problems via neural network modeling and TensorFlow

Title Solving Fourier ptychographic imaging problems via neural network modeling and TensorFlow
Authors Shaowei Jiang, Kaikai Guo, Jun Liao, Guoan Zheng
Abstract Fourier ptychography is a recently developed imaging approach for large field-of-view and high-resolution microscopy. Here we model the Fourier ptychographic forward imaging process using a convolutional neural network (CNN) and recover the complex object information in the network training process. In this approach, the input of the network is the point spread function in the spatial domain or the coherent transfer function in the Fourier domain. The object is treated as 2D learnable weights of a convolution or a multiplication layer. The output of the network is modeled as the loss function we aim to minimize. The batch size of the network corresponds to the number of captured low-resolution images in one forward/backward pass. We use a popular open-source machine learning library, TensorFlow, for setting up the network and conducting the optimization process. We analyze the performance of different learning rates, different solvers, and different batch sizes. It is shown that a large batch size with the Adam optimizer achieves the best performance in general. To accelerate the phase retrieval process, we also discuss a strategy to implement Fourier-magnitude projection using a multiplication neural network model. Since convolution and multiplication are the two most common operations in imaging modeling, the reported approach may provide a new perspective to examine many coherent and incoherent systems. As a demonstration, we discuss the extensions of the reported networks for modeling single-pixel imaging and structured illumination microscopy (SIM). Four-frame resolution doubling is demonstrated using a neural network for SIM. We have made our implementation code open-source for the broad research community.
Tasks
Published 2018-03-09
URL http://arxiv.org/abs/1803.03434v1
PDF http://arxiv.org/pdf/1803.03434v1.pdf
PWC https://paperswithcode.com/paper/solving-fourier-ptychographic-imaging
Repo
Framework
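
A simplified TensorFlow 2.x sketch of the "object as learnable weights" idea: the complex object is a pair of real variables, the forward model multiplies its spectrum by a coherent transfer function per measurement, and Adam minimizes the magnitude mismatch. The toy CTFs and measurements below are random stand-ins, and details such as pupil positions and sub-sampling are omitted.

```python
# Fourier ptychography as network training (toy sketch, not the paper's code).
import numpy as np
import tensorflow as tf

n, n_img = 64, 8
measured = tf.constant(np.random.rand(n_img, n, n).astype("float32"))
ctf = tf.constant((np.random.rand(n_img, n, n) > 0.5).astype("complex64"))

obj_re = tf.Variable(tf.ones([n, n]))     # learnable object, real part
obj_im = tf.Variable(tf.zeros([n, n]))    # learnable object, imaginary part
opt = tf.keras.optimizers.Adam(0.05)

for step in range(100):
    with tf.GradientTape() as tape:
        obj = tf.complex(obj_re, obj_im)
        # Forward model: filter the object spectrum by each CTF, go back to
        # the spatial domain, and compare magnitudes with the measurements.
        lowres = tf.signal.ifft2d(ctf * tf.signal.fft2d(obj)[None, ...])
        loss = tf.reduce_mean((tf.abs(lowres) - measured) ** 2)
    grads = tape.gradient(loss, [obj_re, obj_im])
    opt.apply_gradients(zip(grads, [obj_re, obj_im]))
```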

Extracting Linguistic Resources from the Web for Concept-to-Text Generation

Title Extracting Linguistic Resources from the Web for Concept-to-Text Generation
Authors Gerasimos Lampouras, Ion Androutsopoulos
Abstract Many concept-to-text generation systems require domain-specific linguistic resources to produce high quality texts, but manually constructing these resources can be tedious and costly. Focusing on NaturalOWL, a publicly available state of the art natural language generator for OWL ontologies, we propose methods to extract from the Web sentence plans and natural language names, two of the most important types of domain-specific linguistic resources used by the generator. Experiments show that texts generated using linguistic resources extracted by our methods in a semi-automatic manner, with minimal human involvement, are perceived as being almost as good as texts generated using manually authored linguistic resources, and much better than texts produced by using linguistic resources extracted from the relation and entity identifiers of the ontology.
Tasks Concept-To-Text Generation, Text Generation
Published 2018-10-31
URL http://arxiv.org/abs/1810.13414v1
PDF http://arxiv.org/pdf/1810.13414v1.pdf
PWC https://paperswithcode.com/paper/extracting-linguistic-resources-from-the-web
Repo
Framework

Toward domain-invariant speech recognition via large scale training

Title Toward domain-invariant speech recognition via large scale training
Authors Arun Narayanan, Ananya Misra, Khe Chai Sim, Golan Pundak, Anshuman Tripathi, Mohamed Elfeky, Parisa Haghani, Trevor Strohman, Michiel Bacchiani
Abstract Current state-of-the-art automatic speech recognition systems are trained to work in specific “domains”, defined based on factors like application, sampling rate and codec. When such recognizers are used in conditions that do not match the training domain, performance significantly drops. This work explores the idea of building a single domain-invariant model for varied use-cases by combining large scale training data from multiple application domains. Our final system is trained using 162,000 hours of speech. Additionally, each utterance is artificially distorted during training to simulate effects like background noise, codec distortion, and sampling rates. Our results show that, even at such a scale, a model thus trained works almost as well as those fine-tuned to specific subsets: A single model can be robust to multiple application domains, and variations like codecs and noise. More importantly, such models generalize better to unseen conditions and allow for rapid adaptation – we show that by using as little as 10 hours of data from a new domain, an adapted domain-invariant model can match performance of a domain-specific model trained from scratch using 70 times as much data. We also highlight some of the limitations of such models and areas that need addressing in future work.
Tasks Speech Recognition
Published 2018-08-16
URL http://arxiv.org/abs/1808.05312v1
PDF http://arxiv.org/pdf/1808.05312v1.pdf
PWC https://paperswithcode.com/paper/toward-domain-invariant-speech-recognition
Repo
Framework
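
One ingredient of such multi-condition training is easy to sketch: mixing noise into each utterance at a randomly drawn signal-to-noise ratio. Codec distortion and resampling, which the paper also simulates, are omitted here.

```python
# Multi-condition style distortion of a training utterance (sketch).
import numpy as np

def add_noise_at_snr(speech, noise, snr_db):
    """Mix noise into speech at a target signal-to-noise ratio (dB)."""
    noise = np.resize(noise, speech.shape)        # tile/trim noise to length
    p_s = np.mean(speech ** 2)
    p_n = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(p_s / (p_n * 10 ** (snr_db / 10)))
    return speech + scale * noise

rng = np.random.default_rng(0)
utt = rng.normal(size=16000)                      # 1 s of 16 kHz "speech"
noisy = add_noise_at_snr(utt, rng.normal(size=8000),
                         snr_db=rng.uniform(5, 25))
```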

Strategy of the Negative Sampling for Training Retrieval-Based Dialogue Systems

Title Strategy of the Negative Sampling for Training Retrieval-Based Dialogue Systems
Authors Aigul Nugmanova, Andrei Smirnov, Galina Lavrentyeva, Irina Chernykh
Abstract The article describes a new approach for quality improvement of automated dialogue systems for customer support service. The analysis produced in the paper demonstrates the dependency of retrieval-based dialogue system quality on the choice of negative responses. The proposed approach implies choosing negative samples according to the distribution of responses in the training set. In this implementation, the negative samples are randomly chosen from the original response distribution and from an “artificial” distribution of negative responses, such as a uniform distribution or a distribution obtained by transformation of the original one. The results obtained for the implemented systems and reported in this paper confirm a significant improvement in automated dialogue system quality when negative responses are drawn from the transformed distribution.
Tasks
Published 2018-11-24
URL http://arxiv.org/abs/1811.09785v1
PDF http://arxiv.org/pdf/1811.09785v1.pdf
PWC https://paperswithcode.com/paper/strategy-of-the-negative-sampling-for
Repo
Framework
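
A sketch of drawing negatives from a transformed response distribution. The frequency-exponent transform below, with beta interpolating between the original distribution (beta=1) and uniform (beta=0), is an assumed example of such a transformation, not necessarily the one the paper uses.

```python
# Negative-response sampling from a transformed distribution (sketch).
import numpy as np

def sample_negatives(response_counts, n, beta=0.75, rng=None):
    """beta=1: original frequency distribution; beta=0: uniform."""
    rng = rng or np.random.default_rng()
    p = response_counts.astype(float) ** beta
    p /= p.sum()
    return rng.choice(len(response_counts), size=n, replace=True, p=p)

counts = np.array([500, 120, 60, 10, 5])   # response frequencies in the train set
negs = sample_negatives(counts, n=8, beta=0.75)
```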

Fast Approximate Geodesics for Deep Generative Models

Title Fast Approximate Geodesics for Deep Generative Models
Authors Nutan Chen, Francesco Ferroni, Alexej Klushyn, Alexandros Paraschos, Justin Bayer, Patrick van der Smagt
Abstract The length of the geodesic between two data points along a Riemannian manifold, induced by a deep generative model, yields a principled measure of similarity. Current approaches are limited to low-dimensional latent spaces, due to the computational complexity of solving a non-convex optimisation problem. We propose finding shortest paths in a finite graph of samples from the aggregate approximate posterior, which can be solved exactly, at greatly reduced runtime, and without a notable loss in quality. Our approach is therefore applicable to high-dimensional problems, e.g., in the visual domain. We validate our approach empirically on a series of experiments using variational autoencoders applied to image data, including the Chair, FashionMNIST, and human movement data sets.
Tasks
Published 2018-12-19
URL https://arxiv.org/abs/1812.08284v2
PDF https://arxiv.org/pdf/1812.08284v2.pdf
PWC https://paperswithcode.com/paper/fast-approximate-geodesics-for-deep
Repo
Framework
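
A minimal sketch of the graph construction: sample latent codes, decode them, connect k nearest neighbours, and run Dijkstra. Using Euclidean distance between decoded points as the edge length is a simple stand-in for the Riemannian curve length the paper induces from the decoder.

```python
# Shortest-path approximation of latent geodesics (sketch).
import numpy as np
from scipy.sparse.csgraph import dijkstra
from sklearn.neighbors import kneighbors_graph

def latent_geodesic_dist(z, decode, k=10):
    """z: (n, d) latent samples; decode: maps (n, d) -> (n, D) observations."""
    x = decode(z)
    graph = kneighbors_graph(x, k, mode="distance")  # observation-space edges
    return dijkstra(graph, directed=False)           # all-pairs path lengths

rng = np.random.default_rng(0)
z = rng.normal(size=(200, 2))
dist = latent_geodesic_dist(z, decode=lambda v: np.tanh(v @ rng.normal(size=(2, 8))))
```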

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

Title Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network
Authors Liwen Zheng, Canmiao Fu, Yong Zhao
Abstract Single Shot MultiBox Detector (SSD) is one of the fastest algorithms in the current object detection field, which uses a fully convolutional neural network to detect all scaled objects in an image. Deconvolutional Single Shot Detector (DSSD) is an approach which introduces more context information by adding a deconvolution module to SSD. The mean Average Precision (mAP) of DSSD on PASCAL VOC2007 is improved from SSD’s 77.5% to 78.6%. Although DSSD obtains a higher mAP than SSD by 1.1%, the frames per second (FPS) decreases from 46 to 11.8. In this paper, we propose a single-stage end-to-end image detection model called ESSD to overcome this dilemma. Our solution to this problem is to cleverly extend better context information for the shallow layers of the best single-stage detectors (e.g., SSD). Experimental results show that our model can reach 79.4% mAP, which is higher than DSSD and SSD by 0.8 and 1.9 points respectively. Meanwhile, our testing speed is 25 FPS on a Titan X GPU, more than double that of the original DSSD.
Tasks Object Detection
Published 2018-01-18
URL http://arxiv.org/abs/1801.05918v1
PDF http://arxiv.org/pdf/1801.05918v1.pdf
PWC https://paperswithcode.com/paper/extend-the-shallow-part-of-single-shot
Repo
Framework
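
A sketch of the general mechanism behind such extension modules: upsample a deeper feature map and fuse it into a shallow SSD feature map before prediction. The channel counts and 38x38/19x19 map sizes below follow the usual SSD300 layout but are assumptions, not ESSD's exact design.

```python
# Deep-to-shallow context fusion for a detector feature map (sketch).
import torch
import torch.nn as nn

class ContextExtension(nn.Module):
    """Upsample a deeper map and fuse it into a shallow SSD feature map."""
    def __init__(self, shallow_ch=512, deep_ch=1024):
        super().__init__()
        self.up = nn.Sequential(
            nn.ConvTranspose2d(deep_ch, shallow_ch, 4, stride=2, padding=1),
            nn.BatchNorm2d(shallow_ch), nn.ReLU(inplace=True))
        self.refine = nn.Sequential(
            nn.Conv2d(shallow_ch, shallow_ch, 3, padding=1),
            nn.BatchNorm2d(shallow_ch), nn.ReLU(inplace=True))

    def forward(self, shallow, deep):
        return self.refine(shallow + self.up(deep))

m = ContextExtension()
fused = m(torch.randn(1, 512, 38, 38), torch.randn(1, 1024, 19, 19))
```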

DeepMove: Learning Place Representations through Large Scale Movement Data

Title DeepMove: Learning Place Representations through Large Scale Movement Data
Authors Yang Zhou, Yan Huang
Abstract Understanding and reasoning about places and their relationships are critical for many applications. Places are traditionally curated by a small group of people as place gazetteers and are represented by an ID with spatial extent, category, and other descriptions. However, a place’s context is described to a large extent by movements made from/to other places. Places are linked and related to each other by these movements. This important context is missing from the traditional representation. We present DeepMove, a novel approach for learning latent representations of places. DeepMove advances current deep learning based place representations by directly modeling movements between places. We demonstrate DeepMove’s latent representations on place categorization and clustering tasks on large place and movement datasets with respect to important parameters. Our results show that DeepMove outperforms state-of-the-art baselines. DeepMove’s representations achieve up to 15% higher place-category matching rates than competing methods and up to 39% higher silhouette coefficient values for place clusters. DeepMove is spatial- and temporal-context aware. It is scalable. It outperforms competing models using a much smaller training dataset (a month, or 1/12 of the data). These qualities make it suitable for a broad class of real-world applications.
Tasks
Published 2018-07-11
URL http://arxiv.org/abs/1807.04241v2
PDF http://arxiv.org/pdf/1807.04241v2.pdf
PWC https://paperswithcode.com/paper/deepmove-learning-place-representations
Repo
Framework
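
Treating each trajectory of visited place IDs as a "sentence" and training a skip-gram model over it is one straightforward reading of the abstract; the gensim sketch below illustrates that reading and is not the paper's own model.

```python
# Skip-gram place embeddings from movement sequences (sketch, gensim 4.x API).
from gensim.models import Word2Vec

trajectories = [
    ["home", "cafe", "office", "gym", "home"],
    ["home", "school", "park", "home"],
    ["office", "cafe", "office", "home"],
]
# sg=1 selects skip-gram; places co-visited in trips get similar vectors.
model = Word2Vec(trajectories, vector_size=16, window=2, min_count=1, sg=1)
print(model.wv.most_similar("office", topn=2))
```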

AMORE-UPF at SemEval-2018 Task 4: BiLSTM with Entity Library

Title AMORE-UPF at SemEval-2018 Task 4: BiLSTM with Entity Library
Authors Laura Aina, Carina Silberer, Ionut-Teodor Sorodoc, Matthijs Westera, Gemma Boleda
Abstract This paper describes our winning contribution to SemEval 2018 Task 4: Character Identification on Multiparty Dialogues. It is a simple, standard model with one key innovation, an entity library. Our results show that this innovation greatly facilitates the identification of infrequent characters. Because of the generic nature of our model, this finding is potentially relevant to any task that requires effective learning from sparse or unbalanced data.
Tasks
Published 2018-05-14
URL http://arxiv.org/abs/1805.05370v1
PDF http://arxiv.org/pdf/1805.05370v1.pdf
PWC https://paperswithcode.com/paper/amore-upf-at-semeval-2018-task-4-bilstm-with
Repo
Framework
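
A sketch of the entity-library idea: score each BiLSTM hidden state against a learned embedding per entity, so infrequent characters still get a dedicated trainable vector. All dimensions below are assumptions.

```python
# BiLSTM token tagger with an "entity library" output layer (sketch).
import torch
import torch.nn as nn

class EntityLibraryTagger(nn.Module):
    def __init__(self, vocab, n_entities, emb=64, hid=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hid, emb)
        self.library = nn.Embedding(n_entities, emb)   # one vector per entity

    def forward(self, tokens):                  # tokens: (batch, T)
        h, _ = self.lstm(self.embed(tokens))
        q = self.proj(h)                        # (batch, T, emb)
        return q @ self.library.weight.T        # entity scores per token

model = EntityLibraryTagger(vocab=5000, n_entities=80)
scores = model(torch.randint(0, 5000, (2, 12)))   # (2, 12, 80)
```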

Efficiency, Sequenceability and Deal-Optimality in Fair Division of Indivisible Goods

Title Efficiency, Sequenceability and Deal-Optimality in Fair Division of Indivisible Goods
Authors Aurélie Beynier, Sylvain Bouveret, Michel Lemaître, Nicolas Maudet, Simon Rey
Abstract In fair division of indivisible goods, using sequences of sincere choices (or picking sequences) is a natural way to allocate the objects. The idea is as follows: at each stage, a designated agent picks one object among those that remain. Another intuitive way to obtain an allocation is to give objects to agents in the first place, and to let agents exchange them as long as such “deals” are beneficial. This paper investigates these notions, when agents have additive preferences over objects, and unveils surprising connections between them, and with other efficiency and fairness notions. In particular, we show that an allocation is sequenceable iff it is optimal for a certain type of deals, namely cycle deals involving a single object. Furthermore, any Pareto-optimal allocation is sequenceable, but not the converse. Regarding fairness, we show that an allocation can be envy-free and non-sequenceable, but that every competitive equilibrium with equal incomes is sequenceable. To complete the picture, we show how some domain restrictions may affect the relations between these notions. Finally, we experimentally explore the links between the scales of efficiency and fairness.
Tasks
Published 2018-07-28
URL http://arxiv.org/abs/1807.11919v1
PDF http://arxiv.org/pdf/1807.11919v1.pdf
PWC https://paperswithcode.com/paper/efficiency-sequenceability-and-deal
Repo
Framework
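
The sequenceability notion is concrete enough for a small sketch: an allocation is sequenceable iff some picking sequence of sincere choices reproduces it, meaning that at every step the designated agent's most-preferred remaining object lies in their own bundle. A backtracking search decides this for small instances (strict preferences assumed).

```python
# Backtracking test for sequenceability of an allocation (sketch).
def is_sequenceable(utils, alloc):
    """utils[i][o]: agent i's additive utility for object o; alloc[i]: set of objects."""
    remaining = set().union(*alloc)

    def search():
        if not remaining:
            return True
        for i, bundle in enumerate(alloc):
            top = max(remaining, key=lambda o: utils[i][o])
            if top not in bundle:
                continue                 # agent i cannot pick sincerely here
            remaining.remove(top)
            if search():
                return True
            remaining.add(top)           # backtrack
        return False

    return search()

utils = [[5, 3, 1], [4, 2, 6]]                 # 2 agents, 3 objects
print(is_sequenceable(utils, [{0, 1}, {2}]))   # True: picking sequence (0, 1, 0)
```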