July 28, 2019

Paper Group ANR 359

Electricity Theft Detection using Machine Learning


Title	Electricity Theft Detection using Machine Learning
Authors	Niklas Dahringer
Abstract	Non-technical losses (NTL) in electric power grids arise through electricity theft, broken electric meters or billing errors. They can harm the power supplier as well as the whole economy of a country through losses of up to 40% of the total power distribution. For NTL detection, researchers use artificial intelligence to analyse data. This work is about improving the extraction of more meaningful features from a data set. With these features, the prediction quality will increase.
Tasks
Published	2017-08-19
URL	http://arxiv.org/abs/1708.05907v1
PDF	http://arxiv.org/pdf/1708.05907v1.pdf
PWC	https://paperswithcode.com/paper/electricity-theft-detection-using-machine
Repo
Framework

Tackling Dynamic Vehicle Routing Problem with Time Windows by means of Ant Colony System


Title	Tackling Dynamic Vehicle Routing Problem with Time Windows by means of Ant Colony System
Authors	Raluca Necula, Mihaela Breaban, Madalina Raschip
Abstract	The Dynamic Vehicle Routing Problem with Time Windows (DVRPTW) is an extension of the well-known Vehicle Routing Problem (VRP), which takes into account the dynamic nature of the problem. This aspect requires the vehicle routes to be updated in an ongoing manner as new customer requests arrive in the system and must be incorporated into an evolving schedule during the working day. Besides the vehicle capacity constraint involved in the classical VRP, DVRPTW additionally considers time windows, which better capture real-world situations. Despite this, few studies so far have focused on this problem of greater practical importance. To this end, this study devises an ant colony optimization based algorithm for solving DVRPTW, which relies on a joint solution construction mechanism able to construct the vehicle routes in parallel. This method is coupled with a local search procedure, aimed at further improving the solutions built by ants, and with an insertion heuristic, which tries to reduce the number of vehicles used to service the available customers. The experiments indicate that the proposed algorithm is competitive and effective, and on DVRPTW instances with a higher dynamicity level, it is able to yield better results compared to existing ant-based approaches.
Tasks
Published	2017-04-06
URL	http://arxiv.org/abs/1704.01859v1
PDF	http://arxiv.org/pdf/1704.01859v1.pdf
PWC	https://paperswithcode.com/paper/tackling-dynamic-vehicle-routing-problem-with
Repo
Framework
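
The abstract above builds on the Ant Colony System (ACS). As a rough illustration of how an ant might pick its next customer, here is a minimal sketch of the standard ACS pseudo-random proportional rule. It is not the authors' joint construction mechanism: the function name, the parameters beta and q0, and the dictionary-based pheromone and distance tables are all assumptions made for this example, and the candidate list is assumed to be already filtered for capacity and time-window feasibility.

```python
import random


def choose_next_customer(current, candidates, pheromone, distance,
                         beta=2.0, q0=0.9, rng=random):
    """Pick the next customer with the ACS pseudo-random proportional rule.

    `candidates` is assumed to contain only customers that are still
    feasible (capacity and time window); that filtering is not shown here.
    `pheromone` and `distance` are dicts keyed by (from, to) pairs.
    """
    # Attractiveness of each move: pheromone times (1 / distance) ** beta.
    scores = {
        c: pheromone[(current, c)] * (1.0 / distance[(current, c)]) ** beta
        for c in candidates
    }
    if rng.random() < q0:
        # Exploitation: greedily take the most attractive customer.
        return max(scores, key=scores.get)
    # Exploration: roulette-wheel selection proportional to the scores.
    threshold = rng.random() * sum(scores.values())
    cumulative = 0.0
    chosen = None
    for c, s in scores.items():
        chosen = c
        cumulative += s
        if threshold <= cumulative:
            break
    return chosen


if __name__ == "__main__":
    # Hypothetical toy instance: depot 0 and three customers.
    tau = {(0, 1): 1.0, (0, 2): 1.0, (0, 3): 1.0}
    dist = {(0, 1): 2.0, (0, 2): 1.0, (0, 3): 4.0}
    print(choose_next_customer(0, [1, 2, 3], tau, dist))
```

With q0 close to 1 the ant almost always follows the best-known edge; lowering q0 trades exploitation for exploration, which is one of the knobs such an algorithm would have to tune for highly dynamic instances.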

A Deep Generative Framework for Paraphrase Generation


Title	A Deep Generative Framework for Paraphrase Generation
Authors	Ankush Gupta, Arvind Agarwal, Prawaan Singh, Piyush Rai
Abstract	Paraphrase generation is an important problem in NLP, especially in question answering, information retrieval, information extraction, and conversation systems, to name a few. In this paper, we address the problem of generating paraphrases automatically. Our proposed method is based on a combination of deep generative models (VAE) with sequence-to-sequence models (LSTM) to generate paraphrases, given an input sentence. Traditional VAEs when combined with recurrent neural networks can generate free text but they are not suitable for paraphrase generation for a given sentence. We address this problem by conditioning both the encoder and decoder sides of the VAE on the original sentence, so that it can generate the given sentence’s paraphrases. Unlike most existing models, our model is simple, modular and can generate multiple paraphrases for a given sentence. Quantitative evaluation of the proposed method on a benchmark paraphrase dataset demonstrates its efficacy, and its performance improvement over the state-of-the-art methods by a significant margin, whereas qualitative human evaluation indicates that the generated paraphrases are well-formed, grammatically correct, and relevant to the input sentence. Furthermore, we evaluate our method on a newly released question paraphrase dataset, and establish a new baseline for future research.
Tasks	Information Retrieval, Paraphrase Generation, Question Answering
Published	2017-09-15
URL	http://arxiv.org/abs/1709.05074v1
PDF	http://arxiv.org/pdf/1709.05074v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-generative-framework-for-paraphrase
Repo
Framework
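
Conditioning both the encoder and the decoder on the original sentence, as the abstract above describes, corresponds to a conditional-VAE style objective. The following is a sketch of the usual evidence lower bound under that reading; the notation is an assumption made here, not taken from the paper: x^{(o)} is the original sentence, x^{(p)} its paraphrase, z the latent code, q_\phi the encoder, p_\theta the decoder, and p(z) a fixed prior.

```latex
\log p_\theta\big(x^{(p)} \mid x^{(o)}\big)
\;\ge\;
\mathbb{E}_{q_\phi(z \mid x^{(o)},\, x^{(p)})}
  \Big[ \log p_\theta\big(x^{(p)} \mid z,\, x^{(o)}\big) \Big]
\;-\;
\mathrm{KL}\Big( q_\phi\big(z \mid x^{(o)}, x^{(p)}\big) \,\Big\|\, p(z) \Big)
```

Under this formulation, sampling several values of z from the prior and decoding each one conditioned on x^{(o)} would yield multiple paraphrases of the same sentence.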

Enhancing workflow-nets with data for trace completion


Title	Enhancing workflow-nets with data for trace completion
Authors	Riccardo De Masellis, Chiara Di Francescomarino, Chiara Ghidini, Sergio Tessaris
Abstract	The growing adoption of IT-systems for modeling and executing (business) processes or services has thrust the scientific investigation towards techniques and tools which support more complex forms of process analysis. Many of them, such as conformance checking, process alignment, mining and enhancement, rely on complete observation of past (tracked and logged) executions. In many real cases, however, the lack of human or IT-support on all the steps of process execution, as well as information hiding and abstraction of model and data, result in incomplete log information of both data and activities. This paper tackles the issue of automatically repairing traces with missing information by notably considering not only activities but also data manipulated by them. Our technique recasts such a problem as a reachability problem and provides an encoding in an action language, which allows virtually any state-of-the-art planner to be used to return solutions.
Tasks
Published	2017-06-01
URL	http://arxiv.org/abs/1706.00356v1
PDF	http://arxiv.org/pdf/1706.00356v1.pdf
PWC	https://paperswithcode.com/paper/enhancing-workflow-nets-with-data-for-trace
Repo
Framework

Distributed Dual Coordinate Ascent in General Tree Networks and Its Application in Federated Learning


Title	Distributed Dual Coordinate Ascent in General Tree Networks and Its Application in Federated Learning
Authors	Myung Cho, Lifeng Lai, Weiyu Xu
Abstract	Due to the large size of data and the limited storage volume of a single computer or a single server, data are often stored in a distributed manner. Thus, performing large-scale machine learning operations with the distributed datasets through communication networks is often required. In this paper, we investigate the impact of network communication constraints on the convergence speed of the communication-efficient distributed machine learning algorithm. Firstly, we study the convergence rate of the distributed dual coordinate ascent algorithm in a general tree structured network. Since a tree network model can be understood as the generalization of a star network model, our algorithm can be thought of as the generalization of the distributed dual coordinate ascent in a star network model. Secondly, by considering network communication delays, we optimize the network-constrained distributed dual coordinate ascent algorithm to maximize its convergence speed. In numerical experiments, we take into account federated learning scenarios, where local workers cannot directly reach a central node due to communication constraints. Through extensive numerical experiments, we demonstrate the usability of our distributed dual coordinate ascent algorithm in a tree network in the federated learning scenarios. Additionally, under different network communication delays, the delay-dependent number of local and global iterations in the distributed dual coordinate ascent can improve the convergence speed.
Tasks
Published	2017-03-14
URL	https://arxiv.org/abs/1703.04785v5
PDF	https://arxiv.org/pdf/1703.04785v5.pdf
PWC	https://paperswithcode.com/paper/network-constrained-distributed-dual
Repo
Framework

Multi-channel Encoder for Neural Machine Translation


Title	Multi-channel Encoder for Neural Machine Translation
Authors	Hao Xiong, Zhongjun He, Xiaoguang Hu, Hua Wu
Abstract	The attention-based Encoder-Decoder is an effective architecture for neural machine translation (NMT), which typically relies on recurrent neural networks (RNN) to build the blocks that will later be called by the attentive reader during the decoding process. This design of encoder yields relatively uniform composition on the source sentence, despite the gating mechanism employed in the encoding RNN. On the other hand, we often want the decoder to take pieces of the source sentence at varying levels suiting its own linguistic structure: for example, we may want to take the entity name in its raw form while taking an idiom as a perfectly composed unit. Motivated by this demand, we propose Multi-channel Encoder (MCE), which enhances encoding components with different levels of composition. More specifically, in addition to the hidden state of encoding RNN, MCE takes 1) the original word embedding for raw encoding with no composition, and 2) a particular design of external memory in Neural Turing Machine (NTM) for more complex composition, while all three encoding strategies are properly blended during decoding. Empirical study on Chinese-English translation shows that our model can improve by 6.52 BLEU points upon a strong open source NMT system: DL4MT. On the WMT14 English-French task, our single shallow system achieves BLEU=38.8, comparable with the state-of-the-art deep models.
Tasks	Machine Translation
Published	2017-12-06
URL	http://arxiv.org/abs/1712.02109v1
PDF	http://arxiv.org/pdf/1712.02109v1.pdf
PWC	https://paperswithcode.com/paper/multi-channel-encoder-for-neural-machine
Repo
Framework

Polyp detection inside the capsule endoscopy: an approach for power consumption reduction


Title	Polyp detection inside the capsule endoscopy: an approach for power consumption reduction
Authors	Mohammad Amin Khorsandi, Nader Karimi, Shadrokh Samavi
Abstract	Capsule endoscopy is a novel and non-invasive method for diagnosis, which assists gastroenterologists in monitoring the digestive tract. Although this new technology has many advantages over conventional endoscopy, there are weaknesses that limit its usage. Some weaknesses are due to the use of small batteries. The radio transmitter consumes the largest portion of energy; consequently, a simple way to reduce the power consumption is to reduce the data to be transmitted. Many works proposed to reduce the amount of data to be transmitted consist of specific compression methods and reductions in video resolution and frame rate. We propose a system inside the capsule for detecting informative frames and sending these frames instead of several non-informative frames. In this work, we specifically focus on a hardware-friendly algorithm (capable of parallelism and pipelining) for the implementation of polyp detection. Two features, positive contrast and customized edges of polyps, are exploited to determine whether the frame contains a polyp or not. The proposed method is devoid of complex and iterative structures to save power and reduce the response time. Experimental results indicate an acceptable detection rate for our work.
Tasks
Published	2017-12-29
URL	http://arxiv.org/abs/1712.10164v1
PDF	http://arxiv.org/pdf/1712.10164v1.pdf
PWC	https://paperswithcode.com/paper/polyp-detection-inside-the-capsule-endoscopy
Repo
Framework

Current-mode Memristor Crossbars for Neuromemristive Systems


Title	Current-mode Memristor Crossbars for Neuromemristive Systems
Authors	Cory Merkel
Abstract	Motivated by advantages of current-mode design, this brief contribution explores the implementation of weight matrices in neuromemristive systems via current-mode memristor crossbar circuits. After deriving theoretical results for the range and distribution of weights in the current-mode design, it is shown that any weight matrix based on voltage-mode crossbars can be mapped to a current-mode crossbar if the voltage-mode weights are carefully bounded. Then, a modified gradient descent rule is derived for the current-mode design that can be used to perform backpropagation training. Behavioral simulations on the MNIST dataset indicate that both voltage and current-mode designs are able to achieve similar accuracy and have similar defect tolerance. However, analysis of trained weight distributions reveals that current-mode and voltage-mode designs may use different feature representations.
Tasks
Published	2017-07-17
URL	http://arxiv.org/abs/1707.05316v1
PDF	http://arxiv.org/pdf/1707.05316v1.pdf
PWC	https://paperswithcode.com/paper/current-mode-memristor-crossbars-for
Repo
Framework

Guided Deep List: Automating the Generation of Epidemiological Line Lists from Open Sources


Title	Guided Deep List: Automating the Generation of Epidemiological Line Lists from Open Sources
Authors	Saurav Ghosh, Prithwish Chakraborty, Bryan L. Lewis, Maimuna S. Majumder, Emily Cohn, John S. Brownstein, Madhav V. Marathe, Naren Ramakrishnan
Abstract	Real-time monitoring and responses to emerging public health threats rely on the availability of timely surveillance data. During the early stages of an epidemic, the ready availability of line lists with detailed tabular information about laboratory-confirmed cases can assist epidemiologists in making reliable inferences and forecasts. Such inferences are crucial to understand the epidemiology of a specific disease early enough to stop or control the outbreak. However, construction of such line lists requires considerable human supervision and is therefore difficult to do in real-time. In this paper, we motivate Guided Deep List, the first tool for building automated line lists (in near real-time) from open source reports of emerging disease outbreaks. Specifically, we focus on deriving epidemiological characteristics of an emerging disease and the affected population from reports of illness. Guided Deep List uses distributed vector representations (à la word2vec) to discover a set of indicators for each line list feature. This discovery of indicators is followed by the use of dependency parsing based techniques for final extraction in tabular form. We evaluate the performance of Guided Deep List against a human annotated line list provided by HealthMap corresponding to MERS outbreaks in Saudi Arabia. We demonstrate that Guided Deep List extracts line list features with increased accuracy compared to a baseline method. We further show how these automatically extracted line list features can be used for making epidemiological inferences, such as inferring demographics and symptoms-to-hospitalization period of affected individuals.
Tasks	Dependency Parsing, Epidemiology
Published	2017-02-22
URL	http://arxiv.org/abs/1702.06663v1
PDF	http://arxiv.org/pdf/1702.06663v1.pdf
PWC	https://paperswithcode.com/paper/guided-deep-list-automating-the-generation-of
Repo
Framework

VQABQ: Visual Question Answering by Basic Questions


Title	VQABQ: Visual Question Answering by Basic Questions
Authors	Jia-Hong Huang, Modar Alfadly, Bernard Ghanem
Abstract	Our method takes an image and a question as input and outputs a text-based answer to the question about the given image, a task known as Visual Question Answering (VQA). There are two main modules in our algorithm. Given a natural language question about an image, the first module takes the question as input and then outputs the basic questions of the main given question. The second module takes the main question, image and these basic questions as input and then outputs the text-based answer of the main question. We formulate the basic questions generation problem as a LASSO optimization problem, and also propose a criterion about how to exploit these basic questions to help answer the main question. Our method is evaluated on the challenging VQA dataset and yields state-of-the-art accuracy, 60.34% in the open-ended task.
Tasks	Question Answering, Visual Question Answering
Published	2017-03-19
URL	http://arxiv.org/abs/1703.06492v2
PDF	http://arxiv.org/pdf/1703.06492v2.pdf
PWC	https://paperswithcode.com/paper/vqabq-visual-question-answering-by-basic
Repo
Framework
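
The abstract above casts basic-question generation as a LASSO problem. A generic LASSO objective of the kind it refers to is sketched below. The interpretation of the symbols is an assumption made for illustration: b is a vector embedding of the main question, each column of A is the embedding of one candidate basic question, x holds the weights to be learned, and \lambda > 0 controls sparsity.

```latex
\min_{x} \;\; \tfrac{1}{2}\, \lVert A x - b \rVert_2^2 \;+\; \lambda\, \lVert x \rVert_1
```

Because the \ell_1 penalty drives most entries of x to zero, the few candidates with the largest remaining weights can be read off as the basic questions most related to the main question.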

Transfer Learning to Learn with Multitask Neural Model Search


Title	Transfer Learning to Learn with Multitask Neural Model Search
Authors	Catherine Wong, Andrea Gesmundo
Abstract	Deep learning models require extensive architecture design exploration and hyperparameter optimization to perform well on a given task. The exploration of the model design space is often made by a human expert, and optimized using a combination of grid search and search heuristics over a large space of possible choices. Neural Architecture Search (NAS) is a Reinforcement Learning approach that has been proposed to automate architecture design. NAS has been successfully applied to generate Neural Networks that rival the best human-designed architectures. However, NAS requires sampling, constructing, and training hundreds to thousands of models to achieve well-performing architectures. This procedure needs to be executed from scratch for each new task. The application of NAS to a wide set of tasks currently lacks a way to transfer generalizable knowledge across tasks. In this paper, we present the Multitask Neural Model Search (MNMS) controller. Our goal is to learn a generalizable framework that can condition model construction on successful model searches for previously seen tasks, thus significantly speeding up the search for new tasks. We demonstrate that MNMS can conduct an automated architecture search for multiple tasks simultaneously while still learning well-performing, specialized models for each task. We then show that pre-trained MNMS controllers can transfer learning to new tasks. By leveraging knowledge from previous searches, we find that pre-trained MNMS models start from a better location in the search space and reduce search time on unseen tasks, while still discovering models that outperform published human-designed models.
Tasks	Hyperparameter Optimization, Neural Architecture Search, Transfer Learning
Published	2017-10-30
URL	http://arxiv.org/abs/1710.10776v1
PDF	http://arxiv.org/pdf/1710.10776v1.pdf
PWC	https://paperswithcode.com/paper/transfer-learning-to-learn-with-multitask
Repo
Framework

Dual Motion GAN for Future-Flow Embedded Video Prediction


Title	Dual Motion GAN for Future-Flow Embedded Video Prediction
Authors	Xiaodan Liang, Lisa Lee, Wei Dai, Eric P. Xing
Abstract	Future frame prediction in videos is a promising avenue for unsupervised video representation learning. Video frames are naturally generated by the inherent pixel flows from preceding frames based on the appearance and motion dynamics in the video. However, existing methods focus on directly hallucinating pixel values, resulting in blurry predictions. In this paper, we develop a dual motion Generative Adversarial Net (GAN) architecture, which learns to explicitly enforce future-frame predictions to be consistent with the pixel-wise flows in the video through a dual-learning mechanism. The primal future-frame prediction and dual future-flow prediction form a closed loop, generating informative feedback signals to each other for better video prediction. To make both synthesized future frames and flows indistinguishable from reality, a dual adversarial training method is proposed to ensure that the future-flow prediction is able to help infer realistic future-frames, while the future-frame prediction in turn leads to realistic optical flows. Our dual motion GAN also handles natural motion uncertainty in different pixel locations with a new probabilistic motion encoder, which is based on variational autoencoders. Extensive experiments demonstrate that the proposed dual motion GAN significantly outperforms state-of-the-art approaches on synthesizing new video frames and predicting future flows. Our model generalizes well across diverse visual scenes and shows superiority in unsupervised video representation learning.
Tasks	Representation Learning, Video Prediction
Published	2017-08-01
URL	http://arxiv.org/abs/1708.00284v2
PDF	http://arxiv.org/pdf/1708.00284v2.pdf
PWC	https://paperswithcode.com/paper/dual-motion-gan-for-future-flow-embedded
Repo
Framework

Deep learning analysis of breast MRIs for prediction of occult invasive disease in ductal carcinoma in situ


Title	Deep learning analysis of breast MRIs for prediction of occult invasive disease in ductal carcinoma in situ
Authors	Zhe Zhu, Michael Harowicz, Jun Zhang, Ashirbani Saha, Lars J. Grimm, E. Shelley Hwang, Maciej A. Mazurowski
Abstract	Purpose: To determine whether deep learning-based algorithms applied to breast MR images can aid in the prediction of occult invasive disease following the diagnosis of ductal carcinoma in situ (DCIS) by core needle biopsy. Material and Methods: In this institutional review board-approved study, we analyzed dynamic contrast-enhanced fat-saturated T1-weighted MRI sequences of 131 patients at our institution with a core needle biopsy-confirmed diagnosis of DCIS. The patients had no preoperative therapy before breast MRI and no prior history of breast cancer. We explored two different deep learning approaches to predict whether there was a hidden (occult) invasive component in the analyzed tumors that was ultimately detected at surgical excision. In the first approach, we adopted the transfer learning strategy, in which a network pre-trained on a large dataset of natural images is fine-tuned with our DCIS images. Specifically, we used the GoogleNet model pre-trained on the ImageNet dataset. In the second approach, we used a pre-trained network to extract deep features, and a support vector machine (SVM) that utilizes these features to predict the upstaging of the DCIS. We used 10-fold cross validation and the area under the ROC curve (AUC) to estimate the performance of the predictive models. Results: The best classification performance was obtained using the deep features approach with GoogleNet model pre-trained on ImageNet as the feature extractor and a polynomial kernel SVM used as the classifier (AUC = 0.70, 95% CI: 0.58-0.79). For the transfer learning based approach, the highest AUC obtained was 0.53 (95% CI: 0.41-0.62). Conclusion: Convolutional neural networks could potentially be used to identify occult invasive disease in patients diagnosed with DCIS at the initial core needle biopsy.
Tasks	Transfer Learning
Published	2017-11-28
URL	http://arxiv.org/abs/1711.10577v1
PDF	http://arxiv.org/pdf/1711.10577v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-analysis-of-breast-mris-for
Repo
Framework
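
The evaluation protocol in the abstract above (10-fold cross validation scored by AUC) can be illustrated with a small, dependency-free sketch. This is not the study's code: the function names and the round-robin fold assignment are assumptions made for this example, and the feature extractor and SVM are left out entirely. AUC is computed here as the fraction of (positive, negative) pairs that the scores rank correctly, with ties counted as one half.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k folds (round-robin assignment)."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking of positives vs. negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels and classifier scores, not data from the study.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    # Fold sizes for a cohort of 131 cases split into 10 folds.
    print([len(test) for _, test in k_fold_indices(131, 10)])
```

In a full pipeline, a classifier would be fitted on each training split and its scores on the held-out split passed to roc_auc. An AUC of 0.5 corresponds to chance-level ranking, which is the context for reading the 0.53 and 0.70 figures reported above.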

Hierarchical Model for Long-term Video Prediction


Title	Hierarchical Model for Long-term Video Prediction
Authors	Peter Wang, Zhongxia Yan, Jeff Zhang
Abstract	Video prediction has been an active topic of research in the past few years. Many algorithms focus on pixel-level predictions, which generate results that blur and disintegrate within a few frames. In this project, we use a hierarchical approach for long-term video prediction. We aim at estimating high-level structure in the input frame first, then predicting how that structure evolves in the future. Finally, we use an image analogy network to recover a realistic image from the predicted structure. Our method is largely adopted from the work by Villegas et al. The method is built with a combination of LSTMs and analogy-based convolutional auto-encoder networks. Additionally, in order to generate more realistic frame predictions, we also adopt adversarial loss. We evaluate our method on the Penn Action dataset, and demonstrate good results on high-level long-term structure prediction.
Tasks	Video Prediction
Published	2017-06-27
URL	http://arxiv.org/abs/1706.08665v2
PDF	http://arxiv.org/pdf/1706.08665v2.pdf
PWC	https://paperswithcode.com/paper/hierarchical-model-for-long-term-video
Repo
Framework

Bellman Gradient Iteration for Inverse Reinforcement Learning


Title	Bellman Gradient Iteration for Inverse Reinforcement Learning
Authors	Kun Li, Yanan Sui, Joel W. Burdick
Abstract	This paper develops an inverse reinforcement learning algorithm aimed at recovering a reward function from the observed actions of an agent. We introduce a strategy to flexibly handle different types of actions with two approximations of the Bellman Optimality Equation, and a Bellman Gradient Iteration method to compute the gradient of the Q-value with respect to the reward function. These methods allow us to build a differentiable relation between the Q-value and the reward function and learn an approximately optimal reward function with gradient methods. We test the proposed method in two simulated environments by evaluating the accuracy of different approximations and comparing the proposed method with existing solutions. The results show that even with a linear reward function, the proposed method has a comparable accuracy with the state-of-the-art method adopting a non-linear reward function, and the proposed method is more flexible because it is defined on observed actions instead of trajectories.
Tasks
Published	2017-07-24
URL	http://arxiv.org/abs/1707.07767v1
PDF	http://arxiv.org/pdf/1707.07767v1.pdf
PWC	https://paperswithcode.com/paper/bellman-gradient-iteration-for-inverse
Repo
Framework
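
The abstract above relies on smooth approximations of the Bellman Optimality Equation so that the Q-value becomes differentiable with respect to the reward. As a rough illustration of that idea, the sketch below replaces the hard max over actions with a log-sum-exp soft maximum inside ordinary value iteration. It is not the authors' algorithm: the function names, the sharpness parameter k, the state-only reward, and the toy MDP are all assumptions made for this example, and the gradient computation itself is not shown.

```python
import math


def soft_max(values, k):
    """Smooth approximation of max(values); approaches the true max as k grows."""
    m = max(values)
    # Shift by the maximum before exponentiating for numerical stability.
    return m + math.log(sum(math.exp(k * (v - m)) for v in values)) / k


def soft_value_iteration(transitions, rewards, gamma, k, iters=200):
    """Value iteration with the max over actions replaced by soft_max.

    transitions[s][a] is a list of (next_state, probability) pairs and
    rewards[s] is the reward of state s (a hypothetical state-only reward).
    """
    values = [0.0] * len(transitions)
    for _ in range(iters):
        values = [
            soft_max(
                [rewards[s] + gamma * sum(p * values[t] for t, p in outcomes)
                 for outcomes in transitions[s]],
                k,
            )
            for s in range(len(transitions))
        ]
    return values


if __name__ == "__main__":
    # Toy two-state MDP: from state 0 the agent can stay or move to state 1;
    # state 1 is absorbing and carries reward 1.
    transitions = [
        [[(0, 1.0)], [(1, 1.0)]],
        [[(1, 1.0)], [(1, 1.0)]],
    ]
    print(soft_value_iteration(transitions, [0.0, 1.0], gamma=0.5, k=50.0))
```

Because soft_max is smooth in its inputs, the resulting values vary smoothly with the rewards, which is the property that makes gradient-based reward learning possible. The soft maximum exceeds the hard max by at most log(n)/k for n actions, so a larger k gives a tighter approximation at the cost of sharper gradients.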