Paper Group ANR 359
Electricity Theft Detection using Machine Learning
Title | Electricity Theft Detection using Machine Learning |
Authors | Niklas Dahringer |
Abstract | Non-technical losses (NTL) in electric power grids arise through electricity theft, broken electric meters or billing errors. They can harm the power supplier as well as the whole economy of a country through losses of up to 40% of the total power distribution. For NTL detection, researchers use artificial intelligence to analyse data. This work is about improving the extraction of more meaningful features from a data set. With these features, the prediction quality will increase. |
Tasks | |
Published | 2017-08-19 |
URL | http://arxiv.org/abs/1708.05907v1 |
http://arxiv.org/pdf/1708.05907v1.pdf | |
PWC | https://paperswithcode.com/paper/electricity-theft-detection-using-machine |
Repo | |
Framework | |
Tackling Dynamic Vehicle Routing Problem with Time Windows by means of Ant Colony System
Title | Tackling Dynamic Vehicle Routing Problem with Time Windows by means of Ant Colony System |
Authors | Raluca Necula, Mihaela Breaban, Madalina Raschip |
Abstract | The Dynamic Vehicle Routing Problem with Time Windows (DVRPTW) is an extension of the well-known Vehicle Routing Problem (VRP), which takes into account the dynamic nature of the problem. This aspect requires the vehicle routes to be updated in an ongoing manner as new customer requests arrive in the system and must be incorporated into an evolving schedule during the working day. Besides the vehicle capacity constraint involved in the classical VRP, DVRPTW additionally considers time windows, which better capture real-world situations. Despite this, few studies so far have focused on tackling this problem of greater practical importance. To this end, this study devises an ant colony optimization based algorithm for solving DVRPTW, which relies on a joint solution construction mechanism able to construct the vehicle routes in parallel. This method is coupled with a local search procedure, aimed at further improving the solutions built by the ants, and with an insertion heuristic, which tries to reduce the number of vehicles used to service the available customers. The experiments indicate that the proposed algorithm is competitive and effective, and on DVRPTW instances with a higher dynamicity level, it yields better results than existing ant-based approaches. |
Tasks | |
Published | 2017-04-06 |
URL | http://arxiv.org/abs/1704.01859v1 |
http://arxiv.org/pdf/1704.01859v1.pdf | |
PWC | https://paperswithcode.com/paper/tackling-dynamic-vehicle-routing-problem-with |
Repo | |
Framework | |
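The construction step that the DVRPTW abstract above alludes to can be illustrated with the standard Ant Colony System pseudo-random proportional rule. The sketch below is not the authors' implementation: the function names, the customer dictionary fields (`demand`, `due`), and the parameter defaults `q0` and `beta` are all assumptions made for illustration.

```python
import random


def feasible(customer, load, time_now, travel_time, capacity):
    """Hypothetical check: the demand fits and the window is still open on arrival."""
    arrival = time_now + travel_time
    return load + customer["demand"] <= capacity and arrival <= customer["due"]


def choose_next(candidates, pheromone, heuristic, q0=0.9, beta=2.0, rng=random):
    """Pick the next customer id with the ACS pseudo-random proportional rule.

    pheromone and heuristic map a candidate id to a positive number for the
    edge leading from the current position to that candidate.
    """
    if not candidates:
        return None
    scores = {c: pheromone[c] * heuristic[c] ** beta for c in candidates}
    if rng.random() < q0:
        # Exploitation: take the best-scoring candidate.
        return max(candidates, key=lambda c: scores[c])
    # Exploration: sample a candidate with probability proportional to its score.
    threshold = rng.random() * sum(scores.values())
    running = 0.0
    for c in candidates:
        running += scores[c]
        if threshold <= running:
            return c
    return candidates[-1]
```

In a dynamic setting, a newly arrived request would simply join the candidate list once it passes the feasibility check; how the paper integrates requests into partially executed routes is not described in the abstract.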
A Deep Generative Framework for Paraphrase Generation
Title | A Deep Generative Framework for Paraphrase Generation |
Authors | Ankush Gupta, Arvind Agarwal, Prawaan Singh, Piyush Rai |
Abstract | Paraphrase generation is an important problem in NLP, especially in question answering, information retrieval, information extraction, and conversation systems, to name a few. In this paper, we address the problem of generating paraphrases automatically. Our proposed method is based on a combination of deep generative models (VAE) with sequence-to-sequence models (LSTM) to generate paraphrases, given an input sentence. Traditional VAEs, when combined with recurrent neural networks, can generate free text, but they are not suitable for paraphrase generation for a given sentence. We address this problem by conditioning both the encoder and decoder sides of the VAE on the original sentence, so that it can generate the given sentence’s paraphrases. Unlike most existing models, our model is simple, modular, and can generate multiple paraphrases for a given sentence. Quantitative evaluation of the proposed method on a benchmark paraphrase dataset demonstrates its efficacy and its performance improvement over the state-of-the-art methods by a significant margin, whereas qualitative human evaluation indicates that the generated paraphrases are well-formed, grammatically correct, and relevant to the input sentence. Furthermore, we evaluate our method on a newly released question paraphrase dataset, and establish a new baseline for future research. |
Tasks | Information Retrieval, Paraphrase Generation, Question Answering |
Published | 2017-09-15 |
URL | http://arxiv.org/abs/1709.05074v1 |
http://arxiv.org/pdf/1709.05074v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-generative-framework-for-paraphrase |
Repo | |
Framework | |
Enhancing workflow-nets with data for trace completion
Title | Enhancing workflow-nets with data for trace completion |
Authors | Riccardo De Masellis, Chiara Di Francescomarino, Chiara Ghidini, Sergio Tessaris |
Abstract | The growing adoption of IT-systems for modeling and executing (business) processes or services has thrust the scientific investigation towards techniques and tools which support more complex forms of process analysis. Many of them, such as conformance checking, process alignment, mining and enhancement, rely on complete observation of past (tracked and logged) executions. In many real cases, however, the lack of human or IT-support on all the steps of process execution, as well as information hiding and abstraction of model and data, result in incomplete log information of both data and activities. This paper tackles the issue of automatically repairing traces with missing information by notably considering not only activities but also the data manipulated by them. Our technique recasts this problem as a reachability problem and provides an encoding in an action language, which allows virtually any state-of-the-art planner to be used to return solutions. |
Tasks | |
Published | 2017-06-01 |
URL | http://arxiv.org/abs/1706.00356v1 |
http://arxiv.org/pdf/1706.00356v1.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-workflow-nets-with-data-for-trace |
Repo | |
Framework | |
Distributed Dual Coordinate Ascent in General Tree Networks and Its Application in Federated Learning
Title | Distributed Dual Coordinate Ascent in General Tree Networks and Its Application in Federated Learning |
Authors | Myung Cho, Lifeng Lai, Weiyu Xu |
Abstract | Due to the large size of data and the limited storage volume of a single computer or a single server, data are often stored in a distributed manner. Thus, performing large-scale machine learning operations on distributed datasets through communication networks is often required. In this paper, we investigate the impact of network communication constraints on the convergence speed of a communication-efficient distributed machine learning algorithm. Firstly, we study the convergence rate of the distributed dual coordinate ascent algorithm in a general tree-structured network. Since a tree network model can be understood as the generalization of a star network model, our algorithm can be thought of as the generalization of the distributed dual coordinate ascent in a star network model. Secondly, by considering network communication delays, we optimize the network-constrained distributed dual coordinate ascent algorithm to maximize its convergence speed. In numerical experiments, we take into account federated learning scenarios, where local workers cannot directly reach a central node due to communication constraints. Through extensive numerical experiments, we demonstrate the usability of our distributed dual coordinate ascent algorithm in a tree network in federated learning scenarios. Additionally, under different network communication delays, the delay-dependent number of local and global iterations in the distributed dual coordinate ascent can improve the convergence speed. |
Tasks | |
Published | 2017-03-14 |
URL | https://arxiv.org/abs/1703.04785v5 |
https://arxiv.org/pdf/1703.04785v5.pdf | |
PWC | https://paperswithcode.com/paper/network-constrained-distributed-dual |
Repo | |
Framework | |
Multi-channel Encoder for Neural Machine Translation
Title | Multi-channel Encoder for Neural Machine Translation |
Authors | Hao Xiong, Zhongjun He, Xiaoguang Hu, Hua Wu |
Abstract | The attention-based Encoder-Decoder is an effective architecture for neural machine translation (NMT), which typically relies on recurrent neural networks (RNN) to build the blocks that will later be called by the attentive reader during the decoding process. This encoder design yields a relatively uniform composition of the source sentence, despite the gating mechanism employed in the encoding RNN. On the other hand, we often want the decoder to take pieces of the source sentence at varying levels suiting its own linguistic structure: for example, we may want to take an entity name in its raw form while taking an idiom as a perfectly composed unit. Motivated by this demand, we propose the Multi-channel Encoder (MCE), which enhances encoding components with different levels of composition. More specifically, in addition to the hidden state of the encoding RNN, MCE takes 1) the original word embedding for raw encoding with no composition, and 2) a particular design of external memory in the Neural Turing Machine (NTM) for more complex composition, while all three encoding strategies are properly blended during decoding. An empirical study on Chinese-English translation shows that our model can improve by 6.52 BLEU points upon a strong open source NMT system: DL4MT. On the WMT14 English-French task, our single shallow system achieves BLEU=38.8, comparable with the state-of-the-art deep models. |
Tasks | Machine Translation |
Published | 2017-12-06 |
URL | http://arxiv.org/abs/1712.02109v1 |
http://arxiv.org/pdf/1712.02109v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-channel-encoder-for-neural-machine |
Repo | |
Framework | |
Polyp detection inside the capsule endoscopy: an approach for power consumption reduction
Title | Polyp detection inside the capsule endoscopy: an approach for power consumption reduction |
Authors | Mohammad Amin Khorsandi, Nader Karimi, Shadrokh Samavi |
Abstract | Capsule endoscopy is a novel and non-invasive method for diagnosis, which assists gastroenterologists in monitoring the digestive tract. Although this new technology has many advantages over conventional endoscopy, there are weaknesses that limit its usage. Some weaknesses are due to using small-size batteries. The radio transmitter consumes the largest portion of energy; consequently, a simple way to reduce the power consumption is to reduce the data to be transmitted. Many works have been proposed to reduce the amount of data to be transmitted, consisting of specific compression methods and reductions in video resolution and frame rate. We propose a system inside the capsule for detecting informative frames and sending these frames instead of several non-informative frames. In this work, we specifically focus on a hardware-friendly algorithm (capable of parallelism and pipelining) for implementing polyp detection. Two features, positive contrast and customized edges of polyps, are exploited to determine whether the frame contains a polyp or not. The proposed method is devoid of complex and iterative structures to save power and reduce the response time. Experimental results indicate an acceptable detection rate for our work. |
Tasks | |
Published | 2017-12-29 |
URL | http://arxiv.org/abs/1712.10164v1 |
http://arxiv.org/pdf/1712.10164v1.pdf | |
PWC | https://paperswithcode.com/paper/polyp-detection-inside-the-capsule-endoscopy |
Repo | |
Framework | |
Current-mode Memristor Crossbars for Neuromemristive Systems
Title | Current-mode Memristor Crossbars for Neuromemristive Systems |
Authors | Cory Merkel |
Abstract | Motivated by advantages of current-mode design, this brief contribution explores the implementation of weight matrices in neuromemristive systems via current-mode memristor crossbar circuits. After deriving theoretical results for the range and distribution of weights in the current-mode design, it is shown that any weight matrix based on voltage-mode crossbars can be mapped to a current-mode crossbar if the voltage-mode weights are carefully bounded. Then, a modified gradient descent rule is derived for the current-mode design that can be used to perform backpropagation training. Behavioral simulations on the MNIST dataset indicate that both voltage and current-mode designs are able to achieve similar accuracy and have similar defect tolerance. However, analysis of trained weight distributions reveals that current-mode and voltage-mode designs may use different feature representations. |
Tasks | |
Published | 2017-07-17 |
URL | http://arxiv.org/abs/1707.05316v1 |
http://arxiv.org/pdf/1707.05316v1.pdf | |
PWC | https://paperswithcode.com/paper/current-mode-memristor-crossbars-for |
Repo | |
Framework | |
Guided Deep List: Automating the Generation of Epidemiological Line Lists from Open Sources
Title | Guided Deep List: Automating the Generation of Epidemiological Line Lists from Open Sources |
Authors | Saurav Ghosh, Prithwish Chakraborty, Bryan L. Lewis, Maimuna S. Majumder, Emily Cohn, John S. Brownstein, Madhav V. Marathe, Naren Ramakrishnan |
Abstract | Real-time monitoring and responses to emerging public health threats rely on the availability of timely surveillance data. During the early stages of an epidemic, the ready availability of line lists with detailed tabular information about laboratory-confirmed cases can assist epidemiologists in making reliable inferences and forecasts. Such inferences are crucial to understand the epidemiology of a specific disease early enough to stop or control the outbreak. However, construction of such line lists requires considerable human supervision, and they are therefore difficult to generate in real time. In this paper, we motivate Guided Deep List, the first tool for building automated line lists (in near real-time) from open source reports of emerging disease outbreaks. Specifically, we focus on deriving epidemiological characteristics of an emerging disease and the affected population from reports of illness. Guided Deep List uses distributed vector representations (à la word2vec) to discover a set of indicators for each line list feature. This discovery of indicators is followed by the use of dependency parsing based techniques for final extraction in tabular form. We evaluate the performance of Guided Deep List against a human annotated line list provided by HealthMap corresponding to MERS outbreaks in Saudi Arabia. We demonstrate that Guided Deep List extracts line list features with increased accuracy compared to a baseline method. We further show how these automatically extracted line list features can be used for making epidemiological inferences, such as inferring demographics and symptoms-to-hospitalization period of affected individuals. |
Tasks | Dependency Parsing, Epidemiology |
Published | 2017-02-22 |
URL | http://arxiv.org/abs/1702.06663v1 |
http://arxiv.org/pdf/1702.06663v1.pdf | |
PWC | https://paperswithcode.com/paper/guided-deep-list-automating-the-generation-of |
Repo | |
Framework | |
VQABQ: Visual Question Answering by Basic Questions
Title | VQABQ: Visual Question Answering by Basic Questions |
Authors | Jia-Hong Huang, Modar Alfadly, Bernard Ghanem |
Abstract | Our method takes an image and a question as input and outputs a text-based answer to the query question about the given image, a task known as Visual Question Answering (VQA). There are two main modules in our algorithm. Given a natural language question about an image, the first module takes the question as input and then outputs the basic questions of the main given question. The second module takes the main question, the image and these basic questions as input and then outputs the text-based answer to the main question. We formulate the basic question generation problem as a LASSO optimization problem, and also propose a criterion for how to exploit these basic questions to help answer the main question. Our method is evaluated on the challenging VQA dataset and yields state-of-the-art accuracy, 60.34% in the open-ended task. |
Tasks | Question Answering, Visual Question Answering |
Published | 2017-03-19 |
URL | http://arxiv.org/abs/1703.06492v2 |
http://arxiv.org/pdf/1703.06492v2.pdf | |
PWC | https://paperswithcode.com/paper/vqabq-visual-question-answering-by-basic |
Repo | |
Framework | |
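The VQABQ abstract above states that basic question generation is posed as a LASSO problem. A minimal sketch of what solving such a problem looks like is given below, using plain coordinate descent on the objective 0.5 * ||A x - b||^2 + lam * ||x||_1. The interpretation of `A` (one column per candidate basic question embedding) and `b` (the main question embedding) is an assumption made for illustration, as are all function names; the abstract does not specify the solver or the encoding.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(A, b, lam, sweeps=100):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 by cyclic coordinate descent.

    A is a list of rows (each a list of floats) and b is a list of floats.
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    for _ in range(sweeps):
        for j in range(n_cols):
            # Correlation of column j with the residual that ignores x[j].
            rho = 0.0
            for i in range(n_rows):
                partial = sum(A[i][k] * x[k] for k in range(n_cols) if k != j)
                rho += A[i][j] * (b[i] - partial)
            col_norm_sq = sum(A[i][j] ** 2 for i in range(n_rows))
            x[j] = soft_threshold(rho, lam) / col_norm_sq if col_norm_sq > 0 else 0.0
    return x
```

Under the assumed interpretation, the nonzero entries of the solution would mark the candidate questions most useful for reconstructing the main question, and a larger `lam` would select fewer of them.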
Transfer Learning to Learn with Multitask Neural Model Search
Title | Transfer Learning to Learn with Multitask Neural Model Search |
Authors | Catherine Wong, Andrea Gesmundo |
Abstract | Deep learning models require extensive architecture design exploration and hyperparameter optimization to perform well on a given task. The exploration of the model design space is often made by a human expert, and optimized using a combination of grid search and search heuristics over a large space of possible choices. Neural Architecture Search (NAS) is a Reinforcement Learning approach that has been proposed to automate architecture design. NAS has been successfully applied to generate Neural Networks that rival the best human-designed architectures. However, NAS requires sampling, constructing, and training hundreds to thousands of models to achieve well-performing architectures. This procedure needs to be executed from scratch for each new task. The application of NAS to a wide set of tasks currently lacks a way to transfer generalizable knowledge across tasks. In this paper, we present the Multitask Neural Model Search (MNMS) controller. Our goal is to learn a generalizable framework that can condition model construction on successful model searches for previously seen tasks, thus significantly speeding up the search for new tasks. We demonstrate that MNMS can conduct an automated architecture search for multiple tasks simultaneously while still learning well-performing, specialized models for each task. We then show that pre-trained MNMS controllers can transfer learning to new tasks. By leveraging knowledge from previous searches, we find that pre-trained MNMS models start from a better location in the search space and reduce search time on unseen tasks, while still discovering models that outperform published human-designed models. |
Tasks | Hyperparameter Optimization, Neural Architecture Search, Transfer Learning |
Published | 2017-10-30 |
URL | http://arxiv.org/abs/1710.10776v1 |
http://arxiv.org/pdf/1710.10776v1.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-to-learn-with-multitask |
Repo | |
Framework | |
Dual Motion GAN for Future-Flow Embedded Video Prediction
Title | Dual Motion GAN for Future-Flow Embedded Video Prediction |
Authors | Xiaodan Liang, Lisa Lee, Wei Dai, Eric P. Xing |
Abstract | Future frame prediction in videos is a promising avenue for unsupervised video representation learning. Video frames are naturally generated by the inherent pixel flows from preceding frames based on the appearance and motion dynamics in the video. However, existing methods focus on directly hallucinating pixel values, resulting in blurry predictions. In this paper, we develop a dual motion Generative Adversarial Net (GAN) architecture, which learns to explicitly enforce future-frame predictions to be consistent with the pixel-wise flows in the video through a dual-learning mechanism. The primal future-frame prediction and dual future-flow prediction form a closed loop, generating informative feedback signals to each other for better video prediction. To make both synthesized future frames and flows indistinguishable from reality, a dual adversarial training method is proposed to ensure that the future-flow prediction is able to help infer realistic future-frames, while the future-frame prediction in turn leads to realistic optical flows. Our dual motion GAN also handles natural motion uncertainty in different pixel locations with a new probabilistic motion encoder, which is based on variational autoencoders. Extensive experiments demonstrate that the proposed dual motion GAN significantly outperforms state-of-the-art approaches on synthesizing new video frames and predicting future flows. Our model generalizes well across diverse visual scenes and shows superiority in unsupervised video representation learning. |
Tasks | Representation Learning, Video Prediction |
Published | 2017-08-01 |
URL | http://arxiv.org/abs/1708.00284v2 |
http://arxiv.org/pdf/1708.00284v2.pdf | |
PWC | https://paperswithcode.com/paper/dual-motion-gan-for-future-flow-embedded |
Repo | |
Framework | |
Deep learning analysis of breast MRIs for prediction of occult invasive disease in ductal carcinoma in situ
Title | Deep learning analysis of breast MRIs for prediction of occult invasive disease in ductal carcinoma in situ |
Authors | Zhe Zhu, Michael Harowicz, Jun Zhang, Ashirbani Saha, Lars J. Grimm, E. Shelley Hwang, Maciej A. Mazurowski |
Abstract | Purpose: To determine whether deep learning-based algorithms applied to breast MR images can aid in the prediction of occult invasive disease following the diagnosis of ductal carcinoma in situ (DCIS) by core needle biopsy. Material and Methods: In this institutional review board-approved study, we analyzed dynamic contrast-enhanced fat-saturated T1-weighted MRI sequences of 131 patients at our institution with a core needle biopsy-confirmed diagnosis of DCIS. The patients had no preoperative therapy before breast MRI and no prior history of breast cancer. We explored two different deep learning approaches to predict whether there was a hidden (occult) invasive component in the analyzed tumors that was ultimately detected at surgical excision. In the first approach, we adopted the transfer learning strategy, in which a network pre-trained on a large dataset of natural images is fine-tuned with our DCIS images. Specifically, we used the GoogleNet model pre-trained on the ImageNet dataset. In the second approach, we used a pre-trained network to extract deep features, and a support vector machine (SVM) that utilizes these features to predict the upstaging of the DCIS. We used 10-fold cross validation and the area under the ROC curve (AUC) to estimate the performance of the predictive models. Results: The best classification performance was obtained using the deep features approach with GoogleNet model pre-trained on ImageNet as the feature extractor and a polynomial kernel SVM used as the classifier (AUC = 0.70, 95% CI: 0.58-0.79). For the transfer learning based approach, the highest AUC obtained was 0.53 (95% CI: 0.41-0.62). Conclusion: Convolutional neural networks could potentially be used to identify occult invasive disease in patients diagnosed with DCIS at the initial core needle biopsy. |
Tasks | Transfer Learning |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10577v1 |
http://arxiv.org/pdf/1711.10577v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-analysis-of-breast-mris-for |
Repo | |
Framework | |
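The DCIS abstract above reports performance as the area under the ROC curve (AUC). As a reminder of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. The function name and the 0/1 label encoding are assumptions for illustration; the study's actual evaluation code is not described in the abstract.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels holds 1 for positive cases and 0 for negative cases;
    scores holds the classifier outputs in the same order.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

Read this way, an AUC of 0.5 corresponds to ranking no better than chance, which puts the reported 0.53 for the fine-tuning approach and 0.70 for the deep features approach in context.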
Hierarchical Model for Long-term Video Prediction
Title | Hierarchical Model for Long-term Video Prediction |
Authors | Peter Wang, Zhongxia Yan, Jeff Zhang |
Abstract | Video prediction has been an active topic of research in the past few years. Many algorithms focus on pixel-level predictions, which generate results that blur and disintegrate within a few frames. In this project, we use a hierarchical approach for long-term video prediction. We aim to estimate the high-level structure in the input frame first, then predict how that structure grows in the future. Finally, we use an image analogy network to recover a realistic image from the predicted structure. Our method is largely adopted from the work by Villegas et al. The method is built with a combination of LSTMs and analogy-based convolutional auto-encoder networks. Additionally, in order to generate more realistic frame predictions, we also adopt an adversarial loss. We evaluate our method on the Penn Action dataset, and demonstrate good results on high-level long-term structure prediction. |
Tasks | Video Prediction |
Published | 2017-06-27 |
URL | http://arxiv.org/abs/1706.08665v2 |
http://arxiv.org/pdf/1706.08665v2.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-model-for-long-term-video |
Repo | |
Framework | |
Bellman Gradient Iteration for Inverse Reinforcement Learning
Title | Bellman Gradient Iteration for Inverse Reinforcement Learning |
Authors | Kun Li, Yanan Sui, Joel W. Burdick |
Abstract | This paper develops an inverse reinforcement learning algorithm aimed at recovering a reward function from the observed actions of an agent. We introduce a strategy to flexibly handle different types of actions with two approximations of the Bellman Optimality Equation, and a Bellman Gradient Iteration method to compute the gradient of the Q-value with respect to the reward function. These methods allow us to build a differentiable relation between the Q-value and the reward function and learn an approximately optimal reward function with gradient methods. We test the proposed method in two simulated environments by evaluating the accuracy of different approximations and comparing the proposed method with existing solutions. The results show that even with a linear reward function, the proposed method achieves accuracy comparable to that of the state-of-the-art method adopting a non-linear reward function, and the proposed method is more flexible because it is defined on observed actions instead of trajectories. |
Tasks | |
Published | 2017-07-24 |
URL | http://arxiv.org/abs/1707.07767v1 |
http://arxiv.org/pdf/1707.07767v1.pdf | |
PWC | https://paperswithcode.com/paper/bellman-gradient-iteration-for-inverse |
Repo | |
Framework | |
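The Bellman Gradient Iteration abstract above relies on replacing the hard max in the Bellman Optimality Equation with a differentiable approximation. One common smooth surrogate is a scaled log-sum-exp, sketched below together with a Q-value iteration that uses it. Whether this matches either of the paper's two approximations is an assumption, and the function names, the sharpness parameter `k`, and the state-only reward are illustrative choices.

```python
import math


def smooth_max(values, k=10.0):
    """Scaled log-sum-exp: (1/k) * log(sum(exp(k * v))), computed stably.

    It upper-bounds max(values) and approaches it as k grows.
    """
    m = max(values)
    return m + math.log(sum(math.exp(k * (v - m)) for v in values)) / k


def soft_q_iteration(P, reward, gamma=0.9, k=10.0, iterations=200):
    """Iterate Q(s, a) = r(s) + gamma * sum_t P[s][a][t] * V(t) with V = smooth_max(Q).

    P[s][a][t] is the probability of moving from state s to state t under action a.
    """
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(iterations):
        V = [smooth_max(Q[s], k) for s in range(n_states)]
        Q = [
            [
                reward[s] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
    return Q
```

Because every operation in this iteration is smooth in the reward values, the resulting Q-values can in principle be differentiated with respect to the reward, which is the property the abstract's gradient-based learning depends on.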