Paper Group ANR 384
Dynamic Face Video Segmentation via Reinforcement Learning
Title | Dynamic Face Video Segmentation via Reinforcement Learning |
Authors | Yujiang Wang, Mingzhi Dong, Jie Shen, Yang Wu, Shiyang Cheng, Maja Pantic |
Abstract | For real-time semantic video segmentation, most recent works utilised a dynamic framework with a key scheduler to make online key/non-key decisions. Some works used a fixed key scheduling policy, while others proposed adaptive key scheduling methods based on heuristic strategies, both of which may lead to suboptimal global performance. To overcome this limitation, we model the online key decision process in dynamic video segmentation as a deep reinforcement learning problem and learn an efficient and effective scheduling policy from expert information about decision history and from the process of maximising global return. Moreover, we study the application of dynamic video segmentation to face videos, a field that has not been investigated before. Evaluating on the 300VW dataset, we show that our reinforcement key scheduler outperforms various baselines in terms of both effective key selection and running speed. Further results on the Cityscapes dataset demonstrate that our proposed method can also generalise to other scenarios. To the best of our knowledge, this is the first work to use reinforcement learning for online key-frame decisions in dynamic video segmentation, and also the first work to apply dynamic video segmentation to face videos. |
Tasks | Video Semantic Segmentation |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01296v2 |
https://arxiv.org/pdf/1907.01296v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-face-video-segmentation-via |
Repo | |
Framework | |
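As a rough illustration of the key-scheduling idea, the sketch below poses the online key/non-key choice as a tiny tabular Q-learning problem. The state features, reward shape, and hyperparameters are all assumptions made for illustration, not the paper's deep-RL scheduler.

```python
# Tabular Q-learning sketch of an online key/non-key scheduler (all details
# here are illustrative assumptions, not the paper's deep-RL method).
import numpy as np

N_DEV_BINS = 10   # discretised feature-deviation levels (hypothetical state)
MAX_GAP = 20      # cap on frames elapsed since the last key frame
ACTIONS = (0, 1)  # 0 = propagate features (non-key), 1 = full network (key)

rng = np.random.default_rng(0)
q_table = np.zeros((MAX_GAP + 1, N_DEV_BINS, len(ACTIONS)))

def choose_action(gap, dev_bin, eps=0.1):
    """Epsilon-greedy selection over the tabular Q-function."""
    if rng.random() < eps:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(q_table[gap, dev_bin]))

def reward(action, quality_gain, compute_cost=0.5):
    """Trade segmentation quality against the cost of a full forward pass."""
    return quality_gain - (compute_cost if action == 1 else 0.0)

def q_update(s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One-step Q-learning update towards the maximal future return."""
    q_table[s][a] += alpha * (r + gamma * np.max(q_table[s_next]) - q_table[s][a])

# One illustrative step: large deviation after a long gap should favour a key.
s = (12, 7)
a = choose_action(*s)
q_update(s, a, reward(a, quality_gain=0.8 if a else 0.2),
         s_next=(0, 0) if a else (13, 7))
```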
Improving Semantic Segmentation of Aerial Images Using Patch-based Attention
Title | Improving Semantic Segmentation of Aerial Images Using Patch-based Attention |
Authors | Lei Ding, Hao Tang, Lorenzo Bruzzone |
Abstract | The trade-off between feature representation power and spatial localization accuracy is crucial for the dense classification/semantic segmentation of aerial images. High-level features extracted from the late layers of a neural network are rich in semantic information, yet have blurred spatial details; low-level features extracted from the early layers of a network contain more pixel-level information, but are isolated and noisy. It is therefore difficult to bridge the gap between high- and low-level features, due to their difference in physical information content and spatial distribution. In this work, we contribute to solving this problem by enhancing the feature representation in two ways. On the one hand, a patch attention module (PAM) is proposed to enhance the embedding of context information based on a patch-wise calculation of local attention. On the other hand, an attention embedding module (AEM) is proposed to enrich the semantic information of low-level features by embedding local focus from high-level features. Both of the proposed modules are lightweight and can be applied to process the extracted features of convolutional neural networks (CNNs). Experiments show that, by integrating the proposed modules into the baseline Fully Convolutional Network (FCN), the resulting local attention network (LANet) greatly improves the performance over the baseline and outperforms other attention-based methods on two aerial image datasets. |
Tasks | Semantic Segmentation |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08877v1 |
https://arxiv.org/pdf/1911.08877v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-semantic-segmentation-of-aerial |
Repo | |
Framework | |
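To make the patch-wise attention idea concrete, here is a minimal PyTorch sketch of a PAM-style module: features are pooled per patch, an attention map is computed per patch, and the input is re-weighted. The exact layer layout is an assumption, not the authors' released architecture.

```python
# Illustrative patch-wise local attention (an assumed PAM-style layout, not
# the authors' code). Assumes H and W are divisible by patch_size.
import torch
import torch.nn as nn

class PatchAttention(nn.Module):
    """Computes channel attention per spatial patch instead of per image."""
    def __init__(self, channels, patch_size=4, reduction=4):
        super().__init__()
        self.pool = nn.AvgPool2d(patch_size)          # one descriptor per patch
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.up = nn.Upsample(scale_factor=patch_size, mode="nearest")

    def forward(self, x):
        attn = self.up(self.mlp(self.pool(x)))        # per-patch attention map
        return x * attn                               # re-weight local features

feats = torch.randn(2, 64, 32, 32)                    # dummy CNN features
print(PatchAttention(64)(feats).shape)                # torch.Size([2, 64, 32, 32])
```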
Disentangled Representation Learning with Information Maximizing Autoencoder
Title | Disentangled Representation Learning with Information Maximizing Autoencoder |
Authors | Kazi Nazmul Haque, Siddique Latif, Rajib Rana |
Abstract | Learning a disentangled representation from unlabelled data is a non-trivial problem. In this paper we propose the Information Maximising Autoencoder (InfoAE), in which the encoder learns a powerful disentangled representation by maximising the mutual information between the representation and given information in an unsupervised fashion. We evaluated our model on the MNIST dataset and achieved 98.9% ($\pm 0.1$) test accuracy with completely unsupervised training. |
Tasks | Representation Learning |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.08613v1 |
http://arxiv.org/pdf/1904.08613v1.pdf | |
PWC | https://paperswithcode.com/paper/disentangled-representation-learning-with |
Repo | |
Framework | |
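A minimal sketch of the general recipe, under my own assumptions about the architecture (the paper's exact model is not reproduced here): an autoencoder loss plus an InfoGAN-style variational lower bound on the mutual information between an injected categorical code and the learned representation.

```python
# Sketch of an information-maximising autoencoder objective (assumed layout,
# not the paper's model): reconstruction plus a variational MI lower bound.
import torch
import torch.nn as nn
import torch.nn.functional as F

enc = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 32))
dec = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 784))
code_head = nn.Linear(32, 10)   # recovers the injected categorical code

def info_ae_loss(x_real, lam=1.0):
    # Reconstruction term: the usual autoencoder objective.
    rec = F.mse_loss(dec(enc(x_real)), x_real)
    # MI term: generate from (noise, code) and ask the encoder to recover
    # the code, a standard variational lower bound on I(code; output).
    noise = torch.randn(x_real.size(0), 22)
    code = torch.randint(0, 10, (x_real.size(0),))
    z = torch.cat([noise, F.one_hot(code, 10).float()], dim=1)   # 22+10 = 32
    mi_bound = F.cross_entropy(code_head(enc(dec(z))), code)
    return rec + lam * mi_bound

print(info_ae_loss(torch.rand(8, 784)).item())   # dummy MNIST-like batch
```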
TBT: Targeted Neural Network Attack with Bit Trojan
Title | TBT: Targeted Neural Network Attack with Bit Trojan |
Authors | Adnan Siraj Rakin, Zhezhi He, Deliang Fan |
Abstract | Security of modern Deep Neural Networks (DNNs) is under severe scrutiny as the deployment of these models becomes widespread in many intelligence-based applications. Most recently, DNNs have been attacked through Trojans, which can effectively infect the model during the training phase and get activated only through specific input patterns (i.e., triggers) during inference. In this work, for the first time, we propose a novel Targeted Bit Trojan (TBT) method, which can insert a targeted neural Trojan into a DNN through a bit-flip attack. Our algorithm efficiently generates a trigger specifically designed to locate certain vulnerable bits of DNN weights stored in main memory (i.e., DRAM). The objective is that once the attacker flips these vulnerable bits, the network still operates with normal inference accuracy on benign input. However, when the attacker activates the trigger by embedding it in any input, the network is forced to classify all inputs to a certain target class. We demonstrate that flipping only the few vulnerable bits identified by our method, using available bit-flip techniques (i.e., row-hammer), can transform a fully functional DNN model into a Trojan-infected model. We perform extensive experiments on the CIFAR-10, SVHN and ImageNet datasets with both VGG-16 and ResNet-18 architectures. Our proposed TBT can classify 92% of test images to a target class with as few as 84 bit-flips out of 88 million weight bits on ResNet-18 for the CIFAR-10 dataset. |
Tasks | |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.05193v3 |
https://arxiv.org/pdf/1909.05193v3.pdf | |
PWC | https://paperswithcode.com/paper/tbt-targeted-neural-network-attack-with-bit |
Repo | |
Framework | |
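The core bit-flip step can be illustrated in a few lines. The sketch below is an assumption about the mechanics, not the authors' TBT algorithm: it ranks quantised weights by the gradient magnitude of a hypothetical targeted loss and flips the most significant bit of the top-ranked weight, as a row-hammer attack would in DRAM.

```python
# Hedged illustration of the bit-flip idea (not the authors' TBT algorithm):
# rank 8-bit quantised weights by gradient magnitude, flip the MSB of the
# most vulnerable one. Gradients here are random stand-ins.
import numpy as np

rng = np.random.default_rng(1)
weights_q = rng.integers(0, 256, size=1000, dtype=np.uint8)  # quantised weights
grads = rng.normal(size=1000)                                # d(target loss)/dw

idx = int(np.argmax(np.abs(grads)))      # most "vulnerable" weight
bit = 7                                  # MSB flip has the largest effect
weights_q[idx] ^= np.uint8(1 << bit)     # simulate the row-hammer flip
print(f"flipped bit {bit} of weight {idx}")
```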
Multi-Modal Machine Learning for Flood Detection in News, Social Media and Satellite Sequences
Title | Multi-Modal Machine Learning for Flood Detection in News, Social Media and Satellite Sequences |
Authors | Kashif Ahmad, Konstantin Pogorelov, Mohib Ullah, Michael Riegler, Nicola Conci, Johannes Langguth, Ala Al-Fuqaha |
Abstract | In this paper we present our methods for the MediaEval 2019 Multimedia Satellite Task, which is aiming to extract complementary information associated with adverse events from Social Media and satellites. For the first challenge, we propose a framework jointly utilizing colour, object and scene-level information to predict whether the topic of an article containing an image is a flood event or not. Visual features are combined using early and late fusion techniques, achieving average F1-scores of 82.63, 82.40, 81.40 and 76.77. For the multi-modal flood level estimation, we rely on both visual and textual information, achieving average F1-scores of 58.48 and 46.03, respectively. Finally, for the flooding detection in time-based satellite image sequences we used a combination of classical computer-vision and machine learning approaches, achieving an average F1-score of 58.82%. |
Tasks | |
Published | 2019-10-07 |
URL | https://arxiv.org/abs/1910.02932v1 |
https://arxiv.org/pdf/1910.02932v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-modal-machine-learning-for-flood |
Repo | |
Framework | |
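The early/late fusion distinction mentioned in the abstract is a generic recipe; the toy sketch below shows both, under assumed feature shapes and with a stand-in per-modality classifier.

```python
# Toy early vs. late fusion (generic recipes with made-up feature sizes, not
# the authors' exact pipeline).
import numpy as np

rng = np.random.default_rng(0)
colour_feat = rng.random(64)    # e.g. a colour histogram
object_feat = rng.random(128)   # e.g. an object-detector embedding
scene_feat = rng.random(128)    # e.g. a scene-classifier embedding

# Early fusion: concatenate the features, then train one classifier on them.
early = np.concatenate([colour_feat, object_feat, scene_feat])

# Late fusion: one classifier per modality; combine their scores afterwards.
def clf_score(feat):            # stand-in for a trained per-modality model
    return float(feat.mean())

late = np.mean([clf_score(f) for f in (colour_feat, object_feat, scene_feat)])
print(early.shape, round(late, 3))   # (320,) and an averaged score
```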
Named Entity Recognition in Electronic Health Records Using Transfer Learning Bootstrapped Neural Networks
Title | Named Entity Recognition in Electronic Health Records Using Transfer Learning Bootstrapped Neural Networks |
Authors | Luka Gligic, Andrey Kormilitzin, Paul Goldberg, Alejo Nevado-Holgado |
Abstract | Neural networks (NNs) have become the state of the art in many machine learning applications, especially in image and sound processing [1]. The same, although to a lesser extent [2,3], could be said of natural language processing (NLP) tasks, such as named entity recognition. However, the success of NNs remains dependent on the availability of large labelled datasets, which is a significant hurdle in many important applications. One such case is electronic health records (EHRs), which are arguably the largest source of medical data, most of which lies hidden in natural text [4,5]. Data access is difficult due to data privacy concerns, and therefore annotated datasets are scarce. With scarce data, NNs will likely not be able to extract this hidden information with practical accuracy. In our study, we develop an approach that solves these problems for named entity recognition, obtaining a 94.6 F1 score on the I2B2 2009 Medical Extraction Challenge [6], 4.3 points above the architecture that won the competition. Beyond the official I2B2 challenge, we further achieve 82.4 F1 on extracting relationships between medical terms. To reach this state-of-the-art accuracy, our approach applies transfer learning to leverage datasets annotated for other I2B2 tasks, and designs and trains embeddings that especially benefit from such transfer. |
Tasks | Named Entity Recognition, Transfer Learning |
Published | 2019-01-06 |
URL | https://arxiv.org/abs/1901.01592v2 |
https://arxiv.org/pdf/1901.01592v2.pdf | |
PWC | https://paperswithcode.com/paper/named-entity-recognition-in-electronic-health |
Repo | |
Framework | |
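In spirit, the transfer-learning recipe can be sketched as initialising a tagger's embedding layer from a source task and fine-tuning on the target NER data. The model sizes and architecture below are placeholders, not the paper's exact design.

```python
# Sketch of the transfer-learning recipe in spirit (assumed, not the paper's
# code): embeddings pre-trained on a related task initialise a BiLSTM tagger
# that is then fine-tuned on the target NER data.
import torch
import torch.nn as nn

VOCAB, EMB, TAGS = 5000, 100, 9          # hypothetical sizes

pretrained = torch.randn(VOCAB, EMB)     # stands in for source-task embeddings

class Tagger(nn.Module):
    def __init__(self):
        super().__init__()
        # freeze=False: the transferred embeddings keep training on the target.
        self.emb = nn.Embedding.from_pretrained(pretrained, freeze=False)
        self.lstm = nn.LSTM(EMB, 64, bidirectional=True, batch_first=True)
        self.out = nn.Linear(128, TAGS)  # per-token entity-tag scores

    def forward(self, tokens):
        h, _ = self.lstm(self.emb(tokens))
        return self.out(h)

tags = Tagger()(torch.randint(0, VOCAB, (2, 12)))
print(tags.shape)                        # torch.Size([2, 12, 9])
```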
Holistic evaluation of XML queries with structural preferences on an annotated strong dataguide
Title | Holistic evaluation of XML queries with structural preferences on an annotated strong dataguide |
Authors | Maurice Tchoupé Tchendji, Adolphe Gaius Nkuefone, Thomas Tébougang Tchendji |
Abstract | With the emergence of XML as the de facto format for storing and exchanging information over the Internet, the search for ever more innovative and effective querying techniques is a major ongoing concern of the XML database community. Most studies carried out to address this problem are oriented towards the evaluation of so-called exact queries which, unfortunately (especially in the case of semi-structured documents), are likely to yield either overabundant results (for vague queries) or empty results (for very precise queries). Observing that users who issue queries are not necessarily interested in all possible solutions, but rather in those closest to their needs, an important field of research has opened on the evaluation of preference queries. In this paper, we propose an approach for evaluating such queries when the preferences concern the structure of the document. The investigated solution revolves around an evaluation plan in three phases: rewriting, evaluation and merging. The rewriting phase applies a partitioning-transformation operation to the initial query to obtain a hierarchical set of preference path queries, which are holistically evaluated in the second phase by an instrumented version of the TwigStack algorithm. The merge phase synthesises the best results. |
Tasks | |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.08231v1 |
https://arxiv.org/pdf/1906.08231v1.pdf | |
PWC | https://paperswithcode.com/paper/holistic-evaluation-of-xml-queries-with |
Repo | |
Framework | |
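A very rough sketch of the preference semantics (my assumption; the paper's evaluator is built on an annotated strong dataguide and an instrumented TwigStack, which this toy does not reproduce): try path queries in decreasing preference order and keep the best non-empty answer set.

```python
# Toy preference-ordered evaluation of path queries (assumed semantics).
def evaluate(paths_by_preference, eval_path):
    for rank, queries in enumerate(paths_by_preference):  # rank 0 = most preferred
        results = set().union(*(eval_path(q) for q in queries))
        if results:
            return rank, results   # best non-empty answer set wins
    return None, set()

# Toy document index: path -> matching node ids (hypothetical).
index = {"//book/title": {1, 2}, "//book//title": {1, 2, 3}}
prefs = [["//article/heading"], ["//book/title"], ["//book//title"]]
print(evaluate(prefs, lambda q: index.get(q, set())))     # (1, {1, 2})
```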
Method for Searching of an Optimal Scenario of Impact in Cognitive Maps during Information Operations Recognition
Title | Method for Searching of an Optimal Scenario of Impact in Cognitive Maps during Information Operations Recognition |
Authors | Oleh Dmytrenko, Dmitry Lande, Oleh Andriichuk |
Abstract | In this paper, we consider the problem of choosing the optimal scenario of impact between nodes based on introduced criteria for the optimality of the impact. Two such criteria, called the force of impact and the speed of implementation of the scenario, are considered. To obtain a unique solution to the problem, a multi-criteria assessment of the obtained scenarios using the Pareto principle is applied. Based on the criteria of force of impact and speed of implementation, the choice of the optimal scenario of impact is justified. The results and advantages of the proposed approach in comparison with the Kosko model are presented. |
Tasks | |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.13308v1 |
http://arxiv.org/pdf/1904.13308v1.pdf | |
PWC | https://paperswithcode.com/paper/method-for-searching-of-an-optimal-scenario |
Repo | |
Framework | |
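Pareto filtering over the two named criteria is straightforward to sketch; the scenario values below are invented for illustration.

```python
# Pareto filtering over the two optimality criteria named in the abstract
# (force of impact, speed of implementation); values are made up.
scenarios = {
    "A": (0.9, 0.4),   # (force, speed): higher is better for both
    "B": (0.7, 0.8),
    "C": (0.6, 0.5),   # dominated by B on both criteria
    "D": (0.5, 0.9),
}

def dominates(p, q):
    """p dominates q if p is at least as good everywhere, better somewhere."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

pareto = [s for s, v in scenarios.items()
          if not any(dominates(w, v) for w in scenarios.values())]
print(pareto)   # ['A', 'B', 'D']
```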
May I Check Again? – A simple but efficient way to generate and use contextual dictionaries for Named Entity Recognition. Application to French Legal Texts
Title | May I Check Again? – A simple but efficient way to generate and use contextual dictionaries for Named Entity Recognition. Application to French Legal Texts |
Authors | Valentin Barriere, Amaury Fouret |
Abstract | In this paper we present a new method to learn a model robust to typos for a Named Entity Recognition task. Our improvement over existing methods helps the model take into account the context of the sentence inside a court decision in order to recognize an entity with a typo. We used state-of-the-art models and enriched the last layer of the neural network with high-level information linked to the potential of the word to be a certain type of entity. More precisely, we utilized the similarities between the word and the potential entity candidates in the tagged sentence context. The experiments on a dataset of French court decisions show a 32% reduction in relative F1-score error, upgrading the score obtained with the most competitive fine-tuned state-of-the-art system from 94.85% to 96.52%. |
Tasks | Named Entity Recognition |
Published | 2019-09-08 |
URL | https://arxiv.org/abs/1909.03453v1 |
https://arxiv.org/pdf/1909.03453v1.pdf | |
PWC | https://paperswithcode.com/paper/may-i-check-again-a-simple-but-efficient-way |
Repo | |
Framework | |
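A hedged sketch of the dictionary idea (my reading, not the authors' system): score how close a possibly misspelled token is to entries of a contextual dictionary of entity candidates; such scores would then be fed to the tagger's last layer as extra features.

```python
# Toy typo-tolerant dictionary scoring (illustration; the candidate list and
# similarity measure are assumptions, not the authors' system).
from difflib import SequenceMatcher

person_dict = {"martin", "durand", "lefevre"}     # hypothetical candidates

def typo_similarity(token, dictionary):
    """Best string similarity between the token and any dictionary entry."""
    return max(SequenceMatcher(None, token.lower(), e).ratio()
               for e in dictionary)

for tok in ("Durant", "contrat"):                 # 'Durant' ~ typo of 'Durand'
    print(tok, round(typo_similarity(tok, person_dict), 2))
```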
Dynamic Packed Compact Tries Revisited
Title | Dynamic Packed Compact Tries Revisited |
Authors | Kazuya Tsuruta, Dominik Köppl, Shunsuke Kanda, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda |
Abstract | Given a dynamic set $K$ of $k$ strings of total length $n$ whose characters are drawn from an alphabet of size $\sigma$, a keyword dictionary is a data structure built on $K$ that provides lookup, prefix search, and update operations on $K$. Under the assumption that $\alpha = w/ \lg \sigma$ characters fit into a single machine word of $w$ bits, we propose a keyword dictionary that represents $K$ in either $n \lg \sigma + \Theta(k \lg n)$ or $T \lg \sigma + \Theta(k w)$ bits of space, where $T$ is the number of nodes of a trie representing $K$. It supports all operations in $O(m / \alpha + \lg \alpha)$ expected time on an input string of length $m$ in the word RAM model. An exhaustive practical evaluation highlights the practical usefulness of the proposed data structure, especially for prefix searches - one of the most essential keyword dictionary operations. |
Tasks | |
Published | 2019-04-16 |
URL | https://arxiv.org/abs/1904.07467v2 |
https://arxiv.org/pdf/1904.07467v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-packed-compact-tries-revisited |
Repo | |
Framework | |
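As a loose Python analogue of the packing idea (illustration only; the paper's structure is a word-packed trie with far stronger space and time guarantees), a path-compressed trie compares whole substrings per step rather than single characters:

```python
# Path-compressed trie in plain Python: each edge stores a whole substring,
# so a lookup step compares many characters at once.
class CompactTrie:
    def __init__(self):
        self.children = {}   # first char -> (edge label, subtree)
        self.is_key = False

    def insert(self, s):
        if not s:
            self.is_key = True
            return
        if s[0] not in self.children:
            leaf = CompactTrie()
            leaf.is_key = True
            self.children[s[0]] = (s, leaf)
            return
        label, child = self.children[s[0]]
        i = 0                # longest common prefix of edge label and input
        while i < min(len(label), len(s)) and label[i] == s[i]:
            i += 1
        if i < len(label):   # split the edge at the mismatch point
            mid = CompactTrie()
            mid.children[label[i]] = (label[i:], child)
            self.children[s[0]] = (label[:i], mid)
            child = mid
        child.insert(s[i:])

    def lookup(self, s):
        if not s:
            return self.is_key
        if s[0] not in self.children:
            return False
        label, child = self.children[s[0]]
        return s.startswith(label) and child.lookup(s[len(label):])

t = CompactTrie()
for w in ("tree", "trie", "try"):
    t.insert(w)
print(t.lookup("trie"), t.lookup("tri"))   # True False
```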
Label Smoothing and Logit Squeezing: A Replacement for Adversarial Training?
Title | Label Smoothing and Logit Squeezing: A Replacement for Adversarial Training? |
Authors | Ali Shafahi, Amin Ghiasi, Furong Huang, Tom Goldstein |
Abstract | Adversarial training is one of the strongest defenses against adversarial attacks, but it requires adversarial examples to be generated for every mini-batch during optimization. The expense of producing these examples during training often precludes adversarial training from use on complex image datasets. In this study, we explore the mechanisms by which adversarial training improves classifier robustness, and show that these mechanisms can be effectively mimicked using simple regularization methods, including label smoothing and logit squeezing. Remarkably, using these simple regularization methods in combination with Gaussian noise injection, we are able to achieve strong adversarial robustness – often exceeding that of adversarial training – using no adversarial examples. |
Tasks | |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11585v1 |
https://arxiv.org/pdf/1910.11585v1.pdf | |
PWC | https://paperswithcode.com/paper/label-smoothing-and-logit-squeezing-a |
Repo | |
Framework | |
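Both regularisers have common textbook forms, sketched below; the hyperparameters are placeholders, and the Gaussian noise injection the paper combines them with would be applied to the inputs during training.

```python
# Label smoothing + logit squeezing in their common textbook forms (assumed
# hyperparameters). Requires PyTorch >= 1.10 for the label_smoothing kwarg.
import torch
import torch.nn.functional as F

def smooth_squeeze_loss(logits, labels, smoothing=0.1, beta=0.05):
    """Cross-entropy with softened targets plus an L2 penalty on the logits."""
    ce = F.cross_entropy(logits, labels, label_smoothing=smoothing)
    squeeze = beta * logits.pow(2).sum(dim=1).mean()   # "logit squeezing"
    return ce + squeeze

logits = torch.randn(16, 10)
labels = torch.randint(0, 10, (16,))
print(smooth_squeeze_loss(logits, labels).item())
```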
Argument Mining for Understanding Peer Reviews
Title | Argument Mining for Understanding Peer Reviews |
Authors | Xinyu Hua, Mitko Nikolov, Nikhil Badugu, Lu Wang |
Abstract | Peer-review plays a critical role in the scientific writing and publication ecosystem. To assess the efficiency and efficacy of the reviewing process, one essential element is to understand and evaluate the reviews themselves. In this work, we study the content and structure of peer reviews under the argument mining framework, through automatically detecting (1) argumentative propositions put forward by reviewers, and (2) their types (e.g., evaluating the work or making suggestions for improvement). We first collect 14.2K reviews from major machine learning and natural language processing venues. 400 reviews are annotated with 10,386 propositions and corresponding types of Evaluation, Request, Fact, Reference, or Quote. We then train state-of-the-art proposition segmentation and classification models on the data to evaluate their utilities and identify new challenges for this new domain, motivating future directions for argument mining. Further experiments show that proposition usage varies across venues in amount, type, and topic. |
Tasks | Argument Mining |
Published | 2019-03-25 |
URL | http://arxiv.org/abs/1903.10104v1 |
http://arxiv.org/pdf/1903.10104v1.pdf | |
PWC | https://paperswithcode.com/paper/argument-mining-for-understanding-peer |
Repo | |
Framework | |
Space Efficient Algorithms for Breadth-Depth Search
Title | Space Efficient Algorithms for Breadth-Depth Search |
Authors | Sankardeep Chakraborty, Anish Mukherjee, Srinivasa Rao Satti |
Abstract | Continuing the recent trend, in this article we design several space-efficient algorithms for two well-known graph search methods. Both search methods share the same name, {\it breadth-depth search} (henceforth {\sf BDS}), although they work in entirely different fashions. The classical implementation of these graph search methods takes $O(m+n)$ time and $O(n \lg n)$ bits of space in the standard word RAM model (with word size $\Theta(\lg n)$ bits), where $m$ and $n$ denote the number of edges and vertices of the input graph respectively. Our goal here is to beat the space bound of the classical implementations and design $o(n \lg n)$ space algorithms for these search methods, paying little to no penalty in the running time. Note that our space bounds (i.e., with $o(n \lg n)$ bits of space) do not even allow us to explicitly store the information required to implement the classical algorithms, yet our algorithms visit and report all the vertices of the input graph in the correct order. |
Tasks | |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.07874v1 |
https://arxiv.org/pdf/1906.07874v1.pdf | |
PWC | https://paperswithcode.com/paper/space-efficient-algorithms-for-breadth-depth |
Repo | |
Framework | |
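One classical stack-based BDS variant (when a vertex is expanded, all of its unvisited neighbours are pushed at once) is easy to state; the sketch below uses the plain bookkeeping whose space the paper aims to shrink.

```python
# One classical breadth-depth search variant: expand a vertex by pushing all
# of its unvisited neighbours onto the stack at once.
def breadth_depth_search(adj, start):
    visited, order = set(), []
    stack = [start]
    while stack:
        v = stack.pop()
        if v in visited:
            continue
        visited.add(v)
        order.append(v)
        stack.extend(u for u in adj[v] if u not in visited)
    return order

adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(breadth_depth_search(adj, 0))   # [0, 2, 3, 1]
```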
Overfitting Mechanism and Avoidance in Deep Neural Networks
Title | Overfitting Mechanism and Avoidance in Deep Neural Networks |
Authors | Shaeke Salman, Xiuwen Liu |
Abstract | Assisted by the availability of data and high-performance computing, deep learning techniques have achieved breakthroughs and surpassed human performance empirically in difficult tasks, including object recognition, speech recognition, and natural language processing. As they are being used in critical applications, understanding the underlying mechanisms of their successes and limitations is imperative. In this paper, we show that overfitting, one of the fundamental issues in deep neural networks, is due to continuous gradient updating and the scale sensitivity of the cross-entropy loss. By separating samples into correctly and incorrectly classified ones, we show that they behave very differently: the loss decreases for the correct ones and increases for the incorrect ones. Furthermore, by analyzing the dynamics during training, we propose a consensus-based classification algorithm that enables us to avoid overfitting and significantly improve classification accuracy, especially when the number of training samples is limited. As each trained neural network depends on extrinsic factors such as initial values as well as training data, requiring consensus among multiple models reduces extrinsic factors substantially; for statistically independent models, the reduction is exponential. Compared to ensemble algorithms, the proposed algorithm avoids overgeneralization by not classifying ambiguous inputs. Systematic experimental results demonstrate the effectiveness of the proposed algorithm. For example, using only 1000 training samples from the MNIST dataset, the proposed algorithm achieves 95% accuracy, significantly higher than any of the individual models, with 90% of the test samples classified. |
Tasks | Object Recognition, Speech Recognition |
Published | 2019-01-19 |
URL | http://arxiv.org/abs/1901.06566v1 |
http://arxiv.org/pdf/1901.06566v1.pdf | |
PWC | https://paperswithcode.com/paper/overfitting-mechanism-and-avoidance-in-deep |
Repo | |
Framework | |
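The consensus mechanism can be sketched directly from the abstract: several independently trained models vote, and an input is classified only when they all agree, otherwise it is rejected as ambiguous. The toy models below are stand-ins.

```python
# Consensus-based classification as described in the abstract (assumed
# mechanics): classify only on unanimous agreement, otherwise abstain.
def consensus_predict(models, x):
    preds = [m(x) for m in models]
    return preds[0] if len(set(preds)) == 1 else None   # None = abstain

# Toy stand-ins for independently trained classifiers.
models = [lambda x: int(x > 0.5),
          lambda x: int(x > 0.45),
          lambda x: int(x > 0.55)]
print(consensus_predict(models, 0.7))   # 1    (all agree)
print(consensus_predict(models, 0.5))   # None (models disagree -> abstain)
```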
How Good is Artificial Intelligence at Automatically Answering Consumer Questions Related to Alzheimer’s Disease?
Title | How Good is Artificial Intelligence at Automatically Answering Consumer Questions Related to Alzheimer’s Disease? |
Authors | Yanshan Wang, Krishna B. Soundararajan, Sunyang Fu, Luke A. Carlson, Rebecca A. Smith, David S. Knopman, Hongfang Liu |
Abstract | Alzheimer’s Disease (AD) is the most common type of dementia, comprising 60-80% of cases. There were an estimated 5.8 million Americans living with Alzheimer’s dementia in 2019, and this number will almost double every 20 years. The total lifetime cost of care for someone with dementia was estimated at $350,174 in 2018, 70% of which is associated with family-provided care. Most family caregivers face emotional, financial and physical difficulties. As a medium to relieve this burden, online communities on social media websites such as Twitter, Reddit, and Yahoo! Answers provide potential venues for caregivers to search for relevant questions and answers, or to post questions and seek answers from other members. However, there are often a limited number of relevant questions and responses to search from, and posted questions are rarely answered immediately. Due to recent advances in Artificial Intelligence (AI), particularly Natural Language Processing (NLP), we propose to utilize AI to automatically generate answers to AD-related consumer questions posted by caregivers and to evaluate how good AI is at answering those questions. To the best of our knowledge, this is the first study in the literature applying and evaluating AI models designed to automatically answer consumer questions related to AD. |
Tasks | |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.10678v1 |
https://arxiv.org/pdf/1908.10678v1.pdf | |
PWC | https://paperswithcode.com/paper/how-good-is-artificial-intelligence-at |
Repo | |
Framework | |