Paper Group ANR 805
Triple Generative Adversarial Networks
Title | Triple Generative Adversarial Networks |
Authors | Chongxuan Li, Kun Xu, Jiashuo Liu, Jun Zhu, Bo Zhang |
Abstract | Generative adversarial networks (GANs) have shown promise in image generation and classification given limited supervision. Existing methods extend the unsupervised GAN framework to incorporate supervision heuristically. Specifically, a single discriminator plays the two incompatible roles of identifying fake samples and predicting labels, and it only estimates the data without considering the labels. The formulation intrinsically causes two problems: (1) the generator and the discriminator (i.e., the classifier) may not converge to the data distribution at the same time; and (2) the generator cannot control the semantics of the generated samples. In this paper, we present the triple generative adversarial network (Triple-GAN), which consists of three players: a generator, a classifier, and a discriminator. The generator and the classifier characterize the conditional distributions between images and labels, and the discriminator solely focuses on identifying fake image-label pairs. We design compatible objective functions to ensure that the distributions characterized by the generator and the classifier converge to the data distribution. We evaluate Triple-GAN in two challenging settings, namely, semi-supervised learning and the extreme low-data regime. In both settings, Triple-GAN can achieve state-of-the-art classification results among deep generative models and simultaneously generate meaningful samples in a specific class. |
Tasks | Image Generation |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.09784v1 |
https://arxiv.org/pdf/1912.09784v1.pdf | |
PWC | https://paperswithcode.com/paper/triple-generative-adversarial-networks |
Repo | |
Framework | |
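The three-player objective described in the abstract can be sketched as follows. This is a minimal toy sketch: the discriminator function, the mixing weight `alpha`, and the random "images" are hypothetical stand-ins, not the authors' architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(x, y):
    # Hypothetical stand-in: a score in (0, 1) for an (image, label) pair.
    return 1.0 / (1.0 + np.exp(-(x.mean() + y)))

# One real pair, one pair from the generator (image sampled given a label),
# and one pair from the classifier (label predicted for an unlabeled image).
x_real, y_real = rng.normal(size=8), 1
x_gen,  y_gen  = rng.normal(size=8), 0
x_cls,  y_cls  = rng.normal(size=8), 1

alpha = 0.5  # mixes the two fake sources in the discriminator objective

d_real = discriminator(x_real, y_real)
d_gen  = discriminator(x_gen,  y_gen)
d_cls  = discriminator(x_cls,  y_cls)

# The discriminator rewards real pairs and penalizes both fake sources;
# the generator and classifier are both trained against this single score,
# so the discriminator never has to double as a label predictor.
loss_D = -(np.log(d_real)
           + alpha * np.log(1.0 - d_cls)
           + (1.0 - alpha) * np.log(1.0 - d_gen))
```

The point of the three-player split is visible in the loss: identifying fakes and predicting labels are handled by different players, which is what the abstract argues a two-player GAN cannot do consistently.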
Convolutional neural network for detection and classification of seizures in clinical data
Title | Convolutional neural network for detection and classification of seizures in clinical data |
Authors | Tomas Iesmantas, Robertas Alzbutas |
Abstract | Epileptic seizure detection and classification in clinical electroencephalogram data remains a challenge: commercially available seizure detection tools, which are usually patient non-specific, achieve only low sensitivity with a high rate of false positives. Epilepsy patients suffer severe detrimental effects, such as physical injury or depression, due to unpredictable seizures. Even in hospitals, the high rate of false positives makes seizure alert systems of little help to patients, because seizure detection tools are mostly trained on unrealistically clean data, containing little noise and obtained under controlled laboratory conditions, where patient groups are homogeneous, e.g. in terms of age or type of seizures. In this study, the authors present an approach for seizure detection and classification using clinical electroencephalogram data and a convolutional neural network trained on features of brain synchronisation and power spectrum. Various deep learning methods were applied, and the network was trained on a very heterogeneous clinical electroencephalogram dataset. In total, eight different types of seizures were considered, and the patients were of various ages and health conditions and were observed under clinical conditions. Despite this, the classifier presented in this paper achieved sensitivity and specificity of 0.68 and 0.67, respectively, which is a significant improvement over known results for clinical data. |
Tasks | Seizure Detection |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.08864v1 |
http://arxiv.org/pdf/1903.08864v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-network-for-detection |
Repo | |
Framework | |
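The feature extraction the abstract alludes to (power-spectrum and synchronisation features) might look roughly like this. The band definitions, sampling rate, and correlation-based synchronisation measure below are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np

def band_powers(signal, fs, bands=((1, 4), (4, 8), (8, 13), (13, 30))):
    """Mean spectral power in the classic EEG bands (delta/theta/alpha/beta)."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    return [psd[(freqs >= lo) & (freqs < hi)].mean() for lo, hi in bands]

def synchronisation(a, b):
    """Zero-lag correlation as a simple stand-in for a synchronisation measure."""
    return float(np.corrcoef(a, b)[0, 1])

fs = 256  # Hz, a common clinical EEG sampling rate
t = np.arange(fs * 4) / fs
ch1 = np.sin(2 * np.pi * 10 * t)         # 10 Hz alpha-band oscillation
ch2 = np.sin(2 * np.pi * 10 * t + 0.3)   # phase-shifted copy of channel 1

# Per-channel band powers plus a cross-channel synchronisation feature,
# the kind of input vector a CNN could be trained on.
features = band_powers(ch1, fs) + [synchronisation(ch1, ch2)]
```

For the synthetic 10 Hz signal, the alpha-band entry dominates the band powers, and the two phase-shifted channels show high synchronisation.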
Convolutional Neural Networks for Automatic Meter Reading
Title | Convolutional Neural Networks for Automatic Meter Reading |
Authors | Rayson Laroca, Victor Barroso, Matheus A. Diniz, Gabriel R. Gonçalves, William Robson Schwartz, David Menotti |
Abstract | In this paper, we tackle Automatic Meter Reading (AMR) by leveraging the high capability of Convolutional Neural Networks (CNNs). We design a two-stage approach that employs the Fast-YOLO object detector for counter detection and evaluates three different CNN-based approaches for counter recognition. In the AMR literature, most datasets are not available to the research community since the images belong to a service company. In this sense, we introduce a new public dataset, called UFPR-AMR dataset, with 2,000 fully and manually annotated images. This dataset is, to the best of our knowledge, three times larger than the largest public dataset found in the literature and contains a well-defined evaluation protocol to assist the development and evaluation of AMR methods. Furthermore, we propose the use of a data augmentation technique to generate a balanced training set with many more examples to train the CNN models for counter recognition. In the proposed dataset, impressive results were obtained and a detailed speed/accuracy trade-off evaluation of each model was performed. In a public dataset, state-of-the-art results were achieved using less than 200 images for training. |
Tasks | Data Augmentation |
Published | 2019-02-25 |
URL | http://arxiv.org/abs/1902.09600v1 |
http://arxiv.org/pdf/1902.09600v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-for-automatic |
Repo | |
Framework | |
The Materials Science Procedural Text Corpus: Annotating Materials Synthesis Procedures with Shallow Semantic Structures
Title | The Materials Science Procedural Text Corpus: Annotating Materials Synthesis Procedures with Shallow Semantic Structures |
Authors | Sheshera Mysore, Zach Jensen, Edward Kim, Kevin Huang, Haw-Shiuan Chang, Emma Strubell, Jeffrey Flanigan, Andrew McCallum, Elsa Olivetti |
Abstract | Materials science literature contains millions of materials synthesis procedures described in unstructured natural language text. Large-scale analysis of these synthesis procedures would facilitate deeper scientific understanding of materials synthesis and enable automated synthesis planning. Such analysis requires extracting structured representations of synthesis procedures from the raw text as a first step. To facilitate the training and evaluation of synthesis extraction models, we introduce a dataset of 230 synthesis procedures annotated by domain experts with labeled graphs that express the semantics of the synthesis sentences. The nodes in this graph are synthesis operations and their typed arguments, and labeled edges specify relations between the nodes. We describe this new resource in detail and highlight some specific challenges to annotating scientific text with shallow semantic structure. We make the corpus available to the community to promote further research and development of scientific information extraction systems. |
Tasks | |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.06939v2 |
https://arxiv.org/pdf/1905.06939v2.pdf | |
PWC | https://paperswithcode.com/paper/the-materials-science-procedural-text-corpus |
Repo | |
Framework | |
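A minimal data structure for the labeled graphs described in the abstract (operation nodes, typed arguments, labeled edges) might look like this. The node and edge type names are illustrative, not necessarily the corpus's exact annotation schema.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str
    node_type: str   # e.g. "Operation", "Material", "Condition"

@dataclass
class SynthesisGraph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (src_idx, label, dst_idx)

    def add_node(self, text, node_type):
        self.nodes.append(Node(text, node_type))
        return len(self.nodes) - 1

    def add_edge(self, src, label, dst):
        self.edges.append((src, label, dst))

# "Heat the TiO2 powder at 500 C" as one operation with typed arguments.
g = SynthesisGraph()
op = g.add_node("Heat", "Operation")
mat = g.add_node("TiO2 powder", "Material")
cond = g.add_node("500 C", "Condition")
g.add_edge(op, "Recipe_Material", mat)
g.add_edge(op, "Condition_Of", cond)
```

Each sentence of a procedure becomes a small graph of this shape, which is what makes large-scale analysis and synthesis planning over raw text tractable.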
Towards Inconsistency Measurement in Business Rule Bases
Title | Towards Inconsistency Measurement in Business Rule Bases |
Authors | Carl Corea, Matthias Thimm |
Abstract | We investigate the application of inconsistency measures to the problem of analysing business rule bases. Due to some intricacies of the domain of business rule bases, a straightforward application is not feasible. We therefore develop some new rationality postulates for this setting as well as adapt and modify existing inconsistency measures. We further adapt the notion of inconsistency values (or culpability measures) for this setting and give a comprehensive feasibility study. |
Tasks | |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08872v1 |
https://arxiv.org/pdf/1911.08872v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-inconsistency-measurement-in-business |
Repo | |
Framework | |
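One classical family of measures the paper adapts counts minimal inconsistent subsets of a rule base. A brute-force sketch over a toy propositional base (not the authors' business-rule formalism, where such a direct application is exactly what they argue is infeasible):

```python
from itertools import combinations, product

# Toy rule base over atoms a, b: {a, a -> b, !b} is jointly inconsistent,
# while every proper subset of it is satisfiable.
formulas = {
    "a":    lambda v: v["a"],
    "a->b": lambda v: (not v["a"]) or v["b"],
    "!b":   lambda v: not v["b"],
}

def satisfiable(subset):
    # Brute-force truth-table check over the two atoms.
    return any(all(formulas[f]({"a": a, "b": b}) for f in subset)
               for a, b in product([True, False], repeat=2))

def minimal_inconsistent_subsets(names):
    mis = []
    for r in range(1, len(names) + 1):
        for sub in combinations(names, r):
            if not satisfiable(sub) and not any(set(m) <= set(sub) for m in mis):
                mis.append(sub)
    return mis

# The classical MI measure: the number of minimal inconsistent subsets.
mis = minimal_inconsistent_subsets(list(formulas))
```

Here the whole base is the single minimal inconsistent subset, so the MI measure equals 1; culpability measures then distribute that "blame" over the individual rules.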
The Trumpiest Trump? Identifying a Subject’s Most Characteristic Tweets
Title | The Trumpiest Trump? Identifying a Subject’s Most Characteristic Tweets |
Authors | Charuta Pethe, Steven Skiena |
Abstract | The sequence of documents produced by any given author varies in style and content, but some documents are more typical or representative of the source than others. We quantify the extent to which a given short text is characteristic of a specific person, using a dataset of tweets from fifteen celebrities. Such analysis is useful for generating excerpts of high-volume Twitter profiles, and understanding how representativeness relates to tweet popularity. We first consider the related task of binary author detection (is x the author of text T?), and report a test accuracy of 90.37% for the best of five approaches to this problem. We then use these models to compute characterization scores among all of an author’s texts. A user study shows human evaluators agree with our characterization model for all 15 celebrities in our dataset, each with p-value < 0.05. We use these classifiers to show surprisingly strong correlations between characterization scores and the popularity of the associated texts. Indeed, we demonstrate a statistically significant correlation between this score and tweet popularity (likes/replies/retweets) for 13 of the 15 celebrities in our study. |
Tasks | |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.04002v1 |
https://arxiv.org/pdf/1909.04002v1.pdf | |
PWC | https://paperswithcode.com/paper/the-trumpiest-trump-identifying-a-subjects |
Repo | |
Framework | |
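The characterization idea (score each text by how strongly an authorship model attributes it to the subject, then rank) can be sketched with a toy scorer. The word-overlap scorer and the example vocabulary below are hypothetical stand-ins for the paper's trained classifiers.

```python
import numpy as np

def characterization_score(text, signature_words):
    """Toy stand-in for an authorship classifier's probability: the
    fraction of tokens that are characteristic of the subject."""
    tokens = text.lower().split()
    return sum(t in signature_words for t in tokens) / len(tokens)

signature_words = {"huge", "tremendous", "winning", "great"}
tweets = [
    "a tremendous winning huge great day",
    "meeting at noon about the quarterly report",
]
scores = [characterization_score(t, signature_words) for t in tweets]
most_characteristic = tweets[int(np.argmax(scores))]
```

In the paper the scorer is a binary author-detection model (is x the author of text T?), and the same scores are then correlated with likes, replies, and retweets.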
Dynamical Component Analysis (DyCA) and its application on epileptic EEG
Title | Dynamical Component Analysis (DyCA) and its application on epileptic EEG |
Authors | Katharina Korn, Bastian Seifert, Christian Uhl |
Abstract | Dynamical Component Analysis (DyCA) is a recently proposed method to detect projection vectors that reduce the dimensionality of multivariate deterministic datasets. It is based on the solution of a generalized eigenvalue problem and is therefore straightforward to implement. DyCA is introduced and applied to EEG data of epileptic seizures. The obtained eigenvectors are used to project the signal, and the corresponding trajectories in phase space are compared with PCA and ICA projections. The eigenvalues of DyCA are utilized for seizure detection, and the obtained results in terms of specificity, false discovery rate and miss rate are compared to other seizure detection algorithms. |
Tasks | EEG, Seizure Detection |
Published | 2019-02-05 |
URL | http://arxiv.org/abs/1902.01777v1 |
http://arxiv.org/pdf/1902.01777v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamical-component-analysis-dyca-and-its |
Repo | |
Framework | |
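The generalized eigenvalue problem at the heart of DyCA can be sketched with numpy: it is built from correlation matrices of the signal and its time derivative. The matrix notation below follows the usual DyCA formulation but should be treated as illustrative; the toy data replaces real EEG.

```python
import numpy as np

rng = np.random.default_rng(1)

# Multivariate signal X (time x channels) and its time derivative dX:
# two deterministic harmonic channels plus one small noise channel.
t = np.linspace(0, 20, 2000)
X = np.column_stack([np.sin(t), np.cos(t), 0.1 * rng.normal(size=t.size)])
dX = np.gradient(X, t, axis=0)

C0 = X.T @ X / len(t)      # signal autocorrelation
C1 = dX.T @ X / len(t)     # derivative-signal correlation
C2 = dX.T @ dX / len(t)    # derivative autocorrelation

# Generalized eigenvalue problem  C1 C0^{-1} C1^T v = lambda C2 v,
# solved here by reduction to an ordinary eigenproblem.
A = C1 @ np.linalg.inv(C0) @ C1.T
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(C2) @ A)
order = np.argsort(-eigvals.real)
projection = eigvecs[:, order[:2]].real  # top-2 projection vectors
```

Eigenvalues close to 1 flag directions whose dynamics are well captured by a linear differential equation; here the two harmonic channels produce such eigenvalues, while the noise channel does not, which is the property the paper exploits for seizure detection.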
Embodied Visual Recognition
Title | Embodied Visual Recognition |
Authors | Jianwei Yang, Zhile Ren, Mingze Xu, Xinlei Chen, David Crandall, Devi Parikh, Dhruv Batra |
Abstract | Passive visual systems typically fail to recognize objects in the amodal setting where they are heavily occluded. In contrast, humans and other embodied agents have the ability to move in the environment, and actively control the viewing angle to better understand object shapes and semantics. In this work, we introduce the task of Embodied Visual Recognition (EVR): An agent is instantiated in a 3D environment close to an occluded target object, and is free to move in the environment to perform object classification, amodal object localization, and amodal object segmentation. To address this, we develop a new model called Embodied Mask R-CNN, for agents to learn to move strategically to improve their visual recognition abilities. We conduct experiments using the House3D environment. Experimental results show that: 1) agents with embodiment (movement) achieve better visual recognition performance than passive ones; 2) in order to improve visual recognition abilities, agents can learn strategical moving paths that are different from shortest paths. |
Tasks | Object Classification, Object Localization, Semantic Segmentation |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04404v1 |
http://arxiv.org/pdf/1904.04404v1.pdf | |
PWC | https://paperswithcode.com/paper/embodied-visual-recognition |
Repo | |
Framework | |
Joint inference on structural and diffusion MRI for sequence-adaptive Bayesian segmentation of thalamic nuclei with probabilistic atlases
Title | Joint inference on structural and diffusion MRI for sequence-adaptive Bayesian segmentation of thalamic nuclei with probabilistic atlases |
Authors | Juan Eugenio Iglesias, Koen Van Leemput, Polina Golland, Anastasia Yendiki |
Abstract | Segmentation of structural and diffusion MRI (sMRI/dMRI) is usually performed independently in neuroimaging pipelines. However, some brain structures (e.g., globus pallidus, thalamus and its nuclei) can be extracted more accurately by fusing the two modalities. Following the framework of Bayesian segmentation with probabilistic atlases and unsupervised appearance modeling, we present here a novel algorithm to jointly segment multi-modal sMRI/dMRI data. We propose a hierarchical likelihood term for the dMRI defined on the unit ball, which combines the Beta and Dimroth-Scheidegger-Watson distributions to model the data at each voxel. This term is integrated with a mixture of Gaussians for the sMRI data, such that the resulting joint unsupervised likelihood enables the analysis of multi-modal scans acquired with any type of MRI contrast, b-values, or number of directions, which enables wide applicability. We also propose an inference algorithm to estimate the maximum-a-posteriori model parameters from input images, and to compute the most likely segmentation. Using a recently published atlas derived from histology, we apply our method to thalamic nuclei segmentation on two datasets: HCP (state of the art) and ADNI (legacy) - producing lower sample sizes than Bayesian segmentation with sMRI alone. |
Tasks | |
Published | 2019-03-11 |
URL | http://arxiv.org/abs/1903.04352v1 |
http://arxiv.org/pdf/1903.04352v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-inference-on-structural-and-diffusion |
Repo | |
Framework | |
Supervised Machine Learning based Ensemble Model for Accurate Prediction of Type 2 Diabetes
Title | Supervised Machine Learning based Ensemble Model for Accurate Prediction of Type 2 Diabetes |
Authors | Ramya Akula, Ni Nguyen, Ivan Garibay |
Abstract | According to the American Diabetes Association (ADA), 30.3 million people in the United States have diabetes, and 7.2 million of them may be undiagnosed and unaware of their condition. Type 2 diabetes is usually diagnosed later in life, whereas the less common Type 1 diabetes is diagnosed early in life. People can live healthy and happy lives with diabetes, but early detection produces a better overall outcome for most patients’ health. Thus, to test the accurate prediction of Type 2 diabetes, we use patients’ information from an electronic health records company called Practice Fusion, which has about 10,000 patient records from 2009 to 2012. This data contains individual key biometrics, including age, diastolic and systolic blood pressure, gender, height, and weight. We apply this data to popular machine learning algorithms and, for each algorithm, evaluate the performance of every model based on classification accuracy, precision, sensitivity, specificity/recall, negative predictive value, and F1 score. In our study, we find that all algorithms other than Naive Bayes suffer from very low precision. Hence, we take a step further and incorporate all the algorithms into a weighted-average (soft voting) ensemble model, where each algorithm counts towards a majority vote on the decision of whether a patient has diabetes or not. The accuracy of the ensemble model on Practice Fusion is 85%; to the best of our knowledge, this ensemble approach is new in this space. We firmly believe that the weighted-average ensemble model not only performs well on overall metrics but also helps to recover wrong predictions and aids in the accurate prediction of Type 2 diabetes. Our novel model can be used as an alert for patients to seek medical evaluation in time. |
Tasks | |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.09356v1 |
https://arxiv.org/pdf/1910.09356v1.pdf | |
PWC | https://paperswithcode.com/paper/supervised-machine-learning-based-ensemble |
Repo | |
Framework | |
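The soft-voting ensemble the abstract describes reduces to a weighted average of per-model probabilities followed by a threshold. The model names, probabilities, and uniform weights below are illustrative placeholders, not the paper's fitted models or results.

```python
import numpy as np

# Hypothetical per-model probabilities of "has diabetes" for four patients.
probs = {
    "logistic_regression": np.array([0.90, 0.30, 0.60, 0.20]),
    "random_forest":       np.array([0.80, 0.40, 0.55, 0.10]),
    "naive_bayes":         np.array([0.95, 0.20, 0.40, 0.30]),
}
# Weights could reflect each model's validation precision; uniform here.
weights = {name: 1.0 / len(probs) for name in probs}

# Soft voting: weighted average of probabilities, then threshold at 0.5.
avg = sum(w * probs[name] for name, w in weights.items())
prediction = (avg >= 0.5).astype(int)
```

Averaging probabilities rather than hard votes lets a confident model (here, Naive Bayes with its better precision) pull borderline cases, which is the mechanism the authors credit for recovering wrong predictions.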
Deep Learned Path Planning via Randomized Reward-Linked-Goals and Potential Space Applications
Title | Deep Learned Path Planning via Randomized Reward-Linked-Goals and Potential Space Applications |
Authors | Tamir Blum, William Jones, Kazuya Yoshida |
Abstract | Space exploration missions have seen the use of increasingly sophisticated robotic systems with ever more autonomy. Deep learning promises to take this a step further, with applications to high-level tasks, like path planning, as well as low-level tasks, like motion control, which are critical components for mission efficiency and success. Using deep reinforcement end-to-end learning with randomized reward-function parameters during training, we teach a simulated 8-degree-of-freedom quadruped ant-like robot to travel anywhere within a perimeter, conducting path planning and motion control with a single neural network, without any system model or prior knowledge of the terrain or environment. Our approach also allows for user-specified waypoints, which could translate well to either fully autonomous or semi-autonomous/teleoperated space applications that encounter delay times. We trained the agent using randomly generated waypoints linked to the reward function and passed waypoint coordinates as inputs to the neural network. Such applications show promise for a variety of space exploration robots, including high-speed rovers for fast locomotion and legged cave robots for rough terrain. |
Tasks | |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06034v1 |
https://arxiv.org/pdf/1909.06034v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learned-path-planning-via-randomized |
Repo | |
Framework | |
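The "randomized reward-linked goals" idea can be sketched as a reward function parameterized by a freshly drawn waypoint each episode. The perimeter bounds and distance-based reward shape below are assumptions for illustration, not the authors' exact reward design.

```python
import numpy as np

def make_episode(rng):
    """Draw a random waypoint and build the reward function linked to it."""
    waypoint = rng.uniform(-5.0, 5.0, size=2)  # random goal inside the perimeter
    def reward(position):
        # Negative Euclidean distance to the waypoint; the waypoint
        # coordinates are also fed to the policy network as inputs.
        return -float(np.linalg.norm(position - waypoint))
    return waypoint, reward

rng = np.random.default_rng(0)
waypoint, reward = make_episode(rng)
r_at_goal = reward(waypoint)      # zero at the waypoint itself
r_far = reward(waypoint + 3.0)    # more negative farther away
```

Because the goal changes every episode and is visible to the network as an input, a single policy learns to reach arbitrary user-specified waypoints rather than one fixed target.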
Applying Deep Learning to Detect Traffic Accidents in Real Time Using Spatiotemporal Sequential Data
Title | Applying Deep Learning to Detect Traffic Accidents in Real Time Using Spatiotemporal Sequential Data |
Authors | Amir Bahador Parsa, Rishabh Singh Chauhan, Homa Taghipour, Sybil Derrible, Abolfazl Mohammadian |
Abstract | Accident detection is a vital part of traffic safety. Many road users suffer from traffic accidents, as well as from their consequences, such as delay, congestion, and air pollution. In this study, we utilize two advanced deep learning techniques, Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), to detect traffic accidents in Chicago. These two techniques are selected because they are known to perform well on sequential data (i.e., time series). The full dataset consists of 241 accident and 6,038 non-accident cases selected from the Chicago expressway, and it includes traffic spatiotemporal data, weather condition data, and congestion status data. Moreover, because the dataset is imbalanced (i.e., it contains many more non-accident cases than accident cases), the Synthetic Minority Over-sampling Technique (SMOTE) is employed. Overall, the two models perform significantly well, both with an Area Under the Curve (AUC) of 0.85. Nonetheless, the GRU model is observed to perform slightly better than the LSTM model with respect to detection rate. The performance of both models is similar in terms of false alarm rate. |
Tasks | Time Series |
Published | 2019-12-15 |
URL | https://arxiv.org/abs/1912.06991v2 |
https://arxiv.org/pdf/1912.06991v2.pdf | |
PWC | https://paperswithcode.com/paper/applying-deep-learning-to-detect-traffic |
Repo | |
Framework | |
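SMOTE, which the abstract uses to handle the 241-vs-6,038 class imbalance, synthesizes new minority samples by interpolating between existing ones. A minimal numpy sketch with toy data (real implementations such as the one in `imbalanced-learn` add refinements this sketch omits):

```python
import numpy as np

def smote(minority, n_new, k=3, rng=None):
    """Minimal SMOTE sketch: synthesize minority samples by interpolating
    between a random minority point and one of its k nearest neighbours."""
    if rng is None:
        rng = np.random.default_rng(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        x = minority[i]
        # indices of the k nearest minority neighbours of x (excluding x itself)
        d = np.linalg.norm(minority - x, axis=1)
        neighbours = np.argsort(d)[1:k + 1]
        nn = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation coefficient in [0, 1)
        synthetic.append(x + gap * (nn - x))
    return np.array(synthetic)

# 20 toy "accident" feature vectors as the minority class, oversampled by 60.
accidents = np.random.default_rng(1).normal(size=(20, 5))
new_samples = smote(accidents, n_new=60)
```

Because every synthetic point lies on a segment between two real minority points, the oversampled class stays inside the region the real accidents occupy instead of being duplicated verbatim.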
RILOD: Near Real-Time Incremental Learning for Object Detection at the Edge
Title | RILOD: Near Real-Time Incremental Learning for Object Detection at the Edge |
Authors | Dawei Li, Serafettin Tasci, Shalini Ghosh, Jingwen Zhu, Junting Zhang, Larry Heck |
Abstract | Object detection models shipped with camera-equipped edge devices cannot cover the objects of interest for every user. Therefore, incremental learning capability is a critical feature for a robust and personalized object detection system that many applications would rely on. In this paper, we present an efficient yet practical system, RILOD, to incrementally train an existing object detection model such that it can detect new object classes without losing its capability to detect old classes. The key component of RILOD is a novel incremental learning algorithm that trains end-to-end for one-stage deep object detection models using only training data of new object classes. Specifically, to avoid catastrophic forgetting, the algorithm distills three types of knowledge from the old model to mimic the old model’s behavior on object classification, bounding box regression and feature extraction. In addition, since training data for the new classes may not be available, a real-time dataset construction pipeline is designed to collect training images on-the-fly and automatically label the images with both category and bounding box annotations. We have implemented RILOD under both edge-cloud and edge-only setups. Experimental results show that the proposed system can learn to detect a new object class in just a few minutes, including both dataset construction and model training. In comparison, the traditional fine-tuning-based method may take a few hours for training, and in most cases would also need a tedious and costly manual dataset labeling step. |
Tasks | Object Classification, Object Detection |
Published | 2019-03-26 |
URL | https://arxiv.org/abs/1904.00781v2 |
https://arxiv.org/pdf/1904.00781v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-incremental-learning-for-mobile |
Repo | |
Framework | |
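The three-way distillation the abstract describes anchors the new model to the old one on classification, box regression, and features. A minimal sketch with random placeholder tensors; the L2 penalty and the shapes are illustrative assumptions, not RILOD's exact losses.

```python
import numpy as np

def l2(a, b):
    # Mean squared difference between old- and new-model outputs.
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)

# Outputs of the frozen old model and the new model on the same image;
# the shapes (anchors x classes, anchors x 4, feature dim) are illustrative.
old_cls,  new_cls  = rng.random((100, 20)), rng.random((100, 20))
old_box,  new_box  = rng.random((100, 4)),  rng.random((100, 4))
old_feat, new_feat = rng.random(64),        rng.random(64)

# Distill the old model's behaviour on classification, box regression,
# and feature extraction, the three knowledge types named in the abstract.
loss_distill = l2(old_cls, new_cls) + l2(old_box, new_box) + l2(old_feat, new_feat)

# The full objective would add the detection loss on new-class data, e.g.:
# loss_total = loss_new_classes + lambda_distill * loss_distill
```

Training only on new-class data would otherwise overwrite the old classes; the distillation terms penalize the new model whenever it drifts from the old model's responses, which is what counteracts catastrophic forgetting.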
Alignment Based Matching Networks for One-Shot Classification and Open-Set Recognition
Title | Alignment Based Matching Networks for One-Shot Classification and Open-Set Recognition |
Authors | Paresh Malalur, Tommi Jaakkola |
Abstract | Deep learning for object classification relies heavily on convolutional models. While effective, CNNs are rarely interpretable after the fact. An attention mechanism can be used to highlight the area of the image that the model focuses on, thus offering a narrow view into the mechanism of classification. We expand on this idea by forcing the method to explicitly align the images to be classified with reference images representing the classes. The mechanism of alignment is learned and therefore does not require that the reference objects be anything like those being classified. Beyond explanation, our exemplar-based cross-alignment method enables classification with only a single example per category (one-shot). Our model cuts the 5-way, 1-shot error rate on Omniglot from 2.1% to 1.4% and on MiniImageNet from 53.5% to 46.5%, while simultaneously providing point-wise alignment information that gives some understanding of what the network is capturing. This method of alignment also enables the recognition of an unsupported class (open-set) in the one-shot setting, maintaining an F1-score above 0.5 on Omniglot even with 19 other distracting classes, whereas baselines completely fail to separate the open-set class in the one-shot setting. |
Tasks | Object Classification, Omniglot, Open Set Learning |
Published | 2019-03-11 |
URL | http://arxiv.org/abs/1903.06538v1 |
http://arxiv.org/pdf/1903.06538v1.pdf | |
PWC | https://paperswithcode.com/paper/alignment-based-matching-networks-for-one |
Repo | |
Framework | |
The NIGENS General Sound Events Database
Title | The NIGENS General Sound Events Database |
Authors | Ivo Trowitzsch, Jalil Taghia, Youssef Kashef, Klaus Obermayer |
Abstract | Computational auditory scene analysis has been gaining interest in recent years. Trailing behind the more mature field of speech recognition, general sound event detection in particular is attracting increasing attention. Crucial for training and testing reasonable models is having enough suitable data available; until recently, general sound event databases were hardly to be found. We release and present a database with 714 wav files containing isolated high-quality sound events of 14 different types, plus 303 'general' wav files of anything else but these 14 types. All sound events are strongly labeled with perceptual on- and offset times, paying attention to omitting in-between silences. The amount of isolated sound events, the quality of annotations, and the particular general sound class distinguish NIGENS from other databases. |
Tasks | Sound Event Detection, Speech Recognition |
Published | 2019-02-21 |
URL | https://arxiv.org/abs/1902.08314v4 |
https://arxiv.org/pdf/1902.08314v4.pdf | |
PWC | https://paperswithcode.com/paper/the-nigens-general-sound-events-database |
Repo | |
Framework | |