Paper Group ANR 805
Triple Generative Adversarial Networks
Title | Triple Generative Adversarial Networks |
Authors | Chongxuan Li, Kun Xu, Jiashuo Liu, Jun Zhu, Bo Zhang |
Abstract | Generative adversarial networks (GANs) have shown promise in image generation and classification given limited supervision. Existing methods extend the unsupervised GAN framework to incorporate supervision heuristically. Specifically, a single discriminator plays the two incompatible roles of identifying fake samples and predicting labels, and it only estimates the data without considering the labels. The formulation intrinsically causes two problems: (1) the generator and the discriminator (i.e., the classifier) may not converge to the data distribution at the same time; and (2) the generator cannot control the semantics of the generated samples. In this paper, we present the triple generative adversarial network (Triple-GAN), which consists of three players: a generator, a classifier, and a discriminator. The generator and the classifier characterize the conditional distributions between images and labels, and the discriminator solely focuses on identifying fake image-label pairs. We design compatible objective functions to ensure that the distributions characterized by the generator and the classifier converge to the data distribution. We evaluate Triple-GAN in two challenging settings, namely, semi-supervised learning and the extreme low-data regime. In both settings, Triple-GAN can achieve state-of-the-art classification results among deep generative models and simultaneously generate meaningful samples in a specific class. |
Tasks | Image Generation |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.09784v1 |
https://arxiv.org/pdf/1912.09784v1.pdf | |
PWC | https://paperswithcode.com/paper/triple-generative-adversarial-networks |
Repo | |
Framework | |
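The three-player objective described in the abstract can be sketched as follows. This is a minimal toy sketch: the discriminator function, the mixing weight `alpha`, and the random "images" are hypothetical stand-ins, not the authors' architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(x, y):
    # Hypothetical stand-in: a score in (0, 1) for an (image, label) pair.
    return 1.0 / (1.0 + np.exp(-(x.mean() + y)))

# One real pair, one pair from the generator (image sampled given a label),
# and one pair from the classifier (label predicted for an unlabeled image).
x_real, y_real = rng.normal(size=8), 1
x_gen,  y_gen  = rng.normal(size=8), 0
x_cls,  y_cls  = rng.normal(size=8), 1

alpha = 0.5  # mixes the two fake sources in the discriminator objective

d_real = discriminator(x_real, y_real)
d_gen  = discriminator(x_gen,  y_gen)
d_cls  = discriminator(x_cls,  y_cls)

# The discriminator rewards real pairs and penalizes both fake sources;
# the generator and classifier are both trained against this single score,
# so the discriminator never has to double as a label predictor.
loss_D = -(np.log(d_real)
           + alpha * np.log(1.0 - d_cls)
           + (1.0 - alpha) * np.log(1.0 - d_gen))
```

The point of the three-player split is visible in the loss: identifying fakes and predicting labels are handled by different players, which is what the abstract argues a two-player GAN cannot do consistently.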
Convolutional neural network for detection and classification of seizures in clinical data
Title | Convolutional neural network for detection and classification of seizures in clinical data |
Authors | Tomas Iesmantas, Robertas Alzbutas |
Abstract | Epileptic seizure detection and classification in clinical electroencephalogram data remains a challenge: commercially available seizure detection tools, which are usually patient non-specific, achieve only low sensitivity with a high rate of false positives. Epilepsy patients suffer severe detrimental effects, such as physical injury or depression, due to unpredictable seizures. Even in hospitals, the high rate of false positives makes seizure alert systems of little help to patients, because seizure detection tools are mostly trained on unrealistically clean data, containing little noise and obtained under controlled laboratory conditions, where patient groups are homogeneous, e.g. in terms of age or type of seizures. In this study, the authors present an approach for seizure detection and classification using clinical electroencephalogram data and a convolutional neural network trained on features of brain synchronisation and power spectrum. Various deep learning methods were applied, and the network was trained on a very heterogeneous clinical electroencephalogram dataset. In total, eight different types of seizures were considered, and the patients were of various ages and health conditions and were observed under clinical conditions. Despite this, the classifier presented in this paper achieved sensitivity and specificity of 0.68 and 0.67, respectively, which is a significant improvement over known results for clinical data. |
Tasks | Seizure Detection |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.08864v1 |
http://arxiv.org/pdf/1903.08864v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-network-for-detection |
Repo | |
Framework | |
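The feature extraction the abstract alludes to (power-spectrum and synchronisation features) might look roughly like this. The band definitions, sampling rate, and correlation-based synchronisation measure below are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np

def band_powers(signal, fs, bands=((1, 4), (4, 8), (8, 13), (13, 30))):
    """Mean spectral power in the classic EEG bands (delta/theta/alpha/beta)."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    return [psd[(freqs >= lo) & (freqs < hi)].mean() for lo, hi in bands]

def synchronisation(a, b):
    """Zero-lag correlation as a simple stand-in for a synchronisation measure."""
    return float(np.corrcoef(a, b)[0, 1])

fs = 256  # Hz, a common clinical EEG sampling rate
t = np.arange(fs * 4) / fs
ch1 = np.sin(2 * np.pi * 10 * t)         # 10 Hz alpha-band oscillation
ch2 = np.sin(2 * np.pi * 10 * t + 0.3)   # phase-shifted copy of channel 1

# Per-channel band powers plus a cross-channel synchronisation feature,
# the kind of input vector a CNN could be trained on.
features = band_powers(ch1, fs) + [synchronisation(ch1, ch2)]
```

For the synthetic 10 Hz signal, the alpha-band entry dominates the band powers, and the two phase-shifted channels show high synchronisation.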
Convolutional Neural Networks for Automatic Meter Reading
Title | Convolutional Neural Networks for Automatic Meter Reading |
Authors | Rayson Laroca, Victor Barroso, Matheus A. Diniz, Gabriel R. Gonçalves, William Robson Schwartz, David Menotti |
Abstract | In this paper, we tackle Automatic Meter Reading (AMR) by leveraging the high capability of Convolutional Neural Networks (CNNs). We design a two-stage approach that employs the Fast-YOLO object detector for counter detection and evaluates three different CNN-based approaches for counter recognition. In the AMR literature, most datasets are not available to the research community since the images belong to a service company. In this sense, we introduce a new public dataset, called UFPR-AMR dataset, with 2,000 fully and manually annotated images. This dataset is, to the best of our knowledge, three times larger than the largest public dataset found in the literature and contains a well-defined evaluation protocol to assist the development and evaluation of AMR methods. Furthermore, we propose the use of a data augmentation technique to generate a balanced training set with many more examples to train the CNN models for counter recognition. In the proposed dataset, impressive results were obtained and a detailed speed/accuracy trade-off evaluation of each model was performed. In a public dataset, state-of-the-art results were achieved using less than 200 images for training. |
Tasks | Data Augmentation |
Published | 2019-02-25 |
URL | http://arxiv.org/abs/1902.09600v1 |
http://arxiv.org/pdf/1902.09600v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-for-automatic |
Repo | |
Framework | |
The Materials Science Procedural Text Corpus: Annotating Materials Synthesis Procedures with Shallow Semantic Structures
Title | The Materials Science Procedural Text Corpus: Annotating Materials Synthesis Procedures with Shallow Semantic Structures |
Authors | Sheshera Mysore, Zach Jensen, Edward Kim, Kevin Huang, Haw-Shiuan Chang, Emma Strubell, Jeffrey Flanigan, Andrew McCallum, Elsa Olivetti |
Abstract | Materials science literature contains millions of materials synthesis procedures described in unstructured natural language text. Large-scale analysis of these synthesis procedures would facilitate deeper scientific understanding of materials synthesis and enable automated synthesis planning. Such analysis requires extracting structured representations of synthesis procedures from the raw text as a first step. To facilitate the training and evaluation of synthesis extraction models, we introduce a dataset of 230 synthesis procedures annotated by domain experts with labeled graphs that express the semantics of the synthesis sentences. The nodes in this graph are synthesis operations and their typed arguments, and labeled edges specify relations between the nodes. We describe this new resource in detail and highlight some specific challenges to annotating scientific text with shallow semantic structure. We make the corpus available to the community to promote further research and development of scientific information extraction systems. |
Tasks | |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.06939v2 |
https://arxiv.org/pdf/1905.06939v2.pdf | |
PWC | https://paperswithcode.com/paper/the-materials-science-procedural-text-corpus |
Repo | |
Framework | |
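A minimal data structure for the labeled graphs described in the abstract (operation nodes, typed arguments, labeled edges) might look like this. The node and edge type names are illustrative, not necessarily the corpus's exact annotation schema.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str
    node_type: str   # e.g. "Operation", "Material", "Condition"

@dataclass
class SynthesisGraph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (src_idx, label, dst_idx)

    def add_node(self, text, node_type):
        self.nodes.append(Node(text, node_type))
        return len(self.nodes) - 1

    def add_edge(self, src, label, dst):
        self.edges.append((src, label, dst))

# "Heat the TiO2 powder at 500 C" as one operation with typed arguments.
g = SynthesisGraph()
op = g.add_node("Heat", "Operation")
mat = g.add_node("TiO2 powder", "Material")
cond = g.add_node("500 C", "Condition")
g.add_edge(op, "Recipe_Material", mat)
g.add_edge(op, "Condition_Of", cond)
```

Each sentence of a procedure becomes a small graph of this shape, which is what makes large-scale analysis and synthesis planning over raw text tractable.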
Towards Inconsistency Measurement in Business Rule Bases
Title | Towards Inconsistency Measurement in Business Rule Bases |
Authors | Carl Corea, Matthias Thimm |
Abstract | We investigate the application of inconsistency measures to the problem of analysing business rule bases. Due to some intricacies of the domain of business rule bases, a straightforward application is not feasible. We therefore develop some new rationality postulates for this setting as well as adapt and modify existing inconsistency measures. We further adapt the notion of inconsistency values (or culpability measures) for this setting and give a comprehensive feasibility study. |
Tasks | |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08872v1 |
https://arxiv.org/pdf/1911.08872v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-inconsistency-measurement-in-business |
Repo | |
Framework | |
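One classical family of measures the paper adapts counts minimal inconsistent subsets of a rule base. A brute-force sketch over a toy propositional base (not the authors' business-rule formalism, where such a direct application is exactly what they argue is infeasible):

```python
from itertools import combinations, product

# Toy rule base over atoms a, b: {a, a -> b, !b} is jointly inconsistent,
# while every proper subset of it is satisfiable.
formulas = {
    "a":    lambda v: v["a"],
    "a->b": lambda v: (not v["a"]) or v["b"],
    "!b":   lambda v: not v["b"],
}

def satisfiable(subset):
    # Brute-force truth-table check over the two atoms.
    return any(all(formulas[f]({"a": a, "b": b}) for f in subset)
               for a, b in product([True, False], repeat=2))

def minimal_inconsistent_subsets(names):
    mis = []
    for r in range(1, len(names) + 1):
        for sub in combinations(names, r):
            if not satisfiable(sub) and not any(set(m) <= set(sub) for m in mis):
                mis.append(sub)
    return mis

# The classical MI measure: the number of minimal inconsistent subsets.
mis = minimal_inconsistent_subsets(list(formulas))
```

Here the whole base is the single minimal inconsistent subset, so the MI measure equals 1; culpability measures then distribute that "blame" over the individual rules.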
The Trumpiest Trump? Identifying a Subject’s Most Characteristic Tweets
Title | The Trumpiest Trump? Identifying a Subject’s Most Characteristic Tweets |
Authors | Charuta Pethe, Steven Skiena |
Abstract | The sequence of documents produced by any given author varies in style and content, but some documents are more typical or representative of the source than others. We quantify the extent to which a given short text is characteristic of a specific person, using a dataset of tweets from fifteen celebrities. Such analysis is useful for generating excerpts of high-volume Twitter profiles, and understanding how representativeness relates to tweet popularity. We first consider the related task of binary author detection (is x the author of text T?), and report a test accuracy of 90.37% for the best of five approaches to this problem. We then use these models to compute characterization scores among all of an author’s texts. A user study shows human evaluators agree with our characterization model for all 15 celebrities in our dataset, each with p-value < 0.05. We use these classifiers to show surprisingly strong correlations between characterization scores and the popularity of the associated texts. Indeed, we demonstrate a statistically significant correlation between this score and tweet popularity (likes/replies/retweets) for 13 of the 15 celebrities in our study. |
Tasks | |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.04002v1 |
https://arxiv.org/pdf/1909.04002v1.pdf | |
PWC | https://paperswithcode.com/paper/the-trumpiest-trump-identifying-a-subjects |
Repo | |
Framework | |
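The characterization idea (score each text by how strongly an authorship model attributes it to the subject, then rank) can be sketched with a toy scorer. The word-overlap scorer and the example vocabulary below are hypothetical stand-ins for the paper's trained classifiers.

```python
import numpy as np

def characterization_score(text, signature_words):
    """Toy stand-in for an authorship classifier's probability: the
    fraction of tokens that are characteristic of the subject."""
    tokens = text.lower().split()
    return sum(t in signature_words for t in tokens) / len(tokens)

signature_words = {"huge", "tremendous", "winning", "great"}
tweets = [
    "a tremendous winning huge great day",
    "meeting at noon about the quarterly report",
]
scores = [characterization_score(t, signature_words) for t in tweets]
most_characteristic = tweets[int(np.argmax(scores))]
```

In the paper the scorer is a binary author-detection model (is x the author of text T?), and the same scores are then correlated with likes, replies, and retweets.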
Dynamical Component Analysis (DyCA) and its application on epileptic EEG
Title | Dynamical Component Analysis (DyCA) and its application on epileptic EEG |
Authors | Katharina Korn, Bastian Seifert, Christian Uhl |
Abstract | Dynamical Component Analysis (DyCA) is a recently proposed method to detect projection vectors that reduce the dimensionality of multivariate deterministic datasets. It is based on the solution of a generalized eigenvalue problem and is therefore straightforward to implement. DyCA is introduced and applied to EEG data of epileptic seizures. The obtained eigenvectors are used to project the signal, and the corresponding trajectories in phase space are compared with PCA and ICA projections. The eigenvalues of DyCA are utilized for seizure detection, and the obtained results in terms of specificity, false discovery rate and miss rate are compared to other seizure detection algorithms. |
Tasks | EEG, Seizure Detection |
Published | 2019-02-05 |
URL | http://arxiv.org/abs/1902.01777v1 |
http://arxiv.org/pdf/1902.01777v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamical-component-analysis-dyca-and-its |
Repo | |
Framework | |
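The generalized eigenvalue problem at the heart of DyCA can be sketched with numpy: it is built from correlation matrices of the signal and its time derivative. The matrix notation below follows the usual DyCA formulation but should be treated as illustrative; the toy data replaces real EEG.

```python
import numpy as np

rng = np.random.default_rng(1)

# Multivariate signal X (time x channels) and its time derivative dX:
# two deterministic harmonic channels plus one small noise channel.
t = np.linspace(0, 20, 2000)
X = np.column_stack([np.sin(t), np.cos(t), 0.1 * rng.normal(size=t.size)])
dX = np.gradient(X, t, axis=0)

C0 = X.T @ X / len(t)      # signal autocorrelation
C1 = dX.T @ X / len(t)     # derivative-signal correlation
C2 = dX.T @ dX / len(t)    # derivative autocorrelation

# Generalized eigenvalue problem  C1 C0^{-1} C1^T v = lambda C2 v,
# solved here by reduction to an ordinary eigenproblem.
A = C1 @ np.linalg.inv(C0) @ C1.T
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(C2) @ A)
order = np.argsort(-eigvals.real)
projection = eigvecs[:, order[:2]].real  # top-2 projection vectors
```

Eigenvalues close to 1 flag directions whose dynamics are well captured by a linear differential equation; here the two harmonic channels produce such eigenvalues, while the noise channel does not, which is the property the paper exploits for seizure detection.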
Embodied Visual Recognition
Title | Embodied Visual Recognition |
Authors | Jianwei Yang, Zhile Ren, Mingze Xu, Xinlei Chen, David Crandall, Devi Parikh, Dhruv Batra |
Abstract | Passive visual systems typically fail to recognize objects in the amodal setting where they are heavily occluded. In contrast, humans and other embodied agents have the ability to move in the environment, and actively control the viewing angle to better understand object shapes and semantics. In this work, we introduce the task of Embodied Visual Recognition (EVR): An agent is instantiated in a 3D environment close to an occluded target object, and is free to move in the environment to perform object classification, amodal object localization, and amodal object segmentation. To address this, we develop a new model called Embodied Mask R-CNN, for agents to learn to move strategically to improve their visual recognition abilities. We conduct experiments using the House3D environment. Experimental results show that: 1) agents with embodiment (movement) achieve better visual recognition performance than passive ones; 2) in order to improve visual recognition abilities, agents can learn strategical moving paths that are different from shortest paths. |
Tasks | Object Classification, Object Localization, Semantic Segmentation |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04404v1 |
http://arxiv.org/pdf/1904.04404v1.pdf | |
PWC | https://paperswithcode.com/paper/embodied-visual-recognition |
Repo | |
Framework | |
Joint inference on structural and diffusion MRI for sequence-adaptive Bayesian segmentation of thalamic nuclei with probabilistic atlases
Title | Joint inference on structural and diffusion MRI for sequence-adaptive Bayesian segmentation of thalamic nuclei with probabilistic atlases |
Authors | Juan Eugenio Iglesias, Koen Van Leemput, Polina Golland, Anastasia Yendiki |
Abstract | Segmentation of structural and diffusion MRI (sMRI/dMRI) is usually performed independently in neuroimaging pipelines. However, some brain structures (e.g., globus pallidus, thalamus and its nuclei) can be extracted more accurately by fusing the two modalities. Following the framework of Bayesian segmentation with probabilistic atlases and unsupervised appearance modeling, we present here a novel algorithm to jointly segment multi-modal sMRI/dMRI data. We propose a hierarchical likelihood term for the dMRI defined on the unit ball, which combines the Beta and Dimroth-Scheidegger-Watson distributions to model the data at each voxel. This term is integrated with a mixture of Gaussians for the sMRI data, such that the resulting joint unsupervised likelihood enables the analysis of multi-modal scans acquired with any type of MRI contrast, b-values, or number of directions, which enables wide applicability. We also propose an inference algorithm to estimate the maximum-a-posteriori model parameters from input images, and to compute the most likely segmentation. Using a recently published atlas derived from histology, we apply our method to thalamic nuclei segmentation on two datasets: HCP (state of the art) and ADNI (legacy) - producing lower sample sizes than Bayesian segmentation with sMRI alone. |
Tasks | |
Published | 2019-03-11 |
URL | http://arxiv.org/abs/1903.04352v1 |
http://arxiv.org/pdf/1903.04352v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-inference-on-structural-and-diffusion |
Repo | |
Framework | |
Supervised Machine Learning based Ensemble Model for Accurate Prediction of Type 2 Diabetes
Title | Supervised Machine Learning based Ensemble Model for Accurate Prediction of Type 2 Diabetes |
Authors | Ramya Akula, Ni Nguyen, Ivan Garibay |
Abstract | According to the American Diabetes Association (ADA), 30.3 million people in the United States have diabetes, and 7.2 million of them may be undiagnosed and unaware of their condition. Type 2 diabetes is usually diagnosed later in life, whereas the less common Type 1 diabetes is diagnosed early in life. People can live healthy and happy lives with diabetes, but early detection produces a better overall outcome for most patients’ health. Thus, to test the accurate prediction of Type 2 diabetes, we use patients’ information from an electronic health records company called Practice Fusion, which has about 10,000 patient records from 2009 to 2012. This data contains individual key biometrics, including age, diastolic and systolic blood pressure, gender, height, and weight. We apply this data to popular machine learning algorithms and, for each algorithm, evaluate the performance of every model based on classification accuracy, precision, sensitivity, specificity/recall, negative predictive value, and F1 score. In our study, we find that all algorithms other than Naive Bayes suffer from very low precision. Hence, we take a step further and incorporate all the algorithms into a weighted-average (soft voting) ensemble model, where each algorithm counts towards a majority vote on the decision of whether a patient has diabetes or not. The accuracy of the ensemble model on Practice Fusion is 85%; to the best of our knowledge, this ensemble approach is new in this space. We firmly believe that the weighted-average ensemble model not only performs well on overall metrics but also helps to recover wrong predictions and aids in the accurate prediction of Type 2 diabetes. Our novel model can be used as an alert for patients to seek medical evaluation in time. |
Tasks | |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.09356v1 |
https://arxiv.org/pdf/1910.09356v1.pdf | |
PWC | https://paperswithcode.com/paper/supervised-machine-learning-based-ensemble |
Repo | |
Framework | |
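The soft-voting ensemble the abstract describes reduces to a weighted average of per-model probabilities followed by a threshold. The model names, probabilities, and uniform weights below are illustrative placeholders, not the paper's fitted models or results.

```python
import numpy as np

# Hypothetical per-model probabilities of "has diabetes" for four patients.
probs = {
    "logistic_regression": np.array([0.90, 0.30, 0.60, 0.20]),
    "random_forest":       np.array([0.80, 0.40, 0.55, 0.10]),
    "naive_bayes":         np.array([0.95, 0.20, 0.40, 0.30]),
}
# Weights could reflect each model's validation precision; uniform here.
weights = {name: 1.0 / len(probs) for name in probs}

# Soft voting: weighted average of probabilities, then threshold at 0.5.
avg = sum(w * probs[name] for name, w in weights.items())
prediction = (avg >= 0.5).astype(int)
```

Averaging probabilities rather than hard votes lets a confident model (here, Naive Bayes with its better precision) pull borderline cases, which is the mechanism the authors credit for recovering wrong predictions.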
Deep Learned Path Planning via Randomized Reward-Linked-Goals and Potential Space Applications
Title | Deep Learned Path Planning via Randomized Reward-Linked-Goals and Potential Space Applications |
Authors | Tamir Blum, William Jones, Kazuya Yoshida |
Abstract | Space exploration missions have seen the use of increasingly sophisticated robotic systems with ever more autonomy. Deep learning promises to take this a step further, with applications to high-level tasks, like path planning, as well as low-level tasks, like motion control, which are critical components for mission efficiency and success. Using deep reinforcement end-to-end learning with randomized reward-function parameters during training, we teach a simulated 8-degree-of-freedom quadruped ant-like robot to travel anywhere within a perimeter, conducting path planning and motion control with a single neural network, without any system model or prior knowledge of the terrain or environment. Our approach also allows for user-specified waypoints, which could translate well to either fully autonomous or semi-autonomous/teleoperated space applications that encounter delay times. We trained the agent using randomly generated waypoints linked to the reward function and passed waypoint coordinates as inputs to the neural network. Such applications show promise for a variety of space exploration robots, including high-speed rovers for fast locomotion and legged cave robots for rough terrain. |
Tasks | |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06034v1 |
https://arxiv.org/pdf/1909.06034v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learned-path-planning-via-randomized |
Repo | |
Framework | |
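The "randomized reward-linked goals" idea can be sketched as a reward function parameterized by a freshly drawn waypoint each episode. The perimeter bounds and distance-based reward shape below are assumptions for illustration, not the authors' exact reward design.

```python
import numpy as np

def make_episode(rng):
    """Draw a random waypoint and build the reward function linked to it."""
    waypoint = rng.uniform(-5.0, 5.0, size=2)  # random goal inside the perimeter
    def reward(position):
        # Negative Euclidean distance to the waypoint; the waypoint
        # coordinates are also fed to the policy network as inputs.
        return -float(np.linalg.norm(position - waypoint))
    return waypoint, reward

rng = np.random.default_rng(0)
waypoint, reward = make_episode(rng)
r_at_goal = reward(waypoint)      # zero at the waypoint itself
r_far = reward(waypoint + 3.0)    # more negative farther away
```

Because the goal changes every episode and is visible to the network as an input, a single policy learns to reach arbitrary user-specified waypoints rather than one fixed target.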
Applying Deep Learning to Detect Traffic Accidents in Real Time Using Spatiotemporal Sequential Data
Title | Applying Deep Learning to Detect Traffic Accidents in Real Time Using Spatiotemporal Sequential Data |
Authors | Amir Bahador Parsa, Rishabh Singh Chauhan, Homa Taghipour, Sybil Derrible, Abolfazl Mohammadian |
Abstract | Accident detection is a vital part of traffic safety. Many road users suffer from traffic accidents, as well as from their consequences, such as delay, congestion, and air pollution. In this study, we utilize two advanced deep learning techniques, Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), to detect traffic accidents in Chicago. These two techniques are selected because they are known to perform well on sequential data (i.e., time series). The full dataset consists of 241 accident and 6,038 non-accident cases selected from the Chicago expressway, and it includes traffic spatiotemporal data, weather condition data, and congestion status data. Moreover, because the dataset is imbalanced (i.e., it contains many more non-accident cases than accident cases), the Synthetic Minority Over-sampling Technique (SMOTE) is employed. Overall, the two models perform significantly well, both with an Area Under the Curve (AUC) of 0.85. Nonetheless, the GRU model is observed to perform slightly better than the LSTM model with respect to detection rate. The performance of both models is similar in terms of false alarm rate. |
Tasks | Time Series |
Published | 2019-12-15 |
URL | https://arxiv.org/abs/1912.06991v2 |
https://arxiv.org/pdf/1912.06991v2.pdf | |
PWC | https://paperswithcode.com/paper/applying-deep-learning-to-detect-traffic |
Repo | |
Framework | |
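SMOTE, which the abstract uses to handle the 241-vs-6,038 class imbalance, synthesizes new minority samples by interpolating between existing ones. A minimal numpy sketch with toy data (real implementations such as the one in `imbalanced-learn` add refinements this sketch omits):

```python
import numpy as np

def smote(minority, n_new, k=3, rng=None):
    """Minimal SMOTE sketch: synthesize minority samples by interpolating
    between a random minority point and one of its k nearest neighbours."""
    if rng is None:
        rng = np.random.default_rng(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        x = minority[i]
        # indices of the k nearest minority neighbours of x (excluding x itself)
        d = np.linalg.norm(minority - x, axis=1)
        neighbours = np.argsort(d)[1:k + 1]
        nn = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation coefficient in [0, 1)
        synthetic.append(x + gap * (nn - x))
    return np.array(synthetic)

# 20 toy "accident" feature vectors as the minority class, oversampled by 60.
accidents = np.random.default_rng(1).normal(size=(20, 5))
new_samples = smote(accidents, n_new=60)
```

Because every synthetic point lies on a segment between two real minority points, the oversampled class stays inside the region the real accidents occupy instead of being duplicated verbatim.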
RILOD: Near Real-Time Incremental Learning for Object Detection at the Edge
Title | RILOD: Near Real-Time Incremental Learning for Object Detection at the Edge |
Authors | Dawei Li, Serafettin Tasci, Shalini Ghosh, Jingwen Zhu, Junting Zhang, Larry Heck |
Abstract | Object detection models shipped with camera-equipped edge devices cannot cover the objects of interest for every user. Therefore, incremental learning capability is a critical feature for a robust and personalized object detection system that many applications would rely on. In this paper, we present an efficient yet practical system, RILOD, to incrementally train an existing object detection model such that it can detect new object classes without losing its capability to detect old classes. The key component of RILOD is a novel incremental learning algorithm that trains end-to-end for one-stage deep object detection models using only training data of new object classes. Specifically, to avoid catastrophic forgetting, the algorithm distills three types of knowledge from the old model to mimic the old model’s behavior on object classification, bounding box regression and feature extraction. In addition, since training data for the new classes may not be available, a real-time dataset construction pipeline is designed to collect training images on-the-fly and automatically label the images with both category and bounding box annotations. We have implemented RILOD under both edge-cloud and edge-only setups. Experimental results show that the proposed system can learn to detect a new object class in just a few minutes, including both dataset construction and model training. In comparison, the traditional fine-tuning-based method may take a few hours for training, and in most cases would also need a tedious and costly manual dataset labeling step. |
Tasks | Object Classification, Object Detection |
Published | 2019-03-26 |
URL | https://arxiv.org/abs/1904.00781v2 |
https://arxiv.org/pdf/1904.00781v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-incremental-learning-for-mobile |
Repo | |
Framework | |
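The three-way distillation the abstract describes anchors the new model to the old one on classification, box regression, and features. A minimal sketch with random placeholder tensors; the L2 penalty and the shapes are illustrative assumptions, not RILOD's exact losses.

```python
import numpy as np

def l2(a, b):
    # Mean squared difference between old- and new-model outputs.
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)

# Outputs of the frozen old model and the new model on the same image;
# the shapes (anchors x classes, anchors x 4, feature dim) are illustrative.
old_cls,  new_cls  = rng.random((100, 20)), rng.random((100, 20))
old_box,  new_box  = rng.random((100, 4)),  rng.random((100, 4))
old_feat, new_feat = rng.random(64),        rng.random(64)

# Distill the old model's behaviour on classification, box regression,
# and feature extraction, the three knowledge types named in the abstract.
loss_distill = l2(old_cls, new_cls) + l2(old_box, new_box) + l2(old_feat, new_feat)

# The full objective would add the detection loss on new-class data, e.g.:
# loss_total = loss_new_classes + lambda_distill * loss_distill
```

Training only on new-class data would otherwise overwrite the old classes; the distillation terms penalize the new model whenever it drifts from the old model's responses, which is what counteracts catastrophic forgetting.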
Alignment Based Matching Networks for One-Shot Classification and Open-Set Recognition
Title | Alignment Based Matching Networks for One-Shot Classification and Open-Set Recognition |
Authors | Paresh Malalur, Tommi Jaakkola |
Abstract | Deep learning for object classification relies heavily on convolutional models. While effective, CNNs are rarely interpretable after the fact. An attention mechanism can be used to highlight the area of the image that the model focuses on, thus offering a narrow view into the mechanism of classification. We expand on this idea by forcing the method to explicitly align the images to be classified with reference images representing the classes. The mechanism of alignment is learned and therefore does not require that the reference objects be anything like those being classified. Beyond explanation, our exemplar-based cross-alignment method enables classification with only a single example per category (one-shot). Our model cuts the 5-way, 1-shot error rate on Omniglot from 2.1% to 1.4% and on MiniImageNet from 53.5% to 46.5%, while simultaneously providing point-wise alignment information that gives some understanding of what the network is capturing. This method of alignment also enables the recognition of an unsupported class (open-set) in the one-shot setting, maintaining an F1-score above 0.5 on Omniglot even with 19 other distracting classes, whereas baselines completely fail to separate the open-set class in the one-shot setting. |
Tasks | Object Classification, Omniglot, Open Set Learning |
Published | 2019-03-11 |
URL | http://arxiv.org/abs/1903.06538v1 |
http://arxiv.org/pdf/1903.06538v1.pdf | |
PWC | https://paperswithcode.com/paper/alignment-based-matching-networks-for-one |
Repo | |
Framework | |
The NIGENS General Sound Events Database
Title | The NIGENS General Sound Events Database |
Authors | Ivo Trowitzsch, Jalil Taghia, Youssef Kashef, Klaus Obermayer |
Abstract | Computational auditory scene analysis has been gaining interest in recent years. Trailing behind the more mature field of speech recognition, general sound event detection in particular is attracting increasing attention. Crucial for training and testing reasonable models is having enough suitable data available; until recently, general sound event databases were hardly to be found. We release and present a database with 714 wav files containing isolated high-quality sound events of 14 different types, plus 303 'general' wav files of anything else but these 14 types. All sound events are strongly labeled with perceptual on- and offset times, paying attention to omitting in-between silences. The amount of isolated sound events, the quality of annotations, and the particular general sound class distinguish NIGENS from other databases. |
Tasks | Sound Event Detection, Speech Recognition |
Published | 2019-02-21 |
URL | https://arxiv.org/abs/1902.08314v4 |
https://arxiv.org/pdf/1902.08314v4.pdf | |
PWC | https://paperswithcode.com/paper/the-nigens-general-sound-events-database |
Repo | |
Framework | |