Paper Group ANR 125
Extreme Few-view CT Reconstruction using Deep Inference. A Deep Learning System That Generates Quantitative CT Reports for Diagnosing Pulmonary Tuberculosis. Integrating Learning and Reasoning with Deep Logic Models. Transmitter Classification With Supervised Deep Learning. SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentu …
Extreme Few-view CT Reconstruction using Deep Inference
Title | Extreme Few-view CT Reconstruction using Deep Inference |
Authors | Hyojin Kim, Rushil Anirudh, K. Aditya Mohan, Kyle Champley |
Abstract | Reconstruction of few-view x-ray Computed Tomography (CT) data is a highly ill-posed problem. Few-view CT is often used in applications that require a low radiation dose in clinical CT, rapid industrial scanning, or fixed-gantry CT. Existing analytic or iterative algorithms generally produce poorly reconstructed images, severely deteriorated by artifacts and noise, especially when the number of x-ray projections is very low. This paper presents a deep network-driven approach to address extreme few-view CT by incorporating convolutional neural network-based inference into state-of-the-art iterative reconstruction. The proposed method interprets few-view sinogram data using attention-based deep networks to infer the reconstructed image. The predicted image is then used as prior knowledge in the iterative algorithm for final reconstruction. We demonstrate the effectiveness of the proposed approach by performing reconstruction experiments on a chest CT dataset. |
Tasks | Computed Tomography (CT) |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05375v1 |
https://arxiv.org/pdf/1910.05375v1.pdf | |
PWC | https://paperswithcode.com/paper/extreme-few-view-ct-reconstruction-using-deep |
Repo | |
Framework | |
A Deep Learning System That Generates Quantitative CT Reports for Diagnosing Pulmonary Tuberculosis
Title | A Deep Learning System That Generates Quantitative CT Reports for Diagnosing Pulmonary Tuberculosis |
Authors | Wei Wu, Xukun Li, Peng Du, Guanjing Lang, Min Xu, Kaijin Xu, Lanjuan Li |
Abstract | We developed a deep learning model-based system to automatically generate a quantitative Computed Tomography (CT) diagnostic report for Pulmonary Tuberculosis (PTB) cases. 501 CT imaging datasets from 223 patients with active PTB were collected, and another 501 cases from a healthy population served as negative samples. 2884 lesions of PTB were carefully labeled and classified manually by professional radiologists. Three state-of-the-art 3D convolutional neural network (CNN) models were trained and evaluated in the inspection of PTB CT images. Transfer learning was also used during this process. The best model was selected to annotate the spatial location of lesions and simultaneously classify them into miliary, infiltrative, caseous, tuberculoma and cavitary types. Then the Noisy-Or Bayesian function was used to generate an overall infection probability. Finally, a quantitative diagnostic report was exported. The results showed that the recall and precision rates, from the perspective of a single lesion region of PTB, were 85.9% and 89.2%, respectively. The overall recall and precision rates, from the perspective of one PTB case, were 98.7% and 93.7%, respectively. Moreover, the precision rate of the PTB lesion type classification was 90.9%. The new method might serve as an effective reference for decision making by clinical doctors. |
Tasks | Computed Tomography (CT), Decision Making, Transfer Learning |
Published | 2019-10-05 |
URL | https://arxiv.org/abs/1910.02285v1 |
https://arxiv.org/pdf/1910.02285v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-learning-system-that-generates |
Repo | |
Framework | |
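The Noisy-Or step named in the abstract above can be illustrated with a minimal sketch. It assumes each detected lesion carries an independent probability of being a true PTB lesion; the function name `noisy_or` and the example probabilities are hypothetical and are not taken from the paper.

```python
from math import prod


def noisy_or(lesion_probs):
    """Probability that at least one lesion is real, assuming independence.

    lesion_probs: iterable of per-lesion probabilities in [0, 1]
    (hypothetical inputs; the paper's actual lesion scores are not shown here).
    """
    # The case is negative only if every lesion is a false alarm.
    return 1.0 - prod(1.0 - p for p in lesion_probs)


# Three candidate lesions with made-up confidence scores.
case_probability = noisy_or([0.30, 0.55, 0.10])
print(f"{case_probability:.4f}")  # → 0.7165
```

Under this rule a single high-confidence lesion is enough to push the case-level probability close to 1, while a scan with no detected lesions gets probability 0.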
Integrating Learning and Reasoning with Deep Logic Models
Title | Integrating Learning and Reasoning with Deep Logic Models |
Authors | Giuseppe Marra, Francesco Giannini, Michelangelo Diligenti, Marco Gori |
Abstract | Deep learning is very effective at jointly learning feature representations and classification models, especially when dealing with high dimensional input patterns. Probabilistic logic reasoning, on the other hand, is capable of making consistent and robust decisions in complex environments. The integration of deep learning and logic reasoning is still an open research problem and is considered to be the key to the development of real intelligent agents. This paper presents Deep Logic Models, which are deep graphical models integrating deep learning and logic reasoning both for learning and inference. Deep Logic Models create an end-to-end differentiable architecture, where deep learners are embedded into a network implementing a continuous relaxation of the logic knowledge. The learning process jointly learns the weights of the deep learners and the meta-parameters controlling the high-level reasoning. The experimental results show that the proposed methodology overcomes the limitations of the other approaches that have been proposed to bridge deep learning and reasoning. |
Tasks | |
Published | 2019-01-14 |
URL | http://arxiv.org/abs/1901.04195v1 |
http://arxiv.org/pdf/1901.04195v1.pdf | |
PWC | https://paperswithcode.com/paper/integrating-learning-and-reasoning-with-deep |
Repo | |
Framework | |
Transmitter Classification With Supervised Deep Learning
Title | Transmitter Classification With Supervised Deep Learning |
Authors | Cyrille Morin, Leonardo Cardoso, Jakob Hoydis, Jean-Marie Gorce, Thibaud Vial |
Abstract | Hardware imperfections in RF transmitters introduce features that can be used to identify a specific transmitter amongst others. Supervised deep learning has shown good performance in this task, but with datasets that are not applicable to real-world situations where topologies evolve over time. To remedy this, this work rests on a series of datasets gathered in the Future Internet of Things / Cognitive Radio Testbed [4] (FIT/CorteXlab) to train a convolutional neural network (CNN), with a focus on reducing the channel bias that has plagued previous works and constrained them to a constant environment or to simulations. The most challenging scenarios provide the trained neural network with resilience and give insight into the best signal type to use for identification, namely the packet preamble. The generated datasets are published on the Machine Learning For Communications Emerging Technologies Initiatives web site in the hope that they serve as stepping stones for future progress in the area. The community is also invited to reproduce the studied scenarios and results by generating new datasets in FIT/CorteXlab. |
Tasks | |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.07923v1 |
https://arxiv.org/pdf/1905.07923v1.pdf | |
PWC | https://paperswithcode.com/paper/transmitter-classification-with-supervised |
Repo | |
Framework | |
SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum
Title | SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum |
Authors | Jianyu Wang, Vinayak Tantia, Nicolas Ballas, Michael Rabbat |
Abstract | Distributed optimization is essential for training large models on large datasets. Multiple approaches have been proposed to reduce the communication overhead in distributed training, such as synchronizing only after performing multiple local SGD steps, and decentralized methods (e.g., using gossip algorithms) to decouple communications among workers. Although these methods run faster than AllReduce-based methods, which use blocking communication before every update, the resulting models may be less accurate after the same number of updates. Inspired by the BMUF method of Chen & Huo (2016), we propose a slow momentum (SlowMo) framework, where workers periodically synchronize and perform a momentum update, after multiple iterations of a base optimization algorithm. Experiments on image classification and machine translation tasks demonstrate that SlowMo consistently yields improvements in optimization and generalization performance relative to the base optimizer, even when the additional overhead is amortized over many updates so that the SlowMo runtime is on par with that of the base optimizer. We provide theoretical convergence guarantees showing that SlowMo converges to a stationary point of smooth non-convex losses. Since BMUF can be expressed through the SlowMo framework, our results also correspond to the first theoretical convergence guarantees for BMUF. |
Tasks | Distributed Optimization, Image Classification, Machine Translation |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00643v2 |
https://arxiv.org/pdf/1910.00643v2.pdf | |
PWC | https://paperswithcode.com/paper/slowmo-improving-communication-efficient |
Repo | |
Framework | |
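The slow-momentum idea in the abstract above (workers run several base-optimizer steps, synchronize, then apply a momentum update) can be sketched as follows. The abstract does not state the update rule, so the form used here, a BMUF-style block momentum on the averaged parameter change, is an assumption, as are all names (`slowmo_outer_step`, `local_sgd`, `train`) and the toy quadratic losses.

```python
def local_sgd(x, target, lr, steps):
    """Base optimizer: plain SGD on the toy loss 0.5 * ||x - target||^2."""
    for _ in range(steps):
        grad = [xi - ti for xi, ti in zip(x, target)]
        x = [xi - lr * gi for xi, gi in zip(x, grad)]
    return x


def average(vectors):
    """Exact average of the workers' parameters (stands in for AllReduce)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def slowmo_outer_step(x_start, x_avg, slow_buf, base_lr,
                      slow_lr=1.0, slow_momentum=0.5):
    """Assumed slow-momentum update applied after each synchronization.

    u_new = beta * u + (x_start - x_avg) / base_lr
    x_new = x_start - slow_lr * base_lr * u_new
    """
    new_buf = [slow_momentum * u + (xs - xa) / base_lr
               for u, xs, xa in zip(slow_buf, x_start, x_avg)]
    new_x = [xs - slow_lr * base_lr * u for xs, u in zip(x_start, new_buf)]
    return new_x, new_buf


def train(targets, rounds=50, local_steps=4, base_lr=0.1,
          slow_lr=1.0, slow_momentum=0.5):
    """Each entry of `targets` defines one worker's local objective."""
    dim = len(targets[0])
    x, buf = [0.0] * dim, [0.0] * dim
    for _ in range(rounds):
        # Every worker starts from the shared point and runs local steps.
        worker_x = [local_sgd(list(x), t, base_lr, local_steps) for t in targets]
        x_avg = average(worker_x)
        x, buf = slowmo_outer_step(x, x_avg, buf, base_lr, slow_lr, slow_momentum)
    return x


# Two workers whose local optima differ; the global optimum is their mean.
print(train([[1.0, 2.0], [3.0, 4.0]]))
```

With `slow_momentum=0.0` and `slow_lr=1.0` the outer step returns the plain parameter average, so in this sketch the base algorithm is recovered as a special case.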
EEG Classification by factoring in Sensor Configuration
Title | EEG Classification by factoring in Sensor Configuration |
Authors | Lubna Shibly Mokatren, Rashid Ansari, Ahmet Enis Cetin, Alex D Leow, Heide Klumpp, Olusola Ajilore, Fatos Yarman Vural |
Abstract | Electroencephalography (EEG) serves as an effective diagnostic tool for mental disorders and neurological abnormalities. Enhanced analysis and classification of EEG signals can help improve detection performance. A new approach is examined here for enhancing EEG classification performance by leveraging knowledge of the spatial layout of EEG sensors. The performance of two classification models (model 1, which ignores the sensor layout, and model 2, which factors it in) is investigated, and model 2 is found to achieve consistently higher detection accuracy. The analysis is based on the information content of these signals represented in two different ways: concatenation of the channels of the frequency bands and an image-like 2D representation of the EEG channel locations. Performance of these models is examined on two tasks: social anxiety disorder (SAD) detection, and emotion recognition using a dataset for emotion analysis using physiological signals (DEAP). We hypothesized that model 2 would significantly outperform model 1, and this was validated in our results, as model 2 yielded 5–8% higher accuracy in all machine learning algorithms investigated. Convolutional Neural Networks (CNN) provided the best performance, far exceeding that of Support Vector Machine (SVM) and k-Nearest Neighbors (kNN) algorithms. |
Tasks | EEG, Emotion Recognition |
Published | 2019-05-22 |
URL | https://arxiv.org/abs/1905.09472v2 |
https://arxiv.org/pdf/1905.09472v2.pdf | |
PWC | https://paperswithcode.com/paper/improved-eeg-classification-by-factoring-in |
Repo | |
Framework | |
Domain-Constrained Advertising Keyword Generation
Title | Domain-Constrained Advertising Keyword Generation |
Authors | Hao Zhou, Minlie Huang, Yishun Mao, Changlei Zhu, Peng Shu, Xiaoyan Zhu |
Abstract | Advertising (ad for short) keyword suggestion is important for sponsored search to improve online advertising and increase search revenue. There are two common challenges in this task. First, the keyword bidding problem: hot ad keywords are very expensive for most advertisers because more advertisers bid on more popular keywords, while unpopular keywords are difficult to discover. As a result, most ads have few chances to be presented to users. Second, the inefficient ad impression issue: a large proportion of search queries, which are unpopular yet relevant to many ad keywords, have no ads presented on their search result pages. Existing retrieval-based or matching-based methods either intensify the bidding competition or are unable to suggest novel keywords to cover more queries, which leads to inefficient ad impressions. To address the above issues, this work investigates the use of generative neural networks for keyword generation in sponsored search. Given a purchased keyword (a word sequence) as input, our model can generate a set of keywords that are not only relevant to the input but also satisfy the domain constraint, which enforces that the domain category of a generated keyword is as expected. Furthermore, a reinforcement learning algorithm is proposed to adaptively utilize domain-specific information in keyword generation. Offline evaluation shows that the proposed model can generate keywords that are diverse, novel, relevant to the source keyword, and consistent with the domain constraint. Online evaluation shows that generative models can improve coverage (COV), click-through rate (CTR), and revenue per mille (RPM) substantially in sponsored search. |
Tasks | |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10374v1 |
http://arxiv.org/pdf/1902.10374v1.pdf | |
PWC | https://paperswithcode.com/paper/domain-constrained-advertising-keyword |
Repo | |
Framework | |
Improving Academic Plagiarism Detection for STEM Documents by Analyzing Mathematical Content and Citations
Title | Improving Academic Plagiarism Detection for STEM Documents by Analyzing Mathematical Content and Citations |
Authors | Norman Meuschke, Vincent Stange, Moritz Schubotz, Michael Kramer, Bela Gipp |
Abstract | Identifying academic plagiarism is a pressing task for educational and research institutions, publishers, and funding agencies. Current plagiarism detection systems reliably find instances of copied and moderately reworded text. However, reliably detecting concealed plagiarism, such as strong paraphrases, translations, and the reuse of nontextual content and ideas is an open research problem. In this paper, we extend our prior research on analyzing mathematical content and academic citations. Both are promising approaches for improving the detection of concealed academic plagiarism primarily in Science, Technology, Engineering and Mathematics (STEM). We make the following contributions: i) We present a two-stage detection process that combines similarity assessments of mathematical content, academic citations, and text. ii) We introduce new similarity measures that consider the order of mathematical features and outperform the measures in our prior research. iii) We compare the effectiveness of the math-based, citation-based, and text-based detection approaches using confirmed cases of academic plagiarism. iv) We demonstrate that the combined analysis of math-based and citation-based content features allows identifying potentially suspicious cases in a collection of 102K STEM documents. Overall, we show that analyzing the similarity of mathematical content and academic citations is a striking supplement for conventional text-based detection approaches for academic literature in the STEM disciplines. |
Tasks | |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11761v1 |
https://arxiv.org/pdf/1906.11761v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-academic-plagiarism-detection-for |
Repo | |
Framework | |
Learning for Multi-Model and Multi-Type Fitting
Title | Learning for Multi-Model and Multi-Type Fitting |
Authors | Xun Xu, Loong-Fah Cheong, Zhuwen Li |
Abstract | Multi-model fitting has been extensively studied from the random sampling and clustering perspectives. Most assume that only a single type/class of model is present and their generalizations to fitting multiple types of models/structures simultaneously are non-trivial. The inherent challenges include choice of types and numbers of models, sampling imbalance and parameter tuning, all of which render conventional approaches ineffective. In this work, we formulate the multi-model multi-type fitting problem as one of learning deep feature embedding that is clustering-friendly. In other words, points of the same clusters are embedded closer together through the network. For inference, we apply K-means to cluster the data in the embedded feature space and model selection is enabled by analyzing the K-means residuals. Experiments are carried out on both synthetic and real world multi-type fitting datasets, producing state-of-the-art results. Comparisons are also made on single-type multi-model fitting tasks with promising results as well. |
Tasks | Model Selection |
Published | 2019-01-29 |
URL | http://arxiv.org/abs/1901.10254v1 |
http://arxiv.org/pdf/1901.10254v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-for-multi-model-and-multi-type |
Repo | |
Framework | |
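The inference stage described in the abstract above (K-means in the embedded feature space, with model selection driven by the K-means residuals) can be sketched in plain Python. The abstract does not say how the residuals are analyzed, so the stopping rule below (stop when adding a cluster no longer cuts the residual by at least half) is an assumption, and the names `kmeans`, `select_k`, and `drop_ratio` are hypothetical. The points stand in for embeddings produced by the network.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster goes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
              for p in points]
    residual = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, residual


def select_k(points, max_k, drop_ratio=0.5):
    """Assumed rule: smallest k for which k + 1 clusters give little gain."""
    residuals = [kmeans(points, k)[1] for k in range(1, max_k + 1)]
    for k in range(1, max_k):
        # residuals[k - 1] belongs to k clusters, residuals[k] to k + 1.
        if residuals[k] > drop_ratio * residuals[k - 1]:
            return k
    return max_k


# Toy "embedded" points forming three tight groups.
embedded = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (10.0, 0.0), (10.1, 0.0), (10.0, 0.1),
            (0.0, 10.0), (0.1, 10.0), (0.0, 10.1)]
print(select_k(embedded, max_k=5))  # → 3
```

The threshold `drop_ratio` is a tuning knob of this sketch only; a clustering-friendly embedding is what makes the residual drop sharply at the true number of models.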
Adversarial Training: embedding adversarial perturbations into the parameter space of a neural network to build a robust system
Title | Adversarial Training: embedding adversarial perturbations into the parameter space of a neural network to build a robust system |
Authors | Shixian Wen, Laurent Itti |
Abstract | Adversarial training, in which a network is trained on both adversarial and clean examples, is one of the most trusted defense methods against adversarial attacks. However, there are three major practical difficulties in implementing and deploying this method: high extra memory and computation costs; an accuracy trade-off between clean and adversarial examples; and a lack of diversity of adversarial perturbations. Classical adversarial training uses fixed, precomputed perturbations in adversarial examples (input space). In contrast, we introduce dynamic adversarial perturbations into the parameter space of the network by adding perturbation biases to the fully connected layers of a deep convolutional neural network. During training, using only clean images, the perturbation biases are updated in the Fast Gradient Sign Direction to automatically create and store adversarial perturbations by recycling the gradient information already computed. The network learns and adjusts itself automatically to these learned adversarial perturbations. Thus, we can achieve adversarial training at negligible cost compared to requiring a training set of adversarial example images. In addition, if combined with classical adversarial training, our perturbation biases can alleviate accuracy trade-off difficulties and diversify adversarial perturbations. |
Tasks | |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.04279v1 |
https://arxiv.org/pdf/1910.04279v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-training-embedding-adversarial |
Repo | |
Framework | |
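A rough sketch of the perturbation-bias update mentioned in the abstract above: the bias moves along the sign of a gradient that the backward pass has already produced. The exact rule, the step size `epsilon`, and the function name `update_perturbation_bias` are assumptions made for illustration, not the authors' implementation.

```python
def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def update_perturbation_bias(bias, loss_grad, epsilon):
    """Move each bias entry one step in the fast-gradient-sign direction.

    bias:      current perturbation bias of a fully connected layer
    loss_grad: gradient of the loss w.r.t. that bias (reused from backprop)
    epsilon:   step size (hypothetical hyperparameter)
    """
    # Ascent on the loss: the stored bias acts as an adversarial perturbation.
    return [b + epsilon * sign(g) for b, g in zip(bias, loss_grad)]


bias = [0.0, 0.0, 0.0]
grad = [0.3, -2.0, 0.0]
print(update_perturbation_bias(bias, grad, epsilon=0.01))  # → [0.01, -0.01, 0.0]
```

Only the sign of the gradient is used, so the update costs one elementwise pass over values that training has already computed.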
Uncertainty-Aware Driver Trajectory Prediction at Urban Intersections
Title | Uncertainty-Aware Driver Trajectory Prediction at Urban Intersections |
Authors | Xin Huang, Stephen McGill, Brian C. Williams, Luke Fletcher, Guy Rosman |
Abstract | Predicting the motion of a driver’s vehicle is crucial for advanced driving systems, enabling detection of potential risks towards shared control between the driver and automation systems. In this paper, we propose a variational neural network approach that predicts future driver trajectory distributions for the vehicle based on multiple sensors. Our predictor generates both a conditional variational distribution of future trajectories, as well as a confidence estimate for different time horizons. Our approach allows us to handle inherently uncertain situations, and reason about information gain from each input, as well as combine our model with additional predictors, creating a mixture of experts. We show how to augment the variational predictor with a physics-based predictor, and based on their confidence estimations, improve overall system performance. The resulting combined model is aware of the uncertainty associated with its predictions, which can help the vehicle autonomy to make decisions with more confidence. The model is validated on real-world urban driving data collected in multiple locations. This validation demonstrates that our approach improves the prediction error of a physics-based model by 25% while successfully identifying the uncertain cases with 82% accuracy. |
Tasks | Trajectory Prediction |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1901.05105v2 |
http://arxiv.org/pdf/1901.05105v2.pdf | |
PWC | https://paperswithcode.com/paper/uncertainty-aware-driver-trajectory |
Repo | |
Framework | |
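The confidence-based combination of a learned predictor with a physics-based one, described in the abstract above, might look like the sketch below. It assumes each predictor outputs a 2D position for a given time horizon together with a non-negative confidence score, and it blends the two in proportion to confidence. The blending rule and the name `combine_predictions` are illustrative assumptions; the abstract does not specify how the mixture is formed.

```python
def combine_predictions(learned_xy, learned_conf, physics_xy, physics_conf):
    """Blend two position predictions in proportion to their confidences."""
    total = learned_conf + physics_conf
    if total <= 0:
        raise ValueError("at least one confidence must be positive")
    weight = learned_conf / total
    return tuple(weight * a + (1.0 - weight) * b
                 for a, b in zip(learned_xy, physics_xy))


# Learned predictor is trusted three times as much as the physics model.
print(combine_predictions((10.0, 2.0), 0.75, (12.0, 2.0), 0.25))  # → (10.5, 2.0)
```

A hard switch (use whichever predictor is more confident) would be an equally plausible reading; the soft blend is chosen here only because it is simple to state.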
Machine learning for early prediction of circulatory failure in the intensive care unit
Title | Machine learning for early prediction of circulatory failure in the intensive care unit |
Authors | Stephanie L. Hyland, Martin Faltys, Matthias Hüser, Xinrui Lyu, Thomas Gumbsch, Cristóbal Esteban, Christian Bock, Max Horn, Michael Moor, Bastian Rieck, Marc Zimmermann, Dean Bodenham, Karsten Borgwardt, Gunnar Rätsch, Tobias M. Merz |
Abstract | Intensive care clinicians are presented with large quantities of patient information and measurements from a multitude of monitoring systems. The limited ability of humans to process such complex information hinders physicians from readily recognizing and acting on early signs of patient deterioration. We used machine learning to develop an early warning system for circulatory failure based on a high-resolution ICU database with 240 patient years of data. This automatic system predicts 90.0% of circulatory failure events (prevalence 3.1%), with 81.8% identified more than two hours in advance, resulting in an area under the receiver operating characteristic curve of 94.0% and area under the precision-recall curve of 63.0%. The model was externally validated in a large independent patient cohort. |
Tasks | |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07990v2 |
http://arxiv.org/pdf/1904.07990v2.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-for-early-prediction-of |
Repo | |
Framework | |
Local versus Global Strategies in Social Query Expansion
Title | Local versus Global Strategies in Social Query Expansion |
Authors | Omar Alonso, Vasileios Kandylas, Serge-Eric Tremblay |
Abstract | Link sharing in social media can be seen as a collaboratively retrieved set of documents for a query or topic expressed by a hashtag. Temporal information plays an important role for identifying the correct context for which such annotations are valid for retrieval purposes. We investigate how social data as temporal context can be used for query expansion and compare global versus local strategies for computing such contextual information for a set of hashtags. |
Tasks | |
Published | 2019-08-05 |
URL | https://arxiv.org/abs/1908.01868v1 |
https://arxiv.org/pdf/1908.01868v1.pdf | |
PWC | https://paperswithcode.com/paper/local-versus-global-strategies-in-social |
Repo | |
Framework | |
Unsupervised Domain Adaptation via Regularized Conditional Alignment
Title | Unsupervised Domain Adaptation via Regularized Conditional Alignment |
Authors | Safa Cicek, Stefano Soatto |
Abstract | We propose a method for unsupervised domain adaptation that trains a shared embedding to align the joint distributions of inputs (domain) and outputs (classes), making any classifier agnostic to the domain. Joint alignment ensures that not only the marginal distributions of the domain are aligned, but the labels as well. We propose a novel objective function that encourages the class-conditional distributions to have disjoint support in feature space. We further exploit adversarial regularization to improve the performance of the classifier on the domain for which no annotated data is available. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2019-05-26 |
URL | https://arxiv.org/abs/1905.10885v1 |
https://arxiv.org/pdf/1905.10885v1.pdf | |
PWC | https://paperswithcode.com/paper/190510885 |
Repo | |
Framework | |
Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence
Title | Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence |
Authors | Aditya Golatkar, Alessandro Achille, Stefano Soatto |
Abstract | Regularization is typically understood as improving generalization by altering the landscape of local extrema to which the model eventually converges. Deep neural networks (DNNs), however, challenge this view: We show that removing regularization after an initial transient period has little effect on generalization, even if the final loss landscape is the same as if there had been no regularization. In some cases, generalization even improves after interrupting regularization. Conversely, if regularization is applied only after the initial transient, it has no effect on the final solution, whose generalization gap is as bad as if regularization never happened. This suggests that what matters for training deep networks is not just whether or how, but when to regularize. The phenomena we observe are manifest in different datasets (CIFAR-10, CIFAR-100), different architectures (ResNet-18, All-CNN), different regularization methods (weight decay, data augmentation), different learning rate schedules (exponential, piece-wise constant). They collectively suggest that there is a “critical period” for regularizing deep networks that is decisive of the final performance. More analysis should, therefore, focus on the transient rather than asymptotic behavior of learning. |
Tasks | Data Augmentation |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13277v1 |
https://arxiv.org/pdf/1905.13277v1.pdf | |
PWC | https://paperswithcode.com/paper/time-matters-in-regularizing-deep-networks |
Repo | |
Framework | |
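The experiment design in the abstract above, where regularization is switched off after an initial transient, can be expressed as a simple schedule. The sketch below applies weight decay only for the first epochs of training; the cutoff epoch, decay strength, and function names are hypothetical, since the abstract does not give the length of the transient.

```python
def weight_decay_at(epoch, base_decay, off_after_epoch):
    """Weight decay coefficient that is dropped to zero after the transient."""
    return base_decay if epoch < off_after_epoch else 0.0


def sgd_step(weights, grads, lr, weight_decay):
    """One SGD step with L2 weight decay folded into the gradient."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


# Hypothetical schedule: regularize during the first 50 of 200 epochs.
for epoch in (0, 49, 50, 199):
    print(epoch, weight_decay_at(epoch, base_decay=5e-4, off_after_epoch=50))
```

Swapping the comparison (`epoch >= off_after_epoch`) gives the converse setting from the abstract, in which regularization starts only after the transient.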