Paper Group ANR 906
Optimality of Spectral Clustering for Gaussian Mixture Model
Title | Optimality of Spectral Clustering for Gaussian Mixture Model |
Authors | Matthias Löffler, Anderson Y. Zhang, Harrison H. Zhou |
Abstract | Spectral clustering is one of the most popular algorithms for grouping high-dimensional data. It is easy to implement and computationally efficient. Despite its popularity and successful applications, its theoretical properties have not been fully understood. The spectral clustering algorithm is often used as a consistent initializer for more sophisticated clustering algorithms. However, in this paper, we show that spectral clustering is actually already optimal in the Gaussian Mixture Model when the number of clusters is fixed and consistent clustering is possible. Although spectral gap conditions are widely assumed in the literature when analyzing spectral clustering, such conditions are not needed in this paper to establish its optimality. |
Tasks | |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00538v1 |
https://arxiv.org/pdf/1911.00538v1.pdf | |
PWC | https://paperswithcode.com/paper/optimality-of-spectral-clustering-for |
Repo | |
Framework | |
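A minimal sketch of the vanilla spectral clustering procedure analyzed in this paper, assuming a data matrix with one observation per row and a known number of clusters k; the toy mixture and parameter choices are illustrative, not the paper's experiments.

```python
# Vanilla spectral clustering for a Gaussian mixture: project the data onto the
# top-k singular subspace, then run k-means on the projected rows.
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(X, k):
    """X: (n, p) data matrix, k: number of clusters."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    proj = Xc @ Vt[:k].T              # leading k-dimensional projection, (n, k)
    return KMeans(n_clusters=k, n_init=10).fit_predict(proj)

# Toy two-component Gaussian mixture in 50 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 50)), rng.normal(3, 1, (100, 50))])
labels = spectral_clustering(X, k=2)
```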
On the connections between algorithmic regularization and penalization for convex losses
Title | On the connections between algorithmic regularization and penalization for convex losses |
Authors | Qian Qian, Xiaoyuan Qian |
Abstract | In this work we establish the equivalence of algorithmic regularization and explicit convex penalization for generic convex losses. We introduce a geometric condition for the optimization path of a convex function, and show that if such a condition is satisfied, the optimization path of an iterative algorithm on the unregularized optimization problem can be represented as the solution path of a corresponding penalized problem. |
Tasks | |
Published | 2019-09-08 |
URL | https://arxiv.org/abs/1909.03371v1 |
https://arxiv.org/pdf/1909.03371v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-connections-between-algorithmic |
Repo | |
Framework | |
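The equivalence discussed above can be illustrated numerically in the best-known special case, least squares, where early-stopped gradient descent iterates track ridge solutions. The sketch below only demonstrates that phenomenon under arbitrary step-size and penalty grids; it is not the paper's geometric condition or construction.

```python
# Numerical illustration: gradient descent iterates on the unregularized least
# squares loss stay close to ridge (explicitly penalized) solutions.
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 10))
b = rng.normal(size=50)

# Gradient descent on 0.5*||Ax - b||^2, keeping the whole optimization path.
x, eta, gd_path = np.zeros(10), 1e-3, []
for _ in range(2000):
    x = x - eta * A.T @ (A @ x - b)
    gd_path.append(x.copy())

# Ridge path: argmin 0.5*||Ax - b||^2 + 0.5*lam*||x||^2 for a grid of lam.
for lam in (10.0, 1.0, 0.1):
    x_ridge = np.linalg.solve(A.T @ A + lam * np.eye(10), A.T @ b)
    d = min(np.linalg.norm(x_ridge - xt) for xt in gd_path)
    print(f"lambda={lam:5.2f}: closest GD iterate within {d:.4f}")
```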
Human Activity Recognition Using Visual Object Detection
Title | Human Activity Recognition Using Visual Object Detection |
Authors | Schalk Wilhelm Pienaar, Reza Malekian |
Abstract | Visual Human Activity Recognition (HAR) and data fusion with other sensors can help track the behavior and activity of underground miners with little obstruction. Existing models, such as the Single Shot Detector (SSD) trained on the Common Objects in Context (COCO) dataset, are used in this paper to detect the current state of a miner, such as an injured miner vs. a non-injured miner. TensorFlow is used as the abstraction layer for implementing the machine learning algorithms; although it uses Python to deal with nodes and tensors, the actual algorithms run on C++ libraries, providing a good balance between performance and development speed. The paper further discusses evaluation methods for determining the accuracy of the machine learning model, and an approach to increase the accuracy of the detected activity/state of people in a mining environment by means of data fusion. |
Tasks | Activity Recognition, Human Activity Recognition, Object Detection |
Published | 2019-05-02 |
URL | http://arxiv.org/abs/1905.03707v1 |
http://arxiv.org/pdf/1905.03707v1.pdf | |
PWC | https://paperswithcode.com/paper/190503707 |
Repo | |
Framework | |
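A minimal sketch of running a COCO-pretrained SSD on a frame with TensorFlow, assuming a TF Hub detection module with the standard TF2 object-detection output signature; the module handle, the score threshold, and the person-only filter are illustrative assumptions, and the paper's mapping from detections to a miner's state (injured vs. non-injured) is not reproduced here.

```python
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

# COCO-pretrained SSD from TF Hub (handle chosen for illustration).
detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2")

frame = np.zeros((1, 320, 320, 3), dtype=np.uint8)   # placeholder camera frame
outputs = detector(tf.constant(frame))

boxes   = outputs["detection_boxes"][0].numpy()      # [N, 4] normalized boxes
scores  = outputs["detection_scores"][0].numpy()     # [N] confidences
classes = outputs["detection_classes"][0].numpy()    # [N] COCO class ids

PERSON = 1  # COCO class id for "person"
people = boxes[(classes == PERSON) & (scores > 0.5)]
print(f"{len(people)} person detections above threshold")
```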
Fault-Diagnosing SLAM for Varying Scale Change Detection
Title | Fault-Diagnosing SLAM for Varying Scale Change Detection |
Authors | Sugimoto Takuma, Yamaguchi Kousuke, Tanaka Kanji |
Abstract | In this paper, we present a new fault diagnosis (FD) based approach to image change detection that identifies significant changes as inconsistencies between different sub-modules (e.g., self-localization) of visual SLAM. Unlike classical change detection approaches such as pairwise image comparison (PC) and anomaly detection (AD), this FD approach requires neither the memorization of each map image nor the maintenance of up-to-date place-specific anomaly detectors. A significant challenge encountered when incorporating different SLAM sub-modules into FD is dealing with the varying scales of changed objects (e.g., the appearance of small dangerous obstacles on the floor). To address this issue, we reconsider the bag-of-words (BoW) image representation, exploiting its recent advances in self-localization and change detection. As a key advantage, a BoW image representation can be reorganized at any scale by simply cropping the original BoW image. Furthermore, we propose to combine self-localization modules based on strong and weak BoW features of different discriminativity, and to treat inconsistency between strong and weak self-localization as an indicator of change. The efficacy of the proposed approach for FD with/without AD and/or PC was experimentally validated. |
Tasks | Anomaly Detection |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.09592v1 |
https://arxiv.org/pdf/1909.09592v1.pdf | |
PWC | https://paperswithcode.com/paper/fault-diagnosing-slam-for-varying-scale |
Repo | |
Framework | |
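A minimal sketch of the cropping property of the bag-of-words (BoW) representation described above: an image is a set of (x, y, visual word) triples, so the BoW of any sub-window is just a restricted histogram, and disagreement between a strong and a weak self-localizer serves as the change indicator. Vocabulary sizes, distances, and the indicator itself are illustrative assumptions.

```python
import numpy as np

def bow_histogram(words, vocab_size):
    """Histogram of visual-word ids."""
    return np.bincount(words, minlength=vocab_size).astype(float)

def crop_bow(xy, words, vocab_size, x0, y0, x1, y1):
    """BoW of the keypoints falling inside a sub-window (the 'cropped' BoW)."""
    m = (xy[:, 0] >= x0) & (xy[:, 0] < x1) & (xy[:, 1] >= y0) & (xy[:, 1] < y1)
    return bow_histogram(words[m], vocab_size)

def localize(query_hist, map_hists):
    """Nearest map image by L1 distance between normalized histograms."""
    q = query_hist / max(query_hist.sum(), 1.0)
    dists = [np.abs(q - h / max(h.sum(), 1.0)).sum() for h in map_hists]
    return int(np.argmin(dists))

# Fault-diagnosis indicator: strong (fine-vocabulary) and weak (coarse-vocabulary)
# self-localization should agree; disagreement flags a possible change.
def change_indicator(strong_match_id, weak_match_id):
    return strong_match_id != weak_match_id
```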
KBQA: Learning Question Answering over QA Corpora and Knowledge Bases
Title | KBQA: Learning Question Answering over QA Corpora and Knowledge Bases |
Authors | Wanyun Cui, Yanghua Xiao, Haixun Wang, Yangqiu Song, Seung-won Hwang, Wei Wang |
Abstract | Question answering (QA) has become a popular way for humans to access billion-scale knowledge bases. Unlike web search, QA over a knowledge base returns accurate and concise results, provided that natural language questions can be understood and mapped precisely to structured queries over the knowledge base. The challenge, however, is that a human can ask one question in many different ways. Previous approaches have natural limits due to their representations: rule-based approaches only understand a small set of “canned” questions, while keyword-based or synonym-based approaches cannot fully understand the questions. In this paper, we design a new kind of question representation: templates, learned over a billion-scale knowledge base and million-scale QA corpora. For example, for questions about a city’s population, we learn templates such as What’s the population of $city? and How many people are there in $city?. We learned 27 million templates for 2,782 intents. Based on these templates, our QA system KBQA effectively supports binary factoid questions, as well as complex questions composed of a series of binary factoid questions. Furthermore, we expand the predicates in the RDF knowledge base, which boosts its coverage by 57 times. Our QA system beats all other state-of-the-art works in both effectiveness and efficiency on the QALD benchmarks. |
Tasks | Question Answering |
Published | 2019-03-06 |
URL | http://arxiv.org/abs/1903.02419v1 |
http://arxiv.org/pdf/1903.02419v1.pdf | |
PWC | https://paperswithcode.com/paper/kbqa-learning-question-answering-over-qa |
Repo | |
Framework | |
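A minimal sketch of the template idea above: the entity mention in a question is replaced by a typed slot to obtain a template, and the template is mapped to a knowledge-base predicate. The tiny dictionaries below are illustrative assumptions; KBQA learns millions of such templates from QA corpora.

```python
# Toy template-based QA: question -> template -> predicate -> KB lookup.
TEMPLATE_TO_PREDICATE = {
    "what's the population of $city?": "population",
    "how many people are there in $city?": "population",
}
CITIES = {"honolulu", "paris"}                       # toy entity dictionary
KB = {("honolulu", "population"): 390_000}           # toy knowledge base

def answer(question):
    q = question.lower().strip()
    for city in CITIES:
        if city in q:
            template = q.replace(city, "$city")      # conceptualize the entity
            predicate = TEMPLATE_TO_PREDICATE.get(template)
            if predicate:
                return KB.get((city, predicate))
    return None

print(answer("What's the population of Honolulu?"))  # -> 390000
```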
Competing Topic Naming Conventions in Quora: Predicting Appropriate Topic Merges and Winning Topics from Millions of Topic Pairs
Title | Competing Topic Naming Conventions in Quora: Predicting Appropriate Topic Merges and Winning Topics from Millions of Topic Pairs |
Authors | Binny Mathew, Suman Kalyan Maity, Pawan Goyal, Animesh Mukherjee |
Abstract | Quora is a popular Q&A site which lets users tag questions with multiple relevant topics, which helps attract quality answers. These topics are not predefined but user-defined conventions, and it is not rare for multiple such conventions describing exactly the same concept to coexist in the Quora ecosystem. In almost all such cases, users (or Quora moderators) manually merge the topic pair into one of the two topics, thus selecting one of the competing conventions. An important application for the site, therefore, is to identify early on competing conventions that should be merged in the future. In this paper, we propose a two-step approach that uniquely combines anomaly detection and supervised classification frameworks to predict whether two topics from among millions of topic pairs are indeed competing conventions and should merge, achieving an F-score of 0.711. We also develop a model to predict the direction of the topic merge, i.e., the winning convention, achieving an F-score of 0.898. Our system is able to predict ~25% of the correct merges within the first month of the merge and ~40% of the cases within a year. This is an encouraging result, since Quora users on average take 936 days to identify such a correct merge. Human judgment experiments show that our system is able to predict almost all the correct cases that humans can predict, plus 37.24% correct cases which humans are not able to identify at all. |
Tasks | Anomaly Detection |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04367v1 |
https://arxiv.org/pdf/1909.04367v1.pdf | |
PWC | https://paperswithcode.com/paper/competing-topic-naming-conventions-in-quora |
Repo | |
Framework | |
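A minimal sketch of the two-step pipeline described above, with an unsupervised anomaly detector flagging unusual topic pairs and a supervised classifier then predicting merge vs. no-merge (the merge-direction model would be analogous). The synthetic features and model choices are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier

rng = np.random.default_rng(0)
# Toy per-pair features, e.g. [name similarity, follower ratio, co-usage rate].
X = rng.normal(size=(1000, 3))
y = rng.integers(0, 2, size=1000)          # 1 = the pair was eventually merged

# Step 1: anomaly detection keeps only "unusual" pairs as merge candidates.
iso = IsolationForest(contamination=0.1, random_state=0).fit(X)
candidates = iso.predict(X) == -1          # -1 marks anomalies in scikit-learn

# Step 2: supervised classification restricted to the candidate pairs.
clf = RandomForestClassifier(random_state=0).fit(X[candidates], y[candidates])
print("candidate pairs:", int(candidates.sum()),
      "predicted merges:", int(clf.predict(X[candidates]).sum()))
```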
Fashion Retrieval via Graph Reasoning Networks on a Similarity Pyramid
Title | Fashion Retrieval via Graph Reasoning Networks on a Similarity Pyramid |
Authors | Zhanghui Kuang, Yiming Gao, Guanbin Li, Ping Luo, Yimin Chen, Liang Lin, Wayne Zhang |
Abstract | Matching clothing images from customers and online shopping stores has rich applications in e-commerce. Existing algorithms encode an image as a global feature vector and perform retrieval with this global representation. However, discriminative local information on clothes is submerged in such a global representation, resulting in sub-optimal performance. To address this issue, we propose a novel Graph Reasoning Network (GRNet) on a Similarity Pyramid, which learns similarities between a query and a gallery cloth using both global and local representations at multiple scales. The similarity pyramid is represented by a similarity graph, where nodes represent similarities between clothing components at different scales, and the final matching score is obtained by message passing along the edges. In GRNet, graph reasoning is solved by training a graph convolutional network, which aligns salient clothing components to improve clothing retrieval. To facilitate future research, we introduce a new benchmark, FindFashion, containing rich annotations of bounding boxes, views, occlusions, and cropping. Extensive experiments show that GRNet obtains new state-of-the-art results on two challenging benchmarks, e.g., pushing the top-1, top-20, and top-50 accuracies on DeepFashion to 26%, 64%, and 75% (i.e., 4%, 10%, and 10% absolute improvements), outperforming competitors by large margins. On FindFashion, GRNet achieves considerable improvements in all empirical settings. |
Tasks | |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11754v1 |
https://arxiv.org/pdf/1908.11754v1.pdf | |
PWC | https://paperswithcode.com/paper/fashion-retrieval-via-graph-reasoning |
Repo | |
Framework | |
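A minimal sketch of the similarity-pyramid idea above: each node holds a similarity between query and gallery features at one scale, edges propagate information between nodes, and the final matching score pools the updated nodes. Plain averaging stands in for the learned graph convolution, so this is an illustrative simplification rather than GRNet itself.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def matching_score(query_feats, gallery_feats, adjacency, rounds=2):
    """query_feats/gallery_feats: aligned lists of per-scale feature vectors;
    adjacency: (n, n) row-normalized matrix over the similarity nodes."""
    nodes = np.array([cosine(q, g) for q, g in zip(query_feats, gallery_feats)])
    for _ in range(rounds):                  # message passing along the edges
        nodes = adjacency @ nodes
    return nodes.mean()

rng = np.random.default_rng(0)
scales = 4
q = [rng.normal(size=64) for _ in range(scales)]
g = [rng.normal(size=64) for _ in range(scales)]
A = np.full((scales, scales), 1.0 / scales)  # fully connected, row-normalized
print(matching_score(q, g, A))
```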
Accelerated Nuclear Magnetic Resonance Spectroscopy with Deep Learning
Title | Accelerated Nuclear Magnetic Resonance Spectroscopy with Deep Learning |
Authors | Xiaobo Qu, Yihui Huang, Hengfa Lu, Tianyu Qiu, Di Guo, Tatiana Agback, Vladislav Orekhov, Zhong Chen |
Abstract | Nuclear magnetic resonance (NMR) spectroscopy serves as an indispensable tool in chemistry and biology but often suffers from long experimental times. We present a proof-of-concept application of deep learning and neural networks for high-quality, reliable, and very fast NMR spectra reconstruction from limited experimental data. We show that the neural network can be trained using solely synthetic NMR signals, which lifts the prohibitive demand for the large volume of realistic training data usually required by deep learning approaches. |
Tasks | |
Published | 2019-04-09 |
URL | https://arxiv.org/abs/1904.05168v2 |
https://arxiv.org/pdf/1904.05168v2.pdf | |
PWC | https://paperswithcode.com/paper/accelerated-nuclear-magnetic-resonance |
Repo | |
Framework | |
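A minimal sketch of the synthetic-training idea above: free induction decays are sums of exponentially damped complex sinusoids, so (undersampled signal, full spectrum) training pairs can be generated on the fly. Peak counts, damping ranges, and the sampling mask are illustrative assumptions.

```python
import numpy as np

def synthetic_fid(n_points=256, max_peaks=5, rng=None):
    """Sum of randomly parameterized, exponentially damped complex sinusoids."""
    rng = rng or np.random.default_rng()
    t = np.arange(n_points)
    fid = np.zeros(n_points, dtype=complex)
    for _ in range(rng.integers(1, max_peaks + 1)):
        amp   = rng.uniform(0.1, 1.0)
        freq  = rng.uniform(-0.5, 0.5)       # cycles per sample
        decay = rng.uniform(5.0, 50.0)       # decay constant, in samples
        fid += amp * np.exp(2j * np.pi * freq * t - t / decay)
    return fid

rng = np.random.default_rng(0)
fid = synthetic_fid(rng=rng)
mask = np.zeros(fid.size, dtype=bool)
mask[rng.choice(fid.size, size=64, replace=False)] = True
undersampled = fid * mask                    # network input
target_spectrum = np.abs(np.fft.fft(fid))    # training target
```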
Healthy versus pathological learning transferability in shoulder muscle MRI segmentation using deep convolutional encoder-decoders
Title | Healthy versus pathological learning transferability in shoulder muscle MRI segmentation using deep convolutional encoder-decoders |
Authors | Pierre-Henri Conze, Sylvain Brochard, Valérie Burdin, Frances T. Sheehan, Christelle Pons |
Abstract | Automatic segmentation of pathological shoulder muscles in patients with musculo-skeletal diseases is a challenging task due to the huge variability in muscle shape, size, location, texture and injury. A reliable fully-automated segmentation method from magnetic resonance images could greatly help clinicians to plan therapeutic interventions and predict interventional outcomes while eliminating time-consuming manual segmentation efforts. The purpose of this work is three-fold. First, we investigate the feasibility of pathological shoulder muscle segmentation using deep learning techniques, given a very limited amount of available annotated pediatric data. Second, we address the learning transferability from healthy to pathological data by comparing different learning schemes in terms of model generalizability. Third, extended versions of deep convolutional encoder-decoder architectures using encoders pre-trained on non-medical data are proposed to improve the segmentation accuracy. Methodological aspects are evaluated in a leave-one-out fashion on a dataset of 24 shoulder examinations from patients with obstetrical brachial plexus palsy, focusing on 4 different muscles: the deltoid as well as the infraspinatus, supraspinatus and subscapularis from the rotator cuff. The most relevant segmentation model is partially pre-trained on ImageNet and jointly exploits inter-patient healthy and pathological annotated data. Its performance reaches Dice scores of 82.4%, 82.0%, 71.0% and 82.8% for the deltoid, infraspinatus, supraspinatus and subscapularis muscles. Absolute surface estimation errors are all below 83 mm² except for the supraspinatus, at 134.6 mm². These contributions offer new perspectives for force inference in the context of musculo-skeletal disorder management. |
Tasks | |
Published | 2019-01-06 |
URL | http://arxiv.org/abs/1901.01620v2 |
http://arxiv.org/pdf/1901.01620v2.pdf | |
PWC | https://paperswithcode.com/paper/healthy-versus-pathological-learning |
Repo | |
Framework | |
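A minimal sketch of the transfer-learning setup above: an ImageNet-pretrained encoder (VGG16 is used here as one plausible choice) feeding a small transposed-convolution decoder that outputs a binary muscle mask. The decoder depth, input size, and training details are illustrative assumptions rather than the paper's exact architectures.

```python
import tensorflow as tf

# Encoder pre-trained on non-medical (ImageNet) data.
encoder = tf.keras.applications.VGG16(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
encoder.trainable = False                    # freeze, or fine-tune later

# Small decoder upsampling the 7x7x512 feature map back to 224x224.
x = encoder.output
for filters in (256, 128, 64, 32, 16):
    x = tf.keras.layers.Conv2DTranspose(
        filters, 3, strides=2, padding="same", activation="relu")(x)
mask = tf.keras.layers.Conv2D(1, 1, activation="sigmoid")(x)   # per-pixel muscle mask

model = tf.keras.Model(encoder.input, mask)
model.compile(optimizer="adam", loss="binary_crossentropy")
```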
IoU-aware Single-stage Object Detector for Accurate Localization
Title | IoU-aware Single-stage Object Detector for Accurate Localization |
Authors | Shengkai Wu, Xiaoping Li, Xinggang Wang |
Abstract | Due to their simplicity and high efficiency, single-stage object detectors have been widely applied in many computer vision applications. However, the low correlation between the classification score and the localization accuracy of the predicted detections severely hurts the localization accuracy of these models. In this paper, an IoU-aware single-stage object detector is proposed to solve this problem. Specifically, the IoU-aware single-stage object detector predicts the IoU for each detected box. The classification score and the predicted IoU are then multiplied to compute the final detection confidence, which is more correlated with localization accuracy. The detection confidence is then used as the input to the subsequent NMS and COCO AP computation, which substantially improves the localization accuracy of the models. Extensive experiments on the COCO and PASCAL VOC datasets demonstrate the effectiveness of the IoU-aware single-stage object detector in improving localization accuracy. Without bells and whistles, the proposed method substantially improves AP by 1.7%-1.9% and AP75 by 2.2%-2.5% on COCO test-dev. On PASCAL VOC, the proposed method substantially improves AP by 2.9%-4.4% and AP80 and AP90 by 4.6%-10.2%. The source code will be made publicly available. |
Tasks | |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.05992v2 |
https://arxiv.org/pdf/1912.05992v2.pdf | |
PWC | https://paperswithcode.com/paper/iou-aware-single-stage-object-detector-for |
Repo | |
Framework | |
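A minimal sketch of the scoring rule described above: the final detection confidence is the product of the classification score and the predicted IoU, and that confidence drives NMS. The NMS threshold and the toy boxes are illustrative assumptions.

```python
import numpy as np

def iou(a, b):
    """a, b: boxes as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda box: (box[2] - box[0]) * (box[3] - box[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def iou_aware_nms(boxes, cls_scores, pred_ious, thresh=0.5):
    conf = cls_scores * pred_ious            # IoU-aware detection confidence
    order = np.argsort(-conf)
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        mask = np.array([iou(boxes[i], boxes[j]) < thresh for j in rest], dtype=bool)
        order = rest[mask]
    return keep, conf

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], float)
keep, conf = iou_aware_nms(boxes, cls_scores=np.array([0.9, 0.8, 0.7]),
                           pred_ious=np.array([0.6, 0.9, 0.8]))
print(keep, conf[keep])   # the better-localized of the two overlapping boxes wins
```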
Rapid identification of pathogenic bacteria using Raman spectroscopy and deep learning
Title | Rapid identification of pathogenic bacteria using Raman spectroscopy and deep learning |
Authors | Chi-Sing Ho, Neal Jean, Catherine A. Hogan, Lena Blackmon, Stefanie S. Jeffrey, Mark Holodniy, Niaz Banaei, Amr A. E. Saleh, Stefano Ermon, Jennifer Dionne |
Abstract | Rapid identification of bacteria is essential to prevent the spread of infectious disease, help combat antimicrobial resistance, and improve patient outcomes. Raman optical spectroscopy promises to combine bacterial detection, identification, and antibiotic susceptibility testing in a single step. However, achieving clinically relevant speeds and accuracies remains challenging due to the weak Raman signal from bacterial cells and the large number of bacterial species and phenotypes. By amassing the largest known dataset of bacterial Raman spectra, we are able to apply state-of-the-art deep learning approaches to identify 30 of the most common bacterial pathogens from noisy Raman spectra, achieving antibiotic treatment identification accuracies of 99.0±0.1%. This novel approach distinguishes between methicillin-resistant and -susceptible isolates of Staphylococcus aureus (MRSA and MSSA) as well as a pair of isogenic MRSA and MSSA that are genetically identical apart from deletion of the mecA resistance gene, indicating the potential for culture-free detection of antibiotic resistance. Results from initial clinical validation are promising: using just 10 bacterial spectra from each of 25 isolates, we achieve 99.0±1.9% species identification accuracy. Our combined Raman-deep learning system represents an important proof-of-concept for rapid, culture-free identification of bacterial isolates and antibiotic resistance and could be readily extended for diagnostics on blood, urine, and sputum. |
Tasks | |
Published | 2019-01-23 |
URL | https://arxiv.org/abs/1901.07666v2 |
https://arxiv.org/pdf/1901.07666v2.pdf | |
PWC | https://paperswithcode.com/paper/rapid-identification-of-pathogenic-bacteria |
Repo | |
Framework | |
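A minimal sketch of the classification setup above: a small 1D CNN mapping a Raman spectrum to one of 30 pathogen classes. The architecture and the spectrum length (1,000 bins) are illustrative assumptions, not the paper's model.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1000, 1)),                 # one spectrum, 1000 bins
    tf.keras.layers.Conv1D(16, 9, activation="relu"),
    tf.keras.layers.MaxPooling1D(4),
    tf.keras.layers.Conv1D(32, 9, activation="relu"),
    tf.keras.layers.MaxPooling1D(4),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(30, activation="softmax"), # 30 pathogen classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```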
Melanoma detection with electrical impedance spectroscopy and dermoscopy using joint deep learning models
Title | Melanoma detection with electrical impedance spectroscopy and dermoscopy using joint deep learning models |
Authors | Nils Gessert, Marcel Bengs, Alexander Schlaefer |
Abstract | The initial assessment of skin lesions is typically based on dermoscopic images. As this is a difficult and time-consuming task, machine learning methods using dermoscopic images have been proposed to assist human experts. Other approaches have studied electrical impedance spectroscopy (EIS) as a basis for clinical decision support systems. Both methods represent different ways of measuring skin lesion properties as dermoscopy relies on visible light and EIS uses electric currents. Thus, the two methods might carry complementary features for lesion classification. Therefore, we propose joint deep learning models considering both EIS and dermoscopy for melanoma detection. For this purpose, we first study machine learning methods for EIS that incorporate domain knowledge and previously used heuristics into the design process. As a result, we propose a recurrent model with state-max-pooling which automatically learns the relevance of different EIS measurements. Second, we combine this new model with different convolutional neural networks that process dermoscopic images. We study ensembling approaches and also propose a cross-attention module guiding information exchange between the EIS and dermoscopy model. In general, combinations of EIS and dermoscopy clearly outperform models that only use either EIS or dermoscopy. We show that our attention-based, combined model outperforms other models with specificities of 34.4% (CI 31.3-38.4), 34.7% (CI 31.0-38.8) and 53.7% (CI 50.1-57.6) for dermoscopy, EIS and the combined model, respectively, at a clinically relevant sensitivity of 98%. |
Tasks | |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02322v2 |
https://arxiv.org/pdf/1911.02322v2.pdf | |
PWC | https://paperswithcode.com/paper/melanoma-detection-with-electrical-impedance |
Repo | |
Framework | |
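A minimal sketch of the cross-attention fusion idea above: features from the EIS branch attend over features from the dermoscopy branch before the joint prediction. The dimensions and the single-head, scaled-dot-product form are illustrative assumptions, not the paper's exact module.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """queries: (nq, d) from one modality; keys_values: (nk, d) from the other."""
    scores = queries @ keys_values.T / np.sqrt(queries.shape[1])
    return softmax(scores, axis=1) @ keys_values     # (nq, d) attended features

rng = np.random.default_rng(0)
eis_feats = rng.normal(size=(10, 32))    # e.g. 10 EIS measurements, 32-d each
img_feats = rng.normal(size=(49, 32))    # e.g. flattened 7x7 CNN feature map
eis_attended = cross_attention(eis_feats, img_feats)
joint = np.concatenate([eis_attended.mean(0), img_feats.mean(0)])  # fused vector
```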
Deep Reinforcement Learning from Policy-Dependent Human Feedback
Title | Deep Reinforcement Learning from Policy-Dependent Human Feedback |
Authors | Dilip Arumugam, Jun Ki Lee, Sophie Saskin, Michael L. Littman |
Abstract | To widen their accessibility and increase their utility, intelligent agents must be able to learn complex behaviors as specified by (non-expert) human users. Moreover, they will need to learn these behaviors within a reasonable amount of time while efficiently leveraging the sparse feedback a human trainer is capable of providing. Recent work has shown that human feedback can be characterized as a critique of an agent’s current behavior rather than as an alternative reward signal to be maximized, culminating in the COnvergent Actor-Critic by Humans (COACH) algorithm for making direct policy updates based on human feedback. Our work builds on COACH, moving to a setting where the agent’s policy is represented by a deep neural network. We employ a series of modifications on top of the original COACH algorithm that are critical for successfully learning behaviors from high-dimensional observations, while also satisfying the constraint of obtaining reduced sample complexity. We demonstrate the effectiveness of our Deep COACH algorithm in the rich 3D world of Minecraft with an agent that learns to complete tasks by mapping from raw pixels to actions using only real-time human feedback in 10-15 minutes of interaction. |
Tasks | |
Published | 2019-02-12 |
URL | http://arxiv.org/abs/1902.04257v1 |
http://arxiv.org/pdf/1902.04257v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-from-policy |
Repo | |
Framework | |
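A minimal sketch of a COACH-style update as described above: a scalar human critique directly scales the policy-gradient term for the action the agent just took. The network, learning rate, and the omission of Deep COACH's eligibility traces and replay machinery are illustrative simplifications.

```python
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 4))
optimizer = torch.optim.Adam(policy.parameters(), lr=5e-4)

def coach_update(obs, action, feedback):
    """obs: (64,) tensor; action: int; feedback: human critique in {-1, +1}."""
    log_prob = torch.log_softmax(policy(obs), dim=-1)[action]
    loss = -feedback * log_prob          # push the action's probability up/down
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

coach_update(torch.randn(64), action=2, feedback=+1.0)
```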
Degenerate Feedback Loops in Recommender Systems
Title | Degenerate Feedback Loops in Recommender Systems |
Authors | Ray Jiang, Silvia Chiappa, Tor Lattimore, András György, Pushmeet Kohli |
Abstract | Machine learning is used extensively in recommender systems deployed in products. The decisions made by these systems can influence user beliefs and preferences which in turn affect the feedback the learning system receives - thus creating a feedback loop. This phenomenon can give rise to the so-called “echo chambers” or “filter bubbles” that have user and societal implications. In this paper, we provide a novel theoretical analysis that examines both the role of user dynamics and the behavior of recommender systems, disentangling the echo chamber from the filter bubble effect. In addition, we offer practical solutions to slow down system degeneracy. Our study contributes toward understanding and developing solutions to commonly cited issues in the complex temporal scenario, an area that is still largely unexplored. |
Tasks | Recommendation Systems |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10730v3 |
http://arxiv.org/pdf/1902.10730v3.pdf | |
PWC | https://paperswithcode.com/paper/degenerate-feedback-loops-in-recommender |
Repo | |
Framework | |
Gated Channel Transformation for Visual Recognition
Title | Gated Channel Transformation for Visual Recognition |
Authors | Zongxin Yang, Linchao Zhu, Yu Wu, Yi Yang |
Abstract | In this work, we propose a generally applicable transformation unit for visual recognition with deep convolutional neural networks. This transformation explicitly models channel relationships with explainable control variables. These variables determine the neuron behaviors of competition or cooperation, and they are jointly optimized with the convolutional weights towards more accurate recognition. In Squeeze-and-Excitation (SE) networks, the channel relationships are implicitly learned by fully connected layers, and the SE block is integrated at the block level. We instead introduce a channel normalization layer to reduce the number of parameters and the computational complexity. This lightweight layer incorporates a simple l2 normalization, making our transformation unit applicable at the operator level without much increase in additional parameters. Extensive experiments demonstrate the effectiveness of our unit with clear margins on many vision tasks, e.g., image classification on ImageNet, object detection and instance segmentation on COCO, and video classification on Kinetics. |
Tasks | Image Classification, Instance Segmentation, Object Detection, Semantic Segmentation, Video Classification |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11519v2 |
https://arxiv.org/pdf/1909.11519v2.pdf | |
PWC | https://paperswithcode.com/paper/gated-channel-transformation-for-visual |
Repo | |
Framework | |
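A minimal sketch of a gated channel transformation consistent with the abstract: a per-channel l2 embedding is normalized across channels and turned into a gate with learnable scale and offset. This follows one plausible reading of the description; the exact formulation belongs to the paper.

```python
import numpy as np

def gct(x, alpha, gamma, beta, eps=1e-5):
    """x: (C, H, W) feature map; alpha, gamma, beta: (C,) learnable vectors."""
    C = x.shape[0]
    embed = alpha * np.sqrt((x ** 2).sum(axis=(1, 2)) + eps)       # l2 embedding per channel
    norm = embed * np.sqrt(C) / np.sqrt((embed ** 2).sum() + eps)  # channel normalization
    gate = 1.0 + np.tanh(gamma * norm + beta)                      # competition/cooperation gate
    return x * gate[:, None, None]

x = np.random.default_rng(0).normal(size=(8, 16, 16))
y = gct(x, alpha=np.ones(8), gamma=np.zeros(8), beta=np.zeros(8))
# With gamma = beta = 0 the gate is identically 1, so the unit starts as the identity.
```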