October 19, 2019

Paper Group ANR 193

Deep Supervision with Intermediate Concepts


Title	Deep Supervision with Intermediate Concepts
Authors	Chi Li, M. Zeeshan Zia, Quoc-Huy Tran, Xiang Yu, Gregory D. Hager, Manmohan Chandraker
Abstract	Recent data-driven approaches to scene interpretation predominantly pose inference as an end-to-end black-box mapping, commonly performed by a Convolutional Neural Network (CNN). However, decades of work on perceptual organization in both human and machine vision suggest that there are often intermediate representations that are intrinsic to an inference task, and which provide essential structure to improve generalization. In this work, we explore an approach for injecting prior domain structure into neural network training by supervising hidden layers of a CNN with intermediate concepts that normally are not observed in practice. We formulate a probabilistic framework which formalizes these notions and predicts improved generalization via this deep supervision method. One advantage of this approach is that we are able to train only from synthetic CAD renderings of cluttered scenes, where concept values can be extracted, but apply the results to real images. Our implementation achieves state-of-the-art performance on 2D/3D keypoint localization and image classification on real image benchmarks, including KITTI, PASCAL VOC, PASCAL3D+, IKEA, and CIFAR100. We provide additional evidence that our approach outperforms alternative forms of supervision, such as multi-task networks.
Tasks	Image Classification
Published	2018-01-08
URL	http://arxiv.org/abs/1801.03399v2
PDF	http://arxiv.org/pdf/1801.03399v2.pdf
PWC	https://paperswithcode.com/paper/deep-supervision-with-intermediate-concepts
Repo
Framework

Deep Learning for Automated Classification of Tuberculosis-Related Chest X-Ray: Dataset Specificity Limits Diagnostic Performance Generalizability


Title	Deep Learning for Automated Classification of Tuberculosis-Related Chest X-Ray: Dataset Specificity Limits Diagnostic Performance Generalizability
Authors	Seelwan Sathitratanacheewin, Krit Pongpirul
Abstract	Machine learning has been an emerging tool for various aspects of infectious diseases including tuberculosis surveillance and detection. However, WHO provided no recommendations on using computer-aided tuberculosis detection software because of the small number of studies, methodological limitations, and limited generalizability of the findings. To quantify the generalizability of the machine-learning model, we developed a Deep Convolutional Neural Network (DCNN) model using a TB-specific CXR dataset of one population (National Library of Medicine Shenzhen No.3 Hospital) and tested it with a non-TB-specific CXR dataset of another population (National Institute of Health Clinical Centers). The findings suggested that a supervised deep learning model developed by using the training dataset from one population may not have the same diagnostic performance in another population. Technical specification of CXR images, disease severity distribution, overfitting, and overdiagnosis should be examined before implementation in other settings.
Tasks
Published	2018-11-13
URL	http://arxiv.org/abs/1811.07985v2
PDF	http://arxiv.org/pdf/1811.07985v2.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-automated-classification-of
Repo
Framework

Online Evaluations for Everyone: Mr. DLib’s Living Lab for Scholarly Recommendations


Title	Online Evaluations for Everyone: Mr. DLib’s Living Lab for Scholarly Recommendations
Authors	Joeran Beel, Andrew Collins, Oliver Kopp, Linus W. Dietz, Petr Knoth
Abstract	We introduce the first ‘living lab’ for scholarly recommender systems. This lab allows recommender-system researchers to conduct online evaluations of their novel algorithms for scholarly recommendations, i.e., recommendations for research papers, citations, conferences, research grants, etc. Recommendations are delivered through the living lab’s API to platforms such as reference management software and digital libraries. The living lab is built on top of the recommender-system as-a-service Mr. DLib. Current partners are the reference management software JabRef and the CORE research team. We present the architecture of Mr. DLib’s living lab as well as usage statistics on the first sixteen months of operating it. During this time, 1,826,643 recommendations were delivered with an average click-through rate of 0.21%.
Tasks	Recommendation Systems
Published	2018-07-19
URL	https://arxiv.org/abs/1807.07298v2
PDF	https://arxiv.org/pdf/1807.07298v2.pdf
PWC	https://paperswithcode.com/paper/mr-dlibs-living-lab-for-scholarly
Repo
Framework
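
As a quick sanity check on the usage statistics quoted in the abstract above, the reported click-through rate can be turned back into an approximate click count. This is only a sketch: the abstract gives the rate rounded to two decimals, so the result is an estimate, and the function name `estimated_clicks` is invented here for illustration.

```python
def estimated_clicks(recommendations: int, ctr_percent: float) -> float:
    """Approximate number of clicks implied by a click-through rate given in percent."""
    return recommendations * ctr_percent / 100.0


delivered = 1_826_643  # recommendations delivered, as stated in the abstract
ctr = 0.21             # average click-through rate in percent, as stated in the abstract

clicks = estimated_clicks(delivered, ctr)
print(f"roughly {clicks:,.0f} clicks")
```

Because the published rate is rounded, the true click count could differ from this estimate by a few dozen clicks in either direction.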

Mining Rank Data


Title	Mining Rank Data
Authors	Sascha Henzgen, Eyke Hüllermeier
Abstract	The problem of frequent pattern mining has been studied quite extensively for various types of data, including sets, sequences, and graphs. Somewhat surprisingly, another important type of data, namely rank data, has received very little attention in data mining so far. In this paper, we therefore address the problem of mining rank data, that is, data in the form of rankings (total orders) of an underlying set of items. More specifically, two types of patterns are considered, namely frequent rankings and dependencies between such rankings in the form of association rules. Algorithms for mining frequent rankings and frequent closed rankings are proposed and tested experimentally, using both synthetic and real data.
Tasks
Published	2018-06-15
URL	http://arxiv.org/abs/1806.05897v1
PDF	http://arxiv.org/pdf/1806.05897v1.pdf
PWC	https://paperswithcode.com/paper/mining-rank-data
Repo
Framework
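
To make the notion of a frequent ranking in the abstract above more concrete, here is a minimal sketch of how the support of a sub-ranking could be counted. It assumes that a ranking supports a pattern when the pattern's items appear in the same relative order; the paper's exact definitions and its mining algorithms may differ, and the helper names `supports` and `support` as well as the toy data are hypothetical.

```python
from typing import Sequence


def supports(ranking: Sequence[str], pattern: Sequence[str]) -> bool:
    """True if every item of `pattern` occurs in `ranking` in the same relative order."""
    position = {item: i for i, item in enumerate(ranking)}
    if any(item not in position for item in pattern):
        return False
    ranks = [position[item] for item in pattern]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


def support(data: Sequence[Sequence[str]], pattern: Sequence[str]) -> float:
    """Fraction of rankings in `data` that support `pattern`."""
    return sum(supports(r, pattern) for r in data) / len(data)


# Toy rank data: each row is a total order over the items a, b, c, d.
data = [
    ["a", "b", "c", "d"],
    ["b", "a", "c", "d"],
    ["a", "c", "b", "d"],
    ["d", "a", "b", "c"],
]

print(support(data, ["a", "c"]))       # a is ranked before c in every row
print(support(data, ["a", "b", "d"]))  # holds in the first and third rows only
```

Under this assumed definition, a pattern would count as frequent when its support reaches a user-chosen minimum threshold.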

Efficient architecture for deep neural networks with heterogeneous sensitivity


Title	Efficient architecture for deep neural networks with heterogeneous sensitivity
Authors	Hyunjoong Cho, Jinhyeok Jang, Chanhyeok Lee, Seungjoon Yang
Abstract	This work presents a neural network that consists of nodes with heterogeneous sensitivity. Each node in a network is assigned a variable that determines the sensitivity with which it learns to perform a given task. The network is trained by a constrained optimization that maximizes the sparsity of the sensitivity variables while ensuring the network’s performance. As a result, the network learns to perform a given task using only a small number of sensitive nodes. Insensitive nodes, the nodes with zero sensitivity, can be removed from a trained network to obtain a computationally efficient network. Removing zero-sensitivity nodes has no effect on the network’s performance because the network has already been trained to perform the task without them. The regularization parameter used to solve the optimization problem is found simultaneously during the training of networks. To validate our approach, we design networks with computationally efficient architectures for various tasks such as autoregression, object recognition, facial expression recognition, and object detection using various datasets. In our experiments, the networks designed by the proposed method provide the same or higher performance but with far less computational complexity.
Tasks	Facial Expression Recognition, Object Detection, Object Recognition
Published	2018-10-12
URL	https://arxiv.org/abs/1810.05358v3
PDF	https://arxiv.org/pdf/1810.05358v3.pdf
PWC	https://paperswithcode.com/paper/optimal-architecture-for-deep-neural-networks
Repo
Framework
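
The claim in the abstract above that removing zero-sensitivity nodes leaves the output unchanged can be illustrated with a toy layer. This sketch assumes the sensitivity variable acts as a multiplicative gate on each hidden node's activation, which the abstract does not specify; the weights, the gate values, and the function names `forward` and `prune` are all made up for illustration, and no training or constrained optimization is shown.

```python
def relu(x: float) -> float:
    return max(0.0, x)


def forward(x, W, b, s, v):
    """Single hidden layer with one (assumed multiplicative) sensitivity gate s[j] per node."""
    out = 0.0
    for w_row, b_j, s_j, v_j in zip(W, b, s, v):
        pre_activation = sum(w * x_i for w, x_i in zip(w_row, x)) + b_j
        out += v_j * s_j * relu(pre_activation)
    return out


def prune(W, b, s, v):
    """Drop every hidden node whose sensitivity is exactly zero."""
    keep = [j for j, s_j in enumerate(s) if s_j != 0.0]

    def pick(values):
        return [values[j] for j in keep]

    return pick(W), pick(b), pick(s), pick(v)


# Hypothetical trained parameters: nodes 1 and 3 ended up with zero sensitivity.
W = [[0.5, 1.0], [1.5, 0.2], [-0.3, 0.8], [2.0, 1.0]]
b = [0.1, -0.2, 0.0, 0.3]
s = [1.2, 0.0, 0.7, 0.0]
v = [0.4, -1.1, 0.9, 2.0]
x = [1.0, 2.0]

full_output = forward(x, W, b, s, v)
small_output = forward(x, *prune(W, b, s, v))
print(full_output, small_output)  # the pruned network uses 2 of the 4 hidden nodes
```

The two outputs agree because a node gated by zero contributes nothing to the sum, so dropping it only saves computation.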

Semi-supervised Content-based Detection of Misinformation via Tensor Embeddings


Title	Semi-supervised Content-based Detection of Misinformation via Tensor Embeddings
Authors	Gisel Bastidas Guacho, Sara Abdali, Neil Shah, Evangelos E. Papalexakis
Abstract	Fake news may be intentionally created to promote economic, political and social interests, and can lead to negative impacts on human beliefs and decisions. Hence, detection of fake news is an emerging problem that has become extremely prevalent during the last few years. Most existing works on this topic focus on manual feature extraction and supervised classification models leveraging a large number of labeled (fake or real) articles. In contrast, we focus on content-based detection of fake news articles, while assuming that we have a small amount of labels, made available by manual fact-checkers or automated sources. We argue this is a more realistic setting in the presence of massive amounts of content, most of which cannot be easily fact-checked. To that end, we represent collections of news articles as multi-dimensional tensors, leverage tensor decomposition to derive concise article embeddings that capture spatial/contextual information about each news article, and use those embeddings to create an article-by-article graph on which we propagate limited labels. Results on three real-world datasets show that our method performs on par with or better than existing models that are fully supervised, in that we achieve better detection accuracy using fewer labels. In particular, our proposed method achieves 75.43% accuracy using only 30% of the labels of a public dataset, while an SVM-based classifier achieved 67.43%. Furthermore, our method achieves 70.92% accuracy on a large dataset using only 2% of the labels.
Tasks
Published	2018-04-24
URL	http://arxiv.org/abs/1804.09088v1
PDF	http://arxiv.org/pdf/1804.09088v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-content-based-detection-of
Repo
Framework

Unsupervised Machine Commenting with Neural Variational Topic Model


Title	Unsupervised Machine Commenting with Neural Variational Topic Model
Authors	Shuming Ma, Lei Cui, Furu Wei, Xu Sun
Abstract	Article comments can provide supplementary opinions and facts for readers, thereby increasing the attraction and engagement of articles. Therefore, automatic commenting is helpful in improving the activeness of communities such as online forums and news websites. Previous work shows that training an automatic commenting system requires large parallel corpora. Although some articles are naturally paired with comments on some websites, most articles and comments are unpaired on the Internet. To fully exploit the unpaired data, we completely remove the need for parallel data and propose a novel unsupervised approach to train an automatic article commenting model, relying on nothing but unpaired articles and comments. Our model is based on a retrieval-based commenting framework, which uses news to retrieve comments based on the similarity of their topics. The topic representation is obtained from a neural variational topic model, which is trained in an unsupervised manner. We evaluate our model on a news comment dataset. Experiments show that our proposed topic-based approach significantly outperforms previous lexicon-based models. The model also profits from paired corpora and achieves state-of-the-art performance under semi-supervised scenarios.
Tasks
Published	2018-09-13
URL	http://arxiv.org/abs/1809.04960v1
PDF	http://arxiv.org/pdf/1809.04960v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-machine-commenting-with-neural
Repo
Framework

A Novel Geometric Framework on Gram Matrix Trajectories for Human Behavior Understanding


Title	A Novel Geometric Framework on Gram Matrix Trajectories for Human Behavior Understanding
Authors	Anis Kacem, Mohamed Daoudi, Boulbaba Ben Amor, Stefano Berretti, Juan Carlos Alvarez-Paiva
Abstract	In this paper, we propose a novel space-time geometric representation of human landmark configurations and derive tools for comparison and classification. We model the temporal evolution of landmarks as parametrized trajectories on the Riemannian manifold of positive semidefinite matrices of fixed rank. Our representation has the benefit of naturally bringing in a second desirable quantity when comparing shapes, the spatial covariance, in addition to the conventional affine-shape representation. We then derive geometric and computational tools for rate-invariant analysis and adaptive re-sampling of trajectories, grounded in the Riemannian geometry of the underlying manifold. Specifically, our approach involves three steps: (1) landmarks are first mapped into the Riemannian manifold of positive semidefinite matrices of fixed rank to build time-parameterized trajectories; (2) a temporal warping is performed on the trajectories, providing a geometry-aware (dis-)similarity measure between them; (3) finally, a pairwise proximity function SVM is used to classify them, incorporating the (dis-)similarity measure into the kernel function. We show that such representation and metric achieve competitive results in applications such as action recognition and emotion recognition from 3D skeletal data, and facial expression recognition from videos. Experiments have been conducted on several publicly available up-to-date benchmarks.
Tasks	Emotion Recognition, Facial Expression Recognition, Temporal Action Localization
Published	2018-06-29
URL	http://arxiv.org/abs/1807.00676v1
PDF	http://arxiv.org/pdf/1807.00676v1.pdf
PWC	https://paperswithcode.com/paper/a-novel-geometric-framework-on-gram-matrix
Repo
Framework
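
Step (1) of the abstract above maps landmark configurations to positive semidefinite matrices of fixed rank. A minimal sketch of that idea, assuming the mapping is the plain Gram matrix of the landmark coordinates (the paper may center or normalize the landmarks first), shows one reason such a representation is convenient: it does not change when the whole configuration is rotated. The landmark values and the helper names `gram` and `rotate2d` are hypothetical.

```python
import math


def gram(Z):
    """Gram matrix G = Z Z^T of an n x d landmark matrix (n landmarks, d coordinates)."""
    return [[sum(a * b for a, b in zip(zi, zj)) for zj in Z] for zi in Z]


def rotate2d(Z, theta):
    """Rotate every 2D landmark by the angle theta (in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * x - s * y, s * x + c * y] for x, y in Z]


# Four made-up 2D landmarks; G is 4 x 4, symmetric, and has rank at most 2.
landmarks = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0], [-1.0, 1.0]]

G = gram(landmarks)
G_rotated = gram(rotate2d(landmarks, 0.7))

# Largest entrywise difference between the two Gram matrices (numerically zero).
max_diff = max(abs(a - b) for ra, rb in zip(G, G_rotated) for a, b in zip(ra, rb))
print(max_diff)
```

A sequence of such matrices, one per video frame, would then form the trajectory that steps (2) and (3) compare and classify; those steps are not sketched here.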

Cross-Cultural and Cultural-Specific Production and Perception of Facial Expressions of Emotion in the Wild


Title	Cross-Cultural and Cultural-Specific Production and Perception of Facial Expressions of Emotion in the Wild
Authors	Ramprakash Srinivasan, Aleix M. Martinez
Abstract	Automatic recognition of emotion from facial expressions is an intense area of research, with a potentially long list of important applications. Yet, the study of emotion requires knowing which facial expressions are used within and across cultures in the wild, not in controlled lab conditions; but such studies do not exist. Which and how many cross-cultural and cultural-specific facial expressions do people commonly use? And, what affect variables does each expression communicate to observers? If we are to design technology that understands the emotion of users, we need answers to these two fundamental questions. In this paper, we present the first large-scale study of the production and visual perception of facial expressions of emotion in the wild. We find that of the 16,384 possible facial configurations that people can theoretically produce, only 35 are successfully used to transmit emotive information across cultures, and only 8 within a smaller number of cultures. Crucially, we find that visual analysis of cross-cultural expressions yields consistent perception of emotion categories and valence, but not arousal. In contrast, visual analysis of cultural-specific expressions yields consistent perception of valence and arousal, but not of emotion categories. Additionally, we find that the number of expressions used to communicate each emotion is also different, e.g., 17 expressions transmit happiness, but only 1 is used to convey disgust.
Tasks
Published	2018-08-13
URL	http://arxiv.org/abs/1808.04399v1
PDF	http://arxiv.org/pdf/1808.04399v1.pdf
PWC	https://paperswithcode.com/paper/cross-cultural-and-cultural-specific
Repo
Framework

Expression Empowered ResiDen Network for Facial Action Unit Detection


Title	Expression Empowered ResiDen Network for Facial Action Unit Detection
Authors	Shreyank Jyoti, Abhinav Dhall
Abstract	The paper explores the topic of Facial Action Unit (FAU) detection in the wild. In particular, we are interested in answering the following questions: (1) how useful are residual connections across dense blocks for face analysis? (2) how useful is the information from a network trained for categorical Facial Expression Recognition (FER) for the task of FAU detection? The proposed network (ResiDen) exploits dense blocks along with residual connections and uses auxiliary information from a FER network. The experiments are performed on the EmotionNet and DISFA datasets. The experiments show the usefulness of facial expression information for AU detection. The proposed network achieves state-of-the-art results on the two databases. Analysis of the results for the cross-database protocol shows the effectiveness of the network.
Tasks	Action Unit Detection, Facial Action Unit Detection, Facial Expression Recognition
Published	2018-06-13
URL	http://arxiv.org/abs/1806.04957v1
PDF	http://arxiv.org/pdf/1806.04957v1.pdf
PWC	https://paperswithcode.com/paper/expression-empowered-residen-network-for
Repo
Framework

Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations


Title	Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations
Authors	Dipendra Misra, Ming-Wei Chang, Xiaodong He, Wen-tau Yih
Abstract	Semantic parsing from denotations faces two key challenges in model training: (1) given only the denotations (e.g., answers), search for good candidate semantic parses, and (2) choose the best model update algorithm. We propose effective and general solutions to each of them. Using policy shaping, we bias the search procedure towards semantic parses that are more compatible with the text, which provide better supervision signals for training. In addition, we propose an update equation that generalizes three different families of learning algorithms, which enables fast model exploration. When evaluated on a recently proposed sequential question answering dataset, our framework leads to a new state-of-the-art model that outperforms previous work by 5.0% absolute on exact match accuracy.
Tasks	Question Answering, Semantic Parsing
Published	2018-09-05
URL	http://arxiv.org/abs/1809.01299v1
PDF	http://arxiv.org/pdf/1809.01299v1.pdf
PWC	https://paperswithcode.com/paper/policy-shaping-and-generalized-update
Repo
Framework

Deploying Deep Ranking Models for Search Verticals


Title	Deploying Deep Ranking Models for Search Verticals
Authors	Rohan Ramanath, Gungor Polatkan, Liqin Xu, Harold Lee, Bo Hu, Shan Zhou
Abstract	In this paper, we present an architecture for executing a complex machine learning model, such as a neural network capturing semantic similarity between a query and a document, and deploy it to a real-world production system serving 500M+ users. We present the challenges that arise in a real-world system and how we solve them. We demonstrate that our architecture provides competitive modeling capability without any significant performance impact to the system in terms of latency. Our modular solution and insights can be used by other real-world search systems to realize and productionize recent gains in neural networks.
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2018-06-06
URL	http://arxiv.org/abs/1806.02281v1
PDF	http://arxiv.org/pdf/1806.02281v1.pdf
PWC	https://paperswithcode.com/paper/deploying-deep-ranking-models-for-search
Repo
Framework

Autonomous Vehicle Speed Control for Safe Navigation of Occluded Pedestrian Crosswalk


Title	Autonomous Vehicle Speed Control for Safe Navigation of Occluded Pedestrian Crosswalk
Authors	Sarah Thornton
Abstract	Both humans and the sensors on an autonomous vehicle have limited sensing capabilities. When these limitations coincide with scenarios involving vulnerable road users, it becomes important to account for these limitations in the motion planner. For the scenario of an occluded pedestrian crosswalk, the speed of the approaching vehicle should be a function of the amount of uncertainty on the roadway. In this work, the longitudinal controller is formulated as a partially observable Markov decision process and dynamic programming is used to compute the control policy. The control policy scales the speed profile to be used by a model predictive steering controller.
Tasks
Published	2018-02-18
URL	http://arxiv.org/abs/1802.06314v1
PDF	http://arxiv.org/pdf/1802.06314v1.pdf
PWC	https://paperswithcode.com/paper/autonomous-vehicle-speed-control-for-safe
Repo
Framework

Interactive Full Image Segmentation by Considering All Regions Jointly


Title	Interactive Full Image Segmentation by Considering All Regions Jointly
Authors	Eirikur Agustsson, Jasper R. R. Uijlings, Vittorio Ferrari
Abstract	We address interactive full image annotation, where the goal is to accurately segment all object and stuff regions in an image. We propose an interactive, scribble-based annotation framework which operates on the whole image to produce segmentations for all regions. This enables sharing scribble corrections across regions, and allows the annotator to focus on the largest errors made by the machine across the whole image. To realize this, we adapt Mask-RCNN into a fast interactive segmentation framework and introduce an instance-aware loss measured at the pixel-level in the full image canvas, which lets predictions for nearby regions properly compete for space. Finally, we compare to interactive single object segmentation on the COCO panoptic dataset. We demonstrate that our interactive full image segmentation approach leads to a 5% IoU gain, reaching 90% IoU at a budget of four extreme clicks and four corrective scribbles per region.
Tasks	Interactive Segmentation, Semantic Segmentation
Published	2018-12-05
URL	http://arxiv.org/abs/1812.01888v2
PDF	http://arxiv.org/pdf/1812.01888v2.pdf
PWC	https://paperswithcode.com/paper/interactive-full-image-segmentation
Repo
Framework
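
The results in the abstract above are reported as IoU (intersection over union). For readers unfamiliar with the metric, here is a minimal sketch for two binary masks stored as nested lists of 0/1 values. The masks are toy data, the function name `iou` is chosen here for illustration, and how the paper averages IoU over regions and images is not stated in the abstract.

```python
def iou(pred, target):
    """Intersection over union of two equally sized binary masks (nested lists of 0/1)."""
    pairs = [(p, t) for row_p, row_t in zip(pred, target) for p, t in zip(row_p, row_t)]
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    # Two empty masks agree perfectly; treating that case as IoU 1.0 is a convention.
    return intersection / union if union else 1.0


predicted = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]
ground_truth = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]

# 2 pixels overlap and 6 pixels are covered by at least one mask.
print(iou(predicted, ground_truth))
```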

Bridge type classification: supervised learning on a modified NBI dataset


Title	Bridge type classification: supervised learning on a modified NBI dataset
Authors	Achyuthan Jootoo, David Lattanzi
Abstract	A key phase in the bridge design process is the selection of the structural system. Due to budget and time constraints, engineers typically rely on engineering judgment and prior experience when selecting a structural system, often considering a limited range of design alternatives. The objective of this study was to explore the suitability of supervised machine learning as a preliminary design aid that provides guidance to engineers with regard to the statistically optimal bridge type to choose, ultimately improving the likelihood of optimized design, design standardization, and reduced maintenance costs. In order to devise this supervised learning system, data for over 600,000 bridges from the National Bridge Inventory database were analyzed. Key attributes for determining the bridge structure type were identified through three feature selection techniques. Potentially useful attributes like seismic intensity and historic data on the cost of materials (steel and concrete) were then added from the US Geological Survey (USGS) database and Engineering News Record. Decision trees, Bayesian networks, and Support Vector Machines were used for predicting the bridge design type. Due to state-to-state variations in material availability, material costs, and design codes, supervised learning models based on the complete data set did not yield favorable results. Supervised learning models were then trained and tested using 10-fold cross validation on data for each state. Inclusion of seismic data improved the model performance noticeably. The data was then resampled to reduce the bias of the models towards more common design types, and the supervised learning models thus constructed showed further improvements in performance. The average recall and precision for the state models were 88.6% and 88.0% using Decision Trees, 84.0% and 83.7% using Bayesian Networks, and 80.8% and 75.6% using SVM.
Tasks	Feature Selection
Published	2018-02-12
URL	http://arxiv.org/abs/1803.04478v1
PDF	http://arxiv.org/pdf/1803.04478v1.pdf
PWC	https://paperswithcode.com/paper/bridge-type-classification-supervised
Repo
Framework
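
The abstract above summarizes each model by its recall and precision. As a reminder of how these are computed for a multi-class problem such as bridge type, here is a minimal sketch that derives per-class precision and recall from true and predicted labels. The bridge-type labels and predictions are invented, the function name is hypothetical, and the abstract does not say how the per-state figures were averaged.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Map each class to (precision, recall); a zero denominator yields 0.0."""
    true_positives = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    predicted = Counter(y_pred)
    actual = Counter(y_true)
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = true_positives[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / actual[label] if actual[label] else 0.0
        scores[label] = (precision, recall)
    return scores


# Made-up labels for six bridges.
y_true = ["girder", "girder", "girder", "slab", "slab", "truss"]
y_pred = ["girder", "girder", "slab", "slab", "slab", "girder"]

scores = per_class_precision_recall(y_true, y_pred)
for label, (precision, recall) in scores.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

In this toy data the rare class "truss" is never predicted, which is the kind of bias towards common design types that the resampling step described in the abstract is meant to reduce.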