January 25, 2020

3734 words 18 mins read

Paper Group ANR 1765

Learning Residual Flow as Dynamic Motion from Stereo Videos. Using image-extracted features to determine heart rate and blink duration for driver sleepiness detection. Apricot variety classification using image processing and machine learning approaches. Scalable Global Alignment Graph Kernel Using Random Features: From Node Embedding to Graph Embe …

Learning Residual Flow as Dynamic Motion from Stereo Videos

Title Learning Residual Flow as Dynamic Motion from Stereo Videos
Authors Seokju Lee, Sunghoon Im, Stephen Lin, In So Kweon
Abstract We present a method for decomposing the 3D scene flow observed from a moving stereo rig into stationary scene elements and dynamic object motion. Our unsupervised learning framework jointly reasons about the camera motion, optical flow, and 3D motion of moving objects. Three cooperating networks predict stereo matching, camera motion, and residual flow, which represents the flow component due to object motion and not from camera motion. Based on rigid projective geometry, the estimated stereo depth is used to guide the camera motion estimation, and the depth and camera motion are used to guide the residual flow estimation. We also explicitly estimate the 3D scene flow of dynamic objects based on the residual flow and scene depth. Experiments on the KITTI dataset demonstrate the effectiveness of our approach and show that our method outperforms other state-of-the-art algorithms on the optical flow and visual odometry tasks.
Tasks Depth And Camera Motion, Motion Estimation, Optical Flow Estimation, Stereo Matching, Visual Odometry
Published 2019-09-16
URL https://arxiv.org/abs/1909.06999v1
PDF https://arxiv.org/pdf/1909.06999v1.pdf
PWC https://paperswithcode.com/paper/learning-residual-flow-as-dynamic-motion-from
Repo
Framework
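
The decomposition described in the abstract, residual flow as the observed flow minus the camera-induced rigid flow, can be sketched with standard projective geometry. The snippet below is illustrative only; the function names, warping conventions, and coordinate handling are my assumptions, not the paper's code.

```python
import numpy as np

def rigid_flow(depth, K, R, t):
    """Per-pixel 2D flow induced by camera motion alone (static scene).
    depth: (H, W) depth map; K: (3, 3) intrinsics; R, t: ego-motion."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x HW
    cam = np.linalg.inv(K) @ pix * depth.reshape(1, -1)  # back-project to 3D
    cam2 = R @ cam + t.reshape(3, 1)                     # apply camera motion
    proj = K @ cam2
    proj = proj[:2] / proj[2:3]                          # re-project to pixels
    return (proj - pix[:2]).T.reshape(H, W, 2)

def residual_flow(total_flow, depth, K, R, t):
    """Flow component due to object motion: observed flow minus rigid flow."""
    return total_flow - rigid_flow(depth, K, R, t)
```

With identity rotation and zero translation the rigid flow vanishes, so the residual equals the observed flow, matching the intuition that all remaining motion belongs to dynamic objects.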
Using image-extracted features to determine heart rate and blink duration for driver sleepiness detection

Title Using image-extracted features to determine heart rate and blink duration for driver sleepiness detection
Authors Erfan Darzi, Armin Mohammadie-Zand, Hamid Soltanian-Zadeh
Abstract Heart rate and blink duration are two vital physiological signals which give information about cardiac activity and consciousness. Monitoring these two signals is crucial for various applications such as driver drowsiness detection. As there are several problems posed by the conventional systems to be used for continuous, long-term monitoring, a remote blink and ECG monitoring system can be used as an alternative. For estimating the blink duration, two strategies are used. In the first approach, pictures of open and closed eyes are fed into an Artificial Neural Network (ANN) to decide whether the eyes are open or closed. In the second approach, they are classified and labeled using Linear Discriminant Analysis (LDA). The labeled images are then used to determine the blink duration. For heart rate variability, two strategies are used to evaluate the passing blood volume: Independent Component Analysis (ICA) and a chrominance-based method. Eye recognition yielded 78-92% accuracy in classifying open/closed eyes with ANN and 71-91% accuracy with LDA. Heart rate evaluations had a mean loss of around 16 Beats Per Minute (BPM) for the ICA strategy and 13 BPM for the chrominance-based technique.
Tasks Heart Rate Variability
Published 2019-11-04
URL https://arxiv.org/abs/1911.01333v2
PDF https://arxiv.org/pdf/1911.01333v2.pdf
PWC https://paperswithcode.com/paper/using-image-extracted-features-to-determine
Repo
Framework
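
The second stage of the blink pipeline, turning per-frame open/closed labels into blink durations, reduces to counting runs of consecutive closed-eye frames. A minimal sketch; the function name and the label convention (1 = closed) are mine, not the paper's:

```python
def blink_durations(closed, fps):
    """Durations in seconds of each blink, given per-frame labels (1 = closed)."""
    durations, run = [], 0
    for c in closed:
        if c:
            run += 1            # extend the current closed-eye run
        elif run:
            durations.append(run / fps)  # run ended: convert frames to seconds
            run = 0
    if run:                     # handle a blink that reaches the last frame
        durations.append(run / fps)
    return durations
```

For example, at 10 fps the label sequence [0, 1, 1, 0, 1, 0] yields two blinks lasting 0.2 s and 0.1 s.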

Apricot variety classification using image processing and machine learning approaches

Title Apricot variety classification using image processing and machine learning approaches
Authors Seyed Vahid Mirnezami, Ali HamidiSepehr, Mahdi Ghaebi
Abstract Apricot, which is a cultivated type of Zerdali (wild apricot), has an important place in human nutrition, and its medical properties are essential for human health. The objective of this research was to obtain a model for apricot mass and to separate apricot varieties with image processing technology using external features of apricot fruit. In this study, five varieties of apricot were used. In order to determine the size of the fruits, three mutually perpendicular axes were defined: length, width, and thickness. Measurements show that the effect of variety on all properties was statistically significant at the 1% probability level. Furthermore, there is no significant difference between the dimensions estimated by the image processing approach and the actual dimensions. The developed system consists of a digital camera, a light diffusion chamber, a distance adjustment pedestal, and a personal computer. Images taken by the digital camera were stored in RGB format for further analysis. The images were taken for 49 samples of each cultivar in three directions. A linear equation is recommended to calculate the apricot mass based on the length and the width, with R^2 = 0.97. In addition, an ANFIS model with C-means was the best model for classifying the apricot varieties based on the physical features including length, width, thickness, mass, and projected area of three perpendicular surfaces. The accuracy of the model was 87.7%.
Tasks
Published 2019-12-27
URL https://arxiv.org/abs/1912.11953v1
PDF https://arxiv.org/pdf/1912.11953v1.pdf
PWC https://paperswithcode.com/paper/apricot-variety-classification-using-image
Repo
Framework
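
The recommended linear mass model (mass predicted from length and width, R^2 = 0.97) can be reproduced in spirit with ordinary least squares. A hedged sketch on synthetic data; `fit_mass_model` and the coefficients are illustrative, not the authors' pipeline:

```python
import numpy as np

def fit_mass_model(length, width, mass):
    """Fit mass ~ a*length + b*width + c by least squares; return (coeffs, R^2)."""
    X = np.column_stack([length, width, np.ones_like(length)])
    coef, *_ = np.linalg.lstsq(X, mass, rcond=None)
    pred = X @ coef
    ss_res = np.sum((mass - pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((mass - mass.mean()) ** 2)   # total sum of squares
    return coef, 1.0 - ss_res / ss_tot
```

On exactly linear toy data the fit recovers the generating coefficients with R^2 = 1; on real fruit measurements one would expect the roughly 0.97 reported above.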

Scalable Global Alignment Graph Kernel Using Random Features: From Node Embedding to Graph Embedding

Title Scalable Global Alignment Graph Kernel Using Random Features: From Node Embedding to Graph Embedding
Authors Lingfei Wu, Ian En-Hsu Yen, Zhen Zhang, Kun Xu, Liang Zhao, Xi Peng, Yinglong Xia, Charu Aggarwal
Abstract Graph kernels are widely used for measuring the similarity between graphs. Many existing graph kernels, which focus on local patterns within graphs rather than their global properties, suffer from significant structure information loss when representing graphs. Some recent global graph kernels, which utilize the alignment of geometric node embeddings of graphs, yield state-of-the-art performance. However, these graph kernels are not necessarily positive-definite. More importantly, computing the graph kernel matrix will have at least quadratic time complexity in terms of the number and the size of the graphs. In this paper, we propose a new family of global alignment graph kernels, which take into account the global properties of graphs by using geometric node embeddings and an associated node transportation based on earth mover’s distance. Compared to existing global kernels, the proposed kernel is positive-definite. Our graph kernel is obtained by defining a distribution over \emph{random graphs}, which can naturally yield random feature approximations. The random feature approximations lead to our graph embeddings, which are named “random graph embeddings” (RGE). In particular, RGE is shown to achieve \emph{(quasi-)linear scalability} with respect to the number and the size of the graphs. The experimental results on nine benchmark datasets demonstrate that RGE outperforms or matches twelve state-of-the-art graph classification algorithms.
Tasks Graph Classification, Graph Embedding
Published 2019-11-25
URL https://arxiv.org/abs/1911.11119v1
PDF https://arxiv.org/pdf/1911.11119v1.pdf
PWC https://paperswithcode.com/paper/scalable-global-alignment-graph-kernel-using
Repo
Framework
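
The core ingredient of RGE, a node-transportation cost between two graphs' node embedding sets followed by a random-feature map, can be illustrated on toy inputs. The brute-force matching below is only a stand-in for the earth mover's distance (it assumes equal-size node sets and uniform weights), and all names are mine:

```python
import itertools
import math

def transport_cost(X, Y):
    """Minimum total distance over node matchings between embedding sets X and Y.
    Brute force over permutations: a toy proxy for earth mover's distance."""
    best = math.inf
    for perm in itertools.permutations(range(len(Y))):
        cost = sum(math.dist(X[i], Y[j]) for i, j in enumerate(perm))
        best = min(best, cost)
    return best

def rge_embedding(graph_nodes, random_graphs, gamma=1.0):
    """One feature per sampled random graph: exp(-gamma * transport cost),
    so the inner product of two embeddings approximates the alignment kernel."""
    return [math.exp(-gamma * transport_cost(graph_nodes, W))
            for W in random_graphs]
```

Each graph thus becomes a fixed-length vector, and classification proceeds with any linear model on top, which is where the (quasi-)linear scalability of the full method comes from.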

Labeled Graph Generative Adversarial Networks

Title Labeled Graph Generative Adversarial Networks
Authors Shuangfei Fan, Bert Huang
Abstract As a new way to train generative models, generative adversarial networks (GANs) have achieved considerable success in image generation, and this framework has also recently been applied to data with graph structures. We identify the drawbacks of existing deep frameworks for generating graphs, and we propose labeled-graph generative adversarial networks (LGGAN) to train deep generative models for graph-structured data with node labels. We test the approach on various types of graph datasets, such as collections of citation networks and protein graphs. Experiment results show that our model can generate diverse labeled graphs that match the structural characteristics of the training data and outperforms all baselines in terms of quality, generality, and scalability. To further evaluate the quality of the generated graphs, we apply it to a downstream task for graph classification, and the results show that LGGAN can better capture the important aspects of the graph structure.
Tasks Graph Classification, Image Generation
Published 2019-06-07
URL https://arxiv.org/abs/1906.03220v1
PDF https://arxiv.org/pdf/1906.03220v1.pdf
PWC https://paperswithcode.com/paper/labeled-graph-generative-adversarial-networks
Repo
Framework

Reviewing and Improving the Gaussian Mechanism for Differential Privacy

Title Reviewing and Improving the Gaussian Mechanism for Differential Privacy
Authors Jun Zhao, Teng Wang, Tao Bai, Kwok-Yan Lam, Zhiying Xu, Shuyu Shi, Xuebin Ren, Xinyu Yang, Yang Liu, Han Yu
Abstract Differential privacy provides a rigorous framework to quantify data privacy, and has received considerable interest recently. A randomized mechanism satisfying $(\epsilon, \delta)$-differential privacy (DP) roughly means that, except with a small probability $\delta$, altering a record in a dataset cannot change the probability that an output is seen by more than a multiplicative factor $e^{\epsilon}$. A well-known solution to $(\epsilon, \delta)$-DP is the Gaussian mechanism initiated by Dwork et al. [1] in 2006 with an improvement by Dwork and Roth [2] in 2014, where a Gaussian noise amount $\sqrt{2\ln \frac{2}{\delta}} \times \frac{\Delta}{\epsilon}$ of [1] or $\sqrt{2\ln \frac{1.25}{\delta}} \times \frac{\Delta}{\epsilon}$ of [2] is added independently to each dimension of the query result, for a query with $\ell_2$-sensitivity $\Delta$. Although both classical Gaussian mechanisms [1,2] assume $0 < \epsilon \leq 1$, our review finds that many studies in the literature have used the classical Gaussian mechanisms under values of $\epsilon$ and $\delta$ where the added noise amounts of [1,2] do not achieve $(\epsilon,\delta)$-DP. We obtain this result by analyzing the optimal noise amount $\sigma_{DP-OPT}$ for $(\epsilon,\delta)$-DP and identifying $\epsilon$ and $\delta$ where the noise amounts of classical mechanisms are even less than $\sigma_{DP-OPT}$. Since $\sigma_{DP-OPT}$ has no closed-form expression and needs to be approximated in an iterative manner, we propose Gaussian mechanisms by deriving closed-form upper bounds for $\sigma_{DP-OPT}$. Our mechanisms achieve $(\epsilon,\delta)$-DP for any $\epsilon$, while the classical mechanisms [1,2] do not achieve $(\epsilon,\delta)$-DP for large $\epsilon$ given $\delta$. Moreover, the utilities of our mechanisms improve those of [1,2] and are close to that of the optimal yet more computationally expensive Gaussian mechanism.
Tasks
Published 2019-11-27
URL https://arxiv.org/abs/1911.12060v2
PDF https://arxiv.org/pdf/1911.12060v2.pdf
PWC https://paperswithcode.com/paper/reviewing-and-improving-the-gaussian
Repo
Framework
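
The classical noise scale of [2] quoted in the abstract, $\sqrt{2\ln \frac{1.25}{\delta}} \times \frac{\Delta}{\epsilon}$, is easy to state in code. This sketch implements only that calibration; as the paper stresses, it guarantees $(\epsilon, \delta)$-DP only for $0 < \epsilon \leq 1$, which is exactly the pitfall the paper identifies:

```python
import math
import random

def classical_gaussian_sigma(eps, delta, sensitivity):
    """Dwork-Roth (2014) noise scale: sqrt(2 ln(1.25/delta)) * Delta / eps.
    Valid as an (eps, delta)-DP calibration only for 0 < eps <= 1."""
    return math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / eps

def gaussian_mechanism(query_result, eps, delta, sensitivity, rng=random):
    """Add i.i.d. Gaussian noise to each dimension of the query result."""
    sigma = classical_gaussian_sigma(eps, delta, sensitivity)
    return [x + rng.gauss(0, sigma) for x in query_result]
```

For eps = 1, delta = 1e-5, and unit sensitivity this gives sigma of roughly 4.84; the paper's contribution is a closed-form calibration that remains valid when eps exceeds 1.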

Active Manifolds: A non-linear analogue to Active Subspaces

Title Active Manifolds: A non-linear analogue to Active Subspaces
Authors Robert A. Bridges, Anthony D. Gruber, Christopher Felder, Miki Verma, Chelsey Hoff
Abstract We present an approach to analyze $C^1(\mathbb{R}^m)$ functions that addresses limitations present in the Active Subspaces (AS) method of Constantine et al. (2015; 2014). Under appropriate hypotheses, our Active Manifolds (AM) method identifies a 1-D curve in the domain (the active manifold) on which nearly all values of the unknown function are attained, and which can be exploited for approximation or analysis, especially when $m$ is large (high-dimensional input space). We provide theorems justifying our AM technique and an algorithm permitting functional approximation and sensitivity analysis. Using accessible, low-dimensional functions as initial examples, we show AM reduces approximation error by an order of magnitude compared to AS, at the expense of more computation. Following this, we revisit the sensitivity analysis by Glaws et al. (2017), who apply AS to analyze a magnetohydrodynamic power generator model, and compare the performance of AM on the same data. Our analysis provides detailed information not captured by AS, exhibiting the influence of each parameter individually along an active manifold. Overall, AM represents a novel technique for analyzing functional models with benefits including: reducing $m$-dimensional analysis to a 1-D analogue, permitting more accurate regression than AS (at more computational expense), enabling more informative sensitivity analysis, and granting accessible visualizations (2-D plots) of parameter sensitivity along the AM.
Tasks
Published 2019-04-30
URL https://arxiv.org/abs/1904.13386v4
PDF https://arxiv.org/pdf/1904.13386v4.pdf
PWC https://paperswithcode.com/paper/active-manifolds-a-non-linear-analogue-to
Repo
Framework

Can AI Generate Love Advice?: Toward Neural Answer Generation for Non-Factoid Questions

Title Can AI Generate Love Advice?: Toward Neural Answer Generation for Non-Factoid Questions
Authors Makoto Nakatsuji
Abstract Deep learning methods that extract answers for non-factoid questions from QA sites are seen as critical since they can assist users in reaching their next decisions through conversations with AI systems. The current methods, however, have the following two problems: (1) They cannot understand the ambiguous use of words in the questions as word usage can strongly depend on the context. As a result, the accuracies of their answer selections are not good enough. (2) The current methods can only select from among the answers held by QA sites and cannot generate new ones. Thus, they cannot answer the questions that are somewhat different from those stored in QA sites. Our solution, Neural Answer Construction Model, tackles these problems as it: (1) Incorporates the biases of semantics behind questions into word embeddings while also computing them regardless of the semantics. As a result, it can extract answers that suit the contexts of words used in the question as well as following the common usage of words across semantics. This improves the accuracy of answer selection. (2) Uses biLSTM to compute the embeddings of questions as well as those of the sentences often used to form answers. It then simultaneously learns the optimum combination of those sentences as well as the closeness between the question and those sentences. As a result, our model can construct an answer that corresponds to the situation that underlies the question; it fills the gap between answer selection and generation and is the first model to move beyond the current simple answer selection model for non-factoid QAs. Evaluations using datasets created for love advice stored in the Japanese QA site, Oshiete goo, indicate that our model achieves 20% higher accuracy in answer creation than the strong baselines. Our model is practical and has already been applied to the love advice service in Oshiete goo.
Tasks Answer Selection, Word Embeddings
Published 2019-12-06
URL https://arxiv.org/abs/1912.10163v1
PDF https://arxiv.org/pdf/1912.10163v1.pdf
PWC https://paperswithcode.com/paper/can-ai-generate-love-advice-toward-neural
Repo
Framework

Clustering-Based Collaborative Filtering Using an Incentivized/Penalized User Model

Title Clustering-Based Collaborative Filtering Using an Incentivized/Penalized User Model
Authors Cong Tran, Jang-Young Kim, Won-Yong Shin, Sang-Wook Kim
Abstract Giving or recommending appropriate content based on the quality of experience is the most important and challenging issue in recommender systems. As collaborative filtering (CF) is one of the most prominent and popular techniques used for recommender systems, we propose a new clustering-based CF (CBCF) method using an incentivized/penalized user (IPU) model only with ratings given by users, which is thus easy to implement. We aim to design such a simple clustering-based approach with no further prior information while improving the recommendation accuracy. To be precise, the purpose of CBCF with the IPU model is to improve recommendation performance such as precision, recall, and $F_1$ score by carefully exploiting different preferences among users. Specifically, we formulate a constrained optimization problem, in which we aim to maximize the recall (or equivalently $F_1$ score) for a given precision. To this end, users are divided into several clusters based on the actual rating data and Pearson correlation coefficient. Afterwards, we give each item an incentive/penalty according to the preference tendency by users within the same cluster. Our experimental results show a significant performance improvement over the baseline CF scheme without clustering in terms of recall or $F_1$ score for a given precision.
Tasks Recommendation Systems
Published 2019-05-01
URL http://arxiv.org/abs/1905.01990v1
PDF http://arxiv.org/pdf/1905.01990v1.pdf
PWC https://paperswithcode.com/paper/clustering-based-collaborative-filtering
Repo
Framework

Real-time 3D Shape Instantiation for Partially-deployed Stent Segment from a Single 2D Fluoroscopic Image in Robot-assisted Fenestrated Endovascular Aortic Repair

Title Real-time 3D Shape Instantiation for Partially-deployed Stent Segment from a Single 2D Fluoroscopic Image in Robot-assisted Fenestrated Endovascular Aortic Repair
Authors Jian-Qing Zheng, Xiao-Yun Zhou, Guang-Zhong Yang
Abstract In robot-assisted Fenestrated Endovascular Aortic Repair (FEVAR), accurate alignment of stent graft fenestrations or scallops with aortic branches is essential for establishing complete blood flow perfusion. Current navigation is largely based on 2D fluoroscopic images, which lacks 3D anatomical information, thus causing longer operation time as well as high risks of radiation exposure. Previously, 3D shape instantiation frameworks for real-time 3D shape reconstruction of fully-deployed or fully-compressed stent graft from a single 2D fluoroscopic image have been proposed for 3D navigation in robot-assisted FEVAR. However, these methods could not instantiate partially-deployed stent segments, as the 3D marker references are unknown. In this paper, an adapted Graph Convolutional Network (GCN) is proposed to predict 3D marker references from 3D fully-deployed markers. As the original GCN is for classification, in this paper, the coarsening layers are removed and the softmax function at the network end is replaced with linear mapping for the regression task. The derived 3D and the 2D marker references are used to instantiate partially-deployed stent segment shape with the existing 3D shape instantiation framework. Validations were performed on three commonly used stent grafts and five patient-specific 3D printed aortic aneurysm phantoms. Comparable performances with average mesh distance errors of 1$\sim$3 mm and average angular errors of around 7 degrees were achieved.
Tasks
Published 2019-02-28
URL http://arxiv.org/abs/1902.11089v1
PDF http://arxiv.org/pdf/1902.11089v1.pdf
PWC https://paperswithcode.com/paper/real-time-3d-shape-instantiation-for
Repo
Framework
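
The architectural change described above, a GCN with the coarsening layers removed and the softmax replaced by a linear mapping, can be sketched with plain matrix algebra. This is a generic one-layer GCN with a linear regression head, not the paper's trained network:

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(len(A))
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_regress(A, X, W1, W2):
    """One GCN layer with ReLU, then a linear head: no pooling, no softmax,
    so each node gets a continuous output (here, a 3D marker coordinate)."""
    A_hat = normalize_adj(A)
    H = np.maximum(A_hat @ X @ W1, 0)   # graph convolution + ReLU
    return A_hat @ H @ W2               # linear mapping for regression
```

Dropping the softmax is what turns the per-node output from a class distribution into an unconstrained real vector suitable for predicting marker positions.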

Semi-supervised Learning in Network-Structured Data via Total Variation Minimization

Title Semi-supervised Learning in Network-Structured Data via Total Variation Minimization
Authors Alexander Jung, Alfred O. Hero III, Alexandru Mara, Saeed Jahromi, Ayelet Heimowitz, Yonina C. Eldar
Abstract We propose and analyze a method for semi-supervised learning from partially-labeled network-structured data. Our approach is based on a graph signal recovery interpretation under a clustering hypothesis that labels of data points belonging to the same well-connected subset (cluster) are similar valued. This lends itself naturally to learning the labels by total variation (TV) minimization, which we solve by applying a recently proposed primal-dual method for non-smooth convex optimization. The resulting algorithm allows for a highly scalable implementation using message passing over the underlying empirical graph, which renders the algorithm suitable for big data applications. By applying tools of compressed sensing, we derive a sufficient condition on the underlying network structure such that TV minimization recovers clusters in the empirical graph of the data. In particular, we show that the proposed primal-dual method amounts to maximizing network flows over the empirical graph of the dataset. Moreover, the learning accuracy of the proposed algorithm is linked to the set of network flows between data points having known labels. The effectiveness and scalability of our approach are verified by numerical experiments.
Tasks
Published 2019-01-28
URL https://arxiv.org/abs/1901.09838v2
PDF https://arxiv.org/pdf/1901.09838v2.pdf
PWC https://paperswithcode.com/paper/semi-supervised-learning-in-network
Repo
Framework
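
The TV objective being minimized can be written down directly, and the cluster hypothesis can be illustrated with a simple neighbor-averaging propagation. Note the propagation below is a crude stand-in for the paper's primal-dual solver, included only to show how low-TV labelings spread known labels through well-connected clusters:

```python
def total_variation(edges, x):
    """Graph TV of signal x: sum of w * |x[i] - x[j]| over edges (i, j, w)."""
    return sum(w * abs(x[i] - x[j]) for i, j, w in edges)

def propagate(edges, labels, n, iters=50):
    """Fill unlabeled nodes (None) by repeated weighted neighbor averaging:
    a simple stand-in for TV minimization under the cluster hypothesis."""
    nbrs = [[] for _ in range(n)]
    for i, j, w in edges:
        nbrs[i].append((j, w))
        nbrs[j].append((i, w))
    x = [l if l is not None else 0.0 for l in labels]
    for _ in range(iters):
        for i in range(n):
            if labels[i] is None and nbrs[i]:
                x[i] = (sum(w * x[j] for j, w in nbrs[i]) /
                        sum(w for _, w in nbrs[i]))
    return x
```

On a connected cluster with one labeled node, propagation drives every node to that label, which is exactly the zero-TV solution the recovery condition in the paper characterizes.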

Collaborative and Privacy-Preserving Machine Teaching via Consensus Optimization

Title Collaborative and Privacy-Preserving Machine Teaching via Consensus Optimization
Authors Yufei Han, Yuzhe Ma, Christopher Gates, Kevin Roundy, Yun Shen
Abstract In this work, we define a collaborative and privacy-preserving machine teaching paradigm with multiple distributed teachers. We focus on consensus super teaching. It aims at organizing distributed teachers to jointly select a compact yet informative training subset from data hosted by the teachers to make a learner learn better. The challenges arise from three perspectives. First, the state-of-the-art pool-based super teaching method applies mixed-integer non-linear programming (MINLP) which does not scale well to very large data sets. Second, it is desirable to restrict data access of the teachers to only their own data during the collaboration stage to mitigate privacy leaks. Finally, the teaching collaboration should be communication-efficient since large communication overheads can cause synchronization delays between teachers. To address these challenges, we formulate collaborative teaching as a consensus and privacy-preserving optimization process to minimize teaching risk. We theoretically demonstrate the necessity of collaboration between teachers for improving the learner’s learning. Furthermore, we show that the proposed method enjoys a similar property as the Oracle property of adaptive Lasso. The empirical study illustrates that our teaching method can deliver significantly more accurate teaching results with high speed, while the non-collaborative MINLP-based super teaching becomes prohibitively expensive to compute.
Tasks
Published 2019-05-07
URL https://arxiv.org/abs/1905.02796v1
PDF https://arxiv.org/pdf/1905.02796v1.pdf
PWC https://paperswithcode.com/paper/collaborative-and-privacy-preserving-machine
Repo
Framework

Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning

Title Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning
Authors Lifu Huang, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
Abstract Understanding narratives requires reading between the lines, which in turn, requires interpreting the likely causes and effects of events, even when they are not mentioned explicitly. In this paper, we introduce Cosmos QA, a large-scale dataset of 35,600 problems that require commonsense-based reading comprehension, formulated as multiple-choice questions. In stark contrast to most existing reading comprehension datasets where the questions focus on factual and literal understanding of the context paragraph, our dataset focuses on reading between the lines over a diverse collection of people’s everyday narratives, asking such questions as “what might be the possible reason of …?” or “what would have happened if …” that require reasoning beyond the exact text spans in the context. To establish baseline performances on Cosmos QA, we experiment with several state-of-the-art neural architectures for reading comprehension, and also propose a new architecture that improves over the competitive baselines. Experimental results demonstrate a significant gap between machine (68.4%) and human performance (94%), pointing to avenues for future research on commonsense machine comprehension. Dataset, code and leaderboard are publicly available at https://wilburone.github.io/cosmos.
Tasks Machine Reading Comprehension, Reading Comprehension
Published 2019-08-31
URL https://arxiv.org/abs/1909.00277v2
PDF https://arxiv.org/pdf/1909.00277v2.pdf
PWC https://paperswithcode.com/paper/cosmos-qa-machine-reading-comprehension-with
Repo
Framework

The TechQA Dataset

Title The TechQA Dataset
Authors Vittorio Castelli, Rishav Chakravarti, Saswati Dana, Anthony Ferritto, Radu Florian, Martin Franz, Dinesh Garg, Dinesh Khandelwal, Scott McCarley, Mike McCawley, Mohamed Nasr, Lin Pan, Cezar Pendus, John Pitrelli, Saurabh Pujar, Salim Roukos, Andrzej Sakrajda, Avirup Sil, Rosario Uceda-Sosa, Todd Ward, Rong Zhang
Abstract We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain. The TechQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed by users on a technical forum, rather than questions generated specifically for a competition or a task. Second, it has a real-world size – 600 training, 310 dev, and 490 evaluation question/answer pairs – thus reflecting the cost of creating large labeled datasets with actual data. Consequently, TechQA is meant to stimulate research in domain adaptation rather than being a resource to build QA systems from scratch. The dataset was obtained by crawling the IBM Developer and IBM DeveloperWorks forums for questions with accepted answers that appear in a published IBM Technote—a technical document that addresses a specific technical issue. We also release a collection of the 801,998 publicly available Technotes as of April 4, 2019 as a companion resource that might be used for pretraining, to learn representations of the IT domain language.
Tasks Domain Adaptation, Question Answering
Published 2019-11-08
URL https://arxiv.org/abs/1911.02984v1
PDF https://arxiv.org/pdf/1911.02984v1.pdf
PWC https://paperswithcode.com/paper/the-techqa-dataset
Repo
Framework

Query-Based Named Entity Recognition

Title Query-Based Named Entity Recognition
Authors Yuxian Meng, Xiaoya Li, Zijun Sun, Jiwei Li
Abstract In this paper, we propose a new strategy for the task of named entity recognition (NER). We cast the task as a query-based machine reading comprehension task: e.g., the task of extracting entities with PER is formalized as answering the question “which person is mentioned in the text?”. Such a strategy comes with the advantage that it solves the long-standing issue of handling overlapping or nested entities (the same token that participates in more than one entity category) with sequence-labeling techniques for NER. Additionally, since the query encodes informative prior knowledge, this strategy facilitates the process of entity extraction, leading to better performances. We evaluate the proposed model on five widely used English and Chinese NER datasets, including MSRA, Resume, OntoNotes, ACE04 and ACE05. The proposed model sets new SOTA results on all of these datasets.
Tasks Entity Extraction, Machine Reading Comprehension, Named Entity Recognition, Reading Comprehension
Published 2019-08-24
URL https://arxiv.org/abs/1908.09138v2
PDF https://arxiv.org/pdf/1908.09138v2.pdf
PWC https://paperswithcode.com/paper/query-based-named-entity-recognition
Repo
Framework
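
Recasting NER as machine reading comprehension amounts to generating one (query, context) pair per entity type, which is what lets the same token be returned under several types and so handles nested entities. The query templates below are hypothetical illustrations, not the paper's exact phrasings:

```python
# Hypothetical natural-language queries per entity type (the paper's actual
# templates may differ; only the PER example appears in the abstract).
QUERIES = {
    "PER": "which person is mentioned in the text?",
    "LOC": "which location is mentioned in the text?",
    "ORG": "which organization is mentioned in the text?",
}

def to_mrc_examples(context, queries=QUERIES):
    """Recast NER as MRC: one (query, context) pair per entity type.
    An MRC model then extracts answer spans independently for each query,
    so one token can belong to answers of several entity types."""
    return [(q, context) for q in queries.values()]
```

A span-extraction reader (e.g. a BERT-style start/end predictor) would then be run once per pair, with the query supplying the prior knowledge about the target entity type.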