October 19, 2019


Paper Group ANR 154

Needle Tip Force Estimation using an OCT Fiber and a Fused convGRU-CNN Architecture

Title Needle Tip Force Estimation using an OCT Fiber and a Fused convGRU-CNN Architecture
Authors Nils Gessert, Torben Priegnitz, Thore Saathoff, Sven-Thomas Antoni, David Meyer, Moritz Franz Hamann, Klaus-Peter Jünemann, Christoph Otte, Alexander Schlaefer
Abstract Needle insertion is common during minimally invasive interventions such as biopsy or brachytherapy. During soft tissue needle insertion, forces acting at the needle tip cause tissue deformation and needle deflection. Accurate needle tip force measurement provides information on needle-tissue interaction and helps detect and compensate for potential misplacement. For this purpose, we introduce an image-based needle tip force estimation method using an optical fiber imaging the deformation of an epoxy layer below the needle tip over time. For calibration and force estimation, we introduce a novel deep learning-based fused convolutional GRU-CNN model which effectively exploits the spatio-temporal data structure. The needle is easy to manufacture and our model achieves a mean absolute error of 1.76 ± 1.5 mN with a cross-correlation coefficient of 0.9996, clearly outperforming other methods. We test needles with different materials to demonstrate that the approach can be adapted for different sensitivities and force ranges. Furthermore, we validate our approach in an ex-vivo prostate needle insertion scenario.
Tasks Calibration
Published 2018-05-30
URL http://arxiv.org/abs/1805.11911v1
PDF http://arxiv.org/pdf/1805.11911v1.pdf
PWC https://paperswithcode.com/paper/needle-tip-force-estimation-using-an-oct
Repo
Framework

Learning and Inference on Generative Adversarial Quantum Circuits

Title Learning and Inference on Generative Adversarial Quantum Circuits
Authors Jinfeng Zeng, Yufeng Wu, Jin-Guo Liu, Lei Wang, Jiangping Hu
Abstract Quantum mechanics is inherently probabilistic in light of Born’s rule. Using quantum circuits as probabilistic generative models for classical data exploits their superior expressibility and efficient direct sampling ability. However, training quantum circuits can be more challenging than training classical neural networks due to the lack of an efficient differentiable learning algorithm. We devise an adversarial quantum-classical hybrid training scheme by coupling a quantum circuit generator and a classical neural network discriminator together. After training, the quantum circuit generative model can infer missing data with quadratic speedup via amplitude amplification. We numerically simulate the learning and inference of generative adversarial quantum circuits using the prototypical Bars-and-Stripes dataset. Generative adversarial quantum circuits are a fresh approach to machine learning which may enjoy a practically useful quantum advantage on near-term quantum devices.
Tasks
Published 2018-08-10
URL http://arxiv.org/abs/1808.03425v1
PDF http://arxiv.org/pdf/1808.03425v1.pdf
PWC https://paperswithcode.com/paper/learning-and-inference-on-generative
Repo
Framework
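
The Bars-and-Stripes dataset named in the abstract can be enumerated in a few lines. The following is a minimal sketch, assuming the usual definition (binary n x n images in which either every column or every row is uniform); the function name and the image size are illustrative choices, and the grid size used in the paper is not stated here.

```python
from itertools import product


def bars_and_stripes(n):
    """Return the set of distinct n x n Bars-and-Stripes images.

    Each image is a tuple of rows; each row is a tuple of 0/1 pixels.
    The naming of "bars" versus "stripes" is a convention and is an
    assumption of this sketch.
    """
    patterns = set()
    for bits in product((0, 1), repeat=n):
        # Bars: every column is uniform, so each row repeats `bits`.
        patterns.add(tuple(tuple(bits) for _ in range(n)))
        # Stripes: every row is uniform, so row i is filled with bits[i].
        patterns.add(tuple(tuple([b] * n) for b in bits))
    return patterns


if __name__ == "__main__":
    for image in sorted(bars_and_stripes(2)):
        print(image)
```

The all-zero and all-one images belong to both families, so a set is used to avoid counting them twice.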

SlugNERDS: A Named Entity Recognition Tool for Open Domain Dialogue Systems

Title SlugNERDS: A Named Entity Recognition Tool for Open Domain Dialogue Systems
Authors Kevin K. Bowden, Jiaqi Wu, Shereen Oraby, Amita Misra, Marilyn Walker
Abstract In dialogue systems, the tasks of named entity recognition (NER) and named entity linking (NEL) are vital preprocessing steps for understanding user intent, especially in open domain interaction where we cannot rely on domain-specific inference. UCSC’s effort as one of the funded teams in the 2017 Amazon Alexa Prize Contest has yielded Slugbot, an open domain social bot, aimed at casual conversation. We discovered several challenges specifically associated with both NER and NEL when building Slugbot, such as that the NE labels are too coarse-grained or the entity types are not linked to a useful ontology. Moreover, we have discovered that traditional approaches do not perform well in our context: even systems designed to operate on tweets or other social media data do not work well in dialogue systems. In this paper, we introduce Slugbot’s Named Entity Recognition for dialogue Systems (SlugNERDS), a NER and NEL tool which is optimized to address these issues. We describe two new resources that we are building as part of this work: SlugEntityDB and SchemaActuator. We believe these resources will be useful for the research community.
Tasks Entity Linking, Named Entity Recognition
Published 2018-05-10
URL http://arxiv.org/abs/1805.03784v1
PDF http://arxiv.org/pdf/1805.03784v1.pdf
PWC https://paperswithcode.com/paper/slugnerds-a-named-entity-recognition-tool-for
Repo
Framework

Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data

Title Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data
Authors Wei-Ning Hsu, James Glass
Abstract Multimodal sensory data resembles the form of information perceived by humans for learning, and is easy to obtain in large quantities. Compared to unimodal data, synchronization of concepts between modalities in such data provides supervision for disentangling the underlying explanatory factors of each modality. Previous work leveraging multimodal data has mainly focused on retaining only the modality-invariant factors while discarding the rest. In this paper, we present a partitioned variational autoencoder (PVAE) and several training objectives to learn disentangled representations, which encode not only the shared factors, but also modality-dependent ones, into separate latent variables. Specifically, PVAE integrates a variational inference framework and a multimodal generative model that partitions the explanatory factors and conditions only on the relevant subset of them for generation. We evaluate our model on two parallel speech/image datasets, and demonstrate its ability to learn disentangled representations by qualitatively exploring within-modality and cross-modality conditional generation with semantics and styles specified by examples. For quantitative analysis, we evaluate the classification accuracy of automatically discovered semantic units. Our PVAE can achieve over 99% accuracy on both modalities.
Tasks Representation Learning
Published 2018-05-29
URL http://arxiv.org/abs/1805.11264v1
PDF http://arxiv.org/pdf/1805.11264v1.pdf
PWC https://paperswithcode.com/paper/disentangling-by-partitioning-a
Repo
Framework

Game Semantics and Linear Logic in the Cognition Process

Title Game Semantics and Linear Logic in the Cognition Process
Authors Dmitry Maximov
Abstract A description of the environment cognition process by intelligent systems with a fixed set of system goals is suggested. Such a system is represented by the set of its goals only without any models of the system elements or the environment. The set has a lattice structure and a monoid structure; thus, the structure of linear logic is defined on the set. The cognition process of some environment by the system is described on this basis. The environment is represented as a configuration space of possible system positions which are estimated by an information amount (by corresponding sets). This information is supplied to the system by the environment. Thus, it is possible to define the category of Conway games with a payoff on the configuration space and to choose an optimal system’s play (i.e., a trajectory). The choice is determined by the requirement of maximal information increasing and takes into account the structure of the system goal set: the linear logic on the set is used to determine the priority of possible different parallel processes. The survey may be useful to describe the behavior of robots and simple biological systems, e.g., ants.
Tasks
Published 2018-12-27
URL http://arxiv.org/abs/1812.11969v2
PDF http://arxiv.org/pdf/1812.11969v2.pdf
PWC https://paperswithcode.com/paper/game-semantics-and-linear-logic-in-the
Repo
Framework

Data Augmentation for Robust Keyword Spotting under Playback Interference

Title Data Augmentation for Robust Keyword Spotting under Playback Interference
Authors Anirudh Raju, Sankaran Panchapagesan, Xing Liu, Arindam Mandal, Nikko Strom
Abstract Accurate on-device keyword spotting (KWS) with low false accept and false reject rates is crucial to customer experience for far-field voice control of conversational agents. It is particularly challenging to maintain a low false reject rate in real-world conditions where there is (a) ambient noise from external sources such as TV, household appliances, or other speech that is not directed at the device, and (b) imperfect cancellation of the audio playback from the device, resulting in residual echo after processing by the Acoustic Echo Cancellation (AEC) system. In this paper, we propose a data augmentation strategy to improve keyword spotting performance under these challenging conditions. The training set audio is artificially corrupted by mixing in music and TV/movie audio at different signal-to-interference ratios. Our results show around a 30-45% relative reduction in false reject rates, at a range of false alarm rates, under audio playback from such devices.
Tasks Data Augmentation, Keyword Spotting
Published 2018-08-01
URL http://arxiv.org/abs/1808.00563v1
PDF http://arxiv.org/pdf/1808.00563v1.pdf
PWC https://paperswithcode.com/paper/data-augmentation-for-robust-keyword-spotting
Repo
Framework
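
Mixing an interfering signal into clean audio at a chosen signal-to-interference ratio (SIR) comes down to one gain computation. The sketch below assumes equal-length, time-aligned sample lists and average-power SIR in decibels; the function name `mix_at_sir` and its parameters are hypothetical, and the paper's actual pipeline (reverberation, AEC processing, SIR ranges) is not reproduced.

```python
import math


def mix_at_sir(speech, interference, sir_db):
    """Return speech + g * interference with g chosen so that
    10 * log10(P_speech / P_scaled_interference) equals sir_db."""
    if len(speech) != len(interference):
        raise ValueError("signals must have the same length")
    p_speech = sum(x * x for x in speech) / len(speech)
    p_interf = sum(x * x for x in interference) / len(interference)
    if p_interf == 0.0:
        # Nothing to mix in: a silent interferer leaves the speech unchanged.
        return list(speech)
    # Solve p_speech / (g^2 * p_interf) = 10^(sir_db / 10) for g.
    gain = math.sqrt(p_speech / (p_interf * 10.0 ** (sir_db / 10.0)))
    return [s + gain * i for s, i in zip(speech, interference)]


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    music = [0.5, 0.5, -0.5, -0.5]
    print(mix_at_sir(clean, music, sir_db=10.0))
```

Drawing `sir_db` at random per training utterance would give the "different signal-to-interference ratios" the abstract mentions; the sampling distribution is left open here.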

A data-driven model order reduction approach for Stokes flow through random porous media

Title A data-driven model order reduction approach for Stokes flow through random porous media
Authors Constantin Grigo, Phaedon-Stelios Koutsourelakis
Abstract Direct numerical simulation of Stokes flow through an impermeable, rigid body matrix by finite elements requires meshes fine enough to resolve the pore-size scale and is thus a computationally expensive task. The cost is significantly amplified when randomness in the pore microstructure is present and therefore multiple simulations need to be carried out. It is well known that in the limit of scale-separation, Stokes flow can be accurately approximated by Darcy’s law with an effective diffusivity field depending on viscosity and the pore-matrix topology. We propose a fully probabilistic, Darcy-type, reduced-order model which, based on only a few tens of full-order Stokes model runs, is capable of learning a map from the fine-scale topology to the effective diffusivity and is maximally predictive of the fine-scale response. The reduced-order model learned can significantly accelerate uncertainty quantification tasks as well as provide quantitative confidence metrics of the predictive estimates produced.
Tasks
Published 2018-06-21
URL http://arxiv.org/abs/1806.08117v1
PDF http://arxiv.org/pdf/1806.08117v1.pdf
PWC https://paperswithcode.com/paper/a-data-driven-model-order-reduction-approach
Repo
Framework

Learning Real-World Robot Policies by Dreaming

Title Learning Real-World Robot Policies by Dreaming
Authors AJ Piergiovanni, Alan Wu, Michael S. Ryoo
Abstract Learning to control robots directly based on images is a primary challenge in robotics. However, many existing reinforcement learning approaches require iteratively obtaining millions of robot samples to learn a policy, which can take significant time. In this paper, we focus on learning a realistic world model capturing the dynamics of scene changes conditioned on robot actions. Our dreaming model can emulate samples equivalent to a sequence of images from the actual environment, technically by learning an action-conditioned future representation/scene regressor. This allows the agent to learn action policies (i.e., visuomotor policies) by interacting with the dreaming model rather than the real-world. We experimentally confirm that our dreaming model enables robot learning of policies that transfer to the real-world.
Tasks
Published 2018-05-20
URL https://arxiv.org/abs/1805.07813v4
PDF https://arxiv.org/pdf/1805.07813v4.pdf
PWC https://paperswithcode.com/paper/learning-real-world-robot-policies-by
Repo
Framework

A non-invertible cancelable fingerprint template generation based on ridge feature transformation

Title A non-invertible cancelable fingerprint template generation based on ridge feature transformation
Authors Rudresh Dwivedi, Somnath Dey
Abstract In a biometric verification system, leakage of biometric data leads to permanent identity loss since original biometric data is inherently linked to a user. Further, various types of attacks on a biometric system may reveal the original template and allow its use in other applications. To address these security and privacy concerns, cancelable biometrics have been introduced. A cancelable biometric scheme constructs a protected template from the original biometric template using transformation functions and performs the comparison between templates in the transformed domain. Recent approaches towards cancelable fingerprint generation either rely on aligning minutiae points with respect to singular points (core/delta) or utilize the absolute coordinate positions of minutiae points. In this paper, we propose a novel non-invertible ridge feature transformation method to protect the original fingerprint template information. The proposed method partitions the fingerprint region into a number of sectors with reference to each minutia point employing a ridge-based coordinate system. The nearest neighbor minutiae in each sector are identified, and ridge-based features are computed. Further, a cancelable template is generated by applying the Cantor pairing function followed by random projection. We have evaluated our method with the FVC2002, FVC2004 and FVC2006 databases. It is evident from the experimental results that the proposed method outperforms existing methods in the literature. Moreover, the security analysis demonstrates that the proposed method fulfills the necessary requirements of non-invertibility, revocability, and diversity with a minor performance degradation caused by the cancelable transformation.
Tasks
Published 2018-05-28
URL http://arxiv.org/abs/1805.10853v1
PDF http://arxiv.org/pdf/1805.10853v1.pdf
PWC https://paperswithcode.com/paper/a-non-invertible-cancelable-fingerprint
Repo
Framework
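
The two building blocks named at the end of the abstract, the Cantor pairing function and random projection, can be illustrated in isolation. This is a minimal sketch, not the paper's pipeline: how ridge features are quantized and paired, the projection dimensions, and the key handling are all assumptions, and `random_projection` and `user_key` are hypothetical names.

```python
import random


def cantor_pair(a, b):
    """Cantor pairing: map two non-negative integers to one integer."""
    if a < 0 or b < 0:
        raise ValueError("cantor_pair expects non-negative integers")
    return (a + b) * (a + b + 1) // 2 + b


def random_projection(vector, out_dim, user_key):
    """Project `vector` with a Gaussian matrix derived from `user_key`.

    The same key always regenerates the same matrix; issuing a new key
    yields a different template from the same features (revocability).
    """
    rng = random.Random(user_key)
    return [
        sum(rng.gauss(0.0, 1.0) * v for v in vector)
        for _ in range(out_dim)
    ]


if __name__ == "__main__":
    # Hypothetical quantized feature pairs for one minutia neighborhood.
    feature_pairs = [(3, 7), (12, 1), (5, 5), (0, 9)]
    paired = [cantor_pair(a, b) for a, b in feature_pairs]
    print(random_projection(paired, out_dim=2, user_key=1234))
```

Projecting to fewer dimensions than the input makes the linear map many-to-one, which is the usual argument for why the original vector cannot be recovered uniquely from the template.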

Momen(e)t: Flavor the Moments in Learning to Classify Shapes

Title Momen(e)t: Flavor the Moments in Learning to Classify Shapes
Authors Mor Joseph-Rivlin, Alon Zvirin, Ron Kimmel
Abstract A fundamental question in learning to classify 3D shapes is how to treat the data in a way that would allow us to construct efficient and accurate geometric processing and analysis procedures. Here, we restrict ourselves to networks that operate on point clouds. There were several attempts to treat point clouds as non-structured data sets by which a neural network is trained to extract discriminative properties. The idea of using 3D coordinates as class identifiers motivated us to extend this line of thought to that of shape classification by comparing attributes that could easily account for the shape moments. Here, we propose to add polynomial functions of the coordinates, allowing the network to account for higher-order moments of a given shape. Experiments on two benchmarks show that the suggested network is able to provide state-of-the-art results and, by the same token, learn more efficiently in terms of memory and computational complexity.
Tasks
Published 2018-12-18
URL https://arxiv.org/abs/1812.07431v2
PDF https://arxiv.org/pdf/1812.07431v2.pdf
PWC https://paperswithcode.com/paper/mo-net-flavor-the-moments-in-learning-to
Repo
Framework
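
The idea of feeding polynomial functions of the coordinates to the network can be sketched as a simple input lifting. The example below assumes second-order terms only and plain (x, y, z) tuples; the function names are illustrative, and the exact feature set and network layers used in the paper are not specified here.

```python
def second_order_features(points):
    """Lift each (x, y, z) point to (x, y, z, x^2, y^2, z^2, xy, xz, yz)."""
    return [
        (x, y, z, x * x, y * y, z * z, x * y, x * z, y * z)
        for x, y, z in points
    ]


def mean_features(features):
    """Average each feature over the cloud; for the lifted features this
    gives the first- and second-order raw moments of the shape."""
    count = len(features)
    return tuple(sum(column) / count for column in zip(*features))


if __name__ == "__main__":
    cloud = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    print(mean_features(second_order_features(cloud)))
```

A symmetric pooling step such as the mean (or a max, as in point-cloud networks) keeps the result independent of point ordering.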

TAN: Temporal Aggregation Network for Dense Multi-label Action Recognition

Title TAN: Temporal Aggregation Network for Dense Multi-label Action Recognition
Authors Xiyang Dai, Bharat Singh, Joe Yue-Hei Ng, Larry S. Davis
Abstract We present Temporal Aggregation Network (TAN), which decomposes 3D convolutions into spatial and temporal aggregation blocks. By stacking spatial and temporal convolutions repeatedly, TAN forms a deep hierarchical representation for capturing spatio-temporal information in videos. Since we do not apply 3D convolutions in each layer but only apply temporal aggregation blocks once after each spatial downsampling layer in the network, we significantly reduce the model complexity. The use of dilated convolutions at different resolutions of the network helps in aggregating multi-scale spatio-temporal information efficiently. Experiments show that our model is well suited for dense multi-label action recognition, which is a challenging sub-topic of action recognition that requires predicting multiple action labels in each frame. We outperform state-of-the-art methods by 5% and 3% on the Charades and Multi-THUMOS datasets, respectively.
Tasks Temporal Action Localization
Published 2018-12-14
URL http://arxiv.org/abs/1812.06203v1
PDF http://arxiv.org/pdf/1812.06203v1.pdf
PWC https://paperswithcode.com/paper/tan-temporal-aggregation-network-for-dense
Repo
Framework

A Neural Network Study of Blasius Equation

Title A Neural Network Study of Blasius Equation
Authors Halil Mutuk
Abstract In this work, we applied a feed-forward neural network to solve the Blasius equation, a third-order nonlinear differential equation that describes a boundary layer flow. We solved the Blasius equation without reducing it to a system of first-order equations. Numerical results are presented and compared with results from earlier studies. The obtained results are found to be in good agreement with those studies.
Tasks
Published 2018-11-08
URL https://arxiv.org/abs/1811.08936v2
PDF https://arxiv.org/pdf/1811.08936v2.pdf
PWC https://paperswithcode.com/paper/a-neural-network-study-of-blasius-equation
Repo
Framework
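
For orientation, a classical reference solution is easy to compute. The sketch below is not the neural-network method of the paper: it deliberately uses the standard reduction to a first-order system, with RK4 integration and shooting on the unknown wall value f''(0). It assumes one common form of the equation, f''' + (1/2) f f'' = 0 with f(0) = f'(0) = 0 and f'(eta) -> 1 as eta grows; other normalizations exist and change the numbers. The truncation `eta_max = 10` and all function names are choices made here.

```python
def blasius_rhs(state):
    """Right-hand side of the first-order system for (f, f', f'')."""
    f, fp, fpp = state
    return (fp, fpp, -0.5 * f * fpp)


def far_field_slope(fpp0, eta_max=10.0, steps=1000):
    """Integrate from eta = 0 with classical RK4 and return f'(eta_max)."""
    h = eta_max / steps
    state = (0.0, 0.0, fpp0)  # f(0) = 0, f'(0) = 0, guessed f''(0)
    for _ in range(steps):
        k1 = blasius_rhs(state)
        k2 = blasius_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = blasius_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = blasius_rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state[1]


def shoot(lo=0.1, hi=1.0, tol=1e-10):
    """Bisect on f''(0) until the far-field condition f' = 1 is met."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if far_field_slope(mid) > 1.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("f''(0) =", shoot())
```

With this normalization the result can be checked against the widely tabulated wall value of about 0.332.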

Limitations of the Lipschitz constant as a defense against adversarial examples

Title Limitations of the Lipschitz constant as a defense against adversarial examples
Authors Todd Huster, Cho-Yu Jason Chiang, Ritu Chadha
Abstract Several recent papers have discussed utilizing Lipschitz constants to limit the susceptibility of neural networks to adversarial examples. We analyze recently proposed methods for computing the Lipschitz constant. We show that the Lipschitz constant may indeed enable adversarially robust neural networks. However, the methods currently employed for computing it suffer from theoretical and practical limitations. We argue that addressing this shortcoming is a promising direction for future research into certified adversarial defenses.
Tasks
Published 2018-07-25
URL http://arxiv.org/abs/1807.09705v1
PDF http://arxiv.org/pdf/1807.09705v1.pdf
PWC https://paperswithcode.com/paper/limitations-of-the-lipschitz-constant-as-a
Repo
Framework

DAC-SDC Low Power Object Detection Challenge for UAV Applications

Title DAC-SDC Low Power Object Detection Challenge for UAV Applications
Authors Xiaowei Xu, Xinyi Zhang, Bei Yu, X. Sharon Hu, Christopher Rowen, Jingtong Hu, Yiyu Shi
Abstract The 55th Design Automation Conference (DAC) held its first System Design Contest (SDC) in 2018. SDC’18 features a low power object detection challenge (LPODC) on designing and implementing novel algorithms for object detection in images taken from unmanned aerial vehicles (UAV). The dataset includes 95 categories and 150k images, and the hardware platforms include Nvidia’s TX2 and Xilinx’s PYNQ Z1. DAC-SDC’18 attracted more than 110 entries from 12 countries. This paper presents in detail the dataset and evaluation procedure. It further discusses the methods developed by some of the entries as well as representative results. The paper concludes with directions for future improvements.
Tasks Object Detection
Published 2018-09-01
URL http://arxiv.org/abs/1809.00110v1
PDF http://arxiv.org/pdf/1809.00110v1.pdf
PWC https://paperswithcode.com/paper/dac-sdc-low-power-object-detection-challenge
Repo
Framework

The promises and pitfalls of Stochastic Gradient Langevin Dynamics

Title The promises and pitfalls of Stochastic Gradient Langevin Dynamics
Authors Nicolas Brosse, Alain Durmus, Eric Moulines
Abstract Stochastic Gradient Langevin Dynamics (SGLD) has emerged as a key MCMC algorithm for Bayesian learning from large scale datasets. While SGLD with decreasing step sizes converges weakly to the posterior distribution, the algorithm is often used with a constant step size in practice and has demonstrated successes in machine learning tasks. The current practice is to set the step size inversely proportional to $N$ where $N$ is the number of training samples. As $N$ becomes large, we show that the SGLD algorithm has an invariant probability measure which significantly departs from the target posterior and behaves like Stochastic Gradient Descent (SGD). This difference is inherently due to the high variance of the stochastic gradients. Several strategies have been suggested to reduce this effect; among them, SGLD Fixed Point (SGLDFP) uses carefully designed control variates to reduce the variance of the stochastic gradients. We show that SGLDFP gives approximate samples from the posterior distribution, with an accuracy comparable to the Langevin Monte Carlo (LMC) algorithm for a computational cost sublinear in the number of data points. We provide a detailed analysis of the Wasserstein distances between LMC, SGLD, SGLDFP and SGD and explicit expressions of the means and covariance matrices of their invariant distributions. Our findings are supported by limited numerical experiments.
Tasks
Published 2018-11-25
URL http://arxiv.org/abs/1811.10072v1
PDF http://arxiv.org/pdf/1811.10072v1.pdf
PWC https://paperswithcode.com/paper/the-promises-and-pitfalls-of-stochastic
Repo
Framework
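
A single constant-step SGLD update is short enough to write out. The sketch below uses a toy model chosen here for illustration (scalar mean with a Gaussian prior and Gaussian likelihood), so the exact posterior is available for comparison; the model, function name, and parameters are assumptions, not taken from the paper. The update follows one common convention, theta <- theta - step * grad_U + sqrt(2 * step) * noise, where grad_U is a minibatch estimate of the gradient of the negative log posterior; other references absorb a factor of 1/2 into the step size.

```python
import math
import random


def sgld_step(theta, data, batch_size, step, rng,
              prior_var=1.0, noise_var=1.0):
    """One SGLD update for the mean of a Gaussian with known variance."""
    n = len(data)
    batch = rng.sample(data, batch_size)
    # Minibatch estimate of the gradient of the negative log posterior:
    # prior term plus the likelihood term rescaled by n / batch_size.
    grad = theta / prior_var + (n / batch_size) * sum(
        (theta - x) / noise_var for x in batch
    )
    return theta - step * grad + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(2.0, 1.0) for _ in range(100)]
    theta, samples = 0.0, []
    for k in range(5000):
        # Step size proportional to 1/N, as in the practice the abstract describes.
        theta = sgld_step(theta, data, batch_size=10, step=0.1 / len(data), rng=rng)
        if k >= 1000:
            samples.append(theta)
    print("SGLD mean:", sum(samples) / len(samples))
    print("exact posterior mean:", sum(data) / (len(data) + 1))
```

In this linear-gradient toy model the sample mean stays close to the posterior mean, while the spread of the samples is inflated by minibatch gradient noise. The control-variate correction used by SGLDFP is not implemented here.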