October 19, 2019

3370 words 16 mins read

Paper Group ANR 300



Towards Human Pulse Rate Estimation from Face Video: Automatic Component Selection and Comparison of Blind Source Separation Methods

Title Towards Human Pulse Rate Estimation from Face Video: Automatic Component Selection and Comparison of Blind Source Separation Methods
Authors Vladislav Ostankovich, Geesara Prathap, Ilya Afanasyev
Abstract Human heartbeat can be measured in several ways, chosen according to the patient's condition: contact-based, using dedicated instruments, and non-contact, such as computer-vision-assisted techniques. Non-contact approaches are gaining popularity because they mitigate some limitations of contact-based techniques, especially in clinical settings. However, existing vision-guided approaches do not achieve highly accurate results, for reasons such as camera properties, illumination changes, and skin tones in the face image. We propose a technique that takes a video as input and returns the pulse rate as output. First, key-point detection is carried out on two facial subregions: the forehead and the nose-mouth area. After removing unstable features, temporal filtering is applied to isolate frequencies of interest. Four component-analysis methods are then employed to distinguish the cardiovascular pulse signal from extraneous noise caused by respiration, vestibular activity, and other changes in facial expression. Afterwards, the proposed peak-detection technique is applied to each component extracted by one of the four component-analysis algorithms, locating the positions of the peaks in each component. The proposed automatic component-selection technique then selects an optimal component from which the heartbeat is calculated. Finally, we conclude with a comparison of the four component-analysis methods (PCA, FastICA, JADE, SHIBBS), processing face-video datasets of fifteen volunteers with verification by an ECG/EKG workstation as ground truth.
Tasks
Published 2018-10-28
URL http://arxiv.org/abs/1810.11770v1
PDF http://arxiv.org/pdf/1810.11770v1.pdf
PWC https://paperswithcode.com/paper/towards-human-pulse-rate-estimation-from-face
Repo
Framework
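The pipeline's last two stages (temporal filtering to the heart-rate band, then peak counting on the selected component) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the band limits, filter order, and prominence threshold are assumptions, and the input here is a synthetic trace rather than a component extracted by PCA/ICA.

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def estimate_pulse_bpm(signal, fs, low=0.75, high=4.0):
    """Band-pass the trace to a plausible heart-rate band (45-240 bpm),
    then count peaks to estimate beats per minute."""
    b, a = butter(2, [low / (fs / 2), high / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, signal)
    # Peaks must be at least 1/high seconds apart to count as separate beats;
    # the prominence floor discards small ripples left by in-band noise.
    peaks, _ = find_peaks(filtered, distance=int(fs / high), prominence=0.3)
    duration_s = len(signal) / fs
    return 60.0 * len(peaks) / duration_s

# Synthetic 72-bpm trace (1.2 Hz) with slow drift and noise.
np.random.seed(0)
fs = 30.0                     # a typical webcam frame rate
t = np.arange(0, 20, 1 / fs)
trace = np.sin(2 * np.pi * 1.2 * t) + 0.5 * t / 20 + 0.1 * np.random.randn(len(t))
bpm = estimate_pulse_bpm(trace, fs)
```

The band-pass step also removes the linear drift, which stands in for the slow illumination changes the abstract mentions.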

Solving Non-Convex Non-Concave Min-Max Games Under Polyak-Łojasiewicz Condition

Title Solving Non-Convex Non-Concave Min-Max Games Under Polyak-Łojasiewicz Condition
Authors Maziar Sanjabi, Meisam Razaviyayn, Jason D. Lee
Abstract In this short note, we consider the problem of solving a min-max zero-sum game. This problem has been extensively studied in the convex-concave regime where the global solution can be computed efficiently. Recently, there have also been developments for finding the first order stationary points of the game when one of the player’s objective is concave or (weakly) concave. This work focuses on the non-convex non-concave regime where the objective of one of the players satisfies Polyak-{\L}ojasiewicz (PL) Condition. For such a game, we show that a simple multi-step gradient descent-ascent algorithm finds an $\varepsilon$–first order stationary point of the problem in $\widetilde{\mathcal{O}}(\varepsilon^{-2})$ iterations.
Tasks
Published 2018-12-07
URL http://arxiv.org/abs/1812.02878v1
PDF http://arxiv.org/pdf/1812.02878v1.pdf
PWC https://paperswithcode.com/paper/solving-non-convex-non-concave-min-max-games
Repo
Framework
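The multi-step gradient descent-ascent idea can be sketched on a toy game: for each descent step on the min-player, run several ascent steps on the max-player so the inner maximization is approximately solved (the role the PL condition plays in the paper's analysis). The step sizes, inner-loop length, and the strongly concave toy objective below are illustrative assumptions, not the paper's setting.

```python
import numpy as np

def multi_step_gda(grad_x, grad_y, x0, y0, eta_x=0.05, eta_y=0.1,
                   inner_steps=10, outer_steps=500):
    """Multi-step GDA: several ascent steps on the max-player y per
    descent step on the min-player x."""
    x, y = float(x0), float(y0)
    for _ in range(outer_steps):
        for _ in range(inner_steps):
            y += eta_y * grad_y(x, y)   # ascent on the inner max problem
        x -= eta_x * grad_x(x, y)       # one descent step on the outer min
    return x, y

# Toy game f(x, y) = (x - 1)^2 - (y - 2)^2; strong concavity in y implies PL.
gx = lambda x, y: 2 * (x - 1)
gy = lambda x, y: -2 * (y - 2)
x_star, y_star = multi_step_gda(gx, gy, x0=3.0, y0=-1.0)
```

On this toy game the iterates approach the stationary point (1, 2), matching the kind of first-order stationarity guarantee the note proves.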

Risk-Aware Resource Allocation for URLLC: Challenges and Strategies with Machine Learning

Title Risk-Aware Resource Allocation for URLLC: Challenges and Strategies with Machine Learning
Authors Amin Azari, Mustafa Ozger, Cicek Cavdar
Abstract Supporting ultra-reliable low-latency communications (URLLC) is a major challenge of 5G wireless networks. Stringent delay and reliability requirements need to be satisfied for both scheduled and non-scheduled URLLC traffic to enable a diverse set of 5G applications. Although physical- and media-access-control-layer solutions have been investigated for scheduled URLLC traffic alone, there is a lack of study on enabling transmission of non-scheduled URLLC traffic, especially in coexistence with scheduled URLLC traffic. Machine learning (ML) is an important enabler for such a coexistence scenario due to its ability to exploit spatial/temporal correlation in user behaviors and use of radio resources. Hence, in this paper, we first study the coexistence design challenges, especially the radio resource management (RRM) problem, and propose a distributed risk-aware ML solution for RRM. The proposed solution benefits from hybrid orthogonal/non-orthogonal radio resource slicing, and proactively regulates the spectrum needed to satisfy the delay/reliability requirement of each URLLC traffic type. A case study is introduced to investigate the potential of the proposed RRM in serving coexisting URLLC traffic types. The results further provide insights into the benefits of leveraging intelligent RRM; e.g., a 75% increase in data rate with respect to the conservative design approach for the scheduled traffic is achieved, while the 99.99% reliability of both scheduled and non-scheduled traffic types is satisfied.
Tasks
Published 2018-12-22
URL http://arxiv.org/abs/1901.04292v1
PDF http://arxiv.org/pdf/1901.04292v1.pdf
PWC https://paperswithcode.com/paper/risk-aware-resource-allocation-for-urllc
Repo
Framework
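The "risk-aware, proactive" flavor of the proposed RRM can be illustrated with a deliberately simplified sketch: size the slice reserved for non-scheduled traffic to a tail quantile of observed load rather than its mean, so the reliability target bounds the risk of under-provisioning. The quantile target, margin, and Poisson load model are all hypothetical; the paper's ML solution is far richer than this.

```python
import numpy as np

def reserve_spectrum(load_samples, reliability=0.9999, margin=1.1):
    """Risk-aware proactive reservation (sketch): provision for the tail of
    the load distribution, not its average, plus a safety margin."""
    return margin * np.quantile(load_samples, reliability)

rng = np.random.default_rng(1)
loads = rng.poisson(5, size=100000)   # hypothetical non-scheduled arrivals
slice_size = reserve_spectrum(loads)
```

A mean-based reservation would violate the 99.99% target whenever load spikes; the quantile-based slice covers those spikes at the cost of extra spectrum, which is the trade-off the paper's intelligent RRM tries to soften.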

Zero-Shot Visual Imitation

Title Zero-Shot Visual Imitation
Authors Deepak Pathak, Parsa Mahmoudieh, Guanghao Luo, Pulkit Agrawal, Dian Chen, Yide Shentu, Evan Shelhamer, Jitendra Malik, Alexei A. Efros, Trevor Darrell
Abstract The current dominant paradigm for imitation learning relies on strong supervision of expert actions to learn both ‘what’ and ‘how’ to imitate. We pursue an alternative paradigm wherein an agent first explores the world without any expert supervision and then distills its experience into a goal-conditioned skill policy with a novel forward consistency loss. In our framework, the role of the expert is only to communicate the goals (i.e., what to imitate) during inference. The learned policy is then employed to mimic the expert (i.e., how to imitate) after seeing just a sequence of images demonstrating the desired task. Our method is ‘zero-shot’ in the sense that the agent never has access to expert actions during training or for the task demonstration at inference. We evaluate our zero-shot imitator in two real-world settings: complex rope manipulation with a Baxter robot and navigation in previously unseen office environments with a TurtleBot. Through further experiments in VizDoom simulation, we provide evidence that better mechanisms for exploration lead to learning a more capable policy which in turn improves end task performance. Videos, models, and more details are available at https://pathak22.github.io/zeroshot-imitation/
Tasks Imitation Learning
Published 2018-04-23
URL http://arxiv.org/abs/1804.08606v1
PDF http://arxiv.org/pdf/1804.08606v1.pdf
PWC https://paperswithcode.com/paper/zero-shot-visual-imitation
Repo
Framework
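The forward consistency loss can be sketched in miniature: instead of penalizing the predicted action directly, penalize the state the forward model reaches with that action, so actions that differ from the ground truth but produce the same outcome are not punished. The toy additive dynamics and linear models below are assumptions for illustration; the paper learns both models as neural networks from exploration data.

```python
import numpy as np

def forward_consistency_loss(state, next_state, inverse_model, forward_model):
    """Match outcomes, not actions: roll the inverse model's predicted action
    through the forward model and compare the reached state to the target."""
    a_pred = inverse_model(state, next_state)          # infer the action taken
    s_pred = forward_model(state, a_pred)              # roll it through dynamics
    return float(np.sum((s_pred - next_state) ** 2))

# Toy 1-D world: the action is a displacement, dynamics are additive.
inverse = lambda s, s2: s2 - s
forward = lambda s, a: s + a
loss = forward_consistency_loss(np.array([0.0]), np.array([1.0]), inverse, forward)
```

With consistent models the loss is zero even if the expert reached the goal via a different action sequence, which is the property that lets the agent imitate from images alone.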

Human-Machine Collaborative Optimization via Apprenticeship Scheduling

Title Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Authors Matthew Gombolay, Reed Jensen, Jessica Stigile, Toni Golen, Neel Shah, Sung-Hyun Son, Julie Shah
Abstract Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the "single-expert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes, causing the codification of this knowledge to become laborious. We propose a new approach for capturing domain-expert heuristics through a pairwise ranking formulation. Our approach is model-free and does not require enumerating or iterating through a large state space. We empirically demonstrate that this approach accurately learns multifaceted heuristics on a synthetic data set incorporating job-shop scheduling and vehicle routing problems, as well as on two real-world data sets consisting of demonstrations of experts solving a weapon-to-target assignment problem and a hospital resource allocation problem. We also demonstrate that policies learned from human scheduling demonstration via apprenticeship learning can substantially improve the efficiency of a branch-and-bound search for an optimal schedule. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem. We demonstrate that this technique generates solutions substantially superior to those produced by human domain experts at a rate up to 9.5 times faster than an optimization approach and can be applied to optimally solve problems twice as complex as those solved by a human demonstrator.
Tasks Decision Making
Published 2018-05-11
URL http://arxiv.org/abs/1805.04220v1
PDF http://arxiv.org/pdf/1805.04220v1.pdf
PWC https://paperswithcode.com/paper/human-machine-collaborative-optimization-via
Repo
Framework
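The pairwise-ranking formulation can be sketched as follows: at each decision point, the task the expert scheduled is ranked above each task they passed over, and training on feature differences turns ranking into binary classification, never enumerating the scheduling state space. The three-dimensional features and the latent heuristic weights are hypothetical stand-ins for the paper's task features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])          # hypothetical expert heuristic

# Simulate expert demonstrations: the expert schedules the higher-scoring task.
pairs, labels = [], []
for _ in range(500):
    chosen, skipped = rng.normal(size=3), rng.normal(size=3)
    if true_w @ chosen < true_w @ skipped:
        chosen, skipped = skipped, chosen
    # Feature differences: (chosen - skipped) is a positive example and vice versa.
    pairs += [chosen - skipped, skipped - chosen]
    labels += [1, 0]

ranker = LogisticRegression().fit(np.array(pairs), labels)

def pick_next(candidates):
    """The learned policy schedules the highest-scoring candidate first."""
    return int(np.argmax(ranker.decision_function(candidates)))

choice = pick_next(np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]))
```

The learned weight vector recovers the expert's ranking direction, which is also what lets the policy prune a branch-and-bound search as described above.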

Enhanced Word Representations for Bridging Anaphora Resolution

Title Enhanced Word Representations for Bridging Anaphora Resolution
Authors Yufang Hou
Abstract Most current models of word representations (e.g., GloVe) have successfully captured fine-grained semantics. However, semantic similarity exhibited in these word embeddings is not suitable for resolving bridging anaphora, which requires the knowledge of associative similarity (i.e., relatedness) instead of semantic similarity information between synonyms or hypernyms. We create word embeddings (embeddings_PP) to capture such relatedness by exploring the syntactic structure of noun phrases. We demonstrate that using embeddings_PP alone achieves around 30% of accuracy for bridging anaphora resolution on the ISNotes corpus. Furthermore, we achieve a substantial gain over the state-of-the-art system (Hou et al., 2013) for bridging antecedent selection.
Tasks Bridging Anaphora Resolution, Semantic Similarity, Semantic Textual Similarity, Word Embeddings
Published 2018-03-13
URL http://arxiv.org/abs/1803.04790v2
PDF http://arxiv.org/pdf/1803.04790v2.pdf
PWC https://paperswithcode.com/paper/enhanced-word-representations-for-bridging
Repo
Framework
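Antecedent selection with such embeddings reduces to a nearest-neighbor lookup, sketched below. The two-dimensional vectors are hypothetical stand-ins for embeddings_PP entries; the point of the paper is precisely that this works only if the embeddings encode relatedness (door-house) rather than synonym-style similarity.

```python
import numpy as np

def select_antecedent(anaphor_vec, candidate_vecs):
    """Pick the candidate whose embedding is most related (by cosine) to
    the anaphor."""
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    scores = [cos(anaphor_vec, c) for c in candidate_vecs]
    return int(np.argmax(scores))

# Tiny made-up vectors: "house" is related to "door" in a part-of sense.
door = np.array([0.9, 0.1])
house = np.array([0.8, 0.2])
verdict = np.array([0.1, 0.9])
idx = select_antecedent(door, [verdict, house])
```

Here the anaphor "the door" correctly resolves to "the house" rather than the unrelated candidate.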

TutorialBank: A Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation

Title TutorialBank: A Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation
Authors Alexander R. Fabbri, Irene Li, Prawat Trairatvorakul, Yijiao He, Wei Tai Ting, Robert Tung, Caitlin Westerfield, Dragomir R. Radev
Abstract The field of Natural Language Processing (NLP) is growing rapidly, with new research published daily along with an abundance of tutorials, codebases and other online resources. In order to learn this dynamic field or stay up-to-date on the latest research, students as well as educators and researchers must constantly sift through multiple sources to find valuable, relevant information. To address this situation, we introduce TutorialBank, a new, publicly available dataset which aims to facilitate NLP education and research. We have manually collected and categorized over 6,300 resources on NLP as well as the related fields of Artificial Intelligence (AI), Machine Learning (ML) and Information Retrieval (IR). Our dataset is notably the largest manually-picked corpus of resources intended for NLP education which does not include only academic papers. Additionally, we have created both a search engine and a command-line tool for the resources and have annotated the corpus to include lists of research topics, relevant resources for each topic, prerequisite relations among topics, relevant sub-parts of individual resources, among other annotations. We are releasing the dataset and present several avenues for further research.
Tasks Information Retrieval
Published 2018-05-11
URL http://arxiv.org/abs/1805.04617v1
PDF http://arxiv.org/pdf/1805.04617v1.pdf
PWC https://paperswithcode.com/paper/tutorialbank-a-manually-collected-corpus-for
Repo
Framework
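The prerequisite-chain annotation naturally supports generating a reading order: topics must appear after their prerequisites, which is a topological sort. A sketch with the standard library's `graphlib` follows; the topic names and edges are hypothetical examples, not entries from the released corpus.

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite relations among NLP topics: each topic maps to
# the set of topics that should be learned before it.
prereqs = {
    "neural MT": {"seq2seq", "attention"},
    "seq2seq": {"RNNs"},
    "attention": {"RNNs"},
    "RNNs": {"backpropagation"},
}
order = list(TopologicalSorter(prereqs).static_order())
```

Any order returned places each topic after all of its prerequisites, so a student can follow the list front to back.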

Cascaded Pyramid Mining Network for Weakly Supervised Temporal Action Localization

Title Cascaded Pyramid Mining Network for Weakly Supervised Temporal Action Localization
Authors Haisheng Su, Xu Zhao, Tianwei Lin
Abstract Weakly supervised temporal action localization, which aims at temporally locating action instances in untrimmed videos using only video-level class labels during training, is an important yet challenging problem in video analysis. Many current methods adopt the “localization by classification” framework: first perform video classification, then locate the temporal areas that contribute most to the result. However, this framework fails to locate entire action instances and gives little consideration to local context. In this paper, we present a novel architecture called Cascaded Pyramid Mining Network (CPMN) to address these issues using two effective modules. First, to discover the entire temporal interval of a specific action, we design a two-stage cascaded module with a proposed Online Adversarial Erasing (OAE) mechanism, where new and complementary regions are mined by feeding the erased feature maps of discovered regions back to the system. Second, to exploit hierarchical contextual information in videos and reduce missed detections, we design a pyramid module which produces a scale-invariant attention map by combining the feature maps from different levels. Finally, we aggregate the results of the two modules to perform action localization by locating high-score areas in the temporal Class Activation Sequence (CAS). Extensive experiments conducted on the THUMOS14 and ActivityNet-1.3 datasets demonstrate the effectiveness of our method.
Tasks Action Localization, Temporal Action Localization, Video Classification, Weakly-supervised Temporal Action Localization
Published 2018-10-28
URL http://arxiv.org/abs/1810.11794v1
PDF http://arxiv.org/pdf/1810.11794v1.pdf
PWC https://paperswithcode.com/paper/cascaded-pyramid-mining-network-for-weakly
Repo
Framework
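The Online Adversarial Erasing mechanism can be sketched on a 1-D class activation sequence: zero out the regions already discovered and re-score, so later rounds mine complementary, lower-scoring parts of the same action. The relative threshold and round count below are assumptions; the paper applies this to feature maps inside a cascaded network rather than directly to the CAS.

```python
import numpy as np

def online_adversarial_erasing(cas, threshold=0.7, rounds=2):
    """Mine high-activation regions, erase them, and mine again so each
    round must look at complementary parts of the sequence."""
    cas = cas.copy()
    discovered = np.zeros_like(cas, dtype=bool)
    for _ in range(rounds):
        mask = cas >= threshold * cas.max()
        discovered |= mask
        cas[mask] = 0.0          # erase so the next pass looks elsewhere
    return discovered

cas = np.array([0.1, 0.9, 0.8, 0.3, 0.6, 0.5, 0.1])
found = online_adversarial_erasing(cas)
```

On this toy sequence, round one picks the strongest segment (0.9, 0.8) and round two recovers the weaker complementary segment (0.6, 0.5), illustrating why erasing helps cover entire action instances rather than just the most discriminative snippet.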

Fine-grained Video Categorization with Redundancy Reduction Attention

Title Fine-grained Video Categorization with Redundancy Reduction Attention
Authors Chen Zhu, Xiao Tan, Feng Zhou, Xiao Liu, Kaiyu Yue, Errui Ding, Yi Ma
Abstract For fine-grained categorization tasks, videos could serve as a better source than static images as videos have a higher chance of containing discriminative patterns. Nevertheless, a video sequence could also contain a lot of redundant and irrelevant frames. How to locate critical information of interest is a challenging task. In this paper, we propose a new network structure, known as Redundancy Reduction Attention (RRA), which learns to focus on multiple discriminative patterns by suppressing redundant feature channels. Specifically, it first summarizes the video by weight-summing all feature vectors in the feature maps of selected frames with a spatio-temporal soft attention, and then predicts which channels to suppress or to enhance according to this summary with a learned non-linear transform. Suppression is achieved by modulating the feature maps and threshing out weak activations. The updated feature maps are then used in the next iteration. Finally, the video is classified based on multiple summaries. The proposed method achieves outstanding performance on multiple video classification datasets. Furthermore, we have collected two large-scale video datasets, YouTube-Birds and YouTube-Cars, for future research on fine-grained video categorization. The datasets are available at http://www.cs.umd.edu/~chenzhu/fgvc.
Tasks Video Classification
Published 2018-10-26
URL http://arxiv.org/abs/1810.11189v1
PDF http://arxiv.org/pdf/1810.11189v1.pdf
PWC https://paperswithcode.com/paper/fine-grained-video-categorization-with
Repo
Framework
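One RRA-style iteration, attend, summarize, then gate channels, can be sketched as below. The linear attention scorer, sigmoid channel gate, and activation threshold are illustrative assumptions standing in for the paper's learned non-linear transform, and the features are frame vectors rather than full spatio-temporal feature maps.

```python
import numpy as np

def redundancy_reduction_step(features, w_att, w_gate):
    """Soft-attend over frame feature vectors to get a summary, then use the
    summary to suppress or enhance each channel, threshing weak activations."""
    att = np.exp(features @ w_att)                     # attention over frames
    att /= att.sum()
    summary = att @ features                           # weighted sum over frames
    gates = 1.0 / (1.0 + np.exp(-(w_gate * summary)))  # per-channel sigmoid gate
    modulated = features * gates                       # modulate the channels
    modulated[modulated < 0.05] = 0.0                  # thresh out weak activations
    return summary, modulated

features = np.array([[1.0, 0.0], [0.0, 1.0]])   # two frames, two channels
summary, modulated = redundancy_reduction_step(
    features, w_att=np.array([1.0, 0.0]), w_gate=np.array([2.0, 2.0]))
```

In the full model the modulated maps feed the next iteration, so each pass can attend to a different discriminative pattern.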

Learning discrete Bayesian networks in polynomial time and sample complexity

Title Learning discrete Bayesian networks in polynomial time and sample complexity
Authors Adarsh Barik, Jean Honorio
Abstract In this paper, we study the problem of structure learning for Bayesian networks in which nodes take discrete values. The problem is NP-hard in general, but we show that under certain conditions we can recover the true structure of a Bayesian network with a sufficient number of samples. We develop a mathematical model which does not assume any specific conditional probability distributions for the nodes. We use a primal-dual witness construction to prove that, under some technical conditions on the interaction between node pairs, we can exactly recover the parents and children of a node by performing group l_12-regularized multivariate regression. Thus, we recover the true Bayesian network structure. If the degree of a node is bounded, then the sample complexity of our proposed approach grows logarithmically with respect to the number of nodes in the Bayesian network. Furthermore, our method runs in polynomial time.
Tasks
Published 2018-03-12
URL http://arxiv.org/abs/1803.04087v3
PDF http://arxiv.org/pdf/1803.04087v3.pdf
PWC https://paperswithcode.com/paper/learning-discrete-bayesian-networks-in
Repo
Framework
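The neighborhood-recovery step can be sketched with off-the-shelf tools: regress a node's (one-hot-style) values on all other nodes with a group penalty, so each candidate node's whole coefficient group is either kept or zeroed together. Here sklearn's `MultiTaskLasso` (an l2/l1 mixed-norm penalty) stands in for the paper's group l_12 regularizer, and the two-node toy graph, noise level, and selection threshold are assumptions.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
n = 400
x1 = rng.integers(0, 2, n)          # true neighbor of the target node
x2 = rng.integers(0, 2, n)          # independent node
target = (x1 ^ (rng.random(n) < 0.1)).astype(float)  # noisy copy of x1

X = np.column_stack([x1, x2]).astype(float)
Y = np.column_stack([target, 1 - target])   # one-hot-style multivariate response
model = MultiTaskLasso(alpha=0.05).fit(X, Y)

# One coefficient-group norm per candidate node; nonzero groups are neighbors.
group_norms = np.linalg.norm(model.coef_, axis=0)
neighbors = group_norms > 0.1
```

The group penalty zeroes the irrelevant node's entire coefficient block while retaining the true neighbor's, which is the exact-recovery behavior the primal-dual witness argument certifies.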

Unique Identification of Macaques for Population Monitoring and Control

Title Unique Identification of Macaques for Population Monitoring and Control
Authors Ankita Shukla, Gullal Singh Cheema, Saket Anand, Qamar Qureshi, Yadvendradev Jhala
Abstract Despite loss of natural habitat due to development and urbanization, certain species like the Rhesus macaque have adapted well to the urban environment. With abundant food and no predators, macaque populations have increased substantially in urban areas, leading to frequent conflicts with humans. Overpopulated areas often witness macaques raiding crops, feeding on bird and snake eggs, and destroying nests, thus adversely affecting other species in the ecosystem. In order to mitigate these adverse effects, sterilization has emerged as a humane and effective way of controlling macaque populations. As sterilization requires physical capture of individuals or groups, their unique identification is integral to such control measures. In this work, we propose Macaque Face Identification (MFID), an image-based, non-invasive tool that relies on macaque facial recognition to identify individuals, and can be used to verify whether they are sterilized. Our primary contribution is a robust facial recognition and verification module designed for Rhesus macaques, but extensible to other non-human primate species. We evaluate the performance of MFID on a dataset of 93 monkeys under closed-set, open-set and verification evaluation protocols. Finally, we also report state-of-the-art results when evaluating our proposed model on endangered primate species.
Tasks Face Identification
Published 2018-11-02
URL http://arxiv.org/abs/1811.00743v2
PDF http://arxiv.org/pdf/1811.00743v2.pdf
PWC https://paperswithcode.com/paper/unique-identification-of-macaques-for
Repo
Framework

Hierarchical Motion Consistency Constraint for Efficient Geometrical Verification in UAV Image Matching

Title Hierarchical Motion Consistency Constraint for Efficient Geometrical Verification in UAV Image Matching
Authors San Jiang, Wanshou Jiang
Abstract This paper proposes a strategy for efficient geometrical verification in unmanned aerial vehicle (UAV) image matching. First, considering the complex transformation model between correspondence sets in image space, feature points of initial candidate matches are projected onto an elevation plane in object space, with the assistance of UAV flight control data and camera mounting angles. Spatial relationships are simplified as a 2D translation in which a motion establishes the relation of two correspondence points. Second, a hierarchical motion consistency constraint, termed HMCC, is designed to eliminate outliers from initial candidate matches, which includes three major steps, namely the global direction consistency constraint, the local direction-change consistency constraint and the global length consistency constraint. To cope with scenarios with high outlier ratios, the HMCC is achieved by using a voting scheme. Finally, an efficient geometrical verification strategy is proposed by using the HMCC as a pre-processing step to increase inlier ratios before the subsequent application of the basic RANSAC algorithm. The performance of the proposed strategy is verified through comprehensive comparison and analysis using real UAV datasets captured with different photogrammetric systems. Experimental results demonstrate that the generated motions have noticeable separation ability, and the HMCC-RANSAC algorithm can efficiently eliminate outliers based on the motion consistency constraint, with a speedup ratio reaching 6 for oblique UAV images. Even though a completeness sacrifice of approximately 7 percent of points is observed in image orientation tests, competitive orientation accuracy is achieved on all datasets used. For geometrical verification of both nadir and oblique UAV images, the proposed method can be a more efficient solution.
Tasks
Published 2018-01-12
URL http://arxiv.org/abs/1801.04096v1
PDF http://arxiv.org/pdf/1801.04096v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-motion-consistency-constraint
Repo
Framework
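The first HMCC step (the global direction consistency constraint with voting) can be sketched as a histogram vote over motion directions: under a 2D-translation model, inliers share one dominant direction while outliers scatter. The bin count and the spread of bins kept around the dominant one are assumptions for illustration.

```python
import numpy as np

def direction_consistency_filter(motions, bins=36, keep_spread=1):
    """Vote each match's motion direction into an angular histogram and keep
    only matches within keep_spread bins of the dominant direction."""
    angles = np.degrees(np.arctan2(motions[:, 1], motions[:, 0])) % 360
    idx = (angles // (360 / bins)).astype(int)
    dominant = np.bincount(idx, minlength=bins).argmax()
    # Circular bin distance to the dominant direction.
    dist = np.minimum((idx - dominant) % bins, (dominant - idx) % bins)
    return dist <= keep_spread

# Three inlier motions near the +x direction, two scattered outliers.
motions = np.array([[1.0, 0.05], [1.1, -0.03], [0.9, 0.02],
                    [-1.0, 0.8], [0.1, -1.2]])
inliers = direction_consistency_filter(motions)
```

Raising the inlier ratio this way before RANSAC is what yields the reported speedup, since RANSAC's iteration count grows rapidly as the outlier ratio increases.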

Meta-Learning for Low-Resource Neural Machine Translation

Title Meta-Learning for Low-Resource Neural Machine Translation
Authors Jiatao Gu, Yong Wang, Yun Chen, Kyunghyun Cho, Victor O. K. Li
Abstract In this paper, we propose to extend the recently introduced model-agnostic meta-learning algorithm (MAML) for low-resource neural machine translation (NMT). We frame low-resource translation as a meta-learning problem, and we learn to adapt to low-resource languages based on multilingual high-resource language tasks. We use the universal lexical representation~\citep{gu2018universal} to overcome the input-output mismatch across different languages. We evaluate the proposed meta-learning strategy using eighteen European languages (Bg, Cs, Da, De, El, Es, Et, Fr, Hu, It, Lt, Nl, Pl, Pt, Sk, Sl, Sv and Ru) as source tasks and five diverse languages (Ro, Lv, Fi, Tr and Ko) as target tasks. We show that the proposed approach significantly outperforms the multilingual, transfer learning based approach~\citep{zoph2016transfer} and enables us to train a competitive NMT system with only a fraction of training examples. For instance, the proposed approach can achieve as high as 22.04 BLEU on Romanian-English WMT’16 by seeing only 16,000 translated words (~600 parallel sentences).
Tasks Low-Resource Neural Machine Translation, Machine Translation, Meta-Learning, Transfer Learning
Published 2018-08-25
URL http://arxiv.org/abs/1808.08437v1
PDF http://arxiv.org/pdf/1808.08437v1.pdf
PWC https://paperswithcode.com/paper/meta-learning-for-low-resource-neural-machine
Repo
Framework
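The meta-learning loop can be sketched with a first-order MAML variant on a deliberately tiny family of tasks, a hedged stand-in for the paper's setup, where each task is a high-resource language pair and the model is an NMT system. Each toy task i simply wants the scalar parameter near a target a_i; the step sizes and epoch count are assumptions.

```python
import numpy as np

def fomaml(task_targets, theta=0.0, inner_lr=0.1, meta_lr=0.05, epochs=200):
    """First-order MAML: find an initialization from which one inner
    gradient step adapts well to every task (loss_i = (theta - a_i)^2)."""
    for _ in range(epochs):
        for a in task_targets:
            adapted = theta - inner_lr * 2 * (theta - a)   # one inner SGD step
            theta -= meta_lr * 2 * (adapted - a)           # first-order meta update
    return theta

init = fomaml([1.0, 2.0, 3.0])
# Adapting to an unseen task (target 2.5) from the meta-initialization:
adapted = init - 0.1 * 2 * (init - 2.5)
```

The meta-initialization settles near the center of the task family, so a single adaptation step on a new low-resource task already moves closer to its optimum, mirroring the few-example adaptation result quoted above.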

High-Dynamic-Range Imaging for Cloud Segmentation

Title High-Dynamic-Range Imaging for Cloud Segmentation
Authors Soumyabrata Dev, Florian M. Savoy, Yee Hui Lee, Stefan Winkler
Abstract Sky/cloud images obtained from ground-based sky-cameras are usually captured using a fish-eye lens with a wide field of view. However, the sky exhibits a large dynamic range in terms of luminance, more than a conventional camera can capture. It is thus difficult to capture the details of an entire scene with a regular camera in a single shot. In most cases, the circumsolar region is over-exposed, and the regions near the horizon are under-exposed. This renders cloud segmentation for such images difficult. In this paper, we propose HDRCloudSeg – an effective method for cloud segmentation using High-Dynamic-Range (HDR) imaging based on multi-exposure fusion. We describe the HDR image generation process and release a new database to the community for benchmarking. Our proposed approach is the first using HDR radiance maps for cloud segmentation and achieves very good results.
Tasks Image Generation
Published 2018-03-02
URL http://arxiv.org/abs/1803.01071v1
PDF http://arxiv.org/pdf/1803.01071v1.pdf
PWC https://paperswithcode.com/paper/high-dynamic-range-imaging-for-cloud
Repo
Framework
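One ingredient of the HDR pipeline, multi-exposure fusion, can be sketched with a simple well-exposedness weighting: weight each exposure per pixel by how close it is to mid-gray, so over- and under-exposed pixels contribute little to the fused image. The Gaussian weighting and its width are common choices in exposure-fusion literature, not necessarily the paper's exact scheme.

```python
import numpy as np

def fuse_exposures(stack, sigma=0.2):
    """Fuse a stack of exposures (values in [0, 1]) with per-pixel weights
    that favor well-exposed (near mid-gray) pixels."""
    stack = np.asarray(stack, dtype=float)           # (exposures, H, W)
    weights = np.exp(-((stack - 0.5) ** 2) / (2 * sigma ** 2))
    weights /= weights.sum(axis=0, keepdims=True)    # normalize per pixel
    return (weights * stack).sum(axis=0)

# Two fake 2x2 exposures: one crushes the "ground", one blows out the "sky".
under = np.array([[0.05, 0.45], [0.02, 0.40]])
over = np.array([[0.55, 0.98], [0.60, 0.99]])
fused = fuse_exposures([under, over])
```

Each fused pixel tracks whichever exposure captured it well, which is exactly what makes the circumsolar and horizon regions simultaneously usable for segmentation.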

A new model for Cerebellar computation

Title A new model for Cerebellar computation
Authors Reza Moazzezi
Abstract The standard state space model is widely believed to account for the cerebellar computation in motor adaptation tasks [1]. Here we show that several recent experiments [2-4] where the visual feedback is irrelevant to the motor response challenge the standard model. Furthermore, we propose a new model that accounts for the results presented in [2-4]. According to this new model, learning and forgetting are coupled and are error-size dependent. We also show that under reasonable assumptions, our proposed model is the only model that accounts for both the classical adaptation paradigm as well as the recent experiments [2-4].
Tasks
Published 2018-02-22
URL http://arxiv.org/abs/1802.08217v1
PDF http://arxiv.org/pdf/1802.08217v1.pdf
PWC https://paperswithcode.com/paper/a-new-model-for-cerebellar-computation
Repo
Framework
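The model's core idea, coupled, error-size-dependent learning and forgetting, can be sketched as a one-line update rule. The specific gain function and coupling below are illustrative assumptions, not the paper's fitted model; the contrast drawn in the comments is with the standard state-space model's fixed retention factor.

```python
def adaptation_step(state, error, base_rate=0.3):
    """One adaptation update where both the learning gain and the forgetting
    of the old state scale with error size: large errors drive fast learning
    AND fast forgetting, while near-zero error leaves the state almost
    untouched (a standard state-space model would still decay it)."""
    rate = base_rate * min(abs(error), 1.0)   # error-size-dependent gain
    retention = 1.0 - rate                    # forgetting coupled to learning
    return retention * state + rate * error

# Zero error: the adapted state is fully retained rather than decaying.
retained = adaptation_step(0.8, 0.0)
# Unit error from a blank state: the state moves toward the error.
learned = adaptation_step(0.0, 1.0)
```

The zero-error case is the behavior that matches the experiments where visual feedback is irrelevant to the motor response: with no effective error, the model predicts no forgetting.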