October 19, 2019

3370 words 16 mins read

Paper Group ANR 300



Towards Human Pulse Rate Estimation from Face Video: Automatic Component Selection and Comparison of Blind Source Separation Methods

Title Towards Human Pulse Rate Estimation from Face Video: Automatic Component Selection and Comparison of Blind Source Separation Methods
Authors Vladislav Ostankovich, Geesara Prathap, Ilya Afanasyev
Abstract Human heartbeat can be measured in several ways, chosen according to the patient's condition: contact-based, using dedicated instruments, and non-contact, such as computer-vision-assisted techniques. Non-contact approaches are gaining popularity because they mitigate some limitations of contact-based techniques, especially in clinical settings. However, existing vision-guided approaches do not achieve highly accurate results, for reasons such as camera properties, illumination changes, and skin tones in the face image. We propose a technique that takes a video as input and returns the pulse rate as output. First, key-point detection is carried out on two facial subregions: the forehead and the nose-mouth area. After removing unstable features, temporal filtering is applied to isolate frequencies of interest. Four component-analysis methods are then employed to distinguish the cardiovascular pulse signal from extraneous noise caused by respiration, vestibular activity, and other changes in facial expression. Afterwards, the proposed peak-detection technique is applied to each component extracted by one of the four component-analysis algorithms, locating the positions of the peaks in each component. The proposed automatic component-selection technique then selects an optimal component from which the heartbeat is calculated. Finally, we conclude with a comparison of the four component-analysis methods (PCA, FastICA, JADE, SHIBBS), processing face-video datasets of fifteen volunteers with verification by an ECG/EKG workstation as ground truth.
Tasks
Published 2018-10-28
URL http://arxiv.org/abs/1810.11770v1
PDF http://arxiv.org/pdf/1810.11770v1.pdf
PWC https://paperswithcode.com/paper/towards-human-pulse-rate-estimation-from-face
Repo
Framework
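The pipeline's last two stages (temporal filtering to the heart-rate band, then peak counting on the selected component) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the band limits, filter order, and prominence threshold are assumptions, and the input here is a synthetic trace rather than a component extracted by PCA/ICA.

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def estimate_pulse_bpm(signal, fs, low=0.75, high=4.0):
    """Band-pass the trace to a plausible heart-rate band (45-240 bpm),
    then count peaks to estimate beats per minute."""
    b, a = butter(2, [low / (fs / 2), high / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, signal)
    # Peaks must be at least 1/high seconds apart to count as separate beats;
    # the prominence floor discards small ripples left by in-band noise.
    peaks, _ = find_peaks(filtered, distance=int(fs / high), prominence=0.3)
    duration_s = len(signal) / fs
    return 60.0 * len(peaks) / duration_s

# Synthetic 72-bpm trace (1.2 Hz) with slow drift and noise.
np.random.seed(0)
fs = 30.0                     # a typical webcam frame rate
t = np.arange(0, 20, 1 / fs)
trace = np.sin(2 * np.pi * 1.2 * t) + 0.5 * t / 20 + 0.1 * np.random.randn(len(t))
bpm = estimate_pulse_bpm(trace, fs)
```

The band-pass step also removes the linear drift, which stands in for the slow illumination changes the abstract mentions.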

Solving Non-Convex Non-Concave Min-Max Games Under Polyak-Łojasiewicz Condition

Title Solving Non-Convex Non-Concave Min-Max Games Under Polyak-Łojasiewicz Condition
Authors Maziar Sanjabi, Meisam Razaviyayn, Jason D. Lee
Abstract In this short note, we consider the problem of solving a min-max zero-sum game. This problem has been extensively studied in the convex-concave regime where the global solution can be computed efficiently. Recently, there have also been developments for finding the first order stationary points of the game when one of the player’s objective is concave or (weakly) concave. This work focuses on the non-convex non-concave regime where the objective of one of the players satisfies Polyak-{\L}ojasiewicz (PL) Condition. For such a game, we show that a simple multi-step gradient descent-ascent algorithm finds an $\varepsilon$–first order stationary point of the problem in $\widetilde{\mathcal{O}}(\varepsilon^{-2})$ iterations.
Tasks
Published 2018-12-07
URL http://arxiv.org/abs/1812.02878v1
PDF http://arxiv.org/pdf/1812.02878v1.pdf
PWC https://paperswithcode.com/paper/solving-non-convex-non-concave-min-max-games
Repo
Framework
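The multi-step gradient descent-ascent idea can be sketched on a toy game: for each descent step on the min-player, run several ascent steps on the max-player so the inner maximization is approximately solved (the role the PL condition plays in the paper's analysis). The step sizes, inner-loop length, and the strongly concave toy objective below are illustrative assumptions, not the paper's setting.

```python
import numpy as np

def multi_step_gda(grad_x, grad_y, x0, y0, eta_x=0.05, eta_y=0.1,
                   inner_steps=10, outer_steps=500):
    """Multi-step GDA: several ascent steps on the max-player y per
    descent step on the min-player x."""
    x, y = float(x0), float(y0)
    for _ in range(outer_steps):
        for _ in range(inner_steps):
            y += eta_y * grad_y(x, y)   # ascent on the inner max problem
        x -= eta_x * grad_x(x, y)       # one descent step on the outer min
    return x, y

# Toy game f(x, y) = (x - 1)^2 - (y - 2)^2; strong concavity in y implies PL.
gx = lambda x, y: 2 * (x - 1)
gy = lambda x, y: -2 * (y - 2)
x_star, y_star = multi_step_gda(gx, gy, x0=3.0, y0=-1.0)
```

On this toy game the iterates approach the stationary point (1, 2), matching the kind of first-order stationarity guarantee the note proves.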

Risk-Aware Resource Allocation for URLLC: Challenges and Strategies with Machine Learning

Title Risk-Aware Resource Allocation for URLLC: Challenges and Strategies with Machine Learning
Authors Amin Azari, Mustafa Ozger, Cicek Cavdar
Abstract Supporting ultra-reliable low-latency communications (URLLC) is a major challenge of 5G wireless networks. Stringent delay and reliability requirements need to be satisfied for both scheduled and non-scheduled URLLC traffic to enable a diverse set of 5G applications. Although physical- and media-access-control-layer solutions have been investigated for scheduled URLLC traffic alone, there is a lack of study on enabling transmission of non-scheduled URLLC traffic, especially in coexistence with scheduled URLLC traffic. Machine learning (ML) is an important enabler for such a coexistence scenario due to its ability to exploit spatial/temporal correlation in user behaviors and use of radio resources. Hence, in this paper, we first study the coexistence design challenges, especially the radio resource management (RRM) problem, and propose a distributed risk-aware ML solution for RRM. The proposed solution benefits from hybrid orthogonal/non-orthogonal radio resource slicing, and proactively regulates the spectrum needed to satisfy the delay/reliability requirement of each URLLC traffic type. A case study is introduced to investigate the potential of the proposed RRM in serving coexisting URLLC traffic types. The results further provide insights into the benefits of leveraging intelligent RRM; e.g., a 75% increase in data rate with respect to the conservative design approach for the scheduled traffic is achieved, while the 99.99% reliability of both scheduled and non-scheduled traffic types is satisfied.
Tasks
Published 2018-12-22
URL http://arxiv.org/abs/1901.04292v1
PDF http://arxiv.org/pdf/1901.04292v1.pdf
PWC https://paperswithcode.com/paper/risk-aware-resource-allocation-for-urllc
Repo
Framework
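The "risk-aware, proactive" flavor of the proposed RRM can be illustrated with a deliberately simplified sketch: size the slice reserved for non-scheduled traffic to a tail quantile of observed load rather than its mean, so the reliability target bounds the risk of under-provisioning. The quantile target, margin, and Poisson load model are all hypothetical; the paper's ML solution is far richer than this.

```python
import numpy as np

def reserve_spectrum(load_samples, reliability=0.9999, margin=1.1):
    """Risk-aware proactive reservation (sketch): provision for the tail of
    the load distribution, not its average, plus a safety margin."""
    return margin * np.quantile(load_samples, reliability)

rng = np.random.default_rng(1)
loads = rng.poisson(5, size=100000)   # hypothetical non-scheduled arrivals
slice_size = reserve_spectrum(loads)
```

A mean-based reservation would violate the 99.99% target whenever load spikes; the quantile-based slice covers those spikes at the cost of extra spectrum, which is the trade-off the paper's intelligent RRM tries to soften.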

Zero-Shot Visual Imitation

Title Zero-Shot Visual Imitation
Authors Deepak Pathak, Parsa Mahmoudieh, Guanghao Luo, Pulkit Agrawal, Dian Chen, Yide Shentu, Evan Shelhamer, Jitendra Malik, Alexei A. Efros, Trevor Darrell
Abstract The current dominant paradigm for imitation learning relies on strong supervision of expert actions to learn both ‘what’ and ‘how’ to imitate. We pursue an alternative paradigm wherein an agent first explores the world without any expert supervision and then distills its experience into a goal-conditioned skill policy with a novel forward consistency loss. In our framework, the role of the expert is only to communicate the goals (i.e., what to imitate) during inference. The learned policy is then employed to mimic the expert (i.e., how to imitate) after seeing just a sequence of images demonstrating the desired task. Our method is ‘zero-shot’ in the sense that the agent never has access to expert actions during training or for the task demonstration at inference. We evaluate our zero-shot imitator in two real-world settings: complex rope manipulation with a Baxter robot and navigation in previously unseen office environments with a TurtleBot. Through further experiments in VizDoom simulation, we provide evidence that better mechanisms for exploration lead to learning a more capable policy which in turn improves end task performance. Videos, models, and more details are available at https://pathak22.github.io/zeroshot-imitation/
Tasks Imitation Learning
Published 2018-04-23
URL http://arxiv.org/abs/1804.08606v1
PDF http://arxiv.org/pdf/1804.08606v1.pdf
PWC https://paperswithcode.com/paper/zero-shot-visual-imitation
Repo
Framework
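The forward consistency loss can be sketched in miniature: instead of penalizing the predicted action directly, penalize the state the forward model reaches with that action, so actions that differ from the ground truth but produce the same outcome are not punished. The toy additive dynamics and linear models below are assumptions for illustration; the paper learns both models as neural networks from exploration data.

```python
import numpy as np

def forward_consistency_loss(state, next_state, inverse_model, forward_model):
    """Match outcomes, not actions: roll the inverse model's predicted action
    through the forward model and compare the reached state to the target."""
    a_pred = inverse_model(state, next_state)          # infer the action taken
    s_pred = forward_model(state, a_pred)              # roll it through dynamics
    return float(np.sum((s_pred - next_state) ** 2))

# Toy 1-D world: the action is a displacement, dynamics are additive.
inverse = lambda s, s2: s2 - s
forward = lambda s, a: s + a
loss = forward_consistency_loss(np.array([0.0]), np.array([1.0]), inverse, forward)
```

With consistent models the loss is zero even if the expert reached the goal via a different action sequence, which is the property that lets the agent imitate from images alone.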

Human-Machine Collaborative Optimization via Apprenticeship Scheduling

Title Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Authors Matthew Gombolay, Reed Jensen, Jessica Stigile, Toni Golen, Neel Shah, Sung-Hyun Son, Julie Shah
Abstract Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the "single-expert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes, causing the codification of this knowledge to become laborious. We propose a new approach for capturing domain-expert heuristics through a pairwise ranking formulation. Our approach is model-free and does not require enumerating or iterating through a large state space. We empirically demonstrate that this approach accurately learns multifaceted heuristics on a synthetic data set incorporating job-shop scheduling and vehicle routing problems, as well as on two real-world data sets consisting of demonstrations of experts solving a weapon-to-target assignment problem and a hospital resource allocation problem. We also demonstrate that policies learned from human scheduling demonstration via apprenticeship learning can substantially improve the efficiency of a branch-and-bound search for an optimal schedule. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem. We demonstrate that this technique generates solutions substantially superior to those produced by human domain experts at a rate up to 9.5 times faster than an optimization approach and can be applied to optimally solve problems twice as complex as those solved by a human demonstrator.
Tasks Decision Making
Published 2018-05-11
URL http://arxiv.org/abs/1805.04220v1
PDF http://arxiv.org/pdf/1805.04220v1.pdf
PWC https://paperswithcode.com/paper/human-machine-collaborative-optimization-via
Repo
Framework
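The pairwise-ranking formulation can be sketched as follows: at each decision point, the task the expert scheduled is ranked above each task they passed over, and training on feature differences turns ranking into binary classification, never enumerating the scheduling state space. The three-dimensional features and the latent heuristic weights are hypothetical stand-ins for the paper's task features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])          # hypothetical expert heuristic

# Simulate expert demonstrations: the expert schedules the higher-scoring task.
pairs, labels = [], []
for _ in range(500):
    chosen, skipped = rng.normal(size=3), rng.normal(size=3)
    if true_w @ chosen < true_w @ skipped:
        chosen, skipped = skipped, chosen
    # Feature differences: (chosen - skipped) is a positive example and vice versa.
    pairs += [chosen - skipped, skipped - chosen]
    labels += [1, 0]

ranker = LogisticRegression().fit(np.array(pairs), labels)

def pick_next(candidates):
    """The learned policy schedules the highest-scoring candidate first."""
    return int(np.argmax(ranker.decision_function(candidates)))

choice = pick_next(np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]))
```

The learned weight vector recovers the expert's ranking direction, which is also what lets the policy prune a branch-and-bound search as described above.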

Enhanced Word Representations for Bridging Anaphora Resolution

Title Enhanced Word Representations for Bridging Anaphora Resolution
Authors Yufang Hou
Abstract Most current models of word representations (e.g., GloVe) have successfully captured fine-grained semantics. However, semantic similarity exhibited in these word embeddings is not suitable for resolving bridging anaphora, which requires the knowledge of associative similarity (i.e., relatedness) instead of semantic similarity information between synonyms or hypernyms. We create word embeddings (embeddings_PP) to capture such relatedness by exploring the syntactic structure of noun phrases. We demonstrate that using embeddings_PP alone achieves around 30% of accuracy for bridging anaphora resolution on the ISNotes corpus. Furthermore, we achieve a substantial gain over the state-of-the-art system (Hou et al., 2013) for bridging antecedent selection.
Tasks Bridging Anaphora Resolution, Semantic Similarity, Semantic Textual Similarity, Word Embeddings
Published 2018-03-13
URL http://arxiv.org/abs/1803.04790v2
PDF http://arxiv.org/pdf/1803.04790v2.pdf
PWC https://paperswithcode.com/paper/enhanced-word-representations-for-bridging
Repo
Framework
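Antecedent selection with such embeddings reduces to a nearest-neighbor lookup, sketched below. The two-dimensional vectors are hypothetical stand-ins for embeddings_PP entries; the point of the paper is precisely that this works only if the embeddings encode relatedness (door-house) rather than synonym-style similarity.

```python
import numpy as np

def select_antecedent(anaphor_vec, candidate_vecs):
    """Pick the candidate whose embedding is most related (by cosine) to
    the anaphor."""
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    scores = [cos(anaphor_vec, c) for c in candidate_vecs]
    return int(np.argmax(scores))

# Tiny made-up vectors: "house" is related to "door" in a part-of sense.
door = np.array([0.9, 0.1])
house = np.array([0.8, 0.2])
verdict = np.array([0.1, 0.9])
idx = select_antecedent(door, [verdict, house])
```

Here the anaphor "the door" correctly resolves to "the house" rather than the unrelated candidate.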

TutorialBank: A Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation

Title TutorialBank: A Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation
Authors Alexander R. Fabbri, Irene Li, Prawat Trairatvorakul, Yijiao He, Wei Tai Ting, Robert Tung, Caitlin Westerfield, Dragomir R. Radev
Abstract The field of Natural Language Processing (NLP) is growing rapidly, with new research published daily along with an abundance of tutorials, codebases and other online resources. In order to learn this dynamic field or stay up-to-date on the latest research, students as well as educators and researchers must constantly sift through multiple sources to find valuable, relevant information. To address this situation, we introduce TutorialBank, a new, publicly available dataset which aims to facilitate NLP education and research. We have manually collected and categorized over 6,300 resources on NLP as well as the related fields of Artificial Intelligence (AI), Machine Learning (ML) and Information Retrieval (IR). Our dataset is notably the largest manually-picked corpus of resources intended for NLP education which does not include only academic papers. Additionally, we have created both a search engine and a command-line tool for the resources and have annotated the corpus to include lists of research topics, relevant resources for each topic, prerequisite relations among topics, relevant sub-parts of individual resources, among other annotations. We are releasing the dataset and present several avenues for further research.
Tasks Information Retrieval
Published 2018-05-11
URL http://arxiv.org/abs/1805.04617v1
PDF http://arxiv.org/pdf/1805.04617v1.pdf
PWC https://paperswithcode.com/paper/tutorialbank-a-manually-collected-corpus-for
Repo
Framework
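The prerequisite-chain annotation naturally supports generating a reading order: topics must appear after their prerequisites, which is a topological sort. A sketch with the standard library's `graphlib` follows; the topic names and edges are hypothetical examples, not entries from the released corpus.

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite relations among NLP topics: each topic maps to
# the set of topics that should be learned before it.
prereqs = {
    "neural MT": {"seq2seq", "attention"},
    "seq2seq": {"RNNs"},
    "attention": {"RNNs"},
    "RNNs": {"backpropagation"},
}
order = list(TopologicalSorter(prereqs).static_order())
```

Any order returned places each topic after all of its prerequisites, so a student can follow the list front to back.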

Cascaded Pyramid Mining Network for Weakly Supervised Temporal Action Localization

Title Cascaded Pyramid Mining Network for Weakly Supervised Temporal Action Localization
Authors Haisheng Su, Xu Zhao, Tianwei Lin
Abstract Weakly supervised temporal action localization, which aims at temporally locating action instances in untrimmed videos using only video-level class labels during training, is an important yet challenging problem in video analysis. Many current methods adopt the “localization by classification” framework: first perform video classification, then locate the temporal areas that contribute most to the result. However, this framework fails to locate entire action instances and gives little consideration to local context. In this paper, we present a novel architecture called Cascaded Pyramid Mining Network (CPMN) to address these issues using two effective modules. First, to discover the entire temporal interval of a specific action, we design a two-stage cascaded module with a proposed Online Adversarial Erasing (OAE) mechanism, where new and complementary regions are mined by feeding the erased feature maps of discovered regions back to the system. Second, to exploit hierarchical contextual information in videos and reduce missed detections, we design a pyramid module which produces a scale-invariant attention map by combining the feature maps from different levels. Finally, we aggregate the results of the two modules to perform action localization by locating high-score areas in the temporal Class Activation Sequence (CAS). Extensive experiments conducted on the THUMOS14 and ActivityNet-1.3 datasets demonstrate the effectiveness of our method.
Tasks Action Localization, Temporal Action Localization, Video Classification, Weakly-supervised Temporal Action Localization
Published 2018-10-28
URL http://arxiv.org/abs/1810.11794v1
PDF http://arxiv.org/pdf/1810.11794v1.pdf
PWC https://paperswithcode.com/paper/cascaded-pyramid-mining-network-for-weakly
Repo
Framework
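The Online Adversarial Erasing mechanism can be sketched on a 1-D class activation sequence: zero out the regions already discovered and re-score, so later rounds mine complementary, lower-scoring parts of the same action. The relative threshold and round count below are assumptions; the paper applies this to feature maps inside a cascaded network rather than directly to the CAS.

```python
import numpy as np

def online_adversarial_erasing(cas, threshold=0.7, rounds=2):
    """Mine high-activation regions, erase them, and mine again so each
    round must look at complementary parts of the sequence."""
    cas = cas.copy()
    discovered = np.zeros_like(cas, dtype=bool)
    for _ in range(rounds):
        mask = cas >= threshold * cas.max()
        discovered |= mask
        cas[mask] = 0.0          # erase so the next pass looks elsewhere
    return discovered

cas = np.array([0.1, 0.9, 0.8, 0.3, 0.6, 0.5, 0.1])
found = online_adversarial_erasing(cas)
```

On this toy sequence, round one picks the strongest segment (0.9, 0.8) and round two recovers the weaker complementary segment (0.6, 0.5), illustrating why erasing helps cover entire action instances rather than just the most discriminative snippet.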

Fine-grained Video Categorization with Redundancy Reduction Attention

Title Fine-grained Video Categorization with Redundancy Reduction Attention
Authors Chen Zhu, Xiao Tan, Feng Zhou, Xiao Liu, Kaiyu Yue, Errui Ding, Yi Ma
Abstract For fine-grained categorization tasks, videos could serve as a better source than static images as videos have a higher chance of containing discriminative patterns. Nevertheless, a video sequence could also contain a lot of redundant and irrelevant frames. How to locate critical information of interest is a challenging task. In this paper, we propose a new network structure, known as Redundancy Reduction Attention (RRA), which learns to focus on multiple discriminative patterns by suppressing redundant feature channels. Specifically, it first summarizes the video by weight-summing all feature vectors in the feature maps of selected frames with a spatio-temporal soft attention, and then predicts which channels to suppress or to enhance according to this summary with a learned non-linear transform. Suppression is achieved by modulating the feature maps and threshing out weak activations. The updated feature maps are then used in the next iteration. Finally, the video is classified based on multiple summaries. The proposed method achieves outstanding performance on multiple video classification datasets. Furthermore, we have collected two large-scale video datasets, YouTube-Birds and YouTube-Cars, for future research on fine-grained video categorization. The datasets are available at http://www.cs.umd.edu/~chenzhu/fgvc.
Tasks Video Classification
Published 2018-10-26
URL http://arxiv.org/abs/1810.11189v1
PDF http://arxiv.org/pdf/1810.11189v1.pdf
PWC https://paperswithcode.com/paper/fine-grained-video-categorization-with
Repo
Framework
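One RRA-style iteration, attend, summarize, then gate channels, can be sketched as below. The linear attention scorer, sigmoid channel gate, and activation threshold are illustrative assumptions standing in for the paper's learned non-linear transform, and the features are frame vectors rather than full spatio-temporal feature maps.

```python
import numpy as np

def redundancy_reduction_step(features, w_att, w_gate):
    """Soft-attend over frame feature vectors to get a summary, then use the
    summary to suppress or enhance each channel, threshing weak activations."""
    att = np.exp(features @ w_att)                     # attention over frames
    att /= att.sum()
    summary = att @ features                           # weighted sum over frames
    gates = 1.0 / (1.0 + np.exp(-(w_gate * summary)))  # per-channel sigmoid gate
    modulated = features * gates                       # modulate the channels
    modulated[modulated < 0.05] = 0.0                  # thresh out weak activations
    return summary, modulated

features = np.array([[1.0, 0.0], [0.0, 1.0]])   # two frames, two channels
summary, modulated = redundancy_reduction_step(
    features, w_att=np.array([1.0, 0.0]), w_gate=np.array([2.0, 2.0]))
```

In the full model the modulated maps feed the next iteration, so each pass can attend to a different discriminative pattern.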

Learning discrete Bayesian networks in polynomial time and sample complexity

Title Learning discrete Bayesian networks in polynomial time and sample complexity
Authors Adarsh Barik, Jean Honorio
Abstract In this paper, we study the problem of structure learning for Bayesian networks in which nodes take discrete values. The problem is NP-hard in general, but we show that under certain conditions we can recover the true structure of a Bayesian network with a sufficient number of samples. We develop a mathematical model which does not assume any specific conditional probability distributions for the nodes. We use a primal-dual witness construction to prove that, under some technical conditions on the interaction between node pairs, we can exactly recover the parents and children of a node by performing group l_12-regularized multivariate regression. Thus, we recover the true Bayesian network structure. If the degree of a node is bounded, then the sample complexity of our proposed approach grows logarithmically with respect to the number of nodes in the Bayesian network. Furthermore, our method runs in polynomial time.
Tasks
Published 2018-03-12
URL http://arxiv.org/abs/1803.04087v3
PDF http://arxiv.org/pdf/1803.04087v3.pdf
PWC https://paperswithcode.com/paper/learning-discrete-bayesian-networks-in
Repo
Framework
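The neighborhood-recovery step can be sketched with off-the-shelf tools: regress a node's (one-hot-style) values on all other nodes with a group penalty, so each candidate node's whole coefficient group is either kept or zeroed together. Here sklearn's `MultiTaskLasso` (an l2/l1 mixed-norm penalty) stands in for the paper's group l_12 regularizer, and the two-node toy graph, noise level, and selection threshold are assumptions.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
n = 400
x1 = rng.integers(0, 2, n)          # true neighbor of the target node
x2 = rng.integers(0, 2, n)          # independent node
target = (x1 ^ (rng.random(n) < 0.1)).astype(float)  # noisy copy of x1

X = np.column_stack([x1, x2]).astype(float)
Y = np.column_stack([target, 1 - target])   # one-hot-style multivariate response
model = MultiTaskLasso(alpha=0.05).fit(X, Y)

# One coefficient-group norm per candidate node; nonzero groups are neighbors.
group_norms = np.linalg.norm(model.coef_, axis=0)
neighbors = group_norms > 0.1
```

The group penalty zeroes the irrelevant node's entire coefficient block while retaining the true neighbor's, which is the exact-recovery behavior the primal-dual witness argument certifies.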

Unique Identification of Macaques for Population Monitoring and Control

Title Unique Identification of Macaques for Population Monitoring and Control
Authors Ankita Shukla, Gullal Singh Cheema, Saket Anand, Qamar Qureshi, Yadvendradev Jhala
Abstract Despite loss of natural habitat due to development and urbanization, certain species like the Rhesus macaque have adapted well to the urban environment. With abundant food and no predators, macaque populations have increased substantially in urban areas, leading to frequent conflicts with humans. Overpopulated areas often witness macaques raiding crops, feeding on bird and snake eggs, and destroying nests, thus adversely affecting other species in the ecosystem. In order to mitigate these adverse effects, sterilization has emerged as a humane and effective way of controlling macaque populations. As sterilization requires physical capture of individuals or groups, their unique identification is integral to such control measures. In this work, we propose Macaque Face Identification (MFID), an image-based, non-invasive tool that relies on macaque facial recognition to identify individuals, and can be used to verify whether they are sterilized. Our primary contribution is a robust facial recognition and verification module designed for Rhesus macaques, but extensible to other non-human primate species. We evaluate the performance of MFID on a dataset of 93 monkeys under closed-set, open-set and verification evaluation protocols. Finally, we also report state-of-the-art results when evaluating our proposed model on endangered primate species.
Tasks Face Identification
Published 2018-11-02
URL http://arxiv.org/abs/1811.00743v2
PDF http://arxiv.org/pdf/1811.00743v2.pdf
PWC https://paperswithcode.com/paper/unique-identification-of-macaques-for
Repo
Framework

Hierarchical Motion Consistency Constraint for Efficient Geometrical Verification in UAV Image Matching

Title Hierarchical Motion Consistency Constraint for Efficient Geometrical Verification in UAV Image Matching
Authors San Jiang, Wanshou Jiang
Abstract This paper proposes a strategy for efficient geometrical verification in unmanned aerial vehicle (UAV) image matching. First, considering the complex transformation model between correspondence sets in image space, feature points of initial candidate matches are projected onto an elevation plane in object space, with the assistance of UAV flight control data and camera mounting angles. Spatial relationships are simplified as a 2D translation in which a motion establishes the relation of two correspondence points. Second, a hierarchical motion consistency constraint, termed HMCC, is designed to eliminate outliers from initial candidate matches, which includes three major steps, namely the global direction consistency constraint, the local direction-change consistency constraint and the global length consistency constraint. To cope with scenarios with high outlier ratios, the HMCC is achieved by using a voting scheme. Finally, an efficient geometrical verification strategy is proposed by using the HMCC as a pre-processing step to increase inlier ratios before the subsequent application of the basic RANSAC algorithm. The performance of the proposed strategy is verified through comprehensive comparison and analysis using real UAV datasets captured with different photogrammetric systems. Experimental results demonstrate that the generated motions have noticeable separation ability, and the HMCC-RANSAC algorithm can efficiently eliminate outliers based on the motion consistency constraint, with a speedup ratio reaching 6 for oblique UAV images. Even though a completeness sacrifice of approximately 7 percent of points is observed in image orientation tests, competitive orientation accuracy is achieved on all datasets used. For geometrical verification of both nadir and oblique UAV images, the proposed method can be a more efficient solution.
Tasks
Published 2018-01-12
URL http://arxiv.org/abs/1801.04096v1
PDF http://arxiv.org/pdf/1801.04096v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-motion-consistency-constraint
Repo
Framework
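The first HMCC step (the global direction consistency constraint with voting) can be sketched as a histogram vote over motion directions: under a 2D-translation model, inliers share one dominant direction while outliers scatter. The bin count and the spread of bins kept around the dominant one are assumptions for illustration.

```python
import numpy as np

def direction_consistency_filter(motions, bins=36, keep_spread=1):
    """Vote each match's motion direction into an angular histogram and keep
    only matches within keep_spread bins of the dominant direction."""
    angles = np.degrees(np.arctan2(motions[:, 1], motions[:, 0])) % 360
    idx = (angles // (360 / bins)).astype(int)
    dominant = np.bincount(idx, minlength=bins).argmax()
    # Circular bin distance to the dominant direction.
    dist = np.minimum((idx - dominant) % bins, (dominant - idx) % bins)
    return dist <= keep_spread

# Three inlier motions near the +x direction, two scattered outliers.
motions = np.array([[1.0, 0.05], [1.1, -0.03], [0.9, 0.02],
                    [-1.0, 0.8], [0.1, -1.2]])
inliers = direction_consistency_filter(motions)
```

Raising the inlier ratio this way before RANSAC is what yields the reported speedup, since RANSAC's iteration count grows rapidly as the outlier ratio increases.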

Meta-Learning for Low-Resource Neural Machine Translation

Title Meta-Learning for Low-Resource Neural Machine Translation
Authors Jiatao Gu, Yong Wang, Yun Chen, Kyunghyun Cho, Victor O. K. Li
Abstract In this paper, we propose to extend the recently introduced model-agnostic meta-learning algorithm (MAML) for low-resource neural machine translation (NMT). We frame low-resource translation as a meta-learning problem, and we learn to adapt to low-resource languages based on multilingual high-resource language tasks. We use the universal lexical representation~\citep{gu2018universal} to overcome the input-output mismatch across different languages. We evaluate the proposed meta-learning strategy using eighteen European languages (Bg, Cs, Da, De, El, Es, Et, Fr, Hu, It, Lt, Nl, Pl, Pt, Sk, Sl, Sv and Ru) as source tasks and five diverse languages (Ro, Lv, Fi, Tr and Ko) as target tasks. We show that the proposed approach significantly outperforms the multilingual, transfer learning based approach~\citep{zoph2016transfer} and enables us to train a competitive NMT system with only a fraction of training examples. For instance, the proposed approach can achieve as high as 22.04 BLEU on Romanian-English WMT’16 by seeing only 16,000 translated words (~600 parallel sentences).
Tasks Low-Resource Neural Machine Translation, Machine Translation, Meta-Learning, Transfer Learning
Published 2018-08-25
URL http://arxiv.org/abs/1808.08437v1
PDF http://arxiv.org/pdf/1808.08437v1.pdf
PWC https://paperswithcode.com/paper/meta-learning-for-low-resource-neural-machine
Repo
Framework
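The meta-learning loop can be sketched with a first-order MAML variant on a deliberately tiny family of tasks, a hedged stand-in for the paper's setup, where each task is a high-resource language pair and the model is an NMT system. Each toy task i simply wants the scalar parameter near a target a_i; the step sizes and epoch count are assumptions.

```python
import numpy as np

def fomaml(task_targets, theta=0.0, inner_lr=0.1, meta_lr=0.05, epochs=200):
    """First-order MAML: find an initialization from which one inner
    gradient step adapts well to every task (loss_i = (theta - a_i)^2)."""
    for _ in range(epochs):
        for a in task_targets:
            adapted = theta - inner_lr * 2 * (theta - a)   # one inner SGD step
            theta -= meta_lr * 2 * (adapted - a)           # first-order meta update
    return theta

init = fomaml([1.0, 2.0, 3.0])
# Adapting to an unseen task (target 2.5) from the meta-initialization:
adapted = init - 0.1 * 2 * (init - 2.5)
```

The meta-initialization settles near the center of the task family, so a single adaptation step on a new low-resource task already moves closer to its optimum, mirroring the few-example adaptation result quoted above.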

High-Dynamic-Range Imaging for Cloud Segmentation

Title High-Dynamic-Range Imaging for Cloud Segmentation
Authors Soumyabrata Dev, Florian M. Savoy, Yee Hui Lee, Stefan Winkler
Abstract Sky/cloud images obtained from ground-based sky-cameras are usually captured using a fish-eye lens with a wide field of view. However, the sky exhibits a large dynamic range in terms of luminance, more than a conventional camera can capture. It is thus difficult to capture the details of an entire scene with a regular camera in a single shot. In most cases, the circumsolar region is over-exposed, and the regions near the horizon are under-exposed. This renders cloud segmentation for such images difficult. In this paper, we propose HDRCloudSeg – an effective method for cloud segmentation using High-Dynamic-Range (HDR) imaging based on multi-exposure fusion. We describe the HDR image generation process and release a new database to the community for benchmarking. Our proposed approach is the first using HDR radiance maps for cloud segmentation and achieves very good results.
Tasks Image Generation
Published 2018-03-02
URL http://arxiv.org/abs/1803.01071v1
PDF http://arxiv.org/pdf/1803.01071v1.pdf
PWC https://paperswithcode.com/paper/high-dynamic-range-imaging-for-cloud
Repo
Framework
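One ingredient of the HDR pipeline, multi-exposure fusion, can be sketched with a simple well-exposedness weighting: weight each exposure per pixel by how close it is to mid-gray, so over- and under-exposed pixels contribute little to the fused image. The Gaussian weighting and its width are common choices in exposure-fusion literature, not necessarily the paper's exact scheme.

```python
import numpy as np

def fuse_exposures(stack, sigma=0.2):
    """Fuse a stack of exposures (values in [0, 1]) with per-pixel weights
    that favor well-exposed (near mid-gray) pixels."""
    stack = np.asarray(stack, dtype=float)           # (exposures, H, W)
    weights = np.exp(-((stack - 0.5) ** 2) / (2 * sigma ** 2))
    weights /= weights.sum(axis=0, keepdims=True)    # normalize per pixel
    return (weights * stack).sum(axis=0)

# Two fake 2x2 exposures: one crushes the "ground", one blows out the "sky".
under = np.array([[0.05, 0.45], [0.02, 0.40]])
over = np.array([[0.55, 0.98], [0.60, 0.99]])
fused = fuse_exposures([under, over])
```

Each fused pixel tracks whichever exposure captured it well, which is exactly what makes the circumsolar and horizon regions simultaneously usable for segmentation.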

A new model for Cerebellar computation

Title A new model for Cerebellar computation
Authors Reza Moazzezi
Abstract The standard state space model is widely believed to account for the cerebellar computation in motor adaptation tasks [1]. Here we show that several recent experiments [2-4] where the visual feedback is irrelevant to the motor response challenge the standard model. Furthermore, we propose a new model that accounts for the results presented in [2-4]. According to this new model, learning and forgetting are coupled and are error-size dependent. We also show that under reasonable assumptions, our proposed model is the only model that accounts for both the classical adaptation paradigm as well as the recent experiments [2-4].
Tasks
Published 2018-02-22
URL http://arxiv.org/abs/1802.08217v1
PDF http://arxiv.org/pdf/1802.08217v1.pdf
PWC https://paperswithcode.com/paper/a-new-model-for-cerebellar-computation
Repo
Framework
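The model's core idea, coupled, error-size-dependent learning and forgetting, can be sketched as a one-line update rule. The specific gain function and coupling below are illustrative assumptions, not the paper's fitted model; the contrast drawn in the comments is with the standard state-space model's fixed retention factor.

```python
def adaptation_step(state, error, base_rate=0.3):
    """One adaptation update where both the learning gain and the forgetting
    of the old state scale with error size: large errors drive fast learning
    AND fast forgetting, while near-zero error leaves the state almost
    untouched (a standard state-space model would still decay it)."""
    rate = base_rate * min(abs(error), 1.0)   # error-size-dependent gain
    retention = 1.0 - rate                    # forgetting coupled to learning
    return retention * state + rate * error

# Zero error: the adapted state is fully retained rather than decaying.
retained = adaptation_step(0.8, 0.0)
# Unit error from a blank state: the state moves toward the error.
learned = adaptation_step(0.0, 1.0)
```

The zero-error case is the behavior that matches the experiments where visual feedback is irrelevant to the motor response: with no effective error, the model predicts no forgetting.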