January 24, 2020

2955 words 14 mins read

Paper Group NANR 182


A Model Cortical Network for Spatiotemporal Sequence Learning and Prediction. Parsing Meaning Representations: Is Easier Always Better?. Adversarial Exploration Strategy for Self-Supervised Imitation Learning. Agile Depth Sensing Using Triangulation Light Curtains. CGNF: Conditional Graph Neural Fields. Rethinking learning rate schedules for stocha …

A Model Cortical Network for Spatiotemporal Sequence Learning and Prediction

Title A Model Cortical Network for Spatiotemporal Sequence Learning and Prediction
Authors Jielin Qiu, Ge Huang, Tai Sing Lee
Abstract In this paper we develop a hierarchical network model, called the Hierarchical Prediction Network (HPNet), to understand how spatiotemporal memories might be learned and encoded in a representational hierarchy for predicting future video frames. The model is inspired by the feedforward, feedback and lateral recurrent circuits in the mammalian hierarchical visual system. It assumes that spatiotemporal memories are encoded in the recurrent connections within each level and between different levels of the hierarchy. The model contains a feedforward path that computes and encodes spatiotemporal features of increasing complexity and a feedback path that projects interpretations from a higher level to the level below. Within each level, the feedforward and feedback paths intersect in a recurrent gated circuit that integrates their signals, as well as the circuit’s internal memory states, to generate a prediction of the incoming signals. The network learns by comparing the incoming signals with its prediction, updating its internal model of the world by minimizing the prediction errors at each level of the hierarchy in the style of predictive self-supervised learning. The network processes data in blocks of video frames rather than on a frame-by-frame basis. This allows it to learn relationships among movement patterns, yielding state-of-the-art performance in long-range video sequence prediction on benchmark datasets. We observe that hierarchical interaction in the network introduces sensitivity to memories of global movement patterns even in the population representation of units at the earliest level. Finally, we provide neurophysiological evidence showing that neurons in the early visual cortex of awake monkeys exhibit very similar sensitivity and behaviors. These findings suggest that predictive self-supervised learning might be an important principle of representational learning in the visual cortex.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=BJl_VnR9Km
PDF https://openreview.net/pdf?id=BJl_VnR9Km
PWC https://paperswithcode.com/paper/a-model-cortical-network-for-spatiotemporal
Repo
Framework
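
The abstract above describes one recurrent gated circuit per level that fuses feedforward and feedback signals with an internal memory state and is trained on its own prediction error. The sketch below is a hypothetical, much-simplified single level of such a circuit (not the authors' HPNet code); the layer sizes, the GRU-based gating, and the per-pixel state layout are illustrative assumptions.

```python
# Hypothetical single level of an HPNet-style predictive circuit (illustrative only).
import torch
import torch.nn as nn

class PredictiveLevel(nn.Module):
    def __init__(self, in_ch, feedback_ch, hidden_ch):
        super().__init__()
        self.encode = nn.Conv2d(in_ch, hidden_ch, 3, padding=1)        # feedforward path
        self.project_feedback = nn.Conv2d(feedback_ch, hidden_ch, 1)   # feedback path from level above
        self.gate = nn.GRUCell(2 * hidden_ch, hidden_ch)               # recurrent gated circuit (per pixel)
        self.predict = nn.Conv2d(hidden_ch, in_ch, 3, padding=1)       # prediction of the incoming signal

    def forward(self, x, feedback, state):
        # x: (B, C, H, W) block of frames stacked on channels; state: (B*H*W, hidden_ch)
        b, _, h, w = x.shape
        ff = self.encode(x)
        fb = self.project_feedback(feedback)
        inp = torch.cat([ff, fb], dim=1).permute(0, 2, 3, 1).reshape(b * h * w, -1)
        state = self.gate(inp, state)                                  # update internal memory
        hidden = state.reshape(b, h, w, -1).permute(0, 3, 1, 2)
        pred = self.predict(hidden)
        error = x - pred                                               # prediction error drives learning
        return pred, error, state

# toy usage: the level is trained by minimizing its own prediction error
level = PredictiveLevel(in_ch=3, feedback_ch=8, hidden_ch=16)
x = torch.randn(2, 3, 32, 32)
feedback = torch.randn(2, 8, 32, 32)
state = torch.zeros(2 * 32 * 32, 16)
pred, error, state = level(x, feedback, state)
loss = (error ** 2).mean()
```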

Parsing Meaning Representations: Is Easier Always Better?

Title Parsing Meaning Representations: Is Easier Always Better?
Authors Zi Lin, Nianwen Xue
Abstract Parsing accuracy varies a great deal across different meaning representations. In this paper, we compare parsing performance for Abstract Meaning Representation (AMR) and Minimal Recursion Semantics (MRS), and provide an in-depth analysis of the factors that contribute to the discrepancy in their parsing accuracy. By crystallizing the trade-off between representational expressiveness and ease of automatic parsing, we hope our results can help inform the design of next-generation meaning representations.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3304/
PDF https://www.aclweb.org/anthology/W19-3304
PWC https://paperswithcode.com/paper/parsing-meaning-representations-is-easier
Repo
Framework

Adversarial Exploration Strategy for Self-Supervised Imitation Learning

Title Adversarial Exploration Strategy for Self-Supervised Imitation Learning
Authors Zhang-Wei Hong, Tsu-Jui Fu, Tzu-Yun Shann, Yi-Hsiang Chang, Chun-Yi Lee
Abstract We present an adversarial exploration strategy, a simple yet effective imitation learning scheme that incentivizes exploration of an environment without any extrinsic reward or human demonstration. Our framework consists of a deep reinforcement learning (DRL) agent and an inverse dynamics model contesting with each other. The former collects training samples for the latter, with the objective of maximizing the latter’s error. The latter is trained with samples collected by the former and generates rewards for the former when it fails to predict the actual action taken by the former. In such a competitive setting, the DRL agent learns to generate samples that the inverse dynamics model fails to predict correctly, and the inverse dynamics model learns to adapt to these challenging samples. We further propose a reward structure that ensures the DRL agent collects only moderately hard samples, not overly hard ones that would prevent the inverse model from imitating effectively. We evaluate the effectiveness of our method on several OpenAI Gym robotic arm and hand manipulation tasks against a number of baseline models. Experimental results show that our method is comparable to an agent trained directly with expert demonstrations, and superior to the other baselines even without any human priors.
Tasks Imitation Learning
Published 2019-05-01
URL https://openreview.net/forum?id=Hyxtso0qtX
PDF https://openreview.net/pdf?id=Hyxtso0qtX
PWC https://paperswithcode.com/paper/adversarial-exploration-strategy-for-self-1
Repo
Framework
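
The reward structure in the abstract above (inverse-model error as the agent's reward, capped so only moderately hard samples are rewarded) can be written down compactly. The sketch below is an assumed illustration, not the authors' implementation; the network sizes, the squared-error metric, and the threshold `delta` are placeholders.

```python
# Hedged sketch of an adversarial-exploration reward: the inverse dynamics model predicts
# the action from a state transition, and its prediction error rewards the exploring agent,
# but only up to a threshold so "too hard" samples yield no reward.
import torch
import torch.nn as nn

class InverseDynamics(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, s, s_next):
        return self.net(torch.cat([s, s_next], dim=-1))

def exploration_reward(inv_model, s, a, s_next, delta=0.5):
    """Reward = inverse-model error, zeroed when the sample is overly hard (error > delta)."""
    with torch.no_grad():
        err = ((inv_model(s, s_next) - a) ** 2).mean(dim=-1)   # per-sample prediction error
    return torch.where(err <= delta, err, torch.zeros_like(err))
```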

Agile Depth Sensing Using Triangulation Light Curtains

Title Agile Depth Sensing Using Triangulation Light Curtains
Authors Joseph R. Bartels, Jian Wang, William “Red” Whittaker, Srinivasa G. Narasimhan
Abstract Depth sensors like LIDARs and Kinect use a fixed depth acquisition strategy that is independent of the scene of interest. Due to the low spatial and temporal resolution of these sensors, this strategy can undersample parts of the scene that are important (small or fast moving objects), or oversample areas that are not informative for the task at hand (a fixed planar wall). In this paper, we present an approach and system to dynamically and adaptively sample the depths of a scene using the principle of triangulation light curtains. The approach directly detects the presence or absence of objects at specified 3D lines. These 3D lines can be sampled sparsely, non-uniformly, or densely only at specified regions. The depth sampling can be varied in real-time, enabling quick object discovery or detailed exploration of areas of interest. These results are achieved using a novel prototype light curtain system that is based on a 2D rolling shutter camera with higher light efficiency, working range, and faster adaptation than previous work, making it useful broadly for autonomous navigation and exploration.
Tasks Autonomous Navigation
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Bartels_Agile_Depth_Sensing_Using_Triangulation_Light_Curtains_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Bartels_Agile_Depth_Sensing_Using_Triangulation_Light_Curtains_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/agile-depth-sensing-using-triangulation-light
Repo
Framework

CGNF: Conditional Graph Neural Fields

Title CGNF: Conditional Graph Neural Fields
Authors Tengfei Ma, Cao Xiao, Junyuan Shang, Jimeng Sun
Abstract Graph convolutional networks have achieved tremendous success in graph node classification tasks. These models learn better node representations by encoding the graph structure and node features. However, the correlation between node labels is not considered. In this paper, we propose a novel architecture for graph node classification, named conditional graph neural fields (CGNF). By integrating conditional random fields (CRFs) into graph convolutional networks, we explicitly model a joint probability over the entire set of node labels, thus taking advantage of neighborhood label information in the node label prediction task. Our model combines the representation capacity of graph neural networks with the prediction power of CRFs. Experiments on several graph datasets demonstrate the effectiveness of CGNF.
Tasks Node Classification
Published 2019-05-01
URL https://openreview.net/forum?id=ryxMX2R9YQ
PDF https://openreview.net/pdf?id=ryxMX2R9YQ
PWC https://paperswithcode.com/paper/cgnf-conditional-graph-neural-fields
Repo
Framework
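
To make the CGNF idea above concrete, the sketch below shows one common way (an assumption on my part, not the paper's exact formulation) to couple GCN unary scores with a CRF-style pairwise term over graph edges: the energy of a full label assignment sums per-node scores and label-compatibility terms along edges. Training and inference would use approximations (e.g. pseudo-likelihood) not shown here.

```python
# Illustrative CRF-over-graph energy combining GCN node logits with a pairwise term.
import torch

def crf_energy(unary_logits, labels, edge_index, pairwise):
    """
    unary_logits: (N, C) node scores from a GCN
    labels:       (N,)  candidate label assignment
    edge_index:   (2, E) graph edges
    pairwise:     (C, C) learnable label-compatibility matrix
    Returns the energy of the assignment; lower is better.
    """
    unary = -unary_logits[torch.arange(labels.size(0)), labels].sum()
    src, dst = edge_index
    pair = -pairwise[labels[src], labels[dst]].sum()
    return unary + pair
```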

Rethinking learning rate schedules for stochastic optimization

Title Rethinking learning rate schedules for stochastic optimization
Authors Rong Ge, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli
Abstract There is a stark disparity between the learning rate schedules used in the practice of large scale machine learning and the admissible learning rate schedules prescribed in the theory of stochastic approximation. Recent results, such as the ‘super-convergence’ methods which use oscillating learning rates, serve to emphasize this point even more. One plausible explanation is that non-convex neural network training procedures are better suited to fundamentally different learning rate schedules, such as the “cut the learning rate every constant number of epochs” method (which more closely resembles an exponentially decaying learning rate schedule); note that this widely used schedule is in stark contrast to the polynomial decay schemes prescribed in the stochastic approximation literature, which are indeed shown to be (worst case) optimal for classes of convex optimization problems. The main contribution of this work shows that the picture is far more nuanced: we do not even need to move to non-convex optimization to show that other learning rate schemes can be far more effective. In fact, even for the simple case of stochastic linear regression with a fixed time horizon, the rate achieved by any polynomial decay scheme is sub-optimal compared to the statistical minimax rate (by a factor of the condition number); in contrast, the “cut the learning rate every constant number of epochs” scheme provides an exponential improvement (depending only logarithmically on the condition number) compared to any polynomial decay scheme. Finally, it is important to ask whether our theoretical insights are somehow fundamentally tied to quadratic loss minimization (where we have circumvented minimax lower bounds for more general convex optimization problems). Here, we conjecture that recent results which make the gradient norm small at a near optimal rate, for both convex and non-convex optimization, may also provide more insights into learning rate schedules used in practice.
Tasks Stochastic Optimization
Published 2019-05-01
URL https://openreview.net/forum?id=HJePy3RcF7
PDF https://openreview.net/pdf?id=HJePy3RcF7
PWC https://paperswithcode.com/paper/rethinking-learning-rate-schedules-for
Repo
Framework
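
The two schedule families contrasted in the abstract above are easy to write down side by side. The specific constants below (decay exponent, cut interval, cut factor) are illustrative choices, not values from the paper.

```python
# Polynomial decay (classical stochastic approximation) vs. cutting the rate by a constant
# factor every fixed number of epochs (which behaves like exponential decay over the run).
def poly_decay(lr0, t, alpha=0.5):
    """lr_t = lr0 / (1 + t) ** alpha, a classical polynomial schedule."""
    return lr0 / (1 + t) ** alpha

def step_decay(lr0, epoch, cut_every=30, factor=0.1):
    """Cut the learning rate by `factor` every `cut_every` epochs."""
    return lr0 * factor ** (epoch // cut_every)

if __name__ == "__main__":
    for epoch in (0, 30, 60, 90):
        print(epoch, round(poly_decay(0.1, epoch), 5), step_decay(0.1, epoch))
```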

Through-Wall Object Recognition and Pose Estimation

Title Through-Wall Object Recognition and Pose Estimation
Authors Ruoyu Wang, Siyuan Xiang, Chen Feng, Pu Wang, Semiha Ergan, Yi Fang
Abstract Robots need to perceive beyond lines of sight, e.g., to avoid cutting water pipes or electric wires when drilling holes in a wall. Recent off-the-shelf radio frequency (RF) imaging sensors ease the process of 3D sensing inside or through walls. Yet unlike optical images, RF images are difficult for a human to understand. Meanwhile, in practice, RF components are often subject to hardware imperfections, resulting in distorted RF images whose quality can be far from the claimed specifications. Thus, we introduce several challenging geometric and semantic perception tasks on such signals, including object and material recognition, fine-grained property classification, and pose estimation. Since detailed forward modeling of such sensors is sometimes difficult, due to hidden or inaccessible system parameters, onboard processing procedures, and limited access to the raw RF waveform, we tackle the above tasks with supervised machine learning. We collected a large dataset of RF images of utility objects captured through a mock wall as the input to our algorithms, with the corresponding optical images taken simultaneously from the other side of the wall as ground truth. We designed three learning algorithms based on nearest neighbors or neural networks and report their performance on the dataset. Our experiments show reasonable results for the semantic perception tasks but unsatisfactory results for the geometric ones, calling for more effort in this research direction.
Tasks Material Recognition, Object Recognition, Pose Estimation, RF-based Pose Estimation
Published 2019-05-21
URL http://doi.org/10.22260/ISARC2019/0157
PDF https://www.iaarc.org/publications/fulltext/ISARC_2019_Paper_231.pdf
PWC https://paperswithcode.com/paper/through-wall-object-recognition-and-pose
Repo
Framework

Graph Based Skeleton Modeling for Human Activity Analysis

Title Graph Based Skeleton Modeling for Human Activity Analysis
Authors Jiun-Yu Kao, Antonio Ortega, Dong Tian, Hassan Mansour, Anthony Vetro
Abstract Understanding human activity based on sensor information is required in many applications and has been an active research area. With the advancement of depth sensors and tracking algorithms, systems for human motion activity analysis can be built by combining off-the-shelf motion tracking systems with application-dependent learning tools to extract higher semantic level information. Many of these motion tracking systems provide raw motion data registered to the skeletal joints in the human body. In this paper, we propose novel representations for human motion data using the skeleton-based graph structure along with techniques in graph signal processing. Methods for graph construction and their corresponding basis functions are discussed. The proposed representations can achieve comparable classification performance in action recognition tasks while additionally being more robust to noise and missing data.
Tasks graph construction, Skeleton Based Action Recognition
Published 2019-08-26
URL https://doi.org/10.1109/ICIP.2019.8803186
PDF http://www.merl.com/publications/docs/TR2019-037.pdf
PWC https://paperswithcode.com/paper/graph-based-skeleton-modeling-for-human
Repo
Framework
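
The graph-signal-processing view in the abstract above treats joints as graph nodes, bones as edges, and a frame of joint coordinates as a signal on that graph, which can then be analysed in the graph Fourier (Laplacian eigenvector) basis. The sketch below illustrates this on a toy 5-joint skeleton; the skeleton topology and the use of the combinatorial Laplacian are assumptions for illustration, not the paper's exact construction.

```python
# Graph Fourier transform of a skeleton pose signal on a toy skeleton graph.
import numpy as np

bones = [(0, 1), (1, 2), (1, 3), (1, 4)]            # toy skeleton edges (joint index pairs)
n = 5
A = np.zeros((n, n))
for i, j in bones:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A                       # combinatorial graph Laplacian
eigvals, U = np.linalg.eigh(L)                       # graph Fourier basis (eigenvectors)

pose = np.random.default_rng(0).normal(size=(n, 3))  # one frame of 3D joint positions
gft = U.T @ pose                                     # graph Fourier coefficients per coordinate
# Coefficients at small eigenvalues capture smooth, whole-body structure; those at large
# eigenvalues capture fine articulation, and both can serve as action-recognition features.
```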

Sampling Matters! An Empirical Study of Negative Sampling Strategies for Learning of Matching Models in Retrieval-based Dialogue Systems

Title Sampling Matters! An Empirical Study of Negative Sampling Strategies for Learning of Matching Models in Retrieval-based Dialogue Systems
Authors Jia Li, Chongyang Tao, Wei Wu, Yansong Feng, Dongyan Zhao, Rui Yan
Abstract We study how to sample negative examples to automatically construct a training set for effective model learning in retrieval-based dialogue systems. Following the idea of dynamically adapting negative examples to matching models during learning, we consider four strategies: minimum sampling, maximum sampling, semi-hard sampling, and decay-hard sampling. Empirical studies on two benchmarks with three matching models indicate that, compared with the widely used random sampling strategy, the first two strategies lead to a performance drop, while the latter two bring consistent improvements to all models on both benchmarks.
Tasks Conversational Response Selection
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1128/
PDF https://www.aclweb.org/anthology/D19-1128
PWC https://paperswithcode.com/paper/sampling-matters-an-empirical-study-of
Repo
Framework
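
Three of the sampling strategies named in the abstract above can be sketched as simple selection rules over a matching model's scores for a pool of candidate negatives. This is an illustrative reading, not the paper's code; the decay-hard variant (which changes its hardness target over training) is omitted.

```python
# Selecting one negative response from a candidate pool, given model matching scores.
import numpy as np

def sample_negative(neg_scores, pos_score, strategy="semi_hard"):
    """Return the index of one negative from `neg_scores` (scores for candidate negatives)."""
    neg_scores = np.asarray(neg_scores)
    if strategy == "minimum":              # easiest negative: lowest matching score
        return int(neg_scores.argmin())
    if strategy == "maximum":              # hardest negative: highest matching score
        return int(neg_scores.argmax())
    if strategy == "semi_hard":            # hardest negative still scored below the positive
        mask = neg_scores < pos_score
        if mask.any():
            candidates = np.where(mask)[0]
            return int(candidates[neg_scores[candidates].argmax()])
        return int(neg_scores.argmin())    # fall back to the easiest negative
    raise ValueError(strategy)
```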

A study of semantic augmentation of word embeddings for extractive summarization

Title A study of semantic augmentation of word embeddings for extractive summarization
Authors Nikiforos Pittaras, Vangelis Karkaletsis
Abstract In this study we examine the effect of semantic augmentation approaches on extractive text summarization. WordNet hypernym relations are used to extract term-frequency concept information, which is subsequently concatenated to sentence-level representations produced by aggregated deep neural word embeddings. Multiple dimensionality reduction techniques and combination strategies are examined via feature transformation and clustering methods. An experimental evaluation on the MultiLing 2015 MSS dataset illustrates that semantic information can benefit the extractive summarization process in terms of F1, ROUGE-1 and ROUGE-2 scores, with LSA-based post-processing introducing the largest improvements.
Tasks Dimensionality Reduction, Text Summarization, Word Embeddings
Published 2019-09-01
URL https://www.aclweb.org/anthology/W19-8909/
PDF https://www.aclweb.org/anthology/W19-8909
PWC https://paperswithcode.com/paper/a-study-of-semantic-augmentation-of-word
Repo
Framework
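
A rough sketch of the kind of augmentation the abstract above describes: WordNet hypernyms of a sentence's tokens are counted into a concept-frequency vector and concatenated to an aggregated (here simply averaged) word-embedding vector. The `embed` dictionary and the fixed concept vocabulary are placeholder assumptions, and the dimensionality-reduction step from the paper is not shown.

```python
# Semantic augmentation of a sentence vector with WordNet hypernym term frequencies.
from collections import Counter
import numpy as np
from nltk.corpus import wordnet as wn   # requires the NLTK WordNet data to be downloaded

def hypernym_counts(tokens):
    counts = Counter()
    for tok in tokens:
        for syn in wn.synsets(tok):
            for hyper in syn.hypernyms():
                counts[hyper.name()] += 1
    return counts

def augmented_sentence_vector(tokens, embed, concept_vocab):
    # embed: dict token -> np.ndarray; concept_vocab: fixed list of hypernym synset names
    vecs = [embed[t] for t in tokens if t in embed]
    emb = np.mean(vecs, axis=0) if vecs else np.zeros(next(iter(embed.values())).shape)
    counts = hypernym_counts(tokens)
    concept = np.array([counts.get(c, 0.0) for c in concept_vocab], dtype=float)
    return np.concatenate([emb, concept])
```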

CVIT’s submissions to WAT-2019

Title CVIT’s submissions to WAT-2019
Authors Jerin Philip, Shashank Siripragada, Upendra Kumar, Vinay Namboodiri, C V Jawahar
Abstract This paper describes the Neural Machine Translation systems used by IIIT Hyderabad (CVIT-MT) for the translation tasks of WAT-2019. We participated in tasks pertaining to Indian languages and submitted results for the English-Hindi, Hindi-English, English-Tamil and Tamil-English language pairs. We employ the Transformer architecture, experimenting with multilingual models and methods for low-resource languages.
Tasks Machine Translation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5215/
PDF https://www.aclweb.org/anthology/D19-5215
PWC https://paperswithcode.com/paper/cvits-submissions-to-wat-2019
Repo
Framework

SVD: A Large-Scale Short Video Dataset for Near-Duplicate Video Retrieval

Title SVD: A Large-Scale Short Video Dataset for Near-Duplicate Video Retrieval
Authors Qing-Yuan Jiang, Yi He, Gen Li, Jian Lin, Lei Li, Wu-Jun Li
Abstract With the explosive growth of video data in real applications, near-duplicate video retrieval (NDVR) has become indispensable and challenging, especially for short videos. However, all existing NDVR datasets are designed for long videos. Furthermore, most of them are small-scale and lack diversity due to the high cost of collecting and labeling near-duplicate videos. In this paper, we introduce a large-scale short video dataset, called SVD, for the NDVR task. SVD contains over 500,000 short videos and over 30,000 labeled near-duplicate videos. We use multiple video mining techniques to construct positive/negative pairs. Furthermore, we design temporal and spatial transformations to mimic user-attack behavior in real applications and thereby construct more difficult variants of SVD. Experiments show that existing state-of-the-art NDVR methods, including real-value based and hashing based methods, fail to achieve satisfactory performance on this challenging dataset. The release of the SVD dataset will foster research and system engineering in the NDVR area. The SVD dataset is available at https://svdbase.github.io.
Tasks Video Retrieval
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Jiang_SVD_A_Large-Scale_Short_Video_Dataset_for_Near-Duplicate_Video_Retrieval_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Jiang_SVD_A_Large-Scale_Short_Video_Dataset_for_Near-Duplicate_Video_Retrieval_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/svd-a-large-scale-short-video-dataset-for
Repo
Framework

Deep Incremental Hashing Network for Efficient Image Retrieval

Title Deep Incremental Hashing Network for Efficient Image Retrieval
Authors Dayan Wu, Qi Dai, Jing Liu, Bo Li, Weiping Wang
Abstract Hashing has shown great potential in large-scale image retrieval due to its storage and computation efficiency, especially with recent deep supervised hashing methods. To achieve promising performance, deep supervised hashing methods require a large amount of training data from different classes. However, when images of new categories emerge, existing deep hashing methods have to retrain the CNN model and regenerate hash codes for all the database images, which is impractical for a large-scale retrieval system. In this paper, we propose a novel deep hashing framework, called Deep Incremental Hashing Network (DIHN), for learning hash codes in an incremental manner. DIHN learns hash codes for newly arriving images directly, while keeping the old ones unchanged. Simultaneously, a deep hash function for the query set is learned by preserving the similarities between training points. Extensive experiments on two widely used image retrieval benchmarks demonstrate that the proposed DIHN framework can significantly decrease training time while maintaining state-of-the-art retrieval accuracy.
Tasks Image Retrieval
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Wu_Deep_Incremental_Hashing_Network_for_Efficient_Image_Retrieval_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Wu_Deep_Incremental_Hashing_Network_for_Efficient_Image_Retrieval_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/deep-incremental-hashing-network-for
Repo
Framework
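
The incremental idea in the abstract above is that codes for new images are optimised against similarity labels while the codes already stored for old database images stay fixed, so the database never has to be re-encoded. The loss below is a schematic reading of that idea (an assumption, not DIHN's exact objective); the binary-constraint handling and the query-network training are not shown.

```python
# Similarity-preserving loss for new hash codes against a frozen database of old codes.
import torch

def incremental_hash_loss(new_codes, old_codes, similarity):
    """
    new_codes:  (M, q) learnable, approximately binary codes for new images (e.g. tanh outputs)
    old_codes:  (N, q) fixed codes of the existing database (treated as constants)
    similarity: (M, N) entries +1 for similar pairs, -1 for dissimilar pairs
    """
    q = new_codes.shape[1]
    inner = new_codes @ old_codes.detach().t() / q     # scaled inner products in [-1, 1]
    return ((inner - similarity) ** 2).mean()
```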

Modeling the Relationship between User Comments and Edits in Document Revision

Title Modeling the Relationship between User Comments and Edits in Document Revision
Authors Xuchao Zhang, Dheeraj Rajagopal, Michael Gamon, Sujay Kumar Jauhar, ChangTien Lu
Abstract Management of collaborative documents can be difficult, given the profusion of edits and comments that multiple authors make during a document’s evolution. Reliably modeling the relationship between edits and comments is a crucial step towards helping the user keep track of a document in flux. A number of authoring tasks, such as categorizing and summarizing edits, detecting completed to-dos, and visually rearranging comments, could benefit from such a contribution. Thus, in this paper we explore the relationship between comments and edits by defining two novel, related tasks: Comment Ranking and Edit Anchoring. We begin by collecting a dataset with more than half a million comment-edit pairs based on Wikipedia revision histories. We then propose a hierarchical multi-layer deep neural network to model the relationship between edits and comments. Our architecture tackles both Comment Ranking and Edit Anchoring by encoding specific edit actions such as additions and deletions, while also accounting for document context. In a number of evaluation settings, our experimental results show that our approach significantly outperforms several strong baselines. We achieve a precision@1 of 71.0% and a precision@3 of 94.4% for Comment Ranking, and 74.4% accuracy on Edit Anchoring.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1505/
PDF https://www.aclweb.org/anthology/D19-1505
PWC https://paperswithcode.com/paper/modeling-the-relationship-between-user
Repo
Framework
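
For the Comment Ranking task described above, one standard way to train a scorer (shown here only as an illustration; the paper's hierarchical architecture and exact loss may differ) is a pairwise margin ranking loss that pushes the true comment-edit pair above a sampled negative edit. The encoders producing the vectors are placeholders.

```python
# Pairwise margin ranking loss over comment/edit representation vectors.
import torch
import torch.nn.functional as F

def comment_ranking_loss(comment_vec, pos_edit_vec, neg_edit_vec, margin=1.0):
    pos = F.cosine_similarity(comment_vec, pos_edit_vec, dim=-1)   # score of the true edit
    neg = F.cosine_similarity(comment_vec, neg_edit_vec, dim=-1)   # score of a negative edit
    return F.relu(margin - pos + neg).mean()
```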

Can Modern Standard Arabic Approaches be used for Arabic Dialects? Sentiment Analysis as a Case Study

Title Can Modern Standard Arabic Approaches be used for Arabic Dialects? Sentiment Analysis as a Case Study
Authors Chatrine Qwaider, Stergios Chatzikyriakidis, Simon Dobnik
Abstract
Tasks Sentiment Analysis
Published 2019-07-01
URL https://www.aclweb.org/anthology/W19-5606/
PDF https://www.aclweb.org/anthology/W19-5606
PWC https://paperswithcode.com/paper/can-modern-standard-arabic-approaches-be-used
Repo
Framework