Paper Group NANR 182
A Model Cortical Network for Spatiotemporal Sequence Learning and Prediction
Title | A Model Cortical Network for Spatiotemporal Sequence Learning and Prediction |
Authors | Jielin Qiu, Ge Huang, Tai Sing Lee |
Abstract | In this paper we developed a hierarchical network model, called the Hierarchical Prediction Network (HPNet), to understand how spatiotemporal memories might be learned and encoded in a representational hierarchy for predicting future video frames. The model is inspired by the feedforward, feedback and lateral recurrent circuits in the mammalian hierarchical visual system. It assumes that spatiotemporal memories are encoded in the recurrent connections within each level and between different levels of the hierarchy. The model contains a feed-forward path that computes and encodes spatiotemporal features of successive complexity and a feedback path that projects interpretation from a higher level to the level below. Within each level, the feed-forward path and the feedback path intersect in a recurrent gated circuit that integrates their signals as well as the circuit’s internal memory states to generate a prediction of the incoming signals. The network learns by comparing the incoming signals with its prediction, updating its internal model of the world by minimizing the prediction errors at each level of the hierarchy in the style of *predictive self-supervised learning*. The network processes data in blocks of video frames rather than on a frame-by-frame basis. This allows it to learn relationships among movement patterns, yielding state-of-the-art performance in long-range video sequence prediction on benchmark datasets. We observed that hierarchical interaction in the network introduces sensitivity to memories of global movement patterns even in the population representation of the units in the earliest level. Finally, we provided neurophysiological evidence showing that neurons in the early visual cortex of awake monkeys exhibit very similar sensitivity and behaviors. These findings suggest that predictive self-supervised learning might be an important principle for representational learning in the visual cortex. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=BJl_VnR9Km |
https://openreview.net/pdf?id=BJl_VnR9Km | |
PWC | https://paperswithcode.com/paper/a-model-cortical-network-for-spatiotemporal |
Repo | |
Framework | |
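
To make the abstract's "recurrent gated circuit" idea more concrete, here is a minimal, schematic PyTorch sketch of a two-level predictive hierarchy: each level fuses its bottom-up input with top-down feedback in a gated recurrent cell and is trained to predict its next input. This is not the authors' HPNet (which operates on convolutional feature maps and penalizes prediction errors at every level); the use of `GRUCell` on flattened frame features, the two-level depth, and all layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PredictiveLevel(nn.Module):
    """One level of the hierarchy: a gated recurrent circuit that fuses the
    bottom-up signal with top-down feedback and predicts its next input."""
    def __init__(self, in_dim, state_dim):
        super().__init__()
        self.cell = nn.GRUCell(in_dim + state_dim, state_dim)  # recurrent gated circuit
        self.predict = nn.Linear(state_dim, in_dim)             # prediction of the next input
        self.feedforward = nn.Linear(in_dim, state_dim)         # feature passed to the level above

    def forward(self, bottom_up, top_down, state):
        state = self.cell(torch.cat([bottom_up, top_down], dim=-1), state)
        return self.predict(state), self.feedforward(bottom_up), state

class TwoLevelPredictiveNet(nn.Module):
    def __init__(self, frame_dim=256, state_dim=128):
        super().__init__()
        self.state_dim = state_dim
        self.low = PredictiveLevel(frame_dim, state_dim)
        self.high = PredictiveLevel(state_dim, state_dim)

    def forward(self, frames):                      # frames: (T, B, frame_dim) -- one block of frames
        T, B, D = frames.shape
        s_lo = frames.new_zeros(B, self.state_dim)
        s_hi = frames.new_zeros(B, self.state_dim)
        pred = frames.new_zeros(B, D)
        loss = frames.new_zeros(())
        for t in range(T):
            # self-supervised signal: error between the level's prediction and the actual input
            # (the paper minimizes errors at every level; only the lowest level is shown here)
            loss = loss + (pred - frames[t]).pow(2).mean()
            pred, ff, s_lo = self.low(frames[t], s_hi, s_lo)            # feedback = higher-level state
            _, _, s_hi = self.high(ff, torch.zeros_like(s_hi), s_hi)    # top level receives no feedback
        return loss / T
```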
Parsing Meaning Representations: Is Easier Always Better?
Title | Parsing Meaning Representations: Is Easier Always Better? |
Authors | Zi Lin, Nianwen Xue |
Abstract | Parsing accuracy varies a great deal across different meaning representations. In this paper, we compare the parsing performance of Abstract Meaning Representation (AMR) and Minimal Recursion Semantics (MRS), and provide an in-depth analysis of the factors that contribute to the discrepancy in their parsing accuracy. By crystallizing the trade-off between representation expressiveness and ease of automatic parsing, we hope our results can help inform the design of next-generation meaning representations. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3304/ |
https://www.aclweb.org/anthology/W19-3304 | |
PWC | https://paperswithcode.com/paper/parsing-meaning-representations-is-easier |
Repo | |
Framework | |
Adversarial Exploration Strategy for Self-Supervised Imitation Learning
Title | Adversarial Exploration Strategy for Self-Supervised Imitation Learning |
Authors | Zhang-Wei Hong, Tsu-Jui Fu, Tzu-Yun Shann, Yi-Hsiang Chang, Chun-Yi Lee |
Abstract | We present an adversarial exploration strategy, a simple yet effective imitation learning scheme that incentivizes exploration of an environment without any extrinsic reward or human demonstration. Our framework consists of a deep reinforcement learning (DRL) agent and an inverse dynamics model contesting with each other. The former collects training samples for the latter, and its objective is to maximize the error of the latter. The latter is trained with samples collected by the former, and generates rewards for the former when it fails to predict the actual action taken by the former. In such a competitive setting, the DRL agent learns to generate samples that the inverse dynamics model fails to predict correctly, and the inverse dynamics model learns to adapt to the challenging samples. We further propose a reward structure that ensures the DRL agent collects only moderately hard samples, not overly hard ones that prevent the inverse model from imitating effectively. We evaluate the effectiveness of our method on several OpenAI Gym robotic arm and hand manipulation tasks against a number of baseline models. Experimental results show that our method is comparable to an agent trained directly with expert demonstrations, and superior to the other baselines even without any human priors. |
Tasks | Imitation Learning |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=Hyxtso0qtX |
https://openreview.net/pdf?id=Hyxtso0qtX | |
PWC | https://paperswithcode.com/paper/adversarial-exploration-strategy-for-self-1 |
Repo | |
Framework | |
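
The reward shaping is the heart of the scheme above: the agent is rewarded with the inverse dynamics model's prediction error, but not when that error is so large that the transition is useless for imitation. The snippet below is a hedged sketch of that idea, not the authors' exact formulation; the MLP architecture, the squared-error measure, and the `threshold` value are assumptions.

```python
import torch
import torch.nn as nn

class InverseDynamicsModel(nn.Module):
    """Predicts the action that moved the environment from obs to next_obs."""
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs, next_obs):
        return self.net(torch.cat([obs, next_obs], dim=-1))  # predicted action

def exploration_reward(inv_model, obs, act, next_obs, threshold=0.5):
    """Reward for the DRL agent: the inverse model's per-sample prediction error,
    zeroed above `threshold` so only moderately hard samples are encouraged."""
    with torch.no_grad():
        err = (inv_model(obs, next_obs) - act).pow(2).mean(dim=-1)
    return torch.where(err <= threshold, err, torch.zeros_like(err))
```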
Agile Depth Sensing Using Triangulation Light Curtains
Title | Agile Depth Sensing Using Triangulation Light Curtains |
Authors | Joseph R. Bartels, Jian Wang, William “Red” Whittaker, Srinivasa G. Narasimhan |
Abstract | Depth sensors like LIDARs and the Kinect use a fixed depth acquisition strategy that is independent of the scene of interest. Due to the low spatial and temporal resolution of these sensors, this strategy can undersample parts of the scene that are important (small or fast-moving objects) or oversample areas that are not informative for the task at hand (a fixed planar wall). In this paper, we present an approach and system to dynamically and adaptively sample the depths of a scene using the principle of triangulation light curtains. The approach directly detects the presence or absence of objects at specified 3D lines. These 3D lines can be sampled sparsely, non-uniformly, or densely only at specified regions. The depth sampling can be varied in real time, enabling quick object discovery or detailed exploration of areas of interest. These results are achieved using a novel prototype light curtain system based on a 2D rolling-shutter camera, with higher light efficiency, a larger working range, and faster adaptation than previous work, making it broadly useful for autonomous navigation and exploration. |
Tasks | Autonomous Navigation |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Bartels_Agile_Depth_Sensing_Using_Triangulation_Light_Curtains_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Bartels_Agile_Depth_Sensing_Using_Triangulation_Light_Curtains_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/agile-depth-sensing-using-triangulation-light |
Repo | |
Framework | |
CGNF: Conditional Graph Neural Fields
Title | CGNF: Conditional Graph Neural Fields |
Authors | Tengfei Ma, Cao Xiao, Junyuan Shang, Jimeng Sun |
Abstract | Graph convolutional networks have achieved tremendous success in the task of graph node classification. These models learn better node representations by encoding the graph structure and node features. However, the correlations between node labels are not considered. In this paper, we propose a novel architecture for graph node classification, named conditional graph neural fields (CGNF). By integrating conditional random fields (CRFs) into graph convolutional networks, we explicitly model a joint probability over the entire set of node labels, thus taking advantage of neighborhood label information in the node label prediction task. Our model combines the representation capacity of graph neural networks with the prediction power of CRFs. Experiments on several graph datasets demonstrate the effectiveness of CGNF. |
Tasks | Node Classification |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=ryxMX2R9YQ |
https://openreview.net/pdf?id=ryxMX2R9YQ | |
PWC | https://paperswithcode.com/paper/cgnf-conditional-graph-neural-fields |
Repo | |
Framework | |
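
As a rough illustration of how a CRF-style joint objective can sit on top of GCN outputs, the sketch below scores a full label assignment with GCN unary scores plus a learned label-compatibility term over edges. The two-layer GCN parameterization, the compatibility matrix, and this particular energy form are assumptions for illustration; the paper's exact potentials and inference procedure may differ.

```python
import torch
import torch.nn as nn

class CGNFEnergy(nn.Module):
    def __init__(self, in_dim, hidden, num_labels):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hidden)
        self.w2 = nn.Linear(hidden, num_labels)
        self.compat = nn.Parameter(torch.zeros(num_labels, num_labels))  # pairwise label potentials

    def unary(self, adj_norm, x):
        # plain 2-layer GCN: scores = A_norm @ relu(A_norm @ X W1) W2
        h = torch.relu(adj_norm @ self.w1(x))
        return adj_norm @ self.w2(h)                       # (N, num_labels) unary scores

    def energy(self, adj_norm, x, labels, edges):
        """Energy of a full label assignment (lower is better).
        labels: (N,) long tensor; edges: (E, 2) index pairs of connected nodes."""
        u = self.unary(adj_norm, x)
        unary_term = u[torch.arange(len(labels)), labels].sum()
        pair_term = self.compat[labels[edges[:, 0]], labels[edges[:, 1]]].sum()
        return -(unary_term + pair_term)
```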
Rethinking learning rate schedules for stochastic optimization
Title | Rethinking learning rate schedules for stochastic optimization |
Authors | Rong Ge, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli |
Abstract | There is a stark disparity between the learning rate schedules used in the practice of large-scale machine learning and the learning rate schedules considered admissible in the theory of stochastic approximation. Recent results, such as the ‘super-convergence’ methods which use oscillating learning rates, serve to emphasize this point even more. One plausible explanation is that non-convex neural network training procedures are better suited to fundamentally different learning rate schedules, such as the “cut the learning rate every constant number of epochs” method (which more closely resembles an exponentially decaying learning rate schedule); note that this widely used schedule stands in stark contrast to the polynomial decay schemes prescribed in the stochastic approximation literature, which are indeed shown to be (worst-case) optimal for classes of convex optimization problems. The main contribution of this work is to show that the picture is far more nuanced: we do not even need to move to non-convex optimization to show that other learning rate schemes can be far more effective. In fact, even for the simple case of stochastic linear regression with a fixed time horizon, the rate achieved by any polynomial decay scheme is sub-optimal compared to the statistical minimax rate (by a factor of the condition number); in contrast, the “cut the learning rate every constant number of epochs” schedule provides an exponential improvement (depending only logarithmically on the condition number) over any polynomial decay scheme. Finally, it is important to ask whether our theoretical insights are somehow fundamentally tied to quadratic loss minimization (where we have circumvented minimax lower bounds for more general convex optimization problems). Here, we conjecture that recent results which make the gradient norm small at a near-optimal rate, for both convex and non-convex optimization, may also provide more insight into the learning rate schedules used in practice. |
Tasks | Stochastic Optimization |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=HJePy3RcF7 |
https://openreview.net/pdf?id=HJePy3RcF7 | |
PWC | https://paperswithcode.com/paper/rethinking-learning-rate-schedules-for |
Repo | |
Framework | |
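
For readers unfamiliar with the two schedule families being contrasted, here is a small self-contained comparison of a polynomial-decay schedule and the "cut the learning rate every constant number of epochs" schedule, which decays geometrically. The constants (initial rate, decay power, cut factor, epochs per cut) are arbitrary illustrative choices, not values taken from the paper.

```python
def polynomial_decay(lr0, t, power=0.5):
    """Stochastic-approximation style schedule: lr_t = lr0 / (1 + t)^power."""
    return lr0 / (1.0 + t) ** power

def step_decay(lr0, t, epochs_per_cut=30, cut_factor=0.1):
    """'Cut the learning rate every constant number of epochs' --
    effectively an exponentially decaying schedule."""
    return lr0 * cut_factor ** (t // epochs_per_cut)

if __name__ == "__main__":
    for epoch in (0, 10, 30, 60, 90):
        print(epoch, polynomial_decay(0.1, epoch), step_decay(0.1, epoch))
```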
Through-Wall Object Recognition and Pose Estimation
Title | Through-Wall Object Recognition and Pose Estimation |
Authors | Ruoyu Wang, Siyuan Xiang, Chen Feng, Pu Wang, Semiha Ergan, Yi Fang |
Abstract | Robots need to perceive beyond their line of sight, e.g., to avoid cutting water pipes or electric wires when drilling holes in a wall. Recent off-the-shelf radio frequency (RF) imaging sensors ease the process of 3D sensing inside or through walls. Yet unlike optical images, RF images are difficult for a human to understand. Meanwhile, in practice, RF components are often subject to hardware imperfections, resulting in distorted RF images whose quality can be far from the claimed specifications. Thus, we introduce several challenging geometric and semantic perception tasks on such signals, including object and material recognition, fine-grained property classification, and pose estimation. Since detailed forward modeling of such sensors is sometimes difficult, due to hidden or inaccessible system parameters, onboard processing procedures, and limited access to the raw RF waveform, we tackle the above tasks with supervised machine learning. We collected a large dataset of RF images of utility objects through a mock wall as the input of our algorithm, with the corresponding optical images taken simultaneously from the other side of the wall as ground truth. We designed three learning algorithms based on nearest neighbors or neural networks and report their performance on the dataset. Our experiments show reasonable results for the semantic perception tasks yet unsatisfactory results for the geometric ones, calling for more effort in this research direction. |
Tasks | Material Recognition, Object Recognition, Pose Estimation, RF-based Pose Estimation |
Published | 2019-05-21 |
URL | http://doi.org/10.22260/ISARC2019/0157 |
https://www.iaarc.org/publications/fulltext/ISARC_2019_Paper_231.pdf | |
PWC | https://paperswithcode.com/paper/through-wall-object-recognition-and-pose |
Repo | |
Framework | |
Graph Based Skeleton Modeling for Human Activity Analysis
Title | Graph Based Skeleton Modeling for Human Activity Analysis |
Authors | Jiun-Yu Kao, Antonio Ortega, Dong Tian, Hassan Mansour, Anthony Vetro |
Abstract | Understanding human activity based on sensor information is required in many applications and has been an active research area. With the advancement of depth sensors and tracking algorithms, systems for human motion activity analysis can be built by combining off-the-shelf motion tracking systems with application-dependent learning tools to extract higher semantic level information. Many of these motion tracking systems provide raw motion data registered to the skeletal joints in the human body. In this paper, we propose novel representations for human motion data using the skeleton-based graph structure along with techniques in graph signal processing. Methods for graph construction and their corresponding basis functions are discussed. The proposed representations can achieve comparable classification performance in action recognition tasks while additionally being more robust to noise and missing data. |
Tasks | graph construction, Skeleton Based Action Recognition |
Published | 2019-08-26 |
URL | https://doi.org/10.1109/ICIP.2019.8803186 |
http://www.merl.com/publications/docs/TR2019-037.pdf | |
PWC | https://paperswithcode.com/paper/graph-based-skeleton-modeling-for-human |
Repo | |
Framework | |
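
The core idea above, treating per-joint measurements as a signal on the skeleton graph and analyzing it with graph signal processing tools, can be sketched in a few lines of NumPy. The 5-joint toy skeleton and the choice of the combinatorial Laplacian are illustrative assumptions; the paper studies several graph constructions and corresponding basis functions.

```python
import numpy as np

# toy skeleton: 0 = hip, 1 = spine, 2 = head, 3 = left hand, 4 = right hand
edges = [(0, 1), (1, 2), (1, 3), (1, 4)]
n = 5
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0          # bones as undirected graph edges

L = np.diag(A.sum(axis=1)) - A       # combinatorial graph Laplacian
eigvals, U = np.linalg.eigh(L)       # columns of U form the graph Fourier basis

coords = np.random.rand(n, 3)        # one frame of 3D joint positions (a graph signal)
spectral = U.T @ coords              # graph Fourier transform of the skeleton signal
print(spectral.shape)                # (5, 3): one spectral coefficient per basis vector per axis
```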
Sampling Matters! An Empirical Study of Negative Sampling Strategies for Learning of Matching Models in Retrieval-based Dialogue Systems
Title | Sampling Matters! An Empirical Study of Negative Sampling Strategies for Learning of Matching Models in Retrieval-based Dialogue Systems |
Authors | Jia Li, Chongyang Tao, Wei Wu, Yansong Feng, Dongyan Zhao, Rui Yan |
Abstract | We study how to sample negative examples to automatically construct a training set for effective model learning in retrieval-based dialogue systems. Following the idea of dynamically adapting negative examples to the matching model during learning, we consider four strategies: minimum sampling, maximum sampling, semi-hard sampling, and decay-hard sampling. Empirical studies on two benchmarks with three matching models indicate that, while the first two strategies lead to a performance drop compared with the widely used random sampling strategy, the latter two bring consistent improvements to all models on both benchmarks. |
Tasks | Conversational Response Selection |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1128/ |
https://www.aclweb.org/anthology/D19-1128 | |
PWC | https://paperswithcode.com/paper/sampling-matters-an-empirical-study-of |
Repo | |
Framework | |
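
To ground the terminology, the sketch below shows one plausible reading of the minimum, maximum, and semi-hard strategies given a matching model's scores for a pool of candidate negatives. It does not reproduce the paper's exact definitions or the decay-hard variant; the `margin` value and the fallback behavior are assumptions.

```python
import torch

def sample_negative(neg_scores, pos_score, strategy="semi_hard", margin=0.1):
    """neg_scores: (N,) matching scores of candidate negatives for one context.
    Returns the index of the selected negative."""
    if strategy == "min":            # easiest negative: lowest matching score
        return int(neg_scores.argmin())
    if strategy == "max":            # hardest negative: highest matching score
        return int(neg_scores.argmax())
    # semi-hard: hardest negative that still scores below the positive by a margin
    mask = neg_scores < (pos_score - margin)
    if mask.any():
        candidates = neg_scores.masked_fill(~mask, float("-inf"))
        return int(candidates.argmax())
    return int(neg_scores.argmin())  # fall back to the easiest negative

idx = sample_negative(torch.tensor([0.2, 0.7, 0.55]), pos_score=0.8)
```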
A study of semantic augmentation of word embeddings for extractive summarization
Title | A study of semantic augmentation of word embeddings for extractive summarization |
Authors | Nikiforos Pittaras, Vangelis Karkaletsis |
Abstract | In this study we examine the effect of semantic augmentation approaches on extractive text summarization. WordNet hypernym relations are used to extract term-frequency concept information, which is subsequently concatenated to sentence-level representations produced by aggregating deep neural word embeddings. Multiple dimensionality reduction techniques and combination strategies are examined via feature transformation and clustering methods. An experimental evaluation on the MultiLing 2015 MSS dataset illustrates that semantic information can benefit the extractive summarization process in terms of F1, ROUGE-1 and ROUGE-2 scores, with LSA-based post-processing introducing the largest improvements. |
Tasks | Dimensionality Reduction, Text Summarization, Word Embeddings |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/W19-8909/ |
https://www.aclweb.org/anthology/W19-8909 | |
PWC | https://paperswithcode.com/paper/a-study-of-semantic-augmentation-of-word |
Repo | |
Framework | |
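
A rough sketch of the augmentation step described above: collect WordNet hypernym concepts for a sentence's tokens, turn their frequencies into a vector, and concatenate it to a dense sentence embedding. The averaging-based embedding, the fixed concept vocabulary, and the omission of the paper's dimensionality-reduction and clustering steps are simplifying assumptions.

```python
from collections import Counter

import numpy as np
from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')

def hypernym_counts(tokens):
    """Term frequencies of the hypernym synsets reachable from the tokens."""
    counts = Counter()
    for tok in tokens:
        for syn in wn.synsets(tok):
            for hyp in syn.hypernyms():
                counts[hyp.name()] += 1
    return counts

def augmented_sentence_vector(tokens, word_vectors, concept_vocab):
    """Concatenate an averaged word-embedding vector with a term-frequency
    vector over a fixed hypernym-concept vocabulary.
    (Assumes every token has an entry in word_vectors, for brevity.)"""
    dense = np.mean([word_vectors[t] for t in tokens if t in word_vectors], axis=0)
    counts = hypernym_counts(tokens)
    concept_vec = np.array([counts.get(c, 0.0) for c in concept_vocab], dtype=float)
    return np.concatenate([dense, concept_vec])
```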
CVIT’s submissions to WAT-2019
Title | CVIT’s submissions to WAT-2019 |
Authors | Jerin Philip, Shashank Siripragada, Upendra Kumar, Vinay Namboodiri, C V Jawahar |
Abstract | This paper describes the Neural Machine Translation systems used by IIIT Hyderabad (CVIT-MT) for the translation tasks of WAT-2019. We participated in tasks pertaining to Indian languages and submitted results for the English-Hindi, Hindi-English, English-Tamil and Tamil-English language pairs. We employ the Transformer architecture, experimenting with multilingual models and methods for low-resource languages. |
Tasks | Machine Translation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5215/ |
https://www.aclweb.org/anthology/D19-5215 | |
PWC | https://paperswithcode.com/paper/cvits-submissions-to-wat-2019 |
Repo | |
Framework | |
SVD: A Large-Scale Short Video Dataset for Near-Duplicate Video Retrieval
Title | SVD: A Large-Scale Short Video Dataset for Near-Duplicate Video Retrieval |
Authors | Qing-Yuan Jiang, Yi He, Gen Li, Jian Lin, Lei Li, Wu-Jun Li |
Abstract | With the explosive growth of video data in real applications, near-duplicate video retrieval (NDVR) has become indispensable and challenging, especially for short videos. However, all existing NDVR datasets were introduced for long videos. Furthermore, most of them are small-scale and lack diversity due to the high cost of collecting and labeling near-duplicate videos. In this paper, we introduce a large-scale short video dataset, called SVD, for the NDVR task. SVD contains over 500,000 short videos and over 30,000 labeled near-duplicate videos. We use multiple video mining techniques to construct positive/negative pairs. Furthermore, we design temporal and spatial transformations that mimic user-attack behavior in real applications to construct more difficult variants of SVD. Experiments show that existing state-of-the-art NDVR methods, including real-value-based and hashing-based methods, fail to achieve satisfactory performance on this challenging dataset. The release of the SVD dataset will foster research and system engineering in the NDVR area. The SVD dataset is available at https://svdbase.github.io. |
Tasks | Video Retrieval |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Jiang_SVD_A_Large-Scale_Short_Video_Dataset_for_Near-Duplicate_Video_Retrieval_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Jiang_SVD_A_Large-Scale_Short_Video_Dataset_for_Near-Duplicate_Video_Retrieval_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/svd-a-large-scale-short-video-dataset-for |
Repo | |
Framework | |
Deep Incremental Hashing Network for Efficient Image Retrieval
Title | Deep Incremental Hashing Network for Efficient Image Retrieval |
Authors | Dayan Wu, Qi Dai, Jing Liu, Bo Li, Weiping Wang |
Abstract | Hashing has shown great potential in large-scale image retrieval due to its storage and computation efficiency, especially with the recent deep supervised hashing methods. To achieve promising performance, deep supervised hashing methods require a large amount of training data from different classes. However, when images of new categories emerge, existing deep hashing methods have to retrain the CNN model and regenerate hash codes for all the database images, which is impractical for a large-scale retrieval system. In this paper, we propose a novel deep hashing framework, called Deep Incremental Hashing Network (DIHN), for learning hash codes in an incremental manner. DIHN learns hash codes for newly arriving images directly, while keeping the old ones unchanged. Simultaneously, a deep hash function for the query set is learned by preserving the similarities between training points. Extensive experiments on two widely used image retrieval benchmarks demonstrate that the proposed DIHN framework significantly decreases training time while keeping state-of-the-art retrieval accuracy. |
Tasks | Image Retrieval |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Wu_Deep_Incremental_Hashing_Network_for_Efficient_Image_Retrieval_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Wu_Deep_Incremental_Hashing_Network_for_Efficient_Image_Retrieval_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/deep-incremental-hashing-network-for |
Repo | |
Framework | |
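
A schematic sketch of the incremental setup described above, not the DIHN objective itself: hash codes of previously indexed images stay frozen while the hash network (and the codes for new-category images) are trained to preserve pairwise similarity against both the old and the new codes. All shapes, the tanh relaxation, and the inner-product loss are illustrative assumptions.

```python
import torch
import torch.nn as nn

def incremental_hash_loss(hash_net: nn.Module, new_images,
                          new_sim_old, new_sim_new, old_codes, new_codes):
    """old_codes: (N_old, K) fixed in {-1, +1}; new_codes: (N_new, K) learnable.
    new_sim_old / new_sim_new: {0, 1} similarity labels between the batch of
    new images and the old / new database items, shapes (B, N_old) / (B, N_new)."""
    u = torch.tanh(hash_net(new_images))                 # relaxed codes for the batch, (B, K)
    k = old_codes.shape[1]
    # normalized inner products should match +1 for similar pairs, -1 for dissimilar ones;
    # old_codes carries no gradient, so previously generated codes stay unchanged
    loss_old = ((u @ old_codes.t()) / k - (2 * new_sim_old - 1)).pow(2).mean()
    loss_new = ((u @ new_codes.t()) / k - (2 * new_sim_new - 1)).pow(2).mean()
    return loss_old + loss_new
```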
Modeling the Relationship between User Comments and Edits in Document Revision
Title | Modeling the Relationship between User Comments and Edits in Document Revision |
Authors | Xuchao Zhang, Dheeraj Rajagopal, Michael Gamon, Sujay Kumar Jauhar, ChangTien Lu |
Abstract | Management of collaborative documents can be difficult, given the profusion of edits and comments that multiple authors make during a document’s evolution. Reliably modeling the relationship between edits and comments is a crucial step towards helping the user keep track of a document in flux. A number of authoring tasks, such as categorizing and summarizing edits, detecting completed to-dos, and visually rearranging comments could benefit from such a contribution. Thus, in this paper we explore the relationship between comments and edits by defining two novel, related tasks: Comment Ranking and Edit Anchoring. We begin by collecting a dataset with more than half a million comment-edit pairs based on Wikipedia revision histories. We then propose a hierarchical multi-layer deep neural network to model the relationship between edits and comments. Our architecture tackles both Comment Ranking and Edit Anchoring tasks by encoding specific edit actions such as additions and deletions, while also accounting for document context. In a number of evaluation settings, our experimental results show that our approach outperforms several strong baselines significantly. We are able to achieve a precision@1 of 71.0% and a precision@3 of 94.4% for Comment Ranking, while we achieve 74.4% accuracy on Edit Anchoring. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1505/ |
https://www.aclweb.org/anthology/D19-1505 | |
PWC | https://paperswithcode.com/paper/modeling-the-relationship-between-user |
Repo | |
Framework | |
Can Modern Standard Arabic Approaches be used for Arabic Dialects? Sentiment Analysis as a Case Study
Title | Can Modern Standard Arabic Approaches be used for Arabic Dialects? Sentiment Analysis as a Case Study |
Authors | Chatrine Qwaider, Stergios Chatzikyriakidis, Simon Dobnik |
Abstract | |
Tasks | Sentiment Analysis |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/W19-5606/ |
https://www.aclweb.org/anthology/W19-5606 | |
PWC | https://paperswithcode.com/paper/can-modern-standard-arabic-approaches-be-used |
Repo | |
Framework | |