January 25, 2020

3086 words 15 mins read

Paper Group ANR 1776

In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction from 2D Landmarks

Title In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction from 2D Landmarks
Authors Heng Yang, Luca Carlone
Abstract We study the problem of 3D shape reconstruction from 2D landmarks extracted in a single image. We adopt the 3D deformable shape model and formulate the reconstruction as a joint optimization of the camera pose and the linear shape parameters. Our first contribution is to apply Lasserre’s hierarchy of convex Sums-of-Squares (SOS) relaxations to solve the shape reconstruction problem and show that the SOS relaxation of minimum order 2 empirically solves the original non-convex problem exactly. Our second contribution is to exploit the structure of the polynomial in the objective function and find a reduced set of basis monomials for the SOS relaxation that significantly decreases the size of the resulting semidefinite program (SDP) without compromising its accuracy. These two contributions, to the best of our knowledge, lead to the first certifiably optimal solver for 3D shape reconstruction, that we name Shape*. Our third contribution is to add an outlier rejection layer to Shape* using a truncated least squares (TLS) robust cost function and leveraging graduated non-convexity to solve TLS without initialization. The result is a robust reconstruction algorithm, named Shape#, that tolerates a large amount of outlier measurements. We evaluate the performance of Shape* and Shape# in both simulated and real experiments, showing that Shape* outperforms local optimization and previous convex relaxation techniques, while Shape# achieves state-of-the-art performance and is robust against 70% outliers in the FG3DCar dataset.
Tasks
Published 2019-11-27
URL https://arxiv.org/abs/1911.11924v2
PDF https://arxiv.org/pdf/1911.11924v2.pdf
PWC https://paperswithcode.com/paper/in-perfect-shape-certifiably-optimal-3d-shape
Repo
Framework
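
As a concrete illustration of the outlier-rejection idea behind Shape#, here is a minimal sketch of graduated non-convexity (GNC) applied to a truncated least squares (TLS) cost on a toy robust line-fitting problem rather than the paper's shape model; the function name, thresholds, and schedule below are illustrative, not the authors' code.

```python
import numpy as np

def gnc_tls_line_fit(x, y, c=0.1, mu_factor=1.4, iters=50):
    """Fit y ~ a*x + b under a truncated least squares (TLS) cost, solved with
    graduated non-convexity (GNC): alternate a weighted least-squares fit with
    a closed-form weight update while the surrogate is tightened toward TLS."""
    A = np.stack([x, np.ones_like(x)], axis=1)
    theta = np.linalg.lstsq(A, y, rcond=None)[0]        # ordinary LS warm start
    r2 = (A @ theta - y) ** 2
    mu = c**2 / max(2 * r2.max() - c**2, 1e-9)          # start near-convex
    w = np.ones_like(y)
    for _ in range(iters):
        sw = np.sqrt(w)
        theta = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
        r2 = (A @ theta - y) ** 2
        # closed-form weight update for the TLS surrogate
        w = np.clip(c * np.sqrt(mu * (mu + 1)) / np.sqrt(r2 + 1e-12) - mu, 0.0, 1.0)
        w[r2 <= mu / (mu + 1) * c**2] = 1.0             # confident inliers
        w[r2 >= (mu + 1) / mu * c**2] = 0.0             # confident outliers
        mu *= mu_factor                                  # tighten the surrogate
    return theta, w

# toy usage: 70% of the points are outliers
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 2.0 * x + 0.5 + 0.01 * rng.normal(size=200)
y[:140] = rng.uniform(-5, 5, 140)
theta, w = gnc_tls_line_fit(x, y)
print(theta)   # should land near [2.0, 0.5] when GNC converges to the inlier set
```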

A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding

Title A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding
Authors Libo Qin, Wanxiang Che, Yangming Li, Haoyang Wen, Ting Liu
Abstract Intent detection and slot filling are the two main tasks in building a spoken language understanding (SLU) system. The two tasks are closely tied, and the slots often depend heavily on the intent. In this paper, we propose a novel framework for SLU that better incorporates the intent information, which in turn guides slot filling. In our framework, we adopt a joint model with Stack-Propagation that can directly use the intent information as input for slot filling, thus capturing intent semantic knowledge. In addition, to further alleviate error propagation, we perform token-level intent detection within the Stack-Propagation framework. Experiments on two public datasets show that our model achieves state-of-the-art performance and outperforms previous methods by a large margin. Finally, we use the Bidirectional Encoder Representations from Transformers (BERT) model in our framework, which further boosts performance on the SLU task.
Tasks Intent Detection, Slot Filling, Spoken Language Understanding
Published 2019-09-05
URL https://arxiv.org/abs/1909.02188v1
PDF https://arxiv.org/pdf/1909.02188v1.pdf
PWC https://paperswithcode.com/paper/a-stack-propagation-framework-with-token
Repo
Framework
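
A small sketch of the two ideas the abstract highlights: token-level intent detection (each token votes and the utterance label is the majority) and Stack-Propagation (the intent prediction is fed into the slot-filling decoder). Random arrays stand in for the paper's encoder, and all names are illustrative.

```python
import numpy as np

def token_level_intent(intent_logits):
    """Token-level intent detection: every token votes for an intent and the
    utterance label is the majority vote, which softens single-token errors."""
    per_token = intent_logits.argmax(axis=-1)                 # (seq_len,)
    votes = np.bincount(per_token, minlength=intent_logits.shape[-1])
    return per_token, votes.argmax()

def stack_propagation_slot_inputs(token_feats, per_token_intent, intent_embed):
    """Stack-Propagation: the token-level intent prediction is fed directly
    into the slot-filling decoder by concatenating an intent embedding to
    each token representation."""
    intent_vecs = intent_embed[per_token_intent]               # (seq_len, d_intent)
    return np.concatenate([token_feats, intent_vecs], axis=-1)

# toy usage with random tensors standing in for encoder outputs
rng = np.random.default_rng(0)
seq_len, d_model, n_intents, d_intent = 6, 8, 3, 4
intent_logits = rng.normal(size=(seq_len, n_intents))
token_feats = rng.normal(size=(seq_len, d_model))
intent_embed = rng.normal(size=(n_intents, d_intent))

per_token, utterance_intent = token_level_intent(intent_logits)
slot_inputs = stack_propagation_slot_inputs(token_feats, per_token, intent_embed)
print(utterance_intent, slot_inputs.shape)   # e.g. 1 (6, 12)
```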

Job Prediction: From Deep Neural Network Models to Applications

Title Job Prediction: From Deep Neural Network Models to Applications
Authors Tin Van Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen
Abstract Determining which job suits a student or job seeker from job descriptions, such as the required knowledge and skills, is difficult, just as employers must find ways to choose candidates that match the jobs they offer. In this paper, we focus on studying job prediction using different deep neural network models, including TextCNN, Bi-GRU-LSTM-CNN, and Bi-GRU-CNN, with various pre-trained word embeddings on the IT Job dataset. In addition, we also propose a simple and effective ensemble model combining different deep neural network models. The experimental results show that our proposed ensemble model achieves the highest result, with an F1 score of 72.71%. Moreover, we analyze these experimental results to gain insights into this problem and find better solutions in the future.
Tasks Word Embeddings
Published 2019-12-27
URL https://arxiv.org/abs/1912.12214v2
PDF https://arxiv.org/pdf/1912.12214v2.pdf
PWC https://paperswithcode.com/paper/job-prediction-from-deep-neural-network
Repo
Framework
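
The abstract does not spell out how the ensemble combines the individual networks; a common choice, shown below as an assumption, is weighted soft voting over the class probabilities produced by each model.

```python
import numpy as np

def soft_vote(prob_list, weights=None):
    """Combine per-model class probabilities by a (weighted) average and
    predict the class with the highest combined probability."""
    probs = np.stack(prob_list, axis=0)            # (n_models, n_samples, n_classes)
    if weights is None:
        weights = np.ones(len(prob_list)) / len(prob_list)
    combined = np.tensordot(weights, probs, axes=1)
    return combined.argmax(axis=-1), combined

# toy usage: three hypothetical models (e.g. TextCNN, Bi-GRU-LSTM-CNN, Bi-GRU-CNN)
rng = np.random.default_rng(0)
p1, p2, p3 = (rng.dirichlet(np.ones(5), size=4) for _ in range(3))
labels, _ = soft_vote([p1, p2, p3])
print(labels)
```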

Novelty Detection Via Blurring

Title Novelty Detection Via Blurring
Authors Sungik Choi, Sae-Young Chung
Abstract Conventional out-of-distribution (OOD) detection schemes based on variational autoencoders or Random Network Distillation (RND) have been observed to assign lower uncertainty to OOD data than to the target distribution. In this work, we discover that such conventional novelty detection schemes are also vulnerable to blurred images. Based on this observation, we construct a novel RND-based OOD detector, SVD-RND, that utilizes blurred images during training. Our detector is simple, efficient at test time, and outperforms baseline OOD detectors in various domains. Further results show that SVD-RND learns a better representation of the target distribution than the baseline RND algorithm. Finally, SVD-RND combined with geometric transforms achieves near-perfect detection accuracy on the CelebA dataset.
Tasks
Published 2019-11-27
URL https://arxiv.org/abs/1911.11943v3
PDF https://arxiv.org/pdf/1911.11943v3.pdf
PWC https://paperswithcode.com/paper/novelty-detection-via-blurring-1
Repo
Framework
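
SVD-RND trains on blurred versions of the in-distribution images; one natural reading of the "SVD" in its name, used as an assumption here, is blurring by truncating the singular value decomposition of each image, as sketched below for a single grayscale channel.

```python
import numpy as np

def svd_blur(img, keep):
    """Blur a grayscale image by keeping only the top-`keep` singular values
    of its SVD, discarding high-frequency detail."""
    U, s, Vt = np.linalg.svd(img.astype(float), full_matrices=False)
    s[keep:] = 0.0
    return U @ np.diag(s) @ Vt

# toy usage: blur a random "image"; real training would blur dataset images
rng = np.random.default_rng(0)
img = rng.uniform(0, 1, (32, 32))
blurred = svd_blur(img, keep=4)
print(np.linalg.norm(img - blurred))   # nonzero: detail has been removed
```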

Measurement and Fairness

Title Measurement and Fairness
Authors Abigail Z. Jacobs, Hanna Wallach
Abstract We introduce the language of measurement modeling from the quantitative social sciences as a framework for understanding fairness in computational systems. Computational systems often involve unobservable theoretical constructs, such as “creditworthiness,” “teacher quality,” or “risk to society,” that cannot be measured directly and must instead be inferred from observable properties thought to be related to them—i.e., operationalized via a measurement model. This process introduces the potential for mismatch between the theoretical understanding of the construct purported to be measured and its operationalization. Indeed, we argue that many of the harms discussed in the literature on fairness in computational systems are direct results of such mismatches. Further complicating these discussions is the fact that fairness itself is an unobservable theoretical construct. Moreover, it is an essentially contested construct—i.e., it has many different theoretical understandings depending on the context. We argue that this contestedness underlies recent debates about fairness definitions: disagreements that appear to be about contradictory operationalizations are, in fact, disagreements about different theoretical understandings of the construct itself. By introducing the language of measurement modeling, we provide the computer science community with a process for making explicit and testing assumptions about unobservable theoretical constructs, thereby making it easier to identify, characterize, and even mitigate fairness-related harms.
Tasks
Published 2019-12-11
URL https://arxiv.org/abs/1912.05511v1
PDF https://arxiv.org/pdf/1912.05511v1.pdf
PWC https://paperswithcode.com/paper/measurement-and-fairness
Repo
Framework

Flow-Distilled IP Two-Stream Networks for Compressed Video Action Recognition

Title Flow-Distilled IP Two-Stream Networks for Compressed Video Action Recognition
Authors Shiyuan Huang, Xudong Lin, Svebor Karaman, Shih-Fu Chang
Abstract Two-stream networks have achieved great success in video recognition. A two-stream network combines a spatial stream of RGB frames and a temporal stream of optical flow to make predictions. However, the temporal redundancy of RGB frames and the high cost of optical flow computation create challenges for both performance and efficiency. Recent works instead use modern compressed-video modalities as an alternative to the RGB spatial stream and improve inference speed by orders of magnitude. Previous works create one stream per modality and combine them with an additional temporal stream through late fusion. This is redundant, since some modalities, such as motion vectors, already contain temporal information. Based on this observation, we propose a compressed-domain two-stream network, IP TSN, for compressed video recognition, where the two streams are represented by the two types of frames (I and P frames) in compressed videos, without needing a separate temporal stream. To this end, we propose to fully exploit the motion information of the P-stream through generalized distillation from optical flow, which largely improves efficiency and accuracy. Our P-stream runs 60 times faster than using optical flow while achieving higher accuracy. Our full IP TSN, evaluated on public action recognition benchmarks (UCF101, HMDB51 and a subset of Kinetics), outperforms other compressed-domain methods by large margins while improving the total inference speed by 20%.
Tasks Optical Flow Estimation, Temporal Action Localization, Video Recognition
Published 2019-12-10
URL https://arxiv.org/abs/1912.04462v2
PDF https://arxiv.org/pdf/1912.04462v2.pdf
PWC https://paperswithcode.com/paper/flow-distilled-ip-two-stream-networks-for
Repo
Framework
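
The abstract describes generalized distillation from an optical-flow teacher into the P-stream but not its exact objective; the sketch below assumes a simple combination of feature mimicry and cross-entropy, which is one standard way to set up such a distillation loss, and is not the authors' implementation.

```python
import numpy as np

def distillation_loss(student_feat, teacher_feat, student_logits, labels, alpha=0.5):
    """Generalized-distillation-style objective: match the P-stream (student)
    features to the optical-flow (teacher) features while also fitting the
    action labels with a cross-entropy term."""
    match = np.mean((student_feat - teacher_feat) ** 2)          # feature mimicry
    log_probs = student_logits - np.log(np.exp(student_logits).sum(-1, keepdims=True))
    ce = -np.mean(log_probs[np.arange(len(labels)), labels])     # classification
    return alpha * match + (1 - alpha) * ce

# toy usage with random features standing in for network outputs
rng = np.random.default_rng(0)
s_feat, t_feat = rng.normal(size=(8, 128)), rng.normal(size=(8, 128))
logits, labels = rng.normal(size=(8, 10)), rng.integers(0, 10, size=8)
print(distillation_loss(s_feat, t_feat, logits, labels))
```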

U-net super-neural segmentation and similarity calculation to realize vegetation change assessment in satellite imagery

Title U-net super-neural segmentation and similarity calculation to realize vegetation change assessment in satellite imagery
Authors Chunxue Wu, Bobo Ju, Naixue Xiong, Guisong Yang, Yan Wu, Hongming Yang, Jiaying Huang, Zhiyong Xu
Abstract Vegetation is the natural linkage connecting soil, atmosphere and water. It can represent the change of land cover to a certain extent and serve as an indicator for global change research. Methods for measuring vegetation coverage fall into two types: surface measurement and remote sensing. Because vegetation cover has significant spatial and temporal differentiation, remote sensing has become an important technical means of estimating vegetation coverage. This paper first trains a U-net for semantic segmentation of remote sensing imagery, then applies an integral progressive method to the segmentation results to calculate the forestland change rate, and finally realizes automated estimation of woodland change.
Tasks Semantic Segmentation
Published 2019-09-10
URL https://arxiv.org/abs/1909.04410v1
PDF https://arxiv.org/pdf/1909.04410v1.pdf
PWC https://paperswithcode.com/paper/u-net-super-neural-segmentation-and
Repo
Framework
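
The abstract's "integral progressive method" for the change rate is not detailed; a minimal stand-in, assuming binary vegetation masks for the same area at two dates, is plain pixel accounting as below.

```python
import numpy as np

def vegetation_change_rate(mask_t0, mask_t1):
    """Given two binary segmentation masks (1 = forest/vegetation) of the same
    area at two dates, report the pixels lost and gained and the net change
    rate relative to the earlier coverage."""
    area0, area1 = mask_t0.sum(), mask_t1.sum()
    lost = np.logical_and(mask_t0 == 1, mask_t1 == 0).sum()
    gained = np.logical_and(mask_t0 == 0, mask_t1 == 1).sum()
    net_rate = (area1 - area0) / max(area0, 1)
    return {"lost_px": int(lost), "gained_px": int(gained), "net_rate": float(net_rate)}

# toy usage: a 4x4 patch losing one vegetated pixel between the two dates
m0 = np.array([[1, 1, 0, 0]] * 4)
m1 = m0.copy(); m1[0, 0] = 0
print(vegetation_change_rate(m0, m1))   # {'lost_px': 1, 'gained_px': 0, 'net_rate': -0.125}
```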

Making CNNs for Video Parsing Accessible

Title Making CNNs for Video Parsing Accessible
Authors Zijin Luo, Matthew Guzdial, Mark Riedl
Abstract The ability to extract sequences of game events for high-resolution e-sport games has traditionally required access to the game’s engine. This serves as a barrier to groups who don’t possess this access. It is possible to apply deep learning to derive these logs from gameplay video, but it requires computational power that serves as an additional barrier. These groups would benefit from access to these logs, such as small e-sport tournament organizers who could better visualize gameplay to inform both audience and commentators. In this paper we present a combined solution to reduce the required computational resources and time to apply a convolutional neural network (CNN) to extract events from e-sport gameplay videos. This solution consists of techniques to train a CNN faster and methods to execute predictions more quickly. This expands the types of machines capable of training and running these models, which in turn extends access to extracting game logs with this approach. We evaluate the approaches in the domain of DOTA2, one of the most popular e-sports. Our results demonstrate our approach outperforms standard backpropagation baselines.
Tasks
Published 2019-06-10
URL https://arxiv.org/abs/1906.11877v1
PDF https://arxiv.org/pdf/1906.11877v1.pdf
PWC https://paperswithcode.com/paper/making-cnns-for-video-parsing-accessible
Repo
Framework

Adversary A3C for Robust Reinforcement Learning

Title Adversary A3C for Robust Reinforcement Learning
Authors Zhaoyuan Gu, Zhenzhong Jia, Howie Choset
Abstract Asynchronous Advantage Actor Critic (A3C) is an effective Reinforcement Learning (RL) algorithm for a wide range of tasks, such as Atari games and robot control. The agent learns policies and a value function through trial-and-error interactions with the environment until it converges to an optimal policy. Robustness and stability are critical in RL; however, neural networks can be vulnerable to noise from unexpected sources and are unlikely to withstand even slight disturbances. We note that agents trained with A3C in mild environments are not able to handle challenging environments. Learning from adversarial examples, we propose an algorithm called Adversary Robust A3C (AR-A3C) to improve the agent's performance in noisy environments. In this algorithm, an adversarial agent is introduced into the learning process to make it more robust against adversarial disturbances, thereby making it more adaptive to noisy environments. Both simulations and real-world experiments are carried out to illustrate the stability of the proposed algorithm. The AR-A3C algorithm outperforms A3C in both clean and noisy environments.
Tasks Atari Games
Published 2019-12-01
URL https://arxiv.org/abs/1912.00330v1
PDF https://arxiv.org/pdf/1912.00330v1.pdf
PWC https://paperswithcode.com/paper/adversary-a3c-for-robust-reinforcement-1
Repo
Framework
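
A conceptual sketch of the adversarial ingredient in AR-A3C: a second agent injects a bounded disturbance into the protagonist's action and receives the negated reward, so training becomes a zero-sum game. The wrapper interface and the toy environment are assumptions for illustration, not the authors' setup.

```python
import numpy as np

class AdversarialActionWrapper:
    """Zero-sum training setup in the spirit of AR-A3C: an adversary injects a
    bounded disturbance into the protagonist's action at every step, so the
    protagonist must learn a policy that still performs under perturbation."""
    def __init__(self, env_step, eps=0.1):
        self.env_step = env_step    # function: action -> (obs, reward, done)
        self.eps = eps              # disturbance budget

    def step(self, action, adversary_action):
        disturbed = action + np.clip(adversary_action, -self.eps, self.eps)
        obs, reward, done = self.env_step(disturbed)
        # protagonist maximizes the reward, adversary receives its negation
        return obs, reward, -reward, done

# toy usage with a stand-in one-dimensional environment
def toy_step(a):
    return np.zeros(1), float(-abs(a - 0.5)), False   # reward peaks at a == 0.5

wrapped = AdversarialActionWrapper(toy_step, eps=0.1)
obs, r_protagonist, r_adversary, done = wrapped.step(np.array(0.5), np.array(0.3))
print(r_protagonist, r_adversary)
```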

Face Detection in Repeated Settings

Title Face Detection in Repeated Settings
Authors Mohammad Nayeem Teli, Bruce A. Draper, J. Ross Beveridge
Abstract Face detection is an important first step before face verification and recognition. In unconstrained settings it remains an open challenge because of variation in pose, lighting, scale, background and location. For the purposes of verification, however, we can control the background and location: images are primarily captured in places such as the entrance to a sensitive building, in front of a door, or another location where the background does not change. We present a correlation-based face detection algorithm to detect faces in such settings, where we control the location and leave lighting, pose, and scale uncontrolled. In these scenarios the results indicate that our algorithm is easy and fast to train, outperforms the Viola-Jones detector in accuracy, and is faster at test time.
Tasks Face Detection, Face Verification
Published 2019-03-20
URL http://arxiv.org/abs/1903.08649v1
PDF http://arxiv.org/pdf/1903.08649v1.pdf
PWC https://paperswithcode.com/paper/face-detection-in-repeated-settings
Repo
Framework
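
The abstract does not give the exact correlation filter used; as a generic stand-in, the sketch below scores each location with zero-mean normalized cross-correlation against a face template and takes the peak.

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Slide a template over the image and score each location with zero-mean
    normalized cross-correlation; the peak is the detected location."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.linalg.norm(t) + 1e-12
    H, W = image.shape
    scores = np.full((H - th + 1, W - tw + 1), -1.0)
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            patch = image[i:i + th, j:j + tw]
            p = patch - patch.mean()
            scores[i, j] = float((p * t).sum() / (np.linalg.norm(p) * t_norm + 1e-12))
    return scores

# toy usage: plant the template in a noisy image and recover its position
rng = np.random.default_rng(0)
template = rng.uniform(0, 1, (8, 8))
image = rng.uniform(0, 1, (32, 32)) * 0.1
image[10:18, 5:13] += template
scores = normalized_cross_correlation(image, template)
print(np.unravel_index(scores.argmax(), scores.shape))   # expected near (10, 5)
```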

Patient Clustering Improves Efficiency of Federated Machine Learning to predict mortality and hospital stay time using distributed Electronic Medical Records

Title Patient Clustering Improves Efficiency of Federated Machine Learning to predict mortality and hospital stay time using distributed Electronic Medical Records
Authors Li Huang, Dianbo Liu
Abstract Electronic medical records (EMRs) support the development of machine learning algorithms for predicting disease incidence, patient response to treatment, and other healthcare events. So far, however, most algorithms have been centralized, taking little account of the decentralized, non-identically and independently distributed (non-IID), and privacy-sensitive characteristics of EMRs, which can complicate data collection, sharing and learning. To address this challenge, we introduce a community-based federated machine learning (CBFL) algorithm and evaluate it on non-IID ICU EMRs. Our algorithm clusters the distributed data into clinically meaningful communities that capture similar diagnoses and geographical locations, and learns one model for each community. Throughout the learning process, the data is kept local at each hospital, while locally computed results are aggregated on a server. Evaluation results show that CBFL outperformed the baseline FL algorithm in terms of Area Under the Receiver Operating Characteristic Curve (ROC AUC), Area Under the Precision-Recall Curve (PR AUC), and communication cost between hospitals and the server. Furthermore, differences in performance between communities could be explained by how dissimilar one community was to the others.
Tasks
Published 2019-03-22
URL http://arxiv.org/abs/1903.09296v1
PDF http://arxiv.org/pdf/1903.09296v1.pdf
PWC https://paperswithcode.com/paper/patient-clustering-improves-efficiency-of
Repo
Framework
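
A minimal sketch of the CBFL recipe as the abstract describes it: cluster hospitals into communities, then run federated averaging within each community so raw records never leave the hospital. The clustering features, plain k-means, and simple FedAvg below are simplifying assumptions, not the paper's implementation.

```python
import numpy as np

def cluster_hospitals(hospital_features, n_communities, iters=20, seed=0):
    """Group hospitals into communities with k-means on summary features
    (e.g. diagnosis mix), so each community trains its own model."""
    rng = np.random.default_rng(seed)
    X = np.asarray(hospital_features, dtype=float)
    centers = X[rng.choice(len(X), n_communities, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(n_communities):
            if np.any(assign == k):
                centers[k] = X[assign == k].mean(axis=0)
    return assign

def federated_average(local_weights, local_sizes):
    """FedAvg within one community: average locally trained weight vectors,
    weighted by each hospital's record count; raw data stays local."""
    sizes = np.asarray(local_sizes, dtype=float)
    W = np.stack(local_weights, axis=0)
    return (W * (sizes / sizes.sum())[:, None]).sum(axis=0)

# toy usage: 6 hospitals, 2 communities, 5-dimensional model weights
rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(0, 1, (3, 4)), rng.normal(5, 1, (3, 4))])
assign = cluster_hospitals(feats, n_communities=2)
weights = [rng.normal(size=5) for _ in range(6)]
sizes = [100, 50, 80, 200, 120, 60]
for k in set(assign.tolist()):
    members = [i for i in range(6) if assign[i] == k]
    w_k = federated_average([weights[i] for i in members], [sizes[i] for i in members])
    print(k, w_k.shape)
```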

Bayesian Reinforcement Learning via Deep, Sparse Sampling

Title Bayesian Reinforcement Learning via Deep, Sparse Sampling
Authors Divya Grover, Debabrota Basu, Christos Dimitrakakis
Abstract We address the problem of Bayesian reinforcement learning using efficient model-based online planning. We propose an optimism-free Bayes-adaptive algorithm to induce deeper and sparser exploration with a theoretical bound on its performance relative to the Bayes optimal as well as lower computational complexity. The main novelty is the use of a candidate policy generator, to generate long-term options in the planning tree (over beliefs), which allows us to create much sparser and deeper trees. Experimental results on different environments show that in comparison to the state-of-the-art, our algorithm is both computationally more efficient, and obtains significantly higher reward over time in discrete environments.
Tasks Efficient Exploration
Published 2019-02-07
URL https://arxiv.org/abs/1902.02661v3
PDF https://arxiv.org/pdf/1902.02661v3.pdf
PWC https://paperswithcode.com/paper/deeper-sparser-exploration
Repo
Framework

Weakly Supervised Object Detection with Segmentation Collaboration

Title Weakly Supervised Object Detection with Segmentation Collaboration
Authors Xiaoyan Li, Meina Kan, Shiguang Shan, Xilin Chen
Abstract Weakly supervised object detection aims at learning precise object detectors given only image category labels. In recent prevailing works, this problem is generally formulated as a multiple instance learning module guided by an image classification loss. The object bounding box is assumed to be the proposal contributing most to the classification. However, the region contributing most is also likely to be a crucial part or the supporting context of an object. To obtain a more accurate detector, in this work we propose a novel end-to-end weakly supervised detection approach, where a newly introduced generative adversarial segmentation module interacts with the conventional detection module in a collaborative loop. The collaboration mechanism takes full advantage of the complementary interpretations of the weakly supervised localization task, namely the detection and segmentation tasks, forming a more comprehensive solution. Consequently, our method obtains more precise object bounding boxes, rather than parts or irrelevant surroundings. As expected, the proposed method achieves an accuracy of 51.0% on the PASCAL VOC 2007 dataset, outperforming the state of the art and demonstrating its superiority for weakly supervised object detection.
Tasks Image Classification, Multiple Instance Learning, Object Detection, Weakly Supervised Object Detection
Published 2019-04-01
URL http://arxiv.org/abs/1904.00551v1
PDF http://arxiv.org/pdf/1904.00551v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-object-detection-with-2
Repo
Framework

Unsupervised Bi-directional Flow-based Video Generation from one Snapshot

Title Unsupervised Bi-directional Flow-based Video Generation from one Snapshot
Authors Lu Sheng, Junting Pan, Jiaming Guo, Jing Shao, Xiaogang Wang, Chen Change Loy
Abstract Imagining multiple consecutive frames given one single snapshot is challenging, since it is difficult to simultaneously predict diverse motions from a single image and faithfully generate novel frames without visual distortions. In this work, we leverage an unsupervised variational model to learn rich motion patterns in the form of long-term bi-directional flow fields, and apply the predicted flows to generate high-quality video sequences. In contrast to the state-of-the-art approach, our method does not require external flow supervision for learning. This is achieved through a novel module that performs bi-directional flow prediction from a single image. In addition, with a bi-directional flow consistency check, our method can handle occlusion and warping artifacts in a principled manner. Our method can be trained end-to-end on arbitrarily sampled natural video clips, and it is able to capture multi-modal motion uncertainty and synthesize photo-realistic novel sequences. Quantitative and qualitative evaluations on synthetic and real-world datasets demonstrate the effectiveness of the proposed approach over state-of-the-art methods.
Tasks Video Generation
Published 2019-03-03
URL http://arxiv.org/abs/1903.00913v1
PDF http://arxiv.org/pdf/1903.00913v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-bi-directional-flow-based-video
Repo
Framework
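
The bi-directional flow consistency check mentioned in the abstract can be illustrated outside the learned model: warp the backward flow along the forward flow and flag pixels where the two roughly cancel. The nearest-neighbor sampling and tolerance below are assumptions for the sketch, not the paper's exact formulation.

```python
import numpy as np

def flow_consistency_mask(flow_fwd, flow_bwd, tol=1.0):
    """Bi-directional flow consistency check: sample the backward flow at the
    pixel each forward flow points to and mark a pixel consistent when the two
    flows roughly cancel. Inconsistent pixels typically indicate occlusion."""
    H, W, _ = flow_fwd.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # where each pixel lands in the second frame under the forward flow
    x2 = np.clip(np.round(xs + flow_fwd[..., 0]).astype(int), 0, W - 1)
    y2 = np.clip(np.round(ys + flow_fwd[..., 1]).astype(int), 0, H - 1)
    bwd_at_target = flow_bwd[y2, x2]                 # backward flow sampled there
    err = np.linalg.norm(flow_fwd + bwd_at_target, axis=-1)
    return err < tol                                 # True where flows agree

# toy usage: a constant rightward shift is perfectly consistent
fwd = np.zeros((4, 4, 2)); fwd[..., 0] = 1.0        # move 1 px right
bwd = np.zeros((4, 4, 2)); bwd[..., 0] = -1.0       # move 1 px left
print(flow_consistency_mask(fwd, bwd).all())         # True
```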

Deep Reinforcement Learning for Personalized Search Story Recommendation

Title Deep Reinforcement Learning for Personalized Search Story Recommendation
Authors Jason Zhang, Junming Yin, Dongwon Lee, Linhong Zhu
Abstract In recent years, the “search story,” a display combined with other organic channels, has become a major source of user traffic on platforms such as e-commerce search platforms, news feed platforms, and web and image search platforms. The recommended search story guides a user to identify her own preference and personal intent, which subsequently influences the user’s real-time and long-term search behavior. As search stories become increasingly important, in this work we study the problem of personalized search story recommendation within a search engine, which aims to suggest a search story relevant to both a search keyword and an individual user’s interest. To address the challenge of modeling both the immediate and future values of recommended search stories (i.e., the cross-channel effect), for which the conventional supervised learning framework is not applicable, we resort to a Markov decision process and propose a deep reinforcement learning architecture trained by both imitation learning and reinforcement learning. We empirically demonstrate the effectiveness of our proposed approach through extensive experiments on real-world data sets from JD.com.
Tasks Image Retrieval, Imitation Learning
Published 2019-07-26
URL https://arxiv.org/abs/1907.11754v1
PDF https://arxiv.org/pdf/1907.11754v1.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-for-personalized
Repo
Framework