October 15, 2019

Paper Group NANR 257

Dialogue Structure Annotation for Multi-Floor Interaction


Title	Dialogue Structure Annotation for Multi-Floor Interaction
Authors	David Traum, Cassidy Henry, Stephanie Lukin, Ron Artstein, Felix Gervits, Kimberly Pollard, Claire Bonial, Su Lei, Clare Voss, Matthew Marge, Cory Hayes, Susan Hill
Abstract
Tasks
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1017/
PDF	https://www.aclweb.org/anthology/L18-1017
PWC	https://paperswithcode.com/paper/dialogue-structure-annotation-for-multi-floor
Repo
Framework

Improving Color Reproduction Accuracy on Cameras


Title	Improving Color Reproduction Accuracy on Cameras
Authors	Hakki Can Karaimer, Michael S. Brown
Abstract	One of the key operations performed on a digital camera is to map the sensor-specific color space to a standard perceptual color space. This procedure involves the application of a white-balance correction followed by a color space transform. The current approach for this colorimetric mapping is based on an interpolation of pre-calibrated color space transforms computed for two fixed illuminations (i.e., two white-balance settings). Images captured under other illuminations have lower color accuracy because of this interpolation process. In this paper, we discuss the limitations of the current colorimetric mapping approach and propose two methods that are able to improve color accuracy. We evaluate our approach on seven different cameras and show improvements of up to 30% (DSLR cameras) and 59% (mobile phone cameras) in terms of color reproduction error.
Tasks
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Karaimer_Improving_Color_Reproduction_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Karaimer_Improving_Color_Reproduction_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/improving-color-reproduction-accuracy-on
Repo
Framework
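
The conventional mapping described in the abstract above (white balance, then a color space transform interpolated between two pre-calibrated matrices) can be sketched roughly as follows. This is a minimal illustration, not the authors' proposed method: the function names are hypothetical, and weighting the two matrices linearly in inverse correlated color temperature (CCT) is an assumption made for the example.

```python
def white_balance(raw_rgb, gains):
    """Scale each raw sensor channel by its white-balance gain."""
    return [c * g for c, g in zip(raw_rgb, gains)]


def interpolation_weight(cct, cct_a, cct_b):
    """Weight given to the transform calibrated at cct_a.

    Assumption for this sketch: the weight is linear in 1/CCT and is
    clamped to [0, 1] outside the calibrated range.
    """
    w = (1.0 / cct - 1.0 / cct_b) / (1.0 / cct_a - 1.0 / cct_b)
    return min(1.0, max(0.0, w))


def interpolate_cst(cst_a, cst_b, w):
    """Blend two 3x3 color space transforms element by element."""
    return [
        [w * a + (1.0 - w) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(cst_a, cst_b)
    ]


def apply_cst(cst, rgb):
    """Multiply a 3x3 matrix by an RGB triple."""
    return [sum(m * c for m, c in zip(row, rgb)) for row in cst]


def map_color(raw_rgb, gains, cst_a, cst_b, cct, cct_a, cct_b):
    """White-balance a raw pixel, then apply the interpolated transform."""
    w = interpolation_weight(cct, cct_a, cct_b)
    cst = interpolate_cst(cst_a, cst_b, w)
    return apply_cst(cst, white_balance(raw_rgb, gains))
```

At either calibrated illumination the blended matrix reduces to the corresponding pre-calibrated transform; in between, it is only an approximation, which is the limitation the abstract points to.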

Transferable Adversarial Perturbations


Title	Transferable Adversarial Perturbations
Authors	Wen Zhou, Xin Hou, Yongjun Chen, Mengyun Tang, Xiangqi Huang, Xiang Gan, Yong Yang
Abstract	State-of-the-art deep neural network classifiers are highly vulnerable to adversarial examples which are designed to mislead classifiers with a very small perturbation. However, the performance of black-box attacks (without knowledge of the model parameters) against deployed models always degrades significantly. In this paper, we propose a novel way of constructing perturbations for adversarial examples to enable black-box transfer. We first show that maximizing distance between natural images and their adversarial examples in the intermediate feature maps can improve both white-box attacks (with knowledge of the model parameters) and black-box attacks. We also show that smooth regularization on adversarial perturbations enables transferring across models. Extensive experimental results show that our approach outperforms state-of-the-art methods both in white-box and black-box attacks.
Tasks
Published	2018-09-01
URL	http://openaccess.thecvf.com/content_ECCV_2018/html/Bruce_Hou_Transferable_Adversarial_Perturbations_ECCV_2018_paper.html
PDF	http://openaccess.thecvf.com/content_ECCV_2018/papers/Bruce_Hou_Transferable_Adversarial_Perturbations_ECCV_2018_paper.pdf
PWC	https://paperswithcode.com/paper/transferable-adversarial-perturbations
Repo
Framework

Modelling Salient Features as Directions in Fine-Tuned Semantic Spaces


Title	Modelling Salient Features as Directions in Fine-Tuned Semantic Spaces
Authors	Thomas Ager, Ondřej Kuželka, Steven Schockaert
Abstract	In this paper we consider semantic spaces consisting of objects from some particular domain (e.g. IMDB movie reviews). Various authors have observed that such semantic spaces often model salient features (e.g. how scary a movie is) as directions. These feature directions allow us to rank objects according to how much they have the corresponding feature, and can thus play an important role in interpretable classifiers, recommendation systems, or entity-oriented search engines, among others. Methods for learning semantic spaces, however, are mostly aimed at modelling similarity. In this paper, we argue that there is an inherent trade-off between capturing similarity and faithfully modelling features as directions. Following this observation, we propose a simple method to fine-tune existing semantic spaces, with the aim of improving the quality of their feature directions. Crucially, our method is fully unsupervised, requiring only a bag-of-words representation of the objects as input.
Tasks	Knowledge Graph Embeddings, Recommendation Systems, Word Embeddings
Published	2018-10-01
URL	https://www.aclweb.org/anthology/K18-1051/
PDF	https://www.aclweb.org/anthology/K18-1051
PWC	https://paperswithcode.com/paper/modelling-salient-features-as-directions-in
Repo
Framework
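
The idea of ranking objects along a feature direction, as described in the abstract above, can be illustrated with a small sketch. The vectors, names, and the `rank_by_feature` helper are hypothetical; the score used here is simply the dot product between each object vector and the direction.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_by_feature(objects, direction):
    """Return object names sorted by how far they lie along `direction`.

    `objects` maps a name to its vector in the semantic space; a larger
    dot product is read as "has more of the feature".
    """
    return sorted(objects, key=lambda name: dot(objects[name], direction), reverse=True)


# Toy 2-D space in which the first axis stands in for "scariness".
movies = {"a": [0.9, 0.1], "b": [0.2, 0.8], "c": [0.5, 0.5]}
ranking = rank_by_feature(movies, [1.0, 0.0])
```

How the direction itself is learned, and how the space is fine-tuned to make such directions more faithful, is the subject of the paper and is not shown here.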

WaveNet 聲碼器及其於語音轉換之應用 (WaveNet Vocoder and its Applications in Voice Conversion) [In Chinese]


Title	WaveNet 聲碼器及其於語音轉換之應用 (WaveNet Vocoder and its Applications in Voice Conversion) [In Chinese]
Authors	Wen-Chin Huang, Chen-Chou Lo, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang
Abstract
Tasks	Voice Conversion
Published	2018-10-01
URL	https://www.aclweb.org/anthology/O18-1009/
PDF	https://www.aclweb.org/anthology/O18-1009
PWC	https://paperswithcode.com/paper/wavenet-e2c14a-aa14eae3e12a1c-wavenet-vocoder
Repo
Framework

Erase or Fill? Deep Joint Recurrent Rain Removal and Reconstruction in Videos


Title	Erase or Fill? Deep Joint Recurrent Rain Removal and Reconstruction in Videos
Authors	Jiaying Liu, Wenhan Yang, Shuai Yang, Zongming Guo
Abstract	In this paper, we address the problem of video rain removal by constructing deep recurrent convolutional networks. We visit the rain removal case by considering rain occlusion regions, i.e. light transmittance of rain streaks is low. Different from additive rain streaks, in such rain occlusion regions, the details of background images are completely lost. Therefore, we propose a hybrid rain model to depict both rain streaks and occlusions. With the wealth of temporal redundancy, we build a Joint Recurrent Rain Removal and Reconstruction Network (J4R-Net) that seamlessly integrates rain degradation classification, spatial texture appearances based rain removal and temporal coherence based background details reconstruction. The rain degradation classification provides a binary map that reveals whether a location is degraded by linear additive streaks or occlusions. With this side information, the gate of the recurrent unit learns to make a trade-off between rain streak removal and background details reconstruction. Extensive experiments on a series of synthetic and real videos with rain streaks verify the superiority of the proposed method over previous state-of-the-art methods.
Tasks	Rain Removal
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Liu_Erase_or_Fill_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Liu_Erase_or_Fill_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/erase-or-fill-deep-joint-recurrent-rain
Repo
Framework

Recovering Accurate 3D Human Pose in The Wild Using IMUs and a Moving Camera


Title	Recovering Accurate 3D Human Pose in The Wild Using IMUs and a Moving Camera
Authors	Timo von Marcard, Roberto Henschel, Michael J. Black, Bodo Rosenhahn, Gerard Pons-Moll
Abstract	In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached at the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph based optimization problem that forces 3D to 2D coherency within a frame and across long range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU synchronized with ground truth. We obtain an accuracy of 26mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going up-stairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.
Tasks	3D Pose Estimation, Pose Estimation
Published	2018-09-01
URL	http://openaccess.thecvf.com/content_ECCV_2018/html/Timo_von_Marcard_Recovering_Accurate_3D_ECCV_2018_paper.html
PDF	http://openaccess.thecvf.com/content_ECCV_2018/papers/Timo_von_Marcard_Recovering_Accurate_3D_ECCV_2018_paper.pdf
PWC	https://paperswithcode.com/paper/recovering-accurate-3d-human-pose-in-the-wild
Repo
Framework

An Empirical Study of Self-Disclosure in Spoken Dialogue Systems


Title	An Empirical Study of Self-Disclosure in Spoken Dialogue Systems
Authors	Abhilasha Ravichander, Alan W. Black
Abstract	Self-disclosure is a key social strategy employed in conversation to build relations and increase conversational depth. It has been heavily studied in psychology and linguistic literature, particularly for its ability to induce self-disclosure from the recipient, a phenomenon known as reciprocity. However, we know little about how self-disclosure manifests in conversation with automated dialog systems, especially as any self-disclosure on the part of a dialog system is patently disingenuous. In this work, we run a large-scale quantitative analysis on the effect of self-disclosure by analyzing interactions between real-world users and a spoken dialog system in the context of social conversation. We find that indicators of reciprocity occur even in human-machine dialog, with far-reaching implications for chatbots in a variety of domains including education, negotiation and social dialog.
Tasks	Spoken Dialogue Systems
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-5030/
PDF	https://www.aclweb.org/anthology/W18-5030
PWC	https://paperswithcode.com/paper/an-empirical-study-of-self-disclosure-in
Repo
Framework

Video Rain Streak Removal by Multiscale Convolutional Sparse Coding


Title	Video Rain Streak Removal by Multiscale Convolutional Sparse Coding
Authors	Minghan Li, Qi Xie, Qian Zhao, Wei Wei, Shuhang Gu, Jing Tao, Deyu Meng
Abstract	Videos captured by outdoor surveillance equipment sometimes contain unexpected rain streaks, which makes subsequent video processing tasks more difficult. Rain streak removal from a video is thus an important topic in recent computer vision research. In this paper, we raise two intrinsic characteristics specifically possessed by rain streaks. Firstly, the rain streaks in a video contain repetitive local patterns sparsely scattered over different positions of the video. Secondly, the rain streaks have multiscale configurations because they occur at positions with different distances to the cameras. Based on such understanding, we specifically formulate both characteristics into a multiscale convolutional sparse coding (MS-CSC) model for the video rain streak removal task. Specifically, we use multiple convolutional filters convolved on the sparse feature maps to deliver the former characteristic, and further use multiscale filters to represent different scales of rain streaks. Such a new encoding manner makes the proposed method capable of properly extracting rain streaks from videos, thus getting fine video deraining effects. Experiments implemented on synthetic and real videos verify the superiority of the proposed method, as compared with the state-of-the-art ones along this research line, both visually and quantitatively.
Tasks	Rain Removal
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Li_Video_Rain_Streak_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Li_Video_Rain_Streak_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/video-rain-streak-removal-by-multiscale
Repo
Framework
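
The representation described in the abstract above, a rain layer formed by convolving filters of several sizes with sparse feature maps, can be sketched in one dimension. This is only an illustration under simplifying assumptions: the signals are 1-D lists, the sparse feature maps are taken as given (the paper estimates them by optimization), and all function names are hypothetical.

```python
def conv_same(feature_map, kernel):
    """Causal 1-D convolution truncated to the length of `feature_map`."""
    out = [0.0] * len(feature_map)
    for i in range(len(feature_map)):
        for k, weight in enumerate(kernel):
            if i - k >= 0:
                out[i] += weight * feature_map[i - k]
    return out


def synthesize_rain(feature_maps, filters):
    """Sum the responses of each filter convolved with its sparse feature map.

    Filters of different lengths stand in for rain streaks at different scales.
    """
    rain = [0.0] * len(feature_maps[0])
    for fmap, kernel in zip(feature_maps, filters):
        for i, value in enumerate(conv_same(fmap, kernel)):
            rain[i] += value
    return rain


def remove_rain(observed, feature_maps, filters):
    """Subtract the synthesized rain layer from the observed signal."""
    rain = synthesize_rain(feature_maps, filters)
    return [o - r for o, r in zip(observed, rain)]


# Two sparse feature maps (one nonzero entry each) and two filter scales.
maps = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
kernels = [[1.0, 2.0], [3.0, 1.0, 1.0]]
rain_layer = synthesize_rain(maps, kernels)
background = remove_rain([5.0, 6.0, 7.0, 8.0], maps, kernels)
```

Each nonzero entry in a feature map places one copy of the corresponding filter, which is how a few repeated local patterns can account for streaks scattered across the video.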

#MeToo Alexa: How Conversational Systems Respond to Sexual Harassment


Title	#MeToo Alexa: How Conversational Systems Respond to Sexual Harassment
Authors	Amanda Cercas Curry, Verena Rieser
Abstract	Conversational AI systems, such as Amazon's Alexa, are rapidly developing from purely transactional systems to social chatbots, which can respond to a wide variety of user requests. In this article, we establish how current state-of-the-art conversational systems react to inappropriate requests, such as bullying and sexual harassment on the part of the user, by collecting and analysing the novel #MeTooAlexa corpus. Our results show that commercial systems mainly avoid answering, while rule-based chatbots show a variety of behaviours and often deflect. Data-driven systems, on the other hand, are often non-coherent, but also run the risk of being interpreted as flirtatious and sometimes react with counter-aggression. This includes our own system, trained on "clean" data, which suggests that inappropriate system behaviour is not caused by data bias.
Tasks
Published	2018-06-01
URL	https://www.aclweb.org/anthology/W18-0802/
PDF	https://www.aclweb.org/anthology/W18-0802
PWC	https://paperswithcode.com/paper/metoo-alexa-how-conversational-systems-1
Repo
Framework

Deep End-to-End Time-of-Flight Imaging


Title	Deep End-to-End Time-of-Flight Imaging
Authors	Shuochen Su, Felix Heide, Gordon Wetzstein, Wolfgang Heidrich
Abstract	We present an end-to-end image processing framework for time-of-flight (ToF) cameras. Existing ToF image processing pipelines consist of a sequence of operations including modulated exposures, denoising, phase unwrapping and multipath interference correction. While this cascaded modular design offers several benefits, such as closed-form solutions and power-efficient processing, it also suffers from error accumulation and information loss as each module can only observe the output from its direct predecessor, resulting in erroneous depth estimates. We depart from a conventional pipeline model and propose a deep convolutional neural network architecture that recovers scene depth directly from dual-frequency, raw ToF correlation measurements. To train this network, we simulate ToF images for a variety of scenes using a time-resolved renderer, devise depth-specific losses, and apply normalization and augmentation strategies to generalize this model to real captures. We demonstrate that the proposed network can efficiently exploit the spatio-temporal structures of ToF frequency measurements, and validate the performance of the joint multipath removal, denoising and phase unwrapping method on a wide range of challenging scenes.
Tasks	Denoising
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Su_Deep_End-to-End_Time-of-Flight_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Su_Deep_End-to-End_Time-of-Flight_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/deep-end-to-end-time-of-flight-imaging
Repo
Framework
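
For context on the phase unwrapping step that the abstract above lists as part of conventional ToF pipelines, here is a minimal sketch of the standard phase-to-depth relation, d = c·φ / (4πf), together with a brute-force dual-frequency unwrapping. It illustrates the classical step only, not the proposed network; the function names, the search strategy, and the example frequencies are assumptions made for the illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def unambiguous_range(freq_hz):
    """Largest depth measurable at one modulation frequency before the phase wraps."""
    return C / (2.0 * freq_hz)


def depth_to_phase(depth_m, freq_hz):
    """Wrapped phase in [0, 2*pi) that an ideal sensor would measure."""
    return (4.0 * math.pi * freq_hz * depth_m / C) % (2.0 * math.pi)


def phase_to_depth(phase, freq_hz, wraps=0):
    """Depth implied by a wrapped phase plus an integer number of wraps."""
    return (phase + 2.0 * math.pi * wraps) * C / (4.0 * math.pi * freq_hz)


def unwrap_dual_frequency(phase_1, freq_1, phase_2, freq_2, max_range_m):
    """Pick the wrap counts whose two depth hypotheses agree best."""
    best = None
    for n1 in range(int(max_range_m / unambiguous_range(freq_1)) + 1):
        d1 = phase_to_depth(phase_1, freq_1, n1)
        for n2 in range(int(max_range_m / unambiguous_range(freq_2)) + 1):
            d2 = phase_to_depth(phase_2, freq_2, n2)
            error = abs(d1 - d2)
            if best is None or error < best[0]:
                best = (error, 0.5 * (d1 + d2))
    return best[1]


# Hypothetical example: a 5 m target seen at 40 MHz and 70 MHz.
p1 = depth_to_phase(5.0, 40e6)
p2 = depth_to_phase(5.0, 70e6)
recovered = unwrap_dual_frequency(p1, 40e6, p2, 70e6, max_range_m=14.0)
```

Either frequency alone wraps well before 5 m (at roughly 3.7 m and 2.1 m respectively), so the depth is only recoverable by combining both measurements. With noisy phases this exhaustive matching can pick the wrong wrap count, which is one way errors propagate through a cascaded pipeline.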

Loss Decomposition for Fast Learning in Large Output Spaces


Title	Loss Decomposition for Fast Learning in Large Output Spaces
Authors	Ian En-Hsu Yen, Satyen Kale, Felix Yu, Daniel Holtmann-Rice, Sanjiv Kumar, Pradeep Ravikumar
Abstract	For problems with large output spaces, evaluation of the loss function and its gradient is expensive, typically taking linear time in the size of the output space. Recently, methods have been developed to speed up learning via efficient data structures for Nearest-Neighbor Search (NNS) or Maximum Inner-Product Search (MIPS). However, the performance of such data structures typically degrades in high dimensions. In this work, we propose a novel technique to reduce the intractable high dimensional search problem to several much more tractable lower dimensional ones via dual decomposition of the loss function. At the same time, we demonstrate guaranteed convergence to the original loss via a greedy message passing procedure. In our experiments on multiclass and multilabel classification with hundreds of thousands of classes, as well as training skip-gram word embeddings with a vocabulary size of half a million, our technique consistently improves the accuracy of search-based gradient approximation methods and outperforms sampling-based gradient approximation methods by a large margin.
Tasks	Word Embeddings
Published	2018-07-01
URL	https://icml.cc/Conferences/2018/Schedule?showEvent=2391
PDF	http://proceedings.mlr.press/v80/yen18a/yen18a.pdf
PWC	https://paperswithcode.com/paper/loss-decomposition-for-fast-learning-in-large
Repo
Framework

Building a TOCFL Learner Corpus for Chinese Grammatical Error Diagnosis


Title	Building a TOCFL Learner Corpus for Chinese Grammatical Error Diagnosis
Authors	Lung-Hao Lee, Yuen-Hsien Tseng, Li-Ping Chang
Abstract
Tasks	Grammatical Error Detection, Language Acquisition, Language Identification
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1363/
PDF	https://www.aclweb.org/anthology/L18-1363
PWC	https://paperswithcode.com/paper/building-a-tocfl-learner-corpus-for-chinese
Repo
Framework

Adaptive Dropout with Rademacher Complexity Regularization


Title	Adaptive Dropout with Rademacher Complexity Regularization
Authors	Ke Zhai, Huan Wang
Abstract	We propose a novel framework to adaptively adjust the dropout rates for the deep neural network based on a Rademacher complexity bound. The state-of-the-art deep learning algorithms impose dropout strategy to prevent feature co-adaptation. However, choosing the dropout rates remains an art of heuristics or relies on empirical grid-search over some hyperparameter space. In this work, we show the network Rademacher complexity is bounded by a function related to the dropout rate vectors and the weight coefficient matrices. Subsequently, we impose this bound as a regularizer and provide a theoretically justified way to trade off between model complexity and representation power. Therefore, the dropout rates and the empirical loss are unified into the same objective function, which is then optimized using the block coordinate descent algorithm. We discover that the adaptively adjusted dropout rates converge to some interesting distributions that reveal meaningful patterns. Experiments on the task of image and document classification also show our method achieves better performance compared to the state-of-the-art dropout algorithms.
Tasks	Document Classification
Published	2018-01-01
URL	https://openreview.net/forum?id=S1uxsye0Z
PDF	https://openreview.net/pdf?id=S1uxsye0Z
PWC	https://paperswithcode.com/paper/adaptive-dropout-with-rademacher-complexity
Repo
Framework

CUNI Transformer Neural MT System for WMT18


Title	CUNI Transformer Neural MT System for WMT18
Authors	Martin Popel
Abstract	We describe our NMT system submitted to the WMT2018 shared task in news translation. Our system is based on the Transformer model (Vaswani et al., 2017). We use an improved technique of backtranslation, where we iterate the process of translating monolingual data in one direction and training an NMT model for the opposite direction using synthetic parallel data. We apply a simple but effective filtering of the synthetic data. We pre-process the input sentences using coreference resolution in order to disambiguate the gender of pro-dropped personal pronouns. Finally, we apply two simple post-processing substitutions on the translated output. Our system is significantly (p < 0.05) better than all other English-Czech and Czech-English systems in WMT2018.
Tasks	Coreference Resolution, Machine Translation
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-6424/
PDF	https://www.aclweb.org/anthology/W18-6424
PWC	https://paperswithcode.com/paper/cuni-transformer-neural-mt-system-for-wmt18
Repo
Framework