Paper Group AWR 235
This group collects the following papers:

- Decoupling Localization and Classification in Single Shot Temporal Action Detection
- Dataset Culling: Towards Efficient Training Of Distillation-Based Domain Specific Models
- Reinforcement Learning for Market Making in a Multi-agent Dealer Market
- LEDNet: A Lightweight Encoder-Decoder Network for Real-Time Semantic Segmentation
- Learning Hierarchical Discourse-level Structure for Fake News Detection
- Complex Transformer: A Framework for Modeling Complex-Valued Sequence
- Robust Chinese Word Segmentation with Contextualized Word Representations
- Improving Adversarial Robustness via Promoting Ensemble Diversity
- Self-Supervised Correspondence in Visuomotor Policy Learning
- Image to Images Translation for Multi-Task Organ Segmentation and Bone Suppression in Chest X-Ray Radiography
- PadChest: A large chest x-ray image dataset with multi-label annotated reports
- Mapped Convolutions
- HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-scale Point Clouds
- Kernel computations from large-scale random features obtained by Optical Processing Units
- Generative Image Translation for Data Augmentation in Colorectal Histopathology Images
Decoupling Localization and Classification in Single Shot Temporal Action Detection
Title | Decoupling Localization and Classification in Single Shot Temporal Action Detection |
Authors | Yupan Huang, Qi Dai, Yutong Lu |
Abstract | Video temporal action detection aims to temporally localize and recognize actions in untrimmed videos. Existing one-stage approaches mostly focus on unifying the two subtasks, i.e., localization of action proposals and classification of each proposal, through a fully shared backbone. However, encapsulating all components of the two subtasks in one single network may restrict training by ignoring the specialized characteristics of each subtask. In this paper, we propose a novel Decoupled Single Shot temporal Action Detection (Decouple-SSAD) method to mitigate this problem by decoupling localization and classification in a one-stage scheme. In particular, two separate branches are designed in parallel so that each component owns its representations privately, for accurate localization or classification. Each branch produces a set of action anchor layers by applying deconvolution to the feature maps of the main stream. High-level semantic information from deeper layers is thus incorporated to enhance the feature representations. We conduct extensive experiments on the THUMOS14 dataset and demonstrate superior performance over state-of-the-art methods. Our code is available online. |
Tasks | Action Detection |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07442v1 |
PDF | http://arxiv.org/pdf/1904.07442v1.pdf |
PWC | https://paperswithcode.com/paper/decoupling-localization-and-classification-in |
Repo | https://github.com/hypjudy/Decouple-SSAD |
Framework | tf |
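To make the decoupling concrete, below is a minimal PyTorch sketch of the idea as the abstract describes it (the official repo above is TensorFlow, and all layer sizes, anchor counts, and the fusion scheme here are illustrative assumptions, not the authors' configuration): a shared main stream of strided temporal convolutions, plus two private branches that deconvolve deeper feature maps before predicting classification scores and localization offsets separately.

```python
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    def __init__(self, channels=256, num_classes=21, num_anchors=5):
        super().__init__()
        self.main = nn.Sequential(                      # shared main stream
            nn.Conv1d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv1d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        # each branch upsamples deeper features via its own deconvolution
        self.cls_deconv = nn.ConvTranspose1d(channels, channels, 4, stride=2, padding=1)
        self.loc_deconv = nn.ConvTranspose1d(channels, channels, 4, stride=2, padding=1)
        self.cls_head = nn.Conv1d(channels, num_anchors * num_classes, 3, padding=1)
        self.loc_head = nn.Conv1d(channels, num_anchors * 2, 3, padding=1)  # center/width offsets

    def forward(self, feats):                           # feats: (B, C, T)
        mid = self.main[0:2](feats)                     # stride-2 feature map
        deep = self.main[2:4](mid)                      # stride-4 feature map
        cls_feat = torch.relu(self.cls_deconv(deep) + mid)  # private, semantics-enriched
        loc_feat = torch.relu(self.loc_deconv(deep) + mid)
        return self.cls_head(cls_feat), self.loc_head(loc_feat)

scores, offsets = DecoupledHead()(torch.randn(2, 256, 64))
print(scores.shape, offsets.shape)  # torch.Size([2, 105, 32]) torch.Size([2, 10, 32])
```

The point of the structure is that `cls_head` and `loc_head` never share the branch-specific features, so each subtask can specialize while still reading the same main stream.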
Dataset Culling: Towards Efficient Training Of Distillation-Based Domain Specific Models
Title | Dataset Culling: Towards Efficient Training Of Distillation-Based Domain Specific Models |
Authors | Kentaro Yoshioka, Edward Lee, Simon Wong, Mark Horowitz |
Abstract | Real-time CNN-based object detection models for applications like surveillance can achieve high accuracy but are computationally expensive. Recent works have shown 10 to 100x reduction in computation cost for inference by using domain-specific networks. However, prior works have focused on inference only. If the domain model requires frequent retraining, training costs can pose a significant bottleneck. To address this, we propose Dataset Culling: a pipeline to reduce the size of the dataset for training, based on the prediction difficulty. Images that are easy to classify are filtered out since they contribute little to improving the accuracy. The difficulty is measured using our proposed confidence loss metric with little computational overhead. Dataset Culling is extended to optimize the image resolution to further improve training and inference costs. We develop fixed-angle, long-duration video datasets across several domains, and we show that the dataset size can be culled by a factor of 300x to reduce the total training time by 47x with no accuracy loss or even with slight improvement. Codes are available: https://github.com/kentaroy47/DatasetCulling |
Tasks | Object Detection |
Published | 2019-02-01 |
URL | https://arxiv.org/abs/1902.00173v3 |
PDF | https://arxiv.org/pdf/1902.00173v3.pdf |
PWC | https://paperswithcode.com/paper/dataset-culling-towards-efficient-training-of |
Repo | https://github.com/kentaroy47/DatasetCulling |
Framework | pytorch |
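A hedged sketch of the pipeline's shape: score each image's prediction difficulty from its detection confidences, then keep only the hardest images for retraining. The scoring function below is a stand-in for the paper's confidence loss metric, and the uncertainty-band thresholds are invented for illustration.

```python
import numpy as np

def difficulty(confidences, lo=0.3, hi=0.7):
    """Score an image by its uncertain detections: boxes that are clearly
    accepted (near 1) or clearly rejected (near 0) contribute little."""
    c = np.asarray(confidences, dtype=float)
    if c.size == 0:
        return 0.0
    return float(np.sum((c > lo) & (c < hi)))

def cull(scores, keep=256):
    """Return indices of the `keep` hardest images for (re)training."""
    return np.argsort(scores)[::-1][:keep]

scores = np.array([difficulty(np.random.rand(np.random.randint(0, 20)))
                   for _ in range(5000)])
train_idx = cull(scores, keep=256)
print(len(train_idx), scores[train_idx[:5]])
```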
Reinforcement Learning for Market Making in a Multi-agent Dealer Market
Title | Reinforcement Learning for Market Making in a Multi-agent Dealer Market |
Authors | Sumitra Ganesh, Nelson Vadori, Mengda Xu, Hua Zheng, Prashant Reddy, Manuela Veloso |
Abstract | Market makers play an important role in providing liquidity to markets by continuously quoting prices at which they are willing to buy and sell, and by managing inventory risk. In this paper, we build a multi-agent simulation of a dealer market and demonstrate that it can be used to understand the behavior of a reinforcement learning (RL) based market maker agent. We use the simulator to train an RL-based market maker agent under different competitive scenarios, reward formulations and market price trends (drifts). We show that the reinforcement learning agent is able to learn about its competitor's pricing policy; it also learns to manage inventory by smartly selecting asymmetric prices on the buy and sell sides (skewing), and by maintaining a positive (or negative) inventory depending on whether the market price drift is positive (or negative). Finally, we propose and test reward formulations for creating risk-averse RL-based market maker agents. |
Tasks | |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.05892v1 |
PDF | https://arxiv.org/pdf/1911.05892v1.pdf |
PWC | https://paperswithcode.com/paper/reinforcement-learning-for-market-making-in-a |
Repo | https://github.com/denisewong1/ASX300 |
Framework | tf |
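The two learned behaviors the abstract highlights, inventory skewing and drift-dependent inventory, can be illustrated with a toy hand-coded policy (this is not the paper's RL agent or simulator; every number and the reward shape below are assumptions):

```python
import random

def quotes(mid, inventory, half_spread=0.05, skew=0.01):
    # long inventory -> shift both quotes down to attract buyers and shed risk
    bid = mid - half_spread - skew * inventory
    ask = mid + half_spread - skew * inventory
    return bid, ask

def reward(pnl_step, inventory, risk_aversion=0.1):
    # one risk-averse formulation: PnL capture minus an inventory penalty
    return pnl_step - risk_aversion * inventory ** 2

mid, inv, pnl = 100.0, 0, 0.0
for _ in range(1000):
    bid, ask = quotes(mid, inv)
    if random.random() < 0.5:        # a buyer lifts our ask
        inv -= 1; pnl += ask - mid
    else:                            # a seller hits our bid
        inv += 1; pnl += mid - bid
    mid += random.gauss(0.0, 0.02)   # market price drift/noise
print(round(pnl, 2), inv)
```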
LEDNet: A Lightweight Encoder-Decoder Network for Real-Time Semantic Segmentation
Title | LEDNet: A Lightweight Encoder-Decoder Network for Real-Time Semantic Segmentation |
Authors | Yu Wang, Quan Zhou, Jia Liu, Jian Xiong, Guangwei Gao, Xiaofu Wu, Longin Jan Latecki |
Abstract | LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation |
Tasks | Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02423v3 |
PDF | https://arxiv.org/pdf/1905.02423v3.pdf |
PWC | https://paperswithcode.com/paper/lednet-a-lightweight-encoder-decoder-network |
Repo | https://github.com/EEEGUI/LEDNet-pytorch |
Framework | pytorch |
Learning Hierarchical Discourse-level Structure for Fake News Detection
Title | Learning Hierarchical Discourse-level Structure for Fake News Detection |
Authors | Hamid Karimi, Jiliang Tang |
Abstract | On the one hand, nowadays, fake news articles are easily propagated through various online media platforms and have become a grand threat to the trustworthiness of information. On the other hand, our understanding of the language of fake news is still minimal. Incorporating the hierarchical discourse-level structure of fake and real news articles is one crucial step toward a better understanding of how these articles are structured. Nevertheless, this has rarely been investigated in the fake news detection domain and faces tremendous challenges. First, existing methods for capturing discourse-level structure rely on annotated corpora, which are not available for fake news datasets. Second, how to extract useful information from such discovered structures is another challenge. To address these challenges, we propose the Hierarchical Discourse-level Structure for Fake news detection (HDSF) framework. HDSF learns and constructs a discourse-level structure for fake/real news articles in an automated and data-driven manner. Moreover, we identify insightful structure-related properties, which can explain the discovered structures and boost our understanding of fake news. The conducted experiments show the effectiveness of the proposed approach. Further structural analysis suggests that real and fake news present substantial differences in their hierarchical discourse-level structures. |
Tasks | Fake News Detection |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1903.07389v6 |
PDF | http://arxiv.org/pdf/1903.07389v6.pdf |
PWC | https://paperswithcode.com/paper/learning-hierarchical-discourse-level |
Repo | https://github.com/hamidkarimi/DHSF |
Framework | pytorch |
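A hedged sketch of the data-driven structure idea: given learned parent-attachment probabilities between sentences (random placeholders below, where HDSF would supply model outputs), build a discourse dependency tree and read off a structure-related property such as tree depth. The greedy attach-to-an-earlier-node rule is a simplification for illustration, not the paper's decoding procedure.

```python
import numpy as np

def build_tree(parent_probs):
    """parent_probs[i, j] ~ P(sentence j is the parent of sentence i);
    sentence 0 is treated as the discourse root."""
    n = parent_probs.shape[0]
    parents = [-1]                                           # root has no parent
    for i in range(1, n):
        parents.append(int(np.argmax(parent_probs[i, :i])))  # attach to an earlier node
    return parents

def node_depth(parents, i):
    d = 0
    while parents[i] != -1:
        i, d = parents[i], d + 1
    return d

probs = np.random.rand(8, 8)     # placeholder for HDSF's learned attachment matrix
parents = build_tree(probs)
tree_depth = max(node_depth(parents, i) for i in range(len(parents)))
print(parents, tree_depth)
```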
Complex Transformer: A Framework for Modeling Complex-Valued Sequence
Title | Complex Transformer: A Framework for Modeling Complex-Valued Sequence |
Authors | Muqiao Yang, Martin Q. Ma, Dongyu Li, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov |
Abstract | While deep learning has received a surge of interest in a variety of fields in recent years, major deep learning models barely use complex numbers. However, speech, signal and audio data are naturally complex-valued after the Fourier transform, and studies have shown that complex-valued networks can learn potentially richer representations. In this paper, we propose the Complex Transformer, which incorporates the transformer model as a backbone for sequence modeling; we also develop attention and encoder-decoder modules that operate on complex-valued input. The model achieves state-of-the-art performance on the MusicNet dataset and an In-phase Quadrature (IQ) signal dataset. |
Tasks | |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.10202v1 |
PDF | https://arxiv.org/pdf/1910.10202v1.pdf |
PWC | https://paperswithcode.com/paper/complex-transformer-a-framework-for-modeling |
Repo | https://github.com/muqiaoy/dl_signal |
Framework | pytorch |
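The arithmetic such a model rests on can be shown compactly: complex matrix products expressed with real tensors, and one possible way to turn complex attention scores into real mixing weights. The paper's actual attention design may differ; this is only a sketch of the building block.

```python
import torch

def complex_matmul(a_re, a_im, b_re, b_im):
    # (A_re + i A_im)(B_re + i B_im) = (A_re B_re - A_im B_im) + i(A_re B_im + A_im B_re)
    return a_re @ b_re - a_im @ b_im, a_re @ b_im + a_im @ b_re

def complex_attention(q_re, q_im, k_re, k_im, v_re, v_im):
    s_re, s_im = complex_matmul(q_re, q_im,
                                k_re.transpose(-2, -1), k_im.transpose(-2, -1))
    # use the scores' modulus as real attention logits (one of several options)
    attn = torch.softmax(torch.sqrt(s_re**2 + s_im**2) / q_re.shape[-1]**0.5, dim=-1)
    return attn @ v_re, attn @ v_im

q = k = v = torch.randn(2, 16, 32)
out_re, out_im = complex_attention(q, q, k, k, v, v)
print(out_re.shape)  # torch.Size([2, 16, 32])
```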
Robust Chinese Word Segmentation with Contextualized Word Representations
Title | Robust Chinese Word Segmentation with Contextualized Word Representations |
Authors | Yung-Sung Chuang |
Abstract | In recent years, since neural-network-based methods were proposed, the accuracy of the Chinese word segmentation task has made great progress. However, when dealing with out-of-vocabulary words, there is still a large error rate. We used a simple bidirectional LSTM architecture and a large-scale pretrained language model to generate high-quality contextualized character representations, which successfully mitigated the widespread ambiguity of individual Chinese characters and hence effectively reduced the OOV error rate. State-of-the-art performance is achieved on many datasets. |
Tasks | Chinese Word Segmentation, Language Modelling |
Published | 2019-01-17 |
URL | http://arxiv.org/abs/1901.05816v1 |
PDF | http://arxiv.org/pdf/1901.05816v1.pdf |
PWC | https://paperswithcode.com/paper/robust-chinese-word-segmentation-with |
Repo | https://github.com/voidism/pywordseg |
Framework | pytorch |
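A minimal sketch of the described pipeline: contextualized character vectors (random stand-ins below for the pretrained language model's output) fed to a bidirectional LSTM that emits a segmentation tag per character. The four-tag B/M/E/S scheme is a common convention assumed here, not necessarily the paper's exact label set.

```python
import torch
import torch.nn as nn

class BiLSTMSegmenter(nn.Module):
    def __init__(self, char_dim=768, hidden=256, num_tags=4):  # B, M, E, S
        super().__init__()
        self.lstm = nn.LSTM(char_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_tags)

    def forward(self, char_vecs):          # (B, seq_len, char_dim)
        h, _ = self.lstm(char_vecs)
        return self.out(h)                 # (B, seq_len, 4) tag logits

lm_vectors = torch.randn(1, 10, 768)       # stand-in for LM character features
tags = BiLSTMSegmenter()(lm_vectors).argmax(-1)
print(tags.shape)                          # torch.Size([1, 10])
```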
Improving Adversarial Robustness via Promoting Ensemble Diversity
Title | Improving Adversarial Robustness via Promoting Ensemble Diversity |
Authors | Tianyu Pang, Kun Xu, Chao Du, Ning Chen, Jun Zhu |
Abstract | Though deep neural networks have achieved significant progress on various tasks, often enhanced by model ensemble, existing high-performance models can be vulnerable to adversarial attacks. Many efforts have been devoted to enhancing the robustness of individual networks and then constructing a straightforward ensemble, e.g., by directly averaging the outputs, which ignores the interaction among networks. This paper presents a new method that explores the interaction among individual networks to improve robustness for ensemble models. Technically, we define a new notion of ensemble diversity in the adversarial setting as the diversity among non-maximal predictions of individual members, and present an adaptive diversity promoting (ADP) regularizer to encourage the diversity, which leads to globally better robustness for the ensemble by making adversarial examples difficult to transfer among individual members. Our method is computationally efficient and compatible with the defense methods acting on individual networks. Empirical results on various datasets verify that our method can improve adversarial robustness while maintaining state-of-the-art accuracy on normal examples. |
Tasks | |
Published | 2019-01-25 |
URL | https://arxiv.org/abs/1901.08846v3 |
PDF | https://arxiv.org/pdf/1901.08846v3.pdf |
PWC | https://paperswithcode.com/paper/improving-adversarial-robustness-via |
Repo | https://github.com/P2333/Adaptive-Diversity-Promoting |
Framework | tf |
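A hedged re-implementation sketch of the regularizer the abstract describes: reward entropy of the averaged ensemble prediction plus diversity, measured as the (log) squared volume spanned by each member's normalized non-maximal predictions. The exact normalizations and coefficients follow one reading of the paper and may differ from the official TensorFlow code above.

```python
import torch

def adp_regularizer(probs, labels, alpha=2.0, beta=0.5, eps=1e-12):
    """probs: (K, B, L) member class probabilities; labels: (B,) true classes."""
    K, B, L = probs.shape
    ens = probs.mean(0)                                    # averaged ensemble prediction
    entropy = -(ens * (ens + eps).log()).sum(-1)           # (B,)
    mask = torch.ones(B, L, dtype=torch.bool)
    mask[torch.arange(B), labels] = False                  # drop the true-class entry
    nonmax = probs.transpose(0, 1)[mask.unsqueeze(1).expand(B, K, L)].reshape(B, K, L - 1)
    nonmax = nonmax / (nonmax.norm(dim=-1, keepdim=True) + eps)
    gram = nonmax @ nonmax.transpose(1, 2)                 # (B, K, K) Gram of unit vectors
    log_ed = torch.logdet(gram + 1e-6 * torch.eye(K))      # log squared spanned volume
    return -(alpha * entropy + beta * log_ed).mean()       # add this to the CE loss

probs = torch.softmax(torch.randn(3, 8, 10), dim=-1)       # K=3 members, B=8, L=10
print(adp_regularizer(probs, torch.randint(0, 10, (8,))).item())
```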
Self-Supervised Correspondence in Visuomotor Policy Learning
Title | Self-Supervised Correspondence in Visuomotor Policy Learning |
Authors | Peter Florence, Lucas Manuelli, Russ Tedrake |
Abstract | In this paper we explore using self-supervised correspondence for improving the generalization performance and sample efficiency of visuomotor policy learning. Prior work has primarily used approaches such as autoencoding, pose-based losses, and end-to-end policy optimization in order to train the visual portion of visuomotor policies. We instead propose an approach using self-supervised dense visual correspondence training, and show this enables visuomotor policy learning with surprisingly high generalization performance with modest amounts of data: using imitation learning, we demonstrate extensive hardware validation on challenging manipulation tasks with as few as 50 demonstrations. Our learned policies can generalize across classes of objects, react to deformable object configurations, and manipulate textureless symmetrical objects in a variety of backgrounds, all with closed-loop, real-time vision-based policies. Simulated imitation learning experiments suggest that correspondence training offers sample complexity and generalization benefits compared to autoencoding and end-to-end training. |
Tasks | Imitation Learning |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.06933v1 |
PDF | https://arxiv.org/pdf/1909.06933v1.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-correspondence-in-visuomotor |
Repo | https://github.com/peteflorence/visuomotor_correspondence |
Framework | pytorch |
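The self-supervised dense-correspondence pretraining can be sketched as a standard pixelwise contrastive loss, pulling descriptors of matching pixels together and pushing non-matches apart. The margin and squared-distance choices below are common defaults, assumed rather than taken from the paper.

```python
import torch

def correspondence_loss(desc_a, desc_b, matches_a, matches_b, non_matches_b, margin=0.5):
    """desc_*: (D, H, W) descriptor images; matches_*: (N, 2) pixel (row, col)."""
    da = desc_a[:, matches_a[:, 0], matches_a[:, 1]]       # (D, N) matched descriptors
    db = desc_b[:, matches_b[:, 0], matches_b[:, 1]]
    dn = desc_b[:, non_matches_b[:, 0], non_matches_b[:, 1]]
    match_loss = ((da - db) ** 2).sum(0).mean()            # pull matches together
    nonmatch_loss = torch.clamp(margin - (da - dn).norm(dim=0), min=0).pow(2).mean()
    return match_loss + nonmatch_loss

D, H, W, N = 3, 60, 80, 128
desc_a, desc_b = torch.randn(D, H, W), torch.randn(D, H, W)
pix = lambda: torch.stack([torch.randint(0, H, (N,)), torch.randint(0, W, (N,))], dim=1)
print(correspondence_loss(desc_a, desc_b, pix(), pix(), pix()).item())
```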
Image to Images Translation for Multi-Task Organ Segmentation and Bone Suppression in Chest X-Ray Radiography
Title | Image to Images Translation for Multi-Task Organ Segmentation and Bone Suppression in Chest X-Ray Radiography |
Authors | Mohammad Eslami, Solale Tabarestani, Shadi Albarqouni, Ehsan Adeli, Nassir Navab, Malek Adjouadi |
Abstract | Chest X-ray radiography is one of the earliest medical imaging technologies and remains one of the most widely used for the diagnosis, screening, and treatment follow-up of diseases related to the lungs and heart. The literature in this field reports many interesting studies dealing with the challenging tasks of bone suppression and organ segmentation, but performed separately, limiting any learning that comes with consolidating parameters that could optimize both processes. This study introduces, for the first time, a multitask deep learning model that simultaneously generates the bone-suppressed image and the organ-segmented image, enhancing the accuracy of both tasks, minimizing the number of model parameters, and optimizing processing time, all by exploiting the interplay between the network parameters to benefit the performance of both tasks. The architectural design of this model, which relies on a conditional generative adversarial network, shows how the well-established pix2pix (image-to-image) network is modified to fit the needs of multitasking and extended to the new image-to-images architecture. The source code of this multitask model is shared publicly on GitHub as a first attempt at providing a two-task pix2pix extension, a supervised/paired/aligned/registered image-to-images translation, which would be useful in many multitask applications. Dilated convolutions are also used to improve the results through a more effective receptive field. A comparison with state-of-the-art algorithms, an ablation study, and a demonstration video are provided to evaluate the efficacy and gauge the merits of the proposed approach. |
Tasks | Decision Making |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.10089v2 |
PDF | https://arxiv.org/pdf/1906.10089v2.pdf |
PWC | https://paperswithcode.com/paper/image-to-images-translation-for-multi-task |
Repo | https://github.com/mohaEs/image-to-images-translation |
Framework | tf |
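A sketch of the image-to-images extension as the abstract presents it: one shared encoder whose parameters serve both tasks, and two task-specific decoders for bone suppression and organ segmentation. Layer sizes are placeholders, not the paper's pix2pix configuration, and the adversarial discriminator is omitted.

```python
import torch
import torch.nn as nn

def conv(i, o): return nn.Sequential(nn.Conv2d(i, o, 4, 2, 1), nn.ReLU())
def deconv(i, o): return nn.Sequential(nn.ConvTranspose2d(i, o, 4, 2, 1), nn.ReLU())

class TwoHeadGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(conv(1, 64), conv(64, 128))   # shared encoder
        self.dec_suppress = nn.Sequential(deconv(128, 64),
                                          nn.ConvTranspose2d(64, 1, 4, 2, 1))
        self.dec_segment = nn.Sequential(deconv(128, 64),
                                         nn.ConvTranspose2d(64, 1, 4, 2, 1))

    def forward(self, x):                       # x: chest x-ray (B, 1, H, W)
        z = self.enc(x)                          # shared parameters serve both tasks
        return self.dec_suppress(z), self.dec_segment(z)

bone_free, lung_mask = TwoHeadGenerator()(torch.randn(1, 1, 256, 256))
print(bone_free.shape, lung_mask.shape)         # both (1, 1, 256, 256)
```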
PadChest: A large chest x-ray image dataset with multi-label annotated reports
Title | PadChest: A large chest x-ray image dataset with multi-label annotated reports |
Authors | Aurelia Bustos, Antonio Pertusa, Jose-Maria Salinas, Maria de la Iglesia-Vayá |
Abstract | We present a large-scale, high-resolution labeled chest x-ray dataset for the automated exploration of medical images along with their associated reports. This dataset includes more than 160,000 images obtained from 67,000 patients that were interpreted and reported by radiologists at Hospital San Juan (Spain) from 2009 to 2017, covering six different position views and additional information on image acquisition and patient demographics. The reports were labeled with 174 different radiographic findings, 19 differential diagnoses and 104 anatomic locations organized as a hierarchical taxonomy and mapped onto standard Unified Medical Language System (UMLS) terminology. Of these reports, 27% were manually annotated by trained physicians and the remaining set was labeled using a supervised method based on a recurrent neural network with attention mechanisms. The generated labels were then validated on an independent test set, achieving a 0.93 Micro-F1 score. To the best of our knowledge, this is one of the largest public chest x-ray databases suitable for training supervised models on radiographs, and the first to contain radiographic reports in Spanish. The PadChest dataset can be downloaded from http://bimcv.cipf.es/bimcv-projects/padchest/. |
Tasks | |
Published | 2019-01-22 |
URL | http://arxiv.org/abs/1901.07441v2 |
PDF | http://arxiv.org/pdf/1901.07441v2.pdf |
PWC | https://paperswithcode.com/paper/padchest-a-large-chest-x-ray-image-dataset |
Repo | https://github.com/auriml/Rx-thorax-automatic-captioning |
Framework | none |
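As a small, self-contained illustration of the validation metric quoted above, this computes a micro-averaged F1 by pooling all label decisions across the 174 findings before taking precision and recall (toy labels, not PadChest data):

```python
import numpy as np

def micro_f1(y_true, y_pred):
    # pool true/false positives and negatives over every (report, label) pair
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true = np.random.randint(0, 2, size=(100, 174))   # 174 findings per report
y_pred = np.where(np.random.rand(100, 174) < 0.9, y_true, 1 - y_true)
print(round(micro_f1(y_true, y_pred), 3))
```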
Mapped Convolutions
Title | Mapped Convolutions |
Authors | Marc Eder, True Price, Thanh Vu, Akash Bapat, Jan-Michael Frahm |
Abstract | We present a versatile formulation of the convolution operation that we term a “mapped convolution.” The standard convolution operation implicitly samples the pixel grid and computes a weighted sum. Our mapped convolution decouples these two components, freeing the operation from the confines of the image grid and allowing the kernel to process any type of structured data. As a test case, we demonstrate its use by applying it to dense inference on spherical data. We perform an in-depth study of existing spherical image convolution methods and propose an improved sampling method for equirectangular images. Then, we discuss the impact of data discretization when deriving a sampling function, highlighting drawbacks of the cube map representation for spherical data. Finally, we illustrate how mapped convolutions enable us to convolve directly on a mesh by projecting the spherical image onto a geodesic grid and training on the textured mesh. This method exceeds the state of the art for spherical depth estimation by nearly 17%. Our findings suggest that mapped convolutions can be instrumental in expanding the application scope of convolutional neural networks. |
Tasks | Depth Estimation |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.11096v1 |
PDF | https://arxiv.org/pdf/1906.11096v1.pdf |
PWC | https://paperswithcode.com/paper/mapped-convolutions |
Repo | https://github.com/meder411/MappedConvolutions |
Framework | pytorch |
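The decoupling the abstract describes can be shown in a few lines: the sampling pattern becomes an explicit index map, so the same weighted sum runs on a grid, a sphere, or a mesh alike. A real implementation would interpolate fractional sample locations; this integer-index NumPy version is only illustrative.

```python
import numpy as np

def mapped_conv(values, sample_map, weights):
    """values: (N,) data; sample_map: (M, K) indices into values giving the
    K samples for each output element; weights: (K,) shared kernel weights."""
    return values[sample_map] @ weights          # gather, then weighted sum

x = np.arange(10, dtype=float)

# a standard 1-D 3-tap convolution is just one particular sample map...
centers = np.arange(1, 9)
grid_map = np.stack([centers - 1, centers, centers + 1], axis=1)
print(mapped_conv(x, grid_map, np.array([0.25, 0.5, 0.25])))

# ...while any other neighborhood structure (e.g. mesh vertices) works too
mesh_map = np.array([[0, 3, 7], [2, 4, 9]])
print(mapped_conv(x, mesh_map, np.array([0.25, 0.5, 0.25])))
```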
HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-scale Point Clouds
Title | HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-scale Point Clouds |
Authors | Xiuye Gu, Yijie Wang, Chongruo Wu, Yong-Jae Lee, Panqu Wang |
Abstract | We present a novel deep neural network architecture for end-to-end scene flow estimation that directly operates on large-scale 3D point clouds. Inspired by Bilateral Convolutional Layers (BCL), we propose novel DownBCL, UpBCL, and CorrBCL operations that restore structural information from unstructured point clouds, and fuse information from two consecutive point clouds. Operating on discrete and sparse permutohedral lattice points, our architectural design is parsimonious in computational cost. Our model can efficiently process a pair of point cloud frames at once with a maximum of 86K points per frame. Our approach achieves state-of-the-art performance on the FlyingThings3D and KITTI Scene Flow 2015 datasets. Moreover, trained on synthetic data, our approach shows great generalization ability on real-world data and on different point densities without fine-tuning. |
Tasks | Scene Flow Estimation |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05332v1 |
PDF | https://arxiv.org/pdf/1906.05332v1.pdf |
PWC | https://paperswithcode.com/paper/hplflownet-hierarchical-permutohedral-lattice-1 |
Repo | https://github.com/laoreja/HPLFlowNet |
Framework | pytorch |
Kernel computations from large-scale random features obtained by Optical Processing Units
Title | Kernel computations from large-scale random features obtained by Optical Processing Units |
Authors | Ruben Ohana, Jonas Wacker, Jonathan Dong, Sébastien Marmin, Florent Krzakala, Maurizio Filippone, Laurent Daudet |
Abstract | Approximating kernel functions with random features (RFs) has been a successful application of random projections for nonparametric estimation. However, performing random projections presents computational challenges for large-scale problems. Recently, a new optical hardware called an Optical Processing Unit (OPU) has been developed for fast and energy-efficient computation of large-scale RFs in the analog domain. More specifically, the OPU performs the multiplication of input vectors by a large random matrix with complex-valued i.i.d. Gaussian entries, followed by an element-wise squared absolute value operation, a nonlinearity intrinsic to the sensing process. In this paper, we show that this operation results in a dot-product kernel that has connections to the polynomial kernel, and we extend this computation to arbitrary powers of the feature map. Experiments demonstrate that the OPU kernel and its RF approximation achieve competitive performance in applications using kernel ridge regression and transfer learning for image classification. Crucially, thanks to the use of the OPU, these results are obtained with time and energy savings. |
Tasks | Image Classification, Transfer Learning |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.09880v2 |
PDF | https://arxiv.org/pdf/1910.09880v2.pdf |
PWC | https://paperswithcode.com/paper/kernel-computations-from-large-scale-random |
Repo | https://github.com/joneswack/opu-kernel-experiments |
Framework | pytorch |
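The feature map is simple enough to simulate directly from the abstract's description: multiply by a complex Gaussian matrix and take the squared modulus. On the hardware the projection happens optically; NumPy stands in here so the induced dot-product kernel can be checked against the closed form one can derive for circular complex Gaussian projections.

```python
import numpy as np

rng = np.random.default_rng(0)
d, D = 20, 50_000                      # input dim, number of random features

# complex Gaussian projection matrix with unit-variance entries
W = (rng.standard_normal((D, d)) + 1j * rng.standard_normal((D, d))) / np.sqrt(2)

def opu_features(x):
    return np.abs(W @ x) ** 2 / np.sqrt(D)   # squared-modulus nonlinearity

x, y = rng.standard_normal(d), rng.standard_normal(d)
approx = opu_features(x) @ opu_features(y)   # Monte-Carlo kernel estimate
exact = (x @ y) ** 2 + (x @ x) * (y @ y)     # closed form for this feature map
print(approx, exact)
```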
Generative Image Translation for Data Augmentation in Colorectal Histopathology Images
Title | Generative Image Translation for Data Augmentation in Colorectal Histopathology Images |
Authors | Jerry Wei, Arief Suriawinata, Louis Vaickus, Bing Ren, Xiaoying Liu, Jason Wei, Saeed Hassanpour |
Abstract | We present an image translation approach to generate augmented data for mitigating data imbalances in a dataset of histopathology images of colorectal polyps, adenomatous tumors that can lead to colorectal cancer if left untreated. By applying cycle-consistent generative adversarial networks (CycleGANs) to a source domain of normal colonic mucosa images, we generate synthetic colorectal polyp images that belong to diagnostically less common polyp classes. Generated images maintain the general structure of their source image but exhibit adenomatous features that can be enhanced with our proposed filtration module, called Path-Rank-Filter. We evaluate the quality of generated images through Turing tests with four gastrointestinal pathologists, finding that at least two of the four pathologists could not identify generated images at a statistically significant level. Finally, we demonstrate that using CycleGAN-generated images to augment training data improves the AUC of a convolutional neural network for detecting sessile serrated adenomas by over 10%, suggesting that our approach might warrant further research for other histopathology image classification tasks. |
Tasks | Data Augmentation, Image Classification |
Published | 2019-10-13 |
URL | https://arxiv.org/abs/1910.05827v1 |
PDF | https://arxiv.org/pdf/1910.05827v1.pdf |
PWC | https://paperswithcode.com/paper/generative-image-translation-for-data-1 |
Repo | https://github.com/BMIRDS/HistoGAN |
Framework | pytorch |
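The Path-Rank-Filter module can be sketched from the abstract's description: rank generated images by how strongly a trained classifier scores them for the target (rare) class, and keep only the most adenomatous-looking ones. The classifier below is a random stand-in and the keep fraction is an assumption.

```python
import torch

def path_rank_filter(generated, classifier, target_class, keep_frac=0.5):
    """generated: (N, C, H, W) CycleGAN outputs; returns the kept subset."""
    with torch.no_grad():
        probs = torch.softmax(classifier(generated), dim=-1)[:, target_class]
    k = max(1, int(keep_frac * len(generated)))
    keep = probs.argsort(descending=True)[:k]   # most target-class-like first
    return generated[keep]

# hypothetical stand-in for a trained polyp classifier
classifier = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 2))
fake = torch.randn(100, 3, 32, 32)
augmented = path_rank_filter(fake, classifier, target_class=1)
print(augmented.shape)  # torch.Size([50, 3, 32, 32])
```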