February 1, 2020

Paper Group AWR 162

DiffEqFlux.jl - A Julia Library for Neural Differential Equations. Equalized odds postprocessing under imperfect group information. Pre-Training with Whole Word Masking for Chinese BERT. Pan-tilt-zoom SLAM for Sports Videos. DenseNet Models for Tiny ImageNet Classification. Unsupervised Embedding Learning via Invariant and Spreading Instance Featur …

DiffEqFlux.jl - A Julia Library for Neural Differential Equations

Title DiffEqFlux.jl - A Julia Library for Neural Differential Equations
Authors Chris Rackauckas, Mike Innes, Yingbo Ma, Jesse Bettencourt, Lyndon White, Vaibhav Dixit
Abstract DiffEqFlux.jl is a library for fusing neural networks and differential equations. In this work we describe differential equations from the viewpoint of data science and discuss the complementary nature between machine learning models and differential equations. We demonstrate the ability to incorporate DifferentialEquations.jl-defined differential equation problems into a Flux-defined neural network, and vice versa. The advantages of being able to use the entire DifferentialEquations.jl suite for this purpose are demonstrated by counterexamples where simple integration strategies fail, but the sophisticated integration strategies provided by the DifferentialEquations.jl library succeed. This is followed by a demonstration of delay differential equations and stochastic differential equations inside of neural networks. We show high-level functionality for defining neural ordinary differential equations (neural networks embedded into the differential equation) and describe the extra models in the Flux model zoo, which include neural stochastic differential equations. We conclude by discussing the various adjoint methods used for backpropagation of the differential equation solvers. DiffEqFlux.jl is an important contribution to the area, as it allows the full weight of the differential equation solvers developed from decades of research in the scientific computing field to be readily applied to the challenges posed by machine learning and data science.
Tasks
Published 2019-02-06
URL http://arxiv.org/abs/1902.02376v1
PDF http://arxiv.org/pdf/1902.02376v1.pdf
PWC https://paperswithcode.com/paper/diffeqfluxjl-a-julia-library-for-neural
Repo https://github.com/UnofficialJuliaMirrorSnapshots/DiffEqFlux.jl-aae7a2af-3d4f-5e19-a356-7da93b79d9d0
Framework none
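
As a rough illustration of the abstract's point that simple integration strategies can fail where more sophisticated ones succeed, the sketch below compares explicit and implicit Euler on a stiff linear test equation. It is written in Python rather than Julia and does not use DiffEqFlux.jl or DifferentialEquations.jl. The test problem u' = -lam * u, the value lam = 50, and the step size 0.1 are assumptions chosen for illustration; they are not the paper's counterexamples.

```python
def explicit_euler(lam: float, h: float, steps: int, u0: float = 1.0) -> float:
    """Integrate u' = -lam * u with the explicit (forward) Euler method."""
    u = u0
    for _ in range(steps):
        u = u + h * (-lam * u)  # u_{n+1} = (1 - h*lam) * u_n
    return u


def implicit_euler(lam: float, h: float, steps: int, u0: float = 1.0) -> float:
    """Integrate u' = -lam * u with the implicit (backward) Euler method."""
    u = u0
    for _ in range(steps):
        u = u / (1.0 + h * lam)  # solves u_{n+1} = u_n - h*lam*u_{n+1}
    return u


if __name__ == "__main__":
    # h*lam = 5 lies outside explicit Euler's stability bound (h*lam <= 2).
    lam, h, steps = 50.0, 0.1, 10
    print("explicit Euler:", explicit_euler(lam, h, steps))
    print("implicit Euler:", implicit_euler(lam, h, steps))
```

The exact solution decays toward zero. The explicit iterate grows in magnitude because |1 - h*lam| = 4 exceeds 1, while the implicit iterate decays for any positive step size.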

Equalized odds postprocessing under imperfect group information

Title Equalized odds postprocessing under imperfect group information
Authors Pranjal Awasthi, Matthäus Kleindessner, Jamie Morgenstern
Abstract Most approaches aiming to ensure a model’s fairness with respect to a protected attribute (such as gender or race) assume to know the true value of the attribute for every data point. In this paper, we ask to what extent fairness interventions can be effective even when only imperfect information about the protected attribute is available. In particular, we study the prominent equalized odds postprocessing method of Hardt et al. (2016) under a perturbation of the attribute. We identify conditions on the perturbation that guarantee that the bias of a classifier is reduced even by running equalized odds with the perturbed attribute. We also study the error of the resulting classifier. We empirically observe that under our identified conditions most often the error does not suffer from a perturbation of the protected attribute. For a special case, we formally prove this observation to be true.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.03284v3
PDF https://arxiv.org/pdf/1906.03284v3.pdf
PWC https://paperswithcode.com/paper/effectiveness-of-equalized-odds-for-fair
Repo https://github.com/matthklein/equalized_odds_with_perturbed_attribute
Framework none
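
The notion of bias in this abstract can be made concrete with a small measurement sketch. Equalized odds asks that the true positive rate and false positive rate match across groups, so one simple diagnostic is the absolute gap in each rate. The code below only measures those gaps; it is not the postprocessing method of Hardt et al. (2016), and the function names and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def tpr_fpr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) for binary labels."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    tpr = sum(pos) / len(pos) if pos else 0.0  # 0.0 when a group has no positives
    fpr = sum(neg) / len(neg) if neg else 0.0  # 0.0 when a group has no negatives
    return tpr, fpr


def equalized_odds_gap(
    y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[int]
) -> Tuple[float, float]:
    """Absolute TPR and FPR differences between group 0 and group 1."""
    rates = []
    for g in (0, 1):
        idx = [i for i, a in enumerate(group) if a == g]
        rates.append(tpr_fpr([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    (tpr0, fpr0), (tpr1, fpr1) = rates
    return abs(tpr0 - tpr1), abs(fpr0 - fpr1)


if __name__ == "__main__":
    y_true = [1, 1, 0, 0, 1, 1, 0, 0]
    y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
    group = [0, 0, 0, 0, 1, 1, 1, 1]  # hypothetical protected attribute
    print(equalized_odds_gap(y_true, y_pred, group))
```

Passing a noisy copy of `group` in place of the true attribute gives a quick way to see how much the measured gaps move under a perturbation of the attribute.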

Pre-Training with Whole Word Masking for Chinese BERT

Title Pre-Training with Whole Word Masking for Chinese BERT
Authors Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang, Shijin Wang, Guoping Hu
Abstract Bidirectional Encoder Representations from Transformers (BERT) has shown marvelous improvements across various NLP tasks. Recently, an upgraded version of BERT has been released with Whole Word Masking (WWM), which mitigates the drawbacks of masking partial WordPiece tokens in pre-training BERT. In this technical report, we adapt whole word masking to Chinese text, that is, masking the whole word instead of masking individual Chinese characters, which could bring another challenge in the Masked Language Model (MLM) pre-training task. The proposed models are verified on various NLP tasks, from sentence level to document level, including machine reading comprehension (CMRC 2018, DRCD, CJRC), natural language inference (XNLI), sentiment classification (ChnSentiCorp), sentence pair matching (LCQMC, BQ Corpus), and document classification (THUCNews). Experimental results on these datasets show that whole word masking could bring another significant gain. Moreover, we also examine the effectiveness of the Chinese pre-trained models: BERT, ERNIE, BERT-wwm, BERT-wwm-ext, RoBERTa-wwm-ext, and RoBERTa-wwm-ext-large. We release all the pre-trained models: https://github.com/ymcui/Chinese-BERT-wwm
Tasks Document Classification, Language Modelling, Machine Reading Comprehension, Named Entity Recognition, Natural Language Inference, Reading Comprehension, Sentiment Analysis
Published 2019-06-19
URL https://arxiv.org/abs/1906.08101v2
PDF https://arxiv.org/pdf/1906.08101v2.pdf
PWC https://paperswithcode.com/paper/pre-training-with-whole-word-masking-for
Repo https://github.com/ymcui/Chinese-BERT-wwm
Framework pytorch
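
The difference between character-level and whole-word masking can be sketched in a few lines. The example assumes the text has already been segmented into words (the segmenter is outside this sketch) and simplifies masking to a plain replacement with `[MASK]`; the function name, the masking probability, and the sample sentence are hypothetical and are not taken from the released code.

```python
import random
from typing import List, Sequence, Tuple

MASK = "[MASK]"


def whole_word_mask(
    words: Sequence[str], mask_prob: float = 0.15, seed: int = 0
) -> Tuple[List[str], List[bool]]:
    """Mask at the word level: every character of a chosen word is masked.

    `words` is a pre-segmented sentence; each word is a string of characters,
    and each character is treated as one token.
    """
    rng = random.Random(seed)
    tokens: List[str] = []
    flags: List[bool] = []
    for word in words:
        mask_word = rng.random() < mask_prob  # one decision per word, not per character
        tokens.extend(MASK if mask_word else ch for ch in word)
        flags.append(mask_word)
    return tokens, flags


if __name__ == "__main__":
    sentence = ["使用", "语言", "模型", "来", "预测"]  # pre-segmented words
    tokens, flags = whole_word_mask(sentence, mask_prob=0.5, seed=1)
    print(tokens)
    print(flags)
```

Because the masking decision is drawn once per word, a multi-character word is never left half-masked, which is the property that distinguishes this scheme from masking characters independently.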

Pan-tilt-zoom SLAM for Sports Videos

Title Pan-tilt-zoom SLAM for Sports Videos
Authors Jikai Lu, Jianhui Chen, James J. Little
Abstract We present an online SLAM system specifically designed to track pan-tilt-zoom (PTZ) cameras in highly dynamic sports such as basketball and soccer games. In these games, PTZ cameras rotate very fast and players cover large image areas. To overcome these challenges, we propose to use a novel camera model for tracking and to use rays as landmarks in mapping. Rays overcome the missing depth in pure-rotation cameras. We also develop an online pan-tilt forest for mapping and introduce moving objects (players) detection to mitigate negative impacts from foreground objects. We test our method on both synthetic and real datasets. The experimental results show the superior performance of our method over previous methods for online PTZ camera pose estimation.
Tasks Pose Estimation
Published 2019-07-20
URL https://arxiv.org/abs/1907.08816v1
PDF https://arxiv.org/pdf/1907.08816v1.pdf
PWC https://paperswithcode.com/paper/pan-tilt-zoom-slam-for-sports-videos
Repo https://github.com/lulufa390/Pan-tilt-zoom-SLAM
Framework none

DenseNet Models for Tiny ImageNet Classification

Title DenseNet Models for Tiny ImageNet Classification
Authors Zoheb Abai, Nishad Rajmalwar
Abstract In this paper, we present two image classification models on the Tiny ImageNet dataset. We built two very different networks from scratch based on the idea of Densely Connected Convolution Networks. The architecture of the networks is designed based on the image resolution of this specific dataset and by calculating the Receptive Field of the convolution layers. We also used some non-conventional techniques related to image augmentation and Cyclical Learning Rate to improve the accuracy of our models. The networks are trained under high constraints and low computation resources. We aimed to achieve top-1 validation accuracy of 60%; the results and error analysis are also presented.
Tasks Image Augmentation, Image Classification
Published 2019-04-23
URL http://arxiv.org/abs/1904.10429v1
PDF http://arxiv.org/pdf/1904.10429v1.pdf
PWC https://paperswithcode.com/paper/densenet-models-for-tiny-imagenet
Repo https://github.com/ZohebAbai/Tiny-ImageNet-Challenge
Framework none
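
The abstract mentions a Cyclical Learning Rate without saying which variant was used. As a minimal sketch, assuming the common triangular policy, the learning rate ramps linearly from a base value to a maximum and back over each cycle. The parameter names and the numeric values below are illustrative assumptions, not settings from the paper.

```python
import math


def triangular_lr(iteration: int, base_lr: float, max_lr: float, step_size: int) -> float:
    """Triangular cyclical learning rate.

    The rate rises from base_lr to max_lr over `step_size` iterations,
    then falls back to base_lr over the next `step_size` iterations.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))  # 1-based cycle index
    x = abs(iteration / step_size - 2 * cycle + 1)  # distance from the peak, in [0, 1]
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


if __name__ == "__main__":
    for it in (0, 50, 100, 150, 200):
        print(it, triangular_lr(it, base_lr=0.001, max_lr=0.01, step_size=100))
```

With these values the rate starts at the base value, peaks at iteration 100, and returns to the base value at iteration 200, after which the pattern repeats.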

Unsupervised Embedding Learning via Invariant and Spreading Instance Feature

Title Unsupervised Embedding Learning via Invariant and Spreading Instance Feature
Authors Mang Ye, Xu Zhang, Pong C. Yuen, Shih-Fu Chang
Abstract This paper studies the unsupervised embedding learning problem, which requires an effective similarity measurement between samples in low-dimensional embedding space. Motivated by the positive concentrated and negative separated properties observed from category-wise supervised learning, we propose to utilize the instance-wise supervision to approximate these properties, which aims at learning data augmentation invariant and instance spread-out features. To achieve this goal, we propose a novel instance based softmax embedding method, which directly optimizes the 'real' instance features on top of the softmax function. It achieves significantly faster learning speed and higher accuracy than all existing methods. The proposed method performs well for both seen and unseen testing categories with cosine similarity. It also achieves competitive performance even without pre-trained network over samples from fine-grained categories.
Tasks Data Augmentation
Published 2019-04-06
URL http://arxiv.org/abs/1904.03436v1
PDF http://arxiv.org/pdf/1904.03436v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-embedding-learning-via-invariant
Repo https://github.com/mangye16/Unsupervised_Embedding_Learning
Framework pytorch

Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation

Title Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation
Authors Yuzhou Liu, DeLiang Wang
Abstract We address talker-independent monaural speaker separation from the perspectives of deep learning and computational auditory scene analysis (CASA). Specifically, we decompose the multi-speaker separation task into the stages of simultaneous grouping and sequential grouping. Simultaneous grouping is first performed in each time frame by separating the spectra of different speakers with a permutation-invariantly trained neural network. In the second stage, the frame-level separated spectra are sequentially grouped to different speakers by a clustering network. The proposed deep CASA approach optimizes frame-level separation and speaker tracking in turn, and produces excellent results for both objectives. Experimental results on the benchmark WSJ0-2mix database show that the new approach achieves the state-of-the-art results with a modest model size.
Tasks Speaker Separation, Speech Separation
Published 2019-04-25
URL http://arxiv.org/abs/1904.11148v1
PDF http://arxiv.org/pdf/1904.11148v1.pdf
PWC https://paperswithcode.com/paper/divide-and-conquer-a-deep-casa-approach-to
Repo https://github.com/yuzhou-git/deep-casa
Framework tf

Benchmarking Classic and Learned Navigation in Complex 3D Environments

Title Benchmarking Classic and Learned Navigation in Complex 3D Environments
Authors Dmytro Mishkin, Alexey Dosovitskiy, Vladlen Koltun
Abstract Navigation research is attracting renewed interest with the advent of learning-based methods. However, this new line of work is largely disconnected from well-established classic navigation approaches. In this paper, we take a step towards coordinating these two directions of research. We set up classic and learning-based navigation systems in common simulated environments and thoroughly evaluate them in indoor spaces of varying complexity, with access to different sensory modalities. Additionally, we measure human performance in the same environments. We find that a classic pipeline, when properly tuned, can perform very well in complex cluttered environments. On the other hand, learned systems can operate more robustly with a limited sensor suite. Overall, both approaches are still far from human-level performance.
Tasks
Published 2019-01-30
URL http://arxiv.org/abs/1901.10915v2
PDF http://arxiv.org/pdf/1901.10915v2.pdf
PWC https://paperswithcode.com/paper/benchmarking-classic-and-learned-navigation
Repo https://github.com/ducha-aiki/navigation-benchmark
Framework pytorch

MAP Inference via L2-Sphere Linear Program Reformulation

Title MAP Inference via L2-Sphere Linear Program Reformulation
Authors Baoyuan Wu, Li Shen, Tong Zhang, Bernard Ghanem
Abstract Maximum a posteriori (MAP) inference is an important task for graphical models. Due to complex dependencies among variables in realistic models, finding an exact solution for MAP inference is often intractable. Thus, many approximation methods have been developed, among which the linear programming (LP) relaxation based methods show promising performance. However, one major drawback of LP relaxation is that it may give fractional solutions. Instead of presenting a tighter relaxation, in this work we propose a continuous but equivalent reformulation of the original MAP inference problem, called LS-LP. We add the L2-sphere constraint onto the original LP relaxation, leading to an intersected space with the local marginal polytope that is equivalent to the space of all valid integer label configurations. Thus, LS-LP is equivalent to the original MAP inference problem. We propose a perturbed alternating direction method of multipliers (ADMM) algorithm to optimize the LS-LP problem, by adding a sufficiently small perturbation epsilon onto the objective function and constraints. We prove that the perturbed ADMM algorithm globally converges to the epsilon-Karush-Kuhn-Tucker (epsilon-KKT) point of the LS-LP problem. The convergence rate will also be analyzed. Experiments on several benchmark datasets from Probabilistic Inference Challenge (PIC 2011) and OpenGM 2 show competitive performance of our proposed method against state-of-the-art MAP inference methods.
Tasks
Published 2019-05-09
URL https://arxiv.org/abs/1905.03433v3
PDF https://arxiv.org/pdf/1905.03433v3.pdf
PWC https://paperswithcode.com/paper/190503433
Repo https://github.com/wubaoyuan/Lpbox-ADMM
Framework none

Pyramid Feature Attention Network for Saliency detection

Title Pyramid Feature Attention Network for Saliency detection
Authors Ting Zhao, Xiangqian Wu
Abstract Saliency detection is one of the basic challenges in computer vision. How to extract effective features is a critical point for saliency detection. Recent methods mainly integrate multi-scale convolutional features indiscriminately. However, not all features are useful for saliency detection and some even cause interference. To solve this problem, we propose the Pyramid Feature Attention network to focus on effective high-level context features and low-level spatial structural features. First, we design a Context-aware Pyramid Feature Extraction (CPFE) module for multi-scale high-level feature maps to capture rich context features. Second, we adopt channel-wise attention (CA) after CPFE feature maps and spatial attention (SA) after low-level feature maps, then fuse outputs of CA & SA together. Finally, we propose an edge preservation loss to guide the network to learn more detailed information in boundary localization. Extensive evaluations on five benchmark datasets demonstrate that the proposed method outperforms the state-of-the-art approaches under different evaluation metrics.
Tasks Saliency Detection
Published 2019-03-01
URL http://arxiv.org/abs/1903.00179v2
PDF http://arxiv.org/pdf/1903.00179v2.pdf
PWC https://paperswithcode.com/paper/pyramid-feature-selective-network-for
Repo https://github.com/CaitinZhao/cvpr2019_Pyramid-Feature-Attention-Network-for-Saliency-detection
Framework tf

Open-domain Event Extraction and Embedding for Natural Gas Market Prediction

Title Open-domain Event Extraction and Embedding for Natural Gas Market Prediction
Authors Minh Triet Chau, Diego Esteves, Jens Lehmann
Abstract We propose an approach to predict the natural gas price several days ahead using historical price data and events extracted from news headlines. Most previous methods treat price as an extrapolatable time series; those that analyze the relation between prices and news either trim their price data to match a public news dataset, manually annotate headlines, or use off-the-shelf tools. In comparison to off-the-shelf tools, our event extraction method detects not only the occurrence of phenomena but also the changes in attribution and characteristics from public sources. Instead of using sentence embedding as a feature, we use every word of the extracted events, encoding and organizing them before feeding them to the learning models. Empirical results are favorable in terms of prediction performance, money saved and scalability.
Tasks Sentence Embedding, Time Series
Published 2019-12-08
URL https://arxiv.org/abs/1912.11334v1
PDF https://arxiv.org/pdf/1912.11334v1.pdf
PWC https://paperswithcode.com/paper/open-domain-event-extraction-and-embedding
Repo https://github.com/minhtriet/gas_market
Framework none

Variance Reduced Local SGD with Lower Communication Complexity

Title Variance Reduced Local SGD with Lower Communication Complexity
Authors Xianfeng Liang, Shuheng Shen, Jingchang Liu, Zhen Pan, Enhong Chen, Yifei Cheng
Abstract To accelerate the training of machine learning models, distributed stochastic gradient descent (SGD) and its variants have been widely adopted, which apply multiple workers in parallel to speed up training. Among them, Local SGD has gained much attention due to its lower communication cost. Nevertheless, when the data distribution on workers is non-identical, Local SGD requires $O(T^{\frac{3}{4}} N^{\frac{3}{4}})$ communications to maintain its \emph{linear iteration speedup} property, where $T$ is the total number of iterations and $N$ is the number of workers. In this paper, we propose Variance Reduced Local SGD (VRL-SGD) to further reduce the communication complexity. Benefiting from eliminating the dependency on the gradient variance among workers, we theoretically prove that VRL-SGD achieves a \emph{linear iteration speedup} with a lower communication complexity $O(T^{\frac{1}{2}} N^{\frac{3}{2}})$ even if workers access non-identical datasets. We conduct experiments on three machine learning tasks, and the experimental results demonstrate that VRL-SGD performs impressively better than Local SGD when the data among workers are quite diverse.
Tasks
Published 2019-12-30
URL https://arxiv.org/abs/1912.12844v1
PDF https://arxiv.org/pdf/1912.12844v1.pdf
PWC https://paperswithcode.com/paper/variance-reduced-local-sgd-with-lower-1
Repo https://github.com/zerolxf/VRL-SGD
Framework pytorch
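
The two communication bounds in this abstract can be compared numerically. Dividing the Local SGD bound by the VRL-SGD bound gives T^(1/4) / N^(3/4), so the VRL-SGD bound is the smaller one exactly when T > N^3. The sketch below evaluates both expressions with all hidden constants set to 1, which is an assumption; big-O bounds say nothing about the actual constants, so the numbers only illustrate the asymptotic trend.

```python
def local_sgd_comm(T: int, N: int) -> float:
    """Local SGD bound T^(3/4) * N^(3/4), with the hidden constant assumed to be 1."""
    return T ** 0.75 * N ** 0.75


def vrl_sgd_comm(T: int, N: int) -> float:
    """VRL-SGD bound T^(1/2) * N^(3/2), with the hidden constant assumed to be 1."""
    return T ** 0.5 * N ** 1.5


if __name__ == "__main__":
    N = 8  # number of workers; the crossover is at T = N**3 = 512
    for T in (100, 512, 10**4, 10**6):
        print(T, round(local_sgd_comm(T, N), 1), round(vrl_sgd_comm(T, N), 1))
```

For a fixed number of workers, the gap between the two bounds widens as the total number of iterations grows beyond N^3.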

From voxels to pixels and back: Self-supervision in natural-image reconstruction from fMRI

Title From voxels to pixels and back: Self-supervision in natural-image reconstruction from fMRI
Authors Roman Beliy, Guy Gaziv, Assaf Hoogi, Francesca Strappini, Tal Golan, Michal Irani
Abstract Reconstructing observed images from fMRI brain recordings is challenging. Unfortunately, acquiring sufficient “labeled” pairs of {Image, fMRI} (i.e., images with their corresponding fMRI responses) to span the huge space of natural images is prohibitive for many reasons. We present a novel approach which, in addition to the scarce labeled data (training pairs), also allows training fMRI-to-image reconstruction networks on “unlabeled” data (i.e., images without fMRI recording, and fMRI recording without images). The proposed model utilizes both an Encoder network (image-to-fMRI) and a Decoder network (fMRI-to-image). Concatenating these two networks back-to-back (Encoder-Decoder & Decoder-Encoder) allows augmenting the training with both types of unlabeled data. Importantly, it allows training on the unlabeled test-fMRI data. This self-supervision adapts the reconstruction network to the new input test-data, despite its deviation from the statistics of the scarce training data.
Tasks Image Reconstruction
Published 2019-07-03
URL https://arxiv.org/abs/1907.02431v1
PDF https://arxiv.org/pdf/1907.02431v1.pdf
PWC https://paperswithcode.com/paper/from-voxels-to-pixels-and-back-self
Repo https://github.com/WeizmannVision/ssfmri2im
Framework none

MAVNet: an Effective Semantic Segmentation Micro-Network for MAV-based Tasks

Title MAVNet: an Effective Semantic Segmentation Micro-Network for MAV-based Tasks
Authors Ty Nguyen, Shreyas S. Shivakumar, Ian D. Miller, James Keller, Elijah S. Lee, Alex Zhou, Tolga Ozaslan, Giuseppe Loianno, Joseph H. Harwood, Jennifer Wozencraft, Camillo J. Taylor, Vijay Kumar
Abstract Real-time semantic image segmentation on platforms subject to size, weight and power (SWaP) constraints is a key area of interest for air surveillance and inspection. In this work, we propose MAVNet: a small, lightweight, deep neural network for real-time semantic segmentation on Micro Aerial Vehicles (MAVs). MAVNet, inspired by ERFNet, features 400 times fewer parameters and achieves comparable performance with some reference models in empirical experiments. Our model achieves a trade-off between speed and accuracy, reaching up to 48 FPS on an NVIDIA 1080Ti and 9 FPS on the NVIDIA Jetson Xavier when processing high resolution imagery. Additionally, we provide two novel datasets that represent challenges in semantic segmentation for real-time MAV tracking and infrastructure inspection tasks and verify MAVNet on these datasets. Our algorithm and datasets are made publicly available.
Tasks Real-Time Semantic Segmentation, Semantic Segmentation, Visual Odometry
Published 2019-04-03
URL https://arxiv.org/abs/1904.01795v2
PDF https://arxiv.org/pdf/1904.01795v2.pdf
PWC https://paperswithcode.com/paper/mavnet-an-effective-semantic-segmentation
Repo https://github.com/tynguyen/MAVNet
Framework none

Neural Architectures for Nested NER through Linearization

Title Neural Architectures for Nested NER through Linearization
Authors Jana Straková, Milan Straka, Jan Hajič
Abstract We propose two neural network architectures for nested named entity recognition (NER), a setting in which named entities may overlap and also be labeled with more than one label. We encode the nested labels using a linearized scheme. In our first proposed approach, the nested labels are modeled as multilabels corresponding to the Cartesian product of the nested labels in a standard LSTM-CRF architecture. In the second one, the nested NER is viewed as a sequence-to-sequence problem, in which the input sequence consists of the tokens and output sequence of the labels, using hard attention on the word whose label is being predicted. The proposed methods outperform the nested NER state of the art on four corpora: ACE-2004, ACE-2005, GENIA and Czech CNEC. We also enrich our architectures with the recently published contextual embeddings: ELMo, BERT and Flair, reaching further improvements for the four nested entity corpora. In addition, we report flat NER state-of-the-art results for CoNLL-2002 Dutch and Spanish and for CoNLL-2003 English.
Tasks Named Entity Recognition, Nested Mention Recognition, Nested Named Entity Recognition
Published 2019-08-19
URL https://arxiv.org/abs/1908.06926v1
PDF https://arxiv.org/pdf/1908.06926v1.pdf
PWC https://paperswithcode.com/paper/neural-architectures-for-nested-ner-through-1
Repo https://github.com/ufal/acl2019_nested_ner
Framework none
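
The idea of linearizing nested labels into one multilabel per token can be sketched as follows. This is a simplified illustration, not the authors' encoding: it uses BIO tags, orders overlapping entities by earlier start and then by greater length, and joins the tags with "|". The function name, the ordering rule, the separator, and the example sentence are all assumptions made for the sketch.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def linearize_nested(num_tokens: int, spans: Sequence[Span]) -> List[str]:
    """Encode possibly nested entity spans as one multilabel string per token."""
    per_token: List[List[str]] = [[] for _ in range(num_tokens)]
    # Outer entities first: earlier start wins, then the longer span.
    ordered = sorted(spans, key=lambda s: (s[0], -(s[1] - s[0])))
    for start, end, label in ordered:
        for i in range(start, end):
            prefix = "B-" if i == start else "I-"
            per_token[i].append(prefix + label)
    # Tokens outside every entity get the plain "O" tag.
    return ["|".join(tags) if tags else "O" for tags in per_token]


if __name__ == "__main__":
    tokens = ["Bank", "of", "England", "governor"]
    spans = [(0, 3, "ORG"), (2, 3, "GPE")]  # "England" is nested inside the ORG span
    for token, tag in zip(tokens, linearize_nested(len(tokens), spans)):
        print(token, tag)
```

Each distinct multilabel string can then be treated as a single class by a standard sequence tagger, which is what lets a flat tagging architecture represent nested entities.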