February 2, 2020

3157 words 15 mins read

Paper Group AWR 37


Latent Channel Networks. DADAM: A Consensus-based Distributed Adaptive Gradient Method for Online Optimization. Output-Constrained Bayesian Neural Networks. Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation. End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation. Adversarial Robustness vs Model Compression, or Both? …

Latent Channel Networks

Title Latent Channel Networks
Authors Clifford Anderson-Bergman, Phan Nguyen, Jose Cadena Pico
Abstract Latent Euclidean embedding models a given network by representing each node in a Euclidean space, where the probability of two nodes sharing an edge is a function of the distance between them. This implies that for two nodes to share an edge with high probability, they must be relatively close in all dimensions. This constraint may be overly restrictive for describing modern networks, in which similarity in at least one area may be sufficient for a high edge probability. We introduce a new model, which we call Latent Channel Networks, that allows for such features of a network. We present an EM algorithm for fitting the model, whose computational complexity is linear in the number of edges and the number of channels, and apply the algorithm to both synthetic and classic network datasets.
Tasks
Published 2019-06-10
URL https://arxiv.org/abs/1906.04563v2
PDF https://arxiv.org/pdf/1906.04563v2.pdf
PWC https://paperswithcode.com/paper/latent-channel-networks
Repo https://github.com/pistacliffcho/LatentChannelNetworks
Framework none
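
As a rough illustration of the model described above, the sketch below computes an edge probability under the "connect on at least one channel" reading of the abstract: each node carries per-channel affinities, and an edge forms unless the pair fails to connect on every channel. The bilinear per-channel link function is an assumption for illustration, not necessarily the paper's exact parametrization.

```python
# Hypothetical sketch of the Latent Channel Network edge model: node i carries
# per-channel affinities theta[i, k] in [0, 1], and an edge forms if the pair
# "connects" on at least one channel, so P(edge) = 1 - prod_k (1 - p_k).
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_channels = 100, 5
theta = rng.uniform(0.0, 1.0, size=(n_nodes, n_channels))  # latent channel affinities

def edge_probability(i, j, theta):
    p_channel = theta[i] * theta[j]        # both endpoints active on the channel
    return 1.0 - np.prod(1.0 - p_channel)  # "at least one channel" connects

print(edge_probability(0, 1, theta))
```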

DADAM: A Consensus-based Distributed Adaptive Gradient Method for Online Optimization

Title DADAM: A Consensus-based Distributed Adaptive Gradient Method for Online Optimization
Authors Parvin Nazari, Davoud Ataee Tarzanagh, George Michailidis
Abstract Adaptive gradient-based optimization methods such as \textsc{Adagrad}, \textsc{Rmsprop}, and \textsc{Adam} are widely used in solving large-scale machine learning problems, including deep learning. A number of schemes have been proposed in the literature to parallelize them, based on communication between peripheral nodes and a central node, but these incur high communication costs. To address this issue, we develop a novel consensus-based distributed adaptive moment estimation method (\textsc{Dadam}) for online optimization over a decentralized network that enables data parallelization as well as decentralized computation. The method is particularly useful, since it can accommodate settings where access to local data is allowed. Further, as established theoretically in this work, it can outperform centralized adaptive algorithms for certain classes of loss functions used in applications. We analyze the convergence properties of the proposed algorithm and provide a dynamic regret bound on the convergence rate of adaptive moment estimation methods in both stochastic and deterministic settings. Empirical results demonstrate that \textsc{Dadam} also works well in practice and compares favorably to competing online optimization methods.
Tasks Stochastic Optimization
Published 2019-01-25
URL https://arxiv.org/abs/1901.09109v6
PDF https://arxiv.org/pdf/1901.09109v6.pdf
PWC https://paperswithcode.com/paper/dadam-a-consensus-based-distributed-adaptive
Repo https://github.com/Tarzanagh/DADAM
Framework none
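
A minimal sketch of a consensus-based adaptive step in the spirit of DADAM: each node mixes its parameters with its neighbors' through a doubly stochastic matrix W, then takes a local Adam-style update on its own gradient. The update order and constants are illustrative assumptions, not the paper's pseudocode.

```python
# Consensus step (W @ x) followed by a local adaptive-moment update per node.
import numpy as np

def dadam_step(x, m, v, grads, W, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """x, m, v, grads: (n_nodes, dim) arrays; W: (n_nodes, n_nodes) mixing matrix."""
    x = W @ x                              # consensus: average with neighbors
    m = b1 * m + (1 - b1) * grads          # first-moment estimate
    v = b2 * v + (1 - b2) * grads ** 2     # second-moment estimate
    return x - lr * m / (np.sqrt(v) + eps), m, v
```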

Output-Constrained Bayesian Neural Networks

Title Output-Constrained Bayesian Neural Networks
Authors Wanqian Yang, Lars Lorch, Moritz A. Graule, Srivatsan Srinivasan, Anirudh Suresh, Jiayu Yao, Melanie F. Pradier, Finale Doshi-Velez
Abstract Bayesian neural network (BNN) priors are defined in parameter space, making it hard to encode prior knowledge expressed in function space. We formulate a prior that incorporates functional constraints about what the output can or cannot be in regions of the input space. Output-Constrained BNNs (OC-BNN) represent an interpretable approach of enforcing a range of constraints, fully consistent with the Bayesian framework and amenable to black-box inference. We demonstrate how OC-BNNs improve model robustness and prevent the prediction of infeasible outputs in two real-world applications of healthcare and robotics.
Tasks
Published 2019-05-15
URL https://arxiv.org/abs/1905.06287v1
PDF https://arxiv.org/pdf/1905.06287v1.pdf
PWC https://paperswithcode.com/paper/output-constrained-bayesian-neural-networks
Repo https://github.com/dtak/ocbnn-public
Framework pytorch
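
One hedged way to picture an output-constrained prior: augment the usual parameter-space log-prior with a soft penalty on constraint violations, evaluated at inputs drawn from the constrained region. The penalty form, the gamma weight, and all names below are illustrative assumptions, not the paper's exact formulation.

```python
# Soft functional constraint added to a parameter-space log-prior (sketch).
import torch

def constrained_log_prior(param_log_prior, net, x_constraint, lower, upper, gamma=100.0):
    y = net(x_constraint)                                   # outputs where constraints apply
    violation = torch.relu(lower - y) + torch.relu(y - upper)
    return param_log_prior - gamma * violation.sum()        # penalize infeasible outputs

net = torch.nn.Sequential(torch.nn.Linear(1, 20), torch.nn.Tanh(), torch.nn.Linear(20, 1))
x_c = torch.linspace(0.0, 1.0, 50).unsqueeze(-1)            # hypothetical constrained region
print(constrained_log_prior(torch.tensor(0.0), net, x_c, lower=0.0, upper=1.0))
```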

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation

Title Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation
Authors Yi Luo, Zhuo Chen, Takuya Yoshioka
Abstract Recent studies in deep learning-based speech separation have proven the superiority of time-domain approaches to conventional time-frequency-based methods. Unlike the time-frequency domain approaches, the time-domain separation systems often receive input sequences consisting of a huge number of time steps, which introduces challenges for modeling extremely long sequences. Conventional recurrent neural networks (RNNs) are not effective for modeling such long sequences due to optimization difficulties, while one-dimensional convolutional neural networks (1-D CNNs) cannot perform utterance-level sequence modeling when their receptive fields are smaller than the sequence length. In this paper, we propose the dual-path recurrent neural network (DPRNN), a simple yet effective method for organizing RNN layers in a deep structure to model extremely long sequences. DPRNN splits the long sequential input into smaller chunks and applies intra- and inter-chunk operations iteratively, where the input length can be made proportional to the square root of the original sequence length in each operation. Experiments show that by replacing the 1-D CNN with DPRNN and applying sample-level modeling in the time-domain audio separation network (TasNet), a new state-of-the-art performance on WSJ0-2mix is achieved with a model 20 times smaller than the previous best system.
Tasks Speech Separation
Published 2019-10-14
URL https://arxiv.org/abs/1910.06379v2
PDF https://arxiv.org/pdf/1910.06379v2.pdf
PWC https://paperswithcode.com/paper/dual-path-rnn-efficient-long-sequence
Repo https://github.com/sp-uhh/dual-path-rnn
Framework tf
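
A minimal sketch of one dual-path block, following the shapes the abstract describes: the input is pre-segmented into chunks of length K (so each RNN pass sees on the order of sqrt(L) steps), an intra-chunk bidirectional LSTM runs within chunks, and an inter-chunk one runs across them. Normalization and the overlap-add segmentation itself are omitted; the specific layer choices are assumptions.

```python
# One dual-path block: alternate within-chunk and across-chunk RNNs (sketch).
import torch
import torch.nn as nn

class DualPathBlock(nn.Module):
    def __init__(self, dim, hidden):
        super().__init__()
        self.intra = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
        self.inter = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
        self.proj_intra = nn.Linear(2 * hidden, dim)
        self.proj_inter = nn.Linear(2 * hidden, dim)

    def forward(self, x):                                    # x: (batch, n_chunks, K, dim)
        b, s, k, d = x.shape
        intra, _ = self.intra(x.reshape(b * s, k, d))        # RNN over time within chunks
        x = x + self.proj_intra(intra).reshape(b, s, k, d)
        inter_in = x.transpose(1, 2).reshape(b * k, s, d)    # RNN across chunks
        inter, _ = self.inter(inter_in)
        x = x + self.proj_inter(inter).reshape(b, k, s, d).transpose(1, 2)
        return x

print(DualPathBlock(64, 128)(torch.randn(2, 10, 100, 64)).shape)
```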

End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation

Title End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation
Authors Yi Luo, Zhuo Chen, Nima Mesgarani, Takuya Yoshioka
Abstract An important problem in ad-hoc microphone speech separation is how to guarantee the robustness of a system with respect to the locations and numbers of microphones. The former requires the system to be invariant to different indexing of the microphones with the same locations, while the latter requires the system to be able to process inputs with varying dimensions. Conventional optimization-based beamforming techniques satisfy these requirements by definition, while for deep learning-based end-to-end systems those constraints are not fully addressed. In this paper, we propose transform-average-concatenate (TAC), a simple design paradigm for channel permutation and number invariant multi-channel speech separation. Based on the filter-and-sum network (FaSNet), a recently proposed end-to-end time-domain beamforming system, we show how TAC significantly improves the separation performance across various numbers of microphones in noisy reverberant separation tasks with ad-hoc arrays. Moreover, we show that TAC also significantly improves the separation performance with fixed geometry array configuration, further proving the effectiveness of the proposed paradigm in the general problem of multi-microphone speech separation.
Tasks Speech Separation
Published 2019-10-30
URL https://arxiv.org/abs/1910.14104v3
PDF https://arxiv.org/pdf/1910.14104v3.pdf
PWC https://paperswithcode.com/paper/end-to-end-microphone-permutation-and-number
Repo https://github.com/yluo42/TAC
Framework pytorch
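
A hedged sketch of the transform-average-concatenate op: each channel is transformed independently, the transformed features are mean-pooled across channels (which is what makes the op invariant to microphone permutation and count), and the pooled vector is concatenated back onto every channel. Layer sizes and activations here are illustrative, not the paper's exact configuration.

```python
# Transform -> order-invariant average -> concatenate back per channel (sketch).
import torch
import torch.nn as nn

class TAC(nn.Module):
    def __init__(self, dim, hidden):
        super().__init__()
        self.transform = nn.Sequential(nn.Linear(dim, hidden), nn.PReLU())
        self.average = nn.Sequential(nn.Linear(hidden, hidden), nn.PReLU())
        self.concat = nn.Sequential(nn.Linear(2 * hidden, dim), nn.PReLU())

    def forward(self, x):                                  # x: (batch, n_mics, dim)
        h = self.transform(x)                              # per-channel transform
        mean = self.average(h.mean(dim=1, keepdim=True))   # permutation/count invariant
        mean = mean.expand(-1, h.size(1), -1)
        return x + self.concat(torch.cat([h, mean], dim=-1))  # residual output

print(TAC(64, 128)(torch.randn(2, 6, 64)).shape)           # works for any mic count
```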

Adversarial Robustness vs Model Compression, or Both?

Title Adversarial Robustness vs Model Compression, or Both?
Authors Shaokai Ye, Kaidi Xu, Sijia Liu, Hao Cheng, Jan-Henrik Lambrechts, Huan Zhang, Aojun Zhou, Kaisheng Ma, Yanzhi Wang, Xue Lin
Abstract It is well known that deep neural networks (DNNs) are vulnerable to adversarial attacks, which are implemented by adding crafted perturbations onto benign examples. Min-max robust optimization based adversarial training can provide a notion of security against adversarial attacks. However, adversarial robustness requires a significantly larger capacity of the network than that for natural training with only benign examples. This paper proposes a framework of concurrent adversarial training and weight pruning that enables model compression while still preserving adversarial robustness, and essentially tackles the dilemma of adversarial training. Furthermore, this work studies two hypotheses about weight pruning in the conventional setting and finds that weight pruning is essential for reducing the network model size in the adversarial setting; training a small model from scratch, even with inherited initialization from the large model, cannot achieve both adversarial robustness and high standard accuracy. Code is available at https://github.com/yeshaokai/Robustness-Aware-Pruning-ADMM.
Tasks Model Compression, Network Pruning
Published 2019-03-29
URL https://arxiv.org/abs/1903.12561v4
PDF https://arxiv.org/pdf/1903.12561v4.pdf
PWC https://paperswithcode.com/paper/second-rethinking-of-network-pruning-in-the
Repo https://github.com/yeshaokai/Robustness-Aware-Pruning-ADMM
Framework pytorch
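
A hedged sketch of the two ingredients being combined: craft PGD adversarial examples for the training step, then periodically enforce sparsity on the weights. The paper prunes via an ADMM formulation; plain magnitude pruning is substituted here for brevity.

```python
# Standard PGD attack plus a simple magnitude-pruning pass (illustrative only).
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=7):
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()                    # ascend the loss
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)  # project to eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)                          # stay a valid image
    return x_adv.detach()

def prune_by_magnitude(model, sparsity=0.8):
    for p in model.parameters():
        if p.dim() > 1:                                        # weights only, not biases
            k = max(1, int(sparsity * p.numel()))
            thresh = p.abs().flatten().kthvalue(k).values
            p.data.mul_((p.abs() > thresh).float())            # zero the smallest weights
```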

An embarrassingly simple approach to neural multiple instance classification

Title An embarrassingly simple approach to neural multiple instance classification
Authors Amina Asif, Fayyaz ul Amir Afsar Minhas
Abstract Multiple Instance Learning (MIL) is a weak supervision learning paradigm that allows modeling of machine learning problems in which labels are available only for groups of examples called bags. A positive bag may contain one or more positive examples, but it is not known which examples in the bag are positive. All examples in a negative bag belong to the negative class. Such problems arise frequently in the fields of computer vision, medical image processing, and bioinformatics. Many neural network based solutions have been proposed in the literature for MIL; however, almost all of them rely on introducing specialized blocks and connectivity into the architectures. In this paper, we present a novel and effective approach to Multiple Instance Learning in neural networks. Instead of making changes to the architectures, we propose a simple bag-level ranking loss function that allows Multiple Instance Classification in any neural architecture. We demonstrate the effectiveness of our proposed method on popular MIL benchmark datasets. In addition, we test the performance of our method in convolutional neural networks used to model an MIL problem derived from the well-known MNIST dataset. Results show that despite being simpler, our proposed scheme is comparable to or better than existing methods in the literature in practical scenarios. Python code files for all the experiments can be found at https://github.com/amina01/ESMIL.
Tasks Multiple Instance Learning
Published 2019-05-06
URL https://arxiv.org/abs/1905.01947v1
PDF https://arxiv.org/pdf/1905.01947v1.pdf
PWC https://paperswithcode.com/paper/an-embarrassingly-simple-approach-to-neural
Repo https://github.com/amina01/ESMIL
Framework pytorch
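
A sketch of what a bag-level ranking loss in the spirit of the abstract could look like: score every instance with an unmodified network, max-pool to a bag score, and push positive-bag scores above negative-bag scores with a hinge. The paper's exact loss may differ in form.

```python
# Hinge-style ranking between max-pooled bag scores (illustrative only).
import torch

def bag_ranking_loss(model, pos_bag, neg_bag, margin=1.0):
    # pos_bag, neg_bag: (n_instances, feature_dim) tensors, one bag each
    pos_score = model(pos_bag).max()   # a positive bag has >= 1 positive instance
    neg_score = model(neg_bag).max()   # every instance in a negative bag is negative
    return torch.relu(margin - (pos_score - neg_score))
```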

Induction of Non-Monotonic Rules From Statistical Learning Models Using High-Utility Itemset Mining

Title Induction of Non-Monotonic Rules From Statistical Learning Models Using High-Utility Itemset Mining
Authors Farhad Shakerin, Gopal Gupta
Abstract We present a fast and scalable algorithm to induce non-monotonic logic programs from statistical learning models. We reduce the problem of search for best clauses to instances of the High-Utility Itemset Mining (HUIM) problem. In the HUIM problem, feature values and their importance are treated as transactions and utilities respectively. We make use of TreeExplainer, a fast and scalable implementation of the Explainable AI tool SHAP, to extract locally important features and their weights from ensemble tree models. Our experiments with UCI standard benchmarks suggest a significant improvement in terms of classification evaluation metrics and running time of the training algorithm compared to ALEPH, a state-of-the-art Inductive Logic Programming (ILP) system.
Tasks
Published 2019-05-24
URL https://arxiv.org/abs/1905.11226v2
PDF https://arxiv.org/pdf/1905.11226v2.pdf
PWC https://paperswithcode.com/paper/induction-of-non-monotonic-rules-from
Repo https://github.com/fxs130430/SHAP_FOLD
Framework none
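
The abstract's reduction can be pictured as a data-preparation step: TreeExplainer yields per-example SHAP values, feature values become items, and |SHAP| magnitudes become utilities, giving one transaction per example for a high-utility itemset miner. The sketch below uses the real shap.TreeExplainer API on a synthetic dataset; the HUIM miner and the rule-induction step are out of scope.

```python
# SHAP values -> (item, utility) transactions for HUIM (sketch).
import shap
import xgboost
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=8, random_state=0)
model = xgboost.XGBClassifier(n_estimators=50).fit(X, y)
shap_values = shap.TreeExplainer(model).shap_values(X)

transactions = [
    [(f"f{j}={X[i, j]:.2f}", abs(shap_values[i, j]))   # item = feature value,
     for j in range(X.shape[1])                        # utility = |SHAP| weight
     if abs(shap_values[i, j]) > 1e-6]
    for i in range(X.shape[0])
]
print(transactions[0])
```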

Iris Recognition with Image Segmentation Employing Retrained Off-the-Shelf Deep Neural Networks

Title Iris Recognition with Image Segmentation Employing Retrained Off-the-Shelf Deep Neural Networks
Authors Daniel Kerrigan, Mateusz Trokielewicz, Adam Czajka, Kevin Bowyer
Abstract This paper offers three new, open-source, deep learning-based iris segmentation methods, and a methodology for using irregular segmentation masks in conventional Gabor-wavelet-based iris recognition. To train and validate the methods, we used a wide spectrum of iris images acquired by different teams and different sensors and offered publicly, including data taken from CASIA-Iris-Interval-v4, BioSec, ND-Iris-0405, UBIRIS, Warsaw-BioBase-Post-Mortem-Iris v2.0 (post-mortem iris images), and ND-TWINS-2009-2010 (iris images acquired from identical twins). This varied training data should increase the generalization capabilities of the proposed segmentation techniques. In database-disjoint training and testing, we show that deep learning-based segmentation outperforms the conventional (OSIRIS) segmentation in terms of Intersection over Union calculated between the obtained results and manually annotated ground truth. Interestingly, Gabor-based iris matching is not always better when deep learning-based segmentation is used, and is on par with the method employing Daugman-style segmentation.
Tasks Iris Recognition, Iris Segmentation, Semantic Segmentation
Published 2019-01-04
URL http://arxiv.org/abs/1901.01028v1
PDF http://arxiv.org/pdf/1901.01028v1.pdf
PWC https://paperswithcode.com/paper/iris-recognition-with-image-segmentation
Repo https://github.com/CVRL/iris-recognition-OTS-DNN
Framework none
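
For reference, the evaluation metric named in the abstract, Intersection over Union between a predicted iris mask and the manually annotated ground truth, is just:

```python
# IoU between two binary masks.
import numpy as np

def iou(pred_mask, gt_mask):
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0
```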

A Resource-Efficient Embedded Iris Recognition System Using Fully Convolutional Networks

Title A Resource-Efficient Embedded Iris Recognition System Using Fully Convolutional Networks
Authors Hokchhay Tann, Heng Zhao, Sherief Reda
Abstract Applications of Fully Convolutional Networks (FCN) in iris segmentation have shown promising advances. For mobile and embedded systems, a significant challenge is that the proposed FCN architectures are extremely computationally demanding. In this article, we propose a resource-efficient, end-to-end iris recognition flow, which consists of FCN-based segmentation, contour fitting, followed by Daugman normalization and encoding. To attain accurate and efficient FCN models, we propose a three-step SW/HW co-design methodology consisting of FCN architectural exploration, precision quantization, and hardware acceleration. In our exploration, we propose multiple FCN models, and in comparison to previous works, our best-performing model requires 50X fewer FLOPs per inference while achieving a new state-of-the-art segmentation accuracy. Next, we select the most efficient set of models and further reduce their computational complexity through weight and activation quantization using an 8-bit dynamic fixed-point (DFP) format. Each model is then incorporated into an end-to-end flow for true recognition performance evaluation. A few of our end-to-end pipelines outperform the previous state-of-the-art on the two datasets evaluated. Finally, we propose a novel DFP accelerator and fully demonstrate the SW/HW co-design realization of our flow on an embedded FPGA platform. In comparison with the embedded CPU, our hardware acceleration achieves up to 8.3X speedup for the overall pipeline while using less than 15% of the available FPGA resources. We also provide comparisons between the FPGA system and an embedded GPU, showing different benefits and drawbacks for the two platforms.
Tasks Iris Recognition, Iris Segmentation, Quantization
Published 2019-09-08
URL https://arxiv.org/abs/1909.03385v1
PDF https://arxiv.org/pdf/1909.03385v1.pdf
PWC https://paperswithcode.com/paper/a-resource-efficient-embedded-iris
Repo https://github.com/scale-lab/FCNiris
Framework none
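
A hedged sketch of 8-bit dynamic fixed-point (DFP) quantization as the abstract describes it: per tensor, choose the fractional length that covers the observed dynamic range, then round to 8-bit integers. The paper's exact rounding and saturation policy is not reproduced here.

```python
# Per-tensor dynamic fixed-point quantization to int8 (sketch).
import numpy as np

def quantize_dfp(x, bits=8):
    int_len = max(0, int(np.ceil(np.log2(np.abs(x).max() + 1e-12))) + 1)  # incl. sign
    frac_len = bits - int_len                  # the "dynamic" fractional length
    scale = 2.0 ** frac_len
    q = np.clip(np.round(x * scale), -2 ** (bits - 1), 2 ** (bits - 1) - 1)
    return q.astype(np.int8), frac_len

def dequantize_dfp(q, frac_len):
    return q.astype(np.float32) / 2.0 ** frac_len

q, fl = quantize_dfp(np.random.randn(4, 4).astype(np.float32))
print(dequantize_dfp(q, fl))
```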

Probabilistic Discriminative Learning with Layered Graphical Models

Title Probabilistic Discriminative Learning with Layered Graphical Models
Authors Yuesong Shen, Tao Wu, Csaba Domokos, Daniel Cremers
Abstract Probabilistic graphical models are traditionally known for their successes in generative modeling. In this work, we advocate layered graphical models (LGMs) for probabilistic discriminative learning. To this end, we design LGMs in close analogy to neural networks (NNs), that is, they have deep hierarchical structures and convolutional or local connections between layers. Equipped with tensorized truncated variational inference, our LGMs can be efficiently trained via backpropagation on mainstream deep learning frameworks such as PyTorch. To deal with continuous valued inputs, we use a simple yet effective soft-clamping strategy for efficient inference. Through extensive experiments on image classification over MNIST and FashionMNIST datasets, we demonstrate that LGMs are capable of achieving competitive results comparable to NNs of similar architectures, while preserving transparent probabilistic modeling.
Tasks Image Classification
Published 2019-01-31
URL http://arxiv.org/abs/1902.00057v1
PDF http://arxiv.org/pdf/1902.00057v1.pdf
PWC https://paperswithcode.com/paper/probabilistic-discriminative-learning-with
Repo https://github.com/tum-vision/lgm
Framework pytorch
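
A loose guess at the "soft-clamping" idea for continuous-valued inputs: rather than hard-fixing a binary visible unit to 0 or 1, bias its unary potential in proportion to the observed intensity and let inference resolve the rest. Everything in this sketch is an illustrative assumption, not the paper's formulation.

```python
# Bias visible-unit potentials toward continuous observations (sketch).
import torch

def soft_clamp(unary_logits, x, strength=5.0):
    """unary_logits: potentials of visible units; x: observed values in [0, 1]."""
    return unary_logits + strength * (2.0 * x - 1.0)   # push toward the observation
```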

Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction

Title Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction
Authors Shun Zheng, Wei Cao, Wei Xu, Jiang Bian
Abstract Most existing event extraction (EE) methods merely extract event arguments within the sentence scope. However, such sentence-level EE methods struggle to handle soaring amounts of documents from emerging applications, such as finance, legislation, health, etc., where event arguments always scatter across different sentences, and even multiple such event mentions frequently co-exist in the same document. To address these challenges, we propose a novel end-to-end model, Doc2EDAG, which can generate an entity-based directed acyclic graph to fulfill document-level EE (DEE) effectively. Moreover, we reformalize the DEE task with a no-trigger-words design to ease document-level event labeling. To demonstrate the effectiveness of Doc2EDAG, we build a large-scale real-world dataset consisting of Chinese financial announcements with the challenges mentioned above. Extensive experiments with comprehensive analyses illustrate the superiority of Doc2EDAG over state-of-the-art methods. Data and code can be found at https://github.com/dolphin-zs/Doc2EDAG.
Tasks
Published 2019-04-16
URL https://arxiv.org/abs/1904.07535v2
PDF https://arxiv.org/pdf/1904.07535v2.pdf
PWC https://paperswithcode.com/paper/doc2edag-an-end-to-end-document-level
Repo https://github.com/dolphin-zs/Doc2EDAG
Framework pytorch
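
An illustrative sketch of the entity-based directed acyclic graph (EDAG) idea: an event record is built by filling one argument role at a time, each step choosing an entity (or None) conditioned on the path so far. The neural scorer and the model's actual machinery are abstracted into a hypothetical score_fn.

```python
# Role-by-role expansion of argument paths, gated by a scorer (sketch).
def expand_edag(path, roles, entities, score_fn):
    """Enumerate argument paths role by role; score_fn gates each candidate."""
    if not roles:
        return [path]
    role, rest = roles[0], roles[1:]
    results = []
    for ent in entities + [None]:                  # None = role left unfilled
        if score_fn(path, role, ent) > 0.5:
            results.extend(expand_edag(path + [(role, ent)], rest, entities, score_fn))
    return results
```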

nnAudio: An on-the-fly GPU Audio to Spectrogram Conversion Toolbox Using 1D Convolution Neural Networks

Title nnAudio: An on-the-fly GPU Audio to Spectrogram Conversion Toolbox Using 1D Convolution Neural Networks
Authors Kin Wai Cheuk, Hans Anderson, Kat Agres, Dorien Herremans
Abstract Converting time domain waveforms to frequency domain spectrograms is typically considered to be a preprocessing step done before model training. This approach, however, has several drawbacks. First, it takes a lot of hard disk space to store different frequency domain representations. This is especially true during the model development and tuning process, when exploring various types of spectrograms for optimal performance. Second, if another dataset is used, one must process all the audio clips again before the network can be retrained. In this paper, we integrate the time domain to frequency domain conversion as part of the model structure, and propose a neural network based toolbox, nnAudio, which leverages 1D convolutional neural networks to perform time domain to frequency domain conversion during feed-forward. It allows on-the-fly spectrogram generation without the need to store any spectrograms on the disk. This approach also allows back-propagation on the waveforms-to-spectrograms transformation layer, which implies that this transformation process can be made trainable, and hence further optimized by gradient descent. nnAudio reduces the waveforms-to-spectrograms conversion time for 1,770 waveforms (from the MAPS dataset) from $10.64$ seconds with librosa to only $0.001$ seconds for the Short-Time Fourier Transform (STFT), from $18.3$ seconds to $0.015$ seconds for the Mel spectrogram, and from $103.4$ seconds to $0.258$ seconds for the constant-Q transform (CQT), when using a GPU on our DGX workstation (Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz, Tesla V100 32GB GPUs; only one GPU is used for all the experiments). We also further optimize the existing CQT algorithm, so that the CQT spectrogram can be obtained without aliasing in a much faster computation time (from $0.258$ seconds to only $0.001$ seconds).
Tasks
Published 2019-12-27
URL https://arxiv.org/abs/1912.12055v2
PDF https://arxiv.org/pdf/1912.12055v2.pdf
PWC https://paperswithcode.com/paper/nnaudio-an-on-the-fly-gpu-audio-to
Repo https://github.com/KinWaiCheuk/nnAudio
Framework pytorch
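
The core trick the abstract describes is an STFT implemented as a 1-D convolution whose kernels are windowed Fourier basis functions, so spectrograms are computed on the GPU inside the forward pass (and the basis can even be made trainable). Below is a minimal sketch of that idea; nnAudio's actual API is richer.

```python
# STFT as conv1d with fixed windowed cosine/sine kernels (sketch).
import numpy as np
import torch
import torch.nn.functional as F

def stft_kernels(n_fft=1024):
    n = np.arange(n_fft)
    window = np.hanning(n_fft)
    freqs = np.arange(n_fft // 2 + 1)[:, None]
    real = np.cos(2 * np.pi * freqs * n / n_fft) * window
    imag = -np.sin(2 * np.pi * freqs * n / n_fft) * window
    k = np.concatenate([real, imag]).astype(np.float32)
    return torch.from_numpy(k).unsqueeze(1)              # (2 * n_bins, 1, n_fft)

def power_spectrogram(waveform, kernels, hop=512):
    # waveform: (batch, samples) -> (batch, n_bins, frames)
    out = F.conv1d(waveform.unsqueeze(1), kernels, stride=hop)
    n_bins = kernels.shape[0] // 2
    return out[:, :n_bins] ** 2 + out[:, n_bins:] ** 2   # real^2 + imag^2

print(power_spectrogram(torch.randn(1, 16000), stft_kernels()).shape)
```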

DiffTaichi: Differentiable Programming for Physical Simulation

Title DiffTaichi: Differentiable Programming for Physical Simulation
Authors Yuanming Hu, Luke Anderson, Tzu-Mao Li, Qi Sun, Nathan Carr, Jonathan Ragan-Kelley, Frédo Durand
Abstract We present DiffTaichi, a new differentiable programming language tailored for building high-performance differentiable physical simulators. Based on an imperative programming language, DiffTaichi generates gradients of simulation steps using source code transformations that preserve arithmetic intensity and parallelism. A lightweight tape is used to record the whole simulation program structure and replay the gradient kernels in reverse order for end-to-end backpropagation. We demonstrate the performance and productivity of our language in gradient-based learning and optimization tasks on 10 different physical simulators. For example, a differentiable elastic object simulator written in our language is 4.2x shorter than the hand-engineered CUDA version yet runs as fast, and is 188x faster than the TensorFlow implementation. Using our differentiable programs, neural network controllers are typically optimized within only tens of iterations.
Tasks Physical Simulations
Published 2019-10-01
URL https://arxiv.org/abs/1910.00935v3
PDF https://arxiv.org/pdf/1910.00935v3.pdf
PWC https://paperswithcode.com/paper/difftaichi-differentiable-programming-for
Repo https://github.com/yuanming-hu/difftaichi
Framework tf
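
DiffTaichi itself generates gradient kernels by source transformation, which is out of reach for a short snippet; as a rough stand-in, this is what end-to-end differentiation through a simulation looks like with plain autograd: a damped spring rolled out for 200 steps, with the loss gradient flowing back to the initial velocity.

```python
# Differentiating a loss through an entire simulation rollout (sketch).
import torch

def simulate(v0, k=10.0, damping=0.1, dt=0.01, steps=200):
    x, v = torch.tensor(0.0), v0
    for _ in range(steps):
        a = -k * x - damping * v      # spring plus damping force, unit mass
        v = v + dt * a
        x = x + dt * v                # semi-implicit Euler step
    return x

v0 = torch.tensor(1.0, requires_grad=True)
loss = (simulate(v0) - 0.5) ** 2      # match a target final position
loss.backward()
print(v0.grad)                        # gradient through the whole rollout
```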

Supporting Multi-point Fan Design with Dimension Reduction

Title Supporting Multi-point Fan Design with Dimension Reduction
Authors Pranay Seshadri, Shaowu Yuchi, Shahrokh Shahpar, Geoffrey Parks
Abstract Motivated by the idea of turbomachinery active subspace performance maps, this paper studies dimension reduction in turbomachinery 3D CFD simulations. First, we show that these subspaces exist across different blades (under the same parametrization), largely independent of their Mach number or Reynolds number. This is demonstrated via a numerical study on three different blades. Then, in an attempt to reduce the computational cost of identifying a suitable dimension reducing subspace, we examine statistical sufficient dimension reduction methods, including sliced inverse regression, sliced average variance estimation, principal Hessian directions, and contour regression. Dissatisfied with these results, we evaluate a new idea based on polynomial variable projection, a non-linear least squares problem. Our results using polynomial variable projection clearly demonstrate that one can accurately identify dimension reducing subspaces for turbomachinery functionals at a fraction of the cost associated with prior methods. We apply these subspaces to the problem of comparing design configurations across different flight points on a working line of a fan blade. We demonstrate how designs that offer a healthy compromise between performance at cruise and sea-level conditions can be easily found by visually inspecting their subspaces.
Tasks Dimensionality Reduction
Published 2019-10-20
URL https://arxiv.org/abs/1910.09030v1
PDF https://arxiv.org/pdf/1910.09030v1.pdf
PWC https://paperswithcode.com/paper/supporting-multi-point-fan-design-with
Repo https://github.com/psesh/turbodata
Framework none
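
A crude, hedged stand-in for the polynomial variable projection idea: look for a direction u such that a low-degree polynomial of the projected coordinate u^T x explains the response, here via normalized gradient steps on the residual. The paper solves this as a separable non-linear least squares problem; this toy version is nonconvex and may need restarts on harder functions.

```python
# Recover a 1-D dimension-reducing direction for a ridge function (sketch).
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))
u_true = np.ones(10) / np.sqrt(10.0)
f = (X @ u_true) ** 2 + 0.5 * (X @ u_true)         # a 1-D ridge function

u = rng.normal(size=10)
u /= np.linalg.norm(u)
for _ in range(200):
    t = X @ u
    coeffs = np.polyfit(t, f, deg=2)               # fit polynomial g on the projection
    resid = np.polyval(coeffs, t) - f
    g_prime = np.polyval(np.polyder(coeffs), t)
    grad = X.T @ (resid * g_prime)                 # chain rule w.r.t. u
    u -= 0.05 * grad / (np.linalg.norm(grad) + 1e-12)
    u /= np.linalg.norm(u)                         # stay on the unit sphere
print(abs(u @ u_true))                             # near 1 if the subspace is found
```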