April 1, 2020

Paper Group ANR 476

A Unified Framework for Multiclass and Multilabel Support Vector Machines. Transfer Heterogeneous Knowledge Among Peer-to-Peer Teammates: A Model Distillation Approach. Gaussianization Flows. Discrete graphical models – an optimization perspective. Reproducibility Challenge NeurIPS 2019 Report on “Competitive Gradient Descent”. Generalized Canonic …

A Unified Framework for Multiclass and Multilabel Support Vector Machines

Title A Unified Framework for Multiclass and Multilabel Support Vector Machines
Authors Hoda Shajari, Anand Rangarajan
Abstract We propose a novel integrated formulation for multiclass and multilabel support vector machines (SVMs). A number of approaches have been proposed to extend the original binary SVM to an all-in-one multiclass SVM. However, its direct extension to a unified multilabel SVM has not been widely investigated. We propose a straightforward extension to the SVM to cope with multiclass and multilabel classification problems within a unified framework. Our framework deviates from the conventional soft margin SVM framework with its direct oppositional structure. In our formulation, class-specific weight vectors (normal vectors) are learned by maximizing their margin with respect to an origin and penalizing patterns when they get too close to this origin. As a result, each weight vector chooses an orientation and a magnitude with respect to this origin in such a way that it best represents the patterns belonging to its corresponding class. Opposition between classes is introduced into the formulation via the minimization of pairwise inner products of weight vectors. We also extend our framework to cope with nonlinear separability via standard reproducing kernel Hilbert spaces (RKHS). Biases, which are closely related to the origin, need to be treated properly in both the original feature space and Hilbert space. We have the flexibility to incorporate constraints into the formulation (if they better reflect the underlying geometry) and improve the performance of the classifier. To this end, specifics and technicalities such as the origin in RKHS are addressed. Results demonstrate a competitive classifier for both multiclass and multilabel classification problems.
Tasks
Published 2020-03-25
URL https://arxiv.org/abs/2003.11197v1
PDF https://arxiv.org/pdf/2003.11197v1.pdf
PWC https://paperswithcode.com/paper/a-unified-framework-for-multiclass-and
Repo
Framework
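
A minimal numerical sketch of the formulation described above may help fix ideas. The objective below is an assumption-laden paraphrase, not the paper's exact program: each class weight vector is pushed away from the origin (one-class-SVM style), class members are penalized for falling inside their class margin, and pairwise inner products between weight vectors supply the opposition term. The function name, arguments, and trade-off weights are all hypothetical.

```python
import numpy as np

def unified_svm_loss(W, rho, X, Y, C=1.0, lam=0.1):
    """Hedged sketch of a unified multiclass/multilabel SVM objective.
    W: (d, K) class weight vectors; rho: (K,) per-class margins;
    X: (n, d) patterns; Y: (n, K) binary label matrix, so multilabel
    data needs no special casing."""
    reg = 0.5 * np.sum(W ** 2)                 # margin maximization term
    scores = X @ W                             # (n, K) pattern-class scores
    hinge = np.maximum(0.0, rho - scores) * Y  # class members too close to the origin
    G = W.T @ W                                # pairwise inner products of weight vectors
    opposition = np.sum(G) - np.trace(G)       # sum over pairs c != c'
    return reg - rho.sum() + C * hinge.sum() + lam * opposition
```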

Transfer Heterogeneous Knowledge Among Peer-to-Peer Teammates: A Model Distillation Approach

Title Transfer Heterogeneous Knowledge Among Peer-to-Peer Teammates: A Model Distillation Approach
Authors Zeyue Xue, Shuang Luo, Chao Wu, Pan Zhou, Kaigui Bian, Wei Du
Abstract Peer-to-peer knowledge transfer in distributed environments has emerged as a promising method, since it can accelerate learning and improve team-wide performance without relying on pre-trained teachers in deep reinforcement learning. However, traditional peer-to-peer methods such as action advising have encountered difficulties in efficiently expressing knowledge and advice. We therefore propose a new solution that reuses experiences and transfers value functions among multiple students via model distillation. Transferring the Q-function directly is still challenging, however, since it is unstable and unbounded. To address this issue, we adopt the Categorical Deep Q-Network. We also describe how to design an efficient communication protocol to exploit heterogeneous knowledge among multiple distributed agents. Our proposed framework, namely Learning and Teaching Categorical Reinforcement (LTCR), shows promising performance in stabilizing and accelerating learning progress with improved team-wide reward in four typical experimental environments.
Tasks Transfer Learning
Published 2020-02-06
URL https://arxiv.org/abs/2002.02202v1
PDF https://arxiv.org/pdf/2002.02202v1.pdf
PWC https://paperswithcode.com/paper/transfer-heterogeneous-knowledge-among-peer
Repo
Framework
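
The key stabilizing choice in the abstract, distilling a bounded categorical value distribution rather than a raw Q-function, can be sketched as follows. This is a hedged illustration of C51-style distillation, not LTCR's actual protocol or loss; the function name and shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def categorical_distill_loss(student_logits, teacher_logits):
    """Cross-entropy between two Categorical DQN (C51-style) value
    distributions, shape (batch, actions, atoms). Because each action's
    value is a distribution over a fixed, bounded support, matching it
    is better behaved than regressing an unbounded Q-value."""
    t = F.softmax(teacher_logits, dim=-1).detach()   # peer teacher's target
    log_s = F.log_softmax(student_logits, dim=-1)
    return -(t * log_s).sum(dim=-1).mean()
```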

Gaussianization Flows

Title Gaussianization Flows
Authors Chenlin Meng, Yang Song, Jiaming Song, Stefano Ermon
Abstract Iterative Gaussianization is a fixed-point iteration procedure that can transform any continuous random vector into a Gaussian one. Based on iterative Gaussianization, we propose a new type of normalizing flow model that enables both efficient computation of likelihoods and efficient inversion for sample generation. We demonstrate that these models, named Gaussianization flows, are universal approximators for continuous probability distributions under some regularity conditions. Because of this guaranteed expressivity, they can capture multimodal target distributions without compromising the efficiency of sample generation. Experimentally, we show that Gaussianization flows achieve better or comparable performance on several tabular datasets compared to other efficiently invertible flow models such as Real NVP, Glow and FFJORD. In particular, Gaussianization flows are easier to initialize, demonstrate better robustness with respect to different transformations of the training data, and generalize better on small training sets.
Tasks
Published 2020-03-04
URL https://arxiv.org/abs/2003.01941v1
PDF https://arxiv.org/pdf/2003.01941v1.pdf
PWC https://paperswithcode.com/paper/gaussianization-flows
Repo
Framework
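
The fixed-point procedure the paper builds on is easy to state in code. Below is a rough sketch of classic iterative Gaussianization with empirical marginal CDFs and random rotations; the actual Gaussianization flow replaces both pieces with trainable, invertible counterparts.

```python
import numpy as np
from scipy.stats import norm

def iterative_gaussianization(X, n_iters=20, seed=0):
    """Each sweep makes every marginal Gaussian via its empirical CDF,
    then mixes dimensions with a random rotation. X: (n, d) samples."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    for _ in range(n_iters):
        # marginal Gaussianization: empirical CDF, then inverse normal CDF
        ranks = X.argsort(axis=0).argsort(axis=0)
        u = (ranks + 0.5) / n          # avoid CDF values of exactly 0 or 1
        X = norm.ppf(u)
        # a random rotation couples the dimensions for the next sweep
        Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
        X = X @ Q
    return X
```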

Discrete graphical models – an optimization perspective

Title Discrete graphical models – an optimization perspective
Authors Bogdan Savchynskyy
Abstract This monograph is about discrete energy minimization for discrete graphical models. It considers graphical models, or, more precisely, maximum a posteriori inference for graphical models, purely as a combinatorial optimization problem. Modeling, applications, probabilistic interpretations and many other aspects are either ignored here or find their place in examples and remarks only. It covers the integer linear programming formulation of the problem as well as its linear programming, Lagrange and Lagrange decomposition-based relaxations. In particular, it provides a detailed analysis of the polynomially solvable acyclic and submodular problems, along with the corresponding exact optimization methods. Major approximate methods, such as message passing and graph cut techniques, are also described and analyzed comprehensively. The monograph can be useful for undergraduate and graduate students studying optimization or graphical models, as well as for experts in optimization who want to have a look into graphical models. To make the monograph suitable for both categories of readers we explicitly separate the mathematical optimization background chapters from those specific to graphical models.
Tasks Combinatorial Optimization
Published 2020-01-24
URL https://arxiv.org/abs/2001.09017v1
PDF https://arxiv.org/pdf/2001.09017v1.pdf
PWC https://paperswithcode.com/paper/discrete-graphical-models-an-optimization
Repo
Framework
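
For reference, the integer linear programming formulation that such monographs start from can be written, for a pairwise model with unary potentials $\theta_v$ and pairwise potentials $\theta_{uv}$, in the standard form below (notation may differ from the monograph's):

```latex
\min_{\mu \in \{0,1\}} \;
  \sum_{v \in V}\sum_{s} \theta_v(s)\,\mu_v(s)
  + \sum_{uv \in E}\sum_{s,t} \theta_{uv}(s,t)\,\mu_{uv}(s,t)
\quad \text{s.t.} \quad
  \sum_{s}\mu_v(s) = 1 \;\;\forall v, \qquad
  \sum_{t}\mu_{uv}(s,t) = \mu_v(s) \;\;\forall uv \in E,\; \forall s.
```

Dropping the integrality constraint $\mu \in \{0,1\}$ yields the linear programming relaxation analyzed in the text.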

Reproducibility Challenge NeurIPS 2019 Report on “Competitive Gradient Descent”

Title Reproducibility Challenge NeurIPS 2019 Report on “Competitive Gradient Descent”
Authors Gopi Kishan
Abstract This is a report for the reproducibility challenge of NeurIPS 2019 on the paper Competitive Gradient Descent (Schafer et al., 2019). The paper introduces a novel algorithm for the numerical computation of Nash equilibria of competitive two-player games. It avoids the oscillatory and divergent behaviours seen in alternating gradient descent. The purpose of this report is to critically examine the reproducibility of the work by (Schafer et al., 2019), within the framework of the NeurIPS 2019 Reproducibility Challenge. The experiments replicated in this report confirm the results of the original study. Moreover, this project offers a Python (PyTorch-based) implementation of the proposed CGD algorithm, which can be found at the following public git repository: (https://github.com/GopiKishan14/Reproducibility_Challenge_NeurIPS_2019)
Tasks
Published 2020-01-26
URL https://arxiv.org/abs/2001.10820v1
PDF https://arxiv.org/pdf/2001.10820v1.pdf
PWC https://paperswithcode.com/paper/reproducibility-challenge-neurips-2019-report
Repo
Framework
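
For context, the update the report reproduces is, up to notation, the CGD step of Schafer et al. (2019), where player $x$ minimizes $f$ and player $y$ minimizes $g$:

```latex
x^{+} = x - \eta\,\bigl(\mathrm{Id} - \eta^{2} D^{2}_{xy} f \, D^{2}_{yx} g\bigr)^{-1}
            \bigl(\nabla_{x} f - \eta\, D^{2}_{xy} f \, \nabla_{y} g\bigr), \\
y^{+} = y - \eta\,\bigl(\mathrm{Id} - \eta^{2} D^{2}_{yx} g \, D^{2}_{xy} f\bigr)^{-1}
            \bigl(\nabla_{y} g - \eta\, D^{2}_{yx} g \, \nabla_{x} f\bigr).
```

The matrix inverse is what damps the rotational dynamics that make plain alternating gradient descent oscillate or diverge on bilinear games.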

Generalized Canonical Correlation Analysis: A Subspace Intersection Approach

Title Generalized Canonical Correlation Analysis: A Subspace Intersection Approach
Authors Mikael Sørensen, Charilaos I. Kanatsoulis, Nicholas D. Sidiropoulos
Abstract Generalized Canonical Correlation Analysis (GCCA) is an important tool that finds numerous applications in data mining, machine learning, and artificial intelligence. It aims at finding 'common' random variables that are strongly correlated across multiple feature representations (views) of the same set of entities. CCA, and to a lesser extent GCCA, have been studied from the statistical and algorithmic points of view, but not as much from the standpoint of linear algebra. This paper offers a fresh algebraic perspective of GCCA based on a (bi-)linear generative model that naturally captures its essence. It is shown that from a linear algebra point of view, GCCA is tantamount to subspace intersection; and conditions under which the common subspace of the different views is identifiable are provided. A novel GCCA algorithm is proposed based on subspace intersection, which scales up to handle large GCCA tasks. Synthetic as well as real data experiments are provided to showcase the effectiveness of the proposed approach.
Tasks
Published 2020-03-25
URL https://arxiv.org/abs/2003.11205v1
PDF https://arxiv.org/pdf/2003.11205v1.pdf
PWC https://paperswithcode.com/paper/generalized-canonical-correlation-analysis-a
Repo
Framework
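
The subspace-intersection view admits a very short numerical caricature. Under the assumed noiseless setting below, a direction shared by all M view subspaces yields a singular value near sqrt(M) when orthonormal bases of the views are stacked; the paper's actual algorithm is more refined and built to scale.

```python
import numpy as np

def common_subspace(views, r):
    """views: list of M arrays, each (n, d_i), rows indexing the same
    n entities. Returns an estimated basis of the r-dimensional common
    subspace and the corresponding singular values (close to sqrt(M)
    for genuinely shared directions)."""
    bases = [np.linalg.qr(V)[0] for V in views]   # orthonormal basis per view
    S = np.hstack(bases)
    U, sv, _ = np.linalg.svd(S, full_matrices=False)
    return U[:, :r], sv[:r]
```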

Supervised Raw Video Denoising with a Benchmark Dataset on Dynamic Scenes

Title Supervised Raw Video Denoising with a Benchmark Dataset on Dynamic Scenes
Authors Huanjing Yue, Cong Cao, Lei Liao, Ronghe Chu, Jingyu Yang
Abstract In recent years, the supervised learning strategy for real noisy image denoising has been emerging and has achieved promising results. In contrast, realistic noise removal for raw noisy videos is rarely studied due to the lack of noisy-clean pairs for dynamic scenes. Clean video frames for dynamic scenes cannot be captured with a long-exposure shutter or averaging multi-shots as was done for static images. In this paper, we solve this problem by creating motions for controllable objects, such as toys, and capturing each static moment multiple times to generate clean video frames. In this way, we construct a dataset with 55 groups of noisy-clean videos with ISO values ranging from 1600 to 25600. To our knowledge, this is the first dynamic video dataset with noisy-clean pairs. Correspondingly, we propose a raw video denoising network (RViDeNet) by exploring the temporal, spatial, and channel correlations of video frames. Since the raw video has Bayer patterns, we pack it into four sub-sequences, i.e., RGBG sequences, which are denoised by the proposed RViDeNet separately and finally fused into a clean video. In addition, our network not only outputs a raw denoising result, but also the sRGB result by going through an image signal processing (ISP) module, which enables users to generate the sRGB result with their favourite ISPs. Experimental results demonstrate that our method outperforms state-of-the-art video and raw image denoising algorithms on both indoor and outdoor videos.
Tasks Denoising, Image Denoising, Video Denoising
Published 2020-03-31
URL https://arxiv.org/abs/2003.14013v1
PDF https://arxiv.org/pdf/2003.14013v1.pdf
PWC https://paperswithcode.com/paper/supervised-raw-video-denoising-with-a
Repo
Framework
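
The packing step mentioned in the abstract can be sketched in a few lines. The channel order below assumes an RGGB-like mosaic and is illustrative only; the paper's exact layout ("RGBG") and pipeline may differ.

```python
import numpy as np

def pack_bayer(raw):
    """Split a Bayer raw frame (H, W) into its four color-sample planes,
    giving a (H/2, W/2, 4) tensor that can be denoised per sub-sequence
    and later fused back into a full-resolution clean frame."""
    return np.stack([raw[0::2, 0::2],   # R
                     raw[0::2, 1::2],   # G1
                     raw[1::2, 0::2],   # G2
                     raw[1::2, 1::2]],  # B
                    axis=-1)
```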

Holistically-Attracted Wireframe Parsing

Title Holistically-Attracted Wireframe Parsing
Authors Nan Xue, Tianfu Wu, Song Bai, Fu-Dong Wang, Gui-Song Xia, Liangpei Zhang, Philip H. S. Torr
Abstract This paper presents a fast and parsimonious parsing method to accurately and robustly detect a vectorized wireframe in an input image with a single forward pass. The proposed method is end-to-end trainable, consisting of three components: (i) line segment and junction proposal generation, (ii) line segment and junction matching, and (iii) line segment and junction verification. For computing line segment proposals, a novel exact dual representation is proposed which exploits a parsimonious geometric reparameterization for line segments and forms a holistic 4-dimensional attraction field map for an input image. Junctions can be treated as the "basins" in the attraction field. The proposed method is thus called Holistically-Attracted Wireframe Parser (HAWP). In experiments, the proposed method is tested on two benchmarks, the Wireframe dataset and the YorkUrban dataset. On both benchmarks, it obtains state-of-the-art performance in terms of accuracy and efficiency. For example, on the Wireframe dataset, compared to the previous state-of-the-art method L-CNN, it improves the challenging mean structural average precision (msAP) by a large margin (2.8% absolute improvement) and achieves 29.5 FPS on a single GPU (89% relative improvement). A systematic ablation study is performed to further justify the proposed method.
Tasks
Published 2020-03-03
URL https://arxiv.org/abs/2003.01663v1
PDF https://arxiv.org/pdf/2003.01663v1.pdf
PWC https://paperswithcode.com/paper/holistically-attracted-wireframe-parsing
Repo
Framework
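
As a toy picture of the "pixels attracted to segments" idea, the sketch below computes, for every pixel, the displacement to the closest point on the nearest line segment. HAWP's actual map is a 4-dimensional exact dual reparameterization of the segment, not this raw 2-D displacement; the function is illustrative only.

```python
import numpy as np

def attraction_field(h, w, segments):
    """segments: iterable of endpoint pairs ((x0, y0), (x1, y1)).
    Returns an (h, w, 2) field of displacements from each pixel to the
    closest point on its nearest segment."""
    ys, xs = np.mgrid[0:h, 0:w]
    p = np.stack([xs, ys], -1).reshape(-1, 2).astype(float)
    best = np.full(len(p), np.inf)
    disp = np.zeros_like(p)
    for a, b in segments:
        a, b = np.asarray(a, float), np.asarray(b, float)
        t = np.clip(((p - a) @ (b - a)) / ((b - a) @ (b - a)), 0.0, 1.0)
        q = a + t[:, None] * (b - a)          # closest point on the segment
        d = np.linalg.norm(q - p, axis=1)
        closer = d < best
        best[closer] = d[closer]
        disp[closer] = (q - p)[closer]
    return disp.reshape(h, w, 2)
```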

Multimodal Deep Unfolding for Guided Image Super-Resolution

Title Multimodal Deep Unfolding for Guided Image Super-Resolution
Authors Iman Marivani, Evaggelia Tsiligianni, Bruno Cornelis, Nikos Deligiannis
Abstract The reconstruction of a high resolution image given a low resolution observation is an ill-posed inverse problem in imaging. Deep learning methods rely on training data to learn an end-to-end mapping from a low-resolution input to a high-resolution output. Unlike existing deep multimodal models that do not incorporate domain knowledge about the problem, we propose a multimodal deep learning design that incorporates sparse priors and allows the effective integration of information from another image modality into the network architecture. Our solution relies on a novel deep unfolding operator, performing steps similar to an iterative algorithm for convolutional sparse coding with side information; therefore, the proposed neural network is interpretable by design. The deep unfolding architecture is used as a core component of a multimodal framework for guided image super-resolution. An alternative multimodal design is investigated by employing residual learning to improve the training efficiency. The presented multimodal approach is applied to super-resolution of near-infrared and multi-spectral images as well as depth upsampling using RGB images as side information. Experimental results show that our model outperforms state-of-the-art methods.
Tasks Image Super-Resolution, Super-Resolution
Published 2020-01-21
URL https://arxiv.org/abs/2001.07575v1
PDF https://arxiv.org/pdf/2001.07575v1.pdf
PWC https://paperswithcode.com/paper/multimodal-deep-unfolding-for-guided-image
Repo
Framework
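
One unfolded iteration of sparse coding with side information can be sketched as a LISTA-style layer. The paper unfolds a convolutional operator with its own structure; the layer names and shapes below are assumptions for illustration.

```python
import torch
import torch.nn as nn

class UnfoldedStep(nn.Module):
    """One learned iteration of z <- soft(Wx + Sz + Gy, theta):
    x is the low-resolution input, z the current sparse code, and
    y the guidance modality (side information)."""
    def __init__(self, d_in, d_code, d_side):
        super().__init__()
        self.W = nn.Linear(d_in, d_code, bias=False)    # encodes the input
        self.S = nn.Linear(d_code, d_code, bias=False)  # mutual-inhibition term
        self.G = nn.Linear(d_side, d_code, bias=False)  # injects side information
        self.theta = nn.Parameter(torch.full((d_code,), 0.1))

    def forward(self, x, z, y):
        pre = self.W(x) + self.S(z) + self.G(y)
        # soft-thresholding, the proximal step of the sparse prior
        return torch.sign(pre) * torch.relu(pre.abs() - self.theta)
```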

Citation Recommendations Considering Content and Structural Context Embedding

Title Citation Recommendations Considering Content and Structural Context Embedding
Authors Yang Zhang, Qiang Ma
Abstract The number of academic papers being published has been increasing exponentially in recent years, and recommending adequate citations to assist researchers in writing papers is a non-trivial task. Conventional approaches may not be optimal, as the recommended papers may already be known to the users, or be solely relevant to the surrounding context but not other ideas discussed in the manuscript. In this work, we propose a novel embedding algorithm, DocCit2Vec, along with the new concept of "structural context", to tackle the aforementioned issues. The proposed approach demonstrates superior performance to baseline models in extensive experiments designed to simulate practical usage scenarios.
Tasks
Published 2020-01-08
URL https://arxiv.org/abs/2001.02344v1
PDF https://arxiv.org/pdf/2001.02344v1.pdf
PWC https://paperswithcode.com/paper/citation-recommendations-considering-content
Repo
Framework

EcoNAS: Finding Proxies for Economical Neural Architecture Search

Title EcoNAS: Finding Proxies for Economical Neural Architecture Search
Authors Dongzhan Zhou, Xinchi Zhou, Wenwei Zhang, Chen Change Loy, Shuai Yi, Xuesen Zhang, Wanli Ouyang
Abstract Neural Architecture Search (NAS) achieves significant progress in many computer vision tasks. While many methods have been proposed to improve the efficiency of NAS, the search process is still laborious because training and evaluating plausible architectures over a large search space is time-consuming. Assessing network candidates under a proxy (i.e., a computationally reduced setting) thus becomes inevitable. In this paper, we observe that most existing proxies exhibit different behaviors in maintaining the rank consistency among network candidates. In particular, some proxies can be more reliable: the rank of candidates does not differ much between their reduced-setting performance and final performance. We systematically investigate some widely adopted reduction factors and report our observations. Inspired by these observations, we present a reliable proxy and further formulate a hierarchical proxy strategy. The strategy spends more computation on candidate networks that are potentially more accurate, while discarding unpromising ones at an early stage with a fast proxy. This leads to an economical evolutionary-based NAS (EcoNAS), which achieves an impressive 400x search time reduction in comparison to the evolutionary-based state of the art (8 vs. 3150 GPU days). Some new proxies suggested by our observations can also be applied to accelerate other NAS methods while still discovering good candidate networks with performance matching those found by previous proxy strategies.
Tasks Neural Architecture Search
Published 2020-01-05
URL https://arxiv.org/abs/2001.01233v2
PDF https://arxiv.org/pdf/2001.01233v2.pdf
PWC https://paperswithcode.com/paper/econas-finding-proxies-for-economical-neural
Repo
Framework
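
The hierarchical proxy strategy reads naturally as a two-stage filter. The sketch below is a bare-bones rendering of that idea under assumed interfaces (`fast_proxy` and `reliable_proxy` as scoring callables); EcoNAS's actual reduction factors, schedule, and evolutionary loop are more involved.

```python
def hierarchical_proxy_search(candidates, fast_proxy, reliable_proxy, keep=0.25):
    """Score every candidate with a cheap proxy, then spend the expensive,
    rank-consistent proxy only on the most promising fraction."""
    scored = sorted(candidates, key=fast_proxy, reverse=True)
    survivors = scored[: max(1, int(len(scored) * keep))]
    return max(survivors, key=reliable_proxy)
```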

Length-controllable Abstractive Summarization by Guiding with Summary Prototype

Title Length-controllable Abstractive Summarization by Guiding with Summary Prototype
Authors Itsumi Saito, Kyosuke Nishida, Kosuke Nishida, Atsushi Otsuka, Hisako Asano, Junji Tomita, Hiroyuki Shindo, Yuji Matsumoto
Abstract We propose a new length-controllable abstractive summarization model. Recent state-of-the-art abstractive summarization models based on encoder-decoder models generate only one summary per source text. However, controllable summarization, especially of the length, is an important aspect for practical applications. Previous studies on length-controllable abstractive summarization incorporate length embeddings in the decoder module for controlling the summary length. Although the length embeddings can control where to stop decoding, they do not decide which information should be included in the summary within the length constraint. Unlike the previous models, our length-controllable abstractive summarization model incorporates a word-level extractive module in the encoder-decoder model instead of length embeddings. Our model generates a summary in two steps. First, our word-level extractor extracts a sequence of important words (we call it the “prototype text”) from the source text according to the word-level importance scores and the length constraint. Second, the prototype text is used as additional input to the encoder-decoder model, which generates a summary by jointly encoding and copying words from both the prototype text and source text. Since the prototype text is a guide to both the content and length of the summary, our model can generate an informative and length-controlled summary. Experiments with the CNN/Daily Mail dataset and the NEWSROOM dataset show that our model outperformed previous models in length-controlled settings.
Tasks Abstractive Text Summarization
Published 2020-01-21
URL https://arxiv.org/abs/2001.07331v1
PDF https://arxiv.org/pdf/2001.07331v1.pdf
PWC https://paperswithcode.com/paper/length-controllable-abstractive-summarization
Repo
Framework
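
The interface of the word-level extractor is simple to mimic. The sketch below keeps the highest-scoring source words within the length budget while preserving source order; the paper's extractor is a trained model, so this only mirrors the input/output contract.

```python
def extract_prototype(words, scores, budget):
    """Select the `budget` highest-scoring words, then restore source
    order so the prototype text reads as a (telegraphic) sequence that
    guides both the content and the length of the summary."""
    top = sorted(range(len(words)), key=lambda i: scores[i], reverse=True)[:budget]
    return [words[i] for i in sorted(top)]
```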

Neighborhood Information-based Probabilistic Algorithm for Network Disintegration

Title Neighborhood Information-based Probabilistic Algorithm for Network Disintegration
Authors Qian Li, San-Yang Liu, Xin-She Yang
Abstract Many real-world applications can be modelled as complex networks, and such networks include the Internet, epidemic disease networks, transport networks, power grids, protein-folding structures and others. Network integrity and robustness are important to ensure that crucial networks are protected and undesired harmful networks can be dismantled. Network structure and integrity can be controlled by a set of key nodes, and finding the optimal combination of nodes that ensures network structure and integrity can be an NP-complete problem. Despite extensive studies, existing methods have many limitations and there are still many unresolved problems. This paper presents a probabilistic approach based on neighborhood information and node importance, namely, the neighborhood information-based probabilistic algorithm (NIPA). We also define a new centrality-based importance measure (IM), which combines the contribution ratios of the neighbor nodes of each target node and two-hop node information. Our proposed NIPA has been tested on different network benchmarks and compared with three other methods: optimal attack strategy (OAS), high betweenness first (HBF) and high degree first (HDF). Experiments suggest that the proposed NIPA is the most effective among all four methods. In general, NIPA can identify the most crucial node combination with higher effectiveness, and the set of optimal key nodes found by our proposed NIPA is much smaller than that by heuristic centrality prediction. In addition, many previously neglected weakly connected nodes are identified, which become a crucial part of the newly identified optimal nodes. Thus, revised strategies for protection are recommended to ensure the safeguard of network integrity. Further key issues and future research topics are also discussed.
Tasks
Published 2020-03-08
URL https://arxiv.org/abs/2003.04713v1
PDF https://arxiv.org/pdf/2003.04713v1.pdf
PWC https://paperswithcode.com/paper/neighborhood-information-based-probabilistic
Repo
Framework
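
A loose rendering of the importance measure (IM) described above: each neighbor contributes a ratio of its own links, plus a damped two-hop term. The specific weights (including the 0.5 damping) are invented for illustration; the paper defines the exact combination.

```python
import networkx as nx

def importance(G, v):
    """Toy neighborhood-based importance of node v: a neighbor u
    contributes more when v makes up a larger share of u's links,
    and u's own neighborhood is folded in with a damping factor."""
    score = G.degree(v)
    for u in G.neighbors(v):
        score += 1.0 / G.degree(u)                    # contribution ratio of u
        score += 0.5 * sum(1.0 / G.degree(w)          # damped two-hop term
                           for w in G.neighbors(u) if w != v)
    return score
```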

An Empirical Accuracy Law for Sequential Machine Translation: the Case of Google Translate

Title An Empirical Accuracy Law for Sequential Machine Translation: the Case of Google Translate
Authors Lucas Nunes Sequeira, Bruno Moreschi, Fabio Gagliardi Cozman, Bernardo Fontes
Abstract We have established, through empirical testing, a law that relates the number of translating hops to translation accuracy in sequential machine translation in Google Translate. Both accuracy and size decrease with the number of hops; the former displays a decrease closely following a power law. Such a law allows one to predict the behavior of translation chains that may be built as society increasingly depends on automated devices.
Tasks Machine Translation
Published 2020-03-05
URL https://arxiv.org/abs/2003.02817v1
PDF https://arxiv.org/pdf/2003.02817v1.pdf
PWC https://paperswithcode.com/paper/an-empirical-accuracy-law-for-sequential
Repo
Framework
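
Given measured accuracies for chains of increasing hop count, the reported power law, accuracy ≈ a · hops^(−b), can be recovered with a log-log linear fit, e.g.:

```python
import numpy as np

def fit_power_law(hops, accuracy):
    """Fit accuracy ~ a * hops**(-b) by linear regression in log-log
    space; returns (a, b). hops and accuracy are positive arrays."""
    slope, intercept = np.polyfit(np.log(hops), np.log(accuracy), 1)
    return np.exp(intercept), -slope
```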

Eigen-Stratified Models

Title Eigen-Stratified Models
Authors Jonathan Tuck, Stephen Boyd
Abstract Stratified models depend in an arbitrary way on a selected categorical feature that takes $K$ values, and depend linearly on the other $n$ features. Laplacian regularization with respect to a graph on the feature values can greatly improve the performance of a stratified model, especially in the low-data regime. A significant issue with Laplacian-regularized stratified models is that the model is $K$ times the size of the base model, which can be quite large. We address this issue by formulating eigen-stratified models, which are stratified models with an additional constraint that the model parameters are linear combinations of some modest number $m$ of bottom eigenvectors of the graph Laplacian, i.e., those associated with the $m$ smallest eigenvalues. With eigen-stratified models, we only need to store the $m$ bottom eigenvectors and the corresponding coefficients as the stratified model parameters. This leads to a reduction, sometimes large, of model size when $m \leq n$ and $m \ll K$. In some cases, the additional regularization implicit in eigen-stratified models can improve out-of-sample performance over standard Laplacian regularized stratified models.
Tasks
Published 2020-01-27
URL https://arxiv.org/abs/2001.10389v1
PDF https://arxiv.org/pdf/2001.10389v1.pdf
PWC https://paperswithcode.com/paper/eigen-stratified-models
Repo
Framework
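
The constraint that defines an eigen-stratified model is compact enough to sketch: restrict the stratified parameters to the span of the $m$ bottom Laplacian eigenvectors. The helper below is a hedged illustration; the regularized fitting problem that learns the coefficients is the paper's actual contribution.

```python
import numpy as np
import networkx as nx

def eigen_basis(G, m):
    """Return E, the (K, m) matrix of eigenvectors of the graph Laplacian
    of G (one node per categorical feature value) for the m smallest
    eigenvalues. The stratified parameters are then theta = E @ C, so
    only E and the (m, n) coefficient matrix C need storing."""
    L = nx.laplacian_matrix(G).toarray().astype(float)
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, :m]

# theta = eigen_basis(G, m) @ C   # C is learned by the fitting problem
```

Storing $E$ and $C$ instead of the full $K \times n$ parameter block is exactly where the size reduction comes from when $m \ll K$.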