Paper Group ANR 430
Mining GOLD Samples for Conditional GANs
Title | Mining GOLD Samples for Conditional GANs |
Authors | Sangwoo Mo, Chiheon Kim, Sungwoong Kim, Minsu Cho, Jinwoo Shin |
Abstract | Conditional generative adversarial networks (cGANs) have gained considerable attention in recent years due to their class-wise controllability and superior quality on complex generation tasks. We introduce a simple yet effective approach to improving cGANs by measuring the discrepancy between the data distribution and the model distribution on given samples. The proposed measure, coined the gap of log-densities (GOLD), provides an effective self-diagnosis for cGANs while being efficiently computed from the discriminator. We propose three applications of the GOLD: example re-weighting, rejection sampling, and active learning, which improve the training, inference, and data selection of cGANs, respectively. Our experimental results demonstrate that the proposed methods outperform corresponding baselines for all three applications on different image datasets. |
Tasks | Active Learning |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09170v1 |
https://arxiv.org/pdf/1910.09170v1.pdf | |
PWC | https://paperswithcode.com/paper/mining-gold-samples-for-conditional-gans |
Repo | |
Framework | |
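The GOLD measure rests on the standard density-ratio identity for GAN discriminators: for a near-optimal discriminator $D = \sigma(\text{logit})$, $D = p_{\text{data}}/(p_{\text{data}} + p_{\text{model}})$, so the raw logit approximates $\log p_{\text{data}} - \log p_{\text{model}}$. A minimal sketch of that identity and its rejection-sampling use is below; the function names and clipping constant are illustrative assumptions, not the paper's code, and the paper's full GOLD for cGANs further decomposes into unconditional and conditional terms.

```python
import numpy as np

def gold_from_logits(logits):
    """Gap of log-densities from discriminator logits: for an
    (approximately) optimal sigmoid discriminator, the raw logit
    equals log p_data(x) - log p_model(x)."""
    return np.asarray(logits)

def rejection_sample(samples, logits, cap=10.0):
    """Keep generated samples with probability proportional to the
    estimated density ratio p_data / p_model (clipped for stability)."""
    samples = np.asarray(samples)
    ratio = np.exp(np.clip(gold_from_logits(logits), -cap, cap))
    keep = np.random.rand(len(samples)) < ratio / ratio.max()
    return samples[keep]
```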
Energy-Efficient Processing and Robust Wireless Cooperative Transmission for Edge Inference
Title | Energy-Efficient Processing and Robust Wireless Cooperative Transmission for Edge Inference |
Authors | Kai Yang, Yuanming Shi, Wei Yu, Zhi Ding |
Abstract | Edge machine learning can deliver low-latency and private artificial intelligence (AI) services for mobile devices by leveraging computation and storage resources at the network edge. This paper presents an energy-efficient edge processing framework to execute deep learning inference tasks at the edge computing nodes whose wireless connections to mobile devices are prone to channel uncertainties. Aimed at minimizing the sum of computation and transmission power consumption with probabilistic quality-of-service (QoS) constraints, we formulate a joint inference tasking and downlink beamforming problem that is characterized by a group sparse objective function. We provide a statistical-learning-based robust optimization approach to approximate the highly intractable probabilistic QoS constraints by nonconvex quadratic constraints, which are further reformulated as matrix inequalities with a rank-one constraint via matrix lifting. We design a reweighted power minimization approach based on iteratively reweighted $\ell_1$ minimization with difference-of-convex-functions (DC) regularization and weight updates, where the reweighting enhances group sparsity while the DC regularization induces rank-one solutions. Numerical results demonstrate that the proposed approach outperforms other state-of-the-art approaches. |
Tasks | |
Published | 2019-07-29 |
URL | https://arxiv.org/abs/1907.12475v2 |
https://arxiv.org/pdf/1907.12475v2.pdf | |
PWC | https://paperswithcode.com/paper/energy-efficient-processing-and-robust |
Repo | |
Framework | |
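The reweighting loop in the abstract follows the usual iteratively reweighted $\ell_1$ pattern. The sketch below shows that pattern on a plain group-sparse least-squares problem solved by proximal gradient, assuming Euclidean group norms and a fixed step size; the paper's matrix lifting and DC regularization for rank-one solutions are omitted, so this is only the group-sparsity half of the method.

```python
import numpy as np

def group_prox(x, groups, thresh):
    """Proximal operator of the weighted group-l2 norm (block soft-thresholding)."""
    out = x.copy()
    for g, t in zip(groups, thresh):
        ng = np.linalg.norm(x[g])
        out[g] = 0.0 if ng <= t else (1 - t / ng) * x[g]
    return out

def reweighted_group_l1(A, b, groups, lam=0.1, eps=1e-3, outer=5, inner=200):
    """min_x 0.5||Ax - b||^2 + lam * sum_g w_g ||x_g||_2,
    with weights w_g = 1 / (||x_g|| + eps) refreshed after each inner solve."""
    x = np.zeros(A.shape[1])
    w = np.ones(len(groups))
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1/L for the smooth part
    for _ in range(outer):
        for _ in range(inner):                 # proximal gradient, fixed weights
            grad = A.T @ (A @ x - b)
            x = group_prox(x - step * grad, groups, lam * step * w)
        w = 1.0 / (np.array([np.linalg.norm(x[g]) for g in groups]) + eps)
    return x

A = np.random.randn(40, 12)
x_true = np.zeros(12)
x_true[0:3] = 1.0                              # only the first group is active
groups = [list(range(k, k + 3)) for k in range(0, 12, 3)]
print(np.round(reweighted_group_l1(A, A @ x_true, groups), 2))
```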
Natural Language Generation at Scale: A Case Study for Open Domain Question Answering
Title | Natural Language Generation at Scale: A Case Study for Open Domain Question Answering |
Authors | Alessandra Cervone, Chandra Khatri, Rahul Goel, Behnam Hedayatnia, Anu Venkatesh, Dilek Hakkani-Tur, Raefer Gabriel |
Abstract | Current approaches to Natural Language Generation (NLG) for dialog mainly focus on domain-specific, task-oriented applications (e.g. restaurant booking) using limited ontologies (up to 20 slot types), usually without considering the previous conversation context. Furthermore, these approaches require large amounts of data for each domain, and do not benefit from examples that may be available for other domains. This work explores the feasibility of applying statistical NLG to scenarios requiring larger ontologies, such as multi-domain dialog applications or open-domain question answering (QA) based on knowledge graphs. We model NLG through an Encoder-Decoder framework using a large dataset of interactions between real-world users and a conversational agent for open-domain QA. First, we investigate the impact of increasing the number of slot types on the generation quality and experiment with different partitions of the QA data with progressively larger ontologies (up to 369 slot types). Second, we perform multi-task learning experiments between open-domain QA and task-oriented dialog, and benchmark our model on a popular NLG dataset. Moreover, we experiment with using the conversational context as an additional input to improve response generation quality. Our experiments show the feasibility of learning statistical NLG models for open-domain QA with larger ontologies. |
Tasks | Knowledge Graphs, Multi-Task Learning, Open-Domain Question Answering, Question Answering, Text Generation |
Published | 2019-03-19 |
URL | https://arxiv.org/abs/1903.08097v2 |
https://arxiv.org/pdf/1903.08097v2.pdf | |
PWC | https://paperswithcode.com/paper/natural-language-generation-at-scale-a-case |
Repo | |
Framework | |
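Encoder-decoder NLG of this kind consumes a linearized meaning representation, optionally prefixed with the conversational context. A hypothetical linearization is sketched below; the slot markers and context tags are assumptions for illustration, not the paper's input scheme.

```python
def linearize_mr(slots, context=None):
    """Flatten a meaning representation (slot -> value) into a token
    sequence for an encoder-decoder NLG model; optionally prepend the
    previous conversational turn as context."""
    tokens = []
    if context:
        tokens += ["<ctx>"] + context.split() + ["</ctx>"]
    for slot, value in slots.items():
        tokens += [f"<{slot}>"] + str(value).split() + [f"</{slot}>"]
    return tokens

print(linearize_mr({"person": "Marie Curie", "award": "Nobel Prize in Physics",
                    "year": 1903},
                   context="who won the nobel prize in physics in 1903"))
```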
$\sqrt{n}$-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank
Title | $\sqrt{n}$-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank |
Authors | Kefan Dong, Jian Peng, Yining Wang, Yuan Zhou |
Abstract | In this paper, we consider the problem of online learning of Markov decision processes (MDPs) with very large state spaces. Under the assumptions of realizable function approximation and low Bellman ranks, we develop an online learning algorithm that learns the optimal value function while at the same time achieving very low cumulative regret during the learning process. Our learning algorithm, Adaptive Value-function Elimination (AVE), is inspired by the policy elimination algorithm proposed in (Jiang et al., 2017), known as OLIVE. One of our key technical contributions in AVE is to formulate the elimination steps in OLIVE as contextual bandit problems. This technique enables us to apply the active elimination and expert weighting methods from (Dudik et al., 2011), instead of the random action exploration scheme used in the original OLIVE algorithm, for more efficient exploration and better control of the regret incurred in each policy elimination step. To the best of our knowledge, this is the first $\sqrt{n}$-regret result for reinforcement learning in stochastic MDPs with general value function approximation. |
Tasks | Efficient Exploration |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02506v2 |
https://arxiv.org/pdf/1909.02506v2.pdf | |
PWC | https://paperswithcode.com/paper/sqrtn-regret-for-learning-in-markov-decision |
Repo | |
Framework | |
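A schematic of the elimination step that OLIVE-style algorithms perform: candidate value functions whose estimated average Bellman error on observed transitions is large are discarded. AVE's actual contribution, formulating this step as a contextual bandit and applying active elimination with expert weighting, is not shown; this only sketches the underlying consistency test, with a hypothetical tolerance.

```python
import numpy as np

def eliminate(candidates, transitions, actions, gamma=0.99, tol=0.05):
    """Keep only candidate value functions q(s, a) whose estimated
    average Bellman error on transitions (s, a, r, s') is small."""
    survivors = []
    for q in candidates:
        errs = [q(s, a) - (r + gamma * max(q(s2, a2) for a2 in actions))
                for (s, a, r, s2) in transitions]
        if abs(np.mean(errs)) <= tol:
            survivors.append(q)
    return survivors
```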
The double traveling salesman problem with partial last-in-first-out loading constraints
Title | The double traveling salesman problem with partial last-in-first-out loading constraints |
Authors | Jonatas B. C. Chagas, Túlio A. M. Toffolo, Marcone J. F. Souza, Manuel Iori |
Abstract | In this paper, we introduce the Double Traveling Salesman Problem with Partial Last-In-First-Out Loading Constraints (DTSPPL), a pickup-and-delivery single-vehicle routing problem where all pickup operations must be performed before any delivery operation because the pickup and delivery areas are geographically separated. The vehicle collects items in the pickup area and loads them into its container, a horizontal stack. After performing all pickup operations, the vehicle begins delivering the items in the delivery area. Loading and unloading operations must obey a partial Last-In-First-Out (LIFO) policy, i.e., a version of the LIFO policy that may be violated within a given reloading depth. The objective of the DTSPPL is to minimize the total cost, which involves the total distance traveled by the vehicle and the number of reloaded items due to violations of the standard LIFO policy. We formally describe the DTSPPL by means of two Integer Linear Programming (ILP) formulations, and propose a heuristic algorithm based on the Biased Random-Key Genetic Algorithm (BRKGA) to find high-quality solutions. The performance of the proposed solution approaches is assessed over a broad set of instances. Computational results show that both ILP formulations are able to solve only the smaller instances, whereas the BRKGA obtains better solutions for almost all instances in shorter computational time. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08494v1 |
https://arxiv.org/pdf/1908.08494v1.pdf | |
PWC | https://paperswithcode.com/paper/the-double-traveling-salesman-problem-with |
Repo | |
Framework | |
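The BRKGA mechanics referenced above are simple to sketch: random-key chromosomes decode to permutations by argsort, and biased crossover inherits each gene from the elite parent with probability rho. The paper's real decoder additionally tracks loading positions and reloading-depth violations, which this toy omits.

```python
import numpy as np

rng = np.random.default_rng(0)

def decode_route(keys):
    """BRKGA decoder: a chromosome of random keys in [0, 1) is decoded
    into a customer visiting order by sorting the keys."""
    return np.argsort(keys)

def biased_crossover(elite, non_elite, rho=0.7):
    """Biased crossover: each gene comes from the elite parent with
    probability rho, steering the search toward good solutions."""
    mask = rng.random(elite.size) < rho
    return np.where(mask, elite, non_elite)

parent_a, parent_b = rng.random(8), rng.random(8)
print(decode_route(biased_crossover(parent_a, parent_b)))  # permutation of 0..7
```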
Fashion Retail: Forecasting Demand for New Items
Title | Fashion Retail: Forecasting Demand for New Items |
Authors | Pawan Kumar Singh, Yadunath Gupta, Nilpa Jha, Aruna Rajan |
Abstract | Fashion merchandising is one of the most complicated problems in forecasting, given the transient nature of trends in colours, prints, cuts, patterns, and materials in fashion, the economies of scale achievable only in bulk production, as well as geographical variations in consumption. Retailers that serve a large customer base spend considerable money and resources to stay prepared for changing fashion demands, and incur huge losses in unsold inventory and liquidation costs [2]. This problem has conventionally been addressed by analysts, statisticians, and ML researchers alike by building models that forecast future demand for a particular fashion item from its historical sales data. To our knowledge, none of these models generalize well to predicting future demand at an abstracted level for a new design/style of fashion article. To address this problem, we present a study of large-scale fashion sales data and directly infer which clothing/footwear attributes and merchandising factors drove demand for those items. We then build generalised models to forecast demand given new item attributes, and demonstrate robust performance by experimenting with different neural architectures, ML methods, and loss functions. |
Tasks | |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1907.01960v1 |
https://arxiv.org/pdf/1907.01960v1.pdf | |
PWC | https://paperswithcode.com/paper/fashion-retail-forecasting-demand-for-new |
Repo | |
Framework | |
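As a minimal, hedged illustration of "forecast demand given new item attributes", the sketch below fits a gradient-boosted regressor on made-up attribute rows; the feature names, data, and model choice are placeholders, not the paper's architectures or dataset.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical attribute table for past items, with demand as the target.
items = pd.DataFrame({
    "colour":   ["red", "blue", "blue", "black"],
    "material": ["cotton", "denim", "denim", "leather"],
    "cut":      ["slim", "regular", "slim", "regular"],
    "price":    [29.9, 49.9, 54.9, 99.9],
    "demand":   [120, 340, 280, 75],
})
X, y = items.drop(columns="demand"), items["demand"]

model = Pipeline([
    ("encode", ColumnTransformer(
        [("cat", OneHotEncoder(handle_unknown="ignore"),
          ["colour", "material", "cut"])],
        remainder="passthrough")),
    ("gbr", GradientBoostingRegressor()),
])
model.fit(X, y)

# Forecast demand for a brand-new design from its attributes alone.
new_item = pd.DataFrame([{"colour": "red", "material": "denim",
                          "cut": "slim", "price": 44.9}])
print(model.predict(new_item))
```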
Non-Negative Local Sparse Coding for Subspace Clustering
Title | Non-Negative Local Sparse Coding for Subspace Clustering |
Authors | Babak Hosseini, Barbara Hammer |
Abstract | Subspace sparse coding (SSC) algorithms have proven beneficial to clustering problems. They provide an alternative data representation in which the underlying structure of the clusters can be better captured. However, most research in this area focuses on enhancing the sparse coding part of the problem. In contrast, we introduce a novel objective term in our proposed SSC framework which focuses on the separability of data points in the coding space. We also provide mathematical insights into how this local-separability term improves the clustering result of the SSC framework. Our proposed non-linear local SSC algorithm (NLSSC) also benefits from the efficient choice of its sparsity terms and constraints. The NLSSC algorithm is further formulated in a kernel-based framework (NLKSSC) which can represent the nonlinear structure of data. In addition, we address the possibility of redundancies in sparse coding results and their negative effect on graph-based clustering problems. We introduce the link-restore post-processing step to improve the representation graph of non-negative SSC algorithms such as ours. Empirical evaluations on well-known clustering benchmarks show that our proposed NLSSC framework results in better clusterings than the state-of-the-art baselines, and demonstrate the effectiveness of link-restore post-processing in improving clustering accuracy by correcting broken links of the representation graph. |
Tasks | |
Published | 2019-03-12 |
URL | http://arxiv.org/abs/1903.05239v1 |
http://arxiv.org/pdf/1903.05239v1.pdf | |
PWC | https://paperswithcode.com/paper/non-negative-local-sparse-coding-for-subspace |
Repo | |
Framework | |
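For context, plain non-negative subspace sparse coding, without the paper's local-separability term, kernelization, or link-restore step, can be sketched with off-the-shelf tools: code each point over the remaining points with a non-negative lasso, then spectrally cluster the symmetrized affinity graph.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.linear_model import Lasso

def nonneg_ssc(X, alpha=0.01, n_clusters=3):
    """Baseline non-negative SSC: X is (features, samples); each column
    is coded over the others, then the coefficient graph is clustered."""
    n = X.shape[1]
    C = np.zeros((n, n))
    for i in range(n):
        idx = [j for j in range(n) if j != i]   # enforce diag(C) = 0
        lasso = Lasso(alpha=alpha, positive=True, max_iter=5000)
        lasso.fit(X[:, idx], X[:, i])
        C[idx, i] = lasso.coef_
    W = C + C.T                                 # symmetric, non-negative affinity
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(W)
```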
Learning to Explore in Motion and Interaction Tasks
Title | Learning to Explore in Motion and Interaction Tasks |
Authors | Miroslav Bogdanovic, Ludovic Righetti |
Abstract | Model-free reinforcement learning suffers from the high sampling complexity inherent to robotic manipulation or locomotion tasks. Most successful approaches typically use random sampling strategies, which lead to slow policy convergence. In this paper we present a novel approach for efficient exploration that leverages previously learned tasks. We exploit the fact that the same system is used across many tasks and build a generative model for exploration based on data from previously solved tasks, improving the learning of new tasks. The approach also enables continuous learning of improved exploration strategies as novel tasks are learned. Extensive simulations on a robot manipulator performing a variety of motion and contact interaction tasks demonstrate the capabilities of the approach. In particular, our experiments suggest that the exploration strategy can more than double learning speed, especially when rewards are sparse. Moreover, the algorithm is robust to task variations and parameter tuning, making it beneficial for complex robotic problems. |
Tasks | Efficient Exploration |
Published | 2019-08-10 |
URL | https://arxiv.org/abs/1908.03731v1 |
https://arxiv.org/pdf/1908.03731v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-explore-in-motion-and-interaction |
Repo | |
Framework | |
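One plausible reading of "generative model for exploration based on data from previously solved tasks" is a mixture density over logged state-action pairs, sampled to propose exploratory actions. The sketch below assumes that reading; the dimensions, data, and nearest-state conditioning are invented stand-ins, not the paper's exact scheme.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

STATE_DIM, ACTION_DIM = 7, 3

# Stand-in for (state, action) rows logged while solving earlier tasks.
prior_data = np.random.randn(5000, STATE_DIM + ACTION_DIM)

# Generative model of exploration behaviour that worked before.
explore_model = GaussianMixture(n_components=16).fit(prior_data)

def exploratory_action(state, n_candidates=32):
    """Sample candidate (state, action) rows from the prior and return
    the action paired with the sampled state nearest the current one."""
    cand, _ = explore_model.sample(n_candidates)
    best = np.argmin(np.linalg.norm(cand[:, :STATE_DIM] - state, axis=1))
    return cand[best, STATE_DIM:]
```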
UltraCompression: Framework for High Density Compression of Ultrasound Volumes using Physics Modeling Deep Neural Networks
Title | UltraCompression: Framework for High Density Compression of Ultrasound Volumes using Physics Modeling Deep Neural Networks |
Authors | Debarghya China, Francis Tom, Sumanth Nandamuri, Aupendu Kar, Mukundhan Srinivasan, Pabitra Mitra, Debdoot Sheet |
Abstract | Ultrasound image compression by preserving speckle-based key information is a challenging task. In this paper, we introduce an ultrasound image compression framework with the ability to retain realism of speckle appearance despite achieving very high-density compression factors. The compressor employs a tissue segmentation method, transmitting segments along with transducer frequency, number of samples and image size as essential information required for decompression. The decompressor is based on a convolutional network trained to generate patho-realistic ultrasound images which convey essential information pertinent to tissue pathology visible in the images. We demonstrate generalizability of the building blocks using two variants to build the compressor. We have evaluated the quality of decompressed images using distortion losses as well as perception loss and compared it with other off-the-shelf solutions. The proposed method achieves a compression ratio of $725:1$ while preserving the statistical distribution of speckles. This enables image segmentation on decompressed images to achieve a Dice score of $0.89 \pm 0.11$, which evidently is not so accurately achievable when images are compressed with current standards like JPEG, JPEG 2000, WebP and BPG. We envision this framework to serve as a roadmap for speckle image compression standards. |
Tasks | Image Compression, Semantic Segmentation |
Published | 2019-01-17 |
URL | http://arxiv.org/abs/1901.05880v1 |
http://arxiv.org/pdf/1901.05880v1.pdf | |
PWC | https://paperswithcode.com/paper/ultracompression-framework-for-high-density |
Repo | |
Framework | |
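The headline ratio relies on transmitting a tissue segmentation map plus a few scalars rather than raw pixels: label maps are piecewise constant, so even naive run-length coding compresses them heavily. An illustrative calculation, not the paper's codec or settings, is sketched below.

```python
import numpy as np

def run_length_encode(labels):
    """Run-length encode a flattened segmentation map; homogeneous
    tissue regions compress extremely well."""
    flat = labels.ravel()
    change = np.flatnonzero(np.diff(flat)) + 1
    starts = np.concatenate(([0], change))
    lengths = np.diff(np.concatenate((starts, [flat.size])))
    return list(zip(flat[starts], lengths))

seg = np.zeros((512, 512), dtype=np.uint8)     # 8-bit raw image: 262144 bytes
seg[100:300, 150:400] = 1                      # one synthetic tissue segment
runs = run_length_encode(seg)
raw_bytes, coded_bytes = seg.size, 3 * len(runs)   # ~3 bytes per run
print(f"{raw_bytes / coded_bytes:.0f}:1")      # already hundreds-to-one here
```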
A concise guide to existing and emerging vehicle routing problem variants
Title | A concise guide to existing and emerging vehicle routing problem variants |
Authors | Thibaut Vidal, Gilbert Laporte, Piotr Matl |
Abstract | Vehicle routing problems have been the focus of extensive research over the past sixty years, driven by their economic importance and their theoretical interest. The diversity of applications has motivated the study of a myriad of problem variants with different attributes. In this article, we provide a brief survey of existing and emerging problem variants. Models are typically refined along three lines: considering more relevant objectives and performance metrics, integrating vehicle routing evaluations with other tactical decisions, and capturing fine-grained yet essential aspects of modern supply chains. We organize the main problem attributes within this structured framework. We discuss recent research directions and pinpoint current shortcomings, recent successes, and emerging challenges. |
Tasks | |
Published | 2019-06-16 |
URL | https://arxiv.org/abs/1906.06750v1 |
https://arxiv.org/pdf/1906.06750v1.pdf | |
PWC | https://paperswithcode.com/paper/a-concise-guide-to-existing-and-emerging |
Repo | |
Framework | |
Non-Parametric Structure Learning on Hidden Tree-Shaped Distributions
Title | Non-Parametric Structure Learning on Hidden Tree-Shaped Distributions |
Authors | Konstantinos E. Nikolakakis, Dionysios S. Kalogerias, Anand D. Sarwate |
Abstract | We provide high probability sample complexity guarantees for non-parametric structure learning of tree-shaped graphical models whose nodes are discrete random variables with a finite or countable alphabet, both in the noiseless and noisy regimes. First, we introduce a new, fundamental quantity called the (noisy) information threshold, which arises naturally from the error analysis of the Chow-Liu algorithm and characterizes not only the sample complexity, but also the inherent impact of the noise on the structure learning task, without explicit assumptions on the distribution of the model. This allows us to present the first non-parametric, high-probability finite sample complexity bounds on tree-structure learning from potentially noise-corrupted data. In particular, for a number of nodes $p$, success rate $1-\delta$, and a fixed value of the information threshold, our sample complexity bounds for exact structure recovery are of the order of $\mathcal{O}\big(\log^{1+\zeta} (p/\delta)\big)$, for all $\zeta>0$, for both noiseless and noisy settings. Subsequently, we apply our results to two classes of hidden models, namely, the $M$-ary erasure channel and the generalized symmetric channel, illustrating the usefulness and importance of our framework. As a byproduct of our analysis, this paper resolves the open problem of tree structure learning in the presence of non-identically distributed observation noise, providing explicit conditions on the convergence of the Chow-Liu algorithm under this setting as well. |
Tasks | |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09596v1 |
https://arxiv.org/pdf/1909.09596v1.pdf | |
PWC | https://paperswithcode.com/paper/non-parametric-structure-learning-on-hidden |
Repo | |
Framework | |
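The analysis is built around the Chow-Liu algorithm, which the paper takes as given: estimate pairwise mutual information from samples and keep a maximum-weight spanning tree. A noiseless, small-alphabet sketch (the paper's contribution is the sample-complexity and noisy-regime analysis, not this procedure):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def chow_liu_tree(X):
    """Chow-Liu on discrete samples: columns of X are variables; returns
    the edges of a maximum-mutual-information spanning tree."""
    n, p = X.shape
    mi = np.zeros((p, p))
    for i in range(p):
        for j in range(i + 1, p):
            counts = {}
            for a, b in zip(X[:, i], X[:, j]):
                counts[(a, b)] = counts.get((a, b), 0) + 1
            pi = {a: np.mean(X[:, i] == a) for a in np.unique(X[:, i])}
            pj = {b: np.mean(X[:, j] == b) for b in np.unique(X[:, j])}
            mi[i, j] = sum((c / n) * np.log(c / (n * pi[a] * pj[b]))
                           for (a, b), c in counts.items())
    # SciPy computes minimum spanning trees, so negate the MI weights.
    tree = minimum_spanning_tree(-(mi + mi.T))
    return list(zip(*tree.nonzero()))
```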
Geometric structure of graph Laplacian embeddings
Title | Geometric structure of graph Laplacian embeddings |
Authors | Nicolas Garcia Trillos, Franca Hoffmann, Bamdad Hosseini |
Abstract | We analyze the spectral clustering procedure for identifying coarse structure in a data set $x_1, \dots, x_n$, and in particular study the geometry of graph Laplacian embeddings which form the basis for spectral clustering algorithms. More precisely, we assume that the data is sampled from a mixture model supported on a manifold $\mathcal{M}$ embedded in $\mathbb{R}^d$, and pick a connectivity length-scale $\varepsilon>0$ to construct a kernelized graph Laplacian. We introduce a notion of a well-separated mixture model which only depends on the model itself, and prove that when the model is well separated, with high probability the embedded data set concentrates on cones that are centered around orthogonal vectors. Our results are meaningful in the regime where $\varepsilon = \varepsilon(n)$ is allowed to decay to zero at a slow enough rate as the number of data points grows. This rate depends on the intrinsic dimension of the manifold on which the data is supported. |
Tasks | |
Published | 2019-01-30 |
URL | http://arxiv.org/abs/1901.10651v1 |
http://arxiv.org/pdf/1901.10651v1.pdf | |
PWC | https://paperswithcode.com/paper/geometric-structure-of-graph-laplacian |
Repo | |
Framework | |
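The embedding under study is the standard one: a Gaussian kernel at connectivity length-scale $\varepsilon$, a normalized graph Laplacian, and its bottom eigenvectors as coordinates; spectral clustering then runs k-means on those rows, which, for well-separated mixtures, concentrate on nearly orthogonal cones. A dense numpy sketch with symmetric normalization assumed:

```python
import numpy as np

def laplacian_embedding(X, eps, k):
    """Kernelized graph Laplacian embedding of data rows X (n x d)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / eps ** 2)                       # Gaussian kernel graph
    d = W.sum(1)
    L = np.eye(len(X)) - W / np.sqrt(np.outer(d, d)) # I - D^{-1/2} W D^{-1/2}
    vals, vecs = np.linalg.eigh(L)                   # ascending eigenvalues
    return vecs[:, :k]                               # k smallest: the embedding
```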
MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models
Title | MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models |
Authors | Boyuan Pan, Yazheng Yang, Hao Li, Zhou Zhao, Yueting Zhuang, Deng Cai, Xiaofei He |
Abstract | Machine Comprehension (MC) is one of the core problems in natural language processing, requiring both understanding of natural language and knowledge about the world. Rapid progress has been made since the release of several benchmark datasets, and recently state-of-the-art models have even surpassed human performance on the well-known SQuAD evaluation. In this paper, we transfer knowledge learned from machine comprehension to sequence-to-sequence tasks to deepen the understanding of the text. We propose MacNet: a novel encoder-decoder supplementary architecture to the widely used attention-based sequence-to-sequence models. Experiments on neural machine translation (NMT) and abstractive text summarization show that our proposed framework can significantly improve the performance of the baseline models, and our method for abstractive text summarization achieves state-of-the-art results on the Gigaword dataset. |
Tasks | Abstractive Text Summarization, Machine Translation, Reading Comprehension, Text Summarization |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1908.01816v1 |
https://arxiv.org/pdf/1908.01816v1.pdf | |
PWC | https://paperswithcode.com/paper/macnet-transferring-knowledge-from-machine-1 |
Repo | |
Framework | |
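A hedged sketch of the transfer pattern the abstract describes: freeze an MC-pretrained encoder and fuse its states with the task encoder's before attention/decoding. The GRU-style encoder interface, fusion by concatenation, and layer sizes are assumptions for illustration, not MacNet's exact architecture.

```python
import torch
import torch.nn as nn

class AugmentedEncoder(nn.Module):
    """Fuse a frozen, MC-pretrained encoder with a trainable task
    encoder; both are assumed to be nn.GRU-style modules returning
    (outputs, hidden)."""
    def __init__(self, task_encoder, mc_encoder, d_task, d_mc, d_out):
        super().__init__()
        self.task_encoder = task_encoder
        self.mc_encoder = mc_encoder
        for p in self.mc_encoder.parameters():   # keep transferred knowledge fixed
            p.requires_grad = False
        self.fuse = nn.Linear(d_task + d_mc, d_out)

    def forward(self, src_embeddings):
        h_task, _ = self.task_encoder(src_embeddings)
        with torch.no_grad():
            h_mc, _ = self.mc_encoder(src_embeddings)
        return self.fuse(torch.cat([h_task, h_mc], dim=-1))
```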
Automatic Construction of Multi-layer Perceptron Network from Streaming Examples
Title | Automatic Construction of Multi-layer Perceptron Network from Streaming Examples |
Authors | Mahardhika Pratama, Choiru Za’in, Andri Ashfahani, Yew Soon Ong, Weiping Ding |
Abstract | Autonomous construction of deep neural networks (DNNs) is desirable for data streams because it potentially offers two advantages: proper model capacity and quick reaction to drift and shift. While the self-organizing mechanism of DNNs remains an open issue, the task is even more challenging for standard multi-layer DNNs than for different-depth structures, because adding a new layer causes information loss of previously trained knowledge. A Neural Network with Dynamically Evolved Capacity (NADINE) is proposed in this paper. NADINE features a fully open structure where its network structure, depth and width, can be automatically evolved from scratch in an online manner and without the use of problem-specific thresholds. NADINE is structured as a standard MLP architecture, and the catastrophic forgetting issue during the hidden-layer addition phase is resolved using the proposed soft-forgetting and adaptive memory methods. The advantages of NADINE, namely its elastic structure and online learning capability, are numerically validated on nine data stream classification and regression problems, where it demonstrates performance improvements over prominent algorithms in all problems. In addition, it is capable of dealing with data stream regression and classification problems equally well. |
Tasks | |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03437v2 |
https://arxiv.org/pdf/1910.03437v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-construction-of-multi-layer |
Repo | |
Framework | |
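A toy, self-contained stand-in for drift-driven structural evolution: a one-hidden-layer regressor that widens itself when streaming error stagnates. NADINE's actual rules (layer addition, soft-forgetting, adaptive memory) are far richer; this sketch illustrates only width growth on a stream, with an invented stagnation test.

```python
import numpy as np

rng = np.random.default_rng(0)

class GrowingMLP:
    """Toy one-hidden-layer regressor with online width growth."""
    def __init__(self, d_in, d_hidden=2, lr=0.01):
        self.W1 = rng.normal(0, 0.5, (d_in, d_hidden))
        self.W2 = rng.normal(0, 0.5, (d_hidden, 1))
        self.lr = lr

    def partial_fit(self, x, y):
        h = np.tanh(x @ self.W1)
        err = h @ self.W2 - y                           # (1, 1)
        self.W2 -= self.lr * h.T @ err                  # output-layer update
        self.W1 -= self.lr * x.T @ ((err @ self.W2.T) * (1 - h ** 2))
        return float(abs(err))

    def grow(self):
        """Append one hidden unit; its output weight starts at zero so
        growth does not disturb what was already learned."""
        self.W1 = np.hstack([self.W1, rng.normal(0, 0.5, (self.W1.shape[0], 1))])
        self.W2 = np.vstack([self.W2, np.zeros((1, 1))])

net, recent = GrowingMLP(d_in=3), []
for t in range(2000):
    x = rng.normal(size=(1, 3))
    y = np.sin(x.sum(keepdims=True))
    recent.append(net.partial_fit(x, y))
    if len(recent) >= 200 and \
       np.mean(recent[-100:]) > 0.95 * np.mean(recent[-200:-100]):
        net.grow()                                      # error stagnated: widen
        recent.clear()
```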
GODS: Generalized One-class Discriminative Subspaces for Anomaly Detection
Title | GODS: Generalized One-class Discriminative Subspaces for Anomaly Detection |
Authors | Jue Wang, Anoop Cherian |
Abstract | One-class learning is the classic problem of fitting a model to data for which annotations are available only for a single class. In this paper, we propose a novel objective for one-class learning. Our key idea is to use a pair of orthonormal frames, as subspaces, to “sandwich” the labeled data by jointly optimizing two objectives: i) minimize the distance between the origins of the two subspaces, and ii) maximize the margin between the hyperplanes and the data, with the two subspaces demanding the data to lie in their positive and negative orthants, respectively. The proposed objective leads to a non-convex optimization problem, which we solve via Riemannian optimization, deriving an efficient conjugate gradient scheme on the Stiefel manifold. To study the effectiveness of our scheme, we propose a new dataset, Dash-Cam-Pose, consisting of clips with skeleton poses of humans seated in a car; the task is to classify the clips as normal or abnormal, the latter being when any human pose is out-of-position with regard to, say, an airbag deployment. Our experiments on the proposed Dash-Cam-Pose dataset, as well as several other standard anomaly/novelty detection benchmarks, demonstrate the benefits of our scheme, achieving state-of-the-art one-class accuracy. |
Tasks | Anomaly Detection |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.05884v1 |
https://arxiv.org/pdf/1908.05884v1.pdf | |
PWC | https://paperswithcode.com/paper/gods-generalized-one-class-discriminative |
Repo | |
Framework | |
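Reading the two stated objectives literally gives a loss of roughly the following shape. The use of the offset vectors as stand-ins for the subspace "origins", the unit hinge margins, and the trade-off weight are all assumptions for illustration; the paper's exact formulation and its Stiefel-manifold conjugate-gradient solver are not reproduced here.

```python
import numpy as np

def gods_style_objective(W1, b1, W2, b2, X, eta=1.0):
    """Sketch for one-class data X (d x n): two orthonormal frames
    W1, W2 (d x k, points on the Stiefel manifold) with offsets b1, b2
    'sandwich' the data. Hinge losses push W1ᵀx + b1 into the positive
    orthant and W2ᵀx + b2 into the negative orthant, while the offsets
    (standing in for the subspace origins) are pulled together."""
    margin_pos = np.maximum(0.0, 1.0 - (W1.T @ X + b1[:, None])).sum()
    margin_neg = np.maximum(0.0, 1.0 + (W2.T @ X + b2[:, None])).sum()
    origin_gap = np.linalg.norm(b1 - b2) ** 2
    return origin_gap + eta * (margin_pos + margin_neg)
```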