Paper Group ANR 155
Idle Time Optimization for Target Assignment and Path Finding in Sortation Centers. Optimizing Shallow Networks for Binary Classification. Code-Switching for Enhancing NMT with Pre-Specified Translation. Convergence rates for the stochastic gradient descent method for non-convex objective functions. Learning Metric Graphs for Neuron Segmentation In …
Idle Time Optimization for Target Assignment and Path Finding in Sortation Centers
Title | Idle Time Optimization for Target Assignment and Path Finding in Sortation Centers |
Authors | Ngai Meng Kou, Cheng Peng, Hang Ma, T. K. Satish Kumar, Sven Koenig |
Abstract | In this paper, we study the one-shot and lifelong versions of the Target Assignment and Path Finding problem in automated sortation centers, where each agent needs to constantly assign itself a sorting station, move to its assigned station without colliding with obstacles or other agents, wait in the queue of that station to obtain a parcel for delivery, and then deliver the parcel to a sorting bin. The throughput of such centers is largely determined by the total idle time of all stations since their queues can frequently become empty. To address this problem, we first formalize and study the one-shot version that assigns stations to a set of agents and finds collision-free paths for the agents to their assigned stations. We present efficient algorithms for this task based on a novel min-cost max-flow formulation that minimizes the total idle time of all stations in a fixed time window. We then demonstrate how our algorithms for solving the one-shot problem can be applied to solving the lifelong problem as well. Experimentally, we believe we are the first researchers to consider real-world automated sortation centers using an industrial simulator with realistic data and a kinodynamic model of real robots. On this simulator, we showcase the benefits of our algorithms by demonstrating their efficiency and effectiveness for up to 350 agents. |
Tasks | |
Published | 2019-11-30 |
URL | https://arxiv.org/abs/1912.00253v1 |
https://arxiv.org/pdf/1912.00253v1.pdf | |
PWC | https://paperswithcode.com/paper/idle-time-optimization-for-target-assignment |
Repo | |
Framework | |
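The min-cost max-flow formulation in the entry above assigns agents to stations so that the total station idle time over a fixed time window is minimized. The toy sketch below is not the paper's formulation; it only illustrates the flow-network pattern, with hypothetical agents, stations, queue capacities, and travel-time edge weights standing in for the idle-time costs.

```python
# Toy illustration (not the paper's exact model): assign agents to sorting
# stations with a min-cost flow, where edge weights stand in for estimated
# arrival times that drive station idle time.
import networkx as nx

agents = ["a1", "a2", "a3"]
stations = {"s1": 2, "s2": 1}             # hypothetical queue capacities
arrival_time = {                           # hypothetical estimated travel times
    ("a1", "s1"): 4, ("a1", "s2"): 7,
    ("a2", "s1"): 3, ("a2", "s2"): 2,
    ("a3", "s1"): 6, ("a3", "s2"): 5,
}

G = nx.DiGraph()
G.add_node("source", demand=-len(agents))  # one unit of flow per agent
G.add_node("sink", demand=len(agents))
for a in agents:
    G.add_edge("source", a, capacity=1, weight=0)
for (a, s), t in arrival_time.items():
    G.add_edge(a, s, capacity=1, weight=t)
for s, cap in stations.items():
    G.add_edge(s, "sink", capacity=cap, weight=0)

flow = nx.min_cost_flow(G)
assignment = {a: s for a in agents for s, f in flow[a].items() if f > 0}
print(assignment)   # e.g. {'a1': 's1', 'a2': 's2', 'a3': 's1'}
```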
Optimizing Shallow Networks for Binary Classification
Title | Optimizing Shallow Networks for Binary Classification |
Authors | Kalliopi Basioti, George V. Moustakides |
Abstract | Data-driven classification that relies on neural networks is based on optimization criteria that involve some form of distance between the output of the network and the desired label. Using the same mathematical analysis, for a multitude of such measures, we can show that their optimum solution matches the ideal likelihood ratio test classifier. In this work we introduce a different family of optimization problems which is not covered by the existing approaches and, therefore, opens possibilities for new training algorithms for neural network based classification. We give examples that lead to algorithms that are simple in implementation, exhibit stable convergence characteristics and are competitive with the most popular existing techniques. |
Tasks | |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10161v2 |
https://arxiv.org/pdf/1905.10161v2.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-shallow-networks-for-binary |
Repo | |
Framework | |
Code-Switching for Enhancing NMT with Pre-Specified Translation
Title | Code-Switching for Enhancing NMT with Pre-Specified Translation |
Authors | Kai Song, Yue Zhang, Heng Yu, Weihua Luo, Kun Wang, Min Zhang |
Abstract | Leveraging user-provided translation to constrain NMT has practical significance. Existing methods can be classified into two main categories, namely the use of placeholder tags for lexicon words and the use of hard constraints during decoding. Both methods can hurt translation fidelity for various reasons. We investigate a data augmentation method, making code-switched training data by replacing source phrases with their target translations. Our method does not change the NMT model or decoding algorithm, allowing the model to learn lexicon translations by copying source-side target words. Extensive experiments show that our method achieves consistent improvements over existing approaches, improving translation of constrained words without hurting unconstrained words. |
Tasks | Data Augmentation |
Published | 2019-04-19 |
URL | https://arxiv.org/abs/1904.09107v4 |
https://arxiv.org/pdf/1904.09107v4.pdf | |
PWC | https://paperswithcode.com/paper/code-switching-for-enhancing-nmt-with-pre |
Repo | |
Framework | |
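A minimal sketch of the code-switching augmentation idea described above: source phrases covered by a user lexicon are replaced in the training data by their target translations, and the unchanged NMT model learns to copy them. The lexicon, sentence pair, and string-level replacement below are illustrative assumptions, not the authors' preprocessing.

```python
# Hypothetical lexicon and sentence pair for illustration only.
lexicon = {"neural network": "neuronales Netz", "translation": "Übersetzung"}

def code_switch(source: str, lexicon: dict) -> str:
    """Replace lexicon source phrases with their target-side translations."""
    out = source
    for src_phrase, tgt_phrase in lexicon.items():
        out = out.replace(src_phrase, tgt_phrase)
    return out

pair = ("the neural network improves translation quality",
        "das neuronale Netz verbessert die Übersetzungsqualität")
augmented_source = code_switch(pair[0], lexicon)
# Train the unchanged NMT model on (augmented_source, pair[1]) alongside the
# original data; no decoding-time constraints are needed.
print(augmented_source)
```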
Convergence rates for the stochastic gradient descent method for non-convex objective functions
Title | Convergence rates for the stochastic gradient descent method for non-convex objective functions |
Authors | Benjamin Fehrman, Benjamin Gess, Arnulf Jentzen |
Abstract | We prove the local convergence to minima and estimates on the rate of convergence for the stochastic gradient descent method in the case of not necessarily globally convex nor contracting objective functions. In particular, the results are applicable to simple objective functions arising in machine learning. |
Tasks | |
Published | 2019-04-02 |
URL | https://arxiv.org/abs/1904.01517v2 |
https://arxiv.org/pdf/1904.01517v2.pdf | |
PWC | https://paperswithcode.com/paper/convergence-rates-for-the-stochastic-gradient |
Repo | |
Framework | |
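For orientation only, the stochastic gradient descent iteration analyzed in this line of work has the familiar form below; the paper's precise assumptions and rate statements are not reproduced here, and the notation is illustrative.

```latex
% Setting (illustrative notation): objective F, unbiased stochastic gradients
% \nabla_\theta f(\cdot, X), and a decreasing step-size sequence (\gamma_n),
% typically \gamma_n = c / n^\alpha with \alpha \in (1/2, 1].
\[
  \theta_{n+1} \;=\; \theta_n \;-\; \gamma_n \,\nabla_\theta f(\theta_n, X_{n+1}),
  \qquad
  \mathbb{E}\bigl[\nabla_\theta f(\theta, X)\bigr] \;=\; \nabla F(\theta).
\]
```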
Learning Metric Graphs for Neuron Segmentation In Electron Microscopy Images
Title | Learning Metric Graphs for Neuron Segmentation In Electron Microscopy Images |
Authors | Kyle Luther, H. Sebastian Seung |
Abstract | In the deep metric learning approach to image segmentation, a convolutional net densely generates feature vectors at the pixels of an image. Pairs of feature vectors are trained to be similar or different, depending on whether the corresponding pixels belong to the same or different ground truth segments. To segment a new image, the feature vectors are computed and clustered. Both empirically and theoretically, it is unclear whether or when deep metric learning is superior to the more conventional approach of directly predicting an affinity graph with a convolutional net. We compare the two approaches using brain images from serial section electron microscopy, which constitute an especially challenging example of instance segmentation. We first show that seed-based postprocessing of the feature vectors, as originally proposed, produces inferior accuracy because it is difficult for the convolutional net to predict feature vectors that remain uniform across large objects. Then we consider postprocessing by thresholding a nearest neighbor graph followed by connected components. In this case, segmentations from a “metric graph” turn out to be competitive or even superior to segmentations from a directly predicted affinity graph. To explain these findings theoretically, we invoke the property that the metric function satisfies the triangle inequality. We then show, with an example, how this constraint suppresses noise, causing connected components to more robustly segment a metric graph than an unconstrained affinity graph. |
Tasks | Electron Microscopy Image Segmentation, Instance Segmentation, Metric Learning, Semantic Segmentation |
Published | 2019-01-31 |
URL | http://arxiv.org/abs/1902.00100v1 |
http://arxiv.org/pdf/1902.00100v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-metric-graphs-for-neuron |
Repo | |
Framework | |
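A rough sketch of the "metric graph" postprocessing described above: build a graph over pixels, keep edges whose feature-vector distance falls below a threshold, and take connected components as segments. The toy embeddings, 4-connectivity, and threshold are assumptions for illustration.

```python
# Threshold a nearest-neighbor graph of pixel feature distances, then take
# connected components as the segmentation.
import numpy as np
import networkx as nx

H, W, D = 4, 4, 8
rng = np.random.default_rng(0)
features = rng.normal(size=(H, W, D))      # stand-in for a conv-net's output
threshold = 3.0                             # hypothetical distance threshold

G = nx.Graph()
G.add_nodes_from((y, x) for y in range(H) for x in range(W))
for y in range(H):
    for x in range(W):
        for dy, dx in ((0, 1), (1, 0)):     # 4-connected neighbors
            ny, nx_ = y + dy, x + dx
            if ny < H and nx_ < W:
                dist = np.linalg.norm(features[y, x] - features[ny, nx_])
                if dist < threshold:        # keep edge only if metrically close
                    G.add_edge((y, x), (ny, nx_))

segmentation = np.zeros((H, W), dtype=int)
for seg_id, component in enumerate(nx.connected_components(G), start=1):
    for (y, x) in component:
        segmentation[y, x] = seg_id
print(segmentation)
```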
Repurposing Decoder-Transformer Language Models for Abstractive Summarization
Title | Repurposing Decoder-Transformer Language Models for Abstractive Summarization |
Authors | Luke de Oliveira, Alfredo Láinez Rodrigo |
Abstract | Neural network models have shown excellent fluency and performance when applied to abstractive summarization. Many approaches to neural abstractive summarization involve the introduction of significant inductive bias, exemplified through the use of components such as pointer-generator architectures, coverage, and partially extractive procedures, designed to mimic the process by which humans summarize documents. We show that it is possible to attain competitive performance by instead directly viewing summarization as a language modeling problem and effectively leveraging transfer learning. We introduce a simple procedure built upon decoder-transformers to obtain highly competitive ROUGE scores for summarization performance using a language modeling loss alone, with no beam-search or other decoding-time optimization, and instead relying on efficient nucleus sampling and greedy decoding. |
Tasks | Abstractive Text Summarization, Language Modelling, Transfer Learning |
Published | 2019-09-01 |
URL | https://arxiv.org/abs/1909.00325v1 |
https://arxiv.org/pdf/1909.00325v1.pdf | |
PWC | https://paperswithcode.com/paper/repurposing-decoder-transformer-language |
Repo | |
Framework | |
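A sketch of the language-modeling view of summarization: condition a decoder-only transformer on the document plus a separator and decode the continuation with nucleus sampling. A vanilla GPT-2 checkpoint from the Hugging Face transformers library and a "TL;DR:" separator are used here as stand-ins; they are not the authors' fine-tuned model or exact setup.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

document = "The quick brown fox jumped over the lazy dog near the river bank."
prompt = document + " TL;DR: "             # ad-hoc separator, not the paper's
input_ids = tokenizer.encode(prompt, return_tensors="pt")

output = model.generate(
    input_ids,
    max_length=input_ids.shape[1] + 30,    # generate up to 30 summary tokens
    do_sample=True,                        # nucleus sampling ...
    top_p=0.9,                             # ... keep the top 90% probability mass
    top_k=0,
)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```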
Interpretable Word Embeddings via Informative Priors
Title | Interpretable Word Embeddings via Informative Priors |
Authors | Miriam Hurtado Bodell, Martin Arvidsson, Måns Magnusson |
Abstract | Word embeddings have demonstrated strong performance on NLP tasks. However, lack of interpretability and the unsupervised nature of word embeddings have limited their use within computational social science and digital humanities. We propose the use of informative priors to create interpretable and domain-informed dimensions for probabilistic word embeddings. Experimental results show that sensible priors can capture latent semantic concepts better than or on-par with the current state of the art, while retaining the simplicity and generalizability of using priors. |
Tasks | Word Embeddings |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.01459v1 |
https://arxiv.org/pdf/1909.01459v1.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-word-embeddings-via-informative |
Repo | |
Framework | |
Density Estimation and Incremental Learning of Latent Vector for Generative Autoencoders
Title | Density Estimation and Incremental Learning of Latent Vector for Generative Autoencoders |
Authors | Jaeyoung Yoo, Hojun Lee, Nojun Kwak |
Abstract | In this paper, we address the image generation task with the autoencoder, a representative latent-variable model. Unlike many studies that regularize the latent variable’s distribution by assuming a manually specified prior, we approach image generation by directly estimating the latent distribution. To do this, we introduce a ‘latent density estimator’, which captures the latent distribution explicitly, and propose its structure. In addition, we propose an incremental learning strategy for latent variables so that the autoencoder learns important features of the data by exploiting the structural characteristics of an under-complete autoencoder, without an explicit regularization term in the objective function. Through experiments, we show the effectiveness of the proposed latent density estimator and the incremental learning strategy for latent variables. We also show that our generative model generates images with improved visual quality compared to previous generative models based on autoencoders. |
Tasks | Density Estimation, Image Generation |
Published | 2019-02-12 |
URL | http://arxiv.org/abs/1902.04294v1 |
http://arxiv.org/pdf/1902.04294v1.pdf | |
PWC | https://paperswithcode.com/paper/density-estimation-and-incremental-learning |
Repo | |
Framework | |
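A hedged sketch of the overall recipe (encode, fit a density on the latent codes, sample, decode). The paper proposes a dedicated neural latent density estimator and an incremental latent-learning schedule; the Gaussian mixture and toy autoencoder below are stand-ins only.

```python
import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

class AutoEncoder(nn.Module):
    def __init__(self, dim_in=784, dim_z=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim_in, 128), nn.ReLU(), nn.Linear(128, dim_z))
        self.dec = nn.Sequential(nn.Linear(dim_z, 128), nn.ReLU(), nn.Linear(128, dim_in))

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

ae = AutoEncoder()
x = torch.rand(256, 784)                     # placeholder data
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
for _ in range(10):                          # toy reconstruction training
    recon, z = ae(x)
    loss = ((recon - x) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    codes = ae.enc(x).numpy()
density = GaussianMixture(n_components=5).fit(codes)   # stand-in density model
samples, _ = density.sample(8)
with torch.no_grad():
    generated = ae.dec(torch.tensor(samples, dtype=torch.float32))
print(generated.shape)                       # torch.Size([8, 784])
```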
An adaptive hybrid algorithm for social networks to choose groups with independent members
Title | An adaptive hybrid algorithm for social networks to choose groups with independent members |
Authors | Parham Hadikhani, Pooria Hadikhani |
Abstract | Choosing a committee with independent members in a social network can be framed as a group-selection problem in which the independence of the committee is the main selection criterion. Independence is calculated based on the social distance between group members. Although there are many solutions to the group-selection problem in social networks, such as selection of the target set or community detection, only one solution has been proposed that chooses committee members based on their independence as a measure of group performance. In this paper, a new adaptive hybrid algorithm is proposed to select the committee members that maximize the independence of the committee. This algorithm combines the particle swarm optimization (PSO) algorithm with two local search algorithms. The goal of this work is to combine exploration and exploitation to improve the efficiency of the proposed algorithm and obtain the optimal solution. To this end, an effective selection mechanism chooses a suitable local search algorithm to combine with particle swarm optimization during the search process. The results of the experimental simulations are compared with well-known and successful metaheuristic algorithms. This comparison shows that the proposed method improves group independence by at least 21%. |
Tasks | Community Detection |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.01875v1 |
https://arxiv.org/pdf/1910.01875v1.pdf | |
PWC | https://paperswithcode.com/paper/an-adaptive-hybrid-algorithm-for-social |
Repo | |
Framework | |
Deep Reinforcement Learning with Distributional Semantic Rewards for Abstractive Summarization
Title | Deep Reinforcement Learning with Distributional Semantic Rewards for Abstractive Summarization |
Authors | Siyao Li, Deren Lei, Pengda Qin, William Yang Wang |
Abstract | Deep reinforcement learning (RL) has been a commonly-used strategy for the abstractive summarization task to address both the exposure bias and non-differentiable task issues. However, the conventional reward Rouge-L simply looks for exact n-gram matches between candidates and annotated references, which inevitably makes the generated sentences repetitive and incoherent. In this paper, instead of Rouge-L, we explore the practicability of utilizing the distributional semantics to measure the matching degrees. With distributional semantics, sentence-level evaluation can be obtained, and semantically-correct phrases can also be generated without being limited to the surface form of the reference sentences. Human judgments on Gigaword and CNN/Daily Mail datasets show that our proposed distributional semantics reward (DSR) has distinct superiority in capturing the lexical and compositional diversity of natural language. |
Tasks | Abstractive Text Summarization |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00141v2 |
https://arxiv.org/pdf/1909.00141v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-with-1 |
Repo | |
Framework | |
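An illustrative stand-in for a distributional-semantics reward: score a sampled summary against the reference by cosine similarity of averaged word vectors, then use the reward in a self-critical policy-gradient update. The toy embedding table and this particular reward are assumptions, not the DSR defined in the paper.

```python
import numpy as np

def sentence_vector(tokens, embeddings, dim=50):
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def semantic_reward(candidate, reference, embeddings):
    a = sentence_vector(candidate, embeddings)
    b = sentence_vector(reference, embeddings)
    denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-8
    return float(a @ b / denom)             # semantic match, not exact n-grams

# Self-critical policy gradient (sketch): loss = -(r_sample - r_greedy) * logp
rng = np.random.default_rng(0)
embeddings = {w: rng.normal(size=50) for w in ["dogs", "bark", "canines", "howl"]}
r_sample = semantic_reward(["canines", "howl"], ["dogs", "bark"], embeddings)
r_greedy = semantic_reward(["dogs", "bark"], ["dogs", "bark"], embeddings)
advantage = r_sample - r_greedy             # scales the log-prob of the sample
print(r_sample, r_greedy, advantage)
```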
A Genetic Programming System with an Epigenetic Mechanism for Traffic Signal Control
Title | A Genetic Programming System with an Epigenetic Mechanism for Traffic Signal Control |
Authors | Esteban Ricalde |
Abstract | Traffic congestion is an increasing problem in most cities around the world. It impacts businesses as well as commuters, small cities and large ones in developing as well as developed economies. One approach to decrease urban traffic congestion is to optimize the traffic signal behaviour in order to be adaptive to changes in the traffic conditions. From the perspective of intelligent transportation systems, this optimization problem is called the traffic signal control problem and is considered a large combinatorial problem with high complexity and uncertainty. A novel approach to the traffic signal control problem is proposed in this thesis. The approach includes a new mechanism for Genetic Programming inspired by Epigenetics. Epigenetic mechanisms play an important role in biological processes such as phenotype differentiation, memory consolidation within generations and environmentally induced epigenetic modification of behaviour. These properties lead us to consider the implementation of epigenetic mechanisms as a way to improve the performance of Evolutionary Algorithms in solving real-world problems with dynamic environmental changes, such as the traffic signal control problem. The proposed epigenetic mechanism was evaluated in four traffic scenarios with different properties and traffic conditions using two microscopic simulators. The results of these experiments indicate that Genetic Programming was able to generate competitive actuated traffic signal controllers for all the scenarios tested. Furthermore, the use of the epigenetic mechanism improved the performance of Genetic Programming in all the scenarios. The evolved controllers adapt to modifications in the traffic density and require less monitoring and less human interaction than other solutions because they dynamically adjust the signal behaviour depending on the local traffic conditions at each intersection. |
Tasks | |
Published | 2019-03-09 |
URL | http://arxiv.org/abs/1903.03854v1 |
http://arxiv.org/pdf/1903.03854v1.pdf | |
PWC | https://paperswithcode.com/paper/a-genetic-programming-system-with-an |
Repo | |
Framework | |
Robo-advising: Learning Investors’ Risk Preferences via Portfolio Choices
Title | Robo-advising: Learning Investors’ Risk Preferences via Portfolio Choices |
Authors | Humoud Alsabah, Agostino Capponi, Octavio Ruiz Lacedelli, Matt Stern |
Abstract | We introduce a reinforcement learning framework for retail robo-advising. The robo-advisor does not know the investor’s risk preference, but learns it over time by observing her portfolio choices in different market environments. We develop an exploration-exploitation algorithm which trades off costly solicitations of portfolio choices by the investor with autonomous trading decisions based on stale estimates of investor’s risk aversion. We show that the algorithm’s value function converges to the optimal value function of an omniscient robo-advisor over a number of periods that is polynomial in the state and action space. By correcting for the investor’s mistakes, the robo-advisor may outperform a stand-alone investor, regardless of the investor’s opportunity cost for making portfolio decisions. |
Tasks | |
Published | 2019-11-05 |
URL | https://arxiv.org/abs/1911.02067v2 |
https://arxiv.org/pdf/1911.02067v2.pdf | |
PWC | https://paperswithcode.com/paper/robo-advising-learning-investors-risk |
Repo | |
Framework | |
Complex networks based word embeddings
Title | Complex networks based word embeddings |
Authors | Nicolas Dugué, Victor Connes |
Abstract | Most of the time, the first step to learn word embeddings is to build a word co-occurrence matrix. As such matrices are equivalent to graphs, complex networks theory can naturally be used to deal with such data. In this paper, we consider applying community detection, a main tool of this field, to the co-occurrence matrix corresponding to a huge corpus. Community structure is used as a way to reduce the dimensionality of the initial space. Using this community structure, we propose a method to extract word embeddings that are comparable to the state-of-the-art approaches. |
Tasks | Community Detection, Word Embeddings |
Published | 2019-10-03 |
URL | https://arxiv.org/abs/1910.01489v1 |
https://arxiv.org/pdf/1910.01489v1.pdf | |
PWC | https://paperswithcode.com/paper/complex-networks-based-word-embeddings |
Repo | |
Framework | |
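A sketch of the pipeline described above: build a word co-occurrence graph, detect communities, and describe each word by how strongly it co-occurs with each community. The toy corpus, sentence-level window, and greedy modularity maximization (a stand-in for whichever community detection method the authors use) are illustrative choices.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities
from collections import Counter
from itertools import combinations

corpus = [["cats", "chase", "mice"], ["dogs", "chase", "cats"], ["mice", "eat", "cheese"]]

cooc = Counter()
for sentence in corpus:                      # whole sentence as the window
    for u, v in combinations(set(sentence), 2):
        cooc[tuple(sorted((u, v)))] += 1

G = nx.Graph()
for (u, v), w in cooc.items():
    G.add_edge(u, v, weight=w)

communities = list(greedy_modularity_communities(G, weight="weight"))

def embed(word):
    """Word vector: one dimension per community, counting weighted co-occurrences."""
    vec = [0] * len(communities)
    for neighbor in G.neighbors(word):
        for k, comm in enumerate(communities):
            if neighbor in comm:
                vec[k] += G[word][neighbor]["weight"]
    return vec

print({w: embed(w) for w in G.nodes})
```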
The Power of Communities: A Text Classification Model with Automated Labeling Process Using Network Community Detection
Title | The Power of Communities: A Text Classification Model with Automated Labeling Process Using Network Community Detection |
Authors | Minjun Kim, Hiroki Sayama |
Abstract | Text classification is one of the most critical areas in machine learning and artificial intelligence research. It has been actively adopted in many business applications such as conversational intelligence systems, news article categorization, sentiment analysis, emotion detection systems, and many other recommendation systems in our daily life. One of the problems in supervised text classification models is that the models’ performance depends heavily on the quality of data labeling that is typically done by humans. In this study, we propose a new network community detection-based approach to automatically label and classify text data into multiclass value spaces. Specifically, we build networks with sentences as the network nodes and pairwise cosine similarities between the Term Frequency-Inverse Document Frequency (TF-IDF) vector representations of the sentences as the network link weights. We use the Louvain method to detect the communities in the sentence networks. We train and test the Support Vector Machine and the Random Forest models on both the human-labeled data and the network community detection labeled data. Results showed that models trained on the data labeled by network community detection outperformed the models trained on the human-labeled data by 2.68-3.75% in classification accuracy. Our method may help the development of more accurate conversational intelligence and other text classification systems. |
Tasks | Community Detection, Recommendation Systems, Sentiment Analysis, Text Classification |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11706v3 |
https://arxiv.org/pdf/1909.11706v3.pdf | |
PWC | https://paperswithcode.com/paper/the-power-of-communities-a-text |
Repo | |
Framework | |
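A sketch of the automated labeling pipeline described above: TF-IDF sentence vectors, pairwise cosine similarities as edge weights, Louvain communities as automatic class labels, and a classifier trained on those labels. The sentences, the edge-keeping rule, and the use of networkx's Louvain implementation (networkx >= 2.8) with a linear SVM are assumptions for illustration.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.svm import LinearSVC

sentences = [
    "stocks rallied as the markets opened higher",
    "the central bank said markets expect higher rates",
    "the team won the championship game",
    "the striker scored twice in the final game",
]

X = TfidfVectorizer().fit_transform(sentences)
sim = cosine_similarity(X)

G = nx.Graph()
G.add_nodes_from(range(len(sentences)))
for i in range(len(sentences)):
    for j in range(i + 1, len(sentences)):
        if sim[i, j] > 0:                    # keep positive-similarity edges
            G.add_edge(i, j, weight=float(sim[i, j]))

communities = louvain_communities(G, weight="weight", seed=0)
labels = [next(k for k, c in enumerate(communities) if i in c) for i in range(len(sentences))]

clf = LinearSVC().fit(X, labels)             # train on the auto-generated labels
print(labels, clf.predict(X))
```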
MineGAN: effective knowledge transfer from GANs to target domains with few images
Title | MineGAN: effective knowledge transfer from GANs to target domains with few images |
Authors | Yaxing Wang, Abel Gonzalez-Garcia, David Berga, Luis Herranz, Fahad Shahbaz Khan, Joost van de Weijer |
Abstract | One of the attractive characteristics of deep neural networks is their ability to transfer knowledge obtained in one domain to other related domains. As a result, high-quality networks can be trained in domains with relatively little training data. This property has been extensively studied for discriminative networks but has received significantly less attention for generative models. Given the often enormous effort required to train GANs, both computationally and in terms of dataset collection, the re-use of pretrained GANs is a desirable objective. We propose a novel knowledge transfer method for generative models based on mining the knowledge that is most beneficial to a specific target domain, either from a single or multiple pretrained GANs. This is done using a miner network that identifies which part of the generative distribution of each pretrained GAN outputs samples closest to the target domain. Mining effectively steers GAN sampling towards suitable regions of the latent space, which facilitates subsequent finetuning and avoids pathologies of other methods such as mode collapse and lack of flexibility. We perform experiments on several complex datasets using various GAN architectures (BigGAN, Progressive GAN) and show that the proposed method, called MineGAN, effectively transfers knowledge to domains with few target images, outperforming existing methods. In addition, MineGAN can successfully transfer knowledge from multiple pretrained GANs. Our code is available at: \url{https://github.com/yaxingwang/MineGAN}. |
Tasks | Transfer Learning |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05270v2 |
https://arxiv.org/pdf/1912.05270v2.pdf | |
PWC | https://paperswithcode.com/paper/minegan-effective-knowledge-transfer-from |
Repo | |
Framework | |
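A skeleton of the miner idea: a small network maps input noise into the latent space of a frozen, pretrained generator so that generated samples drift toward the target domain, trained adversarially against a target-domain critic. The placeholder generator, critic, and non-saturating losses below are illustrative, not the BigGAN or Progressive GAN setups used in the paper.

```python
import torch
import torch.nn as nn

class Miner(nn.Module):
    """Maps noise u to a latent code z in the pretrained generator's space."""
    def __init__(self, dim_u=64, dim_z=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim_u, 256), nn.ReLU(), nn.Linear(256, dim_z))

    def forward(self, u):
        return self.net(u)

pretrained_G = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 784))  # placeholder
for p in pretrained_G.parameters():
    p.requires_grad_(False)                  # the pretrained generator stays frozen

critic = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 1))
miner = Miner()
opt_m = torch.optim.Adam(miner.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(critic.parameters(), lr=1e-4)

target_batch = torch.rand(32, 784)           # placeholder target-domain images
u = torch.randn(32, 64)
fake = pretrained_G(miner(u))

# One illustrative step of non-saturating GAN training for critic and miner.
d_loss = (nn.functional.softplus(critic(fake.detach())) +
          nn.functional.softplus(-critic(target_batch))).mean()
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

m_loss = nn.functional.softplus(-critic(pretrained_G(miner(u)))).mean()
opt_m.zero_grad(); m_loss.backward(); opt_m.step()
```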