Paper Group ANR 772
Learning to Optimize Neural Nets. Query-guided Regression Network with Context Policy for Phrase Grounding. Parameter Reference Loss for Unsupervised Domain Adaptation. Insulin Regimen ML-based control for T2DM patients. From source to target and back: symmetric bi-directional adaptive GAN. Power Systems Data Fusion based on Belief Propagation. Lif …
Learning to Optimize Neural Nets
Title | Learning to Optimize Neural Nets |
Authors | Ke Li, Jitendra Malik |
Abstract | Learning to Optimize is a recently proposed framework for learning optimization algorithms using reinforcement learning. In this paper, we explore learning an optimization algorithm for training shallow neural nets. Such high-dimensional stochastic optimization problems present interesting challenges for existing reinforcement learning algorithms. We develop an extension that is suited to learning optimization algorithms in this setting and demonstrate that the learned optimization algorithm consistently outperforms other known optimization algorithms even on unseen tasks and is robust to changes in stochasticity of gradients and the neural net architecture. More specifically, we show that an optimization algorithm trained with the proposed method on the problem of training a neural net on MNIST generalizes to the problems of training neural nets on the Toronto Faces Dataset, CIFAR-10 and CIFAR-100. |
Tasks | Stochastic Optimization |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00441v2 |
http://arxiv.org/pdf/1703.00441v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-optimize-neural-nets |
Repo | |
Framework | |
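The paper's core idea, replacing a hand-designed update rule with a learned, parameterized one, can be sketched in a few lines. This is a minimal illustration, not the paper's method: the "learned" rule below is a fixed momentum-style policy whose coefficients would, in the paper, be trained with reinforcement learning.

```python
def learned_step(theta, grad, state, w=(0.9, 0.1)):
    """Illustrative 'learned' update rule: a linear policy over a
    running gradient statistic. In the paper this rule is a neural
    net trained with RL; the fixed weights w here are an assumption
    for demonstration only."""
    state = [w[0] * s + g for s, g in zip(state, grad)]
    theta = [t - w[1] * s for t, s in zip(theta, state)]
    return theta, state

# Apply the rule to minimize f(x) = (x - 3)^2.
theta, state = [0.0], [0.0]
for _ in range(100):
    grad = [2.0 * (theta[0] - 3.0)]
    theta, state = learned_step(theta, grad, state)
```

The interesting part of the framework is that the update rule itself is the object being optimized, so the same policy can be rolled out on unseen objectives.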
Query-guided Regression Network with Context Policy for Phrase Grounding
Title | Query-guided Regression Network with Context Policy for Phrase Grounding |
Authors | Kan Chen, Rama Kovvuri, Ram Nevatia |
Abstract | Given a textual description of an image, phrase grounding localizes the objects in the image referred to by query phrases in the description. State-of-the-art methods address the problem by ranking a set of proposals based on their relevance to each query, an approach that is limited by the performance of independent proposal generation systems and that ignores useful cues from context in the description. In this paper, we adopt a spatial regression method to break the performance limit, and introduce reinforcement learning techniques to further leverage semantic context information. We propose a novel Query-guided Regression network with Context policy (QRC Net) which jointly learns a Proposal Generation Network (PGN), a Query-guided Regression Network (QRN) and a Context Policy Network (CPN). Experiments show QRC Net provides a significant improvement in accuracy on two popular datasets: Flickr30K Entities and Referit Game, with 14.25% and 17.14% increases over the previous state of the art, respectively. |
Tasks | Phrase Grounding |
Published | 2017-08-04 |
URL | http://arxiv.org/abs/1708.01676v1 |
http://arxiv.org/pdf/1708.01676v1.pdf | |
PWC | https://paperswithcode.com/paper/query-guided-regression-network-with-context |
Repo | |
Framework | |
Parameter Reference Loss for Unsupervised Domain Adaptation
Title | Parameter Reference Loss for Unsupervised Domain Adaptation |
Authors | Jiren Jin, Richard G. Calland, Takeru Miyato, Brian K. Vogel, Hideki Nakayama |
Abstract | The success of deep learning in computer vision is mainly attributed to an abundance of data. However, collecting large-scale data is not always possible, especially for supervised labels. Unsupervised domain adaptation (UDA) aims to utilize labeled data from a source domain to learn a model that generalizes to a target domain of unlabeled data. Much existing work uses Siamese network-based models, where two streams of neural networks process the source and the target domain data respectively. Nevertheless, most of these approaches focus on minimizing the domain discrepancy, overlooking the importance of preserving the discriminative ability of target domain features. Another important problem in UDA research is how to evaluate the methods properly. Common evaluation procedures require target domain labels for hyper-parameter tuning and model selection, contradicting the definition of the UDA task. Hence we propose a more reasonable evaluation principle that avoids this contradiction by simply adopting the latest snapshot of a model for evaluation. This adds an extra requirement for UDA methods besides the main performance criteria: stability during training. We design a novel method that connects the target domain stream to the source domain stream with a Parameter Reference Loss (PRL) to solve these problems simultaneously. Experiments on various datasets show that the proposed PRL not only improves performance on the target domain, but also stabilizes the training procedure. As a result, PRL-based models do not need the contradictory model selection, and thus are more suitable for practical applications. |
Tasks | Domain Adaptation, Model Selection, Unsupervised Domain Adaptation |
Published | 2017-11-20 |
URL | http://arxiv.org/abs/1711.07170v2 |
http://arxiv.org/pdf/1711.07170v2.pdf | |
PWC | https://paperswithcode.com/paper/parameter-reference-loss-for-unsupervised |
Repo | |
Framework | |
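The Parameter Reference Loss itself reduces to a penalty on the distance between the target stream's parameters and the source (reference) stream's parameters. A minimal sketch, assuming a squared-L2 form; the exact form and weighting used in the paper may differ:

```python
def parameter_reference_loss(target_params, source_params):
    """Penalize the target stream's parameters for drifting far from
    the source (reference) stream's, so target features stay
    discriminative. Squared-L2 distance is an assumption here."""
    return sum((t - s) ** 2 for t, s in zip(target_params, source_params))

# Toy flattened parameter vectors for the two streams.
src = [0.5, -1.2, 3.0]
tgt = [0.6, -1.0, 2.5]
prl = parameter_reference_loss(tgt, src)  # ≈ 0.30
```

In training, a term like this would be added to the target stream's task loss with a trade-off coefficient.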
Insulin Regimen ML-based control for T2DM patients
Title | Insulin Regimen ML-based control for T2DM patients |
Authors | Mark Shifrin, Hava Siegelmann |
Abstract | We model an individual T2DM patient's blood glucose level (BGL) as a stochastic process with a discrete number of states, governed mainly but not solely by the medication regimen (e.g. insulin injections). BGL states otherwise change according to various physiological triggers, which render the nature of the process stochastic and statistically unknown, yet assumed to be quasi-stationary. To express an incentive for staying at a desired healthy BGL, we heuristically define a reward function that returns positive values for desirable BG levels and negative values for undesirable BG levels. The state space consists of a sufficient number of states to allow for a memoryless assumption, which in turn lets us formulate a Markov Decision Process (MDP) with the objective of maximizing the total reward accumulated over the long run. The probability law is found by model-based reinforcement learning (RL) and the optimal insulin treatment policy is retrieved from the MDP solution. |
Tasks | |
Published | 2017-10-21 |
URL | http://arxiv.org/abs/1710.07855v1 |
http://arxiv.org/pdf/1710.07855v1.pdf | |
PWC | https://paperswithcode.com/paper/insulin-regimen-ml-based-control-for-t2dm |
Repo | |
Framework | |
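Once the transition probabilities and reward are estimated, the optimal insulin policy can be read off an MDP solver such as value iteration. A toy sketch with three BGL states (low / healthy / high) and two actions (no dose / dose); all numbers are illustrative stand-ins for the quantities the paper estimates with model-based RL:

```python
def value_iteration(P, R, gamma=0.95, iters=500):
    """Generic value iteration over a finite MDP. P[a][s][s2] is the
    probability of moving from state s to s2 under action a; R[s] is
    the reward for being in state s."""
    n, actions = len(R), range(len(P))

    def q(s, a, V):
        return R[s] + gamma * sum(P[a][s][s2] * V[s2] for s2 in range(n))

    V = [0.0] * n
    for _ in range(iters):
        V = [max(q(s, a, V) for a in actions) for s in range(n)]
    policy = [max(actions, key=lambda a: q(s, a, V)) for s in range(n)]
    return V, policy

# Toy BGL model: states 0=low, 1=healthy, 2=high; actions 0=no dose, 1=dose.
P_no   = [[0.8, 0.2, 0.0], [0.0, 0.6, 0.4], [0.0, 0.1, 0.9]]
P_dose = [[1.0, 0.0, 0.0], [0.3, 0.7, 0.0], [0.0, 0.8, 0.2]]
R = [-1.0, 1.0, -1.0]          # positive reward only in the healthy state
V, policy = value_iteration([P_no, P_dose], R)
```

With these numbers the solver prescribes dosing in the high-BGL state, matching the intuition the reward encodes.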
From source to target and back: symmetric bi-directional adaptive GAN
Title | From source to target and back: symmetric bi-directional adaptive GAN |
Authors | Paolo Russo, Fabio Maria Carlucci, Tatiana Tommasi, Barbara Caputo |
Abstract | The effectiveness of generative adversarial approaches in producing images according to a specific style or visual domain has recently opened new directions to solve the unsupervised domain adaptation problem. It has been shown that source labeled images can be modified to mimic target samples, making it possible to train a classifier directly in the target domain, despite the original lack of annotated data. Inverse mappings from the target to the source domain have also been evaluated, but only passing through adapted feature spaces, thus without new image generation. In this paper we propose to better exploit the potential of generative adversarial networks for adaptation by introducing a novel symmetric mapping among domains. We jointly optimize bi-directional image transformations, combining them with target self-labeling. Moreover, we define a new class consistency loss that aligns the generators in the two directions, imposing the conservation of the class identity of an image passing through both domain mappings. A detailed qualitative and quantitative analysis of the reconstructed images confirms the power of our approach. By integrating the two domain-specific classifiers obtained with our bi-directional network, we exceed previous state-of-the-art unsupervised adaptation results on four different benchmark datasets. |
Tasks | Domain Adaptation, Image Generation, Unsupervised Domain Adaptation |
Published | 2017-05-24 |
URL | http://arxiv.org/abs/1705.08824v2 |
http://arxiv.org/pdf/1705.08824v2.pdf | |
PWC | https://paperswithcode.com/paper/from-source-to-target-and-back-symmetric-bi |
Repo | |
Framework | |
Power Systems Data Fusion based on Belief Propagation
Title | Power Systems Data Fusion based on Belief Propagation |
Authors | Francesco Fusco, Seshu Tirupathi, Robert Gormally |
Abstract | The increasing complexity of the power grid, due to higher penetration of distributed resources and the growing availability of interconnected, distributed metering devices, requires novel tools for providing a unified and consistent view of the system. A computational framework for power systems data fusion, based on probabilistic graphical models, capable of combining heterogeneous data sources with classical state estimation nodes and other customised computational nodes, is proposed. The framework allows flexible extension of the notion of grid state beyond the view of flows and injection in bus-branch models, and an efficient, naturally distributed inference algorithm can be derived. An application of the data fusion model to the quantification of distributed solar energy is proposed through numerical examples based on semi-synthetic simulations of the standard IEEE 14-bus test case. |
Tasks | |
Published | 2017-05-24 |
URL | http://arxiv.org/abs/1705.08815v1 |
http://arxiv.org/pdf/1705.08815v1.pdf | |
PWC | https://paperswithcode.com/paper/power-systems-data-fusion-based-on-belief |
Repo | |
Framework | |
Lifelong Learning CRF for Supervised Aspect Extraction
Title | Lifelong Learning CRF for Supervised Aspect Extraction |
Authors | Lei Shu, Hu Xu, Bing Liu |
Abstract | This paper makes a focused contribution to supervised aspect extraction. It shows that if the system has performed aspect extraction from many past domains and retained the results as knowledge, a Conditional Random Field (CRF) can leverage this knowledge in a lifelong learning manner to extract aspects in a new domain markedly better than a traditional CRF without this prior knowledge. The key innovation is that even after CRF training, the model can still improve its extraction with experience from its applications. |
Tasks | Aspect Extraction |
Published | 2017-04-29 |
URL | http://arxiv.org/abs/1705.00251v1 |
http://arxiv.org/pdf/1705.00251v1.pdf | |
PWC | https://paperswithcode.com/paper/lifelong-learning-crf-for-supervised-aspect |
Repo | |
Framework | |
Selling to a No-Regret Buyer
Title | Selling to a No-Regret Buyer |
Authors | Mark Braverman, Jieming Mao, Jon Schneider, S. Matthew Weinberg |
Abstract | We consider the problem of a single seller repeatedly selling a single item to a single buyer (specifically, the buyer has a value drawn fresh from known distribution $D$ in every round). Prior work assumes that the buyer is fully rational and will perfectly reason about how their bids today affect the seller’s decisions tomorrow. In this work we initiate a different direction: the buyer simply runs a no-regret learning algorithm over possible bids. We provide a fairly complete characterization of optimal auctions for the seller in this domain. Specifically: - If the buyer bids according to EXP3 (or any “mean-based” learning algorithm), then the seller can extract expected revenue arbitrarily close to the expected welfare. This auction is independent of the buyer’s valuation $D$, but somewhat unnatural as it is sometimes in the buyer’s interest to overbid. - There exists a learning algorithm $\mathcal{A}$ such that if the buyer bids according to $\mathcal{A}$ then the optimal strategy for the seller is simply to post the Myerson reserve for $D$ every round. - If the buyer bids according to EXP3 (or any “mean-based” learning algorithm), but the seller is restricted to “natural” auction formats where overbidding is dominated (e.g. Generalized First-Price or Generalized Second-Price), then the optimal strategy for the seller is a pay-your-bid format with decreasing reserves over time. Moreover, the seller’s optimal achievable revenue is characterized by a linear program, and can be unboundedly better than the best truthful auction yet simultaneously unboundedly worse than the expected welfare. |
Tasks | |
Published | 2017-11-25 |
URL | http://arxiv.org/abs/1711.09176v1 |
http://arxiv.org/pdf/1711.09176v1.pdf | |
PWC | https://paperswithcode.com/paper/selling-to-a-no-regret-buyer |
Repo | |
Framework | |
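The "mean-based" property the paper exploits — the learner concentrates its play on the action with the highest average reward so far — is easy to see in a multiplicative-weights (Hedge) learner, the full-information cousin of the EXP3 algorithm named in the abstract. The two-bid payoffs below are purely illustrative:

```python
import math

def hedge_update(weights, rewards, eta=0.1):
    """One multiplicative-weights (Hedge) step: scale each action's
    weight exponentially in its observed reward."""
    return [w * math.exp(eta * r) for w, r in zip(weights, rewards)]

# Two candidate bids; bid 1 consistently yields higher utility, so a
# mean-based learner ends up playing it almost always.
weights = [1.0, 1.0]
for _ in range(200):
    weights = hedge_update(weights, [0.2, 0.8])
total = sum(weights)
probs = [w / total for w in weights]
```

Against such a learner, the seller in the paper can design the per-round pricing so that the historically-best action is the one the seller wants played.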
Improving Social Media Text Summarization by Learning Sentence Weight Distribution
Title | Improving Social Media Text Summarization by Learning Sentence Weight Distribution |
Authors | Jingjing Xu |
Abstract | Recently, encoder-decoder models have been widely used in social media text summarization. However, these models sometimes erroneously select noise words in irrelevant sentences as part of a summary, degrading performance. To inhibit irrelevant sentences and focus on key information, we propose an effective approach based on learning a sentence weight distribution. In our model, we build a multi-layer perceptron to predict sentence weights. During training, we use the ROUGE score as an alternative to the estimated sentence weight, and try to minimize the gap between estimated weights and predicted weights. In this way, we encourage our model to focus on the key sentences, which have high relevance to the summary. Experimental results show that our approach outperforms baselines on a large-scale social media corpus. |
Tasks | Text Summarization |
Published | 2017-10-31 |
URL | http://arxiv.org/abs/1710.11332v1 |
http://arxiv.org/pdf/1710.11332v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-social-media-text-summarization-by |
Repo | |
Framework | |
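A hedged sketch of the training target described above: score each source sentence by word overlap with the reference summary (a crude stand-in for the ROUGE score the paper actually uses) and normalise the scores into a weight distribution for the perceptron to regress against:

```python
def sentence_weights(sentences, summary):
    """Proxy sentence weights: word overlap with the reference summary,
    normalised to sum to 1. Overlap is only an illustrative stand-in
    for ROUGE."""
    ref = set(summary.split())
    raw = [len(set(s.split()) & ref) / max(len(ref), 1) for s in sentences]
    total = sum(raw) or 1.0
    return [r / total for r in raw]

sents = ["the model improves summarization quality",
         "the weather was nice yesterday"]
w = sentence_weights(sents, "model improves summarization")
```

The relevant sentence receives all the weight here, which is the behaviour the weighting scheme is meant to encourage.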
Some methods for heterogeneous treatment effect estimation in high-dimensions
Title | Some methods for heterogeneous treatment effect estimation in high-dimensions |
Authors | Scott Powers, Junyang Qian, Kenneth Jung, Alejandro Schuler, Nigam H. Shah, Trevor Hastie, Robert Tibshirani |
Abstract | When devising a course of treatment for a patient, doctors often have little quantitative evidence on which to base their decisions, beyond their medical education and published clinical trials. Stanford Health Care alone has millions of electronic medical records (EMRs) that are only just recently being leveraged to inform better treatment recommendations. These data present a unique challenge because they are high-dimensional and observational. Our goal is to make personalized treatment recommendations based on the outcomes for past patients similar to a new patient. We propose and analyze three methods for estimating heterogeneous treatment effects using observational data. Our methods perform well in simulations using a wide variety of treatment effect functions, and we present results of applying the two most promising methods to data from The SPRINT Data Analysis Challenge, from a large randomized trial of a treatment for high blood pressure. |
Tasks | |
Published | 2017-07-01 |
URL | http://arxiv.org/abs/1707.00102v1 |
http://arxiv.org/pdf/1707.00102v1.pdf | |
PWC | https://paperswithcode.com/paper/some-methods-for-heterogeneous-treatment |
Repo | |
Framework | |
Recent Developments and Future Challenges in Medical Mixed Reality
Title | Recent Developments and Future Challenges in Medical Mixed Reality |
Authors | Long Chen, Thomas Day, Wen Tang, Nigel W. John |
Abstract | Mixed Reality (MR) is of increasing interest within technology-driven modern medicine but is not yet used in everyday practice. This situation is changing rapidly, however, and this paper explores the emergence of MR technology and the importance of its utility within medical applications. A classification of medical MR has been obtained by applying an unbiased text mining method to a database of 1,403 relevant research papers published over the last two decades. The classification results reveal a taxonomy for the development of medical MR research during this period as well as suggesting future trends. We then use the classification to analyse the technology and applications developed in the last five years. Our objective is to aid researchers to focus on the areas where technology advancements in medical MR are most needed, as well as providing medical practitioners with a useful source of reference. |
Tasks | |
Published | 2017-08-03 |
URL | http://arxiv.org/abs/1708.01225v1 |
http://arxiv.org/pdf/1708.01225v1.pdf | |
PWC | https://paperswithcode.com/paper/recent-developments-and-future-challenges-in |
Repo | |
Framework | |
Neural Episodic Control
Title | Neural Episodic Control |
Authors | Alexander Pritzel, Benigno Uria, Sriram Srinivasan, Adrià Puigdomènech, Oriol Vinyals, Demis Hassabis, Daan Wierstra, Charles Blundell |
Abstract | Deep reinforcement learning methods attain super-human performance in a wide range of environments. Such methods are grossly inefficient, however, often taking orders of magnitude more data than humans to achieve reasonable performance. We propose Neural Episodic Control: a deep reinforcement learning agent that is able to rapidly assimilate new experiences and act upon them. Our agent uses a semi-tabular representation of the value function: a buffer of past experience containing slowly changing state representations and rapidly updated estimates of the value function. We show across a wide range of environments that our agent learns significantly faster than other state-of-the-art, general-purpose deep reinforcement learning agents. |
Tasks | |
Published | 2017-03-06 |
URL | http://arxiv.org/abs/1703.01988v1 |
http://arxiv.org/pdf/1703.01988v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-episodic-control |
Repo | |
Framework | |
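The semi-tabular value estimate can be sketched as a distance-weighted k-nearest-neighbour lookup over stored (state embedding, Q-value) pairs; the inverse-distance kernel follows the paper's idea, while the toy memory contents below are invented for illustration:

```python
def nec_lookup(memory, query, k=3):
    """Estimate Q for a query embedding as the inverse-distance-weighted
    average over the k nearest stored (embedding, value) pairs."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(memory, key=lambda kv: dist2(kv[0], query))[:k]
    ws = [1.0 / (dist2(key, query) + 1e-3) for key, _ in nearest]
    return sum(w * v for w, (_, v) in zip(ws, nearest)) / sum(ws)

# Toy memory of (embedding, Q-value) pairs for one action.
memory = [((0.0, 0.0), 1.0), ((1.0, 0.0), 0.0),
          ((0.0, 1.0), 0.0), ((5.0, 5.0), -1.0)]
q = nec_lookup(memory, (0.1, 0.1), k=3)
```

Because the buffer is updated by writing new entries rather than by gradient steps, a fresh experience influences the value estimate immediately.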
A Computer Composes A Fabled Problem: Four Knights vs. Queen
Title | A Computer Composes A Fabled Problem: Four Knights vs. Queen |
Authors | Azlan Iqbal |
Abstract | We explain how the prototype automatic chess problem composer, Chesthetica, successfully composed a rare and interesting chess problem using the new Digital Synaptic Neural Substrate (DSNS) computational creativity approach. This problem represents a greater challenge from a creative standpoint because the checkmate is not always clear and the method of winning even less so. Creating a decisive chess problem of this type without the aid of an omniscient 7-piece endgame tablebase (and one that also abides by several chess composition conventions) would therefore be a challenge for most human players and composers working on their own. The fact that a small computer with relatively low processing power and memory was sufficient to compose such a problem using the DSNS approach in just 10 days is therefore noteworthy. In this report we document the event and result in some detail. It lends additional credence to the DSNS as a viable new approach in the field of computational creativity, particularly in areas where human-like creativity is required for targeted or specific problems with no clear path to the solution. |
Tasks | |
Published | 2017-09-04 |
URL | http://arxiv.org/abs/1709.00931v1 |
http://arxiv.org/pdf/1709.00931v1.pdf | |
PWC | https://paperswithcode.com/paper/a-computer-composes-a-fabled-problem-four |
Repo | |
Framework | |
Online deforestation detection
Title | Online deforestation detection |
Authors | Emiliano Diaz |
Abstract | Deforestation detection using satellite images can make an important contribution to forest management. Current approaches can be broadly divided into those that compare two images taken at similar periods of the year and those that monitor changes by using multiple images taken during the growing season. The CMFDA algorithm described in Zhu et al. (2012) builds on the latter category by implementing a year-long, continuous, time-series based approach to monitoring images. This algorithm was developed for 30m resolution, 16-day frequency reflectance data from the Landsat satellite. In this work we adapt the algorithm to 1km, 16-day frequency reflectance data from the MODIS sensor aboard the Terra satellite. The CMFDA algorithm is composed of two submodels which are fitted on a pixel-by-pixel basis. The first estimates the amount of surface reflectance as a function of the day of the year. The second estimates the occurrence of a deforestation event by comparing the last few predicted and real reflectance values. For this comparison, the reflectance observations for six different bands are first combined into a forest index. Real and predicted values of the forest index are then compared, and high absolute differences for consecutive observation dates are flagged as deforestation events. Our adapted algorithm also uses the two-model framework. However, since the MODIS 13A2 dataset used includes reflectance data for different spectral bands than those included in the Landsat dataset, we cannot construct the forest index. Instead we propose two contrasting approaches: a multivariate approach and an index approach similar to that of CMFDA. |
Tasks | Time Series |
Published | 2017-04-03 |
URL | http://arxiv.org/abs/1704.00829v1 |
http://arxiv.org/pdf/1704.00829v1.pdf | |
PWC | https://paperswithcode.com/paper/online-deforestation-detection |
Repo | |
Framework | |
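The change-detection core of the CMFDA-style test — flag a deforestation event when predicted and observed forest-index values diverge on consecutive dates — can be sketched as follows; the threshold and run length are illustrative, not the calibrated values from the paper:

```python
def flag_deforestation(predicted, observed, threshold=0.1, consecutive=2):
    """Flag a change when |predicted - observed| exceeds the threshold
    for a run of consecutive observation dates; return the index of
    the date where the run completes, or None if no change."""
    run = 0
    for t, (p, o) in enumerate(zip(predicted, observed)):
        run = run + 1 if abs(p - o) > threshold else 0
        if run >= consecutive:
            return t
    return None

# Predicted seasonal forest-index values vs a series that drops away.
pred = [0.8, 0.81, 0.8, 0.79, 0.8]
obs  = [0.8, 0.80, 0.5, 0.45, 0.4]
t = flag_deforestation(pred, obs)
```

Requiring consecutive exceedances guards against flagging a single cloudy or noisy observation as deforestation.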
A unified framework for manifold landmarking
Title | A unified framework for manifold landmarking |
Authors | Hongteng Xu, Licheng Yu, Mark Davenport, Hongyuan Zha |
Abstract | The success of semi-supervised manifold learning is highly dependent on the quality of the labeled samples. Active manifold learning aims to select and label representative landmarks on a manifold from a given set of samples to improve semi-supervised manifold learning. In this paper, we propose a novel active manifold learning method based on a unified framework of manifold landmarking. In particular, our method combines geometric manifold landmarking methods with algebraic ones. We achieve this by using the Gershgorin circle theorem to construct an upper bound on the learning error that depends on the landmarks and the manifold’s alignment matrix in a way that captures both the geometric and algebraic criteria. We then attempt to select landmarks so as to minimize this bound by iteratively deleting the Gershgorin circles corresponding to the selected landmarks. We also analyze the complexity, scalability, and robustness of our method through simulations, and demonstrate its superiority compared to existing methods. Experiments in regression and classification further verify that our method performs better than its competitors. |
Tasks | |
Published | 2017-10-25 |
URL | http://arxiv.org/abs/1710.09334v2 |
http://arxiv.org/pdf/1710.09334v2.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-framework-for-manifold-landmarking |
Repo | |
Framework | |
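The Gershgorin circle theorem the method builds on states that every eigenvalue of a matrix lies in at least one disc centred at a diagonal entry, with radius equal to the off-diagonal absolute row sum. A minimal sketch on a toy symmetric matrix (not a manifold alignment matrix from the paper):

```python
def gershgorin_circles(A):
    """Return the (centre, radius) of each Gershgorin disc: centre is
    the diagonal entry, radius the off-diagonal absolute row sum. The
    paper iteratively deletes the discs of selected landmarks to
    tighten its bound on the learning error."""
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i))
            for i in range(n)]

A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 0.5],
     [0.0, 0.5, 2.0]]
circles = gershgorin_circles(A)  # [(4.0, 1.0), (3.0, 1.5), (2.0, 0.5)]
```

Deleting a disc corresponds to fixing (labeling) the associated sample, which is how the geometric selection step connects to the algebraic error bound.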