Paper Group ANR 571
Deep Reinforcement Learning for Scheduling in Cellular Networks. Semantic Matching by Weakly Supervised 2D Point Set Registration. Recognizing American Sign Language Manual Signs from RGB-D Videos. Shareable Representations for Search Query Understanding. Fast Bayesian Restoration of Poisson Corrupted Images with INLA. Three dimensional waveguide-i …
Deep Reinforcement Learning for Scheduling in Cellular Networks
Title | Deep Reinforcement Learning for Scheduling in Cellular Networks |
Authors | Jian Wang, Chen Xu, Yourui Huangfu, Rong Li, Yiqun Ge, Jun Wang |
Abstract | Integrating artificial intelligence (AI) into wireless networks has drawn significant interest in both industry and academia. A common solution is to replace some or even all modules in conventional systems, an approach that often lacks efficiency and robustness because it ignores expert knowledge. In this paper, we take deep reinforcement learning (DRL) based scheduling as an example to investigate how expert knowledge can help an AI module in cellular networks. A simulation platform, which considers link adaptation, feedback and other practical mechanisms, is developed to facilitate the investigation. Besides the traditional approach of training the DRL agent directly from the environment, we propose two novel methods, i.e., learning from a dual AI module and learning from the expert solution. The results show that, for the scheduling problem considered, both the performance and the convergence speed of the DRL training procedure can be improved by involving expert knowledge. Hence, instead of replacing the conventional scheduling module in the system, adding a newly introduced AI module, which can interact with the conventional module and provide more flexibility, is a more feasible solution. |
Tasks | |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.05914v2 |
https://arxiv.org/pdf/1905.05914v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-for-scheduling-in-1 |
Repo | |
Framework | |
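The "expert solution" a DRL scheduler can learn from is typically a conventional rule such as proportional-fair scheduling. A minimal sketch of that baseline (illustrative only; the rates, EWMA window, and update rule are hypothetical, not taken from the paper):

```python
# Proportional-fair (PF) scheduling: in each TTI, serve the user with the
# highest ratio of instantaneous rate to long-term average throughput.

def pf_schedule(inst_rates, avg_thr):
    """Return the index of the user the PF rule selects."""
    return max(range(len(inst_rates)),
               key=lambda u: inst_rates[u] / max(avg_thr[u], 1e-9))

def update_avg(avg_thr, served, inst_rates, alpha=0.05):
    """EWMA update of each user's average throughput after one TTI."""
    return [(1 - alpha) * t + alpha * (inst_rates[u] if u == served else 0.0)
            for u, t in enumerate(avg_thr)]

# One TTI: user 1 has a good channel relative to its past service.
rates = [2.0, 3.0, 1.0]           # achievable rates this TTI (Mbit/s)
avg   = [1.0, 0.5, 1.0]           # long-term average throughputs
served = pf_schedule(rates, avg)  # PF ratios: 2.0, 6.0, 1.0 -> user 1
avg = update_avg(avg, served, rates)
```

A DRL agent trained "from the expert solution" would imitate decisions like `served` above before refining its policy from environment interaction.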
Semantic Matching by Weakly Supervised 2D Point Set Registration
Title | Semantic Matching by Weakly Supervised 2D Point Set Registration |
Authors | Zakaria Laskar, Hamed R. Tavakoli, Juho Kannala |
Abstract | In this paper we address the problem of establishing correspondences between different instances of the same object. The problem is posed as finding the geometric transformation that aligns a given image pair. We use a convolutional neural network (CNN) to directly regress the parameters of the transformation model. The alignment problem is defined in the setting where an unordered set of semantic key-points per image is available, but without the correspondence information. To this end we propose a novel loss function based on cyclic consistency that solves this 2D point set registration problem by inferring the optimal geometric transformation model parameters. We train and test our approach on the standard Proposal-Flow benchmark dataset (PF-PASCAL). The proposed approach achieves state-of-the-art results, demonstrating the effectiveness of the method. In addition, we show our approach further benefits from additional training samples in PF-PASCAL generated by using category-level information. |
Tasks | |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08341v1 |
http://arxiv.org/pdf/1901.08341v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-matching-by-weakly-supervised-2d |
Repo | |
Framework | |
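The cycle-consistency idea above can be sketched on toy data: map set A forward with a transformation, match in B, map the match back, and penalise the round-trip distance. Here a hard nearest-neighbour match and a pure translation stand in for the paper's regressed transformation model; both are simplifying assumptions:

```python
# Cycle-consistency for unordered 2D point sets: map A forward with T,
# match each mapped point to its nearest neighbour in B, map the match
# back with T^{-1}, and penalise the distance to the starting point.

def transform(points, t):
    """Apply a translation t (stand-in for a richer geometric model)."""
    return [(x + t[0], y + t[1]) for x, y in points]

def nearest(p, points):
    return min(points, key=lambda q: (p[0]-q[0])**2 + (p[1]-q[1])**2)

def cycle_loss(A, B, t):
    loss = 0.0
    for a in A:
        b = nearest(transform([a], t)[0], B)         # forward match in B
        a_back = transform([b], (-t[0], -t[1]))[0]   # map match back
        loss += (a[0]-a_back[0])**2 + (a[1]-a_back[1])**2
    return loss / len(A)

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(2.0, 1.0), (3.0, 1.0)]         # A shifted by (2, 1)
good = cycle_loss(A, B, (2.0, 1.0))  # correct alignment -> 0
bad  = cycle_loss(A, B, (0.0, 0.0))  # misalignment -> positive
```

No correspondence labels are used anywhere, which is the weakly supervised setting: the loss is minimised exactly when the transformation aligns the sets.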
Recognizing American Sign Language Manual Signs from RGB-D Videos
Title | Recognizing American Sign Language Manual Signs from RGB-D Videos |
Authors | Longlong Jing, Elahe Vahdani, Matt Huenerfauth, Yingli Tian |
Abstract | In this paper, we propose a 3D Convolutional Neural Network (3DCNN) based multi-stream framework to recognize American Sign Language (ASL) manual signs (consisting of movements of the hands, as well as non-manual face movements in some cases) in real-time from RGB-D videos, by fusing multimodality features including hand gestures, facial expressions, and body poses from multiple channels (RGB, depth, motion, and skeleton joints). To learn the overall temporal dynamics in a video, a proxy video is generated by selecting a subset of frames for each video, which are then used to train the proposed 3DCNN model. We collect a new ASL dataset, ASL-100-RGBD, which contains 42 RGB-D videos captured by a Microsoft Kinect V2 camera, each containing 100 ASL manual signs, including RGB channel, depth maps, skeleton joints, face features, and HDface. The dataset is fully annotated for each semantic region (i.e., the time duration of each word that the human signer performs). Our proposed method achieves 92.88% accuracy for recognizing 100 ASL words on our newly collected ASL-100-RGBD dataset. The effectiveness of our framework for recognizing hand gestures from RGB-D videos is further demonstrated on the Chalearn IsoGD dataset, where it achieves 76% accuracy, 5.51% higher than the state-of-the-art result in terms of average fusion, while using only 5 channels instead of the 12 channels in the previous work. |
Tasks | |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.02851v1 |
https://arxiv.org/pdf/1906.02851v1.pdf | |
PWC | https://paperswithcode.com/paper/recognizing-american-sign-language-manual |
Repo | |
Framework | |
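The proxy-video idea, training the 3DCNN on a fixed-length subset of frames that preserves the overall temporal dynamics, can be sketched as uniform temporal sampling. The exact selection rule in the paper may differ; this is just the simplest instance:

```python
def proxy_frames(num_frames, k):
    """Pick k frame indices spread uniformly across a video of num_frames,
    taking the middle frame of each of k equal temporal segments."""
    if k >= num_frames:
        return list(range(num_frames))
    step = num_frames / k
    return [int(i * step + step / 2) for i in range(k)]

# A 100-frame clip reduced to an 8-frame proxy for the 3DCNN.
idx = proxy_frames(100, 8)
```

Fixing `k` gives every proxy video the same temporal extent, so one 3DCNN input shape covers clips of any original length.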
Shareable Representations for Search Query Understanding
Title | Shareable Representations for Search Query Understanding |
Authors | Mukul Kumar, Youna Hu, Will Headden, Rahul Goutam, Heran Lin, Bing Yin |
Abstract | Understanding search queries is critical for shopping search engines to deliver a satisfying customer experience. Popular shopping search engines receive billions of unique queries yearly, each of which can express any of hundreds of user preferences or intents. To get the right results to customers, the engine must recognize that queries like “inexpensive prom dresses” are intended not only to surface results of a certain product type but also products with a low price. Referred to as query intents, examples also include preferences for author, brand, age group, or simply a need for customer service. Recent works such as BERT have demonstrated the success of a large transformer encoder architecture with language-model pre-training on a variety of NLP tasks. We adapt such an architecture to learn intents for search queries and describe methods to account for the noisiness and sparseness of search query data. We also describe cost-effective ways of hosting transformer encoder models in contexts with low-latency requirements. With the right domain-specific training we can build a shareable deep learning model whose internal representation can be reused for a variety of query understanding tasks, including query intent identification. Model sharing means fewer large models need to be served at inference time and provides a platform to quickly build and roll out new search query classifiers. |
Tasks | Language Modelling |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/2001.04345v1 |
https://arxiv.org/pdf/2001.04345v1.pdf | |
PWC | https://paperswithcode.com/paper/shareable-representations-for-search-query |
Repo | |
Framework | |
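The model-sharing pattern, one shared encoder whose representation feeds several cheap task heads, can be sketched as below. The hashed bag-of-words encoder and the intent heads are toy placeholders for the transformer encoder and trained classifiers in the paper:

```python
# One shared representation, many cheap task heads: a new intent
# classifier reuses the encoder instead of serving another large model.

DIM = 16

def encode(query):
    """Toy stand-in for a shared transformer encoder: a hashed bag of words."""
    vec = [0.0] * DIM
    for tok in query.lower().split():
        vec[hash(tok) % DIM] += 1.0
    return vec

def make_head(weights, bias):
    """A linear head over the shared representation (one per intent task)."""
    def head(rep):
        return sum(w * x for w, x in zip(weights, rep)) + bias
    return head

rep = encode("inexpensive prom dresses")
price_head = make_head([0.1] * DIM, -0.2)   # hypothetical trained weights
brand_head = make_head([-0.1] * DIM, 0.0)
price_score = price_head(rep)               # same rep reused by both heads
brand_score = brand_head(rep)
```

Serving-wise, only `encode` is expensive; rolling out a new query classifier means training and deploying another small head, not another large model.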
Fast Bayesian Restoration of Poisson Corrupted Images with INLA
Title | Fast Bayesian Restoration of Poisson Corrupted Images with INLA |
Authors | Takahiro Kawashima, Hayaru Shouno |
Abstract | Photon-limited images are often seen in fields such as medical imaging. Although the number of photons collected on an image sensor statistically follows a Poisson distribution, this type of noise is intractable, unlike Gaussian noise. In this study, we propose a Bayesian restoration method for Poisson corrupted images using Integrated Nested Laplace Approximation (INLA), a computational method for evaluating marginalized posterior distributions of latent Gaussian models (LGMs). When the original image can reasonably be regarded as an ICAR (intrinsic conditional auto-regressive) model, our method runs much faster than well-known approaches such as loopy belief propagation and Markov chain Monte Carlo (MCMC), without sacrificing accuracy. |
Tasks | |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01357v1 |
http://arxiv.org/pdf/1904.01357v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-bayesian-restoration-of-poisson |
Repo | |
Framework | |
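To make the latent Gaussian setup concrete, here is a minimal 1D MAP sketch of the same model class: observed counts are Poisson with rate e^{x_i}, and the latent field x gets an ICAR-style squared-difference smoothness penalty. INLA itself computes full marginal posteriors; plain gradient descent on the negative log-posterior here is only a stand-in, and the penalty weight and step size are made up:

```python
import math

def neg_log_post(x, y, beta):
    """Poisson likelihood (rate e^{x_i}) plus ICAR-style smoothness prior."""
    like = sum(math.exp(xi) - yi * xi for xi, yi in zip(x, y))
    prior = beta * sum((x[i] - x[i + 1]) ** 2 for i in range(len(x) - 1))
    return like + prior

def denoise(y, beta=2.0, lr=0.02, iters=800):
    x = [math.log(max(yi, 1.0)) for yi in y]    # initialise at log counts
    for _ in range(iters):
        g = [math.exp(xi) - yi for xi, yi in zip(x, y)]  # likelihood grad
        for i in range(len(x)):                          # prior gradient
            if i > 0:
                g[i] += 2 * beta * (x[i] - x[i - 1])
            if i < len(x) - 1:
                g[i] += 2 * beta * (x[i] - x[i + 1])
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

y = [3, 1, 6, 2, 4, 0, 5, 3]                    # noisy Poisson counts
x0 = [math.log(max(v, 1.0)) for v in y]
x_hat = denoise(y)                              # smoothed log-rates
```

The restored rates are `exp(x_hat)`; the intractability mentioned in the abstract is visible in the non-Gaussian likelihood term, which is exactly what INLA's Laplace approximation handles.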
Three dimensional waveguide-interconnects for scalable integration of photonic neural networks
Title | Three dimensional waveguide-interconnects for scalable integration of photonic neural networks |
Authors | Johnny Moughames, Xavier Porte, Michael Thiel, Gwenn Ulliac, Maxime Jacquot, Laurent Larger, Muamer Kadic, Daniel Brunner |
Abstract | Photonic waveguides are prime candidates for integrated and parallel photonic interconnects. Such interconnects correspond to large-scale vector matrix products, which are at the heart of neural network computation. However, parallel interconnect circuits realized in two dimensions, for example by lithography, are strongly limited in size due to disadvantageous scaling. We use three dimensional (3D) printed photonic waveguides to overcome this limitation. 3D optical-couplers with fractal topology efficiently connect large numbers of input and output channels, and we show that the substrate’s footprint area scales linearly. Going beyond simple couplers, we introduce functional circuits for discrete spatial filters identical to those used in deep convolutional neural networks. |
Tasks | |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.08203v2 |
https://arxiv.org/pdf/1912.08203v2.pdf | |
PWC | https://paperswithcode.com/paper/three-dimensional-waveguide-interconnects-for |
Repo | |
Framework | |
Simple Kinematic Feedback Enhances Autonomous Learning in Bio-Inspired Tendon-Driven Systems
Title | Simple Kinematic Feedback Enhances Autonomous Learning in Bio-Inspired Tendon-Driven Systems |
Authors | Ali Marjaninejad, Darío Urbina-Meléndez, Francisco J. Valero-Cuevas |
Abstract | Error feedback is known to improve performance by correcting control signals in response to perturbations. Here we show how adding simple error feedback can also accelerate and robustify autonomous learning in a tendon-driven robot. We implemented two versions of the General-to-Particular (G2P) autonomous learning algorithm to produce multiple movement tasks using a tendon-driven leg with two joints and three tendons: one with and one without kinematic feedback. As expected, feedback improved performance in simulation and hardware. However, we see these improvements even in the presence of sensory delays of up to 100 ms and when experiencing substantial contact collisions. Importantly, feedback accelerates learning and enhances G2P’s continual refinement of the initial inverse map by providing the system with more relevant data to train on. This allows the system to perform well even after only 60 seconds of initial motor babbling. |
Tasks | |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04539v2 |
https://arxiv.org/pdf/1907.04539v2.pdf | |
PWC | https://paperswithcode.com/paper/simple-kinematic-feedback-enhances-autonomous |
Repo | |
Framework | |
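The kinematic feedback added to G2P can be sketched as a simple proportional correction layered on the learned inverse map's feedforward command. The one-joint plant, the imperfect inverse map, and the gain below are all hypothetical illustrations, not the paper's robot:

```python
# Feedforward command from a learned inverse map, corrected by a simple
# proportional term on the kinematic tracking error.

def control(inverse_map, target, actual, gain=0.5):
    feedforward = inverse_map(target)
    error = target - actual            # kinematic feedback signal
    return feedforward + gain * error

# Toy 1-joint plant: the true map is u = 2*angle, but the learned inverse
# map underestimates the gain (G2P keeps refining it from new data).
learned_inverse = lambda angle: 1.6 * angle

def plant(u):
    return u / 2.0                     # angle actually reached

target, angle = 1.0, 0.0
for _ in range(20):                    # closed loop: feedback shrinks error
    angle = plant(control(learned_inverse, target, angle))
err_with_feedback = abs(target - angle)
err_without = abs(target - plant(learned_inverse(target)))
```

Even with an imperfect inverse map, the feedback term pulls the closed loop closer to the target than pure feedforward, mirroring the paper's observation that simple error feedback robustifies the autonomously learned map.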
Baby steps towards few-shot learning with multiple semantics
Title | Baby steps towards few-shot learning with multiple semantics |
Authors | Eli Schwartz, Leonid Karlinsky, Rogerio Feris, Raja Giryes, Alex M. Bronstein |
Abstract | Learning from one or few visual examples is one of the key capabilities of humans since early infancy, but is still a significant challenge for modern AI systems. While considerable progress has been achieved in few-shot learning from a few image examples, much less attention has been given to the verbal descriptions that are usually provided to infants when they are presented with a new object. In this paper, we focus on the role of additional semantics that can significantly facilitate few-shot visual learning. Building upon recent advances in few-shot learning with additional semantic information, we demonstrate that further improvements are possible by combining multiple and richer semantics (category labels, attributes, and natural language descriptions). Using these ideas, we offer the community new results on the popular miniImageNet and CUB few-shot benchmarks, comparing favorably to the previous state-of-the-art results for both visual only and visual plus semantics-based approaches. We also performed an ablation study investigating the components and design choices of our approach. |
Tasks | Few-Shot Image Classification, Few-Shot Learning |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.01905v2 |
https://arxiv.org/pdf/1906.01905v2.pdf | |
PWC | https://paperswithcode.com/paper/baby-steps-towards-few-shot-learning-with |
Repo | |
Framework | |
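One common way to fold multiple semantics into few-shot classification, in the spirit of the combination described above though not the paper's exact architecture, is a convex combination of the visual prototype with averaged semantic embeddings, followed by nearest-prototype classification. Vectors and the mixing weight below are toy assumptions:

```python
def combine(visual, semantics, alpha=0.7):
    """Convex combination of a visual prototype with averaged semantic
    embeddings. alpha weights the visual evidence; semantics matter most
    in the low-shot regime, where the visual prototype is noisy."""
    sem = [sum(vals) / len(semantics) for vals in zip(*semantics)]
    return [alpha * v + (1 - alpha) * s for v, s in zip(visual, sem)]

def classify(query, prototypes):
    """Nearest-prototype classification (squared Euclidean distance)."""
    return min(prototypes,
               key=lambda c: sum((q - p) ** 2
                                 for q, p in zip(query, prototypes[c])))

# Class prototypes built from one visual example plus two semantic
# embeddings each (label embedding, description embedding) -- toy vectors.
protos = {
    "cat": combine([1.0, 0.0], [[0.9, 0.1], [0.8, 0.0]]),
    "dog": combine([0.0, 1.0], [[0.1, 0.9], [0.0, 0.8]]),
}
label = classify([0.9, 0.1], protos)
```

In practice `alpha` would be learned (and can be made query-dependent), so the model decides how much to trust the single visual example versus the verbal descriptions.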
Networked Control of Nonlinear Systems under Partial Observation Using Continuous Deep Q-Learning
Title | Networked Control of Nonlinear Systems under Partial Observation Using Continuous Deep Q-Learning |
Authors | Junya Ikemoto, Toshimitsu Ushio |
Abstract | In this paper, we propose a design for a model-free networked controller for a nonlinear plant whose mathematical model is unknown. In a networked control system, the controller and plant are located away from each other and exchange data over a network, which causes network delays that may fluctuate randomly due to network routing. Thus, in this paper, we assume that the current network delay is not known but that the maximum value of the fluctuating network delays is known beforehand. Moreover, we also assume that the sensor cannot observe all state variables of the plant. Under these assumptions, we apply continuous deep Q-learning to the design of the networked controller. Then, we introduce an extended state, consisting of a sequence of past control inputs and outputs, as the input to the deep neural network. Simulations show that, using the extended state, the controller can learn a control policy robust to the fluctuation of the network delays under partial observation. |
Tasks | Q-Learning |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.10722v2 |
https://arxiv.org/pdf/1908.10722v2.pdf | |
PWC | https://paperswithcode.com/paper/networked-control-of-nonlinear-systems-under |
Repo | |
Framework | |
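The extended state, a window of past control inputs and partial outputs fed to the Q-network in place of the unobservable plant state, can be sketched as a pair of bounded histories. The horizon, dimensions, and values below are illustrative:

```python
from collections import deque

class ExtendedState:
    """Keep the last h control inputs and observed outputs; their
    concatenation is the network input instead of the full plant state."""
    def __init__(self, horizon, u_dim, y_dim):
        self.u = deque([[0.0] * u_dim] * horizon, maxlen=horizon)
        self.y = deque([[0.0] * y_dim] * horizon, maxlen=horizon)

    def push(self, u_t, y_t):
        self.u.append(list(u_t))
        self.y.append(list(y_t))

    def vector(self):
        flat = []
        for seq in (self.u, self.y):
            for item in seq:
                flat.extend(item)
        return flat

# Horizon chosen to cover the worst-case network delay (known beforehand).
s = ExtendedState(horizon=3, u_dim=1, y_dim=2)
s.push([0.5], [1.0, 0.2])
s.push([0.1], [0.9, 0.3])
state = s.vector()   # length 3*1 + 3*2 = 9
```

Because the history spans the maximum delay, the Q-network can in principle infer which past command is currently taking effect, which is what makes the learned policy robust to the delay fluctuation.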
DyANE: Dynamics-aware node embedding for temporal networks
Title | DyANE: Dynamics-aware node embedding for temporal networks |
Authors | Koya Sato, Mizuki Oka, Alain Barrat, Ciro Cattuto |
Abstract | Low-dimensional vector representations of network nodes have proven successful to feed graph data to machine learning algorithms and to improve performance across diverse tasks. Most of the embedding techniques, however, have been developed with the goal of achieving dense, low-dimensional encoding of network structure and patterns. Here, we present a node embedding technique aimed at providing low-dimensional feature vectors that are informative of dynamical processes occurring over temporal networks - rather than of the network structure itself - with the goal of enabling prediction tasks related to the evolution and outcome of these processes. We achieve this by using a modified supra-adjacency representation of temporal networks and building on standard embedding techniques for static graphs based on random-walks. We show that the resulting embedding vectors are useful for prediction tasks related to paradigmatic dynamical processes, namely epidemic spreading over empirical temporal networks. In particular, we illustrate the performance of our approach for the prediction of nodes’ epidemic states in a single instance of the spreading process. We show how framing this task as a supervised multi-label classification task on the embedding vectors allows us to estimate the temporal evolution of the entire system from a partial sampling of nodes at random times, with potential impact for nowcasting infectious disease dynamics. |
Tasks | Multi-Label Classification |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05976v1 |
https://arxiv.org/pdf/1909.05976v1.pdf | |
PWC | https://paperswithcode.com/paper/dyane-dynamics-aware-node-embedding-for |
Repo | |
Framework | |
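The supra-adjacency construction can be sketched as follows: each active (node, time) pair becomes a supra-node, interaction edges connect the copies involved in a contact, and each copy links to the same node's next active copy. This follows the general idea only; the paper's modified representation and edge weighting may differ:

```python
def supra_adjacency(contacts):
    """contacts: list of (t, i, j) interaction events.
    Returns edges between (node, time) supra-nodes: interaction edges at
    each time, plus temporal edges to a node's next active time."""
    edges = set()
    active = {}                        # node -> sorted list of active times
    for t, i, j in sorted(contacts):
        edges.add(((i, t), (j, t)))    # interaction at time t
        for n in (i, j):
            active.setdefault(n, [])
            if active[n] and active[n][-1] != t:
                edges.add(((n, active[n][-1]), (n, t)))  # temporal coupling
            if not active[n] or active[n][-1] != t:
                active[n].append(t)
    return edges

# Three contacts: a-b at t=1, b-c at t=2, a-c at t=3.
E = supra_adjacency([(1, "a", "b"), (2, "b", "c"), (3, "a", "c")])
```

Random walks on this static supra-graph can only move along time-respecting paths, which is why embeddings computed on it reflect the dynamical process (e.g. epidemic spreading) rather than the aggregated network structure.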
Optimal Explanations of Linear Models
Title | Optimal Explanations of Linear Models |
Authors | Dimitris Bertsimas, Arthur Delarue, Patrick Jaillet, Sebastien Martin |
Abstract | When predictive models are used to support complex and important decisions, the ability to explain a model’s reasoning can increase trust, expose hidden biases, and reduce vulnerability to adversarial attacks. However, attempts at interpreting models are often ad hoc and application-specific, and the concept of interpretability itself is not well-defined. We propose a general optimization framework to create explanations for linear models. Our methodology decomposes a linear model into a sequence of models of increasing complexity using coordinate updates on the coefficients. Computing this decomposition optimally is a difficult optimization problem for which we propose exact algorithms and scalable heuristics. By solving this problem, we can derive a parametrized family of interpretability metrics for linear models that generalizes typical proxies, and study the tradeoff between interpretability and predictive accuracy. |
Tasks | |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.04669v1 |
https://arxiv.org/pdf/1907.04669v1.pdf | |
PWC | https://paperswithcode.com/paper/optimal-explanations-of-linear-models |
Repo | |
Framework | |
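The decomposition into a sequence of models of increasing complexity via coordinate updates can be sketched greedily: start from the zero model and, at each step, apply the single-coefficient update that most reduces squared error. The paper solves for the optimal sequence; greedy selection here illustrates only the scalable-heuristic flavour:

```python
def greedy_path(X, y, steps):
    """Build a sequence of linear models, one coordinate update per step.
    Each intermediate model is a simpler 'explanation' of the final one."""
    d = len(X[0])
    w = [0.0] * d
    path = [list(w)]
    for _ in range(steps):
        r = [yi - sum(wj * xij for wj, xij in zip(w, xi))
             for yi, xi in zip(y, X)]                     # residuals
        best_j, best_delta, best_gain = 0, 0.0, -1.0
        for j in range(d):
            col = [xi[j] for xi in X]
            denom = sum(c * c for c in col)
            delta = sum(ri * c for ri, c in zip(r, col)) / denom
            gain = delta * delta * denom   # error reduction of this update
            if gain > best_gain:
                best_j, best_delta, best_gain = j, delta, gain
        w[best_j] += best_delta
        path.append(list(w))
    return path

# y depends strongly on feature 0, weakly on feature 1.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 0.5, 2.5]
path = greedy_path(X, y, steps=2)
```

Reading `path` in order gives the explanation: the most important coefficient enters first, and truncating the sequence trades predictive accuracy for a sparser, more interpretable model.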
Music Source Separation in the Waveform Domain
Title | Music Source Separation in the Waveform Domain |
Authors | Alexandre Défossez, Nicolas Usunier, Léon Bottou, Francis Bach |
Abstract | Source separation for music is the task of isolating contributions, or stems, from different instruments recorded individually and arranged together to form a song. Such components include voice, bass, drums and any other accompaniments. Contrary to many audio synthesis tasks where the best performances are achieved by models that directly generate the waveform, the state of the art in source separation for music is to compute masks on the magnitude spectrum. In this paper, we first show that an adaptation of Conv-Tasnet (Luo & Mesgarani, 2019), a waveform-to-waveform model for speech source separation, significantly beats the state of the art on the MusDB dataset, the standard benchmark of multi-instrument source separation. Second, we observe that Conv-Tasnet follows a masking approach on the input signal, which has the potential drawback of removing parts of the relevant source without the capacity to reconstruct them. We propose Demucs, a new waveform-to-waveform model, which has an architecture closer to models for audio generation, with more capacity on the decoder. Experiments on the MusDB dataset show that Demucs beats previously reported results in terms of signal-to-distortion ratio (SDR), though it remains below Conv-Tasnet. Human evaluations show that Demucs has significantly higher quality (as assessed by mean opinion score) than Conv-Tasnet, but slightly more contamination from other sources, which explains the difference in SDR. Additional experiments with a larger dataset suggest that the gap in SDR between Demucs and Conv-Tasnet shrinks, showing that our approach is promising. |
Tasks | Audio Generation, Music Source Separation |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.13254v1 |
https://arxiv.org/pdf/1911.13254v1.pdf | |
PWC | https://paperswithcode.com/paper/music-source-separation-in-the-waveform-1 |
Repo | |
Framework | |
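The masking baseline that Demucs moves away from can be sketched as a ratio mask over magnitude bins: the output is always a per-bin scaling of the mixture, so anything the mask suppresses cannot be reconstructed, which is the drawback noted in the abstract. The magnitudes below are toy values, not spectrogram data:

```python
def ratio_masks(est_mags):
    """Soft masks from per-source magnitude estimates (one list per
    source, one value per time-frequency bin)."""
    sums = [max(sum(bins), 1e-9) for bins in zip(*est_mags)]
    return [[m / s for m, s in zip(src, sums)] for src in est_mags]

def apply_mask(mix_mag, mask):
    return [m * w for m, w in zip(mix_mag, mask)]

# Two sources over four T-F bins; the mixture magnitude is roughly additive.
vocals_est = [3.0, 0.0, 1.0, 2.0]
drums_est  = [1.0, 2.0, 1.0, 0.0]
mix        = [4.0, 2.0, 2.0, 2.0]
masks = ratio_masks([vocals_est, drums_est])
vocals = apply_mask(mix, masks[0])   # masking can only scale the mixture
```

Bin 1 illustrates the limitation: the vocal estimate there is zero, so the mask removes that bin entirely; a waveform-to-waveform decoder like Demucs is free to synthesize content the mask would have discarded.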
A Game Theoretical Framework for the Evaluation of Unmanned Aircraft Systems Airspace Integration Concepts
Title | A Game Theoretical Framework for the Evaluation of Unmanned Aircraft Systems Airspace Integration Concepts |
Authors | Negin Musavi |
Abstract | Predicting the outcomes of integrating Unmanned Aerial Systems (UAS) into the National Airspace System (NAS) is a complex problem which must be addressed by simulation studies before allowing routine access of UAS to the NAS. This thesis focuses on providing 2D and 3D simulation frameworks that use a game theoretical methodology to evaluate integration concepts in scenarios where manned and unmanned air vehicles co-exist. The fundamental gap in the literature is that the models of interaction between manned and unmanned vehicles are insufficient: a) they assume that pilot behavior is known a priori, and b) they disregard decision-making processes. The contribution of this work is to propose a modeling framework in which human pilot reactions are modeled using reinforcement learning and a game theoretical concept called level-k reasoning to fill this gap. The level-k reasoning concept is based on the assumption that humans have various levels of decision making. Reinforcement learning is a mathematical learning method that is rooted in human learning. In this work, classical and approximate reinforcement learning (Neural Fitted Q Iteration) methods are used to model time-extended decisions of pilots with 2D and 3D maneuvers. An analysis of UAS integration is conducted using example scenarios in the presence of manned aircraft and fully autonomous UAS equipped with sense-and-avoid algorithms. |
Tasks | Decision Making |
Published | 2019-04-17 |
URL | http://arxiv.org/abs/1904.08477v1 |
http://arxiv.org/pdf/1904.08477v1.pdf | |
PWC | https://paperswithcode.com/paper/a-game-theoretical-framework-for-the |
Repo | |
Framework | |
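The level-k concept can be sketched on a toy symmetric two-action encounter: level-0 is a fixed non-strategic policy, and each level-k policy best-responds to a level-(k-1) opponent. The payoffs and the one-shot framing below are made up for illustration; the thesis models time-extended decisions with reinforcement learning:

```python
def best_response(payoff, opponent_action):
    """Pick the action maximising payoff against a fixed opponent action."""
    return max(range(len(payoff)),
               key=lambda a: payoff[a][opponent_action])

def level_k(k, payoff, level0_action):
    """Level-k action: level 0 is fixed; level k best-responds to k-1
    (both players share the payoff matrix in this symmetric toy game)."""
    action = level0_action
    for _ in range(k):
        action = best_response(payoff, action)
    return action

# Actions: 0 = keep course, 1 = evade.  payoff[mine][theirs].
# Colliding (both keep course) is catastrophic; needless evasion is costly.
payoff = [[-10.0, 1.0],    # I keep course: crash vs. they evade
          [0.0,  -1.0]]    # I evade: safe vs. both evade (wasteful)
level0 = 0                  # non-strategic pilot just keeps course
a1 = level_k(1, payoff, level0)   # evades against a level-0 pilot
a2 = level_k(2, payoff, level0)   # keeps course, expecting evasion
```

Populating an airspace scenario with pilots of different levels is what lets the framework evaluate how an integration concept performs against heterogeneous human behavior.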
Collecting and Analyzing Multidimensional Data with Local Differential Privacy
Title | Collecting and Analyzing Multidimensional Data with Local Differential Privacy |
Authors | Ning Wang, Xiaokui Xiao, Yin Yang, Jun Zhao, Siu Cheung Hui, Hyejin Shin, Junbum Shin, Ge Yu |
Abstract | Local differential privacy (LDP) is a recently proposed privacy standard for collecting and analyzing data, which has been used, e.g., in the Chrome browser, iOS and macOS. In LDP, each user perturbs her information locally, and only sends the randomized version to an aggregator who performs analyses, which protects both the users and the aggregator against private information leaks. Although LDP has attracted much research attention in recent years, the majority of existing work focuses on applying LDP to complex data and/or analysis tasks. In this paper, we point out that the fundamental problem of collecting multidimensional data under LDP has not been addressed sufficiently, and there remains much room for improvement even for basic tasks such as computing the mean value over a single numeric attribute under LDP. Motivated by this, we first propose novel LDP mechanisms for collecting a numeric attribute, whose accuracy is at least no worse (and usually better) than existing solutions in terms of worst-case noise variance. Then, we extend these mechanisms to multidimensional data that can contain both numeric and categorical attributes, where our mechanisms always outperform existing solutions regarding worst-case noise variance. As a case study, we apply our solutions to build an LDP-compliant stochastic gradient descent algorithm (SGD), which powers many important machine learning tasks. Experiments using real datasets confirm the effectiveness of our methods, and their advantages over existing solutions. |
Tasks | |
Published | 2019-06-28 |
URL | https://arxiv.org/abs/1907.00782v1 |
https://arxiv.org/pdf/1907.00782v1.pdf | |
PWC | https://paperswithcode.com/paper/collecting-and-analyzing-multidimensional |
Repo | |
Framework | |
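For a single numeric attribute in [-1, 1], a classical baseline in this line of work is the mechanism of Duchi et al.: each user releases one of two extreme values, biased so that her report is unbiased for her true value. A sketch of that baseline (not the paper's improved mechanisms, whose worst-case noise variance is lower):

```python
import math
import random

def duchi_perturb(t, eps, rng):
    """Locally perturb t in [-1, 1] under eps-LDP (Duchi et al. baseline)."""
    e = math.exp(eps)
    bound = (e + 1) / (e - 1)              # the two possible outputs
    p = 0.5 + t * (e - 1) / (2 * (e + 1))  # chosen so E[output] = t
    return bound if rng.random() < p else -bound

rng = random.Random(0)
t, eps = 0.4, 1.0
reports = [duchi_perturb(t, eps, rng) for _ in range(20000)]
estimate = sum(reports) / len(reports)   # aggregator's unbiased mean
```

The aggregator only ever sees the two-point reports, never `t`; averaging many users' reports recovers the population mean, and the worst-case variance of mechanisms like this is exactly what the paper's proposals tighten.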
Filling Conversation Ellipsis for Better Social Dialog Understanding
Title | Filling Conversation Ellipsis for Better Social Dialog Understanding |
Authors | Xiyuan Zhang, Chengxi Li, Dian Yu, Samuel Davidson, Zhou Yu |
Abstract | The phenomenon of ellipsis is prevalent in social conversations. Ellipsis increases the difficulty of a series of downstream language understanding tasks, such as dialog act prediction and semantic role labeling. We propose to resolve ellipsis through automatic sentence completion to improve language understanding. However, automatic ellipsis completion can result in output which does not accurately reflect user intent. To address this issue, we propose a method which considers both the original utterance that has ellipsis and the automatically completed utterance in dialog act and semantic role labeling tasks. Specifically, we first complete user utterances to resolve ellipsis using an end-to-end pointer network model. We then train a prediction model using both utterances containing ellipsis and our automatically completed utterances. Finally, we combine the prediction results from these two utterances using a selection model that is guided by expert knowledge. Our approach improves dialog act prediction and semantic role labeling by 1.3% and 2.5% in F1 score respectively in social conversations. We also present an open-domain human-machine conversation dataset with manually completed user utterances and annotated semantic role labeling after manual completion. |
Tasks | Semantic Role Labeling |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.10776v1 |
https://arxiv.org/pdf/1911.10776v1.pdf | |
PWC | https://paperswithcode.com/paper/filling-conversation-ellipsis-for-better |
Repo | |
Framework | |