Paper Group ANR 1772
Neural Networks for Modeling Source Code Edits
Title | Neural Networks for Modeling Source Code Edits |
Authors | Rui Zhao, David Bieber, Kevin Swersky, Daniel Tarlow |
Abstract | Programming languages are emerging as a challenging and interesting domain for machine learning. A core task, which has received significant attention in recent years, is building generative models of source code. However, to our knowledge, previous generative models have always been framed in terms of generating static snapshots of code. In this work, we instead treat source code as a dynamic object and tackle the problem of modeling the edits that software developers make to source code files. This requires extracting intent from previous edits and leveraging it to generate subsequent edits. We develop several neural networks and use synthetic data to test their ability to learn challenging edit patterns that require strong generalization. We then collect and train our models on a large-scale dataset of Google source code, consisting of millions of fine-grained edits from thousands of Python developers. From the modeling perspective, our main conclusion is that a new composition of attentional and pointer network components provides the best overall performance and scalability. From the application perspective, our results provide preliminary evidence of the feasibility of developing tools that learn to predict future edits. |
Tasks | |
Published | 2019-04-04 |
URL | http://arxiv.org/abs/1904.02818v1 |
http://arxiv.org/pdf/1904.02818v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-networks-for-modeling-source-code |
Repo | |
Framework | |
A deep learning approach to detecting volcano deformation from satellite imagery using synthetic datasets
Title | A deep learning approach to detecting volcano deformation from satellite imagery using synthetic datasets |
Authors | Nantheera Anantrasirichai, Juliet Biggs, Fabien Albino, David Bull |
Abstract | Satellites enable widespread, regional or global surveillance of volcanoes and can provide the first indication of volcanic unrest or eruption. Here we consider Interferometric Synthetic Aperture Radar (InSAR), which can be employed to detect surface deformation with a strong statistical link to eruption. The ability of machine learning to automatically identify signals of interest in these large InSAR datasets has already been demonstrated, but data-driven techniques, such as convolutional neural networks (CNNs), require balanced training datasets of positive and negative signals to effectively differentiate between real deformation and noise. As only a small proportion of volcanoes are deforming and atmospheric noise is ubiquitous, the use of machine learning for detecting volcanic unrest is more challenging. In this paper, we address this problem using synthetic interferograms to train AlexNet. The synthetic interferograms are composed of 3 parts: 1) deformation patterns based on a Monte Carlo selection of parameters for analytic forward models, 2) stratified atmospheric effects derived from weather models and 3) turbulent atmospheric effects based on statistical simulations of correlated noise. The AlexNet architecture trained with synthetic data outperforms that trained using real interferograms alone, based on classification accuracy and positive predictive value (PPV). However, the models used to generate the synthetic signals are a simplification of the natural processes, so we retrain the CNN with a combined dataset consisting of synthetic models and selected real examples, achieving a final PPV of 82%. Although applying atmospheric corrections to the entire dataset is computationally expensive, it is relatively simple to apply them to the small subset of positive results. This further improves the detection performance without a significant increase in computational burden. |
Tasks | |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07286v1 |
https://arxiv.org/pdf/1905.07286v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-learning-approach-to-detecting-volcano |
Repo | |
Framework | |
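The positive predictive value (PPV) quoted in the abstract above is the fraction of positive classifications that are true positives. A minimal sketch of that definition follows; the function name and the counts in the usage line are hypothetical, chosen only to reproduce a PPV of 82%, and are not taken from the paper.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Return PPV = TP / (TP + FP), i.e. the precision of the positive class."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("PPV is undefined when nothing is classified as positive")
    return true_positives / flagged


# Hypothetical counts: 82 of 100 flagged interferograms show real deformation.
example_ppv = positive_predictive_value(82, 18)
```

Because PPV only looks at the items flagged as positive, it is a natural metric when true deformation is rare and false alarms dominate the cost of manual review.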
Multi-Species Cuckoo Search Algorithm for Global Optimization
Title | Multi-Species Cuckoo Search Algorithm for Global Optimization |
Authors | Xin-She Yang, Suash Deb, Sudhanshu K Mishra |
Abstract | Many optimization problems in science and engineering are highly nonlinear, and thus require sophisticated optimization techniques to solve. Traditional techniques such as gradient-based algorithms are mostly local search methods, and often struggle to cope with such challenging optimization problems. Recent trends tend to use nature-inspired optimization algorithms. This work extends the standard cuckoo search (CS) by using the successful features of the cuckoo-host co-evolution with multiple interacting species, and the proposed multi-species cuckoo search (MSCS) intends to mimic the multiple species of cuckoos that compete for the survival of the fittest, and they co-evolve with host species with solution vectors being encoded as position vectors. The proposed algorithm is then validated by 15 benchmark functions as well as five nonlinear, multimodal design case studies in practical applications. Simulation results suggest that the proposed algorithm can be effective for finding optimal solutions and in this case all optimal solutions are achievable. The results for the test benchmarks are also compared with those obtained by other methods such as the standard cuckoo search and genetic algorithm, which demonstrated the efficiency of the present algorithm. Based on numerical experiments and case studies, we can conclude that the proposed algorithm can be more efficient in most cases, leading to a potentially very effective tool for solving nonlinear optimization problems. |
Tasks | |
Published | 2019-03-27 |
URL | http://arxiv.org/abs/1903.11446v1 |
http://arxiv.org/pdf/1903.11446v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-species-cuckoo-search-algorithm-for |
Repo | |
Framework | |
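For orientation, the standard single-species cuckoo search that the abstract above extends can be sketched as follows. This is a simplified baseline, not the multi-species variant proposed in the paper: the function names, the default parameters (`n_nests`, `pa`, `alpha`, `iterations`), the use of Mantegna's algorithm for the Lévy steps, and the sphere test function are all assumptions made for illustration.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length using Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero in the rare degenerate draw
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def cuckoo_search(objective, dim, lower, upper, n_nests=15, pa=0.25,
                  alpha=0.01, iterations=300, seed=0):
    """Minimise `objective` over the box [lower, upper]^dim (basic CS sketch)."""
    rng = random.Random(seed)

    def random_nest():
        return [rng.uniform(lower, upper) for _ in range(dim)]

    def clip(value):
        return min(max(value, lower), upper)

    nests = [random_nest() for _ in range(n_nests)]
    costs = [objective(nest) for nest in nests]

    for _ in range(iterations):
        best = nests[costs.index(min(costs))]
        # Global exploration: Levy flight scaled by the distance to the best nest.
        for i in range(n_nests):
            candidate = [clip(x + alpha * levy_step(rng) * (x - b))
                         for x, b in zip(nests[i], best)]
            cost = objective(candidate)
            if cost < costs[i]:  # greedy replacement
                nests[i], costs[i] = candidate, cost
        # Discovery: each non-best nest is abandoned with probability pa.
        best_index = costs.index(min(costs))
        for i in range(n_nests):
            if i != best_index and rng.random() < pa:
                nests[i] = random_nest()
                costs[i] = objective(nests[i])

    best_index = costs.index(min(costs))
    return nests[best_index], costs[best_index]


def sphere(x):
    """Toy benchmark with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_cost = cuckoo_search(sphere, dim=2, lower=-5.0, upper=5.0)
```

The best nest is never abandoned, so the best cost cannot get worse from one iteration to the next; a multi-species extension would presumably maintain several such populations that interact, but the abstract does not specify how.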
A Data-Center FPGA Acceleration Platform for Convolutional Neural Networks
Title | A Data-Center FPGA Acceleration Platform for Convolutional Neural Networks |
Authors | Xiaoyu Yu, Yuwei Wang, Jie Miao, Ephrem Wu, Heng Zhang, Yu Meng, Bo Zhang, Biao Min, Dewei Chen, Jianlin Gao |
Abstract | Intensive computation is entering data centers with multiple workloads of deep learning. To balance the compute efficiency, performance, and total cost of ownership (TCO), the use of a field-programmable gate array (FPGA) with reconfigurable logic provides an acceptable acceleration capacity and is compatible with diverse computation-sensitive tasks in the cloud. In this paper, we develop an FPGA acceleration platform that leverages a unified framework architecture for general-purpose convolutional neural network (CNN) inference acceleration at a data center. To overcome the computation bound, 4,096 DSPs are assembled and shaped as supertile units (SUs) for different types of convolution, which provide up to 4.2 TOP/s 16-bit fixed-point performance at 500 MHz. The interleaved-task-dispatching method is proposed to map the computation across the SUs, and the memory bound is solved by a dispatching-assembling buffering model and broadcast caches. For various non-convolution operators, a filter processing unit is designed for general-purpose filter-like/pointwise operators. In the experiment, the performances of CNN models running on server-class CPUs, a GPU, and an FPGA are compared. The results show that our design achieves the best FPGA peak performance and a throughput at the same level as that of the state-of-the-art GPU in data centers, with more than 50 times lower latency. |
Tasks | |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07973v1 |
https://arxiv.org/pdf/1909.07973v1.pdf | |
PWC | https://paperswithcode.com/paper/a-data-center-fpga-acceleration-platform-for |
Repo | |
Framework | |
Biased Aggregation, Rollout, and Enhanced Policy Improvement for Reinforcement Learning
Title | Biased Aggregation, Rollout, and Enhanced Policy Improvement for Reinforcement Learning |
Authors | Dimitri Bertsekas |
Abstract | We propose a new aggregation framework for approximate dynamic programming, which provides a connection with rollout algorithms, approximate policy iteration, and other single and multistep lookahead methods. The central novel characteristic is the use of a bias function $V$ of the state, which biases the values of the aggregate cost function towards their correct levels. The classical aggregation framework is obtained when $V\equiv0$, but our scheme works best when $V$ is a known reasonably good approximation to the optimal cost function $J^*$. When $V$ is equal to the cost function $J_{\mu}$ of some known policy $\mu$ and there is only one aggregate state, our scheme is equivalent to the rollout algorithm based on $\mu$ (i.e., the result of a single policy improvement starting with the policy $\mu$). When $V=J_{\mu}$ and there are multiple aggregate states, our aggregation approach can be used as a more powerful form of improvement of $\mu$. Thus, when combined with an approximate policy evaluation scheme, our approach can form the basis for a new and enhanced form of approximate policy iteration. When $V$ is a generic bias function, our scheme is equivalent to approximation in value space with lookahead function equal to $V$ plus a local correction within each aggregate state. The local correction levels are obtained by solving a low-dimensional aggregate DP problem, yielding an arbitrarily close approximation to $J^*$, when the number of aggregate states is sufficiently large. Except for the bias function, the aggregate DP problem is similar to the one of the classical aggregation framework, and its algorithmic solution by simulation or other methods is nearly identical to one for classical aggregation, assuming values of $V$ are available when needed. |
Tasks | |
Published | 2019-10-06 |
URL | https://arxiv.org/abs/1910.02426v1 |
https://arxiv.org/pdf/1910.02426v1.pdf | |
PWC | https://paperswithcode.com/paper/biased-aggregation-rollout-and-enhanced |
Repo | |
Framework | |
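In symbols, the lookahead function described in the abstract above can be sketched as follows. The notation is introduced here for illustration and is not quoted from the paper: it assumes hard aggregation with disjoint aggregate states $S_1,\dots,S_q$ covering the state space, correction levels $r_\ell$, a discount factor $\alpha$, stage cost $g$, system function $f$, disturbance $w$, and control set $U(x)$.

```latex
\tilde{J}(x) = V(x) + r_\ell \quad \text{for } x \in S_\ell, \qquad
\tilde{\mu}(x) \in \arg\min_{u \in U(x)}
  \mathbb{E}_w\!\left[\, g(x,u,w) + \alpha\, \tilde{J}\big(f(x,u,w)\big) \right].
```

Under these assumptions, setting $V \equiv 0$ leaves only the piecewise-constant correction $r_\ell$, which corresponds to the classical aggregation case mentioned in the abstract.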
NewsDeps: Visualizing the Origin of Information in News Articles
Title | NewsDeps: Visualizing the Origin of Information in News Articles |
Authors | Felix Hamborg, Philipp Meschenmoser, Moritz Schubotz, Bela Gipp |
Abstract | In scientific publications, citations allow readers to assess the authenticity of the presented information and verify it in the original context. News articles, however, do not contain citations and only rarely refer readers to further sources. Readers often cannot assess the authenticity of the presented information as its origin is unclear. We present NewsDeps, the first approach that analyzes and visualizes where information in news articles stems from. NewsDeps employs methods from natural language processing and plagiarism detection to measure article similarity. We devise a temporal-force-directed graph that places articles as nodes chronologically. The graph connects articles by edges varying in width depending on the articles’ similarity. We demonstrate our approach in a case study with two real-world scenarios. We find that NewsDeps increases efficiency and transparency in news consumption by revealing which previously published articles are the primary sources of each given article. |
Tasks | |
Published | 2019-09-23 |
URL | https://arxiv.org/abs/1909.10266v1 |
https://arxiv.org/pdf/1909.10266v1.pdf | |
PWC | https://paperswithcode.com/paper/190910266 |
Repo | |
Framework | |
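The graph construction described in the abstract above (articles as chronologically ordered nodes, edges whose width grows with similarity) can be sketched as below. This is only an illustration under stated assumptions: token-set Jaccard similarity is a simple stand-in for the NLP and plagiarism-detection measures the paper uses, and the function names, `threshold`, `max_width`, and the sample articles are hypothetical.

```python
from datetime import date


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity; a stand-in for the paper's measure."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def dependency_edges(articles, threshold=0.2, max_width=10.0):
    """Link each article to earlier, sufficiently similar articles.

    `articles` is a list of (article_id, published_date, text) tuples.
    Returns (source_id, target_id, width) tuples, earlier article first.
    """
    ordered = sorted(articles, key=lambda item: item[1])
    edges = []
    for j, (later_id, _, later_text) in enumerate(ordered):
        for earlier_id, _, earlier_text in ordered[:j]:
            similarity = jaccard(earlier_text, later_text)
            if similarity >= threshold:
                edges.append((earlier_id, later_id, max_width * similarity))
    return edges


# Hypothetical sample data.
articles = [
    ("a", date(2019, 9, 1), "central bank raises interest rates"),
    ("b", date(2019, 9, 2), "central bank raises interest rates again"),
    ("c", date(2019, 9, 3), "local team wins championship"),
]
edges = dependency_edges(articles)
```

Here only articles "a" and "b" share enough tokens to be linked, so the sketch yields a single edge from the earlier article to the later one.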
Stochastic Reinforcement Learning
Title | Stochastic Reinforcement Learning |
Authors | Nikki Lijing Kuang, Clement H. C. Leung, Vienne W. K. Sung |
Abstract | In reinforcement learning episodes, the rewards and punishments are often non-deterministic, and there are invariably stochastic elements governing the underlying situation. Such stochastic elements are often numerous and cannot be known in advance, and they have a tendency to obscure the underlying reward and punishment patterns. Indeed, if stochastic elements were absent, the same outcome would occur every time and the learning problems involved could be greatly simplified. In addition, in most practical situations, the cost of an observation to receive either a reward or punishment can be significant, and one would wish to arrive at the correct learning conclusion by incurring minimum cost. In this paper, we present a stochastic approach to reinforcement learning which explicitly models the variability present in the learning environment and the cost of observation. Criteria and rules for learning success are quantitatively analyzed, and probabilities of exceeding the observation cost bounds are also obtained. |
Tasks | |
Published | 2019-02-11 |
URL | http://arxiv.org/abs/1902.04178v1 |
http://arxiv.org/pdf/1902.04178v1.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-reinforcement-learning |
Repo | |
Framework | |
A New Ratio Image Based CNN Algorithm For SAR Despeckling
Title | A New Ratio Image Based CNN Algorithm For SAR Despeckling |
Authors | Sergio Vitale, Giampaolo Ferraioli, Vito Pascazio |
Abstract | In the SAR domain, many applications such as classification, detection and segmentation are impaired by speckle. Hence, despeckling of SAR images is key for scene understanding. Despeckling filters usually face a trade-off between speckle suppression and information preservation. In recent years, deep learning solutions for speckle reduction have been proposed. One of the biggest issues for these methods is how to train a network given the lack of a reference. In this work we propose a convolutional neural network based solution trained on simulated data. We propose the use of a cost function taking into account both spatial and statistical properties. The aim is twofold: to overcome the trade-off between speckle suppression and detail suppression, and to find a suitable cost function for despeckling in unsupervised learning. The algorithm is validated on both real and simulated data, showing interesting performances. |
Tasks | Scene Understanding |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.04111v1 |
https://arxiv.org/pdf/1906.04111v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-ratio-image-based-cnn-algorithm-for-sar |
Repo | |
Framework | |
Using fuzzy bits and neural networks to partially invert few rounds of some cryptographic hash functions
Title | Using fuzzy bits and neural networks to partially invert few rounds of some cryptographic hash functions |
Authors | Sergij V. Goncharov |
Abstract | We consider fuzzy, or continuous, bits, which take values in [0;1] and (-1;1] instead of {0;1}, and operations on them (NOT, XOR etc.) and on their sequences (ADD), to obtain the generalization of cryptographic hash functions, CHFs, for the messages consisting of fuzzy bits, so that CHFs become smooth and non-constant functions of each bit of the message. We then train the neural networks to predict the message that has a given hash, where the loss function for the hash of the predicted message and the given true hash is backpropagatable. The training results for the standard CHFs - MD5, SHA1, SHA2-256, and SHA3/Keccak - with a small number of (optionally weakened) rounds are presented and compared. |
Tasks | |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02438v1 |
http://arxiv.org/pdf/1901.02438v1.pdf | |
PWC | https://paperswithcode.com/paper/using-fuzzy-bits-and-neural-networks-to |
Repo | |
Framework | |
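To make the idea of fuzzy bits in the abstract above concrete, here is a minimal sketch of continuous NOT, AND and XOR on values in [0, 1]. The specific formulas are an assumption (the standard multilinear extensions of the Boolean truth tables); the paper's own definitions may differ, and the function names are hypothetical.

```python
def fuzzy_not(x: float) -> float:
    """Continuous NOT: maps 0 -> 1 and 1 -> 0, linear in between."""
    return 1.0 - x


def fuzzy_and(x: float, y: float) -> float:
    """Continuous AND: the product agrees with Boolean AND on {0, 1}."""
    return x * y


def fuzzy_xor(x: float, y: float) -> float:
    """Continuous XOR: x + y - 2xy agrees with Boolean XOR on {0, 1}."""
    return x + y - 2.0 * x * y


# On crisp bits the operations reproduce the usual truth tables.
xor_table = [fuzzy_xor(a, b) for a in (0.0, 1.0) for b in (0.0, 1.0)]
# On intermediate values they vary smoothly.
half_xor = fuzzy_xor(0.5, 0.5)
```

Because each operation is a polynomial in its inputs, a hash round built from them is differentiable with respect to every message bit, which is what allows a loss on the predicted hash to be backpropagated.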
MOTS: Multi-Object Tracking and Segmentation
Title | MOTS: Multi-Object Tracking and Segmentation |
Authors | Paul Voigtlaender, Michael Krause, Aljosa Osep, Jonathon Luiten, Berin Balachandar Gnana Sekar, Andreas Geiger, Bastian Leibe |
Abstract | This paper extends the popular task of multi-object tracking to multi-object tracking and segmentation (MOTS). Towards this goal, we create dense pixel-level annotations for two existing tracking datasets using a semi-automatic annotation procedure. Our new annotations comprise 65,213 pixel masks for 977 distinct objects (cars and pedestrians) in 10,870 video frames. For evaluation, we extend existing multi-object tracking metrics to this new task. Moreover, we propose a new baseline method which jointly addresses detection, tracking, and segmentation with a single convolutional network. We demonstrate the value of our datasets by achieving improvements in performance when training on MOTS annotations. We believe that our datasets, metrics and baseline will become a valuable resource towards developing multi-object tracking approaches that go beyond 2D bounding boxes. We make our annotations, code, and models available at https://www.vision.rwth-aachen.de/page/mots. |
Tasks | Multi-Object Tracking, Object Tracking |
Published | 2019-02-10 |
URL | http://arxiv.org/abs/1902.03604v2 |
http://arxiv.org/pdf/1902.03604v2.pdf | |
PWC | https://paperswithcode.com/paper/mots-multi-object-tracking-and-segmentation |
Repo | |
Framework | |
Machine Learning based Prediction of Hierarchical Classification of Transposable Elements
Title | Machine Learning based Prediction of Hierarchical Classification of Transposable Elements |
Authors | Manisha Panta, Avdesh Mishra, Md Tamjidul Hoque, Joel Atallah |
Abstract | Transposable Elements (TEs) or jumping genes are the DNA sequences that have an intrinsic capability to move within a host genome from one genomic location to another. Studies show that the presence of a TE within or adjacent to a functional gene may alter its expression. TEs can also cause an increase in the rate of mutation and can even mediate duplications and large insertions and deletions in the genome, promoting gross genetic rearrangements. Thus, the proper classification of the identified jumping genes is essential to understand their genetic and evolutionary effects in the genome. While computational methods have been developed that perform either binary classification or multi-label classification of TEs, few studies have focused on their hierarchical classification. The state-of-the-art machine learning classification method utilizes a Multi-Layer Perceptron (MLP), a class of neural network, for hierarchical classification of TEs. However, the existing methods have limited accuracy in classifying TEs. A more effective classifier, which can explain the role of TEs in germline and somatic evolution, is needed. In this study, we examine the performance of a variety of machine learning (ML) methods and propose a robust approach for the hierarchical classification of TEs, with higher accuracy, using Support Vector Machines (SVM). |
Tasks | Multi-Label Classification |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01674v3 |
https://arxiv.org/pdf/1907.01674v3.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-based-prediction-of |
Repo | |
Framework | |
A refined primal-dual analysis of the implicit bias
Title | A refined primal-dual analysis of the implicit bias |
Authors | Ziwei Ji, Matus Telgarsky |
Abstract | Recent work shows that gradient descent on linearly separable data is implicitly biased towards the maximum margin solution. However, no convergence rate which is tight in both $n$ (the dataset size) and $t$ (the training time) is given. This work proves that the normalized gradient descent iterates converge to the maximum margin solution at a rate of $O(\ln(n)/\ln(t))$, which is tight in both $n$ and $t$. The proof is via a dual convergence result: gradient descent induces a multiplicative weights update on the (normalized) SVM dual objective, whose convergence rate leads to the tight implicit bias rate. |
Tasks | |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04540v1 |
https://arxiv.org/pdf/1906.04540v1.pdf | |
PWC | https://paperswithcode.com/paper/a-refined-primal-dual-analysis-of-the |
Repo | |
Framework | |
An efficient Lagrangian-based heuristic to solve a multi-objective sustainable supply chain problem
Title | An efficient Lagrangian-based heuristic to solve a multi-objective sustainable supply chain problem |
Authors | Camila P. S. Tautenhain, Ana Paula Barbosa-Povoa, Bruna Mota, Mariá C. V. Nascimento |
Abstract | Sustainable Supply Chain (SSC) management aims at integrating economic, environmental and social goals to assist in the long-term planning of a company and its supply chains. There is no consensus in the literature as to whether social and environmental responsibilities are profit-compatible. However, the conflicting nature of these goals is explicit when considering specific assessment measures and, in this scenario, multi-objective optimization is a way to represent problems that simultaneously optimize the goals. This paper proposes a Lagrangian matheuristic method, called $AugMathLagr$, to solve a hard and relevant multi-objective problem found in the literature. $AugMathLagr$ was extensively tested using artificial instances defined by a generator presented in this paper. The results show a competitive performance of $AugMathLagr$ when compared with an exact multi-objective method limited by time and a matheuristic recently proposed in the literature and adapted here to address the studied problem. In addition, computational results on a case study are presented and analyzed, and demonstrate the outstanding performance of $AugMathLagr$. |
Tasks | |
Published | 2019-06-14 |
URL | https://arxiv.org/abs/1906.06375v1 |
https://arxiv.org/pdf/1906.06375v1.pdf | |
PWC | https://paperswithcode.com/paper/an-efficient-lagrangian-based-heuristic-to |
Repo | |
Framework | |
Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources
Title | Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources |
Authors | Qianhui Wu, Zijia Lin, Guoxin Wang, Hui Chen, Börje F. Karlsson, Biqing Huang, Chin-Yew Lin |
Abstract | For languages with no annotated resources, transferring knowledge from rich-resource languages is an effective solution for named entity recognition (NER). While all existing methods directly transfer from a source-learned model to a target language, in this paper, we propose to fine-tune the learned model with a few similar examples given a test case, which could benefit the prediction by leveraging the structural and semantic information conveyed in such similar examples. To this end, we present a meta-learning algorithm to find a good model parameter initialization that can quickly adapt to the given test case, and we propose to construct multiple pseudo-NER tasks for meta-training by computing sentence similarities. To further improve the model’s generalization ability across different languages, we introduce a masking scheme and augment the loss function with an additional maximum term during meta-training. We conduct extensive experiments on cross-lingual named entity recognition with minimal resources over five target languages. The results show that our approach significantly outperforms existing state-of-the-art methods across the board. |
Tasks | Meta-Learning, Named Entity Recognition |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.06161v1 |
https://arxiv.org/pdf/1911.06161v1.pdf | |
PWC | https://paperswithcode.com/paper/enhanced-meta-learning-for-cross-lingual |
Repo | |
Framework | |
Neural Theorem Provers Do Not Learn Rules Without Exploration
Title | Neural Theorem Provers Do Not Learn Rules Without Exploration |
Authors | Michiel de Jong, Fei Sha |
Abstract | Neural symbolic processing aims to combine the generalization of logical learning approaches and the performance of neural networks. The Neural Theorem Proving (NTP) model by Rocktäschel et al. (2017) learns embeddings for concepts and performs logical unification. While NTP is promising and effective in predicting facts accurately, we have little knowledge of how well it can extract true relationships among data. To this end, we create synthetic logical datasets with injected relationships, which can be generated on-the-fly, to test neural-based relation learning algorithms including NTP. We show that it has difficulty recovering relationships in all but the simplest settings. Critical analysis and diagnostic experiments suggest that the optimization algorithm suffers from poor local minima due to its greedy winner-takes-all strategy in identifying the most informative structure (proof path) to pursue. We alter the NTP algorithm to increase exploration, which sharply improves performance. We argue and demonstrate that it is insightful to benchmark with synthetic data with ground-truth relationships, for both evaluating models and revealing algorithmic issues. |
Tasks | Automated Theorem Proving |
Published | 2019-06-17 |
URL | https://arxiv.org/abs/1906.06805v1 |
https://arxiv.org/pdf/1906.06805v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-theorem-provers-do-not-learn-rules |
Repo | |
Framework | |