Paper Group ANR 169
Fixing the Infix: Unsupervised Discovery of Root-and-Pattern Morphology
Title | Fixing the Infix: Unsupervised Discovery of Root-and-Pattern Morphology |
Authors | Tarek Sakakini, Suma Bhat, Pramod Viswanath |
Abstract | We present an unsupervised and language-agnostic method for learning root-and-pattern morphology in Semitic languages. This form of morphology, abundant in Semitic languages, has not been handled in prior unsupervised approaches. We harness the syntactico-semantic information in distributed word representations to solve the long standing problem of root-and-pattern discovery in Semitic languages. Moreover, we construct an unsupervised root extractor based on the learned rules. We prove the validity of learned rules across Arabic, Hebrew, and Amharic, alongside showing that our root extractor compares favorably with a widely used, carefully engineered root extractor: ISRI. |
Tasks | |
Published | 2017-02-07 |
URL | http://arxiv.org/abs/1702.02211v2 |
http://arxiv.org/pdf/1702.02211v2.pdf | |
PWC | https://paperswithcode.com/paper/fixing-the-infix-unsupervised-discovery-of |
Repo | |
Framework | |
Training and Testing Object Detectors with Virtual Images
Title | Training and Testing Object Detectors with Virtual Images |
Authors | Yonglin Tian, Xuan Li, Kunfeng Wang, Fei-Yue Wang |
Abstract | In the area of computer vision, deep learning has produced a variety of state-of-the-art models that rely on massive labeled data. However, collecting and annotating images from the real world has a great demand for labor and money investments and is usually too passive to build datasets with specific characteristics, such as small area of objects and high occlusion level. Under the framework of Parallel Vision, this paper presents a purposeful way to design artificial scenes and automatically generate virtual images with precise annotations. A virtual dataset named ParallelEye is built, which can be used for several computer vision tasks. Then, by training the DPM (Deformable Parts Model) and Faster R-CNN detectors, we prove that the performance of models can be significantly improved by combining ParallelEye with publicly available real-world datasets during the training phase. In addition, we investigate the potential of testing the trained models from a specific aspect using intentionally designed virtual datasets, in order to discover the flaws of trained models. From the experimental results, we conclude that our virtual dataset is viable to train and test the object detectors. |
Tasks | |
Published | 2017-12-22 |
URL | http://arxiv.org/abs/1712.08470v1 |
http://arxiv.org/pdf/1712.08470v1.pdf | |
PWC | https://paperswithcode.com/paper/training-and-testing-object-detectors-with |
Repo | |
Framework | |
Learning Word Embeddings from Speech
Title | Learning Word Embeddings from Speech |
Authors | Yu-An Chung, James Glass |
Abstract | In this paper, we propose a novel deep neural network architecture, Sequence-to-Sequence Audio2Vec, for unsupervised learning of fixed-length vector representations of audio segments excised from a speech corpus, where the vectors contain semantic information pertaining to the segments, and are close to other vectors in the embedding space if their corresponding segments are semantically similar. The design of the proposed model is based on the RNN Encoder-Decoder framework, and borrows the methodology of continuous skip-grams for training. The learned vector representations are evaluated on 13 widely used word similarity benchmarks, and achieved competitive results to that of GloVe. The biggest advantage of the proposed model is its capability of extracting semantic information of audio segments taken directly from raw speech, without relying on any other modalities such as text or images, which are challenging and expensive to collect and annotate. |
Tasks | Learning Word Embeddings, Word Embeddings |
Published | 2017-11-05 |
URL | http://arxiv.org/abs/1711.01515v1 |
http://arxiv.org/pdf/1711.01515v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-word-embeddings-from-speech |
Repo | |
Framework | |
Integration of Machine Learning Techniques to Evaluate Dynamic Customer Segmentation Analysis for Mobile Customers
Title | Integration of Machine Learning Techniques to Evaluate Dynamic Customer Segmentation Analysis for Mobile Customers |
Authors | Cormac Dullaghan, Eleni Rozaki |
Abstract | The telecommunications industry is highly competitive, which means that the mobile providers need a business intelligence model that can be used to achieve an optimal level of churners, as well as a minimal level of cost in marketing activities. Machine learning applications can be used to provide guidance on marketing strategies. Furthermore, data mining techniques can be used in the process of customer segmentation. The purpose of this paper is to provide a detailed analysis of the C.5 algorithm, within naive Bayesian modelling for the task of segmenting telecommunication customers behavioural profiling according to their billing and socio-demographic aspects. Results have been experimentally implemented. |
Tasks | |
Published | 2017-01-31 |
URL | http://arxiv.org/abs/1702.02215v1 |
http://arxiv.org/pdf/1702.02215v1.pdf | |
PWC | https://paperswithcode.com/paper/integration-of-machine-learning-techniques-to |
Repo | |
Framework | |
Kernel k-Groups via Hartigan’s Method
Title | Kernel k-Groups via Hartigan’s Method |
Authors | Guilherme França, Maria L. Rizzo, Joshua T. Vogelstein |
Abstract | Energy statistics was proposed by Székely in the 80’s inspired by Newton’s gravitational potential in classical mechanics and it provides a model-free hypothesis test for equality of distributions. In its original form, energy statistics was formulated in Euclidean spaces. More recently, it was generalized to metric spaces of negative type. In this paper, we consider a formulation for the clustering problem using a weighted version of energy statistics in spaces of negative type. We show that this approach leads to a quadratically constrained quadratic program in the associated kernel space, establishing connections with graph partitioning problems and kernel methods in machine learning. To find local solutions of such an optimization problem, we propose kernel k-groups, which is an extension of Hartigan’s method to kernel spaces. Kernel k-groups is cheaper than spectral clustering and has the same computational cost as kernel k-means (which is based on Lloyd’s heuristic) but our numerical results show an improved performance, especially in higher dimensions. Moreover, we verify the efficiency of kernel k-groups in community detection in sparse stochastic block models which has fascinating applications in several areas of science. |
Tasks | Community Detection, graph partitioning |
Published | 2017-10-26 |
URL | https://arxiv.org/abs/1710.09859v3 |
https://arxiv.org/pdf/1710.09859v3.pdf | |
PWC | https://paperswithcode.com/paper/kernel-k-groups-via-hartigans-method |
Repo | |
Framework | |
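The energy statistic mentioned in the abstract above compares two samples through their pairwise distances. The following is a minimal sketch of the standard unweighted sample energy distance in Euclidean space, 2·E|X−Y| − E|X−X'| − E|Y−Y'|. It is not the paper's weighted formulation or the kernel k-groups algorithm, and the function names are hypothetical, chosen only for illustration.

```python
import math


def mean_pairwise_distance(xs, ys):
    """Average Euclidean distance over all pairs (x, y), x in xs and y in ys."""
    total = sum(math.dist(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def energy_distance(xs, ys):
    """Sample energy distance 2*E|X-Y| - E|X-X'| - E|Y-Y'|.

    Uses the simple V-statistic form (self-pairs included in the averages).
    Illustrative only: the paper works with a weighted version in spaces
    of negative type, which this sketch does not implement.
    """
    return (
        2.0 * mean_pairwise_distance(xs, ys)
        - mean_pairwise_distance(xs, xs)
        - mean_pairwise_distance(ys, ys)
    )


a = [(0.0, 0.0), (1.0, 0.0)]
b = [(5.0, 0.0), (6.0, 0.0)]
print(energy_distance(a, a))  # 0.0 for identical samples
print(energy_distance(a, b))  # 9.0 for these well-separated samples
```

A value near zero suggests the two samples could come from the same distribution, while larger values indicate separation. A clustering method built on this idea would look for a partition that makes the between-group statistic large.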
Evolutionary Acyclic Graph Partitioning
Title | Evolutionary Acyclic Graph Partitioning |
Authors | Orlando Moreira, Merten Popp, Christian Schulz |
Abstract | Directed graphs are widely used to model data flow and execution dependencies in streaming applications. This enables the utilization of graph partitioning algorithms for the problem of parallelizing computation for multiprocessor architectures. However due to resource restrictions, an acyclicity constraint on the partition is necessary when mapping streaming applications to an embedded multiprocessor. Here, we contribute a multi-level algorithm for the acyclic graph partitioning problem. Based on this, we engineer an evolutionary algorithm to further reduce communication cost, as well as to improve load balancing and the scheduling makespan on embedded multiprocessor architectures. |
Tasks | graph partitioning |
Published | 2017-09-25 |
URL | http://arxiv.org/abs/1709.08563v1 |
http://arxiv.org/pdf/1709.08563v1.pdf | |
PWC | https://paperswithcode.com/paper/evolutionary-acyclic-graph-partitioning |
Repo | |
Framework | |
On the definition of Shape Parts: a Dominant Sets Approach
Title | On the definition of Shape Parts: a Dominant Sets Approach |
Authors | Foteini Fotopoulou, George Economou |
Abstract | In the present paper a novel graph-based approach to the shape decomposition problem is addressed. The shape is appropriately transformed into a visibility graph enriched with local neighborhood information. A two-step diffusion process is then applied to the visibility graph that efficiently enhances the information provided, thus leading to a more robust and meaningful graph construction. Inspired by the notion of a clique as a strict cluster definition, the dominant sets algorithm is invoked, slightly modified to comport with the specific problem of defining shape parts. The cluster cohesiveness and a node participation vector are two important outputs of the proposed graph partitioning method. Opposed to most of the existing techniques, the final number of the clusters is determined automatically, by estimating the cluster cohesiveness on a random network generation process. Experimental results on several shape databases show the effectiveness of our framework for graph based shape decomposition. |
Tasks | graph construction, graph partitioning |
Published | 2017-09-11 |
URL | http://arxiv.org/abs/1709.03588v1 |
http://arxiv.org/pdf/1709.03588v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-definition-of-shape-parts-a-dominant |
Repo | |
Framework | |
3D Cell Nuclei Segmentation with Balanced Graph Partitioning
Title | 3D Cell Nuclei Segmentation with Balanced Graph Partitioning |
Authors | Julian Arz, Peter Sanders, Johannes Stegmaier, Ralf Mikut |
Abstract | Cell nuclei segmentation is one of the most important tasks in the analysis of biomedical images. With ever-growing sizes and amounts of three-dimensional images to be processed, there is a need for better and faster segmentation methods. Graph-based image segmentation has seen a rise in popularity in recent years, but is seen as very costly with regard to computational demand. We propose a new segmentation algorithm which overcomes these limitations. Our method uses recursive balanced graph partitioning to segment foreground components of a fast and efficient binarization. We construct a model for the cell nuclei to guide the partitioning process. Our algorithm is compared to other state-of-the-art segmentation algorithms in an experimental evaluation on two sets of realistically simulated inputs. Our method is faster, has similar or better quality and an acceptable memory overhead. |
Tasks | graph partitioning, Semantic Segmentation |
Published | 2017-02-17 |
URL | http://arxiv.org/abs/1702.05413v1 |
http://arxiv.org/pdf/1702.05413v1.pdf | |
PWC | https://paperswithcode.com/paper/3d-cell-nuclei-segmentation-with-balanced |
Repo | |
Framework | |
Semantic Web Today: From Oil Rigs to Panama Papers
Title | Semantic Web Today: From Oil Rigs to Panama Papers |
Authors | Rivindu Perera, Parma Nand, Boris Bacic, Wen-Hsin Yang, Kazuhiro Seki, Radek Burget |
Abstract | The next leap on the internet has already started as Semantic Web. At its core, Semantic Web transforms the document oriented web to a data oriented web enriched with semantics embedded as metadata. This change in perspective towards the web offers numerous benefits for vast amount of data intensive industries that are bound to the web and its related applications. The industries are diverse as they range from Oil & Gas exploration to the investigative journalism, and everything in between. This paper discusses eight different industries which currently reap the benefits of Semantic Web. The paper also offers a future outlook into Semantic Web applications and discusses the areas in which Semantic Web would play a key role in the future. |
Tasks | |
Published | 2017-11-05 |
URL | http://arxiv.org/abs/1711.01518v1 |
http://arxiv.org/pdf/1711.01518v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-web-today-from-oil-rigs-to-panama |
Repo | |
Framework | |
Weighted Low Rank Approximation for Background Estimation Problems
Title | Weighted Low Rank Approximation for Background Estimation Problems |
Authors | Aritra Dutta, Xin Li |
Abstract | Classical principal component analysis (PCA) is not robust to the presence of sparse outliers in the data. The use of the $\ell_1$ norm in the Robust PCA (RPCA) method successfully eliminates the weakness of PCA in separating the sparse outliers. In this paper, by sticking a simple weight to the Frobenius norm, we propose a weighted low rank (WLR) method to avoid the often computationally expensive algorithms relying on the $\ell_1$ norm. As a proof of concept, a background estimation model has been presented and compared with two $\ell_1$ norm minimization algorithms. We illustrate that as long as a simple weight matrix is inferred from the data, one can use the weighted Frobenius norm and achieve the same or better performance. |
Tasks | |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.01753v1 |
http://arxiv.org/pdf/1707.01753v1.pdf | |
PWC | https://paperswithcode.com/paper/weighted-low-rank-approximation-for |
Repo | |
Framework | |
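The abstract above describes attaching a weight to the Frobenius norm. One common way to do this is to take the norm of the elementwise (Hadamard) product of a weight matrix with the residual between the data and its low-rank estimate. The sketch below assumes that form. Whether the paper uses exactly this definition is an assumption, and the function name and example values are hypothetical.

```python
import math


def weighted_frobenius(residual, weights):
    """Frobenius norm of the elementwise product weights * residual.

    Both arguments are lists of rows with identical shapes. Entries with
    a small weight contribute little to the norm, which is how suspected
    outlier (foreground) pixels can be de-emphasized.
    """
    total = 0.0
    for r_row, w_row in zip(residual, weights):
        for r, w in zip(r_row, w_row):
            total += (w * r) ** 2
    return math.sqrt(total)


# Residual between a data matrix and a candidate background estimate.
residual = [[3.0, 0.0], [0.0, 4.0]]

uniform = [[1.0, 1.0], [1.0, 1.0]]        # ordinary Frobenius norm
mask_outlier = [[1.0, 1.0], [1.0, 0.0]]   # zero weight on the large entry

print(weighted_frobenius(residual, uniform))       # 5.0
print(weighted_frobenius(residual, mask_outlier))  # 3.0
```

With uniform weights the function reduces to the ordinary Frobenius norm. Setting a weight to zero removes that entry from the fitting objective entirely.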
Intelligent Fault Analysis in Electrical Power Grids
Title | Intelligent Fault Analysis in Electrical Power Grids |
Authors | Biswarup Bhattacharya, Abhishek Sinha |
Abstract | Power grids are one of the most important components of infrastructure in today’s world. Every nation is dependent on the security and stability of its own power grid to provide electricity to the households and industries. A malfunction of even a small part of a power grid can cause loss of productivity, revenue and in some cases even life. Thus, it is imperative to design a system which can detect the health of the power grid and take protective measures accordingly even before a serious anomaly takes place. To achieve this objective, we have set out to create an artificially intelligent system which can analyze the grid information at any given time and determine the health of the grid through the usage of sophisticated formal models and novel machine learning techniques like recurrent neural networks. Our system simulates grid conditions including stimuli like faults, generator output fluctuations, load fluctuations using Siemens PSS/E software and this data is trained using various classifiers like SVM, LSTM and subsequently tested. The results are excellent with our methods giving very high accuracy for the data. This model can easily be scaled to handle larger and more complex grid architectures. |
Tasks | |
Published | 2017-11-08 |
URL | http://arxiv.org/abs/1711.03026v1 |
http://arxiv.org/pdf/1711.03026v1.pdf | |
PWC | https://paperswithcode.com/paper/intelligent-fault-analysis-in-electrical |
Repo | |
Framework | |
Recursive Exponential Weighting for Online Non-convex Optimization
Title | Recursive Exponential Weighting for Online Non-convex Optimization |
Authors | Lin Yang, Cheng Tan, Wing Shing Wong |
Abstract | In this paper, we investigate the online non-convex optimization problem which generalizes the classic online convex optimization problem by relaxing the convexity assumption on the cost function. For this type of problem, the classic exponential weighting online algorithm has recently been shown to attain a sub-linear regret of $O(\sqrt{T\log T})$. In this paper, we introduce a novel recursive structure to the online algorithm to define a recursive exponential weighting algorithm that attains a regret of $O(\sqrt{T})$, matching the well-known regret lower bound. To the best of our knowledge, this is the first online algorithm with provable $O(\sqrt{T})$ regret for the online non-convex optimization problem. |
Tasks | |
Published | 2017-09-13 |
URL | http://arxiv.org/abs/1709.04136v1 |
http://arxiv.org/pdf/1709.04136v1.pdf | |
PWC | https://paperswithcode.com/paper/recursive-exponential-weighting-for-online |
Repo | |
Framework | |
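For orientation, the classic exponential weighting scheme that the abstract above builds on can be sketched over a finite set of actions: each action's probability is proportional to exp(−η × cumulative loss). This is only the textbook baseline on a finite action set, not the paper's recursive variant. The learning rate `eta`, the finite action set, and the function name are assumptions made for illustration.

```python
import math


def exponential_weights(loss_rounds, eta):
    """Classic exponential weighting over a finite action set.

    loss_rounds: list of rounds, each a list with one loss per action.
    eta: positive learning rate (an illustrative free parameter).
    Returns the distribution over actions after each observed round,
    which is the distribution that would be played in the next round.
    """
    n_actions = len(loss_rounds[0])
    cumulative = [0.0] * n_actions
    history = []
    for losses in loss_rounds:
        for i, loss in enumerate(losses):
            cumulative[i] += loss
        # Shift by the minimum cumulative loss for numerical stability;
        # the shift cancels out after normalization.
        smallest = min(cumulative)
        weights = [math.exp(-eta * (c - smallest)) for c in cumulative]
        norm = sum(weights)
        history.append([w / norm for w in weights])
    return history


# Two actions; the second one incurs loss 1 in every round.
rounds = [[0.0, 1.0], [0.0, 1.0]]
dists = exponential_weights(rounds, eta=math.log(2))
print(dists[0])  # roughly [0.667, 0.333]
print(dists[1])  # roughly [0.8, 0.2]
```

Probability mass shifts toward the action with the lower cumulative loss at a rate controlled by `eta`.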
A Multiscale Patch Based Convolutional Network for Brain Tumor Segmentation
Title | A Multiscale Patch Based Convolutional Network for Brain Tumor Segmentation |
Authors | Jean Stawiaski |
Abstract | This article presents a multiscale patch based convolutional neural network for the automatic segmentation of brain tumors in multi-modality 3D MR images. We use multiscale deep supervision and inputs to train a convolutional network. We evaluate the effectiveness of the proposed approach on the BRATS 2017 segmentation challenge where we obtained dice scores of 0.755, 0.900, 0.782 and 95% Hausdorff distance of 3.63mm, 4.10mm, and 6.81mm for enhanced tumor core, whole tumor and tumor core respectively. |
Tasks | Brain Tumor Segmentation |
Published | 2017-10-06 |
URL | http://arxiv.org/abs/1710.02316v1 |
http://arxiv.org/pdf/1710.02316v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multiscale-patch-based-convolutional |
Repo | |
Framework | |
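The Dice scores reported in the abstract above measure the overlap between a predicted segmentation mask and the reference mask, computed as 2|A∩B| / (|A| + |B|). The sketch below computes this for flattened binary masks. The function name is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the source.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flat binary masks.

    pred and truth are equal-length sequences of 0/1 (or False/True)
    values, e.g. a flattened 3D segmentation volume.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size_sum == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / size_sum


predicted = [1, 1, 0, 0]
reference = [1, 0, 1, 0]
print(dice_score(predicted, reference))  # 0.5
```

A score of 1.0 means the masks coincide exactly, and 0.0 means they do not overlap at all.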
Data Sets: Word Embeddings Learned from Tweets and General Data
Title | Data Sets: Word Embeddings Learned from Tweets and General Data |
Authors | Quanzhi Li, Sameena Shah, Xiaomo Liu, Armineh Nourbakhsh |
Abstract | A word embedding is a low-dimensional, dense and real-valued vector representation of a word. Word embeddings have been used in many NLP tasks. They are usually generated from a large text corpus. The embedding of a word captures both its syntactic and semantic aspects. Tweets are short, noisy and have unique lexical and semantic features that are different from other types of text. Therefore, it is necessary to have word embeddings learned specifically from tweets. In this paper, we present ten word embedding data sets. In addition to the data sets learned from just tweet data, we also built embedding sets from the general data and the combination of tweets with the general data. The general data consist of news articles, Wikipedia data and other web data. These ten embedding models were learned from about 400 million tweets and 7 billion words from the general text. In this paper, we also present two experiments demonstrating how to use the data sets in some NLP tasks, such as tweet sentiment analysis and tweet topic classification tasks. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2017-08-14 |
URL | http://arxiv.org/abs/1708.03994v1 |
http://arxiv.org/pdf/1708.03994v1.pdf | |
PWC | https://paperswithcode.com/paper/data-sets-word-embeddings-learned-from-tweets |
Repo | |
Framework | |
Aspect-Based Sentiment Analysis Using a Two-Step Neural Network Architecture
Title | Aspect-Based Sentiment Analysis Using a Two-Step Neural Network Architecture |
Authors | Soufian Jebbara, Philipp Cimiano |
Abstract | The World Wide Web holds a wealth of information in the form of unstructured texts such as customer reviews for products, events and more. By extracting and analyzing the expressed opinions in customer reviews in a fine-grained way, valuable opportunities and insights for customers and businesses can be gained. We propose a neural network based system to address the task of Aspect-Based Sentiment Analysis to compete in Task 2 of the ESWC-2016 Challenge on Semantic Sentiment Analysis. Our proposed architecture divides the task in two subtasks: aspect term extraction and aspect-specific sentiment extraction. This approach is flexible in that it allows to address each subtask independently. As a first step, a recurrent neural network is used to extract aspects from a text by framing the problem as a sequence labeling task. In a second step, a recurrent network processes each extracted aspect with respect to its context and predicts a sentiment label. The system uses pretrained semantic word embedding features which we experimentally enhance with semantic knowledge extracted from WordNet. Further features extracted from SenticNet prove to be beneficial for the extraction of sentiment labels. As the best performing system in its category, our proposed system proves to be an effective approach for the Aspect-Based Sentiment Analysis. |
Tasks | Aspect-Based Sentiment Analysis, Sentiment Analysis |
Published | 2017-09-19 |
URL | http://arxiv.org/abs/1709.06311v1 |
http://arxiv.org/pdf/1709.06311v1.pdf | |
PWC | https://paperswithcode.com/paper/aspect-based-sentiment-analysis-using-a-two |
Repo | |
Framework | |