Paper Group ANR 585
Papers in this group:
Using Social Network Service to determine the Initial User Requirements for Small Software Businesses
Strong and Simple Baselines for Multimodal Utterance Embeddings
Experimental neural network enhanced quantum tomography
From self-tuning regulators to reinforcement learning and back again
Learning Instance-wise Sparsity for Accelerating Deep Models
Sinkhorn Divergences for Unbalanced Optimal Transport
Plexus Convolutional Neural Network (PlexusNet): A novel neural network architecture for histologic image analysis
Measuring Inter-group Agreement on zSlice Based General Type-2 Fuzzy Sets
Personal Universes: A Solution to the Multi-Agent Value Alignment Problem
HUBERT Untangles BERT to Improve Transfer across NLP Tasks
Exploring Methods for the Automatic Detection of Errors in Manual Transcription
On the Necessity and Effectiveness of Learning the Prior of Variational Auto-Encoder
Neural Architecture Search for Class-incremental Learning
Vision-Based Proprioceptive Sensing for Soft Inflatable Actuators
Neuron ranking – an informed way to condense convolutional neural networks architecture
Using Social Network Service to determine the Initial User Requirements for Small Software Businesses
Title | Using Social Network Service to determine the Initial User Requirements for Small Software Businesses |
Authors | Nazakat Ali, Jang-Eui Hong |
Abstract | Background/Objectives: The software engineering community has studied large software organizations extensively and provided suitable and interesting solutions for them. However, small software companies, which make up a large part of the software industry, have been overlooked. Methods/Statistical analysis: Current requirements engineering practices are not suitable for small software companies. We propose a social-network-based requirements engineering approach that complements traditional requirements engineering approaches and makes them suitable for small software companies. Findings: We applied our SNS-based requirements determination approach to assess its validity. As a result, 33.06% of invited end-users participated in our approach, and 156 distinct user requirements were identified. Users did not need requirements engineering knowledge to participate in our proposed SNS-based approach, which allowed a maximum number of users to be involved in the requirements elicitation process. By investigating the ideas and opinions communicated by users, we were able to identify a large number of user requirements. We observed that most user requirements were determined within a short period of time (7 days). Our experience with the SNS-based approach also indicates that end-users rarely know about non-functional requirements or express them explicitly. Improvements/Applications: We believe that researchers will consider SNSs other than Facebook, which would allow our SNS-based approach to be applied to requirements identification more broadly. We have tested our approach with Facebook, but we do not know how it would work with other SNSs. |
Tasks | |
Published | 2019-04-26 |
URL | http://arxiv.org/abs/1904.12583v1 |
http://arxiv.org/pdf/1904.12583v1.pdf | |
PWC | https://paperswithcode.com/paper/using-social-network-service-to-determine-the |
Repo | |
Framework | |
Strong and Simple Baselines for Multimodal Utterance Embeddings
Title | Strong and Simple Baselines for Multimodal Utterance Embeddings |
Authors | Paul Pu Liang, Yao Chong Lim, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov, Louis-Philippe Morency |
Abstract | Human language is a rich multimodal signal consisting of spoken words, facial expressions, body gestures, and vocal intonations. Learning representations for these spoken utterances is a complex research problem due to the presence of multiple heterogeneous sources of information. Recent advances in multimodal learning have followed the general trend of building more complex models that utilize various attention, memory and recurrent components. In this paper, we propose two simple but strong baselines to learn embeddings of multimodal utterances. The first baseline assumes a conditional factorization of the utterance into unimodal factors. Each unimodal factor is modeled using the simple form of a likelihood function obtained via a linear transformation of the embedding. We show that the optimal embedding can be derived in closed form by taking a weighted average of the unimodal features. In order to capture richer representations, our second baseline extends the first by factorizing into unimodal, bimodal, and trimodal factors, while retaining simplicity and efficiency during learning and inference. From a set of experiments across two tasks, we show strong performance on both supervised and semi-supervised multimodal prediction, as well as significant (10 times) speedups over neural models during inference. Overall, we believe that our strong baseline models offer new benchmarking options for future research in multimodal learning. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1906.02125v2 |
https://arxiv.org/pdf/1906.02125v2.pdf | |
PWC | https://paperswithcode.com/paper/190602125 |
Repo | |
Framework | |
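To make the first baseline concrete: if each unimodal feature vector is assumed to be a linear transformation of a shared utterance embedding under a Gaussian likelihood, the optimal embedding has the closed form the abstract mentions, a matrix-weighted average of the unimodal features. Below is a minimal numpy sketch under that assumption; the dimensions, names, and quadratic objective are ours, not taken from the paper.

```python
import numpy as np

# Hypothetical per-modality feature dims and a shared embedding dim.
DIMS = {"language": 300, "visual": 35, "acoustic": 74}
D_EMB = 128

rng = np.random.default_rng(0)
# W[m] maps the shared embedding to modality m's feature space.
W = {m: rng.standard_normal((d, D_EMB)) / np.sqrt(D_EMB) for m, d in DIMS.items()}

def embed(features):
    """Closed-form embedding: minimize sum_m ||x_m - W_m z||^2 over z.

    Setting the gradient to zero gives
        (sum_m W_m^T W_m) z = sum_m W_m^T x_m,
    i.e. z is a (matrix-)weighted average of back-projected unimodal features.
    """
    A = sum(W[m].T @ W[m] for m in features)          # (D_EMB, D_EMB)
    b = sum(W[m].T @ x for m, x in features.items())  # (D_EMB,)
    return np.linalg.solve(A, b)

x = {m: rng.standard_normal(d) for m, d in DIMS.items()}
z = embed(x)
print(z.shape)  # (128,)
```

No attention, memory, or recurrence is involved, which is where the inference-time speedups over neural models come from.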
Experimental neural network enhanced quantum tomography
Title | Experimental neural network enhanced quantum tomography |
Authors | Adriano Macarone Palmieri, Egor Kovlakov, Federico Bianchi, Dmitry Yudin, Stanislav Straupe, Jacob Biamonte, Sergei Kulik |
Abstract | Quantum tomography is currently ubiquitous for testing any implementation of a quantum information processing device. Various sophisticated procedures for state and process reconstruction from measured data are well developed and benefit from precise knowledge of the model describing state preparation and the measurement apparatus. However, physical models suffer from intrinsic limitations, as actual measurement operators and trial states cannot be known precisely. This scenario inevitably leads to state-preparation-and-measurement (SPAM) errors degrading reconstruction performance. Here we develop and experimentally implement a machine learning based protocol reducing SPAM errors. We trained a supervised neural network to filter the experimental data and hence uncovered salient patterns that characterize the measurement probabilities for the original state and the ideal experimental apparatus free from SPAM errors. We compared the neural network state reconstruction protocol with a protocol treating SPAM errors by process tomography, as well as with a SPAM-agnostic protocol with idealized measurements. The average reconstruction fidelity is shown to be enhanced by 10% and 27%, respectively. The presented methods apply to the vast range of quantum experiments which rely on tomography. |
Tasks | |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05902v2 |
http://arxiv.org/pdf/1904.05902v2.pdf | |
PWC | https://paperswithcode.com/paper/experimental-neural-network-enhanced-quantum |
Repo | |
Framework | |
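The protocol's core is a supervised denoiser: a network trained to map measured outcome frequencies (distorted by SPAM errors and shot noise) back to the probabilities an ideal apparatus would produce. The sketch below illustrates that idea on synthetic stand-in data with scikit-learn's MLPRegressor; the paper's network, measurement model, and training data are, of course, different.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
N_MEAS = 36   # hypothetical number of measurement settings per state

# Synthetic stand-in: "ideal" outcome probabilities and their SPAM-distorted,
# finite-statistics versions (a fixed linear distortion plus shot noise).
ideal = rng.dirichlet(np.ones(N_MEAS), size=2000)
distort = np.eye(N_MEAS) * 0.9 + 0.1 / N_MEAS
measured = ideal @ distort + rng.normal(0, 0.01, ideal.shape)

# Supervised denoiser: measured frequencies -> ideal probabilities.
net = MLPRegressor(hidden_layer_sizes=(128, 128), max_iter=500, random_state=0)
net.fit(measured[:1500], ideal[:1500])

corrected = net.predict(measured[1500:])
print(np.abs(corrected - ideal[1500:]).mean())   # denoised error ...
print(np.abs(measured[1500:] - ideal[1500:]).mean())  # ... vs. raw error
```

The corrected probabilities would then feed into a standard reconstruction routine (e.g. maximum likelihood) in place of the raw frequencies.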
From self-tuning regulators to reinforcement learning and back again
Title | From self-tuning regulators to reinforcement learning and back again |
Authors | Nikolai Matni, Alexandre Proutiere, Anders Rantzer, Stephen Tu |
Abstract | Machine learning and reinforcement learning (RL) are increasingly being applied to plan and control the behavior of autonomous systems interacting with the physical world. Examples include self-driving vehicles, distributed sensor networks, and agile robots. However, when machine learning is to be applied in these new settings, the algorithms had better come with the same types of reliability, robustness, and safety bounds that are hallmarks of control theory, or failures could be catastrophic. Thus, as learning algorithms are increasingly and more aggressively deployed in safety-critical settings, it is imperative that control theorists join the conversation. The goal of this tutorial paper is to provide a starting point for control theorists wishing to work on learning-related problems, by covering recent advances bridging learning and control theory, and by placing these results within the appropriate historical context of system identification and adaptive control. |
Tasks | |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11392v2 |
https://arxiv.org/pdf/1906.11392v2.pdf | |
PWC | https://paperswithcode.com/paper/from-self-tuning-regulators-to-reinforcement |
Repo | |
Framework | |
Learning Instance-wise Sparsity for Accelerating Deep Models
Title | Learning Instance-wise Sparsity for Accelerating Deep Models |
Authors | Chuanjian Liu, Yunhe Wang, Kai Han, Chunjing Xu, Chang Xu |
Abstract | Exploring deep convolutional neural networks with high efficiency and low memory usage is essential for a wide variety of machine learning tasks. Most existing approaches accelerate deep models by manipulating parameters or filters independently of the data, e.g., via pruning and decomposition. In contrast, we study this problem from a different perspective, by respecting the differences between instances. We develop instance-wise feature pruning by identifying the informative features for each instance. Specifically, through a feature decay regularization, we encourage the intermediate feature maps of each instance in a deep neural network to be sparse while preserving overall network performance. During online inference, subtle features of input images extracted by intermediate layers of a well-trained neural network can be eliminated to accelerate the subsequent calculations. We further use the coefficient of variation as a measure to select the layers that are appropriate for acceleration. Extensive experiments conducted on benchmark datasets and networks demonstrate the effectiveness of the proposed method. |
Tasks | |
Published | 2019-07-27 |
URL | https://arxiv.org/abs/1907.11840v1 |
https://arxiv.org/pdf/1907.11840v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-instance-wise-sparsity-for |
Repo | |
Framework | |
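As a rough illustration of the pipeline the abstract describes: an L1-style feature decay term pushes per-instance feature maps toward sparsity during training; at inference, weak channels are dropped per instance; and the coefficient of variation of channel energies across instances indicates which layers benefit. The sketch below is our reading of that recipe, not the paper's exact regularizer or pruning rule.

```python
import numpy as np

def feature_decay_loss(feature_maps, weight=1e-4):
    """L1 'feature decay' term added to the task loss so that per-instance
    intermediate activations become sparse (a sketch of the idea; the
    paper's exact regularizer may differ)."""
    return weight * sum(np.abs(f).sum() for f in feature_maps)

def prune_instance(feature_map, keep_ratio=0.5):
    """At inference, keep only the strongest channels for this instance."""
    # feature_map: (C, H, W) activations for one input.
    energy = np.abs(feature_map).sum(axis=(1, 2))      # per-channel L1
    k = max(1, int(keep_ratio * len(energy)))
    keep = np.argsort(energy)[-k:]
    return feature_map[keep], keep

def coefficient_of_variation(per_instance_energy):
    """CV = std/mean of channel energies; layers with high CV vary strongly
    across instances and are candidates for instance-wise acceleration."""
    e = np.asarray(per_instance_energy)
    return e.std() / (e.mean() + 1e-12)

fm = np.random.default_rng(2).standard_normal((64, 14, 14))
pruned, kept = prune_instance(fm)
print(pruned.shape, coefficient_of_variation(np.abs(fm).sum(axis=(1, 2))))
```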
Sinkhorn Divergences for Unbalanced Optimal Transport
Title | Sinkhorn Divergences for Unbalanced Optimal Transport |
Authors | Thibault Séjourné, Jean Feydy, François-Xavier Vialard, Alain Trouvé, Gabriel Peyré |
Abstract | This paper extends the formulation of Sinkhorn divergences to the unbalanced setting of arbitrary positive measures, providing both theoretical and algorithmic advances. Sinkhorn divergences leverage the entropic regularization of Optimal Transport (OT) to define geometric loss functions. They are differentiable, cheap to compute and do not suffer from the curse of dimensionality, while maintaining the geometric properties of OT; in particular, they metrize the weak$^*$ convergence. Extending these divergences to the unbalanced setting is of utmost importance, since most applications in data science require handling both transportation and creation/destruction of mass. This includes, for instance, problems as diverse as shape registration in medical imaging, density fitting in statistics, generative modeling in machine learning, and particle flows involving birth/death dynamics. Our first set of contributions is the definition and theoretical analysis of unbalanced Sinkhorn divergences. They enjoy the same properties as the balanced divergences (classical OT), which are obtained as a special case. Indeed, we show that they are convex, differentiable and metrize the weak$^*$ convergence. Our second set of contributions studies generalized Sinkhorn iterations, which enable a fast, stable and massively parallelizable algorithm to compute these divergences. We show, under mild assumptions, a linear rate of convergence, independent of the number of samples, i.e. one which can cope with arbitrary input measures. We also highlight the versatility of this method, which benefits from the latest advances in GPU computing, for instance through the KeOps library for fast and scalable kernel operations. |
Tasks | |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12958v1 |
https://arxiv.org/pdf/1910.12958v1.pdf | |
PWC | https://paperswithcode.com/paper/sinkhorn-divergences-for-unbalanced-optimal |
Repo | |
Framework | |
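For readers wanting to connect this to code: in the discrete case, generalized (unbalanced) Sinkhorn iterations take the standard scaling-algorithm form, where a KL marginal penalty of strength $\rho$ turns each Sinkhorn update into a power with exponent $\rho/(\rho+\varepsilon)$. The sketch below follows that form and the usual debiasing $S(\alpha,\beta) = \mathrm{OT}(\alpha,\beta) - \tfrac{1}{2}\mathrm{OT}(\alpha,\alpha) - \tfrac{1}{2}\mathrm{OT}(\beta,\beta)$; it is illustrative only, as the paper's unbalanced divergence includes further mass-dependent corrections, and a log-domain implementation is needed for small $\varepsilon$.

```python
import numpy as np

def unbalanced_sinkhorn(a, b, C, eps=0.1, rho=1.0, n_iter=200):
    """Generalized Sinkhorn scaling for unbalanced entropic OT with KL
    marginal penalties of strength rho (a sketch following the standard
    scaling-algorithm form; the paper's exact iterations may differ)."""
    K = np.exp(-C / eps)
    lam = rho / (rho + eps)           # exponent induced by the KL penalty
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(n_iter):
        u = (a / (K @ v)) ** lam
        v = (b / (K.T @ u)) ** lam
    return u[:, None] * K * v[None, :]   # unbalanced transport plan

def sinkhorn_divergence(a, x, b, y, **kw):
    """Debiased divergence S(a,b) = OT(a,b) - OT(a,a)/2 - OT(b,b)/2, with
    OT taken as <P, C> for the corresponding plans (illustrative only;
    the unbalanced case adds further mass-dependent terms)."""
    def cost(p, q, xp, xq):
        C = np.abs(xp[:, None] - xq[None, :]) ** 2
        return (unbalanced_sinkhorn(p, q, C, **kw) * C).sum()
    return cost(a, b, x, y) - 0.5 * cost(a, a, x, x) - 0.5 * cost(b, b, y, y)

rng = np.random.default_rng(3)
x, y = rng.normal(0, 1, 50), rng.normal(0.5, 1, 60)
a, b = np.full(50, 1 / 50), np.full(60, 1.2 / 60)   # unequal total mass
print(sinkhorn_divergence(a, x, b, y, eps=0.1, rho=1.0))
```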
Plexus Convolutional Neural Network (PlexusNet): A novel neural network architecture for histologic image analysis
Title | Plexus Convolutional Neural Network (PlexusNet): A novel neural network architecture for histologic image analysis |
Authors | Okyaz Eminaga, Mahmoud Abbas, Christian Kunder, Andreas M. Loening, Jeanne Shen, James D. Brooks, Curtis P. Langlotz, Daniel L. Rubin |
Abstract | Various convolutional neural network (CNN) models have been tested for their application to histologic image analysis. However, these models are prone to overfitting due to their large parameter capacity, requiring more data and expensive computational resources for training. Given these limitations, we developed and tested PlexusNet for histologic evaluation using a single GPU with a batch dimension of 16x512x512x3. We utilized 62 annotated hematoxylin-and-eosin-stained (H&E) histological images of radical prostatectomy cases from TCGA-PRAD and Stanford University, and 24 H&E whole-slide images of hepatocellular carcinoma from the TCGA-LIHC diagnostic histology images. The base models were DenseNet, Inception V3, and MobileNet, which were compared with PlexusNet. The Dice coefficient (DSC) was evaluated for each model. PlexusNet delivered comparable classification performance (DSC at patch level: 0.89) for H&E whole-slide images in distinguishing prostate cancer from normal tissue. The parameter capacity of PlexusNet is 9 times smaller than that of MobileNet and 58 times smaller than that of Inception V3. Similar findings were observed in distinguishing hepatocellular carcinoma from non-cancerous liver histology (DSC at patch level: 0.85). In conclusion, PlexusNet represents a novel model architecture for histologic image analysis that achieves classification performance comparable to the base models while providing orders-of-magnitude memory savings. |
Tasks | |
Published | 2019-08-24 |
URL | https://arxiv.org/abs/1908.09067v1 |
https://arxiv.org/pdf/1908.09067v1.pdf | |
PWC | https://paperswithcode.com/paper/plexus-convolutional-neural-network-plexusnet |
Repo | |
Framework | |
Measuring Inter-group Agreement on zSlice Based General Type-2 Fuzzy Sets
Title | Measuring Inter-group Agreement on zSlice Based General Type-2 Fuzzy Sets |
Authors | Javier Navarro, Christian Wagner |
Abstract | Recently, there has been much research into the modelling of uncertainty in human perception through Fuzzy Sets (FSs). Most of this research has focused on allowing respondents to express their (intra) uncertainty using intervals. Here, depending on the technique used and the types of uncertainty being modelled, different types of FSs can be obtained (e.g., Type-1, Interval Type-2, General Type-2). Arguably, one of the most flexible techniques is the Interval Agreement Approach (IAA), as it allows modelling the perception of all respondents without making assumptions such as outlier removal or predefined membership function types (e.g., Gaussian). A key aspect in the analysis of interval-valued data, and indeed of IAA-based agreement models of said data, is to determine the position and strength of agreement across all the sources/participants. While the Agreement Ratio was previously proposed to measure the strength of agreement in fuzzy-set-based models of interval data, that measure has only been applicable to Type-1 fuzzy sets. In this paper, we extend the Agreement Ratio to capture the degree of inter-group agreement modelled by a General Type-2 Fuzzy Set when using the IAA. This measure relies on a similarity measure to quantitatively express the relation between the different levels of agreement in a given FS. Synthetic examples are provided in order to demonstrate both the behaviour and the calculation of the measure. Finally, an application to real-world data is provided in order to show the potential of this measure for assessing the divergence of opinions for ambiguous concepts when heterogeneous groups of participants are involved. |
Tasks | |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04679v1 |
https://arxiv.org/pdf/1907.04679v1.pdf | |
PWC | https://paperswithcode.com/paper/measuring-inter-group-agreement-on-zslice |
Repo | |
Framework | |
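To ground the measure: under the IAA, the type-1 membership of a group at a point is simply the fraction of that group's intervals covering it, and a similarity measure between two groups' fuzzy sets then quantifies inter-group agreement. The sketch below uses the Jaccard similarity on the type-1 case for brevity; the paper's measure operates per zSlice of a General Type-2 set, and its choice of similarity measure may differ.

```python
import numpy as np

def iaa_membership(intervals, xs):
    """Interval Agreement Approach, type-1 case: membership at x is the
    fraction of respondents whose interval covers x."""
    intervals = np.asarray(intervals, dtype=float)
    covers = (intervals[:, :1] <= xs) & (xs <= intervals[:, 1:])
    return covers.mean(axis=0)

def jaccard(mu_a, mu_b):
    """Jaccard similarity between two discretized fuzzy sets (one common
    choice of similarity measure; the paper may use another)."""
    return np.minimum(mu_a, mu_b).sum() / np.maximum(mu_a, mu_b).sum()

# Two groups of respondents give intervals for the same concept.
xs = np.linspace(0, 10, 501)
group1 = [(2, 6), (3, 7), (2.5, 6.5), (4, 8)]
group2 = [(5, 9), (4.5, 8.5), (6, 9.5)]

mu1, mu2 = iaa_membership(group1, xs), iaa_membership(group2, xs)
print(f"inter-group agreement (Jaccard): {jaccard(mu1, mu2):.3f}")
```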
Personal Universes: A Solution to the Multi-Agent Value Alignment Problem
Title | Personal Universes: A Solution to the Multi-Agent Value Alignment Problem |
Authors | Roman V. Yampolskiy |
Abstract | AI Safety researchers attempting to align the values of highly capable intelligent systems with those of humanity face a number of challenges, including personal value extraction, multi-agent value merger, and finally in-silico encoding. State-of-the-art research in value alignment shows difficulties at every stage of this process, but the merger of incompatible preferences is a particularly difficult challenge to overcome. In this paper, we assume that the value extraction problem will be solved and propose a possible way to implement an AI solution that optimally aligns with the individual preferences of each user. We conclude by analyzing the benefits and limitations of the proposed approach. |
Tasks | |
Published | 2019-01-01 |
URL | http://arxiv.org/abs/1901.01851v1 |
http://arxiv.org/pdf/1901.01851v1.pdf | |
PWC | https://paperswithcode.com/paper/personal-universes-a-solution-to-the-multi |
Repo | |
Framework | |
HUBERT Untangles BERT to Improve Transfer across NLP Tasks
Title | HUBERT Untangles BERT to Improve Transfer across NLP Tasks |
Authors | Mehrad Moradshahi, Hamid Palangi, Monica S. Lam, Paul Smolensky, Jianfeng Gao |
Abstract | We introduce HUBERT, which combines the structured-representational power of Tensor-Product Representations (TPRs) with BERT, a pre-trained bidirectional Transformer language model. We show that there is shared structure between different NLP datasets that HUBERT, but not BERT, is able to learn and leverage. We validate the effectiveness of our model on the GLUE benchmark and the HANS dataset. Our experimental results show that untangling data-specific semantics from general language structure is key to better transfer among NLP tasks. |
Tasks | Language Modelling |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.12647v1 |
https://arxiv.org/pdf/1910.12647v1.pdf | |
PWC | https://paperswithcode.com/paper/hubert-untangles-bert-to-improve-transfer-1 |
Repo | |
Framework | |
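The TPR machinery underlying HUBERT binds a "filler" (content) vector to a "role" (structure) vector via an outer product and superposes the bindings; with orthonormal roles, individual fillers can be recovered exactly by unbinding. The toy sketch below shows only this binding/unbinding mechanism with random vectors; in HUBERT, fillers and roles are learned on top of BERT's representations.

```python
import numpy as np

rng = np.random.default_rng(4)
D_FILLER, D_ROLE, N_SYMBOLS, N_ROLES = 32, 16, 100, 8

fillers = rng.standard_normal((N_SYMBOLS, D_FILLER))  # "what" vectors
# Orthonormal "where" vectors (rows of a random orthogonal matrix).
roles = np.linalg.qr(rng.standard_normal((D_ROLE, D_ROLE)))[0][:N_ROLES]

def tpr_encode(symbol_ids):
    """Tensor Product Representation: bind filler i to role i by outer
    product and superpose, T = sum_i f_i r_i^T."""
    return sum(np.outer(fillers[s], roles[i]) for i, s in enumerate(symbol_ids))

def tpr_unbind(T, role_idx):
    """With orthonormal roles, T @ r_i recovers filler i exactly."""
    return T @ roles[role_idx]

seq = [5, 42, 7]
T = tpr_encode(seq)
print(np.allclose(tpr_unbind(T, 1), fillers[42]))  # True
```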
Exploring Methods for the Automatic Detection of Errors in Manual Transcription
Title | Exploring Methods for the Automatic Detection of Errors in Manual Transcription |
Authors | Xiaofei Wang, Jinyi Yang, Ruizhi Li, Samik Sadhu, Hynek Hermansky |
Abstract | The quality of data plays an important role in most deep learning tasks. In the speech community, transcription of speech recordings is indispensable. Since transcriptions are usually produced manually, automatically finding errors in them not only saves time and labor but also benefits the performance of tasks trained on them. Inspired by the success of hybrid automatic speech recognition, which uses both a language model and an acoustic model, two approaches to automatic error detection in transcriptions are explored in this work. A previous study using a biased language model approach, relying on a strong transcription-dependent language model, is first reviewed. We then propose a novel acoustic-model-based approach, focusing on the phonetic sequence of the speech. Both methods are evaluated on a fully real dataset, which was originally transcribed with errors and carefully corrected manually afterwards. |
Tasks | Language Modelling, Speech Recognition |
Published | 2019-04-08 |
URL | https://arxiv.org/abs/1904.04294v2 |
https://arxiv.org/pdf/1904.04294v2.pdf | |
PWC | https://paperswithcode.com/paper/exploring-methods-for-the-automatic-detection |
Repo | |
Framework | |
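One plausible way to read the acoustic-model-based approach: force-align the manual transcript to obtain a reference phone sequence per word, run a free phone recognizer over the same audio, and flag words where the two phone sequences disagree strongly. The sketch below implements only that comparison step; the alignment data, phone labels, and threshold are hypothetical, and the paper's actual detection criterion may differ.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    dp = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, pb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (pa != pb))
    return dp[-1]

def flag_suspect_words(word_alignments, threshold=0.5):
    """word_alignments: list of (word, ref_phones, hyp_phones), where
    ref_phones come from forced alignment of the manual transcript and
    hyp_phones from a free phone recognizer over the same time span.
    Words with high normalized phone edit distance are flagged."""
    suspects = []
    for word, ref, hyp in word_alignments:
        dist = edit_distance(ref, hyp) / max(len(ref), 1)
        if dist > threshold:
            suspects.append((word, dist))
    return suspects

alignments = [
    ("the", ["dh", "ah"],     ["dh", "ah"]),
    ("cat", ["k", "ae", "t"], ["k", "aa", "r", "t"]),  # likely mistranscribed
    ("sat", ["s", "ae", "t"], ["s", "ae", "t"]),
]
print(flag_suspect_words(alignments))  # [('cat', 0.67)]
```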
On the Necessity and Effectiveness of Learning the Prior of Variational Auto-Encoder
Title | On the Necessity and Effectiveness of Learning the Prior of Variational Auto-Encoder |
Authors | Haowen Xu, Wenxiao Chen, Jinlin Lai, Zhihan Li, Youjian Zhao, Dan Pei |
Abstract | Using powerful posterior distributions is a popular approach to achieving better variational inference. However, recent work has shown that the aggregated posterior may fail to match a unit Gaussian prior, so learning the prior becomes an alternative way to improve the lower bound. In this paper, for the first time in the literature, we prove the necessity and effectiveness of learning the prior when the aggregated posterior does not match the unit Gaussian prior, analyze why this situation may happen, and propose the hypothesis that learning the prior may improve the reconstruction loss, all of which are supported by our extensive experimental results. We show that using a learned Real NVP prior and just one latent variable in a VAE, we can achieve test NLL comparable to very deep state-of-the-art hierarchical VAEs, outperforming many previous works that use complex hierarchical VAE architectures. |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13452v1 |
https://arxiv.org/pdf/1905.13452v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-necessity-and-effectiveness-of |
Repo | |
Framework | |
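The paper's argument can be made concrete with a standard decomposition of the ELBO's KL term (sometimes called ELBO surgery); the notation below is ours.

```latex
% ELBO of a VAE with a learnable prior p_\lambda(z):
\mathcal{L}(x) = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
               - \mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p_\lambda(z)\right)

% Averaging the KL term over the data distribution gives
\mathbb{E}_{p_{\mathrm{data}}(x)}\,\mathrm{KL}\!\left(q_\phi(z \mid x) \,\|\, p_\lambda(z)\right)
  = I_q(x; z) + \mathrm{KL}\!\left(q_\phi(z) \,\|\, p_\lambda(z)\right)

% where q_\phi(z) = \mathbb{E}_{p_{\mathrm{data}}(x)}\, q_\phi(z \mid x) is the
% aggregated posterior. The second term vanishes exactly when the prior equals
% the aggregated posterior -- a flexible learned prior (e.g. Real NVP) can
% approach this, while a fixed unit Gaussian generally cannot.
```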
Neural Architecture Search for Class-incremental Learning
Title | Neural Architecture Search for Class-incremental Learning |
Authors | Shenyang Huang, Vincent François-Lavet, Guillaume Rabusseau |
Abstract | In class-incremental learning, a model learns continuously from a sequential data stream in which new classes occur. Existing methods often rely on static architectures that are manually crafted. These methods can be prone to capacity saturation, because a neural network’s ability to generalize to new concepts is limited by its fixed capacity. To understand how to expand a continual learner, we focus on the neural architecture design problem in the context of class-incremental learning: at each time step, the learner must optimize its performance on all classes observed so far by selecting the most competitive neural architecture. To tackle this problem, we propose Continual Neural Architecture Search (CNAS): an AutoML approach that takes advantage of the sequential nature of class-incremental learning to efficiently and adaptively identify strong architectures in a continual learning setting. We employ a task network to perform the classification task and a reinforcement learning agent as the meta-controller for architecture search. In addition, we apply network transformations to transfer weights from the previous learning step and to reduce the size of the architecture search space, thus saving a large amount of computational resources. We evaluate CNAS on the CIFAR-100 dataset under varied incremental learning scenarios with limited computational power (1 GPU). Experimental results demonstrate that CNAS outperforms architectures that are optimized for the entire dataset. In addition, CNAS is at least an order of magnitude more efficient than naively using existing AutoML methods. |
Tasks | AutoML, Continual Learning, Neural Architecture Search |
Published | 2019-09-14 |
URL | https://arxiv.org/abs/1909.06686v1 |
https://arxiv.org/pdf/1909.06686v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-architecture-search-for-class |
Repo | |
Framework | |
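The network transformations mentioned in the abstract are typically function-preserving operators in the Net2Net style, which let a newly proposed architecture inherit the previous step's weights. A minimal sketch of such a widening operator follows; CNAS's exact transformation set is not specified here, so treat this as illustrative.

```python
import numpy as np

def net2wider(W1, b1, W2, new_width, rng=np.random.default_rng(5)):
    """Function-preserving widening (Net2Net-style) of a hidden layer:
    duplicate randomly chosen units and split their outgoing weights so the
    network computes the same function before further training."""
    old_width = W1.shape[1]
    mapping = np.concatenate([np.arange(old_width),
                              rng.integers(0, old_width, new_width - old_width)])
    counts = np.bincount(mapping, minlength=old_width)
    W1_new = W1[:, mapping]                              # copy incoming weights
    b1_new = b1[mapping]
    W2_new = W2[mapping, :] / counts[mapping][:, None]   # split outgoing weights
    return W1_new, b1_new, W2_new

rng = np.random.default_rng(6)
W1, b1, W2 = rng.normal(size=(8, 4)), rng.normal(size=4), rng.normal(size=(4, 3))
x = rng.normal(size=8)
y = np.maximum(x @ W1 + b1, 0) @ W2

W1w, b1w, W2w = net2wider(W1, b1, W2, new_width=6)
yw = np.maximum(x @ W1w + b1w, 0) @ W2w
print(np.allclose(y, yw))  # True: widening preserves the function
```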
Vision-Based Proprioceptive Sensing for Soft Inflatable Actuators
Title | Vision-Based Proprioceptive Sensing for Soft Inflatable Actuators |
Authors | Peter Werner, Matthias Hofer, Carmelo Sferrazza, Raffaello D’Andrea |
Abstract | This paper presents a vision-based sensing approach for a soft linear actuator equipped with an integrated camera. The proposed vision-based sensing pipeline predicts the three-dimensional position of a point of interest on the actuator. To train and evaluate the algorithm, predictions are compared to ground truth data from an external motion capture system. An off-the-shelf distance sensor is integrated in a similar actuator and its performance is used as a baseline for comparison. The resulting sensing pipeline runs at 40 Hz in real time on a standard laptop and is additionally used for closed-loop elongation control of the actuator. It is shown that the approach can achieve accuracy comparable to the distance sensor. |
Tasks | Motion Capture |
Published | 2019-09-19 |
URL | https://arxiv.org/abs/1909.09096v1 |
https://arxiv.org/pdf/1909.09096v1.pdf | |
PWC | https://paperswithcode.com/paper/vision-based-proprioceptive-sensing-for-soft |
Repo | |
Framework | |
Neuron ranking – an informed way to condense convolutional neural networks architecture
Title | Neuron ranking – an informed way to condense convolutional neural networks architecture |
Authors | Kamil Adamczewski, Mijung Park |
Abstract | Convolutional neural networks (CNNs) have in recent years made a dramatic impact in science, technology and industry, yet the theoretical mechanisms behind CNN architecture design remain surprisingly vague. CNN neurons, including the network's distinctive elements, convolutional filters, are known to be learnable features, yet their individual roles in producing the output are rather unclear. The thesis of this work is that not all neurons are equally important, and some of them contain more information useful for performing a given task. Consequently, we quantify the significance of each filter and rank its importance in describing the input to produce the desired output. This work presents two different methods: (1) a game-theoretical approach based on the Shapley value, which computes the marginal contribution of each filter; and (2) a probabilistic approach based on what we call the importance switch, using variational inference. Strikingly, these two vastly different methods produce similar experimental results, confirming the general thesis that some of the filters are inherently more important than the others. The learned ranks can be readily used for network compression and interpretability. |
Tasks | |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.02519v2 |
https://arxiv.org/pdf/1907.02519v2.pdf | |
PWC | https://paperswithcode.com/paper/neuron-ranking-an-informed-way-to-condense |
Repo | |
Framework | |
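The game-theoretical method lends itself to a simple Monte Carlo approximation: average each filter's marginal contribution to a network score over random orderings. The sketch below does exactly that with a toy additive score standing in for a real forward pass; the paper's exact estimator and scoring function may differ.

```python
import numpy as np

def shapley_filter_ranks(n_filters, score, n_perm=200,
                         rng=np.random.default_rng(7)):
    """Monte Carlo Shapley values for filters: over random permutations, add
    filters one at a time and credit each with its marginal gain in the
    network score (e.g. validation accuracy). `score` takes a boolean mask
    over filters; here it stands in for an actual evaluation pass."""
    values = np.zeros(n_filters)
    for _ in range(n_perm):
        perm = rng.permutation(n_filters)
        mask = np.zeros(n_filters, dtype=bool)
        prev = score(mask)
        for i in perm:
            mask[i] = True
            cur = score(mask)
            values[i] += cur - prev
            prev = cur
    return values / n_perm

# Toy stand-in: filters 0-2 carry all the useful signal.
weights = np.array([0.5, 0.3, 0.2] + [0.0] * 5)
toy_score = lambda mask: float(weights[mask].sum())

vals = shapley_filter_ranks(8, toy_score)
print(np.argsort(vals)[::-1])  # filters 0, 1, 2 ranked on top
```

Filters with the lowest estimated values are the natural candidates for pruning when the ranks are used for network compression.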