Paper Group ANR 727
Food Recognition using Fusion of Classifiers based on CNNs. Distributed Evolutionary k-way Node Separators. Exploring Food Detection using CNNs. Probabilistic Global Scale Estimation for MonoSLAM Based on Generic Object Detection. MIT Advanced Vehicle Technology Study: Large-Scale Naturalistic Driving Study of Driver Behavior and Interaction with A …
Food Recognition using Fusion of Classifiers based on CNNs
Title | Food Recognition using Fusion of Classifiers based on CNNs |
Authors | Eduardo Aguilar, Marc Bolaños, Petia Radeva |
Abstract | With the arrival of convolutional neural networks, the complex problem of food recognition has experienced an important improvement in recent years. The best results have been obtained using methods based on very deep convolutional neural networks, which show that the deeper the model, the better the classification accuracy obtained. However, very deep neural networks may suffer from the overfitting problem. In this paper, we propose a combination of multiple classifiers based on different convolutional models that complement each other and thus achieve an improvement in performance. The evaluation of our approach is done on two public datasets: Food-101 as a dataset with a wide variety of fine-grained dishes, and Food-11 as a dataset of high-level food categories, where our approach outperforms the independent CNN models. |
Tasks | Food Recognition |
Published | 2017-09-14 |
URL | http://arxiv.org/abs/1709.04864v1 |
http://arxiv.org/pdf/1709.04864v1.pdf | |
PWC | https://paperswithcode.com/paper/food-recognition-using-fusion-of-classifiers |
Repo | |
Framework | |
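The abstract above describes combining several CNN-based classifiers but does not state the fusion rule. The following is a minimal sketch that assumes the simplest option, averaging the per-class probabilities of each model; the function names and the example scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_predictions(prob_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities from several classifiers.

    Averaging is an assumed fusion rule, chosen only for illustration.
    """
    if not prob_sets:
        raise ValueError("need at least one classifier output")
    n_classes = len(prob_sets[0])
    if any(len(p) != n_classes for p in prob_sets):
        raise ValueError("all classifiers must score the same classes")
    n_models = len(prob_sets)
    return [sum(p[c] for p in prob_sets) / n_models for c in range(n_classes)]


def predict(prob_sets: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest fused probability."""
    fused = fuse_predictions(prob_sets)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical softmax outputs of two CNNs over three food classes.
cnn_a = [0.50, 0.40, 0.10]
cnn_b = [0.20, 0.70, 0.10]
fused = fuse_predictions([cnn_a, cnn_b])
label = predict([cnn_a, cnn_b])
```

In this made-up example the first model alone would pick class 0, while the fused scores favour class 1, which is the kind of complementary behaviour the abstract alludes to.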
Distributed Evolutionary k-way Node Separators
Title | Distributed Evolutionary k-way Node Separators |
Authors | Peter Sanders, Christian Schulz, Darren Strash, Robert Williger |
Abstract | Computing high quality node separators in large graphs is necessary for a variety of applications, ranging from divide-and-conquer algorithms to VLSI design. In this work, we present a novel distributed evolutionary algorithm tackling the k-way node separator problem. A key component of our contribution includes new k-way local search algorithms based on maximum flows. We combine our local search with a multilevel approach to compute an initial population for our evolutionary algorithm, and further show how to modify the coarsening stage of our multilevel algorithm to create effective combine and mutation operations. Lastly, we combine these techniques with a scalable communication protocol, producing a system that is able to compute high quality solutions in a short amount of time. Our experiments against competing algorithms show that our advanced evolutionary algorithm computes the best result on 94% of the chosen benchmark instances. |
Tasks | |
Published | 2017-02-06 |
URL | http://arxiv.org/abs/1702.01692v1 |
http://arxiv.org/pdf/1702.01692v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-evolutionary-k-way-node |
Repo | |
Framework | |
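To make the problem statement concrete, here is a small sketch of what a valid k-way node separator looks like: a node set whose removal leaves k blocks with no edge running between two different blocks. The checker below is illustrative only; it ignores the balance constraint that the full problem usually imposes, and its name and the adjacency-dictionary input format are assumptions rather than anything from the paper.

```python
from typing import Dict, Hashable, Iterable, List, Set


def is_kway_separator(adj: Dict[Hashable, Iterable[Hashable]],
                      separator: Set[Hashable],
                      blocks: List[Set[Hashable]]) -> bool:
    """Check that separator + blocks partition the nodes and that
    no edge connects two different blocks."""
    nodes = set(adj)
    seen = set(separator)
    for block in blocks:
        if seen & block:
            return False  # a node appears in more than one part
        seen |= block
    if seen != nodes:
        return False  # some node is unassigned or unknown
    block_of = {v: i for i, block in enumerate(blocks) for v in block}
    for u, neighbours in adj.items():
        if u in separator:
            continue  # edges touching the separator are allowed
        for v in neighbours:
            if v in block_of and block_of[v] != block_of[u]:
                return False  # edge crosses between two blocks
    return True


# Path graph 1-2-3-4-5 stored as an undirected adjacency dictionary.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
valid = is_kway_separator(path, {3}, [{1, 2}, {4, 5}])
invalid = is_kway_separator(path, {5}, [{1, 2}, {3, 4}])
```

Removing node 3 splits the path into two disconnected blocks, whereas removing node 5 leaves the edge 2-3 joining the two proposed blocks.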
Exploring Food Detection using CNNs
Title | Exploring Food Detection using CNNs |
Authors | Eduardo Aguilar, Marc Bolaños, Petia Radeva |
Abstract | One of the most common critical factors directly related to the cause of a chronic disease is unhealthy diet consumption. In this sense, building an automatic system for food analysis could allow a better understanding of the nutritional information with respect to the food eaten and thus it could help in taking corrective actions in order to consume a better diet. The Computer Vision community has focused its efforts on several areas involved in visual food analysis, such as food detection, food recognition, food localization, and portion estimation. For food detection, the best results evidenced in the state of the art were obtained using Convolutional Neural Networks. However, the results of all these different approaches were obtained on different datasets and therefore are not directly comparable. This article proposes an overview of the latest advances in food detection and an optimal model based on the GoogLeNet Convolutional Neural Network, principal component analysis, and a support vector machine that outperforms the state of the art on two public food/non-food datasets. |
Tasks | Food Recognition |
Published | 2017-09-14 |
URL | http://arxiv.org/abs/1709.04800v1 |
http://arxiv.org/pdf/1709.04800v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-food-detection-using-cnns |
Repo | |
Framework | |
Probabilistic Global Scale Estimation for MonoSLAM Based on Generic Object Detection
Title | Probabilistic Global Scale Estimation for MonoSLAM Based on Generic Object Detection |
Authors | Edgar Sucar, Jean-Bernard Hayet |
Abstract | This paper proposes a novel method to estimate the global scale of a 3D reconstructed model within a Kalman filtering-based monocular SLAM algorithm. Our Bayesian framework integrates height priors over the detected objects belonging to a set of broad predefined classes, based on recent advances in fast generic object detection. Each observation is produced on single frames, so that we do not need a data association process along video frames. This is because we associate the height priors with the image region sizes at image locations where map feature projections fall within the object detection regions. We present very promising results of this approach obtained in several experiments with different object classes. |
Tasks | Object Detection |
Published | 2017-05-27 |
URL | http://arxiv.org/abs/1705.09860v1 |
http://arxiv.org/pdf/1705.09860v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-global-scale-estimation-for |
Repo | |
Framework | |
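The idea of refining a global scale from object height priors can be illustrated with a scalar Gaussian (Kalman-style) update. This is a deliberately simplified sketch, not the authors' formulation: it assumes the scale belief is a single Gaussian, that each detection yields an object height in unscaled map units, and that each class has a Gaussian height prior in metres. All names and numbers below are hypothetical.

```python
from typing import Tuple


def update_scale(mean: float, var: float,
                 prior_height_m: float, prior_std_m: float,
                 map_height: float) -> Tuple[float, float]:
    """Fuse one height-prior observation into a Gaussian scale belief.

    The observation is z = prior_height_m / map_height, with the prior's
    standard deviation propagated through the same division.
    """
    if map_height <= 0:
        raise ValueError("map_height must be positive")
    z = prior_height_m / map_height           # observed scale (metres per map unit)
    r = (prior_std_m / map_height) ** 2       # observation variance
    gain = var / (var + r)                    # scalar Kalman gain
    new_mean = mean + gain * (z - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


# Start from a vague belief, then fuse two hypothetical detections.
mean, var = 1.0, 100.0
mean, var = update_scale(mean, var, prior_height_m=1.7, prior_std_m=0.1, map_height=0.85)
mean, var = update_scale(mean, var, prior_height_m=1.5, prior_std_m=0.2, map_height=0.75)
```

Because each update uses only the current frame's detection and the current map, no association of detections across frames is needed, which mirrors the single-frame property described in the abstract.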
MIT Advanced Vehicle Technology Study: Large-Scale Naturalistic Driving Study of Driver Behavior and Interaction with Automation
Title | MIT Advanced Vehicle Technology Study: Large-Scale Naturalistic Driving Study of Driver Behavior and Interaction with Automation |
Authors | Lex Fridman, Daniel E. Brown, Michael Glazer, William Angell, Spencer Dodd, Benedikt Jenik, Jack Terwilliger, Aleksandr Patsekin, Julia Kindelsberger, Li Ding, Sean Seaman, Alea Mehler, Andrew Sipperley, Anthony Pettinato, Bobbie Seppelt, Linda Angell, Bruce Mehler, Bryan Reimer |
Abstract | For the foreseeable future, human beings will likely remain an integral part of the driving task, monitoring the AI system as it performs anywhere from just over 0% to just under 100% of the driving. The governing objectives of the MIT Autonomous Vehicle Technology (MIT-AVT) study are to (1) undertake large-scale real-world driving data collection that includes high-definition video to fuel the development of deep learning based internal and external perception systems, (2) gain a holistic understanding of how human beings interact with vehicle automation technology by integrating video data with vehicle state data, driver characteristics, mental models, and self-reported experiences with technology, and (3) identify how technology and other factors related to automation adoption and use can be improved in ways that save lives. In pursuing these objectives, we have instrumented 23 Tesla Model S and Model X vehicles, 2 Volvo S90 vehicles, 2 Range Rover Evoque, and 2 Cadillac CT6 vehicles for both long-term (over a year per driver) and medium-term (one month per driver) naturalistic driving data collection. Furthermore, we are continually developing new methods for analysis of the massive-scale dataset collected from the instrumented vehicle fleet. The recorded data streams include IMU, GPS, CAN messages, and high-definition video streams of the driver face, the driver cabin, the forward roadway, and the instrument cluster (on select vehicles). The study is on-going and growing. To date, we have 122 participants, 15,610 days of participation, 511,638 miles, and 7.1 billion video frames. This paper presents the design of the study, the data collection hardware, the processing of the data, and the computer vision algorithms currently being used to extract actionable knowledge from the data. |
Tasks | |
Published | 2017-11-19 |
URL | https://arxiv.org/abs/1711.06976v4 |
https://arxiv.org/pdf/1711.06976v4.pdf | |
PWC | https://paperswithcode.com/paper/mit-autonomous-vehicle-technology-study-large |
Repo | |
Framework | |
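The dataset totals quoted in the abstract can be turned into rough per-day and per-participant averages. The short calculation below uses only the four figures stated above; the derived averages are back-of-the-envelope values and assume nothing about how days, miles, or frames are distributed across drivers.

```python
# Totals quoted in the abstract.
participants = 122
days_of_participation = 15_610
miles = 511_638
video_frames = 7.1e9

# Derived averages (simple ratios of the totals).
miles_per_day = miles / days_of_participation
days_per_participant = days_of_participation / participants
frames_per_day = video_frames / days_of_participation

print(f"miles per participation day:  {miles_per_day:.1f}")
print(f"days per participant:         {days_per_participant:.1f}")
print(f"video frames per day:         {frames_per_day:,.0f}")
```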
Perspectives for Evaluating Conversational AI
Title | Perspectives for Evaluating Conversational AI |
Authors | Mahipal Jadeja, Neelanshi Varia |
Abstract | Conversational AI systems are becoming increasingly popular in everyday life. In this paper, we address the following key question: how to identify whether the design and development efforts for search-oriented conversational AI are successful. It is tricky to define ‘success’ in the case of conversational AI, and equally tricky to choose appropriate metrics for its evaluation. We propose four different perspectives, namely user experience, information retrieval, linguistic, and artificial intelligence, for the evaluation of conversational AI systems. Additionally, background details of conversational AI systems are provided, including desirable characteristics of personal assistants and differences between a chatbot and an AI-based personal assistant. The importance of personalization and how it can be achieved are explained in detail. Current challenges in the development of an ideal conversational AI (personal assistant) are also highlighted, along with guidelines for achieving a personalized experience for users. |
Tasks | Chatbot, Information Retrieval |
Published | 2017-09-14 |
URL | http://arxiv.org/abs/1709.04734v1 |
http://arxiv.org/pdf/1709.04734v1.pdf | |
PWC | https://paperswithcode.com/paper/perspectives-for-evaluating-conversational-ai |
Repo | |
Framework | |
Universal Sampling Rate Distortion
Title | Universal Sampling Rate Distortion |
Authors | Vinay Praneeth Boda, Prakash Narayan |
Abstract | We examine the coordinated and universal rate-efficient sampling of a subset of correlated discrete memoryless sources followed by lossy compression of the sampled sources. The goal is to reconstruct a predesignated subset of sources within a specified level of distortion. The combined sampling mechanism and rate distortion code are universal in that they are devised to perform robustly without exact knowledge of the underlying joint probability distribution of the sources. In Bayesian as well as nonBayesian settings, single-letter characterizations are provided for the universal sampling rate distortion function for fixed-set sampling, independent random sampling and memoryless random sampling. It is illustrated how these sampling mechanisms are successively better. Our achievability proofs bring forth new schemes for joint source distribution-learning and lossy compression. |
Tasks | |
Published | 2017-06-22 |
URL | http://arxiv.org/abs/1706.07409v1 |
http://arxiv.org/pdf/1706.07409v1.pdf | |
PWC | https://paperswithcode.com/paper/universal-sampling-rate-distortion |
Repo | |
Framework | |
Machine Assisted Analysis of Vowel Length Contrasts in Wolof
Title | Machine Assisted Analysis of Vowel Length Contrasts in Wolof |
Authors | Elodie Gauthier, Laurent Besacier, Sylvie Voisin |
Abstract | Growing digital archives and improving algorithms for automatic analysis of text and speech create new research opportunities for fundamental research in phonetics. Such empirical approaches allow statistical evaluation of a much larger set of hypotheses about phonetic variation and its conditioning factors (among them geographical / dialectal variants). This paper illustrates this vision and proposes to challenge automatic methods for the analysis of a not easily observable phenomenon: vowel length contrast. We focus on Wolof, an under-resourced language from Sub-Saharan Africa. In particular, we propose multiple features to make a fine evaluation of the degree of length contrast under different factors such as read vs. semi-spontaneous speech and standard vs. dialectal Wolof. Our measures, made fully automatically on more than 20k vowel tokens, show that our proposed features can highlight different degrees of contrast for each vowel considered. We notably show that contrast is weaker in semi-spontaneous speech and in a non-standard semi-spontaneous dialect. |
Tasks | |
Published | 2017-06-01 |
URL | http://arxiv.org/abs/1706.00465v1 |
http://arxiv.org/pdf/1706.00465v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-assisted-analysis-of-vowel-length |
Repo | |
Framework | |
Sample Efficient Feature Selection for Factored MDPs
Title | Sample Efficient Feature Selection for Factored MDPs |
Authors | Zhaohan Daniel Guo, Emma Brunskill |
Abstract | In reinforcement learning, the state of the real world is often represented by feature vectors. However, not all of the features may be pertinent for solving the current task. We propose Feature Selection Explore and Exploit (FS-EE), an algorithm that automatically selects the necessary features while learning a Factored Markov Decision Process, and prove that under mild assumptions, its sample complexity scales with the in-degree of the dynamics of just the necessary features, rather than the in-degree of all features. This can result in a much better sample complexity when the in-degree of the necessary features is smaller than the in-degree of all features. |
Tasks | Feature Selection |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03454v1 |
http://arxiv.org/pdf/1703.03454v1.pdf | |
PWC | https://paperswithcode.com/paper/sample-efficient-feature-selection-for |
Repo | |
Framework | |
Effective Representations of Clinical Notes
Title | Effective Representations of Clinical Notes |
Authors | Sebastien Dubois, Nathanael Romano, David C. Kale, Nigam Shah, Kenneth Jung |
Abstract | Clinical notes are a rich source of information about patient state. However, using them to predict clinical events with machine learning models is challenging. They are very high dimensional, sparse and have complex structure. Furthermore, training data is often scarce because it is expensive to obtain reliable labels for many clinical events. These difficulties have traditionally been addressed by manual feature engineering encoding task-specific domain knowledge. We explored the use of neural networks and transfer learning to learn representations of clinical notes that are useful for predicting future clinical events of interest, such as all-cause mortality, inpatient admissions, and emergency room visits. Our data comprised 2.7 million notes and 115 thousand patients at Stanford Hospital. We used the learned representations, along with commonly used bag of words and topic model representations, as features for predictive models of clinical events. We evaluated the effectiveness of these representations with respect to the performance of the models trained on small datasets. Models using the neural network derived representations performed significantly better than models using the baseline representations with small ($N < 1000$) training datasets. The learned representations offer significant performance gains over commonly used baseline representations for a range of predictive modeling tasks and cohort sizes, offering an effective alternative to task-specific feature engineering when plentiful labeled training data is not available. |
Tasks | Feature Engineering, Transfer Learning |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.07025v3 |
http://arxiv.org/pdf/1705.07025v3.pdf | |
PWC | https://paperswithcode.com/paper/effective-representations-of-clinical-notes |
Repo | |
Framework | |
Geometric Multi-Model Fitting with a Convex Relaxation Algorithm
Title | Geometric Multi-Model Fitting with a Convex Relaxation Algorithm |
Authors | Paul Amayo, Pedro Pinies, Lina M. Paz, Paul Newman |
Abstract | We propose a novel method to fit and segment multi-structural data via convex relaxation. Unlike greedy methods, which maximise the number of inliers, this approach efficiently searches for a soft assignment of points to models by minimising the energy of the overall classification. Our approach is similar to state-of-the-art energy minimisation techniques which use a global energy. However, we deal with the scaling factor (as the number of models increases) of the original combinatorial problem by relaxing the solution. This relaxation brings two advantages: first, by operating in the continuous domain we can parallelize the calculations. Second, it allows for the use of different metrics which results in a more general formulation. We demonstrate the versatility of our technique on two different problems of estimating structure from images: plane extraction from RGB-D data and homography estimation from pairs of images. In both cases, we report accurate results on publicly available datasets, in most of the cases outperforming the state-of-the-art. |
Tasks | Homography Estimation |
Published | 2017-06-05 |
URL | http://arxiv.org/abs/1706.01553v1 |
http://arxiv.org/pdf/1706.01553v1.pdf | |
PWC | https://paperswithcode.com/paper/geometric-multi-model-fitting-with-a-convex |
Repo | |
Framework | |
Learning Policies for Markov Decision Processes from Data
Title | Learning Policies for Markov Decision Processes from Data |
Authors | Manjesh K. Hanawal, Hao Liu, Henghui Zhu, Ioannis Ch. Paschalidis |
Abstract | We consider the problem of learning a policy for a Markov decision process consistent with data captured on the state-action pairs followed by the policy. We assume that the policy belongs to a class of parameterized policies which are defined using features associated with the state-action pairs. The features are known a priori; however, only an unknown subset of them could be relevant. The policy parameters that correspond to an observed target policy are recovered using $\ell_1$-regularized logistic regression that best fits the observed state-action samples. We establish bounds on the difference between the average reward of the estimated and the original policy (regret) in terms of the generalization error and the ergodic coefficient of the underlying Markov chain. To that end, we combine sample complexity theory and sensitivity analysis of the stationary distribution of Markov chains. Our analysis suggests that to achieve regret within order $O(\sqrt{\epsilon})$, it suffices to use training sample size on the order of $\Omega(\log n \cdot poly(1/\epsilon))$, where $n$ is the number of the features. We demonstrate the effectiveness of our method on a synthetic robot navigation example. |
Tasks | Robot Navigation |
Published | 2017-01-21 |
URL | http://arxiv.org/abs/1701.05954v1 |
http://arxiv.org/pdf/1701.05954v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-policies-for-markov-decision |
Repo | |
Framework | |
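The recovery step named in the abstract, $\ell_1$-regularized logistic regression, can be sketched with proximal gradient descent (a gradient step followed by soft-thresholding). The sketch below is not the authors' implementation: it assumes a two-action policy so that ordinary binary logistic regression applies, and the synthetic data, step size, and regularization weight are hypothetical choices made for illustration.

```python
import math
import random
from typing import List, Sequence


def soft_threshold(x: float, t: float) -> float:
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                    lam: float = 0.05, lr: float = 0.5,
                    iters: int = 500) -> List[float]:
    """Minimise mean logistic loss + lam * ||w||_1 by proximal gradient."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad[j] += err * xi[j] / n
        # Gradient step on the smooth loss, then shrinkage for the l1 term.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad)]
    return w


# Synthetic demonstrations: only feature 0 determines the chosen action.
random.seed(0)
X = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [1 if x[0] > 0 else 0 for x in X]
weights = fit_l1_logistic(X, y)
```

The $\ell_1$ penalty pushes the weights of features that do not explain the observed actions toward zero, which is how the relevant subset of features is singled out in this kind of approach.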
Using Deep learning methods for generation of a personalized list of shuffled songs
Title | Using Deep learning methods for generation of a personalized list of shuffled songs |
Authors | Rushin Gindra, Srushti Kotak, Asmita Natekar, Grishma Sharma |
Abstract | The shuffle mode, where songs are played in a randomized order that is decided for all tracks at once, is widely available in music player systems. Only a few music enthusiasts use this mode, since it is either too random to suit their mood or keeps repeating the same list every time. In this paper, we propose to build a convolutional deep belief network (CDBN) that is trained to perform genre recognition based on audio features retrieved from the records of the Million Song Dataset. The learned parameters are used to initialize a multi-layer perceptron that takes the extracted features of a user’s playlist, alongside the metadata, as input and classifies the songs into various categories. These categories are then shuffled retrospectively based on the metadata to autonomously provide a list of songs that listeners are likely to want under normal conditions. |
Tasks | |
Published | 2017-12-17 |
URL | https://arxiv.org/abs/1712.06076v2 |
https://arxiv.org/pdf/1712.06076v2.pdf | |
PWC | https://paperswithcode.com/paper/using-deep-learning-methods-for-generation-of |
Repo | |
Framework | |
Why Do Deep Neural Networks Still Not Recognize These Images?: A Qualitative Analysis on Failure Cases of ImageNet Classification
Title | Why Do Deep Neural Networks Still Not Recognize These Images?: A Qualitative Analysis on Failure Cases of ImageNet Classification |
Authors | Han S. Lee, Alex A. Agarwal, Junmo Kim |
Abstract | In the past decade, ImageNet has become the most notable and powerful benchmark database in the computer vision and machine learning community. As ImageNet has emerged as a representative benchmark for evaluating the performance of novel deep learning models, its evaluation tends to include only quantitative measures such as error rate, rather than qualitative analysis. Thus, there are few studies that analyze the failure cases of deep learning models in ImageNet, though there are numerous works analyzing the networks themselves and visualizing them. In this abstract, we qualitatively analyze the failure cases of ImageNet classification results from a recent deep learning model, and categorize these cases according to certain image patterns. Through this failure analysis, we believe it is possible to discover the remaining challenges in the ImageNet database to which current deep learning models are still vulnerable. |
Tasks | |
Published | 2017-09-11 |
URL | http://arxiv.org/abs/1709.03439v1 |
http://arxiv.org/pdf/1709.03439v1.pdf | |
PWC | https://paperswithcode.com/paper/why-do-deep-neural-networks-still-not |
Repo | |
Framework | |
Repeated Inverse Reinforcement Learning
Title | Repeated Inverse Reinforcement Learning |
Authors | Kareem Amin, Nan Jiang, Satinder Singh |
Abstract | We introduce a novel repeated Inverse Reinforcement Learning problem: the agent has to act on behalf of a human in a sequence of tasks and wishes to minimize the number of tasks in which it surprises the human by acting suboptimally with respect to how the human would have acted. Each time the human is surprised, the agent is provided a demonstration of the desired behavior by the human. We formalize this problem, including how the sequence of tasks is chosen, in a few different ways and provide some foundational results. |
Tasks | Imitation Learning |
Published | 2017-05-15 |
URL | http://arxiv.org/abs/1705.05427v3 |
http://arxiv.org/pdf/1705.05427v3.pdf | |
PWC | https://paperswithcode.com/paper/repeated-inverse-reinforcement-learning |
Repo | |
Framework | |