Model Selection
Comparing, validating and choosing parameters and models
Clustering
Automatic grouping of similar objects into sets
Classification
Identifying which category an object belongs to
A simple and efficient tool for data mining and data analysis.
It is accessible to everybody and reusable in various contexts. It is written in Python and can run on top of popular neural network frameworks such as TensorFlow, CNTK or Theano.
Support for interactive data visualization and use of GUI toolkits
A browser-based notebook with support for mathematical expressions, inline plots and other rich media
Flexible, embeddable interpreters to load into one's own projects
Ability to analyze terabyte-scale data at interactive speeds, on your desktop
A single platform for tabular data, graphs, text, and images
Run the same code in a distributed system, using a Hadoop YARN or EC2 cluster
Focus on tasks simulation