In this post, I mainly take notes on Spark Streaming. It is based on the open tutorial available here.
In this post, I mainly take notes on Spark DataFrames and DStreams. It is largely based on the open tutorial available here.
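To make the topic concrete, here is a minimal sketch of the classic DStream word count, assuming PySpark with a socket source on localhost:9999 (both the source and the 1-second batch interval are illustrative choices, not part of the tutorial):

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

# Local SparkContext and a StreamingContext with 1-second micro-batches.
sc = SparkContext("local[2]", "DStreamWordCount")
ssc = StreamingContext(sc, 1)

# Illustrative source: lines of text arriving on a local socket.
lines = ssc.socketTextStream("localhost", 9999)

# Classic word count, applied to each micro-batch.
counts = (lines.flatMap(lambda line: line.split(" "))
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

counts.pprint()         # print a sample of each batch's counts

ssc.start()             # start receiving and processing data
ssc.awaitTermination()  # run until stopped
```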
Spark is a unified analytics engine for large-scale data processing. It achieves high performance for both batch and streaming data by using a state-of-the-art DAG (Directed Acyclic Graph) scheduler, a query optimizer, and a physical execution engine.
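As a rough illustration of how Spark builds a DAG of lazy transformations and only executes it when an action is called, here is a small PySpark DataFrame sketch; the data and column names are made up, and `explain()` is used just to peek at the plan produced by the optimizer:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dag-sketch").getOrCreate()

# Toy data, purely illustrative.
df = spark.createDataFrame(
    [("a", 1), ("b", 2), ("a", 3), ("c", 4)],
    ["key", "value"],
)

# Transformations are lazy: they only build up a logical plan (a DAG).
agg = (df.filter(F.col("value") > 1)
         .groupBy("key")
         .agg(F.sum("value").alias("total")))

agg.explain()  # inspect the optimized physical plan
agg.show()     # an action finally triggers execution of the DAG
```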
In this post, I will introduce a set of techniques for improving the predictive performance of machine learning models, such as fine-tuning a model by searching for the optimal set of training conditions (hyperparameters). I will also record some techniques that are useful for particular kinds of modeling tasks.
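For example, one common way to search for a good set of training conditions is a cross-validated grid search over hyperparameters. The sketch below uses scikit-learn's GridSearchCV on a random forest with synthetic data; both choices are illustrative rather than the exact method used in the post:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data, standing in for whatever dataset the post uses.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Candidate "training conditions" (hyperparameters) to search over.
param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [None, 5, 10],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,                 # 5-fold cross-validation
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```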
Today I am going to introduce some methods for evaluating models. I will present performance measures that reasonably reflect a model's ability to predict or forecast unseen data, and we will see how to use R to apply these measures and methods to predictive models.
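The post itself works in R; purely for consistency with the other sketches here, the snippet below shows analogous hold-out measures (RMSE, MAE, R²) in Python with scikit-learn, as an illustrative substitute rather than the post's code:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Toy regression data, standing in for the post's dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.3, size=200)

# Hold out unseen data so the measures reflect predictive ability.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

print("RMSE:", np.sqrt(mean_squared_error(y_test, pred)))
print("MAE: ", mean_absolute_error(y_test, pred))
print("R^2: ", r2_score(y_test, pred))
```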