In this post, I mainly take notes on Spark Streaming. It is based on the open tutorial available here.
In this post, I mainly take notes on Spark DataFrames and DStreams. It is largely based on the open tutorial available here.
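To make the topic concrete, here is a minimal sketch of the classic DStream word count, assuming PySpark with a socket source on localhost:9999 (both the source and the 1-second batch interval are illustrative choices, not part of the tutorial):

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

# Local SparkContext and a StreamingContext with 1-second micro-batches.
sc = SparkContext("local[2]", "DStreamWordCount")
ssc = StreamingContext(sc, 1)

# Illustrative source: lines of text arriving on a local socket.
lines = ssc.socketTextStream("localhost", 9999)

# Classic word count, applied to each micro-batch.
counts = (lines.flatMap(lambda line: line.split(" "))
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

counts.pprint()         # print a sample of each batch's counts

ssc.start()             # start receiving and processing data
ssc.awaitTermination()  # run until stopped
```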
Spark is a unified analytics engine for large-scale data processing. It achieves high performance for both batch and streaming data by using a state-of-the-art DAG (Directed Acyclic Graph) scheduler, a query optimizer, and a physical execution engine.
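As a rough illustration of how Spark builds a DAG of lazy transformations and only executes it when an action is called, here is a small PySpark DataFrame sketch; the data and column names are made up, and `explain()` is used just to peek at the plan produced by the optimizer:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dag-sketch").getOrCreate()

# Toy data, purely illustrative.
df = spark.createDataFrame(
    [("a", 1), ("b", 2), ("a", 3), ("c", 4)],
    ["key", "value"],
)

# Transformations are lazy: they only build up a logical plan (a DAG).
agg = (df.filter(F.col("value") > 1)
         .groupBy("key")
         .agg(F.sum("value").alias("total")))

agg.explain()  # inspect the optimized physical plan
agg.show()     # an action finally triggers execution of the DAG
```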
In this post, I will introduce a set of techniques for improving the predictive performance of machine learning models, such as fine-tuning a model by searching for the optimal set of training conditions (hyperparameters). I will also record some techniques that are useful for particular kinds of modeling tasks.
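For example, one common way to search for a good set of training conditions is a cross-validated grid search over hyperparameters. The sketch below uses scikit-learn's GridSearchCV on a random forest with synthetic data; both choices are illustrative rather than the exact method used in the post:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data, standing in for whatever dataset the post uses.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Candidate "training conditions" (hyperparameters) to search over.
param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [None, 5, 10],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,                 # 5-fold cross-validation
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```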
Today I am going to introduce some methods for evaluating models. I will present performance measures that reasonably reflect a model's ability to predict or forecast unseen data, and we will see how to use R to apply these measures and methods to predictive models.
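The post itself works in R; purely for consistency with the other sketches here, the snippet below shows analogous hold-out measures (RMSE, MAE, R²) in Python with scikit-learn, as an illustrative substitute rather than the post's code:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Toy regression data, standing in for the post's dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.3, size=200)

# Hold out unseen data so the measures reflect predictive ability.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

print("RMSE:", np.sqrt(mean_squared_error(y_test, pred)))
print("MAE: ", mean_absolute_error(y_test, pred))
print("R^2: ", r2_score(y_test, pred))
```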