Data

Trainings

Apache Spark is a data analysis and aggregation engine written in Scala. It is also a distributed computing tool that spreads work across multiple worker machines in a cluster. What makes the pairing of Spark and Scala so special is that you can perform data analysis with either functional programming or SQL.

This course is tailored for data analysts and engineers looking to harness their data workloads and develop solutions.

Talks

Machine Learning with Spark MLlib

Spark includes a machine learning library called MLlib. We start with an introduction to machine learning, look at some common models, and then apply several of them.

In-Depth Jupyter Notebooks

Jupyter Notebooks have been a platform for data analysts and data scientists for the last few years, but their audience may be expanding to a more general population, including students, financial analysts, and people in other scientific disciplines. Running a Jupyter Notebook today is just as important as running a web browser. It is an essential platform for learning, conveying information, and telling a story.

Machine Learning Data Pipelines

How do we move information in real time and connect machine learning models to make decisions on our business data? This presentation goes through the machine learning and Kafka tools that help achieve that goal.

In-Depth JupyterLab

JupyterLab has been a platform for data analysts and data scientists for the last few years. Still, its audience may expand to a more general population, including students, financial analysts, and people in other scientific disciplines. Running JupyterLab today is just as important as running a web browser. It is an essential platform for learning, conveying information, and telling a story.