Tag: spark

Lesson learned at the customer: Merging two tables to calculate On Time In Full (OTIF) [Part III]

In this final blog, we discuss the measures that need to be created. Note: this is the third part of the blog. Missed the second part? Read it here. From here on, we walk through all the calculations required to arrive at the end result for the time aspect of OTIF. […]
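The series builds these measures in Power BI's query editor (judging by the "Append Queries as New" step in Part II), but for readers arriving via the Spark tag, the core OTIF arithmetic translates naturally to PySpark. Below is a minimal sketch; the table name and the columns promised_date, delivered_date, qty_ordered, and qty_delivered are illustrative assumptions, not taken from the post:

```python
# Hypothetical sketch: computing OTIF from a merged orders/deliveries table.
# All table and column names here are assumptions for illustration.
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("otif-sketch").getOrCreate()

merged = spark.table("merged_orders_deliveries")  # assumed merged table

otif = (
    merged
    .withColumn("on_time", (F.col("delivered_date") <= F.col("promised_date")).cast("int"))
    .withColumn("in_full", (F.col("qty_delivered") >= F.col("qty_ordered")).cast("int"))
    # 1 only when the line is both on time and in full
    .withColumn("otif", F.col("on_time") * F.col("in_full"))
)

# OTIF score = share of order lines that were both on time and in full
otif.agg(F.avg("otif").alias("otif_score")).show()
```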

Lesson learned at the customer: Merging two tables to calculate On Time In Full (OTIF) [Part II]

Bringing everything together in one table. Note: this is the second part of the blog. Missed the first part? Read it here. Step 1: select the new Orders table you just made and click “Append Queries as New”. Now append the newly created Orders table to the newly created Deliveries table. If you have more than 2 tables […]
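For the Spark-minded reader, Power Query's "Append Queries as New" has a direct PySpark counterpart: a union of the two reshaped tables. A minimal sketch, with table names assumed rather than taken from the post:

```python
# Hedged PySpark equivalent of Power Query's "Append Queries as New":
# a union of the reshaped Orders and Deliveries tables. The table names
# are assumptions for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("append-sketch").getOrCreate()

orders = spark.table("orders_new")          # assumed reshaped Orders table
deliveries = spark.table("deliveries_new")  # assumed reshaped Deliveries table

# unionByName (Spark 3.1+) matches columns by name rather than position;
# allowMissingColumns=True fills columns absent from one side with nulls,
# mirroring how Power Query pads missing columns when appending.
combined = orders.unionByName(deliveries, allowMissingColumns=True)
combined.show()
```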

Kimball in a data lake? Come again?

Most companies are already familiar with data modelling (be it Kimball or any other modelling technique) and data warehousing with a classical ETL (Extract-Transform-Load) flow. In the age of big data, an increasing number of companies are moving towards a data lake, using Spark to store massive amounts of data. However, we often see […]
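To make the premise concrete, here is a minimal sketch of what a Kimball-style star schema can look like on a data lake queried with PySpark; the paths, table names, and columns are illustrative assumptions, not the model from the post:

```python
# A minimal sketch of a Kimball-style star schema on a data lake with
# PySpark: a fact table carrying surrogate keys that join to a dimension.
# Paths and columns are hypothetical.
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("kimball-sketch").getOrCreate()

# Dimension: one row per customer, keyed by a surrogate key
dim_customer = spark.read.parquet("/lake/curated/dim_customer")

# Fact: one row per sale, referencing the dimension by surrogate key
fact_sales = spark.read.parquet("/lake/curated/fact_sales")

# A typical star-schema query: aggregate facts by a dimension attribute
(fact_sales
    .join(dim_customer, "customer_sk")
    .groupBy("customer_segment")
    .agg(F.sum("sales_amount").alias("total_sales"))
    .show())
```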

Transfer learning in Spark for image recognition

Transfer learning in Spark, demystified in less than 3 minutes of reading. Businesses that want to classify a huge set of images in daily batches can do so by leveraging the parallel processing power of PySpark and the accuracy of models trained on huge image datasets, using transfer learning. Let’s first explain […]
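As a taste of the approach, the sketch below scores a batch of images in parallel with a pandas UDF and an ImageNet-pre-trained ResNet from torchvision; the input path and the choice of library are assumptions, since the excerpt does not name the exact stack:

```python
# Hedged sketch: batch image classification in PySpark, reusing ImageNet
# weights (the "transfer" in transfer learning). Path and library choice
# are assumptions, not taken from the post.
import io
import pandas as pd
import torch
from PIL import Image
from torchvision import models
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf

spark = SparkSession.builder.appName("transfer-learning-sketch").getOrCreate()

# Load raw image bytes; "binaryFile" is a built-in Spark 3 data source
images = spark.read.format("binaryFile").load("/data/images/*.jpg")

@pandas_udf("long")
def predict(content: pd.Series) -> pd.Series:
    # The pre-trained network is loaded once per Arrow batch on each executor
    weights = models.ResNet50_Weights.DEFAULT
    model = models.resnet50(weights=weights)
    model.eval()
    preprocess = weights.transforms()  # the preset matching the weights
    batch = torch.stack([
        preprocess(Image.open(io.BytesIO(b)).convert("RGB")) for b in content
    ])
    with torch.no_grad():
        return pd.Series(model(batch).argmax(dim=1).numpy())

# Each partition of images is scored in parallel across the cluster
images.select(predict("content").alias("imagenet_class")).show()
```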

Azure Synapse Analytics

Organizations understand the value of data more than ever: a Data Warehouse as a single source of truth, a data lake to store data for analytical exploration, self-service tools for data transformation, visualisation, and consumption, as well as clusters to process immense data volumes. Each of these use cases requires its own specialised tools, resulting in […]

Managed Big Data: DataBricks, Spark as a Service

The title accompanying this blog post is quite a mouthful. This blog post explains why you should be using Spark. If a use case makes sense, we will introduce you to the DataBricks product, which is available on Azure. Being recognised as a Leader in the Magic Quadrant emphasises the operational […]
