Tag: databricks

Uploading Json data to an on-premise REST API using Azure Data Factory and Azure Databricks

The goal of this exercise is to read data from CSV files, structure it as JSON and upload it to a REST API. The catch is that we are uploading this data to an on-premises API from our Azure cloud. The API requires authentication via an access token, and the API […]

Synapse vs Databricks: Which one to choose from?

Synapse or Databricks, which one to choose? This question comes up a lot when implementing a future-proof data platform. Both platforms are cloud-based, future-proof and state-of-the-art technologies for creating a Data Lakehouse architecture. However, each solution has its own strengths. Today, the ‘lakehouse’ trend is real and companies are switching from classic […]

Pandas, Koalas and PySpark in Python

If you landed on this page to learn more about animals, I have to disappoint you. Pandas, Koalas and PySpark are all packages that serve a similar purpose in the Python programming language. Python has steadily gained traction over the past years, as illustrated in the Stack Overflow trends. Originally designed as a general-purpose […]

Managed Big Data: DataBricks, Spark as a Service

The title accompanying this blog post is quite the mouthful. This blog post will explain why you should be using Spark. If it makes sense for your use case, we will then introduce you to the Databricks product, which is available on Azure. Being recognised as a Leader in the Magic Quadrant emphasizes the operational […]

Transfer learning in Spark for image recognition

Transfer learning in Spark demystified in less than three minutes of reading. Businesses that want to classify a huge set of images in a daily batch can do this by leveraging the parallel processing power of PySpark and the accuracy of models trained on a huge set of images using transfer learning. Let’s first explain the […]
