Webinar

Build ETL Workflows with Unstructured API

Learn how to build custom, programmatic ETL workflows for unstructured data using the Unstructured API and Workflow Endpoint.

Watch this webinar

Speakers

Maria Khalusova

Head of Developer Relations, Unstructured

Date of recording

Tuesday, April 29, 2025

Overview

Unstructured makes it easy to build scalable, programmatic ETL workflows for unstructured data. In this session, we’ll walk through how to use the Unstructured API to connect to a data source, such as Amazon S3, preprocess your documents, and pipe structured outputs into a destination, such as a vector store, a database, or a search engine.
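The source-to-destination flow described above can be sketched as a minimal, self-contained Python example. All function bodies here are stand-ins: in a real workflow, extraction, partitioning, and loading are handled by Unstructured's connectors and processing nodes, not by this toy code.

```python
# Toy sketch of the ETL flow: extract raw documents from a source,
# transform them into structured elements, load them into a destination.

def extract(source_uri: str) -> list[str]:
    """Stand-in for a source connector (e.g. Amazon S3): fetch raw documents."""
    return [f"raw document from {source_uri}"]

def transform(raw_docs: list[str]) -> list[dict]:
    """Stand-in for preprocessing: partition each document into structured elements."""
    return [{"type": "NarrativeText", "text": doc} for doc in raw_docs]

def load(elements: list[dict], destination: str) -> int:
    """Stand-in for a destination connector (vector store, database, search engine)."""
    print(f"writing {len(elements)} elements to {destination}")
    return len(elements)

elements = transform(extract("s3://my-bucket/docs/"))
written = load(elements, "my-vector-store")
```

The value of the Workflow Endpoint is that each of these stages is a managed, configurable node rather than hand-written glue code.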

Whether you’re building from scratch or looking to streamline an existing workflow, this webinar will show you how to automate the full ETL pipeline—from ingestion to transformation to destination—using Unstructured’s Workflow Endpoint and Python SDK.

Technical Details

Watch this recording to learn how to build and run a complete ETL pipeline using the Unstructured API and Workflow Endpoint. In this recorded session, we discussed how to:

  • Connect to S3 using source and destination connectors

  • Define a custom DAG of processing steps—including partitioning, chunking, and embedding

  • Preprocess documents with advanced partitioning strategies

  • Output structured data directly into your preferred destination

  • Track workflow runs and confirm successful completion via the Python SDK
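The steps above can be sketched as a workflow definition. The node schema, field names, and SDK calls below are illustrative assumptions based on the session's description, not verbatim Unstructured API calls; consult the Workflow Endpoint documentation for the exact payload shapes.

```python
# Hypothetical sketch of the DAG from the session -- partition, chunk,
# embed -- expressed as plain dicts mirroring the shape a Workflow
# Endpoint payload might take. Connector IDs are placeholders.

partition_node = {
    "name": "Partitioner",
    "type": "partition",
    "settings": {"strategy": "hi_res"},  # an advanced partitioning strategy
}
chunk_node = {
    "name": "Chunker",
    "type": "chunk",
    "settings": {"chunking_strategy": "by_title", "max_characters": 1500},
}
embed_node = {
    "name": "Embedder",
    "type": "embed",
    "settings": {"model_name": "text-embedding-3-small"},
}

workflow_payload = {
    "name": "s3-etl-demo",
    "source_id": "<your-s3-source-connector-id>",
    "destination_id": "<your-destination-connector-id>",
    "workflow_nodes": [partition_node, chunk_node, embed_node],
}

# With the unstructured-client Python SDK (usage assumed, not verified),
# the payload would be submitted and tracked roughly like this:
#
#   from unstructured_client import UnstructuredClient
#   client = UnstructuredClient(api_key_auth="<UNSTRUCTURED_API_KEY>")
#   workflow = client.workflows.create_workflow(...)   # submit the DAG
#   run = client.workflows.run_workflow(...)           # trigger a run
#   # ...then poll the run's status until it reports successful completion.

print([node["type"] for node in workflow_payload["workflow_nodes"]])
```

Defining the pipeline as data (a DAG of typed nodes) is what lets the same workflow be versioned, re-run, and tracked programmatically.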


Unstructured

ETL for LLMs

GDPR

Visit Unstructured’s Trust Portal to learn more.

Copyright © 2025 Unstructured
