Feb 7, 2024

Introducing Unstructured Platform

Unstructured

Production Grade ETL for Enterprise LLM Apps

As retrieval-augmented generation (RAG) apps advance from prototype to production to mature enterprise tools, we have continued to evolve our product suite to meet the needs of our users. In early 2023, our open source library became the de facto standard for connecting LLMs to external data, and it now has over 5 million downloads and more than 100,000 unique users. As RAG moved to production, we launched SaaS, AWS Marketplace, and Azure Marketplace APIs, which now have thousands of users, to meet the needs of AI startups and innovation teams. Most recently, large organizations have come to rely increasingly on RAG apps for business-critical processes.

To meet the needs of these users, we are proud to introduce the Unstructured Platform, a solution that provides scalable, reliable, and secure ETL for enterprise-grade LLM apps. For the initial launch, Platform will include a user interface that allows users to:

  • Ingest documents from 10 upstream data sources (Azure Blob Storage, S3, SFTP, Databricks Delta Tables, Google Cloud Storage, Google Drive, OneDrive, Salesforce, Elasticsearch, and OpenSearch)

  • Deliver normalized outputs to 10 downstream destinations (Pinecone, Weaviate, Chroma, S3, Azure Blob Storage, Databricks Delta Tables, Google Cloud Storage, Azure AI Search, Postgres, and Elasticsearch)

  • Connect sources and destinations through workflows

  • Run ad-hoc workflows and schedule recurring workflows

  • Monitor the status of scheduled, in progress, and completed jobs
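To make the workflow idea concrete, here is a minimal Python sketch of how a source, a destination, an optional schedule, and job statuses might relate to one another. This is purely an illustrative model and not the Platform's API: the class names (`Connector`, `Workflow`, `Job`, `JobStatus`), the `schedule` field, and the example connector labels are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class JobStatus(Enum):
    # The three job states named in the list above.
    SCHEDULED = "scheduled"
    IN_PROGRESS = "in progress"
    COMPLETED = "completed"


@dataclass(frozen=True)
class Connector:
    kind: str  # e.g. "S3" or "Pinecone"
    name: str  # hypothetical user-chosen label


@dataclass
class Job:
    workflow_name: str
    status: JobStatus = JobStatus.SCHEDULED


@dataclass
class Workflow:
    name: str
    source: Connector
    destination: Connector
    # None models an ad-hoc workflow; a string such as "daily" models a
    # recurring one. The format is an assumption made for this sketch.
    schedule: Optional[str] = None
    jobs: List[Job] = field(default_factory=list)

    def run(self) -> Job:
        """Create a job and walk it through the three states."""
        job = Job(self.name)
        self.jobs.append(job)
        job.status = JobStatus.IN_PROGRESS
        # Ingestion, normalization, and delivery would happen here.
        job.status = JobStatus.COMPLETED
        return job


def count_by_status(jobs: List[Job]) -> Dict[JobStatus, int]:
    """Summarize jobs per status, as a monitoring view might."""
    counts = {status: 0 for status in JobStatus}
    for job in jobs:
        counts[job.status] += 1
    return counts


workflow = Workflow(
    name="contracts-to-pinecone",
    source=Connector("S3", "contracts-bucket"),
    destination=Connector("Pinecone", "contracts-index"),
)
finished = workflow.run()   # ad-hoc run, ends in COMPLETED
queued = Job(workflow.name)  # a job that has not started yet
summary = count_by_status([finished, queued])
```

In this toy model, `summary` holds one completed job and one scheduled job, which mirrors the kind of at-a-glance view the monitoring bullet describes.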

Unstructured Platform


We plan to expand to 30 source and destination connectors, add audio and image processing, allow users to bring their own embedding models, and support integration with external tools such as Azure AI Document Intelligence and AWS Textract. We’ll also add data storage and vector syncing, and introduce our next generation of models for table and form extraction.

Platform is available through a Pay As You Go option or a Subscription plan, or you can work with us to build a customized solution for your business. Check out our Platform page to see which option best fits your needs. You can also try our compute calculator to estimate your usage costs under each plan.

Worried about data security? We’ve completed SOC 2 Type 1, and our SOC 2 Type 2 certification is in progress. For users with data that can’t leave their network, we’re hard at work on an enterprise version of Platform that will run in the customer’s VPC using a control plane and data plane architecture. We’ll share details on our LinkedIn, Twitter, and blog. In the meantime, contact sales@unstructured.io if you have an enterprise use case you’d like to discuss.

We’re proud to be a part of your LLM journey and love to hear about what our users are doing. Join us on our Community Slack or reach out at hello@unstructured.io if you have questions or want to talk about what you’re doing with LLMs!