Preprocessing Unstructured Data for LLM Applications

DeepLearning.AI

Course Overview

Enhancing a RAG system’s performance depends on efficiently processing diverse unstructured data sources. In this course, you’ll learn techniques for representing all sorts of unstructured data, such as text, images, and tables, from many different sources, and implement them to extend your LLM RAG pipeline to include Excel, Word, PowerPoint, PDF, and EPUB files.

You’ll learn:

1. How to preprocess data for your LLM application development, focusing on how to work with different document types.
2. How to extract and normalize various documents into a common JSON format and enrich it with metadata to improve search results.
3. Techniques for document image analysis, including layout detection and vision transformers, to extract and understand PDFs, images, and tables.
4. How to build a RAG bot that is able to ingest different documents like PDFs, PowerPoints, and Markdown files.

Apply the skills you’ll learn in this course to real-world scenarios, enhancing your RAG application and expanding its versatility.
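
To make the "common JSON format enriched with metadata" idea above more concrete, here is a minimal sketch using only the Python standard library. It is not the course's code or any particular library's API: the `Element` schema, the function names, and the metadata keys (`filename`, `filetype`, `page_number`) are all assumptions chosen for illustration, and the slide input is assumed to have been extracted already.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Element:
    """One normalized piece of a document (hypothetical schema)."""
    type: str                      # e.g. "Title" or "NarrativeText"
    text: str
    metadata: dict = field(default_factory=dict)


def normalize_markdown(source: str, filename: str) -> list[Element]:
    """Turn Markdown text into elements: '#' lines become titles, other lines text."""
    elements = []
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        meta = {"filename": filename, "filetype": "md"}
        if line.startswith("#"):
            elements.append(Element("Title", line.lstrip("#").strip(), meta))
        else:
            elements.append(Element("NarrativeText", line, meta))
    return elements


def normalize_slides(slides: list[dict], filename: str) -> list[Element]:
    """Turn already-extracted slides ({'title', 'body'}) into the same schema."""
    elements = []
    for number, slide in enumerate(slides, start=1):
        meta = {"filename": filename, "filetype": "pptx", "page_number": number}
        elements.append(Element("Title", slide["title"], dict(meta)))
        if slide.get("body"):
            elements.append(Element("NarrativeText", slide["body"], dict(meta)))
    return elements


def filter_by_metadata(elements: list[Element], **criteria) -> list[Element]:
    """Keep only elements whose metadata matches every key/value in criteria."""
    return [
        e for e in elements
        if all(e.metadata.get(k) == v for k, v in criteria.items())
    ]


def to_json(elements: list[Element]) -> str:
    """Serialize the normalized elements to one JSON array."""
    return json.dumps([asdict(e) for e in elements], indent=2)


if __name__ == "__main__":
    docs = normalize_markdown("# Intro\n\nRAG needs clean text.\n", "notes.md")
    docs += normalize_slides([{"title": "Agenda", "body": "Parsing"}], "deck.pptx")
    print(to_json(filter_by_metadata(docs, filetype="pptx")))
```

Because every source type ends up in the same shape, a retrieval step can narrow its candidates by metadata (for example, a single file or page) before running any similarity search, which is one way metadata can improve search results.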

Course FAQs

What are the prerequisites for 'Preprocessing Unstructured Data for LLM Applications'?

Prerequisites for this continuing education class are set by DeepLearning.AI. Most professional development online classes benefit from some prior knowledge. Please check the provider's page for specific requirements.

Will I receive a certificate for this CE class?

Yes, upon successful completion, DeepLearning.AI typically offers a shareable certificate to showcase your new skills and fulfill your continuing education requirements.

How long does this online course take to complete?

Completion times for online continuing education courses vary. The provider's website will have the most accurate estimate of the time commitment needed.