Generative AI Advance Fine-Tuning for LLMs

This course is part of multiple programs.

Instructors: Joseph Santarcangelo and 3 others

What you'll learn

  •   In-demand gen AI engineering skills in fine-tuning LLMs that employers are actively looking for, in just 2 weeks
  •   Instruction-tuning and reward modeling with Hugging Face, plus LLMs as policies and RLHF (a minimal reward-modeling sketch follows this list)
  •   Direct preference optimization (DPO) with the partition function and Hugging Face, and how to derive the optimal solution to a DPO problem
  •   How to use proximal policy optimization (PPO) with Hugging Face to create a scoring function and perform dataset tokenization
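
As a taste of the reward-modeling topic above, here is a minimal, self-contained sketch of the pairwise (Bradley-Terry) reward loss in PyTorch. The GPT-2 backbone, the hand-written preference pair, and the `score` helper are illustrative assumptions rather than the course's lab code; the labs use Hugging Face tooling that wraps this same idea.

```python
# A minimal sketch of pairwise reward modeling (Bradley-Terry loss).
# Assumptions: a small GPT-2 backbone and one hand-written preference pair
# stand in for the labelled preference datasets used in the course labs.
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "gpt2"  # assumed backbone; any model with a scalar value head works
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# A reward model is just a language model with a 1-dimensional classification head.
reward_model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1)
reward_model.config.pad_token_id = tokenizer.pad_token_id

prompt = "Explain overfitting in one sentence."
chosen = prompt + " Overfitting is when a model memorizes training data and fails to generalize."
rejected = prompt + " Overfitting is good because the training loss goes down."

def score(text: str) -> torch.Tensor:
    """Return the scalar reward the model assigns to a prompt+response string."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    return reward_model(**inputs).logits.squeeze(-1)

# Bradley-Terry pairwise loss: push the chosen response's reward above the rejected one's.
loss = -F.logsigmoid(score(chosen) - score(rejected)).mean()
loss.backward()  # in a real training loop an optimizer step would follow
print(f"pairwise reward loss: {loss.item():.4f}")
```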
Skills you'll gain

  •   Prompt Engineering
  •   Large Language Modeling
  •   Performance Tuning
  •   Reinforcement Learning
  •   Generative AI
  •   Natural Language Processing
There are 2 modules in this course

During this course, you’ll explore different approaches to fine-tuning causal LLMs with human feedback and direct preference. You’ll look at LLMs as policies that define probability distributions over generated responses, and at the concepts of instruction-tuning with Hugging Face. You’ll learn to calculate rewards using human feedback and reward modeling with Hugging Face. Plus, you’ll explore reinforcement learning from human feedback (RLHF), proximal policy optimization (PPO) and the PPO Trainer, and optimal solutions to direct preference optimization (DPO) problems. As you learn, you’ll get valuable hands-on experience in online labs where you’ll work on reward modeling, PPO, and DPO. If you’re looking to add in-demand capabilities in fine-tuning LLMs to your resume, ENROLL TODAY and build the job-ready skills employers are looking for in just two weeks!
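
To make the DPO piece concrete, the short sketch below computes the DPO loss directly in PyTorch: because the partition function cancels, the objective reduces to a logistic loss on the difference of policy-versus-reference log-ratios for the chosen and rejected responses. The beta value and the toy log-probabilities are assumptions for illustration; in the course labs these log-probs come from a causal LM forward pass via Hugging Face.

```python
# A minimal sketch of the DPO objective computed directly in PyTorch.
# Assumptions: per-sequence summed log-probabilities under the policy and the
# frozen reference model are already available; beta is a free hyperparameter,
# not a value taken from the course materials.
import torch
import torch.nn.functional as F

beta = 0.1  # assumed strength of the implicit KL constraint toward the reference model

# Per-example summed log-probs log pi(y|x) for chosen (y_w) and rejected (y_l) responses.
policy_logp_chosen   = torch.tensor([-12.3, -15.0], requires_grad=True)
policy_logp_rejected = torch.tensor([-11.8, -14.2], requires_grad=True)
ref_logp_chosen      = torch.tensor([-12.0, -15.1])
ref_logp_rejected    = torch.tensor([-11.5, -14.0])

# DPO rewrites the RLHF objective so the partition function cancels, leaving a
# logistic loss on the difference of log-ratios:
# L = -log sigmoid( beta * [ log(pi(y_w)/ref(y_w)) - log(pi(y_l)/ref(y_l)) ] )
chosen_ratio   = policy_logp_chosen - ref_logp_chosen
rejected_ratio = policy_logp_rejected - ref_logp_rejected
loss = -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
loss.backward()  # in practice these log-probs come from a causal LM forward pass
print(f"DPO loss: {loss.item():.4f}")
```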

    Fine-Tuning Causal LLMs with Human Feedback and Direct Preference

