Jackson Kek

「In the midst of chaos, find your inner peace」

Self-Rewarding Language Models

Fine-tuning Llama 2 70B to outperform GPT-4 0613...

It’s been some time since my last post here. The world of AI doesn’t seem to slow down, and it’s always full of new things. It’s amazing how quickly things change; so...

Road to Efficient LLMs 2: QLoRA

QLoRA: Efficient Finetuning of Quantized LLMs

Previously, we discussed Low-Rank Adapters (LoRA) as a method for efficiently fine-tuning large language models (LLMs). In this post, we will discuss QLoRA, a new quantization method t...

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

100k context length on Llama 7B... How?

Recently, many efforts have emerged aiming to expand the context length of Large Language Models (LLMs), all while keeping their perplexity stable. Among these initiatives, a noteworth...

Road to Efficient LLMs 1: LoRA

Low-Rank Adaptation of Large Language Models

Given the rapid advancements in large language models (LLMs) like the recent launch of Llama 2 and research focusing on parameter efficiency, hallucination reduction, and accelerated inference, ...

Getting Started with Distributed Data Parallel in PyTorch: A Beginner's Guide

Learn Multi-GPU Training with DDP: A Step-by-Step Tutorial and Tips for Scaling Deep Learning

With the launch of cutting-edge models like ChatGPT, the world has been witnessing a remarkable shift towards the development of Large Language Models (LLMs). Take, for example, Meta’s...

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Speed up transformers... from a hardware perspective?

Ever tried naming a Large Language Model (LLM) that doesn’t have a Transformer hiding under its hood? I’d bet you’d have an easier time getting a cat to walk on a leash without transfo...

Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture

A new approach that learns and reasons more like a human?

What is self-supervised learning? Self-supervised learning, a learning paradigm passionately advocated by Meta AI’s VP and Chief AI Scientist, Yann LeCun, provides a remarkable avenue for models to...

Break-A-Scene: Extracting Multiple Concepts from a Single Image

The First Work That Attempts to Learn Multiple Concepts from a Single Image

What is text-to-image model personalization? In a world where memories are precious and photos hold cherished moments, there are times when we yearn for more. Imagine having the power to bring your...

How to Craft a Website using Jekyll and GitHub Pages

A Beginner's Guide to Building and Hosting Your Own Website

For quite some time now, I’ve harbored a keen interest in owning a personal blogging site. It’s become increasingly important to me to have a dedicated space where I can document and s...