Pet (dog and cat) classification using Transfer Learning on FastAI

Image Classification workflow

Ref : AI that classifies images of 37 dog and cat breeds, using the Pet Dataset to train a Machine Learning model and build a Deep Neural Network with fastai in Python – Image Classification ep.1

Training Dog vs Cat

Tool : FastAI

Initial setup

Magic command
Import library

Prepare the data

untar_data(URLs.PETS) downloads the archive from the URL and extracts it
Get the list of all image paths with get_image_files
Set the batch size, seed, and regex pattern
Create a DataBunch using ImageDataBunch

Regular Expression : from_name_re extracts the class from the file name [Data Pipeline]
Data Transform : ds_tfms=get_transforms() [Data Augmentation]
Image size
Batch size
Normalize : normalize(imagenet_stats)
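The label-from-file-name step can be tried on its own with the standard re module. This is a minimal sketch, not fastai's internal code: the two example paths and the helper name label_from_path are made up for illustration, and the pattern is the one used later in these notes with the dot escaped.

```python
import re

# Capture everything between the last "/" and "_<number>.jpg".
PATTERN = re.compile(r"/([^/]+)_\d+\.jpg$")


def label_from_path(path: str) -> str:
    """Return the breed name encoded in an image path (hypothetical helper)."""
    match = PATTERN.search(path)
    if match is None:
        raise ValueError(f"no label found in {path!r}")
    return match.group(1)


# Made-up paths that follow the Pet dataset naming scheme.
example_paths = [
    "oxford-iiit-pet/images/Abyssinian_12.jpg",
    "oxford-iiit-pet/images/american_bulldog_203.jpg",
]
labels = [label_from_path(p) for p in example_paths]
print(labels)
```

Breed names that contain underscores still work, because the pattern only treats the final underscore followed by digits as the separator.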

Explore the resulting data

databunch.show_batch()
databunch.classes
databunch.c

Create the model

cnn_learner by default replaces the final layer with a dense layer sized to the number of classes
The downloaded pretrained model is saved at $home/.torch/models
Pass the DataBunch, models.resnet34, and metrics=accuracy

Train the model

learner.fit_one_cycle(4) trains with the one-cycle policy, in which the Learning Rate changes during training instead of staying constant
1 epoch means the whole Dataset has been fed to the model once
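To make the "non-constant Learning Rate" concrete, here is a rough sketch of a one-cycle schedule in plain Python. It is an illustration, not fastai's implementation: the warm-up fraction and the two divisors are assumed defaults, and the dataset size and batch size are made-up numbers used only to count steps.

```python
import math


def cosine_anneal(start, end, pct):
    """Move smoothly from start (pct=0) to end (pct=1) along a cosine curve."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)


def one_cycle_lr(step, total_steps, lr_max, pct_start=0.25, div=25.0, div_final=1e5):
    """Learning rate at a given step: warm up to lr_max, then anneal far below it.

    pct_start, div and div_final are assumed values for this sketch.
    """
    pct = step / max(total_steps - 1, 1)
    if pct < pct_start:
        return cosine_anneal(lr_max / div, lr_max, pct / pct_start)
    return cosine_anneal(lr_max, lr_max / div_final, (pct - pct_start) / (1 - pct_start))


# Hypothetical numbers: 5000 training images, batch size 64, 4 epochs.
steps_per_epoch = math.ceil(5000 / 64)   # one epoch = one pass over all batches
total_steps = 4 * steps_per_epoch
lr_max = 1e-3
schedule = [one_cycle_lr(s, total_steps, lr_max) for s in range(total_steps)]
print(schedule[0], max(schedule), schedule[-1])
```

The schedule starts low, peaks about a quarter of the way through training, and ends several orders of magnitude below the peak.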

View the results

learner.recorder.plot_losses() plots the Loss
learner.recorder.plot_lr() plots the Learning Rate
learner.recorder.plot_metrics() plots the Accuracy
learner.save() and learner.load() save and load the model
Create ClassificationInterpretation.from_learner(learner) to help interpret the results
interpretation.plot_top_losses(9) shows the 9 predictions with the highest loss, i.e. the worst mistakes
interpretation.plot_confusion_matrix()
interpretation.most_confused()
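The idea behind most_confused can be sketched without fastai: count every (actual, predicted) pair that disagrees and sort by frequency. The function below is a simplified stand-in written for these notes, and the labels are invented.

```python
from collections import Counter


def most_confused(actual, predicted, min_val=1):
    """Return (actual, predicted, count) for wrong predictions, most frequent first."""
    counts = Counter((a, p) for a, p in zip(actual, predicted) if a != p)
    pairs = [(a, p, n) for (a, p), n in counts.items() if n >= min_val]
    return sorted(pairs, key=lambda item: item[2], reverse=True)


# Made-up validation results.
actual    = ["beagle", "beagle", "beagle", "pug", "pug", "bengal", "bengal"]
predicted = ["basset", "basset", "beagle", "pug", "beagle", "bengal", "egyptian_mau"]

confused = most_confused(actual, predicted, min_val=2)
print(confused)
```

Raising min_val hides pairs that were confused only once or twice, which keeps the list readable when there are 37 classes.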

Continue training

learner.unfreeze()
learner.lr_find()

This function trial-trains the model while gradually increasing the Learning Rate until the Loss blows up, to see how large a Learning Rate the model can tolerate

learner.recorder.plot()

Plot the Loss against the Learning Rate. The Loss rises slowly as the Learning Rate increases, then gets steeper until it shoots through the ceiling at around 1e-3. Pick the Learning Rate just before the curve turns steep, reduce it by a factor of 2-5, and try training. If the result is not satisfactory, load the saved model again and adjust the Learning Rate
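The rule of thumb above can be written as a small heuristic. This is only a sketch of that rule, not what lr_find itself computes: the blow-up threshold, the safety factor, and the recorded (Learning Rate, Loss) pairs are all assumptions chosen for illustration.

```python
import math


def suggest_lr(lrs, losses, blowup_ratio=1.5, safety_factor=2.0):
    """Take the last learning rate before the loss blows up, then shrink it.

    blowup_ratio: loss above best_loss * blowup_ratio counts as "too steep" (assumed).
    safety_factor: how much to reduce the chosen rate, 2-5 per the notes.
    """
    best_loss = losses[0]
    last_ok = lrs[0]
    for lr, loss in zip(lrs, losses):
        if loss > best_loss * blowup_ratio:
            break
        best_loss = min(best_loss, loss)
        last_ok = lr
    return last_ok / safety_factor


# Made-up curve: flat at first, steep after 1e-3.
lrs    = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
losses = [0.30, 0.30, 0.31, 0.35, 0.80, 3.00]

chosen = suggest_lr(lrs, losses)
print(chosen)
```

With this invented curve the last safe rate is 1e-3, and halving it gives the 5e-4 upper bound used in the next training call.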

learner.fit_one_cycle(3, max_lr=slice(1e-6, 5e-4))

max_lr: we are now training many layers at the same time, and each layer needs a different Learning Rate. slice lets us assign Learning Rates across the layers (fastai applies them per layer group), spreading the values from the first layers = 1e-6 (smallest) to the last layers = 5e-4 (largest)
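How a (start, stop) range could be spread over the model can be sketched as follows. The sketch assumes geometric (evenly multiplied) spacing and three parameter groups; both are assumptions made for illustration, and even_mults here is a local helper defined in the snippet.

```python
import math


def even_mults(start, stop, n):
    """Return n values from start to stop, each a constant multiple of the previous."""
    if n == 1:
        return [stop]
    step = (stop / start) ** (1 / (n - 1))
    return [start * step**i for i in range(n)]


# Assumed: three groups (early layers, later layers, new head).
group_lrs = even_mults(1e-6, 5e-4, 3)
for name, lr in zip(["early layers", "later layers", "head"], group_lrs):
    print(f"{name}: {lr:.2e}")
```

The early layers hold generic pretrained features and get the smallest steps, while the freshly added head gets the largest.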

Done

Building a Data Pipeline

1) A Data Pipeline prepares data into a form that a Machine Learning model can consume, starting from the raw source, whether that is image files, text files, audio files, video files, or Tabular data

2) The steps are as follows

List All Examples / Get Files

List every item in the Dataset (file names)
tfms – attach the Transforms that are needed

Split to Training Set, Validation Set

Split the data into a Training Set and a Validation Set
by Random %, Folder name, CSV, … – for example a random percentage, by folder, or as listed in a CSV file
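A random-percentage split can be sketched with the standard library. This is an illustration of the idea only; the function name, the 20% default, and the file names are assumptions.

```python
import random


def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle indices with a fixed seed and cut off valid_pct for validation."""
    rng = random.Random(seed)
    indices = list(range(len(items)))
    rng.shuffle(indices)
    cut = int(valid_pct * len(items))
    valid = [items[i] for i in indices[:cut]]
    train = [items[i] for i in indices[cut:]]
    return train, valid


# Made-up file names.
files = [f"img_{i}.jpg" for i in range(10)]
train, valid = random_split(files)
print(len(train), len(valid))
```

Fixing the seed keeps the same images in the Validation Set between runs, so results from different experiments stay comparable.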

Label

Attach a Label to each item, for Supervised Learning
Folder name, File name, CSV, … – taken from the folder name, the file name, a CSV file, etc. The Labels of the Validation Set depend on the Training Set: the class list is built from the Training Set and reused for validation

Transform (Optional)

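Transform the data. The Transforms of the Validation Set depend on the Training Set (for example, statistics computed on the Training Set are reused)
per Example/Image – applied to one example at a time, such as converting image channels, resizing the image, etc.
per Training Set – Normalize, Fill N/A with Median, Categorize, Tokenize, Numericalize, etc.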
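The dependence of the Validation Set on the Training Set is easiest to see with normalization. In this minimal sketch (made-up numbers, plain Python instead of tensors) the mean and standard deviation come from the Training Set only and are then reused unchanged on the Validation Set.

```python
import math
from statistics import mean, pstdev

# Made-up feature values.
train_values = [2.0, 4.0, 6.0, 8.0]
valid_values = [5.0, 10.0]

# Statistics are computed on the Training Set only.
mu = mean(train_values)
sigma = pstdev(train_values)


def normalize(values, mu, sigma):
    """Shift and scale values with statistics supplied by the caller."""
    return [(v - mu) / sigma for v in values]


train_norm = normalize(train_values, mu, sigma)
valid_norm = normalize(valid_values, mu, sigma)  # reuses the training statistics
print(train_norm, valid_norm)
```

Computing separate statistics on the Validation Set would put the two sets on different scales and leak information about the validation data into preprocessing.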

To Tensor

Convert the data to Tensors, because PyTorch takes Tensors as input

DataLoader to Batch

We usually cannot load the whole Dataset into memory at once, so we need a DataLoader to shuffle the data and split it into Batches that are loaded on demand (Lazy Loading)
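The shuffle-and-batch behaviour can be sketched with a generator, which produces one Batch at a time instead of materialising everything up front. This is a simplified stand-in for a real DataLoader; the function name and parameters are chosen for illustration.

```python
import random
from typing import Iterator, List, Optional, Sequence


def iter_batches(items: Sequence, batch_size: int, shuffle: bool = True,
                 seed: Optional[int] = None) -> Iterator[List]:
    """Yield lists of up to batch_size items, optionally in shuffled order."""
    indices = list(range(len(items)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        # Only the items of the current batch are looked up here.
        yield [items[i] for i in indices[start:start + batch_size]]


batches = list(iter_batches(list(range(10)), batch_size=4, seed=0))
print([len(b) for b in batches])
```

In a real pipeline the lookup inside the loop is where an image would be read from disk and transformed, so memory use is bounded by the batch size rather than the Dataset size.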

Transform per Batch

Transform the data one Batch at a time

DataBunch

Create a DataBunch that wraps the Training Set and the Validation Set

Add Test Set (Optional)

Add the Test Set data (if there is one)

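from fastai.vision.all import *

# Download and extract the Pet dataset, then list every image file.
path  = untar_data(URLs.PETS)
files = get_image_files(path/"images")
np.random.seed(42)

print("First item = ", files[0])
print("len images = ", len(files))


# Capture the breed name: everything between the last "/" and "_<number>.jpg".
regex_pattern = r'/([^/]+)_\d+\.jpg$'
# seed fixes the random train/validation split so runs are comparable.
dls = ImageDataLoaders.from_path_re(path, files, regex_pattern, item_tfms=Resize(128), seed=42)

print("Class = ", dls.vocab)
print("Len Class = ", dls.c)
dls.show_batch()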


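# Transfer learning: pretrained resnet18 body with a new head sized to dls.c classes.
# (Recent fastai releases also expose this as vision_learner.)
learn = cnn_learner(dls, resnet18, metrics=[error_rate,accuracy])
learn.fit_one_cycle(4)


# Checkpoint the first stage, then plot the learning-rate schedule and the losses.
learn.save('pets-resnet18-1')
learn.recorder.plot_sched()
learn.recorder.plot_loss()
learn.load('pets-resnet18-1')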

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(15,15))
interp.print_classification_report()
interp.plot_top_losses(9, figsize=(15,15))
interp.most_confused(min_val=5)


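# Unfreeze the pretrained body so all layer groups are trained, then search for a learning rate.
learn.unfreeze()
learn.lr_find()
# Discriminative learning rates: smallest for the earliest layer group, largest for the head.
learn.fit_one_cycle(3, lr_max=slice(9e-6, 5e-4))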


learn.save('pets-resnet18-2')
learn.load('pets-resnet18-2')
learn.recorder.plot_sched()
learn.recorder.plot_loss()

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(15,15))
interp.print_classification_report()
interp.plot_top_losses(9, figsize=(15,15))
interp.most_confused(min_val=5)

Verzaru's Notes
