- Why be a filesystem developer
- Big data (GFS)
- flash storage
- containers (btrfs)
- ...
- Why be a btrfs developer
- bleeding edge
- feature rich
- already available
- CoW, compression, deduplication, RAID
- under active development
- in-band deduplication
- subpage sector size support
- separate qgroup accounting for metadata
- ...
- Why be a btrfs developer
- next gen filesystem
- btrfs is considered a next-generation filesystem
- some projects are already using btrfs
- systemd, Docker, openSUSE, Facebook
- take the lead on the latest tech
- Oracle Database with ZFS deduplication
- How to become an FS developer (roadmap)
1. normal user, bug report and QA
2. understanding the on-disk format
3. btrfs-progs developer
4. kernel btrfs developer
- a rolling distribution is recommended for testing the latest filesystem code
- normal user: bug report and QA
- bug report
- with detailed info
- kernel/btrfs-progs version
- kernel backtrace if needed
- reproducer if reproducible
- QA: needs a little more skill
- compile the latest kernel and btrfs-progs from source
- git & bisect
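The idea behind bisecting can be sketched as a binary search over history. This is a toy model, not how git itself is implemented: it assumes a linear list of commits where the first entry is known good and the last is known bad, and the names `first_bad_commit` and `is_bad` are made up for illustration. In practice `is_bad` would mean "build this commit, run the reproducer, see whether it fails".

```python
def first_bad_commit(commits, is_bad):
    """Return the first bad commit in a linear history.

    Assumes commits[0] is known good and commits[-1] is known bad,
    and that every commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # the regression is at mid or earlier
        else:
            good = mid   # the regression is after mid
    return commits[bad]
```

Each test halves the remaining range, so even a few thousand commits between two kernel releases need only a dozen or so build-and-test rounds.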
- QA
- performance test
- PTS (Phoronix Test Suite)
- sysbench
- fio
- function test
- fstests
- LTP
- Understanding on-disk format
- on-disk data is static
- no C code involved
- good existing tools to examine it: dumpe2fs, btrfs-debug-tree
- btrfs
- btrfs stands for B-tree FS
- all metadata is stored in a B-tree
- node
- records pointers to its child leaves/nodes
- leaf: records detailed info together with its index key
- practice with btrfs-debug-tree
- shows almost every detail of the btrfs B-tree
- debug-tree -> do some operation -> debug-tree
- don't forget to call 'sync' before debug-tree
- to see how btrfs records files and dirs
- if you are careful enough, you can also see how btrfs does CoW
- "fs tree" should be the easiest starting point
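The "dump, change something, dump again" practice boils down to comparing two text dumps. A minimal sketch, assuming both dumps have already been saved as text (for example by redirecting the output of btrfs-debug-tree after a sync); the helper name `changed_lines` is hypothetical, and the strings in the usage note are simplified placeholders rather than real tool output.

```python
import difflib


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the lines that were added ('+ ') or removed ('- ')."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    return [line for line in diff if line.startswith(("+ ", "- "))]
```

For instance, `changed_lines("item A\nitem B\n", "item A\nitem B\nitem C\n")` yields `["+ item C"]`. Because of CoW, a real diff will also show changed block addresses for every tree block on the path to the modified leaf, which is exactly the effect worth looking for.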
- reference
- extra explanation of btrfs features (LinuxCon Japan 2015)
- covers each key type and the corresponding data structure
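To make the "index key" concrete: every btrfs item is indexed by an (objectid, type, offset) triple, and items in a leaf are kept sorted by that triple. The sketch below models a leaf as a sorted Python list, which is a simplification of the real on-disk layout; `find_item` and `items_of` are hypothetical helper names, and the numeric type values are quoted from memory of the btrfs headers, so treat them as illustrative.

```python
import bisect
from typing import NamedTuple


class Key(NamedTuple):
    """Index key; tuple comparison orders by objectid, then type, then offset."""
    objectid: int
    type: int
    offset: int


# A few item types (illustrative values, see lead-in).
INODE_ITEM = 1
INODE_REF = 12
DIR_ITEM = 84
DIR_INDEX = 96
EXTENT_DATA = 108


def find_item(leaf_keys, key):
    """Binary-search a sorted leaf; return the slot index or None."""
    slot = bisect.bisect_left(leaf_keys, key)
    if slot < len(leaf_keys) and leaf_keys[slot] == key:
        return slot
    return None


def items_of(leaf_keys, objectid):
    """All keys of one object sit next to each other in the leaf."""
    lo = bisect.bisect_left(leaf_keys, Key(objectid, 0, 0))
    hi = bisect.bisect_left(leaf_keys, Key(objectid + 1, 0, 0))
    return leaf_keys[lo:hi]
```

Sorting by objectid first is why, in a debug-tree dump of the fs tree, the inode item, name references, and file extents of one file appear as consecutive items.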
- btrfs-progs developer
- why start from btrfs-progs?
- single-threaded
- direct metadata operations: no extra infrastructure, you can use what you learned in the previous step
- quick review: special thanks to David Sterba
- needed skills
- C
- GDB
- Understanding btrfs b-tree
- development directions
- btrfs-debug-tree enhancement
- easiest one
- can refer to existing code quite easily
- helps you understand the B-tree
- btrfsck enhancement
- more challenging
- somewhat complicated data structures
- may fix your own problem
- btrfs-convert debugging
- most complicated
- needs to refer to kernel code
- not
- btrfs kernel developer
- the hardest part, a lot of challenges
- extra kernel facilities
- kernel trace/debugging
- concurrency
- old, poorly commented code
- ...
- needs much more time to test
- just make it run, without panic/BUG_ON/warning
- function test
- performance test
- but also a huge sense of accomplishment when a patch is merged
- challenges (kernel facilities)
- modern filesystems also implement quite a lot of optimizations
- delayed allocation
- at buffered write time, only an early check is done, no space is allocated
- page cache
- this unwritten data is kept in the page cache by MM
- the fs needs to keep the page cache up to date across a lot of operations (fallocate, truncate, unlink, ...)
- tons of minor features
- Direct I/O
- fsync
- ...
- solution?
- read the funny code
- challenges (kernel trace/debugging)
- hard to debug compared to user-space programs
- recompiling takes a lot of time
- kernel panics are hard to capture
- hard to set breakpoints/watchpoints
- ...
- solutions
- use ccache/distcc and recompile only the given module
- use kdump to capture crashes
- use a VM with gdb to set kernel breakpoints/watchpoints
- or old-fashioned pr_info()
- challenges (concurrency)
- kernel is designed for performance, not education
- concurrency is everywhere: tons of locks, mutexes, workqueues, wait_event
- lockdep is the best solution
- needs to be enabled in the kernel config
- it's runtime detection, so tests are needed to trigger it
- with quite good output explaining how the deadlock would happen
- but not perfect: it only detects problems with spinlock/rwlock/mutex and so on, with no support for wait_event
- echo w > /proc/sysrq-trigger (dumps blocked tasks)
- kernel doc => LWN => Google => RTFC (fs) => RTFC (facility)
- RTFC - Read The Funny Code
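Why lockdep needs tests to trigger it, yet can warn before a real deadlock happens, is easier to see with a toy model. The sketch below is not the kernel's implementation: it only records "lock B was taken while holding lock A" edges and refuses a new acquisition that would close a cycle. The class and method names are invented for illustration.

```python
from collections import defaultdict


class LockOrderChecker:
    """Toy lock-order validator: remembers observed orderings, rejects inversions."""

    def __init__(self):
        # taken_after[a] = set of locks that were acquired while a was held
        self.taken_after = defaultdict(set)

    def _reachable(self, src, dst):
        """Depth-first search: is there a recorded ordering chain src -> ... -> dst?"""
        stack, seen = [src], set()
        while stack:
            lock = stack.pop()
            if lock == dst:
                return True
            if lock in seen:
                continue
            seen.add(lock)
            stack.extend(self.taken_after[lock])
        return False

    def acquire(self, held, new):
        """Record taking `new` while holding `held`; False means possible deadlock."""
        for h in held:
            # If new -> ... -> h was seen before, h -> new would close a cycle.
            if self._reachable(new, h):
                return False
        for h in held:
            self.taken_after[h].add(new)
        return True
```

If one code path takes A then B, and a later test exercises a path that takes B then A, the second `acquire` returns False even though the two paths never actually raced. A path that no test runs is never recorded, and a wait that is not expressed as a lock (such as wait_event) never enters the graph at all.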
- future plans
- quota reserve space framework rework
- in-band deduplication
- btrfs-convert rework
- RAID 5/6 readahead
Q&A:
- ECC RAM?
- practicality of dedup
- block size