
Inodes and Links in Linux Filesystem

An inode (index node) is a data structure in a Linux filesystem that stores the metadata of a file or directory. The metadata includes the permissions, timestamps such as the access and modification time, the size, and pointers to the data blocks on disk that hold the contents. This post covers what an inode is and how files are managed in a Linux filesystem, with some examples.

Commands that are used in this post

  1. ls - List the contents of a directory.
  2. df - Report statistics about a filesystem.
  3. ln - Create links (references) to files and directories.
  4. stat - Print the metadata of a file or directory.

Inodes

An inode is represented by a C struct, struct inode, declared in <linux/fs.h>. Every file and directory in the filesystem has an inode associated with it, which holds all the metadata needed to perform operations on the file. Each inode has a number that is unique within its filesystem. Interestingly, the inode does not store the name of the file or directory.

// inode data structure (abridged)
struct inode {
        loff_t                  i_size;              /* file size in bytes */
        unsigned long           i_ino;               /* inode number */
        struct timespec         i_atime;             /* last access time */
        struct timespec         i_mtime;             /* last modify time (contents) */
        struct timespec         i_ctime;             /* last change time (inode) */
        umode_t                 i_mode;              /* file type and access permissions */
        /* ... */
};

On ext-family filesystems, the number of inodes is fixed when you format a disk partition. This means there is a hard upper limit on the number of files you can create in the filesystem, even if free space remains. By default, mke2fs allocates one inode for every 16 KiB of filesystem capacity, so the inode count is determined by the size of the filesystem. Since every on-disk inode has the same fixed size, the inodes are stored separately from the data blocks, as arrays (inode tables).

[Figure: disk layout]
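To make the ratio concrete, here is a small sketch of the arithmetic in Python. It assumes the mke2fs default of one inode per 16384 bytes and ignores the per-block-group rounding that mke2fs applies, so treat the result as an estimate. The function name is made up for this post.

```python
BYTES_PER_INODE = 16 * 1024  # assumption: mke2fs default inode ratio of 16384


def estimated_inode_count(fs_size_bytes, bytes_per_inode=BYTES_PER_INODE):
    """Estimate how many inodes a freshly formatted filesystem would get."""
    return fs_size_bytes // bytes_per_inode


if __name__ == "__main__":
    size = 20 * 1024**3  # a hypothetical 20 GiB partition
    print(estimated_inode_count(size))
```

Working backwards with the same ratio, a filesystem with 950272 inodes would be about 14.5 GiB in size.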

Inode statistics for a filesystem can be obtained using the df command with the -i flag. Whenever you create a new file, the IUsed column increments and the IFree column decrements.

thish@thish:~/dir$ df -i /dev/sda1
Filesystem Inodes  IUsed  IFree IUse% Mounted on
/dev/sda1  950272 189747 760525   20% /

thish@thish:~/dir$ touch one.txt

thish@thish:~/dir$ df -i /dev/sda1
Filesystem Inodes  IUsed  IFree IUse% Mounted on
/dev/sda1  950272 189748 760524   20% /
thish@thish:~/dir$
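The same counters are available to programs through the statvfs system call. The sketch below uses Python's os.statvfs to read them. The helper name inode_usage is only for illustration, and the numbers may not match df exactly on every filesystem (some filesystems report 0 inodes because they allocate them dynamically).

```python
import os


def inode_usage(path="/"):
    """Return (total, used, free) inode counts for the filesystem holding path."""
    st = os.statvfs(path)
    total = st.f_files  # total number of inodes
    free = st.f_ffree   # number of free inodes
    return total, total - free, free


if __name__ == "__main__":
    total, used, free = inode_usage("/")
    print(f"Inodes={total} IUsed={used} IFree={free}")
```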

You can see the inode number allocated to this file using the stat command. In this case, the inode number of one.txt is 553446. The stat command also prints the size, the number of disk blocks used to store the file, the number of links pointing to the inode, the permissions, and the access, modify, change and birth timestamps.

thish@thish:~/dir$ stat one.txt
  File: one.txt
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: fd00h/64768d	Inode: 553446      Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/   thish)   Gid: ( 1000/   thish)
Access: 2023-07-29 09:19:47.045313330 +0000
Modify: 2023-07-29 09:19:47.045313330 +0000
Change: 2023-07-29 09:19:47.045313330 +0000
 Birth: 2023-07-29 09:19:47.045313330 +0000

# ls -i also shows the inode number
thish@thish:~/dir$ ls -i
553446 one.txt
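If you want the same fields from a program, Python's os.stat returns them directly. This is a minimal sketch rather than a replacement for stat. The describe helper and the temporary file are just for illustration.

```python
import os
import stat
import tempfile


def describe(path):
    """Collect a few inode fields for path, similar to what stat prints."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,
        "links": st.st_nlink,
        "size": st.st_size,
        "blocks": st.st_blocks,  # counted in 512-byte units
        "mode": stat.filemode(st.st_mode),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "one.txt")
        open(path, "w").close()  # same effect as `touch one.txt`
        print(describe(path))
```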

The image below depicts how this file is stored in the filesystem. An inode is assigned to the file, and the inode points to the data blocks where the contents are stored.

[Figure: one.txt inode]

Directory listing

File and directory names are not stored in the inode. Instead, they are stored in directory entries. A directory entry is simply a mapping between a file name and an inode number, and it lives inside a directory. A directory also has an inode of its own, which points to the block holding the directory listing.

thish@thish:~/dir$ tree .
.
├── one.txt
├── subdir
│   └── three.txt
└── two.txt

thish@thish:~/dir$ ls -ila
total 12
553453 drwxrwxr-x  3 thish thish 4096 Jul 29 10:40 .
524292 drwxr-x--- 19 thish thish 4096 Jul 29 09:10 ..
553446 -rw-rw-r--  1 thish thish    0 Jul 29 09:28 one.txt
553448 drwxrwxr-x  2 thish thish 4096 Jul 29 10:40 subdir
553447 -rw-rw-r--  1 thish thish    0 Jul 29 10:30 two.txt
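The name to inode mapping can also be read from a program. The sketch below uses os.scandir, which returns directory entries together with their inode numbers. The directory_entries helper is a made-up name, and note that os.scandir skips the . and .. entries.

```python
import os
import tempfile


def directory_entries(path):
    """Return {name: inode number} for the entries stored in a directory."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "one.txt"), "w").close()
        os.mkdir(os.path.join(d, "subdir"))
        for name, ino in sorted(directory_entries(d).items()):
            print(ino, name)
```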

The . and .. entries are hard links to the current directory and the parent directory respectively. The link count is the number of hard links to the inode. A file is deleted only when the number of hard links pointing to its inode drops to 0. More on this in the Links section.

Since the file name is stored separately in the directory listing block, a file can be renamed while other programs have it open without causing any issues. This is particularly helpful in cases such as log rotation, where the old log is moved to a different name but its inode and contents stay the same.
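Here is a rough sketch of that behaviour in Python. It renames a file while it is still open for writing and checks that the inode number is unchanged and that later writes land in the renamed file. The file names app.log and app.log.1 are hypothetical.

```python
import os
import tempfile


def rename_while_open():
    """Rename an open file and keep writing through the old file handle."""
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "app.log")
        new = os.path.join(d, "app.log.1")
        with open(old, "w") as f:
            f.write("first line\n")
            inode_before = os.stat(old).st_ino
            os.rename(old, new)        # only the directory entry changes
            f.write("second line\n")   # still written to the same inode
        inode_after = os.stat(new).st_ino
        with open(new) as f:
            content = f.read()
        return inode_before == inode_after, content


if __name__ == "__main__":
    print(rename_while_open())
```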

Putting it all together

The diagram below shows a rough representation of inodes, directory listings and the data blocks.

  1. When we want to read one.txt from dir/, the OS first finds the inode number of dir, which is 553453.
  2. Following the pointer in that inode, it reads the directory listing of dir from Directory Block 1. The listing maps each file name in the directory to its inode number.
  3. It looks up one.txt in the listing to get its inode number (553446), then follows that inode's pointer to Data Block 1 to retrieve the contents of one.txt.
  4. The same process repeats for the sub-directories present in dir.

[Figure: inodes, directory blocks and data blocks]
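The lookup steps above can be sketched in user space as a loop over path components. This is only an illustration of the idea: the kernel's real path resolution also handles caching, permissions, symbolic links and mount points, none of which are modelled here. The resolve function is a made-up helper.

```python
import os
import tempfile


def resolve(base, relative_path):
    """Walk relative_path below base one component at a time.

    Returns a list of (name, inode number) pairs, one per component.
    """
    current = base
    steps = []
    for name in relative_path.split("/"):
        # Read the directory listing: a mapping of name -> inode number.
        with os.scandir(current) as entries:
            listing = {entry.name: entry.inode() for entry in entries}
        steps.append((name, listing[name]))  # KeyError if the name is missing
        current = os.path.join(current, name)
    return steps


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        os.mkdir(os.path.join(d, "subdir"))
        open(os.path.join(d, "subdir", "three.txt"), "w").close()
        for name, ino in resolve(d, "subdir/three.txt"):
            print(name, ino)
```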

The filesystem tree starts from the root directory, whose inode number is 2 on ext filesystems.

thish@thish:~/dir$ stat /
  File: /
  Size: 4096      	Blocks: 8          IO Block: 4096   directory
Device: fd00h/64768d	Inode: 2           Links: 19
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)

Links

You can create multiple references to the same file using links. There are two types of links: hard links and soft (symbolic) links. For example, ln one.txt A creates a hard link named A that points to the same inode as one.txt.

Hard links point to the same inode; only the name or the location differs. Each inode has a counter (see Links in the output of stat) that stores the number of hard links pointing to it. Deleting a file decrements this counter, and the OS removes the file only when the counter reaches 0. Because a hard link refers directly to an inode number, and an inode number is only meaningful within its own filesystem, a hard link can only be created on the same filesystem as the file it points to.
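The counter is easy to observe from a program. The sketch below, which assumes a temporary directory on a filesystem that supports hard links, creates a second name for a file with os.link, deletes the original name, and reads the data back through the remaining name.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file and a hard link, then delete the original name."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "one.txt")
        link = os.path.join(d, "hardlink")
        with open(original, "w") as f:
            f.write("hello\n")

        os.link(original, link)  # like `ln one.txt hardlink`
        same_inode = os.stat(original).st_ino == os.stat(link).st_ino
        count_with_link = os.stat(original).st_nlink

        os.unlink(original)  # removes one name, not the data
        count_after_unlink = os.stat(link).st_nlink
        with open(link) as f:
            content = f.read()
        return same_inode, count_with_link, count_after_unlink, content


if __name__ == "__main__":
    print(hard_link_demo())
```

The returned tuple holds whether both names share an inode, the link count while both names exist, the link count after deleting one.txt, and the contents read through hardlink.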

A soft link (symbolic link), on the other hand, is a completely separate file with its own inode that stores the path to another file. A soft link can point to a file on any mounted filesystem, but it becomes invalid (dangling) when the original file is deleted or its filesystem is unmounted.

thish@thish:~/dir$ ln -s one.txt softlink
thish@thish:~/dir$ stat one.txt
  File: one.txt
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: fd00h/64768d	Inode: 553446      Links: 1

# The Links count increments after creating a hardlink
thish@thish:~/dir$
thish@thish:~/dir$ ln one.txt hardlink
thish@thish:~/dir$ stat one.txt
  File: one.txt
  Size: 0         	Blocks: 0          IO Block: 4096   regular empty file
Device: fd00h/64768d	Inode: 553446      Links: 2

# hardlink and one.txt have the same inode number, but softlink is a different file.
thish@thish:~/dir$
thish@thish:~/dir$ ls -ail
total 12
#inode             Link count
553453 drwxrwxr-x  3 thish thish 4096 Jul 29 11:59 .
524292 drwxr-x--- 19 thish thish 4096 Jul 29 09:10 .. # Link count is 19: its entry in its parent, its own `.` entry, and a `..` entry in each of its subdirectories.
553446 -rw-rw-r--  2 thish thish    0 Jul 29 09:28 hardlink
553446 -rw-rw-r--  2 thish thish    0 Jul 29 09:28 one.txt
553454 lrwxrwxrwx  1 thish thish    7 Jul 29 11:59 softlink -> one.txt
553448 drwxrwxr-x  2 thish thish 4096 Jul 29 10:40 subdir
553447 -rw-rw-r--  1 thish thish    0 Jul 29 10:30 two.txt
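To contrast this with hard links, here is a small sketch of the soft link behaviour using os.symlink in a temporary directory. It checks that the link has its own inode, that it stores the path one.txt, that it does not raise the target's link count, and that it is left dangling once the target is deleted. The dictionary keys are just labels chosen for this example.

```python
import os
import tempfile


def soft_link_demo():
    """Create a file and a soft link to it, then delete the file."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "one.txt")
        link = os.path.join(d, "softlink")
        open(target, "w").close()

        os.symlink("one.txt", link)  # like `ln -s one.txt softlink`
        result = {
            # lstat looks at the link itself instead of following it
            "own_inode": os.lstat(link).st_ino != os.stat(target).st_ino,
            "stored_path": os.readlink(link),
            "target_links": os.stat(target).st_nlink,  # soft links are not counted
        }

        os.unlink(target)
        result["resolves_after_delete"] = os.path.exists(link)  # follows the link
        result["link_still_present"] = os.path.lexists(link)    # checks the link itself
        return result


if __name__ == "__main__":
    print(soft_link_demo())
```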