This information has been compiled from resources
published on the internet, the Linux kernel source and
information gained from experimentation.
Introduction
minix -> extended fs -> 2nd extended fs
Structure
The file system is created from a sequential
collection of blocks. These blocks can be either 1k, 2k
or 4k in size. These blocks are divided up into groups
for various reasons.
The starting point for the filesystem is the
superblock. This is a structure 1024 bytes in length and
is always located at an offset of 1024 bytes from the
start of the filesystem.
The following lists the fields of the superblock structure used by
Linux. Note that other operating systems (such as the Hurd) may use a
slightly different structure. For details see ???.h
field name | type | comment |
s_inodes_count | ULONG | Count of inodes in the filesystem |
s_blocks_count | ULONG | Count of blocks in the filesystem |
s_r_blocks_count | ULONG | Count of the number of reserved blocks |
s_free_blocks_count | ULONG | Count of the number of free blocks |
s_free_inodes_count | ULONG | Count of the number of free inodes |
s_first_data_block | ULONG | The first block which contains data |
s_log_block_size | ULONG | Indicator of the block size |
s_log_frag_size | LONG | Indicator of the size of the fragments |
s_blocks_per_group | ULONG | Count of the number of blocks in each block group |
s_frags_per_group | ULONG | Count of the number of fragments in each block group |
s_inodes_per_group | ULONG | Count of the number of inodes in each block group |
s_mtime | ULONG | The time that the filesystem was last mounted |
s_wtime | ULONG | The time that the filesystem was last written to |
s_mnt_count | USHORT | The number of times the file system has been mounted |
s_max_mnt_count | SHORT | The number of times the file system can be mounted |
s_magic | USHORT | Magic number indicating ext2fs |
s_state | USHORT | Flags indicating the current state of the filesystem |
s_errors | USHORT | Flags indicating the procedures for error reporting |
s_pad | USHORT | padding |
s_lastcheck | ULONG | The time that the filesystem was last checked |
s_checkinterval | ULONG | The maximum time permissible between checks |
s_creator_os | ULONG | Indicator of which OS created the filesystem |
s_rev_level | ULONG | The revision level of the filesystem |
s_reserved | ULONG[235] | padding to 1024 bytes |
TODO revision 1...
s_r_blocks_count :
this is the number of blocks which are reserved for the
super user
s_first_data_block :
Depending on the block size, the first data block will be
either 0 or 1. See diagram below
s_log_block_size :
0 = 1k block size
1 = 2k
2 = 4k
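The mapping above amounts to shifting 1024 left by the stored value. A minimal sketch, assuming only the three values listed here are valid (the function name is made up for illustration):

```python
def block_size_from_log(s_log_block_size: int) -> int:
    """Return the block size in bytes for a given s_log_block_size value."""
    # Only the values listed in the text (0, 1, 2) are accepted here.
    if s_log_block_size not in (0, 1, 2):
        raise ValueError("unexpected s_log_block_size: %r" % s_log_block_size)
    return 1024 << s_log_block_size
```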
s_log_frag_size :
At the moment, it seems that fragments are not
implemented. In the future I may have to find out how
they work.
s_blocks_per_group :
The filesystem is divided up into block groups. Note that
the last block group may not be complete.
s_inodes_per_group :
Each block group has space reserved for a number of
inodes.
s_mtime, s_wtime :
This may be the mount time, or the umount time. I am not
sure which.
s_mnt_count, s_max_mnt_count :
Once this count reaches the maximum, the filesystem must
be checked; the count is then reset.
s_magic :
This should contain the magic number 0xEF53.
s_state :
This contains a set of flags which indicate whether the
filesystem is clean etc.
s_errors :
This contains flags which indicate how the filesystem
should be treated if errors are found.
s_creator_os :
For Linux this is ???
s_rev_level :
The current revision is ???
The information in the superblock is used to access
the rest of the data on the disk.
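As an illustration, the fields listed in the superblock table can be decoded from a raw filesystem image with Python's struct module. This is only a sketch: it assumes the on-disk values are little-endian (not stated in the text), it decodes just the named fields up to s_rev_level (80 bytes), and parse_superblock is a hypothetical helper name.

```python
import struct

SUPERBLOCK_OFFSET = 1024          # fixed offset from the start of the filesystem
EXT2_MAGIC = 0xEF53

# Assumption: little-endian layout. 7 ULONG, 1 LONG, 5 ULONG,
# 6 16-bit fields (s_max_mnt_count is signed), then 4 ULONG.
SUPERBLOCK_FORMAT = "<7I i 5I H h 4H 4I"

SUPERBLOCK_FIELDS = (
    "s_inodes_count", "s_blocks_count", "s_r_blocks_count",
    "s_free_blocks_count", "s_free_inodes_count", "s_first_data_block",
    "s_log_block_size", "s_log_frag_size", "s_blocks_per_group",
    "s_frags_per_group", "s_inodes_per_group", "s_mtime", "s_wtime",
    "s_mnt_count", "s_max_mnt_count", "s_magic", "s_state", "s_errors",
    "s_pad", "s_lastcheck", "s_checkinterval", "s_creator_os", "s_rev_level",
)


def parse_superblock(image: bytes) -> dict:
    """Decode the named superblock fields from a filesystem image."""
    size = struct.calcsize(SUPERBLOCK_FORMAT)
    raw = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + size]
    if len(raw) < size:
        raise ValueError("image too short to contain a superblock")
    sb = dict(zip(SUPERBLOCK_FIELDS, struct.unpack(SUPERBLOCK_FORMAT, raw)))
    if sb["s_magic"] != EXT2_MAGIC:
        raise ValueError("bad magic number: 0x%04X" % sb["s_magic"])
    return sb
```

In practice the image bytes would come from reading the first 2048 bytes of a device or image file opened in binary mode.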
The number of block groups = the number of blocks /
the number of blocks per group; // rounded up
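The rounded-up division can be written with integer arithmetic alone. A small sketch that follows the formula in the text literally (the function name is an assumption):

```python
def block_group_count(s_blocks_count: int, s_blocks_per_group: int) -> int:
    """Number of block groups: blocks / blocks per group, rounded up."""
    if s_blocks_per_group <= 0:
        raise ValueError("s_blocks_per_group must be positive")
    # Adding (divisor - 1) before the integer division rounds the result up.
    return (s_blocks_count + s_blocks_per_group - 1) // s_blocks_per_group
```

A remainder in the division corresponds to the incomplete last block group mentioned earlier.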
All block and inode addresses start at 1. The first
block on the disk is block 1. 0 is used to indicate no
block.
Each block group can be found at the block address
(((group number - 1) * blocks per group) + 1) and is of course
blocks per group long. Group numbers are 1 based as well.
Each group is just a series of blocks, however the
first blocks in the group have a special purpose. The
remainder are used for storing data.
| Superblock | Group Descriptors | Block Bitmap | INode Bitmap | INode Table | Data blocks |
|--------------------------------|---------------------------------------------------------|
|This is the same for all groups | this is specific to each group |
A copy of the superblock is stored in the first block of each
group (except for group 1, where it is always at an offset of
1024 bytes from the start of the filesystem).
The Group Descriptors contain information on the
block groups. This data covers all the groups and is
stored in every group for redundancy. It is an array
of the following structure:
field name | type | comment |
bg_block_bitmap | ULONG | The address of the block containing the block bitmap for this group |
bg_inode_bitmap | ULONG | The address of the block containing the inode bitmap for this group |
bg_inode_table | ULONG | The address of the block containing the inode table for this group |
bg_free_blocks_count | USHORT | The count of free blocks in this group |
bg_free_inodes_count | USHORT | The count of free inodes in this group |
bg_used_dirs_count | USHORT | The number of inodes in this group which are directories |
bg_pad | USHORT | padding |
bg_reserved | ULONG[3] | padding |
The number of blocks occupied by the descriptors can be calculated as
(sizeof(ext2_group_desc) * number of groups) / block size; //
rounded up if necessary
The information in this structure is used to locate
the block and inode bitmaps and inode table.
Remember that the first entry corresponds to block
group 1.
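The descriptor layout and the size calculation can be sketched as follows. The fields in the table add up to 32 bytes per descriptor; little-endian byte order is an assumption, and both function names are invented for this example.

```python
import struct

# Assumption: little-endian. 3 ULONG, 4 USHORT, 3 ULONG = 32 bytes.
GROUP_DESC_FORMAT = "<3I4H3I"
GROUP_DESC_SIZE = struct.calcsize(GROUP_DESC_FORMAT)

GROUP_DESC_FIELDS = (
    "bg_block_bitmap", "bg_inode_bitmap", "bg_inode_table",
    "bg_free_blocks_count", "bg_free_inodes_count", "bg_used_dirs_count",
)


def descriptor_table_blocks(group_count: int, block_size: int) -> int:
    """Blocks needed to hold all descriptors, rounded up."""
    return (GROUP_DESC_SIZE * group_count + block_size - 1) // block_size


def parse_group_descriptors(table: bytes, group_count: int) -> list:
    """Decode group_count descriptors; list index 0 is block group 1."""
    descriptors = []
    for i in range(group_count):
        values = struct.unpack_from(GROUP_DESC_FORMAT, table, i * GROUP_DESC_SIZE)
        # zip() stops after the named fields, dropping the padding values.
        descriptors.append(dict(zip(GROUP_DESC_FIELDS, values)))
    return descriptors
```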
The block bitmap is a bitmap indicating which blocks
in the group have been allocated. If the bit is set then
the block is allocated. The size of the bitmap is (blocks
per group / 8) / block size;// with both divisions
rounded up.
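Testing a bit in the bitmap might look like the sketch below. The bit order is an assumption not stated in the text: bit 0 of byte 0 is taken to be the first entry in the group, which matches how Linux lays out ext2 bitmaps. The position argument is zero-based within the group.

```python
def is_allocated(bitmap: bytes, position: int) -> bool:
    """Return True if the bit for the given zero-based position is set."""
    byte_index = position // 8      # which byte holds the bit
    bit_index = position % 8        # assumption: least significant bit first
    return bool(bitmap[byte_index] & (1 << bit_index))
```

The same helper works for the inode bitmap, since both bitmaps share this layout.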
It is necessary to find out which group a particular
block is in to be able to look up the bitmap. The group =
((Block number - 1) / Blocks per group) + 1; // division
rounded down
The block in that group is then Block Number - ((Group - 1)
* Blocks per group)
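A sketch of the block lookup with integer division. The "- 1" in the text is treated here as the address of the first data block, passed as a parameter defaulting to 1; that generalisation is an assumption beyond the text. The returned position is zero-based so that it can be used directly as a bit index into the block bitmap.

```python
def locate_block(block: int, blocks_per_group: int, first_data_block: int = 1):
    """Return (group, position) for a block address.

    group is 1-based as in the text; position is zero-based within the group.
    """
    offset = block - first_data_block
    if offset < 0:
        raise ValueError("block lies before the first data block")
    group = offset // blocks_per_group + 1   # integer division rounds down
    position = offset % blocks_per_group
    return group, position
```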
The inode bitmap is essentially the same as the block
bitmap, but indicates which inodes are allocated. The
size of the inode bitmap is (inodes per group / 8) /
block size;// with both divisions rounded up.
The same calculations can be used for finding the
group of a particular inode. The group = ((INode number -
1) / INodes per group) + 1; // division rounded down
The inode in that group is then INode Number - ((Group - 1)
* INodes per group)
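For inodes the same arithmetic also gives the byte offset of the entry inside the group's inode table. This sketch assumes 128-byte inodes, which is the total size of the fields of the inode structure listed in this document; the function name is hypothetical.

```python
INODE_SIZE = 128  # assumption: sum of the inode structure's field sizes


def locate_inode(inode: int, inodes_per_group: int):
    """Return (group, index, byte_offset) for an inode number.

    group is 1-based; index is zero-based within the group's inode table.
    """
    if inode < 1:
        raise ValueError("inode numbers start at 1")
    group = (inode - 1) // inodes_per_group + 1
    index = (inode - 1) % inodes_per_group
    return group, index, index * INODE_SIZE
```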
The inode table is an array of the inodes for that
particular group. Again, the first entry is for the first
inode in that group.
field name | type | description |
i_mode | USHORT | File mode |
i_uid | USHORT | Owner Uid |
i_size | ULONG | Size in bytes |
i_atime | ULONG | Access time |
i_ctime | ULONG | Inode change time |
i_mtime | ULONG | Modification time |
i_dtime | ULONG | Deletion time |
i_gid | USHORT | Group Id |
i_links_count | USHORT | Links count |
i_blocks | ULONG | Blocks count |
i_flags | ULONG | File flags |
i_reserved1 | ULONG | OS dependent |
i_block | ULONG[15] | Pointers to blocks |
i_version | ULONG | File version (for NFS) |
i_file_acl | ULONG | File ACL |
i_dir_acl | ULONG | Directory ACL |
i_faddr | ULONG | Fragment address |
i_frag | UCHAR | Fragment number |
i_fsize | UCHAR | Fragment size |
i_pad1 | USHORT | |
i_reserved2 | ULONG[2] | |
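The inode structure can be decoded in the same way as the superblock. The sketch below assumes little-endian values and returns only a subset of the fields; parse_inode is an illustrative name, not an existing API.

```python
import struct

# Assumption: little-endian. Field order follows the table above;
# the format adds up to 128 bytes.
INODE_FORMAT = "<2H5I2H3I15I4I2BH2I"
INODE_SIZE = struct.calcsize(INODE_FORMAT)


def parse_inode(raw: bytes) -> dict:
    """Decode one inode table entry into a dictionary of selected fields."""
    if len(raw) < INODE_SIZE:
        raise ValueError("need at least %d bytes" % INODE_SIZE)
    v = struct.unpack(INODE_FORMAT, raw[:INODE_SIZE])
    return {
        "i_mode": v[0],
        "i_uid": v[1],
        "i_size": v[2],
        "i_atime": v[3],
        "i_ctime": v[4],
        "i_mtime": v[5],
        "i_dtime": v[6],
        "i_gid": v[7],
        "i_links_count": v[8],
        "i_blocks": v[9],
        "i_flags": v[10],
        "i_block": list(v[12:27]),   # the 15 block addresses
    }
```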
The file mode is a set of flags that specify the
type of file and the access permissions.
identifier | value (hex) | comment |
S_IFMT | F000 | format mask |
S_IFSOCK | C000 | socket |
S_IFLNK | A000 | symbolic link |
S_IFREG | 8000 | regular file |
S_IFBLK | 6000 | block device |
S_IFDIR | 4000 | directory |
S_IFCHR | 2000 | character device |
S_IFIFO | 1000 | fifo |
S_ISUID | 0800 | SUID |
S_ISGID | 0400 | SGID |
S_ISVTX | 0200 | sticky bit |
S_IRWXU | 01C0 | user mask |
S_IRUSR | 0100 | user read |
S_IWUSR | 0080 | user write |
S_IXUSR | 0040 | user execute |
S_IRWXG | 0038 | group mask |
S_IRGRP | 0020 | group read |
S_IWGRP | 0010 | group write |
S_IXGRP | 0008 | group execute |
S_IRWXO | 0007 | other mask |
S_IROTH | 0004 | other read |
S_IWOTH | 0002 | other write |
S_IXOTH | 0001 | other execute |
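As a rough illustration, an i_mode value can be split into a file type and a permission string like this. The type values follow the Linux definitions, in which S_IFLNK is 0xA000 and S_IFSOCK is 0xC000; describe_mode is a made-up helper, and the SUID, SGID and sticky bits are ignored for brevity.

```python
S_IFMT = 0xF000  # format mask

FILE_TYPES = {
    0xC000: "socket",
    0xA000: "symbolic link",
    0x8000: "regular file",
    0x6000: "block device",
    0x4000: "directory",
    0x2000: "character device",
    0x1000: "fifo",
}


def describe_mode(i_mode: int):
    """Return (file type, rwx permission string) for an i_mode value."""
    kind = FILE_TYPES.get(i_mode & S_IFMT, "unknown")
    perms = ""
    # User, group and other permissions sit at bit offsets 6, 3 and 0.
    for shift in (6, 3, 0):
        bits = (i_mode >> shift) & 0x7
        perms += "r" if bits & 4 else "-"
        perms += "w" if bits & 2 else "-"
        perms += "x" if bits & 1 else "-"
    return kind, perms
```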
The i_block entry is an array of block
addresses. The first EXT2_NDIR_BLOCKS (12) are direct
block addresses. The data in these blocks is the content
of the file. The next entry, EXT2_IND_BLOCK, is the
indirect block. This is the address of a block which
contains a list of addresses of blocks which contain the
data. There are block size / sizeof(ULONG) addresses in
this block.
The EXT2_DIND_BLOCK is similar, but it is a double
indirect block. It contains the address of a block which
has a list of indirect block addresses. Each indirect
block then has another list of blocks.
The EXT2_TIND_BLOCK is similar again, but it is the
triple indirect block. It contains a list of double
indirect blocks, and so on.
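To make the addressing scheme concrete, the sketch below works out which i_block slot, and which index at each level of indirection, leads to the n-th block of a file (counting from 0). It assumes 4-byte block addresses and that the indirect, double indirect and triple indirect entries occupy slots 12, 13 and 14 of the 15-entry array; block_path is a hypothetical name.

```python
EXT2_NDIR_BLOCKS = 12
EXT2_IND_BLOCK = 12    # assumption: slot following the direct blocks
EXT2_DIND_BLOCK = 13
EXT2_TIND_BLOCK = 14


def block_path(n: int, block_size: int) -> tuple:
    """Return the i_block slot followed by the index to use at each level."""
    per_block = block_size // 4          # addresses held by one block
    if n < 0:
        raise ValueError("file block index must not be negative")
    if n < EXT2_NDIR_BLOCKS:
        return (n,)
    n -= EXT2_NDIR_BLOCKS
    if n < per_block:
        return (EXT2_IND_BLOCK, n)
    n -= per_block
    if n < per_block ** 2:
        return (EXT2_DIND_BLOCK, n // per_block, n % per_block)
    n -= per_block ** 2
    if n < per_block ** 3:
        return (EXT2_TIND_BLOCK, n // per_block ** 2,
                (n // per_block) % per_block, n % per_block)
    raise ValueError("file block index beyond triple indirection")
```

With 1k blocks each indirect block holds 256 addresses, so the single indirect block covers file blocks 12 to 267.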
Now that you know how to find and read inodes, you can
start to read the files. There is a set of special
inodes which are reserved for certain purposes. These
include:
identifier | value | description |
EXT2_BAD_INO | 1 | Bad blocks inode |
EXT2_ROOT_INO | 2 | Root inode |
EXT2_ACL_IDX_INO | 3 | ACL inode |
EXT2_ACL_DATA_INO | 4 | ACL inode |
EXT2_BOOT_LOADER_INO | 5 | Boot loader inode |
EXT2_UNDEL_DIR_INO | 6 | Undelete directory inode |
EXT2_FIRST_INO | 11 | First non reserved inode |
The most important inode here is the root inode.
This is the inode at the root of the file system. This
inode is a directory, which like all directories has the
following structure:
field name | type | description |
inode | ULONG | address of inode |
rec_len | USHORT | length of this record |
name_len | USHORT | length of file name |
name | CHAR[0] | the file name |
A directory is a list of these structures. The
structures cannot pass over a block boundary, so the last
record is extended to fill the block. An entry with an
inode of 0 should be ignored.
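Walking one directory block could be sketched as follows. The code assumes little-endian values and the 8-byte header (inode, rec_len, name_len) described above, and it decodes names as Latin-1 purely for display; parse_directory_block is an illustrative name.

```python
import struct

DIRENT_HEADER = "<IHH"                      # inode, rec_len, name_len
DIRENT_HEADER_SIZE = struct.calcsize(DIRENT_HEADER)


def parse_directory_block(block: bytes) -> list:
    """Return (inode, name) pairs for the live entries in one block."""
    entries = []
    offset = 0
    while offset + DIRENT_HEADER_SIZE <= len(block):
        inode, rec_len, name_len = struct.unpack_from(DIRENT_HEADER, block, offset)
        if rec_len < DIRENT_HEADER_SIZE:
            break                           # corrupt record; avoid looping forever
        if inode != 0:                      # inode 0 marks an entry to ignore
            start = offset + DIRENT_HEADER_SIZE
            name = block[start:start + name_len]
            entries.append((inode, name.decode("latin-1")))
        offset += rec_len                   # rec_len leads to the next record
    return entries
```

Because the last record is stretched to the end of the block, following rec_len from entry to entry ends exactly at the block boundary.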