Jan 4, 2015

How Table | Database | data are physically stored in Sql server

This post discuss about how databases, tables, indexes and data are physically stored in disk in sql server (anatomy).


Database storage in SQL server:

--A simple Script to create database
CREATE DATABASE Exact_Help

When we create a database by default it creates two files:
1. Data file
2. Log file

 --Script to get files of current database
USE Exact_Help

SELECT NAME AS [File Name]
     ,type_desc AS [File Type]
     ,physical_name AS [Physical Path]
     ,size AS [Size In Kb]
FROM sys.database_files

Sample output:
File Name
File Type
Physical Path
Size In Kb
Exact_Help
ROWS
C:\Program Files\MSSQL\DATA\Exact_Help.mdf
520
Exact_Help_log
LOG
C:\Program Files\MSSQL\DATA \Exact_Help_log.ldf
130

a. Log file

A database can have more than one log files. Content of a log file is transaction logs. For each DML and DDL operation, SQL server keeps data modification information in the log file so that in case of rollback,disaster etc data can be recovered for the log file. It is doesn’t keep database errors etc.

--Script to get the content of log file
SELECT *
FROM sys.fn_dblog(NULL, NULL)

b. Data file

A database can have more than one data files. Content of a data file is actual data of the tables.  In case of data file there is concept of file group.  All data files are stored inside the file groups. There can be more than one file groups and each file groups can have more than one data files. Each file group must have at least one data file which is called primary data file (mdf) and rest data files are called secondary data files (ldf).

By default all the data files are stored in the PRIMARY file group. So in the above example data file Exact_Help.mdf is stored inside file group named PRIMARY.

--Script to create database with 3 file groups including primary. Each file group keeps two data files. Also there are two log files.

CREATE DATABASE Exact_Help_File_Group

ON PRIMARY (
     NAME = PFGFile1
     ,FILENAME = N'C:\Data\Primary_File1.mdf'
     )
     ,(
     NAME = PFGFile2
     ,FILENAME = N'D:\Data\Primary_File2.ldf'
     )
         
,FILEGROUP First_File_Group (
     NAME = FFGFile1
     ,FILENAME = N'C:\Data\First_File1.mdf'
     )
     ,(
     NAME = FFGFile2
     ,FILENAME = N'D:\Data\First_File2.ldf'
     )
    
,FILEGROUP Second_File_Group (
     NAME = SFGFile1
     ,FILENAME = N'C:\Data\Second_File1.mdf'
     )
     ,(
     NAME = SFGFile2
     ,FILENAME = N'D:\Data\Second_File2.ldf'
     )
    

LOG ON (
     NAME = Log_File_One
     ,FILENAME = N'C:\Log\log1.ldf'
     )
     ,(
     NAME = Log_File_Two
     ,FILENAME = N'D:\Log\log2.ldf'
     )

--Script to get file group and physical path
SELECT ISNULL(groupname, 'NA') AS [File Group Name]
     ,NAME AS [File Name]
     ,CASE
          WHEN groupname IS NULL
              THEN 'Log'
          ELSE 'Data'
          END AS [File Type]
     ,filename AS [Physical Path]
     ,Size AS [Size In KB]
FROM sys.sysfiles F
LEFT JOIN sys.sysfilegroups FG ON F.groupid = FG.groupid
ORDER BY 3,1

Sample output:
File Group Name
File Name
File Type
Physical Path
Size In KB
First_File_Group
FFGFile1
Data
C:\Data\First_File1.mdf
128
First_File_Group
FFGFile2
Data
D:\Data\First_File2.ldf
128
PRIMARY
PFGFile1
Data
C:\Data\Primary_File1.mdf
520
PRIMARY
PFGFile2
Data
D:\Data\Primary_File2.ldf
128
Second_File_Group
SFGFile1
Data
C:\Data\Second_File1.mdf
128
Second_File_Group
SFGFile2
Data
D:\Data\Second_File2.ldf
128
NA
Log_File_Two
Log
D:\Log\log2.ldf
128
NA
Log_File_One
Log
C:\Log\log1.ldf
128

Table and index storage

In a database only tables and indexes are physically stored in the disk (Inside database files). 

A table without any indexes (Heap)

--Script to create a table
USE Exact_Help

CREATE TABLE tblStudent (
       RollNumber INTEGER
       ,NAME CHAR(50)
       ,City CHAR(10)
)

--Populating couples of records
INSERT INTO tblStudent VALUES
       (1,'Scott','LA'),
       (2,'Greg','WC'),
       (3,'Alain','London'),
       (4,'Rohit','Delhi')

By default any table which is created, is stored in the PRIMARY file group. If a file group contains more than one files then data of a table is stored any of the data files also more than one data files.

--Script to link table with file group and partition
SELECT OBJECT_NAME(I.[object_id]) AS [Table Name]
       ,I.type_desc AS [Table Type]
       ,FG.NAME AS [File Group Name]
       ,(
              SELECT TOP (1) physical_name
              FROM sys.database_files DF
              WHERE FG.data_space_id = DF.data_space_id
              ) AS [Data File Path]
       ,P.rows AS [Total Rows]
FROM sys.indexes I
INNER JOIN sys.filegroups FG ON I.data_space_id = FG.data_space_id
INNER JOIN sys.partitions P ON P.object_id = I.object_id
WHERE I.object_id = OBJECT_ID('tblStudent')

Sample output:
Table Name
Table Type
File Group
Data File Path
Total Rows
tblStudent
HEAP
PRIMARY
C:\Program Files\MSSQL\DATA\Exact_Help.mdf
4

Note: In Sql server there is concept of partition table in which a table has multiple partitions and data of each partition can be stored in different file groups. By default data of a table is stored in first partition. 

Concept of pages:

A data file of a database can have multiple pages.  Sql server has below types of pages:

1. Data page
2.  Index page
3. Text/Image
4.  Global Allocation Map, Shared Global Allocation Map
5. Page Free Space
6. Index Allocation Map
7. Bulk Changed Map
8. Differential Changed Map

Data pages:

It keeps the data of heap (Without indexes) except data of data types: text, ntext, image, nvarchar(max), varchar (max), varbinary (max) and xml if it doesn't fit in case of in row data.

Table is tblStudent is example of heap since it has not any indexes. So, all the data of tblStudent will be stored only in the data pages.

Data storage in a data page:

Size of a data page is (8KB) 8192 bytes. Out of which
a. 96 bytes is stored to keep the header information of a data page.
b.  7 bytes per row is used for row overhead.
c.  2 bytes per is used for row offset.
d.  Rest spaces are used to keep the actual data.

--Script to get all the pages used by a table
DBCC IND('Exact_Help','tblStudent',-1)

Sample output (some columns has been excluded)
PageFID
PagePID
IndexID
PartitionNumber
iam_chain_type
PageType
IndexLevel
1
118
0
1
In-row data
10
NULL
1
89
0
1
In-row data
1
0

To store the all the data of table tblStudent only two pages has used. Here page Type 10 means IAM page and page type 1 means data page. IAM page keeps the meta data information of different pages. Data page keep the actual data of the table.

Content inside data page:

--script to check the content of PageFID 1 and PagePID 89
DBCC TRACEON(3604)
GO

DBCC page('Exact_Help',1,89,3)
GO

DBCC TRACEOFF(3604)
GO

Sample output:

PAGE: (1:89)


BUFFER:


BUF @0x000000017EEFEB80

bpage = 0x00000000956CC000          bhash = 0x0000000000000000          bpageno = (1:89)
bdbid = 12                          breferences = 0                     bcputicks = 0
bsampleCount = 0                    bUse1 = 25164                       bstat = 0xb
blog = 0xb215accc                   bnext = 0x0000000000000000         

PAGE HEADER:


Page @0x00000000956CC000

m_pageId = (1:89)                   m_headerVersion = 1                 m_type = 1
m_typeFlagBits = 0x0                m_level = 0                         m_flagBits = 0x8000
m_objId (AllocUnitId.idObj) = 113   m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594045333504                               
Metadata: PartitionId = 72057594040483840                                Metadata: IndexId = 0
Metadata: ObjectId = 437576597      m_prevPage = (0:0)                  m_nextPage = (0:0)
pminlen = 68                        m_slotCnt = 4                       m_freeCnt = 7804
m_freeData = 380                    m_reservedCnt = 0                   m_lsn = (38:152:27)
m_xactReserved = 0                  m_xdesId = (0:0)                    m_ghostRecCnt = 0
m_tornBits = 0                      DB Frag ID = 1                     

Allocation Status

GAM (1:2) = ALLOCATED               SGAM (1:3) = ALLOCATED             
PFS (1:1) = 0x61 MIXED_EXT ALLOCATED  50_PCT_FULL                        DIFF (1:6) = CHANGED
ML (1:7) = NOT MIN_LOGGED          

Slot 0 Offset 0x60 Length 71

Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP    Record Size = 71

Memory Dump @0x000000000E658060

0000000000000000:   10004400 01000000 53636f74 74202020 20202020  ..D.....Scott      
0000000000000014:   20202020 20202020 20202020 20202020 20202020                      
0000000000000028:   20202020 20202020 20202020 20202020 20204c41                    LA
000000000000003C:   20202020 20202020 030000                              ...

Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4

RollNumber = 1                      

Slot 0 Column 2 Offset 0x8 Length 50 Length (physical) 50

NAME = Scott                                                            

Slot 0 Column 3 Offset 0x3a Length 10 Length (physical) 10

City = LA                          

Slot 1 Offset 0xa7 Length 71

Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP    Record Size = 71

Memory Dump @0x000000000E6580A7

0000000000000000:   10004400 02000000 47726567 20202020 20202020  ..D.....Greg       
0000000000000014:   20202020 20202020 20202020 20202020 20202020                     
0000000000000028:   20202020 20202020 20202020 20202020 20205743                    WC
000000000000003C:   20202020 20202020 030000                              ...

Slot 1 Column 1 Offset 0x4 Length 4 Length (physical) 4

RollNumber = 2                     

Slot 1 Column 2 Offset 0x8 Length 50 Length (physical) 50

NAME = Greg                                                             

Slot 1 Column 3 Offset 0x3a Length 10 Length (physical) 10

City = WC                          

Slot 2 Offset 0xee Length 71

Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP    Record Size = 71

Memory Dump @0x000000000E6580EE

0000000000000000:   10004400 03000000 416c6169 6e202020 20202020  ..D.....Alain      
0000000000000014:   20202020 20202020 20202020 20202020 20202020                     
0000000000000028:   20202020 20202020 20202020 20202020 20204c6f                    Lo
000000000000003C:   6e646f6e 20202020 030000                      ndon    ...

Slot 2 Column 1 Offset 0x4 Length 4 Length (physical) 4

RollNumber = 3                     

Slot 2 Column 2 Offset 0x8 Length 50 Length (physical) 50

NAME = Alain                                                            

Slot 2 Column 3 Offset 0x3a Length 10 Length (physical) 10

City = London                      

Slot 3 Offset 0x135 Length 71

Record Type = PRIMARY_RECORD        Record Attributes =  NULL_BITMAP    Record Size = 71

Memory Dump @0x000000000E658135

0000000000000000:   10004400 04000000 526f6869 74202020 20202020  ..D.....Rohit      
0000000000000014:   20202020 20202020 20202020 20202020 20202020                     
0000000000000028:   20202020 20202020 20202020 20202020 20204465                    De
000000000000003C:   6c686920 20202020 030000                      lhi     ...

Slot 3 Column 1 Offset 0x4 Length 4 Length (physical) 4

RollNumber = 4                     

Slot 3 Column 2 Offset 0x8 Length 50 Length (physical) 50

NAME = Rohit                                                             

Slot 3 Column 3 Offset 0x3a Length 10 Length (physical) 10

City = Delhi  

Understanding output:

Header: It keeps multiple header information like page id, page type, version, total free spaces in the page etc.  

Slots:  For each row there is slot. In the tblStudent there are 4 rows hence there are 4 slots (slot 0, slot 1, slot 2 and slot 3)

Page size calculation:

Header Size + Total rows * (Row offset size + Row overhead + Data length of each rows)

For table tblstudent:
96 + 4 * (2 + 7 + (4 + 50 + 10)) = 388
Total free space in the page: 1024 * 8 – 388 = 7804
It is exactly same which is in m_freeCnt of the header of data page.

Note: To calculate the data length of each row is bit complex of variable length data type. For example:
CREATE TABLE tblStudent (
       RollNumber INTEGER
       ,NAME VARCHAR(50)
       ,City VARCHAR(10)
)


INSERT INTO tblStudent VALUES
       (1,'Scott','LA'),
       (2,'Greg','WC'),
       (3,'Alain','London'),
       (4,'Rohit','Delhi')

For this table data length of:
Row 1: 4 + (5 +2) + (2 + 2) + 2 = 17
Row 2: 4 + (4 +2) + (2 + 2) + 2 = 16
Row 3: 4 + (5 +2) + (6 + 2) + 2 = 21
Row 4: 4 + (5 +2) + (5 + 2) + 2 = 20

Note: Each varchar data type takes extra two bytes for each column as well as 2 bytes for each row. 

A table with clustered index

--Script to create a table with clustered index
CREATE TABLE tblStudent (
       RollNumber INTEGER PRIMARY KEY
       ,NAME VARCHAR(50)
       ,City VARCHAR(10)
)

--Populating couples of records
INSERT INTO tblStudent VALUES
       (1,'Scott','LA'),
       (2,'Greg','WC'),
       (3,'Alain','London'),
       (4,'Rohit','Delhi')

It is stored same as heap. It is stored in the data page.

Index Page:

Cluatered index (non-leaf) and Non-clustered index are stored in index page.

--Creating Non-clustered index
CREATE INDEX NC_Name ON tblStudent(Name)

--Script to get all the pages used by a table
DBCC IND('Exact_Help','tblStudent',-1)

Sample output
PageFID
PagePID
IndexID
PartitionNumber
iam_chain_type
PageType
IndexLevel
1
118
0
1
In-row data
10
NULL
1
89
0
1
In-row data
1
0
1
147
2
1
In-row data
10
NULL
1
146
2
1
In-row data
2
0

Here Page type 2 is index page.

--script to check the content of PageFID 1 and PagePID 146
DBCC TRACEON(3604)
GO

DBCC page('Exact_Help',1,146,3)
GO

DBCC TRACEOFF(3604)
GO

Sample Output:
FileId
PageId
Row
Level
NAME (key)
HEAP RID (key)
KeyHashValue
Row Size
1
146
0
0
Alain  
0x9000000001000200
(6514a0b0a058)
62
1
146
1
0
Greg
0x9000000001000100
(276df18e7a94)
62
1
146
2
0
Rohit  
0x9000000001000300
(e1fa712a78b5)
62
1
146
3
0
Scott  
0x9000000001000000
(26810d59e73a)
62

Note: In case of Heap table Sql server auto include RID column in the index page while in case of clustered index table it includes primary key column. 

--script to get the content of PageFID 1 and PagePID 146
DBCC TRACEON(3604)
GO

DBCC page('Exact_Help',1,146,1)
GO

DBCC TRACEOFF(3604)
GO

PAGE HEADER:


Page @0x0000000134AF0000

m_pageId = (1:146)                  m_headerVersion = 1                 m_type = 2
m_typeFlagBits = 0x0                m_level = 0                         m_flagBits = 0x4
m_objId (AllocUnitId.idObj) = 119   m_indexId (AllocUnitId.idInd) = 256
Metadata: AllocUnitId = 72057594045726720                               
Metadata: PartitionId = 72057594040877056                                Metadata: IndexId = 2
Metadata: ObjectId = 517576882      m_prevPage = (0:0)                  m_nextPage = (0:0)
pminlen = 59                        m_slotCnt = 4                       m_freeCnt = 7840
m_freeData = 344                    m_reservedCnt = 0                   m_lsn = (39:48:19)
m_xactReserved = 0                  m_xdesId = (0:0)                    m_ghostRecCnt = 0
m_tornBits = 0                      DB Frag ID = 1                     

Calculation of free space for index page

Size of index page – size of header – Total Number of Rows * (Row offset Size + (Data length of each row of Non-clustered index))
= 8 * 1024 - 96 - 4 * (2 + (50 + 8 + 4))
= 7840
   
Text/Image page:

This page keeps the data of large data types: text, ntext, image, nvarchar (max), varchar (max), varbinary (max), and xml data. Big data are not stored in the data page. It keeps only 4 bytes pointer to Text/Image page. Actual data are stored in Text/Image page.

--Script add new column of data type VARCHAR(MAX)
ALTER TABLE tblStudent ADD StudentResume VARCHAR(MAX)

UPDATE tblStudent
SET StudentResume = REPLICATE('My Info', 80000)

--Script to get all the pages used by a table
DBCC IND('Exact_Help','tblStudent',-1)

PageFID
PagePID
IndexID
PartitionNumber
iam_chain_type
PageType
IndexLevel
1
118
0
1
In-row data
10
NULL
1
89
0
1
In-row data
1
0
1
147
2
1
In-row data
10
NULL
1
146
2
1
In-row data
2
0
1
156
0
1
LOB Data
10
NULL
1
150
0
1
LOB data
3
0
1
162
0
1
LOB data
3
0
1
168
0
1
LOB data
3
0
1
169
0
1
LOB data
3
0

Page type 3 is Text/Image pages. There are 4 Text/Image pages have created in each row of tblStudent.
  
--script to check the content of PageFID 1 and PagePID 150 (text/image page)
DBCC TRACEON(3604)
GO

DBCC page('Exact_Help',1,150,3)
GO

DBCC TRACEOFF(3604)

GO

2 comments:

  1. Anonymous6/29/2016

    Size of a data page is 8KB not 4KB

    ReplyDelete