IF41204 - IF31204 Data Integrity Pengenalan Basisdata Taken from Togu Turnip slide- Institut Teknologi Del Minggu 05 Sesi 01 AGENDA 1 Data Integrity 2 Type of Data Integrity 3 Enforcing Data Integrity Data Integrity 1 When is Data Integrity Lost? • • • • Invalid data added to the database Existing data modified to incorrect value Changes to database lost Changes partially applied Why Data Integrity Important • • • • • Undermines the safety and efficiency and/or assurance of quality of the data Example: drugs that consumers will take. Data integrity problems break trust. We rely largely on trusting the firm to do the right thing when no one is seeing. Integrity of data can be maintained by specifying certain checks at the time of creating and modifying tables and then applying those checks when handling data. Data Integrity • • • In its broadest use, “data integrity” refers to the accuracy and consistency of data stored in a relational database. The term – Data Integrity - can be used to describe a state, a process or a function – and is often used as a proxy for “data quality”. All characteristics of the data must be correct – including business rules, relations, dates, definitions and lineage – for data to be complete. Rules of Data Integrity • • • • Uniqueness Validity Consistency Business Rules 2 Type of Data Integrity Types of Data Integrity 1 2 Requirement Stages Requirement Stages 3 Requirement Stages Illustration of Data Integrity Entity Integrity • • • • • Each entity is unique and table’s PRIMARY KEY must have unique values for each row DBMS automatically checks uniqueness of primary key value for each insert or update Insert or update to include duplicate value in primary key column results in error If required for non-primary key column, UNIQUE constraint or CREATE INDEX Uniqueness also enforces NOT NULL for primary key Domain Integrity A domain defines the possible values of an attribute. Domain Integrity rules govern these values. In a database system, the domain integrity is defined by: • • • • • • • • The datatype and the length The NULL value acceptance The allowable values, through techniques like constraints or rules The default value For example, if you define the attribute of Age, of an Employee entity, is an integer, the value of every instance of that attribute must always be numeric and an integer. Domain Information Each attribute in the model should be assigned domain information which includes: • • • • • • • • Data Type - Basic data types are integer, decimal, or character. Most data bases support variants of these plus special data types for date and time. Length - This is the number of digits or characters in the value. For example, a value of 5 digits or 40 characters. Date Format - The format for date values such as dd/mm/yy or mm/dd/yyyy. Range - The range specifies the lower and upper boundaries of the values the attribute may legally have. Constraints - Are special restrictions on allowable values. For example, the LeavingDate for an Employee must always be greater than the HireDate for that Employee. Null support - Indicates whether the attribute can have null values. Default value (if any) - The value an attribute instance will have if a value is not entered. Referential Integrity • • • • A Rule that states that any foreign key value (on the relation of the many sides) MUST match a primary key value in another table. A constraint that enforce integrity between related tables. Foreign key is a set of attributes that are a primary key for some other relations Foreign key is identifier that enable a dependent relation to refer to its parent relation 3 Enforcing Data Integrity Mechanisms to Enforcing Data Integrity Determining Which Type of Constraint to Use Requirement Stages Domain A domain defines a logical set of values that make up the valid values in a column. Domain integrity is maintained using the following: • • • • • FOREIGN KEY constraint: a FOREIGN KEY column can either have a value that exists in the UNIQUE or PRIMARY KEY columns of the referenced table or it can have a null value. CHECK constraint: specifies the range of valid data values that can be entered into a column. DEFAULT definitions: specify default values for columns that do not accept null values. Domain • • • FOREIGN KEY constraint: a FOREIGN KEY column can either have a value that exists in the UNIQUE or PRIMARY KEY columns of the referenced table or it can have a null value. CHECK constraint: CHECK constraint can be specified on a column within the CREATE TABLE or ALTER TABLE statement. DEFAULT definitions: specify default values for columns that do not accept null values. Example : Domain Constraints Requirement Stages Entity A table is said to comply with entity integrity when no two rows in the table have the exact same values in all the columns. Entity integrity is ensured using: • • • • • • PRIMARY KEY constraint: does not allow duplicate or null values to be inserted. UNIQUE constraint: does not allow duplicate but allows null values to be inserted. However, this column allows a null value to be inserted once. Indexes: prevents duplicate values from being entered in a column. IDENTITY property: defines an identifier column that contains systemgenerated sequential values for every record inserted. Example : Entity Constraints Requirement Stages Example : Key Constraints Referential Referential integrity is implemented using the concept of FOREIGN KEYS. FOREIGN KEYS columns reference UNIQUE or PRIMARY KEY columns in other tables. Referential integrity is ensured using: • • • • • Values can be inserted in a FOREIGN KEY column only if similar values exist in the referenced UNIQUE or PRIMARY KEY column. If a value in the UNIQUE or PRIMARY KEY column is modified, similar modifications are carried out in the referring FOREIGN KEY columns. If a value in the UNIQUE or PRIMARY KEY column is deleted, deletion should be carried out in the referring FOREIGN KEY columns. The Cascading Options • • • • When records in the referenced PRIMARY KEY table are modified, changes can be cascaded to corresponding records in the referring FOREIGN KEY table. This way is used to maintain the referential integrity. The cascading options are defined in the REFERENCES clause of the CREATE TABLE or ALTER TABLE statement. DBMS provides the following two cascading options: • Cascading Update. • Cascading Delete. Example : Referential Constraints Requirement Stages Summary • • • • • • Data Integrity ensures accurate and up-to-date information at any point in time. In DBMS, data integrity is enforced using Constraints, Default Values, Rules and Triggers Types of Data Integrity To maintain accuracy and consistency of data in a relational database. To ensure validity and consistency of data in a database. DBMS supports UNIQUE, CHECK, PRIMARY and FOREIGN KEY constraints on columns in a table. References • DAVID M. KROENKE’S DATABASE CONCEPTS, 2nd Edition, 2005 Pearson Prentice Hall • Slide was adapted from VES (2016) for Introduction to Database course •Database Operations and Constraint Violations in DBMS, accessed from: http://www.edugrabs.com/database-operations-and-constraintviolations/ THANK YOU! DON’T FORGET TO BE AWESOME