Database Design
Lis Sulmont
Curriculum Manager
Normalization: identify repeating groups of data and create new tables for them.
A more formal definition:
The goals of normalization are to:
- Characterize the level of redundancy in a relational schema
- Provide mechanisms for transforming schemas in order to remove redundancy
Normal forms, ordered from least to most normalized, begin with:
- First normal form (1NF)
- Second normal form (2NF)
- Third normal form (3NF)
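First normal form (1NF): each cell must hold a single value. Initial data, where Courses_Completed holds several values per cell: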
| Student_id | Student_Email | Courses_Completed |
|------------|-----------------|----------------------------------------------------------|
| 235 | [email protected] | Introduction to Python, Intermediate Python |
| 455 | [email protected] | Cleaning Data in R |
| 767 | [email protected] | Machine Learning Toolbox, Deep Learning in Python |

In 1NF:

| Student_id | Student_Email |
|------------|-----------------|
| 235 | [email protected] |
| 455 | [email protected] |
| 767 | [email protected] |

| Student_id | Completed |
|------------|--------------------------|
| 235 | Introduction to Python |
| 235 | Intermediate Python |
| 455 | Cleaning Data in R |
| 767 | Machine Learning Toolbox |
| 767 | Deep Learning in Python |
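The split above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `to_1nf` is hypothetical, and the e-mail addresses are placeholders because the real values are not shown in the table.

```python
# Rows mirror the initial table; the e-mail values are placeholders (assumption).
initial = [
    (235, "student235@example.com", "Introduction to Python, Intermediate Python"),
    (455, "student455@example.com", "Cleaning Data in R"),
    (767, "student767@example.com", "Machine Learning Toolbox, Deep Learning in Python"),
]


def to_1nf(rows):
    """Split the multi-valued column so that every cell holds one value."""
    # One row per student: the e-mail is stored exactly once.
    students = [(student_id, email) for student_id, email, _ in rows]
    # One row per (student, completed course) pair.
    completed = [
        (student_id, course.strip())
        for student_id, _, courses in rows
        for course in courses.split(",")
    ]
    return students, completed


students, completed = to_1nf(initial)
```

Splitting on commas only works here because no course title contains a comma; real data would need a safer delimiter or a proper source table.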
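Second normal form (2NF): with a composite primary key, every non-key column must depend on the whole key. Initial data, where Instructor_id and Instructor depend only on Course_id: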
| Student_id (PK) | Course_id (PK) | Instructor_id | Instructor | Progress |
|-----------------|----------------|---------------|---------------|----------|
| 235 | 2001 | 560 | Nick Carchedi | .55 |
| 455 | 2345 | 658 | Ginger Grant | .10 |
| 767 | 6584 | 999 | Chester Ismay | 1.00 |

In 2NF:

| Student_id (PK) | Course_id (PK) | Percent_Completed |
|-----------------|----------------|-------------------|
| 235 | 2001 | .55 |
| 455 | 2345 | .10 |
| 767 | 6584 | 1.00 |

| Course_id (PK) | Instructor_id | Instructor |
|----------------|---------------|---------------|
| 2001 | 560 | Nick Carchedi |
| 2345 | 658 | Ginger Grant |
| 6584 | 999 | Chester Ismay |
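As a rough check that this decomposition loses nothing, the two tables can be loaded into an in-memory SQLite database and joined back together. The table names `courses` and `progress` and the lowercase column names are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE courses (
    course_id     INTEGER PRIMARY KEY,
    instructor_id INTEGER NOT NULL,
    instructor    TEXT NOT NULL
);
CREATE TABLE progress (
    student_id        INTEGER,
    course_id         INTEGER REFERENCES courses(course_id),
    percent_completed REAL,
    PRIMARY KEY (student_id, course_id)  -- composite key, as in the slides
);
""")
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?)",
    [(2001, 560, "Nick Carchedi"), (2345, 658, "Ginger Grant"), (6584, 999, "Chester Ismay")],
)
conn.executemany(
    "INSERT INTO progress VALUES (?, ?, ?)",
    [(235, 2001, 0.55), (455, 2345, 0.10), (767, 6584, 1.00)],
)

# Joining on course_id reconstructs the rows of the initial table.
rows = conn.execute("""
    SELECT p.student_id, p.course_id, c.instructor_id, c.instructor, p.percent_completed
    FROM progress AS p
    JOIN courses AS c ON p.course_id = c.course_id
    ORDER BY p.student_id
""").fetchall()
```

Instructor details now live in one row per course, so they are no longer repeated for every student who takes that course.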
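Third normal form (3NF): no non-key column may depend on another non-key column. Initial data, where Instructor depends on Instructor_id: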
| Course_id (PK) | Instructor_id | Instructor | Tech |
|----------------|---------------|---------------|--------|
| 2001 | 560 | Nick Carchedi | Python |
| 2345 | 658 | Ginger Grant | SQL |
| 6584 | 999 | Chester Ismay | R |

In 3NF:

| Course_id (PK) | Instructor | Tech |
|----------------|---------------|--------|
| 2001 | Nick Carchedi | Python |
| 2345 | Ginger Grant | SQL |
| 6584 | Chester Ismay | R |

| Instructor_id | Instructor |
|---------------|---------------|
| 560 | Nick Carchedi |
| 658 | Ginger Grant |
| 999 | Chester Ismay |
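The same decomposition can be sketched in plain Python. The function name `to_3nf` and the dictionary keys are hypothetical; the output mirrors the two tables above, with the course table keeping the instructor name as its link to the instructor table.

```python
initial = [
    {"course_id": 2001, "instructor_id": 560, "instructor": "Nick Carchedi", "tech": "Python"},
    {"course_id": 2345, "instructor_id": 658, "instructor": "Ginger Grant", "tech": "SQL"},
    {"course_id": 6584, "instructor_id": 999, "instructor": "Chester Ismay", "tech": "R"},
]


def to_3nf(rows):
    """Move the instructor_id -> instructor dependency into its own table."""
    # Course table: instructor_id is dropped, as in the slides.
    courses = [
        {"course_id": r["course_id"], "instructor": r["instructor"], "tech": r["tech"]}
        for r in rows
    ]
    # Instructor table: a dict keeps one entry per instructor_id,
    # even if that instructor teaches several courses.
    instructors = {r["instructor_id"]: r["instructor"] for r in rows}
    return courses, instructors


courses, instructors = to_3nf(initial)
```

A common alternative, not shown in the slides, keeps `instructor_id` in the course table instead of the name, so that renaming an instructor touches a single row.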
What do we risk if we don't normalize enough?
1. Update anomalies
2. Insertion anomalies
3. Deletion anomalies
Update anomaly: data inconsistency caused by data redundancy when updating
| Student_ID | Student_Email | Enrolled_in | Taught_by |
|------------|-----------------|-------------------------|---------------------|
| 230 | [email protected] | Cleaning Data in R | Maggie Matsui |
| 367 | [email protected] | Data Visualization in R | Ronald Pearson |
| 520 | [email protected] | Introduction to Python | Hugo Bowne-Anderson |
| 520 | [email protected] | Arima Models in R | David Stoffer |
To update student 520's email, every row for that student (two here) must be changed; updating only one leaves the data inconsistent.
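A small SQLite sketch of how this goes wrong. The table name `enrollments` and all e-mail addresses are placeholders chosen for illustration; the update deliberately targets only one of student 520's two rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollments ("
    "student_id INTEGER, student_email TEXT, enrolled_in TEXT, taught_by TEXT)"
)
conn.executemany(
    "INSERT INTO enrollments VALUES (?, ?, ?, ?)",
    [
        (230, "student230@example.com", "Cleaning Data in R", "Maggie Matsui"),
        (367, "student367@example.com", "Data Visualization in R", "Ronald Pearson"),
        (520, "student520@example.com", "Introduction to Python", "Hugo Bowne-Anderson"),
        (520, "student520@example.com", "Arima Models in R", "David Stoffer"),
    ],
)

# A careless update that touches only one of the two rows for student 520.
conn.execute(
    "UPDATE enrollments SET student_email = ? WHERE student_id = ? AND enrolled_in = ?",
    ("new520@example.com", 520, "Introduction to Python"),
)

# The same student now has two different e-mail addresses on record.
emails = {
    email
    for (email,) in conn.execute(
        "SELECT student_email FROM enrollments WHERE student_id = 520"
    )
}
```

With the e-mail stored once in a separate student table, the same update would change a single row and could not produce this disagreement.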
Insertion anomaly: unable to add a record because of missing attributes
For example, a student who has signed up but not yet enrolled in any course cannot be added.
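The sketch below reproduces this with SQLite. It assumes the flat table declares every column `NOT NULL`, which the slides do not state; the table names, student id 601, and the e-mail address are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Flat design: every attribute is required (an assumption for this sketch).
CREATE TABLE enrollments_flat (
    student_id    INTEGER NOT NULL,
    student_email TEXT    NOT NULL,
    enrolled_in   TEXT    NOT NULL,
    taught_by     TEXT    NOT NULL
);
-- Normalized design: students exist independently of any course.
CREATE TABLE students (
    student_id    INTEGER PRIMARY KEY,
    student_email TEXT NOT NULL
);
""")

new_student = (601, "student601@example.com")  # signed up, no course yet

try:
    conn.execute(
        "INSERT INTO enrollments_flat (student_id, student_email) VALUES (?, ?)",
        new_student,
    )
    flat_insert_ok = True
except sqlite3.IntegrityError:
    # The course columns are missing, so the flat table rejects the row.
    flat_insert_ok = False

# The normalized table accepts the same student without any course data.
conn.execute("INSERT INTO students VALUES (?, ?)", new_student)
student_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
```

If the flat table allowed NULLs instead, the insert would succeed but leave a row with empty course fields, which trades the anomaly for placeholder data.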
Deletion anomaly: deleting a record causes unintentional loss of data
If we delete student 230, what happens to the data on Cleaning Data in R? It is lost as well, because that row is the only record of the course and its instructor.
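A short Python sketch makes the loss explicit by comparing the courses and instructors on record before and after the deletion. The e-mail addresses are placeholders and the variable names are chosen for illustration.

```python
# (student_id, student_email, enrolled_in, taught_by); e-mails are placeholders.
rows = [
    (230, "student230@example.com", "Cleaning Data in R", "Maggie Matsui"),
    (367, "student367@example.com", "Data Visualization in R", "Ronald Pearson"),
    (520, "student520@example.com", "Introduction to Python", "Hugo Bowne-Anderson"),
    (520, "student520@example.com", "Arima Models in R", "David Stoffer"),
]

# Delete every row belonging to student 230.
remaining = [row for row in rows if row[0] != 230]

# Anything known before the deletion but not after it has been lost.
lost_courses = {row[2] for row in rows} - {row[2] for row in remaining}
lost_instructors = {row[3] for row in rows} - {row[3] for row in remaining}
```

Here `lost_courses` contains Cleaning Data in R and `lost_instructors` contains Maggie Matsui, even though the intent was only to remove a student.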
In short, insufficient normalization risks update, insertion, and deletion anomalies.
The more normalized the database, the less prone it is to data anomalies.
Don't forget the downsides of normalization from the previous video.