Character data types and common issues

Exploratory Data Analysis in SQL

Christina Maimone

Data Scientist

Character types in PostgreSQL

character(n) or char(n)

  • fixed length n
  • trailing spaces ignored in comparisons

character varying(n) or varchar(n)

  • variable length up to a maximum of n

text or varchar

  • unlimited length
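
As a minimal sketch of the difference between these types, the query below compares the same pair of values first as char(5) and then as varchar(5). The literal values are made up for illustration, and the results noted in the comments assume PostgreSQL's default (deterministic) collations.

```sql
-- char(n): values are space-padded to n, and trailing spaces
-- are ignored when two char values are compared
SELECT 'abc'::char(5) = 'abc  '::char(5)       AS char_equal,     -- true
       -- varchar(n): trailing spaces are part of the value
       'abc'::varchar(5) = 'abc  '::varchar(5) AS varchar_equal;  -- false
```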

Types of text data

Categorical

Tues, Tuesday, Mon, TH

shirts, shoes, hats, pants

satisfied, very satisfied, unsatisfied

0349-938, 1254-001, 5477-651

red, blue, green, yellow

Unstructured text

I really like this product. I use it every day. It's my favorite color.

We've redesigned your favorite t-shirt to make it even better. You'll love...

Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal...

Grouping and counting

SELECT category,        -- categorical variable
       count(*)         -- count rows for each category
  FROM product          -- table
 GROUP BY category;     -- categorical variable


 category | count 
----------+-------
 Banana   |     1
 Apple    |     4
 apple    |     2
  apple   |     1
 banana   |     3
(5 rows)
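
To reproduce a result like this one, a small sample table can be built by hand. The sketch below is an assumption, not the course's actual data: the table definition (a single text column) and the individual rows are hypothetical, chosen only so that the counts match the output above.

```sql
-- Hypothetical setup: the column type and rows are assumptions
CREATE TEMP TABLE product (category text);

INSERT INTO product (category)
VALUES ('Apple'), ('Apple'), ('Apple'), ('Apple'),
       ('banana'), ('banana'), ('banana'),
       ('apple'), ('apple'),
       ('Banana'),
       (' apple');   -- note the leading space

-- Same query as above; without ORDER BY the row order is not guaranteed
SELECT category,
       count(*)
  FROM product
 GROUP BY category;
```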

Order: most frequent values

SELECT category,        -- categorical variable
       count(*)         -- count rows for each category
  FROM product          -- table
 GROUP BY category      -- categorical variable
 ORDER BY count DESC;   -- show most frequent values first

 category | count 
----------+-------
 Apple    |     4
 banana   |     3
 apple    |     2
 Banana   |     1
  apple   |     1
(5 rows)
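
The two categories with a count of 1 are tied, so their relative order in a result like this is not guaranteed. A small sketch, assuming the same product table, that adds the category as a secondary sort key so the order is repeatable:

```sql
SELECT category,
       count(*)
  FROM product
 GROUP BY category
 ORDER BY count DESC,   -- most frequent values first
          category;     -- break ties by category value
```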

Order: category value

SELECT category,        -- categorical variable
       count(*)         -- count rows for each category
  FROM product          -- table
 GROUP BY category      -- categorical variable
 ORDER BY category;     -- order by the categorical variable

 category | count 
----------+-------
  apple   |     1
 Apple    |     4
 Banana   |     1
 apple    |     2
 banana   |     3
(5 rows)

Alphabetical order

-- Results

 category | count 
----------+-------
  apple   |     1
 Apple    |     4
 Banana   |     1
 apple    |     2
 banana   |     3
(5 rows)

-- Alphabetical order:

' ' < 'A' < 'a'

-- From results

' ' < 'A' < 'B' < 'a' < 'b'
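
String ordering in PostgreSQL depends on the collation in use. The order shown here is consistent with byte-order comparison, as in the built-in "C" collation; other collations may interleave uppercase and lowercase letters differently. A sketch that checks each step of the chain explicitly, assuming the "C" collation (the column aliases are illustrative):

```sql
SELECT ' '::text < 'A'::text COLLATE "C" AS space_before_upper,  -- true
       'A'::text < 'B'::text COLLATE "C" AS upper_a_before_b,    -- true
       'B'::text < 'a'::text COLLATE "C" AS upper_before_lower,  -- true
       'a'::text < 'b'::text COLLATE "C" AS lower_a_before_b;    -- true
```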

Common issues

Case matters

    'apple' != 'Apple'

Spaces count

    ' apple' != 'apple'

    '' != '       '

Empty strings aren't null

    '' != NULL

Punctuation differences

    'to-do' != 'to–do'
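
Each of these cases can be checked directly in a query. The sketch below uses illustrative column aliases and assumes a default (deterministic) collation. One caveat for the null case: a direct comparison such as '' <> NULL evaluates to NULL rather than true, so the sketch uses IS NULL and IS DISTINCT FROM instead.

```sql
SELECT 'apple' <> 'Apple'             AS case_differs,             -- true
       ' apple' <> 'apple'            AS leading_space_differs,    -- true
       '' <> '       '                AS empty_vs_spaces,          -- true
       ''::text IS NULL               AS empty_is_null,            -- false
       ''::text IS DISTINCT FROM NULL AS empty_differs_from_null,  -- true
       'to-do' <> 'to–do'             AS hyphen_vs_en_dash;        -- true
```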

Time to examine some text data
