Pulire stringhe disordinate

Pulizia dei dati nei database SQL Server

Miriam Antona

Software Engineer

Rimuovere spazi extra

SELECT * FROM carriers
| code | name                            |
|------|---------------------------------|
| YV   | Mesa Airlines Inc.              |
| AA   |   American Airlines Inc.        |
| B6   |   JetBlue Airways               |
| DL   |   Delta Air Lines Inc.          |
| HA   |   Hawaiian Airlines Inc.        |
| MQ   |   American Eagle Airlines Inc.  |
| EV   |  ExpressJet Airlines Inc.       |
| UA   |  United Air Lines Inc.          |
| US   |  US Airways Inc.                |
| ...  | ...                             |
Pulizia dei dati nei database SQL Server

Rimuovere spazi extra - TRIM

TRIM ( [characters ] string )
  • Disponibile da SQL Server 2017
  • Rimuove i caratteri indicati all'inizio e alla fine di una stringa
  • Se non specifichi caratteri, rimuove gli spazi
SELECT TRIM('   JetBlue Airways    ');
JetBlue Airways
Pulizia dei dati nei database SQL Server

Rimuovere spazi extra - RTRIM e LTRIM

  • Per versioni precedenti a SQL Server 2017 -> RTRIM e LTRIM.
-- Rimuove tutti gli spazi finali
RTRIM ( character_expression )
-- Rimuove tutti gli spazi iniziali
LTRIM ( character_expression )
SELECT LTRIM(RTRIM('   JetBlue Airways    '));
JetBlue Airways
Pulizia dei dati nei database SQL Server

Rimuovere spazi extra

SELECT code, TRIM(name) AS name FROM carriers
SELECT code, LTRIM(RTRIM(name)) AS name FROM carriers
| code | name                         |
|------|------------------------------|
| YV   | Mesa Airlines Inc.           |
| AA   | American Airlines Inc.       |
| B6   | JetBlue Airways              |
| DL   | Delta Air Lines Inc.         |
| HA   | Hawaiian Airlines Inc.       |
| US   | US Airways Inc.              |
| ...  | ...                          |
Pulizia dei dati nei database SQL Server

Uniformare le stringhe

SELECT * FROM airports 
ORDER BY airport_state
| airport_code | airport_name                            | airport_city    | airport_state |
|--------------|-----------------------------------------|-----------------|---------------|
| ...          | ...                                     | ...             | ...           |
| MIA          | Miami International                     | Miami           | fl            |
| TPA          | Tampa International                     | Tampa           | Fl            |
| FLL          | Fort Lauderdale-Hollywood International | Fort Lauderdale | FL            |
| MCO          | Orlando International                   | Orlando         | Florida       |
| ...          | ...                                     | ...             | ...           |
Pulizia dei dati nei database SQL Server

Uniformare le stringhe - REPLACE

"Fl" / "fl" / "Florida" -> "Florida"

REPLACE ( string_to_replace , occurrences , string_replacement )
  • Sostituisce tutte le occorrenze di una stringa con un'altra

  • Confronta in base alla collation dell'input

    • Usa COLLATE per confrontare con una collation specifica
    • Supponiamo che la collation degli input ignori le maiuscole/minuscole
Pulizia dei dati nei database SQL Server

Uniformare le stringhe - REPLACE

SELECT 
    airport_code, airport_name, airport_city, 
    REPLACE(airport_state, 'FL', 'Florida') AS airport_state
FROM airports
ORDER BY airport_state
| airport_code | airport_name                            | airport_city    | airport_state |
|--------------|-----------------------------------------|-----------------|---------------|
| ...          | ...                                     | ...             | ...           |
| MIA          | Miami International                     | Miami           | Florida       |
| TPA          | Tampa International                     | Tampa           | Florida       |
| FLL          | Fort Lauderdale-Hollywood International | Fort Lauderdale | Florida       |
| MCO          | Orlando International                   | Orlando         | Floridaorida  |
| ...          | ...                                     | ...             | ...           |
Pulizia dei dati nei database SQL Server

Uniformare le stringhe - REPLACE

SELECT 
    airport_code, airport_name, airport_city, 
    REPLACE
        (REPLACE(airport_state, 'FL', 'Florida'),
         'Floridaorida', 'Florida') AS airport_state
FROM airports
ORDER BY airport_state
| airport_code | airport_name                            | airport_city    | airport_state |
|--------------|-----------------------------------------|-----------------|---------------|
| MCO          | Orlando International                   | Orlando         | Florida       |
| TPA          | Tampa International                     | Tampa           | Florida       |
| FLL          | Fort Lauderdale-Hollywood International | Fort Lauderdale | Florida       |
| MIA          | Miami International                     | Miami           | Florida       |
| ...          | ...                                     | ...             | ...           |
Pulizia dei dati nei database SQL Server

Uniformare le stringhe - REPLACE + CASE

SELECT airport_code, airport_name, airport_city, 
    CASE
        WHEN airport_state <> 'Florida' THEN REPLACE(airport_state, 'FL', 'Florida')
        ELSE airport_state 
    END AS airport_state
FROM airports
ORDER BY airport_state
| airport_code | airport_name                            | airport_city    | airport_state |
|--------------|-----------------------------------------|-----------------|---------------|
| MCO          | Orlando International                   | Orlando         | Florida       |
| TPA          | Tampa International                     | Tampa           | Florida       |
| FLL          | Fort Lauderdale-Hollywood International | Fort Lauderdale | Florida       |
| MIA          | Miami International                     | Miami           | Florida       |
| ...          | ...                                     | ...             | ...           |
Pulizia dei dati nei database SQL Server

Uniformare le stringhe - REPLACE + UPPER

"Fl" / "fl" / "Florida" -> "FL"

SELECT 
    airport_code, airport_name, airport_city, 
    REPLACE(airport_state, 'Florida', 'FL') AS airport_state
FROM airports
ORDER BY airport_state
| airport_code | airport_name                            | airport_city    | airport_state |
|--------------|-----------------------------------------|-----------------|---------------|
| MCO          | Orlando International                   | Orlando         | FL            |
| TPA          | Tampa International                     | Tampa           | Fl            |
| FLL          | Fort Lauderdale-Hollywood International | Fort Lauderdale | FL            |
| MIA          | Miami International                     | Miami           | fl            |
| ...          | ...                                     | ...             | ...           |
Pulizia dei dati nei database SQL Server

Uniformare le stringhe - REPLACE + UPPER

UPPER ( character_expression )
  • Converte una stringa in maiuscolo.
SELECT 
    airport_code, airport_name, airport_city, 
        UPPER(
              REPLACE(airport_state, 'Florida', 'FL')
        ) AS airport_state
FROM airports
ORDER BY airport_state
Pulizia dei dati nei database SQL Server

Uniformare le stringhe - REPLACE + UPPER

| airport_code | airport_name                            | airport_city    | airport_state |
|--------------|-----------------------------------------|-----------------|---------------|
| ...          | ...                                     | ...             | ...           |
| MCO          | Orlando International                   | Orlando         | FL            |
| TPA          | Tampa International                     | Tampa           | FL            |
| FLL          | Fort Lauderdale-Hollywood International | Fort Lauderdale | FL            |
| MIA          | Miami International                     | Miami           | FL            |
| ...          | ...                                     | ...             | ...           |
Pulizia dei dati nei database SQL Server

Ayo berlatih!

Pulizia dei dati nei database SQL Server

Preparing Video For Download...