Date standardization

Cleaning Data in Java

Dennis Lee

Software Engineer

Why date standardization matters

  • Different date formats: slashes, dashes, dots
  • Inconsistent formats: hard to sort by date or compute time-based metrics
  • Solution: Standardize dates

 

 

Product Name     Date Received
Bell Pepper      3/1/25
Vegetable Oil    04-01-2025
Cheese           2025.06.01

Converting string dates to LocalDate

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
// Specify expected date format (M/d/yy)
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("M/d/yy");

// Convert string to LocalDate
LocalDate date = LocalDate.parse("3/1/25", formatter);
System.out.println("Date converted from string: " + date);
Date converted from string: 2025-03-01
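
LocalDate.parse() throws a DateTimeParseException when the input does not match the pattern, which is common in messy data. The following is a minimal sketch of one way to guard against that. The class name SafeParseSketch and the helper tryParse are hypothetical names used only for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

class SafeParseSketch {
    // Same pattern as the slide: month/day without zero-padding, two-digit year
    static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("M/d/yy");

    // Hypothetical helper: returns an empty Optional instead of throwing
    static Optional<LocalDate> tryParse(String text) {
        try {
            return Optional.of(LocalDate.parse(text, FORMATTER));
        } catch (DateTimeParseException e) {
            return Optional.empty(); // Input does not match M/d/yy
        }
    }

    public static void main(String[] args) {
        System.out.println("3/1/25 -> " + tryParse("3/1/25"));
        System.out.println("04-01-2025 -> " + tryParse("04-01-2025"));
    }
}
```

Returning an Optional is only one design choice; logging the bad value or collecting it for later review would work equally well.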

Standardizing different date formats

String[] dates = {"3/1/25", "04-01-2025", "2025.06.01"}; // Input date formats
DateTimeFormatter[] formatters = { // Expected date formats in the input
    DateTimeFormatter.ofPattern("M/d/yy"), // No zero-padding in month and day
    DateTimeFormatter.ofPattern("MM-dd-yyyy"), // Zero-padded month and day
    DateTimeFormatter.ofPattern("yyyy.MM.dd")
};


System.out.println("Standardized dates:");
for (int i = 0; i < dates.length; i++) {
    LocalDate date = LocalDate.parse(dates[i], formatters[i]); // Convert to date
    System.out.println("Format " + dates[i] + " as " + date);
}

Standardizing different date formats: outputs

Standardized dates:
Format 3/1/25 as 2025-03-01
Format 04-01-2025 as 2025-04-01
Format 2025.06.01 as 2025-06-01
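
The loop above pairs each string with its formatter by position, which works only when the format of every input is known in advance. When it is not, one possible approach is to try each expected formatter in turn and keep the first that succeeds. This is a sketch under that assumption; MultiFormatSketch and standardize are hypothetical names, and List.of requires Java 9 or later.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.List;

class MultiFormatSketch {
    // The three input formats from the slide
    static final List<DateTimeFormatter> FORMATTERS = List.of(
        DateTimeFormatter.ofPattern("M/d/yy"),
        DateTimeFormatter.ofPattern("MM-dd-yyyy"),
        DateTimeFormatter.ofPattern("yyyy.MM.dd"));

    // Hypothetical helper: try each formatter until one parses the text
    static LocalDate standardize(String text) {
        for (DateTimeFormatter formatter : FORMATTERS) {
            try {
                return LocalDate.parse(text, formatter);
            } catch (DateTimeParseException e) {
                // This pattern did not match; fall through to the next one
            }
        }
        throw new IllegalArgumentException("Unrecognized date format: " + text);
    }

    public static void main(String[] args) {
        for (String raw : List.of("3/1/25", "04-01-2025", "2025.06.01")) {
            System.out.println("Format " + raw + " as " + standardize(raw));
        }
    }
}
```

The order of the formatter list matters if two patterns could match the same text, so the most specific patterns would usually go first.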

Formatting dates for display

// Example input date
LocalDate date = LocalDate.parse("2025-03-01");

// Want to display as March 1, 2025
DateTimeFormatter displayFormat = DateTimeFormatter.ofPattern("MMMM d, yyyy");

// Format date in desired format
String formattedDate = date.format(displayFormat);
System.out.println("Formatted date: " + formattedDate);
Formatted date: March 1, 2025
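
The MMMM pattern prints the month name in the JVM's default locale, so the output above assumes an English-language default. A minimal sketch of pinning the locale explicitly with the two-argument DateTimeFormatter.ofPattern(pattern, locale) overload follows. LocaleFormatSketch and display are hypothetical names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class LocaleFormatSketch {
    // Hypothetical helper: format a date with an explicit locale for the month name
    static String display(LocalDate date, Locale locale) {
        DateTimeFormatter displayFormat = DateTimeFormatter.ofPattern("MMMM d, yyyy", locale);
        return date.format(displayFormat);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.parse("2025-03-01");
        System.out.println("English: " + display(date, Locale.ENGLISH));
        System.out.println("French: " + display(date, Locale.FRENCH));
    }
}
```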

Working with time zones

import java.time.ZoneId;
import java.time.ZonedDateTime;
LocalDate date = LocalDate.parse("2025-03-01"); // Example input date

ZonedDateTime nyTime = date.atStartOfDay(ZoneId.of("America/New_York"));
// Convert same moment to LA time
ZonedDateTime laTime = nyTime.withZoneSameInstant(ZoneId.of("America/Los_Angeles"));
System.out.println("New York time: " + nyTime);
System.out.println("Los Angeles time: " + laTime);
New York time: 2025-03-01T00:00-05:00[America/New_York]
Los Angeles time: 2025-02-28T21:00-08:00[America/Los_Angeles]
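
The two printed values describe the same moment, yet the calendar date differs (March 1 in New York, February 28 in Los Angeles). A short sketch that makes this explicit is shown here. SameInstantSketch and startOfDayIn are hypothetical names; isEqual() compares instants, while equals() also compares the zone.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class SameInstantSketch {
    // Hypothetical helper: midnight at the start of the given date in the given zone
    static ZonedDateTime startOfDayIn(LocalDate date, String zone) {
        return date.atStartOfDay(ZoneId.of(zone));
    }

    public static void main(String[] args) {
        ZonedDateTime nyTime = startOfDayIn(LocalDate.parse("2025-03-01"), "America/New_York");
        ZonedDateTime laTime = nyTime.withZoneSameInstant(ZoneId.of("America/Los_Angeles"));

        System.out.println("Same instant: " + nyTime.isEqual(laTime));   // Compares the moment only
        System.out.println("equals(): " + nyTime.equals(laTime));        // Also compares the zone
        System.out.println("New York date: " + nyTime.toLocalDate());
        System.out.println("Los Angeles date: " + laTime.toLocalDate());
    }
}
```

When cleaning data, this is a reason to decide which zone a "date" column refers to before dropping the time component.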

Why string manipulation fails with dates

// Wrong: Trying to extract month using string manipulation
System.out.println("Does 3/1/25 start with 3? " + "3/1/25".startsWith("3"));
System.out.println("Does 03-15-25 start with 3? " + "03-15-25".startsWith("3"));

// Correct: Using standardized dates
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("MM-dd-yy");
LocalDate date = LocalDate.parse("03-15-25", formatter); // Convert to date
System.out.println("Month and year of 03-15-25: " + date.getMonth() + " " + date.getYear()); // Get month/year
Does 3/1/25 start with 3? true
Does 03-15-25 start with 3? false

Month and year of 03-15-25: MARCH 2025
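
Sorting shows the same weakness: raw strings are compared character by character, so "10/2/25" sorts before "3/1/25". A minimal sketch comparing the two approaches follows, using made-up sample values in the M/d/yy format. SortSketch, sortAsStrings, and sortAsDates are hypothetical names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.stream.Collectors;

class SortSketch {
    static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("M/d/yy");

    // Wrong: lexicographic order of the raw strings
    static List<String> sortAsStrings(List<String> rawDates) {
        return rawDates.stream().sorted().collect(Collectors.toList());
    }

    // Correct: parse first, then sort chronologically
    static List<LocalDate> sortAsDates(List<String> rawDates) {
        return rawDates.stream()
            .map(raw -> LocalDate.parse(raw, FORMATTER))
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rawDates = List.of("3/1/25", "12/15/24", "10/2/25"); // Illustrative values
        System.out.println("Sorted as strings: " + sortAsStrings(rawDates));
        System.out.println("Sorted as dates: " + sortAsDates(rawDates));
    }
}
```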

Putting it all together

Key Imports

import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.ZoneId;
import java.time.ZonedDateTime;
  • Specify input date pattern: DateTimeFormatter.ofPattern()
  • Convert string to date: LocalDate.parse()
  • Specify and convert time zones: ZoneId.of(), .withZoneSameInstant()
  • Format a date: .format()
  • Parse dates with LocalDate.parse() instead of manipulating strings (.startsWith())

Let's practice!
