Understanding data import fundamentals

Importing Data in Java

Anthony Markham

VP Quant Developer

Meet your instructor!

 

Anthony Markham

  • VP, Quantitative Developer/Analytics Lead
  • C++/Java/Python developer in investment banking
  • University teaching experience

 

Data import fundamentals

  • Essential for processing external information in Java applications

  • Common formats include CSV (comma-separated values), JSON, and Excel

Flow chart showing the five steps in the import workflow

  • Java provides robust tools in java.io and java.nio packages

File handling basics

  • The File class represents files or directories
  • Methods such as exists(), length(), and isDirectory() allow us to validate our file
import java.io.File;
File dataFile = new File("data.csv");
boolean exists = dataFile.exists();
long size = dataFile.length();
boolean isDirectory = dataFile.isDirectory();
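As a minimal sketch of how these three methods might be combined into one check before importing: the class name `FileCheck`, the method `describeProblem`, and the message strings are hypothetical names chosen for illustration, not part of the course material.

```java
import java.io.File;

class FileCheck {
    // Hypothetical helper: returns a short reason the file is unusable,
    // or null if it exists, is not a directory, and is not empty.
    static String describeProblem(File file) {
        if (!file.exists()) {
            return "does not exist";
        }
        if (file.isDirectory()) {
            return "is a directory";
        }
        if (file.length() == 0) {
            return "is empty";
        }
        return null;
    }

    public static void main(String[] args) {
        File dataFile = new File("data.csv");
        String problem = describeProblem(dataFile);
        if (problem == null) {
            System.out.println("OK: " + dataFile.length() + " bytes");
        } else {
            System.out.println("Problem: data.csv " + problem);
        }
    }
}
```

Checking exists() first matters here, because length() returns 0 for a missing file and would otherwise be reported as "is empty".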

The Path interface and Files class

  • Path interface and Files class provide modern file operations (java.nio)
  • Benefits: more flexibility, clearer exception handling, and better performance
  • java.io for simple file operations; java.nio for high-performance input/output operations
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.Files;
Path dataPath = Paths.get("data.csv");

boolean exists = Files.exists(dataPath);
long size = Files.size(dataPath);
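Unlike File.length(), Files.size() throws an IOException when the file cannot be read, so the call has to be wrapped or declared. The sketch below shows one way to handle that; the class `PathCheck`, the method `sizeOrMinusOne`, and the choice of -1 as a sentinel are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class PathCheck {
    // Hypothetical helper: returns the size in bytes, or -1 if the path
    // is missing, is not a regular file, or cannot be read.
    static long sizeOrMinusOne(Path path) {
        if (!Files.isRegularFile(path)) {
            return -1;
        }
        try {
            return Files.size(path);
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        Path dataPath = Paths.get("data.csv");
        System.out.println("Size: " + sizeOrMinusOne(dataPath));
    }
}
```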

Reading text files

  • Files.readAllLines(): Reads entire file into List<String> (one element per line)
  • Files.readString(): Reads entire file into a single String (available since Java 11)
Path file = Paths.get("data.csv");

// Read all lines at once
List<String> lines = Files.readAllLines(file);

// Read entire file as a string
String content = Files.readString(file);

  • The entire file is loaded into memory 🛑
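To compare the two methods side by side, the following sketch writes a small sample file and reads it back both ways. The class `ReadDemo`, its method names, and the sample rows are invented for illustration; it assumes Java 11 or later because of Files.readString().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class ReadDemo {
    // Writes three lines of made-up CSV text to a temporary file.
    static Path writeSample() {
        try {
            Path file = Files.createTempFile("sample", ".csv");
            file.toFile().deleteOnExit();
            Files.write(file, "id,name\n1,Ada\n2,Linus\n".getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One list element per line; line terminators are not included.
    static List<String> readLines(Path file) {
        try {
            return Files.readAllLines(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The whole file as one String, line terminators included.
    static String readWhole(Path file) {
        try {
            return Files.readString(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeSample();
        System.out.println(readLines(file).size() + " lines");
        System.out.println(readWhole(file).length() + " characters");
    }
}
```

Both calls hold the complete contents in memory at once, which is why they suit small files best.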

Data validation

  • Ensures data quality before processing
  • Check data quality and structure
  • Perform common validations
  • Handle any Exception

Data validation checks

Data validation

  • Common checks: verifying the file isn't empty, confirming required columns in the header
  • Handle Exception with a try-catch block ✅
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

try {
    Path file = Paths.get("data.csv");
    List<String> lines = Files.readAllLines(file);
    if (lines.isEmpty()) {    // Validate file has content
        System.out.println("Warning: File is empty");
    } else {
        String header = lines.get(0);    // Safe: at least one line exists
        if (!header.contains("id") || !header.contains("name")) {    // Check header
            System.out.println("Error: File missing required columns");
        }
    }
} catch (Exception e) {
    System.out.println("Error reading file: " + e.getMessage());
}
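The two checks can also be pulled into a reusable method that works on lines already read from a file. This is a sketch under stated assumptions: the class `CsvValidator`, the method `validate`, and the message strings are hypothetical. It also compares whole column names rather than using a substring test, since header.contains("id") would accept a header such as "paid"; that stricter matching is an assumption about the intended behavior.

```java
import java.util.ArrayList;
import java.util.List;

class CsvValidator {
    // Hypothetical helper: returns null when the lines pass both checks,
    // otherwise a message describing the first failure found.
    static String validate(List<String> lines, String... requiredColumns) {
        if (lines.isEmpty()) {
            return "File is empty";
        }
        // Split the header on commas and trim surrounding whitespace.
        List<String> header = new ArrayList<>();
        for (String cell : lines.get(0).split(",")) {
            header.add(cell.trim());
        }
        for (String column : requiredColumns) {
            if (!header.contains(column)) {
                return "Missing required column: " + column;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("id,name", "1,Ada");
        String problem = validate(lines, "id", "name");
        System.out.println(problem == null ? "Valid" : problem);
    }
}
```

Splitting on a plain comma does not handle quoted fields that contain commas, so this is only suitable for simple headers.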

Summary

Class/Interface   Method           Description
File              new File()       Creates an abstract representation of a file path
File              exists()         Checks if a file exists
File              length()         Gets file size in bytes
Paths             get()            Creates a Path object from a string
Files             exists()         Checks if a file exists (modern API)
Files             size()           Gets file size in bytes (modern API)
Files             readAllLines()   Reads entire file into List<String>
Files             readString()     Reads entire file into a single String
1 https://docs.oracle.com/javase/8/docs/api/java/nio/file/Files.html

Let's practice!
