1. Overview
One common task when working with Excel files in Java is identifying empty rows, especially when processing large datasets for analysis or reporting.
Empty rows in an Excel file can disrupt data processing, leading to inaccurate results or unnecessary complications in data analysis. Identifying these rows ensures that operations, such as data cleaning or transformation, run smoothly.
In this tutorial, we’ll examine three popular Java libraries (Apache POI, JExcel, and fastexcel) and see how to use each one to read a spreadsheet and find its empty rows.
2. Using Apache POI
Apache POI is a comprehensive library for working with Excel files in Java, supporting both .xls and .xlsx formats. It’s widely used due to its flexibility and robustness.
2.1. Maven Dependencies
To start using Apache POI, we’ll add the dependency to our pom.xml file:
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>5.3.0</version>
</dependency>
2.2. Detecting the Empty Rows
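First, we’ll create a helper class with a single method called isRowEmpty() to detect empty rows. The method treats a null row as empty, then iterates through the cells of the row and checks whether each one is either null or blank by comparing its cell type with CellType.BLANK:
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.usermodel.Row;

public class PoiHelper {
    public boolean isRowEmpty(Row row) {
        // Sheet.getRow() returns null for a row that was never created
        if (row == null) {
            return true;
        }
        // getLastCellNum() is the index of the last cell plus one, hence the strict comparison
        for (int cellNum = row.getFirstCellNum(); cellNum < row.getLastCellNum(); cellNum++) {
            Cell cell = row.getCell(cellNum);
            if (cell != null && cell.getCellType() != CellType.BLANK) {
                return false;
            }
        }
        return true;
    }
}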
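One caveat with comparing against CellType.BLANK alone is that a cell typed as a string but holding an empty or whitespace-only value isn’t BLANK, so its row would be reported as non-empty. The sketch below illustrates a stricter rule on a simplified stand-in model rather than on POI’s own types. SimpleCell and Kind are hypothetical names introduced only for this illustration; with POI, the same idea would mean additionally inspecting the string value of CellType.STRING cells.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in model for illustration; these types are not part of Apache POI.
class StrictEmptyRowSketch {

    enum Kind { BLANK, STRING, NUMERIC }

    static final class SimpleCell {
        final Kind kind;
        final String text;

        SimpleCell(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // A cell counts as empty if it is missing, blank, or a string holding only whitespace.
    static boolean isCellEmpty(SimpleCell cell) {
        if (cell == null || cell.kind == Kind.BLANK) {
            return true;
        }
        return cell.kind == Kind.STRING && (cell.text == null || cell.text.trim().isEmpty());
    }

    // A row counts as empty if it is missing or every cell in it is empty.
    static boolean isRowEmpty(List<SimpleCell> row) {
        if (row == null) {
            return true;
        }
        for (SimpleCell cell : row) {
            if (!isCellEmpty(cell)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<SimpleCell> row = Arrays.asList(
          new SimpleCell(Kind.BLANK, null),
          null,
          new SimpleCell(Kind.STRING, "   "));
        System.out.println(isRowEmpty(row)); // prints "true"
    }
}
```

Whether whitespace-only strings should count as empty depends on the dataset, so this stricter rule is a design choice rather than a requirement.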
2.3. Testing the Method
We’ll start by creating our test class and method and opening the workbook in a try-with-resources block. We’re using a simple empty file called empty_excel_file.xlsx for testing:
public class PoiDetectEmptyRowUnitTest {
    private PoiHelper poiHelper = new PoiHelper();
    private static final String XLSX_EMPTY_FILE_PATH = "src/main/resources/empty_excel_file.xlsx";

    @Test
    public void givenXLSXFile_whenParsingExcelFile_thenDetectAllRowsEmpty() throws IOException {
        try (FileInputStream file = new FileInputStream(XLSX_EMPTY_FILE_PATH);
          Workbook workbook = new XSSFWorkbook(file)) {
            Sheet sheet = workbook.getSheetAt(0);
            for (int rowNum = 0; rowNum <= sheet.getLastRowNum(); rowNum++) {
                Row row = sheet.getRow(rowNum);
                assertTrue(poiHelper.isRowEmpty(row));
            }
        }
    }
}
Then, we obtain the first sheet in the workbook and iterate over its rows, from index 0 up to and including getLastRowNum(). For each row, we apply the isRowEmpty() method we created earlier and assert that the row is empty.
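The test above asserts that every row is empty, but in practice we often want to know which rows are empty. As a minimal, library-agnostic sketch of that idea, the example below models a sheet as a list of rows, where each row is a list of cell texts and a null entry stands for a row that doesn’t exist. The class and method names here are hypothetical and aren’t part of any of the libraries covered in this tutorial.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, library-agnostic sketch: rows are modeled as lists of cell texts.
class EmptyRowIndexSketch {

    // Returns the zero-based indices of rows that are missing or contain only blank text.
    static List<Integer> findEmptyRowIndices(List<List<String>> rows) {
        List<Integer> emptyIndices = new ArrayList<>();
        for (int rowNum = 0; rowNum < rows.size(); rowNum++) {
            if (isRowEmpty(rows.get(rowNum))) {
                emptyIndices.add(rowNum);
            }
        }
        return emptyIndices;
    }

    // A row is empty if it is null or all of its cell texts are null or whitespace-only.
    static boolean isRowEmpty(List<String> row) {
        if (row == null) {
            return true;
        }
        for (String cellText : row) {
            if (cellText != null && !cellText.trim().isEmpty()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
          Arrays.asList("id", "name"),
          null,
          Arrays.asList("", "  "),
          Arrays.asList("1", "Alice"));
        System.out.println(findEmptyRowIndices(rows)); // prints "[1, 2]"
    }
}
```

The same loop shape carries over to a real sheet: we’d keep the index-based iteration and swap the row check for the library-specific helper.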
3. Using JExcel
JExcel is another library that handles Excel files, but it supports only the older binary .xls format and can’t read .xlsx files. It’s known for its simplicity and ease of use.
3.1. Maven Dependencies
To use JExcel, we’ll add the dependency to our pom.xml file:
<dependency>
    <groupId>net.sourceforge.jexcelapi</groupId>
    <artifactId>jxl</artifactId>
    <version>2.6.12</version>
</dependency>
3.2. Detecting the Empty Rows
Detecting empty rows with JExcel involves iterating through an array of Cell objects and checking that all of them are blank by using the getContents() method:
import jxl.Cell;

public class JExcelHelper {
    public boolean isRowEmpty(Cell[] row) {
        if (row == null) {
            return true;
        }
        for (Cell cell : row) {
            // a cell counts as empty when its contents are blank after trimming
            if (cell != null && !cell.getContents().trim().isEmpty()) {
                return false;
            }
        }
        return true;
    }
}
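One detail worth knowing about the trim().isEmpty() check is that trim() only strips characters up to U+0020, so a cell containing just a non-breaking space (U+00A0), which often sneaks in through copy and paste, still counts as non-empty. The sketch below shows one possible stricter text check in plain Java. The BlankTextSketch class and its isBlankText() method are hypothetical names used for illustration, not part of JExcel.

```java
// Hypothetical helper for illustration; not part of JExcel.
class BlankTextSketch {

    // True if the text is null or consists only of whitespace and space-separator characters.
    static boolean isBlankText(String text) {
        if (text == null) {
            return true;
        }
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // isSpaceChar() covers non-breaking spaces such as U+00A0, which trim() leaves in place
            if (!Character.isWhitespace(c) && !Character.isSpaceChar(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String nbsp = "\u00A0";
        System.out.println(nbsp.trim().isEmpty()); // prints "false"
        System.out.println(isBlankText(nbsp));     // prints "true"
    }
}
```

If we wanted this behavior in the helper above, we could pass the result of getContents() to such a method instead of calling trim().isEmpty().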
3.3. Testing the Method
Now, let’s see it in action:
public class JExcelDetectEmptyRowUnitTest {
    private JExcelHelper jexcelHelper = new JExcelHelper();
    private static final String EMPTY_FILE_PATH = "src/main/resources/empty_excel_file.xls";

    @Test
    public void givenXLSFile_whenParsingJExcelFile_thenDetectAllRowsEmpty()
      throws IOException, BiffException {
        Workbook workbook = Workbook.getWorkbook(new File(EMPTY_FILE_PATH));
        try {
            Sheet sheet = workbook.getSheet(0);
            for (int rowNum = 0; rowNum < sheet.getRows(); rowNum++) {
                Cell[] row = sheet.getRow(rowNum);
                assertTrue(jexcelHelper.isRowEmpty(row));
            }
        } finally {
            // jxl's Workbook isn't AutoCloseable, so we release it explicitly
            workbook.close();
        }
    }
}
Here, we open the workbook and retrieve the first sheet. We then iterate through all the rows in the sheet and apply the helper method to each one, asserting that it’s empty.
4. Using fastexcel
fastexcel is a lightweight library, optimized for reading and writing large Excel files quickly. It’s a good choice when performance is critical.
4.1. Maven Dependencies
Since we only need to read files, we’ll add fastexcel’s reader module, which provides the ReadableWorkbook class, to our pom.xml file:
<dependency>
    <groupId>org.dhatim</groupId>
    <artifactId>fastexcel-reader</artifactId>
    <version>0.18.3</version>
</dependency>
4.2. Detecting the Empty Rows
To detect empty rows in fastexcel, we’ll iterate over the cells in a Row object and check if each cell is empty using the getText() method:
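import org.dhatim.fastexcel.reader.Cell;
import org.dhatim.fastexcel.reader.Row;

public class FastexcelHelper {
    public boolean isRowEmpty(Row row) {
        if (row == null) {
            return true;
        }
        // Row is iterable, so we can loop over its cells directly
        for (Cell cell : row) {
            if (cell != null && !cell.getText().trim().isEmpty()) {
                return false;
            }
        }
        return true;
    }
}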
4.3. Testing the Method
Let’s see it in action to check if all the rows of a spreadsheet are empty:
public class FastexcelDetectEmptyRowUnitTest {
    private FastexcelHelper fastexcelHelper = new FastexcelHelper();
    private static final String EMPTY_FILE_PATH = "src/main/resources/empty_excel_file.xlsx";

    @Test
    public void givenXLSXFile_whenParsingEmptyFastExcelFile_thenDetectAllRowsAreEmpty()
      throws IOException {
        try (FileInputStream file = new FileInputStream(EMPTY_FILE_PATH);
          ReadableWorkbook wb = new ReadableWorkbook(file)) {
            Sheet sheet = wb.getFirstSheet();
            try (Stream<Row> rows = sheet.openStream()) {
                boolean isEmpty = rows.allMatch(fastexcelHelper::isRowEmpty);
                assertTrue(isEmpty);
            }
        }
    }
}
Like the previous tests, we’ll open the workbook and obtain the first sheet. Then, we’ll iterate through a Stream of rows to check that they’re empty.
fastexcel’s openStream() method exposes the rows as a Stream, which suits large datasets: allMatch() short-circuits, so the check stops at the first non-empty row instead of visiting every row.
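The same Stream-based approach works when we want a count of empty rows rather than a yes/no answer. The sketch below uses only java.util.stream and models each row as a list of cell texts, so EmptyRowStreamSketch and its methods are hypothetical stand-ins rather than fastexcel API. With fastexcel, the Stream returned by openStream() and the helper’s isRowEmpty() would take their place.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical sketch on plain Java streams; rows are modeled as lists of cell texts.
class EmptyRowStreamSketch {

    // A row is empty if it is null or all of its cell texts are null or whitespace-only.
    static boolean isRowEmpty(List<String> row) {
        return row == null || row.stream().allMatch(text -> text == null || text.trim().isEmpty());
    }

    // Counts empty rows; unlike allMatch(), this has to consume the entire stream.
    static long countEmptyRows(Stream<List<String>> rows) {
        return rows.filter(EmptyRowStreamSketch::isRowEmpty).count();
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
          Arrays.asList("id", "name"),
          Arrays.asList("", " "),
          Arrays.asList("1", "Alice"));

        // A stream can be consumed only once, so each check gets a fresh one
        System.out.println(rows.stream().allMatch(EmptyRowStreamSketch::isRowEmpty)); // prints "false"
        System.out.println(countEmptyRows(rows.stream()));                            // prints "1"
    }
}
```

The trade-off is that filter() followed by count() gives up the early exit of allMatch(), since every row must be inspected to produce the total.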
5. Conclusion
In this tutorial, we saw how to detect empty rows in Excel files using Apache POI, JExcel, and fastexcel. Each library exposes rows and cells differently, but the check is the same in all three: a row is empty when every cell in it is either missing or blank.
As always, the full code examples are available over on GitHub.