...the unnecessarily complicated way.
Recently I've been asked to parse text files with fixed-width columns (see also ISAM).
The task is menial, but I had two different schemas and fairly large files (77 fields and about 21,000 records).
The schemas were described in a table listing the name and the width of each field.
Here's what I did:
- pasted the table into Excel
- used a formula to clean up the field names (removing spaces, hyphens, parentheses, whatever)
- used another formula to build a Java declaration for each field
- created a class (with Lombok) with the record structure
Here's a (short) example.
| field name | field width | field name (Java) | Java declaration |
| --- | --- | --- | --- |
| flg-store | 1 | flg_store | @Column(width=1, index=47) private String flg_store; |
| Code NEW | 10 | code_new | @Column(width=10, index=58) private String code_new; |
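The two Excel formulas can also be approximated in plain Java. This is only a sketch: the class `DeclarationGenerator` and its methods are hypothetical names, not something from the original spreadsheet, and the cleanup rule (anything that is not a letter or digit becomes an underscore) is an assumption about what the formula did. It does not check for Java reserved words.

```java
import java.util.Locale;

// Hypothetical helper that mimics the two Excel formulas described above.
class DeclarationGenerator {

    // "flg-store" -> "flg_store", "Code NEW" -> "code_new"
    static String toJavaName(String rawName) {
        String name = rawName.trim().toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "_")  // spaces, hyphens, parentheses... become "_"
                .replaceAll("^_+|_+$", "");     // drop leading and trailing underscores
        // Assumption: a name that is empty or starts with a digit gets an "f_" prefix.
        boolean needsPrefix = name.isEmpty() || Character.isDigit(name.charAt(0));
        return needsPrefix ? "f_" + name : name;
    }

    // Builds the annotated field declaration for one row of the schema table.
    static String toDeclaration(String rawName, int width, int index) {
        return String.format(Locale.ROOT,
                "@Column(width=%d, index=%d) private String %s;",
                width, index, toJavaName(rawName));
    }

    public static void main(String[] args) {
        System.out.println(toDeclaration("flg-store", 1, 0));
        System.out.println(toDeclaration("Code NEW", 10, 1));
    }
}
```

Running `main` prints one declaration per schema row, ready to be pasted into the record class.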
And here's the class I built:

    import lombok.Data;

    @Data
    public class Item {
        @Column(width=1, index=0)
        private String flg_store;

        @Column(width=10, index=1)
        private String code_new;
    }

So the objects of this class will hold the data parsed from each record. The instructions for parsing are attached to each field via annotations.
Of course any other configuration method would do (property file, XML, JSON, even the Excel file itself), but I find that having the code and the schema all in the same place is very effective.
Here is the annotation:

    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    @Target(ElementType.FIELD)
    @Retention(RetentionPolicy.RUNTIME)
    public @interface Column {
        public int width();
        public int index();
    }

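Because the annotation is retained at runtime, its values can be read back through reflection. The sketch below is a standalone illustration under a few assumptions: `ColumnInspector` and `Sample` are hypothetical names, and the annotation is redeclared locally only so that the example compiles on its own. Since `getDeclaredFields()` does not guarantee any order, the fields are sorted by `index` explicitly.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ColumnInspector {

    // Local copy of the annotation, so the sketch is self-contained
    @Target(ElementType.FIELD)
    @Retention(RetentionPolicy.RUNTIME)
    @interface Column {
        int width();
        int index();
    }

    // Hypothetical record class, deliberately declared out of order
    static class Sample {
        @Column(width = 10, index = 1) private String code_new;
        @Column(width = 1, index = 0) private String flg_store;
        private String notMapped; // no annotation: ignored
    }

    // Returns "name:width" for every annotated field, ordered by column index
    static List<String> describe(Class<?> type) {
        List<Field> annotated = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isAnnotationPresent(Column.class)) {
                annotated.add(field);
            }
        }
        annotated.sort(Comparator.comparingInt(
                (Field f) -> f.getAnnotation(Column.class).index()));
        List<String> result = new ArrayList<>();
        for (Field field : annotated) {
            result.add(field.getName() + ":" + field.getAnnotation(Column.class).width());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(describe(Sample.class));
    }
}
```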
Ok, so how to use this information? I found it easier to build a list of instructions for parsing, and then parse each line to build the object.
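Stripped of reflection, the parsing step boils down to walking the line with a running offset. Here is a minimal sketch of just that part; `WidthSlicer` is a hypothetical name, and the widths are passed in directly instead of being read from annotations.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, reflection-free illustration of the slicing step.
class WidthSlicer {

    // Cuts a line into consecutive tokens of the given widths, trimming each one.
    static List<String> slice(String line, int[] widths) {
        List<String> tokens = new ArrayList<>();
        int position = 0;
        for (int width : widths) {
            tokens.add(line.substring(position, position + width).trim());
            position += width; // the next field starts where this one ends
        }
        return tokens;
    }

    public static void main(String[] args) {
        // 1 character for the flag, 10 for the code
        System.out.println(slice("Y0000011111", new int[] {1, 10}));
    }
}
```

A line shorter than the sum of the widths makes `substring` throw, which is probably what you want for a malformed record.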

    package it.digiwrite.tiis.sap.parser;

    import it.digiwrite.anoto.utility.ReflectionHelper;

    import java.lang.reflect.Field;
    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    import lombok.Data;

    public class FixedWidthParser<T> {

        @Data
        static class FieldDescriptor {
            int index;
            int width;
            String fieldName;
        }

        // Parsing instructions, one per annotated field, ordered by column index
        List<FieldDescriptor> mapIndexFieldName;

        public FixedWidthParser(Class<T> what) {
            mapIndexFieldName = new ArrayList<FieldDescriptor>();
            for (Field field : what.getDeclaredFields()) {
                Column column = field.getAnnotation(Column.class);
                if (column == null) {
                    continue; // fields without @Column are not part of the record
                }
                FieldDescriptor fd = new FieldDescriptor();
                fd.setFieldName(field.getName());
                fd.setIndex(column.index());
                fd.setWidth(column.width());
                mapIndexFieldName.add(fd);
            }
            // getDeclaredFields() guarantees no particular order, and add(index, fd)
            // fails when index > size, so collect first and sort by index afterwards
            mapIndexFieldName.sort(Comparator.comparingInt(FieldDescriptor::getIndex));
        }

        public T parseLine(String line, Class<T> clazz) {
            try {
                int lastPosition = 0;
                T item = clazz.getDeclaredConstructor().newInstance();
                ReflectionHelper rh = new ReflectionHelper(item); // (1)
                for (FieldDescriptor fd : mapIndexFieldName) {
                    String token = line.substring(lastPosition, lastPosition + fd.width);
                    lastPosition += fd.width;
                    rh.set(fd.fieldName, token.trim()); // (2)
                }
                return item;
            } catch (Exception e) {
                // keep the original exception as the cause
                throw new RuntimeException("Error parsing line: " + line, e);
            }
        }
    }

NOTE:
- (1) ReflectionHelper is a utility class to wrap an object; here it is used only to...
- (2) ...easily set a field via reflection
- this code uses generics
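If you would rather not depend on a helper class, a field can also be set directly with the standard `java.lang.reflect.Field` API. This is a sketch of an alternative, not what the parser above does: it writes to the field itself instead of calling the setter, and `DirectFieldSetter` and `Sample` are hypothetical names.

```java
import java.lang.reflect.Field;

class DirectFieldSetter {

    // Hypothetical bean standing in for a record class
    static class Sample {
        private String code_new;

        String getCode_new() {
            return code_new;
        }
    }

    // Writes the value into the named declared field, bypassing any setter.
    static void set(Object target, String fieldName, Object value) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true); // needed because the field is private
            field.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot set field: " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        Sample sample = new Sample();
        set(sample, "code_new", "0000011111");
        System.out.println(sample.getCode_new());
    }
}
```

Unlike the helper's `fieldType` method, this version only looks at fields declared on the object's own class, not on its superclasses.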
And how do we use this?

    package it.digiwrite.tiis.sap;

    import static org.junit.Assert.assertEquals;

    import it.digiwrite.tiis.sap.bean.Item;
    import it.digiwrite.tiis.sap.parser.FixedWidthParser;

    import org.junit.Test;

    public class FixedWidthParserTest {

        @Test
        public void flusso() {
            // 1 character for flg_store followed by 10 characters for code_new
            String line = "Y0000011111";
            FixedWidthParser<Item> fwp = new FixedWidthParser<Item>(Item.class);
            Item item = fwp.parseLine(line, Item.class);
            assertEquals("Y", item.getFlg_store());
            assertEquals("0000011111", item.getCode_new());
        }
    }

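For a whole file, the same call is simply repeated for each line. The sketch below keeps the line parser abstract as a `Function<String, T>` so that it compiles on its own; `RecordFileReader` is a hypothetical name, and skipping empty lines is an assumption about the input files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: applies a line parser to every non-empty line of a source.
class RecordFileReader {

    static <T> List<T> readAll(Reader source, Function<String, T> lineParser) {
        List<T> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) { // assumption: blank lines carry no record
                    records.add(lineParser.apply(line));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }
}
```

With the parser from this post it could be called as `RecordFileReader.readAll(new FileReader("items.txt"), l -> fwp.parseLine(l, Item.class))`, where `items.txt` is a made-up file name.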
NOTE 2: here is the code for ReflectionHelper

    package ...;

    import java.lang.reflect.Field;
    import java.lang.reflect.Method;

    public class ReflectionHelper {

        private Object obj;        // wrapped object
        private Class<?> klass;    // wrapped object Class
        private String className;  // wrapped object class name

        public ReflectionHelper(Object obj) {
            this.obj = obj;
            this.klass = obj.getClass();
            String className = klass.getName();
            // keep only the part after the last dot; +1 drops the dot itself
            // and also works for classes in the default package
            this.className = className.substring(className.lastIndexOf(".") + 1);
        }

        public String[] getFieldsNames() {
            Field[] fields = klass.getDeclaredFields();
            String[] fieldsNames = new String[fields.length];
            int i = 0;
            for (Field f : fields) {
                fieldsNames[i++] = f.getName();
            }
            return fieldsNames;
        }

        public String getClassName() {
            return className;
        }

        public Class<?> getObjectClass() {
            return klass;
        }

        /**
         * Gets the property "itemName" from the wrapped object "obj", equivalent to: obj.getItemName()
         * @param itemName
         * @return obj.getItemName
         * @throws Exception
         */
        public Object get(String itemName) throws Exception {
            Method get;
            itemName = capitalize(itemName);
            try {
                get = klass.getMethod("get" + itemName);
            } catch (NoSuchMethodException nsme) {
                // boolean properties use the "is" prefix
                get = klass.getMethod("is" + itemName);
            }
            return get.invoke(obj);
        }

        /**
         * @param itemName
         * @return string with the first letter uppercase (ex. "example" --> "Example")
         */
        private String capitalize(String itemName) {
            return itemName.substring(0, 1).toUpperCase() + itemName.substring(1);
        }

        // Looks the field up on the wrapped class first, then on its superclasses
        public Class<?> fieldType(String fieldName) throws Exception {
            Class<?> x = null;
            Class<?> clazz = klass;
            while (clazz != null && (x == null)) {
                try {
                    Field field = clazz.getDeclaredField(fieldName);
                    x = field.getType();
                } catch (NoSuchFieldException nsfe) {
                    clazz = clazz.getSuperclass();
                }
            }
            if (x == null) {
                throw new RuntimeException("No such field: '" + fieldName + "'");
            }
            return x;
        }

        // Calls the setter of "fieldName" on the wrapped object, equivalent to: obj.setFieldName(value)
        public void set(String fieldName, Object value) throws Exception {
            String methodName = "set" + capitalize(fieldName);
            Class<?> class1 = null;
            try {
                class1 = fieldType(fieldName);
                Method set = klass.getMethod(methodName, class1);
                set.invoke(obj, value);
            } catch (Exception e) {
                throw new RuntimeException("class: " + class1 + "; method: " + methodName, e);
            }
        }

        // Invokes a no-argument method that returns a String
        public String executeVoidToString(String methodName) {
            try {
                Method get = klass.getMethod(methodName);
                return (String) get.invoke(obj);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }

        /**
         * "RogerRabbit" --> "roger_rabbit"
         */
        public static String camelCaseToDb(String fieldName) {
            if ((fieldName == null) || (fieldName.length() == 0)) return fieldName;
            String charE = fieldName.substring(0, 1);
            boolean lowerCase = charE.toLowerCase().equals(charE);
            // lowercase the first character too, as the Javadoc example promises
            String result = charE.toLowerCase();
            for (int i = 1; i < fieldName.length(); i++) {
                charE = fieldName.substring(i, i + 1);
                boolean isThisTheCase = charE.toLowerCase().equals(charE);
                // an underscore marks each lower-to-upper transition
                if (lowerCase && !isThisTheCase) result += "_";
                result += charE.toLowerCase();
                lowerCase = isThisTheCase;
            }
            return result;
        }
    }
