Text this: A multiple-input probabilistic classifier for the imbalanced dataset problem in semiconductor manufacturing