How do you find the best candidate for a job position (or vice versa) from a pool of millions of candidate documents? This task proved much harder than we expected, given the daunting amount of unstructured content and the complex nature of the search problem. We used machine learning and Cortana Analytics Suite to help our client unlock the value of big data. More specifically, we developed and operationalized an end-to-end solution that mines highly complex unstructured data about candidates and position specifications, and then learns the patterns of good matches. Machine learning is a great tool for identifying subtle patterns between otherwise dissimilar documents, but most implementations struggle to scale to large content volumes and to return results in real time. Machine learning is not designed to operate in real time, but a search engine is. By indexing candidate documents upfront, we can quickly apply term vector calculations between a new, non-indexed document and the entire indexed corpus. Scoring is applied by extracting the terms with the highest tf-idf weights from the input document and running a query with those terms against the indexed content. In this talk, we will present our solutions and practices that integrate the efficiency of Azure Search with the power of Azure Machine Learning (AML) for resume/job matching to achieve better scalability and performance. This solution could be extended to other use cases that involve document matching.
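The scoring step described above can be illustrated with a minimal sketch. It is not the production system: `TinyIndex` is a hypothetical in-memory stand-in for a search index (it does not call Azure Search), the smoothed idf formula is one common variant chosen for illustration, and the sample documents are invented. It shows the general idea only: index the corpus upfront, pick the highest tf-idf terms from a new document, and query the index with just those terms.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split into simple alphanumeric tokens (assumed tokenizer).
    return re.findall(r"[a-z0-9+#]+", text.lower())


class TinyIndex:
    """Hypothetical in-memory inverted index, standing in for a search engine."""

    def __init__(self, docs):
        self.docs = docs
        # term -> {doc_id: term frequency}, built once, upfront.
        self.postings = defaultdict(dict)
        for doc_id, text in docs.items():
            for term, tf in Counter(tokenize(text)).items():
                self.postings[term][doc_id] = tf

    def idf(self, term):
        # Smoothed inverse document frequency (one common variant).
        df = len(self.postings.get(term, {}))
        return math.log((1 + len(self.docs)) / (1 + df)) + 1.0

    def top_terms(self, text, k=5):
        # Weight each term of the new, non-indexed document by tf * idf.
        # Terms absent from the index are skipped: they cannot match anything.
        tf = Counter(t for t in tokenize(text) if t in self.postings)
        weights = {t: c * self.idf(t) for t, c in tf.items()}
        return sorted(weights, key=lambda t: (-weights[t], t))[:k]

    def search(self, terms):
        # Score indexed documents by summing tf * idf over the query terms.
        scores = defaultdict(float)
        for term in terms:
            for doc_id, tf in self.postings.get(term, {}).items():
                scores[doc_id] += tf * self.idf(term)
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Invented sample data for illustration only.
docs = {
    "cv1": "python machine learning engineer with search experience",
    "cv2": "financial analyst with accounting and audit experience",
    "cv3": "java backend engineer building search services",
}
job = "Seeking a machine learning engineer, python required"

index = TinyIndex(docs)
terms = index.top_terms(job, k=4)
results = index.search(terms)
print(terms)
print(results)
```

Restricting the query to a handful of high-weight terms keeps the lookup cheap, since only the posting lists for those terms are touched. A learned model could then re-rank the short candidate list that comes back.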
Machine Learning Blog link:
Cortana Analytics Suite Powers Russell Reynolds Associates' Search for the Perfect Match