Creating a Scalable Search Engine

Built an end-to-end search engine, similar to Google or Bing, that indexes Wikipedia rather than the entire World Wide Web. It applies information retrieval concepts such as PageRank and tf-idf, and builds its index with parallel data processing in the MapReduce framework. Searching “How many chickens does DeOrio have?” returned “too many” as the top answer. I’d say the engine is pretty accurate.
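Since the project code itself is not public, here is a minimal, single-machine Python sketch of how tf-idf and PageRank can be combined to rank pages. It is an illustration only, not the project's implementation: the function names (`tf_idf_scores`, `pagerank`, `search`), the whitespace tokenizer, the `weight` parameter, and the weighted-sum scoring formula are all assumptions, and the MapReduce stage is left out entirely.

```python
import math
from collections import Counter


def tf_idf_scores(query, docs):
    """Score each document against the query with a basic tf-idf sum.

    docs maps a document id to its raw text (hypothetical input shape).
    """
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: how many documents contain each term.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term in tf:
                idf = math.log10(n_docs / df[term])
                score += tf[term] * idf
        scores[doc_id] = score
    return scores


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank; links maps a page to its outgoing links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


def search(query, docs, links, weight=0.5):
    """Rank matching documents by a weighted sum of PageRank and tf-idf.

    The weighted-sum formula is an assumption made for this sketch.
    """
    tfidf = tf_idf_scores(query, docs)
    pr = pagerank(links)
    combined = {
        doc_id: weight * pr.get(doc_id, 0.0) + (1.0 - weight) * tfidf[doc_id]
        for doc_id in docs
        if tfidf[doc_id] > 0  # keep only documents that match the query
    }
    return sorted(combined, key=combined.get, reverse=True)
```

For example, with `docs = {"a": "chickens chickens farm", "b": "farm tractor", "c": "chickens coop"}` and `links = {"a": ["b"], "b": ["a"], "c": ["a"]}`, calling `search("chickens", docs, links)` ranks `a` ahead of `c` and omits `b`, because `a` mentions the term more often and also receives more inbound links.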



There is no GitHub repo link for this project because the code can only be published with the explicit permission of the University. If you are an employer, I can request that permission and share code samples.